CLaRK System Online User Manual



Menu Constraints

Regular Expression Constraints

Each Regular Expression Constraint consists of two obligatory parts and three optional parts:

  • Constraint name (obligatory) - a unique identifier for the constraint in the system;
  • Regular expression (obligatory) - a valid regular expression which represents the constraint over the nodes' content;
  • Default XPath expression - an XPath expression that selects the nodes to be processed by the constraint. It appears as the default text in the node selection field when the constraint is applied;
  • Tokenizer - used when the constraint tests the content of text nodes. If no tokenizer is specified, the processor uses the tokenizer specified in the DTD of the current document;
  • Filter - it is used to filter the tokenizer categories when the text nodes' content is tested.
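The five parts listed above can be pictured as a small record. The following is only an illustrative sketch: the class name, the field names and the tokenizer names are hypothetical and do not reflect how the CLaRK System stores constraints internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RegexConstraint:
    # Obligatory parts
    name: str                            # unique identifier in the system
    expression: str                      # the regular expression itself
    # Optional parts
    default_xpath: Optional[str] = None  # default node selection
    tokenizer: Optional[str] = None      # tokenizer name (hypothetical)
    filter: Optional[str] = None         # filter name (hypothetical)


def effective_tokenizer(constraint: RegexConstraint, dtd_tokenizer: str) -> str:
    """Fall back to the tokenizer named in the DTD when none is specified."""
    return constraint.tokenizer or dtd_tokenizer


plain = RegexConstraint(name="digits", expression="[0-9]+")
custom = RegexConstraint(name="words", expression="[a-z]+", tokenizer="Mixed")
print(effective_tokenizer(plain, "Default"))   # prints Default
print(effective_tokenizer(custom, "Default"))  # prints Mixed
```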

Edit Regular Expression Constraints (REC)

This section is responsible for the regular expression constraint management. Here the REC can be created, modified, removed, saved to a file and loaded from a file. Here is a picture of the dialog window:

The left side of the window is a table with all REC in the system. The first column contains the names of the constraints. The second one contains the regular expression of each constraint. After selecting a row in this table, the user can manipulate the constraint by using the buttons on the right.

Description of the buttons on the right:

  • New - creates a new Regular Expression Constraint. When the "New" button is pressed, a new constraint editor window appears on the screen (for more details, see below);
  • Edit - the currently selected constraint is opened for editing in a new editor window;
  • Remove - removes the currently selected constraint in the table. The removal is preceded by a confirmation message;
  • OK - saves the current changes to the constraints and closes the manager window;
  • Cancel - closes the manager window without saving the changes (if any) in the constraints;
  • Save To File - serializes all the REC into an external file in XML format. This function can be used for two main purposes: backups and interaction with external applications. The description of the output XML file (the DTD) can be found in the file: regConstraint.dtd;
  • Load From File - loads the REC from an external file. The external file must be an XML document, valid with respect to the DTD in the file: regConstraint.dtd.

Regular Expression Constraint Editor

Here is the interface view of the editor window for the REC:

Image2.gif (4865 bytes)

The last three fields are optional. The tokenizer and filter lists contain all tokenizers and filters defined in the system. The regular expression may consist of: tags, token categories, token values and token value templates.

Apply Regular Expression Constraints

Regular Expression Constraints can be applied in two ways:

  • by selecting a node from the tree panel and choosing a constraint;
  • by selecting a set of nodes with the help of an XPath expression and then applying a certain constraint to each of them.

Here we describe the latter case. The user chooses 'Apply Regular Expression Constraints' from the menu Constraints/Regular Expression Constraints. Then the following dialog window appears:

Image3.gif (4823 bytes)

The first input field, Select nodes, contains the XPath expression that is evaluated to select the nodes for the constraint. If a default XPath expression is specified for the constraint, it appears in this field as the default text.

The second field selects, by name, the constraint to be applied.

The last two fields are used when the tokenizer and filter of the constraint are to be ignored and new ones have to be defined explicitly.

When the Apply button is pressed, the XPath expression is evaluated and a set of nodes is selected. The constraint is then applied to each of them. If the node's content satisfies the constraint, the node is marked as Valid. Otherwise it is marked as Non Valid. In this way two groups of nodes are formed and each of them can be observed separately. Here is a picture of the navigation panel window:

Image4.gif (4531 bytes)

Here the user can change the group under observation by using the two radio buttons. The Next and Previous buttons change the current selection in the editor. At the top of the window there is some information about the constraint and the nodes which satisfy or do not satisfy it.
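The application step described above (select nodes by XPath, test each node's content, form a Valid group and a Non Valid group) can be sketched as follows. This is a simplification under stated assumptions: CLaRK regular expressions range over tags, token categories and token values, whereas this sketch uses an ordinary Python character-level regular expression over the text content, and the function name apply_constraint is hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def apply_constraint(root, xpath, pattern):
    """Split the nodes selected by `xpath` into valid and non-valid groups."""
    regex = re.compile(pattern)
    valid, non_valid = [], []
    for node in root.findall(xpath):
        # Assumption: the node's content is reduced to its plain text.
        content = "".join(node.itertext()).strip()
        # The whole content must match, not just a prefix of it.
        if regex.fullmatch(content):
            valid.append(node)
        else:
            non_valid.append(node)
    return valid, non_valid


doc = ET.fromstring("<text><num>42</num><num>4x2</num><num>7</num></text>")
valid, non_valid = apply_constraint(doc, ".//num", r"[0-9]+")
print([n.text for n in valid])      # prints ['42', '7']
print([n.text for n in non_valid])  # prints ['4x2']
```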

Value Constraints

The constraint engine is a means of setting restrictions on the content of nodes in an XML document that cannot be expressed by the DTD. Each value constraint in the system is attached to a DTD.
A value constraint in general consists of two parts: a target section and a source section.
    Target Section
        This part describes the nodes to which the constraint will be applied. Initially, the nodes are selected by their tag name and then a further restriction is made by an XPath expression. In this way, using the expressive power of the XPath language, a context dependency can be expressed. The target nodes are evaluated in the following way. First, from all the elements in the current document only the nodes with the specified tag name are retrieved. Then for each of them the XPath expression is evaluated with the current node as the context node. If the XPath expression evaluates to a non-empty list, the context node is included in the set of nodes to which the constraint is to be applied. Otherwise, it is excluded.

    Source Section
        Here the possible values for the target nodes (selected by the previous section) are defined. The possible values are tag names and tokens, depending on the type of the constraint. The source list can be selected by an XPath expression or by typing the choices explicitly as XML markup. If the selection is made by a relative XPath expression, the current target node is taken as the context node for the constraint. If a text node is selected as a source, its text value is tokenized and the tokens are added to the source list, excluding the node itself. The source for the constraint may also be an external document. The only requirements in such cases are the following: the external document has to be in the internal database of the system and the XPath expression cannot be relative.
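The two-step evaluation of the target section (filter by tag name, then keep only the nodes for which the XPath restriction returns a non-empty list) can be sketched like this. The function name select_targets and the sample document are hypothetical, and the sketch relies on the limited XPath subset supported by Python's xml.etree.ElementTree rather than on the XPath engine of the CLaRK System.

```python
import xml.etree.ElementTree as ET


def select_targets(root, tag_name, xpath):
    """Return the nodes named `tag_name` for which `xpath` selects something."""
    targets = []
    for node in root.iter(tag_name):   # step 1: select by tag name
        if node.findall(xpath):        # step 2: node is the XPath context
            targets.append(node)       # non-empty result: keep the node
    return targets


doc = ET.fromstring(
    "<s><w><ta>a</ta></w><w>plain</w><w><ta>b</ta><ta>c</ta></w></s>"
)
targets = select_targets(doc, "w", "./ta")
print(len(targets))  # prints 2
```

The second w element is excluded because the restriction ./ta selects nothing when that element is the context node.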

There are four types of value constraints currently supported by the system. They are distinguished by their target and the way they are used. Here is a description of each type of value constraint:

    1. Parent Constraint
        This type of value constraint limits the possible parents of a node. There are two ways of applying this constraint type: by changing the parent of a node (local) or by explicitly running the constraint engine (global).

The first possibility is changing the parent of a node (or a set of nodes at one level). The list of all the relevant parent nodes can be restricted further by applying other constraints. The final list contains the intersection between the source of the constraints and its former content. When the operation of changing the parent of a set of nodes is performed, all compatible parent constraints are applied.

The second possibility is running the Constraint Engine. It works in the following way. First, the targets are selected (by their tag names and the XPath restriction). Then the source is compiled. If there is more than one choice, the user is asked to select one option from a list. If there is exactly one choice, it is automatically inserted as the parent of the target.

The source list of each constraint must contain only tag names. All tokens in the list are ignored.

    2. All Children Constraint
        This type of value constraint limits the names of a node's children and the content of its text children. All children that are tags must have names coinciding with the name of some node from the source list. Then all the data in the text children is tokenized and a list A of tokens is formed. After that all the data in the text nodes in the source list is tokenized and a list B of tokens is formed. For every token in A there must exist a token in B such that the values (not the categories) of the two tokens are equal. This type of value constraint is applied automatically during the validation of a document.
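The check described for the All Children Constraint can be sketched as follows. The assumptions are loud ones: whitespace splitting stands in for a real CLaRK tokenizer, the source list is passed in as a set of tag names plus a list of source text values, and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def tokens(text):
    # Assumption: whitespace splitting stands in for a CLaRK tokenizer.
    return (text or "").split()


def text_child_tokens(node):
    """List A: tokens from the text children of `node`."""
    found = tokens(node.text)             # text before the first child tag
    for child in node:
        found.extend(tokens(child.tail))  # text following each child tag
    return found


def all_children_ok(target, source_tags, source_texts):
    """True if every child tag and every text token occurs in the source."""
    list_b = {tok for text in source_texts for tok in tokens(text)}
    tags_ok = all(child.tag in source_tags for child in target)
    tokens_ok = all(tok in list_b for tok in text_child_tokens(target))
    return tags_ok and tokens_ok


good = ET.fromstring("<np>the <n/> big</np>")
bad = ET.fromstring("<np>huge <v/></np>")
print(all_children_ok(good, {"n", "adj"}, ["the big small"]))  # prints True
print(all_children_ok(bad, {"n", "adj"}, ["the big small"]))   # prints False
```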

    3. Some Children Constraint
        This is a special type of a value constraint, because its main task is not to set limits on the node's content. Instead, it is used for a value restriction when the operation inserting a child in a node is performed. This constraint type is not applied each time a new node is inserted. These constraints are used separately. Here the target node is the node where the insertion takes place. The constraint is blocked when:
            - there is a child of the target node that is a tag and there is a node in the source list, such that both nodes have the same name.
            - there is a text node in the target node that has a token whose value equals the value of a token in the source list.
        To sum up, when there is a non-empty intersection between the source list and the target node's content, the constraint is satisfied and there is nothing more to be done. In cases when the source list is empty and there is content for the target node, an error message is shown. If the target content is also empty, then the constraint is satisfied.
        When the source list is not empty and there is no intersection with the target's content, the user is offered a list with the possible values from the source list for the target node. The user can choose one item to add. If the item is a token and the target node already has some text content, the new value is appended to it with a comma as a separator.
    4. Some Attributes Constraint
        This constraint is very similar to the previous one. The only difference is that the target here is an attribute of a node. Also the target selection includes a selection of an attribute defined in the DTD for the selected tag name.
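The decision logic of the Some Children Constraint described above can be sketched as a small function. This is an illustration only: items are modelled as hypothetical (kind, value) pairs so that a tag name and a token value never collide, the outcome labels are invented for the sketch, and the comma is assumed to be inserted without surrounding spaces.

```python
def some_children_outcome(target_items, source_items):
    """Return 'satisfied', 'error' or 'choose' for one target node.

    Items are (kind, value) pairs, e.g. ("tag", "n") or ("token", "big").
    """
    target, source = set(target_items), set(source_items)
    if target & source:
        return "satisfied"        # non-empty intersection: nothing to do
    if not source:
        # Empty source list: an error only if the target has content.
        return "error" if target else "satisfied"
    return "choose"               # offer the source values to the user


def append_token(existing_text, token):
    """Append a chosen token to existing text content, comma-separated."""
    return f"{existing_text},{token}" if existing_text else token


print(some_children_outcome([("tag", "n")], [("tag", "n"), ("token", "x")]))
# prints satisfied
print(some_children_outcome([("token", "a")], []))   # prints error
print(some_children_outcome([], [("token", "x")]))   # prints choose
print(append_token("a", "b"))                        # prints a,b
```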

The following screen shot is the dialog window of the value constraints editor:

VConst.gif (19423 bytes)

The dialog is separated into 5 sections:

  • Constraint Info - here the user gives a short name of the constraint (free text) which is obligatory. Optionally some additional descriptions can be written in the second text box.
  • Constraint Type - this is a list of four elements where the user specifies the type of the constraint. The options are: Parent, All Children, Some Children, Some Attributes (described before). The two checkboxes on the right can activate some runtime information as follows:
        - Show status before - indicates the number of the target nodes the constraint is to be applied to, i.e. before the real application;
        - Show status after - indicates the number of the target nodes the constraint has already been applied to, i.e. after the real application.
    The check box "Prompt for save on each ... times:" and the text field next to it are used for making backups of the current state of the document while applying the constraints. In order to use this option, the check box must be marked and a number must be entered in the text field. It indicates the number of successful applications after which the system reminds the user to save the document.
  • Target Node - here the description of the target nodes for the constraint is given. The first field defines the name of the target node(s). The field itself is a sorted list of all tag names defined in the DTD. The second field is the place where the XPath restriction is specified. It is evaluated in turn on every initially selected node as the context node. The function of these two fields can be represented by an XPath query: /descendant-or-self::ta/self::*[not(child::*)] (for the picture above). The bolded parts represent the text that comes from the two text fields. The third field (disabled in the picture) is used when the target of the constraint is an attribute (Some Attributes). It gives the list of all attributes defined in the DTD for the element chosen in the first field.
  • Constraint Source - this section defines the source list for the constraint. The content of the text field is either an XPath expression or XML markup, depending on the radio button currently selected for the source type. If the source type is 'XML Mark-up', then the content of the text field is XML. Otherwise it must be an XPath expression. If the selected type is 'Local Document', then the XPath expression is evaluated with each target node as the context node. If the type is 'External Document', then the choice box is enabled and the user is expected to choose a document. The XPath expression is evaluated on this document with the root node as the context. In the latter case the XPath expression is expected to be absolute.
  • Tokenization & Help section - here a tokenizer can be activated for the constraint or it can be blocked in order to treat the text nodes as a whole and not as a set of tokens. Also a filter can be set in order to exclude some "garbage" categories, such as separators, from the source list. Another restriction can be set here by defining the token value and the category templates. The templates are defined in the same way as those in the CLaRK System grammars (using the @ and # symbols for wildcards). Another facility available here is the Help Document. While listing the different choices, the user can get brief information about the meaning of each choice. This information must be stored in an internal document. Its structure is described in a DTD in the file: helpFile.dtd. The information about a given choice appears in the status bar of the editor when the mouse pointer is over the choice.
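The way the two Target Node fields combine into a single query can be sketched with a one-line helper. The sketch assumes that, in the query quoted in the Target Node description, the tag name field supplied ta and the XPath restriction field supplied self::*[not(child::*)]; the function name target_query is hypothetical.

```python
def target_query(tag_name, restriction):
    """Combine the tag name field and the XPath restriction field."""
    return f"/descendant-or-self::{tag_name}/{restriction}"


print(target_query("ta", "self::*[not(child::*)]"))
# prints /descendant-or-self::ta/self::*[not(child::*)]
```

Read this way, the restriction returns the context node itself when it has no element children and an empty list otherwise, which matches the rule that a target is kept only when the restriction evaluates to a non-empty list.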

This section has given a short description of the Constraint Editor. It is invoked whenever a Value Constraint needs to be changed or a new constraint is to be defined. Value Constraint management is handled by the following Constraint Manager dialog window:

VConst_m.gif (22410 bytes)

Within the CLaRK System this module can be invoked from the menu: Constraints/Value constraints. The user is asked to choose the DTD according to which the Value Constraints are to be applied. Then the dialog window from the picture above appears.

The Constraint Manager shows a table of all constraints defined for the current DTD (if any). Each constraint is represented as one row in the table. The ordering in the table is important only if the constraints depend on each other. The constraint in the first row is applied first, then the second constraint, and so on. The ordering can be changed with the two buttons on the right side: "Move Up" and "Move Down". They swap the selected row with its neighbour above or below.

Sometimes it is useful to deactivate some of the constraints temporarily. This is done by clearing the check box on the constraint row. For example, in the picture above the second constraint is deactivated.

The other buttons:

  • New Constraint - creates a new Value constraint by calling the Constraint Editor.
  • Edit Constraint - edits the selected Value constraint.
  • Remove Constraint - removes the selected Value constraint.
  • Load From File - loads Value constraints that were previously saved to a file, for example to restore a backup.
  • Save To File - saves the Value constraints in the current manager to a file in order to make backups.
  • Options - these options allow/disable the usage of certain types of constraints. This can be used as a filter.
  • Apply Constraints - applies all Value constraints that are currently activated for the current DTD. The settings from the Options apply here. This button is disabled if no document is open or if the current document has a different DTD from the constraints' DTD.
  • Done - saves the changes to the constraints (if any) and closes the dialog window.
  • Cancel - closes the dialog window without saving the changes on the constraints (if any).

Value Constraints Group

The constraints described so far work in the following way: the first constraint is applied to all targets, then the second one is applied, and so on. The constraint groups, however, work in a slightly different way. First, a context node is selected and then all constraints from the group are applied within this context. Each group contains three parts:

  • name - unique identifier of the group;
  • context - an XPath expression, selecting the context for the group;
  • list - a list of all value constraints included in the group;

The Value Constraint Group Manager:

VGroup.gif (8389 bytes)

The Group management includes: the creation of a new group, the modification and removal of an existing group, the application of a group of constraints. Each operation (except New Group) is preceded by selecting a group from the list.
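The difference between ungrouped and grouped application is only a difference in iteration order, which the following sketch makes explicit. All names and the toy data are hypothetical: TARGETS simply records which targets each constraint has inside each context node.

```python
# Hypothetical data: targets of each constraint inside each context node.
TARGETS = {
    ("c1", "s1"): ["a"], ("c2", "s1"): ["b"],
    ("c1", "s2"): ["c"], ("c2", "s2"): ["d"],
}
CONTEXTS = ["s1", "s2"]
CONSTRAINTS = ["c1", "c2"]


def targets_in(constraint, context):
    return TARGETS[(constraint, context)]


def sequential_order():
    """Ungrouped: finish one constraint on all its targets before the next."""
    return [(c, t) for c in CONSTRAINTS
            for ctx in CONTEXTS for t in targets_in(c, ctx)]


def group_order():
    """Grouped: pick a context node, then apply every constraint within it."""
    return [(c, t) for ctx in CONTEXTS
            for c in CONSTRAINTS for t in targets_in(c, ctx)]


print(sequential_order())
# prints [('c1', 'a'), ('c1', 'c'), ('c2', 'b'), ('c2', 'd')]
print(group_order())
# prints [('c1', 'a'), ('c2', 'b'), ('c1', 'c'), ('c2', 'd')]
```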

Number Constraints

This constraint type restricts the occurrences of some specific elements within the content of a document. The node specification is given by an XPath expression, which is evaluated with the root of the document as the context node. The evaluation of the expression produces a list of nodes. The number of entries in this list must lie between the MIN and MAX values in order to satisfy the constraint. Neither MIN nor MAX may be a negative number. Instead of specifying a MAX value, the user can write the character '*', which means positive infinity, i.e. no upper limit.

The Number Constraint Manager dialog:

NumConst.gif (6065 bytes)

In the example above, the first constraint has no upper limit. The fourth column is responsible for the activation/deactivation of the constraints. It is needed when the user would like to apply only a certain subset of the constraints. The active constraints are checked by pressing the Apply button. This button is disabled when there is no document in the editor. After applying the constraints, the user receives information about the number of the satisfied constraints and the number and type of the unsatisfied ones.
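The check performed by a single Number Constraint can be sketched as follows. The function names are hypothetical, and Python's xml.etree.ElementTree supports only a subset of XPath, so this is an illustration of the MIN/MAX rule rather than of the CLaRK XPath engine.

```python
import math
import xml.etree.ElementTree as ET


def parse_max(value):
    """'*' means positive infinity, i.e. no upper limit."""
    return math.inf if value == "*" else int(value)


def number_constraint_ok(root, xpath, minimum, maximum):
    """True if the number of selected nodes lies between MIN and MAX."""
    upper = parse_max(maximum)
    if minimum < 0 or upper < 0:
        raise ValueError("MIN and MAX must not be negative")
    count = len(root.findall(xpath))   # the root is the context node
    return minimum <= count <= upper


doc = ET.fromstring("<doc><p/><p/><p/></doc>")
print(number_constraint_ok(doc, "./p", 1, "*"))  # prints True
print(number_constraint_ok(doc, "./p", 0, "2"))  # prints False
```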
