Each Regular Expression Constraint (REC) consists of two obligatory parts and three
optional parts:
- Constraint name (obligatory) - a unique identifier for the constraint
in the system;
- Regular expression (obligatory) - a valid regular expression which
represents the constraint over the nodes' content;
- Default XPath expression - an XPath expression that selects the nodes
to be processed by the constraint. It appears as the default text in the
node selection field when the constraint is applied;
- Tokenizer - used when the constraint tests the content of text nodes.
If no tokenizer is specified, the processor uses the tokenizer
specified in the DTD of the current document;
- Filter - used to filter the tokenizer categories when the content of
text nodes is tested.
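The five parts listed above can be pictured as a small record. The following Python sketch is only an illustration: the class name, the field names and the method effective_tokenizer are hypothetical and are not taken from the system itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RegexConstraint:
    """Hypothetical record mirroring the five parts of a REC."""
    name: str                            # obligatory, unique in the system
    regex: str                           # obligatory, the constraint itself
    default_xpath: Optional[str] = None  # optional default node selection
    tokenizer: Optional[str] = None      # optional tokenizer name
    token_filter: Optional[str] = None   # optional filter name

    def effective_tokenizer(self, dtd_tokenizer: str) -> str:
        # When no tokenizer is given, fall back to the one named in the DTD.
        return self.tokenizer if self.tokenizer is not None else dtd_tokenizer


rec = RegexConstraint(name="np-check", regex="<np>,<vp>")
```

Only the name and the regular expression are required by the constructor, which matches the obligatory/optional split described above.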
This section covers the management of Regular Expression Constraints. Here
a REC can be created, modified, removed, saved to a file and loaded from a file. Here is a picture of the dialog
window:

The left side of the window is a table with all RECs in the system. The first
column contains the names of the constraints. The second one contains the regular
expression of each constraint. After selecting a row in this table, the user can
manipulate the selected constraint with the buttons on the right.
Description of the buttons on the right:
- New - creates a new Regular Expression Constraint. Pressing the "New"
button opens a new constraint editor window
(for more details, see below);
- Edit - opens the currently selected constraint for editing in a new
editor window;
- Remove - removes the constraint currently selected in the table. The
removal is preceded by a confirmation message;
- OK - commits the current changes to the constraints and closes the
manager window;
- Cancel - closes the manager window without saving the changes (if any)
to the constraints;
- Save To File - serializes all RECs into an external file in
XML format. This function serves two main purposes: backups and interaction with
external applications. The structure of the output XML file is described by the DTD in
the file: regConstraint.dtd;
- Load From File - loads RECs from an external file. The external
file must be an XML document that is valid with respect to the DTD in the file: regConstraint.dtd.
Regular Expression Constraint Editor
Here is the interface view of the editor window for the REC:

The last three fields are optional. The tokenizer and filter lists contain all
tokenizers and filters defined in the system. The regular expression may consist of tags,
token categories, token values and token value templates.
Regular Expression Constraints can be applied in two ways:
- by selecting a node from the tree panel and choosing a constraint;
- by selecting a set of nodes with an XPath expression and then applying a
constraint to each of them.
Here we describe the latter case. The user chooses 'Apply Regular Expression
Constraints' from the menu Constraints/Regular Expression Constraints. Then the
following dialog window appears:

The first input field, Select nodes, contains the XPath expression which is
evaluated in order to select the nodes for the constraint. If a default XPath expression is specified
for the constraint, it appears in this field as the default text.
The second field selects, by name, the constraint to be applied.
The last two fields are enabled when the tokenizer and filter of the constraint are to be ignored
and new ones have to be defined explicitly.
When the Apply button is pressed, the XPath expression is evaluated and a set of nodes is
selected. The constraint is then applied to each of them. If the node's content satisfies
the constraint, the node is marked as Valid. Otherwise it is marked as Non
Valid. In this way two groups of nodes are formed and each of them can be observed
separately. Here is a picture of the navigation panel window:

Here the user can switch the group under observation with the two radio buttons.
The Next and Previous buttons move the current
selection in the editor. At the top of the window there is some information about the
constraint and the nodes which satisfy or do not satisfy it.
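The application step for Regular Expression Constraints (evaluate the XPath, test each selected node, split the nodes into a Valid and a Non Valid group) can be sketched in Python as follows. This is a simplified illustration and not the system's own engine: it assumes that a node's content is modelled as the space-separated names of its child tags, it uses Python's re syntax instead of the system's regular expression syntax, and the function name apply_constraint and the sample document are hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def apply_constraint(root, xpath, pattern):
    """Split the nodes selected by xpath into (valid, non_valid) lists."""
    regex = re.compile(pattern)
    valid, non_valid = [], []
    for node in root.findall(xpath):
        # Assumption: the content is the sequence of child tag names.
        content = " ".join(child.tag for child in node)
        if regex.fullmatch(content):
            valid.append(node)
        else:
            non_valid.append(node)
    return valid, non_valid


doc = ET.fromstring(
    "<text><s><np/><vp/></s><s><vp/></s><s><np/><vp/><pp/></s></text>"
)
valid, non_valid = apply_constraint(doc, ".//s", r"np vp( pp)?")
```

The two returned lists correspond to the two groups that the navigation panel lets the user browse separately.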
The constraint engine is a means of setting restrictions on the content of nodes in an
XML document that cannot be expressed by the DTD. Each value constraint in the system
is attached to a DTD.
In general, a value constraint consists of two parts: a target section and a source
section.
Target Section
This part describes the
nodes to which the constraint will be applied. Initially, the nodes are selected by their
tag name and then a further restriction is made by an XPath expression. In this way, using
the expressive power of the XPath language, a context dependency can be expressed. The
target nodes are evaluated in the following way. First, from all the elements
in the current document only the nodes with the specified tag name are retrieved. Then, for
each of them, the XPath expression is evaluated with that node as the context node. If the
XPath expression evaluates to a non-empty list, the context node is included in
the set of nodes to which the constraint is to be applied. Otherwise, it is excluded.
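The two-step target selection described above can be sketched in Python. The sketch relies on the standard xml.etree.ElementTree module, which supports only a limited subset of XPath, so it is an approximation of the behaviour and not the system's implementation. The function name select_targets and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def select_targets(root, tag_name, xpath_restriction):
    """Return the elements named tag_name for which the restriction
    evaluates to a non-empty node list."""
    targets = []
    # Step 1: retrieve every element with the given tag name.
    for node in root.iter(tag_name):
        # Step 2: evaluate the restriction with this node as the context.
        if node.findall(xpath_restriction):
            targets.append(node)
    return targets


doc = ET.fromstring("<doc><w><m/></w><w/><p><w><m/></w></p></doc>")
targets = select_targets(doc, "w", "./m")
```

Here the second w element is excluded because the restriction ./m selects nothing when that element is the context node.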
Source Section
This part defines the possible values for the target nodes
(selected by the previous section). The possible values are tag names and
tokens, depending on the type of the constraint. The source list can be selected by an
XPath expression or by typing the choices explicitly as XML markup. If the selection is
made by a relative XPath expression, the current target node is taken as the context
node for the constraint. If a text node is selected as a source, its text value is
tokenized and the tokens, rather than the node itself, are added to the source list. The
source for the constraint can also be an external document. The only requirements
in such cases are that the external document is in the internal database
of the system and that the
XPath expression is not relative.
There are four types of value constraints currently supported by the system. They are
distinguished by their target and the way they are used. Each type of value
constraint is described separately below:
1. Parent Constraint
This type of value constraint sets limits on the
possible parents of a node. There are two ways of applying this constraint type: by
changing the parent of a node (local) or by explicitly running the constraint engine (global).
The first possibility is changing the parent of a node (or of a set of nodes
at one level). The list of all the relevant parent nodes can be restricted further by applying
other constraints. The final list contains the intersection between the source of the constraints
and its former content. When the operation of changing the parent of a set of nodes is performed,
all compatible parent constraints are applied.
The second possibility is running the Constraint Engine. It works in the following way. First,
the targets are
selected (by their tag names and the XPath restriction). Then the source is compiled. If
there is more than one choice, the user is asked to select one option from a list. If there is
exactly one choice, it is automatically inserted as the parent of the target.
The source list of each such constraint must contain only tag names. All tokens in the list
are ignored.
2. All Children Constraint
This type of value constraint sets limits on
the names of a node's children and the content of its text children. All children that are
tags must have names coinciding with the name of some node from the source list. Then all the
data in the text children is tokenized and a list A of tokens is formed. After that, all the data in
the text nodes in the source list is tokenized and a list B of tokens is formed. For every
token in A there must exist a token in B such that the values (not the categories) of the two tokens
are equal. This type of value constraint is applied automatically during the validation of a
document.
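The All Children check can be sketched in Python as follows. The sketch makes several simplifying assumptions that are not part of the system: tokenization is plain whitespace splitting, token categories are left out, and the source list is passed as a set of tag names plus a list of text values. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def tokenize(text):
    # Assumption: whitespace splitting stands in for a real tokenizer.
    return text.split()


def text_chunks(node):
    """Text of the text children of node (ElementTree keeps it in
    node.text and in the tail of each child element)."""
    return [node.text or ""] + [child.tail or "" for child in node]


def satisfies_all_children(target, source_tags, source_texts):
    # Every child that is a tag must be named in the source list.
    if any(child.tag not in source_tags for child in target):
        return False
    # List A: tokens of the target's text children.
    list_a = [tok for chunk in text_chunks(target) for tok in tokenize(chunk)]
    # List B: tokens of the text nodes in the source list.
    list_b = {tok for text in source_texts for tok in tokenize(text)}
    # Every token value in A must also occur in B.
    return all(tok in list_b for tok in list_a)


ok = satisfies_all_children(
    ET.fromstring("<pos>noun <g/> sg</pos>"), {"g"}, ["noun verb", "sg pl"]
)
bad_token = satisfies_all_children(
    ET.fromstring("<pos>adj</pos>"), {"g"}, ["noun verb", "sg pl"]
)
bad_tag = satisfies_all_children(
    ET.fromstring("<pos><x/></pos>"), {"g"}, ["noun verb", "sg pl"]
)
```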
3. Some Children Constraint
This is a special type of value
constraint, because its main task is not to set limits on the node's content. Instead, it is used
for a value restriction when the operation of inserting a child in a node is
performed. This constraint type is not applied each time a new node is inserted. These
constraints are used separately. Here the target node is the node where the insertion
takes place. The constraint is blocked when:
- there is a child of
the target node that is a tag and there is a node in the source list such that both nodes
have the same name;
- there is a text node
in the target node that has a token whose value equals the value of a token in the
source list.
To sum up, when there is a non-empty
intersection between the source list and the target node's content, the constraint is
satisfied and there is nothing more to be done. When the source list is empty and the
target node has content, an error message is shown. If the target content is also
empty, the constraint is satisfied.
When the source list is not empty and there is
no intersection with the target's content, the user is offered a list of the possible
values from the source list for the target node. The user can choose one item to add. If
the item is a token and the target node already has some text content, the new value is
appended to it with a comma as a separator.
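The decision logic of the Some Children Constraint can be summarized in a short Python sketch. It is only an illustration under assumptions: the target content and the source list are given as plain lists of tag names and token values, the outcome labels "satisfied", "error" and "choose" are invented for this example, and the separator is assumed to be a bare comma without a following space.

```python
def some_children_outcome(target_tags, target_tokens, source_tags, source_tokens):
    """Return (outcome, choices) for one target node."""
    has_content = bool(target_tags or target_tokens)
    has_source = bool(source_tags or source_tokens)
    shared_tags = set(target_tags) & set(source_tags)
    shared_tokens = set(target_tokens) & set(source_tokens)
    if shared_tags or shared_tokens:
        # Non-empty intersection: nothing more to be done.
        return "satisfied", []
    if not has_source:
        # Empty source: an error only if the target has content.
        return ("error" if has_content else "satisfied"), []
    # Non-empty source without intersection: offer the source values.
    return "choose", sorted(source_tags) + sorted(source_tokens)


def append_token(existing_text, token):
    # A chosen token is appended to existing text with a comma separator.
    return f"{existing_text},{token}" if existing_text else token


outcome = some_children_outcome(["c"], [], ["a"], ["t"])
```

In the last call the target has a child c that does not occur in the source list, so the user would be offered the values a and t.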
4. Some Attributes Constraint
This constraint is very similar to the
previous one. The only difference is that the target here is an attribute of a node. Also
the target selection includes a selection of an attribute defined in the DTD for the
selected tag name.
The following screenshot shows the dialog window of the value constraints editor:

The dialog is separated into 5 sections:
- Constraint Info - here the user gives a short name for the constraint
(free text), which is obligatory. Optionally, an additional description can be written in the
second text box.
- Constraint Type - this is a list of four elements where the user
specifies the type of the constraint. The options are: Parent, All
Children, Some Children, Some Attributes
(described above). The two checkboxes on the right can activate some
runtime information as follows:
- Show status before - indicates
the number of target nodes the constraint is to be applied to, i.e. before the actual application;
- Show status after - indicates the number of
target nodes the constraint has already been applied to, i.e. after the actual application.
The check box "Prompt for save on each ... times:" and the text field next to it
are used for making backups of the current state of the document while applying the
constraints. To use this option, the check box must be marked and a number must be
entered in the text field. The number indicates how many successful applications
must take place before
the system reminds the user to save the document.
- Target Node - here the description of the target nodes for
the constraint is given. The first field defines the name of the target node(s). The field itself
is a sorted list of all tag names defined in the DTD. The second field is the
place where the XPath restriction is specified. It is evaluated in turn on every
initially selected node as the context node. The function of these two fields can
be represented by an XPath query: /descendant-or-self::ta/self::*[not(child::*)]
(for the picture above). In this query, the tag name ta and the predicate not(child::*)
are the parts which come from the two
text fields. The third field (disabled in the picture) is used when the target of the
constraint is an attribute (Some Attributes). It gives the list of all attributes defined
in the DTD for the element chosen in the first field.
- Constraint Source - this section defines the source list for the
constraint. The content of the text field is either an XPath expression or XML markup, depending
on the radio button currently selected for the source type. If the source type is 'XML
Mark-up', the content of the text field is XML. Otherwise it must be an XPath
expression. If the selected type is 'Local Document', the XPath expression
is evaluated with each target node as the context node. If the type is 'External Document',
the choice box is enabled and the user is expected to choose a document. The XPath
expression is evaluated on this document with the root node as the context. In
the latter case the XPath expression is expected to be absolute.
- Tokenization & Help section - here a tokenizer can be activated for the
constraint, or it can be blocked so that text nodes are treated as a whole rather than
as a set of tokens. A filter can also be set in order to exclude "garbage"
categories, such as separators, from the source list. Another restriction can be set
here by defining token value and category templates. The templates are defined in the same
way as those in the CLaRK System grammars (using the @ and # symbols as wildcards). Another
facility available here is the Help Document. With this option,
while the different choices are listed, the user can get brief information about
the meaning of each choice. This information must be stored in an internal document. Its
structure is described by the DTD in the file: helpFile.dtd. The information about a given
choice appears in the status bar of the editor when the mouse pointer is over the choice.
This section gave a short description of the Constraint Editor. It is invoked
whenever a Value Constraint needs to be changed or a new constraint is to be defined. The Value
Constraint management is handled by the following Constraint Manager
dialog window:

Within the CLaRK System this module can be invoked from the menu: Constraints/Value
constraints. The user is asked to choose the DTD according to which the Value Constraints
are to be applied. Then the dialog window from the picture above appears.
The Constraint Manager shows a table of all constraints defined
for the current DTD (if any). Each constraint is represented as one row in the table. The
ordering in the table matters only if the constraints depend on each other. The constraint in
the first row is applied first, then the second constraint is applied, and so on. The
ordering can be changed with the two buttons on the right side: "Move Up"
and "Move Down". They swap the position of the selected row
with its neighbour above or below.
Sometimes it is useful to deactivate some of the constraints temporarily. This is
done by clearing the check box in the constraint's row. For example, in the
picture above the second
constraint is deactivated.
The other buttons:
- New Constraint - creates a new Value constraint by calling the
Constraint Editor.
- Edit Constraint - edits the selected Value constraint.
- Remove Constraint - removes the selected Value constraint.
- Load From File - loads Value constraints which were saved before,
for example from a backup.
- Save To File - saves the Value constraints in the current manager to
a file in order to make backups.
- Options - these options enable or disable the usage of certain types of
constraints. This can be used as a filter.
- Apply Constraints - applies all Value constraints which are activated
at the moment for the current DTD. The settings from the options apply here. This button is
disabled if no document is open or the current document has a different DTD
from the constraints' DTD.
- Done - closes the dialog window and saves the changes to the
constraints (if any).
- Cancel - closes the dialog window without saving the changes to the
constraints (if any).
Value Constraints Group
The constraints described so far work in the following way: the first
constraint is applied to all targets, then the second one is applied, and
so on. Constraint groups, however, work in a slightly different way. First, a context node
is selected, and then all constraints from the group are applied within this context. Each group
contains three parts:
- name - unique identifier of the group;
- context - an XPath expression, selecting the context for the group;
- list - a list of all value constraints included in the group.
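The three parts of a group and the order in which a group is applied can be sketched in Python. This is a hypothetical illustration: the class ConstraintGroup, the function application_order, the id attribute and the constraint names c1 and c2 are invented for the example, and constraints are represented only by their names.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


@dataclass
class ConstraintGroup:
    name: str            # unique identifier of the group
    context_xpath: str   # XPath selecting the context nodes
    constraints: list = field(default_factory=list)  # member constraints


def application_order(root, group):
    """List (context id, constraint name) pairs in the order of application:
    all constraints of the group run within one context before the next
    context is visited."""
    order = []
    for context in root.findall(group.context_xpath):
        for constraint_name in group.constraints:
            order.append((context.get("id"), constraint_name))
    return order


doc = ET.fromstring('<doc><s id="s1"/><s id="s2"/></doc>')
group = ConstraintGroup("demo", "./s", ["c1", "c2"])
order = application_order(doc, group)
```

For ungrouped constraints the two loops would be nested the other way round: the first constraint would visit every target before the second constraint starts.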
The Value Constraint Group Manager:

Group management includes the creation of a new group, the modification and removal of
an existing group, and the application of a group of constraints. Each operation (except New Group) is
preceded by selecting a group from the list.
This constraint type restricts the number of occurrences of some specific elements within the
content of a document. The nodes are specified by an XPath expression, which is
evaluated with the root of the document as the context node. The evaluation of the
expression produces a list of nodes. The number of entries in this list must
lie between the MIN and MAX values in order to satisfy the constraint. Neither MIN nor MAX may
be a negative number. Instead of specifying a MAX value, the user can write the character '*',
which means positive infinity, i.e. no upper limit.
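The counting check can be sketched in Python as follows. The sketch assumes that both bounds are inclusive and that they are entered as strings, as in a dialog field. The function names are hypothetical, and the standard xml.etree.ElementTree module supports only a subset of XPath.

```python
import math
import xml.etree.ElementTree as ET


def parse_bound(value, allow_infinity=False):
    """Convert a MIN or MAX field to a number; '*' means no upper limit."""
    if allow_infinity and value == "*":
        return math.inf
    bound = int(value)
    if bound < 0:
        raise ValueError("MIN and MAX must not be negative")
    return bound


def check_number_constraint(root, xpath, min_value, max_value):
    # Count the nodes selected from the document root.
    count = len(root.findall(xpath))
    low = parse_bound(min_value)
    high = parse_bound(max_value, allow_infinity=True)
    # Assumption: both bounds are inclusive.
    return low <= count <= high


doc = ET.fromstring("<text><s/><s/><s/></text>")
unbounded_ok = check_number_constraint(doc, ".//s", "1", "*")
too_many = check_number_constraint(doc, ".//s", "0", "2")
```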
The Number Constraint Manager dialog:

In the example above, the first constraint has no upper limit. The fourth column
controls the activation/deactivation of the constraints. It is useful when the user
would like to apply only a certain subset of the constraints. The active constraints
are checked by
pressing the Apply button. This button is disabled when there is no document in
the editor. After applying the constraints, the user receives information about
the number of satisfied constraints and the number and type of the unsatisfied ones.