What's New in CLaRK System Version 3.0
(since version 2.0)
This is a general overview of the new features added to the CLaRK System in version 3.0 since
the previous major release, version 2.0.
The new features are grouped by their nature into the following five categories:
- New Implemented Tools:
- XPath Debugger - a facility for tracing the evaluation of XPath expressions step by step in a
graphical environment;
- Graphical Tree View - an XPath-based means of drawing tree structures encoded in XML;
- MultiQueryToolEx - a new strategy for combining different tools in the system;
- Tree Intersection Tool - a tool for comparing tree structures and extracting the most specific
substructure common to all input data. The user can guide the expansion of the intersection by
approving or rejecting substructures;
- Cross-document synchronization - a means of navigating in parallel through the documents opened
in the system editor. The synchronization is symmetric for each document with respect to the others
and is based on structural equivalence;
- Synchronization rules - a means of linking certain parts of one (source) document with parts of
other (target) document(s), on the basis of XPath conditions and XPath pointers. The links here are
asymmetric with respect to their endpoints;
- Further Extended Tools:
- Entity Converters - the tool has joined the MultiQuery family, i.e. different types of entity
conversions can be applied to a set of (internal) documents, and conversions can be included in
more complex processing procedures;
- New XPath Transformations - a completely reworked tool with many new transformation abilities.
The tool is adapted to the MultiQuery architecture;
- Multi Import manager - the new version offers more flexible means of managing the import of many
documents into the Internal Documents database;
- MultiQuery Tool extensions - Conditions - the tool introduces a new kind of control operator that
checks the quality of the intermediate results and, in case of conflict, makes the system try to
find a better decision (backtrack);
- Environment:
- Internal Documents Database - a new kind of document repository is introduced: corpora. More
functionality for document management has been added;
- New Grammar Manager architecture - inspired by the new architecture for internal documents
storage, the storage of grammar definitions is organized similarly;
- New Value Constraints management system - the corpora architecture is adopted here for managing
definitions. The notion of a Constraints Group now follows naturally from the hierarchical
organization of the definitions repository;
- Import/Export system definitions - this functionality allows system definitions (DTDs,
tokenizers, filters, etc.) to be transferred from one system to another;
- Editor:
- Multi tree support in the tree panel - the system editor can now show more than one tree view
simultaneously, which facilitates parallel traversal of documents;
- Navi Toolbar - provides Back and Forward functions like those of an Internet browser, making it
easier to switch between documents;
- XPath Search Query as shortcut - XPath search queries can now be assigned to different key
combinations, which makes it easier to check different XPath conditions for selected nodes in the
opened document;
- Implemented typing undo in text area - the system for restoring previous structure events now
also covers typing events;
- Processing Optimizations:
- Document Indexing - the system offers a new speed optimization for XPath searches in documents
for tokens (partial or full) and XML structures. A partial query string is a regular expression
that may contain wildcards;
- Speed and memory optimised grammar dictionary compilation - the system has an optimised module
for compiling acyclic grammar definitions (dictionaries, lists of items, etc.). The module
automatically detects the presence of acyclic rules;
- Negation in Grammar rule expressions syntax - the user can now express the rejection of
certain tokens, token categories or elements in the recognized structures.
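
To illustrate the idea behind the Document Indexing item above, here is a minimal Python sketch of a token index that answers full and partial (regular expression) queries. It is not the CLaRK implementation: the function names build_token_index, find_full and find_partial, the sample documents, and the decision to index only the text directly inside each element are all assumptions made for this example.

```python
import re
from collections import defaultdict
from xml.etree import ElementTree as ET


def build_token_index(documents):
    """Map each whitespace-separated token to the places where it occurs.

    `documents` maps a (hypothetical) document name to its XML text.
    A place is a (document name, element tag) pair. Only the text directly
    inside an element is indexed; tail text is ignored in this sketch.
    """
    index = defaultdict(set)
    for name, xml_text in documents.items():
        root = ET.fromstring(xml_text)
        for element in root.iter():
            for token in (element.text or "").split():
                index[token].add((name, element.tag))
    return index


def find_full(index, token):
    """Full query: a single dictionary lookup, no document scan."""
    return sorted(index.get(token, ()))


def find_partial(index, pattern):
    """Partial query: match a regular expression against the indexed
    vocabulary instead of scanning every document."""
    regex = re.compile(pattern)
    hits = set()
    for token, places in index.items():
        if regex.fullmatch(token):
            hits |= places
    return sorted(hits)


# Hypothetical sample data, not taken from any CLaRK corpus.
docs = {
    "a.xml": "<s><w>parsing</w><w>trees</w></s>",
    "b.xml": "<s><w>parser</w><w>output</w></s>",
}
index = build_token_index(docs)
```

With this index, find_full(index, "trees") consults only the dictionary, and find_partial(index, r"pars.*") tests the pattern against the distinct tokens rather than against the documents themselves, which is where the speed gain in such a scheme would come from.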
CLaRK Support Team (clark-support@bultreebank.org)