
What's New in CLaRK System Version 3.0

(since version 2.0)




This is a general overview of the new features implemented in the CLaRK System (version 3.0) since the previous major release, version 2.0.

The new features are grouped by their nature into the following five categories:

  • Newly Implemented Tools:

    • XPath Debugger - a facility for tracing the evaluation of XPath expressions step by step in a graphical environment;
    • Graphical Tree View - an XPath-based means of drawing tree structures encoded in XML;
    • MultiQueryToolEx - a new strategy for combining different tools in the system;
    • Tree Intersection Tool - a tool for comparing tree structures and extracting the most specific substructure common to all input data. The user can guide the expansion of the intersection by approving or rejecting substructures;
    • Cross-document synchronization - a means of navigating the documents opened in the system editor in parallel. The synchronization is symmetric for each document with respect to the others and is based on structural equivalence;
    • Synchronization rules - a means of linking certain parts of one (source) document with parts of other (target) document(s), on the basis of XPath conditions and XPath pointers. These links are asymmetric with respect to their endpoints;
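
The idea behind the Tree Intersection Tool can be illustrated with a minimal sketch. This is not the CLaRK implementation: it assumes that two nodes match when their tag names are equal and that children are paired greedily in document order. The function names `intersect`, `intersect_all` and `shape` are hypothetical and serve only this illustration.

```python
import xml.etree.ElementTree as ET


def intersect(a, b):
    """Return the common top-down structure of two elements.

    Assumption for this sketch: nodes match when their tags are equal.
    Returns None when the root tags differ.
    """
    if a.tag != b.tag:
        return None
    common = ET.Element(a.tag)
    remaining = list(b)  # children of b that may still be paired
    for child_a in a:
        for i, child_b in enumerate(remaining):
            sub = intersect(child_a, child_b)
            if sub is not None:
                common.append(sub)
                # Keep document order: later matches must follow this one.
                del remaining[: i + 1]
                break
    return common


def intersect_all(trees):
    """Fold intersect() over a non-empty list of trees."""
    result = trees[0]
    for tree in trees[1:]:
        if result is None:
            break
        result = intersect(result, tree)
    return result


def shape(elem):
    """Nested-tuple view of a tree, convenient for comparing structures."""
    return (elem.tag, tuple(shape(child) for child in elem))
```

For example, intersecting `<S><NP><Det/><N/></NP><VP><V/><NP><N/></NP></VP></S>` with `<S><NP><N/></NP><VP><V/><PP/></VP></S>` under these assumptions keeps `S`, the `NP` containing `N`, and the `VP` containing `V`. Greedy pairing is a simplification; it does not always find the largest possible common substructure.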

  • Further Extended Tools:

    • Entity Converters - the tool has joined the MultiQuery family, i.e. different types of entity conversion can be applied to a set of (internal) documents, and conversions can be included in more complex processing procedures;
    • New XPath Transformations - a completely reworked tool with many new transformation capabilities. The tool is adapted to the MultiQuery architecture;
    • Multi Import manager - the new version offers more flexible means of managing the import of many documents into the Internal Documents database;
    • MultiQuery Tool extensions - Conditions - the tool introduces a new kind of control operator that checks the quality of the intermediate results and, in case of conflict, makes the system try to find a better solution (backtracking);
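
The condition-and-backtrack behaviour described for the MultiQuery Tool can be sketched roughly as follows. This is an illustration only and does not reflect the actual MultiQuery configuration format: it assumes that each processing step offers a list of alternative tools plus one condition on the intermediate result, and the names `run_with_backtracking`, `Tool`, `Condition` and `Step` are hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

Tool = Callable[[str], str]          # transforms an intermediate result
Condition = Callable[[str], bool]    # accepts or rejects that result
Step = Tuple[Sequence[Tool], Condition]


def run_with_backtracking(steps: Sequence[Step], data: str) -> Optional[str]:
    """Apply the steps in order; on a failed condition try the next
    alternative, and if none is left, return None so the previous step
    can try its own next alternative."""
    if not steps:
        return data
    alternatives, condition = steps[0]
    for tool in alternatives:
        result = tool(data)
        if not condition(result):
            continue  # conflict: reject this intermediate result
        final = run_with_backtracking(steps[1:], result)
        if final is not None:
            return final
    return None  # every alternative failed; the caller backtracks


# Step 1 has two alternatives; step 2 only accepts upper-case results.
steps = [
    ([str.lower, str.upper], lambda r: True),
    ([lambda s: s + "!"], lambda r: r.isupper()),
]
print(run_with_backtracking(steps, "Abc"))  # prints "ABC!"
```

In the example the first alternative of step 1 (`str.lower`) passes its own condition but leads to a conflict in step 2, so the sketch returns to step 1 and tries `str.upper` instead.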

  • Environment:

    • Internal Documents Database - a new kind of document repository is introduced: corpora. More functionality for document management has been added;
    • New Grammar Manager architecture - inspired by the new architecture for internal document storage, the storage of grammar definitions is organized in a similar way;
    • New Value Constraints management system - the corpora architecture is adopted here for managing definitions. The notion of a Constraints Group now follows naturally from the hierarchical organization of the definitions repository;
    • Import/Export system definitions - this functionality allows system definitions (DTDs, tokenizers, filters, etc.) to be transferred from one system to another;

  • Editor:

    • Multi-tree support in the tree panel - the system editor can now show more than one tree view simultaneously, which facilitates parallel traversal of documents;
    • Navi Toolbar - provides Internet browser-like Back and Forward functions, making it easier to switch between documents;
    • XPath Search Query as shortcut - XPath search queries can now be assigned to key combinations, which makes it easier to check different XPath conditions on selected nodes in the opened document;
    • Implemented typing undo in text area - the mechanism for undoing structural changes is now extended to include typing events;

  • Processing Optimizations:

    • Document Indexing - the system offers a new speed optimization for XPath searches in documents for tokens (partial or full) and XML structures. A partial query string is a regular expression which may contain wildcards;
    • Speed- and memory-optimised grammar dictionary compilation - the system has an optimised module for compiling acyclic grammar definitions (dictionaries, lists of items, etc.). The module automatically detects the presence of acyclic rules;
    • Negation in Grammar rule expression syntax - the user can now express the rejection of certain tokens, token categories or elements in the recognized structures.
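
To make the Document Indexing item more concrete, here is a small sketch of a token index that answers full and partial (wildcard) queries without rescanning the documents. It is not the CLaRK index: whitespace tokenization, the `*` and `?` wildcard syntax, and the names `build_index`, `wildcard_to_regex` and `search` are all assumptions made for this illustration.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each token to the set of (document name, position) pairs
    where it occurs. Tokens are split on whitespace (an assumption)."""
    index = defaultdict(set)
    for name, text in documents.items():
        for pos, token in enumerate(text.split()):
            index[token].add((name, pos))
    return index


def wildcard_to_regex(pattern):
    """Translate '*' (any run of characters) and '?' (any single
    character) into a compiled regular expression."""
    parts = [".*" if c == "*" else "." if c == "?" else re.escape(c)
             for c in pattern]
    return re.compile("".join(parts))


def search(index, pattern):
    """Return sorted (document, position) hits for tokens that match
    the whole pattern."""
    rx = wildcard_to_regex(pattern)
    hits = set()
    for token, postings in index.items():
        if rx.fullmatch(token):
            hits |= postings
    return sorted(hits)


docs = {"a": "the cat sat", "b": "a cart"}
idx = build_index(docs)
print(search(idx, "ca*"))  # prints [('a', 1), ('b', 1)]
```

The query is matched against the distinct tokens in the index rather than against the document text, which is where the speed-up in this sketch comes from.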

CLaRK Support Team
(clark-support@bultreebank.org)