
Annotation and Mapping Discovery among Data Sources

Schema matching is the task of finding the semantic correspondences (mappings) between the elements of two schemata.

  • Approach: starting from the “hidden” meanings associated with schema labels (i.e. class and attribute names, also called terms), the MOMIS Data Integration system discovers lexical relationships among schema elements.
  • Lexical Annotation of schema labels is the explicit assignment of meanings with respect to a reference lexical thesaurus (such as WordNet).
    Manual annotation is tedious and does not scale, hence the need for automatic or semi-automatic annotation.
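
To illustrate how annotated meanings can lead to lexical relationships among schema elements, here is a minimal Python sketch. It is not the MOMIS implementation: the tiny hypernym table stands in for WordNet, the sense identifiers and labels are invented for the example, and the relationship names (SYN for synonym, BT for broader term, NT for narrower term) as well as the rule of following only direct hypernyms are assumptions made to keep the example short.

```python
from itertools import combinations

# Toy stand-in for a lexical thesaurus such as WordNet (invented sense ids).
# Each sense maps to its direct hypernym (broader sense), or None.
HYPERNYM = {
    "person.n.01": None,
    "worker.n.01": "person.n.01",
    "employee.n.01": "worker.n.01",
    "student.n.01": "person.n.01",
}

# Lexical annotation: each schema label is explicitly assigned one or more senses.
ANNOTATION = {
    "Employee": {"employee.n.01"},
    "Staff": {"employee.n.01"},
    "Worker": {"worker.n.01"},
    "Student": {"student.n.01"},
}


def lexical_relationship(a, b):
    """Return 'SYN', 'BT', 'NT' or None for the ordered pair of labels (a, b)."""
    senses_a, senses_b = ANNOTATION[a], ANNOTATION[b]
    if senses_a & senses_b:
        return "SYN"  # the labels share at least one meaning
    if any(HYPERNYM.get(s) in senses_a for s in senses_b):
        return "BT"   # a sense of a is the direct hypernym of a sense of b
    if any(HYPERNYM.get(s) in senses_b for s in senses_a):
        return "NT"   # a sense of b is the direct hypernym of a sense of a
    return None


def discover(labels):
    """Collect the relationships found between every pair of labels."""
    found = {}
    for a, b in combinations(sorted(labels), 2):
        rel = lexical_relationship(a, b)
        if rel is not None:
            found[(a, b)] = rel
    return found


relationships = discover(ANNOTATION)
```

With this toy data, "Employee" and "Staff" come out as synonyms, and both are narrower terms of "Worker"; "Student" shares only an indirect ancestor with the others, so no relationship is reported for it.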

  • WSD (Word Sense Disambiguation) is the task of identifying the meaning of a word in its context by computational means.
    The semi-automatic CWSD (Combined Word Sense Disambiguation) method:
      1. associates one or more WordNet meanings with each label;
      2. combines two WSD algorithms: SD (Structural Disambiguation), which exploits the relationships derived from the schema, and WND (WordNet Domains Disambiguation), which exploits WordNet Domains.
  • Schema label normalization is the reduction of each label to a standardized form that can be easily recognized
        → abbreviation expansion and CN (Compound Noun) annotation
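
The normalization and disambiguation steps above can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the NORMS or CWSD code: the abbreviation dictionary, the sense identifiers, the domain labels and the relatedness table are all invented; compound nouns are only split into their constituent words rather than annotated; and combining the two algorithms by taking the union of their results, with a fallback to all candidate senses for later manual review, is an assumption, since the combination rule is not given here.

```python
import re
from collections import Counter

# Hypothetical abbreviation dictionary used for abbreviation expansion.
ABBREVIATIONS = {"qty": "quantity", "cust": "customer", "addr": "address"}

# Invented candidate senses: word -> {sense id: domain label}.
SENSES = {
    "order": {"order#commerce": "commerce", "order#military": "military"},
    "customer": {"customer#1": "commerce"},
    "quantity": {"quantity#1": "mathematics"},
}

# Invented lexical relation between senses, used by the structural step.
RELATED_SENSES = {("order#commerce", "customer#1")}


def tokenize(label):
    """Split a label on separators and camelCase boundaries, lowercased."""
    tokens = []
    for part in re.split(r"[_\-\s]+", label):
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [t.lower() for t in tokens]


def normalize(label):
    """Expand known abbreviations; more than one word signals a compound noun."""
    return [ABBREVIATIONS.get(t, t) for t in tokenize(label)]


def wnd(word, schema_words):
    """Domain-based step: keep senses whose domain is most frequent in the schema."""
    counts = Counter(d for w in schema_words for d in SENSES.get(w, {}).values())
    if not counts:
        return set()
    top = max(counts.values())
    prevalent = {d for d, c in counts.items() if c == top}
    return {s for s, d in SENSES.get(word, {}).items() if d in prevalent}


def sd(word, neighbours):
    """Structure-based step: keep senses related to a sense of a linked word."""
    chosen = set()
    for s in SENSES.get(word, {}):
        for n in neighbours:
            for t in SENSES.get(n, {}):
                if (s, t) in RELATED_SENSES or (t, s) in RELATED_SENSES:
                    chosen.add(s)
    return chosen


def cwsd(word, schema_words, neighbours):
    """Assumed combination: union of both steps, else all candidates for review."""
    chosen = sd(word, neighbours) | wnd(word, schema_words)
    return chosen or set(SENSES.get(word, {}))


# A class "ORDER" with attributes "cust" and "qty" (made-up schema).
words = [w for label in ["ORDER", "cust", "qty"] for w in normalize(label)]
order_senses = cwsd("order", words, neighbours=["customer"])
quantity_senses = cwsd("quantity", words, neighbours=["order"])
```

In this toy run the label "cust" is expanded to "customer" before disambiguation, which is what allows both steps to select the commercial sense of "order"; neither step decides for "quantity", so its only candidate sense is kept for review.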

  • For a detailed description, please see the PhD thesis of Serena Sorrentino and the PhD thesis of Laura Po.
  • Techniques are implemented in NORMS, a tool of the MOMIS-Datariver Data Integrator, developed within the FIT STARTUP project.
