RELEvant VAlues geNeraTor








Sonia Bergamaschi, University of Modena and Reggio Emilia, Italy
Claudio Sartori, University of Bologna, Italy
Francesco Guerra, University of Modena and Reggio Emilia, Italy
Mirko Orsini, University of Modena and Reggio Emilia, Italy





  • The idea is that, in many cases, the values of an attribute domain can be clustered because they are strongly related through some kind of "hidden relationship".
  • By giving a name to each of these clusters, we can refer to a relevant value name that encompasses a set of values.


  • More formally, given a class/table C and one of its attributes At, a relevant value for At, denoted rvAt, is a pair

    rvAt = < rvnAt, valuesAt >

    where rvnAt is the name of the relevant value and valuesAt is the set of values it encompasses.
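
    The pair above maps naturally onto a small record type. The following is only a minimal sketch of that definition: the class name `RelevantValue`, the method `covers`, and the sample attribute values are all hypothetical and are not taken from the RELEVANT implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelevantValue:
    """A relevant value rvAt = <rvnAt, valuesAt> for one attribute At."""

    name: str          # rvnAt: the name chosen for the cluster
    values: frozenset  # valuesAt: the attribute values the name encompasses

    def covers(self, value: str) -> bool:
        """Return True if `value` belongs to this relevant value."""
        return value in self.values


# Hypothetical attribute "category" with made-up values.
rv = RelevantValue(
    name="Footwear",
    values=frozenset({"shoes", "boots", "sandals"}),
)
```

    With such a structure, a query interface could test `rv.covers("boots")` to decide whether a stored value falls under the name "Footwear".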



Publications about Relevant

We started researching this topic some years ago. Work on the Relevant idea and its implementation began in December 2005.





  • Our research is motivated by the idea that knowledge of the metadata describing a database (table names, attribute names and domains, ...) is often not enough for writing a query, especially in a data integration environment like MOMIS.

    • Integration puts together in the same global class a number of local semantically similar classes coming from different sources.
    • The name/description of a global class/global attribute is often generic, which significantly limits the effectiveness of querying.
    • Not knowing the values assumed by a global attribute may lead to meaningless, overly selective, or empty queries.
    • Knowing all the data collected in a global class is infeasible for a user: databases contain large amounts of data that a user cannot deal with.
  • A metadata structure derived from an analysis of the attribute extension could be of great help in overcoming this limitation. Such metadata represent synthesized extensional knowledge emerging from the attribute values. They are “relevant values” because they give users a synthetic description of the values of the attribute, representing its domain with a reduced number of values.





To generate relevant values, we have to address two issues:

  1. How can we cluster the values of the domain so that each relevant value groups a set of strongly related values?
    By means of data mining and clustering techniques, adapted to the problem at hand. The techniques take into account semantics extracted from:
    1. the syntax of the values: values related to the same object may have the same etymology and thus share a common root;
    2. the dominance, which discovers values that are more general than others;
    3. the lexical meaning w.r.t. WordNet, which identifies semantically related values expressed with different terminology.
  2. How can we choose the relevant value names?
    This will require the intervention of the designer, but we will provide an effective assistant.
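
The first two criteria above (shared root and dominance) can be illustrated with a toy sketch. This is not the RELEVANT algorithm: the four-character prefix used as a stand-in for a word root, the connected-components grouping, the token-containment test for dominance, and the sample values are all assumptions made for illustration. The WordNet-based criterion is left out because it needs an external lexical resource.

```python
from itertools import combinations


def root(token, length=4):
    """Crude stand-in for a word root: the first `length` lowercase characters."""
    return token.lower()[:length]


def roots(value):
    """Set of crude roots of the whitespace-separated tokens of a value."""
    return {root(t) for t in value.split()}


def cluster_by_shared_root(values):
    """Group values into connected components of the 'shares a root' relation."""
    parent = {v: v for v in values}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in combinations(values, 2):
        if roots(a) & roots(b):
            parent[find(a)] = find(b)

    clusters = {}
    for v in values:
        clusters.setdefault(find(v), set()).add(v)
    return list(clusters.values())


def dominates(general, specific):
    """Assumed dominance test: the tokens of `general` are a proper subset
    of the tokens of `specific`, so `general` is the broader value."""
    g = set(general.lower().split())
    s = set(specific.lower().split())
    return g < s


# Made-up attribute values, for illustration only.
values = ["Mechanics", "Mechanical engineering", "Chemistry", "Chemical plants"]
clusters = cluster_by_shared_root(values)
```

On these sample values the sketch yields two clusters, one for the values rooted in "mech" and one for those rooted in "chem". Under the assumed containment test, `dominates("engineering", "Mechanical engineering")` holds, which suggests how a more general value could be proposed to the designer as a candidate name for a cluster.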


Further information



