
Big Data

The amount of data is growing exponentially.
To deal with this huge amount of information, and to extract its enormous hidden value, the DBgroup carries out research on data management, data analysis and data accessibility.

  1. Big Data Integration: making sense of big data scattered across multiple sources requires novel, scalable techniques. The DBgroup studies and develops cutting-edge tools that help data engineers and data scientists integrate such data easily and efficiently.
  2. Data Management, i.e., how to handle the huge amount of data: since the volume of the data to be analysed is extremely large, the DBgroup adopts cutting-edge technologies to manage Big Data (e.g. Apache Hadoop, Apache Spark, NoSQL/NewSQL DBMSs).
  3. Data Analysis, i.e., how to get valuable insight from the data, and how to extract information to drive the decision-making process: given the huge amount of data involved, traditional techniques for machine learning and, more generally, data analysis on “small” data are no longer applicable. Hence, the DBgroup focuses on developing new approaches that work in this context and integrate with the systems for Data Management.


  • [Nov '20] Giovanni Simonini received an honourable mention from the "Gruppo 2003 per la Ricerca Scientifica" for his research on "Schema-agnostic Progressive Entity Resolution" [news]
  • [Jun '20] DBGroup's master's student Luca Zecchini was runner-up (2nd prize) in the student programming contest at SIGMOD '20.
  • [Jun '20]  "Three-dimensional Entity Resolution with JedAI" accepted at Information Systems. [code]
  • [May '20]  "BLAST2: an Efficient Technique for Loose Schema Information Extraction from heterogeneous Big Data sources" accepted at ACM JIDQ.
  • [Jan '20]  "RulER: Scaling Up Record-level Matching Rules" accepted as a demo at EDBT 2020. [paper] [code]
  • [Jan '19] DBgroup is organizing the "Academy Big Data" [link to the program]


  • Digital Experience Platform: providing a better customer experience with Big Data and AI (with Doxee, funded by Regione Emilia Romagna)
  • Smart monitoring of Local Energy Community (funded by, and in collaboration with, ENEA)

Recent Talks

  • RulER was presented at EDBT 2020, held as an online conference due to COVID-19. [paper] [code]
  • Prof. Sonia Bergamaschi, Dr. Luca Gagliardelli and Dr. Giovanni Simonini participated in SEBD 2020 to present our latest work, "Scaling Up Record-level Matching Rules".

Ongoing Projects & Collaborations
