Big Data
The amount of data is growing exponentially.
To handle this huge amount of information and extract its hidden value, the DBGroup carries out research on data management, data analysis, and data accessibility.
- Big Data Integration: making sense of big data scattered across multiple sources requires novel, scalable techniques. The DBGroup studies and develops tools that help data engineers and data scientists integrate such data easily and efficiently.
- Big Data Management, i.e., how to handle the huge amount of data: because the volume of data to be analysed is extremely large, the DBGroup adopts state-of-the-art technologies for managing Big Data (e.g., Apache Hadoop, Apache Spark, NoSQL/NewSQL DBMSs).
- Big Data Analysis, i.e., how to gain valuable insight from the data and extract information to drive the decision-making process: given the huge amount of data involved, traditional machine learning techniques and, more generally, data analysis techniques designed for small data are no longer applicable. Hence, the DBGroup focuses on developing new approaches that work in this context and integrate with the systems for Data Management.
The DBGroup is part of the international research movement that proposes a new perspective on the use of Machine Learning (ML): MLOps. The goal of MLOps is to make high-quality data available through all stages of the ML project lifecycle. MLOps tools are needed to make Data-Driven AI an efficient and systematic process.
Theory, techniques and tools to deal with big data are taught in the courses held by the DBGroup for the Master's Degree in Computer Engineering.
Higher Training Courses
- Master in Development, Manufacturing and Authorization of Biopharmaceuticals organized by the Department of Life Sciences of the University of Modena and Reggio Emilia (2022)
- Master in Artificial Intelligence and Telehealth organized by the University of Parma (2022)
- Academy Big Data (2019)
- Tools and techniques for massive data analysis promoted by CINECA (2016)
Recent Talks
- Prof. Sonia Bergamaschi and Luca Zecchini presented the contribution "Big Data Integration for Data-Centric AI", describing the research activities carried out by the DBGroup, at ItaData 2022 in Milan on September 20-21, 2022. [slides]
- Dr. Giovanni Simonini and Dr. Luca Gagliardelli presented our research papers "Entity Resolution On-Demand" and "Generalized Supervised Meta-blocking" at VLDB 2022, held in Sydney on September 5-9, 2022.
- Dr. Luca Gagliardelli presented our contribution "ECDP: A Big Data Platform for the Smart Monitoring of Local Energy Communities" at the DataPlat workshop at EDBT/ICDT 2022 on March 29, 2022.
- Prof. Sonia Bergamaschi presented our contribution "Big Data Integration & Data-Centric AI for eHealth", describing the research activities carried out by the DBGroup in this area, at Ital-IA 2022, on February 10, 2022. [paper] [slides]
- Prof. Sonia Bergamaschi gave a talk entitled "Big Data Integration for e-Health" at the Data4SmartHealth workshop in Bolzano on October 27, 2021. [slides]
- Prof. Sonia Bergamaschi gave a talk entitled "Big Data & Cognitive Computing: Challenges and Opportunities for Data Driven Economies" at Pulsar Event in Formigine on October 1, 2021. [slides]
- RulER was presented at EDBT 2020, held online due to COVID-19. [paper] [code]
- Prof. Sonia Bergamaschi, Dr. Luca Gagliardelli and Dr. Giovanni Simonini participated in SEBD 2020 to present our latest work, "Scaling Up Record-level Matching Rules".
Projects & Collaborations
- Pleinair (since 2021, coordinated by DataRiver)
- DXP Digital Experience Platform (since 2020, with Doxee, funded by Regione Emilia-Romagna)
- Smart Monitoring of a Local Energy Community (2020-2021, funded by MISE and supervised by ENEA)
- Entity Resolution for Big Data Integration (SparkER)
- Research Agreement with CINECA
- Member of the CINI Big Data Lab: Big Data Laboratory of the University of Modena and Reggio Emilia, "Enzo Ferrari" Department of Engineering (Supervisor: Prof. Sonia Bergamaschi)
- Member of the CINI National Laboratory of Artificial Intelligence and Intelligent Systems [pdf]
See the Projects section for more details.