Data Intensive Systems
The Group
The Data Intensive Systems group studies and addresses the challenges of systems that manage massive volumes of data.
It develops tools, techniques and methodologies for Heterogeneous Data Integration, Graph Data Management, Knowledge Graphs, Data Curation, Explainability, and Dataset Federations. The group is active in various initiatives, such as the AI Lab on Sustainable Finance, the Applied Data Science Focus Area, and the Data Sharing Coalition.
About our Research
Data has been called “the oil of the 21st century”. Businesses and organizations have realized that, in order to thrive in the data-driven economy, they have to adopt modern data management solutions that allow them to innovate and deliver high-quality value-added services.
Before data analytics can draw insights from any data, the data first has to be prepared, understood, and curated to maintain its value. The New York Times has reported that such tasks may take up to 80% of a data scientist's time.
The Data Intensive Systems group aims to support the users of tomorrow, and particularly data scientists, to: (i) integrate a multitude of highly heterogeneous and independently developed data sources, (ii) analyze and understand data, even with complex, unknown or non-traditional structures, (iii) eliminate data quality issues, and (iv) extract and manage knowledge in a systematic way. The goal is for all of these tasks to become less laborious, less time-consuming, and less error-prone.
The group studies new paradigms of user interaction with Big Data and develops algorithms and systems that exploit state-of-the-art technologies to cope with the large-scale, intensive processing that modern massive datasets require. The group's expertise and research revolve around (but are not limited to) the following areas:
- Data Preparation & Curation (Data Discovery, Heterogeneous Information Integration, Entity Linkage, Data Cleaning, Data Quality, Data Preservation)
- Data Understanding (Data Exploration, Metadata Management, Big Data Profiling)
- Information Extraction from Heterogeneous Big Data Repositories (Keyword Search, Search through Examples)
- Graph Management (Labeled Graphs, Ontologies)
- Knowledge Management (Knowledge Extraction, Reasoning, Knowledge Graph Management, Semantic Web Data)
- Evolving Data (Streams, Time Series, Anomaly Detection, Evolving Graphs, Temporal Knowledge Graphs)
The resulting solutions find applications in many different domains, ranging from Science, Engineering, Retail and Finance to Healthcare, Telecommunications, Education, and Transportation.
The Data Intensive Systems group is one of the research groups of the AI and Data Science division within the Department of Information and Computing Sciences.