Università degli Studi dell'Insubria Insubria Space
 

Use this identifier to cite or link to this document: http://hdl.handle.net/10277/676

Authors: Zanzi, Antonella
Internal supervisor: TROMBETTA, ALBERTO
Title: Data quality evaluation through data quality rules and data provenance.
Abstract: The application and exploitation of large amounts of data play an ever-increasing role in today’s research, government, and economy. Data understanding and decision making rely heavily on high-quality data; therefore, in many different contexts, it is important to assess the quality of a dataset in order to determine whether it is suitable for a specific purpose. Moreover, as access to and exchange of datasets have become easier and more frequent, and as scientists increasingly use the World Wide Web to share scientific data, there is a growing need to know the provenance of a dataset (i.e., information about the processes and data sources that led to its creation) in order to evaluate its trustworthiness. In this work, data quality rules and data provenance are used to evaluate the quality of datasets. Concerning the first topic, the applied solution consists of identifying types of data constraints that can be useful as data quality rules and of developing a software tool that evaluates a dataset against a set of rules expressed in XML. We selected some of the data constraints and dependencies already considered in the data quality field, but we also used order dependencies and existence constraints as quality rules. In addition, we developed algorithms to discover the types of dependencies used in the tool. To deal with the provenance of data, the Open Provenance Model (OPM) was adopted, an experimental query language for querying OPM graphs stored in a relational database was implemented, and an approach to designing OPM graphs was proposed.
Keywords: missing
MIUR: INF/01 INFORMATICA
Date: 2013
Language: eng
Doctoral programme: Informatica
Doctoral cycle: 24
Degree-granting university: Università degli Studi dell'Insubria
Citation: Zanzi, A. Data quality evaluation through data quality rules and data provenance. (Doctoral Thesis, Università degli Studi dell'Insubria, 2013).

Full text:

File: Phd_thesis_antonellazanzi_completa.pdf
Description: full text of the thesis
Size: 851.63 kB
Format: Adobe PDF

This document is distributed under a Creative Commons License.


All documents archived in InsubriaSPACE are protected by copyright. All rights reserved.

