Multidimensional Ontologies for Contextual Quality Data Specification and Extraction

Creator: 

Khaghani Milani, Mostafa

Date: 

2017

Abstract: 

Data quality assessment and data cleaning are context-dependent activities. Starting from this observation, previous work proposed a context model for assessing the quality of a database. A context takes the form of a possibly virtual database or a data integration system into which the database under assessment is mapped, for additional analysis, processing, and quality data extraction. In this work, we extend contexts with dimensions, which makes multidimensional data quality assessment possible. At the core of multidimensional contexts we introduce ontologies with provably good properties in terms of query answering (QA). We use the ontologies to represent dimension hierarchies, dimensional constraints, and dimensional rules, and to specify quality data. Query answering relies on and triggers dimensional navigation, and becomes an important tool for the extraction of quality data. We introduce and investigate an ontological-multidimensional (OMD) data model, of which the aforementioned multidimensional ontology is a particular case. The OMD model extends the traditional multidimensional data model, embedding it into a Datalog± ontology. The ontology allows for the introduction of generalized fact tables, called categorical relations, which may be incomplete and associated with categories at arbitrary levels of the dimensions. The dimensional rules in the ontology are represented as Datalog± rules; they enable dimensional navigation by propagating data between different dimension levels, which completes data where it is missing. The dimensional constraints are semantic conditions that have to be satisfied, and they are represented as Datalog± constraints. It turns out that the ontologies created according to the OMD model correspond to weakly-sticky (WS) programs, for which tractability of conjunctive QA is guaranteed. We analyse the representational and computational properties of the OMD model, and we investigate QA and optimization for WS programs, which have been only partly studied in the literature. We provide practical QA algorithms and report the results of experiments with them.
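
As a rough illustration of the dimensional navigation mentioned in the abstract, the Python sketch below propagates facts from a lower dimension level to a higher one. The dimension levels (Ward, Unit), the relation names (`patient_ward`, `patient_unit`), and all data values are hypothetical and are not taken from the thesis. The sketch covers only a plain upward rule without existential variables, so it should be read as a simplified picture of what a dimensional rule does, not as the OMD model itself.

```python
# Hypothetical child-to-parent links between two levels of one dimension:
# each ward (lower level) belongs to exactly one unit (higher level).
WARD_UNIT = {"W1": "Standard", "W2": "Standard", "W3": "Intensive"}

# Hypothetical categorical relation at the Ward level: (ward, day, patient).
patient_ward = {
    ("W1", "day1", "p1"),
    ("W3", "day2", "p2"),
}


def roll_up(facts, child_parent):
    """Apply the illustrative rule
        PatientUnit(u, d, p) <- PatientWard(w, d, p), WardUnit(w, u)
    by replacing each ward with its parent unit.
    Facts whose ward has no known parent are skipped."""
    return {
        (child_parent[ward], day, patient)
        for (ward, day, patient) in facts
        if ward in child_parent
    }


# Data derived at the Unit level, where no facts were stored explicitly.
patient_unit = roll_up(patient_ward, WARD_UNIT)
```

Here the Unit-level relation starts out empty and is filled in from Ward-level data, which is the kind of data completion through navigation that the abstract describes.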

Subject: 

Computer Science

Language: 

English

Publisher: 

Carleton University

Thesis Degree Name: 

Doctor of Philosophy (Ph.D.)

Thesis Degree Level: 

Doctoral

Thesis Degree Discipline: 

Computer Science

Parent Collection: 

Theses and Dissertations

Items in CURVE are protected by copyright, with all rights reserved, unless otherwise indicated. They are made available with permission from the author(s).