Multidimensional Ontologies for Contextual Quality Data Specification and Extraction

Abstract
  • Data quality assessment and data cleaning are context-dependent activities. Starting from this observation, a context model for the assessment of the quality of a database was proposed in previous work. A context takes the form of a possibly virtual database or a data integration system into which the database under assessment is mapped, for additional analysis, processing, and quality data extraction. In this work, we extend contexts with dimensions, and by doing so, multidimensional data quality assessment becomes possible. At the core of multidimensional contexts we introduce ontologies with provably good properties in terms of query answering (QA). We use the ontologies to represent dimension hierarchies, dimensional constraints, and dimensional rules, and to specify quality data. Query answering relies on and triggers dimensional navigation, and becomes an important tool for the extraction of quality data. We introduce and investigate an ontological-multidimensional (OMD) data model for which the aforementioned multidimensional ontology is a particular case. The OMD model extends the traditional multidimensional data model, embedding it into a Datalog± ontology. The ontology allows for the introduction of generalized fact tables, called categorical relations, which may be incomplete and associated with categories at arbitrary levels of the dimensions. The dimensional rules in the ontology are represented as Datalog± rules, and they enable dimensional navigation while propagating data between different dimension levels, so that data can be completed where it is missing. The dimensional constraints are semantic conditions that have to be satisfied and are represented as Datalog± constraints. It turns out that the ontologies created according to the OMD model correspond to weakly-sticky (WS) programs, for which tractability of conjunctive QA is guaranteed.
We analyse the representational and computational properties of the OMD model, and we investigate QA and optimization for WS programs, which had been only partly studied in the literature. We provide practical QA algorithms and the results of experiments with them.
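
As a rough illustration of what a dimensional rule does, the following minimal sketch propagates the tuples of a categorical relation from a lower category to its parent category. Everything in it is an assumption made for illustration and is not taken from the thesis: the category names (Ward, Unit), the relation names, the sample data, and the helper `roll_up`. It covers only a plain Datalog rule for upward navigation; Datalog± rules with existential variables, such as those needed for downward navigation, are not handled here.

```python
# Hypothetical child-parent links between the Ward and Unit categories
# of a single dimension (illustrative data only).
WARD_TO_UNIT = {"W1": "Standard", "W2": "Standard", "W3": "Intensive"}

# Hypothetical categorical relation at the Ward level: (ward, day, patient).
patient_ward = {
    ("W1", "Sep/5", "Tom"),
    ("W3", "Sep/6", "Lou"),
}


def roll_up(ward_facts, ward_to_unit):
    """Apply the illustrative rule
    PatientUnit(u, d, p) <- PatientWard(w, d, p), WardUnit(w, u).
    Facts whose ward has no known parent unit are skipped.
    """
    return {
        (ward_to_unit[ward], day, patient)
        for (ward, day, patient) in ward_facts
        if ward in ward_to_unit
    }


# Derived categorical relation at the Unit level.
patient_unit = roll_up(patient_ward, WARD_TO_UNIT)
```

A query posed at the Unit level can then be answered from `patient_unit` even though the data was recorded only at the Ward level, which is the kind of data completion through dimensional navigation that the abstract refers to.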

Rights Notes
  • Copyright © 2017 the author(s). Theses may be used for non-commercial research, educational, or related academic purposes only. Such uses include personal study, research, scholarship, and teaching. Theses may only be shared by linking to Carleton University Institutional Repository and no part may be used without proper attribution to the author. No part may be used for commercial purposes directly or indirectly via a for-profit platform; no adaptation or derivative works are permitted without consent from the copyright owner.

Date Created
  • 2017
