Multidimensional Ontologies for Contextual Quality Data Specification and Extraction

Khaghani Milani, Mostafa


Resource Type

Thesis

Creator

Khaghani Milani, Mostafa

Abstract

Data quality assessment and data cleaning are context-dependent activities. Starting from this observation, previous work proposed a context model for assessing the quality of a database. A context takes the form of a possibly virtual database or a data integration system into which the database under assessment is mapped, for additional analysis, processing, and quality data extraction. In this work, we extend contexts with dimensions, which makes multidimensional data quality assessment possible. At the core of multidimensional contexts we introduce ontologies with provably good properties in terms of query answering (QA). We use the ontologies to represent dimension hierarchies, dimensional constraints, and dimensional rules, and to specify quality data. Query answering relies on and triggers dimensional navigation, and becomes an important tool for the extraction of quality data. We introduce and investigate an ontological-multidimensional (OMD) data model for which the aforementioned multidimensional ontology is a particular case. The OMD model extends the traditional multidimensional data model, embedding it into a Datalog± ontology. The ontology allows for the introduction of generalized fact tables, called categorical relations, which may be incomplete and associated with categories at arbitrary levels of the dimensions. The dimensional rules in the ontology are represented as Datalog± rules, and they enable dimensional navigation while propagating data between different dimension levels, for data completion where data is missing. The dimensional constraints are semantic conditions that have to be satisfied and are represented as Datalog± constraints. It turns out that the ontologies created according to the OMD model correspond to weakly-sticky (WS) programs, for which tractability of conjunctive QA is guaranteed.
We analyse the representational and computational properties of the OMD model, and we investigate QA and its optimization for WS programs, which had been only partly studied in the literature. We provide practical QA algorithms and the results of experiments with them.
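
As an informal illustration of the dimensional navigation mentioned in the abstract, the Python sketch below propagates data from a categorical relation attached to a lower category up to its parent category. Everything in it is an assumption made for illustration: the Ward and Unit categories, the relation names, the sample tuples, and the function `roll_up` are hypothetical and are not taken from the thesis. The sketch evaluates a single join-style rule of the form PatientUnit(u, d, p) <- PatientWard(w, d, p), WardUnit(w, u); it is not a Datalog± engine and does not reproduce the QA algorithms of the thesis.

```python
# Hypothetical illustration only: names and data are assumptions, not from the thesis.

# Child-parent relation between two categories of one dimension (Ward -> Unit).
ward_unit = {("W1", "Standard"), ("W2", "Standard"), ("W3", "Intensive")}

# Categorical relation attached to the lower category (Ward): (ward, day, patient).
patient_ward = {("W1", "Sep/5", "Tom"), ("W3", "Sep/6", "Lou")}


def roll_up(facts, child_parent):
    """Apply the rule PatientUnit(u, d, p) <- PatientWard(w, d, p), WardUnit(w, u).

    Each fact recorded at a child category is propagated to every parent
    of that child, which yields data at the higher level of the dimension.
    """
    return {
        (parent, day, patient)
        for (ward, day, patient) in facts
        for (child, parent) in child_parent
        if child == ward
    }


# Data generated at the Unit level by navigating upward from the Ward level.
patient_unit = roll_up(patient_ward, ward_unit)
```

A query posed at the Unit level can then be answered from `patient_unit` even though the data was originally recorded only at the Ward level, which is the sense in which such rules support data completion.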

Subject

Computer science

Language

English

Publisher

Carleton University

Thesis Degree Level

Doctoral

Thesis Degree Name

Doctor of Philosophy (Ph.D.)

Thesis Degree Discipline

Computer Science

Identifier

DOI: https://doi.org/10.22215/etd/2017-11827

Rights Notes

Copyright © 2017 the author(s). Theses may be used for non-commercial research, educational, or related academic purposes only. Such uses include personal study, research, scholarship, and teaching. Theses may only be shared by linking to Carleton University Institutional Repository and no part may be used without proper attribution to the author. No part may be used for commercial purposes directly or indirectly via a for-profit platform; no adaptation or derivative works are permitted without consent from the copyright owner.

Date Created

2017

Relations

In Collection:

Theses and Dissertations

Items

Title	Date Uploaded	Visibility
khaghanimilani-multidimensionalontologiesforcontextualquality.pdf	2023-05-05	Public
