A Multidimensional Data Model with Subcategories for Expressing and Repairing Summarizability

Abstract
  • In multidimensional (MD) databases, summarizability is a key property for obtaining interactive response times. With summarizable dimensions, pre-computed and materialized aggregate query results at lower levels of the dimension hierarchy can be used to correctly compute results at higher levels of the same hierarchy, improving efficiency. Although summarizability is such a desirable property, we argue that established MD models cannot properly model the summarizability condition, and that this is a consequence of the limited expressive power of their modeling languages. In addition, because of limitations in existing MD models, algorithms for deciding summarizability and for cube view selection are not efficient or practical. We propose an extension of the Hurtado-Mendelzon (HM) MD model, the EHM model, that includes subcategories, and we explore its properties, especially in addressing issues related to summarizability. We investigate the extended model as a way to directly model MD databases (MDDBs), with some clear advantages over HM models. Most importantly, EHM is, in a precise technical sense, more expressive than HM for modeling MDDBs that are subject to summarizability conditions. Moreover, given an MD aggregate query in an EHM database, we can determine in a practical way (one that only requires processing the dimension schema, as opposed to the instance) from which minimal subset of pre-computed cube views it can be correctly computed. Our extended model also allows for a repair approach that transforms non-summarizable HM dimensions into summarizable EHM dimensions. We propose and formalize a two-step process that involves modifying both the schema and the instance of a non-summarizable HM dimension.
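
The summarizability property described in the abstract can be illustrated with a small sketch. The toy dimension below (City, Province, Country), its member names, and the sales figures are hypothetical and are not taken from the thesis. The check is done on the instance purely for illustration; it is not the schema-level procedure the thesis proposes for EHM.

```python
from collections import defaultdict

# Hypothetical toy dimension: City -> Province -> Country.
# "Washington" rolls up to a country but has no province, so the
# hierarchy is heterogeneous at the Province level.
city_to_province = {"Ottawa": "Ontario", "Toronto": "Ontario"}
city_to_country = {"Ottawa": "Canada", "Toronto": "Canada", "Washington": "USA"}
province_to_country = {"Ontario": "Canada"}

# Hypothetical base facts: sales per city.
sales_by_city = {"Ottawa": 10, "Toronto": 20, "Washington": 5}


def roll_up(values, mapping):
    """Sum values of child members into their parent members.

    Members without a parent in `mapping` are silently dropped,
    which is exactly how a non-summarizable dimension loses data.
    """
    totals = defaultdict(int)
    for member, value in values.items():
        parent = mapping.get(member)
        if parent is not None:
            totals[parent] += value
    return dict(totals)


# Materialized (pre-computed) cube view at the Province level.
by_province = roll_up(sales_by_city, city_to_province)

# Country totals computed directly from the base facts (always correct).
by_country_direct = roll_up(sales_by_city, city_to_country)

# Country totals computed from the pre-computed Province view.
by_country_via_province = roll_up(by_province, province_to_country)

# Country is summarizable from Province only if both results agree.
summarizable = by_country_via_province == by_country_direct
print(by_country_direct)
print(by_country_via_province)
print("Country summarizable from Province:", summarizable)
```

In this sketch the Province view cannot be reused for Country totals, because the sales of the city without a province never reach the Province level. Reusing the City view instead would give the correct result.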

Rights Notes
  • Copyright © 2014 the author(s). Theses may be used for non-commercial research, educational, or related academic purposes only. Such uses include personal study, research, scholarship, and teaching. Theses may only be shared by linking to Carleton University Institutional Repository and no part may be used without proper attribution to the author. No part may be used for commercial purposes directly or indirectly via a for-profit platform; no adaptation or derivative works are permitted without consent from the copyright owner.

Date Created
  • 2014
