Parallel OLAP on Multi/Many-Core and Cloud Platforms

Abstract
  • One of the most powerful and prominent technologies for knowledge discovery in Decision Support System environments is Online Analytical Processing (OLAP). OLAP is the foundation for a wide range of essential business applications. Since its introduction, OLAP has consistently required massive computational power, yet the physical limits on single-processor speed bound the performance of any single-processor solution. Parallel and distributed processing can provide two key ingredients to solve this problem: increased computational power through parallel processors and increased I/O bandwidth through parallel storage. In this thesis, we provide new methods to parallelize OLAP systems on recent parallel and distributed platforms. We provide new algorithms, including two parallel sorting algorithms (sorting is an important part of data cube construction) on many-core Graphics Processing Units (GPUs) and multi-core CPUs. In addition, we introduce a method for parallel construction of static data cubes on multi-core CPUs. Next, we present the main contribution of this thesis in the area of Real-time OLAP: a new algorithmic solution with a new data structure, called the PDC-tree, that supports Real-time OLAP on multi-core platforms. To our knowledge, the PDC-tree is the first solution that provides fully parallel Real-time OLAP using a parallelized tree data structure. We emphasize that the PDC-tree provides Real-time OLAP without materializing any data cube, and hence avoids the drawbacks of materialization. In the last part of this thesis, we focus on parallel and distributed Real-time OLAP on cloud architectures. We develop a cloud-based framework called CR-OLAP, which forms the basis of our cloud solution. CR-OLAP encompasses a new OLAP data structure called the PDCR-tree, a non-trivial, enhanced successor of the PDC-tree.
In addition to answering OLAP queries in real time, CR-OLAP provides scalability and load balancing of data among cloud resources while assuring performance for very large data warehouses. Experiments on the Amazon EC2 cloud confirm the real-time responsiveness of CR-OLAP under a heavy load of OLAP queries on large data warehouses.

Rights Notes
  • Copyright © 2014 the author(s). Theses may be used for non-commercial research, educational, or related academic purposes only. Such uses include personal study, research, scholarship, and teaching. Theses may only be shared by linking to Carleton University Institutional Repository and no part may be used without proper attribution to the author. No part may be used for commercial purposes directly or indirectly via a for-profit platform; no adaptation or derivative works are permitted without consent from the copyright owner.

Date Created
  • 2014
