Improving Automatic Tuning of Hadoop and Spark by Analysing Container Performance Metrics
- Resource Type
- Creator
- Abstract
This research introduces novel container performance metrics and proves that these metrics are beneficial in the development of automatic tuning systems. Hadoop and Spark show different patterns in the static and dynamic values of four metrics: container creation rate, container completion rate, container average response time, and relative standard deviation of response time (RSD). Of these, container creation rate was found, by applying five kinds of machine learning algorithms, to be the most sensitive metric for identifying and classifying the workload type, with an average accuracy of 83%. RSD can be used to detect workload transitions with an average accuracy of 74%. Our research results will decrease tuning overhead and promote the development of automatic tuning systems.
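As an illustration of the four metrics named in the abstract, the following minimal sketch computes them for a single observation window. It is not taken from the thesis: the function name `container_metrics`, the `(created_at, completed_at)` input format, the fixed window, and the use of the population standard deviation for RSD are all assumptions made for this example.

```python
from statistics import mean, pstdev


def container_metrics(containers, window_start, window_end):
    """Compute per-window container metrics (hypothetical helper).

    containers: list of (created_at, completed_at) timestamps in seconds.
    The window is half-open: [window_start, window_end).
    """
    length = window_end - window_start
    if length <= 0:
        raise ValueError("window must have positive length")

    # Containers created / completed inside the window.
    created = [c for c in containers if window_start <= c[0] < window_end]
    completed = [c for c in containers if window_start <= c[1] < window_end]

    # Response time is assumed here to be completion time minus creation time.
    response_times = [done - start for start, done in completed]
    avg_response = mean(response_times) if response_times else None

    # RSD (%) = standard deviation / mean * 100; undefined for an empty
    # window or a zero mean. Population std dev is an assumption.
    rsd = None
    if response_times and avg_response:
        rsd = pstdev(response_times) / avg_response * 100.0

    return {
        "creation_rate": len(created) / length,      # containers per second
        "completion_rate": len(completed) / length,  # containers per second
        "avg_response_time": avg_response,           # seconds
        "rsd_percent": rsd,
    }


if __name__ == "__main__":
    sample = [(0, 2), (1, 5), (3, 7), (8, 12)]
    print(container_metrics(sample, 0, 10))
```

Sliding such a window over a job's container log would yield the kind of time series in which, according to the abstract, creation rate separates workload types and RSD marks workload transitions.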
- Subject
- Language
- Publisher
- Thesis Degree Level
- Thesis Degree Name
- Thesis Degree Discipline
- Identifier
- Rights Notes
Copyright © 2019 the author(s). Theses may be used for non-commercial research, educational, or related academic purposes only. Such uses include personal study, research, scholarship, and teaching. Theses may only be shared by linking to Carleton University Institutional Repository and no part may be used without proper attribution to the author. No part may be used for commercial purposes directly or indirectly via a for-profit platform; no adaptation or derivative works are permitted without consent from the copyright owner.
- Date Created
- 2019
Relations
- In Collection:
Items
Title | Date Uploaded | Visibility |
---|---|---|
zhou-improvingautomatictuningofhadoopandsparkby.pdf | 2023-05-05 | Public |