Auto-scaling mechanisms allow applications running in cloud environments to maintain a guaranteed Quality of Service while utilizing resources efficiently and keeping operational costs low for service providers. However, creating such an auto-scaling framework is challenging because resource usage must be estimated precisely even though workload patterns vary significantly.
The research presented in this thesis focuses on the automatic provisioning of compute resources in the cloud, performed by an intermediary enterprise on behalf of a single client enterprise. The intermediary enterprise hosts a broker that dynamically controls the number of resources used by the client enterprise. The research introduces three auto-scaling techniques: a reactive, a proactive, and a hybrid technique. These techniques allow resources to be scaled according to user demand.
The primary goal of these auto-scaling techniques is to achieve a profit for the intermediary enterprise while maintaining the desired grade of service for the client enterprise. A secondary goal is to lower the cost for the client enterprise compared with acquiring resources directly from the cloud provider. The techniques support both on-demand requests and requests with service level agreements (SLAs). The effectiveness of the proposed auto-scaling techniques is demonstrated through experiments performed on proof-of-concept prototypes and through simulations. The experimental results show that, for a number of different combinations of system and workload parameters, the proposed algorithms lead to a significant broker profit and a lower user cost in comparison to a conventional non-auto-scaling system.
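
To make the distinction between the three technique families concrete, the following is a minimal, generic sketch and not the algorithms developed in this thesis. It assumes a reactive rule sizes the resource pool from the currently observed demand, a proactive rule sizes it from a forecast (here a simple moving average, chosen only for illustration), and a hybrid rule takes the larger of the two. The names `CAPACITY_PER_RESOURCE`, `resources_needed`, `reactive`, `proactive`, and `hybrid`, as well as the capacity value, are hypothetical.

```python
import math

# Hypothetical assumption: requests one resource can serve per interval.
CAPACITY_PER_RESOURCE = 10


def resources_needed(demand: float) -> int:
    """Resources required to serve `demand`, keeping at least one running."""
    return max(1, math.ceil(demand / CAPACITY_PER_RESOURCE))


def reactive(current_demand: float) -> int:
    """Reactive rule: size the pool from the demand observed right now."""
    return resources_needed(current_demand)


def proactive(history: list[float], window: int = 3) -> int:
    """Proactive rule: size the pool from a forecast of upcoming demand.

    The forecast is a moving average over the last `window` observations,
    an illustrative stand-in for any real predictor.
    """
    recent = history[-window:]
    if not recent:
        # No history yet: fall back to the minimum pool size.
        return resources_needed(0)
    forecast = sum(recent) / len(recent)
    return resources_needed(forecast)


def hybrid(current_demand: float, history: list[float], window: int = 3) -> int:
    """Hybrid rule: take the larger of the reactive and proactive estimates."""
    return max(reactive(current_demand), proactive(history, window))


if __name__ == "__main__":
    past = [30, 50, 70]
    print(reactive(35), proactive(past), hybrid(35, past))
```

Taking the maximum in the hybrid rule is one conservative design choice that favors the grade of service over cost; a broker optimizing for profit might combine the two estimates differently.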