Creator:
Date:
Abstract:
Big data processing has become essential for businesses in recent years, as it enables organizations to gather insights from streaming data in near real time and capitalize on business opportunities. One drawback of stream processing engines is their lack of support for priority scheduling. There are cases where businesses need to ensure that important input data items are processed with low latency, thus avoiding missed business opportunities. This thesis proposes a technique that enables users to prioritize important input data items so that they are processed in time even when the system is under high or bursty input load. Using a prototype, this thesis demonstrates the efficacy of the proposed technique. Performance analysis shows a significant latency improvement for high-priority data over low-priority data, especially under high system contention.