Creator:
Date:
Abstract:
This thesis proposes a new graph-based indexing technique, the Graph-Based Index (GBI), to reduce search latency for textual documents. GBI uses a directed graph, built on a hash table, to capture the co-occurrence of multiple keywords in a document. The objective is to combine the relationships between search keywords captured in the graph with fast hash table lookups so that all results of a query are retrieved at once. Proof-of-concept prototypes have been built for both GBI and an Inverted Index, and a thorough performance analysis compares the two using a synthetic workload. GBI is also compared with Elasticsearch, an enterprise-level search engine. The results show that the graph-based indexing technique notably reduces query latency in comparison to both the Inverted Index and Elasticsearch.
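
The idea of storing keyword co-occurrence in a hash-table-backed directed graph can be illustrated with a minimal Python sketch. This is not the thesis prototype: the class name GraphIndex, its methods, and the choice to store one directed edge per alphabetically ordered keyword pair are all assumptions made purely for illustration.

```python
from collections import defaultdict
from itertools import combinations


class GraphIndex:
    """Toy keyword co-occurrence graph kept in nested hash tables.

    Hypothetical illustration only; the layout is an assumption,
    not the structure used in the thesis.
    """

    def __init__(self):
        # edges[a][b] -> ids of documents containing both a and b (a < b)
        self.edges = defaultdict(lambda: defaultdict(set))
        # nodes[a] -> ids of documents containing a (single-keyword queries)
        self.nodes = defaultdict(set)

    def add(self, doc_id, text):
        # Deduplicate and sort so every edge points from the smaller
        # keyword to the larger one, giving a directed graph.
        words = sorted(set(text.lower().split()))
        for w in words:
            self.nodes[w].add(doc_id)
        for a, b in combinations(words, 2):
            self.edges[a][b].add(doc_id)

    def search(self, *keywords):
        words = sorted({k.lower() for k in keywords})
        if not words:
            return set()
        if len(words) == 1:
            return set(self.nodes.get(words[0], set()))
        # Intersect the edges between consecutive keywords; a document
        # on every such edge contains all of the query keywords.
        result = None
        for a, b in zip(words, words[1:]):
            docs = self.edges.get(a, {}).get(b, set())
            result = set(docs) if result is None else result & docs
            if not result:
                break
        return result


idx = GraphIndex()
idx.add(1, "graph index search")
idx.add(2, "graph search latency")
idx.add(3, "inverted index")
print(sorted(idx.search("graph", "search")))           # [1, 2]
print(sorted(idx.search("search", "index", "graph")))  # [1]
```

In this sketch a two-keyword query is answered by a single edge lookup, whereas a plain inverted index would fetch and intersect two posting lists. The trade-off is memory, because the number of stored edges grows quadratically with the number of distinct keywords per document.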