Semi-automated Hypothesis Evaluation Using Semantic Technologies

Abstract
  • In today’s age of “big data” and “omics” research, biologists face two unique challenges: sharing their results with the larger community in an interpretable and reusable format, and integrating their experimental data and findings with the prevailing hypotheses that govern their field. Publicly funded biological data curation and warehousing centers have emerged to address the former, but challenges remain in sifting relevant information out of these resources, integrating it in a scalable way to assess biological hypotheses, and disseminating the results of this process. To address these challenges, I have developed, implemented and evaluated a semi-automated system for biological hypothesis evaluation that uses semantic technologies to reason over existing experimental data and knowledge. Chapter 1 presents the motivation, driving hypothesis and objectives for this doctoral thesis, as well as a brief review of the Semantic Web and automated systems for hypothesis formulation and evaluation. In Chapter 2, I present HyQue, a Semantic Web tool for evaluating scientific hypotheses, including the system architecture and a prototype implementation for evaluating hypotheses about yeast metabolism. In Chapter 3, I describe efforts to publish and integrate biological data on the Semantic Web through the Bio2RDF project, a key data source for HyQue that enables browsing, querying and downloading over 3 billion statements from more than 25 life sciences databases. In Chapter 4, I describe the ovopub, a linked data model for capturing provenance on the Semantic Web, as well as its implementation and application to Bio2RDF data. The ovopub provides a simple model for describing basic elements of linked data provenance, and enables provenance-based querying and filtering over biological linked data. In Chapter 5, I describe the application of HyQue to evaluating hypotheses about the role of C. elegans genes in aging. HyQue correctly identified known lifespan-related genes, as well as 24 candidate aging-related genes, by retrieving and evaluating domain-specific evidence from multiple sources. Chapter 6 summarizes the contributions of this thesis and proposes future work.

Rights Notes
  • Copyright © 2014 the author(s). Theses may be used for non-commercial research, educational, or related academic purposes only. Such uses include personal study, research, scholarship, and teaching. Theses may only be shared by linking to Carleton University Institutional Repository and no part may be used without proper attribution to the author. No part may be used for commercial purposes directly or indirectly via a for-profit platform; no adaptation or derivative works are permitted without consent from the copyright owner.

Date Created
  • 2014
