Structured Web Data Extraction: University Domain

Creator: 

Li, Yifeng

Date: 

2014

Abstract: 

In the Semantic Web, information is structured and therefore processable by machines. That vision, however, remains largely unrealized: the current web is still mostly a collection of unstructured documents. To find information on the web, we use search engines such as Google to retrieve relevant documents, and users often must then search through the retrieved documents themselves to find what they need. With the explosion of information on the web, this has become harder and harder. While Google tries to provide the most relevant results, our goal is to provide precise results that answer structured queries. To achieve this goal, we adopt an information extraction approach. In particular, we extract structured data from the unstructured web and organize the extracted data in a database that supports search. This thesis focuses on the implementation of a web information extraction system for the university domain.

Subject: 

COMMUNICATIONS AND THE ARTS Information Science

Language: 

English

Publisher: 

Carleton University

Thesis Degree Name: 

Master of Computer Science (M.C.S.)

Thesis Degree Level: 

Master's

Thesis Degree Discipline: 

Computer Science

Parent Collection: 

Theses and Dissertations

Items in CURVE are protected by copyright, with all rights reserved, unless otherwise indicated. They are made available with permission from the author(s).