Unsupervised Text Mining Techniques for Forecasting Crude Oil

Abstract

While it has been shown that news articles can influence the rationality of investors' decisions, the effect that news may have on commodity prices such as crude oil is uncertain. I explored Natural Language Processing (NLP) techniques to extract textual features from news articles and then constructed a "horse race" among economic and tree-based machine learning methods to forecast weekly crude oil prices. I obtained two types of textual features, latent topics and sentiment probabilities, using two state-of-the-art NLP models: Latent Dirichlet Allocation (LDA) and a version of Bidirectional Encoder Representations from Transformers (BERT) pre-trained on a financial corpus. This thesis also introduces a novel forecasting strategy for calculating the out-of-sample (OoS) performance metrics of competing models. The evidence shows that textual features can improve forecasts of oil prices; however, textual features from news are not sufficient on their own for high forecasting accuracy.

Rights Notes

Copyright © 2022 the author(s). Theses may be used for non-commercial research, educational, or related academic purposes only. Such uses include personal study, research, scholarship, and teaching. Theses may only be shared by linking to Carleton University Institutional Repository and no part may be used without proper attribution to the author. No part may be used for commercial purposes directly or indirectly via a for-profit platform; no adaptation or derivative works are permitted without consent from the copyright owner.

Title	Date Uploaded	Visibility
hazen-unsupervisedtextminingtechniquesforforecasting.pdf	2023-05-05	Public