Text Classification with Noisy Class Labels
Public Deposited
- Resource Type
- Creator
- Abstract
This thesis focuses on the problem of text classification when the labels of the training data are noisy. Label noise can have many consequences, such as decreasing a model's accuracy and increasing its complexity. Designing learning algorithms that help maximize a desired performance measure in such noisy settings is important for achieving success on real-world data. This thesis also investigates a recently proposed text classification method called the Tsetlin Machine, which can learn human-readable rules made up of clauses. There is currently only one paper that applies the Tsetlin Machine to text classification problems, and this thesis builds on that work. Our experiments show that label noise has a reasonably small impact on the performance of classical methods and the Tsetlin Machine, while recent state-of-the-art text classification methods are not as robust.
- Subject
- Language
- Publisher
- Thesis Degree Level
- Thesis Degree Name
- Thesis Degree Discipline
- Identifier
- Rights Notes
Copyright © 2020 the author(s). Theses may be used for non-commercial research, educational, or related academic purposes only. Such uses include personal study, research, scholarship, and teaching. Theses may only be shared by linking to Carleton University Institutional Repository and no part may be used without proper attribution to the author. No part may be used for commercial purposes directly or indirectly via a for-profit platform; no adaptation or derivative works are permitted without consent from the copyright owner.
- Date Created
- 2020
Relations
- In Collection:
Items
Title | Date Uploaded | Visibility |
---|---|---|
pagotto-textclassificationwithnoisyclasslabels.pdf | 2023-05-05 | Public |