Creator:
Date:
Abstract:
This thesis addresses text classification when the labels of the training data are noisy. Label noise can have many consequences, such as reducing a model's accuracy and increasing its complexity. Designing learning algorithms that maximize a desired performance measure in such noisy settings is important for success on real-world data. The thesis also investigates a recently proposed method for text classification, the Tsetlin Machine, which learns human-readable rules made up of clauses. There is currently only one paper applying the Tsetlin Machine to text classification problems, and this thesis builds on that work. Our experiments show that the performance of classical methods and of the Tsetlin Machine is affected relatively little by label noise, while recent state-of-the-art text classification methods are less robust.