Multi-Domain Text Classification with Adversarial Training

Creator: 

Wu, Yuan

Date: 

2022

Abstract: 

Text classification is one of the fundamental tasks in natural language processing (NLP); it has been studied for decades, and various approaches have been proposed. Unfortunately, text classification is highly domain-dependent: a subtle shift between the training and testing data distributions can cause catastrophic performance deterioration. Moreover, the availability of massive labeled data varies among domains in real-world applications. Therefore, it is of great importance to investigate how to improve classification accuracy on the target domain by leveraging resources from related domains. Multi-domain text classification (MDTC) is proposed to address this problem. Currently, mainstream MDTC approaches resort to transfer learning techniques to reduce the divergence across domains. In particular, these methods adopt adversarial training and the shared-private paradigm to implement domain alignment, yielding state-of-the-art performance. Adversarial learning can reduce domain divergence through a minimax optimization that produces domain-invariant features. The domain-invariant features are supposed to be both transferable and discriminative, while the shared-private paradigm employs domain-specific features to boost the discriminability of the domain-invariant features. In this thesis, we make several contributions to advance MDTC. First, we propose a dual adversarial co-learning method that utilizes two forms of adversarial training to refine domain-variant features. Second, we apply mixup to conduct category and domain regularization, enriching the intrinsic features in the shared latent space and enforcing consistent predictions in between training samples so that the learned features are more transferable and discriminative. Third, we analyze the limitation of adversarial alignment on marginal distributions and propose a novel conditional adversarial network that aligns the joint distributions of domain-invariant features and label predictions. Fourth, we propose a co-regularized adversarial learning framework that constructs two diverse adversarial training streams and aligns multiple conditional distributions by penalizing disagreements between the outputs of these two streams. We also incorporate entropy minimization and virtual adversarial training to avoid violating the cluster assumption. Finally, we incorporate the margin discrepancy to measure domain divergence for MDTC and fill the gap between MDTC algorithms and theory by deriving a new generalization bound based on the margin discrepancy.

Subject: 

Computer Science

Language: 

English

Publisher: 

Carleton University

Thesis Degree Name: 

Doctor of Philosophy (Ph.D.)

Thesis Degree Level: 

Doctoral

Thesis Degree Discipline: 

Computer Science

Parent Collection: 

Theses and Dissertations

Items in CURVE are protected by copyright, with all rights reserved, unless otherwise indicated. They are made available with permission from the author(s).