Abstract:
Twitter sentiment analysis is drawing considerable attention because of its potential to drive decision making in a variety of domains. However, three concerns remain: publicly available training datasets are becoming scarcer, the number of topics is difficult to determine for topic-model-based approaches, and there is little data-level discussion of how to use the proposed models day to day to drive applications. To address these problems, we first propose a new method to collect and build a Twitter training dataset based on noisy labels. Second, we propose a topic-model-based hybrid sentiment classification model, built on our self-collected tweets, which uses three different topic models and the coherence score to choose the best topic model in an automated way. Finally, we illustrate a use case that shows how to apply our pipeline on a daily basis to solve real business problems.
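
As a rough illustration of coherence-based model selection, the following minimal sketch scores the top words of several candidate topic models and keeps the highest-scoring one. The abstract does not say which coherence measure or which three topic models are used, so the UMass-style measure, the names `model_a`, `model_b`, `model_c`, the toy tokenized tweets, and all function names are assumptions made for illustration only.

```python
import math
from itertools import combinations

def umass_coherence(topic_words, docs):
    """UMass-style coherence of one topic (assumed measure, not confirmed by the source).

    Sums log((D(w_i, w_j) + 1) / D(w_j)) over ordered word pairs, where D counts
    the documents containing the given word(s).
    """
    doc_sets = [set(d) for d in docs]

    def doc_freq(*words):
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for j, i in combinations(range(len(topic_words)), 2):  # j < i
        w_j, w_i = topic_words[j], topic_words[i]
        d_j = doc_freq(w_j)
        if d_j == 0:
            continue  # skip words that never occur in the corpus
        score += math.log((doc_freq(w_i, w_j) + 1) / d_j)
    return score

def mean_coherence(topics, docs):
    """Average coherence over all topics of one model."""
    return sum(umass_coherence(t, docs) for t in topics) / len(topics)

def pick_best_model(candidates, docs):
    """Return the candidate name with the highest mean coherence, plus all scores."""
    scores = {name: mean_coherence(topics, docs) for name, topics in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Toy tokenized tweets (hypothetical data).
docs = [
    ["battery", "phone", "great"],
    ["battery", "phone", "bad"],
    ["flight", "delay", "bad"],
    ["flight", "delay", "late"],
]

# Top words per topic from three hypothetical candidate models.
candidates = {
    "model_a": [["battery", "phone"], ["flight", "delay"]],
    "model_b": [["battery", "delay"], ["flight", "phone"]],
    "model_c": [["battery", "flight"], ["phone", "delay"]],
}

best, scores = pick_best_model(candidates, docs)
print(best, scores)
```

In this toy setup, the model whose topic words actually co-occur in the same tweets receives the highest score, which is the property an automated selection step of this kind relies on. A real pipeline would likely use a library implementation of coherence and topic words produced by fitted models.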