Identifying Opinion Based Questions in Developer Chat Communication
Public Deposited
- Resource Type
- Creator
- Abstract
Today, software developers work on complex, fast-moving projects that often require immediate assistance. Chat servers such as Discord host numerous topics discussed in parallel, and mining these conversations offers researchers opportunities to develop software tools and services. First, we propose DISCO, a dataset consisting of one year of public DIScord chat COnversations from four software development communities. Second, we improve the opinion-asking question identification process of the existing ChatEO approach by replacing its heuristics with Deep Learning (DL) architectures, combined with various word embeddings, for this Natural Language Processing task. The results show that the DL models outperform the heuristics, and they are validated with a manual qualitative study. To increase DL performance, we used Snorkel, an automatic weak learner, to label a larger dataset. We also applied two class-balancing techniques, SMOTE and Near-Miss, to this larger dataset. SMOTE combined with Multi-CNN and GloVe-Twitter achieves the best performance in this study (0.95 recall).
- Subject
- Language
- Publisher
- Thesis Degree Level
- Thesis Degree Name
- Thesis Degree Discipline
- Identifier
- Rights Notes
Copyright © 2022 the author(s). Theses may be used for non-commercial research, educational, or related academic purposes only. Such uses include personal study, research, scholarship, and teaching. Theses may only be shared by linking to Carleton University Institutional Repository and no part may be used without proper attribution to the author. No part may be used for commercial purposes directly or indirectly via a for-profit platform; no adaptation or derivative works are permitted without consent from the copyright owner.
- Date Created
- 2022
Relations
- In Collection:
Items
| Thumbnail | Title | Date Uploaded | Visibility | Actions |
|---|---|---|---|---|
| | muthusubash-identifyingopinionbasedquestionsindeveloper.pdf | 2023-05-05 | Public | Download |