Publication: Automatically classifying posts into qu...
Master data
Title: | Automatically classifying posts into question categories on Stack Overflow |
Subtitle: | |
Abstract: | Software developers frequently solve development issues with the help of question and answer web forums, such as Stack Overflow (SO). While tags exist to support question searching and browsing, they are more related to technological aspects than to the question purposes. Tagging questions with their purpose can add a new dimension to the investigation of topics discussed in posts on SO. In this paper, we aim to automate such a classification of SO posts into seven question categories. As a first step, we have manually created a curated data set of 500 SO posts, classified into the seven categories. Using this data set, we apply machine learning algorithms (Random Forest and Support Vector Machines) to build a classification model for SO questions. We then experiment with 82 different configurations regarding the preprocessing of the text and representation of the input data. The results of the best performing models show that our models can classify posts into the correct question category with an average precision and recall of 0.88 and 0.87 when using Random Forest and the phrases indicating a question category as input data for the training. The obtained model can be used to aid developers in browsing SO discussions or researchers in building recommenders based on SO. |
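The abstract reports an average precision and recall across question categories. As a rough illustration of how such averages can be computed for a multi-class classifier, here is a minimal sketch in plain Python. It assumes an unweighted (macro) average over categories, which the abstract does not specify. The function name `macro_precision_recall` and the category labels "A", "B", "C" are hypothetical and do not come from the paper.

```python
from collections import Counter


def macro_precision_recall(true_labels, predicted_labels):
    """Return (precision, recall), each averaged with equal weight per category.

    Assumption: macro averaging. The source abstract only says "average".
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    categories = sorted(set(true_labels) | set(predicted_labels))
    if not categories:
        raise ValueError("at least one label is required")

    actual_count = Counter(true_labels)          # posts truly in each category
    predicted_count = Counter(predicted_labels)  # posts assigned to each category
    true_positives = Counter(
        truth for truth, guess in zip(true_labels, predicted_labels) if truth == guess
    )

    precisions, recalls = [], []
    for category in categories:
        tp = true_positives[category]
        # A category that was never predicted (or never occurs) scores 0.0 here.
        precisions.append(tp / predicted_count[category] if predicted_count[category] else 0.0)
        recalls.append(tp / actual_count[category] if actual_count[category] else 0.0)

    return sum(precisions) / len(categories), sum(recalls) / len(categories)


# Hypothetical toy labels, not data from the paper.
true_labels = ["A", "A", "B", "B", "C", "C"]
predicted_labels = ["A", "B", "B", "B", "C", "A"]
precision, recall = macro_precision_recall(true_labels, predicted_labels)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With a macro average, every category counts equally regardless of how many posts it contains. A weighted average would instead favor the larger categories, so the two can differ noticeably on an imbalanced data set.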
Keywords: |
Publication type: | Contribution in proceedings (authorship) |
Publication date: | 28.05.2018 (Online) |
Published in: |
Proceedings of the 26th Conference on Program Comprehension
(ACM New York) |
Series title: | Proceedings of the 26th Conference on Program Comprehension |
Volume number: | - |
First publication: | Yes |
Version: | - |
Pages: | pp. 211-221 |
Versions
No version available |
Publication date: | 28.05.2018 |
ISBN (e-book): |
|
eISSN: | - |
DOI: | http://dx.doi.org/10.1145/3196321.3196333 |
Homepage: | https://dl.acm.org/citation.cfm?id=3196333 |
Open Access |
|
Authors
Stefanie Beyer (internal) |
Christian Macho (internal) |
Martin Pinzger (internal) |
Massimiliano Di Penta (external) |
Affiliation
Organization | Address | ||||
---|---|---|---|---|---|
Fakultät für Technische Wissenschaften
Institut für Informatik-Systeme
|
AT - A-9020 Klagenfurt |
Categorization
Subject areas | |
Research cluster | No research cluster selected |
Peer Reviewed |
|
Publication focus |
Classification scheme of the assigned organizational units:
|
Working groups |
|
Research activities
(Note: external activities are not shown in the search results)
Projects: | No linked projects available |
Publications: | No linked publications available |
Events: | No linked events available |
Talks: | No linked talks available |