Master data

Title: Automatically classifying posts into question categories on Stack Overflow
Subtitle:
Abstract:

Software developers frequently solve development issues with the help of question and answer web forums, such as Stack Overflow (SO). While tags exist to support question searching and browsing, they are more related to technological aspects than to the question purposes. Tagging questions with their purpose can add a new dimension to the investigation of topics discussed in posts on SO. In this paper, we aim to automate such a classification of SO posts into seven question categories. As a first step, we have manually created a curated data set of 500 SO posts, classified into the seven categories. Using this data set, we apply machine learning algorithms (Random Forest and Support Vector Machines) to build a classification model for SO questions. We then experiment with 82 different configurations regarding the preprocessing of the text and representation of the input data. The results of the best performing models show that our models can classify posts into the correct question category with an average precision and recall of 0.88 and 0.87 when using Random Forest and the phrases indicating a question category as input data for the training. The obtained model can be used to aid developers in browsing SO discussions or researchers in building recommenders based on SO.
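The averaged precision and recall reported in the abstract can be illustrated with a minimal sketch. It assumes macro-averaging (an unweighted mean over categories); the record does not state which averaging scheme was used. The labels "a", "b", "c" and the function name macro_precision_recall are hypothetical placeholders, not the paper's categories or code.

```python
from collections import Counter


def macro_precision_recall(y_true, y_pred):
    """Return (precision, recall), each averaged over all labels with equal weight."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed
    precisions, recalls = [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precisions.append(tp[label] / predicted if predicted else 0.0)
        recalls.append(tp[label] / actual if actual else 0.0)
    return sum(precisions) / len(labels), sum(recalls) / len(labels)


# Hypothetical toy data: three placeholder categories, six posts.
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]
precision, recall = macro_precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With seven categories the same function applies unchanged; only the label set grows.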

Keywords:
Publication type: Contribution to proceedings (authorship)
Publication date: 28.05.2018 (online)
Published in: Proceedings of the 26th Conference on Program Comprehension
(ACM, New York)
Series title: Proceedings of the 26th Conference on Program Comprehension
Volume number: -
First publication: Yes
Version: -
Pages: pp. 211-221

Versions

No version available
Publication date: 28.05.2018
ISBN (e-book):
  • 978-1-4503-5714-2
eISSN: -
DOI: http://dx.doi.org/10.1145/3196321.3196333
Homepage: https://dl.acm.org/citation.cfm?id=3196333
Open Access
  • Available online (not Open Access)

Affiliation

Organisation Address
Faculty of Technical Sciences (Fakultät für Technische Wissenschaften)
 
Department of Informatics Systems (Institut für Informatik-Systeme)
Universitätsstr. 65-67
A-9020 Klagenfurt
Austria
  -993503
   kerstin.smounig@aau.at
https://www.aau.at/isys/

Categorisation

Subject areas
  • 102022 - Software development
Research cluster: No research cluster selected
Peer Reviewed
  • Yes
Publication focus
  • Science to Science (quality indicator: II)
Classification scheme of the assigned organisational units:
Working groups
  • Software Engineering Research Group (SERG)

Cooperations

Organisation Address
University of Sannio
Italy
IT  

Contributions of the publication

No linked publications available