Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/123364
Type: Journal article
Title: Automatically categorizing software technologies
Author: Nassif, M.
Treude, C.
Robillard, M.
Citation: IEEE Transactions on Software Engineering, 2020; 46(1):20-32
Publisher: IEEE
Issue Date: 2020
ISSN: 0098-5589
1939-3520
Statement of Responsibility: Mathieu Nassif, Christoph Treude, and Martin P. Robillard
Abstract: Informal language and the absence of a standard taxonomy for software technologies make it difficult to reliably analyze technology trends on discussion forums and other on-line venues. We propose an automated approach called Witt for the categorization of software technologies (an expanded version of the hypernym discovery problem). Witt takes as input a phrase describing a software technology or concept and returns a general category that describes it (e.g., integrated development environment), along with attributes that further qualify it (commercial, php, etc.). By extension, the approach enables the dynamic creation of lists of all technologies of a given type (e.g., web application frameworks). Our approach relies on Stack Overflow and Wikipedia, and involves numerous original domain adaptations and a new solution to the problem of normalizing automatically-detected hypernyms. We compared Witt with six independent taxonomy tools and found that, when applied to software terms, Witt demonstrated better coverage than all evaluated alternative solutions, without a corresponding degradation in false positive rate.
Keywords: Taxonomy; information retrieval; natural language processing; Wikipedia; tagging
Description: Date of [online] publication: 15 May 2018
Rights: © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
DOI: 10.1109/TSE.2018.2836450
Published version: http://dx.doi.org/10.1109/tse.2018.2836450
Appears in Collections: Aurora harvest 8
Computer Science publications

Files in This Item:
There are no files associated with this item.