Using Text Mining and Link Analysis for Software Mining (2008)

Abstract
Many data mining techniques are currently used for ontology learning: text mining, Web mining, graph mining, link analysis, relational data mining, and so on. The current state-of-the-art bundle, however, lacks “software mining” techniques, a term that denotes the process of extracting knowledge from source code. In this paper we approach the software mining task with a combination of text mining and link analysis techniques. We discuss how each instance (i.e., a programming construct such as a class or a method) can be converted into a feature vector that combines information about how the instance is interlinked with other instances with information about its (textual) content. The feature vectors obtained in this way serve as the basis for constructing the domain ontology with OntoGen, an existing system for semi-automatic, data-driven ontology construction.
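
Illustrative sketch (not taken from the paper): the abstract does not specify how the link and content information are weighted or combined, so the following Python example shows only one plausible reading. It assumes a bag-of-words vector for the textual content, a binary vector over linked instances for the structure, and a simple weighted concatenation of the two. The function names, the `link_weight` parameter, and the toy `texts` and `links` data are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens, breaking camelCase identifiers."""
    return [w.lower() for w in re.findall(r"[A-Za-z][a-z]*", text)]


def l2_normalize(vec):
    """Scale a sparse vector (dict) to unit length; leave empty vectors as-is."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {k: v / norm for k, v in vec.items()} if norm else dict(vec)


def feature_vector(name, texts, links, link_weight=0.5):
    """Combine content and link features of one instance into a sparse vector.

    Assumption: the two parts are normalized separately and concatenated
    with weights (1 - link_weight) and link_weight.
    """
    content = l2_normalize(Counter(tokenize(texts[name])))
    structure = l2_normalize({target: 1.0 for target in links.get(name, ())})
    combined = {("text", k): (1 - link_weight) * v for k, v in content.items()}
    combined.update({("link", k): link_weight * v for k, v in structure.items()})
    return combined


# Hypothetical toy data: two classes, their text, and which instances they reference.
texts = {
    "Parser": "class Parser parses source tokens",
    "Lexer": "class Lexer reads source text",
}
links = {"Parser": ["Lexer"], "Lexer": []}

parser_vector = feature_vector("Parser", texts, links)
lexer_vector = feature_vector("Lexer", texts, links)
```

Keeping the text and link dimensions under separate keys makes it easy to vary `link_weight` and see how much each source of information contributes before the vectors are passed to a downstream tool.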

Publication details
Download http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.106.1997
Source http://www.gate.ac.uk/projects/tao/webpage/publications/grcar-using-text-mining-and-link-analysis-for-software-mining-ecml-pkdd07.pdf
Contributors CiteSeerX
Repository CiteSeerX - Scientific Literature Digital Library and Search Engine (United States)
Keywords software mining, text mining, link analysis, graph and network theory, feature vectors, ontologies, OntoGen
Type text
Language English
Relation 10.1.1.41.255, 10.1.1.49.2772, 10.1.1.1.6485, 10.1.1.103.9222, 10.1.1.5.392, 10.1.1.83.3474