Deep Classification in Large-scale Text Hierarchies (2009)

Abstract
Most classification algorithms are best at categorizing Web documents into a few categories, such as the top two levels of the Open Directory Project. Such a classification does not give the user very detailed topic-related class information, because the first two levels are often too coarse. However, classification on a large-scale hierarchy is known to be intractable, given the many target categories and the cross-link relationships among them. In this paper, we propose a novel deep-classification approach to categorize Web documents into categories in a large-scale taxonomy. The approach consists of two stages: a search stage and a classification stage. In the first stage, a category-search algorithm is used to acquire the

Publication details
Download http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.141.805
Source http://www.cse.ust.hk/~qyang/Docs/2008/fp350-xue.pdf
Contributors CiteSeerX
Repository CiteSeerX - Scientific Literature Digital Library and Search Engine (United States)
Keywords Categories and Subject Descriptors: H.4.m [Information Systems]: Miscellaneous; I.5.4 [Pattern Recognition]: Applications - Text processing. General Terms: Algorithms, Performance, Experimentation. Keywords: Deep Classification, Large Scale Hierarchy, Hierarchical Classification
Type text
Language English
Relation 10.1.1.17.6513, 10.1.1.32.9956, 10.1.1.11.9519, 10.1.1.109.2516, 10.1.1.21.988, 10.1.1.14.5443, 10.1.1.116.499, 10.1.1.133.6957, 10.1.1.113.6227, 10.1.1.30.4612, 10.1.1.39.5073, 10.1.1.19.6575, 10.1.1.110.4923, 10.1.1.63.2400, 10.1.1.4.8853, 10.1.1.127.1429, 10.1.1.72.1506, 10.1.1.123.1308, 10.1.1.41.8507, 10.1.1.141.24