Publication View

Implementing the context tree weighting method for content recognition (2008)

Abstract
The context tree weighting method (CTW) is a statistics-based universal data compression algorithm that can outperform Lempel–Ziv based algorithms [1], [2]. Motivated by this fact, we investigate the usability of CTW for applications involving content recognition. Recently, various authors have explored the application of other data compression algorithms to content recognition, e.g. see [3], [4], [5]. Given a test file that must be classified among several reference files, each representing a different class, the test file is appended to each reference file in turn, and the reference file that leads to the best compression of the test file is selected as the most probable match. Moreover, we modify CTW for content recognition purposes by introducing the concept of context tree freezing: the tree is frozen after the reference sequence is encoded, so that the encoder does not learn the memory structure of the appended test sequence. Results show that CTW with the proposed freezing technique clearly outperforms a wide range of other compression algorithms on content recognition problems such as language recognition, authorship attribution, and DNA data classification. For more details, the reader is referred to the full paper version available at [6].
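
The classification rule described in the abstract can be sketched in a few lines. The sketch below is illustrative only and is not the authors' implementation: it substitutes Python's standard zlib compressor (a Lempel–Ziv based method) for CTW, and it does not implement context tree freezing. The function names `compressed_size`, `extra_cost`, and `classify` are hypothetical, as are the sample reference strings. The cost of a test sequence given a reference is approximated as the compressed size of the reference with the test appended, minus the compressed size of the reference alone.

```python
import zlib
from typing import Dict


def compressed_size(data: bytes) -> int:
    """Size in bytes of `data` after zlib compression (stand-in for CTW)."""
    return len(zlib.compress(data, 9))


def extra_cost(reference: bytes, test: bytes) -> int:
    """Approximate cost of coding `test` after the compressor has seen `reference`.

    Computed as C(reference + test) - C(reference). Unlike the frozen
    context tree in the abstract, zlib keeps adapting while it reads `test`.
    """
    return compressed_size(reference + test) - compressed_size(reference)


def classify(test: bytes, references: Dict[str, bytes]) -> str:
    """Return the label of the reference that compresses `test` best."""
    return min(references, key=lambda label: extra_cost(references[label], test))


if __name__ == "__main__":
    # Hypothetical toy classes; real experiments would use large reference files.
    references = {
        "english": b"the quick brown fox jumps over the lazy dog " * 20,
        "digits": b"0123456789 31415926535 27182818284 " * 20,
    }
    test = b"the lazy dog jumps over the quick brown fox"
    print(classify(test, references))
```

Subtracting the size of the compressed reference isolates the cost attributable to the test sequence, so that references of different lengths can be compared on the same footing. A frozen model, as proposed in the abstract, would additionally stop updating its statistics once the reference has been encoded.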

Publication details
Download http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.109.7517
Source http://csdl.computer.org/comp/proceedings/dcc/2004/2082/00/20820536.pdf
Contributors CiteSeerX
Repository CiteSeerX - Scientific Literature Digital Library and Search Engine (United States)
Type text
Language English
Relation 10.1.1.1.3346, 10.1.1.30.1819, 10.1.1.109.3733, 10.1.1.50.9317, 10.1.1.26.1074, 10.1.1.124.1232