Publication View

Decremental Feature-based Compaction (2004)

Abstract
In this paper we present the results of repeated feature-based compaction applied as part of DUC 2004 Task 1 (75-byte short summaries) on a set of printed news stories. We report the performance of a system built using tf*idf and named entities as the main features for retaining the most relevant parts of the text while compacting it. Multiple stages of compaction on the way to the final summary ensure that text is retained according to its informativeness within the already chosen information-rich zones. From the nature of the summaries produced at 75 bytes, we conjecture that there exists a certain threshold for compacting a news story, beyond which the quality and readability of a summary deteriorate.
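
To give a rough sense of what decremental, feature-based compaction can look like, here is a minimal sketch in Python. It is not the authors' system: the function names, the plain log(N/df) form of idf, the capitalization test used as a crude stand-in for named-entity detection, and the `entity_boost` parameter are all assumptions made for illustration. The sketch scores each word by tf*idf, boosts capitalized words, and repeatedly removes the lowest-scoring word until the text fits in 75 bytes.

```python
import math
from collections import Counter


def tfidf(doc_tokens, corpus_tokens):
    """Return {lowercased term: tf * idf} for one document.

    idf is the plain log(N / df); a term found in every document scores 0.
    """
    n_docs = len(corpus_tokens)
    doc_sets = [{t.lower() for t in d} for d in corpus_tokens]
    tf = Counter(t.lower() for t in doc_tokens)
    scores = {}
    for term, count in tf.items():
        # max(..., 1) guards against a document that is not in the corpus.
        df = max(sum(1 for s in doc_sets if term in s), 1)
        scores[term] = count * math.log(n_docs / df) if n_docs else 0.0
    return scores


def compact(text, corpus, limit_bytes=75, entity_boost=2.0):
    """Drop the lowest-weighted word, one at a time, until the text fits.

    Capitalized words get a boost as a crude (assumed) proxy for named
    entities; a real system would use a proper named-entity recognizer.
    """
    tokens = text.split()
    scores = tfidf(tokens, [d.split() for d in corpus])

    def weight(tok):
        w = scores[tok.lower()]
        return w * entity_boost if tok[:1].isupper() else w

    while tokens and len(" ".join(tokens).encode("utf-8")) > limit_bytes:
        drop = min(range(len(tokens)), key=lambda i: weight(tokens[i]))
        del tokens[drop]
    return " ".join(tokens)


story = ("The president of France met union leaders in Paris on Monday "
         "to discuss the proposed pension reform and the ongoing strikes")
corpus = [
    story,
    "The weather on Monday was mild and the markets were calm",
    "Leaders discuss the economy",
]
summary = compact(story, corpus)
print(summary)
```

Because words are removed one at a time with no regard for syntax, the output of a sketch like this tends to become telegraphic as the byte limit tightens, which is in line with the abstract's conjecture that readability deteriorates beyond some compaction threshold.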

Publication details
Download http://citeseerx.ist.psu.edu/viewdoc/summary?doi=?doi=10.1.1.123.6982
Source http://www.dcs.shef.ac.uk/~heidi/pubs/duc2004.pdf
Contributors CiteSeerX
Repository CiteSeerX - Scientific Literature Digital Library and Search Engine (United States)
Type text
Language English
Relation 10.1.1.29.6428, 10.1.1.50.2490, 10.1.1.27.3999, 10.1.1.13.9982, 10.1.1.10.1686, 10.1.1.94.6384