Improving translation quality by discarding most of the phrasetable (2007)

Abstract
It is possible to reduce the bulk of phrasetables for Statistical Machine Translation using a technique based on the significance testing of phrase pair co-occurrence in the parallel corpus. The savings can be quite substantial (up to 90%) and cause no reduction in BLEU score. In some cases, an improvement in BLEU is obtained at the same time, although the effect is less pronounced if state-of-the-art phrasetable smoothing is employed.
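
The abstract does not spell out the test, so the following is only a minimal sketch of what significance-based pruning could look like. It assumes a one-sided Fisher's exact test on the 2x2 contingency table of sentence-level counts. The function names, the layout of the phrase table, the toy counts, and the threshold are all hypothetical and chosen for illustration.

```python
from math import comb, log


def cooccurrence_p_value(n_sentences, count_src, count_tgt, count_joint):
    """Probability of seeing at least `count_joint` co-occurrences by chance.

    Assumption: a one-sided Fisher's exact test, i.e. the upper tail of a
    hypergeometric distribution over sentence pairs.
    """
    upper = min(count_src, count_tgt)
    total = comb(n_sentences, count_tgt)
    tail = sum(
        comb(count_src, k) * comb(n_sentences - count_src, count_tgt - k)
        for k in range(count_joint, upper + 1)
    )
    return tail / total


def prune(phrase_table, n_sentences, threshold):
    """Keep phrase pairs whose -log(p-value) is at least `threshold`.

    `phrase_table` maps (source, target) to the hypothetical count triple
    (count_src, count_tgt, count_joint).
    """
    kept = {}
    for pair, (c_s, c_t, c_st) in phrase_table.items():
        p = cooccurrence_p_value(n_sentences, c_s, c_t, c_st)
        # A p-value that underflows to zero is treated as maximally significant.
        score = float("inf") if p == 0.0 else -log(p)
        if score >= threshold:
            kept[pair] = (c_s, c_t, c_st)
    return kept


# Toy counts, invented for illustration only.
toy_table = {
    ("maison", "house"): (40, 45, 38),   # strongly associated pair
    ("de", "house"): (500, 500, 250),    # co-occurs about as often as chance
}
kept = prune(toy_table, n_sentences=1000, threshold=5.0)
```

On these toy counts the chance-level pair is dropped and the strongly associated pair is kept. A stricter threshold discards more of the table, which is the size-versus-quality trade-off the abstract reports on.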

Publication details
Download http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.76.7717
Source http://acl.ldc.upenn.edu/D/D07/D07-1103.pdf
Contributors CiteSeerX
Repository CiteSeerX - Scientific Literature Digital Library and Search Engine (United States)
Type text
Language English
Relation 10.1.1.13.8919, 10.1.1.19.9416, 10.1.1.122.2975, 10.1.1.126.4352