Chris Callison-Burch

ParaMetric: An Automatic Evaluation Metric for Paraphrasing (2009)

Chris Callison-Burch, Trevor Cohn, Mirella Lapata

We present ParaMetric, an automatic evaluation metric for data-driven approaches to paraphrasing. ParaMetric provides an objective measure of quality using a collection of multiple translations whose...

Constructing Corpora for the Development and Evaluation of Paraphrase Systems (2009)

Trevor Cohn, Chris Callison-Burch, Mirella Lapata

Automatic paraphrasing is an important component in many natural language processing tasks. In this paper we present a new parallel corpus with paraphrase annotations. We adopt a definition of...

Syntactic Constraints on Paraphrases Extracted from Parallel Corpora (2009)

Chris Callison-Burch

We improve the quality of paraphrases extracted from parallel corpora by requiring that phrases and their paraphrases be the same syntactic type. This is achieved by parsing the...

Paraphrase Substitution for Recognizing Textual Entailment (2008)

Wauter Bosma, Chris Callison-Burch

We describe a method for recognizing textual entailment that uses the length of the longest common subsequence (LCS) between two texts as its decision criterion. Rather than requiring...

Edinburgh System Description for the 2006 TC-STAR Spoken Language Translation Evaluation. TC-STAR Workshop on Speech-to-Speech Translation, June 19–21, 2006, Barcelona, Spain (2008)

Abhishek Arun, Amittai Axelrod, Alexandra Birch Mayne, Chris Callison-Burch, Hieu Hoang, Miles Osborne, ...

In this paper we describe the Edinburgh University statistical machine translation system, as used for the TC-STAR 2006 evaluation campaign. We participated in the primary Final Text Edition track...

A Compact Data Structure for Searchable Translation Memories (2008)

Chris Callison-Burch, Colin Bannard

In this paper we describe searchable translation memories, which allow translators to search their archives for possible translations of phrases. We describe how statistical machine translation can...

Secondary Benefits of Feedback and User Interaction in Machine Translation Tools (2007)

Raymond S. Flournoy, Chris Callison-Burch

User feedback has often been proposed as a method for improving the accuracy of machine translation systems, but useful feedback can also serve a number of secondary benefits, including increasing...

Statistical Natural Language Processing (2007)

Ali Farghaly, Chris Callison-Burch, Miles Osborne

Statistical natural language processing (SNLP) is a field lying in the intersection of natural language processing and machine learning. SNLP differs from traditional natural language processing in...

Upping the Ante for “Best of Breed” Machine Translation Providers (2007)

Chris Callison-Burch

The notion of “best of breed” among value-added machine translation technology providers is generally defined as providing access to the single best commercially available machine translation...

Contents (2007)

John Beavers, Tim Baldwin, Colin Bannard, Chris Callison-Burch, Ann Copestake, Dan Flickinger, ...

all of their comments, suggestions, and help in developing this system. I'd like to especially acknowledge Ann Copestake and Aline Villavicencio's earlier CCG LKB implementation as an...

Paraphrasing and Translation (2007)

Chris Callison-Burch

Paraphrasing and translation have previously been treated as unconnected natural language processing tasks. Whereas translation represents the preservation of meaning when an idea is rendered in the...

Moses: Open source toolkit for statistical machine translation (2007)

Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Richard Zens, Alexandra Constantin, ...

We describe an open-source toolkit for statistical machine translation whose novel contributions are (a) support for linguistically motivated factors, (b) confusion network decoding, and (c)...

Improved statistical machine translation using paraphrases (2006)

Chris Callison-Burch, Miles Osborne

Parallel corpora are crucial for training SMT systems. However, for many language pairs they are available only in very limited quantities. For these language pairs a huge portion of phrases...

Re-evaluating the role of BLEU in machine translation research (2006)

Chris Callison-Burch, Miles Osborne

We argue that the machine translation community is overly reliant on the Bleu machine translation evaluation metric. We show that an improved Bleu score is neither necessary nor sufficient for...

Re-evaluating the Role of BLEU in Machine Translation Research (2006)

Chris Callison-Burch, Miles Osborne, Philipp Koehn

We argue that the machine translation community is overly reliant on the Bleu machine translation evaluation metric. We show that an improved Bleu score is neither necessary nor sufficient for...

Constraining the phrase-based, joint probability statistical translation model. Proceedings of the Workshop on Statistical Machine Translation, pages 154–157, New York City (2006)

Alexandra Birch, Chris Callison-Burch, Miles Osborne

The joint probability model proposed by Marcu and Wong (2002) provides a strong probabilistic framework for phrase-based statistical machine translation (SMT). The model's usefulness is,...

Constraining the phrase-based, joint probability statistical translation model (2006)

Alexandra Birch, Chris Callison-Burch, Miles Osborne

The joint probability model proposed by Marcu and Wong (2002) provides a strong probabilistic framework for phrase-based statistical machine translation (SMT). The model’s usefulness is, however,...

Scaling phrase-based statistical machine translation to larger corpora and longer phrases (2005)

Chris Callison-Burch, Colin Bannard

In this paper we describe a novel data structure for phrase-based statistical machine translation which allows for the retrieval of arbitrarily long phrases while simultaneously using less memory...

Paraphrasing with Bilingual Parallel Corpora (2005)

Colin Bannard, Chris Callison-Burch

Previous work has used monolingual parallel corpora to extract and generate paraphrases. We show that this task can be done using bilingual parallel corpora, a much more commonly available resource....

Edinburgh system description for the 2005 IWSLT speech translation evaluation (2005)

Amittai Axelrod, Alexandra Birch Mayne, Chris Callison-Burch, Miles Osborne, David Talbot

Our participation in the IWSLT 2005 speech translation task is our first effort to work on limited domain speech data. We adapted our statistical machine translation system that performed...

Statistical machine translation with word- and sentence-aligned parallel corpora (2004)

Chris Callison-Burch, David Talbot, Miles Osborne

The parameters of statistical translation models are typically estimated from sentence-aligned parallel corpora. We show that significant improvements in the alignment and translation quality of such...

Statistical Machine Translation with Word- And Sentence-Aligned Parallel Corpora (2004)

Chris Callison-Burch, David Talbot, Miles Osborne

The parameters of statistical translation models are typically estimated from sentence-aligned parallel corpora. We show that significant improvements in the alignment and translation quality of such...

Searchable translation memories (2004)

Chris Callison-Burch, Colin Bannard, Josh Schroeder

In this paper we introduce a technique for creating searchable translation memories. Linear B’s searchable translation memories allow a translator to type in a phrase and retrieve a ranked list of...

Improving statistical translation through editing. European Association for Machine Translation (EAMT-04) Workshop (2004)

Chris Callison-Burch, Colin Bannard

In this paper we introduce Linear B’s statistical machine translation system. We describe how Linear B’s phrase-based translation models are learned from a parallel corpus, and show how the...

Evaluating question answering systems using FAQ answer injection (2003)

Jochen L. Leidner, Chris Callison-Burch

Question answering (NLQA) systems which retrieve a textual fragment from a document collection that represents the answer to a question are an active field of research. But evaluations currently...

Active Learning for Statistical Machine Translation (2003)

Chris Callison-Burch

For my PhD I propose to apply active learning to statistical machine translation. Statistical machine translation is a data-intensive way of producing translation systems. It uses machine learning...

Bootstrapping parallel corpora (2003)

Chris Callison-Burch, Miles Osborne

We present two methods for the automatic creation of parallel corpora. Whereas previous work into the automatic construction of parallel corpora has focused on harvesting them from the web, we...

Statistical Natural Language Processing (2003)

Ali Farghaly, Chris Callison-Burch, Miles Osborne

Statistical natural language processing (SNLP) is a field lying in the intersection of natural language processing and machine learning. SNLP differs from traditional natural language processing in...

Co-training for Statistical Machine Translation (2002)

Chris Callison-Burch

I propose a novel co-training method for statistical machine translation. As co-training requires multiple learners trained on views of the data which are disjoint and sufficient for the labeling...

Co-training for Statistical Machine Translation (2002)

Chris Callison-Burch

We present a novel co-training method for statistical machine translation. Since co-training requires independent views on the data, with each view being sufficient for the labeling task, we use...

A Program for Automatically Selecting the Best Output from Multiple Machine Translation Engines (2001)

Chris Callison-Burch, Raymond S. Flournoy

This paper describes a program that automatically selects the best translation from a set of translations produced by multiple commercial machine translation engines. The program is simplified by...

A Natural Language Question and Answer System (2000)

Chris Callison-Burch, Philip Shilane

Our project is a question and answer system that allows natural language questions to be asked of a knowledge base of information. Our program is grammar-based