James R. Curran

Publication List Details

Period

1999 - 2009

Number

50

Challenges for automatically extracting molecular interactions from full-text articles (2009)

Tara McIntosh, James R. Curran

The increasing availability of full-text biomedical articles will allow more biomedical knowledge to be extracted automatically with greater reliability. However, most Information...

Experiments in Mutual Exclusion Bootstrapping (2009)

Tara Murphy, James R. Curran

Mutual Exclusion Bootstrapping (MEB) was designed to overcome the problem of semantic drift suffered by iterative bootstrapping, where the meaning of extracted terms quickly drifts from the original...

Random Indexing using Statistical Weight Functions (2009)

James Gorman, James R. Curran

Random Indexing is a vector space technique that provides an efficient and scalable approximation to distributional similarity problems. We present experiments showing Random Indexing to be poor at...

The Pronto QA system at TREC-2007: harvesting hyponyms, using nominalisation patterns, and computing answer cardinality (2009)

Johan Bos, James R. Curran, Edoardo Guzzetti

The backbone of the Pronto QA system is linguistically-principled: Combinatory Categorial Grammar is used to generate syntactic analyses of questions and potential answer snippets, and Discourse...

Named Entity Recognition for Astronomy Literature (2009)

Tara Murphy, Tara McIntosh, James R. Curran

We present a system for named entity recognition (NER) in astronomy journal articles. We have developed this system on an NE corpus comprising approximately 200,000 words of text from astronomy...

Building a Search Engine to Drive Problem-Based Learning (2008)

Steven Bird, James R. Curran

Search engines pervade the digital world, mediating most access to information instantaneously. We have found that students can build search engine components, and even entire search engines, in the...

Wide-coverage semantic representations from a CCG parser (2008)

Johan Bos, Stephen Clark, Mark Steedman, James R. Curran, Julia Hockenmaier

This paper shows how to construct semantic representations from the derivations produced by a wide-coverage CCG parser. Unlike the dependency structures returned by the parser itself, these can be...

Formalisation of Transformation-based Learning (2008)

James R. Curran, Raymond K. Wong

Research in automatic Part of Speech (POS) tagging has been dominated by Markov Model (MM) taggers. Brill [1, 3, 6] has recently described a transformation-based system with comparable...

Improving the Efficiency of a Wide-Coverage CCG Parser (2008)

Bojan Djordjevic, James R. Curran

The C&C CCG parser is a highly efficient linguistically motivated parser. The efficiency is achieved using a tightly-integrated supertagger, which assigns CCG lexical categories to words in a...

Minimising semantic drift with Mutual Exclusion Bootstrapping (2008)

James R. Curran, Tara Murphy, Bernhard Scholz

Iterative bootstrapping techniques are commonly used to extract lexical semantic resources from raw text. Their major weakness is that, without costly human intervention, the extracted terms (often...

Named Entity Recognition for Astronomy Literature (2008)

Tara Murphy, Tara McIntosh, James R. Curran

We present a system for named entity recognition (NER) in astronomy journal articles. We have developed this system on an NE corpus comprising approximately 200,000 words of text from astronomy...

Linguistically motivated large-scale NLP with C&C and Boxer (2007)

James R. Curran

The statistical modelling of language, together with advances in wide-coverage grammar development, have led to high levels of robustness and efficiency in NLP systems and made linguistically...

Wide-coverage efficient statistical parsing with CCG and log-linear models. Computational Linguistics (2007)

Stephen Clark, James R. Curran

This paper describes a number of log-linear parsing models for an automatically extracted lexicalized grammar. The models are “full” parsing models in the sense that probabilities are...

Sentence retrieval for extracting biomedical knowledge (2007)

Tara McIntosh, James R. Curran

At present, the majority of biomedical Information Retrieval tools process abstracts rather than full-text articles. The increasing availability of full text will allow more knowledge to be extracted...

Perceptron training for a wide-coverage lexicalized-grammar parser (2007)

Stephen Clark, James R. Curran

This paper investigates perceptron training for a wide-coverage CCG parser and compares the perceptron with a log-linear model. The CCG parser uses a phrase-structure parsing model and dynamic...

Multi-tagging for lexicalized-grammar parsing (2006)

James R. Curran

With performance above 97% accuracy for newspaper text, part of speech (POS) tagging might be considered a solved problem. Previous studies have shown that allowing the parser to resolve POS tag...

Web Text Corpus for Natural Language Processing (2006)

Vinci Liu, James R. Curran

Web text has been successfully used as training data for many NLP applications.

Approximate Searching for Distributional Similarity (2005)

James Gorman, James R. Curran

Distributional similarity requires large volumes of data to accurately represent infrequent words. However, the nearest-neighbour approach to finding synonyms suffers from poor scalability. The...

Question Answering with QED at TREC-2005 (2005)

Kisuh Ahn, Johan Bos, James R. Curran, Dave Kor, Malvina Nissim, Bonnie Webber

This report describes the system developed by the University of Edinburgh and the University of Sydney for the TREC-2005 question answering evaluation exercise. The backbone of our question-answering...

Wide-coverage semantic representations from a CCG parser (2004)

Johan Bos, Stephen Clark, Mark Steedman, James R. Curran, Julia Hockenmaier

This paper shows how to construct semantic representations from the derivations produced by a wide-coverage CCG parser. Unlike the dependency structures returned by the parser itself, these can be...

Question answering with QED and WEE at TREC-2004 (2004)

Kisuh Ahn, Johan Bos, Stephen Clark, James R. Curran, Tiphaine Dalmas, Jochen L. Leidner, ...

This report describes the experiments of the University of Edinburgh and the University of Sydney at the TREC-2004 question answering evaluation exercise. Our system combines two approaches: one with...

QED: The Edinburgh TREC-2003 question answering system (2003)

Jochen L. Leidner, Johan Bos, Tiphaine Dalmas, James R. Curran, Stephen Clark, Colin J. Bannard, ...

This report describes a new open-domain answer retrieval system developed at the University of Edinburgh and gives results for the TREC-12 question answering track. Phrasal answers are identified by...

Bootstrapping POS taggers using unlabelled data (2003)

Stephen Clark, James R. Curran, Miles Osborne

This paper investigates bootstrapping part-of-speech taggers using co-training, in which two taggers are iteratively re-trained on each other’s output. Since the output of the taggers is noisy,...

Investigating GIS and smoothing for maximum entropy taggers (2003)

James R. Curran, Stephen Clark

This paper investigates two elements of Maximum Entropy tagging: the use of a correction feature in the Generalised Iterative Scaling (GIS) estimation algorithm, and techniques for model smoothing....

Language Independent NER using a Maximum Entropy Tagger (2003)

James R. Curran, Stephen Clark

Named Entity Recognition (NER) systems need to integrate a wide variety of information for optimal performance. This paper demonstrates that a maximum entropy tagger can effectively encode such...

A very very large corpus doesn’t always yield reliable estimates (2002)

James R. Curran, Miles Osborne

Banko and Brill (2001) suggested that the development of very large training corpora may be more effective for progress in empirical Natural Language Processing than improving methods that use...

Ensemble methods for automatic thesaurus extraction (2002)

James R. Curran

Ensemble methods are state of the art for many NLP tasks. Recent work by Banko and Brill (2001) suggests that this would not necessarily be true if very large training corpora were available....

Improvements in automatic thesaurus extraction (2002)

James R. Curran, Marc Moens

The use of semantic resources is common in modern NLP systems, but methods to extract lexical semantics have only recently begun to perform well enough for practical use. We evaluate existing and new...

Scaling context space (2002)

James R. Curran, Marc Moens

Context is used in many NLP systems as an indicator of a term’s syntactic and semantic function. The accuracy of the system is dependent on the quality and quantity of contextual information...

Transformation-based learning for automatic translation from HTML to XML (1999)

James R. Curran, Raymond K. Wong

Format tags implicitly represent content information in the same ambiguous, context-dependent manner that words represent semantics in natural language. Translation from format to content markup...