Timothy Baldwin

Measuring and Predicting Orthographic Associations: Modelling the Similarity of Japanese Kanji (2009)

Lars Yencken, Timothy Baldwin

As human beings, our mental processes for recognising linguistic symbols generate perceptual neighbourhoods around such symbols where confusion errors occur. Such neighbourhoods also provide us with...

Automatic Event Reference Identification (2009)

Timothy Baldwin

Event reference identification is often treated as a sentence-level classification task. However, several different event references can occur within a single sentence. We present a set of...

Applying Discourse Analysis and Data Mining Methods to Spoken OSCE Assessments (2009)

Meladel Mistica, Timothy Baldwin, Marisa Cordella, Simon Musgrave

This paper looks at the transcribed data of patient-doctor consultations in an examination setting. The doctors are internationally qualified and enrolled in a bridging course as preparation for...

A Machine Learning Approach to Multiword Expression Extraction (2009)

Timothy Baldwin, Stefan Evert, Brigitte Krenn, Pavel Pecina, Dimitra Anastasiou, Michael Carl, ...

11.00- 13.30 Resource session II 13.30- 14.30 Lunch break A Lexicon of shallow-typed German-English MW-Expressions and a German Corpus of MW-Expressions annotated Sentences

Facilitating Biomedical Systematic Reviews Using Ranked Text Retrieval and Classification (2009)

David Martinez, Sarvnaz Karimi, Lawrence Cavedon, Timothy Baldwin

Searching and selecting articles to be included in systematic reviews is a real challenge for healthcare agencies responsible for publishing these reviews. The current practice of manually...

MELB-YB: Preposition Sense Disambiguation Using Rich Semantic Features (2009)

Patrick Ye, Timothy Baldwin

This paper describes a maxent-based preposition sense disambiguation system entered in the preposition sense disambiguation task of SemEval 2007. This system uses a wide variety of semantic and...

An Unsupervised Approach to Interpreting Noun Compounds (2009)

Su Nam Kim, Timothy Baldwin

This paper proposes an unsupervised approach to automatically interpret noun compounds using semantic similarity. Our proposed unsupervised method is based on obtaining a large amount of...

Using Collaboratively Constructed Document Collections to Simulate Real-World Object Comparisons (2009)

Karl Grieser, Timothy Baldwin, Fabian Bohnert, Liz Sonenberg

While the layout of a museum exhibition is largely prescribed by the curator, visitors to museums view connections between exhibits in ways unique to themselves. With the assistance of a...

Landmark Classification for Route Directions (2009)

Aidan Furlan, Timothy Baldwin

In order for automated navigation systems to operate effectively, the route instructions they produce must be clear, concise and easily understood by users. In order to incorporate a landmark within...

Donostia, Basque Country (2009)

Eneko Agirre, Timothy Baldwin, David Martinez

To date, parsers have made limited use of semantic information, but there is evidence to suggest that semantic features can enhance parse disambiguation. This paper shows that semantic classes help...

MELB-KB: Nominal Classification as Noun Compound Interpretation (2009)

Su Nam Kim, Timothy Baldwin

In this paper, we outline our approach to interpreting semantic relations in nominal pairs in SemEval-2007 task #4: Classification of Semantic Relations between Nominals. We build on two baseline...

Aspect-Based Personalized Text Summarization (2009)

Shlomo Berkovsky, Timothy Baldwin, Ingrid Zukerman

This work investigates user attitudes towards personalized summaries generated from a coarse-grained user model based on document aspects. We explore user preferences for summaries at...

∗Center for the Study of Language and Information (2009)

Timothy Baldwin, Emily M. Bender, Dan Flickinger, Ara Kim, Stephan Oepen

This paper addresses two questions: (1) when a large deep processing resource developed for relatively closed domains is run over open text, what coverage does it have, and (2) what are the most...

MELB-MKB: Lexical Substitution System based on Relatives in Context (2009)

David Martinez, Su Nam Kim, Timothy Baldwin

In this paper we describe the MELB-MKB system, as entered in the SemEval-2007 lexical substitution task. The core of our system was the “Relatives in Context” unsupervised approach, which ranked...

Chapter 1 IN SEARCH OF A SYSTEMATIC TREATMENT OF DETERMINERLESS PPS (2009)

Timothy Baldwin, John Beavers, Francis Bond, Dan Flickinger, Ivan A. Sag

This paper examines determinerless prepositional phrases in English and Dutch from a theoretical perspective. We classify attested P + N combinations across a number of analytic dimensions, arguing...

In Proceedings of the 9th International Conference on Theoretical and Methodological Issues in Machine Translation (TMI 2002), Keihanna, Japan, pp. 1-11. Alternation-based Lexicon Reconstruction (2009)

Timothy Baldwin, Francis Bond

This research is aimed at developing a hierarchical alternation-based lexical architecture for machine translation. The proposed architecture makes extensive use of information sharing in describing...

Learning Count Classifier Preferences of Malay Nouns (2009)

Jeremy Nicholson, Timothy Baldwin

We develop a data set of Malay lexemes labelled with count classifiers that are attested in raw or lemmatised corpora. A maximum entropy classifier based on simple, language-inspecific features...

Abstract (2009)

Timothy Baldwin

We present a method for compositionally translating Japanese NN compounds into English, using a word-level transfer dictionary and target language monolingual corpus. The method interpolates over...

The Corpus and the Lexicon: Standardising Deep Lexical Acquisition Evaluation (2009)

Yi Zhang, Timothy Baldwin, Valia Kordoni

This paper is concerned with the standardisation of evaluation metrics for lexical acquisition over precision grammars, which are attuned to actual parser performance. Specifically, we investigate...

2006. Interpretation of compound nominalisations using corpus and web statistics (2009)

Jeremy Nicholson, Timothy Baldwin

We present two novel paraphrase tests for automatically predicting the inherent semantic relation of a given compound nominalisation as one of subject, direct object, or prepositional object. We...

Evaluating the FOKS Error Model (2009)

Slaven Bilac, Timothy Baldwin, Hozumi Tanaka

Learners of Japanese face great difficulty when trying to look up words containing kanji in a dictionary, due to the requirement of knowing the correct reading of the target word. We propose a system...

Dictionary-driven analysis of Japanese verbal alternations (2009)

Timothy Baldwin, Francis Bond, Kentaro Ogura

We present a method for extracting verbal (diathesis) alternations from a valency dictionary, based on comparison of selectional restrictions. The quality of match between selectional restrictions is...

Experiments on pattern-based relation learning (2009)

Willy Yap, Timothy Baldwin

Relation extraction is a sub-task of Information Extraction (IE) that is concerned with extracting semantic relations (such as antonymy, synonymy or hypernymy) between word pairs from corpus data....

Benchmarking Noun Compound Interpretation (2008)

Su Nam Kim, Timothy Baldwin

In this paper we provide benchmark results for two classes of methods used in interpreting noun compounds (NCs): semantic similarity-based methods and their hybrids. We evaluate the methods using...

Extending Sense Collocations in Interpreting Noun Compounds (2008)

Su Nam Kim, Meladel Mistica, Timothy Baldwin

This paper investigates the task of noun compound interpretation, building on the sense collocation approach proposed by Moldovan et al. (2004). Our primary task is to evaluate the impact of similar...

Abstract (2008)

Ichiro Yamada, Timothy Baldwin

We present two methods for automatically discovering the telic and agentive roles of nouns from corpus data. These relations form part of the qualia structure assumed in generative lexicon theory,...

Detecting Compositionality of English Verb-Particle Constructions using Semantic Similarity (2008)

Su Nam Kim, Timothy Baldwin

We present a novel method for detecting the compositionality of English verb-particle constructions (VPCs), based on the assumption that compositionality can be modelled with semantic similarity...

MELB-MKB: Lexical Substitution System based on Relatives in Context (2008)

David Martinez, Su Nam Kim, Timothy Baldwin

In this paper we describe the MELB-MKB system, as entered in the SemEval-2007 lexical substitution task. The core of our system was the “Relatives in Context” unsupervised approach, which ranked...

Chapter 1 IN SEARCH OF A SYSTEMATIC TREATMENT OF DETERMINERLESS PPS (2008)

Timothy Baldwin, John Beavers, Francis Bond, Dan Flickinger, Ivan A. Sag

This paper examines determinerless prepositional phrases in English and Dutch from a theoretical perspective. We classify attested P + N combinations across a number of analytic dimensions,...

An Investigation into the Interaction between Feature Selection and Discretization: Learning How and When to Read Numbers (2008)

Sumukh Ghodke, Timothy Baldwin

Pre-processing is an important part of machine learning, and has been shown to significantly improve the performance of classifiers. In this paper, we take a selection of pre-processing...

Efficient Grapheme-phoneme Alignment for Japanese (2008)

Lars Yencken, Timothy Baldwin

Current approaches to the grapheme-phoneme alignment problem for Japanese achieve good accuracy, but are extremely computationally expensive. In this paper we evaluate various modifications to...

Deep Lexical Acquisition (2008)

Timothy Baldwin

lingo.stanford.edu/pubs/tbaldwin/altss2003-ohp.pdf

Modelling the Orthographic Neighbourhood for Japanese Kanji (2008)

Lars Yencken, Timothy Baldwin

Japanese kanji recognition experiments are typically narrowly focused, and feature only native speakers as participants. It remains unclear how to apply their results to kanji similarity...

Takaaki Tanaka (2008)

Timothy Baldwin

We present a method for compositionally translating noun-noun (NN) compounds, using a word-level bilingual dictionary and syntactic templates for candidate generation, and corpus and dictionary...

An Intelligent Search Infrastructure for Language Resources on the Web ARC Special Research Initiative (E-Research) SR0567353 (2008)

Chief Investigators: Timothy Baldwin, Steven Bird, Baden Hughes

Language occupies a central role on the web: most content is expressed in a given language, and most access takes place via natural language input and interfaces. Today, investigation of human...

2000, ‘Verb alternations and Japanese — how, what and where (2008)

Timothy Baldwin, Hozumi Tanaka

We set out to empirically identify the range and frequency of basic verb alternation types in Japanese, through analysis of the Goi-Taikei Japanese pattern-based valency dictionary. This is achieved...

Scalable Deep Linguistic Processing: Mind the Lexical Gap (2008)

Timothy Baldwin

Coverage has been a constant thorn in the side of deployed deep linguistic processing applications, largely because of the difficulty in constructing, maintaining and domain-tuning the complex...

1999a, ‘An alternation-based Japanese valency dictionary architecture (2008)

Timothy Baldwin, Francis Bond, Ben Hutchinson

This research is aimed at developing a valency dictionary architecture to comprehensively list the full range of alternations associated with a given predicate sense, both efficiently and robustly....

2002, ‘Alternation-based lexicon reconstruction (2008)

Timothy Baldwin, Francis Bond

This research is aimed at developing a hierarchical alternation-based lexical architecture for machine translation. The proposed architecture makes extensive use of information sharing in describing...

A Computational Account of Modality-based Case Frame Transformation (2008)

Timothy Baldwin, Takenobu Tokunaga, Hozumi Tanaka

Verb modality presents a major processing obstacle in any NLP application, and can be overcome either by

Balancing up Efficiency and Accuracy in Translation Retrieval (2008)

Timothy Baldwin, Hozumi Tanaka

This research looks at the effects of segment order and segmentation on translation retrieval performance for an experimental Japanese-English translation memory system. We implement a number of both...

General-purpose lexical acquisition: Procedures, questions and results (2008)

Timothy Baldwin

We discuss a range of in vitro and in vivo approaches to deep lexical acquisition, and evaluate a representative sample of each in learning lexical items for a precision grammar. Evaluation focuses...

Translation Memory Engines: A Look under the Hood and Road Test (2008)

Timothy Baldwin

In this paper, we compare the relative effects of segment order, segmentation and segment contiguity on the retrieval performance of a translation memory system. We take a selection of both...

Abstract (2008)

Colin Bannard, Timothy Baldwin

Prepositions are often considered to have too little semantic content or be too polysemous to warrant a proper semantic description. We first illustrate the suitability of distributional similarity...

Abstract (2008)

Timothy Baldwin

We present a method for compositionally translating Japanese NN compounds into English, using a word-level transfer dictionary and target language monolingual corpus. The method interpolates over...

Word Sense Disambiguation Incorporating Lexical and Structural Semantic Information (2008)

Takaaki Tanaka, Francis Bond, Timothy Baldwin, Sanae Fujita, Chikara Hashimoto

We present results that show that incorporating lexical and structural semantic information is effective for word sense disambiguation. We evaluated the method by using precise information from a...

Evaluation (LREC 2002), Las Palmas, Canary Islands, pp. 979-85. Enhanced Japanese Electronic Dictionary Look-up (2008)

Timothy Baldwin, Slaven Bilac, Ryo Okumura, Takenobu Tokunaga, Hozumi Tanaka

This paper describes the process of data preparation and reading generation for an ongoing project aimed at improving the accessibility of unknown words for learners of foreign languages, focusing...

ACL/HCSNet Advanced Programme in NLP Learning Lexical Semantic Representations (2008)

Timothy Baldwin

• Lexical semantic representations are all well and good, BUT: how can we produce them automatically?

Structure of Course (2008)

Timothy Baldwin

c. Extraction/identification

2002, ‘Alternation-based lexicon reconstruction (2008)

Timothy Baldwin, Francis Bond

This research is aimed at developing a hierarchical alternation-based lexical...

What are Multiword Expressions (MWEs)? (2008)

Timothy Baldwin

• Definition: A multiword expression (MWE) is:

Automatic Thread Classification for Linux User Forum Information Access (2008)

Timothy Baldwin, David Martinez, Richard B. Penman

We experiment with text classification of threads from Linux web user forums, in the context of improving information access to the problems and solutions described in the threads. We...

Verb Sense Disambiguation Using Selectional Preferences Extracted with a State-of-the-art Semantic Role Labeler (2008)

Patrick Ye, Timothy Baldwin

This paper investigates whether multisemantic-role (MSR) based selectional preferences can be used to improve the performance of supervised verb sense disambiguation. Unlike conventional selectional...

POS Tagging with a More Informative Tagset (2008)

Andrew Mackinlay, Timothy Baldwin

We investigate the impact of introducing finer distinctions into the tagset on the accuracy of part-of-speech tagging. This is a tangential approach to most recent research in the field, which has...

Using collaborative models to adaptively predict visitor locations in museums (2008)

Fabian Bohnert, Ingrid Zukerman, Shlomo Berkovsky, Timothy Baldwin, Liz Sonenberg

The vast amounts of information presented in museums can be overwhelming to a visitor, whose receptivity and time are typically limited. Hence, s/he might have difficulties selecting...

Preliminary analysis of the range and frequency of Japanese verb alternations (2007)

Timothy Baldwin, Takenobu Tokunaga, Hozumi Tanaka

We set out to empirically identify the range and frequency of basic verb alternation types in Japanese, through analysis of the Goi-Taikei Japanese pattern-based valency dictionary. This is achieved...

Construction of an alternation-based English valency dictionary (2007)

Ben Hutchinson, Francis Bond, Timothy Baldwin

This paper describes the construction of an English valency dictionary which lists a wide range of alternations for each verb sense. Information is automatically extracted from the on-line version of...

Lexical Effects in Verb Sense Disambiguation (2007)

Timothy Baldwin, Takenobu Tokunaga, Hozumi Tanaka

First, we propose a unified framework for evaluating verb sense in a selectional restriction-based dictionary architecture, including both generalised and fixed verb senses. The proposed methodology...

Supervised by Prof. Hozumi Tanaka (2007)

Timothy Baldwin

1.1 Objectives and outline; 1.1.1 Statement of purpose of this research

Abstract (2007)

Timothy Baldwin

We present a method for compositionally translating Japanese NN compounds into English, using a word-level transfer dictionary and target language monolingual corpus. The method interpolates over...

Abstract (2007)

Colin Bannard, Timothy Baldwin

Prepositions are often considered to have too little semantic content or be too polysemous to warrant a proper semantic description. We first illustrate the suitability of distributional similarity...

2002, ‘Alternation-based lexicon reconstruction (2007)

Timothy Baldwin, Francis Bond

This research is aimed at developing a hierarchical alternation-based lexical architecture for machine translation. The proposed architecture makes extensive use of information sharing in describing...

NTT Communication Science Laboratories (2007)

Timothy Baldwin, Francis Bond, Kentaro Ogura

We present a method for extracting verbal (diathesis) alternations from a valency dictionary, based on comparison of selectional restrictions. The quality of match between selectional restrictions is...

Balancing up Efficiency and Accuracy in Translation Retrieval (2007)

Timothy Baldwin, Hozumi Tanaka

This research looks at the effects of segment order and segmentation on translation retrieval performance for an experimental Japanese-English translation memory system. We implement a number of both...

2000, ‘Verb alternations and Japanese — how, what and where (2007)

Timothy Baldwin, Hozumi Tanaka

We set out to empirically identify the range and frequency of basic verb alternation types in Japanese, through analysis of the Goi-Taikei Japanese pattern-based valency dictionary. This is achieved...

A Computational Account of Modality-based Case Frame Transformation (2007)

Timothy Baldwin, Takenobu Tokunaga, Hozumi Tanaka

this paper: pass = passive, cause = causative, pres = non-past, nom = nominative, acc = accusative, dat = dative, com = comitative, gen = genitive

1999a, ‘An alternation-based Japanese valency dictionary architecture (2007)

Timothy Baldwin, Francis Bond, Ben Hutchinson

This research is aimed at developing a valency dictionary architecture to comprehensively list the full range of alternations associated with a given predicate sense, both efficiently and robustly....

Automatic Acquisition of Qualia Structure from Corpus Data (2007)

Ichiro Yamada, Timothy Baldwin, Hideki Sumiyoshi, Masahiro Shibata, Nobuyuki Yagi

This paper presents a method to automatically acquire a given noun's telic and agentive roles from corpus data. These relations form part of the qualia structure assumed in the generative lexicon,...

Structured Classification for Multilingual Natural Language Processing (2007)

Philip Blunsom, Timothy Baldwin, Steven Bird, James Curran

This thesis investigates the application of structured sequence classification models to multilingual natural language processing (NLP). Many tasks tackled by NLP can be framed as classification,...

The impact of deep linguistic processing on parsing technology (2007)

Timothy Baldwin, Julia Hockenmaier

As the organizers of the ACL 2007 Deep Linguistic Processing workshop (Baldwin et al., 2007), we were asked to discuss our perspectives on the role of current trends in deep linguistic processing for...

Interpreting Noun Compound Using Bootstrapping and Sense Collocation (2007)

Su Nam Kim, Timothy Baldwin

This paper describes a bootstrapping method for automatically tagging noun compounds with their corresponding semantic relations. Our work takes advantage of the collocation of senses of the noun...

Dynamic path prediction and recommendation in a museum environment (2007)

Karl Grieser, Timothy Baldwin, Steven Bird

This research is concerned with making recommendations to museum visitors based on their history within the physical environment, and textual information associated with each item in their history....

Disambiguating noun compounds (2007)

Su Nam Kim, Timothy Baldwin

This paper is concerned with the interaction between word sense disambiguation and the interpretation of noun compounds (NCs) in English. We develop techniques for disambiguating word sense...

Scalable Deep Linguistic Processing: Mind the Lexical Gap (2007)

Timothy Baldwin

PACLIC 21 / Seoul National University, Seoul, Korea / November 1-3, 2007

Reconsidering Language Identification for Written Language Resources (2006)

Baden Hughes, Timothy Baldwin, Steven Bird, Jeremy Nicholson, Andrew MacKinlay

The task of identifying the language in which a given document (ranging from a sentence to thousands of pages) is written has been relatively well studied over several decades. Automated approaches...

Collecting Low-Density Language Materials on the Web (2006)

Timothy Baldwin, Steven Bird, Baden Hughes

Most web content exists in a few dozen languages. Hundreds of other languages - the `low-density languages' - are only represented in scarce quantities on the web. How can we locate, store and...

Detecting Entailment Using an Extended Implementation of the Basic Elements Overlap Metrics (2006)

Jeremy Nicholson, Nicola Stokes, Timothy Baldwin

In this paper we evaluate the utility of the recently proposed Basic Elements (BE) summarisation evaluation metric as a means of detecting entailment in a text/hypothesis sentence pair. Basic...

3 Landmark Center, (2006)

Boban Arsenijević, Timothy Baldwin, Beata Trawiński, Priscilla Rasmussen

Workshop on Prepositions Proceedings of the Workshop Workshop Chairs:

Multilingual deep lexical acquisition for HPSGs via supertagging (2006)

Phil Blunsom, Timothy Baldwin

We propose a conditional random field-based method for supertagging, and apply it to the task of learning new lexical items for HPSG-based precision grammars of English and Japanese. Using a...

Semantic role labeling of prepositional phrases (2006)

Patrick Ye, Timothy Baldwin

We propose a method for labelling prepositional phrases according to two different semantic role classifications, as contained in the Penn treebank and the CoNLL 2004 Semantic Role...

Die morphologie (f): Targeted lexical acquisition for languages other than English (2006)

Jeremy Nicholson, Timothy Baldwin, Phil Blunsom

We examine standard deep lexical acquisition features in automatically predicting the gender of noun types and tokens by bootstrapping from a small annotated corpus. Using a knowledge-poor approach...

Reconsidering language identification for written language resources (2006)

Baden Hughes, Timothy Baldwin, Steven Bird, Jeremy Nicholson, Andrew Mackinlay

The task of identifying the language in which a given document (ranging from a sentence to thousands of pages) is written has been relatively well studied over several decades. Automated approaches...

Semantic role labeling of prepositional phrases (2006)

Patrick Ye, Timothy Baldwin

In this paper, we propose a method for labelling prepositional phrases according to two different semantic role classifications, as contained in the Penn treebank and the CoNLL 2004...

Automatic identification of English verb particle constructions using linguistic features (2006)

Su Nam Kim, Timothy Baldwin

This paper presents a method for identifying token instances of verb particle constructions (VPCs) automatically, based on the output of the RASP parser. The proposed method pools together instances...

Interpreting semantic relations in noun compounds via verb semantics. COLING-ACL (2006)

Su Nam Kim, Timothy Baldwin

We propose a novel method for automatically interpreting compound nouns based on a predefined set of semantic relations. First we map verb tokens in sentential contexts to a fixed set of seed verbs...

Bootstrapping Deep Lexical Resources: Resources for Courses (2005)

Timothy Baldwin

We propose a range of deep lexical acquisition methods which make use of morphological, syntactic and ontological language resources to model word similarity and bootstrap from a seed lexicon. The...

Statistical interpretation of compound nominalisations (2005)

Jeremy Nicholson, Timothy Baldwin

This paper presents a method for detecting compound nominalisations from open data, and providing a semantic interpretation. It uses a statistical model based on confidence intervals over frequencies...

Automatic interpretation of noun compounds using WordNet similarity (2005)

Su Nam Kim, Timothy Baldwin

The paper introduces a method for interpreting novel noun compounds with semantic relations. The method is built around word similarity with pretagged noun compounds, based...

Statistical Interpretation of Compound Nouns A thesis presented by (2005)

Jeremy Nicholson, Timothy Baldwin

We present a method for detecting compound nominalisations in open data, and deriving an interpretation for them. Discovering the semantic relationship between the modifier and head noun in a...

DISTRIBUTIONAL SIMILARITY AND PREPOSITION SEMANTICS (2005)

Timothy Baldwin

Prepositions are often considered to have too little semantic content or be too polysemous to warrant a proper semantic description. We illustrate the suitability of distributional similarity methods...

Automatic Discovery of Telic and Agentive Roles from Corpus Data (2004)

Ichiro Yamada, Timothy Baldwin

We present two methods for automatically discovering the telic and agentive roles of nouns from corpus data. These relations form part of the qualia structure assumed in generative lexicon theory,...

Road-testing the English Resource Grammar over the British National Corpus (2004)

Timothy Baldwin, Emily M. Bender, Dan Flickinger, Ara Kim, Stephan Oepen

This paper addresses two questions: (1) when a large deep processing resource developed for relatively closed domains is run over open text, what coverage does it have, and (2) what are the most...

VRML 97: The Virtual Reality Modeling Language, iso/iec 14772:1997 (2004)

Timothy Baldwin

We present a method for compositionally translating noun-noun (NN) compounds, using a word-level bilingual dictionary and syntactic templates for candidate generation, and corpus and dictionary...

Arboretum: Using a precision grammar for grammar checking in CALL (2004)

Emily M. Bender, Dan Flickinger, Stephan Oepen, Annemarie Walsh, Timothy Baldwin

We present a tutorial system for language learners, using a computational grammar augmented with mal-rules for analysis, error diagnosis, and semantics-centered generation of corrected forms....

Crosslingual countability classification with EuroWordNet (2004)

Timothy Baldwin

We examine the hypothesis that noun countability is consistent for a given word semantics by way of a series of experiments involving EuroWordNet and the English and Dutch languages. The basic method...

A multilingual database of idioms (2004)

Aline Villavicencio, Timothy Baldwin, Benjamin Waldron

This paper presents a possible architecture for a multilingual database of idioms. We discuss the challenges that idioms present to the creation of such a database and propose a possible encoding...

A Multilingual Database of Idioms (2004)

Aline Villavicencio, Timothy Baldwin, Benjamin Waldron

This paper presents a possible architecture for a multilingual database of idioms. We discuss the challenges that idioms present to the creation of such a database and propose a possible encoding...

Arboretum: Using a precision grammar for grammar checking in CALL (2004)

Emily Bender, Dan Flickinger, Stephan Oepen, Annemarie Walsh, Timothy Baldwin

We present a tutorial system for language learners, using a computational grammar augmented with mal-rules for analysis, error diagnosis, and semantics-centered generation of corrected forms.

The ins and outs of Dutch noun countability classification (2003)

Timothy Baldwin

This paper presents a range of methods for classifying Dutch noun countability based on either Dutch or English data. The classification is founded on translational equivalences and the corpus...

Increasing the error coverage of the FOKS Japanese dictionary interface (2003)

Slaven Bilac, Timothy Baldwin, Hozumi Tanaka

With the advent of electronic dictionaries, significant progress has been made in improving the accessibility of dictionary entries allowing for speedy and wide-ranging dictionary lookups....

Learning the countability of English nouns from corpus data (2003)

Timothy Baldwin

This paper describes a method for learning the countability preferences of English nouns from raw text corpora. The method maps the corpus-attested lexico-syntactic properties of each noun onto a...

Improving dictionary accessibility by maximizing use of available knowledge. Traitement automatique des langues (2003)

Slaven Bilac, Timothy Baldwin, Hozumi Tanaka

The dictionary lookup of unknown words is particularly difficult in Japanese due to the requirement of knowing the correct word reading. We propose a system which supplements partial...

Learning the Countability of English Nouns from Corpus Data (2003)

Timothy Baldwin

This paper describes a method for learning the countability preferences of English nouns from raw text corpora. The method maps the corpus-attested lexico-syntactic properties of each noun onto a...

Learning the Countability of English Nouns from Corpus Data (2003)

Timothy Baldwin, Francis Bond

This paper describes a method for learning the countability preferences of English nouns from raw text corpora. The method maps the corpus-attested lexico-syntactic properties of each noun onto a...

An Empirical Model of Multiword Expression Decomposability (2003)

Timothy Baldwin, Colin Bannard, Takaaki Tanaka, Dominic Widdows

This paper presents a construction-inspecific model of multiword expression decomposability based on latent semantic analysis. We use latent semantic analysis to determine the similarity between a...

A Plethora of Methods for Learning English Countability (2003)

Timothy Baldwin

This paper compares a range of methods for classifying words based on linguistic diagnostics, focusing on the task of learning countabilities for English nouns.

An empirical model of multiword expression decomposability (2003)

Timothy Baldwin, Colin Bannard, Takaaki Tanaka, Dominic Widdows

This paper presents a construction-inspecific model of multiword expression decomposability based on latent semantic analysis. We use latent semantic analysis to determine the similarity between a...

A plethora of methods for learning English countability (2003)

Timothy Baldwin

This paper compares a range of methods for classifying words based on linguistic diagnostics, focusing on the task of learning countabilities for English nouns. We propose two basic approaches to...

In search of a systematic treatment of determinerless PPs (2003)

Timothy Baldwin, John Beavers, Francis Bond, Dan Flickinger, Ivan A. Sag

This paper examines Determinerless PPs in English from a theoretical perspective. We classify attested P + N combinations across a number of analytic dimensions, arguing that the observed cases fall...

Multiword Expressions: Some Problems for Japanese NLP (2002)

Timothy Baldwin, Francis Bond

Multiword expressions (MWEs) are notoriously difficult to handle in any language, due to syntactic and semantic idiosyncrasies. In this paper, we focus on Japanese in illustrating the types of...

Bringing the Dictionary to the User: the FOKS system (2002)

Slaven Bilac, Timothy Baldwin, Hozumi Tanaka

The dictionary look-up of unknown words is particularly difficult in Japanese due to the complicated writing system. We propose a system which allows learners of Japanese to look up words according...

• Aktionsart and aspect • Modification (2002)

Timothy Baldwin

• Desire for some semantic account of the semantics of VPCs, at least in terms of compositionality/predication (e.g. cheer up vs. bring up vs. own up) • Interface between semantic...

Multiword expressions: A pain in the neck for NLP (2002)

Ivan A. Sag, Timothy Baldwin, Francis Bond, Ann Copestake

Multiword expressions are a key problem for the development of large-scale, linguistically sound natural language processing technology. This paper surveys the problem and some currently...

Extracting the unextractable: A case study on verb-particles (2002)

Timothy Baldwin, Aline Villavicencio

This paper proposes a series of techniques for extracting English verb–particle constructions from raw text corpora. We initially propose three basic methods, based on tagger output, chunker output...

Bringing the Dictionary to the User: the FOKS system (2002)

Slaven Bilac, Timothy Baldwin, Hozumi Tanaka

The dictionary look-up of unknown words is particularly difficult in Japanese due to the complicated writing system. We propose a system which allows learners of Japanese to look up words according to...

high-performance translation retrieval: Dumber is better (2001)

Timothy Baldwin

In this paper, we compare the relative effects of segment order, segmentation and segment contiguity on the retrieval performance of a translation memory system. We take a selection of both...

The Japanese Translation Task: Lexical and Structural Perspectives (2001)

Timothy Baldwin, Atsushi Okazaki, Takenobu Tokunaga, Hozumi Tanaka

This paper describes two distinct attempts at the Senseval2 Japanese translation task. The first implementation is based on lexical similarity and builds on the results of Baldwin (2001b; 2001a),...

The Effects of Word Order and Segmentation on Translation Retrieval Performance (2000)

Timothy Baldwin, Hozumi Tanaka

This research looks at the effects of word order and segmentation on translation retrieval performance for an experimental Japanese-English translation memory system. We implement a number of both...

Verb Alternations and Japanese - How, What and Where? (2000)

Timothy Baldwin, Hozumi Tanaka

We set out to empirically identify the range and frequency of basic verb alternation types in Japanese, through analysis of the Goi-Taikei Japanese pattern-based valency dictionary. This is achieved...

The Effects of Word Order and Segmentation on Translation Retrieval Performance (2000)

Timothy Baldwin, Hozumi Tanaka

This research looks at the effects of word order and segmentation on translation retrieval performance for an experimental Japanese-English...

A valency dictionary architecture for machine translation (1999)

Timothy Baldwin, Francis Bond, Ben Hutchinson

This research is aimed at developing a valency dictionary architecture to comprehensively list the full range of alternations associated with a given predicate sense, both efficiently and robustly....

Argument status in Japanese verb sense disambiguation (1999)

Timothy Baldwin, Hozumi Tanaka

This research aims to incorporate argument status-based modelling within an otherwise selectional constraint-based system of verb sense disambiguation, to capture effects such as underspecification,...

The parameter-based analysis of Japanese relative clause constructions (1999)

Timothy Baldwin, Takenobu Tokunaga, Hozumi Tanaka

We examine the validity of a procedural Japanese relative clause analysis system by way of running C4.5 over the same basic parameter space and comparing results. In reformatting data for use with...

A Valency Dictionary Architecture for Machine Translation (1999)

Timothy Baldwin, Francis Bond, Ben Hutchinson

This research is aimed at developing a valency dictionary architecture to comprehensively list the full range of alternations associated with a given predicate sense, both efficiently and robustly....

An alternation-based Japanese valency dictionary architecture (1999)

Timothy Baldwin, Francis Bond, Ben Hutchinson

This research is aimed at developing a valency dictionary architecture to comprehensively list the full range of alternations associated with a given predicate sense, both efficiently and robustly....

Relative clause coordination and subordination in Japanese (1998)

Timothy Baldwin

The research described in this paper is a direct extension of Baldwin et al. (1997), which proposed a declarative rule-based system to analyse gapping in simple Japanese relative clauses. Two...

Relative Clause Coordination and Subordination in Japanese (1998)

Timothy Baldwin

The research described in this paper is a direct extension of Baldwin et al. (1997), which proposed a declarative rule-based system to analyse gapping in simple Japanese relative clauses. Two...

Semantic verb classes in the analysis of head gapping in Japanese relative clauses (1997)

Timothy Baldwin, Takenobu Tokunaga, Hozumi Tanaka

This paper describes an attempt to identify case gapping instances of Japanese relative clauses, and disambiguate the case slot from which the gapping occurred. The method utilised relies principally...

Syntactic and semantic constraints on head gapping in Japanese relative clauses (1997)

Timothy Baldwin, Takenobu Tokunaga, Hozumi Tanaka

This paper represents the continuation of research into the identification of the relationship between the head of a relative clause and the clause body, for Japanese.

Semantic Verb Classes in the Analysis of Head Gapping in Japanese Relative Clauses (1997)

Timothy Baldwin, Takenobu Tokunaga, Hozumi Tanaka

This paper describes an attempt to identify case gapping instances of Japanese relative clauses, and disambiguate the case slot from which the gapping occurred. The method utilised relies principally...
