As human beings, our mental processes for recognising linguistic symbols generate perceptual neighbourhoods around such symbols where confusion errors occur. Such neighbourhoods also provide us with...
Automatic Event Reference Identification (2009)
Event reference identification is often treated as a sentence level classification task. However, several different event references can occur within a single sentence. We present a set of...
Applying Discourse Analysis and Data Mining Methods to Spoken OSCE Assessments (2009)
Meladel Mistica, Timothy Baldwin, Marisa Cordella, Simon Musgrave
This paper looks at the transcribed data of patient-doctor consultations in an examination setting. The doctors are internationally qualified and enrolled in a bridging course as preparation for...
A Machine Learning Approach to Multiword Expression Extraction (2009)
Timothy Baldwin, Stefan Evert, Brigitte Krenn, Pavel Pecina, Dimitra Anastasiou, Michael Carl, ...
Facilitating Biomedical Systematic Reviews Using Ranked Text Retrieval and Classification (2009)
David Martinez, Sarvnaz Karimi, Lawrence Cavedon, Timothy Baldwin
Searching and selecting articles to be included in systematic reviews is a real challenge for healthcare agencies responsible for publishing these reviews. The current practice of manually...
MELB-YB: Preposition Sense Disambiguation Using Rich Semantic Features (2009)
This paper describes a maxent-based preposition sense disambiguation system entry to the preposition sense disambiguation task of the SemEval 2007. This system uses a wide variety of semantic and...
An Unsupervised Approach to Interpreting Noun Compounds (2009)
This paper proposes an unsupervised approach to automatically interpret noun compounds using semantic similarity. Our proposed unsupervised method is based on obtaining a large amount of...
Karl Grieser, Timothy Baldwin, Fabian Bohnert, Liz Sonenberg
While the layout of a museum exhibition is largely prescribed by the curator, visitors to museums view connections between exhibits in ways unique to themselves. With the assistance of a...
Landmark Classification for Route Directions (2009)
In order for automated navigation systems to operate effectively, the route instructions they produce must be clear, concise and easily understood by users. In order to incorporate a landmark within...
Donostia, Basque Country (2009)
Eneko Agirre, Timothy Baldwin, David Martinez
To date, parsers have made limited use of semantic information, but there is evidence to suggest that semantic features can enhance parse disambiguation. This paper shows that semantic classes help...
MELB-KB: Nominal Classification as Noun Compound Interpretation (2009)
In this paper, we outline our approach to interpreting semantic relations in nominal pairs in SemEval-2007 task #4: Classification of Semantic Relations between Nominals. We build on two baseline...
Aspect-Based Personalized Text Summarization (2009)
Shlomo Berkovsky, Timothy Baldwin, Ingrid Zukerman
This work investigates user attitudes towards personalized summaries generated from a coarse-grained user model based on document aspects. We explore user preferences for summaries at...
Road-testing the English Resource Grammar over the British National Corpus (2009)
Timothy Baldwin, Emily M. Bender, Dan Flickinger, Ara Kim, Stephan Oepen
This paper addresses two questions: (1) when a large deep processing resource developed for relatively closed domains is run over open text, what coverage does it have, and (2) what are the most...
MELB-MKB: Lexical Substitution System based on Relatives in Context (2009)
David Martinez, Su Nam Kim, Timothy Baldwin
In this paper we describe the MELB-MKB system, as entered in the SemEval-2007 lexical substitution task. The core of our system was the “Relatives in Context” unsupervised approach, which ranked...
In Search of a Systematic Treatment of Determinerless PPs (2009)
Timothy Baldwin, John Beavers, Francis Bond, Dan Flickinger, Ivan A. Sag
This paper examines determinerless prepositional phrases in English and Dutch from a theoretical perspective. We classify attested P + N combinations across a number of analytic dimensions, arguing...
This research is aimed at developing a hierarchical alternation-based lexical architecture for machine translation. The proposed architecture makes extensive use of information sharing in describing...
Learning Count Classifier Preferences of Malay Nouns (2009)
Jeremy Nicholson, Timothy Baldwin
We develop a data set of Malay lexemes labelled with count classifiers, that are attested in raw or lemmatised corpora. A maximum entropy classifier based on simple, language-inspecific features...
We present a method for compositionally translating Japanese NN compounds into English, using a word-level transfer dictionary and target language monolingual corpus. The method interpolates over...
The Corpus and the Lexicon: Standardising Deep Lexical Acquisition Evaluation (2009)
Yi Zhang, Timothy Baldwin, Valia Kordoni
This paper is concerned with the standardisation of evaluation metrics for lexical acquisition over precision grammars, which are attuned to actual parser performance. Specifically, we investigate...
2006. Interpretation of compound nominalisations using corpus and web statistics (2009)
Jeremy Nicholson, Timothy Baldwin
We present two novel paraphrase tests for automatically predicting the inherent semantic relation of a given compound nominalisation as one of subject, direct object, or prepositional object. We...
Evaluating the FOKS Error Model (2009)
Slaven Bilac, Timothy Baldwin, Hozumi Tanaka
Learners of Japanese face great difficulty when trying to look up words containing kanji in a dictionary, due to the requirement of knowing the correct reading of the target word. We propose a system...
Dictionary-driven analysis of Japanese verbal alternations (2009)
Timothy Baldwin, Francis Bond, Kentaro Ogura
We present a method for extracting verbal (diathesis) alternations from a valency dictionary, based on comparison of selectional restrictions. The quality of match between selectional restrictions is...
Experiments on pattern-based relation learning (2009)
Relation extraction is a sub-task of Information Extraction (IE) that is concerned with extracting semantic relations---such as antonymy, synonymy or hypernymy---between word pairs from corpus data....
Benchmarking Noun Compound Interpretation (2008)
In this paper we provide benchmark results for two classes of methods used in interpreting noun compounds (NCs): semantic similarity-based methods and their hybrids. We evaluate the methods using...
Extending Sense Collocations in Interpreting Noun Compounds (2008)
Su Nam Kim, Meladel Mistica, Timothy Baldwin
This paper investigates the task of noun compound interpretation, building on the sense collocation approach proposed by Moldovan et al. (2004). Our primary task is to evaluate the impact of similar...
Ichiro Yamada, Timothy Baldwin
We present two methods for automatically discovering the telic and agentive roles of nouns from corpus data. These relations form part of the qualia structure assumed in generative lexicon theory,...
Detecting Compositionality of English Verb-Particle Constructions using Semantic Similarity (2008)
We present a novel method for detecting the compositionality of English verb-particle constructions (VPCs), based on the assumption that compositionality can be modelled with semantic similarity...
MELB-MKB: Lexical Substitution System based on Relatives in Context (2008)
David Martinez, Su Nam Kim, Timothy Baldwin
In this paper we describe the MELB-MKB system, as entered in the SemEval-2007 lexical substitution task. The core of our system was the “Relatives in Context” unsupervised approach, which ranked...
In Search of a Systematic Treatment of Determinerless PPs (2008)
Timothy Baldwin, John Beavers, Francis Bond, Dan Flickinger, Ivan A. Sag
This paper examines determinerless prepositional phrases in English and Dutch from a theoretical perspective. We classify attested P + N combinations across a number of analytic dimensions,...
Sumukh Ghodke, Timothy Baldwin
Pre-processing is an important part of machine learning, and has been shown to significantly improve the performance of classifiers. In this paper, we take a selection of pre-processing...
Efficient Grapheme-phoneme Alignment for Japanese (2008)
Current approaches to the grapheme-phoneme alignment problem for Japanese achieve good accuracy, but are extremely computationally expensive. In this paper we evaluate various modifications to...
Modelling the Orthographic Neighbourhood for Japanese Kanji (2008)
Japanese kanji recognition experiments are typically narrowly focused, and feature only native speakers as participants. It remains unclear how to apply their results to kanji similarity...
Linguistics Dimensions of Syntax and Semantics of Prepositions. Kluwer (2008)
Collins Cobuild, Dictionary Idioms, Harper Collins, Andre Schenk, Rob Schreuder, ...
In search of a systematic treatment of Determinerless PPs. Computational
We present a method for compositionally translating noun-noun (NN) compounds, using a word-level bilingual dictionary and syntactic templates for candidate generation, and corpus and dictionary...
Chief Investigators, Timothy Baldwin, Steven Bird, Baden Hughes
Language occupies a central role on the web: most content is expressed in a given language, and most access takes place via natural language input and interfaces. Today, investigation of human...
2000, ‘Verb alternations and Japanese — how, what and where (2008)
Timothy Baldwin, Hozumi Tanaka
We set out to empirically identify the range and frequency of basic verb alternation types in Japanese, through analysis of the Goi-Taikei Japanese pattern-based valency dictionary. This is achieved...
Linguistic Dimensions of Prepositions and their Use in Computational Linguistics (2008)
Judith Aissen, Differential Iconicity Natural, Jennifer E. Arnold, Thomas Wasow, Anthony Losongco, ...
Adrian Akmajian. On deriving cleft sentences from pseudo-cleft sentences.
Scalable Deep Linguistic Processing: Mind the Lexical Gap (2008)
Coverage has been a constant thorn in the side of deployed deep linguistic processing applications, largely because of the difficulty in constructing, maintaining and domain-tuning the complex...
1999a, ‘An alternation-based Japanese valency dictionary architecture (2008)
Timothy Baldwin, Francis Bond, Ben Hutchinson
This research is aimed at developing a valency dictionary architecture to comprehensively list the full range of alternations associated with a given predicate sense, both efficiently and robustly....
2002, ‘Alternation-based lexicon reconstruction (2008)
This research is aimed at developing a hierarchical alternation-based lexical architecture for machine translation. The proposed architecture makes extensive use of information sharing in describing...
Construction of an alternation-based English valency dictionary (2008)
Ben Hutchinson, Francis Bond, Timothy Baldwin
A Computational Account of Modality-based Case Frame Transformation (2008)
Timothy Baldwin, Takenobu Tokunaga, Hozumi Tanaka
Verb modality presents a major processing obstacle in any NLP application, and can be overcome either by
Balancing up Efficiency and Accuracy in Translation Retrieval (2008)
Timothy Baldwin, Hozumi Tanaka
This research looks at the effects of segment order and segmentation on translation retrieval performance for an experimental Japanese-English translation memory system. We implement a number of both...
General-purpose lexical acquisition: Procedures, questions and results (2008)
We discuss a range of in vitro and in vivo approaches to deep lexical acquisition, and evaluate a representative sample of each in learning lexical items for a precision grammar. Evaluation focuses...
Translation Memory Engines: A Look under the Hood and Road Test (2008)
In this paper, we compare the relative effects of segment order, segmentation and segment contiguity on the retrieval performance of a translation memory system. We take a selection of both...
Colin Bannard, Timothy Baldwin
Prepositions are often considered to have too little semantic content or be too polysemous to warrant a proper semantic description. We first illustrate the suitability of distributional similarity...
taught us about the grammar (2008)
Timothy Baldwin, John Beavers, Emily M. Bender, Dan Flickinger, Ara Kim, Stephan Oepen
broad-coverage precision grammar over the BNC
Word Sense Disambiguation Incorporating Lexical and Structural Semantic Information (2008)
Takaaki Tanaka, Francis Bond, Timothy Baldwin, Sanae Fujita, Chikara Hashimoto
We present results that show that incorporating lexical and structural semantic information is effective for word sense disambiguation. We evaluated the method by using precise information from a...
Timothy Baldwin, Slaven Bilac, Ryo Okumura, Takenobu Tokunaga, Hozumi Tanaka
This paper describes the process of data preparation and reading generation for an ongoing project aimed at improving the accessibility of unknown words for learners of foreign languages, focusing...
ACL/HCSNet Advanced Programme in NLP Learning Lexical Semantic Representations (2008)
• Lexical semantic representations are all well and good, BUT: how can we produce them automatically?
2002, ‘Alternation-based lexicon reconstruction (2008)
This research is aimed at developing a hierarchical alternation-based lexical...
What are Multiword Expressions (MWEs)? (2008)
• Definition: A multiword expression (MWE) is:
Automatic Thread Classification for Linux User Forum Information Access (2008)
Timothy Baldwin, David Martinez, Richard B. Penman
We experiment with text classification of threads from Linux web user forums, in the context of improving information access to the problems and solutions described in the threads. We...
This paper investigates whether multisemantic-role (MSR) based selectional preferences can be used to improve the performance of supervised verb sense disambiguation. Unlike conventional selectional...
POS Tagging with a More Informative Tagset (2008)
Andrew Mackinlay, Timothy Baldwin
We investigate the impact of introducing finer distinctions into the tagset on the accuracy of part-of-speech tagging. This is a tangential approach to most recent research in the field, which has...
Using collaborative models to adaptively predict visitor locations in museums (2008)
Fabian Bohnert, Ingrid Zukerman, Shlomo Berkovsky, Timothy Baldwin, Liz Sonenberg
The vast amounts of information presented in museums can be overwhelming to a visitor, whose receptivity and time are typically limited. Hence, s/he might have difficulties selecting...
Preliminary analysis of the range and frequency of Japanese verb alternations (2007)
Timothy Baldwin, Takenobu Tokunaga, Hozumi Tanaka
We set out to empirically identify the range and frequency of basic verb alternation types in Japanese, through analysis of the Goi-Taikei Japanese pattern-based valency dictionary. This is achieved...
Construction of an alternation-based English valency dictionary (2007)
Ben Hutchinson, Francis Bond, Timothy Baldwin
This paper describes the construction of an English valency dictionary which lists a wide range of alternations for each verb sense. Information is automatically extracted from the on-line version of...
A Computational Account of Modality-based Case Frame Transformation (2007)
Timothy Baldwin, Takenobu Tokunaga, Hozumi Tanaka
this paper: pass
Lexical Effects in Verb Sense Disambiguation (2007)
Timothy Baldwin, Takenobu Tokunaga, Hozumi Tanaka
First, we propose a unified framework for evaluating verb sense in a selectional restriction-based dictionary architecture, including both generalised and fixed verb senses. The proposed methodology...
Supervised by Prof. Hozumi Tanaka (2007)
1.1 Objectives and outline; 1.1.1 Statement of purpose of this research
2002, ‘Alternation-based lexicon reconstruction (2007)
This research is aimed at developing a hierarchical alternation-based lexical architecture for machine translation. The proposed architecture makes extensive use of information sharing in describing...
Dictionary-driven analysis of Japanese verbal alternations (2007)
Timothy Baldwin, Francis Bond, Kentaro Ogura
We present a method for extracting verbal (diathesis) alternations from a valency dictionary, based on comparison of selectional restrictions. The quality of match between selectional restrictions is...
taught us about the grammar (2007)
Timothy Baldwin, John Beavers, Emily M. Bender, Dan Flickinger, Ara Kim, Stephan Oepen
broad-coverage precision grammar over the BNC
Ann Copestake, Fabre Lambeau, Aline Villavicencio, Francis Bond, Timothy Baldwin, Ivan A. Sag, ...
Balancing up Efficiency and Accuracy in Translation Retrieval (2007)
Timothy Baldwin, Hozumi Tanaka
This research looks at the effects of segment order and segmentation on translation retrieval performance for an experimental Japanese-English translation memory system. We implement a number of both...
2000, ‘Verb alternations and Japanese — how, what and where (2007)
Timothy Baldwin, Hozumi Tanaka
We set out to empirically identify the range and frequency of basic verb alternation types in Japanese, through analysis of the Goi-Taikei Japanese pattern-based valency dictionary. This is achieved...
A Computational Account of Modality-based Case Frame Transformation (2007)
Timothy Baldwin, Takenobu Tokunaga, Hozumi Tanaka
this paper: pass = passive, cause = causative, pres = non-past, nom = nominative, acc = accusative, dat = dative, com = comitative, gen = genitive
1999a, ‘An alternation-based Japanese valency dictionary architecture (2007)
Timothy Baldwin, Francis Bond, Ben Hutchinson
This research is aimed at developing a valency dictionary architecture to comprehensively list the full range of alternations associated with a given predicate sense, both efficiently and robustly....
Automatic Acquisition of Qualia Structure from Corpus Data (2007)
Ichiro Yamada, Timothy Baldwin, Hideki Sumiyoshi, Masahiro Shibata, Nobuyuki Yagi
This paper presents a method to automatically acquire a given noun's telic and agentive roles from corpus data. These relations form part of the qualia structure assumed in the generative lexicon,...
Structured Classification for Multilingual Natural Language Processing (2007)
Philip Blunsom, Timothy Baldwin, Steven Bird, James Curran
This thesis investigates the application of structured sequence classification models to multilingual natural language processing (NLP). Many tasks tackled by NLP can be framed as classification,...
The impact of deep linguistic processing on parsing technology (2007)
Timothy Baldwin, Julia Hockenmaier
As the organizers of the ACL 2007 Deep Linguistic Processing workshop (Baldwin et al., 2007), we were asked to discuss our perspectives on the role of current trends in deep linguistic processing for...
Interpreting Noun Compound Using Bootstrapping and Sense Collocation (2007)
This paper describes a bootstrapping method for automatically tagging noun compounds with their corresponding semantic relations. Our work takes advantage of the collocation of senses of the noun...
Dynamic path prediction and recommendation in a museum environment (2007)
Karl Grieser, Timothy Baldwin, Steven Bird
This research is concerned with making recommendations to museum visitors based on their history within the physical environment, and textual information associated with each item in their history....
Disambiguating noun compounds (2007)
This paper is concerned with the interaction between word sense disambiguation and the interpretation of noun compounds (NCs) in English. We develop techniques for disambiguating word sense...
Scalable Deep Linguistic Processing: Mind the Lexical Gap (2007)
PACLIC 21 / Seoul National University, Seoul, Korea / November 1-3, 2007
Particle Verbs in English: Syntax, Information Structure and Intonation (review) (2006)
Language - Volume 82, Number 3, September 2006
Reconsidering Language Identification for Written Language Resources (2006)
Baden Hughes, Timothy Baldwin, Steven Bird, Jeremy Nicholson, Andrew MacKinlay
The task of identifying the language in which a given document (ranging from a sentence to thousands of pages) is written has been relatively well studied over several decades. Automated approaches...
Analysis and prediction of user behaviour in a museum environment (2006)
Karl Grieser, Timothy Baldwin, Steven Bird
Collecting Low-Density Language Materials on the Web (2006)
Timothy Baldwin, Steven Bird, Baden Hughes
Most web content exists in a few dozen languages. Hundreds of other languages - the `low-density languages' - are only represented in scarce quantities on the web. How can we locate, store and...
Detecting Entailment Using an Extended Implementation of the Basic Elements Overlap Metrics (2006)
Jeremy Nicholson, Nicola Stokes, Timothy Baldwin
In this paper we evaluate the utility of the recently proposed Basic Elements (BE) summarisation evaluation metric as a means of detecting entailment in a text/hypothesis sentence pair. Basic...
Boban Arsenijević, Timothy Baldwin, Beata Trawiński, Priscilla Rasmussen
Workshop on Prepositions Proceedings of the Workshop Workshop Chairs:
Multilingual deep lexical acquisition for HPSGs via supertagging (2006)
We propose a conditional random field-based method for supertagging, and apply it to the task of learning new lexical items for HPSG-based precision grammars of English and Japanese. Using a...
Semantic role labeling of prepositional phrases (2006)
We propose a method for labelling prepositional phrases according to two different semantic role classifications, as contained in the Penn treebank and the CoNLL 2004 Semantic Role...
Die morphologie (f): Targeted lexical acquisition for languages other than English (2006)
Jeremy Nicholson, Timothy Baldwin, Phil Blunsom
We examine standard deep lexical acquisition features in automatically predicting the gender of noun types and tokens by bootstrapping from a small annotated corpus. Using a knowledge-poor approach...
Semantic role labeling of prepositional phrases (2006)
In this paper, we propose a method for labelling prepositional phrases according to two different semantic role classifications, as contained in the Penn treebank and the CoNLL 2004...
Automatic identification of English verb particle constructions using linguistic features (2006)
This paper presents a method for identifying token instances of verb particle constructions (VPCs) automatically, based on the output of the RASP parser. The proposed method pools together instances...
Interpreting semantic relations in noun compounds via verb semantics. COLING-ACL (2006)
We propose a novel method for automatically interpreting compound nouns based on a predefined set of semantic relations. First we map verb tokens in sentential contexts to a fixed set of seed verbs...
Bootstrapping Deep Lexical Resources: Resources for Courses (2005)
We propose a range of deep lexical acquisition methods which make use of morphological, syntactic and ontological language resources to model word similarity and bootstrap from a seed lexicon. The...
Statistical interpretation of compound nominalisations (2005)
Jeremy Nicholson, Timothy Baldwin
This paper presents a method for detecting compound nominalisations from open data, and providing a semantic interpretation. It uses a statistical model based on confidence intervals over frequencies...
Automatic interpretation of noun compounds using WordNet similarity (2005)
The paper introduces a method for interpreting novel noun compounds with semantic relations. The method is built around word similarity with pretagged noun compounds, based...
Statistical Interpretation of Compound Nouns (2005)
Jeremy Nicholson, Timothy Baldwin
We present a method for detecting compound nominalisations in open data, and deriving an interpretation for them. Discovering the semantic relationship between the modifier and head noun in a...
DISTRIBUTIONAL SIMILARITY AND PREPOSITION SEMANTICS (2005)
Prepositions are often considered to have too little semantic content or be too polysemous to warrant a proper semantic description. We illustrate the suitability of distributional similarity methods...
Automatic Discovery of Telic and Agentive Roles from Corpus Data (2004)
Ichiro Yamada, Timothy Baldwin
We present two methods for automatically discovering the telic and agentive roles of nouns from corpus data. These relations form part of the qualia structure assumed in generative lexicon theory,...
Road-testing the English Resource Grammar over the British National Corpus (2004)
Timothy Baldwin, Emily M. Bender, Dan Flickinger, Ara Kim, Stephan Oepen
This paper addresses two questions: (1) when a large deep processing resource developed for relatively closed domains is run over open text, what coverage does it have, and (2) what are the most...
VRML 97: The Virtual Reality Modeling Language, iso/iec 14772:1997 (2004)
We present a method for compositionally translating noun-noun (NN) compounds, using a word-level bilingual dictionary and syntactic templates for candidate generation, and corpus and dictionary...
Arboretum: Using a precision grammar for grammar checking in CALL (2004)
Emily M. Bender, Dan Flickinger, Stephan Oepen, Annemarie Walsh, Timothy Baldwin
We present a tutorial system for language learners, using a computational grammar augmented with mal-rules for analysis, error diagnosis, and semantics-centered generation of corrected forms....
Crosslingual countability classification with EuroWordNet (2004)
We examine the hypothesis that noun countability is consistent for a given word semantics by way of a series of experiments involving EuroWordNet and the English and Dutch languages. The basic method...
A multilingual database of idioms (2004)
Aline Villavicencio, Timothy Baldwin, Benjamin Waldron
This paper presents a possible architecture for a multilingual database of idioms. We discuss the challenges that idioms present to the creation of such a database and propose a possible encoding...
A Multilingual Database of Idioms (2004)
Aline Villavicencio, Timothy Baldwin, Benjamin Waldron
This paper presents a possible architecture for a multilingual database of idioms. We discuss the challenges that idioms present to the creation of such a database and propose a possible encoding...
Arboretum: Using a precision grammar for grammar checking in CALL (2004)
Emily Bender, Dan Flickinger, Stephan Oepen, Annemarie Walsh, Timothy Baldwin
We present a tutorial system for language learners, using a computational grammar augmented with mal-rules for analysis, error diagnosis, and semantics-centered generation of corrected forms.
The ins and outs of Dutch noun countability classification (2003)
This paper presents a range of methods for classifying Dutch noun countability based on either Dutch or English data. The classification is founded on translational equivalences and the corpus...
Increasing the error coverage of the FOKS Japanese dictionary interface (2003)
Slaven Bilac, Timothy Baldwin, Hozumi Tanaka
With the advent of electronic dictionaries, significant progress has been made in improving the accessibility of dictionary entries allowing for speedy and wide-ranging dictionary lookups....
Learning the countability of English nouns from corpus data (2003)
This paper describes a method for learning the countability preferences of English nouns from raw text corpora. The method maps the corpus-attested lexico-syntactic properties of each noun onto a...
Slaven Bilac, Timothy Baldwin, Hozumi Tanaka
ABSTRACT. The dictionary lookup of unknown words is particularly difficult in Japanese due to the requirement of knowing the correct word reading. We propose a system which supplements partial...
Learning the Countability of English Nouns from Corpus Data (2003)
Timothy Baldwin
This paper describes a method for learning the countability preferences of English nouns from raw text corpora. The method maps the corpus-attested lexico-syntactic properties of each noun onto a...
Learning the Countability of English Nouns from Corpus Data (2003)
Timothy Baldwin, Francis Bond
This paper describes a method for learning the countability preferences of English nouns from raw text corpora. The method maps the corpus-attested lexico-syntactic properties of each noun onto a...
An Empirical Model of Multiword Expression Decomposability (2003)
Timothy Baldwin, Colin Bannard, Takaaki Tanaka, Dominic Widdows
This paper presents a construction-inspecific model of multiword expression decomposability based on latent semantic analysis. We use latent semantic analysis to determine the similarity between a...
A Plethora of Methods for Learning English Countability (2003)
Timothy Baldwin
This paper compares a range of methods for classifying words based on linguistic diagnostics, focusing on the task of learning countabilities for English nouns.
An empirical model of multiword expression decomposability (2003)
Timothy Baldwin, Colin Bannard, Takaaki Tanaka, Dominic Widdows
This paper presents a construction-inspecific model of multiword expression decomposability based on latent semantic analysis. We use latent semantic analysis to determine the similarity between a...
A plethora of methods for learning English countability (2003)
This paper compares a range of methods for classifying words based on linguistic diagnostics, focusing on the task of learning countabilities for English nouns. We propose two basic approaches to...
In search of a systematic treatment of determinerless PPs (2003)
Timothy Baldwin, John Beavers, Francis Bond, Dan Flickinger, Ivan A. Sag
This paper examines Determinerless PPs in English from a theoretical perspective. We classify attested P + N combinations across a number of analytic dimensions, arguing that the observed cases fall...
Multiword Expressions: Some Problems for Japanese NLP (2002)
Multiword expressions (MWEs) are notoriously difficult to handle in any language, due to syntactic and semantic idiosyncrasies. In this paper, we focus on Japanese in illustrating the types of...
Bringing the Dictionary to the User: the FOKS system (2002)
Slaven Bilac, Timothy Baldwin, Hozumi Tanaka
The dictionary look-up of unknown words is particularly difficult in Japanese due to the complicated writing system. We propose a system which allows learners of Japanese to look up words according...
• Aktionsart and aspect • Modification (2002)
• Desire for some semantic account of the semantics of VPCs, at least in terms of compositionality/predication (e.g. cheer up vs. bring up vs. own up) • Interface between semantic...
Multiword expressions: A pain in the neck for NLP (2002)
Ivan A. Sag, Timothy Baldwin, Francis Bond, Ann Copestake
Abstract. Multiword expressions are a key problem for the development of large-scale, linguistically sound natural language processing technology. This paper surveys the problem and some currently...
Extracting the unextractable: A case study on verb-particles (2002)
Timothy Baldwin, Aline Villavicencio
This paper proposes a series of techniques for extracting English verb–particle constructions from raw text corpora. We initially propose three basic methods, based on tagger output, chunker output...
Bringing the Dictionary to the User: the FOKS system (2002)
Slaven Bilac, Timothy Baldwin, Hozumi Tanaka
The dictionary look-up of unknown words is particularly difficult in Japanese due to the complicated writing system. We propose a system which allows learners of Japanese to look up words according to...
high-performance translation retrieval: Dumber is better (2001)
In this paper, we compare the relative effects of segment order, segmentation and segment contiguity on the retrieval performance of a translation memory system. We take a selection of both...
Making Lexical Sense of Japanese-English Machine Translation: A Disambiguation Extravaganza (2001)
The Japanese Translation Task: Lexical and Structural Perspectives (2001)
Timothy Baldwin, Atsushi Okazaki, Takenobu Tokunaga, Hozumi Tanaka
This paper describes two distinct attempts at the Senseval2 Japanese translation task. The first implementation is based on lexical similarity and builds on the results of Baldwin (2001b; 2001a),...
The Effects of Word Order and Segmentation on Translation Retrieval Performance (2000)
Timothy Baldwin, Hozumi Tanaka
This research looks at the effects of word order and segmentation on translation retrieval performance for an experimental Japanese-English translation memory system. We implement a number of both...
Verb Alternations and Japanese - How, What and Where? (2000)
Timothy Baldwin, Hozumi Tanaka
We set out to empirically identify the range and frequency of basic verb alternation types in Japanese, through analysis of the Goi-Taikei Japanese pattern-based valency dictionary. This is achieved...
The Effects of Word Order and Segmentation on Translation Retrieval Performance (2000)
Timothy Baldwin, Hozumi Tanaka
This research looks at the effects of word order and segmentation on translation retrieval performance for an experimental Japanese-English...
A valency dictionary architecture for machine translation (1999)
Timothy Baldwin, Francis Bond, Ben Hutchinson
This research is aimed at developing a valency dictionary architecture to comprehensively list the full range of alternations associated with a given predicate sense, both efficiently and robustly....
Argument status in Japanese verb sense disambiguation (1999)
Timothy Baldwin, Hozumi Tanaka
This research aims to incorporate argument status-based modelling within an otherwise selectional constraint-based system of verb sense disambiguation, to capture effects such as underspecification,...
The parameter-based analysis of Japanese relative clause constructions (1999)
Timothy Baldwin, Takenobu Tokunaga, Hozumi Tanaka
We examine the validity of a procedural Japanese relative clause analysis system by way of running C4.5 over the same basic parameter space and comparing results. In reformatting data for use with...
A Valency Dictionary Architecture for Machine Translation (1999)
Timothy Baldwin, Francis Bond, Ben Hutchinson
This research is aimed at developing a valency dictionary architecture to comprehensively list the full range of alternations associated with a given predicate sense, both efficiently and robustly....
An alternation-based Japanese valency dictionary architecture (1999)
Timothy Baldwin, Francis Bond, Ben Hutchinson
This research is aimed at developing a valency dictionary architecture to comprehensively list the full range of alternations associated with a given predicate sense, both efficiently and robustly....
Relative clause coordination and subordination in Japanese (1998)
The research described in this paper is a direct extension of Baldwin et al. (1997), which proposed a declarative rule-based system to analyse gapping in simple Japanese relative clauses. Two...
Relative Clause Coordination and Subordination in Japanese (1998)
The research described in this paper is a direct extension of Baldwin et al. (1997), which proposed a declarative rule-based system to analyse gapping in simple Japanese relative clauses. Two...
Semantic verb classes in the analysis of head gapping in Japanese relative clauses (1997)
Timothy Baldwin, Takenobu Tokunaga, Hozumi Tanaka
This paper describes an attempt to identify case gapping instances of Japanese relative clauses, and disambiguate the case slot from which the gapping occurred. The method utilised relies principally...
Syntactic and semantic constraints on head gapping in Japanese relative clauses (1997)
Timothy Baldwin, Takenobu Tokunaga, Hozumi Tanaka
This paper represents the continuation of research into the identification of the relationship between the head of a relative clause and the clause body, for Japanese.
Semantic Verb Classes in the Analysis of Head Gapping in Japanese Relative Clauses (1997)
Timothy Baldwin, Takenobu Tokunaga, Hozumi Tanaka
This paper describes an attempt to identify case gapping instances of Japanese relative clauses, and disambiguate the case slot from which the gapping occurred. The method utilised relies principally...