Name Extraction and Translation for Distillation (2009)
Heng Ji, Ralph Grishman, Dayne Freitag, Matthias Blume, Zhiqiang (John) Wang, Fair Isaac Corp, ...
Name translation is important well beyond the relative frequency of names in a text: a correctly translated passage, but with the wrong name, may lose most of its value. The Nightingale team has...
A Study of Using an Out-Of-Box Commercial MT System for Query Translation in CLIR (2009)
Dan Wu, Daqing He, Heng Ji, Ralph Grishman
Recent availability of commercial online machine translation (MT) systems makes it possible for layman Web users to utilize the MT capability for cross-language information retrieval (CLIR). To study...
Dan Wu, Daqing He, Heng Ji, Ralph Grishman
Named entities (NEs) are the expressions in human languages that explicitly link notations in languages to the entities in the real world. They play an important role in cross-language information...
Extracting Information and Answering Questions (2009)
To move beyond current keyword-based approaches to document retrieval, we need to provide the user with a range of technologies for obtaining information and answering questions. One of these is...
Is this NE tagger getting old? (2009)
This paper focuses on the influence of changing the text time frame on the performance of a named entity tagger. We followed a twofold approach to investigate this subject: on the one hand, we...
An Equipment Model and Its Role in the Interpretation of Nominal Compounds (2009)
Tomasz Ksiezyk, Ralph Grishman
For natural language understanding systems designed for domains including relatively complex equipment, it is not sufficient to use general knowledge about this equipment. We show problems which can...
the ARPA Human Language Technology programs. In the written (2008)
Common evaluations have grown to be a major component of all
Transforming Examples into Patterns for Information Extraction (2008)
Roman Yangarber, Ralph Grishman
Information Extraction (IE) systems today are commonly based on pattern matching. The patterns are regular expressions stored in a customizable knowledge base. Adapting an...
Ralph Grishman, Michiko Kosaka
Better methods are needed for acquiring the knowledge which must go into machine translation systems. The call for papers for this conference contrast two approaches: the rationalist (based on...
Generalizing Automatically Generated Selectional Patterns (2008)
Frequency information on co-occurrence patterns can be automatically collected from a syntactically analyzed corpus; this information can then serve as the basis for selectional constraints when...
Punctuating speech for information extraction (2008)
Benoit Favre, Ralph Grishman, Dustin Hillard, Heng Ji, Dilek Hakkani-Tür, Mari Ostendorf
This paper studies the effect of automatic sentence boundary detection and comma prediction on entity and relation extraction in speech. We show that punctuating the machine generated transcript...
Information Extraction and Semantic Constraints (2007)
We consider the problem of extracting specified types of information from natural language text. To properly analyze the text, we wish to apply semantic (selectional) constraints whenever possible;...
David R. Dowty, Lauri Karttunen, Arnold M. Zwicky, Amichai Kronfeld, Martha Walton Evens, An Introduction, ...
This series of monographs, texts, and edited volumes is published in
Ralph Grishman, Lynette Hirschman, Carol Friedman
In order to analyze their input properly, natural language interfaces require access to domain-specific semantic information. However, design considerations for practical systems-- in particular, the...
ACQUISITION OF SELECTIONAL PATTERNS (2007)
For most natural language analysis systems, one of the major hurdles in porting the system to a new domain is the development of an appropriate set of semantic patterns. Such patterns are typically...
LANGUAGE AND SPATIAL COGNITION (2007)
Sergei Nirenburg, Graeme Hirst, An Introduction, Ralph Grishman
This series of monographs, texts, and edited volumes is published in cooperation with the Association for Computational Linguistics. Coming in 1987:
Information extraction is the process of analyzing natural language and collecting information about specified types of entities, relationships, or events. This paper provides an overview of a range...
Automatic Pattern Acquisition for Japanese Information Extraction (2007)
Sudo, Kiyoshi, Sekine, Satoshi, Grishman, Ralph
One of the central issues for information extraction is the cost of customization from one scenario to another. Research on the automated acquisition of patterns is important for portability and...
The COMLEX Syntax Project (2007)
Grishman, Ralph, Macleod, Catherine, Wolff, Susanne
The goal of the COMLEX Syntax Project is to create a moderately-broad-coverage shareable dictionary containing the syntactic features of English words, intended for automatic language analysis. We are...
The NYU System for MUC-6 or Where's the Syntax? (2007)
Over the past five MUCs, New York University has clung faithfully to the idea that information extraction should begin with a phase of full syntactic analysis, followed by a semantic analysis of the...
Covering Treebanks with GLARF (2007)
Meyers, Adam, Grishman, Ralph, Kosaka, Michiko, Zhao, Shubin
This paper introduces GLARF, a framework for predicate argument structure. We report on converting the Penn Treebank II into GLARF by automatic methods that achieved about 90% precision/recall on...
New York University: Description of the Proteus System as Used for MUC-4 (2007)
Grishman, Ralph, Macleod, Catherine, Sterling, John
The PROTEUS system which we have used for MUC-4 is largely unchanged from that used for MUC-3. It has three main components: a syntactic analyzer, a semantic analyzer, and a template generator. The...
Discriminative Slot Detection Using Kernel Methods (2007)
Zhao, Shubin, Meyers, Adam, Grishman, Ralph
Most traditional information extraction approaches are generative models that assume events exist in text in certain patterns and these patterns can be regenerated in various ways. These assumptions...
Question answering using integrated information retrieval and information extraction (2007)
Barry Schiffman, Kathleen R. McKeown, Ralph Grishman
This paper addresses the task of providing extended responses to questions regarding specialized topics. This task is an amalgam of information retrieval, topical summarization, and Information...
Re-Ranking Algorithms for Name Tagging (2006)
Ji, Heng, Rudin, Cynthia, Grishman, Ralph
Integrating information from different stages of an NLP processing pipeline can yield significant error reduction. We demonstrate how re-ranking can improve name tagging in a Chinese information...
Analysis and Repair of Name Tagger Errors (2006)
Name tagging is a critical early stage in many natural language processing pipelines. In this paper we analyze the types of errors produced by a tagger, distinguishing name classification and various...
Data Selection in Semisupervised Learning for Name Tagging (2006)
We present two semi-supervised learning techniques to improve a state-of-the-art multi-lingual name tagger. For English and Chinese, the overall system obtains 1.7%-2.1% improvement in F-measure,...
Improving Name Tagging by Reference Resolution and Relation Detection (2005)
Information extraction systems incorporate multiple stages of linguistic analysis. Although errors are typically compounded from stage to stage, it is possible to reduce the errors in one stage by...
Using semantic relations to refine coreference decisions (2005)
Heng Ji, David Westbrook, Ralph Grishman
We present a novel mechanism for improving reference resolution by using the output of a relation tagger to rescore coreference hypotheses. Experiments show that this new framework can improve...
Extracting relations with integrated information using kernel methods (2005)
Entity relation detection is a form of information extraction that finds predefined relations between pairs of entities in text. This paper describes a relation detection approach that combines clues...
The NomBank Project: An Interim Report (2004)
Adam Meyers, Ruth Reeves, Catherine Macleod, Rachel Szekely, Veronika Zielinska, Brian Young, ...
This paper describes NomBank, a project that will provide argument structure for instances of common nouns in the Penn Treebank II corpus. NomBank is part of a larger effort to add additional layers...
Developing a Syntactic Annotation Scheme and Tools for a Spanish treebank (2003)
Antonio Moreno, Susana López, O Sánchez, Ralph Grishman
This chapter will describe our experience developing specifications and tools for building a Syntactically Annotated Corpus (SAC) for Spanish newspaper texts. The initial corpus consists of...
An improved extraction pattern representation model for automatic IE pattern acquisition (2003)
Kiyoshi Sudo, Satoshi Sekine, Ralph Grishman
Several approaches have been described for the automatic unsupervised acquisition of patterns for information extraction. Each approach is based on a particular model for the patterns to be acquired,...
Grishman, Ralph, Hirschman, Lynette
We are engaged in the development of systems capable of analyzing short narrative messages dealing with a limited domain and extracting the information contained in the narrative. These systems are...
Model-Based Analysis of Messages about Equipment (2002)
Grishman, Ralph, Ksiezyk, Tomasz, Nhan, Ngo T.
Considerable progress has been made in developing systems which understand short passages of technical text. Several prototypes have been developed, for such domains as patient medical records,...
PROTEUS and PUNDIT: Research in Text Understanding (2002)
Grishman, Ralph, Hirschman, Lynette
We are engaged in the development of systems capable of analyzing short narrative messages dealing with a limited domain and extracting the information contained in the narrative. These systems are...
Covering Treebanks with GLARF (2001)
Adam Meyers, Ralph Grishman, Michiko Kosaka, Shubin Zhao
This paper introduces GLARF, a framework for predicate argument structure.
Automatic Pattern Acquisition for Japanese Information Extraction (2001)
Kiyoshi Sudo, Satoshi Sekine, Ralph Grishman
One of the central issues for information extraction is the cost of customization from one scenario to another. Research on the automated acquisition of patterns is important for portability and...
The American National Corpus: (2001)
Standardized Resource For, Catherine Macleod, Nancy Ide, Ralph Grishman
At the first conference on Language Resources and Evaluation, Granada 1998, Charles Fillmore, Nancy Ide, Daniel Jurafsky, and Catherine Macleod proposed creating an American National Corpus (ANC)...
Web-Based Language Documentation and Description (2000)
Martha Palmer, Ralph Grishman, Nicoletta Calzolari, Antonio Zampolli
This paper briefly describes several different types of semantic information which are used by various natural language processing applications. It focuses on syntactic frames and semantic class...
Automatic acquisition of domain knowledge for information extraction (2000)
Roman Yangarber, Ralph Grishman, Pasi Tapanainen
In developing an Information Extraction (IE) system for a new class of events or relations, one of the major tasks is identifying the many ways in which these events or relations may be expressed in...
Unsupervised discovery of scenario-level patterns for information extraction (2000)
Roman Yangarber, Ralph Grishman
Information Extraction (IE) systems are commonly based on pattern matching. Adapting an IE system to a new scenario entails the construction of a new pattern base---a time-consuming and expensive...
Chart-Based Transfer Rule Application in Machine Translation (2000)
Adam Meyers, Michiko Kosaka, Ralph Grishman
Transfer-based Machine Translation systems require a procedure for choosing the set of transfer rules for generating a target language translation from a given source language sentence. In an MT...
Event99: A proposed event indexing task for broadcast news (1999)
Lynette Hirschman, Erica Brown, Nancy Chinchor, Aaron Douthat, Lisa Ferro, Ralph Grishman, ...
The goal of the proposed Event99 task is to evaluate event-level indexing into news stories, including news wire, radio, and television sources. The Event99 task is distinguished from earlier,...
Finding Causal and Temporal Relations in Equipment Failure Messages (1998)
Joskowicz, Leo, Grishman, Ralph, Ksiezyk, Tomasz
The work presented here is part of the PROTEUS (Prototype Text Understanding System) project, whose objective is to understand short narrative messages about equipment installed in Navy ships....
Responding to Semantically Ill-Formed Input (1998)
One cause of failure in language interfaces is semantic overshoot; this is reflected in input sentences which do not correspond to any semantic pattern in the system. We describe a system which provides...
Equipment Simulation for Language Understanding (1998)
Ksiezyk, Tomasz, Grishman, Ralph
This work is part of PROTEUS (the PROtotype TExt Understanding System), currently under development. The objective is to understand short natural language texts. Our texts at present are CASualty...
An Equipment Model and Its Role in the Interpretation of Noun Phrases (1998)
Ksiezyk, Tomasz, Grishman, Ralph, Sterling, John
This work is part of PROTEUS (PROtotype TExt Understanding System). The objective is to understand short natural language texts. Our texts at present are CASualty REPorts (CAS-REPs) which describe...
Evaluation of a Parallel Chart Parser (1998)
Grishman, Ralph, Chitrao, Mahesh
A parallel implementation of a chart parser is described for a shared memory multiprocessor. The speed ups obtained with this parser have been measured for a number of small natural language...
Research in Natural Language Processing January 15, 1985 - September 15, 1987 (1998)
This report describes research done by the PROTEUS Project at New York University during the period January 15, 1985 to September 15, 1987. All of the activities described below were supported in...
Domain Modeling for Language Analysis (1998)
In section 2 of this paper we briefly characterize our notion of understanding a text. In section 3 we give an overview of the system we have constructed for analyzing equipment failure messages, and...
A Comparative Study of Japanese and English Sublanguage Patterns (1998)
Teller, Virginia, Kosaka, Michiko, Grishman, Ralph
As part of a project to develop a Japanese-English machine translation system for technical texts within a limited domain, we conducted a study to investigate the roles that sublanguage techniques...
Equipment Simulation for Language Understanding, Revision (1998)
Ksiezyk, Tomasz, Grishman, Ralph
This paper considers the task of analyzing reports regarding the failure, diagnosis and repair of equipment. The authors show that a general knowledge of equipment is not sufficient for a full...
Research in Natural Language Processing (1998)
Primary interest is in the development of systems which can automatically process natural language text concerning limited domains. At the outset of our research, a number of research groups had...
Deriving Transfer Rules from Dominance-Preserving Alignments (1998)
Adam Meyers, Roman Yangarber, Ralph Grishman, Catherine Macleod, Antonio Moreno-s
NYU: Description of the MENE Named Entity System as Used in MUC-7 (1998)
Andrew Borthwick, John Sterling, Eugene Agichtein, Ralph Grishman
Nomlex: A lexicon of nominalizations (1998)
Catherine Macleod, Ralph Grishman, Adam Meyers, Leslie Barrett, Ruth Reeves
NOMLEX (NOMinalization Lexicon) is a dictionary of English nominalizations currently under development at New York University. NOMLEX seeks not only to describe the allowed complements for a...
Using NOMLEX to produce nominalization patterns for information extraction (1998)
Adam Meyers, Catherine Macleod, Roman Yangarber, Ralph Grishman, Leslie Barrett, Ruth Reeves
This paper describes how NOMLEX, a dictionary of nominalizations, can be used in Information Extraction (IE). This paper details a procedure which maps syntactic and semantic information designed for...
A decision tree method for finding and classifying names in Japanese texts (1998)
Satoshi Sekine, Ralph Grishman
This paper describes a system which uses a decision tree to find and classify names in Japanese texts. The decision tree uses part-of-speech, character type, and...
Exploiting Diverse Knowledge Sources via Maximum Entropy in Named Entity Recognition (1998)
Andrew Borthwick, John Sterling, Eugene Agichtein, Ralph Grishman
This paper describes a novel statistical named-entity (i.e. "proper name") recognition system built around a maximum entropy framework. By working within the framework of maximum...
Information extraction and speech recognition (1998)
Information extraction is the process of analyzing natural language and collecting information about specified types of entities, relationships, or events. This paper provides an overview of a range...
NYU language modeling experiment for 1996 CSR evaluation (1997)
Satoshi Sekine, Andrew Borthwick, Ralph Grishman
This paper describes NYU's effort toward improving recognition accuracy
Information extraction: techniques and challenges (1997)
This volume takes a broad view of information extraction as any method for filtering information from large volumes of text. This includes the retrieval of documents from collections and the tagging of...
and the TIPSTER Phase II Contractors' Architecture Working Group (CAWG): (1996)
Ralph Grishman, Bill Caid, Jamie Callan, Jim Conley, Harold Corbin, Jim Cowie, ...
The TIPSTER Program aims to push the technology for access to information in large (multi-GB) text collections, in particular for the analysts in Government agencies. Technology is being developed...
The influence of tagging on the classification of lexical complements (1996)
Catherine Macleod, Adam Meyers, Ralph Grishman
A large corpus (about 100 MB of text) was selected and examples of 750 frequently occurring verbs were tagged with their complement class as defined by a large computational syntactic...
NYU Language Modeling Experiments for the 1996 CSR Evaluation (1995)
Satoshi Sekine, Andrew Borthwick, Ralph Grishman
This paper describes NYU's effort toward improving recognition accuracy for the 1996 ARPA Large Vocabulary Continuous Speech Recognition evaluation. We are trying to develop different kinds of...
A Corpus-based Probabilistic Grammar with Only Two Non-terminals (1995)
Satoshi Sekine, Ralph Grishman
The availability of large, syntactically-bracketed corpora such as the Penn Tree Bank affords us the opportunity to automatically build or train broad-coverage grammars, and in particular to train...
Comlex Syntax: Building a Computational Lexicon (1994)
Grishman, Ralph, Macleod, Catherine, Meyers, Adam
We describe the design of Comlex Syntax, a computational lexicon providing detailed syntactic information for approximately 38,000 English headwords. We consider the types of errors which arise in...
Creating a Common Syntactic Dictionary of English (1994)
Catherine Macleod, Ralph Grishman, Adam Meyers
COMLEX Syntax is a 38,000 head word English dictionary containing detailed information about syntactic features and complements, and is intended for use in natural language processing. It has been...
PROTEUS: un sistema multilingüe de extracción de información (1993)
Moreno Sandoval, Antonio, Olmeda Moreno, Cristina, Grishman, Ralph, Macleod, Catherine, Sterling, John
The PROTEUS system (PROtotype TExt Understanding System) aims to analyze and interpret real-world texts and to present a summary of the information they contain in a structured form...
Evaluating parsing strategies using standardized parse files (1992)
Ralph Grishman, Catherine Macleod
The availability of large files of manually-reviewed parse trees from the University of Pennsylvania "tree bank", along with a program for comparing system-generated parses against...
A comparative study of Japanese and English sublanguage patterns (1988)
Virginia Teller, Michiko Kosaka, Ralph Grishman
As part of a project to develop a Japanese-English machine translation system for technical texts within a limited domain, we conducted a study to investigate the roles that sublanguage techniques...
An equipment model and its role in the interpretation of noun phrases (1987)
Tomasz Ksiezyk, Ralph Grishman, John Sterling
For natural language understanding systems designed for domains including relatively complex equipment, it is not sufficient to use general knowledge about this equipment. We show problems which can...
Model-based analysis of messages about equipment (1986)
Ralph Grishman, Tomasz Ksiezyk, Ngo Thanh Nhan
The aim of PROTEUS-- a system for the analysis of short technical texts-- is to increase the reliability of the analysis process through the integration of syntactic and semantic constraints, domain...
Discovery procedures for sublanguage selectional patterns: Initial experiments (1986)
Ralph Grishman, Lynette Hirschman, Ngo Thanh Nhan
Selectional constraints specify, for a particular domain, the combinations of semantic classes acceptable in subject-verb-object relationships and other syntactic structures. These constraints are...
AUTOMATED DETERMINATION OF SUBLANGUAGE SYNTACTIC USAGE
Ralph Grishman, Ngo Thanh Nhan, Elaine Marsh, Lynette Hirschman