Categories and Subject Descriptors (2009)
Susan Dumais, Michele Banko, Eric Brill, Jimmy Lin, Andrew Ng
This paper describes a question answering system that is designed to capitalize on the tremendous amount of data that is now available online. Most question answering systems use a wide variety of...
General Terms Algorithms (2008)
Raman Ch, Harr Chen, Simon Corston-Oliver, Eric Brill
Keywords: information retrieval, customized search, subwebs. We describe a method to define and use subwebs, user-defined neighborhoods of the Internet. Subwebs help improve search performance by inducing a...
In recent years, there has been a resurgence in research on empirical methods in natural language processing. These methods employ learning techniques to automatically extract linguistic knowledge...
In this paper we describe and evaluate a Question Answering (QA) system that goes beyond answering factoid questions. Our approach to QA assumes no restrictions on the type of questions that are...
Categories and Subject Descriptors (2008)
Susan Dumais, Michele Banko, Eric Brill, Jimmy Lin, Andrew Ng
This paper describes a question answering system that is designed to capitalize on the tremendous amount of data that is now available online. Most question answering systems use a wide variety of...
Raman Ch, Harr Chen, Simon Corston-Oliver, Eric Brill
We describe a method to define and use subwebs, user-defined neighborhoods of the Internet. Subwebs help improve search performance by inducing a topic-specific page relevance bias over a collection...
Evaluating user preferences of web search results is crucial for search engine development, deployment, and maintenance. We present a real-world study of modeling the behavior of web search users to...
Steven Abney, Michael Collins, Amit Singhal, Sasha Blair-Goldensohn, Kathleen R. McKeown, ...
gov/projects/duc/roadmapping.html.
Article Submitted to Computer Speech and Language (2007)
Lidia Mangu, Eric Brill, Andreas Stolcke
Finding consensus in speech recognition: word error minimization and other applications of confusion networks
Chapter 5: The Superparser (2007)
Eric Brill, Barbora Hladká
ing this training set S times with replacement. Some training instances will occur multiple times in a bag, while others may not appear at all. Next, each bag is used to train a classifier. We now...
Categories and Subject Descriptors (2007)
Susan Dumais, Michele Banko, Eric Brill, Jimmy Lin, Andrew Ng
This paper describes a question answering system that is designed to capitalize on the tremendous amount of data that is now available online. Most question answering systems use a wide variety of...
Bagging and boosting, two effective machine learning techniques, are applied to natural language parsing. Experiments using these techniques with a trainable statistical parser are described. The best...
A Rule-Based Approach to Prepositional Phrase Attachment Disambiguation (2007)
brill@goldilocks.lcs.mit.edu In this paper, we describe a new corpus-based approach to prepositional phrase attachment disambiguation, and present results comparing performance of this...
Towards An Adaptive Framework for Information Retrieval (2007)
Scott A. Weiss, Simon Kasif, Eric Brill
We report on our investigation into techniques for adaptive information retrieval. We describe our domain of USENET newsgroups, and discuss some of our initial experiments. We illustrate the weakness...
Web Search Intent Induction via Search Results Partitioning (2007)
We present a computationally efficient method for automatic clustering of web search results based on partitioning the return set according to queries the user may have intended. The method requires...
Web-Based Question Answering: A Decision-Making Perspective (2007)
David Azari, Eric Horvitz, Susan Dumais, Eric Brill
We investigate the use of probabilistic models and cost-benefit analyses to guide the operation of a Web-based question-answering system. We first provide an overview of research on question-answering...
Actions, Answers, and Uncertainty: a Decision-making Perspective (2004)
David Azari, Eric Horvitz, Susan Dumais, Eric Brill
We present research on methods for generating answers to freely posed questions, based upon information drawn from the Web. The methods exploit the typical redundancy of information on the Web by...
Web Search Intent Induction via Automatic Query Reformulation (2004)
We present a computationally efficient method for automatic grouping of web search results based on reformulating the original query to alternative queries the user may have intended.
Actions, Answers, and Uncertainty: a Decision-making Perspective (2004)
David Azari, Susan Dumais, Eric Horvitz, Eric Brill
We present research on methods for generating answers to freely posed questions, based upon information drawn from the Web. The methods exploit the typical redundancy of information on the Web by...
Actions, Answers, and Uncertainty: a Decision-making Perspective (2004)
David Azari, Susan Dumais, Eric Horvitz, Eric Brill
We present research on methods for generating answers to freely posed questions, based upon information drawn from the Web. The methods exploit the typical redundancy of information on the Web by...
An analysis of the AskMSR question-answering system (2002)
Eric Brill, Susan Dumais, Michele Banko
We describe the architecture of the AskMSR question answering system and systematically evaluate contributions of different system components to accuracy. The system differs from most question...
AskMSR: Question answering using the Worldwide Web (2002)
Michele Banko, Eric Brill, Susan Dumais, Jimmy Lin
The design of the AskMSR question answering system is motivated by recent observations in natural language processing that for many applications, significant improvements in accuracy can be attained...
An analysis of the AskMSR question-answering system (2002)
Eric Brill, Susan Dumais, Michele Banko
We describe the architecture of a question answering system and systematically evaluate contributions of different system components to accuracy. The system differs from most question answering...
AskMSR: Question answering using the Worldwide Web (2002)
Michele Banko, Eric Brill, Susan Dumais, Jimmy Lin
The design of the AskMSR question answering system is motivated by recent observations in natural language processing that for many applications, significant
Web question answering: Is more always better (2002)
Susan Dumais, Michele Banko, Eric Brill, Jimmy Lin, Andrew Ng
This paper describes a question answering system that is designed to capitalize on the tremendous amount of data that is now available online. Most question answering systems use a wide variety of...
Data-intensive question answering (2001)
Eric Brill, Jimmy Lin, Michele Banko, Susan Dumais, Andrew Ng
Microsoft Research Redmond participated for the first time in TREC this year, focusing on the question answering track. There is a separate report in this volume on the Microsoft Research Cambridge...
Data-intensive question answering (2001)
Eric Brill, Jimmy Lin, Michele Banko, Susan Dumais, Andrew Ng
Data-driven methods have proven to be powerful techniques for natural language processing. It is still unclear to what extent this success can be attributed to specific techniques, versus simply the...
Pattern-Based Disambiguation for Natural Language Processing (2000)
brill@microsoft.com A wide range of natural language problems can be viewed as disambiguating between a small set of alternatives based upon the string context surrounding the ambiguity site. In...
An improved error model for noisy channel spelling correction (2000)
{brill,bobmoore}@microsoft.com The noisy channel model has been applied to a wide range of problems, including spelling correction. These models consist of two components: a source model and a...
Automatic Grammar Induction: Combining, Reducing and Doing Nothing (2000)
Eric Brill, John C. Henderson, Grace Ngai
This paper surveys three research directions in parsing. First, we look at methods for both automatically generating a set of diverse parsers and combining the outputs of different parsers into a...
Lattice Compression in the Consensual Post-Processing Framework (1999)
Word Lattices are used by most speech recognizers as a compact representation of a set of alternative hypotheses. In large-vocabulary, multi-pass recognition systems it is important to generate word...
Exploiting Diversity in Natural Language Processing: Combining Parsers (1999)
Three state-of-the-art statistical parsers are combined to produce more accurate parses, as well as new bounds on achievable Treebank parsing accuracy. Two general approaches are presented and two...
Man vs. Machine: A Case Study in Base Noun Phrase Learning (1999)
A great deal of work has been done demonstrating the ability of machine learning algorithms to automatically extract linguistic knowledge from annotated corpora. Very little work has gone into...
Beyond n-grams: Can linguistic sophistication improve language modeling (1998)
Eric Brill, Radu Florian, John C. Henderson, Lidia Mangu
It seems obvious that a successful model of natural language would incorporate a great deal of both linguistic and world knowledge. Interestingly, state of the art language models for speech...
Beyond N-Grams: Can Linguistic Sophistication Improve Language Modeling? (1998)
Eric Brill, Radu Florian, John C. Henderson, Lidia Mangu
It seems obvious that a successful model of natural language would incorporate a great deal of both linguistic and world knowledge. Interestingly, state of the art language models for speech...
Classifier Combination for Improved Lexical Disambiguation (1998)
One of the most exciting recent directions in machine learning is the discovery that the combination of multiple classifiers often results in significantly better performance than what can be...
Automatic Rule Acquisition for Spelling Correction (1997)
This paper describes a new approach to automatically learning linguistic knowledge for spelling correction. A major feature of this approach is the fact that the acquired knowledge is captured in a...
Using Multiple Taggers to Improve English Part-of-Speech Tagging by Error Learning (1997)
Jun Wu (Advisor: Prof. Eric Brill)
Many approaches to Part-of-Speech tagging have reached accuracy about 96-97% which is close to the upper bound, and little improvement was made in recent years. In this project, we propose an idea of...
Efficient transformation-based parsing (1996)
In transformation-based parsing, a finite sequence of tree rewriting rules are checked for application to an input structure. Since in practice only a small percentage of rules are applied to any...
Text Classification in USENET Newsgroups: A Progress Report (1996)
Scott A. Weiss, Simon Kasif, Eric Brill
We report on our investigations into topic classification with USENET newsgroups. Our framework is to determine the newsgroup that a new document should be posted to. We train our system by forming...
Efficient Transformation-Based Parsing (1996)
In transformation-based parsing, a finite sequence of tree rewriting rules are checked for application to an input structure. Since in practice only a small percentage of rules are applied to any...
Text Classification in USENET Newsgroups: A Progress Report (1996)
Scott Weiss, Simon Kasif, Eric Brill
We report on our investigations into topic classification with USENET newsgroups. Our framework is to determine the newsgroup that a new document should be posted to. We train our system by forming...
Recently, there has been a rebirth of empiricism in the field of natural language processing. Manual encoding of linguistic information is being challenged by automated corpus-based learning as a...
Recently, there has been a rebirth of empiricism in the field of natural language processing. Manual encoding of linguistic information is being challenged by automated corpus-based learning as a method...
Unsupervised Learning of Disambiguation Rules for Part of Speech Tagging (1995)
In this paper we describe an unsupervised learning algorithm for automatically training a rule-based part of speech tagger without using a manually tagged corpus. We compare this algorithm to the...
Recently, there has been a rebirth of empiricism in the field of natural language processing. Manual encoding of linguistic information is being challenged by automated corpus-based learning as a...
Recently, there has been a rebirth of empiricism in the field of natural language processing. Manual encoding of linguistic information is being challenged by automated corpus-based learning as a...
Recently, there has been a rebirth of empiricism in the field of natural language processing. Manual encoding of linguistic information is being challenged by automated corpus-based learning as a...
In this paper, we will describe a simple rule-based approach to automated learning of linguistic knowledge. This approach has been shown for a number of tasks to capture information in a clearer and...
Pegasus: A Spoken Language Interface for On-Line Air Travel Planning (1994)
Victor Zue, Stephanie Seneff, Joseph Polifroni, Michael Phillips, Christine Pao, David Goddeau, ...
This paper describes PEGASUS, a spoken language interface for on-line air travel planning that we have recently developed. PEGASUS leverages off our spoken language technology development in the...
A report of recent progress in Transformation-based Error-driven Learning (1994)
Most recent research in trainable part of speech taggers has explored stochastic tagging. While these taggers obtain high accuracy, linguistic information is captured indirectly, typically in tens of...
Some Advances in Transformation-Based Part of Speech Tagging (1994)
Most recent research in trainable part of speech taggers has explored stochastic tagging. While these taggers obtain high accuracy, linguistic information is captured indirectly, typically in tens of...
A corpus-based approach to language learning (1993)
Thesis (Ph.D. in Computer and Information Science) -- Graduate School of Arts and Sciences, University of Pennsylvania, 1993.
Automatic Grammar Induction and Parsing Free Text: A Transformation-Based Approach (1993)
In this paper we describe a new technique for parsing free text: a transformational grammar is automatically learned that is capable of accurately parsing text into binary-branching syntactic trees...
Automatic Grammar Induction and Parsing Free Text: A Transformation-Based Approach (1993)
In this paper we describe a new technique for parsing free text: a transformational grammar is automatically learned that is capable of accurately parsing text into binary-branching syntactic...
Transformation-Based Error-Driven Parsing (1993)
In this paper we describe a new technique for parsing free text: a transformational grammar is automatically learned that is capable of accurately parsing text into binary-branching syntactic trees....
An Information-theoretic Solution to Parameter Setting (1993)
Eric Brill, Shyam Kapur
In this paper, we point out a possible way by which the child could obtain the target values of the word order parameters for her language. The essential idea is an entropy-based statistical analysis of...
An Information-theoretic Solution to Parameter Setting (1993)
In this paper, we point out a possible way by which the child could obtain the target values of the word order parameters for her language. The essential idea is an entropy-based statistical analysis of...
A corpus-based approach to Language learning (1993)
Eric Brill
Acknowledgements Many people deserve thanks in helping me progress from my mother's womb to finishing my dissertation. Rather than beginning with the doctor who delivered me and filling pages...
An information-theoretic solution to parameter setting (1993)
Eric Brill, Shyam Kapur
A Simple Rule-Based Part Of Speech Tagger (1992)
Automatic part of speech tagging is an area of natural language processing where statistical techniques have been more successful than rule-based methods. In this paper, we present a simple...
Tagging an Unfamiliar Text With Minimal Human Supervision (1992)
In this paper, we will discuss a method for assigning part of speech tags to words in an unannotated text corpus whose structure is completely unknown, with a little bit of help from an informant....
Finding Consensus Among Words: Lattice-Based Word Error Minimization
Lidia Mangu, Eric Brill, Andreas Stolcke
We describe a new algorithm for finding the hypothesis in a recognition lattice that is expected to minimize the word error rate (WER). Our approach thus overcomes the mismatch between the word-based...
Finding Consensus Among Words: Lattice-Based Word Error Minimization
Lidia Mangu, Eric Brill, Andreas Stolcke
We describe a new algorithm for finding the hypothesis in a recognition lattice that is expected to minimize the word error rate (WER). Our approach thus overcomes the mismatch between the word-based...