Advisors: Professors Kathleen R. McKeown, Judith L. Klavans
Computer Interaction. My main interest is in applying natural language techniques to real-world applications. A key focus of my work is improving information retrieval user interfaces with the use...
Sasha Blair-Goldensohn, Kathleen R. McKeown, Andrew Hazen Schlaikjer
Much of the effort in Question Answering (QA) has gone into building short answer QA systems, which answer questions for which the correct answer is a single word or short phrase. However, there are...
Building a Foundation System for Producing Short Answers to Factual Questions (2008)
Sameer S. Pradhan, Gabriel Illouz, Andrew Hazen Schlaikjer, Valerie Krugler, Elena Filatova, ...
An Investigation Into the Detection of New Information (2008)
Barry Schiffman, Kathleen R. McKeown
This paper explores new-information detection, describing a strategy for filtering a stream of documents to present only information that is fresh. We focus on multi-document summarization and seek...
Columbia Newsblaster: Multilingual News Summarization on the Web (2008)
David Kirk Evans, Judith L. Klavans, Kathleen R. McKeown
We present the new multilingual version of the Columbia Newsblaster news summarization system. The system addresses the problem of user access to browsing news from multiple languages from multiple...
Building Natural Language Interfaces for Rule-based Expert Systems (2008)
Galina Datskovsky Moerdler, Kathleen R. McKeown, J. Robert Ensor
In this paper we discuss a semantics for translating natural language statements into facts of an underlying expert system, replacing the more conventional menu interface for gathering data from the...
Discourse Strategies for Generating Natural-Language Text, Artificial Intelligence (2008)
If a generation system is to produce text in response to a given communicative goal, it must be able to determine what to include in its text and how to organize this information so that it can be...
Extracting Patient Profiles from Patient Records and Online Literature (2008)
Vasileios Hatzivassiloglou, Kathleen R. McKeown, Desmond A. Jordan, ...
We present a representation model for the content of medical documents (journal articles and the patient’s record) that allows the extraction of critical relationship information from online texts...
Tailoring Explanations for the User (2008)
Kathleen R. McKeown, Myron Wish, Kevin Matthews
In order for an expert system to provide the most effective explanations, it should be able to tailor its responses to the concerns of the user. One way in which explanations may be tailored is by...
Steven Abney, Michael Collins, Amit Singhal, Answer Extraction In, Sasha Blair-goldensohn, Kathleen R. Mckeown, ...
gov/projects/duc/roadmapping.html.
Min-Yen Kan, Judith L. Klavans, Kathleen R. McKeown
composite topic structure trees for multiple domain
Kathleen R. McKeown, Regina Barzilay, David Evans, Vasileios Hatzivassiloglou, Judith L. Klavans, Ani Nenkova, ...
Recently, there have been significant advances in several areas of language technology, including clustering, text categorization, and summarization. However, efforts to combine technology from these...
Information Fusion in the Context of Multi-Document Summarization (2007)
Regina Barzilay, Kathleen R. McKeown
We present a method to automatically generate a concise summary by identifying and synthesizing similar elements across related text from a set of multiple documents. Our approach is unique in its...
Madeline Bates, Barbara Grosz, David D. McDonald, Kathleen R. McKeown
This report consists of two documents describing the state of the art of computer generation of natural language text. Both were prepared by a panel of individuals who are active in research on text...
Building a Generation Knowledge Source using Internet-accessible Newswire (2007)
Dragomir R. Radev, Kathleen R. McKeown
In this paper, we describe a method for automatic creation of a knowledge source for text generation using information extraction over the Internet. We present a prototype system called PROFILE...
Alfred V. Aho, Shih-Fu Chang, Kathleen R. McKeown, Dragomir R. Radev, John R. Smith, Kazi A. Zaman
Abstract. In this paper we describe an ongoing research
on New Methods in Language Processing (2007)
Eric V. Siegel, Kathleen R. McKeown
This paper presents a method for large corpus analysis to semantically classify an entire clause. In particular, we use cooccurrence statistics among similar clauses to determine the...
Cut and Paste Based Text Summarization (2007)
Hongyan Jing, Kathleen R. McKeown
We present a cut and paste based text summarizer, which uses operations derived from an analysis of human written abstracts. The summarizer edits extracted sentences,...
Coordinating Text and Graphics in Explanation Generation (2007)
Feiner, Steven K., McKeown, Kathleen R.
To generate multimedia explanations, a system must be able to coordinate the use of different media in a single explanation. In this paper, we present an architecture that we have developed for COMET...
Question answering using integrated information retrieval and information extraction (2007)
Barry Schiffman, Kathleen R. McKeown, Ralph Grishman
This paper addresses the task of providing extended responses to questions regarding specialized topics. This task is an amalgam of information retrieval, topical summarization, and Information...
A Hybrid Approach for QA Track Definitional Questions (2006)
Blair-Goldensohn, Sasha, McKeown, Kathleen R., Schlaikjer, Andrew H.
This paper presents an overview of DefScriber, a hybrid goal-driven and data-driven system for definitional questions that was developed at Columbia University. DefScriber combines knowledge-based...
Similarity-based Multilingual Multi-Document Summarization (2005)
Evans, David Kirk, McKeown, Kathleen R., Klavans, Judith L.
We present a new approach for summarizing clusters of documents on the same event, some of which are machine translations of foreign-language documents and some of which are English. Our approach to...
Sentence fusion for multidocument news summarization (2005)
Regina Barzilay, Kathleen R. McKeown
A system that can produce informative summaries, highlighting common information found in many online documents, will help Web users to pinpoint information that they need without extensive reading....
An Investigation Into the Detection of New Information (2004)
Schiffman, Barry, McKeown, Kathleen R.
This paper explores new-information detection, describing a strategy for filtering a stream of documents to present only information that is fresh. We focus on multi-document summarization and seek...
Machine Learning and Text Segmentation in Novelty Detection (2004)
Schiffman, Barry, McKeown, Kathleen R.
This paper explores a combination of machine learning, approximate text segmentation and a vector-space model to distinguish novel information from repeated information. In experiments with the data...
Machine learning and text segmentation in novelty detection (2004)
Barry Schiffman, Kathleen R. McKeown
This paper explores a combination of machine learning, approximate text segmentation and a vector-space model to distinguish novel information from repeated information. In experiments with...
Statistical Acquisition of Content Selection Rules for Natural Language Generation (2003)
Duboue, Pablo A., McKeown, Kathleen R.
A Natural Language Generation system produces text using as input semantic data. One of its very first tasks is to decide which pieces of information to convey in the output. This task, called...
A Hybrid Approach for Answering Definitional Questions (2003)
Blair-Goldensohn, Sasha, McKeown, Kathleen R., Schlaikjer, Andrew Hazen
We present DefScriber, a fully implemented system that combines knowledge-based and statistical methods in forming multi-sentence answers to open-ended definitional questions of the form, “What is...
Nenkova, Ani, McKeown, Kathleen R.
References included in multi-document summaries are often problematic. In this paper, we present a corpus study performed to derive statistical models for the syntactic realization of referential...
A Hybrid Approach for Answering Definitional Questions (2003)
Sasha Blair-Goldensohn, Kathleen R. McKeown, Andrew Hazen Schlaikjer
We present DefScriber, a fully implemented system that combines knowledge-based and statistical methods in forming multi-sentence answers to open-ended definitional questions of the form,...
Statistical Acquisition of Content Selection Rules for Natural Language Generation (2003)
Pablo A. Duboue, Kathleen R. McKeown
A Natural Language Generation system produces text using as input semantic data.
Kathleen R. McKeown, Noemie Elhadad, Vasileios Hatzivassiloglou
Despite the large amount of online medical literature, it can be difficult for clinicians to find relevant information at the point of patient care. In this paper, we present techniques to...
ProGenIE: Biographical Descriptions for Intelligence Analysis (2003)
Pablo A. Duboue, Kathleen R. McKeown, Vasileios Hatzivassiloglou
Intelligence analysts face the need for immediate, up-to-date information about individuals of interest. While biographies can be written and stored in text databases, we argue that they can get...
Building a Foundation System for Producing Short Answers to Factual Questions (2003)
Sameer S. Pradhan, Gabriel Illouz, Andrew Hazen Schlaikjer, Valerie Krugler, ...
In this paper we describe the goals of question answering research being pursued as a joint project between Columbia University and the University of Colorado at Boulder as part of ARDA's...
Leveraging a Common Representation for Personalized Search and Summarization in Medical... (2003)
Kathleen R. McKeown, Noemie Elhadad, Vasileios Hatzivassiloglou
Despite the large amount of online medical literature, it can be difficult for clinicians to find relevant information at the point of patient care. In this paper, we present techniques to...
A Hybrid Approach for Answering Definitional Questions (2003)
Sasha Blair-Goldensohn, Kathleen R. McKeown, Andrew Hazen Schlaikjer
We present DefScriber, a fully implemented system that combines knowledge-based and statistical methods in forming multi-sentence answers to open-ended definitional questions of the form, “What is...
A Hybrid Approach for QA Track Definitional Questions (2003)
Sasha Blair-Goldensohn, Kathleen R. McKeown, Andrew Hazen Schlaikjer
We present an overview of DefScriber, a system developed at Columbia University that combines knowledge-based and statistical methods to answer definitional questions of the form, “What is X?”...
Using the Annotated Bibliography as a Resource for Indicative Summarization (2002)
Kan, Min-Yen, Klavans, Judith L., McKeown, Kathleen R.
We report on a language resource consisting of 2000 annotated bibliography entries, which is being analyzed as part of our research on indicative document summarization. We show how annotated...
Using Density Estimation to Improve Text Categorization (2002)
Sable, Carl, McKeown, Kathleen R., Hatzivassiloglou, Vassilis
This paper explores the use of a statistical technique known as density estimation to potentially improve the results of text categorization systems which label documents by computing similarities...
Using the Annotated Bibliography as a Resource for Indicative Summarization (2002)
Min-Yen Kan, Judith L. Klavans, Kathleen R. McKeown
We report on a language resource consisting of 2000 annotated bibliography entries, which is being analyzed as part of our research on indicative document summarization. We show how annotated...
Inferring strategies for sentence ordering in multidocument news summarization (2002)
Regina Barzilay, Noemie Elhadad, Kathleen R. McKeown
The problem of organizing information for multidocument summarization so that the generated summary is coherent has received relatively little attention. While sentence ordering for single document...
Min-Yen Kan, Judith L. Klavans, Kathleen R. McKeown
We report on a language resource consisting of 2000 annotated bibliography entries, which is being analyzed as part of our research on indicative document summarization. We show how annotated...
Regina Barzilay, Noemie Elhadad, Kathleen R. McKeown
The problem of organizing information for multidocument summarization so that the generated summary is coherent has received relatively little attention. While sentence ordering for single document...
Applying Natural Language Generation to Indicative Summarization (2001)
Kan, Min-Yen, McKeown, Kathleen R., Klavans, Judith L.
The task of creating indicative summaries that help a searcher decide whether to read a particular document is a difficult task. This paper examines the indicative summarization task from a...
Synthesizing composite topic structure trees for multiple domain specific documents (2001)
Kan, Min-Yen, McKeown, Kathleen R., Klavans, Judith L.
Domain specific texts often have implicit rules on content and organization. We introduce a novel method for synthesizing this topical structure. The system uses corpus examples and recursively merges...
Applying natural language generation to indicative summarization (2001)
Min-Yen Kan, Kathleen R. McKeown
The task of creating indicative summaries that help a searcher decide whether to read a particular document is a difficult task. This paper examines the indicative summarization task from...
Simfinder: A flexible clustering tool for summarization (2001)
Vasileios Hatzivassiloglou, Judith L. Klavans, Melissa L. Holcombe, Regina Barzilay, Min-Yen Kan, Kathleen R. McKeown
We present a statistical similarity measuring and clustering tool, SIMFINDER, that organizes small pieces of text from one or multiple documents into tight clusters. By placing highly related text...
Columbia Multi-Document Summarization: Approach and Evaluation (2001)
Kathleen R. McKeown, Vasileios Hatzivassiloglou, Regina Barzilay, Barry Schiffman, David Evans, Simone Teufel
Different forms of summarization are useful in different situations, depending on the intended
Towards generating patient specific summaries of medical articles (2001)
Noemie Elhadad, Kathleen R. McKeown
The end users of medical digital libraries need quick access to information that is specific to the patients under their care. We present a summarization system that finds and extracts results from...
Extracting paraphrases from a parallel corpus (2001)
Regina Barzilay, Kathleen R. McKeown
While paraphrasing is critical both for interpretation and generation of natural language, current systems use manual or semi-automatic methods to collect paraphrases. We present an unsupervised...
Applying natural language generation to indicative summarization (2001)
Min-Yen Kan, Kathleen R. McKeown, Judith L. Klavans
Workshop on
Columbia Multi-Document Summarization: Approach and Evaluation (2001)
Kathleen R. McKeown, Regina Barzilay, David Evans, Vasileios Hatzivassiloglou, Simone Teufel
Different forms of summarization are useful in different situations, depending on the intended purpose of the summary and on the types of
Domain-specific informative and indicative summarization for information retrieval (2001)
Min-Yen Kan, Kathleen R. McKeown, Judith L. Klavans
retrieval
a system for personalized search and summarization over multimedia healthcare information (2001)
Kathleen R. McKeown, Shih-Fu Chang, James Cimino, Steven K. Feiner, Luis Gravano, Vasileios Hatzivassiloglou, ...
Personalizing Retrieval of Journal Articles for Patient Care (2001)
Simone Teufel, Vasileios Hatzivassiloglou, Kathleen R. McKeown, Desmond A. Jordan, Kathleen M. Dunn, Sergey Sigelman, ...
this paper and other work in the context of PERSIVAL, we collected a corpus of 29,784 medical articles in full text, either from the web with an automated crawler or via a licensing agreement with...
Towards Generating Patient Specific Summaries of Medical Articles (2001)
Noemie Elhadad, Kathleen R. McKeown
The end users of medical digital libraries need quick access to information that is specific to the patients under their care. We present a summarization system that finds and extracts results from...
Sentence Ordering in Multidocument Summarization (2001)
Regina Barzilay, Noemie Elhadad, Kathleen R. McKeown
The problem of organizing information for multidocument summarization so that the generated summary is coherent has received relatively little attention. In this paper, we describe two naive ordering...
SIMFINDER: A Flexible Clustering Tool for Summarization (2001)
Vasileios Hatzivassiloglou, Judith L. Klavans, Melissa L. Holcombe, Regina Barzilay, Min-Yen Kan, Kathleen R. McKeown
We present a statistical similarity measuring and clustering tool, SIMFINDER, that organizes small pieces of text from one or multiple documents into tight clusters. By placing highly related text...
Kathleen R. McKeown, Shih-Fu Chang, James Cimino, Carol Friedman, Steven K. Feiner, ...
In healthcare settings, patients need access to online information that can help them understand their medical situation. Physicians need information that is clinically relevant to an individual...
Applying natural language generation to indicative summarization (2001)
Min-Yen Kan, Kathleen R. McKeown
The task of creating indicative summaries that help a searcher decide whether to read a particular document is a difficult task. This paper examines the indicative summarization task from a...
Experiments in automated lexicon building for text searching (2000)
Barry Schiffman, Kathleen R. McKeown
This paper describes experiments in the automatic construction of lexicons that would be useful in searching large document collections for text fragments that address a specific information need,...
Experiments in automated lexicon building for text searching (2000)
Barry Schiffman, Kathleen R. McKeown
This paper describes experiments in the automatic construction of lexicons that would be useful in searching large document collections for text fragments that address a...
Kathleen R. McKeown, Dragomir R. Radev
This chapter describes a class of word groups that lies between idioms and free word combinations. Idiomatic expressions are those in which the semantics of the whole cannot be deduced from the...
Eric V. Siegel, Kathleen R. McKeown
Aspectual classification maps verbs to a small set of primitive categories in order to reason about time. This classification is necessary for interpreting temporal modifiers and assessing temporal...
Cut and paste based text summarization (2000)
Hongyan Jing, Kathleen R. McKeown
Identification of cutting and pasting operations; decomposition of human-written summary sentences; development of an automatic system to perform cut and paste...
Experiments in automated lexicon building for text searching (2000)
This paper describes experiments in the automatic construction of lexicons that would be useful in searching large document collections for text fragments that address a specific information need,...
Integrating a Large-scale, Reusable Lexicon with a Natural Language Generator (2000)
Hongyan Jing, Yael Dahan Netzer, Michael Elhadad, Kathleen R. McKeown
This paper presents the integration of a large-scale, reusable lexicon for generation with the FUF/SURGE unification-based syntactic realizer. The lexicon was combined from multiple existing...
Cut and Paste Based Text Summarization (2000)
Hongyan Jing, Kathleen R. McKeown
We present a cut and paste based text summarizer, which uses operations derived from an analysis of human written abstracts. The summarizer edits extracted sentences, using reduction to remove...
Information Extraction and Summarization: Domain Independence through Focus Types (1999)
Kan, Min-Yen, McKeown, Kathleen R.
We show how information extraction (IE) and summarization can be merged in a sequential pipeline, resulting in a new approach to domain-independent summarization. IE finds the document's terms and...
Information Fusion in the Context of Multi-Document Summarization (1999)
Regina Barzilay, Kathleen R. McKeown
We present a method to automatically generate a concise summary by identifying and synthesizing similar elements across related text from a set of multiple documents. Our approach is unique in its...
Towards multidocument summarization by reformulation: Progress and prospects (1999)
Kathleen R. McKeown, Judith L. Klavans, Vasileios Hatzivassiloglou, Regina Barzilay, Eleazar Eskin
By synthesizing information common to retrieved documents, multi-document summarization can help users of information retrieval systems to find relevant documents with a minimal amount of reading. We...
A description of the CIDR system as used for TDT-2 (1999)
Dragomir R. Radev, Vasileios Hatzivassiloglou, Kathleen R. McKeown
We describe several experimental parameters and a parallelization technique used in our online document clustering system, CIDR. These modifications were introduced into CIDR to reduce the running...
Word Informativeness and Automatic Pitch Accent Modeling (1999)
Shimei Pan, Kathleen R. McKeown
In intonational phonology and speech synthesis research, it has been suggested that the relative informativeness of a word can be used to predict pitch prominence. The more information conveyed by a...
The decomposition of human-written summary sentences (1999)
Hongyan Jing, Kathleen R. McKeown
We define the problem of decomposing human-written summary sentences and propose a novel Hidden Markov Model solution to the problem. Human summarizers often rely on cutting and pasting of the full...
Information Fusion in the Context of Multi-Document Summarization (1999)
Regina Barzilay, Kathleen R. McKeown, Michael Elhadad
We present a method to automatically generate a concise summary by identifying and synthesizing similar elements across related text from a set of multiple documents. Our approach is unique in its...
Seungyup Paek, Carl L. Sable, Vasileios Hatzivassiloglou, Alejandro Jaimes, Barry H. Schiffman, Shih-fu Chang, ...
Annotating photographs automatically with content descriptions facilitates organization, storage, and search over visual information. We present an integrated approach for scene classification that...
Word Informativeness and Automatic Pitch Accent Modeling (1999)
Shimei Pan, Kathleen R. McKeown
In intonational phonology and speech synthesis research, it has been suggested that the relative informativeness of a word can be used to predict pitch prominence. The more information conveyed by a...
A Description of the CIDR System as Used for TDT-2 (1999)
Dragomir Radev, Vasileios Hatzivassiloglou, Kathleen R. McKeown
We describe several experimental parameters and a parallelization technique used in our online document clustering system, CIDR. These modifications were introduced into CIDR to reduce the running...
Information Fusion in the Context of Multi-Document Summarization (1999)
Regina Barzilay, Kathleen R. McKeown
We present a method to automatically generate a concise summary by identifying and synthesizing similar elements across related text from a set of multiple documents. Our approach is unique in its...
Towards Multidocument Summarization by Reformulation: Progress and Prospects (1999)
Kathleen R. McKeown, Judith L. Klavans, Vasileios Hatzivassiloglou, Regina Barzilay, Eleazar Eskin
By synthesizing information common to retrieved documents, multi-document summarization can help users of information retrieval systems to find relevant documents with a minimal amount of reading. We...
Extracting Patient Profiles from Patient Records and Online Literature (1999)
Vasileios Hatzivassiloglou, Kathleen R. McKeown, Desmond A. Jordan
In this paper, we present a new model for representing the information from an online source, be it an article in a medical journal or a laboratory report in an online clinical information system....
A Description of the CIDR System as Used for TDT-2 (1999)
Dragomir Radev, Vasileios Hatzivassiloglou, Kathleen R. McKeown
We describe several experimental parameters and a parallelization technique used in our online document clustering system, CIDR. These modifications were introduced into CIDR to reduce the running...
The Decomposition of Human-Written Summary Sentences (1999)
Hongyan Jing, Kathleen R. McKeown
We define the problem of decomposing human-written summary sentences and propose a novel Hidden Markov Model solution to the problem. Human summarizers often rely on cutting and pasting of the full...
Word Informativeness and Automatic Pitch Accent Modeling (1999)
Shimei Pan, Kathleen R. McKeown
In intonational phonology and speech synthesis research, it has been suggested that the relative informativeness of a word can be used to predict pitch prominence. The more information conveyed by a...
Text Generation: The State of the Art and the Literature. (1998)
Mann, William C., Bates, Madeline, Grosz, Barbara J., McDonald, David D., McKeown, Kathleen R.
This report consists of two documents describing the state of the art of computer generation of natural language text. Both were prepared by a panel of individuals who are active in research on text...
Natural Language for Problem Solving Systems. (1998)
Over the course of the contract, we have had results in three separate projects: natural language interpretation for expert systems (covered under the previous contract), goal oriented explanation...
Resources for Evaluation of Summarization Techniques (1998)
Klavans, Judith L., McKeown, Kathleen R., Kan, Min-Yen, Lee, Susan
We report on two corpora to be used in the evaluation of component systems for the tasks of (1) linear segmentation of text and (2) summary-directed sentence extraction. We present characteristics of...
Linear Segmentation and Segment Significance (1998)
Kan, Min-Yen, Klavans, Judith L., McKeown, Kathleen R.
We present a new method for discovering a segmental discourse structure of a document while categorizing segment function. We demonstrate how retrieval of noun phrases and pronominal forms, along...
Linear segmentation and segment significance (1998)
Min-yen Kan, Judith L. Klavans, Kathleen R. Mckeown
We present a new method for discovering a segmental discourse structure of a document while categorizing each segment's function and importance. Segments are determined by a zero-sum weighting...
Linear Segmentation and Segment Significance (1998)
Min-yen Kan, Judith L. Klavans, Kathleen R. McKeown
We present a new method for discovering a segmental discourse structure of a document while categorizing each segment's function and importance. Segments are determined by a zero-sum weighting...
Resources for Evaluation of Summarization Techniques (1998)
Judith L. Klavans, Kathleen R. McKeown, Min-Yen Kan, Susan Lee
We report on two corpora to be used in the evaluation of component systems for the tasks of (1) linear segmentation of text and (2) summary-directed sentence extraction. We present characteristics of...
Resources for the Evaluation of Summarization Techniques (1998)
Judith L. Klavans, Kathleen R. Mckeown, Min-yen Kan, Susan Lee
Linear segmentation and segment significance (1998)
Min-yen Kan, Judith L. Klavans, Kathleen R. Mckeown
We present a new method for discovering a segmental discourse structure of a document while categorizing each segment’s function and importance. Segments are determined by a zero-sum weighting...
Generating natural language summaries from multiple on-line sources (1998)
Dragomir R. Radev, Kathleen R. Mckeown
We present a methodology for summarization of news about current events in the form of briefings that include appropriate background (historical) information. The system that we developed, SUMMONS,...
Building a Rich Large-scale Lexical Base for Generation (1997)
Jing, Hongyan, McKeown, Kathleen R., Passonneau, Rebecca J.
Most large lexical resources have been developed with language interpretation in mind and cannot be used directly for generation. We present a rich large-scale lexical base for generation,...
Building a Generation Knowledge Source using Internet-Accessible Newswire (1997)
Radev, Dragomir R., McKeown, Kathleen R.
In this paper, we describe a method for automatic creation of a knowledge source for text generation using information extraction over the Internet. We present a prototype system called PROFILE which...
Predicting the semantic orientation of adjectives (1997)
Vasileios Hatzivassiloglou, Kathleen R. Mckeown
We identify and validate from a large corpus constraints from conjunctions on the positive or negative semantic orientation of the conjoined adjectives. A log-linear...
Language generation for multimedia healthcare briefings (1997)
Kathleen R. Mckeown, Shimei Pan, James Shaw
This paper identifies issues for language generation that arose in developing a multimedia interface to healthcare data that includes coordinated speech, text and...
Integrating language generation with speech synthesis in a concept to speech system (1997)
Shimei Pan, Kathleen R. Mckeown
Concept To Speech (CTS) systems are closely related to two other types of
Building A Rich Large-Scale Lexical Base For Generation (1997)
Hongyan Jing, Kathleen R. McKeown, Rebecca Passonneau
In this paper, we describe our work in building a large-scale lexical base for generation by automatically merging existing linguistic resources to produce the links between syntactic and semantic...
Language Generation for Multimedia Healthcare Briefings (1997)
Kathleen R. McKeown, Shimei Pan, James Shaw, Desmond A. Jordan, Barry A. Allen
This paper identifies issues for language generation that arose in developing a multimedia interface to healthcare data that includes coordinated speech, text and graphics. In order to produce brief...
Building a Generation Knowledge Source using Internet-Accessible Newswire (1997)
Dragomir Radev, Kathleen R. Mckeown
In this paper, we describe a method for automatic creation of a knowledge source for text generation using information extraction over the Internet. We present a prototype system called PROFILE which...
Corpus Analysis Resources for Discourse (1997)
James Allen, Johanna Moore, Kathleen R. McKeown
d to capture properties of discourse independent of modality (e.g., spoken vs. written), number of participants, domain, genre and so on. That is, procedures for identifying a given DAL element, such...
Language generation for multimedia healthcare briefings (1997)
Kathleen R. Mckeown, Shimei Pan, James Shaw
This paper identifies issues for language generation that arose in developing a multimedia interface to healthcare data that includes coordinated speech, text and graphics. In order to produce brief...
Chapter of the Association for Computational Linguistics, Madrid, Spain (1997)
Dragomir Radev, Erin Doumpoulaki, Branimir Boguraev, Gael Dias, Hongyan Jing, Mark Kantrowitz, ...
This document contains a rather incomplete bibliography of research in text summarization. The list of references was compiled using materials provided
Gathering Statistics to Aspectually Classify Sentences with a Genetic Algorithm (1996)
Siegel, Eric V., McKeown, Kathleen R.
This paper presents a method for large corpus analysis to semantically classify an entire clause. In particular, we use cooccurrence statistics among similar clauses to determine the aspectual class...
Spoken Language Generation (1996)
Kathleen R. Mckeown, Johanna D. Moore
Interactive natural language capabilities are needed for a wide range of today’s intelligent systems: expert systems must explain their results and reasoning, intelligent assistants must...
I Artificial Intelligence (1996)
Peter Allen, Terrance E. Boult, John R. Kender, Kathleen R. Mckeown, Shree K. Nayar, ...
ther sensing devices in a general way. A modular system that allows new sensing devices to be added incrementally would be extremely useful. 2. Multiple sensing is in many ways a problem in...
Emergent Linguistic Rules from Inducing Decision Trees: Disambiguating Discourse Clue Words (1994)
Siegel, Eric V., McKeown, Kathleen R.
We apply decision tree induction to the problem of discourse clue word sense disambiguation with a genetic algorithm. The automatic partitioning of the training set which is intrinsic to decision...
Vasileios Hatzivassiloglou, Kathleen R. Mckeown
In this paper we present a method to group adjectives... One type of lexical knowledge which is useful for many natural language (NL) tasks is the semantic relatedness between words of the same or...
Michael Elhadad, Kathleen R. Mckeown
We present an implemented procedure to select an appropriate connective to link two propositions, which is part of a large text generation system. Each connective is defined as a set of constraints...
Michael Elhadad, Kathleen R. Mckeown
Vol 3, pp. 97-101. We present an implemented procedure to select an appropriate connective given IFs for two propositions. We demonstrate how our surface generator uses IFs to choose...
Using focus to generate complex and simple sentences (1984)
Marcia A. Derr, Kathleen R. Mckeown
One problem for the generation of natural language text is determining when to use a sequence of simple sentences and when a single complex one is more appropriate. In this paper, we show how focus...
Generating natural language text in response to questions about database structure (1982)
Thesis (Ph. D.)--University of Pennsylvania, 1982.
Generating natural language text in response to questions about database structure (1982)
Cover title.
Paraphrasing Using Given and New Information in a Question-Answer System (1980)
The design and implementation of a paraphrase component for a natural language question-answer system (CO-OP) is presented. A major point made is the role of given and new information in formulating...
Paraphrasing using given and new information in a question-answer system (1979)
Thesis (M.S. in Computer and Information Sciences)--Graduate School of Arts and Sciences, University of Pennsylvania, 1979.
Translating Collocations for Bilingual Lexicons: A Statistical Approach
Frank Smadja, Vasileios Hatzivassiloglou, Kathleen R. McKeown
Language Generation for Multimedia Healthcare Briefings
Kathleen R. McKeown, Desmond A. Jordan, Shimei Pan, James Shaw, Barry A. Allen
Jordan, Desmond A., McKeown, Kathleen R., Concepcion, Kristian J., Feiner, Steven K., Hatzivassiloglou, Vasileios
Objective: The authors present a system that scans electronic records from cardiac surgery and uses inference rules to identify and classify abnormal events (e.g., hypertension) that may occur during...
Extracting Patient Profiles from Patient Records and Online Literature
Hatzivassiloglou, Vasileios, Merport, Olga, McKeown, Kathleen R., Jordan, Desmond A.
Re-engineering an Inference Engine to Support Continuous Quality Improvement
Duboué, Pablo A., Jordan, Desmond, McKeown, Kathleen R.
Supporting Continuous Quality Improvement (CQI) requires specific software design structures and approaches, not necessarily present in legacy code. In this poster we present a new design for an...