Hongyan Jing

Active Learning for Mention Detection: A Comparison of Sentence Selection Strategies (2009)

Nitin Madnani, Hongyan Jing, Nanda Kambhatla, Salim Roukos

We propose and compare various sentence selection strategies for active learning for the task of detecting mentions of entities. The best strategy employs the sum of confidences of two statistical...

Information Retrieval Based on Context Distance and Morphology (2008)

Hongyan Jing

We present an approach to information retrieval based on context distance and morphology. Context distance is a measure we use to assess the closeness of word meanings. This context distance model...

Information Retrieval Based on Context Distance and Morphology (2007)

Hongyan Jing

We present an approach to information retrieval based on context distance and morphology. Context distance is a measure we use to assess the closeness of word meanings. This context distance model...

Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies (2007)

Dragomir R. Radev, Hongyan Jing

Cut and Paste Based Text Summarization (2007)

Hongyan Jing, Kathleen R. McKeown

We present a cut and paste based text summarizer, which uses operations derived from an analysis of human written abstracts. The summarizer edits extracted sentences,...

Summarization of Noisy Documents: A Pilot Study (2007)

Hongyan Jing

We investigate the problem of summarizing text documents that contain errors as a result of optical character recognition. Each stage in the process is tested, the error effects analyzed, and...

HowtogetaChineseName(Entity): Segmentation and Combination Issues (2007)

Hongyan Jing, Radu Florian, Xiaoqiang Luo, Tong Zhang, Abraham Ittycheriah

When building a Chinese named entity recognition system, one must deal with certain language-specific issues such as whether the model should be based on characters or words. While there is no unique...

Extracting social networks and biographical facts from conversational speech transcripts (2007)

Hongyan Jing

We present a general framework for automatically extracting social networks and biographical facts from conversational speech. Our approach relies on fusing the output produced by multiple...

Factorizing complex models: A case study in mention detection (2006)

Radu Florian, Hongyan Jing, Nanda Kambhatla, Imed Zitouni

As natural language understanding research advances towards deeper knowledge modeling, the tasks become more and more complex: we are interested in more nuanced word characteristics, more linguistic...

A Mention-Synchronous Coreference Resolution Algorithm Based on the Bell Tree (2004)

Xiaoqiang Luo, Abe Ittycheriah, Hongyan Jing, Nanda Kambhatla, Salim Roukos

This paper proposes a new approach for coreference resolution which uses the Bell tree to represent the search space and casts the coreference resolution problem as finding the best path from the...

Named entity recognition through classifier combination (2003)

Radu Florian, Abe Ittycheriah, Hongyan Jing, Tong Zhang

This paper presents a classifier-combination experimental framework for named entity recognition in which four diverse classifiers (robust linear classifier, maximum entropy, transformation-based...

Quantitative measurement of prosodic strength in Mandarin (2003)

Greg Kochanski, Chilin Shih, Hongyan Jing

We describe models of Mandarin prosody that allow us to make quantitative measurements of prosodic strengths. These models use Stem-ML, which is a phenomenological model of the muscle dynamics and...

Summarizing noisy documents (2003)

Hongyan Jing, Daniel Lopresti, Chilin Shih

We investigate the problem of summarizing text documents that contain errors as a result of optical character recognition. Each stage in the process is tested, the error effects analyzed, and...

Hierarchical structure and word strength prediction of Mandarin prosody (2003)

Greg Kochanski, Chilin Shih, Hongyan Jing

We use Stem-ML to build an automatic learning system for Mandarin prosody that allows us to make quantitative measurements of prosodic strengths. Stem-ML is a phenomenological model of the muscle...

HowtogetaChineseName(Entity): Segmentation and combination issues (2003)

Hongyan Jing, Radu Florian, Xiaoqiang Luo, Tong Zhang, Abraham Ittycheriah

When building a Chinese named entity recognition system, one must deal with certain language-specific issues such as whether the model should be based on characters...

Discourse Segmentation of Multi-Party Conversation (2003)

Michel Galley, Kathleen McKeown, Eric Fosler-Lussier, Hongyan Jing

We present a domain-independent topic segmentation algorithm for multi-party speech. Our feature-based algorithm combines knowledge about content using a text-based algorithm as a feature and about...

Cut-and-paste text summarization (2002)

Hongyan Jing

Automatic text summarization provides a concise summary for a document. In this thesis, we present a cut-and-paste approach to addressing the text generation problem in domain-independent,...

Using hidden Markov modeling to decompose human-written summaries (2002)

Hongyan Jing

Professional summarizers often reuse original documents to generate summaries. The task of summary sentence decomposition is to deduce whether a summary sentence is constructed by reusing the...

Cut-and-Paste Text Summarization (2001)

Hongyan Jing

Automatic text summarization provides a concise summary for a document. In this thesis, we present a cut-and-paste approach to addressing the text generation problem in domain-independent,...

Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies (2000)

Dragomir R. Radev, Hongyan Jing, Malgorzata Budzikowska

We present a multi-document summarizer, called MEAD, which generates summaries using cluster centroids produced by a topic detection and tracking system. We also describe two new techniques, based on...

Sentence reduction for automatic text summarization (2000)

Hongyan Jing

We present a novel sentence reduction system for automatically removing extraneous phrases from sentences that are extracted from a document for summarization purposes. The system uses multiple...

Sentence reduction for automatic text summarization (2000)

Hongyan Jing

We present a novel sentence reduction system for automatically removing extraneous phrases from sentences that are extracted from a document for summarization purpose. The system uses multiple...

Cut and paste based text summarization (2000)

Hongyan Jing, Kathleen R. McKeown

Identification of cutting and pasting operations; decomposition of human-written summary sentences; development of an automatic system to perform cut and paste...

Integrating a Large-scale, Reusable Lexicon with a Natural Language Generator (2000)

Hongyan Jing, Yael Dahan Netzer, Michael Elhadad, Kathleen R. McKeown

This paper presents the integration of a large-scale, reusable lexicon for generation with the FUF/SURGE unification-based syntactic realizer. The lexicon was combined from multiple existing...

Sentence Reduction for Automatic Text Summarization (2000)

Hongyan Jing

We present a novel sentence reduction system for automatically removing extraneous phrases from sentences that are extracted from a document for summarization purpose. The system uses multiple...

Centroid-Based Summarization of Multiple Documents: Sentence Extraction, Utility-Based Evaluation, and User Studies (2000)

Dragomir Radev, Hongyan Jing, Malgorzata Budzikowska

We present a multi-document summarizer, called MEAD, which generates summaries using cluster centroids produced by a topic detection and tracking system. We also describe two new techniques, based on...

Cut and Paste Based Text Summarization (2000)

Hongyan Jing, Kathleen R. McKeown

We present a cut and paste based text summarizer, which uses operations derived from an analysis of human written abstracts. The summarizer edits extracted sentences, using reduction to remove...

Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies (2000)

Dragomir R. Radev, Hongyan Jing

We present a multi-document summarizer, called MEAD, which generates summaries using cluster centroids produced by a topic detection and tracking system. We also describe two new techniques, based on...

The decomposition of human-written summary sentences (1999)

Hongyan Jing, Kathleen R. McKeown

We define the problem of decomposing human-written summary sentences and propose a novel Hidden Markov Model solution to the problem. Human summarizers often rely on cutting and pasting of the full...

Information Retrieval Based on Context Distance and Morphology (1999)

Hongyan Jing, Evelyne Tzoukermann

We present an approach to information retrieval based on context distance and morphology. Context distance is a measure we use to assess the closeness of word meanings. This context distance model...

Information Retrieval Based on Context Distance and Morphology (1999)

Hongyan Jing

We present an approach to information retrieval based on context distance and morphology. Context distance is a measure we use to assess the closeness of word meanings. This context distance model...

The Decomposition of Human-Written Summary Sentences (1999)

Hongyan Jing, Kathleen R. McKeown

We define the problem of decomposing human-written summary sentences and propose a novel Hidden Markov Model solution to the problem. Human summarizers often rely on cutting and pasting of the full...

Contents (1999)

Hongyan Jing

1.1 Problem description; 1.2 Why is summary generation important?; 1.3 Why is summary generation...

Usage of WordNet in natural language generation (1998)

Hongyan Jing

WordNet has rarely been applied to natural language generation, despite its wide application in other fields. In this paper, we address three issues in the usage of WordNet in generation: adapting...

Summarization Evaluation Methods: Experiments and Analysis (1998)

Hongyan Jing, Regina Barzilay, Kathleen McKeown, Michael Elhadad

Two methods are used for evaluation of summarization systems: an evaluation of generated summaries against an "ideal" summary and evaluation of how well summaries help a person perform in a...

Combining Multiple, Large-Scale Resources in a Reusable Lexicon for Natural Language Generation (1998)

Hongyan Jing, Kathleen McKeown

A lexicon is an essential component in a generation system but few efforts have been made to build a rich, large-scale lexicon and make it reusable for different generation applications. In this paper,...

Summarization Evaluation Methods: Experiments and Analysis (1998)

Hongyan Jing, Regina Barzilay, Kathleen McKeown, Michael Elhadad

Two methods are used for evaluation of summarization systems: an evaluation of generated summaries against an "ideal" summary and evaluation of how well summaries help a person perform in a...

Building a Rich Large-scale Lexical Base for Generation (1997)

Hongyan Jing, Kathleen R. McKeown, Rebecca J. Passonneau

Most large lexical resources have been developed with language interpretation in mind and cannot be used directly for generation. We present a rich large-scale lexical base for generation,...

Investigating complementary methods for verb sense pruning (1997)

Hongyan Jing, Vasileios Hatzivassiloglou, Rebecca Passonneau, Kathleen McKeown

We present an approach for tagging verb sense that combines a domain-independent method based on subcategorization and alternations with a domain-dependent method utilizing statistically extracted...

Investigating complementary methods for verb sense pruning (1997)

Hongyan Jing, Vasileios Hatzivassiloglou, Rebecca Passonneau, Kathleen McKeown

We present an approach for tagging verb sense that combines a domain-independent method based on subcategorization and alternations with a domain-dependent method utilizing statistically extracted...

Building A Rich Large-Scale Lexical Base For Generation (1997)

Hongyan Jing, Kathleen R. McKeown, Rebecca Passonneau

In this paper, we describe our work in building a large-scale lexical base for generation by automatically merging existing linguistic resources to produce the links between syntactic and semantic...

Software Re-Use and Evolution in Text Generation Applications (1997)

Karen Kukich, Rebecca Passonneau, Kathleen McKeown, Dragomir Radev, Vasileios Hatzivassiloglou, Hongyan Jing

A practical goal for natural language text generation research is to converge on a separation of functions into modules that can be independently re-used. This paper addresses issues related to...

Chapter of the Association for Computational Linguistics, Madrid, Spain (1997)

Dragomir Radev, Erin Doumpoulaki, Branimir Boguraev, Gael Dias, Hongyan Jing, Mark Kantrowitz, ...

This document contains a rather incomplete bibliography of research in text summarization. The list of references was compiled using materials provided

Generating summaries of work flow diagrams (1996)

Rebecca Passonneau, Karen Kukich, Jacques Robin, Vasileios Hatzivassiloglou, Larry Lefkowitz, Hongyan Jing

FLOWDOC is a prototype text generator that summarizes information from work flow graphs in a business re-engineering context. A richer ontology than is typically used allows generalization of input...