Andreas Stolcke

IEEE TRANSACTIONS ON SPEECH & AUDIO PROCESSING: Enriching Speech Recognition with Automatic Detection of Sentence Boundaries and Disfluencies (2008)

Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Dustin Hillard, Mari Ostendorf, ...

Abstract. Effective human and automatic processing of speech requires recovery of more than just the words. It also involves recovering phenomena such as sentence boundaries, filler words, and...

Submitted to ICASSP-95 USING A STOCHASTIC CONTEXT-FREE GRAMMAR AS A LANGUAGE MODEL FOR SPEECH RECOGNITION (2008)

Daniel Jurafsky, Chuck Wooters, Jonathan Segal, Andreas Stolcke, Eric Fosler, Gary Tajchman, ...

This paper describes a number of experiments in adding new grammatical knowledge to the Berkeley Restaurant Project (BeRP), our medium-vocabulary (1300 word), speaker-independent, spontaneous...

To appear in ICSLP-94 (2008)

Daniel Jurafsky, Chuck Wooters, Gary Tajchman, Jonathan Segal, Andreas Stolcke, Eric Fosler, ...

This paper describes the architecture and performance of the Berkeley Restaurant Project (BeRP), a medium-vocabulary, speaker-independent, spontaneous continuous speech understanding system currently...

To appear in ICASSP-95 USING A STOCHASTIC CONTEXT-FREE GRAMMAR AS A LANGUAGE MODEL FOR SPEECH RECOGNITION (2008)

Daniel Jurafsky, Chuck Wooters, Jonathan Segal, Andreas Stolcke, Eric Fosler, Gary Tajchman, ...

This paper describes a number of experiments in adding new grammatical knowledge to the Berkeley Restaurant Project (BeRP), our medium-vocabulary (1300 word), speaker-independent, spontaneous...

To appear in ICGI-94 Inducing Probabilistic Grammars by Bayesian Model Merging (2008)

Andreas Stolcke, Stephen Omohundro

We describe a framework for inducing probabilistic grammars from corpora of positive samples. First, samples are incorporated by adding ad-hoc rules to a working grammar; subsequently, elements of...

Effective Acoustic Modeling for Rate-of-Speech Variation in Large Vocabulary Conversational Speech Recognition (2008)

Jing Zheng, Horacio Franco, Andreas Stolcke

We investigate several variants of speech-rate-dependent acoustic models for large-vocabulary conversational speech recognition, in the framework of combining rate-specific models in decoding to...

HUMAN LANGUAGE TECHNOLOGY: OPPORTUNITIES AND CHALLENGES (2008)

Mari Ostendorf, Elizabeth Shriberg, Andreas Stolcke

In recent years, there has been dramatic progress in both speech and language processing, in many cases leveraging some of the same underlying methods. This progress and the growing technical ties...

“TalkPrinting”: Improving Speaker Recognition by Modeling Stylistic Features (2008)

Sachin Kajarekar, Kemal Sönmez, Luciana Ferrer, Venkata Gadde, Elizabeth Shriberg, Andreas Stolcke, ...

Abstract. Automatic speaker recognition is an important technology for intelligence gathering, law enforcement, and audio mining. Conventional speaker recognition systems, which are based on...

Detecting Nonnative Speech Using Speaker Recognition Approaches (2008)

Elizabeth Shriberg, Luciana Ferrer, Sachin Kajarekar, Nicolas Scheffer, Andreas Stolcke, Murat Akbacak

Detecting whether a talker is speaking his native language is useful for speaker recognition, speech recognition, and intelligence applications. We study the problem of detecting nonnative speakers...

Speaker Clustered Regression-Class Trees for MLLR Adaptation (2008)

Arindam M, Mari Ostendorf, Andreas Stolcke

A speaker clustering algorithm is presented that is based on an eigenspace representation of Maximum Likelihood Linear Regression (MLLR) transformations and is used for training cluster-dependent...

Detecting Deception Using Critical Segments (2008)

Frank Enos, Elizabeth Shriberg, Martin Graciarena, Julia Hirschberg, Andreas Stolcke

We present an investigation of segments that map to GLOBAL LIES, that is, the intent to deceive with respect to salient topics of the discourse. We propose that identifying the truth or falsity of...

Iterative Statistical Language Model Generation for Use with an Agent-Oriented Natural Language Interface (2008)

Babak Hodjat, Horacio Franco, Harry Bratt, Kristin Precoda, Andreas Stolcke, Anand Venkataraman, ...

We describe a method for developing a statistical language model (SLM) with high keyword spotting accuracy for a natural language interface (NLI). The NLI is based on the Adaptive Agent Oriented...

fMPE-MAP: Improved Discriminative Adaptation for Modeling New Domains (2008)

Jing Zheng, Andreas Stolcke

Maximum a posteriori (MAP) adaptation and its discriminative variants, such as MMI-MAP (maximum mutual information MAP) and MPE-MAP (minimum phone error MAP), have been widely applied to acoustic...

NAP AND WCCN: COMPARISON OF APPROACHES USING MLLR-SVM SPEAKER VERIFICATION SYSTEM (2008)

Sachin S. Kajarekar, Andreas Stolcke

We compare two recently proposed techniques, within class covariance normalization (WCCN) [1] and nuisance attribute projection (NAP) [2], for intersession variability compensation in speaker...

Morph-Based Speech Recognition and Modeling of Out-of-Vocabulary Words Across Languages (2007)

Mathias Creutz, Teemu Hirsimäki, Mikko Kurimo, Antti Puurula, Janne Pylkkönen, Vesa Siivola, ...

We explore the use of morph-based language models in large-vocabulary continuous speech recognition systems across four so-called "morphologically rich" languages: Finnish, Estonian, Turkish, and...

Morph-Based SR & Modeling of OOVs Across Languages. (2007)

Mathias Creutz, Teemu Hirsimäki, Mikko Kurimo, Antti Puurula, Janne Pylkkönen, Vesa Siivola, ...

We explore the use of morph-based language models in large-vocabulary continuous-speech recognition systems across four so-called morphologically rich languages: Finnish, Estonian, Turkish, and...

Article Submitted to Computer Speech and Language (2007)

Lidia Mangu, Eric Brill, Andreas Stolcke

Finding consensus in speech recognition: word error minimization and other applications of confusion networks

Integrating prosodic and lexical cues for automatic topic segmentation (2007)

Gökhan Tür, Dilek Hakkani-Tür, Andreas Stolcke, Elizabeth Shriberg

We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topically coherent units. We propose two methods for combining lexical and...

To appear in Speech Communication (2007)

Ananth Sankar, Venkata Ramana Rao Gadde, Andreas Stolcke, Fuliang Weng

Over the last few years, the DARPA-sponsored Hub4 continuous speech recognition evaluations have pushed speech recognition technology for the very interesting and difficult task of automatically...

RATE-OF-SPEECH MODELING FOR LARGE VOCABULARY CONVERSATIONAL SPEECH RECOGNITION (2007)

Jing Zheng, Horacio Franco, Andreas Stolcke

Variations in rate of speech (ROS) produce changes in both spectral features and word pronunciations that affect automatic speech recognition (ASR) systems. To deal with these ROS effects, we propose...

Gapping and Frame Semantics: A fresh look from a cognitive perspective (2007)

Andreas Stolcke

this paper is one of the pet subjects of transformational syntax---gapping. Several examples below will show that gapping is actually affected by a combination of syntactic, semantic and pragmatic...

1 Introduction (2007)

Andreas Stolcke

A fresh look from a cognitive perspective

To appear in ICASSP-95 USING A STOCHASTIC CONTEXT-FREE GRAMMAR AS A LANGUAGE MODEL FOR SPEECH RECOGNITION (2007)

Daniel Jurafsky, Chuck Wooters, Jonathan Segal, Andreas Stolcke, Gary Tajchman, Nelson Morgan

This paper describes a number of experiments in adding new grammatical knowledge to the Berkeley Restaurant Project (BeRP), our medium-vocabulary (1300 word), speaker-independent, spontaneous...

AUTOMATIC DIALOG ACT LABELING WITH MINIMAL SUPERVISION (2007)

Anand Venkataraman, Andreas Stolcke, Elizabeth Shriberg

ABSTRACT: For many natural language applications it is desirable to be able to automatically tag utterances according to their discourse function (dialog act), such as statement, question or...

SRI International (2007)

Andreas Stolcke, Noah Coccaro, Rebecca Bates, Paul Taylor, Klaus Ries, ...

We describe a statistical approach for modeling dialogue acts in conversational speech, i.e., speech-

Submitted to ICASSP-95 USING A STOCHASTIC CONTEXT-FREE GRAMMAR AS A LANGUAGE MODEL FOR SPEECH RECOGNITION (2007)

Daniel Jurafsky, Chuck Wooters, Jonathan Segal, Andreas Stolcke, Eric Fosler, Gary Tajchman, ...

This paper describes a number of experiments in adding new grammatical knowledge to the Berkeley Restaurant Project (BeRP), our medium-vocabulary (1300 word), speaker-independent, spontaneous...

To appear in Speech Communication (2007)

Ananth Sankar, Venkata Ramana Rao Gadde, Andreas Stolcke, Fuliang Weng

Over the last few years, the DARPA-sponsored Hub-4 continuous speech recognition evaluations have advanced speech recognition technology for automatic transcription of broadcast news. In this paper,...

2 (2007)

Thilo Pfau, Andreas Stolcke

As part of a project into speech recognition in meeting environments, we have collected a corpus of multi-channel meeting recordings. We expected the identification of speaker activity to be...

Direct Modeling of Prosody: An Overview of Applications in Automatic Speech Processing (2007)

Elizabeth Shriberg, Andreas Stolcke

We describe a "direct modeling" approach to using prosody in various speech technology tasks. The approach does not involve any hand-labeling or modeling of prosodic events such as pitch...

Analysis of Morph-Based Speech Recognition and the Modeling of Out-of-Vocabulary Words Across Languages (2007)

Mathias Creutz, Teemu Hirsimäki, Mikko Kurimo, Antti Puurula, Janne Pylkkönen, Vesa Siivola, ...

We analyze subword-based language models (LMs) in large-vocabulary continuous speech recognition across four "morphologically rich" languages: Finnish, Estonian, Turkish, and Egyptian Colloquial...

Analysis of Morph-Based Speech Recognition and the Modeling of Out-of-Vocabulary Words Across Languages (2007)

Mathias Creutz, Teemu Hirsimäki, Mikko Kurimo, Antti Puurula, Janne Pylkkönen, Vesa Siivola, ...

We analyze subword-based language models (LMs) in large-vocabulary continuous speech recognition across four “morphologically rich” languages: Finnish, Estonian, Turkish, and Egyptian Colloquial...

The Meeting Project at ICSI (2007)

Nelson Morgan, Don Baron, Jane Edwards, Dan Ellis, David Gelbart, Adam Janin, ...

In collaboration with colleagues at UW, OGI, IBM, and SRI, we are developing technology to process spoken language from informal meetings. The work includes a substantial data collection and...

Abstract (2007)

Arindam Mandal, Mari Ostendorf, Andreas Stolcke, Jeffrey Bilmes, ...

This is to certify that I have examined this copy of a doctoral dissertation by

Detecting categories in news video using acoustic, speech and image features (2007)

Slav Petrov, Arlo Faria, Pascal Michaillat, Er Berg, Andreas Stolcke, Dan Klein, ...

This work describes systems for detecting semantic categories present in news video. The multimedia data was processed in three ways: the audio signal was converted to a sequence of acoustic...

Duration and Pronunciation Conditioned Lexical Modeling for Speaker Verification (2007)

Gokhan Tur, Elizabeth Shriberg, Andreas Stolcke, Sachin Kajarekar

We propose a method to improve speaker recognition lexical model performance using acoustic-prosodic information. More specifically, the lexical model is trained using duration- and...

Combining prosodic, lexical and cepstral systems for deceptive speech detection (2006)

Martin Graciarena, Elizabeth Shriberg, Andreas Stolcke, Frank Enos, Julia Hirschberg, Sachin Kajarekar

We report on machine learning experiments to distinguish deceptive from nondeceptive speech in the Columbia-SRI-Colorado (CSC) corpus. Specifically, we propose a system combination approach using...

The ICSI-SRI spring 2006 meeting recognition system (2006)

Adam Janin, Andreas Stolcke, Xavier Anguera, Kofi Boakye, Özgür Çetin, Joe Frankel, ...

Abstract. We describe the development of the ICSI-SRI speech recognition

Improvements in MLLR transform-based speaker recognition (2006)

Andreas Stolcke, Luciana Ferrer, Sachin Kajarekar

We previously proposed the use of MLLR transforms derived from a speech recognition system as speaker features in a speaker verification system [1]. In this paper we report recent improvements to...

Generalized linear kernels for one-versus-all classification: application to speaker recognition (2006)

Andrew O. Hatch, Andreas Stolcke

In this paper, we examine the problem of kernel selection for one-versus-all (OVA) classification of multiclass data with support vector machines (SVMs). We focus specifically on the problem of...

Improved Speech Activity Detection Using Cross-Channel Features for Recognition of Multiparty Meetings (2006)

Kofi Boakye, Andreas Stolcke

We describe the development of a speech activity detection system using an HMM-based segmenter for automatic speech recognition on individual headset microphones in multispeaker meetings. We look at...

Within-class Covariance Normalization for SVM-based Speaker Recognition (2006)

Andrew O. Hatch, Sachin Kajarekar, Andreas Stolcke

This paper extends the within-class covariance normalization (WCCN) technique described in [1, 2] for training generalized linear kernels. We describe a practical procedure for applying WCCN to an...

A Study in Machine Learning from Imbalanced Data for Sentence Boundary Detection in Speech (2006)

Yang Liu, Nitesh V. Chawla, Mary P. Harper, Elizabeth Shriberg, Andreas Stolcke

Enriching speech recognition output with sentence boundaries improves its human readability and enables further processing by downstream language processing modules. We have constructed a hidden...

Distinguishing Deceptive from Non-Deceptive Speech (2005)

Julia Hirschberg, Stefan Benus, Jason M. Brenier, Frank Enos, Sarah Friedman, Sarah Gilman, ...

To date, studies of deceptive speech have largely been confined to descriptive studies and observations from subjects, researchers, or practitioners, with few empirical studies of the specific...

Does active learning help automatic dialog act tagging in meeting data (2005)

Yang Liu, Elizabeth Shriberg, Andreas Stolcke

Knowledge of Dialog Acts (DAs) is important for the automatic understanding and summarization of meetings. Current approaches rely on a lot of hand labeled data to train automatic taggers. One...

Further progress in meeting recognition: The ICSI-SRI Spring 2005 speech-to-text evaluation system (2005)

Andreas Stolcke, Xavier Anguera, Kofi Boakye, Özgür Çetin, Arindam M, Chuck Wooters, ...

Abstract. We describe the development of our speech recognition system for the National Institute of Standards and Technology (NIST) Spring 2005 Meeting Rich Transcription (RT-05S) evaluation,...

Toward joint segmentation and classification of dialog acts in multiparty meetings (2005)

Matthias Zimmermann, Yang Liu, Elizabeth Shriberg, Andreas Stolcke

Abstract. We present baseline results for the joint segmentation and classification of dialog acts (DAs) of the ICSI Meeting Corpus. Two simple approaches based on word information are investigated...

Structural Metadata Research in the EARS Program (2005)

Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Barbara Peskin, Jeremy Ang, Dustin Hillard, ...

Both human and automatic processing of speech require recognition of more than just words. In this paper we provide a brief overview of research on structural metadata extraction in the DARPA EARS...

Further progress in meeting recognition: The ICSI-SRI Spring 2005 speech-to-text evaluation system (2005)

Andreas Stolcke, Xavier Anguera, Kofi Boakye, Özgür Çetin, Adam Janin, Arindam M, ...

Abstract. We describe the development of our speech recognition system for the National Institute of Standards and Technology (NIST) Spring 2005 Meeting Rich Transcription (RT-05S) evaluation,...

Comparing HMM, maximum entropy and conditional random fields for disfluency detection (2005)

Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Mary Harper

Automatic detection of disfluencies in spoken language is important for making speech recognition output more readable, and for aiding downstream language processing modules. We compare a generative...

Two experiments comparing reading with listening for human processing of conversational telephone speech (2005)

Douglas Jones, Wade Shen, Elizabeth Shriberg, Andreas Stolcke, Teresa Kamm

We report on results of two experiments designed to compare subjects’ ability to extract information from audio recordings of conversational telephone speech (CTS) with their ability to extract...

Improved Phonetic Speaker Recognition Using Lattice Decoding (2005)

Andrew O. Hatch, Barbara Peskin, Andreas Stolcke

The current “state-of-the-art” in phonetic speaker recognition uses relative frequencies of phone n-grams as features for training speaker models and for scoring test-target pairs. Typically,...

Using MLP features in SRI’s conversational speech recognition system (2005)

Qifeng Zhu, Andreas Stolcke, Barry Y. Chen, Nelson Morgan

We describe the development of a speech recognition system for conversational telephone speech (CTS) that incorporates acoustic features estimated by multilayer perceptrons (MLP). The acoustic...

Improved discriminative training using phone lattices (2005)

Jing Zheng, Andreas Stolcke

We present an efficient discriminative training procedure utilizing phone lattices. Different approaches to expediting lattice generation, statistics collection, and convergence were studied. We also...

SRI’s 2004 NIST speaker recognition evaluation system (2005)

Sachin S. Kajarekar, Luciana Ferrer, Elizabeth Shriberg, Kemal Sonmez, Andreas Stolcke, Jing Zheng

This paper describes our recent efforts in exploring longer-range features and their statistical modeling techniques for speaker recognition. In particular, we describe a system that uses discriminant...

Combining feature sets with support vector machines: Application to speaker recognition (2005)

Andrew O. Hatch, Andreas Stolcke, Barbara Peskin

In this paper, we describe a general technique for optimizing the relative weights of feature sets in a support vector machine (SVM) and show how it can be applied to the field of speaker...

Using conditional random fields for sentence boundary detection in speech (2005)

Yang Liu, Andreas Stolcke, Elizabeth Shriberg, Mary Harper

Sentence boundary detection in speech is important for enriching speech recognition output, making it easier for humans to read and downstream modules to process. In previous work, we have developed...

Morphology-based language modeling for Arabic speech recognition (2004)

Dimitra Vergyri, Katrin Kirchhoff, Kevin Duh, Andreas Stolcke

Language modeling is a difficult problem for languages with rich morphology. In this paper we investigate the use of morphology-based language models at different stages in a speech recognition...

Progress on Mandarin conversational telephone speech recognition (2004)

Mei-yuh Hwang, Xin Lei, Tim Ng, Ivan Bulyko, Mari Ostendorf, Andreas Stolcke, ...

Over the past decade, there has been good progress on English conversational telephone speech (CTS) recognition, built on the Switchboard and Fisher corpora. In this paper, we present our efforts on...

Trapping Conversational Speech: Extending TRAP/Tandem Approaches to Conversational Telephone Speech Recognition (2004)

Nelson Morgan, Barry Y. Chen, Qifeng Zhu, Andreas Stolcke

TempoRAl Patterns (TRAPs) and Tandem MLP/HMM approaches incorporate feature streams computed from longer time intervals than the conventional short-time analysis. These methods have been used for...

On Using MLP Features in LVCSR (2004)

Qifeng Zhu, Barry Chen, Nelson Morgan, Andreas Stolcke

One of the major research thrusts in the speech group at ICSI is to use Multi-Layer Perceptron (MLP) based features in automatic speech recognition (ASR). This paper presents a study of three aspects...

Progress in Meeting Recognition: The ICSI-SRI-UW Spring 2004 Evaluation System (2004)

Andreas Stolcke, Chuck Wooters, Nikki Mirghafori, Tuomo Pirinen, Ivan Bulyko, Dave Gelbart, ...

We describe the ICSI-SRI-UW team's entry in the Spring 2004 NIST Meeting Recognition Evaluation. The system was derived from SRI's 5xRT Conversational Telephone Speech (CTS) recognizer by...

The ICSI Meeting Project: Resources and Research (2004)

Adam Janin, Jeremy Ang, Sonali Bhagat, Rajdip Dhillon, Jane Edwards, Javier Macías-Guarasa, ...

This paper provides a progress report on ICSI’s Meeting Project, including both the data collected and annotated as part of the project, as well as the research lines such materials support. We...

The ICSI-SRI-UW metadata extraction system (2004)

Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Dustin Hillard, Mari Ostendorf, Barbara Peskin, ...

Both human and automatic processing of speech require recognizing more than just the words. We describe a state-of-the-art system for automatic detection of “metadata” (information beyond the...

An efficient repair procedure for quick transcriptions (2004)

Andreas Stolcke, Wen Wang, Dimitra Vergyri, Venkata Ramana Rao Gadde, Jing Zheng

We describe an efficient procedure for automatic repair of quickly transcribed (QT) speech. QT speech, typically closed captioned data from television broadcasts, usually has a significant number of...

TRAPping conversational speech: Extending TRAP/Tandem approaches to conversational telephone speech recognition (2004)

Nelson Morgan, Barry Y. Chen, Qifeng Zhu, Andreas Stolcke

TempoRAl Patterns (TRAPs) and Tandem MLP/HMM approaches incorporate feature streams computed from longer time intervals than the conventional short-time analysis. These methods have been used for...

Scaling up: Learning large-scale recognition methods from small-scale recognition tasks (2004)

Nelson Morgan, Barry Y Chen, Qifeng Zhu, Andreas Stolcke

Despite the common wisdom that lessons learned from small experimental speech recognition tasks often do not scale to larger tasks, many important algorithms used in larger tasks were first developed...

Progress in meeting recognition: The ICSI-SRI-UW Spring 2004 evaluation system (2004)

Andreas Stolcke, Chuck Wooters, Nikki Mirghafori, Tuomo Pirinen, Dave Gelbart, Martin Graciarena, ...

We describe the ICSI-SRI-UW team’s entry in the Spring 2004 NIST Meeting Recognition Evaluation. The system was derived from SRI’s 5xRT Conversational Telephone Speech (CTS) recognizer by...

Incorporating tandem/HATs MLP features into SRI’s conversational speech recognition system (2004)

Qifeng Zhu, Andreas Stolcke, Barry Y. Chen, Nelson Morgan

We describe the development of a speech recognition system for conversational telephone speech (CTS) that incorporates acoustic features estimated by multilayer perceptrons (MLPs). The acoustic...

Progress in meeting recognition: The ICSI-SRI-UW Spring 2004 evaluation system (2004)

Andreas Stolcke, Chuck Wooters, Nikki Mirghafori, Tuomo Pirinen, Ivan Bulyko, Dave Gelbart, ...

We describe the ICSI-SRI-UW team’s entry in the Spring 2004 NIST Meeting Recognition Evaluation. The system was derived from SRI’s 5xRT Conversational Telephone Speech (CTS) recognizer by...

Voicing feature integration in SRI’s Decipher LVCSR system (2004)

Martin Graciarena, Horacio Franco, Jing Zheng, Dimitra Vergyri, Andreas Stolcke

We augment the Mel cepstral (MFCC) feature representation with voicing features from an independent front end. The voicing feature front end parameters are optimized for recognition accuracy. The...

Automatic disfluency identification in conversational speech using multiple knowledge sources (2003)

Yang Liu, Elizabeth Shriberg, Andreas Stolcke

Disfluencies occur frequently in spontaneous speech. Detection and correction of disfluencies can make automatic speech recognition transcripts more readable for human readers, and can aid downstream...

The ICSI meeting corpus (2003)

Adam Janin, Don Baron, Jane Edwards, Dan Ellis, David Gelbart, Nelson Morgan, ...

We have collected a corpus of data from natural meetings that occurred at the International Computer Science Institute (ICSI) in Berkeley, California over the last three years. The corpus contains...

Class-dependent interpolation for estimating language models from multiple text sources (2003)

Ivan Bulyko, Mari Ostendorf, Andreas Stolcke

Sources of training data suitable for language modeling of conversational speech are limited. In this paper, we show how training data can be supplemented with text from the web filtered to match the...

Training a prosody-based dialog act tagger from unlabeled data (2003)

Anand Venkataraman, Luciana Ferrer, Andreas Stolcke, Elizabeth Shriberg

Dialog act tagging is an important step toward speech understanding, yet training such taggers usually requires large amounts of data labeled by linguistic experts. Here we investigate the use of...

Automatic Disfluency Identification in Conversational Speech Using Multiple Knowledge Sources (2003)

Yang Liu, Elizabeth Shriberg, Andreas Stolcke

Disfluencies occur frequently in spontaneous speech. Detection and correction of disfluencies can make automatic speech recognition transcripts more readable for human readers, and can aid downstream...

Speaker recognition using prosodic and lexical features (2003)

Sachin Kajarekar, Luciana Ferrer, Kemal Sonmez, Elizabeth Shriberg, Andreas Stolcke, Harry Bratt, ...

Conventional speaker recognition systems identify speakers by using spectral information from very short slices of speech. Such systems perform well (especially in quiet conditions), but fail to...

A Prosody-Based Approach To End-Of-Utterance Detection That Does Not Require Speech... (2003)

Luciana Ferrer, Elizabeth Shriberg, Andreas Stolcke

In previous work we showed that state-of-the-art end-of-utterance detection (as used, for example, in dialog systems) can be improved significantly by making use of prosodic and/or language models...

Prosodic Knowledge Sources For Automatic Speech Recognition (2003)

Dimitra Vergyri, Andreas Stolcke, Luciana Ferrer, Elizabeth Shriberg

In this work, different prosodic knowledge sources are integrated into a state-of-the-art large vocabulary speech recognition system. Prosody manifests itself on different levels in the speech...

Prosody modeling for automatic speech recognition and understanding (2002)

Elizabeth Shriberg, Andreas Stolcke

Abstract. This paper summarizes statistical modeling approaches for the use of prosody (the rhythm and melody of speech) in automatic recognition and understanding of speech. We outline effective...

Prosody-based automatic detection of annoyance and frustration in human-computer dialog (2002)

Jeremy Ang, Rajdip Dhillon, Ashley Krupski, Elizabeth Shriberg, Andreas Stolcke

We investigate the use of prosody for the detection of frustration and annoyance in natural human-computer dialog. In addition to prosodic features, we examine the contribution of language model...

Automatic punctuation and disfluency detection in multi-party meetings using prosodic and lexical cues (2002)

Don Baron, Elizabeth Shriberg, Andreas Stolcke

We investigate automatic approaches to finding “hidden” spontaneous speech events, such as sentence boundaries and disfluencies, in multi-party meetings. Hidden events are characterized...

Building an ASR system for noisy environments: SRI’s 2001 SPINE evaluation system (2002)

Venkata Ramana Rao Gadde, Andreas Stolcke, Dimitra Vergyri, Jing Zheng, Kemal Sonmez

We describe SRI’s recognition system as used in the 2001 DARPA Speech in Noisy Environments (SPINE) evaluation. The SPINE task involves recognition of speech in simulated military environments. The...

SRILM - An Extensible Language Modeling Toolkit (2002)

Andreas Stolcke

SRILM is a collection of C++ libraries, executable programs, and helper scripts designed to allow both production of and experimentation with statistical language models for speech recognition and...

Is The Speaker Done Yet? Faster And More ... (2002)

Luciana Ferrer, Elizabeth Shriberg, Andreas Stolcke

We examine the problem of end-of-utterance (EOU) detection for real-time speech recognition, particularly in the context of a human-computer dialog system. Current EOU detection algorithms use only a...

Prosody Modeling For Automatic Speech Recognition And Understanding (2002)

Elizabeth Shriberg, Andreas Stolcke

This paper summarizes statistical modeling approaches for the use of prosody (the rhythm and melody of speech) in automatic recognition and understanding of speech. We outline effective prosodic...

Automatic Dialog Act Labeling With Minimal Supervision (2002)

Anand Venkataraman, Andreas Stolcke, Elizabeth Shriberg

For many natural language applications it is desirable to be able to automatically tag utterances according to their discourse function (dialog act), such as statement, question or acknowledgment. We...

Building an ASR System for Noisy Environments: SRI's 2001 SPINE Evaluation System (2002)

Venkata Ramana Rao Gadde, Andreas Stolcke, Dimitra Vergyri, Jing Zheng, Kemal Sonmez

We describe SRI's recognition system as used in the 2001 DARPA Speech in Noisy Environments (SPINE) evaluation. The SPINE task involves recognition of speech in simulated military environments....

DynaSpeak: SRI's Scalable Speech Recognizer for Embedded And Mobile ... (2002)

Horacio Franco, Jing Zheng, John Butzberger, Federico Cesari, Michael Fr, ...

We introduce SRI's new speech recognition engine, DynaSpeak, which is characterized by its scalability and flexibility, high recognition accuracy, memory and speed efficiency, adaptation capability,...

Modeling Word-Level Rate-of-Speech Variation In Large ... (2002)

Jing Zheng, Horacio Franco, ...

Variations in rate of speech (ROS) produce variations in both spectral features and word pronunciations that affect automatic speech recognition systems. To deal with these ROS effects, we propose to...

Can Prosody Aid the Automatic Processing of Multi-Party Meetings? Evidence from Predicting Punctuation, Disfluencies, and Overlapping Speech (2001)

Elizabeth Shriberg, Andreas Stolcke, Don Baron

We investigate whether probabilistic modeling of prosody can aid various automatic labeling tasks essential for processing of multi-party meetings. Task 1, automatic punctuation, seeks to classify...

Prosody modeling for automatic speech understanding: an overview of recent research at SRI (2001)

Elizabeth Shriberg, Andreas Stolcke

Prosody has long been studied as an important knowledge source for speech understanding. In recent years there has been a large amount of computational work aimed at prosodic...

The meeting project at ICSI (2001)

Nelson Morgan, Don Baron, Jane Edwards, Dan Ellis, David Gelbart, Adam Janin, ...

In collaboration with colleagues at UW, OGI, IBM, and SRI, we are developing technology to process spoken language from informal meetings. The work includes a substantial data collection and...

Observations on overlap: findings and implications for automatic processing of multi-party conversation (2001)

Elizabeth Shriberg, Andreas Stolcke, Don Baron

We examine the distribution of overlapping speech in different corpora of natural multi-party conversations, including two types of meetings, and two corpora of telephone conversations. Analyses are...

Multispeaker Speech Activity Detection For The ICSI ... (2001)

Thilo Pfau, Andreas Stolcke

As part of a project into speech recognition in meeting environments, we have collected a corpus of multi-channel meeting recordings. We expected the identification of speaker activity to be...

Improved Maximum Mutual Information Estimation Training of Continuous Density HMMs (2001)

Jing Zheng, John Butzberger, Horacio Franco, Andreas Stolcke

In maximum mutual information estimation (MMIE) training, the currently widely used update equations derive from the Extended Baum-Welch (EBW) algorithm, which was originally designed for the...

Integrating prosodic and lexical cues for automatic topic segmentation (2001)

Gökhan Tür, Dilek Hakkani-tür, Andreas Stolcke, Elizabeth Shriberg

We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topically coherent units. We propose two...

Dialogue act modeling for automatic tagging and recognition of conversational speech (2000)

Stolcke, Andreas, Coccaro, Noah, Bates, Rebecca, Taylor, Paul, Ries, Klaus, ...

We describe a statistical approach for modeling dialogue acts in conversational speech, i.e., speech-act-like units such as STATEMENT, QUESTION, BACKCHANNEL, AGREEMENT, DISAGREEMENT, and APOLOGY. Our...

Dialogue act modeling for automatic tagging and recognition of conversational speech (2000)

Andreas Stolcke, Klaus Ries, Noah Coccaro, Elizabeth Shriberg, Rebecca Bates, Daniel Jurafsky, ...

We describe a statistical approach for modeling dialogue acts in conversational speech, i.e., speech-act-like...

Dialogue act modeling for automatic tagging and recognition of conversational speech (2000)

Andreas Stolcke, Klaus Ries, Noah Coccaro, Elizabeth Shriberg, Rebecca Bates, Daniel Jurafsky, ...

We describe a statistical approach for modeling dialogue acts in conversational speech, i.e., speech-act-like...

Rate-dependent acoustic modeling for large vocabulary conversational speech recognition (2000)

Jing Zheng, Horacio Franco, Andreas Stolcke

Variations in rate of speech (ROS) produce changes in both spectral features and word pronunciations that affect automatic speech recognition (ASR) systems. To deal with these ROS effects, we propose...

Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech (2000)

Andreas Stolcke, Klaus Ries, Noah Coccaro, Elizabeth Shriberg, Rebecca Bates, Daniel Jurafsky, ...

this article is twofold: On the one hand, we aim to present a comprehensive framework for modeling and automatic classification of DAs, founded on well-known statistical methods. In doing so, we will...

Dialog act modeling for automatic tagging and recognition of conversational speech (2000)

Andreas Stolcke, Klaus Ries, Noah Coccaro, Elizabeth Shriberg, Rebecca Bates, Dan Jurafsky, ...

We describe a statistical approach for modeling dialog acts in conversational speech, i.e., speech-act-like...

Prosody-based automatic segmentation of speech into sentences and topics (2000)

Elizabeth Shriberg, Andreas Stolcke, Dilek Hakkani-tür, Gökhan Tür

A crucial step in processing speech audio data for information extraction, topic detection, or browsing/playback is to segment the input into sentence and topic units. Speech segmentation is...

Combining Words and Prosody for Information Extraction from Speech (1999)

Dilek Hakkani-tür, Gökhan Tür, Andreas Stolcke, Elizabeth Shriberg

Information extraction from speech is a crucial step on the way from speech recognition to speech understanding. A preliminary step toward speech understanding is the detection of topic boundaries,...

Combining words and speech prosody for automatic topic segmentation (1999)

Andreas Stolcke, Elizabeth Shriberg, Dilek Hakkani-tür, Gökhan Tür, Kemal Sönmez

We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topic units. The approach combines hidden Markov models, statistical language...

Combining Words and Prosody for Information Extraction from Speech (1999)

Dilek Hakkani-tur, Gokhan Tur, Andreas Stolcke, Elizabeth Shriberg

Information extraction from speech is a crucial step on the way from speech recognition to speech understanding. A preliminary step toward speech understanding is the detection of topic boundaries,...

Modeling The Prosody Of Hidden Events For Improved Word Recognition (1999)

Andreas Stolcke, Elizabeth Shriberg, Dilek Hakkani-Tur, Gokhan Tur

We investigate a new approach for using speech prosody as a knowledge source for speech recognition. The idea is to penalize word hypotheses that are inconsistent with prosodic features such as...

Combining Words and Speech Prosody for Automatic Topic Segmentation (1999)

Andreas Stolcke, Elizabeth Shriberg, Dilek Hakkani-Tür, Gökhan Tür, Ze'ev Rivlin, ...

We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topic units. The approach combines hidden Markov models, statistical language...

Maestro: Conductor Of Multimedia Analysis Technologies (1999)

Ze'ev Rivlin, Robert Bolles, Adam Cheyer, Dilek Hakkani-tür, David Israel, Luc Julia, ...

is technologies --- for example, speech recognition, image understanding, and optical character recognition --- to the indexing and retrieval of multimedia. Informedia [1] and Broadcast News...

Modeling the prosody of hidden events for improved word recognition (1999)

Andreas Stolcke, Elizabeth Shriberg, Dilek Hakkani-tür, Gökhan Tür

We investigate a new approach for using speech prosody as a knowledge source for speech recognition. The idea is to penalize word hypotheses that are inconsistent with prosodic features such as...

Human Language Technology: Opportunities and Challenges (1998)

Ostendorf, Mari, Shriberg, Elizabeth, Stolcke, Andreas

In recent years, there has been dramatic progress in both speech and language processing, in many cases leveraging some of the same underlying methods. This progress and the growing technical ties...

Structural Metadata Research in the Ears Program (1998)

Liu, Yang, Shriberg, Elizabeth, Stolcke, Andreas, Peskin, Barbara, Ang, Jeremy, Hillard, Dustin, ...

Both human and automatic processing of speech require recognition of more than just words. In this paper we provide a brief overview of research on structural metadata extraction in the DARPA EARS...

Toward Joint Segmentation and Classification of Dialog Acts in Multiparty Meetings (1998)

Zimmermann, Matthias, Liu, Yang, Shriberg, Elizabeth, Stolcke, Andreas

The authors present baseline results for the joint segmentation and classification of dialog acts (DAs) of the International Computer Science Institute (ICSI) Meeting Corpus. Two simple approaches...

Dialog act modelling for conversational speech (1998)

Stolcke, Andreas, Shriberg, Elizabeth, Bates, Rebecca, Coccaro, Noah, Jurafsky, Daniel, Martin, Rachel, ...

We describe an integrated approach for statistical modeling of discourse structure for natural conversational speech. Our model is based on 42 ‘dialog acts’ (e.g., Statement, Question, Backchannel,...

Can Prosody Aid the Automatic Classification of Dialog Acts in Conversational Speech? (1998)

Shriberg, Elizabeth, Bates, Rebecca, Taylor, Paul, Stolcke, Andreas, Jurafsky, Daniel, Ries, Klaus, ...

Identifying whether an utterance is a statement, question, greeting, and so forth is integral to effective automatic understanding of natural dialog. Little is known, however, about how such dialog...

Entropy-based pruning of backoff language models (1998)

Andreas Stolcke

A criterion for pruning parameters from N-gram backoff language models is developed, based on the relative entropy between the original and the pruned model. It is shown that the relative entropy...

The development of SRI's 1997 Broadcast News transcription system (1998)

Ananth Sankar, Fuliang Weng, Andreas Stolcke, Ramana Rao Gadde

This paper describes SRI's 1997 broadcast news transcription system used for the 1997 DARPA H4 evaluations. Our system had several novel components. These include automatic segmentation of...

New Developments in Lattice-Based Search Strategies in SRI's Hub4 System (1998)

Fuliang Weng, Andreas Stolcke, Ananth Sankar

We describe new developments in SRI's lattice-based progressive search strategy. These developments include the implementation of a new bigram lattice algorithm, lattice optimization techniques,...

How Far Do Speakers Back Up In Repairs? A Quantitative Model (1998)

Elizabeth Shriberg, Andreas Stolcke

Speakers frequently retrace one or more words when continuing after a break in fluency. Syntactic principles constrain the points from which speakers retrace; however syntactic principles do not...

Dialog Act Modeling for Conversational Speech (1998)

Andreas Stolcke, Elizabeth Shriberg, Noah Coccaro, Daniel Jurafsky, Rachel Martin, ...

We describe an integrated approach for statistical modeling of discourse structure for natural conversational speech. Our model is based on 42 'dialog acts' (e.g., Statement, Question,...

Development of SRI's 1997 Broadcast News Transcription System (1998)

Ananth Sankar, Fuliang Weng, Andreas Stolcke, Ramana Rao Gadde

This paper describes SRI's 1997 broadcast news transcription system used for the 1997 DARPA H4 evaluations. Our system had several novel components. These include automatic segmentation of entire...

Entropy-based Pruning of Backoff Language Models (1998)

Andreas Stolcke

A criterion for pruning parameters from N-gram backoff language models is developed, based on the relative entropy between the original and the pruned model. It is shown that the relative entropy...

Efficient Lattice Representation and Generation (1998)

Fuliang Weng, Andreas Stolcke, Ananth Sankar

In large-vocabulary, multi-pass speech recognition systems, it is desirable to generate word lattices incorporating a large number of hypotheses while keeping the lattice sizes small. We describe two...

Dialog Act Modeling for Conversational Speech (1998)

Andreas Stolcke, Elizabeth Shriberg, Noah Coccaro, Daniel Jurafsky, Rachel Martin, ...

We describe an integrated approach for statistical modeling of discourse structure for natural conversational speech. Our model is based on 42 'dialog acts' (statement, question, backchannel,...

Automatic Detection Of Sentence Boundaries And Disfluencies Based On Recognized Words (1998)

Andreas Stolcke, Elizabeth Shriberg, Rebecca Bates, Mari Ostendorf, Dilek Hakkani, Madelaine Plauche, ...

We study the problem of detecting linguistic events at interword boundaries, such as sentence boundaries and disfluency locations, in speech transcribed by an automatic recognizer. Recovering such...

Automatic detection of discourse structure for speech recognition and understanding (1997)

Jurafsky, Daniel, Bates, Rebecca, Coccaro, Noah, Martin, Rachel, Meteer, Marie, Ries, Klaus, ...

We describe a new approach for statistical modeling and detection of discourse structure for natural conversational speech. Our model is based on 42 ‘Dialog Acts’ (DAs), (question, answer,...

Hub4 Language Modeling Using Domain Interpolation and Data Clustering (1997)

Fuliang Weng, Andreas Stolcke, Ananth Sankar

In SRI's language modeling experiments for the Hub4 domain, three basic approaches were pursued: interpolating multiple models estimated from Hub4 and non-Hub4 training data, adapting the...

A prosody-only decision-tree model for disfluency detection (1997)

Elizabeth Shriberg, Rebecca Bates, Andreas Stolcke

Speech disfluencies (filled pauses, repetitions, repairs, and false starts) are pervasive in spontaneous speech. The ability to detect and correct disfluencies automatically is important for...

Acoustic modeling for the SRI Hub4 partitioned evaluation continuous speech recognition system (1997)

Ananth Sankar, Larry Heck, Andreas Stolcke

We describe the development of the SRI system evaluated in the 1996 DARPA continuous speech recognition (CSR) Hub4 partitioned evaluation (PE). The task for the Hub4 evaluation was to recognize speech...

Acoustic Modeling for the SRI Hub4 Partitioned Evaluation Continuous Speech Recognition System (1997)

Ananth Sankar, Larry Heck, Andreas Stolcke

We describe the development of the SRI system evaluated in the 1996 DARPA continuous speech recognition (CSR) Hub4 partitioned evaluation (PE). The task for the Hub4 evaluation was to recognize speech...

Dependency Language Modeling (1997)

Andreas Stolcke, Ciprian Chelba, David Engle, Victor Jimenez, Lidia Mangu, Harry Printz, ...

This report summarizes the work of the Dependency Language Modeling group at the 1996 Summer Speech Workshop at the Center for Language and Speech Processing at Johns Hopkins University (WS96). We...

Structure And Performance Of A Dependency Language Model (1997)

Ciprian Chelba, David Engle, Frederick Jelinek, Victor Jimenez, Sanjeev Khudanpur, Lidia Mangu, ...

We present a maximum entropy language model that incorporates both syntax and semantics via a dependency grammar. Such a grammar expresses the relations between words by a directed graph. Because the...

Structure and Performance of a Dependency Language Model (1997)

Ciprian Chelba, David Engle, Frederick Jelinek, Victor Jimenez, Sanjeev Khudanpur, Lidia Mangu, ...

We present a maximum entropy language model that incorporates both syntax and semantics via a dependency grammar. Such a grammar expresses the relations between words by a directed graph. Because the...

A Study Of Multilingual Speech Recognition (1997)

Fuliang Weng, Harry Bratt, Leonardo Neumeyer, Andreas Stolcke

This paper describes our work in developing multilingual (Swedish and English) speech recognition systems in the ATIS domain. The acoustic component of the multilingual systems is realized through...

Neural-Network Based Measures Of Confidence For Word Recognition (1997)

Mitch Weintraub, Françoise Beaufays, Ze'ev Rivlin, Yochai Konig, Andreas Stolcke

This paper proposes a probabilistic framework to define and evaluate confidence measures for word recognition. We describe a novel method to combine different knowledge sources and estimate the...

A Prosody-Only Decision-Tree Model For Disfluency Detection (1997)

Elizabeth Shriberg, Rebecca Bates, Andreas Stolcke

Speech disfluencies (filled pauses, repetitions, repairs, and false starts) are pervasive in spontaneous speech. The ability to detect and correct disfluencies automatically is important for...

Modeling Linguistic Segment And Turn Boundaries For N-Best Rescoring Of Spontaneous Speech (1997)

Andreas Stolcke

Language modeling, especially for spontaneous speech, often suffers from a mismatch of utterance segmentations between training and test conditions. In particular, training often uses...

Linguistic knowledge and empirical methods in speech recognition (1997)

Andreas Stolcke

Explicit Word Error Minimization in N-Best List Rescoring (1997)

Andreas Stolcke, Yochai König, Mitchel Weintraub

We show that the standard hypothesis scoring paradigm used in maximum-likelihood-based speech recognition systems is not optimal with regard to minimizing the word error rate, the commonly used...

Modeling Pitch Range Variation within and across Speakers: Predicting F0 Targets when "Speaking Up" (1996)

Elizabeth Shriberg, D. Robert Ladd, Jacques Terken, Andreas Stolcke

We study F0 variation produced by “speaking up”, as part of a larger study of pitch range variation within and across speakers [1]. We provide a function to predict target F0 values in this...

Automatic Linguistic Segmentation Of Conversational Speech (1996)

Andreas Stolcke, Elizabeth Shriberg

As speech recognition moves toward more unconstrained domains such as conversational speech, we encounter a need to be able to segment (or resegment) waveforms and recognizer output into...

Statistical Language Modeling For Speech Disfluencies (1996)

Andreas Stolcke, Elizabeth Shriberg

Speech disfluencies (such as filled pauses, repetitions, restarts) are among the characteristics distinguishing spontaneous speech from planned or read speech. We introduce a language model that...

L_0 - The First Five Years of an Automated Language Acquisition Project (1996)

Jerome Feldman, George Lakoff, David Bailey, Srini Narayanan, Terry Regier, ...

The L0 project at ICSI and UC Berkeley attempts to combine not only vision and natural language modelling, but also learning. The original task was put forward in (Feldman et al. 1990a) as a...

L_0 - The First Five Years of an Automated Language Acquisition Project (1996)

Jerome Feldman, George Lakoff, David Bailey, Srini Narayanan, Terry Regier, Andreas Stolcke

The L0 project at ICSI and UC Berkeley attempts to combine not only vision and natural language modelling, but also learning. The original task was put forward in (Feldman et al. 1990a) as a...

Modeling Pitch Range Variation Within And Across Speakers: Predicting F_0 Targets When "SPEAKING UP" (1996)

Elizabeth Shriberg, D. Robert Ladd, Jacques Terken, Andreas Stolcke

We study F0 variation produced by "speaking up", as part of a larger study of pitch range variation within and across speakers [1]. We provide a function to predict target F0 values in this...

Word Predictability After Hesitations: A Corpus-Based Study (1996)

Elizabeth Shriberg, Andreas Stolcke

We ask whether lexical hesitations in spontaneous speech tend to precede words that are difficult to predict. We define predictability in terms of both transition probability and entropy, in the...

An efficient probabilistic context-free parsing algorithm that computes prefix probabilities (1995)

Andreas Stolcke

We describe an extension of Earley's parser for stochastic context-free grammars that computes the following quantities given a stochastic context-free grammar and an input string: a)...

A Natural Law of Succession (1995)

Eric Sven Ristad, Andreas Stolcke, Robert Thomas, Kenji Yamanishi

Consider the problem of multinomial estimation. You are given an alphabet of distinct symbols and are told the frequency with which each symbol occurred in the past. On the basis of this information...

Partitioning Grammars and composing Parsers (1995)

Fuliang Weng, Andreas Stolcke

GLR parsers have been criticized by various authors for their potentially large sizes. Other parsers also have individual weaknesses and strengths. Our heterogeneous parsing algorithm is based on GLR...

Partitioning Grammars and Composing Parsers (1995)

Fuliang Weng, Andreas Stolcke

this paper are: a general schema for partitioning a grammar into sub-grammars, and the combination of parsers for sub-grammars into an overall parser that yields the same parses as one for the...

An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities (1995)

Andreas Stolcke

We describe an extension of Earley's parser for stochastic context-free grammars that computes the following quantities given a stochastic context-free grammar and an input string: a)...

Using A Stochastic Context-Free Grammar As A Language Model For Speech Recognition (1995)

Daniel Jurafsky, Chuck Wooters, Jonathan Segal, Andreas Stolcke, Eric Fosler, Gary Tajchman, ...

This paper describes a number of experiments in adding new grammatical knowledge to the Berkeley Restaurant Project (BeRP), our medium-vocabulary (1300 word), speaker-independent, spontaneous...

A Natural Law of Succession (1995)

Eric Sven Ristad, Andreas Stolcke, Robert Thomas, Kenji Yamanishi

Consider the following problem. You are given an alphabet of k distinct symbols and are told that the i th symbol occurred exactly n i times in the past. On the basis of this information alone, you...

An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities (1995)

Andreas Stolcke

this article can compute solutions to all four of these problems in a single framework, with a number of additional advantages over previously presented isolated solutions. Most probabilistic parsers are...

An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities (1994)

Stolcke, Andreas

We describe an extension of Earley's parser for stochastic context-free grammars that computes the following quantities given a stochastic context-free grammar and an input string: a) probabilities...

Inducing Probabilistic Grammars by Bayesian Model Merging (1994)

Stolcke, Andreas, Omohundro, Stephen M.

We describe a framework for inducing probabilistic grammars from corpora of positive samples. First, samples are incorporated by adding ad-hoc rules to a working grammar; subsequently, elements...

Precise n-gram Probabilities from Stochastic Context-free Grammars (1994)

Stolcke, Andreas, Segal, Jonathan

We present an algorithm for computing n-gram probabilities from stochastic context-free grammars, a procedure that can alleviate some of the standard problems associated with n-grams (estimation from...

Best-first Model Merging for Hidden Markov Model Induction (1994)

Stolcke, Andreas, Omohundro, Stephen M.

This report describes a new technique for inducing the structure of Hidden Markov Models from data which is based on the general 'model merging' strategy (Omohundro 1992). The process begins with a...

Bayesian learning of probabilistic language models (1994)

Stolcke, Andreas

Thesis (Ph. D. in Electrical Engineering and Computer Science)--University of California, Berkeley, May 1994.

Precise n-gram Probabilities from Stochastic Context-free Grammars (1994)

Andreas Stolcke, Jonathan Segal

We present an algorithm for computing n-gram probabilities from stochastic context-free grammars, a procedure that can alleviate some of the standard problems associated with n-grams (estimation from...

Best-first Model Merging for Hidden Markov Model Induction (1994)

Andreas Stolcke, Stephen M. Omohundro

This report describes a new technique for inducing the structure of Hidden Markov Models from data which is based on the general 'model merging' strategy (Omohundro 1992). The process begins...

Bayesian Learning of Probabilistic Language Models (1994)

Andreas Stolcke

The general topic of this thesis is the probabilistic modeling of language, in particular natural language. In probabilistic language modeling, one characterizes the strings of phonemes, words, etc....

Inducing Probabilistic Grammars by Bayesian Model Merging (1994)

Andreas Stolcke, Stephen Omohundro

We describe a framework for inducing probabilistic grammars from corpora of positive samples. First, samples are incorporated by adding ad-hoc rules to a working grammar; subsequently, elements of...

Multiple-Pronunciation Lexical Modeling In A Speaker Independent Speech Understanding System (1994)

Chuck Wooters, Andreas Stolcke

One of the sources of difficulty in speech recognition and understanding is the variability due to alternate pronunciations of words. To address the issue we have investigated the use of...

The Berkeley Restaurant Project (1994)

Daniel Jurafsky, Chuck Wooters, Gary Tajchman, Jonathan Segal, Andreas Stolcke, Eric Fosler, ...

This paper describes the architecture and performance of the Berkeley Restaurant Project (BeRP), a medium-vocabulary, speaker-independent, spontaneous continuous speech understanding system currently...

Integrating experimental models of syntax, phonology, and accent/dialect in a speech recognizer. An investigation of tightly coupled time synchronous speech (1994)

Daniel Jurafsky, Chuck Wooters, Jonathan Segal, Andreas Stolcke, Nelson Morgan

As the field of speech understanding matures, and particularly as the quality of front-end and phonetic components improves, researchers have begun to explore ways to add new kinds of language...

Hidden Markov Model Induction by Bayesian Model Merging (1993)

Andreas Stolcke, Stephen Omohundro

This paper describes a technique for learning both the number of states and the topology of Hidden Markov Models from examples. The induction process starts with the most specific model consistent...

Tree Matching with Recursive Distributed Representations (1992)

Andreas Stolcke, Dekai Wu

We present an approach to the structure unification problem using distributed representations of hierarchical objects. Binary trees are encoded using the recursive auto-association method (RAAM), and...

Sather Language Design and Performance Evaluation (1991)

Andreas Stolcke

Sather is an object-oriented language recently designed and implemented at the International Computer Science Institute in Berkeley. It compiles into C and is intended to allow development of...

Syntactic Category Formation with Vector Space Grammars (1991)

Andreas Stolcke

A method for deriving phrase structure categories from structured samples of a context-free language is presented. The learning algorithm is based on adaptation and competition, as well as error...

Miniature Language Acquisition: A touchstone for cognitive science (1990)

Jerome Feldman, George Lakoff, Andreas Stolcke, Susan Hollbach Weber

Cognitive Science, whose genesis was interdisciplinary, shows signs of reverting to a disjoint collection of fields. This paper presents a compact, theory-free task that inherently requires an...

L0: A Testbed for Miniature Language Acquisition (1990)

Susan Hollbach Weber, Andreas Stolcke

L0 constitutes a recent effort in Cognitive Science to build a natural language acquisition system for a limited visual domain. As a preparatory step towards addressing the issue of learning in this...

Learning Feature-based Semantics with Simple Recurrent Networks (1990)

Andreas Stolcke

The paper investigates the possibilities for using simple recurrent networks as transducers which map sequential natural language input into non-sequential feature-based semantics. The networks...

Miniature Language Acquisition: A touchstone for cognitive science (1990)

Jerome A. Feldman, George Lakoff, Andreas Stolcke, Susan Hollbach Weber

Cognitive Science, whose genesis was interdisciplinary, shows signs of reverting to a disjoint collection of fields. This paper presents a compact, theory-free task that inherently requires an...

Unification as Constraint Satisfaction in Structured Connectionist Networks (1989)

Andreas Stolcke

Unification is a basic concept in several traditional symbolic formalisms that should be well-suited for a connectionist implementation due to the intuitive nature of the notions it formalizes. It is...

A Connectionist Model of Unification (1989)

Andreas Stolcke

A general approach to encode and unify recursively nested feature structures in connectionist networks is described. The unification algorithm implemented by the net is based on iterative coarsening...

Speaker recognition with session variability normalization based on MLLR adaptation transforms (2007)

Andreas Stolcke, Sachin S. Kajarekar, Luciana Ferrer, Elizabeth Shriberg

We present a new modeling approach for speaker recognition that uses the maximum-likelihood linear regression (MLLR) adaptation transforms employed by a speech recognition system as...

The Berkeley Restaurant Project

Daniel Jurafsky, Chuck Wooters, Gary Tajchman, Jonathan Segal, Andreas Stolcke, Eric Fosler, ...

This paper describes the architecture and performance of the Berkeley Restaurant Project (BeRP), a medium-vocabulary, speaker-independent, spontaneous continuous speech understanding system currently...

Finding Consensus Among Words: Lattice-Based Word Error Minimization

Lidia Mangu, Eric Brill, Andreas Stolcke

We describe a new algorithm for finding the hypothesis in a recognition lattice that is expected to minimize the word error rate (WER). Our approach thus overcomes the mismatch between the word-based...

Processing Unification-based Grammars in a Connectionist Framework

Andreas Stolcke

We present an approach to the processing of unification-based grammars in the connectionist paradigm. The method involves two basic steps: (1) Translation of a grammar's rules into a set of...