Sébastien Cuendet, Dilek Hakkani-Tür, Elizabeth Shriberg
In conversational speech, irregularities in the speech such as overlaps and disruptions make it difficult to decide what is a sentence. Thus, despite very precise guidelines on how to label...
Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Dustin Hillard, Mari Ostendorf, ...
Effective human and automatic processing of speech requires recovery of more than just the words. It also involves recovering phenomena such as sentence boundaries, filler words, and...
Frank Enos, Stefan Benus, Robin L. Cautin, Martin Graciarena, Julia Hirschberg, Elizabeth Shriberg
Previous studies of human performance in deception detection have found that humans generally are quite poor at this task, comparing unfavorably even to the performance of automated procedures....
Cross-Genre Feature Comparisons for Spoken Sentence Segmentation (2008)
Sébastien Cuendet, Dilek Hakkani-Tür, Elizabeth Shriberg, James Fung, Benoit Favre
Automatic sentence segmentation of spoken language is an important precursor to downstream natural language processing. Previous studies combine lexical and prosodic features, but can impose...
Automatic speech recognition has improved dramatically over the past few decades. The goal of such systems is typically only to output a simple stream of words. Humans, however, use information...
Acknowledgements: We thank the organizers and sponsors of the WS97 Workshop on Innovative Techniques
Daniel Jurafsky, Elizabeth Shriberg, Barbara Fox, Traci Curl
The structure of a discourse is reflected in many aspects of its linguistic realization, including its lexical, prosodic, syntactic, and semantic nature. Multiparty dialog contains a particular kind...
HUMAN LANGUAGE TECHNOLOGY: OPPORTUNITIES AND CHALLENGES (2008)
Mari Ostendorf, Elizabeth Shriberg, Andreas Stolcke
In recent years, there has been dramatic progress in both speech and language processing, in many cases leveraging some of the same underlying methods. This progress and the growing technical ties...
Özgür Çetin, Elizabeth Shriberg
In previous work we found that automatic speech recognition (ASR) results on meetings show interesting patterns with respect to speaker overlaps, including a robust asymmetry in word error rates...
“TalkPrinting”: Improving Speaker Recognition by Modeling Stylistic Features (2008)
Sachin Kajarekar, Kemal Sönmez, Luciana Ferrer, Venkata Gadde, Elizabeth Shriberg, Andreas Stolcke, ...
Automatic speaker recognition is an important technology for intelligence gathering, law enforcement, and audio mining. Conventional speaker recognition systems, which are based on...
Detecting Nonnative Speech Using Speaker Recognition Approaches (2008)
Elizabeth Shriberg, Luciana Ferrer, Sachin Kajarekar, Nicolas Scheffer, Andreas Stolcke, Murat Akbacak
Detecting whether a talker is speaking his native language is useful for speaker recognition, speech recognition, and intelligence applications. We study the problem of detecting nonnative speakers...
Detecting Deception Using Critical Segments (2008)
Frank Enos, Elizabeth Shriberg, Martin Graciarena, Julia Hirschberg, Andreas Stolcke
We present an investigation of segments that map to GLOBAL LIES, that is, the intent to deceive with respect to salient topics of the discourse. We propose that identifying the truth or falsity of...
A Smoothing Kernel for Spatially Related Features and Its Application to Speaker Verification (2008)
Luciana Ferrer, Kemal Sönmez, Elizabeth Shriberg
Most commonly used kernels are invariant to permutations of the feature vector components. This characteristic may make machine learning methods that use such kernels suboptimal in cases where the...
Elizabeth Shriberg
We describe a novel approach to modeling idiosyncratic prosodic behavior for automatic speaker recognition. The approach computes various duration, pitch, and energy features for each estimated...
Integrating prosodic and lexical cues for automatic topic segmentation (2007)
Gökhan Tür, Dilek Hakkani-Tür, Andreas Stolcke, Elizabeth Shriberg
We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topically coherent units. We propose two methods for combining lexical and...
Carlos Teixeira, Horacio Franco, Elizabeth Shriberg, Kemal Sönmez, Kristin Precoda
Predicting the degree of nativeness of a student utterance is an important issue in computer-aided language learning. This task has been addressed by many studies focusing on the segmental assessment...
Can Prosody Aid the Automatic Classification of Dialog Acts in Conversational Speech? (2007)
Anu Erringer, Michelle Gregory, Lori Heintzelman, Taimi Metzler, Amma Oduro, ...
Nock for assistance with data resources and recognizer software. We are grateful to Susann LuperFoy, Nigel Ward, James Allen, Julia Hirschberg, and Marilyn Walker for advice on the design of the...
Anand Venkataraman, Andreas Stolcke, Elizabeth Shriberg
For many natural language applications it is desirable to be able to automatically tag utterances according to their discourse function (dialog act), such as statement, question or...
A System for Labeling Self-Repairs in Speech (2007)
John Bear, John Dowding, Elizabeth Shriberg, Patti Price
This document outlines a system for labeling self-repairs in spontaneous speech. The system marks the location and extent of a repair, as well as relevant words in the region of the repair. Together...
Direct Modeling of Prosody: An Overview of Applications in Automatic Speech Processing (2007)
Elizabeth Shriberg, Andreas Stolcke
We describe a "direct modeling" approach to using prosody in various speech technology tasks. The approach does not involve any hand-labeling or modeling of prosodic events such as pitch...
Comparing Evaluation Metrics for Sentence Boundary Detection (2007)
In recent NIST evaluations on sentence boundary detection, a single error metric was used to describe performance. Additional metrics, however, are available for such tasks, in which a word stream is...
Duration and Pronunciation Conditioned Lexical Modeling for Speaker Verification (2007)
Gokhan Tur, Elizabeth Shriberg, Andreas Stolcke, Sachin Kajarekar
We propose a method to improve speaker recognition lexical model performance using acoustic-prosodic information. More specifically, the lexical model is trained using duration- and...
Higher-Level Features in Speaker Recognition, in Speaker Classification I (2007)
Higher-level features based on linguistic or long-range information have attracted significant attention in automatic speaker recognition. This article briefly summarizes approaches to...
Speaker adaptation of language models for automatic dialog act segmentation of meetings (2007)
Dialog act (DA) segmentation in meeting speech is important for meeting understanding. In this paper, we explore speaker adaptation of hidden event language models (LMs) for DA segmentation using the...
A Text-Constrained Prosodic System for Speaker Verification (2007)
Elizabeth Shriberg, Luciana Ferrer
We describe four improvements to a prosody SVM system, including a new method based on text- and part-of-speech-constrained prosodic features. The improved system shows remarkably good performance on...
Combining prosodic, lexical and cepstral systems for deceptive speech detection (2006)
Martin Graciarena, Elizabeth Shriberg, Andreas Stolcke, Frank Enos, Julia Hirschberg, Sachin Kajarekar
We report on machine learning experiments to distinguish deceptive from nondeceptive speech in the Columbia-SRI-Colorado (CSC) corpus. Specifically, we propose a system combination approach using...
Pauses in deceptive speech (2006)
Stefan Benus, Frank Enos, Julia Hirschberg, Elizabeth Shriberg
We use a corpus of spontaneous interview speech to investigate the relationship between the distributional and prosodic characteristics of silent and filled pauses and the intent of an interviewee to...
Using Prosody for Automatic Sentence Segmentation of Multi-Party Meetings (2006)
We explore the use of prosodic features beyond pauses, including duration, pitch, and energy features, for automatic sentence segmentation of ICSI meeting data. We examine two different...
A Study in Machine Learning from Imbalanced Data for Sentence Boundary Detection in Speech (2006)
Yang Liu, Nitesh V. Chawla, Mary P. Harper, Elizabeth Shriberg, Andreas Stolcke
Enriching speech recognition output with sentence boundaries improves its human readability and enables further processing by downstream language processing modules. We have constructed a hidden...
Annotation and Analysis of Importance in Meetings (2006)
Robert Eklund, Rebecca Bates, Chad Kuyper, Elizabeth Willingham, Elizabeth Shriberg, ...
Meetings typically contain important regions that are likely to be the focus of summarization and recall requests. We present a new approach for labeling speech corpora with categories of importance...
Automatic Dialog Act Segmentation and Classification in Multiparty Meetings (2005)
Jeremy Ang, Yang Liu, Elizabeth Shriberg
We explore the two related tasks of dialog act (DA) segmentation and DA classification for speech from the ICSI Meeting Corpus. We employ simple lexical and prosodic knowledge sources, and compare...
Distinguishing Deceptive from Non-Deceptive Speech (2005)
Julia Hirschberg, Stefan Benus, Jason M. Brenier, Frank Enos, Sarah Friedman, Sarah Gilman, ...
To date, studies of deceptive speech have largely been confined to descriptive studies and observations from subjects, researchers, or practitioners, with few empirical studies of the specific...
Does active learning help automatic dialog act tagging in meeting data? (2005)
Yang Liu, Elizabeth Shriberg, Andreas Stolcke
Knowledge of Dialog Acts (DAs) is important for the automatic understanding and summarization of meetings. Current approaches rely on a lot of hand labeled data to train automatic taggers. One...
Toward joint segmentation and classification of dialog acts in multiparty meetings (2005)
Matthias Zimmermann, Yang Liu, Elizabeth Shriberg, Andreas Stolcke
We present baseline results for the joint segmentation and classification of dialog acts (DAs) of the ICSI Meeting Corpus. Two simple approaches based on word information are investigated...
Structural Metadata Research in the EARS Program (2005)
Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Barbara Peskin, Jeremy Ang, Dustin Hillard, ...
Both human and automatic processing of speech require recognition of more than just words. In this paper we provide a brief overview of research on structural metadata extraction in the DARPA EARS...
Comparing HMM, maximum entropy and conditional random fields for disfluency detection (2005)
Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Mary Harper
Automatic detection of disfluencies in spoken language is important for making speech recognition output more readable, and for aiding downstream language processing modules. We compare a generative...
Douglas Jones, Wade Shen, Elizabeth Shriberg, Andreas Stolcke, Teresa Kamm
We report on results of two experiments designed to compare subjects' ability to extract information from audio recordings of conversational telephone speech (CTS) with their ability to extract...
SRI’s 2004 NIST speaker recognition evaluation system (2005)
Sachin S. Kajarekar, Luciana Ferrer, Elizabeth Shriberg, Kemal Sonmez, Andreas Stolcke, Jing Zheng
This paper describes our recent efforts in exploring longer-range features and their statistical modeling techniques for speaker recognition. In particular, we describe a system that uses discriminant...
Spontaneous speech: How people really talk and why engineers should care (2005)
Spontaneous conversation is optimized for human-human communication, but differs in some important ways from the types of speech for which human language technology is often developed. This overview...
Using conditional random fields for sentence boundary detection in speech (2005)
Yang Liu, Andreas Stolcke, Elizabeth Shriberg, Mary Harper
Sentence boundary detection in speech is important for enriching speech recognition output, making it easier for humans to read and downstream modules to process. In previous work, we have developed...
Identifying Agreement and Disagreement in Conversational Speech: Use of Bayesian... (2004)
Michel Galley, Kathleen McKeown, Julia Hirschberg, Elizabeth Shriberg
We describe a statistical approach for modeling agreements and disagreements in conversational interaction. Our approach first identifies adjacency pairs using maximum entropy ranking based on a set...
Multimodal Model Integration for Sentence Unit Detection (2004)
Lei Chen, Yang Liu, Mary P. Harper, Elizabeth Shriberg
In this paper, we adopt a direct modeling approach to utilize conversational gesture cues in detecting sentence boundaries, called SUs, in video taped conversations. We treat the detection of SUs as...
Rajdip Dhillon, Sonali Bhagat, Hannah Carvey, Elizabeth Shriberg, Chuck Wooters For
providing us with the TableTrans software. We are also grateful to Don Baron and Chris Oei for their assistance in preparing data for annotation. We are thankful to Ashley Krupski for her annotation...
The ICSI Meeting Project: Resources and Research (2004)
Adam Janin, Jeremy Ang, Sonali Bhagat, Rajdip Dhillon, Jane Edwards, Javier Macías-Guarasa, ...
This paper provides a progress report on ICSI’s Meeting Project, including both the data collected and annotated as part of the project, as well as the research lines such materials support. We...
The ICSI-SRI-UW metadata extraction system (2004)
Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Dustin Hillard, Mari Ostendorf, Barbara Peskin, ...
Both human and automatic processing of speech require recognizing more than just the words. We describe a state-of-the-art system for automatic detection of “metadata” (information beyond the...
Özgür Çetin, Elizabeth Shriberg
We analyze speaker overlap in multiparty meetings both in terms of automatic speech recognition (ASR) performance, and in terms of distribution of overlap with respect to various factors...
Modeling NERFs for speaker recognition (2004)
Sachin Kajarekar, Luciana Ferrer, Kemal Sönmez, Jing Zheng, Elizabeth Shriberg
We introduce a new type of feature to capture long-range patterns associated with individual speakers or with speaking styles. NERFs, or Nonuniform Extraction Region Features, are defined based on...
Automatic disfluency identification in conversational speech using multiple knowledge sources (2003)
Yang Liu, Elizabeth Shriberg, Andreas Stolcke
Disfluencies occur frequently in spontaneous speech. Detection and correction of disfluencies can make automatic speech recognition transcripts more readable for human readers, and can aid downstream...
The ICSI meeting corpus (2003)
Adam Janin, Don Baron, Jane Edwards, Dan Ellis, David Gelbart, Nelson Morgan, ...
We have collected a corpus of data from natural meetings that occurred at the International Computer Science Institute (ICSI) in Berkeley, California over the last three years. The corpus contains...
Sonali Bhagat, Hannah Carvey, Elizabeth Shriberg
We investigate whether automatically extracted prosodic features can serve as cues to dialog acts (DAs) in naturally occurring meetings. We focus on the classification of four short DAs, all of which...
Training a prosody-based dialog act tagger from unlabeled data (2003)
Anand Venkataraman, Luciana Ferrer, Andreas Stolcke, Elizabeth Shriberg
Dialog act tagging is an important step toward speech understanding, yet training such taggers usually requires large amounts of data labeled by linguistic experts. Here we investigate the use of...
The Relationship Between Dialogue Acts And Hot Spots In Meetings (2003)
Britta Wrede, Elizabeth Shriberg
We examine the relationship between hot spots (annotated in terms of involvement) and dialogue acts (DAs, annotated in an independent effort) in roughly 32 hours of speech data from...
Speaker recognition using prosodic and lexical features (2003)
Sachin Kajarekar, Luciana Ferrer, Kemal Sonmez, Elizabeth Shriberg, Andreas Stolcke, Harry Bratt, ...
Conventional speaker recognition systems identify speakers by using spectral information from very short slices of speech. Such systems perform well (especially in quiet conditions), but fail to...
A Prosody-Based Approach To End-Of-Utterance Detection That Does Not Require Speech... (2003)
Luciana Ferrer, Elizabeth Shriberg, Andreas Stolcke
In previous work we showed that state-of-the-art end-of-utterance detection (as used, for example, in dialog systems) can be improved significantly by making use of prosodic and/or language models...
Modeling Duration Patterns for Speaker Recognition (2003)
Luciana Ferrer, Harry Bratt, Sachin Kajarekar, Elizabeth Shriberg, Kemal Sönmez, Andreas Stolcke, ...
We present a method for speaker recognition that uses the duration patterns of speech units to aid speaker classification. The approach represents each word and/or phone by a feature vector comprised...
Prosodic Knowledge Sources For Automatic Speech Recognition (2003)
Dimitra Vergyri, Andreas Stolcke, Luciana Ferrer, Elizabeth Shriberg
In this work, different prosodic knowledge sources are integrated into a state-of-the-art large vocabulary speech recognition system. Prosody manifests itself on different levels in the speech...
Prosody modeling for automatic speech recognition and understanding (2002)
Elizabeth Shriberg, Andreas Stolcke
This paper summarizes statistical modeling approaches for the use of prosody (the rhythm and melody of speech) in automatic recognition and understanding of speech. We outline effective...
Prosody-based automatic detection of annoyance and frustration in human-computer dialog (2002)
Jeremy Ang, Rajdip Dhillon, Ashley Krupski, Elizabeth Shriberg, Andreas Stolcke
We investigate the use of prosody for the detection of frustration and annoyance in natural human-computer dialog. In addition to prosodic features, we examine the contribution of language model...
Don Baron, Elizabeth Shriberg, Andreas Stolcke
We investigate automatic approaches to finding “hidden” spontaneous speech events, such as sentence boundaries and disfluencies, in multi-party meetings. Hidden events are characterized...
Is The Speaker Done Yet? Faster And More... (2002)
Luciana Ferrer, Elizabeth Shriberg, Andreas Stolcke
We examine the problem of end-of-utterance (EOU) detection for real-time speech recognition, particularly in the context of a human-computer dialog system. Current EOU detection algorithms use only a...
Automatic Dialog Act Labeling With Minimal Supervision (2002)
Anand Venkataraman, Andreas Stolcke, Elizabeth Shriberg
For many natural language applications it is desirable to be able to automatically tag utterances according to their discourse function (dialog act), such as statement, question or acknowledgment. We...
Elizabeth Shriberg, Andreas Stolcke, Don Baron
We investigate whether probabilistic modeling of prosody can aid various automatic labeling tasks essential for processing of multi-party meetings. Task 1, automatic punctuation, seeks to classify...
Evaluation of Speaker's Degree of Nativeness Using Text-Independent Prosodic Features (2001)
Carlos Teixeira, Horacio Franco, Elizabeth Shriberg, Kristin Precoda, Kemal Sönmez
Giving feedback on the degree of nativeness of a student’s speech is an important aspect of computer-aided language learning. This task has been addressed by many studies focusing on the segmental...
Prosody modeling for automatic speech understanding: an overview of recent research at SRI (2001)
Elizabeth Shriberg, Andreas Stolcke
Prosody has long been studied as an important knowledge source for speech understanding. In recent years there has been a large amount of computational work aimed at prosodic
The meeting project at ICSI (2001)
Nelson Morgan, Don Baron, Jane Edwards, Dan Ellis, David Gelbart, Adam Janin, ...
In collaboration with colleagues at UW, OGI, IBM, and SRI, we are developing technology to process spoken language from informal meetings. The work includes a substantial data collection and...
Elizabeth Shriberg, Andreas Stolcke, Don Baron
We examine the distribution of overlapping speech in different corpora of natural multi-party conversations, including two types of meetings, and two corpora of telephone conversations. Analyses are...
Elizabeth Shriberg
This paper aims to promote `disfluency awareness' especially in the field of phonetics which has much to offer in the way of increasing our understanding of these phenomena. Two broad claims are...
Integrating prosodic and lexical cues for automatic topic segmentation (2001)
Gökhan Tür, Dilek Hakkani-Tür, Andreas Stolcke, Elizabeth Shriberg
We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topically coherent units. We propose two...
Dialogue act modeling for automatic tagging and recognition of conversational speech (2000)
Andreas Stolcke, Klaus Ries, Noah Coccaro, Elizabeth Shriberg, Rebecca Bates, Daniel Jurafsky, ...
We describe a statistical approach for modeling dialogue acts in conversational speech, i.e., speech-act-like
Kemal Sönmez, Madelaine Plauché, Elizabeth Shriberg
The constant frame length in typical ASR front ends is too long to capture transient phenomena in speech, such as stop bursts. However, current HMM systems have consistently outperformed systems...
Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech (2000)
Andreas Stolcke, Klaus Ries, Noah Coccaro, Elizabeth Shriberg, Rebecca Bates, Daniel Jurafsky, ...
this article is twofold: On the one hand, we aim to present a comprehensive framework for modeling and automatic classification of DAs, founded on well-known statistical methods. In doing so, we will...
Dialog act modeling for automatic tagging and recognition of conversational speech (2000)
Andreas Stolcke, Klaus Ries, Noah Coccaro, Elizabeth Shriberg, Rebecca Bates, Dan Jurafsky, ...
We describe a statistical approach for modeling dialog acts in conversational speech, i.e., speech-act-like
Kemal Sönmez, Madelaine Plauché, Elizabeth Shriberg, Horacio Franco
The constant frame length in typical ASR front ends is too long to capture transient phenomena in speech, such as stop bursts. However, current HMM systems have consistently outperformed systems...
Prosody-based automatic segmentation of speech into sentences and topics (2000)
Elizabeth Shriberg, Andreas Stolcke, Dilek Hakkani-Tür, Gökhan Tür
A crucial step in processing speech audio data for information extraction, topic detection, or browsing/playback is to segment the input into sentence and topic units. Speech segmentation is...
Combining Words and Prosody for Information Extraction from Speech (1999)
Dilek Hakkani-Tür, Gökhan Tür, Andreas Stolcke, Elizabeth Shriberg
Information extraction from speech is a crucial step on the way from speech recognition to speech understanding. A preliminary step toward speech understanding is the detection of topic boundaries,...
Combining words and speech prosody for automatic topic segmentation (1999)
Andreas Stolcke, Elizabeth Shriberg, Dilek Hakkani-Tür, Gökhan Tür, Kemal Sönmez
We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topic units. The approach combines hidden Markov models, statistical language...
Modeling The Prosody Of Hidden Events For Improved Word Recognition (1999)
Andreas Stolcke, Elizabeth Shriberg, Dilek Hakkani-Tur, Gokhan Tur
We investigate a new approach for using speech prosody as a knowledge source for speech recognition. The idea is to penalize word hypotheses that are inconsistent with prosodic features such as...
Combining Words and Speech Prosody for Automatic Topic Segmentation (1999)
Andreas Stolcke, Elizabeth Shriberg, Dilek Hakkani-Tür, Gökhan Tür, Ze'ev Rivlin, ...
We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topic units. The approach combines hidden Markov models, statistical language...
Maestro: Conductor Of Multimedia Analysis Technologies (1999)
Ze'ev Rivlin, Robert Bolles, Adam Cheyer, Dilek Hakkani-Tür, David Israel, Luc Julia, ...
is technologies --- for example, speech recognition, image understanding, and optical character recognition --- to the indexing and retrieval of multimedia. Informedia [1] and Broadcast News...
How Far Do Speakers Back Up In Repairs? A Quantitative Model (1998)
Elizabeth Shriberg, Andreas Stolcke
Speakers frequently retrace one or more words when continuing after a break in fluency. Syntactic principles constrain the points from which speakers retrace; however syntactic principles do not...
Dialog Act Modeling for Conversational Speech (1998)
Andreas Stolcke, Elizabeth Shriberg, Noah Coccaro, Daniel Jurafsky, Rachel Martin, ...
We describe an integrated approach for statistical modeling of discourse structure for natural conversational speech. Our model is based on 42 `dialog acts' (e.g., Statement, Question,...
Modeling Dynamic Prosodic Variation For Speaker Verification (1998)
Kemal Sonmez, Elizabeth Shriberg, Larry Heck, Mitchel Weintraub
Statistics of frame-level pitch have recently been used in speaker recognition systems with good results [1, 2, 3]. Although they convey useful long-term information about a speaker's...
Harry Bratt, Leo Neumeyer, Elizabeth Shriberg, Horacio Franco
We describe the methodologies for collecting and annotating a Latin-American Spanish speech database. The database includes recordings by native and nonnative speakers. The nonnative recordings are...
Dialog Act Modeling for Conversational Speech (1998)
Andreas Stolcke, Elizabeth Shriberg, Noah Coccaro, Daniel Jurafsky, Rachel Martin, ...
We describe an integrated approach for statistical modeling of discourse structure for natural conversational speech. Our model is based on 42 'dialog acts' (statement, question, backchannel,...
Automatic Detection Of Sentence Boundaries And Disfluencies Based On Recognized Words (1998)
Andreas Stolcke, Elizabeth Shriberg, Rebecca Bates, Mari Ostendorf, Dilek Hakkani, Madelaine Plauche, ...
We study the problem of detecting linguistic events at interword boundaries, such as sentence boundaries and disfluency locations, in speech transcribed by an automatic recognizer. Recovering such...
Robert Eklund, Elizabeth Shriberg
We report results from a cross-language study of disfluencies (DFs) in Swedish and American English human-machine and human-human dialogs. The focus is on comparisons not directly affected by...
A prosody-only decision-tree model for disfluency detection (1997)
Elizabeth Shriberg, Rebecca Bates, Andreas Stolcke
Speech disfluencies (filled pauses, repetitions, repairs, and false starts) are pervasive in spontaneous speech. The ability to detect and correct disfluencies automatically is important for...
A Lognormal Tied Mixture Model Of Pitch For Prosody-Based Speaker Recognition (1997)
M. Kemal Sönmez, Larry Heck, Mitchel Weintraub, Elizabeth Shriberg
Statistics of pitch have recently been used in speaker recognition systems with good results. The success of such systems depends on robust and accurate computation of pitch statistics in the...
Elizabeth Shriberg, D. Robert Ladd, Jacques Terken, Andreas Stolcke
We study F0 variation produced by “speaking up”, as part of a larger study of pitch range variation within and across speakers [1]. We provide a function to predict target F0 values in this...
Automatic Linguistic Segmentation Of Conversational Speech (1996)
Andreas Stolcke, Elizabeth Shriberg
As speech recognition moves toward more unconstrained domains such as conversational speech, we encounter a need to be able to segment (or resegment) waveforms and recognizer output into...
Statistical Language Modeling For Speech Disfluencies (1996)
Andreas Stolcke, Elizabeth Shriberg
Speech disfluencies (such as filled pauses, repetitions, restarts) are among the characteristics distinguishing spontaneous speech from planned or read speech. We introduce a language model that...
Disfluencies in Switchboard (1996)
Disfluencies ("um," repeats, self-repairs) are prevalent in spontaneous speech, and are relevant to both human speech communication and speech processing by machine. Although disfluencies...
Word Predictability After Hesitations: A Corpus-Based Study (1996)
Elizabeth Shriberg, Andreas Stolcke
We ask whether lexical hesitations in spontaneous speech tend to precede words that are difficult to predict. We define predictability in terms of both transition probability and entropy, in the...
Expanding the scope of the ATIS task: the ATIS-3 corpus (1994)
Deborah A. Dahl, Madeleine Bates, Michael Brown, William Fisher, Kate Hunicke-smith, David Pallett, ...
The Air Travel Information System (ATIS) domain serves as the common evaluation task for ARPA spoken language system developers. To support this task, the Multi-Site ATIS Data Collection...
A System for Labeling Self-Repairs in Speech (1993)
John Bear, John Dowding, Elizabeth Shriberg, Patti Price
This document outlines a system for labeling self-repairs in spontaneous speech. The system marks the location and extent of a repair, as well as relevant words in the region of the repair. Together...
John Bear, John Dowding, Elizabeth Shriberg
We have analyzed 607 sentences of spontaneous human-computer speech data containing repairs, drawn from a total corpus of 10,718 sentences. We present here criteria and techniques for automatically...
Automatic Detection and Correction of Repairs in Human-Computer Dialog (1992)
Elizabeth Shriberg, John Bear, John Dowding
We have analyzed 607 sentences of spontaneous human-computer speech data containing repairs (drawn from a corpus of 10,718). We present here criteria and techniques for automatically detecting the...
Andreas Stolcke, Sachin S. Kajarekar, Luciana Ferrer, Elizabeth Shriberg
We present a new modeling approach for speaker recognition that uses the maximum-likelihood linear regression (MLLR) adaptation transforms employed by a speech recognition system as...