Bayesian Semi-Supervised Chinese Word Segmentation for Statistical Machine Translation (2009)
Jia Xu, Jianfeng Gao, Kristina Toutanova, Hermann Ney
Words in Chinese text are not naturally separated by delimiters, which poses a challenge to standard machine translation (MT) systems. In MT, the widely used approach is to apply a Chinese word...
Can We Translate Letters? (2009)
David Vilar, Jan-T. Peter, Hermann Ney
Current statistical machine translation systems handle the translation process as the transformation of a string of symbols into another string of symbols. Normally the symbols dealt with are the...
Name Extraction and Translation for Distillation (2009)
Heng Ji, Ralph Grishman, Dayne Freitag, Matthias Blume, Zhiqiang (John) Wang, ...
Name translation is important well beyond the relative frequency of names in a text: a correctly translated passage, but with the wrong name, may lose most of its value. The Nightingale team has...
Deformation-aware Log-Linear Models (2009)
Tobias Gass, Thomas Deselaers, Hermann Ney
In this paper, we present a novel deformation-aware discriminative model for handwritten digit recognition. Unlike previous approaches our model directly considers image deformations and allows...
Log-Linear Mixtures for Object Recognition (2009)
Tobias Weyand, Thomas Deselaers, Hermann Ney
We present log-linear mixture models as a fully discriminative approach to object category recognition which can, analogously to kernelised models, represent non-linear decision boundaries. It is...
Integration of Speech to Computer-Assisted Translation Using Finite-State Automata (2009)
Shahram Khadivi, Richard Zens, Hermann Ney
State-of-the-art computer-assisted translation engines are based on a statistical prediction engine, which interactively provides completions to what a human translator types. The integration of...
Jointly Optimising Relevance and Diversity in Image Retrieval (2009)
Thomas Deselaers, Tobias Gass, Philippe Dreuw, Hermann Ney
In this paper we present a method to jointly optimise the relevance and the diversity of the results in image retrieval. Without considering diversity, image retrieval systems often mainly find a set...
Benchmark Databases for Video-Based Automatic Sign Language Recognition (2009)
Carol Neidle, Vassilis Athitsos, Stan Sclaroff, Hermann Ney
A new, linguistically annotated, video database for automatic sign language recognition is presented. The new RWTH-BOSTON-400 corpus, which consists of 843 sentences, several speakers and separate...
Jens Forster, Thomas Deselaers, Hermann Ney
We propose several tracking adaptation approaches to recover from early tracking errors in sign language recognition by optimizing the obtained tracking paths w.r.t. the hypothesized word...
Pan, Zoom, Scan – Time-coherent, Trained Automatic Video Cropping (2009)
We present a method to fully automatically fit videos in 16:9 format on 4:3 screens and vice versa. It can be applied to arbitrary aspect ratios and can be used to make videos suitable for mobile...
Evgeny Matusov, Richard Zens, David Vilar, Arne Mauser, Maja Popović, Hermann Ney
We present the statistical machine translation system used by RWTH in the second TC-STAR evaluation. We give a short overview of the system as used in the first evaluation and then enumerate the...
TC-Star: Cross-Language Voice Conversion Revisited (2008)
David Sündermann, Harald Höge, Antonio Bonafonte, Hermann Ney, Julia Hirschberg
In the framework of the European speech-to-speech translation project TC-Star, one of the research tasks is cross-language voice conversion. In the recent second evaluation campaign, five...
Hand in Hand: Automatic Sign Language to English Translation (2008)
In this paper, we describe the first data-driven automatic sign-language-to-speech translation system. While both sign language (SL) recognition and translation techniques exist, both use an...
Deselaers: Discriminative Training for Object Recognition 1 Idea (2008)
Thomas Deselaers, Daniel Keysers, Hermann Ney
Object recognition in cluttered scenes. Question: What objects are contained in an image? Several approaches are known; this is an active field of research. Promising approach:
Creating a Large-Scale Arabic to French Statistical Machine Translation System (2008)
In this work, the creation of a large-scale Arabic to French statistical machine translation system is presented. We introduce all necessary steps from corpus acquisition, preprocessing the data to...
Shared-Memory Parallelization for Content-based Image Retrieval (2008)
Christian Terboven, Thomas Deselaers, Christian Bischof, Hermann Ney
Abstract. In this paper we show how modern shared-memory parallelization techniques can gain nearly linear speedup in content-based image retrieval. Using OpenMP, few changes are applied to the...
Categories and Subject Descriptors (2008)
Thomas Deselaers, Tobias Wey, Hermann Ney
Yuqi Zhang, Richard Zens, Hermann Ney
In this paper, we describe a source-side reordering method based on syntactic chunks for phrase-based statistical machine translation. First, we shallow parse the source language sentences. Then,...
A Flexible Architecture for CAT Applications (2008)
Shahram Khadivi, Richard Zens, Hermann Ney
We present an intuitive technical framework for making Computer Assisted Translation (CAT) adaptable and more suitable for rapid application development. The framework is a client-server-based...
Integrated Chinese Word Segmentation in Statistical Machine Translation (2008)
Jia Xu, Evgeny Matusov, Richard Zens, Hermann Ney
A Chinese sentence is represented as a sequence of characters, and words are not separated from each other. In statistical machine translation, the conventional approach is to segment the Chinese...
Abstract Features for Image Retrieval: An Experimental Comparison (2008)
Thomas Deselaers, Daniel Keysers, Hermann Ney
An experimental comparison of a large number of different image descriptors for content-based image retrieval is presented. Many of the papers describing new techniques and descriptors for...
Thomas Deselaers, Tobias Gass, Tobias Wey, Hermann Ney
We present the methods we applied in the four different tasks of the ImageCLEF 2007 content-based image retrieval evaluation. We participated in all four tasks using a variety of methods. Global and...
Parameter Estimation for Automatic Dose Control in Radioscopy (2008)
Daniel Keysers, Sami Celik, Henning Braess, Jörg Dahmen, Hermann Ney
Abstract During a medical radioscopic examination, the X-ray dose needs to be adjusted continuously to the body region examined. In current systems, this adjustment is based on the mean grayvalue of...
2006. Statistical Machine Translation of German Compound Words (2008)
Maja Popović, Daniel Stein, Hermann Ney
Abstract. German compound words pose special problems to statistical machine translation systems: the occurrence of each of the components in the training data is not sufficient for successful...
Integration of Speech to Computer-Assisted Translation Using Finite-State Automata (2008)
Shahram Khadivi, Richard Zens, Hermann Ney
State-of-the-art computer-assisted translation engines are based on a statistical prediction engine, which interactively provides completions to what a human translator types. The integration of...
2001. Morpho-syntactic analysis for reordering in statistical machine translation (2008)
In the framework of statistical machine translation (SMT), correspondences between the words in the source and the target language are learned from bilingual corpora on the basis of so-called...
PROBABILISTIC RETRIEVAL BASED ON DOCUMENT REPRESENTATIONS (2008)
Wolfgang Macherey, Jörg Viechtbauer, Hermann Ney
Accessing information in multimedia databases encompasses a wide range of applications in which spoken document retrieval (SDR) plays an important role. In the recent past, research increasingly...
A Systematic Comparison of Training Criteria for Statistical Machine Translation (2008)
We address the problem of training the free parameters of a statistical machine translation system. We show significant improvements over a state-of-the-art minimum error rate training baseline on a...
† Speech Technology Group – ETSI de Telecomunicación (2008)
David Vilar, Jia Xu, Luis Fern, Hermann Ney, Dpto Ingeniería Electrónica
Evaluation of automatic translation output is a difficult task. Several performance measures like Word Error Rate, Position Independent Word Error Rate and the BLEU and NIST scores are widely used and...
Overview of the ImageCLEF 2007 Object Retrieval Task (2008)
Thomas Deselaers, Allan Hanbury, Ville Viitaniemi, Hugo Jair Escalante Balderas, Theo Gevers, ...
We describe the object retrieval task of ImageCLEF 2007, give an overview of the methods of the participating groups, and present and discuss the results. The task was based on the widely used PASCAL...
Acoustic-Phonetic Knowledge and Statistics in Automatic Speech Recognition (2008)
This paper deals with the relation of acoustic-phonetic knowledge and its role in automatic speech recognition. Two applications of acoustic-phonetic knowledge are considered in more detail: 1)...
Overview of the ImageCLEF 2007 Object Retrieval Task (2008)
Thomas Deselaers, Allan Hanbury, Ville Viitaniemi, Jason D.R. Farquhar, Mátyás Brendel, Bálint Daróczy, ...
We describe the object retrieval task of ImageCLEF 2007, give an overview of the methods of the participating groups, and present and discuss the results. The task was based on the widely used PASCAL...
Estimating Translation Quality (2007)
The evaluation of machine translation systems is a difficult and time-consuming task. To be meaningful and reliable, translation quality has to be evaluated
Abstract. We present two novel bounds for the classification error that, at the same time, can be used as practical training criteria. Unlike the bounds reported in the literature so far, these novel...
In this paper, we describe a system that applies maximum entropy (ME) models to the task of named entity recognition (NER). Starting with an annotated corpus and a set of features which are easily...
Ismael García Varea, Franz J. Och, Hermann Ney
Efficient integration of maximum entropy models within a maximum likelihood training
Combining Neighboring Filter Channels to Improve Quantile Based Histogram Equalization (2007)
Florian Hilger, Hermann Ney, Olivier Siohan, Frank K. Soong
A mismatch between the training data and the test condition of an automatic speech recognition system usually deteriorates the recognition performance. Quantile based histogram equalization can...
Probabilistic Aspects in Spoken Document Retrieval (2007)
Wolfgang Macherey, Hans Jörg Viechtbauer, Hermann Ney
Abstract. Accessing information in multimedia databases encompasses a wide range of applications in which spoken document retrieval (SDR) plays an important role. In SDR, a set of automatically...
Journal of Electronic Imaging 12(1), pp. 59--68 (January 2003) (2007)
Daniel Keysers, Jörg Dahmen, Hermann Ney, Berthold B. Wein, Thomas M. Lehmann
and is made available as an electronic reprint with permission of SPIE. One print or electronic copy may be made for personal use only. Systematic or multiple reproduction, distribution to multiple...
The recognition performance of automatic speech recognition systems can be improved by reducing the mismatch between training and test data during feature extraction. The approach described in this...
Comparison of Log-Linear Models and Weighted Dissimilarity Measures (2007)
Daniel Keysers, Roberto Paredes, Enrique Vidal, Hermann Ney
Abstract. We compare two successful discriminative classification algorithms on three databases from the UCI and STATLOG repositories. The two approaches are the log-linear model for the class...
AN ITERATIVE, DP-BASED SEARCH ALGORITHM FOR STATISTICAL MACHINE TRANSLATION (2007)
Francisco Casacuberta, Hermann Ney
The increasing interest in the statistical approach to Machine Translation is due to the development of effective algorithms for training the probabilistic models proposed so far. However, one of the...
USING PHASE SPECTRUM INFORMATION FOR IMPROVED SPEECH RECOGNITION PERFORMANCE (2007)
In this work, new acoustic features for continuous speech recognition based on the short-term Fourier phase spectrum are introduced for mono (telephone) recordings. The new phase based features were...
Ralf Schlüter, Wolfgang Macherey, Boris Müller, Hermann Ney
The aim of this work is to build up a common framework for a class of discriminative training criteria and optimization methods for continuous speech recognition. A unified discriminative criterion...
Generation of Word Graphs in Statistical Machine Translation (2007)
Nicola Ueffing, Franz Josef Och, Hermann Ney
Statistical machine translation systems usually compute the single sentence that has the highest probability according to the models that are trained on data. We describe a method for constructing a...
PROBABILISTIC RETRIEVAL BASED ON DOCUMENT REPRESENTATIONS (2007)
Wolfgang Macherey, Jörg Viechtbauer, Hermann Ney
Accessing information in multimedia databases encompasses a wide range of applications in which spoken document retrieval (SDR) plays an important role. In the recent past, research increasingly...
Architecture and Search Organization for Large Vocabulary Continuous Speech Recognition (2007)
Stefan Ortmanns, Lutz Welling, Klaus Beulen, Frank Wessel, Hermann Ney
Abstract. This paper gives an overview of an architecture and search organization for large vocabulary, continuous speech recognition (LVCSR at RWTH). In the first part of the paper, we describe the...
A Statistical Framework for Model-based Image Retrieval in Medical Applications (2007)
Daniel Keysers, Hermann Ney, Berthold Wein, Thomas Lehmann
Abstract. Recently, research within the field of content-based image retrieval has attracted a lot of attention. Nevertheless, most existing methods cannot be easily applied to medical image databases,...
A Statistical Framework for Model-based Image Retrieval in Medical Applications (2007)
Daniel Keysers, Hermann Ney, Berthold B. Wein, Thomas M. Lehmann
Recently, research in the field of content-based image retrieval has attracted a lot of attention. Nevertheless, most existing methods cannot be easily applied to medical image databases, as global...
Instituto Tecnológico de Informática (2007)
Daniel Keysers, Roberto Paredes, Hermann Ney, Enrique Vidal
Abstract. Statistical classification using tangent vectors and classification based on local features are two successful methods for various image recognition problems....
Robust speech recognition using a voiced-unvoiced feature (2007)
András Zolnay, Ralf Schlüter, Hermann Ney
In this paper, a voiced-unvoiced measure is used as acoustic feature for continuous speech recognition. The voiced-unvoiced measure was combined with the standard Mel Frequency Cepstral Coefficients...
Morpho-Syntactic Analysis for Reordering in Statistical Machine Translation (2007)
In the framework of statistical machine translation (SMT), correspondences between the words in the source and the target language are learned from bilingual corpora on the basis of so-called...
Franz Josef Och, Hermann Ney
The performance of machine translation technology after 50 years of development leaves much to be desired. There is a high demand for well performing and cheap MT systems for many language pairs and...
Parameter Estimation for Automatic Dose Control in Radioscopy (2007)
Daniel Keysers, Sami Celik, Henning Braess, Hermann Ney
Abstract During a medical radioscopic examination, the X-ray dose needs to be adjusted continuously to the body region examined. In current systems, this adjustment is based on the mean grayvalue of...
Roberto Paredes, Daniel Keysers, Thomas M. Lehmann, Berthold Wein, Hermann Ney
Abstract In medical image retrieval, the images are usually subject to a large range of variability. In order to classify medical images, we therefore propose the use of local representations, which...
Ralf Schlüter, Wolfgang Macherey, Boris Müller, Hermann Ney
In this work a method for splitting continuous mixture density hidden Markov models (HMM) is presented. The approach combines a model evaluation measure based on the Maximum Mutual Information (MMI)...
Statistical Image Object Recognition using Mixture Densities (2007)
Abstract. In this paper, we present a mixture density based approach to invariant image object recognition. To allow for a reliable estimation of the mixture parameters, the dimensionality of the...
SPEECH RECOGNITION USING CONTEXT CONDITIONAL WORD POSTERIOR PROBABILITIES (2007)
Ralf Schlüter, Frank Wessel, Hermann Ney
In this paper two new scoring schemes for large vocabulary continuous speech recognition are compared. Instead of using the joint probability of a word sequence and a sequence of acoustic...
Combined Classification of Handwritten Digits using the `Virtual Test Sample Method' (2007)
Abstract. In this paper, we present a combined classification approach called the `virtual test sample method'. Contrary to classifier combination, where the outputs of a number of classifiers are...
Multi-Level Error Handling for Tree Based Dialogue Course Management (2007)
Klaus Macherey, Oliver Bender, Hermann Ney
For spoken dialogue systems, errors can occur on different levels of the system's architecture. One of the principal causes for errors during a dialogue session are erroneous recognition results...
Towards the Use of Word Stems and Suffixes for Statistical Machine Translation (2007)
In this paper we present methods for improving the quality of translation from an inflected language into English by making use of part-of-speech tags and word stems and suffixes in the source...
David Vilar, Hermann Ney, Alfons Juan, Enrique Vidal
Abstract. The number of features to be considered in a text classification system is given by the size of the vocabulary and this is normally in the range of the tens or hundreds of thousands even...
Maximum Entropy Models for Named Entity Recognition (2007)
Oliver Bender, Franz Josef Och, Hermann Ney
In this paper, we describe a system that applies maximum entropy (ME) models to the task of named entity recognition (NER). Starting with an annotated corpus and a set of features which are easily...
Toward Hierarchical Models for Statistical Machine Translation of Inflected Languages (2007)
Sonja Nießen, Hermann Ney
In statistical machine translation, correspondences between the words in the source and the target language are learned from bilingual corpora on the basis of so called alignment models.
Overview of the ImageCLEF 2007 Object Retrieval Task (2007)
Thomas Deselaers, Allan Hanbury, Ville Viitaniemi, András Benczúr, Mátyás Brendel, Bálint Daróczy, ...
We describe the object retrieval task of ImageCLEF 2007, give an overview of the methods of the participating groups, and present and discuss the results. The task was based on the widely used PASCAL...
Deformation models for image recognition (2007)
Daniel Keysers, Thomas Deselaers, Christian Gollan, Hermann Ney
© 2007 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for...
Speech recognition techniques for a sign language recognition system (2007)
David Rybach, Thomas Deselaers, Morteza Zahedi, Hermann Ney
One of the most significant differences between automatic sign language recognition (ASLR) and automatic speech recognition (ASR) is due to the computer vision problems, whereas the corresponding...
Improving speech translation with automatic boundary prediction (2007)
Evgeny Matusov, Dustin Hillard, Mathew Magimai-Doss, Dilek Hakkani-Tür, Mari Ostendorf, Hermann Ney
This paper investigates the influence of automatic sentence boundary and sub-sentence punctuation prediction on machine translation (MT) of automatically recognized speech. We use prosodic and...
Overview of the ImageCLEF 2007 object retrieval task (2007)
Thomas Deselaers, Allan Hanbury, Ville Viitaniemi, András Benczúr, Mátyás Brendel, Bálint Daróczy, ...
Abstract. We describe the object retrieval task of ImageCLEF 2007, give an overview of the methods of the participating groups, and present and discuss the results. The task was based on the widely...
Minimum Bayes risk decoding for BLEU (2007)
Nicola Ehling, Richard Zens, Hermann Ney
We present a Minimum Bayes Risk (MBR) decoder for statistical machine translation. The approach aims to minimize the expected loss of translation errors with regard to the BLEU score. We show that...
Word error rates: Decomposition over POS classes and applications for error analysis (2007)
Evaluation and error analysis of machine translation output are important but difficult tasks. In this work, we propose a novel method for obtaining more details about actual translation errors in...
The 2005 PASCAL Visual Object Classes Challenge (2006)
Mark Everingham, Andrew Zisserman, Christopher Williams, Luc Van Gool, Moray Allan, Chris Bishop, ...
The PASCAL Visual Object Classes Challenge ran from February to March 2005. The goal of the challenge was to recognize objects from a number of visual object classes in realistic scenes (i.e. not...
Improving statistical word alignments with morpho-syntactic transformations (2006)
Adrià De Gispert, Deepa Gupta, Maja Popović, Patrik Lambert, Jose B. Mariño, Marcello Federico, ...
Abstract. This paper presents a wide range of statistical word alignment experiments incorporating morphosyntactic information. By means of parallel corpus transformations according to information of...
The 2005 PASCAL Visual Object Classes Challenge (2006)
Mark Everingham, Andrew Zisserman, Luc Van Gool, Moray Allan, Christopher M. Bishop, ...
Abstract. The PASCAL Visual Object Classes Challenge ran from February to March 2005. The goal of the challenge was to recognize objects from a number of visual object classes in realistic scenes...
Human Language Technology and Pattern Recognition (2006)
Saša Hasan, Evgeny Matusov, Arne Mauser, David Vilar, Richard Zens, Hermann Ney, ...
2. Related work
Maja Popović, Hermann Ney, Adrià De Gispert, José B. Mariño, Deepa Gupta, Marcello Federico, ...
Evaluation of machine translation output is an important but difficult task. Over the last years, a variety of automatic evaluation measures have been studied, some of them like Word Error Rate...
Sparse patch-histograms for object classification in cluttered images (2006)
Thomas Deselaers, Andre Hegerath, Daniel Keysers, Hermann Ney
Abstract. We present a novel model for object recognition and detection that follows the widely adopted assumption that objects in images can be represented as a set of loosely coupled parts. In...
MorphoSyntax Based Statistical Methods for Automatic Sign Language Translation (2006)
Daniel Stein, Jan Bungeroth, Hermann Ney
We present a novel approach for the automatic translation of written text into sign language. A new corpus focussing on the weather report domain for the language pair German and German Sign Language...
A German sign language corpus of the domain weather report (2006)
Jan Bungeroth, Daniel Stein, Morteza Zahedi, Hermann Ney
All systems for automatic sign language translation and recognition, in particular statistical systems, rely on adequately sized corpora. For this purpose, we created the Phoenix corpus that is based...
Reranking translation hypotheses using structural properties (2006)
We investigate methods that add syntactically motivated features to a statistical machine translation system in a reranking framework. The goal is to analyze whether shallow parsing techniques help...
Evgeny Matusov, Nicola Ueffing, Hermann Ney
This paper describes a novel method for computing a consensus translation from the outputs of multiple machine translation (MT) systems. The outputs are combined and a possibly new translation...
CDER: Efficient MT Evaluation Using Block Movements (2006)
Gregor Leusch, Nicola Ueffing, Hermann Ney
Most state-of-the-art evaluation measures for machine translation assign high costs to movements of word blocks. In many cases though such movements still result in correct or almost correct...
Proceedings of the Workshop on Statistical Machine Translation, pages 55--63, New York City (2006)
Richard Zens, Hermann Ney
We present discriminative reordering models for phrase-based statistical machine translation. The models are trained using the maximum entropy principle.
Maja Popovic, Hermann Ney, Adrià De Gispert, José B. Mariño, Deepa Gupta, Marcello Federico, ...
Evaluation of machine translation output is an important but difficult task. Over the last years, a variety of automatic evaluation measures have been studied, some of them like Word Error Rate...
N-Gram Posterior Probabilities for Statistical Machine Translation (2006)
Word posterior probabilities are a common approach for confidence estimation in automatic speech recognition and machine translation. We will generalize this idea and introduce n-gram posterior...
Partitioning Parallel Documents Using Binary Segmentation (2006)
Jia Xu, Richard Zens, Hermann Ney
In statistical machine translation, large numbers of parallel sentences are required to train the model parameters. However, plenty of the bilingual language resources available on web are aligned...
Text-independent voice conversion based on unit selection (2006)
David Sündermann, Harald Höge, Antonio Bonafonte, Hermann Ney, Alan Black, Shri Narayanan
So far, most of the voice conversion training procedures are text-dependent, i.e., they are based on parallel training utterances of source and target speaker. Since several applications (e.g....
Patch-based object recognition using discriminatively trained Gaussian mixtures (2006)
Andre Hegerath, Thomas Deselaers, Hermann Ney
We present an approach using Gaussian mixture models for part-based object recognition where spatial relationships of the parts are explicitly modeled and parameters of the generative model are tuned...
Confidence measures for machine translation is a method for labeling each word in an automatically generated translation as correct or incorrect. In this paper, we will present a new approach to...
Thomas Deselaers, Tobias Wey, Daniel Keysers, Wolfgang Macherey, Hermann Ney
In this paper we describe the methods we used in the 2005 ImageCLEF content-based image retrieval evaluation. For the medical retrieval task, we combined several low-level image features with textual...
Automatic Filtering of Bilingual Corpora for Statistical Machine Translation (2005)
Abstract. For many applications such as machine translation and bilingual information retrieval, the bilingual corpora play an important role in training the system. Because they are obtained through...
Gesture Recognition Using Image Comparison Methods (2005)
Daniel Keysers, Thomas Deselaers, Hermann Ney
Abstract. We introduce the use of appearance-based features, and tangent distance or the image distortion model to account for image variability within the hidden Markov model emission probabilities...
One Decade of Statistical Machine Translation: 1996-2005 (2005)
In the last decade, the statistical approach has found widespread use in machine translation both for written and spoken language and has had a major impact on the translation accuracy. This paper...
Morteza Zahedi, Daniel Keysers, Thomas Deselaers, Hermann Ney
Abstract. In this paper, we employ a zero-order local deformation model to model the visual variability of video streams of American sign language (ASL) words. We discuss two possible ways of...
Acoustic Feature Combination for Robust Speech Recognition (2005)
András Zolnay, Ralf Schlüter, Hermann Ney
In this paper, we consider the use of multiple acoustic features of the speech signal for robust speech recognition. We investigate the combination of various auditory based (Mel Frequency Cepstrum...
Evaluating machine translation output with automatic sentence segmentation (2005)
Evgeny Matusov, Gregor Leusch, Oliver Bender, Hermann Ney
This paper presents a novel automatic sentence segmentation method for evaluating machine translation output with possibly erroneous sentence boundaries. The algorithm can process translation...
Statistical machine translation of European parliamentary speeches (2005)
David Vilar, Evgeny Matusov, Richard Zens, Hermann Ney
In this paper we present the ongoing work at RWTH Aachen University for building a speech-to-speech translation system within the TC-Star project. The corpus we work on consists of parliamentary...
Open vocabulary speech recognition with flat hybrid models (2005)
Maximilian Bisani, Hermann Ney
Today’s speech recognition systems are able to recognize arbitrary sentences over a large but finite vocabulary. However, many important speech recognition tasks feature an open, constantly...
Novel reordering approaches in phrase-based statistical machine translation (2005)
Stephan Kanthak, David Vilar, Evgeny Matusov, Richard Zens, Hermann Ney
This paper presents novel approaches to reordering in phrase-based statistical machine translation. We perform consistent reordering of source sentences in training and estimate a statistical...
Morteza Zahedi, Daniel Keysers, Hermann Ney
Abstract. In this paper, we present a system for automatic sign language recognition of segmented words in American Sign Language (ASL). The system uses appearance-based features extracted directly...
Confidence measures for machine translation is a method for labeling each word in an automatically generated translation as correct or incorrect. In this paper, we will present a new approach to...
Appearance-Based Recognition of Words in American Sign Language (2005)
Morteza Zahedi, Daniel Keysers, Hermann Ney
In this paper, we present how appearance-based features can be used for the recognition of words in American sign language (ASL) from a video stream. The features are extracted without any...
Maja Popovic, David Vilar, Hermann Ney, Slobodan Jovičić
In this work, we examine the quality of several statistical machine translation systems constructed on a small amount of parallel Serbian-English text. The main bilingual parallel corpus consists of...
Word Graphs for Statistical Machine Translation (2005)
Word graphs have various applications in the field of machine translation. Therefore it is important for machine translation systems to produce compact word graphs of high quality. We will describe...
Preprocessing and Normalization for Automatic Evaluation of Machine Translation (2005)
Gregor Leusch, Nicola Ueffing, David Vilar, Hermann Ney
Evaluation measures for machine translation depend on several common methods, such as preprocessing, tokenization, handling of sentence boundaries, and the choice of a reference length. In this...
Residual Prediction Based on Unit Selection (2005)
David Sündermann, Harald Höge, Antonio Bonafonte, Hermann Ney, Alan W Black
Recently, we presented a study on residual prediction techniques that can be applied to voice conversion based on linear transformation or hidden Markov model-based speech synthesis. Our voice...
Improving a discriminative approach to object recognition using image patches (2005)
Thomas Deselaers, Daniel Keysers, Hermann Ney
Abstract. In this paper we extend a method that uses image patch histograms and discriminative training to recognize objects in cluttered scenes. The method generalizes and performs well for...
The CLEF 2005 automatic medical image annotation task. CLEF (2005)
Thomas Deselaers, Henning Müller, Paul Clough, Hermann Ney, Thomas M. Lehmann
In this paper, the automatic annotation task from the 2005 CLEF cross-language image retrieval campaign (ImageCLEF) is described. This paper focuses on the database used, the task setup, and the...
Abstract. In this work, the use of a phrasal lexicon for statistical machine translation is proposed, and the relation between data acquisition costs and translation quality for different types and...
Thomas Deselaers, Daniel Keysers, Hermann Ney
Object recognition in cluttered scenes. Question: What objects are contained in an image? Several approaches are known; it is an active field of research. Promising approach:
Voice conversion using exclusively unaligned training data (2004)
David Sündermann, Antonio Bonafonte Cávez, Harald Höge, Hermann Ney
Although all conventional voice conversion approaches require equivalent training utterances of source and target speaker, several recently proposed applications call for breaking this demand. In...
In this paper we present the RWTH FSA toolkit – an efficient implementation of algorithms for creating and manipulating weighted finite-state automata. The toolkit has been designed using the...
Discriminative Training with Tied Covariance Matrices (2004)
Wolfgang Macherey, Ralf Schlüter, Hermann Ney
Discriminative training techniques have proved to be a powerful method for improving large vocabulary speech recognition systems based on Gaussian mixture hidden Markov models. Typically, the...
Reordering constraints for phrase-based statistical machine translation (2004)
Richard Zens, Hermann Ney, Taro Watanabe, Eiichiro Sumita
In statistical machine translation, the generation of a translation hypothesis is computationally expensive. If arbitrary reorderings are permitted, the search problem is NP-hard. On the other hand,...
Bayes decision rule and confidence measures for statistical machine translation (2004)
Abstract. In this paper, we re-visit the foundations of the statistical approach to machine translation and study two forms of the Bayes decision rule: the common rule for minimizing the number of...
Effect of feature smoothing methods in text classification tasks (2004)
David Vilar, Hermann Ney, Alfons Juan, Enrique Vidal
Abstract. The number of features to be considered in a text classification system is given by the size of the vocabulary and this is normally in the range of the tens or hundreds of thousands even...
Enhancements for local feature based image classification (2004)
Tobias Kölsch, Daniel Keysers, Hermann Ney, Roberto Paredes
Using local features with nearest neighbor search and direct voting obtains excellent results for various image classification tasks. In this work we decompose the method into its basic steps which...
Improvements in phrase-based statistical machine translation (2004)
In statistical machine translation, the currently best performing systems are based in some way on phrases or word groups. We describe the baseline phrase-based translation system and various...
Error Measures and Bayes Decision Rules Revisited with Applications to POS Tagging (2004)
Hermann Ney, Maja Popović, David Sündermann
Starting from first principles, we re-visit the statistical approach and study two forms of the Bayes decision rule: the common rule for minimizing the number of string errors and a novel rule for...
Symmetric Word Alignments for Statistical Machine Translation (2004)
Evgeny Matusov, Richard Zens, Hermann Ney
In this paper, we address the word alignment problem for statistical machine translation. We aim at creating a symmetric word alignment allowing for reliable one-to-many and many-to-one word...
Statistical Sign Language Translation (2004)
In the field of machine translation, significant progress has been made by using statistical methods. In this paper we suggest a statistical machine translation system for Sign Language and written...
The Statistical Approach to Spoken Language Translation (2004)
This paper gives an overview of our work on statistical machine translation of spoken dialogues, in particular in the framework of the VERBMOBIL project. The goal of the VERBMOBIL project is the...
Pixel-to-Pixel Matching for Image Recognition using Hungarian Graph Matching (2004)
Daniel Keysers, Thomas Deselaers, Hermann Ney
Abstract. A fundamental problem in image recognition is to evaluate the similarity of two images. This can be done by searching for the best pixel-to-pixel matching taking into account suitable...
Features for Image Retrieval – A Quantitative Comparison (2004)
Thomas Deselaers, Daniel Keysers, Hermann Ney
Abstract. In this paper, different well-known features for image retrieval are quantitatively compared and their correlation is analyzed. We compare the features for two different image retrieval...
Linear discriminant analysis and discriminative log-linear modeling (2004)
We discuss the relationship between the discriminative training of Gaussian models and the maximum entropy framework for log-linear models. Observing that linear transforms leave the distributions...
Adaption in Statistical Pattern Recognition Using . . . (2004)
Wolfgang Macherey, Hermann Ney, Jörg Dahmen
We integrate the tangent method into a statistical framework for classification analytically and practically. The resulting consistent framework for adaptation allows us to efficiently estimate the...
Local Context in Non-linear Deformation Models for Handwritten Character Recognition (2004)
Daniel Keysers, Christian Gollan, Hermann Ney
We evaluate different two-dimensional non-linear deformation models for handwritten character recognition. Starting from a true two-dimensional model, we derive pseudo-two-dimensional and zero-order...
Classification of Medical Images Using Non-Linear Distortion Models (2004)
Daniel Keysers, Christian Gollan, Hermann Ney
We propose the application of two-dimensional distortion models for comparisons of medical images in a distance-based classifier.
Classification Error Rate for Quantitative Evaluation of Content-Based Image Retrieval (2004)
Thomas Deselaers, Daniel Keysers, Hermann Ney
A major problem in the field of content-based image retrieval is the lack of a common performance measure which allows the researcher to compare different image retrieval systems in a quantitative...
Do We Need Chinese Word Segmentation for Statistical Machine Translation (2004)
Jia Xu, Richard Zens, Hermann Ney
In Chinese texts, words are not separated by white spaces. This is problematic for many natural language processing tasks. The standard approach is to segment the Chinese character sequence into...
FIRE - Flexible Image Retrieval Engine: ImageCLEF 2004 Evaluation (2004)
Thomas Deselaers, Daniel Keysers, Hermann Ney
In this paper we present FIRE, a content-based image retrieval system and the methods we used in the ImageCLEF 2004 evaluation. In FIRE, different features are available to represent images. This...
Probabilistic Aspects in Spoken Document Retrieval (2003)
Wolfgang Macherey, Hans Jörg Viechtbauer, Hermann Ney
Accessing information in multimedia databases encompasses a wide range of applications in which spoken document retrieval (SDR) plays an important role. In SDR, a set of automatically transcribed...
A novel string-to-string distance measure with applications to machine translation evaluation (2003)
Gregor Leusch, Nicola Ueffing, Hermann Ney
We introduce a string-to-string distance measure which extends the edit distance by block transpositions as constant cost edit operation. An algorithm for the calculation of this distance measure in...
Maximum entropy models for named entity recognition (2003)
Oliver Bender, Franz Josef Och, Hermann Ney
In this paper, we describe a system that applies maximum entropy (ME) models to the task of named entity recognition (NER). Starting with an annotated corpus and a set of features which are easily...
Oliver Bender, Klaus Macherey, Franz Josef Och, Hermann Ney
In this paper we compare two approaches to natural language understanding (NLU). The first approach is derived from the field of statistical machine translation (MT), whereas the other uses the...
Feature space normalization in adverse acoustic conditions (2003)
Sirko Molau, Florian Hilger, Hermann Ney
We study the effect of different feature space normalization techniques in adverse acoustic conditions. Recognition tests are reported for cepstral mean and variance normalization, histogram...
Wolfgang Macherey, Hermann Ney
While Maximum Entropy (ME) based learning procedures have been successfully applied to text based natural language processing, there are only little investigations on using ME for acoustic modeling...
Extraction Methods of Voicing Feature for Robust Speech Recognition (2003)
András Zolnay, Ralf Schlüter, Hermann Ney
In this paper, three different voicing features are studied as additional acoustic features for continuous speech recognition. The harmonic product spectrum based feature is extracted in frequency...
Efficient Search for Interactive Statistical Machine Translation (2003)
Franz Josef Och, Richard Zens, Hermann Ney
The goal of interactive machine translation is to improve the productivity of human translators. An interactive machine translation system operates as follows: the automatic system proposes a...
A comparative study on reordering constraints in statistical machine translation (2003)
In statistical machine translation, the generation of a translation hypothesis is computationally expensive. If arbitrary word reorderings are permitted, the search problem is NP-hard. On the other...
Confidence Measures for Statistical Machine Translation (2003)
Nicola Ueffing, Klaus Macherey, Hermann Ney
In this paper, we present several confidence measures for (statistical) machine translation. We introduce word posterior probabilities for words in the target sentence that can be determined either...
Synther -- A New M-Gram POS Tagger (2003)
David Sündermann, Hermann Ney
In this paper, the Part-Of-Speech (POS) tagger synther based on m-gram statistics is described. After explaining its basic architecture, three smoothing approaches and the strategy for handling...
A Comparative Study on Reordering Constraints in Statistical Machine Translation (2003)
Richard Zens, Hermann Ney
In statistical machine translation, the generation of a translation hypothesis is computationally expensive. If arbitrary word reorderings are permitted, the search problem is NP-hard. On the other...
Features for Tree Based Dialogue Course Management (2003)
Klaus Macherey, Hermann Ney
In this paper, we introduce different features for dialogue course management and investigate their effect on the system's behaviour for choosing the subsequent dialogue action during a dialogue...
Local Representations for Multi-Object Recognition (2003)
Thomas Deselaers, Daniel Keysers, Roberto Paredes, Enrique Vidal, Hermann Ney
Methods for the recognition of multiple objects in images using local representations are introduced. Starting from a straight forward approach, we combine the use of local representations with...
Training and recognition of complex scenes using a holistic statistical model (2003)
Daniel Keysers, Michael Motter, Thomas Deselaers, Hermann Ney
Abstract. We present a holistic statistical model for the automatic analysis of complex scenes. Here, holistic refers to an integrated approach that does not take local decisions about segmentation...
Clustering visually similar images to improve image search engines (2003)
Thomas Deselaers, Daniel Keysers, Hermann Ney
GI subjects: image understanding (1.0.4), machine learning (1.1.3) At the moment Google image search is probably the only widely known way to search the world wide web for images. Google’s search...
VTLN-Based Cross-Language Voice Conversion (2003)
In speech recognition, vocal tract length normalization (VTLN) is a well-studied technique for speaker normalization. As cross-language voice conversion aims at the transformation of a source...
Vocal Tract Normalization as Linear Transformation (2003)
We have shown previously that vocal tract normalization (VTN) results in a linear transformation in the cepstral domain. In this paper we show that Mel-frequency warping can equally well be...
Christoph Tillmann, Hermann Ney
In this article, we describe an efficient beam search algorithm for statistical machine translation based on dynamic programming (DP). The search algorithm uses the translation model presented in...
Efficient Maximum Entropy Training for Statistical Object Recognition (2002)
Daniel Keysers, Franz Josef Och, Hermann Ney
GI subjects: image understanding (1.0.4), machine learning (1.1.3) In statistical pattern recognition, we use probabilistic models within the task of assigning observations to one of a set of...
Ismael García Varea, Franz J. Och, Hermann Ney, Francisco Casacuberta
maximum entropy models
Speaker Adaptive Modeling by Vocal Tract Normalization (2002)
Lutz Welling, Hermann Ney, Stephan Kanthak
Abstract. This paper presents methods for speaker adaptive modeling using vocal tract normalization (VTN) along with experimental tests on three databases. We propose a new training method for VTN:...
A comparison of two LVR search optimization techniques (2002)
Stephan Kanthak, Hermann Ney, Michael Riley, Mehryar Mohri
This paper presents a detailed comparison between two search optimization techniques for large vocabulary speech recognition -- one based on word-conditioned tree search (WCTS) and one based on...
Reversing and smoothing the multinomial naive Bayes text classifier (2002)
Abstract. The naive Bayes text classifier has long been a core technique in information retrieval and, more recently, it has emerged as a focus of research itself in machine learning. This paper is...
Enhanced histogram normalization in the acoustic feature space (2002)
Sirko Molau, Florian Hilger, Daniel Keysers, Hermann Ney
We describe two methods that aim at normalizing acoustic vectors at the filterbank level such that the test data distribution matches the training data distribution. They enhance the histogram...
Discriminative training and maximum entropy models for statistical machine translation (2002)
We present a framework for statistical machine translation of natural languages based on direct maximum entropy models, which contains the widely used source-channel approach as a special case. All...
Towards Automatic Corpus Preparation for a German Broadcast News Transcription System (2002)
Wolfgang Macherey, Hermann Ney
When setting up a speech recognition system for a new domain, a lot of manual effort is spent on corpus preparation, i.e., data acquisition, cutting and segmentation of the audio material, generation...
Quantile based histogram equalization for online applications (2002)
Florian Hilger, Sirko Molau, Hermann Ney
The noise robustness of automatic speech recognition systems can be increased by transforming the signal to make the cumulative density functions of the signal's values in recognition match the...
Scoring Criteria for Tree based Dialogue Course Management (2002)
In this paper, we propose different scoring criteria for dialogue course management and investigate their effect on the system's behaviour for choosing the...
Training of across-word phoneme models for large vocabulary continuous speech recognition (2002)
Today's speech recognition systems use across-word context dependent phoneme models to capture coarticulation across word boundaries. While there are several publications about the organization...
Maximum Entropy and Gaussian Models for Image Object Recognition (2002)
Daniel Keysers, Franz Josef Och, Hermann Ney
Abstract. The principle of maximum entropy is a powerful framework that can be used to estimate class posterior probabilities for pattern recognition tasks. In this paper, we show how this principle...
Phrase-based statistical machine translation (2002)
Richard Zens, Franz Josef Och, Hermann Ney
Abstract. This paper is based on the work carried out in the framework of the Verbmobil project, which is a limited-domain speech translation task (German-English). In the final evaluation, the...
Combination of Tangent Vectors and Local Representations for Handwritten Digit Recognition (2002)
Daniel Keysers, Roberto Paredes, Hermann Ney, Enrique Vidal
Abstract. Statistical classification using tangent vectors and classification based on local features are two successful methods for various image recognition problems. These two approaches tolerate...
Phrase-based statistical machine translation (2002)
This paper gives an overview of statistical machine translation and presents the publicly available SMT toolkit EGYPT. Starting with the Bayes decision rule as in speech recognition, we show how...
Classification of medical images using local representations (2002)
Roberto Paredes, Daniel Keysers, Thomas M. Lehmann, Berthold Wein, Hermann Ney, Enrique Vidal, ...
Abstract. In medical image retrieval, the images are usually subject to a large range of variability. In order to classify medical images, we therefore propose the use of local representations, which...
A Statistical Framework for Multi-Object Recognition (2001)
Daniel Keysers, Jörg Dahmen, Hermann Ney, Mark Oliver Güld
GI subjects: image understanding (1.0.4), machine learning (1.1.3) In this paper, we present a statistical framework for the recognition of multiple objects in an image, which is a generalization of...
Computing Mel-frequency cepstral coefficients on the power spectrum (2001)
Sirko Molau, Michael Pitz, Ralf Schlüter, Hermann Ney
In this paper we present a method to derive Mel-frequency cepstral coefficients directly from the power spectrum of a speech signal. We show that omitting the filterbank in signal analysis does not...
Refined lexicon models for statistical machine translation using a maximum entropy approach (2001)
Franz J. Och, Hermann Ney, ...
Typically, the lexicon models used in statistical machine translation systems do not include any kind of linguistic or contextual information, which often leads to problems in performing a correct...
Explicit word error minimization using word hypothesis posterior probabilities (2001)
In this paper, we introduce a new concept, the time frame error rate. We show that this error rate is closely correlated with the word error rate and use it to overcome the mismatch between Bayes...
Vocal tract normalization equals linear transformation in cepstral space (2001)
Michael Pitz, Sirko Molau, Ralf Schlüter, Hermann Ney
We show that vocal tract normalization (VTN) frequency warping results in a linear transformation in the cepstral domain. For the special case of a piece-wise linear warping function, the...
Unsupervised training of acoustic models for large vocabulary continuous speech recognition (2001)
For speech recognition systems, the amount of acoustic training data is of crucial importance. In the past, large amounts of speech were thus recorded and transcribed manually for training. Since...
Confidence measures for large vocabulary continuous speech recognition (2001)
Frank Wessel, Ralf Schlüter, Klaus Macherey, Hermann Ney
Abstract. In this paper, we present several confidence measures for large vocabulary continuous speech recognition. We propose to estimate the confidence of a hypothesized word directly as its...
An efficient A* search algorithm for statistical machine translation (2001)
Franz Josef Och, Nicola Ueffing, Hermann Ney
In this paper, we describe an efficient A* search algorithm for statistical machine translation. In contrast to beam search or greedy approaches, it is possible to guarantee the avoidance of search...
Natural Language Understanding Using Statistical Machine Translation (2001)
Klaus Macherey, Franz Josef Och, Hermann Ney
Over the past years, automatic dialogue systems and telephone-based machine inquiry systems have received increasing attention. In addition to an automatic speech recognizer and a dialogue manager,...
Learning of variability for invariant statistical pattern recognition (2001)
Daniel Keysers, Wolfgang Macherey, Hermann Ney
Abstract. In many applications, modelling techniques are necessary which take into account the inherent variability of given data. In this paper, we present an approach to model class specific pattern...
Statistical multi-source translation (2001)
We describe methods for translating a text given in multiple source languages into a single target language. The goal is to improve translation quality in applications where the ultimate goal is to...
Histogram Based Normalization In The Acoustic Feature Space (2001)
Sirko Molau, Michael Pitz, Hermann Ney
We describe a technique called histogram normalization that aims at normalizing feature space distributions at different stages in the signal analysis front-end, namely the log-compressed filterbank...
Quantile based histogram equalization for noise robust speech recognition (2001)
This paper describes an approach to increase the noise robustness of automatic speech recognition systems by transforming the signal after Mel-scaled filtering to make the cumulative density...
Using phase spectrum information for improved speech recognition performance (2001)
In this work, new acoustic features for continuous speech recognition based on the short-term Fourier phase spectrum are introduced for mono (telephone) recordings. The new phase based features were...
Stochastic Modelling: From Pattern Classification to Language Translation (2001)
Hermann Ney
This paper gives an overview of the stochastic modelling approach to machine translation. Starting with the Bayes decision rule as in pattern classification and speech recognition, we show how the...
An Efficient A* Search Algorithm for Statistical Machine Translation (2001)
Franz Josef Och, Nicola Ueffing, Hermann Ney
In this paper, we describe an efficient A* search algorithm for statistical machine translation. In contrast to beam search or greedy approaches it is possible to guarantee the avoidance of search...
Learning of Variability for Invariant Statistical Pattern Recognition (2001)
Daniel Keysers, Wolfgang Macherey, Jörg Dahmen, Hermann Ney
In many applications, modelling techniques are necessary which take into account the inherent variability of given data. In this paper, we present an approach to model class specific pattern...
Explicit Word Error Minimization Using Word Hypothesis Posterior Probabilities (2001)
Frank Wessel, Ralf Schlüter, Hermann Ney
In this paper, we introduce a new concept, the time frame error rate. We show that this error rate is closely correlated with the word error rate and use it to overcome the mismatch between...
Fast search for large vocabulary speech recognition (2000)
Stephan Kanthak, Achim Sixtus, Sirko Molau, Ralf Schlüter, Hermann Ney
In this article we describe methods for improving the RWTH German speech recognizer used within the VERBMOBIL project. In particular, we present acceleration methods for the search based on...
Word re-ordering and dp-based search in statistical machine translation (2000)
Christoph Tillmann, Hermann Ney
In this paper, we describe a search procedure for statistical machine translation (MT) based on dynamic programming (DP). Starting from a DP-based solution to the traveling salesman problem, we...
Michael Pitz, Frank Wessel, Hermann Ney
Automatic recognition of conversational speech tends to have higher word error rates (WER) than read speech. Improvements gained from unsupervised speaker adaptation methods like Maximum Likelihood...
Progress in dynamic programming search for LVCSR (2000)
This paper gives an overview of the recent improvements in dynamic programming search for large vocabulary continuous speech recognition: search using lexical trees, time-conditioned search...
Improved Statistical Alignment Models (2000)
In this paper, we present and compare various single-word based alignment models for statistical machine translation. We discuss the five IBM alignment models, the Hidden-Markov alignment model,...
Translation with cascaded finite state transducers (2000)
In this paper we discuss the use of cascaded finite state transducers for machine translation. A number of small, dedicated transducers is applied to convert sentence pairs from a bilingual corpus...
Automatic extrapolation of human assessment of translation quality (2000)
The evaluation of machine translation systems is a difficult and time consuming task. To be meaningful and reliable, translation quality has to be evaluated
Statistical methods for machine translation (2000)
Stephan Vogel, Franz Josef Och, Christoph Tillmann, Sonja Nießen, Hassan Sawaf, Hermann Ney
In this article we describe the statistical approach to machine translation as implemented in the stattrans module of the Verbmobil system. The statistical translation approach uses two...
On the use of grammar based language models for statistical machine translation (2000)
In this paper, we describe some concepts of language models beyond the usually used standard trigram and prove the need of such language models for statistical machine translation. In statistical...
Translation with cascaded finite state transducers (2000)
In this paper we discuss the use of cascaded finite state transducers for machine translation. A number of small, dedicated transducers is applied to convert sentence pairs from a bilingual corpus into...
Efficient Vocal Tract Normalization in Automatic Speech Recognition (2000)
Sirko Molau, Stephan Kanthak, Hermann Ney
In this paper we study the effect of vocal tract normalization (VTN) on the word error rate (WER) in speaker independent large vocabulary speech recognition. Evaluation test results are reported for...
Using SIMD Instructions for Fast Likelihood Calculation (2000)
Most modern processor architectures provide SIMD (single instruction multiple data) instructions to speed up algorithms based on vector or matrix operations. This paper describes the use of SIMD...
The RWTH Large Vocabulary Speech Recognition System for Spontaneous Speech (2000)
Stephan Kanthak, Sirko Molau, Achim Sixtus, Ralf Schlüter, Hermann Ney
This paper presents details of the RWTH large vocabulary continuous speech recognition system used in the VERBMOBIL spontaneous speech translation system. In particular, we report on methods for...
Experiments with an extended tangent distance (2000)
Daniel Keysers, Jörg Dahmen, Thomas Theiner, Hermann Ney
Invariance is an important aspect in image object recognition. We present results obtained with an extended tangent distance incorporated in a kernel density based Bayesian classifier to compensate...
Fast search for large vocabulary speech recognition (2000)
Stephan Kanthak, Achim Sixtus, Sirko Molau, Ralf Schlüter, Hermann Ney
In this article we describe methods for improving the RWTH German speech recognizer used within the VERBMOBIL project. In particular, we present acceleration methods for the search based on...
Word re-ordering and dp-based search in statistical machine translation (2000)
Christoph Tillmann, Hermann Ney
In this paper, we describe a search procedure for statistical machine translation (MT) based on dynamic programming (DP). Starting from a DP-based solution to the traveling salesman problem, we...
Within-word vs. across-word decoding for online speech recognition (2000)
Stephan Kanthak, Achim Sixtus, Sirko Molau, Hermann Ney
In this paper we describe methods for improving the RWTH German speech recognizer used within the VERBMOBIL project. In particular, we present acceleration methods for the search based on both...
Invariant Image Object Recognition using Mixture Densities (2000)
Jörg Dahmen, Daniel Keysers, Mark Oliver Güld, Hermann Ney
In this paper we present a mixture density based approach to invariant image object recognition. We start our experiments using Gaussian mixture densities within a Bayesian classifier. Invariance to...
Structured Covariance Matrices for Statistical Image Object Recognition (2000)
Jörg Dahmen, Daniel Keysers, Michael Pitz, Hermann Ney
In this paper we present different approaches to structuring covariance matrices within statistical classifiers. This is motivated by the fact that the use of full covariance matrices is infeasible...
Achim Sixtus, Sirko Molau, Stephan Kanthak, Ralf Schlüter, Hermann Ney
This paper presents recent improvements of the RWTH large vocabulary continuous speech recognition system (LVCSR). In particular, we will report on the integration of across-word models into the first...
A Mixture Density Based Approach to Object Recognition for Image Retrieval (2000)
Jörg Dahmen, Mark Oliver Güld, Hermann Ney, Klaus Beulen, ...
In the last few years, statistical classifiers based on Gaussian mixture densities proved to be very efficient for automatic speech recognition. The aim of this paper is to find out how well such a...
Noise Level Normalization And Reference Adaptation For Robust Speech Recognition (2000)
This paper describes an approach to normalize the noise level of a speech signal at the outputs of the Mel scaled filter-bank used in MFCC-feature extraction. An adaptive normalizing function that...
Improving SMT quality with morpho-syntactic analysis (2000)
In the framework of statistical machine translation (SMT), correspondences between the words in the source and the target language are learned from bilingual corpora on the basis of so-called...
Jörg Dahmen, Thomas Theiner, Daniel Keysers, Hermann Ney, Berthold Wein, ...
In this paper we present a new approach to classifying radiographs, which is the first important task of the IRMA system. Given an image, we compute posterior probabilities for each image class, as...
Improved Statistical Alignment Models (2000)
In this paper, we present and compare various single-word based alignment models for statistical machine translation. We discuss the five IBM alignment models, the Hidden-Markov alignment model,...
A Comparison of Alignment Models for Statistical Machine Translation (2000)
In this paper, we present and compare various alignment models for statistical machine translation. We propose to measure the quality of an alignment model using the quality of the Viterbi alignment...
Automatic Classification of Red Blood Cells using Gaussian Mixture Densities (2000)
Jörg Dahmen, Jens Hektor, Ralf Perrey, Hermann Ney
In this paper we present an invariant statistical approach to classifying red blood cells (RBC). Given a database of 5062 grayscale images, we model the distribution of the observations by using...
A Probabilistic View on Tangent Distance (2000)
Daniel Keysers, Jörg Dahmen, Hermann Ney
In this paper we present a new probabilistic interpretation of tangent distance, which proved to be very effective in modeling image transformations in object recognition. Descriptions of the...
Experiments with an Extended Tangent Distance (2000)
Daniel Keysers, Jörg Dahmen, Thomas Theiner, Hermann Ney, Lehrstuhl Für Informatik
Invariance is an important aspect in image object recognition. We present results obtained with an extended tangent distance incorporated in a kernel density based Bayesian classifier to compensate...
Experiments with an extended tangent distance (2000)
Daniel Keysers, Jörg Dahmen, Thomas Theiner, Hermann Ney, Lehrstuhl Für Informatik
Invariance is an important aspect in image object recognition. We present results obtained with an extended tangent distance incorporated in a kernel density based Bayesian classifier to compensate...
Improved alignment models for statistical machine translation (1999)
Franz Josef Och, Christoph Tillmann, Hermann Ney, Lehrstuhl Für Informatik
In this paper, we describe improved alignment models for statistical machine translation. The statistical translation approach uses two types of information: a translation model and a language...
Improved alignment models for statistical machine translation (1999)
In this paper, we present and compare various alignment models for statistical machine translation. We propose to measure the quality of an alignment model...
Automatic Transcription Verification of Broadcast News and Similar Speech Corpora (1999)
Michael Pitz, Sirko Molau, Ralf Schlüter, Hermann Ney
In the last few years, the focus in ASR research has shifted from the recognition of clean read speech (i.e. WSJ) to the more challenging task of transcribing found speech like broadcast news (Hub-4...
Automatic Transcription Verification of Broadcast News and Similar Speech Corpora (1999)
Michael Pitz, Sirko Molau, Ralf Schlüter, Hermann Ney
In the last few years, the focus in ASR research has shifted from the recognition of clean read speech (i.e. WSJ) to the more challenging task of transcribing found speech like broadcast news (Hub-4...
Improved alignment models for statistical machine translation (1999)
In this paper, we present and compare various alignment models for statistical machine translation. We propose to measure the quality of an alignment model using the quality of the Viterbi alignment...
Improved Alignment Models for Statistical Machine Translation (1999)
Franz Josef Och, Christoph Tillmann, Hermann Ney
In this paper, we describe improved alignment models for statistical machine translation. The statistical translation approach uses two types of information: a translation model and a language model....
A Comparison Of Dialogue-State Dependent Language Models (1999)
Frank Wessel, Andrea Baader, Hermann Ney
Dialogue-state dependent language models in automatic inquiry systems can be employed to improve speech recognition and understanding. In this paper, the dialogue state is defined by the set of...
A Comparison Of Word Graph And N-Best List Based Confidence Measures (1999)
Frank Wessel, Klaus Macherey, Hermann Ney
In this paper we present and compare several confidence measures for large vocabulary continuous speech recognition. We show that posterior word probabilities computed on word graphs and N-best lists...
Dynamic Programming Search for Continuous Speech Recognition (1999)
Initially introduced in the late 1960s and early 1970s, dynamic programming algorithms have become increasingly popular in automatic speech recognition. There are two reasons why this has occurred:...
Improved lexical tree search for large vocabulary speech recognition (1998)
Stefan Ortmanns, Andreas Eiden, Hermann Ney
This paper describes some extensions to the language model (LM) look-ahead pruning approach which is integrated into the time-synchronous beam search algorithm. The search algorithm is based on a...
Algorithms for bigram and trigram word clustering (1998)
Sven Martin, Jörg Liermann, Hermann Ney
This paper presents and analyzes improved algorithms for clustering bigram and trigram word equivalence classes, and their respective results: 1) We give a detailed time complexity analysis...
Algorithms For Bigram And Trigram Word Clustering (1998)
Sven Martin, Jörg Liermann, Hermann Ney
In this paper, we describe an efficient method for obtaining word classes for class language models. The method employs an exchange algorithm using the criterion of perplexity improvement. The novel...
An Iterative, DP-based Search Algorithm for Statistical Machine Translation (1998)
Ismael García-Varea, Francisco Casacuberta, Hermann Ney
The increasing interest in the statistical approach to Machine Translation is due to the development of effective algorithms for training the probabilistic models proposed so far. However, one of the...
Word trigger and the em algorithm (1997)
Christoph Tillmann, Hermann Ney
In this paper, we study the use of so-called word trigger pairs to improve an existing language model, which is typically a trigram model in...
Word trigger and the em algorithm (1997)
Christoph Tillmann, Hermann Ney
In this paper, we study the use of so-called word trigger pairs to improve an existing language model, which is typically a trigram model in...
Adaptive topic-dependent language modelling using word-based varigrams (1997)
Sven C. Martin, Jörg Liermann, Hermann Ney
This paper presents two extensions of the standard interpolated word trigram and cache model, namely the extension of the trigram model by useful word m-grams with m > 3 resulting into a varigram...
Stefan Ortmanns, Hermann Ney, Thorsten Firzlaff
This paper studies algorithms for reducing the computational effort of the mixture density calculations in HMM-based speech recognition systems. These likelihood calculations take about 70-85...
Word trigger and the em algorithm (1997)
Christoph Tillmann, Hermann Ney
In this paper, we study the use of so-called word trigger pairs to improve an existing language model, which is typically a trigram model in combination with a cache component. A word trigger pair is...
Implementation Of Word Based Statistical Language Models (1997)
Frank Wessel, Stefan Ortmanns, Hermann Ney
In this paper we present an efficient data structure for storing trigram, bigram and unigram counts. The amount of memory required has been reduced by 53% compared to straightforward approaches....
HMM-based word alignment in statistical translation (1996)
Stephan Vogel, Hermann Ney, Christoph Tillmann
In this paper, we describe a new model for word alignment in statistical translation and present experimental results. The idea of the model is to make the alignment probabilities dependent on the...
Statistical language modeling and word triggers (1996)
Christoph Tillmann, Hermann Ney
This paper describes the use of word triggers in the context of statistical language modeling for speech recognition. It consists of two parts: First we describe the use of trigram models and...
Selection criteria for word trigger pairs in language modelling (1996)
Christoph Tillmann, Hermann Ney
In this paper, we study selection criteria for the use of word trigger pairs in statistical language modeling. A word trigger pair is defined as a long-distance word pair. To select the...
Bloch,S.,Height pairings for algebraic cycles,J (1984)
When translating from languages with hardly any inflectional morphology like English into morphologically rich languages, the English word forms often do not contain enough information for producing...
Family Compliance Office (1974)
The performance of a statistical machine translation system depends on the size of the available task-specific bilingual training corpus. On the other hand, acquisition of a large high-quality...
Assessment Of Smoothing Methods And Complex Stochastic Language Modeling (1939)
Sven Martin, Christoph Hamacher, Jörg Liermann, Frank Wessel, Hermann Ney
This paper studies the overall effect of language modeling on perplexity and word error rate, starting from a trigram model with a standard smoothing method up to complex state-of-the-art...
A DP-based Search Using Monotone Alignments in Statistical Translation
Christoph Tillmann, Stephan Vogel, Hermann Ney, Alex Zubiaga
Statistical Methods for Machine Translation
Stephan Vogel, Franz Josef Och, S. Nießen, H. Sawaf, C. Tillmann, Hermann Ney
In this article we describe the statistical approach to machine translation as implemented in the stattrans module of the VERBMOBIL system. In this paper, we describe the present...