TC-Star: Cross-Language Voice Conversion Revisited (2008)
David Sündermann, Harald Höge, Antonio Bonafonte, Hermann Ney, Julia Hirschberg
In the framework of the European speech-to-speech translation project TC-Star, one of the research tasks is cross-language voice conversion. In the recent second evaluation campaign, five...
Residual Prediction (2008)
David Sündermann, Harald Höge, Antonio Bonafonte, Helenca Duxans
Residual prediction is a technique that aims at recovering the spectral details of speech that was encoded using parameterizations as linear predictive coefficients. Example applications of residual...
Text-independent voice conversion based on unit selection (2006)
David Sündermann, Harald Höge, Antonio Bonafonte, Hermann Ney, Alan Black, Shri Narayanan
So far, most of the voice conversion training procedures are text-dependent, i.e., they are based on parallel training utterances of source and target speaker. Since several applications (e.g....
Residual Prediction Based on Unit Selection (2005)
David Sündermann, Harald Höge, Antonio Bonafonte, Hermann Ney, Alan W Black
Recently, we presented a study on residual prediction techniques that can be applied to voice conversion based on linear transformation or hidden Markov model-based speech synthesis. Our voice...
A Study on Residual Prediction Techniques for Voice Conversion (2005)
David Sündermann, Antonio Bonafonte
Several well-studied voice conversion techniques use line spectral frequencies as features to represent the spectral envelopes of the processed speech frames. In order to return to the time domain,...
Voice conversion using exclusively unaligned training data (2004)
David Sündermann, Antonio Bonafonte Cávez, Harald Höge, Hermann Ney
Although all conventional voice conversion approaches require equivalent training utterances of source and target speaker, several recently proposed applications call for breaking this demand. In...
Error Measures and Bayes Decision Rules Revisited with Applications to POS Tagging (2004)
Hermann Ney, Maja Popović, David Sündermann
Starting from first principles, we re-visit the statistical approach and study two forms of the Bayes decision rule: the common rule for minimizing the number of string errors and a novel rule for...
A First Step Towards Text-Independent Voice Conversion (2004)
David Sündermann, Antonio Bonafonte
So far, all conventional voice conversion approaches are text-dependent, i.e., they need equivalent training utterances of source and target speaker. Since several recently proposed applications call...
Synther -- A New M-Gram POS Tagger (2003)
David Sündermann, Hermann Ney
In this paper, the Part-Of-Speech (POS) tagger synther based on m-gram statistics is described. After explaining its basic architecture, three smoothing approaches and the strategy for handling...
VTLN-Based Cross-Language Voice Conversion (2003)
In speech recognition, vocal tract length normalization (VTLN) is a well-studied technique for speaker normalization. As cross-language voice conversion aims at the transformation of a source...