Nonlinear Spectral Transformations For Robust Speech Recognition (2004)
Shajith Ikbal, Hynek Hermansky, Hervé Bourlard
Recently, a nonlinear transformation of autocorrelation coefficients named Phase AutoCorrelation (PAC) coefficients has been considered for feature extraction [1]. PAC based features show improved...
Phase Autocorrelation (PAC) Derived Robust Speech Features (2004)
Shajith Ikbal, Hemant Misra, Hervé Bourlard
In this paper, we introduce a new class of noise robust acoustic features derived from a new measure of autocorrelation, and explicitly exploiting the phase variation of the speech signal frame over...
Phase Autocorrelation (PAC) Features In Entropy Based Multi-Stream (2004)
Shajith Ikbal, Hemant Misra, Hervé Bourlard, Hynek Hermansky
Methods to improve noise robustness of speech recognition systems often result in degradation of recognition performance for clean speech. Recently proposed Phase AutoCorrelation (PAC) [1, 2] based...
Spectro-Temporal Activity Pattern (STAP) Features for Noise Robust ASR (2004)
Shajith Ikbal, Mathew Magimai.-Doss, Hemant Misra, Hervé Bourlard
In this paper, we introduce a new noise robust representation of speech signal obtained by locating points of potential importance in the spectrogram, and parameterizing the activity of...
Mel-Cepstrum Modulation Spectrum (MCMS) Features For Robust ASR (2003)
Vivek Tyagi, Iain McCowan, Hemant Misra, Hervé Bourlard
In this paper, we present new dynamic features derived from the modulation spectrum of the cepstral trajectories of the speech signal. Cepstral trajectories are projected over the basis of sines and...
On Factorizing Spectral Dynamics for Robust Speech Recognition (2003)
Vivek Tyagi, Iain McCowan, Hervé Bourlard, Hemant Misra
In this paper, we introduce new dynamic speech features based on the modulation spectrum. These features, termed Mel-Cepstrum Modulation Spectrum (MCMS), map the time trajectories of the spectral...
On Automatic Annotation Of Meeting Databases (2003)
Daniel Gatica-Perez, Iain McCowan, Mark Barnard, Samy Bengio, Hervé Bourlard
In this paper, we present meetings as an application domain for multimedia content analysis. Meeting databases are a rich data source suitable for a variety of audio, visual and multi-modal tasks,...
New Entropy Based Combination Rules In HMM/ANN Multi-Stream ASR (2003)
Classifier performance is often enhanced through combining multiple streams of information. In the context of multistream HMM/ANN systems in ASR, a confidence measure widely used in classifier...
Modeling Auxiliary Information in Bayesian Network Based ASR (2001)
Todd A. Stephenson, M. Mathew, Hervé Bourlard
Automatic speech recognition bases its models on the acoustic features derived from the speech signal. Some have investigated replacing or supplementing these features with information that cannot...
Non-Stationary Multi-Channel (Multi-Stream) Processing Towards Robust And Adaptive ASR (2000)
In this paper, we discuss the rationale behind multichannel processing as applied to multi-stream automatic speech recognition (ASR). In this framework, we will develop different mathematical...
Different Weighting Schemes In The Full Combination Subbands Approach For Noise Robust ASR (1999)
Astrid Hagen, Andrew Morris, Hervé Bourlard
In this paper, we present and investigate a new method for subband-based Automatic Speech Recognition (ASR) which approximates the ideal "full combination" approach which is itself often not...