Automatic language model adaptation for spoken document retrieval (2000)
Abstract
This paper describes experiments conducted at NIST on adapting language models over time to improve recognition of broadcast news recorded over many months. The experiments were designed specifically to improve the utility of automatically generated transcripts for retrieval applications. To evaluate the potential of the approach, a time-adaptive automatic speech recognition run was implemented to support the 1999 TREC Spoken Document Retrieval (SDR) Track, which comprised more than 500 hours of broadcast news sampled across 5 months. Retrieval accuracy for several systems using the time-adaptive transcripts was evaluated against transcripts produced by virtually the same recognition system with a fixed language model. The paper details the process we employed to identify and implement the time-adaptive language model and discusses the results of the experiment in terms of word error rate, out-of-vocabulary rate, and retrieval accuracy (Mean Average Precision).
Publication details