Document Expansion for Speech Retrieval (2007)
Abstract
Advances in automatic speech recognition allow us to search large speech collections using traditional information retrieval methods. The problem of "aboutness" for documents (is a document about a certain concept?) has been at the core of document indexing for the entire history of IR. This problem is more difficult for speech indexing since automatic speech transcriptions often contain mistakes. In this study we show that document expansion can be successfully used to alleviate the effect of transcription mistakes on speech retrieval. The loss of retrieval effectiveness due to automatic transcription errors can be reduced by document expansion from 15–27% relative to retrieval from human transcriptions to only about 7–13%, even for automatic transcriptions with word error rates as high as 65%. For good automatic transcriptions (25% word error rate), retrieval effectiveness with document expansion is indistinguishable from retrieval from human transcriptions. This makes speech retrieval from automatic transcriptions, even poor ones, competitive with retrieval from perfect transcriptions.