Abstract
This paper presents the development of a Named Entity (NE) recognition system for the Italian broadcast news domain. A statistical model is introduced, based on a trigram language model defined over words and NE classes. The NE model is estimated from a very small list of 2,360 manually tagged NEs and a large untagged newspaper corpus. An iterative training procedure is applied that first estimates simpler models, whose parameters are then used to initialize the complete NE model. Finally, NE recognition experiments are reported on broadcast news transcripts generated by a speech recognition system.
Publication details