Marcello Federico

Efficient Handling of N-gram Language Models for Statistical Machine Translation (2009)

Marcello Federico, Mauro Cettolo

Statistical machine translation, as well as other areas of human language processing, has recently pushed toward the use of large-scale n-gram language models. This paper presents efficient...

A Web-based Demonstrator of a Multi-lingual Phrase-based Translation System (2009)

Roldano Cattoni, Nicola Bertoldi, Mauro Cettolo, Boxing Chen, Marcello Federico

This paper describes a multi-lingual phrase-based Statistical Machine Translation system accessible by means of a Web page. The user can issue translation requests from Arabic, Chinese or Spanish...

Exploiting Word Transformation in Statistical Machine Translation from Spanish to English (2008)

Deepa Gupta, Marcello Federico

This paper investigates the use of morphosyntactic information to reduce data sparseness in statistical machine translation from Spanish to English. In particular, word-alignment training is performed...

A Web-based Demonstrator of a Multi-lingual Phrase-based Translation System (2008)

Roldano Cattoni, Nicola Bertoldi, Mauro Cettolo, Boxing Chen, Marcello Federico

This paper describes a multi-lingual phrase-based Statistical Machine Translation system accessible by means of a Web page. The user can issue translation requests from Arabic, Chinese or Spanish...

Accessing the Spoken Word. International Journal on Digital Libraries (2008)

Jerry Goldman, Steve Renals, Steven Bird, Franciska De Jong, Marcello Federico, Carl Fleischhauer, ...

Spoken word audio collections cover many domains, including radio and television broadcasts, oral narratives, governmental...

Robust and Reliable Speech Understanding in Restricted Domains (2007)

Giuliano Antoniol, Mauro Cettolo, Marcello Federico

This paper describes the components of an Automatic Speech Understanding (ASU) system developed at IRST within the framework of the MAIA...

An Optimum Classifier Approximation for Network-Based Handwritten Character Recognition (2007)

Marcello Federico, Stefano Messelodi, Luigi Stringa

An approximation of the Bayes decision rule and its implementation on a two-layered network are described. The net is trained in two phases: first, probabilities of the discrete-valued input features...

International Guidelines for Museum Object Information: The CIDOC Information Categories. http://www.cidoc.icom.org/guide (2007)

Marcello Federico, Nicola Bertoldi, Vanessa Sandrini

This paper presents the development of a Named Entity (NE) recognition system for the Italian broadcast news domain. A statistical model is introduced based on a trigram language model defined on...

Techniques for Approximating a Trigram Language Model (2007)

Fabio Brugnara, Marcello Federico

In this paper several methods are proposed for reducing the size of a trigram language model (LM), which is often the biggest data structure in a continuous speech recognizer, without affecting...

Usability Field-Test of a Spoken Data-Entry System (2007)

Marcello Federico, Fabio Brugnara, Roberto Gretter

This paper reports on the field-test of a speech-based data-entry system developed as a follow-up of an EC-funded project. The application domain is the data-entry of personnel absence records from a...

Robust Analysis Of Spoken Input Combining Statistical And (2007)

Roldano Cattoni, Marcello Federico, Alon Lavie

The work presented in this paper concerns the analysis of automatic transcription of spoken input into an interlingua formalism for a speech-to-speech machine translation system. This process is...

Moses: Open source toolkit for statistical machine translation (2007)

Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Richard Zens, Alexandra Constantin, ...

We describe an open-source toolkit for statistical machine translation whose novel contributions are (a) support for linguistically motivated factors, (b) confusion network decoding, and (c)...

Improving statistical word alignments with morpho-syntactic transformations (2006)

Adrià De Gispert, Deepa Gupta, Maja Popović, Patrik Lambert, Jose B. Mariño, Marcello Federico, ...

This paper presents a wide range of statistical word alignment experiments incorporating morphosyntactic information. By means of parallel corpus transformations according to information of...

Improving phrase-based statistical translation through combination of word alignment (2006)

Boxing Chen, Marcello Federico

This paper investigates the combination of word-alignments computed with the competitive linking algorithm and well-established IBM models. New training methods for phrase-based statistical...

Morpho-syntactic information for automatic error analysis of statistical machine translation output (2006)

Maja Popović, Hermann Ney, Adrià De Gispert, José B. Mariño, Deepa Gupta, Marcello Federico, ...

Evaluation of machine translation output is an important but difficult task. Over the last years, a variety of automatic evaluation measures have been studied, some of them like Word Error Rate...

Morpho-syntactic Information for Automatic Error Analysis of Statistical Machine Translation Output (2006)

Maja Popović, Hermann Ney, Adrià De Gispert, José B. Mariño, Deepa Gupta, Marcello Federico, ...

Evaluation of machine translation output is an important but difficult task. Over the last years, a variety of automatic evaluation measures have been studied, some of them like Word Error Rate...

How Many Bits Are Needed To Store Probabilities for Phrase-Based Translation? (2006)

Marcello Federico, Nicola Bertoldi

State of the art in statistical machine translation is currently represented by phrase-based models, which typically incorporate a large number of probabilities of phrase-pairs and word n-grams. In...

Accessing the spoken word (2005)

Goldman, Jerry, Renals, Steve, Bird, Steven, De Jong, Franciska, Federico, Marcello, Fleischhauer, Carl, ...

Spoken word audio collections cover many domains, including radio and television broadcasts, oral narratives, governmental proceedings, lectures, and telephone conversations. The collection, access...

Transforming Access to the Spoken Word (2005)

Goldman, Jerry, Renals, Steve, Bird, Steven, De Jong, Franciska, Federico, Marcello, Fleischhauer, Carl, ...

Spoken word audio collections cover many domains, including radio and television broadcasts, oral narratives, governmental proceedings, lectures, and telephone conversations. The collection, access and...

Accessing the spoken word (2005)

Goldman, Jerry, Renals, Steve, Bird, Steven, De Jong, Franciska, Federico, Marcello, Fleischhauer, Carl, ...

Spoken-word audio collections cover many domains, including radio and television broadcasts, oral narratives, governmental proceedings, lectures, and telephone conversations. The collection, access,...

Accessing the Spoken Word (2005)

Jerry Goldman, Steve Renals, Steven Bird, Franciska De Jong, Mark Kornbluh, ...

Spoken word audio collections cover many domains, including radio and television broadcasts, oral narratives, governmental proceedings, lectures, and telephone conversations. The collection, access...

The CLEF 2003 Cross-Language Spoken Document Retrieval Track (2004)

Marcello Federico, Gareth Jones

The current expansion in collections of natural language based digital documents in various media and languages is creating challenging opportunities for automatically accessing the information...

Evaluation Frameworks for Speech Translation Technologies (2003)

Marcello Federico

This paper reports on activities carried out under the European project PF-STAR and within the CSTAR consortium, which aim at evaluating speech translation technologies. In PF-STAR, speech...

Language model adaptation through topic decomposition and MDI estimation (2002)

Marcello Federico

This work presents a language model adaptation method combining the latent semantic analysis framework with the minimum discrimination information estimation criterion. In particular, an unsupervised...

ITC-irst at CLEF 2001: Monolingual and bilingual tracks (2002)

Nicola Bertoldi, Marcello Federico

This paper reports on the participation of ITC-irst in the Cross Language Evaluation Forum (CLEF) of 2001. ITC-irst took part in two tracks: the monolingual retrieval task, and the...

Cross-task portability of a broadcast news speech recognition system. Speech Communication (2002)

N. Bertoldi, F. Brugnara, M. Cettolo, M. Federico, D. Giuliani

This paper reports on experiments of porting the ITC-irst Italian broadcast news recognition system to two spontaneous dialogue domains. Porting was investigated by applying state-of-the-art...

ITC-irst at CLEF 2001: Monolingual and bilingual tracks (2002)

Nicola Bertoldi, Marcello Federico

This paper reports on the participation of ITC-irst in the Cross Language Evaluation Forum (CLEF) of 2001. ITC-irst took part in two tracks: the monolingual retrieval task, and the bilingual...

Broadcast News LM Adaptation using Contemporary Texts (2001)

Marcello Federico, Nicola Bertoldi

This paper investigates the problem of dynamically updating the language model (LM) of a broadcast news speech recognition system, in order to cope with language and topic changes, typical of the...

ITC-irst at CLEF 2000: Italian monolingual track (2001)

Nicola Bertoldi, Marcello Federico

This paper presents work on document retrieval for Italian carried out at ITC-irst. Two different approaches to information retrieval were investigated, one based on the Okapi weighting...

Unsupervised Language and Acoustic Model Adaptation for Cross Domain Portability (2001)

Diego Giuliani, Marcello Federico

This work investigates the task of porting a broadcast news recognition system to a conversational speech domain, for which only untranscribed acoustic data are available. An iterative adaptation...

Robust Analysis Of Spoken Input Combining Statistical And (2001)

Roldano Cattoni, Marcello Federico

The work presented in this paper concerns the analysis of automatic transcription of spoken input into an interlingua formalism for a speech-to-speech machine translation system. This process is...

Development and Evaluation of an Italian Broadcast News Corpus (2000)

Marcello Federico, Dimitri Giordani, Paolo Coletti

This paper reports on the development and evaluation of an Italian broadcast news corpus at ITC-irst, under a contract with the European Language Resources Distribution Agency (ELDA). The corpus...

A System for the Retrieval of Italian Broadcast News (2000)

Marcello Federico

This paper presents a prototype for the retrieval of Italian broadcast news, which has been developed at ITC-irst. The architecture employs a speech recognition engine for the automatic transcription...

Model selection criteria for acoustic segmentation (2000)

Mauro Cettolo, Marcello Federico

Robust acoustic segmentation has become a critical issue in order to apply speech recognition to audio streams with variable acoustic content, e.g. radio programs. Many techniques in the literature...

Italian text retrieval for CLEF 2000 at ITC-irst (2000)

Nicola Bertoldi, Marcello Federico

This paper presents work on document retrieval for Italian carried out at ITC-irst. Two different approaches to information retrieval were investigated, one based on the Okapi weighting formula and...

Usability Evaluation of a Spoken Data-Entry Interface, ITC-irst Centro per la Ricerca Scientifica e Tecnologica (1999)

Marcello Federico

This work presents a usability evaluation performed during the field-test of a speech-based data-entry system. The application domain is the data-entry of personnel absence records from a huge...

Efficient Language Model Adaptation through MDI Estimation (1999)

Marcello Federico

This paper presents a method for n-gram language model adaptation based on the principle of minimum discrimination information. A background language model is adapted to fit constraints on its...

A two-stage speech recognition method for information retrieval applications (1999)

Paolo Coletti, Marcello Federico

This paper presents a two-stage approach to speech recognition that is suited for information retrieval tasks, e.g. accessing a large telephone directory. The first stage performs a Viterbi beam...

Bayesian Estimation Methods for N-Gram Language Model Adaptation (1996)

Marcello Federico

Stochastic n-gram language models have been successfully applied in continuous speech recognition for several years. Such language models provide many computational advantages but also require huge...

Language Modeling for Efficient Beam-Search (1995)

Marcello Federico, Mauro Cettolo, Fabio Brugnara, Giuliano Antoniol

This paper considers the problems of estimating bigram language models and of efficiently representing them by a finite state network, which can be employed by a hidden Markov model based,...

Language Model Representations For Beam-Search Decoding (1995)

Giuliano Antoniol, Fabio Brugnara, Mauro Cettolo, Marcello Federico

This paper presents an efficient way of representing a bigram language model for a beam-search based, continuous speech, large vocabulary HMM recognizer. The tree-based topology considered takes...

Language Model Estimations And Representations For Real-Time Continuous Speech Recognition (1994)

Giuliano Antoniol, Fabio Brugnara, Mauro Cettolo, Marcello Federico

This paper compares different ways of estimating bigram language models and of representing them in a finite state network used by a beam-search based, continuous speech, and speaker independent HMM...

Language Model Estimations and Representations for Real-time Continuous Speech Recognition (1994)

Giuliano Antoniol, Fabio Brugnara, Mauro Cettolo, Marcello Federico

This paper compares different ways of estimating bigram language models and of representing them in a finite state network used by a beam-search based, continuous speech, and speaker independent HMM...

Radiological Reporting by Speech Recognition: The A.Re.S. System (1994)

Bianca Angelini, Giuliano Antoniol, Fabio Brugnara, Mauro Cettolo, Marcello Federico, Roberto Fiutem, ...

Radiological reporting has already been identified as a field in which voice technologies can prove to be very useful. Recent progress in automatic speech recognition and in hardware and software...

Techniques For Robust Recognition In Restricted Domains (1993)

Giuliano Antoniol, Mauro Cettolo, Marcello Federico

This paper describes an Automatic Speech Understanding (ASU) system used in a human-robot interface for the remote control of a mobile robot. The intended application is that of an operator issuing...

Robust Speech Understanding for Robot Telecontrol (1993)

Giuliano Antoniol, Roldano Cattoni, Mauro Cettolo, Marcello Federico

This paper describes an Automatic Speech Understanding (ASU) system used in a human-robot interface for the remote control of a mobile robot. The intended application is that of an operator issuing...

Language Models Comparison in a Robot Telecontrol Application (1993)

Giuliano Antoniol, Fabio Brugnara, Mauro Cettolo, Marcello Federico

Stochastic Language Models (LMs) are key for achieving good performance in speech recognition systems. This is confirmed by the numerous LMs that have been proposed recently in the literature. This...