A Parser from Antiquity (2008)
Aravind K. Joshi, Phil Hopely, Mark Liberman, Mitch Marcus, Mehryar Mohri
B. Srinivas for their valuable comments during the preparation of this paper. We are grateful to William Schmidt for providing some information about Univac 1 and also to
Towards an Integrated Understanding of Speaking Rate (2008)
Jiahong Yuan, Mark Liberman, Christopher Cieri
We investigate factors that affect speaking rate in conversation, using large corpora of conversational telephone speech in English and Chinese. We find that speaking rate as a function of “turn...
Integrated Annotation for Biomedical Information Extraction (2008)
Seth Kulick, Ann Bies, Mark Liberman, Mark M, Scott Winters, Pete White
We describe an approach to two areas of biomedical information extraction, drug development and cancer genomics. We have developed a framework which includes corpus annotation...
This research was supported by an NDSEG Fellowship, (2008)
Mark Liberman, Mitch Marcus, Joseph Rosenzweig
and greatly appreciates the use of its resources in support of this work. He would like to thank Jason Eisner, Libby
Lightly-Supervised Attribute Extraction for Web Search (2008)
Kedar Bellare, Partha Pratim Talukdar, Giridhar Kumaran, Fernando Pereira, Mark Liberman, Andrew McCallum, ...
Web search engines can greatly benefit from knowledge about attributes of entities present in search queries. In this paper, we introduce lightly-supervised methods for extracting entity attributes...
Christopher Cieri, Mark Liberman
This presentation reports on recent progress the Linguistic Data Consortium has made in addressing the needs of multiple research communities by collecting, annotating and distributing, simplifying...
Christopher Cieri, Walt Andrews, Joseph P. Campbell, George Doddington, Jack Godfrey, Shudong Huang, ...
This paper describes the planning and creation of the Mixer and Transcript Reading corpora, their properties and yields, and reports on the lessons learned during their development. Recent speaker...
Towards an Integrated Understanding of Speaking Rate in Conversation (2008)
Jiahong Yuan, Mark Liberman, Christopher Cieri
We investigate factors that affect speaking rate in conversation, using large corpora of conversational telephone speech in English and Chinese. We find that speaking rate as a function of...
A Progress Report from the Linguistic Data Consortium: Recent Activities in Resource Creation and... (2008)
Christopher Cieri, Mark Liberman
This paper described recent activities of the Linguistic Data Consortium in the collection, annotation and distribution of language data and the development of tools and standards for using that data,...
The tonal phonology of Yoruba clitics (2007)
Akinbiyi Akinlabi, Mark Liberman
This paper examines the tonal behavior of six types of enclitics in Standard Yoruba, and shows that in all six cases, a constraint applies preventing the last syllable of the host and the adjacent...
Annotation Graphs: A Foundation for Integrating Tools, Formats and Corpora (2007)
In recent work we have presented a formal framework for linguistic annotations using labeled acyclic digraphs. These `annotation graphs' offer a simple yet powerful method for representing...
Cross-Lingual Topic Tracking using idf-Weighted Cosine Coefficient (2007)
J. Michael Schultz, Mark Liberman
We investigate a method of cross-lingual topic tracking which builds upon our cosine-coefficient-based monolingual approach. The system relies on a bilingual dictionary for translation as well as for...
Integrated Linguistic Resources for Language Exploitation Technologies (2006)
Stephanie Strassel, Christopher Cieri, Andrew Cole, Denise DiPersio, Mark Liberman, Mohamed Maamouri, ...
Linguistic Data Consortium has recently embarked on an effort to create integrated linguistic resources and related infrastructure for language exploitation technologies within the DARPA GALE (Global...
Integrated Annotation for Biomedical Information Extraction (2004)
Seth Kulick, Ann Bies, Mark Liberman, Mark M, Scott Winters, ...
We describe an approach to two areas of biomedical information extraction, drug development and cancer genomics. We have developed a framework which includes corpus annotation integrated at multiple...
A Formal Framework for Linguistic Annotation (revised version) (2000)
`Linguistic annotation' covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions - audio, video and/or physiological recordings -...
ATLAS: A flexible and extensible architecture for linguistic annotation (2000)
Steven Bird, David Day, John Garofolo, John Henderson, Christophe Laprun, Mark Liberman
We describe a formal model for annotating linguistic artifacts, from which we derive an application programming interface (API) to a suite of tools for manipulating these annotations. The abstract...
Issues in Corpus Creation and Distribution: The Evolution of the Linguistic Data Consortium (2000)
Christopher Cieri, Mark Liberman
The Linguistic Data Consortium (LDC) is a non-profit consortium of universities, companies and government research laboratories that supports education, research and technology development in...
Christopher Cieri, David Graff, Mark Liberman, Nii Martey, Stephanie Strassel
This paper describes the creation and content of two corpora, TDT-2 and TDT-3, created for the DARPA-sponsored Topic Detection and Tracking project. The research goal in the TDT program is to create the...
ATLAS: A Flexible and Extensible Architecture for Linguistic Annotation (2000)
Steven Bird, David Day, John Garofolo, John Henderson, Christophe Laprun, Mark Liberman
We describe a formal model for annotating linguistic artifacts, from which we derive an application programming interface (API) to a suite of tools for manipulating these annotations. The abstract...
ATLAS: A Flexible and Extensible Architecture for Linguistic Annotation (2000)
Steven Bird, David Day, John Garofolo, John Henderson, Christophe Laprun, Mark Liberman
We describe a formal model for annotating linguistic artifacts, from which we derive an application programming interface (API) to a suite of tools for manipulating these annotations. The abstract...
Annotation graphs as a framework for multidimensional linguistic data analysis (1999)
In recent work we have presented a formal framework for linguistic annotation based on labeled acyclic digraphs. These `annotation graphs' offer a simple yet powerful method for representing complex...
A Formal Framework for Linguistic Annotation (1999)
`Linguistic annotation' covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions -- audio, video and/or physiological recordings...
A Formal Framework for Linguistic Annotation (1999)
"Linguistic annotation" covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions - audio, video and/or physiological recordings -...
Annotation graphs as a framework for multidimensional linguistic data analysis (1999)
In recent work we have presented a formal framework for linguistic annotation based on labeled acyclic digraphs. These 'annotation graphs' offer a simple yet powerful method for...
Topic Detection and Tracking using idf-Weighted Cosine Coefficient (1999)
J. Michael Schultz, Mark Liberman
The goal of TDT (Topic Detection and Tracking) is to develop automatic methods of identifying topically related stories within a stream of news media. We describe approaches for both detection and...
The TDT-2 Text and Speech Corpus (1999)
Chris Cieri, David Graff, Mark Liberman, Nii Martey, Stephanie Strassel
This paper describes the creation and content of the TDT-2 corpus in the context of the TDT-2 research project it supports and in comparison to previous and subsequent efforts.
A formal framework for linguistic annotation (1999)
`Linguistic annotation' covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions -- audio, video and/or physiological...
Annotation Graphs as a Framework for Multidimensional Linguistic Data Analysis (1999)
In recent work we have presented a formal framework for linguistic annotation based on labeled acyclic digraphs. These `annotation graphs' offer a simple yet powerful method for representing complex...
Towards A Formal Framework For Linguistic Annotations (1999)
Steven Bird, Mark Liberman
`Linguistic annotation' is a term covering any transcription, translation or annotation of textual data or recorded linguistic signals. While there are several ongoing efforts to provide formats...
A Formal Framework for Linguistic Annotation (1999)
Steven Bird, Mark Liberman
`Linguistic annotation' covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions -- audio, video and/or physiological...
Annotation Graphs as a Framework for Multidimensional Linguistic Data Analysis (1999)
In recent work we have presented a formal framework for linguistic annotation based on labeled acyclic digraphs. These `annotation graphs' offer a simple yet powerful method for representing...
Topic Detection and Tracking using idf-Weighted Cosine Coefficient (1999)
J. Michael Schultz, Mark Liberman
The goal of TDT (Topic Detection and Tracking) is to develop automatic methods of identifying topically related stories within a stream of news media. We describe approaches for both detection and...
Professional Experience (1999)
Partha Pratim Talukdar, Advisors Prof, Mark Liberman, Prof Fern, O Pereira, Best Poster Award, ...
I am primarily interested in Machine Learning and Computational Linguistics. My recent research
Mark Liberman, Christopher Cieri
The Linguistic Data Consortium (LDC) is an open consortium of universities, companies and government research laboratories. It creates and distributes speech and text databases, lexicons and other...
Pronunciation Modeling In Speech Synthesis (1998)
Mark Liberman, George Cardona, Corey Andrew Miller
This dissertation investigates the area of pronunciation modeling in speech synthesis. By pronunciation modeling, we mean architectures and principles for generating high-quality human-like...
...many words with a low frequency and only a few with a high frequency. Example data from the Swedish Press65 corpus confirms this: the corpus contains more than 30,000 words with frequency one but there...
Maintenance Medication for Schizophrenia and Schizoaffective Patients (1995)
Vladimir Lerner, Mikhail Fotyanov, Mark Liberman, Michael Shlafman, Yair Bar-El
This study assesses the different approaches to treating patients with schizoaffective and paranoid schizophrenia in remission. Individualized treatment of 220 outpatient schizophrenia patients was...
Commentary on Kaplan and Kay (1994)
Anyone with a fundamental interest in morphology and phonology, either from a scientific or a computational perspective, will want to study this long-awaited paper carefully. Kaplan and Kay...
UNIPEN project of on-line data exchange and recognizer benchmarks (1994)
Isabelle Guyon, Lambert Schomaker, Réjean Plamondon, Mark Liberman, Stan Janet
We report the status of the UNIPEN project of data exchange and recognizer benchmarks started two years ago at the initiative of the International Association of Pattern Recognition (Technical...
The intonational system of English / by Mark Yoffe Liberman. (1975)
Microfilm of typescript. [Cambridge] : M. I. T. Libraries, 1976. 1 reel ; 35 mm.
Some observations on semantic scope / by Mark Yoffe Liberman. (1973)
Photocopy. Cambridge : Massachusetts Institute of Technology Libraries, 1977.
Some observations on semantic scope. (1973)
Massachusetts Institute of Technology. Dept. of Foreign Literatures and Linguistics. Thesis. 1973. M.S.
# Hidden Training Training Cross Testing Testing Units Epochs Error Error Errors Error (%) (1731)
Christiane Ho Mann, Mark Liberman, Martin Roscheisen, Mark Wasson, Kenneth W. Church, Mark Y. Liberman
Table 3: Results of comparing hidden layer sizes (6-context). Training was done on 573 items, using a cross validation set of 258 items. and perhaps even to other markup recognition problems, and...