Automatic Code Assignment to Medical Text (2009)
Koby Crammer, Mark Dredze, Kuzman Ganchev, Partha Pratim Talukdar, Steven Carroll
Code assignment is important for handling large amounts of electronic medical data in the modern hospital. However, only expert annotators with extensive training can assign codes. We present a...
Regularized Learning with Networks of Features (2009)
Ted Sandler, Partha Pratim Talukdar, Lyle H. Ungar, John Blitzer
For many supervised learning problems, we possess prior knowledge about which features yield similar information about the target variable. In predicting the topic of a document, we might know that...
DRASO: Declaratively Regularized Alternating Structural Optimization (2009)
Partha Pratim Talukdar, Ted Sandler, Mark Dredze, Koby Crammer, John Blitzer, Fernando Pereira
Recent work has shown that Alternating Structural Optimization (ASO) can improve supervised learners by learning feature representations from unlabeled data. However, there is no natural way to...
A Rate-Distortion One-Class Model and its Applications to Clustering (2009)
Koby Crammer, Partha Pratim Talukdar
In one-class classification we seek a rule to find a coherent subset of instances similar to a few positive examples in a large pool of instances. The problem can be formulated and analyzed naturally...
A. G. Ramakrishnan, Partha Pratim Talukdar
This paper addresses the problem of Hindi compound word splitting and its relevance to developing a good quality phonetizer for Hindi Speech Synthesis. The constituents of a Hindi compound word are...
Lightly-Supervised Attribute Extraction for Web Search (2008)
Kedar Bellare, Partha Pratim Talukdar, Giridhar Kumaran, Fernando Pereira, Mark Liberman, Andrew McCallum, ...
Web search engines can greatly benefit from knowledge about attributes of entities present in search queries. In this paper, we introduce lightly-supervised methods for extracting entity attributes...
Duration Modeling for Hindi Text-to-Speech Synthesis System (2008)
N. Sridhar Krishna, Partha Pratim Talukdar, Kalika Bali
This paper reports preliminary results of data-driven modeling of segmental (phoneme) duration for Hindi. Classification and Regression Tree (CART) based data-driven duration modeling for segmental...
Automatic Code Assignment to Medical Text (2008)
Koby Crammer, Mark Dredze, Kuzman Ganchev, Partha Pratim Talukdar, Steven Carroll
Code assignment is important for handling large amounts of electronic medical data in the modern hospital. However, only expert annotators with extensive training can assign codes. We present a...
Hindi Text Normalization (2008)
K. Panchapagesan, Partha Pratim Talukdar, N. Sridhar Krishna
All areas of language and speech technology, directly or indirectly, require handling of real (unrestricted) text. For example, Text-to-Speech systems directly need to work on real text, whereas...
Learning to Create Data-Integrating Queries (2008)
Partha Pratim Talukdar, Marie Jacob, Muhammad Salman Mehmood, Koby Crammer, Zachary G. Ives, Fernando Pereira, ...
The number of potentially-related data resources available for querying — databases, data warehouses, virtual integrated schemas — continues to grow rapidly. Perhaps no area has seen this problem...
Frustratingly hard domain adaptation for dependency parsing (2007)
Mark Dredze, John Blitzer, Partha Pratim Talukdar, Kuzman Ganchev, João V. Graça, Fernando Pereira
We describe some challenges of adaptation in the 2007 CoNLL Shared Task on Domain Adaptation. Our error analysis for this task suggests that a primary source of error is differences in annotation...
A Context Pattern Induction Method for Named Entity Extraction (2006)
Partha Pratim Talukdar, Thorsten Brants
We present a novel context pattern induction method for information extraction, specifically named entity extraction. Using this method, we extended several classes of seed entity lists into much...
Tools For The Development Of A Hindi Speech Synthesis System (2004)
Kalika Bali, A.G. Ramakrishnan, Partha Pratim Talukdar, N. Sridhar Krishna
We describe in detail a Grapheme-to-Phoneme (G2P) converter required for the development of a good quality Hindi Text-to-Speech (TTS) system. The Festival framework is chosen for developing the Hindi...
Professional Experience (1999)
Partha Pratim Talukdar, Advisors Prof, Mark Liberman, Prof Fern, O Pereira, Best Poster Award, ...
I am primarily interested in Machine Learning and Computational Linguistics. My recent research