John Blitzer

Publication List Details

Period

2001 - 2009

Number

40

Zero-Shot Domain Adaptation: A Multi-View Approach (2009)

John Blitzer, Dean P. Foster, Sham M. Kakade

Domain adaptation algorithms attempt to address situations where our training (source) data distribution and test (target) data distribution differ, potentially by a substantial amount. For example,...

Regularized Learning with Networks of Features (2009)

Ted Sandler, Partha Pratim Talukdar, Lyle H. Ungar, John Blitzer

For many supervised learning problems, we possess prior knowledge about which features yield similar information about the target variable. In predicting the topic of a document, we might know that...

DRASO: Declaratively Regularized Alternating Structural Optimization (2009)

Partha Pratim Talukdar, Ted Sandler, Mark Dredze, Koby Crammer, John Blitzer, Fernando Pereira

Recent work has shown that Alternating Structural Optimization (ASO) can improve supervised learners by learning feature representations from unlabeled data. However, there is no natural way to...

Feature Design for Transfer Learning (2008)

Mark Dredze, John Blitzer, Koby Crammer, Fernando Pereira

Discriminative learning methods for classification perform well when training and test data are drawn from the same distribution and labeled using the same function. However, often we have labeled...

Intelligent Email: Reply and Attachment Prediction (2008)

Mark Dredze, Tova Brooks, Josh Carroll, Joshua Magarick, John Blitzer, Fernando Pereira

We present two prediction problems under the rubric of Intelligent Email that are designed to support enhanced email interfaces that relieve the stress of email overload. Reply prediction alerts...

Batch Performance for an Online Price (2008)

Koby Crammer, Mark Dredze, John Blitzer, Fernando Pereira

Batch learning techniques achieve good performance, but at the cost of many (sometimes even hundreds of) passes over the data. For many tasks, such as web-scale ranking of machine translation...

Statistical LTAG Parsing (2008)

Libin Shen, Aravind K. Joshi, Rajeev Alur, John Blitzer, Jinying Chen, ...

First and foremost, I would like to thank my advisor Aravind Joshi for his continuous support and guidance in both academic and daily life ever since my first day at Penn. Many thanks to my...

Multi-View Learning over Structured and Non-Identical Outputs (2008)

Kuzman Ganchev, João V. Graça, John Blitzer, Ben Taskar

In many machine learning problems, labeled training data is limited but unlabeled data is ample. Some of these problems have instances that can be factored into multiple views, each of which is...

Domain adaptation of natural language processing systems (2008)

John Blitzer

Statistical language processing models are being applied to an ever wider and more varied range of linguistic domains. Collecting and curating training sets for each different domain is prohibitively...

Learning bounds for domain adaptation (2008)

John Blitzer, Koby Crammer, Alex Kulesza, Fernando Pereira, Jennifer Wortman

Empirical risk minimization offers well-known learning guarantees when training and test data come from the same domain. In the real world, though, we often wish to adapt a classifier from a source...

Frustratingly hard domain adaptation for dependency parsing (2007)

Mark Dredze, John Blitzer, Partha Pratim Talukdar, Kuzman Ganchev, João V. Graça, Fernando Pereira

We describe some challenges of adaptation in the 2007 CoNLL Shared Task on Domain Adaptation. Our error analysis for this task suggests that a primary source of error is differences in annotation...

Biographies, Bollywood, boomboxes and blenders: Domain adaptation for sentiment classification (2007)

John Blitzer, Mark Dredze, Fernando Pereira

Automatic sentiment classification has been extensively studied and applied in recent years. However, sentiment is expressed differently in different domains, and annotating corpora for every...

Analysis of representations for domain adaptation (2007)

Shai Ben-David, John Blitzer, Koby Crammer (presented by Marina Sokolova)

Domain is a distribution D on an instance set X. Domain adaptation of a classifier. A classification task. Source domain (DS)...

Analysis of representations for domain adaptation (2007)

Shai Ben-David, John Blitzer, Koby Crammer, Fernando Pereira

Discriminative learning methods for classification perform well when training and test data are drawn from the same distribution. In many situations, though, we have labeled training data for a...

Domain adaptation with structural correspondence learning (2006)

John Blitzer, Ryan McDonald, Fernando Pereira

Discriminative learning methods are widely used in natural language processing. These methods work best when their training and test data are drawn from the same distribution. For many NLP tasks,...

Distance metric learning for large margin nearest neighbor classification (2006)

Kilian Q. Weinberger, John Blitzer, Lawrence K. Saul

We show how to learn a Mahalanobis distance metric for k-nearest neighbor (kNN) classification by semidefinite programming. The metric is trained with the goal that the k-nearest neighbors always...

Distributed Latent Variable Models of Lexical Co-occurrences (2005)

John Blitzer

Low-dimensional representations for lexical co-occurrence data have become increasingly important in alleviating the sparse data problem inherent in natural language processing tasks. This work...

Hierarchical Distributed Representations for Statistical Language Modeling (2004)

John Blitzer, Kilian Q. Weinberger, Lawrence K. Saul, Fernando C. N. Pereira

Statistical language models estimate the probability of a word occurring in a given context. The most common language models rely on a discrete enumeration of predictive contexts (e.g., n-grams) and...

MEAD - a platform for multidocument multilingual text summarization (2004)

Dragomir Radev, Timothy Allison, Sasha Blair-Goldensohn, John Blitzer, Arda Çelebi, Stanko Dimitrov, ...

This paper describes the functionality of MEAD, a comprehensive, public domain, open source, multidocument multilingual summarization environment that has been thus far downloaded by more than 500...

Evaluation Challenges in Large-Scale Document Summarization (2003)

Dragomir R. Radev, Wai Lam, Arda Çelebi, Simone Teufel, John Blitzer, Danyu Liu, ...

We present a large-scale meta evaluation of eight evaluation measures for both single-document and multi-document summarizers. To this end we built a corpus consisting of (a) 100 Million automatic...

Evaluation of Text Summarization in a Cross-lingual Information Retrieval Framework (2001)

Dragomir Radev, Simone Teufel, Horacio Saggion, Wai Lam, ... (Johns Hopkins Summer Workshop)

We report on research in multi-document summarization and on evaluation of summarization in the framework of cross-lingual information retrieval. This work was carried out during a summer workshop on...