Efficient Inference in Phylogenetic InDel Trees (2009)
Alexandre Bouchard-Côté, Michael I. Jordan, Dan Klein
Accurate and efficient inference in evolutionary trees is a central problem in computational biology. While classical treatments have made unrealistic site independence assumptions, ignoring...
Fully Distributed EM for Very Large Datasets (2009)
Jason Wolfe, Aria Haghighi, Dan Klein
In EM and related algorithms, E-step computations distribute easily, because data items are independent given parameters. For very large data sets, however, even storing all of the parameters in a...
Coarse-to-Fine Syntactic Machine Translation using Language Projections (2009)
Slav Petrov, Aria Haghighi, Dan Klein
The intersection of tree transducer-based translation models with n-gram language models results in huge dynamic programs for machine translation decoding. We propose a multipass, coarse-to-fine...
Mixture-of-Parents Maximum Entropy Markov Models (2009)
David S. Rosenberg, Dan Klein, Ben Taskar
We present the mixture-of-parents maximum entropy Markov model (MoP-MEMM), a class of directed graphical models extending MEMMs. The MoP-MEMM allows tractable incorporation of long-range dependencies...
Parsing German with Latent Variable Grammars (2009)
We describe experiments on learning latent variable grammars for various German treebanks, using a language-agnostic statistical approach. In our method, a minimal initial grammar is hierarchically...
Learning Bilingual Lexicons from Monolingual Corpora (2009)
Aria Haghighi, Percy Liang, Taylor Berg-Kirkpatrick, Dan Klein
We present a method for learning bilingual translation lexicons from monolingual corpora. Word types in each language are characterized by purely monolingual features, such as context counts and...
Analyzing the Errors of Unsupervised Learning (2009)
We identify four types of errors that unsupervised induction systems make and study each one in turn. Our contributions include (1) using a meta-model to analyze the incorrect biases of a model in a...
The Complexity of Phrase Alignment Problems (2009)
Many phrase alignment models operate over the combinatorial space of bijective phrase alignments. We prove that finding an optimal alignment in this space is NP-hard, while computing alignment...
Efficient Sentence Segmentation Using Syntactic Features (2009)
Benoit Favre, Dilek Hakkani-Tür, Slav Petrov, Dan Klein
To enable downstream language processing, automatic speech recognition output must be segmented into its individual sentences. Previous sentence segmentation systems have typically been very local,...
Sparse Multi-Scale Grammars for Discriminative Latent Variable Parsing (2009)
We present a discriminative, latent variable approach to syntactic parsing in which rules exist at multiple scales of refinement. The model is formally a latent variable CRF grammar over trees,...
Percy Liang, Dan Klein, Michael I. Jordan
The learning of probabilistic models with many hidden variables and nondecomposable dependencies is an important and challenging problem. In contrast to traditional approaches based on approximate...
Non-Local Modeling with a Mixture of PCFGs (2009)
Slav Petrov, Leon Barrett, Dan Klein
While most work on parsing with PCFGs has focused on local correlations between tree configurations, we attempt to model non-local correlations using a finite mixture of PCFGs. A mixture grammar fit...
coast on Sunday packing 135 mph winds and torrential rain and causing panic in Cancun, where frightened tourists squeezed into musty shelters.
Evaluating Strategies for Similarity Search on the Web (2008)
Taher H. Haveliwala, Aristides Gionis, Dan Klein
• How do we know what nodes go in the tree?
Approximate Factoring for A* Search (2008)
Aria Haghighi, John DeNero, Dan Klein
We present a novel method for creating A* estimates for structured search problems. In our approach, we project a complex model onto multiple simpler models for which exact inference is efficient....
Dan Klein, UC Berkeley
• Sign up with me • You’ve got 10-20 minutes, one slot per group • Tell us: • The problem: why do we care? • Your concrete task: input, output, evaluation • A simple baseline for the...
• You’ve got 6-8 minutes! • Tell us: • The problem: why do we care? • Your concrete task: input, output, evaluation • A simple baseline for the task • Your method (half the time here)...
Phrase Structure Parsing (2008)
Dan Klein, UC Berkeley
WSD? • Remember when we discussed WSD?
Truth-Conditional Semantics • Proper names: • Refer directly to some entity in the world (2008)
• Some notions worth knowing:
Non-Local Modeling with a Mixture of PCFGs (2008)
Slav Petrov, Leon Barrett, Dan Klein
While most work on parsing with PCFGs has focused on local correlations between tree configurations, we attempt to model non-local correlations using a finite mixture of PCFGs. A mixture grammar fit...
• Phrase structure parsing organizes syntax into constituents or brackets • In general, this involves nested trees • Linguists can, and do, argue about details • Lots of ambiguity • Not the...
Dan Klein, UC Berkeley
• Phrase structure parsing organizes syntax into constituents or brackets • In general, this involves nested trees • Linguists can, and do, argue about details • Lots of ambiguity • Not the...
• Syntactic language and translation models Hypothesis Lattices WSD? (2008)
Dan Klein, UC Berkeley
• Remember when we discussed WSD? • Word-based MT systems rarely have a WSD step • Why not? Phrase Structure Parsing • Phrase structure parsing organizes syntax into constituents or brackets...
Kinds of Reference • Referring expressions (2008)
Dan Klein, UC Berkeley
• Sign up with me • You’ve got 10-20 minutes, one slot per group • Tell us: • The problem: why do we care? • Your concrete task: input, output, evaluation • A simple baseline for the...
Dan Klein, Christopher D. Manning
We present a novel generative model for natural language tree structures in which semantic (lexical dependency) and syntactic (PCFG) structures are scored with separate models. This factorization...
Dan Klein, Christopher D. Manning
We present a generative model for the unsupervised learning of dependency structures. We also describe the multiplicative combination of this dependency model with a model of linear constituency. The...
Learning Structured Models for Phone Recognition (2008)
Slav Petrov, Adam Pauls, Dan Klein
We present a maximally streamlined approach to learning HMM-based acoustic models for automatic speech recognition. In our approach, an initial monophone HMM is iteratively refined using a...
CHAMP (camera, handlens, and microscope probe) (2008)
Mungas, Greg S., Boynton, John E., Balzer, Mark A., Beegle, Luther, Sobel, Harold R., Fisher, Ted, ...
CHAMP (Camera, Handlens And Microscope Probe) is a novel field microscope capable of color imaging with continuously variable spatial resolution from infinity imaging down to diffraction-limited...
Discriminative log-linear grammars with latent variables (2008)
We demonstrate that log-linear grammars with latent variables can be practically trained using discriminative methods. Central to efficient discriminative training is a hierarchical pruning procedure...
Structure compilation: trading structure for features (2008)
Percy Liang, Hal Daumé III, Dan Klein
Structured models often achieve excellent performance but can be slow at test time. We investigate structure compilation, where we replace structure with features, which are often computationally...
Taher H. Haveliwala, Aristides Gionis, Dan Klein, Piotr Indyk
Finding pages on the Web that are similar to a query page (Related Pages) is an important component of modern search engines. A variety of strategies have been proposed for answering Related Pages...
Including Chris Manning, Dan Klein, Eric Gaussier, Nicola Cancedda, Franck Thollard, Alexander Simon Clark, ...
I hereby declare that this thesis has not been submitted, either in the same or different form, to this or any other university for a degree. Signature: Acknowledgements First, I would like to thank...
Dan Klein, Christopher D. Manning
) methods for parsing probabilistic context-free grammars (PCFGs) are well known, a tabular parsing framework for arbitrary PCFGs which allows for bottom-up, top-down, and other parsing strategies,...
Combining Heterogeneous Classifiers (2007)
H. Tolga Ilhan, Sepandar D. Kamvar, Dan Klein, Christopher D. Manning, Kristina Toutanova
The Stanford-CS224N system is an ensemble of simple classifiers. The first-tier systems are heterogeneous, consisting primarily of naive-Bayes variants, but also including vector space, memory-based,...
Dan Klein, Joseph Smarr, Christopher D. Manning
We discuss two named-entity recognition models which use characters and character n-grams either exclusively or as an important part of their data representation. The first model is a character-level...
Fully Distributed EM for Very Large Datasets (2007)
Jason Wolfe, Aria Haghighi, Dan Klein
Percy Liang, Dan Klein
Recent interest in Bayesian nonparametric methods. Probabilistic modeling is a core technique for many NLP tasks such as the ones listed. In recent years, there has been increased interest in applying...
The infinite PCFG using hierarchical Dirichlet processes (2007)
Percy Liang, Slav Petrov, Michael I. Jordan, Dan Klein
We present a nonparametric Bayesian model of tree structures based on the hierarchical Dirichlet process (HDP). Our HDP-PCFG model allows the complexity of the grammar to grow as more training data...
Detecting categories in news video using acoustic, speech and image features (2007)
Slav Petrov, Arlo Faria, Pascal Michaillat, Er Berg, Andreas Stolcke, Dan Klein, ...
This work describes systems for detecting semantic categories present in news video. The multimedia data was processed in three ways: the audio signal was converted to a sequence of acoustic...
Improved inference for unlexicalized parsing (2007)
We present several improvements to unlexicalized parsing with hierarchically state-split PCFGs. First, we present a novel coarse-to-fine method in which a grammar’s own hierarchical projections are...
Word alignment via quadratic assignment (2006)
Simon Lacoste-Julien, Dan Klein
Recently, discriminative word alignment methods have achieved state-of-the-art accuracies by extending the range of information sources that can be easily incorporated into aligners. The chief...
Why generative phrase models underperform surface heuristics (2006)
John DeNero, Dan Gillick, James Zhang, Dan Klein
We investigate why weights from generative models underperform heuristic estimates in phrase-based machine translation. We first propose a simple generative, phrase-based model and verify that its...
An end-to-end discriminative approach to machine translation (2006)
Percy Liang, Alexandre Bouchard-Côté, Dan Klein, Ben Taskar
We present a perceptron-style discriminative approach to machine translation in which large feature sets can be exploited. Unlike discriminative reranking approaches, our system can take advantage of...
Proceedings of the Workshop on Statistical Machine Translation, pages 31--38, New York City (2006)
John DeNero, Dan Gillick, James Zhang, Dan Klein
We investigate why weights from generative models underperform heuristic estimates in phrase-based machine translation. We first propose a simple generative, phrase-based model and verify that its...
Learning Accurate, Compact, and Interpretable Tree Annotation (2006)
Slav Petrov, Leon Barrett, Romain Thibaux, Dan Klein
We present an automatic approach to tree annotation in which basic nonterminal symbols are alternately split and merged to maximize the likelihood of a training treebank. Starting with a simple X-bar...
Word Alignment via Quadratic Assignment (2006)
Lacoste-Julien, Simon, Taskar, Ben, Klein, Dan, Jordan, Michael I.
Recently, discriminative word alignment methods have achieved state-of-the-art accuracies by extending the range of information sources that can be easily incorporated into aligners. The chief...
Prototype-driven grammar induction (2006)
We investigate prototype-driven learning for primarily unsupervised grammar induction. Prior knowledge is specified declaratively, by providing a few canonical examples of each target phrase type....
A Discriminative Matching Approach to Word Alignment (2005)
Taskar, Ben, Lacoste-Julien, Simon, Klein, Dan
We present a discriminative, large-margin approach to feature-based matching for word alignment. In this framework, pairs of word tokens receive a matching score, which is based on features of that...
A discriminative matching approach to word alignment (2005)
Ben Taskar, Simon Lacoste-Julien, Dan Klein
We present a discriminative, large-margin approach to feature-based matching for word alignment. In this framework, pairs of word tokens receive a matching score, which is based on features of that...
A Core-Tools Statistical NLP Course (2005)
In the fall term of 2004, I taught a new statistical NLP course focusing on core tools and machine-learning algorithms.
Transfer of Grammatical Structure (2005)
Slav Petrov, Leon Barrett, Romain Thibaux, Dan Klein
Recent research has demonstrated that PCFGs with latent annotations are an effective way to provide automated increases in parsing accuracy. We feel that they have more potential than the literature...
The unsupervised learning of natural language structure (2005)
Klein, Dan; Manning, Christopher D. (advisor)
Submitted to the Department of Computer Science.
Corpus-based induction of syntactic structure: Models of dependency and constituency (2004)
We present a generative model for the unsupervised learning of dependency structures. We also describe the multiplicative combination of this dependency model with a model of linear constituency. The...
Analyzing an Italian Treebank with State-of-the-Art Statistical Parsers (2004)
Thomas M. Cover, Alberto Lavelli, Giorgio Satta, Roberto Zanoli, Wiley Series, Telecommunications John Wiley, ...
In this paper we report work in progress on the application of state-of-the-art statistical parsing techniques to Italian. Our approach partially differs from previous efforts on other languages because...
Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network (2003)
Kristina Toutanova, Dan Klein, Christopher D. Manning, Yoram Singer
We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of...
Accurate Unlexicalized Parsing (2003)
Dan Klein, Christopher D. Manning
We demonstrate that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down false independence...
Factored A* search for models over sequences and trees (2003)
We investigate the calculation of A* bounds for sequence and tree models which are the explicit intersection of a set of simpler models or can be bounded by such an intersection. We provide a...
Sepandar D. Kamvar, Dan Klein, Christopher D. Manning
We present a simple, easily implemented spectral learning algorithm which applies equally whether we have no supervisory information, pairwise link constraints, or labeled examples. In the...
Named entity recognition with character-level models (2003)
Dan Klein, Joseph Smarr, Huy Nguyen, Christopher D. Manning
We discuss two named-entity recognition models which use characters and character n-grams either exclusively or as an important part of their data representation. The first model is a character-level...
Sepandar D. Kamvar, Dan Klein, Christopher D. Manning
We present a simple, easily implemented spectral learning algorithm that applies equally whether we have no supervisory information, pairwise link constraints, or labeled examples. In the...
A* parsing: Fast exact Viterbi parse selection (2003)
We present an extension of the classic A* search procedure to tabular PCFG parsing. The use of A* search can dramatically reduce the time required to find a best parse by conservatively estimating...
Computing PageRank using power extrapolation (2003)
Taher Haveliwala, Sepandar Kamvar, Dan Klein, Chris Manning, Gene Golub
We present a novel technique for speeding up the computation of PageRank, a hyperlink-based estimate of the "importance" of Web pages, based on the ideas presented in...
Accurate Unlexicalized Parsing (2003)
Dan Klein, Christopher D. Manning
We demonstrate that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down false independence...
A* Parsing: Fast Exact Viterbi Parse Selection (2003)
Dan Klein, Christopher D. Manning
We present an extension of the classic A* search procedure to tabular PCFG parsing. The use of A* search can dramatically reduce the time required to find a best parse by conservatively estimating...
Named Entity Recognition with Character-Level Models (2003)
Dan Klein, Joseph Smarr, Huy Nguyen, Christopher D. Manning
We discuss two named-entity recognition models which use characters and character n-grams either exclusively or as an important part of their data representation. The first model is a character-level...
Fast Exact Inference with a Factored Model for Natural Language Parsing (2003)
Dan Klein, Christopher D. Manning
We present a novel generative model for natural language tree structures in which semantic (lexical dependency) and syntactic (PCFG) structures are scored with separate models. This factorization...
Computing PageRank using power extrapolation (2003)
Taher Haveliwala, Sepandar Kamvar, Dan Klein, Chris Manning, Gene Golub
We present a novel technique for speeding up the computation of PageRank, a hyperlink-based estimate of the “importance” of Web pages, based on the ideas presented in [7]. The original...
Factored A* Search for Models over Sequences and Trees (2003)
Dan Klein, Christopher D. Manning
We investigate the calculation of A* bounds for sequence and tree models which are the explicit intersection of a set of simpler models or can be bounded by such an intersection. We provide a natural...
Evaluating Strategies for Similarity Search on the Web (2002)
Haveliwala, Taher H., Gionis, Aristides, Klein, Dan, Indyk, Piotr
Finding pages on the Web that are similar to a query page (Related Pages) is an important component of modern search engines. A variety of strategies have been proposed for answering Related Pages...
Combining heterogeneous classifiers for word-sense disambiguation (2002)
Dan Klein, Kristina Toutanova, H. Tolga Ilhan, Sepandar D. Kamvar, Christopher D. Manning
This paper discusses ensembles of simple but heterogeneous classifiers for word-sense disambiguation, examining the Stanford-CS224N system entered in the SENSEVAL-2 English lexical sample task....
Sepandar D. Kamvar, Dan Klein, Christopher D. Manning
We present two results which arise from a model-based approach to hierarchical agglomerative clustering. First, we show formally that the common heuristic agglomerative clustering algorithms--...
Dan Klein, Sepandar D. Kamvar, Christopher D. Manning
We present an improved method for clustering in the presence of very limited supervisory information, given as pairwise instance constraints. By allowing instance-level constraints to have space-level...
Conditional structure versus conditional estimation in NLP models (2002)
Dan Klein, Christopher D. Manning
This paper separates conditional parameter estimation, which consistently raises test set accuracy on statistical NLP tasks, from conditional model structures, such as the conditional Markov model...
Natural language grammar induction using a constituent-context model (2002)
Dan Klein, Christopher D. Manning
This paper presents a novel approach to the unsupervised learning of syntactic analyses of natural language text. Most previous work has focused on maximizing likelihood according to generative PCFG...
Interpreting and Extending Classical Agglomerative Clustering Algorithms (2002)
Sepandar D. Kamvar, Dan Klein, Christopher D. Manning
We present two results which arise from a model-based approach to hierarchical agglomerative clustering. First, we show formally that the common heuristic agglomerative clustering algorithms --...
Sepandar D. Kamvar, Dan Klein, Christopher D. Manning
We present two results which arise from a model-based approach to hierarchical agglomerative clustering. First, we show formally that the common heuristic agglomerative clustering algorithms –...
A Generative Constituent-Context Model for Improved Grammar Induction (2002)
Dan Klein, Christopher D. Manning
We present a generative distributional model for the unsupervised induction of natural language syntax which explicitly models constituent yields and contexts. Parameter
Evaluating Strategies for Similarity Search on the Web (2002)
Taher H. Haveliwala, Aristides Gionis, Dan Klein, Piotr Indyk
Finding pages on the Web that are similar to a query page (Related Pages) is an important component of modern search engines. A variety of strategies have been proposed for answering Related Pages...
Artists in Glass: Late Twentieth Century Masters in Glass (2001)
D. Klein
A book of short essays on the work of around eighty artists who use glass as their raw material. Includes images of each artist's work. The text...
Distributional phrase structure induction (2001)
Dan Klein, Christopher D. Manning
Unsupervised grammar induction systems commonly judge potential constituents on the basis of their effects on the likelihood of the data. Linguistic justifications of constituency, on the other hand,...
Dan Klein, Christopher D. Manning
This paper presents empirical studies and closely corresponding theoretical models of the performance of a chart parser exhaustively parsing the Penn Treebank with the Treebank's own CFG...
Parsing and hypergraphs (2001)
Dan Klein, Christopher D. Manning
While symbolic parsers can be viewed as deduction systems, this view is less natural for probabilistic parsers. We present a view of parsing as directed hypergraph analysis which naturally covers...
Candidate Model Problems in Software Architecture (1994)
Mary Shaw, David Garlan, Robert Allen, Dan Klein, John Ockerbloom, Curtis Scott, ...
data types. The second solution decomposes the system into a similar set of five modules. However, in this case data is no longer directly shared by the computational components. Instead, each module...
Yang Huang, Henry J. Lowe, Dan Klein, Russell J. Cucina
Objective: The aim of this study was to develop and evaluate a method of extracting noun phrases with full phrase structures from a set of clinical radiology reports using natural language processing...
Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network
Kristina Toutanova, Dan Klein, Christopher D. Manning, Yoram Singer
We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of...
Factors affecting secondary share offerings in the IPO process
We investigate whether the sale of secondary shares in the IPO process is affected by an issuing firm's market-timing and window-dressing activities. We find that secondary share offerings in IPOs...