Craig Silverstein, Monika Henzinger, Hannes Marais, Michael Moricz
In this paper we present an analysis of a 280 GB AltaVista Search Engine query log consisting of approximately 1 billion entries for search requests over a period of six weeks. This represents...
Predicting Book Use for Off-Site Storage (2007)
Craig Silverstein, Stuart M. Shieber
We explore various methods for predicting library book use. Accurate prediction is invaluable when choosing titles to be stored in an off-site location. Previous researchers in this area concluded...
Challenges in web search engines (2002)
Monika R. Henzinger, Rajeev Motwani, Craig Silverstein
This article presents a high-level discussion of some problems in information retrieval that are unique to web search engines. The goal is to raise awareness and stimulate research in these areas.
Scalable techniques for mining causal structures (1998)
Craig Silverstein, Sergey Brin, Jeff Ullman, Rajeev Motwani
Mining for association rules in market basket data has proved a fruitful area of research. Measures such as conditional probability (confidence) and correlation have been used to infer rules of the...
Analysis of a very large AltaVista query log (1998)
Craig Silverstein, Monika Henzinger, Hannes Marais, Michael Moricz
In this paper we present an analysis of a 280 GB AltaVista Search Engine query log consisting of approximately 1 billion entries for search requests over a period of six weeks. This represents...
Scalable Techniques for Mining Causal Structures (1998)
Craig Silverstein, Sergey Brin, Rajeev Motwani, Jeff Ullman
Mining for association rules in market basket data has proved a fruitful area of research. Measures such as conditional probability (confidence) and correlation have been used to infer rules of the...
Scalable Techniques for Mining Causal Structures (1998)
Craig Silverstein, Sergey Brin, Rajeev Motwani, Usama Fayyad
Mining for association rules in market basket data has proved a fruitful area of research. Measures such as conditional probability (confidence) and correlation have been used to infer rules of the...
Analysis of a Very Large AltaVista Query Log (1998)
Craig Silverstein, Monika Henzinger, Hannes Marais, Michael Moricz
In this paper we present an analysis of a 280 GB AltaVista Search Engine query log consisting of approximately 1 billion entries for search requests over a period of six weeks. This represents...
Projections for efficient document clustering (1997)
Hinrich Schütze, Craig Silverstein
Clustering is increasing in importance, but linear- and even constant-time clustering algorithms are often too slow for real-time applications. A simple way to speed up clustering is to speed up the...
Beyond Market Baskets: Generalizing Association Rules to Correlations (1997)
Sergey Brin, Rajeev Motwani, Craig Silverstein
One of the most well-studied problems in data mining is mining for association rules in market basket data. Association rules, whose significance is measured via support and confidence, are intended...
Computational Evaluation of Hot Queues (1997)
Andrew V. Goldberg, Craig Silverstein
The heap-on-top (hot) priority queue data structure [6] improves on the best known times for Dijkstra's shortest path algorithm. It also has very good practical performance and is robust over a...
Beyond Market Baskets: Generalizing Association Rules to Dependence Rules (1997)
Craig Silverstein, Sergey Brin, Rajeev Motwani
One of the more well-studied problems in data mining is the search for association rules in market basket data. Association rules are intended to identify patterns of the type: "A customer...
Projections for Efficient Document Clustering (1997)
Hinrich Schütze, Craig Silverstein
Clustering is increasing in importance, but linear- and even constant-time clustering algorithms are often too slow for real-time applications. A simple way to speed up clustering is to speed up the...
Constrained TSP and Low-Power Computing (1997)
Moses Charikar, Rajeev Motwani, Prabhakar Raghavan, Craig Silverstein
In the precedence-constrained traveling salesman problem (PTSP) we are given a partial order on n nodes, each of which is labeled by one of k points in a metric space. We are to find a visit order...
Almost-Constant-Time Clustering of Arbitrary Corpus Subsets (1997)
Craig Silverstein, Jan O. Pedersen
Methods exist for constant-time clustering of corpus subsets selected via Scatter/Gather browsing [3]. In this paper we expand on those techniques, giving an algorithm for almost-constant-time...
Predicting Individual Book Use for Off-Site Storage Using Decision Trees (1996)
Craig Silverstein, Stuart M. Shieber
We explore various methods for predicting library book use, as measured by circulation records. Accurate prediction is invaluable when choosing titles to be stored in an off-site location. Previous...
Thesis (A.B., Honors in Computer Science)--Harvard University, 1994.