M. Henzinger

Publication List Details

Period: 1996 - 2008

Number of publications: 9


Web page language identification based on URLs (2008)

Baykan, E., Henzinger, M., Weber, I.

Given only the URL of a web page, can we identify its language? This is the question that we examine in this paper. Such a language classifier is, for example, useful for crawlers of web search...

Web Information Retrieval (2007)

Henzinger, M.

...Winter 1994 USENIX Conference, pages 1-10, Berkeley, CA, USA, 1994. [M97] M. Marchiori. The quest for correct information on the web: Hyper search engines. In Proceedings of the Sixth International...

Web Conference [WWW7], pages 541-550. (2007)

Silverstein, C., Henzinger, M., Marais, J., Moricz, M.

[S73] H. Small. Co-citation in the scientific literature: A new measure of the relationship between two documents. J. Amer. Soc. Info.

Finding near-duplicate web pages: A large-scale evaluation of algorithms (2006)

Henzinger, M.

Broder et al.'s [3] shingling algorithm and Charikar's [4] random projection based approach are considered "state-of-the-art" algorithms for finding near-duplicate web pages. Both algorithms were...

The past, present, and future of web information retrieval (2004)

Henzinger, M. (edited by Barney Smith, E. H., Hu, J., Allan, J.)

In this article we describe the approach taken by the first web search engines, discuss the state of the art, and present some of the challenges for the future.

Who links to whom: mining linkage between Web sites (2001)

Bharat, K., Chang, B.-W., Henzinger, M., Ruhl, M.

Previous studies of the Web graph structure have focused on the graph structure at the level of individual pages. In actuality the Web is a hierarchically nested graph, with domains, hosts and Web...

Analysis of a very large web search engine query log (1999)

Silverstein, C., Marais, H., Henzinger, M., Moricz, M.

In this paper we present an analysis of an AltaVista Search Engine query log consisting of approximately 1 billion entries for search requests over a period of six weeks. This represents almost 285...

On the number of small cuts in a graph (1996)

Henzinger, M., Williamson, D. P.

We prove that in an undirected graph there are at most O(n^2) cuts of size strictly less than 3/2 of the size of the minimum cut.