Piotr Indyk

Near-Optimal Sparse Recovery in the L1 norm (2009)

Piotr Indyk

We consider the approximate sparse recovery problem, where the goal is to (approximately) recover a high-dimensional vector x ∈ Rn from its lower-dimensional sketch Ax ∈ Rm. Specifically, we...

LOCAL DECODING OF WALSH CODES TO REDUCE CDMA DESPREADING COMPUTATION (2009)

Raghu Madyastha (vanu, Piotr Indyk

In traditional hardware implementations of the CDMA standard IS-95 [1], signals received at the base station are despread at a rate of 1.2288 Megachips/sec prior to Walsh decoding. However, in a...

Biographies (2009)

R Andoni, Piotr Indyk

In this article, we give an overview of efficient algorithms for the approximate and exact nearest neighbor problem. The goal is to preprocess a dataset of objects (e.g., images) so that later, given...

Approximate Line Nearest Neighbor in High Dimensions (2009)

Alexandr Andoni, Piotr Indyk, Robert Krauthgamer, Huy L. Nguyen

We consider the problem of approximate nearest neighbors in high dimensions, when the queries are lines. In this problem, given n points in R d, we want to construct a data structure to support...

Probabilistic embeddings of bounded genus graphs into planar graphs (2008)

Piotr Indyk, Anastasios Sidiropoulos

Ì��ÓÒר�ÒØC�×�ÐÐ��Ø����רÓÖØ�ÓÒÓ�Ø���Ñ�����Ò �...

Declaring Independence via the Sketching of Sketches (2008)

Piotr Indyk, Andrew McGregor

We consider the problem of identifying correlations in data streams. Surprisingly, our work seems to be the first to consider this natural problem. In the centralized model, we consider a stream of...

ABSTRACT (2008)

Mihai Bădoiu, Julia Chuzhoy, Piotr Indyk, Anastasios Sidiropoulos

We study the problem of minimum-distortion embedding of ultrametrics into the plane and higher dimensional spaces. Ultrametrics are a natural class of metrics that frequently occur in applications...

Abstract (2008)

Piotr Indyk, David Woodruff

In [12] a private approximation of a function f is defined to be another function F that approximates f in the usual sense, but does not reveal any information about x other than what can be deduced...

ABSTRACT Low-Dimensional Embedding with Extra Information (2008)

Mihai Bădoiu, Erik D. Demaine, Mohammadtaghi Hajiaghayi, Piotr Indyk

A frequently arising problem in computational geometry is when a physical structure, such as an ad-hoc wireless sensor network or a protein backbone, can measure local information about its geometry...

Block Heavy Hitters (2008)

Andoni, Alexandr, Ba, Khanh Do, Indyk, Piotr

e study a natural generalization of the heavy hitters problem in thestreaming context. We term this generalization *block heavy hitters* and define it as follows. We are to stream over a matrix$A$,...

Block Heavy Hitters (2008)

Andoni, Alexandr, Ba, Khanh Do, Indyk, Piotr

e study a natural generalization of the heavy hitters problem in thestreaming context. We term this generalization *block heavy hitters* and define it as follows. We are to stream over a matrix$A$,...

General Terms (2008)

Piotr Indyk

We give a 1-pass Õ(m1−2/k)-space algorithm for computing the k-th frequency moment of a data stream for any real k> 2. Together with the lower bounds of [1, 2, 4], this resolves the main...

�Ò Ñ�ÒÝ �Ö�� × �Ò ÐÙ��Ò � ÕÙ�ÖÝ ÓÔØ�Ñ�Þ�Ø�ÓÒ �Ò � �ÔÔÖÓÜ� (2008)

Nitin Thaper, Piotr Indyk

ÁÒ Ø�� × Ô�Ô�Ö Û � ÔÖ�×�ÒØ � �ÓÖÑ�Ð ×ØÙ�Ý Ó � �ÝÒ�Ñ � ÑÙÐØ� ��Ñ�Ò×�ÓÒ�Ð ��רÓ�Ö�Ñ ×ØÖÙ ØÙÖ� ×...

1 Introduction (2008)

Gregory Shakhnarovich, Piotr Indyk, Trevor Darrell

The nearest-neighbor (NN) problem occurs in the literature under many names, including the best match or the post office problem. The problem is of significant importance to several areas of computer...

MIT Abstract (2008)

Piotr Indyk, Assaf Naor

In this paper we introduce the notion of nearest neighbor preserving embeddings. These are randomized embeddings between two metric spaces which preserve the (approximate) nearest neighbors. We give...

Abstract (2008)

Piotr Indyk, David Woodruff

In [12] a private approximation of a function f is defined to be another function F that approximates f in the usual sense, but does not reveal any information about x other than what can be deduced...

ABSTRACT Dynamic Multidimensional Histograms (2008)

Nitin Thaper, Piotr Indyk

Histograms are a concise and flexible way to construct sum-mary structures for large data sets. They have attracted a lot of attention in database research due to their utility in many areas,...

Analysis This talk Data Data Streams (2008)

Piotr Indyk, L Minimization, Low Distortion Embeddings

• Two explicit constructions: – A “low-distortion ” embedding A:R n →R m, m=n 1+o(1) , such that for any x ||Ax| | 1 = (1±ε)||x| | 2 (a.k.a. Dvoretzky’s Theorem for l 1) – A “nice...

Abstract Earth Mover Distance over High-Dimensional Spaces (2008)

Alexandr Andoni, Piotr Indyk

The Earth Mover Distance (EMD) between two equalsize sets of points in R d is defined to be the minimum cost of a bipartite matching between the two pointsets. It is a natural metric for comparing...

Sparse recovery using sparse matrices (2008)

Berinde, Radu, Indyk, Piotr

We consider the approximate sparse recovery problem, where the goal is to (approximately) recover a high-dimensional vector x from its lower-dimensional sketch Ax. A popular way of performing this...

Sparse recovery using sparse matrices (2008)

Berinde, Radu, Indyk, Piotr

We consider the approximate sparse recovery problem, where the goal is to (approximately) recover a high-dimensional vector x from its lower-dimensional sketch Ax. A popular way of performing this...

Abstract (2008)

Radu Berinde, Piotr Indyk

We consider the approximate sparse recovery problem, where the goal is to (approximately) recover a high-dimensional vector x from its lower-dimensional sketch Ax. A popular way of performing this...

Explicit constructions for compressed sensing of sparse signals (2008)

Piotr Indyk

Over the recent years, a new approach for obtaining a succinct approximate representation of ndimensional

Overcoming the ℓ1 non-embeddability barrier: Algorithms for product metrics (2008)

Alexandr Andoni, Piotr Indyk

A common approach for solving computational problems over a difficult metric space is to embed the “hard ” metric into L1, which admits efficient algorithms and is thus considered an “easy ”...

Abstract (2008)

Radu Berinde, Piotr Indyk

We consider the approximate sparse recovery problem, where the goal is to (approximately) recover a high-dimensional vector x from its lower-dimensional sketch Ax. A popular way of performing this...

Problem Ref Approx. Time Comments Diameter [10] (2007)

Ashish Goel, Piotr Indyk, Kasturi Varadarajan

We present improved running times for a wide range of approximate high dimensional proximity problems. We obtain subquadratic running time for each of these problems. These improved running times are...

y (2007)

Yair Bartal, Moses Charikar, Piotr Indyk

This paper is concerned with the page migration (or file migration) problem [BS89] as part of a large class of on-line problems. The page migration problem deals with the management of pages residing...

Finding minimum DNF formula on the Boltzmann Machine (2007)

Anna Gambin, Piotr Indyk

In this paper we show that given the set of positive and negative examples of a concept it is possible to build a Boltzmann Machine that aproximates the minimum Disjunctive Normal Form formula...

Approximate Nearest Neighbor Algorithms for Hausdorff Metrics via Embeddings (2007)

Martin Farach-colton, Piotr Indyk

Hausdorff metrics are used in geometric settings for measuring the distance between sets of points. They have been used extensively in areas such as computer vision, pattern recognition and...

Efficient Parallel Computing with Memory Faults (2007)

Leszek Gasieniec, Piotr Indyk

. In this paper we show two results on PRAM with constant fraction of memory faults. First we show how to preprocess (i.e. connect a constant fraction of processors into a binary tree) a faulty EREW...

39 NEAREST NEIGHBORS IN HIGH-DIMENSIONAL SPACES (2007)

Piotr Indyk

In this chapter we consider the following problem: given a set P of points in a high-dimensional space, construct a data structure which given any query point q

z (2007)

Taher H. Haveliwala, Aristides Gionis, Dan Klein, Piotr Indyk

Finding pages on the Web that are similar to a query page (Related Pages) is an important component of modern search engines. A variety of strategies have been proposed for answering Related Pages...

z (2007)

Taher H. Haveliwala, Aristides Gionis, Dan Klein, Piotr Indyk

Finding pages on the Web that are similar to a query page (Related Pages) is an important component of modern search engines. A variety of strategies have been proposed for answering Related Pages...

Fast Algorithms for Subset Matching and Tree Pattern Matching (2007)

Richard Cole, Ramesh Hariharan, Piotr Indyk

This paper describes an O(s log 3 s) time deterministic algorithm, an O(s log 3

Codes of (2007)

Venkatesan Guruswami, Piotr Indyk

We present several novel constructions of codes which share the common thread of using expander (or expander-like) graphs as a component. The expanders enable the design of efficient decoding...

Linear-time Codes to Correct a Maximum Possible Fraction of Errors (2007)

Venkatesan Guruswami, Piotr Indyk

We construct linear-time encodable and decodable binary codes of positive rate (in fact, rate

y (2007)

Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis, Piotr Indyk, Rajeev Motwani, ...

Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known a-priori algorithm is only effective when the only rules of...

z (2007)

Anna C. Gilbert, Sudipto Guha, Piotr Indyk, Yannis Kotidis, S. Muthukrishnan, Martin J. Strauss

A vector A of length N is defined implicitly, via a stream of updates of the form "add 5 to A 3. " We give a sketching algorithm, that constructs a small sketch from the stream of...

z (2007)

Mayur Datar, Aristides Gionis, Piotr Indyk, Rajeev Motwani

We consider the problem of maintaining aggregates and statistics over data streams, with respect to the last N data elements seen so far. We refer to this model as the sliding window model. We...

y (2007)

Martin Gavrilov, Piotr Indyk, Rajeev Motwani, Suresh Venkatasubramanian

In this paper, we undertake a performance study of some recent algorithms for geometric pattern matching. These algorithms cover two general paradigms for pattern matching; alignment and...

E#cient Regular Data Structures and Algorithms for Dilation, Location and Proximity Problems (2007)

Arnon Amir, Alon Efrat, Piotr Indyk, Hanan Samet

In this paper we investigate data-structures obtained by a recursive partitioning of the input domain into regions of equal size. One of the most well known examples of such a structure is the...

Abstract (2007)

Martin Farach-colton, Piotr Indyk

Hausdorff metrics are used in geometric settings for measuring the distance between sets of points. They have been used extensively in areas such as computer vision, pattern recognition and...

z (2007)

Taher H. Haveliwala, Aristides Gionis, Dan Klein, Piotr Indyk

Finding pages on the Web that are similar to a query page (Related Pages) is an important component of modern search engines. A variety of strategies have been proposed for answering Related Pages...

Abstract Earth Mover Distance over High-Dimensional Spaces ∗ (2007)

Alexandr Andoni, Piotr Indyk, Robert Krauthgamer

The Earth Mover Distance (EMD) between two equalsize sets of points in R d is defined to be the minimum cost of a bipartite matching between the two pointsets. It is a natural metric for comparing...

Abstract (2007)

Alexandr Andoni, Piotr Indyk, Robert Krauthgamer

The Earth Mover Distance (EMD) between two equal-size sets of points in R d is defined to be the minimum cost of a bipartite matching between the two pointsets. It is a natural metric for comparing...

Sketching information divergences (2007)

Sudipto Guha, Piotr Indyk, Andrew Mcgregor

When comparing discrete probability distributions, natural measures of similarity are not ℓp distances but rather are information divergences such as Kullback-Leibler and Hellinger. This paper...

Sketching information divergences (2007)

Sudipto Guha, Piotr Indyk, Andrew Mcgregor

Abstract. When comparing discrete probability distributions, natural measures of similarity are not ℓp distances but rather are informationdivergences such as Kullback-Leibler and Hellinger. This...

Sketching information divergences (2007)

Sudipto Guha, Piotr Indyk, Andrew Mcgregor

When comparing discrete probability distributions, natural measures of similarity are not ℓp distances but rather are information-divergences such as Kullback-Leibler and Hellinger. This paper...

Certified by.......................................................... (2006)

Mihai Bădoiu, Piotr Indyk, Arthur C. Smith, Mihai Bădoiu

We present several computationally efficient algorithms, and complexity results on low distortion mappings between metric spaces. An embedding between two metric spaces is a mapping between the two...

Products ℓ2 Regression SVD (2006)

Tamás Sarlós, Thanks András, A. Benczúr, Katalin Friedl, Piotr Indyk

◮ Fundamental tool in data mining (e.g. clustering) and web information retrieval (e.g. HITS, LSI) ◮ Task: Given A ∈ R m×n find rank-k matrix Ak such that

New LSH-based Algorithm for Approximate Nearest Neighbor (2005)

Andoni, Alexandr, Indyk, Piotr

We present an algorithm for c-approximate nearest neighbor problem in a d-dimensional Euclidean space, achieving query time ofO(dn^{1/c^2+o(1)}) and space O(dn + n^{1+1/c^2+o(1)}).

New LSH-based Algorithm for Approximate Nearest Neighbor (2005)

Andoni, Alexandr, Indyk, Piotr

We present an algorithm for c-approximate nearest neighbor problem in a d-dimensional Euclidean space, achieving query time ofO(dn^{1/c^2+o(1)}) and space O(dn + n^{1+1/c^2+o(1)}).

Facility location in sublinear time (2005)

Mihai Bădoiu, Artur Czumaj, Piotr Indyk, Christian Sohler

Abstract. In this paper we present a randomized constant factor approximation algorithm for the problem of computing the optimal cost of the metric Minimum Facility Location problem, in the case of...

Into Low-Dimensional Spaces (2005)

Anastasios Sidiropoulos, Piotr Indyk, Anastasios Sidiropoulos

We present several approximation algorithms for the problem of embedding metric spaces into a line, and into the two-dimensional plane. We give an O ( √ n)approximation algorithm for the problem of...

Linear time encodable/decodable codes with nearoptimal rate (2005)

Venkatesan Guruswami, Piotr Indyk

We present an explicit construction of linear-time encodable and decodable codes of rate r which can correct a fraction (1 − r − ε)/2 of errors over an alphabet of constant size depending only...

Polylogarithmic private approximations and efficient matching, ECCC (2005)

Piotr Indyk, David Woodruff

Abstract. In [12] a private approximation of a function f is defined to be another function F that approximates f in the usual sense, but does not reveal any information about x other than what can...

6.046J Introduction to Algorithms (SMA 5503), Fall 2004 (2004)

Leiserson, Charles Eric, Indyk, Piotr

Techniques for the design and analysis of efficient algorithms, emphasizing methods useful in practice. Topics: sorting; search trees, heaps, and hashing; divide-and-conquer; dynamic programming;...

6.046J Introduction to Algorithms (SMA 5503), Fall 2004 (2004)

Leiserson, Charles Eric, Indyk, Piotr

Techniques for the design and analysis of efficient algorithms, emphasizing methods useful in practice. Topics: sorting; search trees, heaps, and hashing; divide-and-conquer; dynamic programming;...

A Constant-Factor Approximation Algorithm for Embedding Unweighted Graphs into Trees (2004)

Badoiu, Mihai, Indyk, Piotr, Sidiropoulos, Anastasios

We present a constant-factor approximation algorithm for computing an embedding of the shortest path metric of an unweighted graph into a tree, that minimizes the multiplicative distortion.

A Constant-Factor Approximation Algorithm for Embedding Unweighted Graphs into Trees (2004)

Badoiu, Mihai, Indyk, Piotr, Sidiropoulos, Anastasios

We present a constant-factor approximation algorithm for computing anembedding of the shortest path metric of an unweighted graph into atree, that minimizes the multiplicative distortion.

A Constant-Factor Approximation Algorithm for Embedding Unweighted Graphs into Trees (2004)

Badoiu, Mihai, Indyk, Piotr, Sidiropoulos, Anastasios

We present a constant-factor approximation algorithm for computing anembedding of the shortest path metric of an unweighted graph into atree, that minimizes the multiplicative distortion.

A Constant-Factor Approximation Algorithm for Embedding Unweighted Graphs into Trees (2004)

Badoiu, Mihai, Indyk, Piotr, Sidiropoulos, Anastasios

We present a constant-factor approximation algorithm for computing an embedding of the shortest path metric of an unweighted graph into a tree, that minimizes the multiplicative distortion.

Optimal Approximations of the Frequency Moments (2004)

Indyk, Piotr, Woodruff, David

We give a one-pass, O~(m^{1-2/k})-space algorithm for estimating the k-th frequency moment of a data stream for any real k>2. Together with known lower bounds, this resolves the main problem left...

Optimal Approximations of the Frequency Moments (2004)

Indyk, Piotr, Woodruff, David

We give a one-pass, O~(m^{1-2/k})-space algorithm for estimating the k-th frequency moment of a data stream for any real k>2. Together with known lower bounds, this resolves the main problem left...

Optimal Approximations of the Frequency Moments (2004)

Indyk, Piotr, Woodruff, David

We give a one-pass, O~(m^{1-2/k})-space algorithm for estimating the k-th frequency moment of a data stream for any real k>2. Together with known lower bounds, this resolves the main problem left...

Optimal Approximations of the Frequency Moments (2004)

Indyk, Piotr, Woodruff, David

We give a one-pass, O~(m^{1-2/k})-space algorithm for estimating the k-th frequency moment of a data stream for any real k>2. Together with known lower bounds, this resolves the main problem left...

Low-dimensional embedding with extra information (2004)

Erik D. Demaine, Mohammadtaghi Hajiaghayi, Piotr Indyk

ABSTRACT A frequently arising problem in computational geometry is whena physical structure, such as an ad-hoc wireless sensor network or

Locality-sensitive hashing scheme based on p-stable distributions (2004)

Mayur Datar, Piotr Indyk

inÇÐÓ�Ò We present a novel Locality-Sensitive Hashing scheme for the Approximate Nearest Neighbor Problem underÐÔnorm, based onÔstable distributions. Our scheme improves the running time of...

Low-distortion embeddings of finite metric spaces (2004)

Piotr Indyk

An n-point metric space (X; D) can be represented by an n n table specifying the distances. Such tables arise in many diverse areas. For example, consider the following scenario in microbiology: X is...

Low-Distortion Embeddings of Finite Metric Spaces (2004)

Piotr Indyk, Jiri Matousek

An n-point metric space (X, D) can be represented by an n × n table specifying the distances. Such tables arise in many diverse areas. For example, consider the following scenario in...

Low-dimensional embedding with extra information (2004)

Mihai Bădoiu, Erik D. Demaine, Mohammadtaghi Hajiaghayi, Piotr Indyk

A frequently arising problem in computational geometry is when a physical structure, such as an ad-hoc wireless sensor network or a protein backbone, can measure local information about its geometry...

Low-Dimensional Embedding with Extra Information (2004)

Mihai Badoiu, Erik D. Demaine, Mohammadtaghi Hajiaghayi, Piotr Indyk

A frequently arising problem in computational geometry is when a physical structure, such as an ad-hoc wireless sensor network or a protein backbone, can measure local information about its geometry...

Efficiently decodable low-rate codes meeting gilbert varshamov bound (2004)

Venkatesan Guruswami, Piotr Indyk

We demonstrate a probabilistic construction of binary linear codes meeting the Gilbert-Varshamov bound (with overwhelming probability) for rates up to about 10 −4, together with polynomial time...

When Crossings Count - Approximating the Minimum Spanning Tree (2003)

Har-Peled, Sariel, Indyk, Piotr

In the first part of the paper, we present an (1+\mu)-approximation algorithm to the minimum-spanning tree of points in a planar arrangement of lines, where the metric is the number of crossings...

Sorting (2003)

Piotr Indyk, Jeff Vitter

• Another way to tackle large data sets • Exact solutions (no more embeddings)

Combinatorial and experimental methods for approximate point pattern matching (2003)

Martin Gavrilov, Piotr Indyk, Rajeev Motwani, Suresh Venkatasubramanian

Point pattern matching is an important problem in computational geometry, with applications in areas like computer vision, object recognition, molecular modelling, and image registration....

Better algorithms for high-dimensional proximity problems via asymmetric embeddings (2003)

Piotr Indyk

Abstract In this paper we give several results based on randomized embeddings of l2 into l1 (or "l1-like") spaces. Our first result is a (1 + ffl)-distortion asymmetric embedding of...

Tight lower bounds for the distinct elements problem (2003)

Piotr Indyk

We prove strong lower bounds for the space complexity of ¢¤£¦¥¨§� ©-approximating the number of distinct elements �� � in a data stream. Let � be the size of the universe from which...

Combinatorial and experimental methods for approximate point pattern matching. Algorithmica (2003)

Martin Gavrilov, Piotr Indyk, Rajeev Motwani, Suresh Venkatasubramanian

Point pattern matching is an important problem in computational geometry, with applications in areas like computer vision, object recognition, molecular modelling, and image registration....

Evaluating Strategies for Similarity Search on the Web (2002)

Haveliwala, Taher H., Gionis, Aristades, Klein, Dan, Indyk, Piotr

Finding pages on the Web that are similar to a query page (Related Pages) is an important component of modern search engines. A variety of strategies have been proposed for answering Related Pages...

New algorithms for subset query, partial match, orthogonal range searching, and related problems (2002)

Moses Charikar, Piotr Indyk, Rina Panigrahy

- N * 2O(m log 2 mpc / log N) space, and O(N/2c) time, for any c- Nmc space and O(mN/c) query time, for any c < = N We extend these results to the more general problem of orthogonal range...

Approximate clustering via core-sets (2002)

Mihai Bădoiu, Sariel Har-peled, Piotr Indyk

In this paper, we show that for several clustering problems one can extract a small set of points, so that using those core-sets enable us to perform approximate clustering efficiently. The...

Comparing data streams using hamming norms (how to zero in (2002)

Graham Cormode, Mayur Datar, Piotr Indyk, S. Muthukrishnan

Abstract—Massive data streams are now fundamental to many data processing applications. For example, Internet routers produce large scale diagnostic data streams. Such streams are rarely stored in...

Comparing data streams using hamming norms (how to zero in (2002)

Graham Cormode, Mayur Datar, S. Muthukrishnan, Piotr Indyk

Massive data streams are now fundamental to many data processing applications. For example, Internet routers produce large scale diagnostic data streams. Such streams are rarely stored in traditional...

Derandomized dimensionality reduction with applications (2002)

Lars Engebretsen, Piotr Indyk

The Johnson-Lindenstrauss lemma provides a way to map a number of points in highdimensional space into a low-dimensional space, with only a small distortion of the distances between the points. The...

Approximate Clustering via Core-Sets (2002)

Mihai Badoiu, Sariel Har-peled, Piotr Indyk

In this paper, we show that for several clustering problems one can extract a small set of points, so that using those core-sets enable us to perform approximate clustering efficiently.

Fast, Small-Space Algorithms for Approximate Histogram Maintenance (Extended Abstract) (2002)

Anna Gilbert, Yannis Kotidis, Sudipto Guha, S. Muthukrishnan, Piotr Indyk, Martin J. Strauss

Anna C. Gilbert AT&T Labs---Research agilbert@research.att.com Yannis Kotidis AT&T Labs---Research kotidis@research.att.com Sudipto Guha CIS,University of Pennsylvania sudipto@cis.upenn.edu...

Fast, Small-Space Algorithms for Approximate Histogram (2002)

Maintenance Extend Ed, Anna C. Gilbert, Sudipto Guha, Piotr Indyk, Yannis Kotidis, S. Muthukrishnan, ...

Anna C. Gilbert AT&T Labs---Research agilbert@research.att.com Yannis Kotidis AT&T Labs---Research kotidis@research.att.com Sudipto Guha CIS,University of Pennsylvania sudipto@cis.upenn.edu...

Approximate clustering via core-sets (2002)

Mihai Bădoiu, Sariel Har-peled, Piotr Indyk

In this paper, we show that for several clustering problems one can extract a small set of points, so that using those core-sets enable us to perform approximate clustering efficiently. The...

Maintaining Stream Statistics over Sliding Windows (2002)

Mayur Datar, Aristides Gionis, Piotr Indyk, Rajeev Motwani

Abstract. We consider the problem of maintaining aggregates and statistics over data streams, with respect to the last N data elements seen so far. We refer to this model as the sliding window model....

Maintaining Stream Statistics over Sliding Windows (2002)

Mayur Datar, Aristides Gionis, Piotr Indyk, Rajeev Motwani

Abstract. We consider the problem of maintaining aggregates and statistics over data streams, with respect to the last N data elements seen so far. We refer to this model as the sliding window model....

Approximate clustering via core-sets (2002)

Mihai Bădoiu, Sariel Har-peled, Piotr Indyk

In this paper, we show that for several clustering problems one can extract a small set of points, so that using those core-sets enable us to perform approximate clustering efficiently. The...

Approximate clustering via core-sets (2002)

Mihai Bădoiu, Sariel Har-peled, Piotr Indyk

In this paper, we show that for several clustering problems one can extract a small set of points, so that using those core-sets enable us to perform approximate clustering efficiently. The...

Evaluating Strategies for Similarity Search on the Web (2002)

Taher H. Haveliwala, Aristides Gionis, Dan Klein, Piotr Indyk

Finding pages on the Web that are similar to a query page (Related Pages) is an important component of modern search engines. A variety of strategies have been proposed for answering Related Pages...

Algorithmic applications of low-distortion geometric embeddings (2001)

Piotr Indyk

Low-distortion geometric embeddings Formally: a mapping f: PA → PB: • PA: points from metric space with distance D(·, ·) • PB: points from some normed space, e.g., l d 2 • For any p, q ∈...

Expander-based constructions of efficiently decodable codes (2001)

Venkatesan Guruswami, Piotr Indyk

We present several novel constructions of codes which share the common thread of using expander (or expander-like) graphs as a component. The expanders enable the design of efficient decoding...

Pattern matching for sets of segments (2001)

Alon Efrat, Piotr Indyk, Suresh Venkatasubramanian

In this paper we present algorithms for a number of problems in geometric pattern matching where the input consists of a collection of orthogonal segments in the plane. Such collections arise...

Pattern Matching for sets of segments (2000)

Efrat, Alon, Indyk, Piotr, Venkatasubramanian, Suresh

In this paper we present algorithms for a number of problems in geometric pattern matching where the input consist of a collections of segments in the plane. Our work consists of two main parts. In...

High-dimensional computational geometry / (2000)

Indyk, Piotr.

Submitted to the Department of Computer Science.

Approximate congruence in nearly linear time (2000)

Piotr Indyk, Suresh Venkatasubramanian

Abstract The problem of geometric point set matching has been studied extensively in the domain of computationalgeometry, and has many applications in areas such as computer vision, computational...

Mining the Stock Market: Which Measure is Best (2000)

Martin Gavrilov, Dragomir Anguelov, Piotr Indyk, Rajeev Motwani

In recent years, there has been a lot of interest in the database community in mining time series data. Surprisingly, little work has been done on verifying which measures are most suitable for...

Mining the Stock Market: Which Measure is Best (2000)

Martin Gavrilov, Dragomir Anguelov, Piotr Indyk, Rajeev Motwani

In recent years, there has been a lot of interest in the database community in mining time series data. Surprisingly, little work has been done on verifying which measures are most suitable for...

Dimensionality reduction techniques for proximity problems (2000)

Piotr Indyk

\Gamma O(ffl2) to n2\Gamma O(ffl). Our algorithms are unified by the fact that their key component is a dimensionality reduction technique for Hamming spaces. 1 Introduction The proximity problems...

Approximate congruence in nearly linear time (2000)

Piotr Indyk, Suresh Venkatasubramanian

The problem of geometric point set matching has been well-studied in the domain of computational geometry, and it has many applications in areas such as computer vision, computational chemistry, and...

When crossings count - approximating the minimum spanning tree (2000)

Sariel Har-peled, Piotr Indyk

We present an (1+&quot;)-approximation algorithm for computing the minimum-spanning tree of points in a planar arrangement of lines, where the metric is the number of crossings between the...

Mining The Stock Market: Which Measure Is Best? (Extended Abstract) (2000)

Martin Gavrilov, Dragomir Anguelov, Piotr Indyk, Rajeev Motwani

In recent years, there has been a lot of interest in the database community in mining time series data. Surprisingly, little work has been done on verifying which measures are most suitable for...

Stable Distributions, Pseudorandom Generators, Embeddings and Data Stream Computation (2000)

Piotr Indyk

In this paper we show several results obtained by combining the use of stable distributions with pseudorandom generators for bounded space. In particular: ffl we show how to maintain (using only...

Scalable Techniques for Clustering the Web (Extended Abstract) (2000)

Taher H. Haveliwala, Aristides Gionis, Piotr Indyk

Clustering is one of the most crucial techniques for dealing with the massive amount of information present on the web. Clustering can either be performed once offline, independent of search queries,...

Dimensionality Reduction Techniques for Proximity Problems (2000)

Piotr Indyk

In this paper we give approximation algorithms for several proximity problems in high dimensional spaces. In particular, we give the first Las Vegas data structure for (1 + ffl)-nearest neighbor with...

When crossings count - approximating the minimum spanning tree (2000)

Sariel Har-peled, Piotr Indyk

We present an (1+ε)-approximation algorithm for computing the minimum-spanning tree of points in a planar arrangement of lines, where the metric is the number of crossings between the spanning tree...

Approximate congruence in nearly linear time (2000)

Piotr Indyk, Suresh Venkatasubramanian

The problem of geometric point set matching has been well-studied in the domain of computational geometry, and it has many applications in areas such as computer vision, computational chemistry, and...

Approximate congruence in nearly linear time (2000)

Piotr Indyk, Suresh Venkatasubramanian

The problem of geometric point set matching has been studied extensively in the domain of computational geometry, and has many applications in areas such as computer vision, computational chemistry,...

Dimensionality reduction techniques for proximity problems (2000)

Piotr Indyk

In this paper we giveapproximation algorithms for several proximity problems in high dimensional spaces. In particular, we give the rst Las Vegas data structure for (1 +)-nearest neighbor with...

Finding Interesting Associations without Support Pruning (2000)

Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis, Piotr Indyk, Rajeev Motwani, ...

Association-rule mining has heretofore relied on the conditionof high support to do its work efficiently. In particular, the well-known a-priori algorithm is only effective when the only rules of...

Mining the Stock Market: Which Measure is Best (2000)

Martin Gavrilov, Dragomir Anguelov, Piotr Indyk, Rajeev Motwani

Inrecentyears,therehasbeenalotofinterestinthedatabase communityinminingtimeseriesdata. Surprisingly,little workhasbeendoneonverifyingwhichmeasuresaremost...

Finding interesting associations without support pruning (2000)

Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis, Piotr Indyk, Rajeev Motwani, ...

Abstract Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known a-priori algorithm is only effective when the only...

Geometric matching under noise: combinatorial bounds and algorithms (1999)

Piotr Indyk, Rajeev Motwani, Suresh Venkatasubramanian

In geometric pattern matching, we are given two sets of points P and Q in d dimensions, and the problem is to determine the rigid transformation that brings P closest to Q, under some distance...

Similarity search in high dimensions via hashing (1999)

Aristides Gionis, Piotr Indyk, Rajeev Motwani

The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building...

Geometric matching under noise: combinatorial bounds and algorithms (1999)

Piotr Indyk, Rajeev Motwani, Suresh Venkatasubramanian

In geometric pattern matching, we are given two sets of points P and Q in d dimensions, and the problem is to determine the rigid transformation that brings P closest to Q, under some distance...

When Crossings Count - Approximating the Minimum Spanning Tree (1999)

Sariel Har-peled, Piotr Indyk

In the first part of the paper, we present an (1 + &epsilon;)-approximation algorithm to the minimum-spanning tree of points in a planar arrangement of lines, where the metric is the number of...

Similarity Search in High Dimensions via Hashing (1999)

Aristides Gionis, Piotr Indyk, Rajeev Motwani

The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building...

A Small Approximately Min-Wise Independent Family of Hash Functions (1999)

Piotr Indyk

In this paper we give a construction of a small approximately min-wise independent family of hash functions. The number of bits needed to represent each function is O(logn \Delta log 1=ffl). This...

Stochastic Load Balancing and Related Problems (1999)

Ashish Goel, Piotr Indyk

We study the problems of makespan minimization (load balancing), knapsack, and bin packing when the jobs have stochastic processing requirements or sizes. If the jobs are all Poisson, we present a...

Stochastic Load Balancing and Related Problems (1999)

Piotr Indyk

We study the problems of makespan minimization (load balancing), knapsack, and bin packing when the jobs have stochastic processing requirements or sizes. If the jobs are all Poisson, we present a...

Finding Interesting Associations without Support Pruning (1999)

Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis, Piotr Indyk, Jeffrey D. Ullman, ...

Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known a-priori algorithm is only effective when the only rules of...

Geometric pattern matching: A performance study (1999)

Martin Gavrilov, Piotr Indyk, Rajeev Motwani, Suresh Venkatasubramanian

In this paper, we undertake a performance study of some recent algorithms for geometric pattern matching. These algorithms cover two general paradigms for pattern matching; alignment and...

Similarity search in high dimensions via hashing (1999)

Aristides Gionis, Piotr Indyk, Rajeev Motwani

The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building...

Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality (1998)

Preliminary Version Piotr, Piotr Indyk, Rajeev Motwani

The nearest neighbor problem is the following: Given a set of n points P = fp 1 ; : : : ; png in some metric space X, preprocess P so as to efficiently answer queries which require finding the point...

Faster Algorithms for String Matching Problems: Matching the Convolution Bound (1998)

Piotr Indyk

In this paper we give a randomized O(n log n)-time algorithm for the string matching with don't cares problem. This improves the Fischer-Paterson bound [10] from 1974 and answers the open...

On Approximate Nearest Neighbors in Non-Euclidean Spaces (1998)

Piotr Indyk

The nearest neighbor search (NNS) problem is the following: Given a set of n points P = fp 1 ; : : : ; png in some metric space X, preprocess P so as to efficiently answer queries which require...

Probabilistic Analysis for Combinatorial Functions of Moving Points (1998)

Julien Basch, Harish Devarajan, Piotr Indyk, Li Zhang

We initiate a probabilistic study of configuration functions of moving points. In our probabilistic model, a particle is given an initial position and a velocity drawn independently at random from...

Enhanced Hypertext Categorization Using Hyperlinks (1998)

Soumen Chakrabarti, Byron Dom, And Piotr Indyk, Piotr Indyk

A major challenge in indexing unstructured hypertext databases is to automatically extract meta-data that enables structured search using topic taxonomies, circumvents keyword ambiguity, and improves...

Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality (1998)

Piotr Indyk, Rajeev Motwani

The nearest neighbor problem is the following: Given a set of n points P in some metric space X, preprocess P so as to efficiently answer queries which require finding the point in P closest to the...

Approximate nearest neighbors: Towards removing the curse of dimensionality (1998)

Piotr Indyk, Rajeev Motwani

The nearest neighbor problem is the following: Given a set of n points P = fp1�:::�p ng in some metric space X, preprocess P so as to e ciently answer queries which require nding the point inP...

Similarity Search in High Dimensions via Hashing (1997)

Aristides Gionis Piotr, Piotr Indyk, Rajeev Motwani

The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building...

On Page Migration and Other Relaxed Task Systems (1997)

Yair Bartal, Moses Charikar, Piotr Indyk

This paper is concerned with the page migration (or le migration) problem [BS89] as part of a large class of on-line problems. The page migration problem deals with the management of pages residing...

Similarity Search in High Dimensions via Hashing (1997)

Aristides Gionis, Piotr Indyk, Rajeev Motwani

The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building...

External Inverse Pattern Matching (1997)

Leszek Asieniec Piotr, Piotr Indyk, Piotr Krysta

We consider external inverse pattern matching problem. Given a text T of length n over an ordered alphabet \Sigma, such that j\Sigmaj = oe, and a number m n. The entire problem is to find a pattern ~...

Probabilistic Analysis for Combinatorial Functions of Moving Points (1997)

Li Zhang, Harish Devarajan, Julien Basch, Piotr Indyk

We initiate a probabilistic study of configuration functions of moving points. In our probabilistic model, a particle is given an initial position and a velocity drawn independently at random from...

Locality-Preserving Hashing in Multidimensional Spaces (1997)

Piotr Indyk, Rajeev Motwani, Prabhakar Raghavan, S. Vempala, Santosh Vempala

this paper was published in Proceedings of the 29th Annual ACM Symposium on Theory of Computing, pages 618--625, 1997

Probabilistic Analysis for Combinatorial Functions of Moving Points (1997)

Li Zhang, Harish Devarajan, Julien Basch, Piotr Indyk

Given a set of n points, what is the description complexity of their convex hull? In our world, this question is understood with an implicit "in the worst case", and the answer is n bd=2c...

External Inverse Pattern Matching (1997)

Leszek Gasieniec, Piotr Indyk, Piotr Krysta

. In this paper we consider the external inverse pattern matching problem. Given a text T of length n over an ordered alphabet \Sigma and a number m n, the goal is to find a pattern ~ PMAX 2 \Sigma m...

External Inverse Pattern Matching (1997)

Leszek Gasieniec, Piotr Indyk, Piotr Krysta

We consider external inverse pattern matching problem. Given a text T of length n over an ordered alphabet \Sigma, such that j\Sigmaj = oe, and a number m n. The entire problem is to find a pattern ~...

Locality-Preserving Hashing in Multidimensional Spaces (1997)

Piotr Indyk, Rajeev Motwani, Prabhakar Raghavan, Santosh Vempala

We consider locality-preserving hashing --- in which adjacent points in the domain are mapped to adjacent or nearlyadjacent points in the range --- when the domain is a d- dimensional cube. This...

Shared-Memory Simulations on a Faulty-Memory DMM (1996)

Bogdan S. Chlebus, Anna Gambin, Piotr Indyk, Instytut Informatyki, Uniwersytet Warszawski

this paper are synchronous, and the time performance is our major efficiency criterion. We consider a DMM with faulty memory words, otherwise everything is assumed to be operational. In particular...

Finding Interesting Associations without Support Pruning

Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis, Piotr Indyk, Rajeev Motwani, ...

Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known a-priori algorithm is only effective when the only rules of...

Sublinear Time Algorithms for Metric Space Problems

Piotr Indyk

In this paper we give approximation algorithms for the following problems on metric spaces: Furthest Pair, k- median, Minimum Routing Cost Spanning Tree, Multiple Sequence Alignment, Maximum...

A Sublinear Time Approximation Scheme for Clustering in Metric Spaces

Piotr Indyk

The metric 2-clustering problem is defined as follows: given a metric (X; d), partition X into two sets S 1 and S 2 in order to minimize the value of X i X fu;vgaeS i d(u; v) In this paper we show an...

Efficient Regular Data Structures and Algorithms for Location and Proximity Problems

Arnon Amir, Alon Efrat, Piotr Indyk, Hanan Samet

In this paper we investigate data-structures obtained by a recursive partitioning of the input domain into regions of equal size. One of the most well known examples of such a structure is the...

Geometric Matching under Noise: Combinatorial Bounds and Algorithms

Piotr Indyk, Rajeev Motwani, Suresh Venkatasubramanian

The geometric point set matching problem in 2 and 3 dimensions is a well-studied problem with application to areas such as computer vision and pattern recognition, computational chemistry and other...