Mayur Datar

Operator scheduling in data stream systems (2008). The VLDB Journal (2004), DOI 10.1007/s00778-004-0132-6

Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, Dilys Thomas

In many applications involving continuous data streams, data arrival is bursty and data rate fluctuates over time. Systems that seek to give rapid or real-time query responses in such an...

Overcoming Limitations of Sampling for Aggregation Queries (2008)

Surajit Chaudhuri, Gautam Das, Mayur Datar, Rajeev Motwani

We study the problem of approximately answering aggregation queries using sampling. We observe that uniform sampling performs poorly when the distribution of the aggregated attribute is skewed. To...

Models and Issues in Data Stream Systems (2008)

Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, Jennifer Widom

In this overview paper we motivate the need for and research issues arising from a new model of data processing. In this model, data does not take the form of persistent relations, but rather arrives...

Abstract (2008)

Gagan Aggarwal, Mayur Datar

The need to deal with massive data sets in many practical applications has led to a growing interest in computational models appropriate for large inputs. The most important quality of a realistic...

Extending the Streaming Model: Sorting and Streaming Networks (2008)

Matthias Ruhl, Gagan Aggarwal, Mayur Datar, Sridhar Rajagopalan

The need to deal with massive data sets in many practical applications has led to a growing interest in computational models appropriate for large inputs. One such model is “streaming computations...

Operator Scheduling in Data Stream Systems (2008)

Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, Dilys Thomas

In many applications involving continuous data streams, data arrival is bursty and data rate fluctuates over time. Systems that seek to give rapid or real-time query responses in such an environment...

Finding Interesting Associations without Support Pruning (2007)

Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis, Piotr Indyk, Rajeev Motwani, ...

Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known a-priori algorithm is only effective when the only rules of...

Maintaining Stream Statistics over Sliding Windows (2007)

Mayur Datar, Aristides Gionis, Piotr Indyk, Rajeev Motwani

We consider the problem of maintaining aggregates and statistics over data streams, with respect to the last N data elements seen so far. We refer to this model as the sliding window model. We...

Locality-sensitive hashing scheme based on p-stable distributions (2004)

Mayur Datar, Piotr Indyk

We present a novel Locality-Sensitive Hashing scheme for the Approximate Nearest Neighbor Problem under the l_p norm, based on p-stable distributions. Our scheme improves the running time of...

Load Shedding for Aggregation Queries over Data Streams (2004)

Brian Babcock, Mayur Datar, Rajeev Motwani

Systems for processing continuous monitoring queries over data streams must be adaptive because data streams are often bursty and data characteristics may vary over time. In this paper, we focus on...

Query processing, resource management, and approximation in a data stream management system (2003)

Rajeev Motwani, Jennifer Widom, Arvind Arasu, Brian Babcock, Shivnath Babu, Mayur Datar, ...

This paper describes our ongoing work developing the Stanford Stream Data Manager (STREAM), a system for executing continuous queries over multiple continuous data streams. The STREAM system supports...

Chain: Operator Scheduling for Memory Minimization in Data Stream Systems (2003)

Brian Babcock, Shivnath Babu, Mayur Datar

In many applications involving continuous data streams, data arrival is bursty and data rate fluctuates over time. Systems that seek to give rapid or real-time query responses in such an environment...

Maintaining Variance and k-Medians over Data Stream Windows (2003)

Brian Babcock, Mayur Datar

The sliding window model is useful for discounting stale data in data stream applications. In this model, data elements arrive continually and only the most recent N elements are used when answering...

Load Shedding Techniques for Data Stream Systems (2003)

Brian Babcock, Mayur Datar, Rajeev Motwani

Many data stream sources (communication network traffic, HTTP requests, etc.) are prone to dramatic spikes in volume. Because peak load during a spike can be orders of magnitude higher...

Operator Scheduling in Data Stream Systems (2003)

Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, Dilys Thomas

In many applications involving continuous data streams, data arrival is bursty and data rate fluctuates over time. Systems that seek to give rapid or real-time query responses in such an environment...

STREAM: The Stanford Stream Data Manager (2003)

Arvind Arasu, Brian Babcock, Shivnath Babu, Mayur Datar, Keith Ito, ...

We propose to demonstrate a Data Stream Management System (DSMS) called STREAM, for STanford stREam datA Manager. The challenges in building a DSMS instead of a traditional DBMS arise...

Models and issues in data stream systems (2002)

Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, Jennifer Widom

In this overview paper we motivate the need for and research issues arising from a new model of data processing. In this model, data does not take the form of persistent relations, but rather arrives...

2 (2002)

Mayur Datar, Aristides Gionis, Rajeev Motwani, Rina Panigrahy

We consider the problem MAX CSP over multi-valued domains with variables ranging over sets of size s_i ≤ s and constraints involving k_j ≤ k variables. We study two algorithms with...

Comparing data streams using Hamming norms (how to zero in) (2002)

Graham Cormode, Mayur Datar, Piotr Indyk, S. Muthukrishnan

Massive data streams are now fundamental to many data processing applications. For example, Internet routers produce large scale diagnostic data streams. Such streams are rarely stored in...

Comparing data streams using Hamming norms (how to zero in) (2002)

Graham Cormode, Mayur Datar, S. Muthukrishnan, Piotr Indyk

Massive data streams are now fundamental to many data processing applications. For example, Internet routers produce large scale diagnostic data streams. Such streams are rarely stored in traditional...

Estimating rarity and similarity over data stream windows (2002)

Mayur Datar, S. Muthukrishnan

Email: datar@cs.stanford.edu. This work was done while the author was visiting AT&T Research.

Sampling from a moving window over streaming data (2002)

Brian Babcock, Mayur Datar, Rajeev Motwani

We introduce the problem of sampling from a moving window of recent items from a data stream and develop the "chain-sample" and "priority-sample" algorithms for this problem.

Query Processing, Approximation, and Resource Management in a Data Stream Management System (2002)

Rajeev Motwani, Jennifer Widom, Arvind Arasu, Brian Babcock, Shivnath Babu, ...

This paper describes our ongoing work developing the Stanford Stream Data Manager (STREAM), a system for executing continuous queries over multiple continuous data streams. The STREAM system supports...

Maintaining Stream Statistics over Sliding Windows (2002)

Mayur Datar, Aristides Gionis, Piotr Indyk, Rajeev Motwani

We consider the problem of maintaining aggregates and statistics over data streams, with respect to the last N data elements seen so far. We refer to this model as the sliding window model....

Overcoming Limitations of Sampling for Aggregation Queries (2001)

Surajit Chaudhuri, Gautam Das, Mayur Datar, Rajeev Motwani, Vivek Narasayya

We study the problem of approximately answering aggregation queries using sampling. We observe that uniform sampling performs poorly when the distribution of the aggregated attribute is skewed. To...

Finding Interesting Associations without Support Pruning (2000)

Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis, Piotr Indyk, Rajeev Motwani, ...

Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known a-priori algorithm is only effective when the only rules of...

Finding interesting associations without support pruning (2000)

Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis, Piotr Indyk, Rajeev Motwani, ...

Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known a-priori algorithm is only effective when the only...

Finding Interesting Associations without Support Pruning (1999)

Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis, Piotr Indyk, Jeffrey D. Ullman, ...

Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known a-priori algorithm is only effective when the only rules of...

Finding Interesting Associations without Support Pruning

Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis, Piotr Indyk, Rajeev Motwani, ...

Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known a-priori algorithm is only effective when the only rules of...