Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, Dilys Thomas
In many applications involving continuous data streams, data arrival is bursty and data rate fluctuates over time. Systems that seek to give rapid or real-time query responses in such an...
datar@cs.stanford.edu (2008)
Surajit Chaudhuri, Gautam Das, Mayur Datar, Rajeev Motwani
We study the problem of approximately answering aggregation queries using sampling. We observe that uniform sampling performs poorly when the distribution of the aggregated attribute is skewed. To...
Models and Issues in Data Stream Systems (2008)
Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, Jennifer Widom
In this overview paper we motivate the need for and research issues arising from a new model of data processing. In this model, data does not take the form of persistent relations, but rather arrives...
The need to deal with massive data sets in many practical applications has led to a growing interest in computational models appropriate for large inputs. The most important quality of a realistic...
Extending the Streaming Model: Sorting and Streaming Networks (2008)
Matthias Ruhl, Gagan Aggarwal, Mayur Datar, Sridhar Rajagopalan
The need to deal with massive data sets in many practical applications has led to a growing interest in computational models appropriate for large inputs. One such model is “streaming computations...
Operator Scheduling in Data Stream Systems (2008)
Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, Dilys Thomas
In many applications involving continuous data streams, data arrival is bursty and data rate fluctuates over time. Systems that seek to give rapid or real-time query responses in such an environment...
Rajeev Motwani, Jennifer Widom, Arvind Arasu, Brian Babcock, Shivnath Babu, Mayur Datar, ...
This paper describes our ongoing work developing the
Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis, Piotr Indyk, Rajeev Motwani, ...
Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known a-priori algorithm is only effective when the only rules of...
Mayur Datar, Aristides Gionis, Piotr Indyk, Rajeev Motwani
We consider the problem of maintaining aggregates and statistics over data streams, with respect to the last N data elements seen so far. We refer to this model as the sliding window model. We...
Locality-sensitive hashing scheme based on p-stable distributions (2004)
We present a novel Locality-Sensitive Hashing scheme for the Approximate Nearest Neighbor Problem under the l_p norm, based on p-stable distributions. Our scheme improves the running time of...
Load Shedding for Aggregation Queries over Data Streams (2004)
Brian Babcock, Mayur Datar, Rajeev Motwani
Systems for processing continuous monitoring queries over data streams must be adaptive because data streams are often bursty and data characteristics may vary over time. In this paper, we focus on...
Algorithms for data stream systems (2003)
Mayur Datar (advisor: Rajeev Motwani)
Submitted to the Department of Computer Science.
Query processing, resource management, and approximation in a data stream management system (2003)
Rajeev Motwani, Jennifer Widom, Arvind Arasu, Brian Babcock, Shivnath Babu, Mayur Datar, ...
This paper describes our ongoing work developing the Stanford Stream Data Manager (STREAM), a system for executing continuous queries over multiple continuous data streams. The STREAM system supports...
Chain: Operator Scheduling for Memory Minimization in Data Stream Systems (2003)
Brian Babcock, Shivnath Babu, Mayur Datar
In many applications involving continuous data streams, data arrival is bursty and data rate fluctuates over time. Systems that seek to give rapid or real-time query responses in such an environment...
Maintaining Variance and k-Medians over Data Stream Windows (2003)
The sliding window model is useful for discounting stale data in data stream applications. In this model, data elements arrive continually and only the most recent N elements are used when answering...
Load Shedding Techniques for Data Stream Systems (2003)
Brian Babcock, Mayur Datar, Rajeev Motwani
Many data stream sources (communication network traffic, HTTP requests, etc.) are prone to dramatic spikes in volume. Because peak load during a spike can be orders of magnitude higher...
Operator Scheduling in Data Stream Systems (2003)
Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, Dilys Thomas
In many applications involving continuous data streams, data arrival is bursty and data rate fluctuates over time. Systems that seek to give rapid or real-time query responses in such an environment...
STREAM: The Stanford Stream Data Manager (2003)
Arvind Arasu, Brian Babcock, Shivnath Babu, Mayur Datar, Keith Ito, ...
We propose to demonstrate a Data Stream Management System (DSMS) called STREAM, for STanford stREam datA Manager. The challenges in building a DSMS instead of a traditional DBMS arise...
Models and issues in data stream systems (2002)
Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, Jennifer Widom
In this overview paper we motivate the need for and research issues arising from a new model of data processing. In this model, data does not take the form of persistent relations, but rather arrives...
Mayur Datar, Aristides Gionis, Rajeev Motwani, Rina Panigrahy
We consider the problem MAX CSP over multi-valued domains with variables ranging over sets of size s_i ≤ s and constraints involving k_j ≤ k variables. We study two algorithms with...
Comparing data streams using Hamming norms (how to zero in) (2002)
Graham Cormode, Mayur Datar, Piotr Indyk, S. Muthukrishnan
Massive data streams are now fundamental to many data processing applications. For example, Internet routers produce large scale diagnostic data streams. Such streams are rarely stored in...
Comparing data streams using Hamming norms (how to zero in) (2002)
Graham Cormode, Mayur Datar, S. Muthukrishnan, Piotr Indyk
Massive data streams are now fundamental to many data processing applications. For example, Internet routers produce large scale diagnostic data streams. Such streams are rarely stored in traditional...
Estimating rarity and similarity over data stream windows (2002)
Email: datar@cs.stanford.edu. This work was done while the author was visiting AT&T Research.
Sampling from a moving window over streaming data (2002)
Brian Babcock, Mayur Datar, Rajeev Motwani
We introduce the problem of sampling from a moving window of recent items from a data stream and develop the "chain-sample" and "priority-sample" algorithms for this problem.
Query Processing, Approximation, and Resource Management in Data Stream... (2002)
Rajeev Motwani, Jennifer Widom, Arvind Arasu, Brian Babcock, Shivnath Babu, ...
This paper describes our ongoing work developing the Stanford Stream Data Manager (STREAM), a system for executing continuous queries over multiple continuous data streams. The STREAM system supports...
Maintaining Stream Statistics over Sliding Windows (2002)
Mayur Datar, Aristides Gionis, Piotr Indyk, Rajeev Motwani
We consider the problem of maintaining aggregates and statistics over data streams, with respect to the last N data elements seen so far. We refer to this model as the sliding window model....
Overcoming Limitations of Sampling for Aggregation Queries (2001)
Surajit Chaudhuri, Gautam Das, Mayur Datar, Rajeev Motwani, Vivek Narasayya
We study the problem of approximately answering aggregation queries using sampling. We observe that uniform sampling performs poorly when the distribution of the aggregated attribute is skewed. To...
Finding Interesting Associations without Support Pruning (2000)
Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis, Piotr Indyk, Rajeev Motwani, ...
Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known a-priori algorithm is only effective when the only rules of...
Finding interesting associations without support pruning (2000)
Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis, Piotr Indyk, Rajeev Motwani, ...
Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known a-priori algorithm is only effective when the only...
Finding Interesting Associations without Support Pruning (1999)
Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis, Piotr Indyk, Jeffrey D. Ullman, ...
Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known a-priori algorithm is only effective when the only rules of...
Finding Interesting Associations without Support Pruning
Edith Cohen, Mayur Datar, Shinji Fujiwara, Aristides Gionis, Piotr Indyk, Rajeev Motwani, ...
Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known a-priori algorithm is only effective when the only rules of...