Automatic Optimization of Parallel Dataflow Programs (2009)
Christopher Olston, Benjamin Reed, Adam Silberstein, Utkarsh Srivastava
Large-scale parallel dataflow systems, e.g., Dryad and Map-Reduce, have attracted significant attention recently. High-level dataflow languages such as Pig Latin and Sawzall are being layered on top...
Christopher Olston, Benjamin Reed, Utkarsh Srivastava
There is a growing need for ad-hoc analysis of extremely large data sets, especially at internet companies where innovation critically depends on being able to analyze terabytes of data collected...
Efficient Top-K Processing Over Query-Dependent Functions (2009)
Lin Guo, Sihem Amer-Yahia, Raghu Ramakrishnan, Jayavel Shanmugasundaram, Utkarsh Srivastava, Erik Vee
We study the efficient evaluation of top-k queries over data items, where the score of each item is dynamically computed by applying an item-specific function whose parameter value is specified in...
Efficient Bulk Insertion into a Distributed Ordered Table (2009)
Adam Silberstein, Brian F. Cooper, Utkarsh Srivastava, Erik Vee, Ramana Yerneni, Raghu Ramakrishnan
We study the problem of bulk-inserting records into tables in a system that horizontally range-partitions data over a large cluster of shared-nothing machines. Each table partition contains a...
Pig Latin: A Not-So-Foreign Language for Data Processing (2008)
Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, Andrew Tomkins
There is a growing need for ad-hoc analysis of extremely large data sets, especially at internet companies where innovation critically depends on being able to analyze terabytes of data collected...
A Parallel DFA Minimization Algorithm (2008)
Ambuj Tewari, Utkarsh Srivastava, P. Gupta
In this paper, we consider the state minimization problem for Deterministic Finite Automata (DFA). An efficient parallel algorithm for solving the problem on an arbitrary CRCW PRAM...
Efficient Computation of Diverse Query Results (2008)
Erik Vee, Utkarsh Srivastava, Jayavel Shanmugasundaram, Prashant Bhat, Sihem Amer-Yahia
We study the problem of efficiently computing diverse query results in online shopping applications, where users specify queries through a form interface that allows a mix of structured and...
Utkarsh Srivastava (Advisor: Prof. Jennifer Widom)
Stanford University, Graduate Fellow, 2002-present.
Area Optimized Square Root and Divider Unit for Multimedia Processing Applications (2008)
Shilpa Reddy, Utkarsh Srivastava
Digital systems for square root computation and division are still a challenge for IC designers. Various techniques have been proposed for increasing division/square root performance, including...
PNUTS: Yahoo!’s hosted data serving platform (2008)
Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, ...
We describe PNUTS, a massively parallel and geographically distributed database system for Yahoo!’s web applications. PNUTS provides data storage organized as hashed or ordered tables, low latency...
A Distributed Monitoring System for Troubleshooting Wireless Networks (2007)
Ambuj Tewari, Utkarsh Srivastava, Dr. Dheeraj Sanghi
Over recent years, there has been a tremendous increase in the use of mobile devices such as laptops and PDAs replacing the conventional desktops, particularly in the realm of personal and...
Query Optimization over Web Services (2006)
Utkarsh Srivastava, Kamesh Munagala, Jennifer Widom, Rajeev Motwani
Web services are becoming a standard method of sharing data and functionality among loosely-coupled systems. We propose a general-purpose Web Service Management System (WSMS) that enables querying...
Operator placement for in-network stream query processing (2005)
Utkarsh Srivastava, Kamesh Munagala, Jennifer Widom
In sensor networks, data acquisition frequently takes place at low-capability devices. The acquired data is then transmitted through a hierarchy of nodes having progressively increasing network...
Flexible time management in data stream systems (2004)
(DSMS) rely on time as a basis for windows on streams and for defining a consistent semantics for multiple streams and updatable relations. The system clock in a centralized DSMS provides a...
STREAM: The Stanford data stream management system (2004)
Arvind Arasu, Brian Babcock, Shivnath Babu, John Cieslewicz, Keith Ito, Rajeev Motwani, ...
Traditional database management systems are best equipped to run one-time queries over finite stored data sets. However, many modern applications such as network monitoring, financial analysis,...
Vision Paper: Enabling Privacy for the Paranoids (2004)
Gagan Aggarwal, Mayank Bawa, Prasanna Ganesan, Hector Garcia-Molina, Krishnaram Kenthapadi, Nina Mishra, ...
P3P [27, 32] is a set of standards that allow corporations to declare their privacy policies. Hippocratic Databases [4] have been proposed to implement such policies within a corporation’s...
Monitoring stream properties for continuous query processing (2003)
Utkarsh Srivastava, Shivnath Babu, Jennifer Widom
Management System for processing continuous queries over multiple continuous data streams [12]. When a new continuous query is registered, our query optimizer creates an initial query plan (possibly...
Monitoring Stream Properties for Continuous Query Processing (2003)
Utkarsh Srivastava, Shivnath Babu, Jennifer Widom
focus now on properties related to stream data and arrival characteristics. As a simple example of property-based query execution, if a stream is known to be roughly sorted on an attribute , and is a...
Exploiting k-Constraints to Reduce Memory Overhead in Continuous Queries over Data Streams (2002)
Moustafa Mohamadou, Shivnath Babu, Utkarsh Srivastava, Jennifer Widom
Continuous queries often require significant runtime state over arbitrary data streams. However, streams may exhibit certain data or arrival patterns, or constraints, that can be detected and...
Exploiting k-Constraints to reduce memory overhead in continuous queries over data streams (2002)
Shivnath Babu, Utkarsh Srivastava, Jennifer Widom
Continuous queries often require significant run-time state over arbitrary data streams. However, streams may exhibit certain data or arrival patterns, or constraints, that can be detected and...