Ranking Objects Based on Relationships and Fixed Associations (2009)
Albert Angel, Surajit Chaudhuri, Nick Koudas, Gautam Das
Text corpora are often enhanced by additional metadata that relate real-world entities to each document in which such entities are discussed. Such relationships are typically obtained through...
Hashed Samples: Selectivity Estimators For Set Similarity Selection Queries (2009)
Marios Hadjieleftheriou, Xiaohui Yu, Nick Koudas, Divesh Srivastava
We study selectivity estimation techniques for set similarity queries. A wide variety of similarity measures for sets have been proposed in the past. In this work we concentrate on the class of...
Categorical Skylines for Streaming Data (2009)
The problem of skyline computation has attracted considerable research attention. In the categorical domain the problem becomes more complicated, primarily due to the partially-ordered nature of the...
H. V. Jagadish, Nick Koudas, Divesh Srivastava, Ting Yu, ...
or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice...
Yin Yang, Nilesh Bansal, Wisam Dakka, Panagiotis Ipeirotis, Nick Koudas, Dimitris Papadias
We are experiencing an unprecedented increase of content contributed by users in forums such as blogs, social networking sites and microblogging services. Such abundance of content complements...
Similarity Search: A Matching Based Approach (2008)
Rui Zhang, Nick Koudas, Beng Chin Ooi
Similarity search is a crucial task in multimedia retrieval and data mining. Most existing work has modelled this problem as the nearest neighbor (NN) problem, which considers the distance between...
Size Separation Spatial Join (2008)
Nick Koudas, Kenneth C. Sevcik
We introduce a new algorithm to compute the spatial join of two or more spatial data sets, when indexes are not available on them. Size Separation Spatial Join (S³J) imposes a hierarchical...
Fast Indexes and Algorithms for Set Similarity Selection Queries (2008)
Marios Hadjieleftheriou, Amit Chandel, Nick Koudas, Divesh Srivastava
Data collections often have inconsistencies that arise due to a variety of reasons, and it is desirable to be able to identify and resolve them efficiently. Set similarity queries are...
Betty Salzberg, David Lomet, Erich Neuhold, Hideki Kawai, Tait Eliott Larson, ...
Ivan T. Bowman, Peter Bumbulis, Dan Farrar, ...
The Bulletin of the Technical Committee on Data Engineering is published quarterly and is distributed to all TC members. Its scope includes the design, implementation, modelling, theory and...
Ad-hoc Top-k Query Answering for Data Streams (2008)
A top-k query retrieves the k highest scoring tuples from a data set with respect to a scoring function defined on the attributes of a tuple. The efficient evaluation of top-k queries has been an...
Philip A. Bernstein, Nishant Dani, Badriddine Khessib, Ramesh Manne, David Shutt, Jayant Madhavan, ...
A funny thing happened on the way to a billion ... Alfredo Alba, ...
Byung-Won On Penn State Univ. (2008)
Poor quality data is prevalent in databases due to a variety of reasons, including transcription errors, lack of standards for recording database fields, etc. To be able to query and integrate such...
Estimating the selectivity of approximate string queries (2008)
Arturas Mazeika, Michael H. Böhlen, Nick Koudas
Approximate queries on string data are important, due to the prevalence of such data in databases and various conventions and errors in string data. We present the VSol estimator, a novel technique...
Approximation and streaming algorithms for histogram construction problems (2008)
Sudipto Guha, Nick Koudas, Kyuseok Shim
Histograms are typically used to approximate data distributions. Histograms and related synopsis structures have been successful in a wide variety of popular database applications including...
Vagelis Hristidis, Yannis Papakonstantinou, Nick Koudas, Divesh Srivastava
Recent works have shown the benefits of keyword proximity search in querying XML documents in addition to text documents. For example, given query keywords over Shakespeare’s plays in XML, the...
Rapid Identification of Column Heterogeneity (2008)
Bing Tian Dai, Divesh Srivastava, Nick Koudas, Suresh Venkatasubramanian, Beng Chin Ooi
Data quality is a serious concern in every data management application, and a variety of quality measures have been proposed, e.g., accuracy, freshness and completeness, to capture common sources of...
Nick Koudas, Divesh Srivastava
• A data stream is a (potentially unbounded) sequence of tuples • Transactional data streams: log interactions between entities • Credit card: purchases by consumers from merchants • ...
Optimization Techniques for Reactive Network Monitoring (2008)
Ahmet Bulut, Nick Koudas, Divesh Srivastava
We develop a framework for minimizing the communication overhead of monitoring global system parameters in IP networks and sensor networks. A global system parameter is defined as a...
Andrey Balmin, Divesh Srivastava, Yannis Papakonstantinou, Nick Koudas
Keyword proximity search is a user-friendly information discovery technique that has been extensively studied for text documents. In extending this technique to structured databases, recent works [6,...
Masaru Kitsuregawa, Betty Salzberg, Gonzalo Navarro, Ricardo Baeza-yates, Erkki Sutinen, Jorma Tarhio, ...
Integrating Diverse Information Management Systems: A Brief Survey ...
Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Divesh Srivastava
String data is ubiquitous, and its management has taken on particular importance in the past few years. Approximate queries are very important on string data especially for more complex queries...
Seeking Stable Clusters in the Blogosphere (2008)
The popularity of blogs has been increasing dramatically over the last couple of years. As topics evolve in the blogosphere, keywords align together and form the heart of various stories. Intuitively...
Benchmarking Declarative Approximate Selection Predicates (2008)
Amit Chandel, Oktie Hassanzadeh, Nick Koudas, Mohammad Sadoghi, Divesh Srivastava
Declarative data quality has been an active research topic. The fundamental principle behind a declarative approach to data quality is the use of declarative statements to realize data quality...
Selectivity Estimation for Boolean Queries (2007)
Zhiyuan Chen, Flip Korn, Nick Koudas, S. Muthukrishnan
In a variety of applications ranging from optimizing queries on alphanumeric attributes to providing approximate counts of documents containing several query terms, there is an increasing need to...
Choosing Bucket Boundaries for Histograms (2007)
H. V. Jagadish, Nick Koudas, Kenneth C. Sevcik
Histograms have long been used to capture attribute value distribution statistics for query optimizers. More recently, there has been a growing interest in the use of histograms to produce quick...
Nick Koudas, Beng Chin Ooi, Heng Tao Shen
Recent advances in research fields like multimedia and bioinformatics have brought about a new generation of hyper-dimensional databases which can contain hundreds or even thousands of dimensions....
ABSTRACT Text Joins in an RDBMS for Web Data Integration (2007)
Luis Gravano, Panagiotis G. Ipeirotis, Nick Koudas, Divesh Srivastava
The integration of data produced and collected across autonomous, heterogeneous web services is an increasingly important and challenging problem. Due to the lack of global identifiers, the same...
Using q-grams in a DBMS for Approximate String Processing (2007)
Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Lauri Pietarinen, ...
String data is ubiquitous, and its management has taken on particular importance in the past few years. Approximate queries are very important on string data. This is due, for example, to the...
Efficient Biased Sampling for Approximate Clustering and Outlier Detection in Large Datasets (2007)
George Kollios, Dimitrios Gunopulos, Nick Koudas, Stefan Berchtold
We investigate the use of biased sampling according to the density of the dataset, to speed up the operation of general data mining tasks, such as clustering and outlier detection in large...
Counting Twig Matches in a Tree (2007)
Zhiyuan Chen, H. V. Jagadish, Flip Korn, Nick Koudas, S. Muthukrishnan, Raymond Ng, Divesh Srivastava
We describe efficient algorithms for accurately estimating the number of matches of a small node-labeled tree, i.e., a twig, in a large node-labeled tree, using a summary data structure. This problem...
Peter Carlin, David B. Lomet, Anastassia Ailamaki, Jayant Haritsa, Nick Koudas, Dan Suciu
The Bulletin of the Technical Committee on Data Engineering is published quarterly and is distributed to all TC members. Its scope includes the design, implementation, modelling, theory and...
Aggregate query answering on anonymized tables (2007)
Qing Zhang, Nick Koudas, Divesh Srivastava, Ting Yu
Privacy is a serious concern when microdata need to be released for ad hoc analyses. The privacy goals of existing privacy protection approaches (e.g., k-anonymity and ℓ-diversity) are suitable...
Relaxing join and selection queries (2006)
Nick Koudas, Chen Li, Rares Vernica
Database users can be frustrated by having an empty answer to a query. In this paper, we propose a framework to systematically relax queries involving joins and selections. When considering relaxing...
Similarity search: A matching based approach (2006)
Rui Zhang, Nick Koudas, Beng Chin Ooi
Similarity search is a crucial task in multimedia retrieval and data mining. Most existing work has modelled this problem as the nearest neighbor (NN) problem, which considers the distance between...
Keyword proximity search in XML trees (2006)
Vagelis Hristidis, Nick Koudas, Yannis Papakonstantinou, Divesh Srivastava
Recent works have shown the benefits of keyword proximity search in querying XML documents in addition to text documents. For example, given query keywords over Shakespeare’s plays in...
Answering Order-Based Queries Over XML Data (2005)
Zografoula Vagena, Nick Koudas, Divesh Srivastava, Vassilis J. Tsotras
Order-based queries over XML data include XPath navigation axes such as following-sibling and following. In this paper, we present holistic algorithms that evaluate such order-based queries. An...
Index Structures for Matching XML Twigs Using Relational Query Processors (2005)
Zhiyuan Chen
Various index structures have been proposed to speed up the evaluation of XML path expressions. However, existing XML path indices suffer from at least one of three limitations: they focus only on...
Approximate Joins: Concepts and Techniques (2005)
unities, deploying diverse approximate match predicates. The objective of this tutorial is to provide a comprehensive and cohesive overview of the key research results, techniques, and tools used for...
SPIDER: Flexible Matching in Databases (2005)
We present a prototype system, SPIDER, developed at AT&T Labs-- Research, which supports flexible string attribute value matching in large databases. We discuss the design principles on which...
Efficient Handling of Positional Predicates within XML Query Processing (2005)
Zografoula Vagena, Nick Koudas, Divesh Srivastava, Vassilis J. Tsotras
The inherent order within the XML document-centric data model is typically exposed through positional predicates defined over the XPath navigation axes. Although processing algorithms for each axis...
Multiple Aggregations Over Data Streams (2005)
Rui Zhang, Nick Koudas, Beng Chin Ooi, Divesh Srivastava
Monitoring aggregates on IP traffic data streams is a compelling application for data stream management systems. The need for exploratory IP traffic data analysis naturally leads to posing related aggregation queries on data streams, that differ only in the choice...
Indexing mixed types for approximate retrieval (2005)
Liang Jin, Chen Li, Nick Koudas
In various applications such as data cleansing, being able to retrieve categorical or numerical attributes based on notions of approximate match (e.g., edit distance, numerical distance) is of...
Multiple aggregations over data streams (2005)
Rui Zhang, Nick Koudas, Beng Chin Ooi, Divesh Srivastava
Monitoring aggregates on IP traffic data streams is a compelling application for data stream management systems. The need for exploratory IP traffic data analysis naturally leads to posing related...
Index Structures for Matching XML Twigs Using Relational Query Processors (2004)
Zhiyuan Chen, Johannes Gehrke, Flip Korn, Nick Koudas, Jayavel Shanmugasundaram, Divesh Srivastava
Various index structures have been proposed to speed up the evaluation of XML path expressions. However, existing XML path indices suffer from at least one of three limitations: they focus only on...
LDC: Enabling Search By Partial Distance In A Hyper-Dimensional Space (2004)
Nick Koudas, Beng Chin Ooi, Heng Tao Shen, Anthony K. H. Tung
Recent advances in research fields like multimedia and bioinformatics have brought about a new generation of hyper-dimensional databases which can contain hundreds or even thousands of dimensions....
Approximate NN Queries on Streams with Guaranteed Error/Performance Bounds (2004)
Nick Koudas, Beng Chin Ooi, Kian-Lee Tan, Rui Zhang
In data stream applications, data arrive continuously and can only be scanned once as the query processor has very limited memory (relative to the size of the stream) to work with. Hence, queries on...
NNH: Improving performance of nearest-neighbor searches using histograms (2004)
Liang Jin, Nick Koudas, Chen Li
Efficient search for nearest neighbors (NN) is a fundamental problem arising in a large variety of applications of vast practical interest. In this paper we propose a novel technique,...
Flexible string matching against large databases in practice (2004)
Nick Koudas, Amit Marathe, Divesh Srivastava
Data Cleaning is an important process that has been at the center of research interest in recent years. Poor data quality is the result of a variety of reasons, including data entry errors and...
Merging the Results of Approximate Match Operations (2004)
Sudipto Guha, Nick Koudas, Amit Marathe, Divesh Srivastava
Data Cleaning is an important process that has been at the center of research interest in recent years. An important end goal of effective data cleaning is to identify the relational tuple or tuples...
LDC: Enabling Search by Partial Distance in a Hyper-Dimensional Space (2004)
Nick Koudas, Beng Chin Ooi, Heng Tao Shen
Recent advances in research fields like multimedia and bioinformatics have brought about a new generation of hyper-dimensional databases which can contain hundreds or even thousands of dimensions....
Text Joins in an RDBMS for Web Data Integration (2003)
Luis Gravano, Panagiotis G. Ipeirotis, Nick Koudas, Divesh Srivastava
The integration of data produced and collected across autonomous, heterogeneous web services is an increasingly important and challenging problem. Due to the lack of global identifiers, the same...
Panayiotis Tsaparas, Nick Koudas, Themistoklis Palpanas
A plethora of data sources contain data entities that could be ordered according to a variety of attributes associated with the entities. Such orderings result effectively in a ranking of the...
Panayiotis Tsaparas, Themistoklis Palpanas, Yannis Kotidis, Nick Koudas, Divesh Srivastava
A plethora of data sources contain data entities that could be ordered according to a variety of attributes associated with the entities. Such orderings result effectively in a ranking of the...
Efficient approximation of optimization queries under parametric aggregation constraints (2003)
Sudipto Guha, Dimitrios Gunopulos, Nick Koudas, Divesh Srivastava, Michail Vlachos
We introduce and study a new class of queries that we refer to as OPAC (optimization under parametric aggregation constraints) queries. Such queries aim to identify sets of database tuples that...
Text Joins for Data Cleansing and Integration in an RDBMS (2003)
Luis Gravano, Panagiotis G. Ipeirotis, Nick Koudas, Divesh Srivastava
An organization’s data records are often noisy because of transcription errors, incomplete information, lack of standard formats for textual data or combinations thereof. A fundamental task in a...
Approximate String Joins in a Database (Almost) for Free: Erratum (2003)
Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, ...
case the result returned by the Figure 1 query is incomplete and suffers from "false negatives," in contrast to our claim to the contrary in [GIJ+01b]. In general, the string pairs that are...
Nick Koudas, Divesh Srivastava
logical/physical streams and signatures. Express I/O and CPU efficient signature programs cleanly. Lesson: Essential to consider I/O issues for data streams. Hancock: Data...
Panayiotis Tsaparas, Nick Koudas, Themistoklis Palpanas
A plethora of data sources contain data entities that could be ordered according to a variety of attributes associated with the entities. Such orderings result effectively in a ranking of the entities according to the values in the attribute domain. Commonly,...
Index-based approximate XML joins (2003)
XML data integration tools are facing a variety of challenges for their efficient and effective operation. Among these is the requirement to handle a variety of inconsistencies or mistakes present in...
Navigation- vs. Index-Based XML Multi-Query Processing (2003)
Nicolas Bruno, Luis Gravano, Nick Koudas, Divesh Srivastava
XML path queries form the basis of complex filtering of XML data. Most current XML path query processing techniques can be divided in two groups. Navigation-based algorithms compute results by...
Holistic Twig Joins: Optimal XML Pattern Matching (2002)
Nicolas Bruno, Nick Koudas, Divesh Srivastava
XML employs a tree-structured data model, and, naturally, XML queries specify patterns of selection predicates on multiple elements related by a tree structure. Finding all occurrences of such a twig...
Non-Linear Dimensionality Reduction Techniques for Classification and Visualization (2002)
Michail Vlachos, Carlotta Domeniconi, Dimitrios Gunopulos, George Kollios, Nick Koudas
In this paper we address the issue of using local embeddings for data visualization in two and three dimensions, and for classification. We advocate their use on the basis that they provide an efficient...
Structural joins: a primitive for efficient XML query pattern matching (2002)
Shurug Al-Khalifa, H. V. Jagadish, Nick Koudas, Jignesh M. Patel, Divesh Srivastava, Yuqing Wu
XML queries typically specify patterns of selection predicates on multiple elements that have some specified tree structured relationships. The primitive tree structured relationships are...
Efficient Biased Sampling for Approximate Clustering and Outlier Detection in Large Datasets (2002)
George Kollios, Dimitrios Gunopulos, Nick Koudas, Stefan Berchtold
We investigate the use of biased sampling according to the density of the data set to speed up the operation of general data mining tasks, such as clustering and outlier detection in large...
Fast Algorithms for Hierarchical Range Histogram Construction (2002)
Sudipto Guha, Nick Koudas, Divesh Srivastava
Data Warehousing and OLAP applications typically view data as having multiple logical dimensions (e.g., product, location) with natural hierarchies defined on each dimension. OLAP queries usually...
Structural Joins: A Primitive for Efficient XML Query Pattern Matching (2002)
H. V. Jagadish, Nick Koudas, Jignesh M. Patel, Divesh Srivastava, Yuqing Wu
XML queries typically specify patterns of selection predicates on multiple elements that have some specified tree structured relationships. The primitive tree structured relationships are...
Non-Linear Dimensionality Reduction Techniques for Classification and Visualization (2002)
Michail Vlachos, Carlotta Domeniconi, Dimitrios Gunopulos, George Kollios, Nick Koudas
In this paper we address the issue of using local embeddings for data visualization in two and three dimensions, and for classification. We advocate their use on the basis that they provide an...
Efficient and tunable similar set retrieval (2001)
Aristides Gionis, Dimitrios Gunopulos, Nick Koudas
Set value attributes are a concise and natural way to model complex data sets. Modern Object Relational systems support set value attributes and allow various query capabilities on them. In this...
Approximate string joins in a database (almost) for free (2001)
Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Divesh Srivastava
In [GIJ+01a, GIJ+01b] we described how to use q-grams in an RDBMS to perform approximate string joins. We also showed how to implement the approximate join using plain SQL queries. Specifically,...
Using q-grams in a DBMS for Approximate String Processing (2001)
Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Lauri Pietarinen, ...
String data is ubiquitous, and its management has taken on particular importance in the past few years. Approximate queries are very important on string data. This is due, for example, to the...
Entropy Based Approximate Querying and Exploration of Datacubes (2001)
Themistoklis Palpanas, Nick Koudas
Much research has been devoted to the efficient computation of relational aggregations and specifically the efficient execution of the datacube operation. In this paper we consider the inverse...
Falcon: Fault management via alarm warehousing and mining (2001)
Matt Grossglauser, Nick Koudas, Alice Variot
The ability to manage faults in large scale networks is of vast importance for successful and effective network management operations. In this paper, we describe FALCON, a project underway at...
PREFER: A system for the efficient execution of multi-parametric ranked queries (2001)
Vagelis Hristidis, Nick Koudas, Yannis Papakonstantinou
Users often need to optimize the selection of objects by appropriately weighting the importance of multiple object attributes. Such optimization problems appear often in operations research...
Approximate string joins in a database (almost) for free (2001)
Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Divesh Srivastava
String data is ubiquitous, and its management has taken on particular importance in the past few years. Approximate queries are very important on string data especially for more complex queries...
Data-streams and histograms (2001)
Sudipto Guha, Nick Koudas, Kyuseok Shim
Histograms have been used widely to capture data distribution, to represent the data by a small number of step functions. Dynamic programming algorithms which provide optimal construction of these...
Optimal histograms for hierarchical range queries (2000)
Now there is tremendous interest in data warehousing and OLAP applications. OLAP applications typically view data as having multiple logical dimensions (e.g., product, location) with natural...
Selectivity estimation for Boolean queries (2000)
Zhiyuan Chen, Nick Koudas, S. Muthukrishnan
In a variety of applications ranging from optimizing queries on alphanumeric attributes to providing approximate counts of documents containing several query terms, there is an increasing need to...
On Effective Multi-Dimensional Indexing for Strings (2000)
H. V. Jagadish, Nick Koudas, Divesh Srivastava
As databases have expanded in scope from storing purely business data to include XML documents, product catalogs, e-mail messages, and directory data, it has become increasingly important to search...
Optimal Histograms for Hierarchical Range Queries (Extended Abstract) (2000)
Nick Koudas, S. Muthukrishnan, Divesh Srivastava
Nick Koudas (AT&T Labs-Research, koudas@research.att.com), S. Muthukrishnan (AT&T Labs-Research, muthu@research.att.com), Divesh Srivastava (AT&T Labs-Research, divesh@research.att.com)...
High Dimensional Similarity Joins: Algorithms and Performance Evaluation (1998)
Current data repositories include a variety of data types, including audio, images and time series. State of the art techniques for indexing such data and doing query processing rely on a...
Optimal Histograms with Quality Guarantees (1998)
H. V. Jagadish, Viswanath Poosala, Nick Koudas, Ken Sevcik, S. Muthukrishnan, Torsten Suel
Histograms are commonly used to capture attribute value distribution statistics for query optimizers. More recently, histograms have also been considered as a way to produce quick approximate answers...
High dimensional similarity joins: algorithms and performance evaluation (1998)
Nick Koudas, Kenneth C. Sevcik
Current data repositories include a variety of data types, including audio, images, and time series. State-of-the-art techniques for indexing such data and doing query processing rely on a...
Optimal Histograms with Quality Guarantees (1998)
H. V. Jagadish, Viswanath Poosala, Nick Koudas, Ken Sevcik
Histograms are commonly used to capture attribute value distribution statistics for query optimizers. More recently, histograms have also been considered as a way to produce quick approximate answers...
Size Separation Spatial Join (1997)
Nick Koudas, Kenneth C. Sevcik
We introduce a new algorithm to compute the spatial join of two or more spatial data sets, when indexes are not available on them. Size Separation Spatial Join (S³J) imposes a hierarchical...
Peter Buneman, Grigoris Karvounarakis, David B. Lomet, Anastassia Ailamaki, Jayant Haritsa, ...
The Bulletin of the Technical Committee on Data Engineering is published quarterly and is distributed to all TC members. Its scope includes the design, implementation, modelling, theory and...