Cost-Based Query Optimization for Complex Pattern Mining on Multiple Databases (2008)

Abstract
Mining frequent patterns across multiple datasets has received a lot of research interest recently. In this paper, we investigate cost-based query optimization approaches to efficiently evaluate such mining tasks. Specifically, we make the following contributions: 1) We present a rich class of queries on mining frequent itemsets across multiple datasets supported by a SQL-based mechanism. 2) We present an approach to enumerate all possible query plans for the mining queries, and develop a dynamic programming approach and a branch-and-bound approach based on the enumeration algorithm to find optimal query plans with the least mining cost. 3) We introduce models to estimate the cost of individual mining operators. 4) We evaluate our query optimization techniques on both real and synthetic datasets and show significant performance improvements.

Publication details
Download http://citeseerx.ist.psu.edu/viewdoc/summary?doi=?doi=10.1.1.122.7492
Source http://www.cs.kent.edu/~dfuhry/papers/edbt08.pdf
Contributors CiteSeerX
Repository CiteSeerX - Scientific Literature Digital Library and Search Engine (United States)
Type text
Language English
Relation 10.1.1.40.6757, 10.1.1.3.2424, 10.1.1.12.8836, 10.1.1.42.3283, 10.1.1.41.407, 10.1.1.31.3305, 10.1.1.48.5095, 10.1.1.13.6173, 10.1.1.41.1963, 10.1.1.137.3356, 10.1.1.13.7502, 10.1.1.23.5522, 10.1.1.11.644, 10.1.1.125.8146, 10.1.1.75.8273, 10.1.1.77.2614, 10.1.1.31.9604, 10.1.1.63.5424