Kdd J. Vanbriesen, C. Faloutsos, Kdd J. Vanbriesen, C. Faloutsos, Kdd J. Vanbriesen, C. Faloutsos, ...
SVD- quality • Remember – from SVD: • Q: can we find better ‘hidden variables’? • A: yes – with Independent Component Analysis (ICA) – see later (Q1) Find patterns in data • Motion...
• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • multimedia 15-826 Copyright: C. Faloutsos (2007) 3
Hypothesis testing (Chi-square) (2009)
C. Faloutsos, Ai Decision Trees, Intro To Db
Approach (very intuitive): accept the hypothesis if the theoretical values are ‘close enough’ to the actual ones. Formally: Step 1: bucketize (how many? how wide?) Step 2: compute deviation of...
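The bucketize-then-compare steps in the snippet above can be sketched as follows. This is a minimal illustration, not the lecture's own code: the function names `bucket_counts` and `chi_square_statistic` are hypothetical, the bucket edges are chosen by the caller, and the final accept/reject step (comparing the statistic against a critical value for the chosen significance level) is assumed to follow the standard chi-square procedure, since the source text is truncated before that step.

```python
from bisect import bisect_right


def bucket_counts(values, edges):
    """Step 1 (assumed form): count values per bucket [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = bisect_right(edges, v) - 1
        if 0 <= i < len(counts):  # values outside the edges are ignored
            counts[i] += 1
    return counts


def chi_square_statistic(observed, expected):
    """Step 2: sum of squared deviations, each scaled by the expected count."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: observed bucket counts vs. counts predicted by the hypothesis.
observed = bucket_counts([0.5, 1.5, 1.7, 2.9], edges=[0, 1, 2, 3])
statistic = chi_square_statistic([10, 20, 30], [20, 20, 20])
```

A small statistic means the theoretical counts are close to the actual ones; how small is ‘close enough’ depends on the number of buckets and the significance level, which the caller would look up in a chi-square table.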
Outline Goal: ‘Find similar / interesting things’ (2009)
Fractals Introduction, C. Faloutsos, Intro To Db
• primary key indexing • secondary key / multi-key indexing • spatial access methods – z-ordering – R-trees – misc • fractals – intro – applications • text 15-826 Copyright: C....
CMU SCS Data Mining- Detailed outline • Statistics (2009)
C. Faloutsos, Ai Decision Trees, C. Faloutsos
– data warehouses; data cubes; OLAP – classifiers – association rules – misc. topics: • clustering (revisited) • reconstruction of info
C. Faloutsos, Intro To Db, Ecgs Sound, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text
and Data Mining Information recovery (2009)
C. Faloutsos, C C. Faloutsos, Christos Faloutsos
■ Qopt- selectivities ■ data warehousing ■ transaction recording systems (details: in tertiary store) ■ statistical/scientific db ■ data integration (partial info from many sources) 2....
CMU SCS Outline Goal: ‘Find similar / interesting things’ (2009)
C. Faloutsos, C. Faloutsos, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text
C. Faloutsos, Ai Decision Trees, C. Faloutsos, C. Faloutsos, Chris Palmer (vivisimo, C. Faloutsos, ...
– data warehouses; data cubes; OLAP – classifiers – association rules – misc. topics: • approximate counting
• Variations / Applications • New concepts (2009)
Christos Faloutsos, R. Agrawal, R. Agrawal, T. Imielinski, A. Swami Mining, C. Faloutsos, ...
• Consider the ‘market basket’ case: (milk, bread)
LOCI: Fast Outlier Detection Using The Local Correlation Integral, (2009)
Outlier Detection, C. Faloutsos, Spiros Papadimitriou, Hiroyuki Kitagawa, Phillip B. Gibbons, Christos Faloutsos, ...
What is an outlier? Why outlier detection?
Spatial Access Methods- problem (2009)
Christos Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos
z-ordering- Detailed outline • What is the problem / S.A.M. • z-ordering – main idea- 3 methods – use w/ B-trees; algorithms (range, knn queries) – non-point (e.g., region) data –...
Outline Goal: ‘Find similar / interesting things’ (2008)
C. Faloutsos, C. Faloutsos, Intro To Db, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods – z-ordering – R-trees – misc • fractals – intro – applications • text CMU SCS
• Indexing- similarity search (2008)
C. Faloutsos, Sec Key Indexing, Intro To Db, Copyright C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods • text 15-826 Copyright: C. Faloutsos (2005) 3
Christos Faloutsos, Pods Christos Faloutsos, Ibrahim Kamel, C. Faloutsos, C. Faloutsos, C. Faloutsos
Montgomery county: •Q1: how many d.a. for an R-tree? •Q2: distribution? •not uniform
Christos Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos
x 1, x 2, … , x t, …
Christos Faloutsos (cmu, S. Chakrabarti, C. Faloutsos, S. Chakrabarti, C. Faloutsos
[lumeta.com]
CMU SCS Indexing- Detailed outline (2008)
Fractals Introduction, C. Faloutsos, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods – z-ordering – R-trees – misc • fractals – intro – applications • text 15-826 Copyright: C....
C. Faloutsos, C. Faloutsos, Intro To Db, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • multimedia
Outline Goal: ‘Find similar / interesting things’ (2008)
C. Faloutsos, Intro To Db, Sams Detailed Outline
• primary key indexing • secondary key / multi-key indexing • spatial access methods – problem dfn – z-ordering – R-trees – misc • text
CMU SCS Indexing- Detailed outline (2008)
Cmu Scs, C. Faloutsos, C. Faloutsos, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • multimedia
C. Faloutsos, Ai Decision Trees, C. Faloutsos
– data warehouses; data cubes; OLAP – classifiers – association rules – misc. topics: • reconstruction of info • network databases; time sequence forecasting 15-826 Copyright: C....
CMU SCS Indexing- Detailed outline (2008)
• primary key indexing • secondary key / multi-key indexing • spatial access methods – problem dfn – z-ordering – R-trees – misc • fractals • text 15-826 Copyright: C. Faloutsos...
• Indexing- similarity search • Data Mining (2008)
C. Faloutsos, C. Faloutsos, Ai Decision Trees, Copyright C. Faloutsos, C. Faloutsos
– data warehouses; data cubes; OLAP – classifiers – association rules – misc. topics: • reconstruction of info • network databases; time sequence forecasting
C. Faloutsos, C. Faloutsos, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text
Christos Faloutsos (scs, Kdd J. Vanbriesen, C. Faloutsos, Kdd J. Vanbriesen, C. Faloutsos, ...
x 1, x 2, … , x t, …
Cmu Scs, C. Faloutsos, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text
• Q: Why study (image/video) compression? (2008)
Cmu Scs, C. Faloutsos, C. Faloutsos
• primary key indexing • multimedia • Digital Signal Processing (DSP) tools • Image + video compression
C. Faloutsos, C. Faloutsos, Intro To Db, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • Singular Value Decomposition (SVD) • multimedia
CMU SCS Indexing- Detailed outline (2008)
C. Faloutsos, C. Faloutsos, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods – problem dfn – z-ordering – R-trees – misc • text
Outline Goal: ‘Find similar / interesting things’ (2008)
C. Faloutsos, C. Faloutsos, C. Faloutsos
• AI- decision trees
Outline Goal: ‘Find similar / interesting things’ (2008)
C. Faloutsos, Ai Decision Trees, Intro To Db
– data warehouses; data cubes; OLAP – classifiers – association rules – misc. topics: • reconstruction of info • network databases; time sequence forecasting
Age Chol-level Gender … CLASS-ID (2008)
C. Faloutsos, Copyright C. Faloutsos, C. Faloutsos, C. Faloutsos
• Problem: Classification, i.e., • given a training set (N tuples, with M attributes, plus a label attribute) • find rules to predict the label for newcomers. Pictorially:
Hypothesis testing (Chi-square) (2008)
Problem: is it true that the salary distribution is Zipf?
Christos Faloutsos (scs, Kdd J. Vanbriesen, C. Faloutsos, Kdd J. Vanbriesen, C. Faloutsos, ...
x 1, x 2, … , x t, …
Christos Faloutsos, C. Faloutsos, C. Faloutsos, Wang+ Mengzhi Wang, Tara Madhyastha, Ngai Hang, ...
2) Implementation: buffering, indexing, q-opt 7) Data Analysis- data mining sensors, time series, indexing and wavelets sensors and forecasting 8) Benchmarks 9) vision statements extras...
C. Faloutsos, C. Faloutsos, Ai Decision Trees, C. Faloutsos, C. Palmer, C. Faloutsos, ...
– data warehouses; data cubes; OLAP – classifiers – association rules – misc. topics: • approximate counting
Vector Space Model and Clustering (2008)
C. Faloutsos, C. Faloutsos, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • multimedia
C. Faloutsos, C. Faloutsos, Intro To Db, C. Faloutsos
• Problem: Classification, i.e., • given a training set (N tuples, with M attributes, plus a label attribute) • find rules to predict the label for newcomers. Pictorially: 15-826 Copyright: C....
C. Faloutsos, C. Faloutsos, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods • text
• Definition- properties • Interpretation • Complexity (2008)
C. Faloutsos, C. Faloutsos, Copyright C. Faloutsos, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text
C. Faloutsos, Ai Decision Trees, C. Faloutsos
– data warehouses; data cubes; OLAP – classifiers – association rules – misc. topics: • reconstruction of info • network databases; time sequence forecasting 15-826 Copyright: C....
Power Laws, C. Faloutsos, Christos Faloutsos, Deepayan Chakrabarti (cmu/yahoo, Michalis Faloutsos (ucr, George Siganos (ucr, ...
Applications of sensors/streams • ‘Smart house’: monitoring temperature, humidity etc • Financial, sales, economic series
• Problem#5: Track communities over time (2008)
Graph Analysis Laws, Christos Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos
• Static & dynamic laws; generators
C. Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods – problem dfn – z-ordering – R-trees • text
• Q: Why study (image/video) compression? (2008)
Cmu Scs, C. Faloutsos, Fractal Compression, Intro To Db
• primary key indexing • multimedia • Digital Signal Processing (DSP) tools
• Major contribution: LSI = Latent Semantic Indexing (2008)
Cmu Scs, C. Faloutsos, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • SVD: a powerful tool • multimedia
• Definition- properties • Interpretation • Complexity (2008)
Cmu Scs, C. Faloutsos, Copyright C. Faloutsos, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • Singular Value Decomposition (SVD) • multimedia
– Problem definition- Spatial Access Methods (2008)
Christos Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos
• Given a collection of geometric objects (points, lines, polygons,...) • organize them on disk, to answer spatial queries (like??)
C C. Faloutsos, J-y Pan, C. Faloutsos, C C. Faloutsos, J-y Pan, Jia-yu Pan, ...
(Q1) Find patterns in data • Human would say – Pattern 1: along diagonal – Pattern 2: along vertical axis • How to find these automatically?
• civil/automobile infrastructure (2008)
C. Faloutsos, Christos Faloutsos, C C. Faloutsos, Deepay Chakrabarti (cmu, Spiros Papadimitriou (cmu, ...
• Similarity search – distance functions
• primary key indexing • secondary key / multi-key indexing • spatial access methods – z-ordering – R-trees – misc • fractals – intro – applications • text 15-826 Copyright: C....
M. J. Carey, S. Ceri, P. Bernstein, U. Dayal, C. Faloutsos, J. C. Freytag, ...
is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks....
• Archive recovery • Conclusions (2008)
Christos Faloutsos, Recovery T. Haerder, A. Reuter, C. Faloutsos, C. Faloutsos, C. Faloutsos, ...
= unit of work, e.g., move $10 from savings to checking. Atomicity (all or none)
4) Distributed DBMSs 5) Parallel DBMSs: Gamma, Alphasort (2008)
Christos Faloutsos, C. Faloutsos, B Trees, B -trees, C. Faloutsos, C. Faloutsos, ...
• the most successful family of index
• Major contribution: LSI = Latent Semantic (2008)
C. Faloutsos, C. Faloutsos, Copyright C. Faloutsos, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • SVD: a powerful tool • multimedia
C. Faloutsos, C. Faloutsos, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • Singular Value Decomposition (SVD) • multimedia 15-826 Copyright: C. Faloutsos...
Outline Goal: ‘Find similar / interesting things’ (2008)
C. Faloutsos, Intro To Db, Sams Detailed Outline
• primary key indexing • secondary key / multi-key indexing • spatial access methods – problem dfn – z-ordering – R-trees – misc • fractals • text 15-826 Copyright: C. Faloutsos...
CMU SCS Indexing- Detailed outline (2008)
• primary key indexing • secondary key / multi-key indexing • spatial access methods – z-ordering – R-trees – misc • fractals – intro – applications • text
• primary key indexing • secondary key / multi-key indexing • spatial access methods – z-ordering – R-trees – misc • fractals – intro – applications • text 15-826 Copyright: C....
C. Faloutsos, Copyright C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods – problem dfn – z-ordering – R-trees • text
Christos Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos
2) Implementation: buffering, indexing, q-opt
• primary key indexing • secondary key / multi-key indexing • spatial access methods – z-ordering – R-trees – misc • fractals – intro – applications • text CMU SCS
C. Faloutsos, Intro To Db, C. Faloutsos
• primary key indexing • secondary key / multi-key indexing • spatial access methods – z-ordering – R-trees – misc • fractals – intro – applications • text CMU SCS
Christos Faloutsos, Sigmod Copyright, C. Faloutsos, Deepay Chakrabarti (cmu, Spiros Papadimitriou (cmu, ...
• Bursty traffic- fractals and multifractals
C. Faloutsos, S C. Faloutsos, S C. Faloutsos
www.cs.cmu.edu/~christos/TALKS/
C. Faloutsos, Ai Decision Trees, C. Faloutsos, C. Faloutsos, Intro To Db, C. Faloutsos, ...
– data warehouses; data cubes; OLAP – classifiers – association rules – misc. topics: • approximate counting
Power Laws, Christos Faloutsos, C. Faloutsos, C. Faloutsos, Prof Rong Jin, ...
• Goals / motivation: find patterns in large datasets: – (A) Sensor data – (B) network/graph data
Data Mining at CALD-CMU: Tools, Experiences and Research Directions (2008)
C. Faloutsos, G. Gibson, T. Mitchell, A. Moore, S. Thrun
We describe the data mining problems and solutions that we have encountered in the Center for Automated Learning and Discovery (CALD) at CMU. Specifically, we describe these settings and their...
• Financial, sales, economic series (2007)
Time Series Mining, Christos Faloutsos, Deepay Chakrabarti (cmu, Spiros Papadimitriou (cmu, Mengzhi Wang (cmu, ...
• Similarity search – distance functions
Detecting discriminative functional MRI activation patterns using space filling curves (2003)
Despina Kontos, Vasileios Megalooikonomou, D. Kontos, Christos Faloutsos, V. Megalooikonomou, Nilesh Ghubade, ...
INTRODUCTION The detection of relationships between human brain structures and brain functions (i.e., human brain mapping) has been recognized as one of the main goals of the Human Brain Project [1]....
Detecting Discriminative Functional MRI . . . (2003)
V. Megalooikonomou, N. Ghubade, C. Faloutsos
INTRODUCTION The detection of relationships between human brain structures and brain functions (i.e., human brain mapping) has been recognized as one of the main goals of the Human Brain Project [1]....
Online Data Mining for Co-Evolving Time Sequences (2000)
Byoung-Kee Yi, N. D. Sidiropoulos, T. Johnson, H. V. Jagadish, C. Faloutsos, ...
In many applications, the data of interest comprises multiple sequences that evolve over time. Examples include currency exchange rates, network traffic data, and demographic data on multiple...
Online data mining for co-evolving time sequences (2000)
H. V. Jagadish, C. Faloutsos, T. Johnson, A. Biliris
under Contract No. N66001-97-C-8517. Additional funding was provided by donations from NEC and Intel. Any opinions, findings, and conclusions or recommendations expressed in this material are those of...
Data Mining at CALD-CMU: Tools, Experiences and Research Directions (1997)
C. Faloutsos, G. Gibson, T. Mitchell, A. Moore, S. Thrun
We describe the data mining problems and solutions that we have encountered in the Center for Automated Learning and Discovery (CALD) at CMU. Specifically, we describe these settings and their...
A Signature Technique for Similarity-Based Queries (1997)
C. Faloutsos, H. V. Jagadish, A. O. Mendelzon, T. Milo
C. Faloutsos Univ. of Maryland christos@cs.umd.edu H. V. Jagadish AT&T Labs jag@research.att.com A. O. Mendelzon Univ. of Toronto mendel@db.toronto.edu T. Milo Tel Aviv Univ....
A Signature Technique for Similarity-Based Queries (Extended Abstract) (1997)
C. Faloutsos, H. V. Jagadish, A. O. Mendelzon, T. Milo
C. Faloutsos, H. V. Jagadish AT&T Bell Labs Murray Hill, NJ 07974 {christos,jag}@research.att.com A. O. Mendelzon Dept. of Computer Science Univ. of Toronto mendel@db.toronto.edu T. Milo Tel...
A Signature Technique for Similarity-Based Queries (Extended Abstract) (1997)
C. Faloutsos, H. V. Jagadish, A. O. Mendelzon, T. Milo
C. Faloutsos Univ. of Maryland christos@cs.umd.edu H. V. Jagadish AT&T Labs jag@research.att.com A. O. Mendelzon Univ. of Toronto mendel@db.toronto.edu T. Milo Tel Aviv Univ....
• problem #1: text- LSI: find ‘concepts’ (1997)
Christos Faloutsos, Flip Korn, H. V. Jagadish, Christos Faloutsos “efficiently, A. Labrinidis, C. Faloutsos, ...
www.cs.cmu.edu/~christos/PUBLICATIONS.OLDER/sigmod97.ps.gz
Efficient and Effective Querying by Image Content (1994)
C. Faloutsos, W. Equitz, M. Flickner, W. Niblack, D. Petkovic, R. Barber
In the QBIC (Query By Image Content) project we are studying methods to query large on-line image databases using the images' content as the basis of the queries. Examples of the content we use...
Christos Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos, ...
e.g., person.car_id=car.id • Goal: Maximize performance – Some methods require additional indexing, or are only useful on indexed fields