C. Faloutsos

Left Knee (2009)

J. Vanbriesen, C. Faloutsos, ...

SVD- quality • Remember – from SVD: • Q: can we find better ‘hidden variables’? • A: yes – with Independent Component Analysis (ICA) – see later (Q1) Find patterns in data • Motion...

Vector Space Model and (2009)

C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • multimedia 15-826 Copyright: C. Faloutsos (2007) 3

Hypothesis testing (Chi-square) (2009)

C. Faloutsos, Ai Decision Trees, Intro To Db

Approach (very intuitive): accept the hypothesis if the theoretical values are ‘close enough’ to the actual ones. Formally: Step 1: bucketize (how many? how wide?) Step 2: compute deviation of...
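
The two steps above can be sketched as follows. This is a minimal illustration, assuming the deviation is measured with the standard Pearson chi-square statistic (the snippet is truncated before it names the formula). The function names, the Zipf-like expected counts, and the sample data are hypothetical and not taken from the slides.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square: sum over buckets of (observed - expected)^2 / expected."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of buckets")
    if any(e <= 0 for e in expected):
        raise ValueError("every expected count must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def zipf_expected_counts(total, n_buckets):
    """Hypothetical helper: expected counts proportional to 1/rank."""
    weights = [1.0 / rank for rank in range(1, n_buckets + 1)]
    norm = sum(weights)
    return [total * w / norm for w in weights]


# Step 1: bucketize (here: three made-up buckets of observed counts).
observed = [60, 25, 15]
# Theoretical counts under the hypothesis being tested (assumed Zipf-like).
expected = zipf_expected_counts(sum(observed), len(observed))
# Step 2: compute the deviation between actual and theoretical values.
statistic = chi_square_statistic(observed, expected)
# 5.991 is the 5% critical value of chi-square with 2 degrees of freedom
# (3 buckets minus 1, assuming no parameters were estimated from the data).
accept = statistic < 5.991
```

The choice of bucket count and width (the open question in Step 1) changes the degrees of freedom, so the critical value has to be picked to match.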

Outline Goal: ‘Find similar / interesting things’ (2009)

Fractals Introduction, C. Faloutsos, Intro To Db

• primary key indexing • secondary key / multi-key indexing • spatial access methods – z-ordering – R-trees – misc • fractals – intro – applications • text ...

CMU SCS Data Mining- Detailed outline • Statistics (2009)

C. Faloutsos, Ai Decision Trees, C. Faloutsos

– data warehouses; data cubes; OLAP – classifiers – association rules – misc. topics: • clustering (revisited) • reconstruction of info

Applications (2009)

C. Faloutsos, Intro To Db, Ecgs Sound, C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text

and Data Mining Information recovery (2009)

C. Faloutsos, C C. Faloutsos, Christos Faloutsos

■ Qopt- selectivities ■ data warehousing ■ transaction recording systems (details: in tertiary store) ■ statistical/scientific db ■ data integration (partial info from many sources) 2....

CMU SCS Outline Goal: ‘Find similar / interesting things’ (2009)

C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text

Thanks to (2009)

C. Faloutsos, Ai Decision Trees, C. Faloutsos, C. Faloutsos, Chris Palmer (Vivisimo), C. Faloutsos, ...

– data warehouses; data cubes; OLAP – classifiers – association rules – misc. topics: • approximate counting

Spatial Access Methods- problem (2009)

Christos Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos

z-ordering- Detailed outline • What is the problem / S.A.M. • z-ordering – main idea- 3 methods – use w / B-trees; algorithms (range, knn queries) – non-point (e.g., region) data –...
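
As a rough illustration of the z-ordering main idea listed above, the sketch below computes a z-value by bit interleaving, which is one common way to map 2-d points to a single key that a B-tree can index. The outline mentions three methods without naming them in this snippet, so the choice of bit interleaving, the function name `z_value`, the bit order (x bit above y bit), and the `bits` parameter are all assumptions.

```python
def z_value(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one integer key.

    Assumed convention: at each level the x bit is placed above the y bit.
    """
    if not (0 <= x < (1 << bits) and 0 <= y < (1 << bits)):
        raise ValueError("coordinates must fit in the given number of bits")
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i + 1)  # x bit goes to the odd position
        z |= ((y >> i) & 1) << (2 * i)      # y bit goes to the even position
    return z


# Hypothetical points; sorting by z-value gives the order in which
# a one-dimensional index such as a B-tree would store them.
points = [(3, 3), (0, 1), (2, 0), (1, 1)]
ordered = sorted(points, key=lambda p: z_value(p[0], p[1]))
```

Points that are close in the grid tend to receive nearby z-values, which is what makes the one-dimensional key useful for range queries.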

Outline Goal: ‘Find similar / interesting things’ (2008)

C. Faloutsos, C. Faloutsos, Intro To Db, C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods – z-ordering – R-trees – misc • fractals – intro – applications • text

• Indexing- similarity search (2008)

C. Faloutsos, Sec Key Indexing, Intro To Db, Copyright C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods • text 15-826 Copyright: C. Faloutsos (2005) 3

Association Rules (2008)

Christos Faloutsos, Pods Christos Faloutsos, Ibrahim Kamel, C. Faloutsos, C. Faloutsos, C. Faloutsos

Montgomery county: • Q1: how many disk accesses (d.a.) for an R-tree? • Q2: distribution? • not uniform

CMU SCS Indexing- Detailed outline (2008)

Fractals Introduction, C. Faloutsos, C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods – z-ordering – R-trees – misc • fractals – intro – applications • text ...

Q: space overhead? (2008)

C. Faloutsos, C. Faloutsos, Intro To Db, C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • multimedia

Outline Goal: ‘Find similar / interesting things’ (2008)

C. Faloutsos, Intro To Db, Sams Detailed Outline

• primary key indexing • secondary key / multi-key indexing • spatial access methods – problem dfn – z-ordering – R-trees – misc • text

CMU SCS Indexing- Detailed outline (2008)

Cmu Scs, C. Faloutsos, C. Faloutsos, C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • multimedia

Tree Classifiers (2008)

C. Faloutsos, Ai Decision Trees, C. Faloutsos

– data warehouses; data cubes; OLAP – classifiers – association rules – misc. topics: • reconstruction of info • network databases; time sequence forecasting ...

CMU SCS Indexing- Detailed outline (2008)

C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods – problem dfn – z-ordering – R-trees – misc • fractals • text ...

• Indexing- similarity search • Data Mining (2008)

C. Faloutsos, C. Faloutsos, Ai Decision Trees, Copyright C. Faloutsos, C. Faloutsos

– data warehouses; data cubes; OLAP – classifiers – association rules – misc. topics: • reconstruction of info • network databases; time sequence forecasting

Problem (2008)

C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text

1 (2008)

Cmu Scs, C. Faloutsos, C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text

• Q: Why study (image/video) compression? (2008)

Cmu Scs, C. Faloutsos, C. Faloutsos

• primary key indexing • multimedia • Digital Signal Processing (DSP) tools • Image + video compression

SVD- Detailed outline (2008)

C. Faloutsos, C. Faloutsos, Intro To Db, C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • Singular Value Decomposition (SVD) • multimedia

CMU SCS Indexing- Detailed outline (2008)

C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods – problem dfn – z-ordering – R-trees – misc • text

Outline Goal: ‘Find similar / interesting things’ (2008)

C. Faloutsos, Ai Decision Trees, Intro To Db

– data warehouses; data cubes; OLAP – classifiers – association rules – misc. topics: • reconstruction of info • network databases; time sequence forecasting

Age Chol-level Gender … CLASS-ID (2008)

C. Faloutsos, Copyright C. Faloutsos, C. Faloutsos, C. Faloutsos

• Problem: Classification, i.e., • given a training set (N tuples, with M attributes, plus a label attribute) • find rules to predict the label for newcomers. Pictorially:

Hypothesis testing (Chi-square) (2008)

C. Faloutsos

Problem: is it true that the salary distribution is Zipf?

• Conclusions (2008)

Christos Faloutsos, C. Faloutsos, C. Faloutsos, Wang+ Mengzhi Wang, Tara Madhyastha, Ngai Hang, ...

2) Implementation: buffering, indexing, q-opt 7) Data Analysis- data mining sensors, time series, indexing and wavelets sensors and forecasting 8) Benchmarks 9) vision statements extras...

CMU SCS Thanks to (2008)

C. Faloutsos, C. Faloutsos, Ai Decision Trees, C. Faloutsos, C. Palmer, C. Faloutsos, ...

– data warehouses; data cubes; OLAP – classifiers – association rules – misc. topics: • approximate counting

Vector Space Model and Clustering (2008)

C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • multimedia

CMU SCS Decision Trees (2008)

C. Faloutsos, C. Faloutsos, Intro To Db, C. Faloutsos

• Problem: Classification, i.e., • given a training set (N tuples, with M attributes, plus a label attribute) • find rules to predict the label for newcomers. Pictorially: ...

2 (2008)

C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods • text

• Definition- properties • Interpretation • Complexity (2008)

C. Faloutsos, C. Faloutsos, Copyright C. Faloutsos, C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text

Data Ware-housing (2008)

C. Faloutsos, Ai Decision Trees, C. Faloutsos

– data warehouses; data cubes; OLAP – classifiers – association rules – misc. topics: • reconstruction of info • network databases; time sequence forecasting ...

Tamer; Ihab (2008)

Power Laws, C. Faloutsos, Christos Faloutsos, Deepayan Chakrabarti (CMU/Yahoo), Michalis Faloutsos (UCR), George Siganos (UCR), ...

Applications of sensors/streams • ‘Smart house’: monitoring temperature, humidity etc • Financial, sales, economic series

1 (2008)

C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods – problem dfn – z-ordering – R-trees • text

• Q: Why study (image/video) compression? (2008)

Cmu Scs, C. Faloutsos, Fractal Compression, Intro To Db

• primary key indexing • multimedia • Digital Signal Processing (DSP) tools

• Major contribution: LSI = Latent Semantic Indexing (2008)

Cmu Scs, C. Faloutsos, C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • SVD: a powerful tool • multimedia

• Definition- properties • Interpretation • Complexity (2008)

Cmu Scs, C. Faloutsos, Copyright C. Faloutsos, C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • Singular Value Decomposition (SVD) • multimedia

– Problem definition- Spatial Access Methods (2008)

Christos Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos

• Given a collection of geometric objects (points, lines, polygons,...) • organize them on disk, to answer spatial queries (like??)

American (2008)

C C. Faloutsos, J-y Pan, C. Faloutsos, C C. Faloutsos, J-y Pan, Jia-yu Pan, ...

(Q1) Find patterns in data • Human would say – Pattern 1: along diagonal – Pattern 2: along vertical axis • How to find these automatically?

1 (2008)

C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods – z-ordering – R-trees – misc • fractals – intro – applications • text ...

Editorial Board (2008)

M. J. Carey, S. Ceri, P. Bernstein, U. Dayal, C. Faloutsos, J. C. Freytag, ...

is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks....

• Archive recovery • Conclusions (2008)

Christos Faloutsos, Recovery T. Haerder, A. Reuter, C. Faloutsos, C. Faloutsos, C. Faloutsos, ...

= unit of work, e.g., move $10 from savings to checking. Atomicity (all or none)
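
The all-or-none property can be illustrated with the savings-to-checking example above. This is only an in-memory sketch: the `transfer` function, the dictionary of balances, and the snapshot-based rollback are assumptions made for illustration, not the recovery mechanism described in the slides.

```python
def transfer(accounts, src, dst, amount):
    """Move `amount` from src to dst; on any failure, restore all balances."""
    snapshot = dict(accounts)  # hypothetical stand-in for a before-image
    try:
        if accounts[src] < amount:
            raise ValueError("insufficient funds")
        accounts[src] -= amount
        accounts[dst] += amount  # raises KeyError if dst does not exist
    except Exception:
        # Undo partial work so that none of the transfer is visible.
        accounts.clear()
        accounts.update(snapshot)
        raise


accounts = {"savings": 100, "checking": 20}
transfer(accounts, "savings", "checking", 10)  # both updates are applied

failed = False
try:
    transfer(accounts, "savings", "missing", 10)  # fails after the debit
except KeyError:
    failed = True  # the debit has been rolled back
```

After the failed call the balances are the same as before it, which is the "none" half of all-or-none.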

• Major contribution: LSI = Latent Semantic (2008)

C. Faloutsos, C. Faloutsos, Copyright C. Faloutsos, C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • SVD: a powerful tool • multimedia

2 (2008)

C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods • fractals • text • Singular Value Decomposition (SVD) • multimedia ...

Outline Goal: ‘Find similar / interesting things’ (2008)

C. Faloutsos, Intro To Db, Sams Detailed Outline

• primary key indexing • secondary key / multi-key indexing • spatial access methods – problem dfn – z-ordering – R-trees – misc • fractals • text ...

CMU SCS Indexing- Detailed outline (2008)

C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods – z-ordering – R-trees – misc • fractals – intro – applications • text

1 (2008)

C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods – z-ordering – R-trees – misc • fractals – intro – applications • text ...

• Data Mining (2008)

C. Faloutsos, Copyright C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods – problem dfn – z-ordering – R-trees • text

CMU SCS (2008)

C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods – z-ordering – R-trees – misc • fractals – intro – applications • text

• Data Mining (2008)

C. Faloutsos, Intro To Db, C. Faloutsos

• primary key indexing • secondary key / multi-key indexing • spatial access methods – z-ordering – R-trees – misc • fractals – intro – applications • text

CMU SCS Thanks to (2008)

C. Faloutsos, Ai Decision Trees, C. Faloutsos, C. Faloutsos, Intro To Db, C. Faloutsos, ...

– data warehouses; data cubes; OLAP – classifiers – association rules – misc. topics: • approximate counting

Sunspot Data (2008)

Power Laws, Christos Faloutsos, C. Faloutsos, C. Faloutsos, Prof Rong Jin, ...

• Goals / motivation: find patterns in large datasets: – (A) Sensor data – (B) network/graph data

Data Mining at CALD-CMU: Tools, Experiences and Research Directions (2008)

C. Faloutsos, G. Gibson, T. Mitchell, A. Moore, S. Thrun

We describe the data mining problems and solutions that we have encountered in the Center for Automated Learning and Discovery (CALD) at CMU. Specifically, we describe these settings and their...

Detecting discriminative functional MRI activation patterns using space filling curves (2003)

Despina Kontos, Vasileios Megalooikonomou, D. Kontos, Christos Faloutsos, V. Megalooikonomou, Nilesh Ghubade, ...

INTRODUCTION The detection of relationships between human brain structures and brain functions (i.e., human brain mapping) has been recognized as one of the main goals of the Human Brain Project [1]....

Detecting Discriminative Functional MRI . . . (2003)

V. Megalooikonomou, N. Ghubade, C. Faloutsos

INTRODUCTION The detection of relationships between human brain structures and brain functions (i.e., human brain mapping) has been recognized as one of the main goals of the Human Brain Project [1]....

Online Data Mining for Co-Evolving Time Sequences (2000)

Byoung-Kee Yi, N. D. Sidiropoulos, T. Johnson, H. V. Jagadish, C. Faloutsos, ...

In many applications, the data of interest comprises multiple sequences that evolve over time. Examples include currency exchange rates, network traffic data, and demographic data on multiple...

Online data mining for co-evolving time sequences (2000)

H. V. Jagadish, C. Faloutsos, T. Johnson, A. Biliris

under Contract No. N66001-97-C-8517. Additional funding was provided by donations from NEC and Intel. Any opinions, findings, and conclusions or recommendations expressed in this material are those of...

Data Mining at CALD-CMU: Tools, Experiences and Research Directions (1997)

C. Faloutsos, G. Gibson, T. Mitchell, A. Moore, S. Thrun

We describe the data mining problems and solutions that we have encountered in the Center for Automated Learning and Discovery (CALD) at CMU. Specifically, we describe these settings and their...

A Signature Technique for Similarity-Based Queries (1997)

C. Faloutsos, H. V. Jagadish, A. O. Mendelzon, T. Milo

C. Faloutsos Univ. of Maryland christos@cs.umd.edu H. V. Jagadish AT&T Labs jag@research.att.com A. O. Mendelzon Univ. of Toronto mendel@db.toronto.edu T. Milo Tel Aviv Univ....

A Signature Technique for Similarity-Based Queries (Extended Abstract) (1997)

C. Faloutsos, H. V. Jagadish, A. O. Mendelzon, T. Milo

C. Faloutsos, H. V. Jagadish AT&T Bell Labs Murray Hill, NJ 07974 {christos,jag}@research.att.com A. O. Mendelzon Dept. of Computer Science Univ. of Toronto mendel@db.toronto.edu T. Milo Tel...

A Signature Technique for Similarity-Based Queries (Extended Abstract) (1997)

C. Faloutsos, H. V. Jagadish, A. O. Mendelzon, T. Milo

C. Faloutsos Univ. of Maryland christos@cs.umd.edu H. V. Jagadish AT&T Labs jag@research.att.com A. O. Mendelzon Univ. of Toronto mendel@db.toronto.edu T. Milo Tel Aviv Univ....

Efficient and Effective Querying by Image Content (1994)

C. Faloutsos, W. Equitz, M. Flickner, W. Niblack, D. Petkovic, R. Barber

In the QBIC (Query By Image Content) project we are studying methods to query large on-line image databases using the images' content as the basis of the queries. Examples of the content we use...

• Join Processing – what is it? • Traditional Join Algorithms – Nested-loops – Sort-merge – Hash-join (1986)

Christos Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos, C. Faloutsos, ...

e.g., person.car_id=car.id • Goal: Maximize performance – Some methods require additional indexing, or are only useful on indexed fields
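
To make the join condition person.car_id=car.id concrete, here is a small sketch of two of the traditional algorithms named in the title, nested-loops and hash-join (sort-merge is omitted for brevity). The in-memory list-of-dicts tables, the sample rows, and the function names are hypothetical; a real system works on disk pages, where the performance trade-offs actually arise.

```python
from collections import defaultdict


def nested_loops_join(persons, cars):
    """Compare every person with every car: O(len(persons) * len(cars))."""
    return [(p, c) for p in persons for c in cars if p["car_id"] == c["id"]]


def hash_join(persons, cars):
    """Build a hash table on car.id, then probe it once per person."""
    buckets = defaultdict(list)
    for c in cars:
        buckets[c["id"]].append(c)
    return [(p, c) for p in persons for c in buckets.get(p["car_id"], [])]


# Made-up sample data; the third person has no matching car.
persons = [
    {"name": "Ann", "car_id": 1},
    {"name": "Bob", "car_id": 2},
    {"name": "Cy", "car_id": 9},
]
cars = [{"id": 1, "model": "A"}, {"id": 2, "model": "B"}]

nl_result = nested_loops_join(persons, cars)
hj_result = hash_join(persons, cars)
```

Both functions return the same pairs; the hash-based version avoids rescanning the car table for each person, at the cost of memory for the hash table.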