Xindong Wu

Publication List Details

Period: 1993 - 2009

Number: 73

Conceptual Equivalence for Contrast Mining in Classification Learning (2009)

Ying Yang, Xindong Wu, Xingquan Zhu

Learning often occurs through comparing. In classification learning, in order to compare data groups, most existing methods compare either raw instances or learned classification rules against each...

Parameter Tuning for Induction Algorithm Oriented Feature Elimination (2009)

Ying Yang, Xindong Wu

Abstract. This paper presents an analysis of parameter tuning for induction algorithm oriented feature elimination (IAOFE), an approach that takes into consideration not only the data and the target...

Error Detection and Impact-Sensitive Instance Ranking in Noisy Datasets (2009)

Xingquan Zhu, Xindong Wu, Ying Yang

Given a noisy dataset, how to locate erroneous instances and attributes and rank suspicious instances based on their impacts on the system performance is an interesting and important research issue....

User-Centered Biological Information Location by Combining User Profiles and Domain Knowledge (2008)

Xindong Wu, Jeffrey E. Stone, Marc Greenblatt

To aid researchers in obtaining, organizing and managing biological data, we have designed an intelligent digital library system that utilizes advanced data mining techniques. Our digital library...

Mining Video Associations for Efficient Database Management (2008)

Xingquan Zhu, Xindong Wu

To support more efficient video database management, this paper explores the concept of video association mining, with which the association patterns are characterized by sequentially associated...

A Semantic Network for Modeling Biological Knowledge in Multiple Databases (2008)

Jeffrey Stone, Xindong Wu

We have developed a semantic network of biological terminology to aid in the retrieval and integration of biological information from a variety of disparate information sources. Our semantic network...

Efficient string matching with wildcards and length constraints (2008)

Gong Chen, Xindong Wu, Xingquan Zhu

DOI 10.1007/s10115-006-0016-8 (regular paper)

LARGE SCALE DATA MINING BASED ON DATA PARTITIONING (2008)

Shichao Zhang, Xindong Wu

Dealing with very large databases is one of the defining challenges in data mining research and development. Some databases are simply too large (e.g., with terabytes of data) to be processed at one...

Parameter Tuning for Induction Algorithm Oriented Feature Elimination (2008)

Ying Yang, Xindong Wu

Abstract. This paper presents an analysis of parameter tuning for induction algorithm oriented feature elimination (IAOFE), an approach that takes into consideration not only the data and the target...

ELAPSED TIME IN HUMAN GAIT RECOGNITION: A NEW APPROACH (2008)

Dacheng Tao, Xuelong Li, Xindong Wu, Steve Maybank

Human gait is an effective biometric source for human identification and visual surveillance; therefore, human gait recognition has become a hot topic in recent research. However, the elapsed time...

Mining in Anticipation for Concept Change: Proactive-Reactive Prediction in Data Streams (2008)

Ying Yang, Xindong Wu, Xingquan Zhu

Abstract. Prediction in streaming data is an important activity in the modern society. Two major challenges posed by data streams are (1) the data may grow without limit so that it is difficult to...

Effective Classification of Noisy Data Streams with Attribute-Oriented Dynamic Classifier Selection (2008)

Xingquan Zhu, Xindong Wu, Ying Yang

Abstract. Recently, mining from data streams has become an important and challenging task for many real-world applications such as credit card fraud protection and sensor networking. One popular...

Mining maximal frequent itemsets from data streams (2008)

Guojun Mao, Xindong Wu, Xingquan Zhu, Gong Chen, Chunnian Liu, ...

An Empirical Study of the Noise Impact on Cost-Sensitive Learning (2008)

Xingquan Zhu, Xindong Wu, Taghi M. Khoshgoftaar, Yong Shi

In this paper, we perform an empirical study of the impact of noise on cost-sensitive (CS) learning, through observations on how a CS learner reacts to the mislabeled training examples in terms of...

Announcements (2008)

Multi-database Mining, Shichao Zhang, Xindong Wu, Chengqi Zhang, Smart Distance, Information Systems, ...

Intelligence (TCCI) of the IEEE Computer Society deals with tools and systems using biologically and linguistically motivated computational paradigms such as artificial neural

Locating White Box Reuse via Data Mining (2007)

Margot Postema, Heinz Schmidt, Xindong Wu

A large percentage of white box reuse can occur within software systems. Once these reused components are located, they can be restructured to black box components. In structured systems, the black...

A Study of Causal Discovery With Weak Links and Small Samples (2007)

Honghua Dai, Kevin Korb, Chris Wallace, Xindong Wu

Weak causal relationships and small sample size pose two significant difficulties to the automatic discovery of causal models from observational data. This paper examines the influence of weak causal...

Feature Article: Multi-Database Mining (2007)

Shichao Zhang, Xindong Wu, Chengqi Zhang

Abstract — Multi-database mining is an important research area because (1) there is an urgent need for analyzing data in different sources, (2) there are essential differences between mono- and...

Digital Object Identifier (DOI) 10.1007/s00530-003-0076-5 Multimedia Systems (2003) (2007)

Xingquan Zhu, Jianping Fan, Ahmed K. Elmagarmid, Xindong Wu

Video is increasingly the medium of choice for a variety of communication channels, resulting primarily from increased levels of networked multimedia systems. One way to keep our heads above the video sea is to provide...

Top 10 algorithms in data mining (2007)

Xindong Wu, Vipin Kumar, J. Ross Quinlan, Joydeep Ghosh, Qiang Yang, Hiroshi Motoda, ...

Abstract. This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank,...

DOI 10.1007/s10115-007-0114-2 (survey paper)

Asymmetric bagging and random subspace for support vector machines-based relevance feedback in image retrieval (2006)

Dacheng Tao, Xiaoou Tang, Xuelong Li, Xindong Wu

Relevance feedback schemes based on support vector machines (SVM) have been widely used in content-based image retrieval (CBIR). However, the performance of SVM-based relevance feedback is often poor...

Human carrying status in visual surveillance (2006)

Dacheng Tao, Xuelong Li, Xindong Wu, Stephen J. Maybank

A person’s gait changes when he or she is carrying an object such as a bag, suitcase or rucksack. As a result, human identification and tracking are made more difficult because the averaged gait...

Mining sequential patterns across data streams (2005)

Gong Chen, Xindong Wu, Xingquan Zhu

Abstract. There are extensive endeavors toward mining frequent items or itemsets in a single data stream, but rare efforts have been made to explore sequential patterns among literals in different...

Supervised tensor learning (2005)

Dacheng Tao, Xuelong Li, Weiming Hu, Stephen Maybank, Xindong Wu

This paper aims to take general tensors as inputs for supervised learning. A supervised tensor learning (STL) framework is established for convex optimization based learning techniques such as...

Sequential pattern mining in multiple streams (2005)

Gong Chen, Xindong Wu, Xingquan Zhu

In this paper, we deal with mining sequential patterns in multiple data streams. Building on a state-of-the-art sequential pattern mining algorithm PrefixSpan for mining transaction databases, we...

A decremental algorithm for maintaining frequent itemsets in dynamic databases (2005)

Shichao Zhang, Xindong Wu, Jilian Zhang, Chengqi Zhang

Abstract. Data mining and machine learning must confront the problem of pattern maintenance because data updating is a fundamental operation in data management. Most existing data-mining algorithms...

Exploring video content structure for hierarchical summarization (2004)

Xingquan Zhu, Xindong Wu, Jianping Fan, Ahmed K. Elmagarmid, Walid G. Aref

Abstract. In this paper, we propose a hierarchical video summarization strategy that explores video content structure to provide the users with a scalable, multilevel video summary. First,...

Digital Object Identifier (DOI) 10.1007/s00530-004-0142-7

Dynamic Classifier Selection for Effective Mining from Noisy Data Streams (2004)

Xingquan Zhu, Xindong Wu, Ying Yang

Recently, mining from data streams has become an important and challenging task for many real-world applications such as credit card fraud protection and sensor networking. One popular solution is to...

Dealing with predictive-but-unpredictable attributes in noisy data sources (2004)

Ying Yang, Xindong Wu, Xingquan Zhu

Abstract. Attribute noise can affect classification learning. Previous work in handling attribute noise has focused on those predictable attributes that can be predicted by the class and other...

Cost-guided Class Noise Handling for Effective Cost-sensitive Learning (2004)

Xingquan Zhu, Xindong Wu

Recent research in machine learning, data mining and related areas has produced a wide variety of algorithms for cost-sensitive (CS) classification, where instead of maximizing the classification...

Eliminating class noise in large datasets (2003)

Xingquan Zhu, Xindong Wu, Qijun Chen

This paper presents a new approach for identifying and eliminating mislabeled instances in large or distributed datasets. We first partition a dataset into subsets, each of which is small enough to...

Association analysis with one scan of databases (2002)

Hao Huang, Xindong Wu, Richard Relue

Mining frequent patterns with an FP-tree avoids costly candidate generation and repeated occurrence frequency checking against the support threshold. It therefore achieves better performance and...

Building Intelligent Learning Database Systems (2000)

Xindong Wu

Induction and deduction are two opposite operations in data mining applications. Induction extracts knowledge in the form of, say, rules or decision trees from existing data, and deduction applies...

Aggregation of Association Rules (1999)

Shichao Zhang, Xindong Wu

Dealing with very large databases is one of the defining challenges in data mining research and development. Some databases are simply too large (e.g., with terabytes of data) to be processed at one...

Association Analysis with One Scan of Databases (1998)

Hao Huang, Xindong Wu, Richard Relue

Mining frequent patterns with an FP-tree avoids costly candidate generation and repeated occurrence frequency checking against the support threshold. It therefore achieves better performance and...

Multi-Layer Incremental Induction (1998)

Xindong Wu

This paper describes a multi-layer incremental induction algorithm, MLII, which is linked to an existing nonincremental induction algorithm to learn incrementally from noisy data. MLII makes use of...

Rule Induction with Extension Matrices (1998)

Xindong Wu

This paper presents a heuristic, attribute-based, noise-tolerant data mining program, HCV (Version 2.0), based on the newly-developed extension matrix approach. By dividing the positive examples (PE)...

A Decision Support Tool for Tuning Parameters in a Machine Learning Algorithm (1997)

Margot Postema, Tim Menzies, Xindong Wu

Many machine learning algorithms require parameter tuning in order to adapt them to the particulars of a training set. This tuning task can be an expert task in its own right. Based on our...

A Study of Causal Discovery With Weak Links and Small Samples (1997)

Honghua Dai, Kevin Korb, Chris Wallace, Xindong Wu

Weak causal relationships and small sample size pose two significant difficulties to the automatic discovery of causal models from observational data. This paper examines the influence of weak causal...

Object-Oriented Modeling of Rule-Based Programming (1997)

Xindong Wu, Xiaoya Lin

Domain expertise always comprises a set of concepts and the logical relationships between them. In rule-based programming, rules which describe logical relationships are the fundamental knowledge...

The Use and Acquisition of Explicit Ontologies in KEshell (1996)

Xindong Wu

A schematic description of domain knowledge, called an ontology [van Heijst et al. 96], is very useful in facilitating and formalising the knowledge acquisition process in knowledge-based systems...

Noise Handling with Extension Matrices (1996)

Xindong Wu, Johan Krisár, Petter Mahlén

HCV is a heuristic attribute-based induction algorithm based on the newly-developed extension matrix approach. By dividing the positive examples (PE) of a specific class in a given example set into...

A Tuning Aid to Improve Deduction of Induction Results (1996)

Margot Postema, Xindong Wu, Tim Menzies

This paper examines where a tuning aid can be useful to improve deduction of induction results. Different discretization methods use different strategies to set up the borders for continuous...

Noise Handling With Extension Matrices (1996)

Xindong Wu

HCV is a heuristic attribute-based induction algorithm based on the newly-developed extension matrix approach. By dividing the positive examples (PE) of a specific class in a given example set into...

A Bayesian Discretizer for Real-Valued Attributes (1996)

Xindong Wu

Discretization of real-valued attributes into nominal intervals has been an important area for symbolic induction systems because many real world classification tasks involve both symbolic and...

A Comparison of Objects with Frames and OODBs (1995)

Xindong Wu

Objects and frames are two powerful technologies used in Software Engineering and Artificial Intelligence respectively. Object-oriented databases are currently one of the most important research...

Noise Handling with Extension Matrixes (1995)

Xindong Wu, Johan Krisár, Petter Mahlén

HCV is a heuristic attribute-based induction algorithm based on the newly-developed extension matrix approach. By dividing the positive examples (PE) of a specific class in a given example set into...

Knowledge Objects (1995)

Xindong Wu, Sita Ramakrishnan, Heinz Schmidt

Abstraction and encapsulation. Abstraction is the principle of capturing useful information by ignoring all the detailed features of an entity that are not relevant to understanding what it does or what it...

SIKT: A Structured Interactive Knowledge Transfer Program (1995)

Xindong Wu

Facilitating and formalising the process of specification acquisition in software development is an important problem in the automatic generation of software. This paper presents a structured...

Knowledge Objects (1995)

Xindong Wu, Sita Ramakrishnan, Heinz Schmidt

One of the fundamental differences between AI research and conventional computer science (such as software engineering and database technology) is that AI has its own established programming...

Fuzzy Interpretation of Induction Results (1995)

Xindong Wu, Petter Mahlén

When applying rules induced from training examples to a test example, there are three possible cases which demand different actions: (1) no match, (2) single match, and (3) multiple match. Existing...

Rule Schema + Rule Body: A 2-Level Representation Language (1994)

Xindong Wu

This paper presents an alternative representation language, rule schema + rule body, to rule-based production systems based on an integration of rule-based and numeric computations. Rule schemata in...

Extracting Rule Schemas from Rules for an Intelligent Learning Database System (1994)

Geoff Sutcliffe, Xindong Wu

A software module for extracting rule schemas from rules, in the context of an intelligent learning data base system (ILDB), is described. The ILDB system employs a two level knowledge representation...

Knowledge Acquisition from Data Bases (1993)

Xindong Wu

Knowledge acquisition from databases is a research frontier for both data base technology and machine learning (ML) techniques, and has seen sustained research over recent years. It also acts as a link...

10 CHALLENGING PROBLEMS IN DATA MINING RESEARCH

Qiang Yang, Xindong Wu

In October 2005, we took an initiative to identify 10 challenging problems in data mining research, by consulting some of the most active researchers in data mining and machine learning for their...