Haixun Wang

Publication List Details

Period

1999 - 2009

Number

109

Co-Authors

Weighted Proximity Best-Joins for Information Retrieval (2009)

Risi Thonangi, Hao He, Anhai Doan, Haixun Wang

We consider the problem of efficiently computing weighted proximity best-joins over multiple lists, with applications in information retrieval and extraction. We are given a...

A Monte Carlo Sampling Framework for Information Recovery (2009)

Junyi Xie, Jun Yang, Yuguo Chen, Haixun Wang, Philip S. Yu

There has been a recent resurgence in research related to noisy and incomplete data. Many applications require information to be recovered from imperfect data. For example, in sensor data processing,...

Efficiently Answering Reachability Queries on Very Large Directed Graphs (2009)

Ruoming Jin, Yang Xiang, Ning Ruan, Haixun Wang

Efficiently processing queries against very large graphs is an important research topic largely driven by emerging real world applications, as diverse as XML databases, GIS, web mining, social...

Lock-Free Consistency Control for Web 2.0 Applications (2009)

Jiangming Yang, Haixun Wang, Ning Gu, Yiming Liu, Chunsong Wang, Qiwei Zhang

Online collaboration and sharing is the central theme of many web-based services that create the so-called Web 2.0 phenomena. Using the Internet as a computing platform, many Web 2.0 applications set...

Stop Chasing Trends: Discovering High Order Models in Evolving Data (2009)

Shixi Chen, Haixun Wang, Shuigeng Zhou, Philip S. Yu

Many applications are driven by evolving data: patterns in web traffic, program execution traces, network event logs, etc., are often non-stationary. Building prediction models for...

Load Shedding in Classifying Multi-Source Streaming Data: A Bayes Risk Approach (2009)

Yijian Bai, Haixun Wang, Carlo Zaniolo

Monitoring multiple streaming sources for collective decision making presents several challenges. First, streaming data are often of large volume, fast speed, and highly bursty nature. Second, it is...

Efficiently Answering Reachability Queries on Very Large Directed Graphs (2009)

Ruoming Jin, Yang Xiang, Ning Ruan, Haixun Wang

Efficiently processing queries against very large graphs is an important research topic largely driven by emerging real world applications, as diverse as XML databases, GIS, web mining, social...

Load Shedding in Classifying Multi-Source Streaming Data: A Bayes Risk Approach (2009)

Yijian Bai, Haixun Wang

Monitoring multiple streaming sources for collective decision making presents several challenges. First, streaming data are often of large volume, fast speed, and highly bursty nature. Second, it is...

Estimating the Selectivity of XML Path Expression with predicates by Histograms (2008)

Yu Wang, Haixun Wang, Xiaofeng Meng, Shan Wang

Selectivity estimation of path expressions in querying XML data plays an important role in query optimization. A path expression may contain multiple branches with predicates, each of which...

Providing Freshness Guarantees for Outsourced Databases (2008)

Min Xie, Haixun Wang

Database outsourcing becomes increasingly attractive as advances in network technologies eliminate the perceived performance difference between in-house databases and outsourced databases, and price...

Industrial and Government Track Short Paper: Event Summarization for System Management (2008)

Wei Peng, Tao Li, Haixun Wang

In system management applications, an overwhelming amount of data are generated and collected in the form of temporal events. While mining temporal event data to discover interesting and frequent...

Integrity Auditing of Outsourced Data (2008)

Min Xie, Haixun Wang

An increasing number of enterprises outsource their IT services to third parties who can offer these services for a much lower cost due to economy of scale. Quality of service is a major concern in...

Integrity Auditing of Outsourced Data (2008)

Min Xie, Haixun Wang

An increasing number of enterprises outsource their IT functions or business processes to third-parties who offer these services with a lower cost due to the economy of scale. Quality of service has...

A Sampling-Based Approach to Information Recovery (2008)

Junyi Xie, Jun Yang, Yuguo Chen, Haixun Wang, Philip S. Yu

There has been a recent resurgence of interest in research on noisy and incomplete data. Many applications require information to be recovered from such data. Ideally, an approach for...

A fully distributed framework for cost-sensitive data mining (2008)

Wei Fan, Haixun Wang, Philip S. Yu

In this paper, we propose a fully distributed system (as compared to centralized and partially distributed systems) for cost-sensitive data mining. Experimental results have shown that this approach...

Inductive Learning in Less Than One Sequential Data Scan (2008)

Wei Fan, Haixun Wang, Philip S. Yu

Most recent research of scalable inductive learning on very large dataset, decision tree construction in particular, focuses on eliminating memory constraints and reducing the number of sequential...

Pattern-based Similarity Search for Microarray Data (2008)

Haixun Wang

One fundamental task in near-neighbor search as well as other similarity matching efforts is to find a distance function that can efficiently quantify the similarity between two objects in a...

Streams and Stream-based Processing Query Languages and Data Models for Database Sequences and Data Streams (2008)

Yan-nei Law, Haixun Wang, Carlo Zaniolo

We study the fundamental limitations of relational algebra (RA) and SQL in supporting sequence and stream queries, and present effective query language and data model enrichments to deal with them....

A Random Method for Quantifying Changing Distributions in Data Streams (2008)

Haixun Wang, Jian Pei

In applications such as fraud and intrusion detection, it is of great interest to measure the evolving trends in the data. We consider the problem of quantifying changes between two...

Research Track Poster: Suppressing Model Overfitting in Mining Concept-Drifting Data Streams (2008)

Haixun Wang, Jian Yin, Philip S. Yu, Jeffrey Xu Yu

Mining data streams of changing class distributions is important for real-time business decision support. The stream classifier must evolve to reflect the current class distribution. This poses a...

A Balanced Ensemble Approach to Weighting Classifiers for Text Classification (2008)

Gabriel Pui Cheong Fung, Jeffrey Xu Yu, Haixun Wang, David W. Cheung, Huan Liu

This paper studies the problem of constructing an effective heterogeneous ensemble classifier for text classification. One major challenge of this problem is to formulate a good combination function,...

Active Mining of Data Streams (2008)

Wei Fan, Yi-an Huang, Haixun Wang, Philip S. Yu

Most previously proposed mining methods on data streams make an unrealistic assumption that “labelled” data stream is readily available and can be mined at anytime. However, in most real-world...

An Improved Biclustering Method for Analyzing Gene Expression Profiles. International Journal on Artificial Intelligence Tools (2008)

Jiong Yang, Haixun Wang, Wei Wang, Philip S. Yu

Microarrays are one of the latest breakthroughs in experimental molecular biology, which provide a powerful tool by which the expression patterns of thousands of genes can be monitored simultaneously...

The Deductive Database System LDL++. Under consideration for publication in Theory and Practice of Logic Programming (2008)

Faiz Arni, Shalom Tsur, Haixun Wang, Carlo Zaniolo

This paper describes the LDL++ system and the research advances that have enabled its design and development. We begin by discussing the new nonmonotonic and nondeterministic constructs that extend...

The ATLaS System and its Powerful Database Language Based on Simple Extensions of SQL (Extended Abstract) (2008)

Haixun Wang

A lack of power and extensibility in their query languages has seriously limited the generality of DBMSs and hampered their ability to support new applications domains, such as datamining. In this...

Incremental learning (2008)

Yun Chi, Haixun Wang, Philip S. Yu, Richard R. Muntz

Information Systems Catch the moment: maintaining closed frequent itemsets over a data stream sliding window

Catch the Moment: Maintaining Closed Frequent Itemsets Over Data Stream (2008)

Yun Chi, Philip S. Yu, Haixun Wang, Richard R. Muntz

This paper considers the problem of mining closed frequent itemsets over a data stream sliding window using limited memory space. We design a synopsis data structure to monitor transactions in the...

Fast computing reachability labelings for large graphs with high compression rate (2008)

Jiefeng Cheng, Jeffrey Xu Yu, Xuemin Lin, Haixun Wang, Philip S. Yu

The need of processing graph reachability queries stems from many applications that manage complex data as graphs. The applications include transportation network, Internet traffic...

Database System Extensions for Decision Support: the AXL Approach (2007)

Haixun Wang, Carlo Zaniolo

Research on database-centric data mining is seeking to improve the effectiveness of database systems in decision support applications. Different solutions are now used for different problems, including (i)...

Implementation of XY Stratification: An Extension to LDL++ (2007)

Haixun Wang

The problem of allowing non-monotonic constructs, such as negation and aggregates, in recursive programs represents a difficult challenge faced by current research in deductive...

User Defined Aggregates in LDL++ (2007)

Haixun Wang

The reason why aggregate is important is twofold. One is that aggregate in deductive database systems introduces a situation that is very similar to negation, since before any aggregate...

Abstract (2007)

Haixun Wang, Carlo Zaniolo

A lack of power and extensibility in their query languages has seriously limited the generality of DBMSs and hampered their ability to support new application domains. Considerable efforts by...

Improving Performance of Bicluster Discovery in a Large Data Set (2007)

Jiong Yang, Wei Wang, Haixun Wang, Philip Yu

Microarrays are one of the latest breakthroughs in experimental molecular biology, which provide a powerful tool by which the expression patterns of thousands of genes can be monitored simultaneously...

The Deductive Database System LDL++. Under consideration for publication in Theory and Practice of Logic Programming (2007)

Faiz Arni, Shalom Tsur, Haixun Wang, Carlo Zaniolo

This paper describes the LDL++ system and the research advances that have enabled its design and development. We begin by discussing the new nonmonotonic and nondeterministic constructs that extend...

The ATLaS System and its Powerful Database Language Based on Simple Extensions of SQL (Extended Abstract) (2007)

Haixun Wang

A lack of power and extensibility in their query languages has seriously limited the generality of DBMSs and hampered their ability to support new applications domains, such as datamining. In this...

1 (2007)

Wei Fan, Haixun Wang, Philip S. Yu, Shaw-hwa Lo, Salvatore Stolfo

Presently, inductive learning is still performed in a frustrating batch process. The user has little interaction with the system and no control over the final accuracy and training time. If the...

Inductive Learning in Less Than One Sequential Data Scan (2007)

Wei Fan, Haixun Wang, Philip S. Yu

Most recent research of scalable inductive learning on very large dataset, decision tree construction in particular, focuses on eliminating memory constraints and reducing the number of sequential...

The s2-tree: An index structure for subsequence matching of spatial objects (2007)

Haixun Wang, Chang-shing Perng

We present the S²-Tree, an indexing method for subsequence matching of spatial objects. The S²-Tree locates subsequences within a collection of spatial sequences, i.e., sequences made...

Mining Concept-Drifting Data Streams Using Ensemble Classifiers (2007)

Haixun Wang, Wei Fan, Philip S. Yu, Jiawei Han

Recently, mining data streams with concept drifts for actionable insights has become an important and challenging task for a wide range of applications including credit card fraud protection, target...

Mining Concept-Drifting Data Streams using Ensemble Classifiers (2007)

Haixun Wang, Wei Fan, Philip S. Yu, Jiawei Han

Recently, mining data streams with concept drifts for actionable insights has become an important and challenging task for a wide range of applications including credit card fraud protection, target...

Database System Extensions for Decision Support: the AXL Approach (2007)

Haixun Wang, Carlo Zaniolo

Research on database-centric data mining is seeking to improve the effectiveness of database systems in decision support applications. Different solutions are now used for different problems,...

Logic-Based User-Defined Aggregates for the Next Generation of Database Systems (2007)

Carlo Zaniolo, Haixun Wang

In this paper, we provide logic-based foundations for the extended aggregate constructs required by advanced database applications. In particular, we focus on data mining applications and...

ATLaS: A Native Extension of SQL for Data Mining (2007)

Haixun Wang

A lack of power and extensibility in their query languages has seriously limited the generality of DBMSs and hampered their ability to support data mining applications. Thus, there is a pressing need...

Abstract (2007)

Yun Chi, Philip S. Yu, Haixun Wang, ...

been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be...

Challenges and experience in prototyping a multi-modal stream analytic and monitoring application on System S (2007)

Kun-lung Wu, Kirsten W. Hildrum, Wei Fan, Gang Luo, Philip S. Yu, Charu C. Aggarwal, ...

In this paper, we describe the challenges of prototyping a reference application on System S, a distributed stream processing middleware under development at IBM Research. With a large number of...

Supporting ranking and clustering as generalized order-by and group-by (2007)

Chengkai Li, Min Wang, Lipyeow Lim, Haixun Wang

The Boolean semantics of SQL queries cannot adequately capture the “fuzzy” preferences and “soft” criteria required in non-traditional data retrieval applications. One way to solve this...

Gstring: A novel approach for efficient search in graph databases (2007)

Haoliang Jiang, Haixun Wang, Philip S. Yu, Shuigeng Zhou

Graphs are widely used for modeling complicated data, including chemical compounds, protein interactions, XML documents, and multimedia. Information retrieval against such data can be formulated as a...

Blinks: Ranked keyword searches on graphs (2007)

Hao He, Haixun Wang, Jun Yang, Philip S. Yu

Query processing over graph-structured data is enjoying a growing number of applications. A top-k keyword search query on a graph finds the top k answers according to some ranking criteria, where each...

Discovering frequent closed partial orders from strings (2006)

Jian Pei, Haixun Wang, Jian Liu, Ke Wang, Jianyong Wang, ...

Mining knowledge about ordering from sequence data is an important problem with many applications, such as bioinformatics, Web mining, network management, and intrusion detection. For...

Dual labeling: Answering graph reachability queries in constant time (2006)

Haixun Wang, Hao He, Jun Yang, Philip S. Yu, Jeffrey Xu Yu

Graph reachability is fundamental to a wide range of applications, including XML indexing, geographic navigation, Internet routing, ontology queries based on RDF/OWL, etc. Many applications involve...

Load shedding in classifying multi-source streaming data: A Bayes Risk approach (2006)

Yijian Bai, Haixun Wang

Monitoring multiple streaming sources for collective decision making presents several challenges. First, streaming data are often of large volume, fast speed, and highly bursty nature. Second, it is...

Fast computation of reachability labeling for large graphs (2006)

Jiefeng Cheng, Jeffrey Xu Yu, Xuemin Lin, Haixun Wang, Philip S. Yu

There are numerous applications that need to deal with a large graph and need to query reachability between nodes in the graph. A 2-hop cover can compactly represent the whole edge transitive closure...

On the sequencing of tree structures for XML indexing (2005)

Haixun Wang

Sequence-based XML indexing aims at avoiding expensive join operations in query processing. It transforms structured XML data into sequences so that a structured query can be answered holistically...

Loadstar: A Load Shedding Scheme for Classifying Data Streams (2005)

Yun Chi, Philip S. Yu, Haixun Wang, Richard R. Muntz

We consider the problem of resource allocation in mining multiple data streams. Due to the large volume and the high speed of streaming data, mining algorithms must cope with the effects of system...

A native extension of SQL for mining data streams (2005)

Chang Luo, Hetal Thakkar, Haixun Wang, Carlo Zaniolo

ESL enables users to develop stream applications in an SQL-like...

Loadstar: Load shedding in data stream mining (2005)

Yun Chi, Haixun Wang, Philip S. Yu

In this demo, we show that intelligent load shedding is essential in achieving optimum results in mining data streams under various resource constraints. The Loadstar system introduces load shedding...

Loadstar: A load shedding scheme for classifying data streams (2005)

Yun Chi, Philip S. Yu, Haixun Wang, Richard R. Muntz

We consider the problem of resource allocation in mining multiple data streams. Due to the large volume and the high speed of streaming data, mining algorithms must cope with the effects of system...

Query Languages and Data Models for Database Sequences and Data Streams (2004)

Yan-nei Law, Haixun Wang, Carlo Zaniolo

We study the fundamental limitations of relational algebra (RA) and SQL in supporting sequence and stream queries, and present effective query language and data model enrichments to deal with them....

Moment: Maintaining closed frequent itemsets over a stream sliding window (2004)

Yun Chi, Haixun Wang, Philip S. Yu, Richard R. Muntz

This paper considers the problem of mining closed frequent itemsets over a sliding window using limited memory space. We design a synopsis data structure to monitor transactions in the sliding window...

Compact reachability labeling for graph-structured data (2004)

Hao He, Haixun Wang, Jun Yang, Philip S. Yu

Testing reachability between nodes in a graph is a well-known problem with many important applications, including knowledge representation, program analysis, and more recently, biological and...

Compact reachability labeling for graph-structured data (2004)

Hao He, Haixun Wang

Testing reachability between nodes in a graph is a well-known problem with many important applications, including knowledge representation, program analysis, and more recently, biological and...

Indexing Weighted-Sequences in Large Databases (2003)

Haixun Wang, Chang-shing Perng, Wei Fan, Sanghyun Park, Philip S. Yu

We present an index structure for managing weighted-sequences in large databases. A weighted-sequence is defined as a two-dimensional structure where each element in the sequence is associated with a...

ViST: a dynamic index method for querying XML data by tree structures (2003)

Haixun Wang, Sanghyun Park, Wei Fan, Philip S. Yu

With the growing importance of XML in data exchange, much research has been done in providing flexible query facilities to extract data from structured XML documents. In this paper, we propose ViST,...

Enhanced biclustering on expression data (2003)

Jiong Yang, Haixun Wang, Wei Wang, Philip Yu, ...

Microarrays are one of the latest breakthroughs in experimental molecular biology, which provide a powerful tool by which the expression patterns of thousands of genes can be monitored simultaneously...

ATLaS: A native extension of SQL for data mining (2003)

Haixun Wang

A lack of power and extensibility in their query languages has seriously limited the generality of DBMSs and hampered their ability to support data mining applications. Thus, there is a pressing need...

ATLaS: a Small but Complete SQL Extension for Data Mining and Data Streams (2003)

Haixun Wang, Carlo Zaniolo, Chang Richard Luo

DBMSs have long suffered from SQL's lack of power and extensibility. We have implemented ATLAS [1], a powerful database language and system that enables users to develop complete...

Online Mining of Changes from Data Streams: Research Problems and Preliminary Results (2003)

Guozhu Dong, Jiawei Han, Jian Pei, Haixun Wang, ...

As data streams are gaining prominence in a growing number of emerging applications, advanced analysis and mining of data streams is becoming increasingly important. While there are some recent...

The Deductive Database System LDL++ (2002)

Faiz Arni, KayLiang Ong, Shalom Tsur, Haixun Wang, Carlo Zaniolo

This paper describes the LDL++ system and the research advances that have enabled its design and development. We begin by discussing the new nonmonotonic and nondeterministic constructs that extend...

ATLaS: a Turing-Complete Extension of SQL for Data Mining Applications and Streams. http://wis.cs.ucla.edu/atlas/doc (2002)

Haixun Wang

ATLaS is a powerful database language and system that enables users to develop complete data-intensive applications in SQL—by writing new table functions and aggregates in SQL, rather than in...

Clustering by pattern similarity in large data sets (2002)

Haixun Wang, Wei Wang, Jiong Yang, Philip S. Yu

Clustering is the process of grouping a set of objects into classes of similar objects. Although definitions of similarity vary from one clustering model to another, in most of these models the...

δ-cluster: capturing subspace correlation in a large data set (2002)

Jiong Yang, Wei Wang, Haixun Wang, Philip Yu

Clustering has been an active research area of great practical importance for recent years. Most previous clustering models have focused on grouping objects with similar values on a (sub)set of...

Average Mile Split (2002)

Mark Hosang, Wayne Wight, Haixun Wang, Carlo Zaniolo, S. Sarawagi

Non-blocking AGGREGATE myavg(Next Int): Real { TABLE state(sum Int, cnt Int); INITIALIZE: { INSERT INTO state VALUES (Next, 1); } ITERATE: { UPDATE state SET sum=sum+Next...

Empirical comparison of various reinforcement learning strategies for sequential targeted marketing (2002)

Naoki Abe, Edwin Pednault, Haixun Wang, Bianca Zadrozny, Wei Fan, Chid Apte

We empirically evaluate the performance of various reinforcement learning methods in applications to sequential targeted marketing. In particular, we propose and evaluate a progression of...

Mining associations by pattern structure in large relational tables (2002)

Haixun Wang, Chang-shing Perng, Sheng Ma, Philip S. Yu

Association rule mining aims at discovering patterns whose support is beyond a given threshold. Mining patterns composed of items described by an arbitrary subset of attributes in a large relational...

The deductive database system LDL++ (2002)

Faiz Arni, Haixun Wang, Carlo Zaniolo

This paper describes the LDL++ system and the research advances that have enabled its design and development. We begin by discussing the new nonmonotonic and nondeterministic constructs that extend...

Extending SQL for decision support applications (2002)

Haixun Wang, Carlo Zaniolo

The challenge of extending database systems for decision support applications has been the topic of much recent research—a very incomplete list of previous work includes [11, 8, 12, 4, 10, 5]. Yet,...

The deductive database system LDL++ (2002)

Natraj Arni, Kayliang Ong, Shalom Tsur, Haixun Wang, Carlo Zaniolo

This paper describes the LDL++ system and the research advances that have enabled its design and development. We begin by discussing the new nonmonotonic and nondeterministic constructs that extend...

Empirical Comparison of Various Reinforcement Learning Strategies for Sequential Targeted Marketing (2002)

Naoki Abe, Edwin Pednault, Haixun Wang, Bianca Zadrozny, Wei Fan, Chid Apte

We empirically evaluate the performance of various reinforcement learning methods in applications to sequential targeted marketing. In particular, we propose and evaluate a progression of...

User-defined aggregates for advanced database applications (2000)

Haixun Wang

Thesis (Ph. D.)--University of California, Los Angeles, 2000.

CMP: A Fast Decision Tree Classifier Using Multivariate Predictions (2000)

Haixun Wang

Most decision tree classifiers are designed to keep class histograms for single attributes, and to select a particular attribute for the next split using said histograms. In this paper, we propose a...

CMP: A Fast Decision Tree Classifier Using Multivariate Predictions (2000)

Haixun Wang, Carlo Zaniolo

Most decision tree classifiers are designed to keep class histograms for single attributes, and to select a particular attribute for the next split using said histograms. In this paper, we propose a...

Landmarks: a new model for similarity-based pattern querying in time series databases (2000)

Chang-shing Perng, Haixun Wang, Sylvia R. Zhang, D. Stott Parker

In this paper we present the Landmark Model, a model for time series that yields new techniques for similarity-based time series pattern querying. The Landmark Model does not follow traditional...

Nonmonotonic reasoning in LDL (2000)

Haixun Wang, Carlo Zaniolo

Deductive database systems have made major advances on efficient support for nonmonotonic reasoning. A first generation of deductive database systems supported the notion of stratification...

User Defined Aggregates in Object-Relational Systems (2000)

Haixun Wang, Carlo Zaniolo

User-defined aggregates are essential in many advanced database applications, particularly in expressing data mining functions, but they find little support in current systems including...

Using SQL to Build New Aggregates and Extenders for Object-Relational Systems (2000)

Haixun Wang, Carlo Zaniolo

User-defined Aggregates (UDAs) provide a versatile mechanism for extending the power and applicability of Object-Relational Databases (O-R DBs). In this paper, we describe the AXL system that...

Landmarks: A New Model for Similarity-Based Pattern Querying in Time Series Databases (2000)

Chang-shing Perng, Haixun Wang, Sylvia R. Zhang, D. Stott Parker

In this paper we present the Landmark Model, a model for time series that yields new techniques for similarity-based time series pattern querying. The Landmark Model does not follow traditional...

Landmark: A New Technique for Similarity-Based Pattern Querying in Time Series Databases (2000)

Chang-Shing Perng, Haixun Wang, Sylvia R. Zhang, D. Stott Parker

In this paper we present Landmark, a new technique for similarity-based time series pattern querying. Landmark does not follow traditional similarity models which rely on the point-wise Euclidean...

User-Defined Aggregates for Datamining (1999)

Haixun Wang, Carlo Zaniolo

User-defined aggregates can be the linchpin of sophisticated datamining functions and other advanced database applications. This is demonstrated by our efficient implementation on DB2 of SQL3...

User-Defined Aggregates in Database Languages (1999)

Haixun Wang, Carlo Zaniolo

User-defined aggregates can be the linchpin of sophisticated datamining functions and other advanced database applications, but they find little support in current database systems including...

User Defined Aggregates in Database Languages (1999)

Haixun Wang, Carlo Zaniolo

User-defined aggregates can be the linchpin of sophisticated datamining functions and other advanced database applications, but they find little support in current database systems including...

The S²-Tree: An Index Structure for Subsequence Matching of Spatial Objects (1999)

Haixun Wang, Chang-shing Perng

We present the S²-Tree, an indexing method for subsequence matching of spatial objects. The S²-Tree locates subsequences within a collection of spatial sequences, i.e., sequences made up of spatial...

User-Defined Aggregates for Datamining (1999)

Haixun Wang, Carlo Zaniolo

User-defined aggregates can be the linchpin of sophisticated datamining functions and other advanced database applications. This is demonstrated by our efficient implementation on DB2 of SQL3...

User-Defined Aggregates in Database Languages. DBPL 1999: 43-60 (1999)

Haixun Wang, Carlo Zaniolo

User-defined aggregates (UDAs) can be the linchpin of sophisticated data mining functions and other advanced database applications, but they find little support in current database...

Logic-Based User-Defined Aggregates for the Next Generation of Database Systems

Carlo Zaniolo, Haixun Wang

A new wave of database applications, particularly decision-support and data mining applications, are based on complex aggregates not supported by current DBMSs: in fact, SQL2...