A New Local Distance-Based Outlier Detection Approach for Scattered Real-World Data (2009)
Zhang, Ke, Hutter, Marcus, Jin, Huidong
Detecting outliers which are grossly different from or inconsistent with the remaining dataset is a major challenge in real-world KDD applications. Existing outlier detection methods are ineffective...
A HMM-Based Hierarchical Framework for Long-term Population Projection of Small Areas (2008)
Bin Jiang, Huidong Jin, Nianjun Liu, Mike Quirk, Ben Searle
Abstract. Population Projection is the numerical outcome of a specific set of assumptions about future population changes. It is indispensable to the planning of sites as almost all successive...
Temporal Sequence Associations for Rare Events (2008)
Jie Chen, Hongxing He, Graham Williams, Huidong Jin
Abstract. In many real world applications, systematic analysis of rare events, such as credit card frauds and adverse drug reactions, is very important. Their low occurrence rate in large databases...
Frequency-Based Temporal Pattern Mining in Health Data (2008)
Jie Chen, Huidong Jin, Hongxing He, Christine M. O’Keefe, Ross Sparks, Graham Williams, ...
The low occurrence rate of adverse drug reactions makes it difficult to identify the risk factors from straightforward application of frequent pattern discovery in large databases. In this paper, we...
An Expectation-Maximization Algorithm Working on Data Summary (2008)
Huidong Jin, Kwong-sak Leung, Man-leung Wong
Scalable cluster analysis addresses the problem of processing large data sets with limited resources, e.g., memory and computation time. A data summarization or sampling procedure is an essential...
Discovering Prediction Model for Environmental Distribution Maps (2008)
Ke Zhang, Huidong Jin, Nianjun Liu, Rob Lesslie, Lei Wang, Zhouyu Fu, ...
Abstract. Currently environmental distribution maps, such as for soil fertility, rainfall and foliage, are widely used in the natural resource management and policy making. One typical example is to...
Identifying Risk Groups Associated with Colorectal Cancer (2008)
Jie Chen, Hongxing He, Huidong Jin, Damien McAullay
Abstract. In this paper, we explore data mining techniques for the task of identifying and describing risk groups for colorectal cancer (CRC) from population based administrative health data....
Automatic feature selection for classification of health data (2008)
Hongxing He, Huidong Jin, Jie Chen
Abstract. For classification of health data, we propose in this paper a fast and accurate feature selection method, FIEBIT (Feature Inclusion and Exclusion Based on Information Theory). FIEBIT...
A Delivery Framework for Health Data Mining and Analytics (2008)
Damien McAullay, Graham Williams, Jie Chen, Huidong Jin, Hongxing He, Ross Sparks, ...
The iHealth Explorer tool, developed by CSIRO and DoHA, delivers web services type data mining and analytic facilities over a web interface, providing desktop access to sophisticated analyses over...
Current developments of k-anonymous data releasing (2008)
Li, Jiuyong, Wang, Hua, Jin, Huidong, Yong, Jianming
[Abstract]: Disclosure-control is a traditional statistical methodology for protecting privacy when data is released for analysis. Disclosure-control methods have enjoyed a revival in the data mining...
Mining unexpected temporal associations: Applications in detecting adverse drug reactions (2008)
Jin, Huidong, Chen, Jie, He, Hongxing, Williams, G. J., Kelman, Chris, O'Keefe, Christine Margaret
Copyright © 2008 IEEE
Analysis of breast feeding data using data mining methods (2006)
He, Hongxing, Jin, Huidong, Chen, Jie, McAullay, Damien, Li, Jiuyong, Fallon, Anthony Bruce
The purpose of this study is to demonstrate the benefit of using common data mining techniques on survey data where statistical analysis is routinely applied. The statistical survey is commonly used...
Analysis of breast feeding data using data mining methods (2006)
He, Hongxing, Jin, Huidong, Chen, Jie, McAullay, Damien, Li, Jiuyong, Fallon, Anthony Bruce, ...
The purpose of this study is to demonstrate the benefit of using common data mining techniques on survey data where statistical analysis is routinely applied. The statistical survey is commonly used...
Analysis of breast feeding data using data mining methods (2006)
He, Hongxing, Jin, Huidong, Chen, Jie, McAullay, Damien, Li, Jiuyong, Fallon, Tony
The purpose of this study is to demonstrate the benefit of using common data mining techniques on survey data where statistical analysis is routinely applied. The statistical survey is commonly used...
Current developments of k-anonymous data releasing (2006)
Li, Jiuyong, Wang, Hua, Jin, Huidong, Yong, Jianming
Disclosure-control is a traditional statistical methodology for protecting privacy when data is released for analysis. Disclosure-control methods have enjoyed a revival in the data mining community,...
Current developments of k-Anonymous data releasing (2006)
Li, Jiuyong, Wang, Hua, Jin, Huidong, Yong, Jianming, Croll, Peter, Morarji, Hasmukh, ...
Disclosure-control is a traditional statistical methodology for protecting privacy when data is released for analysis. Disclosure-control methods have enjoyed a revival in the data mining community,...
Current developments of k-Anonymous data releasing (2006)
Li, Jiuyong, Wang, Hua, Jin, Huidong, Yong, Jianming
Disclosure-control is a traditional statistical methodology for protecting privacy when data is released for analysis. Disclosure-control methods have enjoyed a revival in the data mining community,...
Frequency-based Rare Events Mining in Administrative Health Data (2006)
Jie Chen, Huidong Jin, Hongxing He, Christine M O'Keefe, Ross Sparks, ... (CSIRO Mathematical and Information Sciences, Canberra)
The low occurrence rate of adverse drug reactions makes it difficult to identify risk factors from a straightforward application of association pattern discovery in large databases. In this paper, we...
Current developments of k-anonymous data releasing (2006)
Jiuyong Li, Hua Wang, Huidong Jin, Jianming Yong
Disclosure-control is a traditional statistical methodology for protecting privacy when data is released for analysis. Disclosure-control methods have enjoyed a revival in the data mining community,...
Scalable model-based cluster analysis using clustering features (2005)
We present two scalable model-based clustering systems based on a Gaussian mixture model with independent attributes within clusters. They first summarize data into sub-clusters, and then generate...
Scalable model-based clustering for large databases based on data summarization (2005)
Huidong Jin, Man-leung Wong, Kwong-sak Leung
The scalability problem in data mining involves the development of methods for handling large databases with limited computational resources such as memory and computation time. In this paper, two...
Representing Association Classification Rules Mined from Health Data (2005)
Jie Chen, Hongxing He, Jiuyong Li, Huidong Jin, Damien McAullay, Graham Williams, ...
Abstract. An association classification algorithm has been developed to explore adverse drug reactions in a large medical transaction dataset with unbalanced classes. Rules discovered can be used to...
Representing Association Classification Rules Mined from Health Data (2005)
Jie Chen, Hongxing He, Jiuyong Li, Huidong Jin, Graham Williams, Ross Sparks, ...
Abstract. An association classification algorithm has been used to explore adverse drug reactions in a large medical transaction data set with unbalanced classes. Rules discovered can be used to...
Scalable model-based clustering by working on data summaries (2003)
The scalability problem in data mining involves the development of methods for handling large databases with limited computational resources. In this paper, we present a two-phase scalable...
Abstract- The issue of obtaining a well-converged and well-distributed set of Pareto optimal solutions efficiently and automatically is crucial in multi-objective evolutionary algorithms (MOEAs)....