David J. Hand

ABSTRACT (2008)

Niall M. Adams, David J. Hand

Variable selection can be valuable in the analysis of streaming data with costly measurements, as in intensive care monitoring or battery-powered sensor networks. In the presence of drift, selections...

Scorecard construction with unbalanced class sizes (2008)

David J. Hand, Veronica Vinciotti

A long-running issue in scorecard construction is how to handle dramatically unbalanced class sizes. This is important because, in many applications, the class sizes are very different. For example,...

The Impact of Changing Populations on Classifier Performance (2008)

Mark G. Kelly, David J. Hand, Niall M. Adams

An assumption fundamental to almost all work on supervised classification is that the probabilities of class membership, conditional on the feature vectors, are stationary. However, in many...

Behavioural finance as a multi-instance learning problem (2008)

Piotr Juszczak, Niall Adams, David J. Hand

Unsupervised Clustering In Streaming Data (2008)

Dimitris K. Tasoulis, Niall M. Adams, David J. Hand

Tools for automatically clustering streaming data are becoming increasingly important as data acquisition technology continues to advance. In this paper we present an extension of conventional kernel...

Peer Group Analysis – Local Anomaly Detection in Longitudinal Data (2007)

Richard J. Bolton, David J. Hand

Peer group analysis is a new tool for monitoring behavior over time in data mining situations. In particular, the tool detects individual objects that begin to behave in a way distinct from objects...

Significance Tests for Unsupervised Pattern Discovery in Large Continuous Multivariate Data Sets (2007)

Richard J. Bolton, David J. Hand, Martin Crowder

In this paper we consider the question of uncertainty of discovered patterns in data mining. In particular, we develop statistical tests for flagged patterns found in continuous data, where such...

Bayesian coclustering of Anopheles gene expression time series: study of immune defense response to multiple experimental challenges (2005)

Nicholas A. Heard, Christopher C. Holmes, David A. Stephens, David J. Hand, George Dimopoulos

We present a method for Bayesian model-based hierarchical co-clustering of gene expression data and use it to study the temporal transcription responses of an Anopheles gambiae cell line upon...

Scorecard construction with unbalanced class sizes (2003)

Veronica Vinciotti, David J. Hand

An empirical comparison of three boosting algorithms on real data sets with artificial class noise (2003)

Ross A. McDonald, David J. Hand, Idris A. Eckley

Boosting algorithms are a means of building a strong ensemble classifier by aggregating a sequence of weak hypotheses. In this paper we consider three of the best-known boosting algorithms:...

Projection techniques for nonlinear principal component analysis (2003)

Richard J. Bolton, David J. Hand, Andrew R. Webb

Principal Components Analysis (PCA) is traditionally a linear technique for projecting multidimensional data onto lower dimensional subspaces with minimal loss of variance. However, there are several...

Determining Hit Rate in Pattern Search (2002)

Richard Bolton, David J. Hand, Niall M. Adams

The problem of spurious apparent patterns arising by chance is a fundamental one for pattern detection. Classical approaches, based on adjustments such as the Bonferroni procedure, are arguably not...

Statistical Fraud Detection: A Review (2002)

Richard J. Bolton, David J. Hand

Fraud is increasing dramatically with the expansion of modern technology and the global superhighways of communication, resulting in the loss of billions of dollars worldwide each year. Although...

Unsupervised profiling methods for fraud detection, Credit Scoring and Credit Control VII (2001)

Richard J. Bolton, David J. Hand

Credit card fraud falls broadly into two categories: behavioural fraud and application fraud. Application fraud occurs when individuals obtain new credit cards from issuing companies using false...

Finding patterns that correspond to episodes (2001)

Paul Cohen, Niall Adams, David J. Hand

We present two algorithms for elucidating structures in time series. These are unsupervised algorithms; they discover patterns without any knowledge about the episodic structures in the time series...