Hillol Kargupta

Distributed Data Mining Bibliography (2004)

Kun Liu, Hillol Kargupta, Jessica Ryan

Advances in computing and communication over wired and wireless networks have resulted in many pervasive distributed computing environments. Many of these environments deal with different distributed...

Distributed Clustering Using Collective Principal Component (2003)

Hillol Kargupta, Weiyun Huang, Krishnamoorthy Sivakumar, Erik Johnson

This paper considers distributed clustering of high dimensional heterogeneous data using a distributed Principal Component Analysis (PCA) technique called the Collective PCA. It presents the...

Energy Consumption in Data Analysis for On-board and Distributed Applications (2003)

Ruchita Bhargava, Hillol Kargupta, Michael Powers

Energy consumption is an important issue in the growing number of data mining and machine learning applications for battery-powered embedded and mobile devices. It plays a critical role in...

Random Value Distortion: Does It Really Preserve Privacy? (2003)

Hillol Kargupta, Souptik Datta, Krishnamoorthy Sivakumar

Privacy is becoming an increasingly important issue in many data mining applications. This has resulted in the development of several privacy-preserving data mining techniques. The random value...

Revisiting The GEMGA: Scalable Evolutionary Optimization Through Linkage Learning (2002)

Sanghamitra B, Hillol Kargupta, Gang Wang

The gene expression messy genetic algorithm (GEMGA) is a new generation of messy genetic algorithms (GAs) that pays careful attention to linkage learning and in a broader context the search for...

The Gene Expression Messy Genetic Algorithm (2002)

Hillol Kargupta

This paper introduces the gene expression messy genetic algorithm (GEMGA), a new generation of messy GAs that directly search for relations among the members of the search space. The GEMGA is an O(Ak(...

A Fourier Spectrum-based Approach to Represent Decision Trees for Mining Data Streams in Mobile Environments (2002)

Hillol Kargupta, Byung-hoon Park

This paper presents a novel Fourier analysis-based technique to aggregate, transmit, and visualize decision trees in a mobile environment. Fourier representation of a decision tree has several...

Gene Expression and Fast Construction of Distributed Evolutionary Representation (2002)

Hillol Kargupta

The gene expression process in nature produces different proteins in different cells from different portions of the DNA. Since proteins control almost every important activity in a living organism,...

Distributed Data Mining: Algorithms, Systems, and Applications (2002)

Byung-hoon Park, Hillol Kargupta

This paper presents a brief overview of the DDM algorithms, systems, applications, and the emerging research directions. The structure of the paper is organized as follows. We first present the...

A Resampling Technique for Learning the Fourier Spectrum of Skewed Data (2002)

Rajeev Ayyagari, Hillol Kargupta

Function induction using the widely studied Walsh or Multidimensional Discrete Fourier Transform (MDFT) coefficient estimates has several benefits, including the fact that decision trees can be...

Constructing Simpler Decision Trees from Ensemble Models Using Fourier Analysis (2002)

Byung-hoon Park, Hillol Kargupta

Ensemble learning is frequently used for classification and other related applications in data mining. It generates multiple models and produces the final classification by aggregating the outputs of...

Dependency Detection in MobiMine and Random Matrices (2002)

Hillol Kargupta, Krishnamoorthy Sivakumar, Samiran Ghosh

This paper describes a novel approach to detect correlation from data streams in the context of MobiMine --- an experimental mobile data mining system. It presents a brief description of the MobiMine...

A Random Matrix-Based Approach for Dependency Detection from (2002)

Hillol Kargupta, Krishnamoorthy Sivakumar, Samiran Ghosh

This paper describes a novel approach to detect correlation from data streams in the context of MobiMine, an experimental mobile data mining system. It presents a brief description of the MobiMine...

A Resampling Technique for Learning the Fourier Spectrum of (2002)

Rajeev Ayyagari, Hillol Kargupta

Function induction using the widely studied Walsh or Multidimensional Discrete Fourier Transform (MDFT) coefficient estimates has several benefits, including the fact that decision trees can be...

Constructing Simpler Decision Trees from Ensemble Models Using (2002)

Byung-hoon Park, Hillol Kargupta

Ensemble learning is frequently used for classification and other related applications in data mining. It generates multiple models and produces the final classification by aggregating the outputs of the...

MobiMine: Monitoring the Stock Market from a PDA (2002)

Hillol Kargupta, Byung-hoon Park, Sweta Pittie, Lei Liu, Deepali Kushraj, Kakali Sarkar

This paper describes an experimental mobile data mining system that allows intelligent monitoring of time-critical financial data from a hand-held PDA. It presents the overall system architecture and...

Learning Functions Using Randomized Expansions: Probabilistic Properties and Experimentations (2001)

Hillol Kargupta, Rajeev Ayyagari, Samiran Ghosh

Inductive learning of nonlinear functions plays an important role in constructing predictive models and classifiers from data. This paper explores a novel randomized approach to construct linear...

Gene Expression and Fast Construction of Distributed Evolutionary Representation (2001)

Hillol Kargupta

The gene expression process in nature produces different proteins in different cells from different portions of the DNA. Since proteins control almost every important activity in a living organism,...

Mining Decision Trees from Data Streams in a Mobile Environment (2001)

Hillol Kargupta, Byung-hoon Park

This paper presents a novel Fourier analysis-based technique to aggregate, communicate, and visualize decision trees in a mobile environment. Fourier representation of a decision tree has several...

A Striking Property of Genetic Code-Like Transformations (2001)

Hillol Kargupta

This paper shows that genetic code-like transformations introduce very interesting properties to the representation of a genetic fitness function. It presents a...

Distributed Clustering Using Collective Principal Component Analysis (2000)

Hillol Kargupta, Weiyun Huang, Krishnamoorthy Sivakumar, Erik Johnson

This paper considers distributed clustering of high dimensional heterogeneous data using a distributed Principal Component Analysis (PCA) technique called the Collective PCA. It presents the...

Collective Data Mining: A New Perspective Toward Distributed Data Analysis (2000)

Hillol Kargupta, Byung-hoon Park, Daryl Hershberger, Erik Johnson

This paper introduces the collective data mining (CDM) framework, a new approach toward distributed data mining (DDM) from heterogeneous sites. It points out that naive approaches to distributed data...

A Striking Property of Genetic Code-Like Transformations (2000)

Hillol Kargupta

The gene expression process in nature evaluates the fitness of a DNA through the production of different proteins in different cells. The production of protein from DNA goes through transcription that...

SEARCH, Computational Processes in Evolution, and Preliminary Development of the Gene Expression Messy Genetic Algorithm (2000)

Hillol Kargupta

This paper considers the issue of scalable search with little domain knowledge and explores implications in the context of evolutionary computation. It presents the Search Envisioned As Relation and...

Drift, Diffusion and Boltzmann Distribution in Simple Genetic Algorithm (2000)

Hillol Kargupta

This paper presents a general diffusion model of a simple genetic algorithm. Unlike the similar previous efforts made for modeling mutation based genetic search, this work includes the effect of...

Gene Expression and Fast Construction of Distributed Evolutionary Representation (2000)

Hillol Kargupta, Byung-hoon Park

The gene expression process in nature evaluates the fitness of a DNA in a very distributed and decomposed fashion through the production of different proteins in different cells. Doing so requires a...

Scalable Evolutionary Computation (1999)

Hillol Kargupta

This paper considers the scalability of the process of schema detection and their subsequent exploitation in simple GA. It essentially shows that unless the user hand-picks the representation in such...

Distributed Multivariate Regression Using Wavelet-based Collective Data Mining (1999)

Daryl E. Hershberger, Hillol Kargupta

This paper presents a method for distributed multivariate regression using wavelet-based Collective Data Mining (CDM). The method seamlessly blends machine learning and information theory with the...

Function Induction, Gene Expression, And Evolutionary Representation Construction (1999)

Hillol Kargupta, Kakali Sarkar

Different portions of the DNA, the primary information carrier of a living organism, are evaluated in different cells through the process of gene expression (DNA → mRNA → Protein). Such distributed...

Collective Data Mining: A New Perspective Toward Distributed Data Mining (1999)

Hillol Kargupta, Byung-hoon Park, Daryl Hershberger, Erik Johnson

This paper introduces the collective data mining (CDM), a new approach toward distributed data mining (DDM) from heterogeneous sites. It points out that naive approaches to distributed data analysis...

The Collective Data Mining: A Technology For Ubiquitous Data Analysis From Distributed Heterogeneous Sites (1999)

Hillol Kargupta, Byung-hoon Park

This paper introduces the collective data mining (CDM), a unique approach to distributed data mining (DDM) from heterogeneous sites. It points out that naive approaches to distributed data analysis...

Web Based Parallel/Distributed Medical Data Mining Using Software Agents (1999)

Hillol Kargupta, Brian Stafford, Ilker Hamzaoglu

This paper describes an experimental parallel/distributed data mining system PADMA (PArallel Data Mining Agents) that uses software agents for local data accessing and analysis and a web based...

Further Experimentations On The Scalability Of The GEMGA (1998)

Hillol Kargupta

This paper reports the recent developments of the Gene Expression Messy Genetic Algorithm (GEMGA) research. It presents extensive experimental results for large problems with massive...

Collective Data Mining From Distributed, Vertically Partitioned Feature Space (1998)

Hillol Kargupta, Erik Johnson, Eleonora Riva Sanseverino, Byung-hoon Park, Luisa Di Silvestre, Daryl Hershberger

This paper develops collective data mining, a unique approach for finding patterns from a network of databases, each with a distinct feature space. This paper addresses both distributed cooperative...

Scalable, Distributed Data Mining Using An Agent Based Architecture (1998)

Hillol Kargupta, Ilker Hamzaoglu, Brian Stafford

Algorithm scalability and the distributed nature of both data and computation deserve serious attention in the context of data mining. This paper presents PADMA (PArallel Data Mining Agents), a...

The Gene Expression Messy Genetic Algorithm (1998)

Hillol Kargupta

This paper introduces the gene expression messy genetic algorithm (GEMGA)---a new generation of messy GAs that directly search for relations among the members of the search space. The GEMGA is an O(...

A Temporal Sequence Processor Based on the Biological Reaction-Diffusion Process (1998)

Sylvian R. Ray, Hillol Kargupta

Temporal sequences are a fundamental form of information and communication both in natural and engineered systems. The biological control process which directs the generation of iterative structures...

Web Based Parallel/Distributed Medical Data Mining Using Software Agents (1997)

Hillol Kargupta, Brian Stafford, Ilker Hamzaoglu

This paper describes an experimental parallel/distributed data mining system PADMA (PArallel Data Mining Agents) that uses software agents for local data accessing and analysis and a web based...

Scalable, Distributed Data Mining Using An Agent Based Architecture (1997)

Hillol Kargupta, Ilker Hamzaoglu, Brian Stafford

Algorithm scalability and the distributed nature of both data and computation deserve serious attention in the context of data mining. This paper presents PADMA (PArallel Data Mining Agents), a...

Revisiting The GEMGA: Scalable Evolutionary Optimization Through Linkage Learning (1997)

Sanghamitra B, Hillol Kargupta, Gang Wang

The gene expression messy genetic algorithm (GEMGA) is a new generation of messy genetic algorithms (GAs) that pays careful attention to linkage learning and in a broader context the search for...

From DNA To Protein: Transformations And Their Possible Role In Linkage Learning (1997)

Hillol Kargupta, Brian Stafford

This paper first presents an extended perspective of linkage using basic concepts developed in the SEARCH framework (Kargupta, 1995; Kargupta & Goldberg, 1996) and identifies detection of...

Extending The Class of Order-k Delineable Problems For The Gene Expression Messy Genetic Algorithm (1997)

Hillol Kargupta, David E. Goldberg, Liwei Wang

This paper revisits the gene expression messy genetic algorithm (GEMGA) (Kargupta, 1996a) and offers some modifications to extend the class of order-k delineable problems (class of problems that can...

Computational Processes In Evolution And The Gene Expression Messy Genetic Algorithm (1997)

Hillol Kargupta

This paper makes an effort to project the theoretical lessons of the SEARCH (Search Envisioned As Relation and Class Hierarchizing) framework introduced elsewhere (Kargupta, 1995; Kargupta &...

Unconstrained and Constrained Blackbox Optimization: The SEARCH Perspective (1997)

Hillol Kargupta, Vijay Hanagandi, David E. Goldberg

The SEARCH (Search Envisioned As Relation & Class Hierarchizing) framework developed elsewhere (Kargupta, 1995a; Kargupta & Goldberg, 1995) offered an alternate perspective toward blackbox...

Blackbox Optimization: Implications Of SEARCH (1997)

Hillol Kargupta, David E. Goldberg

The SEARCH (Search Envisioned As Relation & Class Hierarchizing) framework developed elsewhere (Kargupta, 1995; Kargupta & Goldberg, 1996a; Kargupta & Goldberg, 1996b) offered an alternate...

Signal-to-noise, Crosstalk and Long Range Problem Difficulty in Genetic Algorithms (1997)

Hillol Kargupta

This paper presents a signal-to-noise perspective of the search bias introduced by genetic algorithms. A decision theoretic signal-to-noise framework is used to show that there are two fundamental...

Rapid, Accurate Optimization of Difficult Problems Using Fast Messy Genetic Algorithms (1997)

David E. Goldberg, Kalyanmoy Deb, Hillol Kargupta, Georges Harik

Researchers have long sought genetic algorithms (GAs) that can solve difficult search, optimization, and machine learning problems quickly. Despite years of work on simple GAs and their variants it...

SEARCH, Blackbox Optimization, And Sample Complexity (1997)

Hillol Kargupta, David E. Goldberg

The SEARCH (Search Envisioned As Relation & Class Hierarchizing) framework developed elsewhere (Kargupta, 1995) offered an alternate perspective toward blackbox optimization (BBO)---optimization in...

Relation Learning In Gene Expression: Introns, Variable Length Representation, And All That (1997)

Hillol Kargupta

In this paper we will primarily be concerned with tuples taken from the space of n-ary Cartesian products of the search domain with itself.

The Gene Expression Messy Genetic Algorithm For Financial Applications (1997)

Hillol Kargupta, Kevin Buescher

This paper introduces the gene expression messy genetic algorithm (GEMGA)---a new generation of messy GAs that may find many applications in financial engineering. Unlike other existing blackbox...

Extending The Class of (1997)

Hillol Kargupta, David E. Goldberg, Liwei Wang

This paper revisits the gene expression messy genetic algorithm (GEMGA) (Kargupta, 1996a) and offers some modifications to extend the class of order-k delineable problems (class of problems that can...

Blackbox Optimization: Implications Of SEARCH (1996)

Hillol Kargupta, David E. Goldberg

The SEARCH (Search Envisioned As Relation & Class Hierarchizing) framework developed elsewhere (Kargupta, 1995; Kargupta & Goldberg, 1996a; Kargupta & Goldberg, 1996b) offered an alternate...

SEARCH, Blackbox Optimization, And Sample Complexity (1996)

Hillol Kargupta, David E. Goldberg

The SEARCH (Search Envisioned As Relation & Class Hierarchizing) framework developed elsewhere (Kargupta, 1995) offered an alternate perspective toward blackbox optimization (BBO)---optimization in...

The Gene Expression Messy Genetic Algorithm (1996)

Hillol Kargupta

This paper introduces the gene expression messy genetic algorithm (GEMGA)---a new generation of messy GAs that directly search for relations among the members of the search space. The GEMGA is an O(...

Unconstrained and Constrained Blackbox Optimization: The SEARCH Perspective (1996)

Hillol Kargupta, Vijay Hanagandi, David E. Goldberg

The SEARCH (Search Envisioned As Relation & Class Hierarchizing) framework developed elsewhere (Kargupta, 1995a; Kargupta & Goldberg, 1995) offered an alternate perspective toward blackbox...

The Gene Expression Messy Genetic Algorithm For Financial Applications (1996)

Hillol Kargupta, Kevin Buescher

This paper introduces the gene expression messy genetic algorithm (GEMGA)---a new generation of messy GAs that may find many applications in financial engineering. Unlike other existing blackbox...

SEARCH: An Alternate Perspective Toward Blackbox Optimization (1996)

Hillol Kargupta, David E. Goldberg

This paper presents SEARCH (Search Envisioned As Relation & Class Hierarchizing)---an alternate perspective of blackbox optimization and its quantitative analysis that lays the foundation essential...

Signal-to-noise, Crosstalk and Long Range Problem Difficulty in Genetic Algorithms (1996)

Hillol Kargupta

This paper presents a signal-to-noise perspective of the search bias introduced by genetic algorithms. A decision theoretic signal-to-noise framework is used to show that there are two fundamental...

Rapid, Accurate Optimization of Difficult Problems Using Fast Messy Genetic Algorithms (1996)

David E. Goldberg, Kalyanmoy Deb, Hillol Kargupta, Georges Harik

Researchers have long sought genetic algorithms (GAs) that can solve difficult search, optimization, and machine learning problems quickly. Despite years of work on simple GAs and their variants it...

SEARCH, Polynomial Complexity, And The Fast Messy Genetic Algorithm (1995)

Hillol Kargupta

Blackbox optimization---optimization in presence of limited knowledge about the objective function---has recently enjoyed a large increase in interest because of the demand from the practitioners....

Polynomial Complexity Search, Problem Difficulty and Genetic Algorithms (1995)

Hillol Kargupta

A blackbox optimization problem can be difficult to solve because of 1) inherent problem complexity class, 2) inadequate representation, and 3) inappropriate search bias. In this paper I address the...

The Class Of Statically Deceptive Problems Is NP-Turing Complete (1995)

Hillol Kargupta

This paper identifies the complexity class of statically constructed partially deceptive problems by proving a Turing reduction of the 3-SAT problem to a partially deceptive problem. This essentially...

Critical Deme Size For Serial And Parallel Genetic Algorithms (1995)

David E. Goldberg, Hillol Kargupta, Jeffrey Horn, Erick Cantu-paz

This paper investigates the possibility of gaining any computational benefit from multiple deme, small population GAs compared to a single large population GA. Our framework is based on an earlier...

Signal-to-noise, Crosstalk and Long Range Problem Difficulty in Genetic Algorithms (1995)

Hillol Kargupta

This paper presents a signal-to-noise perspective of the search bias introduced by genetic algorithms. A decision theoretic signal-to-noise framework is used to show that there are two fundamental...

Rapid, Accurate Optimization of Difficult Problems Using Fast Messy Genetic Algorithms (1995)

David E. Goldberg, Kalyanmoy Deb, Hillol Kargupta, Georges Harik

Researchers have long sought genetic algorithms (GAs) that can solve difficult search, optimization, and machine learning problems quickly. Despite years of work on simple GAs and their variants it...

A Signal-to-noise Framework for Quantifying Search Difficulties in Genetic Algorithms (1995)

Hillol Kargupta

This paper presents a signal-to-noise perspective of the search process in genetic algorithms. First we pose a decision problem in terms of multiple 2-armed, mutually dependent bandits. This presents...

Decision Making In Genetic Algorithms: A Signal-To-Noise Perspective (1994)

Hillol Kargupta, David E. Goldberg

Signal detection in presence of noise is a decision problem, as addressed in the traditional signal processing literature. On the other hand, an arbitrary decision problem can also be posed in terms...

A Temporal Sequence Processor Based on the Biological Reaction-Diffusion Process (1994)

Sylvian R. Ray, Hillol Kargupta

Temporal sequences are a fundamental form of information and communication both in natural and engineered systems. The biological control process which directs the generation of iterative structures...
