Alex Smola

Kernel Measures of Independence for non-iid Data (2009)

Xinhua Zhang, Arthur Gretton, Le Song, Alex Smola

Many machine learning algorithms can be formulated in the framework of statistical independence, such as the Hilbert-Schmidt Independence Criterion. In this paper, we extend this criterion to deal...

ABSTRACT (2009)

Markus Weimer (TU Darmstadt), Alexandros Karatzoglou, Alex Smola

We present a flexible approach to collaborative filtering which stems from basic research results. The approach is flexible in several dimensions: we introduce an algorithm where the loss can be...

Feature Hashing for Large Scale Multitask Learning (2009)

Weinberger, Kilian, Dasgupta, Anirban, Attenberg, Josh, Langford, John, Smola, Alex

Empirical evidence suggests that hashing is an effective strategy for dimensionality reduction and practical nonparametric estimation. In this paper we provide exponential tail bounds for feature...

Kernelized sorting (2009)

Quadrianto, Novi, Song, Le, Smola, Alex

Object matching is a fundamental operation in data analysis. It typically requires the definition of a similarity measure between the classes of objects to be matched. Instead, we develop an approach...

Kernel Measures of Independence for non-iid Data (2008)

Zhang, Xinhua, Song, Le, Gretton, Arthur, Smola, Alex

Many machine learning algorithms can be formulated in the framework of statistical independence, such as the Hilbert-Schmidt Independence Criterion. In this paper, we extend this criterion to deal...

Joint Regularization (2008)

Karsten M. Borgwardt, Alex Smola

We present a principled method to combine kernels under joint regularization constraints. Central to our method is an extension of the representer theorem for handling multiple joint...

Sparse Bayesian Learning and the Relevance Vector Machine (2008)

Alex Smola

This paper introduces a general Bayesian framework for obtaining sparse solutions to regression and classification tasks utilising models linear in the parameters. Although this framework is fully...

Near-optimal supervised feature selection among frequent subgraphs (2008)

Thoma, Marisa, Cheng, Hong, Gretton, Arthur, Han, Jiawei, Kriegel, Hans-Peter, Smola, Alex, ...

Graph classification is an increasingly important step in numerous application domains, such as function prediction of molecules and proteins, computerised scene analysis, and anomaly detection in...

Discriminative Human Action Segmentation and Recognition using Semi-Markov Model (2008)

Shi, Qinfeng, Wang, Li, Cheng, Li, Smola, Alex

Given an input video sequence of one person conducting a sequence of continuous actions, we consider the problem of jointly segmenting and recognizing actions. We propose a discriminative approach to...

Tailoring Density Estimation via Reproducing Kernel Moment Matching (2008)

Song, Le, Zhang, Xinhua, Smola, Alex, Gretton, Arthur, Schölkopf, Bernhard

Moment matching is a popular means of parametric density estimation. We extend this technique to nonparametric estimation of mixture models. Our approach works by embedding distributions into a...

m (2008)

Bernhard Schölkopf, Karsten Borgwardt, Kenji Fukumizu, Arthur Gretton, Jiayuan Huang, Quoc Le, ...

An example of a kernel algorithm, revisited µ(X)

Hilbert Space Representations of Probability Distributions (2008)

Arthur Gretton, Kenji Fukumizu, Malte Rasch, Alex Smola, Le Song, ...

generated from the same distribution? • Kernel independence testing: given a sample of m pairs {(x1, y1),...,(xm, ym)}, are the random variables x and y independent? Kernels, feature maps A very...

BIOINFORMATICS doi:10.1093/bioinformatics/btm216 Gene selection via the BAHSIC family of algorithms (2008)

Le Song, Justin Bedo, Karsten M. Borgwardt, Arthur Gretton, Alex Smola

Motivation: Identifying significant genes among thousands of sequences on a microarray is a central challenge for cancer research in bioinformatics. The ultimate goal is to detect the genes that are...

1 (m)4 (2008)

Le Song, Alex Smola, Arthur Gretton, Karsten Borgwardt

Proof. Define the Pochhammer symbol as (m)_n = m!/(m − n)!.

Gene Selection via the BAHSIC Family of Algorithms (2008)

Le Song, Justin Bedo, Karsten M. Borgwardt, Arthur Gretton, Alex Smola

Motivation: Identifying significant genes among thousands of sequences on a microarray is a central challenge for cancer research in bioinformatics. The ultimate goal is to detect the genes that are...

Abstract (2008)

Le Song, Arthur Gretton, Alex Smola, Karsten Borgwardt

We introduce a framework of feature filtering for supervised learning. It employs the Hilbert-Schmidt Independence Criterion (HSIC) as a measure of dependence between data and labels. The key idea is...

A Scalable Modular Convex Solver for Regularized Risk Minimization (2008)

Choon Hui Teo, Quoc Le, Alex Smola

A wide variety of machine learning problems can be described as minimizing a regularized risk functional, with different algorithms using different notions of risk and different regularizers....

Colored Maximum Variance Unfolding (2008)

Le Song, Alex Smola, Karsten Borgwardt, Arthur Gretton

Maximum variance unfolding (MVU) is an effective heuristic for dimensionality reduction. It produces a low-dimensional representation of the data by maximizing the variance of their embeddings while...

MIT Press (2000) Support Vector Method for Novelty Detection (2008)

Bernhard Schölkopf, Robert Williamson, Alex Smola, John Shawe-Taylor, John Platt

Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a “simple” subset S of input space such that the probability that a test point...

Kernel PCA and De-Noising in Feature Spaces (2008)

Sebastian Mika, Bernhard Schölkopf, Alex Smola, Klaus-Robert Müller, Matthias Scholz, Gunnar Rätsch

Kernel PCA as a nonlinear feature extractor has proven powerful as a preprocessing step for classification algorithms. But it can also be considered as a natural generalization of linear principal...

Abstract (2008)

Olivier Chapelle, Quoc Le, Alex Smola

Most ranking algorithms, such as pairwise ranking, are based on the optimization of standard loss functions, but the quality measure to test web page rankers is often different. We present an...

Supervised Feature Selection via Dependence Estimation (2008)

Le Song, Alex Smola, Arthur Gretton, Karsten M. Borgwardt

We introduce a framework for filtering features that employs the Hilbert-Schmidt Independence Criterion (HSIC) as a measure of dependence between the features and the labels. The key idea is that...

Frame, Reproducing Kernel, Regularization and Learning, Journal of Machine Learning Research (2005), submitted 02/03, revised 11/04 (2008)

Alain Rakotomamonjy, Stéphane Canu, Alex Smola

This work deals with a method for building a Reproducing Kernel Hilbert Space (RKHS) from a Hilbert space with frame elements having special properties. Conditions on existence and a method of...

A Dependence Maximization View of Clustering (2008)

Le Song, Alex Smola, Arthur Gretton

We propose a family of clustering algorithms based on the maximization of dependence between the input variables and their cluster labels, as expressed by the Hilbert-Schmidt Independence Criterion...

Semi-Markov Models for Sequence Segmentation (2008)

Qinfeng Shi, Alex Smola

In this paper, we study the problem of automatically segmenting written text into paragraphs. This is inherently a sequence labeling problem; however, previous approaches ignore this dependency. We...

Joint Regularization (2008)

Karsten M. Borgwardt, Omri Guttman, Alex Smola

We present a principled method to combine kernels under joint regularization constraints. Central to our method is an extension of the representer theorem for handling multiple joint...

Estimating labels from label proportions (2008)

Quadrianto, Novi, Smola, Alex, Caetano, Tiberio, Le, Quoc

Consider the following problem: given sets of unlabeled observations, each set with known label proportions, predict the labels of another set of observations, also with known label proportions. This...

Constructing Descriptive and Discriminative Nonlinear Features: Rayleigh Coefficients in Kernel Feature Spaces (2007)

Sebastian Mika, Gunnar Rätsch, Jason Weston, Bernhard Schölkopf, Alex Smola, Klaus-Robert Müller

We incorporate prior knowledge to construct nonlinear algorithms for invariant feature extraction and discrimination. Employing a unified framework in terms of a nonlinearized variant of...

**Bell Labs AT&T Labs (2007)

Harris Drucker, Chris J. C. Burges, Linda Kaufman, Alex Smola, Vladimir Vapnik

A new regression technique based on Vapnik's concept of support vectors is introduced. We compare support vector regression (SVR) with a committee regression technique (bagging) based on...

kernlab – A Kernel Methods Package (2007)

Friedrich Leisch, Achim Zeileis (eds.), Alexandros Karatzoglou, Achim Zeileis, ...

Designing software for Support Vector Machines (SVM) and kernel methods in general poses an interesting design problem. Our aim is to provide one possible solution using R's object-oriented features....

Support Vector Machines and Kernel Algorithms (2007)

Alexander J. Smola, Bernhard Schölkopf

One of the fundamental problems of learning theory is the following: suppose we are given two classes of objects. We are then faced with a new object, and we have to assign it to one of the two...

Friedman, J. (2001). The Elements of Statistical (2007)

Ian T. Nabney, Anton Schwaighofer, Steve Gunn, Alex Smola

for helpful suggestions and enduring the β-version of this lecture series

Sparse Bayesian Learning and the Relevance Vector Machine (2007)

Alex Smola

This paper introduces a general Bayesian framework for obtaining sparse solutions to regression and classification tasks utilising models linear in the parameters. Although this framework is fully...

2 (2007)

Sebastian Mika, Jason Weston, Alex Smola

We incorporate prior knowledge to construct nonlinear algorithms for invariant feature extraction and discrimination. Employing a unified framework in terms of a nonlinear variant of the Rayleigh...

Kernel PCA Pattern Reconstruction via Approximate Pre-Images (2007)

Alex Smola, Gunnar Rätsch, Klaus-Robert Müller

Algorithms based on Mercer kernels construct their solutions in terms of expansions in a high-dimensional feature space F. Previous work has shown that all algorithms which can be formulated in terms...

A Hilbert Space Embedding for Distributions (2007)

Smola, Alex, Gretton, Arthur, Song, Le, Schölkopf, Bernhard

We describe a technique for comparing distributions without the need for density estimation as an intermediate step. Our approach relies on mapping the distributions into a reproducing kernel Hilbert...

Learning Graph Matching (2007)

Smola, Alex, Caetano, Tiberio, Cheng, Li, Le, Quoc

As a fundamental problem in pattern recognition, graph matching has found a variety of applications in the field of computer vision. In graph matching, patterns are modeled as graphs and pattern...

The Need for Open Source Software in Machine Learning (2007)

Sonnenburg, Sören, Braun, Mikio, Ong, Cheng Soon, Bengio, Samy, Bottou, Leon, Holmes, Geoffrey, ...

Open source tools have recently reached a level of maturity which makes them suitable for building large-scale real-world systems. At the same time, the field of machine learning has developed a...

A scalable modular convex solver for regularized risk minimization (2007)

Teo, Choon Hui, Smola, Alex, Vishwanathan, S V N

A wide variety of machine learning problems can be described as minimizing a regularized risk functional, with different algorithms using different notions of risk and different regularizers....

A Kernel Approach to Comparing Distributions (2007)

Gretton, Arthur, Borgwardt, Karsten M., Rasch, Malte, Schölkopf, Bernhard, Smola, Alex

We describe a technique for comparing distributions without the need for density estimation as an intermediate step. Our approach relies on mapping the distributions into a Reproducing Kernel Hilbert...

Gene selection via the BAHSIC family of algorithms (2007)

Song, Le, Bedo, Justin, Borgwardt, Karsten M., Gretton, Arthur, Smola, Alex

Motivation: Identifying significant genes among thousands of sequences on a microarray is a central challenge for cancer research in bioinformatics. The ultimate goal is to detect the genes that are...

Gene selection via the BAHSIC family of algorithms (2007)

Song, Le, Bedo, Justin, Borgwardt, Karsten, Gretton, Arthur, Smola, Alex

Motivation: Identifying significant genes among thousands of sequences on a microarray is a central challenge for cancer research in bioinformatics. The ultimate goal is to detect the genes that are...

Semi-Markov Models for Sequence Segmentation (2007)

Shi, Qinfeng, Altun, Yasemin, Smola, Alex, Vishwanathan, S V N

In this paper, we study the problem of automatically segmenting written text into paragraphs. This is inherently a sequence labeling problem; however, previous approaches ignore this dependency. We...

Supervised Feature Selection via Dependence Estimation (2007)

Song, Le, Smola, Alex, Gretton, Arthur, Borgwardt, Karsten, Bedo, Justin

We introduce a framework for filtering features that employs the Hilbert-Schmidt Independence Criterion (HSIC) as a measure of dependence between the features and the labels. The key idea is that...

A Kernel Method for the Two-Sample-Problem (2007)

Gretton, Arthur, Borgwardt, Karsten M., Rasch, Malte, Schölkopf, Bernhard, Smola, Alex

We propose two statistical tests to determine if two samples are from different distributions. Our test statistic is in both cases the distance between the means of the two samples mapped into a...

A Dependence Maximization View of Clustering (2007)

Song, Le, Smola, Alex, Gretton, Arthur, Borgwardt, Karsten

We propose a family of clustering algorithms based on the maximization of dependence between the input variables and their cluster labels, as expressed by the Hilbert-Schmidt Independence Criterion...

A Kernel Test of Statistical Dependence (2007)

Gretton, Arthur, Fukumizu, Kenji, Teo, Choon Hui, Song, Le, Schölkopf, Bernhard, Smola, Alex

Although kernel measures of independence have been widely applied in machine learning (notably in kernel ICA), there is as yet no method to determine whether they have detected statistically...

Colored Maximum Variance Unfolding (2007)

Song, Le, Smola, Alex, Borgwardt, Karsten, Gretton, Arthur

Maximum variance unfolding (MVU) is an effective heuristic for dimensionality reduction. It produces a low-dimensional representation of the data by maximizing the variance of their embeddings while...

Density estimation of Structured Outputs in RKHS (2007)

Altun, Y., Smola, Alex

In this paper we study the problem of estimating conditional probability distributions for structured output prediction tasks in Reproducing Kernel Hilbert Spaces. More specifically, we prove...

Kernel Methods in Machine Learning (2007)

Smola, Alex, Hofmann, T, Schölkopf, Bernhard

We review machine learning methods employing positive definite kernels. These methods formulate learning and estimation problems in a reproducing kernel Hilbert space (RKHS) of functions defined on...

Elefant User’s Manual Release 0.1 (2007)

Kishor Gawande, Christfried Webers, Alex Smola, Choon Hui Teo, Javen Qinfeng Shi, Julian McAuley, ...

The contents of this file are subject to the Mozilla Public License Version 1.1 (the "License"); you may not use this

Integrating structured biological data by Kernel Maximum Mean Discrepancy (2006)

Borgwardt, K, Gretton, Arthur, Rasch, M, Schölkopf, Bernhard, Smola, Alex

Motivation: Many problems in data integration in bioinformatics can be posed as one common question: Are two sets of observations generated by the same distribution? We propose a...

Integrating structured biological data by Kernel Maximum Mean Discrepancy (2006)

Borgwardt, Karsten M., Gretton, Arthur, Rasch, Malte, Kriegel, Hans-Peter, Schölkopf, Bernhard, Smola, Alex

Motivation: Many problems in data integration in bioinformatics can be posed as one common question: Are two sets of observations generated by the same distribution? We propose a kernel-based...

A Kernel Method for the Two-Sample-Problem (2006)

Gretton, Arthur, Borgwardt, K, Rasch, M, Schölkopf, Bernhard, Smola, Alex

We propose two statistical tests to determine if two samples are from different distributions. Our test statistic is in both cases the distance between the means of the two samples mapped into a...

Correcting Sample Selection Bias by Unlabeled Data (2006)

Huang, J., Smola, Alex, Gretton, Arthur, Borgwardt, K, Schölkopf, Bernhard

We consider the scenario where training and test data are drawn from different distributions, commonly referred to as sample selection bias. Most algorithms for this setting try to first...

Kernel extrapolation (2006)

Vishwanathan, S V N, Borgwardt, K. M., Guttman, Omri, Smola, Alex

We present a framework for efficient extrapolation of reduced rank approximations, graph kernels, and locally linear embeddings (LLE) to unseen data. We also present a principled method to combine...

Kernel methods and the exponential family (2006)

Canu, Stéphane, Smola, Alex

The success of Support Vector Machine (SVM) gave rise to the development of a new class of theoretically elegant learning machines which use a central concept of kernels and the associated...

Learning High-Order MRF Priors of Color Image (2006)

McAuley, J., Caetano, T., Smola, Alex, Franz, Matthias

In this paper, we use large neighborhood Markov random fields to learn rich prior models of color images. Our approach extends the monochromatic Fields of Experts model (Roth and Black, 2005) to...

Unifying Divergence Minimization and Statistical Inference via Convex Duality (2006)

Yasemin Altun, Alex Smola

In this paper we unify divergence minimization and statistical inference by means of convex duality. In the process of doing so, we prove that the dual of approximate maximum entropy...

Step Size Adaptation in Reproducing Kernel Hilbert Space (2005)

Vishwanathan, S V N, Schraudolph, Nicol, Smola, Alex

This paper presents an online Support Vector Machine (SVM) that uses the Stochastic Meta-Descent (SMD) algorithm to adapt its step size automatically. We formulate the online learning problem as a...

Binet-Cauchy Kernels on Dynamical Systems and its Application to the Analysis of Dynamic Scenes (2005)

Vishwanathan, S V N, Smola, Alex, Vidal, Rene

We derive a family of kernels on dynamical systems by applying the Binet-Cauchy theorem to trajectories of states. Our derivation provides a unifying framework for all kernels on dynamical systems...

Simple and SimplerSVM (2005)

Vishwanathan, S V N, Smola, Alex, Schraudolph, Nicol

We present a fast iterative support vector training algorithm for the quadratic hard margin formulation. Our algorithm works by incrementally changing a candidate support vector set using a locally...

Kernel Extrapolation (2005)

Vishwanathan, S V N, Borgwardt, Karsten, Guttman, Omri, Smola, Alex

We present a framework for efficient extrapolation of reduced rank approximations, graph kernels, and locally linear embeddings (LLE) to unseen data. We also present a principled method to combine...

Measuring Statistical Dependence with Hilbert-Schmidt Norms (2005)

Gretton, Arthur, Bousquet, Olivier, Smola, Alex, Schölkopf, Bernhard

We propose an independence criterion based on the eigenspectrum of covariance operators in reproducing kernel Hilbert spaces (RKHSs), consisting of an empirical estimate of the Hilbert-Schmidt norm...

Kernel methods and the exponential family (2005)

Canu, Stéphane, Smola, Alex

The success of Support Vector Machine (SVM) gave rise to the development of a new class of theoretically elegant learning machines which use a central concept of kernels and the associated...

Step Size-Adapted Online Support Vector Learning (2005)

Karatzoglou, Alexandros, Vishwanathan, S V N, Schraudolph, Nicol N., Smola, Alex

We present an online Support Vector Machine (SVM) that uses Stochastic Meta-Descent (SMD) to adapt its step size automatically. We formulate the online learning problem as a stochastic gradient...

Kernel Methods for Measuring Independence (2005)

Gretton, Arthur, Herbrich, Ralf, Smola, Alex, Bousquet, Olivier, Schölkopf, Bernhard

We introduce two new functionals, the constrained covariance and the kernel mutual information, to measure the degree of independence of random variables. These quantities are both based on the...

Heteroscedastic Gaussian Process Regression (2005)

Le, Quoc, Smola, Alex, Canu, Stéphane

This paper presents an algorithm to estimate simultaneously both mean and variance of a nonparametric regression problem. The key point is that we are able to estimate variance...

Kernel Constrained Covariance for Dependence Measurement (2005)

Gretton, Arthur, Smola, Alex, Bousquet, Olivier, Herbrich, Ralf, Belitski, Andrei, Augath, Mark, ...

We discuss reproducing kernel Hilbert space (RKHS)-based measures of statistical dependence, with emphasis on constrained covariance (COCO), a novel criterion to test dependence of random variables....

Kernel Methods for Missing Variables (2005)

Smola, Alex, Vishwanathan, S V N, Hofmann, Thomas

We present methods for dealing with missing variables in the context of Gaussian Processes and Support Vector Machines. This solves an important problem which has largely been ignored by kernel...

Kernel Based Dependence Detection in the Macaque Visual Cortex (2005)

Gretton, Arthur, Smola, Alex, Bousquet, Olivier, Herbrich, Ralf, Belitski, Andrei, Augath, Mark, ...

Tests to determine the dependence or independence of random variables are well established in statistical analysis: the mutual information (MI) is one example, while another approach (used in some...

Invariances in Classification : an efficient SVM implementation (2005)

Loosli, Gaëlle, Canu, Stéphane, Vishwanathan, S V N, Smola, Alex

Often, in pattern recognition, complementary knowledge is available. This could be useful to improve the performance of the recognition system. Part of this knowledge regards invariances, in...

Protein function prediction via graph kernels (2005)

Borgwardt, Karsten M., Ong, Cheng Soon, Schoenauer, Stefan, Vishwanathan, S V N, Smola, Alex, Kriegel, Hans-Peter

Motivation: Computational approaches to protein function prediction infer protein function by finding proteins with similar sequence, structure, surface clefts, chemical properties, amino acid...

Learning the Kernel with Hyperkernels (2005)

Ong, Cheng Soon, Smola, Alex, Williamson, Bob

This paper addresses the problem of choosing a kernel suitable for estimation with a Support Vector Machine, hence further automating machine learning. This goal is achieved by defining a Reproducing...

Large-Scale Multiclass Transduction (2005)

Gaertner, Thomas, Le, Quoc, Burton, Simon, Smola, Alex, Vishwanathan, S V N

We present a method for performing transductive inference on very large datasets. Our algorithm is based on multiclass Gaussian processes and is effective whenever the multiplication of the kernel...

Kernel methods for testing independence (2005)

Gretton, Arthur, Herbrich, Ralf, Smola, Alex, Bousquet, Olivier, Schölkopf, Bernhard

We introduce two new functionals, the constrained covariance and the kernel mutual information, to measure the degree of independence of random variables. These quantities are both based on the...

Heteroscedastic gaussian process regression (2005)

Le, Quoc, Smola, Alex, Canu, Stéphane

This paper presents an algorithm to estimate simultaneously both mean and variance of a nonparametric regression problem. The key point is that we are able to estimate variance locally unlike...

Measuring statistical dependence with hilbert-schmidt norms (2005)

Gretton, Arthur, Bousquet, Olivier, Smola, Alex, Schölkopf, Bernhard

We propose an independence criterion based on the eigenspectrum of covariance operators in reproducing kernel Hilbert spaces (RKHSs), consisting of an empirical estimate of the Hilbert-Schmidt norm...

Kernel methods for missing variables (2005)

Smola, Alex, Vishwanathan, S V N, Hofmann, Thomas

We present methods for dealing with missing variables in the context of Gaussian Processes and Support Vector Machines. This solves an important problem which has largely been ignored by kernel...

Kernel constrained covariance for dependence measurement (2005)

Gretton, Arthur, Smola, Alex, Bousquet, Olivier, Herbrich, Ralf, Belitski, Andreas, Augath, M, ...

We discuss reproducing kernel Hilbert space (RKHS)-based measures of statistical dependence, with emphasis on constrained covariance (COCO), a novel criterion to test dependence of random variables....

Protein function prediction via graph kernels (2005)

Borgwardt, Karsten, Ong, Cheng Soon, Schonauer, Stefan, Vishwanathan, S V N, Smola, Alex, Kriegel, Hans-Peter

Motivation: Computational approaches to protein function prediction infer protein function by finding proteins with similar sequence, structure, surface clefts, chemical properties, amino acid...

Large-scale multiclass transduction (2005)

Gaertner, Thomas, Le, Quoc, Burton, Simon, Smola, Alex, Vishwanathan, S V N

We present a method for performing transductive inference on very large datasets. Our algorithm is based on multiclass Gaussian processes and is effective whenever the multiplication of the kernel...

Kernel extrapolation (2005)

Karsten M. Borgwardt, Omri Guttman, Alex Smola

We present a framework for efficient extrapolation of reduced rank approximations, graph kernels, and locally linear embeddings (LLE) to unseen data. We also present a principled method to combine...

Kernel methods and the exponential family (2005)

Stéphane Canu, Alex Smola

The success of Support Vector Machine (SVM) gave rise to the development of a new class of theoretically elegant learning machines which use a central concept of kernels and the associated...

Kernel extrapolation (2005)

Omri Guttman, Alex Smola

We present a general framework for extrapolation of kernels, which can be used to extend graph kernels, reduced rank approximations, local linear embeddings (LLE), or the Fisher kernel to various...

Reproducing Kernel, Regularization and Learning (2005)

Alain Rakotomamonjy, Stéphane Canu, Alex Smola

This work deals with a method for building a reproducing kernel Hilbert space (RKHS) from a Hilbert space with frame elements having special properties. Conditions on existence and a method of...

Kernel methods and the exponential family (2005)

Stephane Canu, Alex Smola

The success of Support Vector Machine (SVM) gave rise to the development of a new class of theoretically elegant learning machines which use a central concept of kernels and the associated...

Binet-Cauchy Kernels (2004)

Vishwanathan, Vishy, Smola, Alex

We propose a family of kernels based on the Binet-Cauchy theorem and its extension to Fredholm operators. This includes as special cases all currently known kernels derived from the behavioral...

Mathematical Programming for Missing Data (2004)

Bhattacharyya, Chiranjib, Pannagadatta, K.S., Smola, Alex

We propose a mathematical programming method to deal with uncertainty in the observations of a classification problem. This means that we can deal with situations where instead of a sample $(\xb_i,...

Behaviour and Convergence of the Constrained Covariance (2004)

Gretton, Arthur, Smola, Alex, Bousquet, Olivier, Herbrich, Ralf, Schoelkopf, Bernhard, Logothetis, Nikos

We discuss reproducing kernel Hilbert space (RKHS)-based measures of statistical dependence, with emphasis on constrained covariance (COCO), a novel criterion to test dependence of random variables....

Fast kernels for string and tree matching (2004)

Vishwanathan, Vishy, Smola, Alex

In this chapter we present a new algorithm suitable for matching discrete objects such as strings and trees in linear time, thus obviating dynamic programming with quadratic time complexity. This...

Exponential Families for Conditional Random Fields (2004)

Altun, Yasemin, Smola, Alex, Hofmann, Thomas

Many real-world classification tasks involve the prediction of multiple, inter-dependent class labels. A prototypical case of this sort deals with sequences of observations for which a corresponding...

Gaussian Process Classification for Segmenting and Annotating Sequences (2004)

Altun, Yasemin, Hofmann, Thomas, Smola, Alex

Many real-world classification tasks involve the prediction of multiple, inter-dependent class labels. A prototypical case of this sort deals with sequences of observations for which a corresponding...

Exponential Families for Conditional Random Fields (2004)

Altun, Yasemin, Smola, Alex, Hofmann, Thomas

In this paper we define conditional random fields in Hilbert space and we show connections to Gaussian Process classification. More specifically, we prove decomposition results for undirected...

Une boîte à outils rapide et simple pour les SVM (2004)

Loosli, Gaëlle, Canu, Stéphane, Vishwanathan, S V N, Smola, Alex, Chattopadhyay, Manojit

Although SVMs (Support Vector Machines) are now considered among the best learning methods, they are still regarded as slow. Here we propose a Matlab toolbox that enables the use of SVMs in a fast and...

A fast and efficient toolbox for SVMs in Matlab (2004)

Loosli, Gaëlle, Canu, Stéphane, Rakotomamonjy, Alain, Smola, Alex, Vishwanathan, S V N

Technology: The presented Toolbox is an efficient implementation in MATLAB of the SimpleSVM algorithm. It provides the exact solution to the SVM problem. The extended version of the toolbox provides...

A Tutorial on Support Vector Regression (2004)

Smola, Alex, Schölkopf, Bernhard

In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV...

kernlab - An S4 package for kernel methods in R (2004)

Karatzoglou, Alexandros, Smola, Alex, Hornik, Kurt, Zeileis, Achim

kernlab is an extensible package for kernel-based machine learning methods in R. It takes advantage of R's new S4 object model and provides a framework for creating and using kernel-based algorithms....

Learning with Non-Positive Kernels (2004)

Ong, Cheng Soon, Mary, Xavier, Canu, Stéphane, Smola, Alex

What happens when we do not have a positive definite kernel? It turns out that we can still do learning, in what is now a Krein Space. This paper talks about regression with indefinite kernels.

Online learning with kernels (2004)

Kivinen, Jyrki, Smola, Alex, Williamson, Bob

Kernel-based algorithms such as support vector machines have achieved considerable success in various problems in batch setting, where all of the training data is available in advance. Support vector...

Binet-Cauchy kernels (2004)

Vishwanathan, S V N, Smola, Alex

We propose a family of kernels based on the Binet-Cauchy theorem and its extension to Fredholm operators. This includes as special cases all currently known kernels derived from the behavioral...

A second order cone programming formulation for classifying missing data (2004)

Bhattacharyya, Chiranjib, Pannagadatta, K.S., Smola, Alex

We propose a convex optimization based strategy to deal with uncertainty in the observations of a classification problem. We assume that instead of a sample $(x_i, y_i)$ a distribution over $(x_i, y_i)$...

Online learning with kernels (2004)

Kivinen, Jyrki, Smola, Alex, Williamson, Bob

Kernel-based algorithms such as support vector machines have achieved considerable success in various problems in batch setting, where all of the training data is available in advance. Support...

Fast kernels for string and tree matching (2004)

Vishwanathan, S V N, Smola, Alex

In this chapter we present a new algorithm suitable for matching discrete objects such as strings and trees in linear time, thus obviating dynamic programming with quadratic time complexity. This...

A tutorial on support vector regression (2004)

Smola, Alex, Schölkopf, Bernhard

In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for...

The kernel mutual information (2003)

Gretton, Arthur, Herbrich, Ralf, Smola, Alex

We introduce two new functions, the kernel covariance (KC) and the kernel mutual information (KMI), to measure the degree of independence of several continuous random variables. The former is...

The Kernel Mutual Information (2003)

Arthur Gretton, Ralf Herbrich, Alex Smola, Christophe Andrieu, Olivier Bousquet, Arnaud Doucet, ...

We introduce two new functions, the kernel covariance (KC) and the kernel mutual information (KMI), to measure the degree of independence of several continuous random variables. The former is...

Kernel Methods and Support Vector Machines (2003)

Bernhard Schölkopf, Alex Smola

Over the past ten years kernel methods such as Support Vector Machines and Gaussian Processes have become a staple for modern statistical estimation and machine learning. The groundwork...

The kernel mutual information (2003)

Arthur Gretton, Ralf Herbrich, Alex Smola

This version contains changes to the formatting and minor corrections to the background section, compared

Robust Ensemble Learning for Data Analysis (2000)

Gunnar Rätsch, Bernhard Schölkopf, Alex Smola, Sebastian Mika, Klaus-Robert Müller, ...

Classification tasks appear very often in data analysis and are important sub-tasks in Data Mining. AdaBoost and other Ensemble methods have successfully been applied to a number of classification...

Query Learning with Large Margin Classifiers (2000)

Colin Campbell, Nello Cristianini, Alex Smola

The active selection of instances can significantly improve the generalisation performance of a learning machine. Large margin classifiers such as Support Vector Machines classify data using the most...

Invariant Feature Extraction and Classification in Kernel Spaces (2000)

Sebastian Mika, Gunnar Rätsch, Jason Weston, Bernhard Schölkopf, Alex Smola, Klaus-Robert Müller

We incorporate prior knowledge to construct nonlinear algorithms for invariant feature extraction and discrimination. Employing a unified framework in terms of a nonlinear variant of the Rayleigh...

Natural regularization in SVMs (2000)

Nuria Oliver, Bernhard Schölkopf, Alex Smola

Recently the so-called Fisher kernel was proposed by [6] to construct discriminative kernel techniques by using generative models. We provide a regularization-theoretic analysis of this approach and...

Query Learning with Large Margin Classifiers (2000)

Colin Campbell, Nello Cristianini, Alex Smola

The active selection of instances can significantly improve the generalisation performance of a learning machine. Large margin classifiers such as support vector machines classify data using the most...

Natural Regularization in SVMs (2000)

Nuria Oliver, Bernhard Schölkopf, Alex Smola

We provide a regularization-theoretic analysis of a class of SV kernels (called natural kernels) based on generative models with density $p(x|\theta)$, such as the Fisher kernel proposed in [5]. In...

Kernel PCA and de-noising in feature spaces (1999)

Sebastian Mika, Bernhard Schölkopf, Alex Smola, Matthias Scholz, Gunnar Rätsch

Kernel PCA as a nonlinear feature extractor has proven powerful as a preprocessing step for classification algorithms. But it can also be considered as a natural generalization of linear principal...

Classification on Proximity Data with LP--Machines (1999)

Thore Graepel, Ralf Herbrich, Bernhard Schölkopf, Alex Smola, Peter Bartlett, ...

We provide a new linear program to deal with classification of data in the case of functions written in terms of pairwise proximities. This allows us to avoid the problems inherent in using feature...

SV Estimation of a Distribution's Support (1999)

Bernhard Schölkopf, Robert C. Williamson, Alex Smola, John Shawe-Taylor

Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a subset S of input space such that the probability that a test point drawn from P lies...

Shrinking the Tube: A New Support Vector Regression Algorithm (1999)

Bernhard Schölkopf, Peter Bartlett, Alex Smola, Robert Williamson

A new algorithm for Support Vector regression is described. For a priori chosen $\nu$, it automatically adjusts a flexible tube of minimal radius to the data such that at most a fraction $\nu$ of the data...

Kernel PCA and De-Noising in Feature Spaces (1999)

Sebastian Mika, Bernhard Schölkopf, Alex Smola, Klaus-Robert Müller, Matthias Scholz, ...

Kernel PCA as a nonlinear feature extractor has proven powerful as a preprocessing step for classification algorithms. But it can also be considered as a natural generalization of linear principal...

Classification on Proximity Data with LP--Machines (1999)

Thore Graepel, Ralf Herbrich, Bernhard Schölkopf, Alex Smola, Peter Bartlett, ...

We provide a new linear program to deal with classification of data in the case of data given in terms of pairwise proximities. This allows us to avoid the problems inherent in using feature spaces with...

Linear Programs for Automatic Accuracy Control in Regression (1999)

Alex Smola, Bernhard Schölkopf, Gunnar Rätsch

We have recently proposed a new approach to control the number of basis functions and the accuracy in Support Vector Machines. The latter is transferred to a linear programming setting, which...

Classification on Proximity Data with LP-Machines (1999)

Thore Graepel, Ralf Herbrich, Bernhard Schölkopf, Alex Smola, Peter Bartlett, Klaus-Robert Müller, ...

We provide a new linear program to deal with classification of data in the case of functions written in terms of pairwise proximities. This allows us to avoid the problems inherent in using feature...

Kernel PCA and de-noising in feature spaces (1999)

Sebastian Mika, Bernhard Schölkopf, Alex Smola, Klaus-Robert Müller, Matthias Scholz, Gunnar Rätsch

Kernel PCA as a nonlinear feature extractor has proven powerful as a preprocessing step for classification algorithms. But it can also be considered as a natural generalization of linear principal...

Convex Cost Functions for Support Vector Regression (1998)

Alex Smola, Bernhard Schölkopf, Klaus-Robert Müller

The concept of Support Vector Regression is extended to a more general class of convex cost functions. It is shown how the resulting convex constrained optimization problems can be efficiently solved...

Prior Knowledge in Support Vector Kernels (1998)

Bernhard Schölkopf, Patrice Simard, Alex Smola, Vladimir Vapnik

We explore methods for incorporating prior knowledge about a problem at hand in Support Vector learning machines. We show that both invariances under group transformations and prior knowledge about...

Support Vector Methods in Learning and Feature Extraction (1998)

Bernhard Schölkopf, Alex Smola, Klaus-Robert Müller, Chris Burges, Vladimir Vapnik

The last years have witnessed an increasing interest in Support Vector (SV) machines, which use Mercer kernels for efficiently performing computations in high-dimensional spaces. In pattern...

Support vector regression machines (1997)

Harris Drucker, Chris J. C. Burges, Linda Kaufman, Alex Smola, Vladimir Vapnik

A new regression technique based on Vapnik’s concept of support vectors is introduced. We compare support vector regression (SVR) with a committee regression technique (bagging) based on regression...

Support Vector Method for Function Approximation, Regression Estimation, and Signal Processing (1996)

Vladimir Vapnik, Steven E. Golowich, Alex Smola

The Support Vector (SV) method was recently proposed for estimating regressions, constructing multidimensional splines, and solving linear operator equations [Vapnik, 1995]. In this presentation we...