Discrete profile comparison using information bottleneck (2009)
Gal Chechik, Robin Friedman, Eleazar Eskin
Sequence homologs are an important source of information about proteins. Amino acid profiles, representing the position-specific mutation probabilities found in profiles, are a richer encoding of...
Laplace Propagation (2008)
We present a novel method for approximate inference in Bayesian models and regularized risk functionals. It is based on the propagation of mean and variance derived from the Laplace approximation of...
Christina Leslie, Jason Weston, Eleazar Eskin, William Stafford Noble
We introduce a class of string kernels, called mismatch kernels, for use with support vector machines (SVMs) in a discriminative approach to the protein classification problem. These kernels measure...
A note on phasing long genomic regions using local haplotype predictions (2008)
Eleazar Eskin, Roded Sharan, Eran Halperin
Common approaches for haplotype inference from genotype data are targeted toward phasing short genomic regions. Longer regions are often tackled in a heuristic manner, due to the high computational...
Eleazar Eskin, Yoram Singer, William Stafford Noble
substitution matrices to estimate probability distributions for
Separation of overlapping subpopulations by mutual information (2008)
Identifying ancestral sequences is an important first step in understanding population history and dynamics. However, several interesting cases including human genetic variation feature highly...
Analysis of genetic variation in Ashkenazi Jews by high density SNP genotyping (2008)
Olshen, Adam B, Gold, Bert, Lohmueller, Kirk E, Struewing, Jeffery P, Satagopan, Jaya, Stefanov, Stefan A, ...
Background: Genetic isolates such as the Ashkenazi Jews (AJ) potentially offer advantages in mapping novel loci in whole genome disease association studies. To analyze patterns of genetic...
The availability of various types of genomic data provides an opportunity to incorporate this data as prior information in genetic association studies. This information includes knowledge of linkage...
Laplace Propagation (2007)
We present a novel method for approximate inference in Bayesian models and regularized risk functionals. It is based on the propagation of mean and variance derived from the Laplace approximation of...
Motivation: Recently, a new type of expression data is being collected which aims to measure the effect of genetic variation on gene expression in pathways. In these datasets, expression profiles are...
Alkes L. Price, Eleazar Eskin, Pavel A. Pevzner
Discrete profile comparison using information bottleneck (2006)
O'Rourke, Sean, Chechik, Gal, Friedman, Robin, Eskin, Eleazar
Sequence homologs are an important source of information about proteins. Amino acid profiles, representing the position-specific mutation probabilities found in profiles, are a richer...
A comparison of phasing algorithms for trios and unrelated individuals (2006)
Jonathan Marchini, David Cutler, Nick Patterson, Matthew Stephens, Eleazar Eskin, Eran Halperin, ...
Knowledge of haplotype phase is valuable for many analysis methods in the study of disease, population, and evolutionary genetics. Considerable research effort has been devoted to the development of...
The common approaches for haplotype inference from genotype data are targeted toward phasing short genomic regions. Longer regions are often tackled in a heuristic manner, due to the high...
Searching Genomes for Noncoding RNA Using FastR (2005)
Shaojie Zhang, Brian Haas, Eleazar Eskin, Vineet Bafna
The discovery of novel noncoding RNAs has been among the most exciting recent developments in biology. It has been hypothesized that there is, in fact, an abundance of functional noncoding...
A Comparative Evaluation of Two Algorithms for Windows Registry Anomaly Detection, volume 13 (2005)
Salvatore J. Stolfo, Frank Apap, Eleazar Eskin, Katherine Heller, Andrew Honig, Krysta Svore
We present a component anomaly detector for a host-based intrusion detection system (IDS) for Microsoft Windows. The core of the detector is a learning-based anomaly detection algorithm...
Inference and analysis of haplotypes from combined genotyping studies deposited in dbSNP (2005)
Zaitlen, Noah A., Kang, Hyun Min, Feolo, Michael L., Sherry, Stephen T., Halperin, Eran, Eskin, Eleazar
In the attempt to understand human variation and the genetic basis of complex disease, a tremendous number of single nucleotide polymorphisms (SNPs) have been discovered and deposited into NCBI's...
Haplotype reconstruction from genotype data using imperfect phylogeny (2004)
Critical to the understanding of the genetic basis for complex diseases is the modeling of human variation. Most of this variation can be characterized by single nucleotide polymorphisms (SNPs) which...
The homology kernel: a biologically motivated sequence embedding into Euclidean space (2004)
Part of the challenge of modeling protein sequences is their discrete nature. Many of the most powerful statistical and learning techniques are applicable to points in a Euclidean space...
Mismatch string kernels for discriminative protein classification (2004)
Christina Leslie, Eleazar Eskin, Adiel Cohen, Jason Weston, William Stafford Noble
Motivation: Classification of protein sequences into functional and structural families based on sequence homology is a central problem in computational biology. Discriminative supervised machine...
Mismatch string kernels for discriminative protein classification (2004)
Leslie, Christina, Eskin, Eleazar, Cohen, Adiel, Weston, Jason, Noble, William Stafford
Motivation: Classification of protein sequences into functional and structural families based on sequence homology is a central problem in computational biology. Discriminative supervised machine...
Haplotype reconstruction from genotype data using imperfect phylogeny (2004)
Halperin, Eran, Eskin, Eleazar
Critical to the understanding of the genetic basis for complex diseases is the modeling of human variation. Most of this variation can be characterized by single nucleotide polymorphisms (SNPs) which...
Mismatch string kernels for discriminative protein classification (2004)
Leslie, Christina S., Eskin, Eleazar, Cohen, Adiel, Weston, Jason, Noble, William Stafford
Motivation: Classification of protein sequences into functional and structural families based on sequence homology is a central problem in computational biology. Discriminative supervised machine...
Whole-genome analysis of Alu repeat elements reveals complex evolutionary history (2004)
Price, Alkes L., Eskin, Eleazar, Pevzner, Pavel A.
Alu repeats are the most abundant family of repeats in the human genome, with over 1 million copies comprising 10% of the genome. They have been implicated in human genetic disease and in the...
Haplotype reconstruction from genotype data using Imperfect Phylogeny (2004)
Halperin, Eran, Eskin, Eleazar
Critical to the understanding of the genetic basis for complex diseases is the modeling of human variation. Most of this variation can be characterized by single nucleotide polymorphisms (SNPs) which...
Christina S. Leslie, Eleazar Eskin, Adiel Cohen, Jason Weston, William Stafford Noble
Motivation: Classification of protein sequences into functional and structural families based on sequence homology is a central problem in computational biology. Discriminative supervised machine...
Large scale reconstruction of haplotypes from genotype data (2003)
Eleazar Eskin, Eran Halperin, Richard M. Karp
Critical to the understanding of the genetic basis for complex diseases is the modeling of human variation. Most of this variation can be characterized by single nucleotide polymorphisms (SNPs) which...
Sequence Motifs in Ranked Expression Data (2003)
The combination of gene expression data and genomic sequence data can be used to help discover putative transcription factor binding sites (TFBSs). There are two major approaches to incorporating...
Efficient reconstruction of haplotype structure via perfect phylogeny (2003)
Eleazar Eskin, Eran Halperin, Richard M. Karp
Each person’s genome contains two copies of each chromosome, one inherited from the father and the other from the mother. A person’s genotype specifies the pair of bases at each site, but does...
Sparse sequence modeling with applications to computational biology and intrusion detection (2002)
Sequence models have been studied for some time in different contexts including language parsing and analysis, genomics, and recently in computer security in the area of intrusion detection. Many of...
Detecting malicious software by monitoring anomalous windows registry accesses (2002)
Frank Apap, Andrew Honig, Shlomo Hershkop, Eleazar Eskin, Sal Stolfo
We present a host-based intrusion detection system (IDS) for Microsoft Windows. The core of the system is an algorithm that detects attacks on a host machine by looking for anomalous...
Eleazar Eskin, Andrew Arnold, Michael Prerau, Leonid Portnoy, Sal Stolfo
Most current intrusion detection systems employ signature-based methods or data mining-based methods which rely on labeled training data. This training data is typically expensive to...
Andrew Honig, Andrew Howard, Eleazar Eskin, Sal Stolfo
As sensitive information is increasingly being stored and manipulated on networked systems, the security of these networks and systems has become an extremely important issue....
Andrew Honig, Andrew Howard, Eleazar Eskin, Sal Stolfo
Data mining-based intrusion detection systems (IDSs) have significant advantages over signature-based IDSs since they are designed to generalize models of network audit data to detect new attacks....
Sparse Sequence Modeling with Applications to Computational Biology and Intrusion Detection (2002)
Sequence models have been studied for some time in different contexts including language parsing and analysis, genomics, and recently in computer security in the area of intrusion detection. Many of...
Eleazar Eskin, Andrew Arnold, Michael Prerau, Leonid Portnoy, Sal Stolfo
Most current intrusion detection systems employ signature-based methods or data mining-based methods which rely on labeled training data. This training data is typically expensive to produce. We...
Finding composite regulatory patterns in DNA sequences (2002)
Eleazar Eskin, Pavel A. Pevzner
Pattern discovery in unaligned DNA sequences is a fundamental problem in computational biology with important applications in finding regulatory signals. Current approaches to pattern discovery focus...
MET: An Experimental System for Malicious Email Tracking (2002)
Manasi Bhattacharyya, Matthew G. Schultz, Eleazar Eskin, Shlomo Hershkop, Salvatore J. Stolfo
Despite the use of state of the art methods to protect against malicious programs, they continue to threaten and damage computer systems around the world. In this paper we present MET, the Malicious...
Sparse sequence modeling with applications to computational biology and intrusion detection (2002)
Department: Computer Science.
Finding composite regulatory patterns in DNA sequences (2002)
Eskin, Eleazar, Pevzner, Pavel A.
Pattern discovery in unaligned DNA sequences is a fundamental problem in computational biology with important applications in finding regulatory signals. Current approaches to pattern discovery focus...
Modeling system calls for intrusion detection with dynamic window sizes (2001)
We extend prior research on system call anomaly detection modeling methods for intrusion detection by incorporating dynamic window sizes. The window size is the length of the subsequence of a system...
Real time data mining-based intrusion detection (2001)
Wenke Lee, Salvatore J. Stolfo, Philip K. Chan, Eleazar Eskin, Wei Fan, Matthew Miller, ...
Salvatore J. Stolfo, Wenke Lee, Philip K. Chan, Wei Fan, Eleazar Eskin
The field of Intrusion Detection has been an active area of research for some time. The goal of an Intrusion Detection System (IDS) is to provide another layer of defense against malicious (or
Data mining methods for detection of new malicious executables (2001)
Matthew G. Schultz, Eleazar Eskin, Erez Zadok, Salvatore J. Stolfo
A serious security threat today is malicious executables, especially new, unseen malicious executables. Many of these new malicious executables are undetectable by current anti-virus systems because...
Intrusion detection with unlabeled data using clustering (2001)
Leonid Portnoy, Eleazar Eskin, Sal Stolfo
Intrusions pose a serious security risk in a network environment. Although systems can be hardened against many types of intrusions, often intrusions are successful making systems for...
Data mining methods for detection of new malicious executables (2001)
Matthew G. Schultz, Eleazar Eskin
A serious security threat today is malicious executables, especially new, unseen malicious executables often arriving as email attachments. These new malicious executables are created at the rate of...
Eleazar Eskin, William Noble Grundy, Yoram Singer
discrete events in biological sequences
Malicious Email Filter - A UNIX Mail Filter that Detects Malicious Windows Executables (2001)
Matthew G. Schultz, Eleazar Eskin, Erez Zadok, Manasi Bhattacharyya, Salvatore J. Stolfo
We present Malicious Email Filter, MEF, a freely distributed malicious binary filter incorporated into Procmail that can detect malicious Windows attachments by integrating with a UNIX mail server....
Malicious Email Filter - A UNIX Mail Filter that Detects Malicious Windows Executables (2001)
Matthew G. Schultz, Eleazar Eskin, Salvatore J. Stolfo
We present Malicious Email Filter, MEF, a freely distributed malicious binary filter incorporated into Procmail that can detect malicious Windows attachments by integrating with a UNIX mail server....
Data mining methods for detection of new malicious executables (2001)
Matthew G. Schultz, Eleazar Eskin, Erez Zadok, Salvatore J. Stolfo
A serious security threat today is malicious executables, especially new, unseen malicious executables often arriving as email attachments. These new malicious executables are created at the rate of...
Modeling system calls for intrusion detection with dynamic window sizes (2001)
Eleazar Eskin, Wenke Lee, Salvatore J. Stolfo
We extend prior research on system call anomaly detection modeling methods for intrusion detection by incorporating dynamic window sizes. The window size is the length of the subsequence of a system...
MEF: Malicious Email Filter (2001)
Matthew G. Schultz, Eleazar Eskin
We present Malicious Email Filter, MEF, a freely distributed malicious binary filter incorporated into Procmail that can detect malicious Windows attachments by integrating with a UNIX mail server....
Malicious Email Filter - A UNIX Mail Filter that Detects Malicious Windows Executables (2001)
Matthew G. Schultz, Eleazar Eskin
We present Malicious Email Filter, MEF, a freely distributed malicious binary filter incorporated into Procmail that can detect malicious Windows attachments by integrating with a UNIX mail server. The...
MEF: Malicious Email Filter (2001)
Matthew G. Schultz, Eleazar Eskin, Erez Zadok, Manasi Bhattacharyya, Salvatore J. Stolfo
We present Malicious Email Filter, MEF, a freely distributed malicious binary filter incorporated into Procmail that can detect malicious Windows attachments by integrating with a UNIX mail server....
Real Time Data Mining-based Intrusion Detection (2001)
Wenke Lee, Salvatore J. Stolfo, Philip K. Chan, Eleazar Eskin, Wei Fan, Matthew Miller, ...
In this paper, we present an overview of our research in real time data mining-based intrusion detection systems (IDSs). We focus on issues related to deploying a data mining-based IDS in a real time...
Detecting Malicious Software by Monitoring Anomalous Windows Registry Accesses (2001)
Frank Apap, Andrew Honig, Shlomo Hershkop, Eleazar Eskin, Sal Stolfo
We present a host-based intrusion detection system (IDS) for Microsoft Windows. The core of the system is an algorithm that detects attacks on a host machine by looking for anomalous accesses to the...
Intrusion Detection with Unlabeled Data Using Clustering (2001)
Leonid Portnoy, Eleazar Eskin, Sal Stolfo
Intrusions pose a serious security risk in a network environment. Although systems can be hardened against many types of intrusions, often intrusions are successful making systems for detecting these...
Eleazar Eskin, William Noble, Yoram Singer
We present a method for classifying proteins into families based on short subsequences of amino acids using a new probabilistic model called sparse Markov transducers (SMT). We classify a protein by...
Eskin, Eleazar, Grundy, William N., Singer, Yoram
Accurately estimating probabilities from observations is important for probabilistic-based approaches to problems in computational biology. In this paper we present a biologically-motivated method...
Combining Strategies for Extracting Relations from Text Collections (2000)
Agichtein, Eugene, Eskin, Eleazar, Gravano, Luis
Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use for answering...
Protein family classification using sparse Markov transducers (2000)
Eleazar Eskin, William Stafford Noble, Yoram Singer
We present a method for classifying proteins into families based on short subsequences of amino acids using a new probabilistic model called sparse Markov transducers (SMT). We classify a protein by...
Combining Strategies for Extracting Relations from Text Collections (2000)
Eugene Agichtein, Eleazar Eskin, Luis Gravano
Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use for...
Anomaly detection over noisy data using learned probability distributions (2000)
Traditional anomaly detection techniques focus on detecting anomalies in new data after training on normal (or clean) data. In this paper we present a technique for detecting anomalies without...
Adaptive model generation for intrusion detection systems (2000)
Eleazar Eskin, Matthew Miller, Zhi-da Zhong, George Yi, Wei-ang Lee, Salvatore Stolfo
In this paper, we present adaptive model generation, a method for automatically building detection models for data-mining based intrusion detection systems. Using the same data collected by intrusion...
Combining Strategies for Extracting Relations from Text Collections (2000)
Eugene Agichtein, Eleazar Eskin, Luis Gravano
Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use for answering...
Combining Strategies for Extracting Relations from Text Collections (2000)
Eugene Agichtein, Eleazar Eskin, Luis Gravano
Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use for answering...
Protein Family Classification using Sparse Markov Transducers (2000)
Eleazar Eskin, William Noble Grundy, Yoram Singer
this paper we present a method for classifying
Towards multidocument summarization by reformulation: Progress and prospects (1999)
Kathleen R. McKeown, Judith L. Klavans, Vasileios Hatzivassiloglou, Regina Barzilay, Eleazar Eskin
By synthesizing information common to retrieved documents, multi-document summarization can help users of information retrieval systems to find relevant documents with a minimal amount of reading. We...
Vasileios Hatzivassiloglou, Eleazar Eskin
We present a new composite similarity metric that combines information from multiple linguistic indicators to measure semantic distance between pairs of small textual units. Several potential...
Genetic Programming Applied to Othello: Introducing Students to Machine Learning Research (1999)
In this paper we describe and analyze a three week assignment that was given in a Machine Learning course at Columbia University. The assignment presented students with an introduction to machine...
Towards Multidocument Summarization by Reformulation: Progress and Prospects (1999)
Kathleen R. McKeown, Judith L. Klavans, Vasileios Hatzivassiloglou, Regina Barzilay, Eleazar Eskin
By synthesizing information common to retrieved documents, multi-document summarization can help users of information retrieval systems to find relevant documents with a minimal amount of reading. We...
Towards Multidocument Summarization by Reformulation: Progress and Prospects (1999)
Kathleen McKeown, Judith L. Klavans, Vasileios Hatzivassiloglou, Regina Barzilay, Eleazar Eskin
By synthesizing information common to retrieved documents, multi-document summarization can help users of information retrieval systems to find relevant documents with a minimal amount of reading. We...
Vasileios Hatzivassiloglou, Judith L. Klavans, Eleazar Eskin
We present a new composite similarity metric that combines information from multiple linguistic indicators to measure semantic distance between pairs of small textual units. Several potential...
Whole-genome analysis of Alu repeat elements reveals complex evolutionary history
Price, Alkes L., Eskin, Eleazar, Pevzner, Pavel A.
Alu repeats are the most abundant family of repeats in the human genome, with over 1 million copies comprising 10% of the genome. They have been implicated in human genetic disease and in the...
Inference and analysis of haplotypes from combined genotyping studies deposited in dbSNP
Zaitlen, Noah A., Kang, Hyun Min, Feolo, Michael L., Sherry, Stephen T., Halperin, Eran, Eskin, Eleazar
In the attempt to understand human variation and the genetic basis of complex disease, a tremendous number of single nucleotide polymorphisms (SNPs) have been discovered and deposited into NCBI's...
A Comparison of Phasing Algorithms for Trios and Unrelated Individuals
Marchini, Jonathan, Cutler, David, Patterson, Nick, Stephens, Matthew, Eskin, Eleazar, Halperin, Eran, ...
Knowledge of haplotype phase is valuable for many analysis methods in the study of disease, population, and evolutionary genetics. Considerable research effort has been devoted to the development of...
Discrete profile comparison using information bottleneck
O'Rourke, Sean, Chechik, Gal, Friedman, Robin, Eskin, Eleazar
Sequence homologs are an important source of information about proteins. Amino acid profiles, representing the position-specific mutation probabilities found in profiles, are a richer encoding of...
Leveraging the HapMap Correlation Structure in Association Studies
Zaitlen, Noah, Kang, Hyun Min, Eskin, Eleazar, Halperin, Eran
Recent high-throughput genotyping technologies, such as the Affymetrix 500k array and the Illumina HumanHap 550 beadchip, have driven down the costs of association studies and have enabled the...
Analysis of genetic variation in Ashkenazi Jews by high density SNP genotyping
Olshen, Adam B, Gold, Bert, Lohmueller, Kirk E, Struewing, Jeffery P, Satagopan, Jaya, Stefanov, Stefan A, ...
Efficient Control of Population Structure in Model Organism Association Mapping
Kang, Hyun Min, Zaitlen, Noah A., Wade, Claire M., Kirby, Andrew, Heckerman, David, Daly, Mark J., ...
Genomewide association mapping in model organisms such as inbred mouse strains is a promising approach for the identification of risk factors related to human diseases. However, genetic association...
High-Resolution Mapping of Gene Expression Using Association in an Outbred Mouse Stock
Ghazalpour, Anatole, Doss, Sudheer, Kang, Hyun, Farber, Charles, Wen, Ping-Zi, Brozell, Alec, ...
Quantitative trait locus (QTL) analysis is a powerful tool for mapping genes for complex traits in mice, but its utility is limited by poor resolution. A promising mapping approach is association...
Dealing with large diagonals in kernel matrices
Jason Weston, Bernhard Schölkopf, Eleazar Eskin, Christina Leslie, William Noble
Kernel methods, Support Vector Machines, pattern recognition, bioinformatics, microarray data analysis, transduction, regularization,
The availability of various types of genomic data provides an opportunity to incorporate this data as prior information in genetic association studies. This information includes knowledge of linkage...
Kang, Hyun Min, Ye, Chun, Eskin, Eleazar
In genomewide mapping of expression quantitative trait loci (eQTL), it is widely believed that thousands of genes are trans-regulated by a small number of genomic regions called “regulatory...
Mismatch String Kernels for SVM Protein Classification
Christina Leslie, Eleazar Eskin, Jason Weston, William Stafford Noble
We introduce a class of string kernels, called mismatch kernels, for use with support vector machines (SVMs) in a discriminative approach to the protein classification problem. These kernels measure...
Data Mining Methods for Detection of New Malicious Executables
Matthew G. Schultz, Eleazar Eskin, Erez Zadok, Salvatore J. Stolfo
A serious security threat today is malicious executables, especially new, unseen malicious executables often arriving as email attachments. These new malicious executables are created at the rate of...
Ye, Chun, Galbraith, Simon J., Liao, James C., Eskin, Eleazar
Understanding the relationship between genetic variation and gene expression is a central question in genetics. With the availability of data from high-throughput technologies such as ChIP-Chip,...
Han, Buhm, Kang, Hyun Min, Eskin, Eleazar
With the development of high-throughput sequencing and genotyping technologies, the number of markers collected in genetic association studies is growing rapidly, increasing the importance of methods...