Gill Bejerano, Craig Lowe, Nadav Ahituv, Bryan King, Adam Siepel, Sofie Salama, ...
An enhancer near ISL1 and an ultraconserved PCBP2 exon are
Ultraconserved Elements in the Human Genome (2008)
Gill Bejerano, Michael Pheasant, ...
... D. Tokarchick, A. Schell, for technical assistance. We are grateful to M. Epler and S. Tevethia for their generous gift of D b-NP 366-374 tetramers. This work was supported in part ... 056094-01 to C.C.N.
Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes (2008)
Adam Siepel, Gill Bejerano, Jakob S. Pedersen, Angie S. Hinrichs, Minmei Hou, Kate Rosenbloom, ...
We have conducted a comprehensive search for conserved elements in vertebrate genomes, using genome-wide multiple alignments of five vertebrate species (human, mouse, rat, chicken, and Fugu...
Summary: P-value computation is often used in bioinformatics to quantify the surprise, or significance, associated with a given observation. An implementation is provided that computes the exact...
Gill Bejerano, Mentor Prof, David Haussler
Employment & Welfare, and the private sector. 1993-94 Physics teacher. Mekif Gilo secondary school, Jerusalem. Part of an educational betterment project.
Dispensability of mammalian DNA (2008)
In the lab, the cis-regulatory network seems to exhibit great functional redundancy. Many experiments testing enhancer activity of neighboring cis-regulatory elements show largely overlapping...
Motivation: We present a method for modeling protein families by means of probabilistic suffix trees (PSTs). The method is based on identifying significant patterns in a set of related protein...
Efficient Exact p-Value Computation and Applications to Biosequence Analysis (2007)
Extended Abstract, Gill Bejerano
Like other fields of life sciences, bioinformatics has turned to capture biological phenomena through probabilistic models, and to analyse these models using statistical methodology. A central...
modeling and prediction of protein families (2007)
Motivation: We present a method for modeling protein families by means of probabilistic suffix trees (PSTs). The method is based on identifying significant patterns in a set of related protein...
Centro de Investigación sobre Fijación de Nitrógeno (2007)
Ruti Hershberg, Gill Bejerano, Alberto Santos-Zavaleta, Autónoma México
mRNA promoters with experimentally identified transcriptional start sites
Unsupervised Segmentation and Classification of Mixtures of Markovian Sources (2007)
Yevgeny Seldin, Gill Bejerano, Naftali Tishby
We describe a novel algorithm for unsupervised segmentation of sequences into alternating Variable Memory Markov sources, first presented in [SBT01]. The algorithm is based on competitive learning...
many of the small RNAs known to date were discovered fortuitously. (2007)
Liron Argaman, Ruth Hershberg, Jörg Vogel, Gill Bejerano, E. Gerhart, H. Wagner, ...
out diverse functions, and many of them are regulators of gene expression.
Alberto Apostolico, Gill Bejerano
Statistical modeling of sequences is a central paradigm of machine learning that finds multiple uses in computational molecular biology and many other domains. The probabilistic automata typically...
Markovian domain fingerprinting: statistical segmentation of protein sequences, Bioinformatics (2007)
Gill Bejerano, Yevgeny Seldin, Hanah Margalit, Naftali Tishby
Motivation: Characterization of a protein family by its distinct sequence domains is crucial for functional annotation and correct classification of newly discovered proteins. Conventional Multiple...
Human genome ultraconserved elements are ultraselected, Science 317 (2007)
Sol Katzman, Andrew Kern, Gill Bejerano, Ginger Fewell, Lucinda Fulton, Richard Wilson, ...
Unexpectedly long regions of extremely conserved DNA, known as ultraconserved elements, were first found by comparing the human, mouse and rat genomes 1. Most are non
Automated Function Prediction, 2006 (2006)
Christian Zmasek, Dana Weekes, Einat Sprinzak, Gill Bejerano, Jeffrey Chang, Lukasz Jaroszewski, ...
Welcome! On behalf of the Program Committee, the Scientific Committee and the organizers, we are happy to welcome you to San Diego for the second Automated Function Prediction conference, AFP 2006....
Branch and bound computation of exact p-values (2006)
Summary: P-value computation is often used in bioinformatics to quantify the surprise, or significance, associated with a given observation. An implementation is provided that computes the exact...
Computational screening of conserved genomic DNA in search of functional noncoding elements (2005)
Bejerano, Gill, Siepel, A C, Kent, W J, Haussler, D
The sequencing of the mouse genome allowed, for the first time, the large-scale estimation of the extent of sequence conservation within our own genome. In particular, it suggested that in mammals...
Glazov, Evgeny A., Pheasant, Michael, McGraw, Elizabeth A., Bejerano, Gill, Mattick, John S.
Recently, we identified a large number of ultraconserved (uc) sequences in noncoding regions of human, mouse, and rat genomes that appear to be essential for vertebrate and amniote ontogeny. Here, we...
Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes (2005)
Siepel, Adam, Bejerano, Gill, Pedersen, Jakob S., Hinrichs, Angie S., Hou, Minmei, Rosenbloom, Kate, ...
We have conducted a comprehensive search for conserved elements in vertebrate genomes, using genome-wide multiple alignments of five vertebrate species (human, mouse, rat, chicken, and Fugu...
Efficient exact p-value computation for small sample, sparse, and surprising categorical data (2004)
Gill Bejerano, Nir Friedman, Naftali Tishby
A major obstacle in applying various hypothesis testing procedures to datasets in bioinformatics is the computation of ensuing p-values. In this paper, we define a generic branch-and-bound approach to...
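As background for what an exact p-value means for small categorical samples, here is a minimal sketch. It is not the branch-and-bound method of the paper; it is plain brute-force enumeration of every possible outcome, which is the baseline such methods aim to speed up. The function names and the choice of ordering outcomes by their multinomial probability are assumptions made for illustration.

```python
from math import factorial, prod

def compositions(n, k):
    """Yield every way to split n observations among k categories."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest

def multinomial_prob(counts, probs):
    """Probability of observing `counts` under category probabilities `probs`."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # stays an integer at every step
    return coef * prod(p ** c for p, c in zip(probs, counts))

def exact_p_value(observed, probs):
    """Sum the probabilities of all outcomes no more likely than the observed one.

    Assumption: outcomes are ranked by probability; other test statistics
    (for example a likelihood ratio) would rank them differently.
    """
    p_obs = multinomial_prob(observed, probs)
    n, k = sum(observed), len(observed)
    tolerance = 1 + 1e-9  # guard against floating-point ties
    return sum(
        p
        for p in (multinomial_prob(c, probs) for c in compositions(n, k))
        if p <= p_obs * tolerance
    )
```

The number of outcomes grows quickly with the sample size and the number of categories, which is why full enumeration is only practical for small problems.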
Algorithms for variable length Markov chain modeling (2004)
Summary: We present a general purpose implementation of variable length Markov models. Contrary to fixed order Markov models, these models are not restricted to a predefined uniform depth. Rather, by...
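To illustrate the contrast with fixed order models, here is a minimal sketch of a variable length Markov predictor. It is not the implementation described in the entry above: the function names, the `min_count` threshold, and the rule "use the longest context seen often enough" are simplifying assumptions chosen for illustration.

```python
from collections import defaultdict

def train_vmm(sequence, max_depth=3):
    """Count next-symbol occurrences after every context of length 0..max_depth."""
    counts = defaultdict(lambda: defaultdict(int))
    for i, symbol in enumerate(sequence):
        for depth in range(max_depth + 1):
            if i - depth < 0:
                break
            context = sequence[i - depth:i]
            counts[context][symbol] += 1
    return counts

def predict(counts, history, max_depth=3, min_count=2):
    """Return next-symbol probabilities from the longest well-observed suffix of history."""
    for depth in range(min(max_depth, len(history)), -1, -1):
        context = history[len(history) - depth:]
        total = sum(counts[context].values()) if context in counts else 0
        if total >= min_count:
            return {s: c / total for s, c in counts[context].items()}
    return {}

model = train_vmm("abababab", max_depth=2)
print(predict(model, "ab", max_depth=2))  # context "ab" was always followed by "a"
print(predict(model, "zz", max_depth=2))  # unseen context falls back to the empty context
```

The context length used for each prediction depends on the data, so frequent long contexts are exploited while rare ones fall back to shorter suffixes.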
Into the heart of darkness: large-scale clustering of human non-coding DNA (2004)
Bejerano, Gill, Haussler, David, Blanchette, Mathieu
Motivation: It is currently believed that the human genome contains about twice as much non-coding functional regions as it does protein-coding genes, yet our understanding of these regions is very...
Discriminative Feature Selection via Multiclass Variable Memory Markov Model (2003)
Naftali Tishby, Shai Fine, Gill Bejerano, Noam Slonim
We propose a novel feature selection method based on a variable memory Markov (VMM) model. The VMM was originally proposed as a generative model trying to preserve the original source statistics from...
Discriminative Feature Selection via Multiclass Variable Memory Markov Model (2003)
Noam Slonim, Gill Bejerano, Shai Fine, Naftali Tishby
We propose a novel feature selection method based on a variable memory Markov (VMM) model. The VMM was originally proposed as a generative model trying to preserve the original source statistics from...
Discriminative feature selection via multiclass variable memory Markov model (2002)
Noam Slonim, Gill Bejerano, Shai Fine, Naftali Tishby
We propose a novel feature selection method based on a Variable Memory Markov model (VMM). The VMM was originally proposed as a generative model trying to preserve the original source statistics from...
A simple hyper-geometric approach for discovering putative transcription factor binding sites (2001)
Yoseph Barash, Gill Bejerano, Nir Friedman
Abstract. A central issue in molecular biology is understanding the regulatory mechanisms that control gene expression. The recent flood of genomic and post-genomic data opens the way for computational...
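The title of this entry refers to a hypergeometric approach. As general background only, the sketch below computes a hypergeometric tail probability, the standard way to ask whether a pattern occurs in a selected set of sequences more often than chance would suggest. The function name and the parameter names `N`, `K`, `n`, `k` are assumptions for illustration and are not taken from the paper.

```python
from math import comb

def hypergeom_tail(N, K, n, k):
    """P(X >= k), where X counts marked items among n draws without
    replacement from a population of N items, K of which are marked."""
    upper = min(K, n)
    total = comb(N, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total

# Hypothetical numbers: 1000 promoters, 50 contain a candidate motif,
# 20 promoters are selected, and 5 of the selected ones contain the motif.
print(hypergeom_tail(1000, 50, 20, 5))
```

A small tail probability suggests the motif is over-represented in the selected set; in practice a correction for testing many candidate motifs would also be needed.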
Unsupervised sequence segmentation by a mixture of switching variable memory Markov sources (2001)
Yevgeny Seldin, Gill Bejerano, Naftali Tishby
We present a novel information theoretic algorithm for unsupervised segmentation of sequences into alternating Variable Memory Markov sources. The algorithm is based on competitive learning between...
Motivation: We present a method for modeling protein families by means of probabilistic suffix trees (PSTs). The method is based on identifying significant patterns in a set of related protein sequences....
Gill Bejerano, Yevgeny Seldin, Hanah Margalit, Naftali Tishby
We present a novel information theoretic method for protein domain and statistical signature extraction. We apply a new algorithm [20] for unsupervised segmentation of sequences into alternating...
Markovian domain fingerprinting: statistical segmentation of protein sequences (2001)
Bejerano, Gill, Seldin, Yevgeny, Margalit, Hanah, Tishby, Naftali
Motivation: Characterization of a protein family by its distinct sequence domains is crucial for functional annotation and correct classification of newly discovered proteins. Conventional Multiple...
Hershberg, Ruti, Bejerano, Gill, Santos-Zavaleta, Alberto, Margalit, Hanah
PromEC is an updated compilation of Escherichia coli mRNA promoter sequences. It includes documentation on the location of experimentally identified mRNA transcriptional start sites on the E.coli...
Motivation: We present a method for modeling protein families by means of probabilistic suffix trees (PSTs). The method is based on identifying significant patterns in a set of related protein...
Alberto Apostolico, Gill Bejerano
Statistical modeling of sequences is a central paradigm of machine learning that finds multiple uses in computational molecular biology and many other domains. The probabilistic automata typically...
Modeling protein families using probabilistic suffix trees (1999)
We present a method for modeling protein families by means of probabilistic suffix trees (PSTs). The method is based on identifying significant patterns in a set of related protein sequences. The input...
Modeling Protein Families Using Probabilistic Suffix Trees (1999)
We present a method for modeling protein families by means of probabilistic suffix trees (PSTs). The method is based on identifying significant patterns in a set of related protein sequences. The...