Reactome - a knowledgebase of human biological pathways (2009)
Bijay Jassal, Esther E. Schmidt, Guanming Wu, Imre Vastrik, David Croft, Bernard De Bono, ...
Reactome (http://www.reactome.org) is an expert-authored, peer-reviewed knowledgebase of human reactions and pathways that functions as a data mining resource and...
Correction: Reactome: a knowledge base of biologic pathways and processes (2009)
Vastrik, Imre, D'Eustachio, Peter, Schmidt, Esther, Gopinath, Gopal, Croft, David, De Bono, Bernard, ...
No abstract available.
Pruitt, Kim D., Harrow, Jennifer, Harte, Rachel A., Wallin, Craig, Diekhans, Mark, Maglott, Donna R., ...
Effective use of the human and mouse genomes requires reliable identification of genes and their products. Although multiple public resources provide annotation, different methods are used that can...
VectorBase: a data resource for invertebrate vector genomics (2009)
Lawson, Daniel, Arensburger, Peter, Atkinson, Peter, Besansky, Nora J., Bruggner, Robert V., Butler, Ryan, ...
VectorBase (http://www.vectorbase.org) is an NIAID-funded Bioinformatic Resource Center focused on invertebrate vectors of human pathogens. VectorBase annotates and curates vector genomes providing a...
Petabyte-scale innovations at the European Nucleotide Archive (2009)
Cochrane, Guy, Akhtar, Ruth, Bonfield, James, Bower, Lawrence, Demiralp, Fehmi, Faruque, Nadeem, ...
Dramatic increases in the throughput of nucleotide sequencing machines, and the promise of ever greater performance, have thrust bioinformatics into the era of petabyte-scale data sets. Sequence...
MAPU 2.0: high-accuracy proteomes mapped to genomes (2009)
Gnad, Florian, Oroshi, Mario, Birney, Ewan, Mann, Matthias
The MAPU 2.0 database contains proteomes of organelles, tissues and cell types measured by mass spectrometry (MS)-based proteomics. In contrast to other databases it is meant to contain a limited...
Reactome knowledgebase of human biological pathways and processes (2009)
Matthews, Lisa, Gopinath, Gopal, Gillespie, Marc, Caudy, Michael, Croft, David, De Bono, Bernard, ...
Reactome (http://www.reactome.org) is an expert-authored, peer-reviewed knowledgebase of human reactions and pathways that functions as a data mining resource and electronic textbook. Its current...
EnsemblCompara GeneTrees: Complete, duplication-aware phylogenetic trees in vertebrates (2009)
Vilella, Albert J., Severin, Jessica, Ureta-Vidal, Abel, Heng, Li, Durbin, Richard, Birney, Ewan
We have developed a comprehensive gene orientated phylogenetic resource, EnsemblCompara GeneTrees, based on a computational pipeline to handle clustering, multiple alignment, and tree generation,...
Paten, Benedict, Herrero, Javier, Beal, Kathryn, Birney, Ewan
Motivation: Multiple sequence alignment is a cornerstone of comparative genomics. Much work has been done to improve methods for this task, particularly for the alignment of small sequences, and...
Integrating biological data – the Distributed Annotation System (2008)
Jenkinson, Andrew M, Albrecht, Mario, Birney, Ewan, Blankenburg, Hagen, Down, Thomas, Finn, Robert D, ...
Background: The Distributed Annotation System (DAS) is a widely adopted protocol for dynamically integrating a wide range of biological data from geographically diverse sources. DAS continues...
software engineering, source (2008)
joined the EBI as one of the founding investigators for the Ensembl project. He is now a senior scientist at the EBI and runs both research and database projects. Michele Clamp joined the Sanger...
Paul J. Kersey, Jorge Duarte, Allyson Williams, Youla Karavidopoulou, Ewan Birney, Rolf Apweiler
Despite the complete determination of the genome sequence of several higher eukaryotes, their proteomes remain relatively poorly defined. Information about proteins identified by different...
Stefan Gräf, Stefan Kurtz, Martijn A. Huynen, Ewan Birney, Henk Stunnenberg, ...
Motivation: Recent advances in microarray technologies have made it feasible to interrogate whole genomes with tiling arrays and this technique is rapidly becoming one of the most important...
Gene finding in the chicken genome (2008)
Eduardo Eyras, Alexandre Reymond, Robert Castelo, Jacqueline M Bye, Francisco Camara, Paul Flicek, ...
This is an Open Access article distributed under the terms of the Creative Commons Attribution License
Genome annotation techniques: new approaches and challenges (2008)
Alistair G. Rust, Emmanuel Mongin, Ewan Birney
Genome analysis of the platypus reveals unique signatures of evolution (2008)
Warren, Wesley C., Hillier, LaDeana W., Marshall Graves, Jennifer A., Birney, Ewan, Ponting, Chris P., Grützner, Frank, ...
We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a...
Integrating biological data - the Distributed Annotation System (2008)
Jenkinson, Andrew M., Albrecht, Mario, Birney, Ewan, Blankenburg, Hagen, Down, Thomas, Finn, Robert D., ...
Velvet: Algorithms for de novo short read assembly using de Bruijn graphs (2008)
Zerbino, Daniel R., Birney, Ewan
We have developed a new set of algorithms, collectively called “Velvet,” to manipulate de Bruijn graphs for genomic sequence assembly. A de Bruijn graph is a compact representation based on short...
Cochrane, Guy, Akhtar, Ruth, Aldebert, Philippe, Althorpe, Nicola, Baldwin, Alastair, Bates, Kirsty, ...
The Ensembl Trace Archive (http://trace.ensembl.org/) and the EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/), known together as the European Nucleotide Archive, continue to see growth...
The HGNC Database in 2008: a resource for the human genome (2008)
Bruford, Elspeth A., Lush, Michael J., Wright, Mathew W., Sneddon, Tam P., Povey, Sue, Birney, Ewan
The HUGO Gene Nomenclature Committee (HGNC) aims to assign a unique and ideally meaningful name and symbol to every human gene. The HGNC database currently comprises over 24 000 public records...
O'Reilly, Paul F., Birney, Ewan, Balding, David J.
In recent years, there have been major developments of population genetics methods to estimate both rates of recombination and levels of natural selection. However, genomic variants subject to...
Enredo and Pecan: Genome-wide mammalian consistency-based multiple alignment with paralogs (2008)
Paten, Benedict, Herrero, Javier, Beal, Kathryn, Fitzgerald, Stephen, Birney, Ewan
Pairwise whole-genome alignment involves the creation of a homology map, capable of performing a near complete transformation of one genome into another. For multiple genomes this problem is...
Genome-wide nucleotide-level mammalian ancestor reconstruction (2008)
Paten, Benedict, Herrero, Javier, Fitzgerald, Stephen, Beal, Kathryn, Flicek, Paul, Holmes, Ian, ...
Recently attention has been turned to the problem of reconstructing complete ancestral sequences from large multiple alignments. Successful generation of these genome-wide reconstructions will...
Rakyan, Vardhman K., Down, Thomas A., Thorne, Natalie P., Flicek, Paul, Kulesha, Eugene, Gräf, Stefan, ...
We report a novel resource (methylation profiles of DNA, or mPod) for human genome-wide tissue-specific DNA methylation profiles. mPod consists of three fully integrated parts, genome-wide DNA...
Sean R. Eddy, Ewan Birney, Alex Bateman, Richard Durbin
Pfam contains multiple alignments and hidden Markov model based profiles (HMM-profiles) of complete protein domains. The definition of domain boundaries, family members, and alignment is done...
Pfam 3.1: 1313 multiple alignments and profile HMMs match the majority of proteins. (2007)
Alex Bateman, Ewan Birney, Richard Durbin, Sean R. Eddy, Robert D. Finn, Erik L. L. Sonnhammer
Pfam is a collection of multiple alignments and profile hidden Markov models of protein domain families. Release 3.1 is a major update of the Pfam database and contains 1313 families which are...
ProtEST: protein multiple sequence alignments from expressed sequence tags
James A. Cuff, Ewan Birney, Michele E. Clamp, Geoffrey J. Barton
The Bioperl Toolkit: Perl Modules for the Life Sciences (2007)
Jason E. Stajich, David Block, Kris Boulez, Steven E. Brenner, Lincoln D. Stein, Elia Stupka, ...
The Bioperl project is an international open-source collaboration of biologists, bioinformaticians, and computer scientists that has evolved over the past 7 yr into the most comprehensive library of...
Alex Bateman, Ewan Birney, Richard Durbin, Sean R. Eddy, Robert D. Finn
Pfam is a collection of multiple alignments and profile hidden Markov models of protein domain families. Release 3.1 is a major update of the Pfam database and contains 1313 families which are...
Alex Bateman, Ewan Birney, Richard Durbin, Sean R. Eddy
Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the WWW in the UK at
Optimising oligonucleotide array design for ChIP-on-chip (2007)
Nielsen, Fiona, Graef, Stefan, Zhang, Xinmin, Kurtz, Stefan, Denissov, Sergei, Green, Roland, ...
No abstract available.
Reactome - a knowledgebase of human biological pathways (2007)
Peter D'Eustachio, David Croft, Bernard De Bono, Gopal Gopinath, Marc Gillespie, Bijay Jassal, ...
Pathway curation is a powerful tool for systematically associating gene products with functions. Reactome (www.reactome.org) is a manually curated human pathway knowledgebase describing a wide range...
In Vivo Validation of a Computationally Predicted Conserved Ath5 Target Gene Set (2007)
Filippo Del Bene, Laurence Ettwiller, Dorota Skowronska-Krawczyk, Herwig Baier, Jean-Marc Matter, Ewan Birney, ...
So far, the computational identification of transcription factor binding sites is hampered by the complexity of vertebrate genomes. Here we present an in silico procedure to predict target sites of a...
In vivo Validation of a Computationally Predicted Conserved Ath5 Target Gene Set (2007)
Filippo Del Bene, Laurence Ettwiller, Dorota Skowronska-Krawczyk, Herwig Baier, Jean-Marc Matter, Ewan Birney, ...
So far the computational identification of transcription factor binding sites was hampered by the complexity of vertebrate genomes. Here we present an in silico procedure to predict target sites of a...
Optimized design and assessment of whole genome tiling arrays (2007)
Gräf, Stefan, Nielsen, Fiona G. G., Kurtz, Stefan, Huynen, Martijn A., Birney, Ewan, Stunnenberg, Henk, ...
Motivation: Recent advances in microarray technologies have made it feasible to interrogate whole genomes with tiling arrays and this technique is rapidly becoming one of the most important...
The landscape of histone modifications across 1% of the human genome in five human cell lines (2007)
Koch, Christoph M., Andrews, Robert M., Flicek, Paul, Dillon, Shane C., Karaöz, Ulas, Clelland, Gayle K., ...
We generated high-resolution maps of histone H3 lysine 9/14 acetylation (H3ac), histone H4 lysine 5/8/12/16 acetylation (H4ac), and histone H3 at lysine 4 mono-, di-, and trimethylation (H3K4me1,...
Margulies, Elliott H., Cooper, Gregory M., Asimenos, George, Thomas, Daryl J., Dewey, Colin N., Siepel, Adam, ...
A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation,...
Reactome: a knowledge base of biologic pathways and processes (2007)
Vastrik, Imre, D'Eustachio, Peter, Schmidt, Esther, Joshi-Tope, Geeta, Gopinath, Gopal, Croft, David, ...
Reactome (http://www.reactome.org), an online curated resource for human pathway data, provides infrastructure for computation across the biologic reaction network. We use Reactome to infer...
Update of the Anopheles gambiae PEST genome assembly (2007)
Sharakhova, Maria V, Hammond, Martin P, Lobo, Neil F, Krzywinski, Jaroslaw, Unger, Maria F, Hillenmeyer, Maureen E, ...
Background: The genome of Anopheles gambiae, the major vector of malaria, was sequenced and assembled in 2002. This initial genome assembly and analysis made available to the scientific...
Patterns of somatic mutation in human cancer genomes (2007)
Greenman, Christopher, Stephens, Philip, Smith, Raffaella, Dalgliesh, Gillian L., Hunter, Christopher, Bignell, Graham, ...
Cancers arise owing to mutations in a subset of genes that confer growth advantage. The availability of the human genome sequence led us to propose that systematic resequencing of cancer genomes for...
Reactome: An integrated expert model of human molecular processes and access toolkit (2007)
De Bono, Bernard, Vastrik, Imre, D'Eustachio, Peter, Schmidt, Esther, Gopinath, Gopal, Croft, David, ...
The behaviour of pervasive molecular processes in human biology can be studied through the large-scale modeling of the molecular events that define them. Constructing detailed models of such extent...
The implications of alternative splicing in the ENCODE protein complement (2007)
Tress, Michael L., Martelli, Pier L., Frankish, Adam, Reeves, Gabrielle A., Wesselink, Jan J., Yeats, Corin, ...
Alternative premessenger RNA splicing enables genes to generate more than one gene product. Splicing events that occur within protein coding regions have the potential to alter the biological...
Mirabeau, Olivier, Perlas, Emerald, Severini, Cinzia, Audero, Enrica, Gascuel, Olivier, Possenti, Roberta, ...
Peptide hormones are small, processed, and secreted peptides that signal via membrane receptors and play critical roles in normal and pathological physiology. The search for novel peptide hormones...
Genome browsing with Ensembl: a practical overview (2007)
Spudich, Giulietta, Fernández-Suárez, Xosé M., Birney, Ewan
A wealth of gene information is accruing in public databases. Genome browsers such as Ensembl are needed to organize and depict this information in the context of the genome. Ensembl provides an open...
Estimating the Neutral Rate of Nucleotide Substitution Using Introns (2007)
Hoffman, Michael M., Birney, Ewan
Evolutionary biologists frequently rely on estimates of the neutral rate of evolution when characterizing the selective pressure on protein-coding genes. We introduce a new method to estimate this...
VectorBase: a home for invertebrate vectors of human pathogens (2007)
Lawson, Daniel, Arensburger, Peter, Atkinson, Peter, Besansky, Nora J., Bruggner, Robert V., Butler, Ryan, ...
VectorBase (http://www.vectorbase.org/) is a web-accessible data repository for information about invertebrate vectors of human pathogens. VectorBase annotates and maintains vector genomes providing...
EGASP: the human ENCODE Genome Annotation Assessment Project (2006)
Guigó, Roderic, Flicek, Paul, Abril, Josep F, Reymond, Alexandre, Lagarde, Julien, Denoeud, France, ...
Background: We present the results of EGASP, a community experiment to assess the state-of-the-art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence....
EGASP: the human ENCODE Genome Annotation Assessment Project (2006)
Guigó Serra, Roderic, Flicek, Paul, Abril, Josep F., Reymond, Alexandre, Lagarde, Julien, Denoeud, France, ...
Background: Non-long terminal repeat (non-LTR) retrotransposons have contributed to shaping the structure and function of genomes. In silico and experimental approaches have been used to identify the...
VectorBase: a home for invertebrate vectors of human pathogens (2006)
Lawson, Daniel, Arensburger, Peter, Atkinson, Peter, Besansky, Nora J., Bruggner, Robert V., Butler, Ryan, ...
VectorBase (http://www.vectorbase.org/) is a web-accessible data repository for information about invertebrate vectors of human pathogens. VectorBase annotates and maintains vector genomes providing...
Ettwiller, Laurence, Paten, Benedict, Souren, Marcel, Loosli, Felix, Wittbrodt, Jochen, Birney, Ewan
We have developed several new methods to investigate transcriptional motifs in vertebrates. We developed a specific alignment tool appropriate for regions involved in transcription control,...
Gene finding in the chicken genome (2005)
Eyras, Eduardo, Reymond, Alexandre, Castelo, Robert, Bye, Jacqueline M, Camara, Francisco, Flicek, Paul, ...
Background: Despite the continuous production of genome sequence for a number of organisms, reliable, comprehensive, and cost effective gene prediction remains problematic. This is...
Automated generation of heuristics for biological sequence comparison (2005)
Slater, Guy St C, Birney, Ewan
Background: Exhaustive methods of sequence alignment are accurate but slow, whereas heuristic approaches run quickly, but their complexity makes them more difficult to implement. We introduce...
Hubbard, Simon J., Grafham, Darren V., Beattie, Kevin J., Overton, Ian M., McLaren, Stuart R., Croning, Michael D.R., ...
We present an analysis of the chicken (Gallus gallus) transcriptome based on the full insert sequences for 19,626 cDNAs, combined with 485,337 EST sequences. The cDNA data set has been functionally...
Guigo, Roderic, Birney, Ewan, Brent, Michael, Dermitzakis, Emmanouil, Pachter, Lior, Crollius, Hugues Roest, ...
With the sponsorship of "Fundacio La Caixa" we met in Barcelona, November 21st and 22nd, to analyze the reasons why, after the completion of the human genome sequence, the identification all...
Graphical representation of the human genome sequence across the 23 pairs of human chromosomes. This is companion material to vol. 431, no. 7011 of the journal Nature (21...
Ewan Birney, Michele Clamp, Richard Durbin
The Pfam protein families database (2004)
Alex Bateman, Ewan Birney, Lorenzo Cerruti, Richard Durbin, Laurence Etwiller, Sean R. Eddy, ...
Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the World Wide Web in the UK at
EnsMart: A Generic System for Fast and Flexible Access to Biological Data (2004)
Kasprzyk, Arek, Keefe, Damian, Smedley, Damian, London, Darin, Spooner, William, Melsopp, Craig, ...
The EnsMart system (www.ensembl.org/EnsMart) provides a generic data warehousing solution for fast and flexible querying of large biological data sets and integration with third-party data and tools....
Biological database design and implementation (2004)
We present our experience of building biological databases. Such databases have most aspects in common with other complex databases in other fields. We do not believe that biological data are that...
Birney, Ewan, Andrews, T. Daniel, Bevan, Paul, Caccamo, Mario, Chen, Yuan, Clarke, Laura, ...
Ensembl (http://www.ensembl.org/) is a bioinformatics project to organize biological information around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of...
Dermitzakis, Emmanouil T., Kirkness, Ewen, Schwarz, Scott, Birney, Ewan, Reymond, Alexandre, Antonarakis, Stylianos E.
The analysis of conservation between the human and mouse genomes resulted in the identification of a large number of conserved nongenic sequences (CNGs). The functional significance of this nongenic...
The Ensembl Core Software Libraries (2004)
Stabenau, Arne, McVicker, Graham, Melsopp, Craig, Proctor, Glenn, Clamp, Michele, Birney, Ewan
Systems for managing genomic data must store a vast quantity of information. Ensembl stores these data in several MySQL databases. The core software libraries provide a practical and effective means...
Sockeye: A 3D Environment for Comparative Genomics (2004)
Montgomery, Stephen B., Astakhova, Tamara, Bilenky, Mikhail, Birney, Ewan, Fu, Tony, Hassel, Maik, ...
Comparative genomics techniques are used in bioinformatics analyses to identify the structural and functional properties of DNA sequences. As the amount of available sequence data steadily increases,...
GeneWise and Genomewise (2004)
Birney, Ewan, Clamp, Michele, Durbin, Richard
We present two algorithms in this paper: GeneWise, which predicts gene structure using similar protein sequences, and Genomewise, which provides a gene structure final parse across cDNA- and...
The European Bioinformatics Institute's data resources (2003)
Brooksbank, Catherine, Camon, Evelyn, Harris, Midori A., Magrane, Michele, Martin, Maria Jesus, Mulder, Nicola, ...
As the amount of biological data grows, so does the need for biologists to store and access this information in central repositories in a free and unambiguous manner. The European Bioinformatics...
Discovering Novel cis-Regulatory Motifs Using Functional Networks (2003)
Ettwiller, Laurence M., Rung, Johan, Birney, Ewan
We combined functional information such as protein–protein interactions or metabolic networks with genome information in Saccharomyces cerevisiae to predict cis-regulatory motifs in the upstream...
DOI: 10.1093/nar/gkg066 The European Bioinformatics Institute’s data resources (2002)
Catherine Brooksbank, Evelyn Camon, Midori A. Harris, Michele Magrane, Maria Jesus Martin, Nicola Mulder, ...
As the amount of biological data grows, so does the need for biologists to store and access this information in central repositories in a free and unambiguous manner. The European Bioinformatics...
The Pfam Protein Families Database (2002)
Alex Bateman, Ewan Birney, Lorenzo Cerruti, Richard Durbin, Laurence Etwiller, Sean R. Eddy, ...
Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the World Wide Web in the UK at http://www.sanger.ac.uk/Software/Pfam/, in...
The Pfam Protein Families Database (2002)
Bateman, Alex, Birney, Ewan, Cerruti, Lorenzo, Durbin, Richard, Etwiller, Laurence, Eddy, Sean R., ...
Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the World Wide Web in the UK at...
The Bioperl Toolkit: Perl Modules for the Life Sciences (2002)
Stajich, Jason E., Block, David, Boulez, Kris, Brenner, Steven E., Chervitz, Stephen A., Dagdigian, Chris, ...
The Pfam Protein Families Database (2000)
Alex Bateman, Ewan Birney, Richard Durbin, Sean R. Eddy, Kevin L. Howe, Erik Sonnhammer, ...
Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the World Wide Web in the UK at http://www.sanger.ac.uk/Software/Pfam/, in...
ProtEST: protein multiple sequence alignments from expressed sequence tags (2000)
Cuff, James A., Birney, Ewan, Clamp, Michele E., Barton, Geoffrey J.
Motivation: An automatic sequence searching method (ProtEST) is described which constructs multiple protein sequence alignments from protein sequences and translated expressed sequence tags (ESTs)....
The Pfam Protein Families Database (2000)
Bateman, Alex, Birney, Ewan, Durbin, Richard, Eddy, Sean R., Howe, Kevin L., Sonnhammer, Erik L. L.
Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the WWW in the UK at...
Pfam: Multiple Sequence Alignments and HMM-Profiles of Protein Domains (1998)
Sean R. Eddy, Ewan Birney, Alex Bateman, Richard Durbin
Pfam contains multiple alignments and hidden Markov model based profiles (HMM-profiles) of complete protein domains. The definition of domain boundaries, family members and alignment is done...
Studies in Probabilistic Sequence Alignment and Evolution (1998)
Ian Holmes, Ewan Birney, Bill Bruno, Richard Durbin, Sean Eddy, David Haussler, ...
The complete sequencing of whole genomes presents opportunities for detailed study of molecular evolution. This thesis combines theoretical developments of Bayesian approaches in bioinformatics with...
Birney, Ewan, Kumar, Sanjay, Krainer, Adrian R.
We present a systematic analysis of sequence motifs found in metazoan protein factors involved in constitutive pre-mRNA splicing and in alternative splicing regulation. Using profile analysis we...
The Pfam Protein Families Database
Bateman, Alex, Birney, Ewan, Cerruti, Lorenzo, Durbin, Richard, Etwiller, Laurence, Eddy, Sean R., ...
Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the World Wide Web in the UK at http://www.sanger.ac.uk/Software/Pfam/, in...
The Pfam Protein Families Database
Bateman, Alex, Birney, Ewan, Durbin, Richard, Eddy, Sean R., Howe, Kevin L., Sonnhammer, Erik L. L.
Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the WWW in the UK at http://www.sanger.ac.uk/Software/Pfam/ , in Sweden at...
The European Bioinformatics Institute's data resources
Brooksbank, Catherine, Camon, Evelyn, Harris, Midori A., Magrane, Michele, Martin, Maria Jesus, Mulder, Nicola, ...
As the amount of biological data grows, so does the need for biologists to store and access this information in central repositories in a free and unambiguous manner. The European Bioinformatics...
The Bioperl Toolkit: Perl Modules for the Life Sciences
Stajich, Jason E., Block, David, Boulez, Kris, Brenner, Steven E., Chervitz, Stephen A., Dagdigian, Chris, ...
The Bioperl project is an international open-source collaboration of biologists, bioinformaticians, and computer scientists that has evolved over the past 7 yr into the most comprehensive library of...
Comparative Analysis of Noncoding Regions of 77 Orthologous Mouse and Human Gene Pairs
Jareborg, Niclas, Birney, Ewan, Durbin, Richard
A data set of 77 genomic mouse/human gene pairs has been compiled from the EMBL nucleotide database, and their corresponding features determined. This set was used to analyze the degree of...
Using GeneWise in the Drosophila Annotation Experiment
The GeneWise method for combining gene prediction and homology searches was applied to the 2.9-Mb region from Drosophila melanogaster. The results from the Genome Annotation Assessment Project (GASP)...
EnsMart: A Generic System for Fast and Flexible Access to Biological Data
Kasprzyk, Arek, Keefe, Damian, Smedley, Damian, London, Darin, Spooner, William, Melsopp, Craig, ...
The EnsMart system (www.ensembl.org/EnsMart) provides a generic data warehousing solution for fast and flexible querying of large biological data sets and integration with third-party data and tools....
Discovering Novel cis-Regulatory Motifs Using Functional Networks
Ettwiller, Laurence M., Rung, Johan, Birney, Ewan
We combined functional information such as protein–protein interactions or metabolic networks with genome information in Saccharomyces cerevisiae to predict cis-regulatory motifs in the upstream...
A survey of homozygous deletions in human cancer genomes
Cox, Charles, Bignell, Graham, Greenman, Chris, Stabenau, Arne, Warren, William, Stephens, Philip, ...
Homozygous deletions of recessive cancer genes and fragile sites are known to occur in human cancers. We identified 281 homozygous deletions in 636 cancer cell lines. Of these deletions, 86 were...
Gene finding in the chicken genome
Eyras, Eduardo, Reymond, Alexandre, Castelo, Robert, Bye, Jacqueline M, Camara, Francisco, Flicek, Paul, ...
Ettwiller, Laurence, Paten, Benedict, Souren, Marcel, Loosli, Felix, Wittbrodt, Jochen, Birney, Ewan
Several new methods and a specific alignment tool for investigating transcriptional motifs in vertebrates were used to identify all possible 12mers involved in transcription. Active instances of...
The Pfam Protein Families Database
Bateman, Alex, Birney, Ewan, Cerruti, Lorenzo, Durbin, Richard, Etwiller, Laurence, Eddy, Sean R., ...
Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the World Wide Web in the UK at http://www.sanger.ac.uk/Software/Pfam/, in...
The Pfam Protein Families Database
Bateman, Alex, Birney, Ewan, Durbin, Richard, Eddy, Sean R., Howe, Kevin L., Sonnhammer, Erik L. L.
Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the WWW in the UK at http://www.sanger.ac.uk/Software/Pfam/ , in Sweden at...
The European Bioinformatics Institute's data resources
Brooksbank, Catherine, Camon, Evelyn, Harris, Midori A., Magrane, Michele, Martin, Maria Jesus, Mulder, Nicola, ...
As the amount of biological data grows, so does the need for biologists to store and access this information in central repositories in a free and unambiguous manner. The European Bioinformatics...
The Bioperl Toolkit: Perl Modules for the Life Sciences
Stajich, Jason E., Block, David, Boulez, Kris, Brenner, Steven E., Chervitz, Stephen A., Dagdigian, Chris, ...
The Bioperl project is an international open-source collaboration of biologists, bioinformaticians, and computer scientists that has evolved over the past 7 yr into the most comprehensive library of...
Comparative Analysis of Noncoding Regions of 77 Orthologous Mouse and Human Gene Pairs
Jareborg, Niclas, Birney, Ewan, Durbin, Richard
A data set of 77 genomic mouse/human gene pairs has been compiled from the EMBL nucleotide database, and their corresponding features determined. This set was used to analyze the degree of...
Using GeneWise in the Drosophila Annotation Experiment
The GeneWise method for combining gene prediction and homology searches was applied to the 2.9-Mb region from Drosophila melanogaster. The results from the Genome Annotation Assessment Project (GASP)...
EnsMart: A Generic System for Fast and Flexible Access to Biological Data
Kasprzyk, Arek, Keefe, Damian, Smedley, Damian, London, Darin, Spooner, William, Melsopp, Craig, ...
The EnsMart system (www.ensembl.org/EnsMart) provides a generic data warehousing solution for fast and flexible querying of large biological data sets and integration with third-party data and tools....
Discovering Novel cis-Regulatory Motifs Using Functional Networks
Ettwiller, Laurence M., Rung, Johan, Birney, Ewan
We combined functional information such as protein–protein interactions or metabolic networks with genome information in Saccharomyces cerevisiae to predict cis-regulatory motifs in the upstream...
Dermitzakis, Emmanouil T., Kirkness, Ewen, Schwarz, Scott, Birney, Ewan, Reymond, Alexandre, Antonarakis, Stylianos E.
The analysis of conservation between the human and mouse genomes resulted in the identification of a large number of conserved nongenic sequences (CNGs). The functional significance of this nongenic...
Birney, Ewan, Andrews, T. Daniel, Bevan, Paul, Caccamo, Mario, Chen, Yuan, Clarke, Laura, ...
Ensembl (http://www.ensembl.org/) is a bioinformatics project to organize biological information around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of...
The Ensembl Core Software Libraries
Stabenau, Arne, McVicker, Graham, Melsopp, Craig, Proctor, Glenn, Clamp, Michele, Birney, Ewan
Systems for managing genomic data must store a vast quantity of information. Ensembl stores these data in several MySQL databases. The core software libraries provide a practical and effective means...
Sockeye: A 3D Environment for Comparative Genomics
Montgomery, Stephen B., Astakhova, Tamara, Bilenky, Mikhail, Birney, Ewan, Fu, Tony, Hassel, Maik, ...
Comparative genomics techniques are used in bioinformatics analyses to identify the structural and functional properties of DNA sequences. As the amount of available sequence data steadily increases,...
VectorBase: a home for invertebrate vectors of human pathogens
Lawson, Daniel, Arensburger, Peter, Atkinson, Peter, Besansky, Nora J., Bruggner, Robert V., Butler, Ryan, ...
VectorBase is a web-accessible data repository for information about invertebrate vectors of human pathogens. VectorBase annotates and maintains vector genomes providing an integrated resource for...
EGASP: the human ENCODE Genome Annotation Assessment Project
Guigó, Roderic, Flicek, Paul, Abril, Josep F, Reymond, Alexandre, Lagarde, Julien, Denoeud, France, ...
Update of the Anopheles gambiae PEST genome assembly
Sharakhova, Maria V, Hammond, Martin P, Lobo, Neil F, Krzywinski, Jaroslaw, Unger, Maria F, Hillenmeyer, Maureen E, ...
An update on the Anopheles gambiae PEST genome assembly places about 33% of previously unmapped sequences on the chromosomes.
Reactome: a knowledge base of biologic pathways and processes
Vastrik, Imre, D'Eustachio, Peter, Schmidt, Esther, Joshi-Tope, Geeta, Gopinath, Gopal, Croft, David, ...
Reactome, an online curated resource for human pathway data, can be used to infer equivalent reactions in non-human species and as a tool to aid in the interpretation of microarrays and other...
Identification of novel peptide hormones in the human proteome by hidden Markov model screening
Mirabeau, Olivier, Perlas, Emerald, Severini, Cinzia, Audero, Enrica, Gascuel, Olivier, Possenti, Roberta, ...
Peptide hormones are small, processed, and secreted peptides that signal via membrane receptors and play critical roles in normal and pathological physiology. The search for novel peptide hormones...
The implications of alternative splicing in the ENCODE protein complement
Tress, Michael L., Martelli, Pier Luigi, Frankish, Adam, Reeves, Gabrielle A., Wesselink, Jan Jaap, Yeats, Corin, ...
Alternative premessenger RNA splicing enables genes to generate more than one gene product. Splicing events that occur within protein coding regions have the potential to alter the biological...
In Vivo Validation of a Computationally Predicted Conserved Ath5 Target Gene Set
Del Bene, Filippo, Ettwiller, Laurence, Skowronska-Krawczyk, Dorota, Baier, Herwig, Matter, Jean-Marc, Birney, Ewan, ...
So far, the computational identification of transcription factor binding sites is hampered by the complexity of vertebrate genomes. Here we present an in silico procedure to predict target sites of a...
The landscape of histone modifications across 1% of the human genome in five human cell lines
Koch, Christoph M., Andrews, Robert M., Flicek, Paul, Dillon, Shane C., Karaöz, Ulaş, Clelland, Gayle K., ...
We generated high-resolution maps of histone H3 lysine 9/14 acetylation (H3ac), histone H4 lysine 5/8/12/16 acetylation (H4ac), and histone H3 at lysine 4 mono-, di-, and trimethylation (H3K4me1,...
Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome
Margulies, Elliott H., Cooper, Gregory M., Asimenos, George, Thomas, Daryl J., Dewey, Colin N., Siepel, Adam, ...
A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation,...
The HGNC Database in 2008: a resource for the human genome
Bruford, Elspeth A., Lush, Michael J., Wright, Mathew W., Sneddon, Tam P., Povey, Sue, Birney, Ewan
The HUGO Gene Nomenclature Committee (HGNC) aims to assign a unique and ideally meaningful name and symbol to every human gene. The HGNC database currently comprises over 24 000 public records...
Cochrane, Guy, Akhtar, Ruth, Aldebert, Philippe, Althorpe, Nicola, Baldwin, Alastair, Bates, Kirsty, ...
The Ensembl Trace Archive (http://trace.ensembl.org/) and the EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/), known together as the European Nucleotide Archive, continue to see growth...
Integrating biological data – the Distributed Annotation System
Jenkinson, Andrew M, Albrecht, Mario, Birney, Ewan, Blankenburg, Hagen, Down, Thomas, Finn, Robert D, ...
Velvet: Algorithms for de novo short read assembly using de Bruijn graphs
Zerbino, Daniel R., Birney, Ewan
We have developed a new set of algorithms, collectively called “Velvet,” to manipulate de Bruijn graphs for genomic sequence assembly. A de Bruijn graph is a compact representation based on short...
Confounding between recombination and selection, and the Ped/Pop method for detecting selection
O’Reilly, Paul F., Birney, Ewan, Balding, David J.
In recent years, there have been major developments of population genetics methods to estimate both rates of recombination and levels of natural selection. However, genomic variants subject to...
Rakyan, Vardhman K., Down, Thomas A., Thorne, Natalie P., Flicek, Paul, Kulesha, Eugene, Gräf, Stefan, ...
We report a novel resource (methylation profiles of DNA, or mPod) for human genome-wide tissue-specific DNA methylation profiles. mPod consists of three fully integrated parts, genome-wide DNA...
We have developed a code generating language, called Dynamite, specialised for the production and subsequent manipulation of complex dynamic programming methods for biological sequence comparison....
Genome-wide nucleotide-level mammalian ancestor reconstruction
Paten, Benedict, Herrero, Javier, Fitzgerald, Stephen, Beal, Kathryn, Flicek, Paul, Holmes, Ian, ...
Recently attention has been turned to the problem of reconstructing complete ancestral sequences from large multiple alignments. Successful generation of these genome-wide reconstructions will...
Enredo and Pecan: Genome-wide mammalian consistency-based multiple alignment with paralogs
Paten, Benedict, Herrero, Javier, Beal, Kathryn, Fitzgerald, Stephen, Birney, Ewan
Pairwise whole-genome alignment involves the creation of a homology map, capable of performing a near complete transformation of one genome into another. For multiple genomes this problem is...
Petabyte-scale innovations at the European Nucleotide Archive
Cochrane, Guy, Akhtar, Ruth, Bonfield, James, Bower, Lawrence, Demiralp, Fehmi, Faruque, Nadeem, ...
Dramatic increases in the throughput of nucleotide sequencing machines, and the promise of ever greater performance, have thrust bioinformatics into the era of petabyte-scale data sets. Sequence...
VectorBase: a data resource for invertebrate vector genomics
Lawson, Daniel, Arensburger, Peter, Atkinson, Peter, Besansky, Nora J., Bruggner, Robert V., Butler, Ryan, ...
VectorBase (http://www.vectorbase.org) is an NIAID-funded Bioinformatic Resource Center focused on invertebrate vectors of human pathogens. VectorBase annotates and curates vector genomes providing a...
MAPU 2.0: high-accuracy proteomes mapped to genomes
Gnad, Florian, Oroshi, Mario, Birney, Ewan, Mann, Matthias
The MAPU 2.0 database contains proteomes of organelles, tissues and cell types measured by mass spectrometry (MS)-based proteomics. In contrast to other databases it is meant to contain a limited...
Correction: Reactome: a knowledge base of biologic pathways and processes
Vastrik, Imre, D'Eustachio, Peter, Schmidt, Esther, Gopinath, Gopal, Croft, David, De Bono, Bernard, ...
Arabidopsis Reactome: A Foundation Knowledgebase for Plant Systems Biology
Tsesmetzis, Nicolas, Couchman, Matthew, Higgins, Janet, Smith, Alison, Doonan, John H., Seifert, Georg J., ...
EnsemblCompara GeneTrees: Complete, duplication-aware phylogenetic trees in vertebrates
Vilella, Albert J., Severin, Jessica, Ureta-Vidal, Abel, Heng, Li, Durbin, Richard, Birney, Ewan
We have developed a comprehensive gene orientated phylogenetic resource, EnsemblCompara GeneTrees, based on a computational pipeline to handle clustering, multiple alignment, and tree generation,...
Reactome knowledgebase of human biological pathways and processes
Matthews, Lisa, Gopinath, Gopal, Gillespie, Marc, Caudy, Michael, Croft, David, De Bono, Bernard, ...
Reactome (http://www.reactome.org) is an expert-authored, peer-reviewed knowledgebase of human reactions and pathways that functions as a data mining resource and electronic textbook. Its current...