Genome sequence and rapid evolution of the rice pathogen Xanthomonas oryzae pv. oryzae PXO99A (2008)
Salzberg, Steven L, Sommer, Daniel D, Schatz, Michael C, Phillippy, Adam M, Rabinowicz, Pablo D, Tsuge, Seiji, ...
Background: Xanthomonas oryzae pv. oryzae causes bacterial blight of rice (Oryza sativa L.), a major disease that constrains production of this staple crop in many parts of the world. We...
Haas, Brian J, Salzberg, Steven L, Zhu, Wei, Pertea, Mihaela, Allen, Jonathan E, Orvis, Joshua, ...
EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports eukaryotic gene structures as a weighted consensus of all available evidence. EVM,...
Haas, Brian J., Salzberg, Steven L., Zhu, Wei, Pertea, Mihaela, Allen, Jonathan E., Orvis, Joshua, ...
EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports eukaryotic gene structures as a weighted consensus of all available evidence. EVM, when...
Pertea, Mihaela, Mount, Stephen M, Salzberg, Steven L
Background: Algorithmic approaches to splice site prediction have relied mainly on the consensus patterns found at the boundaries between protein coding and non-coding regions. However exonic...
Comprehensive DNA Signature Discovery and Validation (2007)
Adam M. Phillippy, Jacquline A. Mason, Kunmi Ayanbule, Daniel D. Sommer, Elisa Taviani, Anwar Huq, ...
DNA signatures are nucleotide sequences that can be used to detect the presence of an organism and to distinguish that organism from all other species. Here we describe Insignia, a new, comprehensive...
Comprehensive DNA Signature Discovery and Validation (2007)
Adam Phillippy, Jacquline A. Mason, Kunmi Ayanbule, Daniel D. Sommer, Elisa Taviani, Anwar Huq, ...
DNA signatures are nucleotide sequences that can be used to detect the presence of an organism and to distinguish that organism from all other species. Here we describe Insignia, a new, comprehensive...
Hawkeye: an interactive visual analytics tool for genome assemblies (2007)
Schatz, Michael C, Phillippy, Adam M, Shneiderman, Ben, Salzberg, Steven L
Genome sequencing remains an inexact science, and genome sequences can contain significant errors if they are not carefully examined. Hawkeye is our new visual analytics tool for genome...
Hawkeye: an interactive visual analytics tool for genome assemblies (2007)
Schatz, Michael C., Phillippy, Adam M., Shneiderman, Ben, Salzberg, Steven L.
Genome sequencing remains an inexact science, and genome sequences can contain significant errors if they are not carefully examined. Hawkeye is our new visual analytics tool for genome assemblies,...
Identifying bacterial genes and endosymbiont DNA with Glimmer (2007)
Delcher, Arthur L., Bratke, Kirsten A., Powers, Edwin C., Salzberg, Steven L.
Motivation: The Glimmer gene-finding software has been successfully used for finding genes in bacteria, archæa and viruses representing hundreds of species. We describe several major changes to the...
Minimus: a fast, lightweight genome assembler (2007)
Sommer, Daniel D, Delcher, Arthur L, Salzberg, Steven L, Pop, Mihai
Background: Genome assemblers have grown very large and complex in response to the need for algorithms to handle the challenges of large whole-genome sequencing projects. Many of the most...
Minimus: a fast, lightweight genome assembler (2007)
Sommer, Daniel D., Delcher, Arthur L., Salzberg, Steven L., Pop, Mihai
Background: Genome assemblers have grown very large and complex in response to the need for algorithms to handle the challenges of large whole-genome sequencing projects. Many of the most common uses...
Kingsford, Carleton L, Ayanbule, Kunmi, Salzberg, Steven L
Background: In many prokaryotes, transcription of DNA to RNA is terminated by a thymine-rich stretch of DNA following a hairpin loop. Detecting such Rho-independent transcription terminators...
Kingsford, Carleton L., Ayanbule, Kunmi, Salzberg, Steven L.
Background: In many prokaryotes, transcription of DNA to RNA is terminated by a thymine-rich stretch of DNA following a hairpin loop. Detecting such Rho-independent transcription terminators can shed...
Genome re-annotation: a wiki solution? (2007)
The annotation of most genomes becomes outdated over time, owing in part to our ever-improving knowledge of genomes and in part to improvements in bioinformatics software. Unfortunately,...
A unified model explaining the offsets of overlapping and near-overlapping prokaryotic genes. (2007)
Kingsford, Carl, Delcher, Arthur L., Salzberg, Steven L.
Overlapping genes are a common phenomenon. Among sequenced prokaryotes, more than 29% of all annotated genes overlap at least 1 of their 2 flanking genes. We present a unified model for the creation...
Macronuclear genome sequence of the ciliate Tetrahymena thermophila, a model eukaryote. (2006)
Eisen, Jonathan A, Coyne, Robert S, Wu, Martin, Wu, Dongying, Thiagarajan, Mathangi, Wortman, Jennifer R, ...
The ciliate Tetrahymena thermophila is a model organism for molecular and cellular biology. Like other ciliates, this species has separate germline and soma functions that are embodied by distinct...
Macronuclear Genome Sequence of the Ciliate Tetrahymena thermophila, a Model Eukaryote (2006)
Jonathan A. Eisen, Robert S. Coyne, Martin Wu, Dongying Wu, Mathangi Thiagarajan, Jennifer R. Wortman, ...
The ciliate Tetrahymena thermophila is a model organism for molecular and cellular biology. Like other ciliates, this species has separate germline and soma functions that are embodied by distinct...
A phylogenetic generalized hidden Markov model for predicting alternatively spliced exons (2006)
Allen, Jonathan E, Salzberg, Steven L
Background: An important challenge in eukaryotic gene prediction is accurate identification of alternatively spliced exons. Functional transcripts can go undetected in gene expression studies...
Allen, Jonathan E, Majoros, William H, Pertea, Mihaela, Salzberg, Steven L
Background: Predicting complete protein-coding genes in human DNA remains a significant challenge. Though a number of promising approaches have been investigated, an ideal suite of tools has...
Allen, Jonathan E., Majoros, William H., Pertea, Mihaela, Salzberg, Steven L.
Background: Predicting complete protein-coding genes in human DNA remains a significant challenge. Though a number of promising approaches have been investigated, an ideal suite of tools has yet to...
Macronuclear Genome Sequence of the Ciliate Tetrahymena thermophila, a Model Eukaryote (2006)
Eisen, Jonathan A., Coyne, Robert S., Wu, Martin, Wu, Dongying, Thiagarajan, Mathangi, Wortman, Jennifer R., ...
The ciliate Tetrahymena thermophila is a model organism for molecular and cellular biology. Like other ciliates, this species has separate germline and soma functions that are embodied by distinct...
Edward C. Holmes, Elodie Ghedin, Naomi Miller, Jill Taylor, Yiming Bao, Kirsten St. George, ...
Evolution of the flu virus is analyzed via genomic phylogeny; humans are found to provide a reservoir of antigenic variability implicit in flu adaptation and virulence.
Edward C. Holmes, Elodie Ghedin, Naomi Miller, Jill Taylor, Yiming Bao, Kirsten St. George, ...
Understanding the evolution of influenza A viruses in humans is important for surveillance and vaccine strain selection. We performed a phylogenetic analysis of 156 complete genomes of human H3N2...
An empirical analysis of training protocols for probabilistic gene finders (2005)
Majoros, William H, Salzberg, Steven L
No abstract available.
Correction: Serendipitous discovery of Wolbachia genomes in multiple Drosophila species (2005)
Salzberg, Steven L, Dunning Hotopp, Julie, Delcher, Arthur L, Pop, Mihai, Smith, Douglas R, Eisen, Michael B, ...
No abstract available.
Serendipitous discovery of Wolbachia genomes in multiple Drosophila species (2005)
Salzberg, Steven L, Dunning Hotopp, Julie, Delcher, Arthur L, Pop, Mihai, Smith, Douglas R, Eisen, Michael B, ...
Background: The Trace Archive is a repository for the raw, unanalyzed data generated by large-scale genome sequencing projects. The existence of this data offers scientists the possibility of...
Efficient decoding algorithms for generalized hidden Markov model gene finders (2005)
Majoros, William H, Pertea, Mihaela, Delcher, Arthur L, Salzberg, Steven L
Background: The Generalized Hidden Markov Model (GHMM) has proven a useful framework for the task of computational gene prediction in eukaryotic genomes, due to its flexibility and...
Efficient decoding algorithms for generalized hidden Markov model gene finders (2005)
Majoros, William H., Pertea, Mihaela, Delcher, Arthur L., Salzberg, Steven L.
Background: The Generalized Hidden Markov Model (GHMM) has proven a useful framework for the task of computational gene prediction in eukaryotic genomes, due to its flexibility and probabilistic...
Holmes, Edward C., Ghedin, Elodie, Miller, Naomi, Taylor, Jill, Bao, Yiming, St. George, Kirsten, ...
Understanding the evolution of influenza A viruses in humans is important for surveillance and vaccine strain selection. We performed a phylogenetic analysis of 156 complete genomes of human H3N2...
Serendipitous discovery of Wolbachia genomes in multiple Drosophila species (2005)
Salzberg, Steven L., Dunning Hotopp, Julie C., Delcher, Arthur L., Pop, Mihai, Smith, Douglas R, Eisen, Michael B., ...
Background: The Trace Archive is a repository for the raw, unanalyzed data generated by large-scale genome sequencing projects. The existence of this data offers scientists the possibility of...
An empirical analysis of training protocols for probabilistic gene finders (2004)
Majoros, William H, Salzberg, Steven L
Background: Generalized hidden Markov models (GHMMs) appear to be approaching acceptance as a de facto standard for state-of-the-art ab initio gene finding, as evidenced by the recent...
An empirical analysis of training protocols for probabilistic gene finders (2004)
Majoros, William H., Salzberg, Steven L.
Background: Generalized hidden Markov models (GHMMs) appear to be approaching acceptance as a de facto standard for state-of-the-art ab initio gene finding, as evidenced by the recent proliferation...
Naomi Ward, Øivind Larsen, James Sakwa, Live Bruseth, Hoda Khouri, A. Scott Durkin, ...
Methanotrophs are bacteria that use methane as a sole carbon source. The genome sequence of Methylococcus capsulatus deepens our understanding of methanotroph biology and its relationship to global...
Naomi Ward, Øivind Larsen, James Sakwa, Live Bruseth, Hoda Khouri, A. Scott Durkin, ...
Methanotrophs are ubiquitous bacteria that can use the greenhouse gas methane as a sole carbon and energy source for growth, thus playing major roles in global carbon cycles, and in particular,...
Ward, Naomi, Larsen, Øivind, Sakwa, James, Bruseth, Live, Khouri, Hoda, Durkin, A. Scott, ...
Methanotrophs are ubiquitous bacteria that can use the greenhouse gas methane as a sole carbon and energy source for growth, thus playing major roles in global carbon cycles, and in particular,...
The Genome Assembly Archive: A New Public Resource (2004)
Steven L. Salzberg, Deanna Church, Michael DiCuccio, Eugene Yaschenko, James Ostell
With the genome assembly archive, it is possible to examine the raw data that underlies the DNA sequence in any sequenced genome.
The Genome Assembly Archive: A New Public Resource (2004)
Salzberg, Steven L., Church, Deanna, DiCuccio, Michael, Yaschenko, Eugene, Ostell, James
Berman, Benjamin P, Pfeiffer, Barret D, Laverty, Todd R, Salzberg, Steven L, Rubin, Gerald M, Eisen, Michael B, ...
Background: The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of...
Berman, Benjamin P., Pfeiffer, Barret D., Laverty, Todd R., Salzberg, Steven L., Rubin, Gerald M., Eisen, Michael B., ...
Versatile and open software for comparing large genomes (2004)
Kurtz, Stefan, Phillippy, Adam, Delcher, Arthur L, Smoot, Michael, Shumway, Martin, Antonescu, Corina, ...
The newest version of MUMmer easily handles comparisons of large eukaryotic genomes at varying evolutionary distances, as demonstrated by applications to multiple genomes. Two new graphical...
Versatile and open software for comparing large genomes (2004)
Kurtz, Stefan, Phillippy, Adam, Delcher, Arthur L., Smoot, Michael, Shumway, Martin, Antonescu, Corina, ...
The newest version of MUMmer easily handles comparisons of large eukaryotic genomes at varying evolutionary distances, as demonstrated by applications to multiple genomes. Two new graphical viewing...
Arthur L. Delcher, Jane Carlton, Steven L. Salzberg
We describe a suffix-tree algorithm that can align the entire genome sequences of eukaryotic and prokaryotic organisms with minimal use of computer time and memory. The new system, MUMmer 2, runs...
Baris E. Suzek, Maria D. Ermolaeva, Mark Schreiber, Steven L. Salzberg
As the pace of genome sequencing has accelerated, the need for highly accurate gene prediction systems has grown. Computational systems for identifying genes in prokaryotic genomes have sensitivities...
Prediction of Transcription Terminators in Bacterial Genomes (2003)
Maria D. Ermolaeva, Hanif G. Khalak, Owen White, Hamilton O. Smith, Steven L. Salzberg
Introduction: Bacterial genomes are organized into units of expression that are bounded by sites where transcription of DNA into RNA is initiated and terminated. Regulation of gene expression is often...
analysis and refutation of the claim that bacterial genes were laterally transferred into the human genome.
Full-length messenger RNA sequences greatly improve genome annotation (2002)
Haas, Brian J, Volfovsky, Natalia, Town, Christopher D, Troukhan, Maxim, Alexandrov, Nickolai, Feldmann, Kenneth A, ...
Background: Annotation of eukaryotic genomes is a complex endeavor that requires the integration of evidence from multiple, often contradictory, sources. With the ever-increasing amount of...
Full-length messenger RNA sequences greatly improve genome annotation (2002)
Haas, Brian J, Volfovsky, Natalia, Town, Christopher D, Troukhan, Maxim, Alexandrov, Nickolai, Feldmann, Kenneth A, ...
Background: Annotation of eukaryotic genomes is a complex endeavor that requires the integration of evidence from multiple, often contradictory, sources. With the ever-increasing amount of genome...
Microbial gene identification using interpolated Markov models (2002)
Steven L. Salzberg, Arthur L. Delcher, Simon Kasif
This paper describes a new system, GLIMMER, for finding genes in microbial genomes. In a series of tests on Haemophilus influenzae, Helicobacter pylori and other complete microbial genomes, this...
Finding a Majority Among N Votes. (2002)
Fischer, Michael J., Salzberg, Steven L.
A commonly-used technique for fault-tolerant computing is to perform n redundant computations and then vote on the results, choosing on the majority value if one exists. We present an algorithm for...
Microbial gene identification using interpolated Markov models (2001)
This paper describes a new system, GLIMMER, for finding genes in microbial genomes. In a series of tests on Haemophilus influenzae, Helicobacter pylori and other complete microbial genomes, this...
A clustering method for repeat analysis in DNA sequences (2001)
Volfovsky, Natalia, Haas, Brian J, Salzberg, Steven L
Background: A computational system for analysis of the repetitive structure of genomic sequences is described. The method uses suffix trees to organize and search the input sequences; this...
A clustering method for repeat analysis in DNA sequences. (2001)
Volfovsky, Natalia, Haas, Brian J., Salzberg, Steven L.
Background: A computational system for analysis of the repetitive structure of genomic sequences is described. The method uses suffix trees to organize and search the input sequences; this data...
GeneSplicer: a new computational method for splice site prediction (2001)
Mihaela Pertea, Xiaoying Lin, Steven L. Salzberg
GeneSplicer is a new, flexible system for detecting splice sites in the genomic DNA of various eukaryotes. The system has been tested successfully using DNA from two reference organisms: the model...
Prediction of Operons in Microbial Genomes (2001)
Maria D. Ermolaeva, Owen White, Steven L. Salzberg
Operon structure is an important organization feature of bacterial genomes. Many sets of genes occur in the same order on multiple genomes; these conserved gene groupings represent candidate operons....
Evidence for Symmetric Chromosomal Inversions Around the Replication Origin in Bacteria (2000)
Background: Whole-genome comparisons can provide great insight into many aspects of biology. Until recently, however, comparisons were mainly possible only between distantly related species. Complete...
Evidence for symmetric chromosomal inversions around the replication origin in bacteria (2000)
Eisen, Jonathan A, Heidelberg, John F, White, Owen, Salzberg, Steven L
Background: Whole-genome comparisons can provide great insight into many aspects of biology. Until recently, however, comparisons were mainly possible only between distantly related species....
Evidence for symmetric chromosomal inversions around the replication origin in bacteria (2000)
Eisen, Jonathan A., Heidelberg, John F., White, Owen, Salzberg, Steven L.
Background: Whole-genome comparisons can provide great insight into many aspects of biology. Until recently, however, comparisons were mainly possible only between distantly related species. Complete...
An optimized protocol for analysis of EST sequences (2000)
Feng Liang, Ingeborg Holt, Geo Pertea, Svetlana Karamycheva, Steven L. Salzberg, John Quackenbush
The vast body of Expressed Sequence Tag (EST) data in the public databases provide an important resource for comparative and functional genomics studies and an invaluable tool for the annotation of...
Optimized Multiplex PCR: Efficiently Closing a Whole-Genome Shotgun Sequencing Project (2000)
Herve Tettelin, Diana Radune, Simon Kasif, Hoda Khouri, Steven L. Salzberg
Introduction: In the late stages of a whole-genome shotgun sequencing project, most DNA sequences will be assembled into large contiguous blocks, or contigs (Fraser and Fleischmann, 1997). As the...
On Comparing Classifiers: A Critique of Current Research and Methods (1999)
An important component of many data mining projects is finding a good classification algorithm, a process that requires very careful thought about experimental design. If not done very carefully,...
Improved microbial gene identification with GLIMMER (1999)
Arthur L. Delcher, Douglas Harmon, Steven L. Salzberg
The GLIMMER system for microbial gene identification finds ~97-98% of all genes in a genome when compared with published annotation. This paper reports on two new results: (i) significant technical...