Jiang Qian, Marisa Dolled-Filhart, Jimmy Lin, Haiyuan Yu, Mark Gerstein
The complexity of biological systems provides for a great diversity of relationships between genes. The current analysis of whole-genome expression data focuses on relationships based on global...
Sasidharan, Rajkumar, Agarwal, Ashish, Rozowsky, Joel, Gerstein, Mark
Corrected abstract We are correcting the abstract of our published article (1). The sentence that starts "We observe that 4.5% of MPSS tags...." was not scientifically complete in the original...
The relationship between the evolution of microRNA targets and the length of their UTRs (2009)
Cheng, Chao, Bhardwaj, Nitin, Gerstein, Mark
Abstract Background MicroRNAs (miRNAs) are endogenous small RNA molecules that modulate gene expression at the post-transcriptional level in many eukaryotic cells. Their widespread and important...
Cheng, Chao, Fu, Xuping, Alves, Pedro, Gerstein, Mark
Abstract Background Recent studies have shown that the regulatory effect of microRNAs can be investigated by examining expression changes of their target genes. Given this, it is useful to define an...
Yip, Kevin Y, Kim, Philip M, McDermott, Drew, Gerstein, Mark
Abstract Background Proteins interact through specific binding interfaces that contain many residues in domains. Protein interactions thus occur on three different levels of a concept hierarchy:...
Sasidharan, Rajkumar, Agarwal, Ashish, Rozowsky, Joel, Gerstein, Mark
Abstract Background There are two main technologies for transcriptome profiling, namely, tiling microarrays and high-throughput sequencing. Recently there has been a tremendous amount of excitement...
Cheng, Chao, Li, Lei M, Alves, Pedro, Gerstein, Mark
Abstract Background Aberrant activation or expression of transcription factors has been implicated in the tumorigenesis of various types of cancer. In spite of the prevalent application of microarray...
Dehydrogenase In Pathogens, Rajdeep Das, Mark Gerstein
ABSTRACT We have introduced a method to identify functional shifts in protein families. Our method is based on the calculation of an active-site conservation ratio, which we call the “ASC ratio.”...
Efficient yeast ChIP-Seq using multiplex short-read DNA sequencing (2009)
Lefrançois, Philippe, Euskirchen, Ghia M, Auerbach, Raymond K, Rozowsky, Joel, Gibson, Theodore, Yellman, Christopher M, ...
Abstract Background Short-read high-throughput DNA sequencing technologies provide new tools to answer biological questions. However, high cost and low throughput limit their widespread use,...
Comparative analysis of processed ribosomal protein pseudogenes in four mammalian genomes (2009)
Balasubramanian, Suganthi, Zheng, Deyou, Liu, Yuen-Jong, Fang, Gang, Frankish, Adam, Carriero, Nicholas, ...
Abstract Background The availability of genome sequences of numerous organisms allows comparative study of pseudogenes in syntenic regions. Conservation of pseudogenes suggests that they might have a...
Mishima, Yuichiro, Abreu-Goodger, Cei, Staton, Alison A., Stahlhut, Carlos, Shou, Chong, Cheng, Chao, ...
microRNAs (miRNAs) represent ∼4% of the genes in vertebrates, where they regulate deadenylation, translation, and decay of the target messenger RNAs (mRNAs). The integrated role of miRNAs to...
MAPK target networks in Arabidopsis thaliana revealed using functional protein microarrays (2009)
Popescu, Sorina C., Popescu, George V., Bachan, Shawn, Zhang, Zimei, Gerstein, Mark, Snyder, Michael, ...
Signaling through mitogen-activated protein kinase (MPK) cascades is a complex and fundamental process in eukaryotes, requiring MPK-activating kinases (MKKs) and MKK-activating kinases (MKKKs)....
MSB: A mean-shift-based approach for the analysis of structural variation in the genome (2009)
Wang, Lu-yong, Abyzov, Alexej, Korbel, Jan O., Snyder, Michael, Gerstein, Mark
Genome structural variation includes segmental duplications, deletions, and other rearrangements, and array-based comparative genomic hybridization (array-CGH) is a popular technology for determining...
Motivation: An important problem in systems biology is reconstructing complete networks of interactions between biological objects by extrapolating from a few known interactions as examples. While...
Seringhaus, Michael, Rozowsky, Joel, Royce, Thomas, Nagalakshmi, Ugrappa, Jee, Justin, Snyder, Michael, ...
Abstract Background Mismatched oligonucleotides are widely used on microarrays to differentiate specific from nonspecific hybridization. While many experiments rely on such oligos, the hybridization...
Dov Greenbaum, Nicholas M. Luscombe, Ronald Jansen, Jiang Qian, Mark Gerstein
With the completion of genome sequences, the current challenge for biology is to determine the functions of all gene products and to understand how they contribute in making an organism viable. For...
Inferring Protein-Protein Interactions Using Interaction Network Topologies (2008)
Alberto Paccanaro, Valery Trifonov, Haiyuan Yu, Mark Gerstein
[*These authors contributed equally to this work.] Abstract: We describe two novel methods for predicting protein interactions, using only the topology of an observed protein interaction...
BIOINFORMATICS ORIGINAL PAPER (2008)
Gene Expression, Jiang Du, Joel S. Rozowsky, Jan O. Korbel, Zhengdong D. Zhang, Thomas E. Royce, ...
doi:10.1093/bioinformatics/btl515 A supervised hidden Markov model framework for efficiently segmenting tiling array data in transcriptional and ChIP-chip experiments: systematically incorporating...
• Secondary Structure Review ◊ Handout from D Frishman (2008)
◊ this class (sec. str.) and next (packing)
Andrew Smith, Kei Cheung, Michael Krauthammer, Martin Schultz, Mark Gerstein
Motivation: Proteomics researchers need to be able to quickly retrieve relevant information from the web and the biomedical literature. To improve information retrieval, we leverage a graph of...
Methods Spectral Biclustering of Microarray Data: Coclustering Genes and Conditions (2008)
Yuval Kluger, Ronen Basri, Joseph T. Chang, Mark Gerstein
Global analyses of RNA expression levels are useful for classifying genes and overall phenotypes. Often these classification problems are linked, and one wants to find “marker genes” that are...
PseudoPipe: an automated pseudogene identification pipeline (2008)
Genome Analysis, Zhaolei Zhang, Nicholas Carriero, Deyou Zheng, John Karro, Paul M. Harrison, ...
doi:10.1093/bioinformatics/btl116
Ursula Lehnert, Eric Z. Yu, Mark Gerstein
Motivation: In many proteins, helix-helix interactions can be critical to establishing protein conformation (folding) and dynamics, as well as determining associations between protein units. However,...
BIOINFORMATICS ORIGINAL PAPER Systems biology (2008)
Haiyuan Yu, Alberto Paccanaro, Valery Trifonov, Mark Gerstein
Vol. 22 no. 7 2006, pages 823–829 doi:10.1093/bioinformatics/btl014 Predicting interactions in protein networks by completing defective cliques
Kevin Y. Yip, Haiyuan Yu, Philip M. Kim, Martin Schultz, Mark Gerstein
Biological processes involve complex networks of interactions between molecules. Various large-scale experiments and curation efforts have led to preliminary versions of complete cellular networks...
• Databases make program data persistent • RDBs turn formless data into a number of structured tables ◊ Ways of joining together tables to give various views of the data
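The point about joining tables to give different views of the same data can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table names, columns, and rows below are hypothetical and were invented for illustration; they do not come from the slides.

```python
import sqlite3

# Hypothetical schema and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE protein (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE motion (protein_id INTEGER, kind TEXT);
    INSERT INTO protein VALUES (1, 'calmodulin'), (2, 'lysozyme');
    INSERT INTO motion VALUES (1, 'hinge'), (2, 'shear'), (1, 'shear');
""")

# View 1: one row per (protein, motion) pair, obtained with an inner join.
pairs = conn.execute(
    "SELECT p.name, m.kind FROM protein p "
    "JOIN motion m ON m.protein_id = p.id "
    "ORDER BY p.name, m.kind"
).fetchall()

# View 2: the same join, aggregated into a count of motions per protein.
counts = conn.execute(
    "SELECT p.name, COUNT(*) FROM protein p "
    "JOIN motion m ON m.protein_id = p.id "
    "GROUP BY p.name ORDER BY p.name"
).fetchall()
conn.close()
```

Both results are derived from the same two stored tables; only the query changes, which is the sense in which joins provide multiple views of persistent data.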
Mark Gerstein, C Mark Gerstein, C Mark Gerstein, Monte Carlo
◊ Electrical non-bonded interactions ◊ bonded, fundamentally QM but treat as springs ◊ Sum up the energy
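The three bullet points above (electrostatic non-bonded terms, bonds treated as springs, and a summed total) can be sketched as a toy energy function. This is an illustrative sketch under stated assumptions, not the force field used in the lecture: the harmonic form k(r - r0)^2, the Coulomb constant in kcal·Å/(mol·e²), the exclusion of directly bonded pairs from the electrostatic sum, and all function names are assumptions made here.

```python
import math

# Assumed Coulomb constant in kcal*Angstrom/(mol*e^2), a common
# molecular-mechanics convention (not taken from the source).
COULOMB_K = 332.06

def bond_energy(r, r0, k):
    """Harmonic 'spring' term for one bond: k * (r - r0)^2 (assumed form)."""
    return k * (r - r0) ** 2

def coulomb_energy(q1, q2, r):
    """Electrostatic term for one pair of point charges at distance r."""
    return COULOMB_K * q1 * q2 / r

def total_energy(coords, charges, bonds):
    """Sum bonded spring terms and non-bonded electrostatic terms.

    coords  : list of (x, y, z) tuples in Angstrom
    charges : list of partial charges in units of e
    bonds   : list of (i, j, r0, k) tuples
    Directly bonded pairs are skipped in the electrostatic sum
    (a simplifying assumption of this sketch).
    """
    bonded_pairs = {frozenset((i, j)) for i, j, _, _ in bonds}
    energy = 0.0
    for i, j, r0, k in bonds:
        energy += bond_energy(math.dist(coords[i], coords[j]), r0, k)
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            if frozenset((i, j)) in bonded_pairs:
                continue
            r = math.dist(coords[i], coords[j])
            energy += coulomb_energy(charges[i], charges[j], r)
    return energy
```

For example, `total_energy([(0, 0, 0), (1.5, 0, 0), (3, 0, 0)], [0.5, 0.0, -0.5], [(0, 1, 1.0, 100.0)])` adds one stretched-bond term to the electrostatic attraction between the two charged end atoms.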
REVIEW DNA recognition code of transcription factors (2008)
Masashi Suzuki, Steven E. Brenner, Mark Gerstein, Naoto Yagi
To whom correspondence should be addressed. Key words: DNA binding/DNA-protein interaction/gene expression/molecular recognition
Selection and Characterization of Small Random (2008)
Transmembrane Proteins That, Ann M. Dixon, Jennifer B. Frank, Yu Xia, Lara Ely, ...
this article can be found at doi: 10.1016/j.jmb.2004.03.044 E-mail address of the corresponding author: daniel.dimaio@yale.edu Abbreviations used: PDGF, platelet-derived growth factor; CAT,...
Annotation Transfer Between Genomes: Protein-Protein Interologs and Protein-DNA Regulogs (2008)
Haiyuan Yu, Nicholas M. Luscombe, Hao Xin Lu, Xiaowei Zhu, ...
this paper, we present results from both approaches
Wu, Jia, Du, Jiang, Rozowsky, Joel, Zhang, Zhengdong, Urban, Alexander E, Euskirchen, Ghia, ...
Abstract Background Recent studies of the mammalian transcriptome have revealed a large number of additional transcribed regions and extraordinary complexity in transcript diversity. However, there...
Systematic evaluation of variability in ChIP-chip experiments using predefined DNA targets (2008)
Johnson, David S., Li, Wei, Gordon, D. Benjamin, Bhattacharjee, Arindam, Curry, Bo, Ghosh, Jayati, ...
The most widely used method for detecting genome-wide protein-DNA interactions is chromatin immunoprecipitation on tiling microarrays, commonly known as ChIP-chip. Here, we conducted the first...
An integrated system for studying residue coevolution in proteins (2008)
Yip, Kevin Y., Patel, Prianka, Kim, Philip M., Engelman, Donald M., McDermott, Drew, Gerstein, Mark
Residue coevolution has recently emerged as an important concept, especially in the context of protein structures. While a multitude of different functions for quantifying it have been proposed, not...
Analysis of Nuclear Receptor Pseudogenes in Vertebrates: How the Silent Tell Their Stories (2008)
Zhang, Zhengdong D., Cayting, Philip, Weinstock, George, Gerstein, Mark
Transcription factor pseudogenes have not been systematically studied before. Nuclear receptors (NRs) constitute one of the largest groups of transcription factors in animals (e.g., 48 NRs in human)....
Systematic evaluation of variability in ChIP-chip experiments using predefined DNA targets (2008)
Johnson, David S., Li, Wei, Gordon, D. Benjamin, Bhattacharjee, Arindam, Curry, Bo, Ghosh, Jayati, ...
The most widely used method for detecting genome-wide protein–DNA interactions is chromatin immunoprecipitation on tiling microarrays, commonly known as ChIP-chip. Here, we conducted the first...
Lian, Zheng, Karpikov, Alexander, Lian, Jin, Mahajan, Milind C., Hartman, Stephen, Gerstein, Mark, ...
Genomic analyses have been applied extensively to analyze the process of transcription initiation in mammalian cells, but less to transcript 3′ end formation and transcription termination. We used...
Mark Gerstein, Ronald Jansen, Ted Johnson, Jerry Tsai, Werner Krebs
We describe database approaches taken in our lab to the study of protein and nucleic acid motions. We have developed a database of macromolecular motions, which is accessible on the World Wide Web...
REVIEW DNA recognition code of transcription factors (2007)
Masashi Suzuki, Steven E. Brenner, Mark Gerstein, Naoto Yagi
To whom correspondence should be addressed. Key words: DNA binding/DNA-protein interaction/gene expression/molecular recognition
Paul Bertone, Bhaskar Dasgupta, Mark Gerstein, Ming-yang Kao, Michael Snyder
A preliminary version of this paper appeared in the 2nd Workshop on Algorithms in Bioinformatics,
The Morph Server and the Macromolecular Motions Database: a standardized system for analyzing and visualizing macromolecular motions in a database framework (2007)
Letter Relating Whole-Genome Expression Data with Protein-Protein Interactions (2007)
Ronald Jansen, Dov Greenbaum, Mark Gerstein
We investigate the relationship of protein-protein interactions with mRNA expression levels, by integrating a variety of data sources for yeast. We focus on known protein complexes that have clearly...
Subject classification: Proteins Figures for Average Core Structures (Revised) A variety of methods are currently available for creating multiple alignments, and these can be used to define and...
Calculated from Simulation, using Voronoi Polyhedra (2007)
Mark Gerstein, Jerry Tsai, Michael Levitt
The protein surface is of great interest since proteins recognize other molecules and perform their functions through their surfaces. Central to understanding the protein surface is understanding
Fast Optimal Genome Tiling with Applications to Microarray Design and Homology Search (2007)
Piotr Berman, Paul Bertone, Bhaskar Dasgupta, Mark Gerstein, Ming-yang Kao, Michael Snyder
In this paper we consider several variations of the following basic tiling problem: given a sequence of real numbers with two size bound parameters, we want to find a set of tiles such that they...
Yu, Haiyuan, Jansen, Ronald, Stolovitzky, Gustavo, Gerstein, Mark
Motivation: Many classifications of protein function such as Gene Ontology (GO) are organized in directed acyclic graph (DAG) structures. In these classifications, the proteins are terminal leaf...
PARE: A tool for comparing protein abundance and mRNA expression data (2007)
Yu, Eric Z, Burba, Anne, Gerstein, Mark
Abstract Background Techniques for measuring protein abundance are rapidly advancing and we are now in a situation where we anticipate many protein abundance data sets will be available in the near...
Tilescope: online analysis pipeline for high-density tiling microarray data (2007)
Zhang, Zhengdong D, Rozowsky, Joel, Lam, Hugo YK, Du, Jiang, Snyder, Michael, Gerstein, Mark
Abstract We developed Tilescope, a fully integrated data processing pipeline for analyzing high-density tiling-array data (http://tilescope.gersteinlab.org). In a completely automated fashion,...
Getting connected: analysis and principles of biological networks (2007)
Zhu, Xiaowei, Gerstein, Mark, Snyder, Michael
The execution of complex biological processes requires the precise interaction and regulation of thousands of molecules. Systematic approaches to study large numbers of proteins, metabolites, and...
Haiyuan Yu, Philip M. Kim, Emmett Sprecher, Valery Trifonov, Mark Gerstein
It has been a long-standing goal in systems biology to find relations between the topological properties and functional features of protein networks. However, most of the focus in network studies has...
Zhang, Zhaolei, Pang, Andy, Gerstein, Mark
Abstract Background Widespread transcription activities in the human genome were recently observed in high-resolution tiling array experiments, which revealed many novel transcripts that are outside...
Haiyuan Yu, Philip M. Kim, Emmett Sprecher, Valery Trifonov, Mark Gerstein
It has been a long-standing goal in systems biology to find relations between the topological properties and functional features of protein networks. However, most of the focus in network studies has...
Tilescope: online analysis pipeline for high-density tiling microarray data (2007)
Zhengdong D. Zhang, Joel Rozowsky, Jiang Du, Michael Snyder, Mark Gerstein
Running title: microarray data analysis pipeline Key words: high-density tiling microarray, high-density oligonucleotide microarray, microarray data analysis For test data sets, sample result web...
Smith, Michael G., Gianoulis, Tara A., Pukatzki, Stefan, Mekalanos, John J., Ornston, L. Nicholas, Gerstein, Mark, ...
Acinetobacter baumannii has emerged as an important and problematic human pathogen as it is the causative agent of several types of infections including pneumonia, meningitis, septicemia, and urinary...
Yu, Haiyuan, Nguyen, Katherine, Royce, Tom, Qian, Jiang, Nelson, Kenneth, Snyder, Michael, ...
Microarray technology is currently one of the most widely used technologies in biology. Many studies focus on inferring the function of an unknown gene from its co-expressed genes. Here, we are able...
Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation (2007)
Karro, John E., Yan, Yangpan, Zheng, Deyou, Zhang, Zhaolei, Carriero, Nicholas, Cayting, Philip, ...
The Pseudogene.org knowledgebase serves as a comprehensive repository for pseudogene annotation. The definition of a pseudogene varies within the literature, resulting in significantly different...
BMC Bioinformatics, Eric Z Yu, Mark Gerstein, Eric Z Yu, ...
Software PARE: A tool for comparing protein abundance and mRNA expression data
ProCAT: a data analysis approach for protein microarrays (2006)
Zhu, Xiaowei, Gerstein, Mark, Snyder, Michael
Abstract Protein microarrays provide a versatile method for the analysis of many protein biochemical activities. Existing DNA microarray analytical methods do not translate to protein microarrays due...
Wang, Lu-yong, Snyder, Michael, Gerstein, Mark
Abstract Comprehensive mapping of transcription factor binding sites is essential in postgenomic biology. For this, we propose a mining approach combining noisy data from ChIP (chromatin...
An Integrative Genomic Approach to Uncover Molecular Mechanisms of Prokaryotic Traits (2006)
Yang Liu, Jianrong Li, Lee Sam, Chern-Sing Goh, Mark Gerstein, Yves A. Lussier
With mounting availability of genomic and phenotypic databases, data integration and mining become increasingly challenging. While efforts have been put forward to analyze prokaryotic phenotypes,...
Integration of curated databases to identify genotype-phenotype associations (2006)
Goh, Chern-Sing, Gianoulis, Tara A, Liu, Yang, Li, Jianrong, Paccanaro, Alberto, Lussier, Yves A, ...
Abstract Background The ability to rapidly characterize an unknown microorganism is critical in both responding to infectious disease and biodefense. To do this, we need some way of anticipating an...
Design principles of molecular networks revealed by global comparisons and composite motifs (2006)
Yu, Haiyuan, Xia, Yu, Trifonov, Valery, Gerstein, Mark
Abstract Background Molecular networks are of current interest, particularly with the publication of many large-scale datasets. Previous analyses have focused on topologic structures of individual...
PseudoPipe: an automated pseudogene identification pipeline (2006)
Zhaolei Zhang, Nicholas Carriero, Deyou Zheng, John Karro, Paul M. Harrison, Mark Gerstein, ...
The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for...
BIOINFORMATICS ORIGINAL PAPER Structural bioinformatics (2006)
Ursula Lehnert, Eric Z. Yu, Mark Gerstein
doi:10.1093/bioinformatics/btl274
BioMed Central Open Access (2006)
Chern-sing Goh, Tara A Gianoulis, Yang Liu, Jianrong Li, Alberto Paccanaro, Yves A Lussier, ...
Integration of curated databases to identify genotype-phenotype
Predicting interactions in protein networks by completing defective cliques (2006)
Haiyuan Yu, Alberto Paccanaro, Valery Trifonov, Mark Gerstein
Design optimization methods for genomic DNA tiling arrays (2006)
Paul Bertone, Valery Trifonov, Joel S. Rozowsky, Falk Schubert, Olof Emanuelsson, John Karro, ...
A recent development in microarray construction entails the unbiased coverage, or tiling, of non-repetitive genomic DNA for the experimental identification of unannotated transcribed sequences and...
PseudoPipe: an automated pseudogene identification pipeline (2006)
Zhang, Zhaolei, Carriero, Nicholas, Zheng, Deyou, Karro, John, Harrison, Paul M., Gerstein, Mark
Motivation: Mammalian genomes contain many ‘genomic fossils’ i.e. pseudogenes. These are disabled copies of functional genes that have been retained in the genome by gene duplication or...
Burba, Anne E. Counterman, Lehnert, Ursula, Yu, Eric Z., Gerstein, Mark
Motivation: In many proteins, helix–helix interactions can be critical to establishing protein conformation (folding) and dynamics, as well as determining associations between protein units....
Predicting essential genes in fungal genomes (2006)
Seringhaus, Michael, Paccanaro, Alberto, Borneman, Anthony, Snyder, Michael, Gerstein, Mark
Essential genes are required for an organism's viability, and the ability to identify these genes in pathogens is crucial to directed drug development. Predicting essential genes through...
Yip, Kevin Y., Yu, Haiyuan, Kim, Philip M., Schultz, Martin, Gerstein, Mark
Summary: Biological processes involve complex networks of interactions between molecules. Various large-scale experiments and curation efforts have led to preliminary versions of complete cellular...
Predicting interactions in protein networks by completing defective cliques (2006)
Yu, Haiyuan, Paccanaro, Alberto, Trifonov, Valery, Gerstein, Mark
Datasets obtained by large-scale, high-throughput methods for detecting protein–protein interactions typically suffer from a relatively high level of noise. We describe a novel method for improving...
Target hub proteins serve as master regulators of development in yeast (2006)
Borneman, Anthony R., Leigh-Bell, Justine A., Yu, Haiyuan, Bertone, Paul, Gerstein, Mark, Snyder, Michael
To understand the organization of the transcriptional networks that govern cell differentiation, we have investigated the transcriptional circuitry controlling pseudohyphal development in...
Emanuelsson, Olof, Nagalakshmi, Ugrappa, Zheng, Deyou, Rozowsky, Joel S., Urban, Alexander E., Du, Jiang, ...
Genomic tiling microarrays have become a popular tool for interrogating the transcriptional activity of large regions of the genome in an unbiased fashion. There are several key parameters associated...
Predicting essential genes in fungal genomes (2006)
Seringhaus, Michael, Paccanaro, Alberto, Borneman, Anthony, Snyder, Michael, Gerstein, Mark
Essential genes are required for an organism’s viability, and the ability to identify these genes in pathogens is crucial to directed drug development. Predicting essential genes through...
The Database of Macromolecular Motions: new features added at the decade mark (2006)
Flores, Samuel, Echols, Nathaniel, Milburn, Duncan, Hespenheide, Brandon, Keating, Kevin, Lu, Jason, ...
The database of molecular motions, MolMovDB (http://molmovdb.org), has been in existence for the past decade. It classifies macromolecular motions and provides tools to interpolate between two...
Smith, Andrew, Greenbaum, Dov, Douglas, Shawn M, Long, Morrow, Gerstein, Mark
No abstract available.
PubNet: a flexible system for visualizing literature derived networks (2005)
Douglas, Shawn M, Montelione, Gaetano T, Gerstein, Mark
Abstract We have developed PubNet, a web-based tool that extracts several types of relationships returned by PubMed queries and maps them into networks, allowing for graphical visualization, textual...
Carriero, Nicholas, Osier, Michael, Cheung, Kei-Hoi, Miller, Perry, Gerstein, Mark, Zhao, Hongyu, ...
Article may be found at: http://www.jamia.org/cgi/content/abstract/12/1/90
Hartman, Stephen E., Bertone, Paul, Nath, Anjali K., Royce, Thomas E., Gerstein, Mark, Weissman, Sherman, ...
The STAT (signal transducer and activator of transcription) proteins play a crucial role in the regulation of gene expression, but their targets and the manner in which they select them remain...
Balasubramanian, Suganthi, Xia, Yu, Freinkman, Elizaveta, Gerstein, Mark
We assessed the disease-causing potential of single nucleotide polymorphisms (SNPs) based on a simple set of sequence-based features. We focused on SNPs from the dbSNP database in G-protein-coupled...
Alexandrov, Vadim, Lehnert, Ursula, Echols, Nathaniel, Milburn, Duncan, Engelman, Donald, Gerstein, Mark
We carry out an extensive statistical study of the applicability of normal modes to the prediction of mobile regions in proteins. In particular, we assess the degree to which the observed motions...
Assessing the limits of genomic data integration for predicting protein networks (2005)
Lu, Long J., Xia, Yu, Paccanaro, Alberto, Yu, Haiyuan, Gerstein, Mark
Genomic data integration—the process of statistically combining diverse sources of information from functional genomics experiments to make large-scale predictions—is becoming increasingly...
Biochemical and genetic analysis of the yeast proteome with a movable ORF collection (2005)
Gelperin, Daniel M., White, Michael A., Wilkinson, Martha L., Kon, Yoshiko, Kung, Li A., Wise, Kevin J., ...
Functional analysis of the proteome is an essential part of genomic research. To facilitate different proteomic approaches, a MORF (moveable ORF) library of 5854 yeast expression plasmids was...
Gilad, Yoav, Rifkin, Scott A., Bertone, Paul, Gerstein, Mark, White, Kevin P.
Interspecies comparisons of gene expression levels will increase our understanding of the evolution of transcriptional mechanisms and help to identify targets of natural selection. This approach...
Design optimization methods for genomic DNA tiling arrays (2005)
Bertone, Paul, Trifonov, Valery, Rozowsky, Joel S., Schubert, Falk, Emanuelsson, Olof, Karro, John, ...
A recent development in microarray research entails the unbiased coverage, or tiling, of genomic DNA for the large-scale identification of transcribed sequences and regulatory elements. A central...
YeastHub: a semantic web use case for integrating data in the life sciences domain (2005)
Cheung, Kei-Hoi, Yip, Kevin Y., Smith, Andrew, DeKnikker, Remko, Masiar, Andy, Gerstein, Mark
Motivation: As the semantic web technology is maturing and the need for life sciences data integration over the web is growing, it is important to explore how data integration needs can be addressed...
Harrison, Paul M., Zheng, Deyou, Zhang, Zhaolei, Carriero, Nicholas, Gerstein, Mark
Pseudogenes, in the case of protein-coding genes, are gene copies that have lost the ability to code for a protein; they are typically identified through annotation of disabled, decayed or incomplete...
PubNet: a flexible system for visualizing literature derived networks (2005)
Shawn M Douglas, Gaetano T Montelione, Mark Gerstein
Software
Information assessment on predicting protein-protein interactions (2004)
Lin, Nan, Wu, Baolin, Jansen, Ronald, Gerstein, Mark, Zhao, Hongyu
Abstract Background Identifying protein-protein interactions is fundamental for understanding the molecular machinery of the cell. Proteome-wide studies of protein-protein interactions are of...
Liu, Yang, Harrison, Paul M, Kunin, Victor, Gerstein, Mark
Abstract Background Pseudogenes often manifest themselves as disabled copies of known genes. In prokaryotes, it was generally believed (with a few well-known exceptions) that they were rare. Results...
Alexandrov, Vadim, Gerstein, Mark
Abstract Background Hidden Markov Models (HMMs) have proven very useful in computational biology for such applications as sequence pattern matching, gene-finding, and structure prediction. Thus far,...
An XML-based approach to integrating heterogeneous yeast genome data (2004)
Kei-hoi Cheung, Deyun Pan, Andrew Smith, Michael Seringhaus, Shawn M. Douglas, Mark Gerstein
Abstract. While there are an increasing number of genomes (including the human genome) whose sequences have been fully or nearly completed, the budding yeast Saccharomyces cerevisiae was the first...
Analyzing cellular biochemistry in terms of molecular networks (2004)
Yu Xia, Haiyuan Yu, Ronald Jansen, Michael Seringhaus, Sarah Baxter, Dov Greenbaum, ...
Key Words genome-wide high-throughput experiments, protein-protein interaction networks, regulatory networks, integration and prediction, network topology. Abstract One way to understand cells and...
Information assessment on predicting protein-protein interactions (2004)
BMC Bioinformatics, Nan Lin, Baolin Wu, Ronald Jansen, Mark Gerstein, Hongyu Zhao, ...
Research article
Sequence variation in G-protein-coupled receptors: analysis of single nucleotide polymorphisms (2004)
Suganthi Balasubramanian, Yu Xia, Elizaveta Freinkman, Mark Gerstein
Yu, Haiyuan, Zhu, Xiaowei, Greenbaum, Dov, Karro, John, Gerstein, Mark
Biological networks are a topic of great current interest, particularly with the publication of a number of large genome‐wide interaction datasets. They are globally characterized by a variety...
Annotation Transfer Between Genomes: Protein-Protein Interologs and Protein-DNA Regulogs (2004)
Yu, Haiyuan, Luscombe, Nicholas M., Lu, Hao Xin, Zhu, Xiaowei, Xia, Yu, Han, Jing-Dong J., ...
Proteins function mainly through interactions, especially with DNA and other proteins. While some large-scale interaction networks are now available for a number of model organisms, their...
Relationship between gene co-expression and probe localization on microarray slides (2003)
Kluger, Yuval, Yu, Haiyuan, Qian, Jiang, Gerstein, Mark
Abstract Background Microarray technology allows simultaneous measurement of thousands of genes in a single experiment. This is a potentially useful tool for evaluating co-expression of genes and...
Comparing protein abundance and mRNA expression levels on a genomic scale (2003)
Greenbaum, Dov, Colangelo, Christopher, Williams, Kenneth, Gerstein, Mark
Abstract Attempts to correlate protein abundance with mRNA expression levels have had variable success. We review the results of these comparisons, focusing on yeast. In the process, we survey...
Of mice and men: phylogenetic footprinting aids the discovery of regulatory elements (2003)
Zhang, Zhaolei, Gerstein, Mark
Abstract Phylogenetic footprinting is an approach to finding functionally important sequences in the genome that relies on detecting their high degrees of conservation across different species. A new...
Harrison, Paul M, Gerstein, Mark
Abstract We have derived a novel method to assess compositional biases in biological sequences, which is based on finding the lowest-probability subsequences for a given residue-type set. As a case...
Chern-sing Goh, Ning Lan, Nathaniel Echols, Shawn M. Douglas, Duncan Milburn, Paul Bertone, ...
We present version 2 of the SPINE system for structural proteomics. SPINE is available over the web at
Spectral Biclustering of Microarray Cancer Data: Co-clustering Genes and Conditions (2003)
Yuval Kluger, Ronen Basri, Joseph T. Chang, Mark Gerstein
ExpressYourself: a modular platform for processing and visualizing microarray data (2003)
Luscombe, Nicholas M., Royce, Thomas E., Bertone, Paul, Echols, Nathaniel, Horak, Christine E., Chang, Joseph T., ...
DNA microarrays are widely used in biological research; by analyzing differential hybridization on a single microarray slide, one can detect changes in mRNA expression levels, increases in DNA copy...
The transcriptional activity of human Chromosome 22 (2003)
Rinn, John L., Euskirchen, Ghia, Bertone, Paul, Martone, Rebecca, Luscombe, Nicholas M., Hartman, Stephen, ...
Zhang, Zhaolei, Gerstein, Mark
Nucleotide substitution, insertion and deletion (indel) events are the major driving forces that have shaped genomes. Using the recently identified human ribosomal protein (RP) pseudogene sequences,...
Identification of pseudogenes in the Drosophila melanogaster genome (2003)
Harrison, Paul M., Milburn, Duncan, Zhang, Zhaolei, Bertone, Paul, Gerstein, Mark
Pseudogenes are copies of genes that cannot produce a protein. They can be detected from disruptions to their apparent coding sequence, caused by frameshifts and premature stop codons. They are...
Zhang, Zhaolei, Harrison, Paul M., Liu, Yin, Gerstein, Mark
Processed pseudogenes were created by reverse-transcription of mRNAs; they provide snapshots of ancient genes existing millions of years ago in the genome. To find them in the present-day human, we...
Goh, Chern-Sing, Lan, Ning, Echols, Nathaniel, Douglas, Shawn M., Milburn, Duncan, Bertone, Paul, ...
We present version 2 of the SPINE system for structural proteomics. SPINE is available over the web at http://nesg.org. It serves as the central hub for the Northeast Structural Genomics Consortium,...
Spectral Biclustering of Microarray Data: Coclustering Genes and Conditions (2003)
Kluger, Yuval, Basri, Ronen, Chang, Joseph T., Gerstein, Mark
Global analyses of RNA expression levels are useful for classifying genes and overall phenotypes. Often these classification problems are linked, and one wants to find “marker genes” that are...
MolMovDB: analysis and visualization of conformational change and structural flexibility (2003)
Echols, Nathaniel, Milburn, Duncan, Gerstein, Mark
The Database of Macromolecular Movements (http://MolMovDB.org) is a collection of data and software pertaining to flexibility in protein and RNA structures. The database is organized into two parts....
Qian, Jiang, Lin, Jimmy, Luscombe, Nicholas M., Yu, Haiyuan, Gerstein, Mark
Motivation: Defining regulatory networks, linking transcription factors (TFs) to their targets, is a central problem in post-genomic biology. One might imagine one could readily determine these...
Jansen, Ronald, Bussemaker, Harmen J., Gerstein, Mark
Highly expressed genes in many bacteria and small eukaryotes often have a strong compositional bias, in terms of codon usage. Two widely used numerical indices, the codon adaptation index (CAI) and...
Genomic analysis of membrane protein families: abundance and conserved motifs (2002)
Liu, Yang, Engelman, Donald M, Gerstein, Mark
Abstract Background Polytopic membrane proteins can be related to each other on the basis of the number of transmembrane helices and sequence similarities. Building on the Pfam classification of...
Luscombe, Nicholas M, Qian, Jiang, Zhang, Zhaolei, Johnson, Ted, Gerstein, Mark
Abstract Background The sequencing of genomes provides us with an inventory of the 'molecular parts' in nature, such as protein families and folds, and their functions in living organisms. Through...
Structural genomics: a new era for pharmaceutical research (2002)
Liu, Yang, Luscombe, Nicholas M, Alexandrov, Vadim, Bertone, Paul, Harrison, Paul, Zhang, Zhaolei, ...
A report on the 15th Annual Center for Advanced Biotechnology and Medicine Symposium on structural genomics in pharmaceutical design, Princeton, USA, 24-25 October 2001.
Fast optimal genome tiling with applications to microarray design and homology search (2002)
Piotr Berman, Paul Bertone, Bhaskar Dasgupta, Mark Gerstein, Ming-yang Kao, Michael Snyder
In this paper we consider several variations of the following basic tiling problem: given a sequence of real numbers with two size bound parameters, we want to find a set of tiles of maximum total...
Fast optimal genome tiling with applications to microarray design and homology search (2002)
Piotr Berman, Paul Bertone, Bhaskar Dasgupta, Mark Gerstein, Ming-yang Kao, Michael Snyder, ...
Abstract. In this paper we consider several variations of the following basic tiling problem: given a sequence of real numbers with two size bound parameters, we want to find a set of tiles such that...
Complex transcriptional circuitry at the G1/S transition in Saccharomyces cerevisiae (2002)
Horak, Christine E., Luscombe, Nicholas M., Qian, Jiang, Bertone, Paul, Piccirrillo, Stacy, Gerstein, Mark, ...
A question of size: the eukaryotic proteome and the problems in defining it (2002)
Harrison, Paul M., Kumar, Anuj, Lang, Ning, Snyder, Michael, Gerstein, Mark
We discuss the problems in defining the extent of the proteomes for completely sequenced eukaryotic organisms (i.e. the total number of protein-coding sequences), focusing on yeast, worm, fly and...
Calculations of protein volumes: sensitivity analysis and parameter database (2002)
Motivation: The precise sizes of protein atoms in terms of occupied packing volume are of great importance. We have previously presented standard volumes for protein residues based on calculations...
Echols, Nathaniel, Harrison, Paul, Balasubramanian, Suganthi, Luscombe, Nicholas M., Bertone, Paul, Zhang, Zhaolei, ...
Based on searches for disabled homologs to known proteins, we have identified a large population of pseudogenes in four sequenced eukaryotic genomes—the worm, yeast, fly and human (chromosomes 21...
Harrison, Paul M., Hegyi, Hedi, Balasubramanian, Suganthi, Luscombe, Nicholas M., Bertone, Paul, Echols, Nathaniel, ...
RNA expression patterns change dramatically in human neutrophils exposed to bacteria (2001)
Yamaga, Shigeru, Prashar, Yatindra, Lee, Helen H., Hoe, Nancy Palme, Kluger, Yuval, ...
A comprehensive study of changes in messenger RNA (mRNA) levels in human neutrophils following exposure to bacteria is described. Within 2 hours there are dramatic changes in the levels of several...
Determining the minimum number of types necessary to represent the sizes of protein atoms (2001)
Tsai, Jerry, Voss, Neil, Gerstein, Mark
Motivation: Traditionally, for packing calculations people have collected atoms together into a number of distinct ‘types’. These, in fact, often represent a heavy atom and its associated...
Bertone, Paul, Kluger, Yuval, Lan, Ning, Zheng, Deyou, Christendat, Dinesh, Yee, Adelinda, ...
High-throughput structural proteomics is expected to generate considerable amounts of data on the progress of structure determination for many proteins. For each protein this includes information...
We built “whole-genome” trees based on the presence or absence of particular molecular features (either orthologs or folds) in the genomes of a number of recently sequenced microorganisms. To...
Jansen, Ronald, Gerstein, Mark
We analyzed 10 genome expression data sets by large-scale cross-referencing against broad structural and functional categories. The data sets, generated by different techniques (e.g. SAGE and gene...
Krebs, Werner G., Gerstein, Mark
The number of solved structures of macromolecules that have the same fold and thus exhibit some degree of conformational variability is rapidly increasing. It is consequently advantageous to develop...
Balasubramanian, Suganthi, Schneider, Tamara, Gerstein, Mark, Regan, Lynne
We present the results of a comprehensive analysis of the proteome of Mycoplasma genitalium (MG), the smallest autonomously replicating organism that has been completely sequenced. Our aim was to...
Database of Macromolecular Movements (1999)
Mark Gerstein, Werner G. Krebs
The Molecular Movements Database lists motions in proteins and other macromolecules. It is arranged around a multi-level classification scheme and includes motions of loops, domains, and subunits.
Studying Macromolecular Motions in a Database Framework: From Structure to Sequence (1999)
Mark Gerstein, Ronald Jansen, Ted Johnson, Jerry Tsai, Werner Krebs
We describe database approaches taken in our lab to the study of protein and nucleic acid motions. We have developed a database of macromolecular motions, which is accessible on the World Wide Web...
The objective of this project is to study protein sequence-structure relationships through large-scale computational analysis of gene sequences and crystal structure in the databanks. The results of...
We apply a simple method for aligning protein sequences on the basis of a 3D structure, on a large scale, to the proteins in the scop...
Simulating the Minimum Core for Hydrophobic Collapse in Globular Proteins (1997)
Jerry Tsai, Mark Gerstein, Michael Levitt
To investigate the nature of hydrophobic collapse considered to be the driving force in protein folding, we have simulated aqueous solutions...
We show how a basic pairwise alignment procedure can be improved to more accurately align conserved structural regions, by using variable, position-dependent gap penalties that depend on secondary...
DNA recognition code of transcription factors (1995)
Suzuki, Masashi, Brenner, Steven E., Gerstein, Mark, Yagi, Naoto
DNA recognition and superstructure formation by helix-turn-helix proteins (1995)
Suzuki, Masashi, Yagi, Naoto, Gerstein, Mark
The way helix-turn-helix proteins recognize DNA is analysed by comparing their sequences, structures, and binding specificities. Individual recognition helices in these proteins bind to four DNA base...
Stereochemical basis of DNA recognition by Zn fingers (1994)
Suzuki, Masashi, Gerstein, Mark, Yagi, Naoto
DNA-recognition rules for Zn fingers are discussed in terms of crystal structures. The rules can explain the DNA-binding characteristics of a number of Zn finger proteins for which there are no...
Solution structure of the DNA binding octapeptide repeat of the K10 gene product (1994)
Suzuki, Masashi, Neuhaus, David, Gerstein, Mark, Aimoto, Saburo
A putative transcription factor, the Drosophila K10 gene product, contains eight repeats of the octapeptide sequence SPNQQQHP or close variants. The solution structure of the K10 repeat was studied...
An NMR study on the DNA-binding SPKK motif and a model for its interaction with DNA (1993)
Suzuki, Masashi, Gerstein, Mark, Johnson, Tony
The solution structure of one and two repeats of the ‘SPKK’ DNA-binding motif is reported on the basis of NMR measurements. In dimethylsulphoxide (DMSO) the major population (approximately 90%)...
Protein recognition: surfaces and conformational change (1992)
Thesis (Ph. D.)--University of Cambridge, 1992.
A structural census of the current population of protein sequences
Gerstein, Mark, Levitt, Michael
We examine the occurrence of the ≈300 known protein folds in different groups of organisms. To do this, we characterize a large fraction of the currently known protein sequences (≈140,000) in...
Qian, Jiang, Stenger, Brad, Wilson, Cyrus A., Lin, Jimmy, Jansen, Ronald, Teichmann, Sarah A., ...
As the number of protein folds is quite limited, a mode of analysis that will be increasingly common in the future, especially with the advent of structural genomics, is to survey and re-survey the...
A unified statistical framework for sequence comparison and structure comparison
Levitt, Michael, Gerstein, Mark
We present an approach for assessing the significance of sequence and structure comparisons by using nearly identical statistical formalisms for both sequence and structure. Doing so involves an...
A question of size: the eukaryotic proteome and the problems in defining it
Harrison, Paul M., Kumar, Anuj, Lang, Ning, Snyder, Michael, Gerstein, Mark
We discuss the problems in defining the extent of the proteomes for completely sequenced eukaryotic organisms (i.e. the total number of protein-coding sequences), focusing on yeast, worm, fly and...
GATA-1 binding sites mapped in the β-globin locus by using mammalian chIp-chip analysis
Horak, Christine E., Mahajan, Milind C., Luscombe, Nicholas M., Gerstein, Mark, Weissman, Sherman M., Snyder, Michael
The expression of the β-like globin genes is intricately regulated by a series of both general and tissue-restricted transcription factors. The hematopoietic lineage-specific transcription factor...
Luscombe, Nicholas M, Qian, Jiang, Zhang, Zhaolei, Johnson, Ted, Gerstein, Mark
The sequencing of genomes provides us with an inventory of the 'molecular parts' in nature, such as protein families and folds, and their functions in living organisms. The genomic occurrence of...
Genomic analysis of membrane protein families: abundance and conserved motifs
Liu, Yang, Engelman, Donald M, Gerstein, Mark
A genome-wide analysis was carried out on patterns of the classified polytopic membrane protein families, and the distribution of conserved amino acids and motifs in the transmembrane helix regions...
Structural genomics: a new era for pharmaceutical research
Liu, Yang, Luscombe, Nicholas M, Alexandrov, Vadim, Bertone, Paul, Harrison, Paul, Zhang, Zhaolei, ...
A report on the 15th Annual Center for Advanced Biotechnology and Medicine Symposium on structural genomics in pharmaceutical design, Princeton, USA, 24-25 October 2001.
Identification of pseudogenes in the Drosophila melanogaster genome
Harrison, Paul M., Milburn, Duncan, Zhang, Zhaolei, Bertone, Paul, Gerstein, Mark
Pseudogenes are copies of genes that cannot produce a protein. They can be detected from disruptions to their apparent coding sequence, caused by frameshifts and premature stop codons. They are...
SPINE 2: a system for collaborative structural proteomics within a federated database framework
Goh, Chern-Sing, Lan, Ning, Echols, Nathaniel, Douglas, Shawn M., Milburn, Duncan, Bertone, Paul, ...
We present version 2 of the SPINE system for structural proteomics. SPINE is available over the web at http://nesg.org. It serves as the central hub for the Northeast Structural Genomics Consortium,...
MolMovDB: analysis and visualization of conformational change and structural flexibility
Echols, Nathaniel, Milburn, Duncan, Gerstein, Mark
The Database of Macromolecular Movements (http://MolMovDB.org) is a collection of data and software pertaining to flexibility in protein and RNA structures. The database is organized into two parts....
ExpressYourself: a modular platform for processing and visualizing microarray data
Luscombe, Nicholas M., Royce, Thomas E., Bertone, Paul, Echols, Nathaniel, Horak, Christine E., Chang, Joseph T., ...
DNA microarrays are widely used in biological research; by analyzing differential hybridization on a single microarray slide, one can detect changes in mRNA expression levels, increases in DNA copy...
Complex transcriptional circuitry at the G1/S transition in Saccharomyces cerevisiae
Horak, Christine E., Luscombe, Nicholas M., Qian, Jiang, Bertone, Paul, Piccirrillo, Stacy, Gerstein, Mark, ...
In the yeast Saccharomyces cerevisiae, SBF (Swi4–Swi6 cell cycle box binding factor) and MBF (MluI binding factor) are the major transcription factors regulating the START of the cell cycle, a time...
Identification and Analysis of Over 2000 Ribosomal Protein Pseudogenes in the Human Genome
Zhang, Zhaolei, Harrison, Paul, Gerstein, Mark
Mammals have 79 ribosomal proteins (RP). Using a systematic procedure based on sequence-homology, we have comprehensively identified pseudogenes of these proteins in the human genome. Our assignments...
Mateos, Alvaro, Dopazo, Joaquín, Jansen, Ronald, Tu, Yuhai, Gerstein, Mark, Stolovitzky, Gustavo
Recent advances in microarray technology have opened new ways for functional annotation of previously uncharacterised genes on a genomic scale. This has been demonstrated by unsupervised clustering...
Harrison, Paul M, Gerstein, Mark
A novel method has been derived to assess compositional biases in biological sequences. It is based on finding the lowest-probability subsequences for a given residue-type set.
Comparing protein abundance and mRNA expression levels on a genomic scale
Greenbaum, Dov, Colangelo, Christopher, Williams, Kenneth, Gerstein, Mark
We review the results of attempts to correlate protein abundance with mRNA expression levels, focusing on yeast.
Of mice and men: phylogenetic footprinting aids the discovery of regulatory elements
Zhang, Zhaolei, Gerstein, Mark
Phylogenetic footprinting is an approach to finding functionally important sequences in the genome that relies on detecting their high degrees of conservation across different species. A new study...
The transcriptional activity of human Chromosome 22
Rinn, John L., Euskirchen, Ghia, Bertone, Paul, Martone, Rebecca, Luscombe, Nicholas M., Hartman, Stephen, ...
A DNA microarray representing nearly all of the unique sequences of human Chromosome 22 was constructed and used to measure global-transcriptional activity in placental poly(A)+ RNA. We found that...
Distribution of NF-κB-binding sites across human chromosome 22
Martone, Rebecca, Euskirchen, Ghia, Bertone, Paul, Hartman, Stephen, Royce, Thomas E., Luscombe, Nicholas M., ...
We have mapped the chromosomal binding site distribution of a transcription factor in human cells. The NF-κB family of transcription factors plays an essential role in regulating the induction of...
Jiao, Yuling, Yang, Hongjuan, Ma, Ligeng, Sun, Ning, Yu, Haiyuan, Liu, Tie, ...
A microarray based on PCR amplicons of 1,864 confirmed and predicted Arabidopsis transcription factor genes was produced and used to profile the global expression pattern in seedlings, specifically...
We built whole-genome trees based on the presence or absence of particular molecular features, either orthologs or folds, in the genomes of a number of recently sequenced microorganisms. To put these...
Annotation Transfer for Genomics: Measuring Functional Divergence in Multi-Domain Proteins
Annotation transfer is a principal process in genome annotation. It involves “transferring” structural and functional annotation to uncharacterized open reading frames (ORFs) in a newly completed...
Yu, Haiyuan, Zhu, Xiaowei, Greenbaum, Dov, Karro, John, Gerstein, Mark
Biological networks are a topic of great current interest, particularly with the publication of a number of large genome-wide interaction datasets. They are globally characterized by a variety of...
Transmembrane protein domains rarely use covalent domain recombination as an evolutionary mechanism
Liu, Yang, Gerstein, Mark, Engelman, Donald M.
Recombination of evolutionarily unrelated domains is a mechanism often used by evolution to produce variety in soluble proteins. By using a classification of polytopic transmembrane domains into...
CREB Binds to Multiple Loci on Human Chromosome 22
Euskirchen, Ghia, Royce, Thomas E., Bertone, Paul, Martone, Rebecca, Rinn, John L., Nelson, F. Kenneth, ...
The cyclic AMP-responsive element-binding protein (CREB) is an important transcription factor that can be activated by hormonal stimulation and regulates neuronal function and development. An...
Annotation Transfer Between Genomes: Protein–Protein Interologs and Protein–DNA Regulogs
Yu, Haiyuan, Luscombe, Nicholas M., Lu, Hao Xin, Zhu, Xiaowei, Xia, Yu, Han, Jing-Dong J., ...
Proteins function mainly through interactions, especially with DNA and other proteins. While some large-scale interaction networks are now available for a number of model organisms, their...
Spectral Biclustering of Microarray Data: Coclustering Genes and Conditions
Kluger, Yuval, Basri, Ronen, Chang, Joseph T., Gerstein, Mark
Global analyses of RNA expression levels are useful for classifying genes and overall phenotypes. Often these classification problems are linked, and one wants to find “marker genes” that are...
Liu, Yang, Harrison, Paul M, Kunin, Victor, Gerstein, Mark
A comprehensive analysis of the occurrence of pseudogenes in a diverse selection of 64 prokaryote genomes identified around 7,000 candidate pseudogenes. A large fraction of prokaryote pseudogenes...
Information assessment on predicting protein-protein interactions
Lin, Nan, Wu, Baolin, Jansen, Ronald, Gerstein, Mark, Zhao, Hongyu
White, Eric J., Emanuelsson, Olof, Scalzo, David, Royce, Thomas, Kosak, Steven, Oakeley, Edward J., ...
Duplication of the genome during the S phase of the cell cycle does not occur simultaneously; rather, different sequences are replicated at different times. The replication timing of specific...
Carriero, Nicholas, Osier, Michael V., Cheung, Kei-Hoi, Miller, Perry L., Gerstein, Mark, Zhao, Hongyu, ...
The rapid advances in high-throughput biotechnologies such as DNA microarrays and mass spectrometry have generated vast amounts of data ranging from gene expression to proteomics data. The large size...
Sequence variation in G-protein-coupled receptors: analysis of single nucleotide polymorphisms
Balasubramanian, Suganthi, Xia, Yu, Freinkman, Elizaveta, Gerstein, Mark
We assessed the disease-causing potential of single nucleotide polymorphisms (SNPs) based on a simple set of sequence-based features. We focused on SNPs from the dbSNP database in G-protein-coupled...
Huber, Damon, Boyd, Dana, Xia, Yu, Olma, Michael H., Gerstein, Mark, Beckwith, Jon
We have previously reported that the DsbA signal sequence promotes efficient, cotranslational translocation of the cytoplasmic protein thioredoxin-1 via the bacterial signal recognition particle...
Harrison, Paul M., Zheng, Deyou, Zhang, Zhaolei, Carriero, Nicholas, Gerstein, Mark
Pseudogenes, in the case of protein-coding genes, are gene copies that have lost the ability to code for a protein; they are typically identified through annotation of disabled, decayed or incomplete...
Multi-species microarrays reveal the effect of sequence divergence on gene expression profiles
Gilad, Yoav, Rifkin, Scott A., Bertone, Paul, Gerstein, Mark, White, Kevin P.
Interspecies comparisons of gene expression levels will increase our understanding of the evolution of transcriptional mechanisms and help to identify targets of natural selection. This approach...
Assessing the limits of genomic data integration for predicting protein networks
Lu, Long J., Xia, Yu, Paccanaro, Alberto, Yu, Haiyuan, Gerstein, Mark
Genomic data integration—the process of statistically combining diverse sources of information from functional genomics experiments to make large-scale predictions—is becoming increasingly...
Smith, Andrew, Greenbaum, Dov, Douglas, Shawn M, Long, Morrow, Gerstein, Mark
A direct impediment to the optimal use of online databases is the increasing prevalence, severity, and toll of computer and network security incidents. Funding agencies should set up working groups...
PubNet: a flexible system for visualizing literature derived networks
Douglas, Shawn M, Montelione, Gaetano T, Gerstein, Mark
PubNet is a web-based tool to extract several types of relationships returned by PubMed queries and map them into networks.
The Database of Macromolecular Motions: new features added at the decade mark
Flores, Samuel, Echols, Nathaniel, Milburn, Duncan, Hespenheide, Brandon, Keating, Kevin, Lu, Jason, ...
The database of molecular motions, MolMovDB, has been in existence for the past decade. It classifies macromolecular motions and provides tools to interpolate between two conformations (the Morph...
Relating Whole-Genome Expression Data with Protein-Protein Interactions
Jansen, Ronald, Greenbaum, Dov, Gerstein, Mark
We investigate the relationship of protein-protein interactions with mRNA expression levels, by integrating a variety of data sources for yeast. We focus on known protein complexes that have clearly...
Harrison, Paul M., Hegyi, Hedi, Balasubramanian, Suganthi, Luscombe, Nicholas M., Bertone, Paul, Echols, Nathaniel, ...
We have developed an initial approach for annotating and surveying pseudogenes in the human genome. We search human genomic DNA for regions that are similar to known protein sequences and contain...
Subcellular localization of the yeast proteome
Kumar, Anuj, Agarwal, Seema, Heyman, John A., Matson, Sandra, Heidtman, Matthew, Piccirillo, Stacy, ...
Protein localization data are a valuable information resource helpful in elucidating eukaryotic protein function. Here, we report the first proteome-scale analysis of protein localization within any...
Seringhaus, Michael, Kumar, Anuj, Hartigan, John, Snyder, Michael, Gerstein, Mark
Transposons are widely employed as tools for gene disruption. Ideally, they should display unbiased insertion behavior, and incorporate readily into any genomic DNA to which they are exposed....
Biochemical and genetic analysis of the yeast proteome with a movable ORF collection
Gelperin, Daniel M., White, Michael A., Wilkinson, Martha L., Kon, Yoshiko, Kung, Li A., Wise, Kevin J., ...
Functional analysis of the proteome is an essential part of genomic research. To facilitate different proteomic approaches, a MORF (moveable ORF) library of 5854 yeast expression plasmids was...
Global changes in STAT target selection and transcription regulation upon interferon treatments
Hartman, Stephen E., Bertone, Paul, Nath, Anjali K., Royce, Thomas E., Gerstein, Mark, Weissman, Sherman, ...
The STAT (signal transducer and activator of transcription) proteins play a crucial role in the regulation of gene expression, but their targets and the manner in which they select them remain...
Design optimization methods for genomic DNA tiling arrays
Bertone, Paul, Trifonov, Valery, Rozowsky, Joel S., Schubert, Falk, Emanuelsson, Olof, Karro, John, ...
A recent development in microarray research entails the unbiased coverage, or tiling, of genomic DNA for the large-scale identification of transcribed sequences and regulatory elements. A central...
Target hub proteins serve as master regulators of development in yeast
Borneman, Anthony R., Leigh-Bell, Justine A., Yu, Haiyuan, Bertone, Paul, Gerstein, Mark, Snyder, Michael
To understand the organization of the transcriptional networks that govern cell differentiation, we have investigated the transcriptional circuitry controlling pseudohyphal development in...
Coric, Tatjana, Zheng, Deyou, Gerstein, Mark, Canessa, Cecilia M
The acid-sensitive ion channel 1 (ASIC1) is a neuronal Na+ channel insensitive to changes in membrane potential but is gated by external protons. Proton sensitivity is believed to be essential for...
Integration of curated databases to identify genotype-phenotype associations
Goh, Chern-Sing, Gianoulis, Tara A, Liu, Yang, Li, Jianrong, Paccanaro, Alberto, Lussier, Yves A, ...
An Integrative Genomic Approach to Uncover Molecular Mechanisms of Prokaryotic Traits
Liu, Yang, Li, Jianrong, Sam, Lee, Goh, Chern-Sing, Gerstein, Mark, Lussier, Yves A
With mounting availability of genomic and phenotypic databases, data integration and mining become increasingly challenging. While efforts have been put forward to analyze prokaryotic phenotypes,...
Subcellular localization of the yeast proteome
Kumar, Anuj, Agarwal, Seema, Heyman, John A., Matson, Sandra, Heidtman, Matthew, Piccirillo, Stacy, ...
Protein localization data are a valuable information resource helpful in elucidating eukaryotic protein function. Here, we report the first proteome-scale analysis of protein localization within any...
SPINE 2: a system for collaborative structural proteomics within a federated database framework
Goh, Chern-Sing, Lan, Ning, Echols, Nathaniel, Douglas, Shawn M., Milburn, Duncan, Bertone, Paul, ...
We present version 2 of the SPINE system for structural proteomics. SPINE is available over the web at http://nesg.org. It serves as the central hub for the Northeast Structural Genomics Consortium,...
MolMovDB: analysis and visualization of conformational change and structural flexibility
Echols, Nathaniel, Milburn, Duncan, Gerstein, Mark
The Database of Macromolecular Movements (http://MolMovDB.org) is a collection of data and software pertaining to flexibility in protein and RNA structures. The database is organized into two parts....
ExpressYourself: a modular platform for processing and visualizing microarray data
Luscombe, Nicholas M., Royce, Thomas E., Bertone, Paul, Echols, Nathaniel, Horak, Christine E., Chang, Joseph T., ...
DNA microarrays are widely used in biological research; by analyzing differential hybridization on a single microarray slide, one can detect changes in mRNA expression levels, increases in DNA copy...
Complex transcriptional circuitry at the G1/S transition in Saccharomyces cerevisiae
Horak, Christine E., Luscombe, Nicholas M., Qian, Jiang, Bertone, Paul, Piccirillo, Stacy, Gerstein, Mark, ...
In the yeast Saccharomyces cerevisiae, SBF (Swi4–Swi6 cell cycle box binding factor) and MBF (MluI binding factor) are the major transcription factors regulating the START of the cell cycle, a time...
Identification and Analysis of Over 2000 Ribosomal Protein Pseudogenes in the Human Genome
Zhang, Zhaolei, Harrison, Paul, Gerstein, Mark
Mammals have 79 ribosomal proteins (RP). Using a systematic procedure based on sequence-homology, we have comprehensively identified pseudogenes of these proteins in the human genome. Our assignments...
Mateos, Alvaro, Dopazo, Joaquín, Jansen, Ronald, Tu, Yuhai, Gerstein, Mark, Stolovitzky, Gustavo
Recent advances in microarray technology have opened new ways for functional annotation of previously uncharacterised genes on a genomic scale. This has been demonstrated by unsupervised clustering...
Harrison, Paul M, Gerstein, Mark
A novel method has been derived to assess compositional biases in biological sequences. It is based on finding the lowest-probability subsequences for a given residue-type set.
Comparing protein abundance and mRNA expression levels on a genomic scale
Greenbaum, Dov, Colangelo, Christopher, Williams, Kenneth, Gerstein, Mark
We review the results of attempts to correlate protein abundance with mRNA expression levels, focusing on yeast.
Of mice and men: phylogenetic footprinting aids the discovery of regulatory elements
Zhang, Zhaolei, Gerstein, Mark
Phylogenetic footprinting is an approach to finding functionally important sequences in the genome that relies on detecting their high degrees of conservation across different species. A new study...
The transcriptional activity of human Chromosome 22
Rinn, John L., Euskirchen, Ghia, Bertone, Paul, Martone, Rebecca, Luscombe, Nicholas M., Hartman, Stephen, ...
A DNA microarray representing nearly all of the unique sequences of human Chromosome 22 was constructed and used to measure global-transcriptional activity in placental poly(A)+ RNA. We found that...
Zhang, Zhaolei, Gerstein, Mark
Nucleotide substitution, insertion and deletion (indel) events are the major driving forces that have shaped genomes. Using the recently identified human ribosomal protein (RP) pseudogene sequences,...
Distribution of NF-κB-binding sites across human chromosome 22
Martone, Rebecca, Euskirchen, Ghia, Bertone, Paul, Hartman, Stephen, Royce, Thomas E., Luscombe, Nicholas M., ...
We have mapped the chromosomal binding site distribution of a transcription factor in human cells. The NF-κB family of transcription factors plays an essential role in regulating the induction of...
Jiao, Yuling, Yang, Hongjuan, Ma, Ligeng, Sun, Ning, Yu, Haiyuan, Liu, Tie, ...
A microarray based on PCR amplicons of 1,864 confirmed and predicted Arabidopsis transcription factor genes was produced and used to profile the global expression pattern in seedlings, specifically...
We built whole-genome trees based on the presence or absence of particular molecular features, either orthologs or folds, in the genomes of a number of recently sequenced microorganisms. To put these...
Annotation Transfer for Genomics: Measuring Functional Divergence in Multi-Domain Proteins
Annotation transfer is a principal process in genome annotation. It involves “transferring” structural and functional annotation to uncharacterized open reading frames (ORFs) in a newly completed...
Yu, Haiyuan, Zhu, Xiaowei, Greenbaum, Dov, Karro, John, Gerstein, Mark
Biological networks are a topic of great current interest, particularly with the publication of a number of large genome-wide interaction datasets. They are globally characterized by a variety of...
Transmembrane protein domains rarely use covalent domain recombination as an evolutionary mechanism
Liu, Yang, Gerstein, Mark, Engelman, Donald M.
Recombination of evolutionarily unrelated domains is a mechanism often used by evolution to produce variety in soluble proteins. By using a classification of polytopic transmembrane domains into...
CREB Binds to Multiple Loci on Human Chromosome 22
Euskirchen, Ghia, Royce, Thomas E., Bertone, Paul, Martone, Rebecca, Rinn, John L., Nelson, F. Kenneth, ...
The cyclic AMP-responsive element-binding protein (CREB) is an important transcription factor that can be activated by hormonal stimulation and regulates neuronal function and development. An...
Zhang, Zhaolei, Harrison, Paul M., Liu, Yin, Gerstein, Mark
Processed pseudogenes were created by reverse-transcription of mRNAs; they provide snapshots of ancient genes existing millions of years ago in the genome. To find them in the present-day human, we...
Annotation Transfer Between Genomes: Protein–Protein Interologs and Protein–DNA Regulogs
Yu, Haiyuan, Luscombe, Nicholas M., Lu, Hao Xin, Zhu, Xiaowei, Xia, Yu, Han, Jing-Dong J., ...
Proteins function mainly through interactions, especially with DNA and other proteins. While some large-scale interaction networks are now available for a number of model organisms, their...
Spectral Biclustering of Microarray Data: Coclustering Genes and Conditions
Kluger, Yuval, Basri, Ronen, Chang, Joseph T., Gerstein, Mark
Global analyses of RNA expression levels are useful for classifying genes and overall phenotypes. Often these classification problems are linked, and one wants to find “marker genes” that are...
Liu, Yang, Harrison, Paul M, Kunin, Victor, Gerstein, Mark
A comprehensive analysis of the occurrence of pseudogenes in a diverse selection of 64 prokaryote genomes identified around 7,000 candidate pseudogenes. A large fraction of prokaryote pseudogenes...
Information assessment on predicting protein-protein interactions
Lin, Nan, Wu, Baolin, Jansen, Ronald, Gerstein, Mark, Zhao, Hongyu
White, Eric J., Emanuelsson, Olof, Scalzo, David, Royce, Thomas, Kosak, Steven, Oakeley, Edward J., ...
Duplication of the genome during the S phase of the cell cycle does not occur simultaneously; rather, different sequences are replicated at different times. The replication timing of specific...
Carriero, Nicholas, Osier, Michael V., Cheung, Kei-Hoi, Miller, Perry L., Gerstein, Mark, Zhao, Hongyu, ...
The rapid advances in high-throughput biotechnologies such as DNA microarrays and mass spectrometry have generated vast amounts of data ranging from gene expression to proteomics data. The large size...
Sequence variation in G-protein-coupled receptors: analysis of single nucleotide polymorphisms
Balasubramanian, Suganthi, Xia, Yu, Freinkman, Elizaveta, Gerstein, Mark
We assessed the disease-causing potential of single nucleotide polymorphisms (SNPs) based on a simple set of sequence-based features. We focused on SNPs from the dbSNP database in G-protein-coupled...
Huber, Damon, Boyd, Dana, Xia, Yu, Olma, Michael H., Gerstein, Mark, Beckwith, Jon
We have previously reported that the DsbA signal sequence promotes efficient, cotranslational translocation of the cytoplasmic protein thioredoxin-1 via the bacterial signal recognition particle...
Harrison, Paul M., Zheng, Deyou, Zhang, Zhaolei, Carriero, Nicholas, Gerstein, Mark
Pseudogenes, in the case of protein-coding genes, are gene copies that have lost the ability to code for a protein; they are typically identified through annotation of disabled, decayed or incomplete...
Multi-species microarrays reveal the effect of sequence divergence on gene expression profiles
Gilad, Yoav, Rifkin, Scott A., Bertone, Paul, Gerstein, Mark, White, Kevin P.
Interspecies comparisons of gene expression levels will increase our understanding of the evolution of transcriptional mechanisms and help to identify targets of natural selection. This approach...
Assessing the limits of genomic data integration for predicting protein networks
Lu, Long J., Xia, Yu, Paccanaro, Alberto, Yu, Haiyuan, Gerstein, Mark
Genomic data integration—the process of statistically combining diverse sources of information from functional genomics experiments to make large-scale predictions—is becoming increasingly...
Smith, Andrew, Greenbaum, Dov, Douglas, Shawn M, Long, Morrow, Gerstein, Mark
A direct impediment to the optimal use of online databases is the increasing prevalence, severity, and toll of computer and network security incidents. Funding agencies should set up working groups...
PubNet: a flexible system for visualizing literature derived networks
Douglas, Shawn M, Montelione, Gaetano T, Gerstein, Mark
PubNet is a web-based tool to extract several types of relationships returned by PubMed queries and map them into networks.
Biochemical and genetic analysis of the yeast proteome with a movable ORF collection
Gelperin, Daniel M., White, Michael A., Wilkinson, Martha L., Kon, Yoshiko, Kung, Li A., Wise, Kevin J., ...
Functional analysis of the proteome is an essential part of genomic research. To facilitate different proteomic approaches, a MORF (moveable ORF) library of 5854 yeast expression plasmids was...
Global changes in STAT target selection and transcription regulation upon interferon treatments
Hartman, Stephen E., Bertone, Paul, Nath, Anjali K., Royce, Thomas E., Gerstein, Mark, Weissman, Sherman, ...
The STAT (signal transducer and activator of transcription) proteins play a crucial role in the regulation of gene expression, but their targets and the manner in which they select them remain...
The Database of Macromolecular Motions: new features added at the decade mark
Flores, Samuel, Echols, Nathaniel, Milburn, Duncan, Hespenheide, Brandon, Keating, Kevin, Lu, Jason, ...
The database of molecular motions, MolMovDB, has been in existence for the past decade. It classifies macromolecular motions and provides tools to interpolate between two conformations (the Morph...
Design optimization methods for genomic DNA tiling arrays
Bertone, Paul, Trifonov, Valery, Rozowsky, Joel S., Schubert, Falk, Emanuelsson, Olof, Karro, John, ...
A recent development in microarray research entails the unbiased coverage, or tiling, of genomic DNA for the large-scale identification of transcribed sequences and regulatory elements. A central...
Target hub proteins serve as master regulators of development in yeast
Borneman, Anthony R., Leigh-Bell, Justine A., Yu, Haiyuan, Bertone, Paul, Gerstein, Mark, Snyder, Michael
To understand the organization of the transcriptional networks that govern cell differentiation, we have investigated the transcriptional circuitry controlling pseudohyphal development in...
Seringhaus, Michael, Kumar, Anuj, Hartigan, John, Snyder, Michael, Gerstein, Mark
Transposons are widely employed as tools for gene disruption. Ideally, they should display unbiased insertion behavior, and incorporate readily into any genomic DNA to which they are exposed....
Coric, Tatjana, Zheng, Deyou, Gerstein, Mark, Canessa, Cecilia M
The acid-sensitive ion channel 1 (ASIC1) is a neuronal Na+ channel insensitive to changes in membrane potential but is gated by external protons. Proton sensitivity is believed to be essential for...
Integration of curated databases to identify genotype-phenotype associations
Goh, Chern-Sing, Gianoulis, Tara A, Liu, Yang, Li, Jianrong, Paccanaro, Alberto, Lussier, Yves A, ...
An Integrative Genomic Approach to Uncover Molecular Mechanisms of Prokaryotic Traits
Liu, Yang, Li, Jianrong, Sam, Lee, Goh, Chern-Sing, Gerstein, Mark, Lussier, Yves A
With mounting availability of genomic and phenotypic databases, data integration and mining become increasingly challenging. While efforts have been put forward to analyze prokaryotic phenotypes,...
Design principles of molecular networks revealed by global comparisons and composite motifs
Yu, Haiyuan, Xia, Yu, Trifonov, Valery, Gerstein, Mark
A global comparison of the four basic molecular networks in yeast - regulatory, co-expression, interaction and metabolic - reveals general design principles.
TOS9 Regulates White-Opaque Switching in Candida albicans
Srikantha, Thyagarajan, Borneman, Anthony R., Daniels, Karla J., Pujol, Claude, Wu, Wei, Seringhaus, Michael R., ...
In Candida albicans, the a1-α2 complex represses white-opaque switching, as well as mating. Based upon the assumption that the a1-α2 corepressor complex binds to the gene that regulates...
ProCAT: a data analysis approach for protein microarrays
Zhu, Xiaowei, Gerstein, Mark, Snyder, Michael
ProCAT, a powerful and flexible new approach for analyzing many types of protein microarrays, is described.
BoCaTFBS: a boosted cascade learner to refine the binding sites suggested by ChIP-chip experiments
Wang, Lu-yong, Snyder, Michael, Gerstein, Mark
BoCaTFBS, a new method that combines noisy data from ChIP-chip experiments with known binding-site patterns, is described and applied to the ENCODE project.
Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation
Karro, John E., Yan, Yangpan, Zheng, Deyou, Zhang, Zhaolei, Carriero, Nicholas, Cayting, Philip, ...
The Pseudogene.org knowledgebase serves as a comprehensive repository for pseudogene annotation. The definition of a pseudogene varies within the literature, resulting in significantly different...
Predicting essential genes in fungal genomes
Seringhaus, Michael, Paccanaro, Alberto, Borneman, Anthony, Snyder, Michael, Gerstein, Mark
Essential genes are required for an organism's viability, and the ability to identify these genes in pathogens is crucial to directed drug development. Predicting essential genes through...
Yu, Haiyuan, Nguyen, Katherine, Royce, Tom, Qian, Jiang, Nelson, Kenneth, Snyder, Michael, ...
Microarray technology is currently one of the most widely-used technologies in biology. Many studies focus on inferring the function of an unknown gene from its co-expressed genes. Here, we are able...
Global Identification and Characterization of Transcriptionally Active Regions in the Rice Genome
Li, Lei, Wang, Xiangfeng, Sasidharan, Rajkumar, Stolc, Viktor, Deng, Wei, He, Hang, ...
Genome tiling microarray studies have consistently documented rich transcriptional activity beyond the annotated genes. However, systematic characterization and transcriptional profiling of the...
Genomic analysis of the hierarchical structure of regulatory networks
A fundamental question in biology is how the cell uses transcription factors (TFs) to coordinate the expression of thousands of genes in response to various stimuli. The relationships between TFs and...
Yu, Haiyuan, Kim, Philip M, Sprecher, Emmett, Trifonov, Valery, Gerstein, Mark
It has been a long-standing goal in systems biology to find relations between the topological properties and functional features of protein networks. However, most of the focus in network studies has...
Tilescope: online analysis pipeline for high-density tiling microarray data
Zhang, Zhengdong D, Rozowsky, Joel, Lam, Hugo YK, Du, Jiang, Snyder, Michael, Gerstein, Mark
Tilescope is a fully integrated and automated new data-processing pipeline for analyzing high-density tiling-array data.
Smith, Michael G., Gianoulis, Tara A., Pukatzki, Stefan, Mekalanos, John J., Ornston, L. Nicholas, Gerstein, Mark, ...
Acinetobacter baumannii has emerged as an important and problematic human pathogen as it is the causative agent of several types of infections including pneumonia, meningitis, septicemia, and urinary...
Popescu, Sorina C., Popescu, George V., Bachan, Shawn, Zhang, Zimei, Seay, Montrell, Gerstein, Mark, ...
Calmodulins (CaMs) are the most ubiquitous calcium sensors in eukaryotes. A number of CaM-binding proteins have been identified through classical methods, and many proteins have been predicted to...
Alexandrov, Vadim, Lehnert, Ursula, Echols, Nathaniel, Milburn, Duncan, Engelman, Donald, Gerstein, Mark
We carry out an extensive statistical study of the applicability of normal modes to the prediction of mobile regions in proteins. In particular, we assess the degree to which the observed motions...
The role of disorder in interaction networks: a structural analysis
Kim, Philip M, Sboner, Andrea, Xia, Yu, Gerstein, Mark
Recent studies have emphasized the value of including structural information into the topological analysis of protein networks. Here, we utilized structural information to investigate the role of...
Wu, Jia Qian, Du, Jiang, Rozowsky, Joel, Zhang, Zhengdong, Urban, Alexander E, Euskirchen, Ghia, ...
RACE sequencing of ENCODE regions shows that much of the human genome is represented in poly(A)+ RNA.
Transmembrane Protein Oxygen Content and Compartmentalization of Cells
Sasidharan, Rajkumar, Smith, Andrew, Gerstein, Mark
Recently, there was a report that explored the oxygen content of transmembrane proteins over macroevolutionary time scales where the authors observed a correlation between the geological time of...
Modeling ChIP Sequencing In Silico with Applications
Zhang, Zhengdong D., Rozowsky, Joel, Snyder, Michael, Chang, Joseph, Gerstein, Mark
ChIP sequencing (ChIP-seq) is a new method for genomewide mapping of protein binding sites on DNA. It has generated much excitement in functional genomics. To score data and determine adequate...
Systematic evaluation of variability in ChIP-chip experiments using predefined DNA targets
Johnson, David S., Li, Wei, Gordon, D. Benjamin, Bhattacharjee, Arindam, Curry, Bo, Ghosh, Jayati, ...
The most widely used method for detecting genome-wide protein–DNA interactions is chromatin immunoprecipitation on tiling microarrays, commonly known as ChIP-chip. Here, we conducted the first...
YMD: a microarray database for large-scale gene expression analysis
Cheung, Kei-Hoi, White, Kevin, Hager, Janet, Gerstein, Mark, Reinke, Valerie, Nelson, Kenneth, ...
The use of microarray technology to perform parallel analysis of the expression pattern of a large number of genes in a single experiment has created a new frontier of medical research. The vast...
Lian, Zheng, Karpikov, Alexander, Lian, Jin, Mahajan, Milind C., Hartman, Stephen, Gerstein, Mark, ...
Genomic analyses have been applied extensively to analyze the process of transcription initiation in mammalian cells, but less to transcript 3′ end formation and transcription termination. We used...
Mismatch oligonucleotides in human and yeast: guidelines for probe design on tiling microarrays
Seringhaus, Michael, Rozowsky, Joel, Royce, Thomas, Nagalakshmi, Ugrappa, Jee, Justin, Snyder, Michael, ...
Motivation: An important problem in systems biology is reconstructing complete networks of interactions between biological objects by extrapolating from a few known interactions as examples. While...
Efficient yeast ChIP-Seq using multiplex short-read DNA sequencing
Lefrançois, Philippe, Euskirchen, Ghia M, Auerbach, Raymond K, Rozowsky, Joel, Gibson, Theodore, Yellman, Christopher M, ...
Comparative analysis of processed ribosomal protein pseudogenes in four mammalian genomes
Balasubramanian, Suganthi, Zheng, Deyou, Liu, Yuen-Jong, Fang, Gang, Frankish, Adam, Carriero, Nicholas, ...
An analysis of ribosomal protein pseudogenes in the four mammalian genomes reveals no correlation between number of pseudogenes and mRNA abundance.
MSB: A mean-shift-based approach for the analysis of structural variation in the genome
Wang, Lu-yong, Abyzov, Alexej, Korbel, Jan O., Snyder, Michael, Gerstein, Mark
Genome structural variation includes segmental duplications, deletions, and other rearrangements, and array-based comparative genomic hybridization (array-CGH) is a popular technology for determining...
MAPK target networks in Arabidopsis thaliana revealed using functional protein microarrays
Popescu, Sorina C., Popescu, George V., Bachan, Shawn, Zhang, Zimei, Gerstein, Mark, Snyder, Michael, ...
Signaling through mitogen-activated protein kinase (MPK) cascades is a complex and fundamental process in eukaryotes, requiring MPK-activating kinases (MKKs) and MKK-activating kinases (MKKKs)....
Zebrafish miR-1 and miR-133 shape muscle gene expression and regulate sarcomeric actin organization
Mishima, Yuichiro, Abreu-Goodger, Cei, Staton, Alison A., Stahlhut, Carlos, Shou, Chong, Cheng, Chao, ...
microRNAs (miRNAs) represent ∼4% of the genes in vertebrates, where they regulate deadenylation, translation, and decay of the target messenger RNAs (mRNAs). The integrated role of miRNAs to...
An approach to compare genome tiling microarray and MPSS sequencing data for transcript mapping
Sasidharan, Rajkumar, Agarwal, Ashish, Rozowsky, Joel, Gerstein, Mark
We are correcting the abstract of our published article ([1]). The sentence that starts "We observe that 4.5% of MPSS tags...." was not scientifically complete in the original abstract, having only...
Cheng, Chao, Fu, Xuping, Alves, Pedro, Gerstein, Mark
Most microRNAs have a stronger inhibitory effect in estrogen receptor-negative than in estrogen receptor-positive breast cancers