Abstract

With the advent of ultra-high-throughput sequencing technologies, researchers are increasingly turning to deep sequencing for gene expression studies. Here we present a set of rigorous methods for normalization, quantification of noise, and co-expression analysis of deep sequencing data. Using these methods on 122 cap analysis of gene expression (CAGE) samples of transcription start sites, we construct genome-wide 'promoteromes' in human and mouse consisting of a three-tiered hierarchy of transcription start sites, transcription start clusters, and transcription start regions.

Keywords

Biology; Computational biology; Deep sequencing; Genetics; DNA sequencing; Gene; Transcription (linguistics); Normalization (sociology); Genomics; Genome; Human genetics; Human genome

MeSH Terms

Algorithms; Animals; Base Composition; Cell Line; Cluster Analysis; Computational Biology; CpG Islands; Gene Expression Profiling; Genome-Wide Association Study; Humans; Mice; Oligonucleotide Array Sequence Analysis; Promoter Regions, Genetic; Reproducibility of Results; Sequence Analysis, DNA; Transcription Initiation Site

Related Publications

The Phusion Assembler

The Phusion assembler has assembled the mouse genome from the whole-genome shotgun (WGS) dataset collected by the Mouse Genome Sequencing Consortium, at ∼7.5× sequence coverage,...

2002 Genome Research 220 citations

Publication Info

Year: 2009
Type: article
Volume: 10
Issue: 7
Citations: 142
Access: Closed

Citation Metrics

OpenAlex: 142
Influential: 12

Cite This

Piotr J. Balwierz, Piero Carninci, Carsten O. Daub et al. (2009). Methods for analyzing deep sequencing expression data: constructing the human and mouse promoterome with deepCAGE data. Genome Biology, 10(7). https://doi.org/10.1186/gb-2009-10-7-r79

Identifiers

DOI: 10.1186/gb-2009-10-7-r79
PMID: 19624849
PMCID: PMC2728533
