Abstract

Background: An important challenge for transcript counting methods such as Serial Analysis of Gene Expression (SAGE), "Digital Northern" or Massively Parallel Signature Sequencing (MPSS) is to carry out statistical analyses that account for the within-class variability, i.e., variability due to the intrinsic biological differences among sampled individuals of the same class, and not only variability due to technical sampling error.

Results: We introduce a Bayesian model that accounts for the within-class variability by means of a mixture distribution. We show that the previously available approaches, aggregation in pools ("pseudo-libraries") and the Beta-Binomial model, are particular cases of the mixture model. We illustrate our method with a brain tumor vs. normal comparison using SAGE data from public databases. We show examples of tags that are regarded as differentially expressed with high significance if the within-class variability is ignored, but that are clearly less significant once it is accounted for.

Conclusion: Using available information about biological replicates, one can transform a list of candidate transcripts showing differential expression into a more reliable one. Our method is freely available, under GPL/GNU copyleft, through a user-friendly web-based on-line tool or as R language scripts at the supplemental website.
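As a rough illustration of the contrast drawn in the abstract, the sketch below compares a pooled ("pseudo-library") Binomial likelihood with a Beta-Binomial likelihood on replicate tag counts. It is not the authors' implementation (which is distributed as R scripts); the counts, library sizes, and the `concentration` value are hypothetical, and the Beta parameters are simply fixed to share the pooled mean rather than being fitted.

```python
from math import lgamma, log, exp

def log_beta(a, b):
    # log of the Beta function B(a, b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_choose(n, k):
    # log of the binomial coefficient C(n, k)
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def binomial_logpmf(x, n, p):
    # Sampling error only: every library shares one abundance p
    return log_choose(n, x) + x * log(p) + (n - x) * log(1 - p)

def beta_binomial_logpmf(x, n, a, b):
    # Each library has its own abundance drawn from Beta(a, b),
    # which models within-class biological variability
    return log_choose(n, x) + log_beta(x + a, n - x + b) - log_beta(a, b)

# Hypothetical counts of one tag in four libraries of the same class
counts = [5, 40, 8, 35]
sizes = [50000, 50000, 50000, 50000]

# Pooling: collapse the replicates into a single pseudo-library
pooled_p = sum(counts) / sum(sizes)
pooled_ll = sum(binomial_logpmf(x, n, pooled_p) for x, n in zip(counts, sizes))

# Beta-Binomial with the same mean; concentration is an assumed value
concentration = 2000.0
a = pooled_p * concentration
b = (1.0 - pooled_p) * concentration
bb_ll = sum(beta_binomial_logpmf(x, n, a, b) for x, n in zip(counts, sizes))

print("pooled Binomial log-likelihood:", pooled_ll)
print("Beta-Binomial log-likelihood:  ", bb_ll)
```

With replicates that disagree this strongly, the Beta-Binomial assigns the data a higher likelihood than the pooled Binomial, which is the kind of overdispersion that pooling hides.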

Keywords

Serial analysis of gene expression; Negative binomial distribution; Bayesian probability; Computer science; Scripting language; Class (philosophy); Mixture model; Computational biology; Biology; Data mining; Poisson distribution; Gene expression profiling; Gene expression; Statistics; Artificial intelligence; Gene; Mathematics; Genetics

MeSH Terms

Astrocytoma; Bayes Theorem; Brain Chemistry; Brain Neoplasms; Computational Biology; Databases, Genetic; Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Genetic Variation; Humans; Models, Genetic; Oligonucleotide Array Sequence Analysis; Research Design

Publication Info

Year: 2004
Type: article
Volume: 5
Issue: 1
Pages: 119
Citations: 66
Access: Closed

Citation Metrics

OpenAlex citations: 66
Influential citations: 5

Cite This

Ricardo Z. N. Vêncio, Helena Brentani, Diogo Ferreira da Costa Patrão et al. (2004). Bayesian model accounting for within-class biological variability in Serial Analysis of Gene Expression (SAGE). BMC Bioinformatics, 5(1), 119. https://doi.org/10.1186/1471-2105-5-119

Identifiers

DOI: 10.1186/1471-2105-5-119
PMID: 15339345
PMCID: PMC517707
