Abstract
Motivation: DNA microarrays are now capable of providing genome-wide patterns of gene expression across many different conditions. The first level of analysis of these patterns requires determining whether observed differences in expression are significant or not. Current methods are unsatisfactory due to the lack of a systematic framework that can accommodate the noise, variability, and low replication often typical of microarray data.
Results: We develop a Bayesian probabilistic framework for microarray data analysis. At the simplest level, we model log-expression values by independent normal distributions, parameterized by corresponding means and variances with hierarchical prior distributions. We derive point estimates for both parameters and hyperparameters, and regularized expressions for the variance of each gene by combining the empirical variance with a local background variance associated with neighboring genes. An additional hyperparameter, inversely related to the number of empirical observations, determines the strength of the background variance. Simulations show that these point estimates, combined with a t-test, provide a systematic inference approach that compares favorably with simple t-test or fold methods, and partly compensate for the lack of replication.
Availability: The approach is implemented in software called Cyber-T, accessible through a Web interface at www.genomics.uci.edu/software.html. The code is available as Open Source and is written in the freely available statistical language R.
Contact: pfbaldi@ics.uci.edu; tdlong@uci.edu
Additional author affiliation: Department of Biological Chemistry, College of Medicine, University of California, Irvine.
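The regularized variance described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so everything specific here is an assumption: the weighted combination (nu0 * background + (n - 1) * empirical) / (nu0 + n - 2), the choice of "neighboring genes" as a sliding window over genes ranked by mean log-expression, the edge padding, and the names `regularized_variance`, `regularized_t`, `nu0`, and `window`. It is written in Python rather than R and is not the Cyber-T code.

```python
import numpy as np

def regularized_variance(x, nu0=10, window=101):
    """Shrink per-gene variances toward a local background variance.

    x      : (genes, replicates) array of log-expression values
    nu0    : assumed hyperparameter weighting the background variance
    window : assumed odd number of expression-ranked neighbors to average
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[1]
    if n < 2 or window % 2 == 0:
        raise ValueError("need at least 2 replicates and an odd window")
    mean = x.mean(axis=1)
    s2 = x.var(axis=1, ddof=1)          # empirical variance per gene
    order = np.argsort(mean)            # rank genes by mean expression
    half = window // 2
    # Running mean of empirical variances over neighboring genes;
    # edge padding keeps the output the same length as the input.
    padded = np.pad(s2[order], half, mode="edge")
    bg_sorted = np.convolve(padded, np.ones(window) / window, mode="valid")
    bg = np.empty_like(s2)
    bg[order] = bg_sorted               # map back to original gene order
    return (nu0 * bg + (n - 1) * s2) / (nu0 + n - 2)

def regularized_t(x1, x2, nu0=10, window=101):
    """Two-condition t statistic using the regularized variances."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    v1 = regularized_variance(x1, nu0, window)
    v2 = regularized_variance(x2, nu0, window)
    se = np.sqrt(v1 / x1.shape[1] + v2 / x2.shape[1])
    return (x1.mean(axis=1) - x2.mean(axis=1)) / se
```

With few replicates the background term dominates, so a gene whose empirical variance happens to be very small no longer produces an inflated t statistic. As the number of replicates grows, the empirical variance takes over.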
Publication Info
- Year: 2001
- Type: article
- Volume: 17
- Issue: 6
- Pages: 509-519
- Citations: 1617
- Access: Closed
Identifiers
- DOI: 10.1093/bioinformatics/17.6.509