Abstract
There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. This commitment has led to irrelevant theory, questionable conclusions, and has kept statisticians from working on a large range of interesting current problems. Algorithmic modeling, both in theory and practice, has developed rapidly in fields outside statistics. It can be used both on large complex data sets and as a more accurate and informative alternative to data modeling on smaller data sets. If our goal as a field is to use data to solve problems, then we need to move away from exclusive dependence on data models and adopt a more diverse set of tools.
Related Publications
Rank-Normalization, Folding, and Localization: An Improved R̂ for Assessing Convergence of MCMC (with Discussion)
Markov chain Monte Carlo is a key computational tool in Bayesian statistics, but it can be challenging to monitor the convergence of an iterative stochastic algorithm. In this...
A multivariate technique for multiply imputing missing values using a sequence of regression models
This article describes and evaluates a procedure for imputing missing values for a relatively complex data structure when the data are missing at random. The imputations are obt...
Arcing classifier (with discussion and a rejoinder by the author)
Recent work has shown that combining multiple versions of unstable classifiers such as trees or neural nets results in reduced test set error. One of the more effective is bag...
Maximum Likelihood Estimation and Model Selection in Contingency Tables with Missing Data
In many studies the values of one or more variables are missing for subsets of the original sample. This article focuses on the problem of obtaining maximum likelihood ...
Publication Info
- Year: 2001
- Type: article
- Volume: 16
- Issue: 3
- Citations: 4037
- Access: Closed
Identifiers
- DOI: 10.1214/ss/1009213726