Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model

Abstract

We aim to produce predictive models that are not only accurate, but are also interpretable to human experts. Our models are decision lists, which consist of a series of if...then... statements (e.g., if high blood pressure, then stroke) that discretize a high-dimensional, multivariate feature space into a series of simple, readily interpretable decision statements. We introduce a generative model called Bayesian Rule Lists that yields a posterior distribution over possible decision lists. It employs a novel prior structure to encourage sparsity. Our experiments show that Bayesian Rule Lists has predictive accuracy on par with the current top algorithms for prediction in machine learning. Our method is motivated by recent developments in personalized medicine, and can be used to produce highly accurate and interpretable medical scoring systems. We demonstrate this by producing an alternative to the CHADS$_2$ score, actively used in clinical practice for estimating the risk of stroke in patients that have atrial fibrillation. Our model is as interpretable as CHADS$_2$, but more accurate.
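To make the idea of a decision list concrete, here is a minimal Python sketch of how such a model is evaluated: rules are checked in order and the first matching rule determines the prediction. The rules, the risk values, and the names `Rule`, `predict`, `RULES`, and `DEFAULT_RISK` are hypothetical and chosen only for illustration; they are not the list learned in the paper, and the sketch does not implement the Bayesian Rule Lists inference itself. For comparison, `chads2` computes the standard CHADS$_2$ point score (one point each for congestive heart failure, hypertension, age 75 or older, and diabetes; two points for prior stroke or TIA).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Patient = Dict[str, object]


@dataclass
class Rule:
    description: str
    condition: Callable[[Patient], bool]
    risk: float  # hypothetical stroke risk, NOT taken from the paper


def predict(rules: List[Rule], default_risk: float, patient: Patient) -> Tuple[float, str]:
    """Return the risk of the first rule whose condition holds, else the default."""
    for rule in rules:
        if rule.condition(patient):
            return rule.risk, rule.description
    return default_risk, "default"


# Hypothetical decision list: order matters, earlier rules take precedence.
RULES = [
    Rule("prior stroke", lambda p: bool(p["prior_stroke"]), 0.50),
    Rule("hypertension and age >= 75",
         lambda p: bool(p["hypertension"]) and p["age"] >= 75, 0.30),
    Rule("hypertension", lambda p: bool(p["hypertension"]), 0.15),
]
DEFAULT_RISK = 0.05  # hypothetical value for patients matching no rule


def chads2(p: Patient) -> int:
    """Standard CHADS2 points: C, H, A (age >= 75), D count 1 each; S counts 2."""
    return (int(bool(p["chf"])) + int(bool(p["hypertension"]))
            + int(p["age"] >= 75) + int(bool(p["diabetes"]))
            + 2 * int(bool(p["prior_stroke"])))


patient = {"age": 80, "hypertension": True, "prior_stroke": False,
           "chf": False, "diabetes": True}
risk, fired = predict(RULES, DEFAULT_RISK, patient)
print(f"decision list: rule '{fired}' fired, risk {risk:.2f}")
print(f"CHADS2 score: {chads2(patient)}")
```

Both models can be read directly by a clinician: the decision list reports which single rule produced the estimate, while CHADS$_2$ reports a sum of points.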

Keywords

Machine learning, Computer science, Artificial intelligence, Bayesian probability, Feature (linguistics), Data mining

Publication Info

Year: 2015
Type: article
Volume: 9
Issue: 3
Citations: 743
Access: Closed

Citation Metrics

OpenAlex: 743
CrossRef: 469

Cite This

APA Style

Benjamin Letham, Cynthia Rudin, Tyler H. McCormick, et al. (2015). Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model. The Annals of Applied Statistics, 9(3). https://doi.org/10.1214/15-aoas848

Identifiers

DOI: 10.1214/15-aoas848
arXiv: 1511.01644
