Abstract
Commonly used evaluation measures including Recall, Precision, F-Measure and Rand Accuracy are biased and should not be used without a clear understanding of the biases and a corresponding identification of the chance or base-case levels of the statistic. A system that performs worse in the objective sense of Informedness can appear to perform better under any of these commonly used measures. We discuss several concepts and measures that reflect the probability that a prediction is informed versus chance, namely Informedness, and introduce Markedness as a dual measure for the probability that a prediction is marked versus chance. Finally, we demonstrate elegant connections between the concepts of Informedness, Markedness, Correlation and Significance, as well as their intuitive relationships with Recall and Precision, and outline the extension from the dichotomous case to the general multi-class case.
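The abstract's central claim can be illustrated with a small sketch. The abstract itself gives no formulas, so the definitions below are the commonly stated dichotomous ones and should be read as assumptions: Informedness as Recall plus Inverse Recall minus 1, Markedness as Precision plus Inverse Precision minus 1, and the Matthews correlation as the correlation measure. The function name `evaluate` and the two contingency tables are hypothetical, chosen only to show a positively biased system outscoring a more informed one on Recall, F-Measure and Accuracy.

```python
from math import sqrt

def evaluate(tp, fn, fp, tn):
    """Return common and chance-corrected measures for a 2x2 contingency table."""
    recall = tp / (tp + fn)             # true positive rate
    inverse_recall = tn / (tn + fp)     # true negative rate
    precision = tp / (tp + fp)          # positive predictive value
    inverse_precision = tn / (tn + fn)  # negative predictive value
    return {
        "recall": recall,
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        # Assumed definitions (not stated in the abstract):
        "informedness": recall + inverse_recall - 1,
        "markedness": precision + inverse_precision - 1,
        "mcc": (tp * tn - fp * fn)
               / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }

# Hypothetical data: 90 positive and 10 negative cases.
# System A labels almost everything positive; System B discriminates better.
a = evaluate(tp=85, fn=5, fp=9, tn=1)
b = evaluate(tp=70, fn=20, fp=1, tn=9)

for name, scores in (("A", a), ("B", b)):
    print(name, {key: round(value, 3) for key, value in scores.items()})
```

On these made-up counts, System A has the higher Recall, F-Measure and Accuracy, while its Informedness is about 0.04 against about 0.68 for System B. The sketch also lets one check numerically that the squared correlation equals Informedness times Markedness in the dichotomous case. It assumes every row and column of the table is non-empty; a system that never predicts one class would need explicit handling of the zero denominators.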
Related Publications
Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation
Commonly used evaluation measures including Recall, Precision, F-Measure and Rand Accuracy are biased and should not be used without clear understanding of the biases, and corre...
Precision and recall of machine translation
Machine translation can be evaluated using precision, recall, and the F-measure. These standard measures have significantly higher correlation with human judgments than recently...
Revised “LEPS” Scores for Assessing Climate Model Simulations and Long-Range Forecasts
The most commonly used measures for verifying forecasts or simulators of continuous variables are root-mean-squared error (rmse) and anomaly correlation. Some disadvantages of t...
Bayesian inference of stellar parameters and interstellar extinction using parallaxes and multiband photometry
Astrometric surveys provide the opportunity to measure the absolute magnitudes of large numbers of stars, but only if the individual line-of-sight extinctions are known. Unfor...
The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation
Abstract Background To evaluate binary classifications and their confusion matrices, scientific researchers can employ several statistical rates, accordingly to the goal of the ...
Publication Info
- Year: 2020
- Type: preprint
- Citations: 1514
- Access: Closed
Identifiers
- DOI: 10.48550/arxiv.2010.16061