Abstract

We offer a new metric for big data platforms, COST, or the Configuration that Outperforms a Single Thread. The COST of a given platform for a given problem is the hardware configuration required before the platform outperforms a competent single-threaded implementation. COST weighs a system’s scalability against the overheads introduced by the system, and indicates the actual performance gains of the system, without rewarding systems that bring substantial but parallelizable overheads. We survey measurements of data-parallel systems recently reported in SOSP and OSDI, and find that many systems have either a surprisingly large COST, often hundreds of cores, or simply underperform one thread for all of their reported configurations.
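
As a rough illustration of the definition above, the following Python sketch computes a COST value from a set of runtime measurements. It is not the authors' tooling: the function name `cost`, the choice to express a hardware configuration as a plain core count, and all runtimes shown are hypothetical and chosen only to make the idea concrete.

```python
from typing import Mapping, Optional


def cost(system_runtimes: Mapping[int, float],
         single_thread_runtime: float) -> Optional[int]:
    """Return the smallest measured core count at which the system beats
    the single-threaded baseline, or None if it never does.

    Assumption: a configuration is summarized by its core count, and
    runtimes are in seconds for the same problem and dataset.
    """
    for cores in sorted(system_runtimes):
        if system_runtimes[cores] < single_thread_runtime:
            return cores
    # The system underperforms one thread in every reported configuration.
    return None


# Hypothetical measurements: core count -> runtime in seconds.
scalable_but_slow = {16: 900.0, 64: 420.0, 128: 280.0, 256: 210.0}
never_catches_up = {16: 900.0, 128: 450.0}

# Hypothetical runtime of a competent single-threaded implementation.
baseline = 300.0

print(cost(scalable_but_slow, baseline))
print(cost(never_catches_up, baseline))
```

In this sketch the first system scales well (its runtime keeps falling as cores are added) yet only beats the baseline at 128 cores, while the second never beats it, which corresponds to the "underperform one thread for all of their reported configurations" case in the abstract.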

Keywords

Scalability, Parallelizable manifold, Computer science, Thread (computing), Distributed computing, Big data, Embedded system, Parallel computing, Operating system

Publication Info

Year: 2015
Type: article
Pages: 14-14
Citations: 218
Access: Closed

Citation Metrics

218 (OpenAlex)

Cite This

Frank McSherry, Michael Isard, Derek G. Murray (2015). Scalability! But at what COST? 14-14.