Abstract
MapReduce is a programming model and an associated implementation for processing and generating large datasets that is amenable to a broad variety of real-world tasks. Users specify the computation in terms of a map and a reduce function, and the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks. Programmers find the system easy to use: more than ten thousand distinct MapReduce programs have been implemented internally at Google over the past four years, and an average of one hundred thousand MapReduce jobs are executed on Google's clusters every day, processing a total of more than twenty petabytes of data per day.
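The division of labor described in the abstract, where the user supplies only a map and a reduce function and the runtime does the rest, can be illustrated with a small single-process sketch. This is not Google's implementation or its API: the names `map_words`, `reduce_counts`, and `run_mapreduce` are hypothetical, the word-count task is chosen only for illustration, and the in-memory grouping step stands in for the parallelization, fault handling, and inter-machine communication that the real runtime provides.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, Tuple


def map_words(doc_id: str, text: str) -> Iterator[Tuple[str, int]]:
    """User-supplied map function: emit (word, 1) for every word in a document."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> int:
    """User-supplied reduce function: sum all values emitted for one key."""
    return sum(counts)


def run_mapreduce(
    inputs: Dict[str, str],
    map_fn: Callable[[str, str], Iterator[Tuple[str, int]]],
    reduce_fn: Callable[[str, Iterable[int]], int],
) -> Dict[str, int]:
    """Toy sequential stand-in for the runtime (assumption: everything fits in memory)."""
    # Map phase: apply map_fn to every input record and group the
    # intermediate values by key. A real runtime would spread this work
    # across many machines and shuffle the pairs over the network.
    groups = defaultdict(list)
    for key, value in inputs.items():
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    # Reduce phase: apply reduce_fn once per distinct intermediate key.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


docs = {"d1": "the quick brown fox", "d2": "the lazy dog the end"}
word_counts = run_mapreduce(docs, map_words, reduce_counts)
```

Because the two user functions never refer to machines, files, or scheduling, the same pair could in principle be handed to a distributed runtime unchanged, which is the ease of use the abstract emphasizes.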
Publication Info
- Year: 2008
- Type: article
- Volume: 51
- Issue: 1
- Pages: 107-113
- Citations: 18309
- Access: Closed
Identifiers
- DOI: 10.1145/1327452.1327492