Abstract

This paper presents and evaluates the storage management and caching in PAST, a large-scale peer-to-peer persistent storage utility. PAST is based on a self-organizing, Internet-based overlay network of storage nodes that cooperatively route file queries, store multiple replicas of files, and cache additional copies of popular files.

In the PAST system, storage nodes and files are each assigned uniformly distributed identifiers, and replicas of a file are stored at nodes whose identifier matches most closely the file's identifier. This statistical assignment of files to storage nodes approximately balances the number of files stored on each node. However, non-uniform storage node capacities and file sizes require more explicit storage load balancing to permit graceful behavior under high global storage utilization; likewise, non-uniform popularity of files requires caching to minimize fetch distance and to balance the query load.

We present and evaluate PAST, with an emphasis on its storage management and caching system. Extensive trace-driven experiments show that the system minimizes fetch distance, that it balances the query load for popular files, and that it displays graceful degradation of performance as the global storage utilization increases beyond 95%.
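The identifier-based placement described in the abstract can be illustrated with a minimal sketch. It is not the PAST implementation: the use of SHA-1, the 128-bit circular identifier space, the circular distance metric, and the names `make_id`, `ring_distance`, and `replica_nodes` are all assumptions made for illustration. The sketch only shows how hashing names into a uniform identifier space and choosing the k closest node identifiers yields a statistical assignment of files to nodes.

```python
import hashlib
from collections import Counter

# Assumption: a 128-bit circular identifier space shared by nodes and files.
ID_BITS = 128
ID_SPACE = 1 << ID_BITS


def make_id(name: str) -> int:
    """Hash a name to a roughly uniform identifier (top 128 bits of SHA-1)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()  # 160-bit digest
    return int.from_bytes(digest, "big") >> (160 - ID_BITS)


def ring_distance(a: int, b: int) -> int:
    """Distance between two identifiers on the circular identifier space."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)


def replica_nodes(file_id: int, node_ids: list[int], k: int) -> list[int]:
    """Return the k node identifiers closest to file_id (ties broken by id)."""
    return sorted(node_ids, key=lambda n: (ring_distance(n, file_id), n))[:k]


if __name__ == "__main__":
    # Hypothetical population: 50 nodes, 5000 files, 3 replicas per file.
    nodes = [make_id(f"node-{i}") for i in range(50)]
    load = Counter()
    for i in range(5000):
        for node in replica_nodes(make_id(f"file-{i}"), nodes, k=3):
            load[node] += 1
    print("min replicas on a node:", min(load[n] for n in nodes))
    print("max replicas on a node:", max(load[n] for n in nodes))
```

Running the script prints the smallest and largest replica counts across nodes. The sketch counts files only, so it says nothing about the non-uniform node capacities and file sizes that, according to the abstract, motivate explicit storage load balancing.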

Keywords

Computer science, Computer network, Cache, Distributed data store, Identifier, Computer data storage, Node (physics), Load balancing (electrical power), File system fragmentation, Storage management, Peer-to-peer, Distributed computing, Information repository, Storage efficiency, File size, Database, Operating system, Computer file, Stub file, Grid

Publication Info

Year: 2001
Type: article
Pages: 188-201
Citations: 1220
Access: Closed

Citation Metrics

OpenAlex: 1220

Cite This

Antony Rowstron, Peter Druschel (2001). Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility. pp. 188-201. https://doi.org/10.1145/502034.502053

Identifiers

DOI: 10.1145/502034.502053