Sketching volume capacities in deduplicated storage

Danny Harnik, Moshik Hershcovitch, Yosef Shatsky, Amir Epstein, Ronen Kat

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

12 Scopus citations

Abstract

The adoption of deduplication in storage systems has introduced significant new challenges for storage management. Specifically, the physical capacities associated with volumes are no longer readily available. In this work we introduce a new approach to analyzing capacities in deduplicated storage environments. We provide sketch-based estimations of fundamental capacity measures required for managing a storage system: How much physical space would be reclaimed if a volume or group of volumes were to be removed from a system (the reclaimable capacity) and how much of the physical space should be attributed to each of the volumes in the system (the attributed capacity). Our methods also support capacity queries for volume groups across multiple storage systems, e.g., how much capacity would a volume group consume after being migrated to another storage system? We provide analytical accuracy guarantees for our estimations as well as empirical evaluations. Our technology is integrated into a prominent all-flash storage array and exhibits high performance even for very large systems. We also demonstrate how this method opens the door for performing placement decisions at the data center level and obtaining insights on deduplication in the field.
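The two capacity measures named in the abstract can be illustrated on a toy system. The following is a minimal sketch and not the authors' implementation: the fixed chunk size, the SHA-1 fingerprints, the proportional split used for attributed capacity, and the prefix-based sampling in `in_sketch` are all assumptions made for illustration, and every function name is hypothetical.

```python
import hashlib
from collections import Counter

CHUNK_SIZE = 8192  # bytes per chunk; an illustrative assumption


def fingerprint(data: bytes) -> str:
    """Content hash used to detect duplicate chunks."""
    return hashlib.sha1(data).hexdigest()


def in_sketch(fp: str, rate_bits: int) -> bool:
    """Keep a fingerprint only if its top `rate_bits` bits are zero.

    This samples roughly a 2**-rate_bits fraction of all distinct chunks,
    and the same chunk is either kept or dropped in every volume.
    """
    return int(fp[:8], 16) >> (32 - rate_bits) == 0


def reclaimable(volumes, group, rate_bits=0):
    """Estimate the physical bytes freed if every volume in `group` is deleted.

    With rate_bits=0 every fingerprint is kept and the result is exact.
    """
    inside, outside = set(), set()
    for name, fps in volumes.items():
        target = inside if name in group else outside
        target.update(fp for fp in fps if in_sketch(fp, rate_bits))
    # Only chunks referenced by no volume outside the group are freed.
    return len(inside - outside) * CHUNK_SIZE * 2 ** rate_bits


def attributed(volumes):
    """Split each chunk's size among volumes in proportion to their references.

    Assumed sharing rule: a chunk referenced r times system-wide contributes
    CHUNK_SIZE / r bytes per reference.
    """
    refs = Counter(fp for fps in volumes.values() for fp in fps)
    return {
        name: sum(CHUNK_SIZE / refs[fp] for fp in fps)
        for name, fps in volumes.items()
    }


# Toy system: chunk y is shared by a and b, chunk z by b and c.
x, y, z = (fingerprint(c) for c in (b"x", b"y", b"z"))
volumes = {"a": [x, y], "b": [y, z], "c": [z]}

print(reclaimable(volumes, {"a"}))       # only chunk x is freed
print(reclaimable(volumes, {"a", "b"}))  # chunks x and y are freed
print(attributed(volumes))               # shares add up to 3 * CHUNK_SIZE
```

Under these assumptions the attributed capacities always sum to the total physical space, while reclaimable capacities of separate volumes generally do not, since shared chunks are freed only when all of their owners are removed together.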

Original language: English
Title of host publication: Proceedings of the 17th USENIX Conference on File and Storage Technologies, FAST 2019
Publisher: USENIX Association
Pages: 107-119
Number of pages: 13
ISBN (Electronic): 9781939133090
State: Published - 1 Jan 2019
Externally published: Yes
Event: 17th USENIX Conference on File and Storage Technologies, FAST 2019 - Boston, United States
Duration: 25 Feb 2019 to 28 Feb 2019

Publication series

Name: Proceedings of the 17th USENIX Conference on File and Storage Technologies, FAST 2019

Conference

Conference: 17th USENIX Conference on File and Storage Technologies, FAST 2019
Country/Territory: United States
City: Boston
Period: 25/02/19 to 28/02/19

ASJC Scopus subject areas

  • Hardware and Architecture
  • Computer Networks and Communications
  • Software
