A Tale of Two Erasure Codes in HDFS

1 A Tale of Two Erasure Codes in HDFS. Dynamo. Mingyuan Xia*, Mohit Saxena+, Mario Blaum+, and David A. Pease+ (*McGill University, +IBM Research Almaden). FAST '15. 何军权

2 Outline: Introduction & Motivation; Design; Evaluation; Conclusions; Related Work

3 Introduction & Motivation

4 Big Data Storage Reliability and Availability. Replication: 3-way replication. Erasure codes: Reed-Solomon (RS), LRC. Deployments: GFS, 3-way replication, 3x, 2003; GFS v2, RS, 1.5x, 2012; FB HDFS, LRC, 1.66x, 2013; FB HDFS, RS, 1.4x, 2011; Azure, LRC, 1.33x,

5 Popular Erasure Code Families: Product Code (PC); Local Reconstruction Code (LRC); Other: Reed-Solomon (RS).
[Figure, PC layout: data rows a_0..a_4 and b_0..b_4 with horizontal parities h_a and h_b, plus a vertical parity row P_0..P_4 with its own parity h]
[Figure, LRC layout: data rows a_0..a_5 and a_6..a_11 with global parities G_1 and G_2, plus a local parity row L_0..L_5]

6 Erasure Code: Facebook HDFS RS(10,4). Computes 4 parity blocks per 10 data blocks. All blocks are stored on different storage nodes. Storage overhead: 1.4x. Stripe: D1 D2 D3 D4 D5 D6 D7 D8 D9 D10 P1 P2 P3 P4
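
The 1.4x figure follows directly from the block counts. A minimal sketch of the arithmetic, assuming a systematic code in which the data blocks are stored unchanged alongside the parities; the function name `storage_overhead` is illustrative and not part of HDFS:

```python
def storage_overhead(data_blocks: int, parity_blocks: int) -> float:
    """Blocks stored per block of user data for a systematic code."""
    return (data_blocks + parity_blocks) / data_blocks


# Facebook HDFS RS(10,4): 10 data blocks plus 4 parity blocks.
rs_10_4 = storage_overhead(10, 4)

# 3-way replication viewed the same way: 1 original plus 2 extra copies.
three_way = storage_overhead(1, 2)

print(f"RS(10,4): {rs_10_4:.2f}x, 3-way replication: {three_way:.2f}x")
```

The same ratio reproduces the other overheads quoted in the deck, for example RS(6,3) at 9/6 = 1.5x.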

7 Erasure Code: High Degraded Read Latency. A read to an unavailable block requires multiple disk reads, network transfers, and compute cycles to decode. [Figure: a client read to HDFS hits a read exception]

8 Erasure Code: Long Reconstruction Time. Facebook's cluster: 100K blocks lost per day; 50 machine-unavailability events per day; reconstruction traffic of 180TB per day. [Figure: reconstruction job running on HDFS]

9 Erasure Code: Recovery Cost. Related metrics: degraded read latency, reconstruction time. Recovery cost: the total number of blocks required to reconstruct a data block after a failure.

10 Recovery Cost vs. Storage Overhead. Conclusion: with a single erasure code, storage overhead and reconstruction cost trade off against each other. [Plot points: FB HDFS RS, Azure LRC, GFS v2 RS, FB HDFS LRC, GFS 3-way Repl]

11 How to balance? Storage Overhead vs. Recovery Cost

12 Data Access Skew. Conclusions: only a small fraction of the data is "hot", P(freq > 10) ~= 1%; most data is "cold", P(freq <= 10) ~= 99%.

13 Data Access Skew. Hot data: high access frequency; a small fraction of the data. Cold data: low access frequency; the major fraction of the data. For hot data, a small improvement in read cost yields a large gain in read performance, so decrease the recovery cost. For cold data, storing slightly less per block saves a large amount of storage space, so favor high storage efficiency.

14 HACFS. System State: tracks file states (file size, last mtime, read count, and coding state). Adaptive Coding: tracks system states; chooses the coding scheme based on read count and mtime. Erasure Coding: provides four coding interfaces (encode/decode and upcode/downcode).

15 Erasure Coding Algorithms. Two different erasure codes. Fast code: encodes the frequently accessed blocks to reduce read latency and reconstruction time; provides overall low recovery cost. Compact code: encodes the less frequently accessed blocks to achieve low storage overhead; maintains a low and bounded storage overhead.

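16 State Transition (HACFS). COND: read hot and bounded. COND': read cold or not bounded. [State diagram: recently created files start in 3-way replication; once write cold, they move to the fast code under COND or to the compact code under COND'; afterwards files move from the fast code to the compact code under COND' and back under COND]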
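
The transitions on this slide can be sketched as a small state function. This is a reading of the diagram under stated assumptions, not HACFS source code: the names `Coding` and `next_state` are hypothetical, and how "write cold", "read hot", and "bounded" are derived from read counts, mtime, and the storage bound is left outside the sketch as plain booleans.

```python
from enum import Enum


class Coding(Enum):
    REPLICATED = "3-way replication"
    FAST = "fast code"
    COMPACT = "compact code"


def next_state(state: Coding, write_cold: bool, read_hot: bool, bounded: bool) -> Coding:
    """Return the coding state a file should move to (hypothetical helper)."""
    # Recently created files stay replicated until they are write cold.
    if state is Coding.REPLICATED and not write_cold:
        return Coding.REPLICATED
    # COND  = read hot and bounded      -> fast code
    # COND' = read cold or not bounded  -> compact code
    cond = read_hot and bounded
    return Coding.FAST if cond else Coding.COMPACT


# A new file that is still being written stays replicated.
print(next_state(Coding.REPLICATED, write_cold=False, read_hot=True, bounded=True))
# A hot file is demoted once the storage bound would be exceeded.
print(next_state(Coding.FAST, write_cold=True, read_hot=True, bounded=False))
```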

17 Fast and Compact Product Codes (1)
h_a1 = RS(a_0, a_1, a_2, a_3, a_4); Pa_0 = XOR(a_0, a_5)
Fast Code (Product Code 2x5). Storage overhead: 1.8x. Recovery cost: 2.
a_0 a_1 a_2 a_3 a_4 | h_a1
a_5 a_6 a_7 a_8 a_9 | h_a2
Pa_0 Pa_1 Pa_2 Pa_3 Pa_4 | Ph_a
Compact Code (Product Code 6x5). Storage overhead: 1.4x.
a_0 a_1 a_2 a_3 a_4 | h_a1
a_5 a_6 a_7 a_8 a_9 | h_a2
b_0 b_1 b_2 b_3 b_4 | h_b1
b_5 b_6 b_7 b_8 b_9 | h_b2
c_0 c_1 c_2 c_3 c_4 | h_c1
c_5 c_6 c_7 c_8 c_9 | h_c2
P_0 P_1 P_2 P_3 P_4 | Ph

18 Fast and Compact Product Codes (2)
h_a1 = RS(a_0, a_1, a_2, a_3, a_4); Pa_0 = XOR(a_0, a_5); P_0 = XOR(a_0, a_5, b_0, b_5, c_0, c_5)
Fast Code (Product Code 2x5). Storage overhead: 1.8x. Recovery cost: 2.
a_0 a_1 a_2 a_3 a_4 | h_a1
a_5 a_6 a_7 a_8 a_9 | h_a2
Pa_0 Pa_1 Pa_2 Pa_3 Pa_4 | Ph_a
Compact Code (Product Code 6x5). Storage overhead: 1.4x. Recovery cost: 5.
a_0 a_1 a_2 a_3 a_4 | h_a1
a_5 a_6 a_7 a_8 a_9 | h_a2
b_0 b_1 b_2 b_3 b_4 | h_b1
b_5 b_6 b_7 b_8 b_9 | h_b2
c_0 c_1 c_2 c_3 c_4 | h_c1
c_5 c_6 c_7 c_8 c_9 | h_c2
P_0 P_1 P_2 P_3 P_4 | Ph
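
The overheads and the recovery cost of 2 on this slide can be checked with a toy example. The sketch below models only the vertical XOR parities (Pa_i = XOR(a_i, a_{i+5})); the horizontal RS parities h are counted but not computed. Block contents, the 4-byte block size, and the helper names `xor_blocks` and `pc_overhead` are illustrative assumptions.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def pc_overhead(rows: int, cols: int) -> float:
    """A rows x cols product code stores (rows + 1) * (cols + 1) blocks in total."""
    return (rows + 1) * (cols + 1) / (rows * cols)


# Toy fast-code stripe PC(2x5): a[0] holds a_0..a_4, a[1] holds a_5..a_9.
a = [[bytes([r * 5 + c] * 4) for c in range(5)] for r in range(2)]
Pa = [xor_blocks(a[0][c], a[1][c]) for c in range(5)]

# Lose a_0; rebuild it from the 2 surviving blocks in its column: a_5 and Pa_0.
recovered = xor_blocks(a[1][0], Pa[0])

print(recovered == a[0][0])
print(f"PC(2x5): {pc_overhead(2, 5):.2f}x, PC(6x5): {pc_overhead(6, 5):.2f}x")
```

In the compact code the column has 6 data blocks, so the cheaper repair path for a data block is its row: the 4 other data blocks plus the h parity, which matches the recovery cost of 5 on the slide.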

19 Fast and Compact LRC (1)
Fast Code (LRC(12,6,2)). Storage overhead: 20/12 = 1.67x. Recovery cost: 2.
{G_1, G_2} = RS(a_0, a_1, .., a_11); L_i = XOR(a_i, a_{i+6})
a_0 a_1 a_2 a_3 a_4 a_5 | G_1
a_6 a_7 a_8 a_9 a_10 a_11 | G_2
L_0 L_1 L_2 L_3 L_4 L_5
Compact Code (LRC(12,2,2)). Storage overhead: 16/12 = 1.33x. Recovery cost: 6.
{G_1, G_2} = RS(a_0, a_1, .., a_11); L_i = RS'(a_0, a_1, a_2, a_6, a_7, a_8)
a_0 a_1 a_2 a_3 a_4 a_5 | G_1
a_6 a_7 a_8 a_9 a_10 a_11 | G_2
L_0 L_1
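
The local-parity repair path of the fast LRC code can be illustrated the same way. This sketch covers only the XOR local parities L_i = XOR(a_i, a_{i+6}) from the slide; the global RS parities G_1 and G_2 are counted in the overhead but not computed. The block contents and the helper name `xor_blocks` are assumptions made for the example.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Twelve toy data blocks a_0..a_11, 4 bytes each.
data = [bytes([i] * 4) for i in range(12)]

# Fast code LRC(12,6,2): six local parities, L_i = XOR(a_i, a_{i+6}).
local = [xor_blocks(data[i], data[i + 6]) for i in range(6)]

# Lose a_3; repair it from 2 blocks only: a_9 and L_3.
lost = 3
recovered = xor_blocks(data[lost + 6], local[lost])

fast_overhead = (12 + 6 + 2) / 12      # 20/12
compact_overhead = (12 + 2 + 2) / 12   # 16/12

print(recovered == data[lost])
print(f"fast: {fast_overhead:.2f}x, compact: {compact_overhead:.2f}x")
```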

20 Upcoding for Product Codes. Parities h require no reconstruction. Parities P require no data block transfer. All parity updates can be done in parallel.
[Figure: three fast-code PC(2x5) stripes (a, b, c), each with its own vertical parities Pa_i, Pb_i, Pc_i and Ph_a, Ph_b, Ph_c, are merged into one compact-code PC(6x5) stripe that keeps the data rows and horizontal parities h and has a single vertical parity row P_0..P_4, Ph]
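
Why the P parities need no data block transfer: XOR is associative, so the compact-code parity P_i = XOR(a_i, a_{i+5}, b_i, b_{i+5}, c_i, c_{i+5}) equals XOR(Pa_i, Pb_i, Pc_i). A minimal sketch of that identity with random toy blocks; the helper names and the 8-byte block size are illustrative, and the horizontal parities h are left out because they carry over unchanged.

```python
import os
from functools import reduce

BLOCK_SIZE = 8  # toy block size in bytes


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def random_stripe():
    """A fast-code stripe: 2 rows x 5 columns of data blocks."""
    return [[os.urandom(BLOCK_SIZE) for _ in range(5)] for _ in range(2)]


def vertical_parities(stripe):
    """Fast-code vertical parities, e.g. Pa_i = XOR(a_i, a_{i+5})."""
    return [xor_blocks(stripe[0][i], stripe[1][i]) for i in range(5)]


a, b, c = random_stripe(), random_stripe(), random_stripe()
Pa, Pb, Pc = (vertical_parities(s) for s in (a, b, c))

# Upcode: build the compact-code parities from the three parity rows only.
P = [xor_blocks(Pa[i], Pb[i], Pc[i]) for i in range(5)]

# Reference: the same parities computed directly from all six data rows.
direct = [xor_blocks(*(s[r][i] for s in (a, b, c) for r in range(2))) for i in range(5)]

print(P == direct)
```

Each P_i depends only on column i, which is consistent with the slide's note that all parity updates can run in parallel.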

21 Downcoding for Product Codes. Pa_0 = XOR(a_0, a_5); Pc_0 = XOR(P_0, Pa_0, Pb_0)
[Figure: one compact-code PC(6x5) stripe with data rows a, b, c, horizontal parities h, and vertical parities P_0..P_4, Ph is split into three fast-code PC(2x5) stripes, each with its own vertical parities Pa_i, Pb_i, Pc_i and Ph_a, Ph_b, Ph_c]
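
The two formulas on this slide mean that only two of the three fast-code parity rows have to be computed from data; the third follows from the existing compact parity. A small sketch under the same toy assumptions as before (seeded random blocks, illustrative helper names, horizontal parities omitted):

```python
import random
from functools import reduce

rng = random.Random(0)  # seeded so the toy data is reproducible


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def toy_stripe():
    """2 rows x 5 columns of 8-byte data blocks."""
    return [[bytes(rng.randrange(256) for _ in range(8)) for _ in range(5)]
            for _ in range(2)]


a, b, c = toy_stripe(), toy_stripe(), toy_stripe()

# Existing compact-code parities: P_i = XOR(a_i, a_{i+5}, b_i, b_{i+5}, c_i, c_{i+5}).
P = [xor_blocks(a[0][i], a[1][i], b[0][i], b[1][i], c[0][i], c[1][i]) for i in range(5)]

# Downcode: compute Pa and Pb from their data rows ...
Pa = [xor_blocks(a[0][i], a[1][i]) for i in range(5)]
Pb = [xor_blocks(b[0][i], b[1][i]) for i in range(5)]
# ... and derive Pc without reading any c block: Pc_i = XOR(P_i, Pa_i, Pb_i).
Pc = [xor_blocks(P[i], Pa[i], Pb[i]) for i in range(5)]

expected_Pc = [xor_blocks(c[0][i], c[1][i]) for i in range(5)]
print(Pc == expected_Pc)
```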

22 Evaluation. Platform: CPU: Intel Xeon E cores, 2.4GHz; Disk: 7.2K RPM, 6*2TB; Memory: 96GB; Network: 1Gbps NIC; Cluster size: 11 nodes. Workloads: CC (Cloudera Customer), FB (Facebook).

23 Evaluation Metrics. Degraded read latency: foreground read request latency. Reconstruction time: background recovery for failures. Storage overhead.

24 Degraded Read Latency. Production systems: seconds. HACFS: seconds. The storage overhead of HACFS-LRC and HACFS-PC is bounded to 1.4 and

25 Reconstruction Time. Scenario: a disk holding 100GB of data fails. HACFS-PC takes about minutes less than the production systems. HACFS-LRC is worse than RS(6,3) in GFS v2: to reconstruct global parities, HACFS-LRC needs to read 12 blocks, but GFS v2 needs only 6 blocks.

26 System Comparison. Colossus FS: RS(6,3), 1.5x. HDFS-RAID: RS(10,4), 1.4x. Azure: LRC(12,2,2), 1.33x. HACFS-PC: PC(2x5), 1.8x and PC(6x5), 1.4x. HACFS-LRC: LRC(12,6,2), 1.67x and LRC(12,2,2), 1.33x.

27 System Comparison. Colossus FS: RS(6,3), 1.5x. HDFS-RAID: RS(10,4), 1.4x. Azure: LRC(12,2,2), 1.33x. HACFS-PC: PC(2x5), 1.8x and PC(6x5), 1.4x. HACFS-LRC: LRC(12,6,2), 1.67x and LRC(12,2,2), 1.33x.
lost block type | HACFS-PC | HACFS-LRC | Colossus FS | HDFS-RAID | Azure
data block | fast: 2, comp: 5 | fast: 2, comp: 6 | | |
global parity | fast: 5, comp: 6 | fast: 12, comp: | | |


29 Conclusions. Erasure coding saves a large amount of storage space. Production systems that use a single erasure code cannot balance the tradeoff between recovery cost and storage overhead well. HACFS uses dynamically adaptive coding to provide both low recovery cost and low storage overhead.

30 Related Work. f4 (OSDI '14): divides cold and hot data by data age. XOR-based erasure code (FAST '12): combines RS with XOR. Minimum-Storage-Regeneration (MSR): minimizes network transfers during reconstruction. Product-Matrix-Reconstruct-By-Transfer (PM-RBT) (FAST '15): optimal in terms of I/O, storage, and network bandwidth.

31 Thank You!

32 Acknowledgment: Prof. Xiong, Zigang Zhang, Biao Ma, CAS ICT Storage System Group


More information

Effective method for coding and decoding RS codes using SIMD instructions

Effective method for coding and decoding RS codes using SIMD instructions Effective method for coding and decoding RS codes using SIMD instructions Aleksei Marov, Researcher, R&D department Raidix LLC, and PhD Student, St.Petersburg State University Saint Petersburg, Russia

More information

Distributed Storage Systems with Secure and Exact Repair - New Results

Distributed Storage Systems with Secure and Exact Repair - New Results Distributed torage ystems with ecure and Exact Repair - New Results Ravi Tandon, aidhiraj Amuru, T Charles Clancy, and R Michael Buehrer Bradley Department of Electrical and Computer Engineering Hume Center

More information

Energy-efficient Mapping of Big Data Workflows under Deadline Constraints

Energy-efficient Mapping of Big Data Workflows under Deadline Constraints Energy-efficient Mapping of Big Data Workflows under Deadline Constraints Presenter: Tong Shu Authors: Tong Shu and Prof. Chase Q. Wu Big Data Center Department of Computer Science New Jersey Institute

More information

A Tight Rate Bound and Matching Construction for Locally Recoverable Codes with Sequential Recovery From Any Number of Multiple Erasures

A Tight Rate Bound and Matching Construction for Locally Recoverable Codes with Sequential Recovery From Any Number of Multiple Erasures 1 A Tight Rate Bound and Matching Construction for Locally Recoverable Codes with Sequential Recovery From Any Number of Multiple Erasures arxiv:181050v1 [csit] 6 Dec 018 S B Balaji, Ganesh R Kini and

More information

Tradeoff between Reliability and Power Management

Tradeoff between Reliability and Power Management Tradeoff between Reliability and Power Management 9/1/2005 FORGE Lee, Kyoungwoo Contents 1. Overview of relationship between reliability and power management 2. Dakai Zhu, Rami Melhem and Daniel Moss e,

More information

AstroPortal: A Science Gateway for Large-scale Astronomy Data Analysis

AstroPortal: A Science Gateway for Large-scale Astronomy Data Analysis AstroPortal: A Science Gateway for Large-scale Astronomy Data Analysis Ioan Raicu Distributed Systems Laboratory Computer Science Department University of Chicago Joint work with: Ian Foster: Univ. of

More information

Fault Tolerate Linear Algebra: Survive Fail-Stop Failures without Checkpointing

Fault Tolerate Linear Algebra: Survive Fail-Stop Failures without Checkpointing 20 Years of Innovative Computing Knoxville, Tennessee March 26 th, 200 Fault Tolerate Linear Algebra: Survive Fail-Stop Failures without Checkpointing Zizhong (Jeffrey) Chen zchen@mines.edu Colorado School

More information

All-in-one or BOX industrial PC for autonomous or distributed applications

All-in-one or BOX industrial PC for autonomous or distributed applications M a g e l i s i P C All-in-one or BOX industrial PC for autonomous or distributed applications Intel Core Duo TM Windows XP TM HDD / Flash disk M a g e l i s i P C You are looking for an open, powerful

More information

Unit 6: Branch Prediction

Unit 6: Branch Prediction CIS 501: Computer Architecture Unit 6: Branch Prediction Slides developed by Joe Devie/, Milo Mar4n & Amir Roth at Upenn with sources that included University of Wisconsin slides by Mark Hill, Guri Sohi,

More information

The conceptual view. by Gerrit Muller University of Southeast Norway-NISE

The conceptual view. by Gerrit Muller University of Southeast Norway-NISE by Gerrit Muller University of Southeast Norway-NISE e-mail: gaudisite@gmail.com www.gaudisite.nl Abstract The purpose of the conceptual view is described. A number of methods or models is given to use

More information

Experience in Factoring Large Integers Using Quadratic Sieve

Experience in Factoring Large Integers Using Quadratic Sieve Experience in Factoring Large Integers Using Quadratic Sieve D. J. Guan Department of Computer Science, National Sun Yat-Sen University, Kaohsiung, Taiwan 80424 guan@cse.nsysu.edu.tw April 19, 2005 Abstract

More information

Linear Exact Repair Rate Region of (k + 1, k, k) Distributed Storage Systems: A New Approach

Linear Exact Repair Rate Region of (k + 1, k, k) Distributed Storage Systems: A New Approach Linear Exact Repair Rate Region of (k + 1, k, k) Distributed Storage Systems: A New Approach Mehran Elyasi Department of ECE University of Minnesota melyasi@umn.edu Soheil Mohajer Department of ECE University

More information

A Highly-Available Scalable Distributed Data Structure

A Highly-Available Scalable Distributed Data Structure Online Appendix to: LH RS A Highly-Available Scalable Distributed Data Structure WITOLD LITWIN and RIM MOUSSA Université Paris Dauphine and THOMAS SCHWARZ, S. J. Santa Clara University APPENDIX C 8. LH

More information

Cauchy MDS Array Codes With Efficient Decoding Method

Cauchy MDS Array Codes With Efficient Decoding Method IEEE TRANSACTIONS ON COMMUNICATIONS Cauchy MDS Array Codes With Efficient Decoding Method Hanxu Hou and Yunghsiang S Han, Fellow, IEEE Abstract arxiv:609968v [csit] 30 Nov 206 Array codes have been widely

More information

Scalable Failure Recovery for Tree-based Overlay Networks

Scalable Failure Recovery for Tree-based Overlay Networks Scalable Failure Recovery for Tree-based Overlay Networks Dorian C. Arnold University of Wisconsin Paradyn/Condor Week April 30 May 3, 2007 Madison, WI Overview Motivation Address the likely frequent failures

More information

Fermilab Experiments. Daniel Wicke (Bergische Universität Wuppertal) Outline. (Accelerator, Experiments and Physics) Computing Concepts

Fermilab Experiments. Daniel Wicke (Bergische Universität Wuppertal) Outline. (Accelerator, Experiments and Physics) Computing Concepts Fermilab Experiments CDF Daniel Wicke (Bergische Universität Wuppertal) Outline Motivation (Accelerator, Experiments and Physics) Computing Concepts (SAM, RACs, Prototype and GRID) Summary 30. Oct. 2002

More information

How to deal with uncertainties and dynamicity?

How to deal with uncertainties and dynamicity? How to deal with uncertainties and dynamicity? http://graal.ens-lyon.fr/ lmarchal/scheduling/ 19 novembre 2012 1/ 37 Outline 1 Sensitivity and Robustness 2 Analyzing the sensitivity : the case of Backfilling

More information

Outline. EECS Components and Design Techniques for Digital Systems. Lec 18 Error Coding. In the real world. Our beautiful digital world.

Outline. EECS Components and Design Techniques for Digital Systems. Lec 18 Error Coding. In the real world. Our beautiful digital world. Outline EECS 150 - Components and esign Techniques for igital Systems Lec 18 Error Coding Errors and error models Parity and Hamming Codes (SECE) Errors in Communications LFSRs Cyclic Redundancy Check

More information

Optical Storage Technology. Error Correction

Optical Storage Technology. Error Correction Optical Storage Technology Error Correction Introduction With analog audio, there is no opportunity for error correction. With digital audio, the nature of binary data lends itself to recovery in the event

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Lecture #12: Frequent Itemsets Seoul National University 1 In This Lecture Motivation of association rule mining Important concepts of association rules Naïve approaches for

More information

LHC-CMS Tier2 facility at TIFR

LHC-CMS Tier2 facility at TIFR LHC-CMS Tier2 facility at TIFR http://indiacms.res.in T2-IN-TIFR Kajari Mazumdar Department of High Energy Physics Tata Institute of Fundamental Research Mumbai, India. NKN Workshop, IIT, Bombay November

More information

Sector-Disk Codes and Partial MDS Codes with up to Three Global Parities

Sector-Disk Codes and Partial MDS Codes with up to Three Global Parities Sector-Disk Codes and Partial MDS Codes with up to Three Global Parities Junyu Chen Department of Information Engineering The Chinese University of Hong Kong Email: cj0@alumniiecuhkeduhk Kenneth W Shum

More information

Lecture 2: Metrics to Evaluate Systems

Lecture 2: Metrics to Evaluate Systems Lecture 2: Metrics to Evaluate Systems Topics: Metrics: power, reliability, cost, benchmark suites, performance equation, summarizing performance with AM, GM, HM Sign up for the class mailing list! Video

More information

PI SERVER 2012 Do. More. Faster. Now! Copyr i g h t 2012 O S Is o f t, L L C. 1

PI SERVER 2012 Do. More. Faster. Now! Copyr i g h t 2012 O S Is o f t, L L C. 1 PI SERVER 2012 Do. More. Faster. Now! Copyr i g h t 2012 O S Is o f t, L L C. 1 AUGUST 7, 2007 APRIL 14, 2010 APRIL 24, 2012 Copyr i g h t 2012 O S Is o f t, L L C. 2 PI Data Archive Security PI Asset

More information

A Different Kind of Flow Analysis. David M Nicol University of Illinois at Urbana-Champaign

A Different Kind of Flow Analysis. David M Nicol University of Illinois at Urbana-Champaign A Different Kind of Flow Analysis David M Nicol University of Illinois at Urbana-Champaign 2 What Am I Doing Here??? Invite for ICASE Reunion Did research on Peformance Analysis Supporting Supercomputing

More information

Compressing Tabular Data via Pairwise Dependencies

Compressing Tabular Data via Pairwise Dependencies Compressing Tabular Data via Pairwise Dependencies Amir Ingber, Yahoo! Research TCE Conference, June 22, 2017 Joint work with Dmitri Pavlichin, Tsachy Weissman (Stanford) Huge datasets: everywhere - Internet

More information

I/O Devices. Device. Lecture Notes Week 8

I/O Devices. Device. Lecture Notes Week 8 I/O Devices CPU PC ALU System bus Memory bus Bus interface I/O bridge Main memory USB Graphics adapter I/O bus Disk other devices such as network adapters Mouse Keyboard Disk hello executable stored on

More information

CPU SCHEDULING RONG ZHENG

CPU SCHEDULING RONG ZHENG CPU SCHEDULING RONG ZHENG OVERVIEW Why scheduling? Non-preemptive vs Preemptive policies FCFS, SJF, Round robin, multilevel queues with feedback, guaranteed scheduling 2 SHORT-TERM, MID-TERM, LONG- TERM

More information

Generating Urban Mobility Data Sets Using Scalable GANs

Generating Urban Mobility Data Sets Using Scalable GANs Generating Urban Mobility Data Sets Using Scalable GANs Abhinav Jauhri & John Paul Shen ECE Department Carnegie Mellon University {ajauhri, jpshen}@cmu.edu Objective Generate city-scale human mobility

More information