Redundant Array of Independent Disks


Redundant Array of Independent Disks
Yashwant K. Malaiya

Redundant Array of Independent Disks (RAID)
Enables greater levels of performance and/or reliability.
How? By concurrent use of two or more hard disk drives.
How exactly?
Striping (of data): data divided and spread across drives.
Mirroring: identical contents on two disks.
Error correction techniques: parity.

Hard Disks
Rotate: one or more platters.
Efficient for blocks of data (sector = 512 bytes).
Inherently error prone: ECC is used to check for errors internally.
Need a controller.
Can fail completely.
Modern disks use efficient low-density parity-check (LDPC) codes. Some errors involving a few bits are corrected internally; they are not considered here.

Solid State Drives (SSD)
Also non-volatile; higher cost, higher reliability, lower power.
Still block oriented.
Controlled by a controller.
Wears out: 1000-6000 program/erase cycles.
Wear leveling by the controller.
Random writes can be slower; caching can help.
Wearout example: a 64 GB SSD that can sustain 3000 cycles can absorb 64 GB x 3000 = 192 TB of writes.
Writing 30 MB/hour comes to about 260 GB/year (8760 hours/year).
The drive will last for 192 TB / 260 GB = 738 years!
The high-level discussion applies to both HDD and SSD.
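The wear-out arithmetic above can be checked directly; a quick sketch using the slide's figures (64 GB capacity, a 3000-cycle rating, and roughly 260 GB of writes per year):

```python
# Back-of-envelope SSD wear-out estimate using the slide's example figures:
# a 64 GB drive rated for ~3000 program/erase cycles can absorb
# 64 GB x 3000 = 192 TB of writes before wearing out.
capacity_gb = 64
pe_cycles = 3000
total_writes_tb = capacity_gb * pe_cycles / 1000   # 192 TB

writes_gb_per_year = 260                            # from the example
lifetime_years = total_writes_tb * 1000 / writes_gb_per_year
print(round(lifetime_years))  # 738
```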

Standard RAID Levels
RAID 0: striping.
RAID 1: mirroring.
RAID 2: bit-level striping, Hamming code for error correction (not used anymore).
RAID 3: byte-level striping, parity (rare).
RAID 4: block-level striping, parity.
RAID 5: block-level striping, distributed parity.
RAID 6: block-level striping, distributed double parity.

RAID 0
Data striped across n disks (here n = 2).
Read/write in parallel.
No redundancy.
R_sys = \prod_{i=1}^{n} R_i
Ex: 3-year disk reliability = 0.9 at 100% duty cycle, n = 14:
R_sys = (0.9)^{14} = 0.23
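The product formula is a one-liner to check numerically; a minimal sketch with the example's numbers (R = 0.9, n = 14):

```python
# RAID 0 reliability: all n disks must survive, so R_sys is the product
# of the individual disk reliabilities (0.9 and n = 14 from the example).
from math import prod

def raid0_reliability(disk_reliabilities):
    return prod(disk_reliabilities)

print(round(raid0_reliability([0.9] * 14), 2))  # 0.23
```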

RAID 1
Disk 1 mirrors Disk 0.
Read/write in parallel.
One of them can be used as a backup.
R_sys = \prod_{i=1}^{n} [1 - (1 - R_i)^2]
Ex: 3-year disk reliability = 0.9 at 100% duty cycle, n = 7 pairs:
R_sys = (2 \times 0.9 - 0.9^2)^7 = 0.93
The failed disk is identified using internal ECC.
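The mirrored-pair formula can be checked the same way; a small sketch with the example's numbers (R = 0.9, 7 pairs):

```python
# RAID 1 reliability: a mirrored pair fails only if both disks fail,
# so a pair has reliability 1 - (1 - R)^2; with 7 independent pairs the
# system reliability is the product over pairs (values from the example).
def raid1_reliability(r, pairs):
    pair_rel = 1 - (1 - r) ** 2
    return pair_rel ** pairs

print(round(raid1_reliability(0.9, 7), 2))  # 0.93
```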

RAID 2
Used Hamming-code check bits as redundancy.
Bit-level operations needed. Obsolete.

RAID 3
Byte-level striping (not efficient).
Dedicated parity disk.
If one disk fails, its data can be reconstructed using a spare.
R_sys = \sum_{j=n-1}^{n} \binom{n}{j} R^j (1-R)^{n-j}
Ex: 3-year disk reliability = 0.9 at 100% duty cycle, n = 13, j = 12, 13:
R_sys = 0.62

RAID 4
Block-level striping.
Dedicated parity disk.
If one disk fails, its data can be reconstructed using a spare.
R_sys = \sum_{j=n-1}^{n} \binom{n}{j} R^j (1-R)^{n-j}
Ex: 3-year disk reliability = 0.9 at 100% duty cycle, n = 13, j = 12, 13:
R_sys = 0.62
The parity disk is a bottleneck.

RAID 5
Distributed parity.
If one disk fails, its data can be reconstructed using a spare.
R_sys = \sum_{j=n-1}^{n} \binom{n}{j} R^j (1-R)^{n-j}
Ex: 3-year disk reliability = 0.9 at 100% duty cycle, n = 13, j = 12, 13:
R_sys = 0.62
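RAID 3, 4, and 5 all tolerate a single failure, so they share the same binomial-sum reliability. A minimal sketch with the example's numbers (n = 13, R = 0.9):

```python
# RAID 3/4/5 reliability: the array survives if at least n-1 of the n
# disks survive, i.e. the binomial sum from j = n-1 to n.
from math import comb

def k_of_n_reliability(n, k, r):
    """Probability that at least k of n independent disks survive."""
    return sum(comb(n, j) * r**j * (1 - r)**(n - j) for j in range(k, n + 1))

print(round(k_of_n_reliability(13, 12, 0.9), 2))  # 0.62
```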

RAID 5: illustration
Distributed parity. If one disk fails, its data can be reconstructed using a spare.
Parity block = block1 XOR block2 XOR block3:
10001101 block1
01101100 block2
11000110 block3
--------------
00100111 parity block (ensures an even number of 1s in each bit position)
Can reconstruct any missing block from the others.

RAID 5: try it
Distributed parity. If one disk fails, its data can be reconstructed using a spare.
Parity block = block1 XOR block2 XOR block3:
10001101 block1
01101100 block2
         block3 (go ahead, reconstruct)
--------------
00100111 parity block (ensures an even number of 1s in each bit position)
Can reconstruct any missing block from the others.
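The recovery step itself is just XOR; a minimal sketch using the bytes from the exercise:

```python
# Recover a missing block by XORing the survivors with the parity block:
# since parity = b1 ^ b2 ^ b3, it follows that b3 = b1 ^ b2 ^ parity.
block1 = 0b10001101
block2 = 0b01101100
parity = 0b00100111

block3 = block1 ^ block2 ^ parity
print(format(block3, '08b'))  # 11000110
```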

RAID 6
Distributed double parity.
If one disk fails, its data can be reconstructed using a spare.
Handles data loss during a rebuild.
R_sys = \sum_{j=n-2}^{n} \binom{n}{j} R^j (1-R)^{n-j}
Ex: 3-year disk reliability = 0.9 at 100% duty cycle, n = 13, j = 11, 12, 13:
R_sys = 0.87
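Because RAID 6 tolerates two failures, the binomial sum simply starts one term earlier; a small check with the example's numbers:

```python
# RAID 6 reliability: the array survives if at least n-2 of the n disks
# survive, so sum the binomial terms from j = n-2 to n
# (n = 13, R = 0.9 from the example).
from math import comb

n, r = 13, 0.9
r_sys = sum(comb(n, j) * r**j * (1 - r)**(n - j) for j in range(n - 2, n + 1))
print(round(r_sys, 2))  # 0.87
```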

Nested RAID Levels
RAID 01: mirror (1) of stripes (0).
RAID 10: stripe (0) of mirrors (1).
RAID 50: block-level striping of RAID 0 with the distributed parity of RAID 5 for individual subsets.
RAID 51: RAID 5 duplicated.
RAID 60: block-level striping of RAID 0 with the distributed double parity of RAID 6 for individual subsets.

RAID 10
Stripe of mirrors: each disk in the RAID 0 stripe is duplicated.
R_sys = \prod_{i=1}^{n_s} [1 - (1 - R_i)^2]
Ex: 3-year disk reliability = 0.9 at 100% duty cycle, n_s = 6 pairs:
R_sys = 0.94
RAID 10: redundancy at the lower level.

RAID 01
Mirror of stripes: the complete RAID 0 stripe set is duplicated.
R_sys = 1 - [1 - \prod_{i=1}^{n_s} R_i]^2
Ex: 3-year disk reliability = 0.9 at 100% duty cycle, n_s = 6 for each of the two sets:
R_sys = 0.78
RAID 01: redundancy at the higher level.
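The contrast between the two nestings shows up immediately in the arithmetic; a sketch comparing them on the same 12 disks (R = 0.9, six pairs versus two six-disk sets, from the examples):

```python
# RAID 10 vs RAID 01 with the same 12 disks: mirroring at the lower
# level (per-disk) survives more failure patterns than mirroring the
# whole stripe set.
r, ns = 0.9, 6

raid10 = (1 - (1 - r)**2) ** ns   # product over 6 mirrored pairs
raid01 = 1 - (1 - r**ns) ** 2     # two mirrored 6-disk stripe sets

print(round(raid10, 2), round(raid01, 2))  # 0.94 0.78
```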

RAID 50
Multiple RAID 5 sets for higher capacity.

RAID 51
Multiple RAID 5 sets for higher reliability (not capacity).

RAID Levels Comparison

Level | Description | Space efficiency | Fault tolerance | Read performance | Write performance
0 | striping | 1 | none | nx | nx
1 | mirror | 1/2 | 1 drive | 2x | x
2 | | <1 | 1 | var | var
3 | | <1 | 1 | (n-1)x | (n-1)x
4 | parity | <1 | 1 | (n-1)x | (n-1)x
5 | dist. parity | <1 | 1 | (n-1)x | (n-1)x
6 | dist. double parity | <1 | 2 | (n-2)x | (n-2)x
10 | stripe of mirrors | 1/2 | 1/set | nx | (n/2)x

Markov Modeling
We have computed reliability using combinatorial modeling. Time-dependent modeling can be done using failure and repair rates. Repair can be done by rebuilding.
MTTDL: mean time to data loss.
RAID 1: data is lost if the second disk fails before the first could be rebuilt.
Reference: Koren and Krishna.

RAID 1 - Reliability Calculation
Assumptions: disks fail independently; the failure process is a Poisson process with rate \lambda; repair time is exponential with mean 1/\mu.
Markov chain: the state is the number of good disks.
dP_2(t)/dt = -2\lambda P_2(t) + \mu P_1(t)
dP_1(t)/dt = -(\lambda + \mu) P_1(t) + 2\lambda P_2(t)
P_0(t) = 1 - P_1(t) - P_2(t)
Initial conditions: P_2(0) = 1; P_1(0) = P_0(0) = 0.
Reliability at time t: R(t) = P_1(t) + P_2(t) = 1 - P_0(t).
Copyright 2007 Koren & Krishna, Morgan-Kaufman
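The state equations can be integrated numerically to see how R(t) behaves; a minimal sketch (the rates below are illustrative assumptions, not values from the slides) that compares the result against the exponential approximation exp(-t/MTTDL) with MTTDL = (3\lambda + \mu)/(2\lambda^2):

```python
# Numerical check of the RAID 1 Markov model: integrate
#   dP2/dt = -2*lam*P2 + mu*P1
#   dP1/dt = -(lam+mu)*P1 + 2*lam*P2
# with a simple Euler scheme, P2(0) = 1, and compare R(t) = P1 + P2
# with exp(-t/MTTDL). Rates lam (failures/hour) and mu (repairs/hour)
# are example assumptions.
import math

def raid1_markov_reliability(lam, mu, t, steps=200_000):
    p2, p1 = 1.0, 0.0          # start with both disks good
    dt = t / steps
    for _ in range(steps):
        dp2 = -2*lam*p2 + mu*p1
        dp1 = -(lam + mu)*p1 + 2*lam*p2
        p2 += dp2 * dt
        p1 += dp1 * dt
    return p1 + p2

lam, mu = 1e-4, 1e-1           # mu >> lam, so the approximation applies
mttdl = (3*lam + mu) / (2*lam**2)
t = 50_000.0                   # hours
print(raid1_markov_reliability(lam, mu, t), math.exp(-t / mttdl))
```

The two printed values agree closely when \mu >> \lambda, which is the regime the approximation slide assumes.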

RAID 1 - MTTDL Calculation
Starting in state 2 at t = 0, the mean time before entering state 1 is 1/(2\lambda).
The mean time spent in state 1 is 1/(\lambda + \mu).
From state 1 the chain goes back to state 2 with probability q = \mu/(\lambda + \mu), or to state 0 with probability p = \lambda/(\lambda + \mu).
The probability of n visits to state 1 before the transition to state 0 is q^{n-1} p.
Mean time to enter state 0 with n visits to state 1:
T_{2,0}(n) = n [1/(2\lambda) + 1/(\lambda + \mu)] = n (3\lambda + \mu) / (2\lambda(\lambda + \mu))
MTTDL = \sum_{n \ge 1} q^{n-1} p T_{2,0}(n) = T_{2,0}(1) \sum_{n \ge 1} n q^{n-1} p = T_{2,0}(1)/p = (3\lambda + \mu) / (2\lambda^2)
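The same result falls out of first-step analysis, which makes a convenient numerical cross-check; a sketch with illustrative example rates:

```python
# First-step analysis check of the MTTDL derivation: let T2 and T1 be
# the mean times to data loss starting from states 2 and 1. Then
#   T2 = 1/(2*lam) + T1
#   T1 = 1/(lam + mu) + q*T2,   with q = mu/(lam + mu).
# Substituting gives T2 = (1/(2*lam) + 1/(lam + mu)) / (1 - q), which
# should equal the closed form (3*lam + mu) / (2*lam**2).
lam, mu = 1e-3, 0.1            # example rates (assumptions)
q = mu / (lam + mu)
t2 = (1/(2*lam) + 1/(lam + mu)) / (1 - q)
closed_form = (3*lam + mu) / (2*lam**2)
print(t2, closed_form)
```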

Approximate Reliability of RAID 1
If \mu >> \lambda, the transition rate into state 0 from the aggregate of states 1 and 2 is 1/MTTDL.
Approximate reliability: R(t) \approx e^{-t/MTTDL}
(Plots in the original slides: impact of disk repair time; impact of disk lifetime.)

RAID 4 - MTTDL Calculation
RAID 4/5: data is lost if a second disk fails before the first failed disk (any one of n) could be rebuilt.
MTTDL = [(2n-1)\lambda + \mu] / [n(n-1)\lambda^2] \approx \mu / [n(n-1)\lambda^2]
Detailed MTTDL calculators are available on the web:
https://www.servethehome.com/raid-calculator/raid-reliability-calculator-simplemttdl-model/
https://wintelguy.com/raidmttdl.pl
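The formula is easy to evaluate, and the approximation error is visible directly; a sketch with illustrative assumptions (13 disks, a 100,000-hour disk MTTF, a 10-hour mean rebuild):

```python
# RAID 4/5 MTTDL from the formula above; n, lam, and mu below are
# example assumptions, not values from the slides.
def raid5_mttdl(n, lam, mu):
    return ((2*n - 1)*lam + mu) / (n*(n - 1)*lam**2)

n, lam, mu = 13, 1e-5, 1e-1
exact = raid5_mttdl(n, lam, mu)
approx = mu / (n*(n - 1)*lam**2)   # valid when mu >> n*lam
print(exact, approx)
```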

Additional Reading
Modeling the Reliability of RAID Sets, Dell
Triple-Parity RAID and Beyond, Adam Leventhal, Sun Microsystems
Estimation of RAID Reliability
Enhanced Reliability Modeling of RAID Storage Systems
