CSE 4201, Ch. 6 Storage Systems Hennessy and Patterson
Challenge to the Disk The graveyard is full of suitors Ever heard of Bubble Memory? There are some technologies that refuse to die (silicon, copper...). Disk is their queen.
FLASH Memory Only serious challenger Mostly niche Comparable bandwidth (and increasing) Much smaller latency (2-3 orders of magnitude) Very expensive (10-100 times) Wears down with use (on the order of a million writes) Smaller, hardier, less power hungry
How the disk survives Every time there is a challenge, disks become better Microprocessors have been built in (for 20 years now) High-level intelligent interfaces: ATA, SATA, SCSI Can schedule the arm, queue the requests Include caches Now include FLASH!
Power Consumption Disks consume 9-13 Watts Power increases fast with Diameter (4.6th power) RPM (2.8th power) Number of platters SATA disks spin at 7,200rpm Serial Attached SCSI at 15,000rpm
RAID Nothing to do with piracy! Redundant Array of Independent (Inexpensive) Disks Various kinds of RAID to: Increase size Increase reliability Increase bandwidth Decrease latency
Reliability Revisited With redundant disks we can tolerate 1 disk failure... Until the disk is repaired Easy to repair: Jolt the operator with 10,000 Volts He jumps and exchanges the disk with a new one The whole procedure takes 3 seconds So what is the problem?
Size Matters When we plug in the new disk we have to restore the data If a second disk fries in the meantime we have to skip town A 1000GB disk with 0.1GB/sec bandwidth needs 10,000 seconds if there is no traffic, and the s/w was written by angels On a bad hair day, takes half a day or more Many things can happen in half a day
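The rebuild arithmetic above is worth checking directly; the 1000 GB capacity and 0.1 GB/s bandwidth are the slide's own numbers:

```python
capacity_gb = 1000.0       # capacity of the failed disk (from the slide)
bandwidth_gb_per_s = 0.1   # rebuild bandwidth (from the slide)

best_case_s = capacity_gb / bandwidth_gb_per_s   # 10,000 seconds
best_case_h = best_case_s / 3600.0               # almost 3 hours

print(best_case_s, best_case_h)
```

And that is the no-traffic best case; with foreground traffic throttling the rebuild, "half a day or more" follows easily.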
RAID-0 JBOD (Just a Bunch Of Disks) Concatenated in a single logical volume Or Striped. Striping has better performance and uniform wear. Most home NAS have it.
RAID-1 Mirroring or Shadowing Works with two disks Oldest and simplest redundancy scheme Fast reads Slow(er) writes Most home NAS have it
RAID-2 Obsolete Multiple disks with ECC Disks do not need to report if they are alive
RAID-3 N disks plus parity Every page is striped over N disks The last disk contains the parities If a disk fails we can infer its contents from the good disks and the parity Bit-interleaved parity High bandwidth, easy to check Latency is the latency of the slowest
RAID-4 Optimized for small reads Block interleaved parity Small reads independent of each other Writes involve two reads and two writes: Read affected block and parity Compute the new parity Write the affected block and the parity Increased workload on parity disk
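The read-modify-write described above relies on XOR parity: the new parity is the old parity XOR the old block XOR the new block. A minimal sketch (the 4-byte blocks and their contents are made-up toy values):

```python
from functools import reduce

def xor_blocks(a, b):
    """Byte-wise XOR of two equal-sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

# Four data disks holding one toy block each
data = [bytes([i] * 4) for i in (1, 2, 3, 4)]
parity = reduce(xor_blocks, data)   # parity disk = XOR of all data blocks

# If a disk fails, its contents are the XOR of the survivors plus parity
lost = 2
survivors = [d for i, d in enumerate(data) if i != lost]
recovered = reduce(xor_blocks, survivors + [parity])
assert recovered == data[lost]

# Small write: read old block and old parity, compute and write new parity
new_block = bytes([9] * 4)
new_parity = xor_blocks(xor_blocks(parity, data[0]), new_block)
data[0] = new_block
assert new_parity == reduce(xor_blocks, data)
```

This is exactly why a small write costs two reads and two writes, and why every one of them touches the single parity disk.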
RAID-5 To reduce workload on parity disk, rotate: Stripe zero: parity is on disk-0 Stripe one: parity is on disk-1... Stripe nine: parity is on disk-0
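The rotation is just a modulus. The slide does not state the array size; `stripe % n` reproduces its "stripe nine back on disk-0" sequence with an assumed 3-disk example:

```python
N_DISKS = 3   # assumed array size; "stripe nine -> disk-0" fits N = 3

def parity_disk(stripe, n_disks=N_DISKS):
    """RAID-5 rotates parity: stripe i keeps its parity on disk (i mod n)."""
    return stripe % n_disks

layout = {s: parity_disk(s) for s in range(10)}
# stripe 0 -> disk 0, stripe 1 -> disk 1, stripe 2 -> disk 2, ..., stripe 9 -> disk 0
```

With the parity spread evenly, no single disk becomes the write bottleneck that plagues RAID-4.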
RAID-6 AKA RAID-DP, diagonal parity Can handle two failures Each block is labeled with the number of the diagonal it belongs to:

         Disk-0  Disk-1  Disk-2  Disk-3  Row parity  Diag. parity
  Row 0    0       1       2       3        4            0
  Row 1    1       2       3       4        0            1
  Row 2    2       3       4       0        1            2
  Row 3    3       4       0       1        2            3
RAID-10, RAID-01 We have eight disks, for example We can configure them as: Four mirrored pairs, striped together Called RAID-10 (a stripe of mirrors) Or two striped quartets, mirrored Called RAID-01 (a mirror of stripes)
Queuing Theory Used to study Telephony Traffic Interactive systems Disks Memory systems
Performance prediction Can be used to study/predict Throughput Response time Utilization Studies the complex interactions: When utilization increases Queue length increases Waiting time increases Response time increases Turnaround time increases
Analysis and Simulation Many simple problems can be studied analytically Markov chains Poisson Distribution (Not to be confused with the perfume by Christian Dior that is written with one s ) Most real problems can only be studied with simulations and empirical methods
Queuing Theory 101 Little's Law Tasks in system = arrival rate x time in system We assume steady state That is, the number of jobs arriving and the number of jobs leaving the system are equal on average So, on average, the job just about to leave the system has waited long enough for the queue behind it to fill up with the average number of jobs in the system
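Little's Law holds regardless of the arrival or service distributions. A minimal sketch, assuming Poisson arrivals at 10 jobs/s and a made-up uniform time-in-system averaging 0.5 s, so L should come out near 10 x 0.5 = 5:

```python
import random

random.seed(0)
lam = 10.0          # arrival rate, jobs per second (assumed)
horizon = 5000.0    # observation window in seconds

# Poisson arrivals over the window
arrivals, t = [], 0.0
while True:
    t += random.expovariate(lam)
    if t > horizon:
        break
    arrivals.append(t)

# Give every job an independent time-in-system, mean 0.5 s (assumed)
events, total_time_in_system = [], 0.0
for a in arrivals:
    w = random.uniform(0.2, 0.8)
    total_time_in_system += w
    events.append((a, +1))          # job enters
    events.append((a + w, -1))      # job leaves
events.sort()

# Time-average number of jobs in the system
area, last, n = 0.0, 0.0, 0
for time, delta in events:
    area += n * (time - last)
    last, n = time, n + delta

L = area / last                              # avg tasks in system
W = total_time_in_system / len(arrivals)     # avg time in system
rate = len(arrivals) / horizon               # measured arrival rate
# Little's Law: L == rate * W, about 5 here
```

Note the jobs never even form a queue here; the law is pure bookkeeping on arrivals and departures.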
Markov Chains Consider a server that when a job arrives while it is busy, puts the job into a queue At any point in time the system has 0, 1, 2, 3, etc elements (i.e. when it has 2 elements one is being served and one is in the queue) Under Poisson assumptions it does not matter how long the job has been served It does not matter if the previous event was an arrival or departure It matters only how many jobs are in the system
Markov Chains [Figure: state-transition diagram with states P0, P1, P2, P3]
Assumptions The probability of an event happening is independent of how long we have waited for it This is the basic assumption for Poisson processes You can think of it as follows: At every time interval dt we roll a die to decide if the event will occur The smaller the dt the better the approximation The probability of an event happening in an interval dt is λ dt
Interarrival Time To compute this beast we cut the time t into n chunks The probability of an event happening in a chunk is λt/n The probability of it not happening in any of the n chunks is (1 − λt/n)^n → e^(−λt) as n → ∞
Exponential Distribution The inter-arrival time follows the exponential distribution: P(T ≤ t) = 1 − e^(−λt)
Binomial Distribution What is the probability of having k events in time t? We cut the time t into n chunks The probability of having k events in the n chunks is C(n, k) (λt/n)^k (1 − λt/n)^(n−k)
Poisson Distribution If we let n go to infinity we get the Poisson distribution: P(k events in time t) = (λt)^k e^(−λt) / k!
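The limit can be checked numerically: with a large number of chunks the binomial probability is already indistinguishable from the Poisson formula (the rate, window, and event count below are made-up example values):

```python
import math

lam, t, k = 3.0, 2.0, 4    # rate, time window, event count (assumed)
n = 1_000_000              # number of tiny chunks
p = lam * t / n            # per-chunk event probability

# Binomial: C(n, k) p^k (1 - p)^(n - k)
binom = math.comb(n, k) * p**k * (1 - p)**(n - k)

# Poisson limit: (lam*t)^k e^(-lam*t) / k!
poisson = (lam * t)**k * math.exp(-lam * t) / math.factorial(k)

# The two agree to several decimal places for this n
print(binom, poisson)
```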
Mean, Variance Mean: E[T] Variance: σ² = E[T²] − E[T]² Coefficient of variance: C = σ / E[T] For the exponential distribution the mean is 1/λ and the variance is 1/λ², so C = 1
Average Residual Service time If we look at a process at a random instant, what is the time until it finishes? T_ARS = (1/2) T (1 + C²)
Memory-less Process The fundamental assumption for us (for now) For memory-less processes the coefficient of variance is 1 The average residual time is the inverse of the rate: T_ARS = T = 1/λ
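A Monte Carlo check of the residual-time formula: an observer landing at a random instant falls into an interval with probability proportional to its length, so the mean residual is E[T²] / (2 E[T]), which equals (T/2)(1 + C²). A minimal sketch with an assumed rate λ = 2 (so T = 0.5):

```python
import math
import random

random.seed(1)
lam = 2.0    # rate (assumed); mean inter-event time T = 1/lam = 0.5
samples = [random.expovariate(lam) for _ in range(200_000)]

mean = sum(samples) / len(samples)
mean_sq = sum(x * x for x in samples) / len(samples)

# Length-biased observation: mean residual = E[T^2] / (2 E[T])
residual = mean_sq / (2 * mean)

var = mean_sq - mean * mean
C = math.sqrt(var) / mean            # coefficient of variance, ~1 here
formula = 0.5 * mean * (1 + C * C)   # (T/2)(1 + C^2)

# residual == formula algebraically, and both ~= 1/lam for the exponential
print(residual, formula)
```

For the exponential C = 1, so the residual equals the full mean: however long you have already waited, the expected remaining wait is still 1/λ.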
Mean Inter-arrival Time and Arrival Rate These two are the inverse of one another If we have ten arrivals in a time unit then the interarrival time is on average 1/10 of the time unit If we have a thousand disk requests per second then the inter-arrival time is a millisecond. If at a random time we use a stopwatch to time the time until the next disk request it will be A millisecond (for memory-less process)
Service Time and Completion Rate Service time is the time it takes the server to service the task You knew that! If the server has enough tasks in the queue to keep busy, then the number of tasks it completes per unit time is The completion rate You could have guessed that!
Alternative Nomenclature Arrival Rate aka Birth Rate Completion Rate aka Death Rate Sometimes statisticians think they are gods.
M/M/1 systems The first M means the task arrival is memoryless (the M stands for Markov) The second M means the task completion is also memory-less (Markov) The 1 means we have one server The arrival rate is λ The completion rate is μ
Stable systems In stable M/M/1 systems the arrival rate is less than the completion rate This means that every now and then the queue will empty (statistically speaking) The length of the queue will hover around the average If the arrival rate is larger, the queue will keep increasing forever (the increase is linear) Sane people call this unstable
Server Utilization The average number of jobs arriving in a unit of time is λ The average time it takes to service a job is 1/μ The fraction of time the server is busy is ρ = λ/μ
Server Utilization Clearly ρ has to be less than one For stability For the definition to hold The server utilization is also the probability of a process finding the server busy
Analysis of M/M/1 [Figure: state-transition diagram P0 ↔ P1 ↔ P2 ↔ P3] Balance equations: λ P_0 Δt = μ P_1 Δt, λ P_1 Δt = μ P_2 Δt, ... Hence P_1 = ρ P_0, P_2 = ρ² P_0, and in general P_i = ρ^i P_0
Finding P0 We know that the probabilities sum up to 1: 1 = Σ_{i=0}^∞ P_i = Σ_{i=0}^∞ ρ^i P_0 = P_0 / (1 − ρ) Therefore P_0 = 1 − ρ and P_i = (1 − ρ) ρ^i
Number of Tasks in the System This is the expected value of the number of tasks in the system We apply the formula for the expected value: L_sys = Σ_{i=0}^∞ i P_i = Σ_{i=0}^∞ i (1 − ρ) ρ^i = ρ / (1 − ρ)
Length of the Queue State i has i − 1 tasks in the queue: L_q = Σ_{i=1}^∞ (i − 1) P_i = Σ_{i=1}^∞ (i − 1)(1 − ρ) ρ^i = ρ² / (1 − ρ)
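Both closed forms can be verified with a small event-driven M/M/1 simulation. A minimal sketch, assuming λ = 0.5 and μ = 1 (so ρ = 0.5, L_sys = 1, L_q = 0.5):

```python
import random

random.seed(42)
lam, mu = 0.5, 1.0            # arrival and completion rates (assumed)
rho = lam / mu
horizon = 200_000.0

t, n = 0.0, 0                 # current time, jobs in system
area_sys, area_q = 0.0, 0.0   # time-integrals of n and of the queue length
next_arrival = random.expovariate(lam)
next_departure = float("inf")

while t < horizon:
    nxt = min(next_arrival, next_departure)
    area_sys += n * (nxt - t)
    area_q += max(n - 1, 0) * (nxt - t)
    t = nxt
    if nxt == next_arrival:
        n += 1
        next_arrival = t + random.expovariate(lam)
        if n == 1:            # server was idle: start service
            next_departure = t + random.expovariate(mu)
    else:
        n -= 1
        next_departure = (t + random.expovariate(mu)) if n > 0 else float("inf")

L_sys = area_sys / t   # should approach rho / (1 - rho)   = 1.0
L_q = area_q / t       # should approach rho^2 / (1 - rho) = 0.5
print(L_sys, L_q)
```

Raising ρ toward 1 in this sketch makes both averages blow up, which is the "unstable" behavior the earlier slide warns about.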