A Tale of Two Erasure Codes in HDFS
1 A Tale of Two Erasure Codes in HDFS
  Dynamo
  Mingyuan Xia*, Mohit Saxena+, Mario Blaum+, and David A. Pease+
  *McGill University, +IBM Research Almaden
  FAST '15
  何军权
2 Outline
  Introduction & Motivation
  Design
  Evaluation
  Conclusions
  Related work
3 Introduction & Motivation
4 Big Data Storage
  Reliability and availability
    Replication: 3-way replication
    Erasure code: Reed-Solomon (RS), LRC
  GFS: 3-way replication, 3x, 2003
  GFS v2: RS, 1.5x, 2012
  FB HDFS: LRC, 1.66x, 2013
  FB HDFS: RS, 1.4x, 2011
  Azure: LRC, 1.33x
5 Popular Erasure Code Families
  Product Code (PC)
  Local Reconstruction Code (LRC)
  Other
  Reed-Solomon (RS)
  PC layout:
    a0 a1 a2 a3 a4 ha
    b0 b1 b2 b3 b4 hb
    P0 P1 P2 P3 P4 h
  LRC layout:
    a0 a1 a2 a3 a4  a5  G1
    a6 a7 a8 a9 a10 a11 G2
    L0 L1 L2 L3 L4  L5
6 Erasure Code: Facebook HDFS
  RS(10,4): compute 4 parities per 10 data blocks
  All blocks are stored on different storage nodes
  Storage overhead: 1.4x
  D1 D2 D3 D4 D5 D6 D7 D8 D9 D10 P1 P2 P3 P4
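The 1.4x figure follows directly from the block counts: 14 stored blocks for every 10 data blocks. A minimal sketch of that arithmetic, assuming a systematic code in which the data blocks are stored unmodified; the function name `storage_overhead` is illustrative and not part of HDFS:

```python
from fractions import Fraction


def storage_overhead(data_blocks: int, parity_blocks: int) -> Fraction:
    """Return stored blocks per data block for a systematic erasure code."""
    if data_blocks <= 0 or parity_blocks < 0:
        raise ValueError("need data_blocks > 0 and parity_blocks >= 0")
    return Fraction(data_blocks + parity_blocks, data_blocks)


# RS(10,4) as on the slide: 10 data blocks plus 4 parities.
rs_10_4 = storage_overhead(10, 4)
# 3-way replication viewed as 1 data block plus 2 extra copies.
replication_3 = storage_overhead(1, 2)

print(f"RS(10,4): {float(rs_10_4):.2f}x")
print(f"3-way replication: {float(replication_3):.2f}x")
```

Exact fractions are used so that overheads such as 20/12 can be compared without rounding.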
7 Erasure Code: High Degraded Read Latency
  A read to an unavailable block requires multiple disk reads, network transfers, and compute cycles to decode
  (Diagram: Client, read exception, HDFS)
8 Erasure Code: Long Reconstruction Time
  Facebook's cluster:
    100K blocks lost per day
    50 machine-unavailability events per day
    Reconstruction traffic: 180TB per day
  (Diagram: Reconstruction job, HDFS)
9 Erasure Code: Recovery Cost
  Degraded read latency
  Reconstruction time
  Recovery cost: the total number of blocks required to reconstruct a data block after a failure
10 Recovery Cost vs. Storage Overhead
  Conclusion: storage overhead and recovery cost trade off against each other within a single erasure code.
  (Plot points: FB HDFS RS, Azure LRC, GFS v2 RS, FB HDFS LRC, GFS 3-way Repl)
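The tradeoff can be made concrete for the schemes whose parameters appear in the slides. This is a rough sketch, not the authors' evaluation code: it assumes that an RS(k, m) code reads k surviving blocks to rebuild one lost block and that a replica is rebuilt from a single copy. The `Scheme` class and `rs` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    name: str
    data_blocks: int
    redundant_blocks: int
    recovery_cost: int  # blocks read to rebuild one lost data block

    @property
    def overhead(self) -> float:
        return (self.data_blocks + self.redundant_blocks) / self.data_blocks


def rs(name: str, k: int, m: int) -> Scheme:
    # Assumption: RS(k, m) rebuilds a lost block from any k surviving blocks.
    return Scheme(name, k, m, recovery_cost=k)


schemes = [
    Scheme("3-way replication", 1, 2, recovery_cost=1),
    rs("GFS v2 RS(6,3)", 6, 3),
    rs("FB HDFS RS(10,4)", 10, 4),
]

# Sorting by storage overhead shows recovery cost moving the other way.
by_overhead = sorted(schemes, key=lambda s: s.overhead)
for s in by_overhead:
    print(f"{s.name}: overhead {s.overhead:.2f}x, recovery cost {s.recovery_cost}")
```

Read top to bottom, the output goes from the most compact scheme to the most expensive one in storage, while the recovery cost shrinks.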
11 How to balance?
  Storage overhead
  Recovery cost
12 Data Access Skew
  Conclusions:
    Only a small fraction of the data is "hot": P(freq > 10) ~= 1%
    Most of the data is "cold": P(freq <= 10) ~= 99%
13 Data Access Skew
  Hot data
    High access frequency
    A small fraction of the data
    A small improvement in read cost yields a large gain in read performance
    Hot data: decrease the recovery cost
  Cold data
    Low access frequency
    The major fraction of the data
    Storing slightly less per block saves a large amount of storage space
    Cold data: high storage efficiency
14 HACFS
  System State
    Tracks file states
    File size, last mtime
    Read count and coding state
  Adaptive Coding
    Tracks system states
    Chooses the coding scheme based on read count and mtime
  Erasure Coding
    Provides four coding interfaces
    Encode/Decode
    Upcode/Downcode
15 Erasure Coding Algorithms
  Two different erasure codes
  Fast code:
    Encodes the frequently accessed blocks to reduce read latency and reconstruction time
    Provides an overall low recovery cost
  Compact code:
    Encodes the less frequently accessed blocks to get a low storage overhead
    Maintains a low and bounded storage overhead
16 State Transition
  (Diagram: HACFS states "Recently created / 3-way replication", "Fast Code", and "Compact Code"; edges labeled "Write cold", COND, and COND')
  COND: read hot and bounded
  COND': read cold or not bounded
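One way to read the diagram is as a small state machine. The sketch below is an interpretation of the flattened figure, not the HACFS implementation: it assumes that a recently created file stays 3-way replicated until it becomes write cold, and that an erasure-coded file is in the fast code exactly when COND holds. `CodingState` and `next_state` are hypothetical names, and how "hot" and "bounded" are measured is left to the caller.

```python
from enum import Enum


class CodingState(Enum):
    REPLICATED = "3-way replication"
    FAST = "fast code"
    COMPACT = "compact code"


def next_state(state: CodingState, write_cold: bool,
               read_hot: bool, bounded: bool) -> CodingState:
    """Pick the next coding state for a file.

    read_hot: the file is read frequently.
    bounded: the system-wide storage overhead is within its bound.
    """
    cond = read_hot and bounded  # COND on the slide; COND' is its negation
    if state is CodingState.REPLICATED and not write_cold:
        # Assumption: files that are still being written stay replicated.
        return CodingState.REPLICATED
    return CodingState.FAST if cond else CodingState.COMPACT


# A write-cold, read-hot file is fast-coded while the overhead bound holds,
# and is upcoded to the compact code once the bound is exceeded.
s = next_state(CodingState.REPLICATED, write_cold=True, read_hot=True, bounded=True)
print(s.value)
s = next_state(s, write_cold=True, read_hot=True, bounded=False)
print(s.value)
```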
17 Fast and Compact Product Codes (1)
  ha1 = RS(a0, a1, a2, a3, a4)
  Pa0 = XOR(a0, a5)
  Fast Code (Product Code 2x5)
    a0  a1  a2  a3  a4  ha1
    a5  a6  a7  a8  a9  ha2
    Pa0 Pa1 Pa2 Pa3 Pa4 Pha
    Storage overhead: 1.8x
    Recovery cost: 2
  Compact Code (Product Code 6x5)
    a0 a1 a2 a3 a4 ha1
    a5 a6 a7 a8 a9 ha2
    b0 b1 b2 b3 b4 hb1
    b5 b6 b7 b8 b9 hb2
    c0 c1 c2 c3 c4 hc1
    c5 c6 c7 c8 c9 hc2
    P0 P1 P2 P3 P4 Ph
    Storage overhead: 1.4x
18 Fast and Compact Product Codes (2)
  ha1 = RS(a0, a1, a2, a3, a4)
  Pa0 = XOR(a0, a5)
  P0 = XOR(a0, a5, b0, b5, c0, c5)
  Fast Code (Product Code 2x5)
    a0  a1  a2  a3  a4  ha1
    a5  a6  a7  a8  a9  ha2
    Pa0 Pa1 Pa2 Pa3 Pa4 Pha
    Storage overhead: 1.8x
    Recovery cost: 2
  Compact Code (Product Code 6x5)
    a0 a1 a2 a3 a4 ha1
    a5 a6 a7 a8 a9 ha2
    b0 b1 b2 b3 b4 hb1
    b5 b6 b7 b8 b9 hb2
    c0 c1 c2 c3 c4 hc1
    c5 c6 c7 c8 c9 hc2
    P0 P1 P2 P3 P4 Ph
    Storage overhead: 1.4x
    Recovery cost: 5
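The overhead and recovery-cost figures for both product codes can be derived from the grid dimensions. The following sketch assumes the layout shown on the slide: a grid of rows x cols data blocks, one RS parity per row, and one XOR parity row that also covers the row parities. A lost data block is then repaired along whichever axis reads fewer blocks. The function names are illustrative only.

```python
from fractions import Fraction


def pc_storage_overhead(rows: int, cols: int) -> Fraction:
    """Overhead of a product code with rows x cols data blocks."""
    total_blocks = (rows + 1) * (cols + 1)  # one extra parity row and column
    return Fraction(total_blocks, rows * cols)


def pc_data_recovery_cost(rows: int, cols: int) -> int:
    """Blocks read to rebuild one lost data block.

    Column repair XORs the other rows-1 data blocks and the column parity
    (rows reads). Row repair decodes from cols surviving blocks in the row.
    """
    return min(rows, cols)


for name, rows, cols in [("fast PC(2x5)", 2, 5), ("compact PC(6x5)", 6, 5)]:
    overhead = float(pc_storage_overhead(rows, cols))
    cost = pc_data_recovery_cost(rows, cols)
    print(f"{name}: overhead {overhead:.2f}x, recovery cost {cost}")
```

Under these assumptions the fast code repairs along its short column and the compact code along its row, which reproduces the costs of 2 and 5 given on the slide.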
19 Fast and Compact LRC (1)
  Fast Code (LRC(12,6,2))
    {G1, G2} = RS(a0, a1, ..., a11)
    Li = XOR(ai, ai+6)
    a0 a1 a2 a3 a4  a5  G1
    a6 a7 a8 a9 a10 a11 G2
    L0 L1 L2 L3 L4  L5
    Storage overhead: 20/12 = 1.67x
    Recovery cost: 2
  Compact Code (LRC(12,2,2))
    {G1, G2} = RS(a0, a1, ..., a11)
    Li = RS'(a0, a1, a2, a6, a7, a8)
    a0 a1 a2 a3 a4  a5  G1
    a6 a7 a8 a9 a10 a11 G2
    L0 L1
    Storage overhead: 16/12 = 1.33x
    Recovery cost: 6
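The same numbers fall out of the LRC parameters. This sketch assumes LRC(k, l, g) means k data blocks, l local parities over equal-sized disjoint groups, and g global parities, and that a lost data block is rebuilt from the rest of its local group. The function names are hypothetical.

```python
from fractions import Fraction


def lrc_storage_overhead(k: int, local: int, global_: int) -> Fraction:
    """Stored blocks per data block for LRC(k, local, global_)."""
    return Fraction(k + local + global_, k)


def lrc_data_recovery_cost(k: int, local: int) -> int:
    """Blocks read to rebuild one data block from its local group."""
    if k % local != 0:
        raise ValueError("local groups must divide the data blocks evenly")
    group_size = k // local
    # Read the other group_size - 1 data blocks plus the local parity.
    return group_size


for name, params in [("fast LRC(12,6,2)", (12, 6, 2)),
                     ("compact LRC(12,2,2)", (12, 2, 2))]:
    k, local, global_ = params
    overhead = float(lrc_storage_overhead(k, local, global_))
    cost = lrc_data_recovery_cost(k, local)
    print(f"{name}: overhead {overhead:.2f}x, recovery cost {cost}")
```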
20 Upcoding for Product Codes
  Fast Code PC(2x5), three stripes
    a0  a1  a2  a3  a4  ha1
    a5  a6  a7  a8  a9  ha2
    Pa0 Pa1 Pa2 Pa3 Pa4 Pha

    b0  b1  b2  b3  b4  hb1
    b5  b6  b7  b8  b9  hb2
    Pb0 Pb1 Pb2 Pb3 Pb4 Phb

    c0  c1  c2  c3  c4  hc1
    c5  c6  c7  c8  c9  hc2
    Pc0 Pc1 Pc2 Pc3 Pc4 Phc
  Compact Code PC(6x5)
    a0 a1 a2 a3 a4 ha1
    a5 a6 a7 a8 a9 ha2
    b0 b1 b2 b3 b4 hb1
    b5 b6 b7 b8 b9 hb2
    c0 c1 c2 c3 c4 hc1
    c5 c6 c7 c8 c9 hc2
    P0 P1 P2 P3 P4 Ph
  Parities h require no reconstruction
  Parities P require no data block transfer
  All parity updates can be done in parallel
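Upcoding needs no data block transfer because XOR is associative: the compact parity P0 = XOR(a0, a5, b0, b5, c0, c5) equals the XOR of the three fast-code parities Pa0, Pb0, and Pc0. A minimal sketch with random blocks; the `xor_blocks` helper and the 16-byte block size are illustrative assumptions:

```python
import os
from functools import reduce
from operator import xor


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of one or more equally sized blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("blocks must be non-empty and equally sized")
    return bytes(reduce(xor, column) for column in zip(*blocks))


BLOCK_SIZE = 16  # tiny blocks, for illustration only
a0, a5, b0, b5, c0, c5 = (os.urandom(BLOCK_SIZE) for _ in range(6))

# Column parities of the three fast-code stripes.
pa0 = xor_blocks(a0, a5)
pb0 = xor_blocks(b0, b5)
pc0 = xor_blocks(c0, c5)

# Upcode: combine only the existing parities, without touching data blocks.
p0_upcoded = xor_blocks(pa0, pb0, pc0)

# Reference: the compact-code parity computed directly from the data.
p0_from_data = xor_blocks(a0, a5, b0, b5, c0, c5)
print(p0_upcoded == p0_from_data)
```

The row parities h carry over unchanged because each data row keeps the same five blocks in both layouts.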
21 Downcoding for Product Codes
  Compact Code PC(6x5)
    a0 a1 a2 a3 a4 ha1
    a5 a6 a7 a8 a9 ha2
    b0 b1 b2 b3 b4 hb1
    b5 b6 b7 b8 b9 hb2
    c0 c1 c2 c3 c4 hc1
    c5 c6 c7 c8 c9 hc2
    P0 P1 P2 P3 P4 Ph
  Fast Code PC(2x5), three stripes
    a0  a1  a2  a3  a4  ha1
    a5  a6  a7  a8  a9  ha2
    Pa0 Pa1 Pa2 Pa3 Pa4 Pha

    b0  b1  b2  b3  b4  hb1
    b5  b6  b7  b8  b9  hb2
    Pb0 Pb1 Pb2 Pb3 Pb4 Phb

    c0  c1  c2  c3  c4  hc1
    c5  c6  c7  c8  c9  hc2
    Pc0 Pc1 Pc2 Pc3 Pc4 Phc
  Pa0 = XOR(a0, a5)
  Pc0 = XOR(P0, Pa0, Pb0)
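Downcoding runs the same identity in reverse: two of the three fast-code parities are computed from data, and the third is obtained by cancelling them out of the stored compact parity. The sketch below uses fixed toy blocks so the result is reproducible; `xor_blocks` is an illustrative helper, not an HDFS API.

```python
from functools import reduce
from operator import xor


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of one or more equally sized blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("blocks must be non-empty and equally sized")
    return bytes(reduce(xor, column) for column in zip(*blocks))


# Toy 4-byte blocks for one column of the compact PC(6x5) stripe.
a0, a5 = bytes([1, 2, 3, 4]), bytes([5, 6, 7, 8])
b0, b5 = bytes([9, 10, 11, 12]), bytes([13, 14, 15, 16])
c0, c5 = bytes([17, 18, 19, 20]), bytes([21, 22, 23, 24])

# Parity already stored by the compact code.
p0 = xor_blocks(a0, a5, b0, b5, c0, c5)

# Downcode: compute two fast-code parities from their data blocks...
pa0 = xor_blocks(a0, a5)
pb0 = xor_blocks(b0, b5)
# ...and derive the third from parities alone, without reading c0 or c5.
pc0 = xor_blocks(p0, pa0, pb0)

print(pc0 == xor_blocks(c0, c5))
```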
22 Evaluation
  Platform
    CPU: Intel Xeon E cores, 2.4GHz
    Disk: 7.2K RPM, 6*2TB
    Memory: 96GB
    Network: 1Gbps NIC
    Cluster size: 11 nodes
  Workload
    CC: Cloudera Customer
    FB: Facebook
23 Evaluation Metrics
  Degraded read latency: foreground read request latency
  Reconstruction time: background recovery for failures
  Storage overhead
24 Degraded Read Latency The Production systems: seconds HACFS: seconds Bounded the storage overhead of HACFS LRC and PC to 1.4 and
25 Reconstruction Time
  A disk with 100GB of data failed
  HACFS-PC takes about minutes less than production systems
  HACFS-LRC is worse than RS(6,3) in GFS v2
  To reconstruct global parities, HACFS-LRC needs to read 12 blocks, whereas GFS v2 reads only 6 blocks
26 System Comparison
  Colossus FS: RS(6,3), 1.5x
  HDFS-RAID: RS(10,4), 1.4x
  Azure: LRC(12,2,2), 1.33x
  HACFS-PC: PC(2x5), 1.8x; PC(6x5), 1.4x
  HACFS-LRC: LRC(12,6,2), 1.67x; LRC(12,2,2), 1.33x
27 System Comparison
  Colossus FS: RS(6,3), 1.5x
  HDFS-RAID: RS(10,4), 1.4x
  Azure: LRC(12,2,2), 1.33x
  HACFS-PC: PC(2x5), 1.8x; PC(6x5), 1.4x
  HACFS-LRC: LRC(12,6,2), 1.67x; LRC(12,2,2), 1.33x
  Recovery cost by lost block type:
    lost block type   HACFS-PC           HACFS-LRC          Colossus FS   HDFS-RAID   Azure
    data block        fast: 2, comp: 5   fast: 2, comp: 6
    global parity     fast: 5, comp: 6   fast: 12, comp:
29 Conclusions
  Using erasure codes saves a large amount of storage space.
  Production systems that use a single erasure code cannot balance the tradeoff between recovery cost and storage overhead well.
  HACFS uses dynamically adaptive coding to provide both a low recovery cost and a low storage overhead.
30 Related Work
  f4 (OSDI '14)
    Divides cold and hot data by data age
  XOR-based erasure code (FAST '12)
    Combines RS with XOR
  Minimum-Storage-Regeneration (MSR)
    Minimizes network transfers during reconstruction
  Product-Matrix-Reconstruct-By-Transfer (PM-RBT) (FAST '15)
    Optimal in terms of I/O, storage, and network bandwidth
31 Thank You!
32 Acknowledgment
  Prof. Xiong Zigang Zhang Biao Ma
  CAS ICT Storage System Group
More informationIntroduction to Data Mining
Introduction to Data Mining Lecture #12: Frequent Itemsets Seoul National University 1 In This Lecture Motivation of association rule mining Important concepts of association rules Naïve approaches for
More informationLHC-CMS Tier2 facility at TIFR
LHC-CMS Tier2 facility at TIFR http://indiacms.res.in T2-IN-TIFR Kajari Mazumdar Department of High Energy Physics Tata Institute of Fundamental Research Mumbai, India. NKN Workshop, IIT, Bombay November
More informationSector-Disk Codes and Partial MDS Codes with up to Three Global Parities
Sector-Disk Codes and Partial MDS Codes with up to Three Global Parities Junyu Chen Department of Information Engineering The Chinese University of Hong Kong Email: cj0@alumniiecuhkeduhk Kenneth W Shum
More informationLecture 2: Metrics to Evaluate Systems
Lecture 2: Metrics to Evaluate Systems Topics: Metrics: power, reliability, cost, benchmark suites, performance equation, summarizing performance with AM, GM, HM Sign up for the class mailing list! Video
More informationPI SERVER 2012 Do. More. Faster. Now! Copyr i g h t 2012 O S Is o f t, L L C. 1
PI SERVER 2012 Do. More. Faster. Now! Copyr i g h t 2012 O S Is o f t, L L C. 1 AUGUST 7, 2007 APRIL 14, 2010 APRIL 24, 2012 Copyr i g h t 2012 O S Is o f t, L L C. 2 PI Data Archive Security PI Asset
More informationA Different Kind of Flow Analysis. David M Nicol University of Illinois at Urbana-Champaign
A Different Kind of Flow Analysis David M Nicol University of Illinois at Urbana-Champaign 2 What Am I Doing Here??? Invite for ICASE Reunion Did research on Peformance Analysis Supporting Supercomputing
More informationCompressing Tabular Data via Pairwise Dependencies
Compressing Tabular Data via Pairwise Dependencies Amir Ingber, Yahoo! Research TCE Conference, June 22, 2017 Joint work with Dmitri Pavlichin, Tsachy Weissman (Stanford) Huge datasets: everywhere - Internet
More informationI/O Devices. Device. Lecture Notes Week 8
I/O Devices CPU PC ALU System bus Memory bus Bus interface I/O bridge Main memory USB Graphics adapter I/O bus Disk other devices such as network adapters Mouse Keyboard Disk hello executable stored on
More informationCPU SCHEDULING RONG ZHENG
CPU SCHEDULING RONG ZHENG OVERVIEW Why scheduling? Non-preemptive vs Preemptive policies FCFS, SJF, Round robin, multilevel queues with feedback, guaranteed scheduling 2 SHORT-TERM, MID-TERM, LONG- TERM
More informationGenerating Urban Mobility Data Sets Using Scalable GANs
Generating Urban Mobility Data Sets Using Scalable GANs Abhinav Jauhri & John Paul Shen ECE Department Carnegie Mellon University {ajauhri, jpshen}@cmu.edu Objective Generate city-scale human mobility
More information