Coding problems for memory and storage applications

Similar documents
Constructions of Optimal Cyclic (r, δ) Locally Repairable Codes

Regenerating Codes and Locally Recoverable. Codes for Distributed Storage Systems

Security in Locally Repairable Storage

Cyclic Linear Binary Locally Repairable Codes

Progress on High-rate MSR Codes: Enabling Arbitrary Number of Helper Nodes

Explicit MBR All-Symbol Locality Codes

Linear Programming Bounds for Robust Locally Repairable Storage Codes

On Locally Recoverable (LRC) Codes

An Upper Bound On the Size of Locally. Recoverable Codes

Coding with Constraints: Minimum Distance. Bounds and Systematic Constructions

Secure RAID Schemes from EVENODD and STAR Codes

Optimal Exact-Regenerating Codes for Distributed Storage at the MSR and MBR Points via a Product-Matrix Construction

Asymptotically good sequences of codes and curves

Balanced Locally Repairable Codes

Balanced Locally Repairable Codes

arxiv: v1 [cs.it] 5 Aug 2016

IBM Research Report. Construction of PMDS and SD Codes Extending RAID 5

Codes With Hierarchical Locality

Optimal binary linear locally repairable codes with disjoint repair groups

Algebraic Geometry Codes. Shelly Manber. Linear Codes. Algebraic Geometry Codes. Example: Hermitian. Shelly Manber. Codes. Decoding.

Distributed storage systems from combinatorial designs

Linear Exact Repair Rate Region of (k + 1, k, k) Distributed Storage Systems: A New Approach

A Tight Rate Bound and Matching Construction for Locally Recoverable Codes with Sequential Recovery From Any Number of Multiple Erasures

Locally Encodable and Decodable Codes for Distributed Storage Systems

Sector-Disk Codes and Partial MDS Codes with up to Three Global Parities

arxiv: v1 [cs.it] 18 Jun 2011

On MBR codes with replication

Explicit Maximally Recoverable Codes with Locality

Lecture 12: November 6, 2017

Minimum Repair Bandwidth for Exact Regeneration in Distributed Storage

MATH 291T CODING THEORY

Error-correcting Pairs for a Public-key Cryptosystem

Product-matrix Construction

Solutions of Exam Coding Theory (2MMC30), 23 June (1.a) Consider the 4 4 matrices as words in F 16

Rack-Aware Regenerating Codes for Data Centers

: Error Correcting Codes. October 2017 Lecture 1

Optimal Locally Repairable and Secure Codes for Distributed Storage Systems

-ИИИИИ"ИИИИИИИИИ 2017.

MATH 291T CODING THEORY

Coding Theory and Applications. Solved Exercises and Problems of Cyclic Codes. Enes Pasalic University of Primorska Koper, 2013

Distributed Data Storage with Minimum Storage Regenerating Codes - Exact and Functional Repair are Asymptotically Equally Efficient

The BCH Bound. Background. Parity Check Matrix for BCH Code. Minimum Distance of Cyclic Codes

A Piggybacking Design Framework for Read-and Download-efficient Distributed Storage Codes

Linear Programming Bounds for Distributed Storage Codes

Maximally Recoverable Codes with Hierarchical Locality

Distributed Storage Systems with Secure and Exact Repair - New Results

A Piggybacking Design Framework for Read- and- Download- efficient Distributed Storage Codes. K. V. Rashmi, Nihar B. Shah, Kannan Ramchandran

Lecture 03: Polynomial Based Codes

Binary MDS Array Codes with Optimal Repair

Coding for loss tolerant systems

Can Linear Minimum Storage Regenerating Codes be Universally Secure?

Update-Efficient Error-Correcting. Product-Matrix Codes

Explicit Code Constructions for Distributed Storage Minimizing Repair Bandwidth

Zigzag Codes: MDS Array Codes with Optimal Rebuilding

Error Correcting Codes: Combinatorics, Algorithms and Applications Spring Homework Due Monday March 23, 2009 in class

Fractional Repetition Codes For Repair In Distributed Storage Systems

Communication Efficient Secret Sharing

LDPC Code Design for Distributed Storage: Balancing Repair Bandwidth, Reliability and Storage Overhead

Communication Efficient Secret Sharing

An Introduction to (Network) Coding Theory

Cooperative Repair of Multiple Node Failures in Distributed Storage Systems

Section 3 Error Correcting Codes (ECC): Fundamentals

LET be the finite field of cardinality and let

Interference Alignment in Regenerating Codes for Distributed Storage: Necessity and Code Constructions

Batch and PIR Codes and Their Connections to Locally Repairable Codes

Cauchy MDS Array Codes With Efficient Decoding Method

Arrangements, matroids and codes

On Linear Subspace Codes Closed under Intersection

Code-Based Cryptography Error-Correcting Codes and Cryptography

Weakly Secure Regenerating Codes for Distributed Storage

Lecture 6: Expander Codes

MATH 433 Applied Algebra Lecture 21: Linear codes (continued). Classification of groups.

Berlekamp-Massey decoding of RS code

Asymptotically good sequences of curves and codes

Algebraic geometry codes

MATH32031: Coding Theory Part 15: Summary

Lecture 12. Block Diagram

Skew Cyclic Codes Of Arbitrary Length

A Characterization Of Quantum Codes And Constructions

An Introduction to (Network) Coding Theory

1 Vandermonde matrices

Coding for Memory with Stuck-at Defects

Outline. MSRI-UP 2009 Coding Theory Seminar, Week 2. The definition. Link to polynomials

A Tale of Two Erasure Codes in HDFS

Chapter 6 Reed-Solomon Codes. 6.1 Finite Field Algebra 6.2 Reed-Solomon Codes 6.3 Syndrome Based Decoding 6.4 Curve-Fitting Based Decoding

Error Correcting Codes Questions Pool

The Complete Hierarchical Locality of the Punctured Simplex Code

Error Detection, Correction and Erasure Codes for Implementation in a Cluster File-system

Lecture 3: Error Correcting Codes

Synchronizing Edits in Distributed Storage Networks

6.895 PCP and Hardness of Approximation MIT, Fall Lecture 3: Coding Theory

Math 512 Syllabus Spring 2017, LIU Post

Erasure Codes for Distributed Storage: Tight Bounds and Matching Constructions

Fountain Uncorrectable Sets and Finite-Length Analysis

Refined analysis of RGHWs of code pairs coming from Garcia-Stichtenoth s second tower

Permutation Codes: Optimal Exact-Repair of a Single Failed Node in MDS Code Based Distributed Storage Systems

IEEE TRANSACTIONS ON INFORMATION THEORY 1

MULTI-SYMBOL LOCALLY REPAIRABLE CODES

Report on PIR with Low Storage Overhead

Construction of a Class of Algebraic-Geometric Codes via Gröbner Bases

Transcription:

.. Coding problems for memory and storage applications Alexander Barg University of Maryland January 27, 2015 A. Barg (UMD) Coding for memory and storage January 27, 2015 1 / 73

Codes with locality Introduction: Big Data Big Data players: Facebook, Instagram, Google, MSFT, etc.; Dropbox, Box, etc. Companies marketing coding solutions: CleverSafe (RS codes) and others. A. Barg (UMD) Coding for memory and storage January 27, 2015 2 / 73

Codes with locality Introduction: Big Data Big Data players: Facebook, Instagram, Google, MSFT, etc.; Dropbox, Box, etc. Companies marketing coding solutions: CleverSafe (RS codes) and others. Cluster of machines running Hadoop at Yahoo! Node failures are the norm A. Barg (UMD) Coding for memory and storage January 27, 2015 2 / 73

Is repair cost a real issue? Codes with locality (Average number of failed nodes =20) 15Tb = 300Tb A. Barg (UMD) Coding for memory and storage January 27, 2015 3 / 73

Codes with locality Two approaches to data coding in distributed storage Codes with locality Regenerating codes A. Barg (UMD) Coding for memory and storage January 27, 2015 4 / 73

Codes with locality Regenerating codes 8 x 104 1 2 k k+1 α α α Data Collector 1 2 3 d+1 β β β 1 Repair bandwidth, dβ 7 6 5 4 3 2 n n 1 α capacity nodes α capacity nodes 0 6000 7000 8000 9000 10000 11000 Storage per node, α Data collection Node repair Trade-off B symbols are encoded into nα symbols stored in n nodes A. Barg (UMD) Coding for memory and storage January 27, 2015 5 / 73

Codes with locality Regenerating codes 8 x 104 1 2 k k+1 α α α Data Collector 1 2 3 d+1 β β β 1 Repair bandwidth, dβ 7 6 5 4 3 2 n n 1 α capacity nodes α capacity nodes 0 6000 7000 8000 9000 10000 11000 Storage per node, α Data collection Node repair Trade-off B symbols are encoded into nα symbols stored in n nodes Downloading the data is possible by accessing any k nodes A. Barg (UMD) Coding for memory and storage January 27, 2015 5 / 73

Codes with locality Regenerating codes 8 x 104 1 2 k k+1 α α α Data Collector 1 2 3 d+1 β β β 1 Repair bandwidth, dβ 7 6 5 4 3 2 n n 1 α capacity nodes α capacity nodes 0 6000 7000 8000 9000 10000 11000 Storage per node, α Data collection Node repair Trade-off B symbols are encoded into nα symbols stored in n nodes Downloading the data is possible by accessing any k nodes Node repair (exact or functional) can be performed by downloading β < α symbols from any subset of d nodes. A. Barg (UMD) Coding for memory and storage January 27, 2015 5 / 73

Codes with locality Regenerating codes 8 x 104 1 2 k k+1 α α α Data Collector 1 2 3 d+1 β β β 1 Repair bandwidth, dβ 7 6 5 4 3 2 n n 1 α capacity nodes α capacity nodes 0 6000 7000 8000 9000 10000 11000 Storage per node, α Data collection Node repair Trade-off B symbols are encoded into nα symbols stored in n nodes Downloading the data is possible by accessing any k nodes Node repair (exact or functional) can be performed by downloading β < α symbols from any subset of d nodes. Repair bandwidth dβ (n, k, d, {α, β}) regenerating codes A. Barg (UMD) Coding for memory and storage January 27, 2015 5 / 73

Codes with locality Regenerating codes 8 x 104 1 2 k k+1 α α α Data Collector 1 2 3 d+1 β β β 1 Repair bandwidth, dβ 7 6 5 4 3 2 n n 1 α capacity nodes α capacity nodes 0 6000 7000 8000 9000 10000 11000 Storage per node, α Data collection Node repair Trade-off B symbols are encoded into nα symbols stored in n nodes Downloading the data is possible by accessing any k nodes Node repair (exact or functional) can be performed by downloading β < α symbols from any subset of d nodes. Repair bandwidth dβ (n, k, d, {α, β}) regenerating codes A. Dimakis, P. Godfrey, Y. Wu, M. Wainwright, and K. Ramchandran, Network coding for distributed storage systems, 2010 A. Barg (UMD) Coding for memory and storage January 27, 2015 5 / 73

Codes with locality Locally recoverable codes: Plan In this part we focus on locally recoverable codes A. Barg (UMD) Coding for memory and storage January 27, 2015 6 / 73

Codes with locality Locally recoverable codes: Plan In this part we focus on locally recoverable codes LRC code: To recover one lost symbol of the encoding it suffices to access a small number r of other symbols. A. Barg (UMD) Coding for memory and storage January 27, 2015 6 / 73

Codes with locality Locally recoverable codes: Plan In this part we focus on locally recoverable codes LRC code: To recover one lost symbol of the encoding it suffices to access a small number r of other symbols. 1. Current solutions 2. Parameters of LRC codes 3. MDS-like codes with the locality property 4. The availability problem: Multiple recovering sets. 5 Extensions LRC codes on algebraic curves Cyclic LRC codes 6. Open problems: Bounds on codes; cyclic codes; list decoding A. Barg (UMD) Coding for memory and storage January 27, 2015 6 / 73

Codes with locality State-of-the-Art Coding technique RAID: Redundant Array of Independent Disks RAID 1 Replication (currently 3x) Provides high availability of information Can tolerate any 2 disk failures Widely used in Hadoop and many other systems Storage overhead of 200% RAID 6 uses [6,4,3] RS codes [n, k] RS codes Can tolerate any n-k disk failures Poor handling of single disk failures (The Repair Problem) A. Barg (UMD) Coding for memory and storage January 27, 2015 7 / 73

Codes with locality Limitations of Reed-Solomon codes Example: [14, 10] RS code Transmit 10 symbols to recover one lost value Generates 10x more traffic for recovery of one drive If large portion of the cluster is RS-coded, this leads to saturation of the network A. Barg (UMD) Coding for memory and storage January 27, 2015 8 / 73

Codes with locality Other constructions A combination of local and global parity checks for single and multiple nodes failures (C. Huang at al., Erasure coding in Windows Azure Storage, USENIX Conf. 2012) A. Barg (UMD) Coding for memory and storage January 27, 2015 9 / 73

Codes with locality Other constructions A combination of local and global parity checks for single and multiple nodes failures (C. Huang at al., Erasure coding in Windows Azure Storage, USENIX Conf. 2012) Other similar constructions (Windows Azure code) X1 X2 X3 X4 X5 X6 Y1 Y2 Y3 Y4 Y5 Y6 P1 P2 Px Py Pyramid codes (C. Huang et al., 2007) A. Barg (UMD) Coding for memory and storage January 27, 2015 9 / 73

LRC Codes Locally recoverable codes Table of codewords The code C F n is locally recoverable with locality r if every symbol can be recovered by accessing some other r symbols in the encoding (recovering set of coordinate i) a a 000000 111111 000000 111111 000000 111111 I A. Barg (UMD) Coding for memory and storage January 27, 2015 10 / 73

LRC Codes (n, k, r) LRC code Let a F; consider the restriction C J of C to a subset J [n]. Let C J (a, i) = {x C J : x i = a}, i [n].. Definition. Code C has locality r if for every i [n] there exists a subset J i [n]\i, J i r such that. C Ji (a, i) C Ji (a, i) =, a a A. Barg (UMD) Coding for memory and storage January 27, 2015 11 / 73

LRC Codes (n, k, r) LRC code Let a F; consider the restriction C J of C to a subset J [n]. Let C J (a, i) = {x C J : x i = a}, i [n].. Definition. Code C has locality r if for every i [n] there exists a subset J i [n]\i, J i r such that. C Ji (a, i) C Ji (a, i) =, a a J. Han and L. Lastras-Montano, ISIT 2007; C. Huang, M. Chen, and J. Li, Symp. Networks App. 2007; F. Oggier and A. Datta 10; P. Gopalan, C. Huang, H. Simitci, and S. Yekhanin, IEEE Trans. Inf. Theory, Nov. 2012. A. Barg (UMD) Coding for memory and storage January 27, 2015 11 / 73

Bounds Parameters of LRC codes A. Barg (UMD) Coding for memory and storage January 27, 2015 12 / 73

Bounds Parameters of LRC codes. Theorem. Let C be an (n, k, r) LRC code of cardinality q k over an alphabet of size q, then: The rate of C satisfies The minimum distance of C satisfies. k n r r + 1. (1) d n k k + 2. (2) r Bound (2) is due to Gopalan e.a. (2011) and Papailiopoulos e.a. (2012). A. Barg (UMD) Coding for memory and storage January 27, 2015 12 / 73

Bounds Parameters of LRC codes. Theorem. Let C be an (n, k, r) LRC code of cardinality q k over an alphabet of size q, then: The rate of C satisfies The minimum distance of C satisfies. k n r r + 1. (1) d n k k + 2. (2) r Bound (2) is due to Gopalan e.a. (2011) and Papailiopoulos e.a. (2012). Note that r = k reduces (2) to the Singleton bound d n k + 1 A. Barg (UMD) Coding for memory and storage January 27, 2015 12 / 73

Bounds The distance bound Main idea. Let C be a q-ary code of length n, size q k. The distance d(c) equals d(c) = n max{ S : C S < q k } S [n] A. Barg (UMD) Coding for memory and storage January 27, 2015 13 / 73

Bounds The distance bound Main idea. Let C be a q-ary code of length n, size q k. The distance d(c) equals d(c) = n max{ S : C S < q k } S [n] The Singleton bound (without locality): S = k 1 A. Barg (UMD) Coding for memory and storage January 27, 2015 13 / 73

Bounds The distance bound Main idea. Let C be a q-ary code of length n, size q k. The distance d(c) equals The Singleton bound (with locality): d(c) = n max{ S : C S < q k } S [n] Let I i [n], I i r be the recovering set for the symbol c i, i = 1,..., n. A. Barg (UMD) Coding for memory and storage January 27, 2015 14 / 73

Bounds The distance bound Main idea. Let C be a q-ary code of length n, size q k. The distance d(c) equals The Singleton bound (with locality): d(c) = n max{ S : C S < q k } S [n] Let I i [n], I i r be the recovering set for the symbol c i, i = 1,..., n. Let J m = m i=1i i, where m = (k 1)/r. Clearly J m k 1. A. Barg (UMD) Coding for memory and storage January 27, 2015 14 / 73

Bounds The distance bound Main idea. Let C be a q-ary code of length n, size q k. The distance d(c) equals The Singleton bound (with locality): d(c) = n max{ S : C S < q k } S [n] Let I i [n], I i r be the recovering set for the symbol c i, i = 1,..., n. Let J m = m i=1i i, where m = (k 1)/r. Clearly J m k 1. Consider the subset J m = J m {1,..., m}. We have C J m q k 1. A. Barg (UMD) Coding for memory and storage January 27, 2015 14 / 73

Bounds The distance bound Main idea. Let C be a q-ary code of length n, size q k. The distance d(c) equals The Singleton bound (with locality): d(c) = n max{ S : C S < q k } S [n] Let I i [n], I i r be the recovering set for the symbol c i, i = 1,..., n. Let J m = m i=1i i, where m = (k 1)/r. Clearly J m k 1. Consider the subset J m = J m {1,..., m}. We have C J m q k 1. If J m < k 1, add to J m any k 1 J m other coordinates to form the set L m [n]. A. Barg (UMD) Coding for memory and storage January 27, 2015 14 / 73

Bounds The distance bound Main idea. Let C be a q-ary code of length n, size q k. The distance d(c) equals The Singleton bound (with locality): d(c) = n max{ S : C S < q k } S [n] Let I i [n], I i r be the recovering set for the symbol c i, i = 1,..., n. Let J m = m i=1i i, where m = (k 1)/r. Clearly J m k 1. Consider the subset J m = J m {1,..., m}. We have C J m q k 1. If J m < k 1, add to J m any k 1 J m other coordinates to form the set L m [n]. We have C Lm < q k L m = k 1 + m = k 1 + k 1 k = k 2 + r r A. Barg (UMD) Coding for memory and storage January 27, 2015 14 / 73

Bounds Cadambe-Mazumdar bound (n, k, r) LRC code C A. Barg (UMD) Coding for memory and storage January 27, 2015 15 / 73

Bounds Cadambe-Mazumdar bound (n, k, r) LRC code C (q) k min(rs + k opt (n s(r + 1), d)) s 1 A. Barg (UMD) Coding for memory and storage January 27, 2015 15 / 73

Bounds Cadambe-Mazumdar bound (n, k, r) LRC code C (q) k min(rs + k opt (n s(r + 1), d)) s 1 Consider the sets of coordinates L s constructed above, 1 s (k 1)/r. C Ls q rs The shortening of the code C on the coordinates in L s forms a code of length n s(r + 1) with distance d A. Barg (UMD) Coding for memory and storage January 27, 2015 15 / 73

Bounds Existence (Gilbert-Varshamov) bound A linear q-ary [n, k, d] code exists if ( ) d 2 n 1 (q 1) i < q n k i Add n/(r + 1) local parities i=0 k k n r + 1 Sequences of (R, δ) codes with locality r exist as long as R < r r + 1 δ log q 1 1 q (1 δ) log δ q 1 δ A. Barg (UMD) Coding for memory and storage January 27, 2015 16 / 73

Bounds Existence (Gilbert-Varshamov) bound A linear q-ary [n, k, d] code exists if ( ) d 2 n 1 (q 1) i < q n k i Add n/(r + 1) local parities i=0 k k n r + 1 Sequences of (R, δ) codes with locality r exist as long as R < r r + 1 δ log q 1 1 q (1 δ) log δ q 1 δ R r r + 1 hq(δ) A. Barg (UMD) Coding for memory and storage January 27, 2015 16 / 73

Constructions Early constructions. 1 Optimal ((r + 1) k/r, k, r) LRC code Prasanth, Kamath, Lalitha, and Kumar, ISIT 2012 Restricted length. 2 Optimal (n, k, r) LRC codes Silberstein, Rawat, Koluoglu, and Vishwanath, ISIT 2013 Tamo, Papailiopoulos, and Dimakis, ISIT 2013 Almost any n, k, r Field size q 2 n A. Barg (UMD) Coding for memory and storage January 27, 2015 17 / 73

Constructions RS-type codes Reed-Solomon codes An RS code C is a linear code of length n q 1 over the field F q A. Barg (UMD) Coding for memory and storage January 27, 2015 18 / 73

Constructions RS-type codes Reed-Solomon codes An RS code C is a linear code of length n q 1 over the field F q Given a polynomial f F q [x] and a set A = {P 1,..., P n } F q define the map ev A : f (f (P i ), i = 1,..., n) A. Barg (UMD) Coding for memory and storage January 27, 2015 18 / 73

Constructions RS-type codes Reed-Solomon codes An RS code C is a linear code of length n q 1 over the field F q Given a polynomial f F q [x] and a set A = {P 1,..., P n } F q define the map ev A : f (f (P i ), i = 1,..., n) RS code C encodes messages of k symbols. Let V k (q) = {f F q [x] : deg(f ) k 1} C : V k (q) F n q f ev A (f ) = (f (P i ), i = 1,..., n) A. Barg (UMD) Coding for memory and storage January 27, 2015 18 / 73

Constructions RS-type codes Reed-Solomon codes An RS code C is a linear code of length n q 1 over the field F q Given a polynomial f F q [x] and a set A = {P 1,..., P n } F q define the map ev A : f (f (P i ), i = 1,..., n) RS code C encodes messages of k symbols. Let V k (q) = {f F q [x] : deg(f ) k 1} C : V k (q) F n q f ev A (f ) = (f (P i ), i = 1,..., n) Example: Let q = 8, f (x) = 1 + αx + αx 2 f (x) (1, α 4, α 6, α 4, α, α, α 6 ) A. Barg (UMD) Coding for memory and storage January 27, 2015 18 / 73

Constructions RS-type codes Reed-Solomon codes A. Barg (UMD) Coding for memory and storage January 27, 2015 19 / 73

Constructions RS-type codes Reed-Solomon codes A. Barg (UMD) Coding for memory and storage January 27, 2015 20 / 73

Constructions RS-type codes Reed-Solomon codes A. Barg (UMD) Coding for memory and storage January 27, 2015 21 / 73

Constructions RS-type codes Reed-Solomon codes To recover one erased value we need to read k other values A. Barg (UMD) Coding for memory and storage January 27, 2015 21 / 73

Constructions RS-type codes LRC codes: Idea of construction What if we can interpolate low-degree polynomials? A. Barg (UMD) Coding for memory and storage January 27, 2015 22 / 73

Constructions RS-type codes LRC codes: Idea of construction What if we can interpolate low-degree polynomials? A. Barg (UMD) Coding for memory and storage January 27, 2015 22 / 73

Constructions RS-type codes Construction of LRC codes: Limitations We need a specially chosen set of points A Restricted set of polynomials A. Barg (UMD) Coding for memory and storage January 27, 2015 23 / 73

Constructions RS-type codes Construction of (n, k, r) LRC codes: Example Parameters: n = 9, k = 4, r = 2, q = 13; Set of points: A={1,2,3,4,5,6,9,10,12} A = {A 1 = {1, 3, 9}, A 2 = {2, 6, 5}, A 3 = {4, 12, 10}} Message: a = (a 0,0, a 0,1, a 1,0, a 1,1 ) F k q Polynomial space: V k (q) := {a 0,0 + a 1,0 x + a 0,1 x 3 + a 1,1 x 4 } A. Barg (UMD) Coding for memory and storage January 27, 2015 24 / 73

Constructions RS-type codes Construction of (n, k, r) LRC codes: Example Parameters: n = 9, k = 4, r = 2, q = 13; Set of points: A={1,2,3,4,5,6,9,10,12} A = {A 1 = {1, 3, 9}, A 2 = {2, 6, 5}, A 3 = {4, 12, 10}} Message: a = (a 0,0, a 0,1, a 1,0, a 1,1 ) F k q Polynomial space: V k (q) := {a 0,0 + a 1,0 x + a 0,1 x 3 + a 1,1 x 4 } E.g., a = (1, 1, 1, 1), f a(x) = 1 + x + x 3 + x 4 ; ev A (f ) = (4, 8, 7, 1, 11, 2, 0, 0, 0) A. Barg (UMD) Coding for memory and storage January 27, 2015 24 / 73

Constructions RS-type codes Construction of (n, k, r) LRC codes: Example Parameters: n = 9, k = 4, r = 2, q = 13; Set of points: A={1,2,3,4,5,6,9,10,12} A = {A 1 = {1, 3, 9}, A 2 = {2, 6, 5}, A 3 = {4, 12, 10}} Message: a = (a 0,0, a 0,1, a 1,0, a 1,1 ) F k q Polynomial space: V k (q) := {a 0,0 + a 1,0 x + a 0,1 x 3 + a 1,1 x 4 } E.g., a = (1, 1, 1, 1), f a(x) = 1 + x + x 3 + x 4 ; ev A (f ) = (4, 8, 7, 1, 11, 2, 0, 0, 0) Say c 1 = f a (1) is erased. We access the recovering set A 1 to construct a line δ(x) = 2x + 2 such that δ(3) = 8, δ(9) = 7. A. Barg (UMD) Coding for memory and storage January 27, 2015 24 / 73

Constructions RS-type codes Construction of (n, k, r) LRC codes: Example Parameters: n = 9, k = 4, r = 2, q = 13; Set of points: A={1,2,3,4,5,6,9,10,12} A = {A 1 = {1, 3, 9}, A 2 = {2, 6, 5}, A 3 = {4, 12, 10}} Message: a = (a 0,0, a 0,1, a 1,0, a 1,1 ) F k q Polynomial space: V k (q) := {a 0,0 + a 1,0 x + a 0,1 x 3 + a 1,1 x 4 } E.g., a = (1, 1, 1, 1), f a(x) = 1 + x + x 3 + x 4 ; ev A (f ) = (4, 8, 7, 1, 11, 2, 0, 0, 0) Say c 1 = f a (1) is erased. We access the recovering set A 1 to construct a line δ(x) = 2x + 2 such that δ(3) = 8, δ(9) = 7. Compute c 1 as δ(1) = 4 A. Barg (UMD) Coding for memory and storage January 27, 2015 24 / 73

Constructions RS-type codes Construction of (n, k, r) LRC codes: Example Parameters: n = 9, k = 4, r = 2, q = 13; Set of points: A={1,2,3,4,5,6,9,10,12} A = {A 1 = {1, 3, 9}, A 2 = {2, 6, 5}, A 3 = {4, 12, 10}} Message: a = (a 0,0, a 0,1, a 1,0, a 1,1 ) F k q Polynomial space: V k (q) := {a 0,0 + a 1,0 x + a 0,1 x 3 + a 1,1 x 4 } E.g., a = (1, 1, 1, 1), f a(x) = 1 + x + x 3 + x 4 ; ev A (f ) = (4, 8, 7, 1, 11, 2, 0, 0, 0) Say c 1 = f a (1) is erased. We access the recovering set A 1 to construct a line δ(x) = 2x + 2 such that δ(3) = 8, δ(9) = 7. Compute c 1 as δ(1) = 4 It works! A. Barg (UMD) Coding for memory and storage January 27, 2015 24 / 73

Constructions RS-type codes Construction of (n, k, r) LRC codes Assume that q n, (r + 1) n, r k Let A F q, A = n A. Barg (UMD) Coding for memory and storage January 27, 2015 25 / 73

Constructions RS-type codes Construction of (n, k, r) LRC codes Assume that q n, (r + 1) n, r k Let A F q, A = n Suppose there exists a polynomial g(x) F[x] such that 1. deg g = r + 1, 2. There exists a partition A = {A 1,..., A n } of A into sets of size r + 1, such that g r+1 is constant on each set A i in the partition. For all i = 1,..., n/(r + 1), and any α, β A i, g(α) = g(β). A. Barg (UMD) Coding for memory and storage January 27, 2015 25 / 73

Constructions RS-type codes Construction of (n, k, r) LRC codes Assume that q n, (r + 1) n, r k Let A F q, A = n Suppose there exists a polynomial g(x) F[x] such that 1. deg g = r + 1, 2. There exists a partition A = {A 1,..., A n } of A into sets of size r + 1, such that g r+1 is constant on each set A i in the partition. For all i = 1,..., n/(r + 1), and any α, β A i, g(α) = g(β). E.g., n = 9, r = 2, q = 13; A = {A 1 = {1, 3, 9}, A 2 = {2, 6, 5}, A 3 = {4, 12, 10}}, Then g(x) = x 3 is constant on each of the A i s A. Barg (UMD) Coding for memory and storage January 27, 2015 25 / 73

Constructions RS-type codes Construction of (n, k, r) LRC codes Given A F, partition A into (r + 1)-subsets. To encode the message a F k, write a = (a ij, i = 0,..., r 1; j = 0,..., k 1) r Define the encoding polynomial r 1 f a(x) = f i (x)x i, where k r 1 f i (x) = a ij g(x) j, i = 0,..., r 1 A linear code C is constructed as follows: Ev :F k F n a (f a(β), β A) j=0 i=0 A. Barg (UMD) Coding for memory and storage January 27, 2015 26 / 73

Constructions RS-type codes Recovery of erased symbol Suppose that the location of erased symbol is α A j ; A j A To find c α we rely on the recovering set A j Find a polynomial δ(x) s.t. δ(β) = c β, β A j \α; deg δ r 1 : Then c α = δ(α) δ(x) = β A j \α c β β A j \{α,β} x β β β A. Barg (UMD) Coding for memory and storage January 27, 2015 27 / 73

Constructions RS-type codes Properties of the construction. Theorem. The constructed linear codes are optimal (n, k, r) LRC codes with respect to the. Singleton bound (2). Optimality is proved by counting degrees. Locality: Let α A j be the erased location. Define By the construction, for all β A j r 1 (x) = f i (α)x i i=0 (β) = f a (β) Since deg r 1, we see that (x) δ(x). A. Barg (UMD) Coding for memory and storage January 27, 2015 28 / 73

Constructions RS-type codes Constructing the polynomial g(x). Proposition. Let H be a subgroup of F q or F + q. The annihilator polynomial of H g(x) = h H(x h). is constant on each coset of H. A. Barg (UMD) Coding for memory and storage January 27, 2015 29 / 73

Constructions RS-type codes Constructing the polynomial g(x). Proposition. Let H be a subgroup of F q or F + q. The annihilator polynomial of H g(x) = h H(x h). is constant on each coset of H. Assume that H is a multiplicative subgroup and let a, ah be two elements of the coset ah, where h H, then g(ah) = h H (ah h) =h H h H(a hh 1 ) = h H(a h) =g(a). A. Barg (UMD) Coding for memory and storage January 27, 2015 29 / 73

Constructions RS-type codes Some generalizations The locator set A F, A = m i=1a i. Consider the algebra F A [x] = {f F[x] : f is constant on A i, i = 1,..., m; deg f < A }. A. Barg (UMD) Coding for memory and storage January 27, 2015 30 / 73

Constructions RS-type codes Some generalizations The locator set A F, A = m i=1a i. Consider the algebra F A [x] = {f F[x] : f is constant on A i, i = 1,..., m; deg f < A }. The properties of F A [x] are summarized as follows: 1. dim(f A [x]) = m; 2. Let α 1,..., α m be distinct nonzero elements of F, and let g be the polynomial of degree deg(g) < A that satisfies g(a i ) = α i for all i = 1,..., m. Then the polynomials 1, g,..., g m 1 form a basis of F A [x]. A. Barg (UMD) Coding for memory and storage January 27, 2015 30 / 73

Constructions RS-type codes Some generalizations The locator set A F, A = m i=1a i. Consider the algebra F A [x] = {f F[x] : f is constant on A i, i = 1,..., m; deg f < A }. The properties of F A [x] are summarized as follows: 1. dim(f A [x]) = m; 2. Let α 1,..., α m be distinct nonzero elements of F, and let g be the polynomial of degree deg(g) < A that satisfies g(a i ) = α i for all i = 1,..., m. Then the polynomials 1, g,..., g m 1 form a basis of F A [x]. General code construction: Let A F, A = n; A = m i=1a i, A i = r + 1 for all i. Let Φ be an injective mapping from F k to the space of polynomials F r A = r 1 i=0 F A[x]x i. The evaluation code obtained in this way is an (n, k, r) LRC code. A. Barg (UMD) Coding for memory and storage January 27, 2015 30 / 73

Constructions Extensions Extensions 1. It is possible to lift the divisibility constraints r k, (r + 1) n A. Barg (UMD) Coding for memory and storage January 27, 2015 31 / 73

Constructions Extensions Extensions 1. It is possible to lift the divisibility constraints r k, (r + 1) n 2. It is possible to define a systematic algebraic encoding mapping. A. Barg (UMD) Coding for memory and storage January 27, 2015 31 / 73

Constructions Extensions Extensions 1. It is possible to lift the divisibility constraints r k, (r + 1) n 2. It is possible to define a systematic algebraic encoding mapping. 3. To improve data availability, replace [r + 1, r, 2] local codes with [r + ρ 1, r] MDS codes. Then every c i is a function of any r out of r + ρ 1 coordinates. Bound on the distance: ( k ) d n k + 1 1 (ρ 1) (Kamath e.a., 2013) r A. Barg (UMD) Coding for memory and storage January 27, 2015 31 / 73

Constructions Extensions Extensions 1. It is possible to lift the divisibility constraints r k, (r + 1) n 2. It is possible to define a systematic algebraic encoding mapping. 3. To improve data availability, replace [r + 1, r, 2] local codes with [r + ρ 1, r] MDS codes. Then every c i is a function of any r out of r + ρ 1 coordinates. Bound on the distance: ( k ) d n k + 1 1 (ρ 1) (Kamath e.a., 2013) r Claim: Taking recovering sets of size A i = r + ρ 1 and a polynomial basis of F A[x], we can construct an (n, k, r) LRC code whose distance meets this bound. A. Barg (UMD) Coding for memory and storage January 27, 2015 31 / 73

Multiple recovering sets Availability problem Hot data accessed simultaneously by a very large number of users A. Barg (UMD) Coding for memory and storage January 27, 2015 32 / 73

Multiple recovering sets Availability problem Hot data accessed simultaneously by a very large number of users A. Barg (UMD) Coding for memory and storage January 27, 2015 32 / 73

Multiple recovering sets Multiple recovering sets: Definition Every symbol in data encoding appears in several disjoint (orthogonal) parity checks C F n a code of length n Every coordinate is recoverable from the codeword symbols in several recovering sets: i A. Barg (UMD) Coding for memory and storage January 27, 2015 33 / 73

Multiple recovering sets Multiple recovering sets: Definition Let C(a, i) = {x C : x i = a}, a F, i [n] The code C has two disjoint recovering sets if for every i [n] there are subsets R 1 i, R 2 i [n]\{i}, R 1 i R 2 i = such that C(a, i) j R C(a, i) j i R =, a a ; j = 1, 2 i A. Barg (UMD) Coding for memory and storage January 27, 2015 34 / 73

Multiple recovering sets Multiple recovering sets: Idea of construction A. Barg (UMD) Coding for memory and storage January 27, 2015 35 / 73

Multiple recovering sets Multiple recovering sets: Idea of construction A. Barg (UMD) Coding for memory and storage January 27, 2015 36 / 73

Multiple recovering sets Multiple recovering sets: Idea of construction A. Barg (UMD) Coding for memory and storage January 27, 2015 37 / 73

Multiple recovering sets Multiple recovering sets: Idea of construction A. Barg (UMD) Coding for memory and storage January 27, 2015 38 / 73

Multiple recovering sets Multiple recovering sets: Idea of construction 1 x f a(γ) can be found by interpolating δ 1 (x) as well as δ 2 (x) Γ f a x 2 x A. Barg (UMD) Coding for memory and storage January 27, 2015 39 / 73

Multiple recovering sets Multiple recovering sets: Example Take F = F 13 ; G, H F ; G = 5, H = 3 Let A G = {{1, 5, 12, 8}, {2, 10, 11, 3}, {4, 7, 9, 6}} A H = {{1, 3, 9}, {2, 6, 5}, {4, 12, 10}, {7, 8, 11}} F AG [x] = {f F[x] : f is constant on A i, i = 1, 2, 3; deg f < F } F AG [x] = 1, x 4, x 8, F AH [x] = 1, x 3, x 6, x 9 A. Barg (UMD) Coding for memory and storage January 27, 2015 40 / 73

Multiple recovering sets Multiple recovering sets: Example Take F = F 13 ; G, H F ; G = 5, H = 3 Let A G = {{1, 5, 12, 8}, {2, 10, 11, 3}, {4, 7, 9, 6}} A H = {{1, 3, 9}, {2, 6, 5}, {4, 12, 10}, {7, 8, 11}} F AG [x] = {f F[x] : f is constant on A i, i = 1, 2, 3; deg f < F } F AG [x] = 1, x 4, x 8, F AH [x] = 1, x 3, x 6, x 9 We construct an LRC (12, 4, {2, 3}), distance 6, code C : F 4 F 12 a = (a 0, a 1, a 2, a 3 ) f a(x) = a 0 + a 1 x + a 2 x 4 + a 3 x 6 2 f a (x) = f i (x)x i, where f 0 (x) = a 0 + a 2 x 4, f 1 (x) = a 1, f 2 (x) = a 3 x 4 ; f i F A [x] i=0 1 f a (x) = g j (x)x j where g 0 (x) = a 0 + a 3 x 6, g 1 (x) = a 1 + a 2 x 3 ; g j F AH [x] j=0 E.g., f a (1) can be recovered by computing δ 1 (x), x {5, 12, 8} OR δ 2 (x), x {3, 9} A. Barg (UMD) Coding for memory and storage January 27, 2015 40 / 73

Multiple recovering sets Multiple recovering sets General Construction: A = {α 1,..., α n} F, A = n; A A {}}{{}} { A = i 0 Ri 1 = j 0 Rj 2 ; Ri 1 = r + 1, Rj 2 = s + 1 k 1 f a(x) = a i g i (x), g i (x) FA r F s A i=0 Evaluation map: (a 1,..., a k ) C (f a (α 1 ),..., f a (α n )) Theorem: Assume that the partitions A, A are orthogonal. Then Eval (f : f F r A F s A ), x A gives an (n, k, {r, s}) LRC code with distance n m + 1, where m is the largest degree in F r A F s A. A. Barg (UMD) Coding for memory and storage January 27, 2015 41 / 73

Multiple recovering sets Constructing orthogonal partitions Orthogonal partitions can be obtained from the structure of additive or multiplicative subgroups of F q A. Barg (UMD) Coding for memory and storage January 27, 2015 42 / 73

Multiple recovering sets Constructing orthogonal partitions Orthogonal partitions can be obtained from the structure of additive or multiplicative subgroups of F q 1. F 13 ; G, H F 13; G = 5, H = 3 ; G H = id A G = {{1, 5, 12, 8}, {2, 10, 11, 3}, {4, 7, 9, 6}} A H = {{1, 3, 9}, {2, 6, 5}, {4, 12, 10}, {7, 8, 11}} A. Barg (UMD) Coding for memory and storage January 27, 2015 42 / 73

Multiple recovering sets Constructing orthogonal partitions Orthogonal partitions can be obtained from the structure of additive or multiplicative subgroups of F q 1. F 13 ; G, H F 13; G = 5, H = 3 ; G H = id A G = {{1, 5, 12, 8}, {2, 10, 11, 3}, {4, 7, 9, 6}} A H = {{1, 3, 9}, {2, 6, 5}, {4, 12, 10}, {7, 8, 11}} 2. Take G, H F + q, e.g., G = H = (Z 2 ) 2 ; F + 16 = (Z 2) 2 (Z 2 ) 2 A. Barg (UMD) Coding for memory and storage January 27, 2015 42 / 73

Multiple recovering sets Constructing orthogonal partitions Orthogonal partitions can be obtained from the structure of additive or multiplicative subgroups of F q 1. F 13 ; G, H F 13; G = 5, H = 3 ; G H = id A G = {{1, 5, 12, 8}, {2, 10, 11, 3}, {4, 7, 9, 6}} A H = {{1, 3, 9}, {2, 6, 5}, {4, 12, 10}, {7, 8, 11}} 2. Take G, H F + q, e.g., G = H = (Z 2 ) 2 ; F + 16 = (Z 2) 2 (Z 2 ) 2 G = {0000, 0001, 0010, 0011} and H = {0000, 0100, 1000, 1100} A G = {{0, 1, α, α 4 }, {α 5, α 10, α 2, α 8 }, {α 6, α 13, α 11, α 12 }, {α 7, α 9, α 14, α 3 }} A H = {{0, α 2, α 3, α 6 }, {1, α 8, α 14, α 13 }, {α, α 5, α 9, α 11 }, {α 4, α 10, α 7, α 12 }} A. Barg (UMD) Coding for memory and storage January 27, 2015 42 / 73

Multiple recovering sets Constructing orthogonal partitions Orthogonal partitions can be obtained from the structure of additive or multiplicative subgroups of F q 1. F 13 ; G, H F 13; G = 5, H = 3 ; G H = id A G = {{1, 5, 12, 8}, {2, 10, 11, 3}, {4, 7, 9, 6}} A H = {{1, 3, 9}, {2, 6, 5}, {4, 12, 10}, {7, 8, 11}} 2. Take G, H F + q, e.g., G = H = (Z 2 ) 2 ; F + 16 = (Z 2) 2 (Z 2 ) 2 G = {0000, 0001, 0010, 0011} and H = {0000, 0100, 1000, 1100} A G = {{0, 1, α, α 4 }, {α 5, α 10, α 2, α 8 }, {α 6, α 13, α 11, α 12 }, {α 7, α 9, α 14, α 3 }} A H = {{0, α 2, α 3, α 6 }, {1, α 8, α 14, α 13 }, {α, α 5, α 9, α 11 }, {α 4, α 10, α 7, α 12 }} Proposition: Two subgroups G, H define orthogonal coset partitions if they intersect trivially: G H = id A. Barg (UMD) Coding for memory and storage January 27, 2015 42 / 73

Multiple recovering sets Remarks There are other ways of constructing codes with multiple (e.g., two) recovering sets: Product codes, Bipartite-graph codes A. Barg (UMD) Coding for memory and storage January 27, 2015 43 / 73

Multiple recovering sets Remarks There are other ways of constructing codes with multiple (e.g., two) recovering sets: Product codes, Bipartite-graph codes A family of optimal locally recoverable codes, with I. Tamo, arxiv:1311.3284 (IT Trans., no. 8, 2014) A. Barg (UMD) Coding for memory and storage January 27, 2015 43 / 73

Bounds on the parameters Multiple recovering sets. Theorem. Let C be an (n, k, r, t) LRC code with t disjoint recovering sets of size r. Then the rate of C satisfies k n 1 t j=1 (1 + 1 ) t 1 r jr The minimum distance of C is bounded above as follows:. d n d n k t k 1. (Tamo B, 2014) i=0 r i t(k 1) + 1 + 2 (Rawat e.a., 2014) t(r 1) + 1 A. Barg (UMD) Coding for memory and storage January 27, 2015 44 / 73

Bounds on the parameters Multiple recovering sets. Theorem. Let C be an (n, k, r, t) LRC code with t disjoint recovering sets of size r. Then the rate of C satisfies k n 1 t j=1 (1 + 1 ) t 1 r jr The minimum distance of C is bounded above as follows:. d n d n k t k 1. (Tamo B, 2014) i=0 r i t(k 1) + 1 + 2 (Rawat e.a., 2014) t(r 1) + 1 It is likely that these bounds are not final A. Barg (UMD) Coding for memory and storage January 27, 2015 44 / 73

Locally recoverable codes, Part II LRC codes A block code of length n over F q is called LRC with locality r if every symbol of the codeword can be found by accessing some r symbols of the codeword. A. Barg (UMD) Coding for memory and storage January 27, 2015 45 / 73

Locally recoverable codes, Part II LRC codes A block code of length n over F q is called LRC with locality r if every symbol of the codeword can be found by accessing some r symbols of the codeword. Regenerating codes: Data can be read off from any location A. Barg (UMD) Coding for memory and storage January 27, 2015 45 / 73

Locally recoverable codes, Part II LRC codes A block code of length n over F q is called LRC with locality r if every symbol of the codeword can be found by accessing some r symbols of the codeword. Regenerating codes: Data can be read off from any location Last time we constructed RS-type LRC codes with q n, distance meeting the Gopalan et al. Singleton bound: k d n k + 2 r A. Barg (UMD) Coding for memory and storage January 27, 2015 45 / 73

Locally recoverable codes, Part II LRC codes A block code of length n over F q is called LRC with locality r if every symbol of the codeword can be found by accessing some r symbols of the codeword. Regenerating codes: Data can be read off from any location Last time we constructed RS-type LRC codes with q n, distance meeting the Gopalan et al. Singleton bound: k d n k + 2 r Asymptotic GV bound with locality: R r r + 1 hq(δ) A. Barg (UMD) Coding for memory and storage January 27, 2015 45 / 73

Locally recoverable codes, Part II Extensions Reed-Solomon codes can be extended in two ways: Codes on algebraic curves Cyclic codes and subfield subcodes A. Barg (UMD) Coding for memory and storage January 27, 2015 46 / 73

Locally recoverable codes, Part II Algebraic codes RS codes: n q 1, 1 k n, d = n k + 1 A. Barg (UMD) Coding for memory and storage January 27, 2015 47 / 73

Locally recoverable codes, Part II Algebraic codes RS codes: n q 1, 1 k n, d = n k + 1 To construct the code, take A := (P 1,..., P n) F q A. Barg (UMD) Coding for memory and storage January 27, 2015 47 / 73

Locally recoverable codes, Part II Algebraic codes RS codes: n q 1, 1 k n, d = n k + 1 To construct the code, take A := (P 1,..., P n) F q Message a = (a 1,..., a k ) F k q A. Barg (UMD) Coding for memory and storage January 27, 2015 47 / 73

Locally recoverable codes, Part II Algebraic codes RS codes: n q 1, 1 k n, d = n k + 1 To construct the code, take A := (P 1,..., P n) F q Message a = (a 1,..., a k ) F k q f (x) = k i=1 a ix i 1 A. Barg (UMD) Coding for memory and storage January 27, 2015 47 / 73

Locally recoverable codes, Part II Algebraic codes RS codes: n q 1, 1 k n, d = n k + 1 To construct the code, take A := (P 1,..., P n) F q Message a = (a 1,..., a k ) F k q f (x) = k i=1 a ix i 1 (f (P 1 ),..., f (P n )) A. Barg (UMD) Coding for memory and storage January 27, 2015 47 / 73

Locally recoverable codes, Part II Algebraic codes RS codes: n q 1, 1 k n, d = n k + 1 To construct the code, take A := (P 1,..., P n) F q Message a = (a 1,..., a k ) F k q f (x) = k i=1 a ix i 1 (f (P 1 ),..., f (P n )) Message space: span over F q of (x i, i = 0,..., k 1) A. Barg (UMD) Coding for memory and storage January 27, 2015 47 / 73

Locally recoverable codes, Part II Algebraic codes RS codes: n q 1, 1 k n, d = n k + 1 To construct the code, take A := (P 1,..., P n) F q Message a = (a 1,..., a k ) F k q f (x) = k i=1 a ix i 1 (f (P 1 ),..., f (P n )) Message space: span over F q of (x i, i = 0,..., k 1) By construction, n q A. Barg (UMD) Coding for memory and storage January 27, 2015 47 / 73

Locally recoverable codes, Part II Algebraic codes RS codes: n q 1, 1 k n, d = n k + 1 To construct the code, take A := (P 1,..., P n) F q Message a = (a 1,..., a k ) F k q f (x) = k i=1 a ix i 1 (f (P 1 ),..., f (P n )) Message space: span over F q of (x i, i = 0,..., k 1) By construction, n q (recall the MDS conjecture) A. Barg (UMD) Coding for memory and storage January 27, 2015 47 / 73

Locally recoverable codes, Part II Algebraic codes RS codes: n q 1, 1 k n, d = n k + 1 To construct the code, take A := (P 1,..., P n) F q Message a = (a 1,..., a k ) F k q f (x) = k i=1 a ix i 1 (f (P 1 ),..., f (P n )) Message space: span over F q of (x i, i = 0,..., k 1) By construction, n q (recall the MDS conjecture) Longer codes? We need more points P i A. Barg (UMD) Coding for memory and storage January 27, 2015 47 / 73

Locally recoverable codes, Part II Algebraic codes RS codes: n q 1, 1 k n, d = n k + 1 To construct the code, take A := (P 1,..., P n) F q Message a = (a 1,..., a k ) F k q f (x) = k i=1 a ix i 1 (f (P 1 ),..., f (P n )) Message space: span over F q of (x i, i = 0,..., k 1) By construction, n q (recall the MDS conjecture) Longer codes? We need more points P i Curves to the rescue! A. Barg (UMD) Coding for memory and storage January 27, 2015 47 / 73

Locally recoverable codes, Part II AG codes in error correction 1. Gilbert-Varshamov bound An [n, k, d] code exists if ( ) d 2 n 1 (q 1) i < q n k i i=0 Let R = k/n, δ = d/n, take logs and divide by n: R 1 h q (δ) A. Barg (UMD) Coding for memory and storage January 27, 2015 48 / 73

Locally recoverable codes, Part II AG codes in error correction 1. Gilbert-Varshamov bound An [n, k, d] code exists if ( ) d 2 n 1 (q 1) i < q n k i i=0 Let R = k/n, δ = d/n, take logs and divide by n: R 1 h q (δ) 2. Tsfasman-Vlăduţ-Zink bound There exist explicit sequences of codes on algebraic curves with the parameters R 1 δ 1 q 1 A. Barg (UMD) Coding for memory and storage January 27, 2015 48 / 73

Locally recoverable codes, Part II RS type codes Given A F, partition it into (r + 1)-subsets. To encode the message a F k, write a = (a ij, i = 0,..., r 1; j = 0,..., k r 1) A. Barg (UMD) Coding for memory and storage January 27, 2015 49 / 73

Locally recoverable codes, Part II RS type codes Given A F, partition it into (r + 1)-subsets. To encode the message a F k, write a = (a ij, i = 0,..., r 1; j = 0,..., k r 1) r 1 a f a(x) = f i (x)x i, where f i (x) = a ij g(x) j, i = 0,..., r 1 k r 1 i=0 j=0 A. Barg (UMD) Coding for memory and storage January 27, 2015 49 / 73

Locally recoverable codes, Part II RS type codes Given A F, partition it into (r + 1)-subsets. To encode the message a F k, write a = (a ij, i = 0,..., r 1; j = 0,..., k r 1) r 1 a f a(x) = f i (x)x i, where f i (x) = a ij g(x) j, i = 0,..., r 1 k r 1 i=0 j=0 Message space: span (g(x) j x i, i = 0,..., r 1; j = 0,..., k r 1) A. Barg (UMD) Coding for memory and storage January 27, 2015 49 / 73

Locally recoverable codes, Part II RS type codes Given A F, partition it into (r + 1)-subsets. To encode the message a F k, write a = (a ij, i = 0,..., r 1; j = 0,..., k r 1) r 1 a f a(x) = f i (x)x i, where f i (x) = a ij g(x) j, i = 0,..., r 1 k r 1 i=0 j=0 Message space: span (g(x) j x i, i = 0,..., r 1; j = 0,..., k r 1) Evaluation code C Ev :F k F n a (f a(p), P A) A. Barg (UMD) Coding for memory and storage January 27, 2015 49 / 73

Locally recoverable codes, Part II RS-like codes A. Barg (UMD) Coding for memory and storage January 27, 2015 50 / 73

Locally recoverable codes, Part II RS-like codes What is the meaning of g(x)? A. Barg (UMD) Coding for memory and storage January 27, 2015 50 / 73

Locally recoverable codes, Part II RS-like codes What is the meaning of g(x)? It does not make sense that the functions are g(x) j x i A. Barg (UMD) Coding for memory and storage January 27, 2015 50 / 73

Locally recoverable codes, Part II RS-like codes What is the meaning of g(x)? It does not make sense that the functions are g(x) j x i They should really be g(y) j x i A. Barg (UMD) Coding for memory and storage January 27, 2015 50 / 73

Codes on curves Geometric interpretation A := {1, 2, 3, 4, 5, 6, 9, 10, 12} F 13 g(x) :A F 13 x x 3 A = {A 1 = {1, 3, 9}, A 2 = {2, 6, 5}, A 3 = {4, 12, 10}} g(a 1 ) = 1, g(a 2 ) = 8, g(a 3 ) = 12 A. Barg (UMD) Coding for memory and storage January 27, 2015 51 / 73

Codes on curves Geometric interpretation A := {1, 2, 3, 4, 5, 6, 9, 10, 12} F 13 g(x) :A F 13 x x 3 A = {A 1 = {1, 3, 9}, A 2 = {2, 6, 5}, A 3 = {4, 12, 10}} g(a 1 ) = 1, g(a 2 ) = 8, g(a 3 ) = 12 g : F q F q g 1 (x) = r + 1 for every x in the image of g A. Barg (UMD) Coding for memory and storage January 27, 2015 51 / 73

Codes on curves Geometric interpretation A := {1, 2, 3, 4, 5, 6, 9, 10, 12} F 13 g(x) :A F 13 x x 3 A = {A 1 = {1, 3, 9}, A 2 = {2, 6, 5}, A 3 = {4, 12, 10}} g(a 1 ) = 1, g(a 2 ) = 8, g(a 3 ) = 12 g : F q F q g 1 (x) = r + 1 for every x in the image of g 1 2 4 X 3 6 12 9 5 10 Y 1 8 12 A. Barg (UMD) Coding for memory and storage January 27, 2015 51 / 73

Codes on curves LRC codes on curves Consider the set of pairs (x, y) F 9 that satisfy the equation x 3 + x = y 4 α 7 α 6 α 5 α 4 x α 3 α 2 α 1 0 0 1 α α 2 α 3 α 4 α 5 α 6 α 7 y A. Barg (UMD) Coding for memory and storage January 27, 2015 52 / 73

Codes on curves LRC codes on curves Consider the set of pairs (x, y) F 9 that satisfy the equation x 3 + x = y 4 α 7 α 6 α 5 α 4 x α 3 α 2 α 1 0 0 1 α α 2 α 3 α 4 α 5 α 6 α 7 y 27 points of the Hermitian curve over F 9 ; α 2 = α + 1 A. Barg (UMD) Coding for memory and storage January 27, 2015 52 / 73

Codes on curves LRC codes on curves Recall RS codes: C is a mapping V k = 1, x,..., x k 1 F n q A. Barg (UMD) Coding for memory and storage January 27, 2015 53 / 73

Codes on curves LRC codes on curves Recall RS codes: C is a mapping V k = 1, x,..., x k 1 F n q Hermitian codes Take the space of functions V := 1, y, y 2, x, xy, xy 2 A={27 points of the Hermitian curve over F 9 }; n = 27, k = 6 C : V F n 9 A. Barg (UMD) Coding for memory and storage January 27, 2015 53 / 73

Codes on curves LRC codes on curves Recall RS codes: C is a mapping V k = 1, x,..., x k 1 F n q Hermitian codes Take the space of functions V := 1, y, y 2, x, xy, xy 2 A={27 points of the Hermitian curve over F 9 }; n = 27, k = 6 E.g., message (1, α, α 2, α 3, α 4, α 5 ) C : V F n 9 F (x, y) = 1 + αy + α 2 y 2 + α 3 x + α 4 xy + α 5 xy 2 F (0, 0) = 1 etc. A. Barg (UMD) Coding for memory and storage January 27, 2015 53 / 73

Codes on curves LRC codes on curves α 7 α α 7 α 5 0 α 6 α 2 α 5 α 6 α 4 α 2 0 α 4 α 7 α 3 α 5 α 5 x α 3 α 3 α 7 α α α 2 α 3 α 0 0 0 0 1 1 α 6 α 4 0 0 1 0 1 α α 2 α 3 α 4 α 5 α 6 α 7 y A. Barg (UMD) Coding for memory and storage January 27, 2015 54 / 73

Codes on curves LRC codes on curves Let P = (α, 1) be the erased location. α 7 α α 7 α 5 0 α 6 α 2 α 5 α 6 α 4 α 2 0 α 4 α 7 α 3 α 5 α 5 x α 3 α 3 α 7 α α α 2 α 3 α X0 0 0 0 1 1 α 6 α 4 0 0 1 0 1 α α 2 α 3 α 4 α 5 α 6 α 7 y A. Barg (UMD) Coding for memory and storage January 27, 2015 55 / 73

Codes on curves LRC codes on curves α 7 α α 7 α 5 0 α 6 α 2 α 5 α 6 α 4 α 2 0 α 4 α 7 α 3 α 5 α 5 x α 3 α 3 α 7 α α α 2 α 3 α? 0 0 0 1 1 α 6 α 4 0 0 1 0 1 α α 2 α 3 α 4 α 5 α 6 α 7 Let P = (α, 1) be the erased location. Recovering set I P = {(α 4, 1), (α 3, 1)} Find f (x) : f (α 4 ) = α 7, f (α 3 ) = α 3 y f (x) = αx α 2 A. Barg (UMD) Coding for memory and storage January 27, 2015 56 / 73

Codes on curves LRC codes on curves α 7 α α 7 α 5 0 α 6 α 2 α 5 α 6 α 4 α 2 0 α 4 α 7 α 3 α 5 α 5 x α 3 α 3 α 7 α α α 2 α 3 α? 0 0 0 1 1 α 6 α 4 0 0 1 0 1 α α 2 α 3 α 4 α 5 α 6 α 7 Let P = (α, 1) be the erased location. Recovering set I P = {(α 4, 1), (α 3, 1)} Find f (x) : f (α 4 ) = α 7, f (α 3 ) = α 3 y f (x) = αx α 2 f (α) = 0 = F (P) A. Barg (UMD) Coding for memory and storage January 27, 2015 56 / 73

Codes on curves Hermitian codes q = q 2 0, q 0 prime power A. Barg (UMD) Coding for memory and storage January 27, 2015 57 / 73

Codes on curves Hermitian codes q = q 2 0, q 0 prime power X : x q 0 + x = y q 0+1 A. Barg (UMD) Coding for memory and storage January 27, 2015 57 / 73

Codes on curves Hermitian codes q = q 2 0, q 0 prime power X : x q 0 + x = y q 0+1 X has q 3 0 = q 3/2 points in F q A. Barg (UMD) Coding for memory and storage January 27, 2015 57 / 73

Codes on curves Hermitian codes q = q 2 0, q 0 prime power X : x q 0 + x = y q 0+1 X has q 3 0 = q 3/2 points in F q Let g : X Y = P 1, g(p) = g(x, y) := y A. Barg (UMD) Coding for memory and storage January 27, 2015 57 / 73

Codes on curves Hermitian codes q = q 2 0, q 0 prime power X : x q 0 + x = y q 0+1 X has q 3 0 = q 3/2 points in F q Let g : X Y = P 1, g(p) = g(x, y) := y We obtain a family of q-ary codes of length n = q 3 0, with locality r = q 0 1. k = (t + 1)(q 0 1), d n tq 0 (q 0 2)(q 0 + 1) A. Barg (UMD) Coding for memory and storage January 27, 2015 57 / 73

Codes on curves Hermitian codes q = q 2 0, q 0 prime power X : x q 0 + x = y q 0+1 X has q 3 0 = q 3/2 points in F q Let g : X Y = P 1, g(p) = g(x, y) := y We obtain a family of q-ary codes of length n = q 3 0, with locality r = q 0 1. k = (t + 1)(q 0 1), d n tq 0 (q 0 2)(q 0 + 1) It is also possible to take g(p) = x (projection on x); we obtain LRC codes with locality q 0 A. Barg (UMD) Coding for memory and storage January 27, 2015 57 / 73

Codes on curves Two recovering sets α 7 α 6 α 5 α 4 x α 3 α 2 α 1 0 0 1 α α 2 α 3 α 4 α 5 α 6 α 7 Polynomial basis {x i y j, i = 0, 1,..., r 1 1, j = 0, 1,..., r 2 1} y A. Barg (UMD) Coding for memory and storage January 27, 2015 58 / 73

Codes on curves Two recovering sets α 7 α 6 α 5 α 4 x α 3 α 2 α 1 0 0 1 α α 2 α 3 α 4 α 5 α 6 α 7 Polynomial basis {x i y j, i = 0, 1,..., r 1 1, j = 0, 1,..., r 2 1} (24, 6, {2, 3}) LRC(2) code over F 9 y A. Barg (UMD) Coding for memory and storage January 27, 2015 58 / 73

General LRC codes on curves Codes on curves Map of curves X, Y smooth projective absolutely irreducible curves over k rational separable map of degree r + 1. g : X Y A. Barg (UMD) Coding for memory and storage January 27, 2015 59 / 73

General LRC codes on curves Codes on curves Map of curves X, Y smooth projective absolutely irreducible curves over k rational separable map of degree r + 1. g : X Y Lift the points of Y S = {P 1,..., P s } Y (k); Q = π 1 ( ), where π : Y P 1 k. Assume that there is a partition of points A := g 1 (S) = {P ij, i = 0,..., r, j = 1,..., s} X(k) such that g(p ij ) = P j for all i, j. A. Barg (UMD) Coding for memory and storage January 27, 2015 59 / 73

General LRC codes on curves Codes on curves Map of curves X, Y smooth projective absolutely irreducible curves over k rational separable map of degree r + 1. g : X Y Lift the points of Y S = {P 1,..., P s } Y (k); Q = π 1 ( ), where π : Y P 1 k. Assume that there is a partition of points A := g 1 (S) = {P ij, i = 0,..., r, j = 1,..., s} X(k) such that Basis of the function space g(p ij ) = P j for all i, j. {f j x i, i = 0,..., r 1; j = 1,..., m} A. Barg (UMD) Coding for memory and storage January 27, 2015 59 / 73

General LRC codes on curves Codes on curves Map of curves X, Y smooth projective absolutely irreducible curves over k rational separable map of degree r + 1. g : X Y Lift the points of Y S = {P 1,..., P s } Y (k); Q = π 1 ( ), where π : Y P 1 k. Assume that there is a partition of points A := g 1 (S) = {P ij, i = 0,..., r, j = 1,..., s} X(k) such that Basis of the function space g(p ij ) = P j for all i, j. {f j x i, i = 0,..., r 1; j = 1,..., m} Construct LRC codes Evaluation codes constructed on the set A have the locality property with parameter r. A. Barg (UMD) Coding for memory and storage January 27, 2015 59 / 73

Codes on curves Asymptotically good sequences of codes Let q = q0, 2 where q 0 is a prime power. Take Garcia-Stichtenoth towers of curves: x 0 := 1; X 1 := P 1, k(x 1 ) = k(x 1 ); X l : z q 0 l + z l = x q 0+1 l 1, x l 1 := z l 1 k(x l 1 ) (if l 3), x l 2 A. Barg (UMD) Coding for memory and storage January 27, 2015 60 / 73

Codes on curves Asymptotically good sequences of codes Let q = q 2 0, where q 0 is a prime power. Take Garcia-Stichtenoth towers of curves: x 0 := 1; X 1 := P 1, k(x 1 ) = k(x 1 ); X l : z q 0 l + z l = x q 0+1 l 1, x l 1 := z l 1 k(x l 1 ) (if l 3), x l 2 There exist families of q-ary LRC codes with locality r whose rate and relative distance satisfy R r r + 1 R r r + 1 ( 1 δ 3 q + 1 ), r = q 1 ( 1 δ 2 q ), r = q q 1 (better than the GV bound) A. Barg (UMD) Coding for memory and storage January 27, 2015 60 / 73

Codes on curves Asymptotically good sequences of codes Let q = q 2 0, where q 0 is a prime power. Take Garcia-Stichtenoth towers of curves: x 0 := 1; X 1 := P 1, k(x 1 ) = k(x 1 ); X l : z q 0 l + z l = x q 0+1 l 1, x l 1 := z l 1 k(x l 1 ) (if l 3), x l 2 There exist families of q-ary LRC codes with locality r whose rate and relative distance satisfy R r r + 1 R r r + 1 ( 1 δ 3 q + 1 ), r = q 1 ( 1 δ 2 q ), r = q q 1 (better than the GV bound) ) Recall the TVZ bound without locality: R 1 δ 1 q 1 A. Barg (UMD) Coding for memory and storage January 27, 2015 60 / 73

Codes on curves Asymptotically good sequences of codes Let q = q 2 0, where q 0 is a prime power. Take Garcia-Stichtenoth towers of curves: x 0 := 1; X 1 := P 1, k(x 1 ) = k(x 1 ); X l : z q 0 l + z l = x q 0+1 l 1, x l 1 := z l 1 k(x l 1 ) (if l 3), x l 2 There exist families of q-ary LRC codes with locality r whose rate and relative distance satisfy R r r + 1 R r r + 1 ( 1 δ 3 q + 1 ), r = q 1 ( 1 δ 2 q ), r = q q 1 (better than the GV bound) Locally recoverable codes on algebraic curves, with I. Tamo and S. Vlăduţ, arxiv:1501.04904 A. Barg (UMD) Coding for memory and storage January 27, 2015 61 / 73

Codes on curves What next? A. Barg (UMD) Coding for memory and storage January 27, 2015 62 / 73

Codes on curves What next? A. Barg (UMD) Coding for memory and storage January 27, 2015 62 / 73

Codes on curves What next? A. Barg (UMD) Coding for memory and storage January 27, 2015 62 / 73

Codes on curves What next? A. Barg (UMD) Coding for memory and storage January 27, 2015 62 / 73

Cyclic LRC codes Another connection: Cyclic codes and Binary cyclic codes A. Barg (UMD) Coding for memory and storage January 27, 2015 63 / 73

Cyclic LRC codes Another connection: Cyclic codes and Binary cyclic codes Consider an [n = 15, k = 4] RS code over F 16 ; A = {1, α, α 2,..., α 14 } A. Barg (UMD) Coding for memory and storage January 27, 2015 63 / 73