IERG60 Coding for Distributed Storage Systems    Lecture 0-9//06
Product-matrix Construction
Lecturer: Kenneth Shum    Scribe: Xishi Wang

In previous lectures, we discussed the minimum storage regenerating (MSR) point. In this lecture, we study the minimum bandwidth regenerating (MBR) point via a product-matrix construction [1].

1 Minimum Bandwidth Regeneration (MBR)

Recall that for regenerating codes, the file size $B$ is upper bounded by

$$B \le \sum_{i=0}^{k-1} \min\big((d-i)\beta,\ \alpha\big), \tag{1}$$

where $d$ is the number of helper nodes used for repair, $k$ is the number of nodes needed to recover the whole file, $\alpha$ is the number of symbols stored in each node, and $\beta$ is the repair bandwidth per helper. The boundary of the admissible (achievable) region is a piecewise linear function, as shown in Figure 1. The MSR point is shown in red and the MBR point in blue. The latter is the focus of this lecture.

[Figure 1 placeholder: plot with axes $\alpha$ and $d\beta$; the MSR and MBR points are marked on the boundary.]
Figure 1: Graph of the admissible region.

To get the minimum bandwidth regenerating point, we let $\alpha = d\beta$. In this case, $(d-i)\beta \le \alpha$ for all $i$, so every term in (1) equals $(d-i)\beta$. Thus, as illustrated graphically in Figure 2,

$$B = d\beta + (d-1)\beta + \cdots + (d-k+1)\beta \tag{2}$$
$$\phantom{B} = \beta\,\frac{k(2d-k+1)}{2} \quad \text{(arithmetic progression).} \tag{3}$$
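As a quick sanity check of (1)-(3), the following minimal sketch evaluates the right-hand side of the bound at $\alpha = d\beta$ and compares it with the closed form. The function names `min_cut_bound` and `mbr_file_size` are illustrative choices, not notation from the lecture.

```python
def min_cut_bound(k, d, alpha, beta):
    """Right-hand side of (1): sum over i = 0..k-1 of min((d - i) * beta, alpha)."""
    return sum(min((d - i) * beta, alpha) for i in range(k))


def mbr_file_size(k, d, beta=1):
    """Closed form (3) at the MBR point alpha = d * beta.

    k * (2d - k + 1) is always even, so integer division is exact.
    """
    return beta * k * (2 * d - k + 1) // 2


# The closed form agrees with the bound for every 1 <= k <= d in a small range.
for d in range(1, 10):
    for k in range(1, d + 1):
        for beta in (1, 2, 3):
            assert min_cut_bound(k, d, d * beta, beta) == mbr_file_size(k, d, beta)

print(mbr_file_size(k=4, d=5))  # 14
```

With $k = 4$, $d = 5$, $\beta = 1$ this gives $B = 14$, the file size used in the explicit example of Section 3.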
[Figure 2 placeholder: staircase with step heights $d\beta$, $(d-1)\beta$, ..., $(d-k+1)\beta$ below the level $\alpha$; the shaded area equals $B$.]
Figure 2: Graphical illustration of equation (2). $\alpha$ is the water level and $B$ is the total area.

2 Product-matrix Construction

2.1 Matrix Construction

Let $\beta = 1$, $d > k$ and each symbol is an element over $\mathbb{F}_q$. We have $B = k(2d-k+1)/2$. We arrange these $B$ elements in a trapezoid (blue region) manner. Then we flip the matrix along the diagonal and create a symmetric $d \times d$ matrix $M$. The resulting matrix is called the message matrix.

Alternately, the message matrix can be described as a partitioned matrix. The submatrix on the upper left corner is a $k \times k$ symmetric matrix $A_1$, whose entries are the first $k(k+1)/2$ message symbols. The remaining

$$B - \frac{k(k+1)}{2} = \frac{k(2d-k+1)}{2} - \frac{k(k+1)}{2} = k(d-k)$$

message symbols are arranged in the $(d-k) \times k$ submatrix $A_2$.

$$M_{d \times d} = \begin{bmatrix} A_1 & A_2^T \\ A_2 & 0 \end{bmatrix}, \qquad A_1 \text{ is a } k \times k \text{ symmetric matrix } (A_1^T = A_1), \quad A_2 \text{ is } (d-k) \times k.$$

Figure 3: Construction of the message matrix.
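The arrangement of the message symbols can be sketched in a few lines. This is only an illustration: the helper name `message_matrix` is hypothetical, and the filling order (upper triangle of $A_1$ row by row, then $A_2$ row by row) is an assumption chosen to match the $5 \times 5$ matrix in the example of Section 3.

```python
def message_matrix(symbols, k, d):
    """Arrange B = k(2d-k+1)/2 symbols into the symmetric d x d matrix
    M = [[A1, A2^T], [A2, 0]] with A1 symmetric (k x k) and A2 of size (d-k) x k."""
    B = k * (2 * d - k + 1) // 2
    if len(symbols) != B:
        raise ValueError(f"expected {B} symbols, got {len(symbols)}")
    M = [[0] * d for _ in range(d)]
    it = iter(symbols)
    # First k(k+1)/2 symbols: upper triangle of A1, mirrored below the diagonal.
    for r in range(k):
        for c in range(r, k):
            M[r][c] = M[c][r] = next(it)
    # Remaining k(d-k) symbols: A2 in the lower-left block, A2^T in the upper-right.
    for r in range(k, d):
        for c in range(k):
            M[r][c] = M[c][r] = next(it)
    return M


# Symbols 1..14 stand in for a_1..a_14 (k = 4, d = 5).
M = message_matrix(list(range(1, 15)), k=4, d=5)
for row in M:
    print(row)
```

The lower-right $(d-k) \times (d-k)$ block is left as zeros, so the result is symmetric by construction.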
$$\Psi_{n \times d} = \begin{bmatrix} \psi_1^T \\ \psi_2^T \\ \vdots \\ \psi_n^T \end{bmatrix}, \qquad C = \text{the } n \times k \text{ submatrix formed by the first } k \text{ columns of } \Psi.$$

Figure 4: Construction of the matrix $\Psi$.

Let $\Psi$ be an $n \times d$ matrix over $\mathbb{F}_q$ that satisfies the following two properties.

(i) Any $k$ rows of the $n \times k$ submatrix on the left, denoted by $C$ in Figure 4, are linearly independent over $\mathbb{F}_q$.
(ii) Any $d$ rows of $\Psi$ are linearly independent over $\mathbb{F}_q$.

Examples are a Vandermonde matrix and a Cauchy matrix. Node $i$ stores $\psi_i^T M$ for $i = 1, \ldots, n$.

2.2 Repair

Suppose that node $f$ fails. We want to recover $\psi_f^T M$. The new node picks any $d$ helper nodes. Denote their indices by $i_1, i_2, \ldots, i_d$. Node $i_j$ stores $\psi_{i_j}^T M$ (a $1 \times d$ row vector) and sends $\psi_{i_j}^T M \psi_f$ (a scalar symbol in $\mathbb{F}_q$). The new node receives $d$ symbols, which can be concatenated as a column matrix

$$\begin{bmatrix} \psi_{i_1}^T M \psi_f \\ \psi_{i_2}^T M \psi_f \\ \vdots \\ \psi_{i_d}^T M \psi_f \end{bmatrix} = \begin{bmatrix} \psi_{i_1}^T \\ \psi_{i_2}^T \\ \vdots \\ \psi_{i_d}^T \end{bmatrix} M \psi_f. \tag{4}$$

By property (ii), the square matrix with the $\psi_{i_j}^T$ as its rows is invertible, so the new node can compute $M\psi_f$. Using the property that $M$ is symmetric, we can take the transpose of $M\psi_f$ and get

$$(M\psi_f)^T = \psi_f^T M^T = \psi_f^T M.$$

2.3 (n, k)-recovery Property

We remark that this code is not a maximum distance separable (MDS) code, as a node stores more than $B/k$ symbols.
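To make the remark concrete, the short calculation below compares the per-node storage $\alpha = d$ (with $\beta = 1$, as in Section 2.1) with the value $B/k$ that an MDS code would store per node. It uses only the expression for $B$ from (3).

```latex
\alpha - \frac{B}{k} \;=\; d - \frac{2d - k + 1}{2} \;=\; \frac{k - 1}{2} \;>\; 0
\qquad \text{for } k \ge 2 .
```

For instance, with $k = 4$ and $d = 5$ we get $B/k = 14/4 = 3.5$, while each node stores $\alpha = 5$ symbols.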
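A data collector picks any $k$ nodes $i_1, i_2, \ldots, i_k$ and downloads

$$\begin{bmatrix} \psi_{i_1}^T M \\ \psi_{i_2}^T M \\ \vdots \\ \psi_{i_k}^T M \end{bmatrix} = \begin{bmatrix} \psi_{i_1}^T \\ \psi_{i_2}^T \\ \vdots \\ \psi_{i_k}^T \end{bmatrix} M \tag{5}$$
$$= \begin{bmatrix} D & E \end{bmatrix} M \quad (D \text{ is } k \times k \text{ and } E \text{ is } k \times (d-k)) \tag{6}$$
$$= \begin{bmatrix} D & E \end{bmatrix} \begin{bmatrix} A_1 & A_2^T \\ A_2 & 0 \end{bmatrix} \tag{7}$$
$$= \begin{bmatrix} DA_1 + EA_2 & D(A_2)^T \end{bmatrix}. \tag{8}$$

Equation (7) follows from the construction of the message matrix. To recover the file with $B$ symbols, it suffices to recover the matrices $A_1$ and $A_2$. Here are the steps to recover these two matrices from equation (8).

(a) From $D(A_2)^T$, recover $A_2$, since $D$ is invertible by property (i).
(b) Subtracting $EA_2$ from $DA_1 + EA_2$ yields $DA_1$.
(c) Recover $A_1$ from $DA_1$, since $D$ is invertible.

2.4 Remarks

A few remarks on this product-matrix construction:

- General construction for all parameters at MBR points.
- Not repair-by-transfer.
- The new node can pick any $d$ helpers.
- It suffices to choose field size $q > n$. The finite field size scales linearly with the number of storage nodes.

3 An explicit example

We give an example of the product-matrix construction for MBR codes. The code parameters are $n = 7$, $k = 4$, $d = \alpha = 5$, $B = 14$, $\beta = 1$. We check that this attains equality in the min-cut bound

$$14 = B \le \sum_{i=0}^{k-1} \min\big((d-i)\beta,\ \alpha\big) = 5 + 4 + 3 + 2.$$

Pick the finite field $GF(7)$ as the alphabet. (For practical implementation we can choose $GF(8)$.) For $i = 1, 2, \ldots, 7$, let $\psi_i^T$ be the $i$-th row of the Vandermonde matrix

$$\Psi = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 2^2 & 2^3 & 2^4 \\ 1 & 3 & 3^2 & 3^3 & 3^4 \\ 1 & 4 & 4^2 & 4^3 & 4^4 \\ 1 & 5 & 5^2 & 5^3 & 5^4 \\ 1 & 6 & 6^2 & 6^3 & 6^4 \\ 1 & 0 & 0 & 0 & 0 \end{bmatrix}$$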
with calculations carried out in $GF(7)$. Let the message symbols be $a_m$ for $m = 1, 2, \ldots, 14$. In the encoding process, we first put the message symbols in a $5 \times 5$ matrix

$$M = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 & a_{11} \\ a_2 & a_5 & a_6 & a_7 & a_{12} \\ a_3 & a_6 & a_8 & a_9 & a_{13} \\ a_4 & a_7 & a_9 & a_{10} & a_{14} \\ a_{11} & a_{12} & a_{13} & a_{14} & 0 \end{bmatrix}.$$

For $i = 1, 2, \ldots, 7$, node $i$ stores the components of the vector $\psi_i^T M$, namely

$$\begin{aligned} & a_1 + i a_2 + i^2 a_3 + i^3 a_4 + i^4 a_{11}, \\ & a_2 + i a_5 + i^2 a_6 + i^3 a_7 + i^4 a_{12}, \\ & a_3 + i a_6 + i^2 a_8 + i^3 a_9 + i^4 a_{13}, \\ & a_4 + i a_7 + i^2 a_9 + i^3 a_{10} + i^4 a_{14}, \\ & a_{11} + i a_{12} + i^2 a_{13} + i^3 a_{14}. \end{aligned}$$

All arithmetic is performed mod 7.

Suppose that node 1 fails, and we want to repair it from nodes 3, 4, 5, 6, 7. For $i = 3, 4, \ldots, 7$, node $i$ takes the dot product of its stored vector with $\psi_1^T = (1, 1, 1, 1, 1)$, and sends the resulting symbol to the new node. The new node receives 5 symbols, which can be concatenated into a vector

$$\begin{bmatrix} 1 & 3 & 3^2 & 3^3 & 3^4 \\ 1 & 4 & 4^2 & 4^3 & 4^4 \\ 1 & 5 & 5^2 & 5^3 & 5^4 \\ 1 & 6 & 6^2 & 6^3 & 6^4 \\ 1 & 0 & 0 & 0 & 0 \end{bmatrix} M \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}.$$

Since the matrix on the left is invertible, the new node can compute $M (1, 1, 1, 1, 1)^T$, and its transpose is the desired vector.

To illustrate that the message symbols can be decoded from any 4 nodes, suppose that a data collector connects to nodes 1, 2, 3, 4. The data collector obtains the five components of $\psi_i^T M$ listed above for $i = 1, 2, 3, 4$. The last component from each node gives the message symbols $a_{11}, a_{12}, a_{13}, a_{14}$, which can be obtained from

$$\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 2^2 & 2^3 \\ 1 & 3 & 3^2 & 3^3 \\ 1 & 4 & 4^2 & 4^3 \end{bmatrix} \begin{bmatrix} a_{11} \\ a_{12} \\ a_{13} \\ a_{14} \end{bmatrix}.$$
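The encoding, repair and data-collection steps of this example can be checked numerically. The sketch below is a minimal illustration, not a reference implementation: the symbol values $a_m = m \bmod 7$ and the helper names `mat_mul` and `solve_mod_p` are assumptions made for the demonstration. It also performs the subtraction step that completes the decoding, which the text describes next.

```python
P = 7               # field size: GF(7), as in the example
n, k, d = 7, 4, 5   # code parameters of the example


def mat_mul(X, Y, p=P):
    """Matrix product over GF(p); X and Y are lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) % p for col in zip(*Y)]
            for row in X]


def solve_mod_p(A, B, p=P):
    """Solve A X = B over GF(p) by Gauss-Jordan elimination.
    A must be square and invertible modulo the prime p."""
    m = len(A)
    aug = [[x % p for x in ra + rb] for ra, rb in zip(A, B)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if aug[r][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], p - 2, p)  # inverse via Fermat's little theorem
        aug[col] = [x * inv % p for x in aug[col]]
        for r in range(m):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(x - f * y) % p for x, y in zip(aug[r], aug[col])]
    return [row[m:] for row in aug]


# Row i is (1, i, i^2, i^3, i^4) mod 7; node 7 uses the field element 0.
psi = [[pow(i % P, e, P) for e in range(d)] for i in range(1, n + 1)]

# Message matrix with the hypothetical symbol values a_m = m mod 7.
M = [[1, 2, 3, 4, 4],
     [2, 5, 6, 0, 5],
     [3, 6, 1, 2, 6],
     [4, 0, 2, 3, 0],
     [4, 5, 6, 0, 0]]

stored = mat_mul(psi, M)  # stored[i - 1] is the content of node i

# Repair of node 1 from nodes 3..7: each helper sends one scalar.
failed, helpers = 0, [2, 3, 4, 5, 6]
psi_f_col = [[x] for x in psi[failed]]
received = mat_mul([stored[j] for j in helpers], psi_f_col)
M_psi_f = solve_mod_p([psi[j] for j in helpers], received)
repaired = [row[0] for row in M_psi_f]  # transpose of M * psi_f
assert repaired == stored[failed]

# Data collection from nodes 1..4, following steps (a)-(c).
nodes = [0, 1, 2, 3]
Y = [stored[i] for i in nodes]
D = [psi[i][:k] for i in nodes]
E = [psi[i][k:] for i in nodes]
A2T = solve_mod_p(D, [row[k:] for row in Y])   # (a) D * A2^T  ->  A2^T
A2 = [list(col) for col in zip(*A2T)]
EA2 = mat_mul(E, A2)
DA1 = [[(y - e) % P for y, e in zip(yr[:k], er)]
       for yr, er in zip(Y, EA2)]              # (b) subtract E * A2
A1 = solve_mod_p(D, DA1)                       # (c) D * A1  ->  A1
M_rec = ([A1[r] + A2T[r] for r in range(k)]
         + [A2[r] + [0] * (d - k) for r in range(d - k)])
assert M_rec == M
```

Both assertions pass because the helper rows (points 3, 4, 5, 6, 0) and the matrix $D$ (points 1, 2, 3, 4) are Vandermonde matrices with distinct points in $GF(7)$ and are therefore invertible.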
After some subtractions (removing the now-known terms $i^4 a_{11}, \ldots, i^4 a_{14}$ from the first four components of each node), we can compute the remaining message symbols from

$$\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 2^2 & 2^3 \\ 1 & 3 & 3^2 & 3^3 \\ 1 & 4 & 4^2 & 4^3 \end{bmatrix} \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ a_2 & a_5 & a_6 & a_7 \\ a_3 & a_6 & a_8 & a_9 \\ a_4 & a_7 & a_9 & a_{10} \end{bmatrix}.$$

References

[1] K. Rashmi, N. B. Shah, and P. V. Kumar, "Optimal exact-regenerating codes for distributed storage at the MSR and MBR points via a product-matrix construction," IEEE Trans. Inf. Theory, vol. 57, no. 8, pp. 5227-5239, Aug. 2011.
[2] S. El Rouayheb and K. Ramchandran, "Fractional repetition codes for repair in distributed storage systems," presented at the 48th Annu. Allerton Conf. Control, Comput., Commun., Urbana-Champaign, IL, Sep. 2010.