Error-correcting codes and applications

Error-correcting codes and applications November 20, 2017

Summary and notation
Consider
- $\mathbb{F}_q$: a finite field (if $q = 2$, then $\mathbb{F}_q$ is the binary field),
- $V = V(\mathbb{F}_q, n)$: a vector space over $\mathbb{F}_q$ of dimension $n$ (so vectors have length $n$),
- $U = U(\mathbb{F}_q, k)$: a vector space over $\mathbb{F}_q$ of dimension $k$ (so vectors have length $k$),
- $G$: a $k \times n$ matrix over $\mathbb{F}_q$ of rank $k$.
The set of vectors (codewords) $C = \{uG : u \in U\} \subseteq V$ forms a linear error-correcting code. The code has alphabet $\mathbb{F}_q$, length $n$, dimension $k$, information rate $k/n$, and generator matrix $G$.

Summary and notation
The code $C = \{uG : u \in U\} \subseteq V$ is the image of the vector space $U$ under the linear transformation defined by $G$: $T_G : U \to V$. Note that $C$ is a subspace of $V$ of dimension $k$ (the same as $U$). The vector space $U$ is the set of vectors representing information.
The code's minimum distance $d$ is the minimum (Hamming) distance between any two distinct codewords. To correct many errors, the minimum distance $d$ should be large. Constructing a good code therefore means selecting codewords that are far away from each other.
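As an illustration of these definitions, the following minimal sketch enumerates a small binary linear code and computes its minimum distance. The generator matrix is that of the binary [7,4] Hamming code, chosen here only as an example; the helper names `encode` and `hamming_distance` are hypothetical and not part of the slides.

```python
from itertools import product

# Generator matrix G (k = 4, n = 7) of the binary [7,4] Hamming code in
# systematic form. This particular code is an assumption made for illustration.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def encode(u, G):
    """Return the codeword uG over F_2 for an information vector u of length k."""
    n = len(G[0])
    return tuple(sum(u[i] * G[i][j] for i in range(len(G))) % 2 for j in range(n))

def hamming_distance(a, b):
    """Number of positions in which the vectors a and b differ."""
    return sum(x != y for x, y in zip(a, b))

k, n = len(G), len(G[0])

# C = {uG : u in U}: the image of U = F_2^k under the map defined by G.
codewords = [encode(u, G) for u in product((0, 1), repeat=k)]

# Minimum distance: smallest distance between two distinct codewords.
d = min(
    hamming_distance(a, b)
    for i, a in enumerate(codewords)
    for b in codewords[i + 1:]
)

print(f"length n={n}, dimension k={k}, rate={k / n:.3f}, minimum distance d={d}")
```

Because the code is linear, the same value of d could also be obtained as the smallest weight of a nonzero codeword, which avoids comparing all pairs.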

Applications of linear error-correcting codes
Examples of different functionalities of linear block codes:
- Forward error correction (channel coding)
- Cryptography (McEliece-type cryptosystems)
- Group testing (compressed sensing and identifying codes)

1 Forward error correction/channel coding
   General overview
   Data storage
   Pseudonymization in medical research
2 Cryptography
   Reminder of public key cryptography
   Code-based public-key cryptography

Forward error correction
Information is transmitted over a noisy channel. There are applications where resending lost information is expensive or impossible. Then we use forward error correction, also called channel coding.
Examples:
- Space communications. When we notice data is lost, it is too late to resend.
- Wireless communications. When we notice data is lost, we already need the next data block.
- Data storage. When we notice data is lost, there is no other copy to recover it from.

Codes for forward error correction: design criteria
What is important to consider?
- The channel (erasures/errors, fading, error rate, etc.).
- Accepted delay (voice conversation: short delay; email: long delay).
- Computing resources (how long coding/decoding will take).
From this, make choices of code parameters such as:
- Alphabet (typically a finite field $\mathbb{F}_q$ where $q$ is a power of 2).
- How many errors to correct in each block.
- Block length.
- Size of codebook (number of codewords).
- Available coding/decoding algorithms.

Typical research questions
So what kinds of questions are interesting when studying forward error correction?
- Give upper bounds for:
  - the amount of information that can be transmitted over the channel (e.g. the Shannon limit),
  - the number of codewords of a code given its error-correcting capacity and the length of its codewords (e.g. the Singleton bound: $d \le n - k + 1$).
- Construct codes approaching/attaining these bounds.
- Construct fast coding/decoding algorithms.
Main tools: combinatorics, geometry and linear algebra.
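The Singleton bound mentioned above has a short counting argument, sketched here for a code $C$ with $q^k$ codewords of length $n$ and minimum distance $d$. This is the standard puncturing argument, added as a reminder rather than taken from the slides.

```latex
% Delete the same d-1 coordinates from every codeword of C.
% Two distinct codewords differ in at least d positions, so after deleting
% d-1 coordinates they still differ in at least one position. Hence the
% shortened words are pairwise distinct vectors of length n-(d-1):
\[
  |C| = q^{k} \;\le\; q^{\,n-(d-1)}
  \quad\Longrightarrow\quad
  k \le n - d + 1
  \quad\Longleftrightarrow\quad
  d \le n - k + 1 .
\]
```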

Data storage (one disk)
When data is stored (on disks, tapes, etc.), the channel is the storage medium.
Example. Error correction for compact discs (CDs) often uses Reed-Solomon codes. Reed-Solomon codes are MDS codes. MDS codes are codes which attain the Singleton bound $d \le n - k + 1$ with equality, so they are optimal in some sense.
Example. Error correction for Micro-SD cards often uses BCH codes. BCH codes are cyclic codes: it is easy to prescribe their error-correcting capacity, and decoding is easy.

Data storage (multiple disks)
Modern disks are cheap but error-prone. Solution: use $n$ disks and store $n$ copies of the data. That's a repetition code! Every symbol is repeated $n$ times. Information rate: $1/n$. We can do better by using, for example, an MDS code.
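The repetition scheme can be sketched in a few lines. The following is a minimal illustration, assuming that a failed disk returns a wrong symbol and that the stored symbol is recovered by majority vote; the function names `rep_encode` and `rep_decode` are hypothetical.

```python
from collections import Counter

def rep_encode(symbol, n):
    """Store one copy of the symbol on each of n disks."""
    return [symbol] * n

def rep_decode(received):
    """Majority vote: return the most frequent symbol among the copies."""
    return Counter(received).most_common(1)[0][0]

n = 5
stored = rep_encode("A", n)

# Assume two of the five disks return corrupted symbols.
stored[1] = "X"
stored[4] = "Q"

recovered = rep_decode(stored)
rate = 1 / n  # one information symbol per n stored symbols

print(recovered, rate)
```

With n = 5 copies, the majority vote survives two corrupted disks, but only one fifth of the storage carries information, which is why the slides suggest an MDS code instead.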

Storage codes: general setting
- Linear error-correcting code of length $n$.
- Assign one symbol of the codeword to each of the $n$ disks.
- Suppose one disk fails: this is an error in the codeword.
- If the code can correct $e$ errors, then the data will survive $e$ disk failures.

MDS codes attain the Singleton bound $d \le n - k + 1$ with equality, so $d = n - k + 1$. They can correct $\lfloor (n-k)/2 \rfloor$ errors or $n - k$ erasures (i.e. errors whose positions are known). An MDS code of dimension $k = 5$ and length $n = 8$ has minimum distance $d = n - k + 1 = 8 - 5 + 1 = 4$. It corrects $n - k = 3$ erasures.
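To make the example with $n = 8$ and $k = 5$ concrete, here is a minimal sketch of erasure recovery with a Reed-Solomon code, which is an MDS code. The field $\mathbb{F}_{11}$, the evaluation points 0..7 and all function names are assumptions chosen for illustration; a real storage system would typically work over a field of size $2^m$.

```python
P = 11          # prime field size, assumed for illustration (must be >= N)
N, K = 8, 5     # code length and dimension from the example above
POINTS = list(range(N))  # one evaluation point per disk

def poly_eval(coeffs, x):
    """Evaluate coeffs[0] + coeffs[1]*x + ... modulo P (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def rs_encode(message):
    """Codeword = values of the message polynomial (degree < K) at the N points."""
    return [poly_eval(message, x) for x in POINTS]

def interpolate_at(known, x):
    """Lagrange interpolation modulo P through the (xi, yi) pairs in known."""
    total = 0
    for xi, yi in known:
        num, den = 1, 1
        for xj, _ in known:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def recover_erasures(received):
    """Rebuild the full codeword; erased positions are marked with None."""
    known = [(x, y) for x, y in zip(POINTS, received) if y is not None]
    if len(known) < K:
        raise ValueError("more than N - K erasures: data cannot be recovered")
    known = known[:K]  # any K surviving symbols determine the polynomial
    return [interpolate_at(known, x) for x in POINTS]

message = [3, 1, 4, 1, 5]
codeword = rs_encode(message)

damaged = list(codeword)
for failed_disk in (0, 3, 6):   # three disk failures at known positions
    damaged[failed_disk] = None

print(recover_erasures(damaged) == codeword)
```

Any 5 of the 8 symbols determine the degree-at-most-4 message polynomial, which is exactly why 3 erasures can be tolerated.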

Storage codes: other functionalities
In cloud storage, the provider keeps track of failing disks. Before $e$ disk failures occur, the provider has to recode the information so that it is not lost. Downloading the coded information, decoding, recoding and uploading is inefficient.
Solution: regenerating codes. Regenerating codes allow a lost symbol/disk to be repaired by contacting a small number of neighboring symbols/disks.

Why pseudonymization?
Laws and regulations require data collected in distinct medical research projects to be pseudonymized before being combined. A pseudonym is an identifier used to keep track of the data of an individual. A pseudonym is different from a name or a personal number in that it should give no reference to the individual (my personal number is almost public information).

Example: German PIDs
- German Federal Data Protection Act: replace identifying patient characteristics by a label to preclude identification of a patient by unauthorized persons.
- Clinical trials: names, initials and birth dates are not allowed.
- Medicinal Products Act: only pseudonymized data may be passed to the sponsor.

Example: German PIDs
Design challenges of a patient identifier (PID):
- Distinguish 1 billion individuals by a unique PID.
- Medical personnel are in a hurry and make lots of typos.
Consequences:
- False classification of patients.
- Possible severe consequences in diagnosis and therapy.
- Increasing variance in statistical analyses.
Solution: error-correcting pseudonyms (PIDs)!

Example: German PIDs
German researchers suggest an $[8,6,3]_{32}$ MDS code (length 8, dimension 6, minimum distance 3, over an alphabet of 32 characters):
- Minimum distance 3.
- Optimal code (attains the Singleton bound).
- Detects 2 errors.
- Corrects 1 error.
- Corrects transposition of adjacent distinct characters.
Homework: This is forward error correction. What is the channel in this application?
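The parameters of this code can be checked against the design goals with a short calculation. The block below is a worked sketch using only the values $n = 8$, $k = 6$, $d = 3$ and alphabet size 32 given above, together with the standard relations between minimum distance and error detection/correction.

```latex
\begin{align*}
  \text{number of PIDs:}\quad & 32^{6} = 2^{30} = 1\,073\,741\,824 \;>\; 10^{9}, \\
  \text{Singleton bound:}\quad & n - k + 1 = 8 - 6 + 1 = 3 = d \quad\text{(attained, so the code is MDS)}, \\
  \text{errors detected:}\quad & d - 1 = 2, \\
  \text{errors corrected:}\quad & \left\lfloor \tfrac{d-1}{2} \right\rfloor = 1 .
\end{align*}
```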

Asymmetric/public key cryptography
Secrecy: Alice encrypts (transforms) the message into something only Bob can decrypt (read).
- Symmetric key cryptography: Alice and Bob use the same key for encryption and decryption.
- Asymmetric key cryptography: Alice and Bob have different keys. The public key is for encryption (Alice can get it, for example, from Bob's homepage). The secret key is for decryption (Bob keeps it secret).
Calculating the secret key from the public key must be a COMPUTATIONALLY HARD PROBLEM.
Example: RSA is based on the hard problem of integer factorization. (Integer factorization is not so hard if you have a quantum computer... and it is not believed to be NP-complete.)
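As a reminder of how the public/secret key split works in RSA, here is a toy sketch with deliberately tiny primes. The numbers (p = 61, q = 53, e = 17) are a common textbook choice and are assumptions for illustration only; real RSA uses primes of more than a thousand bits together with padding, none of which is shown here.

```python
# Bob's key generation (toy parameters, insecure by design).
p, q = 61, 53                 # secret primes
n = p * q                     # public modulus
phi = (p - 1) * (q - 1)       # Euler's phi(n); easy to compute only if p, q are known
e = 17                        # public exponent, coprime to phi
d = pow(e, -1, phi)           # secret exponent: inverse of e modulo phi (Python 3.8+)

def encrypt(m):
    """Alice encrypts with Bob's public key (n, e)."""
    return pow(m, e, n)

def decrypt(c):
    """Bob decrypts with his secret key d."""
    return pow(c, d, n)

message = 65
ciphertext = encrypt(message)
recovered = decrypt(ciphertext)

print(ciphertext, recovered)
```

Recovering d from the public key (n, e) requires phi, and computing phi is as hard as factoring n into p and q, which is the hard problem referred to above.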

A hard problem in coding theory
The decoding problem: given a received vector $v$ and a code $C$, find the closest codeword $c \in C$.
The general decoding problem for linear codes is NP-complete. In other words: the existence of a polynomial time algorithm for decoding any linear code would imply that $P = NP$. (The most important problem in computer science?)
There is an algorithm (of exponential complexity) which decodes a received vector $v$ to the correct codeword by searching a set of exponentially many (in terms of the length of $v$) candidate codewords. If $P \ne NP$, then we can't do much better!
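The exponential-time search described above can be written down directly. The sketch below assumes the binary [7,4] Hamming code as the code $C$ (an illustrative choice) and tries all $2^k$ information vectors; the function name `nearest_codeword` is hypothetical.

```python
from itertools import product

# Generator matrix of the binary [7,4] Hamming code (illustrative assumption).
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def encode(u):
    """Return the codeword uG over F_2."""
    return tuple(sum(ui * row[j] for ui, row in zip(u, G)) % 2
                 for j in range(len(G[0])))

def nearest_codeword(v):
    """Exhaustive search: try all 2^k messages, keep the closest codeword."""
    k = len(G)
    best_u = min(
        product((0, 1), repeat=k),
        key=lambda u: sum(a != b for a, b in zip(encode(u), v)),
    )
    return best_u, encode(best_u)

u = (1, 0, 1, 1)
c = encode(u)

# The channel flips one bit of the transmitted codeword.
v = list(c)
v[2] ^= 1

decoded_u, decoded_c = nearest_codeword(v)
print(decoded_u == u, decoded_c == c)
```

The loop visits $2^k$ candidates, so it is only feasible for tiny codes; for $k$ in the hundreds or thousands, as used in cryptography, this search is hopeless.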

Decoding algorithms for special codes
If all decoding algorithms were of exponential complexity, coding theory would be pretty useless. For many families of codes there are efficient (polynomial time) decoding algorithms.
A given family of codes becomes interesting because of the coding and decoding algorithms it features. It is the algebraic and geometric structure of the code which decides what algorithms are useful.
Example. Binary Goppa codes can be decoded using the Patterson algorithm.

McEliece cryptosystem
Public key: a matrix and an integer $(\hat{G}, t)$.
- The matrix $\hat{G}$ is the generator matrix of a linear code $\hat{C}$, obtained from a generator matrix $G$ of a Goppa code $C$ as $\hat{G} = SGP$, where $S$ and $P$ are randomly chosen invertible matrices ($S$ is $k \times k$, and $P$ is an $n \times n$ permutation matrix, so applying $P^{-1}$ does not change the number of errors).
- $t$ is the number of errors that $C$ (and $\hat{C}$) can correct.
- We can regard $\hat{G}$ as a scrambled version of the generator matrix $G$ of the Goppa code $C$.
Private key: the three matrices $(S, G, P)$.
Encryption: Alice encodes the message vector $m$ as $x = m\hat{G} + e_t$, where $e_t$ is an error vector with $t$ errors.
Decryption: Bob has the matrices $S$ and $P$ which scrambled $G$ to $\hat{G}$. They define invertible linear transformations $T_S$ and $T_P$. He uses the inverse transformations: the Patterson algorithm removes the error from $T_P^{-1}(x)$, which gives the codeword encoding $mS$, and he then recovers $m$ by applying $T_S^{-1}$.
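The structure of the scheme can be illustrated with a toy sketch. Everything specific below is an assumption for illustration: the binary [7,4] Hamming code with $t = 1$ stands in for a Goppa code, syndrome decoding stands in for the Patterson algorithm, and the function names are hypothetical. The parameters are far too small to offer any security.

```python
import random

# Secret code: binary [7,4] Hamming code, G = [I | A], H = [A^T | I], t = 1.
G = [[1, 0, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]

def mat_mul(A, B):
    """Matrix product over F_2."""
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*B)]
            for row in A]

def gf2_inverse(M):
    """Gauss-Jordan inverse over F_2; returns None if M is singular."""
    n = len(M)
    aug = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col]:
                aug[r] = [a ^ b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def keygen(rng):
    """Return ((G_hat, t), (S_inv, perm)); perm encodes the permutation P."""
    k, n = len(G), len(G[0])
    while True:  # draw random S until it is invertible
        S = [[rng.randint(0, 1) for _ in range(k)] for _ in range(k)]
        S_inv = gf2_inverse(S)
        if S_inv is not None:
            break
    perm = list(range(n))
    rng.shuffle(perm)
    SG = mat_mul(S, G)
    # Column j of G_hat is column perm[j] of SG, i.e. G_hat = S G P.
    G_hat = [[row[perm[j]] for j in range(n)] for row in SG]
    return (G_hat, 1), (S_inv, perm)

def encrypt(m, public_key, rng):
    """x = m * G_hat + e_t with a random error vector of weight t."""
    G_hat, t = public_key
    x = mat_mul([m], G_hat)[0]
    for pos in rng.sample(range(len(x)), t):
        x[pos] ^= 1
    return x

def decrypt(x, private_key):
    S_inv, perm = private_key
    y = [0] * len(x)
    for j, bit in enumerate(x):      # undo the permutation: y = x P^{-1}
        y[perm[j]] = bit
    syndrome = [sum(h * b for h, b in zip(row, y)) % 2 for row in H]
    if any(syndrome):                # syndrome = column of H at the error position
        columns = [list(col) for col in zip(*H)]
        y[columns.index(syndrome)] ^= 1
    mS = y[:len(S_inv)]              # G is systematic: first k symbols are mS
    return mat_mul([mS], S_inv)[0]   # undo S

rng = random.Random(2017)
public_key, private_key = keygen(rng)
m = [1, 0, 1, 1]
x = encrypt(m, public_key, rng)
print(decrypt(x, private_key) == m)
```

Here the private key is stored as $S^{-1}$ and the permutation, with $G$ and $H$ kept as module-level constants; this is a convenience of the sketch rather than how the triple $(S, G, P)$ has to be represented.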

McEliece cryptosystem
The McEliece cryptosystem was one of the first non-deterministic cryptosystems: encryption uses a randomly chosen error vector.
Advantage: immune to Shor's algorithm (which makes integer factorization a not-so-hard problem for quantum computers).
Drawback: large key (for post-quantum security: about 1 MB).

Thank you!