Outline. Computer Science 418. Number of Keys in the Sum. More on Perfect Secrecy, One-Time Pad, Entropy. Mike Jacobson. Week 3


Outline: Computer Science 418, More on Perfect Secrecy, One-Time Pad, Entropy. Mike Jacobson, Department of Computer Science, University of Calgary. Week 3.

Mike Jacobson (University of Calgary) Computer Science 418 Week 3

Number of Keys in the Sum

Recall that perfect secrecy is equivalent to p(M|C) = p(M) for all messages M and all ciphertexts C that occur. How can we determine p(C|M) and p(C)?

For any message M ∈ M, we have

    p(C|M) = Σ_{K ∈ K : E_K(M) = C} p(K).

That is, p(C|M) is the sum of the probabilities p(K) over all those keys K ∈ K that encipher M to C.

Usually there is at most one key K with E_K(M) = C for given M and C. However, some ciphers can transform the same plaintext into the same ciphertext under different keys. A monoalphabetic substitution cipher will transform a message into the same ciphertext under different keys if the only differences between the keys occur at characters which do not appear in the message (e.g. key1 = ECONOMICS, key2 = ECONOMY, and we encrypt a message of at most 6 characters).

Example: Computing p(C|M)

Description of E_K: let M = {a, b}, K = {K1, K2, K3}, and C = {1, 2, 3, 4}. Encryption is given by the following table:

    Key   M = a   M = b
    K1    C = 1   C = 2
    K2    C = 2   C = 3
    K3    C = 3   C = 4

Thus,

    p(1|a) = p(K1),   p(1|b) = 0,
    p(2|a) = p(K2),   p(2|b) = p(K1),
    p(3|a) = p(K3),   p(3|b) = p(K2),
    p(4|a) = 0,       p(4|b) = p(K3).

Consider a fixed key K. The mathematical description of the set of all possible encryptions (of any plaintext) under this key K is exactly the image of E_K, i.e. the set E_K(M) = {E_K(M) | M ∈ M}. In the previous example, we have E_K1(M) = {1, 2}, E_K2(M) = {2, 3}, E_K3(M) = {3, 4}.

Computation of p(C)

For a key K and ciphertext C ∈ E_K(M), consider the probability p(D_K(C)) that the message M = D_K(C) was sent. Then

    p(C) = Σ_{K ∈ K : C ∈ E_K(M)} p(K) p(D_K(C)).

That is, p(C) is the sum over all those keys K ∈ K under which C has a decryption, each weighted by the probability that that key K was chosen.

Example, cont.

The respective probabilities of the four ciphertexts 1, 2, 3, 4 are:

    p(1) = p(K1)p(a),
    p(2) = p(K1)p(b) + p(K2)p(a),
    p(3) = p(K2)p(b) + p(K3)p(a),
    p(4) = p(K3)p(b).

If we assume that every key and every message is equally probable, i.e. p(K1) = p(K2) = p(K3) = 1/3 and p(a) = p(b) = 1/2, then

    p(1) = (1/3)(1/2) = 1/6,
    p(2) = (1/3)(1/2) + (1/3)(1/2) = 1/3,
    p(3) = (1/3)(1/2) + (1/3)(1/2) = 1/3,
    p(4) = (1/3)(1/2) = 1/6.

Note that p(1|a) = p(K1) = 1/3 ≠ 1/6 = p(1), so this system does not provide perfect secrecy.
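The toy example above can be checked mechanically. The following sketch (not from the slides; the function names are our own) computes p(C|M) and p(C) for the three-key cipher using exact fractions:

```python
from fractions import Fraction

# Toy cipher from the example: three keys, two messages, four ciphertexts.
# enc[key][message] = ciphertext
enc = {
    "K1": {"a": 1, "b": 2},
    "K2": {"a": 2, "b": 3},
    "K3": {"a": 3, "b": 4},
}

p_key = {"K1": Fraction(1, 3), "K2": Fraction(1, 3), "K3": Fraction(1, 3)}
p_msg = {"a": Fraction(1, 2), "b": Fraction(1, 2)}

def p_c_given_m(c, m):
    """p(C|M): sum of p(K) over all keys K with E_K(M) = C."""
    return sum((p_key[k] for k in enc if enc[k][m] == c), Fraction(0))

def p_c(c):
    """p(C): sum of p(K) * p(D_K(C)) over all keys under which C decrypts."""
    total = Fraction(0)
    for k, table in enc.items():
        for m, ct in table.items():
            if ct == c:
                total += p_key[k] * p_msg[m]
    return total
```

Running it confirms p(1|a) = 1/3 while p(1) = 1/6, so the scheme is not perfectly secret.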

Necessary Condition for Perfect Secrecy

Theorem 1: If a cryptosystem has perfect secrecy, then |K| ≥ |M|.

Informal argument: suppose |K| < |M|. Then there is some message M such that for a given ciphertext C, no key K encrypts M to C. This means that the sum defining p(C|M) is empty, so p(C|M) = 0. But p(C) > 0 for all ciphertexts of interest, so p(C|M) ≠ p(C), and hence no perfect secrecy. (The cryptanalyst could eliminate certain possible plaintext messages from consideration after receiving a particular ciphertext.)

Proof of the Theorem

Assume perfect secrecy. Fix a ciphertext C0 that is the encryption of some message under some key (i.e. actually occurs as a ciphertext). We first claim that p(C0) > 0. Since C0 occurs as a ciphertext, C0 = E_K0(M0) for some key K0 and message M0, so by perfect secrecy,

    p(C0) = p(C0|M0) = Σ_{K ∈ K : E_K(M0) = C0} p(K)
          = p(K0) + possibly other terms in the sum
          ≥ p(K0) > 0.

Proof, cont.

Again by perfect secrecy, for every other message M ∈ M,

    0 < p(C0) = p(C0|M) = Σ_{K ∈ K : E_K(M) = C0} p(K).

In other words, for every M, there is at least one non-zero term in that sum, i.e. there exists at least one key K that encrypts M to C0. Moreover, different messages that encrypt to C0 must do so under different keys (as E_K(M1) = E_K(M2) implies M1 = M2). So we have at least as many keys as messages.

(Formally, consider the set K0 = {K ∈ K : C0 ∈ E_K(M)} ⊆ K. Then we have shown that the map K0 → M via K ↦ M, where E_K(M) = C0, is well-defined and surjective. Hence |K| ≥ |K0| ≥ |M|.)

Shannon's Theorem

Theorem 2 (Shannon's Theorem, 1949/50): A cryptosystem with |M| = |K| = |C| has perfect secrecy if and only if p(K) = 1/|K| (i.e. every key is chosen with equal likelihood) and for every M ∈ M and every C ∈ C, there exists a unique key K ∈ K such that E_K(M) = C.

Proof: see Theorem 2.8, p. 38, in Katz & Lindell.
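As an illustration of Shannon's conditions (a sketch, not from the slides): the shift cipher on single letters, with M = K = C = Z_26 and a uniformly random key, satisfies them, since the unique key taking M to C is K = (C - M) mod 26. The check below verifies the unique-key condition exhaustively:

```python
# Shift cipher over Z_26: E_K(M) = (M + K) mod 26, with M = K = C = Z_26.
# Shannon's theorem requires |M| = |K| = |C|, uniformly chosen keys, and
# exactly one key taking each plaintext to each ciphertext.

N = 26

def encrypt(m, k):
    return (m + k) % N

def keys_mapping(m, c):
    """All keys K with E_K(m) = c (should always be a single key)."""
    return [k for k in range(N) if encrypt(m, k) == c]

# Check the unique-key condition for every (M, C) pair.
unique = all(len(keys_mapping(m, c)) == 1 for m in range(N) for c in range(N))
```

So a shift cipher with a fresh uniform key is perfectly secret for a single letter; it fails, of course, as soon as the same key encrypts more than one letter.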

One-Time Pad

Fix a string length n. Then {0,1}^n is the set of bit strings (i.e. strings of 0s and 1s) of length n.

The one-time pad is generally attributed to Vernam (1917, WWI), who patented it, but recent research suggests the technique may have been used as early as 1882; in any case, it originated long before Shannon. It is the only substitution cipher that does not fall to statistical analysis.

Definition 1 (bitwise exclusive or, XOR)

For a, b ∈ {0,1}, we define

    a ⊕ b = a + b (mod 2) = 0 if a = b, 1 if a ≠ b.

For A = (a1, a2, ..., an), B = (b1, b2, ..., bn) ∈ {0,1}^n, we then define

    A ⊕ B = (a1 ⊕ b1, a2 ⊕ b2, ..., an ⊕ bn)

(component-wise XOR).

The One-Time Pad

Definition 2 (Vernam one-time pad)

Let M = C = K = {0,1}^n (bit strings of some fixed length n). Encryption of M ∈ {0,1}^n under key K ∈ {0,1}^n is bitwise XOR, i.e. C = M ⊕ K. Decryption of C under K is done the same way, i.e. M = C ⊕ K, since K ⊕ K = (0, 0, ..., 0).

Security of the One-Time Pad

Theorem 3: The one-time pad provides perfect secrecy if each key is chosen with equal likelihood. Under this assumption, each ciphertext occurs with equal likelihood (regardless of the probability distribution on the plaintext space).

This means that in the one-time pad, any given ciphertext can be decrypted to any plaintext with equal likelihood (definition of perfect secrecy). There is no meaningful decryption, so even exhaustive search doesn't help.
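A minimal byte-level sketch of the one-time pad (not from the slides; it uses Python's `secrets` module to draw a uniformly random key):

```python
import secrets

def otp_encrypt(message: bytes, key: bytes) -> bytes:
    """Vernam one-time pad: C = M XOR K, with the key as long as the message."""
    assert len(key) == len(message), "key must be as long as the message"
    return bytes(m ^ k for m, k in zip(message, key))

# Decryption is the same operation, since K XOR K = (0, ..., 0).
otp_decrypt = otp_encrypt

msg = b"ATTACK AT DAWN"
key = secrets.token_bytes(len(msg))  # uniformly random key, used once
ct = otp_encrypt(msg, key)
```

Note that encryption and decryption are literally the same function, which is exactly the K ⊕ K = 0 observation in Definition 2.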

Proof of the Theorem

Proof of Theorem 3. We have |M| = |C| = |K| = 2^n, and for every M, C ∈ {0,1}^n, there exists a unique key K that encrypts M to C, namely K = M ⊕ C. By Shannon's Theorem (Theorem 2), we have perfect secrecy.

Now let M, C ∈ {0,1}^n be arbitrary. Then by perfect secrecy,

    p(C) = p(C|M) = Σ_{K ∈ {0,1}^n : M ⊕ K = C} p(K).

Now p(K) = 2^(-n) for all keys K, and the sum has only one term (corresponding to the unique key K = M ⊕ C). Hence p(C) = 2^(-n) for every C ∈ {0,1}^n.

Cryptanalysis of the One-Time Pad

It is imperative that each key is only used once:

- The one-time pad immediately falls to a KPA: if a plaintext/ciphertext pair (M, C) is known, then the key is K = M ⊕ C.
- Suppose K were used twice: C1 = M1 ⊕ K and C2 = M2 ⊕ K imply C1 ⊕ C2 = M1 ⊕ M2. Note that C1 ⊕ C2 = M1 ⊕ M2 is just a coherent running key cipher (adding two coherent texts, M1 and M2), which as we have seen is insecure.
- For the same reason, we can't use shorter keys and re-use portions of them.

Keys must be randomly chosen and at least as long as messages. This makes the one-time pad impractical.

Practical Issues

Main disadvantages of the one-time pad: it requires a random key which is as long as the message, and each key can be used only once. One-time schemes are used when perfect secrecy is crucial and practicality is less of a concern, for example the Moscow-Washington hotline. The major problem with the one-time pad is the cost.

One-Time Pad: Conclusion

As a result, we generally rely on computationally secure ciphers. These ciphers would succumb to exhaustive search, because there is a unique meaningful decipherment; the computational difficulty of finding this solution foils the cryptanalyst. A proof of security does not exist for any proposed computationally secure system.
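The key-reuse attack can be demonstrated directly. In this sketch (the key bytes and messages are made up for illustration), XORing the two ciphertexts cancels the key, so C1 ⊕ C2 = M1 ⊕ M2 and matching plaintext positions leak as zero bytes:

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Component-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

# One key reused for two messages (exactly what the one-time pad forbids).
key = bytes([0x5F, 0x13, 0xA9, 0x77, 0x02, 0xC4, 0x8E, 0x31, 0x55, 0x90])
m1 = b"HELLO BOB."
m2 = b"HELLO EVE."

c1 = xor(m1, key)
c2 = xor(m2, key)

# The key cancels: C1 XOR C2 = M1 XOR M2, leaking plaintext structure.
leak = xor(c1, c2)
```

The attacker never sees the key, yet `leak` reveals where the two plaintexts agree; with coherent text, standard running-key techniques then recover both messages.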

Measuring Information

Recall that information theory captures the amount of information in a piece of text. It is measured by the average number of bits needed to encode all possible messages in an optimal prefix-free encoding:

- optimal: the average number of bits is as small as possible;
- prefix-free: no code word is the beginning of another code word (e.g. we can't have code words 0 and 01).

Formally, the amount of information in an outcome is measured by the entropy of the outcome (a function of the probability distribution over the set of possible outcomes).

Example

The four messages UP, DOWN, LEFT, RIGHT could be encoded in the following ways:

    Encoding                              UP      DOWN    LEFT    RIGHT    Size
    String (5-char string, 8-bit ASCII)   "UP"    "DOWN"  "LEFT"  "RIGHT"  40 bits
    Character (8-bit ASCII)               U       D       L       R        8 bits
    Numeric (2-byte integer)              1       2       3       4        16 bits
    Binary                                00      01      10      11       2 bits

Coding Theory

In the example, all encodings carry the same information (which we will be able to measure), but some are more efficient (in terms of the number of bits required) than others. Note: Huffman encoding can be used to improve on the above example if the directions occur with different probabilities. This branch of mathematics is called coding theory (and has nothing to do with the term "code" defined previously).

Definition 3

Let X be a random variable taking on the values X1, X2, ..., Xn with a probability distribution p(X1), p(X2), ..., p(Xn), where Σ_{i=1}^n p(Xi) = 1. The entropy of X is defined by the weighted average

    H(X) = Σ_{i : p(Xi) ≠ 0} p(Xi) log2(1/p(Xi)) = - Σ_{i : p(Xi) ≠ 0} p(Xi) log2 p(Xi).
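Definition 3 translates directly into code. A small sketch (the function name `entropy` is our own), skipping the zero-probability terms exactly as the definition does:

```python
from math import log2

def entropy(probs):
    """Shannon entropy H(X) = -sum p_i * log2(p_i), skipping zero terms."""
    assert abs(sum(probs) - 1.0) < 1e-9, "probabilities must sum to 1"
    return -sum(p * log2(p) for p in probs if p > 0)
```

For instance, `entropy([0.5, 0.5])` gives the 1 bit of a fair coin, matching Example 3 below.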

Intuition

An event occurring with probability 2^(-n) can be optimally encoded with n bits. More generally, an event occurring with probability p can be optimally encoded with log2(1/p) = -log2(p) bits. The weighted sum H(X) is therefore the expected number of bits (i.e. the amount of information) in an optimal encoding of X (i.e. one that minimizes the number of bits required).

If X1, X2, ..., Xn are outcomes (e.g. plaintexts, ciphertexts, keys) occurring with respective probabilities p(X1), p(X2), ..., p(Xn), then H(X) is the amount of information conveyed about these outcomes.

Example 1

Suppose n > 1 and p(Xi) > 0 for all i. Then

    0 < p(Xi) < 1 (i = 1, 2, ..., n)  implies  1/p(Xi) > 1  implies  log2(1/p(Xi)) > 0,

hence H(X) > 0 if n > 1. If there are at least 2 outcomes, both occurring with nonzero probability, then either one of them conveys information.

Example 2

Suppose n = 1. Then p(X1) = 1, so log2(1/p(X1)) = log2 1 = 0 and H(X) = 0. One single possible outcome conveys no new information (you already know what it's going to be). In fact, for arbitrary n, H(X) = 0 if and only if p_i = 1 for exactly one i and p_j = 0 for all j ≠ i.

Example 3

Suppose there are two possible outcomes which are equally likely:

    p(heads) = p(tails) = 1/2,   H(X) = (1/2) log2 2 + (1/2) log2 2 = 1.

Seeing either outcome conveys exactly 1 bit of information (heads or tails).

Example 4

Suppose we have p(UP) = 1/2, p(DOWN) = 1/4, p(LEFT) = 1/8, p(RIGHT) = 1/8. Then

    H(X) = (1/2) log2 2 + (1/4) log2 4 + (1/8) log2 8 + (1/8) log2 8
         = 1/2 + 2/4 + 3/8 + 3/8 = 14/8 = 7/4 = 1.75.

An optimal prefix-free (Huffman) encoding is UP = 0, DOWN = 10, LEFT = 110, RIGHT = 111. Because UP is more probable than the other messages, receiving UP conveys less information than receiving one of the other messages. The average amount of information received is 1.75 bits.

Example 5

Suppose we have n outcomes which are equally likely: p(Xi) = 1/n. Then

    H(X) = Σ_{i=1}^n (1/n) log2 n = log2 n.

So if all outcomes are equally likely, then H(X) = log2 n. If n = 2^k (e.g. each outcome is encoded with k bits), then H(X) = k.

Application to Cryptography

For plaintext space M, H(M) measures the uncertainty of plaintexts. It gives the amount of partial information that must be learned about a message in order to know its whole content when it has been distorted by a noisy channel (coding theory) or hidden in a ciphertext (cryptography).

For example, consider a ciphertext C = X$7PK that is known to correspond to a plaintext M ∈ M = {heads, tails}. H(M) = 1, so the cryptanalyst only needs to find the distinguishing bit in the first character of M, not all of M.

Maximal Entropy

Recall that the entropy of n equally likely outcomes (i.e. each occurring with probability 1/n) is log2(n). This is indeed the maximum:

Theorem 4: H(X) is maximized if and only if all outcomes are equally likely. That is, for any n, H(X) = log2(n) is maximal if and only if p(Xi) = 1/n for 1 ≤ i ≤ n. H(X) = 0 is minimal if and only if p(Xi) = 1 for exactly one i and p(Xj) = 0 for all j ≠ i.

Intuitively, H(X) decreases as the distribution of messages becomes increasingly skewed.
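Example 4 can be verified numerically: the entropy of the distribution is 1.75 bits, and the given Huffman code achieves exactly that expected length (a sketch, not from the slides):

```python
from math import log2

probs = {"UP": 0.5, "DOWN": 0.25, "LEFT": 0.125, "RIGHT": 0.125}
code = {"UP": "0", "DOWN": "10", "LEFT": "110", "RIGHT": "111"}

# Entropy of the distribution: the lower bound on average code length.
H = -sum(p * log2(p) for p in probs.values())

# Expected code length of the Huffman encoding.
avg_len = sum(probs[m] * len(code[m]) for m in probs)

# Prefix-free check: no code word is the beginning of another code word.
words = list(code.values())
prefix_free = all(
    not w2.startswith(w1)
    for w1 in words for w2 in words if w1 != w2
)
```

Here the Huffman code meets the entropy bound exactly because every probability is a power of 1/2; in general the expected length only satisfies H ≤ avg_len < H + 1.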

Idea of Proof

(This is not a complete proof! The full proof uses induction on n or Jensen's inequality.)

Idea: suppose p(X1) > 1/n, p(X2) < 1/n, and p(Xi) is fixed for i > 2. Set p = p(X1); then p(X2) = 1 - p - ε with ε = p(X3) + ... + p(Xn), and, since the terms for the fixed outcomes are constant in p,

    H = -p log2(p) - (1 - p - ε) log2(1 - p - ε) + const,
    dH/dp = -log2(p) + log2(1 - p - ε) = log2((1 - p - ε)/p) < 0,

since p(X2) = 1 - p - ε < p. Thus, H decreases as a function of p as p increases. Now prove that H is maximized when p = 1/n by induction on n.

Notes

For a key space K, H(K) measures the amount of partial information that must be learned about a key to actually uncover it (e.g. the number of bits that must be guessed correctly to recover the whole key).

For a k-bit key, the best scenario is that all k bits must be guessed correctly to know the whole key (i.e. no amount of partial information reveals the key; only full information does). That is, the entropy of the random variable on the key space should be maximal. By the previous theorem, this happens exactly when each key is equally likely.

The best strategy for selecting keys, in order to give away as little as possible, is to choose them with equal likelihood (uniformly at random). Cryptosystems are assessed by their key entropy, which ideally should just be the key length in bits (i.e. maximal).
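Theorem 4 can also be sanity-checked numerically. This sketch (not from the slides; the trial count is arbitrary) compares the uniform distribution on 8 outcomes against randomly sampled distributions, none of which should exceed log2(8) = 3 bits:

```python
from math import log2
import random

def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability terms."""
    return -sum(p * log2(p) for p in probs if p > 0)

n = 8
uniform = entropy([1 / n] * n)  # log2(8) = 3 bits

# Sample random distributions on n outcomes; none should beat the uniform one.
random.seed(0)
best = 0.0
for _ in range(1000):
    weights = [random.random() for _ in range(n)]
    total = sum(weights)
    best = max(best, entropy([w / total for w in weights]))
```

This mirrors the key-entropy advice above: any skew in how keys are chosen can only lower H(K) below the ideal k bits.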