Lecture 11: Hash Functions, Merkle-Damgaard, Random Oracle


CS 7880 Graduate Cryptography
October 20, 2015
Lecture 11: Hash Functions, Merkle-Damgaard, Random Oracle
Lecturer: Daniel Wichs
Scribe: Tanay Mehta

1 Topics Covered

- Review: Collision-Resistant Hash Functions
- Merkle-Damgaard Theorem
- Random Oracle Model
- Number Theory and Algebra

2 Collision-Resistant Hash Functions

Recall the following definition from last time. A family of functions $H_s : \{0,1\}^{\ell(n)} \to \{0,1\}^n$, where $\ell(n) > n$ and $s \in \{0,1\}^n$, is a family of collision-resistant hash functions iff:

1. $H_s(x)$ can be computed in polynomial time.
2. For every PPT adversary $A$,

$$\Pr\left[(x \neq x') \wedge (H_s(x) = H_s(x')) \;:\; s \leftarrow \{0,1\}^n,\ (x, x') \leftarrow A(s)\right] \le \mathrm{negl}(n).$$

Note that the seed $s$ is public to all parties; there is no secret key. Given access to all this information, an adversary should still be unable to find collisions.

Collision-resistant hash functions seem to be harder to construct than the other cryptographic primitives we've seen so far, and we don't know how to build them from one-way functions. In fact, there is a result showing that a collision-resistant hash function cannot be built from one-way functions in a black-box manner.

Ideally, we want a high amount of compression from collision-resistant hash functions (i.e. $\ell(n) \gg n$). This is similar to pseudorandom generators, where we wanted a high amount of stretch (the number of extra pseudorandom bits we get). The following theorem shows that if we can compress by even 1 bit then we can compress by any number of bits.

Theorem 1 (Merkle-Damgaard) If there exists a family of collision-resistant hash functions $H_s$ with $\ell(n) = n + 1$ bit input, then there exists a family of collision-resistant hash functions $H'_s$ with any polynomial $\ell'(n)$ bit input.

Proof: We prove this by giving an explicit construction for $H'_s$. Let $(b_1, \ldots, b_{\ell'(n)})$ be the input to $H'_s$. Then $H'_s$ is computed by chaining together $\ell'(n)$ instances of $H_s$ in the following manner.

[Figure: a chain of $\ell'(n)$ instances of $H_s$. The first instance takes $0^n$ and $b_1$ and outputs $y_1$; the last takes $y_{\ell'(n)-1}$ and $b_{\ell'(n)}$ and outputs $y_{\ell'(n)}$.]

Each $H_s$ takes an $(n+1)$-bit input. We initially feed the $n$-bit string $y_0 = 0^n$ and the first bit of our input, $b_1$, into one instance of $H_s$. Then we take its output $y_1$, which has $n$ bits, and feed it into another instance of $H_s$ along with the second bit of our input, $b_2$. In general, $y_i = H_s(b_i, y_{i-1})$. We repeat this process $\ell'(n)$ times until we have fed in all bits of our input. We take the output of the last instance, $y_{\ell'(n)}$, as the value of our hash: $H'_s(x) = y_{\ell'(n)}$.

We will now prove security for this construction. Assume that there is a PPT attacker $A(s)$ on the collision resistance of $H'_s$. In other words, with some probability $\varepsilon(n)$, the attacker $A(s)$ outputs $x = (b_1, \ldots, b_{\ell'(n)}) \neq x' = (b'_1, \ldots, b'_{\ell'(n)})$ such that $H'_s(x) = H'_s(x')$. We claim that whenever this happens, we can use it to find a collision on $H_s$.

Run the construction on both inputs, obtaining intermediate values $y_i$ for $x$ and $y'_i$ for $x'$. Since $x \neq x'$, working backwards from the end there must be a largest index $j$ such that $(b_j, y_{j-1}) \neq (b'_j, y'_{j-1})$. Since this is the largest such $j$, we know that $y_j = y'_j$: if $j = \ell'(n)$ this is the assumed collision $H'_s(x) = H'_s(x')$, and otherwise $(b_{j+1}, y_j) = (b'_{j+1}, y'_j)$ by maximality of $j$. Hence

$$H_s(b_j, y_{j-1}) = y_j = y'_j = H_s(b'_j, y'_{j-1}),$$

which is a collision on $H_s$. Therefore we can efficiently find a collision on $H_s$ whenever $A$ finds a collision on $H'_s$. Since the former can only happen with negligible probability, this shows that $\varepsilon(n)$ is negligible and therefore $H'_s$ is collision resistant.

Note that the above allows us to get a collision-resistant hash function for an arbitrarily large $\ell'(n)$ bit input. But still, the input size $\ell'(n)$ needs to be fixed. Suppose we want to create a collision-resistant hash function $H'_s : \{0,1\}^* \to \{0,1\}^n$ with variable input length. Then the proof above doesn't work anymore because $x$ and $x'$ can be of different lengths.
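The chaining construction and the collision-extraction step of the proof can be sketched in code. This is a minimal illustration, not part of the lecture: `compress` is a hypothetical stand-in for $H_s$ built by truncating SHA-256, and the names `chain`, `md_hash`, `extract_collision` and `find_toy_collision` are chosen here for illustration. The output length is a parameter so that a deliberately tiny (insecure) setting can be used to exhibit a collision by brute force.

```python
import hashlib
from itertools import product

def compress(seed, bit, y, n_bytes):
    """Toy stand-in for H_s: maps (1 bit, n bits) to n bits (assumption, not from the notes)."""
    return hashlib.sha256(seed + bytes([bit]) + y).digest()[:n_bytes]

def chain(seed, bits, n_bytes):
    """Return [y_0, y_1, ..., y_L] with y_0 = 0^n and y_i = H_s(b_i, y_{i-1})."""
    ys = [bytes(n_bytes)]
    for b in bits:
        ys.append(compress(seed, b, ys[-1], n_bytes))
    return ys

def md_hash(seed, bits, n_bytes=16):
    """H'_s(b_1, ..., b_L) = y_L, for a fixed input length L."""
    return chain(seed, bits, n_bytes)[-1]

def extract_collision(seed, x, xp, n_bytes):
    """Given x != xp of equal length with equal md_hash, return two distinct
    inputs (bit, y) that collide under compress, as in the proof."""
    ys, yps = chain(seed, x, n_bytes), chain(seed, xp, n_bytes)
    for j in range(len(x), 0, -1):            # j = L, L-1, ..., 1
        a, b = (x[j - 1], ys[j - 1]), (xp[j - 1], yps[j - 1])
        if a != b:                            # largest j where the inputs differ
            return a, b
    raise ValueError("inputs are identical")

def find_toy_collision(seed, length, n_bytes):
    """Brute-force a collision on md_hash; only feasible for tiny n_bytes."""
    seen = {}
    for x in product((0, 1), repeat=length):
        d = md_hash(seed, x, n_bytes)
        if d in seen:
            return seen[d], x
        seen[d] = x
    return None
```

With an 8-bit output and 12-bit inputs, a collision on the chained hash must exist by pigeonhole, and `extract_collision` turns it into a collision on the one-bit-compressing function. With a realistic output length the brute-force search would of course be infeasible.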
It turns out that there are relatively simple ways to fix this and get a hash function with variable input length as well.

2.1 Merkle Trees

Let us consider a different construction, Merkle trees. We start with a hash function $H_s : \{0,1\}^{2n} \to \{0,1\}^n$. Now consider the following complete binary tree of depth $h$ (illustrated for $h = 3$):

[Figure: complete binary tree of depth 3 with root $y$ and leaves $x_1, \ldots, x_8$.]

For $n$-bit blocks $x_1, \ldots, x_{2^h}$ at the leaves, we define the parent of two nodes to have value $H_s(a, b)$, where $a$ and $b$ are the values of the parent's two children. Then the root node has value $y = H'_s(x_1, \ldots, x_{2^h})$, where $H'_s : \{0,1\}^{n \cdot 2^h} \to \{0,1\}^n$.

The proof of security is similar to the one for the Merkle-Damgaard construction. If an adversary comes up with a collision on the function $H'_s$, then we can use it to find a collision on $H_s$.

Merkle trees have a couple of useful properties. First, Merkle trees are more efficient to compute on a parallel processing system than the Merkle-Damgaard construction. This is because all of the hashes at any level of the tree can be computed in parallel.

Second, let's say Alice hashes some long string $x$ and sends the output $y = H'_s(x)$ to Bob. Since the hash function is collision resistant, this commits Alice to a single string $x$ which she can send as the pre-image of $y$. Let's say that Bob later wants to know the $i$-th block of $x$, where we think of $x = (x_1, \ldots, x_\ell)$ as having $\ell = 2^h$ blocks with $x_i \in \{0,1\}^n$. Bob wants to be sure that there is only a single value of $x_i$ that Alice can legitimately send. One way to do this would be for Alice to send the entire string $x$, and for Bob to verify that $H'_s(x) = y$ matches the hash he has. But this requires a lot of communication if Bob only wants $x_i$ and doesn't care about the rest of $x$.

The nice thing about the Merkle tree construction is that we can do this with much less communication. Instead of Alice sending the entire string $x$, she will only send the $i$-th block $x_i$ along with the hash values of all the sibling nodes along the path from the root of the tree to the $i$-th block. For example, when $i = 3$, the transmitted values are illustrated by the nodes in bold below. Bob can then use the transmitted values to re-compute the value associated with the root of the tree and verify that this matches $y$.
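The tree hash and the block-opening protocol just described can be sketched as follows. This is an illustrative sketch under stated assumptions: `H` is a stand-in for $H_s$ using SHA-256 (so $n = 256$ bits and the public seed is omitted), blocks are indexed from 0 (the text's $i = 3$ is index 2 here), and the function names `build_tree`, `open_block` and `verify` are hypothetical.

```python
import hashlib

def H(a, b):
    """Stand-in for H_s: {0,1}^{2n} -> {0,1}^n with n = 256 bits (assumption)."""
    return hashlib.sha256(a + b).digest()

def build_tree(blocks):
    """Return all levels of the tree: levels[0] is the leaves, levels[-1] is [root].
    The number of blocks must be a power of two."""
    assert len(blocks) > 0 and len(blocks) & (len(blocks) - 1) == 0
    levels = [list(blocks)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # All hashes on one level are independent, so they could run in parallel.
        levels.append([H(prev[k], prev[k + 1]) for k in range(0, len(prev), 2)])
    return levels

def open_block(levels, i):
    """Alice's side: return block i and the sibling values from leaf level upward."""
    block, path = levels[0][i], []
    for level in levels[:-1]:
        path.append(level[i ^ 1])   # sibling of the current node
        i //= 2                     # move to the parent
    return block, path

def verify(root, i, block, path):
    """Bob's side: recompute the root from block i and the sibling values."""
    v = block
    for sibling in path:
        v = H(v, sibling) if i % 2 == 0 else H(sibling, v)
        i //= 2
    return v == root
```

For 8 blocks the opening consists of the block plus 3 sibling values, matching the logarithmic communication of the protocol.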

[Figure: the same tree with root $y$ and leaves $x_1, \ldots, x_8$; the nodes transmitted for $i = 3$ are shown in bold.]

This has applications to memory checking. If you have a hash of a large database consisting of $\ell$ records with $n$ bits each (think of $\ell$ as very large and $n$ as reasonably small) and want to download one small record, you can use this method to check that the downloaded record is correct. The communication is only about $n \log \ell$ bits.

3 Random Oracle Model

Hash functions are used in many different ways in cryptography, beyond collision resistance. There is a belief that practical hash functions have many security properties which aren't captured by collision resistance alone. To capture this intuition, we consider an idealized model of hash functions called the random oracle model. In this model, we think of a hash function as a completely random but public function $RO(\cdot)$. Anybody can query this function at arbitrary inputs $x$ and get back $RO(x)$. This means that when we construct a scheme in the random oracle model, we allow the scheme to make such random oracle queries. However, we must also allow the attacker to make such queries when attacking the scheme.

Defining the Random Oracle Model. More formally, the random oracle model is a model where all parties (e.g. algorithms, adversaries) have oracle access to a (uniformly) random function $RO : \{0,1\}^* \to \{0,1\}^n$ (we can be flexible about the output length, but for now let's just insist on $n$ bits). We can think of this as follows: whenever a fresh value $x$ is queried, the oracle chooses a random output $y$. The next time that $x$ is queried, the oracle gives back the same $y$ as previously. Note that if two different parties query $RO$ on the same input $x$, they will receive the same output.

We say that a cryptographic scheme (e.g., PRG, PRF, encryption, etc.) is secure in the random oracle model if it satisfies the standard syntax, correctness and security properties, but augmented so that the scheme as well as the attacker in the security definition have oracle access to $RO$.
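The "choose a fresh random output on a new query, repeat the old output on a repeated query" view of the oracle is often called lazy sampling, and it can be sketched as below. The class name `RandomOracle` and the query counter are illustrative assumptions; the counter is included only because bounds in this model are typically stated in terms of the number of queries.

```python
import secrets

class RandomOracle:
    """Lazily sampled random function RO: {0,1}^* -> {0,1}^n (n given in bytes)."""

    def __init__(self, n_bytes=32):
        self.n_bytes = n_bytes
        self.table = {}      # answers fixed so far
        self.queries = 0     # total number of queries made by all parties

    def query(self, x):
        self.queries += 1
        if x not in self.table:
            # Fresh input: sample a uniformly random n-bit answer.
            self.table[x] = secrets.token_bytes(self.n_bytes)
        # Repeated input: return the answer chosen earlier.
        return self.table[x]
```

A single shared instance would be handed to both the scheme and the attacker, which reflects the requirement that all parties see the same function.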
We give some examples below.

Constructions in the Random Oracle Model. Let us consider how to build various cryptographic primitives in this model.

Collision-resistant hash functions: We can simply define our CRHF to be

$$H(x) := RO(x).$$

We claim that

$$\Pr\left[A^{RO(\cdot)} \to (x, x') \;:\; x \neq x',\ RO(x) = RO(x')\right] \le \frac{q^2}{2^n} = \mathrm{negl}(n),$$

where $q$ is the number of queries that $A$ makes to $RO$ (the bound is negligible whenever $q$ is polynomial in $n$). We can prove this claim unconditionally, without assuming that $P \neq NP$, and allowing $A$ to run in any amount of time. We will only make a restriction on the number of queries $A$ can make.

Proof: Without loss of generality, assume that $A$'s queries $x_1, \ldots, x_q$ are distinct ($x_i \neq x_j$ for $i \neq j$) and that $A$ calls $RO$ on the values it outputs. The probability that the answers to the $i$-th and $j$-th queries collide is $\frac{1}{2^n}$. There are at most $q^2$ pairs $(i, j)$, so by the union bound the probability of any collision is at most $\frac{q^2}{2^n}$. This is also known as the Birthday Paradox.

Pseudorandom generators: We define our PRG to be

$$G^{RO}(x) = RO(x \| 1),\ RO(x \| 2),$$

where $x \| 1$ and $x \| 2$ are $x$ concatenated with bit-representations of 1 and 2 respectively. Note that each output of $RO$ is $n$ bits, yielding a total of $2n$ bits. The only way to distinguish between this PRG and the uniform distribution is by calling $RO$ on $x \| 1$ or $x \| 2$, which requires guessing $x$. The probability that any single guess of $x$ is correct is $\frac{1}{2^{|x|}}$.

Pseudorandom functions: We initially may think to just define our functions as

$$F_k(x) = RO(x).$$

However, this doesn't work because the adversary $A$ has access to $RO(x)$. There is no secret key used. Instead, consider

$$F_k(x) = RO(k \| x).$$

The security of this definition follows from considering

$$\left|\Pr\left[A^{RO(\cdot), F_k(\cdot)}(1^n) = 1\right] - \Pr\left[A^{RO(\cdot), R(\cdot)}(1^n) = 1\right]\right| \le \mathrm{negl}(n),$$

where $R$ is a random function independent of $RO$.
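The three constructions above are short enough to write out against a lazily sampled oracle. This is a rough sketch under assumptions that are not in the notes: inputs are byte strings, "$x \| 1$" and "$x \| 2$" are encoded by appending a single byte, keys have a fixed length so that $k \| x$ parses unambiguously, and no domain separation between the constructions is attempted. The names `make_oracle`, `crhf`, `prg`, `prf` and `collision_bound` are hypothetical.

```python
import secrets

def make_oracle(n_bytes=16):
    """Return a lazily sampled random function RO: bytes -> n_bytes random bytes."""
    table = {}
    def ro(x):
        if x not in table:
            table[x] = secrets.token_bytes(n_bytes)
        return table[x]
    return ro

def crhf(ro, x):
    """H(x) := RO(x)."""
    return ro(x)

def prg(ro, x):
    """G^RO(x) = RO(x || 1), RO(x || 2): stretches the seed x to 2n output bits."""
    return ro(x + b"\x01") + ro(x + b"\x02")

def prf(ro, k, x):
    """F_k(x) = RO(k || x), with k a fixed-length secret key."""
    return ro(k + x)

def collision_bound(q, n_bits):
    """Union-bound estimate q^2 / 2^n on the collision probability after q queries."""
    return min(1.0, q * q / 2 ** n_bits)
```

For instance, `collision_bound(2**10, 40)` evaluates to $2^{-20}$, while around $2^{20}$ queries against a 40-bit output the bound becomes vacuous, which is the birthday behaviour described in the proof.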

Discussion. As we have seen, it is fairly easy to build cryptographic primitives in this model. The model is mathematically precise and we can give formal definitions and proofs of security in this model. But what does it tell us about the real world, where there is no random oracle?

The hope is that if we take a construction which is secure in the random oracle model and replace the random oracle $RO(\cdot)$ with some real cryptographic hash function $H_s(\cdot)$ (where the seed is chosen randomly but given to the adversary), then the scheme will still be secure. This is not a well-defined security property for the hash function $H_s(\cdot)$. We are simply hoping that $H_s$ is as good as a random oracle, even though we know that $H_s$ is not a truly random function (it has a short description and can be evaluated in polynomial time). We do not have a theorem saying that if a scheme is secure in the random oracle model then it is secure when we replace the random oracle with a real hash function $H_s$. It is therefore only a heuristic way of arguing about security. It seems to work in real life and allows us to construct very simple schemes. But we know that it does not work in general and can lead to incorrect conclusions. In fact, we know that there are (contrived) schemes which are secure in the random oracle model, but are always insecure when you replace the random oracle by any potential hash function $H_s(\cdot)$.

On the other hand, we can think of the random oracle model as defining the limits of what we can even hope to construct using (only) tools like one-way functions or collision-resistant hash functions. If there is a primitive that we can't even construct in the random oracle model, then there is no hope of constructing it (in a black-box way) from any of the primitives we've studied so far. And in fact, it turns out that there is such a primitive: public-key encryption.
A beautiful result by Impagliazzo-Rudich shows that we can't build public-key encryption in the random oracle model (without making some other hardness assumptions). Intuitively, public-key encryption requires hard problems that have more structure, which the random oracle does not provide. Luckily, we will see candidates for such hard problems in number theory and algebra.

4 Number Theory and Algebra

We now move on to a new part of the course. We're going to build the primitives we've been studying (OWFs, PRGs, PRFs) using hardness assumptions from number theory and algebra. These constructions are never used in practice since they're far less efficient than the ad-hoc constructions out there. We will also go beyond the primitives we've seen so far and show how to build more advanced cryptosystems, most importantly public-key encryption. These are the schemes used in practice since there aren't more efficient ad-hoc alternatives. We state some facts of number theory and algebra.

Modular arithmetic: Consider the set $\mathbb{Z}_n$ (integers modulo $n$), with canonical representatives $\mathbb{Z}_n = \{0, \ldots, n-1\}$. Along with the operation $+$, $(\mathbb{Z}_n, +)$ forms a group. This means it satisfies:

1. Identity: There exists $0 \in \mathbb{Z}_n$ such that for all $a \in \mathbb{Z}_n$ we have that $a + 0 = a$.

2. Add: For all $a, b \in \mathbb{Z}_n$, $a + b \in \mathbb{Z}_n$.
3. Inverse: For all $a \in \mathbb{Z}_n$, there exists $-a \in \mathbb{Z}_n$ such that $a + (-a) = 0$.

Note that $(\mathbb{Z}_n, \cdot)$ does not form a group because we lack multiplicative inverses (i.e. we cannot multiply every element by another to get 1).

Fact: $a \neq 0$ has an inverse in $(\mathbb{Z}_n, \cdot)$ if (and only if) $\gcd(a, n) = 1$.

Proof: $\gcd(a, n) = 1$ is equivalent to saying there exist integers $x, y$ such that $xa + yn = 1$. From this, we have

$$xa \equiv 1 \pmod{n} \implies x \equiv a^{-1} \pmod{n}.$$

Consider $\mathbb{Z}_n^* = \{a \in \{1, \ldots, n\} : \gcd(a, n) = 1\}$. This forms a group under multiplication since all elements have inverses. What is the size of this group? Define the Euler totient function $\varphi(n) = |\mathbb{Z}_n^*|$. If $p$ is prime, then $\mathbb{Z}_p^* = \mathbb{Z}_p \setminus \{0\} = \{1, \ldots, p-1\}$. Therefore, $\varphi(p) = p - 1$.

Let us now look at the computational complexity of various algebraic operations.

- $+$ and $\cdot$ are polynomial time. $+$ is, in fact, linear.
- $(\cdot)^{-1} \pmod{n}$, computing the inverse: existence of the inverse can be checked using Euclid's algorithm. The actual inverse can be found using the extended Euclidean algorithm. Both of these run in polynomial time.
- $a^b$, exponentiation: cannot be computed in polynomial time because the length of the output is exponential in the input length.
- $a^b \pmod{n}$ can be computed in polynomial time. Write $b$ in binary as $b = \sum_i b_i 2^i$. Then

$$a^b \equiv a^{\sum_i b_i 2^i} \equiv \prod_i a^{b_i 2^i} \equiv \prod_{i \,:\, b_i = 1} a^{2^i} \pmod{n}.$$

Each factor $a^{2^i} \bmod n$ can be computed by repeated squaring $i$ times.
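The inverse computation and the repeated-squaring formula above can be sketched in a few lines. The function names `egcd`, `modinv` and `modexp` are chosen here for illustration; Python's built-in three-argument `pow` already provides modular exponentiation and serves as a cross-check.

```python
def egcd(a, b):
    """Extended Euclid: return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q = a // b
        a, b = b, a - q * b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

def modinv(a, n):
    """Return a^{-1} mod n, which exists exactly when gcd(a, n) == 1."""
    g, x, _ = egcd(a % n, n)
    if g != 1:
        raise ValueError("no inverse: gcd(a, n) != 1")
    return x % n

def modexp(a, b, n):
    """Compute a^b mod n by scanning the bits of b from least significant up."""
    result, square = 1 % n, a % n      # square holds a^(2^i) mod n
    while b > 0:
        if b & 1:                      # bit b_i == 1: multiply in a^(2^i)
            result = result * square % n
        square = square * square % n   # one more squaring: a^(2^(i+1))
        b >>= 1
    return result
```

The loop in `modexp` runs once per bit of $b$ and every intermediate value stays below $n^2$, which is why the cost is polynomial in the input length even though $a^b$ itself is too long to write down.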