Lecture 11: Hash Functions and Random Oracle Model

CS 7810 Foudatios of Cryptography October 16, 017 Lecture 11: Hash Fuctios ad Radom Oracle Model Lecturer: Daiel Wichs Scribe: Akshar Varma 1 Topic Covered Defiitio of Hash Fuctios Merkle-Damgaård Theorem Merkle Trees Radom Oracle Model Collisio-Resistat Hash Fuctios A collisio-resistat hash fuctios (CRHF) is a fuctio that shriks a log message to a short output called the digest of the message. Ituitively, we wat to esure that the eve though the digest is short it uiquely idetifies the message. This may seem cotradictory sice may messages must be hashed to the same digest. However, the security property that we will require, called collisio resistace says that o efficiet adversary ca come up with a collisio cosistig of two differet iputs that both get mapped to the same output..1 Naive defiitio A atural way of defiig a CRHF is as follows. A family of fuctios H : {0, 1} l() {0, 1} with l() > are collisio-resistat hash fuctios iff they satisfy the followig properties. H(x) ca be computed i polyomial time. For all PPT adversaries A, Pr[H(x) = H(x ) x x : (x, x ) A(1 )] = egl(). While this defiitio ituitively captures the idea of a CRHF it is ever goig to be satisfied if we allow for o-uiform adversaries. Whe o-uiform adversaries are allowed there is a trivial adversary that simply hard codes a collisio for each possible ad always successfullly outputs x x such that H(x) = H(x ) (all without doig ay computatio). Lecture 11, Page 1

. Better defiitio To remove the possibility of such trivial adversaries, we modify our defiitio to iclude a seed. A family of fuctios H(s, x) : {0, 1} {0, 1} l() {0, 1} with l() > are collisio-resistat hash fuctios iff they satisfy the followig properties. H s (x) := H(s, x) ca be computed i polyomial time. For all PPT adversaries A, Pr[H s (x) = H s (x ) x x : s {0, 1}, (x, x ) A(1, s)] = egl(). The defiitio says that whe a uiformly radom public seed is used, o adversary ca fid a collisio with o-egligible probability. Apart from capturig the ituitio of collisio-resistat hash fuctios it also removes the possibility of trivial adversaries. This seed is primarily for havig a soud defiitio ad i practice useeded hash fuctios are used because o oe kows how to come up with the hard coded collisios ad hece the trivial adversary. 3 Merkle-Damgaård Theorem Recall that for PRGs we had a theorem statig that if we have a PRG which exteds the iput by oe bit, the we ca use that to create PRGs that exteds the iput by ay polyomial umber of bits. Similarly, we ow show that if we ca compress the iput by oe bit, the we ca compress it by ay polyomial umber of bits. Theorem 1 If H s is a family of collisio-resistat hash fuctios with l() = + 1, the we ca costruct a family of collisio-resistat hash fuctios H s with l () = poly(). Proof: We provide a costructio for H s similar to the costructio for the chaied PRG costructio. Let X = (x 1, x,..., x ) be the iput bits to the fuctio H s. We defie H s(x) = y l () via the followig figure. x 1 x x i x l () H s H s............ H s... H s 0 y 1 y l () 1 y l () We show that if X X such that H s(x) = H s(x ), the we ca use that to fid a collisio for H s. Let y 1,..., y l () be the values created durig the computatio of H s (X) ad y 1,..., y l () be the values created durig the computatio H s(x ). We kow that y l () = y l () because H s(x) = H s(x ). Search backwards for the largest i such that (x i, y i 1 ) (x i, y i 1 ) (this i exists because X X ). This pair implies that H s (x i, y i 1 ) = H s (x i, y i 1 ) y i = y i. This provides a collisio for H s ad is a cotradictio to H s beig a CRHF. Lecture 11, Page

A atural questio to ask ow is whether the same costructio is sufficiet for variable legth iputs? The aswer is o. Cosider a CRHF for which H s (0 +1 ) = 0, we simply preped 0 to our iput ad get collisios, H s(x) = H s(0 X). The place where our proof fails is whe we search for the largest i. We simply fall off the chai of H s correspodig to H s(x) whe searchig for this i. We ca avoid this problem by defiig H s (X) = H s(x < l () >) where < l () > is the biary represetatio of the legth of the iput X. Now the same proof idea will work. Either we fid a collisio i the first bits ( l () ) from the right or X = X ad the earlier proof works is sufficiet. 4 Merkle Trees While the Merkle-Damgaård costructio allows us to go from 1 bit compressio to ay sized compressio, we ow look at a differet costructio which is more useful i some scearios. Theorem If H s is a family of collisio-resistat hash fuctios with l() =, the we ca costruct a family of collisio-resistat hash fuctios H s with l () = i for ay i > 0. Proof: Let X = (x 1, x,..., x i) be the iput bits split ito blocks of bits each. We compute the fial digest by creatig a biary tree (of depth i) with each bit block as a leaf, ad every iteral ode beig the hash of the cocateatio of its two childre (x = H s (x l, x r )). Thus, the root will be a digest of the required size ad would be depedet o all the bits of X. We illustrate the tree structure for i = 3. H s(x) x 1 x x 3 x 4 x 5 x 6 x 7 x 8 The proof that this costructio gives us a CRHF is quite similar to the proof of the Merkle-Damgaård costructio. Usig a dowwards search startig at the root i the two Lecture 11, Page 3

trees, we would see that a paret ad the correspodig childre would provide a collisio for H s if there is a collisio for H s. 4.1 Merkle-Damgaård costructio vs. Merkle Trees We have see two ways to costruct CRHFs with more compressio power startig with less powerful CRHFs. Both of these have their advatages ad disadvatages ad we ow look at some of those. Parallelizability/Streamig: Merkle Trees aturally provide parallelizability i their costructio ulike the Merkle-Damgaård costructio. O the other had, this parallelizability comes with the disadvatage of havig to store a lot of itermediate states before reachig the fial digest value ulike for Merkle-Damgaård where we oly eed to store oe hashed value at ay give time. However, if the data is received i a streamig maer the the Merkle-Damgaård costructio is much more efficiet i terms of storage space required. Updatio/Appedig: If the iput value chages slightly the the Merkle Tree costructio simply updates the iteral odes that do get chaged which eeds O(i) computatios ad hece smaller tha O(l ()) computatios that the Merkle-Damgaård costructio would eed. This is a simple storage vs. computatio trade-off with the Merkle Tree havig to store all itermediate odes ad the Merkle-Damgaård costructio havig to perform all computatios from scratch. Appedig to the iput is slightly more complicated for the Merkle Tree as ew odes eed to be created ad balacig of the tree may also come ito play if oe eeds to keep the updatio efficiet. For the Merkle-Damgaård costructio storig the state just before the < l () > part might be sufficiet. Small segmet of large dataset: A sceario where the Merkle Tree is better equipped compared to the Merkle-Damgaård costructio is whe a small part of the message eeds to be verified. Suppose Alice has stored a very large amout of data with Bob ad Charlie wats to see a small segmet of this data (say x i ). If Bob seds Charlie the whole data ad the correspodig hash, the that results i a large amout of data trasfer (especially if Charlie is oly cocered with x i ad does t care about the rest of x). However, if Bob uses the Merkle Tree method of computig the hash the he ca simply sed Charlie the requested data segmet alog with the hashes for all the siblig odes alog the path from the leaf for x i to the root (as show i bold i the followig figure). This allows us to solve the problem while avoidig high commuicatio overhead which would ot be possible with the Merkle-Damgaård costructio due to its serial ature. 5 Radom Oracle Model The CRHFs that are used i practice seem to have may other cryptographic properties apart from collisio-resistace. We try to capture this usig the Radom Oracle model by cosiderig a idealized hash fuctio. We formally defie a Radom Oracle model as a model i which all parties (icludig adversaries) have oracle access to a cosistet, uiformly radom fuctio RO : {0, 1} {0, 1}. This oracle ca be thought of as choosig a radom output y o beig queried with Lecture 11, Page 4

H s(x) x 1 x x 3 x 4 x 5 x 6 x 7 x 8 a value x ad rememberig its choice. Whe two people query the fuctio with the same x, they both receive the same y value. We defie cryptographic systems i this model the same as earlier, except both the algorithm ad adversary are provided oracle access to RO( ). The stadard security ad correctess requiremets are carried forward ito this settig as we will see i the followig examples. We assume that X = while aalyzig security of the cryptosystems we defie. 5.1 OWFs We defie a OWF i the Radom Oracle model as follows: f RO( ) (X) = RO(X) Sice RO is a truly radom fuctio, there is o way for a adversary to ivert it except by brute-force queries. The probability that oe query of the adversary succeeds is bouded by 1. If the adversary A makes T queries to RO, the we ca boud the success probability of A by T. 5. PRGs We defie a PRG i the Radom Oracle model as follows: G RO( ) (X) = RO(X 0) RO(X 1) This is a PRG as it takes i bits of iput ad returs bits of output. The oly way for a adversary to distiguish betwee G(X) ad U is to query RO o X 0 or X 1, which ca happe with probability at most 1 (adversary eeds to guess X). Similar to the OWF case, the success probability is bouded by T if the adversary is allowed to make T queries. Lecture 11, Page 5

5.3 PRFs We defie a PRF i the Radom Oracle model as follows: F RO( ) K (X) = RO(K X) This has the same security as the PRG defiitio if K = (adversary has to guess K). 5.4 CRHF We defie a CRHF i the Radom Oracle model as follows: H RO( ) (X) = RO(X) We claim that this has security T if the adversary makes T queries. The proof follows by otig that the probability of the i th ad j th queries collidig is 1 ad hece with T queries, the probability of fidig a collisio is bouded by T (usig the uio boud). This is also kow as the birthday boud that is see i the birthday paradox. Pr[(X X ) (RO(X) = RO(X )) : (X, X ) A RO( ) (1 )] Pr[ i, j [T ], i j, RO(X i ) = RO(X j )] i,j Pr[RO(X i ) = RO(X j )] T 5.5 Radom Oracles i real life The motivatio of comig up with the Radom Oracle model was to try ad capture the extra properties that CRHFs seemed to show i real life, i a formal ad rigourous maer. Whe we try to go back from the theoretical models to practice, we lose this rigour. I real life there are o radom oracles RO, ad the cryptographic primitives that we costruct i this model caot be directly used. What is doe i practice is to simply replace the RO with a CRHF H s ad use the same costructios as i the Radom Oracle model. Of course H s is ot a truly radom fuctio, ad hece oe of the ice properties that we ca prove regardig RO ecessarily hold whe we use H s. Whe shiftig from theory to practice, mathematical rigour ad proofs are let go ad oe just hopes that it works out. The guaratees regardig security get modified (i a had-wavy maer), with the umber of queries T gettig replaced with the ruig time of H s. While this is ot a rigourous maer of arguig about the security of cryptosystems it seems to work i geeral ad thus gets used. Lecture 11, Page 6