On Compressing Encrypted Data

Mark Johnson, Prakash Ishwar, Vinod M. Prabhakaran, Daniel Schonberg, and Kannan Ramchandran
Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720, USA. {mjohnson, ishwar, vinodmp, dschonbe,

This research was supported by NSF under grants CCR , CCR , and CCR and DARPA under grant F . Mark Johnson's work is supported by the Fannie and John Hertz Foundation.

Abstract

When it is desired to transmit redundant data over an insecure and bandwidth-constrained channel, it is customary to first compress the data and then encrypt it. In this paper, we investigate the novelty of reversing the order of these steps, i.e., first encrypting and then compressing, without compromising either the compression efficiency or the information-theoretic security. Although counter-intuitive, we show surprisingly that, through the use of coding with side information principles, this reversal of order is indeed possible in some settings of interest without loss of either optimal coding efficiency or perfect secrecy. We show that in certain scenarios our scheme requires no more randomness in the encryption key than the conventional system where compression precedes encryption. In addition to proving the theoretical feasibility of this reversal of operations, we also describe a system which implements compression of encrypted data.

I. INTRODUCTION

Consider the problem of transmitting redundant data over an insecure, bandwidth-constrained communications channel. It is desirable to both compress and encrypt the data. The traditional way to do this, shown in Figure 1, is to first compress the data to strip it of its redundancy followed by encryption of the compressed bitstream. The source is first compressed to its entropy rate using a standard source coder. Then, the compressed source is

encrypted using one of the many widely available encryption technologies. At the receiver, decryption is performed first, followed by decompression.

[Figure 1: block diagram. Message Source, Compression, Encryption, Public channel (observed by the Eavesdropper), Decryption, Decompression, Reconstructed Source; the Key is shared over a Secure channel.]
Fig. 1. Conventional system: The encoder first compresses the source and then encrypts before transmitting over a public channel. The decoder first decrypts the received bitstream and then decompresses the result.

In this paper, we investigate the novelty of reversing the order of these steps, i.e., first encrypting and then compressing the encrypted source, as shown in Figure 2. The compressor does not have access to the cryptographic key, so it must be able to compress the encrypted data (also called ciphertext) without any knowledge of the original source. At first glance, it appears that only a minimal compression gain, if any, can be achieved, since the output of an encryptor will look very random. However, at the receiver, there is a decoder in which both decompression and decryption are performed in a joint step. The fact that the decoder can use the cryptographic key to assist in the decompression of the received bitstream leads to the possibility that we may be able to compress the encrypted source. In fact, we show that a significant compression ratio can be achieved if compression is performed after encryption. This is true for both lossless and lossy compression. In some cases, we can even achieve the same compression ratio as in the standard case of first compressing and then encrypting.

The fact that we can still compress the encrypted source follows directly from distributed source coding theory. When we consider the case of lossless compression, we use the Slepian-Wolf theorem [1] to show that we can achieve the same compression gain as if we had compressed the original, unencrypted source. For the case of lossy compression, the Wyner-Ziv theorem [2] dictates the compression gains that can be achieved. If the original source is Gaussian, then we can achieve the

same compression efficiency, for any fixed distortion, as when we compress before encrypting. For more general sources, we cannot achieve the same compression gains as in the conventional system, which is a direct result of the rate-loss of the underlying Wyner-Ziv problem.

[Figure 2: block diagram. Message Source, Encryption, Compression, Public channel (observed by the Eavesdropper), Joint decompression and decryption, Reconstructed Source; the Key is shared over a Secure channel.]
Fig. 2. Proposed system: The source is first encrypted and then compressed. The compressor does not have access to the key used in the encryption step. At the decoder, decompression and decryption are performed in a single joint step.

All of these claims relate to the theoretical limits of compressing an encrypted source, and are demonstrated via non-constructive, existence proofs. However, in addition to studying the theoretical bounds, we also implement a system where the compression step follows the encryption. We will describe the construction of this system and present computer simulations of its performance.

We also investigate the security provided by a system where a message is first encrypted and then compressed. We first define a measure of secrecy based on the statistical correlation of the original source and the compressed, encrypted source. Then, we show that the reversed cryptosystem in Figure 2 can still have perfect secrecy under some conditions.

While we focus here on the fact that the reversed cryptosystem can match the performance of a conventional system, we have uncovered a few application scenarios where the reversed system might be preferable. In one such scenario, we can imagine that some content, either discrete or continuous in nature, is being distributed over a network. We will further assume that the content owner and the network operator are two distinct entities, and do not trust each other. The content owner is very interested in protecting the privacy of the content via encryption. However, because the owner has no incentive to compress his data, he will not use his limited computational

resources to run a compression algorithm before encrypting the data. The network operator, on the other hand, has an overriding interest in compressing all network traffic, in order to maximize network utilization and, therefore, maximize his profit. However, because the content owner does not trust the network operator, the owner will not supply the cryptographic key that was used to encrypt the data. The network operator is forced to compress the data after it has been encrypted.

Our work was primarily inspired by recent code constructions for the distributed source coding problem [3], which we use in the compression stage of our system. We are not aware of any previous literature on the topic of compressing data that has already been encrypted. The two main contributions of this work are the identification of the connection between the stated problem and distributed source coding, and the demonstration that in some scenarios reversing the order of encryption and compression does not compromise either the compression efficiency or the security.

This paper is organized in the following manner. Section II gives some background information on distributed source coding. The topics presented in that section will be used subsequently to develop an efficient system for compressing encrypted data. In Section III, the formal notion of information-theoretic security is introduced and the performance limits of general cryptosystems are established. The problem of compressing encrypted data is formally stated in Section IV and a solution based on the Wyner-Ziv distributed source coding theorem [2] is presented. Results from computer simulations are presented in Section V and some concluding remarks are given in Section VI. Involved proofs of the main results have been placed in appendices to maintain the flow of the presentation.

Notation: $\mathbb{R}^+$ denotes the set of nonnegative real numbers. Random quantities will be denoted by capital letters (e.g., X). Specific realizations of random quantities will be denoted by small letters (e.g., x).
Boldface letters will denote vectors of some generic block length n, e.g., $\mathbf{x} := (x_1, x_2, \ldots, x_n)$, $\mathbf{X} := (X_1, X_2, \ldots, X_n)$, etc. Often, $\limsup_{n \to \infty}$ and $\liminf_{n \to \infty}$ shall be abbreviated to lim sup and lim inf respectively. We will denote the mathematical expectation operator by $E[\cdot]$ and event probabilities by $P(\cdot)$.

II. DISTRIBUTED SOURCE CODING

In this section, we describe the distributed source coding problem and provide the principles behind code constructions for both lossless compression and compression with a fidelity criterion. These code constructions

will be used subsequently to construct systems that implement the compression of encrypted data.

A. Lossless compression

Distributed source coding considers the problem of compressing sources Y and K that are correlated, but cannot communicate with each other. In this subsection, we look at the case where Y and K are to be compressed losslessly. This is possible only if Y and K are drawn from discrete alphabets, i.e., the size of the alphabets is at most countably infinite. An important special case of this problem, upon which we will focus, is when Y needs to be sent to a decoder which has access to the correlated side-information K. For this situation, the Slepian-Wolf theorem [1] gives the smallest achievable rate for communicating Y losslessly to the decoder. The Slepian-Wolf theorem asserts that the best achievable rate required to transmit Y is given by the conditional entropy [4] of Y given K, denoted by $H(Y|K)$ bits/sample.

While these results are theoretical, there has been some recent work that provides practical code constructions to realize these distributed compression gains [3]. We will use an example to show the intuition behind these constructions.

[Figure 3: Y enters the Encoder, whose output goes to the Decoder, which produces Ŷ; K is supplied to both the encoder and the decoder.]
Fig. 3. A distributed source coding problem: The side information K is available at both the encoder and the decoder.

We begin by looking at the case where K is available at both the encoder and the decoder, as depicted in Figure 3. In our example, Y and K are each uniformly distributed binary data of length 3. Furthermore, Y and K are correlated such that their Hamming distance is at most 1, i.e., they differ in at most one of the three bit positions. For example, if Y is 010, then K will equally likely be one of the four patterns {010, 110, 000, 011}. The encoder forms the error pattern $e = Y \oplus K$. Because Y and K differ in at most one bit position, the error pattern e can take on only four possible values, namely {000, 001, 010, 100}. These four values can be indexed with two bits.
That index is transmitted to the decoder, which looks up the error pattern corresponding to the index received from the

encoder, and then computes $Y = e \oplus K$.

[Figure 4: Y enters the Encoder, whose output goes to the Decoder, which produces Ŷ; K is supplied to the decoder only. The four cosets are indexed Coset 1 (00), Coset 2 (01), Coset 3 (10), and Coset 4 (11).]
Fig. 4. A distributed source coding problem: Y and K are three-bit binary sequences which differ by at most one bit. K is available only at the decoder. The encoder can compress Y to two bits by sending the index of the coset in which Y occurs.

We next consider the case in Figure 4 where K is available at the decoder, but not at the encoder. Without K, the encoder cannot form the error pattern e. However, it is still possible for the encoder to compress Y to two bits and the decoder to reconstruct Y without error. The reason behind this surprising fact is that it is unnecessary for the encoder to spend any bits to differentiate between Y = 000 and Y = 111. The Hamming distance of 3 between these two codewords is sufficiently large to enable the decoder to correctly decode Y based on its access to K and the knowledge that K is within a Hamming distance of 1 from Y. If the decoder knows Y to be either Y = 000 or Y = 111, it can resolve this ambiguity by checking which of the two is closer in Hamming distance to K, and declaring that codeword to be Y.

We observe that the set {000, 111} is a 3-bit repetition code with a minimum distance of 3. Likewise, in addition to the set {000, 111}, we can consider the following 3 sets: {001, 110}, {010, 101}, and {100, 011}. Each of these sets is composed of two codewords whose Hamming distance is 3. These sets are the cosets of the 3-bit repetition code. While we typically use the set {000, 111} as the 3-bit repetition code (0 is encoded as 000, and 1 as 111), it is clear that one could just as easily have used any of the other three cosets with the same performance. Also, these 4 sets cover the complete space of binary 3-tuples that Y can assume. Thus, instead of describing Y by its 3-bit value, we can instead encode the coset in which Y occurs. There are 4 cosets,

so we need only 2 bits to index the cosets. We can compress Y to 2 bits, just as in the case where K was available at both the encoder and decoder.

This simple code construction can be used to compress data that has been encrypted with a one-time pad. In this problem, K is a binary pad that is used to encrypt a 3-bit data sequence X, forming an encrypted data sequence $Y = X \oplus K$. If X can only take on the values {000, 001, 010, 100}, then the Hamming distance between Y and K is at most 1. We can use this construction to compress Y to 2 bits, and a decoder which has access to K will be able to correctly decode Y. The decoder can then recover the original data X by computing $X = Y \oplus K$.

This construction can be extended beyond the simple example considered here. The space of all possible words is partitioned into cosets, which are associated with the syndromes of the principal underlying channel code (the 3-bit repetition code in the above example). The encoding procedure is to compute the syndrome of Y with respect to the appropriate channel code and transmit this syndrome to the decoder. The choice of channel code depends on the correlation structure between Y and K. If Y and K are more correlated, then the required strength of the code is less. In practice, we will use a much more complex channel code than the simple repetition code. The decoding procedure is to identify the closest codeword to K in the coset associated with the transmitted syndrome, and declare that codeword to be Y.

B. Compression with a fidelity criterion

The Wyner-Ziv theorem [2] extends the Slepian-Wolf result to the case of lossy coding with a distortion measure. This theorem gives the best achievable rate-distortion pairs for the problem of coding with side information. The theorem applies to both discrete and continuous sources. However, the real line is the natural alphabet for representing many signals of interest, such as natural images. We are primarily interested in the case where the source is a real number and a mean squared error distortion measure is used.
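The coset construction of Section II-A, including its one-time-pad use, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the parity-check matrix `H` is one valid choice for the 3-bit repetition code, and the function names (`compress`, `decompress`, `roundtrip`) are hypothetical, not taken from the paper.

```python
from itertools import product

# One parity-check matrix of the 3-bit repetition code {000, 111} (an assumed,
# standard choice). Its 2-bit syndrome serves as the coset index.
H = ((1, 1, 0),
     (0, 1, 1))

def syndrome(word):
    """2-bit syndrome of a 3-bit word with respect to H (arithmetic mod 2)."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)

def xor(a, b):
    """Bitwise XOR of two equal-length bit tuples."""
    return tuple(x ^ y for x, y in zip(a, b))

def hamming(a, b):
    """Hamming distance between two equal-length bit tuples."""
    return sum(x != y for x, y in zip(a, b))

def compress(y):
    """Encoder: send only the coset index of y. The key k is never used here."""
    return syndrome(y)

def decompress(s, k):
    """Decoder: pick the word in coset s that is closest to the side information k."""
    coset = [w for w in product((0, 1), repeat=3) if syndrome(w) == s]
    return min(coset, key=lambda w: hamming(w, k))

def roundtrip(x, k):
    """Encrypt x with pad k, compress, then jointly decompress and decrypt."""
    y = xor(x, k)                       # one-time-pad encryption
    y_hat = decompress(compress(y), k)  # decoder recovers y using k
    return xor(y_hat, k)                # decryption
```

Decoding is unambiguous here because the two words in each coset are at Hamming distance 3, so at most one of them can lie within distance 1 of k. For instance, `roundtrip((0, 1, 0), (1, 0, 1))` transmits only two bits and returns `(0, 1, 0)`.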
We will provide an example that illustrates some of the intuition behind the implementation of an encoder/decoder pair for distributed source coding with a non-zero fidelity criterion. In this example, Y is uniformly distributed in the interval $[-\frac{9\delta}{2}, \frac{9\delta}{2}]$. The side information K is correlated with Y such that $|K - Y| < \delta$. The encoder will first quantize Y to Ŷ with a scalar quantizer with step size δ, which we show in Figure 5. Clearly, the distance between Y and Ŷ is bounded by $|Y - \hat{Y}| \le \frac{\delta}{2}$. We can think of the

quantizer as consisting of three interleaved quantizers (cosets), each of step size 3δ. In Figure 5 we have labeled the reconstruction levels of the three quantizers as A, B, and C, respectively. The encoder, after quantizing Y, will note the label of Ŷ and send this label to the decoder, which requires $\log_2(3)$ bits on average.

[Figure 5: reconstruction levels at −4δ, −3δ, −2δ, −δ, 0, δ, 2δ, 3δ, 4δ, labeled A, B, C, A, B, C, A, B, C.]
Fig. 5. Composite quantizer: The scalar quantizer with step size δ can be thought of as three interleaved scalar quantizers with step size 3δ.

The decoder has access to the label transmitted by the encoder and the side information K. We can bound the distance between Ŷ and K as

$$|\hat{Y} - K| \le |\hat{Y} - Y| + |Y - K| < \frac{\delta}{2} + \delta = \frac{3\delta}{2}. \qquad (1)$$

Because Ŷ and K are within a distance of $\frac{3\delta}{2}$ of each other and the reconstruction levels with the same label are separated by 3δ, the decoder can correctly find Ŷ by selecting the reconstruction level with the label sent by the encoder that is closest to K. This can be seen in Figure 6, which shows one realization of Y and K. In this figure, the encoder quantizes Y to Ŷ and transmits the label, an A, to the decoder. The decoder finds the reconstruction level labeled A that is closest to K, which is in fact Ŷ.

[Figure 6: the same labeled levels from −4δ to 4δ as in Figure 5, with one realization of K, Y, and Ŷ marked.]
Fig. 6. Distributed lossy compression example: The encoder quantizes Y to Ŷ and transmits the label of Ŷ, an A. The decoder finds the reconstruction level labeled A which is closest to the side information K, which is equal to Ŷ.

In this example, the encoder has transmitted only $\log_2(3)$ bits, and the decoder can correctly reconstruct Ŷ, an estimate within $\frac{\delta}{2}$ of the source Y. In the absence of K at the decoder, the encoder would have had to transmit $\log_2(9)$ bits in order to send the index of the quantized level. This shows that the presence of the side-information

can be used to reduce the required transmission rate for meeting a target distortion. Further, observe that if the decoder had merely used K as an estimate of Y, then by definition that estimate could have been as far as δ from the source Y. Hence, the encoder, by sending the label of the quantized source, has reduced the maximum possible distortion at the decoder by a factor of two.

It should be noted that in this example we have simply chosen to use Ŷ as the best estimate of Y. In reality, the decoder can use both Ŷ and K to compute an optimal estimate of Y (using the joint statistics of (Y, K)). We have omitted this step here, as our intention was to highlight the gains that are achieved by transmitting the label (coset membership information) to the decoder.

In our example, we have used a scalar quantizer and the encoder computed the label of the quantized source via a very simple idea of alternating the levels with three labels. In practice, we can achieve better performance by replacing both of these methods with more complex alternatives, such as nested linear codes. For example, the encoder can quantize a sequence of source samples with a trellis-coded quantization (TCQ) scheme [5]. Then it can find the syndrome of the quantized sequence with respect to a trellis-coded modulation (TCM) scheme [6] and transmit that syndrome. The correlation structure between Y and K governs the amount of redundancy that we require in these codes. Practical code constructions for the distributed source coding problem can be found in [3].

III. INFORMATION-THEORETIC SECURITY

In this section we set up the problem of secure and bandwidth efficient transmission of messages over channels where an eavesdropper can listen to the messages being transmitted. We formalize the notion of information-theoretic security and establish performance limits for key size, secrecy, and compressibility for general cryptosystems. Figure 7 shows a general model of a secret-key cryptosystem.
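Returning briefly to the scalar-quantizer example of Section II-B, the label-based encoder and the side-information decoder can be sketched as follows. This is a minimal illustration, assuming the A/B/C labeling of Figure 5 (the level at 0 carries label B); the function names `encode` and `decode` are hypothetical.

```python
LABELS = "ABC"

def encode(y, delta):
    """Quantize y to the nearest multiple of delta and return only its label."""
    q = round(y / delta)            # index of the nearest reconstruction level
    return LABELS[(q + 1) % 3]      # labeling assumed from Fig. 5: level 0 is 'B'

def decode(label, k, delta):
    """Return the reconstruction level carrying `label` that is closest to k."""
    offset = LABELS.index(label) - 1          # levels with this label: offset + 3m
    m = round((k / delta - offset) / 3)       # nearest such level to k
    return (offset + 3 * m) * delta
```

Because levels sharing a label are 3δ apart and |Ŷ − K| < 3δ/2, the rounding in `decode` selects the level the encoder quantized to, so only the label (log2(3) bits) needs to be sent instead of the full index (log2(9) bits).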
A (stochastic) message source taking values in a source alphabet $\mathcal{X}$ is encoded into a bitstream B in blocks $\mathbf{X}$ of suitable length n using a cryptographic key T. The source alphabet can be arbitrary, i.e., discrete or continuous, unless otherwise noted. For simplicity, we assume that the source sequence $X_i$, $i = 1, 2, \ldots$ is independent and identically distributed (i.i.d.) with distribution $p_X(x)$ on alphabet $\mathcal{X}$. Our results can, in principle, be extended to more general situations, e.g., for stationary ergodic sources. The key T is a random variable taking values in a finite alphabet $\mathcal{T}_n$ independent of the message source. The secret key T is known to the decoder through a secure channel. The encoding takes place through a rate-R ($R < \infty$) bits per source symbol (bits/symbol) encoding function $e_n : \mathcal{X}^n \times \mathcal{T}_n \to \{0,1\}^{nR}$. The encoded message bitstream

B of rate R bits/symbol is sent to the decoder through an insecure public channel which is effectively noiseless (footnote 1). The encoding operation should be such that the decoder can recover an estimate $\hat{\mathbf{X}}$ of the source message in a reconstruction alphabet $\hat{\mathcal{X}}$, to an acceptable degree of fidelity, using B and T. The decoding takes place through a decoding function $g_n : \{0,1\}^{nR} \times \mathcal{T}_n \to \hat{\mathcal{X}}^n$.

Definition 3.1 (Cryptosystem) A cryptosystem of rate R and block length n is a triple $(\mathcal{T}_n, e_n, g_n)$ consisting of (i) a finite secret-key alphabet $\mathcal{T}_n$ with an associated key distribution (footnote 2), (ii) a rate-R encoder map $e_n : \mathcal{X}^n \times \mathcal{T}_n \to \{0,1\}^{nR}$, and (iii) a decoder map $g_n : \{0,1\}^{nR} \times \mathcal{T}_n \to \hat{\mathcal{X}}^n$.

[Figure 7: block diagram. Message source X, Encoder $e_n$, Public channel carrying B at rate R (observed by the Eavesdropper), Decoder $g_n$, reconstruction $\hat{X}$; the key T reaches the decoder over a Secure channel.]
Fig. 7. A general secret-key cryptosystem: A message source X of block length n is encrypted to a bitstream B using a secret key T available to the decoder through a secure channel. An eavesdropper has access to the bitstream B which is being sent over a public channel operating at rate R bits per source symbol. The goal is to design the system so that the decoder can recover the message source to an acceptable fidelity while providing security against eavesdropping and being efficient about utilizing system resources such as the bandwidth of the public channel and the cardinality of the key alphabet.

Footnote 1: The effects of channel noise, which can be dealt with using error-correcting channel codes, are not considered to be part of the cryptosystem in this work.

Footnote 2: We do not explicitly include the key distribution as part of the definition of the cryptosystem because, as will become clear in the sequel, good cryptosystems will have a uniform key distribution.

Associated with the source and reconstruction alphabets $\mathcal{X}$, $\hat{\mathcal{X}}$ is a per-symbol nonnegative distortion criterion $d : \mathcal{X} \times \hat{\mathcal{X}} \to \mathbb{R}^+$. The distortion criterion for a pair of n-length sequences belonging to $\mathcal{X}^n$ and $\hat{\mathcal{X}}^n$ respectively is taken to be additive in the components, i.e., $d_n(\mathbf{x}, \hat{\mathbf{x}}) := \frac{1}{n} \sum_{i=1}^{n} d(x_i, \hat{x}_i)$. The rate-distortion function of the source is the minimum number of bits/symbol needed to index reconstructions of the source so that the expected

per-symbol distortion is no more than D [4, Chapter 13]. We denote the rate-distortion function of the message source by $R_X(D)$ bits/symbol.

An eavesdropper has access to the public channel and strives to recover information about the message source from the encoded bitstream B. The goal is to design an encoder and a decoder such that an eavesdropper who has access to the public channel bitstream B, but not the key T, learns as little as possible about the message source on the average. The idea is to provide secrecy against ciphertext-only attacks. Associated with such a cryptosystem are several inter-related design and performance parameters of interest that one would like to optimize:
1) a measure of secrecy against eavesdropping discussed below,
2) the measure of the fidelity of the source reconstruction at the decoder given by $E[d_n(\mathbf{X}, \hat{\mathbf{X}})]$,
3) the number of bits per source symbol transmitted over the public channel given by R, and
4) the number of bits of randomness or uncertainty in the secret key as measured by the bits per source symbol needed to index the key. This is related to the cardinality of the key-alphabet $\mathcal{T}_n$. A more random key would impose, in general, a greater burden on the resources of the secure key-distribution channel.
A good system provides maximum secrecy with maximum fidelity using the least amount of system resources, i.e., the minimum number of bits/symbol R and the smallest key-alphabet necessary.

A. Notion of perfect secrecy

In his 1949 paper [7], Claude Shannon provided the first rigorous statistical treatment of secrecy. The idea is that an eavesdropper will learn nothing at all about the message source if the encoded bitstream is statistically independent of the source messages. An information-theoretic measure of the extent of the correlation between two random quantities introduced by Shannon is their mutual information [4, p. 18]. The larger the mutual information, the greater is the correlation. Mutual information is nonnegative and is zero if and only if the two associated random quantities are statistically independent.
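As a small numerical illustration of mutual information as a measure of dependence, the sketch below computes it for a finite joint distribution. The example pair (a message bit and its XOR with an independent key bit) and the function names are illustrative assumptions, not constructions from the paper.

```python
from math import log2

def mutual_information(joint):
    """I(A;B) in bits for a joint pmf given as {(a, b): probability}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p   # marginal of A
        pb[b] = pb.get(b, 0.0) + p   # marginal of B
    return sum(p * log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

def xor_joint(p_msg_one, p_key_one):
    """Joint pmf of (message bit, message XOR key) for independent bits."""
    joint = {}
    for x, px in ((0, 1 - p_msg_one), (1, p_msg_one)):
        for t, pt in ((0, 1 - p_key_one), (1, p_key_one)):
            pair = (x, x ^ t)
            joint[pair] = joint.get(pair, 0.0) + px * pt
    return joint
```

With a uniform key bit (`p_key_one = 0.5`) the output bit is independent of the message bit and the mutual information is zero up to floating-point error. A biased key leaves a positive mutual information, and a constant key (`p_key_one = 0`) reveals the message bit entirely, giving its full entropy.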
According to Shannon, a cryptosystem has (Shannon-sense) perfect secrecy if the encoded bitstream B is statistically independent of the message source $\mathbf{X}$ (when the secret key T is unavailable), i.e., if $I(\mathbf{X}; B) = 0$. In [8] Wyner introduced the following definition of perfect secrecy that we shall be using in this work:

Definition 3.2 (Measure of secrecy and Wyner-sense perfect secrecy) An information-theoretic measure of secrecy of a rate-R cryptosystem $(\mathcal{T}_n, e_n, g_n)$ of block length n is given by $\frac{1}{n} I(\mathbf{X}; B)$, where $I(\cdot\,;\cdot)$ stands for mutual information. A sequence of rate-R cryptosystems $\{(\mathcal{T}_n, e_n, g_n)\}_{n \in \mathbb{N}}$ is said to have Wyner-sense perfect secrecy if $\limsup \frac{1}{n} I(\mathbf{X}; B) = 0$.

We would like to point out that this is weaker than Shannon's definition of perfect secrecy because the independence between the source messages and the encoded bitstream holds only asymptotically as the block length goes to infinity. Shannon's definition is nonasymptotic, i.e., $I(\mathbf{X}; B) = 0$ should hold for every n. In [9] Maurer proposed another asymptotic notion of perfect secrecy which is stronger than Wyner's definition, but weaker than Shannon's. According to this notion, a sequence of cryptosystems has (Maurer-sense) perfect secrecy if $\limsup I(\mathbf{X}; B) = 0$. We do not know if our results will continue to hold under this stronger notion of perfect secrecy. However, the techniques that have been developed in [10] suggest that our results can be strengthened. Also see Remark 4.7.

An information-theoretic measure of the amount of uncertainty or randomness in the key is the compressibility of the key in bits per source symbol. This is governed by the entropy of the key per source symbol, $\frac{1}{n} H(T)$ [4, p. 13 and Chapter 5], which represents the minimum number of bits/symbol that would have to be supported by the secure key-distribution channel. It turns out that $\frac{1}{n} \log_2 |\mathcal{T}_n| \ge \frac{1}{n} H(T)$ with equality if and only if all the $|\mathcal{T}_n|$ values of the key are equally likely [4, Theorem 2.6.4, p. 27]. Thus key randomness directly impacts the cardinality of the key-alphabet needed.

The following theorem reveals certain important aspects of the trade-off between the various performance parameters of a cryptosystem that strives to achieve maximum secrecy with maximum efficiency for a maximum tolerable expected distortion. It is a straightforward extension of a similar result by Shannon [7] that assumed lossless recovery (zero distortion) of the source at the decoder.
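The proof of the theorem is presented in Appendix A and applies to both discrete and continuous alphabet sources.

Theorem 3.3 For a sequence of rate-R cryptosystems $\{(\mathcal{T}_n, e_n, g_n)\}_{n \in \mathbb{N}}$ where the key is drawn independently of the source, if $\limsup \frac{1}{n} I(\mathbf{X}; B) = 0$ and $\limsup E[d_n(\mathbf{X}, \hat{\mathbf{X}})] \le D < \infty$ then $R_X(D) \le R$ and $R_X(D) \le \liminf \frac{1}{n} H(T) \le \liminf \frac{1}{n} \log_2 |\mathcal{T}_n|$.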

Thus, in any cryptosystem that provides perfect secrecy (in the Shannon, Wyner, or Maurer sense) with expected distortion D, the key-alphabet must grow with block length at least as fast as $2^{nR_X(D)}$. Hence, there must be at least as many binary digits in the secret key as there are bits of information in the compressed message source if the cryptosystem provides perfect secrecy (in the Shannon, Wyner, or Maurer sense) with expected distortion less than or equal to D. Intuitively it is clear that if the key is chosen independently of the message source and the decoder is able to reconstruct the source to within an expected distortion D, the encoded bitstream rate R cannot be smaller than $R_X(D)$: the smallest number of bits needed to reconstruct the source with an expected distortion no more than D.

A cryptosystem is efficient if it operates at a rate close to $R_X(D)$ bits/symbol, using a key-alphabet whose size is close to $2^{nR_X(D)}$, and achieves an expected distortion less than or equal to D with almost perfect secrecy (in the Shannon, Wyner, or Maurer sense). The question of whether there exists an efficient cryptosystem having the smallest possible key-alphabet that provides perfect secrecy with maximum expected distortion D was also answered by Shannon for the case when D = 0 and involved the idea of separating the performance requirements into two parts: (i) efficient utilization of system resources through optimal source compression and (ii) encryption of the compressed bitstream with a Vernam one-time pad (a Bernoulli($\frac{1}{2}$) bitstream). A slightly general version of Shannon's solution involving nonzero distortion is shown in Figure 8.

[Figure 8: block diagram. Inside the encoder $e_n$, the message source X feeds a rate-distortion source encoder producing the bitstream C at rate $R_X(D) + \epsilon$, which is XORed with the key T to give B. Inside the decoder $g_n$, B is XORed with T to recover C, which a rate-distortion source decoder maps to $\hat{X}$. The eavesdropper observes B; T is shared over a Secure channel, with $T \sim \mathrm{Uniform}(\mathcal{T}_n)$, $\mathcal{T}_n = \{0,1\}^{n(R_X(D)+\epsilon)}$.]
Fig. 8. Shannon cryptosystem: Shannon's cryptosystem is efficient and achieves Shannon-sense perfect secrecy with expected distortion D with the smallest key alphabet.

Shannon's system meets all the four desirable attributes of a cryptosystem discussed
earlier. Clearly the bitrate of the output bitstream is $R = R_X(D) + \epsilon$. The expected distortion between $\mathbf{X}$ and

$\hat{\mathbf{X}}$ is no more than D because the decoder successfully recovers the rate-distortion compressed bitstream C. Since T is assumed to be uniformly distributed over its alphabet in Figure 8, the entropy of the key (and the size of the key-alphabet) in bits per source symbol is $R_X(D) + \epsilon$. Since $\mathbf{X}$ and T are independent, so are C and T. Hence,

$$P(B = b \mid C = c) = P(T = b \oplus c \mid C = c) = P(T = b \oplus c) = 2^{-n(R_X(D)+\epsilon)},$$

which does not depend on the value that C takes. Thus, the bitstreams B and C are independent without the key T. Since $\mathbf{X} \to C \to B$ form a Markov chain, by the data processing inequality (footnote 3), $I(\mathbf{X}; B) \le I(C; B) = 0$, i.e., the cryptosystem achieves Shannon-sense perfect secrecy for each n. We would like to note that in practice, the Vernam one-time pad would be simulated by a pseudo-random sequence and the seed of the pseudo-random generator would play the role of the key that is shared by the sender of messages and the intended recipient.

IV. COMPRESSION OF ENCRYPTED DATA

As motivated in the introduction, an interesting question that arises in the context of Shannon's cryptosystem above is whether it is possible to swap the operations of encryption and compression in a way that the resulting system continues to function as a good cryptosystem. To encrypt the source data directly before any compression, we need a notion of addition on a general alphabet $\mathcal{X}$ similar to the XOR operation (modulo-two addition) on binary data. Let $\mathcal{X}$ be endowed with a binary operator $\oplus$. The salient properties of the XOR operation on binary data that make things work are captured by the following requirements on the tuple $(\mathcal{X}, \oplus)$ (footnote 4): For all $x, y, z \in \mathcal{X}$, (i) $x \oplus y = y \oplus x$, and (ii) $x \oplus z = y \oplus z \Rightarrow x = y$.

Consider the system shown in Figure 9. In this system, the secret key-word $\mathbf{K}_T$ is selected randomly from the secret-key codebook $\mathcal{K}_n \subseteq \mathcal{X}^n$ of size $2^{nR'}$ according to a uniform distribution independent of the source sequence $\mathbf{X}$. Let $T \in \{0,1\}^{nR'}$ be the random variable (seed or key) which indicates which key-word was selected. Note that $R' \le \log_2 |\mathcal{X}|$.
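Requirements (i) and (ii) on the pair of alphabet and operator can be checked mechanically for a small finite alphabet. The sketch below does so by brute force and uses addition modulo m as one operator that satisfies them; the choice of modular addition and all function names are illustrative assumptions rather than the paper's construction.

```python
from itertools import product

def satisfies_requirements(alphabet, op):
    """Brute-force check of (i) commutativity and (ii) cancellation.

    Closure of `op` on `alphabet` is assumed and not checked here.
    """
    commutative = all(op(x, y) == op(y, x)
                      for x, y in product(alphabet, repeat=2))
    cancellative = all(x == y
                       for x, y, z in product(alphabet, repeat=3)
                       if op(x, z) == op(y, z))
    return commutative and cancellative

def add_mod(m):
    """Addition modulo m on the alphabet {0, ..., m-1}; XOR is the case m = 2."""
    return lambda a, b: (a + b) % m

def encrypt(x_seq, k_seq, op):
    """Component-wise encryption of a source block with a key-word."""
    return [op(x, k) for x, k in zip(x_seq, k_seq)]
```

For example, `satisfies_requirements(range(5), add_mod(5))` holds, whereas `max` on the same alphabet is commutative but fails the cancellation requirement (ii), so it could not be undone by the decoder.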
Footnote 3: This essentially states that successive stages of processing cannot increase the statistical correlation between the processed data and the raw data as measured by their mutual information. Specifically, if three random variables X, Y, and Z form a Markov chain $X \to Y \to Z$ then $I(X; Z) \le I(Y; Z)$ and $I(X; Z) \le I(X; Y)$ [4, p. 32].

Footnote 4: A tuple $(\mathcal{X}, \oplus)$ satisfying these requirements is called a commutative quasi-group when $\mathcal{X}$ is a finite set [11].

The key-word $\mathbf{K}_T$ is directly added to the source sequence $\mathbf{X}$ to produce the encrypted sequence $\mathbf{Y} = \mathbf{X} \oplus \mathbf{K}_T$, where the addition is component-wise. Let $B = i_n(\mathbf{Y})$ denote the encoded message bitstream

produced by the encoder. The decoder produces the reconstruction $\hat{\mathbf{X}} = g_n(B, \mathbf{K}_T)$. The average, per-component distortion is $E(d_n(\mathbf{X}, \hat{\mathbf{X}}))$. The encrypted sequence $\mathbf{Y}$ is compressed à la Wyner-Ziv (W-Z) [2] by exploiting the fact that the key-word, which is available to the decoder, is related to $\mathbf{Y}$. This leads us to the following definition.

[Figure 9: block diagram. Inside the encoder $e_n$, the message source X is encrypted by adding the key-word $K_T$ to give Y, which the compressor $i_n$ maps to the bitstream B at rate R. The decoder $g_n$ performs joint decryption and decoding of B using $K_T$ to produce $\hat{X}$. The eavesdropper observes B; T is shared over a Secure channel, with $T \sim \mathrm{Uniform}(\mathcal{T}_n)$, $\mathcal{T}_n = \{0,1\}^{nR'}$, and both sides hold the secret-key codebook $\mathcal{K}_n$.]
Fig. 9. Reversed cryptosystem: A cryptosystem where encryption precedes compression.

Definition 4.1 (Cryptographic Wyner-Ziv source code) A rate-$(R', R)$ cryptographic Wyner-Ziv source code of block length n is a triple $(\mathcal{K}_n, i_n, g_n)$ consisting of (i) a secret-key codebook $\mathcal{K}_n \subseteq \mathcal{X}^n$ such that $|\mathcal{K}_n| = 2^{nR'}$, (ii) an encoder map $i_n : \mathcal{X}^n \to \{0,1\}^{nR}$, and (iii) a decoder map $g_n : \{0,1\}^{nR} \times \mathcal{K}_n \to \hat{\mathcal{X}}^n$.

As in general cryptosystems, a sequence of cryptographic W-Z source codes $\{(\mathcal{K}_n, i_n, g_n)\}_{n \in \mathbb{N}}$ is said to have Wyner-sense perfect secrecy if $\limsup \frac{1}{n} I(\mathbf{X}; B) = 0$.

Definition 4.2 The triplet $(R', R, D)$ is said to be achievable with Wyner-sense perfect secrecy if there exists a sequence of rate-$(R', R)$ cryptographic W-Z codes having Wyner-sense perfect secrecy such that

$$\limsup E\left[d_n\big(\mathbf{X}, g_n(i_n(\mathbf{X} \oplus \mathbf{K}_T), \mathbf{K}_T)\big)\right] \le D.$$

The first parameter $R'$ in the triplet $(R', R, D)$ represents the encryption efficiency of the cryptosystem, i.e., the number of bits of randomness in the key per source symbol, which has direct bearing on the size of the key codebook. The second parameter R represents the compression efficiency of the cryptosystem, i.e., the number of bits of the output bitstream per symbol of the message source generated by the cryptosystem. The following theorem tells us what sort of encryption and compression rates, $R'(D)$ and $R(D)$ respectively, can definitely be achieved using a

cryptosystem having the structure of Figure 9 for a source reconstruction quality D while being able to achieve Wyner-sense perfect secrecy by using progressively longer block lengths for coding. The two corollaries following the theorem show that it is possible to compress encrypted data without any loss of encryption or compression efficiency with respect to a system where compression precedes encryption. These results constitute the main theoretical contribution of this work.

Theorem 4.3 Let X and K be drawn independently and i.i.d. with the (common) distribution $p_X(x)$, $Y := X \oplus K$, and

$$R_{WZ}(D) := \inf_{p_{U|Y},\, f} I(Y; U \mid K), \qquad (2)$$

where U is an auxiliary random variable taking values in an alphabet $\mathcal{U}$ and the minimization is over all conditional probability distributions $p_{U|Y}$, with $(X, K) \to Y \to U$ forming a Markov chain, and all functions $f : \mathcal{X} \times \mathcal{U} \to \hat{\mathcal{X}}$ such that

$$\sum_{x,k,u} p_X(x)\, p_X(k)\, p_{U|Y}(u \mid x \oplus k)\, d(x, f(k,u)) \le D.$$

Then, $(R_{WZ}(D), R_{WZ}(D), D)$ is achievable with Wyner-sense perfect secrecy.

The proof of this theorem for finite alphabets is presented in Appendix B. The proof for continuous alphabets (e.g., Gaussian sources) and unbounded distortion criteria (e.g., mean squared error) can be established along similar lines using the techniques in [2], [3]. We would like to note that there is no fundamental difficulty in carrying out this proof. The associated technical aspects are definitely important and nontrivial but only detract from the main concepts underlying the proof.

Theorem 4.3 tells us that the triple $(R_{WZ}(D), R_{WZ}(D), D)$ is achievable with Wyner-sense perfect secrecy by cryptosystems having the structure shown in Figure 9, but are there better encryption and compression rates that can be realized on these cryptosystems at the same distortion D while being able to achieve Wyner-sense perfect secrecy?
Remark 4.4 It can be shown that the achievable performance given by Theorem 4.3 is also the best possible for any cryptosystem having the structure shown in Figure 9, i.e., any system having this structure needs a rate of at least $R_{WZ}(D)$ bits per source symbol to achieve Wyner-sense perfect secrecy and expected distortion $D$. The proof

(omitted here) is along the lines of the proof of the optimality of W-Z distributed source coding in [2], [4].

For general distortion criteria and source distributions the W-Z cryptosystems can suffer from some loss of compression efficiency, i.e., $R_{WZ}(D) > R_X(D)$ (but no loss of Wyner-sense perfect secrecy), with respect to the Shannon-type cryptosystems. However, as discussed below, in two important cases of interest W-Z cryptosystems are efficient.

Corollary 4.5 (Zero distortion, i.e., lossless recovery of data) If $\mathcal{X} = \hat{\mathcal{X}}$ are countable alphabets and the distortion criterion satisfies $d(x,x) = 0$ for all $x \in \mathcal{X}$ and $d(x,\hat{x}) > 0$ for all $x \neq \hat{x}$, then $(H(X), H(X), 0)$ is achievable with Wyner-sense perfect secrecy. Furthermore, $R_1 = R = H(X)$ cannot be improved upon by any cryptosystem (not necessarily having the structure of Figure 9) when it is required that the message source be recovered losslessly, i.e., $D = 0$.

Proof: The achievability can be proved from Theorem 4.3 along the lines of Remark 3 in [2, pg. 3], where it is shown that $R_{WZ}(0) = H(Y \mid K) = H(X)$. Since $R_X(0) = H(X)$ [4, Chapter 5], $R_1 = R = H(X)$ cannot be improved for lossless recovery of the source as per Theorem 3.3.

Corollary 4.6 (Gaussian sources) When $X$ is Gaussian, i.e., $X \sim \mathcal{N}(0, \sigma^2)$, $\mathcal{X} = \hat{\mathcal{X}} = \mathbb{R}$, and the distortion criterion is squared error, i.e., $d(x,\hat{x}) = (x - \hat{x})^2$, then $(R_X(D), R_X(D), D)$ is achievable with Wyner-sense perfect secrecy by a W-Z cryptosystem for any target distortion $D$. Hence, W-Z cryptosystems are optimal in every sense for Gaussian sources and squared error distortion.

Proof: Here, $R_X(D) = \frac{1}{2} \log \frac{\sigma^2}{D}$, $D \le \sigma^2$, is the rate-distortion function of a Gaussian source with variance $\sigma^2$ [4, p. 344]. Achievability follows from Theorem 4.3 by choosing $U = Y + Z$, $Z \sim \mathcal{N}(0, \sigma_Z^2)$, $Z \perp Y$, and $f(k,u) = E(X \mid K = k, U = u)$, where $\sigma_Z^2$ is chosen such that $E\left((X - f(K,U))^2\right) = D$, i.e., $\sigma_Z^2 = \frac{\sigma^2 D}{\sigma^2 - D}$. With these choices it can be shown that $I(Y; U \mid K) = \frac{1}{2} \log \frac{\sigma^2}{D}$. The optimality follows again from Theorem 3.3.
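The steps that the proof of Corollary 4.6 leaves as "it can be shown" can be sketched as follows. This is a sketch under two assumptions that the text does not spell out at this point: the encryption is real addition, $Y = X + K$ (as in the stream cipher of Section V), and $Z$ is independent of the pair $(X, K)$, which is what the Markov chain $(X,K) \to Y \to U$ requires for $U = Y + Z$.

```latex
\begin{align*}
U - K &= X + Z
  && \text{(the decoder removes the key from } U\text{)} \\
f(k,u) &= E(X \mid K = k,\, U = u)
        = \frac{\sigma^2}{\sigma^2 + \sigma_Z^2}\,(u - k)
  && \text{(linear MMSE estimate of } X \text{ from } X + Z\text{)} \\
E\big((X - f(K,U))^2\big) &= \frac{\sigma^2 \sigma_Z^2}{\sigma^2 + \sigma_Z^2} = D
  \;\Longleftrightarrow\;
  \sigma_Z^2 = \frac{\sigma^2 D}{\sigma^2 - D} \\
I(Y; U \mid K) &= I(X + K;\, X + K + Z \mid K) = I(X;\, X + Z) \\
  &= \frac{1}{2}\log\Big(1 + \frac{\sigma^2}{\sigma_Z^2}\Big)
   = \frac{1}{2}\log\Big(1 + \frac{\sigma^2 - D}{D}\Big)
   = \frac{1}{2}\log\frac{\sigma^2}{D}
\end{align*}
```

The last line does not involve the variance of $K$ at all, which is consistent with the simulation results of Section V.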
The above corollary shows that for Gaussian sources, cryptographic Wyner-Ziv systems are as efficient as source-coding-followed-by-encryption systems in terms of compression rate and the requirements on the secret key.

Remark 4.7 For finite alphabets, it is possible to guarantee the stronger notion of Shannon-sense perfect secrecy for the system of Figure 9 if one is willing to sacrifice key-efficiency (measured by $R_1$). Specifically, let the variable $K$ in

Theorem 4.3 be distributed according to $\mathrm{Uniform}(\mathcal{X})$ instead of $p_X(x)$ and let $R_{WZ}(D)$ denote the corresponding rate as in equation (2). Then $(\log_2 |\mathcal{X}|, R_{WZ}(D), D)$ is achievable with Shannon-sense perfect secrecy. The proof of this result is along the same lines as that of Theorem 4.3 (see Appendix B). The only additional condition that needs to be checked is whether Shannon-sense perfect secrecy is attainable. This is verified by an argument which parallels the one for the Shannon cryptosystem of Figure 8.

Example: With reference to Figure 9, let $n = 3$, $\mathcal{X}^n = \hat{\mathcal{X}}^n = \mathcal{K}_n = \mathcal{T} = \{0,1\}^3$ and $X^n \sim \mathrm{Uniform}(\{000, 001, 010, 100\})$. Hence, $X^n$ is a correlated sequence of three bits where it is known that at most one bit of $X^n$ is equal to one. Clearly, $R_X(0) = H(X^n) = 2$ bits, $R_1 = \log_2 |\mathcal{X}| = 1$, and $K_T$ is a Vernam one-time pad with $K_T$ uniformly distributed over $\{0,1\}^3$. For the encryption system of Figure 9, $Y^n = X^n \oplus K_T$, so that $Y^n$ and $K_T$ differ in at most one out of their three bits. Hence, if the coset-codebook of Figure 4 is used for the compression box $i_n$ of Figure 9, identifying $X$ and $S$ in Figure 4 respectively with $Y^n$ and $K_T$, only two bits (equal to $H(X^n)$) are needed to represent $B$ (the output of $i_n$) even though the compression box does not have access to the secret key $K_T$. Here, $B$ is the index of the coset (in bits) to which $Y^n$ belongs. The decoder $g_n$ first recovers $Y^n$ by finding the 3-bit codeword in the coset indexed by $B$ which is closest to the key $K_T$ available to it. Finally, $X^n$ is recovered from $Y^n$ and $K_T$ as $\hat{X}^n = X^n = Y^n \oplus K_T$. Hence, the compression and secrecy performance of this system matches that of the Shannon cryptosystem of Figure 8, where $X^n$ is first compressed to two bits and then encrypted with a Vernam one-time pad. However, the Shannon system is more efficient in terms of the length of the Vernam one-time pad needed. The Shannon cryptosystem needs a one-time pad of length two whereas the system of Figure 9 needs a one-time pad of length three.

So far we considered cryptosystems where we had control over the design of both the encryption and the compression components.
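Returning briefly to the three-bit example above, it is small enough to check exhaustively. The following is a minimal sketch, not the authors' implementation. It assumes that the coset codebook of Figure 4 (not reproduced in this section) is the set of four cosets of the length-3 repetition code {000, 111}, so that the coset index is the 2-bit syndrome; the function names are illustrative.

```python
from collections import Counter
from itertools import product

# Source words from the example: three bits, at most one of them set.
SOURCE_WORDS = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)]
ALL_WORDS = list(product((0, 1), repeat=3))  # also the set of possible keys


def xor(a, b):
    """Bitwise XOR of two equal-length bit tuples."""
    return tuple(x ^ y for x, y in zip(a, b))


def syndrome(y):
    """2-bit syndrome of y w.r.t. the repetition code {000, 111} (assumed codebook)."""
    return (y[0] ^ y[1], y[1] ^ y[2])


def hamming(a, b):
    """Number of positions in which a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def encode(x, key):
    """Encrypt with the one-time pad; the compressor then sees only y, never the key."""
    y = xor(x, key)
    return syndrome(y)


def decode(b, key):
    """Joint decompression and decryption with the key as side information."""
    coset = [w for w in ALL_WORDS if syndrome(w) == b]
    y_hat = min(coset, key=lambda w: hamming(w, key))  # coset member nearest the key
    return xor(y_hat, key)


if __name__ == "__main__":
    ok = all(decode(encode(x, k), k) == x for x in SOURCE_WORDS for k in ALL_WORDS)
    print("lossless for every source word and key:", ok)
    for x in SOURCE_WORDS:
        # Distribution of the 2-bit output over a uniformly chosen key.
        print(x, dict(Counter(encode(x, k) for k in ALL_WORDS)))
```

Each coset holds two words at Hamming distance three, and the encrypted word differs from the key in at most one bit, so the nearest coset member is always the encrypted word. The second loop counts, for each source word, how often each 2-bit output occurs over all eight keys; under the assumed codebook every output occurs equally often for every source word, which is the Shannon-sense secrecy property claimed in the example.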
An interesting question is how much compression we can achieve if the encryption scheme is pre-specified by some user. Let us look at this situation in more detail for the case when the source is required to be reproduced at the decoder losslessly. Let $F_T : \mathcal{X}^n \to \mathcal{Y}^n$ be a pre-designed encryption map which is parameterized by a secret key $T$. We only require that there be a corresponding decryption map $G_T : \mathcal{Y}^n \to \mathcal{X}^n$ such that $G_T(F_T(\cdot))$ is the identity map. As before, let us assume that the components of the message sequence $X^n$ are i.i.d. with distribution $p_X(x)$. The pre-specified encryption box can in general produce an output sequence $Y^n$ whose

components are correlated across time. To compress such a sequence, one would need to exploit the dependence between the output and the key that is available to the decoder. We know how to do this for the case when the output is i.i.d. by using the Slepian-Wolf distributed source coding theorem [1]. The results of Slepian and Wolf can be generalized to the case where the input to the distributed compression box has memory [4]. This would give us an encoding-decoding scheme which would work at the entropy rate of the unencrypted source, $H(X)$, but still recover $X^n$ with high probability for sufficiently large $n$. But what happens to the secrecy performance if we concatenate a pre-specified encryption box (which outputs $Y^n$) with the generalized Slepian-Wolf compression box (which outputs $B$) that has been tailored to provide optimal compression performance? By the data processing inequality (cf. last footnote of Section III) we have $I(X^n; B) \le I(X^n; Y^n)$. Very often the inequality is strict. Thus, compression will preserve, if not enhance, the information-theoretic secrecy.

V. COMPUTER SIMULATION RESULTS

Up to this point, our focus has been on the theoretical aspects of the problem of compressing encrypted data, in particular on the performance that can be theoretically achieved. In this section, we consider real systems that implement the compression of encrypted data. We discuss the codes used to construct such systems and give computer simulations of their compression performance. We will describe systems for both lossless and lossy compression.

A. Encryption and distributed lossless compression of bilevel images

In the following example, the bilevel image in Figure 10 is encrypted and then compressed. For the purpose of illustration, this image is treated as an i.i.d. string of 10,000 binary digits (each pixel of the image is one binary digit, where a filled pixel corresponds to a one and an unfilled pixel corresponds to a zero), disregarding any spatial correlation or structure that is evidently present in such a natural image.
Thus, for the purpose of this example, the source is not an image but is represented as such in order to aid the reader's understanding, and shall henceforth be referred to as a string to highlight this fact. It is possible to design distributed source codes that can exploit the spatial correlation structures in natural images, much like the Lempel-Ziv algorithm [5] and its variants exploit

context information for compressing files. However, this is beyond the scope of this work. The methods used in these examples were developed specifically in [6], but are strongly related to a significant body of work [7] [2].

Fig. 10. Bilevel image used in the computer simulation: For the purpose of display, the bit 1 is mapped to the gray-scale value 0 and the bit 0 is mapped to the gray-scale value 255. Natural images, such as the cartoon shown here, have considerable memory (correlation) across spatial locations, as evidenced by the presence of significant 2-D structure that is easily recognized by humans. For the purpose of simulation though, the pixel values (taken in the raster-scan order) are assumed to be i.i.d. Bernoulli random variables. The image has 706 nonzero entries, corresponding to an empirical first-order entropy of about $H(X) = 0.37$ bits/pixel.

The string that is depicted as an image in Figure 10 has 706 nonzero entries, corresponding to an empirical first-order entropy of about $H(X) = 0.37$ bits/pixel. The string is encrypted by adding a unique pseudo-random Bernoulli($\frac{1}{2}$) string of the appropriate length. The encrypted string has an empirical first-order entropy of about $H(Y) = 1.0$ bit/pixel. A traditional compression approach, which treats the data as originating from an i.i.d. binary source, would consider the encrypted string to be incompressible.

The encrypted string is compressed by finding its syndrome with respect to a rate-$\frac{1}{2}$ LDPC channel code [6]. That is, a string of length $n$ is multiplied by an $(n,k)$ code's parity check matrix $H$ of dimension $(n-k) \times n$ to obtain an output string of length $(n-k)$. Thus an $(n,k)$ LDPC code is used to compress an encrypted string to rate $(n-k)/n$. Via this multiplication, the encrypted code space is broken into cosets. These cosets consist of all encrypted strings with the same syndrome with respect to the chosen LDPC code, and the cosets are indexed by that common syndrome. By breaking the space into cosets in this manner, we ensure that in each coset there will


be only one element which is jointly typical with the key.

Fig. 11. Compressing encrypted images, example: An image (at left) is first encrypted by adding a Bernoulli($\frac{1}{2}$) bit-sequence generated by a pseudo-random key to produce the second image. The result is then compressed by a factor of two using practical distributed source codes developed in [6] to produce the third compressed and encrypted image bitstream. For the purpose of display, the encrypted and compressed bitstream has been arranged into the rectangular shape shown here. Finally, the compressed bits are simultaneously decompressed and decrypted using an iterative decoding algorithm provided in [6] to obtain the last image. The decoded image is identical to the original.

At the receiver, the compressed data is decoded with a DISCUS-type decoder [3] by using the key bit-sequence as side information. The decoder makes use of the fact that the encrypted source bit-sequence and the key are correlated. That is, the key can be seen as a noisy version of the encrypted sequence. Under this view the goal of decoding can be seen as finding the nearest codeword to the key residing within the coset specified by the compressed encrypted sequence. Knowledge of the correlation between the encrypted string and the key (which is equivalent to knowledge of the source statistics) and the syndrome (bin-index or coset-index) of the encoded data is exploited by a belief propagation algorithm [22], [23] to recover exactly the encrypted sequence. Belief propagation is an iterative algorithm that operates over graphical models and converges upon the marginal distributions for each of the unknown bits, from which the bits can be estimated. The algorithm is exact over trees, but in practice performs quite well for sparsely loopy graphs such as LDPC codes. The instance of belief propagation used is nearly identical to that used for decoding standard LDPC codes, but with some adaptations. First, the check node update rule is modified to incorporate the knowledge of the syndrome of the encrypted word.
Second, initial marginal distributions of the encrypted bits are obtained based on the knowledge of the key and its correlation to the encrypted string. Finally, with knowledge of the key and the encrypted sequence the decryption is a trivial matter and is considered to be a part of the decoding process. Using this algorithm, the string in Figure 11 is perfectly decoded in 3 iterations.
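The entropy figures quoted for the string and its encryption, and the way the key supplies the initial marginals for the decoder, can be illustrated with a short sketch. It is not the code of [6]: the positions of the 706 ones are shuffled at random rather than taken from the actual image, the seed is arbitrary, and the log-likelihood-ratio convention (positive means "bit 0 is more likely") is an assumption made for the illustration.

```python
import math
import random


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) source."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def empirical_entropy(bits):
    """First-order empirical entropy of a bit string, in bits per symbol."""
    return binary_entropy(sum(bits) / len(bits))


rng = random.Random(0)        # arbitrary seed, for repeatability only
n, ones = 10_000, 706         # string length and number of nonzero entries
source = [1] * ones + [0] * (n - ones)
rng.shuffle(source)           # stand-in for the real pixel positions

key = [rng.getrandbits(1) for _ in range(n)]       # Bernoulli(1/2) key stream
encrypted = [s ^ k for s, k in zip(source, key)]   # XOR encryption

# Initial marginals at the decoder: given key bit k_i, the encrypted bit
# equals k_i with probability 1 - p and differs from it with probability p.
p = ones / n
prior = math.log((1 - p) / p)
initial_llr = [prior if k == 0 else -prior for k in key]

# Hard decisions from the initial marginals simply reproduce the key, so the
# key behaves like a noisy copy of the encrypted string with `ones` bit errors.
hard = [0 if llr > 0 else 1 for llr in initial_llr]
mismatches = sum(h != y for h, y in zip(hard, encrypted))

if __name__ == "__main__":
    print("H(X) estimate:", round(empirical_entropy(source), 3))
    print("H(Y) estimate:", round(empirical_entropy(encrypted), 3))
    print("positions where the key disagrees with the encrypted string:", mismatches)
```

The belief propagation iterations then only have to correct those disagreements, using the syndrome as the constraint.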

Samples of the best estimate at each stage of the iterative algorithm are provided in Figure 12.

Fig. 12. Convergence of decoded estimate: The best estimate of the image at the end of the specified number of iterations at the decoder (cf. Figure 11). Clearly, the initial estimate is quite grainy, but converges rapidly towards the solution.

B. Encryption and distributed lossy compression of real valued data

In this section, we provide simulations of the compression of an encrypted real-valued data source. In these experiments, the data was an i.i.d. Gaussian sequence with variance 1.0. The data was encrypted with a stream cipher. A key sequence, of the same length as the data sequence, was added to the data on a sample-by-sample basis. The key was an i.i.d. Gaussian sequence, independent of the data. Our simulations show the compression performance of the scheme as a function of the variance of the key sequence.

Clearly, an i.i.d. Gaussian sequence is not a good model for real world signals such as natural images. However, more complex models that incorporate Gaussian variables, such as cascades of Gaussian scale mixtures [24], have been shown to be good models of natural signals. While this work focuses on the problem of compressing encrypted data and not modeling of signals, we believe that constructing codes for an i.i.d. Gaussian sequence is an initial step toward developing a system that can be used with a more complicated source.

Our encoder compresses the encrypted data to a rate of 1 bit/sample. In the first stage of the encoder, each sample in the encrypted data sequence is quantized with a scalar quantizer. We will provide simulation results for three different values of the step size of the scalar quantizer. The reconstruction levels of the scalar quantizer are labeled

with numbers in the set {1,2,3,4}, with the labels assigned to the reconstruction levels in a cyclic manner. Each quantized sample is then replaced with the 2-bit binary representation of its label, resulting in a binary sequence that is twice as long as the original real-valued data sequence. Finally, we find the syndrome of this binary sequence with respect to a rate-$\frac{1}{2}$ trellis code [6]. The syndrome is the output of the encoder, which is transmitted to the decoder. In our simulations, we used a 64-state trellis code in the encoder. Since we use a rate-$\frac{1}{2}$ code, the length of the syndrome is half of the length of the binary input. Hence, the syndrome is a binary sequence of the same length as the encrypted data sequence. The encrypted data has been compressed by the scalar quantizer and trellis code to the rate 1 bit/sample.

The decoder has access to the syndrome transmitted by the encoder, as well as the key sequence used in the stream cipher. The decoder considers the set of real-valued sequences which take on values from the set of reconstruction levels of the scalar quantizer. The decoder looks at the subset of such sequences whose syndrome is the same as the syndrome sent by the encoder, and then finds the sequence in that subset which is closest to the key sequence. At this point, the decoder has two estimates of the encrypted data sequence. It has the output of the trellis decoder and it has the key sequence, which can be thought of as a noisy version of the encrypted data where the noise is the original, unencrypted data. The decoder combines these two estimates to form the optimal estimate of the encrypted data. Finally, it subtracts the key sequence to obtain the optimal estimate of the original data.

Our simulations measure the performance of our scheme by computing the distortion and the probability of error in the trellis decoder as a function of the variance of the key sequence. For each value of the key variance, we ran 500 trials, where each trial consisted of a block of 2000 symbols.
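The role of the cyclic labels can be illustrated with a deliberately simplified sketch. It is not the scheme simulated here: the trellis code and its syndrome are omitted, so the label of each sample is sent directly at 2 bits/sample instead of 1 bit/sample, and the decoder returns the quantizer level minus the key rather than the optimal combined estimate. The labels are 0 to 3 rather than 1 to 4, and the step size, block length, seed and function names are all hypothetical.

```python
import random


def encode(y, step):
    """Quantize one encrypted sample and keep only the cyclic label of its level."""
    return round(y / step) % 4


def decode(label, k, step):
    """Choose the level carrying this label that lies nearest the key sample, then decrypt."""
    q = label + 4 * round((k / step - label) / 4)  # level index congruent to label mod 4
    return q * step - k                            # subtract the key (stream-cipher decryption)


def mse(key_std, step=3.0, n=2000, seed=0):
    """Mean squared error for unit-variance Gaussian data and a Gaussian key."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)       # data sample, variance 1.0
        k = rng.gauss(0.0, key_std)   # key sample, independent of the data
        x_hat = decode(encode(x + k, step), k, step)
        total += (x - x_hat) ** 2
    return total / n


if __name__ == "__main__":
    for key_std in (3.0, 10.0, 30.0):
        print("key std", key_std, "-> MSE", round(mse(key_std), 3))
```

Levels that share a label are four steps apart, so the decoder picks the right one whenever the quantized encrypted sample lies within two steps of the key sample, which holds whenever the data sample is smaller in magnitude than 1.5 steps. That condition involves the data and the step size but not the key variance, which is the behaviour the simulations are designed to show.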
We present plots of the mean squared error distortion in Figure 13(a) and of the probability of error in the TCM decoder in Figure 13(b) versus the variance of the key sequence. On each plot there are three lines, which represent the performance for three different scalar quantizer step sizes. The plots show that the distortion and probability of error do not change as we change the variance of the key. The performance of our encoder/decoder pair depends only on the source, not on the side information. We note that because the data has a variance of 1.0 and we are compressing it to a rate of 1 bit/sample, the minimum possible distortion is $\sigma^2 2^{-2R} = 0.25$. This result follows from standard rate-distortion theory [4]. The distortions

that we achieved for the various step sizes are in the range of 0.5 to 0.6, which is about 3 to 3.8 dB above the rate-distortion minimum. In these experiments, the bit error rate was in the range of $10^{-3}$ to $10^{-4}$.

Fig. 13. Compression of encrypted Gaussian data: An i.i.d. Gaussian data sequence, with variance 1.0, is encrypted with an i.i.d. Gaussian key sequence, with variance as indicated by the horizontal axis, and then compressed. The three lines indicate three different quantizer step sizes used in the compressor. (a) Mean squared error distortion as a function of key variance. (b) Probability of decoding error in the trellis as a function of key variance. [Plots: (a) Distortion versus Key Variance (dB); (b) Probability of Error versus Key Variance (dB); one curve per step size.]

The goal of these simulations was to show that we can compress the encrypted data with the same efficiency, regardless of the key sequence. In particular, the variance of the key sequence can be chosen as a function of the security requirements of the system, and the compression gain will not be affected. The performance of our scheme depends only on the statistics of the source, not the key. Our aim was not to compress the encrypted data to the bound provided by the Wyner-Ziv theorem, but to demonstrate that increasing the variance of the key sequence does not affect the distortion or probability of decoding error. In order to compress the source to a distortion close to the bound, it would be necessary to use a more powerful channel code in our scheme, such as the codes described in [25].

VI. CONCLUDING REMARKS

In this work, we have examined the possibility of first encrypting a data stream and then compressing it, such that the compressor does not have knowledge of the encryption key. The encrypted data can be compressed using
