EXPURGATED GAUSSIAN FINGERPRINTING CODES. Pierre Moulin and Negar Kiyavash

Similar documents
A Framework for Optimizing Nonlinear Collusion Attacks on Fingerprinting Systems

The Method of Types and Its Application to Information Hiding

Channel Dispersion and Moderate Deviations Limits for Memoryless Channels

New Traceability Codes against a Generalized Collusion Attack for Digital Fingerprinting

Lecture 4 Noisy Channel Coding

Achievable Error Exponents for the Private Fingerprinting Game

Lecture 7 Introduction to Statistical Decision Theory

Soft Covering with High Probability

Dispersion of the Gilbert-Elliott Channel

Lecture 8: Information Theory and Statistics

Multiaccess Channels with State Known to One Encoder: A Case of Degraded Message Sets

The Capacity of Finite Abelian Group Codes Over Symmetric Memoryless Channels Giacomo Como and Fabio Fagnani

Error Exponent Region for Gaussian Broadcast Channels

Mismatched Multi-letter Successive Decoding for the Multiple-Access Channel

A Tight Upper Bound on the Second-Order Coding Rate of Parallel Gaussian Channels with Feedback

Separating codes: constructions and bounds

Lattices for Distributed Source Coding: Jointly Gaussian Sources and Reconstruction of a Linear Function

Chapter 9 Fundamental Limits in Information Theory

The Poisson Channel with Side Information

Generalized Writing on Dirty Paper

Generalized hashing and applications to digital fingerprinting

Superposition Encoding and Partial Decoding Is Optimal for a Class of Z-interference Channels

Practical Polar Code Construction Using Generalised Generator Matrices

4488 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 54, NO. 10, OCTOBER /$ IEEE

WATERMARKING (or information hiding) is the

Notes 3: Stochastic channels and noisy coding theorem bound. 1 Model of information communication and noisy channel

Performance-based Security for Encoding of Information Signals. FA ( ) Paul Cuff (Princeton University)

Statistical Modeling and Analysis of Content Identification

Lower Bounds on the Graphical Complexity of Finite-Length LDPC Codes

ELEC546 Review of Information Theory

Entropy, Inference, and Channel Coding

IEEE Proof Web Version

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 3, MARCH

Lecture 6 Channel Coding over Continuous Channels

THIS paper is aimed at designing efficient decoding algorithms

Sequential and Dynamic Frameproof Codes

Lecture 5 Channel Coding over Continuous Channels

Capacity of AWGN channels

Upper Bounds on the Capacity of Binary Intermittent Communication

A Novel Asynchronous Communication Paradigm: Detection, Isolation, and Coding

A New Achievable Region for Gaussian Multiple Descriptions Based on Subset Typicality

Improved Versions of Tardos Fingerprinting Scheme

Lecture 4 Channel Coding

(each row defines a probability distribution). Given n-strings x X n, y Y n we can use the absence of memory in the channel to compute

Representation of Correlated Sources into Graphs for Transmission over Broadcast Channels

IN this paper, we consider the capacity of sticky channels, a

Optimal Power Control in Decentralized Gaussian Multiple Access Channels

Cover s Open Problem: The Capacity of the Relay Channel

Guesswork Subject to a Total Entropy Budget

A proof of the existence of good nested lattices

Lecture 8: Information Theory and Statistics

Detection Games under Fully Active Adversaries. Received: 23 October 2018; Accepted: 25 December 2018; Published: 29 December 2018

Collusion Resistance of Digital Fingerprinting Schemes

Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions

DIGITAL media protection has become an important issue

Channels with cost constraints: strong converse and dispersion

IN [1], Forney derived lower bounds on the random coding

Finding the best mismatched detector for channel coding and hypothesis testing

Secure Degrees of Freedom of the MIMO Multiple Access Wiretap Channel

2012 IEEE International Symposium on Information Theory Proceedings

Bounds on the Maximum Likelihood Decoding Error Probability of Low Density Parity Check Codes

Approaching Blokh-Zyablov Error Exponent with Linear-Time Encodable/Decodable Codes

Codes for Partially Stuck-at Memory Cells

Information Hiding and Covert Communication

STRONG CONVERSE FOR GEL FAND-PINSKER CHANNEL. Pierre Moulin

Uncertainty. Jayakrishnan Unnikrishnan. CSL June PhD Defense ECE Department

EE376A - Information Theory Final, Monday March 14th 2016 Solutions. Please start answering each question on a new page of the answer booklet.

Phase Transitions of Random Codes and GV-Bounds

FRAMES IN QUANTUM AND CLASSICAL INFORMATION THEORY

Binary Transmissions over Additive Gaussian Noise: A Closed-Form Expression for the Channel Capacity 1

Variable Length Codes for Degraded Broadcast Channels

An Achievable Error Exponent for the Mismatched Multiple-Access Channel

Compressibility of Infinite Sequences and its Interplay with Compressed Sensing Recovery

Arimoto Channel Coding Converse and Rényi Divergence

A Relationship between Quantization and Distribution Rates of Digitally Fingerprinted Data

Optimal Decentralized Control of Coupled Subsystems With Control Sharing

A New Metaconverse and Outer Region for Finite-Blocklength MACs

FORMULATION OF THE LEARNING PROBLEM

STAT 331. Martingale Central Limit Theorem and Related Results

On Gaussian MIMO Broadcast Channels with Common and Private Messages

Information Theoretical Analysis of Digital Watermarking. Multimedia Security

Channel Polarization and Blackwell Measures

Shannon meets Wiener II: On MMSE estimation in successive decoding schemes

On the Capacity Region of the Gaussian Z-channel

Introduction to Probability

Lecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics. 1 Executive summary

Rényi Information Dimension: Fundamental Limits of Almost Lossless Analog Compression

STAT 200C: High-dimensional Statistics

Solving Corrupted Quadratic Equations, Provably

Guess & Check Codes for Deletions, Insertions, and Synchronization

2804 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 56, NO. 6, JUNE 2010

PCA with random noise. Van Ha Vu. Department of Mathematics Yale University

Lecture 7 September 24

On Third-Order Asymptotics for DMCs

Towards practical joint decoding of binary Tardos fingerprinting codes

Asymptotically Optimum Universal Watermark Embedding and Detection in the High SNR Regime

MULTIMEDIA forensics [1] is a new discipline aiming

Quantization Index Modulation using the E 8 lattice

Capacity of Memoryless Channels and Block-Fading Channels With Designable Cardinality-Constrained Channel State Feedback

Transcription:

EXPURGATED GAUSSIAN FINGERPRINTING CODES Pierre Moulin and Negar iyavash Beckman Inst, Coord Sci Lab and ECE Department University of Illinois at Urbana-Champaign, USA ABSTRACT This paper analyzes the performance of collusion attacks on random fingerprinting codes, when the colluders are subject to an almostsure squared distortion constraint and a list decoder is used We derive an exact characterization of the type-i and type-ii error exponents of the fingerprinting system A Gaussian ensemble and an expurgated Gaussian ensemble of codes are considered, and the corresponding random-coding exponents are derived Explicit optimal strategies for the colluders are derived as well Index Terms: Digital fingerprinting, random codes, large deviations 1 INTRODUCTION Digital fingerprinting systems can be used for traitor tracing or digital rights management applications A length-n realvalued signal is to be protected and distributed to M users Some of the users of them may collude and process their copies to create a forgery that contains only weak traces of their fingerprints This problem was first posed by Cox et al 1] who proposed the use of Gaussian fingerprints for this purpose Specifically, their fingerprints were drawn randomly from an iid Gaussian distribution; the fingerprint code is shared with the decoder but not revealed to the users A fundamental question is what are the optimal performance limits for detection of colluders To make the problem nontrivial, one may assume embedding distortion constraints on the fingerprinter and the colluders Example of this analysis include 2, 3] for the case of signals defined over finite alphabets, and 4, 5] for the case of real-valued signals In the latter case, an obvious but not necessarily optimal strategy for the colluders is to perform a uniform linear average of their copies and add iid Gaussian noise; this strategy was examined in the above papers Possible improvements for the attackers consist of developing nonlinear order-statistics attacks 6, 7] Our study aims at developing a comprehensive detectiontheoretic analysis of collusion attacks and identifying an optimal strategy for the colluders The analysis is rooted in largedeviations theory Initial results were reported in 8, 9] In This research was supported in part by NSF grant CCR 03-25924 this paper some of the restrictive assumptions in 8, 9] are relaxed, including the assumption that the noise introduced by the colluders is white and Gaussian We also use a list decoder that returns a list of guilty users We consider two random ensembles of fingerprinting codes The first one is the same as the one used by Cox 1] and other researchers and is shown to be less performant than the second one, which is an expurgated ensemble bad codes are eliminated The decoder has access to a forgery as well as to the host signal nonblind detection and performs a binary hypothesis test on each user to determine whether that user was involved in the forgery The cost functions in this problem are the imum type-i and type-ii probabilities of error, which the colluders want to imize Throughout this paper, we use boldface uppercase letters to denote random vectors, uppercase letters for the components of the vectors, and calligraphic fonts for sets We use the symbol E to denote mathematical expectation For any collection of vectors x 1,, x M, we denote by x = x k, k the restriction of this collection to its components k The symbols fn gn and fn gn asymptotic equality mean that lim N fn gn = 0 and lim N fn gn = 1, respectively The symbols fn = gn and fn < gn denote asymptotic equality and inequality on the exponential scale: ln fn ln gn and fn gn, respectively The Gaussian distribution with mean zero and covariance matrix R is denoted by N 0, R 2 PROBLEM STATEMENT The mathematical setup of the problem is diagrammed in Fig 1 21 Fingerprint Generation and Embedding The host signal is a sequence S = S1,, SN in R N, viewed as deterministic but unknown to the colluders Fingerprints are added to S, and the marked copies of the signal are distributed to M users Specifically, user m is assigned a marked copy X m = S U m where m 1,, M and U m R N is the fingerprint assigned to user m The fingerprints U 1,, U M form a N, M fingerprinting code C The rate of the code is R N log M In a = 1 N

Signal S fingerprints U 1 U 2 U U M X 1 X 2 X X M Attack Channel A y x1,, x Forgery Fig 1 The fingerprinting process and the attack channel typical signal fingerprinting application, N 10 3 10 9 and M 2 10 9 not to exceed the number of humans; therefore we focus our attention on the case of zero-rate codes: lim N R N = 0 The code C is selected independently of S from a random ensemble of codes, C, such that E C U m 2 ] = N, m, ie, the expected mean-squared distortion is equal to The random ensembles C considered in this paper are permutationinvariant, ie, C C πc C where π is a permutation of 1,, N, and all N! permutations have the same probability Moreover, the random ensembles are invariant to permutation of the users 22 Attack Model Denote by 1, 2,, M the coalition, ie, the index set of the colluding users Their coalition has cardinality M They select a conditional pdf A Y X and draw a pirated copy, or forgery, Y R N, from that distribution We write Y A X 1 Consider the following constraints on the attack channel A A1 Location-Invariant constraint: Ay x = Ay s x s A2 Almost-Sure Mean-Squared Distortion constraint: Y 1 X k 2 ND c k Y A3 Fairness constraint: Ay x π = Ay x for all permutations π of the index set The model A1 precludes attacks involving filtering of host signal components The motivation for this restriction is that it considerably simplifies the mathematical derivation and does not require a statistical model for the host S The restriction is relatively mild if embedding is done in a transform domain in which the components of the host S are approximately independent and are large relative to the embedding distortion The motivation for A2 is that distortion is best measured relative to the host S, but S is not known to the coalition, so we replace S by its best linear unbiased estimate, 1 k X k Choosing an expected distortion constraint of the form E Y S 2 ND c would allow impulsive noise strategies which blast iid additive noise N 0, ND c with probability 1/N Such attacks are extremely effective 10] as they result in zero error exponents Finally, the fairness condition A3 will not be imposed but will be shown to hold for optimal attacks: all members of the coalition incur the same risk A special case of 1 that will be of interest is 1 Y = f N X E 2 where the noise E is independent of X and uniformly distributed on the N dimensional centered sphere with radius Nσ 2 e contained in the N dimensional hyperplane orthogonal to the coalition vectors X k, k The mapping f N : R N R N may be viewed as an estimator of S based on X, or as a noise-free forgery Note that if the attackers can retrieve the original signal S, they will succeed in defeating the decoder The location-invariant condition on the attack channel implies that the mapping f N is locationinvariant as well: f N X = f N U S 3 This condition is satisfied by various nonlinear mappings including order-statistics mappings 8] A special case is uniform linear averaging, X = f N X = 1 X k 4 k The model studied in 9] was of the form 2, but E was AWGN with mean zero and variance σ 2 e When the attack is of the form 2, we need σ 2 e D c to satisfy the distortion constraint A2 Hence σ 2 e is imized made equal to D c by letting f N be the uniform linear averaging mapping of 4 1 When 2, the model 2 is more restrictive than 1, even if E is unconstrained For instance a mixture of models of the form 2 3 satisfies 1 and A1 but generally not 2

23 Decoder We study the nonblind scenario where the host signal S is available at the decoder and can be subtracted from Y, to form the centered data Y S The decoder computes a guilt index for each user m and returns the list of users whose guilt index exceeds a certain threshold Nτ The guilt index adopted here is the correlation statistic T m y s = u T my s = u T mf N u e] 5 The decoder 5 does not know the channel A used by the colluders or even the exact number of colluders However the decoder knows, the imum number of colluders For any given user m, the possible error events are a false positive incorrectly declaring the user to be guilty or a false negative incorrectly declaring the user to be innocent Denote by Λ m = y R N : u m y s > τ the acceptance region for user m and by P I m,, A = du p U u dy Ay u s, R N Λ m P II m,, A = du p U u dy Ay u s R N the corresponding type-i and type-ii error probabilities By our location-invariant assumption on the attack channel and the decoding regions, these probabilities are independent of s By design of C, P I m,, A is independent of m / The type-i and type-ii error probabilities are given by Λ c m P I, A = m / : T m Y s > Nτ] M P I m,, A m /, 6 incorrectly accusing an innocent user, and P II, A = m : T m Y s Nτ] min P IIm,, A 7 m missing all members of the coalition The inequality 6 is due to the union bound Detection is said to be reliable if 6 and 7 are small enough We shall impose the following requirement on the type-i error probability: sup P I, A exp Nλ I 8 A where λ I is a constraint on the exponent of P I The imum over and A indicates that we do not allow innocent users to be accused for any possible choice of and A A weaker requirement, not recommended here, would be to replace the imum over with an average We would like to find the largest possible value of λ II such that sup P II, A exp Nλ II 9 A Associated with the random ensemble C is a curve of λ II vs λ I, indexed by the threshold τ 3 WORST ATTAC Lemma 1 For any permutation-invariant random ensemble C and the list decoder 5, there is no loss of optimality for the colluders in choosing a strongly exchangeable attack channel: Ay x = Aπy πx for every permutation π of the index set 1, 2,, N In the special case of Model 2, there is no loss of optimality in choosing a memoryless mapping f N x = f x 1,, f x N 10 for some f : R R Given a vector-valued sequence z R d N, denote by R z = 1 N zzt the d d empirical correlation matrix for z Given x and y, define the N dimensional sphere Σx, y whose elements y have the following property: x, y have the same empirical correlation matrix as x, y The sphere Σx, y is centered at ŝ x, y R N which is the orthogonal projection of y onto the -dimensional subspace of R N spanned by x k, k This subspace is orthogonal to the N dimensional subspace in which the sphere Σx, y is embedded The radius of the sphere is y 1 x k = y 1 x k, k k y Σx, y 11 It may also be shown that for the correlator decoder 5, there is no loss of optimality for the colluders in selecting A x that is uniform over each sphere Σx, y Furthermore, P I and P II are imized by A x that is uniform over a single sphere Σx, y That is, there is no loss of optimality for the colluders in choosing Y according to the additive-noise model 2, where f N x spanx k, k Moreover, by application of Lemma 1, there is no loss of optimality in choosing a memoryless mapping f N We still need to identify the worst f in 10 Denote by F a compact set of feasible f, such that f N x 1 k x k 2 ND c σe 2 4 GAUSSIAN ENSEMBLE A possible choice for C is the Gausssian ensemble, in which random codes C = U m, 1 m M are obtained by drawing fingerprint components U m n iid N 0, We now analyze detection performance for the random ensemble C under any attack mapping f F selected by the colluders For simplicity of the exposition, we only report results in the case where E iid N 0, σ 2 e; the error exponents when E is uniformly distributed on the sphere of radius Nσ 2 e are slightly larger First we define the five random variables Z 0,f = U m fu m Z 1,f = U m fu m W = U m E U 0,f = Z 0,f W U 1,f = Z 1,f W

By our assumptions on f and C, the distributions of these random variables do not depend on m and In particular, VarW = σ 2 and EZ 0,f ] = EU 0,f ] = EW ] = 0 Finally, note that Z 0,f, Z 1,f, W, U 0,f and U 1,f are non- Gaussian, even if the mapping f is linear Lemma 2 EZ 1,f ] = EU 1,f ] = / for all f F Proof: The random variables U m fu, m, are identically distributed Therefore EZ 1,f ] = EUfU ] where U = 1 m U m By the location-invariant property of f, we have fu = fu U U Since U m, m, are iid Gaussian, the mean U is independent of the centered random variables U U and therefore of any function of them Hence EUfU ] = EU 2 ] = / which proves the claim Since the test statistic T m Y s in 5 is a sum of N iid random variables U 0,f n and U 1,f n under hypotheses H 0 and H 1, respectively, we immediately obtain Proposition 1 below, where Λ U 0,f and Λ U 1,f are the large-deviations functions for the random variables U 0,f and U 1,f, respectively Proposition 1 For the Gaussian ensemble C, the error probabilities satisfy the following upper bounds for all f F: P I, f exp NΛ U 0,f τ P II, f exp NΛ U 1,f τ where 0 τ / Moreover these bounds are tight in the exponent as N Proposition 2 In the limit as N, we have the asymptotic equalities Λ U 0,f τ Λ U 1,f τ τ 2 2 VarW = τ 2 1 2 σ 2 e τ 2 σe 2 2σe 2 2, 2 2σe 2 2 Proof The arguments of the large-deviations functions Λ U 0,f and Λ U 1,f above vanish as The asymptotic equality results from a quadratic expansion of these functions around zero Prop 1 states that the error exponents depend on the mapping f selected by the colluders However, as indicated by Prop 2, that exponential dependency vanishes for large We conclude this section with Prop 3 which establishes a fundamental relationship between N, M and, guaranteeing reliable detection for the Gaussian ensemble C Proposition 3 For the Gaussian ensemble C, reliable detection is guaranteed provided that 2 ln M N Proof: immediate consequence of Props 1, 2, and 6, 7 5 EXPURGATED GAUSSIAN ENSEMBLE The problem with the Gaussian ensemble C of Sec 4 is that error probability which is obtained by averaging over all codes in C may be dominated by bad codes This is a standard problem for the design of low-rate codes, for which performance is dictated by minimum-distance considerations, and the bad codes are the ones with poor minimum distance 12] Improvements can be obtained by using expurgation, ie, removing bad codes from the random ensemble We apply this principle to our fingerprinting problem and show that performance can indeed be improved by expurgating the Gaussian random ensemble We assume here that 3 ln M N To gain some insight, we first consider the case M = 1 with fixed f and the simplest possible extension of the two-codeword problem of 12, Ch 5] For a typical selection of C, we have when m /, and U m 2 N = ON 1/2 m, U T mf N U = ON 1/2 Nτ U T mf N U N = ON 1/2 Nτ when m Thus P I = U T m E > Nτ], P II = U T m E < N τ] The goal now is to design a random ensemble C of codes that have the above properties for much larger M and for any mapping f F and coalition To this end, fix a positive sequence ɛ N such that ln M ɛ N 1 N and define C as the random ensemble of codes C such that min m U m 2 N < Nɛ N 12 sup m U T mf N U < Nɛ N 13 min inf m UT mf N U > N ɛ N 14 Lemma 3 The probability that a code C drawn from the iid Gaussian ensemble also belongs to C tends to 1 as N

By application of this lemma, the following procedure may be used to draw a code from C : draw C from the iid Gaussian ensemble and verify whether C satisfies 12, 13 and 14 If yes this happens with high probability, keep C If not, repeat the above procedure until C satisfies the above conditions Sketch of Proof of Lemma 3 Denote by E 0, E 1 and E 2 the events 12, 13, and 14, repectively By the union bound, we have ] E0] c M Un 2 ND f > Nɛ N 15 E c 1] E c 2] M 1 M M where have inf sup ] Z 0,f n > Nɛ N Z 1,f n < N 16 ] ɛ N M = exp ln M < expnɛ 2 N We ] Un 2 N > Nɛ N exp Nɛ2 N 4 2, therefore E0] c vanishes exponentially with N By application of Sanov s theorem 11], we prove that ] sup Z 0,f n > Nɛ N Nɛ 2 N exp 2 sup VarZ 0,f inf Z 1,f n < N ] ɛ N Nɛ 2 N exp 2 sup VarZ 1,f, 18 19 Therefore 18 and 19 vanish exponentially with N, and so do 16 and 17 Using Shannon s formula for spherical caps 13], we derive the following result Proposition 4 The type-i and type-ii error probabilities for the expurgated Gaussian ensemble C are given by P I, f = exp NE cap, 20 τσe P II, f = exp NE cap σ e τ, 21 σ e for all and f F, where E cap ρ = 1 2 ln1 ρ2, 0 ρ 1 22 17 Since τ < / tends to 0 for increasing and E cap ρ ρ 2 /2 as ρ 0, the exponents above coincide with those derived for the Gaussian ensemble in Prop 2 in the limit as Proposition 5 The error exponents in 20 and 21 are minimized by the uniform linear averaging f with = ; then σe 2 = D c, and λ I = E cap τ λ II = E cap c τ c c Proof Given, the detection bounds 20 and 21 depend on f via σ e As discussed below 4, uniform linear averaging with = simultaneously imizes and σ e, and therefore minimizes the error exponents 20 and 21 The error exponents in 20 and 21 are uniformly better than those obtained by drawing codes from the Gaussian ensemble 6 REFERENCES 1] I J Cox, J illian, F T Leighton and T Shamoon, Secure Spread Spectrum Watermarking for Multimedia, IEEE T-IP, Vol 6, pp 1673 1687, Dec 1997 Also NEC Tech Rep 95-10, 1995 2] P Moulin and J A O Sullivan, Information-Theoretic Analysis of Information Hiding, IEEE T-IT, Vol 49, No 3, pp 563 593, 2003 3] A Somekh-Baruch and N Merhav, On the Capacity Game of Private Fingerprinting Systems Under Collusion Attacks, Proc IEEE Int Symp on Information Theory, Yokohama, Japan, p 191, July 2003 4] J ilian, F T Leighton, L R Matheson, T G Shamoon, R E Tarjan, and F Zane, Resistance of digital watermarks to collusive attacks, Proc ISIT, p 271, Cambridge, MA, 1998 5] P Moulin and A Briassouli, The Gaussian Fingerprinting Game, Proc CISS 02, Princeton, NJ, March 2002 6] H S Stone, Analysis of Attacks on Image Watermarks With Randomized Coefficients, NEC TR 96-045, Princeton, NJ, 1996 7] H Zhao, M Wu, Z Wang and J R Liu, Forensic Analysis of Nonlinear Collusion Attacks for Multimedia Fingerprinting, IEEE T- IP, Vol 14, No 5, pp 646 661, May 2005 8] N iyavash and P Moulin, A Framework for Optimizing Nonlinear Collusion Attacks on Fingerprinting Systems, Proc Conf on Information Systems and Science, Princeton, NJ, March 2006 9] P Moulin and N iyavash, Performance of Random Fingerprinting Codes Under Arbitrary Nonlinear Collusion Attacks, Proc ICASSP, Hawaii, Apr 2007 10] N iyavash and P Moulin, On Optimal Collusion Strategies for Fingerprinting, Proc ICASSP, Toulouse, France, May 2006 11] A Dembo and O Zeitouni, Large Deviations Techniques and Applications, 2nd ed, Springer-Verlag, New York, 1998 12] R G Gallager, Information Theory and Reliable Communication, Wiley, NY, 1968 13] C E Shannon, Probability of Error for Optimal Codes in a Gaussian Channel, Bell Systems Tech J, Vol 38, pp 611 656, 1959