Survey of JPEG compression history analysis


Andrew B. Lewis, Computer Laboratory. Topics in security: forensic signal analysis.

References:
Neelamani et al.: JPEG compression history estimation for color images, IEEE Transactions on Image Processing 15(6), 2006.
Hany Farid: Exposing digital forgeries from JPEG ghosts, IEEE Transactions on Information Forensics and Security 4(1), 2009.
Andrew B. Lewis, Markus G. Kuhn: Exact JPEG recompression, to appear in SPIE Electronic Imaging: Visual Information Processing and Communication, 2010. Draft at http://www.cl.cam.ac.uk/~abl26/spie10-recomp-full.pdf

Outline: revision of the JPEG compression/decompression algorithm; probability theory for parameter estimation; JPEG compression history estimation for colour images; exact JPEG recompression.

The JPEG algorithm. Parameters: quantization tables Q_L (luma) and Q_C (chroma); sub-sampling 1x1 (luma) and 2x2 (chroma), also known as 4:2:0 sub-sampling; colour space YCbCr. Pipeline: the image is converted to the YCbCr colour space; the Y plane is transformed with the DCT, quantized with Q_L and encoded, while each 2x2-sub-sampled chroma plane is transformed with the DCT, quantized with Q_C and encoded.
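To make the per-block luma path concrete, here is a minimal sketch in Python (not from the talk): an 8x8 block is level-shifted, transformed with an orthonormal 2-D DCT and quantized. The table is the example luminance table from Annex K of the JPEG standard; a given encoder need not use it.

```python
# Minimal sketch of the JPEG per-block luma path: level shift, 8x8 DCT, quantization.
import numpy as np
from scipy.fft import dctn, idctn

# Example luminance quantization table from Annex K of the JPEG standard (illustrative).
Q_L = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
])

def compress_block(block, q_table):
    """Forward orthonormal DCT of a level-shifted 8x8 block, then uniform quantization."""
    coeffs = dctn(block.astype(float) - 128.0, norm='ortho')
    return np.round(coeffs / q_table).astype(int)

def decompress_block(q_coeffs, q_table):
    """Dequantize, inverse DCT, undo the level shift and clip, as a decoder would."""
    block = idctn(q_coeffs * q_table.astype(float), norm='ortho') + 128.0
    return np.clip(np.round(block), 0, 255).astype(np.uint8)
```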

JPEG compression history estimation. Probabilistically estimate the settings used in the previous compression step. Input: raw image in colour space F. Output: the compressed representation's colour space G, sub-sampling scheme S and quantization tables Q:

$$\{G^*, S^*, Q^*\} = \arg\max_{G,S,Q} P(\mathrm{Image}, G, S, Q) = \arg\max_{G,S,Q} P(\mathrm{Image} \mid G, S, Q)\, P(G, S, Q)$$

Terminology of inverse probability. Unknown parameters θ, data D, assumptions H:

$$\underbrace{P(\theta \mid D, H)}_{\text{posterior}} = \frac{\overbrace{P(D \mid \theta, H)}^{\text{likelihood}}\;\overbrace{P(\theta \mid H)}^{\text{prior}}}{\underbrace{P(D \mid H)}_{\text{evidence}}}$$

The quantity P(D | θ, H) is a function of both D and θ. For fixed θ it defines a probability over D; for fixed D it defines the likelihood of θ. (More in David J. C. MacKay: Information Theory, Inference, and Learning Algorithms, Cambridge University Press.)
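As a toy numerical illustration of these terms (the numbers are invented, not from the slides):

```python
# Posterior over a binary parameter theta given one observation D,
# computed from the likelihood, prior and evidence defined above.
likelihood = {0: 0.2, 1: 0.7}          # P(D | theta, H)
prior      = {0: 0.5, 1: 0.5}          # P(theta | H)
evidence   = sum(likelihood[t] * prior[t] for t in (0, 1))   # P(D | H)
posterior  = {t: likelihood[t] * prior[t] / evidence for t in (0, 1)}
print(posterior)                       # {0: ~0.222, 1: ~0.778}
```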

Maximum a posteriori estimation. We wish to estimate θ on the basis of data D. The maximum likelihood (ML) estimate of the parameters from the data is

$$\hat\theta_{\mathrm{ML}}(D) = \arg\max_\theta P(D \mid \theta)$$

and the maximum a posteriori (MAP) estimate is

$$\hat\theta_{\mathrm{MAP}}(D) = \arg\max_\theta P(D \mid \theta)\, P(\theta).$$
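A small worked illustration, not from the slides: for a Bernoulli parameter observed as k heads in n tosses, the ML estimate is k/n, while a Beta(a, b) prior gives the MAP estimate (k + a - 1)/(n + a + b - 2).

```python
# ML vs. MAP estimation of a Bernoulli parameter theta, with an assumed Beta(a, b) prior.
def theta_ml(k, n):
    return k / n                          # arg max_theta P(D | theta)

def theta_map(k, n, a=5.0, b=5.0):
    return (k + a - 1) / (n + a + b - 2)  # arg max_theta P(D | theta) P(theta)

print(theta_ml(9, 10))    # 0.9: the data alone
print(theta_map(9, 10))   # ~0.72: the prior pulls the estimate towards 0.5
```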

Expectation maximization. The ML estimate requires the marginal likelihood if we have hidden variables Z:

$$\hat\theta_{\mathrm{ML}}(D) = \arg\max_\theta P(D \mid \theta), \qquad P(D \mid \theta) = \sum_{z} P(D \mid z, \theta)\, P(z \mid \theta)$$

Evaluating the sum is sometimes computationally infeasible; the expectation-maximization algorithm can be used instead.

Expectation: $Q(\theta \mid \theta^{(t)}) = \mathbb{E}_{Z \mid x, \theta^{(t)}}[\log L(\theta; x, Z)]$
Maximization: $\theta^{(t+1)} = \arg\max_\theta Q(\theta \mid \theta^{(t)})$

There is no guarantee of convergence to a global maximum.
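A minimal sketch of the EM iteration above, applied to a two-component 1-D Gaussian mixture (an illustrative model, not the one used later in the talk):

```python
# EM for a two-component 1-D Gaussian mixture: responsibilities in the E-step,
# closed-form weighted updates in the M-step.
import numpy as np

def em_gmm_1d(x, iters=50):
    pi = np.array([0.5, 0.5])                       # mixture weights
    mu = np.array([x.min(), x.max()], dtype=float)  # crude mean initialisation
    var = np.array([x.var(), x.var()])
    for _ in range(iters):
        # E-step: responsibilities r[n, k] = P(z_n = k | x_n, theta^(t))
        dens = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        r = pi * dens
        r /= r.sum(axis=1, keepdims=True)
        # M-step: theta^(t+1) = arg max_theta Q(theta | theta^(t))
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 200)])
print(em_gmm_1d(x))
```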

Interpolation characterisation as an expectation-maximization problem.

Expectation: $Q(\theta \mid \theta^{(t)}) = \mathbb{E}_{Z \mid x, \theta^{(t)}}[\log L(\theta; x, Z)]$
Maximization: $\theta^{(t+1)} = \arg\max_\theta Q(\theta \mid \theta^{(t)})$

x: the observed image samples f(x, y)
θ: the interpolation kernel α and variance σ²
Z: the p-map, an array of probabilities with the same dimensions as the image, where each probability indicates P(f(x, y) ∈ M₁)
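A hedged sketch of this p-map EM, in the spirit of Popescu and Farid's resampling detector; the window radius, noise model and initialisation below are illustrative assumptions rather than the talk's implementation.

```python
# Each pixel is either explained by a linear combination of its neighbours
# (model M1: kernel alpha, noise variance sigma^2) or not (model M2: uniform over
# the 8-bit range); the p-map holds P(f(x, y) in M1).
import numpy as np

def em_pmap(img, radius=2, iters=20, sigma=2.0):
    img = img.astype(float)
    h, w = img.shape
    # design matrix F: each row holds one pixel's neighbourhood (centre excluded)
    cols = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if (dy, dx) == (0, 0):
                continue
            cols.append(img[radius + dy:h - radius + dy, radius + dx:w - radius + dx].ravel())
    F = np.stack(cols, axis=1)
    y = img[radius:h - radius, radius:w - radius].ravel()
    alpha = np.linalg.lstsq(F, y, rcond=None)[0]      # initial interpolation kernel
    p_m2 = 1.0 / 256.0                                # uniform density for model M2
    for _ in range(iters):
        # E-step: posterior probability that each residual comes from model M1
        r = y - F @ alpha
        p_m1 = np.exp(-r ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
        pmap = p_m1 / (p_m1 + p_m2)
        # M-step: weighted least squares for alpha, weighted deviation for sigma
        sw = np.sqrt(pmap)
        alpha = np.linalg.lstsq(F * sw[:, None], y * sw, rcond=None)[0]
        sigma = np.sqrt(np.sum(pmap * (y - F @ alpha) ** 2) / np.sum(pmap))
    return alpha, pmap.reshape(h - 2 * radius, w - 2 * radius)
```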

Compression history estimation as a MAP problem.

$$\hat\theta_{\mathrm{MAP}}(D) = \arg\max_\theta P(D \mid \theta)\, P(\theta)$$

$$\{G^*, S^*, Q^*\} = \arg\max_{G,S,Q} P(\mathrm{Image} \mid G, S, Q)\, P(G, S, Q)$$

Estimating quantization tables Q (1). There is a small set of possible values for G and S. Converting the colour space to G, sub-sampling with S and then applying the forward DCT gives a near-periodic distribution of coefficients Ω_{G,S} over the image. Assuming G, S and Q are independent, and that the coefficients X̃_{G,S} ∈ Ω_{G,S} are independent:

$$\{G^*, S^*, Q^*\} = \arg\max_{G,S,Q} P(\Omega_{G,S} \mid G, S, Q)\, P(G)\, P(S)\, P(Q) = \arg\max_{G,S,Q} \Big(\prod_{\tilde X_{G,S} \in \Omega_{G,S}} P(\tilde X_{G,S} \mid G, S, Q)\Big)\, P(G)\, P(S)\, P(Q)$$

Estimating quantization tables Q (2). In the decompressor, after dequantization the DCT coefficients X_q were inverse-DCT'ed, the results were up-sampled if necessary, and the image was converted to the RGB colour space. We received the image and applied a forward colour space conversion; we downsampled the planes, if appropriate; and we applied the forward DCT to get X̃. Rounding errors accumulate during every stage of this process. We therefore model the recomputed DCT coefficient values as X̃ = X_q + Γ, where the original DCT coefficients X_q follow a sampled zero-mean Laplace distribution and the rounding error Γ follows a truncated normal distribution.

Estimating quantization tables Q (3). The Laplace distribution for DCT coefficient values has scale parameter λ, determined from the observations. The probability distribution of coefficient values after quantization/dequantization with factor q ∈ Z⁺ is

$$P(X_q = t \mid q) = \sum_{k \in \mathbb{Z}} \delta(t - kq) \int_{(k-0.5)q}^{(k+0.5)q} \frac{\lambda}{2} \exp(-\lambda |\tau|)\, d\tau$$

Rounding errors are independent, so we convolve this distribution with our error term's distribution and normalize:

$$P(\tilde X = t \mid q) \propto \int P(X_q = \tau \mid q)\, P(\Gamma = t - \tau)\, d\tau$$
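A hedged numerical sketch of this model: Laplace probability masses at multiples of q, smoothed by a rounding-error density, here assumed to be a narrow zero-mean Gaussian. The values of lam and sigma_r are illustrative only.

```python
# Evaluate P(X~ = t | q) up to normalization: sum_k P(X_q = kq) * P(Gamma = t - kq).
import numpy as np

def laplace_mass(k, q, lam):
    """P(X_q = k*q): Laplace probability mass falling into the k-th quantization bin."""
    cdf = lambda t: 0.5 * np.exp(lam * t) if t < 0 else 1.0 - 0.5 * np.exp(-lam * t)
    return cdf((k + 0.5) * q) - cdf((k - 0.5) * q)

def likelihood(t, q, lam=0.1, sigma_r=0.7, kmax=50):
    ks = np.arange(-kmax, kmax + 1)
    masses = np.array([laplace_mass(k, q, lam) for k in ks])
    gamma = np.exp(-(t - ks * q) ** 2 / (2 * sigma_r ** 2))   # unnormalized Gaussian error
    return float(np.sum(masses * gamma))

# The candidate q whose multiples lie nearest the observed value scores higher.
print(likelihood(16.3, q=8), likelihood(16.3, q=5))
```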

Estimating quantization tables Q (4). If we assume a uniform prior P(q) over quantization factors, we can now find the most likely value for a particular quantization factor q = Q_{i,j} by maximizing P(X̃ | q). This is repeated for each quantization factor Q_{i,j}, (i, j) ∈ {(0, 0), ..., (7, 7)}:

$$q^* = \arg\max_{q \in \mathbb{Z}^+} \prod_{\tilde X \in \Omega} P(\tilde X \mid q)$$
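A self-contained sketch of this per-frequency search, again under an assumed Laplace scale and Gaussian rounding-error model rather than the talk's estimator: pick the q that maximizes the sum of log-likelihoods over the observed coefficients of one frequency.

```python
# MAP search over candidate quantization factors q for one DCT frequency,
# assuming a uniform prior P(q).
import numpy as np
from scipy.stats import laplace, norm

def likelihood(t, q, lam=0.1, sigma_r=0.7, kmax=50):
    """P(X~ = t | q): quantized zero-mean Laplace masses smoothed by rounding error."""
    ks = np.arange(-kmax, kmax + 1)
    masses = laplace.cdf((ks + 0.5) * q, scale=1 / lam) - laplace.cdf((ks - 0.5) * q, scale=1 / lam)
    return float(np.sum(masses * norm.pdf(t - ks * q, scale=sigma_r)))

def estimate_q(coeffs, q_candidates=range(1, 128)):
    # sum of log P(X~ | q) over independent coefficients equals the log of the product
    log_lik = [sum(np.log(likelihood(t, q) + 1e-300) for t in coeffs) for q in q_candidates]
    return list(q_candidates)[int(np.argmax(log_lik))]

# Coefficients clustered around multiples of 8 should yield q* = 8.
rng = np.random.default_rng(1)
coeffs = 8 * rng.integers(-10, 11, size=300) + rng.normal(0, 0.5, size=300)
print(estimate_q(coeffs))
```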

Evaluation
- Allows for arbitrary colour space conversion, sub-sampling and quantization table parameters; not implementation specific.
- Partially recovered quantization tables could be used to determine the quality factor: quality factors q ∈ {1, ..., 100} map onto quantization tables in many compressor implementations.
- Statistical rather than exact: cannot recover the bitstream.
- The Tikhonov deconvolution filter introduces errors.
- Errors appear in the quantization table when most DCT coefficients at a particular frequency are zero; there is no data for the quantization table when all are zero (low quality factors).

Exact recompression. Can we recover the original bitstream (including the quantization tables) when given the result of decompression? Due to rounding and the mismatch between the compressor and decompressor operations, simply invoking the compressor with the same parameters will not work. Can we guarantee that, if the provided image was produced by a particular JPEG decompressor, we will recover that bitstream?

[Diagram: uncompressed image, compression, JPEG image, decompression, decompressed image 1; recompression yields recompressed JPEG image(s), whose decompression gives decompressed image 2 = decompressed image 1.]

Applications of exact recompression
- The input to JPEG compressors is often previously compressed image data; detecting this and recompressing exactly will reduce the information loss from recompression.
- Hinder forensic analysis: double compression detection, JPEG ghosts, ...
- Detect tampered regions in an uncompressed image, when the background was output by a JPEG decompressor.
- Some copy-protection schemes rely on the fact that copies will be recompressed, lowering the quality.

Information loss in the decompressor. Model sources of uncertainty (rounding, range limiting) when inverting operations in the decompressor, using intervals of integers. We store intervals for all intermediate values in the decompressor. Because we are modelling the exact computations, exact recompressors are implementation-specific.
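A hedged sketch of this interval bookkeeping; the Interval class and the inversion of an IJG-style rounded division are illustrative, not the talk's implementation.

```python
# Integer intervals tracking every value an intermediate decompressor quantity
# could have taken, given rounding and range limiting.
from dataclasses import dataclass

@dataclass
class Interval:
    lo: int
    hi: int

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def scale(self, c):
        """Multiply by a known non-negative integer constant."""
        return Interval(self.lo * c, self.hi * c)

    def clamp(self, lo=0, hi=255):
        """Range limiting, as in the decompressor's final sample clipping."""
        return Interval(max(self.lo, lo), min(self.hi, hi))

def invert_rounded_div16(v):
    """All integers s with (s + 8) // 16 == v, i.e. the pre-rounding values behind v."""
    return Interval(16 * v - 8, 16 * v + 7)

print(invert_rounded_div16(10))   # Interval(lo=152, hi=167)
```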

Chroma down-sampling example (1). The IJG chroma upsampling filter weights contributions from neighbouring samples by 1/16 (1, 3, 3, 9), in order of increasing proximity.

[Figure: layout of the weights for the four upsampled samples around each down-sampled sample.]

$$v_{x,y} = \frac{1}{16}\left(8 + \alpha\, w_{i-1,j-1} + \beta\, w_{i,j-1} + \gamma\, w_{i-1,j} + \delta\, w_{i,j}\right)$$

with weights

$$(\alpha, \beta, \gamma, \delta) = \begin{cases} (1, 3, 3, 9) & x = 2i,\; y = 2j \\ (3, 1, 9, 3) & x = 2i - 1,\; y = 2j \\ (3, 9, 1, 3) & x = 2i,\; y = 2j - 1 \\ (9, 3, 3, 1) & x = 2i - 1,\; y = 2j - 1. \end{cases}$$

Chroma down-sampling example (2). We need to solve for the down-sampled values w_{i,j} in

$$v_{x,y} = \frac{1}{16}\left(8 + \alpha\, w_{i-1,j-1} + \beta\, w_{i,j-1} + \gamma\, w_{i-1,j} + \delta\, w_{i,j}\right)$$

Interval arithmetic rules give

$$w_{i,j} = \left[\frac{1}{\delta}\Big(16\, v_{x,y} - \big(8 + \alpha\, w_{i-1,j-1} + \beta\, w_{i,j-1} + \gamma\, w_{i-1,j}\big)\Big),\; \frac{1}{\delta}\Big(16\, v_{x,y} + 15 - \big(8 + \alpha\, w_{i-1,j-1} + \beta\, w_{i,j-1} + \gamma\, w_{i-1,j}\big)\Big)\right]$$
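A hedged sketch of this inversion step: given the upsampled value and integer intervals for the three already-estimated neighbours, the feasible interval for w_{i,j} follows from the floor-division bounds; the function name and the clamping to [0, 255] are illustrative assumptions.

```python
# Invert one upsampling equation under interval uncertainty about the neighbours.
import math

def invert_upsample(v, w_nbrs, weights, delta):
    """w_nbrs: [(lo, hi)] intervals for w_{i-1,j-1}, w_{i,j-1}, w_{i-1,j};
    weights: the corresponding (alpha, beta, gamma); delta: weight of w_{i,j}."""
    # smallest/largest possible weighted contribution of the three known neighbours
    nbr_lo = sum(c * lo for c, (lo, hi) in zip(weights, w_nbrs))
    nbr_hi = sum(c * hi for c, (lo, hi) in zip(weights, w_nbrs))
    # v = floor((8 + nbr + delta*w) / 16)  =>  16v <= 8 + nbr + delta*w <= 16v + 15
    lo = math.ceil((16 * v - 8 - nbr_hi) / delta)
    hi = math.floor((16 * v + 15 - 8 - nbr_lo) / delta)
    return max(lo, 0), min(hi, 255)       # sample values stay in [0, 255]

# e.g. the x = 2i, y = 2j case has neighbour weights (1, 3, 3) and delta = 9
print(invert_upsample(v=100, w_nbrs=[(90, 110)] * 3, weights=(1, 3, 3), delta=9))
```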

Chroma down-sampling example (3)

k ← 0
w⁰_{x,y} ← [0, 255] at all positions 1 ≤ x ≤ w/2, 1 ≤ y ≤ h/2
repeat
    k ← k + 1
    change scan order of (x, y) (cycling through the four corner-to-corner scan directions)
    for each sample position (x, y) in the upsampled plane do
        for (i′, j′) ∈ {(i−1, j−1), (i, j−1), (i−1, j), (i, j)} do
            ŵ_{i′,j′} ← {ā : equation satisfied with ā for w_{i′,j′}, s for v_{x,y} and current estimates ŵ_{x,y} for the other w values}
            w^k_{i′,j′} ← w^{k−1}_{i′,j′} ∩ ŵ_{i′,j′}
        end for
    end for
until w^k = w^{k−1}
ŵ_{x,y} ← w^k

Performance

[Plot: number of test images against the fraction (infeasible blocks / total blocks), shown for each IJG quality factor.]

Figure: Recompression performance for a dataset of uncompressed images from the UCID. 1338 colour images were compressed and decompressed at quality factors q ∈ {40, 60, 70, 80, 82, 84, 86, 88, 90}, then recompressed. The proportion of blocks at each quality factor which could not be recompressed due to an infeasible search size is shown.