Image Compression. Qiaoyong Zhong. CAS-MPG Partner Institute for Computational Biology (PICB). November 19, 2012

Image Compression
The art and science of reducing the amount of data required to represent an image.

The central parts of the Milky Way (ESO, 2012)
$108{,}500 \times 81{,}500 \times 3 / 1024^3 \approx 24.71$ GB

Outline
1. Fundamentals
2. Some Basic Compression Methods
3. Digital Image Watermarking

Fundamentals
Data: the means by which information is conveyed.
Compression ratio: $C = \frac{b}{b'}$, where $b$ and $b'$ are the numbers of bits in two representations of the same information.
Relative data redundancy: $R = 1 - \frac{1}{C} = 1 - \frac{b'}{b} = \frac{b - b'}{b}$
e.g. $b = 10$, $b' = 1$, $C = 10$, $R = 0.9$
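The two definitions above can be checked numerically. The following is a minimal sketch; the function names are illustrative and not part of the slides.

```python
def compression_ratio(b, b_prime):
    """C = b / b' for two representations of the same information."""
    return b / b_prime


def relative_redundancy(b, b_prime):
    """R = 1 - 1/C, which equals (b - b') / b."""
    return 1 - 1 / compression_ratio(b, b_prime)


# The example from the slide: b = 10 bits, b' = 1 bit.
c = compression_ratio(10, 1)
r = relative_redundancy(10, 1)
print(c, r)
```

A redundancy of 0.9 means that 90% of the data in the larger representation is redundant.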

Image Data Redundancies
Types of image data redundancies:
Coding redundancy
Spatial and temporal redundancy
Irrelevant information

Coding Redundancy
Given an $M \times N$ image, let the discrete random variable $r_k$ in the interval $[0, L-1]$ represent its intensities. The probability of $r_k$ is
$p_r(r_k) = \frac{n_k}{MN}, \quad k = 0, 1, 2, \ldots, L-1$
where $n_k$ is the number of pixels with intensity $r_k$. The average number of bits required to represent each pixel is then
$L_{avg} = \sum_{k=0}^{L-1} l(r_k)\, p_r(r_k)$
where $l(r_k)$ is the number of bits used to represent $r_k$, which can be constant (fixed-length code) or variable (variable-length code).

Information Theory
A random event $E$ with probability $P(E)$ is said to contain
$I(E) = \log \frac{1}{P(E)} = -\log P(E)$
units of information.
Given a source of independent random events $\{a_1, a_2, \ldots, a_J\}$, its entropy is
$H = -\sum_{j=1}^{J} P(a_j) \log P(a_j)$
Entropy of an image:
$H = -\sum_{k=0}^{L-1} p_r(r_k) \log_2 p_r(r_k)$
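To make the histogram probabilities, the average code length, and the entropy concrete, here is a small sketch. The 8-pixel "image" and the two codes are hypothetical values chosen for illustration, and the function names are not from the slides.

```python
from collections import Counter
from math import log2


def intensity_probabilities(pixels):
    """Estimate p_r(r_k) = n_k / (M*N) from a flat list of intensities."""
    counts = Counter(pixels)
    total = len(pixels)
    return {r: n / total for r, n in counts.items()}


def entropy(probs):
    """H = -sum p log2 p, in bits per pixel."""
    return -sum(p * log2(p) for p in probs.values() if p > 0)


def average_code_length(probs, code_lengths):
    """L_avg = sum l(r_k) p_r(r_k)."""
    return sum(code_lengths[r] * p for r, p in probs.items())


pixels = [0, 0, 0, 0, 1, 1, 2, 3]            # hypothetical 8-pixel image
probs = intensity_probabilities(pixels)
fixed = {r: 2 for r in probs}                 # fixed-length 2-bit code
variable = {0: 1, 1: 2, 2: 3, 3: 3}           # a variable-length code

print(entropy(probs))                         # 1.75
print(average_code_length(probs, fixed))      # 2.0
print(average_code_length(probs, variable))   # 1.75
```

For this particular histogram the variable-length code reaches the entropy exactly, while the fixed-length code spends 0.25 bits per pixel more.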

Information Theory
Shannon's first theorem:
$\lim_{n \to \infty} \left[ \frac{L_{avg,n}}{n} \right] = H$
where $L_{avg,n}$ is the average number of code symbols required to represent all $n$-symbol groups.
The entropy $H$ is a lower bound on the average code length per source symbol, and variable-length codes can approach it.

Fidelity Criteria
Image compression can be lossy or lossless. To estimate the information loss:
objective fidelity criteria, e.g. the root-mean-square error
$e(x, y) = \hat{f}(x, y) - f(x, y)$
$e_{rms} = \left[ \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} e(x, y)^2 \right]^{1/2}$
subjective fidelity criteria
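The root-mean-square error can be computed directly from its definition. This is a minimal sketch assuming both images are given as equally sized lists of rows; the sample values are made up.

```python
from math import sqrt


def rms_error(f, f_hat):
    """Root-mean-square error between an image f and its approximation f_hat."""
    m, n = len(f), len(f[0])
    total = sum((f_hat[x][y] - f[x][y]) ** 2
                for x in range(m) for y in range(n))
    return sqrt(total / (m * n))


f = [[10, 20], [30, 40]]          # hypothetical original image
f_hat = [[12, 20], [30, 36]]      # hypothetical decompressed image
print(rms_error(f, f_hat))        # sqrt((4 + 16) / 4) = sqrt(5)
```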

Fidelity Criteria
Objective vs. subjective criteria
$e_{rms}$ = 5.17, 15.67, 14.17 for (a), (b), (c) respectively.

Image Compression Models
Mapper: transforms $f(x, \ldots)$ into a format designed to reduce spatial and temporal redundancy
Quantizer: excludes irrelevant information
Symbol coder: e.g. variable-length code

Image Formats

Huffman Coding
Entropy $H = 2.14$ bits/symbol
$L_{avg} = 0.4 \times 1 + 0.3 \times 2 + 0.1 \times 3 + 0.1 \times 4 + 0.06 \times 5 + 0.04 \times 5 = 2.2$ bits/pixel
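The figures above can be reproduced with a short Huffman construction. This is a sketch, not the procedure shown on the original slide: it uses the six probabilities that appear in the $L_{avg}$ sum, and only tracks code lengths rather than the code words themselves. Ties between equal probabilities may be broken differently than in the slide, which can change individual lengths but not the average.

```python
import heapq
from math import log2


def huffman_code_lengths(probs):
    """Return the Huffman code length of each symbol, in input order."""
    # Heap entries: (probability, tie-breaker, symbols in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)   # two least probable subtrees
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:                 # merging adds one bit to each symbol
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return lengths


probs = [0.4, 0.3, 0.1, 0.1, 0.06, 0.04]
lengths = huffman_code_lengths(probs)
l_avg = sum(l * p for l, p in zip(lengths, probs))
h = -sum(p * log2(p) for p in probs)
print(lengths, l_avg, h)   # average is about 2.2, entropy about 2.14
```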

Huffman Coding
Variable-length, instantaneous, uniquely decodable block codes
The source symbols are coded one at a time
Used in CCITT, JBIG2, JPEG, MPEG-1,2,4, H.261, H.262, H.263, H.264 etc.

Golomb Coding
Optimal coding of nonnegative, geometrically distributed integer inputs:
$P(n) = (1 - p)\,p^n$
Golomb code of $n$ with respect to $m$, $G_m(n)$:
1. Form the unary code of the quotient $\lfloor n/m \rfloor$. (The unary code of an integer $q$ is defined as $q$ 1s followed by a 0.)
2. Let $k = \lceil \log_2 m \rceil$, $c = 2^k - m$, $r = n \bmod m$, and compute the truncated remainder $r'$ such that
$r' = \begin{cases} r \text{ truncated to } k-1 \text{ bits} & 0 \le r < c \\ r + c \text{ truncated to } k \text{ bits} & \text{otherwise} \end{cases}$
3. Concatenate the results of steps 1 and 2.
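The three steps can be sketched as follows. The helper `to_bits` and the function name `golomb_encode` are illustrative; code words are returned as strings of '0' and '1' for readability.

```python
def to_bits(value, width):
    """Binary representation of value in exactly `width` bits ('' if width is 0)."""
    return format(value, f"0{width}b") if width > 0 else ""


def golomb_encode(n, m):
    """Golomb code G_m(n) of a nonnegative integer n, as a bit string."""
    q, r = divmod(n, m)
    unary = "1" * q + "0"              # step 1: unary code of floor(n / m)
    k = (m - 1).bit_length()           # ceil(log2(m)) for m >= 1
    c = (1 << k) - m
    if r < c:                          # step 2: truncated remainder
        remainder = to_bits(r, k - 1)
    else:
        remainder = to_bits(r + c, k)
    return unary + remainder           # step 3: concatenate


print(golomb_encode(9, 4))   # 11001: quotient 2 -> '110', remainder 1 -> '01'
print(golomb_encode(3, 1))   # 1110: with m = 1 the code is purely unary
```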

Golomb Coding
Golomb codes are optimal when
$m = \left\lceil \frac{\log_2 (1 + p)}{\log_2 (1/p)} \right\rceil$
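As a quick numerical illustration of this condition, the sketch below evaluates the optimal divisor for a given $p$. The function name is hypothetical, and $0 < p < 1$ is assumed.

```python
from math import ceil, log2


def golomb_optimal_m(p):
    """Optimal Golomb parameter m for P(n) = (1 - p) p**n, with 0 < p < 1."""
    return ceil(log2(1 + p) / log2(1 / p))


print(golomb_optimal_m(0.5))   # 1
print(golomb_optimal_m(0.9))   # 7
```

The closer $p$ is to 1 (long runs of large values are likely), the larger the optimal $m$.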

Golomb Coding
Signed integers are first mapped to nonnegative integers:
$M(n) = \begin{cases} 2n & n \ge 0 \\ 2|n| - 1 & n < 0 \end{cases}$
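The mapping interleaves negative and nonnegative values so that small magnitudes get small codes. A minimal sketch follows; the inverse function is an added helper for illustration and does not appear on the slide.

```python
def to_nonnegative(n):
    """M(n): map a signed integer to a nonnegative one."""
    return 2 * n if n >= 0 else 2 * abs(n) - 1


def from_nonnegative(k):
    """Inverse of M (hypothetical helper): even k -> k/2, odd k -> -(k+1)/2."""
    return k // 2 if k % 2 == 0 else -((k + 1) // 2)


print([to_nonnegative(n) for n in (-2, -1, 0, 1, 2)])   # [3, 1, 0, 2, 4]
```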

Golomb Coding - Example

Golomb Coding
Usually applied to a transform of the intensities, not to the intensities directly
Variable-length, instantaneous, uniquely decodable block codes
Used in JPEG-LS, AVS

Arithmetic Coding
The entire sequence of source symbols (the message) is assigned a single arithmetic code word.
Each source symbol is represented by a subinterval of [0, 1).
Coding starts from [0, 1); as the message grows, the interval becomes smaller and smaller.
More digits (or bits) are required to represent smaller intervals.
Used in JBIG1, JBIG2, JPEG-2000, H.264, MPEG-4 AVC etc.

Arithmetic Coding
The message $a_1 a_2 a_3 a_3 a_4$ can be encoded with the subinterval $[0.06752, 0.0688)$, or simply 0.068.
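The interval narrowing can be sketched as below. The symbol probabilities (0.2, 0.2, 0.4, 0.2) are an assumption: they are not stated in the text, but they reproduce the interval quoted above. Exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def arithmetic_interval(message, probs):
    """Return the final [low, high) interval for a message."""
    # Assign each symbol a subinterval of [0, 1) in dictionary order.
    ranges, start = {}, Fraction(0)
    for symbol, p in probs.items():
        ranges[symbol] = (start, start + p)
        start += p
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        width = high - low
        s_low, s_high = ranges[symbol]
        low, high = low + width * s_low, low + width * s_high
    return low, high


# Assumed source model (not given in the slides).
probs = {"a1": Fraction(1, 5), "a2": Fraction(1, 5),
         "a3": Fraction(2, 5), "a4": Fraction(1, 5)}
low, high = arithmetic_interval(["a1", "a2", "a3", "a3", "a4"], probs)
print(float(low), float(high))
```

Any number inside the final interval, such as 0.068, identifies the message once the model and the message length are known.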

LZW Coding
Addresses spatial redundancies.
Assigns fixed-length code words to variable-length sequences of source symbols.
Builds a dictionary of sequences of source symbols.
Used in GIF, TIFF, PDF.

LZW Coding - Example

LZW Coding - Example
A 16-pixel 8-bit image encoded using 10 9-bit codes:
39 39 126 126
39 39 126 126
39 39 126 126
39 39 126 126
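The 16-pixel example can be reproduced with a compact LZW encoder. This is a sketch under the usual assumption that the dictionary starts with the 256 single-intensity entries (codes 0 to 255) and that new sequences receive codes from 256 upward; the function name is illustrative.

```python
def lzw_encode(pixels, bits=8):
    """Encode a sequence of intensities with LZW; returns a list of codes."""
    dictionary = {(v,): v for v in range(2 ** bits)}
    next_code = 2 ** bits
    current = ()                       # currently recognized sequence
    codes = []
    for p in pixels:
        candidate = current + (p,)
        if candidate in dictionary:
            current = candidate        # keep extending the match
        else:
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = (p,)
    if current:
        codes.append(dictionary[current])
    return codes


pixels = [39, 39, 126, 126] * 4
codes = lzw_encode(pixels)
print(codes)   # [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]
print(len(pixels) * 8, "bits ->", len(codes) * 9, "bits")   # 128 bits -> 90 bits
```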

Run-Length Coding
Compresses a simple form of spatial redundancy: groups of identical intensities.
Represents runs of identical intensities as run-length pairs.
Particularly effective for binary images.
Used in CCITT, JBIG2, JPEG, M-JPEG, MPEG-1,2,4, BMP etc.
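Run-length pairs are straightforward to sketch. The example below encodes one row of a binary image as (run length, intensity) pairs; the pair order and function names are illustrative choices, not a specific file format.

```python
from itertools import groupby


def rle_encode(row):
    """Encode a row as a list of (run_length, intensity) pairs."""
    return [(len(list(group)), value) for value, group in groupby(row)]


def rle_decode(pairs):
    """Expand (run_length, intensity) pairs back into a row."""
    return [value for length, value in pairs for _ in range(length)]


row = [0, 0, 0, 1, 1, 0]
pairs = rle_encode(row)
print(pairs)                      # [(3, 0), (2, 1), (1, 0)]
print(rle_decode(pairs) == row)   # True
```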

Run-Length Coding
RLE in BMP
encoded mode: a two-byte pair; the first byte specifies the number of consecutive pixels that have the intensity contained in the second byte.
absolute mode: the first byte is 0, while the second is

Symbol-Based Coding
Represents an image as a collection of frequently occurring sub-images (symbols).
Uses a symbol dictionary to store symbols.
The image is coded as a set of triplets $\{(x_1, y_1, t_1), (x_2, y_2, t_2), \ldots\}$, where $(x_i, y_i)$ is the location of a symbol in the image and $t_i$ is its address in the dictionary.
Used in JBIG2, binary images only.

Symbol-Based Coding

Bit-Plane Coding
1. Decompose a multilevel image into a series of binary images (bit planes).
2. Apply run-length coding, symbol-based coding etc. to the bit planes individually.
Used in JBIG1, JPEG-2000.
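Step 1 can be sketched as follows, assuming an 8-bit image stored as a list of rows. Plane 0 holds the least significant bits; the function names are illustrative.

```python
def bit_planes(image, bits=8):
    """Split an image into `bits` binary images; plane b holds bit b of each pixel."""
    return [[[(pixel >> b) & 1 for pixel in row] for row in image]
            for b in range(bits)]


def from_bit_planes(planes):
    """Reassemble the multilevel image from its bit planes."""
    rows, cols = len(planes[0]), len(planes[0][0])
    return [[sum(planes[b][x][y] << b for b in range(len(planes)))
             for y in range(cols)] for x in range(rows)]


image = [[194, 3], [128, 255]]    # hypothetical 2 x 2 8-bit image
planes = bit_planes(image)
print(planes[7])                  # most significant plane: [[1, 0], [1, 1]]
```

Each plane is a binary image, so a binary coder such as run-length coding can then be applied to it.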

Block Transform Coding
1. Divide the image into small non-overlapping blocks of equal size (e.g. $8 \times 8$).
2. Apply a 2-D transform to the blocks independently.
3. Quantize the transform coefficients (compression by discarding those with small magnitudes).
4. Encode the retained transform coefficients.
Used in JPEG, M-JPEG, MPEG-1,2,4, H.261, H.262, H.263, H.264, DV and HDV, VC-1 etc.

Transform selection
Forward discrete transform:
$T(u, v) = \sum_{x=0}^{n-1} \sum_{y=0}^{n-1} g(x, y)\, r(x, y, u, v)$
Inverse discrete transform:
$g(x, y) = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} T(u, v)\, s(x, y, u, v)$
$r(x, y, u, v)$ and $s(x, y, u, v)$ are called the forward and inverse transformation kernels respectively.

Transform selection
Discrete Fourier transform (DFT):
$r(x, y, u, v) = e^{-j 2\pi (ux + vy)/n}$
$s(x, y, u, v) = \frac{1}{n^2} e^{j 2\pi (ux + vy)/n}$
Walsh-Hadamard transform (WHT):
$r(x, y, u, v) = s(x, y, u, v) = \frac{1}{n} (-1)^{\sum_{i=0}^{m-1} [b_i(x) p_i(u) + b_i(y) p_i(v)]}$
Discrete cosine transform (DCT):
$r(x, y, u, v) = s(x, y, u, v) = \alpha(u)\,\alpha(v) \cos\left[\frac{(2x + 1) u \pi}{2n}\right] \cos\left[\frac{(2y + 1) v \pi}{2n}\right]$
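To illustrate how a kernel plugs into the forward and inverse sums, here is a direct (slow, unoptimized) sketch of the DCT case. The normalization $\alpha(0) = \sqrt{1/n}$ and $\alpha(u) = \sqrt{2/n}$ for $u > 0$ is an assumption, since the slide does not define $\alpha$; it is the standard choice that makes the forward and inverse kernels identical. Function names are illustrative.

```python
from math import cos, pi, sqrt


def alpha(u, n):
    """Assumed DCT normalization factor (not defined on the slide)."""
    return sqrt(1 / n) if u == 0 else sqrt(2 / n)


def dct_kernel(x, y, u, v, n):
    """r(x, y, u, v) = s(x, y, u, v) for the DCT."""
    return (alpha(u, n) * alpha(v, n)
            * cos((2 * x + 1) * u * pi / (2 * n))
            * cos((2 * y + 1) * v * pi / (2 * n)))


def forward_dct(g):
    """T(u, v) = sum over x, y of g(x, y) r(x, y, u, v) for an n x n block."""
    n = len(g)
    return [[sum(g[x][y] * dct_kernel(x, y, u, v, n)
                 for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]


def inverse_dct(t):
    """g(x, y) = sum over u, v of T(u, v) s(x, y, u, v)."""
    n = len(t)
    return [[sum(t[u][v] * dct_kernel(x, y, u, v, n)
                 for u in range(n) for v in range(n))
             for y in range(n)] for x in range(n)]


block = [[52, 55, 61, 66], [70, 61, 64, 73],
         [63, 59, 55, 90], [67, 61, 68, 104]]   # hypothetical 4 x 4 block
coeffs = forward_dct(block)
restored = inverse_dct(coeffs)
```

With this normalization `coeffs[0][0]` equals the block sum divided by $n$, and the inverse reproduces the block up to floating-point error.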

Transform selection
Comparison of transforms
With 50% of the coefficients truncated, $e_{rms}$ = 2.32, 1.78, 1.13 for the DFT, WHT, and DCT respectively.

Subimage Size Selection
The most popular sizes are $8 \times 8$ and $16 \times 16$, and the DCT performs better than the others.

JPEG
Defines three coding systems:
a lossy baseline coding system, based on the DCT
an extended coding system for greater compression, higher precision etc.
a lossless independent coding system for reversible compression
Support for the baseline system is required to be JPEG compatible.

JPEG - Example
Compression ratio = 25:1 and 52:1 respectively.

Predictive Coding
Can be lossless or lossy
Used in JBIG2, JPEG, JPEG-LS, MPEG-1,2,4, H.261, H.262, H.263, H.264, HDV, VC-1 etc.

Lossless Predictive Coding
Prediction error: $e(n) = f(n) - \hat{f}(n)$
1-D linear prediction function:
Intraframe
$\hat{f}(x, y) = \mathrm{round}\left[ \sum_{i=1}^{m} \alpha_i f(x, y - i) \right]$
Interframe, for video compression
$\hat{f}(x, y, t) = \mathrm{round}\left[ \sum_{i=1}^{m} \alpha_i f(x, y, t - i) \right]$

Lossless Predictive Coding
Differential coding
$\hat{f}(x, y) = f(x, y - 1)$
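Differential coding of one image row can be sketched as below. Sending the first pixel unchanged is an assumption made here so that the decoder has a starting value; the function names are illustrative.

```python
def prediction_errors(row):
    """e = f(x, y) - f(x, y - 1); the first pixel is passed through as-is (assumption)."""
    return [row[0]] + [row[y] - row[y - 1] for y in range(1, len(row))]


def reconstruct(errors):
    """Invert prediction_errors by accumulating the errors."""
    row = [errors[0]]
    for e in errors[1:]:
        row.append(row[-1] + e)
    return row


row = [100, 102, 103, 103, 101]     # hypothetical image row
errors = prediction_errors(row)
print(errors)                       # [100, 2, 1, 0, -2]
print(reconstruct(errors) == row)   # True
```

The errors cluster around zero, so they have lower entropy than the raw intensities and suit a variable-length code.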

Lossless Predictive Coding
Interframe prediction
$\hat{f}(x, y, t) = f(x, y, t - 1)$

Wavelet Coding
Unlike block transform coding, there is no need to construct subimages
Used in JPEG-2000

JPEG-2000
Compression ratio = 25, 52, 75, 105 respectively
Better than JPEG both objectively and subjectively

Digital Image Watermarking
Visible watermarks
Invisible watermarks
Can be performed in the spatial domain or the transform domain.

Visible Watermarking
An opaque or semi-transparent sub-image or image that is placed on top of another image
Performed in the spatial domain:
$f_w = (1 - \alpha) f + \alpha w$
where $f_w$ is the watermarked image, $f$ is the unmarked image, $w$ is the watermark, and $0 < \alpha \le 1$.
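The blending formula applies pixel by pixel. A minimal sketch, assuming `f` and `w` are equally sized grayscale images stored as lists of rows; rounding the result to integers is an added choice, not part of the formula.

```python
def visible_watermark(f, w, alpha):
    """f_w = (1 - alpha) * f + alpha * w, rounded to integer intensities."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    return [[round((1 - alpha) * fp + alpha * wp)
             for fp, wp in zip(f_row, w_row)]
            for f_row, w_row in zip(f, w)]


f = [[100, 200], [50, 0]]       # hypothetical unmarked image
w = [[0, 100], [250, 0]]        # hypothetical watermark
print(visible_watermark(f, w, 0.5))   # [[50, 150], [150, 0]]
```

With `alpha = 1` the watermark is opaque and fully replaces the image; smaller values make it semi-transparent.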

Visible Watermarking

Invisible Watermarking
Cannot be seen with the naked eye.
LSB watermarking
Invisible, fragile, performed in the spatial domain:
$f_w = 4 \left\lfloor \frac{f}{4} \right\rfloor + \left\lfloor \frac{w}{64} \right\rfloor$
where $4 \lfloor f/4 \rfloor$ sets the two least significant bits of $f$ to 0, and $\lfloor w/64 \rfloor$ shifts the two most significant bits of $w$ into the two least significant bit positions ($f$ and $w$ are 8-bit grayscale images).
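In integer arithmetic the LSB formula is two divisions and a multiplication per pixel. The sketch below assumes 8-bit grayscale images stored as lists of rows; the extraction function is an added helper that recovers only the two embedded bits of the watermark.

```python
def lsb_embed(f, w):
    """f_w = 4 * (f // 4) + w // 64, applied pixel by pixel."""
    return [[4 * (fp // 4) + wp // 64 for fp, wp in zip(f_row, w_row)]
            for f_row, w_row in zip(f, w)]


def lsb_extract(f_w):
    """Recover the two most significant watermark bits (hypothetical helper)."""
    return [[(p % 4) * 64 for p in row] for row in f_w]


f = [[181, 7]]       # 0b10110101, 0b00000111
w = [[128, 255]]     # 0b10000000, 0b11111111
marked = lsb_embed(f, w)
print(marked)                # [[182, 7]]
print(lsb_extract(marked))   # [[128, 192]]
```

Each pixel changes by at most 3 intensity levels, which is why the mark is invisible.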

Invisible Watermarking
LSB watermarking is fragile!

Invisible Watermarking
DCT-based watermarking
Invisible, robust, performed in the transform domain
Steps:
1. Compute the 2-D DCT of the image to be watermarked.
2. Locate its $K$ largest coefficients, $c_1, c_2, \ldots, c_K$, by magnitude.
3. Create a watermark: $w_1, w_2, \ldots, w_K$, where $w_i \sim N(0, 1)$.
4. Embed the watermark: $c'_i = c_i (1 + \alpha w_i)$, $1 \le i \le K$; replace the original $c_i$ with $c'_i$.
5. Compute the inverse DCT of the result from step 4.
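Steps 2 to 4 can be sketched on a flat list of transform coefficients. The DCT and inverse DCT of steps 1 and 5 are assumed to be done elsewhere; the function name, the `seed` parameter, and the sample coefficients are illustrative.

```python
import random


def embed_watermark(coeffs, k, alpha, seed=0):
    """Scale the K largest-magnitude coefficients by (1 + alpha * w_i)."""
    rng = random.Random(seed)
    # Step 2: indices of the K largest coefficients by magnitude.
    largest = sorted(range(len(coeffs)),
                     key=lambda i: abs(coeffs[i]), reverse=True)[:k]
    # Step 3: pseudo-random watermark drawn from N(0, 1).
    watermark = [rng.gauss(0, 1) for _ in range(k)]
    # Step 4: c'_i = c_i * (1 + alpha * w_i).
    marked = list(coeffs)
    for i, w in zip(largest, watermark):
        marked[i] = coeffs[i] * (1 + alpha * w)
    return marked, largest, watermark


coeffs = [980.0, -45.2, 12.1, 300.5, -0.7, 88.0]   # hypothetical DCT coefficients
marked, largest, watermark = embed_watermark(coeffs, k=3, alpha=0.1)
print(largest)   # [0, 3, 5]
```

Because the mark is spread over the perceptually most significant coefficients, it is harder to remove without visibly degrading the image.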

Robust Invisible Watermark
DCT-based watermarking is robust!

Thanks!