4. Quantization and Data Compression. ECE 302 Spring 2012 Purdue University, School of ECE Prof. Ilya Pollak


1 4. Quantization and Data Compression ECE 302 Spring 2012 Purdue University, School of ECE Prof. Ilya Pollak

2 What is data compression? Reducing the file size without compromising the quality of the data stored in the file too much (lossy compression) or at all (lossless compression). With compression, you can fit higher-quality data (e.g., higher-resolution pictures or video) into a file of the same size as required for lower-quality uncompressed data.

3 Why data compression? Our appetite for data (high-resolution pictures, HD video, audio, documents, etc) seems to always significantly outpace hardware capabilities for storage and transmission.

4 Data compression: Step 1 If the data is continuous-time (e.g., audio) or continuous-space (e.g., picture), it first needs to be discretized.

5 Data compression: Step 1 If the data is continuous-time (e.g., audio) or continuous-space (e.g., picture), it first needs to be discretized. Sampling is typically done nowadays during signal acquisition (e.g., digital camera for pictures or audio recording equipment for music and speech).

6 Data compression: Step 1 If the data is continuous-time (e.g., audio) or continuous-space (e.g., picture), it first needs to be discretized. Sampling is typically done nowadays during signal acquisition (e.g., digital camera for pictures or audio recording equipment for music and speech). We will not study sampling. It is studied in ECE 301, ECE 438, and ECE 440. We will consider compressing discrete-time or discrete-space data.

7 Example: compression of grayscale images An eight-bit grayscale image is a rectangular array of integers between 0 (black) and 255 (white). Each site in the array is called a pixel.

8 Example: compression of grayscale images An eight-bit grayscale image is a rectangular array of integers between 0 (black) and 255 (white). Each site in the array is called a pixel. It takes one byte (eight bits) to store one pixel value, since it can be any number between 0 and 255.

9 Example: compression of grayscale images An eight-bit grayscale image is a rectangular array of integers between 0 (black) and 255 (white). Each site in the array is called a pixel. It takes one byte (eight bits) to store one pixel value, since it can be any number between 0 and 255. It would take 25 bytes to store a 5x5 image.

10 Example: compression of grayscale images An eight-bit grayscale image is a rectangular array of integers between 0 (black) and 255 (white). Each site in the array is called a pixel. It takes one byte (eight bits) to store one pixel value, since it can be any number between 0 and 255. It would take 25 bytes to store a 5x5 image. Can we do better?

11 Example: compression of grayscale images Can we do better than 25 bytes?

12 Two key ideas Idea #1: Transform the data to create lots of zeros.

13 Two key ideas Idea #1: Transform the data to create lots of zeros. For example, we could rasterize the image, compute the differences, and store the top left value along with the 24 differences [in reality, other transforms are used, but they work in a similar fashion].

14 Two key ideas Idea #1: Transform the data to create lots of zeros. For example, we could rasterize the image, compute the differences, and store the top left value along with the 24 differences [in reality, other transforms are used, but they work in a similar fashion]: 255, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -155, 0, 0, 155, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

15 Two key ideas Idea #1: Transform the data to create lots of zeros. For example, we could rasterize the image, compute the differences, and store the top left value along with the 24 differences [in reality, other transforms are used, but they work in a similar fashion]: 255, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -155, 0, 0, 155, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 This seems to make things worse: now the numbers can range from -255 to 255, and therefore we need two bytes per pixel!

16 Two key ideas Idea #1: Transform the data to create lots of zeros. For example, we could rasterize the image, compute the differences, and store the top left value along with the 24 differences [in reality, other transforms are used, but they work in a similar fashion]: 255, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -155, 0, 0, 155, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 This seems to make things worse: now the numbers can range from -255 to 255, and therefore we need two bytes per pixel! Idea #2: when encoding the data, spend fewer bits on frequently occurring numbers and more bits on rare numbers.
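To make Idea #1 concrete, here is a small Python sketch (not part of the original notes); the 5x5 image contents are made up for illustration, but they produce a difference sequence with exactly the character described above.

import numpy as np

# Hypothetical 5x5 image: a 255-valued background with a short run of 100-valued pixels.
img = np.full((5, 5), 255, dtype=int)
img[2, 1:4] = 100

x = img.flatten()                             # rasterize (row by row)
d = np.concatenate(([x[0]], np.diff(x)))      # top left value, then the 24 differences
print(d)                                      # 255, then mostly zeros, with one -155 and one 155
print("zeros:", int(np.sum(d == 0)), "out of", d.size)   # 22 out of 25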

17 Entropy coding Suppose we are encoding realizations of a discrete random variable X such that value of X: 0, 255, 155, -155; probability: 22/25, 1/25, 1/25, 1/25

18 Entropy coding Suppose we are encoding realizations of a discrete random variable X such that value of X: 0, 255, 155, -155; probability: 22/25, 1/25, 1/25, 1/25 Consider the following fixed-length encoder: value of X: 0, 255, 155, -155; codeword: 00, 01, 10, 11

19 Entropy coding Suppose we are encoding realizations of a discrete random variable X such that value of X: 0, 255, 155, -155; probability: 22/25, 1/25, 1/25, 1/25 Consider the following fixed-length encoder: value of X: 0, 255, 155, -155; codeword: 00, 01, 10, 11 For a file with 25 numbers, E[file size] = 25*2*(22/25+1/25+1/25+1/25) = 50 bits

20 Entropy coding Suppose we are encoding realizations of a discrete random variable X such that value of X: 0, 255, 155, -155; probability: 22/25, 1/25, 1/25, 1/25 Consider the following fixed-length encoder: value of X: 0, 255, 155, -155; codeword: 00, 01, 10, 11 For a file with 25 numbers, E[file size] = 25*2*(22/25+1/25+1/25+1/25) = 50 bits Now consider the following encoder: value of X: 0, 255, 155, -155; codeword: 0, 10, 110, 111

21 Entropy coding Suppose we are encoding realizations of a discrete random variable X such that value of X: 0, 255, 155, -155; probability: 22/25, 1/25, 1/25, 1/25 Consider the following fixed-length encoder: value of X: 0, 255, 155, -155; codeword: 00, 01, 10, 11 For a file with 25 numbers, E[file size] = 25*2*(22/25+1/25+1/25+1/25) = 50 bits Now consider the following encoder: value of X: 0, 255, 155, -155; codeword: 0, 10, 110, 111 For a file with 25 numbers, E[file size] = 25*(1*22/25 + 2*1/25 + 3*1/25 + 3*1/25) = 30 bits!
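A quick numerical check of these two expected file sizes (a sketch; only the symbol probabilities and the codeword lengths matter here):

probs  = [22/25, 1/25, 1/25, 1/25]   # probabilities of the four symbol values
fixed  = [2, 2, 2, 2]                # lengths of the fixed-length codewords
varlen = [1, 2, 3, 3]                # lengths of the variable-length codewords
print(25 * sum(p * l for p, l in zip(probs, fixed)))    # 50.0 bits
print(25 * sum(p * l for p, l in zip(probs, varlen)))   # 30.0 bits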

22 Entropy coding A similar encoding scheme can be devised for a random variable of pixel differences which takes values between -255 and 255, to result in a smaller average file size than two bytes per pixel.

23 Entropy coding A similar encoding scheme can be devised for a random variable of pixel differences which takes values between -255 and 255, to result in a smaller average file size than two bytes per pixel. Another commonly used idea: run-length coding. I.e., instead of encoding each 0 individually, encode the length of each string of zeros.
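A minimal run-length coding sketch along these lines (illustrative only; real coders combine the run lengths with entropy coding):

def rle_zeros(seq):
    """Encode a sequence as (length of zero run, next nonzero value) pairs."""
    out, run = [], 0
    for v in seq:
        if v == 0:
            run += 1
        else:
            out.append((run, v))
            run = 0
    if run:
        out.append((run, None))   # trailing run of zeros with no terminating value
    return out

print(rle_zeros([255, 0, 0, 0, -155, 0, 0, 155, 0, 0]))
# [(0, 255), (3, -155), (2, 155), (2, None)]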

24 Back to the four-symbol example value of X: 0, 255, 155, -155; probability: 22/25, 1/25, 1/25, 1/25; codeword: 0, 10, 110, 111 Can we do even better than 30 bits?

25 Back to the four-symbol example value of X: 0, 255, 155, -155; probability: 22/25, 1/25, 1/25, 1/25; codeword: 0, 10, 110, 111 Can we do even better than 30 bits? What about this alternative encoder? value of X: 0, 255, 155, -155; probability: 22/25, 1/25, 1/25, 1/25; codeword: 0, 01, 1, 11

26 Back to the four-symbol example value of X: 0, 255, 155, -155; probability: 22/25, 1/25, 1/25, 1/25; codeword: 0, 10, 110, 111 Can we do even better than 30 bits? What about this alternative encoder? value of X: 0, 255, 155, -155; probability: 22/25, 1/25, 1/25, 1/25; codeword: 0, 01, 1, 11 E[file size] = 25*(1*22/25 + 2*1/25 + 1*1/25 + 2*1/25) = 27 bits

27 Back to the four-symbol example value of X: 0, 255, 155, -155; probability: 22/25, 1/25, 1/25, 1/25; codeword: 0, 10, 110, 111 Can we do even better than 30 bits? What about this alternative encoder? value of X: 0, 255, 155, -155; probability: 22/25, 1/25, 1/25, 1/25; codeword: 0, 01, 1, 11 E[file size] = 25*(1*22/25 + 2*1/25 + 1*1/25 + 2*1/25) = 27 bits Is there anything wrong with this encoder?

28 The second encoding is not uniquely decodable! value of X: 0, 255, 155, -155; probability: 22/25, 1/25, 1/25, 1/25; codeword: 0, 01, 1, 11 Encoded string 01 could either be 255 or 0 followed by 155

29 The second encoding is not uniquely decodable! value of X: 0, 255, 155, -155; probability: 22/25, 1/25, 1/25, 1/25; codeword: 0, 01, 1, 11 Encoded string 01 could either be 255 or 0 followed by 155 Therefore, this code is unusable! It turns out that the first code is uniquely decodable.

30 What kinds of distributions are amenable to entropy coding? [Two histograms over the symbols a, b, c, d: one highly concentrated on a single symbol, one uniform.] Concentrated: can do a lot better than two bits per symbol. Uniform: cannot do better than two bits per symbol.

31 What kinds of distributions are amenable to entropy coding? [Two histograms over the symbols a, b, c, d: one highly concentrated on a single symbol, one uniform.] Concentrated: can do a lot better than two bits per symbol. Uniform: cannot do better than two bits per symbol. Conclusion: the transform procedure should be such that the numbers fed into the entropy coder have a highly concentrated histogram (a few very likely values, most values unlikely).

32 What kinds of distributions are amenable to entropy coding? [Two histograms over the symbols a, b, c, d: one highly concentrated on a single symbol, one uniform.] Concentrated: can do a lot better than two bits per symbol. Uniform: cannot do better than two bits per symbol. Conclusion: the transform procedure should be such that the numbers fed into the entropy coder have a highly concentrated histogram (a few very likely values, most values unlikely). Also, if we are encoding each number individually, they should be independent or approximately independent.
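To see the point numerically, here is a sketch that applies a code with lengths 1, 2, 3, 3 to a concentrated distribution and to a uniform one over the same four symbols (the numbers are illustrative):

lengths      = [1, 2, 3, 3]
concentrated = [22/25, 1/25, 1/25, 1/25]   # histogram peaked on one symbol
uniform      = [1/4, 1/4, 1/4, 1/4]        # flat histogram
for name, p in [("concentrated", concentrated), ("uniform", uniform)]:
    print(name, sum(pi * li for pi, li in zip(p, lengths)), "bits/symbol")
# concentrated: 1.2 bits/symbol (much better than 2); uniform: 2.25 bits/symbol (worse than 2)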

33 What if we are willing to lose some information?

34 What if we are willing to lose some information? Quantization

35 Some eight-bit images The five stripes contain random values from (left to right): {252,253,254,255}, {88,89,90,91}, {125,126,127,128}, {61,62,63,64}, {0,1,2,3}. The five stripes contain random integers from (left to right): {240,...,255}, {176,...,191}, {113,...,128}, {49,...,64}, {0,...,15}.

36 Converting continuous-valued to discrete-valued signals Many real-world signals are continuous-valued. audio signal a(t): both the time argument t and the intensity value a(t) are continuous; image u(x,y): both the spatial location (x,y) and the image intensity value u(x,y) are continuous; video v(x,y,t): x,y,t, and v(x,y,t) are all continuous.

37 Converting continuous-valued to discrete-valued signals Many real-world signals are continuous-valued. audio signal a(t): both the time argument t and the intensity value a(t) are continuous; image u(x,y): both the spatial location (x,y) and the image intensity value u(x,y) are continuous; video v(x,y,t): x,y,t, and v(x,y,t) are all continuous. Discretizing the argument values t, x, and y (or sampling) is studied in ECE 301, 438, and 440.

38 Converting continuous-valued to discrete-valued signals Many real-world signals are continuous-valued. audio signal a(t): both the time argument t and the intensity value a(t) are continuous; image u(x,y): both the spatial location (x,y) and the image intensity value u(x,y) are continuous; video v(x,y,t): x,y,t, and v(x,y,t) are all continuous. Discretizing the argument values t, x, and y (or sampling) is studied in ECE 301, 438, and 440. However, in addition to discretizing the argument values, the signal values must be discretized as well in order to be digitally stored.

39 Quantization Digitizing a continuous-valued signal into a discrete and finite set of values. Converting a discrete-valued signal into another discrete-valued signal, with fewer possible discrete values.

40 How to compare two quantizers? Suppose data X(1),...,X(N) is quantized using two quantizers, to result in Y_1(1),...,Y_1(N) and Y_2(1),...,Y_2(N). Suppose both Y_1(1),...,Y_1(N) and Y_2(1),...,Y_2(N) can be encoded with the same number of bits. Which quantization is better? The one that results in less distortion. But how to measure distortion? In general, measuring and modeling perceptual image similarity and similarity of audio are open research problems. Some useful things are known about human audio and visual systems that inform the design of quantizers.

41 Sensitivity of the Human Visual System to Contrast Changes, as a Function of Frequency

42 Sensitivity of the Human Visual System to Contrast Changes, as a Function of Frequency [From Mannos-Sakrison IEEE-IT 1974]

43 Sensitivity of the Human Visual System to Contrast Changes, as a Function of Frequency [From Mannos-Sakrison IEEE-IT 1974] High and low frequencies may be quantized more coarsely

44 But there are many other intricacies in the way the human visual system computes similarity

45 Are these two images similar?

46 What about these two?

47 What about these two? Performance assessment of compression algorithms and quantizers is complicated, because measuring image fidelity is complicated. Often, very simple distortion measures are used such as mean-square error.

48 Scalar vs Vector Quantization [Figure: two pixel values r and s, each ranging over 0 to 255.] Scalar quantization: quantize each value separately (simple thresholding). Vector quantization: quantize several values jointly (more complex).

49 What kinds of joint distributions are amenable to scalar quantization? [Figure: the (r,s) plane, with r and s ranging over 0 to 255.] If (r,s) are jointly uniform over the green square (or, more generally, independent), knowing r does not tell us anything about s. Best thing to do: make quantization decisions independently.

50 What kinds of joint distributions are amenable to scalar quantization? If (r,s) are jointly uniform over the green square (or, more generally, independent), knowing r does not tell us anything about s. Best thing to do: make quantization decisions independently. If (r,s) are jointly uniform over the yellow region, knowing r tells us a lot about s. Best thing to do: make quantization decisions jointly.

51 What kinds of joint distributions are amenable to scalar quantization? If (r,s) are jointly uniform over the green square (or, more generally, independent), knowing r does not tell us anything about s. Best thing to do: make quantization decisions independently. If (r,s) are jointly uniform over the yellow region, knowing r tells us a lot about s. Best thing to do: make quantization decisions jointly. Conclusion: if the data is transformed before quantization, the transform procedure should be such that the coefficients fed into the quantizer are independent (or at least uncorrelated, or almost uncorrelated), in order to enable the simpler scalar quantization.

52 More on Scalar Quantization Does it make sense to do scalar quantization with different quantization bins for different variables?

53 More on Scalar Quantization Does it make sense to do scalar quantization with different quantization bins for different variables? No reason to do this if we are quantizing grayscale pixel values.

54 More on Scalar Quantization Does it make sense to do scalar quantization with different quantization bins for different variables? No reason to do this if we are quantizing grayscale pixel values. However, if we can decompose the image into components that are less perceptually important and more perceptually important, we should use larger quantization bins for the less important components.

55 Structure of a Typical Lossy Compression Algorithm for Audio, Images, or Video data → transform → quantization → entropy coding → compressed bitstream

56 Structure of a Typical Lossy Compression Algorithm for Audio, Images, or Video data → transform → quantization → entropy coding → compressed bitstream Let's more closely consider quantization and entropy coding. (Various transforms are considered in ECE 301 and ECE 438.)

57 Quantization: problem statement Source (e.g., image, video, speech signal) → Sequence of discrete or continuous random variables X(1),...,X(N) (e.g., transformed image pixel values).

58 Quantization: problem statement Source (e.g., image, video, speech signal) → Sequence of discrete or continuous random variables X(1),...,X(N) (e.g., transformed image pixel values) → Quantizer → Sequence of discrete random variables Y(1),...,Y(N), each distributed over a finite set of values (quantization levels)

59 Quantization: problem statement Source (e.g., image, video, speech signal) → Sequence of discrete or continuous random variables X(1),...,X(N) (e.g., transformed image pixel values) → Quantizer → Sequence of discrete random variables Y(1),...,Y(N), each distributed over a finite set of values (quantization levels) Errors: D(1),...,D(N) where D(n) = X(n) - Y(n)

60 MSE is a widely used measure of distortion of quantizers Suppose data X(1),...,X(N) are quantized, to result in Y(1),...,Y(N). E[ Σ_{n=1}^N (X(n) - Y(n))^2 ] = E[ Σ_{n=1}^N (D(n))^2 ] If D(1),..., D(N) are identically distributed, this is the same as N E[(D(n))^2], for any n.
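A sketch of how this distortion would be estimated empirically from data (the quantizers and sample size here are illustrative):

import numpy as np

def empirical_mse(x, y):
    """Empirical mean-square error between source samples x and quantized samples y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean((x - y) ** 2)

x  = np.random.default_rng(0).uniform(0, 255, size=10000)
y1 = 32 * np.round(x / 32)   # coarse uniform quantizer
y2 = 8 * np.round(x / 8)     # finer uniform quantizer
print(empirical_mse(x, y1), empirical_mse(x, y2))   # the finer quantizer gives the smaller MSE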

61 Scalar uniform quantization Use quantization intervals (bins) of equal size [x_1,x_2), [x_2,x_3), ..., [x_L,x_{L+1}]. Quantization levels q_1, q_2, ..., q_L. Each quantization level is in the middle of the corresponding quantization bin: q_k = (x_k + x_{k+1})/2.

62 Scalar uniform quantization Use quantization intervals (bins) of equal size [x_1,x_2), [x_2,x_3), ..., [x_L,x_{L+1}]. Quantization levels q_1, q_2, ..., q_L. Each quantization level is in the middle of the corresponding quantization bin: q_k = (x_k + x_{k+1})/2. If quantizer input X is in [x_k, x_{k+1}), the corresponding quantized value is Y = q_k.
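A sketch of such a uniform quantizer over an interval [a, b] (the parameter names and default values are illustrative):

import numpy as np

def uniform_quantize(x, a=0.0, b=255.0, L=8):
    """Map each input to the midpoint q_k of its bin; all bins have equal width (b - a)/L."""
    width = (b - a) / L
    k = np.clip(np.floor((np.asarray(x, dtype=float) - a) / width), 0, L - 1)  # bin index 0..L-1
    return a + (k + 0.5) * width                                               # midpoint level

print(uniform_quantize([3, 100, 250]))   # [ 15.9375 111.5625 239.0625]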

63 Uniform vs non-uniform quantization Uniform quantization is not a good strategy for distributions which significantly differ from uniform.

64 Uniform vs non-uniform quantization Uniform quantization is not a good strategy for distributions which significantly differ from uniform. If the distribution is non-uniform, it is better to spend more quantization levels on more probable parts of the distribution and fewer quantization levels on less probable parts.

65 Scalar Lloyd-Max quantizer X = source random variable with a known distribution. We assume it to be a continuous r.v. with PDF f_X(x) > 0.

66 Scalar Lloyd-Max quantizer X = source random variable with a known distribution. We assume it to be a continuous r.v. with PDF f_X(x) > 0. The results can be extended to discrete or mixed random variables, and to continuous random variables whose density can be zero for some x.

67 Scalar Lloyd-Max quantizer X = source random variable with a known distribution. We assume it to be a continuous r.v. with PDF f_X(x) > 0. The results can be extended to discrete or mixed random variables, and to continuous random variables whose density can be zero for some x. Quantization intervals (x_1, x_2), [x_2, x_3), ..., [x_L, x_{L+1}) and levels q_1, ..., q_L such that x_1 = -∞, x_{L+1} = +∞, and -∞ < q_1 < x_2 ≤ q_2 < x_3 ≤ q_3 < ... < x_L ≤ q_L < +∞. I.e., q_k ∈ k-th quantization interval.

68 Scalar Lloyd-Max quantizer X = source random variable with a known distribution. We assume it to be a continuous r.v. with PDF f_X(x) > 0. The results can be extended to discrete or mixed random variables, and to continuous random variables whose density can be zero for some x. Quantization intervals (x_1, x_2), [x_2, x_3), ..., [x_L, x_{L+1}) and levels q_1, ..., q_L such that x_1 = -∞, x_{L+1} = +∞, and -∞ < q_1 < x_2 ≤ q_2 < x_3 ≤ q_3 < ... < x_L ≤ q_L < +∞. I.e., q_k ∈ k-th quantization interval. Y = the result of quantizing X, a discrete random variable with L possible outcomes, q_1, q_2, ..., q_L, defined by Y = Y(X) = q_1 if X < x_2; q_2 if x_2 ≤ X < x_3; ...; q_{L-1} if x_{L-1} ≤ X < x_L; q_L if X ≥ x_L.

69 Scalar Lloyd-Max quantizer: goal Given the pdf f_X(x) of the source r.v. X and the desired number L of quantization levels, find the quantization interval endpoints x_2, ..., x_L and quantization levels q_1, ..., q_L to minimize the mean-square error, E[(Y - X)^2].

70 Scalar Lloyd-Max quantizer: goal Given the pdf f_X(x) of the source r.v. X and the desired number L of quantization levels, find the quantization interval endpoints x_2, ..., x_L and quantization levels q_1, ..., q_L to minimize the mean-square error, E[(Y - X)^2]. To do this, express the mean-square error in terms of the quantization interval endpoints and quantization levels, and find the minimum (or minima) through differentiation.

71 Scalar Lloyd-Max quantizer: derivation E[(Y - X)^2] = ∫_{-∞}^{+∞} (y(x) - x)^2 f_X(x) dx

72 Scalar Lloyd-Max quantizer: derivation E[(Y - X)^2] = ∫_{-∞}^{+∞} (y(x) - x)^2 f_X(x) dx = Σ_{k=1}^{L} ∫_{x_k}^{x_{k+1}} (y(x) - x)^2 f_X(x) dx

73 Scalar Lloyd-Max quantizer: derivation E[(Y - X)^2] = ∫_{-∞}^{+∞} (y(x) - x)^2 f_X(x) dx = Σ_{k=1}^{L} ∫_{x_k}^{x_{k+1}} (y(x) - x)^2 f_X(x) dx = Σ_{k=1}^{L} ∫_{x_k}^{x_{k+1}} (q_k - x)^2 f_X(x) dx

74 Scalar Lloyd-Max quantizer: derivation Minimize w.r.t. q_k: ∂/∂q_k E[(Y - X)^2] = ∫_{x_k}^{x_{k+1}} 2(q_k - x) f_X(x) dx = 0

75 Scalar Lloyd-Max quantizer: derivation Minimize w.r.t. q_k: ∂/∂q_k E[(Y - X)^2] = ∫_{x_k}^{x_{k+1}} 2(q_k - x) f_X(x) dx = 0, i.e., q_k ∫_{x_k}^{x_{k+1}} f_X(x) dx = ∫_{x_k}^{x_{k+1}} x f_X(x) dx

76 Scalar Lloyd-Max quantizer: derivation Therefore q_k = ∫_{x_k}^{x_{k+1}} x f_X(x) dx / ∫_{x_k}^{x_{k+1}} f_X(x) dx

77 Scalar Lloyd-Max quantizer: derivation Therefore q_k = ∫_{x_k}^{x_{k+1}} x f_X(x) dx / ∫_{x_k}^{x_{k+1}} f_X(x) dx = E[X | X ∈ k-th quantization interval]

78 Scalar Lloyd-Max quantizer: derivation This is a minimum, since ∂^2/∂q_k^2 E[(Y - X)^2] = ∫_{x_k}^{x_{k+1}} 2 f_X(x) dx > 0.

79 Scalar Lloyd-Max quantizer: derivation Minimize w.r.t. x_k, for k = 2, ..., L

80 Scalar Lloyd-Max quantizer: derivation Minimize w.r.t. x_k, for k = 2, ..., L: ∂/∂x_k E[(Y - X)^2] = ∂/∂x_k [ ∫_{x_{k-1}}^{x_k} (q_{k-1} - x)^2 f_X(x) dx + ∫_{x_k}^{x_{k+1}} (q_k - x)^2 f_X(x) dx ]

81 Scalar Lloyd-Max quantizer: derivation Minimize w.r.t. x_k, for k = 2, ..., L: ∂/∂x_k E[(Y - X)^2] = (q_{k-1} - x_k)^2 f_X(x_k) - (q_k - x_k)^2 f_X(x_k)

82 Scalar Lloyd-Max quantizer: derivation Setting (q_{k-1} - x_k)^2 f_X(x_k) - (q_k - x_k)^2 f_X(x_k) = (q_{k-1} - q_k)(q_{k-1} + q_k - 2 x_k) f_X(x_k) = 0. By assumption, f_X(x) > 0 and q_{k-1} ≠ q_k.

83 Scalar Lloyd-Max quantizer: derivation By assumption, f_X(x) > 0 and q_{k-1} ≠ q_k. Therefore, x_k = (q_{k-1} + q_k)/2, for k = 2, ..., L.

84 Scalar Lloyd-Max quantizer: derivation Therefore, x_k = (q_{k-1} + q_k)/2, for k = 2, ..., L. This is a minimum, since ∂^2/∂x_k^2 E[(Y - X)^2] = 2(q_k - q_{k-1}) f_X(x_k) > 0.

85 Nonlinear system to be solved q_k = ∫_{x_k}^{x_{k+1}} x f_X(x) dx / ∫_{x_k}^{x_{k+1}} f_X(x) dx = E[X | X ∈ k-th quantization interval], for k = 1, ..., L; x_k = (q_{k-1} + q_k)/2, for k = 2, ..., L

86 Nonlinear system to be solved q_k = ∫_{x_k}^{x_{k+1}} x f_X(x) dx / ∫_{x_k}^{x_{k+1}} f_X(x) dx = E[X | X ∈ k-th quantization interval], for k = 1, ..., L; x_k = (q_{k-1} + q_k)/2, for k = 2, ..., L Closed-form solution can be found only for very simple PDFs. E.g., if X is uniform, then Lloyd-Max quantizer = uniform quantizer.

87 Nonlinear system to be solved q_k = ∫_{x_k}^{x_{k+1}} x f_X(x) dx / ∫_{x_k}^{x_{k+1}} f_X(x) dx = E[X | X ∈ k-th quantization interval], for k = 1, ..., L; x_k = (q_{k-1} + q_k)/2, for k = 2, ..., L Closed-form solution can be found only for very simple PDFs. E.g., if X is uniform, then Lloyd-Max quantizer = uniform quantizer. In general, an approximate solution can be found numerically, via an iterative algorithm (e.g., the lloyds command in Matlab).

88 Nonlinear system to be solved q_k = ∫_{x_k}^{x_{k+1}} x f_X(x) dx / ∫_{x_k}^{x_{k+1}} f_X(x) dx = E[X | X ∈ k-th quantization interval], for k = 1, ..., L; x_k = (q_{k-1} + q_k)/2, for k = 2, ..., L Closed-form solution can be found only for very simple PDFs. E.g., if X is uniform, then Lloyd-Max quantizer = uniform quantizer. In general, an approximate solution can be found numerically, via an iterative algorithm (e.g., the lloyds command in Matlab). For real data, typically the PDF is not given and therefore needs to be estimated using, for example, histograms constructed from the observed data.
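A sketch of such an iteration on data samples, alternating between the two conditions above (this is the same idea as Matlab's lloyds routine, but the implementation details here are illustrative, not a reference implementation):

import numpy as np

def lloyd_max(samples, L=4, iters=200):
    """Alternate q_k = mean of the k-th interval and x_k = midpoint between adjacent levels."""
    x = np.asarray(samples, dtype=float)
    q = np.quantile(x, (np.arange(L) + 0.5) / L)      # initial levels from data quantiles
    for _ in range(iters):
        edges = (q[:-1] + q[1:]) / 2                  # x_k = (q_{k-1} + q_k)/2
        idx = np.searchsorted(edges, x)               # which interval each sample falls in
        q = np.array([x[idx == k].mean() if np.any(idx == k) else q[k]
                      for k in range(L)])             # q_k = E[X | k-th interval], estimated
    return (q[:-1] + q[1:]) / 2, q

edges, levels = lloyd_max(np.random.default_rng(0).normal(size=100000), L=4)
print(levels)   # close to the known optimum (about -1.51, -0.45, 0.45, 1.51) for a standard normal source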

89 Vector Lloyd-Max quantizer? X = (X(1), ..., X(N)) = source random vector with a given joint distribution. L = a desired number of quantization points.

90 Vector Lloyd-Max quantizer? X = (X(1), ..., X(N)) = source random vector with a given joint distribution. L = a desired number of quantization points. We would like to find: (1) L events A_1, ..., A_L that partition the joint sample space of X(1), ..., X(N), and (2) L quantization points q_1 ∈ A_1, ..., q_L ∈ A_L

91 Vector Lloyd-Max quantizer? X = (X(1), ..., X(N)) = source random vector with a given joint distribution. L = a desired number of quantization points. We would like to find: (1) L events A_1, ..., A_L that partition the joint sample space of X(1), ..., X(N), and (2) L quantization points q_1 ∈ A_1, ..., q_L ∈ A_L, such that the quantized random vector, defined by Y = q_k if X ∈ A_k, for k = 1, ..., L, minimizes the mean-square error, E[ ||Y - X||^2 ] = E[ Σ_{n=1}^N (Y(n) - X(n))^2 ]

92 Vector Lloyd-Max quantizer? X = (X(1), ..., X(N)) = source random vector with a given joint distribution. L = a desired number of quantization points. We would like to find: (1) L events A_1, ..., A_L that partition the joint sample space of X(1), ..., X(N), and (2) L quantization points q_1 ∈ A_1, ..., q_L ∈ A_L, such that the quantized random vector, defined by Y = q_k if X ∈ A_k, for k = 1, ..., L, minimizes the mean-square error, E[ ||Y - X||^2 ] = E[ Σ_{n=1}^N (Y(n) - X(n))^2 ] Difficulty: cannot differentiate with respect to a set A_k, and so unless the set of all allowed partitions is somehow restricted, this cannot be solved.

93 Hopefully, the prior discussion gives you some idea about the various issues involved in quantization. And now, on to entropy coding: data → transform → quantization → entropy coding → compressed bitstream

94 Problem statement Source (e.g., image, video, speech signal, or quantizer output) → Sequence of discrete random variables X(1), ..., X(N) (e.g., transformed image pixel values), assumed to be independent and identically distributed over a finite alphabet {a_1, ..., a_M}.

95 Problem statement Source (e.g., image, video, speech signal, or quantizer output) → Sequence of discrete random variables X(1), ..., X(N) (e.g., transformed image pixel values), assumed to be independent and identically distributed over a finite alphabet {a_1, ..., a_M} → Encoder: mapping between source symbols and binary strings (codewords) → Binary string Requirements: minimize the expected length of the binary string; the binary string needs to be uniquely decodable, i.e., we need to be able to infer X(1), ..., X(N) from it!

96 Problem statement Source (e.g., image, video, speech signal, or quantizer output) → Sequence of discrete random variables X(1), ..., X(N) (e.g., transformed image pixel values), assumed to be independent and identically distributed over a finite alphabet {a_1, ..., a_M} → Encoder: mapping between source symbols and binary strings (codewords) → Binary string Since X(1), ..., X(N) are assumed independent in this model, we will encode each of them separately. Each can assume any value among {a_1, ..., a_M}. Therefore, our code will consist of M codewords, one for each symbol a_1, ..., a_M: symbol a_1 → codeword w_1, ..., symbol a_M → codeword w_M.

97 Unique Decodability symbol: a, b, c, d; codeword: 0, 11, 00, 011 How to decode the following string: 00011? It could be aaab or aad or acb or cab or cd. Not uniquely decodable!

98 A condition that ensures unique decodability Prefix condition: no codeword in the code is a prefix for any other codeword.

99 A condition that ensures unique decodability Prefix condition: no codeword in the code is a prefix for any other codeword. If the prefix condition is satisfied, then the code is uniquely decodable. Proof. Take a bit string W that corresponds to two different strings of symbols, A and B. If the first symbols in A and B are the same, discard them and the corresponding portion of W. Repeat until either there are no bits left in W (in this case A=B) or the first symbols in A and B are different. Then one of the codewords corresponding to these two symbols is a prefix for the other.

100 A condition that ensures unique decodability Prefix condition: no codeword in the code is a prefix for any other codeword. Visualizing binary strings. Form a binary tree where each branch is labeled 0 or 1. Each codeword w can be associated with the unique node of the tree such that the string of 0s and 1s on the path from the root to the node forms w.

101 A condition that ensures unique decodability Prefix condition: no codeword in the code is a prefix for any other codeword. Visualizing binary strings. Form a binary tree where each branch is labeled 0 or 1. Each codeword w can be associated with the unique node of the tree such that the string of 0s and 1s on the path from the root to the node forms w. The prefix condition holds if and only if all the codewords are leaves of the binary tree.

102 A condition that ensures unique decodability Prefix condition: no codeword in the code is a prefix for any other codeword. Visualizing binary strings. Form a binary tree where each branch is labeled 0 or 1. Each codeword w can be associated with the unique node of the tree such that the string of 0s and 1s on the path from the root to the node forms w. The prefix condition holds if and only if all the codewords are leaves of the binary tree---i.e., if no codeword is a descendant of another codeword.

103 Example: no prefix condition, no unique decodability, one word is not a leaf symbol: a, b, c, d; codeword: 0, 11, 00, 011 Codeword 0 is a prefix for both codeword 00 and codeword 011

104 Example: no prefix condition, no unique decodability, one word is not a leaf symbol: a, b, c, d; codeword: 0, 11, 00, 011 Codeword 0 is a prefix for both codeword 00 and codeword 011 [Building the tree:] w_a = 0, w_b = 11

105 Example: no prefix condition, no unique decodability, one word is not a leaf symbol: a, b, c, d; codeword: 0, 11, 00, 011 Codeword 0 is a prefix for both codeword 00 and codeword 011 [Building the tree:] w_a = 0, w_b = 11, w_c = 00

106 Example: no prefix condition, no unique decodability, one word is not a leaf symbol: a, b, c, d; codeword: 0, 11, 00, 011 Codeword 0 is a prefix for both codeword 00 and codeword 011 [Building the tree:] w_a = 0, w_b = 11, w_c = 00, w_d = 011; the node for w_a is not a leaf, since w_c and w_d are its descendants.

107 Example: prefix condition, all words are leaves symbol: a, b, c, d; codeword: 0, 10, 110, 111 w_a = 0

108 Example: prefix condition, all words are leaves symbol: a, b, c, d; codeword: 0, 10, 110, 111 w_a = 0, w_b = 10

109 Example: prefix condition, all words are leaves symbol: a, b, c, d; codeword: 0, 10, 110, 111 w_a = 0, w_b = 10, w_c = 110, w_d = 111

110 Example: prefix condition, all words are leaves symbol: a, b, c, d; codeword: 0, 10, 110, 111 w_a = 0, w_b = 10, w_c = 110, w_d = 111 No path from the root to a codeword contains another codeword. This is equivalent to saying that the prefix condition holds.

111 Example: prefix condition, all words are leaves => unique decodability symbol: a, b, c, d; codeword: 0, 10, 110, 111 Decoding: traverse the string left to right, tracing the corresponding path from the root of the binary tree. Each time a leaf is reached, output the codeword and go back to the root.

112 Example: prefix condition, all words are leaves => unique decodability How to decode the following string? 110111010 (w_a = 0, w_b = 10, w_c = 110, w_d = 111)

113 Example: prefix condition, all words are leaves => unique decodability Trace 1 from the root.

114 Example: prefix condition, all words are leaves => unique decodability Trace 1, 1 from the root.

115 Example: prefix condition, all words are leaves => unique decodability Trace 1, 1, 0 from the root: this is the leaf w_c.

116 Example: prefix condition, all words are leaves => unique decodability output: c

117 Example: prefix condition, all words are leaves => unique decodability Trace 1 from the root.

118 Example: prefix condition, all words are leaves => unique decodability Trace 1, 1 from the root.

119 Example: prefix condition, all words are leaves => unique decodability Trace 1, 1, 1 from the root: this is the leaf w_d.

120 Example: prefix condition, all words are leaves => unique decodability output: cd

121 Example: prefix condition, all words are leaves => unique decodability Trace 0 from the root: this is the leaf w_a.

122 Example: prefix condition, all words are leaves => unique decodability output: cda

123 Example: prefix condition, all words are leaves => unique decodability Trace 1 from the root.

124 Example: prefix condition, all words are leaves => unique decodability Trace 1, 0 from the root: this is the leaf w_b.

125 Example: prefix condition, all words are leaves => unique decodability output: cdab

126 Example: prefix condition, all words are leaves => unique decodability final output: cdab
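A sketch of this decoding procedure in code, using the codeword table above; instead of an explicit tree it accumulates bits until they match a codeword, which is equivalent for a prefix condition code:

codewords = {"a": "0", "b": "10", "c": "110", "d": "111"}
leaves = {w: s for s, w in codewords.items()}   # codeword -> symbol

def decode_prefix(bits):
    """Scan left to right; emit a symbol whenever the accumulated bits form a codeword."""
    out, current = [], ""
    for b in bits:
        current += b
        if current in leaves:        # reached a leaf of the code tree
            out.append(leaves[current])
            current = ""             # go back to the root
    assert current == "", "bit string ended in the middle of a codeword"
    return "".join(out)

print(decode_prefix("110111010"))   # cdab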

127 Prefix condition and unique decodability There are uniquely decodable codes which do not satisfy the prefix condition (e.g., {0, 01}).

128 Prefix condition and unique decodability There are uniquely decodable codes which do not satisfy the prefix condition (e.g., {0, 01}). For any such code, a prefix condition code can be constructed with an identical set of codeword lengths. (E.g., {0, 10} for {0, 01}.)

129 Prefix condition and unique decodability There are uniquely decodable codes which do not satisfy the prefix condition (e.g., {0, 01}). For any such code, a prefix condition code can be constructed with an identical set of codeword lengths. (E.g., {0, 10} for {0, 01}.) For this reason, we can consider just prefix condition codes.

130 Entropy coding Given a discrete random variable X with M possible outcomes ("symbols" or "letters") a_1, ..., a_M and with PMF p_X, what is the lowest achievable expected codeword length among all the uniquely decodable codes? Answer depends on p_X; Shannon's source coding theorem provides bounds. How to construct a prefix condition code which achieves this expected codeword length? Answer: Huffman code.

131 Huffman code Consider a discrete r.v. X with M possible outcomes a_1, ..., a_M and with PMF p_X. Assume that p_X(a_1) ≤ p_X(a_2) ≤ ... ≤ p_X(a_M). (If this condition is not satisfied, reorder the outcomes so that it is satisfied.)

132 Huffman code Consider a discrete r.v. X with M possible outcomes a_1, ..., a_M and with PMF p_X. Assume that p_X(a_1) ≤ p_X(a_2) ≤ ... ≤ p_X(a_M). (If this condition is not satisfied, reorder the outcomes so that it is satisfied.) Consider aggregate outcome a_12 = {a_1, a_2} and a discrete r.v. X' such that X' = a_12 if X = a_1 or X = a_2, and X' = X otherwise.

133 Huffman code Consider a discrete r.v. X with M possible outcomes a_1, ..., a_M and with PMF p_X. Assume that p_X(a_1) ≤ p_X(a_2) ≤ ... ≤ p_X(a_M). (If this condition is not satisfied, reorder the outcomes so that it is satisfied.) Consider aggregate outcome a_12 = {a_1, a_2} and a discrete r.v. X' such that X' = a_12 if X = a_1 or X = a_2, and X' = X otherwise. p_X'(a) = p_X(a_1) + p_X(a_2) if a = a_12, and p_X'(a) = p_X(a) if a = a_3, ..., a_M.

134 Huffman code Consider a discrete r.v. X with M possible outcomes a_1, ..., a_M and with PMF p_X. Assume that p_X(a_1) ≤ p_X(a_2) ≤ ... ≤ p_X(a_M). (If this condition is not satisfied, reorder the outcomes so that it is satisfied.) Consider aggregate outcome a_12 = {a_1, a_2} and a discrete r.v. X' such that X' = a_12 if X = a_1 or X = a_2, and X' = X otherwise. p_X'(a) = p_X(a_1) + p_X(a_2) if a = a_12, and p_X'(a) = p_X(a) if a = a_3, ..., a_M. Suppose we have a tree, T', for an optimal prefix condition code for X'. A tree T for an optimal prefix condition code for X can be obtained from T' by splitting the leaf a_12 into two leaves corresponding to a_1 and a_2.

135 Huffman code Consider a discrete r.v. X with M possible outcomes a_1, ..., a_M and with PMF p_X. Assume that p_X(a_1) ≤ p_X(a_2) ≤ ... ≤ p_X(a_M). (If this condition is not satisfied, reorder the outcomes so that it is satisfied.) Consider aggregate outcome a_12 = {a_1, a_2} and a discrete r.v. X' such that X' = a_12 if X = a_1 or X = a_2, and X' = X otherwise. p_X'(a) = p_X(a_1) + p_X(a_2) if a = a_12, and p_X'(a) = p_X(a) if a = a_3, ..., a_M. Suppose we have a tree, T', for an optimal prefix condition code for X'. A tree T for an optimal prefix condition code for X can be obtained from T' by splitting the leaf a_12 into two leaves corresponding to a_1 and a_2. We won't prove this.

136 Example letter: a_1, a_2, a_3, a_4, a_5; p_X(letter): 0.1, 0.1, 0.25, 0.25, 0.3

137 Example letter: a_1, a_2, a_3, a_4, a_5; p_X(letter): 0.1, 0.1, 0.25, 0.25, 0.3 Step 1: combine the two least likely letters. letter: a_12, a_3, a_4, a_5; p_X'(letter): 0.2, 0.25, 0.25, 0.3

138 Example Step 1: combine the two least likely letters. a_1 and a_2 become the two children of a new node a_12. letter: a_12, a_3, a_4, a_5; p_X'(letter): 0.2, 0.25, 0.25, 0.3

139 Example Step 1: combine the two least likely letters. Tree for X so far: a_1 and a_2 under a_12. Tree for X' (still to be constructed). letter: a_12, a_3, a_4, a_5; p_X'(letter): 0.2, 0.25, 0.25, 0.3

140 Example letter: a_12, a_3, a_4, a_5; p_X'(letter): 0.2, 0.25, 0.25, 0.3 Step 2: combine the two least likely letters from the new alphabet. letter: a_123, a_4, a_5; p_X''(letter): 0.45, 0.25, 0.3

141 Example Step 2: combine the two least likely letters from the new alphabet. a_12 and a_3 become the two children of a new node a_123. letter: a_123, a_4, a_5; p_X''(letter): 0.45, 0.25, 0.3

142-143 Example Step 2: combine the two least likely letters from the new alphabet. Tree for X so far: a_1 and a_2 under a_12; a_12 and a_3 under a_123 (the tree for X' sits inside the tree for X). letter: a_123, a_4, a_5; p_X''(letter): 0.45, 0.25, 0.3

144 Example letter: a_123, a_4, a_5; p(letter): 0.45, 0.25, 0.3 Step 3: again combine the two least likely letters. a_4 and a_5 become the two children of a new node a_45. letter: a_123, a_45; p(letter): 0.45, 0.55

145-147 Example Step 3: again combine the two least likely letters. Tree for X so far: a_1 and a_2 under a_12; a_12 and a_3 under a_123; a_4 and a_5 under a_45. letter: a_123, a_45; p(letter): 0.45, 0.55

148 Example Step 4: combine the last two remaining letters. a_123 and a_45 become the two children of the root a_12345. Done! Tree for X: a_1 and a_2 under a_12; a_12 and a_3 under a_123; a_4 and a_5 under a_45; a_123 and a_45 under a_12345.

149 Example Step 4: combine the last two remaining letters. Done! The codeword for each leaf is the sequence of 0s and 1s along the path from the root to that leaf.

150-154 Example Tree for X, with the branches labeled 0 and 1 and the codewords read off the root-to-leaf paths: letter: a_1, a_2, a_3, a_4, a_5; p_X(letter): 0.1, 0.1, 0.25, 0.25, 0.3; codeword: 000, 001, 01, 10, 11

155 Example Expected codeword length: 3(0.1) + 3(0.1) + 2(0.25) + 2(0.25) + 2(0.3) = 2.2 bits letter: a_1, a_2, a_3, a_4, a_5; p_X(letter): 0.1, 0.1, 0.25, 0.25, 0.3; codeword: 000, 001, 01, 10, 11
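A sketch of the same construction in code, using a heap of (probability, symbol group) pairs; the 0/1 labeling of the branches is arbitrary, so the particular codewords may differ from the ones shown above, but the lengths, and hence the 2.2-bit expected length, come out the same:

import heapq

def huffman(pmf):
    """Return a {symbol: codeword} dict built by repeatedly merging the two least likely groups."""
    heap = [(p, i, [s]) for i, (s, p) in enumerate(pmf.items())]   # i breaks ties in the heap
    heapq.heapify(heap)
    code = {s: "" for s in pmf}
    while len(heap) > 1:
        p1, _, g1 = heapq.heappop(heap)       # two least likely groups
        p2, i2, g2 = heapq.heappop(heap)
        for s in g1:
            code[s] = "0" + code[s]           # prepend the branch label for this merge
        for s in g2:
            code[s] = "1" + code[s]
        heapq.heappush(heap, (p1 + p2, i2, g1 + g2))
    return code

pmf = {"a1": 0.1, "a2": 0.1, "a3": 0.25, "a4": 0.25, "a5": 0.3}
code = huffman(pmf)
print(code)                                      # codeword lengths 3, 3, 2, 2, 2
print(sum(pmf[s] * len(code[s]) for s in pmf))   # 2.2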

156 Self-information Consider again a discrete random variable X with M possible outcomes a_1, ..., a_M and with PMF p_X.

157 Self-information Consider again a discrete random variable X with M possible outcomes a_1, ..., a_M and with PMF p_X. Self-information of outcome a_m is I(a_m) = -log_2 p_X(a_m) bits.

158 Self-information Consider again a discrete random variable X with M possible outcomes a_1, ..., a_M and with PMF p_X. Self-information of outcome a_m is I(a_m) = -log_2 p_X(a_m) bits. E.g., if p_X(a_m) = 1 then I(a_m) = 0. The occurrence of a_m is not at all informative, since it had to occur. The smaller the probability of an outcome, the larger its self-information.

159 Self-information Consider again a discrete random variable X with M possible outcomes a_1, ..., a_M and with PMF p_X. Self-information of outcome a_m is I(a_m) = -log_2 p_X(a_m) bits. E.g., if p_X(a_m) = 1 then I(a_m) = 0. The occurrence of a_m is not at all informative, since it had to occur. The smaller the probability of an outcome, the larger its self-information. Self-information of X is I(X) = -log_2 p_X(X) and is a random variable.

160 Self-information Consider again a discrete random variable X with M possible outcomes a_1, ..., a_M and with PMF p_X. Self-information of outcome a_m is I(a_m) = -log_2 p_X(a_m) bits. E.g., if p_X(a_m) = 1 then I(a_m) = 0. The occurrence of a_m is not at all informative, since it had to occur. The smaller the probability of an outcome, the larger its self-information. Self-information of X is I(X) = -log_2 p_X(X) and is a random variable. Entropy of X is the expected value of its self-information: H(X) = E[I(X)] = -Σ_{m=1}^M p_X(a_m) log_2 p_X(a_m)

161 Source coding theorem (Shannon) For any uniquely decodable code, the expected codeword length is ≥ H(X). Moreover, there exists a prefix condition code for which the expected codeword length is < H(X) + 1.
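For the five-letter example above, the entropy can be computed directly and compared with these bounds (a quick sketch):

from math import log2

pmf = {"a1": 0.1, "a2": 0.1, "a3": 0.25, "a4": 0.25, "a5": 0.3}
H = -sum(p * log2(p) for p in pmf.values())
print(H)   # about 2.185 bits; the Huffman code's 2.2 bits satisfies H(X) <= 2.2 < H(X) + 1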

162 Example Suppose that X has M = 2^K possible outcomes a_1, ..., a_M.

163 Example Suppose that X has M = 2^K possible outcomes a_1, ..., a_M. Suppose that X is uniform, i.e., p_X(a_1) = ... = p_X(a_M) = 2^{-K}.

164 Example Suppose that X has M = 2^K possible outcomes a_1, ..., a_M. Suppose that X is uniform, i.e., p_X(a_1) = ... = p_X(a_M) = 2^{-K}. Then H(X) = E[I(X)] = -Σ_{k=1}^{2^K} 2^{-K} log_2(2^{-K}) = 2^K (2^{-K}) K = K

165 Example Suppose that X has M = 2^K possible outcomes a_1, ..., a_M. Suppose that X is uniform, i.e., p_X(a_1) = ... = p_X(a_M) = 2^{-K}. Then H(X) = E[I(X)] = -Σ_{k=1}^{2^K} 2^{-K} log_2(2^{-K}) = K On the other hand, observe that there exist 2^K different K-bit sequences. Thus, a fixed-length code for X that uses all these 2^K K-bit sequences as codewords for all the 2^K outcomes of X will have expected codeword length of K.

166 Example Suppose that X has M = 2^K possible outcomes a_1, ..., a_M. Suppose that X is uniform, i.e., p_X(a_1) = ... = p_X(a_M) = 2^{-K}. Then H(X) = K. On the other hand, observe that there exist 2^K different K-bit sequences. Thus, a fixed-length code for X that uses all these 2^K K-bit sequences as codewords for all the 2^K outcomes of X will have expected codeword length of K. I.e., for this particular random variable, this fixed-length code achieves the entropy of X, which is the lower bound given by the source coding theorem.

167 Example Suppose that X has M = 2^K possible outcomes a_1, ..., a_M. Suppose that X is uniform, i.e., p_X(a_1) = ... = p_X(a_M) = 2^{-K}. Then H(X) = K. On the other hand, observe that there exist 2^K different K-bit sequences. Thus, a fixed-length code for X that uses all these 2^K K-bit sequences as codewords for all the 2^K outcomes of X will have expected codeword length of K. I.e., for this particular random variable, this fixed-length code achieves the entropy of X, which is the lower bound given by the source coding theorem. Therefore, the K-bit fixed-length code is optimal for this X.

168 Lemma 1: An auxiliary result helpful for proving the source coding theorem log_2 α ≤ (α - 1) log_2 e for all α > 0. Proof: differentiate g(α) = (α - 1) log_2 e - log_2 α and show that g(1) = 0 is its minimum.

169 Another auxiliary result: Kraft inequality If integers d_1, ..., d_M satisfy the inequality Σ_{m=1}^M 2^{-d_m} ≤ 1,  (1) then there exists a prefix condition code whose codeword lengths are these integers. Conversely, the codeword lengths of any prefix condition code satisfy this inequality.
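A sketch that evaluates the left-hand side of (1) for a given set of codeword lengths:

def kraft_sum(lengths):
    """Sum of 2^(-d_m); a prefix condition code with these lengths exists iff this is <= 1."""
    return sum(2.0 ** -d for d in lengths)

print(kraft_sum([3, 3, 2, 2, 2]))   # 1.0  -> a prefix condition code exists (the Huffman lengths above)
print(kraft_sum([1, 1, 2]))         # 1.25 -> no prefix condition code has these lengths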

170 Some useful facts about full binary trees A full binary tree of depth D has 2^D leaves.

171 Some useful facts about full binary trees Tree depth D = 4 A full binary tree of depth D has 2^D leaves. (Here, the depth is D = 4 and the number of leaves is 2^4 = 16.)

172 Some useful facts about full binary trees Tree depth D = 4 A full binary tree of depth D has 2^D leaves. (Here, the depth is D = 4 and the number of leaves is 2^4 = 16.) Depth of red node = 2 In a full binary tree of depth D, each node at depth d has 2^{D-d} leaf descendants. (Here, D = 4, the red node is at depth d = 2, and so it has 2^{4-2} = 4 leaf descendants.)

173 Kraft inequality: proof of sufficiency Suppose d_1 ≤ ... ≤ d_M satisfy (1). Consider the full binary tree of depth d_M, and consider all its nodes at depth d_1. Assign one of these nodes to symbol a_1.

174 Kraft inequality: proof of sufficiency Consider all the nodes at depth d_2 which are not a_1 and not descendants of a_1. Assign one of them to symbol a_2.

175 Kraft inequality: proof of sufficiency Iterate like this M times.

176 Kraft inequality: proof of sufficiency If we have run out of tree nodes to assign after r < M iterations, it means that every leaf in the full binary tree of depth d_M is a descendant of one of the first r symbols, a_1, ..., a_r.

177 Kraft inequality: proof of sufficiency But note that every node at depth d_m has 2^{d_M - d_m} leaf descendants. Note also that the full tree has 2^{d_M} leaves. Therefore, if every leaf in the tree is a descendant of a_1, ..., a_r, then Σ_{m=1}^r 2^{d_M - d_m} = 2^{d_M}

178 Kraft inequality: proof of sufficiency Σ_{m=1}^r 2^{d_M - d_m} = 2^{d_M} implies Σ_{m=1}^r 2^{-d_m} = 1

179 Kraft inequality: proof of sufficiency Therefore, Σ_{m=1}^M 2^{-d_m} = Σ_{m=1}^r 2^{-d_m} + Σ_{m=r+1}^M 2^{-d_m} = 1 + Σ_{m=r+1}^M 2^{-d_m} > 1. This violates (1).

180 Kraft inequality: proof of sufficiency Thus, our procedure can in fact go on for M iterations. After the M-th iteration, we will have constructed a prefix condition code with codeword lengths d_1, ..., d_M.

181 Kraft inequality: proof of necessity Suppose d_1 ≤ ... ≤ d_M, and suppose we have a prefix condition code with these codeword lengths. Consider the binary tree corresponding to this code.

182 Kraft inequality: proof of necessity Suppose d_1 ≤ ... ≤ d_M, and suppose we have a prefix condition code with these codeword lengths. Consider the binary tree corresponding to this code. Complete this tree to obtain a full tree of depth d_M.

183 Kraft inequality: proof of necessity Again use the following facts: the full tree has 2^{d_M} leaves; the number of leaf descendants of the codeword of length d_m is 2^{d_M - d_m}.

184 Kraft inequality: proof of necessity The combined number of all leaf descendants of all codewords must be less than or equal to the total number of leaves in the full tree: Σ_{m=1}^M 2^{d_M - d_m} ≤ 2^{d_M}

185 Kraft inequality: proof of necessity Σ_{m=1}^M 2^{d_M - d_m} ≤ 2^{d_M}, i.e., Σ_{m=1}^M 2^{-d_m} ≤ 1.


Entropy Coding. Connectivity coding. Entropy coding. Definitions. Lossles coder. Input: a set of symbols Output: bitstream. Idea Connectivity coding Entropy Coding dd 7, dd 6, dd 7, dd 5,... TG output... CRRRLSLECRRE Entropy coder output Connectivity data Edgebreaker output Digital Geometry Processing - Spring 8, Technion Digital

More information

SIGNAL COMPRESSION. 8. Lossy image compression: Principle of embedding

SIGNAL COMPRESSION. 8. Lossy image compression: Principle of embedding SIGNAL COMPRESSION 8. Lossy image compression: Principle of embedding 8.1 Lossy compression 8.2 Embedded Zerotree Coder 161 8.1 Lossy compression - many degrees of freedom and many viewpoints The fundamental

More information

TTIC 31230, Fundamentals of Deep Learning David McAllester, April Information Theory and Distribution Modeling

TTIC 31230, Fundamentals of Deep Learning David McAllester, April Information Theory and Distribution Modeling TTIC 31230, Fundamentals of Deep Learning David McAllester, April 2017 Information Theory and Distribution Modeling Why do we model distributions and conditional distributions using the following objective

More information

Source Coding: Part I of Fundamentals of Source and Video Coding

Source Coding: Part I of Fundamentals of Source and Video Coding Foundations and Trends R in sample Vol. 1, No 1 (2011) 1 217 c 2011 Thomas Wiegand and Heiko Schwarz DOI: xxxxxx Source Coding: Part I of Fundamentals of Source and Video Coding Thomas Wiegand 1 and Heiko

More information

Quantum-inspired Huffman Coding

Quantum-inspired Huffman Coding Quantum-inspired Huffman Coding A. S. Tolba, M. Z. Rashad, and M. A. El-Dosuky Dept. of Computer Science, Faculty of Computers and Information Sciences, Mansoura University, Mansoura, Egypt. tolba_954@yahoo.com,

More information

Chapter 2 Date Compression: Source Coding. 2.1 An Introduction to Source Coding 2.2 Optimal Source Codes 2.3 Huffman Code

Chapter 2 Date Compression: Source Coding. 2.1 An Introduction to Source Coding 2.2 Optimal Source Codes 2.3 Huffman Code Chapter 2 Date Compression: Source Coding 2.1 An Introduction to Source Coding 2.2 Optimal Source Codes 2.3 Huffman Code 2.1 An Introduction to Source Coding Source coding can be seen as an efficient way

More information

Bandwidth: Communicate large complex & highly detailed 3D models through lowbandwidth connection (e.g. VRML over the Internet)

Bandwidth: Communicate large complex & highly detailed 3D models through lowbandwidth connection (e.g. VRML over the Internet) Compression Motivation Bandwidth: Communicate large complex & highly detailed 3D models through lowbandwidth connection (e.g. VRML over the Internet) Storage: Store large & complex 3D models (e.g. 3D scanner

More information

Summary of Last Lectures

Summary of Last Lectures Lossless Coding IV a k p k b k a 0.16 111 b 0.04 0001 c 0.04 0000 d 0.16 110 e 0.23 01 f 0.07 1001 g 0.06 1000 h 0.09 001 i 0.15 101 100 root 1 60 1 0 0 1 40 0 32 28 23 e 17 1 0 1 0 1 0 16 a 16 d 15 i

More information

Fast Progressive Wavelet Coding

Fast Progressive Wavelet Coding PRESENTED AT THE IEEE DCC 99 CONFERENCE SNOWBIRD, UTAH, MARCH/APRIL 1999 Fast Progressive Wavelet Coding Henrique S. Malvar Microsoft Research One Microsoft Way, Redmond, WA 98052 E-mail: malvar@microsoft.com

More information

Information Theory and Statistics Lecture 2: Source coding

Information Theory and Statistics Lecture 2: Source coding Information Theory and Statistics Lecture 2: Source coding Łukasz Dębowski ldebowsk@ipipan.waw.pl Ph. D. Programme 2013/2014 Injections and codes Definition (injection) Function f is called an injection

More information

Optimal codes - I. A code is optimal if it has the shortest codeword length L. i i. This can be seen as an optimization problem. min.

Optimal codes - I. A code is optimal if it has the shortest codeword length L. i i. This can be seen as an optimization problem. min. Huffman coding Optimal codes - I A code is optimal if it has the shortest codeword length L L m = i= pl i i This can be seen as an optimization problem min i= li subject to D m m i= lp Gabriele Monfardini

More information

Digital Communications III (ECE 154C) Introduction to Coding and Information Theory

Digital Communications III (ECE 154C) Introduction to Coding and Information Theory Digital Communications III (ECE 154C) Introduction to Coding and Information Theory Tara Javidi These lecture notes were originally developed by late Prof. J. K. Wolf. UC San Diego Spring 2014 1 / 8 I

More information

EE-597 Notes Quantization

EE-597 Notes Quantization EE-597 Notes Quantization Phil Schniter June, 4 Quantization Given a continuous-time and continuous-amplitude signal (t, processing and storage by modern digital hardware requires discretization in both

More information

Multimedia Networking ECE 599

Multimedia Networking ECE 599 Multimedia Networking ECE 599 Prof. Thinh Nguyen School of Electrical Engineering and Computer Science Based on lectures from B. Lee, B. Girod, and A. Mukherjee 1 Outline Digital Signal Representation

More information

Quantization. Introduction. Roadmap. Optimal Quantizer Uniform Quantizer Non Uniform Quantizer Rate Distorsion Theory. Source coding.

Quantization. Introduction. Roadmap. Optimal Quantizer Uniform Quantizer Non Uniform Quantizer Rate Distorsion Theory. Source coding. Roadmap Quantization Optimal Quantizer Uniform Quantizer Non Uniform Quantizer Rate Distorsion Theory Source coding 2 Introduction 4 1 Lossy coding Original source is discrete Lossless coding: bit rate

More information

Communications Theory and Engineering

Communications Theory and Engineering Communications Theory and Engineering Master's Degree in Electronic Engineering Sapienza University of Rome A.A. 2018-2019 AEP Asymptotic Equipartition Property AEP In information theory, the analog of

More information

Lecture 2: Introduction to Audio, Video & Image Coding Techniques (I) -- Fundaments

Lecture 2: Introduction to Audio, Video & Image Coding Techniques (I) -- Fundaments Lecture 2: Introduction to Audio, Video & Image Coding Techniques (I) -- Fundaments Dr. Jian Zhang Conjoint Associate Professor NICTA & CSE UNSW COMP9519 Multimedia Systems S2 2006 jzhang@cse.unsw.edu.au

More information

Lecture 1 : Data Compression and Entropy

Lecture 1 : Data Compression and Entropy CPS290: Algorithmic Foundations of Data Science January 8, 207 Lecture : Data Compression and Entropy Lecturer: Kamesh Munagala Scribe: Kamesh Munagala In this lecture, we will study a simple model for

More information

Exercises with solutions (Set B)

Exercises with solutions (Set B) Exercises with solutions (Set B) 3. A fair coin is tossed an infinite number of times. Let Y n be a random variable, with n Z, that describes the outcome of the n-th coin toss. If the outcome of the n-th

More information

Data Compression. Limit of Information Compression. October, Examples of codes 1

Data Compression. Limit of Information Compression. October, Examples of codes 1 Data Compression Limit of Information Compression Radu Trîmbiţaş October, 202 Outline Contents Eamples of codes 2 Kraft Inequality 4 2. Kraft Inequality............................ 4 2.2 Kraft inequality

More information

Vector Quantization Encoder Decoder Original Form image Minimize distortion Table Channel Image Vectors Look-up (X, X i ) X may be a block of l

Vector Quantization Encoder Decoder Original Form image Minimize distortion Table Channel Image Vectors Look-up (X, X i ) X may be a block of l Vector Quantization Encoder Decoder Original Image Form image Vectors X Minimize distortion k k Table X^ k Channel d(x, X^ Look-up i ) X may be a block of l m image or X=( r, g, b ), or a block of DCT

More information

Compression. What. Why. Reduce the amount of information (bits) needed to represent image Video: 720 x 480 res, 30 fps, color

Compression. What. Why. Reduce the amount of information (bits) needed to represent image Video: 720 x 480 res, 30 fps, color Compression What Reduce the amount of information (bits) needed to represent image Video: 720 x 480 res, 30 fps, color Why 720x480x20x3 = 31,104,000 bytes/sec 30x60x120 = 216 Gigabytes for a 2 hour movie

More information

C.M. Liu Perceptual Signal Processing Lab College of Computer Science National Chiao-Tung University

C.M. Liu Perceptual Signal Processing Lab College of Computer Science National Chiao-Tung University Quantization C.M. Liu Perceptual Signal Processing Lab College of Computer Science National Chiao-Tung University http://www.csie.nctu.edu.tw/~cmliu/courses/compression/ Office: EC538 (03)5731877 cmliu@cs.nctu.edu.tw

More information

Coding of memoryless sources 1/35

Coding of memoryless sources 1/35 Coding of memoryless sources 1/35 Outline 1. Morse coding ; 2. Definitions : encoding, encoding efficiency ; 3. fixed length codes, encoding integers ; 4. prefix condition ; 5. Kraft and Mac Millan theorems

More information

repetition, part ii Ole-Johan Skrede INF Digital Image Processing

repetition, part ii Ole-Johan Skrede INF Digital Image Processing repetition, part ii Ole-Johan Skrede 24.05.2017 INF2310 - Digital Image Processing Department of Informatics The Faculty of Mathematics and Natural Sciences University of Oslo today s lecture Coding and

More information

Information Theory. David Rosenberg. June 15, New York University. David Rosenberg (New York University) DS-GA 1003 June 15, / 18

Information Theory. David Rosenberg. June 15, New York University. David Rosenberg (New York University) DS-GA 1003 June 15, / 18 Information Theory David Rosenberg New York University June 15, 2015 David Rosenberg (New York University) DS-GA 1003 June 15, 2015 1 / 18 A Measure of Information? Consider a discrete random variable

More information

Lecture 2: Introduction to Audio, Video & Image Coding Techniques (I) -- Fundaments. Tutorial 1. Acknowledgement and References for lectures 1 to 5

Lecture 2: Introduction to Audio, Video & Image Coding Techniques (I) -- Fundaments. Tutorial 1. Acknowledgement and References for lectures 1 to 5 Lecture : Introduction to Audio, Video & Image Coding Techniques (I) -- Fundaments Dr. Jian Zhang Conjoint Associate Professor NICTA & CSE UNSW COMP959 Multimedia Systems S 006 jzhang@cse.unsw.edu.au Acknowledgement

More information

Huffman Coding. C.M. Liu Perceptual Lab, College of Computer Science National Chiao-Tung University

Huffman Coding. C.M. Liu Perceptual Lab, College of Computer Science National Chiao-Tung University Huffman Coding C.M. Liu Perceptual Lab, College of Computer Science National Chiao-Tung University http://www.csie.nctu.edu.tw/~cmliu/courses/compression/ Office: EC538 (03)573877 cmliu@cs.nctu.edu.tw

More information

Multimedia. Multimedia Data Compression (Lossless Compression Algorithms)

Multimedia. Multimedia Data Compression (Lossless Compression Algorithms) Course Code 005636 (Fall 2017) Multimedia Multimedia Data Compression (Lossless Compression Algorithms) Prof. S. M. Riazul Islam, Dept. of Computer Engineering, Sejong University, Korea E-mail: riaz@sejong.ac.kr

More information

Text Compression. Jayadev Misra The University of Texas at Austin December 5, A Very Incomplete Introduction to Information Theory 2

Text Compression. Jayadev Misra The University of Texas at Austin December 5, A Very Incomplete Introduction to Information Theory 2 Text Compression Jayadev Misra The University of Texas at Austin December 5, 2003 Contents 1 Introduction 1 2 A Very Incomplete Introduction to Information Theory 2 3 Huffman Coding 5 3.1 Uniquely Decodable

More information

Review of Quantization. Quantization. Bring in Probability Distribution. L-level Quantization. Uniform partition

Review of Quantization. Quantization. Bring in Probability Distribution. L-level Quantization. Uniform partition Review of Quantization UMCP ENEE631 Slides (created by M.Wu 004) Quantization UMCP ENEE631 Slides (created by M.Wu 001/004) L-level Quantization Minimize errors for this lossy process What L values to

More information

Fault Tolerance Technique in Huffman Coding applies to Baseline JPEG

Fault Tolerance Technique in Huffman Coding applies to Baseline JPEG Fault Tolerance Technique in Huffman Coding applies to Baseline JPEG Cung Nguyen and Robert G. Redinbo Department of Electrical and Computer Engineering University of California, Davis, CA email: cunguyen,

More information

Lecture 10 : Basic Compression Algorithms

Lecture 10 : Basic Compression Algorithms Lecture 10 : Basic Compression Algorithms Modeling and Compression We are interested in modeling multimedia data. To model means to replace something complex with a simpler (= shorter) analog. Some models

More information

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS Lesson 5 Other Coding Techniques Instructional Objectives At the end of this lesson, the students should be able to:. Convert a gray-scale image into bit-plane

More information

Progressive Wavelet Coding of Images

Progressive Wavelet Coding of Images Progressive Wavelet Coding of Images Henrique Malvar May 1999 Technical Report MSR-TR-99-26 Microsoft Research Microsoft Corporation One Microsoft Way Redmond, WA 98052 1999 IEEE. Published in the IEEE

More information

! Where are we on course map? ! What we did in lab last week. " How it relates to this week. ! Compression. " What is it, examples, classifications

! Where are we on course map? ! What we did in lab last week.  How it relates to this week. ! Compression.  What is it, examples, classifications Lecture #3 Compression! Where are we on course map?! What we did in lab last week " How it relates to this week! Compression " What is it, examples, classifications " Probability based compression # Huffman

More information

Chapter 9 Fundamental Limits in Information Theory

Chapter 9 Fundamental Limits in Information Theory Chapter 9 Fundamental Limits in Information Theory Information Theory is the fundamental theory behind information manipulation, including data compression and data transmission. 9.1 Introduction o For

More information

State of the art Image Compression Techniques

State of the art Image Compression Techniques Chapter 4 State of the art Image Compression Techniques In this thesis we focus mainly on the adaption of state of the art wavelet based image compression techniques to programmable hardware. Thus, an

More information

Image Data Compression

Image Data Compression Image Data Compression Image data compression is important for - image archiving e.g. satellite data - image transmission e.g. web data - multimedia applications e.g. desk-top editing Image data compression

More information

Multimedia Information Systems

Multimedia Information Systems Multimedia Information Systems Samson Cheung EE 639, Fall 2004 Lecture 3 & 4: Color, Video, and Fundamentals of Data Compression 1 Color Science Light is an electromagnetic wave. Its color is characterized

More information

On Common Information and the Encoding of Sources that are Not Successively Refinable

On Common Information and the Encoding of Sources that are Not Successively Refinable On Common Information and the Encoding of Sources that are Not Successively Refinable Kumar Viswanatha, Emrah Akyol, Tejaswi Nanjundaswamy and Kenneth Rose ECE Department, University of California - Santa

More information

SIGNAL COMPRESSION Lecture 7. Variable to Fix Encoding

SIGNAL COMPRESSION Lecture 7. Variable to Fix Encoding SIGNAL COMPRESSION Lecture 7 Variable to Fix Encoding 1. Tunstall codes 2. Petry codes 3. Generalized Tunstall codes for Markov sources (a presentation of the paper by I. Tabus, G. Korodi, J. Rissanen.

More information

Homework Set #2 Data Compression, Huffman code and AEP

Homework Set #2 Data Compression, Huffman code and AEP Homework Set #2 Data Compression, Huffman code and AEP 1. Huffman coding. Consider the random variable ( x1 x X = 2 x 3 x 4 x 5 x 6 x 7 0.50 0.26 0.11 0.04 0.04 0.03 0.02 (a Find a binary Huffman code

More information

Reduce the amount of data required to represent a given quantity of information Data vs information R = 1 1 C

Reduce the amount of data required to represent a given quantity of information Data vs information R = 1 1 C Image Compression Background Reduce the amount of data to represent a digital image Storage and transmission Consider the live streaming of a movie at standard definition video A color frame is 720 480

More information

Lec 03 Entropy and Coding II Hoffman and Golomb Coding

Lec 03 Entropy and Coding II Hoffman and Golomb Coding CS/EE 5590 / ENG 40 Special Topics Multimedia Communication, Spring 207 Lec 03 Entropy and Coding II Hoffman and Golomb Coding Zhu Li Z. Li Multimedia Communciation, 207 Spring p. Outline Lecture 02 ReCap

More information

Information and Entropy. Professor Kevin Gold

Information and Entropy. Professor Kevin Gold Information and Entropy Professor Kevin Gold What s Information? Informally, when I communicate a message to you, that s information. Your grade is 100/100 Information can be encoded as a signal. Words

More information

on a per-coecient basis in large images is computationally expensive. Further, the algorithm in [CR95] needs to be rerun, every time a new rate of com

on a per-coecient basis in large images is computationally expensive. Further, the algorithm in [CR95] needs to be rerun, every time a new rate of com Extending RD-OPT with Global Thresholding for JPEG Optimization Viresh Ratnakar University of Wisconsin-Madison Computer Sciences Department Madison, WI 53706 Phone: (608) 262-6627 Email: ratnakar@cs.wisc.edu

More information

EE67I Multimedia Communication Systems

EE67I Multimedia Communication Systems EE67I Multimedia Communication Systems Lecture 5: LOSSY COMPRESSION In these schemes, we tradeoff error for bitrate leading to distortion. Lossy compression represents a close approximation of an original

More information

Lecture 20: Quantization and Rate-Distortion

Lecture 20: Quantization and Rate-Distortion Lecture 20: Quantization and Rate-Distortion Quantization Introduction to rate-distortion theorem Dr. Yao Xie, ECE587, Information Theory, Duke University Approimating continuous signals... Dr. Yao Xie,

More information

EE368B Image and Video Compression

EE368B Image and Video Compression EE368B Image and Video Compression Homework Set #2 due Friday, October 20, 2000, 9 a.m. Introduction The Lloyd-Max quantizer is a scalar quantizer which can be seen as a special case of a vector quantizer

More information

3F1: Signals and Systems INFORMATION THEORY Examples Paper Solutions

3F1: Signals and Systems INFORMATION THEORY Examples Paper Solutions Engineering Tripos Part IIA THIRD YEAR 3F: Signals and Systems INFORMATION THEORY Examples Paper Solutions. Let the joint probability mass function of two binary random variables X and Y be given in the

More information

Lecture 6: Kraft-McMillan Inequality and Huffman Coding

Lecture 6: Kraft-McMillan Inequality and Huffman Coding EE376A/STATS376A Information Theory Lecture 6-0/25/208 Lecture 6: Kraft-McMillan Inequality and Huffman Coding Lecturer: Tsachy Weissman Scribe: Akhil Prakash, Kai Yee Wan In this lecture, we begin with

More information

Entropies & Information Theory

Entropies & Information Theory Entropies & Information Theory LECTURE I Nilanjana Datta University of Cambridge,U.K. See lecture notes on: http://www.qi.damtp.cam.ac.uk/node/223 Quantum Information Theory Born out of Classical Information

More information

lossless, optimal compressor

lossless, optimal compressor 6. Variable-length Lossless Compression The principal engineering goal of compression is to represent a given sequence a, a 2,..., a n produced by a source as a sequence of bits of minimal possible length.

More information

Compression and Coding. Theory and Applications Part 1: Fundamentals

Compression and Coding. Theory and Applications Part 1: Fundamentals Compression and Coding Theory and Applications Part 1: Fundamentals 1 Transmitter (Encoder) What is the problem? Receiver (Decoder) Transformation information unit Channel Ordering (significance) 2 Why

More information

The information loss in quantization

The information loss in quantization The information loss in quantization The rough meaning of quantization in the frame of coding is representing numerical quantities with a finite set of symbols. The mapping between numbers, which are normally

More information

CS4800: Algorithms & Data Jonathan Ullman

CS4800: Algorithms & Data Jonathan Ullman CS4800: Algorithms & Data Jonathan Ullman Lecture 22: Greedy Algorithms: Huffman Codes Data Compression and Entropy Apr 5, 2018 Data Compression How do we store strings of text compactly? A (binary) code

More information

Motivation for Arithmetic Coding

Motivation for Arithmetic Coding Motivation for Arithmetic Coding Motivations for arithmetic coding: 1) Huffman coding algorithm can generate prefix codes with a minimum average codeword length. But this length is usually strictly greater

More information

Lecture 3 : Algorithms for source coding. September 30, 2016

Lecture 3 : Algorithms for source coding. September 30, 2016 Lecture 3 : Algorithms for source coding September 30, 2016 Outline 1. Huffman code ; proof of optimality ; 2. Coding with intervals : Shannon-Fano-Elias code and Shannon code ; 3. Arithmetic coding. 1/39

More information

Introduction p. 1 Compression Techniques p. 3 Lossless Compression p. 4 Lossy Compression p. 5 Measures of Performance p. 5 Modeling and Coding p.

Introduction p. 1 Compression Techniques p. 3 Lossless Compression p. 4 Lossy Compression p. 5 Measures of Performance p. 5 Modeling and Coding p. Preface p. xvii Introduction p. 1 Compression Techniques p. 3 Lossless Compression p. 4 Lossy Compression p. 5 Measures of Performance p. 5 Modeling and Coding p. 6 Summary p. 10 Projects and Problems

More information

Image Compression. Fundamentals: Coding redundancy. The gray level histogram of an image can reveal a great deal of information about the image

Image Compression. Fundamentals: Coding redundancy. The gray level histogram of an image can reveal a great deal of information about the image Fundamentals: Coding redundancy The gray level histogram of an image can reveal a great deal of information about the image That probability (frequency) of occurrence of gray level r k is p(r k ), p n

More information

Multimedia Communications. Scalar Quantization

Multimedia Communications. Scalar Quantization Multimedia Communications Scalar Quantization Scalar Quantization In many lossy compression applications we want to represent source outputs using a small number of code words. Process of representing

More information

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 13 Competitive Optimality of the Shannon Code So, far we have studied

More information

SIGNAL COMPRESSION Lecture Shannon-Fano-Elias Codes and Arithmetic Coding

SIGNAL COMPRESSION Lecture Shannon-Fano-Elias Codes and Arithmetic Coding SIGNAL COMPRESSION Lecture 3 4.9.2007 Shannon-Fano-Elias Codes and Arithmetic Coding 1 Shannon-Fano-Elias Coding We discuss how to encode the symbols {a 1, a 2,..., a m }, knowing their probabilities,

More information

CSE 421 Greedy: Huffman Codes

CSE 421 Greedy: Huffman Codes CSE 421 Greedy: Huffman Codes Yin Tat Lee 1 Compression Example 100k file, 6 letter alphabet: File Size: ASCII, 8 bits/char: 800kbits 2 3 > 6; 3 bits/char: 300kbits better: 2.52 bits/char 74%*2 +26%*4:

More information

Wavelet Scalable Video Codec Part 1: image compression by JPEG2000

Wavelet Scalable Video Codec Part 1: image compression by JPEG2000 1 Wavelet Scalable Video Codec Part 1: image compression by JPEG2000 Aline Roumy aline.roumy@inria.fr May 2011 2 Motivation for Video Compression Digital video studio standard ITU-R Rec. 601 Y luminance

More information

Shannon-Fano-Elias coding

Shannon-Fano-Elias coding Shannon-Fano-Elias coding Suppose that we have a memoryless source X t taking values in the alphabet {1, 2,..., L}. Suppose that the probabilities for all symbols are strictly positive: p(i) > 0, i. The

More information

ECE533 Digital Image Processing. Embedded Zerotree Wavelet Image Codec

ECE533 Digital Image Processing. Embedded Zerotree Wavelet Image Codec University of Wisconsin Madison Electrical Computer Engineering ECE533 Digital Image Processing Embedded Zerotree Wavelet Image Codec Team members Hongyu Sun Yi Zhang December 12, 2003 Table of Contents

More information

Basic Principles of Lossless Coding. Universal Lossless coding. Lempel-Ziv Coding. 2. Exploit dependences between successive symbols.

Basic Principles of Lossless Coding. Universal Lossless coding. Lempel-Ziv Coding. 2. Exploit dependences between successive symbols. Universal Lossless coding Lempel-Ziv Coding Basic principles of lossless compression Historical review Variable-length-to-block coding Lempel-Ziv coding 1 Basic Principles of Lossless Coding 1. Exploit

More information

Chapter 5: Data Compression

Chapter 5: Data Compression Chapter 5: Data Compression Definition. A source code C for a random variable X is a mapping from the range of X to the set of finite length strings of symbols from a D-ary alphabet. ˆX: source alphabet,

More information

Module 5 EMBEDDED WAVELET CODING. Version 2 ECE IIT, Kharagpur

Module 5 EMBEDDED WAVELET CODING. Version 2 ECE IIT, Kharagpur Module 5 EMBEDDED WAVELET CODING Lesson 13 Zerotree Approach. Instructional Objectives At the end of this lesson, the students should be able to: 1. Explain the principle of embedded coding. 2. Show the

More information

ELEC 515 Information Theory. Distortionless Source Coding

ELEC 515 Information Theory. Distortionless Source Coding ELEC 515 Information Theory Distortionless Source Coding 1 Source Coding Output Alphabet Y={y 1,,y J } Source Encoder Lengths 2 Source Coding Two coding requirements The source sequence can be recovered

More information

A study of image compression techniques, with specific focus on weighted finite automata

A study of image compression techniques, with specific focus on weighted finite automata A study of image compression techniques, with specific focus on weighted finite automata Rikus Muller Thesis presented in partial fulfilment of the requirements for the Degree of Master of Science at the

More information