1 Autumn: Coping with NP-completeness (Conclusion); Introduction to Data Compression
2 Simulated Annealing (Kirkpatrick et al., 1983). Analogy from thermodynamics: the best crystals are found by annealing. First heat up the material to let it bounce from state to state; then slowly cool down the material to allow it to achieve its minimum energy state.
3-10 [Figure, animated across eight slides: a one-dimensional energy landscape with a local minimum and the global minimum; the search must sometimes move uphill to escape the local minimum.]
11 Ingredients: a solution space S, where each x in S is a solution; E(x), the energy of x; a neighborhood of nearby states for each x; and a temperature T. A cooling schedule gives T as a function of the step t of the algorithm, for example fast (exponential) cooling T = a·e^(-bt), or slower cooling T = a/t.
12 Simulated annealing:
   initialize T to be hot;
   choose a starting state;
   repeat
       generate a random move
       evaluate the change in energy dE
       if dE < 0 then accept the move
       else accept the move with probability e^(-dE/T)
       update T
   until T is very small (frozen)
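A minimal sketch of this loop in Python, assuming caller-supplied neighbor() and energy() functions (both hypothetical) and geometric cooling:

    import math
    import random

    def simulated_anneal(start, neighbor, energy, t_hot=100.0, t_frozen=1e-3, alpha=0.99):
        # Generic simulated annealing: neighbor(x) proposes a random move,
        # energy(x) scores a state, and T cools geometrically by alpha.
        x, t = start, t_hot
        while t > t_frozen:                      # until T is very small (frozen)
            y = neighbor(x)                      # generate a random move
            de = energy(y) - energy(x)           # change in energy
            if de < 0 or random.random() < math.exp(-de / t):
                x = y                            # downhill always; uphill sometimes
            t *= alpha                           # update T
        return x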
13 Example: spanning trees. A state is a spanning tree; T' is a neighbor of T if it can be obtained by deleting one edge of T and inserting one edge not in T. The energy of a spanning tree T is its cost c(T). For a neighbor T' of T with dE = c(T') - c(T) > 0, the probability of moving to the higher energy state is e^(-(c(T') - c(T))/T), where the T in the exponent is the temperature. This probability is high when c(T') - c(T) is small or the temperature is large, and low when c(T') - c(T) is large or the temperature is small.
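For concreteness (numbers chosen only for illustration): if a proposed tree costs 5 more than the current one, the move is accepted with probability e^(-5/100) ≈ 0.95 at temperature 100, but only e^(-5/1) ≈ 0.007 at temperature 1. Early in the run almost any move is accepted; near the end, essentially only downhill moves are.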
14 Not a black box algorithm: it requires tuning the cooling parameters. It has been shown to be very effective in finding good solutions for some optimization problems. It is known to converge to an optimal solution, but the time to convergence is very large; most likely it converges to a local optimum. Very little is known about its effectiveness in general.
15 Genetic algorithms: mate solutions to find better ones. Hopfield nets: behave like a neural system settling into a minimum energy state. DNA algorithms: use DNA chemical processes to execute a large search. Quantum algorithms: it is still an open question whether quantum algorithms can solve NP-complete problems in polynomial time.
16 Genetic algorithms: maintain a population of solutions; good solutions mate to obtain even better ones as children; bad mutations are killed. For spanning trees, mating can include all edges of both parents and then break cycles by removing edges that do not belong to both parents.
17 Approximation: a major theoretical effort over the past 30 years. Some results are practical, many are not; there are both positive and negative results. Approximation complexity classes: APX: polynomial time approximation within a constant factor of optimal. PTAS: for each a > 0, a polynomial time (in n) approximation within a factor of (1+a) of optimal. FPTAS: an approximation within a factor of (1+a) of optimal, in time polynomial in n and 1/a.
18 Project: proposal due next Wednesday, Nov. 7th (if by e-mail, send to ladner). Final paper due Friday, December 14th. Purpose: compare different algorithms that solve the same problem, and delve into the literature. Resources: the library, the web, the instructors.
19 The encoder compresses the original x into y; the decoder decompresses y into x̂. Lossless compression: x = x̂; also called entropy coding or reversible coding. Lossy compression: x ≠ x̂; also called irreversible coding. Compression ratio = |x| / |y|, where |x| is the number of bits in x.
20 Why compress? Conserve storage space. Reduce time for transmission: it can be faster to encode, send, then decode than to send the original. Progressive transmission: some compression techniques allow us to send the most important bits first, so we can get a low resolution version of some data before getting the high fidelity version. Reduce computation: use less data to achieve an approximate answer.
21 Lossless compression: data is not lost, because the original is really needed. Examples: text compression; compression of computer binaries to fit on a floppy. The compression ratio is typically no better than 4:1 for lossless compression. Major techniques include Huffman coding, arithmetic coding, dictionary techniques (Ziv, Lempel 1977, 1978), and Sequitur (Nevill-Manning, Witten 1996). Standards: Morse code, Braille, Unix compress, gzip, zip, GIF, JBIG, JPEG.
22 Lossy compression: data is lost, but not too much. Examples: audio; video; still images, medical images, photographs. Compression ratios of 10:1 often yield quite high fidelity results. Major techniques include vector quantization, wavelets, and transforms. Standards: JPEG, MPEG.
23 Information theory: developed by Shannon in the 1940s and 50s; attempts to explain the limits of communication using probability theory. Example: suppose English text is being sent and a 't' has just been received. Given English, the next symbol being a 'z' has very low probability; the next symbol being an 'h' has much higher probability. Receiving a 'z' therefore carries much more information than receiving an 'h': we already knew it was more likely we would receive an 'h'.
24 Suppose we are given symbols {a_1, a_2, ..., a_m}. P(a_i) is the probability of symbol a_i occurring in the absence of any other information, with P(a_1) + P(a_2) + ... + P(a_m) = 1. inf(a_i) = -log_2 P(a_i) bits is the information, in bits, of a_i. [Plot: y = -log_2(x) against x.]
25 Example: {a, b, c} with P(a) = 1/8, P(b) = 1/4, P(c) = 5/8. inf(a) = -log_2(1/8) = 3; inf(b) = -log_2(1/4) = 2; inf(c) = -log_2(5/8) = .678. Receiving an a has more information than receiving a b or c.
26 The entropy is defined for a probability distribution over symbols {a_1, a_2, ..., a_m}: H = -Σ_{i=1}^{m} P(a_i) log_2(P(a_i)). H is the average number of bits required to code up a symbol, given that all we know is the probability distribution of the symbols. H is the Shannon lower bound on the average number of bits to code a symbol in this source model. Stronger models of entropy include context.
27 Examples: {a, b, c} with P(a) = 1/8, P(b) = 1/4, P(c) = 5/8: H = 1/8 × 3 + 1/4 × 2 + 5/8 × .678 = 1.3 bits/symbol. {a, b, c} with P(a) = P(b) = P(c) = 1/3 (worst case): H = -3 × (1/3) × log_2(1/3) = 1.6 bits/symbol. {a, b, c} with P(a) = 1, P(b) = P(c) = 0 (best case): H = -1 × log_2(1) = 0. Note that the standard coding of 3 symbols takes 2 bits.
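A small Python helper (a sketch) reproduces these numbers:

    import math

    def entropy(probs):
        # First order entropy in bits/symbol; the 0 * log(0) terms are dropped.
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(entropy([1/8, 1/4, 5/8]))   # about 1.30 bits/symbol
    print(entropy([1/3, 1/3, 1/3]))   # 1.58, rounded to 1.6 on the slide
    print(entropy([1, 0, 0]))         # 0.0 bits/symbol (best case)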
28 Suppose we have two symbols with probabilities x and 1-x, respectively. [Plot: the binary entropy -(x log_2 x + (1-x) log_2(1-x)) against the probability x of the first symbol; it peaks at 1 bit when x = 1/2.]
29 Suppose we are given a string x_1 x_2 ... x_n over an alphabet {a_1, a_2, ..., a_m}, where P(a_i) is the probability of symbol a_i. The first-order entropy of x_1 x_2 ... x_n is H(x_1 x_2 ... x_n) = Σ_{i=1}^{n} inf(x_i) = -Σ_{i=1}^{n} log_2(P(x_i)). H(x_1 x_2 ... x_n) is a lower bound on the number of bits to code the string x_1 x_2 ... x_n given only the probabilities of the symbols. This is the Shannon lower bound.
30 Suppose we are given an algorithm that compresses a string x of length n, and the algorithm uses only the frequencies of the symbols {a_1, a_2, ..., a_m} in the string as input. Let c(x) be the length in bits of the compressed result. Then c(x) ≥ H(x) = nH, where H = -Σ_{i=1}^{m} (n_i/n) log_2(n_i/n) and n_i is the frequency of a_i in x.
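The bound is easy to compute from the string itself; a sketch (the example string is hypothetical, chosen to have the two 0's and ten 1's of the next slide):

    import math
    from collections import Counter

    def shannon_bound(x):
        # nH: lower bound in bits for any coder that sees only symbol frequencies.
        n = len(x)
        return -sum(c * math.log2(c / n) for c in Counter(x).values())

    print(shannon_bound("110111110111"))   # about 7.8 bits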
31 Example: x is a binary string of length 12 containing ten 1's and two 0's. P(0) = 2/12 and P(1) = 10/12 (from frequencies). H = -((2/12) log_2(2/12) + (10/12) log_2(10/12)) = .65, for a lower bound of 12 × .65 = 7.8 bits. The standard coding for a two symbol alphabet is 1 bit per symbol, or 12 bits, so there is a potential gain for some algorithm.
32 Example: x is a string of length 12 over the alphabet {1, ..., 8} with P(1) = P(2) = P(3) = P(6) = 1/12 and P(4) = P(5) = P(7) = P(8) = 2/12 (from frequencies). H = -((4/12) log_2(1/12) + (8/12) log_2(2/12)) = 2.92, for a lower bound of 12 × 2.92 = 35 bits. The standard coding for an 8 symbol alphabet is 3 bits per symbol, or 36 bits. No compression algorithm will give us much.
33 Example: x is a string such as 1 2 3 4 5 4 5 6 7 8 7 8, in which consecutive symbols differ by 1. Define x_{k+1} = x_k + r_k, where r_k = +1 or -1 (the residual). Compression algorithm: represent x as x_1, r_1, r_2, ..., r_11 and compress this sequence: 3 bits for x_1 and 1 bit for each residual gives about 14 bits instead of 36 bits. This algorithm does not use just the frequencies of the symbols; it uses the correlation between adjacent symbols.
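A sketch of this residual (delta) transform; the string x below is a hypothetical reconstruction consistent with the slide:

    def delta_encode(xs):
        # Represent xs as (first value, successive differences).
        return xs[0], [b - a for a, b in zip(xs, xs[1:])]

    def delta_decode(x1, residuals):
        xs = [x1]
        for r in residuals:
            xs.append(xs[-1] + r)   # x_{k+1} = x_k + r_k
        return xs

    x = [1, 2, 3, 4, 5, 4, 5, 6, 7, 8, 7, 8]
    x1, rs = delta_encode(x)        # eleven residuals, each +1 or -1
    assert delta_decode(x1, rs) == x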
34 Huffman Coding (Huffman, 1951). Uses frequencies of symbols in a string to build a variable rate prefix code: each symbol is mapped to a binary string; more frequent symbols have shorter codes; no code is a prefix of another. Example: a 0, b 100, c 101, d 11.
35 Example: a 0, b 100, c 101, d 11. Coding aabddcaa with a standard 2-bit code takes 16 bits; with the prefix code, 0 0 100 11 11 101 0 0 takes 14 bits. The prefix property ensures unique decodability: scanning left to right, the bits parse unambiguously as a a b d d c a a. Morse code is an example of a variable rate code: E = '.' and Z = '--..'.
36 Example: a 0, b 100, c 101, d 11 as a binary tree. Leaves are labeled with symbols; each edge is labeled 0 (left) or 1 (right). The code of a symbol is the sequence of labels on the path from the root to the symbol.
37 Encoding: send the code, then the encoded data; for x = aabddcaa, c(x) is the Huffman code (a 0, b 100, c 101, d 11) followed by the code of x, 0 0 100 11 11 101 0 0. In some cases (e.g. the H.263 video coder) the code is fixed and need not be sent. Decoding: build the Huffman tree, then use it to decode:
   repeat
       start at root of tree
       while node is not a leaf
           if read bit = 1 then go right else go left
       report leaf
   until input is exhausted
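A sketch of the decoding loop, with internal nodes as (left, right) tuples and leaves as symbols (a hypothetical representation, not the slide's):

    def huffman_decode(bits, tree):
        out, node = [], tree
        for b in bits:
            node = node[int(b)]            # 0 = go left, 1 = go right
            if not isinstance(node, tuple):
                out.append(node)           # reached a leaf: report it
                node = tree                # restart at the root
        return "".join(out)

    tree = ("a", (("b", "c"), "d"))        # the code a=0, b=100, c=101, d=11
    print(huffman_decode("00100111110100", tree))   # aabddcaa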
38 Let p_1, p_2, ..., p_m be the probabilities for the symbols a_1, a_2, ..., a_m, respectively. Define the cost of a Huffman tree T to be C(T) = Σ_{i=1}^{m} p_i r_i, where r_i is the length of the path from the root to a_i. C(T) is the expected length of the code of a symbol coded by the tree T.
39 Example: a 1/2, b 1/8, c 1/8, d 1/4. [Tree: a at depth 1, d at depth 2, b and c at depth 3.] C(T) = 1 × 1/2 + 3 × 1/8 + 3 × 1/8 + 2 × 1/4 = 1.75.
40 Input: probabilities p_1, p_2, ..., p_m for symbols a_1, a_2, ..., a_m, respectively. Output: a Huffman tree that minimizes the average number of bits (the bit rate) to code a symbol, that is, minimizes HC(T) = Σ_{i=1}^{m} p_i r_i, where r_i is the length of the path from the root to a_i.
41 In an optimal Huffman tree, a lowest probability symbol has maximum distance from the root. If not, exchanging a lowest probability symbol with one at maximum distance will lower the cost: let p be a smallest probability, at depth k, and q a probability at maximum depth h, with p < q and k < h. Swapping the two symbols gives C(T') = C(T) + hp - hq + kq - kp = C(T) - (h-k)(q-p) < C(T).
42 The second lowest probability symbol is a sibling of the smallest in some optimal Huffman tree. If not, we can move it there without raising the cost: let q be the second smallest probability, at depth k, and r the probability of the current sibling of the smallest, at maximum depth h, with q ≤ r and k ≤ h. Swapping gives C(T') = C(T) + hq - hr + kr - kq = C(T) - (h-k)(r-q) ≤ C(T).
43 Given an optimal Huffman tree T whose two lowest probability symbols (p smallest, q second smallest) are siblings at maximum depth h, they can be replaced by a new symbol whose probability is the sum of their probabilities. The resulting tree T' is optimal for the new symbol set: C(T') = C(T) + (h-1)(p+q) - hp - hq = C(T) - (p+q).
44 If T' were not optimal for the new symbol set, then we could find a lower cost tree T''. Expanding the merged symbol of T'' back into the two siblings p and q would give a lower cost tree T''' for the original alphabet: C(T''') = C(T'') + p + q < C(T') + p + q = C(T), which is a contradiction.
45 Huffman algorithm:
   form a node for each symbol a_i with weight p_i;
   insert the nodes in a min priority queue ordered by weight;
   while the priority queue has more than one element do
       min1 := delete-min;
       min2 := delete-min;
       create a new node n;
       n.weight := min1.weight + min2.weight;
       n.left := min1;
       n.right := min2;
       insert(n);
   return the last node in the priority queue.
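The same construction in Python, using heapq as the min priority queue (a sketch; the counter only breaks ties so that equal weights never compare subtrees):

    import heapq
    from itertools import count

    def huffman_code(probs):
        # probs: symbol -> probability. Returns symbol -> codeword.
        tick = count()
        heap = [(p, next(tick), sym) for sym, p in probs.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            p1, _, left = heapq.heappop(heap)    # min1 := delete-min
            p2, _, right = heapq.heappop(heap)   # min2 := delete-min
            heapq.heappush(heap, (p1 + p2, next(tick), (left, right)))
        code = {}
        def walk(node, prefix):
            if isinstance(node, tuple):
                walk(node[0], prefix + "0")      # left edge labeled 0
                walk(node[1], prefix + "1")      # right edge labeled 1
            else:
                code[node] = prefix or "0"
        walk(heap[0][2], "")
        return code

    # Code lengths 1, 2, 3, 4, 4 (bit rate 2.1), as on the next slides;
    # ties may assign b, d, e to different codewords of the same length.
    print(huffman_code({"a": .4, "b": .1, "c": .3, "d": .1, "e": .1}))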
46-49 Example: P(a) = .4, P(b) = .1, P(c) = .3, P(d) = .1, P(e) = .1. [Figures, four slides, showing the construction: merge b and e (.1 + .1 = .2); merge d with that node (.1 + .2 = .3); merge c with that node (.3 + .3 = .6); finally merge a (.4 + .6 = 1).]
50 Resulting code: a 0, c 10, d 110, b 1110, e 1111. The average number of bits per symbol is .4 × 1 + .1 × 4 + .3 × 2 + .1 × 3 + .1 × 4 = 2.1.
51 P(a) = .4, P(b) = .1, P(c) = .3, P(d) = .1, P(e) = .1. Entropy: H = -(.4 × log_2(.4) + .1 × log_2(.1) + .3 × log_2(.3) + .1 × log_2(.1) + .1 × log_2(.1)) = 2.05 bits per symbol. Huffman code: HC = .4 × 1 + .1 × 4 + .3 × 2 + .1 × 3 + .1 × 4 = 2.1 bits per symbol. Pretty good!
52 Exercise: P(a) = 1/2, P(b) = 1/4, P(c) = 1/8, P(d) = 1/16, P(e) = 1/16. Compute the Huffman tree and its bit rate; compute the entropy; compare.
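One way to check your answer: these probabilities are all powers of 1/2, so the Huffman code lengths come out to exactly -log_2(p_i), namely 1, 2, 3, 4, 4, and the bit rate equals the entropy: H = HC = 1/2 × 1 + 1/4 × 2 + 1/8 × 3 + 1/16 × 4 + 1/16 × 4 = 1.875 bits per symbol.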
53 The Huffman code is within one bit of the entropy lower bound: H ≤ HC ≤ H + 1. Huffman coding does not work well with a two symbol alphabet. Example: P(0) = 1/100, P(1) = 99/100. HC = 1 bit/symbol, while H = -((1/100) log_2(1/100) + (99/100) log_2(99/100)) = .08 bits/symbol, so HC is far above H.
54 Assuming independence, P(ab) = P(a)P(b), so we can lump symbols together. Example: P(0) = 1/100, P(1) = 99/100 gives P(00) = 1/10000, P(01) = P(10) = 99/10000, P(11) = 9801/10000. HC = 1.03 bits per 2-bit symbol = .515 bits/bit. Still not that close to H = .08 bits/bit.
55 Suppose we add a one symbol context: in compressing a string x_1 x_2 ... x_n, we take x_{k-1} into account when encoding x_k. This is a new model, so the entropy based on just the independent probabilities of the symbols no longer applies. The new entropy model (second order entropy) has, for each symbol, a probability for each other symbol following it. Example for {a, b, c}, giving P(next | prev) (the row for context a is recovered from the next slide):

   prev \ next    a    b    c
   a             .4   .2   .4
   b             .1   .9    0
   c             .1   .1   .8
56 Huffman coding with context: a code for the first symbol, plus a separate Huffman code for each context, built from the conditional probabilities above (context a: a .4, b .2, c .4; context b: a .1, b .9, c 0; context c: a .1, b .1, c .8). [Figure: the first-symbol code and the three per-context Huffman trees.]
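A sketch of what the context buys, computing the entropy of each conditional distribution in the table (values as reconstructed above):

    import math

    def entropy(probs):
        return -sum(p * math.log2(p) for p in probs if p > 0)

    ctx = {"a": [.4, .2, .4], "b": [.1, .9, 0], "c": [.1, .1, .8]}
    for prev, dist in ctx.items():
        print(prev, round(entropy(dist), 2))   # a 1.52, b 0.47, c 0.92
    # In contexts b and c the next symbol costs far less than the
    # 1.58 bits/symbol of three equally likely symbols.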
57 The time to design a Huffman code is O(n log n), where n is the number of symbols. Probabilities may come from the frequencies in the string to be compressed, or from the frequencies in a training set for a fixed codebook. Huffman works better for larger alphabets. There are adaptive (one pass) Huffman coding algorithms, with no need to transmit the code. Huffman is still popular: it is simple and it works.
58 Golomb coding: for a binary source with 0's much more frequent than 1's. A variable-to-variable length prefix code that encodes runs of 0's. Golomb code of order 4 (one standard bit assignment):

   input   output
   0000    0
   1       100
   01      101
   001     110
   0001    111
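A sketch of this scheme for any order m that is a power of 2 (so a run remainder fits in log2(m) bits); the function and variable names are mine:

    def golomb_encode(bits, m=4):
        # Runs of 0's: each complete block of m zeros costs one '0';
        # a run of j < m zeros ended by a 1 costs '1' + j in log2(m) bits.
        k = m.bit_length() - 1
        out, run = [], 0
        for b in bits:
            if b == "0":
                run += 1
                if run == m:
                    out.append("0")
                    run = 0
            else:
                out.append("1" + format(run, "0%db" % k))
                run = 0
        return "".join(out)         # (a trailing partial run is left unflushed)

    print(golomb_encode("0000"))    # 0
    print(golomb_encode("1"))       # 100
    print(golomb_encode("0001"))    # 111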
60 Let p be the probability of 0. Choose the order k such that p^k + p^(k+1) ≤ 1 < p^(k-1) + p^k. [Figure: the input patterns ordered from 0^k (most likely) down to the runs ended by a 1 (least likely).]
61 Example: p = .89. p^7 ≈ .44, p^6 ≈ .50, p^5 ≈ .56, so p^6 + p^7 ≤ 1 < p^5 + p^6, and the order is k = 6. [Table: the order 6 Golomb code.]
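The order can be computed mechanically from the condition on the previous slide (a sketch):

    def golomb_order(p):
        # Smallest k with p^k + p^(k+1) <= 1 < p^(k-1) + p^k.
        k = 1
        while p**k + p**(k + 1) > 1:
            k += 1
        return k

    print(golomb_order(0.89))   # 6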
62 [Plot: bits per symbol of the Golomb code, by order, versus the entropy, as a function of p.]
63 Golomb coding is very effective as a run-length code. As p goes to 1, its effectiveness approaches the entropy. There are adaptive Golomb codes, in which the order changes according to the frequency of 0's.