Information and Entropy. Professor Kevin Gold
1 Information and Entropy Professor Kevin Gold
2 What's Information?
Informally, when I communicate a message to you, that's information: "Your grade is 100/100."
Information can be encoded as a signal:
- Words on the page
- Sound waves in the air
- Bits on a hard drive, in RAM, or in an Internet packet
3 Same Information, More or Less Signal
I can be more or less wordy while communicating the same information:
- "Your grade on the midterm is 100/100."
- "Your grade is 100/100."
- "100"
The number of possible messages constrains how short the message can be. I can get away with the shortest code, "+", only if there are few possible grades (+, , -). If there's no context, I need to preface the grade by specifying "on the midterm," or the meaning is ambiguous.
4 Using Fixed-Length Binary Codes
In addition to using bits to represent numbers in binary, we can use bits to represent arbitrary messages. Suppose we have N messages we'd like to send. We can assign binary numbers to represent these messages: 000 = Hello, 001 = Goodbye, 010 = Send Help, 011 = I'm fine, 100 = Thanks, 101 = I'm trapped in a computer, ...
How many bits do we need? Enough to have a code for each message. We get 2^b codes with b bits, so we need ceil(log2 N) bits for N messages. More possible messages require more bits: 8 bits gives up to 2^8 = 256 messages.
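As a quick sketch (not from the slides), the bits-needed rule is easy to check in Python:

```python
import math

def bits_needed(n_messages):
    """Minimum bits in a fixed-length code covering n_messages distinct messages."""
    return math.ceil(math.log2(n_messages))

print(bits_needed(6))    # the 6-message example above fits in 3 bits
print(bits_needed(256))  # 256 messages need 8 bits
```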
5 Sample Fixed-Length Codes: ASCII and the Unicode BMP
ASCII is still often used to encode characters at a flat rate of 8 bits per character: 01000001 = A, 01000010 = B, 00100100 = $, 00100101 = %, 00110000 = 0, 00110001 = 1, ...
Unicode can use more than 16 bits if necessary, but the codes in the 16-bit basic multilingual plane (BMP) can handle a wide variety of international characters, as well as things like mathematical notation and emoji.
[Table: hex for Unicode codes (x varies by column)]
6 Why Fixed-Length? Avoiding Ambiguity
Fixed-length codes are a convenient way to know where one code ends and another begins: 0100001001000001 = BA in ASCII; we know the cut is at 8 bits.
Suppose our codes from the previous example went: 0 = Hello, 1 = Goodbye, 10 = Send Help, 11 = I'm fine, 100 = Thanks, 101 = I'm trapped in a computer.
Is 0101 "Hello, Goodbye, Hello, Goodbye" or "Hello, I'm trapped in a computer"? We can't tell here.
7 An Advantage of Variable-Length Codes: Compression
Codes like ASCII assign the same number of bits regardless of whether the character is common ("e") or uncommon ("~"). But we can assign variable-length codes to symbols, where more common symbols get shorter codes, and end up sending fewer bits overall.
Example: Compress AAAAAAAABDCB.
Code 1: 00 = A, 01 = B, 10 = C, 11 = D. Bitstring is 000000000000000001111001 (24 bits).
Code 2: 0 = A, 10 = B, 110 = C, 111 = D. Bitstring is 000000001011111010 (18 bits).
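A minimal sketch of this comparison in Python (the codes and the message are from the slide):

```python
# Compare fixed-length (Code 1) and variable-length (Code 2) encodings
# of the slide's example message.
fixed_code = {'A': '00', 'B': '01', 'C': '10', 'D': '11'}
variable_code = {'A': '0', 'B': '10', 'C': '110', 'D': '111'}

message = "AAAAAAAABDCB"
fixed_bits = ''.join(fixed_code[ch] for ch in message)
variable_bits = ''.join(variable_code[ch] for ch in message)
print(len(fixed_bits))     # 24
print(len(variable_bits))  # 18
```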
8 The Prefix Property
Ambiguity arises when a decoder doesn't know whether to accept a code as done or keep reading more. But we can arrange for no code to be the beginning (prefix) of any other code. This is called the prefix property. Codes that obey the prefix property don't need spaces or separating symbols sent between codes to understand what is being sent.
Decoding with 0 = A, 10 = B, 110 = C, 111 = D (obeys prefix property): 0 is the start of no symbol but A, so the first character is A. The same logic applies to the next 7 symbols; they must each be A. Read 1: it could start 3 codes. Read 0: 10 is a unique start, so it's B. [etc.]
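Because of the prefix property, a decoder can emit a symbol the moment its accumulated bits match a codeword. A sketch of such a decoder, assuming the code from the slide:

```python
def decode(bits, code):
    """Decode a bitstring with a prefix code given as {symbol: codeword}."""
    inverse = {codeword: symbol for symbol, codeword in code.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:                # no codeword is a prefix of another,
            symbols.append(inverse[current])  # so the first match is correct
            current = ""
    return ''.join(symbols)

code = {'A': '0', 'B': '10', 'C': '110', 'D': '111'}
print(decode('000000001011111010', code))  # AAAAAAAABDCB
```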
9 Morse Code Doesn't Have the Prefix Property
Invented before we understood codes well, Morse code is ambiguous without pauses between symbols (compare "A," dot-dash, to "ET," dot then dash). On a computer, an extra symbol or sequence to denote the next character would waste space; here, the pause wasted time.
10 Codes that Obey the Prefix Property Can Be Modeled With Trees
[Tree diagram for 0 = A, 10 = B, 110 = C, 111 = D: left edges are 0, right edges are 1, and A, B, C, D sit at the leaves.]
Every variable-length code that obeys the prefix property can be modeled as a binary tree (a tree with a root node and at most two children per node). Edges are labeled 0 or 1, and the leaves are labeled with the symbols to encode. Decode each character by following labeled branches from the root. This must obey the prefix property as long as symbols are only at leaves.
11 Optimal Codes Require a Number of Bits Related to Their Probability
Compression is optimal if it uses as few bits as possible to represent the same message. Variable-length codes such as the one we just showed can be optimal as long as they are organized in the right way according to character frequency. There is an algorithm to do this: Huffman coding, which automatically creates a tree from character counts.
There is an interesting property of the number of bits used per symbol in this compression: it can't be less than -log2 p, where p is the probability (frequency) of the character.
12 Huffman Coding Example
We have a file with: 16 a's, 4 b's, 4 c's, 2 d's, 1 e, 1 f, 4 g's (32 total).
First, create 1 node per character (these are our trees): a:16, b:4, c:4, d:2, e:1, f:1, g:4.
13 Huffman Coding Example
Now, join the two trees with the smallest counts: e (1) and f (1) combine into a tree with count 2.
14 Huffman Coding Example
Now, join the two trees with the smallest counts, and repeat: d (2) joins the e-f tree (2), giving a tree with count 4.
15 Huffman Coding Example
Repeat: b (4) and c (4) combine into a tree with count 8.
16 Huffman Coding Example
Repeat: g (4) joins the d-e-f tree (4), giving another tree with count 8.
17 Huffman Coding Example
Repeat: the two trees with count 8 combine into a tree with count 16.
18 Huffman Coding Example
Finally, a (16) joins the count-16 tree, giving the complete tree with count 32.
19 Huffman Coding Example
Label each left edge 0 and each right edge 1.
20 Huffman Coding Example
Read off the code for each symbol by following the edges from the root: a = 0, b = 100, c = 101, g = 110, d = 1110, e = 11110, f = 11111.
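The whole procedure (join the two smallest trees until one remains, then read the codes off the tree) can be sketched with a heap in Python. This is an illustrative implementation, not the slides' own; tie-breaking may produce different bit assignments, but the code lengths come out the same:

```python
import heapq

def huffman_codes(counts):
    """Build a Huffman code from {symbol: count}; returns {symbol: bitstring}."""
    # Heap entries are (count, tiebreaker, tree); a tree is a symbol or a pair.
    heap = [(count, i, symbol)
            for i, (symbol, count) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, left = heapq.heappop(heap)    # the two smallest trees...
        c2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (c1 + c2, next_id, (left, right)))  # ...joined
        next_id += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):          # internal node: 0 left, 1 right
            walk(tree[0], prefix + '0')
            walk(tree[1], prefix + '1')
        else:                                # leaf: record this symbol's code
            codes[tree] = prefix or '0'
    walk(heap[0][2], '')
    return codes

counts = {'a': 16, 'b': 4, 'c': 4, 'd': 2, 'e': 1, 'f': 1, 'g': 4}
codes = huffman_codes(counts)
print({s: len(codes[s]) for s in sorted(codes)})  # code length per symbol
```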
21 Example of Optimal Symbol Compression
We have a file with: 16 a's, 4 b's, 4 c's, 2 d's, 1 e, 1 f, 4 g's (32 total).
Pr(a) = 16/32 = 1/2; -log2(1/2) = -(-1) = 1 bit
Pr(b) = 4/32 = 1/8; -log2(1/8) = -(-3) = 3 bits
Pr(c) = 4/32 = 1/8; -log2(1/8) = -(-3) = 3 bits
Pr(d) = 2/32 = 1/16; -log2(1/16) = -(-4) = 4 bits
Pr(e) = 1/32; -log2(1/32) = -(-5) = 5 bits
Pr(f) = 1/32; -log2(1/32) = -(-5) = 5 bits
Pr(g) = 4/32 = 1/8; -log2(1/8) = -(-3) = 3 bits
22 Example of Optimal Symbol Compression
The Huffman code we built matches these bounds exactly: a = 0 (1 bit), b = 100 (3 bits), c = 101 (3 bits), g = 110 (3 bits), d = 1110 (4 bits), e = 11110 (5 bits), f = 11111 (5 bits).
23 MPEG Coding Uses Variable-Length Codes
To compress video and audio, the MPEG standard gives variable-length codes for features such as brightness, amount of movement in part of the scene, and texture. Some values are more common than others, and the more common ones get shorter codes:
- Flat textures (fewer bits) vs. busy textures (more bits)
- No movement (fewer bits) vs. movement (more bits)
- No change from the previous video frame (no bits) vs. change (more bits)
[Image: ffmpeg debug output showing motion vectors]
24 JPEG Is Similar, Leading to Predictable Differences in File Size
An image with flat or regular textures compresses to about 17 KB; an image with unpredictable texture compresses to about 322 KB (both 1024x768 JPEGs).
25 Why Not Use Variable-Length Codes (VLCs)?
Though VLCs save memory, there's no way to get random access to a code down the line in constant time. Like linked lists, it takes linear time to access a code, by decoding all the codes before it. To get random access with fixed-length codes, we can use the same trick as arrays: multiply the index by the code length to get the place to look for a particular code.
For this reason, VLCs tend to be used just for compression, and files are decompressed in a single linear-time pass before being used. Most common data types have fixed lengths, as speed of access is more important than memory use.
26 Getting Used to -log2 p
If N symbols are equally likely, they each have probability p = 1/N. There's nothing to be gained from variable-length codes in this case (it's all symmetric), so we'd assign the same number of bits to everything. We determined that we need ceil(log2 N) bits to represent all N codes. If we drop the ceiling and replace N with 1/p, we have log2(1/p) = log2(p^-1) = -log2 p.
Thus, plugging in probabilities that correspond to equally likely outcomes will produce a reasonable number of bits: -log2(1/2) = 1 bit for a coin flip; -log2(1/4) = 2 bits for 4 possible values, 00, 01, 10, 11; -log2(1/256) = 8 bits; and so on.
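A quick sketch confirming those values:

```python
import math

# -log2(p) for the equally-likely cases from the slide
for p in [1/2, 1/4, 1/256]:
    print(p, -math.log2(p), "bits")  # 1, 2, and 8 bits respectively
```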
27 A Mathematical Definition of Information
The -log2 p bound on the number of bits doesn't depend on what kind of thing we're talking about; it's something we can calculate about any event with a probability. For any event, we can ask, "How many bits should we use to talk about this event, in an optimal encoding?"
This quantity is interesting because it is smaller when events are unsurprising (high p) and larger when events are surprising (low p). This matches our everyday idea of being "informative" well enough that the quantity gained the name "the information" of an event. If p is the probability of an event, -log2 p is its information. Even outside a computer context, information is measured in bits.
28 In This Sense, the Image on the Right Has More Information
[The two images from slide 24 again: about 17 KB vs. about 322 KB.]
"Information" here doesn't imply that the message is interesting, just that it had low probability and therefore takes more bits to encode. Because the image on the right is more unpredictable, it takes more bits to reproduce it exactly. If a source of information is always extremely unpredictable, we say it has high entropy, to be defined more formally next.
29 Entropy
In information theory, entropy is the expected information for a source of symbols. Entropy is E[I(X)], where X is an event and I(X) = -log2 Pr(X) is its information. Applying the definition of expectation, this is the sum over X of -Pr(X) log2 Pr(X).
It simultaneously represents:
- How many bits we need to use, on average, to encode the stream of symbols
- How unpredictable each symbol is, on average
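The definition translates directly into a short function. As a sketch (not from the slides), checked against the examples that follow:

```python
import math

def entropy(probs):
    """Expected information in bits: sum over X of -Pr(X) * log2(Pr(X))."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

print(entropy([1/2, 1/2]))             # fair coin: 1.0 bit
print(entropy([1/4] * 4))              # fair 4-sided die: 2.0 bits
print(entropy([1/2, 1/4, 1/8, 1/8]))   # unfair die: 1.75 bits
```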
30 Entropy Examples
Series of coin flips: ABABAABBAB. Pr(A) = 1/2, Pr(B) = 1/2. Entropy = -1/2 log2(1/2) - 1/2 log2(1/2) = -1/2(-1) - 1/2(-1) = 1/2 + 1/2 = 1 bit. (H = 0, T = 1)
Series of rolls of a 4-sided die: 1, 2, 4, 1, 3, 2, 4, 3. Pr(1) = Pr(2) = Pr(3) = Pr(4) = 1/4. Each term is -(1/4) log2(1/4) = (1/4)(2) = 1/2, so the entropy is 1/2 + 1/2 + 1/2 + 1/2 = 2 bits. (00, 01, 10, 11)
31 Entropy Examples
Pr("I am Groot") = 1: -1 log2(1) = -1(0) = 0 bits. (Although the entropy could be higher if we considered that Groot could speak or not.)
Unfair 4-sided die that rolls 4 half the time, 3 a quarter of the time, and 1 and 2 an eighth of the time each: -1/2 log2(1/2) - 1/4 log2(1/4) - 1/8 log2(1/8) - 1/8 log2(1/8) = -1/2(-1) - 1/4(-2) - 1/8(-3) - 1/8(-3) = 1/2 + 1/2 + 3/8 + 3/8 = 1.75 bits.
Notice that this is less than the fair die; this die is more predictable.
32 Entropy Examples
256 equally likely characters: the sum of -1/256 log2(1/256) over all 256 characters is log2(256) = 8 bits. If we really have no way of predicting what's coming next, we need all 8 bits every time.
256 characters where lowercase letters and digits (36 characters) are used with equal probability 60% of the time; capital letters and 10 special characters (36) are used with equal probability 30% of the time; and the remaining characters split the remaining 10%:
Pr(any particular lowercase letter or digit) = 0.6 x 1/36 ≈ 0.0167
Pr(any particular capital or special char) = 0.3 x 1/36 ≈ 0.0083
Pr(other) = 0.1 x 1/(256 - 72) = 0.1/184 ≈ 0.00054
36(-0.0167 log2 0.0167) + 36(-0.0083 log2 0.0083) + 184(-0.00054 log2 0.00054) ≈ 6.69 bits under optimal encoding. Not that big a deal, really.
33 Quick Check: Entropy
Calculate the entropy for the following sequence of characters (assuming the observed frequencies reflect the underlying probabilities): ABCA
(To get you started, notice Pr(A) = 2/4 = 1/2 and that its term is -(1/2) log2(1/2) = (1/2)(1) = 1/2.)
34 Quick Check: Entropy
Calculate the entropy for the following sequence of characters: ABCA
Pr(A) = 1/2: -(1/2) log2(1/2) = (1/2)(1) = 1/2
Pr(B) = 1/4: -(1/4) log2(1/4) = (1/4)(2) = 1/2
Pr(C) is identical to Pr(B), so the entropy is 1/2 + 1/2 + 1/2 = 1.5 bits.
Sanity check: this makes sense because the code 0, 10, 11 uses 1 bit half the time and 2 bits half the time.
35 Physical Entropy
Informational entropy, defined in the 1940s by Claude Shannon, gets its name from physical entropy, defined in the 19th century in physics (specifically thermodynamics). In physics, entropy refers to how unpredictable a physical system is; the equation is -k_B Σ_X Pr(X) ln Pr(X), where the X are physical events and k_B is a physical constant we don't need to discuss. You can see the resemblance to Shannon's concept.
In physics, entropy is associated with heat, since higher temperature leads to higher unpredictability of the particles. If describing physical events, these quantities differ only by a constant.
36 Information Gain
Just as learning new information can change our perception of a probability, it can also reduce entropy. For example, we could learn that the next symbol is definitely not an A. This would reduce our surprise when we get a B or a C.
When we learn information that rules out particular possibilities or changes the probability distribution, we can calculate a new entropy. The difference in entropies is called the information gain.
37 Information Gain Example
We're on an assembly line with appliances coming down the pipe: 1/2 dishwashers, 1/4 dryers, 1/4 stoves. The current entropy is 1/2(1) + 1/4(2) + 1/4(2) = 1.5 bits. If we recorded the sequence, it would take 1.5 bits per appliance on average if 0 = dishwasher, 10 = dryer, 11 = stove.
The message comes down the line: "No more dishwashers today!" New probabilities: 1/2 dryer, 1/2 stove. The new entropy is (1/2)(1) + (1/2)(1) = 1 bit. We could encode what happens now using just 0 = dryer, 1 = stove.
The information gain of "no dishwashers" is 1.5 - 1 = 0.5 bits.
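A sketch of this calculation in Python, using an entropy helper written from the definition (not code from the slides):

```python
import math

def entropy(probs):
    """Expected information in bits for a probability distribution."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

before = entropy([1/2, 1/4, 1/4])  # dishwasher, dryer, stove
after = entropy([1/2, 1/2])        # after "no more dishwashers today"
print(before, after, before - after)  # 1.5 1.0 0.5
```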
38 Information Gain on an Image
It's common for uncompressed images to represent pixels as triples (R, G, B) where each value is 0-255. If we treat all colors as equally likely, the entropy of the pixel values is -log2(1/(256*256*256)) = -log2(1/2^24) = 24 bits, precisely the bits needed to send each pixel. (We skipped summing over 2^24 values and dividing by 2^24; these operations cancel out.)
Suppose we learn that the image is grayscale, meaning all pixels are (x, x, x) for some x in [0, 255]. The new entropy is -log2(1/256) = 8 bits (again, the true bits per pixel). The information gain is thus 16 bits: a measure of the savings per pixel.
39 Information Gain in Machine Learning
Some classifiers try to focus on features that have information gain relative to the training example classifications.
Example: If classifying images as octopus or not, a "yes" is less surprising once you know the animal has 8 arms.
[Diagram: before the split (base rate), Octopus 0.25, Not Octopus 0.75. After splitting on arm count: the "8 arms" pile has Octopus 1, Not Octopus 0; the "< 8 arms" pile has Octopus 0.05, Not Octopus 0.95. Both piles have lower entropy.]
40 Information Gain in Machine Learning
Worse features will tend to do little or nothing to reduce the entropy, and these can be ignored in the classification.
[Diagram: before the split (base rate), Octopus 0.25, Not Octopus 0.75. After splitting on color: the "orange" pile has Octopus 0.2, Not Octopus 0.8; the "not orange" pile has Octopus 0.3, Not Octopus 0.7. Little or no information gain.]
41 Entropy and Steganography
Steganography is the hiding of information in plain sight. One way cyberattacks can occur (or messages can be transmitted) is by embedding payload bits in the low-order bits of innocuous images, videos, or PDFs.
Examining the least significant bits for high entropy (unpredictability) can reveal that something is not right: the payload is typically compressed, creating abnormally high entropy.
42 Summary
Information for a particular event can be calculated directly from its probability p: -log2 p. This is the lower bound on the number of bits necessary to encode this event. 2 equiprobable events => 1 bit each; 4 equiprobable events => 2 bits.
When p varies among symbols, we can take advantage of this to create variable-length codes (using Huffman coding) that use the optimal number of bits without becoming ambiguous.
The entropy of a stream of symbols is the expected information: a measure of the average number of bits we need, and of how unpredictable or surprising the source is.
More informationKolmogorov complexity ; induction, prediction and compression
Kolmogorov complexity ; induction, prediction and compression Contents 1 Motivation for Kolmogorov complexity 1 2 Formal Definition 2 3 Trying to compute Kolmogorov complexity 3 4 Standard upper bounds
More informationCoding of memoryless sources 1/35
Coding of memoryless sources 1/35 Outline 1. Morse coding ; 2. Definitions : encoding, encoding efficiency ; 3. fixed length codes, encoding integers ; 4. prefix condition ; 5. Kraft and Mac Millan theorems
More informationInformation Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay
Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 13 Competitive Optimality of the Shannon Code So, far we have studied
More informationImplementation of Lossless Huffman Coding: Image compression using K-Means algorithm and comparison vs. Random numbers and Message source
Implementation of Lossless Huffman Coding: Image compression using K-Means algorithm and comparison vs. Random numbers and Message source Ali Tariq Bhatti 1, Dr. Jung Kim 2 1,2 Department of Electrical
More informationBASIC COMPRESSION TECHNIQUES
BASIC COMPRESSION TECHNIQUES N. C. State University CSC557 Multimedia Computing and Networking Fall 2001 Lectures # 05 Questions / Problems / Announcements? 2 Matlab demo of DFT Low-pass windowed-sinc
More informationDCSP-3: Minimal Length Coding. Jianfeng Feng
DCSP-3: Minimal Length Coding Jianfeng Feng Department of Computer Science Warwick Univ., UK Jianfeng.feng@warwick.ac.uk http://www.dcs.warwick.ac.uk/~feng/dcsp.html Automatic Image Caption (better than
More informationInformation Theory, Statistics, and Decision Trees
Information Theory, Statistics, and Decision Trees Léon Bottou COS 424 4/6/2010 Summary 1. Basic information theory. 2. Decision trees. 3. Information theory and statistics. Léon Bottou 2/31 COS 424 4/6/2010
More informationChapter 2 Date Compression: Source Coding. 2.1 An Introduction to Source Coding 2.2 Optimal Source Codes 2.3 Huffman Code
Chapter 2 Date Compression: Source Coding 2.1 An Introduction to Source Coding 2.2 Optimal Source Codes 2.3 Huffman Code 2.1 An Introduction to Source Coding Source coding can be seen as an efficient way
More informationSIGNAL COMPRESSION Lecture Shannon-Fano-Elias Codes and Arithmetic Coding
SIGNAL COMPRESSION Lecture 3 4.9.2007 Shannon-Fano-Elias Codes and Arithmetic Coding 1 Shannon-Fano-Elias Coding We discuss how to encode the symbols {a 1, a 2,..., a m }, knowing their probabilities,
More informationNumber Representation and Waveform Quantization
1 Number Representation and Waveform Quantization 1 Introduction This lab presents two important concepts for working with digital signals. The first section discusses how numbers are stored in memory.
More information10-704: Information Processing and Learning Fall Lecture 10: Oct 3
0-704: Information Processing and Learning Fall 206 Lecturer: Aarti Singh Lecture 0: Oct 3 Note: These notes are based on scribed notes from Spring5 offering of this course. LaTeX template courtesy of
More informationDecision Trees. CSC411/2515: Machine Learning and Data Mining, Winter 2018 Luke Zettlemoyer, Carlos Guestrin, and Andrew Moore
Decision Trees Claude Monet, The Mulberry Tree Slides from Pedro Domingos, CSC411/2515: Machine Learning and Data Mining, Winter 2018 Luke Zettlemoyer, Carlos Guestrin, and Andrew Moore Michael Guerzhoy
More informationCh 0 Introduction. 0.1 Overview of Information Theory and Coding
Ch 0 Introduction 0.1 Overview of Information Theory and Coding Overview The information theory was founded by Shannon in 1948. This theory is for transmission (communication system) or recording (storage
More informationReal-Time Audio and Video
MM- Multimedia Payloads MM-2 Raw Audio (uncompressed audio) Real-Time Audio and Video Telephony: Speech signal: 2 Hz 3.4 khz! 4 khz PCM (Pulse Coded Modulation)! samples/sec x bits = 64 kbps Teleconferencing:
More informationImage Data Compression
Image Data Compression Image data compression is important for - image archiving e.g. satellite data - image transmission e.g. web data - multimedia applications e.g. desk-top editing Image data compression
More informationMATH2206 Prob Stat/20.Jan Weekly Review 1-2
MATH2206 Prob Stat/20.Jan.2017 Weekly Review 1-2 This week I explained the idea behind the formula of the well-known statistic standard deviation so that it is clear now why it is a measure of dispersion
More informationMurray Gell-Mann, The Quark and the Jaguar, 1995
Although [complex systems] differ widely in their physical attributes, they resemble one another in the way they handle information. That common feature is perhaps the best starting point for exploring
More informationClassical Information Theory Notes from the lectures by prof Suhov Trieste - june 2006
Classical Information Theory Notes from the lectures by prof Suhov Trieste - june 2006 Fabio Grazioso... July 3, 2006 1 2 Contents 1 Lecture 1, Entropy 4 1.1 Random variable...............................
More informationDigital Image Processing Lectures 25 & 26
Lectures 25 & 26, Professor Department of Electrical and Computer Engineering Colorado State University Spring 2015 Area 4: Image Encoding and Compression Goal: To exploit the redundancies in the image
More informationClassification and Regression Trees
Classification and Regression Trees Ryan P Adams So far, we have primarily examined linear classifiers and regressors, and considered several different ways to train them When we ve found the linearity
More informationCompression. Reality Check 11 on page 527 explores implementation of the MDCT into a simple, working algorithm to compress audio.
C H A P T E R 11 Compression The increasingly rapid movement of information around the world relies on ingenious methods of data representation, which are in turn made possible by orthogonal transformations.the
More informationMACHINE LEARNING INTRODUCTION: STRING CLASSIFICATION
MACHINE LEARNING INTRODUCTION: STRING CLASSIFICATION THOMAS MAILUND Machine learning means different things to different people, and there is no general agreed upon core set of algorithms that must be
More informationCOMM901 Source Coding and Compression. Quiz 1
German University in Cairo - GUC Faculty of Information Engineering & Technology - IET Department of Communication Engineering Winter Semester 2013/2014 Students Name: Students ID: COMM901 Source Coding
More informationExamples MAT-INF1100. Øyvind Ryan
Examples MAT-INF00 Øyvind Ryan February 9, 20 Example 0.. Instead of converting 76 to base 8 let us convert it to base 6. We find that 76//6 = 2 with remainder. In the next step we find 2//6 = 4 with remainder.
More informationInformation Theory and Coding Techniques
Information Theory and Coding Techniques Lecture 1.2: Introduction and Course Outlines Information Theory 1 Information Theory and Coding Techniques Prof. Ja-Ling Wu Department of Computer Science and
More informationCIS 2033 Lecture 5, Fall
CIS 2033 Lecture 5, Fall 2016 1 Instructor: David Dobor September 13, 2016 1 Supplemental reading from Dekking s textbook: Chapter2, 3. We mentioned at the beginning of this class that calculus was a prerequisite
More informationRandomized Decision Trees
Randomized Decision Trees compiled by Alvin Wan from Professor Jitendra Malik s lecture Discrete Variables First, let us consider some terminology. We have primarily been dealing with real-valued data,
More informationCounting. 1 Sum Rule. Example 1. Lecture Notes #1 Sept 24, Chris Piech CS 109
1 Chris Piech CS 109 Counting Lecture Notes #1 Sept 24, 2018 Based on a handout by Mehran Sahami with examples by Peter Norvig Although you may have thought you had a pretty good grasp on the notion of
More informationNP-Completeness I. Lecture Overview Introduction: Reduction and Expressiveness
Lecture 19 NP-Completeness I 19.1 Overview In the past few lectures we have looked at increasingly more expressive problems that we were able to solve using efficient algorithms. In this lecture we introduce
More informationDiscrete Mathematics and Probability Theory Fall 2010 Tse/Wagner MT 2 Soln
CS 70 Discrete Mathematics and Probability heory Fall 00 se/wagner M Soln Problem. [Rolling Dice] (5 points) You roll a fair die three times. Consider the following events: A first roll is a 3 B second
More informationDigital Systems Roberto Muscedere Images 2013 Pearson Education Inc. 1
Digital Systems Digital systems have such a prominent role in everyday life The digital age The technology around us is ubiquitous, that is we don t even notice it anymore Digital systems are used in:
More informationLecture 17: Trees and Merge Sort 10:00 AM, Oct 15, 2018
CS17 Integrated Introduction to Computer Science Klein Contents Lecture 17: Trees and Merge Sort 10:00 AM, Oct 15, 2018 1 Tree definitions 1 2 Analysis of mergesort using a binary tree 1 3 Analysis of
More informationGod doesn t play dice. - Albert Einstein
ECE 450 Lecture 1 God doesn t play dice. - Albert Einstein As far as the laws of mathematics refer to reality, they are not certain; as far as they are certain, they do not refer to reality. Lecture Overview
More informationWe are here. Assembly Language. Processors Arithmetic Logic Units. Finite State Machines. Circuits Gates. Transistors
CSC258 Week 3 1 Logistics If you cannot login to MarkUs, email me your UTORID and name. Check lab marks on MarkUs, if it s recorded wrong, contact Larry within a week after the lab. Quiz 1 average: 86%
More informationFormalizing Probability. Choosing the Sample Space. Probability Measures
Formalizing Probability Choosing the Sample Space What do we assign probability to? Intuitively, we assign them to possible events (things that might happen, outcomes of an experiment) Formally, we take
More informationOptimal codes - I. A code is optimal if it has the shortest codeword length L. i i. This can be seen as an optimization problem. min.
Huffman coding Optimal codes - I A code is optimal if it has the shortest codeword length L L m = i= pl i i This can be seen as an optimization problem min i= li subject to D m m i= lp Gabriele Monfardini
More informationSTA Module 4 Probability Concepts. Rev.F08 1
STA 2023 Module 4 Probability Concepts Rev.F08 1 Learning Objectives Upon completing this module, you should be able to: 1. Compute probabilities for experiments having equally likely outcomes. 2. Interpret
More information