Lossless Compression Lossy Compression


Administrivia
CSE 390 Introduction to Data Compression, Spring 2003
Lecture 1: Introduction to Data Compression. Entropy. Prefix Codes.
Instructor: Prof. Alexander Mohr, mohr@cs.sunysb.edu, office hours: TBA
Web: http://mnl.cs.sunysb.edu/class/cse390/2004-fall/
Mailing list: http://mnl.cs.sunysb.edu/mailman/listinfo/cse390/
Please subscribe by Monday!

Text Book
Khalid Sayood, Introduction to Data Compression, Second Edition, Morgan Kaufmann Publishers, 2000, ISBN 1558605584. $79.95 list.

Basic Data Compression Concepts
original x -> Encoder -> compressed y -> Decoder -> decompressed x̂
Lossless compression: x = x̂. Also called entropy coding, reversible coding.
Lossy compression: x ≠ x̂. Also called irreversible coding.
Compression ratio = |x| / |y|, where |x| is the number of bits in x.

Compression Ratios: Beware!
Compression ratio = |x| / |y|.
Two ways to make the ratio larger:
Decrease the size of the compressed version.
Increase the size of the uncompressed version!

Why Compress
Conserve storage space.
Reduce time for transmission: it is faster to encode, send, and decode than to send the original.
Progressive transmission: some compression techniques allow us to send the most important bits first, so we can get a low resolution version of some data before getting the high fidelity version.
Reduce computation: use less data to achieve an approximate answer.

Braille
System to read text by feeling raised dots on paper (or on electronic displays).
Invented in the 1820s by Louis Braille, a French blind man.
[Figure: Braille cells for single letters such as z and for the contractions "and", "the", "with", "mother", "th", "ch", "gh".]

Braille Example
Clear text:
Call me Ishmael. Some years ago -- never mind how long precisely -- having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world.
(238 characters)
Grade 2 Braille in ASCII:
,call me ,i%mael4 ,"s ye>s ago -- n"e m9d h[ l;g precisely -- hav+ ll or no m"oy 9 my purse & no?+ "picul> 6 9t]e/ me on %ore ,i ?"\ ,i wd sail ab a ll & see ! wat]y "p ( ! _w4
(203 characters)
Compression ratio = 238/203 = 1.17

Lossless Compression
Data is not lost - the original is really needed.
Text compression.
Compression of computer binaries to fit on a floppy.
Compression ratio typically no better than 4:1 for lossless compression on many kinds of files.
Statistical techniques: Huffman coding. Arithmetic coding. Golomb coding.
Dictionary techniques: LZW, LZ77. Sequitur. Burrows-Wheeler Method.
Standards - Morse code, Braille, Unix compress, gzip, zip, bzip, GIF, PNG, JBIG, Lossless JPEG.

Lossy Compression
Data is lost, but not too much:
Audio. Video. Still images, medical images, photographs.
Compression ratios of 10:1 often yield quite high fidelity results.
Major techniques include: Vector Quantization. Wavelets. Block transforms.
Standards - JPEG, JPEG 2000, MPEG (1, 2, 4, 7).
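The lossless condition x = x̂ and the ratio |x| / |y| can be checked on a small example. The sketch below is only an illustration: it uses Python's standard `zlib` module (the DEFLATE method behind gzip and zip) as the encoder and decoder, and the input string is a hypothetical, deliberately redundant one, so the ratio it reports says nothing about typical files.

```python
import zlib


def compression_ratio(original: bytes, compressed: bytes) -> float:
    """Return |x| / |y| with both sizes measured in bits."""
    return (8 * len(original)) / (8 * len(compressed))


# Hypothetical input chosen to be highly redundant.
x = b"abracadabra " * 200

y = zlib.compress(x)        # encoder: x -> y
x_hat = zlib.decompress(y)  # decoder: y -> x_hat

ratio = compression_ratio(x, y)
print("lossless:", x == x_hat)
print(f"{len(x)} bytes -> {len(y)} bytes, ratio = {ratio:.1f}")
```

Because the input repeats one short phrase, the ratio here is far larger than the 4:1 figure quoted for typical files, which is the "Beware!" point about inflating the uncompressed version.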

Why is Data Compression Possible
Most data from nature has redundancy.
There is more data than the actual information contained in the data.
Squeezing out the excess data amounts to compression.
However, unsqueezing is necessary to be able to figure out what the data means.
Always possible to compress? Consider a two-bit sequence. Can you always compress it to one bit?
Information theory is needed to understand the limits of compression and give clues on how to compress well.

What is Information?
Analog data: also called continuous data. Represented by real numbers (or complex numbers).
Digital data: finite set of symbols {a_1, a_2, ..., a_n}. All data is represented as sequences (strings) in the symbol set.
Example: {a, b, c, d, r}: abracadabra.
Digital data can be an approximation to analog data.

Symbols
Roman alphabet plus punctuation.
ASCII - 256 symbols.
Binary {0, 1}: 0 and 1 are called bits.
All digital information can be represented in binary.
{a, b, c, d} fixed length representation: a -> 00; b -> 01; c -> 10; d -> 11. 2 bits per symbol.

Exercise - Bits Per Symbol
Suppose we have n symbols. How many bits (as a function of n) are necessary to represent a symbol in binary?
Hint: Turn the problem around: how many symbols n can be represented, as a function of the number of bits b?

Discussion: Non-powers of 2
Can we do better than a fixed length representation for non-powers-of-2?

Information Theory
Developed by Shannon in the 1940's and 50's.
Attempts to explain the limits of communication using probability theory.
Example: Suppose English text is being sent. It is more likely you receive an e than a z. In some sense, z has more information than e because you expect e.
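For the fixed length representation in the Symbols slide and the bits-per-symbol exercise, one way to check an answer is a short sketch like the following. It assumes the usual result that b bits can distinguish 2^b symbols, so n symbols need the smallest b with 2^b >= n; the function names are hypothetical.

```python
def bits_per_symbol(n: int) -> int:
    """Smallest b such that 2**b >= n, i.e. ceil(log2(n)) for n >= 1."""
    if n < 1:
        raise ValueError("need at least one symbol")
    return (n - 1).bit_length()


def fixed_length_code(symbols: str) -> dict:
    """Assign each symbol a binary codeword of the same length."""
    # Use at least one bit so a one-symbol alphabet still gets a codeword.
    width = max(bits_per_symbol(len(symbols)), 1)
    return {s: format(i, f"0{width}b") for i, s in enumerate(symbols)}


print(fixed_length_code("abcd"))
print(bits_per_symbol(5), "bits for the 5 symbols {a, b, c, d, r}")
print(bits_per_symbol(256), "bits for 256 symbols")
```

With 5 symbols, 3 bits are needed even though 3 bits could label 8 symbols, which is the waste the non-powers-of-2 discussion asks about.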

First-order Information
Suppose we are given symbols {a_1, a_2, ..., a_m}.
$P(a_i)$ = probability of symbol $a_i$ occurring in the absence of any other information.
$P(a_1) + P(a_2) + \dots + P(a_m) = 1$
$\mathrm{inf}(a_i) = -\log_2 P(a_i)$ is the information of $a_i$ in bits.
[Figure: plot of y = -log(x) against x for x between 0 and 1.]

Example
{a, b, c} with P(a) = 1/8, P(b) = 1/4, P(c) = 5/8
$\mathrm{inf}(a) = -\log_2(1/8) = 3$
$\mathrm{inf}(b) = -\log_2(1/4) = 2$
$\mathrm{inf}(c) = -\log_2(5/8) = 0.678$
Receiving an a has more information than receiving a b or c.

First Order Entropy
The first order entropy is defined for a probability distribution over symbols {a_1, a_2, ..., a_m}:
$$H = -\sum_{i=1}^{m} P(a_i) \log_2 P(a_i)$$
H is the average number of bits required to code up a symbol, given all we know is the probability distribution of the symbols.
H is the Shannon lower bound on the average number of bits to code a symbol in this source model.
Stronger models of entropy include context. We'll talk about this later.

Entropy Examples
{a, b, c} with 1/8, 1/4, 5/8.
$H = 1/8 \cdot 3 + 1/4 \cdot 2 + 5/8 \cdot 0.678 = 1.3$ bits/symbol
{a, b, c} with 1/3, 1/3, 1/3 (worst case).
$H = -3 \cdot (1/3) \cdot \log_2(1/3) = 1.6$ bits/symbol
Note that the standard coding of 3 symbols takes 2 bits.

An Extreme Case
{a, b, c} with p(a) = 1, p(b) = 0, p(c) = 0.
H = ?

Entropy Curve
Suppose we have two symbols with probabilities x and 1-x, respectively.
[Figure: plot of the entropy $-(x \log x + (1-x)\log(1-x))$ against the probability of the first symbol; maximum entropy at x = 0.5.]
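The information and entropy values on these slides can be reproduced numerically. The sketch below is a minimal illustration using Python's `math.log2`; the function names are hypothetical, and it adopts the common convention that a symbol with probability 0 contributes nothing to the sum, which is an assumption not stated on the slides.

```python
import math


def information(p: float) -> float:
    """First-order information -log2(p), in bits, for probability p > 0."""
    return math.log2(1 / p)


def entropy(probs) -> float:
    """First-order entropy H = -sum p_i * log2(p_i), in bits per symbol.

    Terms with p_i == 0 are skipped (assumed convention: 0 * log 0 = 0).
    """
    return sum(p * information(p) for p in probs if p > 0)


print(information(1 / 8), information(1 / 4), round(information(5 / 8), 3))
print(round(entropy([1 / 8, 1 / 4, 5 / 8]), 1))  # skewed distribution
print(round(entropy([1 / 3, 1 / 3, 1 / 3]), 1))  # uniform over 3 symbols
print(entropy([0.5, 0.5]))                       # peak of the entropy curve
```

Under the stated convention, a distribution that puts all probability on one symbol has entropy 0, since the received symbol is never a surprise.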

A Simple Prefix Code
{a, b, c} with 1/8, 1/4, 5/8.
A prefix code is defined by a binary tree.
Prefix code property: no output is a prefix of another.
[Figure: binary tree for the code, with a table of input symbols and output codewords.]

Decoding a prefix code:
repeat
    start at root of tree
    repeat
        if read bit = 1 then go right
        else go left
    until node is a leaf
    report leaf
until end of the code
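The decoding loop above can be sketched in Python as follows. The codeword table is an assumption: the tree figure did not survive in this transcript, so the sketch uses a = 00, b = 01, c = 1, which is one prefix code for {a, b, c} and not necessarily the one drawn on the slide. The tree is stored as nested dictionaries keyed by bit, with "1" playing the role of "go right" and "0" of "go left".

```python
# Assumed codeword table (hypothetical; the slide's tree is not recoverable).
CODE = {"a": "00", "b": "01", "c": "1"}


def build_tree(code: dict) -> dict:
    """Build a binary tree as nested dicts; leaves are the symbols."""
    root = {}
    for symbol, word in code.items():
        node = root
        for bit in word[:-1]:
            node = node.setdefault(bit, {})
        node[word[-1]] = symbol
    return root


def encode(text: str, code: dict) -> str:
    """Concatenate the codewords of the symbols in text."""
    return "".join(code[s] for s in text)


def decode(bits: str, root: dict) -> str:
    """Walk the tree bit by bit, reporting a symbol at each leaf."""
    out = []
    node = root                    # start at root of tree
    for bit in bits:
        node = node[bit]           # "1" -> right child, "0" -> left child
        if isinstance(node, str):  # node is a leaf
            out.append(node)       # report leaf
            node = root            # restart at the root
    if node is not root:
        raise ValueError("input ends in the middle of a codeword")
    return "".join(out)


tree = build_tree(CODE)
bits = encode("cabc", CODE)
print(bits, "->", decode(bits, tree))
```

No separator between codewords is needed: because no codeword is a prefix of another, reaching a leaf always marks the end of exactly one codeword.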


How Good is the Code?
[Figure: the code tree with leaf probabilities 1/8, 1/4, 5/8.]
bit rate = (1/8)(2) + (1/4)(2) + (5/8)(1) = 11/8 = 1.375 bps
Entropy = 1.3 bps
Standard code = 2 bps
(bps = bits per symbol)

Exercise 1
Player 1: pick a string from alphabet {a, b, c, d} and encode the string using the tree on the board.
Player 2: decode the string player 1 gives you. Check for equality.
(While you wait, try Exercise 2.)

Exercise 2
String: abracadabra. Alphabet: {a, b, c, d, r}.
Design a prefix code that compresses the string the most!
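The bit-rate calculation, and one possible approach to Exercise 2, can be sketched as below. The first function reproduces the weighted average of codeword lengths (2, 2 and 1 bits for probabilities 1/8, 1/4 and 5/8). The second builds a Huffman code, one of the statistical techniques named in the lecture; choosing Huffman here is this sketch's assumption, not a solution given on the slides, and the function names are hypothetical.

```python
import heapq
from collections import Counter


def average_bits(probs: dict, lengths: dict) -> float:
    """Average codeword length in bits per symbol."""
    return sum(probs[s] * lengths[s] for s in probs)


def huffman_code(text: str) -> dict:
    """Build a prefix code by repeatedly merging the two lightest subtrees."""
    freq = Counter(text)
    # Heap entries: (weight, unique tie-breaker, {symbol: codeword so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


rate = average_bits({"a": 1 / 8, "b": 1 / 4, "c": 5 / 8},
                    {"a": 2, "b": 2, "c": 1})
print("bit rate:", rate, "bps")

text = "abracadabra"
code = huffman_code(text)
total_bits = sum(len(code[ch]) for ch in text)
print(code)
print("total bits:", total_bits)
```

For comparison, a fixed length code for the 5-symbol alphabet needs 3 bits per symbol, or 33 bits for the 11-character string. The individual codewords depend on how ties are broken, but the total length of a Huffman code for a given string does not.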