Greedy Alg: Huffman abhi shelat

Similar documents
abhi shelat

L feb abhi shelat

BFS Dijkstra. Oct abhi shelat

Algorithm Design and Analysis

Greedy Algorithms. CSE 101: Design and Analysis of Algorithms Lecture 10

Algorithm Design and Analysis

Shortest Path Algorithms

CSE 421 Greedy: Huffman Codes

Slides for CIS 675. Huffman Encoding, 1. Huffman Encoding, 2. Huffman Encoding, 3. Encoding 1. DPV Chapter 5, Part 2. Encoding 2

4.8 Huffman Codes. These lecture slides are supplied by Mathijs de Weerd

3 Greedy Algorithms. 3.1 An activity-selection problem

25. Minimum Spanning Trees

General Methods for Algorithm Design

25. Minimum Spanning Trees

1 Basic Definitions. 2 Proof By Contradiction. 3 Exchange Argument

Discrete Wiskunde II. Lecture 5: Shortest Paths & Spanning Trees

IS 709/809: Computational Methods in IS Research Fall Exam Review

Greedy. Outline CS141. Stefano Lonardi, UCR 1. Activity selection Fractional knapsack Huffman encoding Later:

CS 4407 Algorithms Lecture: Shortest Path Algorithms

CS 253: Algorithms. Chapter 24. Shortest Paths. Credit: Dr. George Bebis

21. Dynamic Programming III. FPTAS [Ottman/Widmayer, Kap. 7.2, 7.3, Cormen et al, Kap. 15,35.5]

Fibonacci Heaps These lecture slides are adapted from CLRS, Chapter 20.

Discrete Optimization 2010 Lecture 2 Matroids & Shortest Paths

CS4800: Algorithms & Data Jonathan Ullman

2-INF-237 Vybrané partie z dátových štruktúr 2-INF-237 Selected Topics in Data Structures

Dijkstra s Single Source Shortest Path Algorithm. Andreas Klappenecker

CS781 Lecture 3 January 27, 2011

Lecture 1 : Data Compression and Entropy

Analysis of Algorithms. Outline. Single Source Shortest Path. Andres Mendez-Vazquez. November 9, Notes. Notes

Divide-and-Conquer Algorithms Part Two

CS 580: Algorithm Design and Analysis

Chapter 4. Greedy Algorithms. Slides by Kevin Wayne. Copyright 2005 Pearson-Addison Wesley. All rights reserved.

Single Source Shortest Paths

8 Priority Queues. 8 Priority Queues. Prim s Minimum Spanning Tree Algorithm. Dijkstra s Shortest Path Algorithm

Greedy Algorithms and Data Compression. Curs 2018

Information Theory and Statistics Lecture 2: Source coding

Breadth First Search, Dijkstra s Algorithm for Shortest Paths

CMPSCI611: The Matroid Theorem Lecture 5

Heaps and Priority Queues

Design and Analysis of Algorithms

CSE 202: Design and Analysis of Algorithms Lecture 3

Discrete Mathematics

CMPS 2200 Fall Carola Wenk Slides courtesy of Charles Leiserson with small changes by Carola Wenk. 10/8/12 CMPS 2200 Intro.

Autumn Coping with NP-completeness (Conclusion) Introduction to Data Compression

CS 161: Design and Analysis of Algorithms

Chapter 4. Greedy Algorithms. Slides by Kevin Wayne. Copyright 2005 Pearson-Addison Wesley. All rights reserved.

Introduction to Algorithms

Introduction to Algorithms

Minimum Spanning Trees

CSE101: Design and Analysis of Algorithms. Ragesh Jaiswal, CSE, UCSD

Data Structures in Java

Algorithms Booklet. 1 Graph Searching. 1.1 Breadth First Search

CS Data Structures and Algorithm Analysis

CS583 Lecture 11. Many slides here are based on D. Luebke slides. Review: Dynamic Programming

An Introduction to Matroids & Greedy in Approximation Algorithms

ICS 252 Introduction to Computer Design

Introduction to Algorithms

CS 6301 PROGRAMMING AND DATA STRUCTURE II Dept of CSE/IT UNIT V GRAPHS

1 Some loose ends from last time

CS 161: Design and Analysis of Algorithms

CS60020: Foundations of Algorithm Design and Machine Learning. Sourangshu Bhattacharya

Matching Residents to Hospitals

First Lemma. Lemma: The cost function C h has the ASI-Property. Proof: The proof can be derived from the definition of C H : C H (AUVB) = C H (A)

Lecture 9. Greedy Algorithm

Single Source Shortest Paths

Contents Lecture 4. Greedy graph algorithms Dijkstra s algorithm Prim s algorithm Kruskal s algorithm Union-find data structure with path compression

Proof methods and greedy algorithms

Context-Free Languages

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Matroids and Greedy Algorithms Date: 10/31/16

CMPS 6610 Fall 2018 Shortest Paths Carola Wenk

16. Binary Search Trees. [Ottman/Widmayer, Kap. 5.1, Cormen et al, Kap ]

Text Compression. Jayadev Misra The University of Texas at Austin December 5, A Very Incomplete Introduction to Information Theory 2

Mon Tue Wed Thurs Fri

Assignment 5: Solutions

16. Binary Search Trees. [Ottman/Widmayer, Kap. 5.1, Cormen et al, Kap ]

EE 121: Introduction to Digital Communication Systems. 1. Consider the following discrete-time communication system. There are two equallly likely

SIGNAL COMPRESSION Lecture 7. Variable to Fix Encoding

Chapter 10 Search Structures

CS325: Analysis of Algorithms, Fall Final Exam

Greedy Algorithms and Data Compression. Curs 2017

Fundamental Algorithms

Coding of memoryless sources 1/35

FIBONACCI HEAPS. preliminaries insert extract the minimum decrease key bounding the rank meld and delete. Copyright 2013 Kevin Wayne

Algorithms Theory. 08 Fibonacci Heaps

Algorithm Design Strategies V

CS60007 Algorithm Design and Analysis 2018 Assignment 1

Fibonacci (Min-)Heap. (I draw dashed lines in place of of circular lists.) 1 / 17

CPSC 320 Sample Final Examination December 2013

SIGNAL COMPRESSION Lecture Shannon-Fano-Elias Codes and Arithmetic Coding

ENS Lyon Camp. Day 2. Basic group. Cartesian Tree. 26 October

Chapter 5 Data Structures Algorithm Theory WS 2017/18 Fabian Kuhn

FINAL EXAM PRACTICE PROBLEMS CMSC 451 (Spring 2016)

Algorithm Design and Analysis

Algorithm Design and Analysis (NTU CSIE, Fall 2017) Homework #3. Homework #3. Due Time: 2017/12/14 (Thu.) 17:20 Contact TAs:

Notes on induction proofs and recursive definitions

Introduction to Algorithms

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Dynamic Programming II Date: 10/12/17

Search Algorithms. Analysis of Algorithms. John Reif, Ph.D. Prepared by

Chapter 11. Min Cut Min Cut Problem Definition Some Definitions. By Sariel Har-Peled, December 10, Version: 1.

Join Ordering. Lemma: The cost function C h has the ASI-Property. Proof: The proof can be derived from the definition of C H :

Transcription:

L15 Greedy Alg: Huffman 4102 10.17.201 abhi shelat

Huffman Coding

image: wikimedia

In testimony before the committee, Mr. Lew stressed that the Treasury Department would run out of extraordinary measures to free up cash in a matter of days. At that point, the country s bills might overwhelm its cash on hand plus any receipts from taxes or other sources, leading to an unprecedented default.mr. Lew said that Treasury had no workarounds to avoid breaching the debt ceiling. There is no plan other than raising the debt limit, he said. The legal issues, even regarding interest and principal on the debt, are complicated.

m In testimony before the committee, Mr. Lew stressed that the Treasury Department would run out of extraordinary measures to free up cash in a matter of days. At that point, the country s bills might overwhelm its cash on hand plus any receipts from taxes or other sources, leading to an unprecedented default.mr. Lew said that Treasury had no workarounds to avoid breaching the debt ceiling. There is no plan other than raising the debt limit, he said. The legal issues, even regarding interest and principal on the debt, are complicated.

def: cost of an encoding B(T,{f c })= X c2c f c `c e: 25 000 i: 200 001 o: 170 010 u: 7 011 p: 7 100 g: 47 101 b: 40 110 f: 24 111 1

e: 240 i: 20061 a: 199 o: 17092 r: 160491 n: 1521 t: 152570 s: 192 l: 10172 c: 1007 u: 7211 p: 7077 m: 70504 d: 6007 h: 64165 y: 51527 g: 47011 b: 4051 f: 24110 v: 2010 k: 16012 w: 125 z: 49 x: 6926 q: 729 j: 075 00 225 150 75 0 character frequency e i a o r n t s l c u p m d h y g b f v k w z x q j

morse code

def: prefix code e: 25 0 i: 200 10 o: 170 110 u: 7 1110 p: 7 11110 g: 47 111110 b: 40 1111110 f: 24 11111110

prefix code binary tree

e: 25 0 i: 200 10 o: 170 110 u: 7 1110 p: 7 11110 g: 47 111110 b: 40 1111110 f: 24 11111110 code to binary tree 111111010111110

e: 25 00 i: 200 01 o: 170 10 u: 7 110 p: 7 111 2 2 2 e i o u p use tree to encode messages

goal given the character frequencies {f c } c C produce a prefix code T with smallest cost min T B(T,{f c})

property y Lemma:optimal tree must be full. x a b

divide & conquer?

counter-example e: 2 i: 25 o: 20 u: 1 p: 5

)"!'(!## &$# %$ $% "$ "#!" e i o u p g b f

!'(!## &$# %$ $% "$ )" e i o u p g "#!" b f

!'(!## &$# %$ $% )" "$ e i o u p g "#!" b f

200 i 70 170 o 1 25 e 511 165 276 111 e: 25 01 i: 200 11 o: 170 10 u: 7 0011 p: 7 0010 g: 47 0000 b: 40 00011 f: 24 00010 470 400 40 4 12 1 200 120 27 7 u 7 p 64 47 g 40 b 24 f

objective

lemma: exchange argument

lemma: T exchange argument Let x, y C be characters with smallest frequencies f x,f y. There exists an optimal prefix code T for C in which x, y are siblings. That is, the codes for x, y have the same length and only di er in the last bit. The optimal solution for consists of computing an optimal solution for y x a b

lemma: exchange argument Let x, y C be characters with smallest frequencies f x,f y. There exists an optimal prefix code T for C in which x, y are siblings. That is, the codes for x, y have the same length and only di er in the last bit. The optimal solution for consists of computing an optimal solution for T T y b x a a b x y

exchange argument proof: lemma: Let x, y C be characters with smallest frequencies f x,f y. There exists an optimal prefix code T for C in which x, y are siblings. That is, the codes for x, y have the same length and only di er in the last bit. The optimal solution for consists of computing an optimal solution for

exchange argument lemma: Let x, y C be characters with smallest frequencies f x,f y. There exists an optimal prefix code T for C in which x, y are siblings. That is, the codes for x, y have the same length and only di er in the last bit. The optimal solution for consists of computing an optimal solution for T y first step x a b

exchange argument lemma: Let x, y C be characters with smallest frequencies f x,f y. There exists an optimal prefix code T for C in which x, y are siblings. That is, the codes for x, y have the same length and only di er in the last bit. The optimal solution for consists of computing an optimal solution for T T y first step y x a a b x b f a f b f x f y f y f b f x f a

T T y y x f x f a a a b x b

T T y y x f x f a a a b x b B(T )= f c c + f x x + f a a B(T )= f c c + f x x + f a a c c B(T ) B(T ) 0

T T y b a a x b x y B(T ) B(T ) 0

T T T y y b x a a a b x b x y

T T T y y b x a a a b x b x y B(T ) B(T ) 0 B(T ) B(T ) 0 Tis also optimal

exchange argument lemma: Let x, y C be characters with smallest frequencies f x,f y. There exists an optimal prefix code T for C in which x, y are siblings. That is, the codes for x, y have the same length and only di er in the last bit. The optimal solution for consists of computing an optimal solution for T T y b x a a b x y

f c!'(!## &$# %$ $% "$ optimal sub-structure f x "# f y!"

f c optimal sub-structure!'(!## &$# %$ $% "$ f x "# f y!" problem of size n!'(!## &$# %$ $% "$ )" f c problem of size n-1 f z

f c optimal sub-structure!'(!## &$# %$ $% "$ problem of size n!'(!## &$# %$ $% "$ )" f c problem of size n-1 f x "# f z f y!" Lemma:

f c optimal sub-structure!'(!## &$# %$ $% "$ problem of size n!'(!## &$# %$ $% "$ )" f c problem of size n-1 f x "# f z f y!" Lemma: The optimal solution for T consists of computing an optimal solution for T and replacing the left z with a node having children x, y.

T z z

T T z x y B(T ) B(T )

T T z x y B(T ) B(T ) B(T )=B(T ) f x f y

Suppose T is not optimal

Suppose U T is not optimal B(U) <B(T ) x y

Suppose U T is not optimal B(U) <B(T ) x y U z

Suppose U T is not optimal B(U) <B(T ) x y B(U )=B(U) f x f y U < B(t) - fx - fy z But this implies that B(T ) was not optimal.

therefore T z x y

summary of argument

MST

connecting houses image:www.princegeorgeva.org, thefranciscofamily.org, www.rightdriveacademy.co.uk, www.ccscambridge.org, www.drawingcoach.com, www.pastoral.org.uk, www.daasgallery.com

connecting houses image:www.princegeorgeva.org, thefranciscofamily.org, www.rightdriveacademy.co.uk, www.ccscambridge.org, www.drawingcoach.com, www.pastoral.org.uk, www.daasgallery.com

connecting houses image:www.princegeorgeva.org, thefranciscofamily.org, www.rightdriveacademy.co.uk, www.ccscambridge.org, www.drawingcoach.com, www.pastoral.org.uk, www.daasgallery.com

connecting houses image:www.princegeorgeva.org, thefranciscofamily.org, www.rightdriveacademy.co.uk, www.ccscambridge.org, www.drawingcoach.com, www.pastoral.org.uk, www.daasgallery.com

connecting houses image:www.princegeorgeva.org, thefranciscofamily.org, www.rightdriveacademy.co.uk, www.ccscambridge.org, www.drawingcoach.com, www.pastoral.org.uk, www.daasgallery.com

connecting houses image:www.princegeorgeva.org, thefranciscofamily.org, www.rightdriveacademy.co.uk, www.ccscambridge.org, www.drawingcoach.com, www.pastoral.org.uk, www.daasgallery.com

connecting houses image:www.princegeorgeva.org, thefranciscofamily.org, www.rightdriveacademy.co.uk, www.ccscambridge.org, www.drawingcoach.com, www.pastoral.org.uk, www.daasgallery.com

connecting houses image:www.princegeorgeva.org, thefranciscofamily.org, www.rightdriveacademy.co.uk, www.ccscambridge.org, www.drawingcoach.com, www.pastoral.org.uk, www.daasgallery.com

b d g 10 2 a 9 e 5 9 i 12 11 c 1 f 6 h

graphs clrs [ch 22]

representation G =(V,E) adjacency list space: time list neighbors: time check an edge:

representation G =(V,E) adjacency matrix space: time list neighbors: time check an edge:

definition: path a sequence of nodes with the property that simple path: cycle:

definition:tree connected graph: a tree is

what we want: b d g 10 2 a 9 e 5 9 i 12 11 c 1 f 6 h

minimum spanning tree looking for a set of edges that T E (a) connects all vertices (b) has the least cost min (u,v) T w(u, v)

looking for a set of edges thatt E (a) connects all vertices (b) has the least cost facts min (u,v) T w(u, v) how many edges does solution have? does solution have a cycle?

strategy start with an empty set of edges A repeat for v-1 times: add lightest edge that does not create a cycle

example b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

kruskal b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

kruskal b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

kruskal b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

kruskal b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

kruskal b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

kruskal b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

kruskal b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

kruskal b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

kruskal e d f c b a 10 1 9 12 6 5 9 11 2 i g h 7 e d f c b a 10 1 9 12 6 5 9 11 2 i g h 7 e d f c b a 10 1 9 12 6 5 9 11 2 i g h 7 e d f c b a 10 1 9 12 6 5 9 11 2 i g h 7 e d f c b a 10 1 9 12 6 5 9 11 2 i g h 7 e d f c b a 10 1 9 12 6 5 9 11 2 i g h 7 e d f c b a 10 1 9 12 6 5 9 11 2 i g h 7 e d f c b a 10 1 9 12 6 5 9 11 2 i g h 7

why does this work? 1 T 2 repeat V 1 times: add to T the lightest edge e E that does not create a cycle

definition: cut

example of a cut b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

definition: crossing a cut

definition: crossing a cut an edge e =(u, crosses v) a graph cut (S,V-S) if u S v V S b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

example of a crossing b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

definition: respect

theorem: cut property

suppose the set of edges thm: cut property A is part of an m.s.t. let (S, V S) be any cut that respects A. let edge e be the min-weight edge across (S, V S) then: A {e} is part of an m.s.t.

example of theorem b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

d 5 7 h 9 g 11 2 e 6 i b f 10 9 1 a 12 c

Theorem 2 Suppose the set of edges A is part of a minimum spanning tree of G = (V, E). Let (S, V S) be any cut that respects A and let e be the edge with the minimum weight that crosses (S, V S). Then the set A {e} is part of a minimum spanning tree.

proof of cut thm d 7 5 g 9 2 11 i e h 6 c 1 f 9 12 b 10 a

Kruskal-pseudocode(G) correctness 1 A 2 repeat V 1 times: add to A the lightest edge e E that does not create a cycle

Kruskal-pseudocode(G) correctness 1 A 2 repeat V 1 times: add to A the lightest edge e E that does not create a cycle proof: by induction. in step 1, A is part of some MST. suppose that after k steps, A is part of some MST (line 2). in line, we add an edge e to A.

cases for edge e

cases for edge e e must be lightest edge crossing

analysis? Kruskal-pseudocode(G) 1 A 2 repeat V 1 times: add to A the lightest edge e E that does not create a cycle

General-MST-Strategy(G =(V,E)) 1 A 2 repeat V 1 times: Pick a cut (S, V S) that respects A 4 Let e be min-weight edge over cut (S, V S) 5 A A {e}

prim General-MST-Strategy(G =(V,E)) 1 A 2 repeat V 1 times: Pick a cut (S, V S) that respects A 4 Let e be min-weight edge over cut (S, V S) 5 A A {e} A is a subtree edge e is lightest edge that grows the subtree

prim b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

prim b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

prim b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

prim b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

prim b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

prim b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

prim b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

prim b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

prim b d g 10 7 2 a 9 e 5 9 i 12 11 c 1 f 6 h

idea: implementation

implementation

new data structure

binary heap full tree, key value <= to key of children

binary heap full tree, key value <= to key of children ' &# ( && &' ) :!' ""

binary heap full tree, key value <= to key of children ' &# ( && &' ) :!' "" %

binary heap full tree, key value <= to key of children ' &# ( && % ) :!' "" &'

binary heap full tree, key value <= to key of children how to extractmin? ' % ( && &# ) :!' "" &'

binary heap full tree, key value <= to key of children how to extractmin? &' % ( && &# ) :!' "" &'

binary heap full tree, key value <= to key of children how to extractmin? ( % &' && &# ) :!' "" &'

binary heap full tree, key value <= to key of children how to extractmin? ( % ) && &# &' :!' "" &'

binary heap full tree, key value <= to key of children how to decreasekey? ( % ) && &# &' :!' ""

binary heap full tree, key value <= to key of children how to decreasekey? ( $ ) && % &' :!' ""

implementation use a priority queue to keep track of light edges insert: makequeue: extractmin: decreasekey:

algorithm

prim(g =(V,E)) implementation 1 Q Q is a Priority Queue 2 Initialize each v V with key k v, v nil Pick a starting node r and set k r 0 4 Insert all nodes into Q with key k v. 5 while Q = 6 do u extract-min(q) 7 for each v Adj (u) do if v Q and w(u, v) <k v 9 then v u 10 decrease-key(q, v, w(u, v)) Sets k v w(u, v)