Dynamic Programming. Weighted Interval Scheduling. Algorithmic Paradigms.


Algorithmic Paradigms

Greedy. Build up a solution incrementally, myopically optimizing some local criterion.

Divide-and-conquer. Break up a problem into two sub-problems, solve each sub-problem independently, and combine solutions to sub-problems to form a solution to the original problem.

Dynamic programming. Break up a problem into a series of overlapping sub-problems, and build up solutions to larger and larger sub-problems.

Algorithm Design by Éva Tardos and Jon Kleinberg. Portions Copyright 2005 Addison Wesley. Slides by Kevin Wayne and Neil Rhodes.

Dynamic Programming Steps

1. Characterize the structure of an optimal solution.
2. Define a recurrence for the value of an optimal solution.
3. Compute the value of an optimal solution, either bottom-up or top-down with memoization.
4. Construct the optimal solution from the optimal value and intermediate information.

Weighted Interval Scheduling

Weighted interval scheduling problem.
- Job j starts at s_j, finishes at f_j, and has weight or value v_j.
- Two jobs are compatible if they don't overlap.
- Goal: find a maximum-weight subset of mutually compatible jobs.

(figure: eight jobs a-h laid out on a time axis from 0 to 11)

Unweighted Interval Scheduling Review

Recall. The greedy algorithm works if all weights are 1.
- Consider jobs in ascending order of finish time.
- Add a job to the subset if it is compatible with the previously chosen jobs.

Observation. The greedy algorithm can fail spectacularly if arbitrary weights are allowed.

(figure: job a with weight = 999 overlapping job b with weight = 1)

Weighted Interval Scheduling

Notation. Label jobs by finishing time: f_1 <= f_2 <= ... <= f_n.
Def. p(j) = largest index i < j such that job i is compatible with j.
Ex. p(8) = 5, p(7) = 3, p(2) = 0.

Dynamic Programming: Binary Choice

Notation. OPT(j) = value of an optimal solution to the problem consisting of job requests 1, 2, ..., j.

Case 1: OPT selects job j.
- Can't use the incompatible jobs { p(j) + 1, p(j) + 2, ..., j - 1 }.
- Must include an optimal solution to the problem consisting of the remaining compatible jobs 1, 2, ..., p(j). (optimal substructure)

Case 2: OPT does not select job j.
- Must include an optimal solution to the problem consisting of the remaining jobs 1, 2, ..., j - 1.

$$OPT(j) = \begin{cases} 0 & \text{if } j = 0 \\ \max\{\, v_j + OPT(p(j)),\ OPT(j-1) \,\} & \text{otherwise} \end{cases}$$
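
The recurrence assumes p(1), ..., p(n) are already available. As a minimal Python sketch of that precomputation (the (start, finish, value) tuple representation and the name compute_p are our own choices, not from the slides), every p(j) can be found by binary search over the sorted finish times:

    from bisect import bisect_right

    def compute_p(jobs):
        # jobs: list of (start, finish, value) tuples, sorted by finish time.
        # Returns p[0..n] with p[j] = largest index i < j (1-based) such that
        # job i is compatible with job j; p[j] = 0 if no such job exists.
        finishes = [f for (_, f, _) in jobs]
        p = [0] * (len(jobs) + 1)
        for j, (s, _, _) in enumerate(jobs, start=1):
            # the jobs compatible with j are exactly those finishing by s_j
            p[j] = bisect_right(finishes, s)
        return p

This pass costs O(n log n); the slides note an O(n) alternative after additionally sorting by start time, but binary search is the simpler option.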

Weighted Interval Scheduling: Brute Force

Brute force algorithm.

    Input: n, s_1,...,s_n, f_1,...,f_n, v_1,...,v_n

    Sort jobs by finish times so that f_1 <= f_2 <= ... <= f_n.
    Compute p(1), p(2), ..., p(n)

    Compute-Opt(j) {
       if (j = 0)
          return 0
       else
          return max(v_j + Compute-Opt(p(j)), Compute-Opt(j-1))
    }

Observation. The recursive algorithm fails spectacularly because of redundant sub-problems => exponential algorithms.

Ex. The number of recursive calls for a family of "layered" instances grows like the Fibonacci sequence: p(1) = 0, p(j) = j - 2.

(figure: recursion tree for j = 5, branching into sub-problems 4 and 3, then 3, 2, 2, 1, ...)

Weighted Interval Scheduling: Memoization

Memoization. Store the result of each sub-problem in a cache; look it up as needed.

    Input: n, s_1,...,s_n, f_1,...,f_n, v_1,...,v_n

    Sort jobs by finish times so that f_1 <= f_2 <= ... <= f_n.
    Compute p(1), p(2), ..., p(n)

    for j = 1 to n
       M[j] = empty      <- global array
    M[0] = 0

    M-Compute-Opt(j) {
       if (M[j] is empty)
          M[j] = max(v_j + M-Compute-Opt(p(j)), M-Compute-Opt(j-1))
       return M[j]
    }

Weighted Interval Scheduling: Running Time

Claim. The memoized version of the algorithm takes O(n log n) time.
- Sort by finish time: O(n log n).
- Computing p(.): O(n) after sorting by start time.
- M-Compute-Opt(j): each invocation takes O(1) time and either
  (i) returns an existing value M[j], or
  (ii) fills in one new entry M[j] and makes two recursive calls.
- Progress measure Φ = # nonempty entries of M[].
  Initially Φ = 0; throughout Φ <= n.
  (ii) increases Φ by 1 => at most 2n recursive calls.
- Overall running time of M-Compute-Opt(n) is O(n).

Remark. O(n) if jobs are pre-sorted by start and finish times.
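
A runnable top-down version of M-Compute-Opt, as a sketch (the dictionary cache and the name schedule_value are our choices; jobs are (start, finish, value) tuples as in the earlier sketch):

    from bisect import bisect_right

    def schedule_value(jobs):
        # Max total value of mutually compatible jobs, top-down with memoization.
        jobs = sorted(jobs, key=lambda t: t[1])          # sort by finish time
        finishes = [f for (_, f, _) in jobs]
        p = [0] + [bisect_right(finishes, s) for (s, _, _) in jobs]
        v = [0] + [val for (_, _, val) in jobs]          # 1-based values
        M = {0: 0}                                       # the cache

        def m_compute_opt(j):
            if j not in M:                               # fill each entry once
                M[j] = max(v[j] + m_compute_opt(p[j]), m_compute_opt(j - 1))
            return M[j]

        return m_compute_opt(len(jobs))

    # schedule_value([(0, 3, 2), (1, 5, 4), (4, 6, 7), (5, 8, 2)]) -> 9

Recursion depth can reach n here; the bottom-up version below avoids that.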

Automated Memoization

Automated memoization. Many functional programming languages (e.g., Lisp) have library or built-in support for memoization.

Lisp (efficient):

    (def-memo-fun F (n)
       (if (<= n 1) n
          (+ (F (- n 1)) (F (- n 2)))))

Java (exponential):

    static int F(int n) {
       if (n <= 1) return n;
       else return F(n-1) + F(n-2);
    }

(figure: recursion tree for F(40), branching into F(39) and F(38), then F(38), F(37), F(37), F(36), ...)

Weighted Interval Scheduling: Bottom-Up

Bottom-up dynamic programming. Unwind the recursion.

    Input: n, s_1,...,s_n, f_1,...,f_n, v_1,...,v_n

    Sort jobs by finish times so that f_1 <= f_2 <= ... <= f_n.
    Compute p(1), p(2), ..., p(n)

    Iterative-Compute-Opt {
       M[0] = 0
       for j = 1 to n
          M[j] = max(v_j + M[p(j)], M[j-1])
    }

Weighted Interval Scheduling: Finding a Solution

Q. The dynamic programming algorithm computes the optimal value. What if we want the solution itself?
A. Do some post-processing.

    Run M-Compute-Opt(n)
    Run Find-Solution(n)

    Find-Solution(j) {
       if (j = 0)
          output nothing
       else if (v_j + M[p(j)] > M[j-1])
          print j
          Find-Solution(p(j))
       else
          Find-Solution(j-1)
    }

Number of recursive calls <= n => O(n).
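
Iterative-Compute-Opt and Find-Solution combine into the following sketch (the iterative, stack-free reconstruction loop is an equivalent rewrite of the slides' recursion; the name schedule is ours):

    from bisect import bisect_right

    def schedule(jobs):
        # Returns (optimal value, chosen jobs as 1-based indices in
        # finish-time order). Bottom-up DP plus solution reconstruction.
        jobs = sorted(jobs, key=lambda t: t[1])
        n = len(jobs)
        finishes = [f for (_, f, _) in jobs]
        p = [0] + [bisect_right(finishes, s) for (s, _, _) in jobs]
        v = [0] + [val for (_, _, val) in jobs]

        M = [0] * (n + 1)
        for j in range(1, n + 1):               # unwind the recursion
            M[j] = max(v[j] + M[p[j]], M[j - 1])

        chosen, j = [], n                       # Find-Solution, iteratively
        while j > 0:
            if v[j] + M[p[j]] > M[j - 1]:       # job j is in some optimum
                chosen.append(j)
                j = p[j]
            else:
                j -= 1
        return M[n], chosen[::-1]

    # schedule([(0, 3, 2), (1, 5, 4), (4, 6, 7), (5, 8, 2)]) -> (9, [1, 3])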

Knapsack Problem

Knapsack problem.
- Given n objects and a "knapsack."
- Item i weighs w_i > 0 kilograms and has value v_i > 0.
- The knapsack has a capacity of W kilograms.
- Goal: choose a subset of the items to fill the knapsack so as to maximize the total value.

Ex: { 3, 4 } has value 40.

    Item   Value   Weight
      1       1       1
      2       6       2
      3      18       5
      4      22       6
      5      28       7

    W = 11

Greedy: repeatedly add the item with maximum ratio v_i / w_i.
Ex: { 5, 2, 1 } achieves only value = 35 => greedy is not optimal.

Dynamic Programming: False Start

Def. OPT(i) = max-profit subset of items 1, ..., i.

Case 1: OPT does not select item i.
- OPT selects the best of { 1, 2, ..., i-1 }.

Case 2: OPT selects item i.
- Accepting item i does not immediately imply that we will have to reject other items.
- Without knowing what other items were selected before i, we don't even know if we have enough room for i.

Conclusion. Need more sub-problems!

Dynamic Programming: Adding a New Variable

Def. OPT(i, w) = max-profit subset of items 1, ..., i with weight limit w.

Case 1: OPT does not select item i.
- OPT selects the best of { 1, 2, ..., i-1 } using weight limit w.

Case 2: OPT selects item i.
- New weight limit = w - w_i.
- OPT selects the best of { 1, 2, ..., i-1 } using this new weight limit.

$$OPT(i, w) = \begin{cases} 0 & \text{if } i = 0 \\ OPT(i-1, w) & \text{if } w_i > w \\ \max\{\, OPT(i-1, w),\ v_i + OPT(i-1, w - w_i) \,\} & \text{otherwise} \end{cases}$$

Knapsack Problem: Bottom-Up

Knapsack. Fill up an n-by-W array.

    Input: n, W, w_1,...,w_n, v_1,...,v_n

    for w = 0 to W
       M[0, w] = 0
    for i = 1 to n
       M[i, 0] = 0

    for i = 1 to n
       for w = 1 to W
          if (w_i > w)
             M[i, w] = M[i-1, w]
          else
             M[i, w] = max {M[i-1, w], v_i + M[i-1, w - w_i]}

    return M[n, W]
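
A direct transcription of the bottom-up table fill into Python, as a sketch (0-based lists stand in for the slides' 1-based items; the name knapsack is ours):

    def knapsack(values, weights, W):
        # M[i][w] = max value achievable using items 1..i with weight limit w.
        n = len(values)
        M = [[0] * (W + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            vi, wi = values[i - 1], weights[i - 1]
            for w in range(1, W + 1):
                if wi > w:
                    M[i][w] = M[i - 1][w]                 # item i doesn't fit
                else:
                    M[i][w] = max(M[i - 1][w],            # skip item i
                                  vi + M[i - 1][w - wi])  # take item i
        return M[n][W]

    # Example instance from the slides: OPT = 40 via items {3, 4}.
    assert knapsack([1, 6, 18, 22, 28], [1, 2, 5, 6, 7], 11) == 40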

Knapsack Algorithm

The filled array for the example instance (rows: items considered; columns: weight limit w = 0, ..., 11):

    subset              0  1  2  3  4   5   6   7   8   9  10  11
    { }                 0  0  0  0  0   0   0   0   0   0   0   0
    { 1 }               0  1  1  1  1   1   1   1   1   1   1   1
    { 1, 2 }            0  1  6  7  7   7   7   7   7   7   7   7
    { 1, 2, 3 }         0  1  6  7  7  18  19  24  25  25  25  25
    { 1, 2, 3, 4 }      0  1  6  7  7  18  22  24  28  29  29  40
    { 1, 2, 3, 4, 5 }   0  1  6  7  7  18  22  28  29  34  34  40

OPT: { 4, 3 }, value = 22 + 18 = 40 (W = 11).

Knapsack Problem: Running Time

Running time. Θ(n W).
- Not polynomial in the input size!
- "Pseudo-polynomial."
- The decision version of Knapsack is NP-complete.

Finding Optimal Substructure

- A solution to the problem makes a choice (for example, whether to use an item), leaving one or more subproblems to be solved.
- Figure out which subproblems you would use if you knew the choice that led to an optimal solution.
- Try all possible choices and choose the one that provides the optimal solution.
- Show that each subproblem in an optimal solution is itself optimal (optimal substructure).
- How many subproblems are needed in an optimal solution to the original problem? For the examples so far, this has been one. How many choices must be considered?

RNA Secondary Structure

RNA. A string B = b_1 b_2 ... b_n over the alphabet { A, C, G, U }.

Secondary structure. RNA is single-stranded, so it tends to loop back and form base pairs with itself. This structure is essential for understanding the behavior of the molecule.

(figure: an example RNA secondary structure; complementary base pairs are A-U and C-G)

RNA Secondary Structure

Secondary structure. A set of pairs S = { (b_i, b_j) } that satisfies:
- [Watson-Crick] S is a matching and each pair in S is a Watson-Crick complement: A-U, U-A, C-G, or G-C.
- [No sharp turns] The ends of each pair are separated by at least 4 intervening bases. If (b_i, b_j) ∈ S, then i < j - 4.
- [Non-crossing] If (b_i, b_j) and (b_k, b_l) are two pairs in S, then we cannot have i < k < j < l.

Free energy. The usual hypothesis is that an RNA molecule will form the secondary structure with the optimum total free energy (approximated here by the number of base pairs).

Goal. Given an RNA molecule B = b_1 b_2 ... b_n, find a secondary structure S that maximizes the number of base pairs.

RNA Secondary Structure: Examples

(figure: three candidate structures, labeled ok, sharp turn, and crossing)

RNA Secondary Structure: Subproblems

First attempt. OPT(j) = maximum number of base pairs in a secondary structure of the substring b_1 b_2 ... b_j.

(figure: b_t and b_n matched as a base pair)

Difficulty. Matching b_t and b_n results in two sub-problems:
- Finding a secondary structure in b_1 b_2 ... b_{t-1}: this is OPT(t-1).
- Finding a secondary structure in b_{t+1} b_{t+2} ... b_{n-1}: need more sub-problems.

Dynamic Programming Over Intervals

Notation. OPT(i, j) = maximum number of base pairs in a secondary structure of the substring b_i b_{i+1} ... b_j.

Case 1. If i >= j - 4: OPT(i, j) = 0 by the no-sharp-turns condition.

Case 2. Base b_j is not involved in a pair: OPT(i, j) = OPT(i, j-1).

Case 3. Base b_j pairs with b_t for some i <= t < j - 4.
- The non-crossing constraint decouples the resulting sub-problems:
  OPT(i, j) = 1 + max_t { OPT(i, t-1) + OPT(t+1, j-1) },
  where the max is over t such that i <= t < j - 4 and b_t and b_j are Watson-Crick complements.

Bottom-Up Dynamic Programming Over Intervals

Q. In what order should we solve the sub-problems?
A. Do shortest intervals first.

    RNA(b_1,...,b_n) {
       for k = 5, 6, ..., n-1
          for i = 1, 2, ..., n-k
             j = i + k
             Compute M[i, j] using the recurrence
       return M[1, n]
    }

(figure: the table M[i, j] is filled diagonal by diagonal, in order of increasing interval length)

Running time. O(n^3).
Remark. The same core idea appears in the CKY algorithm for parsing context-free grammars.
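
A sketch of this interval DP in Python (0-based indexing in code versus the slides' 1-based indexing; the name rna_secondary_structure is ours):

    def rna_secondary_structure(b):
        # Max number of base pairs in a secondary structure of RNA string b.
        # Interval DP from the slides: O(n^3) time, O(n^2) space.
        complement = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}
        n = len(b)
        M = [[0] * n for _ in range(n)]       # M[i][j] = OPT(i, j)
        for k in range(5, n):                 # interval length, shortest first
            for i in range(n - k):
                j = i + k
                best = M[i][j - 1]            # case 2: b_j not in a pair
                for t in range(i, j - 4):     # case 3: b_j pairs with b_t
                    if (b[t], b[j]) in complement:
                        left = M[i][t - 1] if t > i else 0
                        best = max(best, 1 + left + M[t + 1][j - 1])
                M[i][j] = best
        return M[0][n - 1] if n else 0

Intervals of length at most 5 stay at the initialized value 0, which is exactly the no-sharp-turns base case.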

Dynamic Programming Summary

Recipe.
- Characterize the structure of the problem.
- Recursively define the value of an optimal solution.
- Compute the value of an optimal solution.
- Construct an optimal solution from the computed information.

Dynamic programming techniques.
- Binary choice: weighted interval scheduling.
- Adding a new variable: knapsack.
- Dynamic programming over intervals: RNA secondary structure.

Top-down vs. bottom-up: different people have different intuitions.

Sequence Alignment

String Similarity

How similar are two strings?  ocurrance vs. occurrence

    o c u r r a n c e -
    o c c u r r e n c e
    6 mismatches, 1 gap

    o c - u r r a n c e
    o c c u r r e n c e
    1 mismatch, 1 gap

    o c - u r r - a n c e
    o c c u r r e - n c e
    0 mismatches, 3 gaps

Applications.
- Basis for Unix diff.
- Speech recognition.
- Computational biology.

Edit Distance

Edit distance. [Levenshtein 1966, Needleman-Wunsch 1970, Smith-Waterman 1981]
- Gap penalty δ; mismatch penalty α_pq.
- Cost = sum of gap and mismatch penalties.

(figure: an example alignment of two DNA strings, with its cost written as a sum of α mismatch penalties and δ gap penalties)

Sequence Alignment

Goal: Given two strings X = x_1 x_2 ... x_m and Y = y_1 y_2 ... y_n, find an alignment of minimum cost.

Def. An alignment M is a set of ordered pairs x_i - y_j such that each item occurs in at most one pair and there are no crossings.

Def. The pairs x_i - y_j and x_{i'} - y_{j'} cross if i < i' but j > j'.

$$cost(M) = \underbrace{\sum_{(x_i, y_j) \in M} \alpha_{x_i y_j}}_{\text{mismatch}} + \underbrace{\sum_{i : x_i \text{ unmatched}} \delta \; + \sum_{j : y_j \text{ unmatched}} \delta}_{\text{gap}}$$

Ex: CTACCG vs. TACATG. Sol: M = x_2-y_1, x_3-y_2, x_4-y_3, x_5-y_4, x_6-y_6.

(figure: the alignment drawn out, with x_1 = C and y_5 = T left unmatched)

Sequence Alignment: Problem Structure

Def. OPT(i, j) = min cost of aligning the strings x_1 x_2 ... x_i and y_1 y_2 ... y_j.

Case 1: OPT matches x_i - y_j.
- Pay the mismatch cost for x_i - y_j plus the min cost of aligning x_1 ... x_{i-1} and y_1 ... y_{j-1}.

Case 2a: OPT leaves x_i unmatched.
- Pay the gap cost for x_i plus the min cost of aligning x_1 ... x_{i-1} and y_1 ... y_j.

Case 2b: OPT leaves y_j unmatched.
- Pay the gap cost for y_j plus the min cost of aligning x_1 ... x_i and y_1 ... y_{j-1}.

$$OPT(i, j) = \begin{cases} j\delta & \text{if } i = 0 \\ i\delta & \text{if } j = 0 \\ \min\{\, \alpha_{x_i y_j} + OPT(i-1, j-1),\ \delta + OPT(i-1, j),\ \delta + OPT(i, j-1) \,\} & \text{otherwise} \end{cases}$$

Sequence Alignment: Algorithm

    Sequence-Alignment(m, n, x_1...x_m, y_1...y_n, δ, α) {
       for i = 0 to m
          M[i, 0] = i δ
       for j = 0 to n
          M[0, j] = j δ

       for i = 1 to m
          for j = 1 to n
             M[i, j] = min(α[x_i, y_j] + M[i-1, j-1],
                           δ + M[i-1, j],
                           δ + M[i, j-1])
       return M[m, n]
    }

Analysis. Θ(mn) time and space.
- English words or sentences: m, n <= 10.
- Computational biology: m = n = 100,000. 10 billion ops OK, but a 10 GB array?

Sequence Alignment in Linear Space

Sequence Alignment: Linear Space

Q. Can we avoid using quadratic space?

Easy. Compute the optimal value in O(m + n) space and O(mn) time.
- Compute OPT(i, *) from OPT(i-1, *).
- But then there is no longer a simple way to recover the alignment itself.

Theorem. [Hirschberg 1975] There is an algorithm that finds an optimal alignment in O(m + n) space and O(mn) time.
- Clever combination of divide-and-conquer and dynamic programming.
- Inspired by an idea of Savitch from complexity theory.

Sequence Alignment: Linear Space

Edit distance graph. Let f(i, j) be the length of a shortest path from (0, 0) to (i, j) in the grid graph whose edge costs are δ (horizontal), δ (vertical), and α_{x_i y_j} (diagonal).
Observation: f(i, j) = OPT(i, j).

(figure: the (m+1)-by-(n+1) grid graph for x against y)
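
A sketch of Sequence-Alignment in Python (α is passed as a function; the trailing example instantiates plain unit-cost edit distance, one particular choice of penalties rather than anything the slides fix):

    def alignment_cost(x, y, delta, alpha):
        # Minimum alignment cost; Theta(mn) time and space.
        # delta: gap penalty; alpha(p, q): mismatch penalty for aligning p with q.
        m, n = len(x), len(y)
        M = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            M[i][0] = i * delta               # align x_1..x_i against nothing
        for j in range(n + 1):
            M[0][j] = j * delta
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                M[i][j] = min(alpha(x[i - 1], y[j - 1]) + M[i - 1][j - 1],
                              delta + M[i - 1][j],      # x_i unmatched
                              delta + M[i][j - 1])      # y_j unmatched
        return M[m][n]

    # Unit-cost edit distance: gaps and mismatches cost 1, matches cost 0.
    # Prints 2, matching the "1 mismatch, 1 gap" alignment above.
    print(alignment_cost("ocurrance", "occurrence", 1,
                         lambda p, q: 0 if p == q else 1))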

Sequence Alignment: Linear Space

Edit distance graph. Let f(i, j) be the length of a shortest path from (0, 0) to (i, j).
- Can compute f(*, j) for any j in O(mn) time and O(m + n) space.

Edit distance graph. Let g(i, j) be the length of a shortest path from (i, j) to (m, n).
- Can compute g by reversing the edge orientations and inverting the roles of (0, 0) and (m, n).
- Can compute g(*, j) for any j in O(mn) time and O(m + n) space.

Observation 1. The cost of the shortest path that uses (i, j) is f(i, j) + g(i, j).

(figures: grid graphs illustrating the forward distances f and the backward distances g)
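
The "compute OPT(i, *) from OPT(i-1, *)" trick keeps only two rows of the table. A sketch (the name last_row is ours) that returns the final row, i.e., the f-values for all of x against every prefix of y, in O(n) space:

    def last_row(x, y, delta, alpha):
        # Returns [f(len(x), j) for j = 0..len(y)]: the cost of optimally
        # aligning all of x against each prefix of y, keeping two rows only.
        prev = [j * delta for j in range(len(y) + 1)]    # row i = 0
        for i in range(1, len(x) + 1):
            cur = [i * delta] + [0] * len(y)
            for j in range(1, len(y) + 1):
                cur[j] = min(alpha(x[i - 1], y[j - 1]) + prev[j - 1],
                             delta + prev[j],
                             delta + cur[j - 1])
            prev = cur
        return prev

The backward distances g come from the same routine applied to the reversed strings, which is exactly the slides' "reverse the edge orientations" step.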

Sequence Alignment: Linear Space

Observation 2. Let q be an index that minimizes f(q, n/2) + g(q, n/2). Then the shortest path from (0, 0) to (m, n) uses (q, n/2).

Divide: find the index q that minimizes f(q, n/2) + g(q, n/2) using DP; align x_q and y_{n/2}.
Conquer: recursively compute the optimal alignment in each piece.

(figure: the grid split at column n/2, with the shortest path forced through (q, n/2))

Sequence Alignment: Running Time Analysis

Theorem. Let T(m, n) be the max running time of the algorithm on strings of length at most m and n. Then T(m, n) = Θ(mn).

T(m, n) <= T(q, n/2) + T(m - q, n/2) + cmn.

Note that q(n/2) + (m - q)(n/2) = (q + m - q)(n/2) = mn/2, so the work at level 0 is cmn, at level 1 it is cmn/2, and at level i it is cmn/2^i. Total time is Θ(mn).

Sequence Alignment in Linear Space

Another way to look at the solution:
- Quadratic: OPT(i, j) = min cost of aligning x_1 ... x_i and y_1 ... y_j. The first part of x matches with the first part of y.
- Linear: OPT(i) = min cost of aligning x_1 ... x_i with y_1 ... y_{n/2}, plus the min cost of aligning x_{i+1} x_{i+2} ... x_m with y_{n/2+1} y_{n/2+2} ... y_n. The first half of y matches with some initial part of x, and the second half of y matches with the rest of x.

Linear space algorithm due to Hirschberg, 1975.
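
The whole divide-and-conquer scheme fits in a short sketch building on last_row above (the name hirschberg is ours; the slides split y at column n/2, while this version splits x at its midpoint and scans for the best split of y, the same idea with the roles of the strings swapped; Python string slices also copy, costing an extra log factor of space over a strict O(m + n) implementation):

    def hirschberg(x, y, delta, alpha):
        # Optimal alignment returned as a list of matched pairs (i, j),
        # 1-based, using linearly bounded working storage per level.
        m, n = len(x), len(y)
        if m == 0 or n == 0:
            return []                          # all gaps, no matched pairs
        if m == 1:
            # Match the single character x_1 at its cheapest position in y,
            # unless leaving it unmatched (two extra gaps) is cheaper.
            j_best = min(range(1, n + 1), key=lambda j: alpha(x[0], y[j - 1]))
            return [(1, j_best)] if alpha(x[0], y[j_best - 1]) < 2 * delta else []
        mid = m // 2
        # Forward costs for x[:mid] vs. prefixes of y; backward costs for
        # x[mid:] vs. suffixes of y (reversed strings = reversed edges).
        f = last_row(x[:mid], y, delta, alpha)
        g = last_row(x[mid:][::-1], y[::-1], delta, alpha)[::-1]
        q = min(range(n + 1), key=lambda j: f[j] + g[j])   # best split of y
        left = hirschberg(x[:mid], y[:q], delta, alpha)
        right = hirschberg(x[mid:], y[q:], delta, alpha)
        return left + [(i + mid, j + q) for (i, j) in right]

    # e.g. hirschberg("CTACCG", "TACATG", 1, lambda p, q: 0 if p == q else 1)
    # yields the matched pairs of an optimal unit-cost alignment.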