//8

CS 381: Algorithm Design and Analysis
Jeremiah Blocki, Purdue University, Spring 2018
Announcement: Homework due February th at :9PM

Fast Integer Division Too (!)

Integer division. Given two n-bit (or smaller) integers s and t, compute the quotient q = floor(s/t) and the remainder r = s mod t (so that s = qt + r with 0 <= r < t).
(Only needed for really big numbers; hardware divides machine words directly.)

Fact. The complexity of integer division is (almost) the same as that of integer multiplication.

To compute the quotient q:
- Approximate x = 1/t using Newton's method: x_{i+1} = 2 x_i - t x_i^2.
- After i = log n iterations, either q = floor(s x_i) or q = floor(s x_i) - 1.
- If floor(s x_i) * t > s, then q = floor(s x_i) - 1 (1 multiplication); otherwise q = floor(s x_i).
- r = s - qt (1 multiplication).
Total: O(log n) multiplications and subtractions, using fast multiplication for each.

State of the Art Integer Multiplication (Theory). Schönhage-Strassen multiplies two n-bit integers in O(n log n log log n) time, with further asymptotic improvements known for large n.

Integer Division:
Input: x, y (positive n-bit integers)
Output: positive integers q (quotient) and r (remainder) such that x = qy + r and 0 <= r < y.
An algorithm can compute the quotient q and remainder r with O(log n) multiplications using Newton's method (which approximates roots of a real-valued function).

Recap: Divide and Conquer
Framework: Divide, Conquer and Merge.
Example 1: Counting Inversions in O(n log n) time.
- Subroutine: Sort-And-Count (divide & conquer).
- Count Left-Right inversions (merge) in time O(n) when the input halves are already sorted.
Example 2: Closest Pair of Points in O(n log n) time.
- Split the input in half by x-coordinate and find the closest pair on the left and right halves (delta = min(delta_L, delta_R)).
- Merge (exploits structural properties of the problem): remove points at distance > delta from the dividing line L, then sort the remaining points by y-coordinate; claim: each remaining point only needs to be compared against a constant number of its successors in this order.
Example 3: Integer Multiplication (Karatsuba) in time O(n^1.585).
- Divide each n-bit number into two n/2-bit numbers.
- Key Trick: only a = 3 multiplications of n/2-bit numbers are needed!
- Requires 3 multiplications of n/2-bit numbers and O(1) additions and shifts.
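The Newton-iteration division described above can be sketched in Python using a fixed-point reciprocal. The precision k, the iteration count, and the final correction loops are implementation choices of this sketch, not taken from the slides:

```python
def divide(s, t):
    """Compute (q, r) with s = q*t + r and 0 <= r < t using only
    multiplications, subtractions, and shifts (Newton's method).
    x approximates 2^k / t, i.e., a fixed-point reciprocal of t."""
    k = 2 * max(s.bit_length(), t.bit_length())   # working precision
    x = 1 << (k - t.bit_length())                 # initial guess: x*t in (2^(k-1), 2^k]
    # Newton step for the reciprocal: x <- 2x - t*x^2 / 2^k.
    # Precision roughly doubles each step, so O(log k) steps suffice.
    for _ in range(k.bit_length() + 4):
        x = 2 * x - ((t * x * x) >> k)
    q = (s * x) >> k            # close to floor(s/t); correct the rounding:
    while q * t > s:            # estimate too high
        q -= 1
    while (q + 1) * t <= s:     # estimate too low
        q += 1
    return q, s - q * t
```

The two correction loops make the result exact regardless of how precisely Newton's iteration converged; with precision k chosen as above they run only O(1) times.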
Toom-Cook Generalization (split each number into k parts): 2k-1 multiplications of n/k-bit numbers, giving exponent log_k(2k-1), and lim_{k -> infinity} log_k(2k-1) = 1, so the exponent approaches 1 for large enough k.
Caveat: the hidden constants increase with k.

Fast Matrix Multiplication in Theory.
Q. Multiply two 2-by-2 matrices with only 7 scalar multiplications?
A. Yes! [Strassen 1969]  Theta(n^{log2 7}) = O(n^2.81)
Q. Multiply two 2-by-2 matrices with only 6 scalar multiplications?
A. Impossible. [Hopcroft and Kerr 1971]  Theta(n^{log2 6}) = O(n^2.59)
Q. Two 3-by-3 matrices with only 21 scalar multiplications?
A. Also impossible.  Theta(n^{log3 21}) = O(n^2.77)

Begun, the decimal wars have. [Pan, Bini et al., Schönhage, ...]
- Two 20-by-20 matrices with 4,460 scalar multiplications: O(n^2.805).
- Two 48-by-48 matrices with 47,217 scalar multiplications: O(n^2.7801).
- A year later: O(n^2.7799).
- December 1979: O(n^2.521813).
- January 1980: O(n^2.521801).

Copyright, Kevin Wayne
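Strassen's seven products for a 2-by-2 multiply can be written out directly. This sketch uses the standard identities m1..m7, which the slide asserts exist but does not list:

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 scalar multiplications
    (Strassen's identities) instead of the naive 8."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4,           m1 - m2 + m3 + m6]]
```

Applying these identities recursively to n-by-n matrices (treating each quadrant as a scalar) yields the Theta(n^{log2 7}) bound.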
Algorithmic Paradigms
Greedy. Build up a solution incrementally, myopically optimizing some local criterion.
Divide-and-conquer. Break up a problem into sub-problems, solve each sub-problem independently, and combine the solutions to the sub-problems to form a solution to the original problem.
Dynamic programming. Break up a problem into a series of overlapping sub-problems, and build up solutions to larger and larger sub-problems.

Fast Matrix Multiplication in Theory (state of the art).
Best known. O(n^2.376) [Coppersmith-Winograd, 1987]; since improved to O(n^2.373) [Williams, 2012; Le Gall, 2014].
Conjecture. O(n^{2+eps}) for any eps > 0.
Caveat. Theoretical improvements to Strassen are progressively less practical.

Chapter 6: Dynamic Programming
Slides by Kevin Wayne. Copyright Pearson-Addison Wesley. All rights reserved.

Dynamic Programming History
Bellman. [1950s] Pioneered the systematic study of dynamic programming.
Etymology. Dynamic programming = planning over time.
- The Secretary of Defense was hostile to mathematical research.
- Bellman sought an impressive name to avoid confrontation:
  "it's impossible to use dynamic in a pejorative sense"
  "something not even a Congressman could object to"
Reference: Bellman, R. E., Eye of the Hurricane, An Autobiography.
Dynamic Programming Applications
Areas. Bioinformatics, control theory, information theory, operations research, computer science (theory, graphics, AI, compilers, systems), ...
Some famous dynamic programming algorithms.
- Unix diff for comparing two files.
- Viterbi for hidden Markov models.
- Smith-Waterman for genetic sequence alignment.
- Bellman-Ford for shortest-path routing in networks.
- Cocke-Kasami-Younger for parsing context-free grammars.

Unweighted Interval Scheduling (covered in the greedy paradigm)
Previously showed: the greedy algorithm works if all weights are 1.
Solution: sort requests by finish time (ascending order).

Weighted Interval Scheduling
Observation. The greedy algorithm can fail spectacularly if arbitrary weights are allowed.
[Figure: two overlapping jobs with weights 999 and 1; greedy by finish time picks the weight-1 job and blocks the weight-999 job.]

Computing Fibonacci numbers: on the board.

Weighted interval scheduling problem.
- Job j starts at s_j, finishes at f_j, and has weight or value v_j.
- Two jobs are compatible if they don't overlap.
- Goal: find a maximum-weight subset of mutually compatible jobs.

Notation. Label jobs by finishing time: f_1 <= f_2 <= ... <= f_n.
Def. p(j) = largest index i < j such that job i is compatible with j.
Ex: p(8) = 5, p(7) = 3, p(2) = 0.
[Figure: eight jobs a-h on a timeline illustrating p(j).]
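Once jobs are sorted by finish time, all the values p(j) can be computed in O(n log n) total with binary search over the finish times. A sketch (the 1-indexed array layout with a dummy entry at index 0 is a convention of this sketch):

```python
from bisect import bisect_right

def compute_p(start, finish):
    """Given jobs sorted by finish time, return p[0..n] where p[j] is
    the largest index i < j whose job is compatible with job j
    (finish[i] <= start[j]), or 0 if no such job exists.
    start[0] and finish[0] are unused dummy entries."""
    n = len(start) - 1
    p = [0] * (n + 1)
    for j in range(1, n + 1):
        # insertion point of start[j] in finish[1:j], minus 1, is the
        # rightmost i < j with finish[i] <= start[j]
        p[j] = bisect_right(finish, start[j], 1, j) - 1
    return p
```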
Dynamic Programming: Binary Choice
Notation. OPT(j) = value of an optimal solution to the problem consisting of job requests 1, 2, ..., j.
Case 1: OPT selects job j.
- Collect profit v_j.
- Can't use the incompatible jobs { p(j)+1, p(j)+2, ..., j-1 }.
- Must include an optimal solution to the problem consisting of the remaining compatible jobs 1, 2, ..., p(j).  [optimal substructure]
Case 2: OPT does not select job j.
- Must include an optimal solution to the problem consisting of the remaining jobs 1, 2, ..., j-1.

OPT(j) = 0                                   if j = 0
         max( v_j + OPT(p(j)), OPT(j-1) )    otherwise

Weighted Interval Scheduling: Brute Force
Brute force algorithm.

Input: n, s_1,...,s_n, f_1,...,f_n, v_1,...,v_n
Sort jobs by finish times so that f_1 <= f_2 <= ... <= f_n.
Compute p(1), p(2), ..., p(n).

Compute-Opt(j) {
   if (j = 0)
      return 0
   else
      return max(v_j + Compute-Opt(p(j)), Compute-Opt(j-1))
}

Running time recurrence: T(n) = T(n-1) + T(p(n)) + O(1), T(0) = O(1).
Observation. The recursive algorithm fails spectacularly because of redundant sub-problems => exponential time.
Ex. For the family of "layered" instances with p(1) = 0 and p(j) = j-2, the number of recursive calls satisfies T(n) = T(n-1) + T(n-2) + O(1), which grows like the Fibonacci sequence (roughly 1.618^n).
Key Insight: do we really need to repeat this computation?

Weighted Interval Scheduling: Memoization
Memoization. Store the result of each sub-problem in a cache; look it up as needed.

Weighted Interval Scheduling: Running Time
Claim. The memoized version of the algorithm takes O(n log n) time.
- Sort by finish time: O(n log n).
- Computing p(.): O(n log n) after sorting by start time.
- M-Compute-Opt(j): each invocation takes O(1) time and either
  (i) returns an existing value M[j], or
  (ii) fills in one new entry M[j] and makes two recursive calls.
- Progress measure = # nonempty entries of M[]: initially 0, throughout <= n; each (ii) increases it by 1 => at most 2n recursive calls.
- Overall running time of M-Compute-Opt(n) is O(n).
Remark. O(n) if jobs are pre-sorted by start and finish times.
Input: n, s_1,...,s_n, f_1,...,f_n, v_1,...,v_n
Sort jobs by finish times so that f_1 <= f_2 <= ... <= f_n.
Compute p(1), p(2), ..., p(n).

for j = 1 to n
   M[j] = empty
M[0] = 0                         // M is a global array

M-Compute-Opt(j) {
   if (M[j] is empty)
      M[j] = max(v_j + M-Compute-Opt(p(j)), M-Compute-Opt(j-1))
   return M[j]
}

Weighted Interval Scheduling: Finding a Solution
Q. The dynamic programming algorithm computes the optimal value. What if we want the solution itself?
A. Do some post-processing.

Run M-Compute-Opt(n)
Run Find-Solution(n)

Find-Solution(j) {
   if (j = 0)
      output nothing
   else if (v_j + M[p(j)] > M[j-1])
      print j
      Find-Solution(p(j))
   else
      Find-Solution(j-1)
}

# of recursive calls <= n => O(n).
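Putting the pieces together, a memoized implementation with solution recovery might look like the following sketch. Representing jobs as (start, finish, value) triples is an assumption of this sketch, and p(j) is computed by a simple O(n^2) scan for clarity (binary search would give O(n log n)):

```python
def weighted_interval_scheduling(jobs):
    """jobs: list of (start, finish, value) triples.
    Returns (optimal value, chosen jobs in finish-time order), using the
    memoized recurrence OPT(j) = max(v_j + OPT(p(j)), OPT(j-1))."""
    jobs = sorted(jobs, key=lambda job: job[1])      # sort by finish time
    n = len(jobs)
    start = [0] + [s for s, f, v in jobs]            # 1-indexed arrays
    finish = [0] + [f for s, f, v in jobs]
    value = [0] + [v for s, f, v in jobs]
    # p[j] = largest i < j with finish[i] <= start[j], else 0
    p = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j - 1, 0, -1):
            if finish[i] <= start[j]:
                p[j] = i
                break
    M = [None] * (n + 1)
    M[0] = 0
    def opt(j):                                      # M-Compute-Opt
        if M[j] is None:
            M[j] = max(value[j] + opt(p[j]), opt(j - 1))
        return M[j]
    opt(n)
    # Find-Solution: trace back through M to recover the chosen jobs.
    chosen, j = [], n
    while j > 0:
        if value[j] + M[p[j]] > M[j - 1]:
            chosen.append(jobs[j - 1])
            j = p[j]
        else:
            j -= 1
    return M[n], chosen[::-1]
```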
Weighted Interval Scheduling: Bottom-Up
Bottom-up dynamic programming. Unwind the recursion.

Input: n, s_1,...,s_n, f_1,...,f_n, v_1,...,v_n
Sort jobs by finish times so that f_1 <= f_2 <= ... <= f_n.
Compute p(1), p(2), ..., p(n).

Iterative-Compute-Opt {
   M[0] = 0
   for j = 1 to n
      M[j] = max(v_j + M[p(j)], M[j-1])
}

Segmented Least Squares
Least squares. Foundational problem in statistics and numerical analysis.
- Given n points in the plane: (x_1, y_1), (x_2, y_2), ..., (x_n, y_n).
- Find a line y = ax + b that minimizes the sum of the squared errors:
     SSE = sum_{i=1}^{n} (y_i - a x_i - b)^2
Solution. Calculus => the minimum error is achieved when
     a = ( n sum_i x_i y_i - (sum_i x_i)(sum_i y_i) ) / ( n sum_i x_i^2 - (sum_i x_i)^2 ),
     b = ( sum_i y_i - a sum_i x_i ) / n.

Segmented least squares.
- Points lie roughly on a sequence of several line segments.
- Given n points in the plane (x_1, y_1), ..., (x_n, y_n) with x_1 < x_2 < ... < x_n, find a sequence of lines that minimizes f(x).
Q. What's a reasonable choice for f(x) to balance accuracy (goodness of fit) and parsimony (number of lines)?
A. Tradeoff function: E + c L, where E is the sum of the sums of squared errors in each segment, L is the number of lines, and c > 0 is a constant.

Dynamic Programming: Multiway Choice
Notation.
- OPT(j) = minimum cost for points p_1, p_2, ..., p_j.
- e(i, j) = minimum sum of squared errors for points p_i, p_{i+1}, ..., p_j.
To compute OPT(j):
- The last segment uses points p_i, p_{i+1}, ..., p_j for some i.
- Cost = e(i, j) + c + OPT(i-1).

OPT(j) = 0                                              if j = 0
         min_{1 <= i <= j} { e(i, j) + c + OPT(i-1) }   otherwise
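The multiway-choice recurrence can be sketched directly. This version recomputes the segment sums from scratch for each pair (i, j), so it is slower than the slides' O(n^3) and O(n^2) versions (which precompute statistics), but it follows the formulas above term for term:

```python
def segmented_least_squares(points, c):
    """points: list of (x, y) with strictly increasing x; c > 0: penalty
    per segment.  Returns the minimum total cost E + c*L using
    OPT(j) = min_i { e(i, j) + c + OPT(i-1) }."""
    n = len(points)
    x = [0.0] + [p[0] for p in points]               # 1-indexed
    y = [0.0] + [p[1] for p in points]
    # e[i][j]: min sum of squared errors fitting one line to p_i..p_j
    e = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            m = j - i + 1
            sx = sum(x[i:j + 1]); sy = sum(y[i:j + 1])
            sxx = sum(v * v for v in x[i:j + 1])
            sxy = sum(x[k] * y[k] for k in range(i, j + 1))
            denom = m * sxx - sx * sx                # 0 only for a single point
            a = (m * sxy - sx * sy) / denom if denom else 0.0
            b = (sy - a * sx) / m
            e[i][j] = sum((y[k] - a * x[k] - b) ** 2 for k in range(i, j + 1))
    M = [0.0] * (n + 1)
    for j in range(1, n + 1):
        M[j] = min(e[i][j] + c + M[i - 1] for i in range(1, j + 1))
    return M[n]
```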
Segmented Least Squares: Algorithm
INPUT: n, p_1,...,p_n, c

Segmented-Least-Squares() {
   M[0] = 0
   for j = 1 to n
      for i = 1 to j
         compute the least squares error e_ij for the segment p_i,..., p_j
   for j = 1 to n
      M[j] = min_{1 <= i <= j} (e_ij + c + M[i-1])
   return M[n]
}

Running time. O(n^3).
- Bottleneck: computing e(i, j) for O(n^2) pairs, O(n) per pair using the previous formula.
- Can be improved to O(n^2) by pre-computing various statistics.

Knapsack Problem
Knapsack problem.
- Given n objects and a "knapsack."
- Item i weighs w_i > 0 kilograms and has value v_i > 0.
- Knapsack has capacity of W kilograms.
- Goal: fill the knapsack so as to maximize the total value.

[Example instance: items 1..5 with values 1, 6, 18, 22, 28 and weights 1, 2, 5, 6, 7; W = 11.]
Ex: { 3, 4 } has value 40.
Greedy: repeatedly add the item with maximum ratio v_i / w_i.
Ex: { 5, 2, 1 } achieves only value 35 => greedy is not optimal.

Dynamic Programming: False Start
Def. OPT(i) = max-profit subset of items 1, ..., i.
Case 1: OPT does not select item i.
- OPT selects the best of { 1, 2, ..., i-1 }.
Case 2: OPT selects item i.
- Accepting item i does not immediately imply that we will have to reject other items.
- Without knowing what other items were selected before i, we don't even know if we have enough room for i.
Conclusion. Need more sub-problems!

Dynamic Programming: Adding a New Variable
Def. OPT(i, w) = max-profit subset of items 1, ..., i with weight limit w.
Case 1: OPT does not select item i.
- OPT selects the best of { 1, 2, ..., i-1 } using weight limit w.
Case 2: OPT selects item i.
- New weight limit = w - w_i.
- OPT selects the best of { 1, 2, ..., i-1 } using this new weight limit.

OPT(i, w) = 0                                              if i = 0
            OPT(i-1, w)                                    if w_i > w
            max( OPT(i-1, w), v_i + OPT(i-1, w - w_i) )    otherwise

Knapsack Problem: Bottom-Up
Knapsack. Fill up an n-by-W array.

Input: n, W, w_1,...,w_n, v_1,...,v_n

for w = 0 to W
   M[0, w] = 0
for i = 1 to n
   for w = 1 to W
      if (w_i > w)
         M[i, w] = M[i-1, w]
      else
         M[i, w] = max {M[i-1, w], v_i + M[i-1, w-w_i]}
return M[n, W]
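The bottom-up table fill, plus a trace-back to recover the chosen items, as a Python sketch. The trace-back rule (item i was taken exactly when it changed the table entry) is an addition of this sketch, analogous to Find-Solution for interval scheduling:

```python
def knapsack(values, weights, W):
    """Bottom-up 0/1 knapsack following the slides' recurrence.
    values[i], weights[i] describe item i for i = 1..n (index 0 unused).
    Returns (OPT value, sorted list of chosen item indices)."""
    n = len(values) - 1
    M = [[0] * (W + 1) for _ in range(n + 1)]        # M[0][w] = 0 for all w
    for i in range(1, n + 1):
        for w in range(1, W + 1):
            if weights[i] > w:
                M[i][w] = M[i - 1][w]                # item i does not fit
            else:
                M[i][w] = max(M[i - 1][w],
                              values[i] + M[i - 1][w - weights[i]])
    # Trace back: if M[i][w] == M[i-1][w], some optimum omits item i.
    items, w = [], W
    for i in range(n, 0, -1):
        if M[i][w] != M[i - 1][w]:
            items.append(i)
            w -= weights[i]
    return M[n][W], sorted(items)
```

On the example instance from the slides (values 1, 6, 18, 22, 28; weights 1, 2, 5, 6, 7; W = 11) this returns value 40 with items {3, 4}.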
Knapsack Algorithm
[Filled table M[i, w] for the example instance (values 1, 6, 18, 22, 28; weights 1, 2, 5, 6, 7; W = 11):]

                    w = 0  1  2  3  4  5   6   7   8   9   10  11
  phi                   0  0  0  0  0  0   0   0   0   0   0   0
  { 1 }                 0  1  1  1  1  1   1   1   1   1   1   1
  { 1, 2 }              0  1  6  7  7  7   7   7   7   7   7   7
  { 1, 2, 3 }           0  1  6  7  7  18  19  24  25  25  25  25
  { 1, 2, 3, 4 }        0  1  6  7  7  18  22  24  28  29  29  40
  { 1, 2, 3, 4, 5 }     0  1  6  7  7  18  22  28  29  34  35  40

OPT: { 4, 3 }, value = 22 + 18 = 40.

Knapsack Problem: Running Time
Running time. Theta(n W).
- Not polynomial in input size! Only log_2 W bits are needed to encode each weight, so the whole problem can be encoded with O(n log W) bits.
- "Pseudo-polynomial."
- The decision version of Knapsack is NP-complete. [Chapter 8]
Knapsack approximation algorithm. There exists a poly-time algorithm that produces a feasible solution with value within 0.01% of the optimum. [Section 11.8]