Class Note #14
Date: 03/01/2006

[Overall Information]
In this class, we studied an algorithm for integer multiplication that improves the running time from Θ(n^2) to Θ(n^1.59). We then used some of the insights gained in its analysis to study Strassen's algorithm for matrix multiplication. Finally, we briefly discussed the Quick Sort algorithm.

[Announcements]
1. Homework #3 was assigned. It covers divide-and-conquer algorithms and techniques for solving recurrence relations.
2. Quiz #4 is scheduled for 03/08/2006 (next Wednesday). It is the last quiz before the midterm exam and will cover divide-and-conquer algorithms and their analysis.
3. The results of Quiz #3 were announced, and solutions were distributed.

[During the Lecture]
1. At the beginning of today's lecture, we continued our study of the integer multiplication problem. First, the class reviewed the divide-and-conquer approach we studied last time: after splitting the n-bit inputs as x = 2^(n/2) x_1 + x_0 and y = 2^(n/2) y_1 + y_0, it uses the formula xy = 2^n x_1 y_1 + 2^(n/2)(x_1 y_0 + x_0 y_1) + x_0 y_0. Based on the Master Theorem, it turns out that no improvement has been achieved over the classical method (the elementary-school method); the recurrence is written out below.
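In display form (LaTeX), the reviewed scheme computes four products of half-length numbers plus Θ(n) work for the shifts and additions, so

    T(n) = 4\,T(n/2) + \Theta(n).

With a = 4, b = 2, and f(n) = \Theta(n) = O(n^{\log_2 4 - \varepsilon}) for any \varepsilon \le 1, case 1 of the Master Theorem gives T(n) = \Theta(n^{\log_2 4}) = \Theta(n^2), exactly the bound of the classical method.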

2. To actually improve the running time, we want to reduce the number of recursive calls at each level. In other words, we would like to reduce the number of products we compute from four to some smaller number. Since the only three quantities we really need are, according to the formula, x_1 y_1, x_1 y_0 + x_0 y_1, and x_0 y_0, the trick is to compute only b = (x_1 + x_0)(y_1 + y_0), a = x_0 y_0, and c = x_1 y_1. Notice that x_1 y_0 + x_0 y_1 can then be derived as (b - a - c). So after a, b, and c are computed recursively, we can calculate the product as xy = c·2^n + (b - a - c)·2^(n/2) + a. The three-line pseudocode of this new version of the integer multiplication algorithm was written on the whiteboard.
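The whiteboard pseudocode itself is not reproduced in these notes; the following is a minimal Python sketch of the same three-product idea (the function name and base-case threshold are our own choices, not from the lecture):

    def karatsuba(x: int, y: int) -> int:
        """Multiply two non-negative integers with three recursive products."""
        if x < 16 or y < 16:          # base case: at least one operand is small
            return x * y
        n = max(x.bit_length(), y.bit_length())
        half = n // 2
        x1, x0 = x >> half, x & ((1 << half) - 1)   # x = 2^half * x1 + x0
        y1, y0 = y >> half, y & ((1 << half) - 1)   # y = 2^half * y1 + y0
        a = karatsuba(x0, y0)                       # a = x0*y0
        c = karatsuba(x1, y1)                       # c = x1*y1
        b = karatsuba(x1 + x0, y1 + y0)             # b = (x1+x0)*(y1+y0)
        # xy = c*2^(2*half) + (b - a - c)*2^half + a
        return (c << (2 * half)) + ((b - a - c) << half) + a

    assert karatsuba(1234567, 7654321) == 1234567 * 7654321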

3. The first and last lines both take Θ(n), because they involve only a constant number of additions and shifts of numbers of at most n bits. Because (x_1 + x_0) and (y_1 + y_0) have at most (n/2)+1 bits, each recursive multiplication takes at most T((n/2)+1). The recurrence is thus T(n) = Θ(n) + 3T((n/2)+1). For large n, the added 1 actually makes no difference in the solution (this is not too difficult to prove), so we instead study the simpler recurrence T(n) = Θ(n) + 3T(n/2). We need to compare f(n) = cn with n^(log_2 3). Since n^(log_2 3) ≈ n^1.59 grows faster than n^1.1 = n^(1+0.1), we can choose ε = 0.1 and see that we are in case 1 of the Master Theorem. So T(n) = Θ(n^(log_2 3)) = Θ(n^1.59). Compared with the classical method's Θ(n^2), this is clearly an improvement. Intuitively, it says that to multiply two numbers, one does not really need to compute all bit products.
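In display form (LaTeX), the improved analysis reads:

    T(n) = 3\,T(n/2) + \Theta(n), \qquad f(n) = cn = O\left(n^{\log_2 3 - \varepsilon}\right) \text{ for } \varepsilon = 0.1,

since 1 \le \log_2 3 - 0.1 \approx 1.485, so case 1 of the Master Theorem gives

    T(n) = \Theta\left(n^{\log_2 3}\right) = \Theta(n^{1.59}).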

4. Next, the class moved on to the matrix multiplication problem. The classical method was briefly reviewed: when multiplying two n×n matrices, there are n^2 entries to compute, and each entry needs Θ(n) scalar multiplications, so the total running time is Θ(n^3). Merely writing down the n^2 output entries already takes Θ(n^2) time, so it is impossible to beat Θ(n^2) for matrix multiplication. Between Θ(n^3) and Θ(n^2), however, there is still room for improvement, and we use divide and conquer to find some. To do so, we divide both the input and output matrices into four submatrices each.
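For reference, the classical method is the familiar triple loop; this small Python sketch (our own illustration, not from the lecture) makes the "n^2 entries times Θ(n) each" count explicit:

    def classical_matmul(X, Y):
        """Multiply two n-by-n matrices (lists of lists) in Theta(n^3) time."""
        n = len(X)
        Z = [[0] * n for _ in range(n)]
        for i in range(n):           # n^2 output entries ...
            for j in range(n):
                for k in range(n):   # ... each needing Theta(n) multiplications
                    Z[i][j] += X[i][k] * Y[k][j]
        return Z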

5. For those blocks, one can easily see that the same multiplication rules apply as for the individual elements of a 2×2 matrix, so R = AE + BF; S = AG + BH; T = CE + DF; U = CG + DH. We can compute the eight products AE, BF, AG, BH, CE, DF, CG, DH recursively and then add them to obtain R, S, T, U. The addition step takes Θ(n^2), so the total running time satisfies T(n) = 8T(n/2) + Θ(n^2). Unfortunately, but not surprisingly (since this algorithm is just a redistribution of the original additions and scalar multiplications), the Master Theorem tells us that T(n) = Θ(n^3), so we have not improved the running time yet.
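In block form (LaTeX), with the block orientation consistent with the formulas above:

    \begin{pmatrix} R & S \\ T & U \end{pmatrix}
    =
    \begin{pmatrix} A & B \\ C & D \end{pmatrix}
    \begin{pmatrix} E & G \\ F & H \end{pmatrix},
    \qquad
    T(n) = 8\,T(n/2) + \Theta(n^2) = \Theta(n^{\log_2 8}) = \Theta(n^3).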

6. An actual improvement will thus require us to solve the problem by computing at most 7 smaller products, and then deducing R, S, T, U from them. The basic idea behind deriving such products is akin to what we did for integer multiplication, but the actual choice of products, due to Strassen, is not really obvious, nor is there a good explanation of why these particular choices should work, beyond the fact that one can prove they do. The products P_1, ..., P_7 were given in a picture in class (reproduced here in the standard form consistent with the combination formulas that follow):

P_1 = A(G - H);  P_2 = (A + B)H;  P_3 = (C + D)E;  P_4 = D(F - E);
P_5 = (A + D)(E + H);  P_6 = (B - D)(F + H);  P_7 = (C - A)(E + G).

From them, we can derive the desired matrix entries as follows: R = P_5 + P_4 - P_2 + P_6; S = P_1 + P_2; T = P_3 + P_4; U = P_5 + P_1 - P_3 + P_7.

7. Those expressions can be verified quite easily with standard algebra (even though coming up with them is likely much harder). In class, the expression for R was verified as an example.
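Written out (LaTeX), that verification goes through as follows, using the products as listed in item 6:

    \begin{aligned}
    R &= P_5 + P_4 - P_2 + P_6 \\
      &= (A+D)(E+H) + D(F-E) - (A+B)H + (B-D)(F+H) \\
      &= AE + AH + DE + DH + DF - DE - AH - BH + BF + BH - DF - DH \\
      &= AE + BF.
    \end{aligned}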

Using those formulas, Strassen's algorithm is rather simple: after the division step, compute P_1, P_2, ..., P_7 recursively, then compute R, S, T, U using the above expressions. The correctness of Strassen's algorithm follows directly from these algebraic verifications.

8. Both the divide and combine steps take Θ(n^2) time, so the total running time satisfies T(n) = 7T(n/2) + Θ(n^2). This time, the Master Theorem tells us that T(n) = Θ(n^(log_2 7)) = Θ(n^2.81), which again shows that not all the scalar multiplications that seemed necessary at first in fact are. Improving the running time of matrix multiplication is still a very active topic in algorithms research, not least because matrix multiplication is a core step in many scientific applications, data mining, machine learning, graphics, and other application domains. Currently, the best algorithm, due to Coppersmith and Winograd, has a running time of approximately Θ(n^2.38). Unfortunately, its constant is huge, so the algorithm is not practical for inputs of any realistic size. Strassen's algorithm, on the other hand, is quite practical for large enough n (though its constant is also worse than that of the default algorithm); depending on the exact parameters of the computer, the break-even point lies somewhere between n = 20 and n = 100. Naturally, one can design a hybrid approach that starts by using Strassen's algorithm for large n and, once the problem has been broken into small enough subproblems, switches to the default algorithm. It should also be noted that, while the default and Strassen's algorithms are good choices for dense matrices, if the matrix is sparse (i.e., has mostly zeroes), specialized sparse-matrix algorithms run much faster.

9. After presenting and discussing Strassen's algorithm, a small matrix multiplication example was worked in class. We were able to see how the algorithm works in detail, as well as how the bigger constant comes back to haunt us at small sizes.
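The following is a minimal Python/NumPy sketch of the recursion described in items 6-8, including the hybrid cutoff idea mentioned above. The function name, the cutoff value, and the assumption that n is a power of two are our own choices, not from the lecture:

    import numpy as np

    CUTOFF = 64  # below this size, fall back to the classical method (hybrid idea)

    def strassen(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
        """Multiply two n-by-n matrices, n a power of two, with 7 recursive products."""
        n = X.shape[0]
        if n <= CUTOFF:
            return X @ Y                      # default algorithm on small subproblems
        h = n // 2
        A, B = X[:h, :h], X[:h, h:]           # X = [[A, B], [C, D]]
        C, D = X[h:, :h], X[h:, h:]
        E, G = Y[:h, :h], Y[:h, h:]           # Y = [[E, G], [F, H]]
        F, H = Y[h:, :h], Y[h:, h:]
        P1 = strassen(A, G - H)
        P2 = strassen(A + B, H)
        P3 = strassen(C + D, E)
        P4 = strassen(D, F - E)
        P5 = strassen(A + D, E + H)
        P6 = strassen(B - D, F + H)
        P7 = strassen(C - A, E + G)
        R = P5 + P4 - P2 + P6                 # combine step: Theta(n^2) additions
        S = P1 + P2
        T = P3 + P4
        U = P5 + P1 - P3 + P7
        return np.block([[R, S], [T, U]])

    # quick check against the default algorithm on random matrices
    M = np.random.rand(128, 128)
    N = np.random.rand(128, 128)
    assert np.allclose(strassen(M, N), M @ N)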

10. During the last 10 minutes of the lecture, we had a brief introduction to the Quick Sort algorithm. Recall that in Merge Sort, most of the work is done in the combination step, while the division step is trivial; Quick Sort works the opposite way. Quick Sort first chooses a pivot in each recursive call. It then puts all elements smaller than the pivot in one subarray (on the left) and all elements larger than the pivot in another (on the right); the subarrays themselves are not sorted yet. Afterwards, Quick Sort is called recursively on the left and right subarrays, resulting in a completely sorted array without any further combination step. If m is the position of the pivot element after this partitioning, then T(n) = T(m - 1) + T(n - m) + Θ(n). The ideal m should thus be close to n/2, in which case the running time is T(n) = 2T(n/2) + Θ(n) = Θ(n log n).
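A minimal Python sketch of this scheme (our own illustration; the lecture did not fix a specific implementation, and this list-building version trades the usual in-place partitioning for clarity):

    import random

    def quicksort(arr: list) -> list:
        """Sort a list: all the work is in partitioning; no combination step needed."""
        if len(arr) <= 1:
            return arr
        pivot = random.choice(arr)                # random pivot, as recommended in item 11
        left  = [x for x in arr if x < pivot]     # elements smaller than the pivot
        mid   = [x for x in arr if x == pivot]    # the pivot (and any duplicates)
        right = [x for x in arr if x > pivot]     # elements larger than the pivot
        return quicksort(left) + mid + quicksort(right)

    assert quicksort([3, 1, 4, 1, 5, 9, 2, 6]) == [1, 1, 2, 3, 4, 5, 6, 9]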

On the other hand, a poorly chosen m (for instance, m = 1 or m = n) results in a running time of Θ(n^2).

11. Thus, ideally, Quick Sort would choose the median as the pivot in each call. Alternatively, picking the pivot randomly (i.e., using a random index) also works well. A bad choice would be to pick the first or last element of the array, as arrays are frequently already partially sorted, and many pivots would then be close to the smallest or largest element. With respect to finding the median, it should be noted that there is a divide-and-conquer algorithm that finds the median of an array in Θ(n) time.

This algorithm divides the whole array into n/5 small arrays, each containing 5 numbers. The median of each small array can be calculated in constant time. Then the algorithm calculates the median of all those baby medians. Using this median of medians m, we can return to the size-5 arrays: if the median m' of such a small array is larger than m, then the two largest elements of that array can be pruned, whereas if m' is smaller than m, then the two smallest elements can be pruned. The algorithm then recursively finds the median on the remaining 3n/5 elements. Please refer to the textbook for the details of Quick Sort and median calculation if you are interested. In the coming lectures, we will move forward to the study of Dynamic Programming.
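For the curious, here is a compact Python sketch of linear-time selection via the groups-of-five idea. It follows the standard textbook variant (recursing into one side of a partition with the correct rank) rather than the pruning view sketched in item 11, and all names are our own:

    def select(arr: list, k: int):
        """Return the k-th smallest element (k = 1 is the minimum) in O(n) time."""
        if len(arr) <= 5:
            return sorted(arr)[k - 1]
        # 1. split into groups of 5 and take each group's median (constant time each)
        groups = [arr[i:i + 5] for i in range(0, len(arr), 5)]
        baby_medians = [sorted(g)[len(g) // 2] for g in groups]
        # 2. recursively find the median of the baby medians
        m = select(baby_medians, len(baby_medians) // 2 + 1)
        # 3. partition around m and recurse on the side containing the k-th smallest
        smaller = [x for x in arr if x < m]
        larger  = [x for x in arr if x > m]
        if k <= len(smaller):
            return select(smaller, k)
        elif k > len(arr) - len(larger):
            return select(larger, k - (len(arr) - len(larger)))
        return m  # the pivot itself is the answer

    def median(arr: list):
        """Median (lower of the two middles for even length)."""
        return select(arr, (len(arr) + 1) // 2)

    assert median([7, 1, 3, 9, 5]) == 5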