shelat 16f-4800 sep Matrix Mult, Median, FFT

Similar documents
feb abhi shelat FFT,Median

feb abhi shelat Matrix, FFT

shelat divide&conquer closest points matrixmult median

Divide and Conquer. Maximum/minimum. Median finding. CS125 Lecture 4 Fall 2016

Introduction to Algorithms 6.046J/18.401J/SMA5503

Divide and conquer. Philip II of Macedon

Introduction to Algorithms

The divide-and-conquer strategy solves a problem by: Chapter 2. Divide-and-conquer algorithms. Multiplication

Class Note #14. In this class, we studied an algorithm for integer multiplication, which. 2 ) to θ(n

The divide-and-conquer strategy solves a problem by: 1. Breaking it into subproblems that are themselves smaller instances of the same type of problem

Kartsuba s Algorithm and Linear Time Selection

Answer the following questions: Q1: ( 15 points) : A) Choose the correct answer of the following questions: نموذج اإلجابة

Divide and Conquer: Polynomial Multiplication Version of October 1 / 7, 24201

A design paradigm. Divide and conquer: (When) does decomposing a problem into smaller parts help? 09/09/ EECS 3101

CSE 421 Algorithms. T(n) = at(n/b) + n c. Closest Pair Problem. Divide and Conquer Algorithms. What you really need to know about recurrences

How to Multiply. 5.5 Integer Multiplication. Complex Multiplication. Integer Arithmetic. Complex multiplication. (a + bi) (c + di) = x + yi.

1 Divide and Conquer (September 3)

Integer Multiplication

Frequency Domain Finite Field Arithmetic for Elliptic Curve Cryptography

CPSC 518 Introduction to Computer Algebra Schönhage and Strassen s Algorithm for Integer Multiplication

CSE 421 Algorithms: Divide and Conquer

Speedy Maths. David McQuillan

Divide and Conquer algorithms

CPSC 518 Introduction to Computer Algebra Asymptotically Fast Integer Multiplication

PUTNAM PROBLEMS SEQUENCES, SERIES AND RECURRENCES. Notes

CMPS 2200 Fall Divide-and-Conquer. Carola Wenk. Slides courtesy of Charles Leiserson with changes and additions by Carola Wenk

Lecture 8 - Algebraic Methods for Matching 1

Algorithm Design and Analysis

CS 577 Introduction to Algorithms: Strassen s Algorithm and the Master Theorem

Copyright 2000, Kevin Wayne 1

Section September 6, If n = 3, 4, 5,..., the polynomial is called a cubic, quartic, quintic, etc.

Fast Polynomial Multiplication

Chapter 2. Recurrence Relations. Divide and Conquer. Divide and Conquer Strategy. Another Example: Merge Sort. Merge Sort Example. Merge Sort Example

Polynomials. In many problems, it is useful to write polynomials as products. For example, when solving equations: Example:

Divide and Conquer. CSE21 Winter 2017, Day 9 (B00), Day 6 (A00) January 30,

Divide and Conquer. Arash Rafiey. 27 October, 2016

Parallel Integer Polynomial Multiplication Changbo Chen, Svyatoslav Parallel Integer Covanov, Polynomial FarnamMultiplication

! Break up problem into several parts. ! Solve each part recursively. ! Combine solutions to sub-problems into overall solution.

Solutions to 2015 Entrance Examination for BSc Programmes at CMI. Part A Solutions

Divide and Conquer Algorithms

Integer multiplication and the truncated product problem

Copyright 2000, Kevin Wayne 1

Algorithms, Design and Analysis. Order of growth. Table 2.1. Big-oh. Asymptotic growth rate. Types of formulas for basic operation count

Algebra. Mathematics Help Sheet. The University of Sydney Business School

x 9 or x > 10 Name: Class: Date: 1 How many natural numbers are between 1.5 and 4.5 on the number line?

Short Division of Long Integers. (joint work with David Harvey)

Chapter 4 Finite Fields

Implementation of the DKSS Algorithm for Multiplication of Large Numbers

A matrix over a field F is a rectangular array of elements from F. The symbol

Basic elements of number theory

Basic elements of number theory

Divide and Conquer Strategy

1.1 Administrative Stuff

6.S897 Algebra and Computation February 27, Lecture 6

Permutations and Polynomials Sarah Kitchen February 7, 2006

CS60020: Foundations of Algorithm Design and Machine Learning. Sourangshu Bhattacharya

We consider the problem of finding a polynomial that interpolates a given set of values:

Analysis of Algorithms CMPSC 565

Design and Analysis of Algorithms

Algorithms (II) Yu Yu. Shanghai Jiaotong University

CS483 Design and Analysis of Algorithms

CSCI Honor seminar in algorithms Homework 2 Solution

RSA Implementation. Oregon State University

Great Theoretical Ideas in Computer Science

Integer multiplication with generalized Fermat primes

CMPT 307 : Divide-and-Conqer (Study Guide) Should be read in conjunction with the text June 2, 2015

Tropical Polynomials

Three Ways to Test Irreducibility

Faster integer multiplication using short lattice vectors

Three Ways to Test Irreducibility

Great Theoretical Ideas in Computer Science

ax b mod m. has a solution if and only if d b. In this case, there is one solution, call it x 0, to the equation and there are d solutions x m d

CMPSCI611: Three Divide-and-Conquer Examples Lecture 2

be any ring homomorphism and let s S be any element of S. Then there is a unique ring homomorphism

Elliptic Curves I. The first three sections introduce and explain the properties of elliptic curves.

DIVIDE AND CONQUER II

a 11 x 1 + a 12 x a 1n x n = b 1 a 21 x 1 + a 22 x a 2n x n = b 2.

CSE548, AMS542: Analysis of Algorithms, Fall 2017 Date: October 11. In-Class Midterm. ( 7:05 PM 8:20 PM : 75 Minutes )

COMPUTER ARITHMETIC. 13/05/2010 cryptography - math background pp. 1 / 162

Polynomials. Chapter 4

Outline. Number Theory and Modular Arithmetic. p-1. Definition: Modular equivalence a b [mod n] (a mod n) = (b mod n) n (a-b)

CS/COE 1501 cs.pitt.edu/~bill/1501/ Integer Multiplication

Algorithms and Data Structures

(308 ) EXAMPLES. 1. FIND the quotient and remainder when. II. 1. Find a root of the equation x* = +J Find a root of the equation x 6 = ^ - 1.

that if a b (mod m) and c d (mod m), then ac bd (mod m) soyou aren't allowed to use this fact!) A5. (a) Show that a perfect square must leave a remain

Computer Algebra: General Principles

Solutions to Exercises, Section 2.4

Divide and Conquer. Andreas Klappenecker. [based on slides by Prof. Welch]

ELEMENTARY LINEAR ALGEBRA

compare to comparison and pointer based sorting, binary trees

Week 15-16: Combinatorial Design

PUTNAM TRAINING POLYNOMIALS. Exercises 1. Find a polynomial with integral coefficients whose zeros include

Lucas Lehmer primality test - Wikipedia, the free encyclopedia

Copyright 2000, Kevin Wayne 1

Math 101 Study Session Spring 2016 Test 4 Chapter 10, Chapter 11 Chapter 12 Section 1, and Chapter 12 Section 2

Review of Matrices and Block Structures

1 Linear transformations; the basics

Official Solutions 2014 Sun Life Financial CMO Qualifying Rêpechage 1

ITEC2620 Introduction to Data Structures

Winter Camp 2009 Number Theory Tips and Tricks

Transcription:

L5 shelat 16f-4800 sep 23 2016 Matrix Mult, Median, FFT

merge-sort (A, p, r) if p<r q (p + r)/2 merge-sort (A, p, q) merge-sort (A, q +1,r) merge(a, p, q, r) MERGE(A[1.. n],m): i 1; j m +1 for k 1 to n if j>n B[k] A[i]; i i +1 else if i>m B[k] A[j]; j j +1 else if A[i] <A[j] B[k] A[i]; i i +1 else B[k] A[j]; j j +1 for k 1 to n A[k] B[k] jeff erickson

Karatsuba(ab, cd) Base case: return b*d if inputs are 1-digit ac = Karatsuba(a,c) bd = Karatsuba(b,d) 3T (n/2) + 2n t = Karatsuba((a+b),(c+d)) mid = t - ac - bd RETURN ac*100 2 + mid*100 + bd 4n 3n

Closest(P,SX,SY) Let q be the middle-element of SX Divide P into Left, Right according to q delta,r,j = MIN(Closest(Left, LX, LY) Closest(Right, RX, RY) ) Mohawk = { Scan SY, add pts that are delta from q.x } For each point x in Mohawk (in order): Compute distance to its next 15 neighbors Update delta,r,j if any pair (x,y) is < delta Return (delta,r,j) Can be reduced to 7!

second attempt arbit+(a[1...n]) base case if A <=2, (lg,minl,maxl) = arbit(left(a)) (rg,minr,maxr) = arbit(right(a)) return max{maxr-minl,lg,rg}, min{minl, minr}, max{maxl, maxr}

multiplication

=

=

nx c i,j = a i,k b k,j k=1

= AE + BG AF + BH CE + DG CF + DH [Strassen] P 1 = A(F H) P 2 =(A + B)H P 3 =(C + D)E P 4 = D(G E) P 5 =(A + D)(E + H) P 6 =(B D)(G + H) P 7 =(A C)(E + F )

[strassen]

[strassen]

[strassen]

taking this idea further 3x3 matricies [Laderman 75]

1978 victor pan method 70x70 matrix using 143640 mults what is the recurrence:

Median

problem: given a list of n elements, find the element of rank n/2. (half are larger, half are smaller)

problem: given a list of n elements, find the element of rank n/2. (half are larger, half are smaller) can generalize to i first solution: sort and pluck.

problem: given a list of n elements, find the element of rank i. key insight: we do not have to fully sort. semi sort can suffice.

pick first element partition list about this one see where we stand

review: how to partition a list

review: how to partition a list GOAL: start with THIS LIST and END with THAT LIST less than greater than

review: how to partition a list

review: how to partition a list

review: how to partition a list

review: how to partition a list

review: how to partition a list

review: how to partition a list partitioning a list about an element takes linear time.

select

select handle base case. partition list about first element if pivot p is position i, return pivot else if pivot p is in position > i select else select

Assume our partition always select splits list into two eql parts handle base case. partition list about first element if pivot is position i, return pivot else if pivot is in position > i select else select

Assume our partition always select splits list into two eql parts handle base case. partition list about first element if pivot is position i, return pivot else if pivot is in position > i select else select

problem: what if we always pick bad partitions?

problem: what if we always pick bad partitions?

problem: what if we always pick bad partitions?

problem: what if we always pick bad partitions?

select handle base case. partition list about first element if pivot is position i, return pivot else if pivot is in position > i select else select

select handle base case. partition list about first element if pivot is position i, return pivot else if pivot is in position > i select else select T (n) =T (n 1) + O(n) (n 2 )

Needed: a good partition element partition

Needed: a good partition element partition produce an element where 30% smaller, 30% larger

solution: bootstrap image: gucci image: mark nason image: d&g

partition

partition

partition

partition median of each group form a smaller list select use the median of this smaller list as the partition element

partition 1. 2. 3. 4. 5.

partition divide list into groups of 5 elements find median of each small list gather all medians call select(...) on this sublist to find median return the result

partition divide list into groups of 5 elements find median of each small list gather all medians call select(...) on this sublist to find median return the result

a nice property of our partition

a nice property of our partition

a nice property of our partition

SWITCH TO A BIGGER EXAMPLE

a nice property of our partition < < < < < < < < < < <

a nice property of our partition < < < < < < < < < < < this implies there are at most numbers larger than /smaller

a nice property of our partition

select

select handle base case for small list else pivot = FindPartitionValue(A,n) partition list about pivot if pivot is position i, return pivot else if pivot is in position > i select else select

FindPartition divide list into groups of 5 elements find median of each small list gather all medians call select(...) on this sublist to find median return the result

select handle base case for small list else pivot = FindPartitionValue(A,n) partition list about pivot if pivot is position i, return pivot else if pivot is in position > i select else select

select handle base case for small list else pivot = FindPartitionValue(A,n) partition list about pivot if pivot is position i, return pivot else if pivot is in position > i select else select

Fast Fourier Transform

big ideas:

f(x) =5+2x + x 2 75 50 25-12.5-10 -7.5-5 -2.5 0 2.5 5 7.5 10

f(x) =5+2x + x 2 75 50 25-12.5-10 -7.5-5 -2.5 0 2.5 5 7.5 10

degree polynomial n points on a curve

FFT input: a 0,a 1,a 2,...,a n 1 output:

FFT input: a 0,a 1,a 2,...,a n 1 output: evaluate polynomial A at (any) n different points. n points on a curve

Later, we shall see that the same ideas for FFT can be used to implement Inverse-FFT. Inverse FFT: Given n-points,

Later, we shall see that the same ideas for FFT can be used to implement Inverse-FFT. Inverse FFT: Given n-points, y 0,y 1,...,y n 1 find a degree n polynomial A such that y i = A(! i )

Brute force method to evaluate A at n points:

solve the large problem by solving smaller problems and combining solutions T(n)=

suppose we had already had eval of Ae,Ao on {4,9,16,25} A e (4) A e (9) A e (16) A e (25) A 0 (4) A 0 (9) A 0 (16) A 0 (25)

suppose we had already had eval of Ae,Ao on {4,9,16,25} A e (4) A e (9) A e (16) A e (25) A 0 (4) A 0 (9) A 0 (16) A 0 (25) Then we could compute 8 terms: A(2) = A e (4) + 2A o (4) A( 2) = A e (4) + ( 2)A o (4) A(3) = A e (9) + 3A o (9) A( 3) = A e (9) + ( 3)A o (9) A(4), A(-4), A(5), A(-5)

FFT(f=a[1,...,n]) Evaluates degree n poly on the n th roots of unity

Last remaining issue:

Roots of unity should have n solutions what are they?

Remember this? e 2 i =1

consider the n solutions are: n1,e 2 i/n,e 2 i2/n,e 2 i3/n,...,e 2 i(n 1)/no

the n solutions are: consider for j=0,1,2,3,...,n-1 = is an n th root of unity

What is this number? = is an n th root of unity

Taylor series expansion of a function f around point a f (y) = f (a)+ f 0 (a) (y a)+ f 00 (a) 1! 2! (y a) 2 + f 000 (a) 3! (y a) 2 + e x = around 0

What is this number? = is an n th root of unity e 2pij/n = cos(2pj/n)+i sin(2pj/n)

= is an n th root of unity Lets compute! 1,8

Compute all 8 roots of unity Then graph them

roots of unity should have n solutions e 2pij/n = cos(2pj/n)+i sin(2pj/n)

squaring the n th roots of unity

squaring the n th roots of unity

Thm: Squaring an n th root produces an n/2 th root. 1! 1,8 = p2 + p i example: 2! 2 1,8 = 1 p2 + p i 2 = 2 1 p2 2 +2 1 p2 i p 2 + i p 2 2 =1/2+i 1/2 = i

Thm: Squaring an n th root produces an n/2 th root. n1,e 2 i(1/n),e 2 i(2/n),e 2 i(3/n),...,e 2 i(n/2)/n,e 2 i(n/2+1)/n 2 i(n,...,e 1)/no

Thm: Squaring an n th root produces an n/2 th root. n1,e 2 i(1/n),e 2 i(2/n),e 2 i(3/n),...,e 2 i(n/2)/n,e 2 i(n/2+1)/n 2 i(n,...,e 1)/no 1 e 2 i(1/(n/2)) e 2 i(2/(n/2)) e 2 i(3/(n/2)) 1 e 2 i((n/2)+1/(n/2)) = e 2 i(1+1/(n/2)) =1 e 2 i(1/(n/2))

If n=16

evaluate at a root of unity

evaluate at a root of unity A(! i,n )=A e (! 2 i,n)+! i,n A o (! 2 i,n) n th root of unity n/2 th root of unity n/2 th root of unity

FFT(f=a[1,...,n]) Evaluates degree n poly on the n th roots of unity

FFT(f=a[1,...,n]) Base case if n<=2 E[...] <- FFT(Ae) O[...] <- FFT(Ao) // eval Ae on n/2 roots of unity // eval Ao on n/2 roots of unity combine results using equation: Return n points.

1 1 7 8 9 4 3 2 3 5 7 8 5 3 6 7 7 1 5 6 1 7 8 9

a3 a2 a1 a0 b3 b2 b1 b0 a3b0 a2b0 a1b0 a0b0 a3b1 a2b1 a1b1 a0b1 a3b2 a2b2 a1b2 a0b2 a3b3 a2b3 a1b3 a0b3

A(x) =a 3 x 3 + a 2 x 2 + a 1 x + a 0 B(x) =b 3 x 3 + b 2 x 2 + b 1 x + b 0 C(x) = a 3 b 3 x 6 + (a 3 b 2 + a 2 b 3 )x 5 + (a 3 b 1 + a 2 b 2 + a 1 b 3 )x 4 + (a 3 b 0 + a 2 b 1 + a 1 b 2 + a 0 b 3 )x 3 + (a 2 b 0 + a 1 b 1 + a 0 b 2 )x 2 + (a 1 b 0 + a 0 b 1 )x+ a 0 b 0

y=2x+1 y=x+1

y=2x+1 y=x+1

a3 a2 a1 a0 b3 b2 b1 b0 A(x) =a 0 + a 1 x + a 2 x 2 + a 3 x 3 +0x 4 +0x 5 +0x 6 +0x 7 B(x) =b 0 + b 1 x + b 2 x 2 + b 3 x 3 +0x 4 +0x 5 +0x 6 +0x 7

a3 a2 a1 a0 b3 b2 b1 b0 A(x) =a 0 + a 1 x + a 2 x 2 + a 3 x 3 +0x 4 +0x 5 +0x 6 +0x 7 B(x) =b 0 + b 1 x + b 2 x 2 + b 3 x 3 +0x 4 +0x 5 +0x 6 +0x 7 A(! 0 ) A(! 1 ) A(! 2 ) A(! 7 )...

a3 a2 a1 a0 b3 b2 b1 b0 A(x) =a 0 + a 1 x + a 2 x 2 + a 3 x 3 +0x 4 +0x 5 +0x 6 +0x 7 B(x) =b 0 + b 1 x + b 2 x 2 + b 3 x 3 +0x 4 +0x 5 +0x 6 +0x 7 A(! 0 ) A(! 1 ) A(! 2 ) A(! 7 )... B(! 0 ) B(! 1 ) B(! 2 )... B(! 7 ) FFT FFT

a3 a2 a1 a0 b3 b2 b1 b0 A(x) =a 0 + a 1 x + a 2 x 2 + a 3 x 3 +0x 4 +0x 5 +0x 6 +0x 7 B(x) =b 0 + b 1 x + b 2 x 2 + b 3 x 3 +0x 4 +0x 5 +0x 6 +0x 7 A(! 0 ) A(! 1 ) A(! 2 ) A(! 7 )... B(! 0 ) B(! 1 ) B(! 2 )... B(! 7 ) FFT FFT C(! 0 ) C(! 1 ) C(! 2 )... C(! 7 )

a3 a2 a1 a0 b3 b2 b1 b0 A(x) =a 0 + a 1 x + a 2 x 2 + a 3 x 3 +0x 4 +0x 5 +0x 6 +0x 7 B(x) =b 0 + b 1 x + b 2 x 2 + b 3 x 3 +0x 4 +0x 5 +0x 6 +0x 7 A(! 0 ) A(! 1 ) A(! 2 ) A(! 7 )... B(! 0 ) B(! 1 ) B(! 2 )... B(! 7 ) FFT FFT C(! 0 ) C(! 1 ) C(! 2 )... C(! 7 ) C(x) =c 0 + c 1 x + c 2 x 2 + c 7 x 7 IFFT

application to mult karatsuba 1 7 8 9 1 4 3 2 a b c d

application to mult karatsuba 1 7 8 9 1 4 3 2 a b c d

Multiplying n-bit numbers https://en.wikipedia.org/wiki/file:integer_multiplication_by_fft.svg Schönhage Strassen 71 Fürer 07 O(n log n log log n) O(n log(n)2 log (n) )

AGMP-BASEDIMPLEMENTATIONOFSCHÖNHAGE-STRASSEN S LARGE INTEGER MULTIPLICATION ALGORITHM PIERRICK GAUDRY, ALEXANDER KRUPPA, AND PAUL ZIMMERMANN Abstract. Schönhage-Strassen s algorithm is one of the best known algorithms for multiplying large integers. Implementing it efficiently is of utmost importance, since many other algorithms rely on it as a subroutine. We present here an improved implementation, based on the one distributed within the GMP library. The following ideas and techniques were used or tried: faster arithmetic modulo 2 n +1, improved cache locality, Mersenne transforms, Chinese Remainder Reconstruction, the 2trick,Harley sandgranlund stricks, improved tuning. We also discuss some ideas we plan to try in the future. Introduction Since Schönhage and Strassen have shown in 1971 how to multiply two N-bit integers in O(N log N log log N) time[21],severalauthorsshowedhowtoreduceotheroperations inverse, division, square root, gcd, base conversion, elementary functions to multiplication, possibly with log N multiplicative factors [5, 8, 17, 18, 20, 23]. It has now become common practice to express complexities in terms of the cost M(N) to multiply two N-bit numbers, and many researchers tried hard to get the best possible constants in front of M(N) forthe above-mentioned operations (see for example [6, 16]). Strangely, much less effort was made for decreasing the implicit constant in M(N) itself, although any gain on that constant will give a similar gain on all multiplication-based operations. Some authors reported on implementations of large integer arithmetic for specific hardware or as part of a number-theoretic project [2, 10]. In this article we concentrate on the question of an optimized implementation of Schönhage-Strassen s algorithm on a classical workstation.

Applications of FFT

Applications of FFT

418.222 417.929 418.127 398.417 397.617 401.902 405.7328 414.795 408.15 400.868 411.8386

String matching with * ACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGCC CCTGGAGGGTGGCCCCACCGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAATAAGGAAAAGCAGC CTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGGAGAGG AAGCTCGGGAGGTGGCCAGGCGGCAGGAAGGCGCACCCCCCCAGCAATCCGCGCGCCGGGACAGAATGCC CTGCAGGAACTTCTTCTGGAAGACCTTCTCCTCCTGCAAATAAAACCTCACCCATGAATGCTCACGCAAG TTTAATTACAGACCTGAA Looking for all occurrences of GGC*GAG*C*GC where I don't care what the * symbol is.