CS145: Probability & Computing Lecture 24: Algorithms

CS145: Probability & Computing, Lecture 24: Algorithms. Instructor: Eli Upfal, Brown University Computer Science. Figure credits: Bertsekas & Tsitsiklis, Introduction to Probability, 2008; Pitman, Probability, 1999.

2-SAT Algorithm

Boolean formula in Conjunctive Normal Form (CNF): F = (X ∨ W ∨ Y) ∧ (X ∨ Ȳ) ∧ (W ∨ V) ...

Variables take values in {True, False}.

Problem: Given a formula with at most two variables per clause, find a Boolean assignment that satisfies all clauses.

2-SAT Algorithm

Algorithm:
1. Start with an arbitrary assignment.
2. Repeat until all clauses are satisfied:
   1. Pick an unsatisfied clause.
   2. If the clause has one variable, change the value of that variable.
   3. If the clause has two variables, choose one uniformly at random and change its value.

What is the expected run-time of this algorithm?
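Not part of the slides: a minimal Python sketch of the random-walk algorithm above, assuming a clause is encoded as a pair of nonzero integers (+i for x_i, −i for its negation):

```python
import random

def random_2sat(clauses, n, max_steps):
    """Random-walk 2-SAT: a clause is a pair of nonzero ints,
    +i meaning x_i must be True, -i meaning x_i must be False."""
    assign = [random.choice([False, True]) for _ in range(n + 1)]  # [0] unused

    def satisfied(lit):
        return assign[abs(lit)] == (lit > 0)

    for _ in range(max_steps):
        unsat = [c for c in clauses if not any(satisfied(l) for l in c)]
        if not unsat:
            return assign[1:]             # all clauses satisfied
        clause = random.choice(unsat)     # pick an unsatisfied clause
        lit = random.choice(clause)       # choose one of its variables...
        assign[abs(lit)] = not assign[abs(lit)]   # ...and flip it
    return None                           # step budget exhausted

# (x1 or x2) and (not x1 or x2): any satisfying assignment has x2 = True
print(random_2sat([(1, 2), (-1, 2)], n=2, max_steps=1000))
```

With a budget of a few thousand steps the walk finds an assignment for this tiny instance essentially always, in line with the O(n²) expectation analyzed below.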

Analysis of the 2-SAT Algorithm

Theorem: Assuming that the formula has a satisfying assignment, the expected run-time to find a satisfying assignment is O(n²).

W.l.o.g. assume that all clauses have two variables. Fix one satisfying assignment. Let n be the number of variables in the formula.

X_j = number of variables that match the satisfying assignment at time j.
D_t = the expected number of steps to termination when t variables match the satisfying assignment.

Analysis of the 2-SAT Algorithm

X_j = number of variables that match the satisfying assignment at time j.

Algorithm: Pick an unsatisfied clause and change the value of one of its two variables, chosen at random. An unsatisfied clause has at least one variable that disagrees with the fixed satisfying assignment, so a random flip increases the match count with probability at least 1/2:

Pr(X_j = 1 | X_{j-1} = 0) = 1
Pr(X_j = t | X_{j-1} = t-1) ≥ 1/2, for 1 ≤ t ≤ n-1

W.l.o.g. (pessimistically) we assume Pr(X_j = t | X_{j-1} = t-1) = 1/2.

Analysis of the 2-SAT Algorithm

D_t = the expected number of steps to termination when t variables match the satisfying assignment.

D_n = 0
D_t = 1 + ½ (D_{t+1} + D_{t-1}), for 1 ≤ t ≤ n-1
D_0 = 1 + D_1

Analysis of the 2-SAT Algorithm

Guess and verify: D_t = n² − t² solves the recurrence.

Theorem: Assuming that the formula has a satisfying assignment, the expected run-time to find that assignment is O(n²).
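As a quick check (not spelled out on the slide), substituting the guess into the recurrence:

```latex
D_n = n^2 - n^2 = 0, \qquad
1 + \tfrac{1}{2}\bigl(D_{t+1} + D_{t-1}\bigr)
  = 1 + \tfrac{1}{2}\bigl((n^2 - (t+1)^2) + (n^2 - (t-1)^2)\bigr)
  = n^2 - t^2 = D_t ,
```

and the boundary case: D_0 = n² = 1 + (n² − 1) = 1 + D_1. Since the walk starts with some X_0 ≥ 0, the expected number of steps is at most D_0 = n².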

Analysis of the 2-SAT Algorithm

Theorem: Assuming that the formula has a satisfying assignment, the expected run-time to find that assignment is O(n²).

Theorem: There is a one-sided error randomized algorithm for the 2-SAT problem that terminates in O(n² log n) time, with high probability returns an assignment when the formula is satisfiable, and always returns ``UNSATISFIABLE'' when no assignment exists.

Proof: Repeat the algorithm until it finds a satisfying assignment, or up to log n times. In each repeat, run the algorithm for 2n² steps. Since the expected time to find an assignment is at most n², by Markov's inequality the probability that the algorithm does not find an assignment, when one exists, in one round of 2n² steps is bounded by ½. The probability that it does not find an assignment, when one exists, in any of the log n rounds is bounded by (½)^{log n} = 1/n.

Probability and Algorithms

Classical Algorithm Analysis
Ø Algorithm: A function that uses a deterministic sequence of operations to produce a unique output for any valid input
Ø Complexity analysis: determine the largest number of operations over all possible inputs of some size (worst case)

Randomized Algorithms
Ø Pseudo-random numbers are used in algorithm operations
Ø Las Vegas algorithm: gives the correct result, with random run-time
Ø Monte Carlo algorithm: gives a distribution on the correctness of the result

Probabilistic Algorithm Analysis
Ø Given a random input, what is the expected running time?
Ø Given a random input, what is the CDF of running times?
Ø Input distribution could be uniform, or informed by the application

[Figure: sortvis.org]

A Randomized Quicksort

Procedure Q_S(S);
Input: A set S.
Output: The set S in sorted order.
1. Choose a random element y uniformly from S.
2. Compare all elements of S to y. Let S₁ = {x ∈ S − {y} : x ≤ y}, S₂ = {x ∈ S − {y} : x > y}.
3. Return the list: Q_S(S₁), y, Q_S(S₂).
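The procedure translates directly into a short Python sketch (assuming, as on the slide, that S is a set, i.e. its elements are distinct):

```python
import random

def quicksort(s):
    """Randomized quicksort Q_S from the slide; assumes the
    elements of s are distinct (s represents a set)."""
    if len(s) <= 1:
        return list(s)
    y = random.choice(s)                  # split item, uniform over S
    s1 = [x for x in s if x < y]          # S_1: elements below the split item
    s2 = [x for x in s if x > y]          # S_2: elements above the split item
    return quicksort(s1) + [y] + quicksort(s2)

print(quicksort([3, 1, 4, 6, 5, 9, 2]))   # [1, 2, 3, 4, 5, 6, 9]
```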

What is the Expected Running Time?

Let s₁, ..., s_n be the elements of S in sorted order. For i = 1, ..., n and j > i, define the 0-1 random variable X_{i,j} s.t. X_{i,j} = 1 if s_i is compared to s_j in the run of the algorithm, else X_{i,j} = 0.

The number of comparisons in running the algorithm is T = Σ_{i=1}^{n} Σ_{j>i} X_{i,j}.

We are interested in E[T] = Σ_{i=1}^{n} Σ_{j>i} E[X_{i,j}] = Σ_{i=1}^{n} Σ_{j>i} Pr(X_{i,j} = 1).

Theorem: E[T] = O(n log n).


What is the Expected Running Time?

What is the probability that X_{i,j} = 1? s_i is compared to s_j iff either s_i or s_j is chosen as a split item before any of the j − i − 1 elements between s_i and s_j. Split items are chosen uniformly at random, so each element of the set {s_i, s_{i+1}, ..., s_j} is equally likely to be chosen first among them.

E[X_{i,j}] = Pr(X_{i,j} = 1) = 2 / (j − i + 1).

E[T] = E[ Σ_{i=1}^{n} Σ_{j>i} X_{i,j} ] = Σ_{i=1}^{n} Σ_{j>i} E[X_{i,j}] = Σ_{i=1}^{n} Σ_{j>i} 2/(j − i + 1) ≤ Σ_{i=1}^{n} Σ_{k=2}^{n} 2/k ≤ 2n H_n = 2n ln n + O(n), where H_n = Σ_{k=1}^{n} 1/k.

Deterministic QuickSort

Procedure DQ_S(S);
Input: A set S.
Output: The set S in sorted order.
1. Let y be the first element in S.
2. Compare all elements of S to y. Let S₁ = {x ∈ S − {y} : x ≤ y}, S₂ = {x ∈ S − {y} : x > y}. (Elements in S₁ and S₂ are in the same order as in S.)
3. Return the list: DQ_S(S₁), y, DQ_S(S₂).

Probabilistic Analysis of Deterministic Q_S

Theorem: The expected run time of DQ_S on a random input, chosen uniformly from all possible permutations of S, is O(n log n).

Proof: Define X_{i,j} as before. If all permutations of S have equal probability, then all relative orders of s_i, ..., s_j have equal probability, thus Pr(X_{i,j} = 1) = 2/(j − i + 1), and E[ Σ_{i=1}^{n} Σ_{j>i} X_{i,j} ] = O(n log n).

Random Number Functions

Ø Assumption: We have a source of independent random variables which follow a continuous uniform distribution on [0,1]: U₁, U₂, U₃, ...
Ø In Matlab, the rand function provides this

Coming later: Methods for generating uniform variables

Discrete Random Number Generation

Ø Input: Independent uniform variables U₁, U₂, U₃, ...
Ø We can use these to exactly sample from any discrete distribution using the cumulative distribution function: F_X(x) = P(X ≤ x) = Σ_{k≤x} p_X(k)
Ø Define the inverse of this discrete CDF: h(u) = min{x | F_X(x) ≥ u}, and set X_i = h(U_i)

P(X_i = k) = P(F_X(k−1) < U_i ≤ F_X(k)) = F_X(k) − F_X(k−1) = p_X(k)

This function transforms uniform variables to our target distribution!
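A small Python sketch of this inverse-CDF construction (the pmf dictionary and helper names are illustrative, not from the lecture):

```python
import itertools
import random

def discrete_sampler(pmf):
    """Build h(u) = min{x : F_X(x) >= u} from a finite pmf
    {value: probability} and return a sampler X = h(U)."""
    values = sorted(pmf)
    cdf = list(itertools.accumulate(pmf[v] for v in values))

    def h(u):
        for v, f in zip(values, cdf):
            if f >= u:              # smallest value with F_X(x) >= u
                return v
        return values[-1]           # guard against float rounding

    return lambda: h(random.random())

sample = discrete_sampler({0: 0.2, 1: 0.5, 2: 0.3})
draws = [sample() for _ in range(100_000)]
# empirical frequency of the value 1 should be close to p_X(1) = 0.5
print(draws.count(1) / len(draws))
```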

Continuous Random Number Generation

Ø Input: Independent uniform variables U₁, U₂, U₃, ...
Ø We can use these to exactly sample from any continuous distribution using the cumulative distribution function: F_X(x) = P(X ≤ x) = ∫_{−∞}^{x} f_X(z) dz
Ø Assuming the continuous CDF is invertible: h(u) = F_X^{−1}(u), and set X_i = h(U_i)

P(X_i ≤ x) = P(h(U_i) ≤ x) = P(U_i ≤ F_X(x)) = F_X(x)

This function transforms uniform variables to our target distribution!
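For instance (an illustrative sketch, not from the slides), the Exponential(rate) CDF F_X(x) = 1 − e^{−rate·x} inverts in closed form, giving h(u) = −ln(1 − u)/rate:

```python
import math
import random

def exponential_sample(rate):
    """Sample Exponential(rate) by inversion:
    F(x) = 1 - exp(-rate*x)  =>  h(u) = -ln(1-u)/rate."""
    u = random.random()             # U uniform on [0, 1)
    return -math.log(1.0 - u) / rate

draws = [exponential_sample(2.0) for _ in range(100_000)]
# sample mean should be close to 1/rate = 0.5
print(sum(draws) / len(draws))
```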

Uniform Random Number Generation

Ø Assumption: We have a source of independent random variables which follow a continuous uniform distribution on [0,1]: U₁, U₂, U₃, ...
Ø In Matlab, the rand function provides this
Ø Chaotic dynamical systems are used to generate sequences of pseudo-random numbers whose distribution is approximately uniform on [0,1]
Ø Simplest examples are linear congruential generators, but try to use more sophisticated methods!

ū_{i+1} = (a ū_i + c) mod m,  u_i = ū_i / m

c and m should be relatively prime and large (plus more conditions).

Examples of Uniform Generators

Linear Congruential Generator: ū_{i+1} = (a ū_i + c) mod m. The seed determines the starting point. [Wikipedia]

RANDU: A catastrophically bad random number generator that was fairly widely used in the 1970s:

ū_{i+1} = 65539 ū_i mod 2³¹,  u_i = 2⁻³¹ ū_i

Constants were picked for ease of hardware implementation, BUT there are strong correlations among triples of values.
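The RANDU triple correlation can be seen directly (a sketch, not from the slides): since 65539 = 2¹⁶ + 3, one has 65539² ≡ 6·65539 − 9 (mod 2³¹), so every triple of consecutive states satisfies a fixed linear relation.

```python
def lcg(a, c, m, seed, n):
    """Linear congruential generator: u_{i+1} = (a*u_i + c) mod m.
    Returns the first n raw integer states after the seed."""
    out, u = [], seed
    for _ in range(n):
        u = (a * u + c) % m
        out.append(u)
    return out

# RANDU (a = 65539, c = 0, m = 2**31): every consecutive triple of raw
# states satisfies x_{k+2} = 6*x_{k+1} - 9*x_k (mod 2**31)
xs = lcg(65539, 0, 2**31, seed=1, n=10)
print(all((xs[k + 2] - 6 * xs[k + 1] + 9 * xs[k]) % 2**31 == 0
          for k in range(len(xs) - 2)))   # prints True
```

This is why RANDU's points fall on a small number of planes in three dimensions.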

Examples of Uniform Generators

Mersenne Twister: Used by Matlab's rand; period of 2¹⁹⁹³⁷ − 1. The pseudo-random sequence looks nearly random. Don't try this at home!

Symmetry Breaking

Ethernet Communication: A message is received by all nodes. Only one message per step. If two messages are sent, no message is received. Nodes detect collisions. No central scheduling protocol.

Contention Resolution Protocol

Assume n nodes on the Ethernet. In each step in which a node has a message in its queue, it tries to send it with probability 1/n (independent of other nodes).

Probability that a given sending node succeeds (no collision) in a step is (1 − 1/n)^{n−1} ≈ e⁻¹.

Expected number of steps till a node successfully sends its message: O(n). Equivalently: each node waits a number of steps distributed U[0, n−1].

Very wasteful if only a few nodes have messages to send!
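A quick simulation (illustrative, not from the slides) agrees with the formula: the chance that exactly one of n nodes transmits is n · (1/n) · (1 − 1/n)^{n−1} = (1 − 1/n)^{n−1}.

```python
import random

def success_prob(n, trials=100_000):
    """Estimate the probability that exactly one of n backlogged
    nodes transmits in a step, each sending with probability 1/n."""
    wins = 0
    for _ in range(trials):
        senders = sum(random.random() < 1 / n for _ in range(n))
        wins += (senders == 1)
    return wins / trials

# analytically (1 - 1/20)**19 ~ 0.377, approaching 1/e for large n
print(success_prob(20))
```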

Backoff Protocol

Each node's protocol:
1. i ← 0
2. While queue is not empty:
   1. Wait a random number of steps in [1, ..., 2^i]
   2. Try to send a message
   3. If a collision is detected, i ← i + 1

Analysis: Assume that m nodes have non-empty queues. After about log m collisions, all nodes with non-empty queues have i ≈ log m. The expected wait of a message at the head of each queue is O(m). The backoff protocol adapts to the actual load on the channel.
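A toy slotted simulation of the protocol (the slot model and all names are illustrative assumptions, not the lecture's):

```python
import random

def backoff_simulation(m, max_slots=100_000):
    """Toy slotted simulation: m nodes each send one message; on a
    collision every involved node increments its exponent i and
    waits a uniform number of slots in [1, 2**i]."""
    i = {v: 0 for v in range(m)}          # backoff exponent per node
    next_try = {v: 1 for v in range(m)}   # slot of each node's next attempt
    for slot in range(1, max_slots + 1):
        senders = [v for v in i if next_try[v] == slot]
        if len(senders) == 1:             # lone sender succeeds and leaves
            v = senders[0]
            del i[v], next_try[v]
            if not i:
                return slot               # all messages delivered
        else:                             # collision (or idle slot)
            for v in senders:
                i[v] += 1
                next_try[v] = slot + random.randint(1, 2 ** i[v])
    return max_slots

print(backoff_simulation(8))
```

Running it with various m shows completion in a number of slots roughly proportional to the load, without any node knowing m in advance.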

Verifying Matrix Multiplication

Given three n × n matrices A, B, and C over a Boolean field, we want to verify AB = C.

Standard method: Matrix multiplication takes Θ(n³) (or Θ(n^{2.37}) with fast methods) operations.

Randomized algorithm:
1. Choose a random vector r = (r₁, r₂, ..., r_n) ∈ {0,1}ⁿ.
2. Compute Br;
3. Compute A(Br);
4. Compute Cr;
5. If A(Br) ≠ Cr return AB ≠ C, else return AB = C.

The algorithm takes Θ(n²) time.

Verifying Matrix Multiplication

Randomized algorithm:
1. Choose a random vector r = (r₁, r₂, ..., r_n) ∈ {0,1}ⁿ.
2. Compute Br;
3. Compute A(Br);
4. Compute Cr;
5. If A(Br) ≠ Cr return AB ≠ C, else return AB = C.

The algorithm takes Θ(n²) time.

Theorem: If AB ≠ C, and r is chosen uniformly at random from {0,1}ⁿ, then Pr(ABr = Cr) ≤ 1/2.

Verifying Matrix Multiplication

Theorem: If AB ≠ C, and r is chosen uniformly at random from {0,1}ⁿ, then Pr(ABr = Cr) ≤ 1/2.

Proof: Let D = AB − C ≠ 0. ABr = Cr implies that Dr = 0. Since D ≠ 0 it has some non-zero entry; assume d₁₁ ≠ 0. For Dr = 0, it must be the case that Σ_{j=1}^{n} d_{1j} r_j = 0, or equivalently r₁ = −(Σ_{j=2}^{n} d_{1j} r_j) / d₁₁.

Verifying Matrix Multiplication

For fixed r₂, ..., r_n, there is at most one value of r₁ that satisfies the equality r₁ = −(Σ_{j=2}^{n} d_{1j} r_j) / d₁₁. Since r₁ is chosen uniformly from {0,1}, the probability that r₁ takes that value is at most 1/2, so Pr(ABr = Cr) ≤ 1/2. ∎
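The verifier is easy to sketch in Python, with independent repetitions driving the one-sided error below 1/2 (the example matrices and helper names are illustrative):

```python
import random

def freivalds(A, B, C, rounds=30):
    """Verify A @ B == C with O(n^2) work per round: test
    A(Br) == Cr for random r in {0,1}^n; error <= 2**-rounds."""
    n = len(A)

    def matvec(M, v):
        return [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]

    for _ in range(rounds):
        r = [random.randint(0, 1) for _ in range(n)]
        if matvec(A, matvec(B, r)) != matvec(C, r):
            return False                  # witness found: certainly AB != C
    return True                           # AB == C with high probability

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
AB = [[19, 22], [43, 50]]                 # the true product
bad = [[19, 22], [43, 51]]                # off by one in a single entry
print(freivalds(A, B, AB), freivalds(A, B, bad))
```

A "False" answer is always correct; a "True" answer is wrong only with probability at most 2^{−rounds}.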