CS 683: Learning, Games, and Electronic Markets (Spring 2007)
Solution Set for Homework #1

1. Suppose x and y are real numbers and x > y. Prove that

\[ e^x > \frac{e^x - e^y}{x - y} > e^y. \]

Solution: Let f(s) = e^s. By the mean value theorem, there exists a number z such that x > z > y and

\[ f'(z) = \frac{f(x) - f(y)}{x - y}. \]

The left side is equal to e^z and the right side is equal to (e^x - e^y)/(x - y). As x > z > y, we have

\[ e^x > e^z = \frac{e^x - e^y}{x - y} > e^y. \]
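A quick numeric spot-check of the inequality (an editorial addition, not part of the original solution set; it simply samples random pairs x > y):

    import math, random

    # Verify e^y < (e^x - e^y)/(x - y) < e^x at random points with x > y.
    random.seed(0)
    for _ in range(10000):
        y, x = sorted(random.uniform(-5.0, 5.0) for _ in range(2))
        if x == y:
            continue  # ties have probability zero, but guard anyway
        slope = (math.exp(x) - math.exp(y)) / (x - y)
        assert math.exp(y) < slope < math.exp(x), (x, y, slope)
    print("inequality verified at 10,000 random points")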

2. A standard 52-card deck (the set {2, 3, 4, 5, 6, 7, 8, 9, 10, jack, queen, king, ace} x {clubs, diamonds, hearts, spades}) is randomly partitioned into four 13-element sets, which are dealt to players named North, South, East, and West.

(a) Calculate Pr(South gets exactly 2 aces | North gets exactly 1 ace).

Solution: For a player P, let A(P) denote the number of aces dealt to P. We have

\[ \Pr(A(\text{South}) = 2 \mid A(\text{North}) = 1) = \frac{\Pr(A(\text{South}) = 2 \text{ and } A(\text{North}) = 1)}{\Pr(A(\text{North}) = 1)}. \]

The number of ways of dealing a hand to North is $\binom{52}{13}$, and the number of ways of dealing a hand to North with exactly 1 ace is $\binom{4}{1}\binom{48}{12}$, because there are $\binom{4}{1}$ ways to deal an ace to North and $\binom{48}{12}$ ways to deal North's 12 remaining cards from the 48 remaining cards which are not aces. Hence,

\[ \Pr(A(\text{North}) = 1) = \binom{4}{1}\binom{48}{12} \Big/ \binom{52}{13}. \]

Similarly, there are $\binom{52}{13}\binom{39}{13}$ ways to deal a 13-card hand to each of North and South. (After dealing 13 cards to North, there are 39 cards remaining in the deck, and hence $\binom{39}{13}$ ways to deal South a 13-card hand from these remaining cards.) Of all the possible ways to deal North's and South's 13-card hands, there are $\binom{4}{1}\binom{3}{2}\binom{48}{12}\binom{36}{11}$ ways to deal a pair of hands in which North gets one ace and South gets two. (The product of four terms is justified by considering the number of ways to deal one ace to North, the number of ways to deal two of the remaining aces to South, the number of ways to deal 12 more non-ace cards to North, and the number of ways to deal 11 more non-ace cards to South from the 36 remaining non-ace cards which were not dealt to North.) Hence,

\[ \Pr(A(\text{South}) = 2 \text{ and } A(\text{North}) = 1) = \frac{\binom{4}{1}\binom{3}{2}\binom{48}{12}\binom{36}{11}}{\binom{52}{13}\binom{39}{13}}. \]

Putting all of this together, we find that

\[ \Pr(A(\text{South}) = 2 \mid A(\text{North}) = 1) = \frac{\binom{4}{1}\binom{3}{2}\binom{48}{12}\binom{36}{11}}{\binom{52}{13}\binom{39}{13}} \Big/ \frac{\binom{4}{1}\binom{48}{12}}{\binom{52}{13}} = \frac{\binom{3}{2}\binom{36}{11}}{\binom{39}{13}} = \frac{156}{703} \approx 0.2219. \]

(b) Let H and D denote the number of hearts and diamonds, respectively, dealt to North. Calculate E(H | D) as a function of D.

Solution: Let S and C denote the number of spades and clubs, respectively, dealt to North. By symmetry, we have E(H | D) = E(S | D) = E(C | D), and we also know that H + S + C = 13 - D. Hence

\[ E(H \mid D) = \tfrac{1}{3} E(H + S + C \mid D) = \tfrac{1}{3}(13 - D). \]
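Both parts are easy to confirm mechanically. The sketch below (an editorial addition; the variable names are ours) recomputes part (a) exactly with math.comb and spot-checks part (b) by dealing random 13-card hands:

    import math, random
    from collections import defaultdict

    C = math.comb

    # Part (a): Pr(A(South)=2 | A(North)=1) should equal 156/703.
    joint = C(4, 1) * C(3, 2) * C(48, 12) * C(36, 11) / (C(52, 13) * C(39, 13))
    marginal = C(4, 1) * C(48, 12) / C(52, 13)
    print(joint / marginal, 156 / 703)  # both ~0.22190...

    # Part (b): E(H | D) should equal (13 - D)/3 for North's hand.
    random.seed(0)
    deck = [(rank, suit) for rank in range(13) for suit in "CDHS"]
    totals = defaultdict(lambda: [0, 0])  # D -> [running sum of H, count]
    for _ in range(100000):
        hand = random.sample(deck, 13)
        d = sum(1 for _, s in hand if s == "D")
        h = sum(1 for _, s in hand if s == "H")
        totals[d][0] += h
        totals[d][1] += 1
    for d in sorted(totals):
        h_sum, count = totals[d]
        print(d, round(h_sum / count, 3), round((13 - d) / 3, 3))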

3. Let x_1, x_2, ..., x_n be independent uniformly-distributed random samples from the interval [0, 1]. Define the following probabilities:

p(n) is the probability that $\min_k x_k > 0.01$.
q(n) is the probability that $\min_{i \neq j} |x_i - x_j| < 1/n^2$.
r(n) is the probability that $\min_{i \neq j} |x_i - x_j| > 1/(100n)$.
s(n) is the probability that exactly $\lfloor n/2 \rfloor$ of the numbers x_1, ..., x_n lie in [0, 1/2].

Estimate the asymptotic behavior of each of these probabilities as n tends to infinity. Specifically, for each of the sequences p(n), q(n), r(n), s(n), determine whether the sequence

(A) tends to zero exponentially fast, i.e. is bounded above by c^n for some constant c < 1, for all sufficiently large n;
(B) tends to zero, but not exponentially fast;
(C) remains bounded away from 0 and 1;
(D) tends to 1.

Solution: Answers: (A) for p(n); (C) for q(n); (A) for r(n); (B) for s(n).

Justifications:

(a) We have p(n) = (0.99)^n, hence (A) is correct.

(b) First, here is a heuristic for seeing that q(n) should remain bounded away from 0 and 1. For each pair of distinct indices i, j, the probability that $|x_i - x_j| < 1/n^2$ is very close to $2/n^2$. Hence the expected number of unordered pairs {i, j} such that i ≠ j and $|x_i - x_j| < 1/n^2$ is roughly $\binom{n}{2} \cdot \frac{2}{n^2}$, which approaches 1 as n → ∞. Since the expected number of pairs {i, j} satisfying $|x_i - x_j| < 1/n^2$ is approaching a constant, it is reasonable to suspect that the probability that one such pair exists is bounded away from 0 and 1. To make this rigorous, we start by establishing the following lemma.

Lemma 1. Let ε > 0 be a real number, n a positive integer, and x_1, x_2, ..., x_n independent uniformly-distributed samples from [0, 1]. The probability that $\min_{i \neq j} |x_i - x_j| > \varepsilon$ is bounded above by $e^{-(n-1)(n-2)\varepsilon/2}$.

Proof. For a point x ∈ [0, 1] let I_x denote the interval [x, x + ε] ∩ [0, 1], and for a set S ⊆ [0, 1] let I_S = ∪_{x ∈ S} I_x. Let us call a set S ⊆ [0, 1] independent if |x - y| > ε for all distinct x, y ∈ S. It is clear that if S is an independent set of k elements then the sets I_x (x ∈ S) are pairwise disjoint, and at most one of them has measure less than ε; consequently I_S has measure at least (k - 1)ε. Now let S_k = {x_1, x_2, ..., x_k} and observe that

\[ \Pr(\{x_1, \dots, x_n\} \text{ is independent}) = \prod_{k=2}^{n} \Pr(S_k \text{ is independent} \mid S_{k-1} \text{ is independent}) \le \prod_{k=2}^{n} \Pr(x_k \notin I_{S_{k-1}} \mid S_{k-1} \text{ is independent}) \le \prod_{k=2}^{n} \big[ 1 - (k-2)\varepsilon \big] \le \prod_{k=2}^{n} e^{-(k-2)\varepsilon} = e^{-(n-1)(n-2)\varepsilon/2}. \]
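Lemma 1 can be spot-checked by simulation (an editorial addition, not part of the original solutions). After sorting, the minimum over all pairs equals the minimum adjacent gap; with ε = 1/n² the no-close-pair probability should stay below $e^{-(n-1)(n-2)\varepsilon/2}$, and one minus it estimates q(n):

    import math, random

    random.seed(0)
    for n in (10, 50, 200):
        eps = 1 / n**2
        trials = 10000
        hits = 0
        for _ in range(trials):
            xs = sorted(random.random() for _ in range(n))
            if min(b - a for a, b in zip(xs, xs[1:])) > eps:
                hits += 1
        bound = math.exp(-(n - 1) * (n - 2) * eps / 2)
        print(n, "Pr(no close pair) ~", round(hits / trials, 3),
              " Lemma 1 bound:", round(bound, 3),
              " q(n) ~", round(1 - hits / trials, 3))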

If we apply Lemma 1 to {x_1, x_2, ..., x_n} with ε = 1/n², we find that

\[ \Pr\Big( \min_{i \neq j} |x_i - x_j| > \frac{1}{n^2} \Big) < e^{-(n-1)(n-2)/2n^2}. \]

The right side converges to $e^{-1/2}$ as n → ∞, and this proves that q(n) is bounded away from 0.

To prove that q(n) is bounded away from 1, let Y be the random variable which counts the number of unordered pairs {i, j} of distinct elements of [n] satisfying $|x_i - x_j| < 1/n^2$. For any particular pair i, j, we have

\[ \Pr\Big( |x_i - x_j| \le \frac{1}{n^2} \,\Big|\, x_i \Big) = \frac{1}{n^2} + \min\Big\{ x_i, \frac{1}{n^2}, 1 - x_i \Big\}. \]

Hence

\[ \Pr\Big( |x_i - x_j| \le \frac{1}{n^2} \Big) = \int_0^1 \Big[ \frac{1}{n^2} + \min\Big\{ x, \frac{1}{n^2}, 1 - x \Big\} \Big] \, dx = \frac{2}{n^2} - \frac{1}{n^4}, \]

and consequently $E(Y) = \binom{n}{2} \big( \frac{2}{n^2} - \frac{1}{n^4} \big)$, which tends to 1 as n → ∞. On the other hand, since Y is a non-negative integer-valued random variable we have

\[ E(Y) = \sum_{m=0}^{\infty} \Pr(Y > m) \ge \Pr(Y > 0) + \Pr(Y > 1) = q(n) + \Pr(Y > 1). \]

The left side tends to 1 as n → ∞. So if we can prove that Pr(Y > 1) is bounded away from zero, this implies that q(n) is bounded away from 1.

To prove Pr(Y > 1) is bounded away from zero, we apply Lemma 1 twice, with ε = 1/n², using the sets T = {x_1, x_2, ..., x_{⌊n/2⌋}} and U = {x_{⌊n/2⌋+1}, ..., x_n}. We find that lim sup_{n→∞} Pr(T is independent) and lim sup_{n→∞} Pr(U is independent) are both bounded above by $e^{-1/8}$. Also, the events "T is independent" and "U is independent" are independent, so lim inf_{n→∞} Pr(neither T nor U is independent) is greater than or equal to $(1 - e^{-1/8})^2$. The event "neither T nor U is independent" implies that Y ≥ 2, which completes the proof that Pr(Y > 1) is bounded away from zero.

(c) Applying Lemma 1 with ε = 1/(100n), we conclude that

\[ r(n) \le e^{-(n-1)(n-2)/200n}, \]

which tends to zero exponentially fast as n tends to infinity.

(d) Let T = {i : 0 ≤ x_i ≤ 1/2}. For any given set U ⊆ [n], we have Pr(T = U) = 2^{-n}, since the numbers x_1, ..., x_n are independent and each has probability 1/2 of belonging to the interval [0, 1/2]. Hence

\[ s(n) = \binom{n}{\lfloor n/2 \rfloor} 2^{-n}. \tag{1} \]

Using Stirling's approximation to the factorial function, we find that

\[ \binom{n}{\lfloor n/2 \rfloor} = \Theta\left( \frac{\sqrt{2\pi n}\,(n/e)^n}{\big( \sqrt{2\pi (n/2)}\,\big( \frac{n}{2e} \big)^{n/2} \big)^2} \right) = \Theta\big( n^{-1/2}\, 2^n \big). \tag{2} \]

Putting together (1) and (2), we find that s(n) = Θ(n^{-1/2}), which implies that s(n) tends to zero, but not exponentially fast.
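The n^{-1/2} rate is visible numerically; in fact the hidden constant is $\sqrt{2/\pi} \approx 0.798$, a standard consequence of Stirling's formula (this check is an editorial addition):

    import math

    # s(n) = C(n, floor(n/2)) / 2^n should decay like sqrt(2/pi) / sqrt(n).
    for n in (10, 100, 1000, 10000):
        s = math.comb(n, n // 2) / 2**n
        print(n, s, round(s * math.sqrt(n), 4))
    print("sqrt(2/pi) =", round(math.sqrt(2 / math.pi), 4))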

4. Let x be a real-valued random variable which is exponentially distributed with decay rate 4, i.e. $\Pr(x > r) = e^{-4r}$ for all r > 0.

(a) What is the probability density function of x?

Solution: The cumulative distribution function of x is

\[ F(r) = \begin{cases} 1 - e^{-4r} & \text{if } r \ge 0 \\ 0 & \text{if } r < 0, \end{cases} \]

so the probability density function is

\[ f(r) = F'(r) = \begin{cases} 4e^{-4r} & \text{if } r \ge 0 \\ 0 & \text{if } r < 0. \end{cases} \]

(b) What is the expected value of x?

Solution: For a non-negative random variable x, it holds that $E(x) = \int_0^\infty \Pr(x \ge r)\,dr$. For the given distribution of x, this implies

\[ E(x) = \int_0^\infty e^{-4r}\,dr = \Big[ -\tfrac{1}{4} e^{-4r} \Big]_0^\infty = \tfrac{1}{4}. \]

(c) Let $y = \sqrt{x}$. What is the probability density function of y?

Solution: The cumulative distribution function of y is

\[ G(s) = \Pr(y \le s) = \begin{cases} \Pr(x \le s^2) & \text{if } s \ge 0 \\ 0 & \text{otherwise} \end{cases} = \begin{cases} 1 - e^{-4s^2} & \text{if } s \ge 0 \\ 0 & \text{otherwise.} \end{cases} \]

Therefore the probability density function of y is

\[ g(s) = G'(s) = \begin{cases} 8s\, e^{-4s^2} & \text{if } s \ge 0 \\ 0 & \text{otherwise.} \end{cases} \]
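A quick simulation (editorial addition) confirms the mean and the derived distribution of y, sampling x by inverse-CDF: if U is uniform on (0, 1), then $-\ln(U)/4$ satisfies $\Pr(x > r) = e^{-4r}$.

    import math, random

    random.seed(0)
    N = 500000
    xs = [-math.log(random.random()) / 4 for _ in range(N)]  # Exp(rate 4)
    print("sample mean:", round(sum(xs) / N, 4), "(expected 0.25)")
    # Compare the empirical CDF of y = sqrt(x) with G(s) = 1 - exp(-4 s^2).
    for s in (0.2, 0.5, 1.0):
        empirical = sum(1 for x in xs if math.sqrt(x) <= s) / N
        print(s, round(empirical, 4), round(1 - math.exp(-4 * s * s), 4))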

5. (a) You are shopping for a house by observing a sequence of houses one by one, in random order. You have decided to stop and buy a house as soon as you see one which is nicer than the first house you observed. What is the expected number of houses you will have to look at, including the first one?

Mathematically, let's model this process as follows. Let z_1, z_2, ... denote an infinite sequence of independent uniformly-distributed random samples from the interval [0, 1]. (Interpretation: z_i is the quality of the i-th house you observed.) Let τ be the smallest i > 1 such that z_i > z_1. What is E(τ)?

Solution: For a non-negative integer-valued random variable X, it holds that $E(X) = \sum_{n=0}^{\infty} \Pr(X > n)$. The probability that τ > n is equal to 1 if n = 0, and otherwise it is the probability that z_1 = max{z_1, z_2, ..., z_n}, which is equal to 1/n since each of z_1, ..., z_n is equally likely to be the maximum. Hence

\[ E(\tau) = \sum_{n=0}^{\infty} \Pr(\tau > n) = 1 + \sum_{n=1}^{\infty} \frac{1}{n} = \infty. \]

(b) Now suppose that you modify your stopping rule. For some fixed predetermined number k, you observe the first k houses without purchasing. Let h be the second-best house observed among the first k. Your policy is to buy the next house (after the first k) which is nicer than h. In more precise terms, let z_1, z_2, ... be a sequence of independent random variables uniformly distributed in [0, 1] as before, let z_a > z_b be the two largest elements of the set {z_1, ..., z_k}, and let ρ be the smallest i > k such that z_i > z_b. What is E(ρ), as a function of k?

Solution: As above, we begin by computing Pr(ρ > n). This is equal to 1 when n ≤ k. Otherwise, ρ > n if and only if the two largest samples in {z_1, ..., z_n} belong to the subset {z_1, ..., z_k}. Each of the $\binom{n}{2}$ unordered pairs of samples is equally likely to be the two largest samples, so we find that

\[ \Pr(\rho > n) = \binom{k}{2} \Big/ \binom{n}{2} = \frac{k(k-1)}{n(n-1)}. \]

Hence

\[ E(\rho) = \sum_{n=0}^{\infty} \Pr(\rho > n) = k + \sum_{n=k}^{\infty} \frac{k(k-1)}{n(n-1)} = k + k(k-1) \sum_{n=k}^{\infty} \Big( \frac{1}{n-1} - \frac{1}{n} \Big) = k + \frac{k(k-1)}{k-1} = 2k. \]
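The answer E(ρ) = 2k is easy to check by simulation (an editorial addition; note that for part (a), by contrast, sample averages of τ would drift upward without converging, as expected for an infinite mean):

    import random

    random.seed(0)
    for k in (2, 5, 10):
        trials = 100000
        total = 0
        for _ in range(trials):
            zs = [random.random() for _ in range(k)]
            second_best = sorted(zs)[-2]
            i = k
            while True:  # terminates with probability 1
                i += 1
                if random.random() > second_best:
                    break
            total += i
        print(k, round(total / trials, 2), "(expected", 2 * k, ")")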

6. (a) Find a non-zero vector v = (x, y, z) such that v is a linear combination of (0, 1, 1) and (1, 0, 0), and v is also a linear combination of (-1, 2, 0) and (1, 1, 1).

Solution: A vector v = (v_x, v_y, v_z) is a linear combination of (0, 1, 1) and (1, 0, 0) if and only if

\[ \det \begin{pmatrix} v_x & v_y & v_z \\ 0 & 1 & 1 \\ 1 & 0 & 0 \end{pmatrix} = 0 \iff v_y - v_z = 0. \tag{3} \]

Similarly, v is a linear combination of (-1, 2, 0) and (1, 1, 1) if and only if

\[ \det \begin{pmatrix} v_x & v_y & v_z \\ -1 & 2 & 0 \\ 1 & 1 & 1 \end{pmatrix} = 0 \iff 2v_x + v_y - 3v_z = 0. \tag{4} \]

From (3) we get v_y = v_z. Plugging this into (4) we get 2v_x - 2v_z = 0, which implies v_x = v_z. Thus v = (1, 1, 1) is a valid solution.

(b) Let

\[ S = \{ (x, y, z) : x + y + z = 5,\ x \ge 0,\ y \ge 0,\ z \ge 0 \}. \]

Let v = (13, 16, 6). Find the point in S which is closest to v, i.e. the w ∈ S which minimizes $\|v - w\|^2$.

Solution: S is a triangular region in the plane {x + y + z = 5}. It is bounded by the lines {x + y = 5, z = 0}, {x + z = 5, y = 0}, {y + z = 5, x = 0}. Its corners are (5, 0, 0), (0, 5, 0), (0, 0, 5). Therefore w must be one of the following seven points.

i. The point (5, 0, 0).
ii. The point (0, 5, 0).
iii. The point (0, 0, 5).
iv. The point on the line {x + y = 5, z = 0} which is closest to v. This point u = (u_x, u_y, u_z) satisfies u_x + u_y = 5, u_z = 0, and (u - v) · (1, -1, 0) = 0, i.e. u_x - 13 = u_y - 16. Therefore the closest point on that line is (1, 4, 0).
v. The point on the line {x + z = 5, y = 0} which is closest to v. This point u = (u_x, u_y, u_z) satisfies u_x + u_z = 5, u_y = 0, and (u - v) · (1, 0, -1) = 0, i.e. u_x - 13 = u_z - 6. Therefore the closest point on that line is (6, 0, -1).
vi. The point on the line {y + z = 5, x = 0} which is closest to v. This point u = (u_x, u_y, u_z) satisfies u_y + u_z = 5, u_x = 0, and (u - v) · (0, 1, -1) = 0, i.e. u_y - 16 = u_z - 6. Therefore the closest point on that line is (0, 7.5, -2.5).
vii. The point on the plane {x + y + z = 5} which is closest to v. Denoting this point by u = (u_x, u_y, u_z), we know that the vector u - v must be parallel to the normal vector of this plane, which is (1, 1, 1). Hence u_x + u_y + u_z = 5 and u_x - 13 = u_y - 16 = u_z - 6. Solving, we obtain u = (3, 6, -4).

Only the first four of these seven points belong to S, and one can check that (1, 4, 0) is the closest to v.
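Both parts can be verified mechanically (an editorial addition; det3 is a hypothetical helper, written in plain Python rather than any library). A zero determinant certifies membership in each span, and squared distances identify the minimizer among the four candidates lying in S:

    def det3(r1, r2, r3):
        a, b, c = r1
        d, e, f = r2
        g, h, i = r3
        return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

    v = (1, 1, 1)
    print(det3(v, (0, 1, 1), (1, 0, 0)))   # 0 => v is in the first span
    print(det3(v, (-1, 2, 0), (1, 1, 1)))  # 0 => v is in the second span

    target = (13, 16, 6)
    for w in [(5, 0, 0), (0, 5, 0), (0, 0, 5), (1, 4, 0)]:
        d2 = sum((a - b) ** 2 for a, b in zip(target, w))
        print(w, d2)  # (1, 4, 0) attains the minimum, 324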

7. Recall the prediction problem we discussed on the first day of class: there are n experts predicting a binary sequence B_1, B_2, ..., and one of the experts never makes a mistake. (In other words, if b_{ij} denotes the prediction of expert i on the j-th trial, we are assuming that there exists some i (1 ≤ i ≤ n) such that b_{ij} = B_j for all j.) Assume that both the prediction matrix (b_{ij}) and the sequence (B_j) are specified by an oblivious adversary. We saw that there is a deterministic prediction algorithm which makes at most log_2(n) mistakes, and that this mistake bound is optimal.

(a) Show that there is a randomized prediction algorithm whose expected number of mistakes (against any oblivious adversary) is at most (1/2) log_2(n).

Solution: For a function f : [0, 1] → [0, 1], consider the following algorithm.

Alg(f):
    S = {1, 2, ..., n}              /* S is the set of experts who have not made a mistake yet */
    for j = 1, 2, ...
        Let a = Σ_{i ∈ S} b_{ij}   /* The number of experts in S predicting 1. */
        Let b = |S| - a             /* The number of experts in S predicting 0. */
        Output prediction A_j = 1 with probability f(a/(a+b)); otherwise output prediction A_j = 0.
        Observe B_j.
        S ← S \ {i : b_{ij} ≠ B_j}  /* Remove experts who made a mistake. */
    end

We will analyze the algorithm for a generic function f. From this analysis we will deduce the constraints which f must satisfy in order to ensure at most (1/2) log_2(n) mistakes in expectation.

Let W_t be the number of experts in S at the beginning of the t-th iteration of the main loop. Note that W_1 = n and that W_t ≥ 1 for all t, by assumption. Let Φ_t = log_2(W_t), and observe that Φ_t ≥ 0 for all t, hence

\[ \sum_{t=1}^{\infty} (\Phi_t - \Phi_{t+1}) \le \Phi_1 = \log_2(n). \]

Let X_t = |A_t - B_t|, and observe that $\sum_{t=1}^{\infty} X_t$ is the number of mistakes made by the algorithm. So if we can prove that

\[ E(X_t) \le \tfrac{1}{2} (\Phi_t - \Phi_{t+1}), \]

we will be done.

Let a and b be the number of experts in S predicting 1 and 0 (respectively) at time t, and let p = a/(a+b). If B_t = 0 we have

\[ E(X_t) = f(p), \qquad \Phi_t - \Phi_{t+1} = \log_2\Big( \frac{1}{1-p} \Big). \]

If B_t = 1 we have

\[ E(X_t) = 1 - f(p), \qquad \Phi_t - \Phi_{t+1} = \log_2\Big( \frac{1}{p} \Big). \]

Hence we are looking for a function f that satisfies the following for all p:

\[ f(p) \le \tfrac{1}{2} \log_2\Big( \frac{1}{1-p} \Big) \tag{5} \]

\[ 1 - f(p) \le \tfrac{1}{2} \log_2\Big( \frac{1}{p} \Big). \tag{6} \]

We may rewrite (6) as

\[ f(p) \ge 1 - \tfrac{1}{2} \log_2\Big( \frac{1}{p} \Big) = \tfrac{1}{2} \log_2(4p). \tag{7} \]

Combining all the constraints on f, we come up with:

\[ \max\Big\{ 0,\ \tfrac{1}{2} \log_2(4p) \Big\} \le f(p) \le \min\Big\{ 1,\ \tfrac{1}{2} \log_2\Big( \frac{1}{1-p} \Big) \Big\} \quad \text{for all } p \in [0, 1]. \tag{8} \]

For any function f satisfying (8), the algorithm Alg(f) will make at most (1/2) log_2(n) mistakes in expectation. To see that there is at least one function f satisfying (8), observe that $1 - 4p(1-p) = (1-2p)^2 \ge 0$, hence the inequality 4p(1-p) ≤ 1 is valid for all p ∈ [0, 1]. Taking the logarithm of both sides, we obtain

\[ \log_2(4p) + \log_2(1-p) \le 0 \iff \tfrac{1}{2} \log_2(4p) \le \tfrac{1}{2} \log_2\Big( \frac{1}{1-p} \Big). \tag{9} \]

For 0 ≤ p ≤ 1/4, the left side of (8) is 0 while the right side is non-negative. For 1/4 ≤ p ≤ 3/4, the fact that the left side of (8) is bounded above by the right side follows from (9). For 3/4 ≤ p ≤ 1, the left side is at most 1 while the right side is equal to 1. So, for example, the following choice of f suffices:

\[ f(p) = \begin{cases} 0 & \text{if } 0 \le p < 1/4 \\ \tfrac{1}{2} \log_2(4p) & \text{if } 1/4 \le p \le 3/4 \\ 1 & \text{if } 3/4 < p \le 1. \end{cases} \]
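For concreteness, here is a runnable Python rendering of Alg(f) with this choice of f (an editorial sketch; the names alg and experts are ours, not from the original). It is exercised on the hard instance from part (b) below, where the surviving expert set splits evenly at every step, so the expected number of mistakes comes out to exactly (1/2) log_2(n):

    import math, random

    def f(p):
        if p < 0.25:
            return 0.0
        if p <= 0.75:
            return 0.5 * math.log2(4 * p)
        return 1.0

    def alg(predictions, truth):
        """predictions[i][j] = expert i's bit at time j; returns mistake count."""
        S = set(range(len(predictions)))
        mistakes = 0
        for j, B in enumerate(truth):
            a = sum(predictions[i][j] for i in S)  # experts in S predicting 1
            guess = 1 if random.random() < f(a / len(S)) else 0
            mistakes += (guess != B)
            S = {i for i in S if predictions[i][j] == B}  # drop wrong experts
        return mistakes

    # Experts enumerate all k-bit strings (the construction from part (b)),
    # so one expert always matches the random truth sequence.
    random.seed(0)
    k = 8
    experts = [[(i >> (k - 1 - j)) & 1 for j in range(k)] for i in range(2**k)]
    trials = 5000
    avg = sum(alg(experts, [random.randint(0, 1) for _ in range(k)])
              for _ in range(trials)) / trials
    print("average mistakes:", avg, " bound (1/2)log2(n) =", 0.5 * k)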

(b) Show that no randomized prediction algorithm can make fewer than (1/2) log_2(n) mistakes, in expectation. In other words, prove that for every randomized prediction algorithm there exists an oblivious adversary such that the expected number of mistakes made by the algorithm against this adversary is at least (1/2) log_2(n).

Solution: Let k = ⌊log_2(n)⌋. Consider the following random input instance. For j = 1, 2, ..., k and 1 ≤ i ≤ 2^k, expert i predicts the j-th binary digit of the integer i - 1 (padded with initial 0's if necessary, so that it is a string of k binary digits). For i > 2^k, expert i always predicts 0. Finally, B_1, ..., B_k is a string of independent, uniformly-distributed binary digits. Note that the construction guarantees the existence of an expert who makes no mistakes.

For any randomized prediction algorithm, the probability of a mistake at time j (1 ≤ j ≤ k) is 1/2, since the algorithm's prediction depends only on the experts' predictions and the algorithm's own random bits, and the random variable B_j is independent of this data. Hence the expected number of mistakes made by the algorithm, averaged over the random choice of B_1, ..., B_k, is k/2. It follows that there is at least one sequence B_1, B_2, ..., B_k which causes the algorithm to make at least k/2 mistakes in expectation.