The Near-miss Birthday Problem. Gregory Quenell Plattsburgh State


The Classic Birthday Problem

Assuming birthdays are uniformly distributed over 365 days, find $P(\text{at least one shared birthday})$ in a random sample of $n$ people.

Solution: Use complementation.

$$P(\text{at least one shared birthday}) = 1 - P(\text{no shared birthday}) = 1 - P(n \text{ different birthdays})$$

Finding $P(n \text{ different birthdays})$

[Diagram: the birthday of person 1 is marked on the calendar, with a brace over the remaining 364 days.]

$$P(n \text{ different birthdays}) = \frac{\text{number of ways to place } n-1 \text{ birthdays in 364 days with no collision}}{\text{total number of ways to place } n-1 \text{ birthdays in 365 days}}$$

$$= \frac{\binom{364}{n-1}\,(n-1)!}{365^{n-1}} = \frac{364 \cdot 363 \cdots (364-(n-2))}{365 \cdot 365 \cdots 365} = \frac{(364)_{n-1}}{365^{n-1}}$$

Some numbers

[Plot: $P(\text{at least one shared birthday})$ against $n$, for $n$ from 0 to 50.]

$$P(\text{at least one shared birthday}) = 1 - \frac{(364)_{n-1}}{365^{n-1}}$$

 n    P(shared birthday)
10    0.117
15    0.253
20    0.411
23    0.507
25    0.569
30    0.706
35    0.814
40    0.891
45    0.941
50    0.970
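The table can be recomputed directly from the formula. The following is a minimal sketch, not part of the original slides; the function name `p_shared` and the `days` parameter are illustrative choices, and `math.perm` supplies the falling factorial $(364)_{n-1}$.

```python
from math import perm

def p_shared(n: int, days: int = 365) -> float:
    """P(at least one shared birthday) among n people, uniform birthdays."""
    if n < 2:
        return 0.0          # fewer than two people cannot share a birthday
    if n > days:
        return 1.0          # pigeonhole: a collision is certain
    # perm(days - 1, n - 1) is the falling factorial (days - 1)_{n-1}
    return 1 - perm(days - 1, n - 1) / days ** (n - 1)

# Print the sample sizes used in the table, to three decimals.
for n in (10, 15, 20, 23, 25, 30, 35, 40, 45, 50):
    print(f"{n:3d}  {p_shared(n):.3f}")
```

Exact integer arithmetic is used for the numerator and denominator, so the only rounding happens in the final division.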

The Near-miss Birthday Problem

Assuming birthdays are uniformly distributed over 365 days, find $P(\text{at least one pair of birthdays that are either coincident or adjacent})$ in a random sample of $n$ people.

Solution: Use complementation.

$$P(\text{at least one near miss}) = 1 - P(\text{no near misses}) = 1 - P(n \text{ isolated birthdays})$$

Finding $P(n \text{ isolated birthdays})$

[Diagram: the birthday of person 1 is marked on the calendar, with a brace over the remaining 364 days.]

$$P(n \text{ isolated birthdays}) = \frac{\text{number of ways to place } n-1 \text{ birthdays in 364 days with no collision and no two birthdays adjacent}}{\text{total number of ways to place } n-1 \text{ birthdays in 365 days}}$$

The denominator is still $365^{n-1}$. How do we count the arrangements in the numerator?

Counting isolated birthdays

[Diagram: $n-1$ isolated birthdays in 364 days, separated by gaps $a_1, a_2, a_3, a_4, \ldots, a_{n-1}, a_n$.]

Every arrangement of $n-1$ isolated birthdays corresponds to a gap sequence $a_1, a_2, \ldots, a_n$ in which

$$a_1 + a_2 + \cdots + a_n = 364 - (n-1), \qquad a_i \ge 1 \text{ for all } i.$$

Aside on counting

A sequence $a_1, a_2, \ldots, a_n$ of positive integers such that $a_1 + a_2 + \cdots + a_n = S$ is called an n-part composition of $S$.

Theorem: The number of n-part compositions of $S$ is $\binom{S-1}{n-1}$.

Proof: Write down a string of $S$ dots. Then there are $S-1$ inter-dot spaces. Placing bars in $n-1$ of these $S-1$ spaces determines an n-part composition of $S$, and conversely.

[Diagram: a row of dots split by bars into groups of sizes 3, 2, 4, and 3, illustrating the composition 3 + 2 + 4 + 3.]
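The dots-and-bars bijection in the proof can be checked by enumeration. The sketch below is an illustration added here, not taken from the slides; the helper name `compositions` and the values $S = 12$, $n = 4$ are arbitrary choices.

```python
from itertools import combinations
from math import comb

def compositions(S: int, n: int):
    """Yield every n-part composition of S via the dots-and-bars bijection."""
    # Number the S - 1 inter-dot spaces 1..S-1 and choose n - 1 of them for bars.
    for bars in combinations(range(1, S), n - 1):
        cuts = (0, *bars, S)
        # Each part is the number of dots between two consecutive cuts.
        yield tuple(b - a for a, b in zip(cuts, cuts[1:]))

parts = list(compositions(12, 4))
print(len(parts), comb(12 - 1, 4 - 1))   # enumeration count vs. the theorem
print((3, 2, 4, 3) in parts)             # bars after dots 3, 5, and 9
```

Each choice of bar positions gives a distinct tuple of positive parts, so the length of the list is the binomial coefficient in the theorem.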

Application to birthdays

[Diagram: $n-1$ isolated birthdays in 364 days, separated by gaps $a_1, a_2, a_3, a_4, \ldots, a_{n-1}, a_n$.]

There are

$$\binom{[364-(n-1)]-1}{n-1} = \binom{364-n}{n-1}$$

possible gap sequences.

Result:

$$\left(\text{number of ways to place } n-1 \text{ isolated birthdays in 364 days}\right) = \binom{364-n}{n-1}\,(n-1)! = (364-n)_{n-1}$$
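The count can be sanity-checked by brute force on a small calendar. This sketch is an addition, not the author's code. It assumes the calendar is circular (the last day is adjacent to the first), which is what the gap-sequence picture appears to imply, and it generalizes 365 to a hypothetical `days` parameter so that exhaustive enumeration stays feasible.

```python
from itertools import product
from math import perm

def count_isolated_bruteforce(n: int, days: int) -> int:
    """Count ordered assignments of n birthdays on a circular calendar
    in which no two birthdays are coincident or adjacent."""
    count = 0
    for bdays in product(range(days), repeat=n):
        ok = all(
            (bdays[i] - bdays[j]) % days not in (0, 1, days - 1)
            for i in range(n) for j in range(i)
        )
        count += ok
    return count

def count_isolated_formula(n: int, days: int) -> int:
    """days choices for person 1, then (days - 1 - n)_{n-1} for the others."""
    if days - 1 - n < 0:
        return 0            # not enough days for any isolated arrangement
    return days * perm(days - 1 - n, n - 1)

print(count_isolated_bruteforce(3, 10), count_isolated_formula(3, 10))
print(count_isolated_bruteforce(4, 12), count_isolated_formula(4, 12))
```

With `days = 365` the second factor is $(364-n)_{n-1}$, the result on the slide; the extra factor of `days` accounts for the free choice of person 1's birthday.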

The near-miss birthday formula

$$P(\text{no near miss}) = \frac{(364-n)_{n-1}}{365^{n-1}}$$

The probability of at least one near miss in a random sample of $n$ people is

$$1 - \frac{(364-n)_{n-1}}{365^{n-1}}.$$

The least $n$ for which this probability exceeds 0.5 is $n = 14$:

$$P(\text{at least one pair of coincident or adjacent birthdays in a random sample of 14 people}) \approx 0.537.$$
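The threshold can be located numerically. The sketch below is illustrative only; the function name `p_near_miss` and the `days` parameter are assumptions introduced here, and the formula is the one stated on the slide.

```python
from math import perm

def p_near_miss(n: int, days: int = 365) -> float:
    """P(at least one pair of coincident or adjacent birthdays) among n people."""
    if n < 2:
        return 0.0
    if days - 1 - n < n - 1:
        return 1.0          # too few days left: no isolated arrangement exists
    # perm(days - 1 - n, n - 1) is the falling factorial (days - 1 - n)_{n-1}
    return 1 - perm(days - 1 - n, n - 1) / days ** (n - 1)

# Smallest sample size whose near-miss probability exceeds one half.
least = next(n for n in range(2, 366) if p_near_miss(n) > 0.5)
print(least, round(p_near_miss(least), 3))
```

The search should stop at the same $n = 14$ reported on the slide, since the probability is still below 0.5 at $n = 13$.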

More numbers

[Plot: $P(\text{shared birthday})$ and $P(\text{near miss})$ against $n$, for $n$ from 10 to 50.]

$$P(\text{shared birthday}) = 1 - \frac{(364)_{n-1}}{365^{n-1}} \qquad P(\text{near miss}) = 1 - \frac{(364-n)_{n-1}}{365^{n-1}}$$

 n    P(shared)    P(near miss)
10    0.117        0.314
14    0.223        0.537
15    0.253        0.590
20    0.411        0.804
23    0.507        0.888
25    0.569        0.926
30    0.706        0.978
35    0.814        0.995
40    0.891        0.999
45    0.941        1.000
50    0.970        1.000

Birthdays shared by k or more people

[Figure: a sample of per-day birthday counts, e.g. 0 0 1 0 2 0 1 3 0 1 0 0 0 1 1 0 2 0 0 0 0 1 0 4 0 0]

Let $(X_1, X_2, \ldots, X_{365})$ be a random vector in which $X_i$ is the number of people in a random sample of size $n$ who were born on day $i$. Then $(X_1, X_2, \ldots, X_{365})$ follows a multinomial distribution with $n$ things, 365 bins, and constant probability 1/365. Thus

$$P\big((X_1, X_2, \ldots, X_{365}) = (x_1, x_2, \ldots, x_{365})\big) = \binom{n}{x_1\; x_2\; \cdots\; x_{365}} \left(\frac{1}{365}\right)^{n} = \frac{1}{365^{n}} \cdot \frac{n!}{x_1!\, x_2! \cdots x_{365}!}$$

We want $P(\max(X_1, X_2, \ldots, X_{365}) \ge k)$, the probability that some date is the birthday of $k$ or more people in the sample. Again, we use complementation:

$$P(\max(X_1, X_2, \ldots, X_{365}) \ge k) = 1 - P(\max(X_1, X_2, \ldots, X_{365}) \le k-1) = 1 - P(X_i \le k-1 \text{ for all } i).$$

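Finding $P(X_i \le k-1)$ for $i = 1, 2, \ldots, 365$

We need

$$\frac{n!}{365^{n}} \underbrace{\sum_{x_1=0}^{k-1} \sum_{x_2=0}^{k-1} \cdots \sum_{x_{365}=0}^{k-1}}_{x_1 + x_2 + \cdots + x_{365} = n} \left( \frac{1}{x_1!} \cdot \frac{1}{x_2!} \cdots \frac{1}{x_{365}!} \right)$$

Consider the product

$$\underbrace{\left( \frac{1}{0!} + \frac{1}{1!} + \cdots + \frac{1}{(k-1)!} \right)}_{\text{factor for } x_1} \underbrace{\left( \frac{1}{0!} + \frac{1}{1!} + \cdots + \frac{1}{(k-1)!} \right)}_{\text{factor for } x_2} \cdots \underbrace{\left( \frac{1}{0!} + \frac{1}{1!} + \cdots + \frac{1}{(k-1)!} \right)}_{\text{factor for } x_{365}}$$

To pick out the terms with $x_1 + x_2 + \cdots + x_{365} = n$, introduce a tracer variable:

$$\left( \frac{\tau^0}{0!} + \frac{\tau^1}{1!} + \cdots + \frac{\tau^{k-1}}{(k-1)!} \right) \left( \frac{\tau^0}{0!} + \frac{\tau^1}{1!} + \cdots + \frac{\tau^{k-1}}{(k-1)!} \right) \cdots \left( \frac{\tau^0}{0!} + \frac{\tau^1}{1!} + \cdots + \frac{\tau^{k-1}}{(k-1)!} \right)$$

The coefficient of $\tau^n$ in this product is exactly the sum that we want.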
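The tracer-variable claim can be verified on a small case by comparing the constrained sum with the extracted coefficient. This is a sketch added for illustration; the function names and the small parameters (4 bins instead of 365) are assumptions made so that the direct sum can be enumerated.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def constrained_sum(n: int, bins: int, k: int) -> Fraction:
    """Sum of 1/(x_1! ... x_bins!) over 0 <= x_i <= k-1 with x_1+...+x_bins = n."""
    total = Fraction(0)
    for xs in product(range(k), repeat=bins):
        if sum(xs) == n:
            term = Fraction(1)
            for x in xs:
                term /= factorial(x)
            total += term
    return total

def coefficient(n: int, bins: int, k: int) -> Fraction:
    """Coefficient of tau^n in (sum_{j<k} tau^j / j!)^bins."""
    factor = [Fraction(1, factorial(j)) for j in range(k)]
    poly = [Fraction(1)]                       # the constant polynomial 1
    for _ in range(bins):                      # multiply in one factor per bin
        new = [Fraction(0)] * (len(poly) + k - 1)
        for i, a in enumerate(poly):
            for j, b in enumerate(factor):
                new[i + j] += a * b
        poly = new
    return poly[n] if n < len(poly) else Fraction(0)

print(constrained_sum(5, 4, 3) == coefficient(5, 4, 3))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparison is an equality test rather than a floating-point approximation.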

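The multiple-birthday formula

We have

$$P(X_i \le k-1 \ \forall i) = \frac{n!}{365^{n}} \cdot \left[ \text{coefficient of } \tau^n \text{ in } \left( \frac{\tau^0}{0!} + \frac{\tau^1}{1!} + \cdots + \frac{\tau^{k-1}}{(k-1)!} \right)^{365} \right]$$

And so

$$P(\max(X_i) \ge k) = 1 - \frac{n!}{365^{n}}\, [\tau^n] \left( \frac{\tau^0}{0!} + \frac{\tau^1}{1!} + \cdots + \frac{\tau^{k-1}}{(k-1)!} \right)^{365}$$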

An example

In a random sample of 100 people, what's the probability that there are six (or more) who share a birthday?

It's

$$1 - \frac{100!}{365^{100}}\, [\tau^{100}] \left( \frac{\tau^0}{0!} + \frac{\tau^1}{1!} + \frac{\tau^2}{2!} + \frac{\tau^3}{3!} + \frac{\tau^4}{4!} + \frac{\tau^5}{5!} \right)^{365}$$

Mathematica says the answer is

150 259 132 496 532 424 666 066 218 503 258 905 869 817 110 259 709 611 434 657 314 752 093 849 611 634 141 921 578 743 930 338 023 787 684 475 659 966 265 486 068 519 258 590 304 556 011 085 660 183 037 201 911 170 138 053 982 437 173 570 483 591 520 443 688 927 322 874 822 292 982 338 348 606 684 459 507 966 128 295 833 309
/
1 018 100 624 231 385 241 853 189 999 481 940 942 382 873 878 399 046 008 966 742 039 665 259 133 127 558 338 726 075 853 312 698 838 815 389 196 105 495 212 915 667 272 376 736 512 436 519 973 194 623 721 779 480 597 820 765 897 548 554 160 854 805 712 082 157 001 360 774 761 962 446 621 765 820 964 355 953 037 738 800 048 828 125

(This is about 0.0001476.)
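The same quantity can be evaluated without a computer algebra system by multiplying out the truncated polynomial with exact rationals. The sketch below is an alternative to the Mathematica computation mentioned on the slide, not a reproduction of it; the function name `p_k_share` is an illustrative choice.

```python
from fractions import Fraction
from math import factorial

def p_k_share(n: int, k: int, days: int = 365) -> Fraction:
    """P(some date is the birthday of k or more of n people), as an exact fraction."""
    factor = [Fraction(1, factorial(j)) for j in range(k)]
    poly = [Fraction(1)]                  # coefficients of tau^0, tau^1, ...
    for _ in range(days):                 # one factor per calendar day
        new = [Fraction(0)] * min(n + 1, len(poly) + k - 1)
        for i, a in enumerate(poly):
            for j, b in enumerate(factor):
                if i + j <= n:            # powers above tau^n never affect [tau^n]
                    new[i + j] += a * b
        poly = new
    coeff = poly[n] if n < len(poly) else Fraction(0)
    return 1 - Fraction(factorial(n), days ** n) * coeff

p = p_k_share(100, 6)
print(float(p))                 # decimal value of the example on the slide
print(float(p_k_share(23, 2)))  # k = 2 reduces to the classic birthday problem
```

Truncating every intermediate product at degree $n$ keeps the work to 365 small polynomial multiplications; with $k = 2$ the result reduces to the classic shared-birthday probability, which gives a quick consistency check.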

Multiple birthday probabilities

[Plot: probabilities of at least one date in the calendar being the shared birthday of $k$ people, for $k = 3$, 4, and 5; horizontal axis ticks from 100 to 500, vertical axis from 0 to 1.0.]