Introduction to Probability
Ariel Yadin

Lecture 1

*** Nov. 1 ***

1. Example: Bertrand's Paradox

We begin with an example [this is known as Bertrand's paradox].

Question 1.1. Consider a circle of radius 1 and an equilateral triangle inscribed in the circle, say ABC. (The side length of such a triangle is √3.) Let M be a randomly chosen chord in the circle. What is the probability that the length of M is greater than the length of a side of the triangle (i.e. greater than √3)?

Solution 1. How do we choose a random chord M? One way: choose a random angle, let r be the radius at that angle, and let M be the unique chord perpendicular to r. Let x be the intersection point of M with r. (See Figure 1, left.) Because of symmetry, we can rotate the triangle so that the side AB is perpendicular to r. The sides of the triangle intersect the perpendicular radius at distance 1/2 from the center 0, so M is longer than AB if and only if x is at distance at most 1/2 from 0. Since r has length 1, the probability that x is at distance at most 1/2 is 1/2.

Solution 2. Another way: choose two random points x, y on the circle, and let M be the chord connecting them. (See Figure 1, right.) Because of symmetry, we can rotate the triangle so that x coincides with the vertex A of the triangle. Then y falls in the arc BC if and only if M is longer than a side of the triangle. The probability of this is the length of the arc BC over 2π; that is, 1/3 (the arc BC is one-third of the circle).

Solution 3. A different way to choose a random chord M: choose a random point x in the circle, let r be the radius through x, and choose M to be the chord through x perpendicular to r. (See Figure 1, left.)

[Figure 1. How to choose a random chord.]

Again we can rotate the triangle so that r is perpendicular to the side AB. Then M is longer than AB if and only if x lands inside the triangle; that is, if and only if the distance of x to 0 is at most 1/2. Since the area of a disc of radius 1/2 is 1/4 of the area of the disc of radius 1, this happens with probability 1/4.

How did we reach this paradox? The original question was ill posed: we did not define precisely what a random chord is. The different solutions come from different ways of choosing a random chord, and these are not the same. We now turn to precisely defining what we mean by random, probability, etc.
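A quick simulation makes the paradox tangible. The following is a minimal Monte Carlo sketch in Python (my own illustration; the notes contain no code, and all function names are hypothetical) of the three chord-sampling schemes. It reproduces the three different answers 1/2, 1/3 and 1/4.

```python
# Monte Carlo sketch of the three ways to "choose a random chord" above.
import math
import random

TARGET = math.sqrt(3)   # side length of the inscribed equilateral triangle
N = 200_000             # chords sampled per scheme

def chord_by_radius_point():
    """Solution 1: x uniform on a radius; M is the chord through x perpendicular to it."""
    x = random.uniform(0, 1)
    return 2 * math.sqrt(1 - x ** 2)

def chord_by_endpoints():
    """Solution 2: two uniform points on the circle, connected by a chord."""
    a = random.uniform(0, 2 * math.pi)
    b = random.uniform(0, 2 * math.pi)
    return 2 * abs(math.sin((a - b) / 2))   # chord length on the unit circle

def chord_by_disc_point():
    """Solution 3: x uniform in the disc; M is the chord through x perpendicular to the radius through x."""
    while True:   # rejection sampling: uniform point in the unit disc
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x * x + y * y <= 1:
            return 2 * math.sqrt(1 - (x * x + y * y))

for scheme, exact in [(chord_by_radius_point, 1 / 2),
                      (chord_by_endpoints, 1 / 3),
                      (chord_by_disc_point, 1 / 4)]:
    p = sum(scheme() > TARGET for _ in range(N)) / N
    print(f"{scheme.__name__}: {p:.3f}  (exact: {exact:.3f})")
```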

2. Sample spaces

When we do some experiment, we first want to collect all possible outcomes of the experiment. In mathematics, a collection of objects is just a set. The set of all possible outcomes of an experiment is called a sample space. Usually we denote the sample space by Ω and its elements by ω.

Example 1.2.
- A coin toss. Ω = {H, T}. (Strictly it is the set of the two faces of the coin, but H, T are easier to write.)
- Tossing a coin twice. Ω = {H, T}^2. What about tossing a coin k times?
- Throwing two dice. Ω is the set of pairs of die faces; it is probably easier to use {1, 2, 3, 4, 5, 6}^2.
- The lifetime of a person. Ω = R_+. What if we only count the years? What if we want years and months?
- Bows and arrows: shooting an arrow into a target of radius 1/2 meters. Ω = {(x, y) : x^2 + y^2 ≤ 1/4}. What about Ω = {(r, θ) : r ∈ [0, 1/2], θ ∈ [0, 2π)}? What if the arrow tip has a thickness of 1 cm, so we do not care about the point up to radius 1 cm? What about missing the target altogether?
- A random real-valued continuously differentiable function on [0, 1]. Ω = C^1([0, 1]).
- Noga tosses a coin. If it comes out heads she goes to the probability lecture, and either takes notes or sleeps throughout. If the coin comes out tails, she goes running, and she records the distance she ran in meters and the time it took. Ω = ({H} × {notes, sleep}) ∪ ({T} × N × R_+).

3. Events

Suppose we toss a die, so the sample space is, say, Ω = {1, 2, 3, 4, 5, 6}. We want to encode the outcome "the number on the die is even". This can be done by collecting together all the possible outcomes relevant to this event; that is, a sub-collection of outcomes, or a subset of Ω. The event "the number on the die is even" corresponds to the subset {2, 4, 6} ⊆ Ω.

What do we want to require from events? We want the following properties:
- Everything is an event; that is, Ω is an event.
- If we can ask whether an event occurred, then we can also ask whether it did not occur; that is, if A is an event, then so is A^c.
- If we have many events of which we can ask whether they have occurred, then we can also ask whether one of them has occurred; that is, if (A_n)_{n∈N} is a sequence of events, then so is ∪_n A_n.

A word on notation: A^c = Ω \ A. One needs to be careful in which space we are taking the complement. Some books use Ā.

Thus, if we want to say what events are, we have: if F is the collection of events on a sample space Ω, then F has the following properties:
- The elements of F are subsets of Ω (i.e. F ⊆ P(Ω)), and Ω ∈ F.
- If A ∈ F then A^c ∈ F.
- If (A_n)_{n∈N} is a sequence of elements of F, then ∪_n A_n ∈ F.
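To make this concrete, here is a small Python sketch (my own illustration, not from the notes) of events as subsets of a finite sample space, with complement and union exactly as in the requirements above.

```python
# Events on the die-toss sample space are just subsets of Ω.
omega = frozenset({1, 2, 3, 4, 5, 6})   # sample space of one die toss

even = frozenset({2, 4, 6})             # the event "the number on the die is even"
at_least_5 = frozenset({5, 6})          # another event, for illustration

print(sorted(omega - even))             # complement: A^c = Ω \ A = [1, 3, 5]
print(sorted(even | at_least_5))        # union: "even, or at least 5" = [2, 4, 5, 6]
print(sorted(even & at_least_5))        # intersection: [6]
```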

Definition 1.3. A collection F with the properties above is called a σ-algebra (or σ-field).

[Explain the name: algebra, σ-algebra.]

Example 1.4. Let Ω be any set (sample space). Then F = {∅, Ω} and G = 2^Ω = P(Ω) are both σ-algebras on Ω.

When Ω is a countable sample space, we will always take the full σ-algebra 2^Ω. (We will worry about the uncountable case in the future.) To sum up: for a countable sample space Ω, any subset A ⊆ Ω is an event.

Example 1.5.
- The experiment is tossing three coins; the event A is "the second toss was heads". Then Ω = {H, T}^3 and A = {(x, H, y) : x, y ∈ {H, T}}.
- The experiment is how many minutes passed until a goal is scored in the Manchester derby; the event A is "the first goal was scored after minute 23 and before minute 45". Then Ω = {0, 1, ..., 90} (maybe extra time?) and A = {23, 24, ..., 44}.
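For finite collections these axioms can be checked mechanically; for a finite F, closure under countable unions reduces to closure under pairwise unions. A sketch (my own; the helper names are hypothetical) verifying Example 1.4 on a three-point space:

```python
# Check the σ-algebra axioms for finite collections of subsets of Ω.
from itertools import chain, combinations

def powerset(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return {frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))}

def is_sigma_algebra(F, omega):
    """For finite F: contains Ω, closed under complement, closed under union."""
    omega = frozenset(omega)
    if omega not in F:
        return False
    if any(omega - A not in F for A in F):           # closure under complement
        return False
    return all(A | B in F for A in F for B in F)     # closure under (finite) union

omega = {1, 2, 3}
print(is_sigma_algebra({frozenset(), frozenset(omega)}, omega))   # F = {∅, Ω}: True
print(is_sigma_algebra(powerset(omega), omega))                   # G = 2^Ω: True
print(is_sigma_algebra({frozenset(), frozenset({1}), frozenset(omega)}, omega))  # False: missing {2, 3}
```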

*** Nov. 3 ***

4. Probability

Remark 1.6. Two sets A, B are called disjoint if A ∩ B = ∅. For a sequence (A_n)_{n∈N}, we say that the (A_n) are mutually disjoint if any two of them are disjoint; i.e., for any n ≠ m, A_n ∩ A_m = ∅.

What is probability? It is the assignment of a value between 0 and 1 to each event, saying how likely this event is. That is, if F is the collection of all events on some sample space, then probability is a function from the events to [0, 1]; i.e., P : F → [0, 1].

Of course, not every such assignment is good, and we would like some consistency. What would we like of P? For one thing, we would like the likelihood of everything to be 100%; that is, P(Ω) = 1. Also, we would like that if A, B ∈ F are two disjoint events, i.e. A ∩ B = ∅, then the likelihood that one of A, B occurs is the sum of their likelihoods; i.e., P(A ∪ B) = P(A) + P(B). This leads to the following definition:

Definition 1.7 (Probability measure). Let Ω be a sample space, and F a σ-algebra on Ω. (For simplicity one can take F = 2^Ω.) A probability measure on (Ω, F) is a function P : F → [0, 1] such that
- P(Ω) = 1;
- if (A_n) is a sequence of mutually disjoint events in F, then P(∪_n A_n) = Σ_n P(A_n).
The triple (Ω, F, P) is called a probability space.

Example 1.8.
- A fair coin toss: Ω = {H, T}, F = 2^Ω = {∅, {H}, {T}, Ω}, with P(∅) = 0, P({H}) = P({T}) = 1/2, P(Ω) = 1.
- An unfair coin toss: P(∅) = 0, P({H}) = 3/4, P({T}) = 1/4, P(Ω) = 1.
- A fair die: Ω = {1, 2, ..., 6}, F = 2^Ω. For A ⊆ Ω let P(A) = |A|/6.
- Let Ω be any countable set and let F = 2^Ω. Let p : Ω → [0, 1] be a function such that Σ_{ω∈Ω} p(ω) = 1. Then, for any A ⊆ Ω, define P(A) = Σ_{ω∈A} p(ω). P is a probability measure on (Ω, F). (See the sketch after this list.)
- Let γ = Σ_{n≥1} n^{-2}. For A ⊆ {1, 2, ...} define P(A) = γ^{-1} Σ_{a∈A} a^{-2}. Then P is a probability measure.
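The fourth item of Example 1.8 is really a recipe, so a short sketch may help (my own code; make_measure is a hypothetical name): any weight function p summing to 1 over a countable Ω induces a probability measure by P(A) = Σ_{ω∈A} p(ω).

```python
# Build a probability measure on a countable Ω from a weight function p.
import math

def make_measure(p):
    """Given weights p (a dict ω -> p(ω)) summing to 1, return P : event -> [0, 1]."""
    assert math.isclose(sum(p.values()), 1.0), "weights must sum to 1"
    return lambda A: sum(p[w] for w in A)

# Fair die: p(ω) = 1/6 for every face, so P(A) = |A|/6.
P = make_measure({w: 1 / 6 for w in range(1, 7)})
print(P({2, 4, 6}))          # ≈ 0.5, the probability of "even"
print(P(set(range(1, 7))))   # ≈ 1.0, P(Ω)

# Unfair coin: P({H}) = 3/4, P({T}) = 1/4.
Q = make_measure({"H": 3 / 4, "T": 1 / 4})
print(Q({"H"}), Q(set()))    # 0.75 and 0 (the empty sum gives P(∅) = 0)
```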

Some properties of probability measures:

Proposition 1.9. Let (Ω, F, P) be a probability space. Then:
(1) ∅ ∈ F and P(∅) = 0.
(2) If A_1, ..., A_n are a finite number of mutually disjoint events, then A := ∪_{j=1}^n A_j ∈ F and P(A) = Σ_{j=1}^n P(A_j).
(3) For any event A ∈ F, P(A^c) = 1 - P(A).
(4) If A ⊆ B are events, then P(B \ A) = P(B) - P(A).
(5) If A ⊆ B are events, then P(A) ≤ P(B).
(6) (Inclusion-Exclusion Principle.) For events A, B ∈ F, P(A ∪ B) = P(A) + P(B) - P(A ∩ B).

Proof. Let A, B, (A_n)_{n∈N} be events in F.

(1) ∅ = Ω^c ∈ F. Set A_j = ∅ for all j ∈ N. This is a sequence of mutually disjoint events. Since ∅ = ∪_j A_j, we have that P(∅) = Σ_j P(∅), implying that P(∅) = 0.

(2) For j > n let A_j = ∅. Then (A_j)_{j∈N} is a sequence of mutually disjoint events, so P(∪_{j=1}^n A_j) = P(∪_j A_j) = Σ_j P(A_j) = Σ_{j=1}^n P(A_j).

(3) A ∪ A^c = Ω and A ∩ A^c = ∅. So the additivity of P on disjoint sets gives that P(A) + P(A^c) = P(Ω) = 1.

(4) If A ⊆ B, then B = (B \ A) ∪ A, and (B \ A) ∩ A = ∅. Thus, P(B) = P(B \ A) + P(A).

(5) Since A ⊆ B, we have that P(B) - P(A) = P(B \ A) ≥ 0.

(6) Let C = A ∩ B. Note that A ∪ B = (A \ C) ∪ (B \ C) ∪ C, where all sets in the union are mutually disjoint. Since C ⊆ A and C ⊆ B, we get
P(A ∪ B) = P(A \ C) + P(B \ C) + P(C) = P(A) - P(C) + P(B) - P(C) + P(C) = P(A) + P(B) - P(C).
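These identities are easy to sanity-check numerically. A small sketch (my own, using exact fractions to avoid floating-point noise) on the fair-die space:

```python
# Numeric check of Proposition 1.9 (3)-(6) for the uniform measure on a die.
from fractions import Fraction

omega = frozenset(range(1, 7))
P = lambda A: Fraction(len(A), 6)   # the fair-die measure P(A) = |A|/6

A = frozenset({2, 4, 6})            # "even"
B = frozenset({4, 5, 6})            # "at least 4"

assert P(omega - A) == 1 - P(A)                     # (3) complement rule
assert P(B - (A & B)) == P(B) - P(A & B)            # (4) with A ∩ B ⊆ B
assert P(A & B) <= P(B)                             # (5) monotonicity
assert P(A | B) == P(A) + P(B) - P(A & B)           # (6) inclusion-exclusion
print("Proposition 1.9 checks pass")
```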

Proposition 1.10 (Boole's inequality). Let (Ω, F, P) be a probability space. If (A_n)_{n∈N} are events, then
P(∪_{n∈N} A_n) ≤ Σ_{n∈N} P(A_n).

Proof. Let B_0 = ∅, and for every n ≥ 1 let
B_n = ∪_{j=1}^n A_j and C_n = A_n \ B_{n-1}.

Claim 1. For any n > m ≥ 1, we have that C_n ∩ C_m = ∅.

Proof. Since m ≤ n - 1, we have that A_m ⊆ B_{n-1}. Since C_m ⊆ A_m,
C_n ∩ C_m ⊆ (A_n \ B_{n-1}) ∩ B_{n-1} = ∅.

Claim 2. ∪_n A_n ⊆ ∪_n C_n.

Proof. Let ω ∈ ∪_n A_n, so there exists n such that ω ∈ A_n. Let m be the smallest integer such that ω ∈ A_m. Then ω ∈ A_m and ω ∉ A_k for all 1 ≤ k < m; in other words, ω ∈ A_m and ω ∉ ∪_{k=1}^{m-1} A_k = B_{m-1}. That is, ω ∈ A_m \ B_{m-1} = C_m. Thus, there exists some m ≥ 1 such that ω ∈ C_m; i.e., ω ∈ ∪_n C_n.

We conclude that (C_n)_{n≥1} is a collection of pairwise disjoint events with ∪_n A_n ⊆ ∪_n C_n. Using property (5) of Proposition 1.9, then countable additivity, and then (5) again with C_n ⊆ A_n for every n,
P(∪_n A_n) ≤ P(∪_n C_n) = Σ_n P(C_n) ≤ Σ_n P(A_n).
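The disjointification C_n = A_n \ B_{n-1} in this proof is constructive, so it can be checked directly. A sketch (my own) on the fair-die space:

```python
# Disjointify events as in the proof of Boole's inequality, then verify the bound.
from fractions import Fraction

P = lambda E: Fraction(len(E), 6)   # uniform measure on a fair die

A = [frozenset({1, 2, 3}), frozenset({2, 3, 4}), frozenset({5})]

C, B = [], frozenset()              # B plays the role of B_{n-1}
for An in A:
    C.append(An - B)                # C_n = A_n \ B_{n-1}
    B = B | An                      # B_n = B_{n-1} ∪ A_n

# The C_n are pairwise disjoint and have the same union as the A_n.
assert all(not (Ci & Cj) for i, Ci in enumerate(C) for Cj in C[i + 1:])
assert frozenset().union(*C) == frozenset().union(*A)

# Boole: P(∪ A_n) = Σ P(C_n) ≤ Σ P(A_n).
assert P(frozenset().union(*A)) == sum(map(P, C)) <= sum(map(P, A))
print(P(frozenset().union(*A)), "<=", sum(map(P, A)))
```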