Introduction to probability theory


Fátima Sánchez Cabo
Institute for Genomics and Bioinformatics, TUGraz
f.sanchezcabo@tugraz.at
07/03/2007

Outline
- Random and conditional probability (7 March)
- From probability theory to statistics (14 March)
- Hypothesis testing (14 March)
- Stochastic processes and Markov chains (21 March)

Introduction: sample space and events; probability space; algebra of sets; probability measure

Introduction

Life is full of unpredictable events. Probabilistic models are used to make inference about such events: their outcome cannot be completely determined, but we can get some hints about the most likely situation that will occur. Randomness does not mean chaos: for example, for a normally distributed random variable the values near the mode are the most likely, while very small or very large values are very unlikely to occur.

[Figure: density of the Normal(0,1) distribution]

Random experiment: definition
- All outcomes of the experiment are known in advance.
- It is a priori unknown which outcome each performance of the experiment will produce (systematic and random errors; complex processes resulting from many combined processes).
- The experiment can be repeated under identical conditions.

Examples
- Throwing a die: possible outcomes {1, 2, 3, 4, 5, 6}. It is unknown what we will get on the next throw, and we can throw the die n times in independent trials.
- Tossing a coin: possible outcomes {head, tail}. It is unknown what we will get on the next toss, and we can toss the coin n independent times.
- The lifetime of a light bulb produced by a manufacturer: possible outcomes are any number in [0, ∞). The lifetime of a bulb is not known beforehand, and we assume the lifetimes of n bulbs can be measured under the same conditions.

Sample space, events, σ-algebra

Sample space: the collection of possible elementary outcomes of a random experiment.
1. Throwing a die: Ω = {1, 2, 3, 4, 5, 6}
2. Tossing a coin: Ω = {head, tail}
3. Lifetime of a bulb: Ω = [0, ∞)

Event: a set of outcomes of the experiment.
1. Throwing a die: "to obtain a 6" ({6}) or "not to obtain a 6" ({1, 2, 3, 4, 5})
2. Tossing a coin: "to obtain a head"
3. Lifetime of a bulb: "the lifetime of the bulb is greater than 3 years"

A σ-field (or σ-algebra) F is a non-empty collection of subsets of Ω satisfying:
1. Ω ∈ F;
2. if A ∈ F then A^c ∈ F; and
3. if A_1, A_2, ... ∈ F is a countable sequence of sets, then ∪_i A_i ∈ F.
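For a finite sample space the σ-field axioms can be checked mechanically. A minimal Python sketch (not from the slides; the helper power_set and the choice of the full power set as F are illustrative):

```python
from itertools import chain, combinations

# Sample space for one throw of a die.
omega = frozenset({1, 2, 3, 4, 5, 6})

def power_set(s):
    """All subsets of s, as frozensets (the largest possible sigma-field)."""
    return {frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))}

F = power_set(omega)

# Check the three sigma-field axioms; on a finite sample space,
# pairwise unions suffice for closure under countable union.
assert omega in F                              # Omega belongs to F
assert all(omega - A in F for A in F)          # closed under complement
assert all(A | B in F for A in F for B in F)   # closed under union
print(f"{len(F)} events form a sigma-field over {set(omega)}")
```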

Probability space

Under the previous conditions the pair (Ω, F) is called a measurable space. A measure is a nonnegative, countably additive set function, i.e. a function µ : F → R such that:
1. µ(A) ≥ µ(∅) = 0 for all A ∈ F, and
2. if A_1, A_2, ... ∈ F are pairwise disjoint, then µ(∪_i A_i) = Σ_i µ(A_i).

If µ(Ω) = 1, the measure is called a probability measure, and (Ω, F, µ) is a probability space.

(Countable: for example, the set of natural numbers; uncountable: for example, the set of real numbers.)

A little on the algebra of sets

Set algebra is analogous to elementary algebra (arithmetic): intersection corresponds to multiplication, union to addition. Unlike the arithmetic operations (where only the product is distributive), union and intersection are both distributive over each other.

Uniqueness of complements: given A, B such that A ∪ B = Ω and A ∩ B = ∅, then B = A^c. Also B \ A = B ∩ A^c.

De Morgan's laws: (A ∪ B)^c = A^c ∩ B^c and (A ∩ B)^c = A^c ∪ B^c.

Moreover, (A^c)^c = A, ∅^c = Ω, and Ω^c = ∅.
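These identities are easy to verify on concrete events; a small Python check (the events A and B are arbitrary illustrative choices):

```python
# Two illustrative events inside the die sample space.
omega = {1, 2, 3, 4, 5, 6}
A = {1, 2, 3}   # "at most 3"
B = {2, 4, 6}   # "even"

complement = lambda S: omega - S

# De Morgan's laws.
assert complement(A | B) == complement(A) & complement(B)
assert complement(A & B) == complement(A) | complement(B)

# A and its complement partition omega.
assert (A | complement(A) == omega) and (A & complement(A) == set())
print("De Morgan's laws verified for A =", A, "and B =", B)
```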

Probability measure

Some properties of a probability measure:
1. P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
2. P(A^c) = 1 − P(A)
3. If A ⊆ B, then P(B) = P(A) + P(B \ A) ≥ P(A)
4. In general, if A_1, ..., A_n ∈ F, then
   P(∪_i A_i) = Σ_i P(A_i) − Σ_{i<j} P(A_i ∩ A_j) + ... + (−1)^{n+1} P(A_1 ∩ ... ∩ A_n)

Exercise: prove properties 1-3.
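Property 4 (inclusion-exclusion) can be checked exactly on a fair die with rational arithmetic; a sketch with three arbitrary illustrative events:

```python
from fractions import Fraction
from itertools import combinations

# Uniform probability on a fair die.
omega = {1, 2, 3, 4, 5, 6}
P = lambda E: Fraction(len(E), len(omega))

events = [{1, 2, 3}, {2, 4, 6}, {3, 4, 5, 6}]   # A1, A2, A3

# Left-hand side: probability of the union.
lhs = P(set().union(*events))

# Right-hand side: the inclusion-exclusion sum.
n = len(events)
rhs = sum(Fraction(-1) ** (k + 1) *
          sum(P(set.intersection(*c)) for c in combinations(events, k))
          for k in range(1, n + 1))

print(lhs, rhs)   # identical fractions
assert lhs == rhs
```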

Example

The probability that a chicken from Steiermark is infected with bird flu is 0.2 (a fictitious number). At the same time, a chicken may have another lethal disease, called Y, which occurs with probability 0.5. Calculate the probability of a chicken dying of some disease:
1. if it is not possible for both diseases to occur simultaneously;
2. if the probability of having both diseases is 0.6.

Dependent events and Bayes' theorem

Dependent events

Two urns: in the first, P(black) = 2/3 and P(white) = 1/3; in the second, P(black) = 1/3 and P(white) = 2/3. What is the probability of drawing a white ball? Is it the same for both urns?

Dependent events

Definition: two events A and B are called independent if P(A ∩ B) = P(A) P(B). Otherwise, P(A ∩ B) = P(A) P(B|A) = P(B) P(A|B).

Multiplication rule: given a probability space (Ω, F, P) with A_1, ..., A_n ∈ F and P(∩_{i=1}^{n−1} A_i) > 0, then
P(∩_{i=1}^{n} A_i) = P(A_1) P(A_2|A_1) ··· P(A_n | ∩_{i=1}^{n−1} A_i).
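An illustration of the multiplication rule (not from the slides): the probability of drawing two aces from a deck without replacement is P(A_1) P(A_2|A_1) = (4/52)(3/51), which a simulation confirms. The deck encoding is an illustrative simplification:

```python
import random

# P(both cards are aces) = P(A1) * P(A2 | A1) = (4/52) * (3/51).
exact = (4 / 52) * (3 / 51)

trials, hits = 200_000, 0
deck = ["ace"] * 4 + ["other"] * 48
for _ in range(trials):
    draw = random.sample(deck, 2)     # two cards, without replacement
    hits += (draw[0] == "ace" and draw[1] == "ace")

print(f"exact = {exact:.5f}, simulated = {hits / trials:.5f}")
```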

Conditional probability

Definition: let (Ω, F, P) be a probability space and A, B ∈ F. If P(B) > 0, the probability of A conditional on B is defined as
P(A|B) = P(A ∩ B) / P(B).

Total probability theorem: let Ω be the sample space of a random experiment and {A_i, i = 1, 2, ...} ⊆ F such that A_i ∩ A_j = ∅ for i ≠ j and ∪_i A_i = Ω. Then, for all B ∈ F,
P(B) = Σ_i P(B|A_i) P(A_i).

Example

The probability that a chicken from Steiermark is infected with bird flu is 0.2 (a fictitious number). At the same time, a chicken may have another lethal disease, called Y, which occurs with probability 0.5. The death rate for a chicken with bird flu is 0.8; the probability of death for a chicken with disease Y is 0.1. The two diseases cannot occur simultaneously. Additionally, a chicken might die of natural causes with probability 0.1. Calculate the probability that a chicken in Steiermark dies.
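A possible solution sketch, under the assumption (left implicit in the slide) that the 0.1 natural-death rate applies to the chickens with neither disease; it applies the total probability theorem over the partition {flu, Y, healthy} and checks the result by Monte Carlo:

```python
import random

priors = {"flu": 0.2, "Y": 0.5, "healthy": 0.3}   # partition of Omega
death  = {"flu": 0.8, "Y": 0.1, "healthy": 0.1}   # P(die | state)

# Total probability theorem: P(die) = sum_i P(die | A_i) P(A_i).
exact = sum(death[s] * priors[s] for s in priors)  # 0.16 + 0.05 + 0.03

# Monte Carlo check of the same model.
trials, dead = 100_000, 0
states, weights = list(priors), list(priors.values())
for _ in range(trials):
    s = random.choices(states, weights)[0]   # which case of the partition
    dead += random.random() < death[s]       # does this chicken die?

print(f"total probability: {exact:.2f}, simulated: {dead / trials:.3f}")
```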

Bayes' theorem

Let Ω be the sample space of a random experiment and {A_i, i = 1, 2, ...} ⊆ F such that A_i ∩ A_j = ∅ for i ≠ j and ∪_i A_i = Ω. Let B ∈ F with P(B) > 0. Then
P(A_i|B) = P(B|A_i) P(A_i) / P(B).
Equivalently,
P(A_i|B) = P(B|A_i) P(A_i) / Σ_j P(B|A_j) P(A_j).

Example: the Monty Hall problem

There are three doors, A, B and C, and a car is hidden behind one of them. (1) The contestant picks door A. (2) The host opens door C, revealing no car. (3) Should the contestant stay with door A or switch to door B?

[Figure: three doors, A, B and C]

Example (cont.)

Naive approach: regardless of the initial situation, there are now only two doors to choose from. Hence Pr(car is behind A) = Pr(car is not behind A) = 1/2, and there is no advantage in switching doors. Let us check this with Bayes' theorem.

Example (cont.)

Define the event A as "the car is behind door A" (similarly for the other doors), and let "open C" denote the event that the host opens door C.

Priors: Pr(A) = Pr(B) = Pr(C) = 1/3.
Likelihoods: Pr(open C|A) = 1/2, Pr(open C|B) = 1, Pr(open C|C) = 0.

By the total probability theorem:
Pr(open C) = P(open C|A) P(A) + P(open C|B) P(B) + P(open C|C) P(C) = (1/2)(1/3) + 1·(1/3) + 0·(1/3) = 1/2.

Example (cont.)

By Bayes' theorem:
Pr(A|open C) = Pr(open C|A) Pr(A) / Pr(open C) = (1/2 · 1/3) / (1/2) = 1/3
Pr(B|open C) = Pr(open C|B) Pr(B) / Pr(open C) = (1 · 1/3) / (1/2) = 2/3

Conclusion: the probability of winning the car is larger if you switch doors!
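The same conclusion can be reached by simulation; a minimal sketch (the contestant is assumed, without loss of generality, to always pick door A first):

```python
import random

def monty_hall(switch, trials=100_000):
    """Estimate the probability of winning the car with/without switching."""
    wins = 0
    for _ in range(trials):
        doors = ["A", "B", "C"]
        car = random.choice(doors)
        pick = "A"                       # contestant always starts with A
        # Host opens a door that is neither the pick nor the car.
        host = random.choice([d for d in doors if d != pick and d != car])
        if switch:
            pick = next(d for d in doors if d != pick and d != host)
        wins += (pick == car)
    return wins / trials

print("stay:  ", monty_hall(switch=False))   # ~ 1/3
print("switch:", monty_hall(switch=True))    # ~ 2/3
```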

Random variables: continuous random variables; distribution function; expectation and variance; the Central Limit Theorem

Random variable

The probability measure P is a set function and hence difficult to work with. We therefore define a random variable on a probability space (Ω, F, P) as a real-valued function X : Ω → R such that
X^{-1}(B) = {ω : X(ω) ∈ B} ∈ F for every Borel set B ⊆ R.
In other words, the inverse image of any Borel subset of R (generated by the semi-open intervals) must belong to the σ-algebra.

Example: the random experiment "toss a coin" can be represented by the random variable
X = 1 if head (X^{-1}({1}) = {head} ∈ F),
X = 0 if tail (X^{-1}({0}) = {tail} ∈ F).

Example 3.1

A roulette wheel has 38 slots: 18 red, 18 black and 2 green. A gambler bets $1 on red each time. Define the random variable X_i as the gambler's monetary gain in game i. (From Durrett (1996).)

Ω = {red, black, green}
F = {∅, {red}, {black}, {green}, {red, black}, {red, green}, {black, green}, {red, black, green}}
P(red) = 18/38 = 9/19, P(not red) = 20/38 = 10/19

Random variable: X_i = +1 if red, −1 if not red.
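A quick simulation sketch of this game (illustrative, not from Durrett): the average gain per game should approach E[X] = 9/19 − 10/19 = −1/19 ≈ −0.053.

```python
import random

# One game: win $1 with probability 18/38, lose $1 otherwise.
def play():
    return +1 if random.random() < 18 / 38 else -1

trials = 200_000
mean_gain = sum(play() for _ in range(trials)) / trials
print(f"average gain per game: {mean_gain:.4f} (theory: {9/19 - 10/19:.4f})")
```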

Discrete random variables

A discrete random variable can take a countable number of predetermined values. Examples: "toss a coin", "throw a die", "number of cars crossing a line during a certain time interval".

Mass function: for a discrete random variable, the mass function gives the probability of each element of the sample space. Example: for the random experiment "throw a die", p(1) = ... = p(6) = 1/6.

[Figure: empirical mass function of 100 random numbers from a binomial distribution (n = 100, p = 0.5)]

Discrete random variables

Typical mass functions of discrete random variables:
1. Bernoulli: X = 1 with probability p, X = 0 with probability 1 − p.
2. Binomial (number of successes in n Bernoulli trials): P(X = k) = C(n, k) p^k (1 − p)^{n−k}.
3. Poisson: P(X = k) = e^{−λ} λ^k / k!
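These mass functions are straightforward to evaluate; a small Python sketch (function names are illustrative) that prints the Binomial(10, 0.5) mass function next to the Poisson with the same mean λ = np, for comparison:

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lambda)."""
    return exp(-lam) * lam**k / factorial(k)

for k in range(11):
    print(f"k={k:2d}  binomial={binomial_pmf(k, 10, 0.5):.4f}  "
          f"poisson={poisson_pmf(k, 5):.4f}")
```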

Continuous random variables

For continuous random variables (which can take any real value), the way probability is distributed over the sample space is more difficult to define. The probability density function (pdf) f has the following properties:
1. P[a ≤ X ≤ b] = ∫_a^b f(x) dx
2. f(x) ≥ 0 for all x ∈ R
3. ∫_{−∞}^{+∞} f(x) dx = 1

[Figure: histogram of frequencies for a normal random variable]

Continuous random variables

The most common pdfs:
1. Normal: f(x) = (1 / √(2πσ²)) exp{−(x − µ)² / (2σ²)} (random errors are normally distributed)
2. Uniform: f(x) = (1 / (b − a)) I_{[a,b]}(x)
3. Gamma: f(x) = (β^α / Γ(α)) x^{α−1} e^{−βx}
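A numerical sanity check of pdf property 3 for the standard normal density (a plain Riemann sum; truncating the integral to [−10, 10] assumes the neglected tails are negligible, which they are here):

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    return exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / sqrt(2 * pi * sigma ** 2)

# Riemann-sum check that the density integrates to ~1 over [-10, 10].
dx = 0.001
total = sum(normal_pdf(-10 + i * dx) * dx for i in range(int(20 / dx)))
print(f"integral of the N(0,1) density: {total:.6f}")   # ~ 1.000000
```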

Distribution function

F(x) = P(X ≤ x) = ∫_{−∞}^x f(t) dt for a continuous r.v., and Σ_{a ≤ k ≤ x} P(X = k) for a discrete r.v., where a is the smallest value that the r.v. can take.

[Figure: empirical CDF of 100 random numbers from a binomial distribution (n = 10, p = 0.5), and empirical CDF of 1000 random numbers from a normal distribution (µ = 0, σ² = 1)]

Example 3.2

A fair coin (p = 0.5) is tossed twice. For the possible outcomes of the experiment, define the random variable X as the number of heads:
X = 2 for HH; X = 1 for HT or TH; X = 0 for TT.
Calculate and plot the probability distribution of this random variable.
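A sketch of the computation by direct enumeration of the four equally likely outcomes (printing the distribution rather than plotting it):

```python
from collections import Counter
from itertools import product

# Enumerate the four equally likely outcomes of two fair coin tosses
# and count heads, giving the mass function of X.
outcomes = list(product("HT", repeat=2))      # HH, HT, TH, TT
X = Counter(o.count("H") for o in outcomes)

for k in sorted(X):
    print(f"P(X = {k}) = {X[k]}/{len(outcomes)}")
# P(X=0) = 1/4, P(X=1) = 2/4, P(X=2) = 1/4
```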

Distribution function and probability

Properties of the distribution function:
1. lim_{x→−∞} F(x) = 0; lim_{x→+∞} F(x) = 1
2. If x < y, then F(x) ≤ F(y)
3. F is continuous from the right, i.e. F(x + h) → F(x) as h ↓ 0

Distribution function and probability:
P(X > x) = 1 − F(x)
P(x < X ≤ y) = F(y) − F(x)

Expectation and variance

Expectation:
- continuous random variable: E[X] = ∫ x f(x) dx
- discrete random variable: E[X] = Σ_i x_i P(X = x_i)

Variance: V[X] = E[(X − E[X])²] = E[X²] − (E[X])²

                    Probability    Statistics
Base                Population     Sample
Central tendency    Expectation    Average
Dispersion          Variance       Sample variance
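The population/sample correspondence in the table can be seen numerically; a sketch comparing the exact expectation and variance of a fair die with the average and sample variance of simulated throws:

```python
import random
from statistics import mean, variance

# Theoretical values for one throw of a fair die.
values = [1, 2, 3, 4, 5, 6]
E = sum(v / 6 for v in values)             # E[X] = 3.5
V = sum(v**2 / 6 for v in values) - E**2   # E[X^2] - (E[X])^2 = 35/12

# Sample counterparts (average and sample variance) from n throws.
sample = [random.choice(values) for _ in range(100_000)]
print(f"E[X] = {E:.4f}   sample average  = {mean(sample):.4f}")
print(f"V[X] = {V:.4f}   sample variance = {variance(sample):.4f}")
```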

Central Limit Theorem

Let X_1, ..., X_n be independent random variables with common distribution f, such that E[X_i] = µ and V[X_i] = σ² for all i. Then, if n is big enough,
(X̄ − E[X̄]) / √(V[X̄]) ≈ N(0, 1),
where X̄ = (1/n) Σ_i X_i, E[X̄] = µ and V[X̄] = σ²/n.
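A simulation sketch of the theorem (the choice of die throws for the X_i, of n = 50, and of the 1.96 quantile check are all illustrative):

```python
import random
from math import sqrt

# X_i ~ Uniform{1..6} (die throws): mu = 3.5, sigma^2 = 35/12.
mu, var, n = 3.5, 35 / 12, 50

def standardized_mean():
    """One realization of (Xbar - E[Xbar]) / sqrt(V[Xbar])."""
    xs = [random.randint(1, 6) for _ in range(n)]
    xbar = sum(xs) / n
    return (xbar - mu) / sqrt(var / n)

zs = [standardized_mean() for _ in range(20_000)]
inside = sum(-1.96 < z < 1.96 for z in zs) / len(zs)
print(f"P(-1.96 < Z < 1.96) ~ {inside:.3f}  (N(0,1) gives 0.950)")
```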


References

[1] Durbin, R., Eddy, S., Krogh, A. and Mitchison, G. (1996) Biological Sequence Analysis. Cambridge University Press.
[2] Durrett, R. (1996) Probability: Theory and Examples, 2nd edition. Duxbury Press.
[3] Rohatgi, V.K. and Ehsanes Saleh, A.K.Md. (1988) An Introduction to Probability and Statistics, 2nd edition. Wiley.
[4] Tuckwell, H.C. (1988) Elementary Applications of Probability Theory. Chapman and Hall.
[5] http://www.math.tau.ac.il/~tsirel/courses/introprob/syl1a.html
[6] Engineering Statistics Handbook: http://www.itl.nist.gov/div898/handbook/