Part (A): Review of Probability [Statistics I revision]


1 Definition of Probability

1.1 Experiment

An experiment is any procedure whose outcome is uncertain, e.g.
- toss a coin
- throw a die
- buy a lottery ticket
- measure an individual's height h

1.2 Sample Space

A sample space, S, is the set of all possible outcomes of a random experiment, e.g.
- S = {H, T}
- S = {1, 2, 3, 4, 5, 6}
- S = {WIN, LOSE}
- S = {h : h ≥ 0}

1.3 Events

An event, A, is an element, or appropriate subset, of S. E.g. roll a die, S = {1, 2, 3, 4, 5, 6}:
- the event A, that an even number is obtained, is given by A = {2, 4, 6}

- the event B, that a number no greater than 4 is obtained, is described by B = {1, 2, 3, 4}

S can be discrete (e.g. coin, die, lottery) or continuous (e.g. height). In the first part of this course we will deal with the discrete case (as in Stats I).

Set operations

If A, B are events then, following set theory, we define the operations:
- A ∪ B: union, "A or B happens"
- A ∩ B: intersection, "A and B happen"
- A^c (or A'): complement of A with respect to S, "not A"

Notice that A ∪ A^c = S and A ∩ A^c = ∅.

1.4 Probability function

A probability function, P(·), is a real-valued function on a collection of events from a sample space, S, attaching a value in [0, 1] to each event and satisfying the axioms:

A1. P(A) ≥ 0 for any event A
A2. P(S) = 1
A3. If A_1, A_2, A_3, ... is a countable collection of mutually exclusive events (A_i ∩ A_j = ∅ for i ≠ j, i.e. they have no common elements), then P(∪_i A_i) = Σ_i P(A_i).

Important properties of P(·):
- P(∅) = 0.
- P(A ∪ B) = P(A) + P(B) − P(A ∩ B). This can be generalised for n events, e.g. for n = 3:
  P(A_1 ∪ A_2 ∪ A_3) = P(A_1) + P(A_2) + P(A_3) − P(A_1 ∩ A_2) − P(A_1 ∩ A_3) − P(A_2 ∩ A_3) + P(A_1 ∩ A_2 ∩ A_3).
  Notice: P(A ∪ B) = P(A) + P(B) is not always true! It only holds for disjoint (mutually exclusive) events.
- P(A^c) = 1 − P(A). Proof: A and A^c are disjoint, and A ∪ A^c = S, so P(A) + P(A^c) = P(A ∪ A^c) = P(S) = 1. In many practical situations it is easier to calculate P(A^c) first and use this property to calculate P(A).
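
These properties are easy to sanity-check by direct enumeration. Below is a minimal Python sketch (an illustration, not part of the original notes) using the die events from Section 1.3 and assuming all six outcomes are equally likely:

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}                     # one die roll
P = lambda E: Fraction(len(E), len(S))     # equally likely outcomes: P(E) = |E|/|S|

A = {2, 4, 6}                              # even number
B = {1, 2, 3, 4}                           # no greater than 4

assert P(A | B) == P(A) + P(B) - P(A & B)  # inclusion-exclusion
assert P(S - A) == 1 - P(A)                # complement rule
print(P(A | B))                            # 5/6
```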

1.5 Independence and Conditional Probability

For any event A with P(A) > 0 we can define the conditional probability of B given A:

P(B | A) = P(A ∩ B) / P(A).   (1)

If A and B are independent then by definition P(A ∩ B) = P(A) P(B), and therefore

P(B | A) = P(A) P(B) / P(A) = P(B).

Note that rearrangement of (1) gives the chain (multiplication) rule:

P(A ∩ B) = P(A) P(B | A) = P(B) P(A | B).   (2)

This can also be generalised to n events.
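
The same enumeration idea illustrates definition (1) and the chain rule (2). A small Python check, again with the die events A (even) and B (no greater than 4); note these particular events happen to be independent:

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}                     # one die roll, equally likely
P = lambda E: Fraction(len(E), len(S))

A = {2, 4, 6}                              # even number
B = {1, 2, 3, 4}                           # no greater than 4

P_B_given_A = P(A & B) / P(A)              # definition (1)
print(P_B_given_A)                         # 2/3 = P(B), so A and B are independent
assert P(A & B) == P(A) * P_B_given_A == P(B) * (P(A & B) / P(B))   # chain rule (2)
```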

1.6 Total Probability Rule

If A_1, A_2, ..., A_k are mutually exclusive events which form a partition of the sample space S (i.e. ∪_{i=1}^k A_i = S), with P(A_i) > 0 for all i, then for any event B ⊆ S:

P(B) = Σ_{i=1}^k P(B ∩ A_i) = Σ_{i=1}^k P(A_i) P(B | A_i).   (3)

Example: An insurance company covers claims from 4 different motor portfolios, A_1, A_2, A_3 and A_4. Portfolio A_1 covers 4000 policyholders; A_2 covers 7000; A_3 covers 13000; and A_4 covers 6000 of the total 30000 policyholders insured by the company. It is estimated that the proportions of policies that will result in a claim in the following year are 8%, 5%, 2% and 4% for the four portfolios respectively. What is the probability that a policy chosen at random from one of the portfolios will result in a claim in the following year?

Solution ...
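
The example can be checked numerically with the total probability rule (3). A short Python sketch, assuming the proportions in the (garbled) original read as 8%, 5%, 2% and 4%:

```python
sizes = [4000, 7000, 13000, 6000]          # policyholders in A1..A4
claim = [0.08, 0.05, 0.02, 0.04]           # assumed claim rates P(B | A_i)

prior = [n / sum(sizes) for n in sizes]    # P(A_i): 4/30, 7/30, 13/30, 6/30
P_B = sum(pa * pb for pa, pb in zip(prior, claim))   # rule (3)
print(P_B)                                 # ~0.039
```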

1.7 Bayes' Theorem

If P(A) > 0 and P(B) > 0 then, from (1),

P(A | B) = P(A ∩ B) / P(B) = P(A) P(B | A) / P(B),   (4)

or, if A_1, A_2, ..., A_k form a partition of the sample space S with P(A_i) > 0 for all i, and P(B) > 0, then from (3)

P(A_j | B) = P(A_j) P(B | A_j) / Σ_{i=1}^k P(A_i) P(B | A_i),   j = 1, 2, ..., k.   (5)

Example (previous cont.): If a claim arises from a policy in the year concerned, what is the probability that the policy belongs to portfolio A_3?

Solution ...
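
Continuing the sketch above, Bayes' theorem (5) inverts the conditioning. A short Python check of the posterior probability for portfolio A_3, under the same assumed claim rates:

```python
sizes = [4000, 7000, 13000, 6000]                   # portfolios A1..A4
claim = [0.08, 0.05, 0.02, 0.04]                    # assumed P(claim | A_i)
prior = [n / sum(sizes) for n in sizes]             # P(A_i)

P_B = sum(pa * pb for pa, pb in zip(prior, claim))  # P(claim), as before

# Equation (5): P(A_j | B) = P(A_j) P(B | A_j) / P(B)
posterior = [pa * pb / P_B for pa, pb in zip(prior, claim)]
print(round(posterior[2], 3))                       # P(A3 | claim) ~ 0.222
```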

1.8 Random variables

Until now we have associated probabilities with events in a sample space. We will now review the concept of random variables.

Definition: A random variable (r.v.) X is a function from a sample space S to R (the real numbers). That is, to any outcome s we associate a real number X(s).

The range of a r.v. X, denoted S_X, is the set {r ∈ R : r = X(s) for some s ∈ S} [i.e. the set of all real numbers r which are equal to X(s) for some outcome s].

We will usually denote random variables using capital letters, e.g. X, and their realisations (values) using small letters, e.g. x.

If S is finite or countable then the range of X is a set of discrete numbers. We say X is a discrete random variable. Some simple examples:

(i) Experiment: toss a balanced coin 3 times. Sample space S: all possible sequences of T, H of length 3. Let X(s) = number of Heads in outcome (sequence) s. Then X is a discrete random variable.

(ii) Experiment: toss a balanced coin until a H is thrown. Sample space: S = {H, TH, TTH, ...}. Let X(s) = number of throws (i.e. the length of s). Then X is a discrete random variable.

1.9 Probability function of a discrete r.v.

Any probability function on S leads naturally to a probability function on the range of X, defined by

f_X(x) = P(X = x) = Σ_{s : X(s) = x} P(s).

In Example (i) before:

s:      HHH  HHT  HTH  THH  TTT  TTH  THT  HTT
P(s):   1/8  1/8  1/8  1/8  1/8  1/8  1/8  1/8
X(s):   3    2    2    2    0    1    1    1

The probability of 2 Heads is P(X = 2) = 1/8 + 1/8 + 1/8 = 3/8.

Usually we consider r.v.'s which describe some useful summary of the outcome of an experiment, e.g. the total score in a multiple-choice test.
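
This table can be generated mechanically: enumerate the sample space of Example (i), apply the r.v. X, and accumulate probabilities. A minimal Python sketch (an illustration, not from the notes), which also confirms that the resulting pmf sums to 1:

```python
from fractions import Fraction
from itertools import product

S = list(product("HT", repeat=3))          # the 8 equally likely outcomes
X = lambda s: s.count("H")                 # the r.v.: number of Heads

f = {}                                     # f_X(x) = sum of P(s) with X(s) = x
for s in S:
    f[X(s)] = f.get(X(s), Fraction(0)) + Fraction(1, len(S))

print(dict(sorted(f.items())))             # {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}
assert sum(f.values()) == 1                # the pmf sums to one
```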

Axioms of probability

For a probability function of a discrete r.v. to satisfy the axioms of probability, it is sufficient that:

i. f_X(x) ≥ 0 for all x ∈ S_X
ii. Σ_{all x} f_X(x) = 1.

1.10 S infinite

Until now in the examples we have only considered cases where the sample space S consists of a finite number of outcomes. However, there may also be an infinite number of possibilities involved with the experiment under consideration.

Some results on infinite series ...

Example: Two (TV show) players (A and B) play a game in which each wins a prize if s/he is the first to choose the correct screen out of the 6-screen display panel shown to them. Player A goes first, and the prize is randomly re-allocated to one of the screens after each player's choice (so that the probability of either player winning at a single attempt is independently 1/6). What is the probability that A wins?

Solution ...
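
The analytic solution uses an infinite (geometric) series; as a plausibility check, the game is also easy to simulate. A rough Python sketch under the stated assumptions (independent 1/6 chance on every pick, A going first):

```python
import random

def a_wins(p=1/6):
    # Players alternate picks, A first; the prize is re-randomised each time.
    while True:
        if random.random() < p:
            return True      # A finds the prize
        if random.random() < p:
            return False     # B finds the prize

n = 100_000
print(sum(a_wins() for _ in range(n)) / n)
# ~0.545; summing the geometric series (1/6) * sum_k (25/36)^k gives 6/11.
```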

2 Some common discrete distributions [Statistics I]

2.1 The binomial distribution: Bin(n, p)

Consider an experiment consisting of n independent trials where the probability of success is p, 0 < p < 1. Let X denote the number of successes (S's). Then

f_X(x) = P(X = x) = C(n, x) p^x (1 − p)^(n−x),   x = 0, 1, 2, ..., n,   (6)

where C(n, x) = n! / (x! (n − x)!) denotes the binomial coefficient. Why? The sample space S consists of all sequences of S and F of length n. For any such sequence s with x S's and (therefore) (n − x) F's, the probability is P(s) = p^x (1 − p)^(n−x) (since the trials are independent, probabilities multiply). There are C(n, x) such sequences, and therefore (6) follows.
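
Equation (6) translates directly into code. A short Python sketch (illustrative, not part of the notes) using the standard-library function math.comb:

```python
from math import comb

def binom_pmf(x, n, p):
    # Equation (6): C(n, x) p^x (1 - p)^(n - x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(binom_pmf(2, 3, 0.5))    # 0.375 = 3/8, matching the 3-coin example
assert abs(sum(binom_pmf(x, 10, 0.3) for x in range(11)) - 1) < 1e-12
```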

2.2 The geometric distribution: Geometric(p) [Statistics I revision]

Again consider a sequence of independent trials with probability of success P(S) = p. Let X be the number of trials required to achieve the 1st success [e.g. the number of attempts required to pass the driving test].

s:  S, FS, FFS, FFFS, ...
X:  1, 2, 3, 4, ...

Therefore, for x = 1, 2, 3, ...,

f_X(x) = P(X = x) = P(FF...FS) = (1 − p)^(x−1) p,   (7)

where the sequence FF...FS consists of x − 1 F's followed by a single S.

The memoryless property

The geometric distribution has the memoryless property. That is, given that there have already been n trials without success, the probability that more than x additional trials will be required for the first success does not depend on n:

P(X > x + n | X > n) = P(X > x).   [Show]

[Is the number of attempts required to pass the driving test well described by the geometric distribution?]
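
Since X > x exactly when the first x trials are all failures, P(X > x) = (1 − p)^x, and the memoryless property can be verified numerically. A small Python check:

```python
def geom_tail(x, p):
    # P(X > x) = (1 - p)^x: the first x trials must all be failures.
    return (1 - p) ** x

p, n, x = 0.2, 5, 3
lhs = geom_tail(x + n, p) / geom_tail(n, p)    # P(X > x + n | X > n)
rhs = geom_tail(x, p)                          # P(X > x)
assert abs(lhs - rhs) < 1e-12                  # memoryless
```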

2.3 [NEW!] The negative binomial distribution: NBin(r, p)

This is a generalisation of the geometric distribution. Consider the same sequence of independent trials as in (2.2) and let X denote the number of trials required to achieve r successes, where r can be any positive integer. Clearly the range of X is {r, r + 1, r + 2, ...}.

What is f_X(x) = P(X = x) for x ≥ r?

Let s be an outcome for which X(s) = x. Then s is a sequence of S's and F's such that:
(i) s has length x
(ii) s has r S's and (x − r) F's
(iii) the last entry is S.

From (ii), P(s) = p^r (1 − p)^(x−r) (independence of trials!!).

How many such outcomes are there?

Trial:    1  2  3  ...  x−1 | x
Result:   ?  ?  ?  ...   ?  | S

Positions 1, 2, ..., x − 1 contain (x − r) F's and (r − 1) S's, so there are C(x − 1, r − 1) ways of assigning the S's to positions 1, 2, ..., x − 1. Therefore

f_X(x) = P(X = x) = C(x − 1, r − 1) p^r (1 − p)^(x−r),   x = r, r + 1, ....   (8)
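
Equation (8) in code, with a check that r = 1 recovers the geometric pmf (7); a minimal Python sketch:

```python
from math import comb

def nbin_pmf(x, r, p):
    # Equation (8): C(x-1, r-1) p^r (1-p)^(x-r) for x = r, r+1, ...
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

# Sanity check: r = 1 reduces to the geometric pmf (7).
p = 0.3
assert all(abs(nbin_pmf(x, 1, p) - (1 - p)**(x - 1) * p) < 1e-12
           for x in range(1, 30))
print(nbin_pmf(5, 2, 0.5))   # P(2nd success on 5th trial) = C(4,1)(1/2)^5 = 0.125
```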

Relationship to the geometric distribution

1. NBin(1, p) ≡ Geometric(p).
2. If X_1, X_2, ..., X_r are independent Geometric(p) r.v.'s, then Y = X_1 + X_2 + ... + X_r ∼ NBin(r, p).

To summarise: in a sequence of independent trials, each with probability of success p:
- the number of successes in n trials follows a Bin(n, p) distribution
- the number of trials until the first success follows a Geometric(p) distribution
- the number of trials until the rth success follows a NBin(r, p) distribution.
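
Fact 2 can be illustrated by simulation: sample r independent geometric variables, sum them, and compare the empirical frequencies with pmf (8). A rough Monte Carlo sketch (illustrative only):

```python
import random
from math import comb

def geometric(p):
    trials = 1
    while random.random() >= p:             # repeat until the first success
        trials += 1
    return trials

def nbin_pmf(x, r, p):                      # equation (8)
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

r, p, n = 3, 0.25, 100_000
samples = [sum(geometric(p) for _ in range(r)) for _ in range(n)]
print(samples.count(10) / n, nbin_pmf(10, r, p))   # both ~0.075
```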

2.4 [NEW!] The Poisson distribution: Poisson(λ)

This is a distribution which is often used to describe the outcomes of experiments that involve counting objects or events (e.g. the number of road accidents occurring on a stretch of road in a 1-month period). Its range is {0, 1, 2, 3, ...}, and its probability function is given by

f_X(x) = P(X = x) = e^(−λ) λ^x / x!,   x = 0, 1, 2, 3, ...;   λ > 0.   (9)

It has connections with the Bin(n, p) distribution, and also with the Exponential distribution (see later).
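
Equation (9) in code, together with a numerical illustration of one of the connections mentioned above: with p = λ/n (so np = λ is fixed), the Bin(n, p) probabilities approach the Poisson ones as n grows. A Python sketch:

```python
from math import comb, exp, factorial

def pois_pmf(x, lam):
    # Equation (9): e^{-lambda} lambda^x / x!
    return exp(-lam) * lam**x / factorial(x)

lam, x = 2.0, 3
for n in (10, 100, 10_000):                 # Bin(n, lam/n) with np = lam fixed
    p = lam / n
    print(n, comb(n, x) * p**x * (1 - p)**(n - x))
print("Poisson:", pois_pmf(x, lam))         # ~0.1804; binomial values approach this
```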

3 Expectation, variance and independence [Statistics I]

3.1 Mathematical expectation of a discrete r.v. X

(Also referred to as the expected value or mean of X.)

The expectation of a r.v. X is a weighted average of all possible values of X, with weights determined by the probability distribution of X. It measures where the centre of the distribution lies.

Definition: Let X be a discrete r.v. with probability mass function f_X(x). Then its expected value is

E(X) = Σ_{x ∈ S_X} x f_X(x) = μ_X (the mean of X).   (10)

Notice that if the above sum does not converge for an infinite set S_X, then E(X) is not defined.

Properties of expectation

If X, Y are r.v.'s and a, b ∈ R are constants:
(i) E(X + Y) = E(X) + E(Y)
(ii) E(a) = a; E(aX + b) = a E(X) + b
(iii) if X ≥ 0, then E(X) ≥ 0.
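
Definition (10) and property (ii) for a fair die, as a small exact-arithmetic Python check (an illustration, not from the notes):

```python
from fractions import Fraction

f = {x: Fraction(1, 6) for x in range(1, 7)}      # fair die: f_X(x) = 1/6

# Definition (10): E(X) = sum of x f_X(x) over the range of X.
EX = sum(x * p for x, p in f.items())
print(EX)                                          # 7/2

# Property (ii): E(aX + b) = a E(X) + b.
a, b = 3, 2
E_aXb = sum((a * x + b) * p for x, p in f.items())
assert E_aXb == a * EX + b
```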

3.2 Expectation of a function of a discrete r.v.

Let X be a r.v. with probability function f_X(x), and let Y = g(X) be a real-valued function of it. Then Y is also a r.v., and its expectation is given by

E(Y) = Σ_{x ∈ S_X} g(x) f_X(x) = Σ_y y f_Y(y).

Example: An individual invests an amount of $105k on a financial product. The (total) return of his investment at the end of the following year is given in the table:

Return (in $1000s), x:    118    113    110    107
Probability f_X(x):       3/20   7/20   7/20   3/20

If the profit of the investment, Y, is given (as a function of the total return X) by the formula Y = 0.99X − 105, find the expected profit of the investor.

Solution:

E(Y) = Σ_x g(x) f_X(x)
     = (118 × 0.99 − 105)(3/20) + (113 × 0.99 − 105)(7/20) + (110 × 0.99 − 105)(7/20) + (107 × 0.99 − 105)(3/20)
     = 7.5 × (3/20) + 2.8 × (7/20) − 0.1 × (7/20) − 2.9 × (3/20)
     = 1.6

[An expected profit of $1600. However, notice that with probability 1/2 the investor will lose money! Is this a good investment?]
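
The computation is a one-liner once the profit distribution is tabulated. A Python sketch using the profit values exactly as printed on the slide (7.5, 2.8, −0.1, −2.9, in $1000s); note the unrounded expectation is 1.635:

```python
from fractions import Fraction

# Profit values (in $1000s) and probabilities as printed in the notes.
profit = {Fraction("7.5"): Fraction(3, 20),
          Fraction("2.8"): Fraction(7, 20),
          Fraction("-0.1"): Fraction(7, 20),
          Fraction("-2.9"): Fraction(3, 20)}

EY = sum(y * p for y, p in profit.items())
print(float(EY))                                   # 1.635, i.e. about $1600
P_loss = sum(p for y, p in profit.items() if y < 0)
print(P_loss)                                      # 1/2: even chance of a loss
```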

3.3 Variance of a discrete r.v. X

The variance of a random variable X (written var(X), also σ_X²) is defined as the expectation of the squared deviation of X from its mean (expected value), i.e.

var(X) = E[{X − E(X)}²],   (11)

and for a discrete r.v. it is calculated as

var(X) = Σ_{x ∈ S_X} {x − E(X)}² f_X(x).   (12)

It measures the spread of the r.v. X, i.e. the extent to which the distribution of X is dispersed around its mean μ_X. It can also be calculated as

var(X) = E(X²) − {E(X)}².   (13)

The positive square root of the variance is called the standard deviation (s.d.) of X:

σ_X = √var(X).

Properties of variance

(i) var(X) ≥ 0, with equality if and only if X is a constant.
(ii) If a, b ∈ R are constants, then var(a + bX) = b² var(X).
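
Formulas (12) and (13) should agree, and property (ii) can be checked the same way. A short exact-arithmetic Python sketch for a fair die:

```python
from fractions import Fraction

f = {x: Fraction(1, 6) for x in range(1, 7)}       # fair die
mu = sum(x * p for x, p in f.items())              # E(X) = 7/2

var_def = sum((x - mu)**2 * p for x, p in f.items())        # formula (12)
var_alt = sum(x**2 * p for x, p in f.items()) - mu**2       # formula (13)
assert var_def == var_alt == Fraction(35, 12)

# Property (ii): var(a + bX) = b^2 var(X).
a, b = 10, -2
shifted_mu = a + b * mu
var_shifted = sum((a + b * x - shifted_mu)**2 * p for x, p in f.items())
assert var_shifted == b**2 * var_def
```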

3.4 Independence of r.v.'s

Definition: The discrete r.v.'s X, Y are said to be independent of each other iff

P(X = x ∩ Y = y) = P(X = x) · P(Y = y)   for all x ∈ S_X and all y ∈ S_Y.

Properties: If X, Y are independent r.v.'s, then
(i) E(XY) = E(X) E(Y)
(ii) var(X + Y) = var(X) + var(Y). [Prove]

Counter-example for property (ii): if X = Y (NOT independent), then

var(X + Y) = var(2X) = 4 var(X) ≠ var(X) + var(Y).
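
Both properties, and the counter-example, can be verified exactly by building the joint pmf. A Python sketch assuming X and Y are two fair dice (independent case) versus Y = X (dependent case):

```python
from fractions import Fraction
from itertools import product

f = {x: Fraction(1, 6) for x in range(1, 7)}            # one fair die

# Independent X, Y: joint pmf is the product of the marginals (Section 3.4).
joint = {(x, y): f[x] * f[y] for x, y in product(f, f)}
# Dependent case Y = X: all mass on the diagonal.
dep = {(x, x): px for x, px in f.items()}

E = lambda g, pm: sum(g(x, y) * p for (x, y), p in pm.items())
var = lambda g, pm: E(lambda x, y: g(x, y)**2, pm) - E(g, pm)**2

# Property (i): E(XY) = E(X) E(Y) under independence.
assert E(lambda x, y: x * y, joint) == \
       E(lambda x, y: x, joint) * E(lambda x, y: y, joint)

# Property (ii) holds for the independent pair ...
vX = var(lambda x, y: x, joint)
assert var(lambda x, y: x + y, joint) == 2 * vX
# ... but fails for Y = X: var(2X) = 4 var(X).
assert var(lambda x, y: x + y, dep) == 4 * vX
```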