Lecture 10: Joint Distributions and the Law of Large Numbers. Statistics 104 (Colin Rundel), February 20, 2012.


Midterm #1

The exam will be passed back at the end of class. The exam was hard; on the whole the class did well: Mean: 75, Median: 81, SD: 1.8, Max: 105. Final grades will be curved; midterm grades will be posted this week.

Joint Distributions - Example

Draw two socks at random, without replacement, from a drawer full of twelve colored socks: 6 black, 4 white, 2 purple.

Let B be the number of black socks and W the number of white socks drawn. Then the distributions of B and W are given by:

$P(B = k) = \binom{6}{k}\binom{6}{2-k} \big/ \binom{12}{2}$:  P(B = 0) = 15/66, P(B = 1) = 36/66, P(B = 2) = 15/66

$P(W = k) = \binom{4}{k}\binom{8}{2-k} \big/ \binom{12}{2}$:  P(W = 0) = 28/66, P(W = 1) = 32/66, P(W = 2) = 6/66

Note: $B \sim \text{HyperGeo}(12, 6, 2)$ and $W \sim \text{HyperGeo}(12, 4, 2)$.

Joint Distributions - Example, cont.

The joint distribution of B and W is given by

$P(B = b, W = w) = \binom{6}{b}\binom{4}{w}\binom{2}{2-b-w} \Big/ \binom{12}{2}$

which tabulates as follows (all entries out of 66):

         W = 0   W = 1   W = 2
 B = 0     1       8       6
 B = 1    12      24       0
 B = 2    15       0       0
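The joint table can be verified numerically. Below is a minimal Python sketch (an illustration added to this transcription, not part of the original slides; standard library only) that builds P(B = b, W = w) directly from the counting formula:

from math import comb

TOTAL_PAIRS = comb(12, 2)  # 66 equally likely pairs of socks

def joint_pmf(b, w):
    """P(B = b, W = w) = C(6,b) C(4,w) C(2, 2-b-w) / C(12,2)."""
    if b + w > 2:
        return 0.0  # cannot draw more than two socks
    return comb(6, b) * comb(4, w) * comb(2, 2 - b - w) / TOTAL_PAIRS

table = {(b, w): joint_pmf(b, w) for b in range(3) for w in range(3)}
for (b, w), p in table.items():
    print(f"P(B={b}, W={w}) = {p * 66:.0f}/66")

assert abs(sum(table.values()) - 1) < 1e-12  # the pmf sums to 1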

Marginal Distribution

Note that the column and row sums of the joint table are the distributions of B and W respectively:

$P(B = b) = P(B = b, W = 0) + P(B = b, W = 1) + P(B = b, W = 2)$
$P(W = w) = P(B = 0, W = w) + P(B = 1, W = w) + P(B = 2, W = w)$

These are the marginal distributions of B and W. In general,

$P(X = x) = \sum_y P(X = x, Y = y) = \sum_y P(X = x \mid Y = y)\, P(Y = y)$

Conditional Distribution

Conditional distributions are defined as we have seen previously:

$P(X = x \mid Y = y) = \dfrac{P(X = x, Y = y)}{P(Y = y)} = \dfrac{\text{joint pmf}}{\text{marginal pmf}}$

Therefore the pmf for white socks given that no black socks were drawn is

$P(W = w \mid B = 0) = \dfrac{P(W = w, B = 0)}{P(B = 0)} = \begin{cases} (1/66)/(15/66) = 1/15 & \text{if } w = 0 \\ (8/66)/(15/66) = 8/15 & \text{if } w = 1 \\ (6/66)/(15/66) = 6/15 & \text{if } w = 2 \end{cases}$

Expectation of Joint Distributions

$E[g(X, Y)] = \sum_x \sum_y g(x, y)\, P(X = x, Y = y)$

For example, we can define g(x, y) = xy; then

E(BW) = (0·0·1/66) + (0·1·8/66) + (0·2·6/66) + (1·0·12/66) + (1·1·24/66) + (1·2·0/66) + (2·0·15/66) + (2·1·0/66) + (2·2·0/66) = 24/66 = 4/11

Note that E(BW) ≠ E(B)E(W), since

E(B)E(W) = (0·15/66 + 1·36/66 + 2·15/66)(0·28/66 + 1·32/66 + 2·6/66) = (66/66)(44/66) = 2/3

Independence, cont.

Remember that Cov(X, Y) = 0 when X and Y are independent.

Cov(B, W) = E[(B − E[B])(W − E[W])] = E(BW) − E(B)E(W) = 4/11 − 2/3 = −10/33 ≈ −0.30303

This implies that B and W are not independent.
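The marginal sums, E(BW), and the covariance above can be checked the same way. A short sketch (again an added illustration, rebuilding the same joint pmf) reproduces Cov(B, W) = −10/33:

from math import comb

def joint_pmf(b, w):
    # P(B = b, W = w) for the sock example, as on the previous slide
    if b + w > 2:
        return 0.0
    return comb(6, b) * comb(4, w) * comb(2, 2 - b - w) / comb(12, 2)

# Marginals are the row and column sums of the joint table.
P_B = [sum(joint_pmf(b, w) for w in range(3)) for b in range(3)]  # 15/66, 36/66, 15/66
P_W = [sum(joint_pmf(b, w) for b in range(3)) for w in range(3)]  # 28/66, 32/66, 6/66

E_B = sum(b * p for b, p in enumerate(P_B))   # 1
E_W = sum(w * p for w, p in enumerate(P_W))   # 2/3
E_BW = sum(b * w * joint_pmf(b, w) for b in range(3) for w in range(3))  # 4/11

print(E_BW - E_B * E_W, -10 / 33)  # both print -0.3030303...
# Cov(B, W) != 0, so B and W cannot be independent.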

Expectation of Conditional Probability

A conditional distribution works like any other distribution:

$E(X \mid Y = y) = \sum_x x\, P(X = x \mid Y = y)$

Therefore we can calculate things like the conditional mean and conditional variance:

E(W | B = 0) = 0·1/15 + 1·8/15 + 2·6/15 = 20/15 = 4/3 ≈ 1.3333
E(W² | B = 0) = 0²·1/15 + 1²·8/15 + 2²·6/15 = 32/15 ≈ 2.1333
Var(W | B = 0) = E(W² | B = 0) − E(W | B = 0)² = 32/15 − (4/3)² = 16/45 ≈ 0.3556

Multinomial Distribution

Let X_1, X_2, ..., X_k be the k random variables that count the number of outcomes belonging to each of k categories in n trials, where the probability of an outcome falling in category i is p_i. Then $(X_1, \ldots, X_k) \sim \text{Multinom}(n, p_1, \ldots, p_k)$ with

$P(X_1 = x_1, \ldots, X_k = x_k) = f(x_1, \ldots, x_k \mid n, p_1, \ldots, p_k) = \dfrac{n!}{x_1! \cdots x_k!}\, p_1^{x_1} \cdots p_k^{x_k}$

where $\sum_{i=1}^k x_i = n$ and $\sum_{i=1}^k p_i = 1$. Moreover,

$E(X_i) = n p_i, \quad Var(X_i) = n p_i (1 - p_i), \quad Cov(X_i, X_j) = -n p_i p_j \ (i \ne j)$

Multinomial Example

Some regions of DNA have an elevated amount of GC relative to AT base pairs. In a normal region of DNA we expect equal amounts of A, C, G, and T, while a GC-rich region has twice as much GC as AT. If we observe the sequence ACTGACTTGGACCCGACGGA, what is the probability that it came from a normal region versus a GC-rich region?

Markov's Inequality

For any random variable X ≥ 0 and constant a > 0,

$P(X \ge a) \le \dfrac{E(X)}{a}$

Corollary (Chebyshev's Inequality):

$P(|X - E(X)| \ge a) \le \dfrac{Var(X)}{a^2}$
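The slides pose the DNA question without working it out here. A hedged sketch of one way to answer it: read "twice as much GC as AT" as p_A = p_T = 1/6 and p_G = p_C = 1/3 (an assumed parameterization), compute the multinomial likelihood of the observed base counts under each model, and, with equal prior weight on the two region types, take posterior probabilities proportional to those likelihoods:

from math import factorial
from collections import Counter

seq = "ACTGACTTGGACCCGACGGA"
counts = Counter(seq)  # A: 5, C: 6, G: 6, T: 3

def multinom_pmf(counts, probs):
    """n! / (x_A! x_C! x_G! x_T!) * prod_i p_i^(x_i)"""
    n = sum(counts.values())
    coef = factorial(n)
    prob = 1.0
    for base, p in probs.items():
        coef //= factorial(counts[base])
        prob *= p ** counts[base]
    return coef * prob

normal = {b: 1 / 4 for b in "ACGT"}                 # equal base frequencies
gc_rich = {"A": 1/6, "T": 1/6, "G": 1/3, "C": 1/3}  # GC twice as likely as AT

L_normal, L_gc = multinom_pmf(counts, normal), multinom_pmf(counts, gc_rich)
print(L_normal, L_gc)
print(L_gc / (L_normal + L_gc))  # posterior P(GC-rich | data) under equal priors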

Derivation of Markov's Inequality

Let X be a random variable with X ≥ 0, and define the indicator variable

$I_{X \ge a} = \begin{cases} 1 & \text{if } X \ge a \\ 0 & \text{if } X < a \end{cases}$

Then $a I_{X \ge a} \le X$, and taking expectations,

$E(a I_{X \ge a}) \le E(X) \;\Rightarrow\; a E(I_{X \ge a}) \le E(X) \;\Rightarrow\; P(X \ge a) \le \dfrac{E(X)}{a}$

Proposition: for a non-decreasing function f(x),

$P(X \ge a) = P(f(X) \ge f(a)) \le \dfrac{E(f(X))}{f(a)}$

Derivation of Chebyshev's Inequality

If we take the non-negative random variable |X − E(X)| and f(x) = x², then

$P(|X - E(X)| \ge a) = P\big((X - E(X))^2 \ge a^2\big) \le \dfrac{E[(X - E(X))^2]}{a^2} = \dfrac{Var(X)}{a^2}$

If we define a = kσ, where σ² = Var(X), then

$P(|X - E(X)| \ge k\sigma) \le \dfrac{Var(X)}{k^2 \sigma^2} = \dfrac{1}{k^2}$

Independent and Identically Distributed (iid)

A collection of random variables is iid if they all share the same probability distribution and are mutually independent.

Example: if $X \sim \text{Binom}(n, p)$ then $X = \sum_{i=1}^n Y_i$ where $Y_1, \ldots, Y_n \overset{iid}{\sim} \text{Bern}(p)$.

Sums of iid Random Variables

Let $X_1, X_2, \ldots, X_n \overset{iid}{\sim} D$, where D is some probability distribution with E(X_i) = μ and Var(X_i) = σ². Define $S_n = X_1 + X_2 + \cdots + X_n$. Then

$E(S_n) = E(X_1) + E(X_2) + \cdots + E(X_n) = \mu + \mu + \cdots + \mu = n\mu$

$Var(S_n) = E\big[\big((X_1 - \mu) + (X_2 - \mu) + \cdots + (X_n - \mu)\big)^2\big] = \sum_{i=1}^n E[(X_i - \mu)^2] + \sum_{i=1}^n \sum_{j \ne i} E[(X_i - \mu)(X_j - \mu)] = \sum_{i=1}^n Var(X_i) + \sum_{i=1}^n \sum_{j \ne i} Cov(X_i, X_j) = n\sigma^2$

since the covariance terms are zero by independence.
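As a quick numerical check (an illustration added here, not from the slides; Bernoulli(0.3) is an arbitrary choice), simulation confirms E(S_n) = nμ and Var(S_n) = nσ², and shows the empirical tail probabilities sitting below Chebyshev's 1/k² bound:

import random

random.seed(104)
p, n, trials = 0.3, 50, 50_000
mu, var = p, p * (1 - p)  # Bernoulli mean and variance

# Simulate S_n = X_1 + ... + X_n many times.
sums = [sum(random.random() < p for _ in range(n)) for _ in range(trials)]

mean_S = sum(sums) / trials
var_S = sum((s - mean_S) ** 2 for s in sums) / trials
print(mean_S, n * mu)   # ~15.0 vs 15.0
print(var_S, n * var)   # ~10.5 vs 10.5

# Chebyshev: P(|S_n - n mu| >= k sigma_S) <= 1/k^2, with sigma_S = sqrt(n var)
sigma_S = (n * var) ** 0.5
for k in (1.5, 2, 3):
    tail = sum(abs(s - n * mu) >= k * sigma_S for s in sums) / trials
    print(k, tail, 1 / k**2)  # empirical tail vs. the Chebyshev bound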

Average of iid Random Variables

Let $X_1, X_2, \ldots, X_n \overset{iid}{\sim} D$, where D is some probability distribution with E(X_i) = μ and Var(X_i) = σ². Define $\bar{X}_n = (X_1 + X_2 + \cdots + X_n)/n = S_n/n$. Then

$E(\bar{X}_n) = E(S_n/n) = E(S_n)/n = \mu$

$Var(\bar{X}_n) = Var(S_n/n) = \dfrac{1}{n^2} Var(S_n) = \dfrac{n\sigma^2}{n^2} = \dfrac{\sigma^2}{n}$

Weak Law of Large Numbers

Based on these results and Markov's inequality we can show the following:

$P(|\bar{X}_n - \mu| > \epsilon) = P(|S_n - n\mu| \ge n\epsilon) = P\big[(S_n - n\mu)^2 \ge n^2\epsilon^2\big] \le \dfrac{E[(S_n - n\mu)^2]}{n^2\epsilon^2} = \dfrac{n\sigma^2}{n^2\epsilon^2} = \dfrac{\sigma^2}{n\epsilon^2}$

Therefore, given σ² < ∞,

$\lim_{n \to \infty} P(|\bar{X}_n - \mu| \ge \epsilon) = 0$

LLN and CLT

Weak Law of Large Numbers ($\bar{X}_n$ converges in probability to μ):

$\lim_{n \to \infty} P(|\bar{X}_n - \mu| > \epsilon) = 0$

Strong Law of Large Numbers ($\bar{X}_n$ converges almost surely to μ):

$P\big(\lim_{n \to \infty} \bar{X}_n = \mu\big) = 1$

The Strong LLN is the more powerful result (it implies the Weak LLN), but its proof is more complicated. These results justify the long-run frequency definition of probability.

The law of large numbers shows us that

$\lim_{n \to \infty} \dfrac{S_n - n\mu}{n} = \lim_{n \to \infty} (\bar{X}_n - \mu) = 0$

i.e. for large n, S_n ≈ nμ. What happens if we divide by something that grows more slowly than n, like $\sqrt{n}$?

$\dfrac{S_n - n\mu}{\sqrt{n}} = \sqrt{n}\,(\bar{X}_n - \mu) \overset{d}{\to} N(0, \sigma^2)$

This is the Central Limit Theorem, of which the DeMoivre-Laplace theorem (the normal approximation to the binomial) is a special case. Hopefully by the end of this class we will have the tools to prove it.
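Both limits are easy to see numerically. The sketch below (an added illustration; Exponential(1) draws, so μ = σ = 1) shows the spread of X̄_n shrinking like σ/√n (the LLN) while the spread of √n(X̄_n − μ) stays near σ (the CLT scaling):

import random
import statistics

random.seed(104)
mu = sigma = 1.0  # Exponential(1) has mean 1 and standard deviation 1

for n in (10, 100, 1000, 10000):
    xbars = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
             for _ in range(500)]
    sd_xbar = statistics.stdev(xbars)                                 # ~ sigma/sqrt(n)
    sd_scaled = statistics.stdev([n**0.5 * (x - mu) for x in xbars])  # ~ sigma
    print(n, round(sd_xbar, 4), round(sd_scaled, 4))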