Probability review (week 2) Solutions


BERNOULLI, BINOMIAL, POISSON AND NORMAL DISTRIBUTIONS

A. Binomial distribution.

X is a binomial RV with parameters n, p. Let u_i be a Bernoulli RV with probability of success p. The expected value of u_i is E(u_i) = 1(p) + 0(1-p) = p (the summation of each possible outcome weighted by its probability). The expected value of X is the sum of the expected values of each individual trial (represented by the Bernoulli RV u_i), so it can be derived as follows:

X = Σ_{i=1}^{n} u_i, where the u_i's are Bernoulli r.v.'s, so

E(X) = E( Σ_{i=1}^{n} u_i ) = Σ_{i=1}^{n} E(u_i) = Σ_{i=1}^{n} p = np   [due to the linearity of expectation].

Now for the variance:

σ_X² = E[(X - E(X))²] = E(X²) - (E X)² = E( Σ_{i=1}^{n} u_i Σ_{j=1}^{n} u_j ) - n²p² = Σ_{i,j=1}^{n} E(u_i u_j) - n²p² = np + n(n-1)p² - n²p² = np(1-p).

In the calculation of Σ_{i,j=1}^{n} E(u_i u_j), note that E(u_i u_j) = E(u_i²) = p if i = j, and there are n such terms in the summation; otherwise E(u_i u_j) = E(u_i)E(u_j) = p·p = p², and there are n(n-1) of them in the summation.

The MATLAB code follows:

% This function returns a vector the same size as u where entries
% are the binomial pdf with parameters n and p, calculated
% at each element of u. This tries to mimic Matlab's
% built-in function f=pdf('bino',u,n,p).
function f=my_binomial_pdf(u,n,p)
% Note the following from help on the function "factorial":
% Since double precision numbers only have about
% 15 digits, the answer is only accurate for N <= 21.
% For how Matlab itself calculates the binomial pdf, see binopdf.m
f=zeros(size(u));   % initialization of f
for i=1:length(u)
    if (u(i)>=0)&&(u(i)<=n)   % since the support of binomial(n,p) is 0,1,...,n
        % using the pdf of a binomial with parameters n,p
        f(i)=p^u(i)*(1-p)^(n-u(i))*factorial(n)/(factorial(n-u(i))*factorial(u(i)));
    end
end
end
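As a quick sanity check of the derivation above (this snippet is my addition, not part of the original solution), one can use my_binomial_pdf to verify numerically that the mean and variance of the pmf match np and np(1-p); the values n = 20 and p = 0.3 are arbitrary choices.

% Sanity check (assumed values n = 20, p = 0.3): the moments computed from
% my_binomial_pdf should match the closed-form results n*p and n*p*(1-p).
n = 20; p = 0.3;
k = 0:n;                          % support of the binomial(n,p)
f = my_binomial_pdf(k,n,p);       % pmf evaluated on its support
mean_numeric = sum(k.*f);                       % should be close to n*p = 6
var_numeric  = sum(k.^2.*f) - mean_numeric^2;   % should be close to n*p*(1-p) = 4.2
fprintf('mean: %.4f  (n*p = %.4f)\n', mean_numeric, n*p);
fprintf('var:  %.4f  (n*p*(1-p) = %.4f)\n', var_numeric, n*p*(1-p));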

And now the main m-file, where we fix E[X] = 5 and n = 6, 10, 20, 50. Notice the use of stem for illustrating the pmf and stairs for the cdf of discrete random variables.

clear all; close all; clc;
n_vector=[6,10,20,50];
i=1;
for n=n_vector
    p=5/n;
    figure(1)
    subplot(2,2,i);
    % stem(0:n,pdf('bino',0:n,n,p),'.');   % cheat line!
    stem(0:n,my_binomial_pdf(0:n,n,p),'.');
    title(['n = ',num2str(n)]);
    xlabel('x'); ylabel('pdf');
    grid on;
    axis([0,50,0,0.5]);

    figure(2)
    subplot(2,2,i);
    % stem(0:n,cdf('bino',0:n,n,p),'.');   % cheat line!
    % stem(0:n,my_binomial_cdf(0:n,n,p),'.');   % see below for my_binomial_cdf code
    stairs(0:n,cumsum(my_binomial_pdf(0:n,n,p)),'LineWidth',1);   % using cumsum
    title(['n = ',num2str(n)]);
    xlabel('x'); ylabel('cdf');
    grid on;
    axis([0,50,0,1]);

    i=i+1;
end

In case you prefer a separate function for the binomial cdf:

% This function takes a vector u, and scalars n and p,
% and returns a vector of the same size as u, where entries
% are the binomial cdf with parameters n and p, calculated
% at the points of the elements of u, trying to mimic Matlab's
% built-in function f=cdf('bino',u,n,p).
% This function calls my_binomial_pdf and calculates a cumulative sum
% over possible values up to u(i) for each entry of u.
function F=my_binomial_cdf(u,n,p)
F=zeros(size(u));   % initialization of F
for i=1:length(u)
    F(i)=sum(my_binomial_pdf(0:u(i),n,p));
end
end

The corresponding pdf and cdf plots can be found in figures 1 and 2.
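If the Statistics Toolbox is available, the hand-rolled pmf can also be cross-checked against MATLAB's built-in binopdf. This check is my addition, not part of the original solution; the values n = 20, p = 5/n simply mirror one case from the loop above.

% Optional cross-check against the built-in binopdf (Statistics Toolbox):
% the two pmfs should agree up to floating-point round-off.
n = 20; p = 5/n;
k = 0:n;
max_abs_err = max(abs(my_binomial_pdf(k,n,p) - binopdf(k,n,p)));
disp(max_abs_err)   % expected to be on the order of 1e-16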

B. Binomial and Poisson distributions.

For a Poisson RV X_P with parameter λ, the expected value is calculated below:

E(X_P) = Σ_{k=0}^{∞} k e^{-λ} λ^k / k! = λ e^{-λ} Σ_{k=1}^{∞} k λ^{k-1} / k! = λ e^{-λ} (d/dλ) Σ_{k=0}^{∞} λ^k / k! = λ e^{-λ} (d/dλ) e^{λ} = λ e^{-λ} e^{λ} = λ.

The Matlab code to plot the Poisson distribution follows.

clear all; close all; clc;
figure
lambda=5;
x=0:50;
% stem(x,pdf('poiss',x,lambda),'.');   % cheat line!
my_poisson_pdf=exp(-lambda)*(lambda.^x)./factorial(x);
% Note the use of "dot" for element-wise operation.
% Thus, my_poisson_pdf is now a vector with the same size as x.
stem(x,my_poisson_pdf,'.');
title(['Poisson Distribution with \lambda = ',num2str(lambda)]);
xlabel('x'); ylabel('pdf');
grid on;
axis([0,50,0,0.5]);

The graph can be found in figure 3.

Now we need to calculate the MSE. A simple trial and error, such as the following code, reveals that we only need to consider the first few values of the support for the calculation of the MSE. (Note that the Poisson pmf has a declining tail, which means that as k → ∞, P(X_P = k) → 0.)

clear all; close all; clc;
x=0:20;
lambda=5;
my_poisson_pdf=exp(-lambda)*lambda.^x./factorial(x);
test=my_poisson_pdf<0.015;
test

In fact, more precisely, we should consider only the terms 3:9, as shown in the code below, which also generates figure 4.

close all; clear all; clc;
lambda=5;
x=3:9;
n_index=0;
n_vector=[6 10 20 50];
my_mse=zeros(1,4);
for n=n_vector
    n_index=n_index+1;
    my_poisson_pdf=exp(-lambda)*(lambda.^x)./factorial(x);
    my_mse(1,n_index)=sum((my_binomial_pdf(x,n,lambda/n)...
        -my_poisson_pdf).^2.*my_poisson_pdf);
end
stem(n_vector,my_mse,'*');
xlabel('n'); ylabel('MSE');
grid on;
axis([0,50,0,0.02]);

The results are reported in Table I below.

TABLE I
MEAN-SQUARE ERROR BETWEEN THE PMF OF A POISSON WITH λ = 5 AND BINOMIALS WITH PARAMETERS (n, λ/n) FOR n = 6, 10, 20, 50. THE PROBABILITIES IN THE POISSON LESS THAN 0.015 ARE NEGLECTED.

  n      6         10        20        50
  MSE    0.01684   0.00168   0.00025   0.00003

The table above shows the MSE between the binomial and Poisson distributions for n = 6, 10, 20, 50. As we can see, as n increases, the MSE falls rapidly to zero, indicating that the distributions become more and more similar for larger n.
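As a small numerical companion to Part B (my addition, not part of the original solution), the identity E(X_P) = λ derived above can be checked by truncating the series; the truncation point 50 is an arbitrary choice and is harmless here because of the rapidly decaying tail.

% Numerical check of E(X_P) = lambda using a truncated version of the series.
lambda = 5;
k = 0:50;                                               % truncation point chosen arbitrarily
my_poisson_pdf = exp(-lambda)*lambda.^k./factorial(k);
disp(sum(my_poisson_pdf))      % approximately 1: the neglected tail is negligible
disp(sum(k.*my_poisson_pdf))   % approximately lambda = 5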

C. Binomial and Poisson distributions again.

The pmf of the binomial RV X_n with parameters (n, λ/n) converges to the pmf of the Poisson RV X_P as n → ∞, as shown below. With p = λ/n,

p_{X_n}(x) = [n! / (x!(n-x)!)] p^x (1-p)^{n-x}
           = [n! / (x!(n-x)!)] (λ/n)^x (1 - λ/n)^{n-x}
           = (λ^x / x!) [n(n-1)(n-2)···(n-x+1) / n^x] (1 - λ/n)^n (1 - λ/n)^{-x}
           = (λ^x / x!) [(1)(1 - 1/n)(1 - 2/n)···(1 - (x-1)/n)] (1 - λ/n)^n (1 - λ/n)^{-x}.

As n → ∞, the bracketed product tends to 1 and (1 - λ/n)^{-x} → 1. By definition,

lim_{n→∞} (1 - λ/n)^n = e^{-λ}.

Thus,

lim_{n→∞} p_{X_n}(x) = (λ^x / x!) e^{-λ} = p_{X_P}(x).

Please refer to pp. 32-33 of the slides titled "block probability review" for more information.

D. Binomial and normal distributions.

If Z is a RV with standard normal distribution, i.e. N(0,1), then according to the CLT,

Z := ( Σ_{i=1}^{n} u_i - nμ ) / (σ √n)

is approximately distributed as N(0,1), so X = Σ_{i=1}^{n} u_i has approximately a normal distribution with mean np and variance nσ², i.e. N(np, nσ²). (Note: here μ = p and σ² = p(1-p), so nσ² equals np(1-p).) The code follows:

close all; clear all; clc;
n_vector=[10 20 50];
p=0.5;
n_index=0;
for n=n_vector
    n_index=n_index+1;
    mean_normal=n*p;                    % defining the mean
    variance_normal=sqrt(n*p*(1-p));    % defining the standard deviation (sqrt of the variance)
    x=0:n;
    subplot(3,1,n_index);               % plotting the graph
    stairs(x,[(my_binomial_pdf(x,n,p))',(normpdf(x,mean_normal,variance_normal))'],...
        'LineWidth',2);
    title(['n = ',num2str(n)]);
    xlabel('x'); ylabel('pdf');
    legend('Binomial','Normal');
    grid on;
    axis([0,50,0,1]);
end

The resulting graph can be found in figure 5.
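To quantify how the approximation in Part D improves with n (again my addition, using my_binomial_pdf from Part A and the toolbox function normpdf), the following sketch reports the largest pointwise gap between the binomial(n, 0.5) pmf and the approximating normal density.

% Worst-case pointwise gap between the binomial(n,p) pmf and the density of
% N(n*p, n*p*(1-p)); the gap shrinks as n grows, in line with figure 5.
p = 0.5;
for n = [10 20 50]
    k = 0:n;
    normal_density = normpdf(k, n*p, sqrt(n*p*(1-p)));
    gap = max(abs(my_binomial_pdf(k,n,p) - normal_density));
    fprintf('n = %2d: max |binomial - normal| = %.4f\n', n, gap);
end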

E. Normal and Poisson approximations.

I provide you with a hint for this part: the Poisson limit theorem is about the accumulation of increasingly improbable events. In particular, note that for convergence of the distribution of a sum of i.i.d. Bernoulli RVs (which is a binomial RV) to a Poisson distribution with mean (= rate) λ, we needed the probability of 1 in the Bernoulli RV to be p = λ/n, and as n → ∞, this probability goes to zero. On the other hand, in the CLT, p is fixed. Hence, the CLT and the Poisson limit theorem are addressing basically different limits. I leave more contemplation on this matter to you!
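One way to make this hint concrete (an illustration of my own, relying on the toolbox functions binopdf and poisspdf) is to compare total-variation distances in the two regimes: with p = λ/n the binomial approaches the Poisson, while with p held fixed the distance to a mean-matched Poisson levels off instead of vanishing.

% Contrast of the two limiting regimes from the hint. The support is truncated
% at n; the omitted Poisson tail is negligible for these parameter values.
lambda = 5;
p_fixed = 0.3;                          % arbitrary fixed success probability
for n = [20 50 100]
    k = 0:n;
    % Poisson limit regime: p = lambda/n shrinks with n
    tv_poisson = 0.5*sum(abs(binopdf(k,n,lambda/n) - poisspdf(k,lambda)));
    % CLT regime: p fixed; compare with a Poisson matched to the mean n*p_fixed
    tv_fixed_p = 0.5*sum(abs(binopdf(k,n,p_fixed) - poisspdf(k,n*p_fixed)));
    fprintf('n = %3d: TV(p = lambda/n) = %.4f, TV(p fixed) = %.4f\n', ...
        n, tv_poisson, tv_fixed_p);
end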

Fig. 1. Binomial pdf for n = 6, 10, 20, 50 and p = 5/n. (Part A)

Fig. 2. Binomial cdf for n = 6, 10, 20, 50 and p = 5/n. (Part A)

Fig. 3. The pmf of the Poisson distribution with λ = 5. Note that the support of a Poisson random variable is the whole set {0, 1, 2, ...} up to ∞; however, the pmf for only the first 50 points is shown. Note the similarity with figure 1 for large n.

Fig. 4. Mean-square error between the pmf of a Poisson with λ = 5 and binomials with parameters (n, λ/n) for n = 6, 10, 20, 50. The probabilities in the Poisson less than 0.015 are neglected. As we see, by increasing n, the MSE rapidly vanishes. (Part B)

Fig. 5. The pdf of the binomial r.v. Y for different n's (n = 10, 20, 50) and its approximation by a normal. As we can see, by increasing n, the normal distribution provides an improved approximation of the binomial distribution, a fact which follows from the Central Limit Theorem. (Part D)