Friday's lecture: joint densities, marginal distributions, the distribution of a ratio, problems, problem solutions


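Friday's lecture
Joint densities
Marginal distributions
The distribution of a ratio
Problems

Problem solutions

1. $E(X) = \int x f_X(x)\,dx = \int x \int f_{X,Y}(x,y)\,dy\,dx = \iint x f_{X,Y}(x,y)\,dx\,dy$

2. $E(X) = \sum_{k=1}^{\infty} k\,p_X(k) = p_X(1) + 2p_X(2) + \dots = P(X > 0) + P(X > 1) + P(X > 2) + \dots = \sum_{k=0}^{\infty} P(X > k)$

3. $P(X < 1, Y < 1) = \int_0^1 \int_0^1 \lambda^2 e^{-\lambda(x+y)}\,dx\,dy = (1 - \exp(-1\lambda))^2$, and $(1 - e^{-1\lambda})^2 = .01$ gives $\lambda = .11$.

4. From 2, if X and Y are non-negative integer-valued, we have
$E(X) = \sum_{k=0}^{\infty} (1 - F_X(k))$ and $E(Y) = \sum_{k=0}^{\infty} (1 - F_Y(k))$,
so an inequality between $F_X(k)$ and $F_Y(k)$ that holds for every k gives the reverse inequality between E(X) and E(Y).
2 can be generalized for integer-valued random variables to
$E(X) = \sum_{k=0}^{\infty} P(X > k) - \sum_{k=-\infty}^{-1} P(X \le k)$
and almost the same argument applies.

In the continuous case, if $X \ge 0$, we have
$$\int_0^\infty (1 - F_X(x))\,dx = \int_0^\infty \int_x^\infty f_X(t)\,dt\,dx = \int_0^\infty f_X(t) \int_0^t dx\,dt = \int_0^\infty t f_X(t)\,dt = E(X)$$
and the same argument as in the first case applies. Finally, the second argument can similarly be extended to the continuous case.

5. (a) $P(S = j \mid N = k) = \binom{k}{j} \pi^j (1-\pi)^{k-j}$
(b) $P(S = j, N = k) = P(S = j \mid N = k)\,P(N = k) = \binom{k}{j} \pi^j (1-\pi)^{k-j} \binom{n}{k} p^k (1-p)^{n-k}$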

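(c) $P(S = j) = \sum_{k=j}^{n} P(S = j, N = k) = \sum_{k=j}^{n} \frac{n!}{j!\,(k-j)!\,(n-k)!}\,\pi^j (1-\pi)^{k-j} p^k (1-p)^{n-k}$
$= \binom{n}{j} (\pi p)^j (1-p)^{n-j} \sum_{k=j}^{n} \binom{n-j}{k-j} \left(\frac{(1-\pi)p}{1-p}\right)^{k-j} = \binom{n}{j} (\pi p)^j (1-p)^{n-j} \left(1 + \frac{(1-\pi)p}{1-p}\right)^{n-j} = \binom{n}{j} (\pi p)^j (1 - \pi p)^{n-j}$
(d) $P(N = k \mid S = j) = \frac{P(N = k, S = j)}{P(S = j)} = \binom{n-j}{k-j} \left(\frac{(1-\pi)p}{1-\pi p}\right)^{k-j} \left(\frac{1-p}{1-\pi p}\right)^{n-k}$

A conditional density
If (X,Y) has joint density $f_{X,Y}(x,y)$, we can define a conditional density of X, given that Y = y, by
$$f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}$$
We can then compute
$$P(X \in A \mid Y = y) = \int_{x \in A} f_{X|Y}(x \mid y)\,dx$$
even though the condition {Y = y} has probability 0.
Discrete case?

An example
Let $f_{X,Y}(x,y) = 2$, $x \ge 0$, $y \ge 0$, $x + y \le 1$. Find the conditional density of Y given that X = x.

Independence
Two random variables are independent if
$P(X \in A, Y \in B) = P(X \in A)\,P(Y \in B)$
In particular, this holds if
$p_{X,Y}(x,y) = p_X(x)\,p_Y(y)$
or
$f_{X,Y}(x,y) = f_X(x)\,f_Y(y)$
or
$f_{X|Y}(x \mid y) = f_X(x)$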
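The solution to problem 5 ends in terms in πp and 1 − πp, which is the form binomial thinning produces. The sketch below checks that numerically. It assumes N ~ Binomial(n, p) and S | N = k ~ Binomial(k, π); the symbols j, k, n and the parameter values are chosen here for illustration and are not taken from the lecture.

```python
from math import comb

def joint_pmf(j, k, n, p, pi):
    """P(S = j, N = k), assuming N ~ Binomial(n, p) and S | N = k ~ Binomial(k, pi)."""
    if not 0 <= j <= k <= n:
        return 0.0
    return (comb(k, j) * pi**j * (1 - pi)**(k - j)
            * comb(n, k) * p**k * (1 - p)**(n - k))

def marginal_S(j, n, p, pi):
    """P(S = j), obtained by summing the joint pmf over k = j, ..., n."""
    return sum(joint_pmf(j, k, n, p, pi) for k in range(j, n + 1))

def binom_pmf(x, size, prob):
    """Binomial(size, prob) pmf at x."""
    return comb(size, x) * prob**x * (1 - prob)**(size - x)

n, p, pi = 8, 0.6, 0.3   # illustrative values only

# (c): the marginal of S should match Binomial(n, pi * p)
max_gap_marginal = max(
    abs(marginal_S(j, n, p, pi) - binom_pmf(j, n, pi * p))
    for j in range(n + 1)
)

# (d): given S = j, N - j should match Binomial(n - j, (1 - pi) p / (1 - pi p))
j = 2
q = (1 - pi) * p / (1 - pi * p)
max_gap_conditional = max(
    abs(joint_pmf(j, k, n, p, pi) / marginal_S(j, n, p, pi)
        - binom_pmf(k - j, n - j, q))
    for k in range(j, n + 1)
)

print(max_gap_marginal, max_gap_conditional)
```

Both gaps should be at the level of floating-point rounding if the closed forms in (c) and (d) are right.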

The addition rule for expectations
E(X+Y) = E(X) + E(Y)
NOTE: No assumption of independence. This result holds whenever the expectations exist.
A special case: E(aX + b) = a E(X) + b

The addition rule for variances
$\mathrm{Var}(X+Y) = E((X+Y)^2) - (E(X+Y))^2$
$= E(X^2) + 2E(XY) + E(Y^2) - (E(X))^2 - 2E(X)E(Y) - (E(Y))^2$
$= \mathrm{Var}(X) + \mathrm{Var}(Y) + 2(E(XY) - E(X)E(Y))$
If X and Y are independent,
$$E(XY) = \iint xy\,f_X(x) f_Y(y)\,dx\,dy = \int x f_X(x)\,dx \int y f_Y(y)\,dy = E(X)E(Y)$$
so Var(X+Y) = Var(X) + Var(Y)

Covariance
The covariance of X and Y is defined as
$\mathrm{Cov}(X,Y) = E\{(X - E(X))(Y - E(Y))\}$
$= E(XY) - E(X)E(Y) - E(X)E(Y) + E(X)E(Y)$
$= E(XY) - E(X)E(Y)$
If X and Y are independent, Cov(X,Y) = 0.
Cov(X,X) =
Var(X+Y) = Var(X) + Var(Y) + 2 Cov(X,Y)
Var(X − Y) =
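The identity Var(X+Y) = Var(X) + Var(Y) + 2 Cov(X,Y) can be checked exactly on any small joint distribution. The sketch below uses a made-up joint pmf (the four probabilities are hypothetical, chosen only so that X and Y are correlated) and exact rational arithmetic.

```python
from fractions import Fraction as F

# Hypothetical joint pmf of (X, Y); the values are for illustration only.
pmf = {(0, 0): F(1, 10), (0, 1): F(2, 10), (1, 0): F(3, 10), (1, 2): F(4, 10)}

def E(g):
    """Expectation of g(X, Y) under the joint pmf."""
    return sum(prob * g(x, y) for (x, y), prob in pmf.items())

def var(g):
    """Variance of g(X, Y), computed as E(g^2) - (E g)^2."""
    return E(lambda x, y: g(x, y) ** 2) - E(g) ** 2

X = lambda x, y: x
Y = lambda x, y: y

cov_xy = E(lambda x, y: x * y) - E(X) * E(Y)
lhs = var(lambda x, y: x + y)            # Var(X + Y)
rhs = var(X) + var(Y) + 2 * cov_xy       # Var(X) + Var(Y) + 2 Cov(X, Y)

print(lhs, rhs, cov_xy)
```

Because the covariance here is nonzero, the check also shows that dropping the 2 Cov(X,Y) term would give the wrong variance.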

Correlation
Units of covariance is product of units of X and Y. Sometimes one wants a unit-free quantity. To do that we standardize X and Y:
$$X^* = \frac{X - E(X)}{\sqrt{\mathrm{Var}(X)}}, \qquad Y^* = \frac{Y - E(Y)}{\sqrt{\mathrm{Var}(Y)}}$$
Define the correlation coefficient
$$\rho(X,Y) = \mathrm{Cov}(X^*, Y^*) = \frac{\mathrm{Cov}(X,Y)}{\sigma_X \sigma_Y}$$
where $\sigma_X = \sqrt{\mathrm{Var}(X)}$.

Properties of correlation coefficient
$|\rho(X,Y)| \le 1$
If $|\rho(X,Y)| = 1$ then Y = aX + b

Last Monday's lecture
Conditional distribution and density
Independent random variables
The addition rules for expected value and variance
Covariance and correlation

An example
Let X be −1, 0, or 1 with equal probabilities 1/3.
E(X) =
Let Y = 1 if X = 0, 0 otherwise.
E(Y) =
XY =
E(XY) =
Cov(X,Y) = E(XY) − E(X)E(Y) =
Are X and Y independent?
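The blanks in the example can be filled in by direct enumeration over the three values of X. This is a minimal sketch; the variable names are chosen here for illustration.

```python
from fractions import Fraction as F

values = [-1, 0, 1]          # possible values of X
prob = F(1, 3)               # each value has probability 1/3

def y_of(x):
    """Y = 1 if X = 0, and 0 otherwise."""
    return 1 if x == 0 else 0

EX = sum(prob * x for x in values)
EY = sum(prob * y_of(x) for x in values)
EXY = sum(prob * x * y_of(x) for x in values)
cov = EXY - EX * EY

# Independence would require P(X = 0, Y = 1) = P(X = 0) P(Y = 1).
p_joint = prob               # X = 0 forces Y = 1, so the joint probability is P(X = 0)
p_product = prob * EY        # P(X = 0) * P(Y = 1), since E(Y) = P(Y = 1)
independent = (p_joint == p_product)

print(EX, EY, EXY, cov, independent)
```

The covariance comes out as zero even though Y is a function of X, so zero covariance does not imply independence.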

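Calculating covariance and correlation
$f_{X,Y}(x,y) = 2$, $0 < x < y < 1$
$f_X(x) = 2(1 - x)$, $0 < x < 1$
$f_Y(y) = 2y$, $0 < y < 1$
$E(X) = \int_0^1 x \cdot 2(1-x)\,dx = \frac{1}{3}$
$E(Y) = \int_0^1 y \cdot 2y\,dy = \frac{2}{3}$
$E(XY) = \int_0^1 \int_0^y xy \cdot 2\,dx\,dy = \frac{1}{4}$
$\mathrm{Cov}(X,Y) = \frac{1}{4} - \frac{1}{3} \cdot \frac{2}{3} = \frac{1}{36}$
$\mathrm{Var}(X) = \int_0^1 x^2 \cdot 2(1-x)\,dx - \left(\frac{1}{3}\right)^2 = \frac{1}{18}$
$\mathrm{Var}(Y) = \int_0^1 y^2 \cdot 2y\,dy - \left(\frac{2}{3}\right)^2 = \frac{1}{18}$
$\mathrm{Corr}(X,Y) = \frac{1/36}{1/18} = \frac{1}{2}$

Law of large numbers
Let $X_1, X_2, \dots$ be independent random variables, all with the same distribution having expected value µ and variance σ². Then
$$P\left(\left|\frac{1}{n}\sum_{i=1}^{n} X_i - \mu\right| > \varepsilon\right) \to 0 \quad \text{as } n \to \infty$$
$E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) =$
$\mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) =$
We write $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$, the sample average.

Estimation
I have studied minimum annual temperatures in Karlstad, Sweden. It has been suggested that
$$F_X(x; \mu, \sigma, \xi) = \exp\left(-\left(1 + \xi\,\frac{x - \mu}{\sigma}\right)^{-1/\xi}\right)$$
If we knew the parameters ξ, μ and σ, we could draw a histogram of the data and plot the corresponding density to see if it is a good fit.

[Figure: Karlstad data. Histogram with x-axis "Minimum temperature" and y-axis "Density", overlaid with two fitted densities labelled µ,σ,ξ = 22,5,-.25 and µ,σ,ξ = 2.4,4.9,-.23.]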
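The covariance and correlation calculation for the density f(x,y) = 2 on 0 < x < y < 1 can be checked by simulation. The sketch below relies on the fact that the minimum and maximum of two independent U(0,1) variables have exactly this joint density. The seed and sample size are arbitrary choices.

```python
import random

random.seed(0)
n = 200_000

xs, ys = [], []
for _ in range(n):
    u, v = random.random(), random.random()
    xs.append(min(u, v))   # X: smaller of the two uniforms
    ys.append(max(u, v))   # Y: larger of the two, so 0 < X < Y < 1

def mean(a):
    return sum(a) / len(a)

mx, my = mean(xs), mean(ys)
cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
vx = mean([(x - mx) ** 2 for x in xs])
vy = mean([(y - my) ** 2 for y in ys])
corr = cov / (vx * vy) ** 0.5

print(mx, my, cov, vx, vy, corr)
```

The sample values should be close to 1/3, 2/3, 1/36, 1/18, 1/18 and 1/2 respectively, up to Monte Carlo error.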

Monday's lecture
Concepts
Covariance and correlation
Law of large numbers
Estimation

Sampling distribution
Since $\hat\theta(X_1,\dots,X_n)$ is a random variable, we can compute its sampling distribution cdf
$P(\hat\theta(X_1,\dots,X_n) \le x)$
and other properties such as
$E_\theta(\hat\theta(X_1,\dots,X_n)) = \int \hat\theta(x_1,\dots,x_n)\,f_X(x_1,\dots,x_n;\theta)\,dx_1 \dots dx_n$
$\mathrm{bias}(\hat\theta,\theta) = E_\theta(\hat\theta) - \theta$
$\mathrm{Var}_\theta(\hat\theta)$
$\mathrm{se}(\hat\theta) = \mathrm{sd}_\theta(\hat\theta) = h(\theta)$
$\mathrm{ese}(\hat\theta) = h(\hat\theta)$

What estimation is all about
In 1918 R. A. Fisher proposed estimating parameters by considering "how likely are the data if θ is the true parameter?"
Choosing the parameter that makes the observations most likely is formalized using the likelihood function
$$L(\theta) = \begin{cases} f_{X_1,\dots,X_n}(x_1,\dots,x_n;\theta), & \text{continuous case} \\ p_{X_1,\dots,X_n}(x_1,\dots,x_n;\theta), & \text{discrete case} \end{cases}$$
The data are fixed! The parameter is varying!
[Photo: R. A. Fisher, 1890-1962]

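The exponential case
Consider $x_1,\dots,x_n$ independent observations from an exponential distribution with parameter λ.
The likelihood is
$$L(\lambda) = \prod_{i=1}^{n} \lambda \exp(-\lambda x_i) = \lambda^n \exp\left(-\lambda \sum_{i=1}^{n} x_i\right)$$
[Figure: the likelihood plotted against lambda.]

The method of maximum likelihood
Define the mle
$\hat\theta = \arg\max(L(\theta))$
We compute it by setting $L'(\theta) = 0$ and checking that $L''(\hat\theta) < 0$, or that $L'$ has sign change + − about the maximum. Alternatively, plot $L(\theta)$ as a function of θ and find the maximum numerically.
Computational trick: maximize the log likelihood $\ell(\theta) = \log(L(\theta))$

Exponential case
$L(\lambda) = \lambda^n e^{-\lambda \sum x_i}$
$L'(\lambda) = n\lambda^{n-1} e^{-\lambda \sum x_i} - \lambda^n \left(\sum x_i\right) e^{-\lambda \sum x_i} = \lambda^{n-1} e^{-\lambda \sum x_i} \left(n - \lambda \sum x_i\right)$
$\ell(\lambda) = n \log(\lambda) - \lambda \sum x_i$
$\ell'(\lambda) = \frac{n}{\lambda} - \sum x_i = 0 \Rightarrow \hat\lambda = 1/\bar{x}$
$\ell''(\lambda) = -\frac{n}{\lambda^2} < 0$

Binomial case
$L(p) = \prod_{i=1}^{n} \binom{N}{x_i} p^{x_i} (1-p)^{N - x_i} = \text{const} \cdot p^{\sum x_i} (1-p)^{nN - \sum x_i}$
$\ell(p) = \text{const} + \log\left(\frac{p}{1-p}\right) \sum x_i + nN \log(1-p)$
$\ell'(p) = \frac{\sum_{i=1}^{n} x_i}{p(1-p)} - \frac{nN}{1-p} = 0 \Rightarrow \hat{p} = \frac{\bar{x}}{N}$
$\ell'(p) = \frac{nN(\hat{p} - p)}{p(1-p)}$, so the sign change is + −.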
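The closed form for the exponential case can be compared with the "find the maximum numerically" route mentioned above. The sketch below simulates data (the rate 2.0, the sample size and the grid are arbitrary choices, not values from the lecture), evaluates the log likelihood on a grid and compares the grid maximizer with 1/x̄.

```python
import math
import random

random.seed(1)
true_rate = 2.0                                   # hypothetical rate used only to generate data
x = [random.expovariate(true_rate) for _ in range(500)]
n, total = len(x), sum(x)

def loglik(lam):
    """Exponential log likelihood: n log(lambda) - lambda * sum(x)."""
    return n * math.log(lam) - lam * total

lam_hat = n / total                               # closed form: 1 / x-bar
score_at_hat = n / lam_hat - total                # l'(lambda) evaluated at the mle

grid = [0.01 * k for k in range(1, 1001)]         # lambda from 0.01 to 10.00
lam_grid = max(grid, key=loglik)                  # numerical maximizer on the grid

print(lam_hat, lam_grid, score_at_hat)
```

Since the log likelihood is concave, the grid maximizer should sit within one grid step of 1/x̄, and the derivative at 1/x̄ should be zero up to rounding.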

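Friday's lecture
The method of maximum likelihood
Computational tools
Checking for a maximum
Problems

Problem solutions

1. Note misprint in problem:
$E(X \mid Y = y) = \int x f_{X|Y}(x \mid y)\,dx$
(a) $E(E(X \mid Y)) = \int \left[\int x\,\frac{f_{X,Y}(x,y)}{f_Y(y)}\,dx\right] f_Y(y)\,dy = \iint x f_{X,Y}(x,y)\,dx\,dy = E(X)$
(b) $\mathrm{Var}(X \mid Y = y) = E(X^2 \mid Y = y) - [E(X \mid Y = y)]^2$
As in (a), $E(E(X^2 \mid Y = y)) = E(X^2)$. Let $Z = E(X \mid Y = y)$, so
$\mathrm{Var}(X) = E(\mathrm{Var}(X \mid Y = y)) + E(Z^2) - [E(X)]^2$
But $E(X) = E(E(X \mid Y = y)) = E(Z)$, so $[E(X)]^2 = [E(Z)]^2$, whence
$\mathrm{Var}(X) = E(\mathrm{Var}(X \mid Y = y)) + E(Z^2) - [E(Z)]^2 = E(\mathrm{Var}(X \mid Y = y)) + \mathrm{Var}(E(X \mid Y = y))$
(c) $E(Y) = E(E(Y \mid X = k)) = E(X \times .1) = 10 \times .1 = 1$

2. Cov(U,V) = Cov(X+Y, Y+Z) = Cov(X,Y) + Cov(X,Z) + Cov(Y,Y) + Cov(Y,Z) = Var(Y) = 144
Var(U) = Var(X) + Var(Y) + 2 Cov(X,Y) = 169
Var(V) = Var(Y) + Var(Z) + 2 Cov(Y,Z) = 180
$\mathrm{Corr}(U,V) = \frac{144}{\sqrt{169 \times 180}} = .83$

3. (a) $E(\hat\mu) = E\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i E(X_i) = \mu \sum_{i=1}^{n} a_i = \mu$
(b) $\mathrm{Var}(\hat\mu) = \sum_{i=1}^{n} a_i^2\,\mathrm{Var}(X_i) = \sigma^2 \sum_{i=1}^{n} a_i^2$
(c) By symmetry the weights ought to be equal, in which case they each have to be 1/n. This is indeed optimal, since
$$\sum_{i=1}^{n} \left(a_i - \frac{1}{n}\right)^2 = \sum_{i=1}^{n} a_i^2 - \frac{2}{n} \sum_{i=1}^{n} a_i + \frac{1}{n} = \sum_{i=1}^{n} a_i^2 - \frac{1}{n}$$
so
$$\sum_{i=1}^{n} a_i^2 = \sum_{i=1}^{n} \left(a_i - \frac{1}{n}\right)^2 + \frac{1}{n} \ge \frac{1}{n}$$
with equality if and only if each $a_i = 1/n$.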
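The argument in 3(c) can be illustrated numerically: for any weights that sum to 1, the sum of squares equals the squared distance from the equal weights plus 1/n, so no such weights can beat 1/n each. In the sketch below, n = 5, σ² = 1 and the randomly drawn weight vectors are arbitrary choices made for the illustration.

```python
import random

random.seed(2)
n = 5
sigma2 = 1.0     # common variance; its value does not affect the comparison

def variance_of_weighted_mean(a):
    """Var(sum a_i X_i) = sigma^2 * sum a_i^2 for independent X_i with equal variance."""
    return sigma2 * sum(w * w for w in a)

equal = [1 / n] * n
best = variance_of_weighted_mean(equal)          # sigma^2 / n

identity_gaps = []
never_better = True
for _ in range(1000):
    raw = [random.random() for _ in range(n)]
    s = sum(raw)
    a = [w / s for w in raw]                     # random weights normalised to sum to 1
    excess = sum((w - 1 / n) ** 2 for w in a)
    # identity from the solution: sum a_i^2 = sum (a_i - 1/n)^2 + 1/n
    identity_gaps.append(abs(sum(w * w for w in a) - (excess + 1 / n)))
    if variance_of_weighted_mean(a) < best - 1e-12:
        never_better = False

print(best, max(identity_gaps), never_better)
```

Random search proves nothing on its own; it only confirms the algebraic identity that carries the proof.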

4. Cov(X, aX+b) = a Cov(X,X) = a Var(X)
Var(aX+b) = a² Var(X), so Corr(X, aX+b) = a/|a|. Conversely, if Corr(X,Y) = 1 write
Var(Y* − X*) = Var(Y*) + Var(X*) − 2 = 0
so Y* − X* = c, or
$$\frac{Y - E(Y)}{\sigma_Y} = \frac{X - E(X)}{\sigma_X} + c$$
so Y = aX + b where
$$a = \frac{\sigma_Y}{\sigma_X}, \qquad b = E(Y) + \sigma_Y\left(c - \frac{E(X)}{\sigma_X}\right)$$
The case Corr(X,Y) = −1 is similar, starting from Var(Y* + X*) = 0.

5. Let X and Y be the respective arrival times. They are independent, U(9,10). We need to compute P(|X − Y| > 10) = 1 − P(|X − Y| ≤ 10), with the difference measured in minutes (1/6 of an hour).
The joint distribution is uniform on the square with corners (9,9), (9,10), (10,9) and (10,10). The probability we want therefore is the area of the two corner triangles where |X − Y| exceeds 1/6 hour.
[Figure: the unit square with the two corner triangles.]
This area is clearly (5/6)² = 25/36.

HOV lane needed?
The following data are for passenger car occupancy during one hour at Wilshire and Bundy in Los Angeles:

Occupants   Frequency   Predicted
1           678
2           227
3           56
4           28
5           8
6+          14

The geometric distribution $p_Y(y) = p(1-p)^{y-1}$ is a reasonable fit. The log likelihood is
$$l(p) = \sum_{i=1}^{n} \left(\log(p) + (y_i - 1)\log(1-p)\right)$$
whence
$$l'(p) = \frac{n}{p} - \frac{\sum_{i=1}^{n}(y_i - 1)}{1-p} = 0 \Rightarrow \hat{p} = \frac{1}{\bar{y}}$$

HOV, cont.
But we do not have detailed data on the 6+ group. However,
$$P(Y \ge 6) = \sum_{k=6}^{\infty} p(1-p)^{k-1} = p(1-p)^5 \sum_{k=0}^{\infty} (1-p)^k = (1-p)^5$$
so the log likelihood, the log probability of what we actually observed, becomes
$$l(p) = \sum_{\{i:\,y_i < 6\}} \left(\log(p) + (y_i - 1)\log(1-p)\right) + \sum_{\{i:\,y_i \ge 6\}} 5\log(1-p)$$
$= (1011 - 14)\log(p) + 455\log(1-p) + 14 \times 5\log(1-p)$
$= 997\log(p) + 525\log(1-p)$
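The final log likelihood, 997 log(p) + 525 log(1 − p), can be maximized directly. Setting the derivative 997/p − 525/(1 − p) to zero gives p̂ = 997/(997 + 525). The sketch below computes that value, cross-checks it on a grid and derives fitted frequencies for the occupancy classes, taking n = 997 + 14 = 1011 cars. The grid resolution and variable names are choices made here, and the fitted frequencies are an illustration rather than values from the lecture.

```python
import math

a, b = 997, 525                      # coefficients of log(p) and log(1 - p) in l(p)

def loglik(p):
    """Censored geometric log likelihood l(p) = a log(p) + b log(1 - p)."""
    return a * math.log(p) + b * math.log(1 - p)

p_hat = a / (a + b)                  # root of l'(p) = a/p - b/(1 - p)

grid = [k / 10000 for k in range(1, 10000)]
p_grid = max(grid, key=loglik)       # numerical maximizer, for comparison

n = a + 14                           # 997 cars with y < 6 plus 14 cars in the 6+ group
predicted = {y: n * p_hat * (1 - p_hat) ** (y - 1) for y in range(1, 6)}
predicted["6+"] = n * (1 - p_hat) ** 5   # P(Y >= 6) = (1 - p)^5

print(round(p_hat, 4), p_grid)
for occupants, freq in predicted.items():
    print(occupants, round(freq, 1))
```

The fitted frequencies add up to n by construction, because the six class probabilities sum to one.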

The uniform distribution
Let $X_1,\dots,X_n$ be iid U(0,θ). Then
$L(\theta) = \theta^{-n}$, so
$l(\theta) = -n\log(\theta)$ and
$l'(\theta) = -n/\theta$
What is the mle?

Reparametrization
A drug reaction surveillance program was carried out in 9 hospitals. Out of 11,526 monitored patients, 3,24 had an adverse reaction.
Model: X = # adverse reactions ~
Log likelihood:
Standard error of the mle:
The bootstrap method just plugs the estimate of p into the formula for the standard error:
But the standard error is a function of p. What is its mle?
Fact: The mle of $h(\theta)$ is $h(\hat\theta)$.

Stopping
Flip a coin until the first heads. Suppose it takes 6 tries. The likelihood is
Now suppose we were going to flip the coin 6 times, and happened to get one head. The likelihood is
How are the mles different?
Fact: Changing the likelihood by a constant does not change the mle.
However, the standard errors would be different.
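The stopping example can be checked numerically. Writing the first likelihood as geometric, (1 − p)⁵ p, and the second as binomial, 6 p (1 − p)⁵, is an assumption made here, since the slide leaves both blank. The two differ only by the constant factor 6, so they should have the same maximizer. The grid is an arbitrary choice that happens to contain 1/6.

```python
import math

def geometric_lik(p):
    """Likelihood when the first head arrives on flip 6: five tails, then a head."""
    return (1 - p) ** 5 * p

def binomial_lik(p):
    """Likelihood when 6 flips were planned and exactly one head was seen."""
    return math.comb(6, 1) * p * (1 - p) ** 5

grid = [k / 600 for k in range(1, 600)]          # contains 100/600 = 1/6
p_geo = max(grid, key=geometric_lik)
p_bin = max(grid, key=binomial_lik)

# The ratio of the two likelihoods is the same constant at every p.
ratios = [binomial_lik(p) / geometric_lik(p) for p in grid]

print(p_geo, p_bin, min(ratios), max(ratios))
```

Both maximizers land on 1/6, which illustrates the stated fact that multiplying the likelihood by a constant leaves the mle unchanged.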