Mathematical statistics


October 18th, 2018. Lecture 16: Midterm review

Countdown to mid-term exam: 7 days

Week 1: Chapter 1: Probability review
Week 2: Chapter 6: Statistics
Week 4: Chapter 7: Point Estimation
Week 7: Chapter 8: Confidence Intervals
Week 10: Chapter 9: Test of Hypothesis
Week 14: Regression

Chapter 6: Summary

Chapter 6
6.1 Statistics and their distributions
6.2 The distribution of the sample mean
6.3 The distribution of a linear combination

Random sample

Definition. The random variables $X_1, X_2, \dots, X_n$ are said to form a (simple) random sample of size $n$ if
1. the $X_i$'s are independent random variables;
2. every $X_i$ has the same probability distribution.

Section 6.1: Sampling distributions
1. If the distribution and the statistic $T$ are simple, try to construct the pmf of the statistic directly.
2. If the probability density function $f_X(x)$ of the $X_i$'s is known, then try to represent/compute the cumulative distribution function (cdf) of $T$, $P[T \le t]$, and take its derivative with respect to $t$ to obtain the pdf.

Section 6.3: Linear combination of normal random variables

Theorem. Let $X_1, X_2, \dots, X_n$ be independent normal random variables (with possibly different means and/or variances). Then $T = a_1 X_1 + a_2 X_2 + \dots + a_n X_n$ also follows a normal distribution.

Section 6.3: Computations with normal random variables

Section 6.3: Linear combination of random variables

Theorem. Let $X_1, X_2, \dots, X_n$ be independent random variables (with possibly different means and/or variances) and define $T = a_1 X_1 + a_2 X_2 + \dots + a_n X_n$. Then the mean and the variance of $T$ can be computed by
$$E(T) = a_1 E(X_1) + a_2 E(X_2) + \dots + a_n E(X_n)$$
$$\sigma_T^2 = a_1^2 \sigma_{X_1}^2 + a_2^2 \sigma_{X_2}^2 + \dots + a_n^2 \sigma_{X_n}^2$$

Section 6.2: Distribution of the sample mean

Theorem. Let $X_1, X_2, \dots, X_n$ be a random sample from a distribution with mean $\mu$ and variance $\sigma^2$. Then, in the limit as $n \to \infty$, the standardized version of $\bar{X}$ has the standard normal distribution:
$$\lim_{n \to \infty} P\left( \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z \right) = P[Z \le z] = \Phi(z)$$
Rule of Thumb: if $n > 30$, the Central Limit Theorem can be used for computation.
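As a quick numerical sanity check of this rule of thumb, here is a minimal Python sketch (the exponential(1) population, sample size, and replication count are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 30, 100_000
mu, sigma = 1.0, 1.0   # the exponential(1) distribution has mean 1 and sd 1

# Draw many samples of size n and standardize each sample mean.
samples = rng.exponential(scale=1.0, size=(reps, n))
z = (samples.mean(axis=1) - mu) / (sigma / np.sqrt(n))

# Under the CLT approximation, P(Z <= 1) should be close to Phi(1) ~ 0.8413.
print(np.mean(z <= 1.0))
```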

Practice problems

Example 1*

Problem. Consider the distribution $P$:

x    | 40  | 45  | 50
p(x) | 0.2 | 0.3 | 0.5

Let $\{X_1, X_2\}$ be a random sample of size 2 from $P$, and $T = X_1 + X_2$.
1. Derive the probability mass function of $T$.
2. Compute the expected value and the standard deviation of $T$.

Question: If $T = X_1 - X_2$, can you still solve the problem?
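To check the answer, one can enumerate all pairs of outcomes by machine. A minimal Python sketch (variable names are illustrative only):

```python
from itertools import product

# pmf of X: P(X = 40) = 0.2, P(X = 45) = 0.3, P(X = 50) = 0.5
p = {40: 0.2, 45: 0.3, 50: 0.5}

# pmf of T = X1 + X2, built by enumerating the 9 possible pairs
pT = {}
for (x1, p1), (x2, p2) in product(p.items(), repeat=2):
    pT[x1 + x2] = pT.get(x1 + x2, 0.0) + p1 * p2

ET = sum(t * q for t, q in pT.items())
VT = sum((t - ET) ** 2 * q for t, q in pT.items())
print(pT)             # {80: 0.04, 85: 0.12, 90: 0.29, 95: 0.30, 100: 0.25}
print(ET, VT ** 0.5)  # E(T) = 93.0, SD(T) ~ 5.523
```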

Example 2*

Problem. Let $\{X_1, X_2\}$ be a random sample of size 2 from the exponential distribution with parameter $\lambda$,
$$f(x) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases}$$
and $T = X_1 + X_2$.
1. Compute the cumulative distribution function (cdf) of $T$.
2. Compute the probability density function (pdf) of $T$.

Question: If $T = X_1 + 2X_2$, can you still solve the problem?
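For reference, the sum of two i.i.d. exponential($\lambda$) variables follows a Gamma distribution with shape 2 and rate $\lambda$, so the cdf has the closed form $F_T(t) = 1 - e^{-\lambda t}(1 + \lambda t)$ for $t \ge 0$. A short Python sketch cross-checking this (the values of $\lambda$ and $t$ are arbitrary):

```python
import numpy as np
from scipy import stats

lam, t = 2.0, 1.5                   # arbitrary values for illustration
rng = np.random.default_rng(0)

# Closed form: for t >= 0, F_T(t) = 1 - exp(-lam*t) * (1 + lam*t)
closed_form = 1 - np.exp(-lam * t) * (1 + lam * t)

# Cross-checks: scipy's Gamma cdf and a direct Monte Carlo estimate
gamma_cdf = stats.gamma.cdf(t, a=2, scale=1 / lam)
mc = np.mean(rng.exponential(1 / lam, (100_000, 2)).sum(axis=1) <= t)

print(closed_form, gamma_cdf, mc)   # all three agree (~0.801)
```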

Example 3

Problem. Two airplanes are flying in the same direction in adjacent parallel corridors. At time $t = 0$, the first airplane is 10 km ahead of the second one. Suppose the speed of the first plane (km/h) is normally distributed with mean 520 and standard deviation 10, and the second plane's speed, independent of the first, is also normally distributed with mean and standard deviation 500 and 10, respectively. What is the probability that after 2 h of flying, the second plane has not caught up to the first plane?
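One way to organize the solution: the gap after 2 hours is $G = 10 + 2V_1 - 2V_2$ km, where $V_1, V_2$ are the two speeds, so by the linear-combination theorem $G$ is normal. A short Python sketch of the computation:

```python
from math import sqrt
from scipy import stats

# G = 10 + 2*V1 - 2*V2, with V1 ~ N(520, 10^2) and V2 ~ N(500, 10^2) independent,
# so G ~ N(10 + 2*520 - 2*500, 2^2*10^2 + 2^2*10^2) = N(50, 800).
mean_G = 10 + 2 * 520 - 2 * 500     # 50 km
sd_G = sqrt(4 * 100 + 4 * 100)      # sqrt(800) ~ 28.28 km

# The second plane has not caught up exactly when G > 0.
print(1 - stats.norm.cdf(0, loc=mean_G, scale=sd_G))   # ~0.9615
```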

Example

Problem. The tip percentage at a restaurant has a mean value of 18% and a standard deviation of 6%. What is the approximate probability that the sample mean tip percentage for a random sample of 40 bills is between 16% and 19%?
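A sketch of the CLT computation in Python (scipy's norm.cdf plays the role of $\Phi$):

```python
from math import sqrt
from scipy import stats

mu, sigma, n = 18, 6, 40
se = sigma / sqrt(n)          # standard error of the sample mean, ~0.949

# CLT: since n = 40 > 30, the sample mean is approximately N(mu, se^2)
p = stats.norm.cdf(19, mu, se) - stats.norm.cdf(16, mu, se)
print(p)                      # ~0.837
```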

Example

Problem. The time that it takes a randomly selected rat of a certain subspecies to find its way through a maze is a normally distributed random variable with mean $\mu = 1.5$ minutes and standard deviation $\sigma = 0.35$ minutes. Suppose five rats are selected. Let $X_1, X_2, \dots, X_5$ denote their times in the maze. Assuming the $X_i$'s to be a random sample from this normal distribution, what is the probability that the total time for the five rats, $T = X_1 + X_2 + X_3 + X_4 + X_5$, is between 6 and 8 minutes?

Example

Problem. Let $X_1, X_2, \dots, X_n$ be a random sample from a normal distribution with mean 2.65 and standard deviation 0.85.
1. If $n = 25$, compute $P[\bar{X} \le 3]$.
2. Find $n$ such that $P[\bar{X} \le 3] \ge 0.95$.
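A sketch of both parts in Python; part 2 inverts the standardized inequality with norm.ppf:

```python
from math import ceil, sqrt
from scipy import stats

mu, sigma = 2.65, 0.85

# Part 1: with n = 25, the sample mean is N(2.65, (0.85/5)^2)
print(stats.norm.cdf(3, mu, sigma / sqrt(25)))   # ~0.980

# Part 2: P[sample mean <= 3] >= 0.95 requires (3 - mu)/(sigma/sqrt(n)) >= z_{0.95}
z = stats.norm.ppf(0.95)                         # ~1.645
n = ceil((z * sigma / (3 - mu)) ** 2)
print(n)                                         # 16
```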

Chapter 7: Summary

Overview

7.1 Point estimates
- unbiased estimators
- mean squared error

7.2 Methods of point estimation
- method of moments
- method of maximum likelihood

Point estimate

Definition. A point estimate $\hat{\theta}$ of a parameter $\theta$ is a single number that can be regarded as a sensible value for $\theta$.

population parameter $\Rightarrow$ sample $\Rightarrow$ estimate
$\theta$ $\Rightarrow$ $X_1, X_2, \dots, X_n$ $\Rightarrow$ $\hat{\theta}$

Mean Squared Error

Measuring the error of estimation: $\hat{\theta} - \theta$ or $(\hat{\theta} - \theta)^2$. The error of estimation is random.

Definition. The mean squared error of an estimator $\hat{\theta}$ is $E[(\hat{\theta} - \theta)^2]$.

Bias-variance decomposition

Theorem.
$$MSE(\hat{\theta}) = E[(\hat{\theta} - \theta)^2] = V(\hat{\theta}) + \left( E(\hat{\theta}) - \theta \right)^2$$

Bias-variance decomposition: mean squared error = variance of estimator + (bias)$^2$.
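A small simulation illustrating the decomposition, using the biased divide-by-$n$ variance estimator of a standard normal sample as the estimator (sample size and replication count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps, sigma2 = 10, 200_000, 1.0

# Biased variance estimator: divides by n instead of n - 1 (np.var uses ddof=0)
x = rng.normal(0.0, 1.0, (reps, n))
est = x.var(axis=1)

mse = np.mean((est - sigma2) ** 2)       # direct Monte Carlo estimate of the MSE
var_plus_bias2 = est.var() + (est.mean() - sigma2) ** 2
print(mse, var_plus_bias2)               # the two quantities agree (~0.19)
```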

Unbiased estimators

Definition. A point estimator $\hat{\theta}$ is said to be an unbiased estimator of $\theta$ if $E(\hat{\theta}) = \theta$ for every possible value of $\theta$.

For an unbiased estimator: bias $= 0$, so the mean squared error equals the variance of the estimator.

Example 1

Problem. Consider a random sample $X_1, \dots, X_n$ from the pdf
$$f(x) = \frac{1 + \theta x}{2}, \quad -1 \le x \le 1$$
Show that $\hat{\theta} = 3\bar{X}$ is an unbiased estimator of $\theta$. (Hint: $E(X) = \int_{-1}^{1} x \cdot \frac{1 + \theta x}{2}\, dx = \theta/3$.)

Method of moments: ideas

Let $X_1, \dots, X_n$ be a random sample from a distribution with pmf or pdf $f(x; \theta_1, \theta_2, \dots, \theta_m)$. Assume that for $k = 1, \dots, m$:
$$\frac{X_1^k + X_2^k + \dots + X_n^k}{n} = E(X^k)$$
Solve the system of equations for $\theta_1, \theta_2, \dots, \theta_m$.

Method of moments: Example 4

Problem. Suppose that for a parameter $0 \le \theta \le 1$, $X$ is the outcome of the roll of a four-sided tetrahedral die with

x    | 1    | 2   | 3        | 4
p(x) | 3θ/4 | θ/4 | 3(1-θ)/4 | (1-θ)/4

Suppose the die is rolled 10 times with outcomes 4, 1, 2, 3, 1, 2, 3, 4, 2, 3. Use the method of moments to obtain an estimator of $\theta$.
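Setting the first population moment equal to the sample mean gives a closed form for the estimate; a short Python sketch:

```python
import numpy as np

rolls = np.array([4, 1, 2, 3, 1, 2, 3, 4, 2, 3])

# First moment: E(X) = 1*(3t/4) + 2*(t/4) + 3*(3(1-t)/4) + 4*((1-t)/4)
#                    = (13 - 8t)/4, writing t for theta.
# Method of moments: (13 - 8t)/4 = sample mean  =>  t_hat = (13 - 4*mean)/8
theta_hat = (13 - 4 * rolls.mean()) / 8
print(rolls.mean(), theta_hat)   # sample mean = 2.5, estimate = 0.375
```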

Maximum likelihood estimator

Let $X_1, X_2, \dots, X_n$ have joint pmf or pdf $f_{joint}(x_1, x_2, \dots, x_n; \theta)$, where $\theta$ is unknown. When $x_1, \dots, x_n$ are the observed sample values and this expression is regarded as a function of $\theta$, it is called the likelihood function. The maximum likelihood estimate $\theta_{ML}$ is the value of $\theta$ that maximizes the likelihood function:
$$f_{joint}(x_1, x_2, \dots, x_n; \theta_{ML}) \ge f_{joint}(x_1, x_2, \dots, x_n; \theta) \quad \text{for all } \theta$$

How to find the MLE?

Step 1: Write down the likelihood function.
Step 2: Can you find the maximum of this function?
Step 3: Try taking the logarithm of this function.
Step 4: Find the maximum of this new function.

To find the maximum of a function of $\theta$:
- compute the derivative of the function with respect to $\theta$,
- set this expression for the derivative to 0,
- solve the equation.

Example 3

Let $X_1, \dots, X_{10}$ be a random sample of size $n = 10$ from a distribution with pdf
$$f(x) = \begin{cases} (\theta + 1) x^{\theta} & \text{if } 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$
The observed $x_i$'s are 0.92, 0.79, 0.90, 0.65, 0.86, 0.47, 0.73, 0.97, 0.94, 0.77.

Question: Use the method of maximum likelihood to obtain an estimator of $\theta$.
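Following the four steps above, the log-likelihood is $\ell(\theta) = n \ln(\theta + 1) + \theta \sum_i \ln x_i$, and setting $\ell'(\theta) = n/(\theta + 1) + \sum_i \ln x_i = 0$ gives a closed form. A short Python sketch of the computation:

```python
import numpy as np

x = np.array([0.92, 0.79, 0.90, 0.65, 0.86, 0.47, 0.73, 0.97, 0.94, 0.77])
n, s = len(x), np.log(x).sum()   # s = sum of ln(x_i), ~ -2.43

# From l'(theta) = n/(theta + 1) + s = 0:
theta_mle = -n / s - 1
print(theta_mle)                 # ~3.12
```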

Fisher-Neyman factorization theorem

Theorem. $T$ is sufficient for $\theta$ if and only if nonnegative functions $g$ and $h$ can be found such that
$$f(x_1, x_2, \dots, x_n; \theta) = g(t(x_1, x_2, \dots, x_n), \theta) \cdot h(x_1, x_2, \dots, x_n)$$
i.e. the joint density can be factored into a product such that one factor, $h$, does not depend on $\theta$, while the other factor, which does depend on $\theta$, depends on $x$ only through $t(x)$.

Fisher information

Definition. The Fisher information $I(\theta)$ in a single observation from a pmf or pdf $f(x; \theta)$ is the variance of the random variable $U = \frac{\partial \log f(X; \theta)}{\partial \theta}$, that is,
$$I(\theta) = Var\left[ \frac{\partial \log f(X; \theta)}{\partial \theta} \right]$$
Note: We always have $E[U] = 0$.

The Cramér-Rao Inequality

Theorem. Assume a random sample $X_1, X_2, \dots, X_n$ from a distribution with pmf or pdf $f(x; \theta)$ such that the set of possible values does not depend on $\theta$. If the statistic $T = t(X_1, X_2, \dots, X_n)$ is an unbiased estimator of the parameter $\theta$, then
$$V(T) \ge \frac{1}{n I(\theta)}$$

Large Sample Properties of the MLE

Theorem. Given a random sample $X_1, X_2, \dots, X_n$ from a distribution with pmf or pdf $f(x; \theta)$ such that the set of possible values does not depend on $\theta$, for large $n$ the maximum likelihood estimator $\hat{\theta}$ has approximately a normal distribution with mean $\theta$ and variance $\frac{1}{n I(\theta)}$. More precisely, the limiting distribution of $\sqrt{n}(\hat{\theta} - \theta)$ is normal with mean 0 and variance $1/I(\theta)$.
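A quick Monte Carlo illustration of this theorem for the exponential($\lambda$) model, where the MLE is $\hat{\lambda} = 1/\bar{X}$ and $I(\lambda) = 1/\lambda^2$ (the values of $\lambda$, $n$, and the replication count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
lam, n, reps = 2.0, 200, 20_000

# For Exp(lam): log f = ln(lam) - lam*x, so d/dlam log f = 1/lam - x
# and I(lam) = Var(X) = 1/lam**2.  The theorem predicts that
# sqrt(n)*(lam_hat - lam) is approximately N(0, 1/I(lam)) = N(0, lam**2).
x = rng.exponential(1 / lam, (reps, n))
lam_hat = 1 / x.mean(axis=1)

z = np.sqrt(n) * (lam_hat - lam)
print(z.mean(), z.var())   # mean ~0 (small finite-sample bias), variance ~lam**2 = 4
```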

Chapter 8: Confidence intervals

8.1 Basic properties of confidence intervals (CIs)
- interpreting CIs
- general principles for deriving CIs

8.2 Large-sample confidence intervals for a population mean
- using the Central Limit Theorem to derive CIs

8.3 Intervals based on the normal distribution
- using Student's t-distribution

Overview

Section 8.1: normal distribution, $\sigma$ known.
Section 8.2: any distribution, large samples; using the Central Limit Theorem needs $n > 30$, and replacing $\sigma$ by the sample standard deviation $s$ needs $n > 40$.
Section 8.3: normal distribution, $\sigma$ unknown, $n$ small; introduces the t-distribution.

Interpreting confidence interval 95% confidence interval: If we repeat the experiment many times, the interval contains µ about 95% of the time

Section 8.1 Assumptions: normal distribution; $\sigma$ is known.

Section 8.2

If after observing $X_1 = x_1, X_2 = x_2, \dots, X_n = x_n$ (with $n > 40$) we compute the observed sample mean $\bar{x}$ and sample standard deviation $s$, then
$$\left( \bar{x} - z_{\alpha/2} \frac{s}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \frac{s}{\sqrt{n}} \right)$$
is a $100(1 - \alpha)\%$ confidence interval for $\mu$.
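A minimal sketch of this large-sample interval in Python (the data are simulated placeholders, not from the course):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.exponential(2.0, 60)            # placeholder sample with n = 60 > 40

alpha = 0.05
xbar, s, n = x.mean(), x.std(ddof=1), len(x)
z = stats.norm.ppf(1 - alpha / 2)       # 1.96 for a 95% interval

# Large-sample 100(1 - alpha)% CI for the population mean
print(xbar - z * s / np.sqrt(n), xbar + z * s / np.sqrt(n))
```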

Confidence intervals

One-sided CIs

Two-sided CI, $100(1 - \alpha)\%$ confidence:
$$\left( \bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right)$$
One-sided CI, $100(1 - \alpha)\%$ confidence:
$$\left( -\infty,\ \bar{x} + z_{\alpha} \frac{\sigma}{\sqrt{n}} \right)$$
For 95% confidence these become
$$\left( \bar{x} - 1.96 \frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96 \frac{\sigma}{\sqrt{n}} \right) \quad \text{and} \quad \left( -\infty,\ \bar{x} + 1.64 \frac{\sigma}{\sqrt{n}} \right)$$

Prediction intervals

We have available a random sample $X_1, X_2, \dots, X_n$ from a normal population distribution, and we wish to predict the value of $X_{n+1}$, a single future observation.

Principles for deriving CIs

If $X_1, X_2, \dots, X_n$ is a random sample from the normal distribution $N(\mu, \sigma^2)$, then:

For $\mu$:
$$\frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$$
For predicting $X_{n+1}$:
$$\frac{\bar{X} - X_{n+1}}{S \sqrt{1 + 1/n}} \sim t_{n-1}$$
For $\sigma$:
$$\frac{(n-1) S^2}{\sigma^2} \sim \chi^2_{n-1}$$
For a sample proportion ($n$ large):
$$\frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \sim N(0, 1)$$
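As an illustration, a Python sketch computing the t-based CI for $\mu$ and the prediction interval for $X_{n+1}$ from the first two pivots (the data are simulated placeholders):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(10.0, 2.0, 15)      # small normal sample, sigma unknown

alpha = 0.05
xbar, s, n = x.mean(), x.std(ddof=1), len(x)
t = stats.t.ppf(1 - alpha / 2, df=n - 1)

# 95% CI for the mean, from (Xbar - mu)/(S/sqrt(n)) ~ t_{n-1}
ci = (xbar - t * s / np.sqrt(n), xbar + t * s / np.sqrt(n))

# 95% prediction interval for X_{n+1}, from (Xbar - X_{n+1})/(S*sqrt(1 + 1/n)) ~ t_{n-1}
pi = (xbar - t * s * np.sqrt(1 + 1 / n), xbar + t * s * np.sqrt(1 + 1 / n))
print(ci, pi)
```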

Principles for deriving CIs

If $X_1, X_2, \dots, X_n$ is a random sample from a distribution $f(x; \theta)$, then:
1. Find a random variable $Y = h(X_1, X_2, \dots, X_n; \theta)$ such that the probability distribution of $Y$ does not depend on $\theta$ or on any other unknown parameters.
2. Find constants $a, b$ such that $P[a < h(X_1, X_2, \dots, X_n; \theta) < b] = 1 - \alpha$.
3. Manipulate these inequalities to isolate $\theta$:
$$P[l(X_1, X_2, \dots, X_n) < \theta < u(X_1, X_2, \dots, X_n)] = 1 - \alpha$$
