Review. December 4th, 2017



December 4th, 2017

Announcements. Final exam: Friday, 12/14/2018, 10:30am to 12:30pm, Gore Hall 115. Course evaluation.

Overview
Week 2: Chapter 6: Statistics and Sampling Distributions
Week 4: Chapter 7: Point Estimation
Week 7: Chapter 8: Confidence Intervals
Week 10: Chapter 9: Tests of Hypotheses
Week 12: Chapter 10: Two-sample inference

Chapter 6: Statistics and Sampling Distributions

Chapter 6 6.1 Statistics and their distributions 6.2 The distribution of the sample mean 6.3 The distribution of a linear combination

Random sample
Definition: The random variables X_1, X_2, ..., X_n are said to form a (simple) random sample of size n if
1. the X_i's are independent random variables, and
2. every X_i has the same probability distribution.

Section 6.1: Sampling distributions
1. If the distribution of the X_i's and the statistic T are simple, try to construct the pmf of the statistic directly.
2. If the probability density function f_X(x) of the X_i's is known, try to represent/compute the cumulative distribution function (cdf) of T, P[T ≤ t], and then take the derivative of that function with respect to t.

Section 6.3: Linear combination of normal random variables
Theorem: Let X_1, X_2, ..., X_n be independent normal random variables (with possibly different means and/or variances). Then T = a_1 X_1 + a_2 X_2 + ... + a_n X_n also follows a normal distribution.

Section 6.3: Computations with normal random variables

Section 6.3: Linear combination of random variables
Theorem: Let X_1, X_2, ..., X_n be independent random variables (with possibly different means and/or variances). Define T = a_1 X_1 + a_2 X_2 + ... + a_n X_n. Then the mean and the variance of T can be computed by
E(T) = a_1 E(X_1) + a_2 E(X_2) + ... + a_n E(X_n)
σ_T² = a_1² σ_{X_1}² + a_2² σ_{X_2}² + ... + a_n² σ_{X_n}²
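As a quick sanity check, the mean and variance formulas above can be verified by simulation. The coefficients and component distributions below are made up for illustration, not taken from the slides:

```python
import random
from statistics import fmean, pvariance

random.seed(4)

# hypothetical coefficients and independent normal components
a = [2.0, -1.0, 0.5]
mus = [1.0, 3.0, 0.0]
sigmas = [1.0, 2.0, 0.5]

reps = 50_000
T = [sum(ai * random.gauss(m, s) for ai, m, s in zip(a, mus, sigmas))
     for _ in range(reps)]

# theory: E(T) = 2*1 - 1*3 + 0.5*0 = -1
#         Var(T) = 4*1 + 1*4 + 0.25*0.25 = 8.0625
print(fmean(T))
print(pvariance(T))
```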

Section 6.2: Distribution of the sample mean
Theorem (Central Limit Theorem): Let X_1, X_2, ..., X_n be a random sample from a distribution with mean µ and variance σ². Then, in the limit n → ∞, the standardized version of X̄ has the standard normal distribution:
lim_{n→∞} P( (X̄ − µ)/(σ/√n) ≤ z ) = P[Z ≤ z] = Φ(z)
Rule of thumb: if n > 30, the Central Limit Theorem can be used for computation.
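A small simulation illustrates the theorem: even for a skewed population such as Exp(1) (a distribution chosen here for illustration), the sample mean for n = 50 has mean near µ and standard deviation near σ/√n:

```python
import random
import statistics

random.seed(0)
n, reps = 50, 2000
# Exp(1) has mean mu = 1 and standard deviation sigma = 1

# draw many sample means from the skewed exponential population
means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
         for _ in range(reps)]

print(statistics.fmean(means))   # close to mu = 1
print(statistics.stdev(means))   # close to sigma/sqrt(n) = 0.1414...
```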

Chapter 7: Point Estimation

Chapter 7: Point estimates
7.1 Point estimate: unbiased estimator; mean squared error
7.2 Methods of point estimation: method of moments; method of maximum likelihood

Point estimate
Definition: A point estimate θ̂ of a parameter θ is a single number that can be regarded as a sensible value for θ.
population parameter θ → sample X_1, X_2, ..., X_n → estimate θ̂

Mean Squared Error & Bias-variance decomposition
Definition: The mean squared error of an estimator θ̂ is E[(θ̂ − θ)²].
Theorem (Bias-variance decomposition):
MSE(θ̂) = E[(θ̂ − θ)²] = V(θ̂) + (E(θ̂) − θ)²
Mean squared error = variance of estimator + (bias)²
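The decomposition can be checked numerically. The sketch below uses a deliberately biased estimator (a shrunken sample mean, chosen purely for illustration) and verifies that the simulated MSE equals variance plus squared bias:

```python
import random
import statistics

random.seed(1)
theta, n, reps = 2.0, 10, 20000

# biased estimator: (n/(n+1)) * Xbar, applied to Exp(mean = theta) samples
ests = []
for _ in range(reps):
    xbar = statistics.fmean(random.expovariate(1 / theta) for _ in range(n))
    ests.append(n / (n + 1) * xbar)

mse = statistics.fmean((e - theta) ** 2 for e in ests)
var = statistics.pvariance(ests)
bias = statistics.fmean(ests) - theta

print(mse, var + bias ** 2)  # the two sides of the decomposition agree
```

The identity holds exactly as a sample statement (not only in expectation), which is why the two printed numbers match to floating-point precision.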

Unbiased estimators
Definition: A point estimator θ̂ is said to be an unbiased estimator of θ if E(θ̂) = θ for every possible value of θ.
Unbiased estimator ⟺ Bias = 0 ⟹ Mean squared error = variance of estimator

Example
Problem: Consider a random sample X_1, ..., X_n from the pdf
f(x) = (1 + θx)/2, −1 ≤ x ≤ 1
Show that θ̂ = 3X̄ is an unbiased estimator of θ.

Method of moments: ideas
Let X_1, ..., X_n be a random sample from a distribution with pmf or pdf f(x; θ_1, θ_2, ..., θ_m).
Assume that for k = 1, ..., m:
(X_1^k + X_2^k + ... + X_n^k)/n = E(X^k)
Solve the system of equations for θ_1, θ_2, ..., θ_m.

Method of moments: example
Problem: Suppose that for a parameter 0 ≤ θ ≤ 1, X is the outcome of the roll of a four-sided tetrahedral die, with
x:    1     2     3          4
p(x): 3θ/4  θ/4   3(1−θ)/4   (1−θ)/4
Suppose the die is rolled 10 times with outcomes 4, 1, 2, 3, 1, 2, 3, 4, 2, 3. Use the method of moments to obtain an estimator of θ.
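The first moment is E(X) = 1·(3θ/4) + 2·(θ/4) + 3·(3(1−θ)/4) + 4·((1−θ)/4) = (13 − 9θ)/4. Equating it to the sample mean and solving for θ gives the MoM estimate; a short script carries out the arithmetic for the outcomes above:

```python
outcomes = [4, 1, 2, 3, 1, 2, 3, 4, 2, 3]

# E(X) = (13 - 9*theta)/4; set equal to the sample mean and solve for theta
xbar = sum(outcomes) / len(outcomes)      # 2.5
theta_hat = (13 - 4 * xbar) / 9           # (13 - 9t)/4 = xbar  =>  t = (13 - 4*xbar)/9
print(theta_hat)                          # 1/3
```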

Midterm: Problem 2a
Problem: Let X_1, X_2, ..., X_n represent a random sample from a distribution with pdf
f(x; θ) = (2x/(θ+1)) e^{−x²/(θ+1)}, x > 0
It can be shown that E(X_1²) = θ + 1. Use this fact to construct an estimator of θ based on the method of moments.
Sketch: MoM: set E(X²) = θ + 1 equal to the sample second moment (X_1² + X_2² + ... + X_n²)/n.

Maximum likelihood estimator
Let X_1, X_2, ..., X_n have joint pmf or pdf f_joint(x_1, x_2, ..., x_n; θ), where θ is unknown. When x_1, ..., x_n are the observed sample values and this expression is regarded as a function of θ, it is called the likelihood function. The maximum likelihood estimate θ̂_ML is the value of θ that maximizes the likelihood function:
f_joint(x_1, x_2, ..., x_n; θ̂_ML) ≥ f_joint(x_1, x_2, ..., x_n; θ) for all θ

How to find the MLE?
Step 1: Write down the likelihood function.
Step 2: Can you find the maximum of this function?
Step 3: Try taking the logarithm of this function.
Step 4: Find the maximum of this new function.
To find the maximum of a function of θ:
- compute the derivative of the function with respect to θ
- set the derivative equal to 0
- solve the equation
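The steps above can be sketched numerically. For an exponential sample (a distribution chosen for illustration), the log-likelihood is log L(λ) = n·log λ − λ·Σx_i, and the calculus steps give the closed form λ̂ = 1/x̄; a coarse grid search over λ lands at the same point:

```python
import math
import random
from statistics import fmean

random.seed(3)
data = [random.expovariate(0.5) for _ in range(200)]  # true rate 0.5

def loglik(lam):
    # Exp(lam): log L = n*log(lam) - lam * sum(x_i)
    return len(data) * math.log(lam) - lam * sum(data)

# Steps 3-4: maximize the log-likelihood; the closed form is lam_hat = 1/xbar
lam_grid = [0.01 * k for k in range(1, 200)]
lam_num = max(lam_grid, key=loglik)
lam_hat = 1 / fmean(data)
print(lam_num, lam_hat)  # the grid maximizer sits next to 1/xbar
```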

Midterm: Problem 2b
Problem: Let X_1, X_2, ..., X_n represent a random sample from a distribution with pdf
f(x; θ) = (2x/(θ+1)) e^{−x²/(θ+1)}, x > 0
Derive the maximum-likelihood estimator for the parameter θ based on the following dataset with n = 10:
17.85, 11.23, 14.43, 19.27, 5.59, 6.36, 9.41, 6.31, 13.78, 11.95
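Under the reconstructed pdf f(x; θ) = (2x/(θ+1))e^{−x²/(θ+1)}, the log-likelihood is Σ log(2x_i) − n·log(θ+1) − Σx_i²/(θ+1); setting its derivative to zero gives θ̂ = (Σx_i²)/n − 1, the same answer as the method of moments. Evaluating it on the dataset (a sketch, assuming that reconstruction is right):

```python
data = [17.85, 11.23, 14.43, 19.27, 5.59, 6.36, 9.41, 6.31, 13.78, 11.95]

# MLE: d/dtheta log L = -n/(theta+1) + sum(x_i^2)/(theta+1)^2 = 0
#   =>  theta_hat = mean(x_i^2) - 1
theta_hat = sum(x * x for x in data) / len(data) - 1
print(theta_hat)
```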

Chapters 8 and 10: Confidence intervals

Interpreting confidence interval 95% confidence interval: If we repeat the experiment many times, the interval contains µ about 95% of the time
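This "repeat the experiment" reading can be demonstrated directly: simulate many samples from a known normal population (parameters made up for illustration), build the 95% z interval each time, and count how often the interval covers µ:

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(2)
mu, sigma, n, reps = 5.0, 2.0, 30, 2000
z = NormalDist().inv_cdf(0.975)  # z_{0.025} ~ 1.96

hits = 0
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = fmean(xs)
    half = z * sigma / math.sqrt(n)
    hits += (xbar - half <= mu <= xbar + half)

print(hits / reps)  # close to 0.95
```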

Confidence intervals
By target:
- Chapter 8: Confidence intervals for population means
- Chapter 8: Prediction intervals for an additional sample
- Chapter 10: Confidence intervals for the difference between two population means (independent samples; paired samples)
By type:
- (Standard) two-sided confidence intervals
- One-sided confidence intervals (confidence bounds)
By distribution of the statistic:
- z-statistic
- t-statistic

Chapter 8: Confidence intervals
Section 8.1: Normal distribution, σ is known
Section 8.2: σ is unknown, n > 40 (large-sample intervals)
Section 8.3: Normal distribution, σ is unknown, n is small: t-distribution

Section 8.1 Assumptions: Normal distribution σ is known

Section 8.2
If after observing X_1 = x_1, X_2 = x_2, ..., X_n = x_n (n > 40) we compute the observed sample mean x̄ and sample standard deviation s, then
( x̄ − z_{α/2} · s/√n , x̄ + z_{α/2} · s/√n )
is a 100(1 − α)% confidence interval for µ.
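For concreteness, here is the interval computed for a hypothetical large sample (the numbers n, x̄, and s below are invented, not from the slides):

```python
import math
from statistics import NormalDist

# hypothetical summary statistics from a sample with n > 40
n, xbar, s = 50, 10.2, 1.8
alpha = 0.05

z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2} ~ 1.96
half = z * s / math.sqrt(n)
ci = (xbar - half, xbar + half)
print(ci)
```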

Section 8.3

Prediction intervals
We have available a random sample X_1, X_2, ..., X_n from a normal population distribution. We wish to predict the value of X_{n+1}, a single future observation.

Confidence intervals: difference between two means
Independent samples:
1. X_1, X_2, ..., X_m is a random sample from a population with mean µ_1 and variance σ_1².
2. Y_1, Y_2, ..., Y_n is a random sample from a population with mean µ_2 and variance σ_2².
3. The X and Y samples are independent of each other.
Paired samples:
1. There is only one set of n individuals or experimental objects.
2. Two observations are made on each individual or object.

Difference between population means: independent samples

Difference between population means: paired samples
The paired t CI for µ_D is
d̄ ± t_{α/2, n−1} · s_D/√n
A one-sided confidence bound results from retaining the relevant sign and replacing t_{α/2, n−1} by t_{α, n−1}.
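A worked example with invented paired differences d_i (the data and the table value t_{0.025,9} = 2.262 are assumptions for illustration, since the standard library has no t-distribution quantile):

```python
import math
from statistics import fmean, stdev

# hypothetical paired differences; t_{0.025, 9} = 2.262 from a t table
d = [1.2, -0.4, 0.8, 1.5, 0.3, 0.9, -0.1, 1.1, 0.6, 0.7]
dbar, s_d, n = fmean(d), stdev(d), len(d)
t = 2.262

half = t * s_d / math.sqrt(n)
ci = (dbar - half, dbar + half)
print(ci)  # 95% paired t CI for mu_D
```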

Principles for deriving CIs
If X_1, X_2, ..., X_n is a random sample from a distribution f(x; θ), then:
1. Find a random variable Y = h(X_1, X_2, ..., X_n; θ) such that the probability distribution of Y does not depend on θ or on any other unknown parameters.
2. Find constants a, b such that P[a < h(X_1, X_2, ..., X_n; θ) < b] = 1 − α.
3. Manipulate these inequalities to isolate θ:
P[l(X_1, X_2, ..., X_n) < θ < u(X_1, X_2, ..., X_n)] = 1 − α

Examples
For µ and X_{n+1}:
(X̄ − µ)/(S/√n) ~ t_{n−1},   (X̄ − X_{n+1})/(S·√(1 + 1/n)) ~ t_{n−1}
Difference between two means [independent samples]:
((X̄ − Ȳ) − (µ_1 − µ_2)) / √(S_1²/m + S_2²/n) ~ t_ν
Difference between two means [paired samples]:
T = (D̄ − µ_D)/(S_D/√n) ~ t_{n−1}

Chapters 9 and 10: Tests of hypotheses

Test of hypotheses
By target:
- Chapter 9: population mean
- Chapter 10: difference between two population means (independent samples; paired samples)
By the alternative hypothesis: >, <, ≠
By the type of test: z-test, t-test
By method of testing: rejection region, p-value

Hypothesis testing In any hypothesis-testing problem, there are two contradictory hypotheses under consideration The null hypothesis, denoted by H 0, is the claim that is initially assumed to be true The alternative hypothesis, denoted by H a, is the assertion that is contradictory to H 0.

Implicit rules
H_0 will always be stated as an equality claim. If θ denotes the parameter of interest, the null hypothesis will have the form H_0: θ = θ_0, where θ_0 is a specified number called the null value. The alternative hypothesis will be one of:
H_a: θ > θ_0
H_a: θ < θ_0
H_a: θ ≠ θ_0

Test procedures
A test procedure is specified by the following:
- A test statistic T: a function of the sample data on which the decision (reject H_0 or do not reject H_0) is to be based
- A rejection region R: the set of all test-statistic values for which H_0 will be rejected
A type I error consists of rejecting the null hypothesis H_0 when it is true.
A type II error consists of not rejecting H_0 when H_0 is false.

Hypothesis testing for one parameter: rejection region method
1. Identify the parameter of interest
2. Determine the null value and state the null hypothesis
3. State the appropriate alternative hypothesis
4. Give the formula for the test statistic
5. State the rejection region for the selected significance level α
6. Compute the statistic value from the data
7. Decide whether H_0 should be rejected and state this conclusion in the problem context

Normal population with known σ
Null hypothesis: µ = µ_0
Test statistic: Z = (X̄ − µ_0)/(σ/√n)

Large-sample tests
Null hypothesis: µ = µ_0
Test statistic: Z = (X̄ − µ_0)/(S/√n)
[Does not need the normality assumption]

t-test [Requires the normality assumption]

P-value

Testing by P-value method Remark: the smaller the P-value, the more evidence there is in the sample data against the null hypothesis and for the alternative hypothesis.
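For a z test statistic, the p-value is a single normal-tail probability; a small helper (written here as an illustration) covers the three alternative hypotheses:

```python
from statistics import NormalDist

def z_pvalue(z, alternative="two-sided"):
    """P-value for a z test statistic under the three alternatives."""
    Phi = NormalDist().cdf
    if alternative == "greater":   # H_a: theta > theta_0
        return 1 - Phi(z)
    if alternative == "less":      # H_a: theta < theta_0
        return Phi(z)
    return 2 * (1 - Phi(abs(z)))   # H_a: theta != theta_0

print(z_pvalue(2.16))  # two-sided p-value for z = 2.16
```

A value near 0.031 would be rejected at α = 0.05 but not at α = 0.01, matching the "smaller p-value, more evidence against H_0" reading above.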

P-values for z-tests

P-values for t-tests

Testing by rejection region method
Parameter of interest: µ = true average activation temperature
Hypotheses: H_0: µ = 130 vs. H_a: µ ≠ 130
Test statistic: z = (x̄ − 130)/(1.5/√n)
Rejection region: either z ≤ −z_{0.005} or z ≥ z_{0.005} = 2.58
Substituting x̄ = 131.08 and n = 9 gives z = 2.16. Note that −2.58 < 2.16 < 2.58, so we fail to reject H_0 at significance level 0.01. The data do not give strong support to the claim that the true average differs from the design value.
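The arithmetic of this example can be replayed in a few lines. The slide's z = 2.16 together with σ = 1.5 and x̄ = 131.08 corresponds to n = 9, which is assumed here:

```python
import math
from statistics import NormalDist

# activation-temperature example: z = 2.16 requires n = 9 given these values
mu0, sigma, xbar, n = 130.0, 1.5, 131.08, 9

z = (xbar - mu0) / (sigma / math.sqrt(n))
crit = NormalDist().inv_cdf(1 - 0.01 / 2)  # z_{0.005} ~ 2.58
reject = abs(z) >= crit
print(round(z, 2), reject)  # z inside (-2.58, 2.58): fail to reject H_0
```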

Testing by p-value

Interpreting P-values
A P-value:
- is not the probability that H_0 is true
- is not the probability of rejecting H_0
- is the probability, calculated assuming that H_0 is true, of obtaining a test-statistic value at least as contradictory to the null hypothesis as the value that actually resulted

Testing the difference between two population means
Setting: independent normal random samples X_1, X_2, ..., X_m and Y_1, Y_2, ..., Y_n, with known values of σ_1 and σ_2, and a constant Δ_0.
Null hypothesis: H_0: µ_1 − µ_2 = Δ_0
Alternative hypothesis:
(a) H_a: µ_1 − µ_2 > Δ_0
(b) H_a: µ_1 − µ_2 < Δ_0
(c) H_a: µ_1 − µ_2 ≠ Δ_0
When Δ_0 = 0, the test (c) becomes H_0: µ_1 = µ_2 vs. H_a: µ_1 ≠ µ_2.
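The two-sample z statistic standardizes (X̄ − Ȳ) − Δ_0 by √(σ_1²/m + σ_2²/n). A sketch with hypothetical summary numbers (not from the slides):

```python
import math
from statistics import NormalDist

def two_sample_z(xbar, ybar, s1, s2, m, n, delta0=0.0):
    """z statistic for H_0: mu1 - mu2 = delta0 with known sigma_1, sigma_2."""
    return (xbar - ybar - delta0) / math.sqrt(s1 ** 2 / m + s2 ** 2 / n)

# hypothetical summary statistics for illustration
z = two_sample_z(10.5, 9.8, 2.0, 2.5, 40, 50)
p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value, case (c)
print(z, p)
```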

Difference between 2 means (independent samples) Proposition