Mathematical statistics

October 1st, 2018 — Lecture 11: Sufficient statistic

Where are we?
- Week 1: Probability reviews
- Week 2: Chapter 6: Statistics and Sampling Distributions
- Week 4: Chapter 7: Point Estimation
- Week 7: Chapter 8: Confidence Intervals
- Week 10: Chapter 9: Test of Hypothesis
- Week 14: Regression

Overview
7.1 Point estimate: unbiased estimator, mean squared error
7.2 Methods of point estimation: method of moments, method of maximum likelihood
7.3 Sufficient statistic
7.4 Information and efficiency: large-sample properties of the maximum likelihood estimator

Point estimate
Definition. A point estimate θ̂ of a parameter θ is a single number that can be regarded as a sensible value for θ.
population parameter θ → sample X_1, X_2, ..., X_n → estimate θ̂

Method of maximum likelihood

Random sample
Let X_1, X_2, ..., X_n be a random sample of size n from a distribution with density function f_X(x). Then the density of the joint distribution of (X_1, X_2, ..., X_n) is
f_joint(x_1, x_2, ..., x_n) = ∏_{i=1}^n f_X(x_i)

Maximum likelihood estimator
Let X_1, X_2, ..., X_n have joint pmf or pdf f_joint(x_1, x_2, ..., x_n; θ), where θ is unknown. When x_1, ..., x_n are the observed sample values and this expression is regarded as a function of θ, it is called the likelihood function. The maximum likelihood estimate θ_ML is the value of θ that maximizes the likelihood function:
f_joint(x_1, x_2, ..., x_n; θ_ML) ≥ f_joint(x_1, x_2, ..., x_n; θ) for all θ

How to find the MLE?
Step 1: Write down the likelihood function.
Step 2: Can you find the maximum of this function?
Step 3: Try taking the logarithm of this function.
Step 4: Find the maximum of this new function.
To find the maximum of a function of θ:
- compute the derivative of the function with respect to θ
- set the derivative to 0
- solve the equation
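As an illustration of these steps (an example not from the slides): for the exponential density λe^{−λx}, the log of L(λ) = λ^n e^{−λΣx_i} is ℓ(λ) = n ln λ − λ Σx_i, and setting dℓ/dλ = 0 gives λ̂ = n/Σx_i. A minimal Python sketch that checks this closed form against the log-likelihood:

```python
import math

def exponential_log_likelihood(lam, xs):
    # Step 3: log of L(lambda) = lambda^n * exp(-lambda * sum(xs))
    return len(xs) * math.log(lam) - lam * sum(xs)

def exponential_mle(xs):
    # Step 4: d/dlambda [n*log(lambda) - lambda*sum(xs)] = 0
    # gives lambda_hat = n / sum(xs)
    return len(xs) / sum(xs)

xs = [0.5, 1.2, 0.8, 2.0, 0.3]   # made-up sample
lam_hat = exponential_mle(xs)
# sanity check: the closed form should beat nearby values of lambda
assert exponential_log_likelihood(lam_hat, xs) >= exponential_log_likelihood(lam_hat * 1.1, xs)
assert exponential_log_likelihood(lam_hat, xs) >= exponential_log_likelihood(lam_hat * 0.9, xs)
```

The assertions only confirm a local maximum numerically; they are a sanity check, not a proof.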

Example Problem
Let X_1, ..., X_n be a random sample from the normal distribution N(0, σ²), that is,
f(x; σ) = (1 / (σ√(2π))) e^{−x²/(2σ²)}
Use the method of maximum likelihood to obtain an estimator of σ (as a function of the data x_1, x_2, ..., x_n).
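For this exercise, setting the derivative of the log-likelihood to zero yields the candidate σ̂ = √(Σx_i²/n); a minimal sketch (with made-up data) to sanity-check that value numerically:

```python
import math

def sigma_mle(xs):
    # Candidate solution: the derivative of the log-likelihood in sigma
    # vanishes at sigma_hat = sqrt(sum(x_i^2) / n).
    return math.sqrt(sum(x * x for x in xs) / len(xs))

def log_likelihood(sigma, xs):
    # log of prod_i (1/(sigma*sqrt(2*pi))) * exp(-x_i^2 / (2*sigma^2))
    n = len(xs)
    return (-n * math.log(sigma * math.sqrt(2 * math.pi))
            - sum(x * x for x in xs) / (2 * sigma ** 2))

xs = [1.0, -0.5, 2.0, 0.3]   # made-up sample
s_hat = sigma_mle(xs)
# the closed form should beat nearby values of sigma
assert log_likelihood(s_hat, xs) >= log_likelihood(s_hat + 0.1, xs)
assert log_likelihood(s_hat, xs) >= log_likelihood(s_hat - 0.1, xs)
```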

Example Problem
Let β > 1 and X_1, ..., X_n be a random sample from a distribution with pdf
f(x) = β / x^{β+1} if x > 1, and f(x) = 0 otherwise.
Use the method of maximum likelihood to obtain an estimator of β.
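Here taking logs gives ℓ(β) = n ln β − (β+1) Σ ln x_i, whose derivative vanishes at β̂ = n / Σ ln x_i; a short sketch (with made-up data) to verify this numerically:

```python
import math

def beta_mle(xs):
    # For f(x) = beta / x^(beta+1) on x > 1, the log-likelihood is
    # n*log(beta) - (beta + 1) * sum(log(x_i)); its derivative in beta
    # vanishes at beta_hat = n / sum(log(x_i)).
    return len(xs) / sum(math.log(x) for x in xs)

def log_likelihood(beta, xs):
    s = sum(math.log(x) for x in xs)
    return len(xs) * math.log(beta) - (beta + 1) * s

xs = [1.5, 2.0, 3.0, 1.2]   # made-up sample, all > 1
b_hat = beta_mle(xs)
assert log_likelihood(b_hat, xs) >= log_likelihood(b_hat + 0.1, xs)
assert log_likelihood(b_hat, xs) >= log_likelihood(b_hat - 0.1, xs)
```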

Sufficient statistic

Example
Your professor stores a dataset x_1, x_2, ..., x_n in his computer. He says it is a random sample from the exponential distribution
f_X(x) = λe^{−λx}, x ≥ 0,
where λ is an unknown parameter. He wants you to work on the dataset and give him a good estimate of λ.
- Assume that the sample size is very large, n = 10^20, so you cannot copy the whole dataset.
- You can compute any summary statistic of the dataset using the computer, but the lab is closing in 5 minutes.
- What will you do?

Example
If you are using the method of moments:
x̄ = (x_1 + x_2 + ... + x_n) / n
If you are using the method of maximum likelihood:
L(λ) = λ^n e^{−λ(x_1 + x_2 + ... + x_n)}
In both cases, it seems that you only need to save n and t = x_1 + x_2 + ... + x_n.
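A sketch of the situation in Python (the data here are simulated, not the professor's): only the two numbers n and t need to leave the lab, and both estimators can be rebuilt from them later.

```python
import random

random.seed(0)
data = [random.expovariate(2.0) for _ in range(1000)]   # stand-in for the big dataset

# Before the lab closes, keep only two numbers:
n, t = len(data), sum(data)

# Both estimators can later be recomputed from (n, t) alone:
x_bar = t / n          # method of moments: x-bar estimates 1/lambda
lambda_mle = n / t     # maximizer of L(lambda) = lambda^n * exp(-lambda * t)
```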

Conditional probability
For discrete random variables, the conditional probability mass function of Y given the occurrence of the value x of X is, by definition,
P(Y = y | X = x) = P(Y = y, X = x) / P(X = x)
For continuous random variables, the conditional density of Y given the occurrence of the value x of X is
f_Y(y | X = x) = f_joint(y, x) / f_X(x)

Some observations
- Basic estimation problem: given a density function f(x; θ) and a sample X_1, X_2, ..., X_n, construct a statistic θ̂ = T(X_1, X_2, ..., X_n).
- Different statistics T lead to different estimates, with different accuracies.
- If the distribution of T(X_1, X_2, ..., X_n) does not depend on θ, then T is of no use for estimating θ.
- Conversely, if the conditional distribution of X_1, X_2, ..., X_n given T does not depend on θ, then T contains all the information in the sample needed to estimate θ.

Sufficient statistic
Definition. A statistic T = t(X_1, ..., X_n) is said to be sufficient for making inferences about a parameter θ if the joint distribution of X_1, X_2, ..., X_n given T = t does not depend upon θ, for every possible value t of the statistic T.

Fisher–Neyman factorization theorem
Theorem. T is sufficient for θ if and only if nonnegative functions g and h can be found such that
f(x_1, x_2, ..., x_n; θ) = g(t(x_1, x_2, ..., x_n), θ) · h(x_1, x_2, ..., x_n),
i.e. the joint density can be factored into a product such that one factor, h, does not depend on θ, while the other factor, which does depend on θ, depends on the data x only through t(x).

Example
Let X_1, X_2, ..., X_n be a random sample from a Poisson distribution with parameter λ, where λ is unknown:
f(x; λ) = (λ^x / x!) e^{−λ}, x = 0, 1, 2, ...
Find a sufficient statistic for λ.
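One way to approach this with the factorization theorem, writing out the joint pmf and grouping the λ-dependent part:

```latex
f(x_1,\dots,x_n;\lambda)
  = \prod_{i=1}^{n} \frac{\lambda^{x_i} e^{-\lambda}}{x_i!}
  = \underbrace{\lambda^{\sum_i x_i}\, e^{-n\lambda}}_{g\bigl(\sum_i x_i,\ \lambda\bigr)}
    \cdot
    \underbrace{\prod_{i=1}^{n} \frac{1}{x_i!}}_{h(x_1,\dots,x_n)}
```

so T = X_1 + ... + X_n is sufficient for λ.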

Jointly sufficient statistic
Definition. The m statistics T_1 = t_1(X_1, ..., X_n), T_2 = t_2(X_1, ..., X_n), ..., T_m = t_m(X_1, ..., X_n) are said to be jointly sufficient for the parameters θ_1, θ_2, ..., θ_k if the joint distribution of X_1, X_2, ..., X_n given T_1 = t_1, T_2 = t_2, ..., T_m = t_m does not depend upon θ_1, θ_2, ..., θ_k, for every possible value t_1, t_2, ..., t_m of the statistics.

Fisher–Neyman factorization theorem
Theorem. T_1, T_2, ..., T_m are jointly sufficient for θ_1, θ_2, ..., θ_k if and only if nonnegative functions g and h can be found such that
f(x_1, x_2, ..., x_n; θ_1, θ_2, ..., θ_k) = g(t_1, t_2, ..., t_m, θ_1, θ_2, ..., θ_k) · h(x_1, x_2, ..., x_n)

Example 3
Let X_1, X_2, ..., X_n be a random sample from N(µ, σ²),
f_X(x) = (1 / (√(2π)σ)) e^{−(x−µ)²/(2σ²)}.
Prove that
T_1 = X_1 + ... + X_n, T_2 = X_1² + X_2² + ... + X_n²
are jointly sufficient for the two parameters µ and σ.
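A sketch of the key algebraic step: expanding the squares in the exponent shows that the joint density depends on the data only through Σx_i and Σx_i² (here one may take h ≡ 1):

```latex
\prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x_i-\mu)^2/(2\sigma^2)}
  = (2\pi\sigma^2)^{-n/2}
    \exp\!\left(
      -\frac{1}{2\sigma^2}\sum_{i=1}^{n} x_i^{2}
      + \frac{\mu}{\sigma^2}\sum_{i=1}^{n} x_i
      - \frac{n\mu^2}{2\sigma^2}
    \right)
```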

Example 4
Let X_1, X_2, ..., X_n be a random sample from a Gamma distribution,
f_X(x) = (1 / (Γ(α)β^α)) x^{α−1} e^{−x/β},
where α and β are unknown. Prove that
T_1 = X_1 + ... + X_n, T_2 = ∏_{i=1}^n X_i
are jointly sufficient for the two parameters α and β.
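A sketch of the corresponding factorization: the product of the densities collects the data only into Σx_i and ∏x_i (again one may take h ≡ 1):

```latex
\prod_{i=1}^{n} \frac{1}{\Gamma(\alpha)\beta^{\alpha}}\, x_i^{\alpha-1} e^{-x_i/\beta}
  = \frac{1}{\Gamma(\alpha)^{n}\beta^{n\alpha}}
    \left(\prod_{i=1}^{n} x_i\right)^{\!\alpha-1}
    \exp\!\left(-\frac{1}{\beta}\sum_{i=1}^{n} x_i\right)
```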