Foundations of Statistical Inference. Sufficient statistics. Definition (Sufficiency)

Foundations of Statistical Inference
Julien Berestycki
Lecture 2 - Sufficiency, Factorization, Minimal sufficiency
Department of Statistics, University of Oxford, MT 2016
Julien Berestycki (University of Oxford), BS2a, MT 2016

Sufficient statistics

Let $X_1, \dots, X_n$ be a random sample from $f(x; \theta)$.

Definition (Sufficiency). A statistic $T(X_1, \dots, X_n)$ is a function of the data that does not depend on unknown parameters. A statistic $T(X_1, \dots, X_n)$ is said to be sufficient for $\theta$ if the conditional distribution of $X_1, \dots, X_n$, given $T$, does not depend on $\theta$. That is,

$$f(x \mid t, \theta) = f(x \mid t).$$

Comment. The definition says that a sufficient statistic $T$ contains all the information there is in the sample about $\theta$.

What does this even mean? It means that for any function $g$ the map

$$\theta \mapsto E_\theta[g(X) \mid T]$$

is constant.
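The constancy of the map from θ to E_θ[g(X) | T] can be checked by brute-force enumeration. The sketch below is an illustration only: it assumes i.i.d. Bernoulli(p) trials with n = 3, and the test function g and the grid of p values are hypothetical choices, not part of the lecture. It conditions once on the total number of successes (sufficient) and once on X_1 alone (not sufficient).

```python
from fractions import Fraction
from itertools import product


def likelihood(x, p):
    """Probability of the 0/1 sequence x under i.i.d. Bernoulli(p) trials."""
    t = sum(x)
    return p**t * (1 - p) ** (len(x) - t)


def conditional_expectation(g, stat, value, p, n):
    """E_p[g(X) | stat(X) = value], by direct enumeration of {0,1}^n."""
    block = [x for x in product((0, 1), repeat=n) if stat(x) == value]
    total = sum(likelihood(x, p) for x in block)
    return sum(g(x) * likelihood(x, p) for x in block) / total


n = 3
g = lambda x: x[0] * x[1]  # hypothetical test function, chosen for illustration
ps = [Fraction(1, 5), Fraction(1, 2), Fraction(4, 5)]  # assumed grid of p values

# Conditioning on the total (sufficient): the same value for every p.
given_sum = [conditional_expectation(g, sum, 2, p, n) for p in ps]
# Conditioning on X_1 alone (not sufficient): the value changes with p.
given_first = [conditional_expectation(g, lambda x: x[0], 1, p, n) for p in ps]

print(given_sum)
print(given_first)
```

Exact rational arithmetic is used so that equality across values of p is not blurred by floating-point rounding. Agreement on a finite grid is evidence of constancy, not a proof.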

Example 7

$n$ independent trials where the probability of success is $p$. Let $X_1, \dots, X_n$ be indicator variables which are 1 or 0 depending on whether the trial is a success or a failure. Let $T = \sum_{i=1}^n X_i$. The conditional distribution of $X_1, \dots, X_n$ given $T = t$ is

$$g(x_1, \dots, x_n \mid t, p) = \frac{f(x_1, \dots, x_n \mid p)}{h(t \mid p)} = \frac{\prod_{i=1}^n p^{x_i}(1-p)^{1-x_i}}{\binom{n}{t} p^t (1-p)^{n-t}} = \frac{p^t (1-p)^{n-t}}{\binom{n}{t} p^t (1-p)^{n-t}} = \binom{n}{t}^{-1},$$

not depending on $p$, so $T$ is sufficient for $p$.

Comment. Makes sense, since there is no information in the order.

Theorem 1: Factorization Criterion

$T(X_1, \dots, X_n)$ is a sufficient statistic for $\theta$ if and only if there exist two non-negative functions $K_1, K_2$ such that the likelihood function $L(\theta; x)$ can be written

$$L(\theta; x) = K_1[t(x_1, \dots, x_n); \theta]\, K_2[x_1, \dots, x_n] = K_1[t; \theta]\, K_2[x],$$

where $K_1$ depends on the sample only through $T$, and $K_2$ does not depend on $\theta$.

Proof - for discrete random variables

1. Assume that $T$ is sufficient. Then the distribution of the sample is

$$L(\theta; x) = f(x \mid \theta) = f(x, t \mid \theta) = f(x \mid t, \theta)\, h(t \mid \theta).$$

$T$ is sufficient, which implies $f(x \mid t, \theta) = f(x \mid t)$, and $h(t \mid \theta)$ depends on $x$ through $t(x)$ only, so

$$L(\theta; x) = f(x \mid t)\, h(t \mid \theta).$$

We set $L(\theta; x) = K_1 K_2$, where $K_1 \equiv h$ and $K_2 \equiv f$.

2. Suppose $L(\theta; x) = f(x \mid \theta) = K_1[t; \theta]\, K_2[x]$. Then

$$h(t \mid \theta) = \sum_{\{x : T(x) = t\}} f(x, t \mid \theta) = \sum_{\{x : T(x) = t\}} L(\theta; x) = K_1[t; \theta] \sum_{\{x : T(x) = t\}} K_2(x).$$

Thus

$$f(x \mid t, \theta) = \frac{f(x, t \mid \theta)}{h(t \mid \theta)} = \frac{L(\theta; x)}{h(t \mid \theta)} = \frac{K_2[x]}{\sum_{\{x : T(x) = t\}} K_2(x)},$$

not depending on $\theta$. ($K_1$ cancels out in numerator and denominator.)
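As a worked instance of the factorization criterion, consider a Poisson sample. This example is an assumption added for illustration and is not taken from the slides; it uses the same symbols $K_1$, $K_2$ and $t$ as the criterion above.

```latex
% Assumed illustration: X_1, ..., X_n i.i.d. Poisson(\theta), with t = \sum_i x_i.
\begin{align*}
L(\theta; x) &= \prod_{i=1}^n \frac{e^{-\theta}\,\theta^{x_i}}{x_i!}
              = \underbrace{e^{-n\theta}\,\theta^{t}}_{K_1[t;\,\theta]}
                \;\underbrace{\Bigl(\prod_{i=1}^n x_i!\Bigr)^{-1}}_{K_2[x]},
\qquad t = \sum_{i=1}^n x_i .
\end{align*}
% K_1 depends on the sample only through t, and K_2 is free of \theta,
% so T = \sum_i X_i is sufficient for \theta.
```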

Minimal sufficiency

How much can we reduce the data without losing information? Is there a minimal sufficient statistic?

Example 7 (cont.)

Consider $n = 3$ Bernoulli trials.
1. $T_1(X) = (X_1, X_2, X_3)$ (the individual sequences of trials)
2. $T_2(X) = (X_1, \sum_{i=1}^3 X_i)$ (the first random variable and the total sum)
3. $T_3(X) = \sum_{i=1}^3 X_i$ (the total sum)
4. $T_4(X) = I(T_3(X) = 0)$ ($I$ is the indicator function)

Exercise. Prove that $T_4$ is not sufficient.

Definition (Minimality). A statistic is minimal sufficient if it can be expressed as a function of every other sufficient statistic.

Example 7 (cont.): Minimal sufficiency

$n$ Bernoulli trials with $T = \sum_{i=1}^n X_i$. Suppose $T$ above is not minimal sufficient but another statistic $U$ is minimal sufficient (MS). Then $U$ can be given as a function of $T$ but not vice versa (otherwise $T$ would be MS), so there exist values $t_1 \neq t_2$ of $T$ with $U(t_1) = U(t_2)$ (i.e. $T \to U$ is many to one, so $U \to T$ is not a function). Assume for the moment that no other $t$ gives $U(t) = U(t_1)$. The event $U = u$ is then the event $T \in \{t_1, t_2\}$. Let $x_1, \dots, x_n$ contain $t_1$ successes. Then

$$g(x_1, \dots, x_n \mid u, p) = g(x_1, \dots, x_n \mid t_1, p)\, P(t_1 \mid u, p) = g(x_1, \dots, x_n \mid t_1)\, P(T = t_1 \mid T \in \{t_1, t_2\}, p)$$

$$= \binom{n}{t_1}^{-1} \frac{\binom{n}{t_1} p^{t_1}(1-p)^{n-t_1}}{\binom{n}{t_1} p^{t_1}(1-p)^{n-t_1} + \binom{n}{t_2} p^{t_2}(1-p)^{n-t_2}},$$

which depends on $p$, so $U$ is not sufficient, a contradiction, and hence $T$ must be MS (similar reasoning applies for multiple $t_i$).

Minimal sufficiency and partitions of the sample space

Intuitively, a minimal sufficient statistic most efficiently captures all possible information about the parameter $\theta$. Any statistic $T(X)$ partitions the sample space into subsets, and on each subset $T(X)$ has a constant value. Minimal sufficient statistics correspond to the coarsest possible partition of the sample space that is still sufficient.

In the example of $n = 3$ Bernoulli trials, consider the following 4 statistics and the partitions they induce:

$T_1(X) = (X_1, X_2, X_3)$
$T_2(X) = (X_1, \sum_{i=1}^3 X_i)$
$T_3(X) = \sum_{i=1}^3 X_i$
$T_4(X) = I(T_3(X) = 0)$
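The partitions induced by the four statistics can be enumerated directly. The following is a minimal sketch, assuming n = 3 Bernoulli trials as in the example; the helper names (`partition`, `looks_sufficient`) and the grid of p values are hypothetical. The sufficiency check compares the conditional distribution within each block over a few values of p, so a result of True is numerical evidence rather than a proof, whereas False exhibits an actual dependence on p.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

SAMPLE_SPACE = list(product((0, 1), repeat=3))

STATISTICS = {
    "T1": lambda x: x,                   # the full sequence
    "T2": lambda x: (x[0], sum(x)),      # first trial and total
    "T3": lambda x: sum(x),              # total
    "T4": lambda x: int(sum(x) == 0),    # indicator that the total is zero
}


def partition(stat):
    """Group the sample space into blocks on which stat is constant."""
    blocks = defaultdict(list)
    for x in SAMPLE_SPACE:
        blocks[stat(x)].append(x)
    return dict(blocks)


def likelihood(x, p):
    """Probability of the 0/1 sequence x under i.i.d. Bernoulli(p) trials."""
    t = sum(x)
    return p**t * (1 - p) ** (len(x) - t)


def looks_sufficient(stat, ps=(Fraction(1, 4), Fraction(1, 2), Fraction(2, 3))):
    """True if P(X = x | T = t) agrees across the (assumed) grid of p values."""
    for block in partition(stat).values():
        conditionals = set()
        for p in ps:
            total = sum(likelihood(x, p) for x in block)
            conditionals.add(tuple(likelihood(x, p) / total for x in block))
        if len(conditionals) > 1:
            return False
    return True


block_counts = {name: len(partition(stat)) for name, stat in STATISTICS.items()}
sufficient = {name: looks_sufficient(stat) for name, stat in STATISTICS.items()}
print(block_counts)
print(sufficient)
```

The block counts make the ordering from finest to coarsest visible: among the statistics that pass the check, the one with the fewest blocks is the total sum.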

Lemma 1: Lehmann-Scheffé partitions

Consider the partition of the sample space defined by putting $x$ and $y$ into the same class of the partition if and only if

$$\frac{L(\theta; y)}{L(\theta; x)} = \frac{f(y \mid \theta)}{f(x \mid \theta)} = m(x, y).$$

Then any statistic corresponding to this partition is minimal sufficient.

Comment. This Lemma tells us how to define partitions that correspond to minimal sufficient statistics. It says that ratios of likelihoods of two values $x$ and $y$ in the same partition (and hence with the same statistic value) should not depend on $\theta$.

Proof (for discrete RVs)

1. Sufficiency. Suppose $T$ is such a statistic. Then

$$g(x \mid t, \theta) = \frac{f(x \mid \theta)}{f(t \mid \theta)} = \frac{f(x \mid \theta)}{\sum_{y \in \tau} f(y \mid \theta)}, \qquad \tau = \{y : T(y) = t\},$$

$$= \frac{f(x \mid \theta)}{\sum_{y \in \tau} f(x \mid \theta)\, m(x, y)} = \Bigl[\sum_{y \in \tau} m(x, y)\Bigr]^{-1},$$

which does not depend on $\theta$. Hence the partition $D$ is sufficient.

2. Minimal sufficiency. Now suppose $U$ is any other sufficient statistic and that $U(x) = U(y)$ for some pair of values $(x, y)$. If we can show that $U(x) = U(y)$ implies $T(x) = T(y)$, then the Lehmann-Scheffé partition induced by $T$ includes the partition based on any other sufficient statistic. In other words, $T$ is a function of every other sufficient statistic, and so must be minimal sufficient. Since $U$ is sufficient we have

$$\frac{L(\theta; y)}{L(\theta; x)} = \frac{K_1[u(y); \theta]\, K_2[y]}{K_1[u(x); \theta]\, K_2[x]} = \frac{K_2[y]}{K_2[x]},$$

which does not depend on $\theta$. So the statistic $U$ produces a partition at least as fine as that induced by $T$, and the result is proved.

Sufficiency in an exponential family

For a sample $X_1, \dots, X_n$ i.i.d. from a full-rank $k$-parameter exponential family it holds that:

The statistic $T(x) = \bigl(\sum_{i=1}^n B_1(x_i), \dots, \sum_{i=1}^n B_k(x_i)\bigr)$ is minimal sufficient.
The distribution of $T(x)$ belongs to a $k$-parameter exponential family.

$$L(\theta; x) = \prod_{i=1}^n f(x_i; \theta) = \exp\Bigl\{\sum_{i=1}^n \Bigl[\sum_{j=1}^k A_j(\theta) B_j(x_i) + C(x_i) + D(\theta)\Bigr]\Bigr\}$$

$$= \exp\Bigl\{\sum_{j=1}^k A_j(\theta) \sum_{i=1}^n B_j(x_i) + n D(\theta) + \sum_{i=1}^n C(x_i)\Bigr\}.$$

Exponential family form again.
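To make the roles of $A_j$, $B_j$, $C$ and $D$ concrete, here is a sketch of how the normal model fits this form. The choice of the normal distribution with both parameters unknown is an assumption made for illustration and does not appear in this part of the lecture.

```latex
% Assumed illustration: X_1, ..., X_n i.i.d. N(\mu, \sigma^2), \theta = (\mu, \sigma^2), k = 2.
\begin{align*}
f(x;\theta) &= \exp\Bigl\{ \frac{\mu}{\sigma^2}\,x - \frac{1}{2\sigma^2}\,x^2
               - \frac{\mu^2}{2\sigma^2} - \log\sigma - \tfrac12\log(2\pi) \Bigr\}, \\
A_1(\theta) &= \frac{\mu}{\sigma^2}, \quad B_1(x) = x, \qquad
A_2(\theta) = -\frac{1}{2\sigma^2}, \quad B_2(x) = x^2, \\
C(x) &= 0, \qquad
D(\theta) = -\frac{\mu^2}{2\sigma^2} - \log\sigma - \tfrac12\log(2\pi), \\
T(x) &= \Bigl( \sum_{i=1}^n x_i,\; \sum_{i=1}^n x_i^2 \Bigr).
\end{align*}
```

Under this assumption the pair (sum of the observations, sum of their squares) plays the role of $T(x)$.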

Sufficiency in an exponential family

Suppose the family is in canonical form, so $\phi_j = A_j(\theta)$, and let $t_j = \sum_{i=1}^n B_j(x_i)$ and $C(x) = \sum_{i=1}^n C(x_i)$. Then

$$L(\theta; x) = \exp\Bigl\{\sum_{j=1}^k \phi_j t_j + n D(\theta) + C(x)\Bigr\}.$$

By the factorization criterion, $t_1, \dots, t_k$ are sufficient statistics for $\phi_1, \dots, \phi_k$.

In fact, we do not need canonical form. If

$$L(\theta; x) = \exp\Bigl\{\sum_{j=1}^k A_j(\theta) t_j + n D(\theta) + C(x)\Bigr\}$$

is a minimal $k$-dimensional linear exponential family then (by the regularity conditions above) $t_1, \dots, t_k$ are minimal sufficient for $\theta_1, \dots, \theta_k$. Minimal sufficiency is verified using Lemma 1.
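The likelihood-ratio construction of Lemma 1 can be sketched numerically. The code below assumes n = 3 i.i.d. Bernoulli(p) trials and a small grid of p values; both are illustrative choices, and the function name `lehmann_scheffe_blocks` is hypothetical. Two points are grouped together when their likelihood ratio is the same at every grid value, which is a finite-grid stand-in for "does not depend on θ".

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def likelihood(x, p):
    """Probability of the 0/1 sequence x under i.i.d. Bernoulli(p) trials."""
    t = sum(x)
    return p**t * (1 - p) ** (len(x) - t)


def lehmann_scheffe_blocks(sample_space, ps):
    """Group points whose likelihood ratios are constant over the grid ps.

    L(p; y) / L(p; x) takes the same value for every p in ps exactly when the
    normalised profiles L(p; x) / L(ps[0]; x) and L(p; y) / L(ps[0]; y) agree,
    so that profile serves as the grouping key.
    """
    blocks = defaultdict(list)
    for x in sample_space:
        base = likelihood(x, ps[0])
        key = tuple(likelihood(x, p) / base for p in ps)
        blocks[key].append(x)
    return list(blocks.values())


n = 3
ps = [Fraction(1, 4), Fraction(1, 2), Fraction(2, 3)]  # assumed grid of p values
blocks = lehmann_scheffe_blocks(list(product((0, 1), repeat=n)), ps)

# Each block should be a level set of the total number of successes.
totals_per_block = sorted(sorted({sum(x) for x in b}) for b in blocks)
print(totals_per_block)
```

In this setting the blocks coincide with the level sets of the total number of successes, matching the claim that the sum is minimal sufficient for Bernoulli trials.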