TAMS24: Notations and Formulas

by Xiangfeng Yang

1 Basic notations and definitions

X: random variable (stokastisk variabel).
Mean (väntevärde): µ = E(X) = Σ_k k·p_X(k) if X is discrete, and µ = ∫ x f_X(x) dx if X is continuous.
Variance (varians): σ² = V(X) = E(X − µ)² = E(X²) − (E X)².
Standard deviation (standardavvikelse): σ = D(X) = √V(X).

Population X. Random sample (slumpmässigt stickprov): X₁, …, X_n are independent and have the same distribution as the population X. Before we observe/measure, X₁, …, X_n are random variables; after we observe/measure, we use x₁, …, x_n, which are numbers (not random variables).
Sample mean (stickprovsmedelvärde): before observing, X̄ = (1/n) Σᵢ Xᵢ; after observing, x̄ = (1/n) Σᵢ xᵢ.
Sample variance (stickprovsvarians): before observing, S² = (1/(n−1)) Σᵢ (Xᵢ − X̄)²; after observing, s² = (1/(n−1)) Σᵢ (xᵢ − x̄)².
Sample standard deviation (stickprovsstandardavvikelse): before observing, S = √S²; after observing, s = √s².

E(Σᵢ cᵢXᵢ) = Σᵢ cᵢ E(Xᵢ), and V(Σᵢ cᵢXᵢ) = Σᵢ cᵢ² V(Xᵢ) if X₁, …, X_n are independent (oberoende).
If X ~ N(µ, σ), then (X − µ)/σ ~ N(0, 1). If X₁, …, X_n are independent and Xᵢ ~ N(µᵢ, σᵢ), then
d + Σᵢ cᵢXᵢ ~ N( d + Σᵢ cᵢµᵢ, √(Σᵢ cᵢ²σᵢ²) ).

For a population X with an unknown parameter θ and a random sample {X₁, …, X_n}:
Estimator (stickprovsvariabel): Θ̂ = g(X₁, …, X_n), a random variable. Estimate (punktskattning): θ̂ = g(x₁, …, x_n), a number.
Unbiased (väntevärdesriktig): E(Θ̂) = θ.
Effective (effektiv): if two estimators Θ̂₁ and Θ̂₂ are unbiased, we say that Θ̂₁ is more effective than Θ̂₂ if V(Θ̂₁) < V(Θ̂₂).

Binomial distribution X ~ Bin(N, p): there are N independent and identical trials, each trial has probability of success p, and X = the number of successes in these N trials. The random variable X ~ Bin(N, p) has probability function (sannolikhetsfunktion)
p(k) = P(X = k) = (N choose k) p^k (1 − p)^(N−k).
Exponential distribution X ~ Exp(1/µ): used when we consider waiting times/lifetimes. The random variable X ~ Exp(1/µ) has density function (täthetsfunktion)
f(x) = (1/µ) e^(−x/µ), x ≥ 0.

2 Point estimation

Method of moments (momentmetoden): the number of equations depends on the number of unknown parameters:
E(X) = x̄, E(X²) = (1/n) Σᵢ xᵢ², E(X³) = (1/n) Σᵢ xᵢ³, …
Consistent (konsistent): an estimator Θ̂ = g(X₁, …, X_n) is consistent if
lim_{n→∞} P(|Θ̂ − θ| > ε) = 0 for every constant ε > 0.
This is called convergence in probability.
Theorem: if E(Θ̂) = θ and lim_{n→∞} V(Θ̂) = 0, then Θ̂ is consistent.

Least squares method (minsta-kvadrat-metoden): the least squares estimate θ̂ is the one minimizing
Q(θ) = Σᵢ (xᵢ − E(X))².
Maximum likelihood method (maximum-likelihood-metoden): the maximum likelihood estimate θ̂ is the one maximizing the likelihood function
L(θ) = Πᵢ f(xᵢ; θ) if X is continuous, and L(θ) = Πᵢ p(xᵢ; θ) if X is discrete.
Remark on ML: in general it is easier/better to maximize ln L(θ).
Remark on ML: if there are several random samples (say m of them) from different populations with the same unknown parameter θ, then the maximum likelihood estimate θ̂ is the one maximizing the likelihood function defined as L(θ) = L₁(θ) ··· L_m(θ), where Lᵢ(θ) is the likelihood function from the i-th population.
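As an illustration of both estimation methods (not part of the original sheet), here is a minimal Python/NumPy sketch for a hypothetical Exp(1/µ) sample with µ = 3. For this model the moment equation E(X) = x̄ and the likelihood maximization both give µ̂ = x̄, which the sketch confirms numerically:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sample from Exp(1/mu) with mu = 3.0 (value chosen for illustration).
mu_true = 3.0
x = rng.exponential(scale=mu_true, size=5_000)

# Method of moments: E(X) = mu for Exp(1/mu), so E(X) = xbar gives mu_hat = xbar.
mu_hat_mom = x.mean()

# Maximum likelihood: ln L(mu) = -n ln(mu) - sum(x)/mu; maximize over a grid
# (crude but transparent -- for Exp(1/mu) the maximizer is again xbar).
grid = np.linspace(0.5, 10.0, 20_000)
log_lik = -x.size * np.log(grid) - x.sum() / grid
mu_hat_ml = grid[np.argmax(log_lik)]

print(mu_hat_mom, mu_hat_ml)
```

Both estimates agree up to the grid spacing, and both approach µ = 3 as n grows, in line with the consistency theorem above.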

Estimates of the population variance σ²: if there is only one population with an unknown mean, the method of moments and the maximum likelihood method in general give the estimate
(σ²)^ = (1/n) Σᵢ (xᵢ − x̄)²   (NOT unbiased).
An adjusted (corrected) estimate is the sample variance
s² = (1/(n−1)) Σᵢ (xᵢ − x̄)²   (unbiased).
If there are m different populations with unknown means and the same variance σ², then an adjusted (corrected) ML estimate is
s² = ( (n₁−1)s₁² + … + (n_m−1)s_m² ) / ( n₁ + … + n_m − m )   (unbiased),
where nᵢ is the sample size of the i-th population and sᵢ² is the sample variance of the i-th population.
Standard error (medelfelet) of an estimator Θ̂: d(Θ̂) = an estimate of the standard deviation D(Θ̂).

3 Interval estimation

One sample {X₁, …, X_n} from N(µ, σ):
I_µ = x̄ ± λ_{α/2}·σ/√n, if σ is known; fact: (X̄ − µ)/(σ/√n) ~ N(0, 1).
I_µ = x̄ ± t_{α/2}(n−1)·s/√n, if σ is unknown; fact: (X̄ − µ)/(S/√n) ~ t(n−1).
I_{σ²} = ( (n−1)s²/χ²_{α/2}(n−1), (n−1)s²/χ²_{1−α/2}(n−1) ); fact: (n−1)S²/σ² ~ χ²(n−1).
The unknown σ² can be estimated by the sample variance s² = (1/(n−1)) Σᵢ (xᵢ − x̄)².

Two samples {X₁, …, X_{n₁}} from N(µ₁, σ₁) and {Y₁, …, Y_{n₂}} from N(µ₂, σ₂), independent:
I_{µ₁−µ₂} = x̄ − ȳ ± λ_{α/2} √(σ₁²/n₁ + σ₂²/n₂), if σ₁ and σ₂ are known; fact: ( X̄ − Ȳ − (µ₁−µ₂) ) / √(σ₁²/n₁ + σ₂²/n₂) ~ N(0, 1).
I_{µ₁−µ₂} = x̄ − ȳ ± t_{α/2}(n₁+n₂−2)·s·√(1/n₁ + 1/n₂), if σ₁ = σ₂ = σ is unknown; fact: ( X̄ − Ȳ − (µ₁−µ₂) ) / ( S √(1/n₁ + 1/n₂) ) ~ t(n₁+n₂−2), where the unknown σ² is estimated by the pooled sample variance s² = ( (n₁−1)s₁² + (n₂−1)s₂² ) / (n₁+n₂−2).
I_{σ²} = ( (n₁+n₂−2)s²/χ²_{α/2}(n₁+n₂−2), (n₁+n₂−2)s²/χ²_{1−α/2}(n₁+n₂−2) ), if σ₁ = σ₂ = σ is unknown; fact: (n₁+n₂−2)S²/σ² ~ χ²(n₁+n₂−2).
I_{µ₁−µ₂} = x̄ − ȳ ± t_{α/2}(f) √(s₁²/n₁ + s₂²/n₂), if both σ's are unknown (and not assumed equal); fact: ( X̄ − Ȳ − (µ₁−µ₂) ) / √(S₁²/n₁ + S₂²/n₂) ≈ t(f), with degrees of freedom
f = (s₁²/n₁ + s₂²/n₂)² / ( (s₁²/n₁)²/(n₁−1) + (s₂²/n₂)²/(n₂−1) ).

Remark: the idea of using a "fact" to find confidence intervals is very important, and there are many more confidence intervals besides the ones above. For instance, consider two independent samples {X₁, …, X_{n₁}} from N(µ₁, σ) and {Y₁, …, Y_{n₂}} from N(µ₂, σ). In this case we can easily prove that
c₁X̄ + c₂Ȳ ~ N( c₁µ₁ + c₂µ₂, σ √(c₁²/n₁ + c₂²/n₂) ).
If σ is known, then the fact ( c₁X̄ + c₂Ȳ − (c₁µ₁ + c₂µ₂) ) / ( σ √(c₁²/n₁ + c₂²/n₂) ) ~ N(0, 1) gives I_{c₁µ₁+c₂µ₂}.
If σ is unknown, then the fact ( c₁X̄ + c₂Ȳ − (c₁µ₁ + c₂µ₂) ) / ( S √(c₁²/n₁ + c₂²/n₂) ) ~ t(n₁+n₂−2) gives I_{c₁µ₁+c₂µ₂}.

3.1 Confidence intervals from normal approximations

X ~ Bin(N, p): I_p = p̂ ± λ_{α/2} √( p̂(1−p̂)/N ); fact: (P̂ − p) / √( P̂(1−P̂)/N ) ≈ N(0, 1); we require that N p̂ and N p̂(1−p̂) are sufficiently large.
X ~ Hyp(N, n, p): I_p = p̂ ± λ_{α/2} √( ((N−n)/(N−1)) · p̂(1−p̂)/n ); fact: (P̂ − p) / √( ((N−n)/(N−1)) · P̂(1−P̂)/n ) ≈ N(0, 1).
X ~ Po(µ): I_µ = x̄ ± λ_{α/2} √(x̄/n); fact: (X̄ − µ)/√(X̄/n) ≈ N(0, 1); we require that n x̄ is sufficiently large.
X ~ Exp(1/µ): I_µ = ( x̄/(1 + λ_{α/2}/√n), x̄/(1 − λ_{α/2}/√n) ); fact: (X̄ − µ)/(µ/√n) ≈ N(0, 1). Alternatively I_µ = x̄ ± λ_{α/2}·x̄/√n; fact: (X̄ − µ)/(X̄/√n) ≈ N(0, 1). We require that n ≥ 30.

Remark: again, there are more confidence intervals besides the ones above. For instance, consider two independent samples X ~ Bin(N₁, p₁) and Y ~ Bin(N₂, p₂) with unknown p₁ and p₂. As we know,
P̂₁ ≈ N( p₁, √(p₁(1−p₁)/N₁) ) and P̂₂ ≈ N( p₂, √(p₂(1−p₂)/N₂) ),
so P̂₁ − P̂₂ ≈ N( p₁ − p₂, √( p₁(1−p₁)/N₁ + p₂(1−p₂)/N₂ ) ). Therefore the fact is
( P̂₁ − P̂₂ − (p₁ − p₂) ) / √( P̂₁(1−P̂₁)/N₁ + P̂₂(1−P̂₂)/N₂ ) ≈ N(0, 1),
and I_{p₁−p₂} = p̂₁ − p̂₂ ± λ_{α/2} √( p̂₁(1−p̂₁)/N₁ + p̂₂(1−p̂₂)/N₂ ).

m samples: the unknown common σ₁ = … = σ_m = σ can be estimated by the pooled sample variance
s² = ( (n₁−1)s₁² + … + (n_m−1)s_m² ) / ( n₁ + … + n_m − m ).
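A minimal numeric sketch of the binomial interval above, using hypothetical data (412 successes in N = 1000 trials) and the standard normal quantile λ_{0.025} ≈ 1.96:

```python
import numpy as np

# Hypothetical data: 412 successes in N = 1000 trials.
N, successes = 1000, 412
p_hat = successes / N

# 95% confidence interval from the normal approximation:
# I_p = p_hat +- lambda_{alpha/2} * sqrt(p_hat (1 - p_hat) / N),
# with lambda_{0.025} = 1.96 for alpha = 0.05.
lam = 1.96
half_width = lam * np.sqrt(p_hat * (1 - p_hat) / N)
ci = (p_hat - half_width, p_hat + half_width)

print(ci)
```

Here N p̂ ≈ 412 and N p̂(1 − p̂) ≈ 242, so the normal approximation behind the interval is comfortably justified.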

3.2 Confidence intervals from the ratio of two population variances

Suppose there are two independent samples {X₁, …, X_{n₁}} from N(µ₁, σ₁) and {Y₁, …, Y_{n₂}} from N(µ₂, σ₂). Then
(n₁−1)S₁²/σ₁² ~ χ²(n₁−1) and (n₂−1)S₂²/σ₂² ~ χ²(n₂−1), therefore (S₁²/σ₁²) / (S₂²/σ₂²) ~ F(n₁−1, n₂−1),
and
I_{σ₁²/σ₂²} = ( (s₁²/s₂²) / F_{α/2}(n₁−1, n₂−1), (s₁²/s₂²) · F_{α/2}(n₂−1, n₁−1) ).

3.3 Large sample size (n ≥ 30), population may be completely unknown

If there is no information about the populations, then we can apply the Central Limit Theorem (usually with a large sample, n ≥ 30) to get approximate normal distributions. Here are two examples:
Example 1: let {X₁, …, X_n}, n ≥ 30, be a random sample from a population; then no matter what distribution the population has,
(X̄ − µ) / (s/√n) ≈ N(0, 1).
Example 2: let {X₁, …, X_{n₁}}, n₁ ≥ 30, be a random sample from a population, and {Y₁, …, Y_{n₂}}, n₂ ≥ 30, a random sample from another population independent of the first; then no matter what distributions the populations have,
( X̄ − Ȳ − (µ₁ − µ₂) ) / √( s₁²/n₁ + s₂²/n₂ ) ≈ N(0, 1).

4 Hypothesis testing

4.1 One sample and the general theory of hypothesis testing

Suppose there is a random sample {X₁, …, X_n} from a population X with an unknown parameter θ, and
H₀: θ = θ₀  vs  H₁: θ < θ₀, or θ > θ₀, or θ ≠ θ₀.

                      H₀ is true            H₀ is false and θ = θ₁
reject H₀             type I error,         power h(θ₁)
                      significance level α
don't reject H₀       1 − α                 type II error β(θ₁) = 1 − h(θ₁)

With TS := test statistic and C := critical region: reject H₀ if TS ∈ C. Regarding the p-value: for notational simplicity, we employ
reject H₀ if and only if p-value < α.

4.2 Hypothesis testing for population means

One sample {X₁, …, X_n} from N(µ, σ). Null hypothesis H₀: µ = µ₀.

σ is known: TS = (x̄ − µ₀)/(σ/√n):
H₁: µ < µ₀:  C = (−∞, −λ_α),  p-value = P( N(0,1) ≤ TS )
H₁: µ > µ₀:  C = (λ_α, +∞),  p-value = P( N(0,1) ≥ TS )
H₁: µ ≠ µ₀:  C = (−∞, −λ_{α/2}) ∪ (λ_{α/2}, +∞),  p-value = 2 P( N(0,1) ≥ |TS| )

σ is unknown: TS = (x̄ − µ₀)/(s/√n):
H₁: µ < µ₀:  C = (−∞, −t_α(n−1)),  p-value = P( t(n−1) ≤ TS )
H₁: µ > µ₀:  C = (t_α(n−1), +∞),  p-value = P( t(n−1) ≥ TS )
H₁: µ ≠ µ₀:  C = (−∞, −t_{α/2}(n−1)) ∪ (t_{α/2}(n−1), +∞),  p-value = 2 P( t(n−1) ≥ |TS| )

Two samples {X₁, …, X_{n₁}} from N(µ₁, σ₁) and {Y₁, …, Y_{n₂}} from N(µ₂, σ₂). Null hypothesis H₀: µ₁ = µ₂.

σ₁, σ₂ are known: TS = (x̄ − ȳ)/√(σ₁²/n₁ + σ₂²/n₂); fact: ( X̄ − Ȳ − (µ₁−µ₂) )/√(σ₁²/n₁ + σ₂²/n₂) ~ N(0, 1):
H₁: µ₁ < µ₂:  C = (−∞, −λ_α),  p-value = P( N(0,1) ≤ TS )
H₁: µ₁ > µ₂:  C = (λ_α, +∞),  p-value = P( N(0,1) ≥ TS )
H₁: µ₁ ≠ µ₂:  C = (−∞, −λ_{α/2}) ∪ (λ_{α/2}, +∞),  p-value = 2 P( N(0,1) ≥ |TS| )

σ₁ = σ₂ = σ is unknown: TS = (x̄ − ȳ)/( s √(1/n₁ + 1/n₂) ) with the pooled s²; fact: ( X̄ − Ȳ − (µ₁−µ₂) )/( S √(1/n₁ + 1/n₂) ) ~ t(n₁+n₂−2):
H₁: µ₁ < µ₂:  C = (−∞, −t_α(n₁+n₂−2)),  p-value = P( t(n₁+n₂−2) ≤ TS )
H₁: µ₁ > µ₂:  C = (t_α(n₁+n₂−2), +∞),  p-value = P( t(n₁+n₂−2) ≥ TS )
H₁: µ₁ ≠ µ₂:  C = (−∞, −t_{α/2}(n₁+n₂−2)) ∪ (t_{α/2}(n₁+n₂−2), +∞),  p-value = 2 P( t(n₁+n₂−2) ≥ |TS| )

Both σ's unknown (not assumed equal): similarly, with the t(f) distribution as in the tree of confidence intervals.
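The pooled two-sample t-test above can be sketched as follows. The data are simulated (means 5 and 8, common σ = 2 are assumed values for illustration), and the critical value t_{0.025}(88) ≈ 1.99 is hard-coded for the chosen sample sizes:

```python
import numpy as np

rng = np.random.default_rng(3)

# Two hypothetical samples with a common (unknown) sigma; H0: mu1 = mu2.
x = rng.normal(loc=5.0, scale=2.0, size=40)
y = rng.normal(loc=8.0, scale=2.0, size=50)
n1, n2 = x.size, y.size

# Pooled sample variance s^2 = ((n1-1)s1^2 + (n2-1)s2^2) / (n1 + n2 - 2).
s2 = ((n1 - 1) * x.var(ddof=1) + (n2 - 1) * y.var(ddof=1)) / (n1 + n2 - 2)

# Test statistic TS = (xbar - ybar) / (s sqrt(1/n1 + 1/n2)) ~ t(n1+n2-2) under H0.
ts = (x.mean() - y.mean()) / np.sqrt(s2 * (1 / n1 + 1 / n2))

# Two-sided test at alpha = 0.05: reject H0 if |TS| > t_{0.025}(88), approx. 1.99.
reject = abs(ts) > 1.99
print(ts, reject)
```

With a true mean difference of 3 and standard error about 0.42, the statistic lands far in the critical region and H₀: µ₁ = µ₂ is rejected, as expected.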

4.3 Hypothesis testing for population variances

One sample {X₁, …, X_n} from N(µ, σ); fact: (n−1)S²/σ² ~ χ²(n−1). Null hypothesis H₀: σ = σ₀. TS = (n−1)s²/σ₀²:
H₁: σ < σ₀:  C = (0, χ²_{1−α}(n−1)),  p-value = P( χ²(n−1) ≤ TS )
H₁: σ > σ₀:  C = (χ²_α(n−1), +∞),  p-value = P( χ²(n−1) ≥ TS )
H₁: σ ≠ σ₀:  C = (0, χ²_{1−α/2}(n−1)) ∪ (χ²_{α/2}(n−1), +∞),  p-value = 2 min{ P( χ²(n−1) ≤ TS ), P( χ²(n−1) ≥ TS ) }

Two samples {X₁, …, X_{n₁}} from N(µ₁, σ₁) and {Y₁, …, Y_{n₂}} from N(µ₂, σ₂); fact: (S₁²/σ₁²)/(S₂²/σ₂²) ~ F(n₁−1, n₂−1). Null hypothesis H₀: σ₁ = σ₂. TS = s₁²/s₂²:
H₁: σ₁ < σ₂:  C = (0, F_{1−α}(n₁−1, n₂−1)),  p-value = P( F(n₁−1, n₂−1) ≤ TS )
H₁: σ₁ > σ₂:  C = (F_α(n₁−1, n₂−1), +∞),  p-value = P( F(n₁−1, n₂−1) ≥ TS )
H₁: σ₁ ≠ σ₂:  C = (0, F_{1−α/2}(n₁−1, n₂−1)) ∪ (F_{α/2}(n₁−1, n₂−1), +∞),  p-value = 2 min{ P( F ≤ TS ), P( F ≥ TS ) }

4.4 Large sample size (n ≥ 30), population may be completely unknown

If there is no information about the populations, then we can apply the Central Limit Theorem (usually with a large sample, n ≥ 30). The idea is exactly the same as the one used for confidence intervals. One example: a sample {X₁, …, X_n}, n ≥ 30, from some unknown population with mean µ and standard deviation σ. Null hypothesis H₀: µ = µ₀. It follows from the CLT that (X̄ − µ)/(S/√n) ≈ N(0, 1), therefore TS = (x̄ − µ₀)/(s/√n):
H₁: µ < µ₀:  C = (−∞, −λ_α),  p-value = P( N(0,1) ≤ TS )
H₁: µ > µ₀:  C = (λ_α, +∞),  p-value = P( N(0,1) ≥ TS )
H₁: µ ≠ µ₀:  C = (−∞, −λ_{α/2}) ∪ (λ_{α/2}, +∞),  p-value = 2 P( N(0,1) ≥ |TS| )

5 Multi-dimensional random variables (random vectors)

Covariance (kovarians) of X, Y: σ_{X,Y} = cov(X, Y) = E[ (X − µ_X)(Y − µ_Y) ]; cov(X, X) = V(X).
Correlation coefficient (korrelation) of X, Y: ρ_{X,Y} = cov(X, Y)/(σ_X σ_Y) = cov(X, Y)/√( V(X) V(Y) ).
A rule: for real constants a₀, aᵢ, b₀ and b_j,
cov( a₀ + Σᵢ aᵢXᵢ, b₀ + Σ_j b_jY_j ) = Σᵢ Σ_j aᵢ b_j cov(Xᵢ, Y_j).
X and Y are uncorrelated if cov(X, Y) = 0.
An important theorem: suppose that a random vector X has mean vector µ_X and covariance matrix C_X. Define a new random vector Y = AX + b, for some matrix A and vector b. Then
µ_Y = A µ_X + b,  C_Y = A C_X Aᵀ.
Standard normal vector: X = (X₁, …, X_n)ᵀ with the Xᵢ independent and Xᵢ ~ N(0, 1); thus µ_X = 0, C_X = I, and the density is
f_X(x) = (2π)^(−n/2) e^(−xᵀx/2).
General normal vector: Y = AX + b, where X is a standard normal vector; then µ_Y = b, C_Y = AAᵀ, and the density is
f_Y(y) = (2π)^(−n/2) det(C_Y)^(−1/2) exp( −(y − µ_Y)ᵀ C_Y⁻¹ (y − µ_Y)/2 ).

6 Simple and multiple linear regression

Simple linear regression: Y_j = β₀ + β₁x_j + ε_j, ε_j ~ N(0, σ), j = 1, …, n.
Multiple linear regression: Y_j = β₀ + β₁x_{j1} + β₂x_{j2} + … + β_k x_{jk} + ε_j, ε_j ~ N(0, σ), j = 1, …, n.
Both simple and multiple linear regression can be written in the vector form Y = Xβ + ε:
Y = (Y₁, …, Y_n)ᵀ,  X is the n×(k+1) matrix whose j-th row is (1, x_{j1}, …, x_{jk}),  β = (β₀, β₁, …, β_k)ᵀ,  ε ~ N(0, σ²I).
Then Y ~ N(µ_Y, C_Y), where µ_Y = Xβ and C_Y = σ²I.
Estimate of the coefficients β: β̂ = (XᵀX)⁻¹Xᵀy.
Estimator of the coefficients β: B̂ = (XᵀX)⁻¹XᵀY ~ N( β, σ²(XᵀX)⁻¹ ).
Estimated line: µ̂_j = β̂₀ + β̂₁x_{j1} + β̂₂x_{j2} + … + β̂_k x_{jk}.
Analysis of variance:
SS_TOT = Σ_j (y_j − ȳ)²,  with SS_TOT/σ² = Σ_j (Y_j − Ȳ)²/σ² ~ χ²(n−1) if β₁ = … = β_k = 0,
SS_R = Σ_j (µ̂_j − ȳ)²,  with SS_R/σ² = Σ_j (µ̂_j − Ȳ)²/σ² ~ χ²(k) if β₁ = … = β_k = 0,
SS_E = Σ_j (y_j − µ̂_j)²,  with SS_E/σ² = Σ_j (Y_j − µ̂_j)²/σ² ~ χ²(n−k−1).
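The estimate β̂ = (XᵀX)⁻¹Xᵀy and the sums of squares above can be computed directly; in this sketch the data are simulated from an assumed model y = 1 + 2x₁ − 0.5x₂ + ε with σ = 0.3:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical data from Y = 1 + 2*x1 - 0.5*x2 + eps, eps ~ N(0, 0.3).
n = 200
x1, x2 = rng.uniform(0, 5, n), rng.uniform(0, 5, n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(0, 0.3, n)

# Design matrix X with a leading column of ones for beta_0.
X = np.column_stack([np.ones(n), x1, x2])

# Least squares estimate beta_hat = (X^T X)^{-1} X^T y (solve, don't invert).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Fitted values and the ANOVA decomposition SS_TOT = SS_R + SS_E.
mu_hat = X @ beta_hat
ss_tot = np.sum((y - y.mean()) ** 2)
ss_r = np.sum((mu_hat - y.mean()) ** 2)
ss_e = np.sum((y - mu_hat) ** 2)

print(beta_hat, ss_r + ss_e, ss_tot)
```

Solving the normal equations with `np.linalg.solve` is numerically preferable to forming (XᵀX)⁻¹ explicitly, and SS_R + SS_E reproduces SS_TOT up to floating-point error.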

SS_TOT = SS_R + SS_E, and the coefficient of determination is R² = SS_R/SS_TOT. σ² is estimated by
σ̂² = s² = SS_E/(n − k − 1).

For the hypothesis test H₀: β₁ = … = β_k = 0 vs H₁: at least one β_j ≠ 0, the fact is
( SS_R/k ) / ( SS_E/(n−k−1) ) ~ F(k, n−k−1),
so TS = ( SS_R/k ) / ( SS_E/(n−k−1) ) and C = ( F_α(k, n−k−1), +∞ ).

We know B̂ = (XᵀX)⁻¹XᵀY ~ N( β, σ²(XᵀX)⁻¹ ); if we denote
(XᵀX)⁻¹ = (h_{ij}), i, j = 0, 1, …, k,
then B̂_j ~ N( β_j, σ√h_{jj} ) and (B̂_j − β_j)/(σ√h_{jj}) ~ N(0, 1). But σ is generally unknown, therefore we use
(B̂_j − β_j)/(S√h_{jj}) ~ t(n−k−1).
Confidence interval for β_j: I_{β_j} = β̂_j ± t_{α/2}(n−k−1)·s√h_{jj}. (s√h_{jj} is sometimes denoted d(β̂_j) or se(β̂_j).)
Hypothesis testing H₀: β_j = 0 vs H₁: β_j ≠ 0 has TS = β̂_j/(s√h_{jj}) and C = (−∞, −t_{α/2}(n−k−1)) ∪ (t_{α/2}(n−k−1), +∞).

Suppose we have two models:
Model 1: Y = β₀ + β₁x₁ + … + β_k x_k + ε,
Model 2: Y = β₀ + β₁x₁ + … + β_k x_k + β_{k+1}x_{k+1} + … + β_{k+p}x_{k+p} + ε,
and we want to test H₀: β_{k+1} = … = β_{k+p} = 0 vs H₁: at least one β_{k+i} ≠ 0. The fact is
( (SS_E^(1) − SS_E^(2))/p ) / ( SS_E^(2)/(n − (k+p) − 1) ) ~ F( p, n − (k+p) − 1 ),
so TS = ( (SS_E^(1) − SS_E^(2))/p ) / ( SS_E^(2)/(n − (k+p) − 1) ) and C = ( F_α(p, n − (k+p) − 1), +∞ ).

Variable selection: if we have a response variable y with possibly many predictors x₁, …, x_k, how do we choose appropriate x's (some x's are useful for Y and some are not)?
Step 1: compute corr(x₁, y), …, corr(x_k, y) and choose the predictor with maximal correlation (say xᵢ); fit Y = β₀ + βᵢxᵢ + ε and test whether βᵢ = 0.
Step 2: fit Y = β₀ + βᵢxᵢ + β_ℓ x_ℓ + ε for ℓ = 1, …, i−1, i+1, …, k, choose the x_ℓ giving minimal SS_E (say x_j), and test whether β_j = 0.
Step 3: repeat Step 2 until the last test of β = 0 is not rejected.

Rewrite the simple and multiple linear regressions as follows:
Y = β₀ + β₁x₁ + … + β_k x_k + ε, ε ~ N(0, σ)   (the model),
µ = E(Y) = β₀ + β₁x₁ + … + β_k x_k   (the mean),
µ̂ = β̂₀ + β̂₁x₁ + … + β̂_k x_k   (the estimated line).
For a given/fixed x₀ = (1, x₁, …, x_k)ᵀ, the scalar µ̂₀ is an estimate of the unknown µ₀ and of Y₀. We can then talk about the accuracy of this estimate in terms of confidence intervals and prediction intervals:
Confidence interval for µ₀: I_µ = µ̂₀ ± t_{α/2}(n−k−1)·s √( x₀ᵀ(XᵀX)⁻¹x₀ ).
Prediction interval for Y₀: I_Y = µ̂₀ ± t_{α/2}(n−k−1)·s √( 1 + x₀ᵀ(XᵀX)⁻¹x₀ ).

7 Basic χ²-test

Suppose we want to test
H₀: X follows a given distribution (with or without unknown parameters)  vs
H₁: X does not follow that distribution.
The fact is
Σ_{i=1}^k (Nᵢ − n pᵢ)² / (n pᵢ) ≈ χ²( k − 1 − #of unknown parameters ),
so TS = Σ_{i=1}^k (nᵢ − n pᵢ)² / (n pᵢ) and C = ( χ²_α(k − 1 − #of unknown parameters), +∞ ).

Homogeneity test: suppose we have data with r rows and k columns, and
H₀: different rows have the same pattern in terms of the columns  vs
H₁: different rows have different patterns in terms of the columns.
Equivalently,
H₀: rows and columns are independent  vs  H₁: rows and columns are not independent.
The fact is
Σ_{i=1}^r Σ_{j=1}^k (N_{ij} − n p_{ij})² / (n p_{ij}) ≈ χ²( (r−1)(k−1) ),
so TS = Σ_{i=1}^r Σ_{j=1}^k (n_{ij} − n p_{ij})² / (n p_{ij}) and C = ( χ²_α((r−1)(k−1)), +∞ ), where p_{ij} = pᵢ q_j are the theoretical probabilities.
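A minimal sketch of the basic χ²-test, using hypothetical die-roll counts and the critical value χ²_{0.05}(5) ≈ 11.07:

```python
import numpy as np

# Hypothetical die-roll data: observed counts of faces 1..6 in n = 120 rolls.
observed = np.array([18, 24, 16, 21, 19, 22])
n = observed.sum()

# H0: the die is fair, i.e. p_i = 1/6 for every face (no unknown parameters).
p = np.full(6, 1 / 6)
expected = n * p

# TS = sum (N_i - n p_i)^2 / (n p_i) ~ chi2(k - 1) = chi2(5) under H0.
ts = np.sum((observed - expected) ** 2 / expected)

# chi2_{0.05}(5) is approx. 11.07: reject H0 at level 0.05 if TS > 11.07.
reject = ts > 11.07
print(ts, reject)
```

Here TS = 2.1 is far below the critical value, so H₀ (a fair die) is not rejected at the 5% level.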