Course code: TAMS24 (Statistisk teori) / exam code: TEN, 08:00-12:00. English Version.


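Course code: TAMS24 (Statistisk teori) / exam code: TEN 08-08-4, 08:00-12:00
Examiner: Zhenxia Liu (Tel: 070089508)
You are permitted to bring: a calculator, and "Formel- och tabellsamling i matematisk statistik".
Grading: 8-11 points give grade 3; 11.5-14.5 points give grade 4; 15-18 points give grade 5.

English Version

1 (3 points)
Assume that the distribution of lifetimes (unit: year) of a certain type of electronic component is $Exp(1/\mu)$, where the true average lifetime $\mu$ is unknown. One chose 400 such electronic components, and after one year 109 components still worked (namely, the other 291 components were broken after one year). Based on this information, use the method of moments to find a point estimate of $\mu$.

Solution. Let $X$ = number of components which still work after one year. Then $X \sim Bin(400, p)$, where
\[ p = P(Exp(1/\mu) > 1) = \int_1^{\infty} \frac{1}{\mu} e^{-x/\mu}\,dx = e^{-1/\mu}. \]
It follows from the method of moments that $E(X) = \bar{x}$ (here $\bar{x} = x_1/1 = 109$), therefore $400p = 109$, implying $e^{-1/\mu} = 109/400$ and
\[ \hat{\mu} = -1/\ln(109/400) = 0.77. \]

2 (3 points)
A random sample $\{X_1, \ldots, X_n\}$ is taken from a population $N(\mu, \sigma)$ with unknown $\mu$ and known $\sigma$.
(2.1) (1p) Find a point estimator of $\mu$ using the Maximum-Likelihood method.
(2.2) (1p) Is this point estimator in (2.1) unbiased? Why?
(2.3) (1p) Is this point estimator in (2.1) consistent? Why?

Solution.
(2.1) The likelihood function is
\[ L(\mu) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(X_i-\mu)^2/(2\sigma^2)} = \Big(\frac{1}{\sqrt{2\pi}\,\sigma}\Big)^{n} e^{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(X_i-\mu)^2}. \]
Taking the logarithm gives
\[ \ln L(\mu) = -n \ln(\sqrt{2\pi}\,\sigma) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(X_i-\mu)^2. \]
In order to find the maximal point, we take the first derivative:
\[ 0 = \frac{d}{d\mu}\ln L(\mu) = \frac{1}{\sigma^2}\sum_{i=1}^{n}(X_i-\mu) \;\Longrightarrow\; \hat{\mu} = \bar{X}. \]
The second derivative rule verifies that $\hat{\mu} = \bar{X}$ is indeed a maximum.
(2.2) Yes, since
\[ E(\hat{\mu}) = E(\bar{X}) = E\Big(\frac{1}{n}(X_1 + \cdots + X_n)\Big) = \frac{1}{n}\big(E(X_1) + \cdots + E(X_n)\big) = \frac{1}{n}\, n\mu = \mu. \]
(2.3) Yes, since $\hat{\mu}$ is unbiased and
\[ V(\hat{\mu}) = V\Big(\frac{1}{n}(X_1 + \cdots + X_n)\Big) = \frac{1}{n^2}\big(V(X_1) + \cdots + V(X_n)\big) = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n} \to 0, \quad \text{as } n \to \infty. \]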
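As a quick numerical check of the method-of-moments estimate in Problem 1, the following Python sketch solves $400\,e^{-1/\mu} = 109$ for $\mu$. It assumes the counts are 109 working components out of 400; the variable names are illustrative only.

```python
import math

# Assumed data for Problem 1: 400 components, 109 still working after one year.
n_components = 400
n_working = 109

# Method of moments: E(X) = 400 * p with p = exp(-1/mu); equate to the observed count.
p_hat = n_working / n_components

# Solve exp(-1/mu) = p_hat for mu.
mu_hat = -1.0 / math.log(p_hat)

print(round(mu_hat, 2))
```

The same two lines work for any survival count strictly between 0 and 400; at the boundaries the logarithm is either undefined or zero, so no finite estimate exists.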

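3 (3 points)
One wants to collect a random sample of $n$ values from a population $Po(\mu)$. Using the sample, one intends to test the null hypothesis $H_0: \mu = 4$ against the alternative hypothesis $H_1: \mu > 4$ such that the probability of the first type error is 0.05 and the probability of the second type error is 0.01 with the true $\mu = 5$. How should $n$ be chosen?

Solution. Let us pretend that $n$ is large so that we can use normal approximations, that is
\[ X_1 + \cdots + X_n \sim Po(n\mu) \approx N(n\mu, \sqrt{n\mu}). \]
It then follows that
\[ \frac{\bar{X} - \mu}{\sqrt{\mu/n}} \approx N(0, 1). \]
The fact that the first type error is 0.05 gives that $H_0$ is rejected when
\[ \frac{\bar{X} - \mu_0}{\sqrt{\mu_0/n}} > z_{0.05} = 1.645. \]
Therefore,
\[ 0.01 = \text{the second type error} = P(\text{don't reject } H_0 \text{ when } H_0 \text{ is false and } \mu = 5) \]
\[ = P\Big(\frac{\bar{X} - \mu_0}{\sqrt{\mu_0/n}} \le 1.645 \text{ when } \mu = 5\Big) = P\Big(\bar{X} \le 4 + 1.645\sqrt{4/n} \text{ when } \mu = 5\Big) \]
\[ = P\Big(\frac{\bar{X} - \mu}{\sqrt{\mu/n}} \le \frac{4 + 1.645\sqrt{4/n} - 5}{\sqrt{5/n}}\Big) \approx P\Big(N(0, 1) \le \frac{4 + 1.645\sqrt{4/n} - 5}{\sqrt{5/n}}\Big). \]
Therefore,
\[ \frac{4 + 1.645\sqrt{4/n} - 5}{\sqrt{5/n}} = -2.33 \;\Longrightarrow\; n = 72.25, \text{ i.e. } n = 73. \]

4 (3 points)
Assume that
\[ X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N\left( \begin{pmatrix} 2 \\ 5 \end{pmatrix}, \begin{pmatrix} 4 & 1 \\ 1 & 3 \end{pmatrix} \right). \]
One wants to make a linear combination $Y = aX_1 + bX_2$ such that the mean $E(Y) = 8$ and the variance $V(Y)$ is minimized. Determine $a$ and $b$.

Solution. It follows from
\[ 8 = E(Y) = aE(X_1) + bE(X_2) = 2a + 5b \]
that $a = (8 - 5b)/2$. Then the variance is computed as
\[ V(Y) = V(aX_1 + bX_2) = a^2 V(X_1) + b^2 V(X_2) + 2ab\,cov(X_1, X_2) = 4a^2 + 3b^2 + 2ab \]
\[ = (8 - 5b)^2 + 3b^2 + (8 - 5b)b = 23b^2 - 72b + 64. \]
To find the minimal value of $V(Y)$ we just take the first derivative:
\[ 0 = dV(Y)/db = 46b - 72 \;\Longrightarrow\; b = 72/46 = 1.565 \;\Longrightarrow\; a = (8 - 5b)/2 = 0.087. \]

5 (3 points)
The number of cars passing a bridge can be assumed to be Poisson distributed with a mean of $\mu_1$ cars per minute from North and a mean of $\mu_2$ cars per minute from South. Suppose that the number of cars from North is independent of the number of cars from South. During an hour 160 cars passed, of which 90 cars were from North. Find a 95% confidence interval for $\mu_1 - \mu_2$.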

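Solution. Let
\[ X = \text{number of cars from North in an hour} \sim Po(60\mu_1) \approx N(60\mu_1, \sqrt{60\mu_1}), \]
\[ Y = \text{number of cars from South in an hour} \sim Po(60\mu_2) \approx N(60\mu_2, \sqrt{60\mu_2}). \]
Then
\[ X - Y \approx N\big(60\mu_1 - 60\mu_2, \sqrt{60\mu_1 + 60\mu_2}\big) \;\Longrightarrow\; \frac{\big(\frac{X}{60} - \frac{Y}{60}\big) - (\mu_1 - \mu_2)}{\sqrt{\frac{\mu_1 + \mu_2}{60}}} \approx N(0, 1). \]
Therefore, the confidence interval for $\mu_1 - \mu_2$ is
\[ I_{\mu_1 - \mu_2} = \Big(\frac{x}{60} - \frac{y}{60}\Big) \pm z_{\alpha/2}\sqrt{\frac{\hat{\mu}_1 + \hat{\mu}_2}{60}} = \Big(\frac{90}{60} - \frac{70}{60}\Big) \pm 1.96\sqrt{\frac{90/60 + 70/60}{60}} \]
\[ = 0.333 \pm 1.96 \cdot 0.211 = 0.333 \pm 0.413 = (-0.08, 0.746). \]

6 (3 points)
One wishes to investigate whether or not the check-out frequency in a certain library varies with the day of the week. During a randomly chosen week one counts the number of books checked out on the individual days:

weekday               Monday   Tuesday   Wednesday   Thursday   Friday
# books checked out   108      135       114         146        120

Test on a significance level $\alpha = 0.01$ whether or not the check-out frequency varies with the day of the week.

Solution. In this case,
\[ H_0: p_1 = p_2 = p_3 = p_4 = p_5 = 0.2 \quad \text{against} \quad H_1: \text{some } p_i \ne 0.2. \]
Then the test statistic is
\[ TS = \sum_{i=1}^{5} \frac{(N_i - n p_i)^2}{n p_i} = 7.8266, \]
and the rejection region is
\[ C = (\chi^2_{\alpha}(5 - 1), +\infty) = (13.28, +\infty). \]
It is clear that $TS \notin C$, so we do not reject $H_0$, i.e. there is no evidence that the frequency varies with the day of the week.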
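The goodness-of-fit statistic in Problem 6 can be recomputed with a few lines of Python. This is a minimal sketch that assumes the weekday counts are 108, 135, 114, 146 and 120, and takes the critical value 13.28 for $\chi^2_{0.01}(4)$ from a table rather than computing it; all names are illustrative.

```python
# Assumed weekday counts (Monday..Friday) for Problem 6.
observed = [108, 135, 114, 146, 120]

n = sum(observed)
expected = n * 0.2  # under H0 every weekday has probability 0.2

# Pearson statistic: sum over days of (N_i - n*p_i)^2 / (n*p_i).
ts = sum((count - expected) ** 2 / expected for count in observed)

# Upper 1% point of chi-square with 5 - 1 = 4 degrees of freedom (table value).
critical_value = 13.28
reject_h0 = ts > critical_value

print(round(ts, 4), reject_h0)
```

Since the statistic stays below the critical value, the sketch agrees with the conclusion that $H_0$ is not rejected.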


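TAMS24: Notations and Formulas
by Xiangfeng Yang

1 Basic notations and definitions

X: random variable (stokastisk variabel);

Mean (Väntevärde):
\[ \mu = E(X) = \begin{cases} \sum_k k\, p_X(k), & \text{if } X \text{ is discrete}, \\ \int x f_X(x)\,dx, & \text{if } X \text{ is continuous}; \end{cases} \]

Variance (Varians): $\sigma^2 = V(X) = E((X - \mu)^2) = E(X^2) - (E(X))^2$;

Standard deviation (Standardavvikelse): $\sigma = D(X) = \sqrt{V(X)}$;

Population $X$;

Random sample (slumpmässigt stickprov): $X_1, \ldots, X_n$ are independent and have the same distribution as the population $X$. Before observing/measuring, $X_1, \ldots, X_n$ are random variables, and after observing/measuring, we use $x_1, \ldots, x_n$ (which are numbers, not random variables);

Sample mean (Stickprovsmedelvärde): before observing/measuring, $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$, and after observing/measuring, $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$;

Sample variance (Stickprovsvarians): before observing/measuring, $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$, and after observing/measuring, $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$;

Sample standard deviation (Stickprovsstandardavvikelse): before observing/measuring, $S = \sqrt{S^2}$, and after observing/measuring, $s = \sqrt{s^2}$;

$E(\sum_{i=1}^{n} c_i X_i) = \sum_{i=1}^{n} c_i E(X_i)$, and $V(\sum_{i=1}^{n} c_i X_i) = \sum_{i=1}^{n} c_i^2 V(X_i)$ if $X_1, \ldots, X_n$ are independent (oberoende);

If $X \sim N(\mu, \sigma)$, then $\frac{X - \mu}{\sigma} \sim N(0, 1)$;

If $X_1, \ldots, X_n$ are independent and $X_i \sim N(\mu_i, \sigma_i)$, then
\[ d + \sum_{i=1}^{n} c_i X_i \sim N\Big(d + \sum_{i=1}^{n} c_i \mu_i, \sqrt{\sum_{i=1}^{n} c_i^2 \sigma_i^2}\Big); \]

For a population $X$ with an unknown parameter $\theta$, and a random sample $\{X_1, \ldots, X_n\}$:

Estimator (Stickprovsvariabel): $\hat{\Theta} = g(X_1, \ldots, X_n)$, a random variable;

Estimate (Punktskattning): $\hat{\theta} = g(x_1, \ldots, x_n)$, a number;

Unbiased (Väntevärdesriktig): $E(\hat{\Theta}) = \theta$;

Effective (Effektiv): if two estimators $\hat{\Theta}_1$ and $\hat{\Theta}_2$ are unbiased, we say that $\hat{\Theta}_1$ is more effective than $\hat{\Theta}_2$ if $V(\hat{\Theta}_1) < V(\hat{\Theta}_2)$;

Binomial distribution $X \sim Bin(N, p)$: there are $N$ independent and identical trials, each trial has a probability of success $p$, and $X$ = the number of successes in these $N$ trials. The random variable $X \sim Bin(N, p)$ has a probability function (sannolikhetsfunktion)
\[ p(k) = P(X = k) = \binom{N}{k} p^k (1 - p)^{N-k}; \]

Exponential distribution $X \sim Exp(1/\mu)$: used when we consider a waiting time/lifetime. The random variable $X \sim Exp(1/\mu)$ has a density function (täthetsfunktion)
\[ f(x) = \frac{1}{\mu} e^{-x/\mu}, \quad x \ge 0. \]

2 Point estimation

Method of moments (Momentmetoden): the number of equations depends on the number of unknown parameters,
\[ E(X) = \bar{x}, \quad E(X^2) = \frac{1}{n}\sum_{i=1}^{n} x_i^2, \quad E(X^3) = \frac{1}{n}\sum_{i=1}^{n} x_i^3, \quad \ldots \]

Consistent (Konsistent): an estimator $\hat{\Theta} = g(X_1, \ldots, X_n)$ is consistent if
\[ \lim_{n \to \infty} P(|\hat{\Theta} - \theta| > \varepsilon) = 0, \quad \text{for any constant } \varepsilon > 0. \]
(This is called convergence in probability.)
Theorem: If $E(\hat{\Theta}) = \theta$ and $\lim_{n \to \infty} V(\hat{\Theta}) = 0$, then $\hat{\Theta}$ is consistent.

Least square method (minsta-kvadrat-metoden): the least square estimate $\hat{\theta}$ is the one minimizing
\[ Q(\theta) = \sum_{i=1}^{n} (x_i - E(X))^2. \]

Maximum-likelihood method (Maximum-likelihood-metoden): the maximum-likelihood estimate $\hat{\theta}$ is the one maximizing the likelihood function
\[ L(\theta) = \begin{cases} \prod_{i=1}^{n} f(x_i; \theta), & \text{if } X \text{ is continuous}, \\ \prod_{i=1}^{n} p(x_i; \theta), & \text{if } X \text{ is discrete}. \end{cases} \]
Remark 1 on ML: in general, it is easier/better to maximize $\ln L(\theta)$;
Remark 2 on ML: if there are several random samples (say $m$) from different populations with the same unknown parameter $\theta$, then the maximum-likelihood estimate $\hat{\theta}$ is the one maximizing the likelihood function defined as $L(\theta) = L_1(\theta) \cdots L_m(\theta)$, where $L_i(\theta)$ is the likelihood function from the $i$-th population.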
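To illustrate the maximum-likelihood method and the remark about maximizing $\ln L(\theta)$, here is a small Python sketch for an $Exp(1/\mu)$ sample. The data are hypothetical values invented for illustration, and the crude grid search is only a stand-in for solving $\frac{d}{d\mu}\ln L(\mu) = 0$ by hand, which gives $\hat{\mu} = \bar{x}$.

```python
import math

def log_likelihood(mu, xs):
    """ln L(mu) for an Exp(1/mu) sample: sum of ln((1/mu) * exp(-x/mu))."""
    return sum(-math.log(mu) - x / mu for x in xs)

# Hypothetical lifetimes (in years), invented for illustration only.
xs = [0.3, 1.2, 0.7, 2.1, 0.5]

# Crude grid search over mu in (0, 5] with step 0.001; a sketch, not an efficient optimizer.
grid = [i / 1000 for i in range(1, 5001)]
mu_ml = max(grid, key=lambda mu: log_likelihood(mu, xs))

# The analytic maximizer is the sample mean.
x_bar = sum(xs) / len(xs)

print(mu_ml, x_bar)
```

The grid maximizer lands on the sample mean up to the grid resolution, which matches the analytic result for this distribution.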

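Estimates of population variance $\sigma^2$: if there is only one population with an unknown mean, then the method of moments and the maximum-likelihood method, in general, give an estimate of $\sigma^2$ as follows:
\[ \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 \quad \text{(NOT unbiased)}. \]
An adjusted (or corrected) estimate would be the sample variance
\[ s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 \quad \text{(unbiased)}. \]
If there are $m$ different populations with unknown means and the same variance $\sigma^2$, then an adjusted (or corrected) ML estimate is
\[ s^2 = \frac{(n_1 - 1)s_1^2 + \cdots + (n_m - 1)s_m^2}{(n_1 - 1) + \cdots + (n_m - 1)} \quad \text{(unbiased)}, \]
where $n_i$ is the sample size of the $i$-th population, and $s_i^2$ is the sample variance of the $i$-th population.

Standard error (medelfelet) of an estimator $\hat{\Theta}$: an estimate of the standard deviation $D(\hat{\Theta})$.

3 Interval estimation

One sample $\{X_1, \ldots, X_n\}$ from $N(\mu, \sigma)$:

$I_\mu = \bar{x} \pm \lambda_{\alpha/2}\, \frac{\sigma}{\sqrt{n}}$, if $\sigma$ is known; fact: $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$.

$I_\mu = \bar{x} \pm t_{\alpha/2}(n - 1)\, \frac{s}{\sqrt{n}}$, if $\sigma$ is unknown; fact: $\frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t(n - 1)$.

$I_{\sigma^2} = \Big(\frac{(n-1)s^2}{\chi^2_{\alpha/2}(n-1)}, \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}(n-1)}\Big)$; fact: $\frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n - 1)$.

Unknown $\sigma^2$ can be estimated by the sample variance $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$.

Two samples $\{X_1, \ldots, X_{n_1}\}$ from $N(\mu_1, \sigma_1)$ and $\{Y_1, \ldots, Y_{n_2}\}$ from $N(\mu_2, \sigma_2)$; $N(\mu_1, \sigma_1)$ and $N(\mu_2, \sigma_2)$ are independent:

$I_{\mu_1 - \mu_2} = (\bar{x} - \bar{y}) \pm \lambda_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$, if $\sigma_1$ and $\sigma_2$ are known; fact: $\frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} \sim N(0, 1)$.

$I_{\mu_1 - \mu_2} = (\bar{x} - \bar{y}) \pm t_{\alpha/2}(n_1 + n_2 - 2)\, s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$, if $\sigma_1 = \sigma_2 = \sigma$ is unknown; fact: $\frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{S\sqrt{1/n_1 + 1/n_2}} \sim t(n_1 + n_2 - 2)$.

$I_{\mu_1 - \mu_2} = (\bar{x} - \bar{y}) \pm t_{\alpha/2}(f)\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$, if $\sigma_1 \ne \sigma_2$ are both unknown; fact: $\frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{S_1^2/n_1 + S_2^2/n_2}} \approx t(f)$, with degrees of freedom
\[ f = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}. \]

$I_{\sigma^2} = \Big(\frac{(n_1 + n_2 - 2)s^2}{\chi^2_{\alpha/2}(n_1 + n_2 - 2)}, \frac{(n_1 + n_2 - 2)s^2}{\chi^2_{1-\alpha/2}(n_1 + n_2 - 2)}\Big)$, if $\sigma_1 = \sigma_2 = \sigma$; fact: $\frac{(n_1 + n_2 - 2)S^2}{\sigma^2} \sim \chi^2(n_1 + n_2 - 2)$.

Unknown $\sigma^2$ can be estimated by the pooled sample variance $s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$.

$m$ samples: the unknown $\sigma_1 = \cdots = \sigma_m = \sigma$ can be estimated by $s^2 = \frac{(n_1 - 1)s_1^2 + \cdots + (n_m - 1)s_m^2}{(n_1 - 1) + \cdots + (n_m - 1)}$.

Remark: The idea of using a "fact" to find confidence intervals is very important. There are many more confidence intervals besides the ones above. For instance, we consider two independent samples: $\{X_1, \ldots, X_{n_1}\}$ from $N(\mu_1, \sigma)$ and $\{Y_1, \ldots, Y_{n_2}\}$ from $N(\mu_2, \sigma)$. In this case, we can easily prove that
\[ c_1\bar{X} + c_2\bar{Y} \sim N\Big(c_1\mu_1 + c_2\mu_2, \sigma\sqrt{\frac{c_1^2}{n_1} + \frac{c_2^2}{n_2}}\Big). \]
If $\sigma$ is known, then the fact is $\frac{(c_1\bar{X} + c_2\bar{Y}) - (c_1\mu_1 + c_2\mu_2)}{\sigma\sqrt{c_1^2/n_1 + c_2^2/n_2}} \sim N(0, 1)$, so we can find $I_{c_1\mu_1 + c_2\mu_2}$.
If $\sigma$ is unknown, then the fact is $\frac{(c_1\bar{X} + c_2\bar{Y}) - (c_1\mu_1 + c_2\mu_2)}{S\sqrt{c_1^2/n_1 + c_2^2/n_2}} \sim t(n_1 + n_2 - 2)$, so we can find $I_{c_1\mu_1 + c_2\mu_2}$.

3.1 Confidence intervals from normal approximations

$X \sim Bin(N, p)$: $I_p = \hat{p} \pm \lambda_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{N}}$; fact: $\frac{\hat{P} - p}{\sqrt{\hat{P}(1 - \hat{P})/N}} \approx N(0, 1)$ (we require that $N\hat{p} > 10$ and $N\hat{p}(1 - \hat{p}) > 10$).

$X \sim Hyp(N, n, p)$: $I_p = \hat{p} \pm \lambda_{\alpha/2}\sqrt{\frac{N - n}{N - 1} \cdot \frac{\hat{p}(1 - \hat{p})}{n}}$; fact: $\frac{\hat{P} - p}{\sqrt{\frac{N - n}{N - 1} \cdot \frac{\hat{P}(1 - \hat{P})}{n}}} \approx N(0, 1)$.

$X \sim Po(\mu)$: $I_\mu = \bar{x} \pm \lambda_{\alpha/2}\sqrt{\frac{\bar{x}}{n}}$; fact: $\frac{\bar{X} - \mu}{\sqrt{\bar{X}/n}} \approx N(0, 1)$ (we require that $n\bar{x} > 15$).

$X \sim Exp(1/\mu)$: $I_\mu = \Big(\frac{\bar{x}}{1 + \lambda_{\alpha/2}/\sqrt{n}}, \frac{\bar{x}}{1 - \lambda_{\alpha/2}/\sqrt{n}}\Big)$, fact: $\frac{\bar{X} - \mu}{\mu/\sqrt{n}} \approx N(0, 1)$; or $I_\mu = \bar{x} \pm \lambda_{\alpha/2}\frac{\bar{x}}{\sqrt{n}}$, fact: $\frac{\bar{X} - \mu}{\bar{X}/\sqrt{n}} \approx N(0, 1)$ (we require that $n \ge 30$).

Remark: Again there are more confidence intervals besides the ones above. For instance, we consider two independent samples: $X$ from $Bin(N_1, p_1)$ and $Y$ from $Bin(N_2, p_2)$, with unknown $p_1$ and $p_2$. As we know,
\[ \hat{P}_1 \approx N\Big(p_1, \sqrt{\frac{p_1(1 - p_1)}{N_1}}\Big) \quad \text{and} \quad \hat{P}_2 \approx N\Big(p_2, \sqrt{\frac{p_2(1 - p_2)}{N_2}}\Big), \]
so
\[ \hat{P}_1 - \hat{P}_2 \approx N\Big(p_1 - p_2, \sqrt{\frac{p_1(1 - p_1)}{N_1} + \frac{p_2(1 - p_2)}{N_2}}\Big). \]
Therefore, the fact is
\[ \frac{(\hat{P}_1 - \hat{P}_2) - (p_1 - p_2)}{\sqrt{\frac{\hat{P}_1(1 - \hat{P}_1)}{N_1} + \frac{\hat{P}_2(1 - \hat{P}_2)}{N_2}}} \approx N(0, 1), \]
and
\[ I_{p_1 - p_2} = (\hat{p}_1 - \hat{p}_2) \pm \lambda_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{N_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{N_2}}. \]

3.2 Confidence intervals from the ratio of two population variances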

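Suppose there are two independent samples $\{X_1, \ldots, X_{n_1}\}$ from $N(\mu_1, \sigma_1)$, and $\{Y_1, \ldots, Y_{n_2}\}$ from $N(\mu_2, \sigma_2)$. Then
\[ \frac{(n_1 - 1)S_1^2}{\sigma_1^2} \sim \chi^2(n_1 - 1) \quad \text{and} \quad \frac{(n_2 - 1)S_2^2}{\sigma_2^2} \sim \chi^2(n_2 - 1), \]
therefore the fact is
\[ \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F(n_1 - 1, n_2 - 1). \]
Thus
\[ I_{\sigma_2^2/\sigma_1^2} = \Big(\frac{s_2^2}{s_1^2} F_{1-\alpha/2}(n_1 - 1, n_2 - 1), \frac{s_2^2}{s_1^2} F_{\alpha/2}(n_1 - 1, n_2 - 1)\Big). \]

3.3 Large sample size ($n \ge 30$, population may be completely unknown)

If there is no information about the population(s), then we can apply the Central Limit Theorem (usually with a large sample, $n \ge 30$) to get approximate normal distributions. Here are two examples:

Example 1: Let $\{X_1, \ldots, X_n\}$, $n \ge 30$, be a random sample from a population. Then (no matter what distribution the population has)
\[ \frac{\bar{X} - \mu}{S/\sqrt{n}} \approx N(0, 1). \]

Example 2: Let $\{X_1, \ldots, X_{n_1}\}$, $n_1 \ge 30$, be a random sample from a population, and $\{Y_1, \ldots, Y_{n_2}\}$, $n_2 \ge 30$, be a random sample from another population which is independent of the first population. Then (no matter what distributions the populations have)
\[ \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{S_1^2/n_1 + S_2^2/n_2}} \approx N(0, 1). \]

4 Hypothesis testing

4.1 One sample and the general theory of hypothesis testing

Suppose there is a random sample $\{X_1, \ldots, X_n\}$ from a population $X$ with an unknown parameter $\theta$,
\[ H_0: \theta = \theta_0 \quad \text{vs} \quad H_1: \theta < \theta_0, \text{ or } \theta > \theta_0, \text{ or } \theta \ne \theta_0. \]

                    H0 is true                                    H0 is false and θ = θ1
reject H0           (type I error or significance level) α        (power) h(θ1)
don't reject H0     1 − α                                         (type II error) β(θ1) = 1 − h(θ1)

For notational simplicity, we employ TS := test statistic and C := critical region. Reject $H_0$ if $TS \in C$. Regarding the p-value: reject $H_0$ if and only if p-value $< \alpha$.

4.2 Hypothesis testing for population mean(s)

One sample: $\{X_1, \ldots, X_n\}$ from $N(\mu, \sigma)$. Null hypothesis $H_0: \mu = \mu_0$.

$\sigma$ is known: $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$.
$H_1: \mu < \mu_0$: $TS = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$, $C = (-\infty, -\lambda_\alpha)$, p-value $= P(N(0, 1) \le TS)$;
$H_1: \mu > \mu_0$: $TS = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$, $C = (\lambda_\alpha, +\infty)$, p-value $= P(N(0, 1) \ge TS)$;
$H_1: \mu \ne \mu_0$: $TS = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$, $C = (-\infty, -\lambda_{\alpha/2}) \cup (\lambda_{\alpha/2}, +\infty)$, p-value $= 2P(N(0, 1) \ge |TS|)$.

$\sigma$ is unknown: $\frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t(n - 1)$.
$H_1: \mu < \mu_0$: $TS = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$, $C = (-\infty, -t_\alpha(n - 1))$, p-value $= P(t(n - 1) \le TS)$;
$H_1: \mu > \mu_0$: $TS = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$, $C = (t_\alpha(n - 1), +\infty)$, p-value $= P(t(n - 1) \ge TS)$;
$H_1: \mu \ne \mu_0$: $TS = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$, $C = (-\infty, -t_{\alpha/2}(n - 1)) \cup (t_{\alpha/2}(n - 1), +\infty)$, p-value $= 2P(t(n - 1) \ge |TS|)$.

Two samples: $\{X_1, \ldots, X_{n_1}\}$ from $N(\mu_1, \sigma_1)$; $\{Y_1, \ldots, Y_{n_2}\}$ from $N(\mu_2, \sigma_2)$. Null hypothesis $H_0: \mu_1 = \mu_2$.

$\sigma_1, \sigma_2$ are known: $\frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} \sim N(0, 1)$.
$H_1: \mu_1 < \mu_2$: $TS = \frac{\bar{x} - \bar{y}}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$, $C = (-\infty, -\lambda_\alpha)$, p-value $= P(N(0, 1) \le TS)$;
$H_1: \mu_1 > \mu_2$: $TS = \frac{\bar{x} - \bar{y}}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$, $C = (\lambda_\alpha, +\infty)$, p-value $= P(N(0, 1) \ge TS)$;
$H_1: \mu_1 \ne \mu_2$: $TS = \frac{\bar{x} - \bar{y}}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$, $C = (-\infty, -\lambda_{\alpha/2}) \cup (\lambda_{\alpha/2}, +\infty)$, p-value $= 2P(N(0, 1) \ge |TS|)$.

$\sigma_1 = \sigma_2 = \sigma$ is unknown: $\frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{S\sqrt{1/n_1 + 1/n_2}} \sim t(n_1 + n_2 - 2)$.
$H_1: \mu_1 < \mu_2$: $TS = \frac{\bar{x} - \bar{y}}{s\sqrt{1/n_1 + 1/n_2}}$, $C = (-\infty, -t_\alpha(n_1 + n_2 - 2))$, p-value $= P(t(n_1 + n_2 - 2) \le TS)$;
$H_1: \mu_1 > \mu_2$: $TS = \frac{\bar{x} - \bar{y}}{s\sqrt{1/n_1 + 1/n_2}}$, $C = (t_\alpha(n_1 + n_2 - 2), +\infty)$, p-value $= P(t(n_1 + n_2 - 2) \ge TS)$;
$H_1: \mu_1 \ne \mu_2$: $TS = \frac{\bar{x} - \bar{y}}{s\sqrt{1/n_1 + 1/n_2}}$, $C = (-\infty, -t_{\alpha/2}(n_1 + n_2 - 2)) \cup (t_{\alpha/2}(n_1 + n_2 - 2), +\infty)$, p-value $= 2P(t(n_1 + n_2 - 2) \ge |TS|)$.

$\sigma_1 \ne \sigma_2$ both unknown: similarly as in the tree of confidence intervals.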
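The one-sample test with known $\sigma$ and alternative $H_1: \mu > \mu_0$ can be sketched in Python as follows. The function name and the numbers in the usage lines are hypothetical, chosen only for illustration; `statistics.NormalDist` from the standard library supplies the quantile $\lambda_\alpha$ and the p-value.

```python
from math import sqrt
from statistics import NormalDist

def z_test_greater(x_bar, mu0, sigma, n, alpha):
    """Test H0: mu = mu0 against H1: mu > mu0 when sigma is known."""
    ts = (x_bar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()                  # N(0, 1)
    lam_alpha = std_normal.inv_cdf(1 - alpha)  # upper alpha-quantile lambda_alpha
    p_value = 1 - std_normal.cdf(ts)           # P(N(0,1) >= TS)
    return ts, lam_alpha, p_value

# Hypothetical numbers: sample mean 10.5 from n = 64 observations, sigma = 2, mu0 = 10.
ts, lam_alpha, p_value = z_test_greater(x_bar=10.5, mu0=10.0, sigma=2.0, n=64, alpha=0.05)

# The critical-region rule and the p-value rule give the same decision.
reject_by_region = ts > lam_alpha
reject_by_p_value = p_value < 0.05
print(ts, lam_alpha, p_value, reject_by_region, reject_by_p_value)
```

The other two alternatives follow the same pattern, with the critical region and the tail used for the p-value changed as listed above.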

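4.3 Hypothesis testing for population variance(s)

One sample: $\{X_1, \ldots, X_n\}$ from $N(\mu, \sigma)$; fact: $\frac{(n - 1)S^2}{\sigma^2} \sim \chi^2(n - 1)$. Null hypothesis $H_0: \sigma^2 = \sigma_0^2$.
$H_1: \sigma^2 < \sigma_0^2$: $TS = \frac{(n - 1)s^2}{\sigma_0^2}$, $C = (0, \chi^2_{1-\alpha}(n - 1))$, p-value $= P(\chi^2(n - 1) \le TS)$;
$H_1: \sigma^2 > \sigma_0^2$: $TS = \frac{(n - 1)s^2}{\sigma_0^2}$, $C = (\chi^2_\alpha(n - 1), +\infty)$, p-value $= P(\chi^2(n - 1) \ge TS)$;
$H_1: \sigma^2 \ne \sigma_0^2$: $TS = \frac{(n - 1)s^2}{\sigma_0^2}$, $C = (0, \chi^2_{1-\alpha/2}(n - 1)) \cup (\chi^2_{\alpha/2}(n - 1), +\infty)$, p-value $= 2P(\chi^2(n - 1) \le TS)$ or $2P(\chi^2(n - 1) \ge TS)$.

Two samples: $\{X_1, \ldots, X_{n_1}\}$ from $N(\mu_1, \sigma_1)$; $\{Y_1, \ldots, Y_{n_2}\}$ from $N(\mu_2, \sigma_2)$; fact: $\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F(n_1 - 1, n_2 - 1)$. Null hypothesis $H_0: \sigma_1^2 = \sigma_2^2$.
$H_1: \sigma_1^2 < \sigma_2^2$: $TS = s_1^2/s_2^2$, $C = (0, F_{1-\alpha}(n_1 - 1, n_2 - 1))$, p-value $= P(F(n_1 - 1, n_2 - 1) \le TS)$;
$H_1: \sigma_1^2 > \sigma_2^2$: $TS = s_1^2/s_2^2$, $C = (F_\alpha(n_1 - 1, n_2 - 1), +\infty)$, p-value $= P(F(n_1 - 1, n_2 - 1) \ge TS)$;
$H_1: \sigma_1^2 \ne \sigma_2^2$: $TS = s_1^2/s_2^2$, $C = (0, F_{1-\alpha/2}(n_1 - 1, n_2 - 1)) \cup (F_{\alpha/2}(n_1 - 1, n_2 - 1), +\infty)$, p-value $= 2P(F(n_1 - 1, n_2 - 1) \le TS)$ or $2P(F(n_1 - 1, n_2 - 1) \ge TS)$.

4.4 Large sample size ($n \ge 30$, population may be completely unknown)

If there is no information about the population(s), then we can apply the Central Limit Theorem (usually with a large sample, $n \ge 30$). The idea is exactly the same as the one used for confidence intervals. One example: a sample $\{X_1, \ldots, X_n\}$, $n \ge 30$, from some population (which is unknown) with a mean $\mu$ and standard deviation $\sigma$. Null hypothesis $H_0: \mu = \mu_0$. Then it follows from the CLT that $\frac{\bar{X} - \mu}{S/\sqrt{n}} \approx N(0, 1)$, therefore
$H_1: \mu < \mu_0$: $TS = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$, $C = (-\infty, -\lambda_\alpha)$, p-value $= P(N(0, 1) \le TS)$;
$H_1: \mu > \mu_0$: $TS = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$, $C = (\lambda_\alpha, +\infty)$, p-value $= P(N(0, 1) \ge TS)$;
$H_1: \mu \ne \mu_0$: $TS = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$, $C = (-\infty, -\lambda_{\alpha/2}) \cup (\lambda_{\alpha/2}, +\infty)$, p-value $= 2P(N(0, 1) \ge |TS|)$.

5 Multi-dimension random variables (or random vectors)

Covariance (Kovarians) of $(X, Y)$: $\sigma_{X,Y} = cov(X, Y) = E((X - \mu_X)(Y - \mu_Y))$, and $cov(X, X) = V(X)$.

Correlation coefficient (Korrelation) of $(X, Y)$: $\rho_{X,Y} = \frac{cov(X, Y)}{\sqrt{V(X)}\sqrt{V(Y)}} = \frac{\sigma_{X,Y}}{\sigma_X \sigma_Y}$.

A rule: for real constants $a$, $a_i$, $b$ and $b_j$,
\[ cov\Big(a + \sum_{i=1}^{m} a_i X_i, \; b + \sum_{j=1}^{n} b_j Y_j\Big) = \sum_{i=1}^{m}\sum_{j=1}^{n} a_i b_j\, cov(X_i, Y_j). \]

$X$ and $Y$ are uncorrelated if $cov(X, Y) = 0$.

An important theorem: Suppose that a random vector $X$ has a mean $\mu_X$ and a covariance matrix $C_X$. Define a new random vector $Y = AX + b$, for some matrix $A$ and vector $b$. Then
\[ \mu_Y = A\mu_X + b, \quad C_Y = A C_X A^T. \]

Standard normal vectors: $\{X_i\}$ are independent and $X_i \sim N(0, 1)$, $X = (X_1, X_2, \ldots, X_n)^T$, thus
\[ \mu_X = (0, 0, \ldots, 0)^T, \quad C_X = I_n, \quad \text{density } f_X(x) = \frac{1}{(2\pi)^{n/2}} e^{-\frac{1}{2}x^T x}. \]

General normal vectors: $Y = AX + b$, where $X$ is a standard normal vector, and
\[ \mu_Y = b, \quad C_Y = AA^T, \quad \text{density } f_Y(y) = \frac{1}{(2\pi)^{n/2}\sqrt{\det(C_Y)}} e^{-\frac{1}{2}(y - \mu_Y)^T C_Y^{-1}(y - \mu_Y)}. \]

6 (Simple and multiple) Linear regressions

Simple linear regression: $Y_j = \beta_0 + \beta_1 x_j + \varepsilon_j$, $\varepsilon_j \sim N(0, \sigma)$, $j = 1, \ldots, n$.

Multiple linear regression: $Y_j = \beta_0 + \beta_1 x_{j1} + \beta_2 x_{j2} + \cdots + \beta_k x_{jk} + \varepsilon_j$, $\varepsilon_j \sim N(0, \sigma)$, $j = 1, \ldots, n$.

Both simple linear regression and multiple linear regression can be written in vector form $Y = X\beta + \varepsilon$:
\[ Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1k} \\ 1 & x_{21} & \cdots & x_{2k} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{nk} \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{pmatrix}, \quad \varepsilon \sim N(0, \sigma^2 I_n). \]
$Y \sim N(\mu_Y, C_Y)$, where $\mu_Y = X\beta$ and $C_Y = \sigma^2 I_n$.

Estimate of the coefficient $\beta$: $\hat{\beta} = (X^T X)^{-1} X^T y$.
Estimator of the coefficient $\beta$: $\hat{B} = (X^T X)^{-1} X^T Y \sim N(\beta, \sigma^2 (X^T X)^{-1})$.
Estimated line: $\hat{\mu}_j = \hat{\beta}_0 + \hat{\beta}_1 x_{j1} + \hat{\beta}_2 x_{j2} + \cdots + \hat{\beta}_k x_{jk}$.

Analysis of variance:
\[ SS_{TOT} = \sum_{j=1}^{n}(y_j - \bar{y})^2, \quad SS_R = \sum_{j=1}^{n}(\hat{\mu}_j - \bar{y})^2, \quad SS_E = \sum_{j=1}^{n}(y_j - \hat{\mu}_j)^2, \]
\[ \frac{SS_{TOT}}{\sigma^2} = \frac{\sum_{j=1}^{n}(Y_j - \bar{Y})^2}{\sigma^2} \sim \chi^2(n - 1), \quad \text{if } \beta_1 = \cdots = \beta_k = 0; \]
\[ \frac{SS_R}{\sigma^2} = \frac{\sum_{j=1}^{n}(\hat{\mu}_j - \bar{Y})^2}{\sigma^2} \sim \chi^2(k), \quad \text{if } \beta_1 = \cdots = \beta_k = 0; \]
\[ \frac{SS_E}{\sigma^2} = \frac{\sum_{j=1}^{n}(Y_j - \hat{\mu}_j)^2}{\sigma^2} \sim \chi^2(n - k - 1). \]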

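$SS_{TOT} = SS_R + SS_E$, and $R^2 = \frac{SS_R}{SS_{TOT}}$.

$\sigma^2$ is estimated as $\hat{\sigma}^2 = s^2 = \frac{SS_E}{n - k - 1}$.

For the hypothesis testing $H_0: \beta_1 = \cdots = \beta_k = 0$ vs $H_1$: at least one $\beta_j \ne 0$, the fact is
\[ \frac{SS_R/k}{SS_E/(n - k - 1)} \sim F(k, n - k - 1), \]
so
\[ TS = \frac{SS_R/k}{SS_E/(n - k - 1)}, \quad C = (F_\alpha(k, n - k - 1), +\infty). \]

We know $\hat{B} = (X^T X)^{-1} X^T Y \sim N(\beta, \sigma^2 (X^T X)^{-1})$, thus if we denote
\[ (X^T X)^{-1} = \begin{pmatrix} h_{00} & h_{01} & \cdots & h_{0k} \\ h_{10} & h_{11} & \cdots & h_{1k} \\ \vdots & \vdots & & \vdots \\ h_{k0} & h_{k1} & \cdots & h_{kk} \end{pmatrix}, \]
then $\hat{B}_j \sim N(\beta_j, \sigma\sqrt{h_{jj}})$ and $\frac{\hat{B}_j - \beta_j}{\sigma\sqrt{h_{jj}}} \sim N(0, 1)$. But $\sigma$ is generally unknown, therefore
\[ \frac{\hat{B}_j - \beta_j}{S\sqrt{h_{jj}}} \sim t(n - k - 1), \]
where $s\sqrt{h_{jj}}$ is sometimes denoted as $d(\hat{\beta}_j)$ or $se(\hat{\beta}_j)$.

Confidence interval of $\beta_j$: $I_{\beta_j} = \hat{\beta}_j \pm t_{\alpha/2}(n - k - 1)\, s\sqrt{h_{jj}}$.
Hypothesis testing $H_0: \beta_j = 0$ vs $H_1: \beta_j \ne 0$ has
\[ TS = \frac{\hat{\beta}_j}{s\sqrt{h_{jj}}}, \quad C = (-\infty, -t_{\alpha/2}(n - k - 1)) \cup (t_{\alpha/2}(n - k - 1), +\infty). \]

Rewrite simple and multiple linear regressions as follows:
$Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \varepsilon$, $\varepsilon \sim N(0, \sigma)$ (the model);
$\mu = E(Y) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$ (the mean);
$\hat{\mu} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \cdots + \hat{\beta}_k x_k$ (the estimated line).
For a given/fixed $x = (1, x_1, \ldots, x_k)^T$, the scalar $\hat{\mu}$ is an estimate of the unknown $\mu$ (and of $Y$). Then we can talk about the accuracy of this estimate in terms of confidence intervals (and prediction intervals).

Confidence interval of $\mu$: $I_\mu = \hat{\mu} \pm t_{\alpha/2}(n - k - 1)\, s\sqrt{x^T (X^T X)^{-1} x}$.
Prediction interval of $Y$: $I_Y = \hat{\mu} \pm t_{\alpha/2}(n - k - 1)\, s\sqrt{1 + x^T (X^T X)^{-1} x}$.

Suppose we have two models:
Model 1: $Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \varepsilon$;
Model 2: $Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \beta_{k+1} x_{k+1} + \cdots + \beta_{k+p} x_{k+p} + \varepsilon$,
and we want to test $H_0: \beta_{k+1} = \cdots = \beta_{k+p} = 0$ vs $H_1$: at least one $\beta_{k+i} \ne 0$. The fact is
\[ \frac{(SS_E^{(1)} - SS_E^{(2)})/p}{SS_E^{(2)}/(n - k - p - 1)} \sim F(p, n - k - p - 1), \]
so
\[ TS = \frac{(SS_E^{(1)} - SS_E^{(2)})/p}{SS_E^{(2)}/(n - k - p - 1)}, \quad C = (F_\alpha(p, n - k - p - 1), +\infty). \]

Variable selection. If we have a response variable $y$ with possibly many predictors $x_1, \ldots, x_k$, then how do we choose appropriate $x$'s (some $x$'s are useful to $Y$, and some are not)?
Step 1: compute $corr(x_1, \ldots, x_k, y)$, choose a maximal correlation (say $x_i$), fit $Y = \beta_0 + \beta_i x_i + \varepsilon$, and test if $\beta_i = 0$.
Step 2: do the regression $Y = \beta_0 + \beta_i x_i + \beta_\ell x_\ell + \varepsilon$ for $\ell = 1, \ldots, i - 1, i + 1, \ldots, k$, choose a minimal $SS_E$ (say $x_j$), fit $Y = \beta_0 + \beta_i x_i + \beta_j x_j + \varepsilon$, and test if $\beta_j = 0$.
Step 3: repeat Step 2 until the last test for $\beta = 0$ is not rejected.

7 Basic $\chi^2$-test

Suppose we want to test
$H_0$: $X \sim$ distribution (with or without unknown parameters);
$H_1$: $X \nsim$ distribution (with or without unknown parameters).
The fact is
\[ \sum_{i=1}^{k} \frac{(N_i - n p_i)^2}{n p_i} \approx \chi^2(k - 1 - \#\text{ of unknown parameters}). \]
Then
\[ TS = \sum_{i=1}^{k} \frac{(N_i - n p_i)^2}{n p_i}, \quad C = (\chi^2_\alpha(k - 1 - \#\text{ of unknown parameters}), +\infty). \]

Homogeneity test. Suppose we have data with $r$ rows and $k$ columns,
$H_0$: different rows have the same pattern (in terms of columns);
$H_1$: different rows have different patterns (in terms of columns).
Equivalently,
$H_0$: rows and columns are independent;
$H_1$: rows and columns are not independent.
The fact is
\[ \sum_{j=1}^{k}\sum_{i=1}^{r} \frac{(N_{ij} - n p_{ij})^2}{n p_{ij}} \approx \chi^2((r - 1)(k - 1)), \]
so
\[ TS = \sum_{j=1}^{k}\sum_{i=1}^{r} \frac{(N_{ij} - n p_{ij})^2}{n p_{ij}}, \quad C = (\chi^2_\alpha((r - 1)(k - 1)), +\infty), \]
where $p_{ij} = p_i q_j$ are the theoretical probabilities.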
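The homogeneity (independence) statistic can be sketched in Python as below. This is a minimal illustration under two assumptions that are not stated in the text: the probabilities $p_i$ and $q_j$ are estimated by the row and column proportions, and the 2×2 table is hypothetical data chosen so the arithmetic is easy to follow. The function name is illustrative.

```python
def homogeneity_statistic(table):
    """Return (TS, degrees of freedom) for an r x k table of counts N_ij."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    ts = 0.0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            # n * p_i * q_j with p_i, q_j estimated by marginal proportions (assumption).
            expected = row_totals[i] * col_totals[j] / n
            ts += (n_ij - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return ts, df

# Hypothetical 2 x 2 table of counts.
table = [[20, 30],
         [30, 20]]
ts, df = homogeneity_statistic(table)
print(ts, df)
```

For this table every expected count is 25, so each cell contributes 1 to the statistic. The resulting value is then compared with the table value $\chi^2_\alpha((r-1)(k-1))$ for the chosen significance level.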