TAMS24: Notations and Formulas
by Xiangfeng Yang

1. Basic notations and definitions

X: random variable (stokastisk variabel).

Mean (Väntevärde): μ = E[X] = Σ_k k·p_X(k) if X is discrete, ∫ x·f_X(x) dx if X is continuous.

Variance (Varians): σ² = V(X) = E[(X − μ)²] = E[X²] − (E[X])².

Standard deviation (Standardavvikelse): σ = D(X) = √V(X).

Population X. Random sample (slumpmässigt stickprov): X₁, …, X_n are independent and have the same distribution as the population X. Before we observe/measure, X₁, …, X_n are random variables; after we observe/measure, we use x₁, …, x_n, which are numbers (not random variables).

Sample mean (Stickprovsmedelvärde): before observe/measure, X̄ = (1/n) Σ X_i; after observe/measure, x̄ = (1/n) Σ x_i.

Sample variance (Stickprovsvarians): before observe/measure, S² = (1/(n−1)) Σ (X_i − X̄)²; after observe/measure, s² = (1/(n−1)) Σ (x_i − x̄)².

Sample standard deviation (Stickprovsstandardavvikelse): before observe/measure, S = √S²; after observe/measure, s = √s².

E[Σ c_i X_i] = Σ c_i E[X_i]; V(Σ c_i X_i) = Σ c_i² V(X_i), if X₁, …, X_n are independent (oberoende).

If X ~ N(μ, σ), then (X − μ)/σ ~ N(0, 1). If X₁, …, X_n are independent and X_i ~ N(μ_i, σ_i), then d + Σ c_i X_i ~ N(d + Σ c_i μ_i, √(Σ c_i² σ_i²)).

2. Point estimation

For a population X with an unknown parameter θ, and a random sample {X₁, …, X_n}:
Estimator (Stickprovsvariabel): Θ̂ = g(X₁, …, X_n), a random variable. Estimate (Punktskattning): θ̂ = g(x₁, …, x_n), a number.

Unbiased (Väntevärdesriktig): E[Θ̂] = θ.

Effective (Effektiv): if two estimators Θ̂₁ and Θ̂₂ are unbiased, we say that Θ̂₁ is more effective than Θ̂₂ if V(Θ̂₁) < V(Θ̂₂).

Binomial distribution X ~ Bin(N, p): there are N independent and identical trials, each trial has a probability of success p, and X = the number of successes in these N trials. The random variable X ~ Bin(N, p) has probability function (sannolikhetsfunktion) p(k) = P(X = k) = C(N, k) p^k (1 − p)^{N−k}.

Exponential distribution X ~ Exp(1/μ): used when we consider waiting times/lifetimes. The random variable X ~ Exp(1/μ) has density function (täthetsfunktion) f(x) = (1/μ) e^{−x/μ}, x ≥ 0.

Method of moments (Momentmetoden): the number of equations depends on the number of unknown parameters: E[X] = x̄, E[X²] = (1/n) Σ x_i², E[X³] = (1/n) Σ x_i³, …

Consistent (Konsistent): an estimator Θ̂ = g(X₁, …, X_n) is consistent if lim_{n→∞} P(|Θ̂ − θ| > ε) = 0 for any constant ε > 0. This is called convergence in probability. Theorem: if E[Θ̂] = θ and lim_{n→∞} V(Θ̂)
= 0, then Θ̂ is consistent.

Least squares method (minsta-kvadrat-metoden): the least squares estimate θ̂ is the one minimizing Q(θ) = Σ (x_i − E[X_i])².

Maximum likelihood method (Maximum-likelihood-metoden): the maximum likelihood estimate θ̂ is the one maximizing the likelihood function L(θ) = Π f(x_i; θ) if X is continuous, Π p(x_i; θ) if X is discrete.

Remark on ML: in general, it is easier/better to maximize ln L(θ).

Remark on ML: if there are several random samples (say m) from different populations with the same unknown parameter θ, then the maximum likelihood estimate θ̂ is the one maximizing the likelihood function defined as L(θ) = L₁(θ) ⋯ L_m(θ), where L_i(θ) is the likelihood function from the i-th population.
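As a quick numerical illustration of the ML method for the exponential distribution above: maximizing ln L(μ) = Σ(−ln μ − x_i/μ) gives the closed-form estimate μ̂ = x̄. A minimal sketch in Python (the data values are made up for illustration):

```python
import math

def log_likelihood(mu, data):
    # ln L(mu) for Exp(1/mu): sum of ln((1/mu) * exp(-x/mu))
    return sum(-math.log(mu) - x / mu for x in data)

data = [0.8, 2.4, 1.1, 3.7, 0.5, 1.9]   # made-up waiting times
mu_hat = sum(data) / len(data)           # closed-form ML estimate: the sample mean

# the log-likelihood is indeed largest at mu_hat
for mu in (mu_hat * 0.8, mu_hat * 1.2):
    assert log_likelihood(mu, data) < log_likelihood(mu_hat, data)
print(mu_hat)
```

The same check works for any model with a closed-form MLE: differentiate ln L, solve, then verify numerically that nearby parameter values give a smaller log-likelihood.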
Estimates of the population variance σ²: if there is only one population with an unknown mean, the method of moments and the maximum likelihood method in general give an estimate of σ² as (σ²)* = (1/n) Σ (x_i − x̄)² (NOT unbiased). An adjusted (corrected) estimate is the sample variance s² = (1/(n−1)) Σ (x_i − x̄)² (unbiased). If there are m different populations with unknown means and the same variance σ², then an adjusted (corrected) ML estimate is

s² = ((n₁ − 1)s₁² + ⋯ + (n_m − 1)s_m²) / ((n₁ − 1) + ⋯ + (n_m − 1))   (unbiased),

where n_i is the sample size of the i-th population, and s_i² is the sample variance of the i-th population.

Standard error (medelfelet) of an estimator Θ̂: an estimate d(θ̂) of the standard deviation D(Θ̂).

3. Interval estimation

One sample {x₁, …, x_n} from N(μ, σ):
• I_μ = x̄ ± λ_{α/2} · σ/√n, if σ is known. Fact: (X̄ − μ)/(σ/√n) ~ N(0, 1).
• I_μ = x̄ ± t_{α/2}(n−1) · s/√n, if σ is unknown. Fact: (X̄ − μ)/(S/√n) ~ t(n−1).
• I_{σ²} = ( (n−1)s²/χ²_{α/2}(n−1), (n−1)s²/χ²_{1−α/2}(n−1) ). Fact: (n−1)S²/σ² ~ χ²(n−1). The unknown σ² is estimated by the sample variance s² = (1/(n−1)) Σ (x_i − x̄)².

Two samples {x₁, …, x_{n₁}} from N(μ₁, σ₁) and {y₁, …, y_{n₂}} from N(μ₂, σ₂), where N(μ₁, σ₁) and N(μ₂, σ₂) are independent:
• I_{μ₁−μ₂} = x̄ − ȳ ± λ_{α/2} √(σ₁²/n₁ + σ₂²/n₂), if σ₁ and σ₂ are known. Fact: (X̄ − Ȳ − (μ₁ − μ₂))/√(σ₁²/n₁ + σ₂²/n₂) ~ N(0, 1).
• I_{μ₁−μ₂} = x̄ − ȳ ± t_{α/2}(n₁ + n₂ − 2) · s √(1/n₁ + 1/n₂), if σ₁ = σ₂ = σ is unknown. Fact: (X̄ − Ȳ − (μ₁ − μ₂))/(S √(1/n₁ + 1/n₂)) ~ t(n₁ + n₂ − 2). The unknown σ² is estimated by the pooled sample variance s² = ((n₁ − 1)s₁² + (n₂ − 1)s₂²)/(n₁ + n₂ − 2).
• I_{μ₁−μ₂} = x̄ − ȳ ± t_{α/2}(f) √(s₁²/n₁ + s₂²/n₂), if both σ₁, σ₂ are unknown (not assumed equal). Fact: (X̄ − Ȳ − (μ₁ − μ₂))/√(S₁²/n₁ + S₂²/n₂) ≈ t(f), with degrees of freedom f = (s₁²/n₁ + s₂²/n₂)² / ( (s₁²/n₁)²/(n₁ − 1) + (s₂²/n₂)²/(n₂ − 1) ).
• I_{σ²} = ( (n₁ + n₂ − 2)s²/χ²_{α/2}(n₁ + n₂ − 2), (n₁ + n₂ − 2)s²/χ²_{1−α/2}(n₁ + n₂ − 2) ), if σ₁ = σ₂ = σ. Fact: (n₁ + n₂ − 2)S²/σ² ~ χ²(n₁ + n₂ − 2).

Remark: the idea of using a "fact" to find confidence intervals is very important. There are many more confidence intervals besides the ones above. For instance, consider two independent samples {x₁, …, x_{n₁}} from N(μ₁, σ) and {y₁, …, y_{n₂}} from N(μ₂, σ). In this case we can easily prove that c₁X̄ + c₂Ȳ ~ N(c₁μ₁ + c₂μ₂, σ√(c₁²/n₁ + c₂²/n₂)). If σ is known, the fact is (c₁X̄ + c₂Ȳ − (c₁μ₁ + c₂μ₂))/(σ√(c₁²/n₁ + c₂²/n₂)) ~ N(0, 1). If σ is unknown, the fact is (c₁X̄ + c₂Ȳ − (c₁μ₁ + c₂μ₂))/(S√(c₁²/n₁ + c₂²/n₂)) ~ t(n₁ + n₂ − 2). So we can find I_{c₁μ₁+c₂μ₂}.

3.1 Confidence intervals from normal approximations

• X ~ Bin(N, p): I_p = p̂ ± λ_{α/2} √(p̂(1 − p̂)/N). Fact: (P̂ − p)/√(P̂(1 − P̂)/N) ≈ N(0, 1). We require N p̂ > 10 and N p̂(1 − p̂) > 10.
• X ~ Hyp(N, n, p): I_p = p̂ ± λ_{α/2} √( (N − n)/(N − 1) · p̂(1 − p̂)/n ). Fact: (P̂ − p)/√( (N − n)/(N − 1) · P̂(1 − P̂)/n ) ≈ N(0, 1).
• X ~ Po(μ): I_μ = x̄ ± λ_{α/2} √(x̄/n). Fact: (X̄ − μ)/√(X̄/n) ≈ N(0, 1). We require n x̄ > 15.
• X ~ Exp(1/μ): I_μ = ( x̄/(1 + λ_{α/2}/√n), x̄/(1 − λ_{α/2}/√n) ). Fact: (X̄ − μ)/(μ/√n) ≈ N(0, 1). Alternatively, I_μ = x̄ ± λ_{α/2} x̄/√n. Fact: (X̄ − μ)/(X̄/√n) ≈ N(0, 1). We require n ≥ 30.

Remark: again there are more confidence intervals besides the ones
above. For instance, consider two independent samples X ~ Bin(n₁, p₁) and Y ~ Bin(n₂, p₂), with unknown p₁ and p₂. As we know, P̂₁ ≈ N(p₁, √(p₁(1 − p₁)/n₁)) and P̂₂ ≈ N(p₂, √(p₂(1 − p₂)/n₂)), so

P̂₁ − P̂₂ ≈ N( p₁ − p₂, √(p₁(1 − p₁)/n₁ + p₂(1 − p₂)/n₂) ).

Therefore, the fact is

(P̂₁ − P̂₂ − (p₁ − p₂)) / √( P̂₁(1 − P̂₁)/n₁ + P̂₂(1 − P̂₂)/n₂ ) ≈ N(0, 1),

I_{p₁−p₂} = p̂₁ − p̂₂ ± λ_{α/2} √( p̂₁(1 − p̂₁)/n₁ + p̂₂(1 − p̂₂)/n₂ ).

m samples: the unknown σ₁ = ⋯ = σ_m = σ can be estimated by s² = ((n₁ − 1)s₁² + ⋯ + (n_m − 1)s_m²)/((n₁ − 1) + ⋯ + (n_m − 1)).

3.2 Confidence intervals from the ratio of two population variances
Suppose there are two independent samples {x₁, …, x_{n₁}} from N(μ₁, σ₁) and {y₁, …, y_{n₂}} from N(μ₂, σ₂). Then (n₁ − 1)S₁²/σ₁² ~ χ²(n₁ − 1) and (n₂ − 1)S₂²/σ₂² ~ χ²(n₂ − 1), therefore (S₁²/σ₁²)/(S₂²/σ₂²) ~ F(n₁ − 1, n₂ − 1). Thus

I_{σ₁²/σ₂²} = ( (s₁²/s₂²) · 1/F_{α/2}(n₁ − 1, n₂ − 1), (s₁²/s₂²) · F_{α/2}(n₂ − 1, n₁ − 1) ).
Fact: (S₁²/σ₁²)/(S₂²/σ₂²) ~ F(n₁ − 1, n₂ − 1).

3.3 Large sample size (n ≥ 30), population may be completely unknown

If there is no information about the populations, then we can apply the Central Limit Theorem (usually with a large sample, n ≥ 30) to get approximate normal distributions. Here are two examples:

Example 1: let {X₁, …, X_n}, n ≥ 30, be a random sample from a population; then, no matter what distribution the population has, (X̄ − μ)/(S/√n) ≈ N(0, 1).

Example 2: let {X₁, …, X_{n₁}}, n₁ ≥ 30, be a random sample from a population, and {Y₁, …, Y_{n₂}}, n₂ ≥ 30, a random sample from another population which is independent from the first; then, no matter what distributions the populations have, (X̄ − Ȳ − (μ₁ − μ₂))/√(S₁²/n₁ + S₂²/n₂) ≈ N(0, 1).

4. Hypothesis testing

4.1 One sample and the general theory of hypothesis testing

Suppose there is a random sample {x₁, …, x_n} from a population X with an unknown parameter θ, and H₀: θ = θ₀ vs H₁: θ < θ₀, or θ > θ₀, or θ ≠ θ₀.

                      H₀ is true                         H₀ is false and θ = θ₁
reject H₀             type I error, significance level α   power h(θ₁)
don't reject H₀       1 − α                              type II error β(θ₁) = 1 − h(θ₁)

Regarding the p-value: we reject H₀ if and only if p-value < α. For notational simplicity, we employ TS := test statistic and C := critical region; then: reject H₀ if TS ∈ C, equivalently, reject H₀ if and only if p-value < α.

4.2 Hypothesis testing for population means

One sample {x₁, …, x_n} from N(μ, σ). Null hypothesis H₀: μ = μ₀.

σ is known: TS = (x̄ − μ₀)/(σ/√n), with (X̄ − μ₀)/(σ/√n) ~ N(0, 1) under H₀:
• H₁: μ < μ₀: C = (−∞, −λ_α), p-value = P(N(0, 1) ≤ TS)
• H₁: μ > μ₀: C = (λ_α, +∞), p-value = P(N(0, 1) ≥ TS)
• H₁: μ ≠ μ₀: C = (−∞, −λ_{α/2}) ∪ (λ_{α/2}, +∞), p-value = 2 P(N(0, 1) ≥ |TS|)

σ is unknown: TS = (x̄ − μ₀)/(s/√n), with (X̄ − μ₀)/(S/√n) ~ t(n−1) under H₀:
• H₁: μ < μ₀: C = (−∞, −t_α(n−1)), p-value = P(t(n−1) ≤ TS)
• H₁: μ > μ₀: C = (t_α(n−1), +∞), p-value = P(t(n−1) ≥ TS)
• H₁: μ ≠ μ₀: C = (−∞, −t_{α/2}(n−1)) ∪ (t_{α/2}(n−1), +∞), p-value = 2 P(t(n−1) ≥ |TS|)

Two samples {x₁, …, x_{n₁}} from N(μ₁, σ₁) and {y₁, …, y_{n₂}} from N(μ₂, σ₂). Null hypothesis H₀: μ₁ = μ₂.

σ₁, σ₂ are known: TS = (x̄ − ȳ)/√(σ₁²/n₁ + σ₂²/n₂), with (X̄ − Ȳ − (μ₁ − μ₂))/√(σ₁²/n₁ + σ₂²/n₂) ~ N(0, 1):
• H₁: μ₁ < μ₂: C = (−∞, −λ_α), p-value = P(N(0, 1) ≤ TS)
• H₁: μ₁ > μ₂: C = (λ_α, +∞), p-value = P(N(0, 1) ≥ TS)
• H₁: μ₁ ≠ μ₂: C = (−∞, −λ_{α/2}) ∪ (λ_{α/2}, +∞), p-value = 2 P(N(0, 1) ≥ |TS|)

σ₁ = σ₂ = σ is unknown: TS = (x̄ − ȳ)/(s√(1/n₁ + 1/n₂)), with (X̄ − Ȳ − (μ₁ − μ₂))/(S√(1/n₁ + 1/n₂)) ~ t(n₁ + n₂ − 2):
• H₁: μ₁ < μ₂: C = (−∞, −t_α(n₁ + n₂ − 2)),
p-value = P(t(n₁ + n₂ − 2) ≤ TS)
• H₁: μ₁ > μ₂: C = (t_α(n₁ + n₂ − 2), +∞), p-value = P(t(n₁ + n₂ − 2) ≥ TS)
• H₁: μ₁ ≠ μ₂: C = (−∞, −t_{α/2}(n₁ + n₂ − 2)) ∪ (t_{α/2}(n₁ + n₂ − 2), +∞), p-value = 2 P(t(n₁ + n₂ − 2) ≥ |TS|)

both σ₁, σ₂ unknown: proceed similarly, using the same fact (approximately t(f), with the degrees of freedom f) as for the corresponding confidence interval.
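The two-sample test statistic with pooled variance can be computed directly; a minimal sketch (data values are made up, and the critical value t_{α/2}(n₁ + n₂ − 2) would be looked up in a t table, since the Python standard library has no t quantile function):

```python
import math
import statistics

x = [12.1, 11.4, 12.9, 12.5, 11.8]   # made-up sample from N(mu1, sigma)
y = [10.9, 11.2, 10.4, 11.7]         # made-up sample from N(mu2, sigma)
n1, n2 = len(x), len(y)

# pooled sample variance, assuming sigma1 = sigma2 = sigma (unknown)
s2 = ((n1 - 1) * statistics.variance(x)
      + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
s = math.sqrt(s2)

# TS = (xbar - ybar) / (s * sqrt(1/n1 + 1/n2)), compared with t(n1 + n2 - 2)
ts = (statistics.mean(x) - statistics.mean(y)) / (s * math.sqrt(1 / n1 + 1 / n2))
df = n1 + n2 - 2
print(ts, df)
```

For H₁: μ₁ ≠ μ₂ one would reject H₀ when |TS| exceeds the tabulated t_{α/2}(df).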
4.3 Hypothesis testing for population variances

One sample {x₁, …, x_n} from N(μ, σ). Null hypothesis H₀: σ = σ₀, with fact (n−1)S²/σ₀² ~ χ²(n−1) under H₀. TS = (n−1)s²/σ₀²:
• H₁: σ < σ₀: C = (0, χ²_{1−α}(n−1)), p-value = P(χ²(n−1) ≤ TS)
• H₁: σ > σ₀: C = (χ²_α(n−1), +∞), p-value = P(χ²(n−1) ≥ TS)
• H₁: σ ≠ σ₀: C = (0, χ²_{1−α/2}(n−1)) ∪ (χ²_{α/2}(n−1), +∞), p-value = 2 min( P(χ²(n−1) ≤ TS), P(χ²(n−1) ≥ TS) )

Two samples {x₁, …, x_{n₁}} from N(μ₁, σ₁) and {y₁, …, y_{n₂}} from N(μ₂, σ₂). Null hypothesis H₀: σ₁ = σ₂, with fact (S₁²/σ₁²)/(S₂²/σ₂²) ~ F(n₁ − 1, n₂ − 1). TS = s₁²/s₂²:
• H₁: σ₁ < σ₂: C = (0, F_{1−α}(n₁ − 1, n₂ − 1)), p-value = P(F(n₁ − 1, n₂ − 1) ≤ TS)
• H₁: σ₁ > σ₂: C = (F_α(n₁ − 1, n₂ − 1), +∞), p-value = P(F(n₁ − 1, n₂ − 1) ≥ TS)
• H₁: σ₁ ≠ σ₂: C = (0, F_{1−α/2}(n₁ − 1, n₂ − 1)) ∪ (F_{α/2}(n₁ − 1, n₂ − 1), +∞), p-value = 2 min( P(F(n₁ − 1, n₂ − 1) ≤ TS), P(F(n₁ − 1, n₂ − 1) ≥ TS) )

4.4 Large sample size (n ≥ 30), population may be completely unknown

If there is no information about the populations, then we can apply the Central Limit Theorem (usually with a large sample, n ≥ 30). The idea is exactly the same as the one used for confidence intervals. One example: a sample {x₁, …, x_n}, n ≥ 30, from some unknown population with mean μ and standard deviation σ. Null hypothesis H₀: μ = μ₀. It then follows from the CLT that (X̄ − μ₀)/(S/√n) ≈ N(0, 1) under H₀, therefore TS = (x̄ − μ₀)/(s/√n) and:
• H₁: μ < μ₀: C = (−∞, −λ_α), p-value = P(N(0, 1) ≤ TS)
• H₁: μ > μ₀: C = (λ_α, +∞), p-value = P(N(0, 1) ≥ TS)
• H₁: μ ≠ μ₀: C = (−∞, −λ_{α/2}) ∪ (λ_{α/2}, +∞), p-value = 2 P(N(0, 1) ≥ |TS|)

5. Multi-dimensional random variables, or random vectors

Covariance (Kovarians) of X, Y: σ_{X,Y} = cov(X, Y) = E[(X − μ_X)(Y − μ_Y)], and cov(X, X) = V(X).

Correlation coefficient (Korrelation) of X, Y: ρ_{X,Y} = cov(X, Y)/√(V(X) V(Y)) = σ_{X,Y}/(σ_X σ_Y).

A rule: for real constants a₀, a_i, b₀ and b_j,
cov( a₀ + Σ_{i=1}^n a_i X_i, b₀ + Σ_{j=1}^m b_j Y_j ) = Σ_{i=1}^n Σ_{j=1}^m a_i b_j cov(X_i, Y_j).

X and Y are uncorrelated if cov(X, Y) = 0.

An important theorem: suppose that a random vector X has mean vector μ_X and covariance matrix C_X. Define a new random vector Y = AX + b, for some matrix A and vector b. Then μ_Y = Aμ_X + b and C_Y = A C_X Aᵀ.

Standard normal vectors: {X_i} are independent and X_i ~ N(0, 1); X = (X₁, …, X_n)ᵀ, thus μ_X = 0, C_X = I, density f_X(x) = (2π)^{−n/2} e^{−xᵀx/2}.

General normal vectors: Y = AX + b, where X is a standard normal vector; then μ_Y = b, C_Y = AAᵀ, density f_Y(y) = (2π)^{−n/2} (det C_Y)^{−1/2} e^{−(y − μ_Y)ᵀ C_Y^{−1} (y − μ_Y)/2}.

6. Simple and multiple linear regressions

Simple linear regression: Y_j = β₀ + β₁ x_j + ε_j, ε_j ~ N(0, σ), j = 1, …, n.

Multiple linear regression: Y_j = β₀ + β₁ x_{j1} + β₂ x_{j2} + ⋯ + β_k x_{jk} + ε_j, ε_j ~ N(0, σ), j = 1, …, n.

Both simple linear regression and multiple
linear regression can be written in vector form:

Y = Xβ + ε, where Y = (Y₁, …, Y_n)ᵀ, X is the n × (k+1) matrix whose j-th row is (1, x_{j1}, …, x_{jk}), β = (β₀, β₁, …, β_k)ᵀ, and ε ~ N(0, σ²I).

Y ~ N(μ_Y, C_Y), where μ_Y = Xβ and C_Y = σ²I.

Estimate of the coefficient β: β̂ = (XᵀX)⁻¹Xᵀy.
Estimator of the coefficient β: B̂ = (XᵀX)⁻¹XᵀY ~ N(β, σ²(XᵀX)⁻¹).

Estimated line: μ̂_j = β̂₀ + β̂₁ x_{j1} + β̂₂ x_{j2} + ⋯ + β̂_k x_{jk}.

Analysis of variance:
SS_TOT = Σ_{j=1}^n (y_j − ȳ)²,   with SS_TOT/σ² = Σ_{j=1}^n (Y_j − Ȳ)²/σ² ~ χ²(n−1) if β₁ = ⋯ = β_k = 0,
SS_R = Σ_{j=1}^n (μ̂_j − ȳ)²,   with SS_R/σ² = Σ_{j=1}^n (μ̂_j − Ȳ)²/σ² ~ χ²(k) if β₁ = ⋯ = β_k = 0,
SS_E = Σ_{j=1}^n (y_j − μ̂_j)²,   with SS_E/σ² = Σ_{j=1}^n (Y_j − μ̂_j)²/σ² ~ χ²(n − k − 1).
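For simple linear regression (k = 1), the normal equations (XᵀX)β = Xᵀy reduce to the familiar closed form β̂₁ = S_xy/S_xx, β̂₀ = ȳ − β̂₁x̄, and the ANOVA decomposition SS_TOT = SS_R + SS_E can be verified directly. A minimal sketch with made-up data:

```python
# simple linear regression y = b0 + b1*x fitted by least squares
xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up predictor values
ys = [2.1, 2.9, 3.6, 4.4, 5.2]   # made-up responses
n = len(xs)
xbar = sum(xs) / n
ybar = sum(ys) / n

# beta1 = Sxy/Sxx, beta0 = ybar - beta1*xbar (solution of the normal equations)
sxx = sum((x - xbar) ** 2 for x in xs)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
b1 = sxy / sxx
b0 = ybar - b1 * xbar

mu_hat = [b0 + b1 * x for x in xs]   # estimated line at the observed x's
ss_tot = sum((y - ybar) ** 2 for y in ys)
ss_r = sum((m - ybar) ** 2 for m in mu_hat)
ss_e = sum((y - m) ** 2 for y, m in zip(ys, mu_hat))

assert abs(ss_tot - (ss_r + ss_e)) < 1e-9   # ANOVA identity SS_TOT = SS_R + SS_E
r2 = ss_r / ss_tot                          # coefficient of determination
s2 = ss_e / (n - 2)                         # estimate of sigma^2 (k = 1 predictor)
print(b0, b1, r2)
```

The identity SS_TOT = SS_R + SS_E holds exactly for a least squares fit because the residuals are orthogonal to the fitted values.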
SS_TOT = SS_R + SS_E, and R² = SS_R/SS_TOT (the coefficient of determination). σ² is estimated as σ̂² = s² = SS_E/(n − k − 1).

For the hypothesis testing H₀: β₁ = ⋯ = β_k = 0 vs H₁: at least one β_j ≠ 0, the fact is

(SS_R/k) / (SS_E/(n − k − 1)) ~ F(k, n − k − 1),

TS = (SS_R/k) / (SS_E/(n − k − 1)), C = (F_α(k, n − k − 1), +∞).

We know B̂ = (XᵀX)⁻¹XᵀY ~ N(β, σ²(XᵀX)⁻¹); thus, if we denote

(XᵀX)⁻¹ = (h_{ij}), i, j = 0, 1, …, k,

then B̂_j ~ N(β_j, σ√h_{jj}) and (B̂_j − β_j)/(σ√h_{jj}) ~ N(0, 1). But σ is generally unknown, therefore the fact is (B̂_j − β_j)/(S√h_{jj}) ~ t(n − k − 1).

Confidence interval of β_j: I_{β_j} = β̂_j ± t_{α/2}(n − k − 1) · s√h_{jj}. (s√h_{jj} is sometimes denoted d(β̂_j) or se(β̂_j).)

Hypothesis testing H₀: β_j = 0 vs H₁: β_j ≠ 0 has TS = β̂_j/(s√h_{jj}), C = (−∞, −t_{α/2}(n − k − 1)) ∪ (t_{α/2}(n − k − 1), +∞).

Suppose we have two models:
Model 1: Y = β₀ + β₁x₁ + ⋯ + β_k x_k + ε,
Model 2: Y = β₀ + β₁x₁ + ⋯ + β_k x_k + β_{k+1} x_{k+1} + ⋯ + β_{k+p} x_{k+p} + ε,
and we want to test H₀: β_{k+1} = ⋯ = β_{k+p} = 0 vs H₁: at least one β_{k+i} ≠ 0. The fact is

((SS_E⁽¹⁾ − SS_E⁽²⁾)/p) / (SS_E⁽²⁾/(n − k − p − 1)) ~ F(p, n − k − p − 1),

TS = ((SS_E⁽¹⁾ − SS_E⁽²⁾)/p) / (SS_E⁽²⁾/(n − k − p − 1)), C = (F_α(p, n − k − p − 1), +∞).

Variable selection: if we have a response variable Y with possibly many predictors x₁, …, x_k, then how do we choose the appropriate x's (some x's are useful for Y, and some are not)?
Step 1: compute corr(x₁, y), …, corr(x_k, y); choose a maximal correlation (say x_i), fit Y = β₀ + β_i x_i + ε, and test whether β_i = 0.
Step 2: do the regressions Y = β₀ + β_i x_i + β_ℓ x_ℓ + ε for ℓ = 1, …, i − 1, i + 1, …, k; choose the one with minimal SS_E (say x_j), fit Y = β₀ + β_i x_i + β_j x_j + ε, and test whether β_j = 0.
Step 3: repeat Step 2 until the last test (that the newly added coefficient is 0) is not rejected.

Rewrite simple and multiple linear regressions as follows:
Y = β₀ + β₁x₁ + ⋯ + β_k x_k + ε, ε ~ N(0, σ)   (the model),
μ₀ = E[Y₀] = β₀ + β₁x₀₁ + ⋯ + β_k x₀ₖ   (the mean),
μ̂₀ = β̂₀ + β̂₁x₀₁ + ⋯ + β̂_k x₀ₖ   (the estimated line).
For a given/fixed x₀ = (1, x₀₁, …, x₀ₖ)ᵀ, the scalar μ̂₀ is an estimate of the unknown μ₀ and of Y₀. Then we can talk about the accuracy of this estimate in terms of confidence intervals and prediction intervals.
Confidence interval of μ₀: I_{μ₀} = μ̂₀ ± t_{α/2}(n − k − 1) · s √( x₀ᵀ(XᵀX)⁻¹x₀ ).
Prediction interval of Y₀: I_{Y₀} = μ̂₀ ± t_{α/2}(n − k − 1) · s √( 1 + x₀ᵀ(XᵀX)⁻¹x₀ ).

7. Basic χ²-test

Suppose we want to test H₀: X follows a given distribution (with or without unknown parameters) vs H₁: X does not follow that distribution. The fact is

Σ_{i=1}^k (N_i − n p_i)²/(n p_i) ≈ χ²(k − 1 − #unknown parameters),

TS = Σ_{i=1}^k (N_i − n p_i)²/(n p_i), C = (χ²_α(k − 1 − #unknown parameters), +∞).

Homogeneity test
Suppose we have data with r rows and k columns, and
H₀: different rows have the same pattern in terms of the columns, vs
H₁: different rows have different patterns in terms of the columns.
Equivalently,
H₀: rows and columns are independent, vs
H₁: rows and columns are not independent.
The fact is

Σ_{i=1}^r Σ_{j=1}^k (N_{ij} − n p_{ij})²/(n p_{ij}) ≈ χ²((r − 1)(k − 1)),

TS = Σ_{i=1}^r Σ_{j=1}^k (N_{ij} − n p_{ij})²/(n p_{ij}), C = (χ²_α((r − 1)(k − 1)), +∞),

where p_{ij} = p_i q_j are the theoretical probabilities (in practice estimated from the row and column totals).
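The homogeneity/independence test statistic is computed by estimating p_{ij} from the margins, so the expected count in cell (i, j) is (row total)·(column total)/n. A minimal sketch with a made-up 2×3 table (the critical value χ²_α((r−1)(k−1)) would come from a χ² table):

```python
# made-up r x k contingency table of observed counts N_ij
table = [[30, 20, 10],
         [20, 25, 15]]
r, k = len(table), len(table[0])
n = sum(sum(row) for row in table)
row_tot = [sum(row) for row in table]
col_tot = [sum(table[i][j] for i in range(r)) for j in range(k)]

# TS = sum over cells of (observed - expected)^2 / expected,
# with expected = n * p_ij and p_ij = p_i * q_j estimated from the margins
ts = 0.0
for i in range(r):
    for j in range(k):
        expected = row_tot[i] * col_tot[j] / n
        ts += (table[i][j] - expected) ** 2 / expected

df = (r - 1) * (k - 1)
print(ts, df)
```

H₀ (rows and columns independent) is rejected when TS exceeds the tabulated χ²_α(df).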