6. Cox Regression Models. (Part I)
|
|
- Andrew Webb
- 5 years ago
- Views:
Transcription
1 6. Cox Regressio Models (Part I)
2 The Proportioal Hazards Model A proportioal hazards model proposed by D.R. Cox (197) assmes that λ t z = λ 0 t e β 1z 1 + +β p z p = λ 0 t e zt β where z is a p 1 vector of covariates sch as treatmet idicators, progositc factors, etc., ad β is a p 1 vector of regressio coefficiets. Note that there is o itercept β 0 i the model. So λ 0 t is ofte called the baselie hazard fctio. It ca be iterpreted as the hazard fctio for the poplatio of sbjects with z = 0, i.e. λ t z = 0 = λ 0 t. The baselie hazard fctio λ 0 t ca take ay shape as a fctio of t. The oly reqiremet is that λ 0 t > 0. This is the oparametric part of the model ad z t β is the parametric part of the model. So Cox's proportioal hazards model is a semiparametric model.
3 Iterpretatio of a PH model exp z It is easy to show that der PH model, S t z = S 0 t Tβ, where S t z is the srvival fctio of the sbpoplatio with covariate z ad S 0 t is the srvival fctio of baselie poplatio (z = 0). That is S 0 t = e 0 For ay two sets of covariates z 0 ad z 1, t λ0 d λ t z 1 = λ 0 t e z1 λ t z 0 λ 0 t e = e z 1 z Tβ 0, for all t > 0 z 0 T β which is a costat over time. Eqivaletly, log λ t z 1 = z λ t z 1 z T 0 β. 0 T β With oe it icrease i z k while other covariate vales beig held fixed, the log λ t z k + 1 λ t z = β k + 1 λ t z k = e β k or, k λ t z k Therefore, β k e β k is the icrease i log hazard-ratio (hazard-ratio) at ay time with it icrease i the kth covariate z k. Frthermore, sice P t T < t + Δt T t, z k + 1 P t T < t + Δt T t, z k λ t z k + 1 Δt λ t z k Δt = e β k so e β k ca be loosely iterpreted as the ratio of two coditioal probabilities of dyig i the ear ftre give a sbject is alive at ay time t.
4 Iferetial Problems From the iterpretatio of the model, it is obvios that β characterizes the effect of z. So β shold be the focs of or iferece while λ 0 t is a isace parameter. Give a sample of cesored srvival data, or iferetial problems iclde: 1. Estimate β; derive its statistical properties.. Testig hypothesis H 0 : β = 0 or for part of β. 3. Diagostics. Sice the baselie hazard λ 0 t is left completely specified (ifiite dimesioal), ordiary likelihood methods ca't be sed to estimate β. Cox coceived of the idea of a partial likelihood to remove the isace parameter λ 0 t from the proposed estimatig eqatio. Historical Note: Cox described the proportioal hazards model i JRSSB (197), i what is ow the most qoted statistical papers i history. He also otlied i this paper the method for estimatio which he referred to as sig coditioal likelihood. It was poited ot to him i the literatre that what he proposed was ot a coditioal likelihood ad that there may be some flaws i his logic. Cox (1975) was able to recast his method of estimatio throgh what he called partial likelihood ad pblished this i Biometrika. This approach seemed to be based o sod iferetial priciples. Rigoros proofs showig the cosistecy ad asymptotic ormality were ot pblished til 1981 whe Tsiatis (Aals of Statistics) demostrated these large sample properties. I 198, Aderso ad Gill (Aals of Statistics) simplified ad geeralized these reslts throgh the se of cotig processes.
5 Estimatio Usig Partial Likelihood Data: X i, Δ i, z i, i = 1,,, where for the ith idividal, T X i = mi T i, C i ; Δ i = I T i, C i ; z i = z i1, z i,, z ip is a vector of covariates. Model: λ t z i = λ 0 t e z i t β where, P t T < t + h T t, z i λ t z i = lim h 0 + h Assme that C i ad T i are coditioally idepedet give z i. The the case-specific hazard ca be sed to represet the hazard of iterest. That is (i terms of coditioal probabilities) P x X i < x + Δx, Δ i = 1 X i x, z i = P x T i < x + Δx T i x, z i λ Ti x z i Δx Similar to the case of log rak test, we eed to defie some otatio. Let s break the time axis (patiet time) ito a grid of poits. Assme the srvival time is cotios. We hece ca take the grid poits dese eogh so that at most oe death ca occr withi ay iterval.
6 Notatios Let dn i () deote the idicator for the ith idividal beig observed to die i, + Δ. Namely, dn i = I X i, + Δ, Δ i = 1 ; Let Y i () deote the idicator for whether or ot the ith idividal is at risk at time. Namely, Y i = I X i Let dn x = i=1 dn i x deote the mber of deaths for the whole sample occrrig i, + Δ. Sice we are assmig Δ is sfficietly small, so dn is either 1 or 0 at ay time. Let Y x = i=1 Y i x be the total mber from the etire sample who are at risk at time. Let F x deote the iformatio p to time x (oe of the grid poits) F x = dn i, Y i, z i, i = 1,,, for all grid poits < x ad dn x Also, F = X i, Δ i, z i, i = 1,, deote all the data i the sample. Let I x deote the idividal i the sample who died at time x if someoe died. If o oe dies at time x, the I x = 0. For example, I x = j meas that the jth idividal died i x, x + Δx. The the data (with reddacy) ca be expressed as F 1, I 1, F, I,, F, where 1 < < are the grid poits
7 The Partial Likelihood The likelihood of the parameter λ 0 t ad β ca be writte as P F 1 = f 1 ; λ 0 t, β P I 1 = i 1 F 1 = f 1 ; λ 0 t, β P F = f F 1 = f 1, I 1 = i 1 ; λ 0 t, β P I = i F = f, F 1 = f 1, I 1 = i 1 ; λ 0 t, β ad the last term ca be simplified as P I = i F = f, F 1 = f 1, I 1 = i 1 ; λ 0 t, β = P I = i F = f ; λ 0 t, β That is, the fll likelihood ca be writte as the prodct of a series of coditioal likelihoods. The partial likelihood (as defied by D.R. Cox) cosists of the prodct of every other coditioal probabilities i the above presetatio. That is PL = all grid pt P I = i F = f ; λ 0 t, β
8 Ex: Patiet ID x δ z PL = 1 eβ e β +e 3β e3β e 3β e β e β +e β +e β +e 3β Case 1: Sppose coditioal o F() we have dn = 0. That is, o death is observed at time. I sch a case, I = 0 with probability 1. Hece for ay grid poit where dn = 0, we have P I = 0 F = f = 1 Case : dn = 1. Coditioal o F, if we kow that oe idividal dies at time, the it mst be oe of the idividals still at risk (alive ad ot cesored) at time ; i.e., amog the followig idividals i: Y i = 1. Also coditioal o F, we kow the covariate vector z i associated to each idividal i sch that Y i = 1. Therefore, we ask the followig qestio: AmogY = i=1 Y i idividals, what is the probability that the observed death happeed to the ith sbject (who is actally observed to die at ) rather tha to the other patiets? A: It is proportioal to the ith sbject s case-specific hazard of dyig at time.
9 Let A i = the evet that sbject i is goig to die i, + Δ give that he/she is still alive at. If a patiet is ot at risk at (i.e., Y i = 0), the A i = φ. Sice we chose Δ to be so small that there is at most oe death i, + Δ, so we kow A 1, A,, A are mtally exclsive. Also, der idepedet cesorig assmptio, the case-specific hazard is the same as the hazard of iterest (Chapter 3 ), i.e., λ, δ i = 1 z i = λ z i. Sice Δ is chose to be very small, so P A i Y i λ, δ i = 1 z i Δ = Y i λ z i Δ = Y i λ 0 e ztβ Δ assmptio of the cox model P I = i F = f ; λ 0 t, β = P A i() A 1 A = P A i() P A l λ 0 e z T i() β Δ λ 0 eztβ l ΔY l = e zi() T β e ztβ l Y l σ Here Y i() = 1 sice we kow this patiet had to be at risk at, ad died i, + Δ. Take together: PL = all grid pt e z i T β e z l Tβ Y l dn()
10 Other eqivalet ways of writig the partial likelihood iclde: Let t 1,, t d defie the distict death times, the d PL β = j=1 e z T i t j β e z l Tβ Y l t j PL β = i=1 all grid pt e z i T β e z l Tβ Y l dn i () PL β = i=1 e z T i t j β e z l Tβ Y l t j δ i The importace of sig the partial likelihood is that this fctio depeds oly o β, the parameter of iterest, ad is free of the baselie hazard λ 0 t, which is ifiite dimesioal isace fctio. Cox sggested treatig PL as a reglar likelihood fctio ad makig iferece o β accordigly. For example, we maximize the partial likelihood to get the estimate of β, ofte called MPLE (maximm partial likelihood estimate), ad se the mis of the secod derivative of the log partial likelihood as the iformatio matrix, etc.
11 ҧ Properties of the score of the partial likelihood For ease of presetatio, let s focs o oe covariate case. the log partial likelihood fctio of β is l β = The score fctio is all grid pt l β U β = β = all grid pt Ad the secod derivative is l β β = dn() σ dn() z I β log z, β = σ z l e z l β Y l Defie e z l β Y l σ dn() z I z l e z l β Y l e z l β Y l e z l β Y l σ σ z l e z l β Y l e z l β Y l z l e z l β Y l e z l β Y l = z l w l where, w l = ez l β Y l e z l β Y l w l is the weight that is proportioal to the hazard of the idividal failig. So ҧ z, β ca be iterpreted as the weighted average of the covariate z amog those idividals still at risk at time with weights w l.
12 Defie V z, β = σ = σ z l e z l β Y l e z l β Y l = z l w l = z l e z l β Y l e z l β Y l z l zҧ, β zҧ, β σ zҧ, β z l e z l β Y l e z l β Y l This last represetatio says that V z, β ca be iterpreted as the weighted variace of the covariates amog those idividals still at risk at ad hece V z, β > 0. Coseqetly, l β β = dn V z, β < 0 w l Therefore l β has a iqe maximizer ad ca be obtaied iqely by solvig the followig partial likelihood eqatio: l β U β = β = dn() z I σ z l e z l β Y l = 0 all grid pt e z l β Y l This maximizer መβ defies the MPLE of β.
13 Termiology: The qatity l β β = dn V z, β is defied as the partial likelihood observed iformatio ad is deoted by J β. Ultimately, we wat to show that the MPLE መβ has ice statistical properties. These iclde: Cosistecy: መβ will coverge to the tre vale of β, which geerated the data as the sample size gets larger. We call this tre vale β 0. Asymptotic Normality: መβ will be approximately ormally distribted with mea β 0 ad a variace which ca be estimated from the data. This approximatio will be better as the sample size gets larger. This reslt is sefl i makig iferece for the tre β. Efficiecy: Amog all other competig estimators for β, the MPLE has the smallest variace, at least, whe the sample size gets larger. I order to show the properties for መβ, we expad U መβ at the tre vale β 0 sig Taylor expasio: 0 = U መβ U β 0 + U β 0 መβ β β 0 መβ β 0 U β 1 0 U β β 0 = J β 1 0 U β 0 This expressio idicates that we eed to ivestigate the properties of the score fctio U β 0, i.e., U β 0 = dn z l zҧ, β
14 Properties of the Score I E U β 0 = E dn z I() zҧ, β 0 = E dn z I() zҧ, β 0 = E E dn z I() zҧ, β 0 F() = E dn E z I() F() zҧ, β 0 Notice that: E z I() F() = σ z l e z l β 0Y l e z l β 0Y l = z l w l = zҧ, β 0 Therefore, E U β 0 = 0
15 Coditioal distribtio of z I() give F() Vale of I() Vale of z I() Probability 1 z 1 e z 1 β 0 Y 1 / e z l β 0Y l = w 1 z e z β 0 Y / e z l β 0Y l = w 1 e z z β 0 Y / e z l β 0Y l E z I() F() = σ z l e z l β 0Y l e z l β 0Y l = w = z l w l = zҧ, β 0 Var z I() F() = z l E z I() F() w l = = V z (, β 0 ) σ z l zҧ, β 0 e z l β 0Y l e z l β 0 Y l
16 Properties of the Score II Var U β 0 = E U β 0 = E dn = E z I() zҧ, β 0 dn z I zҧ, β 0 + E dn z I zҧ, β 0 dn z I zҧ, β 0 As sal, WLOG, assme >, ad deote A = dn z I zҧ, β 0, A = dn z I zҧ, β 0 The E A A = E E A A F = E A E A F = 0 So, the expectatio of the cross-prodct is zero. The, Var U β 0 = E A = E E A F() E A F() = E dn z I zҧ, β 0 F() dn = dn = E dn z I zҧ, β 0 F()
17 Coditioal o F, dn is kow, zҧ, β 0 is also kow, ad zҧ, β 0 = E z I() F() E A F() = dn E z I zҧ, β 0 = dn Var z I F() = dn V z (, β 0 ) F() Coseqetly, Var U β 0 = E dn V z (, β 0 ) = E dn V z, β 0 Note that the qatity σ dn V z, β 0 is a statistic (ca be calclated from the observed data), so σ dn V z, β 0 is a biased estimate of Var U β 0. I fact, σ dn V z, β 0 is the partial likelihood observed iformatio J β 0 we defied before. The score U β 0 = σ A() is a sm of coditioally correlated mea zero radom variables ad its variace ca be biasedly estimated by J β 0. By the martigale CLT, we have: U β 0 ~ a N(0, J β 0 ) መβ β 0 ~ a N(0, J 1 β 0 ) መβ β 0 ~ a N(0, J 1 መβ ) Note that: J 1 መβ = σ dn V z, መβ Sice መβ β 0 J β 0 1 U β 0 I practice, sice β 0 is kow, we ca sbstitte መβ for β 0 for the variace of U β 0.
18 Iferece with a Sigle Covariate Assme a proportioal hazards model with a sigle covariate z λ t = λ 0 t e zβ After we get or data x i, δ i, z i, the MPLE መβ ca be obtaied by solvig the partial likelihood eqatio; i.e., settig the partial score to zero. The asymptotically, መβ ~ a N(β 0, J 1 መβ ) We ca se this fact to costrct cofidece iterval for β ad test the hypothesis H 0 : β = β 0, etc. For example, a 1 α CI of β is መβ ± z α/ J 1 መβ 1/ Example: Myelomatosis data (SAS code) Defie a treatmet idicator trt1 which takes vale 0 for treatmet 1 ad takes vale 1 for treatmet. A 95% CI of β is: 0.578±1.96* = [-0.46, 1.57] A 95% CI of the hazard ratio e β is: e 0.46, e 1.57 = 0.653, 4.816
19 Compariso of Score Test ad Two-Sample Log Rak Test Assme z is the dichotomos idicator for treatmet; i.e., 1 for treatmet 1 z = ቊ 0 for treatmet 0 ad the proportioal hazards model: λ t = λ 0 t e zβ Score test: Uder H 0 : β = 0, the score U(0) (evalated der H 0 ) has the distribtio U 0 ~ a N(0, J 0 ) U 0 ~ a or, χ 1 J 1/ 0 Notice that U 0 = dn z I zҧ, 0 ҧ The: 1. If a death occrs at time, the dn = 1, i which case there will a cotribtio to U 0 by addig z I z, 0. Otherwise o cotribtio.. Sice z = 1 for treatmet 1 ad z = 0 for treatmet 0, z I will the be the mber of deaths at time from treatmet Uder H 0 : β = 0, zҧ, 0 is simplified to be zҧ, 0 = zl Y l () Y l (), which is the proportio of idividals i grop 1 amog those at risk at time. Sice we oly assme oe death at time, this proportio is the expected mber of death for treatmet 1 amog those at risk at time, der the ll hypothesis of o treatmet differece.
20 4. Therefore, U 0 is the sm over the death times of the observed mber of deaths from treatmet 1 mis the expected mber of deaths der the ll hypothesis. This was the merator of the two-sample log rak test: dn 1 Y 1 Y dn() 5. The deomiator of the score test was compted as J 1 0 = dn V z, 0 sice V z, 0 = σ l z l zҧ, 0 Y l () = σ l Y l () = Y 0 Y 1 Y + Y 1 Y 0 () Y () Y() 1 = 1 Y 1 Y dn Y 0 Y 1 Y () = Y 0 Y 1 Y() Y 3 () 1 Y Y 1() Y() Y() = Y 0 Y 1 Y () Y 0 () Note that I the special case where dn()ca oly be oe or zero, the variace sed to compte the lograk test statistic is: Y 1 Y 0 dn [Y dn()] Y Y 1 = Y 0 Y 1 Y (), which is J(0)
21 Therefore, for cotios srvival time data with o ties, the score test of the hypothesis H 0 : β = 0 i the proportioal hazards model is exactly the same as the lograk test for dichotomos covariate z. I fact, the score test U 0 J 1/ 0 ca be sed to test for ay covariate vale z, whether or ot z is discrete or cotios. The hypothesis H 0 : β = 0 implies that the hazard rate at ay time t is affected by the covariate z, or that the srvival distribtio does ot deped o z. The alterative hypothesis H a : β 0 wold mea that idividals with a higher vale of z wold have stochastically larger (or smaller depedig o the sig of β) srvival distribtio tha those idividals with a smaller vales of z. The test commad i Proc Lifetest comptes the score test of the hypothesis H 0 : β = 0 for the proportioal hazards model. Note that whe sig the test commad, the covariate z is ot limited to beig dichotomos, or discrete. Example: Myelomatosis data revisited (SAS code)
22 Likelihood Ratio Test As i the ordiary likelihood theory, the (partial) likelihood ratio test ca also be sed to test the ll hypothesis: H 0 : β = β 0 The likelihood ratio test ses the fact that derh 0 : β = β 0, l መβ l β 0 ~ a χ 1 Therefore, for a give level of sigificace α, we reject H 0 : β = β 0 if A Heristic Proof: Expadig l β 0 l β 0 Sice መβ maximize l β, i. e. So, H 0 : β = β 0, U መβ l መβ l β 0 χ 1,α P χ 1 > χ 1,α = α l መβ + dl መβ dβ at the MPLE መβ, we get መβ β d l መβ! d β = dl መβ dβ = 0 d l መβ ad d β l መβ l β 0 J መβ መβ β 0 = መβ β 0 J 1/ 0 ~ a χ 1 = J መβ መβ β 0 Note: The SAS procedre Phreg ca ONLY hadle right cesored data. By መβ β 0 ~ a N(0, J 1 መβ )
LECTURE 13 SPURIOUS REGRESSION, TESTING FOR UNIT ROOT = C (1) C (1) 0! ! uv! 2 v. t=1 X2 t
APRIL 9, 7 Sprios regressio LECTURE 3 SPURIOUS REGRESSION, TESTING FOR UNIT ROOT I this sectio, we cosider the sitatio whe is oe it root process, say Y t is regressed agaist aother it root process, say
More informationApproximation of the Likelihood Ratio Statistics in Competing Risks Model Under Informative Random Censorship From Both Sides
Approimatio of the Lielihood Ratio Statistics i Competig Riss Model Uder Iformative Radom Cesorship From Both Sides Abdrahim A. Abdshrov Natioal Uiversity of Uzbeista Departmet of Theory Probability ad
More informationChapter 6 Sampling Distributions
Chapter 6 Samplig Distributios 1 I most experimets, we have more tha oe measuremet for ay give variable, each measuremet beig associated with oe radomly selected a member of a populatio. Hece we eed to
More informationStatistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.
Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized
More informationResampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.
Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator
More informationProperties and Hypothesis Testing
Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.
More informationLecture Notes 15 Hypothesis Testing (Chapter 10)
1 Itroductio Lecture Notes 15 Hypothesis Testig Chapter 10) Let X 1,..., X p θ x). Suppose we we wat to kow if θ = θ 0 or ot, where θ 0 is a specific value of θ. For example, if we are flippig a coi, we
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationRandom Variables, Sampling and Estimation
Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig
More informationDirection: This test is worth 150 points. You are required to complete this test within 55 minutes.
Term Test 3 (Part A) November 1, 004 Name Math 6 Studet Number Directio: This test is worth 10 poits. You are required to complete this test withi miutes. I order to receive full credit, aswer each problem
More informationLet us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f.
Lecture 5 Let us give oe more example of MLE. Example 3. The uiform distributio U[0, ] o the iterval [0, ] has p.d.f. { 1 f(x =, 0 x, 0, otherwise The likelihood fuctio ϕ( = f(x i = 1 I(X 1,..., X [0,
More informationEstimation for Complete Data
Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of
More informationSTAT331. Example of Martingale CLT with Cox s Model
STAT33 Example of Martigale CLT with Cox s Model I this uit we illustrate the Martigale Cetral Limit Theorem by applyig it to the partial likelihood score fuctio from Cox s model. For simplicity of presetatio
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationIn this document, if A:
m I this docmet, if A: is a m matrix, ref(a) is a row-eqivalet matrix i row-echelo form sig Gassia elimiatio with partial pivotig as described i class. Ier prodct ad orthogoality What is the largest possible
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationPartial Differential Equations
EE 84 Matematical Metods i Egieerig Partial Differetial Eqatios Followig are some classical partial differetial eqatios were is assmed to be a fctio of two or more variables t (time) ad y (spatial coordiates).
More informationIntroduction. Question: Why do we need new forms of parametric curves? Answer: Those parametric curves discussed are not very geometric.
Itrodctio Qestio: Why do we eed ew forms of parametric crves? Aswer: Those parametric crves discssed are ot very geometric. Itrodctio Give sch a parametric form, it is difficlt to kow the derlyig geometry
More informationUnivariate Regression
Uivariate Regressio Agst 2006 1 Basic Model Assme y i F ( ) E (y i ) µ i Var(y i ) σ 2 (ot ecessarily ormal). Assme that µ i β. Ths, we ca write y i β + i E i 0 Var( i ) σ 2 E ( i ) 0 Examples: 1. School
More informationMOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND.
XI-1 (1074) MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. R. E. D. WOOLSEY AND H. S. SWANSON XI-2 (1075) STATISTICAL DECISION MAKING Advaced
More informationA quick activity - Central Limit Theorem and Proportions. Lecture 21: Testing Proportions. Results from the GSS. Statistics and the General Population
A quick activity - Cetral Limit Theorem ad Proportios Lecture 21: Testig Proportios Statistics 10 Coli Rudel Flip a coi 30 times this is goig to get loud! Record the umber of heads you obtaied ad calculate
More informationStatistical Inference Based on Extremum Estimators
T. Rotheberg Fall, 2007 Statistical Iferece Based o Extremum Estimators Itroductio Suppose 0, the true value of a p-dimesioal parameter, is kow to lie i some subset S R p : Ofte we choose to estimate 0
More information2 1. The r.s., of size n2, from population 2 will be. 2 and 2. 2) The two populations are independent. This implies that all of the n1 n2
Chapter 8 Comparig Two Treatmets Iferece about Two Populatio Meas We wat to compare the meas of two populatios to see whether they differ. There are two situatios to cosider, as show i the followig examples:
More informationDS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10
DS 00: Priciples ad Techiques of Data Sciece Date: April 3, 208 Name: Hypothesis Testig Discussio #0. Defie these terms below as they relate to hypothesis testig. a) Data Geeratio Model: Solutio: A set
More informationChapter 13: Tests of Hypothesis Section 13.1 Introduction
Chapter 13: Tests of Hypothesis Sectio 13.1 Itroductio RECAP: Chapter 1 discussed the Likelihood Ratio Method as a geeral approach to fid good test procedures. Testig for the Normal Mea Example, discussed
More informationStat 200 -Testing Summary Page 1
Stat 00 -Testig Summary Page 1 Mathematicias are like Frechme; whatever you say to them, they traslate it ito their ow laguage ad forthwith it is somethig etirely differet Goethe 1 Large Sample Cofidece
More informationSequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence
Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece 1, 1, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet
More informationMAT2400 Assignment 2 - Solutions
MAT24 Assigmet 2 - Soltios Notatio: For ay fctio f of oe real variable, f(a + ) deotes the limit of f() whe teds to a from above (if it eists); i.e., f(a + ) = lim t a + f(t). Similarly, f(a ) deotes the
More informationLecture 3. Properties of Summary Statistics: Sampling Distribution
Lecture 3 Properties of Summary Statistics: Samplig Distributio Mai Theme How ca we use math to justify that our umerical summaries from the sample are good summaries of the populatio? Lecture Summary
More informationLinear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d
Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y
More informationSample Size Determination (Two or More Samples)
Sample Sie Determiatio (Two or More Samples) STATGRAPHICS Rev. 963 Summary... Data Iput... Aalysis Summary... 5 Power Curve... 5 Calculatios... 6 Summary This procedure determies a suitable sample sie
More informationExpectation and Variance of a random variable
Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio
More informationECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors
ECONOMETRIC THEORY MODULE XIII Lecture - 34 Asymptotic Theory ad Stochastic Regressors Dr. Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Asymptotic theory The asymptotic
More informationChapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.
Chapter 22 Comparig Two Proportios Copyright 2010 Pearso Educatio, Ic. Comparig Two Proportios Comparisos betwee two percetages are much more commo tha questios about isolated percetages. Ad they are more
More informationEXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY
EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 016 MODULE : Statistical Iferece Time allowed: Three hours Cadidates should aswer FIVE questios. All questios carry equal marks. The umber
More informationThe variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2.
SAMPLE STATISTICS A radom sample x 1,x,,x from a distributio f(x) is a set of idepedetly ad idetically variables with x i f(x) for all i Their joit pdf is f(x 1,x,,x )=f(x 1 )f(x ) f(x )= f(x i ) The sample
More information6.3 Testing Series With Positive Terms
6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial
More informationFrequentist Inference
Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for
More informationA sequence of numbers is a function whose domain is the positive integers. We can see that the sequence
Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece,, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet as
More informationand the sum of its first n terms be denoted by. Convergence: An infinite series is said to be convergent if, a definite unique number., finite.
INFINITE SERIES Seqece: If a set of real mbers a seqece deoted by * + * Or * + * occr accordig to some defiite rle, the it is called + if is fiite + if is ifiite Series: is called a series ad is deoted
More informationNumerical Methods for Finding Multiple Solutions of a Superlinear Problem
ISSN 1746-7659, Eglad, UK Joral of Iformatio ad Comptig Sciece Vol 2, No 1, 27, pp 27- Nmerical Methods for Fidig Mltiple Soltios of a Sperliear Problem G A Afrozi +, S Mahdavi, Z Naghizadeh Departmet
More informationTHE CONSERVATIVE DIFFERENCE SCHEME FOR THE GENERALIZED ROSENAU-KDV EQUATION
Zho, J., et al.: The Coservative Differece Scheme for the Geeralized THERMAL SCIENCE, Year 6, Vol., Sppl. 3, pp. S93-S9 S93 THE CONSERVATIVE DIFFERENCE SCHEME FOR THE GENERALIZED ROSENAU-KDV EQUATION by
More informationCEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering
CEE 5 Autum 005 Ucertaity Cocepts for Geotechical Egieerig Basic Termiology Set A set is a collectio of (mutually exclusive) objects or evets. The sample space is the (collectively exhaustive) collectio
More informationStatistical inference: example 1. Inferential Statistics
Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either
More informationLecture 5: Parametric Hypothesis Testing: Comparing Means. GENOME 560, Spring 2016 Doug Fowler, GS
Lecture 5: Parametric Hypothesis Testig: Comparig Meas GENOME 560, Sprig 2016 Doug Fowler, GS (dfowler@uw.edu) 1 Review from last week What is a cofidece iterval? 2 Review from last week What is a cofidece
More informationLINEARIZATION OF NONLINEAR EQUATIONS By Dominick Andrisani. dg x. ( ) ( ) dx
LINEAIZATION OF NONLINEA EQUATIONS By Domiick Adrisai A. Liearizatio of Noliear Fctios A. Scalar fctios of oe variable. We are ive the oliear fctio (). We assme that () ca be represeted si a Taylor series
More informationEcon 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara
Poit Estimator Eco 325 Notes o Poit Estimator ad Cofidece Iterval 1 By Hiro Kasahara Parameter, Estimator, ad Estimate The ormal probability desity fuctio is fully characterized by two costats: populatio
More informationCS284A: Representations and Algorithms in Molecular Biology
CS284A: Represetatios ad Algorithms i Molecular Biology Scribe Notes o Lectures 3 & 4: Motif Discovery via Eumeratio & Motif Represetatio Usig Positio Weight Matrix Joshua Gervi Based o presetatios by
More information1 Inferential Methods for Correlation and Regression Analysis
1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet
More informationFACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures
FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals
More information1.010 Uncertainty in Engineering Fall 2008
MIT OpeCourseWare http://ocw.mit.edu.00 Ucertaity i Egieerig Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu.terms. .00 - Brief Notes # 9 Poit ad Iterval
More informationLecture 6 Simple alternatives and the Neyman-Pearson lemma
STATS 00: Itroductio to Statistical Iferece Autum 06 Lecture 6 Simple alteratives ad the Neyma-Pearso lemma Last lecture, we discussed a umber of ways to costruct test statistics for testig a simple ull
More informationStatistics 20: Final Exam Solutions Summer Session 2007
1. 20 poits Testig for Diabetes. Statistics 20: Fial Exam Solutios Summer Sessio 2007 (a) 3 poits Give estimates for the sesitivity of Test I ad of Test II. Solutio: 156 patiets out of total 223 patiets
More informationSTAT431 Review. X = n. n )
STAT43 Review I. Results related to ormal distributio Expected value ad variace. (a) E(aXbY) = aex bey, Var(aXbY) = a VarX b VarY provided X ad Y are idepedet. Normal distributios: (a) Z N(, ) (b) X N(µ,
More informationChapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.
Chapter 22 Comparig Two Proportios Copyright 2010, 2007, 2004 Pearso Educatio, Ic. Comparig Two Proportios Read the first two paragraphs of pg 504. Comparisos betwee two percetages are much more commo
More informationRecall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y.
Testig Statistical Hypotheses Recall the study where we estimated the differece betwee mea systolic blood pressure levels of users of oral cotraceptives ad o-users, x - y. Such studies are sometimes viewed
More informationIt is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function.
MATH 532 Measurable Fuctios Dr. Neal, WKU Throughout, let ( X, F, µ) be a measure space ad let (!, F, P ) deote the special case of a probability space. We shall ow begi to study real-valued fuctios defied
More informationBIOS 4110: Introduction to Biostatistics. Breheny. Lab #9
BIOS 4110: Itroductio to Biostatistics Brehey Lab #9 The Cetral Limit Theorem is very importat i the realm of statistics, ad today's lab will explore the applicatio of it i both categorical ad cotiuous
More information1 Models for Matched Pairs
1 Models for Matched Pairs Matched pairs occur whe we aalyse samples such that for each measuremet i oe of the samples there is a measuremet i the other sample that directly relates to the measuremet i
More informationSTA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:
STA 2023 Module 10 Comparig Two Proportios Learig Objectives Upo completig this module, you should be able to: 1. Perform large-sample ifereces (hypothesis test ad cofidece itervals) to compare two populatio
More informationDirection: This test is worth 250 points. You are required to complete this test within 50 minutes.
Term Test October 3, 003 Name Math 56 Studet Number Directio: This test is worth 50 poits. You are required to complete this test withi 50 miutes. I order to receive full credit, aswer each problem completely
More informationLecture 11 and 12: Basic estimation theory
Lecture ad 2: Basic estimatio theory Sprig 202 - EE 94 Networked estimatio ad cotrol Prof. Kha March 2 202 I. MAXIMUM-LIKELIHOOD ESTIMATORS The maximum likelihood priciple is deceptively simple. Louis
More informationSimple Linear Regression
Chapter 2 Simple Liear Regressio 2.1 Simple liear model The simple liear regressio model shows how oe kow depedet variable is determied by a sigle explaatory variable (regressor). Is is writte as: Y i
More information1 Introduction to reducing variance in Monte Carlo simulations
Copyright c 010 by Karl Sigma 1 Itroductio to reducig variace i Mote Carlo simulatios 11 Review of cofidece itervals for estimatig a mea I statistics, we estimate a ukow mea µ = E(X) of a distributio by
More informationClass 23. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700
Class 23 Daiel B. Rowe, Ph.D. Departmet of Mathematics, Statistics, ad Computer Sciece Copyright 2017 by D.B. Rowe 1 Ageda: Recap Chapter 9.1 Lecture Chapter 9.2 Review Exam 6 Problem Solvig Sessio. 2
More informationThis exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam.
Probability ad Statistics FS 07 Secod Sessio Exam 09.0.08 Time Limit: 80 Miutes Name: Studet ID: This exam cotais 9 pages (icludig this cover page) ad 0 questios. A Formulae sheet is provided with the
More informationEfficient GMM LECTURE 12 GMM II
DECEMBER 1 010 LECTURE 1 II Efficiet The estimator depeds o the choice of the weight matrix A. The efficiet estimator is the oe that has the smallest asymptotic variace amog all estimators defied by differet
More informationEconomics 326 Methods of Empirical Research in Economics. Lecture 18: The asymptotic variance of OLS and heteroskedasticity
Ecoomics 326 Methods of Empirical Research i Ecoomics Lecture 8: The asymptotic variace of OLS ad heteroskedasticity Hiro Kasahara Uiversity of British Columbia December 24, 204 Asymptotic ormality I I
More informationApril 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE
April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE TERRY SOO Abstract These otes are adapted from whe I taught Math 526 ad meat to give a quick itroductio to cofidece
More informationChapter 6 Principles of Data Reduction
Chapter 6 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 0 Chapter 6 Priciples of Data Reductio Sectio 6. Itroductio Goal: To summarize or reduce the data X, X,, X to get iformatio about a
More informationAgreement of CI and HT. Lecture 13 - Tests of Proportions. Example - Waiting Times
Sigificace level vs. cofidece level Agreemet of CI ad HT Lecture 13 - Tests of Proportios Sta102 / BME102 Coli Rudel October 15, 2014 Cofidece itervals ad hypothesis tests (almost) always agree, as log
More informationKLMED8004 Medical statistics. Part I, autumn Estimation. We have previously learned: Population and sample. New questions
We have previously leared: KLMED8004 Medical statistics Part I, autum 00 How kow probability distributios (e.g. biomial distributio, ormal distributio) with kow populatio parameters (mea, variace) ca give
More informationStat 421-SP2012 Interval Estimation Section
Stat 41-SP01 Iterval Estimatio Sectio 11.1-11. We ow uderstad (Chapter 10) how to fid poit estimators of a ukow parameter. o However, a poit estimate does ot provide ay iformatio about the ucertaity (possible
More informationInferential Statistics. Inference Process. Inferential Statistics and Probability a Holistic Approach. Inference Process.
Iferetial Statistics ad Probability a Holistic Approach Iferece Process Chapter 8 Poit Estimatio ad Cofidece Itervals This Course Material by Maurice Geraghty is licesed uder a Creative Commos Attributio-ShareAlike
More informationSimulation. Two Rule For Inverting A Distribution Function
Simulatio Two Rule For Ivertig A Distributio Fuctio Rule 1. If F(x) = u is costat o a iterval [x 1, x 2 ), the the uiform value u is mapped oto x 2 through the iversio process. Rule 2. If there is a jump
More informationMa 530 Introduction to Power Series
Ma 530 Itroductio to Power Series Please ote that there is material o power series at Visual Calculus. Some of this material was used as part of the presetatio of the topics that follow. What is a Power
More informationProblem Set 4 Due Oct, 12
EE226: Radom Processes i Systems Lecturer: Jea C. Walrad Problem Set 4 Due Oct, 12 Fall 06 GSI: Assae Gueye This problem set essetially reviews detectio theory ad hypothesis testig ad some basic otios
More informationApplication of Digital Filters
Applicatio of Digital Filters Geerally some filterig of a time series take place as a reslt of the iability of the recordig system to respod to high freqecies. I may cases systems are desiged specifically
More informationIf, for instance, we were required to test whether the population mean μ could be equal to a certain value μ
STATISTICAL INFERENCE INTRODUCTION Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I oesample testig, we essetially
More informationGoodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)
Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................
More informationt distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference
EXST30 Backgroud material Page From the textbook The Statistical Sleuth Mea [0]: I your text the word mea deotes a populatio mea (µ) while the work average deotes a sample average ( ). Variace [0]: The
More informationLarge Sample Theory. Convergence. Central Limit Theorems Asymptotic Distribution Delta Method. Convergence in Probability Convergence in Distribution
Large Sample Theory Covergece Covergece i Probability Covergece i Distributio Cetral Limit Theorems Asymptotic Distributio Delta Method Covergece i Probability A sequece of radom scalars {z } = (z 1,z,
More informationTable 12.1: Contingency table. Feature b. 1 N 11 N 12 N 1b 2 N 21 N 22 N 2b. ... a N a1 N a2 N ab
Sectio 12 Tests of idepedece ad homogeeity I this lecture we will cosider a situatio whe our observatios are classified by two differet features ad we would like to test if these features are idepedet
More informationA statistical method to determine sample size to estimate characteristic value of soil parameters
A statistical method to determie sample size to estimate characteristic value of soil parameters Y. Hojo, B. Setiawa 2 ad M. Suzuki 3 Abstract Sample size is a importat factor to be cosidered i determiig
More informationSection 14. Simple linear regression.
Sectio 14 Simple liear regressio. Let us look at the cigarette dataset from [1] (available to dowload from joural s website) ad []. The cigarette dataset cotais measuremets of tar, icotie, weight ad carbo
More informationInvestigating the Significance of a Correlation Coefficient using Jackknife Estimates
Iteratioal Joural of Scieces: Basic ad Applied Research (IJSBAR) ISSN 2307-4531 (Prit & Olie) http://gssrr.org/idex.php?joural=jouralofbasicadapplied ---------------------------------------------------------------------------------------------------------------------------
More informationLecture 33: Bootstrap
Lecture 33: ootstrap Motivatio To evaluate ad compare differet estimators, we eed cosistet estimators of variaces or asymptotic variaces of estimators. This is also importat for hypothesis testig ad cofidece
More informationOverview. p 2. Chapter 9. Pooled Estimate of. q = 1 p. Notation for Two Proportions. Inferences about Two Proportions. Assumptions
Chapter 9 Slide Ifereces from Two Samples 9- Overview 9- Ifereces about Two Proportios 9- Ifereces about Two Meas: Idepedet Samples 9-4 Ifereces about Matched Pairs 9-5 Comparig Variatio i Two Samples
More informationSTA6938-Logistic Regression Model
Dr. Yig Zhag STA6938-Logistic Regressio Model Topic -Simple (Uivariate) Logistic Regressio Model Outlies:. Itroductio. A Example-Does the liear regressio model always work? 3. Maximum Likelihood Curve
More informationStatistical and Mathematical Methods DS-GA 1002 December 8, Sample Final Problems Solutions
Statistical ad Mathematical Methods DS-GA 00 December 8, 05. Short questios Sample Fial Problems Solutios a. Ax b has a solutio if b is i the rage of A. The dimesio of the rage of A is because A has liearly-idepedet
More information6 Sample Size Calculations
6 Sample Size Calculatios Oe of the major resposibilities of a cliical trial statisticia is to aid the ivestigators i determiig the sample size required to coduct a study The most commo procedure for determiig
More informationIt should be unbiased, or approximately unbiased. Variance of the variance estimator should be small. That is, the variance estimator is stable.
Chapter 10 Variace Estimatio 10.1 Itroductio Variace estimatio is a importat practical problem i survey samplig. Variace estimates are used i two purposes. Oe is the aalytic purpose such as costructig
More information7.1 Convergence of sequences of random variables
Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite
More informationLecture 7: Non-parametric Comparison of Location. GENOME 560, Spring 2016 Doug Fowler, GS
Lecture 7: No-parametric Compariso of Locatio GENOME 560, Sprig 2016 Doug Fowler, GS (dfowler@uw.edu) 1 Review How ca we set a cofidece iterval o a proportio? 2 Review How ca we set a cofidece iterval
More informationPOWER COMPARISON OF EMPIRICAL LIKELIHOOD RATIO TESTS: SMALL SAMPLE PROPERTIES THROUGH MONTE CARLO STUDIES*
Kobe Uiversity Ecoomic Review 50(2004) 3 POWER COMPARISON OF EMPIRICAL LIKELIHOOD RATIO TESTS: SMALL SAMPLE PROPERTIES THROUGH MONTE CARLO STUDIES* By HISASHI TANIZAKI There are various kids of oparametric
More informationSlide Set 13 Linear Model with Endogenous Regressors and the GMM estimator
Slide Set 13 Liear Model with Edogeous Regressors ad the GMM estimator Pietro Coretto pcoretto@uisa.it Ecoometrics Master i Ecoomics ad Fiace (MEF) Uiversità degli Studi di Napoli Federico II Versio: Friday
More informationAssignment 2 Solutions SOLUTION. ϕ 1 Â = 3 ϕ 1 4i ϕ 2. The other case can be dealt with in a similar way. { ϕ 2 Â} χ = { 4i ϕ 1 3 ϕ 2 } χ.
PHYSICS 34 QUANTUM PHYSICS II (25) Assigmet 2 Solutios 1. With respect to a pair of orthoormal vectors ϕ 1 ad ϕ 2 that spa the Hilbert space H of a certai system, the operator  is defied by its actio
More information10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random
Part III. Areal Data Aalysis 0. Comparative Tests amog Spatial Regressio Models While the otio of relative likelihood values for differet models is somewhat difficult to iterpret directly (as metioed above),
More informationPower and Type II Error
Statistical Methods I (EXST 7005) Page 57 Power ad Type II Error Sice we do't actually kow the value of the true mea (or we would't be hypothesizig somethig else), we caot kow i practice the type II error
More informationControl Charts. Introduction. Purpose and benefit: UCL (upper control limit) UWL (upper warning limit) Quality feature
Qality featre Cotrol Charts Itrodctio A Cotrol Chart shows the time corse of a process characteristic. For this prpose, data ca be take cotiosly or i periodic samples. The prereqisite is that the process
More information