A Distributional Approach Using Propensity Scores
- Roxanne Malone
- 6 years ago
Transcription
1 A Distributional Approach Using Propensity Scores
Zhiqiang Tan
Department of Biostatistics, Johns Hopkins School of Public Health
ztan
June 20, 2005
2 Outline
- Introduction
- Counterfactual framework
- Illustration
- Application
- No-confounding case: known propensity score, parametric propensity score
- Confounding case
3 Introduction
Right heart catheterization (RHC) has been performed daily in hospitals since the 1970s. The benefit of RHC had NOT been demonstrated in a successful randomized clinical trial. The observational study of Connors et al. (1996) raised the concern that RHC might not benefit critically ill patients and might in fact cause harm.
Data were collected on 5735 critically ill patients admitted to the ICUs of five medical centers:
- Treatment: No-RHC or RHC
- Outcome: 30-day survival
- Covariates: 75 covariates
HOW to evaluate the effect of RHC on survival?
4 Counterfactual framework
- $X$: covariates measured
- $T$: treatment variable, taking value 0 or 1 if a patient actually receives No-RHC or RHC
- $(Y_0, Y_1)$: potential outcomes that would be observed if a patient received No-RHC or RHC
- $Y = (1 - T)\,Y_0 + T\,Y_1$: observed outcome
We are interested in the average causal effect
$$E(Y_1 - Y_0) = E(Y_1) - E(Y_0),$$
or in $P(\{Y_1\})$ versus $P(\{Y_0\})$.
Assignment mechanism:
- No confounding: $T \perp (Y_0, Y_1) \mid X$
- Confounding: $T \not\perp (Y_0, Y_1) \mid X$
Propensity score: $\pi(X) = P(T = 1 \mid X)$
5 [Figure: Thirty-day survival curves (RHC, raw; No RHC, raw); x-axis: day, y-axis: proportion surviving. Raw histograms of aps, meanbp, and pafi for RHC and No RHC.]
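6 Illustration
Each cell shows (survived, died) and the cell total.

        | RHC = 1       | RHC = 0
BP = 1  | (52, 28)  80  | (11, 9)   20
BP = 0  | (30, 10)  40  | (37, 23)  60
Total   | (82, 38) 120  | (48, 32)  80

Patients get RHC at random:
$$P(\text{survival} \mid RHC = 1) \approx 82/120 = 68.3\%, \qquad P(\text{survival} \mid RHC = 0) \approx 48/80 = 60.0\%.$$
Patients get RHC at random given blood pressure: weight each patient such that
$$80\,w_1(1) = \tfrac12, \quad 40\,w_1(0) = \tfrac12, \quad 20\,w_0(1) = \tfrac12, \quad 60\,w_0(0) = \tfrac12.$$
Compare the weighted probabilities
$$52\,w_1(1) + 30\,w_1(0) = 70.0\%, \qquad 11\,w_0(1) + 37\,w_0(0) = 58.3\%.$$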
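The weighting arithmetic on this slide can be reproduced in a few lines. The following is a minimal sketch; the dictionary layout and the names `cells`, `raw_survival`, and `weighted_survival` are illustrative assumptions, not part of the original slides.

```python
# Counts from the illustration table: (survived, died) per (BP, RHC) cell.
cells = {
    (1, 1): (52, 28),  # BP=1, RHC=1
    (1, 0): (11, 9),   # BP=1, RHC=0
    (0, 1): (30, 10),  # BP=0, RHC=1
    (0, 0): (37, 23),  # BP=0, RHC=0
}


def raw_survival(rhc):
    """Unweighted survival proportion within one treatment arm."""
    survived = sum(s for (bp, t), (s, d) in cells.items() if t == rhc)
    total = sum(s + d for (bp, t), (s, d) in cells.items() if t == rhc)
    return survived / total


def weighted_survival(rhc):
    """Survival proportion after weighting each BP stratum to its population share."""
    n_all = sum(s + d for s, d in cells.values())
    result = 0.0
    for bp in (0, 1):
        s, d = cells[(bp, rhc)]
        # Share of all patients at this BP level (1/2 for both levels here).
        share = sum(sum(cells[(bp, t)]) for t in (0, 1)) / n_all
        weight = share / (s + d)  # per-patient weight w_rhc(bp)
        result += weight * s
    return result


print(f"raw:      RHC={raw_survival(1):.3f}, No-RHC={raw_survival(0):.3f}")
print(f"weighted: RHC={weighted_survival(1):.3f}, No-RHC={weighted_survival(0):.3f}")
```

The per-patient weight is the stratum's population share divided by the cell size, which is what the conditions such as $80\,w_1(1) = 1/2$ encode.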
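7 WHAT IF patients are NOT equally likely to get RHC at each level of blood pressure?
Previous estimates:
$$P(\text{obs survival} \mid BP = \cdot, RHC = 1) = 70.0\%, \qquad P(\text{obs survival} \mid BP = \cdot, RHC = 0) = 58.3\%.$$
Index the patients so that $i = 1, \ldots, 80$ are RHC with BP = 1, $i = 81, \ldots, 120$ are RHC with BP = 0, $i = 121, \ldots, 140$ are No-RHC with BP = 1, and $i = 141, \ldots, 200$ are No-RHC with BP = 0. Weight each patient such that
$$\sum_{i=1}^{80} \lambda_{1i}\, w_1(1) = \tfrac12, \quad \sum_{i=81}^{120} \lambda_{1i}\, w_1(0) = \tfrac12, \quad \sum_{i=121}^{140} \lambda_{0i}\, w_0(1) = \tfrac12, \quad \sum_{i=141}^{200} \lambda_{0i}\, w_0(0) = \tfrac12,$$
where $\Lambda^{-1} \le \lambda_{1i}, \lambda_{0i} \le \Lambda$ ($\Lambda = 1.5$).
Bound the weighted probabilities
$$\sum_{i=1}^{120} \lambda_{1i}\, w_1(X_i)\, Y_{1i}, \qquad \sum_{i=121}^{200} \lambda_{0i}\, w_0(X_i)\, Y_{0i},$$
subject to the foregoing constraints:
$$P(\text{unobs survival} \mid BP = \cdot, RHC = 1) \le 72.2\%, \qquad P(\text{unobs survival} \mid BP = \cdot, RHC = 0) \ge 55.0\%.$$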
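With a binary outcome and constraints that separate by blood-pressure stratum, the bounding problem on this slide has a closed-form solution, so no linear-programming solver is needed for the toy table. The sketch below is an illustration under that assumption; the function names and the `(survived, died, share)` tuples are hypothetical.

```python
def stratum_extremes(survived, died, big_lambda):
    """Min and max of sum(lambda_i * Y_i) / n within one stratum, where each
    lambda_i lies in [1/big_lambda, big_lambda] and the lambdas sum to n."""
    n = survived + died
    lo, hi = 1.0 / big_lambda, big_lambda
    # Max: survivors carry as much weight as allowed, non-survivors as little.
    mass_max = min(survived * hi, n - died * lo)
    # Min: survivors carry as little weight as allowed, non-survivors as much.
    mass_min = max(survived * lo, n - died * hi)
    return mass_min / n, mass_max / n


def arm_bounds(strata, big_lambda):
    """Combine per-stratum extremes using each stratum's population share."""
    lower = upper = 0.0
    for survived, died, share in strata:
        s_min, s_max = stratum_extremes(survived, died, big_lambda)
        lower += share * s_min
        upper += share * s_max
    return lower, upper


# (survived, died, population share of the BP stratum) from the illustration table.
treated = [(52, 28, 0.5), (30, 10, 0.5)]    # RHC = 1
untreated = [(11, 9, 0.5), (37, 23, 0.5)]   # RHC = 0

t_lo, t_hi = arm_bounds(treated, 1.5)
u_lo, u_hi = arm_bounds(untreated, 1.5)
print(f"lambda-weighted RHC arm:    [{t_lo:.3f}, {t_hi:.3f}]")
print(f"lambda-weighted No-RHC arm: [{u_lo:.3f}, {u_hi:.3f}]")
```

Under these assumptions, the upper extreme of the reweighted No-RHC arm and the lower extreme of the reweighted RHC arm come out at about 72.2% and 55.0%, the two figures quoted on the slide.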
8 [Figure: Thirty-day survival curves (RHC, raw; No RHC, raw); x-axis: day, y-axis: proportion surviving. Raw histograms of aps, meanbp, and pafi for RHC and No RHC.]
9 [Figure: Thirty-day survival curves (RHC and No RHC, raw and weighted); x-axis: day, y-axis: proportion surviving. Raw and weighted histograms of aps, meanbp, and pafi for RHC and No RHC.]
10 [Figure: Thirty-day survival curves (RHC observed, RHC counterfactual, No RHC counterfactual, No RHC observed); x-axis: day, y-axis: proportion surviving. Weighted histograms of aps, meanbp, and pafi: No RHC counterfactual versus No RHC observed, and RHC observed versus RHC counterfactual.]
11 No-confounding case
Data: $(X_i, Y_{T_i}, T_i)$, $i = 1, 2, \ldots, n$
Likelihood:
$$L_1 \times L_2 = \prod_{i=1}^{n} \left[ (1 - \pi(X_i))^{1 - T_i}\, \pi(X_i)^{T_i} \right] \times \prod_{i=1}^{n} \left[ G_0(\{X_i, Y_{0i}\})^{1 - T_i}\, G_1(\{X_i, Y_{1i}\})^{T_i} \right],$$
where $G_0$ is the joint distribution of $(X, Y_0)$ and $G_1$ is the joint distribution of $(X, Y_1)$.
$G_0$ and $G_1$ induce the same marginal distribution on the covariate space $\mathcal{X}$. Equivalently,
$$\int h(x)\, dG_0(x, y_0) = \int h(x)\, dG_1(x, y_1)$$
for each bounded function $h$ on $\mathcal{X}$.
Take finitely many constraints and find the MLE $(\hat G_0, \hat G_1)$:
$$\hat\mu_1 = \int y_1\, d\hat G_1(x, y_1), \qquad \hat\mu_0 = \int y_0\, d\hat G_0(x, y_0).$$
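12 Known propensity score [Model S0: known $\pi^*$]
Maximize the likelihood subject to the constraints
$$\int \pi^*(x)\, dG_0 = \int \pi^*(x)\, dG_1, \qquad \int h^*_j(x)\, dG_0 = \int h^*_j(x)\, dG_1, \quad j = 1, \ldots, m.$$
Let $h^* = (\pi^*, 1 - \pi^*, h^*_1, \ldots, h^*_m)^\top$, with the treated patients indexed $i = 1, \ldots, n_1$ and the untreated $i = n_1 + 1, \ldots, n$. Maximize
$$\frac{1}{n} \sum_{i=1}^{n_1} \log\left(\lambda^\top h^*(X_i)\right) + \frac{1}{n} \sum_{i=n_1+1}^{n} \log\left(1 - \lambda^\top h^*(X_i)\right).$$
Then
$$\hat G_1\{(X_i, Y_{1i})\} = \frac{n^{-1}}{\hat\lambda^\top h^*(X_i)}, \quad i = 1, \ldots, n_1, \qquad \hat G_0\{(X_i, Y_{0i})\} = \frac{n^{-1}}{1 - \hat\lambda^\top h^*(X_i)}, \quad i = n_1 + 1, \ldots, n.$$
First-order approximation:
$$\tilde\mu_1 = \frac{1}{n} \sum_{i=1}^{n} \frac{Y_{1i} T_i}{\pi^*(X_i)} - \tilde\beta_1^\top\, \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{h^*(X_i)}{1 - \pi^*(X_i)} \left( \frac{T_i}{\pi^*(X_i)} - 1 \right) \right],$$
$$\tilde\mu_0 = \frac{1}{n} \sum_{i=1}^{n} \frac{Y_{0i} (1 - T_i)}{1 - \pi^*(X_i)} - \tilde\beta_0^\top\, \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{h^*(X_i)}{\pi^*(X_i)} \left( \frac{1 - T_i}{1 - \pi^*(X_i)} - 1 \right) \right],$$
where $\tilde\beta_1 = \tilde B^{-1} \tilde C_1$ and $\tilde\beta_0 = \tilde B^{-1} \tilde C_0$.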
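13 The method of control variates:
$$\frac{1}{n} \sum_{i=1}^{n} \frac{Y_{1i} T_i}{\pi^*(X_i)} - b_1^\top\, \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{h^*(X_i)}{1 - \pi^*(X_i)} \left( \frac{T_i}{\pi^*(X_i)} - 1 \right) \right].$$
The optimal choice of $b_1$ is $\beta_1 = B^{-1} C_1$.
A more general class of estimators:
$$\frac{1}{n} \sum_{i=1}^{n} \frac{Y_{1i} T_i}{\pi^*(X_i)} - \frac{1}{n} \sum_{i=1}^{n} \left( \frac{T_i}{\pi^*(X_i)} - 1 \right) \phi_1(X_i).$$
The optimal choice of $\phi_1(x)$ is $E(Y_1 \mid X = x)$; with this choice the estimator achieves semiparametric efficiency under S0.
Choose $h^*$ such that
- $E(Y_1 \mid X = x)$ is contained in the linear span of $h^*(x) / (1 - \pi^*(x))$,
- $E(Y_0 \mid X = x)$ is contained in the linear span of $h^*(x) / \pi^*(x)$.
Outcome regression [Model R]:
$$E(Y_1 \mid X) = \Psi\left(\alpha_1^\top g_1(X)\right), \qquad E(Y_0 \mid X) = \Psi\left(\alpha_0^\top g_0(X)\right).$$
Choose
$$h^* = \left( \pi^*, \; 1 - \pi^*, \; \pi^*\, \Psi(\hat\alpha_0^\top g_0), \; (1 - \pi^*)\, \Psi(\hat\alpha_1^\top g_1) \right)^\top.$$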
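The general class of estimators above can be written directly as code. This is a minimal sketch for the treated-arm mean with a known propensity score; the function names and the toy data are assumptions made for illustration, and `phi` stands for whatever working guess of $E(Y_1 \mid X)$ is supplied.

```python
def ipw_mean(y, t, pi):
    """Inverse-probability-weighted estimate of E(Y_1): mean of Y * T / pi."""
    n = len(y)
    return sum(yi * ti / pi_i for yi, ti, pi_i in zip(y, t, pi)) / n


def augmented_mean(y, t, pi, phi):
    """IPW estimate minus the control-variate term mean((T / pi - 1) * phi(X))."""
    n = len(y)
    return sum(
        yi * ti / pi_i - (ti / pi_i - 1.0) * phi_i
        for yi, ti, pi_i, phi_i in zip(y, t, pi, phi)
    ) / n


# Toy data (hypothetical): known propensity score 0.5, phi(x) = 1 + x.
x = [0.0, 1.0, 2.0, 3.0]
t = [1, 0, 1, 0]
pi = [0.5] * 4
phi = [1.0 + xi for xi in x]
# Outcomes are only used where T = 1; untreated entries are placeholders.
y = [phi_i if ti == 1 else 0.0 for phi_i, ti in zip(phi, t)]

print(ipw_mean(y, t, pi))             # plain IPW
print(augmented_mean(y, t, pi, phi))  # with control variate
```

In this toy case the outcome equals `phi` exactly for every treated patient, so the augmented estimate collapses to the sample mean of `phi`, while plain IPW still carries the noise from the treatment draw. That is the variance-reduction effect the control variate is meant to deliver.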
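14 Parametric propensity score [Model S: $\pi(\cdot\,; \gamma)$]
Maximize the likelihood subject to the constraints
$$\int \hat\pi(x)\, dG_1 = \int \hat\pi(x)\, dG_0, \qquad \int \hat h_j(x)\, dG_1 = \int \hat h_j(x)\, dG_0, \quad j = 1, \ldots, m.$$
Let $\hat h = (\hat\pi, 1 - \hat\pi, \hat h_1, \ldots, \hat h_m)^\top$, with the treated patients indexed $i = 1, \ldots, n_1$ and the untreated $i = n_1 + 1, \ldots, n$. Maximize
$$\frac{1}{n} \sum_{i=1}^{n_1} \log\left(\lambda^\top \hat h(X_i)\right) + \frac{1}{n} \sum_{i=n_1+1}^{n} \log\left(1 - \lambda^\top \hat h(X_i)\right).$$
Then
$$\hat G_1\{(X_i, Y_{1i})\} = \frac{n^{-1}}{\hat\lambda^\top \hat h(X_i)}, \quad i = 1, \ldots, n_1, \qquad \hat G_0\{(X_i, Y_{0i})\} = \frac{n^{-1}}{1 - \hat\lambda^\top \hat h(X_i)}, \quad i = n_1 + 1, \ldots, n.$$
First-order approximation:
$$\tilde\mu_1 = \frac{1}{n} \sum_{i=1}^{n} \frac{Y_{1i} T_i}{\hat\pi(X_i)} - \tilde\beta_1^\top\, \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{\hat h(X_i)}{1 - \hat\pi(X_i)} \left( \frac{T_i}{\hat\pi(X_i)} - 1 \right) \right],$$
$$\tilde\mu_0 = \frac{1}{n} \sum_{i=1}^{n} \frac{Y_{0i} (1 - T_i)}{1 - \hat\pi(X_i)} - \tilde\beta_0^\top\, \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{\hat h(X_i)}{\hat\pi(X_i)} \left( \frac{1 - T_i}{1 - \hat\pi(X_i)} - 1 \right) \right],$$
where $\tilde\beta_1 = \tilde B^{-1} \tilde C_1$ and $\tilde\beta_0 = \tilde B^{-1} \tilde C_0$.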
15 Our strategy is
- to build and check propensity score models to ensure consistency;
- to use outcome regression models for variance and bias reduction.
Propensity score models can be checked with the following idea:
- Pick a collection of test functions $\hat h_j$ on $\mathcal{X}$, for example $(\hat\pi, 1 - \hat\pi, \hat\pi X, (1 - \hat\pi) X)$.
- Compute the sample average
$$\tilde E\left[ \hat h_j(X) \left( \frac{T}{\hat\pi(X)} - \frac{1 - T}{1 - \hat\pi(X)} \right) \right],$$
i.e. the average difference in $\hat h_j(X)$ between the treated and the controls after propensity score weighting.
- If model S is correct, then the sample averages relative to their standard errors (the z-ratios) should not differ significantly from zero.
Examination of the z-ratios against the standard normal distribution can reveal possible misspecification of model S.
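The balance check described above can be sketched as follows. The function name is hypothetical, and the standard error used here is the naive one that treats the fitted propensity scores as fixed; a fuller analysis would account for the estimation of $\gamma$, which this sketch does not attempt.

```python
import math
import statistics


def balance_z_ratio(h, t, pi_hat):
    """z-ratio of the weighted difference in h(X) between treated and controls.

    Each term is h(X_i) * (T_i / pi_hat_i - (1 - T_i) / (1 - pi_hat_i)).
    The standard error is the naive sample SD / sqrt(n) (an assumption).
    """
    terms = [
        hi * (ti / p - (1 - ti) / (1 - p))
        for hi, ti, p in zip(h, t, pi_hat)
    ]
    mean = statistics.mean(terms)
    se = statistics.stdev(terms) / math.sqrt(len(terms))
    return mean / se


# Toy example (hypothetical numbers): constant test function, pi_hat = 0.5.
h = [1.0, 1.0, 1.0, 1.0]
t = [1, 1, 1, 0]
pi_hat = [0.5, 0.5, 0.5, 0.5]
print(balance_z_ratio(h, t, pi_hat))
```

In practice one z-ratio would be computed per test function, and values far outside the range expected from a standard normal would flag the propensity score model for revision.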
16 [Figure: two panels plotting z-ratio (y-axis) against model (x-axis).]
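17 Confounding case
Data: $(X_i, Y_{T_i}, T_i)$, $i = 1, 2, \ldots, n$
Likelihood:
$$L_1 \times L_2 = \prod_{i=1}^{n} \left[ (1 - \pi(X_i))^{1 - T_i}\, \pi(X_i)^{T_i} \right] \times \prod_{i=1}^{n} \left[ H_0(\{X_i, Y_{0i}\})^{1 - T_i}\, H_1(\{X_i, Y_{1i}\})^{T_i} \right],$$
where $H_0$ is the distribution $P(\{Y_0\} \mid T = 0, X)\, P(\{X\})$ and $H_1$ is the distribution $P(\{Y_1\} \mid T = 1, X)\, P(\{X\})$.
$H_0$ and $H_1$ induce the same marginal distribution on the covariate space $\mathcal{X}$. Equivalently,
$$\int h(x)\, dH_0(x, y_0) = \int h(x)\, dH_1(x, y_1)$$
for each bounded function $h$ on $\mathcal{X}$.
Convergence of the previous estimates:
$$(\hat G_0, \hat G_1) \to (H_0, H_1),$$
$$\hat\mu_1, \tilde\mu_1 \to E[E(Y_1 \mid T = 1, X)], \qquad \hat\mu_0, \tilde\mu_0 \to E[E(Y_0 \mid T = 0, X)].$$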
18 Unmeasured confounding: gaps between
- $P(\{Y_0\} \mid T = 0, X)$ and $P(\{Y_0\} \mid T = 1, X)$,
- $P(\{Y_1\} \mid T = 0, X)$ and $P(\{Y_1\} \mid T = 1, X)$,
i.e. systematic differences between the treated and the untreated even if they received the same treatment.
Define the Radon-Nikodym derivatives:
$$\lambda_0(Y_0; X) = \frac{P(dy_0 \mid T = 1, X)}{P(dy_0 \mid T = 0, X)}, \qquad \lambda_1(Y_1; X) = \frac{P(dy_1 \mid T = 0, X)}{P(dy_1 \mid T = 1, X)}.$$
The case $\lambda_0 = \lambda_1 \equiv 1$ corresponds to no confounding, while deviations of $\lambda_0$ and $\lambda_1$ from 1 indicate unmeasured confounding.
By Bayes' rule, $\lambda_0$ and $\lambda_1$ can be seen as odds ratios:
$$\lambda_0(Y_0; X) = \frac{1 - \pi(X)}{\pi(X)} \cdot \frac{P(T = 1 \mid Y_0, X)}{P(T = 0 \mid Y_0, X)}, \qquad \lambda_1(Y_1; X) = \frac{\pi(X)}{1 - \pi(X)} \cdot \frac{P(T = 0 \mid Y_1, X)}{P(T = 1 \mid Y_1, X)}.$$
A sensitivity analysis model:
$$\Lambda^{-1} \le \lambda_0(Y_0; X),\ \lambda_1(Y_1; X) \le \Lambda,$$
where $\Lambda \ge 1$ indicates the degree of departure from no confounding.
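The equivalence between the density-ratio and odds-ratio forms of $\lambda_0$ can be checked numerically in a single covariate stratum with a binary outcome. The probabilities below are made-up values chosen only for illustration.

```python
import math

# Hypothetical values within one covariate stratum X = x.
pi = 0.6       # propensity score P(T = 1 | X)
p_y_t1 = 0.7   # P(Y_0 = 1 | T = 1, X)
p_y_t0 = 0.5   # P(Y_0 = 1 | T = 0, X)

# Density-ratio form of lambda_0 evaluated at Y_0 = 1.
lam0_density = p_y_t1 / p_y_t0

# Odds-ratio form via Bayes' rule.
p_y = pi * p_y_t1 + (1 - pi) * p_y_t0          # P(Y_0 = 1 | X)
p_t1_given_y = pi * p_y_t1 / p_y               # P(T = 1 | Y_0 = 1, X)
p_t0_given_y = (1 - pi) * p_y_t0 / p_y         # P(T = 0 | Y_0 = 1, X)
lam0_odds = (1 - pi) / pi * p_t1_given_y / p_t0_given_y

print(lam0_density, lam0_odds)
assert math.isclose(lam0_density, lam0_odds)
```

Both forms give the same number, so a bound $\Lambda$ on $\lambda_0$ can be read as a bound on how much the odds of treatment may shift with the unobserved potential outcome, relative to the odds implied by the propensity score.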
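19 Let $\hat h_c = (\hat\pi, 1 - \hat\pi, \hat h_1, \ldots, \hat h_{m_c})^\top$.
For a value of $\Lambda$, find bounds for $\int y_t\, \lambda_t\, dH_t$ by linear programming:
$$\min \text{ or } \max \int y_t\, \lambda_t\, d\hat G_t$$
subject to
$$\int \lambda_t\, d\hat G_t = 1, \qquad \int \hat\pi(x)\, \lambda_t\, d\hat G_t = \int \hat\pi(x)\, d\hat G_t, \qquad \int \hat h_j(x)\, \lambda_t\, d\hat G_t = \int \hat h_j(x)\, d\hat G_t, \quad j = 1, \ldots, m_c,$$
and $1/\Lambda \le \lambda_t \le \Lambda$.
$\hat G_1$ is supported on $\{(X_i, Y_{1i})\}_{i = 1, \ldots, n_1}$ and $\hat G_0$ on $\{(X_i, Y_{0i})\}_{i = n_1 + 1, \ldots, n}$, so each integral is a finite sum.
The unknowns are the values of $\lambda_t$ on the observed data:
$$\lambda_{1i} = \lambda_1(Y_{1i}; X_i), \quad i = 1, \ldots, n_1, \qquad \lambda_{0i} = \lambda_0(Y_{0i}; X_i), \quad i = n_1 + 1, \ldots, n.$$
Comparisons of the distributions
$$\hat G_0 \sim [Y_0 \mid T = 0, X][X], \qquad \hat G_1 \sim [Y_1 \mid T = 1, X][X],$$
$$\lambda_0\, d\hat G_0 \sim [Y_0 \mid T = 1, X][X], \qquad \lambda_1\, d\hat G_1 \sim [Y_1 \mid T = 0, X][X]$$
indicate (i) balance on covariates, (ii) hidden bias, and (iii) causal effects.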
REGRESSION WITH QUADRATIC LOSS
REGRESSION WITH QUADRATIC LOSS MAXIM RAGINSKY Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X, Y ), where, as before, X is a R d
More informationRegression with quadratic loss
Regressio with quadratic loss Maxim Ragisky October 13, 2015 Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X,Y, where, as before,
More informationExpectation and Variance of a random variable
Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio
More information1 Models for Matched Pairs
1 Models for Matched Pairs Matched pairs occur whe we aalyse samples such that for each measuremet i oe of the samples there is a measuremet i the other sample that directly relates to the measuremet i
More informationBIOSTATISTICS. Lecture 5 Interval Estimations for Mean and Proportion. dr. Petr Nazarov
Microarray Ceter BIOSTATISTICS Lecture 5 Iterval Estimatios for Mea ad Proportio dr. Petr Nazarov 15-03-013 petr.azarov@crp-sate.lu Lecture 5. Iterval estimatio for mea ad proportio OUTLINE Iterval estimatios
More informationLecture 7: Properties of Random Samples
Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ
More informationECE 901 Lecture 12: Complexity Regularization and the Squared Loss
ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationSampling Error. Chapter 6 Student Lecture Notes 6-1. Business Statistics: A Decision-Making Approach, 6e. Chapter Goals
Chapter 6 Studet Lecture Notes 6-1 Busiess Statistics: A Decisio-Makig Approach 6 th Editio Chapter 6 Itroductio to Samplig Distributios Chap 6-1 Chapter Goals After completig this chapter, you should
More information6 Sample Size Calculations
6 Sample Size Calculatios Oe of the major resposibilities of a cliical trial statisticia is to aid the ivestigators i determiig the sample size required to coduct a study The most commo procedure for determiig
More informationJournal of Multivariate Analysis. Superefficient estimation of the marginals by exploiting knowledge on the copula
Joural of Multivariate Aalysis 102 (2011) 1315 1319 Cotets lists available at ScieceDirect Joural of Multivariate Aalysis joural homepage: www.elsevier.com/locate/jmva Superefficiet estimatio of the margials
More informationChapter 6 Principles of Data Reduction
Chapter 6 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 0 Chapter 6 Priciples of Data Reductio Sectio 6. Itroductio Goal: To summarize or reduce the data X, X,, X to get iformatio about a
More informationThe variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2.
SAMPLE STATISTICS A radom sample x 1,x,,x from a distributio f(x) is a set of idepedetly ad idetically variables with x i f(x) for all i Their joit pdf is f(x 1,x,,x )=f(x 1 )f(x ) f(x )= f(x i ) The sample
More informationt distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference
EXST30 Backgroud material Page From the textbook The Statistical Sleuth Mea [0]: I your text the word mea deotes a populatio mea (µ) while the work average deotes a sample average ( ). Variace [0]: The
More informationRegression with an Evaporating Logarithmic Trend
Regressio with a Evaporatig Logarithmic Tred Peter C. B. Phillips Cowles Foudatio, Yale Uiversity, Uiversity of Aucklad & Uiversity of York ad Yixiao Su Departmet of Ecoomics Yale Uiversity October 5,
More informationSTA6938-Logistic Regression Model
Dr. Yig Zhag STA6938-Logistic Regressio Model Topic -Simple (Uivariate) Logistic Regressio Model Outlies:. Itroductio. A Example-Does the liear regressio model always work? 3. Maximum Likelihood Curve
More informationParameter, Statistic and Random Samples
Parameter, Statistic ad Radom Samples A parameter is a umber that describes the populatio. It is a fixed umber, but i practice we do ot kow its value. A statistic is a fuctio of the sample data, i.e.,
More informationECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors
ECONOMETRIC THEORY MODULE XIII Lecture - 34 Asymptotic Theory ad Stochastic Regressors Dr. Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Asymptotic theory The asymptotic
More informationStatistics Lecture 27. Final review. Administrative Notes. Outline. Experiments. Sampling and Surveys. Administrative Notes
Admiistrative Notes s - Lecture 7 Fial review Fial Exam is Tuesday, May 0th (3-5pm Covers Chapters -8 ad 0 i textbook Brig ID cards to fial! Allowed: Calculators, double-sided 8.5 x cheat sheet Exam Rooms:
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationSample Size Determination (Two or More Samples)
Sample Sie Determiatio (Two or More Samples) STATGRAPHICS Rev. 963 Summary... Data Iput... Aalysis Summary... 5 Power Curve... 5 Calculatios... 6 Summary This procedure determies a suitable sample sie
More informationLinear Regression Models
Liear Regressio Models Dr. Joh Mellor-Crummey Departmet of Computer Sciece Rice Uiversity johmc@cs.rice.edu COMP 528 Lecture 9 15 February 2005 Goals for Today Uderstad how to Use scatter diagrams to ispect
More informationHigh-Dimensional M-Estimation with Missing Outcomes: A Semi-Parametric Framework
High-Dimesioal M-Estimatio with Missig Outcomes: A Semi-Parametric Framework (A Overview of the Methods ad the Mai Results) Abhishek Chakrabortty Uiversity of Pesylvaia Harvard Visit. August 20-23, 2018.
More informationSingular Continuous Measures by Michael Pejic 5/14/10
Sigular Cotiuous Measures by Michael Peic 5/4/0 Prelimiaries Give a set X, a σ-algebra o X is a collectio of subsets of X that cotais X ad ad is closed uder complemetatio ad coutable uios hece, coutable
More informationExponential Families and Bayesian Inference
Computer Visio Expoetial Families ad Bayesia Iferece Lecture Expoetial Families A expoetial family of distributios is a d-parameter family f(x; havig the followig form: f(x; = h(xe g(t T (x B(, (. where
More informationQuick Review of Probability
Quick Review of Probability Berli Che Departmet of Computer Sciece & Iformatio Egieerig Natioal Taiwa Normal Uiversity Refereces: 1. W. Navidi. Statistics for Egieerig ad Scietists. Chapter 2 & Teachig
More informationAgenda: Recap. Lecture. Chapter 12. Homework. Chapt 12 #1, 2, 3 SAS Problems 3 & 4 by hand. Marquette University MATH 4740/MSCS 5740
Ageda: Recap. Lecture. Chapter Homework. Chapt #,, 3 SAS Problems 3 & 4 by had. Copyright 06 by D.B. Rowe Recap. 6: Statistical Iferece: Procedures for μ -μ 6. Statistical Iferece Cocerig μ -μ Recall yes
More informationEmpirical Process Theory and Oracle Inequalities
Stat 928: Statistical Learig Theory Lecture: 10 Empirical Process Theory ad Oracle Iequalities Istructor: Sham Kakade 1 Risk vs Risk See Lecture 0 for a discussio o termiology. 2 The Uio Boud / Boferoi
More information(6) Fundamental Sampling Distribution and Data Discription
34 Stat Lecture Notes (6) Fudametal Samplig Distributio ad Data Discriptio ( Book*: Chapter 8,pg5) Probability& Statistics for Egieers & Scietists By Walpole, Myers, Myers, Ye 8.1 Radom Samplig: Populatio:
More informationQuick Review of Probability
Quick Review of Probability Berli Che Departmet of Computer Sciece & Iformatio Egieerig Natioal Taiwa Normal Uiversity Refereces: 1. W. Navidi. Statistics for Egieerig ad Scietists. Chapter & Teachig Material.
More informationMathematics 170B Selected HW Solutions.
Mathematics 17B Selected HW Solutios. F 4. Suppose X is B(,p). (a)fidthemometgeeratigfuctiom (s)of(x p)/ p(1 p). Write q = 1 p. The MGF of X is (pe s + q), sice X ca be writte as the sum of idepedet Beroulli
More informationData Analysis and Statistical Methods Statistics 651
Data Aalysis ad Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasii/teachig.html Suhasii Subba Rao Review of testig: Example The admistrator of a ursig home wats to do a time ad motio
More informationBinomial Distribution
0.0 0.5 1.0 1.5 2.0 2.5 3.0 0 1 2 3 4 5 6 7 0.0 0.5 1.0 1.5 2.0 2.5 3.0 Overview Example: coi tossed three times Defiitio Formula Recall that a r.v. is discrete if there are either a fiite umber of possible
More informationEECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1
EECS564 Estimatio, Filterig, ad Detectio Hwk 2 Sols. Witer 25 4. Let Z be a sigle observatio havig desity fuctio where. p (z) = (2z + ), z (a) Assumig that is a oradom parameter, fid ad plot the maximum
More informationSection 14. Simple linear regression.
Sectio 14 Simple liear regressio. Let us look at the cigarette dataset from [1] (available to dowload from joural s website) ad []. The cigarette dataset cotais measuremets of tar, icotie, weight ad carbo
More informationSDS 321: Introduction to Probability and Statistics
SDS 321: Itroductio to Probability ad Statistics Lecture 23: Cotiuous radom variables- Iequalities, CLT Puramrita Sarkar Departmet of Statistics ad Data Sciece The Uiversity of Texas at Austi www.cs.cmu.edu/
More informationStat 421-SP2012 Interval Estimation Section
Stat 41-SP01 Iterval Estimatio Sectio 11.1-11. We ow uderstad (Chapter 10) how to fid poit estimators of a ukow parameter. o However, a poit estimate does ot provide ay iformatio about the ucertaity (possible
More informationComparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading
Topic 15 - Two Sample Iferece I STAT 511 Professor Bruce Craig Comparig Two Populatios Research ofte ivolves the compariso of two or more samples from differet populatios Graphical summaries provide visual
More informationSTAC51: Categorical data Analysis
STAC51: Categorical data Aalysis Mahida Samarakoo Jauary 28, 2016 Mahida Samarakoo STAC51: Categorical data Aalysis 1 / 35 Table of cotets Iferece for Proportios 1 Iferece for Proportios Mahida Samarakoo
More informationChapter 6 Sampling Distributions
Chapter 6 Samplig Distributios 1 I most experimets, we have more tha oe measuremet for ay give variable, each measuremet beig associated with oe radomly selected a member of a populatio. Hece we eed to
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationUnbiased Estimation. February 7-12, 2008
Ubiased Estimatio February 7-2, 2008 We begi with a sample X = (X,..., X ) of radom variables chose accordig to oe of a family of probabilities P θ where θ is elemet from the parameter space Θ. For radom
More information1 Inferential Methods for Correlation and Regression Analysis
1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet
More informationIntroductory statistics
CM9S: Machie Learig for Bioiformatics Lecture - 03/3/06 Itroductory statistics Lecturer: Sriram Sakararama Scribe: Sriram Sakararama We will provide a overview of statistical iferece focussig o the key
More informationTMA4245 Statistics. Corrected 30 May and 4 June Norwegian University of Science and Technology Department of Mathematical Sciences.
Norwegia Uiversity of Sciece ad Techology Departmet of Mathematical Scieces Corrected 3 May ad 4 Jue Solutios TMA445 Statistics Saturday 6 May 9: 3: Problem Sow desity a The probability is.9.5 6x x dx
More informationHomework 5 Solutions
Homework 5 Solutios p329 # 12 No. To estimate the chace you eed the expected value ad stadard error. To do get the expected value you eed the average of the box ad to get the stadard error you eed the
More informationSample Size Estimation in the Proportional Hazards Model for K-sample or Regression Settings Scott S. Emerson, M.D., Ph.D.
ample ie Estimatio i the Proportioal Haards Model for K-sample or Regressio ettigs cott. Emerso, M.D., Ph.D. ample ie Formula for a Normally Distributed tatistic uppose a statistic is kow to be ormally
More informationA goodness-of-fit test based on the empirical characteristic function and a comparison of tests for normality
A goodess-of-fit test based o the empirical characteristic fuctio ad a compariso of tests for ormality J. Marti va Zyl Departmet of Mathematical Statistics ad Actuarial Sciece, Uiversity of the Free State,
More informationLecture 15: Learning Theory: Concentration Inequalities
STAT 425: Itroductio to Noparametric Statistics Witer 208 Lecture 5: Learig Theory: Cocetratio Iequalities Istructor: Ye-Chi Che 5. Itroductio Recall that i the lecture o classificatio, we have see that
More informationG. R. Pasha Department of Statistics Bahauddin Zakariya University Multan, Pakistan
Deviatio of the Variaces of Classical Estimators ad Negative Iteger Momet Estimator from Miimum Variace Boud with Referece to Maxwell Distributio G. R. Pasha Departmet of Statistics Bahauddi Zakariya Uiversity
More informationSample questions. 8. Let X denote a continuous random variable with probability density function f(x) = 4x 3 /15 for
Sample questios Suppose that humas ca have oe of three bloodtypes: A, B, O Assume that 40% of the populatio has Type A, 50% has type B, ad 0% has Type O If a perso has type A, the probability that they
More informationChapter two: Hypothesis testing
: Hypothesis testig - Some basic cocepts: - Data: The raw material of statistics is data. For our purposes we may defie data as umbers. The two kids of umbers that we use i statistics are umbers that result
More informationSince X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain
Assigmet 9 Exercise 5.5 Let X biomial, p, where p 0, 1 is ukow. Obtai cofidece itervals for p i two differet ways: a Sice X / p d N0, p1 p], the variace of the limitig distributio depeds oly o p. Use the
More informationAdditional Notes and Computational Formulas CHAPTER 3
Additioal Notes ad Computatioal Formulas APPENDIX CHAPTER 3 1 The Greek capital sigma is the mathematical sig for summatio If we have a sample of observatios say y 1 y 2 y 3 y their sum is y 1 + y 2 +
More informationA new distribution-free quantile estimator
Biometrika (1982), 69, 3, pp. 635-40 Prited i Great Britai 635 A ew distributio-free quatile estimator BY FRANK E. HARRELL Cliical Biostatistics, Duke Uiversity Medical Ceter, Durham, North Carolia, U.S.A.
More informationThis is an introductory course in Analysis of Variance and Design of Experiments.
1 Notes for M 384E, Wedesday, Jauary 21, 2009 (Please ote: I will ot pass out hard-copy class otes i future classes. If there are writte class otes, they will be posted o the web by the ight before class
More informationClustering. CM226: Machine Learning for Bioinformatics. Fall Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar.
Clusterig CM226: Machie Learig for Bioiformatics. Fall 216 Sriram Sakararama Ackowledgmets: Fei Sha, Ameet Talwalkar Clusterig 1 / 42 Admiistratio HW 1 due o Moday. Email/post o CCLE if you have questios.
More informationDiscrete Mathematics for CS Spring 2008 David Wagner Note 22
CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig
More informationExample: Find the SD of the set {x j } = {2, 4, 5, 8, 5, 11, 7}.
1 (*) If a lot of the data is far from the mea, the may of the (x j x) 2 terms will be quite large, so the mea of these terms will be large ad the SD of the data will be large. (*) I particular, outliers
More informationOutput Analysis (2, Chapters 10 &11 Law)
B. Maddah ENMG 6 Simulatio Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should be doe
More informationRandom Variables, Sampling and Estimation
Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig
More information7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals
7-1 Chapter 4 Part I. Samplig Distributios ad Cofidece Itervals 1 7- Sectio 1. Samplig Distributio 7-3 Usig Statistics Statistical Iferece: Predict ad forecast values of populatio parameters... Test hypotheses
More informationCEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering
CEE 5 Autum 005 Ucertaity Cocepts for Geotechical Egieerig Basic Termiology Set A set is a collectio of (mutually exclusive) objects or evets. The sample space is the (collectively exhaustive) collectio
More information4. Partial Sums and the Central Limit Theorem
1 of 10 7/16/2009 6:05 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 4. Partial Sums ad the Cetral Limit Theorem The cetral limit theorem ad the law of large umbers are the two fudametal theorems
More informationBIOS 4110: Introduction to Biostatistics. Breheny. Lab #9
BIOS 4110: Itroductio to Biostatistics Brehey Lab #9 The Cetral Limit Theorem is very importat i the realm of statistics, ad today's lab will explore the applicatio of it i both categorical ad cotiuous
More informationAsymptotic Coupling and Its Applications in Information Theory
Asymptotic Couplig ad Its Applicatios i Iformatio Theory Vicet Y. F. Ta Joit Work with Lei Yu Departmet of Electrical ad Computer Egieerig, Departmet of Mathematics, Natioal Uiversity of Sigapore IMS-APRM
More informationData Description. Measure of Central Tendency. Data Description. Chapter x i
Data Descriptio Describe Distributio with Numbers Example: Birth weights (i lb) of 5 babies bor from two groups of wome uder differet care programs. Group : 7, 6, 8, 7, 7 Group : 3, 4, 8, 9, Chapter 3
More informationIMPROVING EFFICIENT MARGINAL ESTIMATORS IN BIVARIATE MODELS WITH PARAMETRIC MARGINALS
IMPROVING EFFICIENT MARGINAL ESTIMATORS IN BIVARIATE MODELS WITH PARAMETRIC MARGINALS HANXIANG PENG AND ANTON SCHICK Abstract. Suppose we have data from a bivariate model with parametric margials. Efficiet
More informationCEU Department of Economics Econometrics 1, Problem Set 1 - Solutions
CEU Departmet of Ecoomics Ecoometrics, Problem Set - Solutios Part A. Exogeeity - edogeeity The liear coditioal expectatio (CE) model has the followig form: We would like to estimate the effect of some
More informationMathematical Statistics - MS
Paper Specific Istructios. The examiatio is of hours duratio. There are a total of 60 questios carryig 00 marks. The etire paper is divided ito three sectios, A, B ad C. All sectios are compulsory. Questios
More informationBull. Korean Math. Soc. 36 (1999), No. 3, pp. 451{457 THE STRONG CONSISTENCY OF NONLINEAR REGRESSION QUANTILES ESTIMATORS Seung Hoe Choi and Hae Kyung
Bull. Korea Math. Soc. 36 (999), No. 3, pp. 45{457 THE STRONG CONSISTENCY OF NONLINEAR REGRESSION QUANTILES ESTIMATORS Abstract. This paper provides suciet coditios which esure the strog cosistecy of regressio
More information11 Correlation and Regression
11 Correlatio Regressio 11.1 Multivariate Data Ofte we look at data where several variables are recorded for the same idividuals or samplig uits. For example, at a coastal weather statio, we might record
More informationChapter 13: Tests of Hypothesis Section 13.1 Introduction
Chapter 13: Tests of Hypothesis Sectio 13.1 Itroductio RECAP: Chapter 1 discussed the Likelihood Ratio Method as a geeral approach to fid good test procedures. Testig for the Normal Mea Example, discussed
More informationStatistical Inference Based on Extremum Estimators
T. Rotheberg Fall, 2007 Statistical Iferece Based o Extremum Estimators Itroductio Suppose 0, the true value of a p-dimesioal parameter, is kow to lie i some subset S R p : Ofte we choose to estimate 0
More informationError & Uncertainty. Error. More on errors. Uncertainty. Page # The error is the difference between a TRUE value, x, and a MEASURED value, x i :
Error Error & Ucertaity The error is the differece betwee a TRUE value,, ad a MEASURED value, i : E = i There is o error-free measuremet. The sigificace of a measuremet caot be judged uless the associate
More informationLet us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f.
Lecture 5 Let us give oe more example of MLE. Example 3. The uiform distributio U[0, ] o the iterval [0, ] has p.d.f. { 1 f(x =, 0 x, 0, otherwise The likelihood fuctio ϕ( = f(x i = 1 I(X 1,..., X [0,
More informationLinear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d
Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y
More informationNYU Center for Data Science: DS-GA 1003 Machine Learning and Computational Statistics (Spring 2018)
NYU Ceter for Data Sciece: DS-GA 003 Machie Learig ad Computatioal Statistics (Sprig 208) Brett Berstei, David Roseberg, Be Jakubowski Jauary 20, 208 Istructios: Followig most lab ad lecture sectios, we
More informationAppendix to: Hypothesis Testing for Multiple Mean and Correlation Curves with Functional Data
Appedix to: Hypothesis Testig for Multiple Mea ad Correlatio Curves with Fuctioal Data Ao Yua 1, Hog-Bi Fag 1, Haiou Li 1, Coli O. Wu, Mig T. Ta 1, 1 Departmet of Biostatistics, Bioiformatics ad Biomathematics,
More informationMachine Learning Brett Bernstein
Machie Learig Brett Berstei Week Lecture: Cocept Check Exercises Starred problems are optioal. Statistical Learig Theory. Suppose A = Y = R ad X is some other set. Furthermore, assume P X Y is a discrete
More informationVector Quantization: a Limiting Case of EM
. Itroductio & defiitios Assume that you are give a data set X = { x j }, j { 2,,, }, of d -dimesioal vectors. The vector quatizatio (VQ) problem requires that we fid a set of prototype vectors Z = { z
More informationStatistics 203 Introduction to Regression and Analysis of Variance Assignment #1 Solutions January 20, 2005
Statistics 203 Itroductio to Regressio ad Aalysis of Variace Assigmet #1 Solutios Jauary 20, 2005 Q. 1) (MP 2.7) (a) Let x deote the hydrocarbo percetage, ad let y deote the oxyge purity. The simple liear