
DISCUSSION OF THE PAPER BY LIN AND YING

Xihong Lin and Raymond J. Carroll*

July 21, 2000

*Xihong Lin is Associate Professor, Department of Biostatistics, University of Michigan, Ann Arbor, MI. Her research was supported by a grant from the National Cancer Institute (CA 76404). Raymond J. Carroll (carroll@stat.tamu.edu) is Distinguished Professor, Departments of Statistics and Biostatistics & Epidemiology, Texas A&M University, College Station, TX. His research was supported by a grant from the National Cancer Institute (CA 57030), and by the Texas A&M Center for Environmental and Rural Health via a grant from the National Institute of Environmental Health Sciences (P30-ES09106).

Professors Lin and Ying are to be congratulated for an interesting paper on a challenging topic and for introducing survival analysis techniques to the analysis of longitudinal data. Their paper makes an interesting contribution to the developing literature on semiparametric and nonparametric regression in longitudinal data. Discussions are of course a means of bringing out points that authors are aware of but did not have space to address. In our discussion, we focus on two points.

• For computational convenience, Lin and Ying choose single nearest neighbor "smoothing" instead of standard smoothing techniques, i.e., they average over a neighborhood that contains but a single member. This choice permits simple computation and elegant techniques. However, their dismissal of standard smoothing techniques deserves some discussion. As we show below, their method can have near zero efficiency compared to alternative semiparametric methods, at least in the special cases we have examined.

• In their simulation, Lin and Ying find that their estimates of parametric components of semiparametric models have efficiency near one compared to parametric modeling techniques. They are very careful not to claim this as a general phenomenon, but we worry that most readers will not see how careful they really are. We indicate using semiparametric efficiency bounds that their estimators of the parametric components can have near zero efficiency compared to parametric methods, when the latter choose the correct model. We also indicate that their simulation results can be explained theoretically.

For simplicity and to satisfy space limitations, in our discussion we focus here on the case that (a) $X(t)$ is a scalar; (b) the number of observations per subject is bounded away from infinity; and (c) the number of subjects goes to infinity ($n \to \infty$). We will not repeat these assumptions in what follows.
1 Semiparametric Regression and the Choice of Singleton Nearest Neighbors

The Lin and Ying method has undeniable appeal in terms of computational simplicity, and we conjecture that it has good efficiency when the number of observations per subject is large. However, in many longitudinal studies, the number of observations per subject is small. For example, in the AIDS example considered in the paper, the number of observations per subject ranged from 1 to 18 with a median of 8. We now show that the computational simplicity achieved by using single nearest neighbor smoothing techniques, as compared to standard smoothing techniques, can lead to estimators with arbitrarily low efficiency compared to methods readily available in standard statistical packages, e.g., the gam function in Splus.

Consider the simple special case that (a) $X(t) = X$ is, as in their example, non-time varying (i.e., $X$ and $T$ are independent); (b) each individual has a single observation time $T$, which varies from individual to individual; and (c) these observation times have a continuous density function. Write $Y_i = Y(T_i)$ and $\epsilon_i = \epsilon(T_i)$. Then their model (1.1) is

$$Y_i = \alpha(T_i) + \beta X_i + \epsilon_i. \qquad (1)$$

Assume for simplicity that $\mathrm{var}(\epsilon_i) = \sigma_\epsilon^2$, write the mean and variance of $X$ as $\mu_x$ and $\sigma_x^2$, respectively, and note that $E(X \mid T) = \mu_x$. Model (1) is often referred to as the partial linear model, and there is an enormous literature on it, including, among many others, Speckman (1988), Hastie & Tibshirani (1990) and Severini and Staniswalis (1994). Estimators of $\beta$ are available through the gam function in Splus. In particular, the latter authors show that the semiparametric efficient estimator is easily computed without iteration or backfitting as follows: (a) first regress $Y$ on $T$ to form an estimator $\hat{m}_y(t)$ using, e.g., kernel methods or spline methods; then (b) estimate $\beta$ by regressing $Y - \hat{m}_y(T)$ on $X - \bar{X}$, where $\bar{X}$ is the sample mean of the $X$'s. Severini and Staniswalis (1994) show that this estimate of $\beta$ is semiparametric efficient and has an asymptotic variance of

$$\mathrm{var}(\hat{\beta}) \approx n^{-1} \frac{\sigma_\epsilon^2}{\sigma_x^2},$$

while, as shown in the Appendix, Lin and Ying's estimator, $\hat{\beta}_{LY}$, has an asymptotic variance of

$$\mathrm{var}(\hat{\beta}_{LY}) \approx n^{-1} \frac{\mathrm{var}\{\alpha(T)\} + \sigma_\epsilon^2}{\sigma_x^2}. \qquad (2)$$

The efficiency of the Lin and Ying estimator compared to the semiparametric efficient estimator is

$$\mathrm{Efficiency}(\hat{\beta}_{LY}; \hat{\beta}) = \frac{\sigma_\epsilon^2}{\mathrm{var}\{\alpha(T)\} + \sigma_\epsilon^2},$$

which can be arbitrarily small if the function $\alpha(T)$ varies considerably.
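The efficiency formula above can be checked with a small Monte Carlo experiment. The following is only a sketch under assumptions that are not in the original discussion: $\alpha(t) = A \sin(2\pi t)$ with $T$ uniform on $(0,1)$, standard normal $X$, an Epanechnikov-kernel Nadaraya-Watson smoother for $\hat{m}_y$, and hypothetical values for $n$, the bandwidth and the number of replicates. In this single-observation special case the Lin and Ying estimator reduces to the least squares slope of $Y - \bar{Y}$ on $X - \bar{X}$, as in the Appendix.

```python
import bisect
import math
import random
import statistics


def kernel_smooth(t_sorted, v_sorted, bandwidth):
    """Nadaraya-Watson fit of E(v | t) at each sample point (Epanechnikov kernel).

    Both inputs must be ordered by increasing t.
    """
    fitted = []
    for ti in t_sorted:
        lo = bisect.bisect_left(t_sorted, ti - bandwidth)
        hi = bisect.bisect_right(t_sorted, ti + bandwidth)
        num = den = 0.0
        for j in range(lo, hi):
            u = (t_sorted[j] - ti) / bandwidth
            w = 1.0 - u * u              # Epanechnikov weight (up to a constant)
            num += w * v_sorted[j]
            den += w
        fitted.append(num / den)         # den >= 1: the point itself has weight 1
    return fitted


def one_replicate(rng, n, beta, sigma_eps, amplitude, bandwidth):
    """Draw one data set from model (1); return (beta_LY, beta_two_step)."""
    t = sorted(rng.random() for _ in range(n))       # one observation time per subject
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]      # X independent of T
    y = [amplitude * math.sin(2.0 * math.pi * ti) + beta * xi + rng.gauss(0.0, sigma_eps)
         for ti, xi in zip(t, x)]                    # assumed alpha(t) = A sin(2 pi t)
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    xc = [xi - x_bar for xi in x]
    sxx = sum(c * c for c in xc)
    m_hat = kernel_smooth(t, y, bandwidth)
    # Special-case Lin-Ying estimator: slope of Y - Ybar on X - Xbar.
    beta_ly = sum(c * (yi - y_bar) for c, yi in zip(xc, y)) / sxx
    # Two-step estimator: slope of Y - m_hat(T) on X - Xbar.
    beta_ts = sum(c * (yi - mi) for c, yi, mi in zip(xc, y, m_hat)) / sxx
    return beta_ly, beta_ts


def run_simulation(n=300, reps=200, beta=0.5, sigma_eps=0.5,
                   amplitude=2.0, bandwidth=0.08, seed=0):
    """Compare the Monte Carlo variance ratio with the asymptotic efficiency."""
    rng = random.Random(seed)
    draws = [one_replicate(rng, n, beta, sigma_eps, amplitude, bandwidth)
             for _ in range(reps)]
    var_ly = statistics.variance([d[0] for d in draws])
    var_ts = statistics.variance([d[1] for d in draws])
    var_alpha = amplitude ** 2 / 2.0     # var{A sin(2 pi T)} for T ~ U(0, 1)
    return {
        "empirical_efficiency": var_ts / var_ly,
        "theoretical_efficiency": sigma_eps ** 2 / (var_alpha + sigma_eps ** 2),
    }


if __name__ == "__main__":
    print(run_simulation())
```

With these assumed settings, $\mathrm{var}\{\alpha(T)\} = 2$ and $\sigma_\epsilon^2 = 0.25$, so the asymptotic efficiency is $0.25/2.25 \approx 0.11$. The empirical ratio should be of the same order but need not match closely, since the smoother has finite-sample bias and variance. Increasing `amplitude` drives both ratios toward zero.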
To gain insight into this result, we compare the estimating function corresponding to Lin and Ying's estimator with the semiparametric efficient score of $\beta$. One can easily show that the semiparametric efficient score of $\beta$ under model (1) is, by suppressing the index $i$,

$$\{X - E(X \mid T)\}\{Y - X\beta - \alpha(T)\}/\sigma_\epsilon^2$$

(Severini and Staniswalis, 1994), which is the same as the estimating function $U_g(\beta)$ given in the equation above (2.7) when $g(T) = \alpha(T)$. This suggests that the efficiency of an estimator of $\beta$ depends heavily on the asymptotic choice of $g(T)$ and the method for estimating it. Nonparametric regression estimators of $\alpha(T)$ give an estimating function asymptotically equivalent to (correctly) assuming that $g(T) = \alpha(T)$. Lin and Ying's estimation of $\alpha(T)$ by the singleton nearest neighbor method, via calculating $Y^*$, which is equal to $\bar{Y}$ in this case, gives an estimating function asymptotically equivalent to (incorrectly) assuming that $g(T) = E\{\alpha(T)\}$, which can of course differ from $\alpha(T)$ dramatically if $\alpha(T)$ varies considerably.

It is possible to construct the semiparametric efficient score in our setting when there is more than one observation per individual (Lin and Carroll, 2000), but implementing this has not been done. However if, as do Lin and Ying, one ignores the correlations in the residual process $\epsilon(t)$, then the same basic estimator described above applies: one simply ignores the within-subject correlations, combines the $(Y, X, T)$ data into a "large" data set, and estimates $\alpha(t)$ using standard smoothing methods. This is in fact a GEE type estimator under working independence. The resulting estimator of $\beta$ is $\sqrt{n}$-consistent and its asymptotic variance is easy to work out (Lin and Carroll, 2000) and estimate. We expect that the GEE type estimator is generally more efficient, and often much more efficient, than that of Lin and Ying, since the above simple scenario with one observation per subject is a special case. More generally, we conjecture that replacing $Y^*(t)$ in Lin and Ying's (2.8) with a standard nonparametric regression estimator, while slightly more complex computationally, will lead to more efficient estimation, and sometimes nearly infinitely more efficient estimation. We would be interested in Professors Lin and Ying's comments on this issue.
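The pooled, working-independence estimator described above can be sketched in a few lines. Everything specific here is an assumption made for illustration: the data-generating design (a shared random intercept per subject, three observations each, a covariate whose mean drifts with time), the Epanechnikov kernel, the bandwidth, and the choice to remove the smooth of $X$ on $T$ from $X$ in the style of Speckman (1988) partial residuals, which stands in for subtracting $\bar{X}$ when $X$ is not independent of $T$. It is not the estimator of Lin and Carroll (2000) and it does not compute a variance estimate.

```python
import bisect
import math
import random


def kernel_smooth(t_sorted, v_sorted, bandwidth):
    """Nadaraya-Watson fit of E(v | t) at the sample points (Epanechnikov kernel)."""
    fitted = []
    for ti in t_sorted:
        lo = bisect.bisect_left(t_sorted, ti - bandwidth)
        hi = bisect.bisect_right(t_sorted, ti + bandwidth)
        num = den = 0.0
        for j in range(lo, hi):
            u = (t_sorted[j] - ti) / bandwidth
            w = 1.0 - u * u
            num += w * v_sorted[j]
            den += w
        fitted.append(num / den)
    return fitted


def simulate_clustered(rng, n_subjects, m, beta):
    """Hypothetical longitudinal data: m correlated observations per subject."""
    rows = []
    for _ in range(n_subjects):
        b = rng.gauss(0.0, 1.0)              # random intercept: within-subject correlation
        for _ in range(m):
            t = rng.random()
            x = t + rng.gauss(0.0, 1.0)      # covariate seen only at observation times
            y = 2.0 * math.sin(2.0 * math.pi * t) + beta * x + b + rng.gauss(0.0, 0.5)
            rows.append((t, x, y))
    return rows


def pooled_partial_linear(rows, bandwidth):
    """Pool all (T, X, Y) rows, ignore clustering, and estimate beta."""
    rows = sorted(rows)                       # order the pooled data by time
    t = [r[0] for r in rows]
    x = [r[1] for r in rows]
    y = [r[2] for r in rows]
    x_res = [xi - fi for xi, fi in zip(x, kernel_smooth(t, x, bandwidth))]
    y_res = [yi - fi for yi, fi in zip(y, kernel_smooth(t, y, bandwidth))]
    return sum(a * b for a, b in zip(x_res, y_res)) / sum(a * a for a in x_res)


if __name__ == "__main__":
    rng = random.Random(1)
    data = simulate_clustered(rng, n_subjects=300, m=3, beta=1.0)
    print(pooled_partial_linear(data, bandwidth=0.08))
```

The point of the sketch is only that the computation stays simple: two smooths of the pooled data and one least squares slope. Valid standard errors would still need to account for the within-subject correlation that the point estimate ignores.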
Another issue is that the consistency of Lin and Ying's estimator requires the covariate history $X_i(t)$ to be fully observed, while the GEE type estimator does not require this assumption. If $X_i(t)$ is a time-independent covariate, this assumption is easy to satisfy. If $X_i(t)$ is time-varying and only a finite number of observations per subject are available (a common case in longitudinal studies), this assumption will be difficult to satisfy, since information on $X_i(t)$ is often available only at the observation times. If one approximates $X_i(t)$ by $X_i^*(t)$, defined similarly to $Y_i^*(t)$ using the singleton nearest neighbor method, the resulting estimating equation can be shown to be biased and the estimator of $\beta$ to be $\sqrt{n}$-inconsistent. Hence an alternative method needs to be proposed. We conjecture that once again the use of conventional nonparametric regression techniques will solve the consistency issue, and we are interested in Professors Lin and Ying's suggestion on how to handle this situation.

2 Relative Efficiency of Semiparametric Regression and Parametric Regression

Lin and Ying showed through simulation studies that the loss of efficiency in estimating the finite dimensional parameter $\beta$ by fitting the semiparametric regression model (1.1) is negligible compared to fitting a parametric model in which $\alpha(t)$ is estimated parametrically. We show theoretically that semiparametric regression is fully efficient in estimating $\beta$ compared to parametric regression if $X_i(t)$ is a time-independent covariate, i.e., $X$ and $T$ are independent. However, semiparametric regression can be subject to an arbitrarily large loss of efficiency when $X_i(t)$ is time-varying.

Consider a simple special case where each subject has a single observation time, which can vary from subject to subject. Assume the optimal (semiparametric efficient) score is used, i.e., $g(T) = \alpha(T)$ in equation (2.7). Without loss of generality, assume $(X, T, Y)$ are centered and have mean zero. Compare the semiparametric model (1) with the simple parametric linear model

$$Y_i = \alpha T_i + \beta X_i + \epsilon_i, \qquad (3)$$

where $\epsilon_i \sim N(0, \sigma_\epsilon^2)$. It can be shown that the semiparametric efficient information bound for $\beta$ under model (1) is

$$I_S = \frac{E\{\mathrm{var}(X \mid T)\}}{\sigma_\epsilon^2}.$$

The efficient information for $\beta$ under the parametric linear model (3) is

$$I_P = \frac{1}{\sigma_\epsilon^2}\left\{E(X^2) - \frac{[E(XT)]^2}{E(T^2)}\right\} = \frac{1}{\sigma_\epsilon^2}\left(E[\mathrm{var}(X \mid T)] + \frac{E[E^2(X \mid T)]\,E(T^2) - [E\{T E(X \mid T)\}]^2}{E(T^2)}\right) \qquad (4)$$

$$\ge \frac{1}{\sigma_\epsilon^2} E\{\mathrm{var}(X \mid T)\} = I_S,$$

where the Cauchy-Schwarz inequality is applied in the last step and the equality holds when $E(X \mid T)$ is linear in $T$ or is free of $T$. The second term in (4) can be arbitrarily large, e.g., when $E(X \mid T)$ varies with $T$ substantially in a nonlinear fashion. Hence the loss of efficiency in estimating $\beta$ using semiparametric regression can be arbitrarily large compared to parametric regression.
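A numerical illustration of the two information bounds may help. The sketch below evaluates $I_S$ and $I_P$ from (4) by Monte Carlo under assumptions chosen only for illustration: $T$ standard normal, $\mathrm{var}(X \mid T) = 1$, and a user-supplied conditional mean $E(X \mid T)$. For the hypothetical choice $E(X \mid T) = c(T^2 - 1)$ one has $E[E^2(X \mid T)] = 2c^2$ and $E\{T E(X \mid T)\} = 0$, so that $I_S / I_P = 1/(1 + 2c^2)$, while for a linear $E(X \mid T)$ the ratio is one.

```python
import random
import statistics


def information_bounds(t, cond_mean_x, cond_var_x, sigma_eps):
    """Sample versions of I_S and I_P in (4), assuming centered X and T."""
    e_var = statistics.fmean(cond_var_x)                       # E{var(X | T)}
    e_m2 = statistics.fmean([m * m for m in cond_mean_x])      # E[E^2(X | T)]
    e_t2 = statistics.fmean([ti * ti for ti in t])             # E(T^2)
    e_tm = statistics.fmean([ti * m for ti, m in zip(t, cond_mean_x)])  # E{T E(X | T)}
    i_s = e_var / sigma_eps ** 2
    i_p = (e_var + (e_m2 * e_t2 - e_tm ** 2) / e_t2) / sigma_eps ** 2
    return i_s, i_p


def simulate_ratio(cond_mean, n_draws=100_000, sigma_eps=1.0, seed=0):
    """Monte Carlo value of I_S / I_P for T ~ N(0, 1) and var(X | T) = 1."""
    rng = random.Random(seed)
    t = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
    m = [cond_mean(ti) for ti in t]
    v = [1.0] * n_draws                       # assumed constant conditional variance
    i_s, i_p = information_bounds(t, m, v, sigma_eps)
    return i_s / i_p


if __name__ == "__main__":
    # Linear E(X | T): no loss from fitting the semiparametric model.
    print(simulate_ratio(lambda t: 2.0 * t))
    # Strongly nonlinear E(X | T): closed form 1 / (1 + 2 * 3**2) = 1/19.
    print(simulate_ratio(lambda t: 3.0 * (t * t - 1.0)))
```

Raising the constant $c$ in the nonlinear case pushes the ratio toward zero, which is the sense in which the loss of efficiency can be arbitrarily large.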
Lin and Ying's simulation studies considered a time-varying $X_i(t)$ case, and the results in Table 2 showed little loss of efficiency when fitting the semiparametric regression model compared to fitting the parametric regression model. Examination of the data generating mechanism used for Table 2 suggests that it assumes realizations of $X$ vary with $T$, but $X$ and $T$ are in fact independent. Our results suggest that this is the type of situation in which semiparametric estimators are nearly efficient in the parametric sense. It would be interesting to run a simulation study where $X$ and $T$ are correlated, e.g., by allowing the mean of $X(T)$ to vary in a major way with $T$, to assess the loss of efficiency of semiparametric regression compared to parametric regression.

Appendix: Verification of (2)

The Lin and Ying estimator is given in their (2.8). For our simple special case, we have

$$\hat{\beta}_{LY} = \frac{n^{-1} \sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y})}{n^{-1} \sum_{i=1}^n (X_i - \bar{X})^2} = \beta + \frac{n^{-1} \sum_{i=1}^n (X_i - \bar{X})\{\alpha(T_i) - \bar{\alpha} + \epsilon_i - \bar{\epsilon}\}}{n^{-1} \sum_{i=1}^n (X_i - \bar{X})^2}.$$

Since by assumption $\alpha(T_i)$, $X_i$ and $\epsilon_i$ are independent, (2) follows immediately.

References

Hastie, T. and Tibshirani, R. (1990), Generalized Additive Models, Chapman and Hall, New York.

Lin, X. and Carroll, R. J. (2000), "Semiparametric Regression for Clustered Data Using Generalized Estimating Equations," under review.

Severini, T. A. and Staniswalis, J. G. (1994), "Quasilikelihood Estimation in Semiparametric Models," Journal of the American Statistical Association, 89.

Speckman, P. (1988), "Kernel smoothing in partial linear models," Journal of the Royal Statistical Society, Series B, 50.

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Jianqing Fan Department of Statistics Chinese University of Hong Kong AND Department of Statistics

More information

Integrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University

Integrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University Integrated Likelihood Estimation in Semiparametric Regression Models Thomas A. Severini Department of Statistics Northwestern University Joint work with Heping He, University of York Introduction Let Y

More information

Generalized Additive Models

Generalized Additive Models Generalized Additive Models The Model The GLM is: g( µ) = ß 0 + ß 1 x 1 + ß 2 x 2 +... + ß k x k The generalization to the GAM is: g(µ) = ß 0 + f 1 (x 1 ) + f 2 (x 2 ) +... + f k (x k ) where the functions

More information

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA Kasun Rathnayake ; A/Prof Jun Ma Department of Statistics Faculty of Science and Engineering Macquarie University

More information

Issues on quantile autoregression

Issues on quantile autoregression Issues on quantile autoregression Jianqing Fan and Yingying Fan We congratulate Koenker and Xiao on their interesting and important contribution to the quantile autoregression (QAR). The paper provides

More information

ATINER's Conference Paper Series STA

ATINER's Conference Paper Series STA ATINER CONFERENCE PAPER SERIES No: LNG2014-1176 Athens Institute for Education and Research ATINER ATINER's Conference Paper Series STA2014-1255 Parametric versus Semi-parametric Mixed Models for Panel

More information

Functional Latent Feature Models. With Single-Index Interaction

Functional Latent Feature Models. With Single-Index Interaction Generalized With Single-Index Interaction Department of Statistics Center for Statistical Bioinformatics Institute for Applied Mathematics and Computational Science Texas A&M University Naisyin Wang and

More information

DESIGN-ADAPTIVE MINIMAX LOCAL LINEAR REGRESSION FOR LONGITUDINAL/CLUSTERED DATA

DESIGN-ADAPTIVE MINIMAX LOCAL LINEAR REGRESSION FOR LONGITUDINAL/CLUSTERED DATA Statistica Sinica 18(2008), 515-534 DESIGN-ADAPTIVE MINIMAX LOCAL LINEAR REGRESSION FOR LONGITUDINAL/CLUSTERED DATA Kani Chen 1, Jianqing Fan 2 and Zhezhen Jin 3 1 Hong Kong University of Science and Technology,

More information

Efficient Estimation of Population Quantiles in General Semiparametric Regression Models

Efficient Estimation of Population Quantiles in General Semiparametric Regression Models Efficient Estimation of Population Quantiles in General Semiparametric Regression Models Arnab Maity 1 Department of Statistics, Texas A&M University, College Station TX 77843-3143, U.S.A. amaity@stat.tamu.edu

More information

Lecture 3: Statistical Decision Theory (Part II)

Lecture 3: Statistical Decision Theory (Part II) Lecture 3: Statistical Decision Theory (Part II) Hao Helen Zhang Hao Helen Zhang Lecture 3: Statistical Decision Theory (Part II) 1 / 27 Outline of This Note Part I: Statistics Decision Theory (Classical

More information

2 Causal Graph and Conditional Independence Test 2.1 Causal Graph A directed graph is defined as a set of vertices and a set of directed edges connect

2 Causal Graph and Conditional Independence Test 2.1 Causal Graph A directed graph is defined as a set of vertices and a set of directed edges connect Simulation Study of Conditional Independence Test Using GAM Method Chu, Tianjiao December 13, 2000 1 Introduction Causal information, or even partial causal information, can help decision making. For example,

More information

for function values or parameters, which are neighbours in the domain of a metrical covariate or a time scale, or in space. These concepts have been u

for function values or parameters, which are neighbours in the domain of a metrical covariate or a time scale, or in space. These concepts have been u Bayesian generalized additive mied models. study A simulation Stefan Lang and Ludwig Fahrmeir University of Munich, Ludwigstr. 33, 8539 Munich email:lang@stat.uni-muenchen.de and fahrmeir@stat.uni-muenchen.de

More information

WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION. Abstract

WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION. Abstract Journal of Data Science,17(1). P. 145-160,2019 DOI:10.6339/JDS.201901_17(1).0007 WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION Wei Xiong *, Maozai Tian 2 1 School of Statistics, University of

More information

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Sahar Z Zangeneh Robert W. Keener Roderick J.A. Little Abstract In Probability proportional

More information

Supporting Information for Estimating restricted mean. treatment effects with stacked survival models

Supporting Information for Estimating restricted mean. treatment effects with stacked survival models Supporting Information for Estimating restricted mean treatment effects with stacked survival models Andrew Wey, David Vock, John Connett, and Kyle Rudser Section 1 presents several extensions to the simulation

More information

GENERALIZED LINEAR MIXED MODELS AND MEASUREMENT ERROR. Raymond J. Carroll: Texas A&M University

GENERALIZED LINEAR MIXED MODELS AND MEASUREMENT ERROR. Raymond J. Carroll: Texas A&M University GENERALIZED LINEAR MIXED MODELS AND MEASUREMENT ERROR Raymond J. Carroll: Texas A&M University Naisyin Wang: Xihong Lin: Roberto Gutierrez: Texas A&M University University of Michigan Southern Methodist

More information

A Shape Constrained Estimator of Bidding Function of First-Price Sealed-Bid Auctions

A Shape Constrained Estimator of Bidding Function of First-Price Sealed-Bid Auctions A Shape Constrained Estimator of Bidding Function of First-Price Sealed-Bid Auctions Yu Yvette Zhang Abstract This paper is concerned with economic analysis of first-price sealed-bid auctions with risk

More information

LOCAL LINEAR REGRESSION FOR GENERALIZED LINEAR MODELS WITH MISSING DATA

LOCAL LINEAR REGRESSION FOR GENERALIZED LINEAR MODELS WITH MISSING DATA The Annals of Statistics 1998, Vol. 26, No. 3, 1028 1050 LOCAL LINEAR REGRESSION FOR GENERALIZED LINEAR MODELS WITH MISSING DATA By C. Y. Wang, 1 Suojin Wang, 2 Roberto G. Gutierrez and R. J. Carroll 3

More information

Additive Isotonic Regression

Additive Isotonic Regression Additive Isotonic Regression Enno Mammen and Kyusang Yu 11. July 2006 INTRODUCTION: We have i.i.d. random vectors (Y 1, X 1 ),..., (Y n, X n ) with X i = (X1 i,..., X d i ) and we consider the additive

More information

SEMIPARAMETRIC REGRESSION WITH TIME-DEPENDENT COEFFICIENTS FOR FAILURE TIME DATA ANALYSIS

SEMIPARAMETRIC REGRESSION WITH TIME-DEPENDENT COEFFICIENTS FOR FAILURE TIME DATA ANALYSIS Statistica Sinica 2 (21), 853-869 SEMIPARAMETRIC REGRESSION WITH TIME-DEPENDENT COEFFICIENTS FOR FAILURE TIME DATA ANALYSIS Zhangsheng Yu and Xihong Lin Indiana University and Harvard School of Public

More information

Quantile Regression for Residual Life and Empirical Likelihood

Quantile Regression for Residual Life and Empirical Likelihood Quantile Regression for Residual Life and Empirical Likelihood Mai Zhou email: mai@ms.uky.edu Department of Statistics, University of Kentucky, Lexington, KY 40506-0027, USA Jong-Hyeon Jeong email: jeong@nsabp.pitt.edu

More information

Rejoinder. 1 Phase I and Phase II Profile Monitoring. Peihua Qiu 1, Changliang Zou 2 and Zhaojun Wang 2

Rejoinder. 1 Phase I and Phase II Profile Monitoring. Peihua Qiu 1, Changliang Zou 2 and Zhaojun Wang 2 Rejoinder Peihua Qiu 1, Changliang Zou 2 and Zhaojun Wang 2 1 School of Statistics, University of Minnesota 2 LPMC and Department of Statistics, Nankai University, China We thank the editor Professor David

More information

PQL Estimation Biases in Generalized Linear Mixed Models

PQL Estimation Biases in Generalized Linear Mixed Models PQL Estimation Biases in Generalized Linear Mixed Models Woncheol Jang Johan Lim March 18, 2006 Abstract The penalized quasi-likelihood (PQL) approach is the most common estimation procedure for the generalized

More information

Motivational Example

Motivational Example Motivational Example Data: Observational longitudinal study of obesity from birth to adulthood. Overall Goal: Build age-, gender-, height-specific growth charts (under 3 year) to diagnose growth abnomalities.

More information

Function of Longitudinal Data

Function of Longitudinal Data New Local Estimation Procedure for Nonparametric Regression Function of Longitudinal Data Weixin Yao and Runze Li Abstract This paper develops a new estimation of nonparametric regression functions for

More information

Constrained Maximum Likelihood Estimation for Model Calibration Using Summary-level Information from External Big Data Sources

Constrained Maximum Likelihood Estimation for Model Calibration Using Summary-level Information from External Big Data Sources Constrained Maximum Likelihood Estimation for Model Calibration Using Summary-level Information from External Big Data Sources Yi-Hau Chen Institute of Statistical Science, Academia Sinica Joint with Nilanjan

More information

13 Endogeneity and Nonparametric IV

13 Endogeneity and Nonparametric IV 13 Endogeneity and Nonparametric IV 13.1 Nonparametric Endogeneity A nonparametric IV equation is Y i = g (X i ) + e i (1) E (e i j i ) = 0 In this model, some elements of X i are potentially endogenous,

More information

Interaction effects for continuous predictors in regression modeling

Interaction effects for continuous predictors in regression modeling Interaction effects for continuous predictors in regression modeling Testing for interactions The linear regression model is undoubtedly the most commonly-used statistical model, and has the advantage

More information

Odds ratio estimation in Bernoulli smoothing spline analysis-ofvariance

Odds ratio estimation in Bernoulli smoothing spline analysis-ofvariance The Statistician (1997) 46, No. 1, pp. 49 56 Odds ratio estimation in Bernoulli smoothing spline analysis-ofvariance models By YUEDONG WANG{ University of Michigan, Ann Arbor, USA [Received June 1995.

More information

Fast Methods for Spatially Correlated Multilevel Functional Data

Fast Methods for Spatially Correlated Multilevel Functional Data Fast Methods for Spatially Correlated Multilevel Functional Data Ana-Maria Staicu, Department of Statistics, North Carolina State University, 23 Stinson Drive Raleigh, NC 27695-8203, USA email: staicu@stat.ncsu.edu,

More information

QUANTIFYING PQL BIAS IN ESTIMATING CLUSTER-LEVEL COVARIATE EFFECTS IN GENERALIZED LINEAR MIXED MODELS FOR GROUP-RANDOMIZED TRIALS

QUANTIFYING PQL BIAS IN ESTIMATING CLUSTER-LEVEL COVARIATE EFFECTS IN GENERALIZED LINEAR MIXED MODELS FOR GROUP-RANDOMIZED TRIALS Statistica Sinica 15(05), 1015-1032 QUANTIFYING PQL BIAS IN ESTIMATING CLUSTER-LEVEL COVARIATE EFFECTS IN GENERALIZED LINEAR MIXED MODELS FOR GROUP-RANDOMIZED TRIALS Scarlett L. Bellamy 1, Yi Li 2, Xihong

More information

Bayesian Estimation and Inference for the Generalized Partial Linear Model

Bayesian Estimation and Inference for the Generalized Partial Linear Model Bayesian Estimation Inference for the Generalized Partial Linear Model Haitham M. Yousof 1, Ahmed M. Gad 2 1 Department of Statistics, Mathematics Insurance, Benha University, Egypt. 2 Department of Statistics,

More information

Using Estimating Equations for Spatially Correlated A

Using Estimating Equations for Spatially Correlated A Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship

More information

Power and Sample Size Calculations with the Additive Hazards Model

Power and Sample Size Calculations with the Additive Hazards Model Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 24 Paper 153 A Note on Empirical Likelihood Inference of Residual Life Regression Ying Qing Chen Yichuan

More information

Econometrics I. Lecture 10: Nonparametric Estimation with Kernels. Paul T. Scott NYU Stern. Fall 2018

Econometrics I. Lecture 10: Nonparametric Estimation with Kernels. Paul T. Scott NYU Stern. Fall 2018 Econometrics I Lecture 10: Nonparametric Estimation with Kernels Paul T. Scott NYU Stern Fall 2018 Paul T. Scott NYU Stern Econometrics I Fall 2018 1 / 12 Nonparametric Regression: Intuition Let s get

More information

Estimation of the Conditional Variance in Paired Experiments

Estimation of the Conditional Variance in Paired Experiments Estimation of the Conditional Variance in Paired Experiments Alberto Abadie & Guido W. Imbens Harvard University and BER June 008 Abstract In paired randomized experiments units are grouped in pairs, often

More information

Modelling Survival Data using Generalized Additive Models with Flexible Link

Modelling Survival Data using Generalized Additive Models with Flexible Link Modelling Survival Data using Generalized Additive Models with Flexible Link Ana L. Papoila 1 and Cristina S. Rocha 2 1 Faculdade de Ciências Médicas, Dep. de Bioestatística e Informática, Universidade

More information

MA 123 (Calculus I) Lecture 3: September 12, 2017 Section A2. Professor Jennifer Balakrishnan,

MA 123 (Calculus I) Lecture 3: September 12, 2017 Section A2. Professor Jennifer Balakrishnan, What is on today Professor Jennifer Balakrishnan, jbala@bu.edu 1 Techniques for computing limits 1 1.1 Limit laws..................................... 1 1.2 One-sided limits..................................

More information

SINGLE-STEP ESTIMATION OF A PARTIALLY LINEAR MODEL

SINGLE-STEP ESTIMATION OF A PARTIALLY LINEAR MODEL SINGLE-STEP ESTIMATION OF A PARTIALLY LINEAR MODEL DANIEL J. HENDERSON AND CHRISTOPHER F. PARMETER Abstract. In this paper we propose an asymptotically equivalent single-step alternative to the two-step

More information

Defect Detection using Nonparametric Regression

Defect Detection using Nonparametric Regression Defect Detection using Nonparametric Regression Siana Halim Industrial Engineering Department-Petra Christian University Siwalankerto 121-131 Surabaya- Indonesia halim@petra.ac.id Abstract: To compare

More information

Regularization in Cox Frailty Models

Regularization in Cox Frailty Models Regularization in Cox Frailty Models Andreas Groll 1, Trevor Hastie 2, Gerhard Tutz 3 1 Ludwig-Maximilians-Universität Munich, Department of Mathematics, Theresienstraße 39, 80333 Munich, Germany 2 University

More information

GENERALIZED ADDITIVE MODELS FOR DATA WITH CONCURVITY: STATISTICAL ISSUES AND A NOVEL MODEL FITTING APPROACH. by Shui He B.S., Fudan University, 1993

GENERALIZED ADDITIVE MODELS FOR DATA WITH CONCURVITY: STATISTICAL ISSUES AND A NOVEL MODEL FITTING APPROACH. by Shui He B.S., Fudan University, 1993 GENERALIZED ADDITIVE MODELS FOR DATA WITH CONCURVITY: STATISTICAL ISSUES AND A NOVEL MODEL FITTING APPROACH by Shui He B.S., Fudan University, 1993 Submitted to the Graduate Faculty of the Graduate School

More information

Sliced Inverse Regression

Sliced Inverse Regression Sliced Inverse Regression Ge Zhao gzz13@psu.edu Department of Statistics The Pennsylvania State University Outline Background of Sliced Inverse Regression (SIR) Dimension Reduction Definition of SIR Inversed

More information

Cross-fitting and fast remainder rates for semiparametric estimation

Cross-fitting and fast remainder rates for semiparametric estimation Cross-fitting and fast remainder rates for semiparametric estimation Whitney K. Newey James M. Robins The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper CWP41/17 Cross-Fitting

More information

Augustin: Some Basic Results on the Extension of Quasi-Likelihood Based Measurement Error Correction to Multivariate and Flexible Structural Models

Augustin: Some Basic Results on the Extension of Quasi-Likelihood Based Measurement Error Correction to Multivariate and Flexible Structural Models Augustin: Some Basic Results on the Extension of Quasi-Likelihood Based Measurement Error Correction to Multivariate and Flexible Structural Models Sonderforschungsbereich 386, Paper 196 (2000) Online

More information

Estimation in Generalized Linear Models with Heterogeneous Random Effects. Woncheol Jang Johan Lim. May 19, 2004
