Efficient Estimation in Convex Single Index Models 1

Size: px
Start display at page:

Download "Efficient Estimation in Convex Single Index Models 1"

Transcription

1 1/28 Efficient Estimation in Convex Single Index Models 1 Rohit Patra University of Florida 1 Joint work with Arun K. Kuchibhotla (UPenn) and Bodhisattva Sen (Columbia)

2 2/28 Overview 1 Introduction 2 Estimation 3 Asymptotics 4 Simulation study

3 Introduction 3/28

4 4/28 A semiparametric model Convex single index model Y = m 0 (θ 0 X ) + ɛ, E(ɛ X ) = 0. (Y, X ) R R d P θ0,m 0. θ 0 R d is the unknown coefficient vector. m 0 : R R is an unknown convex link function with no parametric restriction. Offers a balance between flexibility of nonparametric models and interpretability of parametric models.

5 5/28 Goal and Applicability Model Problem Y = m 0 (θ 0 X ) + ɛ, E(ɛ X ) = 0. Estimate θ 0 and m 0 simultaneously, when we have i.i.d. data {(x i, y i )} n i=1 from the above model. Why convex link Convex/concave SIMs are widely used in economics, operations research, financial engineering among other fields. Production functions, utility functions, and call option prices are known to be concave.

6 6/28 Restrictions Identifiability: If m 1 (t) = m 0 ( t/2) and θ 1 = 2θ 0, then m 0 (θ0 x) = m 1(θ1 x). Thus we need to assume θ 0 Θ := {β R d : β = 1, β 1 > 0}. We need some regularity assumptions on the class of link functions. Only Shape constraints Shape and Smoothness constraints Some relevant works in the shape constrained single index model include: Murphy et al. (1999), Chen and Samworth (2015), Groeneboom and Hendrickx (2016), and Balabdaoui et al. (2016).

7 Y = 2(X θ 0 ) 2 + N(0, 5), X U[0, 1] 3, m(t) = 2t 2. Y: Output for 555 Belgian Firms X: Labour, Capital, Wage Index y Convex LSE Smoothing Splines Truth Index log(output) Splines Concave LSE Kong and Xia (2007) Labour Data for Belgian Firms (1996) Figure: Estimated link functions: stability due to convex constraint. Index := ˆθx. 7/28

8 Estimation 8/28

9 9/28 Estimation in Convex SIM [Kuchibhotla, Patra, Sen, 2017] Lipschitz LSE where C L = ( ˆm, ˆθ) 1 = argmin m C L,θ Θ n n {y i m(θ x i )} 2, i=1 { m m is convex and m(t 1 ) m(t 2 ) L t 1 t 2 } t 1, t 2 R. Penalized LSE ( ˇm, ˇθ) 1 := argmin (m,θ) R Θ n n {y i m(θ x i )} 2 + ˇλ 2 n {m (t)} 2 dt, i=1 where R denotes the class of all convex functions that have absolutely continuous first derivative.

10 10/28 Computation: Alternating Scheme Recall: ( ˆm, ˆθ) = argmin m CL,θ Θ Q n (m, θ), where Q n (m, θ) := 1 n n {y i m(θ x i )} 2. i=1 Minimization for fixed θ For a fixed θ, define ˆm θ := argmin Q n (m, θ). m C L The minimization is a convex optimization problem. ˆm θ can computed efficiently using the nnls package in R. ˇm θ can be computed via a damped newton type algorithm. Our R package simest has a implementation of this computation.

11 11/28 Computation contd. Minimization for fixed θ We can now define the profiled loss Q n : Θ R, Q n (θ) := Q n ( ˆm θ, θ) Minimization over θ To find ˆθ, we now minimize Q n (θ) over θ Θ. The loss function is not convex. Simulations suggest a large domain of attraction for moderate d. We use a gradient step on the unit sphere based on the right derivative of ˆm θ. Some initial work has suggested that the convergence of this alternating scheme is linear, i.e., θ (k+1) ˆθ (1 ρ) θ (k) ˆθ.

12 Asymptotics 12/28

13 13/28 Theoretical properties of LLSE Rates of convergence of the estimators [Kuchibhotla, Patra, Sen, 2017] Under some regularity conditions on m 0 and distribution of X (bounded support), sub-gaussian errors and L L 0 the Lipschitz LSE satisfies ˆm ˆθ m 0 θ 0 = O p (n 2/5 ), [Estimation error] ˆm θ 0 m 0 θ 0 = O p (n 2/5 ), [Estimation error of ˆm] ˆm θ 0 m 0 θ 0 = O p (n 2/15 ), [Estimation error of ˆm ] ˆθ θ 0 = O p (n 2/5 ). [Estimation error of ˆθ] For a convex Lipschitz function g : R R let g define the right derivative of g that satisfies g(b) = g(a) + b a g (t)dt. Here for any f : R R, and θ R d, we define f θ 2 := f (θ x) 2 dp X (x).

14 14/28 Asymptotic normality: Homoscedastic model Recall: 1 ( ˆm, ˆθ) = argmin (m,θ) C L Θ n n {y i m(θ x i )} 2. i=1 Semiparametric efficiency of ˆθ [Kuchibhotla, Patra, and Sen, 2017] Assume that E(ɛ 2 X ) σ 2. Let l θ,m : R d R R d 1 be the efficient score and let us define the efficient information matrix I θ0,m 0 := E(l θ0,m 0 l θ 0,m 0 ) R (d 1) (d 1). If m 0 is twice differentiable, then under some regularity conditions we can conclude n( ˆθ θ 0 ) d N(0, H θ0 I 1 θ 0,m 0 H θ 0 ).

15 15/28 Inference Our result readily yields confidence sets for θ 0. We can use the following plug-in estimator for the covariance estimator: ˆΣ := ˆσ 4 Hˆθ Pˆθ, ˆm [lˆθ, ˆm (Y, X )l ˆθ, ˆm (Y, X )] 1 H ˆθ, where n ˆσ 2 := [y i ˆm(ˆθ x i )] 2 /n i=1 l θ,m (y, x) := ( y m(θ x) ) m (θ x)h θ { x hθ (θ x) }. Asymptotic 1 2α confidence interval [ ˆθ i z ) 1/2 α (ˆΣi,i, ˆθi + z ) 1/2 α (ˆΣi,i ], n n

16 16/28 Asymptotic properties of the PLSE ˇλ 1 n If = O p (n 2/5 ) and ˇλ n = o p (n 1/4 ), then under some similar regularity assumptions the PLSE satisfies ˇm ˇθ m 0 θ 0 = O p (ˇλ n ), ˇm θ 0 m 0 θ 0 = O p (ˇλ n ), [Estimation error] [Estimation error for ˇm] ˇm θ 0 m 0 θ 0 = O p (ˇλ 1/2 n ), [Estimation error for ˇm ] and n( ˇθ θ0 ) d N(0, H θ0 I 1 θ 0,m 0 H θ 0 ). An example choice of ˇλ n := C n 2/5. Proof borrows ideas from the empirical process theory; see e.g., Mammen and van de Geer (1997) and van de Geer (2000).

17 17/28 Difficulty in proving efficiency The LLSE ˆm is a piecewise affine function and lies on the boundary of C L. Nuisance tangent space Consider the following model: Y = m(θ X ) + ɛ, where m C L, and θ Θ. A linear perturbation/submodel around m: m s,a (t) = m(t) s a(t), where s R. The score for the single index model along this submodel is proportional to a( ). lin{a : D R m s,a R for small enough s} L 2 (Λ). The set inclusion is strict when m is not strongly convex.

18 18/28 Efficient score Let S θ denote the parametric score of the model and Λ m denote the nuisance tangent space. When m is strongly convex, the efficient score is known to be Π(S θ Λ m) := l θ,m (y, x) = ( y m(θ x) ) m (θ x)h θ { x hθ (θ x) }. Since ˆm is not strongly convex, it is not clear if one can show that?? Π(Sˆθ Λ ˆm) = lˆθ, ˆm.

19 19/28 Efficient score Since ˆm lies on the boundary of C L, the least favorable path does not exist, i.e., we can not find (θ t, m t ) (centered at (ˆθ, ˆm)) such that?? lˆθ, ˆm = ( y mt (θt x) ) 2 t. t=0 Since LLSE is the minimizer of the least squares loss, this would mean P n lˆθ, ˆm = 0. Since ˆm is piecewise affine, it is not clear if one can show that P n lˆθ, ˆm?? = o p (n 1/2 ).

20 20/28 Approximations We next try to construct some paths around (ˆθ, ˆm) that will have score close to lˆθ, ˆm. We find a path that has the following score: S θ,m = { ( y m(θ x) ) H θ [ θ x m (θ x)x + m (u)k (u)du m (θ x)k(θ x) s 0 ] + m 0 (s 0)k(s 0 ) m 0 (s 0)h θ0 (s 0 ). Note l θ0,m 0 = S θ0,m 0 = ( y m(θ 0 x)) H θ 0 m 0 (θ 0 x) { x h θ0 (θ 0 x)}. van der Vaart (2002) calls such a path approximately least favorable.

21 21/28 Approximation, Part 2 S θ,m is not very tractable, so we further approximate this by ψ θ,m (x, y) := (y m(θ x))h θ [m (θ x)x h θ0 (θ x)m 0(θ x)]. Compare ψ θ,m to l θ,m = ( y m(θ x) ) Hθ m (θ x) { x h θ (θ x) }. Also note ψ θ0,m 0 = l θ0,m 0. We show that P n ψˆθ, ˆm = o p(n 1/2 ).

22 Simulation study 22/28

23 23/28 Another Estimator Convex LSE ( m, θ) := argmin Q n (m, θ). (m,θ) C Θ We can compute the LSE via the alternating minimization algorithm. Simulations suggest θ is n consistent. The behavior of m at the boundary is not well-understood. In fact, in univariate regression, m is unbounded in probability at the boundary.

24 Choice of L Y = (θ 0 X ) 2 + N(0,.1 2 ), where X Uniform[ 1, 1] 4 and θ 0 = 1 4 /2. Mean Absolute Deviation CvxLSE L Figure: Box plots of i=1 θ i θ 0,i (over 1000 replications, n = 500) from the following model as the tuning parameter varies over {3, 4, 5, 7, 10} and CvxLSE. Here L 0 = 4. 24/28

25 25/28 Confidence Interval Y = (θ 0 X ) 2 + N(0,.3 2 ), X Uniform[ 1, 1] 3 and θ 0 = 1 3 / 3. (1) Table: The estimated coverage probabilities and average lengths (obtained from 800 replicates) of nominal 95% confidence intervals for the first coordinate of θ 0 for the model (1). n CvxLip CvxPen Coverage Avg Length Coverage Avg Length

26 0.005 d = d = CvxPen CvxLip Smooth d = EFM EDR CvxPen CvxLip Smooth d = 100 EFM EDR CvxPen CvxLip Smooth EFM EDR Figure: Boxplots of d i=1 ˆθ i θ 0,i /d (over 500 replications) based on 200 observations for dimensions 10, 25, 50, and 100. CvxPen CvxLip Smooth EFM 26/28

27 27/28 Summary First work providing efficient estimator in shape-constrained LSE (in a bundled parameter problem) when ˆm is piecewise affine. Our estimators readily lead to asymptotic confidence sets for θ 0. Our methods are robust towards the choice of the tuning parameter. The proposed estimators are implemented in the R package simest.

28 28/28 References [1] Kuchibhotla, A. K. and Patra, R. K. (2016). simest: Single Index Model Estimation with Constraints on Link Function. R package version 0.2. [2] Kuchibhotla, A. K., Patra, R. K., and Sen, B. (2017). Efficient Estimation in Convex Single Index Models. arxiv.org/abs/ [3] Kuchibhotla, A. K., and Patra, R. K. (2017). Efficient estimation in single index models through smoothing splines. arxiv.org/abs/ [4] Balabdaoui, F., Durot, C. and Jankowski, H. (2016). Least squares estimation in the monotone single index model. arxiv.org/abs/ [5] Groeneboom, P. and Hendrickx, K. (2016). Current status linear regression. Annals of Statistics (Forthcoming). [6] Chen, Y. and Samworth, R. J. (2014). Generalised additive and index models with shape constraints. JRSSB. 78(4), [7] Murphy, S. A., van der Vaart, A. W. and Wellner, J. A. (1999). Current status regression. Math. Methods Statist. 8(3), Thank You! Questions?

Efficient Estimation in Single Index Models through Smoothing splines

Efficient Estimation in Single Index Models through Smoothing splines Efficient Estimation in Single Index Models through Smoothing splines Arun K. Kuchibhotla and Rohit K. Patra University of Pennsylvania and University of Florida, e-mail: arunku@upenn.edu; rohitpatra@ufl.edu

More information

SEMIPARAMETRIC INFERENCE WITH SHAPE CONSTRAINTS ROHIT KUMAR PATRA

SEMIPARAMETRIC INFERENCE WITH SHAPE CONSTRAINTS ROHIT KUMAR PATRA SEMIPARAMETRIC INFERENCE WITH SHAPE CONSTRAINTS ROHIT KUMAR PATRA Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Graduate School of Arts and Sciences

More information

OPTIMISATION CHALLENGES IN MODERN STATISTICS. Co-authors: Y. Chen, M. Cule, R. Gramacy, M. Yuan

OPTIMISATION CHALLENGES IN MODERN STATISTICS. Co-authors: Y. Chen, M. Cule, R. Gramacy, M. Yuan OPTIMISATION CHALLENGES IN MODERN STATISTICS Co-authors: Y. Chen, M. Cule, R. Gramacy, M. Yuan How do optimisation problems arise in Statistics? Let X 1,...,X n be independent and identically distributed

More information

Additive Isotonic Regression

Additive Isotonic Regression Additive Isotonic Regression Enno Mammen and Kyusang Yu 11. July 2006 INTRODUCTION: We have i.i.d. random vectors (Y 1, X 1 ),..., (Y n, X n ) with X i = (X1 i,..., X d i ) and we consider the additive

More information

Construction of PoSI Statistics 1

Construction of PoSI Statistics 1 Construction of PoSI Statistics 1 Andreas Buja and Arun Kumar Kuchibhotla Department of Statistics University of Pennsylvania September 8, 2018 WHOA-PSI 2018 1 Joint work with "Larry s Group" at Wharton,

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models

Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

More information

Efficiency of Profile/Partial Likelihood in the Cox Model

Efficiency of Profile/Partial Likelihood in the Cox Model Efficiency of Profile/Partial Likelihood in the Cox Model Yuichi Hirose School of Mathematics, Statistics and Operations Research, Victoria University of Wellington, New Zealand Summary. This paper shows

More information

VARIABLE SELECTION AND INDEPENDENT COMPONENT

VARIABLE SELECTION AND INDEPENDENT COMPONENT VARIABLE SELECTION AND INDEPENDENT COMPONENT ANALYSIS, PLUS TWO ADVERTS Richard Samworth University of Cambridge Joint work with Rajen Shah and Ming Yuan My core research interests A broad range of methodological

More information

Model Selection and Geometry

Model Selection and Geometry Model Selection and Geometry Pascal Massart Université Paris-Sud, Orsay Leipzig, February Purpose of the talk! Concentration of measure plays a fundamental role in the theory of model selection! Model

More information

High Dimensional Propensity Score Estimation via Covariate Balancing

High Dimensional Propensity Score Estimation via Covariate Balancing High Dimensional Propensity Score Estimation via Covariate Balancing Kosuke Imai Princeton University Talk at Columbia University May 13, 2017 Joint work with Yang Ning and Sida Peng Kosuke Imai (Princeton)

More information

Quantile Processes for Semi and Nonparametric Regression

Quantile Processes for Semi and Nonparametric Regression Quantile Processes for Semi and Nonparametric Regression Shih-Kang Chao Department of Statistics Purdue University IMS-APRM 2016 A joint work with Stanislav Volgushev and Guang Cheng Quantile Response

More information

Testing against a parametric regression function using ideas from shape restricted estimation

Testing against a parametric regression function using ideas from shape restricted estimation Testing against a parametric regression function using ideas from shape restricted estimation Bodhisattva Sen and Mary Meyer Columbia University, New York and Colorado State University, Fort Collins November

More information

Adaptive Piecewise Polynomial Estimation via Trend Filtering

Adaptive Piecewise Polynomial Estimation via Trend Filtering Adaptive Piecewise Polynomial Estimation via Trend Filtering Liubo Li, ShanShan Tu The Ohio State University li.2201@osu.edu, tu.162@osu.edu October 1, 2015 Liubo Li, ShanShan Tu (OSU) Trend Filtering

More information

Quantile Regression for Residual Life and Empirical Likelihood

Quantile Regression for Residual Life and Empirical Likelihood Quantile Regression for Residual Life and Empirical Likelihood Mai Zhou email: mai@ms.uky.edu Department of Statistics, University of Kentucky, Lexington, KY 40506-0027, USA Jong-Hyeon Jeong email: jeong@nsabp.pitt.edu

More information

Math 494: Mathematical Statistics

Math 494: Mathematical Statistics Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/

More information

Empirical Likelihood

Empirical Likelihood Empirical Likelihood Patrick Breheny September 20 Patrick Breheny STA 621: Nonparametric Statistics 1/15 Introduction Empirical likelihood We will discuss one final approach to constructing confidence

More information

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) =

σ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) = Until now we have always worked with likelihoods and prior distributions that were conjugate to each other, allowing the computation of the posterior distribution to be done in closed form. Unfortunately,

More information

Single Index Quantile Regression for Heteroscedastic Data

Single Index Quantile Regression for Heteroscedastic Data Single Index Quantile Regression for Heteroscedastic Data E. Christou M. G. Akritas Department of Statistics The Pennsylvania State University SMAC, November 6, 2015 E. Christou, M. G. Akritas (PSU) SIQR

More information

Estimation of a Two-component Mixture Model

Estimation of a Two-component Mixture Model Estimation of a Two-component Mixture Model Bodhisattva Sen 1,2 University of Cambridge, Cambridge, UK Columbia University, New York, USA Indian Statistical Institute, Kolkata, India 6 August, 2012 1 Joint

More information

13. Parameter Estimation. ECE 830, Spring 2014

13. Parameter Estimation. ECE 830, Spring 2014 13. Parameter Estimation ECE 830, Spring 2014 1 / 18 Primary Goal General problem statement: We observe X p(x θ), θ Θ and the goal is to determine the θ that produced X. Given a collection of observations

More information

M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010

M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010 M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010 Z-theorems: Notation and Context Suppose that Θ R k, and that Ψ n : Θ R k, random maps Ψ : Θ R k, deterministic

More information

Statistics 300B Winter 2018 Final Exam Due 24 Hours after receiving it

Statistics 300B Winter 2018 Final Exam Due 24 Hours after receiving it Statistics 300B Winter 08 Final Exam Due 4 Hours after receiving it Directions: This test is open book and open internet, but must be done without consulting other students. Any consultation of other students

More information

The Slow Convergence of OLS Estimators of α, β and Portfolio. β and Portfolio Weights under Long Memory Stochastic Volatility

The Slow Convergence of OLS Estimators of α, β and Portfolio. β and Portfolio Weights under Long Memory Stochastic Volatility The Slow Convergence of OLS Estimators of α, β and Portfolio Weights under Long Memory Stochastic Volatility New York University Stern School of Business June 21, 2018 Introduction Bivariate long memory

More information

Testing against a linear regression model using ideas from shape-restricted estimation

Testing against a linear regression model using ideas from shape-restricted estimation Testing against a linear regression model using ideas from shape-restricted estimation arxiv:1311.6849v2 [stat.me] 27 Jun 2014 Bodhisattva Sen and Mary Meyer Columbia University, New York and Colorado

More information

Uniform Post Selection Inference for LAD Regression and Other Z-estimation problems. ArXiv: Alexandre Belloni (Duke) + Kengo Kato (Tokyo)

Uniform Post Selection Inference for LAD Regression and Other Z-estimation problems. ArXiv: Alexandre Belloni (Duke) + Kengo Kato (Tokyo) Uniform Post Selection Inference for LAD Regression and Other Z-estimation problems. ArXiv: 1304.0282 Victor MIT, Economics + Center for Statistics Co-authors: Alexandre Belloni (Duke) + Kengo Kato (Tokyo)

More information

Nonparametric Inference In Functional Data

Nonparametric Inference In Functional Data Nonparametric Inference In Functional Data Zuofeng Shang Purdue University Joint work with Guang Cheng from Purdue Univ. An Example Consider the functional linear model: Y = α + where 1 0 X(t)β(t)dt +

More information

Single Index Quantile Regression for Heteroscedastic Data

Single Index Quantile Regression for Heteroscedastic Data Single Index Quantile Regression for Heteroscedastic Data E. Christou M. G. Akritas Department of Statistics The Pennsylvania State University JSM, 2015 E. Christou, M. G. Akritas (PSU) SIQR JSM, 2015

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Maximum Likelihood Estimation Assume X P θ, θ Θ, with joint pdf (or pmf) f(x θ). Suppose we observe X = x. The Likelihood function is L(θ x) = f(x θ) as a function of θ (with the data x held fixed). The

More information

Locally Robust Semiparametric Estimation

Locally Robust Semiparametric Estimation Locally Robust Semiparametric Estimation Victor Chernozhukov Juan Carlos Escanciano Hidehiko Ichimura Whitney K. Newey The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper

More information

Generalized Cp (GCp) in a Model Lean Framework

Generalized Cp (GCp) in a Model Lean Framework Generalized Cp (GCp) in a Model Lean Framework Linda Zhao University of Pennsylvania Dedicated to Lawrence Brown (1940-2018) September 9th, 2018 WHOA 3 Joint work with Larry Brown, Juhui Cai, Arun Kumar

More information

Nonparametric estimation: s concave and log-concave densities: alternatives to maximum likelihood

Nonparametric estimation: s concave and log-concave densities: alternatives to maximum likelihood Nonparametric estimation: s concave and log-concave densities: alternatives to maximum likelihood Jon A. Wellner University of Washington, Seattle Statistics Seminar, York October 15, 2015 Statistics Seminar,

More information

Bayesian estimation of the discrepancy with misspecified parametric models

Bayesian estimation of the discrepancy with misspecified parametric models Bayesian estimation of the discrepancy with misspecified parametric models Pierpaolo De Blasi University of Torino & Collegio Carlo Alberto Bayesian Nonparametrics workshop ICERM, 17-21 September 2012

More information

Estimation for two-phase designs: semiparametric models and Z theorems

Estimation for two-phase designs: semiparametric models and Z theorems Estimation for two-phase designs:semiparametric models and Z theorems p. 1/27 Estimation for two-phase designs: semiparametric models and Z theorems Jon A. Wellner University of Washington Estimation for

More information

A review of some semiparametric regression models with application to scoring

A review of some semiparametric regression models with application to scoring A review of some semiparametric regression models with application to scoring Jean-Loïc Berthet 1 and Valentin Patilea 2 1 ENSAI Campus de Ker-Lann Rue Blaise Pascal - BP 37203 35172 Bruz cedex, France

More information

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Institute of Statistics and Econometrics Georg-August-University Göttingen Department of Statistics

More information

1 One-way analysis of variance

1 One-way analysis of variance LIST OF FORMULAS (Version from 21. November 2014) STK2120 1 One-way analysis of variance Assume X ij = µ+α i +ɛ ij ; j = 1, 2,..., J i ; i = 1, 2,..., I ; where ɛ ij -s are independent and N(0, σ 2 ) distributed.

More information

Statistical Data Mining and Machine Learning Hilary Term 2016

Statistical Data Mining and Machine Learning Hilary Term 2016 Statistical Data Mining and Machine Learning Hilary Term 2016 Dino Sejdinovic Department of Statistics Oxford Slides and other materials available at: http://www.stats.ox.ac.uk/~sejdinov/sdmml Naïve Bayes

More information

The outline for Unit 3

The outline for Unit 3 The outline for Unit 3 Unit 1. Introduction: The regression model. Unit 2. Estimation principles. Unit 3: Hypothesis testing principles. 3.1 Wald test. 3.2 Lagrange Multiplier. 3.3 Likelihood Ratio Test.

More information

Nonparametric estimation of. s concave and log-concave densities: alternatives to maximum likelihood

Nonparametric estimation of. s concave and log-concave densities: alternatives to maximum likelihood Nonparametric estimation of s concave and log-concave densities: alternatives to maximum likelihood Jon A. Wellner University of Washington, Seattle Cowles Foundation Seminar, Yale University November

More information

Semiparametric Gaussian Copula Models: Progress and Problems

Semiparametric Gaussian Copula Models: Progress and Problems Semiparametric Gaussian Copula Models: Progress and Problems Jon A. Wellner University of Washington, Seattle European Meeting of Statisticians, Amsterdam July 6-10, 2015 EMS Meeting, Amsterdam Based on

More information

Robust estimation, efficiency, and Lasso debiasing

Robust estimation, efficiency, and Lasso debiasing Robust estimation, efficiency, and Lasso debiasing Po-Ling Loh University of Wisconsin - Madison Departments of ECE & Statistics WHOA-PSI workshop Washington University in St. Louis Aug 12, 2017 Po-Ling

More information

Regression, Ridge Regression, Lasso

Regression, Ridge Regression, Lasso Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.

More information

Bayesian Sparse Linear Regression with Unknown Symmetric Error

Bayesian Sparse Linear Regression with Unknown Symmetric Error Bayesian Sparse Linear Regression with Unknown Symmetric Error Minwoo Chae 1 Joint work with Lizhen Lin 2 David B. Dunson 3 1 Department of Mathematics, The University of Texas at Austin 2 Department of

More information

Minimax Estimation of a nonlinear functional on a structured high-dimensional model

Minimax Estimation of a nonlinear functional on a structured high-dimensional model Minimax Estimation of a nonlinear functional on a structured high-dimensional model Eric Tchetgen Tchetgen Professor of Biostatistics and Epidemiologic Methods, Harvard U. (Minimax ) 1 / 38 Outline Heuristics

More information

AGEC 661 Note Eleven Ximing Wu. Exponential regression model: m (x, θ) = exp (xθ) for y 0

AGEC 661 Note Eleven Ximing Wu. Exponential regression model: m (x, θ) = exp (xθ) for y 0 AGEC 661 ote Eleven Ximing Wu M-estimator So far we ve focused on linear models, where the estimators have a closed form solution. If the population model is nonlinear, the estimators often do not have

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 24 Paper 153 A Note on Empirical Likelihood Inference of Residual Life Regression Ying Qing Chen Yichuan

More information

On Efficiency of the Plug-in Principle for Estimating Smooth Integrated Functionals of a Nonincreasing Density

On Efficiency of the Plug-in Principle for Estimating Smooth Integrated Functionals of a Nonincreasing Density On Efficiency of the Plug-in Principle for Estimating Smooth Integrated Functionals of a Nonincreasing Density arxiv:188.7915v2 [math.st] 12 Feb 219 Rajarshi Mukherjee and Bodhisattva Sen Department of

More information

Concentration behavior of the penalized least squares estimator

Concentration behavior of the penalized least squares estimator Concentration behavior of the penalized least squares estimator Penalized least squares behavior arxiv:1511.08698v2 [math.st] 19 Oct 2016 Alan Muro and Sara van de Geer {muro,geer}@stat.math.ethz.ch Seminar

More information

Regularization in Cox Frailty Models

Regularization in Cox Frailty Models Regularization in Cox Frailty Models Andreas Groll 1, Trevor Hastie 2, Gerhard Tutz 3 1 Ludwig-Maximilians-Universität Munich, Department of Mathematics, Theresienstraße 39, 80333 Munich, Germany 2 University

More information

Likelihood-Based Methods

Likelihood-Based Methods Likelihood-Based Methods Handbook of Spatial Statistics, Chapter 4 Susheela Singh September 22, 2016 OVERVIEW INTRODUCTION MAXIMUM LIKELIHOOD ESTIMATION (ML) RESTRICTED MAXIMUM LIKELIHOOD ESTIMATION (REML)

More information

Introduction to Estimation Methods for Time Series models Lecture 2

Introduction to Estimation Methods for Time Series models Lecture 2 Introduction to Estimation Methods for Time Series models Lecture 2 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 2 SNS Pisa 1 / 21 Estimators:

More information

Inference For High Dimensional M-estimates. Fixed Design Results

Inference For High Dimensional M-estimates. Fixed Design Results : Fixed Design Results Lihua Lei Advisors: Peter J. Bickel, Michael I. Jordan joint work with Peter J. Bickel and Noureddine El Karoui Dec. 8, 2016 1/57 Table of Contents 1 Background 2 Main Results and

More information

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics Jiti Gao Department of Statistics School of Mathematics and Statistics The University of Western Australia Crawley

More information

Statistics: Learning models from data

Statistics: Learning models from data DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Maximum Likelihood Estimation Guy Lebanon February 19, 2011 Maximum likelihood estimation is the most popular general purpose method for obtaining estimating a distribution from a finite sample. It was

More information

3.0.1 Multivariate version and tensor product of experiments

3.0.1 Multivariate version and tensor product of experiments ECE598: Information-theoretic methods in high-dimensional statistics Spring 2016 Lecture 3: Minimax risk of GLM and four extensions Lecturer: Yihong Wu Scribe: Ashok Vardhan, Jan 28, 2016 [Ed. Mar 24]

More information

Lecture 3 September 1

Lecture 3 September 1 STAT 383C: Statistical Modeling I Fall 2016 Lecture 3 September 1 Lecturer: Purnamrita Sarkar Scribe: Giorgio Paulon, Carlos Zanini Disclaimer: These scribe notes have been slightly proofread and may have

More information

CURRENT STATUS LINEAR REGRESSION. By Piet Groeneboom and Kim Hendrickx Delft University of Technology and Hasselt University

CURRENT STATUS LINEAR REGRESSION. By Piet Groeneboom and Kim Hendrickx Delft University of Technology and Hasselt University CURRENT STATUS LINEAR REGRESSION By Piet Groeneboom and Kim Hendrickx Delft University of Technology and Hasselt University We construct n-consistent and asymptotically normal estimates for the finite

More information

Nonparametric estimation of log-concave densities

Nonparametric estimation of log-concave densities Nonparametric estimation of log-concave densities Jon A. Wellner University of Washington, Seattle Seminaire, Institut de Mathématiques de Toulouse 5 March 2012 Seminaire, Toulouse Based on joint work

More information

Using Estimating Equations for Spatially Correlated A

Using Estimating Equations for Spatially Correlated A Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship

More information

Bootstrap and Parametric Inference: Successes and Challenges

Bootstrap and Parametric Inference: Successes and Challenges Bootstrap and Parametric Inference: Successes and Challenges G. Alastair Young Department of Mathematics Imperial College London Newton Institute, January 2008 Overview Overview Review key aspects of frequentist

More information

Semiparametric Gaussian Copula Models: Progress and Problems

Semiparametric Gaussian Copula Models: Progress and Problems Semiparametric Gaussian Copula Models: Progress and Problems Jon A. Wellner University of Washington, Seattle 2015 IMS China, Kunming July 1-4, 2015 2015 IMS China Meeting, Kunming Based on joint work

More information

The International Journal of Biostatistics

The International Journal of Biostatistics The International Journal of Biostatistics Volume 1, Issue 1 2005 Article 3 Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics Moulinath Banerjee Jon A. Wellner

More information

Density estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas

Density estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas 0 0 5 Motivation: Regression discontinuity (Angrist&Pischke) Outcome.5 1 1.5 A. Linear E[Y 0i X i] 0.2.4.6.8 1 X Outcome.5 1 1.5 B. Nonlinear E[Y 0i X i] i 0.2.4.6.8 1 X utcome.5 1 1.5 C. Nonlinearity

More information

Integrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University

Integrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University Integrated Likelihood Estimation in Semiparametric Regression Models Thomas A. Severini Department of Statistics Northwestern University Joint work with Heping He, University of York Introduction Let Y

More information

Long-Run Covariability

Long-Run Covariability Long-Run Covariability Ulrich K. Müller and Mark W. Watson Princeton University October 2016 Motivation Study the long-run covariability/relationship between economic variables great ratios, long-run Phillips

More information

Confidence Intervals for Low-dimensional Parameters with High-dimensional Data

Confidence Intervals for Low-dimensional Parameters with High-dimensional Data Confidence Intervals for Low-dimensional Parameters with High-dimensional Data Cun-Hui Zhang and Stephanie S. Zhang Rutgers University and Columbia University September 14, 2012 Outline Introduction Methodology

More information

Likelihood Based Inference for Monotone Response Models

Likelihood Based Inference for Monotone Response Models Likelihood Based Inference for Monotone Response Models Moulinath Banerjee University of Michigan September 11, 2006 Abstract The behavior of maximum likelihood estimates (MLEs) and the likelihood ratio

More information

Various types of likelihood

Various types of likelihood Various types of likelihood 1. likelihood, marginal likelihood, conditional likelihood, profile likelihood, adjusted profile likelihood 2. semi-parametric likelihood, partial likelihood 3. empirical likelihood,

More information

Generated Covariates in Nonparametric Estimation: A Short Review.

Generated Covariates in Nonparametric Estimation: A Short Review. Generated Covariates in Nonparametric Estimation: A Short Review. Enno Mammen, Christoph Rothe, and Melanie Schienle Abstract In many applications, covariates are not observed but have to be estimated

More information

Inference For High Dimensional M-estimates: Fixed Design Results

Inference For High Dimensional M-estimates: Fixed Design Results Inference For High Dimensional M-estimates: Fixed Design Results Lihua Lei, Peter Bickel and Noureddine El Karoui Department of Statistics, UC Berkeley Berkeley-Stanford Econometrics Jamboree, 2017 1/49

More information

A Semiparametric Binary Regression Model Involving Monotonicity Constraints

A Semiparametric Binary Regression Model Involving Monotonicity Constraints doi: 10.1111/j.1467-9469.2006.00499.x Published by Blackwell Publishing Ltd, 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA 02148, USA, 2006 A Semiparametric Binary Regression

More information

Likelihood Based Inference for Monotone Response Models

Likelihood Based Inference for Monotone Response Models Likelihood Based Inference for Monotone Response Models Moulinath Banerjee University of Michigan September 5, 25 Abstract The behavior of maximum likelihood estimates (MLE s) the likelihood ratio statistic

More information

Shape constrained estimation: a brief introduction

Shape constrained estimation: a brief introduction Shape constrained estimation: a brief introduction Jon A. Wellner University of Washington, Seattle Cowles Foundation, Yale November 7, 2016 Econometrics lunch seminar, Yale Based on joint work with: Fadoua

More information

Least squares estimation in the monotone single index model

Least squares estimation in the monotone single index model Submitted to Bernoulli ariv:1610.06026v2 [math.st] 18 Apr 2018 Least squares estimation in the monotone single index model FADOUA BALABDAOUI 1,2, CÉCILE DUROT3 and HANNA JANKOWSKI 4 1 Université Paris-Dauphine,

More information

Semiparametric posterior limits

Semiparametric posterior limits Statistics Department, Seoul National University, Korea, 2012 Semiparametric posterior limits for regular and some irregular problems Bas Kleijn, KdV Institute, University of Amsterdam Based on collaborations

More information

Modeling Real Estate Data using Quantile Regression

Modeling Real Estate Data using Quantile Regression Modeling Real Estate Data using Semiparametric Quantile Regression Department of Statistics University of Innsbruck September 9th, 2011 Overview 1 Application: 2 3 4 Hedonic regression data for house prices

More information

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Quantile methods Class Notes Manuel Arellano December 1, 2009 1 Unconditional quantiles Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Q τ (Y ) q τ F 1 (τ) =inf{r : F

More information

Independent and conditionally independent counterfactual distributions

Independent and conditionally independent counterfactual distributions Independent and conditionally independent counterfactual distributions Marcin Wolski European Investment Bank M.Wolski@eib.org Society for Nonlinear Dynamics and Econometrics Tokyo March 19, 2018 Views

More information

Advanced Econometrics

Advanced Econometrics Advanced Econometrics Dr. Andrea Beccarini Center for Quantitative Economics Winter 2013/2014 Andrea Beccarini (CQE) Econometrics Winter 2013/2014 1 / 156 General information Aims and prerequisites Objective:

More information

ASSESSING A VECTOR PARAMETER

ASSESSING A VECTOR PARAMETER SUMMARY ASSESSING A VECTOR PARAMETER By D.A.S. Fraser and N. Reid Department of Statistics, University of Toronto St. George Street, Toronto, Canada M5S 3G3 dfraser@utstat.toronto.edu Some key words. Ancillary;

More information

SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES. Liping Zhu, Mian Huang, & Runze Li. The Pennsylvania State University

SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES. Liping Zhu, Mian Huang, & Runze Li. The Pennsylvania State University SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES Liping Zhu, Mian Huang, & Runze Li The Pennsylvania State University Technical Report Series #10-104 College of Health and Human Development

More information

Maximum likelihood: counterexamples, examples, and open problems. Jon A. Wellner. University of Washington. Maximum likelihood: p.

Maximum likelihood: counterexamples, examples, and open problems. Jon A. Wellner. University of Washington. Maximum likelihood: p. Maximum likelihood: counterexamples, examples, and open problems Jon A. Wellner University of Washington Maximum likelihood: p. 1/7 Talk at University of Idaho, Department of Mathematics, September 15,

More information

Program Evaluation with High-Dimensional Data

Program Evaluation with High-Dimensional Data Program Evaluation with High-Dimensional Data Alexandre Belloni Duke Victor Chernozhukov MIT Iván Fernández-Val BU Christian Hansen Booth ESWC 215 August 17, 215 Introduction Goal is to perform inference

More information

Ensemble estimation and variable selection with semiparametric regression models

Ensemble estimation and variable selection with semiparametric regression models Ensemble estimation and variable selection with semiparametric regression models Sunyoung Shin Department of Mathematical Sciences University of Texas at Dallas Joint work with Jason Fine, Yufeng Liu,

More information

Recap. Vector observation: Y f (y; θ), Y Y R m, θ R d. sample of independent vectors y 1,..., y n. pairwise log-likelihood n m. weights are often 1

Recap. Vector observation: Y f (y; θ), Y Y R m, θ R d. sample of independent vectors y 1,..., y n. pairwise log-likelihood n m. weights are often 1 Recap Vector observation: Y f (y; θ), Y Y R m, θ R d sample of independent vectors y 1,..., y n pairwise log-likelihood n m i=1 r=1 s>r w rs log f 2 (y ir, y is ; θ) weights are often 1 more generally,

More information

1.1 Basis of Statistical Decision Theory

1.1 Basis of Statistical Decision Theory ECE598: Information-theoretic methods in high-dimensional statistics Spring 2016 Lecture 1: Introduction Lecturer: Yihong Wu Scribe: AmirEmad Ghassami, Jan 21, 2016 [Ed. Jan 31] Outline: Introduction of

More information

Quantifying Stochastic Model Errors via Robust Optimization

Quantifying Stochastic Model Errors via Robust Optimization Quantifying Stochastic Model Errors via Robust Optimization IPAM Workshop on Uncertainty Quantification for Multiscale Stochastic Systems and Applications Jan 19, 2016 Henry Lam Industrial & Operations

More information

Likelihood based Statistical Inference. Dottorato in Economia e Finanza Dipartimento di Scienze Economiche Univ. di Verona

Likelihood based Statistical Inference. Dottorato in Economia e Finanza Dipartimento di Scienze Economiche Univ. di Verona Likelihood based Statistical Inference Dottorato in Economia e Finanza Dipartimento di Scienze Economiche Univ. di Verona L. Pace, A. Salvan, N. Sartori Udine, April 2008 Likelihood: observed quantities,

More information

Better Bootstrap Confidence Intervals

Better Bootstrap Confidence Intervals by Bradley Efron University of Washington, Department of Statistics April 12, 2012 An example Suppose we wish to make inference on some parameter θ T (F ) (e.g. θ = E F X ), based on data We might suppose

More information

A Least-Squares Approach to Consistent Information Estimation in Semiparametric Models

A Least-Squares Approach to Consistent Information Estimation in Semiparametric Models A Least-Squares Approach to Consistent Information Estimation in Semiparametric Models Jian Huang Department of Statistics University of Iowa Ying Zhang and Lei Hua Department of Biostatistics University

More information

Chapter 2 Inference on Mean Residual Life-Overview

Chapter 2 Inference on Mean Residual Life-Overview Chapter 2 Inference on Mean Residual Life-Overview Statistical inference based on the remaining lifetimes would be intuitively more appealing than the popular hazard function defined as the risk of immediate

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 13: Entropy Calculations

Introduction to Empirical Processes and Semiparametric Inference Lecture 13: Entropy Calculations Introduction to Empirical Processes and Semiparametric Inference Lecture 13: Entropy Calculations Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research

More information

Can we do statistical inference in a non-asymptotic way? 1

Can we do statistical inference in a non-asymptotic way? 1 Can we do statistical inference in a non-asymptotic way? 1 Guang Cheng 2 Statistics@Purdue www.science.purdue.edu/bigdata/ ONR Review Meeting@Duke Oct 11, 2017 1 Acknowledge NSF, ONR and Simons Foundation.

More information

Inference on distributions and quantiles using a finite-sample Dirichlet process

Inference on distributions and quantiles using a finite-sample Dirichlet process Dirichlet IDEAL Theory/methods Simulations Inference on distributions and quantiles using a finite-sample Dirichlet process David M. Kaplan University of Missouri Matt Goldman UC San Diego Midwest Econometrics

More information

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Institute of Statistics

More information

Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions

Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions Vadim Marmer University of British Columbia Artyom Shneyerov CIRANO, CIREQ, and Concordia University August 30, 2010 Abstract

More information

arxiv: v1 [stat.me] 28 May 2018

arxiv: v1 [stat.me] 28 May 2018 High-Dimensional Statistical Inferences with Over-identification: Confidence Set Estimation and Specification Test Jinyuan Chang, Cheng Yong Tang, Tong Tong Wu arxiv:1805.10742v1 [stat.me] 28 May 2018

More information

University of California San Diego and Stanford University and

University of California San Diego and Stanford University and First International Workshop on Functional and Operatorial Statistics. Toulouse, June 19-21, 2008 K-sample Subsampling Dimitris N. olitis andjoseph.romano University of California San Diego and Stanford

More information

Learning Mixtures of Truncated Basis Functions from Data

Learning Mixtures of Truncated Basis Functions from Data Learning Mixtures of Truncated Basis Functions from Data Helge Langseth, Thomas D. Nielsen, and Antonio Salmerón PGM This work is supported by an Abel grant from Iceland, Liechtenstein, and Norway through

More information