Nonparametric Inference In Functional Data

1 Nonparametric Inference In Functional Data Zuofeng Shang Purdue University Joint work with Guang Cheng from Purdue Univ.

2 An Example Consider the functional linear model: $Y = \alpha + \int_0^1 X(t)\beta(t)\,dt + \epsilon$, where $\beta \in W_2^m(0,1)$, the Sobolev space of order $m$; $X$ is a random process; $\epsilon$ is a zero-mean error.
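
In practice $X$ is observed on a grid, and the integral $\int_0^1 X(t)\beta(t)\,dt$ is approximated numerically. A minimal sketch of that building block (the grid size and the particular choices of $X$, $\beta$, and $\alpha$ below are illustrative assumptions, not part of the talk):

    import numpy as np

    rng = np.random.default_rng(0)
    t = np.linspace(0, 1, 201)                       # hypothetical observation grid on (0, 1)

    # Illustrative covariate path (a rough Brownian-like path) and slope function.
    X = np.cumsum(rng.normal(scale=np.sqrt(np.diff(t, prepend=0.0))))
    beta = np.sin(2 * np.pi * t)

    # Trapezoidal approximation of the functional inner product int_0^1 X(t) beta(t) dt.
    inner = np.trapz(X * beta, t)
    alpha = 0.5
    Y = alpha + inner + rng.normal()                 # one draw from the functional linear model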

3 General Aim In this talk, we address the following questions in a unified framework: how to construct a confidence interval for the regression mean $\mu = \alpha + \int_0^1 x(t)\beta(t)\,dt$? how to construct a prediction interval for $Y_{\mathrm{future}}$? how to test $H_0: \beta = \beta_0$ versus $H_1: \beta \neq \beta_0$?

7 Literature Review The existing methods for inference rely on functional principal component analysis (FPCA), which requires the covariance kernel and the reproducing kernel to share common ordered eigenfunctions, i.e., to be perfectly aligned; Müller and Stadtmüller (2005), Cai and Hall (2006), Hall and Horowitz (2007), etc. There is a lack of unified treatment for various inference problems such as confidence/prediction interval construction, (adaptive) hypothesis testing, and functional contrast testing.

9 Model and Assumptions: Model: $Y_i = \alpha + \int_0^1 X_i(t)\beta(t)\,dt + \epsilon_i$, where $(Y_1, X_1), \ldots, (Y_n, X_n)$ are iid samples and $E\{\epsilon_i\} = 0$, $E\{\epsilon_i^2\} = 1$. Functional parameter: $\beta \in W_2^m(0,1)$, the $m$-th order Sobolev space. Covariance function: $C(s,t) = E\{X(s)X(t)\}$ satisfies $\int_0^1 C(s,t)\beta(s)\,ds = 0$ (for all $t$) only if $\beta = 0$.

12 FPCA Estimation Sample covariance function: $\hat C(s,t) = \frac{1}{n}\sum_{i=1}^n (X_i(s) - \bar X(s))(X_i(t) - \bar X(t))$. Karhunen–Loève decompositions: $C(s,t) = \sum_{k=1}^\infty \lambda_k \psi_k(s)\psi_k(t)$ with $\lambda_1 \ge \lambda_2 \ge \cdots$, and $\hat C(s,t) = \sum_{k=1}^\infty \hat\lambda_k \hat\psi_k(s)\hat\psi_k(t)$ with $\hat\lambda_1 \ge \hat\lambda_2 \ge \cdots$. Estimate $\beta$ by $\hat\beta = \hat b_1 \hat\psi_1 + \hat b_2 \hat\psi_2 + \cdots + \hat b_{k_n}\hat\psi_{k_n}$, where the $\hat b_j$ are estimated basis coefficients.
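
A minimal numerical sketch of the FPCA estimator on a common, equally spaced grid. The grid, the truncation level $k_n$, and the use of least squares on the FPC scores to obtain the $\hat b_j$ are illustrative assumptions:

    import numpy as np

    def fpca_estimate(Xmat, Y, t, kn=5):
        """FPCA estimate of (alpha, beta) from curves Xmat (n x len(t)) and responses Y."""
        n, T = Xmat.shape
        dt = t[1] - t[0]                          # equally spaced grid assumed
        Xc = Xmat - Xmat.mean(axis=0)             # centered curves
        Chat = Xc.T @ Xc / n                      # sample covariance C_hat(s, t) on the grid
        # Eigendecomposition of the discretized covariance operator.
        vals, vecs = np.linalg.eigh(Chat * dt)
        order = np.argsort(vals)[::-1]
        lam_hat = vals[order][:kn]
        psi_hat = vecs[:, order][:, :kn] / np.sqrt(dt)   # scaled so that int psi_k(t)^2 dt = 1
        # FPC scores int (X_i - Xbar)(t) psi_j(t) dt, then regress Y on the leading scores.
        scores = Xc @ psi_hat * dt
        alpha_hat = Y.mean()
        b_hat = np.linalg.lstsq(scores, Y - alpha_hat, rcond=None)[0]
        beta_hat = psi_hat @ b_hat                # beta_hat = sum_j b_j psi_j on the grid
        return alpha_hat, beta_hat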

15 Penalized Estimation: $(\hat\alpha, \hat\beta) = \arg\min_{\alpha \in \mathbb{R},\, \beta \in W_2^m(0,1)} \ell_{n,\lambda}(\alpha, \beta)$, where $\ell_{n,\lambda}(\alpha, \beta) = \frac{1}{2n}\sum_{i=1}^n \Big(Y_i - \alpha - \int_0^1 X_i(t)\beta(t)\,dt\Big)^2 + \frac{\lambda}{2}\int_0^1 \beta^{(m)}(t)^2\,dt$.
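
A minimal sketch of this criterion on a grid, assuming $m = 2$ and approximating $\int_0^1 \beta''(t)^2\,dt$ by a second-difference penalty; the grid, the difference-matrix penalty, and the fixed $\lambda$ are illustrative assumptions (a careful implementation would use a smoothing-spline or RKHS representer and choose $\lambda$ by cross-validation or GCV):

    import numpy as np

    def penalized_estimate(Xmat, Y, t, lam=1e-4):
        """Minimize (2n)^{-1} sum_i (Y_i - a - int X_i beta)^2 + (lam/2) int (beta'')^2,
           with beta represented by its values on the grid t."""
        n, T = Xmat.shape
        dt = t[1] - t[0]
        A = np.column_stack([np.ones(n), Xmat * dt])   # design for (alpha, beta(t_1), ..., beta(t_T))
        D = np.diff(np.eye(T), n=2, axis=0) / dt**2    # second-difference operator (m = 2)
        P = np.zeros((T + 1, T + 1))
        P[1:, 1:] = D.T @ D * dt                       # discretized int beta''(t)^2 dt
        # Normal equations of the penalized least-squares criterion.
        coef = np.linalg.solve(A.T @ A / n + lam * P, A.T @ Y / n)
        return coef[0], coef[1:]                       # (alpha_hat, beta_hat on the grid)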

16 Advantage of Penalized Estimation No perfect alignment assumption. Provides a unified framework for inference. Nonparametric inference is easy within the regularization framework. Better estimation performance.

20 A Graphical Comparison of FPCA and Penalized Estimation: Cai and Yuan (2012). $k_0$ controls the alignment between the covariance and reproducing kernels; a larger value of $k_0$ yields more misalignment. [Figure: excess risk versus $n$ for roughness regularization and functional PCA, and their relative efficiency, for $k_0 = 5, 10, 15, 20$.]

21 Assumption: Simultaneous Diagonalization There exist functions $\varphi_\nu$ and a nondecreasing sequence $\rho_\nu \asymp \nu^{2k}$ for some $k > 0$ such that for any $\nu, \mu \ge 1$, $\int_0^1\!\int_0^1 C(s,t)\varphi_\nu(s)\varphi_\mu(t)\,ds\,dt = \delta_{\nu\mu}$ and $\int_0^1 \varphi_\nu^{(m)}(t)\varphi_\mu^{(m)}(t)\,dt = \rho_\nu \delta_{\nu\mu}$. Furthermore, any $\beta \in W_2^m(0,1)$ satisfies $\beta = \sum_\nu b_\nu \varphi_\nu$ for some real sequence $b_\nu$.
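
On a grid, a pair $(\varphi_\nu, \rho_\nu)$ satisfying both orthogonality relations can be obtained from a generalized eigenproblem between the discretized covariance kernel and the discretized derivative penalty. A minimal sketch, assuming $m = 2$, an equally spaced grid, a second-difference penalty, and a small ridge added for numerical positive definiteness (all illustrative assumptions):

    import numpy as np
    from scipy.linalg import eigh

    def simultaneous_diagonalization(C, t):
        """Return (rho, phi) with phi.T @ (C * dt^2) @ phi = I (first relation, discretized)
           and phi.T @ J @ phi = diag(rho), where J discretizes int phi'' psi'' dt."""
        T = len(t)
        dt = t[1] - t[0]
        D = np.diff(np.eye(T), n=2, axis=0) / dt**2
        J = D.T @ D * dt                               # penalty matrix for m = 2
        B = C * dt**2 + 1e-10 * np.eye(T)              # discretized covariance bilinear form
        # eigh solves J phi = rho B phi and normalizes phi.T @ B @ phi = I.
        rho, phi = eigh(J, B)
        return rho, phi

The columns of phi then play the role of grid values of $\varphi_\nu$, and rho of $\rho_\nu$, in the interval formulas below.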

22 Construction of CI Let $\mu_0 = \alpha + \int_0^1 x_0(t)\beta(t)\,dt$ be the regression mean at $X = x_0$. The 95% confidence interval for $\mu_0$ is $\mathrm{CI}: \hat\mu_0 \pm 1.96\,\sigma_n/\sqrt{n}$, where $\hat\mu_0 = \hat\alpha + \int_0^1 x_0(t)\hat\beta(t)\,dt$, $\sigma_n^2 = 1 + \sum_\nu \frac{x_\nu^2}{1+\lambda\rho_\nu}$, and $x_\nu = \int_0^1 x_0(t)\varphi_\nu(t)\,dt$.

23 Construction of PI Let $Y_0$ be a future response generated from $Y_0 = \mu_0 + \epsilon$; then the 95% prediction interval for $Y_0$ is $\mathrm{PI}: \hat\mu_0 \pm 1.96\sqrt{1 + \sigma_n^2/n}$.
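
A minimal sketch of the interval constructions above, assuming the pairs $(\varphi_\nu, \rho_\nu)$, the tuning parameter $\lambda$, and the point estimate $\hat\mu_0$ are already available (for instance from the sketches above); the grid and the truncation of the sum over $\nu$ are illustrative assumptions, and the prediction-interval half-width follows the form $1.96\sqrt{1+\sigma_n^2/n}$, consistent with $\mathrm{Var}(\epsilon) = 1$:

    import numpy as np

    def ci_and_pi(mu0_hat, x0, phi, rho, lam, n, t, z=1.96):
        """95% CI for mu_0 and PI for Y_0; phi[:, v] holds phi_v on the grid t, rho[v] holds rho_v."""
        dt = t[1] - t[0]
        x_nu = phi.T @ x0 * dt                              # x_nu = int x_0(t) phi_nu(t) dt
        sigma2_n = 1.0 + np.sum(x_nu**2 / (1.0 + lam * rho))
        half_ci = z * np.sqrt(sigma2_n / n)
        half_pi = z * np.sqrt(1.0 + sigma2_n / n)
        return (mu0_hat - half_ci, mu0_hat + half_ci), (mu0_hat - half_pi, mu0_hat + half_pi)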

24 Theoretical Validity Theorem If $\epsilon$ is sub-exponential, the true function $\beta_0$ is suitably smooth, and $\lambda$ is properly tuned, e.g., $\lambda \asymp n^{-k/(2k+1)}$, then as $n \to \infty$, $P(\mu_0 \in \mathrm{CI}) \to 0.95$ and $P(Y_0 \in \mathrm{PI}) \to 0.95$.

25 Penalized Likelihood Ratio Test Testing hypotheses $H_0: \alpha = \alpha_0, \beta = \beta_0$ versus $H_1: H_0$ is not true. Define the penalized likelihood ratio test (PLRT) statistic $\mathrm{PLRT}_n = \ell_{n,\lambda}(\alpha_0, \beta_0) - \ell_{n,\lambda}(\hat\alpha, \hat\beta)$, where $(\hat\alpha, \hat\beta)$ is the penalized MLE.

26 Wilks Phenomenon The Wilks phenomenon means that the null limit distribution of the likelihood ratio is free of any nuisance parameters and of the design distribution. Theorem Suppose $H_0$ holds, $E\{\epsilon^4\} < \infty$, and $\lambda$ is suitably tuned, e.g., $\lambda \asymp n^{-4k/(4k+1)}$. Then $2n\sigma^2\,\mathrm{PLRT}_n \xrightarrow{d} \chi^2_{u_n}$, where $\sigma^2 = \dfrac{\int_0^1 (1+x^{2k})^{-1}\,dx}{\int_0^1 (1+x^{2k})^{-2}\,dx}$, $u_n = \dfrac{1}{c\,\lambda^{1/(2k)}} \cdot \dfrac{\big(\int_0^1 (1+x^{2k})^{-1}\,dx\big)^2}{\int_0^1 (1+x^{2k})^{-2}\,dx}$, and $c$ is a constant free of $\alpha_0$, $\beta_0$, and the distribution of $X$.
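
A minimal sketch of the test statistic and its Wilks-type calibration, mirroring the grid-based criterion from the penalized-estimation sketch; the plug-in values of $\sigma^2$ and $u_n$ (computed from the formulas above) are inputs here, and the $m = 2$ second-difference penalty is an illustrative assumption:

    import numpy as np
    from scipy.stats import chi2

    def criterion(alpha, beta, Xmat, Y, t, lam):
        """l_{n,lambda}(alpha, beta) with a second-difference roughness penalty (m = 2)."""
        n, T = Xmat.shape
        dt = t[1] - t[0]
        resid = Y - alpha - Xmat @ beta * dt
        D = np.diff(np.eye(T), n=2, axis=0) / dt**2
        return np.sum(resid**2) / (2 * n) + 0.5 * lam * np.sum((D @ beta)**2) * dt

    def plrt(alpha0, beta0, alpha_hat, beta_hat, Xmat, Y, t, lam, sigma2, u_n):
        """PLRT of H0: alpha = alpha0, beta = beta0, calibrated by 2 n sigma^2 PLRT_n ~ chi^2_{u_n}."""
        n = len(Y)
        stat = criterion(alpha0, beta0, Xmat, Y, t, lam) \
             - criterion(alpha_hat, beta_hat, Xmat, Y, t, lam)
        pval = chi2.sf(2 * n * sigma2 * stat, df=u_n)
        return stat, pval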

27 Minimax Property of PLRT Suppose we want to test $H_0: \beta = 0$, but the following local alternative hypothesis is true: $H_{1n}: \beta = \beta_n$, where $\beta_n$ satisfies $\|\beta_n\|_{L^2} \ge c\,n^{-2k/(4k+1)}$. Theorem For arbitrary $\varepsilon > 0$, there exists $c > 0$ such that for any $n \ge 1$: $\inf_{\beta_n \in W_2^m(0,1):\, \|\beta_n\|_{L^2} \ge c\,n^{-2k/(4k+1)}} P_{\beta_n}(\text{reject } H_0) \ge 1 - \varepsilon$.

28 An Example: Standard Brownian Motion When $m = 2$ (cubic spline) and $X$ is Brownian motion with covariance function $C(s,t) = \min\{s,t\}$, $s,t \in (0,1)$, we have $\sigma^2 \approx 1.08$ and $u_n \approx 0.31\,\lambda^{-1/6}$. Therefore, $2n(1.08)\,\mathrm{PLRT}_n \xrightarrow{d} \chi^2_{u_n}$.
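
The constants in this example can be checked by numerically evaluating the integrals in the theorem; a small sketch, assuming $k = 3$, which matches the $\lambda^{-1/6} = \lambda^{-1/(2k)}$ rate shown for $u_n$:

    import numpy as np
    from scipy.integrate import quad

    k = 3
    I1 = quad(lambda x: (1 + x**(2 * k)) ** (-1.0), 0, 1)[0]
    I2 = quad(lambda x: (1 + x**(2 * k)) ** (-2.0), 0, 1)[0]
    print(I1 / I2)       # sigma^2, approximately 1.08
    print(I1**2 / I2)    # the integral factor in u_n, approximately 0.98 (then scaled by 1/(c lambda^{1/6}))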

29 An Adaptive Testing Procedure Based on Likelihood Ratio If the smoothness degrees of both X and β are unknown, how well can we do? We will propose a testing procedure adaptive to these smoothness degrees and show that our procedure achieves the minimax rate of testing.

30 Let $\mathrm{PLRT}(k)$ be the penalized likelihood ratio test associated with $k$, and $\tau_k = \dfrac{\mathrm{PLRT}(k) - E\{\mathrm{PLRT}(k)\}}{\sqrt{\mathrm{Var}(\mathrm{PLRT}(k))}}$, $k = 1, 2, \ldots, k_n$. Define $\mathrm{AT} = B_n\big(\max_{1 \le k \le k_n} \tau_k - B_n\big)$, where $B_n$ satisfies $2\pi B_n^2 \exp(B_n^2) = k_n^2$.
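
A minimal sketch of the adaptive statistic, assuming the values $\mathrm{PLRT}(k)$ together with estimates of their null means and variances are supplied for $k = 1, \ldots, k_n$ (how those null moments are obtained is not shown and is an assumption of the sketch):

    import numpy as np

    def adaptive_test(plrt_vals, null_means, null_vars, gamma=0.05):
        """AT = B_n (max_k tau_k - B_n) with 2 pi B_n^2 exp(B_n^2) = k_n^2; reject if AT > c_gamma."""
        kn = len(plrt_vals)
        tau = (np.asarray(plrt_vals) - np.asarray(null_means)) / np.sqrt(null_vars)
        # Solve 2*pi*B^2*exp(B^2) = kn^2 for B_n by bisection (the left side is increasing in B).
        lo, hi = 1e-6, 10.0
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if 2 * np.pi * mid**2 * np.exp(mid**2) < kn**2:
                lo = mid
            else:
                hi = mid
        B_n = 0.5 * (lo + hi)
        AT = B_n * (tau.max() - B_n)
        c_gamma = -np.log(-np.log(1.0 - gamma))    # Gumbel-type critical value
        return AT, AT > c_gamma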

31 Size of the Test A valid test should achieve the correct size. Theorem Under $H_0: \beta = 0$, if $k_n \asymp (\log n)^{d_0}$ for some constant $d_0 \in (0, 1/2)$, then for any $\gamma \in (0,1)$, $P(\mathrm{AT} \le c_\gamma) \to 1 - \gamma$ as $n \to \infty$, where $c_\gamma = -\log(-\log(1-\gamma))$.

32 Adaptive Minimax Rate Suppose $k^*$ is the true value of $k$. Let $\delta(n, k^*) = n^{-2k^*/(4k^*+1)} (\log\log n)^{k^*/(4k^*+1)}$. Theorem Suppose $k_n \asymp (\log n)^{d_0}$ for some constant $d_0 \in (0, 1/2)$. Then, for any $\varepsilon \in (0,1)$, there exists $c > 0$ s.t. for any $n \ge 1$, $\inf_{\|\beta\|_{L^2} \ge c\,\delta(n,k^*)} P_\beta(\text{reject } H_0) \ge 1 - \varepsilon$.

33 Simulation Setup $X(t) = \sum_{j=1}^{100} \sqrt{\lambda_j}\, \eta_j V_j(t)$, where $\lambda_j = (j-0.5)^{-2}\pi^{-2}$, $V_j(t) = \sqrt{2}\sin((j-0.5)\pi t)$, and $\eta_1, \ldots, \eta_{100}$ are iid $N(0,1)$. The test function is $\beta_0^{B,\xi}(t) = B\,\big(\sum_{k=1}^{100} k^{-2\xi-1}\big)^{-1/2} \sum_{j=1}^{100} j^{-\xi-0.5} V_j(t)$, where $B = 0, 0.1, 1$ and $\xi = 0.1, 0.5, 1$. Draw $n$ iid samples from $Y = \int_0^1 X(t)\beta_0(t)\,dt + N(0,1)$ for $n = 100, 500$.
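
A minimal sketch of the data-generating mechanism on an equally spaced grid; the grid size and the normalization of $\beta_0$ by $(\sum_k k^{-2\xi-1})^{-1/2}$ follow the reading of the slide above and should be treated as assumptions:

    import numpy as np

    def simulate(n, B, xi, T=201, J=100, seed=0):
        """Draw (Y, X) from the simulation design of the talk."""
        rng = np.random.default_rng(seed)
        t = np.linspace(0, 1, T)
        j = np.arange(1, J + 1)
        lam = ((j - 0.5) * np.pi) ** (-2.0)                        # lambda_j
        V = np.sqrt(2.0) * np.sin(np.outer((j - 0.5) * np.pi, t))  # V_j(t), shape (J, T)
        eta = rng.standard_normal((n, J))
        X = (eta * np.sqrt(lam)) @ V                               # X_i(t) = sum_j sqrt(lam_j) eta_ij V_j(t)
        coef = B * j ** (-xi - 0.5) / np.sqrt(np.sum(j ** (-2.0 * xi - 1.0)))
        beta0 = coef @ V                                           # test function beta_0(t)
        Y = np.trapz(X * beta0, t, axis=1) + rng.standard_normal(n)
        return Y, X, t, beta0

With these $\lambda_j$ and $V_j$, $X$ coincides with a 100-term Karhunen–Loève expansion of Brownian motion, matching the covariance used in the earlier PLRT example.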

34 Figure: Plots of $\beta_0(t)$ when $B = 1$, for $\xi = 0.1, 0.5, 1$ (x-axis: time).

35 Coverage Proportion of Confidence Interval Table: 100 × coverage proportion (average length) of the CI when $B = \xi = 1$. n = 100: (0.56); n = 500: 94.99 (0.39).

36 Size Comparison with Hilgert, Mas and Verzelen (2013) Hilgert, Mas and Verzelen (2013) proposed an FPCA-based testing procedure which is adaptive to the truncation parameter $k_n$. We compare our approaches with theirs, denoted HMV. Table: 100 × size when $B = 0$, for HMV, PLRT, and AT at n = 100 and n = 500.

37 Power Comparison with Hilgert, Mas and Verzelen (2013) Table: 100 × power when n = 100, comparing HMV, AT, and PLRT for $B = 0.1, 1$ and $\xi = 0.1, 1$.

38 Power Comparison with Hilgert, Mas and Verzelen (2013) Table: 100 × power when n = 500, comparing HMV, AT, and PLRT for $B = 0.1, 1$ and $\xi = 0.1, 1$.

39 Summary We propose applicable procedures for inference in functional data analysis. Our approaches do not require perfect alignment. Our approaches are asymptotically valid, i.e., they attain the desired size and coverage probability. The PLRT and adaptive testing procedures are more powerful than existing ones. Extensions to general cases not reported here: the quasi-likelihood framework, composite hypotheses, and adaptive testing with non-Gaussian errors.

40 Thank you for your attention!
