
Mathematical Communications 2(1997), 71-76

Confidence regions and intervals in nonlinear regression

Mirta Benšić
Faculty of Economics, University of Osijek, Gajev trg 7, HR-31 000 Osijek, Croatia, e-mail: bensic@oliver.efos.hr

This lecture was presented at the Mathematical Colloquium in Osijek, organized by the Croatian Mathematical Society - Division Osijek, January 31, 1997.

Abstract. This lecture presents some methods which can be applied in searching for confidence regions and intervals for the true values of regression parameters. Nonlinear regression models with independent and identically distributed errors, together with $L_p$ norm estimators, are discussed.

Key words: nonlinear regression, confidence regions, confidence intervals, $L_p$ norm

Sažetak. Confidence regions and intervals in nonlinear regression. This lecture describes some methods which can be applied in searching for confidence regions and intervals for the true values of regression parameters. Nonlinear regression models with independent and identically distributed errors are considered in the $L_p$ norm.

Ključne riječi: nonlinear regression, confidence regions, confidence intervals, $L_p$ norm

1. Introduction

This lecture presents a brief review of methods which can be applied in searching for confidence regions and confidence intervals for the true values of the regression parameters in nonlinear models. As is known, the usual least squares estimator of the regression parameters is not always the best choice when the additive errors are not normal [10]; namely, this estimator is known to be sensitive to departures from normality in the residual distribution. As alternatives to the least squares estimator in these cases, members of the class of $L_p$ norm estimators have been proposed. In this context we discuss ways of computing confidence intervals and regions when an $L_p$ norm estimator is used.

2. The model

Suppose that we have response variables $y_i$ $(i=1,\dots,n)$, observed with unknown errors $e_i$ $(i=1,\dots,n)$, and that we want to fit them to $m$ fixed predictor variables $x_{i1},\dots,x_{im}$ ($x_i=[x_{i1},\dots,x_{im}]^\tau$, $i=1,\dots,n$) using a function $f(x_i;\theta)$. Here $\theta=[\theta_1,\dots,\theta_k]^\tau$ is the vector of $k$ unknown parameters, and the function $f$ is supposed to be nonlinear in its parameters. When the errors $e_i$ are additive random variables, the response variables can be written as
$$y = F(\bar\theta) + e,$$
where
$$y=[y_1,\dots,y_n]^\tau,\qquad F(\theta)=[f(x_1,\theta),\dots,f(x_n,\theta)]^\tau,\qquad e=[e_1,\dots,e_n]^\tau.$$
Here $\bar\theta$ denotes the true, but unknown, value of the vector parameter $\theta$. Moreover, we suppose that the errors $e_i$ $(i=1,\dots,n)$ are independent and identically distributed random variables.

The $L_p$ norm estimator of the parameter $\theta$ is the value $\hat\theta^{(p)}=[\hat\theta^{(p)}_1,\dots,\hat\theta^{(p)}_k]^\tau$ that minimizes the sum of the $p$-th powers of the absolute values of the residuals, $p\in[1,\infty)$. Thus, if we denote
$$r_i(\theta)=y_i-f(x_i;\theta),\quad i=1,\dots,n,$$
$$S_p(\theta)=\sum_{i=1}^n |r_i(\theta)|^p,$$
then $\hat\theta^{(p)}$ is a vector which satisfies
$$S_p(\hat\theta^{(p)})=\min_{\theta\in\Theta} S_p(\theta),$$
if it exists. Here $\Theta\subseteq\mathbb{R}^k$ is the set of all possible values of the vector parameter $\theta$. As special cases, this class of estimators contains the minimum absolute deviations estimator (MAD, $p=1$) and the least squares estimator (LSE, $p=2$).

As assumed, $y$ is a random vector which depends on $m$ deterministic predictor variables. It follows that the vector $\hat\theta^{(p)}$ is also random, and for different realizations of $y$ we obtain different values of $\hat\theta^{(p)}$. Which is the true one? We can answer this question only through confidence regions, i.e. using the properties of $\hat\theta^{(p)}$ as a random vector, we can indicate, with some specified confidence level $1-\alpha$ ($\alpha\in(0,1)$), in what region about $\hat\theta^{(p)}$ we might reasonably expect $\bar\theta$ to be. Such regions are known as $100(1-\alpha)\%$ confidence regions.

A joint confidence region for all parameters $\theta_1,\dots,\theta_k$ is defined by a function $CR_\alpha$ that assigns to each observation $y\in Y$ a region $CR_\alpha(y)\subseteq\mathbb{R}^k$ satisfying
$$P\{\bar\theta\in CR_\alpha(y)\}=1-\alpha.$$
A confidence interval for an individual parameter $\theta_j$ is defined by a function $CI_{j,\alpha}$ that assigns to each $y\in Y$ an interval $CI_{j,\alpha}(y)\subseteq\mathbb{R}$ satisfying
$$P\{\bar\theta_j\in CI_{j,\alpha}(y)\}=1-\alpha$$
(see, for example, [3]).
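To make the definition concrete, here is a minimal numerical sketch, not part of the original lecture: it computes $\hat\theta^{(p)}$ for a hypothetical exponential model $f(x;\theta)=\theta_1 e^{\theta_2 x}$ by direct minimization of $S_p(\theta)$ with scipy. The model, data, and starting point are all illustrative assumptions.

```python
# A minimal sketch (not from the paper) of computing the L_p norm estimate
# by direct minimization of S_p(theta).  The exponential model f(x; theta)
# and the simulated data are hypothetical illustrations.
import numpy as np
from scipy.optimize import minimize

def f(x, theta):
    # Example model, nonlinear in theta: f(x; theta) = theta_1 * exp(theta_2 * x)
    return theta[0] * np.exp(theta[1] * x)

def S_p(theta, x, y, p):
    # S_p(theta) = sum_i |y_i - f(x_i; theta)|^p
    r = y - f(x, theta)
    return np.sum(np.abs(r) ** p)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0, 25)
theta_true = np.array([2.0, 0.8])
y = f(x, theta_true) + rng.laplace(scale=0.2, size=x.size)  # non-normal errors

# Nelder-Mead avoids derivatives of |r|^p, which are not smooth at r = 0
# (relevant for p close to 1).
for p in (1.0, 1.5, 2.0):
    res = minimize(S_p, x0=np.array([1.0, 1.0]), args=(x, y, p),
                   method="Nelder-Mead")
    print(f"p = {p}: theta_hat = {res.x}")
```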

3. Least squares estimator

If $p=2$, the $L_p$ norm estimator is in fact the LSE. There are currently many results for this case for computing confidence regions, in an approximate sense (for large samples) or in an exact sense (for small samples). These results mostly concern models with normal, independent and identically distributed errors (normal i.i.d. errors). Thus, in this section we suppose that the errors are normal i.i.d.

The computationally least expensive procedure for computing confidence regions with the LSE arose from the linearization approach and suggests the $100(1-\alpha)\%$ confidence region for $\theta$:
$$\{\theta : (\theta-\hat\theta)^\tau \hat V^{-1}(\theta-\hat\theta) \le k\, F_{k,n-k,1-\alpha}\},$$
where
$$\hat V = s^2\,[(J(\hat\theta))^\tau J(\hat\theta)]^{-1},\qquad s^2=\frac{S_2(\hat\theta)}{n-k},$$
$J(\hat\theta)$ is the Jacobian matrix of $F(\theta)$ at $\hat\theta$, and $\hat\theta=\hat\theta^{(2)}$. This approach also suggests the $100(1-\alpha)\%$ confidence interval for $\theta_j$, $j=1,\dots,k$:
$$\{\theta_j : |\theta_j-\hat\theta_j| \le \hat V_{jj}^{1/2}\, t_{n-k,1-\alpha/2}\},$$
where $\hat V_{jj}$ is the $(j,j)$-th element of $\hat V$. If the contours of $S_2(\theta)$ are approximately elliptical (they are exactly elliptical in the linear case), these approximations will be adequate, but the method can be very poor if the contours of $S_2(\theta)$ are not close to ellipses.

There are two other large-sample methods which are more consistent with the Bates and Watts curvature measures ([1]). The likelihood approach suggests the $100(1-\alpha)\%$ confidence region for $\theta$:
$$\Big\{\theta : S_2(\theta) \le S_2(\hat\theta)\Big[1+\frac{k}{n-k}F_{k,n-k,1-\alpha}\Big]\Big\},$$
and the lack-of-fit approach suggests:
$$\Big\{\theta : \frac{R(\theta)^\tau P(\theta) R(\theta)}{R(\theta)^\tau (I-P(\theta)) R(\theta)} \le \frac{k}{n-k}F_{k,n-k,1-\alpha}\Big\},$$
where
$$P(\theta)=J(\theta)(J(\theta)^\tau J(\theta))^{-1}J(\theta)^\tau,\qquad R(\theta)=y-F(\theta).$$
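The linearization region and intervals above are cheap once the Jacobian is available. The following sketch, again under the hypothetical exponential model used earlier and not taken from the paper, forms $\hat V$ from a forward-difference Jacobian, prints the $t$-based intervals, and tests membership in the joint $F$-based region.

```python
# A sketch (hypothetical model and data) of the linearization approach:
# V-hat = s^2 (J'J)^{-1}, t-based intervals, and the joint elliptical region.
import numpy as np
from scipy import stats
from scipy.optimize import least_squares

def f(x, theta):
    return theta[0] * np.exp(theta[1] * x)

rng = np.random.default_rng(1)
x = np.linspace(0.0, 2.0, 25)
y = f(x, np.array([2.0, 0.8])) + rng.normal(scale=0.2, size=x.size)

fit = least_squares(lambda th: y - f(x, th), x0=np.array([1.0, 1.0]))
theta_hat = fit.x
n, k = x.size, theta_hat.size

def jacobian(theta, h=1e-6):
    # Forward-difference approximation to the Jacobian of F(theta)
    J = np.empty((n, k))
    for j in range(k):
        step = np.zeros(k); step[j] = h
        J[:, j] = (f(x, theta + step) - f(x, theta)) / h
    return J

J = jacobian(theta_hat)
s2 = np.sum((y - f(x, theta_hat)) ** 2) / (n - k)   # s^2 = S_2(theta_hat)/(n-k)
V = s2 * np.linalg.inv(J.T @ J)                      # V-hat

alpha = 0.05
t = stats.t.ppf(1 - alpha / 2, df=n - k)
for j in range(k):
    half = np.sqrt(V[j, j]) * t
    print(f"theta_{j+1}: {theta_hat[j]:.3f} +/- {half:.3f}")

# Joint region: (theta - theta_hat)' V^{-1} (theta - theta_hat) <= k F_{k,n-k,1-a};
# evaluating this test on a grid of theta values traces the contour.
Fq = stats.f.ppf(1 - alpha, k, n - k)

def in_region(th):
    d = th - theta_hat
    return d @ np.linalg.solve(V, d) <= k * Fq

print(in_region(theta_hat))  # True: the estimate is the region's center
```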

As we can see, the likelihood and lack-of-fit methods are computationally very expensive, requiring the evaluation of a sufficient number of points to produce a contour. The lack-of-fit approach in fact gives exact regions, which do not depend on $\hat\theta$. The assumption that the errors are normal is the only reason for putting this method in the LSE case; namely, if the errors are normal, the LSE is better than the other estimators from the class of $L_p$ norm estimators.

When the sample size $n$ is small, Duncan ([4]) suggested the jackknife procedure for computing the confidence region. The procedure is as follows (a code sketch is given after the list):

1. Let $\hat\theta_{(i)}$ be the least squares estimate of $\theta$ when the $i$-th case is deleted from the sample.

2. Calculate the pseudo-values as the vectors
$$T_i = n\hat\theta - (n-1)\hat\theta_{(i)}.$$
The sample mean and variance of the $T_i$ are given by
$$\bar T = \frac{1}{n}\sum_{i=1}^n T_i,\qquad \bar T=[\bar T_1,\dots,\bar T_k]^\tau,$$
$$S = \frac{1}{n-1}\sum_{i=1}^n (T_i-\bar T)(T_i-\bar T)^\tau,\qquad S=[S_{ij}],\quad i,j=1,\dots,k.$$

3. A $100(1-\alpha)\%$ confidence region for $\theta$ is
$$\Big\{\theta : (\bar T-\theta)^\tau S^{-1}(\bar T-\theta) \le \frac{k}{n-k}F_{k,n-k,1-\alpha}\Big\}.$$
A $100(1-\alpha)\%$ confidence interval for $\theta_i$ can be constructed as
$$\bar T_i \pm \sqrt{\frac{k}{n-k}F_{k,n-k,1-\alpha}\,S_{ii}}.$$

As we can see, this procedure requires $n+1$ nonlinear estimations. Fox et al. ([5], [6]) describe a linear jackknife procedure which is not as computationally expensive.
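Here is a sketch of Duncan's procedure under the same hypothetical exponential model; the $n+1$ least squares fits are done with scipy, and the pseudo-value mean and covariance give the intervals from step 3. All data and model choices are illustrative, not from the paper.

```python
# A sketch (hypothetical model and data) of Duncan's jackknife confidence
# intervals: n+1 nonlinear least squares fits, then Hotelling-style intervals
# from the pseudo-value mean and covariance.
import numpy as np
from scipy import stats
from scipy.optimize import least_squares

def f(x, theta):
    return theta[0] * np.exp(theta[1] * x)

def lse(x, y, x0):
    return least_squares(lambda th: y - f(x, th), x0=x0).x

rng = np.random.default_rng(2)
x = np.linspace(0.0, 2.0, 20)
y = f(x, np.array([2.0, 0.8])) + rng.normal(scale=0.2, size=x.size)

n, k = x.size, 2
theta_hat = lse(x, y, np.array([1.0, 1.0]))

# Pseudo-values T_i = n*theta_hat - (n-1)*theta_hat_(i), one fit per deleted case
T = np.empty((n, k))
for i in range(n):
    mask = np.arange(n) != i
    theta_i = lse(x[mask], y[mask], theta_hat)   # warm start at full-sample fit
    T[i] = n * theta_hat - (n - 1) * theta_i

T_bar = T.mean(axis=0)
S = np.cov(T, rowvar=False)   # (1/(n-1)) * sum (T_i - T_bar)(T_i - T_bar)'

alpha = 0.05
c = (k / (n - k)) * stats.f.ppf(1 - alpha, k, n - k)
for j in range(k):
    half = np.sqrt(c * S[j, j])
    print(f"theta_{j+1}: {T_bar[j]:.3f} +/- {half:.3f}")
```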

4. L_p norm estimation

The confidence intervals for regression parameters in nonlinear models that have been suggested to date are computed using results on the asymptotic distribution of the $L_p$ norm estimator in linear models and the fact that the errors are additive (see [8], [9], [7]). These intervals are only asymptotic. Thus, if the $L_1$ norm estimator is applied, a $100(1-\alpha)\%$ confidence interval for $\theta_j$ is given by
$$\hat\theta^{(1)}_j \pm z_{\alpha/2}\,\sqrt{\omega_1^2\,\big[(J(\hat\theta^{(1)})^\tau J(\hat\theta^{(1)}))^{-1}\big]_{jj}},$$
where
$$\omega_1^2=\frac{1}{[2f(m)]^2}$$
and $f(m)$ is the ordinate of the error distribution at the median $m$; $\omega_1^2$ can be estimated by the Cox and Hinkley estimator ([2], [6]).

If the $L_p$ norm is applied, $p\in(1,\infty)$, then a $100(1-\alpha)\%$ confidence interval for $\theta_j$ is given by
$$\hat\theta^{(p)}_j \pm z_{\alpha/2}\,\sqrt{\omega_p^2\,\big[(J(\hat\theta^{(p)})^\tau J(\hat\theta^{(p)}))^{-1}\big]_{jj}},$$
where
$$\omega_p^2=\frac{E[|e_i|^{2p-2}]}{[(p-1)E(|e_i|^{p-2})]^2}.$$

Here are the values of $\omega_p^2$ for some symmetric distributions.

The uniform distribution on $[-b,b]$:
$$\omega_p^2=\frac{b^2}{2p-1}=\frac{3\sigma^2}{2p-1}.$$

The normal distribution $N(0,\sigma^2)$:
$$\omega_p^2=\frac{2\sqrt{\pi}\,\sigma^2\,\Gamma(p-\frac12)}{(p-1)^2\,\Gamma^2(\frac{p-1}{2})}.$$

The symmetric Laplace distribution (density function $f(x)=\frac{1}{2b}e^{-|x|/b}$):
$$\omega_p^2=\frac{\sigma^2\,\Gamma(2p-1)}{2(p-1)^2\,\Gamma^2(p-1)}.$$

Note that $\omega_p^2=\sigma^2$ when $p=2$ for all these distributions.
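As a sanity check, not from the paper, the next sketch verifies the Laplace-case formula for $\omega_p^2$ by Monte Carlo moments and then uses the closed form in the asymptotic $L_p$ interval; the Jacobian there is a random placeholder standing in for $J(\hat\theta^{(p)})$ from a fitted model.

```python
# A sketch checking the Laplace-case formula for omega_p^2 by Monte Carlo,
# then using it in the asymptotic L_p interval
#   theta_j +/- z_{alpha/2} * sqrt(omega_p^2 * [(J'J)^{-1}]_{jj}).
# J below is a placeholder; in practice it is the Jacobian at the L_p estimate.
import numpy as np
from scipy import stats
from scipy.special import gamma

p, b = 1.5, 0.3
sigma2 = 2 * b**2                         # variance of Laplace(0, b) errors

rng = np.random.default_rng(3)
e = rng.laplace(scale=b, size=200_000)

# Moment form: omega_p^2 = E|e|^{2p-2} / [(p-1) E|e|^{p-2}]^2.
# For p < 2 the exponent p-2 is negative, so this estimate converges slowly.
num = np.mean(np.abs(e) ** (2 * p - 2))
den = ((p - 1) * np.mean(np.abs(e) ** (p - 2))) ** 2
omega2_mc = num / den

# Closed form: sigma^2 Gamma(2p-1) / (2 (p-1)^2 Gamma(p-1)^2)
omega2_cf = sigma2 * gamma(2 * p - 1) / (2 * (p - 1) ** 2 * gamma(p - 1) ** 2)
print(omega2_mc, omega2_cf)               # should agree approximately

# Interval half-widths for a given Jacobian (random placeholder here)
J = rng.normal(size=(25, 2))              # stands in for J(theta_hat^{(p)})
alpha = 0.05
z = stats.norm.ppf(1 - alpha / 2)
JtJ_inv = np.linalg.inv(J.T @ J)
print(z * np.sqrt(omega2_cf * np.diag(JtJ_inv)))
```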

References

[1] D. M. Bates, D. G. Watts, Relative curvature measures of nonlinearity, J. R. Statist. Soc., Ser. B 42(1980), 1-25.

[2] D. R. Cox, D. V. Hinkley, Theoretical Statistics, Chapman and Hall, London, 1974.

[3] J. R. Donaldson, R. B. Schnabel, Computational experience with confidence regions and confidence intervals for nonlinear least squares, Technometrics 29(1987), 67-82.

[4] G. T. Duncan, An empirical study of jackknife-constructed confidence regions in nonlinear regression, Technometrics 20(1978), 123-129.

[5] T. Fox, D. V. Hinkley, K. Larntz, Jackknifing in nonlinear regression, Technometrics 22(1980), 29-33.

[6] R. Gonin, A. H. Money, Nonlinear L_p-norm Estimation, Marcel Dekker, Inc., New York and Basel, 1989.

[7] R. Gonin, A. H. Money, Nonlinear L_p-norm estimation: Part I - On the choice of the exponent, p, where the errors are additive, Commun. Statist. - Theor. Meth. 14(1985), 827-840.

[8] R. I. Jennrich, Asymptotic properties of non-linear least squares estimators, Ann. Math. Statist. 40(1969), 633-643.

[9] H. Nyquist, The optimal L_p norm estimator in linear regression models, Commun. Statist. - Theor. Meth. 12(1983), 2511-2524.

[10] V. A. Sposito, M. L. Hand, B. Skarpness, On the efficiency of using the sample kurtosis in selecting optimal L_p norm estimators, Commun. Statist. - Simula. Computa. 12(1983), 265-272.