An introduction to nonparametric and semi-parametric econometric methods
1 An introduction to nonparametric and semi-parametric econometric methods. Robert Breunig, Australian National University. March 1,
2 Outline
1. Introduction
2. Density Estimation
(a) Kernel techniques
(b) Bandwidth Selection
(c) Estimating derivatives of densities
(d) Non-kernel techniques
3. Conditional Mean Estimation
4. Semi-parametric estimation
(a) Robinson's method
(b) Differencing
(c) Binary Choice models
(d) Mixed categorical and continuous variables
3 Objectives
1. Introduce nonparametric and semiparametric techniques
2. Introduce some of the key issues in the literature
3. Introduce several key tools and techniques
4. Provide examples of the use of techniques
5. Provide reference literature so that interested students can pursue these techniques in their applied work
4 Objects of Interest
All statistical objects studied by applied econometricians may be expressed as functions of unknown distributions.
Measurement of inequality:
$F_a(x) - F_b(x) = \int_{-\infty}^{x} f_a(t)\,dt - \int_{-\infty}^{x} f_b(t)\,dt$
Regression modelling:
$m(x) = E[Y \mid x] = \int y \,\frac{f(y,x)}{f_1(x)}\,dy$
5 Measurement of response:
$\beta(x) = \frac{dE[Y \mid x]}{dx} = \frac{d\left[\int y \,\frac{f(y,x)}{f_1(x)}\,dy\right]}{dx}$
Market risk:
$\sigma^2(x) = \int \left(y - E[Y \mid x]\right)^2 \frac{f(y,x)}{f_1(x)}\,dy$
Discrete choice: predict $Y = 1$ when
$\text{Prob}[Y = 1 \mid x] = \frac{f(1,x)}{f_1(x)} > 0.5$
6 Parametric Models
Parametric econometric methods require the prior specification of the functional form of the object being estimated. For example, one might assume that the conditional mean function is linear:
$m(x) = E[Y \mid x] = \int y \,\frac{f(y,x)}{f_1(x)}\,dy = \beta_0 + \beta_1 x$
This specification implies a constant response:
$\beta(x) = \frac{dE[Y \mid x]}{dx} = \frac{d\left[\int y \,\frac{f(y,x)}{f_1(x)}\,dy\right]}{dx} = \frac{d(\beta_0 + \beta_1 x)}{dx} = \beta_1$
7 Parametric Methods: Drawbacks
Parametric models impose a priori structure on the underlying DGP. Having assumed that this structure is known, we then estimate a handful of unknown parameters. The choice of model is frequently not based upon any attempt to select the correct parametric specification from the space of admissible models; rather, model selection is usually made on the basis of tractability and ease of interpretation. The risk is that inference, prediction, and policy are all based upon an incorrectly specified parametric model. The consequences of such mis-specification are well known.
8 Nonparametric Methods
Nonparametric estimators estimate objects of interest to economists by replacing unknown densities and distribution functions with their nonparametric density estimators. They are consistent under less restrictive assumptions than those underlying their parametric counterparts. When there is sufficient data, these estimators frequently reveal features of the data that are invisible under parametric techniques. Different features and structures revealed by nonparametric estimators often lead to different conclusions and policy prescriptions than those based upon parametric methods.
9 Four uses of nonparametric methods
1. Visualizing the data
2. Testing and comparing models
3. Conditional mean estimation (regression)
4. Combining parametric and nonparametric methods (semi-parametric estimation)
10 Basic building block: the nonparametric kernel density estimator
$\hat f(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x_i - x}{h}\right)$
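As a concrete illustration, the estimator above takes only a few lines of code. This is a minimal sketch, not tied to any package; the Gaussian kernel and the function names are my own choices.

```python
import math

def gaussian_kernel(psi):
    # Standard normal kernel: K(psi) = (2*pi)^(-1/2) * exp(-psi^2 / 2)
    return math.exp(-0.5 * psi * psi) / math.sqrt(2.0 * math.pi)

def kde(x, sample, h):
    # f_hat(x) = (1/(n*h)) * sum_i K((x_i - x) / h)
    n = len(sample)
    return sum(gaussian_kernel((xi - x) / h) for xi in sample) / (n * h)
```

Because each rescaled kernel integrates to one, the estimate itself integrates to one over the real line, whatever h is.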
11 We would like to estimate the density, f(x), from a sample $x_1, x_2, \ldots, x_n$.
Histogram. Naive nonparametric/local histogram:
$\hat f_I(x) = \frac{1}{nh} \sum_{i=1}^{n} I\!\left(-\frac{1}{2} < \frac{x_i - x}{h} < \frac{1}{2}\right) = \frac{n_x}{nh}$
where $n_x$ is the number of points which lie between $x - \frac{h}{2}$ and $x + \frac{h}{2}$. The bandwidth h determines the smoothness of the estimate.
12 Replace the indicator function with a smooth weighting function called a kernel, where $\int K(\psi)\,d\psi = 1$:
$\hat f(y) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{y_i - y}{h}\right)$
$K(\psi)$ should be large for small $\psi$, small for large $\psi$, and symmetric.
13 A large class of functions satisfy these assumptions, for example:
(i) Standard normal: $K(\psi) = (2\pi)^{-1/2} \exp\left(-\tfrac{1}{2}\psi^2\right)$
(ii) Uniform: $K(\psi) = (2c)^{-1}$ for $-c < \psi < c$, and 0 otherwise.
(iii) Epanechnikov (1969) [optimal kernel]: $K_0(\psi) = \tfrac{3}{4}\left(1 - \psi^2\right)$ for $|\psi| \le 1$, and 0 otherwise.
14 In order to implement this estimator, we have to make two choices:
1. the kernel (weight) function, K; and
2. the smoothing parameter (bandwidth), h.
It turns out that the choice of kernel does not have much effect on the optimality of the estimator, but the choice of bandwidth (or window width) has important repercussions for our results.
15 Bandwidth selection methods
1. Plug-in methods
2. Likelihood cross-validation
3. Least-squares cross-validation
16 All of these methods begin from the same starting point: the bandwidth, h, should be chosen so that the estimated density, $\hat f(x)$, is as close as possible to the true density, f(x). Most of the time we employ some kind of global criterion. The most common is the integrated squared error (ISE),
$\text{ISE} = \int \left(\hat f(x) - f(x)\right)^2 dx,$
or its expected value, the mean integrated squared error (MISE),
$\text{MISE} = E \int \left(\hat f(x) - f(x)\right)^2 dx.$
These two quantities correspond to loss and risk.
17 For independent and identically distributed (i.i.d.) data, it is straightforward to show that
$\text{Bias}(\hat f) = E\hat f - f = \int K(\psi)\left[f(h\psi + x) - f(x)\right]d\psi$
and
$V(\hat f) = (nh)^{-1} \int K^2(\psi)\, f(h\psi + x)\,d\psi - n^{-1}\left[\int K(\psi)\, f(h\psi + x)\,d\psi\right]^2$
18 The expressions for exact bias and variance are not useful without knowledge of the quantity that we are attempting to estimate: the true underlying density. We can, however, derive approximations to these quantities by expanding $f(h\psi + x)$ in a Taylor series, for small h:
$f(h\psi + x) = f(x) + h\psi f^{(1)}(x) + \frac{h^2}{2}\psi^2 f^{(2)}(x) + \cdots$
19 Given the i.i.d. assumption above, the assumptions made regarding the kernel function, and the following additional assumptions:
(A3) the second-order derivatives of f are continuous and bounded in some neighborhood of x;
(A4) $h = h_n \to 0$ as $n \to \infty$;
(A5) $nh_n \to \infty$ as $n \to \infty$;
we can show that up to $O(h^2)$ the bias is given by
$\text{Bias}(\hat f) = \frac{h^2}{2}\mu_2 f^{(2)}(x),$
where $\mu_2 = \int \psi^2 K(\psi)\,d\psi$, and up to $O\!\left((nh)^{-1}\right)$ the variance is given by
$V(\hat f) = (nh)^{-1} f(x) \int K^2(\psi)\,d\psi.$
20 $\text{MISE} = \int \left[\left(\text{Bias}(\hat f)\right)^2 + \text{Var}(\hat f)\right]dx$
The approximate MISE, using the above expressions, is
$\text{AMISE} = \frac{h^4}{4}\mu_2^2 \int \left(f^{(2)}(x)\right)^2 dx + (nh)^{-1} \int f(x)\,dx \int K^2(\psi)\,d\psi = \frac{1}{4}\lambda_1 h^4 + \lambda_2 (nh)^{-1}$  (1)
where $\lambda_1 = \mu_2^2 \int \left(f^{(2)}(x)\right)^2 dx$ and $\lambda_2 = \int K^2(\psi)\,d\psi$.
21 The optimal window width, in the sense that the approximate mean integrated squared error is minimized, will be
$h^* = c\,n^{-1/5}, \qquad c = \left(\lambda_2/\lambda_1\right)^{1/5}.$
Setting the derivative of (1) with respect to h to zero gives $\lambda_1 h^3 = \lambda_2 n^{-1} h^{-2}$, from which this follows.
22 Assuming a normal kernel and a normal density, f(x), both $\lambda_1$ and $\lambda_2$ can be evaluated numerically. This provides
$h = 1.06\,\sigma_x\, n^{-1/5}$
Software packages which implement nonparametric density estimation (SAS, Shazam, Stata) use this as the default window width. For non-normal distributions it works well as a first approximation. It can also provide a good starting point for data-driven methods of bandwidth selection (see more below). It is by far the most commonly used window width in the literature.
23 Silverman (1986) provides several other alternatives which work well for heavily skewed or multi-modal data. A simple improvement is to replace $\sigma$ by a robust estimator of spread, and he specifies two alternatives that seem to work well:
$h = 0.79\,R\,n^{-1/5}$
$h = 0.9\,A\,n^{-1/5},$
where R is the inter-quartile range and $A = \min\left(\sigma, R/1.34\right)$.
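These rules of thumb can be sketched directly in code. The function name is mine, and `statistics.quantiles` with its default (exclusive) method is used for the inter-quartile range, so small samples may give a slightly different R than other quantile conventions.

```python
import statistics

def rule_of_thumb_bandwidths(sample):
    # h1 = 1.06 * sigma * n^(-1/5)   (normal reference rule)
    # h2 = 0.79 * R * n^(-1/5)       (R = inter-quartile range)
    # h3 = 0.9 * A * n^(-1/5),  with A = min(sigma, R / 1.34)
    n = len(sample)
    sigma = statistics.stdev(sample)
    q1, _, q3 = statistics.quantiles(sample, n=4)
    r = q3 - q1
    a = min(sigma, r / 1.34)
    scale = n ** (-0.2)
    return 1.06 * sigma * scale, 0.79 * r * scale, 0.9 * a * scale
```

Since $A \le \sigma$, the robust bandwidth $0.9\,A\,n^{-1/5}$ is always smaller than the normal reference rule $1.06\,\sigma\,n^{-1/5}$.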
24 Least Squares Cross-validation
This is essentially a data-driven technique for choosing the optimal bandwidth. The idea is to minimize a particular criterion function. In least squares cross-validation the function minimized is
$\text{ISE}(h) = \int \left(\hat f(x) - f(x)\right)^2 dx = \int \hat f^2\,dx - 2\int \hat f f\,dx + \int f^2\,dx.$
Since $\int f^2\,dx$ does not depend upon h, the function that people minimize in practice is actually
$\int \hat f^2\,dx - 2\int \hat f f\,dx.$
25 Further manipulation yields
$\text{ISE}^*(h) = n^{-2} h^{-1} \sum_{i=1}^{n}\sum_{j=1}^{n} \bar K\!\left(\frac{x_i - x_j}{h}\right) - 2 n^{-1} \sum_{i=1}^{n} \hat f_{-i}(x_i)$
as the function that is actually minimized, where $\bar K = K \ast K$ is the convolution of the kernel with itself. Here $n^{-1}\sum_{i=1}^{n} \hat f_{-i}(x_i)$ is the leave-one-out estimator, which is formed as a standard kernel density estimator omitting the i-th observation. This provides an unbiased estimate of $\int \hat f f\,dx$. Most programs actually implement the leave-one-out estimator as the actual density estimate, since it minimizes the influence of solitary observations.
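A direct $O(n^2)$ implementation of the criterion for a Gaussian kernel, for which the convolution $\bar K = K \ast K$ is the N(0, 2) density; the grid-search wrapper is my own addition, not part of the formula above.

```python
import math

def lscv(sample, h):
    # CV(h) = n^-2 h^-1 sum_i sum_j Kbar((x_i - x_j)/h) - (2/n) sum_i f_loo(x_i)
    n = len(sample)
    def k(u):     # Gaussian kernel
        return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    def kbar(u):  # convolution K*K: the N(0, 2) density
        return math.exp(-0.25 * u * u) / math.sqrt(4.0 * math.pi)
    term1 = sum(kbar((xi - xj) / h) for xi in sample for xj in sample) / (n * n * h)
    term2 = 0.0
    for i, xi in enumerate(sample):
        # leave-one-out density estimate at x_i
        s = sum(k((xj - xi) / h) for j, xj in enumerate(sample) if j != i)
        term2 += s / ((n - 1) * h)
    return term1 - 2.0 * term2 / n

def choose_h(sample, grid):
    # pick the candidate bandwidth minimizing the criterion
    return min(grid, key=lambda h: lscv(sample, h))
```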
26 Likelihood Cross-validation
The basic idea behind this method is to choose an h which maximizes the likelihood $\log L = \sum_{i=1}^{n} \log f(x_i)$. An estimated log likelihood or pseudo log likelihood can be written as
$\log \hat L = \sum_{i=1}^{n} \log \hat f(x_i) = \log \hat L(h),$
where $\hat f(x_i)$ is a density estimator of f and depends on h. Maximizing $\log \hat L$ with respect to h produces a trivial maximum at h = 0. To overcome this problem, the cross-validation principle might be adopted, in which $\hat f(x_i)$ is replaced by $\hat f_{-i}(x_i)$.
27 This leave-one-out version of the estimator can be written as
$\hat f_{-i}(x_i) = \left((n-1)h\right)^{-1} \sum_{j \ne i} K\!\left(\frac{x_j - x_i}{h}\right).$
Thus the likelihood CV principle is to choose h such that $\log \hat L(h) = \sum_{i=1}^{n} \log \hat f_{-i}(x_i)$ is a maximum. The procedure is also known as Kullback-Leibler cross-validation, in the sense that it gives an h for which the Kullback-Leibler distance measure between the two densities f and $\hat f$,
$I(f, \hat f) = \int f(x) \log\left\{\frac{f(x)}{\hat f(x)}\right\} dx,$
is a minimum; see Hall (1987).
28 A disadvantage of the h obtained by likelihood CV is that it can be severely affected by the tail behavior of f. Furthermore, Hall (1987) has indicated that selecting h by minimizing the Kullback-Leibler measure may be useful for the statistical discrimination problem but not for curve estimation. Thus the likelihood CV procedure has not proven to be of much current interest in the literature.
29 Other density estimation techniques
Nearest Neighbor Density Estimation
Let $d(x_1, x)$ represent the distance of point $x_1$ from the point x, and for each x denote by $d_k(x)$ the distance of x from its k-th nearest neighbor (k-NN) among $x_1, \ldots, x_n$. Then, taking $h = 2d_k(x)$, the estimator can be written as
$\hat f_{kNN}(x) = \frac{\#\{x_1, \ldots, x_n \text{ in } [x - d_k(x),\, x + d_k(x)]\}}{2n\,d_k(x)} = \frac{k}{2n\,d_k(x)} = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{2d_k(x)}\, I\!\left(-\frac{1}{2} < \frac{x_i - x}{2d_k(x)} < \frac{1}{2}\right).$
The degree of smoothing is controlled by the integer k, typically $k \propto n^{1/2}$.
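A minimal univariate version of the k-NN estimator (the function name is mine):

```python
def knn_density(x, sample, k):
    # f_hat(x) = k / (2 * n * d_k(x)), where d_k(x) is the distance
    # from x to its k-th nearest neighbour in the sample
    n = len(sample)
    distances = sorted(abs(xi - x) for xi in sample)
    dk = distances[k - 1]
    return k / (2.0 * n * dk)
```

Note that, unlike a kernel estimate with fixed h, this estimator has heavy tails and does not integrate to one.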
30 Series estimation
Suppose X is a random variable with density f on the unit interval [0, 1]. Under these circumstances it can be expressed as the Fourier series
$f(x) = \sum_{j=0}^{\infty} a_j \zeta_j(x),$
where, for each $j \ge 0$, the coefficients are $a_j = \int_0^1 f(x)\,\zeta_j(x)\,dx = E\zeta_j(X)$, and the sequence $\zeta_j(x)$ is given by $\zeta_0(x) = 1$, $\zeta_j(x) = \sqrt{2}\cos\left(\pi(j+1)x\right)$ when j is odd, and $\zeta_j(x) = \sqrt{2}\sin\left(\pi j x\right)$ when j is even.
31 Using $\hat a_j = n^{-1}\sum_{i=1}^{n} \zeta_j(x_i)$ as an estimator of $a_j$, the orthogonal series estimator is defined as
$\hat f(x) = \sum_{j=0}^{m} \hat a_j \zeta_j(x),$
where m is the cutoff point of the infinite sum and determines the amount of smoothing. The regression analog of this is to express the conditional mean of y as an infinite polynomial in x.
32 Variable window width estimators
Another option is to let the window width vary with each point in the data according to some rule. The estimator will then have the form
$\hat f_{vww}(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h_{ni}} K\!\left(\frac{x_i - x}{h_{ni}}\right).$
In general, the rule should allow a larger h in regions where there are few observations and a smaller h where observations are densely located.
33 Penalized likelihood estimators; local log-likelihood estimators
Both of these techniques treat f(x) as an unknown parameter and employ likelihood methods to estimate the unknown quantity. The global likelihood has no finite maximum over the class of all densities, so the options are instead to maximize a penalized likelihood function (which imposes some pre-determined amount of smoothness on the function) or the local, kernel-weighted, log-likelihood.
34 Example 1: Eruption length of the Old Faithful geyser in Yellowstone National Park
Example 2: Hamilton and Lin (1996) model of excess stock returns from the Standard and Poor's 500
Example 3: Ait-Sahalia (1996) nonparametric test of interest rate diffusion models
35 Multivariate Density Estimation
Consider a bivariate distribution where the i-th sample observation is given by $(y_i, x_i)$ and $z = (y, x)$ is a fixed point. This can be estimated nonparametrically by
$\hat f(y, x) = \hat f(z) = \frac{1}{nh^2} \sum_{i=1}^{n} K_1\!\left(\frac{z_i - z}{h}\right).$
36 The kernel estimator of the marginal density $f_1(x)$ of x is
$\hat f_1(x) = \int \hat f(y, x)\,dy = \frac{1}{nh^2} \sum_{i=1}^{n} \int K_1\!\left(\frac{y_i - y}{h}, \frac{x_i - x}{h}\right) dy = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x_i - x}{h}\right),$
where $K(x) = \int K_1(y, x)\,dy$ is such that $\int K(x)\,dx = 1$. The estimator of the conditional density of Y given X can then be written as
$\hat f(y \mid x) = \frac{\hat f(y, x)}{\hat f_1(x)}.$
37 In general, for a multivariate density estimation problem of dimension d, the optimal h which minimizes the approximate MISE can be found by substituting $nh^d$ for $nh$ in the MISE expression given earlier and minimizing with respect to h. It is easy to show that $h^* = c\,n^{-1/(4+d)}$ and, for this h, $\text{AMISE} = O\!\left(n^{-4/(4+d)}\right)$. When the kernel is multivariate standard normal, $c = \{4/(2d+1)\}^{1/(d+4)}$.
38 Curse of dimensionality
It is clear from this result that the higher the dimension d, the slower will be the speed of convergence of $\hat f$ to f. Thus one may need a large data set to estimate a multivariate density in high dimensions.
39 Multivariate Kernels
Standard multivariate normal density, where $d = \dim(\psi)$:
$K(\psi) = (2\pi)^{-d/2} \exp\left(-\tfrac{1}{2}\psi'\psi\right)$
Multivariate Epanechnikov kernel:
$K_e(\psi) = \tfrac{1}{2}\, c_d^{-1} (d+2)\left(1 - \psi'\psi\right)$ if $\psi'\psi < 1$, and 0 otherwise, where $c_d$ is the volume of the unit d-dimensional sphere ($c_1 = 2$, $c_2 = \pi$, $c_3 = 4\pi/3$).
40 One disadvantage with direct application of the kernels above is that the variables may exhibit disparate variation. To overcome this problem it is good practice to work with standardized data, i.e., normalized by the standard deviation or some measure of scale. Then each of the elements in $\psi$ will have unit variance and application of a kernel such as the multivariate standard normal is appropriate.
41 Conditional mean estimation
Consider $q + 1 = p$ economic variables $(Y, X')$ where Y is the dependent variable and X is a $(q \times 1)$ vector of regressors; these p variables are taken to be completely characterized by their unknown joint density $f(y, x_1, \ldots, x_q) = f(y, x)$ at the points y, x. As noted in the introduction, interest frequently centres upon the conditional mean $m(x) = E(Y \mid X = x)$, where x is some fixed value of X. Now suppose that we have n data points $(y_i, x_i)$. By definition,
$y_i = E(y_i \mid X_i = x_i) + u_i = m(x_i) + u_i,$
where the error term $u_i$ has the properties $E(u_i \mid x_i) = 0$ and $E(u_i^2 \mid x_i) = \sigma^2(x_i)$.
42 Parametric Estimation
Parametric methods specify a form for $m(x_i)$; in the case of a linear specification, $y_i = \alpha + x_i\beta + u_i$. The least squares estimators of $\alpha$ and $\beta$ are $\hat\alpha = \bar y - \bar x\hat\beta$ and $\hat\beta = \left(\sum_{i=1}^{n}(x_i - \bar x)^2\right)^{-1}\left(\sum_{i=1}^{n}(x_i - \bar x)y_i\right)$. The best unbiased parametric estimator of $m(x) = \alpha + x\beta$ is
$\hat m(x) = \hat\alpha + x\hat\beta = \sum_{i=1}^{n} a_{ni}(x)\, y_i$  (2)
where $a_{ni}(x) = n^{-1} + (x - \bar x)(x_i - \bar x)\left(\sum_{j=1}^{n}(x_j - \bar x)^2\right)^{-1}$. The $\hat m$ in (2) is a weighted sum of the $y_i$, where the weights $a_{ni}$ are linear in x and depend on the distances of x and $x_i$ from $\bar x$.
43 The assumption that $m(x_i) = \alpha + x_i\beta$ implies certain assumptions about the data generating process (joint density). For example, if $(y_i, x_i)$ is bivariate normal then it can be shown that the mean of the conditional density of $y_i$ given $x_i$ is $E(y_i \mid x_i) = \alpha + x_i\beta$, where $\alpha = Ey_i - (Ex_i)\beta$ and $\beta = \left(\text{var}(x_i)\right)^{-1}\text{cov}(x_i, y_i)$. This implies that the assumption of a linear specification for m(x) holds if the data come from the normal distribution. However, if the true distribution is not normal then the linear specification for the conditional expectation may be invalid, and so the least squares estimator of m(x) will be biased and inconsistent.
44 For example, suppose the true relationship is $y_i = \alpha + x_i\beta + x_i^2\gamma + u_i$; then the parameter of interest is $\partial y_i/\partial x_i = \beta + 2\gamma x_i$. However, if a linear approximation is taken, $\partial y_i/\partial x_i$ is being estimated under the false restriction that $\gamma = 0$. Typically, the exact functional form connecting m(x) with x is unknown. Because of the possibility that forcing the function to be linear or quadratic may affect the accuracy of estimation of m(x), it is worthwhile considering nonparametric estimation of the unknown function, and this task is taken up in the following sections.
45 Kernel-Based Estimation
Suppose that the $x_i$ are i.i.d. random variables. Because $m(x_i)$ is the mean of the conditional density $f(y_i \mid x_i) = f(y \mid X = x_i)$, there is a potential to employ the methods of density estimation seen earlier. By definition the conditional mean is
$m = \int \left(y\, f(y, x) / f_1(x)\right) dy,$  (3)
where $f_1(x)$ is the marginal density of X at x. Nadaraya (1964) and Watson (1964) therefore proposed that m be estimated by replacing $f(y, x)$ by $\hat f(y, x)$ and $f_1(x)$ by $\hat f_1(x)$, where these density estimators are the kernel estimators discussed above.
46 The expressions for $\hat f(y, x)$ and $\hat f_1(x)$ from the first part of this talk may be substituted into (3) to give
$\hat m = \int y \left[\frac{(nh^p)^{-1} \sum_{i=1}^{n} K_1\!\left(\frac{y_i - y}{h}, \frac{x_i - x}{h}\right)}{(nh^q)^{-1} \sum_{i=1}^{n} K\!\left(\frac{x_i - x}{h}\right)}\right] dy,$  (4)
where $p = q + 1$ and h is the window width. Some simplification yields
$\hat m = \left[(nh^q)^{-1} \sum_{i=1}^{n} y_i\, K\!\left(\frac{x_i - x}{h}\right)\right] \Big/ \left[(nh^q)^{-1} \sum_{i=1}^{n} K\!\left(\frac{x_i - x}{h}\right)\right] = \sum_{i=1}^{n} y_i\, K\!\left(\frac{x_i - x}{h}\right) \Big/ \sum_{i=1}^{n} K\!\left(\frac{x_i - x}{h}\right).$
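The final expression is easy to code; a minimal sketch with a Gaussian weight follows (the kernel's normalizing constant cancels between numerator and denominator, so it is omitted).

```python
import math

def nadaraya_watson(x, xs, ys, h):
    # m_hat(x) = sum_i y_i K((x_i - x)/h) / sum_i K((x_i - x)/h)
    weights = [math.exp(-0.5 * ((xi - x) / h) ** 2) for xi in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)
```

Since the estimate is a weighted average of the $y_i$, it always lies between $\min(y)$ and $\max(y)$; if all $y_i$ are equal it reproduces that constant exactly.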
47 A feature of the Nadaraya-Watson estimator is that it is a weighted sum of those $y_i$'s that correspond to $x_i$ in a neighborhood of x. The weights are low for $x_i$'s far away from x and high for $x_i$'s closer to x. With this motivation, a general class of nonparametric estimators of m(x) can be written as
$\hat m = \hat m(x) = \sum_{i=1}^{n} w_{ni}(x)\, y_i$
where $w_{ni}(x) = w_n(x_i, x)$ represents the weight assigned to the i-th observation $y_i$, and it depends on the distance of $x_i$ from the point x. Note that the parametric estimator $\hat m(x)$ in (2) is a special case with linear weights $w_{ni}(x) = a_{ni}(x)$ such that $\sum w_{ni}(x) = 1$, but $w_{ni}(x) \ge 0$ is not necessarily true.
48 An implicit assumption in nonparametric estimation is that m(x) is smooth over x, implying that $y_i$ contains information about m(x) whenever $x_i$ is near to x. The estimator $\hat m(x)$ is a smoothed estimator in the sense that it is constructed, at every point, by local averaging of the observations $y_i$ corresponding to those $x_i$ close to x in some sense. In parametric regression, a functional form is specified for the conditional mean m(x). This functional form, say $m(x, \beta)$, depends on a finite number of unknown parameters $\beta$. The least squares estimate of $m = m(x)$ is $m(x, \hat\beta)$, where $\hat\beta$ is chosen to minimize
$\sum_{i=1}^{n} \left(y_i - m(x_i, \beta)\right)^2.$  (5)
49 Compare (5) with the following weighted least squares criterion for the nonparametric estimation of m(x):
$\sum_{i=1}^{n} w^*_{ni}(x)\left[y_i - m(x)\right]^2.$  (6)
In (6), m(x) replaces the $m(x, \beta)$ that appears in (5). If m(x) is regarded as a single unknown parameter m, it may be estimated by minimizing
$\sum_{i=1}^{n} w^*_{ni}(x)\left[y_i - m\right]^2.$  (7)
The resulting estimate, $\hat m$, of m(x) is precisely the Nadaraya-Watson estimator. Thus the kernel estimator $\hat m$ is also a least squares estimator, with $w^*_{ni}(x) = K\left((x_i - x)/h\right)$.
50 One might also think of $\hat m(x)$ as a method of moments estimator. Since $E(u_i \mid x_i) = 0$,
$E\, w^*_{ni}(x)\left(y_i - m(x_i)\right) = 0$  (8)
$= E\left[w^*_{ni}(x)(y_i - m) + w^*_{ni}(x)(m - m(x_i))\right] = 0.$  (9)
If the second term in (9) is ignored and a sample estimate of the first, $n^{-1}\sum_{i=1}^{n} w^*_{ni}(x)(y_i - m)$, is used, the value of m for which this is zero is again the Nadaraya-Watson estimator.
51 Whether the second term can be ignored depends upon the weights $w^*_{ni}(x)$. If the weights were the indicator functions of the local histogram presented earlier, the second term would be identically zero, whereas with kernel weights it is only asymptotically zero. Because the orthogonality relation only holds as $n \to \infty$, the situation is outside the normal framework described by Hansen (1982), but it is close to work reported in Powell (1986), in that the expected value of the function the parameter solves changes with the sample size (through h), and so its large-sample limit has to be used instead.
52 Local Linear Nonparametric Regression
The Nadaraya-Watson estimator of m(x) minimizes
$\sum_{i=1}^{n} \left(y_i - \alpha\right)^2 K\!\left(\frac{x_i - x}{h}\right)$
with respect to $\alpha$, giving
$\hat m(x) = \hat\alpha = \left[\sum K\!\left(\frac{x_i - x}{h}\right)\right]^{-1} \sum K\!\left(\frac{x_i - x}{h}\right) y_i.$
Stone (1977) and Cleveland (1979) suggested that one instead minimize
$\sum_{i=1}^{n} \left(y_i - \alpha - (x_i - x)\beta\right)^2 K\!\left(\frac{x_i - x}{h}\right)$
with respect to $\alpha$ and $\beta$, and set $\hat m(x)$ equal to the resulting estimate of $\alpha$.
53 This estimate can be found by performing a weighted least squares regression of $y_i$ against $z_i = (1, (x_i - x))'$ with weights $\left[K\left((x_i - x)/h\right)\right]^{1/2}$. Thus, while the Nadaraya-Watson estimator fits a constant to the data close to x, the local linear approximation fits a straight line. This local linear smoothing estimator has been extensively investigated by Fan (1992a, 1993), Fan and Gijbels (1992), and Ruppert and Wand (1994).
54 The resulting estimator has the form
$\hat m_{LL}(x) = \sum_{i=1}^{n} w^{LL}_{ni}(x)\, y_i,$
with weights
$w^{LL}_{ni} = e_1' \left(\sum z_i K_i z_i'\right)^{-1} z_i K_i,$
where $e_1$ is a column vector of the same dimension as $z_i$ with unity as its first element and zeros elsewhere. One advantage of this estimator is that it can be analysed with standard regression techniques, and it has the same first-order statistical properties irrespective of whether the $x_i$ are stochastic or non-stochastic. The optimal window width is proportional to $n^{-1/5}$.
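A scalar-x sketch that solves the 2×2 weighted normal equations directly (names mine). A useful check: because the estimator fits a line locally, it reproduces an exactly linear m(x) with zero error at any evaluation point, which the Nadaraya-Watson estimator does not.

```python
import math

def local_linear(x, xs, ys, h):
    # Weighted LS of y_i on (1, x_i - x); the intercept estimate is m_hat(x).
    k = [math.exp(-0.5 * ((xi - x) / h) ** 2) for xi in xs]
    s0 = sum(k)
    s1 = sum(ki * (xi - x) for ki, xi in zip(k, xs))
    s2 = sum(ki * (xi - x) ** 2 for ki, xi in zip(k, xs))
    t0 = sum(ki * yi for ki, yi in zip(k, ys))
    t1 = sum(ki * (xi - x) * yi for ki, xi, yi in zip(k, xs, ys))
    # Solve the 2x2 normal equations for (alpha, beta); alpha_hat = m_hat(x)
    det = s0 * s2 - s1 * s1
    return (s2 * t0 - s1 * t1) / det
```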
55 Applications of the idea in econometrics include McManus (1994) to estimation of cost functions, Gourieroux and Scaillet (1994) to the term structure, Lin and Shu (1994) to estimation of a disequilibrium transition model, Bossaerts and Hillion (1997) to options prices and their determinants, and Ullah and Roy (1996) to a nutrition/income relation. Implementation and computations are discussed in Cleveland et al. (1988). Hastie and Loader (1993) provide an excellent account of the history and potential of the method.
56 The logic of local linear regression smoothing can be seen by expanding $m(x_i)$ around x to get
$m(x_i) = m(x) + \frac{\partial m}{\partial x}(x^*)\,(x_i - x),$  (10)
where $x^*$ lies between $x_i$ and x. This may be expressed as
$m(x_i) = \alpha + \beta(x^*)(x_i - x).$  (11)
57 Now, since $E(y_i \mid x_i) = m(x_i)$, the objective function
$\sum \left(y_i - m(x_i)\right)^2 K_i = \sum \left(y_i - \alpha - \beta(x^*)(x_i - x)\right)^2 K_i$
is essentially the residual sum of squares from a regression using only observations close to $x_i = x$. Notice that this means that $\beta(x^*)$ will be very close to constant, as $x^*$ must lie between $x_i$ and x. This also points to the fact that improvements might be available from expanding $m(x_i)$ as a j-th order polynomial in $(x_i - x)$, but doing so requires the derivatives $m^{(j)}$ to exist.
58 Example 4: Eruption length of the Old Faithful geyser conditional on waiting time
59 Other Notes
The optimal h can be found by minimizing the MISE, similar to the density case, and it can be shown that $h_{opt} \propto n^{-1/(q+4)}$.
Cross-validation may be performed by minimizing the estimated prediction error (EPE), $n^{-1}\sum \left(y_i - \hat m_{-i}(x_i)\right)^2$, where $\hat m_{-i}(x_i)$ is computed as the leave-one-out estimator deleting the i-th observation from the sums. To appreciate why minimizing EPE is sensible, notice that, when the leave-one-out estimator is employed and observations are independent, $\hat m_{-i}$ is independent of $y_i$, meaning that $E\left(\hat m_{-i}(y_i - m_i)\right) = 0$, and so
$E(\text{EPE}) = \sigma^2 + E\left(n^{-1}\sum \left(\hat m_{-i} - m_i\right)^2\right) = \sigma^2 + \text{MASE}.$
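The EPE criterion is straightforward to compute for the Nadaraya-Watson estimator; a sketch (names mine) follows.

```python
import math

def epe(xs, ys, h):
    # n^-1 sum_i (y_i - m_loo(x_i))^2, where m_loo is the leave-one-out
    # Nadaraya-Watson estimator evaluated at x_i
    n = len(xs)
    total = 0.0
    for i in range(n):
        w = [math.exp(-0.5 * ((xs[j] - xs[i]) / h) ** 2) for j in range(n) if j != i]
        y = [ys[j] for j in range(n) if j != i]
        m_loo = sum(wj * yj for wj, yj in zip(w, y)) / sum(w)
        total += (ys[i] - m_loo) ** 2
    return total / n
```

In practice one minimizes this criterion over a grid of candidate bandwidths.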
60 Minimizing E(EPE) with respect to h is therefore equivalent to minimizing MASE with respect to h. Unfortunately, minimizing the sample EPE tends to produce an estimator of h that converges only extremely slowly, of order $n^{-1/10}$, to the value of h minimizing E(EPE). The curse of dimensionality means that pure nonparametric regression is difficult to use in higher-dimensional problems.
61 Semi-parametric estimation
A number of models in the literature have the distinguishing feature that part of the model is linear and part constitutes an unknown non-linear form:
$y_i = x_{1i}'\beta + g_1(x_{2i}) + u_i,$  (12)
which can be written in matrix form as
$y = X_1\beta + g_1 + u.$  (13)
In (12), $x_{2i}$ cannot have unity as an element.
62 This intercept restriction is an identification condition arising from the fact that $g_1(x_{2i})$ is unconstrained and therefore can have a constant term as part of its definition. Hence, it would always be possible to add any constant to (12) and then absorb it into $g_1(x_{2i})$, showing that, without some further restriction upon the nature of $g_1(x_{2i})$, it is impossible to consistently estimate an intercept. This issue of identification of parameters, particularly as regards the intercept but sometimes a scale parameter as well, arises a good deal in the semi-parametric literature and needs to be dealt with by imposing some restrictions. The parameter of interest is $\beta$, so the issue is how to estimate it in the presence of the unknown function $g_1$.
63 A Semi-Parametric Estimator of β
Taking the conditional expectation of (13) leads to
$E(y_i \mid x_{2i}) = E(x_{1i} \mid x_{2i})'\beta + g_1(x_{2i}).$
Consequently
$y_i - E(y_i \mid x_{2i}) = \left(x_{1i} - E(x_{1i} \mid x_{2i})\right)'\beta + u_i$  (14)
and
$g_1(x_{2i}) = E(y_i \mid x_{2i}) - E(x_{1i} \mid x_{2i})'\beta.$  (15)
64 Since (14) has the properties of a linear regression model with dependent variable $y_i - E(y_i \mid x_{2i})$ and independent variables $(x_{1i} - E(x_{1i} \mid x_{2i}))$, an obvious estimator of β is
$\hat\beta = \left[\sum_{i=1}^{n} (x_{1i} - \hat m_{12i})(x_{1i} - \hat m_{12i})'\right]^{-1} \left[\sum_{i=1}^{n} (x_{1i} - \hat m_{12i})(y_i - \hat m_{2i})\right],$  (16)
where $\hat m_{12i}$ and $\hat m_{2i}$ are the kernel-based estimators of $m_{12i} = E(x_{1i} \mid x_{2i})$ and $m_{2i} = E(y_i \mid x_{2i})$.
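A sketch of (16) with scalar x1 and x2, using Nadaraya-Watson estimates of the two conditional means (all names mine). Because the kernel smoother is linear in whatever it smooths, this sketch recovers β exactly when g1 ≡ 0, which makes a convenient sanity check; Robinson's actual estimator also involves trimming, omitted here.

```python
import math

def robinson_beta(x1, x2, y, h):
    # beta_hat = [sum (x1_i - m12_i)^2]^-1 [sum (x1_i - m12_i)(y_i - m2_i)],
    # with m12_i = E[x1|x2_i] and m2_i = E[y|x2_i] estimated by Nadaraya-Watson.
    def nw(target, at):
        w = [math.exp(-0.5 * ((x2j - at) / h) ** 2) for x2j in x2]
        return sum(wj * tj for wj, tj in zip(w, target)) / sum(w)
    rx = [x1i - nw(x1, x2i) for x1i, x2i in zip(x1, x2)]
    ry = [yi - nw(y, x2i) for yi, x2i in zip(y, x2)]
    return sum(a * b for a, b in zip(rx, ry)) / sum(a * a for a in rx)
```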
65 Once $\hat\beta$ is found, $g_1(x_{2i})$ can be estimated from (15) as
$\hat g_1(x_{2i}) = \hat m_{2i} - \hat m_{12i}'\hat\beta.$  (17)
For example, Stock (1989) works with this model but is particularly interested in estimating $g_1(x_{2i})$ rather than β. The kernel estimator for β in the context of (13) was analyzed by Robinson (1988).
66 Differencing
Consider again the partial linear model
$y_i = x_{1i}\beta + g_1(x_{2i}) + \varepsilon_i,$  (18)
where $x_1$ is a scalar. Order the data by $x_2$, from smallest to largest, so that $x_{21} \le x_{22} \le \cdots \le x_{2n}$. Suppose that $x_1$ is a smooth function of $x_2$, with $E[x_1 \mid x_2] = g(x_2)$, and therefore $x_1 = g(x_2) + u$.
67 Differencing adjacent (in $x_2$) observations gives
$y_i - y_{i-1} = (x_{1i} - x_{1,i-1})\beta + \left(g_1(x_{2i}) - g_1(x_{2,i-1})\right) + \varepsilon_i - \varepsilon_{i-1}$
$= \left(g(x_{2i}) - g(x_{2,i-1})\right)\beta + (u_i - u_{i-1})\beta + \left(g_1(x_{2i}) - g_1(x_{2,i-1})\right) + \varepsilon_i - \varepsilon_{i-1}.$
Provided that the functions $g_1$ and g are sufficiently smooth and the data are sufficiently dense, the differences $g_1(x_{2i}) - g_1(x_{2,i-1})$ and $g(x_{2i}) - g(x_{2,i-1})$ should be very small, giving the approximations
$z_i - z_{i-1} \approx u_i - u_{i-1}$
$y_i - y_{i-1} \approx (u_i - u_{i-1})\beta + \varepsilon_i - \varepsilon_{i-1},$
where $z_i = x_{1i}$.
68 The non-parametric difference estimator of β is simply
$\hat\beta_{diff} = \frac{\sum (z_i - z_{i-1})(y_i - y_{i-1})}{\sum (z_i - z_{i-1})^2},$
which converges at the usual rate of $\sqrt{n}$, with normal distribution, so that
$\hat\beta_{diff} \stackrel{D}{\sim} N\!\left(\beta,\ \frac{1.5\,\sigma^2_\varepsilon}{n\,\sigma^2_u}\right).$
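A sketch of the estimator (names mine). With a constant g1 the differencing removes the nonparametric part exactly, so the estimator recovers β without error, which the test exploits.

```python
def diff_beta(x1, x2, y):
    # Sort by x2, first-difference, then no-intercept OLS on the differences:
    # beta_hat = sum(dz * dy) / sum(dz^2)
    order = sorted(range(len(x2)), key=lambda i: x2[i])
    z = [x1[i] for i in order]
    yy = [y[i] for i in order]
    dz = [z[i] - z[i - 1] for i in range(1, len(z))]
    dy = [yy[i] - yy[i - 1] for i in range(1, len(yy))]
    return sum(a * b for a, b in zip(dz, dy)) / sum(a * a for a in dz)
```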
69 Example 5: Yatchew and No (2001), gasoline demand in Canada
70 Binary Choice Models
We often start with the idea of an underlying linear (latent variable) model
$y^*_i = x_i'\beta + u_i$  (19)
with $y_i = 1$ when $y^*_i > 0$, and 0 otherwise. The standard approach to estimating β in (19) is via maximum likelihood. The log-likelihood function is formed for a sample of size n as
$L = \sum_{i=1}^{n} \left[y_i \ln(G_i) + (1 - y_i)\ln(1 - G_i)\right]$  (20)
where
$G_i = \int_{-\infty}^{x_i'\beta} g(u)\,du = \text{Prob}(u_i < x_i'\beta).$
71 G is assumed to be normal (probit) or logistic (logit) in most applications. Klein and Spady (1993) propose to estimate a smooth version of the likelihood that locally approximates the parametric likelihood. Note that $x_i'\beta$ could be written in more general terms, but Klein and Spady retain the linear index function in their method. The key transformation is to note that G in (20) is the probability that u is less than $x_i'\beta$, conditional on the index function and the parameter β. By Bayes' rule ($\text{Prob}(A \mid B) = \text{Prob}(B \mid A)\text{Prob}(A)/\text{Prob}(B)$), this can be written as
$G[x_i'\beta; \beta] = \text{Prob}(y = 1)\,\frac{g_{\upsilon \mid y=1}}{g_{\upsilon}},$  (21)
where $g_{\upsilon \mid y=1}$ is the distribution of the index function conditional on y = 1 and $g_\upsilon$ is the unconditional distribution of the index function.
72 These can both be estimated nonparametrically using standard kernel techniques, while Prob(y = 1) can be estimated as the sample fraction of observations with $y_i = 1$.
73 Ichimura and Thompson (1998) propose a wider class of estimators based upon a random coefficients approach:
$y^*_i = x_i'\beta_i + u_i$  (22)
with $y_i = 1$ when $y^*_i > 0$, and 0 otherwise. The distribution of $\beta_i$ is estimated by nonparametric methods with few restrictions. Ai and Chen (Econometrica, 2003) have proposed a better method for estimating binary choice models which is currently considered the state of the art.
74 Additional notes on bandwidth selection
Plug-in methods: usually reserved for simple density estimation; Fan and Gijbels (1996) provide plug-in estimators for regression estimation.
Least-squares cross-validation: popular in many applications; Ichimura and Todd (2004, Handbook of Econometrics V) find that this method works well in a simulation study. The biggest problem with least-squares cross-validation arises when the data are sparse. In this case the method tends to choose a bandwidth which is too large in order to avoid having zero densities in any area (the criterion takes on an unbounded value if the density is zero at any point).
75 Variable bandwidth selection methods result in estimates that are no longer densities; thus global bandwidth selection methods tend to be preferred. There are also bootstrap bandwidth selection methods, which tend to be very computationally intensive.
76 Reducing the curse of dimensionality
Restricting the class of models. Examples: the separable models of Robinson and Yatchew; the Klein and Spady binary choice model.
Changing the parameter of interest. Example: average derivative methods.
77 Specifying different stochastic assumptions: see Powell (1984, J. of Econometrics). I won't discuss this last one, but these methods essentially involve making some restriction on the conditional distribution of observable variables, though not enough to estimate the model parametrically. Powell applies these to various limited dependent variable models, including the Tobit model.
78 Average Derivative Method
Consider the model
$y_i = g(x_i) + u_i.$  (23)
Suppose that instead of estimating the derivative $g'(x)$ at every point, we are interested in
$E\left(g'(x)\right).$  (24)
The advantage is that by averaging over all points, the curse of dimensionality is eliminated. Even though the function g cannot be estimated at the parametric rate of convergence, the average of its derivatives can.
79 These estimators have achieved great popularity and are discussed in Stoker (1986, Econometrica), Härdle and Stoker (1989, JASA), and Powell, Stock and Stoker (1989, Econometrica). The simplest form is the direct average derivative estimator, which is simply
$\hat\beta = \frac{\sum_{i=1}^{n} \frac{\partial \hat E(y_i \mid x_i)}{\partial x_i}\, t_i}{\sum_{i=1}^{n} t_i}$  (25)
where t is a trimming function that removes points which have zero or negative densities.
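A sketch of the idea with scalar x, averaging a numerical derivative of the Nadaraya-Watson fit over the sample points (all names mine; with a Gaussian kernel the estimated density is strictly positive here, so the trimming function is simply $t_i = 1$ and is omitted).

```python
import math

def average_derivative(xs, ys, h, eps=1e-5):
    # Average d/dx of the Nadaraya-Watson fit over the sample points,
    # using central finite differences
    def m(x):
        w = [math.exp(-0.5 * ((xi - x) / h) ** 2) for xi in xs]
        return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    derivs = [(m(x + eps) - m(x - eps)) / (2.0 * eps) for x in xs]
    return sum(derivs) / len(xs)
```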
80 What affects the results? Bandwidth choice; trimming.
81 Trimming
Trimming essentially refers to the practice of dropping observations which meet a particular criterion. In other cases, it may mean rounding values at or near zero up to some acceptable level (e.g., Klein and Spady).
Practical reasons: in all of the regression estimators that we have looked at, some type of density estimate appears in the denominator of the expression. If this is zero or near zero, the estimate of the conditional mean function is undefined. So it is sometimes necessary to drop data points in order to avoid the boundary problem.
82 Technical reasons: semiparametric estimators use nonparametric estimators in their construction, and the nonparametric estimators need uniform rates of convergence in order to establish the asymptotic properties of the semiparametric estimators. This generally involves the use of bounded kernels and densities (for x, typically) that are bounded, so most technical proofs involve the introduction of some trimming function. (See Robinson (1988) or Klein and Spady (1993) for examples.)
83 Additively Separable Models
This represents another way to restrict the class of models:
$y_i = \beta_0 + g_1(x_{i1}) + g_2(x_{i2}) + \cdots + g_k(x_{ik}) + u.$  (26)
This is less restrictive than it appears, because some variables could involve interactions with other variables. The estimates achieve the univariate rate of convergence, $n^{2/5}$. Estimation is complicated: use backfitting, or the integration approach of Newey (1994, Econometric Theory) and Härdle and Linton (1996, Biometrika). It is less commonly applied than the partially linear model.
84 Partially Linear Models: Recent developments

Refinements have been proposed by Ahn and Powell (1993, Journal of Econometrics) and by Heckman, Ichimura, and Todd (1998, U. of Chicago, still unpublished). These deal with the case where instrumental variables are needed and where a sample selection correction of unknown functional form is estimated.
85 Other Notes

- The book by Pagan and Ullah (1999) remains an excellent reference.
- The new book by Li and Racine (2006) is written to serve more as a teaching text, complete with problem sets and examples.
- More recent developments are discussed by Ichimura and Todd (Handbook of Econometrics, Volume 5, 2004). I particularly like their section on bandwidth selection (their Section 6) for semi-parametric, parametric, and average derivative regression estimation techniques.