Nonparametrics: Density estimation, Nonparametric conditional mean estimation, Semiparametric conditional mean estimation
Gabriel Montes-Rojas
2 Motivation: Regression discontinuity (Angrist & Pischke). [Figure: three panels plotting the outcome against X. Outcome A: Linear E[Y0i|Xi]. Outcome B: Nonlinear E[Y0i|Xi]. Outcome C: Nonlinearity mistaken for discontinuity.]
3 Objectives of the slides
Overview of nonparametric density estimation, which plays a central role in nonparametric analysis.
Methods for estimating conditional means: Nadaraya-Watson kernel regression.
Semiparametric models: partially linear (Robinson, 1988; Yatchew, 1998) and single index models (Ichimura, 1993; Klein and Spady, 1993).
Familiarization with implementation in Stata.
4 Outline: Histograms; Kernel density estimation
5 Distribution function
Definition: Cumulative distribution function. The cumulative distribution function (c.d.f.) of a random variable X, denoted F_X(·), is a function with domain the real line and counterdomain the interval [0, 1] which satisfies F_X(x) = P[X ≤ x] = P[{ω : X(ω) ≤ x}] for every real number x.
F_X(−∞) = 0 and F_X(+∞) = 1.
F_X(·) is a monotone non-decreasing function, i.e. F_X(a) ≤ F_X(b) if a < b.
F_X(·) is continuous from the right, i.e. lim_{0<h→0} F_X(x + h) = F_X(x).
6 Discrete random variables
Definition: Discrete random variable. A random variable is discrete if the range of X is countable. If a random variable is discrete, then its cumulative distribution function is also said to be discrete.
Definition: Discrete density function. If X is a discrete random variable with values x_1, x_2, ..., x_n, ..., the function f_X(x) = P[X = x_j] if x = x_j, j = 1, ..., n, ..., and zero otherwise, is called the discrete density function of X.
7 Continuous random variables
Definition: Continuous random variable. A random variable is continuous if there exists a function f_X(·) such that F_X(x) = ∫_{−∞}^{x} f_X(u) du for every real number x.
Definition: Probability density function. If X is a continuous random variable, the function f_X(·) in F_X(x) = ∫_{−∞}^{x} f_X(u) du is called the probability density function (or continuous density function).
Note: It is important to recognize that f is not by itself a probability. Instead, the probability that X lies in the interval (x, x + dx) is f(x) dx, and for a finite interval (a, b) it is ∫_a^b f(x) dx.
Any function f(·) with domain the real line and counterdomain [0, ∞) is a probability density function if:
f(x) ≥ 0 for all x
∫ f(x) dx = 1
8 Histograms
Constructing a histogram is straightforward.
If X is a discrete random variable with domain {x_1, x_2, ...}, then select M = #X point-bins. Then,
f̂(x) = (1/n) Σ_i 1[x_i = x]
If X is a continuous random variable with domain X, then consider a series of bins, intervals that cover the domain of X, assumed to be bounded. Let M be the number of bins, indexed by m, each of width 2h, of the form
[x_m − h, x_m + h), m = 1, 2, ..., M, with x_1 < x_2 < ... < x_{M−1} < x_M and x_m + h = x_{m+1} − h.
Then,
f̂(x) = (1/(2nh)) Σ_i 1[x_i in same bin as x]
How to select the bins optimally? In particular, this depends on the bandwidth h.
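The continuous-case estimator above can be sketched in a few lines. This is a Python/NumPy illustration rather than the slides' Stata implementation; treating the evaluation point x itself as the bin centre (so the bin is [x − h, x + h) of width 2h) is my simplification.

```python
import numpy as np

def hist_density(x, data, h):
    """Histogram density estimate at x: the share of observations falling in
    the bin [x - h, x + h), divided by the bin width 2h.
    (Sketch only; using x itself as the bin centre is a simplification.)"""
    data = np.asarray(data)
    in_bin = (data >= x - h) & (data < x + h)
    return in_bin.sum() / (len(data) * 2 * h)
```

For example, with data {0.1, 0.2, 0.9} and h = 0.1, the bin around x = 0.15 contains two of the three observations, giving 2/(3 · 0.2).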
9 Kernel
The kernel K is a symmetric function satisfying:
∫ K(ψ) dψ = 1; ∫ ψ K(ψ) dψ = 0; ∫ ψ² K(ψ) dψ = μ₂ < ∞.
Unless otherwise specified (i.e. unless the domain of X is bounded with known bounds), the limits of integration are (−∞, ∞).
10 Kernel density estimation
Let h be the bandwidth. Let x be a particular value where we want to estimate f(x). A kernel density estimator is
f̂_h(x) = (1/(nh)) Σ_i K((x_i − x)/h)
In practice, we need to report a density for the entire domain of x, X, and not only for one particular x. Then we need to specify a grid of values X = [x(1), x(2), ..., x(M)], with #X = M, at which to estimate the density. A density estimate then corresponds to the graph {f̂_h(x(m)), x(m)}_{m=1}^{M}. We could also consider different bandwidths h(x(m)).
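The slides implement this with Stata's kdensity; as a language-agnostic sketch, the estimator and its evaluation on a grid look like the following in Python/NumPy (the Gaussian kernel and the particular grid are my choices, not the slides'):

```python
import numpy as np

def gaussian_kernel(psi):
    # Standard normal density: integrates to 1, is symmetric, and has mu_2 = 1.
    return np.exp(-0.5 * psi**2) / np.sqrt(2 * np.pi)

def kde(x, data, h):
    """Kernel density estimate f_hat_h(x) = (1/(n h)) * sum_i K((x_i - x)/h)."""
    data = np.asarray(data)
    psi = (data - x) / h
    return gaussian_kernel(psi).sum() / (len(data) * h)

# Report the density on a grid, as the slide describes.
data = np.array([-1.0, 0.0, 0.5, 1.2])
grid = np.linspace(-3, 3, 61)
fhat = np.array([kde(x, data, h=0.5) for x in grid])
```

The estimate is nonnegative everywhere and integrates (numerically, over the grid) to approximately one, as a density should.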
11 Kernel density estimation: Mean squared error and asymptotic requirements
How to evaluate kernel density estimators? Mean squared error (MSE):
MSE(x) = E[(f̂_h(x) − f(x))²] = var(f̂_h(x)) + (bias(f̂_h(x)))²
Following Pagan & Ullah, pp. 23-24:
bias(f̂_h(x)) ≈ (h²/2) f''(x) μ₂
var(f̂_h(x)) ≈ (f(x)/(nh)) ∫ K(ψ)² dψ
Then note that for a consistent estimator of f(x) we face a trade-off between bias and variance:
1. h → 0 as n → ∞. As the sample size increases, we can make the bins smaller to get a more precise estimate of f(x), i.e. to reduce the bias.
2. However, as the bins become smaller, the variance increases! Then we also require nh → ∞, that is, n must grow faster than h shrinks.
12 Kernel density estimation: Rate of convergence
Since we want to evaluate the density over the entire domain X, we consider the integrated mean squared error (IMSE):
IMSE = ∫_X MSE(x) dx ≈ (1/(nh)) ∫ K(ψ)² dψ + (μ₂² h⁴/4) ∫_X f''(x)² dx
We can now minimize with respect to h, to get
h_opt = [ ∫ K(ψ)² dψ / (μ₂² ∫_X f''(x)² dx) ]^{1/5} n^{−1/5}
That is, h should decrease at rate n^{−1/5}. The resulting rate of convergence is
f̂_h(x) − f(x) = O_p(n^{−2/5}),
which is slower than that of the (optimal) maximum likelihood estimator of f(x) under a correctly specified density, O_p(n^{−1/2}), i.e. √n-consistency.
13 Kernel density estimation: Bandwidth selection
The most important step in nonparametric density estimation is selecting a bandwidth.
If h → 0, i.e. too small, then there is no smoothing. The density estimator has too many spikes, one for each observation.
If h → ∞, i.e. too large, then there is too much smoothing. The density estimator fits the density of the selected kernel.
Note that for choosing h_opt we need:
∫ K(ψ)² dψ. This depends on the chosen kernel and is readily available.
μ₂ = ∫ ψ² K(ψ) dψ. This depends on the chosen kernel and is readily available.
∫_X f''(x)² dx. This depends on the unknown f. Bandwidth selection hinges on this term.
14 Kernel density estimation: Bandwidth selection
Rule-of-thumb: Use a standard family of distributions to compute ∫_X f''(x)² dx, e.g. Gaussian, for which ∫_X f''(x)² dx = 3/(8√π σ⁵). Then, ĥ_opt = 1.059 σ̂ n^{−1/5}, with σ̂ = [(1/n) Σ_i (x_i − x̄)²]^{1/2}.
Plug-in: use the rule-of-thumb to get a prior estimate of f, then compute f'' and plug in again.
Cross-validation: choose h by minimizing an estimate of IMSE(f̂_h). This is computationally intensive, as it requires estimating f leaving each observation out of the sample and using the rest. The rate of convergence is extremely slow and the selected bandwidth is volatile. (See Pagan & Ullah.)
Use intuition! Bandwidth selection is an art...
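The rule-of-thumb is a one-liner. A Python sketch (the function name is mine; the formula is the slide's, using the MLE standard deviation as written there):

```python
import numpy as np

def silverman_h(data):
    """Rule-of-thumb bandwidth h = 1.059 * sigma_hat * n^(-1/5), derived for a
    Gaussian kernel with a Gaussian reference density."""
    data = np.asarray(data)
    sigma = np.sqrt(((data - data.mean())**2).mean())  # MLE sd, as on the slide
    return 1.059 * sigma * len(data)**(-0.2)
```

For data {0, 1, 2, 3, 4}: σ̂ = √2 and n = 5, so h = 1.059 · √2 · 5^{−1/5} ≈ 1.086.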
15 Histograms can be created with the hist command. Kernel density estimation can be implemented using the kdensity command.
16 Outline: Local constant conditional mean; Marginal effect
17 Model misspecification
Assume that E(y|x) = m(x), where m(·) is a continuous and differentiable (not necessarily linear) function of x. Then we can always define y = m(x) + e with E(e|x) = 0.
What does OLS estimate in this case? Suppose we estimate the model y_i = β x_i + u_i. Then the OLS estimand β satisfies
β = Cov(y, x)/Var(x) = Cov(m(x) + e, x)/Var(x) = Cov(m(x), x)/Var(x).
18 Model misspecification
Consider the Taylor expansion of m(x) around x*:
m(x) ≈ m(x*) + m'(x*)(x − x*) + (m''(x*)/2)(x − x*)²
Then,
β = m'(x*) + (m''(x*)/2) (Cov(x², x) − 2x* Var(x)) / Var(x)
If m(x) = a + bx, then β = b.
If m(x) = a + bx + cx², then β = b + 2cx* + c (Cov(x², x) − 2x* Var(x))/Var(x) = b + c Cov(x², x)/Var(x) (the same as if x² were an omitted variable).
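A quick numerical check of the omitted-x² formula (a Python sketch, not from the slides): take a noiseless quadratic with x symmetric around zero, so that Cov(x², x) = 0 and the OLS slope collapses to b exactly.

```python
import numpy as np

# Noiseless quadratic m(x) = a + b*x + c*x^2 with x symmetric around zero,
# so Cov(x^2, x) = 0 and the OLS slope equals b exactly.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
a, b, c = 1.0, 2.0, 3.0
y = a + b * x + c * x**2

beta = np.cov(y, x, bias=True)[0, 1] / np.var(x)  # OLS slope = Cov(y, x)/Var(x)
```

Here beta equals b = 2 despite the strong curvature: with a symmetric design, the quadratic term is uncorrelated with x and drops out of the OLS slope.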
19 Local constant conditional mean
Consider now the estimation of m(x) ≡ E[y|x] at a point x:
E[y|x] = ∫ y f(y|x) dy = ∫ y f(y, x)/f(x) dy = g(x)/f(x)
where f(y|x) is the conditional pdf of y given x, defined by f(y|x) ≡ f(y, x)/f(x); f(y, x) is the joint density of y and x; and g(x) ≡ ∫ y f(y, x) dy.
The estimator below locally averages those values of y that are close in terms of x.
20 Nadaraya-Watson kernel regression estimator
The Nadaraya-Watson estimator is a weighted average of those y_i's that correspond to x_i in a neighborhood of x. Consider the kernel-based estimator of the conditional mean. Define
ψ_i(x) = (x − x_i)/h
where h is the (fixed) bandwidth that weights the distance of each x_i to the value x, and define a kernel K(·). Then we have
m̂_h(x) = Σ_i y_i K(ψ_i) / Σ_i K(ψ_i)
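The estimator is just a weighted mean. A Python/NumPy sketch (the Gaussian kernel is my choice; its normalizing constant cancels in the ratio):

```python
import numpy as np

def nw(x, xdata, ydata, h):
    """Nadaraya-Watson: kernel-weighted average of the y_i whose x_i lie near x.
    Gaussian kernel; the normalizing constant cancels in the ratio."""
    psi = (x - np.asarray(xdata)) / h
    w = np.exp(-0.5 * psi**2)
    return (w * np.asarray(ydata)).sum() / w.sum()
```

With a very large bandwidth every observation gets (almost) the same weight, so the fit collapses to the unconditional mean of y, which previews the h → ∞ limit discussed next.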
21 Nadaraya-Watson kernel regression estimator
Note that as h → ∞, m̂_h(x) → ȳ = n^{−1} Σ_i y_i, the unconditional mean of y. What does it mean? For a large bandwidth,
lim_{h→∞} ψ_i(x) = lim_{h→∞} (x − x_i)/h = 0,
and then lim_{h→∞} K(ψ_i(x)) = K(0) = max = constant. In this case there is no differential weighting based on x. Result: Too much smoothing.
Note that as h → 0, m̂_h(x) becomes the nearest neighbor (NN) estimator. What does it mean? For a small bandwidth,
lim_{h→0} ψ_i(x) = lim_{h→0} (x − x_i)/h = ±∞ except when x_i = x.
Note that K(±∞) = 0, so only the x_i's equal to x are considered. This is identical to setting
m̂_0(x) = Σ_i 1[x_i = x] y_i / Σ_i 1[x_i = x],
that is, for each value of x it takes the corresponding value of y if there is only one pair (y_i, x_i = x); it takes the average of all values of y that have x_i = x if there are more such observations; or it takes the average over the observations with the closest x_j to x. Result: No smoothing.
22 Local constant response - Marginal effect
Consider now the marginal effect of x on y. Define
β(x) ≡ d m(x)/dx = m'(x) = (g'(x) f(x) − g(x) f'(x)) / f(x)² = g'(x)/f(x) − g(x) f'(x)/f(x)²
Then a local kernel estimator of the marginal effect is
β̂(x) = ĝ'(x)/f̂(x) − ĝ(x) f̂'(x)/f̂(x)²
with
ĝ(x) = (1/(nh)) Σ_i y_i K(ψ_i),  ĝ'(x) = (1/(nh²)) Σ_i y_i K'(ψ_i),
f̂(x) = (1/(nh)) Σ_i K(ψ_i),  f̂'(x) = (1/(nh²)) Σ_i K'(ψ_i)
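A simple way to sanity-check the marginal-effect formula is to differentiate the Nadaraya-Watson fit numerically instead of analytically. This Python sketch uses a centered finite difference as a numerical stand-in for the slide's kernel-derivative estimator (the Gaussian kernel and the step size eps are my choices):

```python
import numpy as np

def nw(x, xdata, ydata, h):
    # Nadaraya-Watson fit (Gaussian kernel; the normalizing constant cancels).
    psi = (x - np.asarray(xdata)) / h
    w = np.exp(-0.5 * psi**2)
    return (w * np.asarray(ydata)).sum() / w.sum()

def beta_hat(x, xdata, ydata, h, eps=1e-5):
    """Marginal effect beta(x) = m'(x), approximated by a centered finite
    difference of the Nadaraya-Watson fit -- a numerical stand-in for the
    analytic estimator built from kernel derivatives."""
    return (nw(x + eps, xdata, ydata, h) - nw(x - eps, xdata, ydata, h)) / (2 * eps)
```

On linear data y = 2x evaluated at an interior point, the estimated marginal effect is close to 2, as it should be.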
23 Local polynomial regression
The Nadaraya-Watson estimator can be obtained as m̂_h(x) = argmin_a Σ_i (y_i − a)² K(ψ_i). Now consider an extension:
(â_h(x), b̂_h(x)) = argmin_{(a,b)} Σ_i (y_i − a − b(x_i − x))² K(ψ_i)
Note that this is a weighted regression estimator, where the weights are given by the kernel:
(â_h(x), b̂_h(x))' = [ Σ_i K(ψ_i) ( 1, (x_i − x) ; (x_i − x), (x_i − x)² ) ]^{−1} [ Σ_i K(ψ_i) ( y_i ; y_i (x_i − x) ) ]
This can be extended to a higher-order polynomial, i.e. y_i − a − b(x_i − x) − c(x_i − x)². In fact, for the estimation of β(x) = d m(x)/dx it is better to include a quadratic term to reduce bias. Note that as h → ∞, the local linear regression estimator becomes the OLS estimator: lim_{h→∞} â_h(x) = β̂_0 + β̂_1 x and lim_{h→∞} b̂_h(x) = β̂_1.
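The weighted-least-squares form above translates directly into code. A Python sketch (Gaussian kernel assumed): the intercept estimates m(x) and the slope estimates m'(x).

```python
import numpy as np

def local_linear(x, xdata, ydata, h):
    """Local linear fit at x: weighted least squares of y on (1, x_i - x) with
    Gaussian kernel weights. a_hat estimates m(x), b_hat estimates m'(x)."""
    xdata, ydata = np.asarray(xdata), np.asarray(ydata)
    psi = (x - xdata) / h
    w = np.exp(-0.5 * psi**2)
    X = np.column_stack([np.ones_like(xdata), xdata - x])
    XtW = X.T * w                                  # weight each observation
    a_hat, b_hat = np.linalg.solve(XtW @ X, XtW @ ydata)
    return a_hat, b_hat
```

On exactly linear data the weighted residuals are zero for any bandwidth, so the fit recovers the line exactly: for y = 1 + 3x evaluated at x = 0.5 we get a ≈ 2.5 and b ≈ 3.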
24 Some asymptotic properties: Assumptions
Consider the following assumptions:
1. m and f are twice differentiable in a neighborhood of x; f is bounded in a neighborhood of x; x ∈ int(X). [What does it mean? It rules out points with jumps.]
2. The kernel K is a symmetric function satisfying (i) ∫ K(ψ) dψ = 1; (ii) ∫ ψ K(ψ) dψ = 0; (iii) ∫ ψ² K(ψ) dψ = μ₂ < ∞. [What does it mean? Same properties as for density estimation.]
3. h = h_n → 0 and nh → ∞ as n → ∞. [What does it mean? Same properties as for density estimation.]
4. The x's are iid and independent of the error term in the model y = m(x) + u. [What does it mean? Exogeneity assumption.]
25 Some asymptotic properties: Nadaraya-Watson estimator
Theorem (Pagan & Ullah, p. 101). Under the assumptions above,
BIAS(m̂_h(x)) = (h²/(2f)) μ₂ (m'' f + 2 f' m') + O(n^{−1} h^{−1}) + o(h²)
V(m̂_h(x)) = (σ²/(nhf)) ∫ K²(ψ) dψ + o(n^{−1} h^{−1})
Then the optimal bandwidth should satisfy h_n ∝ n^{−1/5}.
26 Some asymptotic properties: Local linear regression
Theorem (Pagan & Ullah, p. 105). Under the assumptions above,
BIAS(m̂_h(x)) = (m''/2) h² μ₂ + O(n^{−1} h^{−1}) + o(h²)
V(m̂_h(x)) = (σ²/(nhf)) ∫ K²(ψ) dψ + o(n^{−1} h^{−1})
27 The curse of dimensionality
The results above can be adjusted for multiple covariates, q. As the number of covariates increases, the rate of convergence deteriorates; in particular, it becomes O_p(n^{−2/(q+4)}). Compare that with OLS models, where the rate of convergence is O_p(n^{−1/2}) for any q. In particular, h_opt ∝ n^{−1/(q+4)}.
28 Nadaraya-Watson and local polynomial kernel regression can be implemented using the lpoly command.
29 Outline: Partially linear models; Index models
30 Partially linear models: y = x'β + g(z) + u
A semiparametric partially linear model is given by
y_i = x_i'β + g(z_i) + u_i, i = 1, 2, ..., n,
where x_i is a p×1 vector of covariates; z_i is a q×1 vector of covariates; g(·) is an unspecified function; and u_i ≡ y_i − E(y_i|x_i, z_i), so E(u_i|x_i, z_i) = 0 and E(u_i²|x_i, z_i) = σ²(x_i, z_i) (i.e. potentially heteroskedastic).
Example: Suppose we are able to assume linearity in some covariates (i.e. x) but cannot assume it for others (i.e. z).
31 Partially linear models: y = x'β + g(z) + u. Robinson's (1988) and Yatchew's (1998) estimators
This model avoids the curse of dimensionality if few variables, or a single variable, enter z. Separating the covariates into linear and nonlinear ones increases the precision of the estimates. The estimator β̂ is √n-consistent (usual OLS asymptotics for the linear part).
32 Partially linear models: Robinson's (1988) estimator
This model can be estimated by Robinson's (1988) estimator. Consider
E(y_i|z_i) = E(x_i|z_i)'β + g(z_i) + E(u_i|z_i) = E(x_i|z_i)'β + g(z_i) (because E(u|z) = 0).
Then, subtracting this from the original equation,
y_i − E(y_i|z_i) = (x_i − E(x_i|z_i))'β + u_i.
Denoting ỹ_i = y_i − E(y_i|z_i) and x̃_i = x_i − E(x_i|z_i), we get
β̂ = [Σ_i x̃_i x̃_i']^{−1} Σ_i x̃_i ỹ_i.
However, we do not observe the conditional expectations E(y|z) and E(x|z); they are estimated nonparametrically to obtain Ê(y|z) and Ê(x|z). g(z_i) can then be estimated from a nonparametric regression of y_i − x_i'β̂ on z_i.
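The slides point to Stata's semipar command; as a minimal Python sketch of the two steps (scalar x and z, leave-in Nadaraya-Watson first stage with a Gaussian kernel — all simplifying choices of mine):

```python
import numpy as np

def nw(z0, z, v, h):
    # Nadaraya-Watson estimate of E[v | z = z0], Gaussian kernel.
    w = np.exp(-0.5 * ((z - z0) / h)**2)
    return (w * v).sum() / w.sum()

def robinson(y, x, z, h):
    """Robinson's two-step estimator for y = x*beta + g(z) + u (scalar x, z):
    kernel-regress y and x on z, difference out the fits, then OLS on residuals."""
    ey = np.array([nw(zi, z, y, h) for zi in z])
    ex = np.array([nw(zi, z, x, h) for zi in z])
    ytil, xtil = y - ey, x - ex
    beta = (xtil * ytil).sum() / (xtil**2).sum()
    g_input = y - x * beta   # regress this on z nonparametrically to recover g
    return beta, g_input
```

A clean special case: when g(·) is constant, the kernel fits absorb it exactly and β̂ recovers the true coefficient up to rounding.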
33 Partially linear models: Yatchew's (1998) estimator
Sort the data according to z, that is, z_(1) ≤ z_(2) ≤ ... ≤ z_(n). Consider the regression in first differences:
Δy = Δx'β + Δg(z) + Δu
If g(·) is smooth and single-valued with bounded first derivative on a compact support, then Δg(z) → 0 as n → ∞. Then β can be estimated from a regression of Δy on Δx. Obtain β̂. [Note that we do not need to know the form of g(·).]
g(z) can then be estimated from a nonparametric regression of y − x'β̂ on z.
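The differencing step fits in a few lines. A Python sketch for scalar x and z (the function name is mine): with a constant g(·) the differencing removes it exactly, and with a smooth nonconstant g(·) it removes it only approximately as n grows.

```python
import numpy as np

def yatchew(y, x, z):
    """Yatchew's differencing estimator (scalar x, z): sort by z, first-difference
    to knock out the smooth g(z), then OLS of the differenced y on differenced x."""
    order = np.argsort(z)
    dy = np.diff(y[order])
    dx = np.diff(x[order])
    return (dx * dy).sum() / (dx**2).sum()
```

Note the design choice: no kernel or bandwidth is needed for β̂ itself; nonparametric smoothing enters only in the second step that recovers g.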
34 Index models: y = g(x'β) + u
A semiparametric single index model is given by
y_i = g(x_i'β) + u_i, i = 1, 2, ..., n,
where x_i is a q×1 vector of covariates and g(·) is an unspecified function.
Note: this is different from a model where y_i = g(x_i) + u_i.
u_i ≡ y_i − E(y_i|x_i), so E(u_i|x_i) = 0 and E(u_i²|x_i) = σ²(x_i) (i.e. potentially heteroskedastic).
35 Index models: y = g(x'β) + u
Ichimura's (1993) method consists of assuming there is one parameter value, β₀, such that y_i = g(x_i'β₀) + u_i, i = 1, 2, ..., n. However, we can define E(y|x'β) for any β. Note that E(y|x'β) ≠ g(x'β) unless β = β₀.
Consider a grid of β ∈ B, B = [β_(1), β_(2), ..., β_(M)].
For each j = 1, 2, ..., M and for each observation i = 1, 2, ..., n, estimate ĝ_{−i}(x_i'β_(j)) as a leave-one-out nonparametric kernel estimator of g(x_i'β).
Choose β̂ = argmin_{β∈B} Σ_i (y_i − ĝ_{−i}(x_i'β))²
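The grid-search version of the procedure can be sketched directly. A Python illustration (Gaussian kernel, a coarse hypothetical grid with the first coefficient normalized to 1, and a simulated smooth link — all assumptions of mine, not the slides'):

```python
import numpy as np

def loo_nw(index, y, h):
    """Leave-one-out Nadaraya-Watson fit of y on a scalar index, Gaussian kernel."""
    n = len(index)
    fits = np.empty(n)
    for i in range(n):
        psi = (np.delete(index, i) - index[i]) / h
        w = np.exp(-0.5 * psi**2)
        fits[i] = (w * np.delete(y, i)).sum() / w.sum()
    return fits

def ichimura(y, X, grid, h):
    """Semiparametric least squares: pick the beta on the grid whose index
    best explains y through an unspecified link, via leave-one-out fits."""
    best, best_sse = None, np.inf
    for beta in grid:
        sse = ((y - loo_nw(X @ beta, y, h))**2).sum()
        if sse < best_sse:
            best, best_sse = beta, sse
    return best

# Demo: y = sin(3 * x'beta0) with beta0 = (1, 2); first coefficient normalized to 1.
rng = np.random.default_rng(0)
X = rng.uniform(size=(60, 2))
y = np.sin(3 * (X @ np.array([1.0, 2.0])))
grid = [np.array([1.0, b]) for b in (0.0, 1.0, 2.0, 3.0)]
best = ichimura(y, X, grid, h=0.1)
```

At the true index the leave-one-out fit tracks the smooth link closely, while a wrong β leaves y scattered around its index, so the criterion selects β = (1, 2) from the grid.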
36 Binary choice semiparametric single index models
Suppose now y is a binary variable, i.e. y ∈ {0, 1}. The single index model can be applied as an alternative to logit or probit models. This is the Klein and Spady (1993) estimator. Let g(x_i'β) = Pr[y = 1|x_i]. Then apply Ichimura's method but maximize a quasi-log-likelihood:
β̂ = argmax_{β∈B} Σ_i (1 − y_i) ln(1 − ĝ_{−i}(x_i'β)) + y_i ln(ĝ_{−i}(x_i'β))
Note: Compare with probit and logit models, where
β̂ = argmax_β Σ_i (1 − y_i) ln(1 − F(x_i'β)) + y_i ln(F(x_i'β))
and F is either the normal or the logistic cdf.
37 Robinson's (1988) semiparametric partially linear model can be estimated with the semipar command (ssc install semipar). Yatchew's (1998) semiparametric partially linear model can be estimated with the plreg command. sml fits univariate binary-choice models by the semiparametric maximum likelihood estimator of Klein and Spady (1993).
38 References
These slides are based on:
Gutierrez, R.G., Linhart, J.M. and Pitblado, J.S. (2003), From the help desk: Local polynomial regression and Stata plugins, Stata Journal, 3(4).
Ichimura, H. (1993), Semiparametric least squares (SLS) and weighted SLS estimation of single-index models, Journal of Econometrics, 58.
Klein, R.W. and Spady, R.H. (1993), An efficient semiparametric estimator for binary response models, Econometrica, 61.
Pagan, A. and Ullah, A. (1999), Nonparametric Econometrics. Cambridge: Cambridge University Press.
Racine, J. (2008), Nonparametric Econometrics: A Primer, Foundations and Trends in Econometrics, 3(1).
Robinson, P. (1988), Root-n-consistent semi-parametric regression, Econometrica, 56.
Yatchew, A. (1998), Nonparametric regression techniques in economics, Journal of Economic Literature, 36.
More informationLecture 3: Statistical Decision Theory (Part II)
Lecture 3: Statistical Decision Theory (Part II) Hao Helen Zhang Hao Helen Zhang Lecture 3: Statistical Decision Theory (Part II) 1 / 27 Outline of This Note Part I: Statistics Decision Theory (Classical
More informationApplied Economics. Regression with a Binary Dependent Variable. Department of Economics Universidad Carlos III de Madrid
Applied Economics Regression with a Binary Dependent Variable Department of Economics Universidad Carlos III de Madrid See Stock and Watson (chapter 11) 1 / 28 Binary Dependent Variables: What is Different?
More informationNon-linear panel data modeling
Non-linear panel data modeling Laura Magazzini University of Verona laura.magazzini@univr.it http://dse.univr.it/magazzini May 2010 Laura Magazzini (@univr.it) Non-linear panel data modeling May 2010 1
More informationLecture 2: Repetition of probability theory and statistics
Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:
More informationO Combining cross-validation and plug-in methods - for kernel density bandwidth selection O
O Combining cross-validation and plug-in methods - for kernel density selection O Carlos Tenreiro CMUC and DMUC, University of Coimbra PhD Program UC UP February 18, 2011 1 Overview The nonparametric problem
More informationAdditive Isotonic Regression
Additive Isotonic Regression Enno Mammen and Kyusang Yu 11. July 2006 INTRODUCTION: We have i.i.d. random vectors (Y 1, X 1 ),..., (Y n, X n ) with X i = (X1 i,..., X d i ) and we consider the additive
More informationContinuous Random Variables
1 / 24 Continuous Random Variables Saravanan Vijayakumaran sarva@ee.iitb.ac.in Department of Electrical Engineering Indian Institute of Technology Bombay February 27, 2013 2 / 24 Continuous Random Variables
More informationRandom Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R
In probabilistic models, a random variable is a variable whose possible values are numerical outcomes of a random phenomenon. As a function or a map, it maps from an element (or an outcome) of a sample
More informationHistogram Härdle, Müller, Sperlich, Werwatz, 1995, Nonparametric and Semiparametric Models, An Introduction
Härdle, Müller, Sperlich, Werwatz, 1995, Nonparametric and Semiparametric Models, An Introduction Tine Buch-Kromann Construction X 1,..., X n iid r.v. with (unknown) density, f. Aim: Estimate the density
More informationSemiparametric Estimation of a Sample Selection Model in the Presence of Endogeneity
Semiparametric Estimation of a Sample Selection Model in the Presence of Endogeneity Jörg Schwiebert Abstract In this paper, we derive a semiparametric estimation procedure for the sample selection model
More informationNonparametric Estimation of Regression Functions In the Presence of Irrelevant Regressors
Nonparametric Estimation of Regression Functions In the Presence of Irrelevant Regressors Peter Hall, Qi Li, Jeff Racine 1 Introduction Nonparametric techniques robust to functional form specification.
More informationNon-parametric Inference and Resampling
Non-parametric Inference and Resampling Exercises by David Wozabal (Last update 3. Juni 2013) 1 Basic Facts about Rank and Order Statistics 1.1 10 students were asked about the amount of time they spend
More informationIntroduction to machine learning and pattern recognition Lecture 2 Coryn Bailer-Jones
Introduction to machine learning and pattern recognition Lecture 2 Coryn Bailer-Jones http://www.mpia.de/homes/calj/mlpr_mpia2008.html 1 1 Last week... supervised and unsupervised methods need adaptive
More informationPreface. 1 Nonparametric Density Estimation and Testing. 1.1 Introduction. 1.2 Univariate Density Estimation
Preface Nonparametric econometrics has become one of the most important sub-fields in modern econometrics. The primary goal of this lecture note is to introduce various nonparametric and semiparametric
More informationprobability of k samples out of J fall in R.
Nonparametric Techniques for Density Estimation (DHS Ch. 4) n Introduction n Estimation Procedure n Parzen Window Estimation n Parzen Window Example n K n -Nearest Neighbor Estimation Introduction Suppose
More informationMLE and GMM. Li Zhao, SJTU. Spring, Li Zhao MLE and GMM 1 / 22
MLE and GMM Li Zhao, SJTU Spring, 2017 Li Zhao MLE and GMM 1 / 22 Outline 1 MLE 2 GMM 3 Binary Choice Models Li Zhao MLE and GMM 2 / 22 Maximum Likelihood Estimation - Introduction For a linear model y
More informationDo Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods
Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods Robert V. Breunig Centre for Economic Policy Research, Research School of Social Sciences and School of
More informationSimple Estimators for Monotone Index Models
Simple Estimators for Monotone Index Models Hyungtaik Ahn Dongguk University, Hidehiko Ichimura University College London, James L. Powell University of California, Berkeley (powell@econ.berkeley.edu)
More informationCopula Regression RAHUL A. PARSA DRAKE UNIVERSITY & STUART A. KLUGMAN SOCIETY OF ACTUARIES CASUALTY ACTUARIAL SOCIETY MAY 18,2011
Copula Regression RAHUL A. PARSA DRAKE UNIVERSITY & STUART A. KLUGMAN SOCIETY OF ACTUARIES CASUALTY ACTUARIAL SOCIETY MAY 18,2011 Outline Ordinary Least Squares (OLS) Regression Generalized Linear Models
More informationECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria
ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria SOLUTION TO FINAL EXAM Friday, April 12, 2013. From 9:00-12:00 (3 hours) INSTRUCTIONS:
More information7 Semiparametric Estimation of Additive Models
7 Semiparametric Estimation of Additive Models Additive models are very useful for approximating the high-dimensional regression mean functions. They and their extensions have become one of the most widely
More informationOptimal bandwidth selection for the fuzzy regression discontinuity estimator
Optimal bandwidth selection for the fuzzy regression discontinuity estimator Yoichi Arai Hidehiko Ichimura The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper CWP49/5 Optimal
More informationIntroduction to Regression
Introduction to Regression Chad M. Schafer May 20, 2015 Outline General Concepts of Regression, Bias-Variance Tradeoff Linear Regression Nonparametric Procedures Cross Validation Local Polynomial Regression
More informationIntroduction to Maximum Likelihood Estimation
Introduction to Maximum Likelihood Estimation Eric Zivot July 26, 2012 The Likelihood Function Let 1 be an iid sample with pdf ( ; ) where is a ( 1) vector of parameters that characterize ( ; ) Example:
More informationStatistical inference on Lévy processes
Alberto Coca Cabrero University of Cambridge - CCA Supervisors: Dr. Richard Nickl and Professor L.C.G.Rogers Funded by Fundación Mutua Madrileña and EPSRC MASDOC/CCA student workshop 2013 26th March Outline
More informationAppendix A : Introduction to Probability and stochastic processes
A-1 Mathematical methods in communication July 5th, 2009 Appendix A : Introduction to Probability and stochastic processes Lecturer: Haim Permuter Scribe: Shai Shapira and Uri Livnat The probability of
More information3. Probability and Statistics
FE661 - Statistical Methods for Financial Engineering 3. Probability and Statistics Jitkomut Songsiri definitions, probability measures conditional expectations correlation and covariance some important
More informationFormulas for probability theory and linear models SF2941
Formulas for probability theory and linear models SF2941 These pages + Appendix 2 of Gut) are permitted as assistance at the exam. 11 maj 2008 Selected formulae of probability Bivariate probability Transforms
More informationTransformation and Smoothing in Sample Survey Data
Scandinavian Journal of Statistics, Vol. 37: 496 513, 2010 doi: 10.1111/j.1467-9469.2010.00691.x Published by Blackwell Publishing Ltd. Transformation and Smoothing in Sample Survey Data YANYUAN MA Department
More informationBinary Choice Models Probit & Logit. = 0 with Pr = 0 = 1. decision-making purchase of durable consumer products unemployment
BINARY CHOICE MODELS Y ( Y ) ( Y ) 1 with Pr = 1 = P = 0 with Pr = 0 = 1 P Examples: decision-making purchase of durable consumer products unemployment Estimation with OLS? Yi = Xiβ + εi Problems: nonsense
More informationLocal regression I. Patrick Breheny. November 1. Kernel weighted averages Local linear regression
Local regression I Patrick Breheny November 1 Patrick Breheny STA 621: Nonparametric Statistics 1/27 Simple local models Kernel weighted averages The Nadaraya-Watson estimator Expected loss and prediction
More information4 Nonparametric Regression
4 Nonparametric Regression 4.1 Univariate Kernel Regression An important question in many fields of science is the relation between two variables, say X and Y. Regression analysis is concerned with the
More informationRewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35
Rewrap ECON 4135 November 18, 2011 () Rewrap ECON 4135 November 18, 2011 1 / 35 What should you now know? 1 What is econometrics? 2 Fundamental regression analysis 1 Bivariate regression 2 Multivariate
More information3 Nonparametric Density Estimation
3 Nonparametric Density Estimation Example: Income distribution Source: U.K. Family Expenditure Survey (FES) 1968-1995 Approximately 7000 British Households per year For each household many different variables
More information1 Empirical Likelihood
Empirical Likelihood February 3, 2016 Debdeep Pati 1 Empirical Likelihood Empirical likelihood a nonparametric method without having to assume the form of the underlying distribution. It retains some of
More informationFormulary Applied Econometrics
Department of Economics Formulary Applied Econometrics c c Seminar of Statistics University of Fribourg Formulary Applied Econometrics 1 Rescaling With y = cy we have: ˆβ = cˆβ With x = Cx we have: ˆβ
More informationAdaptive Nonparametric Density Estimators
Adaptive Nonparametric Density Estimators by Alan J. Izenman Introduction Theoretical results and practical application of histograms as density estimators usually assume a fixed-partition approach, where
More informationEcon 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines
Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines Maximilian Kasy Department of Economics, Harvard University 1 / 37 Agenda 6 equivalent representations of the
More informationECO Class 6 Nonparametric Econometrics
ECO 523 - Class 6 Nonparametric Econometrics Carolina Caetano Contents 1 Nonparametric instrumental variable regression 1 2 Nonparametric Estimation of Average Treatment Effects 3 2.1 Asymptotic results................................
More informationPerhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows.
Chapter 5 Two Random Variables In a practical engineering problem, there is almost always causal relationship between different events. Some relationships are determined by physical laws, e.g., voltage
More informationThe logistic regression model is thus a glm-model with canonical link function so that the log-odds equals the linear predictor, that is
Example The logistic regression model is thus a glm-model with canonical link function so that the log-odds equals the linear predictor, that is log p 1 p = β 0 + β 1 f 1 (y 1 ) +... + β d f d (y d ).
More informationRegression Discontinuity Designs
Regression Discontinuity Designs Kosuke Imai Harvard University STAT186/GOV2002 CAUSAL INFERENCE Fall 2018 Kosuke Imai (Harvard) Regression Discontinuity Design Stat186/Gov2002 Fall 2018 1 / 1 Observational
More informationStatistics: Learning models from data
DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial
More informationAnalogy Principle. Asymptotic Theory Part II. James J. Heckman University of Chicago. Econ 312 This draft, April 5, 2006
Analogy Principle Asymptotic Theory Part II James J. Heckman University of Chicago Econ 312 This draft, April 5, 2006 Consider four methods: 1. Maximum Likelihood Estimation (MLE) 2. (Nonlinear) Least
More information