Nonparametric and Semiparametric Estimation Whitney K. Newey Fall 2007


Introduction

Functional form misspecification error is important in elementary econometrics. Flexible functional forms, e.g. the translog
$$y = \beta_1 + \beta_2 \ln(x) + \beta_3 [\ln(x)]^2,$$
are fine for simple nonlinearity, e.g. diminishing returns. Economic theory does not restrict the form. Nonparametric methods allow for complete flexibility. Good for graphs. Good for complete flexibility with a few dimensions.

An Empirical Example

An example illustrates. Deaton (1989): the effect of rice prices on the distribution of incomes in Thailand. Let $p$ be the price of rice, $q$ the amount purchased, and $y$ the amount sold. The change in benefits from $dp$ is $dB = (q-y)\,dp = p(q-y)\,d\ln(p)$. In elasticity form, $dB/[x\,d\ln(p)] = w - py/x$, where $w$ is the budget share of rice purchases and $x$ is total expenditure. The benefit/expenditure measure is the negative of the right-hand side.

Empirical Distribution Function

A simple nonparametric estimation problem. The CDF of $z_i$ is $F(z) = \Pr(z_i \le z)$. Let $z_1,\dots,z_n$ be i.i.d. data and $1(A)$ the indicator of the event $A$, so $F(z) = E[1(z_i \le z)]$. Then
$$\hat F(z) = \frac{\#\{i : z_i \le z\}}{n} = \frac{1}{n}\sum_{i=1}^n 1(z_i \le z)$$
is the empirical CDF. It puts probability weight $1/n$ on each observation. It is consistent and asymptotically normal, and nonparametrically efficient. It is no good for density estimation.
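As a minimal sketch (not part of the original notes), the empirical CDF can be computed directly from its definition; the sample and the evaluation grid below are made up for illustration.

```python
# Empirical CDF: F_hat(t) = (1/n) * sum_i 1(z_i <= t), evaluated on a grid.
import numpy as np

def empirical_cdf(z, grid):
    z = np.asarray(z)
    return np.array([(z <= t).mean() for t in grid])

z = np.random.normal(size=200)          # hypothetical i.i.d. sample
grid = np.linspace(-3, 3, 7)
print(empirical_cdf(z, grid))           # consistent, asymptotically normal estimates of F(t)
```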

Kernel Density Estimator

Add a little continuous noise to smooth out the empirical CDF. Let $\tilde z$ have the empirical CDF (i.e. $\tilde z$ equals each observation $z_i$ with probability $1/n$), let $U$ be a continuous random variable with pdf $K(u)$, independent of $\tilde z$, and let $h$ be a positive scalar. Define
$$z_h = \tilde z + hU,$$
the empirical CDF plus noise. The kernel density estimator is the density of $z_h$.

Derivation: Let $F_U(u) = \int_{-\infty}^{u} K(t)\,dt$ be the CDF of $U$. By iterated expectations and $1(z_h \le z) = 1(U \le (z-\tilde z)/h)$,
$$F_h(z) = \Pr(z_h \le z) = E[1(z_h \le z)] = E\big[E[1(z_h \le z)\mid \tilde z]\big] = E\Big[F_U\Big(\frac{z-\tilde z}{h}\Big)\Big] = \sum_{i=1}^n F_U\Big(\frac{z-z_i}{h}\Big)/n.$$
Differentiating gives the pdf
$$\hat f_h(z) = dF_h(z)/dz = \sum_{i=1}^n K_h(z-z_i)/n, \qquad K_h(u) = \frac{1}{h}K(u/h).$$
This is a kernel density estimator. The function $K(u)$ is the kernel and the scalar $h$ is the bandwidth. The bandwidth controls the amount of smoothing. As $h$ increases the density gets smoother, but there is more noise from $U$, i.e. more bias. As $h \to 0$ we get a rough density, with spikes at the data points, but the bias shrinks. Choosing $h$ is important in practice; see below. $\hat f_h(z)$ will be consistent if $h \to 0$ and $nh \to \infty$.

Examples:
Gaussian kernel: $K(u) = (2\pi)^{-1/2}e^{-u^2/2}$.
Epanechnikov kernel: $K(u) = 1(|u| \le 1)(1-u^2)(3/4)$.
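A minimal sketch (not part of the original notes) of the estimator $\hat f_h(z) = \sum_i K_h(z-z_i)/n$ with the two kernels just listed; the data and bandwidth are made up.

```python
# Kernel density estimator: average of K((z - z_i)/h)/h over the sample.
import numpy as np

def gaussian_kernel(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

def epanechnikov_kernel(u):
    return 0.75 * (1 - u**2) * (np.abs(u) <= 1)

def kde(points, data, h, kernel=gaussian_kernel):
    points = np.asarray(points, dtype=float)[:, None]    # evaluation points as a column
    u = (points - np.asarray(data)[None, :]) / h
    return kernel(u).mean(axis=1) / h

z = np.random.normal(size=500)
print(kde([-1.0, 0.0, 1.0], z, h=0.3))                   # rough estimates of the N(0,1) density
print(kde([-1.0, 0.0, 1.0], z, h=0.3, kernel=epanechnikov_kernel))
```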

The choice of $K$ does not matter as much as the choice of $h$. The Epanechnikov kernel has slightly smaller mean square error, and so is optimal.

Bias and Variance of Kernel Estimators

$(z_1,\dots,z_n)$ are i.i.d. with pdf $f_0(z)$.

Bias: the expectation of the kernel estimator is
$$E[\hat f_h(z)] = \int K_h(z-t)f_0(t)\,dt = \int \frac{1}{h}K\Big(\frac{z-t}{h}\Big)f_0(t)\,dt = \int K(u)f_0(z-hu)\,du,$$
by the change of variables $u = (z-t)/h$. Taylor expand $f_0(z-hu)$ around $h=0$:
$$f_0(z-hu) = f_0(z) - f_0'(z)hu + \Gamma(h,u,z)h^2, \qquad \Gamma(h,u,z) = f_0''(z+\bar h(z,u)u)u^2/2,$$
where $|\bar h(z,u)| \le h$. For $\int K(u)u^2\,du < \infty$, $\int K(u)u\,du = 0$, and $f_0''(z)$ continuous and bounded,
$$\int K(u)\Gamma(h,u,z)\,du \to \Big[\int K(u)u^2\,du\Big]f_0''(z)/2 \quad \text{as } h \to 0.$$
Then, writing $o(h^2)$ for any $a(h)$ with $\lim_{h\to 0}a(h)/h^2 = 0$, multiplying the expansion by $K(u)$ and integrating gives
$$h^2\int K(u)\Gamma(h,u,z)\,du = h^2\Big[\int K(u)u^2\,du\Big]f_0''(z)/2 + o(h^2),$$
$$E[\hat f_h(z)] = f_0(z) + h^2 f_0''(z)\int K(u)u^2\,du/2 + o(h^2).$$
We can summarize these calculations in the following result:

Proposition 1: If $f_0(z)$ is twice continuously differentiable with bounded second derivative, $\int K(u)\,du = 1$, $\int K(u)u\,du = 0$, and $\int u^2K(u)\,du < \infty$, then
$$E[\hat f_h(z)] - f_0(z) = h^2 f_0''(z)\int K(u)u^2\,du/2 + o(h^2).$$

Variance: From Proposition 1, $E[K_h(z-z_i)] = E[\hat f_h(z)]$ is bounded as $h \to 0$. Let $O(1/n)$ denote a sequence $(a_n)_{n=1}^\infty$ such that $na_n$ is bounded. Then, since $\hat f_h(z)$ is a sample average of $K_h(z-z_i)$, for $h \to 0$,
$$\mathrm{Var}(\hat f_h(z)) = \big\{E[K_h(z-z_i)^2] - \{E[K_h(z-z_i)]\}^2\big\}/n = \int \frac{1}{h^2}K\Big(\frac{z-t}{h}\Big)^2 f_0(t)\,dt/n + O(1/n) = \int K(u)^2 f_0(z-hu)\,du/(nh) + O(1/n).$$
For $f_0(z)$ continuous and bounded and $\int K(u)^2\,du < \infty$,
$$\int K(u)^2 f_0(z-hu)\,du \to f_0(z)\int K(u)^2\,du.$$
By $h \to 0$ it follows that $nh\,O(1/n) \to 0$, so that $O(1/n) = o(1/(nh))$. Plugging into the variance formula above we find
$$\mathrm{Var}(\hat f_h(z)) = f_0(z)\int K(u)^2\,du/(nh) + o(1/(nh)).$$
We can summarize these calculations in the following result:

Proposition 2: If $f_0(z)$ is continuous and bounded, $\int K(u)^2\,du < \infty$, $h \to 0$, and $nh \to \infty$, then
$$\mathrm{Var}[\hat f_h(z)] = f_0(z)\int K(u)^2\,du/(nh) + o(1/(nh)).$$

Consistency and Convergence Rate of Kernel Estimators

Consistency is implied by $h \to 0$ (the bias goes to zero) and $nh \to \infty$ (the variance goes to zero): the bandwidth shrinks to zero more slowly than $1/n$. Intuition for $h \to 0$: the smoothing noise must go away asymptotically to remove all bias. Intuition for $nh \to \infty$: for the Epanechnikov kernel, $K((z-z_i)/h) > 0$ if and only if $|z-z_i| < h$. If $h$ shrinks as fast as or faster than $1/n$, the number of observations with $|z-z_i| < h$ will not grow, so we are averaging over a finite number of observations and hence the variance does not go to zero.

Explicit form for the mean squared error (MSE) under $h \to 0$, $nh \to \infty$:
$$\mathrm{MSE}(\hat f_h(z)) = \mathrm{Var}(\hat f_h(z)) + \mathrm{Bias}^2(\hat f_h(z)) = f_0(z)\int K(u)^2\,du/(nh) + h^4\Big\{f_0''(z)\int K(u)u^2\,du/2\Big\}^2 + o(h^4 + 1/(nh)).$$
By $h \to 0$, the MSE vanishes more slowly than $1/n$. Thus the kernel estimator converges more slowly than $n^{-1/2}$. Avoiding bias by $h \to 0$ means the fraction of the observations used goes to zero.

Bandwidth Choice for Density Estimation:

Graphical: choose one that looks good, report several.

Minimize asymptotic integrated MSE. Integrating over $z$,
$$\int \mathrm{MSE}(\hat f_h(z))\,dz = \frac{C_1}{nh} + \int f_0''(z)^2\,dz\,\frac{C_2}{4}h^4 + o(h^4 + 1/(nh)), \qquad C_1 = \int K(u)^2\,du, \quad C_2 = \Big[\int K(u)u^2\,du\Big]^2.$$
Minimizing over $h$ has the first-order condition
$$0 = -n^{-1}h^{-2}C_1 + h^3 C_2\int f_0''(z)^2\,dz.$$
Solving gives
$$h^* = \Big[C_1\Big/\Big\{nC_2\int f_0''(z)^2\,dz\Big\}\Big]^{1/5},$$
the asymptotically optimal bandwidth. It could be estimated by plugging in an estimator of $f_0''(z)$. This procedure depends on an initial bandwidth, but the final estimator is not as sensitive to the choice of bandwidth for $f_0''(z)$ as to the choice of bandwidth for $f_0(z)$.

Silverman's rule of thumb: the optimal bandwidth when $f_0(z)$ is Gaussian and $K(u)$ is a standard normal pdf,
$$h^* = 1.06\,\sigma n^{-1/5}, \qquad \sigma = \mathrm{Var}(z_i)^{1/2}.$$
Use an estimator of the standard deviation $\sigma$.
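A minimal sketch (not part of the original notes) of the rule of thumb, using the sample standard deviation as the estimator of $\sigma$; the data are simulated.

```python
# Silverman's rule of thumb: h = 1.06 * sigma_hat * n^(-1/5).
import numpy as np

def silverman_bandwidth(z):
    z = np.asarray(z)
    return 1.06 * z.std(ddof=1) * z.size ** (-1 / 5)

z = np.random.normal(size=1000)
print(silverman_bandwidth(z))
```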

Estimate directly the integrated MSE:
$$\int \mathrm{MSE}(\hat f_h(z))\,dz = \int E[\{\hat f_h(z) - f_0(z)\}^2]\,dz = E\Big[\int \{\hat f_h(z)-f_0(z)\}^2\,dz\Big] = E\Big[\int \hat f_h(z)^2\,dz\Big] - 2E\Big[\int \hat f_h(z)f_0(z)\,dz\Big] + \int f_0(z)^2\,dz.$$
An unbiased estimator of $E[\int \hat f_h(z)^2\,dz]$ is
$$\int \hat f_h(z)^2\,dz = \sum_{i,j}\int K_h(z_i - t)K_h(t - z_j)\,dt\big/n^2.$$
To find an unbiased estimator of the second term, note that
$$E\Big[\int \hat f_h(z)f_0(z)\,dz\Big] = \int\!\!\int K_h(z-t)f_0(z)f_0(t)\,dz\,dt.$$
Because the observations are independent, we can average $K_h(z_i - z_j)$ over pairs $i \ne j$ to estimate this term. The last term in the MSE does not depend on $h$, so we can drop it. Combining the estimates of the first two terms gives the criterion
$$\widehat{CV}(h) = \int \hat f_h(z)^2\,dz - \frac{2}{n(n-1)}\sum_{i\ne j}K_h(z_i - z_j).$$
Choosing $h$ by minimizing $\widehat{CV}(h)$ is called cross-validation. The motivation for the terminology is that the second term is
$$-2\sum_{i=1}^n \hat f_{-i,h}(z_i)/n, \qquad \hat f_{-i,h}(z) = \sum_{j\ne i}K_h(z-z_j)/(n-1).$$
Here $\hat f_{-i,h}(z)$ is the estimator that uses all observations but the $i$th, so that $\hat f_{-i,h}(z_i)$ is cross-validated.
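A minimal sketch (not part of the original notes) of this criterion for a Gaussian kernel, where $\int\hat f_h(z)^2dz$ has a closed form because the convolution of two $N(0,h^2)$ densities is the $N(0,2h^2)$ density; the data and the bandwidth grid are made up.

```python
# Least-squares cross-validation for the density bandwidth:
# CV(h) = int f_hat_h^2 dz - 2/(n(n-1)) * sum_{i != j} K_h(z_i - z_j).
import numpy as np

def gauss_pdf(x, var):
    return np.exp(-0.5 * x**2 / var) / np.sqrt(2 * np.pi * var)

def cv_criterion(h, z):
    z = np.asarray(z)
    n = z.size
    d = z[:, None] - z[None, :]                       # all pairwise differences
    term1 = gauss_pdf(d, 2 * h**2).sum() / n**2       # closed form of int f_hat_h^2 dz
    off_diag = gauss_pdf(d, h**2).sum() - n * gauss_pdf(0.0, h**2)
    term2 = 2 * off_diag / (n * (n - 1))              # leave-one-out term
    return term1 - term2

z = np.random.normal(size=300)
grid = np.linspace(0.05, 1.0, 20)
print(min(grid, key=lambda h: cv_criterion(h, z)))    # cross-validated bandwidth
```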

Multivariate Density Estimation:

Multivariate density estimation can be important, as in the example. Let $z$ be $r \times 1$, let $K(u)$ denote a pdf for an $r \times 1$ random vector, e.g. $K(u) = \prod_{j=1}^r k(u_j)$ for a univariate pdf $k(u)$, and let $\hat\Sigma$ be the sample covariance matrix of $z_i$. For a bandwidth $h$ let
$$K_h(u) = h^{-r}\det(\hat\Sigma)^{-1/2}K(\hat\Sigma^{-1/2}u/h),$$
where $A^{-1/2}$ is the inverse square root of a positive definite matrix $A$. Often $K(u) = \rho(u'u)$ for some $\rho$, so $K_h(u) = h^{-r}\det(\hat\Sigma)^{-1/2}\rho(u'\hat\Sigma^{-1}u/h^2)$. The multivariate kernel estimator, with scale normalization, is
$$\hat f_h(z) = \sum_{i=1}^n K_h(z-z_i)/n.$$
Gaussian kernel: $K(u) = (2\pi)^{-r/2}\exp(-u'u/2)$.
Epanechnikov kernel: $K(u) = C_r(1-u'u)1(u'u \le 1)$.

The Curse of Dimensionality for Kernel Estimation

It is difficult to nonparametrically estimate pdfs of high-dimensional $z_i$: one needs many observations within a small distance of a point. As the dimension rises with the distance fixed, the proportion of observations that are close shrinks very rapidly. Mathematically, in one dimension $[0,1]$ can be covered with $1/h$ balls of radius $h$, while it requires $(1/h)^r$ balls to cover $[0,1]^r$. Thus, if data are equally likely to fall in each ball, there tend to be many fewer data points in any one ball for high-dimensional data.

Silverman (1986, Density Estimation) has a famous table: for a multivariate normal density at zero, the sample size required for the MSE to be 10 percent of the density is

r:  1   2   3    4    5     6       7        8        9         10
n:  4  19  67  223  768  2,790  10,700  43,700  187,000  842,000

The curse of dimensionality shows up in the convergence rate. Expanding again,
$$f_0(z-hu) = f_0(z) - h[\partial f_0(z)/\partial z]'u + h^2 u'[\partial^2 f_0(z+\bar h(z,u)u)/\partial z\partial z']u/2,$$
so the bias is asymptotically $C_3h^2$ no matter how big $r$ is. For the variance, setting $u = (z-t)/h$,
$$\int K_h(z-t)^2f_0(t)\,dt = h^{-2r}\int K((z-t)/h)^2f_0(t)\,dt = h^{-r}\int K(u)^2f_0(z-hu)\,du.$$
Integrating, with $\hat\Sigma = I$,
$$\int E[\{\hat f_h(z)-f_0(z)\}^2]\,dz \approx C_1/(nh^r) + C_2h^4.$$
First-order condition for the optimal $h$: $0 = -C_1rn^{-1}h^{-r-1} + 4C_2h^3$. Solving gives $h^* = C_0n^{-1/(r+4)}$. Plugging back in gives
$$\int E[\{\hat f_h(z)-f_0(z)\}^2]\,dz \approx Cn^{-4/(r+4)}$$
for a constant $C$.

The convergence rate declines as $r$ increases.

Nonparametric Regression

Often in econometrics the object of estimation is a regression function. A classical formulation is $Y = X'\beta + \varepsilon$ with $E[\varepsilon|X] = 0$. A more direct way to write this model is $E[Y|X] = X'\beta$. A nonparametric version of this model, that allows for an unknown functional form in the regression, is $E[Y|X] = g_0(X)$, where $g_0(x)$ is an unknown function.

Kernel Regression

To estimate $g_0(x)$ nonparametrically, one can start with a kernel density estimate and plug it into the formula
$$g_0(x) = \int y f(y|x)\,dy = \int y f(y,x)\,dy\Big/\int f(y,x)\,dy.$$
Assume that $X$ is a scalar. Let $k(u_1,u_2)$ be a bivariate kernel with $\int t\,k(t,u_2)\,dt = 0$. The data are $(Y_1,X_1),\dots,(Y_n,X_n)$. Let $K(u_2) = \int k(t,u_2)\,dt$. By the change of variables $t = (y-Y_i)/h$,
$$\int y\hat f_h(y,x)\,dy = n^{-1}h^{-2}\sum_{i=1}^n\int y\,k\Big(\frac{y-Y_i}{h},\frac{x-X_i}{h}\Big)dy = n^{-1}h^{-1}\sum_{i=1}^n\int (Y_i+ht)k\Big(t,\frac{x-X_i}{h}\Big)dt = n^{-1}h^{-1}\sum_{i=1}^n Y_iK\Big(\frac{x-X_i}{h}\Big),$$
$$\int \hat f_h(y,x)\,dy = n^{-1}h^{-2}\sum_{i=1}^n\int k\Big(\frac{y-Y_i}{h},\frac{x-X_i}{h}\Big)dy = n^{-1}h^{-1}\sum_{i=1}^n K\Big(\frac{x-X_i}{h}\Big).$$
Plugging $\hat f_h(y,x)$ into the formula for $g_0(x)$,
$$\hat g_h(x) = \frac{\int y\hat f_h(y,x)\,dy}{\int \hat f_h(y,x)\,dy} = \frac{\sum_{i=1}^n Y_iK\big(\frac{x-X_i}{h}\big)}{\sum_{i=1}^n K\big(\frac{x-X_i}{h}\big)}.$$
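A minimal sketch (not part of the original notes) of $\hat g_h(x)$ with a Gaussian kernel and simulated data; the kernel's normalizing constant cancels in the ratio.

```python
# Kernel (Nadaraya-Watson) regression: g_hat(x) = sum_i Y_i K((x-X_i)/h) / sum_i K((x-X_i)/h).
import numpy as np

def kernel_regression(x_eval, X, Y, h):
    u = (np.asarray(x_eval, dtype=float)[:, None] - np.asarray(X)[None, :]) / h
    K = np.exp(-0.5 * u**2)                           # Gaussian kernel weights
    return (K * np.asarray(Y)[None, :]).sum(axis=1) / K.sum(axis=1)

X = np.random.uniform(0, 1, 400)
Y = np.sin(2 * np.pi * X) + 0.3 * np.random.normal(size=400)
print(kernel_regression([0.25, 0.5, 0.75], X, Y, h=0.05))
```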

The kernel regression estimator $\hat g_h(x)$ just defined is a weighted average,
$$\hat g_h(x) = \sum_{i=1}^n w_i(x)Y_i, \qquad w_i(x) = \frac{K\big(\frac{x-X_i}{h}\big)}{\sum_{j=1}^n K\big(\frac{x-X_j}{h}\big)}.$$
By construction $\sum_i w_i(x) = 1$, while $w_i(x)$ is nonnegative when $K(u)$ is nonnegative. For symmetric $K(u)$ with a unique mode at $u=0$, more weight is given to observations with $X_i$ closer to $x$. The bandwidth controls how fast the weights decline: as $h$ declines, more weight is given to closer observations, which reduces bias but increases variance. For multivariate $X$ the formula is the same, with $K(u)$ replaced by a multivariate kernel, such as $K(u) = \det(\hat\Sigma)^{-1/2}k(\hat\Sigma^{-1/2}u)$ for some kernel $k(u)$. Consistency and convergence rate results are similar. Details are not reported here because the calculations are complicated by the ratio form of the estimator. Bandwidth choice is discussed below.

Series Regression

Another approach to nonparametric regression takes flexible functional forms to complete flexibility by letting the number of terms grow with the sample size. Think of approximating $g_0(x)$ by a linear combination $\sum_{j=1}^K p_{jK}(x)\beta_j$ of approximating functions $p_{jK}(x)$, e.g. polynomials or splines. The estimator of $g_0(x)$ is the predicted value from regressing $Y_i$ on $p^K(X_i)$, for $p^K(x) = (p_{1K}(x),\dots,p_{KK}(x))'$. It is consistent as $K$ grows with $n$. Let $Y = (Y_1,\dots,Y_n)'$ and $P = [p^K(X_1),\dots,p^K(X_n)]'$. Then
$$\hat g(x) = p^K(x)'\hat\beta, \qquad \hat\beta = (P'P)^{-}P'Y,$$
where $A^{-}$ denotes a generalized inverse, $AA^{-}A = A$. Different from the kernel estimator: a global fit rather than a local average.

Examples:

Power series: for scalar $x$, $p_{jK}(x) = x^{j-1}$, $(j = 1,2,\dots)$, so $p^K(x)'\beta$ is a polynomial. Such approximations are good for global approximation of smooth functions, not when most variation is in a narrow range or when there is a jump or kink. Using polynomials that are orthogonal with respect to a density can mitigate multicollinearity.
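A minimal sketch (not part of the original notes) of a power-series fit; a regression-spline version would simply swap in the spline basis defined next. The data and the choice $K=5$ are made up.

```python
# Series regression: regress Y on p^K(X) = (1, X, ..., X^{K-1})' and form predicted values.
import numpy as np

def power_basis(x, K):
    x = np.asarray(x, dtype=float)
    return np.column_stack([x**j for j in range(K)])

def series_fit(x_eval, X, Y, K):
    P = power_basis(X, K)
    beta, *_ = np.linalg.lstsq(P, Y, rcond=None)      # beta_hat = (P'P)^- P'Y
    return power_basis(x_eval, K) @ beta

X = np.random.uniform(0, 1, 500)
Y = np.exp(X) + 0.2 * np.random.normal(size=500)
print(series_fit([0.3, 0.7], X, Y, K=5))              # global polynomial fit of E[Y|X]
```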

Regression splines: $x$ is scalar,
$$p_{jK}(x) = x^{j-1}, \quad (j = 1,2,3,4), \qquad p_{jK}(x) = 1(x > \ell_{j-4,K})(x - \ell_{j-4,K})^3, \quad (j = 5,\dots,K).$$
The $\ell_{1,K},\dots,\ell_{K-4,K}$ are knots; $p^K(x)'\beta$ is cubic between knots and twice continuously differentiable everywhere. It picks up local variation but is still a global fit. One needs to place the knots. B-splines can mitigate multicollinearity. Both approaches extend to the multivariate case by including products of individual components.

Convergence Rate for Series Regression

The rate depends on the approximation rate, i.e. the bias.

Assumption A: There exist $\gamma > 0$, $\beta_K$, and $C$ such that
$$\big\{E[\{g_0(X_i) - p^K(X_i)'\beta_K\}^2]\big\}^{1/2} \le CK^{-\gamma}.$$
This comes from approximation theory. For power series, with $X_i$ in a compact set and $g_0(x)$ continuously differentiable of order $s$, $\gamma = s/r$. For splines, $\gamma = \min\{4,s\}/r$. For the multivariate case the approximation depends on the order of the included terms (e.g. on the power), which grows more slowly with $K$ when $x$ is higher dimensional.

Proposition 3: If Assumption A is satisfied and $\mathrm{Var}(Y_i|X_i) \le C$, then
$$E[\{\hat g_K(X_i) - g_0(X_i)\}^2] \le CK/n + C^2K^{-2\gamma}.$$

Proof: Let $Q = P(P'P)^{-}P'$, $\hat g = (\hat g(X_1),\dots,\hat g(X_n))' = QY$, $g_0 = (g_0(X_1),\dots,g_0(X_n))'$, $\varepsilon = Y - g_0$, and $\bar g = P\beta_K$. $Q$ is idempotent, so $I-Q$ is idempotent and hence has eigenvalues that are zero or one. Therefore, by Assumption A,
$$E[(g_0-\bar g)'(I-Q)(g_0-\bar g)] \le E[(g_0-\bar g)'(g_0-\bar g)] = nE[\{g_0(X_i)-p^K(X_i)'\beta_K\}^2] \le nC^2K^{-2\gamma}.$$
Also, for $X = (X_1,\dots,X_n)$, by independence and iterated expectations, for $i \ne j$,
$$E[\varepsilon_i\varepsilon_j|X] = E[\varepsilon_i\varepsilon_j|X_i,X_j] = E[\varepsilon_iE[\varepsilon_j|X_i,X_j,\varepsilon_i]|X_i,X_j] = E[\varepsilon_iE[\varepsilon_j|X_j]|X_i,X_j] = 0.$$
Then for $\Lambda_{ii} = \mathrm{Var}(Y_i|X_i)$ and $\Lambda = \mathrm{diag}(\Lambda_{11},\dots,\Lambda_{nn})$ we have $E[\varepsilon\varepsilon'|X] = \Lambda$. It follows that, for $\mathrm{tr}(A)$ the trace of a square matrix $A$, by $\mathrm{rank}(Q) \le K$,

$$E[\varepsilon'Q\varepsilon|X] = \mathrm{tr}(QE[\varepsilon\varepsilon'|X]) = \mathrm{tr}(Q\Lambda) = \mathrm{tr}(Q\Lambda Q) \le C\,\mathrm{tr}(Q) \le CK.$$
Then by iterated expectations, $E[\varepsilon'Q\varepsilon] \le CK$. Also,
$$\sum_i\{\hat g(X_i)-g_0(X_i)\}^2 = (\hat g-g_0)'(\hat g-g_0) = (Q\varepsilon-(I-Q)g_0)'(Q\varepsilon-(I-Q)g_0) = \varepsilon'Q\varepsilon + g_0'(I-Q)g_0 = \varepsilon'Q\varepsilon + (g_0-\bar g)'(I-Q)(g_0-\bar g).$$
Then by i.i.d. observations,
$$E[\{\hat g(X_i)-g_0(X_i)\}^2] = E[(\hat g_i-g_{0i})^2] = E[(\hat g-g_0)'(\hat g-g_0)]/n \le CK/n + C^2K^{-2\gamma}. \quad \text{Q.E.D.}$$

Choosing the Bandwidth or Number of Terms

A data-based choice operationalizes complete flexibility: the number of terms or the bandwidth adjusts to the data. Cross-validation is common. Let $\hat g_{-i,h}(X_i)$ and $\hat g_{-i,K}(X_i)$ be the predicted values for the $i$th observation using all the others, for the kernel and series estimators respectively:
$$\hat g_{-i,h}(X_i) = \frac{\sum_{j\ne i}Y_jK\big(\frac{X_i-X_j}{h}\big)}{\sum_{j\ne i}K\big(\frac{X_i-X_j}{h}\big)}, \qquad \hat g_{-i,K}(X_i) = Y_i - \frac{Y_i-\hat g_K(X_i)}{1 - p^K(X_i)'(P'P)^{-}p^K(X_i)},$$
the second equality by recursive residuals. The criteria are
$$\widehat{CV}(h) = \sum_i v(X_i)[Y_i-\hat g_{-i,h}(X_i)]^2, \qquad \widehat{CV}(K) = \sum_i[Y_i-\hat g_{-i,K}(X_i)]^2,$$
where $v(x)$ is a weight function, equal to zero near the boundary of the support. Choose $h$ or $K$ to minimize.
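A minimal sketch (not part of the original notes) of both criteria, with $v(x)\equiv 1$, a Gaussian kernel, a power-series basis, and simulated data; the grids of $h$ and $K$ values are made up.

```python
# Leave-one-out cross-validation for the kernel bandwidth and the number of series terms.
import numpy as np

def cv_kernel(h, X, Y):
    u = (X[:, None] - X[None, :]) / h
    K = np.exp(-0.5 * u**2)
    np.fill_diagonal(K, 0.0)                          # drop the own observation
    g_loo = (K @ Y) / K.sum(axis=1)
    return np.mean((Y - g_loo)**2)

def cv_series(K_terms, X, Y):
    P = np.column_stack([X**j for j in range(K_terms)])
    Q = P @ np.linalg.pinv(P.T @ P) @ P.T             # projection matrix P(P'P)^- P'
    loo = (Y - Q @ Y) / (1.0 - np.diag(Q))            # recursive-residual formula
    return np.mean(loo**2)

X = np.random.uniform(0, 1, 300)
Y = np.sin(2 * np.pi * X) + 0.3 * np.random.normal(size=300)
print(min(np.linspace(0.02, 0.3, 15), key=lambda h: cv_kernel(h, X, Y)))
print(min(range(2, 10), key=lambda K: cv_series(K, X, Y)))
```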

The (weighted) sample MSE is
$$\mathrm{MSE}(h) = \sum_i v(X_i)\{\hat g_h(X_i)-g_0(X_i)\}^2/n.$$
The cross-validation criterion is optimal in the sense that $\mathrm{MSE}(\hat h)/\min_h\mathrm{MSE}(h) \to_p 1$; the same holds for series. $\hat h$ itself converges slowly.

Another Empirical Example: Hausman and Newey (1995, "Nonparametric Estimation of Exact Consumers Surplus," Econometrica) report kernel and series estimates; $Y$ is the log of gasoline purchased, a function of price, income, and time and location dummies. Six cross-sections of individuals from the Energy Department, a total of 18,109 observations. [Table of cross-validation criterion values for the kernel bandwidth, the number of spline knots, and the power-series order; the numerical entries are not recoverable here.]

Locally Linear Regression:

There is another local method, locally linear regression, that is thought to be superior to kernel regression. It is based on locally fitting a line rather than a constant. Unlike kernel regression, locally linear estimation would have no bias if the true model were linear. In general, locally linear estimation removes a bias term from the kernel estimator, which makes it behave better near the boundary of the $x$'s and have smaller MSE everywhere. To describe this estimator, let $K_h(u) = h^{-r}K(u/h)$ as before. Consider the estimator $\hat g(x)$ given by the solution to
$$\min_{g,\beta}\ \sum_i\big(Y_i - g - (x-X_i)'\beta\big)^2K_h(x-X_i).$$

That is, $\hat g(x)$ is the constant term in a weighted least squares regression of $Y_i$ on $(1, x-X_i)$, with weights $K_h(x-X_i)$. For
$$Y = \begin{pmatrix}Y_1\\ \vdots\\ Y_n\end{pmatrix}, \qquad \tilde X = \begin{pmatrix}1 & (x-X_1)'\\ \vdots & \vdots\\ 1 & (x-X_n)'\end{pmatrix}, \qquad W = \mathrm{diag}\big(K_h(x-X_1),\dots,K_h(x-X_n)\big),$$
and $e_1$ an $(r+1)\times 1$ vector with 1 in the first position and zeros elsewhere, we have
$$\hat g(x) = e_1'(\tilde X'W\tilde X)^{-1}\tilde X'WY.$$
This estimator depends on $x$ both through the weights $K_h(x-X_i)$ and through the regressors $x-X_i$. It is a locally linear fit of the data: it runs a regression with weights that are smaller for observations that are farther from $x$. In contrast, the kernel regression estimator solves this same minimization problem but with $\beta$ constrained to be zero, i.e., kernel regression minimizes
$$\sum_i(Y_i-g)^2K_h(x-X_i).$$
Removing the constraint $\beta = 0$ leads to lower bias without increasing the variance when $g_0(x)$ is twice differentiable. It is also of interest to note that $\hat\beta$ from the above minimization problem estimates the gradient $\partial g_0(x)/\partial x$. Like kernel regression, this estimator can be interpreted as a weighted average of the $Y_i$ observations, though the weights are a bit more complicated. Let
$$S_0 = \sum_i K_h(x-X_i), \quad S_1 = \sum_i K_h(x-X_i)(x-X_i), \quad S_2 = \sum_i K_h(x-X_i)(x-X_i)(x-X_i)',$$
$$\hat m_0 = \sum_i K_h(x-X_i)Y_i, \qquad \hat m_1 = \sum_i K_h(x-X_i)(x-X_i)Y_i.$$
Then, by the usual partitioned inverse formula,
$$\hat g(x) = e_1'\begin{pmatrix}S_0 & S_1'\\ S_1 & S_2\end{pmatrix}^{-1}\begin{pmatrix}\hat m_0\\ \hat m_1\end{pmatrix} = (S_0 - S_1'S_2^{-1}S_1)^{-1}(\hat m_0 - S_1'S_2^{-1}\hat m_1) = \frac{\sum_i a_iY_i}{\sum_i a_i}, \qquad a_i = K_h(x-X_i)\big[1 - S_1'S_2^{-1}(x-X_i)\big].$$
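A minimal sketch (not part of the original notes) of the locally linear estimator for scalar $x$ with a Gaussian kernel and simulated data; the intercept of each local weighted regression is $\hat g(x)$ and the slope estimates $g_0'(x)$.

```python
# Locally linear regression: weighted LS of Y on (1, x - X_i) with weights K_h(x - X_i).
import numpy as np

def locally_linear(x_eval, X, Y, h):
    out = []
    for x in np.atleast_1d(x_eval):
        w = np.exp(-0.5 * ((x - X) / h)**2) / h       # Gaussian K_h(x - X_i)
        Xt = np.column_stack([np.ones_like(X), x - X])
        A = Xt.T @ (w[:, None] * Xt)                  # X~'W X~
        b = Xt.T @ (w * Y)                            # X~'W Y
        out.append(np.linalg.solve(A, b)[0])          # intercept = g_hat(x)
    return np.array(out)

X = np.random.uniform(0, 1, 400)
Y = np.sin(2 * np.pi * X) + 0.3 * np.random.normal(size=400)
print(locally_linear([0.1, 0.5, 0.9], X, Y, h=0.05))
```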

It is straightforward, though a little involved, to find asymptotic approximations to the MSE. For simplicity we do this for the scalar $x$ case. Note that for $g_0 = (g_0(X_1),\dots,g_0(X_n))'$,
$$\hat g(x) - g_0(x) = e_1'(\tilde X'W\tilde X)^{-1}\tilde X'W(Y-g_0) + e_1'(\tilde X'W\tilde X)^{-1}\tilde X'Wg_0 - g_0(x).$$
Then for $\Sigma = \mathrm{diag}(\sigma^2(X_1),\dots,\sigma^2(X_n))$,
$$E\big[\{\hat g(x)-g_0(x)\}^2\mid X_1,\dots,X_n\big] = e_1'(\tilde X'W\tilde X)^{-1}\tilde X'W\Sigma W\tilde X(\tilde X'W\tilde X)^{-1}e_1 + \big\{e_1'(\tilde X'W\tilde X)^{-1}\tilde X'Wg_0 - g_0(x)\big\}^2.$$
An asymptotic approximation to the MSE is obtained by taking the limit as $n$ grows. Note that we have
$$n^{-1}h^{-j}S_j = n^{-1}\sum_{i=1}^n K_h(x-X_i)[(x-X_i)/h]^j.$$
Then, by the change of variables $u = (x-X_i)/h$,
$$E[n^{-1}h^{-j}S_j] = E\big[K_h(x-X_i)\{(x-X_i)/h\}^j\big] = \int K(u)u^jf_0(x-hu)\,du = \mu_jf_0(x) + o(1)$$
for $\mu_j = \int K(u)u^j\,du$ and $h \to 0$. Also,
$$\mathrm{Var}(n^{-1}h^{-j}S_j) \le n^{-1}E\big[K_h(x-X_i)^2\{(x-X_i)/h\}^{2j}\big] \le n^{-1}h^{-1}\int K(u)^2u^{2j}f_0(x-hu)\,du \le C/(nh) \to 0$$
for $nh \to \infty$. Therefore, for $h \to 0$ and $nh \to \infty$,
$$n^{-1}h^{-j}S_j = \mu_jf_0(x) + o_p(1).$$
Now let $H = \mathrm{diag}(1,h)$. Then by $\mu_0 = 1$ and $\mu_1 = 0$ we have
$$n^{-1}H^{-1}\tilde X'W\tilde XH^{-1} = \begin{pmatrix}n^{-1}S_0 & n^{-1}h^{-1}S_1\\ n^{-1}h^{-1}S_1 & n^{-1}h^{-2}S_2\end{pmatrix} = f_0(x)\begin{pmatrix}1 & 0\\ 0 & \mu_2\end{pmatrix} + o_p(1).$$
Next let $\nu_j = \int K(u)^2u^j\,du$. Then by a similar argument we have
$$n^{-1}h\sum_{i=1}^n K_h(x-X_i)^2[(x-X_i)/h]^j\sigma^2(X_i) = \nu_jf_0(x)\sigma^2(x) + o_p(1).$$
It follows by $\nu_1 = 0$ that
$$n^{-1}hH^{-1}\tilde X'W\Sigma W\tilde XH^{-1} = f_0(x)\sigma^2(x)\begin{pmatrix}\nu_0 & 0\\ 0 & \nu_2\end{pmatrix} + o_p(1).$$
Then we have, for the variance term, by $H^{-1}e_1 = e_1$,
$$e_1'(\tilde X'W\tilde X)^{-1}\tilde X'W\Sigma W\tilde X(\tilde X'W\tilde X)^{-1}e_1 = n^{-1}h^{-1}e_1'\Big(\frac{H^{-1}\tilde X'W\tilde XH^{-1}}{n}\Big)^{-1}\Big(\frac{hH^{-1}\tilde X'W\Sigma W\tilde XH^{-1}}{n}\Big)\Big(\frac{H^{-1}\tilde X'W\tilde XH^{-1}}{n}\Big)^{-1}e_1$$
$$= n^{-1}h^{-1}e_1'\begin{pmatrix}1 & 0\\ 0 & \mu_2\end{pmatrix}^{-1}\begin{pmatrix}\nu_0 & \nu_1\\ \nu_1 & \nu_2\end{pmatrix}\begin{pmatrix}1 & 0\\ 0 & \mu_2\end{pmatrix}^{-1}e_1\,\frac{\sigma^2(x)}{f_0(x)} + o_p(n^{-1}h^{-1}).$$

It then follows that
$$e_1'(\tilde X'W\tilde X)^{-1}\tilde X'W\Sigma W\tilde X(\tilde X'W\tilde X)^{-1}e_1 = n^{-1}h^{-1}\Big(\nu_0\frac{\sigma^2(x)}{f_0(x)} + o_p(1)\Big).$$
For the bias, consider the expansion
$$g_0(X_i) = g_0(x) + g_0'(x)(X_i-x) + \tfrac{1}{2}g_0''(x)(X_i-x)^2 + \tfrac{1}{6}g_0'''(\bar X_i)(X_i-x)^3.$$
Let $r_i = g_0(X_i) - g_0(x) - g_0'(x)(X_i-x)$ and $r = (r_1,\dots,r_n)'$. Then by the form of $\tilde X$ we have
$$g_0 = (g_0(X_1),\dots,g_0(X_n))' = g_0(x)\tilde Xe_1 - g_0'(x)\tilde Xe_2 + r.$$
It follows by $e_1'(\tilde X'W\tilde X)^{-1}\tilde X'W\tilde Xe_1 = 1$ and $e_1'(\tilde X'W\tilde X)^{-1}\tilde X'W\tilde Xe_2 = e_1'e_2 = 0$ that the bias term is
$$e_1'(\tilde X'W\tilde X)^{-1}\tilde X'Wg_0 - g_0(x) = e_1'(\tilde X'W\tilde X)^{-1}\tilde X'Wr.$$
Recall that $n^{-1}h^{-j}S_j = \mu_jf_0(x) + o_p(1)$. Therefore, by $\mu_3 = 0$,
$$n^{-1}h^{-2}H^{-1}\tilde X'W\big((x-X_1)^2,\dots,(x-X_n)^2\big)'\tfrac{1}{2}g_0''(x) = \begin{pmatrix}n^{-1}h^{-2}S_2\\ n^{-1}h^{-3}S_3\end{pmatrix}\tfrac{1}{2}g_0''(x) = f_0(x)\begin{pmatrix}\mu_2\\ 0\end{pmatrix}\tfrac{1}{2}g_0''(x) + o_p(1).$$
Assuming that $g_0'''(\bar X_i)$ is bounded,
$$\Big|n^{-1}h^{-2}H^{-1}\tilde X'W\big((x-X_1)^3g_0'''(\bar X_1),\dots,(x-X_n)^3g_0'''(\bar X_n)\big)'\Big| \le C\max\Big\{n^{-1}h^{-2}\sum_i K_h(x-X_i)|x-X_i|^3,\; n^{-1}h^{-3}S_4\Big\} \to_p 0.$$
Therefore, we have
$$e_1'(\tilde X'W\tilde X)^{-1}\tilde X'Wr = h^2e_1'\Big(\frac{H^{-1}\tilde X'W\tilde XH^{-1}}{n}\Big)^{-1}\big(n^{-1}h^{-2}H^{-1}\tilde X'Wr\big) = \frac{h^2}{2}g_0''(x)e_1'\begin{pmatrix}1 & 0\\ 0 & \mu_2\end{pmatrix}^{-1}\begin{pmatrix}\mu_2\\ 0\end{pmatrix} + o_p(h^2) = \frac{h^2}{2}\big[g_0''(x)\mu_2 + o_p(1)\big].$$

Exercise: Apply an analogous calculation to show that the kernel regression bias is
$$\frac{h^2\mu_2}{2}\Big[g_0''(x) + 2g_0'(x)\frac{f_0'(x)}{f_0(x)}\Big].$$
Notice that the locally linear bias derived above is zero if the function is linear. Combining the bias and variance expressions, we have the following form for the asymptotic MSE of the locally linear estimator:
$$\frac{1}{nh}\nu_0\frac{\sigma^2(x)}{f_0(x)} + \frac{h^4}{4}g_0''(x)^2\mu_2^2.$$
In contrast, the kernel regression MSE is
$$\frac{1}{nh}\nu_0\frac{\sigma^2(x)}{f_0(x)} + \frac{h^4}{4}\Big[g_0''(x) + 2g_0'(x)\frac{f_0'(x)}{f_0(x)}\Big]^2\mu_2^2.$$
The kernel bias will be much bigger near the boundary of the support, where $f_0'(x)/f_0(x)$ is large. For example, if $f_0(x)$ is approximately $x^\alpha$ for $x > 0$ near zero, then $f_0'(x)/f_0(x)$ grows like $1/x$ as $x$ gets close to zero. Thus, locally linear estimation has smaller boundary bias. Also, locally linear estimation has no bias if $g_0(x)$ is linear, but kernel regression obviously does. A simple bandwidth choice method is to minimize an estimate of the asymptotic MSE; one could use a plug-in method to minimize the asymptotic MSE integrated over $\omega(x)f_0(x)\,dx$ for some weight $\omega(x)$.

Reducing the Curse of Dimensionality

Idea: restrict the form of the regression so that it only depends on low-dimensional components. An additive model has the regression additive in lower-dimensional components. An index model has the regression depending only on a linear combination. For the additive model, with $X = (X_1,\dots,X_r)'$,
$$E[Y|X] = \sum_{j=1}^r g_{j0}(X_j),$$
and a one-dimensional convergence rate is attainable. A series estimator is simplest: restrict the approximating functions to depend on only one component each. For scalar $u$ and approximating functions $p_{\ell L}(u)$, $\ell = 1,\dots,L$, let $p^L(u) = (p_{1L}(u),\dots,p_{LL}(u))'$, $K = Lr + 1$, and $p^K(x) = (1, p^L(x_1)',\dots,p^L(x_r)')'$. Regress $Y_i$ on $p^K(X_i)$. For $\hat\beta_0$ the constant and $\hat\beta_j$, $(j = 1,\dots,r)$, the coefficient vector on $p^L(x_j)$,
$$\hat g(x) = \hat\beta_0 + \sum_{j=1}^r p^L(x_j)'\hat\beta_j.$$
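A minimal sketch (not part of the original notes) of the additive series fit with a power basis in each component; the data, $r = 3$, and $L = 4$ are made up.

```python
# Additive series regression: the design is a constant plus a univariate basis in each
# component of X separately, so K = L*r + 1.
import numpy as np

def additive_series_fit(x_eval, X, Y, L):
    def design(Xmat):
        cols = [np.ones(Xmat.shape[0])]
        for j in range(Xmat.shape[1]):                # one univariate basis per component
            cols += [Xmat[:, j]**p for p in range(1, L + 1)]
        return np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(design(X), Y, rcond=None)
    return design(np.atleast_2d(x_eval)) @ beta

n, r = 1000, 3
X = np.random.uniform(-1, 1, size=(n, r))
Y = X[:, 0]**2 + np.sin(X[:, 1]) + X[:, 2] + 0.2 * np.random.normal(size=n)
print(additive_series_fit([[0.5, 0.5, 0.5]], X, Y, L=4))
```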

Proposition 4: If Assumption A is satisfied with $g_0(X)$ equal to each component $g_{j0}(X_j)$ and $p^K(X)$ replaced by $(1, p^L(X_j)')'$, where the constants $C,\gamma$ do not depend on $j$, and $\mathrm{Var}(Y|X) \le C$, then
$$E[\{\hat g_K(X_i) - g_0(X_i)\}^2] \le C/n + r\big[CL/n + rC^2(L+1)^{-2\gamma}\big].$$

Proof: Let $\tilde p^L(u) = (1, p^L(u)')'$. By Assumption A, there are $C$ and $\gamma$ such that for each $j$ and $L$ there is an $(L+1)\times 1$ vector $\tilde\beta_{jL} = (\beta_{1jL}, \beta_{jL}')'$ with
$$E[\{g_{j0}(X_j) - \tilde p^L(X_j)'\tilde\beta_{jL}\}^2] \le C^2(L+1)^{-2\gamma}.$$
Let $\tilde\beta = \big(\sum_{j=1}^r\beta_{1jL}, \beta_{1L}',\dots,\beta_{rL}'\big)'$, so that by the triangle inequality in $L^2$,
$$E[\{g_0(X) - p^K(X)'\tilde\beta\}^2] = E\Big[\Big\{\sum_{j=1}^r\big(g_{j0}(X_j) - \tilde p^L(X_j)'\tilde\beta_{jL}\big)\Big\}^2\Big] \le r^2C^2(L+1)^{-2\gamma}.$$
Then as in the proof of Proposition 3 we have
$$E[\{\hat g_K(X) - g_0(X)\}^2] \le CK/n + r^2C^2(L+1)^{-2\gamma} = C/n + r\big[CL/n + rC^2(L+1)^{-2\gamma}\big]. \quad \text{Q.E.D.}$$

The convergence rate in terms of $L$ and $n$ does not depend on $r$, although $r$ does affect the constant. Here $r$ could even grow with the sample size at some power of $n$. The additivity condition is satisfied in Hausman and Newey (1995).

Semiparametric Models

Data: $z_1, z_2,\dots$ i.i.d. Model: $\mathcal F$, a set of pdfs. Correct specification: the pdf $f_0$ of $z_i$ is in $\mathcal F$. Semiparametric model: $\mathcal F$ has a parametric component $\theta$ and nonparametric components.

Example: Linear model $E[Y|X] = X'\beta_0$; the parametric component is $\beta$, everything else is nonparametric.

Example: Probit, $Y \in \{0,1\}$, $\Pr(Y = 1|X) = \Phi(X'\beta_0)$; $\beta$ is the parametric component, the nonparametric component is the distribution of $X$.

Binary Choice with Unknown Disturbance Distribution: $z = (Y,X)$, $v(x,\beta)$ a known function. The model is
$$Y = 1(Y^* > 0), \qquad Y^* = v(X,\beta_0) - \varepsilon, \qquad \varepsilon \text{ independent of } X,$$
which implies $\Pr(Y = 1|X) = G(v(X,\beta_0))$, where $G$ is the CDF of $\varepsilon$. The parameter is $\beta$; everything else, including $G(u)$, is nonparametric. The $v(x,\beta)$ notation allows location and scale normalizations, e.g. $v(x,\beta) = x_1 + x_2'\beta$, $x = (x_1, x_2')'$, $x_1$ a scalar.

Censored Regression with Unknown Disturbance Distribution: $z = (Y,X)$,
$$Y = \max\{0, Y^*\}, \qquad Y^* = X'\beta_0 + \varepsilon, \qquad \varepsilon \text{ independent of } X.$$
The parameter is $\beta$; everything else, including the distribution of $\varepsilon$, is nonparametric.

Binary choice and censored regression are limited dependent variable models. Semiparametric models are important here because misspecifying the distribution of the disturbances leads to inconsistency of the MLE.

Partially Linear Regression: $z = (Y,X,W)$,
$$E[Y|X,W] = X'\beta_0 + g_0(W).$$
The parameter is $\beta$; everything else is nonparametric, including the additive component of the regression. This can help with the curse of dimensionality, with the covariates $X$ entering parametrically. In Hausman and Newey (1995), $W$ is log income and log price, and $X$ includes about 20 time and location dummies. $X$ may be the variable of interest and $g_0(\cdot)$ may involve some covariates, e.g. sample selection.

Index Regression: $z = (Y,X)$, $v(x,\beta)$ a known function,
$$E[Y|X] = \tau(v(X,\beta_0)),$$
where the function $\tau(\cdot)$ is unknown. The binary choice model has $E[Y|X] = \Pr(Y = 1|X) = \tau(v(X,\beta_0))$ with $\tau(\cdot) = G(\cdot)$. If we allow the conditional distribution of $\varepsilon$ given $X$ to depend (only) on $v(X,\beta_0)$, then the binary choice model becomes an index model.

Semiparametric Estimators

Estimators of $\beta_0$. Two kinds: those that do and those that do not require nonparametric estimation. The choice is really model specific, but it is beyond our scope to say why. One general kind of estimator is
$$\hat\beta = \arg\min_{\beta\in B}\sum_i q(z_i,\beta)/n, \qquad \beta_0 = \arg\min_{\beta\in B}E[q(z_i,\beta)],$$

where $B$ is the set of parameter values. This is an extremum estimator. There are clever choices of $q(z,\beta)$ in some semiparametric models.

Conditional median estimators use two facts:

Fact 1: The median of a monotonic transformation is the transformation of the median.

Fact 2: The median minimizes the expected absolute deviation, i.e. $\mathrm{med}(Y|X)$ minimizes $E[|Y - m(X)|]$ over functions $m(\cdot)$ of $X$.

Binary Choice: If $v(X,\beta)$ includes a constant and $\varepsilon$ has zero median, then $\mathrm{med}(Y^*|X) = v(X,\beta_0)$. $1(y > 0)$ is a monotonic transformation, so Fact 1 implies
$$\mathrm{med}(Y|X) = 1(\mathrm{med}(Y^*|X) > 0) = 1(v(X,\beta_0) > 0).$$
By Fact 2, $\beta_0$ minimizes $E[|Y - 1(v(X,\beta) > 0)|]$, so
$$\hat\beta = \arg\min_\beta\sum_i|Y_i - 1(v(X_i,\beta) > 0)|.$$
This is the maximum score estimator of Manski (1977). It only requires $\mathrm{med}(Y^*|X) = v(X,\beta_0)$, so it allows for heteroskedasticity.

Censored Regression: If $X'\beta$ includes a constant and $\varepsilon$ has median zero, then $\mathrm{med}(Y^*|X) = X'\beta_0$. $\max\{0,y\}$ is a monotonic transformation, so Fact 1 says $\mathrm{med}(Y|X) = \max\{0, X'\beta_0\}$. By Fact 2, $\beta_0$ minimizes $E[|Y - \max\{0,X'\beta\}|]$, so
$$\hat\beta = \arg\min_\beta\sum_i|Y_i - \max\{0, X_i'\beta\}|.$$
This is the censored least absolute deviations estimator of Powell (1984). It only requires $\mathrm{med}(Y^*|X) = X'\beta_0$ and allows for heteroskedasticity.

Generalize: $\mathrm{med}(Y^*|X) = v(X,\beta_0)$ and $Y = T(Y^*)$ for a monotonic transformation $T(y)$. By Fact 1, $\mathrm{med}(Y|X) = T(v(X,\beta_0))$. Use Fact 2 to form
$$\hat\beta = \arg\min_\beta\sum_i|Y_i - T(v(X_i,\beta))|.$$
Global minimization, rather than solving first-order conditions, is important. For maximum score there are no first-order conditions. For censored LAD the first-order conditions are zero whenever $X_i'\beta < 0$ for all $i = 1,\dots,n$. The approach provides estimates of parameters and conditional median predictions, not conditional means. It generalizes to conditional quantiles.
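A minimal sketch (not part of the original notes) of the censored LAD objective, minimized here by a derivative-free search (SciPy is assumed available; a grid or global search would be more faithful to the global-minimization point made above). The design is simulated.

```python
# Censored LAD: minimize sum_i |Y_i - max(0, X_i'beta)| over beta.
import numpy as np
from scipy.optimize import minimize

def clad_objective(beta, X, Y):
    return np.abs(Y - np.maximum(0.0, X @ beta)).sum()

n = 500
X = np.column_stack([np.ones(n), np.random.normal(size=n)])
beta0 = np.array([0.5, 1.0])
eps = np.random.standard_t(df=3, size=n)              # non-Gaussian, median-zero errors
Y = np.maximum(0.0, X @ beta0 + eps)                  # censoring at zero
res = minimize(clad_objective, x0=np.zeros(2), args=(X, Y), method="Nelder-Mead")
print(res.x)                                          # should be near (0.5, 1.0)
```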

Consistency and Asymptotic Normality of Minimization Estimators

A consistency result:

Proposition 5: If i) $E[q(z,\beta)]$ has a unique minimum at $\beta_0$; ii) $\beta_0 \in B$ and $B$ is compact; iii) $q(z,\beta)$ is continuous at each $\beta$ with probability one; iv) $E[\sup_{\beta\in B}|q(z_i,\beta)|] < \infty$; then
$$\hat\beta = \arg\min_{\beta\in B}\sum_i q(z_i,\beta)/n \to_p \beta_0.$$
This is well known. It allows $q(z,\beta)$ to be discontinuous. In the binary choice model above, assumption iii) is satisfied if $v(x,\beta) = x'\beta$ and $X_i$ includes a continuously distributed regressor with the corresponding component of $\beta$ bounded away from zero on $B$. All the conditions are straightforward to check.

An asymptotic normality result, Van der Vaart (1995):

Proposition 6: If $\hat\beta \to_p \beta_0$, $\beta_0$ is in the interior of $B$, and i) $E[q(z_i,\beta)]$ is twice differentiable at $\beta_0$ with nonsingular Hessian $H$; ii) there is $d(z)$ such that $E[d(z)^2]$ exists and for all $\tilde\beta,\beta\in B$, $|q(z,\tilde\beta) - q(z,\beta)| \le d(z)\|\tilde\beta - \beta\|$; iii) with probability one $q(z,\beta)$ is differentiable at $\beta_0$ with derivative $m(z)$; then
$$n^{1/2}(\hat\beta - \beta_0) \to_d N\big(0, H^{-1}E[m(z)m(z)']H^{-1}\big).$$
These conditions are straightforward to check for censored LAD. They do not hold for maximum score; instead $n^{1/3}(\hat\beta - \beta_0)$ converges in distribution (to a nonnormal limit).

Estimators with Nonparametric Components

Some models require the use of nonparametric estimators. These include the partially linear and index regressions. We discuss least squares estimation when there is a nonparametric component in the regression. The basic idea is to concentrate out the nonparametric component, to find a profile squared residual function, by substituting an estimator for the nonparametric component.

Partially linear model, as in Robinson (1988). We know $E[Y|X,W]$ minimizes $E[(Y - G(X,W))^2]$ over $G$, so that
$$(\beta_0, g_0(\cdot)) = \arg\min_{\beta, g(\cdot)}E[\{Y_i - X_i'\beta - g(W_i)\}^2].$$
Do the minimization in two steps: first solve for the minimum over $g$ for fixed $\beta$, substitute that minimum into the objective function, then minimize over $\beta$. The minimizer over $g$ for fixed $\beta$ is $E[Y_i - X_i'\beta|W_i] = E[Y_i|W_i] - E[X_i|W_i]'\beta$. Substituting,
$$\beta_0 = \arg\min_\beta E\big[\{Y_i - E[Y_i|W_i] - (X_i - E[X_i|W_i])'\beta\}^2\big].$$

Estimate using $\hat E[Y_i|W_i]$ and $\hat E[X_i|W_i]$ and replace the outside expectation by a sample average,
$$\hat\beta = \arg\min_\beta\sum_i\{Y_i - \hat E[Y_i|W_i] - (X_i - \hat E[X_i|W_i])'\beta\}^2/n,$$
i.e. least squares of $Y_i - \hat E[Y_i|W_i]$ on $X_i - \hat E[X_i|W_i]$. Kernel or series estimators are fine.

Index regression, as in Ichimura (1993). By $E[Y|X] = \tau_0(v(X,\beta_0))$,
$$(\beta_0, \tau_0(\cdot)) = \arg\min_{\beta,\tau(\cdot)}E[\{Y_i - \tau(v(X_i,\beta))\}^2].$$
Concentrating out $\tau$, $\tau(x,\beta) = E[Y|v(X,\beta) = v(x,\beta)]$. Letting $\hat\tau(X_i,\beta)$ be a nonparametric estimator of $E[Y|v(X,\beta)]$, the estimator is
$$\hat\beta = \arg\min_\beta\sum_i\{Y_i - \hat\tau(X_i,\beta)\}^2.$$
This generalizes to log-likelihood and other objective functions. Let $q(z,\beta,\eta)$ depend on a parametric component $\beta$ and a nonparametric component $\eta$. The true values minimize $E[q(z,\beta,\eta)]$. With an estimator $\hat\eta(\beta)$ of $\eta(\beta) = \arg\min_\eta E[q(z,\beta,\eta)]$,
$$\hat\beta = \arg\min_\beta\sum_i q(z_i,\beta,\hat\eta(\beta)).$$
The asymptotic theory is difficult because of the presence of the nonparametric estimator. It is known that $\hat\beta$ is often $n^{1/2}$-consistent and asymptotically normal, and that the estimation of $\eta(\beta)$ may not even affect the asymptotic distribution. Other estimators that depend on nonparametric estimators may have the nonparametric estimation affect the limiting distribution, e.g. the average derivative estimator of Stoker (1987) and Powell, Stock, and Stoker (1989). Joint maximization is possible but can be difficult because of the need to smooth; one cannot allow $\hat\eta(\beta)$ to be $n$-dimensional. An empirical example is Hausman and Newey (1995): the graphs there are actually those for $\hat g(w)$ from a partially linear model.

Example of the theory: a series estimator of the partially linear model. Let $p^K(w)$ be a $K\times 1$ vector of approximating functions, such as power series or splines. Also let $Y = (Y_1,\dots,Y_n)'$, $X = [X_1,\dots,X_n]'$, $P = [p^K(W_1),\dots,p^K(W_n)]'$, $Q = P(P'P)^{-}P'$, and let $\hat E[Y_i|W_i] = p^K(W_i)'(P'P)^{-}P'Y$ and $\hat E[X_i|W_i]' = p^K(W_i)'(P'P)^{-}P'X$ be series estimators. The residuals are $(I-Q)Y$ and $(I-Q)X$, respectively, so by $I-Q$ idempotent,
$$\hat\beta = \big(X'(I-Q)X\big)^{-1}X'(I-Q)Y.$$
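A minimal sketch (not part of the original notes) of this estimator for scalar $W$ with a power-series basis; the data and the choice $K = 6$ are made up.

```python
# Series estimator of the partially linear model: residualize Y and X on P, then regress.
import numpy as np

def partially_linear_beta(Y, X, W, K):
    P = np.column_stack([W**j for j in range(K)])     # power series in scalar W
    Q = P @ np.linalg.pinv(P.T @ P) @ P.T             # projection P(P'P)^- P'
    MY, MX = Y - Q @ Y, X - Q @ X                     # (I - Q)Y and (I - Q)X
    return np.linalg.solve(MX.T @ MX, MX.T @ MY)

n = 1000
W = np.random.uniform(0, 1, n)
X = np.column_stack([W + np.random.normal(size=n), np.random.normal(size=n)])
Y = X @ np.array([1.0, -0.5]) + np.sin(2 * np.pi * W) + 0.3 * np.random.normal(size=n)
print(partially_linear_beta(Y, X, W, K=6))            # roughly (1.0, -0.5)
```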

Here $\hat\beta$ is also the least squares regression of $Y$ on $X$ and $P$. A result on $n^{1/2}$-consistency and asymptotic normality:

Proposition 7: If i) $Y_i$ and $X_i$ have finite second moments; ii) $H = E[\mathrm{Var}(X_i|W_i)]$ is nonsingular; iii) $\mathrm{Var}(Y_i|X_i,W_i)$ and $\mathrm{Var}(X_i|W_i)$ are bounded; iv) there are $C$, $\gamma_g$, and $\gamma_x$ such that for every $K$ there are $\alpha_K$ and $\beta_K$ with $E[\{g_0(W_i) - p^K(W_i)'\alpha_K\}^2] \le CK^{-2\gamma_g}$ and $E[\|E[X_i|W_i] - p^K(W_i)'\beta_K\|^2] \le CK^{-2\gamma_x}$; v) $K/n \to 0$ and $n^{1/2}K^{-(\gamma_g+\gamma_x)} \to 0$; then for $\Sigma = E\big[\mathrm{Var}(Y_i|X_i,W_i)\{X_i - E[X_i|W_i]\}\{X_i - E[X_i|W_i]\}'\big]$,
$$n^{1/2}(\hat\beta - \beta_0) \to_d N(0, H^{-1}\Sigma H^{-1}).$$

Condition ii) is an identification condition that says essentially there is no perfect multicollinearity between $X_i$ and any function of $W_i$. Intuitively, if one or more of the $X_i$ variables were functions of $W_i$, then we could not separately identify $g_0(W_i)$ and the coefficients on those variables. Furthermore, we know that a necessary and sufficient condition for identification of $\beta$ from the least squares objective function where $g(\cdot)$ has been partialled out is that $X_i - E[X_i|W_i]$ have a nonsingular second moment matrix. By iterated expectations that second moment matrix is
$$E\big[(X_i - E[X_i|W_i])(X_i - E[X_i|W_i])'\big] = E\big[E[\{(X_i - E[X_i|W_i])(X_i - E[X_i|W_i])'\}|W_i]\big] = H.$$
Thus, condition ii) is the same as the usual identification condition for least squares after partialling out the nonparametric component.

The requirement $K/n \to 0$ is a small variance condition and $n^{1/2}K^{-(\gamma_g+\gamma_x)} \to 0$ a small bias condition. The bias here is of order $K^{-(\gamma_g+\gamma_x)}$, which is of smaller order than just the bias in approximating $g_0(w)$ (which is only $K^{-\gamma_g}$). Indeed, the order of the bias of $\hat\beta$ is the product of the biases from approximating $g_0(w)$ and from approximating $E[X|w]$. So one sufficient condition is that the bias in each of the nonparametric estimates vanish faster than $n^{-1/4}$. This faster-than-$n^{-1/4}$ condition is common to many semiparametric estimators. Some amount of smoothness is required for root-$n$ consistency: existence of $K$ satisfying the rate condition v) requires that $\gamma_g + \gamma_x > 1/2$, i.e. (with $\gamma = s/r$ as above) that the total number of derivatives exceeds $r/2$. An analogous smoothness requirement (or even a stronger one) is generally needed for root-$n$ consistency of any semiparametric estimator that requires estimation of a nonparametric component.

Proof of Proposition 7: For simplicity we give the proof when $X_i$ is a scalar. Let $M = I - Q$, $W = (W_1,\dots,W_n)'$, $\bar X = [E[X_1|W_1],\dots,E[X_n|W_n]]'$, $V = X - \bar X$, $g_0 = (g_0(W_1),\dots,g_0(W_n))'$, and $\varepsilon = Y - X\beta_0 - g_0$. Substituting $Y = X\beta_0 + g_0 + \varepsilon$ in the formula for $\hat\beta$, subtracting $\beta_0$, and multiplying by $n^{1/2}$ gives
$$n^{1/2}(\hat\beta - \beta_0) = (X'MX/n)^{-1}\big(X'Mg_0/n^{1/2} + X'M\varepsilon/n^{1/2}\big).$$
By the law of large numbers $V'V/n \to_p H$. Also, similarly to the proof of Proposition 3,
$$V'QV/n = O_p(K/n), \qquad \bar X'M\bar X/n = O_p(K^{-2\gamma_x}), \qquad g_0'Mg_0/n = O_p(K^{-2\gamma_g}).$$
Then $V'MV/n \to_p H$ and
$$|V'M\bar X/n| \le (V'MV/n)^{1/2}(\bar X'M\bar X/n)^{1/2} \to_p 0,$$
so that
$$X'MX/n = (\bar X + V)'M(\bar X + V)/n = \bar X'M\bar X/n + 2\bar X'MV/n + V'MV/n \to_p H.$$
Next, similarly to the proof of Proposition 3, we have $E[VV'|W] \le CI_n$ and $E[\varepsilon\varepsilon'|W,X] \le CI_n$. Then
$$E[\{V'Mg_0/n^{1/2}\}^2] = E[g_0'ME[VV'|W]Mg_0]/n \le CE[g_0'Mg_0]/n = O(K^{-2\gamma_g}) \to 0,$$
$$\{\bar X'Mg_0/n^{1/2}\}^2 \le n(\bar X'M\bar X/n)(g_0'Mg_0/n) = O_p(nK^{-2\gamma_x-2\gamma_g}) \to_p 0,$$

so that $X'Mg_0/n^{1/2} = \bar X'Mg_0/n^{1/2} + V'Mg_0/n^{1/2} \to_p 0$. Also,
$$E[\{\bar X'M\varepsilon/n^{1/2}\}^2] = E[\bar X'ME[\varepsilon\varepsilon'|X,W]M\bar X]/n \le CE[\bar X'M\bar X]/n = O(K^{-2\gamma_x}) \to 0,$$
$$E[\{V'Q\varepsilon/n^{1/2}\}^2] = E[V'QE[\varepsilon\varepsilon'|X,W]QV]/n \le CE[V'QV]/n = O(K/n) \to 0,$$
and by the central limit theorem, $V'\varepsilon/n^{1/2} \to_d N(0,\Sigma)$. Therefore,
$$X'M\varepsilon/n^{1/2} = \bar X'M\varepsilon/n^{1/2} + V'\varepsilon/n^{1/2} - V'Q\varepsilon/n^{1/2} \to_d N(0,\Sigma).$$
Then by the continuous mapping and Slutzky theorems it follows that
$$n^{1/2}(\hat\beta - \beta_0) = (H + o_p(1))^{-1}\big[V'\varepsilon/n^{1/2} + o_p(1)\big] \to_d H^{-1}N(0,\Sigma) = N(0, H^{-1}\Sigma H^{-1}). \quad \text{Q.E.D.}$$


More information

lecture 26: Richardson extrapolation

lecture 26: Richardson extrapolation 43 lecture 26: Ricardson extrapolation 35 Ricardson extrapolation, Romberg integration Trougout numerical analysis, one encounters procedures tat apply some simple approximation (eg, linear interpolation)

More information

Bandwidth Selection in Nonparametric Kernel Testing

Bandwidth Selection in Nonparametric Kernel Testing Te University of Adelaide Scool of Economics Researc Paper No. 2009-0 January 2009 Bandwidt Selection in Nonparametric ernel Testing Jiti Gao and Irene Gijbels Bandwidt Selection in Nonparametric ernel

More information

Local Orthogonal Polynomial Expansion (LOrPE) for Density Estimation

Local Orthogonal Polynomial Expansion (LOrPE) for Density Estimation Local Ortogonal Polynomial Expansion (LOrPE) for Density Estimation Alex Trindade Dept. of Matematics & Statistics, Texas Tec University Igor Volobouev, Texas Tec University (Pysics Dept.) D.P. Amali Dassanayake,

More information

5 Ordinary Differential Equations: Finite Difference Methods for Boundary Problems

5 Ordinary Differential Equations: Finite Difference Methods for Boundary Problems 5 Ordinary Differential Equations: Finite Difference Metods for Boundary Problems Read sections 10.1, 10.2, 10.4 Review questions 10.1 10.4, 10.8 10.9, 10.13 5.1 Introduction In te previous capters we

More information

Introduction to Derivatives

Introduction to Derivatives Introduction to Derivatives 5-Minute Review: Instantaneous Rates and Tangent Slope Recall te analogy tat we developed earlier First we saw tat te secant slope of te line troug te two points (a, f (a))

More information

Sin, Cos and All That

Sin, Cos and All That Sin, Cos and All Tat James K. Peterson Department of Biological Sciences and Department of Matematical Sciences Clemson University Marc 9, 2017 Outline Sin, Cos and all tat! A New Power Rule Derivatives

More information

3. THE EXCHANGE ECONOMY

3. THE EXCHANGE ECONOMY Essential Microeconomics -1-3. THE EXCHNGE ECONOMY Pareto efficient allocations 2 Edgewort box analysis 5 Market clearing prices 13 Walrasian Equilibrium 16 Equilibrium and Efficiency 22 First welfare

More information

= 0 and states ''hence there is a stationary point'' All aspects of the proof dx must be correct (c)

= 0 and states ''hence there is a stationary point'' All aspects of the proof dx must be correct (c) Paper 1: Pure Matematics 1 Mark Sceme 1(a) (i) (ii) d d y 3 1x 4x x M1 A1 d y dx 1.1b 1.1b 36x 48x A1ft 1.1b Substitutes x = into teir dx (3) 3 1 4 Sows d y 0 and states ''ence tere is a stationary point''

More information

Bootstrap prediction intervals for Markov processes

Bootstrap prediction intervals for Markov processes arxiv: arxiv:0000.0000 Bootstrap prediction intervals for Markov processes Li Pan and Dimitris N. Politis Li Pan Department of Matematics University of California San Diego La Jolla, CA 92093-0112, USA

More information

Volume 29, Issue 3. Existence of competitive equilibrium in economies with multi-member households

Volume 29, Issue 3. Existence of competitive equilibrium in economies with multi-member households Volume 29, Issue 3 Existence of competitive equilibrium in economies wit multi-member ouseolds Noriisa Sato Graduate Scool of Economics, Waseda University Abstract Tis paper focuses on te existence of

More information

Math Spring 2013 Solutions to Assignment # 3 Completion Date: Wednesday May 15, (1/z) 2 (1/z 1) 2 = lim

Math Spring 2013 Solutions to Assignment # 3 Completion Date: Wednesday May 15, (1/z) 2 (1/z 1) 2 = lim Mat 311 - Spring 013 Solutions to Assignment # 3 Completion Date: Wednesday May 15, 013 Question 1. [p 56, #10 (a)] 4z Use te teorem of Sec. 17 to sow tat z (z 1) = 4. We ave z 4z (z 1) = z 0 4 (1/z) (1/z

More information

Solutions to the Multivariable Calculus and Linear Algebra problems on the Comprehensive Examination of January 31, 2014

Solutions to the Multivariable Calculus and Linear Algebra problems on the Comprehensive Examination of January 31, 2014 Solutions to te Multivariable Calculus and Linear Algebra problems on te Compreensive Examination of January 3, 24 Tere are 9 problems ( points eac, totaling 9 points) on tis portion of te examination.

More information

A Goodness-of-fit test for GARCH innovation density. Hira L. Koul 1 and Nao Mimoto Michigan State University. Abstract

A Goodness-of-fit test for GARCH innovation density. Hira L. Koul 1 and Nao Mimoto Michigan State University. Abstract A Goodness-of-fit test for GARCH innovation density Hira L. Koul and Nao Mimoto Micigan State University Abstract We prove asymptotic normality of a suitably standardized integrated square difference between

More information

2.11 That s So Derivative

2.11 That s So Derivative 2.11 Tat s So Derivative Introduction to Differential Calculus Just as one defines instantaneous velocity in terms of average velocity, we now define te instantaneous rate of cange of a function at a point

More information

1 Introduction to Optimization

1 Introduction to Optimization Unconstrained Convex Optimization 2 1 Introduction to Optimization Given a general optimization problem of te form min x f(x) (1.1) were f : R n R. Sometimes te problem as constraints (we are only interested

More information

Solution. Solution. f (x) = (cos x)2 cos(2x) 2 sin(2x) 2 cos x ( sin x) (cos x) 4. f (π/4) = ( 2/2) ( 2/2) ( 2/2) ( 2/2) 4.

Solution. Solution. f (x) = (cos x)2 cos(2x) 2 sin(2x) 2 cos x ( sin x) (cos x) 4. f (π/4) = ( 2/2) ( 2/2) ( 2/2) ( 2/2) 4. December 09, 20 Calculus PracticeTest s Name: (4 points) Find te absolute extrema of f(x) = x 3 0 on te interval [0, 4] Te derivative of f(x) is f (x) = 3x 2, wic is zero only at x = 0 Tus we only need

More information

Exponential and logarithmic functions (pp ) () Supplement October 14, / 1. a and b positive real numbers and x and y real numbers.

Exponential and logarithmic functions (pp ) () Supplement October 14, / 1. a and b positive real numbers and x and y real numbers. MA123, Supplement Exponential and logaritmic functions pp. 315-319) Capter s Goal: Review te properties of exponential and logaritmic functions. Learn ow to differentiate exponential and logaritmic functions.

More information

AMS 147 Computational Methods and Applications Lecture 09 Copyright by Hongyun Wang, UCSC. Exact value. Effect of round-off error.

AMS 147 Computational Methods and Applications Lecture 09 Copyright by Hongyun Wang, UCSC. Exact value. Effect of round-off error. Lecture 09 Copyrigt by Hongyun Wang, UCSC Recap: Te total error in numerical differentiation fl( f ( x + fl( f ( x E T ( = f ( x Numerical result from a computer Exact value = e + f x+ Discretization error

More information