Nonparametric Regression. Härdle, Müller, Sperlich, Werwatz, 1995, Nonparametric and Semiparametric Models: An Introduction
1 Härdle, Müller, Sperlich, Werwatz, 1995, Nonparametric and Semiparametric Models: An Introduction. Tine Buch-Kromann
2 Univariate Kernel Regression
The relationship between two variables X and Y: Y = m(X), where m(·) is a function.
Model: y_i = m(x_i) + ε_i, i = 1, ..., n,  (1)
E(Y | X = x) = m(x).  (2)
(1): Y = m(X) does not need to hold exactly for the i-th observation; there is an error term ε.
(2): The relationship holds on average.
3 Univariate Kernel Regression
Goal: Estimate m(·) on the basis of the observations (x_i, y_i), i = 1, ..., n.
In a parametric approach: assume m(x) = α + βx and then estimate α and β.
In the nonparametric approach: no prior restrictions on m(·).
4 Conditional Expectation (repetition)
Let X and Y be two random variables with joint pdf f(x, y). Then the conditional expectation of Y given X = x is
E(Y | X = x) = ∫ y f(y | x) dy = ∫ y f(x, y)/f_X(x) dy = m(x)
Note: m(x) might be quite nonlinear.
5 Example
Consider the joint pdf f(x, y) = x + y for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1, with f(x, y) = 0 elsewhere, and marginal pdf
f_X(x) = ∫₀¹ f(x, y) dy = x + 1/2 for 0 ≤ x ≤ 1, with f_X(x) = 0 elsewhere.
Then
E(Y | X = x) = ∫₀¹ y f(x, y)/f_X(x) dy = ∫₀¹ y (x + y)/(x + 1/2) dy = (x/2 + 1/3)/(x + 1/2) = m(x),
which is nonlinear.
6 Design
Random design: observations (X_i, Y_i), i = 1, ..., n from a bivariate distribution f(x, y). The density f_X(x) is unknown.
Fixed design: we control the predictor variable X, so Y is the only random variable (e.g. the beer example). The density f_X(x) is known.
7 Kernel Regression
Random design. The Nadaraya-Watson estimator:
m̂(x) = [n⁻¹ Σ_{i=1}^n K_h(x − X_i) Y_i] / [n⁻¹ Σ_{j=1}^n K_h(x − X_j)]
Rewrite the Nadaraya-Watson estimator:
m̂(x) = (1/n) Σ_{i=1}^n [K_h(x − X_i) / (n⁻¹ Σ_{j=1}^n K_h(x − X_j))] Y_i = (1/n) Σ_{i=1}^n W_hi(x) Y_i
A weighted (local) average of the Y_i (note: (1/n) Σ_{i=1}^n W_hi(x) = 1). The bandwidth h determines the degree of smoothness.
What happens if the denominator of W_hi(x) is equal to 0? Then the numerator is also equal to 0, and the estimator is not defined. This happens in regions of sparse data.
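To make the estimator concrete, here is a minimal NumPy sketch of the Nadaraya-Watson estimator, assuming a Gaussian kernel (the slides leave K unspecified); the function name and the guard against an empty denominator are our own:

```python
import numpy as np

def nadaraya_watson(x_grid, X, Y, h):
    """Nadaraya-Watson estimate of m(x) at each point of x_grid (Gaussian kernel)."""
    u = (x_grid[:, None] - X[None, :]) / h
    K = np.exp(-0.5 * u**2) / (np.sqrt(2.0 * np.pi) * h)   # K_h(x - X_i)
    num = K @ Y             # proportional to n^{-1} sum_i K_h(x - X_i) Y_i
    den = K.sum(axis=1)     # proportional to n^{-1} sum_j K_h(x - X_j)
    # In regions of sparse data the denominator can underflow to 0: flag as undefined.
    return np.where(den > 0, num / np.maximum(den, 1e-300), np.nan)
```

For example, m̂ on a grid is obtained as nadaraya_watson(np.linspace(0, 1, 100), X, Y, h=0.2).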
8 Kernel Regression [figure]
9 Kernel Regression
Fixed design. f_X(x) is known. Weights of the form
W_hi^{FD}(x) = K_h(x − x_i)/f_X(x)
Simpler structure and therefore easier to analyse.
One particular fixed design kernel regression estimator: Gasser-Müller. For the case of ordered design points x_(i), i = 1, ..., n from [a, b]:
W_hi^{GM}(x) = n ∫_{s_{i−1}}^{s_i} K_h(x − u) du
where s_i = (x_(i) + x_(i+1))/2, s_0 = a, s_n = b. Note that (1/n) Σ_i W_hi^{GM}(x) = 1.
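For the Gasser-Müller weights, the integral of K_h over [s_{i−1}, s_i] has a closed form when K is the Gaussian density, since ∫_{s_{i−1}}^{s_i} K_h(x − u) du = Φ((x − s_{i−1})/h) − Φ((x − s_i)/h). A sketch under that assumption (function name illustrative):

```python
import numpy as np
from scipy.special import ndtr  # standard normal CDF Phi

def gasser_mueller(x, x_sorted, y_sorted, h, a, b):
    """Gasser-Mueller estimate at x for ordered design points in [a, b] (Gaussian kernel)."""
    n = len(x_sorted)
    # Midpoints s_i = (x_(i) + x_(i+1))/2, with s_0 = a and s_n = b.
    s = np.concatenate(([a], (x_sorted[:-1] + x_sorted[1:]) / 2.0, [b]))
    # Integral of K_h(x - u) over [s_{i-1}, s_i]:
    integrals = ndtr((x - s[:-1]) / h) - ndtr((x - s[1:]) / h)
    w = n * integrals                     # W_hi^GM(x)
    return np.mean(w * y_sorted)          # (1/n) sum_i W_hi^GM(x) Y_i
```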
10 Statistical Properties
Is the kernel regression estimator consistent?
Theorem 4.1: Consistency (Nadaraya-Watson)
Assume the univariate random design model and the regularity conditions: ∫ |K(u)| du < ∞, u K(u) → 0 for |u| → ∞, E Y² < ∞. Suppose also h → 0, nh → ∞. Then
(1/n) Σ_{i=1}^n W_hi(x) Y_i = m̂_h(x) →^P m(x)
for every x with f_X(x) > 0 at which m(x), f_X(x), and σ²(x) = V(Y | X = x) are continuous.
Proof idea: Show that the numerator and the denominator of m̂_h(x) converge. Then m̂_h(x) converges (Slutsky's theorem).
11 Statistical Properties
What is the speed of the convergence?
Theorem 4.2: Speed of convergence (Gasser-Müller)
Assume the univariate fixed design model and the conditions: K(·) has support [−1, 1] with K(−1) = K(1) = 0, m(·) is twice continuously differentiable, max_i |x_i − x_{i−1}| = O(n⁻¹). Assume V(ε_i) = σ², i = 1, ..., n. Then, under h → 0, nh → ∞, it holds that
MSE{(1/n) Σ_{i=1}^n W_hi^{GM}(x) Y_i} ≈ (1/(nh)) σ² ‖K‖₂² + (h⁴/4) μ₂²(K) {m″(x)}²
Note: AMSE = variance + bias², i.e. the usual bias-variance trade-off.
12 Statistical Properties
What is the speed of the convergence in the random design?
Theorem 4.3: Speed of convergence (Nadaraya-Watson)
Assume the univariate random design model and the conditions ∫ |K(u)| du < ∞, u K(u) → 0 for |u| → ∞, E Y² < ∞. Suppose also h → 0, nh → ∞. Then
MSE(m̂_h(x)) ≈ (1/(nh)) (σ²(x)/f_X(x)) ‖K‖₂² + (h⁴/4) μ₂²(K) {m″(x) + 2 m′(x) f_X′(x)/f_X(x)}²
for every x with f_X(x) > 0 at which m′(x), m″(x), f_X(x), f_X′(x), and σ²(x) = V(Y | X = x) are continuous.
Proof idea: Rewrite the Nadaraya-Watson estimator as m̂_h(x) = r̂_h(x)/f̂_h(x).
13 Statistical Properties
Asymptotic MSE: AMSE(n, h) = (1/(nh)) C₁ + h⁴ C₂
Minimizing with respect to h gives the optimal bandwidth h_opt ∼ n^{−1/5}.
The rate of convergence of the AMSE is of order O(n^{−4/5}): slower than the rate obtained by LS estimation in linear regression, but the same as in nonparametric density estimation.
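The n^{−1/5} rate follows from a one-line calculus step that the slide skips; in LaTeX:

```latex
\frac{\partial}{\partial h}\operatorname{AMSE}(n,h)
  = -\frac{C_1}{n h^2} + 4 C_2 h^3 = 0
\;\Longrightarrow\;
h_{\mathrm{opt}} = \Big(\frac{C_1}{4 C_2}\Big)^{1/5} n^{-1/5},
\qquad
\operatorname{AMSE}(n, h_{\mathrm{opt}}) = O\big(n^{-4/5}\big).
```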
14 Local Polynomial Regression
The Nadaraya-Watson estimator is a special case of local polynomial regression: a local constant least squares fit.
To motivate the local polynomial fit, look at the Taylor expansion of the unknown conditional expectation function:
m(t) ≈ m(x) + m′(x)(t − x) + ... + m^{(p)}(x)(t − x)^p / p!
for t in a neighborhood of x.
Idea: Fit a polynomial in a neighborhood of x (called local polynomial regression):
min_β Σ_{i=1}^n {Y_i − β₀ − β₁(X_i − x) − ... − β_p(X_i − x)^p}² K_h(x − X_i)
(the term in braces is the polynomial; the kernel makes the fit local), where β = (β₀, β₁, ..., β_p)ᵀ.
15 Local Polynomial Regression
The result is a weighted least squares estimator
β̂(x) = (XᵀWX)⁻¹ XᵀWY
where X has rows (1, X_i − x, ..., (X_i − x)^p), i = 1, ..., n, Y = (Y₁, ..., Y_n)ᵀ, and
W = diag(K_h(x − X₁), K_h(x − X₂), ..., K_h(x − X_n))
Note: This estimator varies with x (in contrast to parametric least squares).
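A compact sketch of the weighted least squares computation, again assuming a Gaussian kernel; np.vander builds the local design matrix:

```python
import numpy as np

def local_polynomial(x, X, Y, h, p=1):
    """Local polynomial fit at the point x; returns beta_hat(x) = (beta_0, ..., beta_p).
    m_hat_{p,h}(x) = beta_hat[0]; nu! * beta_hat[nu] estimates the nu-th derivative."""
    u = (x - X) / h
    w = np.exp(-0.5 * u**2) / (np.sqrt(2.0 * np.pi) * h)   # kernel weights K_h(x - X_i)
    D = np.vander(X - x, N=p + 1, increasing=True)         # columns (X_i - x)^0, ..., (X_i - x)^p
    WD = w[:, None] * D
    # Solve (X^T W X) beta = X^T W Y; may be singular in regions of sparse data.
    return np.linalg.solve(D.T @ WD, WD.T @ Y)
```

With p = 0 this reproduces the Nadaraya-Watson estimate, and p = 1 gives the local linear fit of the next slides.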
16 Local Polynomial Regression
Denote the components of β̂(x) by β̂₀(x), ..., β̂_p(x). Then the local polynomial estimator of the regression function m is
m̂_{p,h}(x) = β̂₀(x),
because m(x) ≈ β₀(x) (recall the Taylor expansion).
17 Local Polynomial Regression
Local constant estimator (Nadaraya-Watson): p = 0. Then
min_β Σ_{i=1}^n {Y_i − β₀}² K_h(x − X_i)
and
m̂_{0,h}(x) = Σ_{i=1}^n K_h(x − X_i) Y_i / Σ_{i=1}^n K_h(x − X_i)
18 Local Polynomial Regression
Local linear estimator: p = 1. Then
min_β Σ_{i=1}^n {Y_i − β₀ − β₁(X_i − x)}² K_h(x − X_i)
and
β̂(x) = [S_{h,0}(x) S_{h,1}(x); S_{h,1}(x) S_{h,2}(x)]⁻¹ (T_{h,0}(x), T_{h,1}(x))ᵀ
m̂_{1,h}(x) = β̂₀(x) = [T_{h,0}(x) S_{h,2}(x) − T_{h,1}(x) S_{h,1}(x)] / [S_{h,0}(x) S_{h,2}(x) − S_{h,1}²(x)]
where
S_{h,j}(x) = Σ_{i=1}^n K_h(x − X_i)(X_i − x)^j,  T_{h,j}(x) = Σ_{i=1}^n K_h(x − X_i)(X_i − x)^j Y_i
19 Local Polynomial Regression
Local polynomial estimator: the general formula for a polynomial of order p
min_β Σ_{i=1}^n {Y_i − β₀ − β₁(X_i − x) − ... − β_p(X_i − x)^p}² K_h(x − X_i)
Then
β̂(x) = [S_{h,0}(x) S_{h,1}(x) ... S_{h,p}(x); S_{h,1}(x) S_{h,2}(x) ... S_{h,p+1}(x); ...; S_{h,p}(x) S_{h,p+1}(x) ... S_{h,2p}(x)]⁻¹ (T_{h,0}(x), T_{h,1}(x), ..., T_{h,p}(x))ᵀ
20 Local Constant Regression
n = 7125 observations of net income and food expenditure from the 1973 Family Expenditure Survey of British households.
21 Local Linear Regression
Differences between local constant and local linear regression appear at the boundaries. Later we will see that the local linear fit is influenced less by outliers.
22 Local Linear Regression
Asymptotic MSE for the local linear regression estimator:
AMSE(m̂_{1,h}(x)) = (1/(nh)) (σ²(x)/f_X(x)) ‖K‖₂² + (h⁴/4) {m″(x)}² μ₂²(K)
Note: The bias is design-independent, and the bias disappears when m(·) is linear.
23 Local Linear Regression
Compare the AMSE of the local constant, the Gasser-Müller and the local linear estimators:
AMSE(m̂_{0,h}(x)) = (1/(nh)) (σ²(x)/f_X(x)) ‖K‖₂² + (h⁴/4) μ₂²(K) {m″(x) + 2 m′(x) f_X′(x)/f_X(x)}²
AMSE(m̂_{GM,h}(x)) = (1/(nh)) σ² ‖K‖₂² + (h⁴/4) μ₂²(K) {m″(x)}²
AMSE(m̂_{1,h}(x)) = (1/(nh)) (σ²(x)/f_X(x)) ‖K‖₂² + (h⁴/4) μ₂²(K) {m″(x)}²
LC uses one parameter less than LL without reducing the asymptotic variance, but the extra parameter enables LL to reduce the bias. The LL fit improves the estimation at the boundary compared to LC. GM improves the bias compared to LC.
24 Local Linear Regression
Remarks:
LL is a weighted local average of the response variables Y_i (as is LC).
h determines the degree of smoothness of m̂_{p,h}: for h → 0, m̂_{p,h}(X_i) → Y_i; when h → ∞ all weights are equal, i.e. we obtain a parametric pth-order polynomial fit.
Derivatives of m: The natural approach is to estimate m by m̂ and compute the derivative m̂′. An alternative and more efficient method is the local polynomial derivative estimator
m̂^{(ν)}_{p,h}(x) = ν! β̂_ν(x)
for the νth derivative of m(·). Usually the order of the polynomial is p = ν + 1 or p = ν + 3.
25 Nearest-Neighbor Estimator
Other nonparametric smoothing techniques differ from the kernel method.
Nearest-neighbor estimator:
Kernel regression: weighted averages of the Y_i in a fixed neighborhood (governed by h) around x.
k-NN: weighted averages of the Y_i in a neighborhood around x whose width is variable, using the k nearest observations to x:
m̂_k(x) = (1/n) Σ_{i=1}^n W_ki(x) Y_i
where
W_ki(x) = n/k if i ∈ J_x, and 0 otherwise,
with the set of indices J_x = {i : X_i is one of the k nearest observations to x}.
26 Nearest-Neighbor Estimator
Remarks:
When we estimate m(·) at a point x with sparse data, the k nearest neighbors are rather far away from x.
k is the smoothing parameter; increasing k makes the estimate smoother.
The k-NN estimator is the kernel estimator with uniform kernel K(u) = (1/2) I(|u| ≤ 1) and variable bandwidth R = R(k), with R(k) being the distance between x and its furthest k-nearest neighbor:
m̂_k(x) = Σ_{i=1}^n K_R(x − X_i) Y_i / Σ_{i=1}^n K_R(x − X_i)
The k-NN estimator can be generalised by considering other kernel functions.
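A direct sketch of the k-NN estimator (uniform weights n/k over the index set J_x):

```python
import numpy as np

def knn_regression(x, X, Y, k):
    """k-NN regression estimate at x: the average of Y_i over the k nearest X_i."""
    J = np.argsort(np.abs(X - x))[:k]   # indices of the k nearest observations to x
    return Y[J].mean()                  # equals (1/n) sum_i W_ki(x) Y_i with W_ki = n/k on J_x
```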
27 Nearest-Neighbor Estimator
Bias and variance: Let k → ∞, k/n → 0 and n → ∞. Then
E{m̂_k(x)} − m(x) ≈ [μ₂(K)/(8 f_X(x)²)] {m″(x) + 2 m′(x) f_X′(x)/f_X(x)} (k/n)²
V{m̂_k(x)} ≈ 2 ‖K‖₂² σ²(x)/k
Remark: The variance does not depend on f_X(x) (unlike LC) because k-NN always averages over exactly k observations. In terms of MSE, k-NN is approximately identical to LC with bandwidth h if we choose k = 2 n h f_X(x).
28 Nearest-Neighbor Estimator [figure]
29 Median Smoothing
Goal: Estimate the conditional median function
m̂(x) = med{Y_i : i ∈ J_x}
where J_x = {i : X_i is one of the k nearest observations to x}, i.e. the median of those Y_i for which X_i is one of the k nearest neighbors of x.
med(Y | X = x) is more robust to outliers than E(Y | X = x).
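Median smoothing only changes the last line of the k-NN sketch above, replacing the mean by the median:

```python
import numpy as np

def median_smoother(x, X, Y, k):
    """Conditional median estimate at x: the median of Y_i over the k nearest X_i."""
    J = np.argsort(np.abs(X - x))[:k]
    return np.median(Y[J])
```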
30 Median Smoothing [figure]
31 Spline Smoothing
Consider the residual sum of squares (RSS) as a criterion for the goodness of fit of m(·):
RSS = Σ_{i=1}^n {Y_i − m(X_i)}²
Minimizing the RSS alone means interpolating the data: m(X_i) = Y_i, i = 1, ..., n.
Solution: Add a stabilizer that penalizes non-smoothness of m(·):
‖m″‖₂² = ∫ {m″(x)}² dx
32 Spline Smoothing
Minimization problem:
m̂_λ = argmin_m S_λ(m)
where S_λ(m) = Σ_{i=1}^n {Y_i − m(X_i)}² + λ ‖m″‖₂².
For the class of all twice differentiable functions on [a, b] = [X_(1), X_(n)], the unique minimizer is the cubic spline, which consists of cubic polynomials
p_i(x) = α_i + β_i x + γ_i x² + δ_i x³, i = 1, ..., n − 1,
between X_(i) and X_(i+1).
λ controls the weight given to the stabilizer: high λ → more weight on ‖m″‖₂² → smooth estimate; λ → 0 → m̂_λ(·) → interpolation; λ → ∞ → m̂_λ(·) → linear function.
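SciPy ships a solver for exactly this penalized criterion; a small sketch with simulated data (the data-generating function and the lam value are arbitrary choices for illustration):

```python
import numpy as np
from scipy.interpolate import make_smoothing_spline  # requires SciPy >= 1.10

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0.0, 1.0, 100))      # strictly increasing design points
Y = np.sin(2.0 * np.pi * X) + rng.normal(0.0, 0.3, 100)

# Minimizes sum_i {Y_i - m(X_i)}^2 + lam * integral of m''(x)^2 over cubic splines.
spl = make_smoothing_spline(X, Y, lam=1e-4)
m_hat = spl(np.linspace(0.0, 1.0, 200))
```

Increasing lam pushes the fit toward a straight line, decreasing it toward interpolation, matching the limits stated above.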
33 Spline Smoothing
The estimator must be twice continuously differentiable:
p_i(X_(i)) = p_{i−1}(X_(i)) (no jumps in the function),
p_i′(X_(i)) = p_{i−1}′(X_(i)) (no jumps in the first derivative),
p_i″(X_(i)) = p_{i−1}″(X_(i)) (no jumps in the second derivative),
p_1″(X_(1)) = p_{n−1}″(X_(n)) = 0 (boundary condition).
These restrictions and the conditions for minimizing S_λ(m) with respect to the coefficients of the p_i define a system of linear equations which can be solved in O(n) calculations.
34 Spline Smoothing
The spline smoother is a linear estimator in the Y_i:
m̂_λ(x) = (1/n) Σ_{i=1}^n W_{λ,i}(x) Y_i
The spline smoother is asymptotically equivalent to a kernel smoother (under certain conditions) with the spline kernel
K_S(u) = (1/2) exp(−|u|/√2) sin(|u|/√2 + π/4)
and local bandwidth h(X_i) = λ^{1/4} n^{−1/4} f_X(X_i)^{−1/4}.
35 Spline Smoothing [figure]
36 Orthogonal Series
Suppose that m(·) can be represented by a Fourier series
m(x) = Σ_{j=0}^∞ β_j φ_j(x)
where {φ_j}_{j=0}^∞ is a known basis of functions and {β_j}_{j=0}^∞ are the unknown Fourier coefficients.
Goal: Estimate the unknown Fourier coefficients. We cannot estimate an infinite number of coefficients from a finite number of observations; therefore choose a number N of terms to include in the Fourier series representation.
37 Orthogonal Series
Estimation proceeds in three steps: select a basis of functions, select an integer N < n, and estimate the N unknown coefficients.
N is the smoothing parameter: large N → many terms in the Fourier series → interpolation; small N → few terms → smooth estimate.
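A sketch of a series estimator with the cosine basis on [0, 1] (the basis choice is ours; the slides leave {φ_j} unspecified), estimating the coefficients by least squares:

```python
import numpy as np

def cosine_basis(x, N):
    """phi_0(x) = 1, phi_j(x) = sqrt(2) cos(pi j x), j = 1, ..., N; returns an (n, N+1) matrix."""
    j = np.arange(N + 1)
    return np.where(j[None, :] == 0, 1.0, np.sqrt(2.0) * np.cos(np.pi * j[None, :] * x[:, None]))

def series_fit(X, Y, N):
    """Least squares estimates of the first N+1 Fourier coefficients."""
    beta, *_ = np.linalg.lstsq(cosine_basis(X, N), Y, rcond=None)
    return beta

def series_eval(x, beta):
    return cosine_basis(x, len(beta) - 1) @ beta   # m_hat(x) = sum_j beta_j phi_j(x)
```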
38 Orthogonal Series [figure]
39 Smoothing Parameter Selection
How can we choose the smoothing parameter of the kernel regression estimators? Which criteria do we have that measure how close the estimate is to the true curve?
40 Smoothing Parameter Selection
Mean Squared Error (MSE): MSE(m̂_h(x)) = E[{m̂_h(x) − m(x)}²]
m(x) is an unknown constant; m̂_h(x) is a random variable, i.e. E is taken with respect to this r.v. Since m̂_h(x) is a function of {X_i, Y_i}_{i=1}^n,
MSE(m̂_h(x)) = ∫ ... ∫ {m̂_h(x) − m(x)}² f(x₁, ..., x_n, y₁, ..., y_n) dx₁ ... dx_n dy₁ ... dy_n
A local measure. The AMSE-optimal bandwidth is a complicated function of m″(x) and σ²(x).
41 Smoothing Parameter Selection
Integrated Squared Error (ISE):
ISE(m̂_h) = ∫ {m̂_h(x) − m(x)}² w(x) f_X(x) dx
A global measure. ISE is a random variable: different samples give different values of m̂_h(x) and hence of ISE.
Weight function w(x): used to assign less weight to observations in regions of sparse data (to reduce variance) or in the tails (to reduce boundary effects).
42 Smoothing Parameter Selection
Mean Integrated Squared Error (MISE):
MISE(m̂_h) = E{ISE(m̂_h)} = ∫ ... ∫ [∫ {m̂_h(x) − m(x)}² w(x) f_X(x) dx] f(x₁, ..., x_n, y₁, ..., y_n) dx₁ ... dx_n dy₁ ... dy_n
The expected value of the ISE.
43 Smoothing Parameter Selection
Averaged Squared Error (ASE):
ASE(m̂_h) = (1/n) Σ_{j=1}^n {m̂_h(X_j) − m(X_j)}² w(X_j)
A global measure and a random variable; a discrete approximation to the ISE.
44 Smoothing Parameter Selection
Mean Averaged Squared Error (MASE):
MASE(m̂_h) = E{ASE | X₁ = x₁, ..., X_n = x_n} = (1/n) E[Σ_{j=1}^n {m̂_h(X_j) − m(X_j)}² w(X_j) | X₁ = x₁, ..., X_n = x_n]
The conditional expectation of the ASE. If we view X₁, ..., X_n as random variables, then MASE is a random variable. MASE is an asymptotic version of the AMISE.
45 Smoothing Parameter Selection
Which measure should we use to derive a bandwidth selection method? A natural choice is MISE/AMISE, as used in the density case, but here there are more unknown quantities than in the density case.
Bandwidth selection: cross-validation, penalty terms, plug-in methods (Fan & Gijbels, 1996).
For the NW estimator, ASE, ISE and MISE lead asymptotically to the same level of smoothing. Therefore use the easiest: ASE (a discrete ISE).
46 Smoothing Parameter Selection
Averaged Squared Error:
ASE(m̂_h) = (1/n) Σ_{i=1}^n {m̂_h(X_i) − m(X_i)}² w(X_i)
Its conditional expectation, the MASE:
MASE(h) = E{ASE(h) | X₁ = x₁, ..., X_n = x_n} = (1/n) Σ_{i=1}^n E[{m̂_h(X_i) − m(X_i)}² | X₁ = x₁, ..., X_n = x_n] w(X_i)
47 Smoothing Parameter Selection
Bias² (thin solid line), variance (dashed line), MASE (thick line) for a simulated data set:
m(x) = {sin(2πx³)}³, Y_i = m(X_i) + ε_i, X_i ∼ U(0, 1), ε_i ∼ N(0, 0.01)
48 Smoothing Parameter Selection
However, MASE and ASE contain m(·), which is unknown.
Resubstitution estimate: Replace m(X_i) with the observations Y_i:
p(h) = (1/n) Σ_{i=1}^n {Y_i − m̂_h(X_i)}² w(X_i)
This is the weighted residual sum of squares (RSS).
Problem: Y_i is used in m̂_h(X_i) to predict itself. Therefore p(h) can be made arbitrarily small by letting h → 0 (interpolation).
49 Smoothing Parameter Selection
Cross-validation: The leave-one-out estimator is
m̂_{h,−i}(X_i) = Σ_{j≠i} K_h(X_i − X_j) Y_j / Σ_{j≠i} K_h(X_i − X_j)
Cross-validation function:
CV(h) = (1/n) Σ_{i=1}^n {Y_i − m̂_{h,−i}(X_i)}² w(X_i)
Choose the h that minimizes CV(h).
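A vectorized sketch of bandwidth selection by cross-validation for the Nadaraya-Watson estimator (Gaussian kernel, w ≡ 1 assumed; zeroing the diagonal of the kernel matrix implements the leave-one-out step):

```python
import numpy as np

def cv_score(h, X, Y):
    """Leave-one-out cross-validation criterion CV(h) for the NW estimator."""
    u = (X[:, None] - X[None, :]) / h
    K = np.exp(-0.5 * u**2) / (np.sqrt(2.0 * np.pi) * h)
    np.fill_diagonal(K, 0.0)               # leave observation i out of m_hat(X_i)
    m_loo = (K @ Y) / K.sum(axis=1)
    return np.mean((Y - m_loo) ** 2)

def cv_bandwidth(X, Y, grid):
    """Pick the bandwidth on the grid that minimizes CV(h)."""
    return grid[np.argmin([cv_score(h, X, Y) for h in grid])]
```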
50 Smoothing Parameter Selection
Left: Local constant regression estimator (Nadaraya-Watson) with cross-validation bandwidth (ĥ_CV = 0.15), quartic kernel.
Right: Local linear regression estimator with cross-validation bandwidth (ĥ_CV = 0.56). It is more stable in regions with sparse data and outperforms the LC estimator at the boundaries.
51 Plug-in method (Fan & Gijbels, 1996)
Bandwidth selection methods: constant (global) bandwidth vs. variable bandwidth. A variable bandwidth reduces the bias in peaked regions and the variance in flat regions.
Local variable bandwidth h(x₀): varies with the location point x₀.
Global variable bandwidth h(X_i): varies with the data points.
We focus on the constant and the local variable bandwidth.
52 Plug-in method (Fan & Gijbels, 1996)
Local bandwidth: The optimal local bandwidth minimizes the conditional MSE
[Bias{m̂_ν(x₀) | X}]² + V{m̂_ν(x₀) | X}
where m̂_ν(·) estimates the νth derivative of m(·). Using the general expressions for bias and variance, the (local) AMSE-optimal bandwidth is
h_opt^{LOC}(x₀) = C_{ν,p}(K) [σ²(x₀) / ({m^{(p+1)}(x₀)}² f(x₀))]^{1/(2p+3)} n^{−1/(2p+3)}
where C_{ν,p}(K) is just a constant:
C_{ν,p}(K) = [{(p+1)!}² (2ν+1) ∫ K_ν²(t) dt / (2(p+1−ν) {∫ t^{p+1} K_ν(t) dt}²)]^{1/(2p+3)}
53 Plug-in method (Fan & Gijbels, 1996)
Constant bandwidth: The optimal global bandwidth minimizes the conditional weighted MISE
∫ ([Bias{m̂_ν(x) | X}]² + V{m̂_ν(x) | X}) w(x) dx
where w(·) ≥ 0 is some weight function. Using again the asymptotic expressions for bias and variance, the optimal bandwidth is
h_opt^{GLO} = C_{ν,p}(K) [∫ σ²(x) w(x)/f(x) dx / ∫ {m^{(p+1)}(x)}² w(x) dx]^{1/(2p+3)} n^{−1/(2p+3)}
54 Plug-in method (Fan & Gijbels, 1996)
The optimal bandwidths h_opt^{LOC} and h_opt^{GLO} depend on unknown quantities:
the design density f(·),
the conditional variance σ²(·) = V(Y | X = ·),
the derivative function m^{(p+1)}(·),
in
h_opt^{LOC}(x₀) = C_{ν,p}(K) [σ²(x₀) / ({m^{(p+1)}(x₀)}² f(x₀))]^{1/(2p+3)} n^{−1/(2p+3)}
h_opt^{GLO} = C_{ν,p}(K) [∫ σ²(x) w(x)/f(x) dx / ∫ {m^{(p+1)}(x)}² w(x) dx]^{1/(2p+3)} n^{−1/(2p+3)}
Solution: Substitute pilot estimates → plug-in bandwidth.
55 Plug-in method (Fan & Gijbels, 1996)
Rule of thumb for constant bandwidth selection: We find pilot estimates of f(·), σ²(·) and m^{(p+1)}(·) by fitting a polynomial of order p + 3 globally to m(x) (a parametric fit):
m̌(x) = α̌₀ + ... + α̌_{p+3} x^{p+3}
σ̌²: the standardized RSS from this fit.
m̌^{(p+1)}: the (p+1)th derivative of the fitted polynomial, which is of quadratic form.
56 Plug-in method (Fan & Gijbels, 1996)
We want to estimate m^{(ν)}(·) and we take w(x) = f(x) w₀(x). Substitute the pilot estimates into the optimal constant bandwidth.
Rule-of-thumb bandwidth:
ȟ_ROT = C_{ν,p}(K) [σ̌² ∫ w₀(x) dx / Σ_{i=1}^n {m̌^{(p+1)}(X_i)}² w₀(X_i)]^{1/(2p+3)}
where C_{ν,p}(K) = [{(p+1)!}² (2ν+1) ∫ K_ν²(t) dt / (2(p+1−ν) {∫ t^{p+1} K_ν(t) dt}²)]^{1/(2p+3)} is a constant (tabulated for different kernels and different values of ν and p).
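A sketch of the rule of thumb for the local linear case (ν = 0, p = 1), assuming a Gaussian kernel and taking w₀ as the indicator of the data range; then C_{0,1}(K) = {1/(2√π)}^{1/5} ≈ 0.776, and dividing the RSS by n − 5 is one way to standardize it (the quartic pilot fit has 5 coefficients):

```python
import numpy as np

def rule_of_thumb_bandwidth(X, Y):
    """Rule-of-thumb bandwidth for local linear regression (nu = 0, p = 1, Gaussian kernel)."""
    n = len(X)
    coef = np.polynomial.polynomial.polyfit(X, Y, 4)   # global quartic pilot fit (order p + 3)
    pilot = np.polynomial.Polynomial(coef)
    sigma2 = np.sum((Y - pilot(X)) ** 2) / (n - 5)     # standardized RSS
    m2 = pilot.deriv(2)                                # pilot m''(x), a quadratic
    C = (1.0 / (2.0 * np.sqrt(np.pi))) ** 0.2          # constant C_{0,1}(K), Gaussian kernel
    num = sigma2 * (X.max() - X.min())                 # sigma2 * integral of w_0
    den = np.sum(m2(X) ** 2)                           # sum_i m''(X_i)^2 w_0(X_i)
    return C * (num / den) ** 0.2
```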
57 Smoothing Parameter Selection
Local linear regression (p = 1) and derivative estimation (p = 2) with rule-of-thumb bandwidth ȟ.
58 Confidence Regions
How close is the smoothed curve to the true curve?
Theorem 4.5: Asymptotic distribution of the Nadaraya-Watson estimator
Suppose m and f_X are twice differentiable, ∫ |K(u)|^{2+κ} du < ∞ for some κ > 0, x is a continuity point of σ²(x) and E(|Y|^{2+κ} | X = x), and f_X(x) > 0. Take h = c n^{−1/5}. Then
n^{2/5} {m̂_h(x) − m(x)} →^L N(b_x, v_x²)
where
b_x = c² μ₂(K) {m″(x)/2 + m′(x) f_X′(x)/f_X(x)}
v_x² = σ²(x) ‖K‖₂² / (c f_X(x))
59 Confidence Regions
Suppose the bias is of negligible magnitude compared to the variance, e.g. h sufficiently small. Approximate confidence intervals:
[m̂_h(x) − z_{1−α/2} √(‖K‖₂² σ̂²(x) / (n h f̂_h(x))),  m̂_h(x) + z_{1−α/2} √(‖K‖₂² σ̂²(x) / (n h f̂_h(x)))]
where z_{1−α/2} is the (1 − α/2)-quantile of the standard normal distribution and the estimate of σ²(x) is
σ̂²(x) = (1/n) Σ_{i=1}^n W_hi(x) {Y_i − m̂_h(x)}²
where the W_hi are the weights from the Nadaraya-Watson estimator.
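Pointwise confidence bands for the Nadaraya-Watson fit, as a sketch with a Gaussian kernel (for which ‖K‖₂² = 1/(2√π)); the bias is ignored, as assumed above:

```python
import numpy as np
from scipy.stats import norm

def nw_confidence_band(x_grid, X, Y, h, alpha=0.05):
    """Approximate pointwise (1 - alpha) confidence intervals for the NW estimate."""
    n = len(X)
    u = (x_grid[:, None] - X[None, :]) / h
    K = np.exp(-0.5 * u**2) / (np.sqrt(2.0 * np.pi) * h)
    f_hat = K.mean(axis=1)                         # kernel density estimate f_hat_h(x)
    m_hat = (K @ Y) / (n * f_hat)                  # NW estimate m_hat_h(x)
    W = K / f_hat[:, None]                         # NW weights W_hi(x)
    sigma2 = np.mean(W * (Y[None, :] - m_hat[:, None]) ** 2, axis=1)
    K22 = 1.0 / (2.0 * np.sqrt(np.pi))             # ||K||_2^2 for the Gaussian kernel
    half = norm.ppf(1.0 - alpha / 2.0) * np.sqrt(K22 * sigma2 / (n * h * f_hat))
    return m_hat - half, m_hat + half
```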
60 Confidence Regions The Nadaraya-Watson kernel regression and 95% confidence intervals, h = 0.2.
61 Hypothesis Testing
What would we like to test? Is there indeed an impact of X on Y? Is the estimated function m(·) significantly different from a traditional parametrization (e.g. linear or log-linear models)?
The optimal rates of convergence differ between nonparametric estimation and nonparametric testing. Therefore, the choice of smoothing parameter is an issue to be discussed separately.
62 Hypothesis Testing
Specifying the hypothesis: H₀ is the hypothesis, H₁ the alternative.
Nonparametric regression:
H₀: m(x) ≡ 0 (X has no impact on Y; assume EY = 0, otherwise work with Ỹ_i = Y_i − Ȳ).
H₁: m(x) ≠ 0.
Estimate m(x) by the Nadaraya-Watson estimator: m̂(·) = m̂_h(·).
63 Hypothesis Testing
A measure for the deviation from zero:
T₁ = n√h ∫ {m̂(x) − 0}² w(x) dx
where w(x) is a weight function (to trim boundaries or regions of sparse data). If w(x) = f_X(x) w̄(x), then the empirical version of T₁ is
T₂ = √h Σ_{i=1}^n {m̂(X_i) − 0}² w̄(X_i)
64 Hypothesis Testing
Under H₀: T₁ → 0 and T₂ → 0 for n → ∞.
Under H₁: T₁ → ∞ and T₂ → ∞ for n → ∞.
Under H₀, m̂(·) does not have any bias (Theorem 4.3, speed of convergence for the NW estimator).
Under H₀, T₁ and T₂ converge to N(0, V) where
V = 2 ∫ [σ⁴(x) w(x)/f_X²(x)] dx · ∫ (K ⋆ K)²(x) dx
65 Hypothesis Testing
General hypothesis:
H₀: m(x) = m_θ(x) (a specific parametric model),
H₁: m(x) ≠ m_θ(x).
Test statistic, where θ̂ is a consistent estimator of θ:
√h Σ_{i=1}^n {m̂(X_i) − m_θ̂(X_i)}² w(X_i)
Problem: m_θ̂(·) is asymptotically unbiased and converges at rate √n, whereas m̂(x) has a kernel smoothing bias and converges at the slower rate √(nh).
66 Hypothesis Testing
Solution: Introduce an artificial bias by replacing m_θ̂(·) with
m̂_θ̂(x) = Σ_{i=1}^n K_h(x − X_i) m_θ̂(X_i) / Σ_{i=1}^n K_h(x − X_i)
in the test statistic
T = √h Σ_{i=1}^n {m̂(X_i) − m̂_θ̂(X_i)}² w(X_i)
Now the biases of m̂(·) and m̂_θ̂(·) cancel out, and likewise the convergence rates match.
67 Hypothesis Testing
We reject H₀ if T is large, but what is large?
Theorem 4.7: Asymptotic distribution of T
Assume the conditions of Theorem 4.3, and further that θ̂ is a √n-consistent estimator of θ and h = O(n^{−1/5}). Then
T − h^{−1/2} ‖K‖₂² ∫ σ²(x) w(x) dx →^D N(0, 2 ∫ σ⁴(x) w²(x) dx · ∫ (K ⋆ K)² dx)
68 Hypothesis Testing
Problem: Slow convergence of T towards the normal distribution.
Solution: Bootstrap. Simulate the distribution of the test statistic under the hypothesis and determine the critical values from the simulated distribution.
The most popular method is the wild bootstrap: resample from the residuals ε̂_i, i = 1, ..., n under H₀. Each bootstrap residual ε*_i is drawn from a distribution that agrees with the distribution of ε̂_i up to the first three moments.
69 Hypothesis Testing
Wild bootstrap:
1. Estimate m_θ̂(·) under H₀ and construct ε̂_i = Y_i − m_θ̂(X_i).
2. For each X_i draw a bootstrap residual ε*_i such that E(ε*_i) = 0, E(ε*_i²) = ε̂_i², E(ε*_i³) = ε̂_i³.
3. Generate a bootstrap sample {Y*_i, X_i}_{i=1,...,n} by setting Y*_i = m_θ̂(X_i) + ε*_i.
4. From this sample, calculate the bootstrap test statistic T*.
5. Repeat steps 2-4 n_boot times and use the n_boot generated test statistics T* to determine the quantiles of the test statistic under H₀. This gives approximate critical values for your test statistic T.
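A sketch of this scheme; the two-point "golden section" distribution used for ε*_i is one standard choice that matches the first three moments, and fit_h0/test_stat are user-supplied placeholders:

```python
import numpy as np

def wild_bootstrap_pvalue(X, Y, fit_h0, test_stat, n_boot=500, seed=0):
    """Wild bootstrap p-value. fit_h0(X, Y) returns the fitted values m_thetahat(X_i)
    under H0; test_stat(X, Y) returns the statistic T."""
    rng = np.random.default_rng(seed)
    m0 = fit_h0(X, Y)                      # step 1: parametric fit under H0
    eps = Y - m0                           # residuals under H0
    T = test_stat(X, Y)
    a, b = (1.0 - np.sqrt(5.0)) / 2.0, (1.0 + np.sqrt(5.0)) / 2.0
    p = (5.0 + np.sqrt(5.0)) / 10.0        # P(V = a); gives E V = 0, E V^2 = E V^3 = 1
    T_boot = np.empty(n_boot)
    for r in range(n_boot):                # steps 2-4, repeated n_boot times
        V = np.where(rng.random(len(Y)) < p, a, b)
        Y_star = m0 + eps * V              # bootstrap sample generated under H0
        T_boot[r] = test_stat(X, Y_star)
    return np.mean(T_boot >= T)            # step 5: compare T to its bootstrap distribution
```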
70 Multivariate Kernel Regression
How does Y depend on a vector of variables X? We want to estimate
E(Y | X) = E(Y | X₁, ..., X_d) = m(X)
where X = (X₁, ..., X_d)ᵀ. We have
E(Y | X = x) = ∫ y f(y | x) dy = ∫ y f(y, x) dy / f_X(x)
71 Multivariate Kernel Regression
Multivariate Nadaraya-Watson estimator (local constant estimator): replace the unknown densities by kernel estimates to obtain
m̂_H(x) = Σ_{i=1}^n K_H(X_i − x) Y_i / Σ_{i=1}^n K_H(X_i − x)
A weighted sum of the observed Y_i.
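A sketch of the multivariate Nadaraya-Watson estimator for a diagonal bandwidth matrix H = diag(h₁, ..., h_d) and a product Gaussian kernel (both our simplifying assumptions):

```python
import numpy as np

def nadaraya_watson_multi(x, X, Y, h):
    """Multivariate NW estimate at the d-vector x; X is (n, d), h a length-d bandwidth vector."""
    u = (X - x) / h                        # componentwise scaling by H = diag(h)
    K = np.exp(-0.5 * np.sum(u**2, axis=1)) / ((2.0 * np.pi) ** (len(x) / 2.0) * np.prod(h))
    return np.sum(K * Y) / np.sum(K)       # weighted sum of the observed Y_i
```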
72 Multivariate Kernel Regression
Local linear estimator: the minimization problem
min_{β₀,β₁} Σ_{i=1}^n {Y_i − β₀ − β₁ᵀ(X_i − x)}² K_H(X_i − x)
with solution
β̂ = (β̂₀, β̂₁ᵀ)ᵀ = (XᵀWX)⁻¹ XᵀWY
where X has rows (1, (X_i − x)ᵀ), Y = (Y₁, ..., Y_n)ᵀ, and W = diag(K_H(X₁ − x), ..., K_H(X_n − x)).
β̂₀ estimates the regression function: m̂_{1,H}(x) = β̂₀(x). β̂₁ estimates the partial derivatives of m with respect to the components of x.
73 Multivariate Kernel Regression
Statistical properties of the Nadaraya-Watson estimator:
Theorem 4.8: Asymptotic bias and variance
The conditional asymptotic bias and variance of the multivariate Nadaraya-Watson kernel regression estimator are
Bias(m̂_H | X₁, ..., X_n) ≈ μ₂(K) ∇m(x)ᵀ HHᵀ ∇f_X(x)/f_X(x) + (μ₂(K)/2) tr{Hᵀ m″(x) H}
V(m̂_H | X₁, ..., X_n) ≈ (1/(n det(H))) ‖K‖₂² σ²(x)/f_X(x)
in the interior of the support of f_X (here m″(x) denotes the Hessian of m).
74 Multivariate Kernel Regression
Statistical properties of the local linear estimator:
Theorem 4.9: Asymptotic bias and variance
The conditional asymptotic bias and variance of the multivariate local linear kernel regression estimator are
Bias(m̂_{1,H} | X₁, ..., X_n) ≈ (μ₂(K)/2) tr{Hᵀ m″(x) H}
V(m̂_{1,H} | X₁, ..., X_n) ≈ (1/(n det(H))) ‖K‖₂² σ²(x)/f_X(x)
in the interior of the support of f_X.
75 Multivariate Kernel Regression
Practical aspects, d = 2: Comparison of the Nadaraya-Watson and the local linear two-dimensional estimates for simulated data: 500 design points uniformly distributed in [0, 1] × [0, 1], m(x) = sin(2πx₁) + x₂, error term N(0, 1/4), bandwidths h₁ = h₂ = 0.3.
76 Multivariate Kernel Regression
Curse of dimensionality: Nonparametric regression estimators are based on the idea of local (weighted) averaging. In higher dimensions the observations are sparsely distributed even for large sample sizes, and consequently estimators based on local averaging perform unsatisfactorily.
The effect can be seen from the AMSE with H = h·I:
AMSE(n, h) = (1/(nh^d)) C₁ + h⁴ C₂
The optimal bandwidth is h_opt ∼ n^{−1/(4+d)} and the rate of convergence of the AMSE is n^{−4/(4+d)}: the speed of convergence decreases dramatically for higher dimensions d.