Introduction to Nonparametric Regression

Size: px

Start display at page:

Download "Introduction to Nonparametric Regression"

Charity Carson
5 years ago
Views:

1 Introduction to Nonparametric Regression Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 04-Jan-2017 Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 1

2 Copyright Copyright c 2017 by Nathaniel E. Helwig Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 2

3 Outline of Notes 1) Need for NP Reg Motivating example Nonparametric regression 3) Local Regression: Overview Examples 2) Local Averaging Overview Examples 4) Kernel Smoothing: Overview Examples Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 3

4 Need for Nonparametric Regression Need for Nonparametric Regression Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 4

5 Need for Nonparametric Regression Motivating Example Results From Four Hypothetical Studies Study 1: y^ = x Study 2: y^ = x y^ R 2 = 0.67 y^ R 2 = x x Study 3: y^ = x Study 4: y^ = x y^ R 2 = 0.67 y^ R 2 = x x Figure: Estimated linear relationship from four hypothetical studies. Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 5

6 Need for Nonparametric Regression Motivating Example Implications of Four Hypothetical Studies What do the results on the previous slide imply? Can we conclude that there is a linear relationship between X and Y? Is the reproducibility of the finding indicative of a valid discovery? What have we learned about the data from these results? Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 6

7 Need for Nonparametric Regression Let s Look at the Data Motivating Example Study 1: y^ = x Study 2: y^ = x y^ R 2 = 0.67 y^ R 2 = x x Study 3: y^ = x Study 4: y^ = x y^ R 2 = 0.67 y^ x x Figure: Estimated linear relationship with corresponding data. Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 7

8 Need for Nonparametric Regression Anscombe s (1973) Quartet Motivating Example > anscombe x1 x2 x3 x4 y1 y2 y3 y Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 8

9 Need for Nonparametric Regression Nonparametric Regression Parametric versus Nonparametric Regression The general linear model is a form of parametric regression, where the relationship between X and Y has some predetermined form. Parameterizes relationship between X and Y, e.g., Ŷ = β 0 + β 1 X Then estimates the specified parameters, e.g., β 0 and β 1 Great if you know the form of the relationship (e.g., linear) In contrast, nonparametric regression tries to estimate the form of the relationship between X and Y. No predetermined form for relationship between X and Y Great for discovering relationships and building prediction models Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 9

10 Need for Nonparametric Regression Nonparametric Regression Problem of Interest Smoothers (aka nonparametric regression) try to estimate functions from noisy data. Suppose we have n pairs of points (x i, y i ) for i {1,..., n}, and WLOG assume that x 1 x 2 x n. Also, suppose the following assumptions hold: (A1) There is a functional relationship between x and y of the form y i = η(x i ) + ɛ i ; i {1,..., n} (A2) The ɛ i are iid from some distribution f (x) with zero mean Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 10

11 Local Averaging Local Averaging Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 11

12 Local Averaging Overview Friedman s (1984) Local Averaging To estimate η at the point x i, we could calculate the average of the y j values corresponding to x j values that are near x i. Friedman (1984) defined near as the smallest symmetric window around x i that contains s observations. Note that s is called the span Size of averaging window can differ for each x i But always use s points in averaging window Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 12

13 Local Averaging Overview Selecting the Span Friedman proposed using a cross-validation approach to select span s. For a given span s, leave-one-out cross-validation: Let ŷ (i) denote the local averaging estimate of η at the point x i obtained by holding out the i-th pair (x i, y i ) Define CV residuals e i (s) = y i ŷ (i) ; note residual is function of s ŝ = min s S (1/n) n i=1 e2 i (s) Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 13

14 Local Averaging Examples Local Averaging Example 1: sunspots data Raw Data span=0.01 sunspots sunlocavg$y years sunlocavg$x span=0.05 span=cv sunlocavg$y sunlocavg$y sunlocavg$x sunlocavg$x Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 14

15 Local Averaging Examples Local Averaging Example 1: R code data(sunspots) yrs=start(sunspots) yre=end(sunspots) years=seq(yrs[1]+yrs[2]/12,yre[1]+yre[2]/12,by=1/12) dev.new(width=8,height=6,norstudiogd=true) par(mfrow=c(2,2)) plot(years,sunspots,type="l",main="raw Data") sunlocavg=supsmu(years,sunspots,span=0.01) plot(sunlocavg,type="l",main="span=0.01") sunlocavg=supsmu(years,sunspots,span=0.05) plot(sunlocavg,type="l",main="span=0.05") sunlocavg=supsmu(years,sunspots) plot(sunlocavg,type="l",main="span=cv") Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 15

16 Local Averaging Examples Local Averaging Example 2: simulated data x y η^ η Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 16

17 Local Averaging Examples Local Averaging Example 2: R code > set.seed(1) > x=seq(0,1,length=50) > y=sin(2*pi*x)+rnorm(50,sd=0.5) > locavg=supsmu(x,y) > dev.new(width=6,height=6,norstudiogd=true) > plot(x,y) > lines(locavg$x,locavg$y) > lines(locavg$x,sin(2*pi*x),lty=2) > legend("bottomleft",c(expression(hat(eta)), + expression(eta)),lty=1:2,bty="n") Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 17

18 Local Regression Local Regression Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 18

19 Local Regression Overview Cleveland s (1979) Local Regression To estimate η at the point x i, we could calculate the local linear regression line using the (x j, y j ) points near x i. LOWESS: LOcally WEighted Scatterplot Smoothing LOESS: LOcal regression Cleveland (1979) proposed using weighted regression with weights related to distance of x j points to x i. Weight function is scaled so only proportion of (x j, y j ) points are used in each regression Size of regression window can differ for each x i But use αn points in each regression where α (0, 1] Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 19

20 Local Regression Overview Weighted Local Regression Cleveland proposed using a weight function W such that { > 0, x < 1 W (x) = 0, x 1 Then W is modified for each index i {1,..., n} by... Centering W at x i Scaling W such that αn values are nonzero R s loess uses tricube function: W (x) = (1 x 3 ) 3 for x < 1 Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 20

21 Local Regression Overview Weighted Local Regression (continued) Let {w i j }n j=1 denote the weights for a particular point x i. The weighted local regression problem minimizes n j=1 w i j (y j β i 0 βi 1 x j) 2 where β0 i and βi 1 are intercept and slope between x and y in neighborhood around x i Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 21

22 Local Regression Overview Robust Weighted Local Regression To reduce effect of outliers, we can perform another regression with weights based on the residuals ˆɛ i = y i ŷ i where ŷ i = ˆβ i 0 + ˆβ i 1 x i. Bisquare weight function: B(x) = (1 x 2 ) 2, x < 1 Residual-based weights: δ i = B (ˆɛ i /(6median 1 j n ˆɛ j ) ) The robust weighted local regression problem minimizes n j=1 δ j w i j (y j β i 0 βi 1 x j) 2 where β0 i and βi 1 are the robust (i.e., residual corrected) intercept and slope between x and y in neighborhood around x i Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 22

23 Local Regression Overview Selecting the Span Want to minimize the leave-one-out cross-validation criterion: 1 n n (y i ŷ (i) ) 2 i=1 where ŷ (i) is the LOESS estimate of y i obtained by holding out (x i, y i ). Rewrite the leave-one-out cross-validation criterion as 1 n n i=1 (y i ŷ i ) 2 (1 h ii ) 2 where h ii are diagonal entries of the hat matrix H that determines ŷ i. Replace h ii with 1 n n i=1 h ii = tr(h)/n for generalized CV Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 23

24 Local Regression Examples Local Regression Example 1: sunspots data Raw Data span=0.01 sunspots sunlocreg$fitted years sunlocreg$x span=0.05 span=gcv sunlocreg$fitted sunlocreg$fitted sunlocreg$x sunlocreg$x Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 24

25 Local Regression Examples Local Regression Example 1: R code library(fancova) data(sunspots) yrs=start(sunspots) yre=end(sunspots) years=seq(yrs[1]+yrs[2]/12,yre[1]+yre[2]/12,by=1/12) dev.new(width=8,height=6,norstudiogd=true) par(mfrow=c(2,2)) plot(years,sunspots,type="l",main="raw Data") sunlocreg=loess(sunspots~years,span=0.01) plot(sunlocreg$x,sunlocreg$fitted,type="l",main="span=0.01") sunlocreg=loess(sunspots~years,span=0.05) plot(sunlocreg$x,sunlocreg$fitted,type="l",main="span=0.05") sunlocreg=loess.as(years,sunspots,criterion="gcv") plot(sunlocreg$x,sunlocreg$fitted,type="l",main="span=gcv") Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 25

26 Local Regression Examples Local Regression Example 2: simulated data x y η^ η Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 26

27 Local Regression Examples Local Regression Example 2: R code > library(fancova) > set.seed(55455) > x=seq(0,1,length=50) > y=sin(2*pi*x)+rnorm(50,sd=0.5) > locreg=loess.as(x,y) > dev.new(width=6,height=6,norstudiogd=true) > plot(x,y) > lines(locreg$x,locreg$fitted) > lines(locreg$x,sin(2*pi*x),lty=2) > legend("bottomleft",c(expression(hat(eta)), + expression(eta)),lty=1:2,bty="n") Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 27

28 Kernel Smoothing Kernel Smoothing Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 28

29 Kernel Smoothing Overview Kernel Smoothing for General Functions Kernel smoothing extends KDE idea to estimation a general function η. Nadaraya (1964) and Watson (1964) independently introduced the kernel regression estimate where weights w i = n i=1 ˆη(x) = y ik ( x x i ) h n i=1 K ( x x ) i = h ( ) K x xi h ( n i=1 K x xi h n y i w i i=1 ) are dependent on chosen K and h. CV, GCV, or AIC to select bandwidth in kernel regression problems. Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 29

30 Kernel Smoothing Examples Kernel Smoothing Example 1: sunspots data Raw Data bw=0.01 sunspots fitted(sunkerreg) years years bws=0.05 bws=cv (0.09) fitted(sunkerreg) fitted(sunkerreg) years years Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 30

31 Kernel Smoothing Examples Kernel Smoothing Example 1: R code library(np) data(sunspots) yrs=start(sunspots) yre=end(sunspots) sunspots=as.vector(sunspots) years=seq(yrs[1]+yrs[2]/12,yre[1]+yre[2]/12,by=1/12) dev.new(width=8,height=6,norstudiogd=true) par(mfrow=c(2,2)) plot(years,sunspots,type="l",main="raw Data") sunkerreg=npreg(bws=0.01,txdat=years,tydat=sunspots) plot(years,fitted(sunkerreg),type="l",main="bw=0.01") sunkerreg=npreg(bws=0.05,txdat=years,tydat=sunspots) plot(years,fitted(sunkerreg),type="l",main="bws=0.05") sunkerreg=npreg(txdat=years,tydat=sunspots) plot(years,fitted(sunkerreg),type="l",main="bws=cv (0.09)") Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 31

32 Kernel Smoothing Examples Kernel Smoothing Example 2: simulated data x y η^ η Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 32

33 Kernel Smoothing Examples Kernel Smoothing Example 2: R code > library(np) > set.seed(55455) > x=seq(0,1,length=50) > y=sin(2*pi*x)+rnorm(50,sd=0.5) > kerreg=npreg(txdat=x,tydat=y) > dev.new(width=6,height=6,norstudiogd=true) > plot(x,y) > lines(x,fitted(kerreg)) > lines(x,sin(2*pi*x),lty=2) > legend("bottomleft",c(expression(hat(eta)), + expression(eta)),lty=1:2,bty="n") Nathaniel E. Helwig (U of Minnesota) Introduction to Nonparametric Regression Updated 04-Jan-2017 : Slide 33

STAT 704 Sections IRLS and Bootstrap

STAT 704 Sections IRLS and Bootstrap STAT 704 Sections 11.4-11.5. IRLS and John Grego Department of Statistics, University of South Carolina Stat 704: Data Analysis I 1 / 14 LOWESS IRLS LOWESS LOWESS (LOcally WEighted Scatterplot Smoothing)