Two Applications of Nonparametric Regression in Survey Estimation
|
|
- Dorcas Daniels
- 5 years ago
- Views:
Transcription
1 Two Applications of Nonparametric Regression in Survey Estimation 1/56 Jean Opsomer Iowa State University Joint work with Jay Breidt, Colorado State University Gerda Claeskens, Université Catholique de Louvain Göran Kauermann, Universität Bielefeld Giovanna Ranalli, Università di Perugia April 9, 2004
2 Outline 2/56 1. Introduction (a) Two examples (b) Generic and specific inference 2. Nonparametric Regression Using Penalized Splines 3. Generic Inference for Survey Data (a) Nonparametric model-assisted estimator (b) Utah Forest Survey 4. Specific Inference for Survey Data (a) Nonparametric small area estimator (b) Northeastern Lakes Survey 5. Conclusion
3 1. Introduction: Utah Forest Survey 3/56 Forest inventory survey conducted by U.S. Forest Service
4 Utah Forest Survey (2) Phase Variables Geographic X,Y coordinates E-W, N-S Information elev elevation (m) System asp aspect (deg) (GIS) slope slope (deg) hillshd hillshade (solar radiation) N = 67, 216 nlcd vegetation cover type (class) Forest fortyp forest type (class) Inventory trees number of trees and agemax max tree age (years) Analysis ageavg avg tree age (years) (FIA) bamax max tree basal area (sq in) crcov tree crown cover (%) n I = 3, Forest lichen lichen species present (count) Health... Monitoring (FHM) n = 71 Goal: estimate mean number of lichen in region/domains 4/56
5 Introduction: Northeastern Lakes Survey 5/56 Ecological condition survey of Northeastern lakes conducted by U.S. EPA Data collected for 338 lakes
6 Northeastern Lakes Survey (2) Region includes digit Hydrologic Unit Codes (HUC) 6/56 Goal: estimate mean lake Acid Neutralizing Capacity (ANC) for all HUCs
7 Types of Statistical Inference 7/56 Specific Inference: expensive, high quality, targeted using custom-built method (or model) to achieve best possible estimator for particular variable(s) willing to defend model
8 Types of Statistical Inference 7/56 Specific Inference: expensive, high quality, targeted using custom-built method (or model) to achieve best possible estimator for particular variable(s) willing to defend model Generic Inference: cheap, reasonable quality, good for many purposes using method appropriate for large number of variables that need to be estimated jointly Corn Flakes NET WT. 12 OZ.
9 Statistical Inference in Surveys 8/56 Number of Inference Modelling observations Large { generic none generic model-assisted Moderate specific model-based Small specific small domain estimation Number of observations depends on domain (subpopulation) size For moderate sample size, use of generic inference depends on model goodness-of-fit Nonparametric methods can improve both generic and specific inference
10 2. Nonparametric Regression Using Penalized Splines 9/56 Many nonparametric regression methods are available Kernel and local polynomial methods Smoothing splines, regression splines,... Orthogonal decomposition (wavelet, Fourier series) Penalized splines or P-splines regression (Eilers and Marx, 1996) is simple, flexible and computationally attractive smoothing method
11 Definition of Penalized Splines Model 10/56 Regression model y i = m(x i ) + ε i Function m( ) is unknown but assumed well approximated by polynomial spline m(x; β) = β 0 + β 1 x β p x p + K β p+k (x κ k ) p + k=1 p : degree of spline (fixed) κ 1 <... < κ K : set of K knots (fixed) β = (β 0,..., β p+k ) : vector of parameters (unknown) (Ruppert, Wand and Carroll, 2003)
12 Polynomial Spline Basis Functions 11/56 (x κ) p + { (x κ) p if x κ > 0 0 if x κ p=1 0.8 p= basis functions x x Other spline basis functions are possible (B-splines, radial splines)
13 Choosing K 12/56 If K is sufficiently large, m( ; β) can approximate large class of functions y K= K= y K= x Rule of thumb: K = min(#x/4, 35) K= x
14 Expressing Spline Model as Parametric Model 13/56 m(x; β) = β 0 + β 1 x β p x p + x β K β p+k (x κ k ) p + k=1 with x = (1, x,..., x p, (x κ 1 ) p +,..., (x κ K ) p +) β = (β 0,..., β p+k ) T
15 Fitting by Penalized Splines Regression Minimize penalized sum of squares 14/56 min β n (y i m(x i ; β)) 2 + λ i=1 K k=1 β 2 p+k ˆβ λ = ( X T X + λd ) 1 X T Y λ = smoothing penalty (fixed) D = diag{0,..., 0, 1,..., 1} X = design matrix (including spline terms) ˆβ λ is ridge regression estimator, with ridge penalty on nonlinear (spline) terms of model
16 Fitting by Penalized Splines Regression (2) 15/56 λ protects against overfitting and determines smoothness of fit y lambda= lambda= y lambda= x lambda= x
17 Choosing the Penalty λ Cross-Validation: minimize CV sum of squares with respect to λ Mixed model approach: treat spline parameters β p+1,..., β p+k as a random effect with common variance σβ 2 and fit regression using Maximum Likelihood approach (MLE, REML) 16/56
18 P-spline: Hybrid Regression Method 17/56 P-spline is a nonparametric regression method: can fit very large classes of functions adaptive to local features in the data smoothness of function is determined by penalty parameter λ P-spline is a parametric regression method: model can be written as x β fitted by (global) least squares method number of parameters p + K puts upper bound on flexibility of model
19 Extending the model 18/56 Semi-parametric regression Model y i = m(x 1i ; β 1 ) + x 2i β 2 + ε i min β 1,β 2 n (y i m(x 1i ; β 1 ) x 2i β 2 ) 2 + λ i=1 K k=1 β 2 1,p+k
20 Extending the model 18/56 Semi-parametric regression Model y i = m(x 1i ; β 1 ) + x 2i β 2 + ε i min β 1,β 2 n (y i m(x 1i ; β 1 ) x 2i β 2 ) 2 + λ i=1 K k=1 β 2 1,p+k Additive model min β 1,β 2 Model y i = m 1 (x 1i ; β 1 ) + m 2 (x 2i ; β 2 ) + ε i n K (y i m 1 (x 1i ; β 1 ) m 2 (x 2i ; β 2 )) 2 +λ 1 i=1 Other... k=1 β 2 1,p+k+λ 2 K β2,p+k 2 k=1
21 3.1 Generic Inference for Survey Data Population U = {1,..., k,..., N} with unknown population parameters ȳ N = 1 y i and z N, x N,... N U Sample s selected from U according to known sampling design p(s) stratification clustering multiple phases 19/56
22 3.1 Generic Inference for Survey Data Population U = {1,..., k,..., N} with unknown population parameters ȳ N = 1 y i and z N, x N,... N U Sample s selected from U according to known sampling design p(s) stratification clustering multiple phases Generic estimator for the population mean ŷ s = [ w i y i ẑ s = ] w i z i, ˆx s,... s s 19/56
23 Simple Generic Estimation: Design-based Horvitz-Thompson estimator (1952) ŷ π = 1 N s 1 π i y i with inclusion probabilities π i = Pr(i s) Properties of ŷ π (for any design p) with E p (ŷ π ) = 1 N Var p (ŷ π ) = 1 N 2 y i = ȳ N U U π ij = Pr(i, j s) y i π i y j π j (π ij π i π j ) 20/56
24 Better Generic Estimation: Model-assisted Superpopulation model ξ: y i are iid with E ξ (y i ) = β 0 + β 1 x i = x T i β Var ξ (y i ) = σ 2 Least squares population fit for β 21/56 B U = (X T UX U ) 1 X U Y U
25 Better Generic Estimation: Model-assisted Superpopulation model ξ: y i are iid with E ξ (y i ) = β 0 + β 1 x i = x T i β Var ξ (y i ) = σ 2 Least squares population fit for β 21/56 B U = (X T UX U ) 1 X U Y U B U is estimated by sample-based estimator ˆB = ( X T s Π 1 s with Π s = diag{π i, i s} X s ) 1 X T s Π 1 s Y s Model-assisted ( regression ) estimator (Cassel et al., 1977) ŷ r = 1 x T i ˆB + 1 y i x T i ˆB N N U s π i
26 Properties of Regression Estimator Generic estimator ŷ r = w i (s)y i s 22/56 Consistent, asymptotically design unbiased E p (ŷ r ) ȳ N Approximate design variance Var p (ŷ r ) 1 N 2 U y i x T i B U π i y j x T j B U π j (π ij π i π j ) If model is true, smaller than HT variance [ ] HT : Var p (ŷ π ) = 1 y i y j (π N 2 ij π i π j ) π i π j U
27 Nonparametric Regression Estimation 23/56 Superpopulation model ξ: E ξ (y i ) = x T i β Var ξ (y i ) = σ 2
28 Nonparametric Regression Estimation 23/56 Superpopulation model ξ: E ξ (y i ) = x T i β Var ξ (y i ) = σ 2 Replace by: Superpopulation model ξ: E ξ (y i ) = m(x i ) Var ξ (y i ) = v(x i ) Breidt and Opsomer (2000) developed model-assisted nonparametric estimator based on local polynomial regression
29 Nonparametric Regression Estimation 23/56 Superpopulation model ξ: E ξ (y i ) = x T i β Var ξ (y i ) = σ 2 Replace by: Superpopulation model ξ: E ξ (y i ) = m(x i ) Var ξ (y i ) = v(x i ) Breidt and Opsomer (2000) developed model-assisted nonparametric estimator based on local polynomial regression Easier to do with P-splines!
30 Sample-weighted Penalized Splines Regression 24/56 Superpopulation model ξ: E ξ (y i ) = m(x i ; β) = x i β Var ξ (y i ) = v(x i ) Least squares P-spline population fit (for fixed λ) B U,λ = ( X T U X U + λd ) 1 X T U Y U B U,λ is estimated by design-based estimator ˆB λ = ( X T s Π 1 s X s + λd ) 1 X T s Π 1 s Y s
31 Penalized Splines Regression Estimation Model-assisted P-splines estimator ŷ p = 1 N U x T i ˆB λ + 1 N s y i x T i π i ˆB λ 25/56 Design properties: identical to regression estimator (!)
32 Improved Efficiency from Nonparametric Fit Approximate design variance Var p (ŷ p ) 1 N 2 [ Var p (ŷ r ) 1 N 2 [ U y i x T i B U,λ π i U y i x T i B π i HT : Var p (ŷ π ) = 1 N 2 U y j x T j B U,λ (π ij π i π j ) π j y j x T j B ] (π ij π i π j ) π j y i π i y j π j (π ij π i π j ) ] 26/56
33 P-Splines or Local Polynomials for Model- Assisted Estimation? 27/56 Advantages of P-Splines ly related to parametric modelling (ratio, linear models) Model is easy to extend to multivariate, additive, semiparametric cases Handles data sparseness easily, very fast to compute
34 P-Splines or Local Polynomials for Model- Assisted Estimation? 27/56 Advantages of P-Splines ly related to parametric modelling (ratio, linear models) Model is easy to extend to multivariate, additive, semiparametric cases Handles data sparseness easily, very fast to compute Disadvantages of P-Splines Flexibility of model limited by number of parameters p + K
35 Multi-phase Sampling 28/56 Population U (N elements) Phase 1 sample S (n I elements) Phase 2 sample R (n elements)
36 Multi-phase Sampling 28/56 Population U (N elements) Geographic Information System (GIS) N = 67, 216 Phase 1 sample S (n I elements) Phase 2 sample R (n elements) Forest Inventory and Analysis (FIA) n I = 3, 107 Forest Health Monitoring (FHM) n = 71
37 Model-assisted Estimation for Multi-phase samples Different information available at different phases population x ak, z ak, k U GIS variables phase 1 x bk, z bk, k S FIA measurements phase 2 y k, k R FHM measurements (lichen count) (x bk, z bk contains x ak, z ak ) 29/56 Goal: estimate ȳ N with generic but efficient estimator
38 Model-assisted Estimation for Multi-phase samples Different information available at different phases population x ak, z ak, k U GIS variables phase 1 x bk, z bk, k S FIA measurements phase 2 y k, k R FHM measurements (lichen count) (x bk, z bk contains x ak, z ak ) 29/56 Goal: estimate ȳ N with generic but efficient estimator Approach: 1. use P-splines to fit semiparametric additive model for each level of auxiliary info 2. construct model-assisted estimator
39 Model-assisted Estimation for Multi-phase samples (2) Models Model a: using predictors available for U 30/56 E ξ (y i ) = g a (x ai, z ai ) = m a (x ai ; β a ) + z ai γ a Model b: using predictors available for S E ξ (y i ) = g b (x bi, z bi ) = m b (x bi ; β b ) + z bi γ b
40 Model-assisted Estimation for Multi-phase samples (2) Models Model a: using predictors available for U 30/56 E ξ (y i ) = g a (x ai, z ai ) = m a (x ai ; β a ) + z ai γ a Model b: using predictors available for S E ξ (y i ) = g b (x bi, z bi ) = m b (x bi ; β b ) + z bi γ b P-splines regression estimator ŷ p = 1 ĝ ai + 1 N N U S ĝ bi ĝ ai π i(s) + 1 N R y i ĝ bi π i(s) π k(r S) Design properties: follow nonparametric model-assisted framework
41 3.2 Utah Forest Survey Phase Variables Geographic X,Y coordinates E-W, N-S Information elev elevation (m) System asp aspect (deg) (GIS) slope slope (deg) hillshd hillshade (solar radiation) N = 67, 216 nlcd vegetation cover type (class) Forest fortyp forest type (class) Inventory trees number of trees and agemax max tree age (years) Analysis ageavg avg tree age (years) (FIA) bamax max tree basal area (sq in) crcov tree crown cover (%) n I = 3, Forest lichen lichen species present (count) Health... Monitoring (FHM) n = 71 31/56
42 Semiparametric Model a 32/56 E ξ (LICHEN) = m(hillshd; β) + z NLCD γ ps(hillshd) hillshd partial for nlcd Fitted using Splus functions gam(), ps() with p = 2, K = 15, λ = 1 nlcd2
43 Semiparametric Model b E ξ (LICHEN) = m 1 (CRCOV; β 1 ) + m 2 (AGEMAX; β 2 ) +m 3 (BAMAX; β 3 ) + z NLCD γ 33/56 ps(crcov) ps(agemax) crcov agemax ps(bamax) bamax partial for nlcd nlcd2
44 Forest Health Monitoring Estimates 34/56 HT = Generic estimator, ignores all auxiliary information Linear = Generic estimator, all models are fitted by linear regression Semiparametric = Generic estimator, semiparametric (P-spline) models HT Linear Semiparametric Estimate Est. St. Dev (69%) (44%) Nonparametric regression can improve the efficiency of generic estimators in surveys
45 35/56 Estimators for Domains Domain : subpopulation for which separate estimator is needed Models can improve precision of domain estimators, if they have good local properties Nonparametric model better able to adapt to local features of data crcov s(crcov)
46 Model-Assisted Estimators for Domains 36/56 (Särndal, 1984) Obtain sample-weighted model fit for complete sample s Estimator for domain U d U with realized sample s d s is ŷ p = 1 ĝ i + 1 N d N d U d s d y i ĝ i π i Variance follows from model-assisted estimation theory Approach maintains additivity of domain estimates
47 Example: Estimation for Domain with NLCD > 50 37/56 Phase Longitude Index Latitude Index Only n = 27 observations in domain
48 Example (2) All Data (n = 71) HT Linear Semiparametric Estimate Est. St. Dev /56 Domain (n = 27) HT Linear Semiparametric Estimate Est. St. Dev (1.58) (1.56) (1.06) Nonparametric regression makes it possible to maintain generic approach at smaller scales
49 4.1 Specific Inference for Survey Data 39/56 Data contain 557 observations over 113 HUCs Goal: produce estimates of ANC for all HUCs
50 HUC Sample Means 40/56 Problems: unreliable estimates, missing HUCs
51 HUC as Small Areas 41/56 Few sample observations available in most HUCs Average sample size/huc: HUCs contain less than 5 observations 27 out of 113 HUCs contain no sample observations Modelling required to construct reliable HUC-level estimates called small area estimation in surveys statistics mixed model/prediction problem = type of specific inference
52 Nonparametric Small Area Estimation 42/56 P-splines are ideally suited for small area estimation when the mean function is difficult to specify a priori close relationship between classical small area estimation models and P-splines availability of existing software ability to evaluate need for nonlinearity in model and significance of small area effects
53 P-splines as Random Effects 43/56 y i = m(x i ; β) + ε i = x i β + ε i
54 P-splines as Random Effects 43/56 y i = m(x i ; β) + ε i = x i β + ε i = x F i β F + z i γ + ε i x F i β F β 0 + x i β x p i β p (parametric, fixed component) z i γ = z 1i γ z Ki γ K (x i κ 1 ) p +β p (x i κ K ) p +β p+k
55 P-splines as Random Effects 43/56 y i = m(x i ; β) + ε i = x i β + ε i = x F i β F + z i γ + ε i x F i β F β 0 + x i β x p i β p (parametric, fixed component) z i γ = z 1i γ z Ki γ K (x i κ 1 ) p +β p (x i κ K ) p +β p+k (deviations from parametric, treated as random effect) γ = (γ 1,..., γ K ) iid F γ (0, σ 2 γ) ε i F ε (0, σ 2 ε)
56 P-splines Estimator as BLUP Assuming σγ, 2 σε 2 known, Best Linear Unbiased Estimator/Predictor (BLUE/BLUP) is solution to n ( min yi x F β F i β F + z i γ ) 2 σ 2 K + ε γ 2,γ σγ 2 k i=1 (Henderson et al., 1959) [ ] ˆβF = ˆγ with [ ˆβF ˆγ k=1 ( 1 X T X + σ2 ε D) X T Y σγ 2 D = diag{0,..., 0, 1,..., 1} X = [ x F z ] ] = P-splines (ridge) regression estimator ˆβ λ with λ = σε/σ 2 γ 2 44/56
57 P-splines Estimator as EBLUP If σ 2 γ, σ 2 ε are unknown, estimates can be obtained by Maximum Likelihood (ML) or Restricted Maximum Likelihood (REML) (e.g. Searle, Casella and McCulloch, 1992), and ˆβˆλ = [ ˆβF ˆγ ] = ( 1 X T X + ˆσ2 ε D) X T Y ˆσ γ 2 45/56 ˆβˆλ is Empirical BLUP (EBLUP) for β
58 P-splines Estimator as EBLUP If σ 2 γ, σ 2 ε are unknown, estimates can be obtained by Maximum Likelihood (ML) or Restricted Maximum Likelihood (REML) (e.g. Searle, Casella and McCulloch, 1992), and ˆβˆλ = [ ˆβF ˆγ ] = ( 1 X T X + ˆσ2 ε D) X T Y ˆσ γ 2 45/56 ˆβˆλ is Empirical BLUP (EBLUP) for β Smoothing penalty λ = ˆσ 2 ε/ˆσ 2 γ is determined by data Automatically adjusts λ to patterns in data small deviations from parametric shape ˆσ 2 γ small more smoothing data exhibit significant deviations from parametric shape ˆσ 2 γ large less smoothing
59 Spatial Smoothing using P-splines NE Lakes data require bivariate (spatial) smoothing Low-rank radial basis ( thin-plate spline) [ ] z = C(x i κ k ) 1 i n 1 k K [ C(κ k κ k ) with C(r) = r 2 log r (Ruppert et al. 2003) Mixed model ] 1/2 1 k,k K 46/56 y i = x F i βf + z i γ + ε i γ = (γ 1,..., γ K ) iid F γ (0, σ 2 γ) ε i F ε (0, σ 2 ε) Knot selection: regular spatial grid, space-filling algorithm
60 Small Area Estimation as Mixed Effect Regression Classical small area estimation (Battese, Harter and Fuller, 1988): T small areas, with u t the random effect for small area t = 1,..., T Model 47/56 y i = x i β + u t + ε i = x i β + d i u + ε i d i = (d i1,..., d it ) d it = { 1 if i small area t 0 otherwise u = (u 1,..., u T ) iid F u (0, σ 2 u) ε i F ε (0, σ 2 ε) Target Ȳt = X t β + u t estimated by (E)BLUP ȳ t = X t ˆβ + ût
61 Nonparametric Small Area Model Combine both random effects models 48/56 y i = m(x i ; β) + d i u + ε i = x F i β F + z i γ + d i u + ε i Variance components γ iid F γ (0, σ 2 γ) u iid F u (0, σ 2 u) ε i F ε (0, σ 2 ε)
62 Nonparametric Small Area Model Combine both random effects models 48/56 y i = m(x i ; β) + d i u + ε i = x F i β F + z i γ + d i u + ε i Variance components γ iid F γ (0, σ 2 γ) u iid F u (0, σ 2 u) ε i F ε (0, σ 2 ε) EBLUP can be computed by REML, and ȳ t = X F t ˆβ F + Z t ˆγ + û t
63 4.2 Small Area Estimation for NE Lakes 557 measurements on 338 lakes Dependent variable Independent variables Fixed effects Random effects ANC Acid Neutralizing Capacity INT intercept ELEV elevation TPS spatial thin-plate spline for K = 81 HUC 113 small areas 49/56
64 Knot Locations for Spatial Spline 50/56 utmy utmx
65 Full Model: Model Fit 51/56 Estimates Fixed effects Random effects ˆβ F INT 586 ELEV ˆσ TPS 79 HUC 420 Error 173 Correlation between model predictions and ANC: 0.96
66 Full Model: HUC Predictions 52/ Correlation between HUC means and model predictions: 0.97
67 Are both random effects needed? AIC model selection criterion TPS HUC yes no yes no /56 Correlation between ANC and model prediction TPS HUC yes no yes no
68 Spline in Small Area Model? 54/ observed HUCs non observed HUCs HUC predictions HUC model HUC predictions HUC + TPS model Spline model provides better predictions for empty HUCs
69 Small Area Random Effect in Spatial Model? 55/ observed HUCs non observed HUCs HUC predictions TPS model HUC predictions HUC + TPS model HUC effect results in predictions that are closer to observed data
70 5. Conclusions 56/56 Generic and specific inference can be improved with nonparametric regression methods P-splines easily extend existing survey estimation approaches Utah: more efficient generic estimators for region and for domains NE Lakes: improved prediction for HUCs without observations To do: smoothing parameter selection mixed model inference, test for significance of random effects Contact information: http// jopsomer/home.html
Nonparametric Small Area Estimation Using Penalized Spline Regression
Nonparametric Small Area Estimation Using Penalized Spline Regression 0verview Spline-based nonparametric regression Nonparametric small area estimation Prediction mean squared error Bootstrapping small
More informationModel-assisted Estimation of Forest Resources with Generalized Additive Models
Model-assisted Estimation of Forest Resources with Generalized Additive Models Jean Opsomer, Jay Breidt, Gretchen Moisen, Göran Kauermann August 9, 2006 1 Outline 1. Forest surveys 2. Sampling from spatial
More informationNonparametric Small Area Estimation Using Penalized Spline Regression
Nonparametric Small Area Estimation Using Penalized Spline Regression J. D. Opsomer Iowa State University G. Claeskens Katholieke Universiteit Leuven M. G. Ranalli Universita degli Studi di Perugia G.
More informationNonparametric Small Area Estimation via M-quantile Regression using Penalized Splines
Nonparametric Small Estimation via M-quantile Regression using Penalized Splines Monica Pratesi 10 August 2008 Abstract The demand of reliable statistics for small areas, when only reduced sizes of the
More informationPenalized Balanced Sampling. Jay Breidt
Penalized Balanced Sampling Jay Breidt Colorado State University Joint work with Guillaume Chauvet (ENSAI) February 4, 2010 1 / 44 Linear Mixed Models Let U = {1, 2,...,N}. Consider linear mixed models
More informationF. Jay Breidt Colorado State University
Model-assisted survey regression estimation with the lasso 1 F. Jay Breidt Colorado State University Opening Workshop on Computational Methods in Social Sciences SAMSI August 2013 This research was supported
More informationSmall Area Estimation Using a Nonparametric Model Based Direct Estimator
University of Wollongong Research Online Centre for Statistical & Survey Methodology Working Paper Series Faculty of Engineering and Information Sciences 2009 Small Area Estimation Using a Nonparametric
More informationPenalized Splines, Mixed Models, and Recent Large-Sample Results
Penalized Splines, Mixed Models, and Recent Large-Sample Results David Ruppert Operations Research & Information Engineering, Cornell University Feb 4, 2011 Collaborators Matt Wand, University of Wollongong
More informationSmall Area Estimation for Skewed Georeferenced Data
Small Area Estimation for Skewed Georeferenced Data E. Dreassi - A. Petrucci - E. Rocco Department of Statistics, Informatics, Applications "G. Parenti" University of Florence THE FIRST ASIAN ISI SATELLITE
More informationIntegrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University
Integrated Likelihood Estimation in Semiparametric Regression Models Thomas A. Severini Department of Statistics Northwestern University Joint work with Heping He, University of York Introduction Let Y
More informationLikelihood Ratio Tests. that Certain Variance Components Are Zero. Ciprian M. Crainiceanu. Department of Statistical Science
1 Likelihood Ratio Tests that Certain Variance Components Are Zero Ciprian M. Crainiceanu Department of Statistical Science www.people.cornell.edu/pages/cmc59 Work done jointly with David Ruppert, School
More informationThe Hodrick-Prescott Filter
The Hodrick-Prescott Filter A Special Case of Penalized Spline Smoothing Alex Trindade Dept. of Mathematics & Statistics, Texas Tech University Joint work with Rob Paige, Missouri University of Science
More informationCross-validation in model-assisted estimation
Graduate Theses and Dissertations Iowa State University Capstones, Theses and Dissertations 009 Cross-validation in model-assisted estimation Lifeng You Iowa State University Follow this and additional
More informationRegression, Ridge Regression, Lasso
Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.
More informationModel-assisted Estimation of Forest Resources with Generalized Additive Models
Model-assisted Estimation of Forest Resources with Generalized Additive Models Jean D. Opsomer, F. Jay Breidt, Gretchen G. Moisen, and Göran Kauermann March 26, 2003 Abstract Multi-phase surveys are often
More informationOn the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models
On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Institute of Statistics and Econometrics Georg-August-University Göttingen Department of Statistics
More informationAdditional results for model-based nonparametric variance estimation for systematic sampling in a forestry survey
Additional results for model-based nonparametric variance estimation for systematic sampling in a forestry survey J.D. Opsomer Colorado State University M. Francisco-Fernández Universidad de A Coruña July
More informationOn the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models
On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Department of Mathematics Carl von Ossietzky University Oldenburg Sonja Greven Department of
More informationNONPARAMETRIC ENDOGENOUS POST-STRATIFICATION ESTIMATION
Statistica Sinica 2011): Preprint 1 NONPARAMETRIC ENDOGENOUS POST-STRATIFICATION ESTIMATION Mark Dahlke 1, F. Jay Breidt 1, Jean D. Opsomer 1 and Ingrid Van Keilegom 2 1 Colorado State University and 2
More informationNonparametric Regression Estimation of Finite Population Totals under Two-Stage Sampling
Nonparametric Regression Estimation of Finite Population Totals under Two-Stage Sampling Ji-Yeon Kim Iowa State University F. Jay Breidt Colorado State University Jean D. Opsomer Colorado State University
More information1. Introduction. Keywords: Auxiliary information; Nonparametric regression; Pseudo empirical likelihood; Model-assisted approach; MARS.
Nonparametric Methods for Sample Surveys of Environmental Populations Metodi nonparametrici nell inferenza per popolazioni finite di carattere ambientale Giorgio E. Montanari Dipartimento di Economia,
More informationOn the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models
On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Department of Mathematics Carl von Ossietzky University Oldenburg Sonja Greven Department of
More informationSome properties of Likelihood Ratio Tests in Linear Mixed Models
Some properties of Likelihood Ratio Tests in Linear Mixed Models Ciprian M. Crainiceanu David Ruppert Timothy J. Vogelsang September 19, 2003 Abstract We calculate the finite sample probability mass-at-zero
More informationChapter 5: Models used in conjunction with sampling. J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70
Chapter 5: Models used in conjunction with sampling J. Kim, W. Fuller (ISU) Chapter 5: Models used in conjunction with sampling 1 / 70 Nonresponse Unit Nonresponse: weight adjustment Item Nonresponse:
More informationLinear Regression. Aarti Singh. Machine Learning / Sept 27, 2010
Linear Regression Aarti Singh Machine Learning 10-701/15-781 Sept 27, 2010 Discrete to Continuous Labels Classification Sports Science News Anemic cell Healthy cell Regression X = Document Y = Topic X
More informationAnalysing geoadditive regression data: a mixed model approach
Analysing geoadditive regression data: a mixed model approach Institut für Statistik, Ludwig-Maximilians-Universität München Joint work with Ludwig Fahrmeir & Stefan Lang 25.11.2005 Spatio-temporal regression
More informationAndreas Blöchl: Trend Estimation with Penalized Splines as Mixed Models for Series with Structural Breaks
Andreas Blöchl: Trend Estimation with Penalized Splines as Mixed Models for Series with Structural Breaks Munich Discussion Paper No. 2014-2 Department of Economics University of Munich Volkswirtschaftliche
More informationLecture 3: Statistical Decision Theory (Part II)
Lecture 3: Statistical Decision Theory (Part II) Hao Helen Zhang Hao Helen Zhang Lecture 3: Statistical Decision Theory (Part II) 1 / 27 Outline of This Note Part I: Statistics Decision Theory (Classical
More informationSmall Area Modeling of County Estimates for Corn and Soybean Yields in the US
Small Area Modeling of County Estimates for Corn and Soybean Yields in the US Matt Williams National Agricultural Statistics Service United States Department of Agriculture Matt.Williams@nass.usda.gov
More informationASYMPTOTICS FOR PENALIZED SPLINES IN ADDITIVE MODELS
Mem. Gra. Sci. Eng. Shimane Univ. Series B: Mathematics 47 (2014), pp. 63 71 ASYMPTOTICS FOR PENALIZED SPLINES IN ADDITIVE MODELS TAKUMA YOSHIDA Communicated by Kanta Naito (Received: December 19, 2013)
More informationData Mining Stat 588
Data Mining Stat 588 Lecture 02: Linear Methods for Regression Department of Statistics & Biostatistics Rutgers University September 13 2011 Regression Problem Quantitative generic output variable Y. Generic
More informationRecovering Indirect Information in Demographic Applications
Recovering Indirect Information in Demographic Applications Jutta Gampe Abstract In many demographic applications the information of interest can only be estimated indirectly. Modelling events and rates
More informationCalibration estimation using exponential tilting in sample surveys
Calibration estimation using exponential tilting in sample surveys Jae Kwang Kim February 23, 2010 Abstract We consider the problem of parameter estimation with auxiliary information, where the auxiliary
More informationRegularization in Cox Frailty Models
Regularization in Cox Frailty Models Andreas Groll 1, Trevor Hastie 2, Gerhard Tutz 3 1 Ludwig-Maximilians-Universität Munich, Department of Mathematics, Theresienstraße 39, 80333 Munich, Germany 2 University
More informationExact Likelihood Ratio Tests for Penalized Splines
Exact Likelihood Ratio Tests for Penalized Splines By CIPRIAN CRAINICEANU, DAVID RUPPERT, GERDA CLAESKENS, M.P. WAND Department of Biostatistics, Johns Hopkins University, 615 N. Wolfe Street, Baltimore,
More information1 Mixed effect models and longitudinal data analysis
1 Mixed effect models and longitudinal data analysis Mixed effects models provide a flexible approach to any situation where data have a grouping structure which introduces some kind of correlation between
More informationNonparametric Regression. Badr Missaoui
Badr Missaoui Outline Kernel and local polynomial regression. Penalized regression. We are given n pairs of observations (X 1, Y 1 ),...,(X n, Y n ) where Y i = r(x i ) + ε i, i = 1,..., n and r(x) = E(Y
More informationEstimating Timber Volume using Airborne Laser Scanning Data based on Bayesian Methods J. Breidenbach 1 and E. Kublin 2
Estimating Timber Volume using Airborne Laser Scanning Data based on Bayesian Methods J. Breidenbach 1 and E. Kublin 2 1 Norwegian University of Life Sciences, Department of Ecology and Natural Resource
More informationForecasting Data Streams: Next Generation Flow Field Forecasting
Forecasting Data Streams: Next Generation Flow Field Forecasting Kyle Caudle South Dakota School of Mines & Technology (SDSMT) kyle.caudle@sdsmt.edu Joint work with Michael Frey (Bucknell University) and
More informationApproaches to Modeling Menstrual Cycle Function
Approaches to Modeling Menstrual Cycle Function Paul S. Albert (albertp@mail.nih.gov) Biostatistics & Bioinformatics Branch Division of Epidemiology, Statistics, and Prevention Research NICHD SPER Student
More informationLikelihood ratio testing for zero variance components in linear mixed models
Likelihood ratio testing for zero variance components in linear mixed models Sonja Greven 1,3, Ciprian Crainiceanu 2, Annette Peters 3 and Helmut Küchenhoff 1 1 Department of Statistics, LMU Munich University,
More informationHigh-dimensional regression
High-dimensional regression Advanced Methods for Data Analysis 36-402/36-608) Spring 2014 1 Back to linear regression 1.1 Shortcomings Suppose that we are given outcome measurements y 1,... y n R, and
More informationGaussian processes for spatial modelling in environmental health: parameterizing for flexibility vs. computational efficiency
Gaussian processes for spatial modelling in environmental health: parameterizing for flexibility vs. computational efficiency Chris Paciorek March 11, 2005 Department of Biostatistics Harvard School of
More informationSimultaneous Confidence Bands for the Coefficient Function in Functional Regression
University of Haifa From the SelectedWorks of Philip T. Reiss August 7, 2008 Simultaneous Confidence Bands for the Coefficient Function in Functional Regression Philip T. Reiss, New York University Available
More informationNonstationary Regressors using Penalized Splines
Estimation and Inference for Varying-coefficient Models with Nonstationary Regressors using Penalized Splines Haiqiang Chen, Ying Fang and Yingxing Li Wang Yanan Institute for Studies in Economics, MOE
More informationSingle-index model-assisted estimation in survey sampling
Journal of onparametric Statistics Vol. 21, o. 4, May 2009, 487 504 Single-index model-assisted estimation in survey sampling Li Wang* Department of Statistics, University of Georgia, Athens, GA, 30602,
More informationWeb Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D.
Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D. Ruppert A. EMPIRICAL ESTIMATE OF THE KERNEL MIXTURE Here we
More informationCombining data from two independent surveys: model-assisted approach
Combining data from two independent surveys: model-assisted approach Jae Kwang Kim 1 Iowa State University January 20, 2012 1 Joint work with J.N.K. Rao, Carleton University Reference Kim, J.K. and Rao,
More informationAn Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data
An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data Jae-Kwang Kim 1 Iowa State University June 28, 2012 1 Joint work with Dr. Ming Zhou (when he was a PhD student at ISU)
More informationLinear regression methods
Linear regression methods Most of our intuition about statistical methods stem from linear regression. For observations i = 1,..., n, the model is Y i = p X ij β j + ε i, j=1 where Y i is the response
More informationModel-Assisted Estimation of Forest Resources With Generalized Additive Models
Model-Assisted Estimation of Forest Resources With Generalized Additive Models Jean D. OPSOMER, F.JayBREIDT, Gretchen G. MOISEN, and Göran KAUERMANN Multiphase surveys are often conducted in forest inventories,
More informationarxiv: v2 [math.st] 20 Jun 2014
A solution in small area estimation problems Andrius Čiginas and Tomas Rudys Vilnius University Institute of Mathematics and Informatics, LT-08663 Vilnius, Lithuania arxiv:1306.2814v2 [math.st] 20 Jun
More informationLecture 18: Kernels Risk and Loss Support Vector Regression. Aykut Erdem December 2016 Hacettepe University
Lecture 18: Kernels Risk and Loss Support Vector Regression Aykut Erdem December 2016 Hacettepe University Administrative We will have a make-up lecture on next Saturday December 24, 2016 Presentations
More informationP -spline ANOVA-type interaction models for spatio-temporal smoothing
P -spline ANOVA-type interaction models for spatio-temporal smoothing Dae-Jin Lee 1 and María Durbán 1 1 Department of Statistics, Universidad Carlos III de Madrid, SPAIN. e-mail: dae-jin.lee@uc3m.es and
More informationNonparametric Regression
Nonparametric Regression Econ 674 Purdue University April 8, 2009 Justin L. Tobias (Purdue) Nonparametric Regression April 8, 2009 1 / 31 Consider the univariate nonparametric regression model: where y
More informationINSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING
Statistica Sinica 24 (2014), 1001-1015 doi:http://dx.doi.org/10.5705/ss.2013.038 INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING Seunghwan Park and Jae Kwang Kim Seoul National Univeristy
More informationBeyond Mean Regression
Beyond Mean Regression Thomas Kneib Lehrstuhl für Statistik Georg-August-Universität Göttingen 8.3.2013 Innsbruck Introduction Introduction One of the top ten reasons to become statistician (according
More informationHigh-dimensional regression with unknown variance
High-dimensional regression with unknown variance Christophe Giraud Ecole Polytechnique march 2012 Setting Gaussian regression with unknown variance: Y i = f i + ε i with ε i i.i.d. N (0, σ 2 ) f = (f
More informationOne-stage dose-response meta-analysis
One-stage dose-response meta-analysis Nicola Orsini, Alessio Crippa Biostatistics Team Department of Public Health Sciences Karolinska Institutet http://ki.se/en/phs/biostatistics-team 2017 Nordic and
More informationMissing Data Issues in the Studies of Neurodegenerative Disorders: the Methodology
Missing Data Issues in the Studies of Neurodegenerative Disorders: the Methodology Sheng Luo, PhD Associate Professor Department of Biostatistics & Bioinformatics Duke University Medical Center sheng.luo@duke.edu
More informationSpatial Lasso with Application to GIS Model Selection. F. Jay Breidt Colorado State University
Spatial Lasso with Application to GIS Model Selection F. Jay Breidt Colorado State University with Hsin-Cheng Huang, Nan-Jung Hsu, and Dave Theobald September 25 The work reported here was developed under
More informationVariable Selection and Model Choice in Survival Models with Time-Varying Effects
Variable Selection and Model Choice in Survival Models with Time-Varying Effects Boosting Survival Models Benjamin Hofner 1 Department of Medical Informatics, Biometry and Epidemiology (IMBE) Friedrich-Alexander-Universität
More informationComments on: Model-free model-fitting and predictive distributions : Applications to Small Area Statistics and Treatment Effect Estimation
Article Comments on: Model-free model-fitting and predictive distributions : Applications to Small Area Statistics and Treatment Effect Estimation SPERLICH, Stefan Andréas Reference SPERLICH, Stefan Andréas.
More informationRestricted Likelihood Ratio Tests in Nonparametric Longitudinal Models
Restricted Likelihood Ratio Tests in Nonparametric Longitudinal Models Short title: Restricted LR Tests in Longitudinal Models Ciprian M. Crainiceanu David Ruppert May 5, 2004 Abstract We assume that repeated
More informationAdditive Isotonic Regression
Additive Isotonic Regression Enno Mammen and Kyusang Yu 11. July 2006 INTRODUCTION: We have i.i.d. random vectors (Y 1, X 1 ),..., (Y n, X n ) with X i = (X1 i,..., X d i ) and we consider the additive
More informationLinear Regression Model. Badr Missaoui
Linear Regression Model Badr Missaoui Introduction What is this course about? It is a course on applied statistics. It comprises 2 hours lectures each week and 1 hour lab sessions/tutorials. We will focus
More informationAUTOMATIC CONTROL COMMUNICATION SYSTEMS LINKÖPINGS UNIVERSITET. Questions AUTOMATIC CONTROL COMMUNICATION SYSTEMS LINKÖPINGS UNIVERSITET
The Problem Identification of Linear and onlinear Dynamical Systems Theme : Curve Fitting Division of Automatic Control Linköping University Sweden Data from Gripen Questions How do the control surface
More informationREGRESSION WITH SPATIALLY MISALIGNED DATA. Lisa Madsen Oregon State University David Ruppert Cornell University
REGRESSION ITH SPATIALL MISALIGNED DATA Lisa Madsen Oregon State University David Ruppert Cornell University SPATIALL MISALIGNED DATA 10 X X X X X X X X 5 X X X X X 0 X 0 5 10 OUTLINE 1. Introduction 2.
More informationWLS and BLUE (prelude to BLUP) Prediction
WLS and BLUE (prelude to BLUP) Prediction Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark April 21, 2018 Suppose that Y has mean X β and known covariance matrix V (but Y need
More informationSTAT 540: Data Analysis and Regression
STAT 540: Data Analysis and Regression Wen Zhou http://www.stat.colostate.edu/~riczw/ Email: riczw@stat.colostate.edu Department of Statistics Colorado State University Fall 205 W. Zhou (Colorado State
More informationKneib, Fahrmeir: Supplement to "Structured additive regression for categorical space-time data: A mixed model approach"
Kneib, Fahrmeir: Supplement to "Structured additive regression for categorical space-time data: A mixed model approach" Sonderforschungsbereich 386, Paper 43 (25) Online unter: http://epub.ub.uni-muenchen.de/
More informationSelection of Smoothing Parameter for One-Step Sparse Estimates with L q Penalty
Journal of Data Science 9(2011), 549-564 Selection of Smoothing Parameter for One-Step Sparse Estimates with L q Penalty Masaru Kanba and Kanta Naito Shimane University Abstract: This paper discusses the
More informationWeighted Least Squares
Weighted Least Squares The standard linear model assumes that Var(ε i ) = σ 2 for i = 1,..., n. As we have seen, however, there are instances where Var(Y X = x i ) = Var(ε i ) = σ2 w i. Here w 1,..., w
More informationDiscussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon
Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Jianqing Fan Department of Statistics Chinese University of Hong Kong AND Department of Statistics
More informationSpatial Process Estimates as Smoothers: A Review
Spatial Process Estimates as Smoothers: A Review Soutir Bandyopadhyay 1 Basic Model The observational model considered here has the form Y i = f(x i ) + ɛ i, for 1 i n. (1.1) where Y i is the observed
More informationModel Selection and Geometry
Model Selection and Geometry Pascal Massart Université Paris-Sud, Orsay Leipzig, February Purpose of the talk! Concentration of measure plays a fundamental role in the theory of model selection! Model
More informationEfficient Estimation for the Partially Linear Models with Random Effects
A^VÇÚO 1 33 ò 1 5 Ï 2017 c 10 Chinese Journal of Applied Probability and Statistics Oct., 2017, Vol. 33, No. 5, pp. 529-537 doi: 10.3969/j.issn.1001-4268.2017.05.009 Efficient Estimation for the Partially
More informationBayesian linear regression
Bayesian linear regression Linear regression is the basis of most statistical modeling. The model is Y i = X T i β + ε i, where Y i is the continuous response X i = (X i1,..., X ip ) T is the corresponding
More informationA comparison of stratified simple random sampling and sampling with probability proportional to size
A comparison of stratified simple random sampling and sampling with probability proportional to size Edgar Bueno Dan Hedlin Per Gösta Andersson 1 Introduction When planning the sampling strategy (i.e.
More informationModeling Real Estate Data using Quantile Regression
Modeling Real Estate Data using Semiparametric Quantile Regression Department of Statistics University of Innsbruck September 9th, 2011 Overview 1 Application: 2 3 4 Hedonic regression data for house prices
More informationStatistics 203: Introduction to Regression and Analysis of Variance Penalized models
Statistics 203: Introduction to Regression and Analysis of Variance Penalized models Jonathan Taylor - p. 1/15 Today s class Bias-Variance tradeoff. Penalized regression. Cross-validation. - p. 2/15 Bias-variance
More informationRegularized PCA to denoise and visualise data
Regularized PCA to denoise and visualise data Marie Verbanck Julie Josse François Husson Laboratoire de statistique, Agrocampus Ouest, Rennes, France CNAM, Paris, 16 janvier 2013 1 / 30 Outline 1 PCA 2
More informationCross-Validation with Confidence
Cross-Validation with Confidence Jing Lei Department of Statistics, Carnegie Mellon University UMN Statistics Seminar, Mar 30, 2017 Overview Parameter est. Model selection Point est. MLE, M-est.,... Cross-validation
More informationFractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling
Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Jae-Kwang Kim 1 Iowa State University June 26, 2013 1 Joint work with Shu Yang Introduction 1 Introduction
More informationA Hierarchical Perspective on Lee-Carter Models
A Hierarchical Perspective on Lee-Carter Models Paul Eilers Leiden University Medical Centre L-C Workshop, Edinburgh 24 The vantage point Previous presentation: Iain Currie led you upward From Glen Gumbel
More informationSelection of Variables and Functional Forms in Multivariable Analysis: Current Issues and Future Directions
in Multivariable Analysis: Current Issues and Future Directions Frank E Harrell Jr Department of Biostatistics Vanderbilt University School of Medicine STRATOS Banff Alberta 2016-07-04 Fractional polynomials,
More information9. Model Selection. statistical models. overview of model selection. information criteria. goodness-of-fit measures
FE661 - Statistical Methods for Financial Engineering 9. Model Selection Jitkomut Songsiri statistical models overview of model selection information criteria goodness-of-fit measures 9-1 Statistical models
More informationAIR FORCE INSTITUTE OF TECHNOLOGY
TOPICS IN STATISTICAL CALIBRATION DISSERTATION Brandon M. Greenwell, Civilian AFIT-ENC-DS-14-M-01 DEPARTMENT OF THE AIR FORCE AIR UNIVERSITY AIR FORCE INSTITUTE OF TECHNOLOGY Wright-Patterson Air Force
More informationLesson 6 Maximum Likelihood: Part III. (plus a quick bestiary of models)
Lesson 6 Maximum Likelihood: Part III (plus a quick bestiary of models) Homework You want to know the density of fish in a set of experimental ponds You observe the following counts in ten ponds: 5,6,7,3,6,5,8,4,4,3
More informationFinal Overview. Introduction to ML. Marek Petrik 4/25/2017
Final Overview Introduction to ML Marek Petrik 4/25/2017 This Course: Introduction to Machine Learning Build a foundation for practice and research in ML Basic machine learning concepts: max likelihood,
More informationImputation for Missing Data under PPSWR Sampling
July 5, 2010 Beijing Imputation for Missing Data under PPSWR Sampling Guohua Zou Academy of Mathematics and Systems Science Chinese Academy of Sciences 1 23 () Outline () Imputation method under PPSWR
More informationEstimation of cumulative distribution function with spline functions
INTERNATIONAL JOURNAL OF ECONOMICS AND STATISTICS Volume 5, 017 Estimation of cumulative distribution function with functions Akhlitdin Nizamitdinov, Aladdin Shamilov Abstract The estimation of the cumulative
More information41903: Introduction to Nonparametrics
41903: Notes 5 Introduction Nonparametrics fundamentally about fitting flexible models: want model that is flexible enough to accommodate important patterns but not so flexible it overspecializes to specific
More informationFlexible Spatio-temporal smoothing with array methods
Int. Statistical Inst.: Proc. 58th World Statistical Congress, 2011, Dublin (Session IPS046) p.849 Flexible Spatio-temporal smoothing with array methods Dae-Jin Lee CSIRO, Mathematics, Informatics and
More informationERRATA. for Semiparametric Regression. Last Updated: 30th September, 2014
1 ERRATA for Semiparametric Regression by D. Ruppert, M. P. Wand and R. J. Carroll Last Updated: 30th September, 2014 p.6. In the vertical axis Figure 1.7 the lower 1, 2 and 3 should have minus signs.
More informationChapter 1. Linear Regression with One Predictor Variable
Chapter 1. Linear Regression with One Predictor Variable 1.1 Statistical Relation Between Two Variables To motivate statistical relationships, let us consider a mathematical relation between two mathematical
More informationSpatial bias modeling with application to assessing remotely-sensed aerosol as a proxy for particulate matter
Spatial bias modeling with application to assessing remotely-sensed aerosol as a proxy for particulate matter Chris Paciorek Department of Biostatistics Harvard School of Public Health application joint
More informationDensity estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas
0 0 5 Motivation: Regression discontinuity (Angrist&Pischke) Outcome.5 1 1.5 A. Linear E[Y 0i X i] 0.2.4.6.8 1 X Outcome.5 1 1.5 B. Nonlinear E[Y 0i X i] i 0.2.4.6.8 1 X utcome.5 1 1.5 C. Nonlinearity
More informationStat542 (F11) Statistical Learning. First consider the scenario where the two classes of points are separable.
Linear SVM (separable case) First consider the scenario where the two classes of points are separable. It s desirable to have the width (called margin) between the two dashed lines to be large, i.e., have
More informationChapter 8: Estimation 1
Chapter 8: Estimation 1 Jae-Kwang Kim Iowa State University Fall, 2014 Kim (ISU) Ch. 8: Estimation 1 Fall, 2014 1 / 33 Introduction 1 Introduction 2 Ratio estimation 3 Regression estimator Kim (ISU) Ch.
More information