F. Jay Breidt Colorado State University


1 Model-assisted survey regression estimation with the lasso
F. Jay Breidt, Colorado State University
Opening Workshop on Computational Methods in Social Sciences, SAMSI, August 2013
This research was supported in part by the US National Science Foundation (SES ).

2 Model-assisted survey regression estimation with the lasso
- (Model-assisted survey regression estimation): joint work with Jean Opsomer, Colorado State, and various other colleagues, bringing nonparametric and semiparametric regression methods into classical survey statistics
- (with the lasso): joint work with former PhD student Kelly McConville, Whitman College; Thomas Lee, UC-Davis; and Gretchen Moisen, US Forest Service

3 Finite population and probability sampling
- Finite population $U = \{1, 2, \ldots, N\}$
- Response variable $y_k$, $k \in U$: these are just unknown real numbers, with no probability structure
- Goal: estimate the finite population total $t_y = \sum_{k \in U} y_k$
- Draw a probability sample $s \subset U$ with $\Pr[k \in s] = \pi_k > 0$

4 Inference for the finite population total
- Sample membership indicator: $I_k = 1$ if $k \in s$, $I_k = 0$ otherwise
- $E[I_k] = \pi_k$, averaging over all possible samples
- Use this repeated-sampling probability structure for statistical inference
- The unbiased Horvitz-Thompson estimator of $t_y$ is
$$\mathrm{HT}(y_k) = \sum_{k \in s} \frac{y_k}{\pi_k} = \sum_{k \in U} y_k \frac{I_k}{\pi_k}$$
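A minimal sketch of the Horvitz-Thompson estimator, assuming simple random sampling without replacement from a tiny made-up population so that every possible sample can be enumerated. The values in `y`, the sample size, and the function name `horvitz_thompson` are illustrative assumptions, not part of the talk. Exact rational arithmetic makes the unbiasedness check exact rather than approximate.

```python
from fractions import Fraction
from itertools import combinations

def horvitz_thompson(sample, y, pi):
    """HT estimate of the population total: sum over the sample of y_k / pi_k."""
    return sum(Fraction(y[k]) / pi[k] for k in sample)

# Toy population (assumed values) under simple random sampling without replacement.
y = [3, 7, 1, 12, 5]
N, n = len(y), 2
pi = [Fraction(n, N)] * N          # every unit has inclusion probability n/N

# All samples are equally likely, so the design expectation is a plain average.
samples = list(combinations(range(N), n))
estimates = [horvitz_thompson(s, y, pi) for s in samples]
design_mean = sum(estimates) / len(samples)
t_y = sum(y)
```

Here `design_mean` is the average of the estimator over all possible samples, which is the sense in which the slide calls the estimator unbiased.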

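5 Variance of the Horvitz-Thompson estimator
- Depends on the covariance structure of $\{I_k\}_{k \in U}$:
$$\mathrm{Var}\left(\sum_{k \in U} y_k \frac{I_k}{\pi_k}\right) = \sum_{k,l \in U} \mathrm{Cov}(I_k, I_l) \frac{y_k}{\pi_k} \frac{y_l}{\pi_l} = \sum_{k,l \in U} \Delta_{kl} \frac{y_k}{\pi_k} \frac{y_l}{\pi_l}$$
where $\Delta_{kl} = \pi_{kl} - \pi_k \pi_l$ and $\pi_{kl} = \Pr[I_k = 1, I_l = 1]$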

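6 Variance estimator for the Horvitz-Thompson estimator
- Provided $\pi_{kl} > 0$ for all $k, l \in U$,
$$\hat V(\mathrm{HT}) = \sum_{k,l \in U} \Delta_{kl} \frac{y_k}{\pi_k} \frac{y_l}{\pi_l} \frac{I_k I_l}{\pi_{kl}}$$
is unbiased for
$$\mathrm{Var}(\mathrm{HT}) = \sum_{k,l \in U} \Delta_{kl} \frac{y_k}{\pi_k} \frac{y_l}{\pi_l}$$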
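The unbiasedness of the variance estimator can be checked by brute force on a small case. The sketch below assumes simple random sampling without replacement, for which $\pi_k = n/N$, $\pi_{kl} = n(n-1)/\{N(N-1)\}$ for $k \neq l$, and $\pi_{kk} = \pi_k$. The population values and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def srs_probs(N, n):
    """First- and second-order inclusion probabilities under SRS without replacement."""
    pi1 = Fraction(n, N)
    pi2 = Fraction(n * (n - 1), N * (N - 1))
    return pi1, pi2

def ht_variance(y, n):
    """Design variance: sum over k, l in U of Delta_kl * (y_k/pi_k) * (y_l/pi_l)."""
    N = len(y)
    pi1, pi2 = srs_probs(N, n)
    total = Fraction(0)
    for k in range(N):
        for l in range(N):
            pkl = pi1 if k == l else pi2      # pi_kk = pi_k
            total += (pkl - pi1 * pi1) * (y[k] / pi1) * (y[l] / pi1)
    return total

def ht_variance_estimate(sample, y, N):
    """Same double sum over sampled pairs only, each term divided by pi_kl."""
    n = len(sample)
    pi1, pi2 = srs_probs(N, n)
    total = Fraction(0)
    for k in sample:
        for l in sample:
            pkl = pi1 if k == l else pi2
            total += (pkl - pi1 * pi1) / pkl * (y[k] / pi1) * (y[l] / pi1)
    return total

y = [3, 7, 1, 12, 5]               # assumed toy values
N, n = len(y), 3
samples = list(combinations(range(N), n))
ht = [sum(y[k] for k in s) / srs_probs(N, n)[0] for s in samples]
true_var = sum((e - sum(y)) ** 2 for e in ht) / len(samples)
mean_var_est = sum(ht_variance_estimate(s, y, N) for s in samples) / len(samples)
```

The enumeration gives three quantities that should coincide: the formula `ht_variance(y, n)`, the variance `true_var` computed directly over all samples, and the design mean `mean_var_est` of the estimator.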

7 Confidence intervals
- Under very mild design conditions,
  - $\hat V(\mathrm{HT})$ is consistent for $\mathrm{Var}(\mathrm{HT})$
  - HT is asymptotically normal
- Confidence intervals can be based on this fact in moderate to large samples:
$$\{\hat V(\mathrm{HT})\}^{-1/2} \{\mathrm{HT}(y_k) - t_y\} \xrightarrow{\mathcal{L}} N(0, 1)$$
- Involves only finite second moments of $\{y_k\}$:
  - not distributional assumptions
  - not dependence assumptions
- Good design makes normal approximations better
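As a small illustration of how the normal approximation is used, the sketch below turns a point estimate and a variance estimate into an interval. The function name and the numerical inputs are assumptions chosen for illustration; `statistics.NormalDist` is part of the Python standard library (3.8+).

```python
from math import sqrt
from statistics import NormalDist

def normal_confidence_interval(estimate, variance_estimate, level=0.95):
    """Interval estimate +/- z * sqrt(variance estimate), z a standard normal quantile."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(variance_estimate)
    return estimate - half_width, estimate + half_width

# Assumed illustrative inputs: a Horvitz-Thompson estimate and its estimated variance.
lower, upper = normal_confidence_interval(estimate=1200.0, variance_estimate=2500.0)
```

The same function applies unchanged to the difference and model-assisted estimators later in the talk, since they inherit the normal limit under the stated conditions.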

8 Auxiliary information and the difference estimator
- Suppose we have an auxiliary information vector $x_k$, $k \in U$
- Also have a method $m(\cdot)$ for predicting $y_k$ from $x_k$: $y_k \approx m(x_k)$
  - the method $m(\cdot)$ does not depend on the sample
  - e.g., inflation-adjust an old census value
- The unbiased difference estimator of $t_y$ is then
$$\mathrm{Diff}_m(y_k) = \sum_{k \in U} m(x_k) + \sum_{k \in s} \frac{y_k - m(x_k)}{\pi_k}$$
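The following sketch checks two properties of the difference estimator by enumerating all samples: it is exactly unbiased even for an imperfect $m(\cdot)$, and its variance is smaller than that of Horvitz-Thompson when the residuals vary less than the raw values. The toy data, the choice $m(x) = 2x$, and the function name are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations

def difference_estimator(sample, y, x, pi, m):
    """Sum of predictions over U plus the HT estimator of the residual total."""
    predicted_total = sum(m(xk) for xk in x)
    residual_ht = sum((y[k] - m(x[k])) / pi[k] for k in sample)
    return predicted_total + residual_ht

# Assumed toy population; m is fixed in advance and deliberately imperfect.
x = [1, 2, 3, 4, 5]
y = [3, 5, 8, 9, 12]
m = lambda xk: 2 * xk
N, n = len(y), 2
pi = [Fraction(n, N)] * N          # simple random sampling without replacement

samples = list(combinations(range(N), n))
diff = [difference_estimator(s, y, x, pi, m) for s in samples]
ht = [sum(y[k] / pi[k] for k in s) for s in samples]
t_y = sum(y)
mean_diff = sum(diff) / len(samples)
var_diff = sum((e - t_y) ** 2 for e in diff) / len(samples)
var_ht = sum((e - t_y) ** 2 for e in ht) / len(samples)
```

In this toy case the residuals $y_k - 2x_k$ are nearly constant, so `var_diff` is far below `var_ht`. A poor choice of `m` would leave the estimator unbiased but could make its variance larger.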

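9 Variance of the difference estimator
$$\mathrm{Var}\left(\sum_{k \in U} m(x_k) + \sum_{k \in U} (y_k - m(x_k)) \frac{I_k}{\pi_k}\right) = \sum_{k,l \in U} \Delta_{kl} \frac{y_k - m(x_k)}{\pi_k} \frac{y_l - m(x_l)}{\pi_l}$$
- Compare to the Horvitz-Thompson estimator:
$$\mathrm{Var}\left(\sum_{k \in U} y_k \frac{I_k}{\pi_k}\right) = \sum_{k,l \in U} \Delta_{kl} \frac{y_k}{\pi_k} \frac{y_l}{\pi_l}$$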

10 Variance estimator for the difference estimator
- Provided $\pi_{kl} > 0$ for all $k, l \in U$,
$$\hat V(\mathrm{Diff}_m) = \sum_{k,l \in U} \Delta_{kl} \frac{y_k - m(x_k)}{\pi_k} \frac{y_l - m(x_l)}{\pi_l} \frac{I_k I_l}{\pi_{kl}}$$
is unbiased for
$$\mathrm{Var}(\mathrm{Diff}_m) = \sum_{k,l \in U} \Delta_{kl} \frac{y_k - m(x_k)}{\pi_k} \frac{y_l - m(x_l)}{\pi_l}$$
- Inherits asymptotic normality from HT under mild conditions

11 Summary so far
- The difference estimator is exactly unbiased, regardless of the quality of the method $m(\cdot)$:
$$\mathrm{Diff}_m(y_k) = \sum_{k \in U} m(x_k) + \mathrm{HT}(y_k - m(x_k))$$
- Has smaller variance than $\mathrm{HT}(y_k)$ provided the residuals $y_k - m(x_k)$ have smaller variation than the raw values $y_k$
- Have an exactly unbiased variance estimator
- Results require that $m(\cdot)$ does not depend on the sample

12 Model-assisted estimation
- The difference estimator requires a method $m(\cdot)$ independent of the sample
- The model-assisted estimator introduces a working model $y_k = \mu(x_k) + \epsilon_k$
- If the entire population were observed, use a standard statistical method to estimate $\mu(\cdot)$ by $m_N(\cdot)$ (independent of the sample)

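13 Model-assisted estimation, continued
- Since only a sample is observed, estimate $m_N(\cdot)$ by $\hat m_N(\cdot)$, which is not independent of the sample
- Plug $\hat m_N(\cdot)$ into the difference estimator form:
$$\mathrm{MA}(y_k) = \sum_{k \in U} \hat m_N(x_k) + \sum_{k \in s} \frac{y_k - \hat m_N(x_k)}{\pi_k}$$
- Plug $\hat m_N(\cdot)$ into the variance estimator:
$$\hat V(\mathrm{MA}) = \sum_{k,l \in U} \Delta_{kl} \frac{y_k - \hat m_N(x_k)}{\pi_k} \frac{y_l - \hat m_N(x_l)}{\pi_l} \frac{I_k I_l}{\pi_{kl}}$$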

14 Important case: Generalized regression estimation
- Working model is heteroskedastic multiple regression:
$$y_k = \mu(x_k) + \epsilon_k = x_k^T \beta + \epsilon_k, \qquad \epsilon_k \sim (0, \sigma_k^2)$$
- If the entire population were observed, use weighted least squares:
$$m_N(x_k) = x_k^T B_N = x_k^T \left( \sum_{j \in U} \frac{x_j x_j^T}{\sigma_j^2} \right)^{-1} \sum_{j \in U} \frac{x_j y_j}{\sigma_j^2}$$

15 Generalized regression estimation, continued
- Estimate the finite population fit from the observed sample:
$$\hat m_N(x_k) = x_k^T \hat B_N = x_k^T \left( \sum_{j \in s} \frac{x_j x_j^T}{\pi_j \sigma_j^2} \right)^{-1} \sum_{j \in s} \frac{x_j y_j}{\pi_j \sigma_j^2}$$
- $\hat B_N$ is asymptotically design unbiased and consistent for $B_N$, regardless of the quality of the working model specification

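16 Generalized regression estimation, continued
- Plug into the model-assisted form:
$$\mathrm{GREG}(y_k) = \sum_{k \in U} x_k^T \hat B_N + \sum_{k \in s} \frac{y_k - x_k^T \hat B_N}{\pi_k}$$
- Plug into the variance estimator:
$$\hat V(\mathrm{GREG}) = \sum_{k,l \in U} \Delta_{kl} \frac{y_k - x_k^T \hat B_N}{\pi_k} \frac{y_l - x_l^T \hat B_N}{\pi_l} \frac{I_k I_l}{\pi_{kl}}$$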

17 GREG examples
- Classical survey ratio estimator: $x_k$ is scalar, model is heteroskedastic regression through the origin
- Classical survey regression estimator: $x_k$ is scalar, model is homoskedastic simple linear regression
- Post-stratification estimator: $x_k$ is a vector of indicators for a categorical covariate
- ...
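The first special case can be verified numerically. The sketch below assumes the standard choice $\sigma_k^2 \propto x_k$ for the heteroskedastic through-the-origin model, under which the GREG form collapses to the classical ratio estimator $t_x \, \mathrm{HT}(y_k) / \mathrm{HT}(x_k)$. The data, the inclusion probabilities, and the function names are hypothetical.

```python
def greg_scalar(sample, y, x, pi, sigma2, t_x):
    """GREG for a scalar auxiliary variable and working variances sigma2[k]."""
    # Survey-weighted least squares slope through the origin.
    num = sum(x[k] * y[k] / (pi[k] * sigma2[k]) for k in sample)
    den = sum(x[k] * x[k] / (pi[k] * sigma2[k]) for k in sample)
    b_hat = num / den
    # Model-assisted form: population total of fitted values plus HT of residuals.
    return t_x * b_hat + sum((y[k] - x[k] * b_hat) / pi[k] for k in sample)

def ratio_estimator(sample, y, x, pi, t_x):
    """Classical ratio estimator: t_x * HT(y) / HT(x)."""
    ht_y = sum(y[k] / pi[k] for k in sample)
    ht_x = sum(x[k] / pi[k] for k in sample)
    return t_x * ht_y / ht_x

# Assumed toy data with unequal inclusion probabilities.
x = [2.0, 4.0, 5.0, 8.0, 10.0, 11.0]
y = [5.0, 9.0, 9.0, 17.0, 22.0, 20.0]
pi = [0.2, 0.3, 0.4, 0.5, 0.7, 0.9]
sample = [1, 3, 4]
t_x = sum(x)
greg = greg_scalar(sample, y, x, pi, sigma2=x, t_x=t_x)   # sigma_k^2 proportional to x_k
ratio = ratio_estimator(sample, y, x, pi, t_x)
```

With this variance choice the weighted slope is $\mathrm{HT}(y_k)/\mathrm{HT}(x_k)$, so the Horvitz-Thompson total of the residuals is zero and only the fitted-value total remains.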

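18 Properties of GREG
- Write GREG as
$$\mathrm{GREG}(y_k) = \sum_{k \in U} x_k^T \hat B_N + \sum_{k \in s} \frac{y_k - x_k^T \hat B_N}{\pi_k}$$
$$= \sum_{k \in U} x_k^T B_N + \sum_{k \in U} (y_k - x_k^T B_N) \frac{I_k}{\pi_k} + (\hat B_N - B_N)^T \sum_{k \in U} x_k \left( 1 - \frac{I_k}{\pi_k} \right)$$
$$= \mathrm{Diff}_{m_N}(y_k) + (\text{smaller-order term})$$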

19 GREG estimator behaves like difference estimator
- Asymptotically unbiased (and mean square consistent), regardless of the quality of the working model $\mu(\cdot)$
- Variance is asymptotically equivalent to
$$\mathrm{Var}\left(\sum_{k \in U} x_k^T B_N + \sum_{k \in U} (y_k - x_k^T B_N) \frac{I_k}{\pi_k}\right) = \sum_{k,l \in U} \Delta_{kl} \frac{y_k - x_k^T B_N}{\pi_k} \frac{y_l - x_l^T B_N}{\pi_l}$$
- Smaller asymptotic variance than $\mathrm{HT}(y_k)$ provided the residuals $y_k - x_k^T B_N$ have less variation than the raw values $y_k$

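20 Weighting
- GREG can also be written in weighted form:
$$\mathrm{GREG}(y_k) = \sum_{k \in s} \frac{y_k - x_k^T \hat B_N}{\pi_k} + \sum_{k \in U} x_k^T \hat B_N = \sum_{k \in s} \left\{ 1 + (t_x - \mathrm{HT}(x_k))^T \left( \sum_{j \in s} \frac{x_j x_j^T}{\pi_j} \right)^{-1} x_k \right\} \frac{y_k}{\pi_k} = \sum_{k \in s} \omega_{ks} y_k$$
- The GREG weights $\{\omega_{ks}\}$ do not depend on $y$ and can be applied to any response variable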

21 US Forest Inventory and Analysis: Many y's!
- Estimates required for forest area, wood volume, growth, mortality, ...
- By region, species and other classifications

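22 Weighting and calibration
- Note that the weights $\{\omega_{ks}\}$ are calibrated to the X-totals:
$$\mathrm{GREG}(x_k^T) = \sum_{k \in s} \left\{ 1 + (t_x - \mathrm{HT}(x_k))^T \left( \sum_{j \in s} \frac{x_j x_j^T}{\pi_j} \right)^{-1} x_k \right\} \frac{x_k^T}{\pi_k}$$
$$= \mathrm{HT}(x_k^T) + (t_x - \mathrm{HT}(x_k))^T \left( \sum_{j \in s} \frac{x_j x_j^T}{\pi_j} \right)^{-1} \sum_{k \in s} \frac{x_k x_k^T}{\pi_k} = t_x^T$$
- GREG will be very efficient if $y_k$ is approximately a linear combination of $x_k$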
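A minimal sketch of the weighted form and its calibration property, assuming $x_k = (1, x_k)^T$ with a constant working variance so that the matrix inverse is a closed-form 2x2 inverse. The population, sample, inclusion probabilities, and the name `greg_weights` are assumptions for illustration.

```python
def greg_weights(x_s, pi_s, N, t_x):
    """GREG weights for x_k = (1, x_k)^T with constant working variance.

    w_k = (1/pi_k) * {1 + (t - HT)^T A^{-1} x_k},  A = sum_s x_k x_k^T / pi_k.
    """
    d = [1.0 / p for p in pi_s]                          # design weights 1/pi_k
    a00 = sum(d)
    a01 = sum(dk * xk for dk, xk in zip(d, x_s))
    a11 = sum(dk * xk * xk for dk, xk in zip(d, x_s))
    det = a00 * a11 - a01 * a01
    # Gap between the known totals (N, t_x) and their HT estimates (a00, a01).
    g0, g1 = N - a00, t_x - a01
    # Row vector (t - HT)^T A^{-1}, using the closed-form 2x2 inverse.
    c0 = (g0 * a11 - g1 * a01) / det
    c1 = (-g0 * a01 + g1 * a00) / det
    return [dk * (1.0 + c0 + c1 * xk) for dk, xk in zip(d, x_s)]

# Assumed toy population of auxiliary values and an unequal-probability sample.
x_pop = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
sample = [0, 1, 5, 7]
pi_s = [0.4, 0.5, 0.5, 0.8]
x_s = [x_pop[k] for k in sample]
y_s = [2.1, 4.2, 11.8, 16.2]

w = greg_weights(x_s, pi_s, N=len(x_pop), t_x=sum(x_pop))
greg_total = sum(wk * yk for wk, yk in zip(w, y_s))
weighted_count = sum(w)                                   # should reproduce N
weighted_x_total = sum(wk * xk for wk, xk in zip(w, x_s)) # should reproduce t_x
```

The weights are computed without looking at `y_s`, so the same list `w` could be reused for any other response measured on the same sample.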

23 GREG calibration features
- $\mathrm{GREG}(x_k) = t_x$
- Calibration reproduces known population information
  - internal consistency across statistical system
  - reassuring for users, logistically convenient for agencies
- GREG weight adjustments may be large
  - extreme weights, negative weights are possible
  - many methods developed to trim or stabilize weights, including ridge calibration (Rao and Singh 1997, ASA Proc.; Park and Fuller 2005, Survey Methodology; Montanari and Ranalli 2009, Stats Canada workshop)

24 Alternatives to GREG
- Keep the calibration, change the metric:
  - Deville and Särndal (1992, JASA) minimize $d\left(\pi_k^{-1}, \omega_{ks}\right)$ subject to $\sum_{k \in s} \omega_{ks} x_k = t_x$
- Specify the working model more flexibly:
  - Local polynomial regression (Breidt and Opsomer 2000)
  - Neural nets (Montanari and Ranalli 2005)
  - Penalized splines (Breidt, Claeskens, Opsomer 2005); Regression splines (Goga 2005)
  - Generalized additive models (Opsomer, Breidt, Moisen, and Kauermann 2007); Nonparametric additive models (Wang and Wang 2011)

25 General recipe for model-assisted estimation
- Write down your favorite $m_N(\cdot)$ you would use if the entire population were observed
- Create the survey-weighted version, $\hat m_N(\cdot)$
- Plug in and write the model-assisted estimator as
$$\mathrm{MA}(y_k) = \sum_{k \in U} \hat m_N(x_k) + \sum_{k \in s} \frac{y_k - \hat m_N(x_k)}{\pi_k} = \mathrm{Diff}_{m_N}(y_k) + (\text{smaller-order term})$$

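26 Example: Local polynomial regression
- Breidt and Opsomer (2000), Ann. Stat.
- Working model: $\mu(\cdot)$ is a smooth function of scalar $x$
- Estimate $\mu(\cdot)$ via local polynomial regression:
$$m_N(x_i) = (1, 0, \ldots, 0) \left( X_{Ui}^T W_{Ui} X_{Ui} \right)^{-1} X_{Ui}^T W_{Ui} \, y_U$$
where
$$X_{Ui} = \left[ 1 \quad x_j - x_i \quad \cdots \quad (x_j - x_i)^q \right]_{j \in U}$$
and
$$W_{Ui} = \mathrm{diag}\left\{ \frac{1}{h} K\left( \frac{x_j - x_i}{h} \right) \right\}_{j \in U}$$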
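For the local linear case ($q = 1$) the matrix expression reduces to a 2x2 weighted least squares problem, which the sketch below solves in closed form. The Epanechnikov kernel is an assumed choice (any symmetric, continuous, compactly supported kernel would fit the description), and the function names and data are hypothetical.

```python
def epanechnikov(u):
    """A symmetric, continuous, compactly supported kernel (an assumed choice)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def local_linear(x0, x, y, h, kernel=epanechnikov):
    """Local linear (q = 1) fit at x0: the intercept of a kernel-weighted line."""
    w = [kernel((xj - x0) / h) / h for xj in x]
    d = [xj - x0 for xj in x]                      # centered design column
    s0 = sum(w)
    s1 = sum(wj * dj for wj, dj in zip(w, d))
    s2 = sum(wj * dj * dj for wj, dj in zip(w, d))
    t0 = sum(wj * yj for wj, yj in zip(w, y))
    t1 = sum(wj * dj * yj for wj, dj, yj in zip(w, d, y))
    # First component of (X'WX)^{-1} X'Wy via the closed-form 2x2 inverse.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)

# Assumed toy data lying exactly on a line, so the local fit should recover it.
x = [i / 10 for i in range(11)]
y = [1.0 + 2.0 * xi for xi in x]
fit_at_035 = local_linear(0.35, x, y, h=0.3)
```

A survey-weighted version in the spirit of the talk would restrict the sums to sampled units and multiply each kernel weight by $1/\pi_j$; that variant is not shown here.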

27 Example: Local linear regression
- Fit a line locally (as defined by kernel weights) and read off the local intercept
[Figure: plot of y against x]

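28 Properties of the LPR estimator?
- Estimate $m_N(\cdot)$ using survey weights to get $\hat m_N(\cdot)$, plug in to the model-assisted form:
$$\mathrm{LPR}(y_k) = \sum_{k \in U} \hat m_N(x_k) + \sum_{k \in s} \frac{y_k - \hat m_N(x_k)}{\pi_k}$$
$$= \sum_{k \in U} m_N(x_k) + \sum_{k \in U} (y_k - m_N(x_k)) \frac{I_k}{\pi_k} + \sum_{k \in U} \left( \hat m_N(x_k) - m_N(x_k) \right) \left( 1 - \frac{I_k}{\pi_k} \right)$$
$$= \mathrm{Diff}_{m_N}(y_k) + (\text{smaller-order term??})$$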

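29 Asymptotic framework
- Consider a sequence of finite populations with $n_N \to \infty$ as $N \to \infty$
- Smoothing assumptions:
  - kernel $K(\cdot)$ is symmetric, continuous, and compactly supported
  - bandwidth $h_N \to 0$ and $N h_N^2 \to \infty$
- Design assumptions:
  - $\min_{i \in U_N} \pi_i \geq \lambda > 0$ and $\min_{i,j \in U_N} \pi_{ij} \geq \lambda > 0$
  - limited dependence:
$$\max_{i,j \in U_N : i \neq j} \left| \pi_{ij} - \pi_i \pi_j \right| = O\left( n_N^{-1} \right)$$
$$\max_{(i,j,k,l) \text{ distinct}} \left| E\left[ (I_i - \pi_i)(I_j - \pi_j)(I_k - \pi_k)(I_l - \pi_l) \right] \right| = O\left( N^{-2} \right)$$
$$\max_{(i,j,k,l) \text{ distinct}} \left| E\left[ (I_i I_j - \pi_{ij})(I_k I_l - \pi_{kl}) \right] \right| = o(1)$$
$$\max_{(i,j,k) \text{ distinct}} \left| E\left[ (I_i - \pi_i)^2 (I_j - \pi_j)(I_k - \pi_k) \right] \right| = O\left( n_N^{-1} \right)$$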

30 LPR estimator behaves like difference estimator
- Under the above asymptotic framework, the LPR estimator is mean square consistent for $t_y$
- Variance is asymptotically equivalent to
$$\sum_{k,l \in U} \Delta_{kl} \frac{y_k - m_N(x_k)}{\pi_k} \frac{y_l - m_N(x_l)}{\pi_l}$$
- Standard plug-in variance estimator is consistent
- Smaller asymptotic variance than HT provided the residuals $y_k - m_N(x_k)$ have less variation than the raw values $y_k$

31 Weighting and calibration for LPR
- $\mathrm{LPR}(y_i) = \sum_{i \in s} \omega_{is} y_i$ with weights independent of $y$
- For a $q$th order local polynomial, weights are calibrated to powers of $x$:
$$\sum_{i \in s} \omega_{is} x_i^l = \sum_{i \in U_N} x_i^l \qquad (l = 0, 1, \ldots, q)$$
- LPR will be particularly effective if $y$ is approximately a $q$th order polynomial in $x$

32 Example: Penalized spline regression
- $K$ fixed, known knots and $K$ basis functions
- Penalty parameter $\lambda^2$ determines the degrees of freedom of the smooth
[Figure: four panels of y against x, titled "lambda^2 = 0 and df = 16", "lambda^2 = and df = 8", "lambda^2 = and df = 4", "lambda^2 = 1000 and df = 2"]

33 Model-assisted estimation with p-splines
- Breidt, Claeskens and Opsomer (2005, Biometrika)
- Choose penalty $\lambda^2$ to give specified degrees of freedom
- Formulate the p-spline as a linear mixed model (LMM)
- Write down the LMM fit for the entire finite population:
$$[m_N(c_k)]_{k \in U} = C (C^T C + \Lambda)^{-1} C^T y_U$$
with $\Lambda = \mathrm{blockdiag}(0, \lambda^2 I_K)$
- Estimate $m_N(\cdot)$ using survey weights to get $\hat m_N(\cdot)$, plug in to the model-assisted form

34 P-spline estimator behaves like difference estimator
- Under the standard asymptotic framework with $K$ fixed,
  - the p-spline estimator is mean square consistent for $t_y$
  - variance is asymptotically equivalent to
$$\sum_{k,l \in U} \Delta_{kl} \frac{y_k - m_N(c_k)}{\pi_k} \frac{y_l - m_N(c_l)}{\pi_l}$$
  - standard plug-in variance estimator is consistent
  - smaller asymptotic variance than HT provided the residuals $y_k - m_N(c_k)$ have less variation than the raw values $y_k$
- McConville and Breidt (2013, J. Nonpar. Stat.) prove the above results with $K \to \infty$

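35 P-spline weighting and calibration
- The LMM model-assisted estimator (including p-spline) can be written
$$\mathrm{LMM}(y_k) = \sum_{k \in s} \frac{y_k - c_k^T \hat B}{\pi_k} + \sum_{k \in U} c_k^T \hat B = \sum_{k \in s} \left\{ 1 + (t_c - \mathrm{HT}(c_k))^T \left( \sum_{j \in s} \frac{c_j c_j^T}{\pi_j} + \Lambda \right)^{-1} c_k \right\} \frac{y_k}{\pi_k} = \sum_{k \in s} \omega_{ks} y_k$$
- Calibrated on X, $\mathrm{LMM}(x_k) = t_x$, but not on Z, $\mathrm{LMM}(z_k) \neq t_z$, due to the penalization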

36 Comparing local polynomials with p-splines
- Both LPR and p-splines have good robustness properties:
  - comparable efficiency to GREG when the parametric working model is correct
  - better efficiency when the parametric working model is incorrect
  - better-behaved weights than GREG (e.g., almost never negative)
- P-splines extend much more readily than kernels to semiparametric models:
  - additional X variables, continuous or categorical, with calibration
  - additional Z variables, continuous or categorical, without calibration

37 Model Calibration: GREG with Model Predictions
- Both LPR and p-splines had linear structure, allowing calibrated $y$-independent weights
- Nonlinear methods are typically uncalibrated, but...
- Wu and Sitter (2001, JASA): regress $y_k$ on $\hat m_N(x_k)$:
$$\mathrm{WS}(y_k) = \sum_{k \in s} \frac{y_k - \hat m_N(x_k) \hat R}{\pi_k} + \sum_{k \in U} \hat m_N(x_k) \hat R$$
$$= \sum_{k \in s} \left\{ \frac{1}{\pi_k} + \left( \sum_{j \in U} \hat m_N(x_j) - \sum_{j \in s} \frac{\hat m_N(x_j)}{\pi_j} \right) \frac{\hat m_N(x_k) / \pi_k}{\sum_{j \in s} \hat m_N(x_j)^2 / \pi_j} \right\} y_k$$
yielding weights, but these depend on $y$
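The weighted form above can be sketched directly once fitted values are available. The code below takes the fitted values as given numbers (standing in for the output of some nonlinear method fitted to the sample), uses the through-the-origin slope implied by the displayed weights, and checks that the weights reproduce the population total of the fitted values. All data and names are assumptions for illustration.

```python
def model_calibration_weights(m_s, pi_s, m_pop_total):
    """Weights of the through-the-origin model-calibration form.

    w_k = 1/pi_k + (sum_U m - sum_s m/pi) * (m_k/pi_k) / sum_s(m^2/pi).
    """
    ht_m = sum(mk / pk for mk, pk in zip(m_s, pi_s))
    ss_m = sum(mk * mk / pk for mk, pk in zip(m_s, pi_s))
    gap = m_pop_total - ht_m
    return [1.0 / pk + gap * (mk / pk) / ss_m for mk, pk in zip(m_s, pi_s)]

# Assumed fitted values for the whole population and for the sampled units.
m_pop = [1.2, 0.4, 2.5, 3.1, 0.9, 1.7]
sample = [0, 2, 3]
pi_s = [0.5, 0.4, 0.6]
m_s = [m_pop[k] for k in sample]
y_s = [1.0, 2.9, 3.4]

w = model_calibration_weights(m_s, pi_s, sum(m_pop))
ws_total = sum(wk * yk for wk, yk in zip(w, y_s))
calibrated_m_total = sum(wk * mk for wk, mk in zip(w, m_s))
```

Because the fitted values are themselves estimated from the sampled $y$, these weights change from one response variable to the next, which is the dependence on $y$ noted on the slide.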

38 Forest Inventory and Analysis Program
- Conducted by the United States Forest Service
- Nationwide network of 400,000 sample plots, visited every five years in rotating panels
- Goal: national and regional estimates of $t_y$ or $\bar y_U$ for various $y$:
  - forest area, wood volume, growth, mortality, ...
  - expensive data $y$ are field-collected or manually interpreted from aerial photography
  - interest in using cheap auxiliary data $x$ from remote sensing and other spatial data sources

39 Auxiliary information for all $k \in U$ in real applications?
- Required auxiliary info comes from wall-to-wall remote sensing, like
  - satellite imagery
  - digital elevation models
  - geographic information system data products
- Working model is $\mu(x_k) = x_k^T \beta$, as with GREG, but auxiliary variables
  - may be highly correlated
  - may have poor predictive ability for response variables
- Consider using model selection methods

40 Lasso estimation
- Tibshirani (1996): Least absolute shrinkage and selection operator (lasso)
$$B_N^{(L)} = \arg\min_{\beta} \; (Y_U - X_U \beta)^T (Y_U - X_U \beta) + \lambda_N \sum_{j=1}^{p} |\beta_j|$$
- Simultaneously performs model selection and estimation by shrinking unnecessary coefficients to zero

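41 Lasso survey regression estimator
- Construct the survey-weighted version:
$$\hat B_N^{(L)} = \arg\min_{\beta} \; (Y_s - X_s \beta)^T \Pi_s^{-1} (Y_s - X_s \beta) + \lambda_N \sum_{j=1}^{p} |\beta_j|$$
- Plug into the model-assisted form:
$$\mathrm{LASSO}(y_k) = \sum_{k \in U} x_k^T \hat B_N^{(L)} + \sum_{k \in s} \frac{y_k - x_k^T \hat B_N^{(L)}}{\pi_k}$$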
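One way to compute the survey-weighted criterion is cyclic coordinate descent with soft thresholding; this is an assumed solver for illustration, and the talk does not say which algorithm was used. The sketch penalizes every coefficient, as in the displayed criterion (practical implementations often leave the intercept unpenalized). Data and function names are hypothetical.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def survey_lasso(X_s, y_s, pi_s, lam, sweeps=1000, tol=1e-12):
    """Coordinate descent for sum_s (y_k - x_k'b)^2 / pi_k + lam * sum_j |b_j|."""
    p = len(X_s[0])
    w = [1.0 / pk for pk in pi_s]
    beta = [0.0] * p
    for _ in range(sweeps):
        biggest_change = 0.0
        for j in range(p):
            # Weighted inner product of column j with the partial residual.
            rho = sum(
                wk * xk[j] * (yk - sum(xk[i] * beta[i] for i in range(p) if i != j))
                for wk, xk, yk in zip(w, X_s, y_s)
            )
            z = sum(wk * xk[j] ** 2 for wk, xk in zip(w, X_s))
            new = soft_threshold(rho, lam / 2.0) / z
            biggest_change = max(biggest_change, abs(new - beta[j]))
            beta[j] = new
        if biggest_change < tol:
            break
    return beta

def lasso_total(X_pop, X_s, y_s, pi_s, lam):
    """Model-assisted form: fitted total over U plus the HT estimator of residuals."""
    beta = survey_lasso(X_s, y_s, pi_s, lam)
    fit = lambda xk: sum(a * b for a, b in zip(xk, beta))
    fitted_total = sum(fit(xk) for xk in X_pop)
    residual_ht = sum((yk - fit(xk)) / pk for xk, yk, pk in zip(X_s, y_s, pi_s))
    return fitted_total + residual_ht, beta

# Assumed toy data: intercept plus one covariate, with y = 2 + 3x on the sample.
X_pop = [[1.0, float(x)] for x in (-3, -2, -1, 0, 1, 2, 3, 4)]
X_s = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
y_s = [-4.0, -1.0, 5.0, 8.0]
pi_s = [0.5, 0.5, 0.5, 0.5]
total, beta = lasso_total(X_pop, X_s, y_s, pi_s, lam=8.0)
```

Setting `lam=0.0` recovers the survey-weighted least squares fit, while a very large `lam` shrinks every coefficient to zero so that the estimator reduces to Horvitz-Thompson. The threshold is `lam / 2` because the squared-error term in the criterion carries no factor of one half.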

42 Asymptotics for the lasso survey regression estimator
- Assume conditions under which GREG is consistent and asymptotically equivalent to the difference estimator,
$$\sum_{k \in U} x_k^T B_N + \sum_{k \in s} \frac{y_k - x_k^T B_N}{\pi_k},$$
where $B_N = \left( X_U^T X_U \right)^{-1} X_U^T y_U$
- If $\lambda_N = o\left( \sqrt{N} \right)$, then LASSO shares this asymptotic equivalence
  - working model $\neq$ true model, so oracle properties are not relevant
  - any advantages of LASSO are in finite samples

43 Extensions of lasso survey regression estimator
- Can also consider the adaptive lasso, A: Zou (2006, JASA)
- Lasso is non-linear, so get weights either by
  - Wu and Sitter (2001) model calibration: C
  - Tibshirani (1996) ridge regression approximation to lasso coefficient estimates: R

                  nonlinear   calibration weights   ridge weights
  lasso           LASSO       CLASSO                RLASSO
  adaptive lasso  ALASSO      CALASSO               RALASSO

44 Study finite sample properties via simulation
- Utah tree canopy cover data set
  - national pilot project conducted by FIA and the US Forest Service Remote Sensing Applications Center
  - $N = 4{,}151$ grid points on one Landsat scene in Utah
  - covers parts of 10 counties
- Response: $y_k$ = photo-interpreted tree canopy cover
  - relevant to forest management, fire modeling, air pollution mitigation, water temperature, and carbon storage
  - correlated with many other interesting responses
  - known for all $k \in U$ for this pilot study
  - very expensive to obtain!

45 Utah tree canopy cover simulation, continued
- Auxiliary data $x_k$ from the 2001 National Land Cover Database and Landsat-5 reflectance bands, transformed aspect, slope, topographic positional index, elevation, land cover and NLCD predicted tree canopy cover
  - about half are statistically significant in the finite population regression of $y$ on $x$
- Sampling designs:
  - simple random (SI) and stratified simple random (STSI) with counties as strata
  - sample sizes $n = 50$ and $n = 100$
  - equal allocation to counties, hence unequal probabilities
  - 2000 replicate samples from the same fixed, finite population

46 Results: Design bias and MSE
- For all estimators, relative design bias was less than 2%
- Ratio of design MSE for each estimator to design MSE of the full GREG estimator:
[Table: columns SI n = 50, SI n = 100, STSI n = 50, STSI n = 100; rows LASSO, ALASSO, CLASSO, CALASSO, RLASSO, RALASSO (model-assisted) and HT (design only)]

47 Results: Weight properties
- Negative weights:
  - calibration weights were negative in only 0.036% of all cases
  - ridge regression weights in only 0.65%
  - GREG weights varied from 1% to 14% negative weights
- Weight variation within samples and across samples:
[Table: weight variances $\mathrm{var}(w)$ and $\mathrm{var}(w_j \mid j \in s)$, each under SI and STSI with n = 50 and n = 100; rows CLASSO, CALASSO, RLASSO, RALASSO, GREG, HT]

48 Summary
- Lasso survey regression estimators have useful potential
  - dominate MSE of GREG in small samples with large numbers of potential predictors
  - better weight properties: less variation and fewer negative weights
- Sophisticated computational methods can improve basic survey estimators
  - the model-assisted framework gives a straightforward recipe for incorporating complex methods
- Contact:

49 Selected references on model-assisted survey regression estimation (by no means exhaustive, sorry!)
- Breidt, F. J., G. Claeskens, and J. D. Opsomer (2005). Model-assisted estimation for complex surveys using penalised splines. Biometrika 92.
- Breidt, F. J. and J. D. Opsomer (2000). Local polynomial regression estimators in survey sampling. Annals of Statistics 28.
- Breidt, F. J. and J. D. Opsomer (2008). Endogenous post-stratification in surveys: classifying with a sample-fitted model. Annals of Statistics 36.
- Breidt, F. J. and J. D. Opsomer (2009). Nonparametric and semiparametric estimation in complex surveys. Sample Surveys: Theory, Methods and Inference, Handbook of Statistics 29.
- Breidt, F. J., J. D. Opsomer, A. A. Johnson, and M. G. Ranalli (2007). Semiparametric model-assisted estimation for natural resource surveys. Survey Methodology 33.
- Cassel, C. M., C. E. Särndal, and J. H. Wretman (1976). Some results on generalized difference estimation and generalized regression estimation for finite populations. Biometrika 63.
- Dahlke, M., F. J. Breidt, J. D. Opsomer, and I. Van Keilegom (2013). Nonparametric endogenous post-stratification estimation. Statistica Sinica 23.
- Deville, J. C. and C. E. Särndal (1992). Calibration estimators in survey sampling. Journal of the American Statistical Association 87.
- Goga, C. (2005). Réduction de la variance dans les sondages en présence d'information auxiliaire: une approche non paramétrique par splines de régression. Canadian Journal of Statistics 33.

50
- McConville, K. S. and F. J. Breidt (2013). Survey design asymptotics for the model-assisted penalised spline regression estimator. Journal of Nonparametric Statistics (ahead-of-print).
- Montanari, G. and M. Ranalli (2005). Nonparametric methods in survey sampling. New Developments in Classification and Data Analysis 100.
- Montanari, G. E. and M. G. Ranalli (2005). Nonparametric model calibration estimation in survey sampling. Journal of the American Statistical Association 100(472).
- Montanari, G. E. and M. G. Ranalli (2006). A mixed model-assisted regression estimator that uses variables employed at the design stage. Statistical Methods and Applications 15.
- Montanari, G. E. and M. G. Ranalli (2009). Multiple and ridge model calibration. In Proc. Workshop on Calibration and Estimation in Surveys. Statistics Canada.
- Opsomer, J. D., F. J. Breidt, G. G. Moisen, and G. Kauermann (2007). Model-assisted estimation of forest resources with generalized additive models (with discussion). Journal of the American Statistical Association 102.
- Park, M. and W. A. Fuller (2005). Towards nonnegative regression weights for survey samples. Survey Methodology 31.
- Park, M. and W. A. Fuller (2009). The mixed model for survey regression estimation. Journal of Statistical Planning and Inference 139.
- Rao, J. N. K. and A. C. Singh (1997). A ridge-shrinkage method for range-restricted weight calibration in survey sampling. In ASA Proc. Sect. Survey Res. Meth., American Statistical Association.
- Robinson, P. M. and C. E. Särndal (1983). Asymptotic properties of the generalized regression estimator in probability sampling. Sankhya: The Indian Journal of Statistics, Series B 45.
- Särndal, C. E. (2007). The calibration approach in survey theory and practice. Survey Methodology 33(2).

51
- Särndal, C.-E., B. Swensson, and J. Wretman (1989). The weighted residual technique for estimating the variance of the general regression estimator of the finite population total. Biometrika 76.
- Särndal, C.-E., B. Swensson, and J. Wretman (1992). Model Assisted Survey Sampling. New York: Springer-Verlag.
- Wang, L. and S. Wang (2011). Nonparametric additive model-assisted estimation for survey data. Journal of Multivariate Analysis 102.
- Wu, C. and R. R. Sitter (2001). A model-calibration approach to using complete auxiliary information from survey data. Journal of the American Statistical Association 96.


More information

Data Integration for Big Data Analysis for finite population inference

Data Integration for Big Data Analysis for finite population inference for Big Data Analysis for finite population inference Jae-kwang Kim ISU January 23, 2018 1 / 36 What is big data? 2 / 36 Data do not speak for themselves Knowledge Reproducibility Information Intepretation

More information

Comments on Design-Based Prediction Using Auxilliary Information under Random Permutation Models (by Wenjun Li (5/21/03) Ed Stanek

Comments on Design-Based Prediction Using Auxilliary Information under Random Permutation Models (by Wenjun Li (5/21/03) Ed Stanek Comments on Design-Based Prediction Using Auxilliary Information under Random Permutation Models (by Wenjun Li (5/2/03) Ed Stanek Here are comments on the Draft Manuscript. They are all suggestions that

More information

Weight calibration and the survey bootstrap

Weight calibration and the survey bootstrap Weight and the survey Department of Statistics University of Missouri-Columbia March 7, 2011 Motivating questions 1 Why are the large scale samples always so complex? 2 Why do I need to use weights? 3

More information

Empirical Likelihood Methods for Sample Survey Data: An Overview

Empirical Likelihood Methods for Sample Survey Data: An Overview AUSTRIAN JOURNAL OF STATISTICS Volume 35 (2006), Number 2&3, 191 196 Empirical Likelihood Methods for Sample Survey Data: An Overview J. N. K. Rao Carleton University, Ottawa, Canada Abstract: The use

More information

Lecture 14: Variable Selection - Beyond LASSO

Lecture 14: Variable Selection - Beyond LASSO Fall, 2017 Extension of LASSO To achieve oracle properties, L q penalty with 0 < q < 1, SCAD penalty (Fan and Li 2001; Zhang et al. 2007). Adaptive LASSO (Zou 2006; Zhang and Lu 2007; Wang et al. 2007)

More information

Generalized Pseudo Empirical Likelihood Inferences for Complex Surveys

Generalized Pseudo Empirical Likelihood Inferences for Complex Surveys The Canadian Journal of Statistics Vol.??, No.?,????, Pages???-??? La revue canadienne de statistique Generalized Pseudo Empirical Likelihood Inferences for Complex Surveys Zhiqiang TAN 1 and Changbao

More information

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Jae-Kwang Kim 1 Iowa State University June 26, 2013 1 Joint work with Shu Yang Introduction 1 Introduction

More information

Model-Assisted Estimation of Forest Resources With Generalized Additive Models

Model-Assisted Estimation of Forest Resources With Generalized Additive Models Model-Assisted Estimation of Forest Resources With Generalized Additive Models Jean D. OPSOMER, F.JayBREIDT, Gretchen G. MOISEN, and Göran KAUERMANN Multiphase surveys are often conducted in forest inventories,

More information

Graybill Conference Poster Session Introductions

Graybill Conference Poster Session Introductions Graybill Conference Poster Session Introductions 2013 Graybill Conference in Modern Survey Statistics Colorado State University Fort Collins, CO June 10, 2013 Small Area Estimation with Incomplete Auxiliary

More information

Advanced Methods for Agricultural and Agroenvironmental. Emily Berg, Zhengyuan Zhu, Sarah Nusser, and Wayne Fuller

Advanced Methods for Agricultural and Agroenvironmental. Emily Berg, Zhengyuan Zhu, Sarah Nusser, and Wayne Fuller Advanced Methods for Agricultural and Agroenvironmental Monitoring Emily Berg, Zhengyuan Zhu, Sarah Nusser, and Wayne Fuller Outline 1. Introduction to the National Resources Inventory 2. Hierarchical

More information

STAT 518 Intro Student Presentation

STAT 518 Intro Student Presentation STAT 518 Intro Student Presentation Wen Wei Loh April 11, 2013 Title of paper Radford M. Neal [1999] Bayesian Statistics, 6: 475-501, 1999 What the paper is about Regression and Classification Flexible

More information

Mostly Dangerous Econometrics: How to do Model Selection with Inference in Mind

Mostly Dangerous Econometrics: How to do Model Selection with Inference in Mind Outline Introduction Analysis in Low Dimensional Settings Analysis in High-Dimensional Settings Bonus Track: Genaralizations Econometrics: How to do Model Selection with Inference in Mind June 25, 2015,

More information

Estimation under cross classified sampling with application to a childhood survey

Estimation under cross classified sampling with application to a childhood survey TSE 659 April 2016 Estimation under cross classified sampling with application to a childhood survey Hélène Juillard, Guillaume Chauvet and Anne Ruiz Gazen Estimation under cross-classified sampling with

More information

Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood

Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood Kuangyu Wen & Ximing Wu Texas A&M University Info-Metrics Institute Conference: Recent Innovations in Info-Metrics October

More information

Estimation under cross-classified sampling with application to a childhood survey

Estimation under cross-classified sampling with application to a childhood survey Estimation under cross-classified sampling with application to a childhood survey arxiv:1511.00507v1 [math.st] 2 Nov 2015 Hélène Juillard Guillaume Chauvet Anne Ruiz-Gazen January 11, 2018 Abstract The

More information

Ultra High Dimensional Variable Selection with Endogenous Variables

Ultra High Dimensional Variable Selection with Endogenous Variables 1 / 39 Ultra High Dimensional Variable Selection with Endogenous Variables Yuan Liao Princeton University Joint work with Jianqing Fan Job Market Talk January, 2012 2 / 39 Outline 1 Examples of Ultra High

More information

A Unified Theory of Empirical Likelihood Confidence Intervals for Survey Data with Unequal Probabilities and Non Negligible Sampling Fractions

A Unified Theory of Empirical Likelihood Confidence Intervals for Survey Data with Unequal Probabilities and Non Negligible Sampling Fractions A Unified Theory of Empirical Likelihood Confidence Intervals for Survey Data with Unequal Probabilities and Non Negligible Sampling Fractions Y.G. Berger O. De La Riva Torres Abstract We propose a new

More information

arxiv: v1 [stat.me] 13 Nov 2017

arxiv: v1 [stat.me] 13 Nov 2017 Checking Validity of Monotone Domain Mean Estimators arxiv:1711.04749v1 [stat.me] 13 ov 2017 Cristian Oliva, Mary C. Meyer and Jean D. Opsomer Department of Statistics, Colorado State University, Fort

More information

Lecture 14: Shrinkage

Lecture 14: Shrinkage Lecture 14: Shrinkage Reading: Section 6.2 STATS 202: Data mining and analysis October 27, 2017 1 / 19 Shrinkage methods The idea is to perform a linear regression, while regularizing or shrinking the

More information

Linear regression methods

Linear regression methods Linear regression methods Most of our intuition about statistical methods stem from linear regression. For observations i = 1,..., n, the model is Y i = p X ij β j + ε i, j=1 where Y i is the response

More information

Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression

Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction

More information

mboost - Componentwise Boosting for Generalised Regression Models

mboost - Componentwise Boosting for Generalised Regression Models mboost - Componentwise Boosting for Generalised Regression Models Thomas Kneib & Torsten Hothorn Department of Statistics Ludwig-Maximilians-University Munich 13.8.2008 Boosting in a Nutshell Boosting

More information

The R package sampling, a software tool for training in official statistics and survey sampling

The R package sampling, a software tool for training in official statistics and survey sampling The R package sampling, a software tool for training in official statistics and survey sampling Yves Tillé 1 and Alina Matei 2 1 Institute of Statistics, University of Neuchâtel, Switzerland yves.tille@unine.ch

More information

A new resampling method for sampling designs without replacement: the doubled half bootstrap

A new resampling method for sampling designs without replacement: the doubled half bootstrap 1 Published in Computational Statistics 29, issue 5, 1345-1363, 2014 which should be used for any reference to this work A new resampling method for sampling designs without replacement: the doubled half

More information

A Modern Look at Classical Multivariate Techniques

A Modern Look at Classical Multivariate Techniques A Modern Look at Classical Multivariate Techniques Yoonkyung Lee Department of Statistics The Ohio State University March 16-20, 2015 The 13th School of Probability and Statistics CIMAT, Guanajuato, Mexico

More information

Regression, Ridge Regression, Lasso

Regression, Ridge Regression, Lasso Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.

More information

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Institute of Statistics and Econometrics Georg-August-University Göttingen Department of Statistics

More information

Bayesian Grouped Horseshoe Regression with Application to Additive Models

Bayesian Grouped Horseshoe Regression with Application to Additive Models Bayesian Grouped Horseshoe Regression with Application to Additive Models Zemei Xu, Daniel F. Schmidt, Enes Makalic, Guoqi Qian, and John L. Hopper Centre for Epidemiology and Biostatistics, Melbourne

More information

On Testing for Informative Selection in Survey Sampling 1. (plus Some Estimation) Jay Breidt Colorado State University

On Testing for Informative Selection in Survey Sampling 1. (plus Some Estimation) Jay Breidt Colorado State University On Testing for Informative Selection in Survey Sampling 1 (plus Some Estimation) Jay Breidt Colorado State University Survey Methods and their Use in Related Fields Neuchâtel, Switzerland August 23, 2018

More information

An Overview of the Pros and Cons of Linearization versus Replication in Establishment Surveys

An Overview of the Pros and Cons of Linearization versus Replication in Establishment Surveys An Overview of the Pros and Cons of Linearization versus Replication in Establishment Surveys Richard Valliant University of Michigan and Joint Program in Survey Methodology University of Maryland 1 Introduction

More information

RESEARCH REPORT. Vanishing auxiliary variables in PPS sampling with applications in microscopy.

RESEARCH REPORT. Vanishing auxiliary variables in PPS sampling with applications in microscopy. CENTRE FOR STOCHASTIC GEOMETRY AND ADVANCED BIOIMAGING 2014 www.csgb.dk RESEARCH REPORT Ina Trolle Andersen, Ute Hahn and Eva B. Vedel Jensen Vanishing auxiliary variables in PPS sampling with applications

More information

Small Area Estimation for Skewed Georeferenced Data

Small Area Estimation for Skewed Georeferenced Data Small Area Estimation for Skewed Georeferenced Data E. Dreassi - A. Petrucci - E. Rocco Department of Statistics, Informatics, Applications "G. Parenti" University of Florence THE FIRST ASIAN ISI SATELLITE

More information

Kneib, Fahrmeir: Supplement to "Structured additive regression for categorical space-time data: A mixed model approach"

Kneib, Fahrmeir: Supplement to Structured additive regression for categorical space-time data: A mixed model approach Kneib, Fahrmeir: Supplement to "Structured additive regression for categorical space-time data: A mixed model approach" Sonderforschungsbereich 386, Paper 43 (25) Online unter: http://epub.ub.uni-muenchen.de/

More information

Effect of outliers on the variable selection by the regularized regression

Effect of outliers on the variable selection by the regularized regression Communications for Statistical Applications and Methods 2018, Vol. 25, No. 2, 235 243 https://doi.org/10.29220/csam.2018.25.2.235 Print ISSN 2287-7843 / Online ISSN 2383-4757 Effect of outliers on the

More information

Variable Selection in Restricted Linear Regression Models. Y. Tuaç 1 and O. Arslan 1

Variable Selection in Restricted Linear Regression Models. Y. Tuaç 1 and O. Arslan 1 Variable Selection in Restricted Linear Regression Models Y. Tuaç 1 and O. Arslan 1 Ankara University, Faculty of Science, Department of Statistics, 06100 Ankara/Turkey ytuac@ankara.edu.tr, oarslan@ankara.edu.tr

More information

Estimation of change in a rotation panel design

Estimation of change in a rotation panel design Int. Statistical Inst.: Proc. 58th World Statistical Congress, 2011, Dublin (Session CPS028) p.4520 Estimation of change in a rotation panel design Andersson, Claes Statistics Sweden S-701 89 Örebro, Sweden

More information

New Local Estimation Procedure for Nonparametric Regression Function of Longitudinal Data

New Local Estimation Procedure for Nonparametric Regression Function of Longitudinal Data ew Local Estimation Procedure for onparametric Regression Function of Longitudinal Data Weixin Yao and Runze Li The Pennsylvania State University Technical Report Series #0-03 College of Health and Human

More information

Comments on: Model-free model-fitting and predictive distributions : Applications to Small Area Statistics and Treatment Effect Estimation

Comments on: Model-free model-fitting and predictive distributions : Applications to Small Area Statistics and Treatment Effect Estimation Article Comments on: Model-free model-fitting and predictive distributions : Applications to Small Area Statistics and Treatment Effect Estimation SPERLICH, Stefan Andréas Reference SPERLICH, Stefan Andréas.

More information

COMS 4771 Regression. Nakul Verma

COMS 4771 Regression. Nakul Verma COMS 4771 Regression Nakul Verma Last time Support Vector Machines Maximum Margin formulation Constrained Optimization Lagrange Duality Theory Convex Optimization SVM dual and Interpretation How get the

More information

Non-parametric bootstrap and small area estimation to mitigate bias in crowdsourced data Simulation study and application to perceived safety

Non-parametric bootstrap and small area estimation to mitigate bias in crowdsourced data Simulation study and application to perceived safety Non-parametric bootstrap and small area estimation to mitigate bias in crowdsourced data Simulation study and application to perceived safety David Buil-Gil, Reka Solymosi Centre for Criminology and Criminal

More information

Model Selection, Estimation, and Bootstrap Smoothing. Bradley Efron Stanford University

Model Selection, Estimation, and Bootstrap Smoothing. Bradley Efron Stanford University Model Selection, Estimation, and Bootstrap Smoothing Bradley Efron Stanford University Estimation After Model Selection Usually: (a) look at data (b) choose model (linear, quad, cubic...?) (c) fit estimates

More information

Successive Difference Replication Variance Estimation in Two-Phase Sampling

Successive Difference Replication Variance Estimation in Two-Phase Sampling Successive Difference Replication Variance Estimation in Two-Phase Sampling Jean D. Opsomer Colorado State University Michael White US Census Bureau F. Jay Breidt Colorado State University Yao Li Colorado

More information

Lecture 3: Statistical Decision Theory (Part II)

Lecture 3: Statistical Decision Theory (Part II) Lecture 3: Statistical Decision Theory (Part II) Hao Helen Zhang Hao Helen Zhang Lecture 3: Statistical Decision Theory (Part II) 1 / 27 Outline of This Note Part I: Statistics Decision Theory (Classical

More information

Final Overview. Introduction to ML. Marek Petrik 4/25/2017

Final Overview. Introduction to ML. Marek Petrik 4/25/2017 Final Overview Introduction to ML Marek Petrik 4/25/2017 This Course: Introduction to Machine Learning Build a foundation for practice and research in ML Basic machine learning concepts: max likelihood,

More information

Sampling from Finite Populations Jill M. Montaquila and Graham Kalton Westat 1600 Research Blvd., Rockville, MD 20850, U.S.A.

Sampling from Finite Populations Jill M. Montaquila and Graham Kalton Westat 1600 Research Blvd., Rockville, MD 20850, U.S.A. Sampling from Finite Populations Jill M. Montaquila and Graham Kalton Westat 1600 Research Blvd., Rockville, MD 20850, U.S.A. Keywords: Survey sampling, finite populations, simple random sampling, systematic

More information

Linear model selection and regularization

Linear model selection and regularization Linear model selection and regularization Problems with linear regression with least square 1. Prediction Accuracy: linear regression has low bias but suffer from high variance, especially when n p. It

More information

Analysing geoadditive regression data: a mixed model approach

Analysing geoadditive regression data: a mixed model approach Analysing geoadditive regression data: a mixed model approach Institut für Statistik, Ludwig-Maximilians-Universität München Joint work with Ludwig Fahrmeir & Stefan Lang 25.11.2005 Spatio-temporal regression

More information

Monte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics

Monte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics Monte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics Amang S. Sukasih, Mathematica Policy Research, Inc. Donsig Jang, Mathematica Policy Research, Inc. Amang S. Sukasih,

More information

EFFICIENCY OF MODEL-ASSISTED REGRESSION ESTIMATORS IN SAMPLE SURVEYS

EFFICIENCY OF MODEL-ASSISTED REGRESSION ESTIMATORS IN SAMPLE SURVEYS Statistica Sinica 24 2014, 395-414 doi:ttp://dx.doi.org/10.5705/ss.2012.064 EFFICIENCY OF MODEL-ASSISTED REGRESSION ESTIMATORS IN SAMPLE SURVEYS Jun Sao 1,2 and Seng Wang 3 1 East Cina Normal University,

More information

Research Article Ratio Type Exponential Estimator for the Estimation of Finite Population Variance under Two-stage Sampling

Research Article Ratio Type Exponential Estimator for the Estimation of Finite Population Variance under Two-stage Sampling Research Journal of Applied Sciences, Engineering and Technology 7(19): 4095-4099, 2014 DOI:10.19026/rjaset.7.772 ISSN: 2040-7459; e-issn: 2040-7467 2014 Maxwell Scientific Publication Corp. Submitted:

More information

Simple design-efficient calibration estimators for rejective and high-entropy sampling

Simple design-efficient calibration estimators for rejective and high-entropy sampling Biometrika (202), 99,, pp. 6 C 202 Biometrika Trust Printed in Great Britain Advance Access publication on 3 July 202 Simple design-efficient calibration estimators for rejective and high-entropy sampling

More information

A Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models

A Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models A Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models Jingyi Jessica Li Department of Statistics University of California, Los

More information

Statistica Sinica Preprint No: SS R2

Statistica Sinica Preprint No: SS R2 Statistica Sinica Preprint No: SS-13-244R2 Title Examining some aspects of balanced sampling in surveys Manuscript ID SS-13-244R2 URL http://www.stat.sinica.edu.tw/statistica/ DOI 10.5705/ss.2013.244 Complete

More information

The Bayesian Approach to Multi-equation Econometric Model Estimation

The Bayesian Approach to Multi-equation Econometric Model Estimation Journal of Statistical and Econometric Methods, vol.3, no.1, 2014, 85-96 ISSN: 2241-0384 (print), 2241-0376 (online) Scienpress Ltd, 2014 The Bayesian Approach to Multi-equation Econometric Model Estimation

More information

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate

More information

Regularization in Cox Frailty Models

Regularization in Cox Frailty Models Regularization in Cox Frailty Models Andreas Groll 1, Trevor Hastie 2, Gerhard Tutz 3 1 Ludwig-Maximilians-Universität Munich, Department of Mathematics, Theresienstraße 39, 80333 Munich, Germany 2 University

More information

Generalized Elastic Net Regression

Generalized Elastic Net Regression Abstract Generalized Elastic Net Regression Geoffroy MOURET Jean-Jules BRAULT Vahid PARTOVINIA This work presents a variation of the elastic net penalization method. We propose applying a combined l 1

More information

SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES

SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES Statistica Sinica 19 (2009), 71-81 SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES Song Xi Chen 1,2 and Chiu Min Wong 3 1 Iowa State University, 2 Peking University and

More information

ASYMPTOTICS FOR PENALIZED SPLINES IN ADDITIVE MODELS

ASYMPTOTICS FOR PENALIZED SPLINES IN ADDITIVE MODELS Mem. Gra. Sci. Eng. Shimane Univ. Series B: Mathematics 47 (2014), pp. 63 71 ASYMPTOTICS FOR PENALIZED SPLINES IN ADDITIVE MODELS TAKUMA YOSHIDA Communicated by Kanta Naito (Received: December 19, 2013)

More information

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Department of Mathematics Carl von Ossietzky University Oldenburg Sonja Greven Department of

More information

LECTURE 2 LINEAR REGRESSION MODEL AND OLS

LECTURE 2 LINEAR REGRESSION MODEL AND OLS SEPTEMBER 29, 2014 LECTURE 2 LINEAR REGRESSION MODEL AND OLS Definitions A common question in econometrics is to study the effect of one group of variables X i, usually called the regressors, on another

More information

A Bootstrap Test for Conditional Symmetry

A Bootstrap Test for Conditional Symmetry ANNALS OF ECONOMICS AND FINANCE 6, 51 61 005) A Bootstrap Test for Conditional Symmetry Liangjun Su Guanghua School of Management, Peking University E-mail: lsu@gsm.pku.edu.cn and Sainan Jin Guanghua School

More information