Spatio-Temporal Latent Variable Models: A Potential Waste of Space and Time?
Francis K.C. Hui (Australian National University)
Nicole Hill (Institute of Marine and Antarctic Studies)
A.H. Welsh (Australian National University)
JSM 2018
Talk Outline:
- SO-CPR survey
- Spatio-temporal LVMs
- Estimation under misspecification
- Some simulations
Some images courtesy of Google images

Take home messages
When fitting spatio-temporal LVMs, if you misspecify and assume latent variables are independent across sites, then:
- Inference on the regression coefficients remains relatively robust, particularly for Gaussian responses
- Inference on the loadings and latent variable predictions is badly off
- You save time!

Southern Ocean Continuous Plankton Recorder Survey
SO-CPR Survey:
- Running annually since 1991
- Vessels of opportunity traversing the Southern Ocean
- Presence-absence of around 100 zooplankton species
- Goal is to identify environmental factors driving species assemblage over time

Spatio-temporal LVMs
Data:
Model:
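
The slide's own data and model equations are not reproduced here. As a rough guide only, a commonly used form of a spatio-temporal generalized linear LVM is sketched below. All notation (y_{itj}, x_{it}, \beta_j, \lambda_j, u_{it}, and the link function g) is an assumption and may differ from the authors' formulation.

```latex
% Assumed notation, not taken from the slides:
%   y_{itj} : response of species j at site i and time t
%   x_{it}  : covariates at site i and time t
%   u_{it}  : d-dimensional latent variable vector at site i and time t
%   \beta_j : species-specific regression coefficients
%   \lambda_j : species-specific loadings
g\{ E(y_{itj} \mid u_{it}) \} = \beta_{0j} + x_{it}^\top \beta_j + u_{it}^\top \lambda_j,
\qquad i = 1, \dots, n; \quad t = 1, \dots, T; \quad j = 1, \dots, m.
```

In a form like this, the latent variables u_{it} induce correlation between species at a given site and time, and any spatio-temporal structure enters through the joint distribution of the u_{it} across sites and times.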

Spatio-temporal LVMs
Likelihood:
In community ecology:
- GLLVMs and space-time extensions are gaining traction in ecology, e.g., Warton et al. (2015), > 100 citations; Ovaskainen et al. (2017), > 50 citations

Spatio-temporal LVMs
What if we assume independence?
- Pro: Save time; lower-dimensional and simpler integral
- Con: Model misspecification; two species are correlated at any particular site-time combination, but otherwise not!
Can we get away with assuming independence for some forms of inference?
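
One way to see why the integral becomes lower-dimensional is to compare the two marginal likelihoods. The sketch below uses assumed notation (\Psi for all parameters, y_{itj} for the response of species j at site i and time t, u_{it} for the d-dimensional latent vector, \phi_d for the standard d-variate normal density); it is not copied from the slides.

```latex
% True model: latent variables are correlated across sites and times,
% so the marginal likelihood is a single integral of dimension nTd.
L(\Psi) = \int \Big\{ \prod_{i=1}^{n} \prod_{t=1}^{T} \prod_{j=1}^{m}
          f(y_{itj} \mid u_{it}, \Psi) \Big\} \, f(u \mid \Psi) \, du,
\qquad u = (u_{11}^\top, \dots, u_{nT}^\top)^\top .

% Independence assumption: the joint density of u factorizes,
% leaving nT separate integrals, each of dimension d.
L_{\mathrm{ind}}(\Psi) = \prod_{i=1}^{n} \prod_{t=1}^{T} \int
          \Big\{ \prod_{j=1}^{m} f(y_{itj} \mid u_{it}, \Psi) \Big\} \,
          \phi_d(u_{it}) \, du_{it} .
```

Under the second form, species remain correlated within a site-time combination through the shared u_{it}, but responses at different sites or times are treated as independent.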

Some Sims: Design
- Interested in regression coefficients, loadings, LV predictions
- T = 1, i.e., purely spatial LVMs. Results expected to carry over to T > 1 (expanding domain?)
- True model is a spatial LVM with d = 3 latent variables. Each latent variable is spatially correlated through an exponential correlation function
- Data generated on a square grid: n = 7 x 7; 10 x 10; 14 x 14; 22 x 22 (expanding domain)
- Compare true and misspecified LVMs
- MLE done using Laplace approximation via Template Model Builder (TMB; Kristensen et al., 2015)
- Standard information matrix for true LVM; sandwich information matrix for misspecified LVM
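
The data-generating step of a design like this can be sketched in a few lines. This is a minimal illustration, not the authors' code: the number of species `m`, the two covariates, the range parameter `range_par`, and the way coefficients and loadings are drawn are all assumptions chosen for the example.

```python
import numpy as np

def simulate_spatial_lvm(side=7, m=10, d=3, range_par=2.0, seed=0):
    """Simulate normal responses from a purely spatial LVM (T = 1).

    m, range_par, the covariates and the parameter draws are illustrative
    assumptions; only the grid layout, d = 3 and the exponential
    correlation function follow the design described above.
    """
    rng = np.random.default_rng(seed)
    # Site coordinates on a side x side square grid
    gx, gy = np.meshgrid(np.arange(side), np.arange(side))
    coords = np.column_stack([gx.ravel(), gy.ravel()]).astype(float)
    n = coords.shape[0]
    # Exponential correlation: corr(s, s') = exp(-||s - s'|| / range_par)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    corr = np.exp(-dist / range_par)
    chol = np.linalg.cholesky(corr + 1e-10 * np.eye(n))
    # d latent variables, each spatially correlated across the n sites
    U = chol @ rng.standard_normal((n, d))
    X = rng.standard_normal((n, 2))          # two hypothetical covariates
    beta0 = rng.normal(size=m)               # species intercepts
    B = rng.normal(size=(2, m))              # regression coefficients
    Lam = rng.normal(size=(d, m))            # loadings
    eta = beta0 + X @ B + U @ Lam            # linear predictor, n x m
    Y = eta + rng.standard_normal((n, m))    # normal responses
    return dict(coords=coords, corr=corr, U=U, X=X, B=B, Lam=Lam, Y=Y)

sim = simulate_spatial_lvm(side=7)
print(sim["Y"].shape)  # (49, 10)
```

Increasing `side` to 10, 14 or 22 adds sites while keeping the grid spacing fixed, which is one way to read the "expanding domain" setup.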

Some Sims: Normal Responses
(most common response type/assumption outside of ecology)
Bias of regression coefficients
- Not much difference between true and misspecified LVMs
- Bias tends to zero irrespective of coefficient value and norm of loading
Root MSE of regression coefficients
- Misspecified LVMs have slightly higher RMSE
- RMSE tends to zero irrespective of coefficient value and norm of loading
Coverage probability of regression coefficients
- Not much difference between true and misspecified LVMs
- Tends to nominal level irrespective of coefficient value and norm of loading
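
For readers who want to reproduce summaries of this kind, the sketch below shows how bias, root MSE, coverage and interval width might be computed from simulation replicates for a single parameter. The function name `summarize`, the Wald-type intervals and the toy inputs are assumptions for illustration, not the authors' code.

```python
import numpy as np

def summarize(estimates, std_errors, truth, z=1.96):
    """Bias, RMSE, coverage and mean width of Wald intervals across replicates."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    bias = estimates.mean() - truth
    rmse = np.sqrt(np.mean((estimates - truth) ** 2))
    lower = estimates - z * std_errors
    upper = estimates + z * std_errors
    # Proportion of replicates whose interval contains the true value
    coverage = np.mean((lower <= truth) & (truth <= upper))
    width = np.mean(upper - lower)
    return {"bias": bias, "rmse": rmse, "coverage": coverage, "ci_width": width}

# Toy replicates: an unbiased estimator with correctly stated standard error
rng = np.random.default_rng(1)
truth = 0.5
est = truth + 0.1 * rng.standard_normal(2000)
se = np.full(2000, 0.1)
res = summarize(est, se, truth)
print(res)
```

With a well-calibrated estimator as in the toy input, the coverage should sit near the nominal 95% level; undercoverage or overcoverage shows up as a departure from that value.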

Some Sims: Normal Responses
Bias of loadings
- Bias larger for misspecified LVMs, particularly at large n and large loading values
- Bias for both LVMs is smaller for loadings close to zero
Root MSE of loadings
- RMSE larger for misspecified LVMs, particularly at large n and large loading values
- RMSE for both LVMs is small for loadings close to zero
Coverage probability of loadings
- Misspecified LVMs suffer major undercoverage at large n and larger loading values (but OK for truly zero loadings)
- Both true and misspecified LVMs undercover at small n

Some Sims: Normal Responses
Prediction of latent variables
- Misspecified LVMs have substantially higher Procrustes error; both tend to zero with large n
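
Latent variables in models of this kind are typically identified only up to rotation, which is why predictions are compared after Procrustes alignment. The sketch below computes one possible version of a Procrustes error (centring, then the best orthogonal rotation, then the sum of squared differences). The exact definition used in the talk is not stated, so this particular form is an assumption.

```python
import numpy as np

def procrustes_error(U_true, U_hat):
    """Sum of squared differences after centring and optimally rotating U_hat onto U_true."""
    A = U_true - U_true.mean(axis=0)
    B = U_hat - U_hat.mean(axis=0)
    # Orthogonal Procrustes: the orthogonal R minimising ||B @ R - A||_F
    # is W @ Vt, where W and Vt come from the SVD of B.T @ A
    W, _, Vt = np.linalg.svd(B.T @ A)
    R = W @ Vt
    return float(np.sum((B @ R - A) ** 2))

rng = np.random.default_rng(0)
U_true = rng.standard_normal((49, 3))               # n = 49 sites, d = 3
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))    # random orthogonal matrix
# A pure rotation of the truth should give an error of (numerically) zero
err_rotated = procrustes_error(U_true, U_true @ Q)
# Adding noise to the rotated truth should give a clearly positive error
err_noisy = procrustes_error(U_true, U_true @ Q + 0.5 * rng.standard_normal((49, 3)))
print(err_rotated, err_noisy)
```

A scaled variant (dividing by the total sum of squares of the centred truth) is also common; either could be substituted here without changing the comparison between true and misspecified fits.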

Some Sims: Normal Responses
Computation times
- Misspecified LVMs + sandwich information are much faster to fit and scale better with n
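
The sandwich idea can be illustrated on a much simpler problem than an LVM. The sketch below fits a linear regression under a deliberately wrong working model (unit-variance normal errors) and forms the sandwich covariance H^{-1} J H^{-1} from per-observation scores. The helper `sandwich_cov` and the toy data are assumptions for illustration; the talk's TMB-based computation is not shown in the slides, and treating per-observation scores as independent contributions is itself a simplifying assumption.

```python
import numpy as np

def sandwich_cov(scores, hessian):
    """Sandwich covariance H^{-1} J H^{-1}.

    scores  : n x p matrix of per-observation score contributions
    hessian : p x p matrix, minus the second derivative of the working log-likelihood
    """
    J = scores.T @ scores                  # "meat": outer product of scores
    H_inv = np.linalg.inv(hessian)         # "bread"
    return H_inv @ J @ H_inv

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
# True errors are heteroscedastic, so a constant-variance working model is misspecified
y = X @ np.array([1.0, 2.0]) + (1 + np.abs(X[:, 1])) * rng.standard_normal(n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
scores = X * resid[:, None]                # score of the unit-variance normal working likelihood
hessian = X.T @ X
V_sand = sandwich_cov(scores, hessian)
V_naive = np.linalg.inv(hessian) * resid.var()   # model-based covariance, constant variance assumed
print(np.sqrt(np.diag(V_sand)), np.sqrt(np.diag(V_naive)))
```

The point estimate is the same under both covariance formulas; only the uncertainty statement changes, which is why the sandwich form is a natural companion to a deliberately simplified likelihood.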

Some Sims: Binary Responses
(presence-absence data in ecology; least information inherent in data)
Bias of regression coefficients
- Not much difference between true and misspecified LVMs
- Bias very close to zero for small betas
- Evidence of an effect of norm of loadings
Root MSE of regression coefficients
- Misspecified LVMs tend to have higher RMSE at larger coefficient values and norm of loadings
- RMSE very close to zero (but more variable) for small betas
- Clear effect of norm of loadings
Coverage probability of regression coefficients
- At small n, misspecified LVMs overcover while true LVMs undercover
- At large n, misspecified LVMs undercover at larger coefficient values and norm of loadings
- Clear effect of norm of loadings
- Coverage very close to nominal level (but more variable) for small betas

Some Sims: Binary Responses
Results for loadings and latent variable predictions in the case of binary spatio-temporal LVMs are similar to the normal response case. Under misspecification:
- Point estimation is badly biased
- Severe undercoverage for sandwich CIs
- Much higher Procrustes errors for predictions of LVs
Computation times are again much faster under misspecification

Take home messages
When fitting spatio-temporal LVMs, if you misspecify and assume latent variables are independent across sites, then:
- Inference on the regression coefficients remains relatively robust, particularly for Gaussian responses
- Inference on the loadings and latent variable predictions is badly off
- You save time!

Discussion
- There is some theory but I didn't have time to discuss it, e.g., full consistency of coefficients for normal responses; zero consistency of coefficients for non-normal responses; zero consistency of loadings.
- If your goal is variable selection in LVMs, then misspecifying and assuming independence does not hurt you very much... how much efficiency do you lose versus how much can you get away with?
Thanks for listening!

Southern Ocean Continuous Plankton Recorder Survey
- Every site visited once
- Expanding domain? Infill? Hybrid? Neither?

Estimation Under Misspecification
What asymptotic framework should we work within?
A pragmatic regularity condition:
- Implies MLEs of true spatio-temporal LVM are consistent
- Focus not on how to obtain consistency, but on what happens under misspecification supposing you have consistency to start with.

Estimation Under Misspecification
Full consistency for coefficients when responses are normal:
- Solving unbiased generalized least squares equation
Zero consistency for coefficients in general:
- For covariates that are uninformative for all species, the misspecified LVM will consistently estimate these zeros.
- Weak: says nothing about fully/partly informative predictors
- Very strict assumption of independence between the truly informative and uninformative covariates
- Same conditions made for studying misspecification in generalized linear mixed models, e.g., Neuhaus, McCulloch, and Boylan (2013)
- Analogous to partial orthogonality condition assumed in high-dimensional variable selection, e.g., Huang, Horowitz, and Ma (2008); Fan and Song (2010)
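
The intuition behind full consistency for normal responses can be checked with a small Monte Carlo experiment: with spatially correlated normal errors, least squares that ignores the correlation still solves an unbiased estimating equation for the regression coefficients, though it is less efficient than GLS with the true covariance. The sketch below is a single-response simplification with assumed values (7 x 7 grid, range parameter 2, one covariate); it is not the multi-species LVM from the talk.

```python
import numpy as np

rng = np.random.default_rng(42)
side = 7
gx, gy = np.meshgrid(np.arange(side), np.arange(side))
coords = np.column_stack([gx.ravel(), gy.ravel()]).astype(float)
n = len(coords)

# Error covariance: exponentially correlated spatial part plus independent noise
dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
Sigma = np.exp(-dist / 2.0) + np.eye(n)
L = np.linalg.cholesky(Sigma)
Sigma_inv = np.linalg.inv(Sigma)

X = np.column_stack([np.ones(n), rng.standard_normal(n)])
beta = np.array([0.5, -1.0])

ols, gls = [], []
for _ in range(2000):
    y = X @ beta + L @ rng.standard_normal(n)
    # Working independence: ordinary least squares
    ols.append(np.linalg.solve(X.T @ X, X.T @ y))
    # True covariance: generalized least squares
    gls.append(np.linalg.solve(X.T @ Sigma_inv @ X, X.T @ Sigma_inv @ y))
ols, gls = np.array(ols), np.array(gls)

bias_ols = ols.mean(axis=0) - beta
bias_gls = gls.mean(axis=0) - beta
print("bias (independence):", bias_ols)
print("bias (true covariance):", bias_gls)
print("variance ratio OLS/GLS:", ols.var(axis=0) / gls.var(axis=0))
```

Both sets of estimates should centre on the true coefficients, with the independence-based estimates showing somewhat larger spread. This argument relies on the linear mean structure and does not carry over directly to non-normal responses.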

Estimation Under Misspecification
Zero consistency for loadings:
- If a species is uncorrelated with anything else, then assuming independence does no harm
- Not a very realistic assumption in ecology, but possible in other settings, e.g., social sciences

Some Sims: Normal Responses
Bias of regression coefficients
- Not much difference between true and misspecified LVMs
- Bias tends to zero irrespective of coefficient value and norm of loading
Root MSE of regression coefficients
- Misspecified LVMs have slightly higher RMSE
- RMSE tends to zero irrespective of coefficient value and norm of loading
Coverage probability of regression coefficients
- Not much difference between true and misspecified LVMs
- Tends to nominal level irrespective of coefficient value and norm of loading
CI width of regression coefficients
- Sandwich CIs from misspecified LVMs are wider

Some Sims: Binary Responses
Bias of regression coefficients
- Not much difference between true and misspecified LVMs; biases very close to zero for small betas (zero consistency for coefficients)
- Evidence of an effect of norm of loadings
Root MSE of regression coefficients
- Misspecified LVMs tend to have higher RMSE at larger coefficient values and norm of loadings
- Clear effect of norm of loadings
- RMSE very close to zero (but more variable) for small betas (zero consistency for coefficients)
Coverage probability of regression coefficients
- Misspecified LVMs overcover while true LVMs undercover at small n; at large n, misspecified LVMs undercover at larger coefficient values and norm of loadings
- Clear effect of norm of loadings
- Coverage very close to nominal level (but more variable) for small betas
CI width of regression coefficients
- Sandwich CIs from misspecified LVMs are huge at small n!
- Differences between true and misspecified LVMs are small at larger n

Some Sims: Binary Responses
Bias of loadings
- Misspecified LVMs substantially more biased at large n and large loading values (zero consistency for loadings)
Root MSE of loadings
- True and misspecified LVMs perform similarly and poorly at small n, but at large n misspecified LVMs have substantially more variability
- RMSE for misspecified LVMs is close to zero (but more variable) for loadings close to zero (zero consistency for loadings)
Coverage probability of loadings
- Misspecified LVMs suffer major undercoverage at large n and larger loading values (but OK for truly zero loadings)
- True LVMs undercover at smaller n
- Results for interval width [not presented] show sandwich CIs for misspecified LVMs are wider than standard CIs for true LVMs, sometimes considerably so.

Some Sims: Binary Responses
Prediction of latent variables
- Misspecified LVMs have substantially higher Procrustes error than true LVMs, especially at large n

Some Sims: Binary Responses
Computation times
- Misspecified LVMs + sandwich information are much faster to fit and scale better with n
More information5601 Notes: The Sandwich Estimator
560 Notes: The Sandwich Estimator Charles J. Geyer December 6, 2003 Contents Maximum Likelihood Estimation 2. Likelihood for One Observation................... 2.2 Likelihood for Many IID Observations...............
More informationPart 2: Multivariate fmri analysis using a sparsifying spatio-temporal prior
Chalmers Machine Learning Summer School Approximate message passing and biomedicine Part 2: Multivariate fmri analysis using a sparsifying spatio-temporal prior Tom Heskes joint work with Marcel van Gerven
More informationLatent Variable Centering of Predictors and Mediators in Multilevel and Time-Series Models
Latent Variable Centering of Predictors and Mediators in Multilevel and Time-Series Models Tihomir Asparouhov and Bengt Muthén August 5, 2018 Abstract We discuss different methods for centering a predictor
More informationSupporting Information for Estimating restricted mean. treatment effects with stacked survival models
Supporting Information for Estimating restricted mean treatment effects with stacked survival models Andrew Wey, David Vock, John Connett, and Kyle Rudser Section 1 presents several extensions to the simulation
More informationApproximate Median Regression via the Box-Cox Transformation
Approximate Median Regression via the Box-Cox Transformation Garrett M. Fitzmaurice,StuartR.Lipsitz, and Michael Parzen Median regression is used increasingly in many different areas of applications. The
More informationUnderstanding PLS path modeling parameters estimates: a study based on Monte Carlo simulation and customer satisfaction surveys
Understanding PLS path modeling parameters estimates: a study based on Monte Carlo simulation and customer satisfaction surveys Emmanuel Jakobowicz CEDRIC-CNAM - 292 rue Saint Martin - 75141 Paris Cedex
More informationHigh Dimensional Propensity Score Estimation via Covariate Balancing
High Dimensional Propensity Score Estimation via Covariate Balancing Kosuke Imai Princeton University Talk at Columbia University May 13, 2017 Joint work with Yang Ning and Sida Peng Kosuke Imai (Princeton)
More informationControlling for Time Invariant Heterogeneity
Controlling for Time Invariant Heterogeneity Yona Rubinstein July 2016 Yona Rubinstein (LSE) Controlling for Time Invariant Heterogeneity 07/16 1 / 19 Observables and Unobservables Confounding Factors
More informationDSGE Methods. Estimation of DSGE models: GMM and Indirect Inference. Willi Mutschler, M.Sc.
DSGE Methods Estimation of DSGE models: GMM and Indirect Inference Willi Mutschler, M.Sc. Institute of Econometrics and Economic Statistics University of Münster willi.mutschler@wiwi.uni-muenster.de Summer
More informationEconometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018
Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate
More informationGeostatistical Modeling for Large Data Sets: Low-rank methods
Geostatistical Modeling for Large Data Sets: Low-rank methods Whitney Huang, Kelly-Ann Dixon Hamil, and Zizhuang Wu Department of Statistics Purdue University February 22, 2016 Outline Motivation Low-rank
More informationNonconvex penalties: Signal-to-noise ratio and algorithms
Nonconvex penalties: Signal-to-noise ratio and algorithms Patrick Breheny March 21 Patrick Breheny High-Dimensional Data Analysis (BIOS 7600) 1/22 Introduction In today s lecture, we will return to nonconvex
More informationCausal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions
Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Joe Schafer Office of the Associate Director for Research and Methodology U.S. Census
More informationSSR = The sum of squared errors measures how much Y varies around the regression line n. It happily turns out that SSR + SSE = SSTO.
Analysis of variance approach to regression If x is useless, i.e. β 1 = 0, then E(Y i ) = β 0. In this case β 0 is estimated by Ȳ. The ith deviation about this grand mean can be written: deviation about
More informationUniversity of California, Berkeley
University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2008 Paper 241 A Note on Risk Prediction for Case-Control Studies Sherri Rose Mark J. van der Laan Division
More informationFitting Multidimensional Latent Variable Models using an Efficient Laplace Approximation
Fitting Multidimensional Latent Variable Models using an Efficient Laplace Approximation Dimitris Rizopoulos Department of Biostatistics, Erasmus University Medical Center, the Netherlands d.rizopoulos@erasmusmc.nl
More informationLinear discriminant functions
Andrea Passerini passerini@disi.unitn.it Machine Learning Discriminative learning Discriminative vs generative Generative learning assumes knowledge of the distribution governing the data Discriminative
More informationMachine Learning for OR & FE
Machine Learning for OR & FE Regression II: Regularization and Shrinkage Methods Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com
More informationReconciling factor-based and composite-based approaches to structural equation modeling
Reconciling factor-based and composite-based approaches to structural equation modeling Edward E. Rigdon (erigdon@gsu.edu) Modern Modeling Methods Conference May 20, 2015 Thesis: Arguments for factor-based
More informationDSGE-Models. Limited Information Estimation General Method of Moments and Indirect Inference
DSGE-Models General Method of Moments and Indirect Inference Dr. Andrea Beccarini Willi Mutschler, M.Sc. Institute of Econometrics and Economic Statistics University of Münster willi.mutschler@uni-muenster.de
More informationNonparametric Bayesian Methods (Gaussian Processes)
[70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent
More informationLinear Regression 9/23/17. Simple linear regression. Advertising sales: Variance changes based on # of TVs. Advertising sales: Normal error?
Simple linear regression Linear Regression Nicole Beckage y " = β % + β ' x " + ε so y* " = β+ % + β+ ' x " Method to assess and evaluate the correlation between two (continuous) variables. The slope of
More informationVariable Selection in Restricted Linear Regression Models. Y. Tuaç 1 and O. Arslan 1
Variable Selection in Restricted Linear Regression Models Y. Tuaç 1 and O. Arslan 1 Ankara University, Faculty of Science, Department of Statistics, 06100 Ankara/Turkey ytuac@ankara.edu.tr, oarslan@ankara.edu.tr
More informationA COMPARISON OF HETEROSCEDASTICITY ROBUST STANDARD ERRORS AND NONPARAMETRIC GENERALIZED LEAST SQUARES
A COMPARISON OF HETEROSCEDASTICITY ROBUST STANDARD ERRORS AND NONPARAMETRIC GENERALIZED LEAST SQUARES MICHAEL O HARA AND CHRISTOPHER F. PARMETER Abstract. This paper presents a Monte Carlo comparison of
More informationBAYESIAN KRIGING AND BAYESIAN NETWORK DESIGN
BAYESIAN KRIGING AND BAYESIAN NETWORK DESIGN Richard L. Smith Department of Statistics and Operations Research University of North Carolina Chapel Hill, N.C., U.S.A. J. Stuart Hunter Lecture TIES 2004
More informationGravity Waves from Southern Ocean Islands and the Southern Hemisphere Circulation
Gravity Waves from Southern Ocean Islands and the Southern Hemisphere Circulation Chaim Garfinkel 1, Luke Oman 2 1. Earth Science Institute, Hebrew University 2 NASA GSFC ECMWF, September 2016 Topographic
More informationSimple Regression Model Setup Estimation Inference Prediction. Model Diagnostic. Multiple Regression. Model Setup and Estimation.
Statistical Computation Math 475 Jimin Ding Department of Mathematics Washington University in St. Louis www.math.wustl.edu/ jmding/math475/index.html October 10, 2013 Ridge Part IV October 10, 2013 1
More informationCovariate Balancing Propensity Score for General Treatment Regimes
Covariate Balancing Propensity Score for General Treatment Regimes Kosuke Imai Princeton University October 14, 2014 Talk at the Department of Psychiatry, Columbia University Joint work with Christian
More informationBayesian Analysis of Latent Variable Models using Mplus
Bayesian Analysis of Latent Variable Models using Mplus Tihomir Asparouhov and Bengt Muthén Version 2 June 29, 2010 1 1 Introduction In this paper we describe some of the modeling possibilities that are
More informationVariational Approximations for Generalized Linear. Latent Variable Models
1 2 Variational Approximations for Generalized Linear Latent Variable Models 3 4 Francis K.C. Hui 1, David I. Warton 2,3, John T. Ormerod 4,5, Viivi Haapaniemi 6, and Sara Taskinen 6 5 6 7 8 9 10 11 12
More informationDistribution-free ROC Analysis Using Binary Regression Techniques
Distribution-free ROC Analysis Using Binary Regression Techniques Todd A. Alonzo and Margaret S. Pepe As interpreted by: Andrew J. Spieker University of Washington Dept. of Biostatistics Update Talk 1
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationGeneralized, Linear, and Mixed Models
Generalized, Linear, and Mixed Models CHARLES E. McCULLOCH SHAYLER.SEARLE Departments of Statistical Science and Biometrics Cornell University A WILEY-INTERSCIENCE PUBLICATION JOHN WILEY & SONS, INC. New
More informationA Practical Comparison of the Bivariate Probit and Linear IV Estimators
Public Disclosure Authorized Public Disclosure Authorized Public Disclosure Authorized Public Disclosure Authorized Policy Research Working Paper 56 A Practical Comparison of the Bivariate Probit and Linear
More informationDiscussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon
Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Jianqing Fan Department of Statistics Chinese University of Hong Kong AND Department of Statistics
More informationBayesian Regression (1/31/13)
STA613/CBB540: Statistical methods in computational biology Bayesian Regression (1/31/13) Lecturer: Barbara Engelhardt Scribe: Amanda Lea 1 Bayesian Paradigm Bayesian methods ask: given that I have observed
More informationAccounting for Population Uncertainty in Covariance Structure Analysis
Accounting for Population Uncertainty in Structure Analysis Boston College May 21, 2013 Joint work with: Michael W. Browne The Ohio State University matrix among observed variables are usually implied
More informationMachine Learning. Lecture 4: Regularization and Bayesian Statistics. Feng Li. https://funglee.github.io
Machine Learning Lecture 4: Regularization and Bayesian Statistics Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 207 Overfitting Problem
More informationThe Impact of Model Misspecification in Clustered and Continuous Growth Modeling
The Impact of Model Misspecification in Clustered and Continuous Growth Modeling Daniel J. Bauer Odum Institute for Research in Social Science The University of North Carolina at Chapel Hill Patrick J.
More informationProbabilistic modeling. The slides are closely adapted from Subhransu Maji s slides
Probabilistic modeling The slides are closely adapted from Subhransu Maji s slides Overview So far the models and algorithms you have learned about are relatively disconnected Probabilistic modeling framework
More informationWeighting Methods. Harvard University STAT186/GOV2002 CAUSAL INFERENCE. Fall Kosuke Imai
Weighting Methods Kosuke Imai Harvard University STAT186/GOV2002 CAUSAL INFERENCE Fall 2018 Kosuke Imai (Harvard) Weighting Methods Stat186/Gov2002 Fall 2018 1 / 13 Motivation Matching methods for improving
More informationUsing Bayesian Priors for More Flexible Latent Class Analysis
Using Bayesian Priors for More Flexible Latent Class Analysis Tihomir Asparouhov Bengt Muthén Abstract Latent class analysis is based on the assumption that within each class the observed class indicator
More informationEco517 Fall 2014 C. Sims FINAL EXAM
Eco517 Fall 2014 C. Sims FINAL EXAM This is a three hour exam. You may refer to books, notes, or computer equipment during the exam. You may not communicate, either electronically or in any other way,
More informationfinite-sample optimal estimation and inference on average treatment effects under unconfoundedness
finite-sample optimal estimation and inference on average treatment effects under unconfoundedness Timothy Armstrong (Yale University) Michal Kolesár (Princeton University) September 2017 Introduction
More informationThe performance of estimation methods for generalized linear mixed models
University of Wollongong Research Online University of Wollongong Thesis Collection 1954-2016 University of Wollongong Thesis Collections 2008 The performance of estimation methods for generalized linear
More informationSpatial Temporal Models for Retail Credit
Spatial Temporal Models for Retail Credit Bob Stine The School, University of Pennsylvania stat.wharton.upenn.edu/~stine Introduction Exploratory analysis Trends and maps Outline Measuring spatial association
More information