Inference Methods for the Conditional Logistic Regression Model with Longitudinal Data Arising from Animal Habitat Selection Studies

1 Inference Methods for the Conditional Logistic Regression Model with Longitudinal Data Arising from Animal Habitat Selection Studies
Thierry Duchesne (Thierry.Duchesne@mat.ulaval.ca), with Radu Craiu, Daniel Fortin, Sophie Baillargeon
Département de mathématiques et de statistique, Université Laval; Department of Statistics, University of Toronto; Département de biologie, Université Laval
Department of Statistics Seminar, University of Manitoba, October 28
Research funded by NSERC.

2 Outline
1. Introduction: Research objectives; Sampling designs; Data available; Methodological objectives
2. Conditional logistic regression: Model and notation; Justification of conditional logistic regression
3. Population averaged inference: Method; Example of application
4. Subject specific inference: Method; Example of application
5. Conclusion
6. References

4 Research objectives Objectives of our research
Ecological objectives: For the biologists, it is important to understand the links between various attributes of a landscape and how animals select their habitat (or move within their home range).
Statistical objectives: What are the appropriate sampling designs? What are the possible statistical models? How do we make inference on the model parameters?

7 Sampling designs Possible study designs
Unmatched used vs unused (or available) designs
Useful to determine what landscape attributes predict whether a location is likely to be used over a specified time frame (e.g., trees with nests vs trees without nests).
Usually analyzed with logistic regression ($Y_i = 1$ if location $i$ is used, $Y_i = 0$ otherwise). To be used with care since, in some contexts, available ≠ unused.
If the sampling unit is the animal (with many used locations per animal), then within-animal correlation must be taken into consideration: GEE (population-averaged) or mixed models (subject-specific) are used. Again, care must be exercised w.r.t. the available/unused locations.

10 Sampling designs Possible study designs
Matched designs
For each location used (or step traveled) by an animal, $m$ unused locations that could have been visited by the same animal at the same time are sampled.
The dataset consists of several such matched strata for each animal.
This design does not allow inference on the absolute probability of use of a precise location, but it does allow inference on the probability of choosing a given location among a set of locations when the location attributes are given.

11 Sampling designs Matched design
E.g., each location is matched with 10 locations picked at random among those that could have been used at the same time.
Step Selection Functions. Fortin et al., Ecology 86(5).

13 Data available Part I: Data on the available location We have a detailed GIS database of Prince Albert National Park

14 Data available Part II: Animal location data For each of K animals (female bison), GPS collars give their precise location at a large number of equally spaced time steps

15 Methodological objectives Our precise statistical problems In some cases, we can get more than one Y = 1 in a stratum: e.g., a pair of animals traveling together. How do we make inferences on the preferences of the animals for given landscape attributes under such a sampling design? We will see that this can be done if we can come up with a longitudinal version of conditional logistic regression.

19 Model and notation Notation
Animals: $c = 1, 2, \ldots, K$; strata: $j = 1, 2, \ldots, S_c$; locations: $i = 1, 2, \ldots, n$.
Response variable: $y_{ji}^{(c)} = 1$ if animal $c$ was at location $i$ in the $j$-th stratum, 0 otherwise.
Covariates: values of the landscape attributes at location $i$ in stratum $j$ of animal $c$: $x_{ji}^{(c)} = (x_{ji1}^{(c)}, \ldots, x_{jip}^{(c)})^\top$.
Sampling design: by design, it is known before sampling that $\sum_{i=1}^{n} y_{ji}^{(c)} = m$ for all $j, c$.

21 Model and notation Prospective model
If we sampled locations without knowing the value of the $y_{ji}^{(c)}$ in advance (i.e., a prospective study), we could link the landscape attributes $x_{ji}^{(c)}$ with $y_{ji}^{(c)}$ using logistic regression-type models.
E.g., given i.i.d. $N(0, \Sigma)$ vectors of animal-level random effects, say $b_c$, and the covariates, it is assumed that the $y_{ji}^{(c)}$ are independent with
$$\Pr\left[ y_{ji}^{(c)} = 1 \mid b_c, x_{ji}^{(c)} \right] = \frac{\exp\left( \beta^\top x_{ji}^{(c)} + b_c^\top z_{ji}^{(c)} \right)}{1 + \exp\left( \beta^\top x_{ji}^{(c)} + b_c^\top z_{ji}^{(c)} \right)}.$$
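As a minimal sketch of the prospective model above, the following Python snippet evaluates the probability of use for one location given a draw of the random effects. The function name `prob_use` and all numerical values are hypothetical and chosen only for illustration; they do not come from the talk.

```python
import math

def prob_use(beta, x, b, z):
    """Pr(y = 1 | b, x, z) under the mixed-effects logistic (prospective) model."""
    # Linear predictor: fixed part beta'x plus random part b'z.
    eta = sum(bk * xk for bk, xk in zip(beta, x)) + sum(bk * zk for bk, zk in zip(b, z))
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical values, for illustration only.
beta = [0.8, -0.5]   # fixed effects
x = [1.0, 2.0]       # landscape attributes at the location
b = [0.3]            # one draw of an animal-level random intercept
z = [1.0]            # design vector for the random effect

p = prob_use(beta, x, b, z)
print(round(p, 4))
```

Here the random effect enters as a random intercept (z = 1); a random slope would simply use a landscape attribute as the entry of z.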

22 Model and notation Resource selection function
The exponential of the linear predictor is sometimes called the resource selection function (RSF). Maps of its value can help to assess animal preferences.

24 Justification of conditional logistic regression Retrospective model
When location $i$ in stratum $j$ of animal $c$ is sampled on the basis of its $y_{ji}^{(c)}$ value, how can we infer about $\beta$ (and possibly $\Sigma$) in the prospective model?
Using arguments based on conditional likelihood (e.g., Hosmer & Lemeshow 2000), on discrete choice theory (e.g., Manly et al. 2002, Train 2003) or on movement kernels (e.g., Forester et al. 2009), we get that a good way to deal with the retrospective design is conditional logistic regression.

25 Justification of conditional logistic regression Conditional likelihood
If we suppose that $b_c^\top z_{ji}^{(c)} = b_c$ in the prospective model, then we get that
$$\Pr\left[ y_{ji}^{(c)}, i = 1,\ldots,n \,\middle|\, b_c, \sum_{i=1}^{n} y_{ji}^{(c)} = m, x_{ji}^{(c)}, i = 1,\ldots,n \right] = \frac{\exp\left( \sum_{i=1}^{n} \beta^\top x_{ji}^{(c)} y_{ji}^{(c)} \right)}{\sum_{l=1}^{\binom{n}{m}} \exp\left( \sum_{i=1}^{n} \beta^\top x_{ji}^{(c)} v_{li} \right)},$$
where the sum in the denominator is over all vectors $v_l$ of zeros and ones whose elements sum to $m$.
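The conditional probability above can be evaluated directly by enumerating all 0/1 vectors with exactly m ones. The sketch below does this for a single stratum; the function name `conditional_prob` and the toy data are assumptions made for illustration. Note that a stratum-constant intercept would multiply numerator and denominator by the same factor, which is why it does not appear.

```python
import math
from itertools import combinations

def conditional_prob(beta, xs, ys):
    """Probability of the observed 0/1 pattern ys in one stratum, given sum(ys) = m."""
    n, m = len(xs), sum(ys)
    eta = [sum(bk * xk for bk, xk in zip(beta, x)) for x in xs]
    num = math.exp(sum(e for e, y in zip(eta, ys) if y == 1))
    # Denominator: one term per way of placing the m ones among the n locations.
    den = sum(math.exp(sum(eta[i] for i in used))
              for used in combinations(range(n), m))
    return num / den

# Hypothetical stratum: n = 4 locations, one covariate, m = 2 used locations.
beta = [0.5]
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [0, 1, 0, 1]
print(conditional_prob(beta, xs, ys))
```

Enumeration costs on the order of n-choose-m terms per stratum, which is fine for small strata such as those in the examples of this talk but would call for a recursive formula with larger m.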

26 Justification of conditional logistic regression Exponential movement kernels (Forester et al 2009)
Suppose the animal is at location $a$ at time step $t$. All locations in the set $D_a$ are reachable by the animal until time step $t + 1$.
Assume that the density of movement from a point $a$ to a point $b$ in a homogeneous baseline landscape over one time step is given by $\phi(d_{ab})$, where $d_{ab}$ is the distance between $a$ and $b$.
Suppose that habitat characteristics have a log-linear effect on the movement kernel. Then
$$f(b \mid a, x_s, s \in D_a) = \frac{\phi(d_{ab}) \exp(\beta^\top x_b)}{\int_{s \in D_a} \phi(d_{as}) \exp(\beta^\top x_s)\, ds}.$$

27 Justification of conditional logistic regression Exponential movement kernels (Forester et al 2009)
Evaluation of the integral in the denominator can be replaced by an approximating sum. Forester et al (2009) show that if a sample $S_a$ consisting of $b$ and $n - 1$ other locations in $D_a$ is appropriately sampled, then
$$f(b \mid a, x_l, l \in D_a) = \frac{\phi(d_{ab}) \exp(\beta^\top x_b)}{\int_{l \in D_a} \phi(d_{al}) \exp(\beta^\top x_l)\, dl} \approx \frac{\exp(\beta^\top x_b)}{\sum_{l \in S_a} \exp(\beta^\top x_l)},$$
which is the probability of conditional logistic regression when $m = 1$ and the location with $y = 1$ is $b$.

28 Method Data and assumptions
Now back to the general problem: $K$ animals, $S^{(c)}$ strata observed for animal $c$, $m$ cases (locations with $y = 1$) and $n - m$ controls (locations with $y = 0$) in each stratum.
We want to make population averaged inference about $\beta$ in the prospective model.
It is assumed that the data can be partitioned into uncorrelated clusters (data from different animals are uncorrelated, or clusters of observations on the same animal are taken several time units apart).

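29 Method Craiu et al (2008)
We showed that the likelihood score function of the retrospective model can be rewritten as
$$U(\beta) = \sum_{c=1}^{K} \sum_{j=1}^{S^{(c)}} \sum_{i=2}^{n} \tilde{x}_{ji}^{(c)} \left\{ y_{ji}^{(c)} - \frac{\sum_{l=1}^{\binom{n}{m}} v_{li} \exp\left( \sum_{h=2}^{n} \beta^\top \tilde{x}_{jh}^{(c)} v_{lh} \right)}{\sum_{l=1}^{\binom{n}{m}} \exp\left( \sum_{h=2}^{n} \beta^\top \tilde{x}_{jh}^{(c)} v_{lh} \right)} \right\} = \sum_{c=1}^{K} D^{(c)\top} \left( V_{\mathrm{Indep}}^{(c)} \right)^{-1} \left\{ \tilde{Y}^{(c)} - \mu(\beta) \right\},$$
where $\tilde{x}_{ji}^{(c)} = x_{ji}^{(c)} - x_{j1}^{(c)}$, $\tilde{Y}^{(c)}$ is the vector of all responses without the $y_{j1}^{(c)}$'s, and $\mu(\beta) = E_{\mathrm{Retro}}[\tilde{Y}^{(c)} \mid X^{(c)}]$.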
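To make the score expression concrete, here is a small sketch that computes the contribution of one stratum using covariates centred on the first location, and a log conditional likelihood that can be used to check it numerically. The function names and the toy stratum are hypothetical; this is an illustration of the formula, not the authors' implementation.

```python
import math
from itertools import combinations

def log_cond_lik(beta, xs, ys):
    """Log conditional likelihood of one stratum (uncentred covariates)."""
    n, m = len(xs), sum(ys)
    eta = [sum(b * v for b, v in zip(beta, x)) for x in xs]
    num = sum(e * y for e, y in zip(eta, ys))
    den = sum(math.exp(sum(eta[i] for i in used))
              for used in combinations(range(n), m))
    return num - math.log(den)

def stratum_score(beta, xs, ys):
    """Score contribution of one stratum, with covariates centred on location 1."""
    n, m, p = len(xs), sum(ys), len(beta)
    xt = [[xs[i][k] - xs[0][k] for k in range(p)] for i in range(n)]  # row 0 is zero
    eta = [sum(b * v for b, v in zip(beta, row)) for row in xt]
    subsets = list(combinations(range(n), m))
    w = [math.exp(sum(eta[h] for h in used)) for used in subsets]
    total = sum(w)
    score = [0.0] * p
    for i in range(1, n):  # i = 2, ..., n in the 1-based notation of the slide
        # Retrospective mean of y_i: weighted share of the subsets that contain i.
        mu_i = sum(wl for wl, used in zip(w, subsets) if i in used) / total
        for k in range(p):
            score[k] += xt[i][k] * (ys[i] - mu_i)
    return score

# Hypothetical stratum: n = 5 locations, p = 2 covariates, m = 2 used locations.
beta = [0.4, -0.3]
xs = [[0.2, 1.0], [1.5, 0.0], [0.7, 1.0], [2.1, 0.0], [0.9, 1.0]]
ys = [0, 1, 0, 1, 0]
score = stratum_score(beta, xs, ys)
print(score)
```

Centring on the first location leaves the score unchanged because the number of ones in every candidate vector is fixed at m, so the subtracted term cancels.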

32 Method Advantages
With the robust (sandwich) estimate of $\mathrm{Var}(\hat\beta)$, inferences about $\beta$ are valid no matter what the correlation structure within clusters is, as long as data are uncorrelated between clusters.
$U(\beta)$ is the partial likelihood score for the Cox model for discrete data, so PROC PHREG or coxph() can be used to apply the method.
Simulations have shown that inferences are good in finite samples:
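The idea behind the robust estimate can be sketched for a scalar coefficient and m = 1: the naive variance is the inverse information, while the sandwich variance uses the squared cluster-level score sums in the middle. The helper names and the toy data below are assumptions for illustration; in practice the quantities would be evaluated at the estimate returned by software such as coxph().

```python
import math

def stratum_terms(beta, xs, used):
    """Score and information of one m = 1 stratum for a scalar beta."""
    w = [math.exp(beta * x) for x in xs]
    tot = sum(w)
    mean = sum(wi * x for wi, x in zip(w, xs)) / tot
    second = sum(wi * x * x for wi, x in zip(w, xs)) / tot
    return xs[used] - mean, second - mean ** 2

def sandwich_variance(beta, clusters):
    """Return (naive, robust) variance estimates evaluated at beta."""
    A = 0.0  # total information ("bread")
    B = 0.0  # sum of squared cluster scores ("meat")
    for strata in clusters:
        U_c = 0.0
        for xs, used in strata:
            u, info = stratum_terms(beta, xs, used)
            U_c += u
            A += info
        B += U_c ** 2
    return 1.0 / A, B / A ** 2

# Hypothetical data: the same stratum observed twice.
stratum = ([0.0, 1.0], 1)
one_cluster = [[stratum, stratum]]      # both strata in the same cluster
two_clusters = [[stratum], [stratum]]   # each stratum in its own cluster
print(sandwich_variance(0.0, one_cluster))
print(sandwich_variance(0.0, two_clusters))
```

In this toy case the naive variance is identical in both layouts, while the robust variance is larger when the two strata share a cluster, because their scores are added before squaring.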

33 Method Simulation results, Craiu et al (2008, Table 1)

35 Method Disadvantages
Inference on the parameters of the working correlation matrix is not possible: the independence working assumption must be used.
Though better than AIC, the QIC(I) model selection criterion did not perform very well in simulations:

36 Method Simulation results

37 Example of application Application to female bison in Prince Albert
8 female bison with 14 clusters of 48 locations, and 1 female with 9 clusters, all followed between 2 Sept. and Dec.
Each observed location was matched to 10 locations picked at random in a 300 m buffer (so $K = 8 \times 14 + 9 = 121$, $S = 48$, $m = 1$, $n = 11$).
x: 6 dummy variables coding a seven-level habitat class categorical variable (deciduous stands = baseline level).

38 Example of application Model fit

40 Method Conditional inference
Sometimes, subject-specific inferences are required. Can we estimate $\beta$ and $\Sigma$ from the mixed-effects prospective model with the retrospective sampling design?
Already done in some special cases:
Family studies of genetic diseases (special case $S = 1$)
Mixed multinomial logit discrete choice model (special case $m = 1$)

43 Method Likelihood for the general case
Craiu et al (2011) get the following likelihood in the general case:
$$L(\beta, \Sigma) = \prod_{c=1}^{K} \frac{\exp\left( \sum_{s,i} y_{si}^{(c)} \beta^\top x_{si}^{(c)} \right) \int \exp\left( \sum_{s,i} y_{si}^{(c)} b^\top z_{si}^{(c)} \right) d^{(c)}(\beta, b)\, dF(b; \Sigma)}{\int d^{(c)}(\beta, b) \prod_{s} \sum_{l \in L_s^{(c)}} \exp\left\{ \sum_{i} v_{lsi}^{(c)} \left( \beta^\top x_{si}^{(c)} + b^\top z_{si}^{(c)} \right) \right\} dF(b; \Sigma)},$$
where $d^{(c)}(\beta, b) = \prod_{s} \prod_{i} \left\{ 1 + \exp\left( \beta^\top x_{si}^{(c)} + b^\top z_{si}^{(c)} \right) \right\}^{-1}$.
How do you maximize this thing?!

47 Method Maximization of the likelihood
Family studies (Pfeiffer et al 2001): Evaluate the integrals by Monte Carlo, then maximize using a hybrid of Newton-type methods for $\beta$ and grid search for the elements of $\Sigma$.
Mixed multinomial logit (Bhat 2001): Quasi-Monte Carlo evaluation of integrals, Newton-type methods to maximize.
Craiu et al (2011), first attempt: Quasi-Monte Carlo evaluation of integrals, Newton-type methods to maximize.
With small $K$ and large $S$, these methods are painfully slow and unstable!

49 Method Two-step algorithm, Craiu et al (2011)
Inspired by earlier work for GLMM, we derived a two-step method that is numerically fast and stable and that yields estimators of $\beta$ and $\Sigma$ with good properties:
Step 1: Separately for each cluster $c$, use traditional maximum likelihood for independent data (e.g., coxph()) to get $\hat\beta_c$ and an estimate of its variance, $R_c = \widehat{\mathrm{Var}}(\hat\beta_c)$.
Step 2: Since the clusters are large, the $\hat\beta_c$ are independent and, given $b_c$, $\hat\beta_c \approx N(\beta + b_c, R_c)$. Thus we can use linear mixed model theory and REML estimation to combine these estimates to obtain estimates of $\beta$ and $\Sigma$.
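Step 1 can be sketched in a few lines for the simplest setting, a scalar coefficient with m = 1, by running Newton-Raphson on the conditional log-likelihood separately within each cluster. The talk uses coxph() for this step; the pure-Python functions `score_info` and `fit_cluster` and the two toy clusters below are stand-ins introduced only for illustration.

```python
import math

def score_info(beta, strata):
    """Score and information of an m = 1 conditional logit with scalar beta."""
    score, info = 0.0, 0.0
    for xs, used in strata:
        w = [math.exp(beta * x) for x in xs]
        tot = sum(w)
        mean = sum(wi * x for wi, x in zip(w, xs)) / tot
        second = sum(wi * x * x for wi, x in zip(w, xs)) / tot
        score += xs[used] - mean
        info += second - mean ** 2
    return score, info

def fit_cluster(strata, n_iter=50):
    """Newton-Raphson within one cluster; returns (beta_hat_c, R_c)."""
    beta = 0.0
    for _ in range(n_iter):
        score, info = score_info(beta, strata)
        beta += score / info
    # Variance estimate: inverse information at the final value.
    return beta, 1.0 / score_info(beta, strata)[1]

# Hypothetical clusters; each stratum is (covariate values, index of the used location).
cluster1 = [([0.0, 1.0, 2.0], 2), ([0.5, 1.5, 1.0], 1), ([2.0, 0.0, 1.0], 2),
            ([1.0, 3.0, 0.5], 0), ([0.2, 0.8, 1.4], 1)]
cluster2 = [([0.0, 1.0, 2.0], 1), ([1.0, 2.0, 0.0], 1),
            ([0.3, 0.6, 0.9], 0), ([1.2, 0.4, 2.0], 2)]
clusters = [cluster1, cluster2]

step1 = [fit_cluster(c) for c in clusters]
for beta_hat, R_c in step1:
    print(beta_hat, R_c)
```

The pairs returned here play the role of the cluster-level estimates and variances that Step 2 combines. Real clusters are much larger than these toy ones, which is what justifies the normal approximation.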

53 Method Second step: REML with EM
Easy to implement and to program, but difficult to explain due to extremely heavy notation! In a nutshell:
Stack the estimates $\hat\beta_1, \ldots, \hat\beta_K$ in a vector $V$ and their variance estimates in a block diagonal matrix $R = \mathrm{diag}(R_1, \ldots, R_K)$.
Stack the vectors of random effects $b_1, \ldots, b_K$ in a vector $\phi$ and their variances in a block diagonal matrix $\boldsymbol{\Sigma} = \mathrm{diag}(\Sigma, \ldots, \Sigma)$.
Define $W_1 = 1_K \otimes I_p$ and $W_2 = I_{Kp}$.
Consider the linear mixed model $V = W_1 \beta + W_2 \phi + \varepsilon$, where $\varepsilon \sim N(0, R)$, $\phi \sim N(0, \boldsymbol{\Sigma})$ and $\phi \perp \varepsilon$.

57 Method Second step: REML with EM
$\beta$ and $\Sigma$ in this linear mixed model can be estimated by maximum likelihood (ML) or by restricted maximum likelihood (REML).
We first tried ML, but variances were underestimated and $\hat\beta$ was biased.
We used the EM algorithm (both E and M steps in closed form for a few specifications of the structure of $\Sigma$) to implement REML: numerically quick and stable, with estimators quite good in terms of bias, and even in terms of efficiency.
An R package (TwoStepCLogit) implementing this method should be available on CRAN in the Spring!
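For intuition, the second step can be written out in the scalar case (p = 1), where each cluster contributes one estimate and one known variance. The sketch below is a generic REML-by-EM iteration for that simplified model, written under the assumption of a single random coefficient per cluster; it is not the code of the R package, and the function name `reml_em` and the input values are hypothetical.

```python
def reml_em(beta_hats, R, sigma2=1.0, n_iter=200):
    """Scalar REML via EM for beta_hat_c = beta + b_c + e_c,
    with b_c ~ N(0, sigma2) and e_c ~ N(0, R_c), R_c treated as known."""
    K = len(beta_hats)
    for _ in range(n_iter):
        w = [1.0 / (sigma2 + r) for r in R]  # inverse marginal variances
        sw = sum(w)
        beta = sum(wi * y for wi, y in zip(w, beta_hats)) / sw  # GLS estimate
        # E-step: moments of b_c given the data, with beta integrated out (REML).
        post_mean = [sigma2 * wi * (y - beta) for wi, y in zip(w, beta_hats)]
        post_var = [sigma2 - sigma2 ** 2 * (wi - wi ** 2 / sw) for wi in w]
        # M-step: update the random-effect variance.
        sigma2 = sum(mu ** 2 + v for mu, v in zip(post_mean, post_var)) / K
    w = [1.0 / (sigma2 + r) for r in R]
    beta = sum(wi * y for wi, y in zip(w, beta_hats)) / sum(w)
    return beta, sigma2

# Hypothetical step-1 output: five cluster estimates with equal variances.
beta_hat, sigma2_hat = reml_em([1.0, 2.0, 3.0, 4.0, 5.0], [0.5] * 5)
print(beta_hat, sigma2_hat)
```

When all the R_c are equal, the REML solution has a closed form (the sample variance of the cluster estimates, with divisor K - 1, minus the common R_c), which gives a convenient check on the iteration.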

58 Method Simulation results, Craiu et al (2011, Fig. 1)

59 Example of application Application to female bison in Prince Albert
20 pairs of female bison followed between 15 Nov. and 15 April, 2005, 2006, 2007.
Each pair of observed locations was matched to 10 locations picked at random in a 700 m buffer (so $K = 20$, $m = 2$, $n = 12$, $S$ varied between 21 and 349).
x: dummy variables coding habitat class, as well as an above-ground vegetation biomass index (in kg/m$^2$).

60 Example of application Model fit

64 Future research
How should the controls be sampled?
Within-cluster correlation: How to estimate working correlations in GEE? How to include autocorrelation among observations belonging to the same cluster in the prospective (then retrospective) model?
Between-cluster correlation: How can we include between-animal (or between-pair-of-animals) correlation in such models?
Model validation: relatively easy to do informally with K-fold cross-validation type approaches, but how can a formal goodness-of-fit test be done?

65 References
Bhat, C. (2001). Quasi-random maximum simulated likelihood estimation of the mixed multinomial logit model. Transport. Res. Part B, 35.
Craiu, R. V., Duchesne, T., Fortin, D. (2008). Inference methods for the conditional logistic regression model with longitudinal data. Biometrical J., 50.
Craiu, R. V., Duchesne, T., Fortin, D., Baillargeon, S. (2011). Conditional logistic regression with longitudinal follow up and individual-level random coefficients: A stable and efficient two-step estimation method. J. of Comput. & Graph. Statist., to appear.
Forester, J. D., Im, H. K., Rathouz, P. J. (2009). Accounting for animal movement in estimation of resource selection functions: sampling and data analysis. Ecology, 90.
Pfeiffer, R. M., Gail, M. H., Pee, D. (2001). Inference for covariates that accounts for ascertainment and random genetic effects in family studies. Biometrika, 88.
Train, K. (2003). Discrete choice methods with simulation. New York: Cambridge University Press.


More information

Cox s proportional hazards/regression model - model assessment

Cox s proportional hazards/regression model - model assessment Cox s proportional hazards/regression model - model assessment Rasmus Waagepetersen September 27, 2017 Topics: Plots based on estimated cumulative hazards Cox-Snell residuals: overall check of fit Martingale

More information

On dealing with spatially correlated residuals in remote sensing and GIS

On dealing with spatially correlated residuals in remote sensing and GIS On dealing with spatially correlated residuals in remote sensing and GIS Nicholas A. S. Hamm 1, Peter M. Atkinson and Edward J. Milton 3 School of Geography University of Southampton Southampton SO17 3AT

More information

Outline of GLMs. Definitions

Outline of GLMs. Definitions Outline of GLMs Definitions This is a short outline of GLM details, adapted from the book Nonparametric Regression and Generalized Linear Models, by Green and Silverman. The responses Y i have density

More information

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES CS6220: DATA MINING TECHNIQUES Matrix Data: Prediction Instructor: Yizhou Sun yzsun@ccs.neu.edu September 14, 2014 Today s Schedule Course Project Introduction Linear Regression Model Decision Tree 2 Methods

More information

Regression. Oscar García

Regression. Oscar García Regression Oscar García Regression methods are fundamental in Forest Mensuration For a more concise and general presentation, we shall first review some matrix concepts 1 Matrices An order n m matrix is

More information

Longitudinal Modeling with Logistic Regression

Longitudinal Modeling with Logistic Regression Newsom 1 Longitudinal Modeling with Logistic Regression Longitudinal designs involve repeated measurements of the same individuals over time There are two general classes of analyses that correspond to

More information

ECE 5984: Introduction to Machine Learning

ECE 5984: Introduction to Machine Learning ECE 5984: Introduction to Machine Learning Topics: Ensemble Methods: Bagging, Boosting Readings: Murphy 16.4; Hastie 16 Dhruv Batra Virginia Tech Administrativia HW3 Due: April 14, 11:55pm You will implement

More information

Introduction to mtm: An R Package for Marginalized Transition Models

Introduction to mtm: An R Package for Marginalized Transition Models Introduction to mtm: An R Package for Marginalized Transition Models Bryan A. Comstock and Patrick J. Heagerty Department of Biostatistics University of Washington 1 Introduction Marginalized transition

More information

Introduction to Statistical modeling: handout for Math 489/583

Introduction to Statistical modeling: handout for Math 489/583 Introduction to Statistical modeling: handout for Math 489/583 Statistical modeling occurs when we are trying to model some data using statistical tools. From the start, we recognize that no model is perfect

More information

Lecture 2: Poisson and logistic regression

Lecture 2: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 11-12 December 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

Linear Methods for Prediction

Linear Methods for Prediction Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we

More information

Extensions of Cox Model for Non-Proportional Hazards Purpose

Extensions of Cox Model for Non-Proportional Hazards Purpose PhUSE 2013 Paper SP07 Extensions of Cox Model for Non-Proportional Hazards Purpose Jadwiga Borucka, PAREXEL, Warsaw, Poland ABSTRACT Cox proportional hazard model is one of the most common methods used

More information

LOGISTIC REGRESSION Joseph M. Hilbe

LOGISTIC REGRESSION Joseph M. Hilbe LOGISTIC REGRESSION Joseph M. Hilbe Arizona State University Logistic regression is the most common method used to model binary response data. When the response is binary, it typically takes the form of

More information

Non-maximum likelihood estimation and statistical inference for linear and nonlinear mixed models

Non-maximum likelihood estimation and statistical inference for linear and nonlinear mixed models Optimum Design for Mixed Effects Non-Linear and generalized Linear Models Cambridge, August 9-12, 2011 Non-maximum likelihood estimation and statistical inference for linear and nonlinear mixed models

More information

Max. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes

Max. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes Maximum Likelihood Estimation Econometrics II Department of Economics Universidad Carlos III de Madrid Máster Universitario en Desarrollo y Crecimiento Económico Outline 1 3 4 General Approaches to Parameter

More information

Logistic Regression: Online, Lazy, Kernelized, Sequential, etc.

Logistic Regression: Online, Lazy, Kernelized, Sequential, etc. Logistic Regression: Online, Lazy, Kernelized, Sequential, etc. Harsha Veeramachaneni Thomson Reuter Research and Development April 1, 2010 Harsha Veeramachaneni (TR R&D) Logistic Regression April 1, 2010

More information

On Fitting Generalized Linear Mixed Effects Models for Longitudinal Binary Data Using Different Correlation

On Fitting Generalized Linear Mixed Effects Models for Longitudinal Binary Data Using Different Correlation On Fitting Generalized Linear Mixed Effects Models for Longitudinal Binary Data Using Different Correlation Structures Authors: M. Salomé Cabral CEAUL and Departamento de Estatística e Investigação Operacional,

More information

Stat 579: Generalized Linear Models and Extensions

Stat 579: Generalized Linear Models and Extensions Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 15 1 / 38 Data structure t1 t2 tn i 1st subject y 11 y 12 y 1n1 Experimental 2nd subject

More information

Occupancy models. Gurutzeta Guillera-Arroita University of Kent, UK National Centre for Statistical Ecology

Occupancy models. Gurutzeta Guillera-Arroita University of Kent, UK National Centre for Statistical Ecology Occupancy models Gurutzeta Guillera-Arroita University of Kent, UK National Centre for Statistical Ecology Advances in Species distribution modelling in ecological studies and conservation Pavia and Gran

More information

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES CS6220: DATA MINING TECHNIQUES Matrix Data: Prediction Instructor: Yizhou Sun yzsun@ccs.neu.edu September 21, 2015 Announcements TA Monisha s office hour has changed to Thursdays 10-12pm, 462WVH (the same

More information

Association studies and regression

Association studies and regression Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration

More information

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010 1 Linear models Y = Xβ + ɛ with ɛ N (0, σ 2 e) or Y N (Xβ, σ 2 e) where the model matrix X contains the information on predictors and β includes all coefficients (intercept, slope(s) etc.). 1. Number of

More information

Primal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing

Primal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing Primal-dual Covariate Balance and Minimal Double Robustness via (Joint work with Daniel Percival) Department of Statistics, Stanford University JSM, August 9, 2015 Outline 1 2 3 1/18 Setting Rubin s causal

More information

ECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam

ECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam ECLT 5810 Linear Regression and Logistic Regression for Classification Prof. Wai Lam Linear Regression Models Least Squares Input vectors is an attribute / feature / predictor (independent variable) The

More information

Regression Adjustment with Artificial Neural Networks

Regression Adjustment with Artificial Neural Networks Regression Adjustment with Artificial Neural Networks Age of Big Data: data comes in a rate and in a variety of types that exceed our ability to analyse it Texts, image, speech, video Real motivation:

More information

Lecture 12. Multivariate Survival Data Statistics Survival Analysis. Presented March 8, 2016

Lecture 12. Multivariate Survival Data Statistics Survival Analysis. Presented March 8, 2016 Statistics 255 - Survival Analysis Presented March 8, 2016 Dan Gillen Department of Statistics University of California, Irvine 12.1 Examples Clustered or correlated survival times Disease onset in family

More information

Generalized logit models for nominal multinomial responses. Local odds ratios

Generalized logit models for nominal multinomial responses. Local odds ratios Generalized logit models for nominal multinomial responses Categorical Data Analysis, Summer 2015 1/17 Local odds ratios Y 1 2 3 4 1 π 11 π 12 π 13 π 14 π 1+ X 2 π 21 π 22 π 23 π 24 π 2+ 3 π 31 π 32 π

More information

Extensions of Cox Model for Non-Proportional Hazards Purpose

Extensions of Cox Model for Non-Proportional Hazards Purpose PhUSE Annual Conference 2013 Paper SP07 Extensions of Cox Model for Non-Proportional Hazards Purpose Author: Jadwiga Borucka PAREXEL, Warsaw, Poland Brussels 13 th - 16 th October 2013 Presentation Plan

More information

Statistics: A review. Why statistics?

Statistics: A review. Why statistics? Statistics: A review Why statistics? What statistical concepts should we know? Why statistics? To summarize, to explore, to look for relations, to predict What kinds of data exist? Nominal, Ordinal, Interval

More information

Chapter 14 Combining Models

Chapter 14 Combining Models Chapter 14 Combining Models T-61.62 Special Course II: Pattern Recognition and Machine Learning Spring 27 Laboratory of Computer and Information Science TKK April 3th 27 Outline Independent Mixing Coefficients

More information

ECE521 week 3: 23/26 January 2017

ECE521 week 3: 23/26 January 2017 ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear

More information

REGRESSION WITH SPATIALLY MISALIGNED DATA. Lisa Madsen Oregon State University David Ruppert Cornell University

REGRESSION WITH SPATIALLY MISALIGNED DATA. Lisa Madsen Oregon State University David Ruppert Cornell University REGRESSION ITH SPATIALL MISALIGNED DATA Lisa Madsen Oregon State University David Ruppert Cornell University SPATIALL MISALIGNED DATA 10 X X X X X X X X 5 X X X X X 0 X 0 5 10 OUTLINE 1. Introduction 2.

More information

Lecture 5: Poisson and logistic regression

Lecture 5: Poisson and logistic regression Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 3-5 March 2014 introduction to Poisson regression application to the BELCAP study introduction

More information

Incorporating Boosted Regression Trees into Ecological Latent Variable Models

Incorporating Boosted Regression Trees into Ecological Latent Variable Models Incorporating Boosted Regression Trees into Ecological Latent Variable Models Rebecca A. Hutchinson, Li-Ping Liu, Thomas G. Dietterich School of EECS, Oregon State University Motivation Species Distribution

More information

New Developments in Econometrics Lecture 9: Stratified Sampling

New Developments in Econometrics Lecture 9: Stratified Sampling New Developments in Econometrics Lecture 9: Stratified Sampling Jeff Wooldridge Cemmap Lectures, UCL, June 2009 1. Overview of Stratified Sampling 2. Regression Analysis 3. Clustering and Stratification

More information

Fractional Imputation in Survey Sampling: A Comparative Review

Fractional Imputation in Survey Sampling: A Comparative Review Fractional Imputation in Survey Sampling: A Comparative Review Shu Yang Jae-Kwang Kim Iowa State University Joint Statistical Meetings, August 2015 Outline Introduction Fractional imputation Features Numerical

More information

Multivariate Survival Analysis

Multivariate Survival Analysis Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in

More information

Pattern Recognition and Machine Learning

Pattern Recognition and Machine Learning Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability

More information

H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL

H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL Intesar N. El-Saeiti Department of Statistics, Faculty of Science, University of Bengahzi-Libya. entesar.el-saeiti@uob.edu.ly

More information

Charles E. McCulloch Biometrics Unit and Statistics Center Cornell University

Charles E. McCulloch Biometrics Unit and Statistics Center Cornell University A SURVEY OF VARIANCE COMPONENTS ESTIMATION FROM BINARY DATA by Charles E. McCulloch Biometrics Unit and Statistics Center Cornell University BU-1211-M May 1993 ABSTRACT The basic problem of variance components

More information

Logistic regression: Why we often can do what we think we can do. Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015

Logistic regression: Why we often can do what we think we can do. Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015 Logistic regression: Why we often can do what we think we can do Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015 1 Introduction Introduction - In 2010 Carina Mood published an overview article

More information

Model Assumptions; Predicting Heterogeneity of Variance

Model Assumptions; Predicting Heterogeneity of Variance Model Assumptions; Predicting Heterogeneity of Variance Today s topics: Model assumptions Normality Constant variance Predicting heterogeneity of variance CLP 945: Lecture 6 1 Checking for Violations of

More information

Estimating and contextualizing the attenuation of odds ratios due to non-collapsibility

Estimating and contextualizing the attenuation of odds ratios due to non-collapsibility Estimating and contextualizing the attenuation of odds ratios due to non-collapsibility Stephen Burgess Department of Public Health & Primary Care, University of Cambridge September 6, 014 Short title:

More information

MMWS Software Program Manual

MMWS Software Program Manual MMWS Software Program Manual 1 Software Development The MMWS program is regularly updated. The latest beta version can be downloaded from http://hlmsoft.net/ghong/ MMWS Click here to get MMWS. For a sample

More information

COMPARISON OF GMM WITH SECOND-ORDER LEAST SQUARES ESTIMATION IN NONLINEAR MODELS. Abstract

COMPARISON OF GMM WITH SECOND-ORDER LEAST SQUARES ESTIMATION IN NONLINEAR MODELS. Abstract Far East J. Theo. Stat. 0() (006), 179-196 COMPARISON OF GMM WITH SECOND-ORDER LEAST SQUARES ESTIMATION IN NONLINEAR MODELS Department of Statistics University of Manitoba Winnipeg, Manitoba, Canada R3T

More information

LINEAR MODELS FOR CLASSIFICATION. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception

LINEAR MODELS FOR CLASSIFICATION. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception LINEAR MODELS FOR CLASSIFICATION Classification: Problem Statement 2 In regression, we are modeling the relationship between a continuous input variable x and a continuous target variable t. In classification,

More information

multilevel modeling: concepts, applications and interpretations

multilevel modeling: concepts, applications and interpretations multilevel modeling: concepts, applications and interpretations lynne c. messer 27 october 2010 warning social and reproductive / perinatal epidemiologist concepts why context matters multilevel models

More information

Regression tree-based diagnostics for linear multilevel models

Regression tree-based diagnostics for linear multilevel models Regression tree-based diagnostics for linear multilevel models Jeffrey S. Simonoff New York University May 11, 2011 Longitudinal and clustered data Panel or longitudinal data, in which we observe many

More information

Experimental Design and Data Analysis for Biologists

Experimental Design and Data Analysis for Biologists Experimental Design and Data Analysis for Biologists Gerry P. Quinn Monash University Michael J. Keough University of Melbourne CAMBRIDGE UNIVERSITY PRESS Contents Preface page xv I I Introduction 1 1.1

More information

Kernel Logistic Regression and the Import Vector Machine

Kernel Logistic Regression and the Import Vector Machine Kernel Logistic Regression and the Import Vector Machine Ji Zhu and Trevor Hastie Journal of Computational and Graphical Statistics, 2005 Presented by Mingtao Ding Duke University December 8, 2011 Mingtao

More information

Classification. Chapter Introduction. 6.2 The Bayes classifier

Classification. Chapter Introduction. 6.2 The Bayes classifier Chapter 6 Classification 6.1 Introduction Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode

More information

GENERALIZED LINEAR MIXED MODELS AND MEASUREMENT ERROR. Raymond J. Carroll: Texas A&M University

GENERALIZED LINEAR MIXED MODELS AND MEASUREMENT ERROR. Raymond J. Carroll: Texas A&M University GENERALIZED LINEAR MIXED MODELS AND MEASUREMENT ERROR Raymond J. Carroll: Texas A&M University Naisyin Wang: Xihong Lin: Roberto Gutierrez: Texas A&M University University of Michigan Southern Methodist

More information

Covariance function estimation in Gaussian process regression

Covariance function estimation in Gaussian process regression Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian

More information

Lecture 8 Stat D. Gillen

Lecture 8 Stat D. Gillen Statistics 255 - Survival Analysis Presented February 23, 2016 Dan Gillen Department of Statistics University of California, Irvine 8.1 Example of two ways to stratify Suppose a confounder C has 3 levels

More information

Correlation and regression

Correlation and regression 1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,

More information

EXAM IN STATISTICAL MACHINE LEARNING STATISTISK MASKININLÄRNING

EXAM IN STATISTICAL MACHINE LEARNING STATISTISK MASKININLÄRNING EXAM IN STATISTICAL MACHINE LEARNING STATISTISK MASKININLÄRNING DATE AND TIME: June 9, 2018, 09.00 14.00 RESPONSIBLE TEACHER: Andreas Svensson NUMBER OF PROBLEMS: 5 AIDING MATERIAL: Calculator, mathematical

More information

Robust Bayesian Variable Selection for Modeling Mean Medical Costs

Robust Bayesian Variable Selection for Modeling Mean Medical Costs Robust Bayesian Variable Selection for Modeling Mean Medical Costs Grace Yoon 1,, Wenxin Jiang 2, Lei Liu 3 and Ya-Chen T. Shih 4 1 Department of Statistics, Texas A&M University 2 Department of Statistics,

More information

Logistic Regression. Some slides from Craig Burkett. STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy

Logistic Regression. Some slides from Craig Burkett. STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy Logistic Regression Some slides from Craig Burkett STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy Titanic Survival Case Study The RMS Titanic A British passenger liner Collided

More information

Analysing longitudinal data when the visit times are informative

Analysing longitudinal data when the visit times are informative Analysing longitudinal data when the visit times are informative Eleanor Pullenayegum, PhD Scientist, Hospital for Sick Children Associate Professor, University of Toronto eleanor.pullenayegum@sickkids.ca

More information

SSUI: Presentation Hints 2 My Perspective Software Examples Reliability Areas that need work

SSUI: Presentation Hints 2 My Perspective Software Examples Reliability Areas that need work SSUI: Presentation Hints 1 Comparing Marginal and Random Eects (Frailty) Models Terry M. Therneau Mayo Clinic April 1998 SSUI: Presentation Hints 2 My Perspective Software Examples Reliability Areas that

More information

Univariate Normal Distribution; GLM with the Univariate Normal; Least Squares Estimation

Univariate Normal Distribution; GLM with the Univariate Normal; Least Squares Estimation Univariate Normal Distribution; GLM with the Univariate Normal; Least Squares Estimation PRE 905: Multivariate Analysis Spring 2014 Lecture 4 Today s Class The building blocks: The basics of mathematical

More information

Multinomial Logistic Regression Models

Multinomial Logistic Regression Models Stat 544, Lecture 19 1 Multinomial Logistic Regression Models Polytomous responses. Logistic regression can be extended to handle responses that are polytomous, i.e. taking r>2 categories. (Note: The word

More information

ABHELSINKI UNIVERSITY OF TECHNOLOGY

ABHELSINKI UNIVERSITY OF TECHNOLOGY Cross-Validation, Information Criteria, Expected Utilities and the Effective Number of Parameters Aki Vehtari and Jouko Lampinen Laboratory of Computational Engineering Introduction Expected utility -

More information

Simple Regression Model Setup Estimation Inference Prediction. Model Diagnostic. Multiple Regression. Model Setup and Estimation.

Simple Regression Model Setup Estimation Inference Prediction. Model Diagnostic. Multiple Regression. Model Setup and Estimation. Statistical Computation Math 475 Jimin Ding Department of Mathematics Washington University in St. Louis www.math.wustl.edu/ jmding/math475/index.html October 10, 2013 Ridge Part IV October 10, 2013 1

More information

Multilevel Methodology

Multilevel Methodology Multilevel Methodology Geert Molenberghs Interuniversity Institute for Biostatistics and statistical Bioinformatics Universiteit Hasselt, Belgium geert.molenberghs@uhasselt.be www.censtat.uhasselt.be Katholieke

More information

1 Mixed effect models and longitudinal data analysis

1 Mixed effect models and longitudinal data analysis 1 Mixed effect models and longitudinal data analysis Mixed effects models provide a flexible approach to any situation where data have a grouping structure which introduces some kind of correlation between

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information