EPSE 594: Meta-Analysis: Quantitative Research Synthesis
1 EPSE 594: Meta-Analysis: Quantitative Research Synthesis
Ed Kroc
University of British Columbia
January 24, 2019
Ed Kroc (UBC) EPSE 594 January 24, 2019 / 37
2 Last time
- Composite effect sizes: inverse variance weighting
- The basic meta-analysis models: fixed effect, random effect
- Introduction to the homogeneity assumption
3 Today
- Review of fixed vs. random effect models
- Quantifying heterogeneity
- Prediction
- Introduction to meta-regression (subgroup analysis) [mixed effects models]
4 Fixed vs. random effects
Big difference between:
- an intervention that consistently increases, say, retention of course materials by 40% [fixed effect], and
- another intervention that increases retention by 40% on average, with a treatment effect ranging between 10% and 70% depending on the study/subpopulation [random effect].
Fixed effects models:
- Generally take less data to produce precise estimates, but rely on stronger assumptions, so are more prone to spurious conclusions.
- Weight larger studies more heavily.
Random effects models:
- More general, more conservative, usually more realistic, but generally require more data to produce precise estimates.
- Weight all studies more equally, since we assume more variance in the model.
5 Fixed effects model
The basic fixed effects model:
θ̂_k = θ + ε_k,
where θ̂_k is the estimated effect size from study k, θ is the fixed, true effect in the population, and ε_k ~ N(0, σ_k²) captures the sampling error from study k, with unknown variance σ_k².
Given our estimates (θ̂_k, σ̂_k²) of (θ_k, σ_k²), k = 1, ..., N, we can derive the maximum-likelihood estimate for the population effect θ:
θ̂_FE = Σ_{k=1}^N w_k θ̂_k / Σ_{k=1}^N w_k,   Var̂(θ̂_FE) ≈ 1 / Σ_{k=1}^N w_k,
where w_k = 1/σ̂_k².
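As a concrete illustration, the fixed-effect estimate above takes only a few lines to compute. This is a minimal sketch: the effect estimates and variances below are made-up numbers for four hypothetical studies, not data from the course.

```python
# Fixed-effect (inverse-variance weighted) summary for N = 4 hypothetical studies.
theta_hat = [0.30, 0.45, 0.25, 0.50]   # made-up effect size estimates
var_hat = [0.04, 0.02, 0.05, 0.01]     # their estimated sampling variances

w = [1.0 / v for v in var_hat]         # w_k = 1 / sigma_hat_k^2

# Weighted average of the study effects, and its approximate variance.
theta_fe = sum(wk * tk for wk, tk in zip(w, theta_hat)) / sum(w)
var_fe = 1.0 / sum(w)
print(round(theta_fe, 4), round(var_fe, 4))
```

Note how the most precise study (variance 0.01) pulls the summary toward its own estimate, exactly as inverse-variance weighting intends.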
6 Fixed effects model
A pictorial representation of the basic fixed effects model: θ̂_k = θ + ε_k. [Figure not reproduced in transcription.] Note that in this graphic, v_k = σ_k².
7 Fixed effects model
The fixed effects model relies on the homogeneity assumption:
θ_1 = ... = θ_N = θ,
where θ denotes the common population effect size.
We saw one way to test this assumption: via Cochran's Q statistic (an asymptotic chi-squared test):
Q = Σ_{k=1}^N (θ̂_k − θ̂_FE)² / σ̂_k²,
where we conclude evidence against the null (against homogeneity) if Q > χ²_{1−α}(N − 1).
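The Q test above can be sketched directly. The studies below are hypothetical (same made-up numbers as in the earlier fixed-effect sketch), and the critical value χ²_{0.95}(3) ≈ 7.815 is hard-coded for N − 1 = 3 degrees of freedom.

```python
# Cochran's Q homogeneity test for N = 4 hypothetical study estimates.
theta_hat = [0.30, 0.45, 0.25, 0.50]
var_hat = [0.04, 0.02, 0.05, 0.01]

w = [1.0 / v for v in var_hat]                      # inverse-variance weights
theta_fe = sum(wk * tk for wk, tk in zip(w, theta_hat)) / sum(w)

# Q: weighted squared deviations of each study from the fixed-effect summary.
Q = sum((tk - theta_fe) ** 2 / vk for tk, vk in zip(theta_hat, var_hat))

# Compare against the chi-squared critical value with N - 1 = 3 df at alpha = 0.05.
CHI2_CRIT_3DF_95 = 7.815
reject_homogeneity = Q > CHI2_CRIT_3DF_95
print(round(Q, 3), reject_homogeneity)
```

For these particular numbers Q is well below the critical value, so the test gives no evidence against homogeneity.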
8 Random effects
More generally, a random effect is a quantity that varies over sample units. A fixed effect, on the other hand, remains fixed across sample units (Kreft & De Leeuw 1998).
In experimental design, we think of fixed effects as those that have values that are either (1) fixed by the experimenter, or (2) exhausted by the experimental design (Green & Tukey 1960).
Another common definition is that effects are fixed if they are interesting in themselves, and random if there is interest in some greater population (Searle, Casella & McCulloch 1992).
The distinction is not always obvious in practice. Moreover, the distinction is largely dependent on what you are interested in studying.
9 Random effects model
The basic random effects model:
θ̂_k = θ + u_k + ε_k,
where θ̂_k and ε_k ~ N(0, σ_k²) are as before, but θ is now the mean effect in the population and u_k ~ N(0, τ²). Also, u_k is assumed to be independent of ε_k.
The random effects model does not assume homogeneity. So each θ̂_k estimates a (sub)population effect θ_k for study k, with θ_k ~ N(θ, τ²).
So we can capture two sources of variation:
- Sampling variation, as in the fixed effects model
- Structural variation, since the (true) effects themselves are random draws from some greater population of effects.
10 Random effects model
A pictorial representation of the basic random effects model: θ̂_k = θ + u_k + ε_k. [Figure not reproduced in transcription.] Note that in the graphic, v_k = σ_k².
11 Random effects model: estimation
The random effects estimate of θ (now the mean effect in the population) is still a weighted average of the individual study effect estimates θ̂_k:
θ̂_RE = Σ_{k=1}^N w_k θ̂_k / Σ_{k=1}^N w_k,
but now w_k = 1/(σ̂_k² + τ̂²).
Note that this estimator takes into account both within-study variation σ̂_k² and between-study variation τ̂².
Analogous to before, the estimated (approximate asymptotic) variance of this estimator is
Var̂(θ̂_RE) ≈ 1 / Σ_{k=1}^N w_k = 1 / Σ_{k=1}^N 1/(σ̂_k² + τ̂²).
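The only change from the fixed-effect sketch is the weights. Below, the same made-up studies are reused and τ̂² = 0.03 is simply assumed for illustration (in practice it would be estimated, e.g. by DerSimonian-Laird):

```python
# Random-effects summary: hypothetical studies plus an assumed tau^2 estimate.
theta_hat = [0.30, 0.45, 0.25, 0.50]
var_hat = [0.04, 0.02, 0.05, 0.01]
tau2_hat = 0.03   # assumed between-study variance, for illustration only

# Weights now add the between-study variance, so studies are weighted more equally.
w = [1.0 / (v + tau2_hat) for v in var_hat]

theta_re = sum(wk * tk for wk, tk in zip(w, theta_hat)) / sum(w)
var_re = 1.0 / sum(w)
print(round(theta_re, 4), round(var_re, 4))
```

Compared with the fixed-effect answer, the summary moves toward the unweighted mean and its variance grows, reflecting the extra source of variation in the model.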
12 Random effects model: between-study variation
One critical quantity of interest that was not present in the fixed effects case is an estimate τ̂² of the between-study variance τ².
There are many different methods for estimating this. We won't go into the math, but the idea is similar to an ANOVA partition of between-study and within-study sums of squares.
There are also many different tests for H₀: τ² = 0. The most basic is an F-test that is directly analogous to an ANOVA F-test of a between-treatment effect.
The study of between-study variation is the study of heterogeneity, and is often of critical interest.
13 Heterogeneity
We are interested in quantifying how much the true effect sizes vary in our (meta-)population; i.e., we would like to estimate τ².
But remember: sampling error is always present from each study; this is not heterogeneity. We would like a way of separating the sampling variability (natural sampling error) from the variability of the true effects (heterogeneity).
We accomplish this by:
(1) estimating the total amount of between-study variation,
(2) estimating the amount of between-study variation we would expect under the homogeneity assumption,
(3) attributing the excess variation to heterogeneity.
14 Heterogeneity
Start with Cochran's Q statistic:
Q = Σ_{k=1}^N (θ̂_k − θ̂)² / σ̂_k²,
where θ̂ is the summary (combined) effect from our model. This is our (weighted) estimate of total between-study variation.
Now assume that homogeneity holds: θ_1 = ... = θ_N = θ. Then Q is (asymptotically) χ²(N − 1), and we know that the expected amount of between-study variation is then N − 1.
So a natural way to estimate excess variation (i.e., variation not attributable to sampling error) is to consider Q − (N − 1).
15 Heterogeneity
Q − (N − 1) is a natural way to estimate excess variation (i.e., variation not attributable to sampling error). But there are a couple of problems with it:
- it is highly sensitive to the total number of studies, N, and to within-study sample size
- it is not on an easily interpretable scale
- it can be negative
16 Heterogeneity
To fix these issues, we can use the moment-based estimate:
τ̂² = T² = max{0, (Q − (N − 1)) / C},
where
C = Σ_{k=1}^N w_k − (Σ_{k=1}^N w_k²) / (Σ_{k=1}^N w_k).
Notice that C is really an estimate of the sample variance of the weights w_k, standardized by their sample mean.
Now T is on the same scale (units) as the outcome/effect of interest. Also, T will not automatically increase with more studies, or with more within-study sample size.
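The DerSimonian-Laird moment estimate above can be sketched with the same hypothetical studies used earlier. For these made-up numbers Q happens to fall below N − 1, so the sketch also shows the truncation at zero in action:

```python
# DerSimonian-Laird moment estimate of tau^2 (hypothetical numbers).
theta_hat = [0.30, 0.45, 0.25, 0.50]
var_hat = [0.04, 0.02, 0.05, 0.01]
N = len(theta_hat)

w = [1.0 / v for v in var_hat]
theta_fe = sum(wk * tk for wk, tk in zip(w, theta_hat)) / sum(w)
Q = sum(wk * (tk - theta_fe) ** 2 for wk, tk in zip(w, theta_hat))

# Scaling constant C = sum(w) - sum(w^2) / sum(w).
C = sum(w) - sum(wk ** 2 for wk in w) / sum(w)

# Moment estimate, truncated at zero; here Q < N - 1, so T2 comes out 0.
T2 = max(0.0, (Q - (N - 1)) / C)
print(round(Q, 3), round(C, 2), T2)
```
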
17 Heterogeneity
Remember: since T² is a statistic (and so a random variable), it has an associated measure of uncertainty. Thus, we can derive estimates of Var(T²) and then also derive confidence intervals for T²; see Borenstein for formulas.
WARNING: Jamovi will output an estimate of the standard error (SE) of T², but do not just take ±2·SE to form a 95% confidence interval. This trick only works if the test statistic is approximately normally distributed (or if we can appeal to the CLT). In practice, T² is very far from normally distributed.
18 Heterogeneity
There are many ways to derive an estimate τ̂², but this method-of-moments (DerSimonian-Laird) method (T²) is the most common (see pictures in Borenstein pp. 115-116).
There are many estimation options in Jamovi. It should usually be fine to use the DerSimonian-Laird or restricted maximum likelihood estimate (the latter has nice statistical properties).
It is rarely advisable to use simple maximum likelihood to fit the RE model and estimate τ. Personally, I would recommend avoiding empirical Bayes as well, unless you know something about Bayesian analysis (workshop next month!).
19 Heterogeneity
In many cases, the exact choice of estimation method won't make a huge difference. However, exceptions do exist (and if you find yourself in such a situation you should consult a statistician).
A classic example is from Snedecor and Cochran (1967): the number of conceptions recorded from artificial insemination of six bulls.
20 Heterogeneity
Different estimation methods result in very different estimates of τ²:
Estimation method | Est. of τ² | Est. of SE(τ²) | I²
DerSimonian-Laird | | | %
Hedges | | | %
Hunter-Schmidt | | | %
Sidik-Jonkman | | | %
Maximum likelihood (ML) | | | %
Restricted ML | | | %
Empirical Bayes | | | %
[Numeric table values not preserved in this transcription.]
However, note that Cochran's Q statistic is the same for all of these methods, since Q is computed independently of any estimate of τ².
21 Tests for heterogeneity
We have already seen Cochran's Q statistic as a test for homo/heterogeneity. But many others exist. Two of the most common are:
Reported in Jamovi:
H² = Q / (N − 1)
Not reported in Jamovi:
R² = (τ² + σ²) / σ²,
where σ² is a typical within-study variance.
WARNING: Jamovi does report something called an R² when you specify a moderator in the meta-analysis (subgroup analysis or meta-regression), but confusingly, that R² is not the same thing as this R² test of homogeneity.
22 Relative measures of heterogeneity
A very common relative measure of heterogeneity is captured by the I²-statistic:
I² = max{0, (Q − (N − 1)) / Q}.
Notice that I² compares our rough measure of excess variation to our measure of total variation (akin to the definition of reliability). In this way, it tries to do the same thing as the traditional R² in regression (yes, another R² statistic).
Notice that, like the traditional R², we always have 0 ≤ I² ≤ 1.
I² is independent of the number of studies N, but is affected by the precision (sample sizes) of the studies.
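Both H² and I² fall out of Q with one extra line each. The sketch below uses a different set of made-up studies, chosen so that the effects spread out much more than their sampling variances can explain:

```python
# H^2 and I^2 for a hypothetical set of 5 studies with visible heterogeneity.
theta_hat = [0.10, 0.60, 0.20, 0.80, 0.40]
var_hat = [0.02, 0.02, 0.03, 0.01, 0.02]
N = len(theta_hat)

w = [1.0 / v for v in var_hat]
theta_fe = sum(wk * tk for wk, tk in zip(w, theta_hat)) / sum(w)
Q = sum(wk * (tk - theta_fe) ** 2 for wk, tk in zip(w, theta_hat))

H2 = Q / (N - 1)                      # H^2 = Q / (N - 1)
I2 = max(0.0, (Q - (N - 1)) / Q)      # share of variation beyond sampling error
print(round(Q, 2), round(H2, 2), round(I2, 3))
```

Here Q is several times its homogeneity expectation of N − 1 = 4, so I² comes out near 80%: most of the observed between-study variation is heterogeneity, not sampling error.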
23 Relative measures of heterogeneity
A very common relative measure of heterogeneity is captured by the I²-statistic:
I² = max{0, (Q − (N − 1)) / Q}.
Rough interpretation: the closer I² is to 100%, the more variation is explained by the random effect (differences in true effect sizes, u_k) rather than just sampling error, ε_k; i.e., the more between-study variation rather than within-study variation.
WARNING: however, just as with R² in ANOVA/regression, this interpretation breaks down if the data (effect sizes) are not actually normally distributed about their mean.
24 Example 1
Recall the Cohen meta-analysis from last time: partial correlation coefficients between section mean instructor rating and section mean final exam score, controlling for student ability.
Again, note that R² in this Jamovi output is not the same R²-statistic we previously defined. Jamovi's R²-statistic will be relevant for subgroup analysis (meta-regression).
25 Example 2
Meta-analysis by Colditz et al. (1994). In these 13 trials, the effect of Bacillus Calmette-Guérin (BCG) vaccination on the prevention of tuberculosis was investigated. The effect size measured is the relative risk. Study results are on a logarithmic scale.
26 Example 2
[Table: for each of the 13 trials, counts of disease/no disease among vaccinated and not vaccinated participants, with the estimated risk ratio. Numeric values not preserved in this transcription.]
27 Example 2
[Figure not reproduced in transcription.]
28 Prediction
Notice that so far we have generated the following:
- an estimate of the mean true effect size (random effects model) or of the true effect size (fixed effects model)
- an estimate of the variation of the true effect sizes (random effects model).
Graphically, we have ways of representing the mean true effect size and its associated uncertainty (the diamond in a forest plot), but how can we easily express the information contained in our estimate of τ?
ANSWER: create a prediction interval.
29 Prediction
You may remember prediction intervals from regression. They look superficially similar to confidence intervals, but represent very different things.
Recall: a 95% confidence interval for a statistic is a measure of uncertainty about the value of that statistic. Technically, we say that if we were to repeat the same experiment (meta-experiment) many times, then about 95% of the resulting 95% confidence intervals would contain the true effect being estimated by the statistic.
In contrast, a 95% prediction interval represents our uncertainty about how the true effects are distributed about the statistic. In the context of meta-analysis, in about 95% of new studies, the true effect of that study will fall inside the 95% prediction interval (given that all modelling assumptions hold).
30 Prediction
A 100(1 − α)% prediction interval for the true effect sizes is given by:
[θ − z_{1−α/2}·τ, θ + z_{1−α/2}·τ].
In practice, we need to substitute in our estimates for θ and τ, so this becomes an approximate 100(1 − α)% prediction interval.
Note: while the confidence interval is based upon the variance of the estimated mean effect, Var(θ̂), the prediction interval is based upon the between-study variance τ².
Thus, as the number of studies increases, a confidence interval will automatically get tighter, since Var(θ̂) → 0... but the between-study variance τ² will stay (close to) fixed.
The more studies we have, generally, the better the estimate T² of τ² we will have. But this estimate will settle on some nonzero number (unless there is no actual between-study variation).
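The slide's z-based interval is a one-liner once the estimates are plugged in. Both inputs below are assumed values for illustration (a mean effect of 0.40 and τ̂² = 0.03), and z_{0.975} is taken as the familiar 1.96:

```python
import math

# Approximate 95% prediction interval from hypothetical RE-model estimates.
theta_re = 0.40    # assumed estimated mean effect
tau2_hat = 0.03    # assumed estimated between-study variance

z = 1.96           # z_{1 - alpha/2} for alpha = 0.05
tau_hat = math.sqrt(tau2_hat)

lower = theta_re - z * tau_hat    # theta_hat - z * tau_hat
upper = theta_re + z * tau_hat    # theta_hat + z * tau_hat
print(round(lower, 3), round(upper, 3))
```

Notice the interval's width depends only on τ̂, not on the number of studies, which is exactly why it stays wide in the diagram on the next slide even as the confidence interval shrinks.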
31 Prediction
Consider the diagram from Borenstein p. 132. [Figure not reproduced in transcription.]
Note: the width of the CI gets smaller as the number of studies increases, while the width of the PI stays about the same.
32 Prediction
Prediction interval for the correlation between section mean instructor rating and section mean final exam score (controlling for student ability) from last time: [Figure not reproduced in transcription.]
33 Subgroup analysis and meta-regression
We have now done the following:
- Estimated a combined effect size with associated uncertainty (fixed or random effects model)
- Tested the homogeneity assumption (fixed effects model)
- Quantified heterogeneity (random effects model)
- Predicted a plausible range of true effect sizes (random effects model)
But a natural question remains: if the true effect sizes are heterogeneous, then what explains this heterogeneity?
34 Subgroup analysis and meta-regression
Recall the basic random effects model:
θ̂_k = θ + u_k + ε_k,
where θ̂_k and ε_k ~ N(0, σ_k²) are as always, θ is the mean effect in the population, and u_k ~ N(0, τ²) captures the between-study variation. Also, u_k is assumed to be independent of ε_k.
This basic model treats the between-study variation as a simple random component, totally explained by a simple normal random variable. However, in practice, we might often expect that such heterogeneity is explained by other study-wide factors.
35 Subgroup analysis and meta-regression
Possible examples of informative heterogeneity:
- Some studies in our meta-analysis may be experimental, while others are observational. We would likely expect both more variation and more bias to arise from the observational studies.
- Multiple studies could have been conducted at the same lab or by the same research group. Consequently, we would expect less variation between studies coming from the same lab/group.
- Studies could be conducted in different provinces or countries.
- Studies could be conducted at different times, so subject to different traditional protocols, government policies, etc.
How do we account for these things?
36 Subgroup analysis and meta-regression
ANSWER: work with a mixed effects model. We can simply add other explanatory factors to the basic random effects model, e.g.:
θ̂_k = θ + Type_k + Lab_k + u_k + ε_k,
where Type_k accounts for whether the study was experimental or observational (binary) and Lab_k records which lab or research group conducted study k (categorical).
The above model would allow for what is called a subgroup analysis. This is just a special case of meta-regression.
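A minimal sketch of a subgroup analysis with one binary moderator: compute a random-effects weighted summary within each subgroup and compare them. Everything below is hypothetical, including the study effects, the Type coding, and the assumed within-subgroup τ̂²; a real analysis (e.g. in Jamovi) would also estimate τ² and give standard errors for the moderator effect.

```python
# Subgroup analysis sketch: hypothetical effects split by study Type
# (0 = experimental, 1 = observational); all numbers are made up.
theta_hat = [0.20, 0.30, 0.25, 0.60, 0.55, 0.70]
var_hat = [0.02, 0.02, 0.03, 0.02, 0.03, 0.02]
study_type = [0, 0, 0, 1, 1, 1]
tau2_hat = 0.01   # assumed between-study variance within subgroups

def weighted_mean(group):
    """Random-effects weighted mean for the studies in the given subgroup."""
    idx = [k for k, g in enumerate(study_type) if g == group]
    w = [1.0 / (var_hat[k] + tau2_hat) for k in idx]
    return sum(wk * theta_hat[k] for wk, k in zip(w, idx)) / sum(w)

mean_exp = weighted_mean(0)           # subgroup summary: experimental studies
mean_obs = weighted_mean(1)           # subgroup summary: observational studies
type_effect = mean_obs - mean_exp     # estimated moderator (Type) effect
print(round(mean_exp, 3), round(mean_obs, 3), round(type_effect, 3))
```

This is exactly the dummy-coded special case of meta-regression: the intercept is the experimental-group summary and the Type coefficient is the difference between the two subgroup summaries.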
37 Subgroup analysis and meta-regression
We can easily incorporate continuous explanatory variables (covariates) into the general mixed effects model. We can also easily incorporate other random effects if we would like (e.g., we could treat Lab as a random effect in the previous example). Downside: this will require lots of data.
We will explore the details of subgroup analysis (meta-regression) next time.
Multilevel Models in Matrix Form Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Today s Lecture Linear models from a matrix perspective An example of how to do
More informationCPSC 340: Machine Learning and Data Mining. MLE and MAP Fall 2017
CPSC 340: Machine Learning and Data Mining MLE and MAP Fall 2017 Assignment 3: Admin 1 late day to hand in tonight, 2 late days for Wednesday. Assignment 4: Due Friday of next week. Last Time: Multi-Class
More informationGeneral Principles Within-Cases Factors Only Within and Between. Within Cases ANOVA. Part One
Within Cases ANOVA Part One 1 / 25 Within Cases A case contributes a DV value for every value of a categorical IV It is natural to expect data from the same case to be correlated - NOT independent For
More informationOnline Supplementary Material. MetaLP: A Nonparametric Distributed Learning Framework for Small and Big Data
Online Supplementary Material MetaLP: A Nonparametric Distributed Learning Framework for Small and Big Data PI : Subhadeep Mukhopadhyay Department of Statistics, Temple University Philadelphia, Pennsylvania,
More informationBayesian Linear Regression
Bayesian Linear Regression Sudipto Banerjee 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. September 15, 2010 1 Linear regression models: a Bayesian perspective
More informationPsych 10 / Stats 60, Practice Problem Set 10 (Week 10 Material), Solutions
Psych 10 / Stats 60, Practice Problem Set 10 (Week 10 Material), Solutions Part 1: Conceptual ideas about correlation and regression Tintle 10.1.1 The association would be negative (as distance increases,
More informationFitting a Straight Line to Data
Fitting a Straight Line to Data Thanks for your patience. Finally we ll take a shot at real data! The data set in question is baryonic Tully-Fisher data from http://astroweb.cwru.edu/sparc/btfr Lelli2016a.mrt,
More informationTwo-sample Categorical data: Testing
Two-sample Categorical data: Testing Patrick Breheny April 1 Patrick Breheny Introduction to Biostatistics (171:161) 1/28 Separate vs. paired samples Despite the fact that paired samples usually offer
More informationChapter 6. Logistic Regression. 6.1 A linear model for the log odds
Chapter 6 Logistic Regression In logistic regression, there is a categorical response variables, often coded 1=Yes and 0=No. Many important phenomena fit this framework. The patient survives the operation,
More informationModel comparison. Christopher A. Sims Princeton University October 18, 2016
ECO 513 Fall 2008 Model comparison Christopher A. Sims Princeton University sims@princeton.edu October 18, 2016 c 2016 by Christopher A. Sims. This document may be reproduced for educational and research
More informationBiostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras
Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Lecture - 39 Regression Analysis Hello and welcome to the course on Biostatistics
More informationSimple logistic regression
Simple logistic regression Biometry 755 Spring 2009 Simple logistic regression p. 1/47 Model assumptions 1. The observed data are independent realizations of a binary response variable Y that follows a
More informationMultivariate Statistical Analysis
Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 9 for Applied Multivariate Analysis Outline Addressing ourliers 1 Addressing ourliers 2 Outliers in Multivariate samples (1) For
More informationECON3150/4150 Spring 2015
ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2
More informationIV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors
IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors Laura Mayoral IAE, Barcelona GSE and University of Gothenburg Gothenburg, May 2015 Roadmap Deviations from the standard
More informationLecture 32: Asymptotic confidence sets and likelihoods
Lecture 32: Asymptotic confidence sets and likelihoods Asymptotic criterion In some problems, especially in nonparametric problems, it is difficult to find a reasonable confidence set with a given confidence
More informationCourse Review. Kin 304W Week 14: April 9, 2013
Course Review Kin 304W Week 14: April 9, 2013 1 Today s Outline Format of Kin 304W Final Exam Course Review Hand back marked Project Part II 2 Kin 304W Final Exam Saturday, Thursday, April 18, 3:30-6:30
More informationSampling Distributions: Central Limit Theorem
Review for Exam 2 Sampling Distributions: Central Limit Theorem Conceptually, we can break up the theorem into three parts: 1. The mean (µ M ) of a population of sample means (M) is equal to the mean (µ)
More informationCategorical and Zero Inflated Growth Models
Categorical and Zero Inflated Growth Models Alan C. Acock* Summer, 2009 *Alan C. Acock, Department of Human Development and Family Sciences, Oregon State University, Corvallis OR 97331 (alan.acock@oregonstate.edu).
More informationVARIANCE COMPONENT ANALYSIS
VARIANCE COMPONENT ANALYSIS T. KRISHNAN Cranes Software International Limited Mahatma Gandhi Road, Bangalore - 560 001 krishnan.t@systat.com 1. Introduction In an experiment to compare the yields of two
More informationModule 03 Lecture 14 Inferential Statistics ANOVA and TOI
Introduction of Data Analytics Prof. Nandan Sudarsanam and Prof. B Ravindran Department of Management Studies and Department of Computer Science and Engineering Indian Institute of Technology, Madras Module
More informationIntroduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016
Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 EPSY 905: Intro to Bayesian and MCMC Today s Class An
More information" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2
Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the
More informationf rot (Hz) L x (max)(erg s 1 )
How Strongly Correlated are Two Quantities? Having spent much of the previous two lectures warning about the dangers of assuming uncorrelated uncertainties, we will now address the issue of correlations
More informationPRINCIPAL COMPONENTS ANALYSIS
121 CHAPTER 11 PRINCIPAL COMPONENTS ANALYSIS We now have the tools necessary to discuss one of the most important concepts in mathematical statistics: Principal Components Analysis (PCA). PCA involves
More informationINTERVAL ESTIMATION AND HYPOTHESES TESTING
INTERVAL ESTIMATION AND HYPOTHESES TESTING 1. IDEA An interval rather than a point estimate is often of interest. Confidence intervals are thus important in empirical work. To construct interval estimates,
More informationRelated Concepts: Lecture 9 SEM, Statistical Modeling, AI, and Data Mining. I. Terminology of SEM
Lecture 9 SEM, Statistical Modeling, AI, and Data Mining I. Terminology of SEM Related Concepts: Causal Modeling Path Analysis Structural Equation Modeling Latent variables (Factors measurable, but thru
More informationBusiness Statistics. Lecture 9: Simple Regression
Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals
More informationCPSC 340: Machine Learning and Data Mining
CPSC 340: Machine Learning and Data Mining MLE and MAP Original version of these slides by Mark Schmidt, with modifications by Mike Gelbart. 1 Admin Assignment 4: Due tonight. Assignment 5: Will be released
More informationPubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH
PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH The First Step: SAMPLE SIZE DETERMINATION THE ULTIMATE GOAL The most important, ultimate step of any of clinical research is to do draw inferences;
More informationACMS Statistics for Life Sciences. Chapter 13: Sampling Distributions
ACMS 20340 Statistics for Life Sciences Chapter 13: Sampling Distributions Sampling We use information from a sample to infer something about a population. When using random samples and randomized experiments,
More informationLecture 3: Just a little more math
Lecture 3: Just a little more math Last time Through simple algebra and some facts about sums of normal random variables, we derived some basic results about orthogonal regression We used as our major
More informationLecture 2: Poisson and logistic regression
Dankmar Böhning Southampton Statistical Sciences Research Institute University of Southampton, UK S 3 RI, 11-12 December 2014 introduction to Poisson regression application to the BELCAP study introduction
More informationA Re-Introduction to General Linear Models (GLM)
A Re-Introduction to General Linear Models (GLM) Today s Class: You do know the GLM Estimation (where the numbers in the output come from): From least squares to restricted maximum likelihood (REML) Reviewing
More informationMeasures of Association and Variance Estimation
Measures of Association and Variance Estimation Dipankar Bandyopadhyay, Ph.D. Department of Biostatistics, Virginia Commonwealth University D. Bandyopadhyay (VCU) BIOS 625: Categorical Data & GLM 1 / 35
More informationRegression and the 2-Sample t
Regression and the 2-Sample t James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Regression and the 2-Sample t 1 / 44 Regression
More informationSections 3.4, 3.5. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis
Sections 3.4, 3.5 Timothy Hanson Department of Statistics, University of South Carolina Stat 770: Categorical Data Analysis 1 / 22 3.4 I J tables with ordinal outcomes Tests that take advantage of ordinal
More informationGeneralized linear models
Generalized linear models Outline for today What is a generalized linear model Linear predictors and link functions Example: estimate a proportion Analysis of deviance Example: fit dose- response data
More informationMATH 1150 Chapter 2 Notation and Terminology
MATH 1150 Chapter 2 Notation and Terminology Categorical Data The following is a dataset for 30 randomly selected adults in the U.S., showing the values of two categorical variables: whether or not the
More informationOne-sample categorical data: approximate inference
One-sample categorical data: approximate inference Patrick Breheny October 6 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/25 Introduction It is relatively easy to think about the distribution
More informationHierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!
Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter
More information