Previous lecture. Single variant association. Use genome-wide SNPs to account for confounding (population substructure)
|
|
- Barbra Briggs
- 6 years ago
- Views:
Transcription
1 Previous lecture Single variant association Use genome-wide SNPs to account for confounding (population substructure) Estimation of effect size and winner s curse
2 Meta-Analysis Today s outline P-value based methods. Fixed effect model. Meta vs. pooled analysis. Random effects model. Meta vs. pooled analysis New random effects analysis
3 Meta-Analysis Single study is under-powered because the effect of common variant is very modest (OR 1.4) Meta-analysis is an effective way to combine data from multiple independent studies
4 Meta-Analysis Methods P-value based Regression coefficient based Fixed effects model Random effects model
5 P-value Based Conduct meta-analysis using p-values Simple and widely used Fisher and Z -score based methods
6 P-value Based K studies T Fisher = Simple and works well K 2 log(p k ) χ 2 2K k=1 Direction of effect is not considered
7 P-value Based Stouffer s Z -score (one-sided right-tailed) Z k = Φ(1 p k ) Φ is the standard cumulative normal distribution function Z = K k=1 Z k K Z follows the standard normal distribution.
8 Fisher versus Z -score These two are not perfectly linear, but they follow a highly linear relationship over the range of Z values most observed (Z 1).
9 P-value Based Z -score can incorporate direction of effects and weighting scheme βk : effect for study k w k = n k Z k = Φ(1 p k /2) sign(β k ) K k=1 Z = w kz k N(0, 1) K k=1 w 2 k
10 P-value Based Easy to use Z -score is perhaps more popular as it works well to find a consistent signal from studies Cannot estimate the effect size
11 Regression Coefficient Based For each study k = 1,..., K g{e(y ik )} = X ik α k + G ik β k Estimate β k and its standard error, ( β k, σ k ) β k N(β k, σ 2 k )
12 Fixed Effect Model Assume β 1 = = β K Effect size estimation β = K k=1 w k β k K k=1 w k n( β β) N ( 0, n k w 2 k σ2 k ( k w k) 2 w k = 1 gives the minimal variance of β var( β k ) )
13 Fixed Effect Model Assumes all studies in the analysis have the same effect Each study can be considered as a random sample drawn from the population with true parameter value β. There is between-study heterogeneity Different definitions of phenotypes. Effect size may be higher (or lower) in certain subgroups (e.g., age, sex).
14 Assess Heterogeneity Cochran s Q test K Q = w k ( β k β) 2 k=1 Q should be large if there is heterogeneity. Under the null hypothesis of no heterogeneity Q χ 2 K 1 Under powered when there are fewer studies.
15 Assess Heterogeneity Measure the % of total variance explained by the between-study heterogeneity (Higgins and Thompson 2002) I 2 Q (K 1) = 100% Q > 50% indicates large heterogeneity Intuitive interpretation, simple to calculate, and can be accompanied by an uncertainty interval
16 An Example
17 Meta- vs. Pooled-Analysis Meta analysis: Combining summary statistics (e.g., β k ) of studies Pooled analysis: Combining original or individual-level of all studies
18 Relative Efficiency For k = 1,, K studies g{e(y ki )} = α k + β T X ki β is common to all K studies; α k is specific to the kth study The maximum likelihood estimator (MLE) β k for the kth study by maximizing n k L k = sup αk f (Y ki, X ki ; β, α k ) i=1
19 Relative Efficiency β is MLE of β by maximizing K n k L = sup {α1,,α K } f (Y ki, X ki ; β, α k ) k=1 i=1 K n k = sup αk f (Y ki, X ki ; β, α k ) = k=1 K k=1 L k i=1
20 Relative Efficiency Information Recall var( β) = 1 I(β) I(β) = 2 K β 2 log L = 2 β 2 log L k = var( β) = K k=1 k=1 1 1 var(β k ) = K I k (β) k=1 1 K k=1 I k(β) Using summary statistics has the same asymptotic efficiency as using original data, if β is the only common parameter across studies.
21 Relative Efficiency Common nuisance parameters, say γ. ( ) ( ) IK I k = ββ I K βγ Iββ I I = βγ I K β I K γγ { K var( β) = (I kββ I kβγ I 1 k=1 = (I ββ K k=1 kγγ I kγβ) I kβγ I 1 kγγ I kγβ) 1 I γβ I γγ } 1 (I ββ I βγ I 1 γγ I γβ ) 1 = var( β) Equality holds if and only of var(β k ) 1 cov( β k, γ k ) are the same among the K studies.
22 Random Effects Model Under the fixed effect model, β 1 = = β K = β The random effects model β k = β + ξ k (k = 1,, K ) ξ k N(0, τ 2 )
23 Random Effects Model Estimation of τ 2 DerSimonian & Laird (1986) method-of-moments estimator V k = var( β k β k ) var( β k ) = V k + τ 2 τ 2 = V 1 K Q (K 1) V 2 K / V 1 K Estimate β with inverse variance estimator ŵ k = var( β k ) SE( β) = β = K k=1 ŵk β k K k=1 ŵk 1 k ŵk
24 Random Effects Model Fixed effect model is more powerful, but ignores the heterogeneity between studies. Random effects model is probably more robust, but is under powered. The confidence intervals have poor coverage for small and moderate K.
25 Meta- vs Pooled-Analysis The maximum likelihood estimator β from pooled data can be obtained by maximizing K k=1 ) 1 f k (Y ki X ki, β + ξ k ) ( exp ξ2 k 2πτ 2 2τ 2 dξ k Challenging to establish the theoretical properties of β and β under the random effects model because n k K.
26 Asymptotic Distributions (Zeng and Lin 2015) Assumptions: For k = 1,, K, n k = π k n for some constant π k within a compact interval (0, ). τ 2 = 1 n σ2, σ 2 is a constant. The between-study variability is comparable to the within-study variability. Asymptotic distribution of MLE β n( β β0 ) d ( K n τ 2 d Ã; n K Vk v k k=1 π k v k + π k A ) K k=1 π 1/2 k v k + π k A Z k Z k are independently distributed N(0, v k + π k σ 2 0 ). A is a complicated form that involves {Z k }. It is a mixture of normal random variables with the mixing probabilities both being random and correlated with the normal random variables.
27 Asymptotic Distributions Asymptotic distribution of β n τ 2 d  n( β β0 ) d ( K k=1 π k v k + π k  ) 1 K k=1 π 1/2 k Z k v k + π k  As n, K, Kn 1/2 0, var( β) var( β), i.e., the weighted average estimator β is at least as efficient as the MLE When K = 100, 200, 400, the empirical relative efficiency is 1.037, 1.083, and Perform statistical inference for β and β based on asymptotic distributions using resampling techniques Or simpler, use the profile likelihood to construct 95% confidence intervals (Hardy and Thompson, 1996)
28 95% Coverage Probabilities Solid: Zeng & Lin; Dashed: DerSimonian-Laird; dotted Jackson-Bowden resampling method; dot-dash: Hardy-Thompson profile method. Left: K=10 studies; Right: K=20
29 Testing with random effects model Random effects (RE) model gives less significant p values than the fixed effects model when the variants show varying effect sizes between studies Ironic because RE is designed specifically for the case in which there is heterogeneity All associations identified RE are usually identified by the fixed effect model Causal variants showing high between-study heterogeneity might not be discovered by either method
30 Revisit the Traditional RE First step: estimate the effect size and CI by taking heterogeneity into account. Second step: normalize β/se( β) and translate it into p-value. Effectively, RE assumes heterogeneity under the null hypothesis, i.e., β k N(0, σ 2 + τ 2 ). There should not be heterogeneity under the null because β 1 = = β K = 0.
31 New RE New null hypothesis H 0 : β = 0, τ 2 = 0. Model β = ( β 1,, β K ). β k N(0, σ 2 ) β follows a multivariate normal distribution β MVN(β1, Σ) The likelihood ratio test statistic. Σ = V + τ 2 I, V = diag(v 1,, V K ). S new = 2 log L 1 L 0 = L 1 = K k=1 K k=1 L 0 1 2πVk exp ( β 2 k 2V k ) ( 1 2π(Vk + τ 2 ) exp ( β k β) 2 2(V k + τ 2 ) )
32 New RE Basically we test both fixed and random effects together S new = S β + S τ 2 S β is equal to the Z-statistic based on the fixed effect model, and S τ 2 is the test statistic for testing τ 2 = 0 β is unconstrained, but τ 2 0. Hence, S new 1 2 χ χ2 0. However, asymptotics cannot be applied because of small K. The asymptotic p-value is conservative because of the tail of asymptotic distribution is thicker than that of the true distribution at the tail Resampling approach should be used to calculate p-values
33 Power Comparison
34 Interpretation and Prioritization In the usual meta-analysis where one collects similar studies and expects the common effect, the results found by the fixed effects model should be the top priority. An association showing large heterogeneity requires careful investigation (e.g., different pathways, LD pattern, different study populations). Effect size estimate and CI is the same as those in the current RE, but may be inconsistent between wide CI and statistically significant results.
35 Comparison between meta- vs pooled-analysis Meta Pooled Logistic Easy Difficult Time-efficient Cheap Bias Time-consuming Costly Publication Possible Unlikely Exposure assessment May not be comparable Comparable Confounders May not be consistent Consistent Efficiency Relatively large data Similar Similar Sparse data May be unstable Better Complex analysis (e.g. machine learning) Difficult Easy Long-term benefit Always requires coordination Better
36 Meta- vs pooled-analysis for sparse data β is unstable if data are sparse in each study. However, if the interest is only on testing H 0 : β = 0, there are ways to combine the studies with only summary statistics. Recall the score test: [ [ 0)] β β=0 log L(β, α I(β = 0 α 0 ) 0)] 1 β β=0 log L(β, α χ 2 1 I(β = 0 α 0 ) = { I ββ I βα I 1 ααi αβ } β=0, α0 [ ] β log L(β, α 0) β=0 = [ K k=1 β log L β=0 k(β, α 0 )] I ββ β=0, α0 = K β=0, α0. Similar for I βα, I αα, and I αβ k=1 I kββ
37 Summary P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing.
38 Recommended Reading DerSimonian R & Laird N (1986). Meta-analysis in clinical trials. Contr Clin Trials 7: Han B and Eskin E (2011). Random-effects model aimed at discovering associations in meta-analysis of genome-wide association studies. Am J Hum Genet 88: Hardy RJ & Thomson SG (1996) A likelihood approach to meta-analysis with random effects. Stat in Med 30: Lin DY and Zeng D (2010). On the relative efficiency of using summary statistics vs. individual level data in meta-analysis. Biometrika 97: Zeng D and Lin DY (2015). On random-effects meta-analysis. Biometrika 102:
On the relative efficiency of using summary statistics versus individual level data in meta-analysis
On the relative efficiency of using summary statistics versus individual level data in meta-analysis By D. Y. LIN and D. ZENG Department of Biostatistics, CB# 7420, University of North Carolina, Chapel
More informationPrevious lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing.
Previous lecture P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing. Interaction Outline: Definition of interaction Additive versus multiplicative
More informationAssociation studies and regression
Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration
More informationReview We have covered so far: Single variant association analysis and effect size estimation GxE interaction and higher order >2 interaction Measurement error in dietary variables (nutritional epidemiology)
More informationGood Confidence Intervals for Categorical Data Analyses. Alan Agresti
Good Confidence Intervals for Categorical Data Analyses Alan Agresti Department of Statistics, University of Florida visiting Statistics Department, Harvard University LSHTM, July 22, 2011 p. 1/36 Outline
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationAn accurate test for homogeneity of odds ratios based on Cochran s Q-statistic
Kulinskaya and Dollinger TECHNICAL ADVANCE An accurate test for homogeneity of odds ratios based on Cochran s Q-statistic Elena Kulinskaya 1* and Michael B Dollinger 2 * Correspondence: e.kulinskaya@uea.ac.uk
More informationSTAT331. Cox s Proportional Hazards Model
STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations
More informationSTATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
More informationStatistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation
Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence
More informationMaximum-Likelihood Estimation: Basic Ideas
Sociology 740 John Fox Lecture Notes Maximum-Likelihood Estimation: Basic Ideas Copyright 2014 by John Fox Maximum-Likelihood Estimation: Basic Ideas 1 I The method of maximum likelihood provides estimators
More informationConfidence Distribution
Confidence Distribution Xie and Singh (2013): Confidence distribution, the frequentist distribution estimator of a parameter: A Review Céline Cunen, 15/09/2014 Outline of Article Introduction The concept
More informationA re-appraisal of fixed effect(s) meta-analysis
A re-appraisal of fixed effect(s) meta-analysis Ken Rice, Julian Higgins & Thomas Lumley Universities of Washington, Bristol & Auckland tl;dr Fixed-effectS meta-analysis answers a sensible question regardless
More informationExtending the MR-Egger method for multivariable Mendelian randomization to correct for both measured and unmeasured pleiotropy
Received: 20 October 2016 Revised: 15 August 2017 Accepted: 23 August 2017 DOI: 10.1002/sim.7492 RESEARCH ARTICLE Extending the MR-Egger method for multivariable Mendelian randomization to correct for
More informationSummary and discussion of The central role of the propensity score in observational studies for causal effects
Summary and discussion of The central role of the propensity score in observational studies for causal effects Statistics Journal Club, 36-825 Jessica Chemali and Michael Vespe 1 Summary 1.1 Background
More informationEPSE 594: Meta-Analysis: Quantitative Research Synthesis
EPSE 594: Meta-Analysis: Quantitative Research Synthesis Ed Kroc University of British Columbia ed.kroc@ubc.ca January 24, 2019 Ed Kroc (UBC) EPSE 594 January 24, 2019 1 / 37 Last time Composite effect
More informationOpen Problems in Mixed Models
xxiii Determining how to deal with a not positive definite covariance matrix of random effects, D during maximum likelihood estimation algorithms. Several strategies are discussed in Section 2.15. For
More informationHypothesis testing:power, test statistic CMS:
Hypothesis testing:power, test statistic The more sensitive the test, the better it can discriminate between the null and the alternative hypothesis, quantitatively, maximal power In order to achieve this
More informationMeta-analysis. 21 May Per Kragh Andersen, Biostatistics, Dept. Public Health
Meta-analysis 21 May 2014 www.biostat.ku.dk/~pka Per Kragh Andersen, Biostatistics, Dept. Public Health pka@biostat.ku.dk 1 Meta-analysis Background: each single study cannot stand alone. This leads to
More informationTECHNICAL REPORT # 59 MAY Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study
TECHNICAL REPORT # 59 MAY 2013 Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study Sergey Tarima, Peng He, Tao Wang, Aniko Szabo Division of Biostatistics,
More information8. Instrumental variables regression
8. Instrumental variables regression Recall: In Section 5 we analyzed five sources of estimation bias arising because the regressor is correlated with the error term Violation of the first OLS assumption
More informationLogistic regression: Miscellaneous topics
Logistic regression: Miscellaneous topics April 11 Introduction We have covered two approaches to inference for GLMs: the Wald approach and the likelihood ratio approach I claimed that the likelihood ratio
More information1. Hypothesis testing through analysis of deviance. 3. Model & variable selection - stepwise aproaches
Sta 216, Lecture 4 Last Time: Logistic regression example, existence/uniqueness of MLEs Today s Class: 1. Hypothesis testing through analysis of deviance 2. Standard errors & confidence intervals 3. Model
More informationMachine Learning Linear Classification. Prof. Matteo Matteucci
Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)
More informationPower and sample size calculations for designing rare variant sequencing association studies.
Power and sample size calculations for designing rare variant sequencing association studies. Seunggeun Lee 1, Michael C. Wu 2, Tianxi Cai 1, Yun Li 2,3, Michael Boehnke 4 and Xihong Lin 1 1 Department
More informationSurvival Analysis for Case-Cohort Studies
Survival Analysis for ase-ohort Studies Petr Klášterecký Dept. of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, harles University, Prague, zech Republic e-mail: petr.klasterecky@matfyz.cz
More informationComputational Systems Biology: Biology X
Bud Mishra Room 1002, 715 Broadway, Courant Institute, NYU, New York, USA L#7:(Mar-23-2010) Genome Wide Association Studies 1 The law of causality... is a relic of a bygone age, surviving, like the monarchy,
More informationDesign of Multiregional Clinical Trials (MRCT) Theory and Practice
Design of Multiregional Clinical Trials (MRCT) Theory and Practice Gordon Lan Janssen R&D, Johnson & Johnson BASS 2015, Rockville, Maryland Collaborators: Fei Chen and Gang Li, Johnson & Johnson BASS 2015_LAN
More informationBinary choice 3.3 Maximum likelihood estimation
Binary choice 3.3 Maximum likelihood estimation Michel Bierlaire Output of the estimation We explain here the various outputs from the maximum likelihood estimation procedure. Solution of the maximum likelihood
More informationLecture 9: Kernel (Variance Component) Tests and Omnibus Tests for Rare Variants. Summer Institute in Statistical Genetics 2017
Lecture 9: Kernel (Variance Component) Tests and Omnibus Tests for Rare Variants Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2017 1 / 46 Lecture Overview 1. Variance Component
More informationSTAT 135 Lab 2 Confidence Intervals, MLE and the Delta Method
STAT 135 Lab 2 Confidence Intervals, MLE and the Delta Method Rebecca Barter February 2, 2015 Confidence Intervals Confidence intervals What is a confidence interval? A confidence interval is calculated
More informationUNIVERSITY OF CALIFORNIA, SAN DIEGO
UNIVERSITY OF CALIFORNIA, SAN DIEGO Estimation of the primary hazard ratio in the presence of a secondary covariate with non-proportional hazards An undergraduate honors thesis submitted to the Department
More informationReview of Statistics 101
Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods
More informationChapter 1 Statistical Inference
Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationPubh 8482: Sequential Analysis
Pubh 8482: Sequential Analysis Joseph S. Koopmeiners Division of Biostatistics University of Minnesota Week 7 Course Summary To this point, we have discussed group sequential testing focusing on Maintaining
More informationQuantitative Genomics and Genetics BTRY 4830/6830; PBSB
Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.5201.01 Lecture 20: Epistasis and Alternative Tests in GWAS Jason Mezey jgm45@cornell.edu April 16, 2016 (Th) 8:40-9:55 None Announcements Summary
More informationPsychology 282 Lecture #4 Outline Inferences in SLR
Psychology 282 Lecture #4 Outline Inferences in SLR Assumptions To this point we have not had to make any distributional assumptions. Principle of least squares requires no assumptions. Can use correlations
More informationBiostat Methods STAT 5820/6910 Handout #9a: Intro. to Meta-Analysis Methods
Biostat Methods STAT 5820/6910 Handout #9a: Intro. to Meta-Analysis Methods Meta-analysis describes statistical approach to systematically combine results from multiple studies [identified follong an exhaustive
More informationCOMP90051 Statistical Machine Learning
COMP90051 Statistical Machine Learning Semester 2, 2017 Lecturer: Trevor Cohn 17. Bayesian inference; Bayesian regression Training == optimisation (?) Stages of learning & inference: Formulate model Regression
More informationIs the cholesterol concentration in blood related to the body mass index (bmi)?
Regression problems The fundamental (statistical) problems of regression are to decide if an explanatory variable affects the response variable and estimate the magnitude of the effect Major question:
More informationPrediction intervals for random-effects meta-analysis: a confidence distribution approach
Statistical Methods in Medical Research 2018. In press. DOI: 10.1177/0962280218773520 Prediction intervals for random-effects meta-analysis: a confidence distribution approach Kengo Nagashima 1 *, Hisashi
More informationStat 535 C - Statistical Computing & Monte Carlo Methods. Lecture 15-7th March Arnaud Doucet
Stat 535 C - Statistical Computing & Monte Carlo Methods Lecture 15-7th March 2006 Arnaud Doucet Email: arnaud@cs.ubc.ca 1 1.1 Outline Mixture and composition of kernels. Hybrid algorithms. Examples Overview
More informationA new strategy for meta-analysis of continuous covariates in observational studies with IPD. Willi Sauerbrei & Patrick Royston
A new strategy for meta-analysis of continuous covariates in observational studies with IPD Willi Sauerbrei & Patrick Royston Overview Motivation Continuous variables functional form Fractional polynomials
More informationProbability and Statistics
CHAPTER 5: PARAMETER ESTIMATION 5-0 Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be CHAPTER 5: PARAMETER
More informationPrediction intervals for random-effects meta-analysis: a confidence distribution approach
Preprint / Submitted Version Prediction intervals for random-effects meta-analysis: a confidence distribution approach Kengo Nagashima 1 *, Hisashi Noma 2, and Toshi A. Furukawa 3 Abstract For the inference
More informationBrandon C. Kelly (Harvard Smithsonian Center for Astrophysics)
Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Probability quantifies randomness and uncertainty How do I estimate the normalization and logarithmic slope of a X ray continuum, assuming
More information14.30 Introduction to Statistical Methods in Economics Spring 2009
MIT OpenCourseWare http://ocw.mit.edu 4.0 Introduction to Statistical Methods in Economics Spring 009 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
More informationSome New Aspects of Dose-Response Models with Applications to Multistage Models Having Parameters on the Boundary
Some New Aspects of Dose-Response Models with Applications to Multistage Models Having Parameters on the Boundary Bimal Sinha Department of Mathematics & Statistics University of Maryland, Baltimore County,
More informationChapter 20: Logistic regression for binary response variables
Chapter 20: Logistic regression for binary response variables In 1846, the Donner and Reed families left Illinois for California by covered wagon (87 people, 20 wagons). They attempted a new and untried
More informationLikelihood and Fairness in Multidimensional Item Response Theory
Likelihood and Fairness in Multidimensional Item Response Theory or What I Thought About On My Holidays Giles Hooker and Matthew Finkelman Cornell University, February 27, 2008 Item Response Theory Educational
More informationScore test for random changepoint in a mixed model
Score test for random changepoint in a mixed model Corentin Segalas and Hélène Jacqmin-Gadda INSERM U1219, Biostatistics team, Bordeaux GDR Statistiques et Santé October 6, 2017 Biostatistics 1 / 27 Introduction
More informationPlausible Values for Latent Variables Using Mplus
Plausible Values for Latent Variables Using Mplus Tihomir Asparouhov and Bengt Muthén August 21, 2010 1 1 Introduction Plausible values are imputed values for latent variables. All latent variables can
More informationSTAT 7030: Categorical Data Analysis
STAT 7030: Categorical Data Analysis 5. Logistic Regression Peng Zeng Department of Mathematics and Statistics Auburn University Fall 2012 Peng Zeng (Auburn University) STAT 7030 Lecture Notes Fall 2012
More informationBTRY 4830/6830: Quantitative Genomics and Genetics
BTRY 4830/6830: Quantitative Genomics and Genetics Lecture 23: Alternative tests in GWAS / (Brief) Introduction to Bayesian Inference Jason Mezey jgm45@cornell.edu Nov. 13, 2014 (Th) 8:40-9:55 Announcements
More informationREPRODUCIBLE ANALYSIS OF HIGH-THROUGHPUT EXPERIMENTS
REPRODUCIBLE ANALYSIS OF HIGH-THROUGHPUT EXPERIMENTS Ying Liu Department of Biostatistics, Columbia University Summer Intern at Research and CMC Biostats, Sanofi, Boston August 26, 2015 OUTLINE 1 Introduction
More informationTesting Homogeneity Of A Large Data Set By Bootstrapping
Testing Homogeneity Of A Large Data Set By Bootstrapping 1 Morimune, K and 2 Hoshino, Y 1 Graduate School of Economics, Kyoto University Yoshida Honcho Sakyo Kyoto 606-8501, Japan. E-Mail: morimune@econ.kyoto-u.ac.jp
More informationBeta-binomial model for meta-analysis of odds ratios
Research Article Received 9 June 2016, Accepted 3 January 2017 Published online in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/sim.7233 Beta-binomial model for meta-analysis of odds ratios
More informationOutline of GLMs. Definitions
Outline of GLMs Definitions This is a short outline of GLM details, adapted from the book Nonparametric Regression and Generalized Linear Models, by Green and Silverman. The responses Y i have density
More informationSYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions
SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu
More informationUniversity of Bristol - Explore Bristol Research
Bowden, J., Del Greco M, F., Minelli, C., Davey Smith, G., Sheehan, N., & Thompson, J. (2017). A framework for the investigation of pleiotropy in twosample summary data Mendelian randomization. Statistics
More informationLecture : Probabilistic Machine Learning
Lecture : Probabilistic Machine Learning Riashat Islam Reasoning and Learning Lab McGill University September 11, 2018 ML : Many Methods with Many Links Modelling Views of Machine Learning Machine Learning
More informationA Robust Test for Two-Stage Design in Genome-Wide Association Studies
Biometrics Supplementary Materials A Robust Test for Two-Stage Design in Genome-Wide Association Studies Minjung Kwak, Jungnam Joo and Gang Zheng Appendix A: Calculations of the thresholds D 1 and D The
More informationA3. Statistical Inference Hypothesis Testing for General Population Parameters
Appendix / A3. Statistical Inference / General Parameters- A3. Statistical Inference Hypothesis Testing for General Population Parameters POPULATION H 0 : θ = θ 0 θ is a generic parameter of interest (e.g.,
More informationLecture 7: Interaction Analysis. Summer Institute in Statistical Genetics 2017
Lecture 7: Interaction Analysis Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2017 1 / 39 Lecture Outline Beyond main SNP effects Introduction to Concept of Statistical Interaction
More informationLecture 5: LDA and Logistic Regression
Lecture 5: and Logistic Regression Hao Helen Zhang Hao Helen Zhang Lecture 5: and Logistic Regression 1 / 39 Outline Linear Classification Methods Two Popular Linear Models for Classification Linear Discriminant
More informationLawrence D. Brown* and Daniel McCarthy*
Comments on the paper, An adaptive resampling test for detecting the presence of significant predictors by I. W. McKeague and M. Qian Lawrence D. Brown* and Daniel McCarthy* ABSTRACT: This commentary deals
More informationSTAT 536: Genetic Statistics
STAT 536: Genetic Statistics Tests for Hardy Weinberg Equilibrium Karin S. Dorman Department of Statistics Iowa State University September 7, 2006 Statistical Hypothesis Testing Identify a hypothesis,
More informationFoundations of Statistical Inference
Foundations of Statistical Inference Julien Berestycki Department of Statistics University of Oxford MT 2016 Julien Berestycki (University of Oxford) SB2a MT 2016 1 / 20 Lecture 6 : Bayesian Inference
More informationCausal Inference: Discussion
Causal Inference: Discussion Mladen Kolar The University of Chicago Booth School of Business Sept 23, 2016 Types of machine learning problems Based on the information available: Supervised learning Reinforcement
More informationQuantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing
Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October
More informationLinear Methods for Prediction
Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we
More information1 Preliminary Variance component test in GLM Mediation Analysis... 3
Honglang Wang Depart. of Stat. & Prob. wangho16@msu.edu Omics Data Integration Statistical Genetics/Genomics Journal Club Summary and discussion of Joint Analysis of SNP and Gene Expression Data in Genetic
More informationBias Variance Trade-off
Bias Variance Trade-off The mean squared error of an estimator MSE(ˆθ) = E([ˆθ θ] 2 ) Can be re-expressed MSE(ˆθ) = Var(ˆθ) + (B(ˆθ) 2 ) MSE = VAR + BIAS 2 Proof MSE(ˆθ) = E((ˆθ θ) 2 ) = E(([ˆθ E(ˆθ)]
More informationAssociation Testing with Quantitative Traits: Common and Rare Variants. Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5
Association Testing with Quantitative Traits: Common and Rare Variants Timothy Thornton and Katie Kerr Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5 1 / 41 Introduction to Quantitative
More informationIntroduction to Machine Learning
Introduction to Machine Learning Linear Regression Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB CSE 474/574 1
More informationHypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006
Hypothesis Testing Part I James J. Heckman University of Chicago Econ 312 This draft, April 20, 2006 1 1 A Brief Review of Hypothesis Testing and Its Uses values and pure significance tests (R.A. Fisher)
More informationPractical considerations for survival models
Including historical data in the analysis of clinical trials using the modified power prior Practical considerations for survival models David Dejardin 1 2, Joost van Rosmalen 3 and Emmanuel Lesaffre 1
More informationPoisson regression: Further topics
Poisson regression: Further topics April 21 Overdispersion One of the defining characteristics of Poisson regression is its lack of a scale parameter: E(Y ) = Var(Y ), and no parameter is available to
More informationStatistics 203: Introduction to Regression and Analysis of Variance Course review
Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying
More informationApplied Statistics and Econometrics
Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple
More informationMachine Learning Basics Lecture 2: Linear Classification. Princeton University COS 495 Instructor: Yingyu Liang
Machine Learning Basics Lecture 2: Linear Classification Princeton University COS 495 Instructor: Yingyu Liang Review: machine learning basics Math formulation Given training data x i, y i : 1 i n i.i.d.
More informationRegression techniques provide statistical analysis of relationships. Research designs may be classified as experimental or observational; regression
LOGISTIC REGRESSION Regression techniques provide statistical analysis of relationships. Research designs may be classified as eperimental or observational; regression analyses are applicable to both types.
More informationCS281 Section 4: Factor Analysis and PCA
CS81 Section 4: Factor Analysis and PCA Scott Linderman At this point we have seen a variety of machine learning models, with a particular emphasis on models for supervised learning. In particular, we
More informationOptimization. The value x is called a maximizer of f and is written argmax X f. g(λx + (1 λ)y) < λg(x) + (1 λ)g(y) 0 < λ < 1; x, y X.
Optimization Background: Problem: given a function f(x) defined on X, find x such that f(x ) f(x) for all x X. The value x is called a maximizer of f and is written argmax X f. In general, argmax X f may
More informationVariable selection and machine learning methods in causal inference
Variable selection and machine learning methods in causal inference Debashis Ghosh Department of Biostatistics and Informatics Colorado School of Public Health Joint work with Yeying Zhu, University of
More informationThe Generalized Higher Criticism for Testing SNP-sets in Genetic Association Studies
The Generalized Higher Criticism for Testing SNP-sets in Genetic Association Studies Ian Barnett, Rajarshi Mukherjee & Xihong Lin Harvard University ibarnett@hsph.harvard.edu August 5, 2014 Ian Barnett
More informationEmpirical likelihood-based methods for the difference of two trimmed means
Empirical likelihood-based methods for the difference of two trimmed means 24.09.2012. Latvijas Universitate Contents 1 Introduction 2 Trimmed mean 3 Empirical likelihood 4 Empirical likelihood for the
More informationProbabilistic Machine Learning. Industrial AI Lab.
Probabilistic Machine Learning Industrial AI Lab. Probabilistic Linear Regression Outline Probabilistic Classification Probabilistic Clustering Probabilistic Dimension Reduction 2 Probabilistic Linear
More informationResearch Article A Nonparametric Two-Sample Wald Test of Equality of Variances
Advances in Decision Sciences Volume 211, Article ID 74858, 8 pages doi:1.1155/211/74858 Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances David Allingham 1 andj.c.w.rayner
More informationA Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,
A Course in Applied Econometrics Lecture 18: Missing Data Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. When Can Missing Data be Ignored? 2. Inverse Probability Weighting 3. Imputation 4. Heckman-Type
More information15-388/688 - Practical Data Science: Basic probability. J. Zico Kolter Carnegie Mellon University Spring 2018
15-388/688 - Practical Data Science: Basic probability J. Zico Kolter Carnegie Mellon University Spring 2018 1 Announcements Logistics of next few lectures Final project released, proposals/groups due
More informationLecture 15 (Part 2): Logistic Regression & Common Odds Ratio, (With Simulations)
Lecture 15 (Part 2): Logistic Regression & Common Odds Ratio, (With Simulations) Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology
More informationLecture 4 January 23
STAT 263/363: Experimental Design Winter 2016/17 Lecture 4 January 23 Lecturer: Art B. Owen Scribe: Zachary del Rosario 4.1 Bandits Bandits are a form of online (adaptive) experiments; i.e. samples are
More informationCross-Validation with Confidence
Cross-Validation with Confidence Jing Lei Department of Statistics, Carnegie Mellon University UMN Statistics Seminar, Mar 30, 2017 Overview Parameter est. Model selection Point est. MLE, M-est.,... Cross-validation
More informationStatistical Distribution Assumptions of General Linear Models
Statistical Distribution Assumptions of General Linear Models Applied Multilevel Models for Cross Sectional Data Lecture 4 ICPSR Summer Workshop University of Colorado Boulder Lecture 4: Statistical Distributions
More informationChapter 7, continued: MANOVA
Chapter 7, continued: MANOVA The Multivariate Analysis of Variance (MANOVA) technique extends Hotelling T 2 test that compares two mean vectors to the setting in which there are m 2 groups. We wish to
More informationMultiple regression. CM226: Machine Learning for Bioinformatics. Fall Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar
Multiple regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Multiple regression 1 / 36 Previous two lectures Linear and logistic
More informationUnderstanding Ding s Apparent Paradox
Submitted to Statistical Science Understanding Ding s Apparent Paradox Peter M. Aronow and Molly R. Offer-Westort Yale University 1. INTRODUCTION We are grateful for the opportunity to comment on A Paradox
More informationEconometric Analysis of Cross Section and Panel Data
Econometric Analysis of Cross Section and Panel Data Jeffrey M. Wooldridge / The MIT Press Cambridge, Massachusetts London, England Contents Preface Acknowledgments xvii xxiii I INTRODUCTION AND BACKGROUND
More information