Logistisk regression T.K.
|
|
- Gertrude Kelley
- 5 years ago
- Views:
Transcription
1 Föreläsning 13: Logistisk regression T.K
2 Your Learning Outcomes Odds, Odds Ratio, Logit function, Logistic function Logistic regression definition likelihood function: maximum likelihood estimate
3 The setting: Y is a binary r.v.. x is its covariant, predictor or explanatory variable. x can be continuous, categorical. We cannot model the association of Y to x by a direct linear regression, Y = α + βx + e where e is, e.g., N (0, σ 2 ).
4 A Case Y = Bacterial Meningitis or Acute Viral Meningitis. x =cerebrospinal fluid total protein count.
5 A Case Diaz, Armando A and Tomba, Emanuele and Lennarson, Reese and Richard, Rex and Bagajewicz, Miguel J and Harrison, Roger G: Prediction of protein solubility in Escherichia coli using logistic regression, Biotechnology and bioengineering, 105, 2 pp , The authors develop a model for the prediction of the solubility of proteins overexpressed in the bacterium Escherichia coli. The model uses the statistical technique of logistic regression. To build this model, 32 covariates x i that could potentially correlate well with solubility were used. Logistic regression provides the probability p of a certain protein to belong (= Y ) to one set or another.
6 Part I: Concepts Odds, Odds Ratio Logit function Logistic function
7 Odds The odds of a statement (event e.t.c) A is calculated as the probability p(a) of observing A divided by the probability of not observing A: odds of A = p(a) 1 p(a) E.g., in humans an average of 51 boys are born in every 100 births, the odds of a randomly chosen delivery being boy are: odds of a boy = = 1.04 The odds of a certain thing happening are infinite.
8 An Aside: Posterior Odds The posterior probability p(h E ) of H given the evidence E: posterior odds of H = p(h E ) 1 p(h E ) If there are only two alternative hypotheses, H and not-h, then posterior odds of H = p(h E ) p(not-h E ) by Bayes formula = p(e H) p(h) p(e not-h) p(not-h). = likelihood ratio prior odds
9 Odds ratio The odds ratio ψ is the ratio of odds from two different conditions or populations. ψ = odds of A 1 odds of A 2 = p(a 1 ) 1 p(a 1 ) p(a 2 ) 1 p(a 2 )
10 The function logit(p) The logarithmic odds of success is called the logit of p ( ) p logit(p) = ln 1 p
11 The function logit(p) and its inverse ( ) p logit(p) = log 1 p If θ = logit(p), then the inverse function is p = logit 1 (θ) = eθ 1 + e θ
12 The logit(p) and its inverse The function ( ) p logit(p) = log, 0 < p < 1 1 p p = logit 1 (θ) = σ(θ) = eθ 1 + e θ = e θ 1, < θ <, 1 + e θ is called the logistic function or sigmoid. Note that σ(0) = 1 2.
13 Logistic function In biology the logistic function refers to change in size of a species population.
14 The logit(p) and the logistic function Sats The logit function ( ) p θ = logit(p) = log, 0 < p < 1 1 p and the logistic function p = σ(θ) = are inverse functions to each other. 1, < θ <, 1 + e θ
15 The logit(p) and the logistic function ( ) p 1 θ = logit(p) = log, 0 < p < 1 p = σ(θ) =, < θ <, 1 p 1 + e θ
16 The logistic distribution We see that 0 σ(x) 1. σ(x) 1, as x +. σ(x) 0, as x. σ(x) is strictly increasing. Hence σ(x) is the distribution function of a random variable.
17 The logistic distribution If for every x P(ɛ x) = σ(x) = 1, < x < 1 + e x then we say that ɛ has the logistic distribution ɛ logistic(0, 1)
18 The logistic distribution: an exercise Let P U(0, 1) and ɛ = ln P 1 P. Then ɛ logistic(0, 1)
19 The logistic distribution: another exercise ɛ logistic(0, 1) Then P( ɛ x) = P(ɛ x).
20 Simulating ɛ Logistic(0, 1) Simulate p 1,..., p n from the uniform distribution on (0, 1) and then do ɛ i = logit(p i ), i = 1,..., n. In the figure we plot the empirical distribution function of ɛ i for n =
21 Part II: Logistic Regression A regression function log odds of Y a regression function How to generate Y
22 Let β = (β 0, β 1, β 2,..., β p ) be (p + 1) 1 vector and X = (1, X 1, X 2,..., X p ) be a (p + 1) 1 -vector of (predictor) variables. We set, as in multiple regression, β T X = β 0 + β 1 X 1 + β 2 X β p X p Then G (X) = σ(β T X) = T X e = e β βt X 1 + e βt X
23 Logistic regression The predictor variables (X 1, X 2,..., X p ) can be binary, ordinal, categorical or continuous.
24 e βt X G (X) = σ(β T X) = 1 + e βt X By construction 0 < G (X) < 1. Then logit is well defined and logit(g (X)) = ln G (X) 1 G (X) = βt X = β 0 + β 1 X 1 + β 2 X β p X p.
25 Logistic regression Let now Y be a binary random variable such that { 1 with probability G (X) Y = 1 with probability 1 G (X) Definition If the logit of G (X) (or log odds of Y ) is logit(g (X)) = ln G (X) 1 G (X) = β 0 + β 1 X 1 + β 2 X β p X p, then we say that Y follows a logistic regression w.r.t. the predictor variables X = (1, X 1, X 2,..., X p ).
26 Logistic regression: genetic epidemiology Logistic regression { success with probability G (X) Y = failure with probability 1 G (X) logit(g (X)) = ln G (X) 1 G (X) = β 0 + β 1 X 1 + β 2 X β p X p. is extensively applied in biomedical research.
27 Logistic regression P(Y = 1 X) = G (X) = σ(β T X) = e βt X 1 + e βt X 1 P(Y = 0 X) = 1 G (X) = e = e β βt X 1 + e βt X = e βt X. T X
28 Logistic regression: genetic epidemiology Suppose we have two populations, where X i = x 1 in first population and X i = x 2 in the second population, all other predictors are equal in the two populations. Then a medical geneticist finds it useful to calculate the logarithm of the odds ratio ln ψ = ln p 1 1 p 1 ln = β i (x 1 x 2 ) p 2 1 p 2 or ψ = e β i (x 1 x 2 ) Hence a unit change in X i corresponds to e β i β i change in logodds. change in odds and
29 We need the following regression model Y = β T X + ɛ where β T X = β 0 + β 1 X 1 + β 2 X β p X p and ɛ Logistic(0, 1), i.e. the variable Y can be written directly in terms of the linear predictor function and an additive random error variable. The logistic distribution is the probability distribution the random error.
30 Generating model and/or how to simulate Take a continuous latent variable Y (latent= an unobserved random variable) that is given as follows: Y = β T X + ɛ where β T X = β 0 + β 1 X 1 + β 2 X β p X p and ɛ Logistic(0, 1).
31 Define the response Y as the indicator for whether the latent variable is positive: { 1 if Y > 0 i.e. ε < β T X, Y = 0 otherwise. Then Y follows a logistic regression w.r.t. X. We need only to verify that 1 P(Y = 1 X) = 1 + e. βt X
32 P(Y = 1 X) = P(Y > 0 X) (1) = P(β T X + ε > 0) (2) = P(ε > β T X) (3) = P( ε < β T X) (4) = P(ε < β T X) (5) = σ(β T X) (6) 1 = (7) 1 + e βt X where we used that the logistic distribution is symmetric (and continuous), as found in the exercise classes, or Pr ( ε x) = Pr (ε x).
33 Part III: Logistic Regression Special case: β T X = β 0 + β 1 x. Likelihood Maximum Likelihood logisticmle.m
34 We consider the model: β T X = β 0 + β 1 x. P(Y = 1 x) = σ(β T X) = e (β 0+β 1 x)
35 Notation: P(Y = y x) = σ(β T X) y (1 σ(β T X) 1 y = { σ(β T X) if y = 1 1 σ(β T X) if y = 0
36 Data (x i, y i ) n i=1, likelihood function with the notation above L (β 0, β 1 ) = P(Y = y 1 x 1 ) P(Y = y 2 x 2 ) P(Y = y n x n ) = σ(β T X 1 ) y 1 (1 σ(β T X 1 ) 1 y1 σ(β T X n ) y n (1 σ(β T X n ) 1 y n = A B A = σ(β T X 1 ) y1 σ(β T X 2 ) y2 σ(β T X n ) y n B = (1 σ(β T X 1 ) 1 y 1 (1 σ(β T X 2 ) 1 y2 (1 σ(β T X n ) 1 y n
37 = n i=1 + ln L (β 0, β 1 ) = ln A + ln B n i=1 = n y l ln σ(β T X i ) i=1 (1 y i ) ln(1 σ(β T X i )) ln(1 σ(β T X i )) + n i=1 σ(β T X i ) y l ln 1 σ(β T X i )
38 ( ln(1 σ(β T X i )) = ln 1 ( e (β 0+β 1 x) = ln 1 + e (β 0+β 1 x i ) ( ) 1 = ln 1 + e β 0+β 1 = x i ) = ln (e β 0+β 1 x i ) e (β 0+β 1 x i ) )
39 logodds σ(β T X i ) ln 1 σ(β T X i ) = β 0 + β 1 x i
40 In summary: ln L (β 0, β 1 ) = n i=1 y i (β 0 + β 1 x i ) n i=1 ) ln (e β 0+β 1 x i
41 = n i=1 β 1 ln L (β 0, β 1 ) y i x i n i=1 ) ln (e β 0+β 1 x i β 1 ) ln (e β 0+β 1 x i β 1 = e β 0+β 1 x i x i 1 + e β 0+β 1 x i = P(Y = 1 x i ) x i.
42 ln L (β 0, β 1 ) = 0 β 1 n (y i x i P(Y = 1 x i ) x i ) = 0. i=1 In the same manner we can also find n ln L (β 0, β 1 ) = β 0 (y i P(Y = 1 x i )) = 0 i=1 These two equations have no closed form solution w.r.t. β 0 and β 1.
43 Newton-Raphson for MLE: a one-parameter case For one parameter θ, set f (θ) = ln L(θ). We are searching for the solution of Newton-Raphson method f (θ) = d f (θ) = 0. dθ θ new = θ old + f ( θ old ) where a good initial value is desired. f (θ old ),
44 Newton-Raphson for logistic MLE ( β new 0 β new 1 ) = ( β old 0 β old 1 ) ( + H 1 (β old 0, β old β 1 ) 0 ln L ( β old 0, ) βold 1 β 1 ln L ( β old 0, ) βold 1 ). where H 1 (β old (next slide) 0, βold 1 ) is the matrix inverse of the 2 2 matrix
45 Newton-Raphson for logistic MLE H(β old 0, β old 1 ) = 2 β 2 0 ln L ( β old 0, ) βold 1 2 β 1 β 0 ln L ( β old 0, βold 1 ) 2 β 0 β 1 ln L ( β old 2 β 2 1 0, βold 1 ln L ( β old 0, βold 1 ) ) = n i=1 P (Y = 1 x i ) (1 P(Y = 1 x i ) n i=1 P (Y = 1 x i ) (1 P(Y = 1 x i ) right)cdotx i n i=1 P (Y = 1 x i ) (1 P (Y = 1 x i )) x i n i=1 P (Y = 1 x i ) (1 P(Y = 1 x i )) x 2 i
46 [bs,stderr,phat,deviance] = logisticmle(y,x) Input: y - responses, a binary vector, values 0 and 1 x - the covariate, as a vector Output: bs - estimators of beta0 and beta1 stderr - standard error of the estimate = square roots of the diagonal elements of H 1. phat - estimator of p= P(Y=1) deviance - deviance
47 Confidence Interval with degree of confidence 1 α β MLE,i ± λ α/2 stderr( β MLE,i )
Modern Methods of Statistical Learning sf2935 Lecture 5: Logistic Regression T.K
Lecture 5: Logistic Regression T.K. 10.11.2016 Overview of the Lecture Your Learning Outcomes Discriminative v.s. Generative Odds, Odds Ratio, Logit function, Logistic function Logistic regression definition
More informationLogistic Regression. Interpretation of linear regression. Other types of outcomes. 0-1 response variable: Wound infection. Usual linear regression
Logistic Regression Usual linear regression (repetition) y i = b 0 + b 1 x 1i + b 2 x 2i + e i, e i N(0,σ 2 ) or: y i N(b 0 + b 1 x 1i + b 2 x 2i,σ 2 ) Example (DGA, p. 336): E(PEmax) = 47.355 + 1.024
More informationLogistic Regression. Seungjin Choi
Logistic Regression Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More informationGeneralized Linear Models. Kurt Hornik
Generalized Linear Models Kurt Hornik Motivation Assuming normality, the linear model y = Xβ + e has y = β + ε, ε N(0, σ 2 ) such that y N(μ, σ 2 ), E(y ) = μ = β. Various generalizations, including general
More informationMachine Learning Linear Classification. Prof. Matteo Matteucci
Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)
More informationBinomial Model. Lecture 10: Introduction to Logistic Regression. Logistic Regression. Binomial Distribution. n independent trials
Lecture : Introduction to Logistic Regression Ani Manichaikul amanicha@jhsph.edu 2 May 27 Binomial Model n independent trials (e.g., coin tosses) p = probability of success on each trial (e.g., p =! =
More informationLecture 10: Introduction to Logistic Regression
Lecture 10: Introduction to Logistic Regression Ani Manichaikul amanicha@jhsph.edu 2 May 2007 Logistic Regression Regression for a response variable that follows a binomial distribution Recall the binomial
More informationData-analysis and Retrieval Ordinal Classification
Data-analysis and Retrieval Ordinal Classification Ad Feelders Universiteit Utrecht Data-analysis and Retrieval 1 / 30 Strongly disagree Ordinal Classification 1 2 3 4 5 0% (0) 10.5% (2) 21.1% (4) 42.1%
More informationLinear Regression Models P8111
Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started
More informationσ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) =
Until now we have always worked with likelihoods and prior distributions that were conjugate to each other, allowing the computation of the posterior distribution to be done in closed form. Unfortunately,
More informationDynamic Generalized Linear Models
Dynamic Generalized Linear Models Jesse Windle Oct. 24, 2012 Contents 1 Introduction 1 2 Binary Data (Static Case) 2 3 Data Augmentation (de-marginalization) by 4 examples 3 3.1 Example 1: CDF method.............................
More informationChapter 14 Logistic Regression, Poisson Regression, and Generalized Linear Models
Chapter 14 Logistic Regression, Poisson Regression, and Generalized Linear Models 許湘伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 29 14.1 Regression Models
More informationHypothesis Testing. Econ 690. Purdue University. Justin L. Tobias (Purdue) Testing 1 / 33
Hypothesis Testing Econ 690 Purdue University Justin L. Tobias (Purdue) Testing 1 / 33 Outline 1 Basic Testing Framework 2 Testing with HPD intervals 3 Example 4 Savage Dickey Density Ratio 5 Bartlett
More informationSTA 216, GLM, Lecture 16. October 29, 2007
STA 216, GLM, Lecture 16 October 29, 2007 Efficient Posterior Computation in Factor Models Underlying Normal Models Generalized Latent Trait Models Formulation Genetic Epidemiology Illustration Structural
More informationCOMPLEMENTARY LOG-LOG MODEL
COMPLEMENTARY LOG-LOG MODEL Under the assumption of binary response, there are two alternatives to logit model: probit model and complementary-log-log model. They all follow the same form π ( x) =Φ ( α
More informationCSCI-567: Machine Learning (Spring 2019)
CSCI-567: Machine Learning (Spring 2019) Prof. Victor Adamchik U of Southern California Mar. 19, 2019 March 19, 2019 1 / 43 Administration March 19, 2019 2 / 43 Administration TA3 is due this week March
More informationGeneralized linear models for binary data. A better graphical exploratory data analysis. The simple linear logistic regression model
Stat 3302 (Spring 2017) Peter F. Craigmile Simple linear logistic regression (part 1) [Dobson and Barnett, 2008, Sections 7.1 7.3] Generalized linear models for binary data Beetles dose-response example
More informationBMI 541/699 Lecture 22
BMI 541/699 Lecture 22 Where we are: 1. Introduction and Experimental Design 2. Exploratory Data Analysis 3. Probability 4. T-based methods for continous variables 5. Power and sample size for t-based
More informationStatistical Data Mining and Machine Learning Hilary Term 2016
Statistical Data Mining and Machine Learning Hilary Term 2016 Dino Sejdinovic Department of Statistics Oxford Slides and other materials available at: http://www.stats.ox.ac.uk/~sejdinov/sdmml Naïve Bayes
More informationLinear Models in Machine Learning
CS540 Intro to AI Linear Models in Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu We briefly go over two linear models frequently used in machine learning: linear regression for, well, regression,
More informationThe Normal Linear Regression Model with Natural Conjugate Prior. March 7, 2016
The Normal Linear Regression Model with Natural Conjugate Prior March 7, 2016 The Normal Linear Regression Model with Natural Conjugate Prior The plan Estimate simple regression model using Bayesian methods
More informationChapter 9 Regression with a Binary Dependent Variable. Multiple Choice. 1) The binary dependent variable model is an example of a
Chapter 9 Regression with a Binary Dependent Variable Multiple Choice ) The binary dependent variable model is an example of a a. regression model, which has as a regressor, among others, a binary variable.
More informationLogistic Regression Review Fall 2012 Recitation. September 25, 2012 TA: Selen Uguroglu
Logistic Regression Review 10-601 Fall 2012 Recitation September 25, 2012 TA: Selen Uguroglu!1 Outline Decision Theory Logistic regression Goal Loss function Inference Gradient Descent!2 Training Data
More informationSimple logistic regression
Simple logistic regression Biometry 755 Spring 2009 Simple logistic regression p. 1/47 Model assumptions 1. The observed data are independent realizations of a binary response variable Y that follows a
More informationLecture 4 Towards Deep Learning
Lecture 4 Towards Deep Learning (January 30, 2015) Mu Zhu University of Waterloo Deep Network Fields Institute, Toronto, Canada 2015 by Mu Zhu 2 Boltzmann Distribution probability distribution for a complex
More informationGeneralized Linear Models. Last time: Background & motivation for moving beyond linear
Generalized Linear Models Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data Today s class: 1. Examples of count and ordered
More informationStandard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j
Standard Errors & Confidence Intervals β β asy N(0, I( β) 1 ), where I( β) = [ 2 l(β, φ; y) ] β i β β= β j We can obtain asymptotic 100(1 α)% confidence intervals for β j using: β j ± Z 1 α/2 se( β j )
More informationNormal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification,
Likelihood Let P (D H) be the probability an experiment produces data D, given hypothesis H. Usually H is regarded as fixed and D variable. Before the experiment, the data D are unknown, and the probability
More informationGradient-Based Learning. Sargur N. Srihari
Gradient-Based Learning Sargur N. srihari@cedar.buffalo.edu 1 Topics Overview 1. Example: Learning XOR 2. Gradient-Based Learning 3. Hidden Units 4. Architecture Design 5. Backpropagation and Other Differentiation
More information> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE GRAVIS 2016 BASEL. Logistic Regression. Pattern Recognition 2016 Sandro Schönborn University of Basel
Logistic Regression Pattern Recognition 2016 Sandro Schönborn University of Basel Two Worlds: Probabilistic & Algorithmic We have seen two conceptual approaches to classification: data class density estimation
More information22s:152 Applied Linear Regression. Example: Study on lead levels in children. Ch. 14 (sec. 1) and Ch. 15 (sec. 1 & 4): Logistic Regression
22s:52 Applied Linear Regression Ch. 4 (sec. and Ch. 5 (sec. & 4: Logistic Regression Logistic Regression When the response variable is a binary variable, such as 0 or live or die fail or succeed then
More informationSTA6938-Logistic Regression Model
Dr. Ying Zhang STA6938-Logistic Regression Model Topic 2-Multiple Logistic Regression Model Outlines:. Model Fitting 2. Statistical Inference for Multiple Logistic Regression Model 3. Interpretation of
More informationThe material for categorical data follows Agresti closely.
Exam 2 is Wednesday March 8 4 sheets of notes The material for categorical data follows Agresti closely A categorical variable is one for which the measurement scale consists of a set of categories Categorical
More informationThe logistic regression model is thus a glm-model with canonical link function so that the log-odds equals the linear predictor, that is
Example The logistic regression model is thus a glm-model with canonical link function so that the log-odds equals the linear predictor, that is log p 1 p = β 0 + β 1 f 1 (y 1 ) +... + β d f d (y d ).
More informationModeling Binary Outcomes: Logit and Probit Models
Modeling Binary Outcomes: Logit and Probit Models Eric Zivot December 5, 2009 Motivating Example: Women s labor force participation y i = 1 if married woman is in labor force = 0 otherwise x i k 1 = observed
More informationSTAB57: Quiz-1 Tutorial 1 (Show your work clearly) 1. random variable X has a continuous distribution for which the p.d.f.
STAB57: Quiz-1 Tutorial 1 1. random variable X has a continuous distribution for which the p.d.f. is as follows: { kx 2.5 0 < x < 1 f(x) = 0 otherwise where k > 0 is a constant. (a) (4 points) Determine
More informationLatent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent
Latent Variable Models for Binary Data Suppose that for a given vector of explanatory variables x, the latent variable, U, has a continuous cumulative distribution function F (u; x) and that the binary
More informationECE521 week 3: 23/26 January 2017
ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear
More informationLogistic regression. 11 Nov Logistic regression (EPFL) Applied Statistics 11 Nov / 20
Logistic regression 11 Nov 2010 Logistic regression (EPFL) Applied Statistics 11 Nov 2010 1 / 20 Modeling overview Want to capture important features of the relationship between a (set of) variable(s)
More informationOutline. Supervised Learning. Hong Chang. Institute of Computing Technology, Chinese Academy of Sciences. Machine Learning Methods (Fall 2012)
Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Linear Models for Regression Linear Regression Probabilistic Interpretation
More informationLinear Regression (9/11/13)
STA561: Probabilistic machine learning Linear Regression (9/11/13) Lecturer: Barbara Engelhardt Scribes: Zachary Abzug, Mike Gloudemans, Zhuosheng Gu, Zhao Song 1 Why use linear regression? Figure 1: Scatter
More informationAssociation studies and regression
Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration
More information9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering
Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make
More informationLecture #11: Classification & Logistic Regression
Lecture #11: Classification & Logistic Regression CS 109A, STAT 121A, AC 209A: Data Science Weiwei Pan, Pavlos Protopapas, Kevin Rader Fall 2016 Harvard University 1 Announcements Midterm: will be graded
More informationEquivalence of random-effects and conditional likelihoods for matched case-control studies
Equivalence of random-effects and conditional likelihoods for matched case-control studies Ken Rice MRC Biostatistics Unit, Cambridge, UK January 8 th 4 Motivation Study of genetic c-erbb- exposure and
More informationLogistic regression model for survival time analysis using time-varying coefficients
Logistic regression model for survival time analysis using time-varying coefficients Accepted in American Journal of Mathematical and Management Sciences, 2016 Kenichi SATOH ksatoh@hiroshima-u.ac.jp Research
More informationCS340 Machine learning Gaussian classifiers
CS340 Machine learning Gaussian classifiers 1 Correlated features Height and weight are not independent 2 Multivariate Gaussian Multivariate Normal (MVN) N(x µ,σ) def 1 (2π) p/2 Σ 1/2exp[ 1 2 (x µ)t Σ
More informationClassification. Chapter Introduction. 6.2 The Bayes classifier
Chapter 6 Classification 6.1 Introduction Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode
More informationPubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH
PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH The First Step: SAMPLE SIZE DETERMINATION THE ULTIMATE GOAL The most important, ultimate step of any of clinical research is to do draw inferences;
More informationEPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7
Introduction to Generalized Univariate Models: Models for Binary Outcomes EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 EPSY 905: Intro to Generalized In This Lecture A short review
More informationModel Selection in GLMs. (should be able to implement frequentist GLM analyses!) Today: standard frequentist methods for model selection
Model Selection in GLMs Last class: estimability/identifiability, analysis of deviance, standard errors & confidence intervals (should be able to implement frequentist GLM analyses!) Today: standard frequentist
More informationLogistic Regression. Some slides from Craig Burkett. STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy
Logistic Regression Some slides from Craig Burkett STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy Titanic Survival Case Study The RMS Titanic A British passenger liner Collided
More informationNaive Bayes and Gaussian Bayes Classifier
Naive Bayes and Gaussian Bayes Classifier Ladislav Rampasek slides by Mengye Ren and others February 22, 2016 Naive Bayes and Gaussian Bayes Classifier February 22, 2016 1 / 21 Naive Bayes Bayes Rule:
More informationAnalysis of Time-to-Event Data: Chapter 4 - Parametric regression models
Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/25 Right censored
More informationA Reliable Constrained Method for Identity Link Poisson Regression
A Reliable Constrained Method for Identity Link Poisson Regression Ian Marschner Macquarie University, Sydney Australasian Region of the International Biometrics Society, Taupo, NZ, Dec 2009. 1 / 16 Identity
More informationLongitudinal Modeling with Logistic Regression
Newsom 1 Longitudinal Modeling with Logistic Regression Longitudinal designs involve repeated measurements of the same individuals over time There are two general classes of analyses that correspond to
More informationAn ordinal number is used to represent a magnitude, such that we can compare ordinal numbers and order them by the quantity they represent.
Statistical Methods in Business Lecture 6. Binomial Logistic Regression An ordinal number is used to represent a magnitude, such that we can compare ordinal numbers and order them by the quantity they
More information2/26/2017. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2
PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 When and why do we use logistic regression? Binary Multinomial Theory behind logistic regression Assessing the model Assessing predictors
More informationIntroduction To Logistic Regression
Introduction To Lecture 22 April 28, 2005 Applied Regression Analysis Lecture #22-4/28/2005 Slide 1 of 28 Today s Lecture Logistic regression. Today s Lecture Lecture #22-4/28/2005 Slide 2 of 28 Background
More informationLOGISTIC REGRESSION Joseph M. Hilbe
LOGISTIC REGRESSION Joseph M. Hilbe Arizona State University Logistic regression is the most common method used to model binary response data. When the response is binary, it typically takes the form of
More informationMachine Learning. Bayesian Regression & Classification. Marc Toussaint U Stuttgart
Machine Learning Bayesian Regression & Classification learning as inference, Bayesian Kernel Ridge regression & Gaussian Processes, Bayesian Kernel Logistic Regression & GP classification, Bayesian Neural
More informationChapter 10 Logistic Regression
Chapter 10 Logistic Regression Data Mining for Business Intelligence Shmueli, Patel & Bruce Galit Shmueli and Peter Bruce 2010 Logistic Regression Extends idea of linear regression to situation where outcome
More informationGeneralized Linear Models 1
Generalized Linear Models 1 STA 2101/442: Fall 2012 1 See last slide for copyright information. 1 / 24 Suggested Reading: Davison s Statistical models Exponential families of distributions Sec. 5.2 Chapter
More informationGoodness-of-Fit Tests for the Ordinal Response Models with Misspecified Links
Communications of the Korean Statistical Society 2009, Vol 16, No 4, 697 705 Goodness-of-Fit Tests for the Ordinal Response Models with Misspecified Links Kwang Mo Jeong a, Hyun Yung Lee 1, a a Department
More informationDensity Estimation. Seungjin Choi
Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More informationProbability and Estimation. Alan Moses
Probability and Estimation Alan Moses Random variables and probability A random variable is like a variable in algebra (e.g., y=e x ), but where at least part of the variability is taken to be stochastic.
More information7. Estimation and hypothesis testing. Objective. Recommended reading
7. Estimation and hypothesis testing Objective In this chapter, we show how the election of estimators can be represented as a decision problem. Secondly, we consider the problem of hypothesis testing
More informationSTA 216: GENERALIZED LINEAR MODELS. Lecture 1. Review and Introduction. Much of statistics is based on the assumption that random
STA 216: GENERALIZED LINEAR MODELS Lecture 1. Review and Introduction Much of statistics is based on the assumption that random variables are continuous & normally distributed. Normal linear regression
More informationAnalysing categorical data using logit models
Analysing categorical data using logit models Graeme Hutcheson, University of Manchester The lecture notes, exercises and data sets associated with this course are available for download from: www.research-training.net/manchester
More informationBayesian Interpretations of Regularization
Bayesian Interpretations of Regularization Charlie Frogner 9.50 Class 15 April 1, 009 The Plan Regularized least squares maps {(x i, y i )} n i=1 to a function that minimizes the regularized loss: f S
More informationBANA 7046 Data Mining I Lecture 4. Logistic Regression and Classications 1
BANA 7046 Data Mining I Lecture 4. Logistic Regression and Classications 1 Shaobo Li University of Cincinnati 1 Partially based on Hastie, et al. (2009) ESL, and James, et al. (2013) ISLR Data Mining I
More informationLecture 5: LDA and Logistic Regression
Lecture 5: and Logistic Regression Hao Helen Zhang Hao Helen Zhang Lecture 5: and Logistic Regression 1 / 39 Outline Linear Classification Methods Two Popular Linear Models for Classification Linear Discriminant
More informationGeneralized Linear Modeling - Logistic Regression
1 Generalized Linear Modeling - Logistic Regression Binary outcomes The logit and inverse logit interpreting coefficients and odds ratios Maximum likelihood estimation Problem of separation Evaluating
More informationBayesian methods for missing data: part 1. Key Concepts. Nicky Best and Alexina Mason. Imperial College London
Bayesian methods for missing data: part 1 Key Concepts Nicky Best and Alexina Mason Imperial College London BAYES 2013, May 21-23, Erasmus University Rotterdam Missing Data: Part 1 BAYES2013 1 / 68 Outline
More informationLinear Classification: Probabilistic Generative Models
Linear Classification: Probabilistic Generative Models Sargur N. University at Buffalo, State University of New York USA 1 Linear Classification using Probabilistic Generative Models Topics 1. Overview
More informationLinear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52
Statistics for Applications Chapter 10: Generalized Linear Models (GLMs) 1/52 Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52 Components of a linear model The two
More informationNATIONAL UNIVERSITY OF SINGAPORE EXAMINATION. ST3241 Categorical Data Analysis. (Semester II: ) April/May, 2011 Time Allowed : 2 Hours
NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION Categorical Data Analysis (Semester II: 2010 2011) April/May, 2011 Time Allowed : 2 Hours Matriculation No: Seat No: Grade Table Question 1 2 3 4 5 6 Full marks
More informationLinear Models for Classification
Catherine Lee Anderson figures courtesy of Christopher M. Bishop Department of Computer Science University of Nebraska at Lincoln CSCE 970: Pattern Recognition and Machine Learning Congradulations!!!!
More informationCorrelation and regression
1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,
More informationMarginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal
Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal Overview In observational and experimental studies, the goal may be to estimate the effect
More informationTime Series and Dynamic Models
Time Series and Dynamic Models Section 1 Intro to Bayesian Inference Carlos M. Carvalho The University of Texas at Austin 1 Outline 1 1. Foundations of Bayesian Statistics 2. Bayesian Estimation 3. The
More informationChapter 5: Logistic Regression-I
: Logistic Regression-I Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM [Acknowledgements to Tim Hanson and Haitao Chu] D. Bandyopadhyay
More informationLogistic Regression. Sargur N. Srihari. University at Buffalo, State University of New York USA
Logistic Regression Sargur N. University at Buffalo, State University of New York USA Topics in Linear Classification using Probabilistic Discriminative Models Generative vs Discriminative 1. Fixed basis
More informationWhen is MLE appropriate
When is MLE appropriate As a rule of thumb the following to assumptions need to be fulfilled to make MLE the appropriate method for estimation: The model is adequate. That is, we trust that one of the
More informationProbabilistic modeling. The slides are closely adapted from Subhransu Maji s slides
Probabilistic modeling The slides are closely adapted from Subhransu Maji s slides Overview So far the models and algorithms you have learned about are relatively disconnected Probabilistic modeling framework
More informationLecture 12: Effect modification, and confounding in logistic regression
Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul amanicha@jhsph.edu 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression
More informationAnalysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013
Analysis of Categorical Data Nick Jackson University of Southern California Department of Psychology 10/11/2013 1 Overview Data Types Contingency Tables Logit Models Binomial Ordinal Nominal 2 Things not
More informationBayesian RL Seminar. Chris Mansley September 9, 2008
Bayesian RL Seminar Chris Mansley September 9, 2008 Bayes Basic Probability One of the basic principles of probability theory, the chain rule, will allow us to derive most of the background material in
More informationFor more information about how to cite these materials visit
Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/
More informationBayesian inference. Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark. April 10, 2017
Bayesian inference Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark April 10, 2017 1 / 22 Outline for today A genetic example Bayes theorem Examples Priors Posterior summaries
More informationSTA216: Generalized Linear Models. Lecture 1. Review and Introduction
STA216: Generalized Linear Models Lecture 1. Review and Introduction Let y 1,..., y n denote n independent observations on a response Treat y i as a realization of a random variable Y i In the general
More informationLINEAR MODELS FOR CLASSIFICATION. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception
LINEAR MODELS FOR CLASSIFICATION Classification: Problem Statement 2 In regression, we are modeling the relationship between a continuous input variable x and a continuous target variable t. In classification,
More informationReview of Multinomial Distribution If n trials are performed: in each trial there are J > 2 possible outcomes (categories) Multicategory Logit Models
Chapter 6 Multicategory Logit Models Response Y has J > 2 categories. Extensions of logistic regression for nominal and ordinal Y assume a multinomial distribution for Y. 6.1 Logit Models for Nominal Responses
More informationBinary Logistic Regression
The coefficients of the multiple regression model are estimated using sample data with k independent variables Estimated (or predicted) value of Y Estimated intercept Estimated slope coefficients Ŷ = b
More informationOutline of GLMs. Definitions
Outline of GLMs Definitions This is a short outline of GLM details, adapted from the book Nonparametric Regression and Generalized Linear Models, by Green and Silverman. The responses Y i have density
More informationNATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: )
NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) Categorical Data Analysis (Semester II: 2010 2011) April/May, 2011 Time Allowed : 2 Hours Matriculation No: Seat No: Grade Table Question 1 2 3
More informationBayes methods for categorical data. April 25, 2017
Bayes methods for categorical data April 25, 2017 Motivation for joint probability models Increasing interest in high-dimensional data in broad applications Focus may be on prediction, variable selection,
More informationECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam
ECLT 5810 Linear Regression and Logistic Regression for Classification Prof. Wai Lam Linear Regression Models Least Squares Input vectors is an attribute / feature / predictor (independent variable) The
More informationLinear and logistic regression
Linear and logistic regression Guillaume Obozinski Ecole des Ponts - ParisTech Master MVA Linear and logistic regression 1/22 Outline 1 Linear regression 2 Logistic regression 3 Fisher discriminant analysis
More information9 Generalized Linear Models
9 Generalized Linear Models The Generalized Linear Model (GLM) is a model which has been built to include a wide range of different models you already know, e.g. ANOVA and multiple linear regression models
More information