Non-Linear & Logistic Regression

"If the statistics are boring, then you've got the wrong numbers." Edward R. Tufte (Statistics Professor, Yale University)

Regression Analyses
When do we use these?
PART 1: find a relationship between a response variable (Y) and a predictor variable (X) (e.g. Y~X)
PART 2: use that relationship to predict Y from X
- Simple linear regression: $y = b + m x$, or equivalently $y = \beta_0 + \beta_1 x_1$
- Multiple linear regression: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$
- Non-linear regression: when a line just doesn't fit our data
- Logistic regression: when our data is binary (data is represented as 0 or 1)
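The two parts can be sketched in R for the simple linear case. This is a minimal illustration on simulated data; the variable names and the true intercept (3) and slope (2) are assumptions made for the example, not values from the slides.

```r
# Simulated data (assumption): true intercept 3, true slope 2, plus noise
set.seed(1)
x <- 1:20
y <- 3 + 2 * x + rnorm(20, sd = 1)

# PART 1: find the relationship between response and predictor
fit <- lm(y ~ x)
coef(fit)  # estimated beta_0 (intercept) and beta_1 (slope)

# PART 2: use the relationship to predict Y for new values of X
new_y <- predict(fit, newdata = data.frame(x = c(25, 30)))
new_y
```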

Non-linear Regression
- Curvilinear relationship between response and predictor variables
- The right type of non-linear model is usually conceptually determined based on biological considerations
- For a starting point we can plot the relationship between the 2 variables and visually check which model might be a good option
- There are obviously MANY curves you can generate to try and fit your data

Exponential Curve
Non-linear regression option #1
Exponential: $y = a + b c^x$
Rapid increasing/decreasing change in Y or X for a change in the other
Ex: bacteria growth/decay, human population growth, infection rates (humans, trees, etc.)
[Figure: four panels of response (y): +b with 0 < c < 1, +b with c > 1, -b with 0 < c < 1, -b with c > 1]

Logarithmic Curve
Non-linear regression option #2
Logarithmic: $y = a + b x^c$
Rapid increasing/decreasing change in Y or X for a change in the other
Ex: survival thresholds, resource optimization
[Figure: four panels of response (y), one for each combination of +b or -b with +c or -c]

Hyperbolic Curve
Non-linear regression option #3
Hyperbolic: $y = a + \frac{b}{x + c}$
Rapid increasing/decreasing change in Y or X for a change in the other
Ex: survival as a function of population
Similar to the exponential and logarithmic curves but now we have 2 asymptotes
[Figure: two panels of response (y), +b and -b, with a and c marked]

Parabolic Curve
Non-linear regression option #4
Parabolic: $y = a + b (x - c)^2$
Rapid increasing/decreasing change in Y or X for a change in the other, followed by the reverse trend
Ex: survival as a function of an environmental variable
[Figure: two panels of response (y), upward parabolic (+b) and downward parabolic (-b), with a and c marked]

Gaussian Curve
Non-linear regression option #5
Gaussian: $y = a \, b^{(x - c)^2}$, where 0 < b < 1
Resembles a normal distribution
Ex: survival as a function of an environmental variable
[Figure: one panel of response (y), with a, b and c marked]

Sigmoidal Curve
Non-linear regression option #6
Sigmoidal: $y = \frac{a}{1 + b^{-(x - c)}} + d$, where b > 1 and c > 1
Stability in Y followed by a rapid increase, then stability again
Ex: restricted growth, learning response, a threshold has to occur for a response effect
[Figure: one panel of response (y), with a, b, c and d marked]

Michaelis-Menten Curve
Non-linear regression option #7
Michaelis-Menten: $y = \frac{a x}{b + x}$
Rapid increasing/decreasing change in Y or X for a change in the other
Ex: biological process as a function of resource availability
Similar to the exponential and logarithmic curves but now we have 2 parameters; this model comes from kinetics/physiology
[Figure: two panels of response (y), with a, a/2, -b and b marked]
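The seven curve options can be written as plain R functions, which is handy for plotting candidate shapes before fitting. The extracted slides lost parts of the formulas, so the forms below are a best reading and should be treated as assumptions, in particular the Gaussian and sigmoidal forms; the function names are made up for this sketch.

```r
# Candidate curves, with a, b, c, d as the model parameters.
# The exact forms are assumptions reconstructed from the slide text.
exponential      <- function(x, a, b, c)    a + b * c^x
logarithmic      <- function(x, a, b, c)    a + b * x^c
hyperbolic       <- function(x, a, b, c)    a + b / (x + c)
parabolic        <- function(x, a, b, c)    a + b * (x - c)^2
gaussian         <- function(x, a, b, c)    a * b^((x - c)^2)          # needs 0 < b < 1
sigmoidal        <- function(x, a, b, c, d) a / (1 + b^(-(x - c))) + d  # needs b > 1
michaelis_menten <- function(x, a, b)       a * x / (b + x)

# Evaluate one curve on a grid to inspect its shape
x_grid <- seq(0, 10, by = 0.5)
head(michaelis_menten(x_grid, a = 10, b = 3))
```

Plotting a few of these with hand-picked parameter values next to the data is one way to choose starting values for the fit.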

Non-Linear Regression: Curve Fitting
Procedure:
1. Plot your variables to visualize the relationship
   a. What curve does the pattern resemble?
   b. What might alternative options be?
2. Decide on the curves you want to compare and run a non-linear regression curve fitting
   a. You will have to estimate your parameters from your curve to have starting values for your curve fitting function
3. Once you have parameters for your curves, compare models with AIC
4. Plot the model with the lowest AIC on your point data to visualize the fit
Non-linear regression curve fitting in R:
    install.packages("minpack.lm")
    library(minpack.lm)  # load the package that provides nlsLM()
    nlsLM(responseY~MODEL, start=list(starting values for model parameters))
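Step 2 of the procedure can be sketched as follows. To keep the example free of extra packages it uses base R's `nls()` as a stand-in for `nlsLM()`; both take a model formula and a `start` list. The data are simulated from an exponential curve, and the true values (a = 2, b = 1.5, c = 1.3) and the starting values are assumptions chosen for the illustration.

```r
# Simulated response following y = a + b * c^x with noise (assumed values)
set.seed(42)
x <- seq(0, 10, by = 0.5)
y <- 2 + 1.5 * 1.3^x + rnorm(length(x), sd = 0.5)

# Starting values are rough guesses read off a plot of y against x
fit <- nls(y ~ a + b * c^x, start = list(a = 1, b = 2, c = 1.25))

coef(fit)      # estimates of the model parameters
deviance(fit)  # residual sum-of-squares of the fitted curve
```

If the starting values are far from the truth the fit may fail to converge, which is why the procedure asks for estimates taken from the plotted curve.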

Non-Linear Regression: Output from R
[Figure: annotated R output from a non-linear fit, with these labels]
- Non-linear model that we fit: simplified logarithmic with slope=0
- Estimates of model parameters
- Residual sum-of-squares for your non-linear model
- Number of iterations needed to estimate the parameters

Akaike's Information Criterion (AIC)
How do we decide which model is best?
In the 1970s Hirotugu Akaike (1927-2009) used information theory to build a numerical equivalent of Occam's razor
Occam's razor: all else being equal, the simplest explanation is the best one
- For model selection, this means the simplest model is preferred to a more complex one
- Of course, this needs to be weighed against the ability of the model to actually predict anything
AIC considers both the fit of the model and the model complexity
- Complexity is measured as the number of parameters or the use of higher order polynomials
- Allows us to balance over- and under-fitting in our modelled relationships
- We want a model that is as simple as possible, but no simpler
- A reasonable amount of explanatory power is traded off against model complexity
- AIC measures the balance of this for us

Akaike's Information Criterion (AIC): AIC in R
AIC is useful because it can be calculated for any kind of model fitted by likelihood, allowing comparisons across different modelling approaches and model fitting techniques, as long as the models are fitted to the same data
The model with the lowest AIC value is the model that best balances fit to your data (e.g. small model residuals) against complexity
Output from R is a single AIC value
Akaike's Information Criterion in R to determine the best model:
    AIC(nlsLM(responseY~MODEL1, start=list(starting values)))
    AIC(nlsLM(responseY~MODEL2, start=list(starting values)))
    AIC(nlsLM(responseY~MODEL3, start=list(starting values)))
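A small sketch of the comparison, on simulated data that follow a Michaelis-Menten curve (an assumption for the example). It compares a non-linear fit against a straight line and also recomputes AIC by hand as $2k - 2\ln L$, where k counts the curve parameters plus the residual variance. Base R's `nls()` is used here in place of `nlsLM()`.

```r
# Simulated data (assumption): Michaelis-Menten with a = 5, b = 2
set.seed(7)
x <- seq(1, 10, length.out = 40)
y <- 5 * x / (2 + x) + rnorm(length(x), sd = 0.2)

mm   <- nls(y ~ a * x / (b + x), start = list(a = 4, b = 1))  # curved model
line <- lm(y ~ x)                                             # straight line

# One AIC value per model; the lowest value is preferred
aic <- c(michaelis_menten = AIC(mm), straight_line = AIC(line))
aic

# AIC by hand for the non-linear fit: 2k - 2 * log-likelihood
k <- length(coef(mm)) + 1  # curve parameters + residual variance
manual_aic <- 2 * k - 2 * as.numeric(logLik(mm))
```

Both models here have the same number of parameters, so the difference in AIC comes entirely from how well each shape fits.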

Non-Linear Regression: Curve Fitting
Use the parameter estimates output from nlsLM() to generate a curve for plotting
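One way to generate that curve is to evaluate the fitted equation on a fine grid of predictor values. The sketch below assumes simulated Michaelis-Menten data and uses base R's `nls()` as a stand-in for `nlsLM()`; the plotting calls only run in an interactive session.

```r
# Simulated data (assumption) and a non-linear fit
set.seed(3)
x <- seq(1, 10, length.out = 30)
y <- 5 * x / (2 + x) + rnorm(30, sd = 0.2)
fit <- nls(y ~ a * x / (b + x), start = list(a = 4, b = 1))

# Plug the parameter estimates into the model equation on a fine grid
est     <- coef(fit)
x_grid  <- seq(min(x), max(x), length.out = 200)
y_curve <- est[["a"]] * x_grid / (est[["b"]] + x_grid)

# Overlay the fitted curve on the point data
if (interactive()) {
  plot(x, y, xlab = "predictor (x)", ylab = "response (y)")
  lines(x_grid, y_curve)
}
```

Calling `predict(fit, newdata = data.frame(x = x_grid))` gives the same curve without retyping the equation.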

Non-Linear Regression: Assumptions
- NLR makes no assumptions for normality, equal variances, or outliers
- However the assumptions of independence (spatial & temporal) and design considerations (randomization, sufficient replicates, no pseudoreplication) still apply
- We don't have to worry about statistical power here because we are fitting relationships
- All we care about is if or how well we can model the relationship between our response and predictor variables

Non-Linear Regression: $R^2$ for goodness of fit
Calculating an $R^2$ is NOT APPROPRIATE for non-linear regression
Why?
- For linear models, the sums of the squared errors always add up in a specific manner: $SS_{\text{Regression}} + SS_{\text{Error}} = SS_{\text{Total}}$
- Therefore $R^2 = SS_{\text{Regression}} / SS_{\text{Total}}$, which mathematically must produce a value between 0 and 100%
- But in nonlinear regression $SS_{\text{Regression}} + SS_{\text{Error}} \neq SS_{\text{Total}}$
- Therefore the ratio used to construct $R^2$ is biased in nonlinear regression
Best to use the AIC value and the measurement of the residual sum-of-squares to pick the best model, then plot the curve to visualize the fit
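The sum-of-squares argument can be checked numerically. This sketch computes the three sums of squares for a straight-line fit and for a non-linear fit on the same simulated data (the data and the helper `sum_of_squares()` are assumptions for the illustration), then looks at how far regression plus error is from the total.

```r
# Simulated data (assumption): Michaelis-Menten shape with noise
set.seed(11)
x <- seq(1, 10, length.out = 30)
y <- 5 * x / (2 + x) + rnorm(30, sd = 0.3)

# Hypothetical helper: the three sums of squares for any fitted model
sum_of_squares <- function(fit) {
  fitted_y <- as.numeric(fitted(fit))
  c(regression = sum((fitted_y - mean(y))^2),
    error      = sum((y - fitted_y)^2),
    total      = sum((y - mean(y))^2))
}

lin    <- sum_of_squares(lm(y ~ x))
nonlin <- sum_of_squares(nls(y ~ a * x / (b + x), start = list(a = 4, b = 1)))

# Gap between (regression + error) and total
lin_gap    <- lin[["regression"]] + lin[["error"]] - lin[["total"]]          # zero up to rounding
nonlin_gap <- nonlin[["regression"]] + nonlin[["error"]] - nonlin[["total"]]  # generally not zero
c(linear = lin_gap, nonlinear = nonlin_gap)
```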

Logistic Regression (a.k.a. logit regression)
Relationship between a binary response variable and predictor variables
Logistic Model: $y = \frac{e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n}}{1 + e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n}}$
Logit Model: the linear part in the exponent, $\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$
- Binary response variable can be considered a class (1 or 0): Yes or No, Present or Absent
- The linear part of the logistic regression equation is used to find the probability of being in a category based on the combination of predictors
- Predictor variables are usually (but not necessarily) continuous
- But it is harder to make inferences from regression outputs that use discrete or categorical variables
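To see how the linear part turns into a probability, the logistic model can be evaluated directly. The coefficients below are hypothetical values picked for the illustration; `plogis()` is base R's built-in version of the same transformation.

```r
# Logistic model: probability from the linear predictor eta
logistic <- function(eta) exp(eta) / (1 + exp(eta))

# Hypothetical coefficients for a single predictor x1
beta0 <- -3
beta1 <- 0.8
x1    <- c(0, 2, 3.75, 6, 10)

eta <- beta0 + beta1 * x1  # the linear (logit) part
p   <- logistic(eta)       # probability of y = 1, always between 0 and 1
round(p, 3)

# Base R's plogis() computes the same quantity
all.equal(p, plogis(eta))
```

A linear predictor of 0 corresponds to a probability of 0.5, and large positive or negative values push the probability towards 1 or 0.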

Binomial distribution vs Normal distribution
Key difference: values are continuous (Normal) vs discrete (Binomial)
- As sample size increases the binomial distribution appears to resemble the normal distribution
- The binomial distribution is a family of distributions because the shape references both the number of observations and the probability of getting a success (a value of 1)
- What is the probability of x successes in n independent and identically distributed Bernoulli trials?
- Bernoulli trial (or binomial trial): a random experiment with exactly two possible outcomes, "success" and "failure", in which the probability of success is the same every time the experiment is conducted
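The probability of x successes in n Bernoulli trials is $\binom{n}{x} p^x (1 - p)^{n - x}$, which R provides as `dbinom()`. The sketch below uses illustrative values of n and p (assumptions) and also compares the binomial with a normal density of the same mean and standard deviation for a large n.

```r
# Probability of exactly 3 successes in 10 trials with success probability 0.3
n <- 10
p <- 0.3
from_dbinom  <- dbinom(3, size = n, prob = p)
from_formula <- choose(n, 3) * p^3 * (1 - p)^(n - 3)
c(from_dbinom, from_formula)

# The probabilities over all possible counts sum to 1
total_prob <- sum(dbinom(0:n, size = n, prob = p))

# With many trials the binomial is close to a normal distribution
big_n        <- 1000
binom_at_300 <- dbinom(300, size = big_n, prob = p)
norm_at_300  <- dnorm(300, mean = big_n * p, sd = sqrt(big_n * p * (1 - p)))
c(binom_at_300, norm_at_300)
```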

Logistic Regression vs Linear Regression
Linear regression
- references the Gaussian (normal) distribution
- uses ordinary least squares to find a best fitting line that estimates parameters that predict the change in the dependent variable for a change in the independent variable
Logistic regression
- references the Binomial distribution
- estimates the probability (p) of an event occurring (y=1) rather than not occurring (y=0) from knowledge of relevant independent variables (our data)
- regression coefficients are estimated using maximum likelihood estimation (iterative process)

Maximum likelihood estimation
How coefficients are estimated for logistic regression
Complex iterative process to find the coefficient values that maximize the likelihood function
Likelihood function: the probability of the occurrence of an observed set of values X and Y given a function with defined parameters
Process:
1. Begins with a tentative solution for each coefficient
2. Revises it slightly to see if the likelihood function can be improved
3. Repeats this revision until improvement is minute, at which point the process is said to have converged
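The three steps can be sketched with Newton-Raphson updates on the log-likelihood. This is one common iterative scheme, chosen here for illustration; the slides do not name a specific algorithm. The data are simulated with assumed true coefficients (-0.5 and 1.2), and the result is compared with R's `glm()`.

```r
# Simulated binary response (assumed true coefficients: -0.5 and 1.2)
set.seed(5)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1.2 * x))
X <- cbind(1, x)  # design matrix: intercept column plus predictor

# Log-likelihood of the logistic model for a given coefficient vector
loglik <- function(beta) {
  eta <- as.vector(X %*% beta)
  sum(y * eta - log(1 + exp(eta)))
}

beta <- c(0, 0)  # 1. tentative solution for each coefficient
for (i in 1:25) {
  p        <- as.vector(plogis(X %*% beta))
  gradient <- t(X) %*% (y - p)
  hessian  <- -t(X) %*% (X * (p * (1 - p)))
  new_beta <- as.vector(beta - solve(hessian, gradient))  # 2. revise the solution
  improvement <- loglik(new_beta) - loglik(beta)
  beta <- new_beta
  if (abs(improvement) < 1e-10) break  # 3. stop once the improvement is minute
}

glm_beta <- unname(coef(glm(y ~ x, family = "binomial")))
rbind(newton = beta, glm = glm_beta)
```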

Logistic Regression in R
Simple Logistic Regression in R:
    glm(response~predictor, family="binomial")
    summary(glm(response~predictor, family="binomial"))
Multiple Logistic Regression in R:
    glm(response~predictor1+predictor2+...+predictorN, family="binomial")
    summary(glm(response~predictor1+predictor2+...+predictorN, family="binomial"))
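A runnable version of these calls, on simulated data. The variable names mirror the placeholders above; the true coefficients used to simulate the response are assumptions for the example.

```r
# Simulated binary response driven by two predictors (assumed coefficients)
set.seed(9)
n <- 300
predictor1 <- rnorm(n)
predictor2 <- rnorm(n)
response   <- rbinom(n, 1, plogis(0.5 + 1.5 * predictor1 - 1 * predictor2))

simple   <- glm(response ~ predictor1, family = "binomial")
multiple <- glm(response ~ predictor1 + predictor2, family = "binomial")

# Coefficient table: estimate, standard error, z value and p-value per term
coef(summary(multiple))
AIC(simple, multiple)  # AIC for each fitted model

# Predicted probability of y = 1 for a new observation
p_new <- predict(multiple,
                 newdata = data.frame(predictor1 = 1, predictor2 = 0),
                 type = "response")
p_new
```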

Logistic Regression (a.k.a. logit regression): Output from R
[Figure: annotated R output from a logistic fit, with these labels]
- Estimate of model parameters (intercept and slope)
- Standard error of estimates
- AIC value for the model
- Tests the null hypothesis that the coefficient is equal to zero (no effect)
A predictor that has a low p-value is likely to be a meaningful addition to your model because changes in the predictor's value are related to changes in the response variable
A large p-value suggests that changes in the predictor are not associated with changes in the response

Logistic Regression (a.k.a. logit regression): Pseudo $R^2$ for goodness of fit
In linear regression, the relationship between the dependent and the independent variables is linear
However this assumption is not made in logistic regression, so we cannot use the calculation $R^2 = SS_{\text{Regression}} / SS_{\text{Total}}$
- REMEMBER we are not using sum-of-squares to estimate our parameters, we are using maximum likelihood estimation
We can however calculate a pseudo $R^2$
- Lots of options on how to do this, but the best for logistic regression appears to be McFadden's calculation
$R^2 = 1 - \frac{\ln L(M_{\text{full}})}{\ln L(M_{\text{intercept}})}$, where L = estimated likelihood
Estimating McFadden's pseudo $R^2$ in R:
    mod=glm(response~predictor,family="binomial")
    mcf.r2=1-mod$deviance/mod$null.deviance
NOTE: Pseudo $R^2$ will be MUCH lower than $R^2$ values!
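McFadden's formula can be checked against the deviance shortcut. For a 0/1 response the deviance of a binomial model equals $-2 \ln L$, so both routes give the same number. The data below are simulated with assumed coefficients.

```r
# Simulated binary response (assumed coefficients)
set.seed(21)
n <- 250
predictor <- rnorm(n)
response  <- rbinom(n, 1, plogis(-0.2 + 1 * predictor))

mod      <- glm(response ~ predictor, family = "binomial")  # full model
null_mod <- glm(response ~ 1, family = "binomial")          # intercept-only model

# McFadden's pseudo R^2 from the two log-likelihoods
mcf_loglik <- 1 - as.numeric(logLik(mod)) / as.numeric(logLik(null_mod))

# Same quantity from the deviances stored in the fitted model
mcf_deviance <- 1 - mod$deviance / mod$null.deviance

c(from_loglik = mcf_loglik, from_deviance = mcf_deviance)
```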

Logistic Regression (a.k.a. logit regression): Assumptions
- Logistic regression makes no assumptions for normality, equal variances, or outliers
- However the assumptions of independence (spatial & temporal) and design considerations (randomization, sufficient replicates, no pseudoreplication) still apply
- Logistic regression assumes the response variable is binary (0 & 1)
- We don't have to worry about statistical power here because we are fitting relationships
- All we care about is if or how well we can model the relationship between our response and predictor variables

Important to Remember
A non-linear or logistic relationship DOES NOT imply causation!
AIC or pseudo $R^2$ implies a relationship rather than one or multiple factors causing another factor value
Be careful of your interpretations!