Applied Statistics II - Categorical Data Analysis Data analysis using Genstat - Exercise 2 Logistic regression

Analysis 2.1. Logistic regression for a 2 x k table.

The table below shows the number of aphids alive and dead after spraying with four concentrations of solutions of sodium oleate. The questions are:

(a) Are concentration and mortality independent?
(b) Is there a relationship between concentration and mortality?
(c) How well does the model in (b) fit?

    Concentration of sodium oleate (%)   0.65    1.1    1.6    2.1
    Dead                                   55     62    100     72
    Alive                                  22     13     12      5

Read the data into Genstat as follows:

    units [4]
    read conc_lev,conc,dead,alive
    1 0.65  55 22
    2 1.1   62 13
    3 1.6  100 12
    4 2.1   72  5 :
    groups [redefine=yes] conc_lev
    calc n=dead+alive
    calc prop=dead/n

For a logistic regression model of the number dead, specify
- that the distribution is the binomial
- that the logit link is being used
- that the total count is dead+alive=n

    model [distribution=binomial;link=logit] dead;nbinomial=n

Fitting the Independence model.

To answer (a), first fit the independence model using the statement below, which fits just a single constant:

    fit

This gives the output

                  d.f.   deviance   mean deviance   deviance ratio
    Regression       0       0.00               *
    Residual         3      16.63           5.544
    Total            3      16.63           5.544
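As a cross-check outside Genstat, the residual deviance of the independence fit can be reproduced by hand as the likelihood-ratio statistic G^2 = 2 * sum(O * ln(O / E)) for the 2 x 4 table. The sketch below is an illustration only: Python is not part of the exercise, and the helper names `g2_independence` and `chi2_sf_3df` are hypothetical. The tail probability uses the closed form for a chi-square variable with 3 d.f.

```python
import math

# Counts from the aphid table at concentrations 0.65, 1.1, 1.6 and 2.1
dead = [55, 62, 100, 72]
alive = [22, 13, 12, 5]


def g2_independence(dead, alive):
    """Likelihood-ratio statistic G^2 = 2 * sum(O * ln(O / E)) for a 2 x k table."""
    totals = [d + a for d, a in zip(dead, alive)]
    p_pooled = sum(dead) / sum(totals)  # common death probability under independence
    g2 = 0.0
    for d, a, n in zip(dead, alive, totals):
        for observed, expected in ((d, n * p_pooled), (a, n * (1 - p_pooled))):
            if observed > 0:  # an empty cell contributes nothing to the sum
                g2 += 2 * observed * math.log(observed / expected)
    return g2


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 d.f. (closed form)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


g2 = g2_independence(dead, alive)
p_value = chi2_sf_3df(g2)
print(f"G2 = {g2:.2f} on {len(dead) - 1} d.f., P = {p_value:.5f}")
```

The statistic should agree with the Residual deviance line of the Genstat output to the precision shown there.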

In the independence fit, the residual deviance 16.63 is the chi-square (G^2) value in a test for independence between rows and columns (perform this test using the chisquare procedure used in exercise session 1). It has 3 df and its P value is < 0.001, indicating that the independence model is not adequate. Only a single parameter, the constant, is fitted here, giving the model for the data as

$$E(n_{i1}) = n_{i.}\,p, \qquad E(n_{i2}) = n_{i.}\,(1-p), \qquad \log\frac{p}{1-p} = \alpha$$

    *** Estimates of parameters ***
                estimate    s.e.    t(*)   antilog of estimate
    Constant       1.715   0.151   11.39                 5.558

This shows that the ML estimate is $\hat\alpha = 1.715$, and the logit formula can be inverted to give the ML estimate of p:

$$\hat p = \frac{e^{1.715}}{1 + e^{1.715}} = 0.847$$

This is the ML estimate of the proportion of dead aphids in the data. The total number of aphids in the trial is 341 with 289 dead, which gives a proportion of 0.847, the same as obtained above. Viewed as a simple binomial with 341 trials this would have been the ML estimate of p, so it is reassuring that the two agree.

Fitting the Saturated model.

    " fit the saturated model using the factor conc_lev"
    fit [fprob=yes] conc_lev

    ***** Regression Analysis *****
    *** Summary of analysis ***
                  d.f.   deviance   mean deviance   deviance ratio   approx chi pr
    Regression       3      16.63           5.544             5.54           <.001
    Residual         0       0.00               *
    Total            3      16.63           5.544

Here the value 16.63 in the last line is the chi-square (G^2) value in a test for independence between rows and columns, as before. This is then partitioned into a component due to the saturated model (which fits three parameters (4-1) in addition to a constant, to represent the four probabilities, one for each level of the factor conc_lev) and the remainder unexplained by the model. For the saturated model all the variation is explained (chi-square 16.63 with 3 degrees of freedom) as it gives a perfect fit, and the remainder is zero with zero degrees of freedom. The model fitted is as follows:

$$E(n_{i1}) = n_{i.}\,p_i, \qquad E(n_{i2}) = n_{i.}\,(1-p_i)$$

and

$$\log\frac{p_1}{1-p_1} = \alpha, \qquad \log\frac{p_i}{1-p_i} = \alpha + \beta_i \quad \text{for } i > 1$$

This gives 4 parameters $\alpha$, $\beta_2$, $\beta_3$ and $\beta_4$, where $\alpha$ is the logit of $p_1$ and $\beta_i$ is the difference between the logit of $p_i$ and the logit of $p_1$. The following table estimates these parameters.

    *** Estimates of parameters ***
                  estimate    s.e.    t(*)   antilog of estimate
    Constant         0.916   0.252    3.63                 2.500
    conc_lev 2       0.646   0.396    1.63                 1.908
    conc_lev 3       1.204   0.396    3.04                 3.333
    conc_lev 4       1.751   0.527    3.32                 5.760

The final column above gives the antilog of the estimate. For the three parameters for conc_lev this gives the ML estimate of the OR between the appropriate level and the first level. Thus 3.333 is the estimate of the OR between level 1 and level 3. Check this with the 2 x 2 table formed by taking data just for levels 1 and 3 in the data. This subtable is

                   Level 3   Level 1
    Conc (%)           1.6      0.65
    Dead               100        55
    Alive               12        22

where the OR is estimated as (100*22)/(12*55) = 3.333.

The following command predicts proportions from the model for each of the levels of conc_lev as

$$\hat p_1 = \frac{e^{\hat\alpha}}{1 + e^{\hat\alpha}} = \frac{e^{0.916}}{1 + e^{0.916}} = 0.714, \qquad \hat p_2 = \frac{e^{\hat\alpha + \hat\beta_2}}{1 + e^{\hat\alpha + \hat\beta_2}} = \frac{e^{0.916 + 0.646}}{1 + e^{0.916 + 0.646}} = 0.827$$

and similarly for levels 3 and 4.

    predict conc_lev

    conc_lev   Prediction     S.e.
        1.00       0.7143   0.0515
        2.00       0.8267   0.0437
        3.00       0.8929   0.0292
        4.00       0.9351   0.0281
    * MESSAGE: S.e.s are approximate, since model is not linear.
    * MESSAGE: S.e.s are based on dispersion parameter with value 1
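The saturated-model quantities above can be checked directly from the counts, since each conc_lev parameter is a log odds ratio from a 2 x 2 subtable against level 1. The following is a minimal sketch in Python (not part of the Genstat session). The variable names are illustrative, and the standard error uses the usual large-sample formula sqrt(1/a + 1/b + 1/c + 1/d) for a log odds ratio, which is assumed here to be what the s.e. column reports.

```python
import math

# Dead and alive counts for conc_lev 1..4
dead = [55, 62, 100, 72]
alive = [22, 13, 12, 5]


def expit(x):
    """Inverse of the logit link: maps a logit back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


# alpha is the logit at level 1; beta_i is the difference in logits from level 1
logits = [math.log(d / a) for d, a in zip(dead, alive)]
alpha = logits[0]
betas = [logit - alpha for logit in logits[1:]]

# Antilog of each beta is the odds ratio of that level against level 1
odds_ratios = [math.exp(b) for b in betas]

# Large-sample s.e. of each log odds ratio from its 2 x 2 subtable (assumed formula)
se_betas = [
    math.sqrt(1 / dead[0] + 1 / alive[0] + 1 / d + 1 / a)
    for d, a in zip(dead[1:], alive[1:])
]

# Fitted proportions: invert the logit for each level
fitted = [expit(alpha)] + [expit(alpha + b) for b in betas]

for level, p in enumerate(fitted, start=1):
    print(f"conc_lev {level}: fitted proportion {p:.4f}")
for level, (o, s) in enumerate(zip(odds_ratios, se_betas), start=2):
    print(f"conc_lev {level} vs 1: OR {o:.3f}, s.e. of log OR {s:.3f}")
```

Because the model is saturated, the fitted proportions are simply dead/(dead+alive) at each level.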

    " The following command prints the proportions calculated from the data for comparison with those predicted from the model"
    print conc_lev,prop

    conc_lev     prop
       1.000   0.7143
       2.000   0.8267
       3.000   0.8929
       4.000   0.9351

Note that for the saturated model the proportions predicted from the model are identical with those calculated from the data at each concentration level regarded as a simple binomial. As the saturated model fits a separate parameter for each group, this is what we might expect.

Fitting the Linear Logistic regression model.

This provides the answers to questions (b) and (c) at the start of this exercise.

    " fit a linear logistic regression logit(p) = a + b Conc"
    fit [fprob=yes] conc

                  d.f.   deviance   mean deviance   deviance ratio   approx chi pr
    Regression       1   16.55822        16.55822            16.56           <.001
    Residual         2    0.07459         0.03729
    Total            3   16.63281         5.54427
    * MESSAGE: ratios are based on dispersion parameter with value 1

Here the value 16.63 in the last line is the chi-square (G^2) value in a test for independence between rows and columns, as before. This is then partitioned into a component due to the linear logistic regression model (which fits one parameter in addition to a constant, to represent the relationship between the four probabilities and the variate conc, the actual concentration of sodium oleate in the treatments) and the remainder unexplained by the model. The chi-square for the regression part is 16.558 with 1 df (P<0.001). The remainder is 0.075 with 2 df, which is not at all significant. This indicates that the regression model gives a good representation of the data. The model fitted is as follows:

$$E(n_{i1}) = n_{i.}\,p_i, \qquad E(n_{i2}) = n_{i.}\,(1-p_i), \qquad \log\frac{p_i}{1-p_i} = \alpha + \beta\,\mathrm{conc}_i$$

This gives 2 parameters, $\alpha$ and $\beta$. The following table gives estimates of these parameters.

                estimate    s.e.    t(*)   antilog of estimate
    Constant       0.152   0.399    0.38                 1.164
    conc           1.226   0.315    3.89                 3.406
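For readers who want to see where the linear logistic estimates and the residual deviance come from, the fit can be reproduced with a few Newton-Raphson steps on the binomial log-likelihood. This is a sketch under stated assumptions: it is written in Python rather than Genstat, the helper `score_and_information` is a hypothetical name, and the iteration is a generic textbook scheme, not a description of Genstat's internal algorithm.

```python
import math

conc = [0.65, 1.1, 1.6, 2.1]
dead = [55, 62, 100, 72]
alive = [22, 13, 12, 5]
n = [d + a for d, a in zip(dead, alive)]


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def score_and_information(alpha, beta):
    """Score vector and information matrix for logit(p_i) = alpha + beta * conc_i."""
    p = [expit(alpha + beta * x) for x in conc]
    w = [ni * pi * (1 - pi) for ni, pi in zip(n, p)]  # binomial variances
    u0 = sum(d - ni * pi for d, ni, pi in zip(dead, n, p))
    u1 = sum(x * (d - ni * pi) for x, d, ni, pi in zip(conc, dead, n, p))
    i00 = sum(w)
    i01 = sum(wi * x for wi, x in zip(w, conc))
    i11 = sum(wi * x * x for wi, x in zip(w, conc))
    return (u0, u1), (i00, i01, i11)


# Newton-Raphson: update = inverse(information) * score, using the 2 x 2 inverse
alpha, beta = 0.0, 0.0
for _ in range(25):
    (u0, u1), (i00, i01, i11) = score_and_information(alpha, beta)
    det = i00 * i11 - i01 * i01
    alpha += (i11 * u0 - i01 * u1) / det
    beta += (i00 * u1 - i01 * u0) / det

# Standard errors from the inverse information at the converged estimates
_, (i00, i01, i11) = score_and_information(alpha, beta)
det = i00 * i11 - i01 * i01
se_alpha = math.sqrt(i11 / det)
se_beta = math.sqrt(i00 / det)

# Residual deviance: G^2 comparing observed counts with fitted counts
fitted = [expit(alpha + beta * x) for x in conc]
deviance = 2 * sum(
    d * math.log(d / (ni * p)) + a * math.log(a / (ni * (1 - p)))
    for d, a, ni, p in zip(dead, alive, n, fitted)
)
p_lack_of_fit = math.exp(-deviance / 2)  # chi-square upper tail for 2 d.f.

print(f"Constant {alpha:.3f} (s.e. {se_alpha:.3f}), conc {beta:.3f} (s.e. {se_beta:.3f})")
print(f"Residual deviance {deviance:.4f} on 2 d.f., P = {p_lack_of_fit:.3f}")
```

A large P value for the residual deviance corresponds to the conclusion in the text that there is no evidence of lack of fit.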

The second entry in the final column of the parameter estimates table for the linear model gives the antilog of the regression coefficient, and estimates the OR due to increasing concentration by one unit as 3.406. This means that the odds of death (the number of deaths per survivor) are 3.4 times greater at concentration 1.7 than at concentration 0.7.

The following commands give the predicted proportion dead at the concentrations of sodium oleate nominated in the variate xconc (0.7, 1.3 and 2).

    variate [nvalues=3;values=0.7,1.3,2] xconc
    predict conc; levels=xconc

    conc   Prediction     S.e.
    0.70       0.7329   0.0418
    1.30       0.8513   0.0202
    2.00       0.9310   0.0195
    * MESSAGE: S.e.s are approximate, since model is not linear.
    * MESSAGE: S.e.s are based on dispersion parameter with value 1
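The predictions and the odds ratio interpretation can be verified from the two reported coefficients alone. The sketch below (Python, not part of the Genstat session) plugs in the rounded estimates 0.152 and 1.226, so the results may differ from the Genstat output in the fourth decimal place. The function names are illustrative, and the standard errors are not reproduced because they need the covariance between the two estimates, which the output does not show.

```python
import math

# Rounded estimates copied from the Genstat parameter table
alpha_hat = 0.152
beta_hat = 1.226


def predicted_proportion(conc):
    """Invert the logit: p = exp(a + b*conc) / (1 + exp(a + b*conc))."""
    return 1.0 / (1.0 + math.exp(-(alpha_hat + beta_hat * conc)))


def odds_ratio(delta_conc):
    """Odds ratio for an increase of delta_conc in concentration."""
    return math.exp(beta_hat * delta_conc)


def odds(p):
    return p / (1.0 - p)


xconc = [0.7, 1.3, 2.0]
predictions = [predicted_proportion(c) for c in xconc]
or_one_unit = odds_ratio(1.0)

# The ratio of fitted odds at 1.7 and 0.7 equals the one-unit odds ratio
ratio_17_vs_07 = odds(predicted_proportion(1.7)) / odds(predicted_proportion(0.7))

for c, p in zip(xconc, predictions):
    print(f"conc {c:.2f}: predicted proportion dead {p:.4f}")
print(f"OR per unit of concentration: {or_one_unit:.3f}")
```

The same `odds_ratio` helper gives the OR for any other increment, for example `odds_ratio(0.5)` for a half-unit increase in concentration.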

ANSWER SHEET - TO BE SUBMITTED FOR GRADING

Name of student
Date

Analysis 2.2

A survey of businesses provided the data below on the salary and educational achievement of the chief executives.

    Salary            Educational Level
                      < Third    Third
    <35000               25         0
    35000-45000          30        29
    45000-55000          24        35
    55000-65000           9        36
    65000                 3

(a) Is the proportion of respondents in third level education constant over salary classes? Complete the following table.

    Chisquare
    Df
    Significance

(b) Assign scores 3, 4, 5, 6 and 7 to the income categories and fit a linear logistic model to the proportion of chief executives with third level education as related to the score.

    Intercept and its SE
    Coefficient and its SE
    Significance of coefficient
    What is the model?

(c) Is there evidence of lack of fit with the model? Support your answer with a suitable test of significance.

    Test type
    Test value
    Degrees of freedom
    Significance

(d) Predict the proportion of chief executives with third level education for those having income of 50000 per annum and compute an approximate 95% confidence interval for your answer.

    Proportion
    95% CI Lower
    95% CI Upper

(e) Estimate the OR associated with increasing income by 20000 and compute a 95% confidence interval for it.

    OR estimate
    95% CI Lower
    95% CI Upper