Decision Analysis (part 2 of 2) Review Linear Regression

1 Harvard-MIT Division of Health Sciences and Technology
HST.951J: Medical Decision Support, Fall 2005
Instructors: Professor Lucila Ohno-Machado and Professor Staal Vinterbo

6.873/HST.951 Medical Decision Support, Fall 2005
Decision Analysis (part 2 of 2)
Review Linear Regression
Lucila Ohno-Machado

2 Outline
- Homework clarification
- Sensitivity, specificity, prevalence
- Cost-effectiveness analysis
- Discounting cost and utilities
- Review of Linear Regression

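3 2 x 2 table (contingency table)

          TB    no TB    Total
PPD+       8        3       11
PPD-       2       87       89
Total     10       90      100

Prevalence of TB = 10/100
Sensitivity of test = 8/10
Specificity of test = 87/90
(The ratios 8/11 and 87/89 read off the PPD+ and PPD- rows are the positive and negative predictive values, not the sensitivity and specificity.)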

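4 [Figure: distributions of test values for the normal and disease populations on an axis from 0 to 10, with a threshold at 3 mm. The threshold splits the cases into true negatives (TN), false negatives (FN), false positives (FP), and true positives (TP).]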

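5 2 x 2 table (rows: test result; columns: true state; nl = normal, D = disease)

              nl     D
nl (test -)   TN    FN
D  (test +)   FP    TP

Sens = TP / (TP + FN)
Spec = TN / (TN + FP)
PPV = TP / (TP + FP)
NPV = TN / (TN + FN)
Accuracy = (TN + TP) / (TN + TP + FN + FP)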

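6 Worked example (rows: test result; columns: true state; nl = normal, D = disease)

              nl     D
nl (test -)   45    10
D  (test +)    5    40

Sens = TP / (TP + FN) = 40/50 = 0.80
Spec = TN / (TN + FP) = 45/50 = 0.90
PPV = TP / (TP + FP) = 40/45 = 0.89
NPV = TN / (TN + FN) = 45/55 = 0.82
Accuracy = (TN + TP) / total = 85/100 = 0.85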
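As a rough check of the arithmetic on this slide, the sketch below computes the five measures from the 2 x 2 counts. The function name `diagnostic_metrics` and the dictionary keys are illustrative choices, not part of the lecture; the counts (TP = 40, FN = 10, FP = 5, TN = 45) are the ones implied by the fractions above.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return the standard 2 x 2 table measures as a dict of floats."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),   # diseased cases the test catches
        "specificity": tn / (tn + fp),   # normal cases the test clears
        "ppv": tp / (tp + fp),           # positive tests that are truly diseased
        "npv": tn / (tn + fn),           # negative tests that are truly normal
        "accuracy": (tp + tn) / total,   # all correct calls over all cases
    }


m = diagnostic_metrics(tp=40, fn=10, fp=5, tn=45)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```

Note that 45/55 rounds to 0.82 at two decimals.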

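7 Sensitivity = 50/50 = 1
Specificity = 40/50 = 0.8
[Figure: threshold placed so that every disease (D) case tests positive; the normal (nl) and disease distributions are divided into TN, FP, and TP regions, with no FN.]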

8 Sensitivity = 40/50 = 0.8
Specificity = 45/50 = 0.9
[Figure: intermediate threshold; the normal (nl) and disease (D) distributions are divided into TN, FN, FP, and TP regions.]

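9 Sensitivity = 30/50 = 0.6
Specificity = 1
[Figure: threshold placed so that every normal (nl) case tests negative; the distributions are divided into TN, FN, and TP regions, with no FP.]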
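The trade-off shown on the last three slides can be sketched by sweeping a threshold over a small set of test values. The data below are hypothetical (ten made-up cases, not the 100 cases behind the slides), and the rule "positive when the value is at or above the threshold" is an assumption.

```python
def sens_spec(values, diseased, threshold):
    """Sensitivity and specificity when value >= threshold is called positive."""
    pairs = list(zip(values, diseased))
    tp = sum(1 for v, d in pairs if d and v >= threshold)
    fn = sum(1 for v, d in pairs if d and v < threshold)
    tn = sum(1 for v, d in pairs if not d and v < threshold)
    fp = sum(1 for v, d in pairs if not d and v >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical test values: five normal cases followed by five diseased cases.
values = [1, 2, 3, 4, 5, 4, 5, 6, 7, 8]
diseased = [False] * 5 + [True] * 5

for t in (4, 5, 6):
    sens, spec = sens_spec(values, diseased, t)
    print(f"threshold={t}: sensitivity={sens:.1f}, specificity={spec:.1f}")
```

Raising the threshold in this toy data lowers sensitivity and raises specificity, which is the pattern the three slides illustrate.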

10 Cost-effectiveness analysis
- Comparison of costs with health effects
  - Cost per Down syndrome case averted
  - Cost per year of life saved
- Perspectives (society, insurer, patient)
- Comparators
  - Comparison with doing nothing
  - Comparison with standard of care

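11 Discounting costs
- It is better to spend $10 next year than today (its present value will be only $9.52, assuming a 5% rate)
- Even better to spend it 2 years from now ($9.07)
- For cost-effectiveness analyses spanning multiple years, the recommended rate is usually 5%

\[ C = C_0 + \frac{C_1}{(1+r)^1} + \frac{C_2}{(1+r)^2} + \dots \]

where \( C_0 \) are the costs at time 0 and \( r \) is the discount rate. The denominator is \( (1+r)^t \): 10/1.05 = 9.52 and 10/1.05^2 = 9.07, matching the figures above.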
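A minimal sketch of the discounting arithmetic, assuming the discount factor 1/(1 + r)^t, which is the form that reproduces the $9.52 and $9.07 figures quoted above. The function name `present_value` and the list-of-yearly-costs layout are illustrative assumptions.

```python
def present_value(costs, rate=0.05):
    """Discount yearly costs to today; costs[t] is the amount spent in year t."""
    return sum(c / (1 + rate) ** t for t, c in enumerate(costs))


print(round(present_value([0, 10]), 2))     # $10 spent next year
print(round(present_value([0, 0, 10]), 2))  # $10 spent two years from now
```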

12 Discounting utilities
- Value for full mobility is 10 today (is it only 9.52 next year?)
- Should the discount rate be the same as for costs?
- If smaller, then it would always be better to wait one more year

13 Levels of Evidence in Evidence-Based Medicine (US Task Force)
- Level 1: at least 1 randomized controlled trial (RCT)
- Level 2-I: controlled trials (CT)
- Level 2-II: cohort or case-control study
- Level 2-III: multiple time series with or without the intervention
- Level 3: expert opinions

14 Examples

Intervention                          Cost per year of life saved    Life years per US$1M
Bypass surgery, middle-aged man       $11k/year                      93
CCU for low-risk patients             $435k/year                     2

15 Importance of good stratification

Bypass surgery
- Left main disease: 93
- One vessel disease: 12

CCU for chest pain
- 5% risk of MI: 2
- 20% risk of MI: 10

16 Intro to Modeling

17 Univariate Linear Model
- Y is what we want to predict (dependent variable)
- X are the predictors (independent variables)
- Y = f(X), where f is a linear function

\[ y_1 = 1\beta_0 + x_1\beta_1 \]
\[ y_2 = 1\beta_0 + x_2\beta_1 \]

[Figure: plot of y against x.]

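18 Univariate Linear Model
- \( \beta_1 \) is the slope
- \( \beta_0 \) is the intercept

\[ y_1 = 1\beta_0 + x_1\beta_1 \]
\[ y_2 = 1\beta_0 + x_2\beta_1 \]

[Figure: plot of y against x.]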

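19 Multivariate Model
- Simple model: structure and parameters
- 3 predictors, 4 parameters \( \beta \)
- One of the parameters (\( \beta_0 \)) is a constant, multiplied by 1

\[ y_1 = 1\beta_0 + x_{11}\beta_1 + x_{12}\beta_2 + x_{13}\beta_3 \]
\[ y_2 = 1\beta_0 + x_{21}\beta_1 + x_{22}\beta_2 + x_{23}\beta_3 \]

[Figure: diagram linking the inputs \( x_{11}, x_{12}, x_{13} \) and \( x_{21}, x_{22}, x_{23} \) to the parameters \( \beta_0, \beta_1, \beta_2, \beta_3 \).]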

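20 Notation and Terminology
- \( x_i \) is the vector of j inputs, covariates, independent variables, or predictors for case i (i.e., what we know for all cases)

\[ x_i = \begin{pmatrix} \text{age} \\ \text{salt} \\ \text{smoke} \end{pmatrix} = \begin{pmatrix} 30 \\ 10 \\ 1 \end{pmatrix} = \begin{pmatrix} x_{i1} \\ x_{i2} \\ x_{i3} \end{pmatrix} \]

- X is the j x n matrix of \( x_i \) column vectors (input data for each case)

\[ X^T = \begin{pmatrix} x_{11} & x_{21} \\ x_{12} & x_{22} \\ x_{13} & x_{23} \end{pmatrix}^T = \begin{pmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \end{pmatrix} \]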

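21 Prediction
- \( y_i \) is a scalar: output, dependent variable (i.e., what we want to predict), e.g., mean blood pressure

\[ \begin{pmatrix} \text{pred}_{\text{pat1}} \\ \text{pred}_{\text{pat2}} \end{pmatrix} = \begin{pmatrix} 100 \\ 98 \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \]

- Y is the vector of \( y_i \)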

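22 Multivariate Linear Model

\[ y_1 = 1\beta_0 + x_{11}\beta_1 + x_{12}\beta_2 + x_{13}\beta_3 \]
\[ y_2 = 1\beta_0 + x_{21}\beta_1 + x_{22}\beta_2 + x_{23}\beta_3 \]

\( Y = X^T\beta \), where each \( x_i \) includes a term for 1 (constant) (\( x_{10} = 1 \), \( x_{20} = 1 \), etc.) to be multiplied by the intercept \( \beta_0 \):

\[ \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} x_{10} & x_{11} & x_{12} & x_{13} \\ x_{20} & x_{21} & x_{22} & x_{23} \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix} \]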
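The matrix form can be sketched in plain Python as one dot product per case. The first case uses the example inputs from the notation slide (age 30, salt 10, smoke 1); the second case and the coefficient vector `beta` are hypothetical, picked only so the arithmetic gives round numbers.

```python
def predict(x, beta):
    """Linear prediction for one case; x already starts with the constant 1."""
    return sum(x_j * b_j for x_j, b_j in zip(x, beta))


# Each row is one case: [1, age, salt, smoke]. The leading 1 multiplies beta_0.
cases = [
    [1, 30, 10, 1],  # inputs from the notation slide
    [1, 28, 10, 0],  # hypothetical second case
]
beta = [50.0, 1.0, 2.0, 0.0]  # hypothetical coefficients, not fitted to any data

Y = [predict(x, beta) for x in cases]
print(Y)
```

Storing the constant 1 inside each input vector is what lets the intercept be handled like any other coefficient.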

23 Regression and Classification
- Regression: continuous outcome, e.g., blood pressure: \( \hat{Y} = f(X) \)
- Classification: categorical outcome, e.g., death (binary): \( G = \hat{G}(X) \)

24 Loss function
- Y and X are random variables
- f(X) is the model
- L(Y, f(X)) is the loss function (penalty for being wrong)
- It is a function of how much to pay for discrepancies between Y (the real observation) and f(X) (the estimated value for an observation)
- In several cases, we use only the error and leave the cost for the decision analysis model

25 Regression Problems
Let's concentrate on simple errors for now.
Expected Prediction Error (EPE):
- squared error: \( [Y - f(X)]^2 \)
- absolute error: \( |Y - f(X)| \)
These two error functions imply that errors in both directions are considered the same way.
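A small sketch of the symmetry claim, assuming the second error function on the slide is the absolute error |Y - f(X)|. The observed and predicted values are hypothetical.

```python
def squared_error(y, y_hat):
    """Squared difference between observation and prediction."""
    return (y - y_hat) ** 2


def absolute_error(y, y_hat):
    """Absolute difference between observation and prediction."""
    return abs(y - y_hat)


# Hypothetical blood pressure: observed 100, predicted 4 below and 4 above.
for y_hat in (96, 104):
    print(y_hat, squared_error(100, y_hat), absolute_error(100, y_hat))
```

Under-prediction and over-prediction by the same amount receive the same penalty under either function; an asymmetric cost would have to come from the decision analysis model instead.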

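26 Univariate Linear Model

\[ \hat{y}_1 = 1\beta_0 + x_1\beta_1 \]
\[ \hat{y}_2 = 1\beta_0 + x_2\beta_1 \]

\[ y_1 = 1\beta_0 + x_1\beta_1 + \text{error} \]
\[ y_2 = 1\beta_0 + x_2\beta_1 + \text{error} \]

[Figure: plot of y against x.]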

27 Squared Errors

\[ SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \]

Linear regression: \( \hat{y} = \beta_0 + \beta_1 x \), with \( \hat{Y} = f(X) \)

[Figure: plot of y against x.]

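28 Conditioning on x
- x is a certain value

\[ \hat{y}(x) = \frac{1}{k} \sum_{i:\, x_i = x} y_i \]
\[ f(x) = \mathrm{Ave}(y_i \mid x_i = x) \]

where k is the number of cases with \( x_i = x \).
- Expectation is approximated by average
- Conditioning is on x

[Figure: plot of y against x.]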

29 k-Nearest Neighbors
- N is the neighborhood

\[ \hat{y}(x) = \frac{1}{k} \sum_{x_i \in N_k(x)} y_i \]
\[ f(x) = \mathrm{Ave}(y_i \mid x_i \in N_k(x)) \]

- Expectation is approximated by average
- Conditioning is on the neighborhood

[Figure: plot of y against x.]
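The neighborhood average can be sketched for a single predictor as follows. The training data are hypothetical, and using absolute distance on one input is a simplifying assumption.

```python
def knn_predict(x, xs, ys, k):
    """Average the y values of the k training points whose x is closest."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k


# Hypothetical one-dimensional training data.
xs = [1.0, 2.0, 3.0, 6.0, 7.0]
ys = [10.0, 12.0, 14.0, 30.0, 32.0]

print(knn_predict(2.2, xs, ys, k=3))  # neighbors at x = 2, 3, 1
print(knn_predict(6.4, xs, ys, k=1))  # single nearest neighbor at x = 6
```

With k = 1 the prediction copies the nearest case; larger k smooths over more of the data.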

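30 Nearest Neighbors
- N is a continuous neighborhood

\[ \hat{Y}(x) = \frac{1}{n} \sum_{i=1}^{n} y_i w_i \]
\[ f(x) = \mathrm{WeightedAve}(y_i \mid x_i) \]

- Expectation is approximated by weighted average

[Figure: plot of y against x, with the weight w as a function of x.]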
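One way to realize a continuous neighborhood is to let the weights decay smoothly with distance. The sketch below assumes a Gaussian weight function and a `bandwidth` parameter, neither of which is specified on the slide, and scales the weights to sum to n so that (1/n) times the weighted sum is an ordinary weighted average.

```python
import math


def kernel_predict(x, xs, ys, bandwidth=1.0):
    """Weighted average of all y values; closer points get larger weights."""
    n = len(xs)
    # Gaussian weight for each training point (an assumed choice of kernel).
    raw = [math.exp(-0.5 * ((xi - x) / bandwidth) ** 2) for xi in xs]
    total = sum(raw)
    # Scale the weights so they sum to n.
    w = [n * r / total for r in raw]
    return sum(yi * wi for yi, wi in zip(ys, w)) / n


# Hypothetical training data, symmetric around x = 1.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 2.0, 3.0]
print(kernel_predict(1.0, xs, ys))
```

A small bandwidth makes the estimate behave like a nearest-neighbor lookup, while a large one approaches the overall mean of y.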

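31 Minimize Sum of Squared Errors

With \( \hat{y}_i = \beta_0 + \beta_1 x_i \):

\[ SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} [y_i - (\beta_0 + \beta_1 x_i)]^2 \]
\[ = \sum_{i=1}^{n} \left( y_i^2 - 2y_i(\beta_0 + \beta_1 x_i) + (\beta_0 + \beta_1 x_i)^2 \right) \]
\[ = \sum_{i=1}^{n} \left( y_i^2 - 2y_i\beta_0 - 2y_i\beta_1 x_i + \beta_0^2 + 2\beta_0\beta_1 x_i + \beta_1^2 x_i^2 \right) \]

[Figure: plot of y against x.]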

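32 (derivative wrt \( \beta_0 \)) = 0

\[ \frac{\partial}{\partial \beta_0} \sum_{i=1}^{n} \left( y_i^2 - 2y_i\beta_0 - 2y_i\beta_1 x_i + \beta_0^2 + 2\beta_0\beta_1 x_i + \beta_1^2 x_i^2 \right) = 0 \]
\[ \frac{\partial SSE}{\partial \beta_0} = 2 \sum_{i=1}^{n} (-y_i + \beta_0 + \beta_1 x_i) = 0 \]
\[ \beta_0 n + \beta_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i \quad \text{(Normal equation 1)} \]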

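33 (derivative wrt \( \beta_1 \)) = 0

\[ \frac{\partial}{\partial \beta_1} \sum_{i=1}^{n} \left( y_i^2 - 2y_i\beta_0 - 2y_i\beta_1 x_i + \beta_0^2 + 2\beta_0\beta_1 x_i + \beta_1^2 x_i^2 \right) = 0 \]
\[ \frac{\partial SSE}{\partial \beta_1} = 2 \sum_{i=1}^{n} (-y_i x_i + \beta_0 x_i + \beta_1 x_i^2) = 0 \]
\[ \beta_0 \sum_{i=1}^{n} x_i + \beta_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} y_i x_i \quad \text{(Normal equation 2)} \]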

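34 Solve system of normal equations

\[ \beta_0 n + \beta_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i \quad \text{(Normal equation 1)} \]
\[ \beta_0 \sum_{i=1}^{n} x_i + \beta_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} y_i x_i \quad \text{(Normal equation 2)} \]

Solution:

\[ \beta_0 = \bar{y} - \beta_1 \bar{x} \]
\[ \beta_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \]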
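The solution of the two normal equations can be checked numerically. The sketch below computes the slope as the sum of cross-deviations over the sum of squared deviations of x, the intercept as the mean of y minus the slope times the mean of x, and then substitutes both back into the normal equations. The data are hypothetical and lie exactly on a line, so the residuals of both equations should be zero.

```python
def fit_line(xs, ys):
    """Closed-form least squares estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # hypothetical data lying exactly on y = 1 + 2x
b0, b1 = fit_line(xs, ys)

# Residuals of the two normal equations at the fitted coefficients.
n = len(xs)
eq1 = b0 * n + b1 * sum(xs) - sum(ys)
eq2 = b0 * sum(xs) + b1 * sum(x * x for x in xs) - sum(x * y for x, y in zip(xs, ys))
print(b0, b1, eq1, eq2)
```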

35 Limitations of linear regression
- Assumes conditional probability p(Y | X) is normal
- Assumes equal variance in every X
- It's linear (but we can always use interaction or transformed terms)

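36 Linear Regression for Classification?

\[ \hat{y} = \beta_0 + \beta_1 x \]

[Figure: y = p plotted against x, with the level y = 1 marked and the fitted line \( \hat{y} = \beta_0 + \beta_1 x \).]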

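37 Linear Probability Model

\[ \hat{y} = \begin{cases} 0 & \text{for } \beta_0 + \beta_1 x \le 0 \\ \beta_0 + \beta_1 x & \text{for } 0 \le \beta_0 + \beta_1 x \le 1 \\ 1 & \text{for } \beta_0 + \beta_1 x \ge 1 \end{cases} \]

[Figure: plot of y against x, with the level y = 1 marked.]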
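The piecewise definition amounts to clipping the linear predictor to the interval [0, 1]. A minimal sketch, with hypothetical coefficients:

```python
def linear_probability(x, b0, b1):
    """Linear predictor b0 + b1 * x, clipped to the interval [0, 1]."""
    return min(1.0, max(0.0, b0 + b1 * x))


b0, b1 = -0.5, 0.25  # hypothetical coefficients
print([linear_probability(x, b0, b1) for x in (0, 2, 4, 6, 8)])
```

The clipping keeps predictions valid as probabilities, at the price of a function with two kinks.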

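38 Logit Model

\[ p_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_i)}} = \frac{e^{\beta_0 + \beta_1 x_i}}{e^{\beta_0 + \beta_1 x_i} + 1} \]
\[ \mathrm{logit}(p_i) = \log \frac{p_i}{1 - p_i} = \beta_0 + \beta_1 x_i \]

[Figure: p plotted against x, with the level p = 1 marked.]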
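A short sketch of the standard logit model, in which the log odds are linear in x and the probability follows the logistic curve. The coefficients are hypothetical; the point is only that applying the logit to the modeled probability recovers the linear predictor.

```python
import math


def logistic(x, b0, b1):
    """Probability under the logit model: 1 / (1 + exp(-(b0 + b1 * x)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def logit(p):
    """Log odds: log(p / (1 - p))."""
    return math.log(p / (1.0 - p))


b0, b1 = -2.0, 0.5  # hypothetical coefficients
for x in (0, 4, 8):
    p = logistic(x, b0, b1)
    print(x, round(p, 3), round(logit(p), 3))
```

Unlike the clipped linear probability model, p stays strictly between 0 and 1 for every x.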

39 Two dimensions
[Figure: plot with axes x1 and x2.]
