Biostatistics
Lecture INF4350, October 1, 2008
Anja Bråthen Kristoffersen
Biomedical Research Group, Department of Informatics, UiO

Goal
- Presentation of data: descriptive tables and graphs
- Sensitivity, specificity, ROC curve
- Hypothesis testing
- Type I and type II error
- Multiple testing
- False positives

Variable types
Categorical variables
- Ordinal: Are you smoking? 1 daily, 2 now and then, 3 stopped last year, 4 stopped earlier, 5 never
- Nominal (discrete variables): Civil state: 1 not married, 2 married, 3 have partner, 4 divorced, 5 widow. DNA (A, T, C, G)
- Binary variables (0, 1)
Continuous variables
- Numbers

Variables can be
- Independent: they are not influenced by other variables. They are not influenced by the event, but could influence the event.
- Dependent: the variables influence each other. For instance, the information that one gene is on or off may influence another gene. Which variable depends on, or influences, the other often cannot be determined.
Average (mean)
Properties:
- All observations must be known.
- The observations do not need to be ordered.
- Sensitive to outliers (extreme, untypical values).
Equation:
$$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

Adjusted mean
- Mean based on the central observations: 90 to 95 % of the observations. The tails (5 to 10 %) of the data are not included in the calculation.
- Less sensitive to extreme observations.

Combining means
Equation:
$$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \dots + n_m \bar{x}_m}{n_1 + n_2 + \dots + n_m}$$
where $n_i$ is the number of observations behind the mean $\bar{x}_i$. Note that adjusted means cannot be combined like this.

Median
Synonyms: 50th percentile, empirical median.
Properties:
- The observations are ordered.
- The median is the value that divides the ordered observations into two equal parts.
- Not sensitive to extreme observations.
- Mathematically inconvenient, since the medians of several sets of observations cannot be combined.
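A minimal R sketch of the location measures above. The data vector `x` and the split into two groups are made up for illustration, and `trim = 0.05` (5 % dropped from each tail, keeping the central 90 %) is only one reading of the adjusted mean.

```r
# Hypothetical data: 19 ordinary values and one extreme observation.
x <- c(1:19, 100)

plain_mean   <- mean(x)               # uses every observation, pulled up by the outlier
trimmed_mean <- mean(x, trim = 0.05)  # drops 5 % from each tail (here the values 1 and 100)
med          <- median(x)             # 50th percentile

# Combining means: weight each group mean by its number of observations.
g1 <- x[1:8]
g2 <- x[9:20]
combined <- (length(g1) * mean(g1) + length(g2) * mean(g2)) /
  (length(g1) + length(g2))

print(c(mean = plain_mean, trimmed = trimmed_mean, median = med, combined = combined))
```

The combined value reproduces the overall mean, while the trimmed mean and the median stay close to the bulk of the data despite the outlier.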
Mode
- The observation that occurs most often.
- Mathematically inconvenient, since the modes of several sets of observations cannot be combined.

Dispersal measures
Range
- $X_{(n)} - X_{(1)}$, the largest minus the smallest observation.
- Same unit as the observations.
- Sensitive to extreme observations.
Quantiles, percentiles
- The number $V_p$ that has the proportion $p$ of the ordered observations below it ($0 < p < 1$).
- Same unit as the observations.
Standard deviation
$$sd = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$
- Always positive.
- Outlying observations contribute most.
- Same unit as the observations.

Standard deviation
- If the data are close to Gaussian distributed, approximately 95 % of the population is within $\bar{x} \pm 1.96 \, sd$, which approximately corresponds to the 2.5 and 97.5 percentiles.
- This is a consequence of the properties of the Gaussian distribution. It depends on approximate symmetry and unimodality.
- Quick and dirty: $sd \approx \text{Range} / 4$. Handy as a first guess of the sd when calculating the necessary number of observations.

Descriptive statistics - tables
- A scalar variable: calculate mean, median and standard deviation.
- A categorical variable: descriptive statistics, frequencies.
- Two categorical variables: descriptive statistics, cross table.
- A scalar variable and a categorical variable: compare mean/median for each category.
- Two scalar variables: categorise one of the variables, or use linear regression.
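The dispersal measures above can be sketched in R as follows. The data are simulated, and the seed, sample size, mean and sd are arbitrary choices for illustration. The range/4 guess is applied to a small subsample of 30 values, since the range tends to grow with the number of observations.

```r
set.seed(1)
x <- rnorm(1000, mean = 50, sd = 10)   # simulated, approximately Gaussian data

n <- length(x)
sd_manual  <- sqrt(sum((x - mean(x))^2) / (n - 1))  # the formula from the slide
sd_builtin <- sd(x)                                 # R's built-in equivalent

spread <- diff(range(x))                 # range: largest minus smallest value
pct    <- quantile(x, c(0.025, 0.975))   # empirical 2.5 and 97.5 percentiles
gauss_interval <- mean(x) + c(-1.96, 1.96) * sd_builtin

# Share of observations inside mean +/- 1.96 sd (near 0.95 for Gaussian data)
inside <- mean(abs(x - mean(x)) <= 1.96 * sd_builtin)

# Quick-and-dirty first guess of the sd from a small sample
small    <- x[1:30]
sd_guess <- diff(range(small)) / 4

print(rbind(percentiles = pct, gaussian = gauss_interval))
print(c(sd = sd_builtin, sd_guess = sd_guess, inside = inside))
```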
Always plot your data
A plot tells more than 1000 tests.
- A scalar variable: histogram, box plot. To compare the data with the Gaussian distribution, a Q-Q plot is easier to read and explain than a Gaussian curve drawn on top of a histogram.
  [Figure: Q-Q plot and histogram]
- Two scalar variables: scatter plot.
- A scalar and a categorical variable: box plot, scatter plot.
- Two scalar variables and a categorical variable: scatter plot.
Example: probability of getting a boy

Number of babies born   Number of boys   Proportion of boys
                   10                8   0.8
                  100               55   0.55
                 1000              525   0.525
                10000             5139   0.5139
               100000            51127   0.51127
               376058          1927054   0.51247
             17989361          9219202   0.51248
             34832051         17857857   0.51268

Relative risk
A = {positive mammogram}
B = {breast cancer within two years}
$$\Pr(B \mid A) = 0.1, \qquad \Pr(B \mid \bar{A}) = 0.0002$$
$$RR = \frac{\Pr(B \mid A)}{\Pr(B \mid \bar{A})} = \frac{0.1}{0.0002} = 500$$

Prevalence, sensitivity, specificity, and more
A = {symptom or positive diagnostic test}
B = {ill}
- $P(B)$ = prevalence of the illness
- $P(A \mid B)$ = sensitivity
- $P(A \mid \bar{B})$ = false positive rate
- $P(\bar{A} \mid \bar{B})$ = specificity
$$P(\bar{A} \mid B) + P(A \mid B) = 1$$
$$P(A \mid \bar{B}) = 1 - P(\bar{A} \mid \bar{B}) = 1 - \text{specificity}$$
Then we have
- $P(B \mid A)$ = PPV = PV+ = positive predictive value
- $P(\bar{B} \mid \bar{A})$ = NPV = PV- = negative predictive value

Example: breast cancer diagnostic
A = {positive mammogram}
B = {breast cancer within two years}
$$\Pr(B \mid \bar{A}) = 0.0002 \quad \text{then} \quad NPV = \Pr(\bar{B} \mid \bar{A}) = 1 - \Pr(B \mid \bar{A}) = 1 - 0.0002 = 0.9998$$
$$PPV = \Pr(B \mid A) = 0.1$$
Example: breast cancer in different groups
Breast cancer among women 45 to 54 years old
- Group A: gave first birth before 20 years old
- Group B: gave first birth after 30 years old
Assume that 40 of 10000 in group A and 50 of 10000 in group B get cancer. Coincidence or different risk?
What if the numbers were 400 of 100000 and 500 of 100000? Still coincidence?

Traditional 2×2 table

                  ill: +     ill: -     Total
Test result +     a [TP]     b [FP]     a+b
Test result -     c [FN]     d [TN]     c+d
Total             a+c        b+d        a+b+c+d

TP = true positive, FP = false positive, FN = false negative, TN = true negative

Analysis of a 2×2 table
Fisher showed that the probability of obtaining any such set of values is given by the hypergeometric distribution:
$$p = \frac{\binom{a+b}{a} \binom{c+d}{c}}{\binom{a+b+c+d}{a+c}} = \frac{(a+b)! \, (c+d)! \, (a+c)! \, (b+d)!}{(a+b+c+d)! \; a! \, b! \, c! \, d!}$$
If the p-value is less than a cutoff (e.g. p < 0.05), we reject the null hypothesis and assume that the true odds ratio is not equal to 1, hence that the test result differentiates between ill and not ill.

Example: breast cancer
Table: a = 40, b = 9960, c = 50, d = 9950

    > fisher.test(matrix(c(40, 9960, 50, 9950), ncol = 2, byrow = TRUE))

        Fisher's Exact Test for Count Data

    data:  matrix(c(40, 9960, 50, 9950), ncol = 2, byrow = TRUE)
    p-value = 0.3417
    alternative hypothesis: true odds ratio is not equal to 1
    95 percent confidence interval:
     0.5133146 1.2371891
    sample estimates:
    odds ratio
     0.7992074

Table: a = 400, b = 99600, c = 500, d = 99500

    > fisher.test(matrix(c(400, 99600, 500, 99500), ncol = 2, byrow = TRUE))

        Fisher's Exact Test for Count Data

    data:  matrix(c(400, 99600, 500, 99500), ncol = 2, byrow = TRUE)
    p-value = 0.0009314
    alternative hypothesis: true odds ratio is not equal to 1
    95 percent confidence interval:
     0.6987355 0.9135934
    sample estimates:
    odds ratio
     0.7991994
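As a sketch of how the hypergeometric formula relates to the `fisher.test` output, the probability of exactly the observed table can be computed by hand and compared with R's `dhyper`. The variable names are illustrative. The binomial coefficients are evaluated on the log scale with `lchoose`, a choice made here to stay clear of overflow with counts this large.

```r
tab <- matrix(c(40, 9960, 50, 9950), ncol = 2, byrow = TRUE)
n11 <- tab[1, 1]; n12 <- tab[1, 2]   # a, b in the 2x2 layout
n21 <- tab[2, 1]; n22 <- tab[2, 2]   # c, d in the 2x2 layout
n   <- sum(tab)

# Probability of exactly this table: C(a+b, a) * C(c+d, c) / C(n, a+c)
log_p   <- lchoose(n11 + n12, n11) + lchoose(n21 + n22, n21) -
  lchoose(n, n11 + n21)
p_table <- exp(log_p)

# The same quantity from the hypergeometric density
p_dhyper <- dhyper(n11, n11 + n21, n12 + n22, n11 + n12)

# The two-sided test sums the probabilities of all tables that are
# no more probable than the observed one, so its p-value is larger.
p_small <- fisher.test(tab)$p.value        # 40 vs 50 of 10000
p_large <- fisher.test(tab * 10)$p.value   # 400 vs 500 of 100000

print(c(p_table = p_table, p_dhyper = p_dhyper, p_small = p_small, p_large = p_large))
```

The odds ratio is nearly identical in both tables. Only the amount of data changes, and with it the p-value.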
Prevalence, sensitivity, specificity, and more
With the counts a, b, c, d from the 2×2 table:
$$\text{Prevalence} = \Pr(B) = \frac{a+c}{a+b+c+d}$$
$$\text{Sensitivity} = \Pr(A \mid B) = \frac{a}{a+c}$$
$$\text{Specificity} = \Pr(\bar{A} \mid \bar{B}) = \frac{d}{b+d}$$
$$PPV = \Pr(B \mid A) = \frac{a}{a+b}$$
$$NPV = \Pr(\bar{B} \mid \bar{A}) = \frac{d}{c+d}$$
$$\text{Accuracy} = \frac{a+d}{a+b+c+d}$$

Testing hypotheses
- Find a null and an alternative hypothesis. Example:
  H0: the expected response is equal in both groups.
  H1: the expected response differs between the groups.
- p-value: the probability of observing the observed values given that H0 is true.
- Reject H0 if the p-value is less than a given significance level (e.g. 0.05 or 0.01).

Statistical tests
- Some tests assume a certain distribution. E.g. the t-test assumes that the data are (approximately) Gaussian distributed.
- Non-parametric tests are more flexible. E.g. comparing two medians: non-parametric test for two independent groups (Mann-Whitney).

Statistical test methods
- Two categorical variables: Fisher test, chi-square test
- Two scalar variables: Mann-Whitney, t-test (t.test)
- A scalar and a categorical variable: ANOVA
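The 2×2 measures defined at the start of this section translate directly into R. The counts below are hypothetical and only serve to exercise the formulas.

```r
# Hypothetical counts in the slide's layout: a = TP, b = FP, c = FN, d = TN
tp <- 90; fp <- 30; fn <- 10; tn <- 870
total <- tp + fp + fn + tn

measures <- c(
  prevalence  = (tp + fn) / total,   # (a + c) / (a + b + c + d)
  sensitivity = tp / (tp + fn),      # a / (a + c)
  specificity = tn / (fp + tn),      # d / (b + d)
  ppv         = tp / (tp + fp),      # a / (a + b)
  npv         = tn / (fn + tn),      # d / (c + d)
  accuracy    = (tp + tn) / total    # (a + d) / (a + b + c + d)
)
print(round(measures, 3))
```

With these counts the test has a sensitivity of 0.9 but a PPV of only 0.75, because the illness is rare (prevalence 0.1).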
The t-test
The t statistic is based on the sample mean and variance.

Mann-Whitney
In order to apply the Mann-Whitney test, the raw data from samples A and B must first be combined into a set of $N = n_a + n_b$ elements, which are then ranked from lowest to highest. These rankings are then re-sorted into the two separate samples.
The value of U reported in this analysis is the one based on sample A, calculated as
$$U_A = n_a n_b + \frac{n_a (n_a + 1)}{2} - T_A$$
where $T_A$ is the observed sum of ranks for sample A, and
$$n_a n_b + \frac{n_a (n_a + 1)}{2}$$
is the maximum possible value of $T_A$.
Convert the U statistics into p-values.

ANOVA
- The t-test and its variants only work when there are two sample pools.
- Analysis of variance (ANOVA) is a general technique for handling multiple variables, with replicates.

A simple experiment
- Measure the response to a drug treatment in two different mouse strains.
- Repeat each measurement five times.
- Total experiment = 2 strains × 2 treatments × 5 repetitions = 20 arrays.
- If you look for treatment effects using a t-test, then you ignore the strain effects.
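Returning to the Mann-Whitney statistic defined above, the following sketch computes $U_A$ by hand and relates it to R's `wilcox.test`. The two samples are made up and have no ties. Note that `wilcox.test` reports W, the rank sum of the first sample minus its minimum possible value, so under the definition used here $U_A = n_a n_b - W$.

```r
# Hypothetical samples without ties
sample_a <- c(1.1, 2.3, 2.9, 4.0)
sample_b <- c(3.1, 5.2, 6.0, 6.8, 7.5)
n_a <- length(sample_a)
n_b <- length(sample_b)

ranks <- rank(c(sample_a, sample_b))   # rank the combined N = n_a + n_b values
t_a   <- sum(ranks[seq_len(n_a)])      # observed rank sum for sample A
u_a   <- n_a * n_b + n_a * (n_a + 1) / 2 - t_a

# wilcox.test() reports W = T_A - n_a (n_a + 1) / 2
mw      <- wilcox.test(sample_a, sample_b)
w       <- unname(mw$statistic)
p_value <- mw$p.value                  # U (or W) converted into a p-value

print(c(T_A = t_a, U_A = u_a, W = w, p = p_value))
```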
[Figure: Two-factor design]

ANOVA lingo
- Factor: a variable that is under the control of the experimenter (strain, treatment).
- Level: a possible value of a factor (drug, no drug).
- Main effect: an effect that involves only one factor.
- Interaction effect: an effect that involves two or more factors simultaneously.
- Balanced design: an experiment in which each factor and level is measured an equal number of times.

Fixed and random effects
- Fixed effect: a factor for which the levels would be repeated exactly if the experiment were repeated.
- Random effect: a term for which the levels would not repeat in a replicated experiment.
- In the simple experiment, treatment and strain are fixed effects, and we include a random effect to account for biological and experimental variability.

ANOVA model
$$E_{ijk} = \mu + T_i + S_j + (TS)_{ij} + \varepsilon_{ijk}, \qquad i = 1, \dots, n, \quad j = 1, \dots, m, \quad k = 1, \dots, p$$
- $\mu$ is the mean expression level of the gene.
- $T$ and $S$ are main effects (treatment, strain) with $n$ and $m$ levels, respectively.
- $TS$ is an interaction effect.
- $p$ is the number of replicates per group.
- $\varepsilon$ represents random error (to be minimized).
ANOVA steps
For each gene on the array:
- Fit the parameters T and S, minimizing ε.
- Test T, S and TS for difference from zero, yielding three F statistics.
- Convert the F statistics into p-values.

F-statistics
- Compare two linear models.
$$F = \frac{\text{Mean Squares Group}}{\text{Mean Squares Error}} = \frac{MSG}{MSE}$$
- This compares the variation between groups (group mean to group mean) to the variation within groups (individual values to group means).

F-distribution
$$p\text{-value} = \Pr\left(F_{df_1, df_2} > F_{\text{calculated}}\right)$$

ANOVA assumptions
- For a given gene, the random error terms are independent, normally distributed and have uniform variance.
- The main effects and their interactions are linear.

ANOVA output
[Figure: ANOVA output for genes A and B, vehicle vs. drug, with p-values for strain effects, treatment effects and interaction effects]
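The ANOVA steps can be sketched for a single gene in the simple experiment (2 strains × 2 treatments × 5 replicates). The expression values are simulated, and the effect sizes, noise level, seed and column names are assumptions made only for illustration. `aov` fits the model with both main effects and the interaction.

```r
set.seed(42)
# Simulated design for one gene: 2 strains x 2 treatments x 5 replicates = 20 arrays
design <- expand.grid(
  replicate = 1:5,
  treatment = c("vehicle", "drug"),
  strain    = c("A", "B")
)
# Assumed effect sizes, chosen only for illustration
design$expr <- 8 +
  ifelse(design$treatment == "drug", 1.5, 0) +
  ifelse(design$strain == "B", 0.5, 0) +
  rnorm(nrow(design), sd = 0.5)

fit <- aov(expr ~ treatment * strain, data = design)
tab <- summary(fit)[[1]]   # rows: treatment, strain, interaction, residuals
print(tab)

# F = mean square of the effect / mean square error,
# p-value = upper tail of the F distribution with (df1, df2) degrees of freedom
f_treatment <- tab[["Mean Sq"]][1] / tab[["Mean Sq"]][4]
p_treatment <- pf(f_treatment, tab[["Df"]][1], tab[["Df"]][4], lower.tail = FALSE)
```

Recomputing F and its p-value by hand for the treatment row reproduces the values in the `aov` table, which is the MSG/MSE comparison described above.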
Multiple testing correction
This and some of the following slides are from http://compdiag.molgen.mpg.de/ngfn/docs/2004/mar/differentialgenes.pdf.

Multiple testing correction
- On an array of 10,000 spots, a p-value of 0.0001 may not be significant.
- Bonferroni correction: divide your p-value cutoff by the number of measurements.
- For a significance of 0.05 with 10,000 spots, you need a p-value of $5 \times 10^{-6}$.
- Bonferroni is conservative because it assumes that all genes are independent.
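The Bonferroni arithmetic above, as a short R sketch. Dividing the cutoff by the number of tests is equivalent to multiplying each p-value by that number (capped at 1), which is what `p.adjust` with `method = "bonferroni"` does. The variable names are illustrative.

```r
alpha <- 0.05
m     <- 10000                 # number of spots, i.e. number of tests
cutoff <- alpha / m            # per-test cutoff: 5e-06

p <- 1e-4                      # the p-value from the example
significant <- p < cutoff      # FALSE: not significant after correction

# Equivalent view: adjust the p-value instead of the cutoff
p_bonf <- p.adjust(p, method = "bonferroni", n = m)

print(c(cutoff = cutoff, p = p, p_bonf = p_bonf))
```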
Types of errors
- False positive (Type I error): the experiment indicates that the gene has changed, but it actually has not.
- False negative (Type II error): the gene has changed, but the experiment failed to indicate the change.
- Typically, researchers are more concerned about false positives.
- Without doing many (expensive) replicates, there will always be many false negatives.

False discovery rate
[Figure: ranked gene list with 5 FP, 13 TP, 33 TN, 5 FN]
- The false discovery rate (FDR) is the percentage of genes above a given position in the ranked list that are expected to be false positives.
- False positive rate: percentage of non-differentially expressed genes that are flagged.
- False discovery rate: percentage of flagged genes that are not differentially expressed.
$$FDR = \frac{FP}{FP + TP} = \frac{5}{18} = 27.8\,\%$$
$$FPR = \frac{FP}{FP + TN} = \frac{5}{38} = 13.2\,\%$$

Controlling the FDR
- Order the unadjusted p-values $p_1 \le p_2 \le \dots \le p_m$.
- To control the FDR at level $\alpha$, find
$$j^* = \max \left\{ j : p_j \le \frac{j}{m} \alpha \right\}$$
  where $j$ is the rank of the gene, $p_j$ is the p-value of that gene, $\alpha$ is the desired significance threshold and $m$ is the total number of genes.
- Reject the null hypothesis for $j = 1, \dots, j^*$.
- This approach is conservative if many genes are differentially expressed.
(Benjamini & Hochberg, 1995)

FDR example

Rank    (jα)/m     p-value
1       0.00005    0.0000008
2       0.00010    0.0000012
3       0.00015    0.0000013
4       0.00020    0.0000056
5       0.00025    0.0000078
6       0.00030    0.0000235
7       0.00035    0.0000945
8       0.00040    0.0002450
9       0.00045    0.0004700
10      0.00050    0.0008900
...
1000    0.05000    1.0000000

- Choose the threshold at the lowest position in the ranked list where the p-value is still at most (jα)/m. In this table that is rank 8.
- Approximately 5 % of the examples above the line are expected to be false positives.
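A sketch of the step-up rule applied to the ten p-values listed in the FDR example, with m = 1000 and α = 0.05. The rows between rank 10 and rank 1000 are not shown in the table, so this sketch assumes that none of them falls below its threshold. The counts for the FDR and FPR arithmetic are the ones from the figure.

```r
# FDR and FPR from the counts in the figure
fp <- 5; tp <- 13; tn <- 33
fdr <- fp / (fp + tp)   # 5/18
fpr <- fp / (fp + tn)   # 5/38

# Step-up rule on the ten smallest p-values of the example
alpha <- 0.05
m     <- 1000           # total number of genes
p_top <- c(0.0000008, 0.0000012, 0.0000013, 0.0000056, 0.0000078,
           0.0000235, 0.0000945, 0.0002450, 0.0004700, 0.0008900)
j         <- seq_along(p_top)
threshold <- j * alpha / m
passes    <- p_top <= threshold

# Largest rank whose p-value is at most (j * alpha) / m.
# Assumption: no rank beyond 10 passes (those rows are elided in the table).
j_star <- max(j[passes])

# Cross-check with p.adjust(), telling it the full number of tests
p_bh <- p.adjust(p_top, method = "BH", n = m)
n_rejected <- sum(p_bh <= alpha)

print(data.frame(rank = j, threshold = threshold, p = p_top, passes = passes))
print(c(j_star = j_star, n_rejected = n_rejected, FDR = fdr, FPR = fpr))
```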
Bonferroni vs. false discovery rate
- Bonferroni controls the family-wise error rate, i.e. the probability of at least one false positive.
- FDR is the proportion of false positives among the genes that are flagged as differentially expressed.

Diagnostic/ROC curve
Rating of 109 CT images by one radiologist:

Status        Definitely    Probably     Unsure    Probably        Definitely      Total
              normal (1)    normal (2)   (3)       not normal (4)  not normal (5)
Normal        33            6            6         11              2               58
Not normal    3             2            2         11              33              51
Total         36            8            8         22              35              109

Criterion 1: all with a rating from 1 to 5 get the diagnosis ill. Finds all the ill ones, but identifies no healthy ones.
Sensitivity 1, specificity 0, false positive rate 1

Criterion 2: all with a rating from 2 to 5 get the diagnosis ill. Finds 48/51 of the ill ones, and identifies 33/58 of the healthy ones.
Sensitivity 0.94, specificity 0.57, false positive rate 0.43
Criterion 3: all with a rating from 3 to 5 get the diagnosis ill. Finds 46/51 of the ill ones, and identifies 39/58 of the healthy ones.
Sensitivity 0.90, specificity 0.67, false positive rate 0.33

Criterion 4: all with a rating from 4 to 5 get the diagnosis ill. Finds 44/51 of the ill ones, and identifies 45/58 of the healthy ones.
Sensitivity 0.86, specificity 0.78, false positive rate 0.22

Criterion 5: all with a rating of 5 get the diagnosis ill. Finds 33/51 of the ill ones, and identifies 56/58 of the healthy ones.
Sensitivity 0.65, specificity 0.97, false positive rate 0.03

Criterion 6: all with a rating > 5 get the diagnosis ill. Finds none of the ill ones, but identifies all the healthy ones.
Sensitivity 0, specificity 1, false positive rate 0
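The six criteria can be computed in one pass from the rating counts of the 109 CT images. This is a minimal sketch: the vector names are illustrative, and the ratings are coded 1 (definitely normal) to 5 (definitely not normal), so that criterion k calls every image with rating k or higher ill.

```r
# Counts per rating 1..5, taken from the radiologist's table
normal     <- c(33, 6, 6, 11, 2)    # 58 truly normal images
not_normal <- c(3, 2, 2, 11, 33)    # 51 truly not normal images
rating     <- seq_along(normal)

criteria <- 1:6   # criterion k: ratings >= k get the diagnosis ill

# Sensitivity: share of not-normal images rated >= k
sens <- sapply(criteria, function(k) sum(not_normal[rating >= k]) / sum(not_normal))
# Specificity: share of normal images rated < k
spec <- sapply(criteria, function(k) sum(normal[rating < k]) / sum(normal))

roc <- data.frame(
  criterion   = criteria,
  sensitivity = round(sens, 2),
  specificity = round(spec, 2),
  fpr         = round(1 - spec, 2)
)
print(roc)
```

Plotting the false positive rate against the sensitivity, for instance with `plot(1 - spec, sens, type = "b")`, traces the ROC curve through these six points.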
Diagnostic/ROC curve

Positive test criterion    Sensitivity    Specificity    False positive rate
1                          1              0              1
2                          0.94           0.57           0.43
3                          0.90           0.67           0.33
4                          0.86           0.78           0.22
5                          0.65           0.97           0.03
6                          0              1.0            0

References
- http://www.medisin.ntnu.no/ikm/medstat/medstat1.07.dag1.pdf
- http://www.medisin.ntnu.no/ikm/medstat/medstat1.07dag2.sanns.pdf
- http://noble.gs.washington.edu/~noble/genome373/ Microarray analysis: ANOVA and multiple testing correction