Inferential Statistical Analysis of Microarray Experiments 2007 Arizona Microarray Workshop
1 Inferential Statistical Analysis of Microarray Experiments
2007 Arizona Microarray Workshop
Robert J Tempelman, Department of Animal Science, tempelma@msu.edu
2 HYPOTHESIS TESTING (as if there was only one gene)
Possible outcomes (DE: differentially expressed):
H0 is true (non-DE): H0 is not rejected = No error (1 - α); H0 is rejected = Type I error (α)
H0 is false (DE): H0 is not rejected = Type II error (β); H0 is rejected = No error (1 - β)
α is the significance level (false positive rate); 1 - β is the power (sensitivity).
Standard approach:
Specify an acceptable type I error rate (α)
Seek tests that minimize the type II error rate (β), i.e., maximize power (1 - β)
3 Unpaired comparisons between two treatments
Example 1) Affymetrix data: one specimen (experimental unit/biological rep) per array
Arrays: A1, A2, ..., An and B1, B2, ..., Bn
e.g. response ($y_{ijg}$) = log fluorescence intensity for subject j on gene g within Group i
i = 1, 2, ..., T (T = # of treatments)
j = 1, 2, ..., $n_i$ ($n_i$ = # of biological reps within treatment i)
g = 1, 2, ..., G (G = # of genes)
4 Unpaired comparisons between two treatments (cont'd)
Example 2) Reference (two-color) designs: on each array a treated sample is paired with the common reference sample R (Cy5 and Cy3 channels)
Arrays: A1-R, A2-R, ..., An-R and B1-R, B2-R, ..., Bn-R
e.g. response ($y_{ijg}$) = log ratio of fluorescence intensity (relative to the common reference sample R)
Subscripts as on previous slide (one measure per probe)
5 Linear statistical model
Basis for classical statistical inference
Consider the linear model for one gene (drop g):
$Y_{ij} = \mu + trt_i + e_{ij}$
μ = overall mean
$trt_i$ = effect of treatment i
$\mu + trt_i$ = true mean for treatment i
$e_{ij}$: random experimental (biological) error
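6 THE TWO-SAMPLE t-test (T = 2)
Assume equal variances within each treatment.
Sample statistics for gene g:
Sample 1: $y_{11g}, y_{12g}, \ldots, y_{1n_1g}$ with mean $\bar{y}_{1g}$ and variance $s_{1g}^2$
Sample 2: $y_{21g}, y_{22g}, \ldots, y_{2n_2g}$ with mean $\bar{y}_{2g}$ and variance $s_{2g}^2$
Pooled variance (weighted average of $s_{1g}^2$ and $s_{2g}^2$):
$$s_g^2 = \frac{(n_1 - 1)s_{1g}^2 + (n_2 - 1)s_{2g}^2}{n_1 + n_2 - 2}$$
Given $y_{1jg} \sim N(\mu_{1g}, \sigma_g^2)$ and $y_{2jg} \sim N(\mu_{2g}, \sigma_g^2)$ with $\sigma_{1g}^2 = \sigma_{2g}^2 = \sigma_g^2$, test $H_0: \mu_{1g} = \mu_{2g}$ vs $H_1: \mu_{1g} \neq \mu_{2g}$.
Test statistic:
$$t_g = \frac{\bar{y}_{1g} - \bar{y}_{2g}}{SED_g} \sim t_{n_1 + n_2 - 2}, \qquad SED_g = s_g\sqrt{\frac{n_1 + n_2}{n_1 n_2}}$$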
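The pooled-variance statistic above can be sketched in a few lines of Python. This is an illustration only: the function name `pooled_t` and the data values are made up here and are not the workshop data.

```python
import math
from statistics import mean, variance


def pooled_t(y1, y2):
    """Equal-variance two-sample t statistic for one gene.

    Returns (t, df, sed), where sed is the standard error of the difference.
    """
    n1, n2 = len(y1), len(y2)
    # Pooled variance: weighted average of the two sample variances.
    s2 = ((n1 - 1) * variance(y1) + (n2 - 1) * variance(y2)) / (n1 + n2 - 2)
    sed = math.sqrt(s2 * (n1 + n2) / (n1 * n2))
    t = (mean(y1) - mean(y2)) / sed
    return t, n1 + n2 - 2, sed


# Made-up log intensities for one gene (hypothetical values).
group1 = [5.1, 5.6, 5.3, 5.9, 5.4, 5.7]
group2 = [4.9, 5.0, 5.2, 4.8, 5.1]
t, df, sed = pooled_t(group1, group2)
print(f"t = {t:.3f} on {df} df, SED = {sed:.3f}")
```

The statistic would then be compared with a t distribution on n1 + n2 - 2 degrees of freedom.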
7 DECISION RULES AND P-VALUES (IGNORING MULTIPLICITY)
One-tailed test ($H_1: \mu_{1g} - \mu_{2g} > 0$)
$$t_g = \frac{\bar{y}_{1g} - \bar{y}_{2g}}{SED_g}$$
If $t_g > t_\alpha$, gene g is concluded to be differentially expressed.
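8 DECISION RULES AND P-VALUES (IGNORING MULTIPLICITY)
Two-tailed test ($H_1: \mu_{1g} - \mu_{2g} \neq 0$)
$$t_g = \frac{\bar{y}_{1g} - \bar{y}_{2g}}{SED_g}$$
Compare $|t_g|$ to $t_{\alpha/2}$, or compare the P-value $= Prob(|t| > |t_g|)$ to α:
1) P-value < α: Reject $H_0: \mu_{1g} - \mu_{2g} = 0$
2) P-value > α: Fail to reject $H_0: \mu_{1g} - \mu_{2g} = 0$
[Figure: t density with rejection regions of area α/2 below $-t_{\alpha/2}$ and above $t_{\alpha/2}$]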
9 DECISION RULES AND P-VALUES (EXAMPLES)
α = 0.05, t distribution with 60 df
1) t = 2.5: P-value = Prob(t < -2.5) + Prob(t > 2.5) = 0.015 < α -> Reject H0
2) t = -1.0: P-value = Prob(t < -1.0) + Prob(t > 1.0) = 0.32 > α -> Fail to reject H0
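10 THE TWO-SAMPLE t-test
COMMENT: If $\sigma_1^2 \neq \sigma_2^2$, the t-test should be altered accordingly:
$$t = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}, \qquad t \sim t_{df^*}$$
where:
$$df^* = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$$
(Satterthwaite, 1946)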
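A minimal Python sketch of the unequal-variance test, assuming the usual Welch-Satterthwaite degrees of freedom with n - 1 divisors (the form SAS PROC TTEST reports). The name `welch_t` and the data are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(y1, y2):
    """Unequal-variance t statistic with Satterthwaite degrees of freedom."""
    n1, n2 = len(y1), len(y2)
    v1 = variance(y1) / n1  # s1^2 / n1
    v2 = variance(y2) / n2  # s2^2 / n2
    t = (mean(y1) - mean(y2)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Made-up data in which group 2 is clearly more variable than group 1.
group1 = [5.1, 5.6, 5.3, 5.9, 5.4, 5.7]
group2 = [4.2, 5.9, 5.0, 3.8, 6.1]
t_w, df_w = welch_t(group1, group2)
print(f"t = {t_w:.3f} on about {df_w:.1f} df")
```

The approximate degrees of freedom always fall between min(n1, n2) - 1 and n1 + n2 - 2, so the test is never more liberal than the pooled version on the same data.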
11 EXAMPLE and SAS CODE for one gene
Treatments T1 and T2: $n_1 = 6$; $n_2 = 5$
$\bar{y}_1 =$ ; $\bar{y}_2 =$ ; s = 05; s = 08
$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = 0069$$
t = = NS
12 RESULTS
P-value (two-tailed) = 0.1695 = $Prob(t_9 > 1.49) + Prob(t_9 < -1.49)$
[SAS PROC TTEST output]
13 An issue taken with the two-sample t-test (or classical linear model analysis)
Distributional assumptions matter, especially with small n:
effect of non-normality
outliers might have unduly large influence
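14 THE PERMUTATION TEST
The basic idea is simple: estimate the null distribution of the test statistic and use it to draw conclusions on statistical significance.
There is a close connection with bootstrap sampling.
Suppose an experiment with two treatments:
Trt 1: $y_{11}, y_{12}, \ldots, y_{1n_1}$ from distribution F, summarized by $\bar{y}_1 \pm s_1$
Trt 2: $y_{21}, y_{22}, \ldots, y_{2n_2}$ from distribution G, summarized by $\bar{y}_2 \pm s_2$
$H_0: F = G$ vs $H_1: F \neq G$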
15 THE PERMUTATION TEST
Define a statistic (e.g. $t = (\bar{y}_1 - \bar{y}_2)/SED$) and calculate its value for the actual experiment (call it $t^*$).
Repeat B times:
Take a random sample of size $n_1$ without replacement from the data to represent Group 1.
The remaining $n_2$ observations are assigned to Group 2.
Compute the value of $t = (\bar{y}_1 - \bar{y}_2)/SED$ (call it $t^{(i)}$).
P-value (one-tailed for $H_1: \mu_1 > \mu_2$): $p = \sum_{i=1}^{B} I(t^{(i)} \geq t^*)/B$
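The permutation steps above can be sketched as follows. This is a hedged illustration rather than what PROC MULTTEST does internally: the names `t_stat` and `permutation_pvalue`, the seed, and the data are assumptions made for the example.

```python
import math
import random
from statistics import mean, variance


def t_stat(y1, y2):
    """Pooled-variance two-sample t statistic."""
    n1, n2 = len(y1), len(y2)
    s2 = ((n1 - 1) * variance(y1) + (n2 - 1) * variance(y2)) / (n1 + n2 - 2)
    return (mean(y1) - mean(y2)) / math.sqrt(s2 * (n1 + n2) / (n1 * n2))


def permutation_pvalue(y1, y2, B=2000, seed=1):
    """One-tailed permutation P-value for H1: mu1 > mu2."""
    rng = random.Random(seed)
    t_obs = t_stat(y1, y2)
    pooled = list(y1) + list(y2)
    n1 = len(y1)
    hits = 0
    for _ in range(B):
        # Shuffling and splitting is a draw of n1 values without replacement.
        rng.shuffle(pooled)
        if t_stat(pooled[:n1], pooled[n1:]) >= t_obs:
            hits += 1
    return t_obs, hits / B


# Made-up data for one gene (hypothetical values).
group1 = [5.1, 5.6, 5.3, 5.9, 5.4, 5.7]
group2 = [4.9, 5.0, 5.2, 4.8, 5.1]
t_obs, p = permutation_pvalue(group1, group2)
print(f"t* = {t_obs:.3f}, one-tailed permutation p = {p:.4f}")
```

A two-tailed version would compare absolute values of the permuted and observed statistics instead.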
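16 THE PERMUTATION TEST
[Diagram: the actual experiment (Trt 1 and Trt 2 observations, with their means and variances, giving $t^*$) shown beside Permutation 1, Permutation 2, ..., Permutation B, each a reshuffling of the same observations into two groups with its own means, variances and statistic $t^{(1)}, t^{(2)}, \ldots, t^{(B)}$]
$t^* = (\bar{y}_1 - \bar{y}_2)/SED$
One-tailed permutation P-value: $p = \sum_{b=1}^{B} I(t^{(b)} \geq t^*)/B$, i.e. the proportion of times that $t^{(b)}$ equals or exceeds $t^*$ for b = 1, 2, ..., B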
17 SAS example (nsample sets the number of permutations B)
data example;
  input trt $ y;
  datalines;
T 57
T 557
T 56
T 59
T 594
T 574
T 544
T 549
T 548
T 577
T 58
;
proc multtest data=example permutation nsample=0000 pvals outsamp=res;
  test mean(y);
  class Trt;
  contrast 'Trt -' -;
run;
proc print data=res(obs=);
run;
18 Permuted samples
[SAS listing of the OUTSAMP= data set (columns Obs, _sample_, _class_, _obs_, y) showing the first two permuted data sets, and so on]
Permuted data set 1: $t^{(1)} = (\bar{y}_1^{(1)} - \bar{y}_2^{(1)})/SED^{(1)}$
Permuted data set 2: $t^{(2)} = (\bar{y}_1^{(2)} - \bar{y}_2^{(2)})/SED^{(2)}$
19 Summary
[SAS output: Continuous Variable Tabulations (columns Variable, trt, NumObs, Mean, Standard Deviation) and p-values (columns Variable, Contrast, Raw, Permutation)]
Raw: regular t-test P-value
Permutation: permutation-based two-tailed P-value
20 [Histogram: distribution of $t^{(i)}$ over B=0000 permuted datasets (x-axis: t value; y-axis: frequency), with t* = 1.49 (actual expt) and t = -1.49 marked]
Permutation p-value = 0778
21 THE BOOTSTRAP
Bootstrap tests are more widely applicable, though less accurate, than the permutation test.
Extremely useful for computing standard errors and confidence intervals.
Suppose an experiment with two treatments:
Trt 1: $y_{11}, y_{12}, \ldots, y_{1n_1}$ from distribution F, summarized by $\bar{y}_1 \pm s_1$
Trt 2: $y_{21}, y_{22}, \ldots, y_{2n_2}$ from distribution G, summarized by $\bar{y}_2 \pm s_2$
$H_0: F = G$ vs $H_1: F \neq G$
22 THE BOOTSTRAP
Define the statistic (e.g. $t = (\bar{y}_1 - \bar{y}_2)/SED$) and calculate its value for the data set (call it $t^*$).
Compute the estimated residual for each observation: $\hat{e}_{ij} = y_{ij} - \bar{y}_i$
Repeat B times:
Draw at random a sample of $n_1$ residuals with replacement: assign as data for Group 1.
Draw at random a sample of $n_2$ residuals with replacement: assign as data for Group 2.
Compute the value of t (call it $t^{(i)}$).
One-tailed P-value (for $H_1: \mu_1 > \mu_2$): $p = \sum_{i=1}^{B} I(t^{(i)} \geq t^*)/B$
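A Python sketch of the residual bootstrap described above, under stated assumptions: the names, seed and data are hypothetical, and a resample in which every value is identical (so SED = 0) is scored as t = 0, which is a choice made here and not part of the slides.

```python
import math
import random
from statistics import mean, variance


def t_stat(y1, y2):
    """Pooled-variance t statistic; returns 0.0 for a degenerate resample."""
    n1, n2 = len(y1), len(y2)
    s2 = ((n1 - 1) * variance(y1) + (n2 - 1) * variance(y2)) / (n1 + n2 - 2)
    if s2 == 0:
        return 0.0  # assumption: treat a zero-variance resample as t = 0
    return (mean(y1) - mean(y2)) / math.sqrt(s2 * (n1 + n2) / (n1 * n2))


def bootstrap_pvalue(y1, y2, B=2000, seed=1):
    """One-tailed bootstrap P-value for H1: mu1 > mu2, resampling residuals."""
    rng = random.Random(seed)
    t_obs = t_stat(y1, y2)
    m1, m2 = mean(y1), mean(y2)
    # Residuals from each group's own mean are pooled to mimic the null.
    resid = [y - m1 for y in y1] + [y - m2 for y in y2]
    hits = 0
    for _ in range(B):
        b1 = rng.choices(resid, k=len(y1))  # with replacement
        b2 = rng.choices(resid, k=len(y2))  # with replacement
        if t_stat(b1, b2) >= t_obs:
            hits += 1
    return t_obs, hits / B


# Made-up data for one gene (hypothetical values).
group1 = [5.1, 5.6, 5.3, 5.9, 5.4, 5.7]
group2 = [4.9, 5.0, 5.2, 4.8, 5.1]
t_obs, p = bootstrap_pvalue(group1, group2)
print(f"t* = {t_obs:.3f}, one-tailed bootstrap p = {p:.4f}")
```

Unlike the permutation test, the same residual can appear several times in one resample, which is why degenerate resamples need handling at small n.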
23 SAS example
data example;
  input trt $ y;
  datalines;
T 57
T 557
T 56
T 59
T 594
T 574
T 544
T 549
T 548
T 577
T 58
;
proc multtest data=example bootstrap nsample=0000 pvals outsamp=res;
  test mean(y);
  class Trt;
  contrast 'Trt -' -;
run;
proc print data=res(obs=);
run;
[Listing: group means and residuals for the actual experiment (columns Obs, trt, residual)]
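24 Bootstrap samples
[SAS listing of the OUTSAMP= data set (columns Obs, _sample_, _class_, _obs_, y) showing the first two bootstrapped data sets, and so on]
Bootstrapped data set 1: $t^{(1)} = (\bar{y}_1^{(1)} - \bar{y}_2^{(1)})/SED^{(1)}$
Bootstrapped data set 2: $t^{(2)} = (\bar{y}_1^{(2)} - \bar{y}_2^{(2)})/SED^{(2)}$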
25 [SAS output: Continuous Variable Tabulations (columns Variable, trt, NumObs, Mean, Standard Deviation) and p-values (columns Variable, Contrast, Raw, Bootstrap)]
Raw: regular t-test P-value
Bootstrap: bootstrap-based two-tailed P-value
26 Issues with permutation and bootstrap sampling
Still need to have sufficiently large samples: the granularity problem (Allison et al, 2006)
Limited number of permutations:
$$\binom{n_1 + n_2}{n_1} = \frac{(n_1 + n_2)!}{n_1!\, n_2!}$$
e.g. if $n_1 = n_2 = 3$, then only 20 permutations are possible; the smallest possible (one-tailed) P-value is 1/20 = 0.05
Less applicability to more complex designs!
Allison DB, Cui XQ, Page GP, and Sabripour M. Microarray data analysis: from disarray to consolidation and consensus. Nature Reviews Genetics 7: 55-65, 2006
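The granularity limit is easy to check numerically. A small sketch (the function name is made up for illustration):

```python
from math import comb


def min_one_tailed_p(n1, n2):
    """Smallest attainable one-tailed permutation P-value for group sizes n1, n2."""
    # Number of distinct ways to choose which n1 units form Group 1.
    return 1 / comb(n1 + n2, n1)


for n1, n2 in [(3, 3), (4, 4), (6, 5)]:
    print(n1, n2, comb(n1 + n2, n1), round(min_one_tailed_p(n1, n2), 4))
```

With three replicates per group no gene can reach a permutation P-value below 0.05, however large its observed difference.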
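27 The multiple testing issue involving m genes
Null true: called not significant = $m_0 - F$; called significant = F; total = $m_0$
Alternative true: called not significant = $m_1 - T$; called significant = T; total = $m_1$
Total: called not significant = m - S; called significant = S; total = m (= G)
The totals $m_0$, $m_1$ and m are constants; S is observed.
F: number of Type I errors
$m_1 - T$: number of Type II errors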
28 A hypothetical situation involving m = 10000 genes (Pawitan et al, 2005; Bioinformatics)
Null true: called not significant $m_0 - F$ = 9025; called significant F = 475; total $m_0$ = 9500
Alternative true: called not significant $m_1 - T$ = 100; called significant T = 400; total $m_1$ = 500
Total: called not significant m - S = 9125; called significant S = 875; total m = 10000
False positive rate (FPR) = $F/m_0$ = 475/9500 = 0.05
Specificity = 1 - FPR = $(m_0 - F)/m_0$ = 9025/9500 = 0.95
Consistent with using α = 0.05
29 A hypothetical situation involving 10000 genes (Pawitan et al, 2005)
Null true: called not significant $m_0 - F$ = 9025; called significant F = 475; total $m_0$ = 9500
Alternative true: called not significant $m_1 - T$ = 100; called significant T = 400; total $m_1$ = 500
Total: called not significant m - S = 9125; called significant S = 875; total m = 10000
False negative rate (FNR) = $(m_1 - T)/m_1$ = 100/500 = 0.20
Sensitivity = 1 - FNR = $T/m_1$ = 400/500 = 0.80
Consistent with Power = 0.80
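The per-test error rates in this hypothetical table follow directly from the four counts. A minimal sketch (the function name and dictionary keys are made up for illustration):

```python
def error_rates(m0, m1, F, T):
    """Per-test error rates.

    m0: number of true nulls, m1: number of truly DE genes,
    F: false positives (among m0), T: true positives (among m1).
    """
    return {
        "FPR": F / m0,
        "specificity": (m0 - F) / m0,
        "FNR": (m1 - T) / m1,
        "sensitivity": T / m1,
    }


rates = error_rates(m0=9500, m1=500, F=475, T=400)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

These rates condition on the true state of each gene, so they say nothing yet about how many of the 875 genes called significant are false leads.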
30 Controlling FWER = Prob(F ≥ 1)
There have been improvements to controlling FWER relative to using Bonferroni (too conservative):
Stepdown procedures (e.g. Holm's, Sidak, Westfall and Young)
Multivariate permutation (next), which provided the early inspiration on multiple testing in microarray studies
31 Multivariate permutation and bootstrapping and controlling FWER
Suited for each other; more powerful than Bonferroni
Reference design example (m genes), arrays 1 to 8:
Treatment A: A1-R, A2-R, A3-R, A4-R
Treatment B: B5-R, B6-R, B7-R, B8-R
Compute t-test P-values for comparing A to B for each gene: $p_1, p_2, \ldots, p_{m-1}, p_m$
32 Multivariate permutation and bootstrapping and controlling FWER (cont'd)
For each permutation b of the Treatment A / Treatment B labels, compute P-values for each of the m genes, $p_1^{*}, p_2^{*}, \ldots, p_m^{*}$, and record their minimum, $\min p^{*(b)}$.
Identify gene j as significantly expressed if
$$\frac{\#\text{ of perm where } \min p^{*(b)} < p_j}{\#\text{ of perm}} < \alpha$$
Also used in Callow et al (2000) Genome Research 10: 2022-2029
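A hedged Python sketch of the multivariate permutation idea. It swaps in the maximum |t| across genes for the minimum P-value used on the slide (a closely related single-step adjustment that avoids needing a t-distribution CDF); all names and data are hypothetical. Whole arrays are shuffled, so the correlation between genes is preserved.

```python
import math
import random
from statistics import mean, variance


def t_stat(y1, y2):
    """Pooled-variance two-sample t statistic."""
    n1, n2 = len(y1), len(y2)
    s2 = ((n1 - 1) * variance(y1) + (n2 - 1) * variance(y2)) / (n1 + n2 - 2)
    return (mean(y1) - mean(y2)) / math.sqrt(s2 * (n1 + n2) / (n1 * n2))


def abs_t_per_gene(arrays1, arrays2):
    """|t| for every gene; each array is a list of m gene values."""
    m = len(arrays1[0])
    return [abs(t_stat([a[g] for a in arrays1], [a[g] for a in arrays2]))
            for g in range(m)]


def max_t_adjusted(arrays1, arrays2, B=1000, seed=1):
    """Single-step FWER-adjusted P-values from the permutation max |t|."""
    rng = random.Random(seed)
    observed = abs_t_per_gene(arrays1, arrays2)
    arrays, n1 = arrays1 + arrays2, len(arrays1)
    exceed = [0] * len(observed)
    for _ in range(B):
        perm = arrays[:]
        rng.shuffle(perm)  # shuffle array labels only, never within an array
        biggest = max(abs_t_per_gene(perm[:n1], perm[n1:]))
        for g, t_g in enumerate(observed):
            if biggest >= t_g:
                exceed[g] += 1
    return [e / B for e in exceed]


# Made-up data: 5 arrays per treatment, 3 genes; only gene 0 differs.
trt_a = [[10.1, 5.0, 3.2], [10.4, 5.3, 3.0], [9.9, 4.8, 3.5],
         [10.2, 5.1, 3.1], [10.0, 5.2, 3.3]]
trt_b = [[8.0, 5.1, 3.4], [8.3, 4.9, 3.1], [7.9, 5.2, 3.2],
         [8.1, 5.0, 3.0], [8.2, 5.3, 3.6]]
adjusted = max_t_adjusted(trt_a, trt_b)
print([round(p, 3) for p in adjusted])
```

Each gene's adjusted P-value is the share of permutations whose largest |t| over all genes reaches that gene's observed |t|.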
33 SAS program
data example;
  input trt $ y1 y2 y3;
  datalines;
T T T T T T T T T T T
;
proc multtest data=example permutation nsample=0000 pvals outsamp=res;
  test mean(y1 y2 y3);
  class Trt;
  contrast 'Trt -' -;
run;
34 First Two Multivariate Permutation Samples
Note: the correlation structure between genes is preserved; only experimental unit labels are shuffled
[SAS listing of the OUTSAMP= data set (columns Obs, _sample_, _class_, _obs_, y1, y2, y3)]
35 [SAS output: Continuous Variable Tabulations (columns Variable, trt, NumObs, Mean, Standard Deviation) for y1, y2 and y3 within each treatment, and p-values (columns Variable, Contrast, Raw, Permutation) for y1, y2 and y3]
Note multivariate P-values < Bonferroni adjusted P-values
36 A hypothetical situation involving 10000 genes (Pawitan et al, 2005)
Null true: called not significant $m_0 - F$ = 9025; called significant F = 475; total $m_0$ = 9500
Alternative true: called not significant $m_1 - T$ = 100; called significant T = 400; total $m_1$ = 500
Total: called not significant m - S = 9125; called significant S = 875; total m = 10000
False discovery rate (FDR) = F/S = 475/875 = 0.543
FDR particularly suffers when $\pi_0 = m_0/m$ is large (close to 1)
$\pi_0$: proportion of all genes that are non-DE
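To see how the FDR depends on the proportion of non-DE genes, the table's logic can be written as a ratio of expected counts. This is a sketch under that simplifying assumption (ratio of expectations, with α and power fixed across genes); the function name is made up.

```python
def expected_fdr(pi0, alpha, power):
    """Approximate FDR as expected false positives over expected total calls."""
    false_pos = pi0 * alpha          # share of all genes that are false positives
    true_pos = (1 - pi0) * power     # share of all genes that are true positives
    return false_pos / (false_pos + true_pos)


for pi0 in (0.90, 0.95, 0.99):
    print(f"pi0 = {pi0:.2f}: FDR = {expected_fdr(pi0, alpha=0.05, power=0.80):.3f}")
```

With α = 0.05 and power 0.80, the value at a non-DE proportion of 0.95 reproduces 475/875, and the FDR climbs steeply as that proportion approaches 1.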
37 [Figure from Pawitan, Y et al, Bioinformatics 2005: FDR (solid curves for $\pi_0$ = 0.9, 0.95 or 0.99), FPR {α} (dashed curves) and sensitivity (dotted curves) as a function of the critical value of the t-statistic]
Half of DE genes had $(\mu_1 - \mu_2)/\sigma$ = ; other half had $(\mu_1 - \mu_2)/\sigma$ = -
38 Using permutation/bootstrapping to estimate FDR
Small example (3000 genes) from Storey and Tibshirani (2003)
Compare two groups: Group 1 (n = 5) vs Group 2 (n = 3)
Suppose we decide to reject H0 for all genes with t > 00; we would then conclude that 146 genes are statistically significant.
Randomly shuffle experimental units for 00 different permutation datasets and simply tabulate the number of times t > 00 for each gene.
Average number of times t > 00 across the 00 permutations is 12.3
Thus a simple estimate of FDR for t > 00 is 12.3/146 × 100% = 8.42%, i.e. if one used t > 00 to conclude statistical significance, 8.42% of genes in the significant list would be estimated to be false positives.
Equivalently, 91.58% of the genes in the list would be estimated to be true positives.
Storey, JD, and R Tibshirani (2003) SAM thresholding and false discovery rates for detecting differential gene expression in DNA microarrays. In Parmigiani et al (eds) The Analysis of Gene Expression Data: Methods and Software. Springer-Verlag, pp 272-290
39 Using permutation/bootstrapping to estimate FDR (cont'd)
Actually, the estimated FDR of 8.42% is biased upwards; recall:
Null true: called significant = F; called not significant = $m_0 - F$; total = $m_0$
Alternative true: called significant = T; called not significant = $m_1 - T$; total = $m_1$
Total: called significant S = 146; called not significant m - S = 2854; total m = 3000
Permutations make all m genes null, but only a portion $\pi_0 = m_0/m$ truly are.
So to improve the FDR estimate, one should multiply 8.42% by $\pi_0$.
Estimate of $\pi_0$ from the example (details on next slide) = 0.89
Therefore, the improved estimate of FDR for t > 00 is 8.42 × 0.89 = 7.49%
40 How to estimate $\pi_0$ using permutation?
Suppose it is safe to say that genes with t < 05 involve all true null hypotheses.
Consider the number of observed t < 05. For example = 668
Consider the average number of permuted t < 05. For example = 750
Therefore: $\hat{\pi}_0$ = 668/750 = 0.89
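The arithmetic of the last three slides can be collected in one short sketch. It reads the example's figures as an average of 12.3 null calls against 146 observed calls, and 668 observed versus 750 permuted small statistics; the first pair is an assumption, since the transcript lost digits there. The function name is made up.

```python
def permutation_fdr(avg_null_calls, observed_calls, pi0=1.0):
    """Permutation-based FDR estimate, optionally scaled by a pi0 estimate."""
    return pi0 * avg_null_calls / observed_calls


# Assumed reading of the example's counts (digits were lost in the transcript).
naive = permutation_fdr(avg_null_calls=12.3, observed_calls=146)
pi0_hat = 668 / 750
improved = permutation_fdr(12.3, 146, pi0=pi0_hat)

print(f"naive FDR    = {100 * naive:.2f}%")
print(f"pi0 estimate = {pi0_hat:.2f}")
print(f"improved FDR = {100 * improved:.2f}%")
```

The improved value is about 7.5%. Any small difference from a hand calculation comes from rounding the non-DE proportion to 0.89 before multiplying.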
41 SAM (Significance Analysis of Microarrays), Storey and Tibshirani (2003)
A popular inferential procedure for differential gene expression in microarrays
A mix of permutation/bootstrap methods with FDR control and shrinkage estimation (later)
Permute or bootstrap on:
$$d_g = \frac{\bar{y}_{1g} - \bar{y}_{2g}}{se(\bar{y}_{1g} - \bar{y}_{2g}) + s_0}; \quad g = 1, 2, \ldots, m$$
$s_0$ = some (50th or 90th) percentile of the se values (or the percentile that minimizes the CV of $d_g$)
Empirical Bayes adjustment provides stability to unusual SED!
42 The SAM procedure
1) Compute
$$d_g = \frac{\bar{y}_{1g} - \bar{y}_{2g}}{se(\bar{y}_{1g} - \bar{y}_{2g}) + s_0}; \quad g = 1, 2, \ldots, m$$
as based on the data and order them from smallest to largest: $d_{(1)} < d_{(2)} < d_{(3)} < \ldots < d_{(m-1)} < d_{(m)}$
2) Take b = 1, 2, ..., B permuted or bootstrap samples, compute $d_g^{b}$, g = 1, 2, ..., m, for each sample and order the statistics from smallest to largest within each sample b: $d_{(1)}^{b} < d_{(2)}^{b} < d_{(3)}^{b} < \ldots < d_{(m-1)}^{b} < d_{(m)}^{b}$
43 3) Compute the average of the B values for each ordered d statistic, $\bar{d}_{(1)}, \bar{d}_{(2)}, \bar{d}_{(3)}, \ldots, \bar{d}_{(m-1)}, \bar{d}_{(m)}$, where, e.g.,
$$\bar{d}_{(1)} = \frac{1}{B}\sum_{b=1}^{B} d_{(1)}^{b}$$
4) Plot $d_{(1)}, d_{(2)}, \ldots, d_{(m)}$ vs $\bar{d}_{(1)}, \bar{d}_{(2)}, \ldots, \bar{d}_{(m)}$ and base the gene list on values that fall outside bands parallel to the 45-degree line
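Steps 1 to 4 can be sketched in Python as below. This is a simplified illustration, not the SAM software: it fixes the fudge factor at the median standard error, uses label permutations, and calls a gene whenever its observed ordered statistic is more than `delta` from its expected value (SAM proper derives cut-points from the first exceedance on each side). All names and the simulated data are assumptions.

```python
import math
import random
from statistics import mean, median, variance


def d_stats(g1, g2, s0=None):
    """d statistic per gene; g1, g2 are lists of arrays (m values each)."""
    m, n1, n2 = len(g1[0]), len(g1), len(g2)
    diffs, ses = [], []
    for g in range(m):
        y1 = [a[g] for a in g1]
        y2 = [a[g] for a in g2]
        s2 = ((n1 - 1) * variance(y1) + (n2 - 1) * variance(y2)) / (n1 + n2 - 2)
        diffs.append(mean(y1) - mean(y2))
        ses.append(math.sqrt(s2 * (n1 + n2) / (n1 * n2)))
    if s0 is None:
        s0 = median(ses)  # assumption: 50th percentile of the standard errors
    return [d / (se + s0) for d, se in zip(diffs, ses)], s0


def sam_sketch(g1, g2, B=200, delta=1.0, seed=1):
    rng = random.Random(seed)
    d_obs, s0 = d_stats(g1, g2)                        # step 1
    m = len(d_obs)
    order = sorted(range(m), key=lambda g: d_obs[g])
    arrays, n1 = g1 + g2, len(g1)
    sums = [0.0] * m
    for _ in range(B):                                 # step 2
        perm = arrays[:]
        rng.shuffle(perm)
        d_perm = d_stats(perm[:n1], perm[n1:], s0)[0]
        for k, d in enumerate(sorted(d_perm)):
            sums[k] += d
    expected = [s / B for s in sums]                   # step 3
    called = [order[k] for k in range(m)               # step 4 (simplified)
              if abs(d_obs[order[k]] - expected[k]) > delta]
    return d_obs, expected, called


# Simulated data: 50 genes, 4 arrays per group, genes 0-4 shifted upward.
sim = random.Random(0)
grp1 = [[sim.gauss(3.0 if g < 5 else 0.0, 1.0) for g in range(50)] for _ in range(4)]
grp2 = [[sim.gauss(0.0, 1.0) for g in range(50)] for _ in range(4)]
d_obs, expected, called = sam_sketch(grp1, grp2)
print(sorted(called))
```

Keeping the fudge factor fixed across permutations means the permuted statistics are on the same scale as the observed ones.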
44 Example Affymetrix dataset on 79 genes from each of two groups (n=4) for 8 slides Distributed with SAM software (downloadable from cademic)
45 [Screenshot of the SAM spreadsheet input: treatment labels in the header row; the first columns are needed for gene labels]
47 [Plot of observed statistics $d_{(g)}$ vs expected statistics $\bar{d}_{(g)}$: significantly upregulated genes (40) and significantly downregulated genes (39) fall outside the dashed bands]
Δ = difference (along the 45-degree line) between the outer two (dashed) lines and the expected (blue) line
Note asymmetric rejection regions
49 Estimating $\pi_0$ and FDR for regular non-permutation (e.g. t-test) procedures: Distribution of P-values
Under the null hypothesis, the distribution of P-values across many independent tests is uniform on the interval [0,1], regardless of the sample size and statistical test used (provided the test is valid).
[Histogram: flat frequency distribution of P-values]
50 Distribution of P-values (cont'd)
If some genes are differentially expressed, then the frequency of low P-values should be greater than that of high P-values.
[Histogram: frequency of P-values, peaked near 0]
51 Distribution of P-values (for 333 genes from a small boutique array at MSU)
[Histogram: frequency of P-values (Pr > |t|) for Effect = treat, with a horizontal line at the expected height of each bar if there were no differentially expressed genes]
Plausible estimate of $\pi_0$? $\pi_0 = m_0/m$
52 How to (roughly) estimate $\pi_0$?
Choose all p-values above a point (λ) where the p-value frequencies start to level off (say λ = 060 based on previous slide)
$$\hat{\pi}_0(\lambda) = \frac{\#\{p_j > \lambda\}}{m(1 - \lambda)}$$
In the example: 70 / (333 (1 - λ)) = 0774
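The estimator above takes only a few lines of Python. In this sketch the function name and the toy P-values are made up, and capping the estimate at 1 is an added safeguard not shown on the slide.

```python
def estimate_pi0(pvalues, lam=0.6):
    """Estimate the proportion of true nulls from the flat part of the P-value histogram."""
    m = len(pvalues)
    above = sum(p > lam for p in pvalues)
    # Under the null, a fraction (1 - lam) of P-values is expected above lam.
    return min(1.0, above / (m * (1 - lam)))


# Toy P-values: 100 evenly spread "null" values plus 100 very small ones.
nulls = [(i + 0.5) / 100 for i in range(100)]
signals = [0.0001 * (i + 1) for i in range(100)]
print(estimate_pi0(nulls))            # all null
print(estimate_pi0(nulls + signals))  # half null
```

The choice of λ trades bias against variance: a larger λ uses fewer P-values but is less contaminated by DE genes.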
53 Estimating FDRs based on P-values
Choose an arbitrary P-value cutoff t (0 < t < 1) for statistical significance.
The expected number of false positives F(t) with P-value < t is determined by: $E(F(t)) = m_0 t$
Hence, the estimated FDR at P-value cutoff t is:
$$\widehat{FDR}(t) = \frac{E(F(t))}{E(S(t))} = \frac{\hat{m}_0 t}{S(t)} = \frac{\hat{\pi}_0 m t}{S(t)}$$
54 Q-value
Defined for each gene
Minimum FDR that can be attained by calling that gene significant (and others that have greater statistical significance)
For gene i: $\hat{q}(p_i) = \min_{t \geq p_i} \widehat{FDR}(t)$
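A Python sketch of the q-value calculation, taking S(t) as the number of P-values at or below t. The function name and the toy P-values are made up. With the non-DE proportion set to 1 the result coincides with Benjamini-Hochberg adjusted P-values.

```python
def q_values(pvalues, pi0=1.0):
    """q-value per gene: the minimum estimated FDR over cutoffs t >= p_i."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, carrying the minimum FDR seen so far.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        fdr_at_p = pi0 * m * pvalues[i] / rank  # pi0 * m * t / S(t) at t = p_i
        running_min = min(running_min, fdr_at_p)
        q[i] = running_min
    return q


raw_p = [0.001, 0.5, 0.9]
print(q_values(raw_p))            # Benjamini-Hochberg style (pi0 = 1)
print(q_values(raw_p, pi0=0.5))   # scaled by an estimated pi0
```

Calling every gene with a q-value below 0.05 significant gives a list with an estimated FDR of at most 5%.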
55 Small example (SAS program)
data example;
  input raw_p;
  datalines;
run;
proc multtest fdr pdata=example;
run;
56 Small example (SAS output)
[SAS output: The Multtest Procedure, p-values (columns Test, Raw, False Discovery Rate)]
The SAS procedure assumes $\pi_0 = 1$ (as from Benjamini and Hochberg, 1995)
Just need to multiply SAS values by the estimated $\pi_0$
57 Plot of q-values vs p-values (for heifer example)
[Plot: q-value (y-axis) vs raw_p (x-axis)]
58 Plot of q-values vs Number of Declared Significant Genes
[Plot: q-value (y-axis) vs S(t) (x-axis)]
59 Classical linear model analysis
So far: comparison of two treatments
Simple design structure -> simple linear model:
$Y_{ij} = \mu + trt_i + e_{ij}$
μ = overall mean
$trt_i$ = effect of treatment i
$e_{ij}$: random experimental (biological) error
Formal linear model analysis is not necessary unless T > 2
60 Common reference design for two treatments balanced for dye assignments
Arrays (Cy3 and Cy5 channels): A1-R, A2-R, A3-R, ..., An-R and B1-R, B2-R, B3-R, ..., Bn-R, with the dye given to the test sample alternating across arrays
e.g. response ($y_{ijkg}$) = log ratio of fluorescence intensity (relative to the common reference sample R)
i: treatment
j: dye assignment to test sample
k: biological rep
g: gene
61 Linear model for common reference with dye balance
$Y_{ijk} = \mu + trt_i + dye_j + e_{ijk}$
μ = overall mean
$trt_i$ = effect of treatment i
$dye_j$ = effect of dye j assigned to treated sample
$e_{ijk}$: random experimental (biological) error for biological rep k assigned to trt i and dye j
Simple linear model analysis (ANOVA)
62 Representative data (refdye):
[Data listing with columns Obs, trt (i), dye (j; Cy3 or Cy5), subj (k), fold, logfluor ($y_{ijk}$)]
n = 6
63 SAS ANOVA code
proc mixed data=refdye;
  class trt dye;
  model logfluor = trt dye;
  lsmeans trt / diff;
run;
64 SAS output
ANOVA table (Type 3 Tests of Fixed Effects; columns Effect, Num DF, Den DF, F Value, Pr > F): rows trt and dye, each with Num DF 1 and Den DF 9
Adjusted trt means (Least Squares Means; columns Effect, trt, Estimate, Standard Error, DF, t Value, Pr > |t|)
Adjusted trt mean difference (Differences of Least Squares Means; columns Effect, trt, _trt, Estimate, Standard Error, DF, t Value, Pr > |t|): Pr > |t| = 0490
Est fold change (relative to reference) = 0487
Est fold change (trt vs trt) = -036
65 Balanced Block Design
Example 1): Comparison of two treatments, each based on n subjects
Arrays pair A1 with B1, A2 with B2, A3 with B3, ..., An with Bn
Total of 2n biological replicates
Probably good to have even n (balanced dye swap)
e.g. response ($y_{ijkg}$) = log fluorescence intensity for treatment i on subject j within array k on gene g
66 Linear (mixed) model for balanced block design
$Y_{ijk} = \mu + trt_i + dye_j + array_k + e_{ijk}$
μ = overall mean
$trt_i$ = effect of treatment i
$dye_j$ = effect of dye j assigned to treated sample
$array_k$ = random effect of array k
$e_{ijk}$: random experimental (biological) error for the biological rep assigned to trt i and dye j within array k
Simple linear mixed model analysis (ANOVA)
Each ijk identifies a unique biological replicate
67 Representative data (balanceblock)
[Data listing with columns Obs, trt, dye, array, fluorescence, logfluor]
68 SAS code
proc mixed data=balanceblock method=type3;
  class array trt dye;
  model logfluor = trt dye;
  random array;
  lsmeans trt / diff;
run;
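69 Representative output
[Type 3 ANOVA table; columns Source, df, SS, MS, Expected Mean Square, Error Term, Error DF, F Value, Pr > F]
trt: Expected Mean Square = Var(Residual) + Q(trt); Error Term = MS(Residual)
dye: Expected Mean Square = Var(Residual) + Q(dye); Error Term = MS(Residual)
array: Expected Mean Square = Var(Residual) + Var(array); Error Term = MS(Residual)
Residual: Expected Mean Square = Var(Residual)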
70 Representative output (cont'd)
[Least Squares Means; columns Effect, trt, Estimate, Standard Error, DF, t Value, Pr > |t|: both trt means have Pr > |t| below the smallest value printed]
[Differences of Least Squares Means; columns Effect, trt, _trt, Estimate, Standard Error, DF, t Value, Pr > |t|]
Est fold change (trt vs trt) = 0389
71 Balanced Block Design (Two Color, blocking on array & subject)
Example 2): Comparison of two treatments/tissues within each of n subjects
Each array carries the two tissues/treatments from one subject: subject 1, subject 2, subject 3, ..., subject n
Total of n mice/arrays
Probably good to have even n
e.g. response ($y_{ijkg}$) = log fluorescence intensity from tissue/treatment i on slide/animal j from array k for gene g
Same linear mixed model as previous!
72 A dairy heifer expt (Two Color, blocking on array & subject)
Two treatments (A & B) randomly assigned to one of two mRNA aliquots taken from the same animal
[Diagram: six arrays, one per heifer (Heifer 1 to Heifer 6), each carrying Trt A and Trt B, with the dye given to Trt A swapped on half of the arrays]
Dye and treatments orthogonal to each other
Heifer and array confounded with each other
73 [Array image] 4 rows & 8 columns = 32 printtips
359 genes
4 spots per gene: therefore need to distinguish experimental from pseudo replication
74 Inference strategies
1) Could average the intensities at the 4 spots for each gene
Would still need to model treatment, dye and array effects
Then same mixed model analysis as the one presented previously!
2) Explicitly model spot variability and treatment*array variability
75 Array data for one gene (heifer):
[Data listing with columns Obs, array, dye, trt, spot, logf: 8 records (4 spots × 2 dyes) for each of six arrays]
76 ANOVA (SAS PROC MIXED)
proc mixed data=heifer method=type3;
  class array dye trt spot;
  model resid = dye trt;
  random array dye*trt*array spot(array);
  lsmeans trt / diff;
run;
77 Some output
[Type 3 ANOVA table; columns Source, DF, Sum of Squares, Mean Square, Expected Mean Square, Error DF, F Value, Pr > F]
Expected Mean Squares:
dye: Var(Residual) + 4 Var(array*dye*trt) + Q(dye)
trt: Var(Residual) + 4 Var(array*dye*trt) + Q(trt)
array: Var(Residual) + Var(spot(array)) + 4 Var(array*dye*trt) + 8 Var(array)
array*dye*trt: Var(Residual) + 4 Var(array*dye*trt)
spot(array): Var(Residual) + Var(spot(array))
Residual: Var(Residual)
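78 ANOVA table with EMS for example with technical replication
Treatment: SS = $SS_t$; MS = $MS_t$; EMS = $\sigma_e^2 + 4\sigma_{a*t}^2 + \gamma_{trt}$
Dye: SS = $SS_d$; MS = $MS_d$; EMS = $\sigma_e^2 + 4\sigma_{a*t}^2 + \gamma_{dye}$
Array: SS = $SS_a$; MS = $MS_a$; EMS = $\sigma_e^2 + \sigma_{s(a)}^2 + 4\sigma_{a*t}^2 + 8\sigma_a^2$
Array*Treat: SS = $SS_{a*t}$; MS = $MS_{a*t}$; EMS = $\sigma_e^2 + 4\sigma_{a*t}^2$
Spot(Array): SS = $SS_{s(a)}$; MS = $MS_{s(a)}$; EMS = $\sigma_e^2 + \sigma_{s(a)}^2$
Residual: SS = $SS_e$; MS = $MS_e$; EMS = $\sigma_e^2$
[The slide also defines a ratio φ of variance components; its definition is not recoverable]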
79 [Least Squares Means; columns Effect, trt, Estimate, Standard Error, DF, t Value, Pr > |t|: both trt means have Pr > |t| below the smallest value printed]
[Differences of Least Squares Means; columns Effect, trt, _trt, Estimate, Standard Error, DF, t Value, Pr > |t|]
Hence estimated trt : trt fold change = 086
80 Another example: Connected Loop Design (n = 4)
[Diagram: four loops (Loop 1 to Loop 4), each connecting one sample from each of treatments A, B and C: A1-B1-C1, A2-B2-C2, A3-B3-C3, A4-B4-C4]
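81 Mixed model approach
Treatment: df = $df_t$; SS = $SS_t$; MS = $MS_t$; EMS = $\sigma_e^2 + 5\sigma_{animal(trt)}^2 + \gamma_{trt}$
Dye: df = $df_d$; SS = $SS_d$; MS = $MS_d$; EMS = $\sigma_e^2 + \gamma_{dye}$
Array: df = $df_b$; SS = $SS_b$; MS = $MS_b$; EMS = $\sigma_e^2 + 5\sigma_{array}^2$
Animal(Trt): df = $df_{a(t)}$; SS = $SS_{a(t)}$; MS = $MS_{a(t)}$; EMS = $\sigma_e^2 + 5\sigma_{animal(trt)}^2$
Residual: df = $df_e$; SS = $SS_e$; MS = $MS_e$; EMS = $\sigma_e^2$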
82 What next?
FDR adjustment on P-values to provide q-values
Same procedure as described previously
Use an FDR control criterion to come up with a gene list
Cross model validation and multiple testing in latent variable models Frank Westad GE Healthcare Oslo, Norway 2nd European User Meeting on Multivariate Analysis Como, June 22, 2006 Outline Introduction
More informationLesson 11. Functional Genomics I: Microarray Analysis
Lesson 11 Functional Genomics I: Microarray Analysis Transcription of DNA and translation of RNA vary with biological conditions 3 kinds of microarray platforms Spotted Array - 2 color - Pat Brown (Stanford)
More informationStep-down FDR Procedures for Large Numbers of Hypotheses
Step-down FDR Procedures for Large Numbers of Hypotheses Paul N. Somerville University of Central Florida Abstract. Somerville (2004b) developed FDR step-down procedures which were particularly appropriate
More informationExpression arrays, normalization, and error models
1 Epression arrays, normalization, and error models There are a number of different array technologies available for measuring mrna transcript levels in cell populations, from spotted cdna arrays to in
More informationQuick Calculation for Sample Size while Controlling False Discovery Rate with Application to Microarray Analysis
Statistics Preprints Statistics 11-2006 Quick Calculation for Sample Size while Controlling False Discovery Rate with Application to Microarray Analysis Peng Liu Iowa State University, pliu@iastate.edu
More informationResampling and the Bootstrap
Resampling and the Bootstrap Axel Benner Biostatistics, German Cancer Research Center INF 280, D-69120 Heidelberg benner@dkfz.de Resampling and the Bootstrap 2 Topics Estimation and Statistical Testing
More informationTopic 20: Single Factor Analysis of Variance
Topic 20: Single Factor Analysis of Variance Outline Single factor Analysis of Variance One set of treatments Cell means model Factor effects model Link to linear regression using indicator explanatory
More informationVisual interpretation with normal approximation
Visual interpretation with normal approximation H 0 is true: H 1 is true: p =0.06 25 33 Reject H 0 α =0.05 (Type I error rate) Fail to reject H 0 β =0.6468 (Type II error rate) 30 Accept H 1 Visual interpretation
More informationLecture 10: Experiments with Random Effects
Lecture 10: Experiments with Random Effects Montgomery, Chapter 13 1 Lecture 10 Page 1 Example 1 A textile company weaves a fabric on a large number of looms. It would like the looms to be homogeneous
More informationThe One-Way Repeated-Measures ANOVA. (For Within-Subjects Designs)
The One-Way Repeated-Measures ANOVA (For Within-Subjects Designs) Logic of the Repeated-Measures ANOVA The repeated-measures ANOVA extends the analysis of variance to research situations using repeated-measures
More informationChapter Seven: Multi-Sample Methods 1/52
Chapter Seven: Multi-Sample Methods 1/52 7.1 Introduction 2/52 Introduction The independent samples t test and the independent samples Z test for a difference between proportions are designed to analyze
More informationAndrogen-independent prostate cancer
The following tutorial walks through the identification of biological themes in a microarray dataset examining androgen-independent. Visit the GeneSifter Data Center (www.genesifter.net/web/datacenter.html)
More informationEXST Regression Techniques Page 1. We can also test the hypothesis H :" œ 0 versus H :"
EXST704 - Regression Techniques Page 1 Using F tests instead of t-tests We can also test the hypothesis H :" œ 0 versus H :" Á 0 with an F test.! " " " F œ MSRegression MSError This test is mathematically
More informationy ˆ i = ˆ " T u i ( i th fitted value or i th fit)
1 2 INFERENCE FOR MULTIPLE LINEAR REGRESSION Recall Terminology: p predictors x 1, x 2,, x p Some might be indicator variables for categorical variables) k-1 non-constant terms u 1, u 2,, u k-1 Each u
More informationImproved Statistical Tests for Differential Gene Expression by Shrinking Variance Components Estimates
Improved Statistical Tests for Differential Gene Expression by Shrinking Variance Components Estimates September 4, 2003 Xiangqin Cui, J. T. Gene Hwang, Jing Qiu, Natalie J. Blades, and Gary A. Churchill
More informationFalse discovery rate procedures for high-dimensional data Kim, K.I.
False discovery rate procedures for high-dimensional data Kim, K.I. DOI: 10.6100/IR637929 Published: 01/01/2008 Document Version Publisher s PDF, also known as Version of Record (includes final page, issue
More informationControlling the False Discovery Rate: Understanding and Extending the Benjamini-Hochberg Method
Controlling the False Discovery Rate: Understanding and Extending the Benjamini-Hochberg Method Christopher R. Genovese Department of Statistics Carnegie Mellon University joint work with Larry Wasserman
More informationPerformance Evaluation and Comparison
Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Cross Validation and Resampling 3 Interval Estimation
More informationEffects of dependence in high-dimensional multiple testing problems. Kyung In Kim and Mark van de Wiel
Effects of dependence in high-dimensional multiple testing problems Kyung In Kim and Mark van de Wiel Department of Mathematics, Vrije Universiteit Amsterdam. Contents 1. High-dimensional multiple testing
More informationTable 1: Fish Biomass data set on 26 streams
Math 221: Multiple Regression S. K. Hyde Chapter 27 (Moore, 5th Ed.) The following data set contains observations on the fish biomass of 26 streams. The potential regressors from which we wish to explain
More informationTwo-Color Microarray Experimental Design Notation. Simple Examples of Analysis for a Single Gene. Microarray Experimental Design Notation
Simple Examples of Analysis for a Single Gene wo-olor Microarray Experimental Design Notation /3/0 opyright 0 Dan Nettleton Microarray Experimental Design Notation Microarray Experimental Design Notation
More informationFDR-CONTROLLING STEPWISE PROCEDURES AND THEIR FALSE NEGATIVES RATES
FDR-CONTROLLING STEPWISE PROCEDURES AND THEIR FALSE NEGATIVES RATES Sanat K. Sarkar a a Department of Statistics, Temple University, Speakman Hall (006-00), Philadelphia, PA 19122, USA Abstract The concept
More informationAnalysis of variance, multivariate (MANOVA)
Analysis of variance, multivariate (MANOVA) Abstract: A designed experiment is set up in which the system studied is under the control of an investigator. The individuals, the treatments, the variables
More informationTopics on statistical design and analysis. of cdna microarray experiment
Topics on statistical design and analysis of cdna microarray experiment Ximin Zhu A Dissertation Submitted to the University of Glasgow for the degree of Doctor of Philosophy Department of Statistics May
More informationChap The McGraw-Hill Companies, Inc. All rights reserved.
11 pter11 Chap Analysis of Variance Overview of ANOVA Multiple Comparisons Tests for Homogeneity of Variances Two-Factor ANOVA Without Replication General Linear Model Experimental Design: An Overview
More informationarxiv: v1 [math.st] 31 Mar 2009
The Annals of Statistics 2009, Vol. 37, No. 2, 619 629 DOI: 10.1214/07-AOS586 c Institute of Mathematical Statistics, 2009 arxiv:0903.5373v1 [math.st] 31 Mar 2009 AN ADAPTIVE STEP-DOWN PROCEDURE WITH PROVEN
More informationOn Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses
On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses Gavin Lynch Catchpoint Systems, Inc., 228 Park Ave S 28080 New York, NY 10003, U.S.A. Wenge Guo Department of Mathematical
More informationSpecific Differences. Lukas Meier, Seminar für Statistik
Specific Differences Lukas Meier, Seminar für Statistik Problem with Global F-test Problem: Global F-test (aka omnibus F-test) is very unspecific. Typically: Want a more precise answer (or have a more
More informationDEGseq: an R package for identifying differentially expressed genes from RNA-seq data
DEGseq: an R package for identifying differentially expressed genes from RNA-seq data Likun Wang Zhixing Feng i Wang iaowo Wang * and uegong Zhang * MOE Key Laboratory of Bioinformatics and Bioinformatics
More informationLecture 3: Inference in SLR
Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals
More informationAnalysis of Variance
Analysis of Variance Blood coagulation time T avg A 62 60 63 59 61 B 63 67 71 64 65 66 66 C 68 66 71 67 68 68 68 D 56 62 60 61 63 64 63 59 61 64 Blood coagulation time A B C D Combined 56 57 58 59 60 61
More informationBIOINFORMATICS ORIGINAL PAPER
BIOINFORMATICS ORIGINAL PAPER Vol 21 no 11 2005, pages 2684 2690 doi:101093/bioinformatics/bti407 Gene expression A practical false discovery rate approach to identifying patterns of differential expression
More informationTopic 28: Unequal Replication in Two-Way ANOVA
Topic 28: Unequal Replication in Two-Way ANOVA Outline Two-way ANOVA with unequal numbers of observations in the cells Data and model Regression approach Parameter estimates Previous analyses with constant
More informationIntroduction to Crossover Trials
Introduction to Crossover Trials Stat 6500 Tutorial Project Isaac Blackhurst A crossover trial is a type of randomized control trial. It has advantages over other designed experiments because, under certain
More informationLec 1: An Introduction to ANOVA
Ying Li Stockholm University October 31, 2011 Three end-aisle displays Which is the best? Design of the Experiment Identify the stores of the similar size and type. The displays are randomly assigned to
More informationStatistical Modeling and Analysis of Scientific Inquiry: The Basics of Hypothesis Testing
Statistical Modeling and Analysis of Scientific Inquiry: The Basics of Hypothesis Testing So, What is Statistics? Theory and techniques for learning from data How to collect How to analyze How to interpret
More informationAliaksandr Hubin University of Oslo Aliaksandr Hubin (UIO) Bayesian FDR / 25
Presentation of The Paper: The Positive False Discovery Rate: A Bayesian Interpretation and the q-value, J.D. Storey, The Annals of Statistics, Vol. 31 No.6 (Dec. 2003), pp 2013-2035 Aliaksandr Hubin University
More informationReference: Chapter 13 of Montgomery (8e)
Reference: Chapter 1 of Montgomery (8e) Maghsoodloo 89 Factorial Experiments with Random Factors So far emphasis has been placed on factorial experiments where all factors are at a, b, c,... fixed levels
More informationStatistical Applications in Genetics and Molecular Biology
Statistical Applications in Genetics and Molecular Biology Volume 3, Issue 1 2004 Article 13 Multiple Testing. Part I. Single-Step Procedures for Control of General Type I Error Rates Sandrine Dudoit Mark
More informationSPOTTED cdna MICROARRAYS
SPOTTED cdna MICROARRAYS Spot size: 50um - 150um SPOTTED cdna MICROARRAYS Compare the genetic expression in two samples of cells PRINT cdna from one gene on each spot SAMPLES cdna labelled red/green e.g.
More informationFDR and ROC: Similarities, Assumptions, and Decisions
EDITORIALS 8 FDR and ROC: Similarities, Assumptions, and Decisions. Why FDR and ROC? It is a privilege to have been asked to introduce this collection of papers appearing in Statistica Sinica. The papers
More informationSTAT 461/561- Assignments, Year 2015
STAT 461/561- Assignments, Year 2015 This is the second set of assignment problems. When you hand in any problem, include the problem itself and its number. pdf are welcome. If so, use large fonts and
More informationResampling and the Bootstrap
Resampling and the Bootstrap Axel Benner Biostatistics, German Cancer Research Center INF 280, D-69120 Heidelberg benner@dkfz.de Resampling and the Bootstrap 2 Topics Estimation and Statistical Testing
More informationPermutation Tests. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods
Permutation Tests Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods The Two-Sample Problem We observe two independent random samples: F z = z 1, z 2,, z n independently of
More informationIEOR165 Discussion Week 12
IEOR165 Discussion Week 12 Sheng Liu University of California, Berkeley Apr 15, 2016 Outline 1 Type I errors & Type II errors 2 Multiple Testing 3 ANOVA IEOR165 Discussion Sheng Liu 2 Type I errors & Type
More informationANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS
ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS Ravinder Malhotra and Vipul Sharma National Dairy Research Institute, Karnal-132001 The most common use of statistics in dairy science is testing
More informationUniversity of California, Berkeley
University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2004 Paper 147 Multiple Testing Methods For ChIP-Chip High Density Oligonucleotide Array Data Sunduz
More informationLecture 21: October 19
36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 21: October 19 21.1 Likelihood Ratio Test (LRT) To test composite versus composite hypotheses the general method is to use
More informationExample 1: Two-Treatment CRD
Introduction to Mixed Linear Models in Microarray Experiments //0 Copyright 0 Dan Nettleton Statistical Models A statistical model describes a formal mathematical data generation mechanism from which an
More informationHunting for significance with multiple testing
Hunting for significance with multiple testing Etienne Roquain 1 1 Laboratory LPMA, Université Pierre et Marie Curie (Paris 6), France Séminaire MODAL X, 19 mai 216 Etienne Roquain Hunting for significance
More informationAnalysis of Variance
Statistical Techniques II EXST7015 Analysis of Variance 15a_ANOVA_Introduction 1 Design The simplest model for Analysis of Variance (ANOVA) is the CRD, the Completely Randomized Design This model is also
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationGene Expression an Overview of Problems & Solutions: 3&4. Utah State University Bioinformatics: Problems and Solutions Summer 2006
Gene Expression an Overview of Problems & Solutions: 3&4 Utah State University Bioinformatics: Problems and Solutions Summer 006 Review Considering several problems & solutions with gene expression data
More informationIntroduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs
Introduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs The Analysis of Variance (ANOVA) The analysis of variance (ANOVA) is a statistical technique
More informationPeak Detection for Images
Peak Detection for Images Armin Schwartzman Division of Biostatistics, UC San Diego June 016 Overview How can we improve detection power? Use a less conservative error criterion Take advantage of prior
More informationGS Analysis of Microarray Data
GS01 0163 Analysis of Microarray Data Keith Baggerly and Kevin Coombes Section of Bioinformatics Department of Biostatistics and Applied Mathematics UT M. D. Anderson Cancer Center kabagg@mdanderson.org
More informationTopic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model
Topic 17 - Single Factor Analysis of Variance - Fall 2013 One way ANOVA Cell means model Factor effects model Outline Topic 17 2 One-way ANOVA Response variable Y is continuous Explanatory variable is
More information