Chapter 7: Hypothesis testing


1 Chapter 7: Hypothesis testing

Hypothesis testing is typically based on the cumulative hazard function. Here we'll use the Nelson-Aalen estimate of the cumulative hazard. The survival function is used to weight differences between the observed and expected cumulative hazard. Recall that the Nelson-Aalen estimate of the cumulative hazard is

$$\hat{H}(t) = \sum_{t_i \le t} \frac{d_i}{Y_i}$$

In a one-sample problem, you test whether the hazard rate $h(t)$ is equal to some reference hazard, $h_0(t)$. The null hypothesis is $H_0: h(t) = h_0(t)$. Under the null hypothesis, the expected hazard rate at time $t_i$ is $h_0(t_i)$.

SAS Programming March 11, / 43
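As a minimal sketch of the Nelson-Aalen estimate recalled above, the following base R function tabulates $d_i$, $Y_i$ and the running sum of $d_i/Y_i$ at each distinct death time. The function name `nelson_aalen` and the three-observation data set are hypothetical and chosen only for illustration; `status` is assumed to be 1 for a death and 0 for a censored observation.

```r
# Nelson-Aalen estimate of the cumulative hazard (illustrative sketch).
nelson_aalen <- function(time, status) {
  death_times <- sort(unique(time[status == 1]))                      # distinct death times t_i
  d <- sapply(death_times, function(s) sum(time == s & status == 1))  # deaths d_i at t_i
  Y <- sapply(death_times, function(s) sum(time >= s))                # number at risk Y_i at t_i
  data.frame(time = death_times, d = d, Y = Y, H = cumsum(d / Y))
}

# Hypothetical data: deaths at times 1 and 3, one censored observation at time 2
na_est <- nelson_aalen(time = c(1, 2, 3), status = c(1, 0, 1))
print(na_est)
```

Here three subjects are at risk at time 1 and one at time 3, so the estimate steps to 1/3 and then to 1/3 + 1.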

2 Hypothesis testing: one sample

The idea is then to compare observed minus expected cumulative hazard rates at time $\tau$, the largest time in the study ($\tau = t_D$ if the largest time is a death time). The test statistic is then

$$Z(\tau) = O(\tau) - E(\tau) = \sum_{i=1}^{D} W(t_i)\,\frac{d_i}{Y_i} - \int_0^{\tau} W(s)\,h_0(s)\,ds$$

where $W(\cdot)$ is a weight function. The variance is

$$V[Z(\tau)] = \int_0^{\tau} W^2(s)\,\frac{h_0(s)}{Y(s)}\,ds$$

3 Hypothesis testing

Under the null hypothesis the expected value of $Z(\tau)$ is 0, so if we take a z-score of $Z(\tau)$ (subtracting the mean and dividing by the standard deviation), we get

$$\frac{Z(\tau)}{\sqrt{V[Z(\tau)]}}$$

which has an approximate standard normal distribution. This can be used for either a two-sided or a one-sided test. For example, a one-sided test would have the alternative $H_1: h(t) > h_0(t)$, and you would reject only for large values of $Z(\tau)/\sqrt{V[Z(\tau)]}$.

4 Hypothesis testing

The most popular choice for a weighting function is $W(t) = Y(t)$, which leads to

$$O(\tau) = \sum_{i=1}^{D} Y(t_i)\,\frac{d_i}{Y_i} = \sum_{i=1}^{D} d_i$$

This is also called the log-rank test (not sure why). Other weight functions are possible. For example,

$$W(t) = Y(t)\,S_0(t)^p\,[1 - S_0(t)]^q$$

with $0 \le p, q \le 1$ (you don't necessarily need $q = 1 - p$ here). The choice of $p$ and $q$ affects whether you care more about the hazard not matching the hypothesized hazard for small $t$ or for large $t$. For example, if $p$ is large, then more emphasis is placed on the estimated hazard matching the null hazard for small values of $t$. $S_0(t)$ can be obtained from $S_0(t) = \exp[-H_0(t)]$.
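With $W(t) = Y(t)$ the one-sample statistic simplifies considerably. A minimal sketch follows, assuming every subject enters the study at time 0 (no left truncation), so that $E(\tau) = \int_0^\tau Y(s) h_0(s)\,ds = \sum_j H_0(T_j)$ and $V[Z(\tau)] = E(\tau)$. The function name `one_sample_logrank`, the data, and the constant reference hazard of 0.1 are all hypothetical.

```r
# One-sample log-rank test with W(t) = Y(t) (illustrative sketch).
# Assumes all subjects are followed from time 0; H0 is the reference
# cumulative hazard function H_0(t).
one_sample_logrank <- function(time, status, H0) {
  O <- sum(status)            # observed deaths, sum of d_i
  E <- sum(H0(time))          # expected deaths, sum of H_0(T_j) over subjects
  z <- (O - E) / sqrt(E)      # for this weight the variance equals E
  list(observed = O, expected = E, z = z,
       p.one.sided = pnorm(z, lower.tail = FALSE),          # H_1: h(t) > h_0(t)
       p.two.sided = 2 * pnorm(abs(z), lower.tail = FALSE))
}

# Hypothetical data; reference hazard h_0(t) = 0.1, so H_0(t) = 0.1 * t
time   <- c(2, 3, 5, 7, 8, 11, 12, 15)
status <- c(1, 1, 0, 1, 1, 0, 1, 1)
res <- one_sample_logrank(time, status, H0 = function(t) 0.1 * t)
print(res)
```

For a one-sided test of excess mortality you would look at `p.one.sided`, rejecting only for large positive `z`.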

5 Hypothesis testing

An example where you would use the one-sided hypothesis test is in testing whether some population has a higher hazard than a reference population, such as the psychiatric patients from Iowa. Recall that for this example, we looked at excess mortality previously.

6 Hypothesis testing: two or more samples

If you have two or more samples (e.g., mortality for three different treatments or three different risk groups), then the null and alternative hypotheses are similar to those for ANOVA:

$$H_0: h_1(t) = h_2(t) = \cdots = h_K(t) \quad \text{for all } t \le \tau$$

$$H_A: h_i(t) \ne h_j(t) \quad \text{for some } i \ne j \text{ and some } t \le \tau$$

where $\tau$ is the largest time at which all of the groups have at least one subject at risk.

7 Hypothesis testing: two or more samples

We now define $t_i$ as the unique death times for the pooled data (i.e., ignoring the group that each observation comes from), and again $t_D$ is the largest death time. We observe $d_{ij}$ deaths at time $t_i$ in sample $j$, and there are $Y_{ij}$ individuals at risk at time $t_i$ in sample $j$. We let $d_i = \sum_{j=1}^{K} d_{ij}$ be the total number of deaths at time $t_i$ and $Y_i = \sum_{j=1}^{K} Y_{ij}$ be the total number of individuals at risk (available for death?) at time $t_i$.

8 Hypothesis testing: two or more samples

The idea for testing the hypothesis is that under the null hypothesis, the estimate of the hazard (and cumulative hazard) should be the same (in expectation) using the pooled data (ignoring the group the samples are from) and using the individual samples. We can think of the pooled data as providing a more precise estimate of the hazard for the $j$th sample than the $j$th sample itself, so using the idea of observed minus expected, we can write

$$Z_j(\tau) = \sum_{i=1}^{D} W_j(t_i)\left(\frac{d_{ij}}{Y_{ij}} - \frac{d_i}{Y_i}\right), \quad j = 1, \dots, K$$

If all of the $Z_j(\tau)$ terms are close to 0, then all of the sample estimated cumulative hazards are close to the pooled cumulative hazard, so they all must be close to each other, and this supports the null hypothesis.

9 Hypothesis testing: two or more samples

The typical weight function used is $W_j(t_i) = Y_{ij}\,W(t_i)$, where $W(t_i)$ is a common weight shared by each group. For this weighting scheme,

$$Z_j(\tau) = \sum_{i=1}^{D} W(t_i)\left[d_{ij} - Y_{ij}\left(\frac{d_i}{Y_i}\right)\right], \quad j = 1, \dots, K$$

$$V[Z_j(\tau)] = \sigma_{jj} = \sum_{i=1}^{D} W(t_i)^2\,\frac{Y_{ij}}{Y_i}\left(1 - \frac{Y_{ij}}{Y_i}\right)\left(\frac{Y_i - d_i}{Y_i - 1}\right) d_i, \quad j = 1, \dots, K$$

$$\mathrm{cov}(Z_j(\tau), Z_k(\tau)) = \sigma_{jk} = -\sum_{i=1}^{D} W(t_i)^2\,\frac{Y_{ij}}{Y_i}\,\frac{Y_{ik}}{Y_i}\left(\frac{Y_i - d_i}{Y_i - 1}\right) d_i, \quad j \ne k$$

10 Hypothesis testing: two or more samples

Based on the second formula for $Z_j(\tau)$, the sum $\sum_{j=1}^{K} Z_j(\tau)$ is equal to 0, meaning that the $Z_j(\tau)$ are not independent of one another. In particular, $Z_K(\tau)$ is a linear combination of $Z_1(\tau), \dots, Z_{K-1}(\tau)$. Consequently, we construct a test statistic based on just the first $K - 1$ of the $Z_j(\tau)$ terms:

$$\chi^2 = (Z_1(\tau), \dots, Z_{K-1}(\tau))\,\Sigma^{-1}\,(Z_1(\tau), \dots, Z_{K-1}(\tau))'$$

where $(Z_1(\tau), \dots, Z_{K-1}(\tau))$ is interpreted as a row vector of length $K - 1$ and $\Sigma$ is a $(K-1) \times (K-1)$ covariance matrix (if you had made a $K \times K$ matrix using all the variables, it wouldn't be full rank, and therefore not invertible). Under the null hypothesis the $\chi^2$ statistic has approximately a chi-square distribution with $K - 1$ degrees of freedom, and you can base the test on this distribution.
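The construction can be sketched in base R. This is only an illustration under the assumption of the log-rank weight $W(t_i) = 1$; the function name `k_sample_logrank` and the data are hypothetical. The full $K \times K$ matrix is built so that its rank deficiency is visible (rows sum to 0), and only the first $K - 1$ rows and columns are inverted.

```r
# K-sample log-rank statistic with W(t_i) = 1 (illustrative sketch).
k_sample_logrank <- function(time, status, group) {
  groups <- sort(unique(group))
  K <- length(groups)
  death_times <- sort(unique(time[status == 1]))
  Z <- numeric(K)
  Sigma <- matrix(0, K, K)
  for (ti in death_times) {
    Yj <- sapply(groups, function(g) sum(time >= ti & group == g))               # Y_ij
    dj <- sapply(groups, function(g) sum(time == ti & status == 1 & group == g)) # d_ij
    Y <- sum(Yj)
    d <- sum(dj)
    Z <- Z + (dj - Yj * d / Y)                 # observed minus expected deaths
    if (Y > 1) {                               # the variance term is 0 when Y_i = 1
      p <- Yj / Y
      Sigma <- Sigma + (diag(p) - outer(p, p)) * d * (Y - d) / (Y - 1)
    }
  }
  idx <- seq_len(K - 1)                        # drop the last, redundant component
  chisq <- sum(Z[idx] * solve(Sigma[idx, idx, drop = FALSE], Z[idx]))
  list(Z = Z, Sigma = Sigma, chisq = chisq, df = K - 1,
       p.value = pchisq(chisq, df = K - 1, lower.tail = FALSE))
}

# Hypothetical data: three groups of four subjects each
time   <- c(3, 5, 8, 12,  4, 9, 10, 15,  6, 11, 14, 20)
status <- c(1, 1, 0, 1,   1, 1, 1, 0,    1, 0, 1, 1)
group  <- rep(1:3, each = 4)
res <- k_sample_logrank(time, status, group)
print(res$Z)        # components sum to 0
print(res$chisq)
```

Which component is dropped does not matter for the value of the statistic; dropping the last one is just a convention here.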

11 Hypothesis testing: two samples

Several weight functions are possible. $W(t) = 1$ for all $t$ leads to the two-sample log-rank test. $W(t_i) = Y_i$ and $W(t_i) = \sqrt{Y_i}$ have also been used. In the case of $K = 2$ samples, the test statistic can be written as

$$Z = \frac{\sum_{i=1}^{D} W(t_i)\left[d_{i1} - Y_{i1}\left(\frac{d_i}{Y_i}\right)\right]}{\sqrt{\sum_{i=1}^{D} W(t_i)^2\,\frac{Y_{i1}}{Y_i}\left(1 - \frac{Y_{i1}}{Y_i}\right)\left(\frac{Y_i - d_i}{Y_i - 1}\right) d_i}}$$

Since we don't have to square in this case, we can do one-sided as well as two-sided hypothesis tests based on a standard normal distribution instead of a $\chi^2$, or you can square the statistic and use a $\chi^2_1$ distribution.
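To see how the choice of $W(t_i)$ enters the two-sample statistic, here is a minimal base R sketch. The function name `two_sample_z`, its `weight` argument (a function of the pooled number at risk $Y_i$), and the data are hypothetical; the first group in sorted order is treated as sample 1.

```r
# Two-sample weighted log-rank Z statistic (illustrative sketch).
two_sample_z <- function(time, status, group,
                         weight = function(Y) rep(1, length(Y))) {
  g1 <- group == sort(unique(group))[1]                              # indicator of sample 1
  death_times <- sort(unique(time[status == 1]))
  d  <- sapply(death_times, function(s) sum(time == s & status == 1))
  d1 <- sapply(death_times, function(s) sum(time == s & status == 1 & g1))
  Y  <- sapply(death_times, function(s) sum(time >= s))
  Y1 <- sapply(death_times, function(s) sum(time >= s & g1))
  W  <- weight(Y)
  num <- sum(W * (d1 - Y1 * d / Y))
  # the variance term is 0 when only one subject is left at risk
  v <- ifelse(Y > 1, W^2 * (Y1 / Y) * (1 - Y1 / Y) * ((Y - d) / (Y - 1)) * d, 0)
  num / sqrt(sum(v))
}

# Hypothetical data
time   <- c(3, 5, 8, 12, 17,  4, 9, 10, 15, 21)
status <- c(1, 1, 0, 1, 1,    1, 1, 1, 0, 1)
group  <- rep(c(1, 2), each = 5)

z_logrank <- two_sample_z(time, status, group)                  # W(t_i) = 1
z_y       <- two_sample_z(time, status, group, weight = identity)  # W(t_i) = Y_i
z_sqrt_y  <- two_sample_z(time, status, group, weight = sqrt)      # W(t_i) = sqrt(Y_i)
2 * pnorm(abs(z_logrank), lower.tail = FALSE)                   # two-sided p-value
```

A positive value means sample 1 had more deaths than expected under the null hypothesis, which is what a one-sided test would look at.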

12 Hypothesis testing: two samples

13 Hypothesis testing: two samples

This example compared kidney dialysis patients with surgically implanted catheters versus percutaneous (needle-puncture) placement of the catheter. Even though the survival curves look fairly different after 1 year or so, the differences are not statistically significant. Note that there are also very few observations for the percutaneous sample. Actually the number of observations is fairly small for both samples, so the confidence intervals would be fairly wide.

14 Hypothesis testing: two samples

15 Hypothesis testing: two samples

16 Hypothesis testing: two samples

Different choices for the weight function affect the p-value. It is reassuring if many weighting schemes give the same conclusion. The cases where the p-value was low were those where the weighting scheme gave a lot of weight to differences in the hazard at large values of $t_i$, which of course is where the curves appear different. Such weights can also be sensitive to differences in censoring patterns between the two samples, so they should be used cautiously. A problem with trying lots of weighting schemes arises when the different weights conflict and you only report the schemes that give the results you want. This would be dishonest, so you should either pick a weighting scheme and stick to it, or report the results of all the weighting schemes that you used.

17 Hypothesis testing: weight functions

18 Hypothesis testing: weight functions

The most common weight functions are either flat, $W(t_i) = 1$, or decreasing, with $W(t_i) = Y_i$. A weight function that is increasing might be used to compare longer-term survival when differences in early survival might be due to complications rather than the long-term effectiveness of a treatment. An example is in comparing autologous transplants versus allogenic transplants of bone marrow for leukemia. Allogenic transplant patients (receiving bone marrow from a sibling) tend to have more complications early on, reducing early survival rates (and increasing early hazard rates), but if interest is in long-term survival, then a weight function could be used that emphasizes later times.

19 Hypothesis testing in R

To test the difference in survival curves in R, you can use survdiff() from the survival library. An example is with the allo- versus auto-transplant patients in the leukemia data.

> library(survival)
> x <- read.table("leukemia2.txt")
> a <- survdiff(Surv(x$V1, x$V2) ~ factor(x$V3))
> a
Call:
survdiff(formula = Surv(x$V1, x$V2) ~ factor(x$V3))

               N Observed Expected (O-E)^2/E (O-E)^2/V
factor(x$V3)=
factor(x$V3)=

 Chisq= 0.4  on 1 degrees of freedom, p=

The results suggest that the survival experiences of the two groups were not statistically significantly different from each other.

20 Hypothesis testing in R

To plot the two survival curves together you can use

> library(survival)
> x <- read.table("leukemia2.txt")
> a <- survfit(Surv(x$V1[x$V3==1], x$V2[x$V3==1]) ~ 1)
> b <- survfit(Surv(x$V1[x$V3==2], x$V2[x$V3==2]) ~ 1)
> plot(a, conf.int=FALSE)
> points(b$time, b$surv, type="s", col="red", lwd=3)
> legend(20, 1, legend=c("auto","allo"), col=c("black","red"), lty=c(1,1), lwd=c(1,3), cex=1.3)

21 Hypothesis testing in R

22 Hypothesis testing in R

The survdiff() function in R has an optional parameter rho whose default is 0, which results in the log-rank test. The weights used are $\hat{S}(t)^{\rho}$, so larger values of rho put more weight on earlier times (rho = 1 gives the Peto and Peto modification of the Gehan-Wilcoxon test), and the choice can have a big impact on the p-value.
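A short sketch of how rho changes the test follows. Because the leukemia2.txt file is not reproduced in these notes, it uses the `aml` data set that ships with the survival package instead (an assumption made only so the example runs as written); the helper `p_value` is hypothetical.

```r
library(survival)

# Same comparison under two weightings: rho = 0 (log-rank) and
# rho = 1 (Peto and Peto modification of the Gehan-Wilcoxon test).
fit_logrank <- survdiff(Surv(time, status) ~ x, data = aml, rho = 0)
fit_peto    <- survdiff(Surv(time, status) ~ x, data = aml, rho = 1)

# p-value from the chi-square statistic with (number of groups - 1) df
p_value <- function(fit) {
  pchisq(fit$chisq, df = length(fit$n) - 1, lower.tail = FALSE)
}

p_both <- c(logrank = p_value(fit_logrank), peto = p_value(fit_peto))
print(p_both)
```

As discussed for the weight functions, it is better to pick rho before looking at the results, or to report every value you tried.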

23 Hypothesis testing in SAS

You can use PROC LIFETEST in SAS to do hypothesis testing. We'll take a look at examples after the break.

24 Tests of trend

For multiple samples ($K > 2$), a different alternative hypothesis is the following:

$$H_A: h_1(t) \le h_2(t) \le \cdots \le h_K(t) \quad \text{for } t \le \tau,$$

where at least one inequality is strict. This is equivalent to

$$H_A: S_1(t) \ge S_2(t) \ge \cdots \ge S_K(t)$$

25 Tests of trend

We construct the $Z_j(\tau)$ as before and use any of the weight functions $W_j(t_i)$. We also pick a new set of weights $a_j$, $j = 1, \dots, K$, where $a_j = j$ is often used. The test statistic is now

$$Z = \frac{\sum_{j=1}^{K} a_j Z_j(\tau)}{\sqrt{\sum_{j=1}^{K}\sum_{k=1}^{K} a_j a_k \sigma_{jk}}}$$

where $\Sigma = (\sigma_{jk})$ is the $K \times K$ covariance matrix. (It isn't full rank, but we don't need the inverse.) The test statistic can be compared to a standard normal distribution.
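Given the vector of $Z_j(\tau)$ and the full $K \times K$ matrix $\Sigma$, the trend statistic is a one-liner. The sketch below is only illustrative: the function name `trend_z` and the numerical inputs are hypothetical (the matrix is chosen to have rows summing to 0, as a real $\Sigma$ would).

```r
# Trend test statistic from Z_j(tau), Sigma and scores a_j (illustrative sketch).
trend_z <- function(Z, Sigma, a = seq_along(Z)) {
  num <- sum(a * Z)                       # sum_j a_j Z_j(tau)
  den <- sqrt(drop(t(a) %*% Sigma %*% a)) # sqrt of sum_j sum_k a_j a_k sigma_jk
  num / den
}

# Hypothetical inputs for K = 3 groups
Z <- c(1, 0, -1)
Sigma <- matrix(c( 2, -1, -1,
                  -1,  2, -1,
                  -1, -1,  2), nrow = 3, byrow = TRUE)
z_trend <- trend_z(Z, Sigma)              # uses the scores a_j = j
pnorm(abs(z_trend), lower.tail = FALSE)   # one-sided p-value in the direction of the observed trend
```

For these inputs the numerator is $1 \cdot 1 + 2 \cdot 0 + 3 \cdot (-1) = -2$ and the quadratic form is 6, so the statistic is $-2/\sqrt{6}$.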

26 Tests of trend

27 Stratified tests

If different populations have different covariates (age, sex, etc.), then ideally, you could use a regression approach to survival analysis to adjust for covariates before comparing survival curves or hazard rates. This is done in Chapter 8. If there are a small number of levels for a predictor, then you can use a stratified test instead. Let

$$H_0: h_{1s}(t) = h_{2s}(t) = \cdots = h_{Ks}(t), \quad s = 1, \dots, M, \quad t \le \tau$$

The idea is that within each level of the covariate (indexed by $s$), the hazard rate should be the same across the $K$ samples. Typically, $M$ is small.

28 Stratified tests

For the stratified test, let

$$Z_{j\cdot}(\tau) = \sum_{s=1}^{M} Z_{js}(\tau), \qquad \sigma_{jk\cdot} = \sum_{s=1}^{M} \sigma_{jks}$$

Then the test statistic is as before with multiple samples:

$$(Z_{1\cdot}(\tau), \dots, Z_{K-1,\cdot}(\tau))\,\Sigma_{\cdot}^{-1}\,(Z_{1\cdot}(\tau), \dots, Z_{K-1,\cdot}(\tau))'$$

which is approximately $\chi^2$ with $K - 1$ degrees of freedom. Here we have $K$ samples and $M$ strata within each sample.
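In R, a stratified test can be requested by adding a strata() term to the survdiff() formula. The sketch below uses the `ovarian` data set that ships with the survival package (an assumption made only so the example is runnable; it is not the data from these notes), comparing treatment groups `rx` within strata defined by residual disease `resid.ds`.

```r
library(survival)

# Unstratified and stratified two-sample log-rank tests of rx
fit_unstrat <- survdiff(Surv(futime, fustat) ~ rx, data = ovarian)
fit_strat   <- survdiff(Surv(futime, fustat) ~ rx + strata(resid.ds),
                        data = ovarian)

# Chi-square statistics, each on K - 1 = 1 degree of freedom
chisq_both <- c(unstratified = fit_unstrat$chisq, stratified = fit_strat$chisq)
print(chisq_both)
pchisq(chisq_both, df = 1, lower.tail = FALSE)
```

The degrees of freedom stay at $K - 1$ regardless of the number of strata, because the $Z_{js}(\tau)$ are summed over strata before the quadratic form is computed.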

29 Renyi type tests

For a two-sample problem, if the hazard functions cross, then the previous tests might not detect much overall difference in the hazard rates. The overall survival experience might be similar even though the groups differ in one direction in the short term and in the other direction in the long term. If one group is more at risk in the short term, and the other in the long term, these changes of direction could cancel out, leading one to fail to reject the null hypothesis that the hazards are equal. Renyi-type tests are based on the maximum absolute value of the differences between cumulative hazard rates rather than the summed differences. The idea is similar to the Kolmogorov-Smirnov test for comparing two distributions, which uses the largest absolute value of the difference between the two empirical CDFs, but Renyi tests allow for censoring.

30 Renyi type tests

To construct this test, let

$$Z(t_i) = \sum_{t_k \le t_i} W(t_k)\left[d_{k1} - Y_{k1}\left(\frac{d_k}{Y_k}\right)\right], \quad i = 1, \dots, D$$

where as usual $d_k = d_{k1} + d_{k2}$ and $Y_k = Y_{k1} + Y_{k2}$ (i.e., $d_k$ and $Y_k$ are the pooled number of deaths and number at risk at time $t_k$ over both samples). The standard error of $Z(\tau)$ is $\sigma(\tau)$, where

$$\sigma^2(\tau) = \sum_{t_k \le \tau} W(t_k)^2\left(\frac{Y_{k1}}{Y_k}\right)\left(\frac{Y_{k2}}{Y_k}\right)\left(\frac{Y_k - d_k}{Y_k - 1}\right) d_k$$

and $\tau$ is the largest death time $t_k$ with $Y_{k1}, Y_{k2} > 0$.

31 Renyi type tests

The test statistic is

$$Q = \sup\{|Z(t)|,\ t \le \tau\}/\sigma(\tau)$$

You can think of the supremum here as just the maximum of the absolute values of the $Z(t_j)$ values. Critical values are given in the Appendix, table C.5, and are based on the theory of Brownian motion.
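A minimal base R sketch of the statistic $Q$ follows. The function name `renyi_q`, its `weight` argument, and the four-observation data set are hypothetical; the function only computes $Q$, and the p-value would still come from the tabulated critical values.

```r
# Renyi-type statistic Q = max |Z(t_i)| / sigma(tau) (illustrative sketch).
renyi_q <- function(time, status, group,
                    weight = function(Y) rep(1, length(Y))) {
  g1 <- group == sort(unique(group))[1]                 # indicator of sample 1
  tk <- sort(unique(time[status == 1]))
  d  <- sapply(tk, function(s) sum(time == s & status == 1))
  d1 <- sapply(tk, function(s) sum(time == s & status == 1 & g1))
  Y  <- sapply(tk, function(s) sum(time >= s))
  Y1 <- sapply(tk, function(s) sum(time >= s & g1))
  Y2 <- Y - Y1
  keep <- Y1 > 0 & Y2 > 0                               # death times up to tau
  d <- d[keep]; d1 <- d1[keep]; Y <- Y[keep]; Y1 <- Y1[keep]; Y2 <- Y2[keep]
  W <- weight(Y)
  Zt <- cumsum(W * (d1 - Y1 * d / Y))                   # Z(t_i) at each death time
  sigma2 <- sum(W^2 * (Y1 / Y) * (Y2 / Y) * ((Y - d) / (Y - 1)) * d)
  max(abs(Zt)) / sqrt(sigma2)
}

# Hypothetical data: alternating groups, all deaths observed
q <- renyi_q(time = c(1, 2, 3, 4), status = c(1, 1, 1, 1), group = c(1, 2, 1, 2))
print(q)
```

In this tiny example the death at time 4 is excluded because sample 1 has nobody left at risk, so $\tau = 3$; the running sums are 1/2, 1/6 and 2/3, and $\sigma^2(\tau) = 13/18$.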

32 Renyi type tests

33 Renyi type tests: finding the maximum $|Z(t_j)|$


35 Testing based on a fixed point in time

Instead of testing survival and hazard rates over all time points, you might be interested in, say, the 1-yr survival rate. Note that the time being tested should be chosen before doing the test. If you look at two survival curves and say, "Wow, they look really different at year 3, is that significant?" then the p-value will be biased too low. It is similar to testing at many time points but then not adjusting for multiple comparisons. In practice, this is what happens all the time though. People look at a graph of the data, which is maybe meant to be descriptive, something jumps out at them as being unusual, and they say, "Wow, is that significant?" It's extremely difficult to answer this type of question. A better approach in this type of case might be the Renyi type of test, because it accounts for the fact that you are looking at maximum differences over the entire time frame.

36 Testing based on a fixed point in time

Here we want to test

$$H_0: S_1(t_0) = S_2(t_0) \quad \text{against} \quad H_A: S_1(t_0) \ne S_2(t_0)$$

for two survival curves. (The method can be generalized to more survival curves.) The test statistic is

$$Z = \frac{\hat{S}_1(t_0) - \hat{S}_2(t_0)}{\sqrt{\hat{V}[\hat{S}_1(t_0)] + \hat{V}[\hat{S}_2(t_0)]}}$$

which has an approximate standard normal distribution for large samples.
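One way to sketch this test in R is to read the Kaplan-Meier estimates and their standard errors at $t_0$ from summary() of a survfit object. The helper name `fixed_point_z` is hypothetical, the `aml` data set from the survival package stands in for real data, and $t_0 = 24$ is an arbitrary choice assumed to have been fixed in advance.

```r
library(survival)

# Z test comparing two Kaplan-Meier curves at a prespecified time t0
# (illustrative sketch; expects a survfit object with exactly two groups).
fixed_point_z <- function(fit, t0) {
  s <- summary(fit, times = t0)      # one row per group at time t0
  z <- (s$surv[1] - s$surv[2]) / sqrt(s$std.err[1]^2 + s$std.err[2]^2)
  c(z = z, p.two.sided = 2 * pnorm(abs(z), lower.tail = FALSE))
}

fit <- survfit(Surv(time, status) ~ x, data = aml)
res <- fixed_point_z(fit, t0 = 24)
print(res)
```

Note that summary() reports the standard error of the survival estimate itself, which is what the denominator of $Z$ needs.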

37 Testing based on a fixed point in time

If you want to test multiple fixed time points, such as the 1-yr and 5-yr survival rates, then you should adjust for multiple comparisons. For testing two time points, a Bonferroni adjustment could be made, meaning that you reject each hypothesis only if its p-value is less than $\alpha/2$. The more time points you check, the less power you will have to find significant differences.

38 Bonferroni adjustments

Probably the most popular and simplest adjustment for multiple testing is the Bonferroni adjustment. The idea is that to run $k$ tests at an overall level $\alpha$ (meaning that if the null hypotheses are true for all $k$ tests, there is at most a probability $\alpha$, for example 5%, of making an error on any of them), you use a level of $\alpha/k$ for each test. What is the rationale for doing this?

39 Bonferroni adjustments

There are several ways to justify Bonferroni adjustments. One is to look at the expected number of false positives under the null. Let $X_i = 1$ if you make an incorrect decision (a false positive) on test $i$, and otherwise $X_i = 0$. What type of variable is $X_i$? What is the probability that $X_i = 1$ if the null hypothesis (for experiment $i$) is true? What is the expected value of $X_i$?

40 Bonferroni adjustments

$X_i$ as defined previously is Bernoulli with $p = \alpha$ if testing at level $\alpha$. The expected value of a Bernoulli($p$) random variable is $p$ (why?), so the expected value of $X_i$ is $\alpha$. If you do $k$ experiments, the expected number of false positives is

$$E\left[\sum_{i=1}^{k} X_i\right] = k\alpha$$

However, if you test at the $\alpha/k$ level, then the expected number of false positives is $\alpha$. Thus, the Bonferroni adjustment controls the expected number of false positives.

41 Bonferroni adjustments

Another approach is to use something called Bonferroni's inequality. Let $A_i$ be the event that you don't reject the null hypothesis on test $i$. Suppose we set $P(A_i) = 1 - \alpha/k$ when the null is true. From the Inclusion-Exclusion formula,

$$P(A_1 \cap A_2) = P(A_1) + P(A_2) - P(A_1 \cup A_2) \ge P(A_1) + P(A_2) - 1$$

If we apply the formula again, setting $B = A_1 \cap A_2$, we get

$$P(A_1 \cap A_2 \cap A_3) = P(B \cap A_3) \ge [P(A_1) + P(A_2) - 1] + P(A_3) - 1 = P(A_1) + P(A_2) + P(A_3) - 2$$

In general, for $k$ events,

$$P(A_1 \cap \cdots \cap A_k) \ge \sum_{i=1}^{k} P(A_i) - (k - 1)$$

42 Bonferroni adjustments

If $P(A_i) = 1 - \alpha/k$, then we get

$$P(A_1 \cap \cdots \cap A_k) \ge k\left(1 - \frac{\alpha}{k}\right) - k + 1 = 1 - \alpha$$

Thus, the probability of all decisions being correct is at least $1 - \alpha$, and the probability of making any wrong decision is at most $\alpha$.
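A small numerical check of this bound, as a sketch: if we additionally assume the $k$ tests are independent (the inequality itself does not need this), the probability of at least one false positive can be computed exactly and compared with $\alpha$. The values of `alpha`, `k`, and the raw p-values below are hypothetical; p.adjust() is the base R function that applies the adjustment to p-values.

```r
alpha <- 0.05
k <- 5

# Probability of at least one false positive when all k nulls are true,
# assuming independent tests (an extra assumption made for this check only)
fwer_unadjusted <- 1 - (1 - alpha)^k        # each test at level alpha
fwer_bonferroni <- 1 - (1 - alpha / k)^k    # each test at level alpha / k
print(c(unadjusted = fwer_unadjusted, bonferroni = fwer_bonferroni))

# Equivalent view: multiply each p-value by the number of tests (capped at 1)
# and compare the result with alpha
p_raw <- c(0.01, 0.04, 0.30)                # hypothetical p-values
p_adj <- p.adjust(p_raw, method = "bonferroni")
print(p_adj)
p_adj < alpha
```

Under independence the adjusted error rate is slightly below $\alpha$, which shows that the Bonferroni bound is conservative but not by much when $\alpha/k$ is small.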

43 Bonferroni adjustments

Bonferroni's inequality can be useful in other probabilistic arguments as well.

Kernel density estimation in R

Kernel density estimation in R Kernel density estimation in R Kernel density estimation can be done in R using the density() function in R. The default is a Guassian kernel, but others are possible also. It uses it s own algorithm to

More information

Chapter 7 Fall Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample

Chapter 7 Fall Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample Bios 323: Applied Survival Analysis Qingxia (Cindy) Chen Chapter 7 Fall 2012 Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample H 0 : S(t) = S 0 (t), where S 0 ( ) is known survival function,

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Survival Regression Models

Survival Regression Models Survival Regression Models David M. Rocke May 18, 2017 David M. Rocke Survival Regression Models May 18, 2017 1 / 32 Background on the Proportional Hazards Model The exponential distribution has constant

More information

Kernel density estimation in R

Kernel density estimation in R Kernel density estimation in R Kernel density estimation can be done in R using the density() function in R. The default is a Guassian kernel, but others are possible also. It uses it s own algorithm to

More information

Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics

Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/25 Residuals for the

More information

14.30 Introduction to Statistical Methods in Economics Spring 2009

14.30 Introduction to Statistical Methods in Economics Spring 2009 MIT OpenCourseWare http://ocw.mit.edu 4.0 Introduction to Statistical Methods in Economics Spring 009 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

β j = coefficient of x j in the model; β = ( β1, β2,

β j = coefficient of x j in the model; β = ( β1, β2, Regression Modeling of Survival Time Data Why regression models? Groups similar except for the treatment under study use the nonparametric methods discussed earlier. Groups differ in variables (covariates)

More information

4. Comparison of Two (K) Samples

4. Comparison of Two (K) Samples 4. Comparison of Two (K) Samples K=2 Problem: compare the survival distributions between two groups. E: comparing treatments on patients with a particular disease. Z: Treatment indicator, i.e. Z = 1 for

More information

Questions 3.83, 6.11, 6.12, 6.17, 6.25, 6.29, 6.33, 6.35, 6.50, 6.51, 6.53, 6.55, 6.59, 6.60, 6.65, 6.69, 6.70, 6.77, 6.79, 6.89, 6.

Questions 3.83, 6.11, 6.12, 6.17, 6.25, 6.29, 6.33, 6.35, 6.50, 6.51, 6.53, 6.55, 6.59, 6.60, 6.65, 6.69, 6.70, 6.77, 6.79, 6.89, 6. Chapter 7 Reading 7.1, 7.2 Questions 3.83, 6.11, 6.12, 6.17, 6.25, 6.29, 6.33, 6.35, 6.50, 6.51, 6.53, 6.55, 6.59, 6.60, 6.65, 6.69, 6.70, 6.77, 6.79, 6.89, 6.112 Introduction In Chapter 5 and 6, we emphasized

More information

Contrasts and Multiple Comparisons Supplement for Pages

Contrasts and Multiple Comparisons Supplement for Pages Contrasts and Multiple Comparisons Supplement for Pages 302-323 Brian Habing University of South Carolina Last Updated: July 20, 2001 The F-test from the ANOVA table allows us to test the null hypothesis

More information

Rejection regions for the bivariate case

Rejection regions for the bivariate case Rejection regions for the bivariate case The rejection region for the T 2 test (and similarly for Z 2 when Σ is known) is the region outside of an ellipse, for which there is a (1-α)% chance that the test

More information

Textbook: Survivial Analysis Techniques for Censored and Truncated Data 2nd edition, by Klein and Moeschberger

Textbook: Survivial Analysis Techniques for Censored and Truncated Data 2nd edition, by Klein and Moeschberger Lecturer: James Degnan Office: SMLC 342 Office hours: MW 12:00 1:00 or by appointment E-mail: jamdeg@unm.edu Please include STAT474 or STAT574 in the subject line of the email to make sure I don t overlook

More information

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds Chapter 6 Logistic Regression In logistic regression, there is a categorical response variables, often coded 1=Yes and 0=No. Many important phenomena fit this framework. The patient survives the operation,

More information

Analysis of Variance

Analysis of Variance Statistical Techniques II EXST7015 Analysis of Variance 15a_ANOVA_Introduction 1 Design The simplest model for Analysis of Variance (ANOVA) is the CRD, the Completely Randomized Design This model is also

More information

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What? You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?) I m not goin stop (What?) I m goin work harder (What?) Sir David

More information

ANOVA Analysis of Variance

ANOVA Analysis of Variance ANOVA Analysis of Variance ANOVA Analysis of Variance Extends independent samples t test ANOVA Analysis of Variance Extends independent samples t test Compares the means of groups of independent observations

More information

Political Science 236 Hypothesis Testing: Review and Bootstrapping

Political Science 236 Hypothesis Testing: Review and Bootstrapping Political Science 236 Hypothesis Testing: Review and Bootstrapping Rocío Titiunik Fall 2007 1 Hypothesis Testing Definition 1.1 Hypothesis. A hypothesis is a statement about a population parameter The

More information

B.N.Bandodkar College of Science, Thane. Random-Number Generation. Mrs M.J.Gholba

B.N.Bandodkar College of Science, Thane. Random-Number Generation. Mrs M.J.Gholba B.N.Bandodkar College of Science, Thane Random-Number Generation Mrs M.J.Gholba Properties of Random Numbers A sequence of random numbers, R, R,., must have two important statistical properties, uniformity

More information

H 2 : otherwise. that is simply the proportion of the sample points below level x. For any fixed point x the law of large numbers gives that

H 2 : otherwise. that is simply the proportion of the sample points below level x. For any fixed point x the law of large numbers gives that Lecture 28 28.1 Kolmogorov-Smirnov test. Suppose that we have an i.i.d. sample X 1,..., X n with some unknown distribution and we would like to test the hypothesis that is equal to a particular distribution

More information

Right-truncated data. STAT474/STAT574 February 7, / 44

Right-truncated data. STAT474/STAT574 February 7, / 44 Right-truncated data For this data, only individuals for whom the event has occurred by a given date are included in the study. Right truncation can occur in infectious disease studies. Let T i denote

More information

Multilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2

Multilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Multilevel Models in Matrix Form Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Today s Lecture Linear models from a matrix perspective An example of how to do

More information

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n =

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n = Hypothesis testing I I. What is hypothesis testing? [Note we re temporarily bouncing around in the book a lot! Things will settle down again in a week or so] - Exactly what it says. We develop a hypothesis,

More information

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE THE ROYAL STATISTICAL SOCIETY 004 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER II STATISTICAL METHODS The Society provides these solutions to assist candidates preparing for the examinations in future

More information

Week 14 Comparing k(> 2) Populations

Week 14 Comparing k(> 2) Populations Week 14 Comparing k(> 2) Populations Week 14 Objectives Methods associated with testing for the equality of k(> 2) means or proportions are presented. Post-testing concepts and analysis are introduced.

More information

5. Parametric Regression Model

5. Parametric Regression Model 5. Parametric Regression Model The Accelerated Failure Time (AFT) Model Denote by S (t) and S 2 (t) the survival functions of two populations. The AFT model says that there is a constant c > 0 such that

More information

Unit 14: Nonparametric Statistical Methods

Unit 14: Nonparametric Statistical Methods Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 8/8/2003 Unit 14 - Stat 571 - Ramón V. León 1 Introductory Remarks Most methods studied so far have been based

More information

Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2)

Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2) Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2) B.H. Robbins Scholars Series June 23, 2010 1 / 29 Outline Z-test χ 2 -test Confidence Interval Sample size and power Relative effect

More information

Nonparametric hypothesis tests and permutation tests

Nonparametric hypothesis tests and permutation tests Nonparametric hypothesis tests and permutation tests 1.7 & 2.3. Probability Generating Functions 3.8.3. Wilcoxon Signed Rank Test 3.8.2. Mann-Whitney Test Prof. Tesler Math 283 Fall 2018 Prof. Tesler Wilcoxon

More information

In a one-way ANOVA, the total sums of squares among observations is partitioned into two components: Sums of squares represent:

In a one-way ANOVA, the total sums of squares among observations is partitioned into two components: Sums of squares represent: Activity #10: AxS ANOVA (Repeated subjects design) Resources: optimism.sav So far in MATH 300 and 301, we have studied the following hypothesis testing procedures: 1) Binomial test, sign-test, Fisher s

More information

AP Statistics Cumulative AP Exam Study Guide

AP Statistics Cumulative AP Exam Study Guide AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics

More information

The One-Way Independent-Samples ANOVA. (For Between-Subjects Designs)

The One-Way Independent-Samples ANOVA. (For Between-Subjects Designs) The One-Way Independent-Samples ANOVA (For Between-Subjects Designs) Computations for the ANOVA In computing the terms required for the F-statistic, we won t explicitly compute any sample variances or

More information

Linear Regression. Chapter 3

Linear Regression. Chapter 3 Chapter 3 Linear Regression Once we ve acquired data with multiple variables, one very important question is how the variables are related. For example, we could ask for the relationship between people

More information

Introduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs

Introduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs Introduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs The Analysis of Variance (ANOVA) The analysis of variance (ANOVA) is a statistical technique

More information

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007) FROM: PAGANO, R. R. (007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter

More information

Mathematical Notation Math Introduction to Applied Statistics

Mathematical Notation Math Introduction to Applied Statistics Mathematical Notation Math 113 - Introduction to Applied Statistics Name : Use Word or WordPerfect to recreate the following documents. Each article is worth 10 points and should be emailed to the instructor

More information

Lecture 7. Proportional Hazards Model - Handling Ties and Survival Estimation Statistics Survival Analysis. Presented February 4, 2016

Lecture 7. Proportional Hazards Model - Handling Ties and Survival Estimation Statistics Survival Analysis. Presented February 4, 2016 Proportional Hazards Model - Handling Ties and Survival Estimation Statistics 255 - Survival Analysis Presented February 4, 2016 likelihood - Discrete Dan Gillen Department of Statistics University of

More information

Multiple comparisons - subsequent inferences for two-way ANOVA

Multiple comparisons - subsequent inferences for two-way ANOVA 1 Multiple comparisons - subsequent inferences for two-way ANOVA the kinds of inferences to be made after the F tests of a two-way ANOVA depend on the results if none of the F tests lead to rejection of

More information

df=degrees of freedom = n - 1

df=degrees of freedom = n - 1 One sample t-test test of the mean Assumptions: Independent, random samples Approximately normal distribution (from intro class: σ is unknown, need to calculate and use s (sample standard deviation)) Hypotheses:

More information

Formal Statement of Simple Linear Regression Model

Formal Statement of Simple Linear Regression Model Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor

More information

Elementary Statistics Triola, Elementary Statistics 11/e Unit 17 The Basics of Hypotheses Testing

Elementary Statistics Triola, Elementary Statistics 11/e Unit 17 The Basics of Hypotheses Testing (Section 8-2) Hypotheses testing is not all that different from confidence intervals, so let's do a quick review of the theory behind the latter. If it's our goal to estimate the mean of a population,

Chapter Seven: Multi-Sample Methods 1/52

Chapter Seven: Multi-Sample Methods 1/52 Chapter Seven: Multi-Sample Methods 1/52 7.1 Introduction 2/52 Introduction The independent samples t test and the independent samples Z test for a difference between proportions are designed to analyze

The Design of a Survival Study

The Design of a Survival Study The Design of a Survival Study The design of survival studies is usually based on the logrank test, and sometimes assumes the exponential distribution. As in standard designs, the power depends on The

The Chi-Square Distributions

The Chi-Square Distributions MATH 183 The Chi-Square Distributions Dr. Neal, WKU The chi-square distributions can be used in statistics to analyze the standard deviation σ of a normally distributed measurement and to test the goodness

PhD course in Advanced survival analysis. One-sample tests. Properties. Idea: (ABGK, sect. V.1.1) Counting process N(t)

PhD course in Advanced survival analysis. One-sample tests. Properties. Idea: (ABGK, sect. V.1.1) Counting process N(t) PhD course in Advanced survival analysis. (ABGK, sect. V.1.1) One-sample tests. Counting process N(t) Non-parametric hypothesis tests. Parametric models. Intensity process λ(t) = α(t)Y(t) satisfying Aalen

Chapte The McGraw-Hill Companies, Inc. All rights reserved.

Chapte The McGraw-Hill Companies, Inc. All rights reserved. er15 Chapte Chi-Square Tests d Chi-Square Tests for -Fit Uniform Goodness- Poisson Goodness- Goodness- ECDF Tests (Optional) Contingency Tables A contingency table is a cross-tabulation of n paired observations

Kaplan-Meier in SAS. filename foo url "http://math.unm.edu/~james/small.txt"; data small; infile foo firstobs=2; input time censor; run;

Kaplan-Meier in SAS filename foo url "http://math.unm.edu/~james/small.txt"; data small; infile foo firstobs=2; input time censor; run; proc print data=small; run; proc lifetest data=small plots=survival;

Business Statistics. Lecture 9: Simple Regression

Business Statistics. Lecture 9: Simple Regression Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals

Quantitative Introduction to Risk and Uncertainty in Business Module 5: Hypothesis Testing

Quantitative Introduction to Risk and Uncertainty in Business Module 5: Hypothesis Testing Quantitative Introduction to Risk and Uncertainty in Business Module 5: Hypothesis Testing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October

Cox's proportional hazards/regression model - model assessment

Cox's proportional hazards/regression model - model assessment Cox's proportional hazards/regression model - model assessment Rasmus Waagepetersen September 27, 2017 Topics: Plots based on estimated cumulative hazards Cox-Snell residuals: overall check of fit Martingale

Lecture 10: F -Tests, ANOVA and R 2

Lecture 10: F -Tests, ANOVA and R 2 Lecture 10: F -Tests, ANOVA and R 2 1 ANOVA We saw that we could test the null hypothesis that $\beta_1 = 0$ using the statistic $(\hat{\beta}_1 - 0)/\widehat{\mathrm{se}}$. (Although I also mentioned that confidence intervals are generally

Analysis of variance (ANOVA) Comparing the means of more than two groups

Analysis of variance (ANOVA) Comparing the means of more than two groups Analysis of variance (ANOVA) Comparing the means of more than two groups Example: Cost of mating in male fruit flies Drosophila Treatments: place males with and without unmated (virgin) females Five treatments

Chapter 1 Review of Equations and Inequalities

Chapter 1 Review of Equations and Inequalities Chapter 1 Review of Equations and Inequalities Part I Review of Basic Equations Recall that an equation is an expression with an equal sign in the middle. Also recall that, if a question asks you to solve

Binary Logistic Regression

Binary Logistic Regression The coefficients of the multiple regression model are estimated using sample data with k independent variables Estimated (or predicted) value of Y Estimated intercept Estimated slope coefficients Ŷ = b

TMA 4275 Lifetime Analysis June 2004 Solution

TMA 4275 Lifetime Analysis June 2004 Solution TMA 4275 Lifetime Analysis June 2004 Solution Problem 1 a) Observation of the outcome is censored, if the time of the outcome is not known exactly and only the last time when it was observed being intact,

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with

Philosophy and Features of the mstate package

Philosophy and Features of the mstate package Introduction Mathematical theory Practice Discussion Philosophy and Features of the mstate package Liesbeth de Wreede, Hein Putter Department of Medical Statistics and Bioinformatics Leiden University

The Random Effects Model Introduction

The Random Effects Model Introduction The Random Effects Model Introduction Sometimes, treatments included in experiment are randomly chosen from set of all possible treatments. Conclusions from such experiment can then be generalized to other

Cox's proportional hazards model and Cox's partial likelihood

Cox's proportional hazards model and Cox's partial likelihood Cox's proportional hazards model and Cox's partial likelihood Rasmus Waagepetersen October 12, 2018 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function.

Correlation. We don't consider one variable independent and the other dependent. Does x go up as y goes up? Does x go down as y goes up?

Correlation. We don't consider one variable independent and the other dependent. Does x go up as y goes up? Does x go down as y goes up? Comment: notes are adapted from BIOL 214/312. I. Correlation. Correlation A) Correlation is used when we want to examine the relationship of two continuous variables. We are not interested in prediction.

Survival Analysis. Stat 526. April 13, 2018

Survival Analysis. Stat 526. April 13, 2018 Survival Analysis Stat 526 April 13, 2018 1 Functions of Survival Time Let T be the survival time for a subject Then P [T < 0] = 0 and T is a continuous random variable The Survival function is defined

Hypothesis testing. Data to decisions

Hypothesis testing. Data to decisions Hypothesis testing Data to decisions The idea Null hypothesis: H 0 : the DGP/population has property P Under the null, a sample statistic has a known distribution If, under that distribution, the

Statistics 262: Intermediate Biostatistics Regression & Survival Analysis

Statistics 262: Intermediate Biostatistics Regression & Survival Analysis Statistics 262: Intermediate Biostatistics Regression & Survival Analysis Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics p.1/?? Introduction This course is an applied course,

Survival analysis in R

Survival analysis in R Survival analysis in R Niels Richard Hansen This note describes a few elementary aspects of practical analysis of survival data in R. For further information we refer to the book Introductory Statistics

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

Linear Models: Comparing Variables. Stony Brook University CSE545, Fall 2017

Linear Models: Comparing Variables. Stony Brook University CSE545, Fall 2017 Linear Models: Comparing Variables Stony Brook University CSE545, Fall 2017 Statistical Preliminaries Random Variables Random Variables X: A mapping from Ω to ℝ that describes the question we care about

Statistics Primer. A Brief Overview of Basic Statistical and Probability Principles. Essential Statistics for Data Analysts Using Excel

Statistics Primer. A Brief Overview of Basic Statistical and Probability Principles. Essential Statistics for Data Analysts Using Excel Statistics Primer A Brief Overview of Basic Statistical and Probability Principles Liberty J. Munson, PhD 9/19/16 Essential Statistics for Data Analysts Using Excel Table of Contents What is a Variable?...

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics DETAILED CONTENTS About the Author Preface to the Instructor To the Student How to Use SPSS With This Book PART I INTRODUCTION AND DESCRIPTIVE STATISTICS 1. Introduction to Statistics 1.1 Descriptive and

Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction

Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction Outline CHL 5225H Advanced Statistical Methods for Clinical Trials: Survival Analysis Prof. Kevin E. Thorpe Defining Survival Data Mathematical Definitions Non-parametric Estimates of Survival Comparing

Lecture 21: October 19

Lecture 21: October 19 36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 21: October 19 21.1 Likelihood Ratio Test (LRT) To test composite versus composite hypotheses the general method is to use

Lecture 4: Testing Stuff

Lecture 4: Testing Stuff Lecture 4: Testing Stuff. Testing hypotheses usually has three steps: a. First specify a Null Hypothesis, usually denoted H 0, which describes a model of interest. Usually, we express H 0 as a restricted

HYPOTHESIS TESTING. Hypothesis Testing

HYPOTHESIS TESTING. Hypothesis Testing MBA 605 Business Analytics Don Conant, PhD. HYPOTHESIS TESTING Hypothesis testing involves making inferences about the nature of the population on the basis of observations of a sample drawn from the population.

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA)

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) 22s:152 Applied Linear Regression Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) We now consider an analysis with only categorical predictors (i.e. all predictors are

STAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis

STAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis STAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis Rebecca Barter April 6, 2015 Multiple Testing Multiple Testing Recall that when we were doing two sample t-tests, we were testing the equality

Testing Independence

Testing Independence Testing Independence Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM 1/50 Testing Independence Previously, we looked at RR = OR = 1

Solutions to Final STAT 421, Fall 2008

Solutions to Final STAT 421, Fall 2008 Solutions to Final STAT 421, Fall 2008 Fritz Scholz 1. (8) Two treatments A and B were randomly assigned to 8 subjects (4 subjects to each treatment) with the following responses: 0, 1, 3, 6 and 5, 7,

Stat 5421 Lecture Notes Fuzzy P-Values and Confidence Intervals Charles J. Geyer March 12, Discreteness versus Hypothesis Tests

Stat 5421 Lecture Notes Fuzzy P-Values and Confidence Intervals Charles J. Geyer March 12, Discreteness versus Hypothesis Tests Stat 5421 Lecture Notes Fuzzy P-Values and Confidence Intervals Charles J. Geyer March 12, 2016 1 Discreteness versus Hypothesis Tests You cannot do an exact level α test for any α when the data are discrete.

Module 6: Model Diagnostics

Module 6: Model Diagnostics St@tmaster 02429/MIXED LINEAR MODELS PREPARED BY THE STATISTICS GROUPS AT IMM, DTU AND KU-LIFE Module 6: Model Diagnostics 6.1 Introduction 6.2 Linear model diagnostics

R 2 and F -Tests and ANOVA

R 2 and F -Tests and ANOVA R 2 and F -Tests and ANOVA December 6, 2018 1 Partition of Sums of Squares The distance from any point $y_i$ in a collection of data, to the mean of the data $\bar{y}$, is the deviation, written as $y_i - \bar{y}$. Definition.

Inferences about a Mean Vector

Inferences about a Mean Vector Inferences about a Mean Vector Edps/Soc 584, Psych 594 Carolyn J. Anderson Department of Educational Psychology University of Illinois at Urbana-Champaign © Board of Trustees, University

Contents. Acknowledgments. xix

Contents. Acknowledgments. xix Table of Preface Acknowledgments page xv xix 1 Introduction 1 The Role of the Computer in Data Analysis 1 Statistics: Descriptive and Inferential 2 Variables and Constants 3 The Measurement of Variables

Gaussian Quiz. Preamble to The Humble Gaussian Distribution. David MacKay 1

Gaussian Quiz. Preamble to The Humble Gaussian Distribution. David MacKay 1 Preamble to The Humble Gaussian Distribution. David MacKay Gaussian Quiz H y y y 3. Assuming that the variables y, y, y 3 in this belief network have a joint Gaussian distribution, which of the following

11 Survival Analysis and Empirical Likelihood

11 Survival Analysis and Empirical Likelihood 11 Survival Analysis and Empirical Likelihood The first paper of empirical likelihood is actually about confidence intervals with the Kaplan-Meier estimator (Thomas and Grunkmeier 1979), i.e. deals with

Analysis of Variance (ANOVA)

Analysis of Variance (ANOVA) Analysis of Variance (ANOVA) Two types of ANOVA tests: Independent measures and Repeated measures Comparing 2 means: X 1 = 20 t - test X 2 = 30 How can we Compare 3 means?: X 1 = 20 X 2 = 30 X 3 = 35 ANOVA

Name Solutions Linear Algebra; Test 3. Throughout the test simplify all answers except where stated otherwise.

Name Solutions Linear Algebra; Test 3. Throughout the test simplify all answers except where stated otherwise. Name Solutions Linear Algebra; Test 3 Throughout the test simplify all answers except where stated otherwise. 1) Find the following: (10 points) ( ) Or note that so the rows are linearly independent, so

Lecture 8 Stat D. Gillen

Lecture 8 Stat D. Gillen Statistics 255 - Survival Analysis Presented February 23, 2016 Dan Gillen Department of Statistics University of California, Irvine 8.1 Example of two ways to stratify Suppose a confounder C has 3 levels

One-sample categorical data: approximate inference

One-sample categorical data: approximate inference One-sample categorical data: approximate inference Patrick Breheny October 6 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/25 Introduction It is relatively easy to think about the distribution

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics p.1/?? Overview of today's class Kaplan-Meier Curve

Rank-Based Methods. Lukas Meier

Rank-Based Methods. Lukas Meier Rank-Based Methods Lukas Meier 20.01.2014 Introduction Up to now we basically always used a parametric family, like the normal distribution $N(\mu, \sigma^2)$ for modeling random data. Based on observed data

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$ 2. Inference Assumptions of the Classical Linear Model (CLM) So far, we know: 1. The mean and variance of the OLS estimators

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology, Kharagpur

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology, Kharagpur Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology, Kharagpur Lecture No. # 38 Goodness - of fit tests Hello and welcome to this

Distribution-Free Procedures (Devore Chapter Fifteen)

Distribution-Free Procedures (Devore Chapter Fifteen) Distribution-Free Procedures (Devore Chapter Fifteen) MATH-5-01: Probability and Statistics II Spring 018 Contents 1 Nonparametric Hypothesis Tests 1.1 The Wilcoxon Rank Sum Test 1. Normal

Lecture 19 Multiple (Linear) Regression

Lecture 19 Multiple (Linear) Regression Lecture 19 Multiple (Linear) Regression Thais Paiva STA 111 - Summer 2013 Term II August 1, 2013 1 / 30 Thais Paiva STA 111 - Summer 2013 Term II Lecture 19, 08/01/2013 Lecture Plan 1 Multiple regression

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire

Master's Written Examination

Master's Written Examination Master's Written Examination Option: Statistics and Probability Spring 016 Full points may be obtained for correct answers to eight questions. Each numbered question which may have several parts is worth

Introduction to Nonparametric Statistics

Introduction to Nonparametric Statistics Introduction to Nonparametric Statistics by James Bernhard Spring 2012 Parameters Parametric method Nonparametric method µ[x 2 X 1 ] paired t-test Wilcoxon signed rank test µ[x 1 ], µ[x 2 ] 2-sample t-test

Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest.

Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest. Experimental Design: Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest We wish to use our subjects in the best

Multistate Modeling and Applications

Multistate Modeling and Applications Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)

MAS3301 / MAS8311 Biostatistics Part II: Survival

MAS3301 / MAS8311 Biostatistics Part II: Survival MAS3301 / MAS8311 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-10 1 13 The Cox proportional hazards model 13.1 Introduction In the

Confidence intervals

Confidence intervals Confidence intervals We now want to take what we've learned about sampling distributions and standard errors and construct confidence intervals. What are confidence intervals? Simply an interval for which
