BIOS 312: Precision of Statistical Inference


Precision and Power/Sample Size; Precision and Standard Errors
Chris Slaughter
Department of Biostatistics, Vanderbilt University School of Medicine
January 3, 2013

Outline
1. Overview
2. Precision and Power/Sample Size
3. Precision and Standard Errors

Bias and precision
The goal of statistical inference is to estimate parameters accurately (unbiased) and with high precision. Measures of precision:
- Standard error (not standard deviation)
- Width of confidence intervals
- Power (equivalently, the type II error rate)

Summary measures
Scientific hypotheses are typically refined into statistical hypotheses by identifying some parameter, θ, that measures differences in the distribution of the response variable. Often we are interested in whether θ differs across levels of a categorical (e.g. treatment/control) or continuous (e.g. age) predictor variable. θ could be any summary measure, such as:
- Difference/ratio of means
- Difference/ratio of medians
- Ratio of geometric means
- Difference/ratio of proportions
- Odds ratio, relative risk, risk difference
- Hazard ratio

Choosing a summary measure
How to select θ? In order of importance...
1. Scientific (clinical) importance. May be based on the current state of knowledge.
2. Is θ likely to vary across the predictor of interest? This impacts the ability to detect a difference, if it exists.
3. Statistical precision. Only relevant if all other factors are equal.

Statistical inference
Statistics is concerned with making inference about population parameters (θ) based on a sample of data.
- Frequentist estimation includes both point estimates (θ̂) and interval estimates (confidence intervals).
- Bayesian analysis estimates the posterior distribution of θ given the sampled data, p(θ | data). The posterior distribution can then be summarized by quantities like the posterior mean and 95% credible interval.
- Likelihood analysis focuses on using the likelihood function to obtain maximum likelihood estimates. The likelihood function can be used directly to obtain upper and lower confidence-type intervals for estimates.

Example: cholesterol-lowering trials
Consider the following results from 5 clinical trials of three drugs (A, B, C) designed to lower cholesterol compared to baseline. Assume a 10 unit drop in cholesterol (relative to baseline) is clinically meaningful.

Trial  Drug  95% CI for diff   p-value
1      A     [129, 69]
2      A     [49.6, 10.4]
3      B     [85, 45]
4      B     [8.5, 4.5]
5      C     [9.9, 2.1]        0.002

Which drug is effective at reducing cholesterol? Why is study 4 more informative than study 3 (even though the p-values are similar)?
Moral: Hypothesis tests and p-values can often be insufficient to make proper decisions. The confidence interval provides more useful information.

Sampling distribution defined
The sampling distribution is the probability distribution of a statistic; e.g. the sampling distribution of the sample mean is N(µ, σ²/n). Most often we choose estimators that are asymptotically Normally distributed: for large n, θ̂ ~ N(θ, V/n).
- θ̂ is our estimate of θ. The hat indicates it is an estimate.
- Mean: θ.
- Variance: V/n, where V is related to the average amount of statistical information available from each observation. Often V depends on θ.
- How "large" n must be depends on the distribution of the underlying data. If n is large enough, approximate Normality of θ̂ will hold.

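As a quick illustration (not from the slides), the claim that θ̂ = Ȳ is approximately N(µ, σ²/n) even for non-Normal data can be checked by simulation in R; the Exponential(1) distribution and all numbers below are arbitrary choices:

```r
# Simulation sketch: the sample mean of skewed Exponential(1) data
# behaves approximately like N(mu, sigma^2/n) for large n.
set.seed(312)
n     <- 100      # observations per simulated study
reps  <- 10000    # number of simulated studies
sigma <- 1        # sd of Exponential(rate = 1); mu is also 1
xbars <- replicate(reps, mean(rexp(n, rate = 1)))
mean(xbars)       # close to mu = 1
sd(xbars)         # close to sigma / sqrt(n) = 0.1
```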
Confidence intervals when n is large
Calculating 100(1 − α)% confidence intervals (θ_L, θ_U) with approximate Normality:
θ_L = θ̂ − z_{1−α/2} √(V/n)
θ_U = θ̂ + z_{1−α/2} √(V/n)
i.e. (estimate) ± (critical value) × (std err of estimate).
Can similarly calculate approximate two-sided p-values from
Z = [(estimate) − (hypothesized value)] / (std err of estimate):
- p-value in Stata: 2*norm(-abs(Z))
- p-value in R: 2*pnorm(-abs(Z)), using the pnorm() function

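The CI and p-value recipe above can be sketched in R; the estimate, hypothesized value, V, and n below are hypothetical numbers, not from the slides:

```r
# Sketch: large-sample 95% CI and two-sided p-value for an estimate,
# using hypothetical inputs (estimate = -7, hypothesized value 0, V = 400, n = 64).
est <- -7; hyp <- 0; V <- 400; n <- 64
se <- sqrt(V / n)                    # standard error = 2.5
z  <- qnorm(0.975)                   # critical value for a 95% CI
ci <- c(est - z * se, est + z * se)  # approximately (-11.9, -2.1)
Z  <- (est - hyp) / se               # test statistic = -2.8
p  <- 2 * pnorm(-abs(Z))             # two-sided p-value, about 0.005
```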
Comparing independent estimates
If estimates are independent and Normally distributed,
θ̂1 ~ N(θ1, se1²) and θ̂2 ~ N(θ2, se2²),
then:
θ̂1 − θ̂2 ~ N(θ1 − θ2, se1² + se2²)
θ̂1 + θ̂2 ~ N(θ1 + θ2, se1² + se2²)
θ̂1/θ̂2 ~ N(θ1/θ2, se1²/θ2² + θ1² se2²/θ2⁴)  (approximately, by the delta method)

Comparing correlated estimates
If estimates are correlated and Normally distributed,
θ̂1 ~ N(θ1, se1²), θ̂2 ~ N(θ2, se2²), ρ = corr(θ̂1, θ̂2),
then:
θ̂1 − θ̂2 ~ N(θ1 − θ2, se1² + se2² − 2ρ se1 se2)
θ̂1 + θ̂2 ~ N(θ1 + θ2, se1² + se2² + 2ρ se1 se2)
Example: comparing results from the same study.
- The paper may not give the interesting results (from your point of view).
- The comparison can be difficult because the correlation is usually not reported.

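The effect of correlation on the standard error of a difference can be sketched with hypothetical numbers (se1, se2, ρ below are invented for illustration):

```r
# Sketch: standard error of theta1_hat - theta2_hat, with and without
# correlation; positive correlation shrinks the se of a difference.
se1 <- 3; se2 <- 4; rho <- 0.5
se_indep <- sqrt(se1^2 + se2^2)                        # independent: 5
se_corr  <- sqrt(se1^2 + se2^2 - 2 * rho * se1 * se2)  # correlated: sqrt(13), about 3.6
```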
Classical hypothesis testing
Classical hypothesis testing is stated in terms of the null hypothesis (H0). The alternative hypothesis (H1) is the complement of H0.
- Two-sided: H0: θ = θ0 vs H1: θ ≠ θ0
- One-sided: H0: θ ≥ θ0 vs H1: θ < θ0
- One-sided: H0: θ ≤ θ0 vs H1: θ > θ0
Inference is based on either rejecting or failing to reject the null hypothesis. Typically, the null hypothesis is stated in some form so as to indicate no association.

Classical hypothesis testing thought process
- Assume H0 is true.
- Conceive of the data as one of many datasets that might have happened.
- See if the data are consistent with H0: are the data extreme or unlikely if H0 is really true?
- Proof by contradiction: if assuming H0 is true leads to results that are bizarre or unlikely to have been observed, this casts doubt on the premise.
- Evidence is summarized through a single statistic capturing a tendency of the data, e.g. x̄.
- Look at the probability of getting a statistic as or more extreme than the calculated one (results as or more impressive than ours) if H0 is true: the p-value.

Classical hypothesis testing thought process, cont.
- If the statistic has a low probability of being observed to be this extreme, then, if H0 is true, we have acquired data that are very improbable, i.e. have witnessed a low-probability event.
- Evidence then mounts against H0 and we might reject it.
- A failure to reject does not imply that we have gathered evidence in favor of H0; there are many reasons for studies not to be impressive, including small sample size (n).
Key limitation: Classical hypothesis testing ignores clinical significance. An approach that allows us to make informed decisions is preferable.

Decision theoretic approach
Stated in terms of the null hypothesis and a suitably chosen design alternative. Summarize the design alternative through θ1 (θ1 > 0):
- Two-sided: H0: θ = θ0 vs H1: θ ≤ −θ1 or θ ≥ θ1
- One-sided: H0: θ ≥ θ0 vs H1: θ ≤ −θ1
- One-sided: H0: θ ≤ θ0 vs H1: θ ≥ θ1
Using the decision theoretic approach, we can conclude:
- Reject the null hypothesis: the data are atypical of what we would expect if the null hypothesis were true.
- Reject the alternative hypothesis: the data are atypical of what we would expect if the alternative hypothesis were true.

Decision theoretic approach, cont.
Key difference from the classical approach: the design alternative (θ1) is ideally chosen to be the minimal important difference to detect, based on scientific or clinical criteria.
- Clinical significance: in the cholesterol example, the important difference was assumed to be 10 mg/dl.
- Economic impact: a new drug is not marketable unless it has a large effect.
- Feasibility of study: limited availability of subjects may limit investigators to searching for interventions with large impact.
Remember the cholesterol example: studies 2, 4, and 5 follow the decision theoretic approach because they allow us to discriminate between scientifically meaningful hypotheses.

Measures of high precision
What are the measures of (high) precision?
- Estimators are less variable across studies, which is often measured by a decreased standard error.
- Narrower confidence intervals: estimators are consistent with fewer hypotheses when the CIs are narrow.
- Ability to reject false hypotheses: the Z statistic is larger when the alternative hypothesis is true.
Translation into sample size:
- Based on the width of the confidence interval: choose a sample size such that a 95% CI will not contain both the null and design alternative. If θ0 and θ1 cannot both be in the CI, we have discriminated between those hypotheses.
- Based on statistical power: when the alternative is true, have a high probability of rejecting the null. In other words, minimize the type II error rate.

Statistical power: quick review
Power is the probability of rejecting the null hypothesis when the alternative is true: Pr(reject H0 | θ = θ1). Most often θ̂ ~ N(θ, V/n), so that the test statistic Z = (θ̂ − θ0)/√(V/n) will follow a Normal distribution.
- Under H0, Z ~ N(0, 1), so we reject H0 if |Z| > z_{1−α/2}.
- Under H1, Z ~ N((θ1 − θ0)/√(V/n), 1).
Power curves:
- The power function (power curve) is a function of the true value of θ.
- We can compute power for every value of θ.
- As θ moves away from θ0, power increases (for two-sided alternatives).
- For any choice of desired power, there is always some θ such that the study has that power.
- Pwr(θ0) = α, the type I error rate.

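The power function above can be sketched directly in R under the Normal approximation; the defaults below (θ0 = 0, V = 1, n = 100, α = 0.05) are illustrative choices, not values from the slides:

```r
# Sketch: Pr(reject H0 | theta = theta1) when Z ~ N((theta1 - theta0)/sqrt(V/n), 1)
# and we reject if |Z| > z_{1-alpha/2}.
power_normal <- function(theta1, theta0 = 0, V = 1, n = 100, alpha = 0.05) {
  drift <- (theta1 - theta0) / sqrt(V / n)
  zc <- qnorm(1 - alpha / 2)
  pnorm(-zc - drift) + (1 - pnorm(zc - drift))  # both rejection tails
}
power_normal(0)     # equals alpha = 0.05 at theta = theta0
power_normal(0.28)  # about 0.80 for V = 1, n = 100
```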
[Figure: power curves for a two-sample, equal-variance t-test with n = 100; power versus true difference in means (theta), with curves for σ = 1 and σ = 1.2.]

Code for generating example power curve

mydiffs <- seq(-0.8, 0.8, 0.05)
mypower <- vector("numeric", length(mydiffs))
mypower2 <- vector("numeric", length(mydiffs))
for (i in 1:length(mydiffs)) {
    # power of a two-sample t-test at each true difference in means
    mypower[i] <- power.t.test(n = 100, sd = 1, delta = mydiffs[i])$power
    mypower2[i] <- power.t.test(n = 100, sd = 1.2, delta = mydiffs[i])$power
}
plot(mydiffs, mypower, xlab = "True difference in means (theta)",
     ylab = "Power", type = "l", main = "")
lines(mydiffs, mypower2, lty = 2)
legend("top", c(expression(sigma == 1), expression(sigma == 1.2)),
       lty = 1:2, inset = 0.05)

Precision and standard errors
Standard errors are the key to precision.
- Greater precision is achieved with smaller standard errors.
- Standard errors are decreased by either decreasing V or increasing n.
Typically: se(θ̂) = √(V/n)
- Width of CI: 2 × (critical value) × se(θ̂)
- Test statistic: Z = (θ̂ − θ0)/se(θ̂)

Example: One sample mean
Observations are independent and identically distributed (iid):
Y_i ~ iid (µ, σ²), i = 1, ..., n
θ = µ, θ̂ = (1/n) Σ_{i=1}^n Y_i = Ȳ
V = σ², se(θ̂) = √(σ²/n)
Note that we are not assuming a specific distribution for Y_i, just that the distribution has a mean and variance. We are assuming that n is large, so asymptotic results are applicable. The distribution of Y_i could then be binary, Poisson, exponential, normal, etc., and the results will hold.
There are ways to decrease V, including:
- Restrict the sample by age, gender, etc.
- Take repeated measures on each subject, summarize, and perform the test on the summary measures.
- Better ideas (this course): adjust for age and gender; use all the data while modeling correlation.

Example: Two sample mean
Difference of independent means. Observations are no longer identically distributed, just independent: group 1 has a different mean and variance than group 2.
Y_ij ~ ind (µ_j, σ_j²), j = 1, 2; i = 1, ..., n_j
n = n1 + n2; r = n1/n2
θ = µ1 − µ2, θ̂ = Ȳ1 − Ȳ2
V = (r + 1)(σ1²/r + σ2²)
se(θ̂) = √(V/n) = √(σ1²/n1 + σ2²/n2)

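As a quick check (not from the slides), the two algebraically equivalent forms of the two-sample standard error can be compared numerically; the sample sizes and standard deviations below are hypothetical:

```r
# Sketch: sqrt(V/n) with V = (r+1)(s1^2/r + s2^2) agrees with the
# direct form sqrt(s1^2/n1 + s2^2/n2).
n1 <- 60; n2 <- 40; s1 <- 12; s2 <- 8
n <- n1 + n2; r <- n1 / n2
V <- (r + 1) * (s1^2 / r + s2^2)
se_V      <- sqrt(V / n)                  # 2
se_direct <- sqrt(s1^2 / n1 + s2^2 / n2)  # also 2
```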
Comments on the optimal ratio of sample sizes (r)
If we are constrained by the maximal sample size n = n1 + n2:
- Smallest V when r = n1/n2 = σ1/σ2.
- In other words, V is smaller if we sample more subjects from the more variable group.
If we are unconstrained by the maximal sample size, there is a point of diminishing returns.
- Example: a case-control study where finding cases is difficult/expensive but finding controls is easy/cheap.
- An often-quoted choice is r = 5.

[Figure: "Optimal r for Fixed (n1 + n2): r = s1 / s2". Standard error versus sample size ratio r = n1/n2, with curves for s1 = s2, s1 = 2*s2, and s1 = 3*s2; the minima occur at r = 1, r = 2, and r = 3, respectively.]

[Figure: "Diminishing returns for r > 5". Standard error versus sample size ratio r = n1/n2, with curves for s1 = s2, s1 = 2*s2, and s1 = 3*s2; the curves flatten as r grows past 5.]

Code for optimal sample size ratio for fixed sample size

var.fn <- function(r, s1, s2) {
    (r + 1) * (s1^2/r + s2^2)
}
n <- 100
s2 <- 10
plot(function(r) sqrt(var.fn(r, s1 = s2, s2 = s2)/n), 0, 20,
     ylim = c(1, 6), xlim = c(0, 25), ylab = "Standard Error",
     xlab = "Sample Size Ratio r = n1/n2",
     main = "Optimal r for Fixed (n1 + n2): r = s1 / s2")
plot(function(r) sqrt(var.fn(r, s1 = 2 * s2, s2 = s2)/n), 0, 20, add = TRUE, lty = 2)
plot(function(r) sqrt(var.fn(r, s1 = 3 * s2, s2 = s2)/n), 0, 20, add = TRUE, lty = 3)
text(20, 4.7, "s1 = s2", pos = 4)
text(20, 5.1, "s1 = 2*s2", pos = 4)
text(20, 5.5, "s1 = 3*s2", pos = 4)
points(c(1, 2, 3), sqrt(var.fn(c(1, 2, 3), s1 = c(1, 2, 3) * s2, s2 = s2)/n), pch = 2)
text(1, 1.8, "r = 1")
text(2, 2.8, "r = 2")
text(3, 3.8, "r = 3")

Code for diminishing returns for increasing sample size ratio

n1 <- 200
plot(function(r) sqrt(var.fn(r, s1 = s2, s2 = s2)/(n1 + r * n1)), 0, 20,
     ylim = c(0.5, 3), xlim = c(0, 25), ylab = "Standard Error",
     xlab = "Sample Size Ratio r = n1/n2",
     main = "Diminishing returns for r > 5")
plot(function(r) sqrt(var.fn(r, s1 = 2 * s2, s2 = s2)/(n1 + r * n1)), 0, 20, add = TRUE, lty = 2)
plot(function(r) sqrt(var.fn(r, s1 = 3 * s2, s2 = s2)/(n1 + r * n1)), 0, 20, add = TRUE, lty = 3)
text(20, 0.7, "s1 = s2", pos = 4)
text(20, 0.8, "s1 = 2*s2", pos = 4)
text(20, 0.9, "s1 = 3*s2", pos = 4)

Example: Paired means
Difference of paired means. No longer iid: group 1 has a different mean and variance than group 2, and observations are paired (correlated).
Y_ij ~ (µ_j, σ_j²), j = 1, 2; i = 1, ..., n
corr(Y_i1, Y_i2) = ρ; corr(Y_ij, Y_mk) = 0 if i ≠ m
θ = µ1 − µ2, θ̂ = Ȳ1 − Ȳ2
V = σ1² + σ2² − 2ρσ1σ2, se(θ̂) = √(V/n)
Precision gains are made when matched observations are positively correlated (ρ > 0). This is usually the case, but possible exceptions include:
- Sleep on successive nights
- Intrauterine growth of littermates

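The gain from pairing can be sketched numerically; the n, standard deviations, and ρ below are hypothetical illustration values:

```r
# Sketch: paired vs. independent-samples standard error for the same n;
# positive correlation (rho > 0) shrinks the paired se.
n <- 50; s1 <- 10; s2 <- 10; rho <- 0.6
se_paired   <- sqrt((s1^2 + s2^2 - 2 * rho * s1 * s2) / n)  # sqrt(1.6), about 1.26
se_unpaired <- sqrt(s1^2 / n + s2^2 / n)                    # 2
```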
Example: Clustered data
Clustered data arise in experiments where treatments/interventions are assigned on the basis of households, schools, clinics, cities, etc.
Mean of clustered data:
Y_ij ~ (µ, σ²), i = 1, ..., n; j = 1, ..., m
Up to n clusters, each of which has m subjects
corr(Y_ij, Y_ik) = ρ if j ≠ k; corr(Y_ij, Y_mk) = 0 if i ≠ m
θ = µ, θ̂ = (1/(nm)) Σ_{i=1}^n Σ_{j=1}^m Y_ij = Ȳ
V = σ² (1 + (m − 1)ρ)/m, se(θ̂) = √(V/n)
What is V if...
- ρ = 0 (independent)?
- m = 1?
- m is large (e.g. m = 1000) and ρ is 0, 1, or 0.01?

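The questions above can be explored with a small sketch; the σ, n, m, and ρ values below are arbitrary choices for illustration:

```r
# Sketch: se(theta_hat) = sqrt(sigma^2 * (1 + (m - 1) * rho) / (m * n));
# the factor 1 + (m - 1) * rho is the design effect.
se_clustered <- function(sigma, n, m, rho) {
  sqrt(sigma^2 * (1 + (m - 1) * rho) / (m * n))
}
se_clustered(1, n = 20, m = 100, rho = 0)     # 1/sqrt(2000): fully independent
se_clustered(1, n = 20, m = 100, rho = 0.01)  # even small rho inflates the se
se_clustered(1, n = 20, m = 100, rho = 1)     # 1/sqrt(20): only n clusters inform
```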
Clustered data, cont.
With clustered data, even small correlations can be very important to consider.
[Table: combinations of number of clusters (n), cluster size (m), and ρ that achieve equal precision, with the resulting total N.]
Always consider practical issues. Is it easier/cheaper to collect 1 observation on 1000 different subjects, or 100 observations on 20 different subjects?

Example: Independent odds ratios
Binary outcomes:
Y_ij ~ ind B(1, p_j), i = 1, ..., n_j; j = 1, 2
n = n1 + n2; r = n1/n2
θ = log[(p1/(1 − p1)) / (p2/(1 − p2))]; θ̂ = log[(p̂1/(1 − p̂1)) / (p̂2/(1 − p̂2))]
σ_j² = 1/(p_j q_j), where q_j = 1 − p_j
V = (r + 1)(σ1²/r + σ2²)
se(θ̂) = √(V/n) = √(1/(n1 p̂1 q̂1) + 1/(n2 p̂2 q̂2))
Notes on maximum precision:
- Maximum precision is achieved when the underlying odds are near 1 (proportions near 0.5).
- If we were considering differences in proportions, maximum precision is achieved when the underlying proportions are near 0 or 1.

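The log odds ratio and its standard error can be sketched with hypothetical two-group data (the proportions and sample sizes below are invented):

```r
# Sketch: log odds ratio, its large-sample standard error
# 1/(n1 p1 q1) + 1/(n2 p2 q2) under the square root, and a 95% CI for the OR.
n1 <- 200; n2 <- 200
p1 <- 0.30; p2 <- 0.20
log_or <- log((p1 / (1 - p1)) / (p2 / (1 - p2)))
se <- sqrt(1 / (n1 * p1 * (1 - p1)) + 1 / (n2 * p2 * (1 - p2)))
ci <- exp(log_or + c(-1, 1) * qnorm(0.975) * se)  # 95% CI for the odds ratio
```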
Example: Hazard ratios
Independent censored time-to-event outcomes:
(T_ij, δ_ij), i = 1, ..., n_j; j = 1, 2
n = n1 + n2; r = n1/n2
θ = log(HR); θ̂ = β̂ from proportional hazards (PH) regression
V = (r + 1)(1/r + 1)/Pr(δ_ij = 1)
se(θ̂) = √(V/n) = √((r + 1)(1/r + 1)/d)
- In the PH model, statistical information is roughly proportional to d, the number of observed events.
- Papers always report the number of events.
- Study design must consider how long it will take to observe events (e.g. deaths) starting from randomization.

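A small sketch of the formula above; the event count d is a hypothetical number:

```r
# Sketch: se(log HR) = sqrt((r + 1) * (1/r + 1) / d); with equal allocation
# (r = 1) this reduces to 2/sqrt(d).
se_loghr <- function(d, r = 1) sqrt((r + 1) * (1 / r + 1) / d)
se_loghr(d = 100)         # 0.2 with equal allocation
se_loghr(d = 100, r = 5)  # unbalanced allocation inflates the se
```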
Example: Linear regression
Independent continuous outcomes associated with covariates:
Y_i | X_i ~ ind (β0 + β1 X_i, σ²_{Y|X}), i = 1, ..., n
θ = β1, θ̂ = β̂1 from least squares regression
V = σ²_{Y|X} / Var(X), se(θ̂) = √(σ̂²_{Y|X} / (n · Var̂(X)))
- Precision tends to increase as the predictor (X) is measured over a wider range.
- Precision is also related to the within-group variance σ²_{Y|X}.
What happens to the formulas when X is a binary variable? See the two sample mean.

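As a check (not from the slides), the formula for se(β̂1) can be compared against lm() on simulated data; all numbers below are arbitrary, and Var̂(X) is computed with divisor n, so n·Var̂(X) = Σ(x − x̄)²:

```r
# Sketch: lm()'s slope standard error equals sqrt(sigma_hat^2 / sum((x - xbar)^2)).
set.seed(1)
n <- 200
x <- runif(n, 0, 10)                  # wide predictor range helps precision
y <- 2 + 0.5 * x + rnorm(n, sd = 3)   # simulated outcome
fit <- lm(y ~ x)
se_lm <- summary(fit)$coefficients["x", "Std. Error"]
sigma2_hat <- sum(residuals(fit)^2) / (n - 2)   # estimate of sigma^2_{Y|X}
se_formula <- sqrt(sigma2_hat / sum((x - mean(x))^2))
```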
Summary
Options for increasing precision:
- Increase the sample size
- Decrease V
- (Decrease the confidence level)
Criteria for precision:
- Standard error
- Width of confidence intervals
- Statistical power: select a suitable design alternative and the desired power

Summary: sample size calculation
The number of sampling units needed to obtain the desired precision depends on:
- Level of significance α when θ = θ0
- Power β when θ = θ1
- Variability V within one sampling unit
n = (z_{1−α/2} + z_β)² V / (θ1 − θ0)²
When sample size is constrained (the usual case), either:
- Compute the power to detect a specified alternative:
  β = Φ((θ1 − θ0)/√(V/n) − z_{1−α/2})
  where Φ is the standard Normal cdf. In Stata, use normprob for Φ; in R, use pnorm.
- Compute the alternative that can be detected with high power:
  θ1 = θ0 + (z_{1−α/2} + z_β)√(V/n)

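The sample-size formula and its inversion to power can be sketched with hypothetical design values (α = 0.05, power 0.90, a 10-unit difference, sd = 40; none of these come from the slides):

```r
# Sketch: n = (z_{1-alpha/2} + z_beta)^2 * V / (theta1 - theta0)^2,
# then the power achieved at the rounded-up n.
alpha <- 0.05; pwr <- 0.90
theta0 <- 0; theta1 <- 10; V <- 40^2
z_a <- qnorm(1 - alpha / 2); z_b <- qnorm(pwr)
n <- (z_a + z_b)^2 * V / (theta1 - theta0)^2                  # about 168.1
phi <- pnorm((theta1 - theta0) / sqrt(V / ceiling(n)) - z_a)  # power at n = 169
```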
General comments
- The required sample size scales inversely with the square of the CI width: to cut the width of the CI in half, you need to quadruple the sample size.
- Positively correlated observations within the same group provide less precision than the same number of independent observations.
- Positively correlated observations across groups provide more precision.
- What power do you use? Most popular are 80% (too low) and 90%. The key is to be able to discriminate between scientifically meaningful hypotheses.

More informationChapter 1 Statistical Inference
Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations
More informationModel Estimation Example
Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions
More informationBayesian Multivariate Logistic Regression
Bayesian Multivariate Logistic Regression Sean M. O Brien and David B. Dunson Biostatistics Branch National Institute of Environmental Health Sciences Research Triangle Park, NC 1 Goals Brief review of
More informationIntroduction to bivariate analysis
Introduction to bivariate analysis When one measurement is made on each observation, univariate analysis is applied. If more than one measurement is made on each observation, multivariate analysis is applied.
More informationDoseresponse modeling with bivariate binary data under model uncertainty
Doseresponse modeling with bivariate binary data under model uncertainty Bernhard Klingenberg 1 1 Department of Mathematics and Statistics, Williams College, Williamstown, MA, 01267 and Institute of Statistics,
More informationANOVA Situation The F Statistic Multiple Comparisons. 1Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College
1Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College An example ANOVA situation Example (Treating Blisters) Subjects: 25 patients with blisters Treatments: Treatment A, Treatment
More informationIntroduction to bivariate analysis
Introduction to bivariate analysis When one measurement is made on each observation, univariate analysis is applied. If more than one measurement is made on each observation, multivariate analysis is applied.
More informationLecture 3: Inference in SLR
Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 31 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals
More informationTwo Sample Problems. Two sample problems
Two Sample Problems Two sample problems The goal of inference is to compare the responses in two groups. Each group is a sample from a different population. The responses in each group are independent
More informationConfidence Intervals with σ unknown
STAT 141 Confidence Intervals and Hypothesis Testing 10/26/04 Today (Chapter 7): CI with σ unknown, tdistribution CI for proportions Two sample CI with σ known or unknown Hypothesis Testing, ztest Confidence
More informationTutorial 2: Power and Sample Size for the Paired Sample ttest
Tutorial 2: Power and Sample Size for the Paired Sample ttest Preface Power is the probability that a study will reject the null hypothesis. The estimated probability is a function of sample size, variability,
More informationModule 22: Bayesian Methods Lecture 9 A: Default prior selection
Module 22: Bayesian Methods Lecture 9 A: Default prior selection Peter Hoff Departments of Statistics and Biostatistics University of Washington Outline Jeffreys prior Unit information priors Empirical
More informationBIAS OF MAXIMUMLIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY
BIAS OF MAXIMUMLIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca LenzTönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1
More information10810: Advanced Algorithms and Models for Computational Biology. Optimal leaf ordering and classification
10810: Advanced Algorithms and Models for Computational Biology Optimal leaf ordering and classification Hierarchical clustering As we mentioned, its one of the most popular methods for clustering gene
More informationGeneralized Linear Modeling  Logistic Regression
1 Generalized Linear Modeling  Logistic Regression Binary outcomes The logit and inverse logit interpreting coefficients and odds ratios Maximum likelihood estimation Problem of separation Evaluating
More information18.05 Final Exam. Good luck! Name. No calculators. Number of problems 16 concept questions, 16 problems, 21 pages
Name No calculators. 18.05 Final Exam Number of problems 16 concept questions, 16 problems, 21 pages Extra paper If you need more space we will provide some blank paper. Indicate clearly that your solution
More informationModel comparison and selection
BS2 Statistical Inference, Lectures 9 and 10, Hilary Term 2008 March 2, 2008 Hypothesis testing Consider two alternative models M 1 = {f (x; θ), θ Θ 1 } and M 2 = {f (x; θ), θ Θ 2 } for a sample (X = x)
More informationChapter 3 ANALYSIS OF RESPONSE PROFILES
Chapter 3 ANALYSIS OF RESPONSE PROFILES 78 31 Introduction In this chapter we present a method for analysing longitudinal data that imposes minimal structure or restrictions on the mean responses over
More informationSolution E[sum of all eleven dice] = E[sum of ten d20] + E[one d6] = 10 * E[one d20] + E[one d6]
Name: SOLUTIONS Midterm (take home version) To help you budget your time, questions are marked with *s. One * indicates a straight forward question testing foundational knowledge. Two ** indicate a more
More informationLogistic regression model for survival time analysis using timevarying coefficients
Logistic regression model for survival time analysis using timevarying coefficients Accepted in American Journal of Mathematical and Management Sciences, 2016 Kenichi SATOH ksatoh@hiroshimau.ac.jp Research
More informationThe University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 7180
The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 7180 71. Decide in each case whether the hypothesis is simple
More informationCategorical Data Analysis Chapter 3
Categorical Data Analysis Chapter 3 The actual coverage probability is usually a bit higher than the nominal level. Confidence intervals for association parameteres Consider the odds ratio in the 2x2 table,
More informationHYPOTHESIS TESTING: FREQUENTIST APPROACH.
HYPOTHESIS TESTING: FREQUENTIST APPROACH. These notes summarize the lectures on (the frequentist approach to) hypothesis testing. You should be familiar with the standard hypothesis testing from previous
More informationPubH 5450 Biostatistics I Prof. Carlin. Lecture 13
PubH 5450 Biostatistics I Prof. Carlin Lecture 13 Outline Outline Sample Size Counts, Rates and Proportions Part I Sample Size Type I Error and Power Type I error rate: probability of rejecting the null
More information10.1. Comparing Two Proportions. Section 10.1
/6/04 0. Comparing Two Proportions Sectio0. Comparing Two Proportions After this section, you should be able to DETERMINE whether the conditions for performing inference are met. CONSTRUCT and INTERPRET
More informationAccounting for Baseline Observations in Randomized Clinical Trials
Accounting for Baseline Observations in Randomized Clinical Trials Scott S Emerson, MD, PhD Department of Biostatistics, University of Washington, Seattle, WA 9895, USA August 5, 0 Abstract In clinical
More informationContents 1. Contents
Contents 1 Contents 1 OneSample Methods 3 1.1 Parametric Methods.................... 4 1.1.1 Onesample Ztest (see Chapter 0.3.1)...... 4 1.1.2 Onesample ttest................. 6 1.1.3 Large sample
More informationECO220Y Review and Introduction to Hypothesis Testing Readings: Chapter 12
ECO220Y Review and Introduction to Hypothesis Testing Readings: Chapter 12 Winter 2012 Lecture 13 (Winter 2011) Estimation Lecture 13 1 / 33 Review of Main Concepts Sampling Distribution of Sample Mean
More informationEstimating the accuracy of a hypothesis Setting. Assume a binary classification setting
Estimating the accuracy of a hypothesis Setting Assume a binary classification setting Assume input/output pairs (x, y) are sampled from an unknown probability distribution D = p(x, y) Train a binary classifier
More informationChapter 2. Binary and Mary Hypothesis Testing 2.1 Introduction (Levy 2.1)
Chapter 2. Binary and Mary Hypothesis Testing 2.1 Introduction (Levy 2.1) Detection problems can usually be casted as binary or Mary hypothesis testing problems. Applications: This chapter: Simple hypothesis
More informationSTAT331. Cox s Proportional Hazards Model
STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations
More informationLecture 8. October 22, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.
Lecture 8 Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University October 22, 2007 1 2 3 4 5 6 1 Define convergent series 2 Define the Law of Large Numbers
More informationTest Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics
Test Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics The candidates for the research course in Statistics will have to take two shortanswer type tests
More informationMCMC algorithms for fitting Bayesian models
MCMC algorithms for fitting Bayesian models p. 1/1 MCMC algorithms for fitting Bayesian models Sudipto Banerjee sudiptob@biostat.umn.edu University of Minnesota MCMC algorithms for fitting Bayesian models
More informationStatistics  Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation
Statistics  Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence
More information1Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College
1Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Spring 2010 The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative
More informationST440/540: Applied Bayesian Statistics. (9) Model selection and goodnessoffit checks
(9) Model selection and goodnessoffit checks Objectives In this module we will study methods for model comparisons and checking for model adequacy For model comparisons there are a finite number of candidate
More information2015 SISG Bayesian Statistics for Genetics R Notes: Generalized Linear Modeling
2015 SISG Bayesian Statistics for Genetics R Notes: Generalized Linear Modeling Jon Wakefield Departments of Statistics and Biostatistics, University of Washington 20150724 Case control example We analyze
More informationParametric Techniques
Parametric Techniques Jason J. Corso SUNY at Buffalo J. Corso (SUNY at Buffalo) Parametric Techniques 1 / 39 Introduction When covering Bayesian Decision Theory, we assumed the full probabilistic structure
More informationUniversity of Oxford. Statistical Methods Autocorrelation. Identification and Estimation
University of Oxford Statistical Methods Autocorrelation Identification and Estimation Dr. Órlaith Burke Michaelmas Term, 2011 Department of Statistics, 1 South Parks Road, Oxford OX1 3TG Contents 1 Model
More informationLECTURE 5. Introduction to Econometrics. Hypothesis testing
LECTURE 5 Introduction to Econometrics Hypothesis testing October 18, 2016 1 / 26 ON TODAY S LECTURE We are going to discuss how hypotheses about coefficients can be tested in regression models We will
More informationProbability theory and inference statistics! Dr. Paola Grosso! SNE research group!! (preferred!)!!
Probability theory and inference statistics Dr. Paola Grosso SNE research group p.grosso@uva.nl paola.grosso@os3.nl (preferred) Roadmap Lecture 1: Monday Sep. 22nd Collecting data Presenting data Descriptive
More informationBayesian Inference for Normal Mean
Al Nosedal. University of Toronto. November 18, 2015 Likelihood of Single Observation The conditional observation distribution of y µ is Normal with mean µ and variance σ 2, which is known. Its density
More informationHypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2)
Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2) B.H. Robbins Scholars Series June 23, 2010 1 / 29 Outline Ztest χ 2 test Confidence Interval Sample size and power Relative effect
More informationPaper Review: Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties by Jianqing Fan and Runze Li (2001)
Paper Review: Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties by Jianqing Fan and Runze Li (2001) Presented by Yang Zhao March 5, 2010 1 / 36 Outlines 2 / 36 Motivation
More informationECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests
ECON4150  Introductory Econometrics Lecture 5: OLS with One Regressor: Hypothesis Tests Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 5 Lecture outline 2 Testing Hypotheses about one
More informationDescribing Contingency tables
Today s topics: Describing Contingency tables 1. Probability structure for contingency tables (distributions, sensitivity/specificity, sampling schemes). 2. Comparing two proportions (relative risk, odds
More informationGeneral Linear Model: Statistical Inference
Chapter 6 General Linear Model: Statistical Inference 6.1 Introduction So far we have discussed formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter 4), least
More informationSTAT5044: Regression and ANOVA, Fall 2011 Final Exam on Dec 14. Your Name:
STAT5044: Regression and ANOVA, Fall 2011 Final Exam on Dec 14 Your Name: Please make sure to specify all of your notations in each problem GOOD LUCK! 1 Problem# 1. Consider the following model, y i =
More informationSTA6938Logistic Regression Model
Dr. Ying Zhang STA6938Logistic Regression Model Topic 2Multiple Logistic Regression Model Outlines:. Model Fitting 2. Statistical Inference for Multiple Logistic Regression Model 3. Interpretation of
More informationAnswers to Problem Set #4
Answers to Problem Set #4 Problems. Suppose that, from a sample of 63 observations, the least squares estimates and the corresponding estimated variance covariance matrix are given by: bβ bβ 2 bβ 3 = 2
More informationAssociation studies and regression
Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration
More informationPolitical Science 236 Hypothesis Testing: Review and Bootstrapping
Political Science 236 Hypothesis Testing: Review and Bootstrapping Rocío Titiunik Fall 2007 1 Hypothesis Testing Definition 1.1 Hypothesis. A hypothesis is a statement about a population parameter The
More information1; (f) H 0 : = 55 db, H 1 : < 55.
Reference: Chapter 8 of J. L. Devore s 8 th Edition By S. Maghsoodloo TESTING a STATISTICAL HYPOTHESIS A statistical hypothesis is an assumption about the frequency function(s) (i.e., pmf or pdf) of one
More informationMaster s Written Examination  Solution
Master s Written Examination  Solution Spring 204 Problem Stat 40 Suppose X and X 2 have the joint pdf f X,X 2 (x, x 2 ) = 2e (x +x 2 ), 0 < x < x 2
More informationPassingBablok Regression for Method Comparison
Chapter 313 PassingBablok Regression for Method Comparison Introduction PassingBablok regression for method comparison is a robust, nonparametric method for fitting a straight line to twodimensional
More informationCompare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method
Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method Yan Wang 1, Michael Ong 2, Honghu Liu 1,2,3 1 Department of Biostatistics, UCLA School
More informationST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples
ST3241 Categorical Data Analysis I Generalized Linear Models Introduction and Some Examples 1 Introduction We have discussed methods for analyzing associations in twoway and threeway tables. Now we will
More informationSAMPLE SIZE ESTIMATION FOR SURVIVAL OUTCOMES IN CLUSTERRANDOMIZED STUDIES WITH SMALL CLUSTER SIZES BIOMETRICS (JUNE 2000)
SAMPLE SIZE ESTIMATION FOR SURVIVAL OUTCOMES IN CLUSTERRANDOMIZED STUDIES WITH SMALL CLUSTER SIZES BIOMETRICS (JUNE 2000) AMITA K. MANATUNGA THE ROLLINS SCHOOL OF PUBLIC HEALTH OF EMORY UNIVERSITY SHANDE
More informationPASS Sample Size Software. Poisson Regression
Chapter 870 Introduction Poisson regression is used when the dependent variable is a count. Following the results of Signorini (99), this procedure calculates power and sample size for testing the hypothesis
More informationLinear Regression With Special Variables
Linear Regression With Special Variables Junhui Qian December 21, 2014 Outline Standardized Scores Quadratic Terms Interaction Terms Binary Explanatory Variables Binary Choice Models Standardized Scores:
More informationRepeated ordinal measurements: a generalised estimating equation approach
Repeated ordinal measurements: a generalised estimating equation approach David Clayton MRC Biostatistics Unit 5, Shaftesbury Road Cambridge CB2 2BW April 7, 1992 Abstract Cumulative logit and related
More informationBayesian Inference: Posterior Intervals
Bayesian Inference: Posterior Intervals Simple values like the posterior mean E[θ X] and posterior variance var[θ X] can be useful in learning about θ. Quantiles of π(θ X) (especially the posterior median)
More informationPubh 8482: Sequential Analysis
Pubh 8482: Sequential Analysis Joseph S. Koopmeiners Division of Biostatistics University of Minnesota Week 12 Review So far... We have discussed the role of phase III clinical trials in drug development
More information1 Comparing two binomials
BST 140.652 Review notes 1 Comparing two binomials 1. Let X Binomial(n 1,p 1 ) and ˆp 1 = X/n 1 2. Let Y Binomial(n 2,p 2 ) and ˆp 2 = Y/n 2 3. We also use the following notation: n 11 = X n 12 = n 1 X
More informationSurvival Regression Models
Survival Regression Models David M. Rocke May 18, 2017 David M. Rocke Survival Regression Models May 18, 2017 1 / 32 Background on the Proportional Hazards Model The exponential distribution has constant
More informationA brief introduction to mixed models
A brief introduction to mixed models University of Gothenburg Gothenburg April 6, 2017 Outline An introduction to mixed models based on a few examples: Definition of standard mixed models. Parameter estimation.
More informationPackage bpp. December 13, 2016
Type Package Package bpp December 13, 2016 Title Computations Around Bayesian Predictive Power Version 1.0.0 Date 20161213 Author Kaspar Rufibach, Paul Jordan, Markus Abt Maintainer Kaspar Rufibach Depends
More informationHypothesis Testing. Econ 690. Purdue University. Justin L. Tobias (Purdue) Testing 1 / 33
Hypothesis Testing Econ 690 Purdue University Justin L. Tobias (Purdue) Testing 1 / 33 Outline 1 Basic Testing Framework 2 Testing with HPD intervals 3 Example 4 Savage Dickey Density Ratio 5 Bartlett
More informationExamples and Limits of the GLM
Examples and Limits of the GLM Chapter 1 1.1 Motivation 1 1.2 A Review of Basic Statistical Ideas 2 1.3 GLM Definition 4 1.4 GLM Examples 4 1.5 Student Goals 5 1.6 Homework Exercises 5 1.1 Motivation In
More informationThe Multilevel Logit Model for Binary Dependent Variables Marco R. Steenbergen
The Multilevel Logit Model for Binary Dependent Variables Marco R. Steenbergen January 2324, 2012 Page 1 Part I The Single Level Logit Model: A Review Motivating Example Imagine we are interested in voting
More informationApplied Regression Analysis
Applied Regression Analysis Chapter 3 Multiple Linear Regression Hongcheng Li April, 6, 2013 Recall simple linear regression 1 Recall simple linear regression 2 Parameter Estimation 3 Interpretations of
More information