TAMS38 Experimental Design and Biostatistics, 4 p / 6 hp Examination on 19 April 2017, 8 12

Similar documents
Contents. TAMS38 - Lecture 8 2 k p fractional factorial design. Lecturer: Zhenxia Liu. Example 0 - continued 4. Example 0 - Glazing ceramic 3

Multiple Regression Examples

Contents. TAMS38 - Lecture 6 Factorial design, Latin Square Design. Lecturer: Zhenxia Liu. Factorial design 3. Complete three factor design 4

Residual Analysis for two-way ANOVA The twoway model with K replicates, including interaction,

Department of Mathematics & Statistics STAT 2593 Final Examination 17 April, 2000

Section IX. Introduction to Logistic Regression for binary outcomes. Poisson regression

assumes a linear relationship between mean of Y and the X s with additive normal errors the errors are assumed to be a sample from N(0, σ 2 )

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

Contents. 2 2 factorial design 4

MAT 2379, Introduction to Biostatistics, Sample Calculator Questions 1. MAT 2379, Introduction to Biostatistics

Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 18.1 Logistic Regression (Dose - Response)

This document contains 3 sets of practice problems.

ACOVA and Interactions

Six Sigma Black Belt Study Guides

SMAM 319 Exam 1 Name. 1.Pick the best choice for the multiple choice questions below (10 points 2 each)

Confidence Interval for the mean response

Contents. 1 Repetitious tasks 3

Correlation & Simple Regression

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples

The Flight of the Space Shuttle Challenger

1 Introduction to Minitab

Question Possible Points Score Total 100

Examination paper for TMA4255 Applied statistics

Stat 231 Final Exam. Consider first only the measurements made on housing number 1.

Sleep data, two drugs Ch13.xls

Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami

STAT 525 Fall Final exam. Tuesday December 14, 2010

Institutionen för matematik och matematisk statistik Umeå universitet November 7, Inlämningsuppgift 3. Mariam Shirdel

Logistic Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University

This gives us an upper and lower bound that capture our population mean.

Chapter 5 Introduction to Factorial Designs Solutions

EXAM IN TMA4255 EXPERIMENTAL DESIGN AND APPLIED STATISTICAL METHODS

STAT 328 (Statistical Packages)

Contents. TAMS38 - Lecture 10 Response surface. Lecturer: Jolanta Pielaszkiewicz. Response surface 3. Response surface, cont. 4

Analysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments.

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

SMAM 314 Exam 42 Name

SMAM 314 Computer Assignment 5 due Nov 8,2012 Data Set 1. For each of the following data sets use Minitab to 1. Make a scatterplot.

Model Building Chap 5 p251

STAT 501 EXAM I NAME Spring 1999

Problem Set 10: Panel Data

McGill University. Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II. Final Examination

Analysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013

Salt Lake Community College MATH 1040 Final Exam Fall Semester 2011 Form E

Basic Statistics Exercises 66

1. An article on peanut butter in Consumer reports reported the following scores for various brands

Introduction to logistic regression

with the usual assumptions about the error term. The two values of X 1 X 2 0 1

Statistics GIDP Ph.D. Qualifying Exam Methodology January 10, 9:00am-1:00pm

20g g g Analyze the residuals from this experiment and comment on the model adequacy.

16.400/453J Human Factors Engineering. Design of Experiments II

Review of One-way Tables and SAS

Confidence Intervals, Testing and ANOVA Summary

Chapter 14. Multiple Regression Models. Multiple Regression Models. Multiple Regression Models

Models with qualitative explanatory variables p216

One-Way Tables and Goodness of Fit

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

Department of Mathematics & Statistics Stat 2593 Final Examination 19 April 2001

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STATISTICS 141 Final Review

Analysis of Variance (ANOVA)

Three-Way Contingency Tables

STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #6

Unit 8: A Mixed Two-Factor Design

Basic Business Statistics, 10/e

Multivariate Data Analysis Notes & Solutions to Exercises 3

General Linear Model (Chapter 4)

Assignment 9 Answer Keys

Lecture 14. Analysis of Variance * Correlation and Regression. The McGraw-Hill Companies, Inc., 2000

Lecture 14. Outline. Outline. Analysis of Variance * Correlation and Regression Analysis of Variance (ANOVA)

Suppose we needed four batches of formaldehyde, and coulddoonly4runsperbatch. Thisisthena2 4 factorial in 2 2 blocks.

Answer Keys to Homework#10

Reference: Chapter 13 of Montgomery (8e)

Data files & analysis PrsnLee.out Ch9.xls

(c) Interpret the estimated effect of temperature on the odds of thermal distress.

Discrete Multivariate Statistics

R Hints for Chapter 10

Two-Way ANOVA. Chapter 15

Intro to Linear Regression

SMAM 319 Exam1 Name. a B.The equation of a line is 3x + y =6. The slope is a. -3 b.3 c.6 d.1/3 e.-1/3

Data Set 8: Laysan Finch Beak Widths

Multiple comparisons - subsequent inferences for two-way ANOVA

Multiple Regression: Chapter 13. July 24, 2015

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

Intro to Linear Regression

Multiple Regression. Inference for Multiple Regression and A Case Study. IPS Chapters 11.1 and W.H. Freeman and Company

171:162 Design and Analysis of Biomedical Studies, Summer 2011 Exam #3, July 16th

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

EXAM # 2. Total 100. Please show all work! Problem Points Grade. STAT 301, Spring 2013 Name

Stat 217 Final Exam. Name: May 1, 2002

Mean Comparisons PLANNED F TESTS

Math 1040 Final Exam Form A Introduction to Statistics Fall Semester 2010

Chapter 5: Logistic Regression-I

Statistics Handbook. All statistical tables were computed by the author.

Unit 11: Multiple Linear Regression

2.830J / 6.780J / ESD.63J Control of Manufacturing Processes (SMA 6303) Spring 2008

STATS216v Introduction to Statistical Learning Stanford University, Summer Midterm Exam (Solutions) Duration: 1 hours

MULTIPLE REGRESSION METHODS

Statistics in medicine

Booklet of Code and Output for STAC32 Final Exam

Transcription:

Kurskod: TAMS38 - Provkod: TEN1 TAMS38 Experimental Design and Biostatistics, 4 p / 6 hp Examination on 19 April 2017, 8 12 The collection of the formulas in mathematical statistics prepared by Department of Mathematics LiU and calculator with empty memory are allowed on the exam. Dictionary English-other language are allowed. No extra notes in the formula collection is allowed. Score limits: 7-9 points gives grade 3, 9.5-12 gives 4 and 12.5-15 gives 5. Examinator: Martin Singull, 013-281447 The result will be normally published via LADOK within 12 working days. Clear answers and justifications are required for each problem. 1) A manufacturer suspects that the batches of raw material furnished by his supplier differ significantly in calcium content. There are a large number of batches currently in the warehouse. Five of these are randomly selected for study. A chemist makes five determinations on each batch and obtains the following data: Batch 1 Batch 2 Batch 3 Batch 4 Batch 5 23.46 23.59 23.51 23.28 23.29 23.48 23.46 23.64 23.40 23.46 23.56 23.42 23.46 23.37 23.37 23.39 23.49 23.52 23.46 23.32 23.40 23.50 23.49 23.39 23.38 ȳ i = 23.458 23.492 23.524 23.380 23.364 s i = 0.0687 0.0630 0.0688 0.0652 0.0650 ȳ = 23.444 a) Write the model and estimate its parameters. b) Is there significant variation in calcium content from batch to batch? Use α = 5%. Remember to state your H 0 and H 1. c) Test if one can assume that Batch 2 and Batch 3 have same variance. Use α = 5%. Remember to state your H 0 and H 1. 2) An industrial engineer is investigating the effect of four assembly methods (A, B, C, D) on the assembly time for a color television component. Four operators are selected for the study. Furthermore, the engineer knows that time for the earlier and later assembly can be different regarding assembly method procedure. To account for this, the engineer uses the Latin Square design shown below. 1

Order of Assembly Operator 1 2 3 4 1 C=10 D=14 A=7 B=8 2 B=7 C=18 D=11 A=8 3 A=5 B=10 C=11 D=9 4 D=10 A=10 B=12 C=14 a) State the Latin Square model used to analyze data. (0.5p) General Linear Model: time versus Method, Operator, Order Analysis of Variance Source DF Adj SS Method 3 72.50 Operator 3 51.50 Order 3 18.50 Error 6 10.50 Total 15 153.00 Method Mean Operator Mean Order Mean 1 7.500 1 8.000 1 9.750 2 9.250 2 13.000 2 11.000 3 13.250 3 10.250 3 8.750 4 11.000 4 9.750 4 11.500 b) Is there a significant difference between assembly methods regarding assembly time? Use α = 5%. Remember to state your H 0 and H 1. c) Use confidence intervals to determinate which operator and method gives the shortest assembling time with α sim 10%. 3) Male residents of some city in aged between 40 and 59 were chosen to investigate probability of developing coronary heart disease. During a 6-year follow-up period 1329 males were classified according to several factors, including blood pressure and whether they developed coronary heart disease. The results and Minitab analysis using Logistic Regression (logit(p) = β 0 + β 1 pressure, where p is probability that subject develops heart disease) for these two variables are given below. Blood pressure Heart disease Total (n) YES NO <117 153 3 156 117-126 235 17 252 127-136 272 12 284 137-146 255 16 271 147-156 127 12 139 157-166 77 8 85 167-186 83 16 99 >186 35 8 43 Total 1237 92 1329 2

MODEL 1 Binary Logistic Regression: Yes versus pressure Response Information Event Variable Value Count Name Yes Event 1237 Event Non-event 92 n Total 1329 Model Summary Coefficients Term Coef SE Coef Constant 3.692 0.270 pressure -0.2694 0.0550 Regression Equation P(Event) = exp(y )/(1 + exp(y )) Y = 3.692-0.2694 pressure Goodness-of-Fit Tests Test DF Chi-Square Deviance 6 6.39 Pearson 6 6.92 Hosmer-Lemeshow 4 6.75 a) Can we claim with probability 95% that logistic model 1 is equally good to the maximal model? Motivate your answer with appropriate test. b) Construct 95% confidence interval for β 1 for results obtained for Model 1. c) One decided to collect the data also for females and add variable Sex into logistic model 2. The results for new data set are given below. Is the new variable Sex significant on α = 5%? Motivate answer using appropriate test. State H 0 and H 1. MODEL 2 Binary Logistic Regression: Yes versus pressure, Sex Coefficients Term Coef SE Coef Constant 2.169 0.156 pressure -0.1055 0.0342 Sex 0.831 0.134 Goodness-of-Fit Tests Test DF Chi-Square P-Value Deviance 13 56.12 0.000 Pearson 13 68.67 0.000 Hosmer-Lemeshow 6 14.87 0.021 4) In the experiment one studied three factors on the growth of a certain type of sulfur bacteria. The cultivation of bacteria occurred during the continuous light and at temperature of 25 30 C. The following factors have been used: Low level High level A (Na 2 S 9H 2 O) 0.5 1.5 B (Na 2 S 2 O 3 ) 1 5 C ( NaHCO 3 ) 1 5 3

where concentrations are given in g/l. The following observations were obtained and analyzed using 2 3 factorial design MTB > copy c1-c8 m1 MTB > trans m1 m2 MTB > copy c9 m3 MTB > mult m2 m3 m4 MTB > copy m4 c10 MTB > set c11 DATA> 1:8 DATA> end MTB > let c12 = c10/8 MTB > Sort c11 c12 c13 c14; SUBC> By c12. MTB > print c13-c14 Data Display Row C13 C14 1 2 0,375 2 8 1,250 3 4 2,625 4 6 3,250 5 7 9,250 6 5 10,750 7 3 11,375 8 1 24,125 MTB > copy c14 c14; SUBC> omit 8. MTB > nscores c14 c15 MTB > plot c15*c14 (1) 15.5 c 14.5 a 7.0 ac 14.0 b 17.0 bc 48.0 ab 14.0 abc 63.0 a) Determine three most significant effects. (0.5p) b) Is it possible to use ANOVA analysis directly instead of given above procedure based on formula ˆξ = 1 8 F y? Motivate your answer with one sentence. (0.5p) c) By only looking at main effects estimate the expected value that gives the best response, i.e, the highest expected value. d) Using results of ANOVA analysis (given below) and a t-interval, determine level for factor A that give highest response with α = 1%. e) Using results of ANOVA analysis (given below) and Tukey-interval method determine level combination for factors B and C that give highest response with α = 5%. f) What is simultaneous confidence level for the two sets of intervals you obtained in d) and e)? (0.5p) 4

ANOVA: Y versus A, B, C Analysis of Variance for Y Source DF SS A 1 1.13 B 1 1035.13 C 1 924.50 B*C 1 684.50 Error 3 152.13 Total 7 2797.38 Means A N Y -1 4 23.750 1 4 24.500 B N Y -1 4 12.750 1 4 35.500 C N Y -1 4 13.375 1 4 34.875 B C N Y -1-1 2 11.250-1 1 2 14.250 1-1 2 15.500 1 1 2 55.500 5