Solutions to Exam in December 2012

Exercise  I.1   I.2   I.3   I.4   II.1  II.2  III.1  III.2  III.3  IV.1
Question  (1)   (2)   (3)   (4)   (5)   (6)   (7)    (8)    (9)    (10)
Answer

Exercise  IV.2  IV.3  IV.4  V.1   V.2   V.3   VI.1   VI.2   VII.1  VII.2
Question  (11)  (12)  (13)  (14)  (15)  (16)  (17)   (18)   (19)   (20)
Answer

Exercise  VII.3 VII.4 VIII.1 VIII.2 IX.1 IX.2 IX.3   X.1    X.2    X.3
Question  (21)  (22)  (23)   (24)   (25) (26) (27)   (28)   (29)   (30)
Answer

Exercise I

On a large, fully automated production plant, items are pushed to a side band at random time points, from which they are automatically fed to a control unit. The production plant is set up in such a way that, on average, 1.6 items per minute are sent to the control unit. Let the random variable X denote the number of items pushed to the side band in 1 minute. It is assumed that X follows a Poisson distribution.

Question I.1 (1)
The probability that more than 5 items arrive at the control unit in a given minute is:

Answer
With λ = 1.6, we find that

$P(X > 5) = 1 - P(X \le 5) = 1 - 0.994 = 0.006$

where 0.994 can be found in the Poisson table (Table 2), or the value can be found in R:

    1-ppois(5,1.6)

Answer option 3: Approximately 0.6%

Question I.2 (2)
The probability that no more than 8 items arrive at the control unit within a 5-minute period is:

Answer
With $\lambda_{5\,\text{minutes}} = 5 \cdot 1.6 = 8$, we find that

$P(X \le 8) = 0.593$

where 0.593 can be found in the Poisson table (Table 2), or the value can be found in R:

    ppois(8,8)

Answer option 1: Approximately 59.3%

Question I.3 (3)
The operators responsible for the control unit believe that the number of items arriving for control is lower than desired. Hence, the number of items arriving in periods of 10 minutes is counted. Eight random periods of 10 minutes are registered. The following data is found:

    16  12  10  15  11  14  9  15

It can now be assumed that a normal distribution, $N(\mu, \sigma^2)$, can be used as a valid approximation of the distribution of the number of items for control during 10 minutes. We want to test the hypothesis (at level α = 0.05)

$H_0: \mu = 16$ ("Correct level")
$H_1: \mu < 16$ ("Too low level")

The result of the study becomes: (Both the conclusion and the argument must be correct)

Answer
The sample mean and sample standard deviation become $\bar{x} = 12.75$ and $s = 2.605$, so the t-test statistic becomes:

$t = \frac{12.75 - 16}{2.605/\sqrt{8}} = -3.53$

The P-value becomes (as it is a left-tailed alternative) $P(T < -3.53)$, where T follows a t(7)-distribution. From Table 4 we can conclude that the P-value is below 0.005, as the $t_{0.005}$ point (= 99.5% percentile) of the t(7)-distribution is 3.499. From R, everything, including the exact P-value, can be found as:

    x=c(16,12,10,15,11,14,9,15)
    mean(x)
    sd(x)
    t_obs=(mean(x)-16)/(sd(x)/sqrt(8))
    t_obs
    pt(t_obs,7)

Or more easily:

    x=c(16,12,10,15,11,14,9,15)
    t.test(x,mu=16,alt="less")

Answer option 5: There is a documented too low level, since the relevant P-value is clearly below the significance level α = 0.05

Question I.4 (4)
The management made a similar investigation but based it on 10 periods of 5 minutes, and got the following counts:

    8  7  5  10  8  7  7  8  9  8

They wish to obtain a 90% confidence interval for µ, the mean number of items in 5 minutes, but WITHOUT assuming that the normal distribution is valid, and run the following in R:

    x=c(8,7,5,10,8,7,7,8,9,8)
    k =
    my_bootstrap_samples = replicate(k, sample(x, replace = TRUE))
    my_bootstrap_means = apply(my_bootstrap_samples, 2, mean)
    quantile(my_bootstrap_means,c(0.001,0.01,0.025,0.05,0.1,0.9,0.95,0.975,0.99,0.995))

From the last line of code they obtain the following percentiles of the bootstrap distribution:

    0.1%  1%  2.5%  5%  10%  90%  95%  97.5%  99%  99.5%

The wanted confidence interval based on this becomes:

Answer
As stated in the note on simulation-based statistics/bootstrapping, the method simply amounts to reading off the relevant percentiles, which for a 90% confidence interval are the 5% and 95% percentiles.

Answer option 2: [7.0; 8.3]

Exercise II

A machine for checking computer chips uses on average 65 milliseconds per check with a standard deviation of 4 milliseconds. A newer machine, potentially to be bought, uses on average 54 milliseconds per check with a standard deviation of 3 milliseconds. It can be assumed that check times are normally distributed and independent.

Question II.1 (5)
The probability that the time saving per check using the new machine is less than 10 milliseconds is:

Answer
Let $X_{old} \sim N(65, 4^2)$ and $X_{new} \sim N(54, 3^2)$. If we let U denote the time saving per check, we have that $U = X_{old} - X_{new}$. We are asked to find:

$P(U < 10) = P\left(Z < \frac{10 - E(U)}{\sqrt{Var(U)}}\right) = P\left(Z < \frac{10 - (65 - 54)}{\sqrt{4^2 + 3^2}}\right) = P\left(Z < \frac{-1}{5}\right) = P(Z < -0.2) = 0.4207$

where the latter can be found from Table 3 or in R:

    z=(10-11)/5
    z
    pnorm(z)

Alternatively in R:

    pnorm(10,mean=65-54,sd=sqrt(16+9))

Answer option 5: Approximately 42%
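The result of Question II.1 can also be checked by simulation. The following is only a sketch: the seed and the number of simulated checks (`n_sim`) are arbitrary choices made for illustration and are not part of the exam solution.

```r
# Monte Carlo check of P(U < 10) for U = X_old - X_new
set.seed(1)                       # arbitrary seed, for reproducibility only
n_sim <- 100000                   # hypothetical number of simulated checks
x_old <- rnorm(n_sim, mean = 65, sd = 4)
x_new <- rnorm(n_sim, mean = 54, sd = 3)
u <- x_old - x_new                # simulated time saving per check
p_sim <- mean(u < 10)             # share of simulated savings below 10 ms
p_exact <- pnorm(10, mean = 65 - 54, sd = sqrt(4^2 + 3^2))
c(simulated = p_sim, exact = p_exact)
```

The simulated proportion should land close to the exact value, which supports the variance rule Var(U) = 4² + 3² used for the difference of two independent variables.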

Question II.2 (6)
The mean (µ) and standard deviation (σ) for the total time used for checking 100 chips on the new machine are:

Answer
Let U be the total time used for checking 100 chips on the new machine, that is:

$U = \sum_{i=1}^{100} X_i$

where $X_i \sim N(54, 3^2)$. Using the basic rules for means and variances (and the independence of the check times), we find that:

$\mu = E(U) = \sum_{i=1}^{100} E(X_i) = \sum_{i=1}^{100} 54 = 100 \cdot 54 = 5400$

and

$\sigma^2 = Var(U) = \sum_{i=1}^{100} Var(X_i) = \sum_{i=1}^{100} 9 = 100 \cdot 9 = 900$

Answer option 2: $\mu = 100 \cdot 54 = 5400$ ms and $\sigma = \sqrt{100 \cdot 9} = 30$ ms

Exercise III

A supermarket has just opened a delicacy department and wants to make its own homemade remoulade (a Danish delicacy consisting of a certain mixture of pickles and dressing). In order to find the best recipe, a taste test was conducted. 4 different kinds of dressing and 3 different types of pickles were used in the test. The taste of the individual remoulade versions was evaluated on a continuous scale from 0 to 5. The following measurement data were found:

                      Dressing type
    Pickles type      A    B    C    D    Row average
    I
    II
    III
    Column average

In an R-run for two-way ANOVA:

    anova(lm(taste~pickles+dressing))

the following output is obtained (however, some of the values have been substituted by the symbols A, B, C, D, E and F):

    > anova(lm(taste~pickles+dressing))
    Analysis of Variance Table

    Response: Taste
              Df  Sum Sq  Mean Sq  F value  Pr(>F)
    Pickles   A                    E
    Dressing  B                    F
    Residuals C   D

Question III.1 (7)
The values of A, B, and C are:

Answer
As is clear from the general definition of the two-way ANOVA table, the degrees of freedom are $r - 1$, $c - 1$ and $(r - 1)(c - 1)$, where r = 3 is the number of rows and c = 4 is the number of columns.

Answer option 3: A = 2, B = 3 and C = 6

Question III.2 (8)
The values of D, E, and F are:

Answer
E and F are the F-statistics, which are:

$F_{Pickles} = \frac{MS_{Pickles}}{MSE}$, $F_{Dressing} = \frac{MS_{Dressing}}{MSE}$

Actually, only one answer option has these two values. D = SSE could be found from the total sum of squares:

$SS(total) = \sum_{i=1}^{3} \sum_{j=1}^{4} (y_{ij} - 2.23)^2$

And then:

$D = SSE = SS(total) - SS(Pickles) - SS(Dressing)$

or, more easily, using that $DF_E = (r - 1)(c - 1) = 6$, and then:

$D = SSE = 6 \cdot MSE = 0.633$

In any case, the answer is:

D = 0.633, E = 1.55 and F =

Question III.3 (9)
With a test level of α = 5% the conclusion of the analysis becomes:

Answer
We look at the P-values in the ANOVA table and observe that the Dressing P-value is BELOW 0.05 and the Pickles P-value is ABOVE 0.05, and hence the answer is:

Answer option 1: Only the choice of the dressing type has a significant influence on the taste

Exercise IV

For the production of brass valves, raw material (brass bars) is received from 2 different suppliers. Samples are taken from the deliveries from each of the two suppliers. The tensile strength of the items is determined, and the following results are found:

Supplier 1: $n_1 = 15$, $\bar{x}_1 = 223.5$ N/mm², $s_1 = 7.23$ N/mm²
Supplier 2: $n_2 = 20$, $\bar{x}_2 = 220.4$ N/mm², $s_2 = 4.49$ N/mm²

As a potential help, the following four R-commands:

    round(qf(c(0.001,0.01,0.025,0.05,0.1,0.9,0.95,0.975,0.99,0.995),14,19),3)
    round(qf(c(0.001,0.01,0.025,0.05,0.1,0.9,0.95,0.975,0.99,0.995),15,20),3)
    round(qf(c(0.001,0.01,0.025,0.05,0.1,0.9,0.95,0.975,0.99,0.995),19,14),3)
    round(qf(c(0.001,0.01,0.025,0.05,0.1,0.9,0.95,0.975,0.99,0.995),20,15),3)

give a number of percentiles (rounded to 3 decimals) for four different F-distributions. The results of these are shown in the R output window as follows:

    > round(qf(c(0.001,0.01,0.025,0.05,0.1,0.9,0.95,0.975,0.99,0.995),14,19),3)
    [1]
    > round(qf(c(0.001,0.01,0.025,0.05,0.1,0.9,0.95,0.975,0.99,0.995),15,20),3)
    [1]
    > round(qf(c(0.001,0.01,0.025,0.05,0.1,0.9,0.95,0.975,0.99,0.995),19,14),3)
    [1]
    > round(qf(c(0.001,0.01,0.025,0.05,0.1,0.9,0.95,0.975,0.99,0.995),20,15),3)
    [1]

Question IV.1 (10)
With a significance level of α = 5% we cannot conclude any difference between the two variances for the two suppliers, since:

Answer
The proper test statistic for comparing two variances is the larger variance divided by the smaller one:

$F = \frac{7.23^2}{4.49^2} = 2.59$

The proper distribution to use for the test is the F(14, 19)-distribution. As nothing points towards a one-sided test, we should apply a two-sided test, and the critical value hence becomes

$F_{\alpha/2}(n_1 - 1, n_2 - 1) = F_{0.025}(14, 19)$

(This value, by the way, cannot be found in Table 6, but we do not need it here.) It may be found in R as:

    qf(0.975,14,19)

So the answer is (since no other option uses the correct degrees of freedom):

Answer option 3: $F_{0.025}(14, 19) > 7.23^2/4.49^2 = 2.59$

Question IV.2 (11)
The following hypothesis is to be tested, and it is now assumed that $\sigma_1^2 = \sigma_2^2$:

$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \ne \mu_2$

With a significance level of α = 5% we cannot conclude any difference between the two means for the two suppliers, since the t-statistic and P-value for this test become: (both must be correct)

Answer
The standard pooled t-test for this situation uses the pooled variance estimate:

$s_p^2 = \frac{14 \cdot 7.23^2 + 19 \cdot 4.49^2}{33} = 33.78$

And the t-statistic hence becomes:

$t = \frac{223.5 - 220.4}{\sqrt{33.78 \cdot (1/15 + 1/20)}} = 1.56$

From the t-distribution Table 4 (with ν = 33) we can observe that the P-value is larger than 0.10 (and smaller than 0.20), since 1.56 is between $t_{0.10}$ and $t_{0.05}$ and we are doing a two-sided test (so the tail probability should be multiplied by two to give the proper P-value). In R we could do it by the following:

    s1=7.23
    s2=4.49
    sp=sqrt((14*s1^2+19*s2^2)/33)
    t=(223.5-220.4)/(sp*sqrt(1/15 + 1/20))
    t
    2*(1-pt(t,33))

In both cases the P-value is found to be larger than 0.05, so the answer is:

Answer option 2: t = 1.56 and P-value >
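The pooled t-test of Question IV.2 can be wrapped in a small helper that works directly from summary statistics. This is a sketch only: the function name `pooled_t_summary` and its argument names are hypothetical and are not part of the exam material or of base R.

```r
# Pooled two-sample t-test from summary statistics (equal variances assumed).
# pooled_t_summary is a hypothetical helper name used for illustration.
pooled_t_summary <- function(m1, s1, n1, m2, s2, n2) {
  df <- n1 + n2 - 2
  sp2 <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df    # pooled variance
  t_obs <- (m1 - m2) / sqrt(sp2 * (1 / n1 + 1 / n2))  # test statistic
  p_two_sided <- 2 * pt(abs(t_obs), df, lower.tail = FALSE)
  list(sp2 = sp2, t = t_obs, df = df, p_value = p_two_sided)
}

# Supplier 1 versus Supplier 2, using the figures given in Exercise IV
res <- pooled_t_summary(223.5, 7.23, 15, 220.4, 4.49, 20)
res
```

Using `abs(t_obs)` with `lower.tail = FALSE` keeps the two-sided P-value correct regardless of which supplier is entered first.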

Question IV.3 (12)
As there is no difference in the mean level for the two suppliers, the data can be joined together and a 99% confidence interval for the mean can be found as: (Some summary statistics for the joint data set: n = 35, $\bar{x} = 221.7$, s = 5.93)

Answer
We use the standard procedure for the one-sample confidence interval:

$\bar{x} \pm t_{\alpha/2}(n - 1) \cdot \frac{s}{\sqrt{n}}$

which is, since α = 0.01:

$221.7 \pm t_{0.005}(34) \cdot \frac{5.93}{\sqrt{35}}$

Question IV.4 (13)
A study of a new supplier is planned. It is expected that the standard deviation for this supplier will be approximately 6, that is, σ = 6 N/mm². A 99% confidence interval for the mean in this new study is required to have a width of ±1 N/mm². How many items must be sampled to achieve this?

Answer
We use the one-sample confidence interval sample size formula:

$n = \left(\frac{z_{\alpha/2} \cdot \sigma}{E}\right)^2 = \left(\frac{2.576 \cdot 6}{1}\right)^2 \approx 239$

The value $z_{0.005} = 2.576$ is found in Table 3 or in R as:

    qnorm(0.995)

Answer option 4: $\left(\frac{2.576 \cdot 6}{1}\right)^2$

Exercise V

When brass is used in a production, the modulus of elasticity, E, of the material is often important for the functionality. The modulus of elasticity is measured for 6 different brass alloys. 5 samples from each alloy are tested. The results are shown in the table below, where the measured modulus of elasticity is given in GPa:

    Brass alloys
    M1   M2   M3   M4   M5   M6

In an R-run for one-way analysis of variance:

    anova(lm(elasmodul~alloy))

the following output is obtained (however, some of the values have been substituted by the symbols A, B, and C):

    > anova(lm(elasmodul~alloy))
    Analysis of Variance Table

    Response: ElasModul
              Df  Sum Sq  Mean Sq  F value  Pr(>F)
    Alloy     A                             e-05
    Residuals B   C

Question V.1 (14)
The values of A, B, and C are:

Answer
A and B are the degrees of freedom, which in the one-way ANOVA are $k - 1$ and $N - k$, where k = 6 is the number of groups and N = 30 is the number of observations. This is all that is needed to answer the question, but C could be found as:

$C = SSE = MSE \cdot DF_E$

A = 5, B = 24 and C =

Question V.2 (15)
The assumptions for using the one-way analysis of variance are: (Choose the answer that most correctly lists all the assumptions and does NOT list any unnecessary assumptions)

Answer
It is difficult to give many arguments here, other than to emphasize that only answer 1 gives all the assumptions without any unnecessary ones.

Answer option 1: The data must be normally distributed within each group and independent, and the variances within each group should not differ significantly from each other

Question V.3 (16)
A 95% confidence interval for the difference between brass alloy 1 and 2 becomes:

Answer
A post-hoc 95% confidence interval between two groups in a one-way ANOVA is:

$\bar{y}_1 - \bar{y}_2 \pm t_{\alpha/2} \sqrt{MSE \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$

So we have to compute the means of the M1 and M2 groups, $\bar{y}_1$ and $\bar{y}_2$, whose difference is 3.48 (or accept that it can only be 3.48), and then plug in the MSE and $n_1 = n_2 = 5$. So the answer is:

$3.48 \pm t_{0.025}(24) \sqrt{MSE \cdot \frac{2}{5}}$

Exercise VI

It is a common conjecture that a student's perception of the quality of teaching in a particular discipline is related to the student's level in the subject. To investigate whether this is true, the following data were collected:

    Grade group   Course Evaluation
                  GOOD    MIDDLE   BAD
    HIGH          22.4%   7.2%     4%
    MIDDLE        18.4%   8.8%     11.2%
    LOW           11.2%   5.6%     11.2%

There are 125 students in the table above.

Question VI.1 (17)
To investigate whether the conjecture is valid, the following statistic should be calculated:

Answer
The actual 3-by-3 contingency table comes from multiplying the percentages by 125:

    o_ij          Course Evaluation
                  GOOD    MIDDLE   BAD
    HIGH          28      9        5
    MIDDLE        23      11       14
    LOW           14      7        14

Now one could compute all nine expected values for this 3-by-3 table as (row total)(column total)/125, for instance:

$e_{11} = \frac{(28 + 9 + 5)(28 + 23 + 14)}{125} = \frac{42 \cdot 65}{125} = 21.84$

In R these could e.g. be found as:

    X=t(matrix(125*c(22.4,7.2,4, 18.4,8.8,11.2, 11.2,5.6,11.2)/100,ncol=3))
    round(chisq.test(X)$expected,2)

But having found just one of them makes it clear that only answer 1 can provide the correct answer, as the χ²-statistic has the form:

$\chi^2 = \sum_{i=1}^{3} \sum_{j=1}^{3} \frac{(o_{ij} - e_{ij})^2}{e_{ij}}$

Answer option 1:

$\frac{(28 - 21.84)^2}{21.84} + \frac{(9 - 9.07)^2}{9.07} + \frac{(5 - 11.09)^2}{11.09} + \frac{(23 - 24.96)^2}{24.96} + \frac{(11 - 10.37)^2}{10.37} + \frac{(14 - 12.67)^2}{12.67} + \frac{(14 - 18.20)^2}{18.20} + \frac{(7 - 7.56)^2}{7.56} + \frac{(14 - 9.24)^2}{9.24}$

Question VI.2 (18)
The number of degrees of freedom (DF) and the critical value (Q) of the relevant test at a 5% level are:

Answer
This is a χ²-test for an r-by-c table, where the degrees of freedom are $(r - 1)(c - 1) = 2 \cdot 2 = 4$, and $\chi^2_{0.05}(4) = 9.488$ (to be found in Table 5 with ν = 4) or in R as:

    qchisq(0.95,4)

Answer option 4: DF = 4 and Q = 9.49

Exercise VII

In a specific education programme it was decided to introduce a project, running through the course period, as a part of the grade point evaluation. In order to assess whether it has changed the percentage of students passing the course, the following data was collected:

                                    Before introduction   After introduction
                                    of project            of project
    Number of students evaluated    50                    24
    Number of students failed       13                    3
    Average grade point x̄
    Sample standard deviation s

Let $p_{Before}$ be the proportion failing the course before the introduction of the project and $p_{After}$ the corresponding proportion after the introduction of the project.

Question VII.1 (19)
If the following hypothesis is tested:

$H_0: p_{Before} = p_{After}$
$H_1: p_{Before} > p_{After}$

a valid test statistic u, the corresponding P-value and a valid conclusion become: (both values and the conclusion must be correct)

Answer
For this particular test we have two different (but similar) options. One would be the χ²-test for a 2-by-2 frequency/contingency table; the other is a z-test giving a direct comparison of the two proportions:

$u = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})(1/50 + 1/24)}}$

where $\hat{p}_1 = 13/50$, $\hat{p}_2 = 3/24$ and $\hat{p} = 16/74$, so:

$u = \frac{13/50 - 3/24}{\sqrt{(16/74)(58/74)(1/50 + 1/24)}} = 1.32$

And the (one-tailed) P-value becomes (from Table 3) or from R:

$P(Z > 1.32) = 0.093$

    1-pnorm(1.32)

This means that we must accept the null hypothesis, that is, we cannot reject it at e.g. the 5% level.

Answer option 3: u = 1.32 and P-value = 0.093. On a 5% level a drop in failing percentage cannot be documented

Question VII.2 (20)
As it is assumed that the grades are approximately normally distributed in each group, and that the variances in the two groups do not differ significantly from each other, the following hypothesis is tested:

$H_0: \mu_{Before} = \mu_{After}$
$H_1: \mu_{Before} < \mu_{After}$

The test statistic, the P-value and the conclusion for this test become: (both values and the conclusion must be correct)

Answer
The standard pooled t-test for this situation uses the pooled variance estimate:

$s_p^2 = \frac{49 \cdot s_{Before}^2 + 23 \cdot s_{After}^2}{72} = 2.088^2$

And the t-statistic hence becomes:

$t = \frac{\bar{x}_{Before} - \bar{x}_{After}}{2.088 \sqrt{1/50 + 1/24}} = -1.842$

From the t-distribution Table 4 (with ν = 72) we can observe that the (one-tailed) P-value is between 0.025 and 0.05, since 1.842 is between $t_{0.05}$ and $t_{0.025}$. In R we could do it by the following:

    t=( )/(2.088*sqrt(1/50+1/24))
    t
    pt(t,72)

Giving a P-value of 0.035.

t = -1.842, P-value = 0.035: On a 5% level an increase in grade point average can be documented

Question VII.3 (21)
A 95% confidence interval for the grade point standard deviation after the introduction of the project becomes:

Answer
The confidence interval formula for a sample variance is used WITH the square root applied to everything:

$\sqrt{\frac{(n - 1) s^2}{\chi^2_{0.025}}} < \sigma < \sqrt{\frac{(n - 1) s^2}{\chi^2_{0.975}}}$

which with n = 24 gives:

$\sqrt{\frac{23 \cdot s_{After}^2}{\chi^2_{0.025}(23)}} < \sigma_{After} < \sqrt{\frac{23 \cdot s_{After}^2}{\chi^2_{0.975}(23)}}$

Question VII.4 (22)
The critical value for the following hypothesis test for the grade point variance before the introduction of the project, $\sigma^2_{Before}$:

$H_0: \sigma^2_{Before} = 2^2$

$H_1: \sigma^2_{Before} > 2^2$

becomes at level 1% (α = 0.01):

Answer
The test for comparing a variance with a specific value is a χ²-test with $\nu = DF = n - 1 = 49$. So the critical value for this ONE-sided hypothesis is $\chi^2_{0.01}(49)$. In R this can be found as:

    qchisq(0.99,49)

Without using R we can use Table 5. In the table, the $\chi^2_{0.01}(50)$ value can be read off to be 76.154 and the $\chi^2_{0.01}(40)$ value can be read off to be 63.691, so by linear interpolation the only possible answer is approximately 74.9.

Exercise VIII

A company manufactures an electronic device to be used in a very wide temperature range. The company knows that increased temperature shortens the life time of the device, and a study is therefore performed in which the life time is determined as a function of temperature. The following data is found:

    Temperature in Celsius (t)   10   20   30   40   50   60   70   80   90
    Life time in hours (y)       420  365  285  220  176  117  69   34   5

The following is run in R:

    t=c(10,20,30,40,50,60,70,80,90)
    y=c(420,365,285,220,176,117,69,34,5)
    summary(lm(y~t))

with the results:

    Call:
    lm(formula = y ~ t)

    Residuals:
    Min 1Q Median 3Q Max

    Coefficients:

                Estimate Std. Error t value Pr(>|t|)
    (Intercept)                             e-09 ***
    t                                       1.51e-07 ***
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: on 7 degrees of freedom
    Multiple R-squared: 0.984, Adjusted R-squared:
    F-statistic: on 1 and 7 DF, p-value: 1.505e-07

Question VIII.1 (23)
A 95% confidence interval for the slope in the regression model underlying the R-run above, which expresses the life time as a linear function of the temperature, becomes:

Answer
Either one could do all the regression computations to find $b = S_{ty}/S_{tt} = -31880/6000 = -5.313$ and then subsequently use the formula for the confidence interval for β:

$b \pm t_{\alpha/2} \cdot s_e \sqrt{\frac{1}{S_{xx}}}$

Or one could use the information in the R-output, where what is known as the standard error for the slope, $s_e \sqrt{1/S_{xx}}$, can be read off directly from the "Std. Error" column. And $t_{0.025}(7) = 2.365$ from Table 4 or in R:

    qt(.975,7)

So the interval is $-5.313 \pm 2.365 \cdot s_e \sqrt{1/S_{xx}}$.

Question VIII.2 (24)
Can a relation between temperature and life time be documented at level 5%? (Both the conclusion and the argument must be correct)

Answer
We look at the P-value in the slope row of the output: 1.51e-07 = 0.000000151.

Yes, as the relevant P-value is 0.000000151, which is clearly smaller than 0.05
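The slope interval of Question VIII.1 can be reproduced from the data given in Exercise VIII. The sketch below renames the variables to `temp` and `life` (a choice made here, to avoid masking R's built-in `t()` function) and compares the hand formula with the built-in `confint()`.

```r
# Data from Exercise VIII
temp <- c(10, 20, 30, 40, 50, 60, 70, 80, 90)
life <- c(420, 365, 285, 220, 176, 117, 69, 34, 5)

fit <- lm(life ~ temp)

# Slope estimate and its standard error, read from the coefficient table
b <- coef(fit)["temp"]
se_b <- summary(fit)$coefficients["temp", "Std. Error"]

# Hand formula: b +/- t_{0.025}(7) * SE(b)
ci_manual <- b + c(-1, 1) * qt(0.975, df = 7) * se_b

# Built-in equivalent
ci_builtin <- confint(fit, "temp", level = 0.95)

ci_manual
ci_builtin
```

Both computations should agree, and the whole interval lies below zero, in line with the conclusion of Question VIII.2.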

Exercise IX

A consumer study has shown that in many supermarkets there are discrepancies between the receipt and the price on the shelf. The manager of a supermarket wants to keep track of the error percentage, and therefore introduces the following check: During the day, 40 different items are sampled at random, and for these items it is checked whether the receipt and the price on the shelf match. The manager defines the situation as in control if no more than 1 mislabeled item is found among the 40 items. The probability that the situation is found to be in control, if the real percentage of mislabeled items is 1%, is called A. The probability that the situation is found to be in control, if the real percentage of mislabeled items is 10%, is called B.

Question IX.1 (25)
The values of A and B are:

Answer
For A we use the binomial distribution with p = 0.01 and n = 40:

$A = P(X \le 1)$, $X \sim b(x; 40, 0.01)$

For this p = 0.01 we cannot use Table 1, so we use hand calculation of the two binomial probabilities:

$A = P(X = 0) + P(X = 1) = 0.99^{40} + 40 \cdot 0.01 \cdot 0.99^{39} = 0.669 + 0.270 = 0.939$

For B we use the binomial distribution with p = 0.10 and n = 40:

$B = P(X \le 1)$, $X \sim b(x; 40, 0.1)$

For this combination of p = 0.1 and n = 40 we cannot use Table 1 either, so we again use hand calculation of the two binomial probabilities:

$B = P(X = 0) + P(X = 1) = 0.90^{40} + 40 \cdot 0.1 \cdot 0.90^{39} = 0.0148 + 0.0657 = 0.0805$

In R, A could be found by any of the following three computations:

    pbinom(1,40,0.01)
    dbinom(0,40,0.01)+dbinom(1,40,0.01)
    0.99^40+40*0.01*0.99^39

In R, B could be found by any of the following three computations:

    pbinom(1,40,0.10)
    dbinom(0,40,0.10)+dbinom(1,40,0.10)
    0.90^40+40*0.1*0.90^39

Answer option 1: A = 0.939 and B = 0.080

Question IX.2 (26)
As an additional check, on a given day a total of 120 different items were checked, and out of these 15 mislabeled items were observed. A 95% confidence interval for the proportion of mislabeled items becomes:

Answer
The standard one-proportion confidence interval is used (and this is OK as both np and n(1 - p) are at least 15):

$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$

which becomes:

$\frac{15}{120} \pm 1.96 \sqrt{\frac{(15/120)(105/120)}{120}} = 0.125 \pm 0.059$

Question IX.3 (27)
We now wish to determine the proportion of mislabeled items in a particular store with a precision such that a 90% confidence interval will be of the width plus/minus 0.02. It is expected the proportion will be in the order of

How many items should approximately be checked in order to achieve such precision?

Answer
We use the sample size formula for the proportion confidence interval using a guess of the true value:

$n = p(1 - p) \left(\frac{z_{\alpha/2}}{E}\right)^2 = p(1 - p) \cdot (1.645/0.02)^2$

as α = 0.10 and $z_{0.05} = 1.645$ from Table 3 or:

    qnorm(0.95)
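The two proportion formulas used in Questions IX.2 and IX.3 can be sketched as small R helpers. The function names `wald_ci` and `n_for_proportion` are hypothetical. Because the expected proportion in Question IX.3 is not legible in this transcription, the usage line below uses p = 0.5, the most conservative guess, purely as an assumed illustration; it is not the exam's value.

```r
# Wald confidence interval for one proportion (hypothetical helper name)
wald_ci <- function(x, n, level = 0.95) {
  p_hat <- x / n
  z <- qnorm(1 - (1 - level) / 2)
  half_width <- z * sqrt(p_hat * (1 - p_hat) / n)
  c(estimate = p_hat, lower = p_hat - half_width, upper = p_hat + half_width)
}

# Sample size for a target margin of error E (hypothetical helper name)
n_for_proportion <- function(p_guess, E, level) {
  z <- qnorm(1 - (1 - level) / 2)
  ceiling(p_guess * (1 - p_guess) * (z / E)^2)
}

# Question IX.2: 15 mislabeled items out of 120
ci <- wald_ci(15, 120)
ci

# Question IX.3 setup (90% interval, E = 0.02) with an ASSUMED p = 0.5
n_conservative <- n_for_proportion(p_guess = 0.5, E = 0.02, level = 0.90)
n_conservative
```

Since p(1 - p) is largest at p = 0.5, any smaller guessed proportion gives a smaller required sample size than `n_conservative`.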

Exercise X

The yield Y of a chemical process is a random variable whose value is considered to be a linear function of the temperature X. The following data of corresponding values of x and y is found:

    Temperature in Celsius (x)   0    25   50   75   100
    Yield in grams (y)           14   38   54   76   95

The average and standard deviation of Temperature (x) and Yield (y) are: $\bar{x} = 50$, $s_x = 39.53$, $\bar{y} = 55.4$, $s_y = 31.67$, and further it can be used that $S_{xy} = 5000$. In the exercise the usual linear regression model is used:

$Y_i = \alpha + \beta x_i + \varepsilon_i$, $\varepsilon_i \sim N(0, \sigma^2)$, $i = 1, \ldots, 5$

Question X.1 (28)
Can a significant relationship between yield and temperature be documented? (Both the conclusion and the argument must be correct)

Answer
It could most easily be solved by running the regression in R as:

    x=c(0,25,50,75,100)
    y=c(14,38,54,76,95)
    summary(lm(y~x))

with the results:

    Call:
    lm(formula = y ~ x)

    Residuals:

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)                                  **
    x                                   6.27e-05 ***
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: on 3 degrees of freedom
    Multiple R-squared: , Adjusted R-squared:
    F-statistic: 1071 on 1 and 3 DF, p-value: 6.267e-05

Alternatively one could use hand calculations and the formula for the t-test of the hypothesis $H_0: \beta = 0$. The relevant test statistic and P-value can be read off in the R-output as 32.73 and 6.27e-05.

Yes, as the relevant test statistic and P-value are 32.73 and 6.27e-05, respectively

Question X.2 (29)
Give the 95% confidence interval for the expected yield at a temperature of $x_0 = 80$ degrees Celsius:

Answer
We use the formula for the confidence limits of a point on the line:

$(a + b x_0) \pm t_{\alpha/2} \cdot s_e \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}$

And we have to compute a, b and $s_e$ either by hand or in R as above: a = 15.4, b = 0.8, $s_e = 1.932$. So the confidence interval becomes

$(15.4 + 0.8 \cdot 80) \pm 3.182 \cdot 1.932 \sqrt{\frac{1}{5} + \frac{(80 - 50)^2}{6250}} = 79.4 \pm 3.61$

since $S_{xx} = 4 \cdot s_x^2 = 6250$ and $t_{0.025}(3) = 3.182$.

Question X.3 (30)
The five residuals become: -1.4, 2.6, -1.4, 0.6 and -0.4. What is the upper quartile of the residuals?

Answer
We use the basic definition of finding a percentile (from Chapter 2), with n = 5 and p = 0.75, so np = 3.75. So the upper quartile is the 4th observation in the ordered sequence -1.4, -1.4, -0.4, 0.6, 2.6, that is, 0.6.
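The interval in Question X.2 and the quartile in Question X.3 can be checked in R from the data of Exercise X. This is a sketch: `predict()` with `interval = "confidence"` is used in place of the hand formula, and the quartile is taken with the "round np up" rule described in the answer rather than with R's default `quantile()` method.

```r
# Data from Exercise X
x <- c(0, 25, 50, 75, 100)
y <- c(14, 38, 54, 76, 95)
fit <- lm(y ~ x)

# 95% confidence interval for the expected yield at x0 = 80
pred <- predict(fit, newdata = data.frame(x = 80),
                interval = "confidence", level = 0.95)
half_width <- pred[1, "upr"] - pred[1, "fit"]
pred
half_width

# Upper quartile of the residuals: np = 5 * 0.75 = 3.75, so take the
# 4th value of the ordered residuals
res_sorted <- sort(unname(residuals(fit)))
upper_quartile <- res_sorted[ceiling(5 * 0.75)]
upper_quartile
```

Note that `interval = "confidence"` targets the mean response on the line; `interval = "prediction"` would give the wider interval for a single new observation.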


No other aids are allowed. For example you are not allowed to have any other textbook or past exams. UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Sample Exam Note: This is one of our past exams, In fact the only past exam with R. Before that we were using SAS. In

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there

More information

SCHOOL OF MATHEMATICS AND STATISTICS

SCHOOL OF MATHEMATICS AND STATISTICS RESTRICTED OPEN BOOK EXAMINATION (Not to be removed from the examination hall) Data provided: Statistics Tables by H.R. Neave MAS5052 SCHOOL OF MATHEMATICS AND STATISTICS Basic Statistics Spring Semester

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 15: Examples of hypothesis tests (v5) Ramesh Johari ramesh.johari@stanford.edu 1 / 32 The recipe 2 / 32 The hypothesis testing recipe In this lecture we repeatedly apply the

More information

Linear Regression. In this lecture we will study a particular type of regression model: the linear regression model

Linear Regression. In this lecture we will study a particular type of regression model: the linear regression model 1 Linear Regression 2 Linear Regression In this lecture we will study a particular type of regression model: the linear regression model We will first consider the case of the model with one predictor

More information

Lecture 14. Analysis of Variance * Correlation and Regression. The McGraw-Hill Companies, Inc., 2000

Lecture 14. Analysis of Variance * Correlation and Regression. The McGraw-Hill Companies, Inc., 2000 Lecture 14 Analysis of Variance * Correlation and Regression Outline Analysis of Variance (ANOVA) 11-1 Introduction 11-2 Scatter Plots 11-3 Correlation 11-4 Regression Outline 11-5 Coefficient of Determination

More information

CHAPTER EIGHT Linear Regression

CHAPTER EIGHT Linear Regression 7 CHAPTER EIGHT Linear Regression 8. Scatter Diagram Example 8. A chemical engineer is investigating the effect of process operating temperature ( x ) on product yield ( y ). The study results in the following

More information

Lecture 14. Outline. Outline. Analysis of Variance * Correlation and Regression Analysis of Variance (ANOVA)

Lecture 14. Outline. Outline. Analysis of Variance * Correlation and Regression Analysis of Variance (ANOVA) Outline Lecture 14 Analysis of Variance * Correlation and Regression Analysis of Variance (ANOVA) 11-1 Introduction 11- Scatter Plots 11-3 Correlation 11-4 Regression Outline 11-5 Coefficient of Determination

More information

STAT Exam Jam Solutions. Contents

STAT Exam Jam Solutions. Contents s Contents 1 First Day 2 Question 1: PDFs, CDFs, and Finding E(X), V (X).......................... 2 Question 2: Bayesian Inference...................................... 3 Question 3: Binomial to Normal

More information

Chapter 10. Correlation and Regression. McGraw-Hill, Bluman, 7th ed., Chapter 10 1

Chapter 10. Correlation and Regression. McGraw-Hill, Bluman, 7th ed., Chapter 10 1 Chapter 10 Correlation and Regression McGraw-Hill, Bluman, 7th ed., Chapter 10 1 Chapter 10 Overview Introduction 10-1 Scatter Plots and Correlation 10- Regression 10-3 Coefficient of Determination and

More information

Biostatistics for physicists fall Correlation Linear regression Analysis of variance

Biostatistics for physicists fall Correlation Linear regression Analysis of variance Biostatistics for physicists fall 2015 Correlation Linear regression Analysis of variance Correlation Example: Antibody level on 38 newborns and their mothers There is a positive correlation in antibody

More information

STATISTICS 141 Final Review

STATISTICS 141 Final Review STATISTICS 141 Final Review Bin Zou bzou@ualberta.ca Department of Mathematical & Statistical Sciences University of Alberta Winter 2015 Bin Zou (bzou@ualberta.ca) STAT 141 Final Review Winter 2015 1 /

More information

Inference with Simple Regression

Inference with Simple Regression 1 Introduction Inference with Simple Regression Alan B. Gelder 06E:071, The University of Iowa 1 Moving to infinite means: In this course we have seen one-mean problems, twomean problems, and problems

More information

Lecture 15. Hypothesis testing in the linear model

Lecture 15. Hypothesis testing in the linear model 14. Lecture 15. Hypothesis testing in the linear model Lecture 15. Hypothesis testing in the linear model 1 (1 1) Preliminary lemma 15. Hypothesis testing in the linear model 15.1. Preliminary lemma Lemma

More information

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number

More information

FRANKLIN UNIVERSITY PROFICIENCY EXAM (FUPE) STUDY GUIDE

FRANKLIN UNIVERSITY PROFICIENCY EXAM (FUPE) STUDY GUIDE FRANKLIN UNIVERSITY PROFICIENCY EXAM (FUPE) STUDY GUIDE Course Title: Probability and Statistics (MATH 80) Recommended Textbook(s): Number & Type of Questions: Probability and Statistics for Engineers

More information

Concordia University (5+5)Q 1.

Concordia University (5+5)Q 1. (5+5)Q 1. Concordia University Department of Mathematics and Statistics Course Number Section Statistics 360/1 40 Examination Date Time Pages Mid Term Test May 26, 2004 Two Hours 3 Instructor Course Examiner

More information

SIMPLE REGRESSION ANALYSIS. Business Statistics

SIMPLE REGRESSION ANALYSIS. Business Statistics SIMPLE REGRESSION ANALYSIS Business Statistics CONTENTS Ordinary least squares (recap for some) Statistical formulation of the regression model Assessing the regression model Testing the regression coefficients

More information

χ test statistics of 2.5? χ we see that: χ indicate agreement between the two sets of frequencies.

χ test statistics of 2.5? χ we see that: χ indicate agreement between the two sets of frequencies. I. T or F. (1 points each) 1. The χ -distribution is symmetric. F. The χ may be negative, zero, or positive F 3. The chi-square distribution is skewed to the right. T 4. The observed frequency of a cell

More information

Figure 1: The fitted line using the shipment route-number of ampules data. STAT5044: Regression and ANOVA The Solution of Homework #2 Inyoung Kim

Figure 1: The fitted line using the shipment route-number of ampules data. STAT5044: Regression and ANOVA The Solution of Homework #2 Inyoung Kim 0.0 1.0 1.5 2.0 2.5 3.0 8 10 12 14 16 18 20 22 y x Figure 1: The fitted line using the shipment route-number of ampules data STAT5044: Regression and ANOVA The Solution of Homework #2 Inyoung Kim Problem#

More information

This does not cover everything on the final. Look at the posted practice problems for other topics.

This does not cover everything on the final. Look at the posted practice problems for other topics. Class 7: Review Problems for Final Exam 8.5 Spring 7 This does not cover everything on the final. Look at the posted practice problems for other topics. To save time in class: set up, but do not carry

More information

ST430 Exam 1 with Answers

ST430 Exam 1 with Answers ST430 Exam 1 with Answers Date: October 5, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textook are permitted but you may use a calculator.

More information

STA 2101/442 Assignment 3 1

STA 2101/442 Assignment 3 1 STA 2101/442 Assignment 3 1 These questions are practice for the midterm and final exam, and are not to be handed in. 1. Suppose X 1,..., X n are a random sample from a distribution with mean µ and variance

More information

Mathematical Notation Math Introduction to Applied Statistics

Mathematical Notation Math Introduction to Applied Statistics Mathematical Notation Math 113 - Introduction to Applied Statistics Name : Use Word or WordPerfect to recreate the following documents. Each article is worth 10 points and should be emailed to the instructor

More information

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Tuesday, January 17, 2017 Work all problems 60 points are needed to pass at the Masters Level and 75

More information

Regression, Part I. - In correlation, it would be irrelevant if we changed the axes on our graph.

Regression, Part I. - In correlation, it would be irrelevant if we changed the axes on our graph. Regression, Part I I. Difference from correlation. II. Basic idea: A) Correlation describes the relationship between two variables, where neither is independent or a predictor. - In correlation, it would

More information

Analysis of Variance. Source DF Squares Square F Value Pr > F. Model <.0001 Error Corrected Total

Analysis of Variance. Source DF Squares Square F Value Pr > F. Model <.0001 Error Corrected Total Math 221: Linear Regression and Prediction Intervals S. K. Hyde Chapter 23 (Moore, 5th Ed.) (Neter, Kutner, Nachsheim, and Wasserman) The Toluca Company manufactures refrigeration equipment as well as

More information

Stat 401B Final Exam Fall 2015

Stat 401B Final Exam Fall 2015 Stat 401B Final Exam Fall 015 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed ATTENTION! Incorrect numerical answers unaccompanied by supporting reasoning

More information

QUEEN S UNIVERSITY FINAL EXAMINATION FACULTY OF ARTS AND SCIENCE DEPARTMENT OF ECONOMICS APRIL 2018

QUEEN S UNIVERSITY FINAL EXAMINATION FACULTY OF ARTS AND SCIENCE DEPARTMENT OF ECONOMICS APRIL 2018 Page 1 of 4 QUEEN S UNIVERSITY FINAL EXAMINATION FACULTY OF ARTS AND SCIENCE DEPARTMENT OF ECONOMICS APRIL 2018 ECONOMICS 250 Introduction to Statistics Instructor: Gregor Smith Instructions: The exam

More information

Correlation Analysis

Correlation Analysis Simple Regression Correlation Analysis Correlation analysis is used to measure strength of the association (linear relationship) between two variables Correlation is only concerned with strength of the

More information

Stat 401B Exam 2 Fall 2015

Stat 401B Exam 2 Fall 2015 Stat 401B Exam Fall 015 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed ATTENTION! Incorrect numerical answers unaccompanied by supporting reasoning

More information

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December

More information

Linear Regression Model. Badr Missaoui

Linear Regression Model. Badr Missaoui Linear Regression Model Badr Missaoui Introduction What is this course about? It is a course on applied statistics. It comprises 2 hours lectures each week and 1 hour lab sessions/tutorials. We will focus

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression ST 430/514 Recall: a regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates).

More information

STAT 328 (Statistical Packages)

STAT 328 (Statistical Packages) Department of Statistics and Operations Research College of Science King Saud University Exercises STAT 328 (Statistical Packages) nashmiah r.alshammari ^-^ Excel and Minitab - 1 - Write the commands of

More information

Inference for Single Proportions and Means T.Scofield

Inference for Single Proportions and Means T.Scofield Inference for Single Proportions and Means TScofield Confidence Intervals for Single Proportions and Means A CI gives upper and lower bounds between which we hope to capture the (fixed) population parameter

More information

The Normal Distribution

The Normal Distribution The Mary Lindstrom (Adapted from notes provided by Professor Bret Larget) February 10, 2004 Statistics 371 Last modified: February 11, 2004 The The (AKA Gaussian Distribution) is our first distribution

More information

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics DETAILED CONTENTS About the Author Preface to the Instructor To the Student How to Use SPSS With This Book PART I INTRODUCTION AND DESCRIPTIVE STATISTICS 1. Introduction to Statistics 1.1 Descriptive and

More information

Technical University of Denmark

Technical University of Denmark Technical University of Denmark Page 1 of 17 pages Written examination: 29 May 2009 Course name and number: Introduction to Statistics, 02402 Aids and facilities allowed: All The questions were answered

More information

1 Use of indicator random variables. (Chapter 8)

1 Use of indicator random variables. (Chapter 8) 1 Use of indicator random variables. (Chapter 8) let I(A) = 1 if the event A occurs, and I(A) = 0 otherwise. I(A) is referred to as the indicator of the event A. The notation I A is often used. 1 2 Fitting

More information

Simple linear regression

Simple linear regression Simple linear regression Business Statistics 41000 Fall 2015 1 Topics 1. conditional distributions, squared error, means and variances 2. linear prediction 3. signal + noise and R 2 goodness of fit 4.

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

Simple Linear Regression: One Quantitative IV

Simple Linear Regression: One Quantitative IV Simple Linear Regression: One Quantitative IV Linear regression is frequently used to explain variation observed in a dependent variable (DV) with theoretically linked independent variables (IV). For example,

More information

STAT Final Practice Problems

STAT Final Practice Problems STAT 48 -- Final Practice Problems.Out of 5 women who had uterine cancer, 0 claimed to have used estrogens. Out of 30 women without uterine cancer 5 claimed to have used estrogens. Exposure Outcome (Cancer)

More information

Unit 6 - Introduction to linear regression

Unit 6 - Introduction to linear regression Unit 6 - Introduction to linear regression Suggested reading: OpenIntro Statistics, Chapter 7 Suggested exercises: Part 1 - Relationship between two numerical variables: 7.7, 7.9, 7.11, 7.13, 7.15, 7.25,

More information

Outline The Rank-Sum Test Procedure Paired Data Comparing Two Variances Lab 8: Hypothesis Testing with R. Week 13 Comparing Two Populations, Part II

Outline The Rank-Sum Test Procedure Paired Data Comparing Two Variances Lab 8: Hypothesis Testing with R. Week 13 Comparing Two Populations, Part II Week 13 Comparing Two Populations, Part II Week 13 Objectives Coverage of the topic of comparing two population continues with new procedures and a new sampling design. The week concludes with a lab session.

More information

Interactions. Interactions. Lectures 1 & 2. Linear Relationships. y = a + bx. Slope. Intercept

Interactions. Interactions. Lectures 1 & 2. Linear Relationships. y = a + bx. Slope. Intercept Interactions Lectures 1 & Regression Sometimes two variables appear related: > smoking and lung cancers > height and weight > years of education and income > engine size and gas mileage > GMAT scores and

More information

Business Statistics. Lecture 10: Course Review

Business Statistics. Lecture 10: Course Review Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,

More information

Tests of Linear Restrictions

Tests of Linear Restrictions Tests of Linear Restrictions 1. Linear Restricted in Regression Models In this tutorial, we consider tests on general linear restrictions on regression coefficients. In other tutorials, we examine some

More information

SMA 6304 / MIT / MIT Manufacturing Systems. Lecture 10: Data and Regression Analysis. Lecturer: Prof. Duane S. Boning

SMA 6304 / MIT / MIT Manufacturing Systems. Lecture 10: Data and Regression Analysis. Lecturer: Prof. Duane S. Boning SMA 6304 / MIT 2.853 / MIT 2.854 Manufacturing Systems Lecture 10: Data and Regression Analysis Lecturer: Prof. Duane S. Boning 1 Agenda 1. Comparison of Treatments (One Variable) Analysis of Variance

More information

Analysis of variance

Analysis of variance Analysis of variance 1 Method If the null hypothesis is true, then the populations are the same: they are normal, and they have the same mean and the same variance. We will estimate the numerical value

More information

1.) Fit the full model, i.e., allow for separate regression lines (different slopes and intercepts) for each species

1.) Fit the full model, i.e., allow for separate regression lines (different slopes and intercepts) for each species Lecture notes 2/22/2000 Dummy variables and extra SS F-test Page 1 Crab claw size and closing force. Problem 7.25, 10.9, and 10.10 Regression for all species at once, i.e., include dummy variables for

More information

Solutions to Final STAT 421, Fall 2008

Solutions to Final STAT 421, Fall 2008 Solutions to Final STAT 421, Fall 2008 Fritz Scholz 1. (8) Two treatments A and B were randomly assigned to 8 subjects (4 subjects to each treatment) with the following responses: 0, 1, 3, 6 and 5, 7,

More information

Lecture 2. The Simple Linear Regression Model: Matrix Approach

Lecture 2. The Simple Linear Regression Model: Matrix Approach Lecture 2 The Simple Linear Regression Model: Matrix Approach Matrix algebra Matrix representation of simple linear regression model 1 Vectors and Matrices Where it is necessary to consider a distribution

More information

Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS

Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS 1a. Under the null hypothesis X has the binomial (100,.5) distribution with E(X) = 50 and SE(X) = 5. So P ( X 50 > 10) is (approximately) two tails

More information

Review 6. n 1 = 85 n 2 = 75 x 1 = x 2 = s 1 = 38.7 s 2 = 39.2

Review 6. n 1 = 85 n 2 = 75 x 1 = x 2 = s 1 = 38.7 s 2 = 39.2 Review 6 Use the traditional method to test the given hypothesis. Assume that the samples are independent and that they have been randomly selected ) A researcher finds that of,000 people who said that

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression MATH 282A Introduction to Computational Statistics University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/ eariasca/math282a.html MATH 282A University

More information

1 Multiple Regression

1 Multiple Regression 1 Multiple Regression In this section, we extend the linear model to the case of several quantitative explanatory variables. There are many issues involved in this problem and this section serves only

More information

Stat 101 Exam 1 Important Formulas and Concepts 1

Stat 101 Exam 1 Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2. Categorical/Qualitative

More information

Regression. Bret Hanlon and Bret Larget. December 8 15, Department of Statistics University of Wisconsin Madison.

Regression. Bret Hanlon and Bret Larget. December 8 15, Department of Statistics University of Wisconsin Madison. Regression Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison December 8 15, 2011 Regression 1 / 55 Example Case Study The proportion of blackness in a male lion s nose

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

Applied Regression Analysis

Applied Regression Analysis Applied Regression Analysis Chapter 3 Multiple Linear Regression Hongcheng Li April, 6, 2013 Recall simple linear regression 1 Recall simple linear regression 2 Parameter Estimation 3 Interpretations of

More information

using the beginning of all regression models

using the beginning of all regression models Estimating using the beginning of all regression models 3 examples Note about shorthand Cavendish's 29 measurements of the earth's density Heights (inches) of 14 11 year-old males from Alberta study Half-life

More information

Linear Regression. Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x).

Linear Regression. Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x). Linear Regression Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x). A dependent variable is a random variable whose variation

More information

Econometrics Review questions for exam

Econometrics Review questions for exam Econometrics Review questions for exam Nathaniel Higgins nhiggins@jhu.edu, 1. Suppose you have a model: y = β 0 x 1 + u You propose the model above and then estimate the model using OLS to obtain: ŷ =

More information

This is a multiple choice and short answer practice exam. It does not count towards your grade. You may use the tables in your book.

This is a multiple choice and short answer practice exam. It does not count towards your grade. You may use the tables in your book. NAME (Please Print): HONOR PLEDGE (Please Sign): statistics 101 Practice Final Key This is a multiple choice and short answer practice exam. It does not count towards your grade. You may use the tables

More information

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B Simple Linear Regression 35 Problems 1 Consider a set of data (x i, y i ), i =1, 2,,n, and the following two regression models: y i = β 0 + β 1 x i + ε, (i =1, 2,,n), Model A y i = γ 0 + γ 1 x i + γ 2

More information

Unit 6 - Simple linear regression

Unit 6 - Simple linear regression Sta 101: Data Analysis and Statistical Inference Dr. Çetinkaya-Rundel Unit 6 - Simple linear regression LO 1. Define the explanatory variable as the independent variable (predictor), and the response variable

More information

Lecture 3: Inference in SLR

Lecture 3: Inference in SLR Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals

More information

Chapter 7 Comparison of two independent samples

Chapter 7 Comparison of two independent samples Chapter 7 Comparison of two independent samples 7.1 Introduction Population 1 µ σ 1 1 N 1 Sample 1 y s 1 1 n 1 Population µ σ N Sample y s n 1, : population means 1, : population standard deviations N

More information

BIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES

BIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES BIOL 458 - Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES PART 1: INTRODUCTION TO ANOVA Purpose of ANOVA Analysis of Variance (ANOVA) is an extremely useful statistical method

More information