Section 4.6 Simple Linear Regression
- Julia Davis
Objectives

- Basic philosophy of SLR and the regression assumptions
- Point & interval estimation of the model parameters, and how to make predictions
- Point and interval estimation of future observations from the model
- Regression diagnostics, including $R^2$ and basic residual analysis

Basic Philosophy

We have two variables X and Y. Here, X is not random (so we will write x), but Y is random. We believe that Y depends in some way on x. Some typical examples of (x, Y) pairs are

- x = study time and Y = score on a test.
- x = height and Y = weight.
- x = father's height and Y = son's height.

We focus our efforts on estimating the two parameters, $\beta_0$ and $\beta_1$, in the simple linear regression model

$$Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2),$$

where

- $Y_i$ is the (random) response for the ith case.
- $\beta_0, \beta_1$ are unknown parameters that we want to estimate: $\beta_0$ is the (unknown) intercept and $\beta_1$ is the (unknown) slope.
- $x_i$ is the value of the predictor variable for the ith case.
- $\varepsilon_i$ is a (random) error term for the ith case, with mean 0, the same variance $\sigma^2$ for all cases, and zero covariance between the ith and jth cases.

Least Squares Estimates

We begin with the likelihood function:

$$L(\beta_0, \beta_1, \sigma^2) = \prod_{i=1}^{n} f(y_i; \beta_0, \beta_1, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(y_i - \beta_0 - \beta_1 x_i)^2}{2\sigma^2}\right] = (2\pi\sigma^2)^{-n/2} \exp\left[-\frac{\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2}{2\sigma^2}\right]$$

$$\ln L(\beta_0, \beta_1, \sigma^2) = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2}{2\sigma^2}$$

To maximize the log-likelihood, we minimize the sum in the last term, i.e., $H = \sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2$. That is, we find the $\beta_0$ and $\beta_1$ that minimize H. Because there are two parameters, we differentiate with respect to
$\beta_0$ and $\beta_1$ and set the derivatives equal to zero:

$$\frac{\partial H}{\partial \beta_0} = -2\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i) = 0 \quad\Rightarrow\quad n\beta_0 + \beta_1 \sum x_i = \sum y_i$$

$$\frac{\partial H}{\partial \beta_1} = -2\sum_{i=1}^{n} x_i (y_i - \beta_0 - \beta_1 x_i) = 0 \quad\Rightarrow\quad \beta_0 \sum x_i + \beta_1 \sum x_i^2 = \sum x_i y_i$$

Organizing these two (normal) equations, we get

$$\hat\beta_1 = \frac{\sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)/n}{\sum x_i^2 - \left(\sum x_i\right)^2/n} = \frac{\sum (x_i - \bar x)(y_i - \bar y)}{\sum (x_i - \bar x)^2}, \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x$$

Shown below are the second derivatives:

$$\frac{\partial^2 H}{\partial \beta_0^2} = 2n, \qquad \frac{\partial^2 H}{\partial \beta_0 \partial \beta_1} = \frac{\partial^2 H}{\partial \beta_1 \partial \beta_0} = 2\sum x_i, \qquad \frac{\partial^2 H}{\partial \beta_1^2} = 2\sum x_i^2$$

The 2×2 matrix consisting of these second derivatives is positive definite because its (1,1) element $2n > 0$ and its determinant is also positive:

$$\det \begin{pmatrix} 2n & 2\sum x_i \\ 2\sum x_i & 2\sum x_i^2 \end{pmatrix} = 4\left(n\sum x_i^2 - \left(\sum x_i\right)^2\right) > 0$$

The conclusion? The fitted line $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$ is the line that fits the (x, y) pattern best, i.e., it leaves the smallest total squared gap between the observed y's and the line. For this reason, $\hat\beta_0$ and $\hat\beta_1$ are also called the least squares estimates.
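The closed-form solution above is easy to check numerically. Below is a minimal sketch in Python (the notes' own examples use R; the data here are hypothetical toy values) computing $\hat\beta_0$ and $\hat\beta_1$ directly from the formulas:

```python
# Least-squares estimates computed from the closed-form solution of the
# normal equations (hypothetical toy data, for illustration only).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 3.0, 5.0, 6.0]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n

Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

b1 = Sxy / Sxx          # slope: sum (x_i - xbar)(y_i - ybar) / sum (x_i - xbar)^2
b0 = ybar - b1 * xbar   # intercept: ybar - b1 * xbar

print(b0, b1)           # b0 = 0.5, b1 = 1.4 for this toy data
```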
Next, let's find the MLE of $\sigma^2$:

$$\frac{\partial}{\partial \sigma^2}\left\{\ln L(\beta_0, \beta_1, \sigma^2)\right\} = -\frac{n}{2\sigma^2} + \frac{\sum (y_i - \beta_0 - \beta_1 x_i)^2}{2(\sigma^2)^2} = 0$$

We get

$$\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat\beta_0 - \hat\beta_1 x_i\right)^2$$

Some notes:

- In statistics, the gap between the observed value ($y_i$) and the expected (or predicted) value ($\hat y_i$) is called the residual. So $\sum (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2 = \sum (y_i - \hat y_i)^2$ is the sum of squared residuals, commonly called $SS_E$. For a point estimate of $\sigma^2$, we use $SS_E/(n-2)$, i.e., $\hat\sigma = s = \sqrt{SS_E/(n-2)}$.
- There are many equivalent formulas for $\hat\beta_1$ that are more intuitive, or at least easier to remember. One popular one is $\hat\beta_1 = r \cdot \dfrac{SD_y}{SD_x}$, where r is the correlation coefficient between x and y, $SD_y$ is the sd of y, and $SD_x$ is the sd of x.

Inferences about the Parameters

Let's learn some more notation:

$$b_1 = \hat\beta_1 = \frac{S_{xy}}{S_{xx}} = \frac{\sum (x_i - \bar x)(y_i - \bar y)}{\sum (x_i - \bar x)^2} = \frac{\sum (x_i - \bar x)\, y_i}{\sum (x_i - \bar x)^2}, \qquad b_0 = \hat\beta_0 = \bar y - b_1 \bar x$$
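The equivalence of the two slope formulas is easy to verify numerically. A small Python check (same kind of hypothetical toy data) comparing $S_{xy}/S_{xx}$ with $r \cdot SD_y/SD_x$:

```python
import math

# Verify the identity  b1 = Sxy/Sxx = r * SDy/SDx  on toy data.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 3.0, 5.0, 6.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

Sxx = sum((xi - xbar) ** 2 for xi in x)
Syy = sum((yi - ybar) ** 2 for yi in y)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

b1_direct = Sxy / Sxx               # Sxy / Sxx
r = Sxy / math.sqrt(Sxx * Syy)      # sample correlation coefficient
sd_x = math.sqrt(Sxx / (n - 1))     # sample standard deviations
sd_y = math.sqrt(Syy / (n - 1))
b1_via_r = r * sd_y / sd_x          # r * SDy / SDx

print(b1_direct, b1_via_r)          # the two agree exactly (the n-1 factors cancel)
```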
Here is how to derive the expectation and the variance of the estimates:

$$E(b_1) = E\left(\frac{S_{xy}}{S_{xx}}\right) = \frac{1}{S_{xx}} \sum (x_i - \bar x)\, E(y_i) = \frac{1}{S_{xx}} \sum (x_i - \bar x)(\beta_0 + \beta_1 x_i) = \frac{1}{S_{xx}}\left[\beta_0 \sum (x_i - \bar x) + \beta_1 \sum (x_i - \bar x)\, x_i\right] = \frac{1}{S_{xx}}\left[\beta_0 \cdot 0 + \beta_1 S_{xx}\right] = \beta_1$$

$$E(b_0) = E(\bar y - b_1 \bar x) = E(\bar y) - \bar x\, E(b_1) = (\beta_0 + \beta_1 \bar x) - \beta_1 \bar x = \beta_0$$

$$Var(b_1) = Var\left(\frac{S_{xy}}{S_{xx}}\right) = \frac{1}{S_{xx}^2}\, Var\left(\sum (x_i - \bar x)\, y_i\right) = \frac{1}{S_{xx}^2} \sum (x_i - \bar x)^2\, \sigma^2 = \frac{\sigma^2}{S_{xx}}$$

$$Var(b_0) = Var(\bar y - b_1 \bar x) = \sigma^2\left(\frac{1}{n} + \frac{\bar x^2}{S_{xx}}\right)$$

Furthermore, it can be shown that $b_1 \sim N(\text{mean} = \beta_1,\ \text{sd} = \sigma_{b_1})$, where $\sigma_{b_1} = \sigma/\sqrt{S_{xx}}$. $\sigma_{b_1}$ is also called the standard error of $b_1$, and we estimate $\sigma$ from the previous description by $s = \sqrt{SS_E/(n-2)}$, so the estimated SE of $b_1$ becomes $s_{b_1} = s/\sqrt{S_{xx}}$.

It can be shown (see textbook) that

$$\frac{SS_E}{\sigma^2} = \frac{n\hat\sigma^2}{\sigma^2} \sim \chi^2(n-2).$$

Also, it turns out that $b_0$, $b_1$, and $s$ are mutually independent. Therefore, we have the following t-distribution:

$$T_1 = \frac{(b_1 - \beta_1)/\sigma_{b_1}}{\sqrt{\dfrac{SS_E/\sigma^2}{n-2}}} = \frac{b_1 - \beta_1}{s/\sqrt{S_{xx}}} = \frac{b_1 - \beta_1}{s_{b_1}} \sim t(n-2)$$

Therefore, a $100(1-\alpha)\%$ confidence interval for $\beta_1$ is given by

$$b_1 \pm t_{\alpha/2}(n-2)\, s_{b_1}$$
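To make the interval concrete, here is a Python sketch assembling the 95% CI for the slope from the pieces derived above. The data are hypothetical toy values, and the critical value $t_{0.025}(2) = 4.303$ is taken from a t table:

```python
import math

# 95% confidence interval for the slope b1, following the derivation above.
# Hypothetical toy data; with n = 4 we have df = n - 2 = 2.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 3.0, 5.0, 6.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar

# SSE = sum of squared residuals; s^2 = SSE / (n - 2)
SSE = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(SSE / (n - 2))
s_b1 = s / math.sqrt(Sxx)   # estimated standard error of b1

t_crit = 4.303              # t_{alpha/2}(df = 2) for alpha = 0.05, from a t table
ci = (b1 - t_crit * s_b1, b1 + t_crit * s_b1)
print(ci)
```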
It can also be shown, in a similar way, that $b_0 \sim N(\text{mean} = \beta_0,\ \text{sd} = \sigma_{b_0})$, where

$$\sigma_{b_0} = \sigma\sqrt{\frac{1}{n} + \frac{\bar x^2}{S_{xx}}}$$

is the standard error of $b_0$, and the estimated SE of $b_0$ becomes

$$s_{b_0} = s\sqrt{\frac{1}{n} + \frac{\bar x^2}{S_{xx}}}$$

Therefore, we have another t-distribution:

$$T_0 = \frac{(b_0 - \beta_0)/\sigma_{b_0}}{\sqrt{\dfrac{SS_E/\sigma^2}{n-2}}} = \frac{b_0 - \beta_0}{s\sqrt{\dfrac{1}{n} + \dfrac{\bar x^2}{S_{xx}}}} = \frac{b_0 - \beta_0}{s_{b_0}} \sim t(n-2)$$

Therefore, a $100(1-\alpha)\%$ confidence interval for $\beta_0$ is given by

$$b_0 \pm t_{\alpha/2}(n-2)\, s_{b_0}$$

We have seen how to estimate the coefficients of a regression line with both point estimates and confidence intervals, and we have learned how to estimate a value $\hat y$ on the regression line for a given value of x, such as $x = x_0$. But how good is our estimate $\hat y$ at $x = x_0$? How much confidence do we have in it? Furthermore, suppose we were going to observe another value of y at $x = x_0$. What can we say?

Intuitively, it should be easier to get bounds on the mean (average) value of y at $x_0$ (called a confidence interval for the mean value of y at $x_0$) than it is to get bounds on a future observation of y (called a prediction interval for y at $x_0$). It turns out the intervals are narrower for the mean value and wider for the individual value.

Our point estimate of y at $x_0$ is, of course, $\hat y$ at $x_0$, so for a confidence interval we will need to know the sampling distribution of $\hat y$. It turns out that $\hat y$ at $x_0$ is distributed as

$$\hat y_{x_0} \sim N\left(\text{mean} = E(y \mid x_0),\ \text{sd} = \sigma_{\hat y_{x_0}}\right), \quad \text{where } \sigma_{\hat y_{x_0}} = \sigma\sqrt{\frac{1}{n} + \frac{(x_0 - \bar x)^2}{S_{xx}}}$$
$\sigma_{\hat y_{x_0}} = \sigma\sqrt{\dfrac{1}{n} + \dfrac{(x_0 - \bar x)^2}{S_{xx}}}$ is the standard error of $\hat y_{x_0}$, and its estimate is $s_{\hat y_{x_0}} = s\sqrt{\dfrac{1}{n} + \dfrac{(x_0 - \bar x)^2}{S_{xx}}}$.

Therefore, we have the following t-distribution:

$$T_2 = \frac{\left(\hat y_{x_0} - E(y \mid x_0)\right)/\sigma_{\hat y_{x_0}}}{\sqrt{\dfrac{SS_E/\sigma^2}{n-2}}} = \frac{\hat y_{x_0} - E(y \mid x_0)}{s_{\hat y_{x_0}}} \sim t(n-2)$$

Therefore, a $100(1-\alpha)\%$ confidence interval (C.I.) for $E(y)$ at $x_0$ is given by

$$\hat y_{x_0} \pm t_{\alpha/2}(n-2)\, s_{\hat y_{x_0}}$$

Next, prediction intervals are slightly different. In order to find confidence bounds for a new observation of y (we will denote it $y_{future}$), we use the fact that

$$\hat y_{future} \sim N\left(\text{mean} = E(y_{future}),\ \text{sd} = \sigma_{\hat y_{future}}\right), \quad \text{where } \sigma_{\hat y_{future}} = \sigma\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar x)^2}{S_{xx}}}$$

Of course $\sigma$ is unknown and we estimate it with s. Therefore, a $100(1-\alpha)\%$ prediction interval (P.I.) for a future value of y at $x_0$ is given by

$$\hat y_{x_0} \pm t_{\alpha/2}(n-2)\, s_{\hat y_{future}}$$

Take note that the prediction interval is wider than the confidence interval, as its SE is greater.

Ex 1. Consider the following sample data of midterm (X) and final (Y) exam scores and carry out all the inferences involved. (The scores are listed in the R session that follows.)
> x <- c(70,74,80,84,80,67,70,64,74,82)
> y <- c(87,79,88,98,96,73,83,79,9,94)
> model <- lm(y ~ x)
> summary(model)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)
x                                             **
---
Residual standard error:  on 8 degrees of freedom
Multiple R-squared: ,  Adjusted R-squared:
F-statistic: 9 on 1 and 8 DF,  p-value:

> plot(y ~ x, pch=16, col=2)
> abline(model, col=4)
> predict(model, interval="confidence")
       fit      lwr      upr
> predict(model, interval="prediction")
       fit      lwr      upr
> newx <- seq(60, 95, 0.2)
> ci <- predict(model, list(x=newx), interval="confidence")
> pi <- predict(model, list(x=newx), interval="prediction")
> plot(x, y, pch=16, col=2)
> matplot(newx, ci, type="l", lty=c(1,2,2), col=c(1,2,2), add=T)
> matplot(newx, pi, type="l", lty=c(1,3,3), col=c(1,4,4), add=T)
> legend(locator(1), c("regression line","95% ci","95% pi"), cex=0.8, lty=1:3, col=c(1,2,4))
> # The following command creates four diagnostic plots.
> par(mfrow=c(1,4))
> plot(model)

Figure 1: Regression line, 95% CI & 95% PI

Figure 2: Diagnostic plots of a regression model
Section 4.8 One-Factor ANOVA

One-Factor Samples

Suppose you have collected samples of sizes $n_i$ ($i = 1, 2, \ldots, m$) from m groups:

Groups                                        Means
Y_1:   Y_11  Y_12  ...  Y_1n_1                Ȳ_1.
Y_2:   Y_21  Y_22  ...  Y_2n_2                Ȳ_2.
 :       :     :          :                     :
Y_m:   Y_m1  Y_m2  ...  Y_mn_m                Ȳ_m.
                                 Grand Mean:  Ȳ..

The hypotheses we want to test are:

H_0: µ_1 = µ_2 = ... = µ_m (i.e., all group means are the same.)
H_1: not H_0 (i.e., some group means are significantly different.)

In the end, everything will be summarized in the following ANOVA table:

source      SS       df     MS                      F-value         p-value
Treatment   SS_trt   m-1    MS_trt = SS_trt/(m-1)   MS_trt/MS_E
Error       SS_E     n-m    MS_E = SS_E/(n-m)
Total       SS_tot   n-1

Here are all the SS (sum of squares) quantities and how $SS_{tot}$ is partitioned:

$$SS_{tot} = \sum_{i=1}^{m}\sum_{j=1}^{n_i}\left(Y_{ij} - \bar Y_{..}\right)^2 = \sum_{i=1}^{m}\sum_{j=1}^{n_i}\left(Y_{ij} - \bar Y_{i.} + \bar Y_{i.} - \bar Y_{..}\right)^2 = \sum_{i=1}^{m}\sum_{j=1}^{n_i}\left(Y_{ij} - \bar Y_{i.}\right)^2 + \sum_{i=1}^{m} n_i \left(\bar Y_{i.} - \bar Y_{..}\right)^2 = SS_E + SS_{trt}$$

(the cross-product term is 0).

We also have, under H_0,

$$\frac{SS_{trt}}{\sigma^2} \sim \chi^2(m-1), \qquad \frac{SS_E}{\sigma^2} \sim \chi^2(n-m)$$

$$F = \frac{SS_{trt}/(m-1)}{SS_E/(n-m)} = \frac{MS_{trt}}{MS_E} \sim F(m-1,\, n-m)$$
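The SS partition above can be checked numerically. A minimal Python sketch (hypothetical toy data with m = 3 groups of 3 observations each) computing $SS_{tot}$, $SS_{trt}$, $SS_E$, and the F statistic from their definitions:

```python
# One-way ANOVA sums of squares and F statistic, from the definitions above.
# Hypothetical toy data: m = 3 groups, n_i = 3 each, n = 9 total.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]

m = len(groups)
n = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / n
group_means = [sum(g) / len(g) for g in groups]

SS_tot = sum((y - grand_mean) ** 2 for g in groups for y in g)
SS_trt = sum(len(g) * (gm - grand_mean) ** 2 for g, gm in zip(groups, group_means))
SS_E = sum((y - gm) ** 2 for g, gm in zip(groups, group_means) for y in g)

MS_trt = SS_trt / (m - 1)
MS_E = SS_E / (n - m)
F = MS_trt / MS_E   # compared against an F(m-1, n-m) = F(2, 6) distribution

print(SS_tot, SS_trt + SS_E, F)   # the partition SS_tot = SS_trt + SS_E holds
```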
Ex 2. Consider the following sample data and carry out all the inferences involved.

> grp <- c(rep(1,7), rep(2,7), rep(3,7), rep(4,7), rep(5,7))
> y <- c(92,90,87,05,86,83,02,00,08,98,0,4,97,94,
+        43,49,38,36,39,20,45,47,44,60,49,52,3,34,
+        42,55,9,34,33,46,52)
> data <- data.frame(cbind(grp, y))
> head(data)
> attach(data)
> grp <- factor(grp)
> boxplot(y ~ grp, col="pink")
> model <- lm(y ~ grp)
> summary(model)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                              < 2e-16 ***
grp2                                             *
grp3                                             ***
grp4                                             ***
grp5                                             ***
---
Residual standard error: 9.7 on 30 degrees of freedom
Multiple R-squared: ,  Adjusted R-squared:
F-statistic: 44.2 on 4 and 30 DF,  p-value: 3.664e-12

> anova(model)
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value    Pr(>F)
grp        4                  44.2  3.664e-12 ***
Residuals 30

> summary(aov(y ~ grp))
          Df Sum Sq Mean Sq F value    Pr(>F)
grp        4                  44.2  3.664e-12 ***
Residuals 30

> boxplot(y ~ grp, col="pink")
> plot(TukeyHSD(aov(y ~ grp)))
> par(mfrow=c(1,4))
> plot(model)

Figure 3: Boxplot & Tukey's pairwise comparison

Figure 4: Diagnostic plots of ANOVA
Section 4.10 χ² Tests

Review: Facts about the χ² Distribution

In the χ² distribution $X \sim \chi^2(df)$, where df (degrees of freedom) is the only parameter, and it uniquely determines the shape. The (theoretical) population mean is $\mu = df$ and the (theoretical) population standard deviation is $\sigma = \sqrt{2\, df}$.

- If you square a random variable that has the standard normal distribution, it has a χ² distribution with 1 degree of freedom. This is often written as $Z^2 \sim \chi^2(1)$.
- A random variable with a χ² distribution with k degrees of freedom is the sum of k independent squared standard normal variables, i.e., $\chi^2(df = k) = Z_1^2 + Z_2^2 + \cdots + Z_k^2$, where $Z_i \sim N(0,1)$.
- The curve is nonsymmetric and skewed to the right.
- The mean, µ, is always located just to the right of the peak.
- The χ² test statistic is always greater than or equal to zero.
- When df > 90, the χ² curve is approximated by the normal distribution. For example, if $X \sim \chi^2(df = 1000)$, then $X \approx N(\mu = 1000,\ \sigma = \sqrt{2000})$.

χ² Goodness-of-Fit Test

We test whether the data fit a particular distribution or not. For example, we can test whether the color distribution of M&M bags fits what the company claims on their webpage, or, after flipping a coin many times, whether the counts fit a binomial distribution. We use a χ² test statistic to determine whether there is a good fit.

Why χ²? Demo for a binomial case. Let $Y_1 \sim \text{binomial}(n, p_1)$; then

$$Z = \frac{Y_1 - np_1}{\sqrt{np_1(1-p_1)}}$$

has an approximate N(0, 1) distribution due to the CLT. Consider the following:

$$Q_1 = \frac{(Y_1 - np_1)^2}{np_1(1-p_1)} = \frac{(Y_1 - np_1)^2}{np_1} + \frac{(Y_1 - np_1)^2}{n(1-p_1)} \quad \left(\text{why? } \tfrac{1}{p_1} + \tfrac{1}{1-p_1} = \tfrac{1}{p_1(1-p_1)}\right)$$

Writing $Y_2 = n - Y_1$ and $p_2 = 1 - p_1$, we have $(Y_1 - np_1)^2 = \{n - Y_2 - n(1-p_2)\}^2 = (Y_2 - np_2)^2$, so

$$Q_1 = \frac{(Y_1 - np_1)^2}{np_1} + \frac{(Y_2 - np_2)^2}{np_2} = \sum_{i=1}^{2} \frac{(Y_i - np_i)^2}{np_i} \sim \chi^2(df = 1)$$
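The algebra above, folding the failure count into a second category, can be sanity-checked numerically. A small Python sketch (hypothetical numbers) comparing the one-term and two-term forms of Q:

```python
# Check that (Y1 - n p1)^2 / (n p1 (1 - p1)) equals the two-category
# chi-square sum, using hypothetical numbers.
n, p1 = 100, 0.3
y1 = 37                  # observed count in category 1 (hypothetical)
y2 = n - y1              # category 2 is "everything else"
p2 = 1 - p1

q_one_term = (y1 - n * p1) ** 2 / (n * p1 * (1 - p1))
q_two_terms = (y1 - n * p1) ** 2 / (n * p1) + (y2 - n * p2) ** 2 / (n * p2)

print(q_one_term, q_two_terms)   # identical, as the algebra shows
```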
This generalizes to the case of k categories. It can be shown that

$$Q_{k-1} = \sum_{i=1}^{k} \frac{(Y_i - np_i)^2}{np_i} \sim \chi^2(df = k-1)$$

The null and alternative hypotheses for the goodness-of-fit test can be written as:

H_0: p_i = p_{i0}, for i = 1, 2, ..., k (i.e., the data fit the hypothesized distribution)
H_1: p_i ≠ p_{i0} for some i (i.e., in at least some categories, the data do NOT fit the hypothesized distribution)

Ex 1. People were asked to write down a bunch of random digits. If the digits are truly random, each digit should be the same as the preceding one with probability 1/10, one away from the preceding one with probability 2/10, and neither with probability 7/10. We want to test whether the data fit this thinking (i.e., whether the sequence looks random examined by this idea):

H_0: p_1 = 1/10, p_2 = 2/10, p_3 = 7/10
H_1: at least one of the proportions is significantly different from the hypothesized one.

Here is the summary (n = 51):

                 observed freq   expected freq
same digit             0         51 (1/10) = 5.1
one-away digit         8         51 (2/10) = 10.2
others                43         51 (7/10) = 35.7

Test statistic:

$$\chi^2 = \sum_{i=1}^{3} \frac{(\text{observed} - \text{expected})^2}{\text{expected}} = \frac{(0-5.1)^2}{5.1} + \frac{(8-10.2)^2}{10.2} + \frac{(43-35.7)^2}{35.7} \approx 7.07 \sim \chi^2(df = 2)$$

p-value ≈ 0.029 < 0.05, so we reject H_0 and conclude that the data didn't follow the hypothesized proportions, i.e., the data don't seem random.

The whole thing can be done in R as shown below (note the probabilities must be passed via the p= argument):

> x <- c(0,8,43)
> chisq.test(x, p=c(0.1, 0.2, 0.7))

        Chi-squared test for given probabilities

data:  x
X-squared = 7.0672, df = 2, p-value = 0.02919
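The hand computation in Ex 1 can be reproduced with a short stdlib-only Python sketch (for df = 2 the χ² survival function has the closed form $e^{-x/2}$):

```python
import math

# Chi-square goodness-of-fit test for the random-digits example above.
observed = [0, 8, 43]
p0 = [0.1, 0.2, 0.7]           # hypothesized proportions
n = sum(observed)              # 51
expected = [n * p for p in p0]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For df = 2, the chi-square survival function is exp(-x/2).
p_value = math.exp(-chi2 / 2)
print(chi2, p_value)           # ~7.067 and ~0.029: reject H0 at the 5% level
```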
Ex 2. You flipped a coin 4 times a day and counted the total number of heads each day. You did this for 100 days. Test whether the results agree with X (the total number of heads in a day) being binomial(4, 1/2).

Answer: Under H_0, the expected frequencies are $100 \times \binom{4}{k}(1/2)^4$ for k = 0, ..., 4:

Number of H's     0     1     2     3     4
observed freq
expected freq   6.25   25   37.5   25   6.25

Test statistic:

$$\chi^2 = \sum_{k=0}^{4} \frac{(\text{obs} - \text{exp})^2}{\text{exp}} = \frac{(7-6.25)^2}{6.25} + \cdots + \frac{(4-6.25)^2}{6.25} \sim \chi^2(df = 4)$$

The p-value is large, so we do not reject H_0 and conclude that the data support the binomial(4, 0.5) hypothesis.

Ex 3. You lose one more df by estimating another parameter! Shown below are counts of X, the number of α particles emitted by barium-133 in 1/10 of a second, as counted by a Geiger counter. Test H_0: X ~ Poisson.

Answer: We first have to estimate the Poisson parameter λ by the mean of the data, i.e., $\hat\lambda = \bar x = 5.4$. Then we calculate the expected probability, and from it the expected frequency, for each of the grouped cases:

Cases        observed freq   expected freq
{0,1,2,3}
{4}
{5}
{6}
{7}
{8,9,...}
Test statistic:

$$\chi^2 = \sum_{i=1}^{6} \frac{(\text{obs} - \text{exp})^2}{\text{exp}} \sim \chi^2(6 - 1 - 1) = \chi^2(df = 4)$$

(one extra df is lost for the estimated $\hat\lambda$). p-value = 0.408, so we do not reject H_0 and conclude that the data cannot reject the hypothesis that the counts form a Poisson distribution.

χ² Test for Homogeneity

The goodness-of-fit test can be used to decide whether data fit a given distribution, but it will not suffice to decide whether two populations follow the same unknown distribution. A different test, called the test for homogeneity, can be used to draw a conclusion about whether two populations have the same distribution. Here we are concerned with:

H_0: The distributions of the two populations are the same.
H_1: The distributions of the two populations are NOT the same.

Ex 4. Shown below are the grade distributions of two groups of 50 students each.

observed freq    A    B    C    D    F   total
Group I          8   13   16   10    3    50
Group II         4    9   14   16    7    50

Test H_0: the grade distributions of the two groups are the same.

Answer: Under H_0, the probability of each grade is common to both groups, so the respective estimates of the probabilities come from the pooled column totals: 12/100 = 0.12, 22/100 = 0.22, 30/100 = 0.30, 26/100 = 0.26, and 10/100 = 0.10. Note also, since we have estimated these probabilities, the χ² test statistic will have df = (5−1) + (5−1) − 4 = 4. Here are the expected frequencies for each case (50 times each estimated probability):

expected freq    A    B    C    D    F
Group I          6   11   15   13    5
Group II         6   11   15   13    5

Test statistic:

$$\chi^2 = \sum_{j=1}^{2}\sum_{i=1}^{5} \frac{(\text{obs} - \text{exp})^2}{\text{exp}} = \frac{(8-6)^2}{6} + \cdots + \frac{(7-5)^2}{5} = 5.1786 \sim \chi^2(df = 4)$$

p-value = 0.2695, so we do not reject H_0 and conclude that we cannot say there is a significant difference in grade distribution between the two groups.
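The homogeneity computation in Ex 4 can be redone from scratch in a stdlib-only Python sketch (the counts below are the example's two 5-category tables as reconstructed here, so treat the specific numbers as illustrative; for even df the χ² survival function is a finite sum):

```python
import math

# Chi-square test of homogeneity for two grade distributions.
observed = [[8, 13, 16, 10, 3],
            [4, 9, 14, 16, 7]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count = row total * column total / grand total (pooled estimate).
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))

# Survival function of a chi-square with even df = 2m:
# P(X > x) = exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!
def chi2_sf_even(x, df):
    m = df // 2
    return math.exp(-x / 2) * sum((x / 2) ** j / math.factorial(j) for j in range(m))

p_value = chi2_sf_even(chi2, df=4)
print(chi2, p_value)   # ~5.179 and ~0.27: do not reject H0
```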
> data <- matrix(c(8,4,13,9,16,14,10,16,3,7), nrow=2, ncol=5)
> chisq.test(as.table(data))$observed
  A  B  C  D  E
A 8 13 16 10  3
B 4  9 14 16  7
> chisq.test(as.table(data))$expected
  A  B  C  D  E
A 6 11 15 13  5
B 6 11 15 13  5
> chisq.test(as.table(data))$residual
           A          B          C          D          E
A  0.8164966  0.6030227  0.2581989 -0.8320503 -0.8944272
B -0.8164966 -0.6030227 -0.2581989  0.8320503  0.8944272
> chisq.test(as.table(data))

        Pearson's Chi-squared test

data:  as.table(data)
X-squared = 5.1786, df = 4, p-value = 0.2695

χ² Test for Independence

A test of independence involves using a contingency table of observed (data) values. The test statistic for a test of independence is similar to that of a goodness-of-fit test:

$$\chi^2 = \sum_{j=1}^{c}\sum_{i=1}^{r} \frac{(\text{obs} - \text{exp})^2}{\text{exp}} \sim \chi^2\big(df = (r-1)(c-1)\big),$$

where r = number of rows and c = number of columns.

Ex 5. A random sample of 400 students at the University of Iowa shows the following breakdown of gender and the college where they study.

observed freq   Business   Engineering   Liberal Arts   Nursing   Pharmacy   total
Male
Female                                                                        400

Test H_0: $p_{ij} = p_{i\cdot}\, p_{\cdot j}$ (i.e., the college where a student studies is independent of gender.)

Answer:

> data2 <- matrix(c(2,4,6,4,45,75,2,3,6,4), nrow=2, ncol=5)
> chisq.test(as.table(data2))$observed
  A  B  C  D  E
A
B
> chisq.test(as.table(data2))$expected
  A  B  C  D  E
A
B
> chisq.test(as.table(data2))$residual
  A  B  C  D  E
A
B
> chisq.test(as.table(data2))

        Pearson's Chi-squared test

data:  as.table(data2)
X-squared = , df = 4, p-value =

We do reject H_0 and conclude that the number of students in each college is highly dependent on gender, i.e., the two variables (gender and college) are NOT independent.

Section 4.9 Distribution-Free CI & TI

Basics

Let $Y_1 < Y_2 < Y_3 < Y_4 < Y_5$ be the order statistics of a random sample of size n = 5 from any continuous distribution. Also, let $m = \pi_{0.5}$ (i.e., the 50th percentile) be the median. For example, we can find the following probability:

$$P(Y_1 < m < Y_5) = \sum_{k=1}^{4} \binom{5}{k} \left(\frac{1}{2}\right)^k \left(\frac{1}{2}\right)^{5-k} = 1 - P(X=0) - P(X=5) = 1 - 2\left(\frac{1}{2}\right)^5 = 0.9375,$$

where $X \sim \text{binomial}(5, 1/2)$.

Why is $P(Y_1 < m < Y_5)$ calculated like this? First, any individual observation, say $X_1$, has $P(X_1 < m) = 0.5$, and in order for $Y_1$ to be less than m and $Y_5$ to be greater than m, we must have exactly 1, 2, 3, or 4 observations less than m. And we say $(y_1, y_5)$ is a 94% (distribution-free) confidence interval for m.

In a similar way, when there are n independent observations, we calculate

$$P(Y_i < m < Y_j) = \sum_{k=i}^{j-1} \binom{n}{k} \left(\frac{1}{2}\right)^k \left(\frac{1}{2}\right)^{n-k} = 1 - \alpha$$
and $(y_i, y_j)$ is a $100(1-\alpha)\%$ (distribution-free) confidence interval for the median m.

Ex 1. Suppose we have an ordered set of data with n = 9. Let's calculate:

$$P(Y_2 < m < Y_8) = \sum_{k=2}^{7} \binom{9}{k} \left(\frac{1}{2}\right)^k \left(\frac{1}{2}\right)^{9-k} = 0.961,$$

so $(y_2, y_8)$ is a 96.1% (distribution-free) confidence interval for the median m.

It turns out we can argue the same way for any percentile $\pi_p$. In this case, any individual observation X has $P(X < \pi_p) = p$, so when there are n independent observations, we calculate

$$P(Y_i < \pi_p < Y_j) = \sum_{k=i}^{j-1} \binom{n}{k} p^k (1-p)^{n-k} = 1 - \alpha,$$

and $(y_i, y_j)$ is a $100(1-\alpha)\%$ (distribution-free) confidence interval for the percentile $\pi_p$.

Ex 2. Suppose we have an ordered set of data with n = 27. First, note that for $\pi_{0.25}$ (i.e., the first quartile), $(n+1)p = (27+1)(0.25) = 7$, so we have $\hat\pi_{0.25} = y_7$. Now, let's see how much confidence we can have in $(y_4, y_{10})$ as a confidence interval for $\pi_{0.25}$:

$$P(Y_4 < \pi_{0.25} < Y_{10}) = \sum_{k=4}^{9} \binom{27}{k} (0.25)^k (0.75)^{27-k} = 0.820,$$

i.e., $(y_4, y_{10}) = (74, 87)$ is an 82.0% (distribution-free) confidence interval for the 25th percentile $\pi_{0.25}$.

One note: for some of these binomial probability calculations, it's OK to use the normal approximation. For example, in the last problem we calculated $P(4 \le X \le 9)$, where $X \sim \text{binomial}(n = 27, p = 1/4)$, so approximately $X \sim N\big(\mu = 27/4 = 6.75,\ \sigma = \sqrt{27(1/4)(3/4)} = 2.25\big)$. Finding the same probability by normal approximation (with continuity correction), we have:

$$P(4 \le X \le 9) \approx P(3.5 \le X \le 9.5) = P\left(\frac{3.5 - 6.75}{2.25} \le Z \le \frac{9.5 - 6.75}{2.25}\right) = P(-1.44 \le Z \le 1.22) \approx 0.815,$$
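These binomial coverage calculations are easy to reproduce. A stdlib-only Python sketch (mirroring the two examples above) checking the n = 5 median interval, the n = 27 quartile interval, and the normal approximation:

```python
import math

def coverage(n, p, i, j):
    """P(Y_i < pi_p < Y_j) = sum_{k=i}^{j-1} C(n,k) p^k (1-p)^(n-k)."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(i, j))

# n = 5: (y1, y5) as a CI for the median (p = 1/2)
cov_median = coverage(5, 0.5, 1, 5)        # ~0.9375, the 94% quoted above

# n = 27: (y4, y10) as a CI for the first quartile (p = 1/4)
cov_quartile = coverage(27, 0.25, 4, 10)   # ~0.820

# Normal approximation with continuity correction for the n = 27 case
mu, sigma = 27 * 0.25, math.sqrt(27 * 0.25 * 0.75)   # 6.75 and 2.25

def phi(z):  # standard normal cdf
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

approx = phi((9.5 - mu) / sigma) - phi((3.5 - mu) / sigma)
print(cov_median, cov_quartile, approx)
```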
i.e., the normal approximation works rather well for such a case.

Theorem 1. Let $Y_1 < Y_2 < \cdots < Y_n$ be the order statistics (based on random samples $X_1, X_2, \ldots, X_n$). Then the pdf of $Y_k$ is

$$g_k(y) = \frac{n!}{(k-1)!(n-k)!}\, [F(y)]^{k-1}\, f(y)\, [1 - F(y)]^{n-k},$$

where $f(\cdot)$ and $F(\cdot)$ are the pdf and cdf of X.

Theorem 2. Let $U_{(1)} < U_{(2)} < \cdots < U_{(n)}$ be the order statistics, where $U_i \sim \text{uniform}(0,1)$. Then $U_{(k)}$ has a beta distribution with the two parameters k and (n − k + 1).

Proof. From Theorem 1 (with $F(y) = y$ and $f(y) = 1$), we have

$$g_k(y) = \frac{n!}{(k-1)!(n-k)!}\, y^{k-1} (1-y)^{n-k}, \quad 0 < y < 1,$$

which is the pdf of $\beta(k, n-k+1)$.

Theorem 3. Let $X_1, X_2, \ldots, X_n$ be random variables with cdf $F(\cdot)$. Then $F(X_{(k)})$ has a beta distribution with the two parameters k and (n − k + 1).

Proof. First, note that $U_i = F(X_i)$ is iid uniform(0, 1) due to the probability integral transformation. Furthermore, $F(\cdot)$ is a nondecreasing function, i.e., $F(\cdot)$ preserves order. So $U_{(i)} = F(X_{(i)})$. That is,

$$\left\{U_{(1)}, U_{(2)}, \ldots, U_{(n)}\right\} = \left\{F(X_{(1)}), F(X_{(2)}), \ldots, F(X_{(n)})\right\} \quad\Rightarrow\quad F(X_{(k)}) \sim \beta(k, n-k+1)$$

Application: Let $Y_k$ be the kth order statistic, i.e., $Y_k = X_{(k)}$. Consider the following n + 1
random variables:

$$W_1 = F(Y_1),\quad W_2 = F(Y_2) - F(Y_1),\quad W_3 = F(Y_3) - F(Y_2),\quad \ldots,\quad W_n = F(Y_n) - F(Y_{n-1}),\quad W_{n+1} = 1 - F(Y_n)$$

- These $W_1, W_2, \ldots, W_{n+1}$ are called the coverages of the intervals, e.g., $(Y_{i-1}, Y_i]$.
- Note that the sum of the first k of these coverages is $W_1 + \cdots + W_k = F(Y_k) \sim \beta(k, n-k+1)$.
- $F(Y_j) - F(Y_i)$, $i < j$, is the sum of $k = j - i$ coverages, so it has a $\beta(j-i,\ n-j+i+1)$ distribution, i.e.,

$$\gamma = P\{F(Y_j) - F(Y_i) \ge p\} = \int_p^1 \frac{\Gamma(n+1)}{\Gamma(j-i)\,\Gamma(n-j+i+1)}\, v^{j-i-1} (1-v)^{n-j+i}\, dv,$$

and $(y_i, y_j)$ is called a $100\gamma\%$ tolerance interval for $100p\%$ of the distribution.

Ex 3. Let $Y_1 < Y_2 < \cdots < Y_6$ be the order statistics of a random sample of size n = 6 from any continuous distribution. Also, let p = 0.8. Then

$$\gamma = P\{F(Y_6) - F(Y_1) \ge 0.8\} = \int_{0.8}^{1} \frac{\Gamma(7)}{\Gamma(5)\,\Gamma(2)}\, v^4 (1-v)\, dv = 0.34,$$

i.e., $(y_1, y_6)$ is a 34% (distribution-free) tolerance interval for 80% of the distribution.
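The Beta integral in Ex 3 has a simple closed form, since the integrand is a polynomial (the constant is $\Gamma(7)/(\Gamma(5)\Gamma(2)) = 720/24 = 30$). A short Python check:

```python
# gamma = P{F(Y6) - F(Y1) >= 0.8} for n = 6: the Beta(5, 2) tail integral
#   gamma = int_{0.8}^{1} 30 v^4 (1 - v) dv,
# whose antiderivative is G(v) = 6 v^5 - 5 v^6.
def G(v):
    return 6 * v ** 5 - 5 * v ** 6

p = 0.8
gamma = G(1.0) - G(p)
print(gamma)   # ~0.345, the 34% tolerance level quoted above
```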
More informationReview of Statistics
Review of Statistics Topics Descriptive Statistics Mean, Variance Probability Union event, joint event Random Variables Discrete and Continuous Distributions, Moments Two Random Variables Covariance and
More informationCh. 1: Data and Distributions
Ch. 1: Data and Distributions Populations vs. Samples How to graphically display data Histograms, dot plots, stem plots, etc Helps to show how samples are distributed Distributions of both continuous and
More informationFinal Exam - Solutions
Ecn 102 - Analysis of Economic Data University of California - Davis March 19, 2010 Instructor: John Parman Final Exam - Solutions You have until 5:30pm to complete this exam. Please remember to put your
More informationChapter 16: Understanding Relationships Numerical Data
Chapter 16: Understanding Relationships Numerical Data These notes reflect material from our text, Statistics, Learning from Data, First Edition, by Roxy Peck, published by CENGAGE Learning, 2015. Linear
More informationProblems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B
Simple Linear Regression 35 Problems 1 Consider a set of data (x i, y i ), i =1, 2,,n, and the following two regression models: y i = β 0 + β 1 x i + ε, (i =1, 2,,n), Model A y i = γ 0 + γ 1 x i + γ 2
More informationThis is a multiple choice and short answer practice exam. It does not count towards your grade. You may use the tables in your book.
NAME (Please Print): HONOR PLEDGE (Please Sign): statistics 101 Practice Final Key This is a multiple choice and short answer practice exam. It does not count towards your grade. You may use the tables
More informationIntroduction to Statistical Data Analysis Lecture 7: The Chi-Square Distribution
Introduction to Statistical Data Analysis Lecture 7: The Chi-Square Distribution James V. Lambers Department of Mathematics The University of Southern Mississippi James V. Lambers Statistical Data Analysis
More informationIntro to Linear Regression
Intro to Linear Regression Introduction to Regression Regression is a statistical procedure for modeling the relationship among variables to predict the value of a dependent variable from one or more predictor
More informationSociology 6Z03 Review II
Sociology 6Z03 Review II John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review II Fall 2016 1 / 35 Outline: Review II Probability Part I Sampling Distributions Probability
More informationCh 3: Multiple Linear Regression
Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery
More informationUNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test, October 2013
UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test, October 2013 STAC67H3 Regression Analysis Duration: One hour and fifty minutes Last Name: First Name: Student
More informationIntro to Linear Regression
Intro to Linear Regression Introduction to Regression Regression is a statistical procedure for modeling the relationship among variables to predict the value of a dependent variable from one or more predictor
More information1. Simple Linear Regression
1. Simple Linear Regression Suppose that we are interested in the average height of male undergrads at UF. We put each male student s name (population) in a hat and randomly select 100 (sample). Then their
More informationChapter 1. Linear Regression with One Predictor Variable
Chapter 1. Linear Regression with One Predictor Variable 1.1 Statistical Relation Between Two Variables To motivate statistical relationships, let us consider a mathematical relation between two mathematical
More informationNo other aids are allowed. For example you are not allowed to have any other textbook or past exams.
UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Sample Exam Note: This is one of our past exams, In fact the only past exam with R. Before that we were using SAS. In
More informationAnalysis of Variance
Analysis of Variance Blood coagulation time T avg A 62 60 63 59 61 B 63 67 71 64 65 66 66 C 68 66 71 67 68 68 68 D 56 62 60 61 63 64 63 59 61 64 Blood coagulation time A B C D Combined 56 57 58 59 60 61
More informationSCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models
SCHOOL OF MATHEMATICS AND STATISTICS Linear and Generalised Linear Models Autumn Semester 2017 18 2 hours Attempt all the questions. The allocation of marks is shown in brackets. RESTRICTED OPEN BOOK EXAMINATION
More informationCorrelation and regression
1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,
More informationMS&E 226: Small Data
MS&E 226: Small Data Lecture 15: Examples of hypothesis tests (v5) Ramesh Johari ramesh.johari@stanford.edu 1 / 32 The recipe 2 / 32 The hypothesis testing recipe In this lecture we repeatedly apply the
More informationStatistics. Statistics
The main aims of statistics 1 1 Choosing a model 2 Estimating its parameter(s) 1 point estimates 2 interval estimates 3 Testing hypotheses Distributions used in statistics: χ 2 n-distribution 2 Let X 1,
More information: The model hypothesizes a relationship between the variables. The simplest probabilistic model: or.
Chapter Simple Linear Regression : comparing means across groups : presenting relationships among numeric variables. Probabilistic Model : The model hypothesizes an relationship between the variables.
More informationBIOS 2083 Linear Models c Abdus S. Wahed
Chapter 5 206 Chapter 6 General Linear Model: Statistical Inference 6.1 Introduction So far we have discussed formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter
More informationSTAT763: Applied Regression Analysis. Multiple linear regression. 4.4 Hypothesis testing
STAT763: Applied Regression Analysis Multiple linear regression 4.4 Hypothesis testing Chunsheng Ma E-mail: cma@math.wichita.edu 4.4.1 Significance of regression Null hypothesis (Test whether all β j =
More informationInstitute of Actuaries of India
Institute of Actuaries of India Subject CT3 Probability & Mathematical Statistics May 2011 Examinations INDICATIVE SOLUTION Introduction The indicative solution has been written by the Examiners with the
More informationInferences for Regression
Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In
More informationSTAT2912: Statistical Tests. Solution week 12
STAT2912: Statistical Tests Solution week 12 1. A behavioural biologist believes that performance of a laboratory rat on an intelligence test depends, to a large extent, on the amount of protein in the
More informationSTAT5044: Regression and Anova. Inyoung Kim
STAT5044: Regression and Anova Inyoung Kim 2 / 47 Outline 1 Regression 2 Simple Linear regression 3 Basic concepts in regression 4 How to estimate unknown parameters 5 Properties of Least Squares Estimators:
More informationMATH 644: Regression Analysis Methods
MATH 644: Regression Analysis Methods FINAL EXAM Fall, 2012 INSTRUCTIONS TO STUDENTS: 1. This test contains SIX questions. It comprises ELEVEN printed pages. 2. Answer ALL questions for a total of 100
More informationStatistical Inference: Estimation and Confidence Intervals Hypothesis Testing
Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire
More informationUNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator
UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages
More informationMultivariate Linear Regression Models
Multivariate Linear Regression Models Regression analysis is used to predict the value of one or more responses from a set of predictors. It can also be used to estimate the linear association between
More informationStatistics for Engineers Lecture 9 Linear Regression
Statistics for Engineers Lecture 9 Linear Regression Chong Ma Department of Statistics University of South Carolina chongm@email.sc.edu April 17, 2017 Chong Ma (Statistics, USC) STAT 509 Spring 2017 April
More information1 Introduction to Minitab
1 Introduction to Minitab Minitab is a statistical analysis software package. The software is freely available to all students and is downloadable through the Technology Tab at my.calpoly.edu. When you
More informationOne-Way ANOVA. Some examples of when ANOVA would be appropriate include:
One-Way ANOVA 1. Purpose Analysis of variance (ANOVA) is used when one wishes to determine whether two or more groups (e.g., classes A, B, and C) differ on some outcome of interest (e.g., an achievement
More informationMeasuring the fit of the model - SSR
Measuring the fit of the model - SSR Once we ve determined our estimated regression line, we d like to know how well the model fits. How far/close are the observations to the fitted line? One way to do
More informationCan you tell the relationship between students SAT scores and their college grades?
Correlation One Challenge Can you tell the relationship between students SAT scores and their college grades? A: The higher SAT scores are, the better GPA may be. B: The higher SAT scores are, the lower
More information6. Multiple Linear Regression
6. Multiple Linear Regression SLR: 1 predictor X, MLR: more than 1 predictor Example data set: Y i = #points scored by UF football team in game i X i1 = #games won by opponent in their last 10 games X
More information401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.
401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis
More informationTopic 20: Single Factor Analysis of Variance
Topic 20: Single Factor Analysis of Variance Outline Single factor Analysis of Variance One set of treatments Cell means model Factor effects model Link to linear regression using indicator explanatory
More informationz and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests
z and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests Chapters 3.5.1 3.5.2, 3.3.2 Prof. Tesler Math 283 Fall 2018 Prof. Tesler z and t tests for mean Math
More informationhttp://www.statsoft.it/out.php?loc=http://www.statsoft.com/textbook/ Group comparison test for independent samples The purpose of the Analysis of Variance (ANOVA) is to test for significant differences
More informationEstimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.
Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.
More informationBusiness Statistics. Lecture 9: Simple Regression
Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals
More informationFormal Statement of Simple Linear Regression Model
Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor
More informationSTAT 525 Fall Final exam. Tuesday December 14, 2010
STAT 525 Fall 2010 Final exam Tuesday December 14, 2010 Time: 2 hours Name (please print): Show all your work and calculations. Partial credit will be given for work that is partially correct. Points will
More informationRegression Estimation - Least Squares and Maximum Likelihood. Dr. Frank Wood
Regression Estimation - Least Squares and Maximum Likelihood Dr. Frank Wood Least Squares Max(min)imization Function to minimize w.r.t. β 0, β 1 Q = n (Y i (β 0 + β 1 X i )) 2 i=1 Minimize this by maximizing
More informationLecture 10 Multiple Linear Regression
Lecture 10 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 10-1 Topic Overview Multiple Linear Regression Model 10-2 Data for Multiple Regression Y i is the response variable
More informationST430 Exam 1 with Answers
ST430 Exam 1 with Answers Date: October 5, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textook are permitted but you may use a calculator.
More informationFigure 1: The fitted line using the shipment route-number of ampules data. STAT5044: Regression and ANOVA The Solution of Homework #2 Inyoung Kim
0.0 1.0 1.5 2.0 2.5 3.0 8 10 12 14 16 18 20 22 y x Figure 1: The fitted line using the shipment route-number of ampules data STAT5044: Regression and ANOVA The Solution of Homework #2 Inyoung Kim Problem#
More informationObjectives Simple linear regression. Statistical model for linear regression. Estimating the regression parameters
Objectives 10.1 Simple linear regression Statistical model for linear regression Estimating the regression parameters Confidence interval for regression parameters Significance test for the slope Confidence
More informationSTAT 4385 Topic 01: Introduction & Review
STAT 4385 Topic 01: Introduction & Review Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso xsu@utep.edu Spring, 2016 Outline Welcome What is Regression Analysis? Basics
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationFinalExamReview. Sta Fall Provided: Z, t and χ 2 tables
Final Exam FinalExamReview Sta 101 - Fall 2017 Duke University, Department of Statistical Science When: Wednesday, December 13 from 9:00am-12:00pm What to bring: Scientific calculator (graphing calculator
More informationLecture 6 Multiple Linear Regression, cont.
Lecture 6 Multiple Linear Regression, cont. BIOST 515 January 22, 2004 BIOST 515, Lecture 6 Testing general linear hypotheses Suppose we are interested in testing linear combinations of the regression
More information