Probability and Statistics Notes
|
|
- Darleen Casey
- 5 years ago
- Views:
Transcription
1 Probability and Statistics Notes Chapter Seven Jesse Crawford Department of Mathematics Tarleton State University Spring 2011 (Tarleton State University) Chapter Seven Notes Spring / 42
2 Outline 1 General Hypothesis Testing Concepts 2 Section 7.2: Tests about One Mean 3 More General Concepts: Test Statistics and p-values 4 Section 7.1: Tests about Proportions 5 Section 7.3: Tests of the Equality of Two Means 6 Section 7.4: Tests for Variances 7 Sections 6.7, 6.8, and 7.7: Linear Regression (Tarleton State University) Chapter Seven Notes Spring / 42
3 Statistical testing problems usually involve two hypotheses about a parameter, such as H 0 : µ 150 vs. H 1 : µ > 150. Here, the parameter µ represents the average mass of an apple produced using a new type of fertilizer. Assuming that 150 grams is the mass of an apple produced using the old fertilizer, H 0 represents a hypothesis of no change, and H 0 is called the null hypothesis. H 1 is the hypothesis that the fertilizer manufacturer would need to prove to convince people to buy this type of fertilizer. H 1 is called the alternative hypothesis. (Tarleton State University) Chapter Seven Notes Spring / 42
4 For mathematical reasons, the testing problem is usually restated as H 0 : µ = 150 vs. H 1 : µ > 150. (Replaced with = in H 0 ) Two types of errors: Type I: Reject H0 when H 0 is true. Type II: Don t reject H0 when H 0 is false. (We never accept H 0 ) We can t decrease the chance of both types of errors without raising the sample size. (Tarleton State University) Chapter Seven Notes Spring / 42
5 Solution: settle for making P[Type I error] small. We choose a number α, called the significance level of the test, and construct the test so that P[Type I error] = α. A conventional value of α is 0.05, but other values may be more appropriate, depending on the situation. If α is small, rejecting H 0 is strong evidence that H 0 is false, and the smaller α is, the stronger the evidence is. Unfortunately, not rejecting H 0 is not strong evidence that H 0 is true. This is a more delicate matter that s addressed in chapter 10. Hypothesis tests with significance level α correspond to confidence intervals with confidence level 1 α. (Tarleton State University) Chapter Seven Notes Spring / 42
6 The remaining slides in this section are not essential to material in chapter 7. P[Type II error] = β. Among all tests with significance level α, we would like the one that minimizes β. K = 1 β is called the power of the test. Minimizing β is equivalent to maximizing power, so we want a most powerful test. Notice that β is actually a function of the unknown parameter. (Tarleton State University) Chapter Seven Notes Spring / 42
7 For example, let s return to H 0 : µ = 150 vs. H 1 : µ > 150, and let s assume the population standard deviation is σ = 10. If µ is actually equal to , then H 0 is false, but the chance of rejecting H 0 will be low, unless the sample size is extremely large. The chance of making a type II error would be large and the power would be low. If µ is actually equal to 300, rejecting H 0 would be very likely, even with a small sample size. The chance of making a type II error would be small and the power would be high. This shows how β and the power depend on the unknown parameter. The tests studied in this chapter maximize power for each value of the unknown parameter, so they are called uniformly most powerful tests, a topic discussed in more detail in chapter 10. (Tarleton State University) Chapter Seven Notes Spring / 42
8 The power of these tests is still low if the true parameter is close to H 0 (relative to the population variance and the sample size). In these cases, achieving a high power level is not possible. (Maximum power does not mean high power. If a high power is impossible, the maximum will still be low.) In other words, even if we don t reject H 0, it is always possible that H 0 is false, but the true parameter is so close to H 0 that it appears that H 0 is true. For this reason we can never accept H 0. (Tarleton State University) Chapter Seven Notes Spring / 42
9 Outline 1 General Hypothesis Testing Concepts 2 Section 7.2: Tests about One Mean 3 More General Concepts: Test Statistics and p-values 4 Section 7.1: Tests about Proportions 5 Section 7.3: Tests of the Equality of Two Means 6 Section 7.4: Tests for Variances 7 Sections 6.7, 6.8, and 7.7: Linear Regression (Tarleton State University) Chapter Seven Notes Spring / 42
10 Let µ be the mean of a population, and consider the hypothesis test H 0 : µ = µ 0 vs. H 1 : µ µ 0. Suppose X 1,..., X n is a random sample from this population, and assume that one of the following conditions holds: n < 30 and the population is normal, or n 30. If σ is unknown, the decision rule is where T is the test statistic This is a two-tailed test. Reject H 0 if T t α/2 (n 1), T = X µ 0 S/ n. (Tarleton State University) Chapter Seven Notes Spring / 42
11 For the testing problem reject H 0 if T > t α (n 1). For the testing problem reject H 0 if T < t α (n 1). These are one-tailed tests. H 0 : µ = µ 0 vs. H 1 : µ > µ 0, H 0 : µ = µ 0 vs. H 1 : µ < µ 0, If σ is known, replace s with σ and t α (n 1) with z α. Recall that if n is large, t α (n 1) can also be replaced by z α as an approximation. (Tarleton State University) Chapter Seven Notes Spring / 42
12 Outline 1 General Hypothesis Testing Concepts 2 Section 7.2: Tests about One Mean 3 More General Concepts: Test Statistics and p-values 4 Section 7.1: Tests about Proportions 5 Section 7.3: Tests of the Equality of Two Means 6 Section 7.4: Tests for Variances 7 Sections 6.7, 6.8, and 7.7: Linear Regression (Tarleton State University) Chapter Seven Notes Spring / 42
13 Test Statistics Recall that a parameter is a number (or vector etc.) related to the population. For example, the population mean µ and population variance σ 2 are parameters. Parameters are not random. A statistic is a number (or vector etc.) whose value is based on a random sample. For example, the sample mean X and sample variance S 2 are statistics. Statistics are random variables. Most decision rules for hypothesis tests are based on test statistics. For example, the decision rules from the previous section were based on the test statistic T = X µ 0 s/ n. (Tarleton State University) Chapter Seven Notes Spring / 42
14 p-values Once a sample has been collected and a test statistic has been calculated, the p-value of the test can also be calculated. Definition The p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming the null hypothesis is true. (Tarleton State University) Chapter Seven Notes Spring / 42
15 Outline 1 General Hypothesis Testing Concepts 2 Section 7.2: Tests about One Mean 3 More General Concepts: Test Statistics and p-values 4 Section 7.1: Tests about Proportions 5 Section 7.3: Tests of the Equality of Two Means 6 Section 7.4: Tests for Variances 7 Sections 6.7, 6.8, and 7.7: Linear Regression (Tarleton State University) Chapter Seven Notes Spring / 42
16 Let p be a population proportion, and consider the testing problem H 0 : p = p 0 vs. H : p p 0. Suppose ˆp is the sample proportion from a sample of size n, and assume that both nˆp 5 and n(1 ˆp) 5. Decision rule: where Z is the test statistic Reject H 0 if Z z α/2, Z = ˆp p 0 p0 (1 p 0 )/n. Use appropriate modifications for one-tailed tests. (Tarleton State University) Chapter Seven Notes Spring / 42
17 Let p 1 and p 2 be two population proportions, and consider H 0 : p 1 = p 2 vs. H 1 : p 1 p 2. Let ˆp 1 = Y 1 /n 1 and ˆp 2 = Y 2 /n 2 be corresponding sample proportions based on independent samples of sizes n 1 and n 2, respectively. Also, assume that both n i ˆp i 5 and n i (1 ˆp i ) 5, for i = 1, 2. Decision rule: Reject H 0 if Z z α/2, where Z = ˆp 1 ˆp 2 ( ), and ˆp(1 ˆp) n1 n2 ˆp = Y 1 + Y 2 n 1 + n 2. (Tarleton State University) Chapter Seven Notes Spring / 42
18 Outline 1 General Hypothesis Testing Concepts 2 Section 7.2: Tests about One Mean 3 More General Concepts: Test Statistics and p-values 4 Section 7.1: Tests about Proportions 5 Section 7.3: Tests of the Equality of Two Means 6 Section 7.4: Tests for Variances 7 Sections 6.7, 6.8, and 7.7: Linear Regression (Tarleton State University) Chapter Seven Notes Spring / 42
19 Let N(µ X, σx 2 ) and N(µ Y, σy 2 ) be two normal populations, with the same, unknown variance σ = σ X = σ Y, and consider H 0 : µ X = µ Y vs. H 1 : µ X µ Y. Let X 1,..., X n and Y 1,..., Y m be independent samples from these populations. Decision rule: Reject H 0 if T t α/2 (n + m 2), where S p = T = X Y, and 1 S p n + 1 m (n 1)SX 2 + (m 1)S2 Y. n + m 2 This test is valid for any sample sizes, but it s only necessary to use this test for small sample sizes. (Tarleton State University) Chapter Seven Notes Spring / 42
20 Let N(µ X, σx 2 ) and N(µ Y, σy 2 ) be two normal populations, with known variances σ X and σ Y, and consider H 0 : µ X = µ Y vs. H 1 : µ X µ Y. Let X 1,..., X n and Y 1,..., Y m be independent samples from these populations. Decision rule: Reject H 0 if Z z α/2, where Z = X Y σ 2 Xn + σ2 Ym. This test is valid for any sample sizes. (Tarleton State University) Chapter Seven Notes Spring / 42
21 Consider two populations with means µ X and µ Y and variances σ X and σ Y. Consider H 0 : µ X = µ Y vs. H 1 : µ X µ Y. Let X 1,..., X n and Y 1,..., Y m be independent samples from these populations, where n 30 and m 30. Decision rule: Reject H 0 if Z z α/2, where Z = X Y. + S2 Y m S 2 X n This test is only valid for large sample sizes. If the population variances are known, use them in Z instead of the sample variances. (Tarleton State University) Chapter Seven Notes Spring / 42
22 Consider a sample of paired observations (X 1, Y 1 ),..., (X n, Y n ) and H 0 : µ X = µ Y vs. H 1 : µ X µ Y. Define the differences D i = X i Y i, for i = 1,..., n. The above testing problem is now equivalent to testing H 0 : µ D = 0 vs. H 1 : µ D 0, and can be solved using the techniques from Section 7.2. Decision rule: Reject H 0 if T t α/2 (n 1), where T = D S D / n. This test is only valid if the populations are normal or n 30. (Tarleton State University) Chapter Seven Notes Spring / 42
23 Outline 1 General Hypothesis Testing Concepts 2 Section 7.2: Tests about One Mean 3 More General Concepts: Test Statistics and p-values 4 Section 7.1: Tests about Proportions 5 Section 7.3: Tests of the Equality of Two Means 6 Section 7.4: Tests for Variances 7 Sections 6.7, 6.8, and 7.7: Linear Regression (Tarleton State University) Chapter Seven Notes Spring / 42
24 Let N(µ X, σx 2 ) be a normal population, and consider H 0 : σ 2 X = σ2 0 vs. H 1 : σ 2 X σ2 0. Let X 1,..., X n be a random sample from this population. Decision rule: Reject H 0 if χ 2 χ 2 1 α/2 (n 1) or χ2 χ 2 α/2 (n 1), where χ 2 = (n 1)S2 σ0 2. This test is valid for any sample size. This test is not robust against deviations from normality. (Tarleton State University) Chapter Seven Notes Spring / 42
25 Let N(µ X, σx 2 ) and N(µ Y, σy 2 ) be two normal populations, and consider H 0 : σx 2 = σ2 Y vs. H 1 : σx 2 σ2 Y. Let X 1,..., X n and Y 1,..., Y m be independent samples from these populations. Decision rule: Reject H 0 if F F 1 α/2 (n 1, m 1) or F F α/2 (n 1, m 1), where F = S2 X SY 2. This test is valid for any sample sizes. This test is not robust against deviations from normality. (Tarleton State University) Chapter Seven Notes Spring / 42
26 Outline 1 General Hypothesis Testing Concepts 2 Section 7.2: Tests about One Mean 3 More General Concepts: Test Statistics and p-values 4 Section 7.1: Tests about Proportions 5 Section 7.3: Tests of the Equality of Two Means 6 Section 7.4: Tests for Variances 7 Sections 6.7, 6.8, and 7.7: Linear Regression (Tarleton State University) Chapter Seven Notes Spring / 42
27 Simple Linear Regression A simple linear regression model is given by Y i = a + bx i + ɛ i, for i = 1,..., n, where Y 1,..., Y n and x 1,..., x n are measurements of some physical quantities. We are interested in finding a relationship between these quantities and using x to predict Y. The Y i s are random variables, while the x i s can be regarded as random or constant. Both the Y i s and x i s are observable. x is often called the independent, explanatory, or input variable. Y is often called the dependent, response, or output variable. (Tarleton State University) Chapter Seven Notes Spring / 42
28 Simple Linear Regression Y i = a + bx i + ɛ i, for i = 1,..., n, where This model assumes there is a linear relationship between x and Y, except for random error. The parameters a and b are the y-intercept and slope of this line. a and b are fixed numbers. a and b are not observable. (Tarleton State University) Chapter Seven Notes Spring / 42
29 Simple Linear Regression Y i = a + bx i + ɛ i, for i = 1,..., n, where The ɛ i s are random errors. We will assume that ɛ 1,..., ɛ n are IID N(0, σ 2 ) random variables. σ 2 is a parameter (it s just a number, not random). The ɛ i s and σ 2 are not observable. (Tarleton State University) Chapter Seven Notes Spring / 42
30 Simple Linear Regression: Concise Formulation A simple linear regression model is given by Y i = a + bx i + ɛ i, for i = 1,..., n, where Y 1,..., Y n are observable random variables; x 1,..., x n are observable numbers; a and b are unobservable numbers; ɛ 1,..., ɛ n are unobservable IID N(0, σ 2 ) random variables; and σ 2 is an unobservable positive number. (Tarleton State University) Chapter Seven Notes Spring / 42
31 Hook s Law Hook s Law from physics states that the length of a spring is a linear function of the mass placed on the spring. Consider the following data. Mass (kg) Length (cm) Does the data fall exactly on a line? Why not? Find the slope and y-intercept of the best fitting line. (Tarleton State University) Chapter Seven Notes Spring / 42
32 Estimating the y-intercept and Slope For the simple linear regression model, the MLEs for a and b are least squares estimates: they minimize the sum of the squares of the distances from the data points to the regression line. ˆb = r S Y xi y i 1/n( x i )( y i ) = S x x 2 i 1/n(. x i ) 2 The regression line passes through the vector of means (x, Y ). Therefore, â = Y ˆbx. Note that â and ˆb are observable random variables. (Tarleton State University) Chapter Seven Notes Spring / 42
33 Estimates for Hook s Law Example For the Hook s Law example, ˆb = and â = Scatter plot and least squares line: How would we estimate σ 2? (Tarleton State University) Chapter Seven Notes Spring / 42
34 Residuals and Estimating σ 2 Rearranging the regression equation, we obtain ɛ i = Y i (a + bx i ), for i = 1,..., n. We can t directly observe the ɛ i s. We can observe the residuals, given by the equation below: e i = Y i (â + ˆbx i ), for i = 1,..., n. The residuals are observable random variables. If our estimates are accurate, e i ɛ i. The (unbiased) MLE for σ 2 is ˆσ 2 = 1 n 2 n ei 2. i=1 (Tarleton State University) Chapter Seven Notes Spring / 42
35 Strong Positive Correlation (Tarleton State University) Chapter Seven Notes Spring / 42
36 Strong Negative Correlation (Tarleton State University) Chapter Seven Notes Spring / 42
37 Virtually No Correlation (Tarleton State University) Chapter Seven Notes Spring / 42
38 Multiple Regression Y i = β 1 x i1 + β 2 x i2 + + β p x ip + ɛ i, for i = 1,..., n, where ɛ 1,..., ɛ n are unobservable IID N(0, σ 2 ) random variables. Y 1. Y n = X is called the design matrix. X 11 X 1p.. X n1 X np Y = Xβ + ɛ β 1. β p ɛ 1 +. ɛ n (Tarleton State University) Chapter Seven Notes Spring / 42
39 Y = Xβ + ɛ ˆβ = (X t X) 1 X t Y cov( ˆβ) = σ 2 (X t X) 1 The standard error of ˆβ j is the square root of the jth diagonal entry of cov( ˆβ). SE( ˆ ˆβ j ) = ˆσ [(X t X) 1 ] jj The following random variable has a t-distribution with n p degrees of freedom. T = ˆβ j β j SE( ˆ ˆβ j ). (Tarleton State University) Chapter Seven Notes Spring / 42
40 The following random variable has a t-distribution with n p degrees of freedom. 1 α confidence interval of β j : T = ˆβ j β j SE( ˆ ˆβ j ). ˆβ j ± t α/2 (n p) ˆ SE( ˆβ j ) To test H 0 : β j = 0 vs. H 1 : β j 0, compute and reject if T t α/2 (n p). ˆβ j T = SE( ˆ ˆβ j ), (Tarleton State University) Chapter Seven Notes Spring / 42
41 The F -test Consider the multiple regression model Y i = β 1 x i1 + β 2 x i2 + + β p x ip + ɛ i, for i = 1,..., n. Let p 0 be a positive integer such that 1 p 0 p. Consider testing the null hypothesis that the first p 0 components in the regression equation are zero, i.e., H 0 : β 1 = = β p0 = 0 vs. H 1 : β j 0, for at least one j = 1,..., p 0. Let e be the vector of residuals for the full model. Let e (s) be the vector of residuals for the small model. The F-statistic used for this testing problem is F = ( e(s) 2 e 2 )/p 0 e 2. /(n p) (Tarleton State University) Chapter Seven Notes Spring / 42
42 The F -test H 0 : β 1 = = β p0 = 0 vs. H 1 : β j 0, for at least one j = 1,..., p 0. The F-statistic used for this testing problem is F = ( e(s) 2 e 2 )/p 0 e 2. /(n p) Under the null hypothesis, F has an F(p 0, n p) distribution. (Tarleton State University) Chapter Seven Notes Spring / 42
Confidence Intervals, Testing and ANOVA Summary
Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0
More informationSummary of Chapters 7-9
Summary of Chapters 7-9 Chapter 7. Interval Estimation 7.2. Confidence Intervals for Difference of Two Means Let X 1,, X n and Y 1, Y 2,, Y m be two independent random samples of sizes n and m from two
More informationCh 2: Simple Linear Regression
Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component
More informationSimple and Multiple Linear Regression
Sta. 113 Chapter 12 and 13 of Devore March 12, 2010 Table of contents 1 Simple Linear Regression 2 Model Simple Linear Regression A simple linear regression model is given by Y = β 0 + β 1 x + ɛ where
More informationLecture 15. Hypothesis testing in the linear model
14. Lecture 15. Hypothesis testing in the linear model Lecture 15. Hypothesis testing in the linear model 1 (1 1) Preliminary lemma 15. Hypothesis testing in the linear model 15.1. Preliminary lemma Lemma
More information[y i α βx i ] 2 (2) Q = i=1
Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation
More informationCh 3: Multiple Linear Regression
Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery
More informationCorrelation and Regression
Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class
More informationLECTURE 5 HYPOTHESIS TESTING
October 25, 2016 LECTURE 5 HYPOTHESIS TESTING Basic concepts In this lecture we continue to discuss the normal classical linear regression defined by Assumptions A1-A5. Let θ Θ R d be a parameter of interest.
More informationProblems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B
Simple Linear Regression 35 Problems 1 Consider a set of data (x i, y i ), i =1, 2,,n, and the following two regression models: y i = β 0 + β 1 x i + ε, (i =1, 2,,n), Model A y i = γ 0 + γ 1 x i + γ 2
More informationSimple Linear Regression
Simple Linear Regression ST 370 Regression models are used to study the relationship of a response variable and one or more predictors. The response is also called the dependent variable, and the predictors
More informationLecture 6 Multiple Linear Regression, cont.
Lecture 6 Multiple Linear Regression, cont. BIOST 515 January 22, 2004 BIOST 515, Lecture 6 Testing general linear hypotheses Suppose we are interested in testing linear combinations of the regression
More informationWISE International Masters
WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are
More informationSimple Linear Regression
Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)
More informationChapter 7. Hypothesis Testing
Chapter 7. Hypothesis Testing Joonpyo Kim June 24, 2017 Joonpyo Kim Ch7 June 24, 2017 1 / 63 Basic Concepts of Testing Suppose that our interest centers on a random variable X which has density function
More informationLectures 5 & 6: Hypothesis Testing
Lectures 5 & 6: Hypothesis Testing in which you learn to apply the concept of statistical significance to OLS estimates, learn the concept of t values, how to use them in regression work and come across
More informationSociology 6Z03 Review II
Sociology 6Z03 Review II John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review II Fall 2016 1 / 35 Outline: Review II Probability Part I Sampling Distributions Probability
More informationMath 494: Mathematical Statistics
Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/
More informationInference for Regression
Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu
More informationSTAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression
STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression Rebecca Barter April 20, 2015 Fisher s Exact Test Fisher s Exact Test
More informationTopic 16 Interval Estimation
Topic 16 Interval Estimation Additional Topics 1 / 9 Outline Linear Regression Interpretation of the Confidence Interval 2 / 9 Linear Regression For ordinary linear regression, we have given least squares
More informationStatistical Hypothesis Testing
Statistical Hypothesis Testing Dr. Phillip YAM 2012/2013 Spring Semester Reference: Chapter 7 of Tests of Statistical Hypotheses by Hogg and Tanis. Section 7.1 Tests about Proportions A statistical hypothesis
More informationSummary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)
Summary of Chapter 7 (Sections 7.2-7.5) and Chapter 8 (Section 8.1) Chapter 7. Tests of Statistical Hypotheses 7.2. Tests about One Mean (1) Test about One Mean Case 1: σ is known. Assume that X N(µ, σ
More informationChapter 4. Regression Models. Learning Objectives
Chapter 4 Regression Models To accompany Quantitative Analysis for Management, Eleventh Edition, by Render, Stair, and Hanna Power Point slides created by Brian Peterson Learning Objectives After completing
More informationINTERVAL ESTIMATION AND HYPOTHESES TESTING
INTERVAL ESTIMATION AND HYPOTHESES TESTING 1. IDEA An interval rather than a point estimate is often of interest. Confidence intervals are thus important in empirical work. To construct interval estimates,
More informationApplied Econometrics (QEM)
Applied Econometrics (QEM) based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #3 1 / 42 Outline 1 2 3 t-test P-value Linear
More informationTHE ROYAL STATISTICAL SOCIETY 2008 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE (MODULAR FORMAT) MODULE 4 LINEAR MODELS
THE ROYAL STATISTICAL SOCIETY 008 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE (MODULAR FORMAT) MODULE 4 LINEAR MODELS The Society provides these solutions to assist candidates preparing for the examinations
More informationSTA 114: Statistics. Notes 21. Linear Regression
STA 114: Statistics Notes 1. Linear Regression Introduction A most widely used statistical analysis is regression, where one tries to explain a response variable Y by an explanatory variable X based on
More informationLecture 14 Simple Linear Regression
Lecture 4 Simple Linear Regression Ordinary Least Squares (OLS) Consider the following simple linear regression model where, for each unit i, Y i is the dependent variable (response). X i is the independent
More information1 Statistical inference for a population mean
1 Statistical inference for a population mean 1. Inference for a large sample, known variance Suppose X 1,..., X n represents a large random sample of data from a population with unknown mean µ and known
More informationCHAPTER 8. Test Procedures is a rule, based on sample data, for deciding whether to reject H 0 and contains:
CHAPTER 8 Test of Hypotheses Based on a Single Sample Hypothesis testing is the method that decide which of two contradictory claims about the parameter is correct. Here the parameters of interest are
More informationMS&E 226: Small Data
MS&E 226: Small Data Lecture 15: Examples of hypothesis tests (v5) Ramesh Johari ramesh.johari@stanford.edu 1 / 32 The recipe 2 / 32 The hypothesis testing recipe In this lecture we repeatedly apply the
More informationy = a + bx 12.1: Inference for Linear Regression Review: General Form of Linear Regression Equation Review: Interpreting Computer Regression Output
12.1: Inference for Linear Regression Review: General Form of Linear Regression Equation y = a + bx y = dependent variable a = intercept b = slope x = independent variable Section 12.1 Inference for Linear
More informationPeter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8
Contents 1 Linear model 1 2 GLS for multivariate regression 5 3 Covariance estimation for the GLM 8 4 Testing the GLH 11 A reference for some of this material can be found somewhere. 1 Linear model Recall
More informationPerformance Evaluation and Comparison
Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Cross Validation and Resampling 3 Interval Estimation
More informationEconometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague
Econometrics Week 8 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 25 Recommended Reading For the today Instrumental Variables Estimation and Two Stage
More informationInference for Regression Simple Linear Regression
Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating
More informationHypothesis Testing The basic ingredients of a hypothesis test are
Hypothesis Testing The basic ingredients of a hypothesis test are 1 the null hypothesis, denoted as H o 2 the alternative hypothesis, denoted as H a 3 the test statistic 4 the data 5 the conclusion. The
More informationChapter 14. Linear least squares
Serik Sagitov, Chalmers and GU, March 5, 2018 Chapter 14 Linear least squares 1 Simple linear regression model A linear model for the random response Y = Y (x) to an independent variable X = x For a given
More informationLecture 3: Inference in SLR
Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals
More informationPreliminary Statistics Lecture 5: Hypothesis Testing (Outline)
1 School of Oriental and African Studies September 2015 Department of Economics Preliminary Statistics Lecture 5: Hypothesis Testing (Outline) Gujarati D. Basic Econometrics, Appendix A.8 Barrow M. Statistics
More informationHarvard University. Rigorous Research in Engineering Education
Statistical Inference Kari Lock Harvard University Department of Statistics Rigorous Research in Engineering Education 12/3/09 Statistical Inference You have a sample and want to use the data collected
More informationScatter plot of data from the study. Linear Regression
1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25
More informationLinear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,
Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,
More information(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.
FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December
More informationSTAT 135 Lab 5 Bootstrapping and Hypothesis Testing
STAT 135 Lab 5 Bootstrapping and Hypothesis Testing Rebecca Barter March 2, 2015 The Bootstrap Bootstrap Suppose that we are interested in estimating a parameter θ from some population with members x 1,...,
More informationLECTURE 5. Introduction to Econometrics. Hypothesis testing
LECTURE 5 Introduction to Econometrics Hypothesis testing October 18, 2016 1 / 26 ON TODAY S LECTURE We are going to discuss how hypotheses about coefficients can be tested in regression models We will
More informationAMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression
AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number
More informationLecture 3: Multiple Regression
Lecture 3: Multiple Regression R.G. Pierse 1 The General Linear Model Suppose that we have k explanatory variables Y i = β 1 + β X i + β 3 X 3i + + β k X ki + u i, i = 1,, n (1.1) or Y i = β j X ji + u
More informationStats Review Chapter 14. Mary Stangler Center for Academic Success Revised 8/16
Stats Review Chapter 14 Revised 8/16 Note: This review is meant to highlight basic concepts from the course. It does not cover all concepts presented by your instructor. Refer back to your notes, unit
More informationRegression Review. Statistics 149. Spring Copyright c 2006 by Mark E. Irwin
Regression Review Statistics 149 Spring 2006 Copyright c 2006 by Mark E. Irwin Matrix Approach to Regression Linear Model: Y i = β 0 + β 1 X i1 +... + β p X ip + ɛ i ; ɛ i iid N(0, σ 2 ), i = 1,..., n
More informationThe regression model with one fixed regressor cont d
The regression model with one fixed regressor cont d 3150/4150 Lecture 4 Ragnar Nymoen 27 January 2012 The model with transformed variables Regression with transformed variables I References HGL Ch 2.8
More informationQuantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing
Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October
More informationCentral Limit Theorem ( 5.3)
Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately
More informationSTA442/2101: Assignment 5
STA442/2101: Assignment 5 Craig Burkett Quiz on: Oct 23 rd, 2015 The questions are practice for the quiz next week, and are not to be handed in. I would like you to bring in all of the code you used to
More informationBusiness Statistics: Lecture 8: Introduction to Estimation & Hypothesis Testing
Business Statistics: Lecture 8: Introduction to Estimation & Hypothesis Testing Agenda Introduction to Estimation Point estimation Interval estimation Introduction to Hypothesis Testing Concepts en terminology
More informationChapter 5: HYPOTHESIS TESTING
MATH411: Applied Statistics Dr. YU, Chi Wai Chapter 5: HYPOTHESIS TESTING 1 WHAT IS HYPOTHESIS TESTING? As its name indicates, it is about a test of hypothesis. To be more precise, we would first translate
More informationMultiple Regression Analysis
Multiple Regression Analysis y = β 0 + β 1 x 1 + β 2 x 2 +... β k x k + u 2. Inference 0 Assumptions of the Classical Linear Model (CLM)! So far, we know: 1. The mean and variance of the OLS estimators
More informationReview of Statistics 101
Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods
More informationEconometrics of Panel Data
Econometrics of Panel Data Jakub Mućk Meeting # 2 Jakub Mućk Econometrics of Panel Data Meeting # 2 1 / 26 Outline 1 Fixed effects model The Least Squares Dummy Variable Estimator The Fixed Effect (Within
More informationScatter plot of data from the study. Linear Regression
1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25
More informationBasics of Experimental Design. Review of Statistics. Basic Study. Experimental Design. When an Experiment is Not Possible. Studying Relations
Basics of Experimental Design Review of Statistics And Experimental Design Scientists study relation between variables In the context of experiments these variables are called independent and dependent
More informationMultiple Linear Regression
Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there
More informationLECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit
LECTURE 6 Introduction to Econometrics Hypothesis testing & Goodness of fit October 25, 2016 1 / 23 ON TODAY S LECTURE We will explain how multiple hypotheses are tested in a regression model We will define
More informationIntroductory Econometrics
Session 4 - Testing hypotheses Roland Sciences Po July 2011 Motivation After estimation, delivering information involves testing hypotheses Did this drug had any effect on the survival rate? Is this drug
More information18.S096 Problem Set 3 Fall 2013 Regression Analysis Due Date: 10/8/2013
18.S096 Problem Set 3 Fall 013 Regression Analysis Due Date: 10/8/013 he Projection( Hat ) Matrix and Case Influence/Leverage Recall the setup for a linear regression model y = Xβ + ɛ where y and ɛ are
More informationBias Variance Trade-off
Bias Variance Trade-off The mean squared error of an estimator MSE(ˆθ) = E([ˆθ θ] 2 ) Can be re-expressed MSE(ˆθ) = Var(ˆθ) + (B(ˆθ) 2 ) MSE = VAR + BIAS 2 Proof MSE(ˆθ) = E((ˆθ θ) 2 ) = E(([ˆθ E(ˆθ)]
More information(a) (3 points) Construct a 95% confidence interval for β 2 in Equation 1.
Problem 1 (21 points) An economist runs the regression y i = β 0 + x 1i β 1 + x 2i β 2 + x 3i β 3 + ε i (1) The results are summarized in the following table: Equation 1. Variable Coefficient Std. Error
More informationApplied Regression. Applied Regression. Chapter 2 Simple Linear Regression. Hongcheng Li. April, 6, 2013
Applied Regression Chapter 2 Simple Linear Regression Hongcheng Li April, 6, 2013 Outline 1 Introduction of simple linear regression 2 Scatter plot 3 Simple linear regression model 4 Test of Hypothesis
More informationBTRY 4090: Spring 2009 Theory of Statistics
BTRY 4090: Spring 2009 Theory of Statistics Guozhang Wang September 25, 2010 1 Review of Probability We begin with a real example of using probability to solve computationally intensive (or infeasible)
More informationMath 494: Mathematical Statistics
Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/
More informationCherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants
18.650 Statistics for Applications Chapter 5: Parametric hypothesis testing 1/37 Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009
More informationPsychology 282 Lecture #4 Outline Inferences in SLR
Psychology 282 Lecture #4 Outline Inferences in SLR Assumptions To this point we have not had to make any distributional assumptions. Principle of least squares requires no assumptions. Can use correlations
More informationME3620. Theory of Engineering Experimentation. Spring Chapter IV. Decision Making for a Single Sample. Chapter IV
Theory of Engineering Experimentation Chapter IV. Decision Making for a Single Sample Chapter IV 1 4 1 Statistical Inference The field of statistical inference consists of those methods used to make decisions
More informationSimple linear regression
Simple linear regression Biometry 755 Spring 2008 Simple linear regression p. 1/40 Overview of regression analysis Evaluate relationship between one or more independent variables (X 1,...,X k ) and a single
More informationWISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A
WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, 2016-17 Academic Year Exam Version: A INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This
More informationMath 423/533: The Main Theoretical Topics
Math 423/533: The Main Theoretical Topics Notation sample size n, data index i number of predictors, p (p = 2 for simple linear regression) y i : response for individual i x i = (x i1,..., x ip ) (1 p)
More informationCorrelation and Regression Analysis. Linear Regression and Correlation. Correlation and Linear Regression. Three Questions.
10/8/18 Correlation and Regression Analysis Correlation Analysis is the study of the relationship between variables. It is also defined as group of techniques to measure the association between two variables.
More informationBivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data.
Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Example: Some investors think that the performance of the stock market in January
More informationData Mining Stat 588
Data Mining Stat 588 Lecture 02: Linear Methods for Regression Department of Statistics & Biostatistics Rutgers University September 13 2011 Regression Problem Quantitative generic output variable Y. Generic
More informationLecture 10 Multiple Linear Regression
Lecture 10 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 10-1 Topic Overview Multiple Linear Regression Model 10-2 Data for Multiple Regression Y i is the response variable
More informationMultiple Linear Regression
Chapter 3 Multiple Linear Regression 3.1 Introduction Multiple linear regression is in some ways a relatively straightforward extension of simple linear regression that allows for more than one independent
More informationSimple Linear Regression. Material from Devore s book (Ed 8), and Cengagebrain.com
12 Simple Linear Regression Material from Devore s book (Ed 8), and Cengagebrain.com The Simple Linear Regression Model The simplest deterministic mathematical relationship between two variables x and
More informationy ˆ i = ˆ " T u i ( i th fitted value or i th fit)
1 2 INFERENCE FOR MULTIPLE LINEAR REGRESSION Recall Terminology: p predictors x 1, x 2,, x p Some might be indicator variables for categorical variables) k-1 non-constant terms u 1, u 2,, u k-1 Each u
More informationHypothesis Testing. ECE 3530 Spring Antonio Paiva
Hypothesis Testing ECE 3530 Spring 2010 Antonio Paiva What is hypothesis testing? A statistical hypothesis is an assertion or conjecture concerning one or more populations. To prove that a hypothesis is
More informationMATH 829: Introduction to Data Mining and Analysis Linear Regression: statistical tests
1/16 MATH 829: Introduction to Data Mining and Analysis Linear Regression: statistical tests Dominique Guillot Departments of Mathematical Sciences University of Delaware February 17, 2016 Statistical
More informationQuantitative Methods for Economics, Finance and Management (A86050 F86050)
Quantitative Methods for Economics, Finance and Management (A86050 F86050) Matteo Manera matteo.manera@unimib.it Marzio Galeotti marzio.galeotti@unimi.it 1 This material is taken and adapted from Guy Judge
More informationECON 4160, Autumn term Lecture 1
ECON 4160, Autumn term 2017. Lecture 1 a) Maximum Likelihood based inference. b) The bivariate normal model Ragnar Nymoen University of Oslo 24 August 2017 1 / 54 Principles of inference I Ordinary least
More informationChapter 14 Simple Linear Regression (A)
Chapter 14 Simple Linear Regression (A) 1. Characteristics Managerial decisions often are based on the relationship between two or more variables. can be used to develop an equation showing how the variables
More information1 Least Squares Estimation - multiple regression.
Introduction to multiple regression. Fall 2010 1 Least Squares Estimation - multiple regression. Let y = {y 1,, y n } be a n 1 vector of dependent variable observations. Let β = {β 0, β 1 } be the 2 1
More information1 Hypothesis testing for a single mean
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationContest Quiz 3. Question Sheet. In this quiz we will review concepts of linear regression covered in lecture 2.
Updated: November 17, 2011 Lecturer: Thilo Klein Contact: tk375@cam.ac.uk Contest Quiz 3 Question Sheet In this quiz we will review concepts of linear regression covered in lecture 2. NOTE: Please round
More informationSYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions
SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu
More informationLecture 4 Multiple linear regression
Lecture 4 Multiple linear regression BIOST 515 January 15, 2004 Outline 1 Motivation for the multiple regression model Multiple regression in matrix notation Least squares estimation of model parameters
More informationFormal Statement of Simple Linear Regression Model
Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor
More informationLecture 4: Heteroskedasticity
Lecture 4: Heteroskedasticity Econometric Methods Warsaw School of Economics (4) Heteroskedasticity 1 / 24 Outline 1 What is heteroskedasticity? 2 Testing for heteroskedasticity White Goldfeld-Quandt Breusch-Pagan
More informationThe scatterplot is the basic tool for graphically displaying bivariate quantitative data.
Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Example: Some investors think that the performance of the stock market in January
More informationEC2001 Econometrics 1 Dr. Jose Olmo Room D309
EC2001 Econometrics 1 Dr. Jose Olmo Room D309 J.Olmo@City.ac.uk 1 Revision of Statistical Inference 1.1 Sample, observations, population A sample is a number of observations drawn from a population. Population:
More informationExamination paper for TMA4255 Applied statistics
Department of Mathematical Sciences Examination paper for TMA4255 Applied statistics Academic contact during examination: Anna Marie Holand Phone: 951 38 038 Examination date: 16 May 2015 Examination time
More informationBusiness Statistics. Lecture 10: Course Review
Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,
More information