Chapter 24: Comparing means

Example: Consumer Reports annually conducts a survey of automobile reliability. Approximately 4 million households are surveyed by mail.[1] The 1990 survey is summarized in the figure by manufacturing location.

[Figure: reliability ratings of individual car models from the 1990 survey, plotted along a Reliability axis and grouped by manufacturing location (Domestic, Foreign).]

The data suggest that domestic cars are considered by owners to be less reliable than foreign cars. Do these data conclusively support the contention that domestic cars are less reliable than foreign cars? The question is best addressed by making it more specific: is the mean reliability rating of domestic cars less than that of foreign cars?
To proceed, let $\mu_1$ and $\mu_2$ denote the mean reliability ratings of foreign and domestic cars, respectively, for the time period of interest, and suppose that the data shown in the figure are representative of that period.[2] The objective is to draw inferences regarding $\mu_1 - \mu_2$; specifically, to assess the strength of evidence in favor of the hypothesis $H_a: \mu_2 - \mu_1 < 0$ (equivalently, $\mu_1 - \mu_2 > 0$) and against $H_0: \mu_1 - \mu_2 = 0$.

[1] The response rate is on the order of 0%.
[2] Car owners are not all alike, and they can have personality traits that directly influence their choice of vehicle, their vehicle expectations, and how they subsequently treat and maintain their cars. It's impossible to control for this kind of systematic error in their surveys.

Inferences regarding $\mu_1 - \mu_2$

The point estimator of $\mu_1 - \mu_2$ is $\bar{y}_1 - \bar{y}_2$. The mathematical foundation for confidence intervals and hypothesis tests regarding $\mu_1$ and $\mu_2$ is the sampling distribution of $\bar{y}_1 - \bar{y}_2$. Before developing the sampling distribution of $\bar{y}_1 - \bar{y}_2$, recall that $\bar{y}_1 \sim N(\mu_1, \sigma_1/\sqrt{n_1})$ and $\bar{y}_2 \sim N(\mu_2, \sigma_2/\sqrt{n_2})$, at least approximately. The results described in Chapter 6 imply the following statements are true:

$$E(\bar{y}_1 - \bar{y}_2) = \mu_1 - \mu_2,$$
$$\mathrm{Var}(\bar{y}_1 - \bar{y}_2) = \mathrm{Var}(\bar{y}_1) + \mathrm{Var}(\bar{y}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2},$$
$$\sigma(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$

The variance and standard deviation statements are true only if $\bar{y}_1$ and $\bar{y}_2$ are computed from independent samples from the respective populations. The term $\sigma(\bar{y}_1 - \bar{y}_2) = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$ is the standard deviation of $\bar{y}_1 - \bar{y}_2$. The Central Limit Theorem implies that, approximately,

$$\bar{y}_1 - \bar{y}_2 \sim N\left(\mu_1 - \mu_2, \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right).$$

In almost all practical applications, $\sigma_1$ and $\sigma_2$ must be estimated by the sample standard deviations $s_1$ and $s_2$, and $\sigma(\bar{y}_1 - \bar{y}_2)$ is estimated by the standard error of the difference:

$$\hat{\sigma}(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}.$$

As in the one-sample case, when $\sigma(\bar{y}_1 - \bar{y}_2)$ is replaced by $\hat{\sigma}(\bar{y}_1 - \bar{y}_2)$, the standardized form of $\bar{y}_1 - \bar{y}_2$ has a distribution that is approximately $t$.[3] That is,

$$T = \frac{\bar{y}_1 - \bar{y}_2 - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \sim t_{df}. \tag{1}$$

[3] The standardized form is exactly $t_{df}$ in distribution if the sampled population is normal; if it's not, then the standardized form is approximately $t_{df}$ in distribution.

Consequently, the t-distribution is used to obtain critical values and for computing p-values. The only substantive difference between one- and two-sample situations is that the degrees of freedom (df) for the two-sample case cannot be determined exactly. There are two common approximations of the degrees of freedom:

1. Set the degrees of freedom to be the smaller of $n_1 - 1$ and $n_2 - 1$. The approximation is conservative in the sense that confidence intervals are slightly wider than necessary and p-values are slightly larger than would be obtained from a more accurate approximation.
2. The second approximation (Satterthwaite's approximation) is more accurate but troublesome to compute. It is

$$df \approx \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}.$$

A $100(1 - \alpha)\%$ confidence interval for $\mu_1 - \mu_2$

Let $t^*$ denote the critical value for degrees of freedom df and a $100(1 - \alpha)\%$ confidence level. Then, a $100(1 - \alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is

$$\bar{y}_1 - \bar{y}_2 \pm t^* \hat{\sigma}(\bar{y}_1 - \bar{y}_2) = \bar{y}_1 - \bar{y}_2 \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}.$$

For the automobile reliability comparison, $n_1 = 26$ and $n_2 = 23$, so the first approximation of df is 22. The R function t.test computes Satterthwaite's approximation of the degrees of freedom as df = 46.797.[4] The critical value, obtained from R (using the command qt(.975, 46.797)), is $t^* = 2.012$. The sample statistics are $\bar{y}_1 = 4.384$, $s_1 = 1.023$, $\bar{y}_2 = 2.261$, and $s_2 = 0.964$. Thus,

$$\bar{y}_1 - \bar{y}_2 \pm t^* \hat{\sigma}(\bar{y}_1 - \bar{y}_2) = 2.123 \pm 2.012\,\hat{\sigma}(\bar{y}_1 - \bar{y}_2) = [1.55, 2.70].$$

The estimated difference in reliability scores between foreign and domestic cars is 2.123, and a 95% confidence interval for the mean difference among all foreign and domestic cars is [1.55, 2.70]. Zero is not even remotely close to being bracketed by the interval. If Satterthwaite's approximation of the degrees of freedom were replaced by the alternative approximation (the smaller of $n_1 - 1 = 25$ and $n_2 - 1 = 22$), then $t^* = 2.074$. The difference in confidence interval width between the two degrees-of-freedom approximations is negligible.

[4] t.test also computes a confidence interval and a test of the two-sided alternative $H_a: \mu_1 - \mu_2 \neq 0$.

A hypothesis test for comparing $\mu_1$ and $\mu_2$

A common null hypothesis when comparing two groups is $H_0: \mu_1 = \mu_2$, or equivalently, $H_0: \mu_1 - \mu_2 = 0$. The alternative hypothesis can be one-sided or two-sided:

$H_a: \mu_1 - \mu_2 > 0$ (or $H_a: \mu_1 > \mu_2$)
$H_a: \mu_1 - \mu_2 < 0$ (or $H_a: \mu_1 < \mu_2$)
$H_a: \mu_1 - \mu_2 \neq 0$ (or $H_a: \mu_1 \neq \mu_2$)

The test statistic is the two-sample t-statistic

$$T = \frac{\bar{y}_1 - \bar{y}_2}{\hat{\sigma}(\bar{y}_1 - \bar{y}_2)} = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}.$$

If $H_0$ is correct, then $T \sim t_{df}$, approximately. As discussed above, there are two common approximations of the degrees of freedom:

1. Set the degrees of freedom to be the smaller of $n_1 - 1$ and $n_2 - 1$.
2. Satterthwaite's approximation:

$$df \approx \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}.$$

$H_a$ determines how the p-value is computed, where $t$ denotes the observed value of $T$:

$H_a: \mu_1 - \mu_2 < 0$: p-value $= P(T \le t \mid H_0)$
$H_a: \mu_1 - \mu_2 > 0$: p-value $= P(T \ge t \mid H_0)$
$H_a: \mu_1 - \mu_2 \neq 0$: p-value $= 2P(T \ge |t| \mid H_0)$

Conditions of the two-sample t-procedures

1. Sampling:
Randomization: In an observational study, the two samples are random samples from their respective populations. In a controlled experiment, the subjects are randomly assigned to the two treatments.
Independent samples: In an observational study, the two samples are drawn independently of each other. In a controlled experiment, the condition is met if the subjects are randomly assigned to treatment groups.

2. Normality: The distribution of the variable across population 1 is normal; similarly, the distribution of the variable across population 2 is normal. As this condition is rarely met exactly, the distribution of $T$ is accurately approximated by the $t_{df}$ distribution provided that the following guidelines hold for each sample:
If the sample size is less than 15, then the sample distribution is without skewness or outliers.
If the sample size is between 15 and 40, then the sample distribution is roughly normal (only mild skewness or mild outliers are present).
If the sample size is greater than 40, then the shape of the sample distribution is of little concern.[5]

Example: Low birth weight is often an indicator of developmental delays and susceptibility to disease. Infant mortality rates and birth defect rates are greater for low birth weight babies than for normal birth weight babies. A woman's behavior during pregnancy (including diet, smoking habits, and prenatal care) can greatly alter the chances of carrying the baby to term and, consequently, of delivering a baby of normal birth weight. It's suspected that maternal hypertension is associated with low birth weight, and to investigate the relationship, data were collected on 189 women, 12 of whom had a history of hypertension and 177 of whom did not.[6]

[Figure: distributions of birth weight (g) for newborns of mothers with a hypertension history and with no hypertension history.]

The figure above shows the distribution of birth weight for newborns born to mothers with a history of hypertension, and to mothers without a history of hypertension. The figure is suggestive of a difference between the mean weight $\mu_1$ of newborns born to mothers without a history of hypertension and the mean weight $\mu_2$ of newborns born to mothers with a history of hypertension. A formal comparison can be conducted through a hypothesis test.

[5] In other words, if a sample size is greater than 40, then there is no concern even if the sample is highly skewed and contains outliers.
[6] Hosmer and Lemeshow (2000), Applied Logistic Regression, Second Edition. Data were collected at Baystate Medical Center, Springfield, Massachusetts during 1986.

The hypotheses are $H_0: \mu_1 - \mu_2 = 0$ and $H_a: \mu_1 - \mu_2 > 0$. The sample statistics are

No hypertension history: $n_1 = 177$, $\bar{y}_1 = 2972$ g, $s_1 = 709.4$ g
Hypertension history: $n_2 = 12$, $\bar{y}_2 = 2536.8$ g, $s_2 = 917.4$ g

The sample difference, as a percentage of $\bar{y}_1$, is 14.6%, which is practically significant. The test statistic is

$$T = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \tag{2}$$
$$= \frac{2972 - 2536.8}{\sqrt{\frac{709.4^2}{177} + \frac{917.4^2}{12}}} \tag{3}$$
$$= 1.61. \tag{4}$$

Satterthwaite's approximation of the degrees of freedom is 11.909 and the p-value is $P(T \ge 1.61 \mid H_0) = 0.0666$. There is some evidence supporting the contention that lower birth weights are associated with the mother's hypertension. Perhaps if the sample size for the hypertensive group were larger, then the data would be more conclusive. The independence and normality conditions are apparently satisfied (I used the R command qqnorm to construct the normal quantile-quantile plots).

[Figure: normal quantile-quantile plots for the two samples (Sample Quantiles versus Theoretical Quantiles).]

Normal quantile-quantile plots provide a visual check for the fit of a theoretical distribution to the observed data. A normal quantile-quantile plot is constructed by plotting the observed values of a variable against the theoretical quantiles, assuming that the data were sampled from a normal distribution. If the fit of the theoretical distribution to the observed values is good, then the plotted values fall along a straight line, though sampling variability implies that there will be variation about the line. If the data were not obtained by random sampling of a normal distribution, then the data pairs will deviate substantively from a straight line. I don't know enough about the data to comment on whether the samples are random or representative of larger populations of newborns.

Example: The textbook website has a data file containing sugar content (percentage by weight) of 7 children's cereals and 9 adult cereals. The data are visually summarized in the figure.

[Figure: side-by-side box plots of Sugar for Adult and Children cereals.]

The box plot suggests that the cereals are vastly different with respect to sugar content. Is this observation supported by an objective statistical analysis? Carry out a hypothesis test appropriate for answering this question.

Step 1 identifies to which of the following situations this problem corresponds:
1. One population involving the population proportion $p$
2. Two populations involving a comparison of population proportions $p_1$ and $p_2$
3. One population involving the population mean $\mu$
4. Two populations involving a comparison of population means $\mu_1$ and $\mu_2$
This problem is a two-population problem involving a comparison of population means $\mu_1$ and $\mu_2$: the populations of interest are all children's cereals and all adult cereals. The parameters of interest are the mean sugar content (percentage by weight) $\mu_1$ of children's cereals and the mean sugar content (percentage by weight) $\mu_2$ of adult cereals.

Step 2 sets up the hypotheses for testing whether $\mu_1$ is greater than $\mu_2$:

$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 > 0$

Step 3 identifies the test statistic and computes the terms necessary to evaluate the test statistic. The test statistic is the two-sample t-statistic

$$t = \frac{\bar{y}_1 - \bar{y}_2}{\hat{\sigma}(\bar{y}_1 - \bar{y}_2)}, \quad \text{where} \quad \hat{\sigma}(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}.$$

From StatCrunch, $\bar{y}_1$ = 47%, $\bar{y}_2$ = 7% and $\bar{y}_1 - \bar{y}_2$ = 355%. Also, $s_1$ = 435, $s_2$ = 880, $n_1$ = 7, and $n_2$ = 9, so that

$$\hat{\sigma}(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = 36.$$

Step 4 checks the large-sample and sampling conditions:
1. Randomization is doubtful: I suspect the data are a convenience sample from the shelves of a supermarket. To be fair, I don't have any evidence that this is how the data were collected.
2. Independence: there are only a handful of cereal manufacturers, and so I believe that a number of cereals in the samples were made by the same manufacturers. Independence is doubtful.
3. The samples contain a few outliers and some skewness. More than 40 observations are needed in each sample to be confident that the normality condition is satisfied. Normality is doubtful.
4. It's unclear whether the 10% condition is satisfied since the population sizes are unknown. I suspect that there are fewer than 00 cereals nationwide.

The formal test is nearly pointless because none of the conditions are met; furthermore, the box plots show very large differences between the distributions, as do the sample means. However, for those readers who are unwilling to draw a conclusion from the box plot, the t-statistic, viewed as a coarse measure of strength of evidence, shows very strong evidence against $H_0$ and in favor of $H_a$. The conclusion should be stated conservatively since the conditions are not met.

Step 5 computes the test statistic and the p-value: $t = (\bar{y}_1 - \bar{y}_2)/\hat{\sigma}(\bar{y}_1 - \bar{y}_2)$ = 50. Set df = $n_1 - 1$ = 6. StatCrunch computes the p-value as $P(t_6 \ge 5)$, and I will report that the p-value is less than 00.

Step 6 states the conclusion: there is very strong evidence that children's cereals contain more sugar than adult cereals; moreover, the observed difference, 355%, is very large. Said another way, there is about four times as much sugar, on average, in children's cereals as in adult cereals.
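
As a rough cross-check of the arithmetic in the automobile reliability and birth weight examples, the sketch below recomputes the standard error, both degrees-of-freedom approximations, the confidence interval, and the one-sided p-value in plain Python (the notes themselves use R). Several things here are assumptions rather than part of the notes: the summary statistics are taken to be n1 = 26, mean 4.384, s 1.023 and n2 = 23, mean 2.261, s 0.964 for the cars, and n1 = 177, mean 2972, s 709.4 and n2 = 12, mean 2536.8, s 917.4 for the birth weights; the critical value 2.012 is quoted rather than computed; and the function names `two_sample_summary` and `t_upper_tail` are illustrative only.

```python
import math


def two_sample_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Return (difference, standard error, Satterthwaite df, conservative df)."""
    v1 = sd1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = sd2 ** 2 / n2  # estimated variance of the second sample mean
    se = math.sqrt(v1 + v2)
    df_satt = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    df_conservative = min(n1, n2) - 1
    return mean1 - mean2, se, df_satt, df_conservative


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 by Simpson's rule on the t density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def density(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    # The density is symmetric about 0, so the upper tail is 0.5 minus the
    # area between 0 and t.
    return 0.5 - total * h / 3


# Automobile reliability: assumed summary statistics (group 1, then group 2).
car_diff, car_se, car_df, car_df_cons = two_sample_summary(
    26, 4.384, 1.023, 23, 2.261, 0.964)
t_star = 2.012  # critical value quoted from the text, not computed here
car_ci = (car_diff - t_star * car_se, car_diff + t_star * car_se)

# Birth weight: assumed summary statistics (no hypertension, then hypertension).
bw_diff, bw_se, bw_df, _ = two_sample_summary(
    177, 2972.0, 709.4, 12, 2536.8, 917.4)
bw_t = bw_diff / bw_se
bw_p = t_upper_tail(bw_t, bw_df)  # one-sided p-value for H_a: mu_1 - mu_2 > 0

print(f"cars: diff={car_diff:.3f} se={car_se:.3f} df={car_df:.3f} "
      f"ci=({car_ci[0]:.2f}, {car_ci[1]:.2f})")
print(f"birth weight: t={bw_t:.2f} df={bw_df:.3f} p={bw_p:.4f}")
```

Because the inputs are rounded summary statistics rather than the raw data, the results can differ slightly from software output; in particular, the upper endpoint of the interval may differ in the second decimal place. Simpson's rule stands in for a library t-distribution function only to keep the sketch free of third-party dependencies.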

Operators and the Formula Argument in lm

Operators and the Formula Argument in lm Operators and the Formula Argument in lm Recall that the first argument of lm (the formula argument) took the form y. or y x (recall that the term on the left of the told lm what the response variable

More information

Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1: Exploring Data Section 1.2 with Graphs The Practice of Statistics, 4 th edition - For AP* STARNES, YATES, MOORE Chapter 1 Exploring Data Introduction: Data Analysis: Making Sense of Data 1.1

More information

Figure T1: Consumer Segments with No Adverse Selection. Now, the discounted utility, V, of a segment 1 consumer is: Segment 1 (Buy New)

Figure T1: Consumer Segments with No Adverse Selection. Now, the discounted utility, V, of a segment 1 consumer is: Segment 1 (Buy New) Online Technical Companion to Accompany Trade-ins in Durable Goods Markets: Theory and Evidence This appendix is divided into six main sections which are ordered in a sequence corresponding to their appearance

More information

Math Section MW 1-2:30pm SR 117. Bekki George 206 PGH

Math Section MW 1-2:30pm SR 117. Bekki George 206 PGH Math 3339 Section 21155 MW 1-2:30pm SR 117 Bekki George bekki@math.uh.edu 206 PGH Office Hours: M 11-12:30pm & T,TH 10:00 11:00 am and by appointment Linear Regression (again) Consider the relationship

More information

Disadvantages of using many pooled t procedures. The sampling distribution of the sample means. The variability between the sample means

Disadvantages of using many pooled t procedures. The sampling distribution of the sample means. The variability between the sample means Stat 529 (Winter 2011) Analysis of Variance (ANOVA) Reading: Sections 5.1 5.3. Introduction and notation Birthweight example Disadvantages of using many pooled t procedures The analysis of variance procedure

More information

S160 #20. Comparing Two Means Paired Data. JC Wang. April 5, 2016

S160 #20. Comparing Two Means Paired Data. JC Wang. April 5, 2016 S160 #20 Comparing Two Means Paired Data JC Wang April 5, 2016 Outline 1 Paired Data Comparing Means in Paired Data (General) Paired Data JC Wang (WMU) S160 #20 S160, Lecture 20 2 / 12 Paired Data, Before-and-After

More information

Chapter 16. Simple Linear Regression and dcorrelation

Chapter 16. Simple Linear Regression and dcorrelation Chapter 16 Simple Linear Regression and dcorrelation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

AMS7: WEEK 7. CLASS 1. More on Hypothesis Testing Monday May 11th, 2015

AMS7: WEEK 7. CLASS 1. More on Hypothesis Testing Monday May 11th, 2015 AMS7: WEEK 7. CLASS 1 More on Hypothesis Testing Monday May 11th, 2015 Testing a Claim about a Standard Deviation or a Variance We want to test claims about or 2 Example: Newborn babies from mothers taking

More information

INSTITUTE OF ACTUARIES OF INDIA

INSTITUTE OF ACTUARIES OF INDIA INSTITUTE OF ACTUARIES OF INDIA EXAMINATIONS 4 th November 008 Subject CT3 Probability & Mathematical Statistics Time allowed: Three Hours (10.00 13.00 Hrs) Total Marks: 100 INSTRUCTIONS TO THE CANDIDATES

More information

Chapter 16. Simple Linear Regression and Correlation

Chapter 16. Simple Linear Regression and Correlation Chapter 16 Simple Linear Regression and Correlation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

An inferential procedure to use sample data to understand a population Procedures

An inferential procedure to use sample data to understand a population Procedures Hypothesis Test An inferential procedure to use sample data to understand a population Procedures Hypotheses, the alpha value, the critical region (z-scores), statistics, conclusion Two types of errors

More information

MATH Chapter 21 Notes Two Sample Problems

MATH Chapter 21 Notes Two Sample Problems MATH 1070 - Chapter 21 Notes Two Sample Problems Recall: So far, we have dealt with inference (confidence intervals and hypothesis testing) pertaining to: Single sample of data. A matched pairs design

More information

INFERENCE FOR REGRESSION

INFERENCE FOR REGRESSION CHAPTER 3 INFERENCE FOR REGRESSION OVERVIEW In Chapter 5 of the textbook, we first encountered regression. The assumptions that describe the regression model we use in this chapter are the following. We

More information

Polynomial Regression

Polynomial Regression Polynomial Regression Summary... 1 Analysis Summary... 3 Plot of Fitted Model... 4 Analysis Options... 6 Conditional Sums of Squares... 7 Lack-of-Fit Test... 7 Observed versus Predicted... 8 Residual Plots...

More information

Math 1040 Final Exam Form A Introduction to Statistics Fall Semester 2010

Math 1040 Final Exam Form A Introduction to Statistics Fall Semester 2010 Math 1040 Final Exam Form A Introduction to Statistics Fall Semester 2010 Instructor Name Time Limit: 120 minutes Any calculator is okay. Necessary tables and formulas are attached to the back of the exam.

More information

Confidence Intervals with σ unknown

Confidence Intervals with σ unknown STAT 141 Confidence Intervals and Hypothesis Testing 10/26/04 Today (Chapter 7): CI with σ unknown, t-distribution CI for proportions Two sample CI with σ known or unknown Hypothesis Testing, z-test Confidence

More information

Section 6-5 THE CENTRAL LIMIT THEOREM AND THE SAMPLING DISTRIBUTION OF. The Central Limit Theorem. Central Limit Theorem: For all samples of

Section 6-5 THE CENTRAL LIMIT THEOREM AND THE SAMPLING DISTRIBUTION OF. The Central Limit Theorem. Central Limit Theorem: For all samples of Section 6-5 The Central Limit Theorem THE CENTRAL LIMIT THEOREM Central Limit Theorem: For all samples of the same size with 30, the sampling distribution of can be approximated by a normal distribution

More information

Chapter 24. Comparing Means. Copyright 2010 Pearson Education, Inc.

Chapter 24. Comparing Means. Copyright 2010 Pearson Education, Inc. Chapter 24 Comparing Means Copyright 2010 Pearson Education, Inc. Plot the Data The natural display for comparing two groups is boxplots of the data for the two groups, placed side-by-side. For example:

More information

Chapter 7: Hypothesis Testing - Solutions

Chapter 7: Hypothesis Testing - Solutions Chapter 7: Hypothesis Testing - Solutions 7.1 Introduction to Hypothesis Testing The problem with applying the techniques learned in Chapter 5 is that typically, the population mean (µ) and standard deviation

More information

Chapter 3 Assignment

Chapter 3 Assignment Chapter 3 Assignment AP Statistics-Adams Name: Period: Date: 1. This scatterplot shows the overall percentage of on-time arrivals versus overall mishandled baggage per 1000 passengers for the year 2002.

More information

Name. City Weight Model MPG

Name. City Weight Model MPG Name The following table reports the EPA s city miles per gallon rating and the weight (in lbs.) for the sports cars described in Consumer Reports 99 New Car Buying Guide. (The EPA rating for the Audii

More information

Chapter 9. Hypothesis testing. 9.1 Introduction

Chapter 9. Hypothesis testing. 9.1 Introduction Chapter 9 Hypothesis testing 9.1 Introduction Confidence intervals are one of the two most common types of statistical inference. Use them when our goal is to estimate a population parameter. The second

More information

Chapter 7. Inference for Distributions. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides

Chapter 7. Inference for Distributions. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides Chapter 7 Inference for Distributions Introduction to the Practice of STATISTICS SEVENTH EDITION Moore / McCabe / Craig Lecture Presentation Slides Chapter 7 Inference for Distributions 7.1 Inference for

More information

Keller: Stats for Mgmt & Econ, 7th Ed July 17, 2006

Keller: Stats for Mgmt & Econ, 7th Ed July 17, 2006 Chapter 17 Simple Linear Regression and Correlation 17.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

Null Hypothesis Significance Testing p-values, significance level, power, t-tests Spring 2017

Null Hypothesis Significance Testing p-values, significance level, power, t-tests Spring 2017 Null Hypothesis Significance Testing p-values, significance level, power, t-tests 18.05 Spring 2017 Understand this figure f(x H 0 ) x reject H 0 don t reject H 0 reject H 0 x = test statistic f (x H 0

More information

Chapter 5: HYPOTHESIS TESTING

Chapter 5: HYPOTHESIS TESTING MATH411: Applied Statistics Dr. YU, Chi Wai Chapter 5: HYPOTHESIS TESTING 1 WHAT IS HYPOTHESIS TESTING? As its name indicates, it is about a test of hypothesis. To be more precise, we would first translate

More information

hypotheses. P-value Test for a 2 Sample z-test (Large Independent Samples) n > 30 P-value Test for a 2 Sample t-test (Small Samples) n < 30 Identify α

hypotheses. P-value Test for a 2 Sample z-test (Large Independent Samples) n > 30 P-value Test for a 2 Sample t-test (Small Samples) n < 30 Identify α Chapter 8 Notes Section 8-1 Independent and Dependent Samples Independent samples have no relation to each other. An example would be comparing the costs of vacationing in Florida to the cost of vacationing

More information

Stat 3355 Statistical Methods for Statisticians and Actuaries. Stat 3355 Course Information

Stat 3355 Statistical Methods for Statisticians and Actuaries. Stat 3355 Course Information Stat 3355 Statistical Methods for Statisticians and Actuaries The notes and scripts included here are copyrighted by their author, Larry P. Ammann, and are intended for the use of students currently registered

More information

Inferences Based on Two Samples

Inferences Based on Two Samples Chapter 6 Inferences Based on Two Samples Frequently we want to use statistical techniques to compare two populations. For example, one might wish to compare the proportions of families with incomes below

More information

Ch18 links / ch18 pdf links Ch18 image t-dist table

Ch18 links / ch18 pdf links Ch18 image t-dist table Ch18 links / ch18 pdf links Ch18 image t-dist table ch18 (inference about population mean) exercises: 18.3, 18.5, 18.7, 18.9, 18.15, 18.17, 18.19, 18.27 CHAPTER 18: Inference about a Population Mean The

More information

Sem. 1 Review Ch. 1-3

Sem. 1 Review Ch. 1-3 AP Stats Sem. 1 Review Ch. 1-3 Name 1. You measure the age, marital status and earned income of an SRS of 1463 women. The number and type of variables you have measured is a. 1463; all quantitative. b.

More information

STAT 200 Chapter 1 Looking at Data - Distributions

STAT 200 Chapter 1 Looking at Data - Distributions STAT 200 Chapter 1 Looking at Data - Distributions What is Statistics? Statistics is a science that involves the design of studies, data collection, summarizing and analyzing the data, interpreting the

More information

Basics of Experimental Design. Review of Statistics. Basic Study. Experimental Design. When an Experiment is Not Possible. Studying Relations

Basics of Experimental Design. Review of Statistics. Basic Study. Experimental Design. When an Experiment is Not Possible. Studying Relations Basics of Experimental Design Review of Statistics And Experimental Design Scientists study relation between variables In the context of experiments these variables are called independent and dependent

More information

First we look at some terms to be used in this section.

First we look at some terms to be used in this section. 8 Hypothesis Testing 8.1 Introduction MATH1015 Biostatistics Week 8 In Chapter 7, we ve studied the estimation of parameters, point or interval estimates. The construction of CI relies on the sampling

More information

Exam 2 (KEY) July 20, 2009

Exam 2 (KEY) July 20, 2009 STAT 2300 Business Statistics/Summer 2009, Section 002 Exam 2 (KEY) July 20, 2009 Name: USU A#: Score: /225 Directions: This exam consists of six (6) questions, assessing material learned within Modules

More information

Tests for Population Proportion(s)

Tests for Population Proportion(s) Tests for Population Proportion(s) Esra Akdeniz April 6th, 2016 Motivation We are interested in estimating the prevalence rate of breast cancer among 50- to 54-year-old women whose mothers have had breast

More information

Chapter 5 Confidence Intervals

Chapter 5 Confidence Intervals Chapter 5 Confidence Intervals Confidence Intervals about a Population Mean, σ, Known Abbas Motamedi Tennessee Tech University A point estimate: a single number, calculated from a set of data, that is

More information

Jointly Distributed Variables

Jointly Distributed Variables Jointly Distributed Variables Sec 2.6, 9.1 & 9.2 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 7-3339 Cathy Poliak, Ph.D. cathy@math.uh.edu

More information

We will now find the one line that best fits the data on a scatter plot.

We will now find the one line that best fits the data on a scatter plot. General Education Statistics Class Notes Least-Squares Regression (Section 4.2) We will now find the one line that best fits the data on a scatter plot. We have seen how two variables can be correlated

More information

Inference for Distributions Inference for the Mean of a Population. Section 7.1

Inference for Distributions Inference for the Mean of a Population. Section 7.1 Inference for Distributions Inference for the Mean of a Population Section 7.1 Statistical inference in practice Emphasis turns from statistical reasoning to statistical practice: Population standard deviation,

More information

Lecture 9. Selected material from: Ch. 12 The analysis of categorical data and goodness of fit tests

Lecture 9. Selected material from: Ch. 12 The analysis of categorical data and goodness of fit tests Lecture 9 Selected material from: Ch. 12 The analysis of categorical data and goodness of fit tests Univariate categorical data Univariate categorical data are best summarized in a one way frequency table.

More information

Inference for Single Proportions and Means T.Scofield

Inference for Single Proportions and Means T.Scofield Inference for Single Proportions and Means TScofield Confidence Intervals for Single Proportions and Means A CI gives upper and lower bounds between which we hope to capture the (fixed) population parameter

More information

Chapter 6 Sampling Distributions

Chapter 6 Sampling Distributions Chapter 6 Sampling Distributions Parameter and Statistic A is a numerical descriptive measure of a population. Since it is based on the observations in the population, its value is almost always unknown.

More information

Louis Roussos Sports Data

Louis Roussos Sports Data Louis Roussos Sports Data Rank the sports you most like to participate in, 1 = favorite, 7 = least favorite. There are n=130 rank vectors. > sportsranks Baseball Football Basketball Tennis Cycling Swimming

More information

Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018

Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018 Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018 Sampling A trait is measured on each member of a population. f(y) = propn of individuals in the popn with measurement

More information

Chapter 9 Inferences from Two Samples

Chapter 9 Inferences from Two Samples Chapter 9 Inferences from Two Samples 9-1 Review and Preview 9-2 Two Proportions 9-3 Two Means: Independent Samples 9-4 Two Dependent Samples (Matched Pairs) 9-5 Two Variances or Standard Deviations Review

More information

CHAPTER 9: HYPOTHESIS TESTING

CHAPTER 9: HYPOTHESIS TESTING CHAPTER 9: HYPOTHESIS TESTING THE SECOND LAST EXAMPLE CLEARLY ILLUSTRATES THAT THERE IS ONE IMPORTANT ISSUE WE NEED TO EXPLORE: IS THERE (IN OUR TWO SAMPLES) SUFFICIENT STATISTICAL EVIDENCE TO CONCLUDE

More information

Practice problems from chapters 2 and 3

Practice problems from chapters 2 and 3 Practice problems from chapters and 3 Question-1. For each of the following variables, indicate whether it is quantitative or qualitative and specify which of the four levels of measurement (nominal, ordinal,

More information

DynoMax Standard Series Headers

DynoMax Standard Series Headers CHRYSLER CORPORATION - CARS DODGE Aspen 76-79 273-360 85035 86035 035 5 /8, 3 Y Y Y Y Y 4, 0,, 86 89000 7282 773 Challenger 70-77 273-360 85035 86035 035 5 /8, 3 Y Y Y Y Y 4, 0,, 86 89000 7282 773 Challenger

More information

Chapter 23. Inference About Means

Chapter 23. Inference About Means Chapter 23 Inference About Means 1 /57 Homework p554 2, 4, 9, 10, 13, 15, 17, 33, 34 2 /57 Objective Students test null and alternate hypotheses about a population mean. 3 /57 Here We Go Again Now that

More information

Chapter 24. Comparing Means

Chapter 24. Comparing Means Chapter 4 Comparing Means!1 /34 Homework p579, 5, 7, 8, 10, 11, 17, 31, 3! /34 !3 /34 Objective Students test null and alternate hypothesis about two!4 /34 Plot the Data The intuitive display for comparing

More information

PHP2510: Principles of Biostatistics & Data Analysis. Lecture X: Hypothesis testing. PHP 2510 Lec 10: Hypothesis testing 1

PHP2510: Principles of Biostatistics & Data Analysis. Lecture X: Hypothesis testing. PHP 2510 Lec 10: Hypothesis testing 1 PHP2510: Principles of Biostatistics & Data Analysis Lecture X: Hypothesis testing PHP 2510 Lec 10: Hypothesis testing 1 In previous lectures we have encountered problems of estimating an unknown population

More information

EXAMINATIONS OF THE HONG KONG STATISTICAL SOCIETY HIGHER CERTIFICATE IN STATISTICS, Paper III : Statistical Applications and Practice

EXAMINATIONS OF THE HONG KONG STATISTICAL SOCIETY HIGHER CERTIFICATE IN STATISTICS, Paper III : Statistical Applications and Practice EXAMINATIONS OF THE HONG KONG STATISTICAL SOCIETY HIGHER CERTIFICATE IN STATISTICS, 2002 Paper III Statistical Applications and Practice Time Allowed Three Hours Candidates should answer FIVE questions.

More information
