Harvard University. Rigorous Research in Engineering Education
1 Statistical Inference Kari Lock Harvard University Department of Statistics Rigorous Research in Engineering Education 12/3/09
2 Statistical Inference You have a sample and want to use the data collected in your sample to make inferences about the underlying truth(s) in the population. [Diagram: Target Population, Sample, Experiment?]
3 Statistics and Parameters Mean: the average; add everything up and divide by the sample size. Standard Deviation: a measure of the average distance from the mean; indicates how spread out the data are. Variance = (Standard Deviation)$^2$. Proportion: the proportion of the sample that falls in a certain category. Correlation (r): a measure of linear association between two numeric variables that runs between −1 and 1.
4 Sample Statistics and Population Parameters
Mean: sample $\bar{X}$, population $\mu$
SD: sample $s$, population $\sigma$
Variance: sample $s^2$, population $\sigma^2$
Proportion: sample $\hat{p}$, population $p$
Correlation: sample $r$, population $\rho$
Sample Statistic: a number calculated using your data. Population Parameter: a usually unknown population value. GOAL: Use the sample statistics to make inferences about the population parameters.
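As a rough illustration of the sample statistics defined above, the sketch below computes the mean, SD, variance, and a proportion with Python's standard `statistics` module. The `scores` data and the cutoff of 85 are hypothetical and chosen only for the example.

```python
import statistics

# Hypothetical sample: test scores for six students.
scores = [78, 85, 92, 88, 73, 90]

mean = statistics.mean(scores)          # X-bar: add everything up, divide by n
sd = statistics.stdev(scores)           # s: sample standard deviation (n - 1 divisor)
variance = statistics.variance(scores)  # s^2: the square of the standard deviation
# p-hat: proportion of the sample in a category (here, scoring 85 or higher).
proportion = sum(x >= 85 for x in scores) / len(scores)

print(mean, sd, variance, proportion)
```

These are sample statistics; the matching population parameters stay unknown and are what the inference methods in the later slides try to pin down.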
5 Sampling Variability A sample statistic is rarely exactly the same as the (unknown) population parameter. Each time we take a random sample from a population, we are likely to get a different set of individuals and calculate a different statistic. This is called sampling variability. Good news: if you have a random sample, as your sample size gets larger and larger, the sample statistic gets closer and closer to the population parameter.
6 Sampling Variability If we could take lots of random samples of the same size from a given population, the variation of an estimate from sample to sample (the sampling distribution) follows a predictable pattern. Luckily, the mean, standard deviation, and shape of the sampling distribution are all fully estimable from quantities we can observe in our data! Statistical inference is based on this knowledge. We only get to observe one random sample, but can take advantage of the predictable sampling distribution to make valid inferences about population parameters.
7 Sampling Distribution The sampling distribution gives us the distribution of our estimate (for example, a mean) if we were to take many samples from the population. For many common estimates, with a large enough sample size this will be a Normal distribution.
8 Standard Error The standard error (SE) of an estimate is the standard deviation of the sampling distribution. Most forms of statistical inference rely on knowing or being able to calculate the standard error (uncertainty) of an estimate. The sampling distribution and standard error depend on: what type of parameter you are estimating; the standard deviation of your data; the sample size.
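The idea of a sampling distribution can be explored with a small simulation. This is a sketch only: the uniform population, the seed, and the helper name `sample_means` are assumptions made for illustration, not part of the slides. It draws many random samples and compares the spread of the sample means with $\sigma/\sqrt{n}$.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical, deliberately non-Normal population: uniform values from 0 to 100.
population = [rng.uniform(0, 100) for _ in range(100_000)]
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)

def sample_means(n, repetitions=2000):
    """Draw many random samples of size n and return the mean of each one."""
    return [statistics.mean(rng.sample(population, n)) for _ in range(repetitions)]

means_by_n = {n: sample_means(n) for n in (10, 100)}
for n, means in means_by_n.items():
    # Compare the observed spread of the sample means with sigma / sqrt(n).
    print(n, statistics.mean(means), statistics.stdev(means), pop_sd / math.sqrt(n))
```

In a run like this, the sample means should center near the population mean, and their spread should shrink as the sample size grows.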
9 Statistical Inference Confidence Intervals: Create an interval for the true population parameter based on your estimate and the uncertainty surrounding your estimate Hypothesis Testing: Determine whether a difference or association is statistically significant Statistical Modeling: Explore relationships between variables, model trends, create predictions
10 Confidence Intervals A confidence interval typically takes the form estimate ± (margin of error). A 95% confidence interval will cover the true parameter in 95% of random samples. enceinterval.html
11 Confidence Intervals In any given analysis, you will only observe one of these intervals. 95% of 95% CIs will cover the truth
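The claim that 95% of 95% intervals cover the truth can be checked by simulation. The sketch below assumes a hypothetical true proportion of 0.55 and uses the standard-error formula for a proportion that appears in the proportion example slides; the seed and the number of simulated studies are arbitrary choices.

```python
import math
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
TRUE_P = 0.55           # hypothetical true population proportion
N = 100                 # sample size in each simulated study
STUDIES = 5000          # number of simulated studies

covered = 0
for _ in range(STUDIES):
    # Simulate one random sample and compute its 95% interval.
    successes = sum(rng.random() < TRUE_P for _ in range(N))
    p_hat = successes / N
    margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / N)
    # Count the study if its interval contains the true proportion.
    covered += (p_hat - margin) <= TRUE_P <= (p_hat + margin)

coverage = covered / STUDIES
print(coverage)
```

The printed coverage should land close to 0.95, though this approximate interval can fall slightly short of the nominal level.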
12 Confidence Intervals The formula estimate ± (margin of error) can be rewritten as estimate ± (multiplier) × (standard error), where the multiplier depends on the confidence level: the percent of the intervals that will cover the truth (typically 95%). For 95% intervals this multiplier is usually close to 2.
13 Confidence Interval Proportion Example You want to estimate the proportion of your graduates who go on to eventually pass the EIT (Engineer in Training) exam. You take a random sample of your graduates, contact them, and ask them whether or not they have passed the EIT. You sample 100 people and 55 have passed. Your estimated population proportion is .55, but you want an interval surrounding this estimate to reflect your uncertainty.
14 Confidence Interval Proportion Example The standard error for a sample proportion is $\sqrt{\hat{p}(1-\hat{p})/n}$. The 95% confidence level multiplier for a proportion is 1.96. The margin of error for a proportion is $1.96\sqrt{\hat{p}(1-\hat{p})/n}$.
15 Confidence Interval Proportion Example Your 95% confidence interval is then estimate ± (margin of error) = estimate ± (multiplier) × (standard error) $= \hat{p} \pm 1.96\sqrt{\hat{p}(1-\hat{p})/n} = .55 \pm 1.96\sqrt{.55(.45)/100} = .55 \pm 1.96(.0497) = .55 \pm .098 = (.452, .648)$. Note that if we instead sampled 1000 graduates, and still 55% had passed, our interval for the true proportion would be (.52, .58): much narrower!
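The arithmetic of the EIT example can be reproduced with a few lines of Python. The function name `proportion_ci` is a hypothetical helper for illustration; the formula is the one given in the slides.

```python
import math

def proportion_ci(successes, n, multiplier=1.96):
    """Approximate confidence interval for a proportion: p-hat +/- multiplier * SE."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of a sample proportion
    margin = multiplier * se                 # margin of error
    return p_hat - margin, p_hat + margin

low_100, high_100 = proportion_ci(55, 100)      # 55 of 100 graduates passed
low_1000, high_1000 = proportion_ci(550, 1000)  # same 55%, ten times the sample

print(round(low_100, 3), round(high_100, 3))    # 0.452 0.648
print(round(low_1000, 3), round(high_1000, 3))  # 0.519 0.581
```

The second interval is roughly a third as wide as the first, since the margin of error shrinks with $\sqrt{n}$.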
16 Confidence Intervals Confidence intervals are pretty easy to compute by hand if you know the standard error (SE) of your estimate. However, each type of estimate requires a different formula for computing the standard error, so it's easiest to just use a computer (any stat software will produce confidence intervals).
17 Confidence Intervals The width of the confidence interval, determined by the margin of error, depends on: the confidence level (usually 90%, 95%, or 99%), where higher confidence means a wider interval; the standard deviation of the original data, where higher SD means a wider interval; and the sample size, where a higher sample size means a narrower interval.
18 Confidence Intervals The easiest way to get more precise intervals for the truth is to sample a lot of people If you have a desired margin of error you want for a specific situation, you can work backwards to determine how many people you will need to sample
19 Sample Size When doing inference for a proportion, suppose you want no higher than a certain margin of error. Recall $ME = 1.96\sqrt{\hat{p}(1-\hat{p})/n}$. We don't know the sample proportion, but p(1 − p) is maximized when p = ½, so we can be conservative and use p = ½. If we use p = ½ and replace 1.96 with 2, then we have $ME \approx 2\sqrt{(1/2)(1/2)/n} = 1/\sqrt{n}$, so $n \approx 1/ME^2$. If we want a margin of error of .02, we should sample $1/.02^2 = 2500$ people.
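Working backwards from a desired margin of error can be sketched as follows. The function name `sample_size_for_margin` and its defaults are assumptions for illustration; it solves $ME = \text{multiplier}\cdot\sqrt{p(1-p)/n}$ for $n$.

```python
import math

def sample_size_for_margin(margin, multiplier=2.0, p=0.5):
    """Smallest n for which multiplier * sqrt(p * (1 - p) / n) is at most `margin`."""
    exact = (multiplier / margin) ** 2 * p * (1 - p)
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(exact, 6))

n_conservative = sample_size_for_margin(0.02)      # p = 1/2, multiplier 2
n_with_196 = sample_size_for_margin(0.02, 1.96)    # p = 1/2, multiplier 1.96

print(n_conservative, n_with_196)
```

With the conservative choices from the slide this gives 2500 people; using 1.96 instead of 2 lowers the requirement slightly.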
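20 Confidence Intervals 95% CI: estimate ± 2 (SE)
One Mean: parameter $\mu$; estimate $\bar{X}$; standard error of estimate $s/\sqrt{n}$; distribution for multiplier $t_{n-1}$
Difference in Means: parameter $\mu_1 - \mu_2$; estimate $\bar{X}_1 - \bar{X}_2$; standard error of estimate $\sqrt{s_1^2/n_1 + s_2^2/n_2}$; distribution for multiplier $t_{n_1+n_2-2}$
One Proportion: parameter $p$; estimate $\hat{p}$; standard error of estimate $\sqrt{\hat{p}(1-\hat{p})/n}$; distribution for multiplier $N(0,1)$
Difference in Proportions: parameter $p_1 - p_2$; estimate $\hat{p}_1 - \hat{p}_2$; standard error of estimate $\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$; distribution for multiplier $N(0,1)$
*Get the multiplier for a (100 − k)% interval by finding the value on the distribution with k% remaining in the tails.
21 Confidence Intervals Single Mean The 95% confidence interval for a mean is approximately $\bar{X} \pm 2\,s/\sqrt{n}$.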
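A minimal sketch of the approximate interval for a single mean, estimate ± 2 standard errors, is shown below. The `scores` data and the helper name `mean_interval` are hypothetical.

```python
import math
import statistics

def mean_interval(data, multiplier=2.0):
    """Approximate 95% interval for a mean: X-bar +/- multiplier * s / sqrt(n)."""
    n = len(data)
    x_bar = statistics.mean(data)
    se = statistics.stdev(data) / math.sqrt(n)  # standard error of the mean
    return x_bar - multiplier * se, x_bar + multiplier * se

scores = [82, 91, 77, 85, 88, 90, 79, 84, 86]  # hypothetical test scores
low, high = mean_interval(scores)
print(low, high)
```

For small samples like this one, the exact multiplier from the $t_{n-1}$ distribution is somewhat larger than 2, so the interval here is only a rough guide.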
22 Statistical Significance A phenomenon is statistically significant if it is unlikely to occur by random chance. For a difference in sample means between variables to be statistically significant, we need to see a difference much larger than we would see just by random chance if the true means really were the same.
23 Hypotheses Null Hypothesis ($H_0$): What you are trying to disprove, the status quo. The null hypothesis can only be disproved, never proved. The null hypothesis is assumed to be true, and a lack of evidence against the null does NOT mean it is true! Alternative Hypothesis ($H_a$): The hypothesis you are trying to prove. The alternative is proved by collecting evidence (data) that contradicts the null hypothesis. Together, the null and alternative must comprise all possibilities. Rejecting the null is equivalent to proving the alternative.
24 Hypotheses Two Sided Alternative: You merely want to prove that two things are not equal.
$H_0: \mu = \mu_0$, $H_a: \mu \ne \mu_0$
$H_0: \mu_1 = \mu_2$, $H_a: \mu_1 \ne \mu_2$
$H_0: p = p_0$, $H_a: p \ne p_0$
$H_0: p_1 = p_2$, $H_a: p_1 \ne p_2$
One Sided Alternative: You want to prove something is greater (or less than) something else. You are fairly certain the sample statistics would not show otherwise.
$H_0: \mu \le \mu_0$, $H_a: \mu > \mu_0$
$H_0: \mu_1 \le \mu_2$, $H_a: \mu_1 > \mu_2$
$H_0: p \le p_0$, $H_a: p > p_0$
$H_0: p_1 \le p_2$, $H_a: p_1 > p_2$
Hypotheses are always phrased in terms of population parameters, NOT sample statistics.
25 Hypothesis Testing Example You devise an activity that you guess will improve conceptual understanding of a topic. Conceptual understanding is measured by a test worth 100 points. The existing average score for this test is 84. After completing your activity, your students score an average of 86. Did these students score significantly higher than the existing average? $H_0: \mu \le 84$, $H_a: \mu > 84$, $\bar{X} = 86$. Is 86 extreme enough to reject the null? STATISTICS: quantifying how unlikely your data is, given that the null is true.
26 p value The p value for a sample is the probability of getting data as extreme (or more extreme) than the observed data, assuming the null hypothesis is true.
27 p value A small p value means that if the null hypothesis is true, you are very unlikely to observe a value that extreme just by random chance. Since you did observe a value that extreme, your null hypothesis probably is not true, so your alternative is probably true! A SMALL P VALUE PROVIDES EVIDENCE AGAINST THE NULL HYPOTHESIS
28 IMPORTANT! The smaller the p value, the stronger the evidence against $H_0$. How small is small enough?
29 Significance Level The significance level (α) of a test determines when a p value is small enough for the null hypothesis to be rejected. α = proportion of times you will incorrectly reject the null, if it is true. Most commonly in education research, α = 0.05. Decision rule: if p value < α, reject $H_0$; if p value > α, do not reject $H_0$. α must always be specified before the data is analyzed!
30 Test Statistic The extremity of a sample statistic can often be determined by the number of standard errors it is from the null mean, called a z score: t.s. = (estimate − null mean) / SE. In most cases, this test statistic follows a predictable distribution when the null hypothesis is true, so the p value is computed as area in the tails of this distribution.
31 p value The p value is the probability that the test statistic is as extreme as that observed, given that the null hypothesis is true. [Figure: distribution of the test statistic assuming the null hypothesis is true, with the p value shown as the tail area beyond the observed test statistic]
32 Hypothesis Testing 1. Determine the null and alternative hypotheses. 2. Calculate the estimate, the standard error of the estimate, and your test statistic. 3. Determine the distribution of the test statistic assuming the null hypothesis is true. 4. Calculate the p value, the probability of observing a test statistic as extreme as yours, given that the null is true. 5. If the p value is small enough (typically below .05), reject the null.
33 Hypothesis Testing Example The existing average score for this test is 84. After completing your activity, your students' sample mean is 86. Did these students score significantly higher than the existing average? What other information do you need? Sample standard deviation: s = 5. Sample size: n = 25.
34 Hypothesis Testing One Mean 1. Determine the null and alternative hypotheses: $H_0: \mu \le \mu_0$, $H_a: \mu > \mu_0$, with $\mu_0 = 84$, so $H_0: \mu \le 84$, $H_a: \mu > 84$. 2. Calculate the estimate, the standard error of the estimate, and your test statistic: Estimate $= \bar{X}$, $SE = s/\sqrt{n}$, with $\bar{X} = 86$, $s = 5$, $n = 25$. t.s. = (estimate − null mean) / SE $= (\bar{X} - \mu_0)/(s/\sqrt{n}) = (86 - 84)/(5/\sqrt{25}) = 2/1 = 2$.
35 Hypothesis Testing One Mean 3. Determine the distribution of the test statistic assuming the null hypothesis is true: $t_{n-1}$, the t distribution with n − 1 degrees of freedom. 4. Calculate the p value, the probability of observing a test statistic as extreme as yours, given that the null is true: p value = .028. Since .028 < .05, we would reject the null hypothesis. There is evidence that the true mean test score of students completing the activity is higher than the existing mean of 84.
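The one-mean test above can be sketched with only the Python standard library. Since the standard library has no t distribution, the helper functions `t_pdf` and `t_upper_tail` below are hypothetical names that obtain the tail area by integrating the t density numerically (Simpson's rule). In practice statistical software would report this p value directly.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, via Simpson's rule on the density over [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3  # the density is symmetric, so P(T >= 0) = 0.5

x_bar, mu0, s, n = 86, 84, 5, 25      # numbers from the example
se = s / math.sqrt(n)                 # standard error of the mean
t_stat = (x_bar - mu0) / se           # number of standard errors from the null mean
p_value = t_upper_tail(t_stat, n - 1) # one-sided p value on n - 1 degrees of freedom
reject = p_value < 0.05               # decision rule with alpha = 0.05

print(t_stat, p_value, reject)
```

The test statistic is 2 and the one-sided p value comes out near the .028 reported in the slides.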
36 Hypothesis Testing Example We found that students completing your activity scored higher than the existing average. Does this mean the activity increases conceptual understanding? Not necessarily! To answer a question about causality we need A RANDOMIZED EXPERIMENT!
37 Hypothesis Testing Example You select n = 100 students to participate in your study, and randomly assign half of them to participate in the activity, and half of them to get a placebo. You give them each the conceptual understanding test. The placebo group has an average of 85 with a standard deviation of 5, and the activity group has an average of 87, also with a standard deviation of 5. Now is there evidence that the activity increases conceptual understanding (as measured by the test)?
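38 Hypothesis Testing t.s. = (estimate − null value) / SE
One Mean: hypotheses $H_0: \mu = \mu_0$, $H_a: \mu \ne \mu_0$; estimate $\bar{X}$; standard error of estimate $s/\sqrt{n}$; distribution under the null $t_{n-1}$
Difference in Means: hypotheses $H_0: \mu_1 = \mu_2$, $H_a: \mu_1 \ne \mu_2$; estimate $\bar{X}_1 - \bar{X}_2$; standard error of estimate $\sqrt{s_1^2/n_1 + s_2^2/n_2}$; distribution under the null $t_{n_1+n_2-2}$
One Proportion: hypotheses $H_0: p = p_0$, $H_a: p \ne p_0$; estimate $\hat{p}$; standard error of estimate $\sqrt{p_0(1-p_0)/n}$; distribution under the null $N(0,1)$
Difference in Proportions: hypotheses $H_0: p_1 = p_2$, $H_a: p_1 \ne p_2$; estimate $\hat{p}_1 - \hat{p}_2$; standard error of estimate $\sqrt{\hat{p}(1-\hat{p})(1/n_1 + 1/n_2)}$, where $\hat{p} = (\hat{p}_1 n_1 + \hat{p}_2 n_2)/(n_1 + n_2)$; distribution under the null $N(0,1)$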
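As a sketch of the one-proportion row, the snippet below runs a two-sided z test using `statistics.NormalDist` from the standard library. The data reuse the earlier EIT sample (55 passes out of 100); the null value $p_0 = 0.5$ and the function name `one_proportion_z_test` are assumptions made purely for illustration.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Two-sided z test of H0: p = p0, using the standard error under the null."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)  # SE computed from the null value p0
    z = (p_hat - p0) / se              # test statistic
    # Two-sided p value: area in both tails of N(0, 1) beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = one_proportion_z_test(55, 100, 0.5)
print(z, p_value)
```

Here the estimate sits about one standard error from the null value, so the p value is large and the hypothetical null of 0.5 would not be rejected at α = 0.05.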
39 Hypothesis Testing Difference in Means $H_0: \mu_1 \le \mu_2$, $H_a: \mu_1 > \mu_2$. t.s. = (estimate − null mean) / SE $= (\bar{X}_1 - \bar{X}_2 - 0)/\sqrt{s_1^2/n_1 + s_2^2/n_2} = (87 - 85)/\sqrt{5^2/50 + 5^2/50} = 2/1 = 2$. Null distribution: $t_{n_1+n_2-2} = t_{98}$. p value = .024. The p value is less than .05, so we can reject the null hypothesis. Since this was a randomized experiment, we can conclude that the activity causes higher conceptual understanding.
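The difference-in-means calculation can be checked the same way. This sketch assumes the standard error $\sqrt{s_1^2/n_1 + s_2^2/n_2}$ (with equal group sizes and equal SDs, a pooled formula gives the same value) and again uses hypothetical helpers `t_pdf` and `t_upper_tail` that integrate the t density numerically, since the standard library has no t distribution.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, via Simpson's rule on the density over [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Test statistic and degrees of freedom for H0: mu1 <= mu2."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # SE of the difference in means
    return (mean1 - mean2) / se, n1 + n2 - 2

# Activity group (mean 87) versus placebo group (mean 85), 50 students each.
t_stat, df = two_sample_t(87, 5, 50, 85, 5, 50)
p_value = t_upper_tail(t_stat, df)
print(t_stat, df, p_value)
```

The result is a test statistic of 2 on 98 degrees of freedom and a one-sided p value close to the .024 in the slides.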
40 Hypothesis Testing What makes results significant? Large effect size (estimate is far from null estimate). Low variability (standard deviation) in data (less random variation, easier to spot an effect). Large sample size (as sample size increases, estimates get closer and closer to the truth, so can be trusted more).
41 Hypothesis Testing You will almost definitely use a computer to conduct hypothesis tests, so you just need to know the names of the appropriate tests. Test for one mean: t test. Test for a difference in means: t test. Test for a proportion or difference in proportions: z test for proportion(s). Test for association between categorical variables: chi square test for independence. Test for association between quantitative variables: correlation test or test for the slope coefficient in linear regression.
42 Hypothesis Testing If conducting a hypothesis test, you will have to COLLECT YOUR DATA STATE THE NULL AND ALTERNATIVE HYPOTHESIS KNOW WHICH TEST TO ASK A COMPUTER TO PERFORM (or just ask me) INTERPRET THE P VALUE
43 Simple Linear Regression Predicts your outcome data based on one explanatory variable
44 Simple Linear Regression Finds the best fit line. The coefficient of the explanatory variable gives the change in the outcome variable for every unit change in the explanatory (also called predictor) variable. Hypothesis tests on the coefficient for a variable determine if the variable is significantly correlated with the outcome. Confidence intervals (for the predicted value) and prediction intervals (for an individual) can be produced.
45 Multiple Regression Uses multiple explanatory variables to predict the outcome. The coefficient for each variable represents the effect of that variable given the other variables in the model. If explanatory variables are correlated with each other, coefficients may change depending on what is in the model. p values for each explanatory variable can be assessed. Again, prediction intervals can be formed.
46 Outliers Outliers can very strongly influence your results (for regression, hypothesis tests, etc.). Always plot your data first to check for outliers. If you do have extreme outliers, check for errors. If the outliers are legitimate, you should run your analyses with and without the outliers to see how much the outliers influence the results.
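A minimal sketch of fitting the best-fit (least squares) line by hand is given below. The `hours` and `scores` data and the helper name `fit_line` are hypothetical; the slope is the coefficient described above, the expected change in the outcome per unit change in the explanatory variable.

```python
import statistics

def fit_line(xs, ys):
    """Least squares intercept and slope for predicting ys from xs."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx            # change in outcome per unit change in x
    intercept = my - slope * mx  # the fitted line passes through (mean x, mean y)
    return intercept, slope

hours = [1, 2, 3, 4, 5]        # hypothetical explanatory variable
scores = [72, 75, 81, 84, 88]  # hypothetical outcome
intercept, slope = fit_line(hours, scores)
predicted = intercept + slope * 6  # predicted outcome at 6 hours
print(intercept, slope, predicted)
```

Statistical software would also report the standard error and p value for the slope, which this sketch leaves out.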
47 Outliers Correlation (as well as mean, standard deviation, regression coefficients) can be highly affected by outliers. [Figure: two scatterplots of y against x, one with r = .78 and one with r = .17, with the outlier marked]
48 Statistical Analysis This is only a brief introduction to some basic statistical analysis tools, meant to give you an idea of what's available. All of these are well known, and detailed information can be found on the web or in any introductory statistics textbook. Most of these techniques need assumptions to be verified before applying them. Please read or ask me for more specifics about the technique you actually intend to use.
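The advice to run an analysis with and without outliers can be illustrated with correlation. The data below are hypothetical (they are not the data behind the slide's figure), and `correlation` is a hand-written Pearson r.

```python
import math

def correlation(xs, ys):
    """Pearson correlation r between two equal-length numeric lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data with a strong linear trend.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 7.0, 7.9, 9.1]

r_without = correlation(x, y)
r_with = correlation(x + [9], y + [0.0])  # same data plus one extreme point
print(r_without, r_with)
```

A single added point drags a near-perfect correlation down sharply, which is why plotting the data first is worthwhile.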
49
Mathematical Notation Math Introduction to Applied Statistics
Mathematical Notation Math 113 - Introduction to Applied Statistics Name : Use Word or WordPerfect to recreate the following documents. Each article is worth 10 points and should be emailed to the instructor
More informationReview of Statistics 101
Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods
More informationMathematical Notation Math Introduction to Applied Statistics
Mathematical Notation Math 113 - Introduction to Applied Statistics Name : Use Word or WordPerfect to recreate the following documents. Each article is worth 10 points and can be printed and given to the
More informationChapter 5 Confidence Intervals
Chapter 5 Confidence Intervals Confidence Intervals about a Population Mean, σ, Known Abbas Motamedi Tennessee Tech University A point estimate: a single number, calculated from a set of data, that is
More informationSTA 101 Final Review
STA 101 Final Review Statistics 101 Thomas Leininger June 24, 2013 Announcements All work (besides projects) should be returned to you and should be entered on Sakai. Office Hour: 2 3pm today (Old Chem
More informationObjectives Simple linear regression. Statistical model for linear regression. Estimating the regression parameters
Objectives 10.1 Simple linear regression Statistical model for linear regression Estimating the regression parameters Confidence interval for regression parameters Significance test for the slope Confidence
More informationSTA Module 10 Comparing Two Proportions
STA 2023 Module 10 Comparing Two Proportions Learning Objectives Upon completing this module, you should be able to: 1. Perform large-sample inferences (hypothesis test and confidence intervals) to compare
More informationOne-sample categorical data: approximate inference
One-sample categorical data: approximate inference Patrick Breheny October 6 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/25 Introduction It is relatively easy to think about the distribution
More informationIntroduction to Survey Analysis!
Introduction to Survey Analysis! Professor Ron Fricker! Naval Postgraduate School! Monterey, California! Reading Assignment:! 2/22/13 None! 1 Goals for this Lecture! Introduction to analysis for surveys!
More informationBusiness Statistics. Lecture 10: Course Review
Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,
More informationEcon 325: Introduction to Empirical Economics
Econ 325: Introduction to Empirical Economics Chapter 9 Hypothesis Testing: Single Population Ch. 9-1 9.1 What is a Hypothesis? A hypothesis is a claim (assumption) about a population parameter: population
More informationThe t-test: A z-score for a sample mean tells us where in the distribution the particular mean lies
The t-test: So Far: Sampling distribution benefit is that even if the original population is not normal, a sampling distribution based on this population will be normal (for sample size > 30). Benefit
More informationHypothesis testing. Data to decisions
Hypothesis testing Data to decisions The idea Null hypothesis: H 0 : the DGP/population has property P Under the null, a sample statistic has a known distribution If, under that that distribution, the
More informationChapter 27 Summary Inferences for Regression
Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test
More informationStatistical Inference. Why Use Statistical Inference. Point Estimates. Point Estimates. Greg C Elvers
Statistical Inference Greg C Elvers 1 Why Use Statistical Inference Whenever we collect data, we want our results to be true for the entire population and not just the sample that we used But our sample
More informationTwo-Sample Inferential Statistics
The t Test for Two Independent Samples 1 Two-Sample Inferential Statistics In an experiment there are two or more conditions One condition is often called the control condition in which the treatment is
More informationStatistical Inference for Means
Statistical Inference for Means Jamie Monogan University of Georgia February 18, 2011 Jamie Monogan (UGA) Statistical Inference for Means February 18, 2011 1 / 19 Objectives By the end of this meeting,
More informationECO220Y Review and Introduction to Hypothesis Testing Readings: Chapter 12
ECO220Y Review and Introduction to Hypothesis Testing Readings: Chapter 12 Winter 2012 Lecture 13 (Winter 2011) Estimation Lecture 13 1 / 33 Review of Main Concepts Sampling Distribution of Sample Mean
More information9. Linear Regression and Correlation
9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,
More informationConfidence Intervals, Testing and ANOVA Summary
Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0
More informationChapter 12 - Lecture 2 Inferences about regression coefficient
Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous
More informationAP Statistics Cumulative AP Exam Study Guide
AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics
More informationProbability and Statistics
Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be CHAPTER 4: IT IS ALL ABOUT DATA 4a - 1 CHAPTER 4: IT
More informationPsychology 282 Lecture #4 Outline Inferences in SLR
Psychology 282 Lecture #4 Outline Inferences in SLR Assumptions To this point we have not had to make any distributional assumptions. Principle of least squares requires no assumptions. Can use correlations
More informationappstats27.notebook April 06, 2017
Chapter 27 Objective Students will conduct inference on regression and analyze data to write a conclusion. Inferences for Regression An Example: Body Fat and Waist Size pg 634 Our chapter example revolves
More informationInference for Regression Simple Linear Regression
Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating
More informationLab #12: Exam 3 Review Key
Psychological Statistics Practice Lab#1 Dr. M. Plonsky Page 1 of 7 Lab #1: Exam 3 Review Key 1) a. Probability - Refers to the likelihood that an event will occur. Ranges from 0 to 1. b. Sampling Distribution
More informationSTAT Chapter 8: Hypothesis Tests
STAT 515 -- Chapter 8: Hypothesis Tests CIs are possibly the most useful forms of inference because they give a range of reasonable values for a parameter. But sometimes we want to know whether one particular
More informationLecture 7: Hypothesis Testing and ANOVA
Lecture 7: Hypothesis Testing and ANOVA Goals Overview of key elements of hypothesis testing Review of common one and two sample tests Introduction to ANOVA Hypothesis Testing The intent of hypothesis
More informationThe Simple Linear Regression Model
The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate
More informationSingle Sample Means. SOCY601 Alan Neustadtl
Single Sample Means SOCY601 Alan Neustadtl The Central Limit Theorem If we have a population measured by a variable with a mean µ and a standard deviation σ, and if all possible random samples of size
More informationy ˆ i = ˆ " T u i ( i th fitted value or i th fit)
1 2 INFERENCE FOR MULTIPLE LINEAR REGRESSION Recall Terminology: p predictors x 1, x 2,, x p Some might be indicator variables for categorical variables) k-1 non-constant terms u 1, u 2,, u k-1 Each u
More informationStatistical Inference: Estimation and Confidence Intervals Hypothesis Testing
Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire
More information2011 Pearson Education, Inc
Statistics for Business and Economics Chapter 7 Inferences Based on Two Samples: Confidence Intervals & Tests of Hypotheses Content 1. Identifying the Target Parameter 2. Comparing Two Population Means:
More informationInference for Regression Inference about the Regression Model and Using the Regression Line
Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about
More informationOrdinary Least Squares Regression Explained: Vartanian
Ordinary Least Squares Regression Explained: Vartanian When to Use Ordinary Least Squares Regression Analysis A. Variable types. When you have an interval/ratio scale dependent variable.. When your independent
More informationLecture 26: Chapter 10, Section 2 Inference for Quantitative Variable Confidence Interval with t
Lecture 26: Chapter 10, Section 2 Inference for Quantitative Variable Confidence Interval with t t Confidence Interval for Population Mean Comparing z and t Confidence Intervals When neither z nor t Applies
More informationLast week: Sample, population and sampling distributions finished with estimation & confidence intervals
Past weeks: Measures of central tendency (mean, mode, median) Measures of dispersion (standard deviation, variance, range, etc). Working with the normal curve Last week: Sample, population and sampling
More informationRegression with a Single Regressor: Hypothesis Tests and Confidence Intervals
Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (SW Chapter 5) Outline. The standard error of ˆ. Hypothesis tests concerning β 3. Confidence intervals for β 4. Regression
More informationInferences for Regression
Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In
More informationLast two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals
Past weeks: Measures of central tendency (mean, mode, median) Measures of dispersion (standard deviation, variance, range, etc). Working with the normal curve Last two weeks: Sample, population and sampling
More informationInferential statistics
Inferential statistics Inference involves making a Generalization about a larger group of individuals on the basis of a subset or sample. Ahmed-Refat-ZU Null and alternative hypotheses In hypotheses testing,
More informationCHAPTER 10 Comparing Two Populations or Groups
CHAPTER 10 Comparing Two Populations or Groups 10.1 Comparing Two Proportions The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers Comparing Two Proportions
More information7.2 One-Sample Correlation ( = a) Introduction. Correlation analysis measures the strength and direction of association between
7.2 One-Sample Correlation ( = a) Introduction Correlation analysis measures the strength and direction of association between variables. In this chapter we will test whether the population correlation
More informationSTAT Chapter 9: Two-Sample Problems. Paired Differences (Section 9.3)
STAT 515 -- Chapter 9: Two-Sample Problems Paired Differences (Section 9.3) Examples of Paired Differences studies: Similar subjects are paired off and one of two treatments is given to each subject in
More informationChapter 9 Inferences from Two Samples
Chapter 9 Inferences from Two Samples 9-1 Review and Preview 9-2 Two Proportions 9-3 Two Means: Independent Samples 9-4 Two Dependent Samples (Matched Pairs) 9-5 Two Variances or Standard Deviations Review
More informationInteractions and Factorial ANOVA
Interactions and Factorial ANOVA STA442/2101 F 2017 See last slide for copyright information 1 Interactions Interaction between explanatory variables means It depends. Relationship between one explanatory
More informationMath 124: Modules Overall Goal. Point Estimations. Interval Estimation. Math 124: Modules Overall Goal.
What we will do today s David Meredith Department of Mathematics San Francisco State University October 22, 2009 s 1 2 s 3 What is a? Decision support Political decisions s s Goal of statistics: optimize
More informationWarm-up Using the given data Create a scatterplot Find the regression line
Time at the lunch table Caloric intake 21.4 472 30.8 498 37.7 335 32.8 423 39.5 437 22.8 508 34.1 431 33.9 479 43.8 454 42.4 450 43.1 410 29.2 504 31.3 437 28.6 489 32.9 436 30.6 480 35.1 439 33.0 444