Inferential Statistics


Part 1: Sampling Distributions, Point Estimates & Confidence Intervals

Inferential statistics are used to draw inferences (make conclusions/judgements) about a population from a sample. Consider an experiment in which 10 students who sat an exam after 24 hours of sleep deprivation scored 12% lower than 10 students who sat the same exam after a normal night's sleep. Is the difference real or could it be due to chance? How much larger could the real difference be than the 12% found in the sample? These are the types of questions answered by inferential statistics.

There are two main methods used in inferential statistics: estimation and hypothesis testing. In estimation, the sample is used to estimate a parameter and a confidence interval about the estimate is constructed. In the most common use of hypothesis testing, a null hypothesis is put forward and it is determined whether the data is strong enough to reject it. For the sleep deprivation study, the null hypothesis would be that sleep deprivation has no effect on performance.

Sampling Error

When we looked at primary data collection and sampling methods, we stressed the importance of selecting a random sample so that every item or individual in the population had a known chance of being selected. To accomplish this, we could choose a simple random sample, a systematic sample, a stratified sample, a cluster sample, or a combination of these methods.

However, it is unlikely that the mean of a sample would be identical to the population mean. Likewise, the sample standard deviation or other measure computed from a sample would probably not be exactly equal to the corresponding population value. We can therefore expect some difference between a sample statistic, such as the sample mean or sample standard deviation, and the corresponding population parameter. The difference between a sample statistic and a population parameter is called sampling error.
Example: Suppose that a population of five students had exam results of 68, 72, 67, 69 and 74. Suppose that a sample of two results (68 and 74) is selected to estimate the population mean result. The mean of that sample would be (68 + 74)/2 = 71. Another sample of two is selected (72 and 67), with a sample mean of 69.5. The mean of all the results (the population mean) is 70. The sampling error for the first sample is 1.0, determined by $\bar{X} - \mu = 71 - 70$. The second sample has a sampling error of $69.5 - 70 = -0.5$.

Each of these differences, 1.0 and -0.5, is the error made in estimating the population mean based on a sample mean, and these sampling errors are due to chance. The size of these errors will vary from one sample to the next. So given the possibility of a sampling error when sample results are used to estimate a population parameter, how can we make accurate inferences/conclusions about the population based only on sample results? To begin with we develop a sampling distribution of the sample means.

Sampling Distribution of the Sample Means

The exam results example showed that the means for samples of a specified size vary from sample to sample. The mean exam result of the first sample of two students was 71, and the second sample mean was 69.5. A third sample would probably result in a different mean. The population mean was 70. If we organised the means of all possible samples of two results into a probability distribution, we would obtain the sampling distribution of the sample means.

Example: A firm has seven production workers (considered the population). The hourly earnings of each worker are given below.

[Table: Employee No. | Hourly Earnings (values not preserved in this transcription)]

1. What is the population mean?
2. What is the sampling distribution of the sample means for samples with a size of 2?
3. What is the mean of the sampling distribution?
4. What observations can be made about the population and the sampling distribution?

The population mean is found by: $\mu = \frac{\sum X}{N} = 7.71$

To get the sampling distribution of the sample means, all possible samples of 2 are selected without replacement from the population, and their means are computed. There are 21 possible samples, found by:

$${}_N C_n = \frac{N!}{n!\,(N-n)!} = \frac{7!}{2!\,5!} = 21$$

where N is the number of observations in the population and n is the number of observations in the sample. The sample means from all 21 possible samples of 2 that can be drawn from the population are shown below.

[Table: Sample | Employees | Hourly Earnings | Sum | Mean (values not preserved in this transcription)]

This probability distribution is the sampling distribution of the sample means and can be summarised as follows.

Sampling Distribution of the Sample Mean for n = 2

[Table: Sample Mean | No. of Means | Probability (values not preserved in this transcription)]

The mean of the sampling distribution of the sample mean is obtained by summing the various sample means and dividing the sum by the number of samples. The mean of all the sample means is usually written $\mu_{\bar{X}}$. The $\mu$ reminds us that it is a population value because we have considered all possible samples. The subscript $\bar{X}$ indicates that it is the sampling distribution of the mean.
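The construction described above can be sketched in a few lines of Python. Because the hourly earnings table is not reproduced here, this sketch uses the five exam results from the sampling error example (68, 72, 67, 69, 74) as an illustrative population; the variable names are assumptions made for the example, not part of the original handout.

```python
from collections import Counter
from itertools import combinations
from math import comb

# Illustrative population: the five exam results from the sampling error example.
population = [68, 72, 67, 69, 74]
n = 2  # sample size

mu = sum(population) / len(population)  # population mean

# Every possible sample of size n, drawn without replacement.
samples = list(combinations(population, n))
assert len(samples) == comb(len(population), n)  # N! / (n! (N - n)!)

# One mean per sample, and its sampling error (sample mean minus mu).
sample_means = [sum(s) / n for s in samples]
sampling_errors = [m - mu for m in sample_means]

# Sampling distribution: each distinct mean with its probability.
counts = Counter(sample_means)
distribution = {m: c / len(samples) for m, c in sorted(counts.items())}

# Mean of the sampling distribution (mu with subscript X-bar).
mean_of_means = sum(sample_means) / len(sample_means)

for m, p in distribution.items():
    print(f"sample mean {m:5.1f}  probability {p:.1f}")
print("population mean:", mu, " mean of sample means:", mean_of_means)
```

Enumerating every sample is only practical for tiny populations like this one, but it makes the definition concrete: the mean of all the sample means coincides with the population mean, and the sample means are less spread out than the individual results.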

These observations can be made:

a. The mean of the sample means (7.71) is equal to the mean of the population: $\mu_{\bar{X}} = \mu$.

b. The spread in the distribution of the sample means is less than the spread in the population values. The sample means range from 7 to 8.50, while the population values vary from 7 to 9. In fact, when samples are drawn from a large population (or with replacement), the standard deviation of the distribution of the sample means is equal to the population standard deviation divided by the square root of the sample size. So the formula for the standard deviation of the distribution of sample means is: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$. Therefore, as we increase the size of the sample, the spread of the distribution of the sample means becomes smaller.

c. The shape of the sampling distribution of the sample means and the shape of the frequency distribution of the population values are different. The distribution of sample means tends to be more bell-shaped and to approximate the normal probability distribution.

In summary, we took all possible random samples from a population and for each sample calculated a sample statistic (the mean). Because each possible sample has a chance of being selected, the probability that the mean amount earned will take a value such as 7.00 or 8.50 can be determined. The distribution of the mean amounts earned is called the sampling distribution of the sample means.

Even though in practice we see only one particular random sample, in theory any of the samples could arise. Consequently, we view the sampling process as repeated sampling of the statistic from its sampling distribution. This sampling distribution is then used to measure how likely a particular outcome might be.

The Central Limit Theorem

Applying the central limit theorem to the sampling distribution of the sample means allows us to use the normal probability distribution to create confidence intervals for the population mean. The central limit theorem states that, for large random samples, the shape of the sampling distribution of the sample means is close to a normal probability distribution. The approximation is more accurate for large samples than for small samples (a common rule of thumb is that a sample of 30 or more is large enough for the central limit theorem to be employed).

General Procedure

Sampling requires that we draw successive samples from a defined population. The samples must be randomly selected and of the same size. Calculate the mean for each sample and plot the sample means. This produces a distribution of sample means. A plot of an "infinite" number of sample means is called the sampling distribution of the mean.

Successive Sampling

Frequency distributions of sample means quickly approach the shape of a normal distribution, even if we are taking relatively few, small samples from a population that is not normally distributed. As we randomly select more and more samples from the population, the distribution of sample means becomes more normally distributed and looks smoother. With "infinite" numbers of successive random samples, the sampling distributions all have an approximately normal distribution with a mean that is equal to the population mean (μ).
Increasing Sample Size

As sample sizes increase, the sampling distributions approach a normal distribution. With "infinite" numbers of successive random samples, the mean of the sampling distribution is equal to the population mean (μ). As the sample size increases, the variability of each sampling distribution decreases. The range of the sampling distribution is smaller than the range of the original population.
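The effect of increasing sample size can be checked with a small simulation. This is a minimal sketch under stated assumptions: the population is taken to be exponential with rate 1 (a skewed, clearly non-normal distribution with mean 1 and standard deviation 1), "many" samples stands in for the "infinite" number in the text, and the function name, seed and repetition count are arbitrary choices for the example.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

random.seed(12345)  # arbitrary seed so reruns give the same numbers

POP_MEAN = 1.0  # mean of the assumed exponential population (rate 1)
POP_SD = 1.0    # its standard deviation is also 1


def simulate_sample_means(n, reps=4000):
    """Draw `reps` random samples of size n and return their means."""
    return [
        fmean(random.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


# For each sample size: (mean of sample means, their SD, sigma / sqrt(n)).
results = {}
for n in (5, 30, 100):
    means = simulate_sample_means(n)
    results[n] = (fmean(means), pstdev(means), POP_SD / sqrt(n))

for n, (centre, spread, theory) in results.items():
    print(f"n={n:3d}  mean of means={centre:.3f}  "
          f"sd of means={spread:.3f}  sigma/sqrt(n)={theory:.3f}")
```

In each row the mean of the sample means should sit close to the population mean, while the spread of the sample means shrinks roughly in line with σ/√n as n grows. Plotting a histogram of `means` for each n would also show the shape becoming more bell-like, although that step is left out to keep the sketch dependency-free.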

Taken together, these distributions suggest that the sample mean provides a good estimate of μ and that errors in our estimates (indicated by the variability of scores in the distribution) decrease as the size of the samples we draw from the population increases.

Population Distributions

The principles of successive sampling and increasing sample size work for all distributions. We can count on the sampling distribution of the mean being approximately normally distributed, no matter what the original population distribution looks like, as long as the sample size is relatively large.

The central limit theorem states that when an infinite number of successive random samples are taken from a population, the distribution of sample means calculated for each sample will become approximately normally distributed with mean μ and standard deviation $\sigma/\sqrt{n}$ as the sample size (n) becomes larger, irrespective of the shape of the population distribution.

This is one of the most useful conclusions in statistics. We can reason about the distribution of the sample means with no information about the shape of the original distribution from which the sample is taken, provided the population has a finite standard deviation σ. The central limit theorem as stated here concerns sample means; it does not describe the sampling distribution of other statistics. It tells us what to expect of the distribution of sample means when we take an infinite number of relatively large samples of a given size from a population, and this helps us test hypotheses about means.

Point Estimates & Confidence Intervals

In statistics, point estimation involves the use of sample data to calculate a single value (known as a statistic) which serves as a "best guess" for an unknown population parameter.
For example, the sample mean $\bar{X}$ is a statistic, and is a point estimate of the population mean μ, a parameter. While we expect the point estimate (statistic) to be close to the population parameter, we would like to measure how close (accurate) it is. A confidence interval serves this purpose. In contrast to point estimation, which gives a single number, with confidence intervals we use sample data to construct an interval (range) of possible (or probable) values of an unknown population parameter, so that the parameter occurs within that range at a specified probability. The specified probability is called the level of confidence.

The information developed about the shape of a sampling distribution of the sample means, that is, the sampling distribution of $\bar{X}$, allows us to locate an interval that has a specified probability of containing the population mean μ. For reasonably large samples, we can use the central limit theorem and state the following:

1. 95% of the sample means selected from a population will be within 1.96 standard deviations of the population mean μ.
2. 99% of the sample means will lie within 2.58 standard deviations of the population mean.

How are the values of 1.96 and 2.58 obtained? The 95% and 99% refer to the percent of the time that similarly constructed intervals would include the parameter being estimated. For example, 95% refers to the middle 95% of the observations. Therefore, the remaining 5% are equally divided between the two tails, leaving 2.5% in each; the z value that cuts off an upper tail of 0.025 is 1.96. In the same way, 99% leaves 0.5% in each tail, which corresponds to a z value of about 2.58. The central limit theorem states that, for large random samples, the shape of the sampling distribution of the sample means is approximately normal. Therefore, we use z-tables to look at areas under the normal curve (see z-table).

When the sample size, n, is at least 30, it is generally accepted that the central limit theorem will ensure an approximately normal distribution of the sample means. This is an important consideration. If the sample means are normally distributed, we can use the standard normal distribution, that is, z, in our calculations.

When n ≥ 30 the formula for getting the confidence interval for a mean is: $\bar{X} \pm z \frac{s}{\sqrt{n}}$ where:

$\bar{X}$ = the sample mean
z = appropriate z value for level of confidence
s = sample standard deviation
n = sample size

Example: An experiment involves selecting a random sample of 256 managers. One item of interest is annual income. The sample mean is 45,420, and the sample standard deviation is 2,050.

1. What is the estimated mean income of all managers (the population), i.e. what is the point estimate?
2. What is the 95% confidence interval for the population mean (rounded to the nearest 10)?
3. Interpret the findings.

1. The point estimate of the population mean is 45,420.

2. The confidence interval is found with $\bar{X}$ = 45,420, z = 1.96, s = 2,050 and n = 256:

$$\bar{X} \pm z \frac{s}{\sqrt{n}} = 45{,}420 \pm 1.96 \times \frac{2{,}050}{\sqrt{256}} = 45{,}420 \pm 251.125$$

This gives 45,168.875 and 45,671.125, or 45,170 and 45,670 when rounded to the nearest 10, so that 45,170 ≤ μ ≤ 45,670.

3. We can say that we are 95% confident that the unknown population mean income (μ) is between 45,170 and 45,670. If we had time to select many samples of size 256 from the population of managers and compute sample means and confidence intervals, the population mean annual income would be in about 95 of every 100 confidence intervals. Either an interval contains the population mean or not. About 5 out of every 100 confidence intervals would not contain the population mean annual income, μ.
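The managers example can be reproduced with Python's standard library. This is a sketch rather than a prescribed method: the helper names `z_for_confidence` and `mean_confidence_interval` are invented for the illustration, and the z value is taken from the inverse of the standard normal distribution instead of being read from a printed z-table.

```python
from math import sqrt
from statistics import NormalDist


def z_for_confidence(level):
    """Two-sided z value: leaves (1 - level) / 2 in each tail."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)


def mean_confidence_interval(x_bar, s, n, level=0.95):
    """Large-sample interval x_bar +/- z * s / sqrt(n), intended for n >= 30."""
    margin = z_for_confidence(level) * s / sqrt(n)
    return x_bar - margin, x_bar + margin


z95 = z_for_confidence(0.95)  # about 1.96
z99 = z_for_confidence(0.99)  # about 2.576, which the text rounds to 2.58

# Managers example: sample mean 45,420, sample SD 2,050, n = 256.
low, high = mean_confidence_interval(45_420, 2_050, 256)

print(f"z for 95%: {z95:.3f}, z for 99%: {z99:.3f}")
print(f"95% interval: {low:.2f} to {high:.2f}")
print(f"rounded to nearest 10: {round(low, -1):.0f} to {round(high, -1):.0f}")
```

Passing `level=0.99` to `mean_confidence_interval` widens the interval, which reflects the trade-off between the level of confidence and the precision of the estimate.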


More information

Inference for Single Proportions and Means T.Scofield

Inference for Single Proportions and Means T.Scofield Inference for Single Proportions and Means TScofield Confidence Intervals for Single Proportions and Means A CI gives upper and lower bounds between which we hope to capture the (fixed) population parameter

More information

Ordinary Least Squares Regression Explained: Vartanian

Ordinary Least Squares Regression Explained: Vartanian Ordinary Least Squares Regression Explained: Vartanian When to Use Ordinary Least Squares Regression Analysis A. Variable types. When you have an interval/ratio scale dependent variable.. When your independent

More information

Probability and Inference. POLI 205 Doing Research in Politics. Populations and Samples. Probability. Fall 2015

Probability and Inference. POLI 205 Doing Research in Politics. Populations and Samples. Probability. Fall 2015 Fall 2015 Population versus Sample Population: data for every possible relevant case Sample: a subset of cases that is drawn from an underlying population Inference Parameters and Statistics A parameter

More information

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples Objective Section 9.4 Inferences About Two Means (Matched Pairs) Compare of two matched-paired means using two samples from each population. Hypothesis Tests and Confidence Intervals of two dependent means

More information

UNIVERSITY OF TORONTO MISSISSAUGA. SOC222 Measuring Society In-Class Test. November 11, 2011 Duration 11:15a.m. 13 :00p.m.

UNIVERSITY OF TORONTO MISSISSAUGA. SOC222 Measuring Society In-Class Test. November 11, 2011 Duration 11:15a.m. 13 :00p.m. UNIVERSITY OF TORONTO MISSISSAUGA SOC222 Measuring Society In-Class Test November 11, 2011 Duration 11:15a.m. 13 :00p.m. Location: DV2074 Aids Allowed You may be charged with an academic offence for possessing

More information

Final Exam - Spring ST 370 Online - A

Final Exam - Spring ST 370 Online - A Final Exam - Spring 2002 - ST 370 Online - A Darken the circle on the answer sheet corresponding to your answer. Use a number 2 pencil. Stray marks on the form may cause errors. All questions are worth

More information

Math Released Item Algebra 1. Solve the Equation VH046614

Math Released Item Algebra 1. Solve the Equation VH046614 Math Released Item 2017 Algebra 1 Solve the Equation VH046614 Anchor Set A1 A8 With Annotations Prompt Rubric VH046614 Rubric Score Description 3 Student response includes the following 3 elements. Reasoning

More information

Applied Statistics for the Behavioral Sciences

Applied Statistics for the Behavioral Sciences Applied Statistics for the Behavioral Sciences Chapter 8 One-sample designs Hypothesis testing/effect size Chapter Outline Hypothesis testing null & alternative hypotheses alpha ( ), significance level,

More information

EXAM 3 Math 1342 Elementary Statistics 6-7

EXAM 3 Math 1342 Elementary Statistics 6-7 EXAM 3 Math 1342 Elementary Statistics 6-7 Name Date ********************************************************************************************************************************************** MULTIPLE

More information

ME3620. Theory of Engineering Experimentation. Spring Chapter IV. Decision Making for a Single Sample. Chapter IV

ME3620. Theory of Engineering Experimentation. Spring Chapter IV. Decision Making for a Single Sample. Chapter IV Theory of Engineering Experimentation Chapter IV. Decision Making for a Single Sample Chapter IV 1 4 1 Statistical Inference The field of statistical inference consists of those methods used to make decisions

More information

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8 CIVL - 7904/8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8 Chi-square Test How to determine the interval from a continuous distribution I = Range 1 + 3.322(logN) I-> Range of the class interval

More information

Sampling, Confidence Interval and Hypothesis Testing

Sampling, Confidence Interval and Hypothesis Testing Sampling, Confidence Interval and Hypothesis Testing Christopher Grigoriou Executive MBA HEC Lausanne 2007-2008 1 Sampling : Careful with convenience samples! World War II: A statistical study to decide

More information

ADMS2320.com. We Make Stats Easy. Chapter 4. ADMS2320.com Tutorials Past Tests. Tutorial Length 1 Hour 45 Minutes

ADMS2320.com. We Make Stats Easy. Chapter 4. ADMS2320.com Tutorials Past Tests. Tutorial Length 1 Hour 45 Minutes We Make Stats Easy. Chapter 4 Tutorial Length 1 Hour 45 Minutes Tutorials Past Tests Chapter 4 Page 1 Chapter 4 Note The following topics will be covered in this chapter: Measures of central location Measures

More information

Chapter 2 Descriptive Statistics

Chapter 2 Descriptive Statistics Chapter 2 Descriptive Statistics The Mean "When she told me I was average, she was just being mean". The mean is probably the most often used parameter or statistic used to describe the central tendency

More information

CENTRAL LIMIT THEOREM (CLT)

CENTRAL LIMIT THEOREM (CLT) CENTRAL LIMIT THEOREM (CLT) A sampling distribution is the probability distribution of the sample statistic that is formed when samples of size n are repeatedly taken from a population. If the sample statistic

More information

Inferential statistics

Inferential statistics Inferential statistics Inference involves making a Generalization about a larger group of individuals on the basis of a subset or sample. Ahmed-Refat-ZU Null and alternative hypotheses In hypotheses testing,

More information

Hypothesis testing: Steps

Hypothesis testing: Steps Review for Exam 2 Hypothesis testing: Steps Exam 2 Review 1. Determine appropriate test and hypotheses 2. Use distribution table to find critical statistic value(s) representing rejection region 3. Compute

More information

Normal Curve in standard form: Answer each of the following questions

Normal Curve in standard form: Answer each of the following questions Basic Statistics Normal Curve in standard form: Answer each of the following questions What percent of the normal distribution lies between one and two standard deviations above the mean? What percent

More information

Bus 216: Business Statistics II Introduction Business statistics II is purely inferential or applied statistics.

Bus 216: Business Statistics II Introduction Business statistics II is purely inferential or applied statistics. Bus 216: Business Statistics II Introduction Business statistics II is purely inferential or applied statistics. Study Session 1 1. Random Variable A random variable is a variable that assumes numerical

More information

Review. A Bernoulli Trial is a very simple experiment:

Review. A Bernoulli Trial is a very simple experiment: Review A Bernoulli Trial is a very simple experiment: Review A Bernoulli Trial is a very simple experiment: two possible outcomes (success or failure) probability of success is always the same (p) the

More information

9/2/2010. Wildlife Management is a very quantitative field of study. throughout this course and throughout your career.

9/2/2010. Wildlife Management is a very quantitative field of study. throughout this course and throughout your career. Introduction to Data and Analysis Wildlife Management is a very quantitative field of study Results from studies will be used throughout this course and throughout your career. Sampling design influences

More information

LC OL - Statistics. Types of Data

LC OL - Statistics. Types of Data LC OL - Statistics Types of Data Question 1 Characterise each of the following variables as numerical or categorical. In each case, list any three possible values for the variable. (i) Eye colours in a

More information

Probability and Samples. Sampling. Point Estimates

Probability and Samples. Sampling. Point Estimates Probability and Samples Sampling We want the results from our sample to be true for the population and not just the sample But our sample may or may not be representative of the population Sampling error

More information

DSST Principles of Statistics

DSST Principles of Statistics DSST Principles of Statistics Time 10 Minutes 98 Questions Each incomplete statement is followed by four suggested completions. Select the one that is best in each case. 1. Which of the following variables

More information

Performance Evaluation and Comparison

Performance Evaluation and Comparison Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Cross Validation and Resampling 3 Interval Estimation

More information

Tribhuvan University Institute of Science and Technology 2065

Tribhuvan University Institute of Science and Technology 2065 1CSc. Stat. 108-2065 Tribhuvan University Institute of Science and Technology 2065 Bachelor Level/First Year/ First Semester/ Science Full Marks: 60 Computer Science and Information Technology (Stat. 108)

More information

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, 2016-17 Academic Year Exam Version: A INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This

More information

Chapter 12: Estimation

Chapter 12: Estimation Chapter 12: Estimation Estimation In general terms, estimation uses a sample statistic as the basis for estimating the value of the corresponding population parameter. Although estimation and hypothesis

More information

CHAPTER 1. Introduction

CHAPTER 1. Introduction CHAPTER 1 Introduction Engineers and scientists are constantly exposed to collections of facts, or data. The discipline of statistics provides methods for organizing and summarizing data, and for drawing

More information

INTRODUCTION TO ANALYSIS OF VARIANCE

INTRODUCTION TO ANALYSIS OF VARIANCE CHAPTER 22 INTRODUCTION TO ANALYSIS OF VARIANCE Chapter 18 on inferences about population means illustrated two hypothesis testing situations: for one population mean and for the difference between two

More information

AP Statistics Ch 12 Inference for Proportions

AP Statistics Ch 12 Inference for Proportions Ch 12.1 Inference for a Population Proportion Conditions for Inference The statistic that estimates the parameter p (population proportion) is the sample proportion p ˆ. p ˆ = Count of successes in the

More information

Descriptive Statistics-I. Dr Mahmoud Alhussami

Descriptive Statistics-I. Dr Mahmoud Alhussami Descriptive Statistics-I Dr Mahmoud Alhussami Biostatistics What is the biostatistics? A branch of applied math. that deals with collecting, organizing and interpreting data using well-defined procedures.

More information

M(t) = 1 t. (1 t), 6 M (0) = 20 P (95. X i 110) i=1

M(t) = 1 t. (1 t), 6 M (0) = 20 P (95. X i 110) i=1 Math 66/566 - Midterm Solutions NOTE: These solutions are for both the 66 and 566 exam. The problems are the same until questions and 5. 1. The moment generating function of a random variable X is M(t)

More information

Section 9 1B: Using Confidence Intervals to Estimate the Difference ( p 1 p 2 ) in 2 Population Proportions p 1 and p 2 using Two Independent Samples

Section 9 1B: Using Confidence Intervals to Estimate the Difference ( p 1 p 2 ) in 2 Population Proportions p 1 and p 2 using Two Independent Samples Section 9 1B: Using Confidence Intervals to Estimate the Difference ( p 1 p 2 ) in 2 Population Proportions p 1 and p 2 using Two Independent Samples If p 1 p 1 = 0 then there is no difference in the 2

More information

Big Data Analysis with Apache Spark UC#BERKELEY

Big Data Analysis with Apache Spark UC#BERKELEY Big Data Analysis with Apache Spark UC#BERKELEY This Lecture: Relation between Variables An association A trend» Positive association or Negative association A pattern» Could be any discernible shape»

More information

Sampling Distributions

Sampling Distributions Sampling Distributions Sampling Distribution of the Mean & Hypothesis Testing Remember sampling? Sampling Part 1 of definition Selecting a subset of the population to create a sample Generally random sampling

More information

Psychology 282 Lecture #4 Outline Inferences in SLR

Psychology 282 Lecture #4 Outline Inferences in SLR Psychology 282 Lecture #4 Outline Inferences in SLR Assumptions To this point we have not had to make any distributional assumptions. Principle of least squares requires no assumptions. Can use correlations

More information

Statistics Primer. ORC Staff: Jayme Palka Peter Boedeker Marcus Fagan Trey Dejong

Statistics Primer. ORC Staff: Jayme Palka Peter Boedeker Marcus Fagan Trey Dejong Statistics Primer ORC Staff: Jayme Palka Peter Boedeker Marcus Fagan Trey Dejong 1 Quick Overview of Statistics 2 Descriptive vs. Inferential Statistics Descriptive Statistics: summarize and describe data

More information

Frequency table: Var2 (Spreadsheet1) Count Cumulative Percent Cumulative From To. Percent <x<=

Frequency table: Var2 (Spreadsheet1) Count Cumulative Percent Cumulative From To. Percent <x<= A frequency distribution is a kind of probability distribution. It gives the frequency or relative frequency at which given values have been observed among the data collected. For example, for age, Frequency

More information

t-test for b Copyright 2000 Tom Malloy. All rights reserved. Regression

t-test for b Copyright 2000 Tom Malloy. All rights reserved. Regression t-test for b Copyright 2000 Tom Malloy. All rights reserved. Regression Recall, back some time ago, we used a descriptive statistic which allowed us to draw the best fit line through a scatter plot. We

More information

Background to Statistics

Background to Statistics FACT SHEET Background to Statistics Introduction Statistics include a broad range of methods for manipulating, presenting and interpreting data. Professional scientists of all kinds need to be proficient

More information

Statistical inference provides methods for drawing conclusions about a population from sample data.

Statistical inference provides methods for drawing conclusions about a population from sample data. Introduction to inference Confidence Intervals Statistical inference provides methods for drawing conclusions about a population from sample data. 10.1 Estimating with confidence SAT σ = 100 n = 500 µ

More information