The Chi-Square Distributions

MATH 183 The Chi-Square Distributions Dr. Neal, WKU

The chi-square distributions can be used in statistics to analyze the standard deviation $\sigma$ of a normally distributed measurement and to test the goodness of fit of various population models on a set of data. A chi-square distribution is based on a parameter known as the degrees of freedom $n$, where $n$ is an integer greater than or equal to 1. Such a random variable is denoted by $X \sim \chi^2(n)$. The $\chi^2(n)$ distribution is defined to be the distribution of the sum of the squares of $n$ independent standard normal random variables.

For example, suppose $X_1, \ldots, X_n$ are independent normally distributed measurements having mean $\mu_i$ and standard deviation $\sigma_i$ for $i = 1, \ldots, n$. These measurements could be the heights or IQ scores of various groups of people. By subtracting the mean and then dividing by the standard deviation, we convert each measurement into a standard normal distribution:

$$Z_i = \frac{X_i - \mu_i}{\sigma_i} \sim N(0, 1), \quad \text{for } 1 \le i \le n.$$

So $Z_1 \sim N(0, 1)$, and its distribution graph is the common bell-shaped curve that is symmetric about the origin. Then $Z_1^2 \sim \chi^2(1)$. Its plot consists of positive values concentrated near the origin, and it has mean 1 and variance 2.

[Figure: graphs of the standard normal distribution and of the $\chi^2(1)$, $\chi^2(2)$, and $\chi^2(n)$ distributions]

By standardizing, squaring, and summing random measurements from the respective normal populations, we obtain a chi-square distribution with $n$ degrees of freedom:

$$\chi^2(n) = \left(\frac{X_1 - \mu_1}{\sigma_1}\right)^2 + \left(\frac{X_2 - \mu_2}{\sigma_2}\right)^2 + \cdots + \left(\frac{X_n - \mu_n}{\sigma_n}\right)^2 = Z_1^2 + Z_2^2 + \cdots + Z_n^2.$$

The distribution graphs for $n \ge 3$ are skewed bell-shaped curves, defined on $[0, \infty)$, and the point $x$ at which the graph attains its maximum increases with $n$. The mean is now $n$, the variance is $2n$, and the standard deviation is $\sqrt{2n}$. For $n \ge 3$, the maximum (mode) occurs when $x = n - 2$.

Summary for $X \sim \chi^2(n) = Z_1^2 + Z_2^2 + \cdots + Z_n^2$:
Mean = $n$
Variance = $2n$
Standard Deviation = $\sqrt{2n}$
Mode = $n - 2$ (for $n \ge 3$)
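The definition as a sum of squared standard normals can be checked numerically. The following is a minimal simulation sketch, not part of the original notes: the function name `simulate_chi_square`, the seed, and the choice of $n = 5$ with 20000 trials are all assumptions made for illustration. The sample mean and sample variance should land near $n$ and $2n$.

```python
import random
import statistics

def simulate_chi_square(n, trials, seed=0):
    """Draw `trials` values of Z_1^2 + ... + Z_n^2 with independent Z_i ~ N(0, 1).

    Hypothetical helper for illustration; the seed only makes the run repeatable.
    """
    rng = random.Random(seed)
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)) for _ in range(trials)]

n = 5
samples = simulate_chi_square(n, trials=20000)
sample_mean = statistics.fmean(samples)    # theory: n
sample_var = statistics.variance(samples)  # theory: 2n

print(f"mean     ~ {sample_mean:.3f} (theory {n})")
print(f"variance ~ {sample_var:.3f} (theory {2 * n})")
```

Because the values are simulated, the printed numbers only approximate the theoretical mean and variance; more trials tighten the agreement.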

The theoretical distribution curve is given by

$$f(x) = C_n \, x^{n/2 - 1} e^{-x/2}, \quad \text{for } x \ge 0,$$

where $C_n$ is a constant that depends on $n$, given by

$$C_n = \begin{cases} \dfrac{1}{2^{n/2} \left(\frac{n}{2} - 1\right)!} & \text{for } n \text{ even,} \\[2ex] \dfrac{2^{(n-2)/2} \left(\frac{n-1}{2}\right)!}{(n-1)! \, \sqrt{\pi}} & \text{for } n \text{ odd.} \end{cases}$$

A chi-square curve can be plotted on a TI calculator using the built-in χ²pdf( command from the DISTR menu. For example, to graph the $\chi^2(10)$ curve, enter χ²pdf(X,10) into the Y= screen. To compute $P(a \le X \le b)$ for $X \sim \chi^2(n)$, enter χ²cdf(a, b, n) or Shadeχ²(a, b, n).

Example 1. Let $X \sim \chi^2(10)$.
(a) Where does the maximum of the curve occur?
(b) Compute $P(6 \le X \le 10)$. Is there symmetry at the outer tails; i.e., does $P(0 \le X \le 6) = P(X \ge 10)$?
(c) Find the left and right bounds that contain 90% of the distribution.

Solution.
(a) For $X \sim \chi^2(10)$, the maximum (mode) occurs when $x = n - 2 = 8$.
(b) From the TI output, we see that $P(6 \le X \le 10) \approx 0.37477$. Also, the left tail is $P(0 \le X \le 6) \approx 0.1847$, and the right tail is $P(X \ge 10) \approx 0.4405$. So the two tails outside of the inner region $6 \le X \le 10$ are not symmetric.
(c) For there to be 90% in the middle of the distribution, we must have 5% at each tail. The values where these occur (chi-square scores) can be found with the table of chi-square scores below. In this case, the values are about 3.940 and 18.31.
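For readers without a TI calculator, the density and the probabilities in Example 1 can be checked with a short Python sketch. The names `chi2_pdf` and `chi2_cdf` are hypothetical and not part of the original notes. The sketch assumes the constant can be written as $C_n = 1/(2^{n/2}\,\Gamma(n/2))$, which agrees with the factorial formulas for even and odd $n$, and it evaluates the cumulative probability with a standard power series for the regularized incomplete gamma function rather than with whatever method the calculator uses.

```python
import math

def chi2_pdf(x, n):
    """Density C_n * x^(n/2 - 1) * e^(-x/2) for x > 0; returns 0 otherwise.

    Hypothetical helper; the endpoint x = 0 is ignored in this sketch.
    """
    if x <= 0:
        return 0.0
    log_c = -(n / 2.0) * math.log(2.0) - math.lgamma(n / 2.0)  # log of C_n
    return math.exp(log_c + (n / 2.0 - 1.0) * math.log(x) - x / 2.0)

def chi2_cdf(x, n):
    """P(X <= x) for X ~ chi-square(n), via a series for the incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, z = n / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:  # add terms until they no longer change the sum
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

# Example 1 with n = 10 degrees of freedom.
middle = chi2_cdf(10, 10) - chi2_cdf(6, 10)  # P(6 <= X <= 10)
left_tail = chi2_cdf(6, 10)                  # P(0 <= X <= 6)
right_tail = 1.0 - chi2_cdf(10, 10)          # P(X >= 10)
print(round(middle, 5), round(left_tail, 4), round(right_tail, 4))
```

The three printed values should agree with the calculator results quoted in the solution, and evaluating `chi2_pdf` near $x = 8$ shows the density peaking there.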

Left and Right Chi-Square Scores for 80%, 90%, 95%, and 98% Intervals
(L = probability of left tail, R = probability of right tail)

d.f.   L=0.01  L=0.025  L=0.05  L=0.10  R=0.10  R=0.05  R=0.025  R=0.01
  1    0.000   0.001    0.004   0.016   2.706   3.841   5.024    6.635
  2    0.020   0.051    0.103   0.211   4.605   5.991   7.378    9.210
  3    0.115   0.216    0.352   0.584   6.251   7.815   9.348    11.34
  4    0.297   0.484    0.711   1.064   7.779   9.488   11.14    13.28
  5    0.554   0.831    1.145   1.610   9.236   11.07   12.83    15.09
  6    0.872   1.237    1.635   2.204   10.64   12.59   14.45    16.81
  7    1.239   1.690    2.167   2.833   12.02   14.07   16.01    18.48
  8    1.646   2.180    2.733   3.490   13.36   15.51   17.54    20.09
  9    2.088   2.700    3.325   4.168   14.68   16.92   19.02    21.67
 10    2.558   3.247    3.940   4.865   15.99   18.31   20.48    23.21
 11    3.053   3.816    4.575   5.578   17.28   19.68   21.92    24.72
 12    3.571   4.404    5.226   6.304   18.55   21.03   23.34    26.22
 13    4.107   5.009    5.892   7.042   19.81   22.36   24.74    27.69
 14    4.660   5.629    6.571   7.790   21.06   23.68   26.12    29.14
 15    5.229   6.262    7.261   8.547   22.31   25.00   27.49    30.58
 16    5.812   6.908    7.962   9.312   23.54   26.30   28.84    32.00
 17    6.408   7.564    8.672   10.08   24.77   27.59   30.19    33.41
 18    7.015   8.231    9.390   10.86   25.99   28.87   31.53    34.80
 19    7.633   8.907    10.12   11.65   27.20   30.14   32.85    36.19
 20    8.260   9.591    10.85   12.44   28.41   31.41   34.17    37.57
 21    8.897   10.28    11.59   13.24   29.62   32.67   35.48    38.93
 22    9.542   10.98    12.34   14.04   30.81   33.92   36.78    40.29
 23    10.20   11.69    13.09   14.85   32.01   35.17   38.08    41.64
 24    10.86   12.40    13.85   15.66   33.20   36.42   39.36    42.98
 25    11.52   13.12    14.61   16.47   34.38   37.65   40.65    44.31
 26    12.20   13.84    15.38   17.29   35.56   38.88   41.92    45.64
 27    12.88   14.57    16.15   18.11   36.74   40.11   43.19    46.96
 28    13.56   15.31    16.93   18.94   37.92   41.34   44.46    48.28
 29    14.26   16.05    17.71   19.77   39.09   42.56   45.72    49.59
 30    14.95   16.79    18.49   20.60   40.26   43.77   46.98    50.89
 40    22.16   24.43    26.51   29.05   51.80   55.76   59.34    63.69
 50    29.71   32.36    34.76   37.69   63.17   67.50   71.42    76.15
 60    37.48   40.48    43.19   46.46   74.40   79.08   83.30    88.38
 70    45.44   48.76    51.74   55.33   85.53   90.53   95.02    100.4
 80    53.54   57.15    60.39   64.28   96.58   101.9   106.6    112.3
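Entries of this table can be reproduced numerically by inverting the cumulative distribution. The sketch below is one possible approach under stated assumptions: `chi2_cdf` and `chi2_score` are hypothetical names, the CDF is evaluated with a standard incomplete gamma series, and the inversion uses plain bisection, which is simple but not the fastest method.

```python
import math

def chi2_cdf(x, n):
    """P(X <= x) for X ~ chi-square(n), via a series for the incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, z = n / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_score(left_prob, n):
    """Return x with P(X <= x) = left_prob for X ~ chi-square(n), found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, n) < left_prob:  # widen the bracket until it contains x
        hi *= 2.0
    for _ in range(100):                # halve the bracket repeatedly
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, n) < left_prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Row d.f. = 10: the L column is a left-tail probability, the R column a right-tail one.
left_tails = [0.01, 0.025, 0.05, 0.10]
right_tails = [0.10, 0.05, 0.025, 0.01]
row = [chi2_score(p, 10) for p in left_tails]
row += [chi2_score(1.0 - p, 10) for p in right_tails]
print([round(v, 3) for v in row])
```

A right-tail probability R corresponds to a left-tail probability of 1 − R, which is why the second half of the row is computed from `1.0 - p`.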

Theorems

I. Let $\{x_1, x_2, \ldots, x_n\}$ denote the collection of all random samples of size $n$ from a normally distributed measurement having variance $\sigma^2$. Let

$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

be the distribution of all possible sample variances. Then $\dfrac{(n-1) S^2}{\sigma^2}$ has a $\chi^2(n-1)$ distribution. Thus with a normally distributed measurement, we can evaluate $P(a \le S \le b)$, provided $\sigma$ is known, by

$$P(a \le S \le b) = P(a^2 \le S^2 \le b^2) = P\left(\frac{(n-1)a^2}{\sigma^2} \le \frac{(n-1)S^2}{\sigma^2} \le \frac{(n-1)b^2}{\sigma^2}\right) = P\left(\frac{(n-1)a^2}{\sigma^2} \le \chi^2(n-1) \le \frac{(n-1)b^2}{\sigma^2}\right).$$

II. Let $S^2$ be the sample variance from a random sample of size $n$ of a normally distributed measurement having variance $\sigma^2$. A confidence interval for $\sigma^2$, with level of confidence $r = 1 - \alpha$, is given by

$$\frac{(n-1)S^2}{R} \le \sigma^2 \le \frac{(n-1)S^2}{L},$$

where $L$ and $R$ are the left and right bounds of the $\chi^2(n-1)$ distribution that give probability $r$ in the middle. A confidence interval for $\sigma$ is

$$\sqrt{\frac{(n-1)S^2}{R}} \le \sigma \le \sqrt{\frac{(n-1)S^2}{L}}.$$

III. To test the null hypothesis $H_0: \sigma = M$ for a normally distributed measurement, we obtain the sample standard deviation $S$ from a random sample of size $n$. The test statistic is then

$$x = \frac{(n-1)S^2}{\sigma^2} = \frac{(n-1)S^2}{M^2},$$

which is compared with the $\chi^2(n-1)$ distribution. Compute the (left-tail) $P$-value $P(\chi^2(n-1) \le x)$ for the alternative $H_a: \sigma < M$, and compute the (right-tail) $P$-value $P(\chi^2(n-1) \ge x)$ for the alternative $H_a: \sigma > M$.
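Theorem I can be illustrated by simulation: if the scaled sample variance $(n-1)S^2/\sigma^2$ really follows $\chi^2(n-1)$, its values should average about $n - 1$ with variance about $2(n-1)$. The sketch below is only a plausibility check, not a proof, and everything in it is an assumption for illustration: the function name `scaled_sample_variances`, the population $N(100, 15)$, the sample size $n = 8$, the seed, and the number of trials.

```python
import random
import statistics

def scaled_sample_variances(n, mu, sigma, trials, seed=1):
    """Simulate (n - 1) * S^2 / sigma^2 for repeated normal samples of size n.

    Hypothetical helper; statistics.variance uses the n - 1 divisor, matching S^2.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        values.append((n - 1) * statistics.variance(sample) / sigma ** 2)
    return values

n = 8
values = scaled_sample_variances(n, mu=100.0, sigma=15.0, trials=20000)
observed_mean = statistics.fmean(values)    # theory: n - 1
observed_var = statistics.variance(values)  # theory: 2 * (n - 1)

print(f"mean     ~ {observed_mean:.3f} (theory {n - 1})")
print(f"variance ~ {observed_var:.3f} (theory {2 * (n - 1)})")
```

Matching the first two moments does not by itself establish the full distribution, but it is a quick sanity check that the degrees of freedom are $n - 1$ rather than $n$.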

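Example 2. Random samples of size 46 are taken from a measurement that is $N(100, 15)$. What is $P(13 \le S \le 17)$?

Example 3. From a normally distributed measurement, a sample of size 20 yields $S = 3.96$. Find a 98% confidence interval for the true standard deviation $\sigma$.

Example 4. From a normally distributed measurement, a sample of size 25 yields a sample deviation of 13.96. Is there evidence to reject the hypothesis $H_0: \sigma = 15$?

Solutions

Example 2: With $n = 46$ and $\sigma = 15$,

$$P(13 \le S \le 17) = P(13^2 \le S^2 \le 17^2) = P\left(\frac{(n-1)13^2}{\sigma^2} \le \frac{(n-1)S^2}{\sigma^2} \le \frac{(n-1)17^2}{\sigma^2}\right) = P\left(\frac{45 \cdot 169}{225} \le \chi^2(n-1) \le \frac{45 \cdot 289}{225}\right) = P(33.8 \le \chi^2(45) \le 57.8) \approx 0.794$$

(using χ²cdf(33.8, 57.8, 45)).

Example 3: With 19 degrees of freedom and 1% in each tail, $L = 7.633$ and $R = 36.19$. Then

$$\sqrt{\frac{(n-1)S^2}{R}} \le \sigma \le \sqrt{\frac{(n-1)S^2}{L}}; \quad \text{hence,} \quad \sqrt{\frac{19 \cdot 3.96^2}{36.19}} \le \sigma \le \sqrt{\frac{19 \cdot 3.96^2}{7.633}},$$

or $2.8693 \le \sigma \le 6.24776$.

Example 4: For $S = 13.96$, we use the alternative $H_a: \sigma < 15$. The test statistic is

$$x = \frac{(n-1)S^2}{\sigma^2} = \frac{24 \cdot 13.96^2}{15^2} = 20.78737 \sim \chi^2(n-1) = \chi^2(24),$$

and $P(\chi^2(24) \le 20.78737) \approx 0.348765$ (using χ²cdf(0, 20.78737, 24)). If $\sigma = 15$ were true, then there would still be a 34.8765% chance of obtaining a sample deviation of 13.96 or lower with a sample of size 25. There is not enough evidence to reject $H_0$.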
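The three solutions above can be reproduced without a calculator. The following Python sketch is one way to do so under stated assumptions: all function names (`chi2_cdf`, `chi2_score`, `prob_s_between`, `sigma_interval`, `left_tail_test`) are hypothetical, the CDF uses a standard incomplete gamma series, and the chi-square scores come from bisection instead of the printed table, so the last digits may differ slightly from the table-based answers.

```python
import math

def chi2_cdf(x, n):
    """P(X <= x) for X ~ chi-square(n), via a series for the incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, z = n / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_score(left_prob, n):
    """Return x with P(X <= x) = left_prob, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, n) < left_prob:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, n) < left_prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def prob_s_between(a, b, n, sigma):
    """Theorem I: P(a <= S <= b) for samples of size n when sigma is known."""
    df = n - 1
    return chi2_cdf(df * b ** 2 / sigma ** 2, df) - chi2_cdf(df * a ** 2 / sigma ** 2, df)

def sigma_interval(s, n, level):
    """Theorem II: confidence interval for sigma at the given confidence level."""
    df = n - 1
    alpha = 1.0 - level
    left, right = chi2_score(alpha / 2.0, df), chi2_score(1.0 - alpha / 2.0, df)
    return math.sqrt(df * s ** 2 / right), math.sqrt(df * s ** 2 / left)

def left_tail_test(s, n, m):
    """Theorem III: test statistic and left-tail P-value for H0: sigma = m."""
    df = n - 1
    x = df * s ** 2 / m ** 2
    return x, chi2_cdf(x, df)

example2 = prob_s_between(13, 17, n=46, sigma=15)
example3 = sigma_interval(3.96, n=20, level=0.98)
example4 = left_tail_test(13.96, n=25, m=15)
print(round(example2, 3))
print(tuple(round(v, 4) for v in example3))
print(tuple(round(v, 5) for v in example4))
```

For a right-tail alternative $H_a: \sigma > M$, the P-value would instead be one minus the CDF at the test statistic.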

Exercises

1. Let $X \sim \chi^2(15)$. Find (a) $P(13 \le X \le 17)$, (b) $P(X < 13)$, and (c) $P(X > 17)$. Show a graph for each. (d) Find the bounds that contain 95% of the distribution.

2. Adult heights are found to be normally distributed with mean $\mu = 68$ inches and standard deviation $\sigma = 3.5$ inches. Suppose various random samples of size $n = 26$ are collected. Compute $P(2.8 \le S \le 4.2)$.

3. From a normally distributed measurement, a sample of size 25 yields a sample deviation of 14.85. Find a 95% confidence interval for the true standard deviation.

4. From a normally distributed measurement, a sample of size 16 yields $S = 4.26$. Is there evidence to reject the hypothesis $H_0: \sigma = 3$?

Answers:

1. (a) 0.2834 (b) 0.3977 (c) 0.3189 (d) $L = 6.262$ and $R = 27.49$

2. $P\left(\dfrac{25 \cdot 2.8^2}{3.5^2} \le \chi^2(25) \le \dfrac{25 \cdot 4.2^2}{3.5^2}\right) = P(16 \le \chi^2(25) \le 36) \approx 0.843$

3. Use $\dfrac{24 \cdot 14.85^2}{39.36} \le \sigma^2 \le \dfrac{24 \cdot 14.85^2}{12.40}$ to obtain $11.6 \le \sigma \le 20.66$.

4. Test stat = 30.246, and $P(\chi^2(15) \ge 30.246) \approx 0.011$. If $\sigma = 3$ were true, then there is only a 1.1% chance of getting an $S$ of 4.26 or higher with a sample of size 16. Can reject $H_0$ in favor of $H_a: \sigma > 3$.