
6 Sampling Distributions

Statisticians use the word population to refer to the total number of (potential) observations under consideration. The population is just the set of all possible outcomes in our sample space (chapter 3). Therefore, a population may be finite (e.g. the number of households in the US) or (effectively) infinite (e.g. the number of stars in the universe).

e.g. question: average number of TV sets per household in the US
population: number of TV sets in each household in the US

question: average number of TV sets per household in North America
population: number of TV sets in each household in Canada, the US and Mexico

question: probability that a star has planets
population: number of planets per star for all stars (past, present, future) in all galaxies in the universe

In answering questions (e.g. what is the mean, what is the variance, what is the probability) for a given population, one seldom answers the questions using the entire population. In practice the questions are answered from a subset (a sample) of the population. It is important to choose the sample in a way that does not bias the answers. This is the subject of an area of statistics referred to as experimental design (how to design the sample so that it adequately reflects the entire population).

e.g. in determining the probability of getting a pair in a poker hand, you would not sample only poker hands that contained two pairs. (Technically this would be an attempt to determine the probability P(pair) for the entire population by approximating it by a conditional probability P(pair | two pair).)

e.g. To determine the average length of logs moving on a conveyor belt at constant speed, one might decide to measure only the logs that pass a certain point on the conveyor belt every 10 minutes. Upon reflection, you realize that longer logs have a greater probability of being at the measuring point at the selected times; thus the sample would give a biased average length measure that would be too large.

e.g. to determine the expected lifetime of a tire, you only test it on smooth, paved roads?

e.g. to determine fuel ratings on cars, the EPA presumes that every car is driven 55 percent of the time in the city and 45 percent of the time on the highway!?

One way to ensure unbiased sampling is to ensure your subset is a random sample. Suppose our sample is to consist of n observations, x_1, x_2, …, x_n. We have to select the first observation x_1, then the second x_2, etc. We think of the procedure for picking x_k as selecting a value for a random variable X_k; that is, we think of picking values x_1, x_2, …, x_n for our sample as the process of picking values for random variables X_1, X_2, …, X_n. Using this thinking, we can define a random sample as follows:

finite population: A set of observations X_1, X_2, …, X_n constitutes a random sample of size n from a finite population of size N if values for the set are chosen so that each subset of n of the N elements of the population has the same probability of being selected.

infinite population: A set of observations X_1, X_2, …, X_n constitutes a random sample of size n from the infinite population described by distribution (discrete) or density (continuous) f(x) if
1. each X_i is a RV whose distribution/density is given by f(x)
2. the n RVs are independent

The phrase random sample is applied both to the RVs X_1, X_2, …, X_n and their values x_1, x_2, …, x_n.

How to achieve a random sample?
e.g. the population is finite (and relatively small): Label each element of the population 1, 2, …, N. Draw n numbers sequentially, in groups of digits, from a random digits table.
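
For a small finite population this procedure can be sketched directly in Python (the population labels and sizes below are made up for the illustration). The standard-library random.sample draws n distinct elements so that every subset of size n is equally likely, which is exactly the finite-population definition above:

```python
import random

# A minimal sketch of simple random sampling from a finite population,
# assuming a labeled population of N = 50 elements (made up for the example).
random.seed(1)  # fixed seed so the sketch is reproducible

population = list(range(1, 51))   # elements labeled 1, 2, ..., N
n = 10
sample = random.sample(population, n)  # each size-n subset equally likely

print(sorted(sample))
print(len(set(sample)) == n)      # all n draws are distinct
```

In practice random.sample plays the role of the random digits table: it selects without replacement, so no element is drawn twice.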

When the population size is large or infinite, this process can become practically impossible, and careful thought must be given to an, at least approximate, random sampling design.
e.g. areal sampling using a regular grid works if the underlying population (e.g. chemical contaminant concentration) is relatively homogeneous. It doesn't work if the underlying population is spatially concentrated.
e.g. replicate sampling in anomalous areas

6.2 Sampling Distribution of the Mean

For each sample x_1, x_2, …, x_n of n observations, we can compute a mean x̄. The mean value will vary with each of our samples. Thus we can think of the sample mean (the mean value for each sample) as a random variable X̄ obeying some distribution function f(x̄; n). The distribution f(x̄; n) is referred to as the theoretical sampling distribution. We put aside for the moment the question of the form of f(x̄; n) and note that, in chapter 5.10, we have already computed the mean and variance of f(x̄; n) in the case of continuous RVs.

Theorem 6.1: If a random sample X_1, X_2, …, X_n of size n is taken from a population having mean μ and variance σ², then X̄ is a RV whose distribution f(x̄; n) has:
infinite population: mean value E(X̄) = μ and variance Var(X̄) = σ²/n
finite population: mean value E(X̄) = μ and variance Var(X̄) = (σ²/n)·(N − n)/(N − 1)

Note: The appearance of the term (N − n)/(N − 1) in the variance of X̄ in the finite population case is unexpected based upon the calculation in 5.10. The calculations in 5.10, when applied to a finite population, assume that n ≪ N. This correction factor, called the finite population correction (fpc) factor, is included to account for cases in which n ∼ N. Note that the fpc factor = 0 for n = N (i.e. Var(X̄) = 0 when n = N). This implies that, when one sample is taken using the entire population, x̄ exactly measures the population mean with no error (variance).

e.g. For N = 1,000 and n = 10, the fpc is fpc = 990/999 = 0.991

Note that the results in Theorem 6.1 are independent of what f(x̄; n) may actually be!!!

Apply Chebyshev's theorem to the RV X̄: P(|X̄ − μ| > kσ/√n) < 1/k². Let ε = kσ/√n, i.e. k = ε√n/σ, giving
P(|X̄ − μ| > ε) < σ²/(nε²)
Therefore, for any (arbitrarily small but) non-zero value of ε, the probability that X̄ differs from μ can be made arbitrarily small by making n large enough. (We need n ≫ σ²/ε², which means n must get very large as ε gets small.) This observation is known as the law of large numbers (if you make the sample size large enough, a single sample is sufficient to give a value for x̄ arbitrarily close to the population mean).
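
The Chebyshev bound above can be sketched by simulation. In this illustration (a made-up setup) the population is uniform on [0, 1], so μ = 0.5 and σ² = 1/12, and we estimate P(|X̄ − μ| > ε) for increasing n against the bound σ²/(nε²):

```python
import random

# Law of large numbers by simulation: uniform population on [0, 1]
# (mean mu = 0.5, variance sigma^2 = 1/12).  Chebyshev gives
# P(|Xbar - mu| > eps) < sigma^2 / (n * eps^2), so larger n makes the
# sample mean cluster near mu.  Population and eps are made up here.
random.seed(2)

mu, var, eps = 0.5, 1.0 / 12.0, 0.05

def fraction_outside(n, trials=2000):
    """Estimate P(|Xbar - mu| > eps) for samples of size n."""
    count = 0
    for _ in range(trials):
        xbar = sum(random.random() for _ in range(n)) / n
        if abs(xbar - mu) > eps:
            count += 1
    return count / trials

for n in (10, 100, 1000):
    bound = min(var / (n * eps * eps), 1.0)
    print(n, fraction_outside(n), bound)
```

The observed fractions fall well under the Chebyshev bound (which is loose) and shrink toward zero as n grows, which is the law of large numbers in action.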

Theorem 6.2: Let X_1, X_2, …, X_n be a random sample, each having the same mean value μ and variance σ². Then for any ε > 0,
P(|X̄ − μ| > ε) → 0 as n → ∞
As the sample size gets large, the probability that the average from a single random sample differs from the true mean goes to zero. Again this result on X̄ is independent of what f(x̄; n) may actually be.

e.g. In an experiment, event A occurs with probability p. Repeat the experiment n times and compute the relative frequency of occurrence of A = (number of times A occurs in n trials)/n. Show that the relative frequency of A → p as n → ∞.

Consider each trial as an independent RV, X_1, X_2, …, X_n. Each X_i takes on two values, x_i = 0, 1, depending on whether A does not or does occur in experiment i. X_i has mean value
E(X_i) = 0·(1 − p) + 1·p = p
and variance
Var(X_i) = E(X_i²) − E(X_i)² = 0²·(1 − p) + 1²·p − p² = p(1 − p)
Then X_1 + X_2 + … + X_n records the number of times A occurs in n trials, and X̄ = (X_1 + X_2 + … + X_n)/n is in fact the relative frequency of occurrence of A. From Theorem 6.2 we have
P(|X̄ − p| > ε) < p(1 − p)/(nε²) → 0 for any p ∈ [0, 1] as n → ∞

σ_X̄ = √Var(X̄) = σ/√n is referred to as the standard error of the mean. To reduce the standard error by a factor of two, it is necessary to increase n by a factor of 4. Thus (unfortunately) increasing sample size decreases the standard error at a relatively slow rate. (e.g. if n goes from 25 to 2,500 (a factor of 100), the standard error decreases only by a factor of 10.)

While the results in Theorems 6.1 and 6.2 are independent of the form of the theoretical sampling distribution/density f(x̄; n), the actual form of f(x̄; n) depends on knowing the probability distribution which governs the population. In general it can be very difficult to compute the form of f(x̄; n). Two results are known, both presented as theorems.

Theorem 6.3 (central limit theorem): Let X̄ be the mean of a random sample of size n taken from a population having mean μ and variance σ². Then the associated RV, the standardized sample mean
Z = (X̄ − μ)/(σ/√n)
is a RV whose distribution function approaches the standard normal distribution as n → ∞.

The central limit theorem says that, as n → ∞, the theoretical sampling distribution f(x̄; n) → a normal distribution (i.e. X̄ is normally distributed) with mean μ and variance σ²/n.

[Figures: the distribution f(x̄; n) of X̄ for samples of size n from a population with an exponential distribution, and from a population with a uniform distribution.]

In practice, the distribution for X̄ is well approximated by a normal distribution for n as small as 25 to 30.
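
This can be sketched by simulation. Below (a made-up illustration) the population is exponential with mean 1, which is strongly skewed, yet with n = 30 the standardized sample mean already behaves very much like a standard normal: mean near 0, standard deviation near 1, and roughly half its mass below 0:

```python
import random
import statistics

# Central limit theorem by simulation: exponential population with
# mean 1 (so mu = 1, sigma = 1), a skewed density.  The standardized
# sample mean Z = (Xbar - mu) / (sigma / sqrt(n)) should be close to
# N(0, 1) for moderate n.  n and trial counts are made up.
random.seed(3)

n, trials = 30, 5000
z_values = []
for _ in range(trials):
    xbar = sum(random.expovariate(1.0) for _ in range(n)) / n
    z_values.append((xbar - 1.0) / (1.0 / n ** 0.5))

print(round(statistics.mean(z_values), 2))      # close to 0
print(round(statistics.stdev(z_values), 2))     # close to 1
print(sum(z < 0 for z in z_values) / trials)    # close to 0.5
```

Repeating the experiment with larger n would pull the empirical distribution of Z still closer to N(0, 1), as the theorem asserts.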

Practical use of the central limit theorem: You have a population whose mean μ and standard deviation σ you assume that you know (but whose density function f(x) you do not know). You sample the population with a sample of size n. From the sample you compute a mean value x̄. If the sample size n is sufficiently large, the central limit theorem will tell you the probability of getting the value x̄ given your assumptions on the values of μ and σ. To test your assumptions, compute the standardized sample mean z using the measured x̄ and the assumed values μ and σ. The central limit theorem states that the probability of getting the value x̄ is the same as the probability of getting the z-score z in a standard normal distribution.

Theorem (normal populations): Let X̄ be the mean of a random sample of size n taken from a population that is normally distributed having mean μ and variance σ². Then the standardized sample mean
Z = (X̄ − μ)/(σ/√n)
has the standard normal distribution function regardless of the size of n (i.e. f(x̄; n) for X̄ is the normal density with mean μ and variance σ²/n).

Practical use of this theorem: You have a population whose distribution is (assumed to be) normal and whose mean μ and standard deviation σ you assume that you know. You sample the population with a sample of size n. From the sample you compute a mean value x̄. This theorem will tell you the probability of getting the value x̄ given your assumptions on normality and the values of μ and σ. To test your assumptions, compute the standardized sample mean z using the measured x̄ and the assumed values μ and σ. This theorem states that the probability of getting the value x̄ is the same as the probability of getting the z-score z in a standard normal distribution.

e.g. 1-gallon paint cans (the population) from a particular manufacturer cover, on average, 513.3 sq. ft, with a standard deviation of 31.5 sq. ft. What is the probability that the mean area covered by a sample of 40 1-gallon cans will lie within 510.0 to 520.0 sq. ft?

Find the standardized sample means for the two limits of the range:
z_1 = (510.0 − 513.3)/(31.5/√40) = −0.66,  z_2 = (520.0 − 513.3)/(31.5/√40) = 1.34
Assuming the central limit theorem, we have from Table 3
P(510.0 < X̄ < 520.0) = P(−0.66 < Z < 1.34) = F(1.34) − F(−0.66) = 0.9099 − 0.2546 = 0.6553
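
The table lookup can be replaced by the closed-form standard normal CDF available through math.erf. A minimal Python sketch of the computation above (the helper name phi is ours, not the text's):

```python
import math

# The paint-can computation, using the standard normal CDF
# F(z) = (1 + erf(z / sqrt(2))) / 2 in place of Table 3.
def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n = 513.3, 31.5, 40
se = sigma / math.sqrt(n)          # standard error of the mean

z1 = (510.0 - mu) / se
z2 = (520.0 - mu) / se
prob = phi(z2) - phi(z1)

print(round(z1, 2), round(z2, 2))  # close to the text's -0.66 and 1.34
print(round(prob, 4))              # close to the text's 0.6553
```

The small discrepancy in the last digits comes from the text rounding z to two decimals before consulting Table 3, whereas the code keeps full precision.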

6.3 The Sampling Distribution of the Mean when σ² is unknown (the usual case)

In 6.2 we discussed aspects of the distribution of the sample mean X̄ (it has a distribution with mean μ and variance σ²/n (for continuous RVs), and the related RV, the standardized sample mean
Z = (X̄ − μ)/(σ/√n),
approaches the standard normal distribution as n → ∞). In practice σ is not known and we have to deal with the values
t = (x̄ − μ)/(s/√n)
where s is the sample standard deviation, s = √s², and s² is the sample variance
s² = Σ(x_i − x̄)²/(n − 1)
Similar to X̄, we define the random variable S², called the sample variance,
S² = Σ(X_i − X̄)²/(n − 1)
which has values s². In this section and the next, we are interested in the behavior of t and S² thought of as random variables.

Little is known about the behavior of the distribution of t when n is small unless we are sampling from a population governed by the normal distribution (a "normal population").

Theorem 6.4: If X̄ is the sample mean for a random sample of size n taken from a normal population having mean μ, then
t = (X̄ − μ)/(S/√n)
is a random variable having the t distribution with parameter v = n − 1.

Note: it is convention to use small t for the RV for the t distribution (breaking the convention to use capital letters for the RV and small letters for its values). We will use small t to stand for both the RV and its values.

The t distribution: a one-parameter family of RVs, with values defined on (−∞, ∞)
density function: f(t; v) = [Γ((v + 1)/2) / (√(vπ) Γ(v/2))] · (1 + t²/v)^(−(v+1)/2)
mean value: 0 (for v > 1), otherwise undefined
variance: v/(v − 2) (for v > 2), ∞ for 1 < v ≤ 2, otherwise undefined

The t distribution is symmetric about 0, and very close to the standard normal distribution. In fact the t distribution → the standard normal distribution as v → ∞. The t distribution has heavier tails than the standard normal distribution (i.e. there is higher probability in the tails of the t distribution). It is often referred to as Student's t distribution.

[Figure: t densities for several values of v.]

The parameter v in the t distribution is referred to as the (number of) degrees of freedom (df). Recall that the sum of the sample deviations Σ(x_i − x̄) is 0, hence only n − 1 of the deviations are independent of each other. Thus the RVs S², and by the same reasoning t, both have n − 1 degrees of freedom.

Similar to the z_α for the standard normal distribution, we define the t_α for the t distribution. Because of the symmetry of the standard normal and t distributions we have
z_{1−α} = −z_α,  t_{1−α} = −t_α
Recall that Table 3 lists values of the cumulative standard normal distribution F(z) for various values of z. In contrast, Table 4 lists values of t_α for various values of α and v. (Recall, α is the probability in the right-hand tail above t_α.) By symmetry, the probability in the left-hand tail below −t_α is also α. Note that for v → ∞, t_α = z_α. The standard normal distribution provides a good approximation to the t distribution for samples of size 30 or more.

Practical use of theorem 6.4: You have a population whose distribution is (assumed to be) normal and whose mean μ you assume that you know (but whose standard deviation you do not know). You sample the population with a sample of size n. From the sample you compute a sample mean value x̄ and the sample standard deviation s. Theorem 6.4 will tell you the probability of getting the values x̄ and s given your assumptions on normality and the value of μ. To test your assumption, compute the value t using the measured x̄ and s and the assumed value μ. Theorem 6.4 states that the probability of getting the values x̄, s is the same as the probability of getting the value t in a t distribution with v = n − 1.

e.g. a manufacturer's fuses (the population) will blow in 12.40 minutes on average when subjected to a 20% overload. A sample of 20 fuses is subjected to a 20% overload. The sample average and standard deviation were observed to be, respectively, 10.63 and 2.48 minutes. What is the probability of this observation given the manufacturer's claim?

t = (10.63 − 12.40)/(2.48/√20) = −3.19,  v = 20 − 1 = 19

From Table 4, for v = 19, we see that a t value of 2.861 already has only 0.5% probability (α = 0.005) of being exceeded. Consequently there is less than a 0.5% probability that a t value smaller than −2.861 will occur. Since the t value obtained in our sample of 20 is −3.19, we conclude that there is less than 0.5% probability of getting this result. We therefore suspect that the manufacturer's claim is incorrect, and that the manufacturer's fuses will blow in less than 12.40 minutes on average when subjected to a 20% overload.

If the population is not normal, studies have shown that the distribution of (X̄ − μ)/(S/√n) is fairly close to that of the t distribution as long as the population distribution is relatively bell-shaped and not too skewed. This can be checked using a normal scores plot on the population.
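
The arithmetic of this example is easy to check in Python, reading the example's figures as a claimed mean of 12.40 minutes and a sample of n = 20 with x̄ = 10.63 and s = 2.48 (these reproduce the quoted t of −3.19). The tail probability itself still comes from Table 4, since the standard library has no t-distribution CDF:

```python
import math

# The fuse example: computing the t statistic for testing the
# manufacturer's claimed mean against the observed sample.
mu0 = 12.40           # claimed mean blow time at 20% overload (minutes)
xbar, s, n = 10.63, 2.48, 20

t = (xbar - mu0) / (s / math.sqrt(n))
v = n - 1

print(round(t, 2), v)   # t is about -3.19 with 19 degrees of freedom
# Table 4: t_{0.005} = 2.861 for v = 19, so t < -2.861 means
# the tail probability of this observation is below 0.005.
print(t < -2.861)
```
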

6.4 The Distribution of the Sample Variance S²

Theorem 6.5: Consider a random sample of size n taken from a normal population having variance σ². Then the RV
χ² = (n − 1)S²/σ² = Σ(X_i − X̄)²/σ²
has the chi-square distribution with parameter v = n − 1.

The chi-square distribution: a one-parameter family of RVs, with values defined on (0, ∞)
density function: f(x; v) = [1 / (2^(v/2) Γ(v/2))] · x^(v/2 − 1) e^(−x/2)
mean value: v
variance: 2v

The chi-square distribution is just the gamma distribution with α = v/2, β = 2. Again, the parameter v is referred to as the (number of) degrees of freedom (df). We define the χ²_α notation similar to that of z_α and t_α. Just as for Table 4, Table 5 lists values of χ²_α for various values of α and v.

[Figure: chi-square densities for several values of v.]

e.g. (the population) glass blanks from an optical firm, suitable for grinding into lenses. The variance of the refractive index of the glass is 1.26×10⁻⁴. A random sample of size 20 is selected from any shipment, and if the variance of the refractive index of the sample exceeds 2×10⁻⁴, the sample is rejected. What is the probability of rejection assuming the underlying population is normal?

For the measured sample of 20:
χ² = (20 − 1)·(2×10⁻⁴)/(1.26×10⁻⁴) = 30.2
From Table 5, for v = 19, 30.2 corresponds to a value α = 0.05. There is therefore a 5% probability of rejecting a shipment.
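
The statistic itself is one line of arithmetic; a Python sketch, reading the example's variance figures as 1.26×10⁻⁴ (population) and 2×10⁻⁴ (rejection threshold), which reproduce the quoted value of 30.2:

```python
# The glass-blank example: the chi-square statistic comparing a
# sample variance against an assumed population variance.
sigma2 = 1.26e-4   # assumed population variance of refractive index
s2 = 2.0e-4        # sample variance at the rejection threshold
n = 20

chi2 = (n - 1) * s2 / sigma2
print(round(chi2, 1))   # about 30.2; Table 5 gives alpha = 0.05 at v = 19
```
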

Practical use of theorem 6.5: You have a population whose distribution is (assumed to be) normal and whose variance σ² you assume that you know. You sample the population with a sample of size n. From the sample you compute a sample variance s². Theorem 6.5 will tell you the probability of getting the value s² given your assumptions on normality and the value of σ². To test your assumption, compute the chi-square value χ² using the measured s² and the assumed value σ². Theorem 6.5 states that the probability of getting the value s² is the same as the probability of getting the value χ² in a chi-square distribution with v = n − 1.

Recap: Each sample (sample 1, …, sample j, …) consists of outcomes y_1, y_2, … drawn from a sample space (N outcomes if finite), e.g. n throws each of k dice, yielding sample values x_1, x_2, …, x_n for the RVs, e.g. the k-dice sums. Think of each x_i value as resulting from a RV X_i such that
1. each X_i has the same density f(x), mean μ, and variance σ²
2. the X_i are independent (a random sample)
The population of outcomes in the sample space generates values for the RVs.

Each sample generates a sample mean x̄ and a sample variance
s² = (1/(n − 1)) Σ(x_i − x̄)²
Think of the sample means and variances as values for the RVs X̄ and S². What are F_X̄, E(X̄), Var(X̄), F_S², E(S²), Var(S²)?

Chapter 5 states:
E(X̄) = μ, Var(X̄) = σ²/n for an infinite population
E(X̄) = μ, Var(X̄) = (σ²/n)·(N − n)/(N − 1) for a finite population

Chapter 6 addresses the questions on F_X̄ and F_S².

Law of large numbers (for a single sample, and a single value of X̄):
P(|X̄ − μ| > ε) < σ²/(nε²)

Central limit theorem:
Z = (X̄ − μ)/(σ/√n)
is a RV whose distribution F_Z → the standard normal N(0, 1) as n → ∞ (i.e. X̄ is a RV whose distribution F_X̄ → N(μ, σ²/n) as n → ∞).

If the X_i are normally distributed with mean μ and variance σ²:
Z = (X̄ − μ)/(σ/√n)
is a RV whose distribution F_Z = N(0, 1) for all n, i.e. X̄ is a RV whose distribution F_X̄ = N(μ, σ²/n) for all n.

If the X_i are normally distributed with mean μ:
t = (X̄ − μ)/(S/√n)
is a RV whose distribution F_t is the t distribution with df v = n − 1.

If the X_i are normally distributed with variance σ²:
χ² = (n − 1)S²/σ² = Σ(X_i − X̄)²/σ²
is a RV whose distribution F_χ² is the chi-square distribution with df v = n − 1.

Assume we have two populations. We may wish to inquire whether they have the same variance. Assume S₁² and S₂² are measured sample variances for each population.

Theorem 6.6: If S₁² and S₂² are measured sample variances of independent random samples of respective sizes n₁ and n₂ taken from two normal populations having the same variance, then
F = S₁²/S₂²
is a RV having the F distribution with parameters v₁ = n₁ − 1 and v₂ = n₂ − 1.

The F distribution: a two-parameter family of RVs, with values defined on (0, ∞)
density function: f(x; v₁, v₂) = [1/B(v₁/2, v₂/2)] · (v₁/v₂)^(v₁/2) · x^(v₁/2 − 1) · (1 + (v₁/v₂)x)^(−(v₁+v₂)/2)
mean value: v₂/(v₂ − 2) for v₂ > 2
variance: 2v₂²(v₁ + v₂ − 2) / [v₁(v₂ − 2)²(v₂ − 4)] for v₂ > 4

The F distribution is similar to the beta distribution; B(x, y) = ∫₀¹ t^(x−1)(1 − t)^(y−1) dt is the beta function.

[Figure: F densities for several values of v₁ and v₂.]

The parameter v₁ is referred to as the numerator degrees of freedom (df of the numerator). The parameter v₂ is referred to as the denominator degrees of freedom (df of the denominator). As with z_α, t_α, etc., we define F_α. Values of F_α are given in Table 6 for various values of v₁ and v₂, for α = 0.05 (Table 6(a)) and α = 0.01 (Table 6(b)).

Practical use of theorem 6.6: You have two populations whose distributions are (assumed to be) normal and whose variances you assume to be equal. You sample population 1 with a sample of size n₁ and population 2 with a sample of size n₂. From each sample you compute sample variances s₁² and s₂². Theorem 6.6 will tell you the probability of getting the ratio s₁²/s₂² given your assumptions on normality and equality of variance. To test your assumptions, compute the value F. Theorem 6.6 states that the probability of getting the ratio s₁²/s₂² is the same as the probability of getting the value F in an F distribution with v₁ = n₁ − 1, v₂ = n₂ − 1.

e.g. Two random samples of sizes n₁ = 7 and n₂ = 13 are taken from the same normal population. What is the probability that the variance of the first sample will be at least 3 times that of the second?

For v₁ = 6 and v₂ = 12, Table 6(a) shows an F value of 3.00 for α = 0.05. Therefore there is a 5% probability that the variance of the first sample will be at least 3 times that of the second.
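
A Monte Carlo sketch of this example (a made-up simulation, not part of the original notes): draw both samples from the same normal population many times and count how often S₁² ≥ 3 S₂². The estimate should land near the 0.05 read from Table 6(a):

```python
import random

# Monte Carlo check of the F-distribution example: samples of sizes
# n1 = 7 and n2 = 13 from the same normal population; estimate
# P(S1^2 >= 3 * S2^2), which Table 6(a) puts at about 0.05.
random.seed(4)

def sample_variance(xs):
    """Sample variance with the n - 1 divisor."""
    xbar = sum(xs) / len(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

n1, n2, trials = 7, 13, 20000
hits = 0
for _ in range(trials):
    s1 = sample_variance([random.gauss(0.0, 1.0) for _ in range(n1)])
    s2 = sample_variance([random.gauss(0.0, 1.0) for _ in range(n2)])
    if s1 >= 3.0 * s2:
        hits += 1

print(hits / trials)   # close to 0.05
```
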

6.5 Representations of normal distributions

Defining new random variables in terms of others is referred to as a representation.

chi-square: Let Z_1, Z_2, …, Z_v be independent standard normal RVs. Define the RV
χ²_v = Σ_{i=1}^{v} Z_i²
Then χ²_v has a chi-square distribution with v df. Thus we also see that the square of a standard normal RV is a chi-square RV.

Let
χ²₁ = Σ_{i=1}^{v₁} Z_i²  and  χ²₂ = Σ_{i=v₁+1}^{v₁+v₂} Z_i²
where the Z_i are independent standard normal RVs (and thus χ²₁ and χ²₂ are independent of each other). Then χ²₁ + χ²₂ has a chi-square distribution with v₁ + v₂ df. Thus we see that the sum of two independent chi-square RVs is also a chi-square RV with the sum of the individual df.
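
This representation is easy to sketch by simulation (a made-up illustration with v = 5): summing the squares of v independent standard normals should give a variable with the chi-square moments, mean v and variance 2v:

```python
import random
import statistics

# Simulation of the chi-square representation: the sum of the squares
# of v independent standard normals behaves like a chi-square RV with
# v degrees of freedom (mean v, variance 2v).  v is made up here.
random.seed(5)

v, trials = 5, 20000
chi2_values = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(v))
               for _ in range(trials)]

print(round(statistics.mean(chi2_values), 1))      # close to v = 5
print(round(statistics.variance(chi2_values), 1))  # close to 2v = 10
```
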

t distribution: Let Z be a standard normal RV and χ² be a chi-square RV with v df. Assume Z and χ² are independent. Then
t = Z / √(χ²/v)
has a t distribution with v df.

F distribution: Let χ²₁ and χ²₂ be chi-square RVs with df v₁ and v₂ respectively. Assume χ²₁ and χ²₂ are independent. Then
F_{v₁,v₂} = (χ²₁/v₁) / (χ²₂/v₂)
has an F distribution with v₁, v₂ df. Thus we see that
t² = Z² / (χ²/v)
is a RV with an F_{1,v} distribution.

e.g. Let X 1, X,, X be idepedet ormal RVs all havig mea μ ad stadard deviatio σ. The Z i = X i μ σ is a stadard ormal RV for each i. The Z 1 is also a stadard ormal RV. Cosider i.e. Z i Z Z i = Z i = 1 Z Z i Z i = Z i Z X i μ σ = X μ σ/ + Z = Z i + Z Z Note that the LHS is chi square distributio with df. The last term o the RHS is chi square with 1 df. This implies that the first term o the RHS is chi-square with 1df. Thus we see that ( 1)S X i X σ = σ = Z i Z has a chi square distributio with 1df (as claimed i Theorem 6.5)

Let X_i be N(μ_i, σ_i²) for i = 1, 2, …, n, independent normal RVs. Then
X = Σ X_i
is normal with E(X) = Σ μ_i and Var(X) = Σ σ_i². A sum of independent normal RVs is a normal RV.

Let X_i be a chi-square RV with df = v_i for i = 1, 2, …, n; assume the X_i are independent. Then
X = Σ X_i
is a chi-square RV with df v = Σ v_i. A sum of independent chi-square RVs is chi-square.

Let X_i be a Poisson RV with parameter λ_i for i = 1, 2, …, n; assume the X_i are independent. Then
X = Σ X_i
is a Poisson RV with parameter λ = Σ λ_i. A sum of independent Poisson RVs is Poisson.
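
These closure properties are easy to check by simulation; a Python sketch for the Poisson case (the rates 1.5 and 2.5 are made up, and the Poisson sampler counts exponential inter-arrival times within a unit interval, a standard construction). The sum should again be Poisson, so its mean and variance should both equal λ = 1.5 + 2.5 = 4.0:

```python
import random
import statistics

# Simulation of the Poisson closure property: the sum of independent
# Poisson(1.5) and Poisson(2.5) RVs is Poisson(4.0), so the sum's
# sample mean and sample variance should both be close to 4.0.
random.seed(7)

def poisson(lam):
    """Draw one Poisson(lam) variate by counting exponential arrivals
    with rate lam that fall inside the unit time interval."""
    total, count = 0.0, 0
    while True:
        total += random.expovariate(lam)
        if total > 1.0:
            return count
        count += 1

sums = [poisson(1.5) + poisson(2.5) for _ in range(20000)]
print(round(statistics.mean(sums), 1))      # close to 4.0
print(round(statistics.variance(sums), 1))  # close to 4.0
```

That the simulated mean and variance agree is itself a Poisson signature, since a Poisson RV has variance equal to its mean.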