# Statisticians use the word population to refer to the total number of (potential) observations under consideration


## Transcription

### 6 Sampling Distributions

Statisticians use the word *population* to refer to the total number of (potential) observations under consideration. The population is just the set of all possible outcomes in our sample space (chapter 3). Therefore, a population may be finite (e.g. the number of households in the US) or (effectively) infinite (e.g. the number of stars in the universe).

e.g.

- Question: average number of TV sets per household in the US. Population: number of TV sets in each household in the US.
- Question: average number of TV sets per household in North America. Population: number of TV sets in each household in Canada, the US and Mexico.
- Question: probability that a star has planets. Population: number of planets per star for all stars (past, present, future) in all galaxies in the universe.

In answering questions (e.g. what is the mean, what is the variance, what is the probability) for a given population, one seldom answers the questions using the entire population. In practice the questions are answered from a subset (a *sample*) of the population.

It is important to choose the sample in a way that does not bias the answers. This is the subject of an area of statistics referred to as experimental design (how to design the sample such that you adequately reflect the entire population).

- e.g. In determining the probability of getting a pair in a poker hand, you would not sample only poker hands that contained two pairs. (Technically this would be an attempt to determine the probability $P(\text{pair})$ for the entire population by approximating it by a conditional probability $P(\text{pair} \mid \text{two pair})$.)
- e.g. To determine the average length of logs moving on a conveyor belt at constant speed, one might decide to measure only the logs that pass a certain point on the conveyor belt every 10 minutes. Upon reflection, you realize that longer logs have a greater probability of being at the measuring point at the selected times, thus the sample would give a biased average length measure that would be too large.
- e.g. To determine the expected lifetime of a tire, you only test it on smooth, paved roads?
- e.g. To determine fuel rating on cars, the EPA presumes that every car is driven 55 percent of the time in the city and 45 percent of the time on the highway!?

One way to ensure unbiased sampling is to ensure your subset is a *random sample*.

Suppose our sample is to consist of $n$ observations, $x_1, x_2, \dots, x_n$. We have to select the first observation $x_1$, the second $x_2$, etc. We think of the procedure for picking $x_k$ as selecting a value for a random variable $X_k$; that is, we think of picking values $x_1, x_2, \dots, x_n$ for our sample as the process of picking values for random variables $X_1, X_2, \dots, X_n$. Using this thinking, we can define a random sample as follows:

- Finite population: A set of observations $X_1, X_2, \dots, X_n$ constitutes a random sample of size $n$ from a finite population of size $N$ if values for the set are chosen so that each subset of $n$ of the $N$ elements of the population has the same probability of being selected.
- Infinite population: A set of observations $X_1, X_2, \dots, X_n$ constitutes a random sample of size $n$ from the infinite population described by distribution (discrete) or density (continuous) $f(x)$ if
  1. each $X_i$ is a RV whose distribution/density is given by $f(x)$;
  2. the RVs are independent.

The phrase *random sample* is applied both to the RVs $X_1, X_2, \dots, X_n$ and to their values $x_1, x_2, \dots, x_n$.

How to achieve a random sample?

e.g. The population is finite (and relatively small): label each element of the population $1, 2, \dots, N$. Draw numbers sequentially, in groups of $n$, from a random digits table.
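As a rough illustration of the labelling procedure above, the following Python sketch replaces the random digits table with a pseudo-random generator. The population size of 50, the sample size of 5, and the function name `simple_random_sample` are assumptions chosen for illustration, not part of the original notes.

```python
import random

def simple_random_sample(labels, n, seed=None):
    """Draw n distinct labels so that every size-n subset is equally likely."""
    rng = random.Random(seed)          # seeded generator stands in for a random digits table
    return rng.sample(labels, n)       # sampling without replacement

labels = list(range(1, 51))            # hypothetical population labelled 1..N with N = 50
sample = simple_random_sample(labels, n=5, seed=0)
print(sample)
```

Because `random.sample` draws without replacement, each subset of size $n$ has the same chance of selection, which matches the finite-population definition of a random sample.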

When the population size is large or infinite, this process can become practically impossible, and careful thought must be given to an (at least approximately) random sampling design.

- e.g. Areal sampling using a regular grid works if the underlying population (e.g. chemical contaminant concentration) is relatively homogeneous. It doesn't work if the underlying population is spatially concentrated.
- e.g. Replicate sampling in anomalous areas.

### 6.2 Sampling Distribution of the Mean

For each sample $x_1, x_2, \dots, x_n$ of $n$ observations, we can compute a mean $\bar{x}$. The mean value will vary with each of our samples. Thus we can think of the sample mean (mean value for each sample) as a random variable $\bar{X}$ obeying some distribution function $f(\bar{x}; n)$. The distribution $f(\bar{x}; n)$ is referred to as the *theoretical sampling distribution*.

We put aside for the moment the question of the form of $f(\bar{x}; n)$ and note that, in chapter 5.10, we have already computed the mean and variance for $f(\bar{x}; n)$ in the case of continuous RVs.

Theorem 6.1: If a random sample $X_1, X_2, \dots, X_n$ of size $n$ is taken from a population having mean $\mu$ and variance $\sigma^2$, then $\bar{X}$ is a RV whose distribution $f(\bar{x}; n)$ has:

- infinite population: mean value $E(\bar{X}) = \mu$ and variance $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$
- finite population (size $N$): mean value $E(\bar{X}) = \mu$ and variance $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n} \cdot \dfrac{N-n}{N-1}$

Note: The appearance of the term $\frac{N-n}{N-1}$ in the variance of $\bar{X}$ for the finite population case is unexpected based upon the calculation in 5.10. The calculations in 5.10, when applied to a finite population, assume that $n$ is negligible compared with $N$. This correction factor, called the *finite population correction (fpc) factor*, is included to account for cases in which $n$ is not small relative to $N$.

Note that the fpc factor equals 0 for $n = N$ (i.e. $\mathrm{Var}(\bar{X}) = 0$ when $n = N$). This implies that, when one sample is taken using the entire population, $\bar{X}$ exactly measures the population mean with no error (variance).
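The finite-population part of Theorem 6.1 can be checked by brute force on a tiny population. The sketch below is only an illustration: the five-element population and the helper names (`fpc`, `var_of_sample_mean`, `exhaustive_var_of_mean`) are assumptions, and $\sigma^2$ is taken to be the population variance (dividing by $N$).

```python
from itertools import combinations

def fpc(N, n):
    """Finite population correction factor (N - n) / (N - 1)."""
    return (N - n) / (N - 1)

def var_of_sample_mean(sigma2, n, N=None):
    """Var(X-bar): sigma^2 / n for an infinite population, times the fpc for a finite one."""
    base = sigma2 / n
    return base if N is None else base * fpc(N, n)

def exhaustive_var_of_mean(population, n):
    """Variance of the sample mean over all equally likely size-n subsets."""
    means = [sum(c) / n for c in combinations(population, n)]
    centre = sum(means) / len(means)
    return sum((m - centre) ** 2 for m in means) / len(means)

population = [1, 2, 3, 4, 5]                          # hypothetical toy population, N = 5
N = len(population)
mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N   # population variance (divide by N)

theory = var_of_sample_mean(sigma2, n=2, N=N)         # formula from Theorem 6.1
brute = exhaustive_var_of_mean(population, n=2)       # direct enumeration of all samples
print(theory, brute)
```

Enumerating all ten samples of size 2 gives the same variance as the formula, and `fpc(N, N)` is zero, in line with the remark that sampling the whole population leaves no error in $\bar{X}$.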

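e.g. For $N = 1{,}000$ and $n = 10$, the fpc is

$$\text{fpc} = \frac{1000 - 10}{1000 - 1} = 0.991$$

Note that the results in Theorem 6.1 are independent of what $f(\bar{x}; n)$ may actually be!

Apply Chebyshev's theorem to the RV $\bar{X}$, whose standard deviation is $\sigma/\sqrt{n}$:

$$P\left(|\bar{X} - \mu| > k\,\frac{\sigma}{\sqrt{n}}\right) < \frac{1}{k^2}$$

Let $\varepsilon = k\sigma/\sqrt{n}$, i.e. $k = \varepsilon\sqrt{n}/\sigma$, giving

$$P\left(|\bar{X} - \mu| > \varepsilon\right) < \frac{(\sigma/\sqrt{n})^2}{\varepsilon^2} = \frac{\sigma^2}{n\varepsilon^2}$$

Therefore, for any (arbitrarily small but) non-zero value of $\varepsilon$, the probability that $\bar{X}$ differs from $\mu$ by more than $\varepsilon$ can be made arbitrarily small by making $n$ large enough. (We need $n \gg \sigma^2/\varepsilon^2$, which means $n$ must get very large as $\varepsilon$ gets small.) This observation is known as the *law of large numbers*: if you make the sample size large enough, a single sample is sufficient to give a value for $\bar{x}$ arbitrarily close to the population mean.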

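Theorem 6.2: Let $X_1, X_2, \dots, X_n$ be a random sample, each having the same mean value $\mu$ and variance $\sigma^2$. Then for any $\varepsilon > 0$

$$P\left(|\bar{X} - \mu| > \varepsilon\right) \to 0 \quad \text{as } n \to \infty$$

As the sample size gets large, the probability that the average from a single random sample differs from the true mean goes to zero. Again, this result on $\bar{X}$ is independent of what $f(\bar{x}; n)$ may actually be.

e.g. In an experiment, event $A$ occurs with probability $p$. Repeat the experiment $n$ times and compute

$$\text{relative frequency of occurrence of } A = \frac{\text{number of times } A \text{ occurs in } n \text{ trials}}{n}$$

Show that the relative frequency of $A \to p$ as $n \to \infty$.

Consider each trial as an independent RV, $X_1, X_2, \dots, X_n$. Each $X_i$ takes on two values, $x_i = 0, 1$, depending on whether $A$ does not or does occur in experiment $i$. $X_i$ has mean value

$$E(X_i) = 0 \cdot (1-p) + 1 \cdot p = p$$

and variance

$$\mathrm{Var}(X_i) = E(X_i^2) - E(X_i)^2 = 0^2 \cdot (1-p) + 1^2 \cdot p - p^2 = p(1-p)$$

Then $X_1 + X_2 + \dots + X_n$ records the number of times $A$ occurs in $n$ trials, and

$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n}$$

is in fact the relative frequency of occurrence of $A$. From Theorem 6.2 we have

$$P\left(|\bar{X} - p| > \varepsilon\right) < \frac{p(1-p)}{n\varepsilon^2} \to 0 \quad \text{for any } p \in [0,1] \text{ as } n \to \infty$$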
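The relative-frequency example can be explored with a small simulation. This is a sketch under assumed values ($p = 0.3$, $\varepsilon = 0.01$, a fixed seed); the function names are hypothetical, and the bound shown is the Chebyshev bound $p(1-p)/(n\varepsilon^2)$, capped at 1 because a probability cannot exceed 1.

```python
import random

def relative_frequency(p, n, rng):
    """Relative frequency of event A in n independent trials, where P(A) = p."""
    return sum(1 for _ in range(n) if rng.random() < p) / n

def chebyshev_bound(p, n, eps):
    """Upper bound p(1-p)/(n eps^2) on P(|X-bar - p| > eps), capped at 1."""
    return min(1.0, p * (1 - p) / (n * eps ** 2))

p, eps = 0.3, 0.01                      # hypothetical values for illustration
rng = random.Random(42)                 # fixed seed so the run is repeatable
results = {n: relative_frequency(p, n, rng) for n in (100, 10_000, 100_000)}
for n, freq in results.items():
    print(n, round(freq, 4), chebyshev_bound(p, n, eps))
```

As $n$ grows the bound shrinks like $1/n$, while the simulated relative frequency settles near $p$; the bound is typically very conservative compared with what the simulation shows.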

$\sqrt{\mathrm{Var}(\bar{X})} \equiv \sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$ is referred to as the *standard error of the mean*. To reduce the standard error by a factor of two, it is necessary to increase $n$ by a factor of 4. Thus (unfortunately) increasing the sample size decreases the standard error at a relatively slow rate (e.g. if $n$ goes from 25 to 2,500, a factor of 100, the standard error decreases only by a factor of 10).

While the results in Theorems 6.1 and 6.2 are independent of the form of the theoretical sampling distribution/density $f(\bar{x}; n)$, the actual form of $f(\bar{x}; n)$ depends on knowing the probability distribution which governs the population. In general it can be very difficult to compute the form of $f(\bar{x}; n)$. Two results are known, both presented as theorems.

Theorem 6.3 (central limit theorem): Let $\bar{X}$ be the mean of a random sample of size $n$ taken from a population having mean $\mu$ and variance $\sigma^2$. Then the associated RV, the *standardized sample mean*

$$Z \equiv \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

is a RV whose distribution function approaches the standard normal distribution as $n \to \infty$.

The central limit theorem says that, as $n \to \infty$, the theoretical sampling distribution $f(\bar{x}; n) \to$ a normal distribution (i.e. $\bar{X}$ is normally distributed) with mean $\mu$ and variance $\sigma^2/n$.

[Figure: The distribution $f(\bar{x}; n)$ of $\bar{X}$ for samples of size $n$ for a population with exponential distribution]

[Figure: The distribution $f(\bar{x}; n)$ of $\bar{X}$ for samples of size $n$ for a population with uniform distribution]

In practice, the distribution of $\bar{X}$ is well approximated by a normal distribution for $n$ as small as 25 to 30.
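A quick way to see the central limit theorem at work is to standardize sample means drawn from a clearly non-normal population and compare with a standard normal benchmark. The sketch below assumes an exponential population with rate 1 (so $\mu = \sigma = 1$), samples of size 30, and a fixed seed; these choices and the function names are illustrative assumptions.

```python
import random
from math import sqrt

def standardized_mean(sample, mu, sigma):
    """Z = (x-bar - mu) / (sigma / sqrt(n)) for one sample."""
    n = len(sample)
    xbar = sum(sample) / n
    return (xbar - mu) / (sigma / sqrt(n))

def fraction_within(z_limit, n, trials, rng):
    """Fraction of standardized means with |Z| < z_limit for exponential(1) samples."""
    hits = 0
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]   # mean 1, std dev 1
        if abs(standardized_mean(sample, mu=1.0, sigma=1.0)) < z_limit:
            hits += 1
    return hits / trials

rng = random.Random(1)                  # fixed seed for repeatability
frac = fraction_within(1.96, n=30, trials=5000, rng=rng)
print(frac)                             # compare with 0.95 for a standard normal
```

For a standard normal RV about 95% of the probability lies within $\pm 1.96$, so a simulated fraction close to 0.95 is consistent with the normal approximation already being reasonable at $n = 30$.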

Practical use of the central limit theorem: You have a population whose mean $\mu$ and standard deviation $\sigma$ you assume that you know (but whose density function $f(x)$ you do not know). You sample the population with a sample of size $n$. From the sample you compute a mean value $\bar{x}$. If the sample size is sufficiently large, the central limit theorem will tell you the probability of getting the value $\bar{x}$ given your assumptions on the values of $\mu$ and $\sigma$.

To test your assumption, compute the standardized sample mean $z$ using the measured $\bar{x}$ and the assumed values of $\mu$ and $\sigma$. The central limit theorem states that the probability of getting the value $\bar{x}$ is the same as the probability of getting the z-score $z$ in a standard normal distribution.

e.g. 1-gallon paint cans (the population) from a particular manufacturer cover, on average, […] sq. ft, with a standard deviation of 31.5 sq. ft. What is the probability that the mean area covered by a sample of 40 1-gallon cans will lie within […] to 520.0 sq. ft?

Find the standardized sample means $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$, with $\sigma/\sqrt{n} = 31.5/\sqrt{40}$, for the two limits of the range:

$$z_1 = -0.66, \qquad z_2 = 1.34$$

Assuming the central limit theorem, we have from Table 3

$$P\left([\dots] < \bar{X} < 520.0\right) = P(-0.66 < Z < 1.34) = F(1.34) - F(-0.66) = 0.9099 - 0.2546 = 0.6553$$
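The table lookup in the paint-can example can be reproduced with Python's standard library. This sketch uses only quantities stated in the example ($\sigma = 31.5$, sample size 40, and the two z-scores); the lower z-score is taken as $-0.66$, on the assumption that the lower limit lies below the mean. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def standard_error(sigma, n):
    """Standard error of the mean, sigma / sqrt(n)."""
    return sigma / sqrt(n)

def prob_between(z_lo, z_hi):
    """P(z_lo < Z < z_hi) for a standard normal Z."""
    std_normal = NormalDist()           # mean 0, standard deviation 1
    return std_normal.cdf(z_hi) - std_normal.cdf(z_lo)

se = standard_error(31.5, 40)           # values taken from the example
p = prob_between(-0.66, 1.34)           # z-scores from the example (lower one assumed negative)
print(round(se, 3), round(p, 4))
```

The computed probability agrees with the tabulated difference $F(1.34) - F(-0.66)$ to about four decimal places; small differences come from rounding in printed tables.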

### 6.3 The Sampling Distribution of the Mean when σ is unknown (usual case)

In 6.2 we discussed aspects of the distribution of the sample mean $\bar{X}$: it has a distribution with mean $\mu$ and variance $\sigma^2/n$ (for continuous RVs), and the related RV

$$Z \equiv \frac{\bar{X} - \mu}{\sigma/\sqrt{n}},$$

the standardized sample mean, approaches the standard normal distribution as $n \to \infty$.

In practice $\sigma$ is not known and we have to deal with the values

$$t \equiv \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

where $s$ is the sample standard deviation, $s = \sqrt{s^2}$, and $s^2$ is the sample variance

$$s^2 \equiv \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$

Similar to $\bar{X}$, we define the random variable $S^2$, called the sample variance,

$$S^2 \equiv \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$$

which has values $s^2$. In this section and the next, we are interested in the behavior of $t$ and $S^2$ thought of as random variables.

Little is known about the behavior of the distribution of $t$ when $n$ is small unless we are sampling from a population governed by the normal distribution (a "normal population").

Theorem 6.4: If $\bar{X}$ is the sample mean for a random sample of size $n$ taken from a normal population having mean $\mu$, then

$$t \equiv \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

is a random variable having the t distribution with parameter $\nu = n - 1$.

Note: it is convention to use small $t$ for the RV of the t distribution (breaking the convention of using capital letters for a RV and small letters for its values). We will use small $t$ to stand for both the RV and its values.

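The t distribution: a one-parameter family of RVs, with

- values defined on $(-\infty, \infty)$
- density function $f(t; \nu) = \dfrac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)} \left(1 + \dfrac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}$
- mean value 0 (for $\nu > 1$), otherwise undefined
- variance $\dfrac{\nu}{\nu - 2}$ (for $\nu > 2$), $\infty$ for $1 < \nu \le 2$, otherwise undefined

The t distribution is symmetric about 0, and very close to the standard normal distribution. In fact the t distribution $\to$ the standard normal distribution as $\nu \to \infty$. The t distribution has heavier tails than the standard normal distribution (i.e. there is higher probability in the tails of the t distribution). It is often referred to as Student's t distribution.

[Figure: t distribution densities for several values of $\nu$]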

The parameter $\nu$ in the t distribution is referred to as the (number of) *degrees of freedom* (df). Recall that the sum of the sample deviations $\sum (x_i - \bar{x})$ is 0, hence only $n - 1$ of the deviations are independent of each other. Thus the RVs $S^2$ and, by the same reasoning, $t$ both have $n - 1$ degrees of freedom.

Similar to the $z_\alpha$ for the standard normal distribution, we define the $t_\alpha$ for the t distribution. Because of the symmetry of the standard normal and t distributions we have

$$z_{1-\alpha} = -z_\alpha, \qquad t_{1-\alpha} = -t_\alpha$$

Recall that Table 3 lists values of the cumulative standard normal distribution $F(z)$ for various values of $z$. In contrast, Table 4 lists values of $t_\alpha$ for various values of $\alpha$ and $\nu$. (Recall, $\alpha$ is the probability in the right-hand tail above $t_\alpha$.) By symmetry, the probability in the left-hand tail below $-t_\alpha$ is also $\alpha$.

Note that for $\nu \to \infty$, $t_\alpha = z_\alpha$. The standard normal distribution provides a good approximation to the t distribution for samples of size 30 or more.

e.g. A manufacturer's fuses (the population) will blow in 12.40 minutes on average when subjected to a 20% overload. A sample of 20 fuses is subjected to a 20% overload. The sample average and standard deviation were observed to be, respectively, […] and 2.48 minutes. What is the probability of this observation given the manufacturer's claim?

$$t = \frac{[\dots] - 12.40}{2.48/\sqrt{20}} = -3.19, \qquad \nu = 20 - 1 = 19$$

From Table 4, for $\nu = 19$, we see that a t value of 2.861 already has only 0.5% probability ($\alpha = 0.005$) of being exceeded. Consequently there is less than a 0.5% probability that a t value smaller than $-2.861$ will occur. Since the t value obtained in our sample of 20 is $-3.19$, we conclude that there is less than 0.5% probability of getting this result. We therefore suspect that the manufacturer's claim is incorrect, and that the manufacturer's fuses will blow in less than 12.40 minutes on average when subjected to a 20% overload.

If the population is not normal, studies have shown that the distribution of $\dfrac{\bar{X} - \mu}{S/\sqrt{n}}$ is fairly close to that of the t distribution as long as the population distribution is relatively bell-shaped and not too skewed. This can be checked using a normal scores plot on the population.
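The t statistic in the fuse example can be recomputed, and its tail probability estimated by simulation. This is a sketch under stated assumptions: the claimed mean is read as 12.40 minutes, the sample standard deviation as 2.48 minutes, and the sample size as 20; the sample mean itself is not legible in the example, so the code back-solves the value implied by $t = -3.19$ rather than asserting it. The function names are hypothetical, and the Monte Carlo step (normal samples, fixed seed) stands in for the Table 4 lookup.

```python
import random
from math import sqrt

def t_statistic(xbar, mu, s, n):
    """t = (x-bar - mu) / (s / sqrt(n))."""
    return (xbar - mu) / (s / sqrt(n))

def implied_sample_mean(t, mu, s, n):
    """Sample mean that would produce the given t value."""
    return mu + t * s / sqrt(n)

def mc_lower_tail(t_obs, n, trials, rng):
    """Monte Carlo estimate of P(t <= t_obs) for samples of size n from a normal population."""
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(xs) / n
        s = sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
        if xbar / (s / sqrt(n)) <= t_obs:      # population mean is 0 in the simulation
            hits += 1
    return hits / trials

mu, s, n = 12.40, 2.48, 20                     # assumed readings of the example's values
xbar = implied_sample_mean(-3.19, mu, s, n)    # back-solved, not taken from the source
t_obs = t_statistic(xbar, mu, s, n)
tail = mc_lower_tail(t_obs, n, trials=20000, rng=random.Random(7))
print(round(xbar, 2), round(t_obs, 2), tail)
```

The simulated tail probability falls below 0.005, which is consistent with the conclusion drawn from Table 4 that a t value this small has less than a 0.5% chance of occurring if the claim is true.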

### 6.4 The Distribution of the Sample Variance $S^2$

Theorem 6.5: Consider a random sample of size $n$ taken from a normal population having variance $\sigma^2$. Then the RV

$$\chi^2 \equiv \frac{(n-1)S^2}{\sigma^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma^2}$$

has the chi-square distribution with parameter $\nu = n - 1$.

The chi-square distribution: a one-parameter family of RVs, with

- values defined on $(0, \infty)$
- density function $f(x; \nu) = \dfrac{1}{2^{\nu/2}\,\Gamma\left(\frac{\nu}{2}\right)}\, x^{\frac{\nu}{2} - 1} e^{-x/2}$
- mean value $\nu$
- variance $2\nu$

The chi-square distribution is just the gamma distribution with $\alpha = \nu/2$, $\beta = 2$. Again, the parameter $\nu$ is referred to as the (number of) degrees of freedom (df).

We define the $\chi^2_\alpha$ notation similarly to that of $z_\alpha$ and $t_\alpha$. Just as for Table 4, Table 5 lists values of $\chi^2_\alpha$ for various values of $\alpha$ and $\nu$.


e.g. (The population) glass blanks from an optical firm suitable for grinding into lenses. The variance of the refractive index of the glass is […]. A random sample of size 20 is selected from any shipment, and if the variance of the refractive index of the sample exceeds […] $\times 10^{-4}$, the shipment is rejected. What is the probability of rejection assuming the underlying population is normal?

For the measured sample,

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = 30.2$$

From Table 5, for $\nu = 19$, $\chi^2 = 30.2$ corresponds to a value $\alpha = 0.05$. There is therefore a 5% probability of rejecting a shipment.

Practical use of Theorem 6.5: You have a population whose distribution is (assumed to be) normal and whose variance $\sigma^2$ you assume that you know. You sample the population with a sample of size $n$. From the sample you compute a sample variance $s^2$. Theorem 6.5 will tell you the probability of getting the value $s^2$ given your assumptions on normality and the value of $\sigma^2$.

To test your assumption, compute the chi-square value $\chi^2 = (n-1)s^2/\sigma^2$ using the measured $s^2$ and the assumed value of $\sigma^2$. Theorem 6.5 states that the probability of getting the value $s^2$ is the same as the probability of getting the value $\chi^2$ in a chi-square distribution with $\nu = n - 1$.
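The recipe above can be sketched in Python. The standard library has no chi-square table, so this illustration estimates the upper-tail probability by simulation, using the fact that a chi-square RV with $\nu$ degrees of freedom can be represented as a sum of $\nu$ squared independent standard normal RVs. The variance ratio below is a hypothetical input chosen so that the statistic equals 30.2 with $n = 20$; the function names and the seed are also assumptions.

```python
import random

def chi_square_statistic(s2, sigma2, n):
    """(n - 1) s^2 / sigma^2 for a sample of size n."""
    return (n - 1) * s2 / sigma2

def mc_upper_tail(chi2_obs, df, trials, rng):
    """Estimate P(chi-square_df > chi2_obs) by summing df squared standard normals."""
    hits = 0
    for _ in range(trials):
        total = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        if total > chi2_obs:
            hits += 1
    return hits / trials

n = 20
ratio = 30.2 / (n - 1)                  # hypothetical s^2 / sigma^2 giving chi-square = 30.2
chi2_obs = chi_square_statistic(s2=ratio, sigma2=1.0, n=n)
alpha = mc_upper_tail(chi2_obs, df=n - 1, trials=20000, rng=random.Random(3))
print(round(chi2_obs, 1), alpha)        # alpha should land near 0.05
```

An estimate near 0.05 for 19 degrees of freedom is in line with what a chi-square table gives for a value of about 30.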

Recap

- Sample space ($N$ outcomes if finite), e.g. $n$ throws each of $k$ dice; each sample (sample 1, …, sample $j$, …) consists of outcomes $y_1, \dots, y_n$.
- Sample values for the RV: $x_1, \dots, x_n$, e.g. the $k$-dice sums.

Think of each $x_i$ value as resulting from a RV $X_i$ such that

1. each $X_i$ has the same density $f(x)$, mean $\mu$, and variance $\sigma^2$;
2. the $X_i$ are independent

(a random sample). The population of outcomes in the sample space generates values for the RVs.

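Each sample generates a sample mean $\bar{x}$ and a sample variance

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$

Think of the sample means and variances as values of the RVs $\bar{X}$ and $S^2$. What are $F_{\bar{X}}$, $E(\bar{X})$, $\mathrm{Var}(\bar{X})$, $F_{S^2}$, $E(S^2)$, $\mathrm{Var}(S^2)$?

Chapter 5 states:

- $E(\bar{X}) = \mu$, $\mathrm{Var}(\bar{X}) = \sigma^2/n$ for an infinite population
- $E(\bar{X}) = \mu$, $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n} \cdot \dfrac{N-n}{N-1}$ for a finite population

Chapter 6 addresses the questions on $F_{\bar{X}}$ and $F_{S^2}$.

Law of large numbers: for a single sample (and single value of $\bar{X}$),

$$P\left(|\bar{X} - \mu| > \varepsilon\right) < \frac{\sigma^2}{n\varepsilon^2}$$

Central limit theorem:

$$Z \equiv \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

is a RV whose distribution $F_Z \to$ standard normal $N(0,1)$ as $n \to \infty$ (i.e. $\bar{X}$ is a RV whose distribution $F_{\bar{X}} \to N(\mu, \sigma/\sqrt{n})$ as $n \to \infty$).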

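If the $X_i$ are normally distributed with mean $\mu$ and variance $\sigma^2$, then

$$Z \equiv \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

is a RV whose distribution $F_Z = N(0,1)$ for all $n$, i.e. $\bar{X}$ is a RV whose distribution $F_{\bar{X}} = N(\mu, \sigma/\sqrt{n})$ for all $n$.

If the $X_i$ are normally distributed with mean $\mu$, then

$$t \equiv \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

is a RV whose distribution $F_t$ is the t distribution with df $\nu = n - 1$.

If the $X_i$ are normally distributed with variance $\sigma^2$, then

$$\chi^2 \equiv \frac{(n-1)S^2}{\sigma^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma^2}$$

is a RV whose distribution $F_{\chi^2}$ is the chi-square distribution with df $\nu = n - 1$.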

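Assume we have two populations. We may wish to inquire whether they have the same variance. Assume $S_1^2$ and $S_2^2$ are measured sample variances for each population.

Theorem 6.6: If $S_1^2$ and $S_2^2$ are measured sample variances of independent random samples of respective sizes $n_1$ and $n_2$ taken from two normal populations having the same variance, then

$$F = \frac{S_1^2}{S_2^2}$$

is a RV having the F distribution with parameters $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$.

The F distribution: a two-parameter family of RVs, with

- values defined on $(0, \infty)$
- density function $f(x; \nu_1, \nu_2) = \dfrac{1}{B\left(\frac{\nu_1}{2}, \frac{\nu_2}{2}\right)} \left(\dfrac{\nu_1}{\nu_2}\right)^{\nu_1/2} x^{\frac{\nu_1}{2} - 1} \left(1 + \dfrac{\nu_1}{\nu_2}\, x\right)^{-\frac{\nu_1 + \nu_2}{2}}$
- mean value $\dfrac{\nu_2}{\nu_2 - 2}$ for $\nu_2 > 2$
- variance $\dfrac{2\nu_2^2(\nu_1 + \nu_2 - 2)}{\nu_1(\nu_2 - 2)^2(\nu_2 - 4)}$ for $\nu_2 > 4$

The F distribution is similar to the beta distribution. Here

$$B(x, y) = \int_0^1 t^{x-1}(1-t)^{y-1}\,dt$$

is the beta function.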

[Figure: F distribution densities for several values of $\nu_1$ and $\nu_2$]

The parameter $\nu_1$ is referred to as the numerator degrees of freedom (df of numerator). The parameter $\nu_2$ is referred to as the denominator degrees of freedom (df of denominator).

As with $z_\alpha$, $t_\alpha$, etc. we define $F_\alpha$. Values of $F_\alpha$ are given in Table 6 for various values of $\nu_1$ and $\nu_2$ for $\alpha = 0.05$ (Table 6(a)) and $\alpha = 0.01$ (Table 6(b)).

Practical use of Theorem 6.6: You have two populations whose distributions are (assumed to be) normal and whose variances you assume to be equal. You sample population 1 with a sample of size $n_1$ and population 2 with a sample of size $n_2$. From each sample you compute sample variances $s_1^2$ and $s_2^2$. Theorem 6.6 will tell you the probability of getting the ratio $s_1^2/s_2^2$ given your assumptions on normality and equality of variance.

To test your assumption, compute the value $F = s_1^2/s_2^2$. Theorem 6.6 states that the probability of getting the ratio $s_1^2/s_2^2$ is the same as the probability of getting the value $F$ in an F distribution with $\nu_1 = n_1 - 1$, $\nu_2 = n_2 - 1$.

e.g. Two random samples of size $n_1 = 7$ and $n_2 = 13$ are taken from the same normal population. What is the probability that the variance of the first sample will be at least 3 times that of the second?

For $\nu_1 = 6$ and $\nu_2 = 12$, Table 6(a) shows an F value of 3.00 for $\alpha = 0.05$. Therefore there is a 5% probability that the variance of the first sample will be at least 3 times that of the second.
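The two-sample example can be checked by simulation instead of a table lookup. The sketch below draws both samples from a standard normal population (any common normal population would do, since the ratio of sample variances does not depend on $\mu$ or $\sigma$) and counts how often the first sample variance is at least three times the second. The function names, trial count and seed are assumptions.

```python
import random

def sample_variance(xs):
    """Sample variance with the n - 1 divisor."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def mc_variance_ratio_tail(n1, n2, threshold, trials, rng):
    """Estimate P(S1^2 / S2^2 >= threshold) for two samples from the same normal population."""
    hits = 0
    for _ in range(trials):
        s1 = sample_variance([rng.gauss(0.0, 1.0) for _ in range(n1)])
        s2 = sample_variance([rng.gauss(0.0, 1.0) for _ in range(n2)])
        if s1 >= threshold * s2:
            hits += 1
    return hits / trials

prob = mc_variance_ratio_tail(7, 13, 3.0, trials=20000, rng=random.Random(11))
print(prob)                             # should land near 0.05
```

An estimate close to 0.05 matches the tabulated $F_{0.05} = 3.00$ for 6 and 12 degrees of freedom.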

### 6.5 Representations of normal distributions

Defining new random variables in terms of others is referred to as a *representation*.

chi-square: Let $Z_1, Z_2, \dots, Z_\nu$ be independent standard normal RVs. Define the RV

$$\chi^2_\nu = \sum_{i=1}^{\nu} Z_i^2$$

Then $\chi^2_\nu$ has a chi-square distribution with $\nu$ df. Thus we also see that the square of a standard normal RV is a chi-square RV (with 1 df).

Let

$$\chi_1^2 = \sum_{i=1}^{\nu_1} Z_i^2 \qquad \text{and} \qquad \chi_2^2 = \sum_{i=\nu_1+1}^{\nu_1+\nu_2} Z_i^2$$

where the $Z_i$ are independent standard normal RVs (and thus $\chi_1^2$ and $\chi_2^2$ are independent of each other). Then $\chi_1^2 + \chi_2^2$ has a chi-square distribution with $\nu_1 + \nu_2$ df. Thus we see that the sum of two independent chi-square RVs is also a chi-square RV, with df equal to the sum of the individual df.

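t distribution: Let $Z$ be a standard normal RV and $\chi^2$ be a chi-square RV with $\nu$ df. Assume $Z$ and $\chi^2$ are independent. Then

$$t \equiv \frac{Z}{\sqrt{\chi^2/\nu}}$$

has a t distribution with $\nu$ df.

F distribution: Let $\chi_1^2$ and $\chi_2^2$ be chi-square RVs with df $\nu_1$ and $\nu_2$ respectively. Assume $\chi_1^2$ and $\chi_2^2$ are independent. Then

$$F_{\nu_1,\nu_2} \equiv \frac{\chi_1^2/\nu_1}{\chi_2^2/\nu_2}$$

has an F distribution with $\nu_1, \nu_2$ df. Thus we see that

$$t^2 = \frac{Z^2/1}{\chi^2/\nu}$$

is a RV with an $F_{1,\nu}$ distribution.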

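e.g. Let $X_1, X_2, \dots, X_n$ be independent normal RVs all having mean $\mu$ and standard deviation $\sigma$. Then

$$Z_i = \frac{X_i - \mu}{\sigma}$$

is a standard normal RV for each $i$. With $\bar{Z} = \frac{1}{n}\sum_{i=1}^{n} Z_i$, the RV

$$\sqrt{n}\,\bar{Z} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

is also a standard normal RV. Consider

$$\sum_{i=1}^{n} Z_i^2 = \sum_{i=1}^{n} \left[(Z_i - \bar{Z}) + \bar{Z}\right]^2 = \sum_{i=1}^{n} (Z_i - \bar{Z})^2 + 2\bar{Z}\sum_{i=1}^{n} (Z_i - \bar{Z}) + n\bar{Z}^2$$

The middle term vanishes because $\sum (Z_i - \bar{Z}) = 0$, i.e.

$$\sum_{i=1}^{n} Z_i^2 = \sum_{i=1}^{n} (Z_i - \bar{Z})^2 + \left(\sqrt{n}\,\bar{Z}\right)^2$$

Note that the LHS has a chi-square distribution with $n$ df. The last term on the RHS is chi-square with 1 df. Given that the two terms on the RHS are independent, this implies that the first term on the RHS is chi-square with $n - 1$ df. Thus we see that

$$\frac{(n-1)S^2}{\sigma^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma^2} = \sum_{i=1}^{n} (Z_i - \bar{Z})^2$$

has a chi-square distribution with $n - 1$ df (as claimed in Theorem 6.5).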
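The algebraic decomposition used above, $\sum Z_i^2 = \sum (Z_i - \bar{Z})^2 + n\bar{Z}^2$, holds for any set of numbers and is easy to confirm numerically. The sketch below uses ten simulated standard normal values; the function name `decompose`, the sample size and the seed are assumptions for illustration.

```python
import random

def decompose(zs):
    """Return (sum of z_i^2, sum of (z_i - z-bar)^2, n * z-bar^2)."""
    n = len(zs)
    zbar = sum(zs) / n
    total = sum(z * z for z in zs)
    about_mean = sum((z - zbar) ** 2 for z in zs)
    mean_term = n * zbar ** 2
    return total, about_mean, mean_term

rng = random.Random(5)                              # fixed seed for repeatability
zs = [rng.gauss(0.0, 1.0) for _ in range(10)]       # simulated standard normal values
total, about_mean, mean_term = decompose(zs)
print(total, about_mean + mean_term)                # the two numbers agree up to rounding
```

The check confirms only the identity itself; the statement about degrees of freedom additionally relies on the distributional argument given in the text.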

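Let $X_i$ be $N(\mu_i, \sigma_i)$ for $i = 1, \dots, n$ be independent normal RVs. Then

$$X = \sum_{i=1}^{n} X_i$$

is normal with $E(X) = \sum_{i=1}^{n} \mu_i$ and $\mathrm{Var}(X) = \sum_{i=1}^{n} \sigma_i^2$. A sum of independent normal RVs is a normal RV.

Let $X_i$ be a chi-square RV with df $\nu_i$ for $i = 1, \dots, n$; assume the $X_i$ are independent. Then

$$X = \sum_{i=1}^{n} X_i$$

is a chi-square RV with df $\nu = \sum_{i=1}^{n} \nu_i$. A sum of independent chi-square RVs is chi-square.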

Let $X_i$ be a Poisson RV with parameter $\lambda_i$ for $i = 1, \dots, n$; assume the $X_i$ are independent. Then

$$X = \sum_{i=1}^{n} X_i$$

is a Poisson RV with parameter $\lambda = \sum_{i=1}^{n} \lambda_i$. A sum of independent Poisson RVs is Poisson.
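For two independent Poisson RVs the statement above can be verified exactly by convolving their probability mass functions. In this sketch the rates 1.5 and 2.5 are hypothetical, and the function names are assumptions; the pmf of the sum is compared with a Poisson pmf of rate $\lambda_1 + \lambda_2$ for the first few values of $k$.

```python
from math import exp, factorial

def poisson_pmf(lam, k):
    """P(X = k) for a Poisson RV with parameter lam."""
    return exp(-lam) * lam ** k / factorial(k)

def pmf_of_sum(lam1, lam2, k):
    """P(X1 + X2 = k) for independent Poisson RVs, by direct convolution."""
    return sum(poisson_pmf(lam1, j) * poisson_pmf(lam2, k - j) for j in range(k + 1))

lam1, lam2 = 1.5, 2.5                   # hypothetical rates for illustration
max_gap = max(
    abs(pmf_of_sum(lam1, lam2, k) - poisson_pmf(lam1 + lam2, k))
    for k in range(15)
)
print(max_gap)                          # difference is at the level of floating-point rounding
```

The agreement follows from the binomial theorem applied to $(\lambda_1 + \lambda_2)^k$; the general case of $n$ RVs follows by repeating the argument.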


### Chapter 13, Part A Analysis of Variance and Experimental Design

Slides Prepared by JOHN S. LOUCKS St. Edward s Uiversity Slide 1 Chapter 13, Part A Aalysis of Variace ad Eperimetal Desig Itroductio to Aalysis of Variace Aalysis of Variace: Testig for the Equality of

### THE SYSTEMATIC AND THE RANDOM. ERRORS - DUE TO ELEMENT TOLERANCES OF ELECTRICAL NETWORKS

R775 Philips Res. Repts 26,414-423, 1971' THE SYSTEMATIC AND THE RANDOM. ERRORS - DUE TO ELEMENT TOLERANCES OF ELECTRICAL NETWORKS by H. W. HANNEMAN Abstract Usig the law of propagatio of errors, approximated

### NCSS Statistical Software. Tolerance Intervals

Chapter 585 Itroductio This procedure calculates oe-, ad two-, sided tolerace itervals based o either a distributio-free (oparametric) method or a method based o a ormality assumptio (parametric). A two-sided

### Recall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y.

Testig Statistical Hypotheses Recall the study where we estimated the differece betwee mea systolic blood pressure levels of users of oral cotraceptives ad o-users, x - y. Such studies are sometimes viewed

### Joint Probability Distributions and Random Samples. Jointly Distributed Random Variables. Chapter { }

UCLA STAT A Applied Probability & Statistics for Egieers Istructor: Ivo Diov, Asst. Prof. I Statistics ad Neurology Teachig Assistat: Neda Farziia, UCLA Statistics Uiversity of Califoria, Los Ageles, Sprig

### Statistical inference: example 1. Inferential Statistics

Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either

### Review Questions, Chapters 8, 9. f(y) = 0, elsewhere. F (y) = f Y(1) = n ( e y/θ) n 1 1 θ e y/θ = n θ e yn

Stat 366 Lab 2 Solutios (September 2, 2006) page TA: Yury Petracheko, CAB 484, yuryp@ualberta.ca, http://www.ualberta.ca/ yuryp/ Review Questios, Chapters 8, 9 8.5 Suppose that Y, Y 2,..., Y deote a radom

### Stat 200 -Testing Summary Page 1

Stat 00 -Testig Summary Page 1 Mathematicias are like Frechme; whatever you say to them, they traslate it ito their ow laguage ad forthwith it is somethig etirely differet Goethe 1 Large Sample Cofidece

### B Supplemental Notes 2 Hypergeometric, Binomial, Poisson and Multinomial Random Variables and Borel Sets

B671-672 Supplemetal otes 2 Hypergeometric, Biomial, Poisso ad Multiomial Radom Variables ad Borel Sets 1 Biomial Approximatio to the Hypergeometric Recall that the Hypergeometric istributio is fx = x

### Asymptotic Results for the Linear Regression Model

Asymptotic Results for the Liear Regressio Model C. Fli November 29, 2000 1. Asymptotic Results uder Classical Assumptios The followig results apply to the liear regressio model y = Xβ + ε, where X is

### Topic 6 Sampling, hypothesis testing, and the central limit theorem

CSE 103: Probability ad statistics Fall 2010 Topic 6 Samplig, hypothesis testig, ad the cetral limit theorem 61 The biomial distributio Let X be the umberofheadswhe acoiofbiaspistossedtimes The distributio

### BHW #13 1/ Cooper. ENGR 323 Probabilistic Analysis Beautiful Homework # 13

BHW # /5 ENGR Probabilistic Aalysis Beautiful Homework # Three differet roads feed ito a particular freeway etrace. Suppose that durig a fixed time period, the umber of cars comig from each road oto the

### IIT JAM Mathematical Statistics (MS) 2006 SECTION A

IIT JAM Mathematical Statistics (MS) 6 SECTION A. If a > for ad lim a / L >, the which of the followig series is ot coverget? (a) (b) (c) (d) (d) = = a = a = a a + / a lim a a / + = lim a / a / + = lim

### Introducing Sample Proportions

Itroducig Sample Proportios Probability ad statistics Aswers & Notes TI-Nspire Ivestigatio Studet 60 mi 7 8 9 0 Itroductio A 00 survey of attitudes to climate chage, coducted i Australia by the CSIRO,

### Lecture 4. Random variable and distribution of probability

Itroductio to theory of probability ad statistics Lecture. Radom variable ad distributio of probability dr hab.iż. Katarzya Zarzewsa, prof.agh Katedra Eletroii, AGH e-mail: za@agh.edu.pl http://home.agh.edu.pl/~za

### Instructor: Judith Canner Spring 2010 CONFIDENCE INTERVALS How do we make inferences about the population parameters?

CONFIDENCE INTERVALS How do we make ifereces about the populatio parameters? The samplig distributio allows us to quatify the variability i sample statistics icludig how they differ from the parameter

### 4.1 Sigma Notation and Riemann Sums

0 the itegral. Sigma Notatio ad Riema Sums Oe strategy for calculatig the area of a regio is to cut the regio ito simple shapes, calculate the area of each simple shape, ad the add these smaller areas

### STATISTICAL INFERENCE

STATISTICAL INFERENCE POPULATION AND SAMPLE Populatio = all elemets of iterest Characterized by a distributio F with some parameter θ Sample = the data X 1,..., X, selected subset of the populatio = sample

### Sampling, Sampling Distribution and Normality

4/17/11 Tools of Busiess Statistics Samplig, Samplig Distributio ad ormality Preseted by: Mahedra Adhi ugroho, M.Sc Descriptive statistics Collectig, presetig, ad describig data Iferetial statistics Drawig

### MA131 - Analysis 1. Workbook 2 Sequences I

MA3 - Aalysis Workbook 2 Sequeces I Autum 203 Cotets 2 Sequeces I 2. Itroductio.............................. 2.2 Icreasig ad Decreasig Sequeces................ 2 2.3 Bouded Sequeces..........................

### Probability, Expectation Value and Uncertainty

Chapter 1 Probability, Expectatio Value ad Ucertaity We have see that the physically observable properties of a quatum system are represeted by Hermitea operators (also referred to as observables ) such

### Chapter 4 Tests of Hypothesis

Dr. Moa Elwakeel [ 5 TAT] Chapter 4 Tests of Hypothesis 4. statistical hypothesis more. A statistical hypothesis is a statemet cocerig oe populatio or 4.. The Null ad The Alterative Hypothesis: The structure

### Goodness-Of-Fit For The Generalized Exponential Distribution. Abstract

Goodess-Of-Fit For The Geeralized Expoetial Distributio By Amal S. Hassa stitute of Statistical Studies & Research Cairo Uiversity Abstract Recetly a ew distributio called geeralized expoetial or expoetiated

### First Year Quantitative Comp Exam Spring, Part I - 203A. f X (x) = 0 otherwise

First Year Quatitative Comp Exam Sprig, 2012 Istructio: There are three parts. Aswer every questio i every part. Questio I-1 Part I - 203A A radom variable X is distributed with the margial desity: >

### Linear Regression Models

Liear Regressio Models Dr. Joh Mellor-Crummey Departmet of Computer Sciece Rice Uiversity johmc@cs.rice.edu COMP 528 Lecture 9 15 February 2005 Goals for Today Uderstad how to Use scatter diagrams to ispect

### 0, otherwise. EX = E(X 1 + X n ) = EX j = np and. Var(X j ) = np(1 p). Var(X) = Var(X X n ) =

PROBABILITY MODELS 35 10. Discrete probability distributios I this sectio, we discuss several well-ow discrete probability distributios ad study some of their properties. Some of these distributios, lie

### Math 140 Introductory Statistics

8.2 Testig a Proportio Math 1 Itroductory Statistics Professor B. Abrego Lecture 15 Sectios 8.2 People ofte make decisios with data by comparig the results from a sample to some predetermied stadard. These

### HOMEWORK 2 SOLUTIONS

HOMEWORK SOLUTIONS CSE 55 RANDOMIZED AND APPROXIMATION ALGORITHMS 1. Questio 1. a) The larger the value of k is, the smaller the expected umber of days util we get all the coupos we eed. I fact if = k

### Sequences I. Chapter Introduction

Chapter 2 Sequeces I 2. Itroductio A sequece is a list of umbers i a defiite order so that we kow which umber is i the first place, which umber is i the secod place ad, for ay atural umber, we kow which

### The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled

1 Lecture : Area Area ad distace traveled Approximatig area by rectagles Summatio The area uder a parabola 1.1 Area ad distace Suppose we have the followig iformatio about the velocity of a particle, how

### y ij = µ + α i + ɛ ij,

STAT 4 ANOVA -Cotrasts ad Multiple Comparisos /3/04 Plaed comparisos vs uplaed comparisos Cotrasts Cofidece Itervals Multiple Comparisos: HSD Remark Alterate form of Model I y ij = µ + α i + ɛ ij, a i

### Number of fatalities X Sunday 4 Monday 6 Tuesday 2 Wednesday 0 Thursday 3 Friday 5 Saturday 8 Total 28. Day

LECTURE # 8 Mea Deviatio, Stadard Deviatio ad Variace & Coefficiet of variatio Mea Deviatio Stadard Deviatio ad Variace Coefficiet of variatio First, we will discuss it for the case of raw data, ad the

### Confidence Level We want to estimate the true mean of a random variable X economically and with confidence.

Cofidece Iterval 700 Samples Sample Mea 03 Cofidece Level 095 Margi of Error 0037 We wat to estimate the true mea of a radom variable X ecoomically ad with cofidece True Mea μ from the Etire Populatio

### ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality

### 62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 +

62. Power series Defiitio 16. (Power series) Give a sequece {c }, the series c x = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + is called a power series i the variable x. The umbers c are called the coefficiets of

### Lecture 9: September 19

36-700: Probability ad Mathematical Statistics I Fall 206 Lecturer: Siva Balakrisha Lecture 9: September 9 9. Review ad Outlie Last class we discussed: Statistical estimatio broadly Pot estimatio Bias-Variace

### ORF 245 Fundamentals of Engineering Statistics. Midterm Exam 2

Priceto Uiversit Departmet of Operatios Research ad Fiacial Egieerig ORF 45 Fudametals of Egieerig Statistics Midterm Eam April 17, 009 :00am-:50am PLEASE DO NOT TURN THIS PAGE AND START THE EXAM UNTIL

### Kernel density estimator

Jauary, 07 NONPARAMETRIC ERNEL DENSITY ESTIMATION I this lecture, we discuss kerel estimatio of probability desity fuctios PDF Noparametric desity estimatio is oe of the cetral problems i statistics I

### Some Basic Probability Concepts. 2.1 Experiments, Outcomes and Random Variables

Some Basic Probability Cocepts 2. Experimets, Outcomes ad Radom Variables A radom variable is a variable whose value is ukow util it is observed. The value of a radom variable results from a experimet;

### MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 6 9/24/2008 DISCRETE RANDOM VARIABLES AND THEIR EXPECTATIONS

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 6 9/24/2008 DISCRETE RANDOM VARIABLES AND THEIR EXPECTATIONS Cotets 1. A few useful discrete radom variables 2. Joit, margial, ad

### Lesson 10: Limits and Continuity

www.scimsacademy.com Lesso 10: Limits ad Cotiuity SCIMS Academy 1 Limit of a fuctio The cocept of limit of a fuctio is cetral to all other cocepts i calculus (like cotiuity, derivative, defiite itegrals

### UCLA STAT 110B Applied Statistics for Engineering and the Sciences

UCLA STAT 110B Applied Statistics for Egieerig ad the Scieces Istructor: Ivo Diov, Asst. Prof. I Statistics ad Neurology Teachig Assistats: Bria Ng, UCLA Statistics Uiversity of Califoria, Los Ageles,

### Statistics 20: Final Exam Solutions Summer Session 2007

1. 20 poits Testig for Diabetes. Statistics 20: Fial Exam Solutios Summer Sessio 2007 (a) 3 poits Give estimates for the sesitivity of Test I ad of Test II. Solutio: 156 patiets out of total 223 patiets

### Singular Continuous Measures by Michael Pejic 5/14/10

Sigular Cotiuous Measures by Michael Peic 5/4/0 Prelimiaries Give a set X, a σ-algebra o X is a collectio of subsets of X that cotais X ad ad is closed uder complemetatio ad coutable uios hece, coutable

### Shannon s noiseless coding theorem

18.310 lecture otes May 4, 2015 Shao s oiseless codig theorem Lecturer: Michel Goemas I these otes we discuss Shao s oiseless codig theorem, which is oe of the foudig results of the field of iformatio

### Advanced Engineering Mathematics Exercises on Module 4: Probability and Statistics

Advaced Egieerig Mathematics Eercises o Module 4: Probability ad Statistics. A survey of people i give regio showed that 5% drak regularly. The probability of death due to liver disease, give that a perso

### Riemann Sums y = f (x)

Riema Sums Recall that we have previously discussed the area problem I its simplest form we ca state it this way: The Area Problem Let f be a cotiuous, o-egative fuctio o the closed iterval [a, b] Fid