GG313 GEOLOGICAL DATA ANALYSIS
LECTURE NOTES
PAUL WESSEL

SECTION 2: TESTING OF HYPOTHESES

Much of statistics is concerned with testing hypotheses against data using several standard techniques. At the core of these tests lies the concept of the "null hypothesis". A null hypothesis is set up, and we use our tests to see whether we can reject the null hypothesis, $H_0$. In other words, if we want to test whether two rock samples have different densities, we form the null hypothesis that they have equal densities and test whether we can reject $H_0$.

We will illustrate all this with an example. It is claimed that the density of a particular sandstone is 2.35 g cm$^{-3}$. We are handed a sample of 50 specimens from an outcrop in the same area and decide on the criterion that the samples are from another lithological unit if the sample mean is less than 2.25 or larger than 2.45. This is a clear-cut criterion for accepting or rejecting the claim that the samples are from the same unit, but it is not infallible. Since our decision will be based on a sample, there is the possibility that the sample mean may be < 2.25 or > 2.45 even though the population mean is 2.35. We will therefore want to know what the chances are that we make a wrong decision.

We will investigate what the probability is that $\bar{x} < 2.25$ or $\bar{x} > 2.45$ even if $\mu = 2.35$. Here, $s\ (= \sigma) = 0.42$. This probability is given by the area under the tails in Fig. 2-1.

Fig. 2-1. We reject the null hypothesis when the computed statistic falls in the tail area. (Regions marked on the curve: reject claim, accept claim, reject claim.)

Since $n = 50 \gg 30$ we will treat our sample as being of infinite size. Then we have

$$s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{0.42}{\sqrt{50}} = 0.06$$

We can now evaluate the normal scores

$$z_0 = \frac{2.25 - 2.35}{0.06} = -1.67, \qquad z_1 = \frac{2.45 - 2.35}{0.06} = 1.67$$

We find the area under each tail to be

$$\frac{1}{2}\left[1 - \mathrm{erf}\left(\frac{1.67}{\sqrt{2}}\right)\right] = 0.0475$$

Thus, the probability of getting a sample mean that falls in the tail area of the distribution is $p = 2 \cdot 0.0475 = 0.095$, or 9.5%. This result means there is a 9.5% chance we will erroneously reject the hypothesis that $\mu = 2.35$ when it is in fact true. We call this committing a type I error.

Let us look at another possibility, where our test will fail to detect that $\mu$ is not equal to 2.35. Suppose for the sake of argument that the true mean is 2.53. Then the probability of getting a sample mean in the range 2.25 to 2.45, and hence erroneously accepting the claim that $\mu = 2.35$, is given by the tail area in Fig. 2-2.

Fig. 2-2. Possibility of committing a Type II error. (Regions marked on the curve: erroneously accept claim between 2.25 and 2.45, reject claim elsewhere; the curve is centered on 2.53.)

As before, $s_{\bar{x}} = 0.06$, so the normal scores become

$$z_0 = \frac{2.25 - 2.53}{0.06} = -4.67, \qquad z_1 = \frac{2.45 - 2.53}{0.06} = -1.33$$

It follows that the area is

$$A = \frac{1}{2}\left[\mathrm{erf}\left(\frac{4.67}{\sqrt{2}}\right) - \mathrm{erf}\left(\frac{1.33}{\sqrt{2}}\right)\right] = 0.092$$

or 9.2%. This is the risk we run of accepting the incorrect hypothesis $\mu = 2.35$. We call this committing a type II error.

We recognize that there are several possibilities when testing the null hypothesis. The table below summarizes the variations:

                   Accept H0           Reject H0
    H0 is TRUE     Correct decision    Type I error
    H0 is FALSE    Type II error       Correct decision

If the hypothesis is true, but is rejected, we have committed a Type I error, and the probability of doing so is designated $\alpha$. In our example, $\alpha$ was 0.095. If our hypothesis is incorrect, but we still accept it, then we have committed a Type II error, and the probability of doing so is designated $\beta$. In our case, with $\mu = 2.53$, $\beta$ was 0.092.
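The two error probabilities can be checked numerically. The sketch below is a minimal illustration, not part of the original notes: it takes the claimed density as 2.35, the accept limits as 2.25 and 2.45, the rounded standard error as 0.06, and the alternative mean as 2.53, and the function name `error_probabilities` is our own.

```python
from statistics import NormalDist


def error_probabilities(mu0, lower, upper, std_err, mu_true):
    """Return (alpha, beta) for the accept region [lower, upper].

    alpha: chance of rejecting the claim mu = mu0 when it is true.
    beta:  chance of accepting the claim when the true mean is mu_true.
    """
    under_claim = NormalDist(mu0, std_err)
    under_truth = NormalDist(mu_true, std_err)
    # Type I: sample mean lands in either tail although the claim holds.
    alpha = under_claim.cdf(lower) + (1.0 - under_claim.cdf(upper))
    # Type II: sample mean lands inside the accept region although the claim is false.
    beta = under_truth.cdf(upper) - under_truth.cdf(lower)
    return alpha, beta


# Values assumed from the sandstone example; 0.06 is the rounded standard error.
alpha, beta = error_probabilities(2.35, 2.25, 2.45, 0.06, 2.53)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}")
```

The results should land within about 0.001 of the 0.095 and 0.092 quoted above; the small difference comes from the notes rounding z to two decimals before the table lookup.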
Significance test

We saw in our example that the type II error probability depended on the value of $\mu$. Since $\mu$ is often not known, it is common to simply either reject $H_0$ or reserve judgment (i.e., never accept $H_0$). This way we avoid committing a type II error altogether, at the expense of never accepting $H_0$. We call this a significance test and say that the results are statistically significant if we can reject $H_0$. If not, the results are not statistically significant, and we attempt no further decisions. Hence, in statistics we can only disprove hypotheses, but never prove them.

Differences between means

We will often want to know whether an observed difference in sample means can be attributed to chance. We will again use Student's t-test. It is assumed that the two distributions have the same variance but possibly different means. We are interested in the distribution of $\bar{x}_1 - \bar{x}_2$, the difference in sample means. If the samples are independent and random, the difference distribution will be approximately normal with mean $\mu_1 - \mu_2$ and standard deviation

$$\sigma_e = \sigma_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \tag{2.1}$$

where $\sigma_p^2$ is called the pooled variance:

$$\sigma_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \tag{2.2}$$

We find the t-statistic by evaluating

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sigma_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \tag{2.3}$$

and test the hypothesis $H_0: \mu_1 = \mu_2$ based on the t-distribution for $\nu = n_1 + n_2 - 2$ degrees of freedom. For large $n_1$, $n_2$, the t-distribution becomes very close to a normal distribution and we may instead use z-statistics based on

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \tag{2.4}$$

We will illustrate the two-sample t-test with an example. We have obtained random samples of magnetites from two separate outcrops. The measured magnetizations in A m$^2$ kg$^{-1}$ are

Outcrop 1: {87.4, 93.4, 96.8, 86.1, 96.4}, $n_1 = 5$
Outcrop 2: {106.2, 102.2, 105.7, 93.4, 95.0, 97.0}, $n_2 = 6$

We state our null hypothesis $H_0: \mu_1 = \mu_2$; the alternative hypothesis is of course $H_1: \mu_1 \neq \mu_2$. We decide to use the 95% confidence level, so $\alpha = 0.05$. In this case $\nu = 5 + 6 - 2 = 9$, and a t-statistics table (Appendix 2.4) shows that the critical t value is 2.26; we will reject $H_0$ if our $|t|$ exceeds this critical value. From the data we find $\bar{x}_1 = 92.0$ with $s_1 = 5.0$ and $\bar{x}_2 = 99.9$ with $s_2 = 5.5$. Using Eq. (2.3) we obtain $t \approx -2.5$. Since $|t| > 2.26$ we must reject $H_0$. We conclude that the magnetizations at the two outcrops are not the same.

We have now put confidence limits on sample means and compared sample means to investigate whether two populations have different means. We will turn our attention to inferences about the standard deviation.

Inferences about the standard deviation

The most popular way of estimating $\sigma$ is to compute the sample standard deviation. When investigating properties of $s$ and $\sigma$ we will be using the "chi-square" statistic

$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \bar{x})^2 \tag{2.5}$$

The $\chi^2$ distribution depends on the degrees of freedom $\nu = n - 1$ and is restricted to positive values because of the power of 2. It portrays how the sample variance would be distributed if we selected random samples of $n$ items. Fig. 2-3 shows a typical $\chi^2$ curve.

Fig. 2-3. A typical chi-square distribution. (The area to the right of $\chi^2_\alpha$ is $\alpha$; the area to the left is $1 - \alpha$.)

In the same way we used $z_\alpha$ and $t_\alpha$, we now use $\chi^2_\alpha$ as the value for which the area to the right of $\chi^2_\alpha$ equals $\alpha$. Because the distribution is not symmetrical, we must evaluate the $\alpha/2$ and $1 - \alpha/2$ critical values separately. And in the same way we put confidence intervals on $\mu$, we now use (2.5) to find

$$\chi^2_{1-\alpha/2} < \frac{(n - 1)s^2}{\sigma^2} < \chi^2_{\alpha/2} \quad \text{or} \quad \frac{(n - 1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2}} \tag{2.6}$$
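The two calculations in this part of the notes, the pooled two-sample t-statistic of (2.1) to (2.3) and the variance interval of (2.6), can be sketched in a few lines. This is an illustration under stated assumptions rather than the notes' own code: the outcrop values are our reading of the magnetization data (102.2 for the second value of outcrop 2 in particular), the function names are ours, and the chi-square critical values are table lookups passed in by the caller (5.63 and 26.12 are the standard tabulated values for 14 degrees of freedom at the 0.975 and 0.025 levels).

```python
import math
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Two-sample t-statistic with pooled variance, following (2.1) to (2.3)."""
    n1, n2 = len(sample1), len(sample2)
    # statistics.variance is the sample variance (divides by n - 1).
    pooled_var = ((n1 - 1) * variance(sample1)
                  + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var) * math.sqrt(1.0 / n1 + 1.0 / n2)
    return (mean(sample1) - mean(sample2)) / std_err, n1 + n2 - 2


def variance_interval(s, n, chi2_upper, chi2_lower):
    """Confidence interval on the variance, following (2.6).

    chi2_upper is the alpha/2 critical value and chi2_lower the
    1 - alpha/2 critical value, both looked up in a table.
    """
    scaled = (n - 1) * s ** 2
    return scaled / chi2_upper, scaled / chi2_lower


# Magnetization data as read from the example (an assumption, see above).
outcrop1 = [87.4, 93.4, 96.8, 86.1, 96.4]
outcrop2 = [106.2, 102.2, 105.7, 93.4, 95.0, 97.0]
t, dof = pooled_t(outcrop1, outcrop2)
print(f"t = {t:.2f} with {dof} degrees of freedom")

# Illustrative interval for n = 15 and s = 1.3 with tabulated chi-square values.
low, high = variance_interval(1.3, 15, 26.12, 5.63)
print(f"{low:.2f} < sigma^2 < {high:.2f}")
```

Keeping the critical values as arguments avoids pretending that the standard library can compute chi-square or t quantiles; in practice they would come from the appendix tables or a statistics package.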
Equation (2.6) gives the $1 - \alpha$ confidence interval on the variance. For large samples ($n > 30$) this can be simplified to

$$\frac{s}{1 + \dfrac{z_{\alpha/2}}{\sqrt{2n}}} < \sigma < \frac{s}{1 - \dfrac{z_{\alpha/2}}{\sqrt{2n}}} \tag{2.7}$$

Note that the confidence interval is not symmetrical about the sample standard deviation.

Testing standard deviations

We might want to test whether our sample standard deviation $s$ is equal to or different from a given population $\sigma$. In such a case the null hypothesis becomes $H_0: s = \sigma$ with the alternative hypothesis $H_1: s \neq \sigma$. As usual, we select our level of significance to be $\alpha = 0.05$. Assume we have 15 estimates of temperatures with $s = 1.3$ °C and we want to know if $s$ is any different from $\sigma = 1.5$ °C based on past experience. From $\alpha = 0.05$ and $\nu = 14$ we find the critical values from a table to be $\chi^2_{0.025} = 26.12$ and $\chi^2_{0.975} = 5.63$. Based on our sample statistic we compute

$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2} = \frac{14 \cdot 1.3^2}{1.5^2} = 10.5$$

Since this value lies between the two critical values, we cannot reject $H_0$ at the 95% confidence level. Instead we may accept $H_0$ or reserve judgment. This was a two-sided test since we must check that $\chi^2$ did not land in either of the two tails.

For large samples ($n \geq 30$), the $\chi^2$ distribution does not vary much with $\nu$ and we may use the simpler statistic

$$z = \frac{s - \sigma}{\sigma / \sqrt{2n}}$$

and use the standard z-statistics table.

Testing two standard deviations

In the t-test for differences between means we assumed that the standard deviations of the two samples were the same. Often this is not the case, and one should first test whether this assumption is valid. We want to know whether the two variances are different or not. The statistic that is most appropriate for such tests is called the F-statistic, defined as

$$F = \begin{cases} s_1^2 / s_2^2, & s_1 > s_2 \\ s_2^2 / s_1^2, & s_2 > s_1 \end{cases} \tag{2.8}$$

For normal distributions this variance ratio follows a continuous distribution called the F distribution. It depends on the two degrees of freedom $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$. As before, we will reject the null hypothesis $H_0: \sigma_1 = \sigma_2$ at the level of significance $\alpha$ and [possibly] accept the alternative $\sigma_1 \neq \sigma_2$ when our observed F statistic exceeds the critical value $F_{\alpha/2}$.

Example: In our case of magnetizations we assumed that the $\sigma$'s were approximately the same. Let us now show that this is actually justified.
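The two variance tests described above reduce to very little arithmetic once the critical values have been looked up. The sketch below is only an illustration: the function names are ours, the critical values 5.63 and 26.12 are tabulated chi-square values for 14 degrees of freedom, and 5.0 and 5.5 are taken as the rounded sample standard deviations of the two magnetization samples.

```python
def chi_square_statistic(s, sigma, n):
    """Test statistic (n - 1) s^2 / sigma^2 from (2.5)."""
    return (n - 1) * s ** 2 / sigma ** 2


def f_ratio(s1, s2):
    """Variance ratio from (2.8), with the larger variance in the numerator."""
    larger, smaller = max(s1, s2), min(s1, s2)
    return larger ** 2 / smaller ** 2


def inside(value, lower, upper):
    """True when the statistic lies strictly between the two critical values."""
    return lower < value < upper


# Temperature example: n = 15, s = 1.3, sigma = 1.5.
chi2 = chi_square_statistic(s=1.3, sigma=1.5, n=15)
# Two-sided test: H0 survives when chi2 falls between the tabulated values.
cannot_reject = inside(chi2, 5.63, 26.12)
print(f"chi2 = {chi2:.1f}, cannot reject H0: {cannot_reject}")

# Magnetization example with rounded standard deviations (an assumption).
print(f"F = {f_ratio(5.0, 5.5):.2f}")
```

Putting the larger variance in the numerator means F is never below 1, so only the upper critical value of the F distribution needs to be consulted.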
For the two magnetization samples we find

$$F = \frac{s_2^2}{s_1^2} = \frac{5.5^2}{5.0^2} = 1.21$$

From the table we find $F_{0.025}(\nu_1 = 5, \nu_2 = 4) = 9.36$. Hence we cannot reject $H_0$ and conclude that the difference in sample standard deviations is not statistically significant at the 95% level.

The $\chi^2$ test

The last parametric test we shall be concerned with is the chi-squared test. It is a sample-based statistic using normal scores that are squared and summed up:

$$\chi^2 = \sum_{i=1}^{n} z_i^2 = \sum_{i=1}^{n} \left(\frac{x_i - \mu}{\sigma}\right)^2 \tag{2.9}$$

If we drew all possible samples of size $n$ from a normal population and plotted $\sum z^2$, they would form the $\chi^2$ distribution mentioned earlier. The $\chi^2$ test is used to compare the shape of our data distribution to a distribution of known shape (usually a normal distribution). The $\chi^2$ test is most often used on data that have been categorized or binned. Assuming that our observations have been binned into $k$ bins, the test statistic is found as

$$\chi^2 = \sum_{j=1}^{k} \frac{(O_j - E_j)^2}{E_j} \tag{2.10}$$

where $O_j$ and $E_j$ are the numbers of observed and expected values in the j'th bin. Note that this $\chi^2$ still is non-dimensional since we are using counts, even if the denominator is not squared. With counts, the probability that $m$ out of $n$ counts will fall in a given bin $j$ is determined by the binomial distribution, with

$$\mu_j = E_j = n p_j \quad \text{and} \quad \sigma_j^2 = n p_j (1 - p_j) \approx n p_j = E_j$$

Plugging in for $\mu$ and $\sigma$ we find

$$\chi^2 = \sum \left(\frac{x - \mu}{\sigma}\right)^2 = \sum_{j=1}^{k} \left(\frac{O_j - E_j}{\sqrt{E_j}}\right)^2 = \sum_{j=1}^{k} \frac{(O_j - E_j)^2}{E_j}$$

As an example, consider the 48 measurements of salinity from Whitewater Bay in Florida (Table 2-1). We would like to know if these observations come from a normal distribution or not. The answer might have implications for models of mixing salt and freshwater. The first step is to normalize the data into normal scores. We compute the sample mean $\bar{x}$ and find $s = 9.7$, and thus transform all values to

$$z_i = \frac{x_i - \bar{x}}{s}$$

We choose to bin the data into 5 bins chosen such that the area under the curve for each bin is the same, i.e., 0.2. Using tables for the normal distribution, we find that the corresponding z-values for the intervals are $(-\infty, -0.84)$, $(-0.84, -0.26)$, $(-0.26, +0.26)$, $(0.26, 0.84)$, $(0.84, \infty)$. Counting the values in Table 2-1, we find that the observed numbers of samples in the 5 bins are 10, 11, 10, 5, and 12. These are the $O_j$'s. The expected values $E_j$ are all

$$E_j = \frac{n}{k} = \frac{48}{5} = 9.6$$

Using (2.10) we find the observed value $\chi^2 = 3.04$. The $\chi^2$ distribution depends on $\nu$, the degrees of freedom, which normally is $\nu = k - 1 = 4$ in our case. However, we used our observations to compute $\bar{x}$ and $s$. This reduces $\nu$ by 2, leaving 2 degrees of freedom. From Appendix 2.6 we find the critical $\chi^2$ for $\nu = 2$ and $\alpha = 0.05$ to be 5.99. Since this is much larger than our computed value, we conclude that we cannot reject the null hypothesis that the salinities were drawn from a normal distribution at the 95% confidence level. We repeat that while we used a normal distribution in this example, the $E_j$ could have represented any other distribution.

Table 2-1. Standardized scores of salinity measurements from Whitewater Bay. (Columns: Number, Original, Standardized.)
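The binned test statistic of (2.10) is easy to reproduce from the five observed counts quoted in the salinity example. The following is a minimal sketch with function names of our own choosing; it reads the fifth count as 12 (so that the counts add up to 48) and also shows one way to obtain equal-area bin edges from the standard normal distribution instead of a printed table.

```python
from statistics import NormalDist


def equal_area_edges(k):
    """Interior z-values that split the standard normal into k equal-area bins."""
    return [NormalDist().inv_cdf(j / k) for j in range(1, k)]


def chi_square_binned(observed, expected):
    """Sum of (O_j - E_j)^2 / E_j over all bins, as in (2.10)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [10, 11, 10, 5, 12]   # counts per bin, as read from the example
n, k = sum(observed), len(observed)
expected = [n / k] * k           # equal-area bins: 48 / 5 = 9.6 in each
chi2 = chi_square_binned(observed, expected)

print([round(z, 2) for z in equal_area_edges(k)])
print(f"chi2 = {chi2:.2f}")
```

The computed edges come out close to the tabulated ±0.84 and ±0.26 used above (the inner pair is nearer ±0.25 when computed directly), and the statistic matches the 3.04 quoted in the text.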
Non-parametric tests

Last time we finished up looking at the standard parametric tests, i.e., the t, F, and $\chi^2$ tests. We justified using these tests by either having large samples and invoking the central limit theorem, or simply assuming that the distribution we have sampled is approximately normal. Sometimes, however, none of these conditions are met. The two cases are:

- Small samples ($n < 30$) for which you cannot assume the population is normal
- Any size sample of ordinal data (which can only be ranked, not operated on numerically)

In those cases we apply non-parametric methods, which make no assumptions about the form of the data distribution.

Mann-Whitney test

This test is a non-parametric alternative to the two-sample Student t-test. It also goes by the names Wilcoxon test and the U-test. The Mann-Whitney test is performed by combining the two data sets we want to compare, sorting them into ascending order, and assigning each point a rank: the smallest value is given rank 1 and the largest observation is ranked $n_1 + n_2$. Should some of the observations be identical, one assigns the average rank to all these values. E.g., if the 7th and 8th sorted values are identical, we assign to each the rank 7.5. The idea here is that if the samples consist of random drawings from the same population, one would expect the ranks for both samples to be scattered more or less uniformly through the sequence.

After arranging the data, we add up the ranks for each data set into rank sums, which we denote $W_1$ and $W_2$. The sum $W_1 + W_2$ must obviously equal the sum of the first $(n_1 + n_2)$ integers, which is

$$\frac{1}{2}(n_1 + n_2)(n_1 + n_2 + 1)$$

Many early rank sum tests were based on $W_1$ or $W_2$, but now it is customary to use the statistic U defined as

$$U_1 = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - W_1 \tag{2.11}$$

or

$$U_2 = n_1 n_2 + \frac{n_2(n_2 + 1)}{2} - W_2 \tag{2.12}$$

or simply $U$, the smallest of $U_1$ and $U_2$. This statistic takes on values from 0 to $n_1 n_2$ and its sampling distribution is symmetrical about $n_1 n_2 / 2$. The test then consists of comparing the calculated U statistic to a critical U value given the sample sizes and desired level of significance.
Example: We want to compare the grain size of sand obtained from two different locations on the moon on the basis of measurements of grain diameters in mm as follows.

Location 1: 0.37, 0.70, 0.75, 0.30, 0.45, 0.16, 0.62, 0.73, 0.33 ($n_1 = 9$)
Location 2: 0.86, 0.55, 0.80, 0.42, 0.97, 0.84, 0.24, 0.51, 0.92, 0.69 ($n_2 = 10$)
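The ranking procedure and the U statistics of (2.11) and (2.12) can be sketched as follows. This is an illustration, not the notes' own code: the helper names are ours, and the grain diameters are our reading of the two samples, including 0.33 as the ninth value at location 1, which is the value consistent with the sample mean and rank sum quoted for that sample.

```python
def rank_values(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = average
        start = end + 1
    return ranks


def mann_whitney_u(sample1, sample2):
    """Return (W1, W2, U1, U2, U) following (2.11) and (2.12)."""
    n1, n2 = len(sample1), len(sample2)
    ranks = rank_values(list(sample1) + list(sample2))
    w1, w2 = sum(ranks[:n1]), sum(ranks[n1:])
    u1 = n1 * n2 + n1 * (n1 + 1) / 2.0 - w1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2.0 - w2
    return w1, w2, u1, u2, min(u1, u2)


# Grain diameters in mm, as read from the example (see the assumptions above).
location1 = [0.37, 0.70, 0.75, 0.30, 0.45, 0.16, 0.62, 0.73, 0.33]
location2 = [0.86, 0.55, 0.80, 0.42, 0.97, 0.84, 0.24, 0.51, 0.92, 0.69]
w1, w2, u1, u2, u = mann_whitney_u(location1, location2)
print(w1, w2, u1, u2, u)
```

Running it on the two samples produces the rank sums and U values that the worked example goes through next; the decision itself still requires a critical U value from a table.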
We do not know what distribution the grain sizes of sand on the moon follow, so we choose the U-test to see if the mean grain size differs in the two samples. Computing the means gives 0.49 and 0.68. If we wanted to use the t-test we would have to assume that the underlying distributions are normal. The U-test requires no such assumptions. We start by arranging the data jointly in ascending order and keep track of which sample each point originated from:

    Data   Source   Rank
    0.16     1        1
    0.24     2        2
    0.30     1        3
    0.33     1        4
    0.37     1        5
    0.42     2        6
    0.45     1        7
    0.51     2        8
    0.55     2        9
    0.62     1       10
    0.69     2       11
    0.70     1       12
    0.73     1       13
    0.75     1       14
    0.80     2       15
    0.84     2       16
    0.86     2       17
    0.92     2       18
    0.97     2       19

We first evaluate the rank sum for sample 1, giving $W_1 = 69$, from which it follows that

$$W_2 = \frac{19 \cdot 20}{2} - W_1 = 190 - 69 = 121$$

We now form the null hypothesis $H_0: \mu_1 = \mu_2$, with $H_1: \mu_1 \neq \mu_2$, and state the level of significance $\alpha = 0.05$. From a table with critical values for U we find $U_\alpha(9, 10) = 20$. We will reject the null hypothesis if $U \leq 20$. From $W_1$ and $W_2$ we find

$$U_1 = 9 \cdot 10 + \frac{9 \cdot 10}{2} - 69 = 66, \qquad U_2 = 9 \cdot 10 + \frac{10 \cdot 11}{2} - 121 = 24$$

and hence $U = \min(66, 24) = 24$. This is larger than the critical value of 20, so we cannot reject the null hypothesis. In other words, the observed difference in grain size means is not statistically significant at the 95% confidence level.

For large samples ($n_1, n_2 > 30$) things again simplify, and it can be shown that the mean and standard deviation of the $U$ sampling distribution are

$$\mu_U = \frac{n_1 n_2}{2}, \qquad \sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} \tag{2.13}$$

We could then form the z-score as $z = (U - \mu_U)/\sigma_U$ and compare it with the familiar critical values $\pm z_{\alpha/2}$ (equivalently, compare $U$ with $\mu_U \pm z_{\alpha/2}\,\sigma_U$), or simply use the standard t-test since we have a large sample.

Kolmogorov-Smirnov test

Another very useful non-parametric method is the Kolmogorov-Smirnov (K-S) test. It is a test for goodness of fit or shape, and is often used instead of the $\chi^2$ test. A big advantage of the K-S test over the $\chi^2$ test is that one does not have to bin the data, which is an arbitrary procedure anyway (how do you select bin size, and why?). In the K-S test we convert the data distribution to a cumulative distribution $S(x)$. $S(x)$ then gives the fraction of data points to the "left" of $x$.
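The cumulative step function $S(x)$, and its largest absolute gap from a reference curve (the quantity the K-S test is built on), can be sketched as follows. This is a minimal illustration with a function name of our own; the five normal scores are made up purely to exercise the code and are not the salinity data.

```python
from statistics import NormalDist


def ks_statistic(sample, reference_cdf):
    """Largest absolute gap between the sample's cumulative step function
    S(x) and reference_cdf, evaluated at the data points."""
    ordered = sorted(sample)
    n = len(ordered)
    largest = 0.0
    for i, x in enumerate(ordered):
        expected = reference_cdf(x)
        # S(x) jumps from i/n to (i + 1)/n at x, so check both sides of the step.
        largest = max(largest,
                      abs(i / n - expected),
                      abs((i + 1) / n - expected))
    return largest


# Hypothetical normal scores, made up for illustration only.
scores = [-1.2, -0.4, 0.0, 0.3, 1.5]
d = ks_statistic(scores, NormalDist().cdf)
print(f"D = {d:.3f}")
```

Checking both sides of each step matters because the step function is discontinuous at the data points; the maximum gap can occur just before or just after a jump. Critical values for D still have to come from a K-S table for the chosen significance level and sample size.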
While different data sets will in general have different distributions, all cumulative distributions agree at the smallest $x$ ($S(x) = 0$) and the largest $x$ ($S(x) = 1$). Thus, it is the behavior between these points that sets distributions apart. There is of course an infinite number of ways to measure the overall difference between two cumulative distributions: we could look at the absolute value of the area between the curves, the mean square difference, etc. The K-S statistic is very simple: it consists of the maximum value of the absolute difference between the two cumulative curves. Thus, comparing two cumulative distributions $S_1(x)$ and $S_2(x)$, the K-S statistic becomes

$$D = \max_{-\infty < x_i < \infty} \left| S_1(x_i) - S_2(x_i) \right| \tag{2.14}$$

Note that $S_2$ may be another empirical distribution or a given cumulative probability function like the normal distribution. The distribution of the K-S statistic itself can be calculated under the assumption that $S_1$ and $S_2$ are drawn from the same distribution, thus providing critical values for $D$.

We will use the K-S test on the salinity measurements we looked at previously. After computing the normal scores, we plot the cumulative function on the same graph as that of a normal cumulative distribution. Inspecting the graph we find the maximum absolute difference at $z = 0.37$, which corresponds to the 53 ppt sample. Based on a significance level of $\alpha = 0.10$ and $n = 48$, the critical K-S value is 0.17, much larger than the observed $D$. Hence we cannot reject the null hypothesis that the samples were collected from a normally distributed population.

Tests of Correlation Coefficients

There are both parametric and non-parametric tests for the linear correlation coefficient $r$. We will look at both kinds.

Traditional (Least-squares) Correlation

We recall that the conventional correlation coefficient was defined by

$$r = \frac{s_{xy}}{s_x s_y} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{2.15}$$

Often, we need to test if $r$ is significant. In such tests, $r$ is our sample-derived estimate of $\rho$, the actual correlation of the population.
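As a concrete instance of (2.15), the correlation coefficient can be computed directly from the sums of products of deviations. The sketch below uses a function name of our own and a small set of hypothetical paired values chosen only so that the arithmetic is easy to follow by hand.

```python
import math


def pearson_r(x, y):
    """Linear correlation coefficient following (2.15)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of products of deviations; the 1/(n - 1) factors cancel in the ratio.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired values, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)
print(f"r = {r:.2f}")
```

Here both means are 3, the cross sum is 8, and each squared-deviation sum is 10, so $r = 8/\sqrt{100} = 0.8$.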
The most useful null hypothesis is $H_0: \rho = 0$. It can be shown that the sampling distribution of $r$ for a population that has zero correlation ($\rho = 0$) has mean $\mu = 0$ and standard deviation $\sigma = \sqrt{(1 - r^2)/(n - 2)}$. Hence, a t-statistic can be calculated as

$$t = \frac{r}{\sigma} = \frac{r}{\sqrt{(1 - r^2)/(n - 2)}} = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} \tag{2.16}$$

The degrees of freedom, $\nu$, is $n - 2$. Suppose we rolled a pair of dice, one red and one green (Table 2-2). Using (2.15) we obtain $r = 0.66$, which seems quite high, especially since there is no reason to believe a correlation should exist at all. Let us test to see if the correlation is significant. Choosing $\alpha = 0.05$ we find the critical $t_{\alpha/2} = 3.18$. Applying (2.16) gives the observed $t = 1.52$; hence the correlation of 0.66 is most likely caused by random fluctuations of small samples and we cannot reject $H_0$.

Table 2-2. Examples of rolling a pair of dice. (Columns: Red (x), Green (y).)

How high would $r$ have to be for us to find it significant and commit a type I error by rejecting the (true) null hypothesis? We must solve for $r$ in

$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} \;\Rightarrow\; 3.18 = \frac{\sqrt{3}\, r}{\sqrt{1 - r^2}} \;\Rightarrow\; r = \pm 0.88$$

So, if $|r|$ equals or exceeds 0.88 we would find ourselves concluding that red and green dice give correlated pairs of values.

Non-parametric Correlation

Finally, we will look at non-parametric correlation, called rank correlation or Spearman's rank correlation, denoted by $r_s$. The rank correlation is carried out by ranking the $x_i$'s and $y_i$'s separately, then finding the difference in rank $d_i$ between $x_i$ and $y_i$ pairs, and evaluating $r_s$ as

$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} \tag{2.17}$$

In the case where the null hypothesis $H_0$ (no correlation) is true, the sampling distribution of $r_s$ has mean 0 and standard deviation $1/\sqrt{n - 1}$. We can therefore base our statistics on

$$z = \frac{r_s - 0}{1/\sqrt{n - 1}} = r_s \sqrt{n - 1} \tag{2.18}$$

and compare this z-value to critical z values. Ranking the dice data gives a table with columns Red (x), Rank x, Green (y), Rank y, and d. Using (2.17) we find $r_s = 0.65$ (surprisingly similar to what we found using (2.15)). The z-statistic from (2.18) becomes $z = 1.3$, which is well inside the 95% confidence limits for a normal distribution ($\pm 2$). Hence, we again arrive at the same conclusion that we cannot reject $H_0$.
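The arithmetic behind the dice example can be reproduced from (2.16) to (2.18). The sketch below is an illustration with function names of our own; it assumes $n = 5$ rolls, which is what the 3 degrees of freedom in the example imply, and the ranks passed to `spearman_rs` are hypothetical because the dice table itself is not reproduced here.

```python
import math


def t_from_r(r, n):
    """t-statistic for H0: rho = 0, following (2.16)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)


def critical_r(t_crit, n):
    """Smallest |r| that reaches t_crit, obtained by solving (2.16) for r."""
    return t_crit / math.sqrt(n - 2 + t_crit ** 2)


def spearman_rs(rank_x, rank_y):
    """Rank correlation from paired ranks, following (2.17)."""
    n = len(rank_x)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))


def z_from_rs(rs, n):
    """z-score for H0: no correlation, following (2.18)."""
    return rs * math.sqrt(n - 1)


n = 5  # assumed: implied by the 3 degrees of freedom in the dice example
print(round(t_from_r(0.66, n), 2))     # observed t for r = 0.66
print(round(critical_r(3.18, n), 2))   # |r| needed to reach t = 3.18
print(round(z_from_rs(0.65, n), 2))    # z for r_s = 0.65

# Hypothetical ranks, for illustration only.
print(spearman_rs([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```

Solving (2.16) for $r$ gives $r^2 = t^2 / (n - 2 + t^2)$, which is what `critical_r` evaluates; with $t = 3.18$ and $n = 5$ it recovers the threshold of about 0.88 found above.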
More information6 Sample Size Calculations
6 Sample Size Calculatios Oe of the major resposibilities of a cliical trial statisticia is to aid the ivestigators i determiig the sample size required to coduct a study The most commo procedure for determiig
More informationResampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.
Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator
More informationRandom Variables, Sampling and Estimation
Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig
More informationNotes on Hypothesis Testing, Type I and Type II Errors
Joatha Hore PA 818 Fall 6 Notes o Hypothesis Testig, Type I ad Type II Errors Part 1. Hypothesis Testig Suppose that a medical firm develops a ew medicie that it claims will lead to a higher mea cure rate.
More informationTopic 18: Composite Hypotheses
Toc 18: November, 211 Simple hypotheses limit us to a decisio betwee oe of two possible states of ature. This limitatio does ot allow us, uder the procedures of hypothesis testig to address the basic questio:
More informationDS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10
DS 00: Priciples ad Techiques of Data Sciece Date: April 3, 208 Name: Hypothesis Testig Discussio #0. Defie these terms below as they relate to hypothesis testig. a) Data Geeratio Model: Solutio: A set
More informationDirection: This test is worth 150 points. You are required to complete this test within 55 minutes.
Term Test 3 (Part A) November 1, 004 Name Math 6 Studet Number Directio: This test is worth 10 poits. You are required to complete this test withi miutes. I order to receive full credit, aswer each problem
More informationContinuous Data that can take on any real number (time/length) based on sample data. Categorical data can only be named or categorised
Questio 1. (Topics 1-3) A populatio cosists of all the members of a group about which you wat to draw a coclusio (Greek letters (μ, σ, Ν) are used) A sample is the portio of the populatio selected for
More informationDescribing the Relation between Two Variables
Copyright 010 Pearso Educatio, Ic. Tables ad Formulas for Sulliva, Statistics: Iformed Decisios Usig Data 010 Pearso Educatio, Ic Chapter Orgaizig ad Summarizig Data Relative frequecy = frequecy sum of
More informationExpectation and Variance of a random variable
Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio
More informationEstimation for Complete Data
Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of
More information2 1. The r.s., of size n2, from population 2 will be. 2 and 2. 2) The two populations are independent. This implies that all of the n1 n2
Chapter 8 Comparig Two Treatmets Iferece about Two Populatio Meas We wat to compare the meas of two populatios to see whether they differ. There are two situatios to cosider, as show i the followig examples:
More informationMATH/STAT 352: Lecture 15
MATH/STAT 352: Lecture 15 Sectios 5.2 ad 5.3. Large sample CI for a proportio ad small sample CI for a mea. 1 5.2: Cofidece Iterval for a Proportio Estimatig proportio of successes i a biomial experimet
More informationLecture 8: Non-parametric Comparison of Location. GENOME 560, Spring 2016 Doug Fowler, GS
Lecture 8: No-parametric Compariso of Locatio GENOME 560, Sprig 2016 Doug Fowler, GS (dfowler@uw.edu) 1 Review What do we mea by oparametric? What is a desirable locatio statistic for ordial data? What
More informationApril 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE
April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE TERRY SOO Abstract These otes are adapted from whe I taught Math 526 ad meat to give a quick itroductio to cofidece
More informationSample Size Determination (Two or More Samples)
Sample Sie Determiatio (Two or More Samples) STATGRAPHICS Rev. 963 Summary... Data Iput... Aalysis Summary... 5 Power Curve... 5 Calculatios... 6 Summary This procedure determies a suitable sample sie
More informationStat 319 Theory of Statistics (2) Exercises
Kig Saud Uiversity College of Sciece Statistics ad Operatios Research Departmet Stat 39 Theory of Statistics () Exercises Refereces:. Itroductio to Mathematical Statistics, Sixth Editio, by R. Hogg, J.
More informationLecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting
Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would
More informationInfinite Sequences and Series
Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet
More informationHypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance
Hypothesis Testig Empirically evaluatig accuracy of hypotheses: importat activity i ML. Three questios: Give observed accuracy over a sample set, how well does this estimate apply over additioal samples?
More informationInterval Estimation (Confidence Interval = C.I.): An interval estimate of some population parameter is an interval of the form (, ),
Cofidece Iterval Estimatio Problems Suppose we have a populatio with some ukow parameter(s). Example: Normal(,) ad are parameters. We eed to draw coclusios (make ifereces) about the ukow parameters. We
More informationSTAT431 Review. X = n. n )
STAT43 Review I. Results related to ormal distributio Expected value ad variace. (a) E(aXbY) = aex bey, Var(aXbY) = a VarX b VarY provided X ad Y are idepedet. Normal distributios: (a) Z N(, ) (b) X N(µ,
More informationModule 1 Fundamentals in statistics
Normal Distributio Repeated observatios that differ because of experimetal error ofte vary about some cetral value i a roughly symmetrical distributio i which small deviatios occur much more frequetly
More informationData Analysis and Statistical Methods Statistics 651
Data Aalysis ad Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasii/teachig.html Suhasii Subba Rao Review of testig: Example The admistrator of a ursig home wats to do a time ad motio
More informationMA238 Assignment 4 Solutions (part a)
(i) Sigle sample tests. Questio. MA38 Assigmet 4 Solutios (part a) (a) (b) (c) H 0 : = 50 sq. ft H A : < 50 sq. ft H 0 : = 3 mpg H A : > 3 mpg H 0 : = 5 mm H A : 5mm Questio. (i) What are the ull ad alterative
More informationFinal Examination Solutions 17/6/2010
The Islamic Uiversity of Gaza Faculty of Commerce epartmet of Ecoomics ad Political Scieces A Itroductio to Statistics Course (ECOE 30) Sprig Semester 009-00 Fial Eamiatio Solutios 7/6/00 Name: I: Istructor:
More information7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals
7-1 Chapter 4 Part I. Samplig Distributios ad Cofidece Itervals 1 7- Sectio 1. Samplig Distributio 7-3 Usig Statistics Statistical Iferece: Predict ad forecast values of populatio parameters... Test hypotheses
More informationA quick activity - Central Limit Theorem and Proportions. Lecture 21: Testing Proportions. Results from the GSS. Statistics and the General Population
A quick activity - Cetral Limit Theorem ad Proportios Lecture 21: Testig Proportios Statistics 10 Coli Rudel Flip a coi 30 times this is goig to get loud! Record the umber of heads you obtaied ad calculate
More information11 Correlation and Regression
11 Correlatio Regressio 11.1 Multivariate Data Ofte we look at data where several variables are recorded for the same idividuals or samplig uits. For example, at a coastal weather statio, we might record
More information4. Partial Sums and the Central Limit Theorem
1 of 10 7/16/2009 6:05 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 4. Partial Sums ad the Cetral Limit Theorem The cetral limit theorem ad the law of large umbers are the two fudametal theorems