Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)
|
|
- Lora Thompson
- 5 years ago
- Views:
Transcription
1 Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH : Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities Chi-Squared Testig Example: Twis Chi-Squared Test for a Specified Distributio Chi-Squared Tests with Estimated Parameters Chi-Squared for a Parametrized Distributio Two-Way Cotigecy Tables Perspective 1: Test for homogeeity Perspective 2: Test for idepedece Example Copyright 2019, Joh T. Whela, ad all that Tuesday 16 April Chi-Squared Tests with Kow Probabilities 1.1 Chi-Squared Testig The ext sort of iferece we cosider is kow as categorical data aalysis. Cosider a experimet with a set of k possible discrete outcomes. If we do idepedet repetitios of that experimet, we fid 1 that ed i outcome #1, 2 that ed i outcome #2, up to k edig i outcome #k, where each i is a o-egative iteger, ad 1 2 k. There are k differet categories ito which each observatio ca fall, ad we cout up the umbers of each. We have a model which tells us the expected probabilities p 1, p 2,..., p k of the differet outcomes, where 0 p i 1 ad p 1 p 2 p k 1. The expected umbers of observatios i each category accordig to the model are p 1, p 2,..., p k. (Note that these will i geeral ot be itegers.) If we cosider this model to be the ull hypothesis, we re iterested i a test which rejects the model if 1
2 the observed umber couts i the categories are very differet from their expected values. To see the test statistic, we take a slight diversio ad recall that if X 1,..., X k are idepedet radom ormal variables such that X i N(µ i, σ 2 i ), the U k (X i µ i ) 2 i1 σ 2 i k i1 Z i 2 (1.1) is a chi-square radom variable with k degrees of freedom, also kow as χ 2 (k). Similarly, oe of the cosequeces of Studet s theorem is that if {X i } is a sample from N(µ, σ 2 ) V k (X i X) 2 i1 σ 2 χ 2 (k 1) (1.2) as we saw whe we cosidered cofidece itervals for the populatio variace, if we have a statistic W χ 2 (ν), P (W χ 2 α,ν) α (1.3) so comparig the statistic value to χ 2 α,ν gives a test at sigificace level α of the origial model. To coect this to the categorical data aalysis problem, cosider the k 2 case, where there are two possible outcomes. The N 1 Bi(, p 1 ) ad N 2 N 1. If p 1 ad p 2 (1 p 1 ) are both more tha about 5, we ca treat the biomial radom variable N 1 as approximately N(p 1, p 1 p 2 ) so W (N 1 p 1 ) 2 p 1 p 2 (1.4) Is approximately χ 2 (1)-distributed. If we use a little algebra, specifically 1 p 1 p 2 1 p 1 1 p 2 ad N 1 p 1 (N 2 p 2 ), we ca show W (N 1 p 1 ) 2 (N 2 p 2 ) 2 (1.5) p 1 p 2 which is a form of the chi-squared statistic (with k 1 1 degree of freedom) which treats outcomes 1 ad 2 o equal footig. Now cosider the case where k > 2. The radom variables N 1, N 2,..., N k are ot idepedet, but obey what s called a multiomial distributio, whose pmf (for every possible combiatio of o-egative itegers with 1 2 k ) is p( 1,..., k )! 1! 2! k! p 1 1 p 2 2 p k k (1.6) I fact, the last radom variable N k N 1 N 2 N k 1 is redudat; all the iformatio is carried i the first k 1 variables. It turs out that, as log as all the expected umber couts {p i } (icludig p k ) are at least 5, the statistic W k (N i p i ) 2 i1 p i (1.7) writte i aalogy with (1.5) is a chi-squared radom variable with k 1 degrees of freedom. So the statistic value (to be compared to χ 2 α,k 1 ) for a specific set of umber couts is k ( i p i ) 2 (observed expected) 2 p i expected i Example: Twis (1.8) Cosider 50 sets of twis. If they are all frateral, ad boys ad girls are equally likely, we d expect o average 12.5 boy-boy pairs, 25 mixed pairs, ad 12.5 girl-girl pairs. obs exp
3 We ca calculate χ 2 (8 12.5) (23 25)2 25 ( ) (1.9) ad ote that χ 2.10, while χ 2.05, so this hypothesis would be rejected at the 10% level but ot the 5% level. 1.2 Chi-Squared Test for a Specified Distributio We ca also apply the chi-squared test to a histogram of data hypothesized to be a sample draw from a specified distributio. The the umber couts are the umber of observatios which fall ito each bi. For example, suppose we wat to test the hypothesis that a set of 80 observatios is a sample from a expoetial distributio with rate parameter λ l 2. The cdf is F (x) 1 e x l x (1.10) so the expected umber of observatios lyig betwee x 1 ad x 2 is. Suppose we divide ito four bis, ad obtai the 2 x 1 2 x 2 followig observed ad expected histogram. x < 1 1 x < 2 2 x < 3 3 x obs exp The chi-squared statistic will be χ 2 (40 40)2 40 (21 20) (14 10) (5 10) (1.11) We ca compare this to the percetiles of χ 2 (3) distributio. Sice χ 2.10, , we see that the P -value is greater tha 10%, ad we would ot reject this model at ay reasoable cofidece level. Note that we could choose differet bis tha the oes we did; i practice it s good to choose the bis to have similar umbers of expected observatios. Practice Problems 14.1, 14.3 Thursday 18 April Chi-Squared Tests with Estimated Parameters I both of the examples from last time, we assumed that the expected umbers for each category were kow, but that required a assumptio i each case. Cosiderig the problem of twi distributio, we assumed that half of all childre were male, leadig to expected umbers of /4 /2 /4 If the fractio of male childre is θ, this becomes θ 2 2θ(1 θ) (1 θ) 2 we ve replaced the fractio expected i category k, which was previously a fixed umber p i, with a fuctio π i (θ). Ad i priciple, we ca imagie that there would be ot just oe parameter θ, but a set of parameters θ {θ 1, θ 2,..., θ m }, where 3
4 m < k 1. The chi-square statistic would the be χ 2 (θ) k [ i π i (θ)] 2 i1 π i (θ) (2.1) To actually evaluate this for a test, we eed to put i a estimate ˆθ for the parameters. Oe choice would be to fid the oe which miimizes χ 2 (θ) for the observed { i }, but istead, we will use the maximum likelihood estimate, i.e., the oe which maximizes f( 1,..., k ; θ) [π 1 (θ)] 1 [π 2 (θ)] 2 [π k (θ)] k (2.2) or equivaletly l f( 1,..., k ; θ) k! i l π i (θ) l 1! k! i1 (2.3) where the 1,..., k we use are the actual observed values. For example, i the twi example, the log-likelihood is l f( 1,..., k ; θ) 2 1 l θ 2 [l θ l(1 θ))] 2 3 l(1 θ) cost (2 1 2 ) l θ ( ) l(1 θ) cost (2.4) where the costat depeds o 1, 2 ad 3, but ot θ. Differetiatig with respect to θ gives the maximum likelihood equatio which has the solutio ˆθ ˆθ ˆθ (2.5).39 (2.6) We ca plug that ito the formulas for the expected umbers of twis i each category (for 50) ad get obs exp θ 2 2θ(1 θ) (1 θ) 2 exp for θ The chi-squared is the χ 2 ( ) ( ) ( ) (2.7) However, sice we ve used the data to fit oe parameter, the percetiles we should compare it to are ot χ 2 (2) but χ 2 (2 1) χ 2 (1). I geeral, if we have k categories ad m fuctioally idepedet parameters (meaig o oe parameter ca be completely determied from the other m 1), the resultig statistic will be approximately χ 2 (k 1 m). 2.1 Chi-Squared for a Parametrized Distributio We ca also retur to the applicatio where we apply the chisquared test to a bied histogram to check a hypothesized distributio. Last time we gave a example where we observed the followig umber couts for a data sample of size 80: x < 1 1 x < 2 2 x < 3 3 x Last time we hypothesized a expoetial distributio with λ l 2, but suppose we allowed λ to be a ukow parameter. The the cdf of the distributio would be F (x) 1 e λx (2.8) ad the expected umber couts i the bis would be 4
5 x < 1 1 x < 2 2 x < 3 3 x (1 e λ ) (e λ e 2λ ) (e 2λ e 3λ ) e 3λ If we wated to follow the maximum likelihood prescriptio from the previous sectio, we would eed to fid the λ value which maximizes (1 e λ ) 1 (e λ e 2λ ) 2 (e 2λ e 3λ ) 3 (e 3λ ) 4 (2.9) This seems like a pretty complicated fuctio, although it turs out that it is maximized by settig λ to ˆλ l (2.10) ad the resultig statistic is approximately χ 2 (3)-distributed. But i geeral it may be difficult ad/or require a umerical method to fid the parameters ˆθ which maximize k i l[f (b i ; θ) F (a i ; θ)] (2.11) i1 where the ith histogram bi covers a i < x b i, which is what we d eed to get a statistic which was χ 2 distributed with k 1 m degrees of freedom. Istead, it is ofte coveiet to use some other poit estimate costructed ot just from the bied histogram, but from the full sample of data values. For istace, for the expoetial model we could use λ 1/x. If we do this, though the χ 2 value will geerally ot be as low as if we had really miimized over θ. This meas that the threshold c α for a test with false alarm probability α will be betwee the value χ 2 α,k 1 we d use if we had fixed the parameters arbitrarily ad the (lower) value χ 2 α,k 1 m we d use if we had miimized the χ 2 over the m parameters, i.e., χ 2 α,k 1 m c α χ 2 α,k 1 (2.12) Practice Problems 14.15, Tuesday 23 April Two-Way Cotigecy Tables We ow cosider aother categorical data aalysis problem, i which we have two sets of categories, ad each observatio falls ito oe category from the first set ad oe from the secod set. We d like to kow if the data idicate ay statistical relatioship betwee which of the first set of categories a observatio falls ito ad which of the secod. For istace, suppose we survey studets majorig i four disciplies about their food choices: Vega Vegetaria No-Veg Total Math & Stat Physics Chemistry Biology Total We d like to kow if the data idicate ay sigificat tedecies for studets i oe major to have oe diet or aother. There are two ways to pose the questio: 1. Is there ay differece i the tedecies of studets i oe major or aother to have a vega, vegetaria, or ovegetaria diet? (homogeeity) 2. Is there ay correlatio betwee the major chose by a studet ad their dietary choices? (idepedece) The two questios have differet iterpretatios i the framework of classical statistics, but they tur out to lead to idetical data aalysis procedures. 5
6 As a matter of otatio, we ll cosider the table to have I rows ad J colums, labelled by i 1, 2,..., I ad j 1, 2,..., J, respectively. (I the table above, I is 4 ad J is 3.) We ll call the observed umber i each cell (i, j). (Devore calls this N ij.) We ll write the total of the umbers i row i as (i, ) J j1 (i, j) (Devore: i ) ad i colum j as (, j) I i1 (i, j) (Devore: j). The total umber of observatios is I (i, ) i1 I i1 J (i, j) j1 J (, j) (3.1) j1 3.1 Perspective 1: Test for homogeeity I the first way of lookig at thigs, the umber (i, ) i each row is a give, ad for each i we cosider {N(i, j) j 1,... J} to be a multiomial radom vector with probabilities p(j i). (Devore calls this p ij.) Note that this is ot a coditioal probability accordig to the set theory defiitio of Devore Chapter Two, but it s the probability for a observatio which is i row i to fall ito colum j. These probabilities obey J j1 p(j i) 1 for each i. The most importat property of the multiomial for us is that E (N(i, j)) (i, )p(j i) (3.2) The ull hypothesis of homogeeity is that each of these I sets of J probabilities is the same: H 0 : p(j i) p(, j) for all I (3.3) We ca estimate this commo probability as ˆp(, j) (,j) so that the estimated expected umber of items i each cell is ê(i, j) (i, )ˆp(, j) (i, ) (, j) (3.4) We the use these estimated expected umbers to make a chisquared statistic for the whole table s divergece from homogeeity: I J χ 2 [(i, j) ê(i, j)] 2 (3.5) ê(i, j) i1 j1 We have I multiomial radom variables with J categories each, which meas we ve observed I(J 1) idepedet umbers. We ve estimated J probabilities, but oly J 1 of them were idepedet because they had to add to 1. Thus the umber of degrees of freedom for the chi-squared should be I(J 1) (J 1) (I 1)(J 1) (3.6) Note that although the formalism treats the rows ad colums rather differetly, the fial data aalysis prescriptio treats them symmetrically. 3.2 Perspective 2: Test for idepedece I the secod way of lookig at thigs, the oly thig we treat as give is the total umber of observatios, ad the observatio {N(i, j) i 1,... I; j 1,... J} is a (IJ-dimesioal) multiomial radom vector with probabilities p(i, j). (Devore also calls this p ij, which is oe reaso I wated to make the otatio more descriptive.) These probabilities obey I J i1 j1 p(i, j) 1. Now the multiomial says E (N(i, j)) p(i, j) (3.7) The ull hypothesis of idepedece is that the probabilities ca be decomposed assumig idepedece of the two sets of categories: H 0 : p(i, j) p(i, ) p(, j) (3.8) 6
7 We ca estimate the probabilties as ˆp(i, ) (i, ) ad ˆp(, j) (,j) so that the estimated expected umber of items i each cell is ê(i, j) ˆp(i, )ˆp(, j) (i, ) (, j) 2 (i, ) (, j) (3.9) which is exactly what we saw above 1 We the make the chisquared as before: χ 2 I i1 J [(i, j) ê(i, j)] 2 j1 ê(i, j) (3.10) We have a sigle multiomial with IJ categories ow, so there are IJ 1 idepedet observatios. We ve estimated I probabilities {ˆp(i, ), I 1 of which are idepedet, ad J probabilities {ˆp(, j), J 1 of which are idepedet, so we have Vega Vegetaria No-Veg Total Math & Stat Physics Chemistry Biology Total Note that it s really easy to do this i a simple spreadsheet program. If I costruct the chi-squared statistic I get I eed to compare this to a chi-squared distributio with (4 1)(3 1) (3)(2) 6 degrees of freedom, so 0.99 is rather low ideed ad the data show o sigificat correlatio. Practice Problems 14.27, 14.34, 14.35, IJ 1 (I 1) (J 1) IJ I J 1 (I 1)(J 1) (3.11) which, agai, is the same umber of degrees of freedom as the homogeeity test told us Example We retur to the example above. The estimates accordig to the model are (81)(52) 10.35, (81)(116) 23.09, etc.: Note that viewed as a statistic, the estimate from idepedece is N(i, ) N(,j) ê(i, j) while that from homogeeity is ê(i, j) but this distictio has o effect o the data aalysis prescriptio. (i, ) N(,j), 7
Tests of Hypotheses Based on a Single Sample (Devore Chapter Eight)
Tests of Hypotheses Based o a Sigle Sample Devore Chapter Eight MATH-252-01: Probability ad Statistics II Sprig 2018 Cotets 1 Hypothesis Tests illustrated with z-tests 1 1.1 Overview of Hypothesis Testig..........
More information( θ. sup θ Θ f X (x θ) = L. sup Pr (Λ (X) < c) = α. x : Λ (x) = sup θ H 0. sup θ Θ f X (x θ) = ) < c. NH : θ 1 = θ 2 against AH : θ 1 θ 2
82 CHAPTER 4. MAXIMUM IKEIHOOD ESTIMATION Defiitio: et X be a radom sample with joit p.m/d.f. f X x θ. The geeralised likelihood ratio test g.l.r.t. of the NH : θ H 0 agaist the alterative AH : θ H 1,
More informationChapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.
Chapter 22 Comparig Two Proportios Copyright 2010 Pearso Educatio, Ic. Comparig Two Proportios Comparisos betwee two percetages are much more commo tha questios about isolated percetages. Ad they are more
More informationTable 12.1: Contingency table. Feature b. 1 N 11 N 12 N 1b 2 N 21 N 22 N 2b. ... a N a1 N a2 N ab
Sectio 12 Tests of idepedece ad homogeeity I this lecture we will cosider a situatio whe our observatios are classified by two differet features ad we would like to test if these features are idepedet
More informationChapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.
Chapter 22 Comparig Two Proportios Copyright 2010, 2007, 2004 Pearso Educatio, Ic. Comparig Two Proportios Read the first two paragraphs of pg 504. Comparisos betwee two percetages are much more commo
More informationClass 27. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700
Class 7 Daiel B. Rowe, Ph.D. Departmet of Mathematics, Statistics, ad Computer Sciece Copyright 013 by D.B. Rowe 1 Ageda: Skip Recap Chapter 10.5 ad 10.6 Lecture Chapter 11.1-11. Review Chapters 9 ad 10
More informationRead through these prior to coming to the test and follow them when you take your test.
Math 143 Sprig 2012 Test 2 Iformatio 1 Test 2 will be give i class o Thursday April 5. Material Covered The test is cummulative, but will emphasize the recet material (Chapters 6 8, 10 11, ad Sectios 12.1
More informationLecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting
Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationLecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting
Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would
More informationComparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading
Topic 15 - Two Sample Iferece I STAT 511 Professor Bruce Craig Comparig Two Populatios Research ofte ivolves the compariso of two or more samples from differet populatios Graphical summaries provide visual
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationChapter 11: Asking and Answering Questions About the Difference of Two Proportions
Chapter 11: Askig ad Aswerig Questios About the Differece of Two Proportios These otes reflect material from our text, Statistics, Learig from Data, First Editio, by Roxy Peck, published by CENGAGE Learig,
More informationChapter 6 Principles of Data Reduction
Chapter 6 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 0 Chapter 6 Priciples of Data Reductio Sectio 6. Itroductio Goal: To summarize or reduce the data X, X,, X to get iformatio about a
More informationStatistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.
Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationChi-Squared Tests Math 6070, Spring 2006
Chi-Squared Tests Math 6070, Sprig 2006 Davar Khoshevisa Uiversity of Utah February XXX, 2006 Cotets MLE for Goodess-of Fit 2 2 The Multiomial Distributio 3 3 Applicatio to Goodess-of-Fit 6 3 Testig for
More informationDirection: This test is worth 150 points. You are required to complete this test within 55 minutes.
Term Test 3 (Part A) November 1, 004 Name Math 6 Studet Number Directio: This test is worth 10 poits. You are required to complete this test withi miutes. I order to receive full credit, aswer each problem
More informationStatistics 511 Additional Materials
Cofidece Itervals o mu Statistics 511 Additioal Materials This topic officially moves us from probability to statistics. We begi to discuss makig ifereces about the populatio. Oe way to differetiate probability
More informationSTA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:
STA 2023 Module 10 Comparig Two Proportios Learig Objectives Upo completig this module, you should be able to: 1. Perform large-sample ifereces (hypothesis test ad cofidece itervals) to compare two populatio
More informationEcon 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara
Poit Estimator Eco 325 Notes o Poit Estimator ad Cofidece Iterval 1 By Hiro Kasahara Parameter, Estimator, ad Estimate The ormal probability desity fuctio is fully characterized by two costats: populatio
More information5. Likelihood Ratio Tests
1 of 5 7/29/2009 3:16 PM Virtual Laboratories > 9. Hy pothesis Testig > 1 2 3 4 5 6 7 5. Likelihood Ratio Tests Prelimiaries As usual, our startig poit is a radom experimet with a uderlyig sample space,
More informationProblem Set 4 Due Oct, 12
EE226: Radom Processes i Systems Lecturer: Jea C. Walrad Problem Set 4 Due Oct, 12 Fall 06 GSI: Assae Gueye This problem set essetially reviews detectio theory ad hypothesis testig ad some basic otios
More informationThis is an introductory course in Analysis of Variance and Design of Experiments.
1 Notes for M 384E, Wedesday, Jauary 21, 2009 (Please ote: I will ot pass out hard-copy class otes i future classes. If there are writte class otes, they will be posted o the web by the ight before class
More informationt distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference
EXST30 Backgroud material Page From the textbook The Statistical Sleuth Mea [0]: I your text the word mea deotes a populatio mea (µ) while the work average deotes a sample average ( ). Variace [0]: The
More informationProbability and statistics: basic terms
Probability ad statistics: basic terms M. Veeraraghava August 203 A radom variable is a rule that assigs a umerical value to each possible outcome of a experimet. Outcomes of a experimet form the sample
More informationAgreement of CI and HT. Lecture 13 - Tests of Proportions. Example - Waiting Times
Sigificace level vs. cofidece level Agreemet of CI ad HT Lecture 13 - Tests of Proportios Sta102 / BME102 Coli Rudel October 15, 2014 Cofidece itervals ad hypothesis tests (almost) always agree, as log
More informationSTAT431 Review. X = n. n )
STAT43 Review I. Results related to ormal distributio Expected value ad variace. (a) E(aXbY) = aex bey, Var(aXbY) = a VarX b VarY provided X ad Y are idepedet. Normal distributios: (a) Z N(, ) (b) X N(µ,
More informationCommon Large/Small Sample Tests 1/55
Commo Large/Small Sample Tests 1/55 Test of Hypothesis for the Mea (σ Kow) Covert sample result ( x) to a z value Hypothesis Tests for µ Cosider the test H :μ = μ H 1 :μ > μ σ Kow (Assume the populatio
More informationResampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.
Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator
More informationEstimation for Complete Data
Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of
More informationMidtermII Review. Sta Fall Office Hours Wednesday 12:30-2:30pm Watch linear regression videos before lab on Thursday
Aoucemets MidtermII Review Sta 101 - Fall 2016 Duke Uiversity, Departmet of Statistical Sciece Office Hours Wedesday 12:30-2:30pm Watch liear regressio videos before lab o Thursday Dr. Abrahamse Slides
More informationStat 319 Theory of Statistics (2) Exercises
Kig Saud Uiversity College of Sciece Statistics ad Operatios Research Departmet Stat 39 Theory of Statistics () Exercises Refereces:. Itroductio to Mathematical Statistics, Sixth Editio, by R. Hogg, J.
More informationTAMS24: Notations and Formulas
TAMS4: Notatios ad Formulas Basic otatios ad defiitios X: radom variable stokastiska variabel Mea Vätevärde: µ = X = by Xiagfeg Yag kpx k, if X is discrete, xf Xxdx, if X is cotiuous Variace Varias: =
More informationUCLA STAT 110B Applied Statistics for Engineering and the Sciences
UCLA STAT 110B Applied Statistics for Egieerig ad the Scieces Istructor: Ivo Diov, Asst. Prof. I Statistics ad Neurology Teachig Assistats: Bria Ng, UCLA Statistics Uiversity of Califoria, Los Ageles,
More information1 Inferential Methods for Correlation and Regression Analysis
1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet
More informationLecture 3. Properties of Summary Statistics: Sampling Distribution
Lecture 3 Properties of Summary Statistics: Samplig Distributio Mai Theme How ca we use math to justify that our umerical summaries from the sample are good summaries of the populatio? Lecture Summary
More informationExpectation and Variance of a random variable
Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio
More informationRecall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y.
Testig Statistical Hypotheses Recall the study where we estimated the differece betwee mea systolic blood pressure levels of users of oral cotraceptives ad o-users, x - y. Such studies are sometimes viewed
More informationBig Picture. 5. Data, Estimates, and Models: quantifying the accuracy of estimates.
5. Data, Estimates, ad Models: quatifyig the accuracy of estimates. 5. Estimatig a Normal Mea 5.2 The Distributio of the Normal Sample Mea 5.3 Normal data, cofidece iterval for, kow 5.4 Normal data, cofidece
More informationLecture 7: Properties of Random Samples
Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ
More informationMath 152. Rumbos Fall Solutions to Review Problems for Exam #2. Number of Heads Frequency
Math 152. Rumbos Fall 2009 1 Solutios to Review Problems for Exam #2 1. I the book Experimetatio ad Measuremet, by W. J. Youde ad published by the by the Natioal Sciece Teachers Associatio i 1962, the
More informationEXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY
EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 016 MODULE : Statistical Iferece Time allowed: Three hours Cadidates should aswer FIVE questios. All questios carry equal marks. The umber
More informationA statistical method to determine sample size to estimate characteristic value of soil parameters
A statistical method to determie sample size to estimate characteristic value of soil parameters Y. Hojo, B. Setiawa 2 ad M. Suzuki 3 Abstract Sample size is a importat factor to be cosidered i determiig
More informationGG313 GEOLOGICAL DATA ANALYSIS
GG313 GEOLOGICAL DATA ANALYSIS 1 Testig Hypothesis GG313 GEOLOGICAL DATA ANALYSIS LECTURE NOTES PAUL WESSEL SECTION TESTING OF HYPOTHESES Much of statistics is cocered with testig hypothesis agaist data
More informationApril 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE
April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE TERRY SOO Abstract These otes are adapted from whe I taught Math 526 ad meat to give a quick itroductio to cofidece
More informationThe Random Walk For Dummies
The Radom Walk For Dummies Richard A Mote Abstract We look at the priciples goverig the oe-dimesioal discrete radom walk First we review five basic cocepts of probability theory The we cosider the Beroulli
More information1.010 Uncertainty in Engineering Fall 2008
MIT OpeCourseWare http://ocw.mit.edu.00 Ucertaity i Egieerig Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu.terms. .00 - Brief Notes # 9 Poit ad Iterval
More informationMATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4
MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.
More informationLecture Notes 15 Hypothesis Testing (Chapter 10)
1 Itroductio Lecture Notes 15 Hypothesis Testig Chapter 10) Let X 1,..., X p θ x). Suppose we we wat to kow if θ = θ 0 or ot, where θ 0 is a specific value of θ. For example, if we are flippig a coi, we
More informationA quick activity - Central Limit Theorem and Proportions. Lecture 21: Testing Proportions. Results from the GSS. Statistics and the General Population
A quick activity - Cetral Limit Theorem ad Proportios Lecture 21: Testig Proportios Statistics 10 Coli Rudel Flip a coi 30 times this is goig to get loud! Record the umber of heads you obtaied ad calculate
More informationBecause it tests for differences between multiple pairs of means in one test, it is called an omnibus test.
Math 308 Sprig 018 Classes 19 ad 0: Aalysis of Variace (ANOVA) Page 1 of 6 Itroductio ANOVA is a statistical procedure for determiig whether three or more sample meas were draw from populatios with equal
More informationDS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10
DS 00: Priciples ad Techiques of Data Sciece Date: April 3, 208 Name: Hypothesis Testig Discussio #0. Defie these terms below as they relate to hypothesis testig. a) Data Geeratio Model: Solutio: A set
More informationUnderstanding Samples
1 Will Moroe CS 109 Samplig ad Bootstrappig Lecture Notes #17 August 2, 2017 Based o a hadout by Chris Piech I this chapter we are goig to talk about statistics calculated o samples from a populatio. We
More informationAgenda: Recap. Lecture. Chapter 12. Homework. Chapt 12 #1, 2, 3 SAS Problems 3 & 4 by hand. Marquette University MATH 4740/MSCS 5740
Ageda: Recap. Lecture. Chapter Homework. Chapt #,, 3 SAS Problems 3 & 4 by had. Copyright 06 by D.B. Rowe Recap. 6: Statistical Iferece: Procedures for μ -μ 6. Statistical Iferece Cocerig μ -μ Recall yes
More informationLecture 6 Simple alternatives and the Neyman-Pearson lemma
STATS 00: Itroductio to Statistical Iferece Autum 06 Lecture 6 Simple alteratives ad the Neyma-Pearso lemma Last lecture, we discussed a umber of ways to costruct test statistics for testig a simple ull
More informationSTATISTICAL INFERENCE
STATISTICAL INFERENCE POPULATION AND SAMPLE Populatio = all elemets of iterest Characterized by a distributio F with some parameter θ Sample = the data X 1,..., X, selected subset of the populatio = sample
More informationChapter 13: Tests of Hypothesis Section 13.1 Introduction
Chapter 13: Tests of Hypothesis Sectio 13.1 Itroductio RECAP: Chapter 1 discussed the Likelihood Ratio Method as a geeral approach to fid good test procedures. Testig for the Normal Mea Example, discussed
More informationSummary. Recap ... Last Lecture. Summary. Theorem
Last Lecture Biostatistics 602 - Statistical Iferece Lecture 23 Hyu Mi Kag April 11th, 2013 What is p-value? What is the advatage of p-value compared to hypothesis testig procedure with size α? How ca
More information(7 One- and Two-Sample Estimation Problem )
34 Stat Lecture Notes (7 Oe- ad Two-Sample Estimatio Problem ) ( Book*: Chapter 8,pg65) Probability& Statistics for Egieers & Scietists By Walpole, Myers, Myers, Ye Estimatio 1 ) ( ˆ S P i i Poit estimate:
More information1 Constructing and Interpreting a Confidence Interval
Itroductory Applied Ecoometrics EEP/IAS 118 Sprig 2014 WARM UP: Match the terms i the table with the correct formula: Adrew Crae-Droesch Sectio #6 5 March 2014 ˆ Let X be a radom variable with mea µ ad
More informationData Analysis and Statistical Methods Statistics 651
Data Aalysis ad Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasii/teachig.html Suhasii Subba Rao Review of testig: Example The admistrator of a ursig home wats to do a time ad motio
More informationClass 23. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700
Class 23 Daiel B. Rowe, Ph.D. Departmet of Mathematics, Statistics, ad Computer Sciece Copyright 2017 by D.B. Rowe 1 Ageda: Recap Chapter 9.1 Lecture Chapter 9.2 Review Exam 6 Problem Solvig Sessio. 2
More informationFACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures
FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals
More informationLet us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f.
Lecture 5 Let us give oe more example of MLE. Example 3. The uiform distributio U[0, ] o the iterval [0, ] has p.d.f. { 1 f(x =, 0 x, 0, otherwise The likelihood fuctio ϕ( = f(x i = 1 I(X 1,..., X [0,
More informationLinear Regression Demystified
Liear Regressio Demystified Liear regressio is a importat subject i statistics. I elemetary statistics courses, formulae related to liear regressio are ofte stated without derivatio. This ote iteds to
More informationRandom Variables, Sampling and Estimation
Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig
More informationStatistics 20: Final Exam Solutions Summer Session 2007
1. 20 poits Testig for Diabetes. Statistics 20: Fial Exam Solutios Summer Sessio 2007 (a) 3 poits Give estimates for the sesitivity of Test I ad of Test II. Solutio: 156 patiets out of total 223 patiets
More informationBIOS 4110: Introduction to Biostatistics. Breheny. Lab #9
BIOS 4110: Itroductio to Biostatistics Brehey Lab #9 The Cetral Limit Theorem is very importat i the realm of statistics, ad today's lab will explore the applicatio of it i both categorical ad cotiuous
More informationChapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE. Part 3: Summary of CI for µ Confidence Interval for a Population Proportion p
Chapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE Part 3: Summary of CI for µ Cofidece Iterval for a Populatio Proportio p Sectio 8-4 Summary for creatig a 100(1-α)% CI for µ: Whe σ 2 is kow ad paret
More informationStat 200 -Testing Summary Page 1
Stat 00 -Testig Summary Page 1 Mathematicias are like Frechme; whatever you say to them, they traslate it ito their ow laguage ad forthwith it is somethig etirely differet Goethe 1 Large Sample Cofidece
More informationPower and Type II Error
Statistical Methods I (EXST 7005) Page 57 Power ad Type II Error Sice we do't actually kow the value of the true mea (or we would't be hypothesizig somethig else), we caot kow i practice the type II error
More information1 Models for Matched Pairs
1 Models for Matched Pairs Matched pairs occur whe we aalyse samples such that for each measuremet i oe of the samples there is a measuremet i the other sample that directly relates to the measuremet i
More informationA sequence of numbers is a function whose domain is the positive integers. We can see that the sequence
Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece,, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet as
More informationCS284A: Representations and Algorithms in Molecular Biology
CS284A: Represetatios ad Algorithms i Molecular Biology Scribe Notes o Lectures 3 & 4: Motif Discovery via Eumeratio & Motif Represetatio Usig Positio Weight Matrix Joshua Gervi Based o presetatios by
More informationLast Lecture. Wald Test
Last Lecture Biostatistics 602 - Statistical Iferece Lecture 22 Hyu Mi Kag April 9th, 2013 Is the exact distributio of LRT statistic typically easy to obtai? How about its asymptotic distributio? For testig
More informationMOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND.
XI-1 (1074) MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. R. E. D. WOOLSEY AND H. S. SWANSON XI-2 (1075) STATISTICAL DECISION MAKING Advaced
More informationStatistical Properties of OLS estimators
1 Statistical Properties of OLS estimators Liear Model: Y i = β 0 + β 1 X i + u i OLS estimators: β 0 = Y β 1X β 1 = Best Liear Ubiased Estimator (BLUE) Liear Estimator: β 0 ad β 1 are liear fuctio of
More informationRecurrence Relations
Recurrece Relatios Aalysis of recursive algorithms, such as: it factorial (it ) { if (==0) retur ; else retur ( * factorial(-)); } Let t be the umber of multiplicatios eeded to calculate factorial(). The
More informationFrequentist Inference
Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for
More informationSample Size Determination (Two or More Samples)
Sample Sie Determiatio (Two or More Samples) STATGRAPHICS Rev. 963 Summary... Data Iput... Aalysis Summary... 5 Power Curve... 5 Calculatios... 6 Summary This procedure determies a suitable sample sie
More information10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random
Part III. Areal Data Aalysis 0. Comparative Tests amog Spatial Regressio Models While the otio of relative likelihood values for differet models is somewhat difficult to iterpret directly (as metioed above),
More information2 1. The r.s., of size n2, from population 2 will be. 2 and 2. 2) The two populations are independent. This implies that all of the n1 n2
Chapter 8 Comparig Two Treatmets Iferece about Two Populatio Meas We wat to compare the meas of two populatios to see whether they differ. There are two situatios to cosider, as show i the followig examples:
More information6.3 Testing Series With Positive Terms
6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial
More informationDiscrete Mathematics for CS Spring 2008 David Wagner Note 22
CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig
More informationTMA4245 Statistics. Corrected 30 May and 4 June Norwegian University of Science and Technology Department of Mathematical Sciences.
Norwegia Uiversity of Sciece ad Techology Departmet of Mathematical Scieces Corrected 3 May ad 4 Jue Solutios TMA445 Statistics Saturday 6 May 9: 3: Problem Sow desity a The probability is.9.5 6x x dx
More informationEcon 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.
Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chi-square Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio
More informationProperties and Hypothesis Testing
Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS
MASSACHUSTTS INSTITUT OF TCHNOLOGY 6.436J/5.085J Fall 2008 Lecture 9 /7/2008 LAWS OF LARG NUMBRS II Cotets. The strog law of large umbers 2. The Cheroff boud TH STRONG LAW OF LARG NUMBRS While the weak
More informationDepartment of Mathematics
Departmet of Mathematics Ma 3/103 KC Border Itroductio to Probability ad Statistics Witer 2017 Lecture 19: Estimatio II Relevat textbook passages: Larse Marx [1]: Sectios 5.2 5.7 19.1 The method of momets
More informationHypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance
Hypothesis Testig Empirically evaluatig accuracy of hypotheses: importat activity i ML. Three questios: Give observed accuracy over a sample set, how well does this estimate apply over additioal samples?
More informationSequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence
Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece 1, 1, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet
More informationInterval Estimation (Confidence Interval = C.I.): An interval estimate of some population parameter is an interval of the form (, ),
Cofidece Iterval Estimatio Problems Suppose we have a populatio with some ukow parameter(s). Example: Normal(,) ad are parameters. We eed to draw coclusios (make ifereces) about the ukow parameters. We
More informationCS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 5
CS434a/54a: Patter Recogitio Prof. Olga Veksler Lecture 5 Today Itroductio to parameter estimatio Two methods for parameter estimatio Maimum Likelihood Estimatio Bayesia Estimatio Itroducto Bayesia Decisio
More informationRegression, Part I. A) Correlation describes the relationship between two variables, where neither is independent or a predictor.
Regressio, Part I I. Differece from correlatio. II. Basic idea: A) Correlatio describes the relatioship betwee two variables, where either is idepedet or a predictor. - I correlatio, it would be irrelevat
More informationMath 140 Introductory Statistics
8.2 Testig a Proportio Math 1 Itroductory Statistics Professor B. Abrego Lecture 15 Sectios 8.2 People ofte make decisios with data by comparig the results from a sample to some predetermied stadard. These
More informationECE 901 Lecture 12: Complexity Regularization and the Squared Loss
ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality
More informationChapter 6 Sampling Distributions
Chapter 6 Samplig Distributios 1 I most experimets, we have more tha oe measuremet for ay give variable, each measuremet beig associated with oe radomly selected a member of a populatio. Hece we eed to
More information1 Introduction to reducing variance in Monte Carlo simulations
Copyright c 010 by Karl Sigma 1 Itroductio to reducig variace i Mote Carlo simulatios 11 Review of cofidece itervals for estimatig a mea I statistics, we estimate a ukow mea µ = E(X) of a distributio by
More informationConfidence intervals summary Conservative and approximate confidence intervals for a binomial p Examples. MATH1005 Statistics. Lecture 24. M.
MATH1005 Statistics Lecture 24 M. Stewart School of Mathematics ad Statistics Uiversity of Sydey Outlie Cofidece itervals summary Coservative ad approximate cofidece itervals for a biomial p The aïve iterval
More information