$\chi^2(m-1-d)$ distribution, where $d$ is the number of parameter MLE estimates made.


MATH 2 Goodness of Fit Part 1

Let $x_1, x_2, \ldots, x_n$ be a random sample of measurements that have a specified range and distribution. Divide the range of measurements into $m$ bins and let $f_1, \ldots, f_m$ denote the frequencies of the sample points occurring in each bin. Let $e_1, \ldots, e_m$ denote the theoretical number of measurements that should occur in each bin if the sample were a perfect fit of the specified distribution. Then for large samples

$$x = \sum_{k=1}^{m} \frac{(f_k - e_k)^2}{e_k}$$

follows an approximate $\chi^2(m-1)$ distribution. This value $x$ is the Pearson Chi-Square Test Statistic. The P-value is always the right-tail value $P(\chi^2(m-1) \ge x)$.

If distribution parameters, such as the mean or standard deviation, are not specified but must be estimated from the sample data, then the test statistic follows a $\chi^2(m-1-d)$ distribution, where $d$ is the number of parameter MLE estimates made.

Example 1. A Physics department claims that the scores on its standardized tests are uniformly distributed, with the same proportions scoring in the ranges

A [88, 100]   B [75, 88)   C [65, 75)   D [50, 65)   F [0, 50)

But over the last two exams, with a total of 240 papers, the distribution of scores was 33 A's, 40 B's, 42 C's, 48 D's, and 77 F's. Is there significant evidence that the grades are not really uniformly distributed?

Solution. If the data were uniformly distributed over these 5 bins, then there should be an equal number of scores in each range. So there should be an expected value of $e_k = 240/5 = 48$ in each bin.

              A    B    C    D    F
freq: f_k    33   40   42   48   77
exp:  e_k    48   48   48   48   48

The Pearson test statistic is

$$x = \sum_{k=1}^{5} \frac{(f_k - e_k)^2}{e_k} = \frac{(33-48)^2}{48} + \frac{(40-48)^2}{48} + \frac{(42-48)^2}{48} + \frac{(48-48)^2}{48} + \frac{(77-48)^2}{48} = \frac{1166}{48} \approx 24.29,$$

which should be compared to a $\chi^2(5-1-0) = \chi^2(4)$ distribution.
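The arithmetic in Example 1 can be checked with a short Python sketch. It is only an illustration under stated assumptions: the D count of 48 is inferred from the total of 240 papers, and `chi2_sf` and `pearson_statistic` are hypothetical helper names, with the right tail computed from a standard incomplete-gamma recurrence rather than a calculator's χ²cdf command.

```python
import math

def chi2_sf(x, df):
    """Right-tail probability P(chi2(df) >= x) for a positive integer df."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Closed-form starting values: df = 2 gives exp(-h), df = 1 gives erfc(sqrt(h)).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while 2 * a < df:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1))
        a += 1.0
    return q

def pearson_statistic(observed, expected):
    """Sum of (f_k - e_k)^2 / e_k over all bins."""
    return sum((f - e) ** 2 / e for f, e in zip(observed, expected))

# Grades A, B, C, D, F; the D count (48) is inferred from the total of 240.
observed = [33, 40, 42, 48, 77]
expected = [sum(observed) / len(observed)] * len(observed)  # 48 in each bin

x = pearson_statistic(observed, expected)
df = len(observed) - 1 - 0          # m - 1 - d with no estimated parameters
p_value = chi2_sf(x, df)

print(f"x = {x:.4f}, df = {df}, P-value = {p_value:.6f}")
```

The same two helpers apply to any goodness-of-fit problem once the observed and expected counts are listed bin by bin.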

If these data really were uniformly distributed as specified over the 5 ranges, then most test statistics would be near the middle of the $\chi^2(4)$ distribution, just as most normal measurements are within a standard deviation of the average. There would then be only a small chance of obtaining a large test statistic. But when there are large differences between what should occur ($e_k$) and what did occur ($f_k$), the test statistic will be large.

So for our test statistic $x \approx 24.29$, we compute $P(\chi^2(4) \ge 24.29)$, the right-tail probability that becomes the P-value. If the P-value is small (generally less than 0.10), then we have evidence to state that the data did not come from the stated distribution. Using the command χ²cdf(24.29, 1E99, 4), we obtain a P-value of about 0.00007. Thus we can say: If the grades really were uniformly distributed, then we would have only about a 0.00007 probability of obtaining frequencies $f_k$ from 240 grades that differ so much from the expected values $e_k$ in these 5 bins. This very low P-value gives us strong evidence to reject the claim that the grades are uniformly distributed.

Example 2. Among Republicans, reported preferences for the 2016 Presidential election are:

Donald Trump   Rand Paul   Ted Cruz   Ben Carson
    35%           15%         30%        20%

However, an independent poll of 900 Republicans gave the following preferences:

[Table: poll counts for Donald Trump, Rand Paul, Ted Cruz, and Ben Carson; entries not preserved.]

Does the poll give evidence to reject the reported preferences? Do a chi-square test of fit, give the P-value, and give a conclusion.

Solution. If the reported preferences were correct, then the expected numbers of preferences for each candidate in a poll of 900 people would be

                    Trump   Paul   Cruz   Carson
e_k = 900 × pct      315    135    270     180

We now have the actual frequencies $f_k$ and the expected results $e_k$ assuming the reported preferences were true:

[Table: Trump, Paul, Cruz, Carson with observed frequencies $f_k$ (not preserved) and expected counts $e_k$ = 315, 135, 270, 180.]

The Pearson test statistic is

$$x = \sum_{k=1}^{4} \frac{(f_k - e_k)^2}{e_k} \approx 3.0754,$$

which should be compared to a $\chi^2(4-1-0) = \chi^2(3)$ distribution. Using the command χ²cdf(3.0754, 1E99, 3), we obtain a P-value of about 0.38. This relatively high P-value means that the data is not a terrible fit of the specified distribution. Thus we can say: If the reported preferences were true, then we would have a 38% chance of obtaining frequencies from 900 people that differ as much as ours do from the expected numbers on these four candidates. We do not have enough evidence to reject the report.

Example 3. A completely random survey of 200 adults in Kentucky gave the following results:

          Smoker   Non-Smoker
Male        40         56
Female      44         60

Use goodness of fit to test the hypothesis that the proportion of smokers is the same among males as among females.

Solution. Let $p_1$ be the true proportion of smokers among males, and let $p_2$ be the true proportion among females. Then $\hat{p}_1 = P(S \mid M) = 40/96 \approx 0.4167$ and $\hat{p}_2 = P(S \mid F) = 44/104 \approx 0.4231$. These proportions seem very close. Assuming the true proportions $p_1$ and $p_2$ are equal, the pooled estimate for the proportion of smokers is $\hat{p} = 84/200 = 0.42$. This value of $\hat{p} = 0.42$ gives us one MLE estimate from the data. (The proportion of non-smokers is then automatically 0.58; it does not count as an additional population estimate.)

Because we had a completely random survey (and not one pre-stratified according to a known male/female breakdown), we can also estimate the proportions of males and females in the population. In this case, $P(F) \approx 104/200 = 0.52$ (and hence $P(M) \approx 0.48$). Thus, we have another MLE estimate.
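As a small check on the two estimates in Example 3 and the counts they imply, the following Python sketch uses only the totals quoted in the text (200 adults, 84 smokers, 104 females). The variable names are illustrative, not part of the original notes.

```python
n = 200          # adults surveyed
smokers = 84     # total smokers in the sample
females = 104    # total females in the sample

# The two MLE estimates taken from the data.
p_smoke = smokers / n        # pooled proportion of smokers, 0.42
p_female = females / n       # proportion of females, 0.52
p_male = 1 - p_female        # follows automatically, not a further estimate

# Expected counts if smoking is unrelated to sex (equal smoking proportions).
expected = {
    ("M", "S"): n * p_male * p_smoke,
    ("M", "N"): n * p_male * (1 - p_smoke),
    ("F", "S"): n * p_female * p_smoke,
    ("F", "N"): n * p_female * (1 - p_smoke),
}

estimated_parameters = 2
df = len(expected) - 1 - estimated_parameters   # 4 - 1 - 2

for cell, value in expected.items():
    print(cell, round(value, 2))
print("degrees of freedom:", df)
```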

Now if the true proportion of smokers were the same among males as among females, and is estimated to be about $\hat{p} = 0.42$, then what results should we have expected in our survey?

Expected $e_k$:
          S                          N
M    200(0.48)(0.42) = 40.32    200(0.48)(0.58) = 55.68
F    200(0.52)(0.42) = 43.68    200(0.52)(0.58) = 60.32

Obtained $f_k$:
          S     N
M        40    56
F        44    60

In each of the 4 bins, the difference between expected and actual is 0.32. So the Pearson test statistic is

$$x = \sum_{k=1}^{4} \frac{(f_k - e_k)^2}{e_k} = \frac{0.32^2}{40.32} + \frac{0.32^2}{55.68} + \frac{0.32^2}{43.68} + \frac{0.32^2}{60.32} \approx 0.00842,$$

which should be compared to a $\chi^2(4-1-2) = \chi^2(1)$ distribution (2 MLE estimates are used). Then using the command χ²cdf(.00842, 1E99, 1), we obtain a P-value of about 0.9269. Because of the high P-value, the data is almost a perfect fit of the expected distribution given that the real proportion of smokers is 0.42.

Using the Two-Sided 2-Proportion Z-Test

If we test $H_0: p_1 = p_2$ with a two-sided alternative $H_a: p_1 \ne p_2$, then we obtain the exact same P-value of about 0.9269. In this case, the z test statistic is $z \approx -0.09176$. But note that $(-0.09176)^2 \approx 0.00842$, which is the value of the Pearson chi-square test statistic. Indeed, by definition, $Z^2 \sim \chi^2(1)$ when $Z \sim N(0, 1)$. So the goodness of fit test for two proportions is equivalent to the two-sided 2-Proportion Z-Test.
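The equivalence between the chi-square fit test and the two-sided 2-proportion z-test can be verified numerically. The sketch below is illustrative only: the observed counts (40, 56, 44, 60) are taken as the integer table consistent with the quoted totals (84 smokers, 104 females, 200 adults) and the statistic 0.00842, so treat them as an assumption. It uses the identities $P(\chi^2(1) \ge x) = \operatorname{erfc}(\sqrt{x/2})$ and $2(1 - \Phi(|z|)) = \operatorname{erfc}(|z|/\sqrt{2})$.

```python
import math

# Assumed observed counts: (smokers, non-smokers) for males and females.
male_s, male_n = 40, 56
female_s, female_n = 44, 60

n_m = male_s + male_n
n_f = female_s + female_n
n = n_m + n_f
p_pool = (male_s + female_s) / n            # pooled smoker proportion, 0.42

# Pearson statistic against expected counts under equal smoking proportions.
observed = [male_s, male_n, female_s, female_n]
expected = [n_m * p_pool, n_m * (1 - p_pool), n_f * p_pool, n_f * (1 - p_pool)]
x = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
p_chi2 = math.erfc(math.sqrt(x / 2))        # right tail of chi2(1)

# Two-sided 2-proportion z-test with the pooled standard error.
p1, p2 = male_s / n_m, female_s / n_f
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_m + 1 / n_f))
z = (p1 - p2) / se
p_z = math.erfc(abs(z) / math.sqrt(2))      # 2 * (1 - Phi(|z|))

print(f"x = {x:.5f}, z = {z:.5f}, z^2 = {z * z:.5f}")
print(f"chi-square P-value = {p_chi2:.4f}, z-test P-value = {p_z:.4f}")
```

Because $z^2$ equals the Pearson statistic for a 2×2 table with pooled estimates, the two P-values agree up to floating-point error.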

Poisson Fit Test

Many phenomena are modeled by a Poisson distribution, often because of empirical evidence, but sometimes just for mathematical simplification. The occurrences can also be distributed spatially, or otherwise, and not just measured during time intervals. The following examples show how to test whether data actually follow a Poisson distribution.

Example 4. In the bacterium E. coli, a mutant variety is resistant to the drug streptomycin. Do the occurrences of mutant resistant colonies follow a Poisson distribution?

Experiment: 150 Petri dishes were plated with one million bacteria each. Below are the results on how many dishes formed each number of resistant colonies.

[Table: # of resistant colonies vs. # of dishes; entries not preserved.]

Does the data come from a Poisson distribution? If so, then what is the best estimate for $\lambda$? For this $\lambda$, what would be the expected number of dishes $e_k$ forming each number of resistant colonies in the above table for $k = 0, 1, 2, \ldots$?

Solution. Here we use the MLE estimate of the Poisson average $\lambda$, which is the sample average number of resistant colonies that formed. Thus, we have $\hat{\lambda} = 69/150 = 0.46$. Now if $\lambda = 0.46$, then for $k = 0, 1, 2, 3$, we have

$$e_k = 150 \cdot \frac{0.46^k \, e^{-0.46}}{k!}.$$

But for the last bin, we use $e_4 = 150 \, P(X \ge 4) = 150 - e_0 - e_1 - e_2 - e_3$. We then have

[Table: # of resistant colonies (0, 1, 2, 3, 4 or more) with # of dishes $f_k$ and expected counts $e_k$; entries not preserved.]

Does there appear to be a significant difference between what did occur and what should occur if the distribution really were Poi(0.46)? We now test with the Pearson test statistic.
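The expected counts for Example 4 follow directly from the formulas above. A minimal Python sketch, where `poisson_expected` is a hypothetical helper name and the inputs (150 dishes, $\lambda = 0.46$, last bin "4 or more") come from the text:

```python
import math

def poisson_expected(n, lam, last_bin):
    """Expected counts for bins 0, 1, ..., last_bin - 1 and 'last_bin or more'."""
    counts = [n * lam ** k * math.exp(-lam) / math.factorial(k)
              for k in range(last_bin)]
    # The final bin takes the remainder of the distribution: n * P(X >= last_bin).
    counts.append(n - sum(counts))
    return counts

expected = poisson_expected(n=150, lam=0.46, last_bin=4)
for k, e in enumerate(expected):
    label = f"{k} or more" if k == len(expected) - 1 else str(k)
    print(f"{label:>10}: {e:.4f}")
```

Assigning the remainder to the last bin guarantees that the expected counts sum to the sample size, which the chi-square approximation assumes.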

We now have a test statistic of

$$x = \sum_{k} \frac{(f_k - e_k)^2}{e_k}.$$

Since we have 5 bins and 1 MLE in use, we use a $\chi^2(5-1-1) = \chi^2(3)$ curve to obtain a P-value of $P(\chi^2(3) \ge x) \approx 0.135$. If the data were from a Poi(0.46) distribution, then we would have a 13.5% chance of obtaining frequencies from 150 observations that differ as much as ours do from the expected counts in the 5 bin ranges. We do not have enough evidence to reject a Poi(0.46) distribution.

Below we do the computations on a TI in order to have less round-off error:

1. Enter the range and frequencies into L1 and L2.
2. 1-Var Stats L1, L2 computes $\bar{x}$, so $\hat{\lambda} = \bar{x} = 0.46$.
3. Store the expected counts into L3.
4. Use Stat Edit to adjust the last bin in L3: edit L3(5) to 150 minus the sum of the other expected counts.
5. Compute the error terms of the test statistic and store them in L4.
6. Compute stats on L4. The sum $\Sigma x$ is the test statistic.
7. χ²cdf(test statistic, 1E99, 3) gives the P-value.

Note: The last bin contributes the most error to the test statistic even though it has only one measurement. To avoid this problem, we could combine the last two bins into one bin. With 4 bins and 1 MLE, we would then use a $\chi^2(4-1-1) = \chi^2(2)$ curve to compute the P-value.
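The full Poisson fit test, including the variant with the last two bins combined, might look like the following in Python. This is a sketch under loud assumptions: the dish counts in the table did not survive, so `observed` below holds hypothetical illustrative counts chosen only to be consistent with 150 dishes and $\hat{\lambda} = 0.46$; they should not be read as the original data. The helper names are also hypothetical.

```python
import math

def chi2_sf(x, df):
    """Right-tail probability P(chi2(df) >= x) for a positive integer df."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    # Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while 2 * a < df:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1))
        a += 1.0
    return q

def pearson_statistic(observed, expected):
    return sum((f - e) ** 2 / e for f, e in zip(observed, expected))

# HYPOTHETICAL counts for 0, 1, 2, 3, and "4 or more" colonies (not the original table).
observed = [98, 40, 8, 3, 1]
n = sum(observed)                                        # 150 dishes
lam = sum(k * f for k, f in enumerate(observed)) / n     # sample mean, 0.46

expected = [n * lam ** k * math.exp(-lam) / math.factorial(k) for k in range(4)]
expected.append(n - sum(expected))                       # remainder for the last bin

x = pearson_statistic(observed, expected)
p_value = chi2_sf(x, len(observed) - 1 - 1)              # 5 bins, 1 MLE -> df = 3

# Variant: combine the last two bins, leaving 4 bins and df = 2.
observed_merged = observed[:3] + [observed[3] + observed[4]]
expected_merged = expected[:3] + [expected[3] + expected[4]]
x_merged = pearson_statistic(observed_merged, expected_merged)
p_merged = chi2_sf(x_merged, len(observed_merged) - 1 - 1)

print(f"5 bins: x = {x:.4f}, P-value = {p_value:.4f}")
print(f"4 bins: x = {x_merged:.4f}, P-value = {p_merged:.4f}")
```

Merging sparse tail bins is a common way to keep a single small expected count from dominating the statistic, at the cost of one degree of freedom.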

Example 5 (Flying-Bomb Hits on London). Consider the statistics of flying-bomb hits in a south London area during World War II. (R. D. Clarke, An application of the Poisson distribution, Journal of the Institute of Actuaries, 1946.) The region was divided into 576 areas of 0.25 square kilometers each. The number of areas receiving $k$ hits, for $k = 0, 1, 2, \ldots$, were as follows:

[Table: # of hits received vs. # of areas; entries not preserved.]

Does the data appear to follow a (spatial) Poisson distribution?

Solution. The MLE estimate for $\lambda$ is the sample average number of hits per 0.25 square kilometers, that is, $\hat{\lambda}$ = (total number of hits)/576. Now letting

$$e_k = 576 \cdot \frac{\hat{\lambda}^k e^{-\hat{\lambda}}}{k!} \quad \text{for } k = 0, 1, \ldots, 4, \qquad e_5 = 576 - \sum_{k=0}^{4} e_k$$

(i.e., $e_5$ is the remainder of the distribution), we have

[Table: # of hits received (0, 1, 2, 3, 4, 5 or more) with observed $f_k$ and expected $e_k$; entries not preserved.]

Then summing over these six bins we have

$$x = \sum_{k} \frac{(f_k - e_k)^2}{e_k}.$$

Using 6 bins and 1 MLE, the test statistic follows a $\chi^2(6-1-1) = \chi^2(4)$ distribution; thus, the P-value for the test statistic is $P(\chi^2(4) \ge x) \approx 0.8826$. If the distribution of bomb hits were Poisson, then there would be an 88.26% chance of obtaining a difference between the expected and observed measurements as large as the difference that occurs in our data. The high P-value means that the data is a good fit of the desired distribution.

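On the TI:

1. Enter the range and frequencies into L1 and L2.
2. 1-Var Stats L1, L2 computes $\bar{x}$, so $\hat{\lambda} = \bar{x}$.
3. Store the expected counts into L3.
4. Use Stat Edit to adjust the last bin in L3 to the remainder of the distribution.
5. Compute the error terms of the test statistic and store them in L4.
6. Compute stats on L4. The sum $\Sigma x$ is the test statistic.
7. Use χ²cdf with the test statistic to obtain the P-value.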
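The calculator steps above can be mirrored list by list in Python. This is a sketch, not the original computation: the table entries are missing from the text, so the counts in `L2` are the values commonly quoted from Clarke's 1946 table and are supplied here as an assumption. As in the 1-Var Stats step, the "5 or more" bin is treated as the value 5 when averaging, which is also an assumption.

```python
import math

def chi2_sf(x, df):
    """Right-tail probability P(chi2(df) >= x) for a positive integer df."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    # Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while 2 * a < df:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1))
        a += 1.0
    return q

# Step 1: range and frequencies (ASSUMED counts; last value stands for "5 or more").
L1 = [0, 1, 2, 3, 4, 5]
L2 = [229, 211, 93, 35, 7, 1]

# Step 2: sample mean of the binned data is the estimate of lambda.
n = sum(L2)
lam = sum(k * f for k, f in zip(L1, L2)) / n

# Steps 3-4: expected counts, then replace the last bin by the remainder.
L3 = [n * lam ** k * math.exp(-lam) / math.factorial(k) for k in L1]
L3[-1] = n - sum(L3[:-1])

# Steps 5-6: error terms and their sum, the Pearson test statistic.
L4 = [(f - e) ** 2 / e for f, e in zip(L2, L3)]
x = sum(L4)

# Step 7: right-tail P-value with 6 - 1 - 1 = 4 degrees of freedom.
p_value = chi2_sf(x, len(L1) - 1 - 1)

print(f"lambda = {lam:.4f}, x = {x:.4f}, P-value = {p_value:.4f}")
```

With these assumed counts the P-value comes out at about 0.88, in line with the high value discussed in Example 5.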

Exercises

1. A random sample of 1000 grades contained 312 A's, 208 B's, 202 C's, 99 D's, and 179 F's.
(a) Test whether or not grades have been assigned according to the following distribution: A 30%, B 25%, C 20%, D 10%, F 15%.
(b) Which grade(s) seem to fit the specified distribution (low contribution to the test statistic), and which grade(s) seem to be a bad fit (high contribution to the test statistic)?

2. In an experiment by Rutherford, Chadwick, and Ellis (1920), a radioactive substance was observed during 2608 time periods of 7.5 seconds each. The number of alpha particles reaching a Geiger counter was recorded for each time period. The results were as follows:

[Table: # of particle hits vs. # of time periods; entries not preserved.]

Using 10 or more as the last bin, perform a goodness of fit test for whether the data comes from a Poisson distribution. Define and give an estimate of $\lambda$, and give a table of $\{e_k\}$, the test statistic, and the P-value. Explain your conclusion in detail.

The Chi-Square Distributions

The Chi-Square Distributions MATH 183 The Chi-Square Distributions Dr. Neal, WKU The chi-square distributions can be used in statistics to analyze the standard deviation σ of a normally distributed measurement and to test the goodness

More information

The Chi-Square Distributions

The Chi-Square Distributions MATH 03 The Chi-Square Distributions Dr. Neal, Spring 009 The chi-square distributions can be used in statistics to analyze the standard deviation of a normally distributed measurement and to test the

More information

STP 226 EXAMPLE EXAM #3 INSTRUCTOR:

STP 226 EXAMPLE EXAM #3 INSTRUCTOR: STP 226 EXAMPLE EXAM #3 INSTRUCTOR: Honor Statement: I have neither given nor received information regarding this exam, and I will not do so until all exams have been graded and returned. Signed Date PRINTED

More information

Chapter 22. Comparing Two Proportions 1 /29

Chapter 22. Comparing Two Proportions 1 /29 Chapter 22 Comparing Two Proportions 1 /29 Homework p519 2, 4, 12, 13, 15, 17, 18, 19, 24 2 /29 Objective Students test null and alternate hypothesis about two population proportions. 3 /29 Comparing Two

More information

Poisson population distribution X P(

Poisson population distribution X P( Chapter 8 Poisson population distribution P( ) ~ 8.1 Definition of a Poisson distribution, ~ P( ) If the random variable has a Poisson population distribution, i.e., P( ) probability function is given

More information

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8 CIVL - 7904/8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8 Chi-square Test How to determine the interval from a continuous distribution I = Range 1 + 3.322(logN) I-> Range of the class interval

More information

Chapter 22. Comparing Two Proportions 1 /30

Chapter 22. Comparing Two Proportions 1 /30 Chapter 22 Comparing Two Proportions 1 /30 Homework p519 2, 4, 12, 13, 15, 17, 18, 19, 24 2 /30 3 /30 Objective Students test null and alternate hypothesis about two population proportions. 4 /30 Comparing

More information

Section 4.6 Simple Linear Regression

Section 4.6 Simple Linear Regression Section 4.6 Simple Linear Regression Objectives ˆ Basic philosophy of SLR and the regression assumptions ˆ Point & interval estimation of the model parameters, and how to make predictions ˆ Point and interval

More information

Using Tables and Graphing Calculators in Math 11

Using Tables and Graphing Calculators in Math 11 Using Tables and Graphing Calculators in Math 11 Graphing calculators are not required for Math 11, but they are likely to be helpful, primarily because they allow you to avoid the use of tables in some

More information

Lecture 41 Sections Wed, Nov 12, 2008

Lecture 41 Sections Wed, Nov 12, 2008 Lecture 41 Sections 14.1-14.3 Hampden-Sydney College Wed, Nov 12, 2008 Outline 1 2 3 4 5 6 7 one-proportion test that we just studied allows us to test a hypothesis concerning one proportion, or two categories,

More information

Chapter 22. Comparing Two Proportions. Bin Zou STAT 141 University of Alberta Winter / 15

Chapter 22. Comparing Two Proportions. Bin Zou STAT 141 University of Alberta Winter / 15 Chapter 22 Comparing Two Proportions Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 15 Introduction In Ch.19 and Ch.20, we studied confidence interval and test for proportions,

More information

Ch. 11 Inference for Distributions of Categorical Data

Ch. 11 Inference for Distributions of Categorical Data Ch. 11 Inference for Distributions of Categorical Data CH. 11 2 INFERENCES FOR RELATIONSHIPS The two sample z procedures from Ch. 10 allowed us to compare proportions of successes in two populations or

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Math Review Summer 0 Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Classif the hpothesis test as two-tailed, left-tailed, or right-tailed.

More information

STAT 328 (Statistical Packages)

STAT 328 (Statistical Packages) Department of Statistics and Operations Research College of Science King Saud University Exercises STAT 328 (Statistical Packages) nashmiah r.alshammari ^-^ Excel and Minitab - 1 - Write the commands of

More information

HYPOTHESIS TESTING: THE CHI-SQUARE STATISTIC

HYPOTHESIS TESTING: THE CHI-SQUARE STATISTIC 1 HYPOTHESIS TESTING: THE CHI-SQUARE STATISTIC 7 steps of Hypothesis Testing 1. State the hypotheses 2. Identify level of significant 3. Identify the critical values 4. Calculate test statistics 5. Compare

More information

16.400/453J Human Factors Engineering. Design of Experiments II

16.400/453J Human Factors Engineering. Design of Experiments II J Human Factors Engineering Design of Experiments II Review Experiment Design and Descriptive Statistics Research question, independent and dependent variables, histograms, box plots, etc. Inferential

More information

Statistics 135 Fall 2008 Final Exam

Statistics 135 Fall 2008 Final Exam Name: SID: Statistics 135 Fall 2008 Final Exam Show your work. The number of points each question is worth is shown at the beginning of the question. There are 10 problems. 1. [2] The normal equations

More information

Interpretation of results through confidence intervals

Interpretation of results through confidence intervals Interpretation of results through confidence intervals Hypothesis tests Confidence intervals Hypothesis Test Reject H 0 : μ = μ 0 Confidence Intervals μ 0 is not in confidence interval μ 0 P(observed statistic

More information

Ch. 7. One sample hypothesis tests for µ and σ

Ch. 7. One sample hypothesis tests for µ and σ Ch. 7. One sample hypothesis tests for µ and σ Prof. Tesler Math 18 Winter 2019 Prof. Tesler Ch. 7: One sample hypoth. tests for µ, σ Math 18 / Winter 2019 1 / 23 Introduction Data Consider the SAT math

More information

Sources of randomness

Sources of randomness Random Number Generator Chapter 7 In simulations, we generate random values for variables with a specified distribution Ex., model service times using the exponential distribution Generation of random

More information

The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions.

The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions. The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions. A common problem of this type is concerned with determining

More information

Random Number Generation. CS1538: Introduction to simulations

Random Number Generation. CS1538: Introduction to simulations Random Number Generation CS1538: Introduction to simulations Random Numbers Stochastic simulations require random data True random data cannot come from an algorithm We must obtain it from some process

More information

Math 152. Rumbos Fall Solutions to Exam #2

Math 152. Rumbos Fall Solutions to Exam #2 Math 152. Rumbos Fall 2009 1 Solutions to Exam #2 1. Define the following terms: (a) Significance level of a hypothesis test. Answer: The significance level, α, of a hypothesis test is the largest probability

More information

STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots. March 8, 2015

STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots. March 8, 2015 STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots March 8, 2015 The duality between CI and hypothesis testing The duality between CI and hypothesis

More information

Example. χ 2 = Continued on the next page. All cells

Example. χ 2 = Continued on the next page. All cells Section 11.1 Chi Square Statistic k Categories 1 st 2 nd 3 rd k th Total Observed Frequencies O 1 O 2 O 3 O k n Expected Frequencies E 1 E 2 E 3 E k n O 1 + O 2 + O 3 + + O k = n E 1 + E 2 + E 3 + + E

More information

Statistics for Managers Using Microsoft Excel

Statistics for Managers Using Microsoft Excel Statistics for Managers Using Microsoft Excel 7 th Edition Chapter 1 Chi-Square Tests and Nonparametric Tests Statistics for Managers Using Microsoft Excel 7e Copyright 014 Pearson Education, Inc. Chap

More information

Nominal Data. Parametric Statistics. Nonparametric Statistics. Parametric vs Nonparametric Tests. Greg C Elvers

Nominal Data. Parametric Statistics. Nonparametric Statistics. Parametric vs Nonparametric Tests. Greg C Elvers Nominal Data Greg C Elvers 1 Parametric Statistics The inferential statistics that we have discussed, such as t and ANOVA, are parametric statistics A parametric statistic is a statistic that makes certain

More information

QUIZ 4 (CHAPTER 7) - SOLUTIONS MATH 119 SPRING 2013 KUNIYUKI 105 POINTS TOTAL, BUT 100 POINTS = 100%

QUIZ 4 (CHAPTER 7) - SOLUTIONS MATH 119 SPRING 2013 KUNIYUKI 105 POINTS TOTAL, BUT 100 POINTS = 100% QUIZ 4 (CHAPTER 7) - SOLUTIONS MATH 119 SPRING 013 KUNIYUKI 105 POINTS TOTAL, BUT 100 POINTS = 100% 1) We want to conduct a study to estimate the mean I.Q. of a pop singer s fans. We want to have 96% confidence

More information

Area1 Scaled Score (NAPLEX) .535 ** **.000 N. Sig. (2-tailed)

Area1 Scaled Score (NAPLEX) .535 ** **.000 N. Sig. (2-tailed) Institutional Assessment Report Texas Southern University College of Pharmacy and Health Sciences "An Analysis of 2013 NAPLEX, P4-Comp. Exams and P3 courses The following analysis illustrates relationships

More information

Epidemiology Wonders of Biostatistics Chapter 11 (continued) - probability in a single population. John Koval

Epidemiology Wonders of Biostatistics Chapter 11 (continued) - probability in a single population. John Koval Epidemiology 9509 Wonders of Biostatistics Chapter 11 (continued) - probability in a single population John Koval Department of Epidemiology and Biostatistics University of Western Ontario What is being

More information

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

 M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2 Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the

More information

Hypothesis Testing with Z and T

Hypothesis Testing with Z and T Chapter Eight Hypothesis Testing with Z and T Introduction to Hypothesis Testing P Values Critical Values Within-Participants Designs Between-Participants Designs Hypothesis Testing An alternate hypothesis

More information

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE THE ROYAL STATISTICAL SOCIETY 004 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER II STATISTICAL METHODS The Society provides these solutions to assist candidates preparing for the examinations in future

More information

University of Chicago Graduate School of Business. Business 41000: Business Statistics

University of Chicago Graduate School of Business. Business 41000: Business Statistics Name: OUTLINE SOLUTION University of Chicago Graduate School of Business Business 41000: Business Statistics Special Notes: 1. This is a closed-book exam. You may use an 8 11 piece of paper for the formulas.

More information

Section VII. Chi-square test for comparing proportions and frequencies. F test for means

Section VII. Chi-square test for comparing proportions and frequencies. F test for means Section VII Chi-square test for comparing proportions and frequencies F test for means 0 proportions: chi-square test Z test for comparing proportions between two independent groups Z = P 1 P 2 SE d SE

More information

Poisson Regression. Ryan Godwin. ECON University of Manitoba

Poisson Regression. Ryan Godwin. ECON University of Manitoba Poisson Regression Ryan Godwin ECON 7010 - University of Manitoba Abstract. These lecture notes introduce Maximum Likelihood Estimation (MLE) of a Poisson regression model. 1 Motivating the Poisson Regression

More information

Salt Lake Community College MATH 1040 Final Exam Fall Semester 2011 Form E

Salt Lake Community College MATH 1040 Final Exam Fall Semester 2011 Form E Salt Lake Community College MATH 1040 Final Exam Fall Semester 011 Form E Name Instructor Time Limit: 10 minutes Any hand-held calculator may be used. Computers, cell phones, or other communication devices

More information

Originality in the Arts and Sciences: Lecture 2: Probability and Statistics

Originality in the Arts and Sciences: Lecture 2: Probability and Statistics Originality in the Arts and Sciences: Lecture 2: Probability and Statistics Let s face it. Statistics has a really bad reputation. Why? 1. It is boring. 2. It doesn t make a lot of sense. Actually, the

More information

MATH Section 4.1

MATH Section 4.1 MATH 1311 Section 4.1 Exponential Growth and Decay As we saw in the previous chapter, functions are linear if adding or subtracting the same value will get you to different coordinate points. Exponential

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. describes the.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. describes the. Practice Test 3 Math 1342 Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) The term z α/2 σn describes the. 1) A) maximum error of estimate

More information

SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.

SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Chapter 7 Exam A Name 1) How do you determine whether to use the z or t distribution in computing the margin of error, E = z α/2 σn or E = t α/2 s n? 1) Use the given degree of confidence and sample data

More information

Psych 230. Psychological Measurement and Statistics

Psych 230. Psychological Measurement and Statistics Psych 230 Psychological Measurement and Statistics Pedro Wolf December 9, 2009 This Time. Non-Parametric statistics Chi-Square test One-way Two-way Statistical Testing 1. Decide which test to use 2. State

More information

Statistical Analysis for QBIC Genetics Adapted by Ellen G. Dow 2017

Statistical Analysis for QBIC Genetics Adapted by Ellen G. Dow 2017 Statistical Analysis for QBIC Genetics Adapted by Ellen G. Dow 2017 I. χ 2 or chi-square test Objectives: Compare how close an experimentally derived value agrees with an expected value. One method to

More information

Binomial and Poisson Probability Distributions

Binomial and Poisson Probability Distributions Binomial and Poisson Probability Distributions Esra Akdeniz March 3, 2016 Bernoulli Random Variable Any random variable whose only possible values are 0 or 1 is called a Bernoulli random variable. What

More information

Inferences About Two Population Proportions

Inferences About Two Population Proportions Inferences About Two Population Proportions MATH 130, Elements of Statistics I J. Robert Buchanan Department of Mathematics Fall 2018 Background Recall: for a single population the sampling proportion

More information

ANOVA - analysis of variance - used to compare the means of several populations.

ANOVA - analysis of variance - used to compare the means of several populations. 12.1 One-Way Analysis of Variance ANOVA - analysis of variance - used to compare the means of several populations. Assumptions for One-Way ANOVA: 1. Independent samples are taken using a randomized design.

More information

Inferences About Two Proportions

Inferences About Two Proportions Inferences About Two Proportions Quantitative Methods II Plan for Today Sampling two populations Confidence intervals for differences of two proportions Testing the difference of proportions Examples 1

More information

What does a population that is normally distributed look like? = 80 and = 10

What does a population that is normally distributed look like? = 80 and = 10 What does a population that is normally distributed look like? = 80 and = 10 50 60 70 80 90 100 110 X Empirical Rule 68% 95% 99.7% 68-95-99.7% RULE Empirical Rule restated 68% of the data values fall within

More information

ISQS 5349 Final Exam, Spring 2017.

ISQS 5349 Final Exam, Spring 2017. ISQS 5349 Final Exam, Spring 7. Instructions: Put all answers on paper other than this exam. If you do not have paper, some will be provided to you. The exam is OPEN BOOKS, OPEN NOTES, but NO ELECTRONIC

More information

The Poisson Distribution

The Poisson Distribution The Poisson Distribution Mary Lindstrom (Adapted from notes provided by Professor Bret Larget) February 5, 2004 Statistics 371 Last modified: February 4, 2004 The Poisson Distribution The Poisson distribution

More information

Paired Samples. Lecture 37 Sections 11.1, 11.2, Robb T. Koether. Hampden-Sydney College. Mon, Apr 2, 2012

Paired Samples. Lecture 37 Sections 11.1, 11.2, Robb T. Koether. Hampden-Sydney College. Mon, Apr 2, 2012 Paired Samples Lecture 37 Sections 11.1, 11.2, 11.3 Robb T. Koether Hampden-Sydney College Mon, Apr 2, 2012 Robb T. Koether (Hampden-Sydney College) Paired Samples Mon, Apr 2, 2012 1 / 17 Outline 1 Dependent

More information

Chapter 10. Prof. Tesler. Math 186 Winter χ 2 tests for goodness of fit and independence

Chapter 10. Prof. Tesler. Math 186 Winter χ 2 tests for goodness of fit and independence Chapter 10 χ 2 tests for goodness of fit and independence Prof. Tesler Math 186 Winter 2018 Prof. Tesler Ch. 10: χ 2 goodness of fit tests Math 186 / Winter 2018 1 / 26 Multinomial test Consider a k-sided

More information

CBA4 is live in practice mode this week exam mode from Saturday!

CBA4 is live in practice mode this week exam mode from Saturday! Announcements CBA4 is live in practice mode this week exam mode from Saturday! Material covered: Confidence intervals (both cases) 1 sample hypothesis tests (both cases) Hypothesis tests for 2 means as

More information
