Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018

1 Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018

2 Sampling
A trait is measured on each member of a population. Let $f(y)$ be the propn of individuals in the popn with measurement $y$, and let $P$ be the probability distn which assigns probability $f(y)$ to $y$. The trait value of an individual randomly selected from the popn is a random variable $Y$ with distn $P$. The mean and variance of the trait in the population are
$$m = \sum_y y\, f(y), \qquad \sigma^2 = \sum_y (y - m)^2 f(y).$$
These are also the mean and variance of the random variable $Y$.

3 Summary statistics
A random sample of size $n$ drawn from the popn generates a sequence of observations $Y_1, \ldots, Y_n$. If the size of the sample is much less than the size of the popn, they can be treated as independent random variables, each with probability distn $P$. These observations are random variables, and we can calculate the sampling distn of any summary statistic (sample mean, median, variance, range, etc).

4 A sampling distribution
The sample mean $\bar{Y}$ might be used to estimate the population mean $m$. Sampling distn of $\bar{Y}$: $E(\bar{Y}) = m$ and $\mathrm{var}(\bar{Y}) = \sigma^2/n$, where $n$ is the sample size. It can be shown that if the distn ($P$) of the trait in the popn is normal, the distn of $\bar{Y}$ is normal, i.e.
$$\bar{Y} \sim N(m, \sigma^2/n).$$
When $n$ is large, this will be approximately true even when the distn in the popn is not normal (central limit theorem).

5 Another sampling distn
A binary trait takes one of two possible values, e.g. eyes are blue, eyes are not blue. Let $Y_i = 1$ if the $i$-th member of the sample has blue eyes, otherwise $Y_i = 0$. Then $\bar{Y}$ is the proportion of the sample with blue eyes, and the sampling distn of $\bar{Y}$ is a scaled binomial:
$$\Pr\left(\bar{Y} = \frac{y}{n}\right) = \binom{n}{y} m^y (1 - m)^{n - y}, \qquad y = 0, \ldots, n,$$
where $m$ is the population mean, i.e. the proportion of the population with blue eyes.

6 Inference from sample to population A sample provides information about the popn from which it is drawn. For example, the sample mean Ȳ tells us something about the popn mean m. 1) Different samples give rise to different estimates. The value of the estimate cannot be predicted in advance, and we regard it as a random variable with a probability distribution (the sampling distribution of the estimator). 2) The sample estimate will differ from the population parameter, but if the sample is large enough, the estimate will be close to the true value with high probability.

7 Inference from sample to population
Inference usually takes the form of a probability statement based on the sampling distn of the appropriate summary statistic. If the distn of $Y$ in the popn is normal, the sampling distn of $\bar{Y}$ is $N(m, \sigma^2/n)$. When $\sigma^2$ is known, a simple form of inference is the statement that the event
$$|\bar{Y} - m| < kE$$
occurs with high probability. Here $E = \sqrt{\sigma^2/n}$ is the standard error of $\bar{Y}$, and $k$ is a suitable quantile of the standard normal distn.

8 Mean square error, bias, variance
Here $\theta$ represents some feature of the popn, and $T$ is a sample statistic. If $T$ is used as an estimator of $\theta$, the estimation error is $T - \theta$. The mean squared error is $E(T - \theta)^2$, which we would like to be as small as possible. Let $m_T = E(T)$. The MSE can be split into two components:
$$E(T - \theta)^2 = E(T - m_T)^2 + (m_T - \theta)^2,$$
i.e. MSE = variance + bias$^2$.
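
The decomposition can be checked numerically. The sketch below is only an illustration under assumed settings that are not in the slides (normal popn with $\sigma^2 = 4$, $n = 10$, 20000 simulated samples); it compares the variance estimators with divisors $n - 1$ and $n$.

```r
# Hypothetical settings, chosen only for illustration (not from the slides).
set.seed(1)
sigma2 <- 4
n <- 10
n_rep <- 20000

# Each column of est holds the two variance estimates from one simulated sample.
est <- replicate(n_rep, {
  y <- rnorm(n, mean = 0, sd = sqrt(sigma2))
  ss <- sum((y - mean(y))^2)                 # corrected sum of squares
  c(div_n_minus_1 = ss / (n - 1), div_n = ss / n)
})

# Empirical MSE, variance and squared bias of an estimator of theta.
mse_parts <- function(est_values, theta) {
  bias <- mean(est_values) - theta
  variance <- mean((est_values - mean(est_values))^2)
  c(mse = mean((est_values - theta)^2), variance = variance, bias_sq = bias^2)
}

parts <- apply(est, 1, mse_parts, theta = sigma2)
print(round(parts, 3))
```

For the empirical moments the identity MSE = variance + bias$^2$ holds exactly in each column, and the divisor-$n$ column shows a clearly non-zero squared bias.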

9 The sample variance
The sample variance
$$S^2 = (n - 1)^{-1} \sum_{i=1}^{n} (Y_i - \bar{Y})^2$$
estimates the population variance $\sigma^2$. An unbiased estimator is obtained by using the divisor $n - 1$ (rather than $n$). The standard deviation (square root of the variance) of the sampling distribution of an estimator is called the standard error of the estimator. For example, the standard error of $\bar{Y}$ is $\sqrt{\sigma^2/n}$. Usually the value of $\sigma^2$ is unknown, in which case the estimated standard error is calculated by replacing $\sigma^2$ in the formula by an estimate (e.g. the sample variance):
$$\text{estimated se}(\bar{Y}) = \sqrt{S^2/n}.$$

10 The t distribution
The sample mean $\bar{Y}$ is distributed $N(m, \sigma^2/n)$. The standardised value $\sqrt{n}(\bar{Y} - m)/\sigma$ has an $N(0, 1)$ distribution. Replacing $\sigma$ by $S$ (square root of the sample variance $S^2$) changes the distn: $\sqrt{n}(\bar{Y} - m)/S$ has a t distn with $n - 1$ d.f. More generally, if $Z$ is $N(0, \sigma^2)$, and $S^2$ is an estimate of $\sigma^2$ with $f$ d.f. that is independent of $Z$, then $Z/S$ has a t distn with $f$ degrees of freedom. The t distn with few d.f. has thicker tails than the normal distn.
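
The thicker tails can be seen by comparing quantiles and tail probabilities. The following is a small sketch using the base R functions qt( ), pt( ) and qnorm( ); the particular d.f. values are an arbitrary choice for illustration.

```r
# Degrees of freedom to compare (arbitrary illustrative choice).
dof <- c(2, 5, 8, 15, 30, 100)

tab <- data.frame(
  dof = dof,
  t_upper_2.5 = qt(0.975, dof),          # upper 2.5% point of t
  tail_beyond_1.96 = 2 * pt(-1.96, dof)  # two-sided tail probability beyond 1.96
)
print(round(tab, 3))
cat("normal upper 2.5% point:", round(qnorm(0.975), 3), "\n")
```

The t quantiles decrease towards the normal value as the d.f. increase, and the probability beyond 1.96 (0.05 for the normal distn) is larger for small d.f.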

11 Studentization
The form of inference described earlier (slide 7) requires that we know the value of $\sigma^2$. If this value is unknown, the procedure must be modified:
1) Replace the unknown value of $\sigma^2$ by an estimate, for example $S^2$ (the sample variance).
2) Take the quantile $k$ from tables of the t distn (instead of the normal distn). The d.f. for t are those associated with the estimate of $\sigma^2$ (or the sum of squares on which it is based).

12 Hypothesis tests
Test of null hypothesis $H_0$ (about parameter $\theta$):
1) Choose a summary statistic $T$ (typically, an estimator of $\theta$).
2) Reject $H_0$ if $T \in C$, where $C$ is a subset of the values of $T$.
$C$ is the rejection region, chosen so that $P(T \in C)$ when $H_0$ is true is a small number $\alpha$, called the significance level, or size of the test. $\alpha$ is the probability of rejecting $H_0$ when it is true (type I error). The smaller the value of $\alpha$, the more stringent the test. Failing to reject a false hypothesis is the type II error. The probability of rejecting a false hypothesis is called the power of the test.

13 Example of a hypothesis test
Here we have a random sample from $N(m, \sigma^2)$, and use the sample mean $\bar{Y}$ as a test statistic for hypotheses about $m$. The test rejects $H_0: m = m_0$ when
$$\frac{|\bar{Y} - m_0|}{E} > k,$$
where $E = \sqrt{\sigma^2/n}$ is the standard error of $\bar{Y}$ and $k$ is a suitable quantile of the standard normal distn (when $\sigma^2$ is known) or the t distn (when $\sigma^2$ is estimated). This test is sometimes called the z test ($\sigma^2$ known), or the one-sample t test ($\sigma^2$ estimated). It is a two-sided (two-tail) test: we reject $H_0$ if either $\bar{Y} > m_0 + kE$ or $\bar{Y} < m_0 - kE$.

14 Level of significance
By convention, certain values are used as guidelines: 0.05, 0.01, 0.001, representing increasing strength of evidence against $H_0$. The smaller the significance level, the stronger the evidence. The following descriptions of a significant result are suggested, although there is no general agreement on these:

Significance level   Conclusion when $H_0$ is rejected
0.05                 Evidence against $H_0$
0.01                 Strong evidence against $H_0$
0.001                Very strong evidence against $H_0$

15 The p value
Given the observed value of the test statistic, the p value is the smallest $\alpha$ at which the test is significant. Alternatively, it is the probability, calculated under the null hypothesis, of obtaining a value at least as extreme as the observed value of the test statistic. The p value can be regarded as a measure of the strength of evidence against the hypothesis: the smaller the p value, the stronger the evidence, and the less we are inclined to believe that the hypothesis is true. The p value should not be interpreted as the probability that the hypothesis is true.

16 Confidence intervals
A confidence interval tells us which values of the parameter are consistent with the data. In the case of inference about a normal mean, this is just a matter of rearranging one inequality (on left) into another (on right):
$$m - kE < \bar{Y} < m + kE \quad\Longleftrightarrow\quad \bar{Y} - kE < m < \bar{Y} + kE.$$
The first statement says that the random variable $\bar{Y}$ lies between given limits. The second statement says that the random interval $(\bar{Y} - kE, \bar{Y} + kE)$ includes the unknown value of $m$. The value of $k$ is chosen so that the statement is true with a given probability (e.g. 0.95).
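
The statement that the random interval includes $m$ with the stated probability can be checked by simulation. This is a rough sketch with made-up values of $m$, $\sigma$ and $n$ (none of them from the slides), using the t quantile because $\sigma^2$ is estimated from each sample.

```r
set.seed(2)
m <- 5; sigma <- 2; n <- 12     # hypothetical popn mean, s.d. and sample size
k <- qt(0.975, df = n - 1)      # quantile for a 95% interval

# For each simulated sample, record whether the interval contains m.
covers <- replicate(5000, {
  y <- rnorm(n, mean = m, sd = sigma)
  E <- sqrt(var(y) / n)         # estimated standard error of the sample mean
  (mean(y) - k * E < m) && (m < mean(y) + k * E)
})

coverage <- mean(covers)
print(coverage)
```

The observed proportion of intervals containing $m$ should be close to 0.95, up to simulation error.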

17 One-sample t-test
$\bar{Y}$ and $S^2$ are the sample mean and variance of a random sample of size $n$ from $N(m, \sigma^2)$. The variance $\sigma^2$ is unknown, and it is required to test $H_0: m = 0$. The test statistic is $T = \bar{Y}/E$, where $E = \sqrt{S^2/n}$ is the estimated standard error of $\bar{Y}$. The null distn of the test statistic is the t distn with $n - 1$ d.f. $H_0$ is rejected if $|T| > k$, where $k$ is a suitable quantile taken from tables of the t distn with $n - 1$ d.f. There are many other versions of the t test. This is the simplest.

18 Matched pairs experiment
The one-sample t test is used to compare two treatments when observations consist of matched pairs. Nine twin pairs are chosen for the experiment. For each pair, one twin (chosen randomly) is given standard diet (control), the other is given standard diet plus food additive.
[Table of Pair and Difference: values missing from the transcription.]
The one-sample t test is applied to the differences (treated minus control). These are regarded as a single sample from a normal distn with mean $m$. The null hypothesis is $H_0: m = 0$ (no treatment effect).

19 One-sample t test
The nine differences for the matched-pairs experiment are [values missing from the transcription]. Sum is 99, mean is 11.0, uncorrected sum of squares = 2397. Corrected sum of squares is $2397 - 99^2/9 = 2397 - 1089 = 1308$. Estimate of $\sigma^2$ is $S^2 = 1308/8 = 163.5$. Estimated s.e. of $\bar{Y}$ is $E = \sqrt{S^2/9} = 4.262$, and the t statistic is $\bar{Y}/E = 11.0/4.262 = 2.58$. The upper 2.5% point of the t distn with 8 d.f. is 2.306. The two-tail test is significant at the 0.05 level. There is some evidence that the food additive improves growth rate. A 95% confidence interval for the effect of the additive is $11.0 \pm 2.306E$ (between 1.2 and 20.8 g/d).
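
The arithmetic on this slide can be reproduced from the summary values alone. The sketch below uses only the sample size, the sum of the differences (99) and the corrected sum of squares (1308) given above; the variable names are illustrative.

```r
n <- 9
sum_d <- 99        # sum of the nine differences
css <- 1308        # corrected sum of squares

ybar <- sum_d / n                      # sample mean
s2 <- css / (n - 1)                    # estimate of sigma^2
E <- sqrt(s2 / n)                      # estimated s.e. of the mean
t_stat <- ybar / E
k <- qt(0.975, df = n - 1)             # upper 2.5% point of t with 8 d.f.
p_value <- 2 * pt(-abs(t_stat), df = n - 1)
ci <- ybar + c(-1, 1) * k * E          # 95% confidence interval

print(round(c(s2 = s2, E = E, t = t_stat, k = k, p = p_value), 3))
print(round(ci, 1))
```

With the raw differences available in a vector d, the call t.test(d) would be the usual way to obtain the same statistic and interval.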

20 A neat way to set out the calculations
Write down the ANOVA table:

Source     DF   SSQ    MSQ     F
Mean        1   1089   1089    6.66
Residual    8   1308   163.5
Total       9   2397

The value 1089 in the first row is the correction term from the previous slide. In each row, MSQ (mean square) is SSQ (sum of squares) divided by DF. In the last column, F is the ratio of the two MSQ. The t statistic is the square root of F, with the DF of the residual row.

21 Algebra of the ANOVA table

Source     DF      SSQ                      MSQ            F
Mean       1       C.F. = $n\bar{Y}^2$      $n\bar{Y}^2$   $n\bar{Y}^2/S^2$
Residual   $n-1$   Corrected SSQ            $S^2$
Total      $n$     Uncorrected SSQ

Square root of F is $\bar{Y}/\sqrt{S^2/n}$.

22 An experiment with two unmatched samples
A random sample of $n_1 = 9$ lambs receive standard diet plus food additive. An independent random sample of $n_2 = 8$ lambs receive the standard diet alone. Growth rates are measured on all 17 lambs.
[Table of growth rates for Treated and Controls: values missing from the transcription.]
Assumptions: all measurements are independently normally distributed with variance $\sigma^2$. Population means are $m_1$ (treated), $m_2$ (control). Null hypothesis: $m_1 = m_2$.

23 The two-sample t test
The test is based on $\bar{Y}_1 - \bar{Y}_2$, which has variance $\sigma^2(1/n_1 + 1/n_2)$. The test statistic is $T = (\bar{Y}_1 - \bar{Y}_2)/E$, where
$$E = \sqrt{S^2 (1/n_1 + 1/n_2)}$$
and $S^2$ is an estimate of $\sigma^2$. The null distn of $T$ is the t distn with $n_1 + n_2 - 2$ d.f.

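24 Calculating the estimate of $\sigma^2$
[Table of n, sum and uncorrected SSQ for Treated and Controls: values missing from the transcription.]
Calculate the corrected sum of squares separately for each sample, then pool sums of squares and degrees of freedom.
[Table of DF and SSQ (rows Mean, Residual, Total) for Treated, Controls and Pooled, with the pooled MSQ: values missing from the transcription.]
Estimate of $\sigma^2$ is $S^2 = 124.8$ with 15 d.f., and the estimated s.e. of $\bar{Y}_1 - \bar{Y}_2$ is $\sqrt{124.8(1/9 + 1/8)} = 5.43$.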

25 Two-sample t test
$T = (\bar{Y}_1 - \bar{Y}_2)/E = 10.5/5.43 = 1.93$ with $9 + 8 - 2 = 15$ d.f. The upper 2.5% point of the t distn with 15 d.f. is 2.131. The two-sided test is not quite significant at the 0.05 level: the data are consistent with the null hypothesis that the additive has no effect. A 95% confidence interval for the benefit of the food additive is $10.5 \pm 2.131E$ (between $-1.1$ and 22.1 g/d).
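
As a check on the arithmetic, the two-sample calculation can be redone from the summary values quoted on the slides (difference in means 10.5, pooled variance estimate 124.8, sample sizes 9 and 8). The variable names below are illustrative.

```r
n1 <- 9; n2 <- 8
diff_means <- 10.5     # difference in sample means (treated minus control)
s2 <- 124.8            # pooled estimate of sigma^2
dof <- n1 + n2 - 2

E <- sqrt(s2 * (1 / n1 + 1 / n2))      # estimated s.e. of the difference
t_stat <- diff_means / E
k <- qt(0.975, df = dof)               # upper 2.5% point of t with 15 d.f.
p_value <- 2 * pt(-abs(t_stat), df = dof)
ci <- diff_means + c(-1, 1) * k * E    # 95% confidence interval

print(round(c(E = E, t = t_stat, k = k, p = p_value), 3))
print(round(ci, 1))
```

The lower limit of the interval is negative, which is another way of seeing that the two-sided test does not reject at the 0.05 level. With the raw data in vectors, t.test(treated, control, var.equal = TRUE) would be the usual call; those vector names are placeholders.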

26 Chi-squared goodness-of-fit test
Frequencies $n_1, \ldots, n_k$ ($\sum n_i = n$) are multinomially distributed with probabilities $p_1, \ldots, p_k$. The probabilities are specified by the null hypothesis $H_0$. The chi-squared test statistic is
$$X^2 = \sum_{i=1}^{k} \frac{(n_i - np_i)^2}{np_i},$$
often written $\sum (O - E)^2/E$, where $O$ is the observed frequency $n_i$ and $E$ is the expected frequency $np_i$. An alternative formula is
$$X^2 = \left(\sum O^2/E\right) - n.$$
$X^2$ is a measure of discrepancy between observed and expected frequencies. A large value indicates departure from $H_0$ (therefore a one-sided test, with large values significant).

27 The chi-squared distribution
The distn of the sum of squares of $\nu$ independent $N(0, 1)$ r.v.s is called the chi-squared distn with $\nu$ d.f. ($\nu = 1, 2, 3, \ldots$). For example, the corrected sum of squares for a sample of size $n$ from a normal distn has a scaled chi-squared distn with $\nu = n - 1$ d.f. The distn also arises as the null distn of the $X^2$ test statistic. The mean of the distribution is $\nu$ and the variance is $2\nu$.
[Table of upper tail probability (%) by d.f.: values missing from the transcription.]
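
A table of upper percentage points of this kind can be generated with qchisq( ). The sketch below is not a reconstruction of the slide's table; the choice of tail probabilities and d.f. is an assumption made for illustration.

```r
upper_pct <- c(10, 5, 1, 0.1)   # upper tail probabilities in percent (assumed)
dof <- 1:5                      # degrees of freedom (assumed)

# One column per tail probability, one row per d.f.
tab <- sapply(upper_pct, function(p) qchisq(1 - p / 100, df = dof))
dimnames(tab) <- list(as.character(dof), as.character(upper_pct))
print(round(tab, 3))
```

Rows are labelled by d.f. and columns by upper tail probability in percent, so tab["2", "5"] is the upper 5% point for 2 d.f.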

28 Example 1
A roulette wheel with three compartments is spun 99 times, with the following results. Is the wheel fair?

Side        1    2    3    Total
Frequency   42   27   30   99

The null hypothesis is $p_1 = p_2 = p_3 = 1/3$. Each expected frequency is equal to 33, and $X^2$ is
$$[(42 - 33)^2 + (27 - 33)^2 + (30 - 33)^2]/33 = 3.818,$$
with 2 d.f. The upper 5% point for $\chi^2_2$ is 5.991, so the result is not significant. There is no evidence of bias: the data are consistent with the wheel being fair.
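
The same test can be run with chisq.test( ), using the three frequencies that appear in the formula above. The second calculation applies the alternative formula $(\sum O^2/E) - n$ as a cross-check.

```r
observed <- c(42, 27, 30)

# Goodness-of-fit test against equal probabilities.
fit <- chisq.test(observed, p = rep(1 / 3, 3))
print(fit)

# Alternative formula: sum(O^2 / E) - n.
X2_short <- sum(observed^2 / fit$expected) - sum(observed)
print(X2_short)
```

Both routes give the same value of $X^2$, and the reported p value exceeds 0.05.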

29 Example 2
Sometimes specification of probabilities by the null hypothesis is incomplete, leaving $s$ parameters to be estimated. In this case the null distn of $X^2$ is chi-squared with $k - 1 - s$ d.f. Are the blood group frequencies in the table below consistent with Hardy-Weinberg equilibrium?
[Table of observed MM, MN, NN frequencies and total: values missing from the transcription.]
The H-W hypothesis specifies probabilities $p^2$, $2pq$ and $q^2$, where $p$ and $q$ are the M and N allele frequencies, which must be estimated. Estimates are $\hat{p}$ = [missing], $\hat{q}$ = [missing], and expected frequencies are
[Table of expected MM, MN, NN frequencies and total: values missing from the transcription.]
$X^2 = 1.96$ with $3 - 1 - 1 = 1$ d.f. (not significant).
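
Since the observed frequencies did not survive in this transcript, the sketch below uses made-up genotype counts purely to show the steps: estimate the allele frequency by gene counting, form the expected frequencies, and refer $X^2$ to chi-squared with $k - 1 - s = 1$ d.f. The counts are hypothetical and will not reproduce the value 1.96 quoted above.

```r
# HYPOTHETICAL genotype counts, for illustration only (not the slide's data).
observed <- c(MM = 300, MN = 500, NN = 200)
n <- sum(observed)

# Allele frequency estimates by gene counting.
p_hat <- unname((2 * observed["MM"] + observed["MN"]) / (2 * n))
q_hat <- 1 - p_hat

# Expected frequencies under Hardy-Weinberg equilibrium.
probs <- c(MM = p_hat^2, MN = 2 * p_hat * q_hat, NN = q_hat^2)
expected <- n * probs

X2 <- sum((observed - expected)^2 / expected)
dof <- length(observed) - 1 - 1        # k - 1 - s, with s = 1 estimated parameter
p_value <- pchisq(X2, df = dof, lower.tail = FALSE)

print(round(c(p_hat = p_hat, X2 = X2, dof = dof, p_value = p_value), 4))

# chisq.test() is not told that p was estimated, so it uses k - 1 = 2 d.f.
print(chisq.test(observed, p = probs)$parameter)
```

The statistic from chisq.test( ) is the same, but its p value is based on 2 d.f., which is why the tail probability is computed here with pchisq( ) directly.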

30 Chi-squared association test
Attributes A and B each take one of two possible values. Both are recorded for a sample of $N$ individuals. Is there association between the attributes? (In the table below, $a$, $b$, $c$, and $d$ are frequencies.)

          $B_1$    $B_2$    Total
$A_1$     $a$      $b$      $a+b$
$A_2$     $c$      $d$      $c+d$
Total     $a+c$    $b+d$    $N$

The null hypothesis is independence of the row and column events, i.e. that $\Pr(A_1 \cap B_1) = \Pr(A_1)\Pr(B_1)$, for example. The test statistic is $X^2 = \sum (O - E)^2/E$. The null distn is the chi-squared distn with 1 d.f. The expected frequency for the top-left cell is $(a + b)(a + c)/N$, etc.

31 Example
Relationship between nasal carrier rate for Streptococcus pyogenes and size of tonsils among 1398 children.

               Not enlarged   Enlarged      Total
Carriers       19 (26.6)      53 (45.4)     72
Non-carriers   497 (489.4)    829 (836.6)   1326
Total          516            882           1398

Expected frequencies are shown in brackets.
$$X^2 = \frac{(19 - 26.6)^2}{26.6} + \text{3 more terms} = 3.61$$
(or use the alternative formula $19^2/26.6 + \text{3 more terms} - 1398$). The test is not significant at the 0.05 level.
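
This table can be passed to chisq.test( ) as a matrix. For a 2 x 2 table the function applies a continuity correction by default, so the sketch sets correct = FALSE in order to match the uncorrected $X^2$ used on the slide.

```r
tonsils <- matrix(c(19, 53,
                    497, 829),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("Carriers", "Non-carriers"),
                                  c("Not enlarged", "Enlarged")))

# Association test without the continuity correction.
fit <- chisq.test(tonsils, correct = FALSE)

print(round(fit$expected, 1))   # expected frequencies, as shown in brackets
print(fit)
```

The expected frequencies agree with the bracketed values, and the p value is a little above 0.05.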

32 Larger tables
Calculate the expected frequency for each cell as E = (row total) $\times$ (column total)/(grand total), then sum $(O - E)^2/E$ over all cells of the table. For a table with $r$ rows and $c$ columns, $X^2$ has $(r - 1)(c - 1)$ degrees of freedom. Example: a more detailed breakdown of the tonsils data gives $X^2 = 7.88$ with 2 d.f. ($P = 0.02$).

               None          Mild          Severe        Total
Carriers       19 (26.6)     29 (30.3)     24 (15.1)     72
Non-carriers   497 (489.4)   560 (558.7)   269 (277.9)   1326
Total          516           589           293           1398
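
The general recipe (expected frequency from the margins, then the sum of $(O - E)^2/E$) can be written out directly and compared with chisq.test( ). The sketch uses the six observed frequencies from the table above.

```r
tonsils3 <- matrix(c(19, 29, 24,
                     497, 560, 269),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("Carriers", "Non-carriers"),
                                   c("None", "Mild", "Severe")))

# Expected frequency = (row total) * (column total) / (grand total).
expected <- outer(rowSums(tonsils3), colSums(tonsils3)) / sum(tonsils3)

X2 <- sum((tonsils3 - expected)^2 / expected)
dof <- (nrow(tonsils3) - 1) * (ncol(tonsils3) - 1)
p_value <- pchisq(X2, df = dof, lower.tail = FALSE)

print(round(expected, 1))
print(round(c(X2 = X2, dof = dof, p_value = p_value), 3))

# The built-in test gives the same statistic.
fit <- chisq.test(tonsils3)
```

Here outer( ) builds the whole table of (row total) x (column total) products in one step, which generalises to any number of rows and columns.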

33 Using R
pchisq( ) and qchisq( ) calculate probabilities and quantiles of the chi-squared distn. pt( ) and qt( ) do the same for the t distn.
chisq.test( ) deals with the goodness-of-fit and association tests. Note: in the goodness-of-fit case where $s$ parameters are estimated, it reports the wrong d.f. ($k - 1$ rather than $k - 1 - s$).
t.test( ) can be used for the one-sample or two-sample version of the t test. For the two-sample version described here, set var.equal = TRUE (the default does not pool the variances).
binom.test( ) can be used with binomial data. This test is exact, based on binomial probabilities, and usually gives a result similar to the chi-squared test.

34 Simulation
Deriving sampling distns is a job for the mathematical statistician, but an approximate answer can usually be obtained by simulation. The R function replicate( ) is useful here. Example: the code below draws 1000 samples of size 100 from a normal distn with unit variance, and compares the histogram of the sample means with the theoretical distn (normal with variance 1/100).

    # theoretical density of the sample mean: N(0, 1/100), i.e. sd = 0.1
    curve(dnorm(x, sd = 0.1), -0.3, 0.3, col = "red", ann = FALSE, las = 1)
    # histogram of 1000 simulated sample means, drawn on the density scale
    hist(replicate(1000, mean(rnorm(100))), freq = FALSE, add = TRUE)

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Fall, 2013 Page 1 Random Variable and Probability Distribution Discrete random variable Y : Finite possible values {y

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there

More information

Sociology 6Z03 Review II

Sociology 6Z03 Review II Sociology 6Z03 Review II John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review II Fall 2016 1 / 35 Outline: Review II Probability Part I Sampling Distributions Probability

More information

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification,

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification, Likelihood Let P (D H) be the probability an experiment produces data D, given hypothesis H. Usually H is regarded as fixed and D variable. Before the experiment, the data D are unknown, and the probability

More information

Confidence Intervals, Testing and ANOVA Summary

Confidence Intervals, Testing and ANOVA Summary Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0

More information

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8 CIVL - 7904/8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8 Chi-square Test How to determine the interval from a continuous distribution I = Range 1 + 3.322(logN) I-> Range of the class interval

More information

Oct Simple linear regression. Minimum mean square error prediction. Univariate. regression. Calculating intercept and slope

Oct Simple linear regression. Minimum mean square error prediction. Univariate. regression. Calculating intercept and slope Oct 2017 1 / 28 Minimum MSE Y is the response variable, X the predictor variable, E(X) = E(Y) = 0. BLUP of Y minimizes average discrepancy var (Y ux) = C YY 2u C XY + u 2 C XX This is minimized when u

More information

Chap The McGraw-Hill Companies, Inc. All rights reserved.

Chap The McGraw-Hill Companies, Inc. All rights reserved. 11 pter11 Chap Analysis of Variance Overview of ANOVA Multiple Comparisons Tests for Homogeneity of Variances Two-Factor ANOVA Without Replication General Linear Model Experimental Design: An Overview

More information

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Exploring Data: Distributions Look for overall pattern (shape, center, spread) and deviations (outliers). Mean (use a calculator): x = x 1 + x

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

INTERVAL ESTIMATION AND HYPOTHESES TESTING

INTERVAL ESTIMATION AND HYPOTHESES TESTING INTERVAL ESTIMATION AND HYPOTHESES TESTING 1. IDEA An interval rather than a point estimate is often of interest. Confidence intervals are thus important in empirical work. To construct interval estimates,

More information

Questions 3.83, 6.11, 6.12, 6.17, 6.25, 6.29, 6.33, 6.35, 6.50, 6.51, 6.53, 6.55, 6.59, 6.60, 6.65, 6.69, 6.70, 6.77, 6.79, 6.89, 6.

Questions 3.83, 6.11, 6.12, 6.17, 6.25, 6.29, 6.33, 6.35, 6.50, 6.51, 6.53, 6.55, 6.59, 6.60, 6.65, 6.69, 6.70, 6.77, 6.79, 6.89, 6. Chapter 7 Reading 7.1, 7.2 Questions 3.83, 6.11, 6.12, 6.17, 6.25, 6.29, 6.33, 6.35, 6.50, 6.51, 6.53, 6.55, 6.59, 6.60, 6.65, 6.69, 6.70, 6.77, 6.79, 6.89, 6.112 Introduction In Chapter 5 and 6, we emphasized

More information

Analysis of Variance

Analysis of Variance Statistical Techniques II EXST7015 Analysis of Variance 15a_ANOVA_Introduction 1 Design The simplest model for Analysis of Variance (ANOVA) is the CRD, the Completely Randomized Design This model is also

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

Stat 427/527: Advanced Data Analysis I

Stat 427/527: Advanced Data Analysis I Stat 427/527: Advanced Data Analysis I Review of Chapters 1-4 Sep, 2017 1 / 18 Concepts you need to know/interpret Numerical summaries: measures of center (mean, median, mode) measures of spread (sample

More information

PHP2510: Principles of Biostatistics & Data Analysis. Lecture X: Hypothesis testing. PHP 2510 Lec 10: Hypothesis testing 1

PHP2510: Principles of Biostatistics & Data Analysis. Lecture X: Hypothesis testing. PHP 2510 Lec 10: Hypothesis testing 1 PHP2510: Principles of Biostatistics & Data Analysis Lecture X: Hypothesis testing PHP 2510 Lec 10: Hypothesis testing 1 In previous lectures we have encountered problems of estimating an unknown population

More information

Business Statistics. Lecture 10: Course Review

Business Statistics. Lecture 10: Course Review Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,

More information

Chapter 24. Comparing Means

Chapter 24. Comparing Means Chapter 4 Comparing Means!1 /34 Homework p579, 5, 7, 8, 10, 11, 17, 31, 3! /34 !3 /34 Objective Students test null and alternate hypothesis about two!4 /34 Plot the Data The intuitive display for comparing

More information

Mathematical Notation Math Introduction to Applied Statistics

Mathematical Notation Math Introduction to Applied Statistics Mathematical Notation Math 113 - Introduction to Applied Statistics Name : Use Word or WordPerfect to recreate the following documents. Each article is worth 10 points and should be emailed to the instructor

More information

1/24/2008. Review of Statistical Inference. C.1 A Sample of Data. C.2 An Econometric Model. C.4 Estimating the Population Variance and Other Moments

1/24/2008. Review of Statistical Inference. C.1 A Sample of Data. C.2 An Econometric Model. C.4 Estimating the Population Variance and Other Moments /4/008 Review of Statistical Inference Prepared by Vera Tabakova, East Carolina University C. A Sample of Data C. An Econometric Model C.3 Estimating the Mean of a Population C.4 Estimating the Population

More information

STAT 536: Genetic Statistics

STAT 536: Genetic Statistics STAT 536: Genetic Statistics Tests for Hardy Weinberg Equilibrium Karin S. Dorman Department of Statistics Iowa State University September 7, 2006 Statistical Hypothesis Testing Identify a hypothesis,

More information

Chapter 9 Inferences from Two Samples

Chapter 9 Inferences from Two Samples Chapter 9 Inferences from Two Samples 9-1 Review and Preview 9-2 Two Proportions 9-3 Two Means: Independent Samples 9-4 Two Dependent Samples (Matched Pairs) 9-5 Two Variances or Standard Deviations Review

More information

Bias Variance Trade-off

Bias Variance Trade-off Bias Variance Trade-off The mean squared error of an estimator MSE(ˆθ) = E([ˆθ θ] 2 ) Can be re-expressed MSE(ˆθ) = Var(ˆθ) + (B(ˆθ) 2 ) MSE = VAR + BIAS 2 Proof MSE(ˆθ) = E((ˆθ θ) 2 ) = E(([ˆθ E(ˆθ)]

More information

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire

More information

Test 3 Practice Test A. NOTE: Ignore Q10 (not covered)

Test 3 Practice Test A. NOTE: Ignore Q10 (not covered) Test 3 Practice Test A NOTE: Ignore Q10 (not covered) MA 180/418 Midterm Test 3, Version A Fall 2010 Student Name (PRINT):............................................. Student Signature:...................................................

More information

Estimating the accuracy of a hypothesis Setting. Assume a binary classification setting

Estimating the accuracy of a hypothesis Setting. Assume a binary classification setting Estimating the accuracy of a hypothesis Setting Assume a binary classification setting Assume input/output pairs (x, y) are sampled from an unknown probability distribution D = p(x, y) Train a binary classifier

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

Review: General Approach to Hypothesis Testing. 1. Define the research question and formulate the appropriate null and alternative hypotheses.

Review: General Approach to Hypothesis Testing. 1. Define the research question and formulate the appropriate null and alternative hypotheses. 1 Review: Let X 1, X,..., X n denote n independent random variables sampled from some distribution might not be normal!) with mean µ) and standard deviation σ). Then X µ σ n In other words, X is approximately

More information

The t-distribution. Patrick Breheny. October 13. z tests The χ 2 -distribution The t-distribution Summary

The t-distribution. Patrick Breheny. October 13. z tests The χ 2 -distribution The t-distribution Summary Patrick Breheny October 13 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/25 Introduction Introduction What s wrong with z-tests? So far we ve (thoroughly!) discussed how to carry out hypothesis

More information

The Multinomial Model

The Multinomial Model The Multinomial Model STA 312: Fall 2012 Contents 1 Multinomial Coefficients 1 2 Multinomial Distribution 2 3 Estimation 4 4 Hypothesis tests 8 5 Power 17 1 Multinomial Coefficients Multinomial coefficient

More information

Chapter 10: Inferences based on two samples

Chapter 10: Inferences based on two samples November 16 th, 2017 Overview Week 1 Week 2 Week 4 Week 7 Week 10 Week 12 Chapter 1: Descriptive statistics Chapter 6: Statistics and Sampling Distributions Chapter 7: Point Estimation Chapter 8: Confidence

More information

Difference between means - t-test /25

Difference between means - t-test /25 Difference between means - t-test 1 Discussion Question p492 Ex 9-4 p492 1-3, 6-8, 12 Assume all variances are not equal. Ignore the test for variance. 2 Students will perform hypothesis tests for two

More information

Example: Four levels of herbicide strength in an experiment on dry weight of treated plants.

Example: Four levels of herbicide strength in an experiment on dry weight of treated plants. The idea of ANOVA Reminders: A factor is a variable that can take one of several levels used to differentiate one group from another. An experiment has a one-way, or completely randomized, design if several

More information

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS In our work on hypothesis testing, we used the value of a sample statistic to challenge an accepted value of a population parameter. We focused only

More information

Summary of Chapters 7-9

Summary of Chapters 7-9 Summary of Chapters 7-9 Chapter 7. Interval Estimation 7.2. Confidence Intervals for Difference of Two Means Let X 1,, X n and Y 1, Y 2,, Y m be two independent random samples of sizes n and m from two

More information

GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION

GROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION FOR SAMPLE OF RAW DATA (E.G. 4, 1, 7, 5, 11, 6, 9, 7, 11, 5, 4, 7) BE ABLE TO COMPUTE MEAN G / STANDARD DEVIATION MEDIAN AND QUARTILES Σ ( Σ) / 1 GROUPED DATA E.G. AGE FREQ. 0-9 53 10-19 4...... 80-89

More information

EXAMINERS REPORT & SOLUTIONS STATISTICS 1 (MATH 11400) May-June 2009

EXAMINERS REPORT & SOLUTIONS STATISTICS 1 (MATH 11400) May-June 2009 EAMINERS REPORT & SOLUTIONS STATISTICS (MATH 400) May-June 2009 Examiners Report A. Most plots were well done. Some candidates muddled hinges and quartiles and gave the wrong one. Generally candidates

More information

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Spring 2010 The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative

More information

Hypothesis Testing. Hypothesis: conjecture, proposition or statement based on published literature, data, or a theory that may or may not be true

Hypothesis Testing. Hypothesis: conjecture, proposition or statement based on published literature, data, or a theory that may or may not be true Hypothesis esting Hypothesis: conjecture, proposition or statement based on published literature, data, or a theory that may or may not be true Statistical Hypothesis: conjecture about a population parameter

More information

AP Statistics Cumulative AP Exam Study Guide

AP Statistics Cumulative AP Exam Study Guide AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics
