Statistical Analysis for QBIC Genetics Adapted by Ellen G. Dow 2017


I. χ² or chi-square test

Objectives:
Compare how closely an experimentally derived value agrees with an expected value.
Learn one method to estimate the probability that an observed value did or did not occur by chance.

The χ² test helps us determine whether or not there is a statistically significant difference between the expected and observed values. We choose an arbitrary cut-off for significance, typically a p value of 0.05 or 0.01, meaning there is a 5 or 1 percent probability of a difference this large occurring due to chance alone.

The χ² test also uses two hypotheses, known as the null and alternative hypotheses.
The null hypothesis or H0 states that there is no significant difference between the observed and expected values.
The alternative hypothesis or HA states that there is a significant difference between the observed and expected values.

The χ² or chi-square analysis begins with finding the χ² value. For each category, subtract the expected value from the observed value, square the result, and divide by the expected value; then sum these terms over all categories. Or as an equation:

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

You can abbreviate observed as O and expected as E, simplifying the general equation to:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Once you have a χ² value, you will then calculate your degrees of freedom or df:

$$df = n - 1$$

Here n is the number of discrete categories, each of which has an expected and an observed value. The df is the number of those values that are allowed to vary freely.
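As a minimal sketch of the calculation described above, the Python snippet below computes the χ² value and df for a made-up cross. The counts (290 and 110 offspring tested against a 3:1 ratio) and the function name `chi_square` are illustrative assumptions, not data from this handout.

```python
def chi_square(observed, expected):
    """Return the chi-square value: the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of categories")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical example: 400 offspring from a cross expected to give a 3:1 ratio.
observed = [290, 110]
total = sum(observed)
expected = [total * 3 / 4, total * 1 / 4]  # 300 and 100

chi2 = chi_square(observed, expected)
df = len(observed) - 1  # number of categories minus one

print(f"chi-square = {chi2:.2f}, df = {df}")  # chi-square = 1.33, df = 1
```

Note that the expected values are computed from the observed total, so both columns always add up to the same number of offspring.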

Using the df and χ² value, we can use the χ² table to find the probability, p. First, go down the df column until you find the correct df; this is the row in which you will look for your χ² value. From the df, move to the right until you reach your χ² value. Your χ² value will most likely not equal any number in the table exactly, so you are looking for the two columns that your χ² value falls between. Once you have found the two table values on either side of your χ² value, go up to the top row to find the corresponding probabilities, or p values. Notice that the p values are between 0 and 1.

Chi Square Table: Probability (p) by Degrees of Freedom (df)

df | 0.95  | 0.90 | 0.80 | 0.70 | 0.50 | 0.30  | 0.20  | 0.10  | 0.05  | 0.01  | 0.001
1  | 0.004 | 0.02 | 0.06 | 0.15 | 0.46 | 1.07  | 1.64  | 2.71  | 3.84  | 6.64  | 10.83
2  | 0.10  | 0.21 | 0.45 | 0.71 | 1.39 | 2.41  | 3.22  | 4.60  | 5.99  | 9.21  | 13.82
3  | 0.35  | 0.58 | 1.01 | 1.42 | 2.37 | 3.66  | 4.64  | 6.25  | 7.82  | 11.34 | 16.27
4  | 0.71  | 1.06 | 1.65 | 2.20 | 3.36 | 4.88  | 5.99  | 7.78  | 9.49  | 13.28 | 18.47
5  | 1.14  | 1.61 | 2.34 | 3.00 | 4.35 | 6.06  | 7.29  | 9.24  | 11.07 | 15.09 | 20.52
6  | 1.63  | 2.20 | 3.07 | 3.83 | 5.35 | 7.23  | 8.56  | 10.64 | 12.59 | 16.81 | 22.46
7  | 2.17  | 2.83 | 3.82 | 4.67 | 6.35 | 8.38  | 9.80  | 12.02 | 14.07 | 18.48 | 24.32
8  | 2.73  | 3.49 | 4.59 | 5.53 | 7.34 | 9.52  | 11.03 | 13.36 | 15.51 | 20.09 | 26.12
9  | 3.32  | 4.17 | 5.38 | 6.39 | 8.34 | 10.66 | 12.24 | 14.68 | 16.92 | 21.67 | 27.88
10 | 3.94  | 4.86 | 6.18 | 7.27 | 9.34 | 11.78 | 13.44 | 15.99 | 18.31 | 23.21 | 29.59

Non-significant: p columns 0.95 through 0.10. Significant: p columns 0.05 through 0.001.

The p values denote 0-100% occurrences by chance. You can write your p value as:

higher probability > p value > lower probability

This means that your p value falls between the two probabilities: it is less than the higher probability value but greater than the lower probability value.
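The table lookup described above can be sketched in Python as follows. This is only an illustration: the names `P_COLUMNS`, `CHI2_TABLE`, `bracket_p`, and `describe` are hypothetical, and only the rows for df = 1 to 3 are copied from the table to keep the example short.

```python
# Probability columns and chi-square values copied from the table (df = 1 to 3 only).
P_COLUMNS = [0.95, 0.90, 0.80, 0.70, 0.50, 0.30, 0.20, 0.10, 0.05, 0.01, 0.001]
CHI2_TABLE = {
    1: [0.004, 0.02, 0.06, 0.15, 0.46, 1.07, 1.64, 2.71, 3.84, 6.64, 10.83],
    2: [0.10, 0.21, 0.45, 0.71, 1.39, 2.41, 3.22, 4.60, 5.99, 9.21, 13.82],
    3: [0.35, 0.58, 1.01, 1.42, 2.37, 3.66, 4.64, 6.25, 7.82, 11.34, 16.27],
}


def bracket_p(chi2, df):
    """Return (higher_p, lower_p) around the p value; None marks an end off the chart."""
    row = CHI2_TABLE[df]
    # Count how many table values in this row the chi-square value meets or exceeds.
    passed = sum(1 for table_value in row if chi2 >= table_value)
    higher = P_COLUMNS[passed - 1] if passed > 0 else None
    lower = P_COLUMNS[passed] if passed < len(row) else None
    return higher, lower


def describe(chi2, df):
    """Write the bracket in the form: higher probability > p > lower probability."""
    higher, lower = bracket_p(chi2, df)
    if higher is None:  # chi-square is smaller than every value in the row
        return f"p > {lower}"
    if lower is None:  # chi-square is larger than every value in the row
        return f"p < {higher}"
    return f"{higher} > p > {lower}"


print(describe(1.33, 1))  # 0.3 > p > 0.2
```

For a χ² value of 1.33 with df = 1, the value falls between 1.07 and 1.64 in the first row, so the p value lies between 0.30 and 0.20.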

There is also the case in which the χ² value is very small and falls below the smallest value in the chart; this corresponds to a p value greater than 0.95, written as p value > 0.95. On the other side of the spectrum, the χ² value may be very large and beyond the largest value in the chart. In this case the p value is written as p value < 0.001.

When writing your own conclusion based on the results of a χ² test, include the specifics of what you measured using the observed and expected values, the p value, and the conclusion in terms of the null hypothesis.

p > 0.05: There is not a significant difference between the expected and observed values of [specifically state what you measured] (p value); therefore, we fail to reject the null hypothesis.

p < 0.05: There is a significant difference between the expected and observed values of [specifically state what you measured] (p value); therefore, we reject the null hypothesis.
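The decision rule above can be written as a short Python sketch. The function name `conclude` and its arguments are hypothetical; it assumes the p value has already been bracketed between two table probabilities (use 0.0 or 1.0 for an end that is off the chart).

```python
def conclude(p_low, p_high, alpha=0.05):
    """Decide about the null hypothesis from a bracketed p value, p_low < p < p_high."""
    if p_high <= alpha:
        # The whole bracket lies at or below the cut-off: significant difference.
        return "reject the null hypothesis"
    if p_low >= alpha:
        # The whole bracket lies at or above the cut-off: no significant difference.
        return "fail to reject the null hypothesis"
    raise ValueError("the bracket straddles the cut-off; a finer table is needed")


print(conclude(0.20, 0.30))  # fail to reject the null hypothesis
print(conclude(0.01, 0.05))  # reject the null hypothesis
```

Because the table has a column at exactly 0.05, a bracket read from it never straddles that cut-off.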

If we go back to the previous unit using a dihybrid cross, we can hypothesize the offspring ratio for a heterozygous vestigial stubble female crossed with a heterozygous vestigial male.

1. Female vg/+; s/+ x male vg/+; s/+

F2 genotypic ratio:

F2 phenotypic ratio:

Phenotype         | Expected | Observed
Wildtype          |          | 35
Vestigial         |          | 17
Stubble           |          | 12
Vestigial/Stubble |          | 7
Totals            |          |

A) Fill in the expected number of each phenotype.
B) Write the appropriate null hypothesis:
C) Write the appropriate alternative hypothesis:
D) Perform a Chi Square test. Show all your work.
E) What is the Chi Square value?
F) What are the degrees of freedom?
G) What is the associated P-value?
H) What is your conclusion in terms of the null hypothesis?

When is it appropriate to use the Chi-Square test?

II. T-test

Objectives:
Understand that a second way to get a p value, or measure probability, is through a t-test.
Be able to perform a t-test in Excel with a data set.

The t-test uses a different approach to finding the probability of significance. Rather than comparing observed values with expected values, as in the χ² test, you are measuring the statistical difference between two sets of measured values. The general equation used is:

$$t = \frac{\text{Estimate of the parameter} - \text{Hypothesized value of the parameter}}{\text{Estimated standard error of the estimator}}$$

Estimate of the parameter = difference between the two sample means
Hypothesized value of the parameter = expected difference between the two sample means based on the null hypothesis (no difference)
Estimated standard error of the estimator = standard error of that difference, referring to its variability due to sampling (its calculation depends on equal versus unequal variance)

Because the hypothesized difference is zero, the two-sample t statistic is the difference between the two means divided by the standard error.

With this equation, you create two hypotheses (or more!) based on the variable whose impact you are measuring. In this case, we have a control group and a manipulated group. Both groups have exactly the same conditions aside from one variable, so we can measure the impact of that variable between the two groups. The group that does not have the variable factor is called the control group.

The null hypothesis or H0 states that there is no significant difference between the two groups, the one that has the variable and the control:

mean 1 = mean 2

The alternative hypothesis or HA states that there is a significant difference between the two groups dependent on the variable. We can state that the two means are not equal, or we can state that the variable creates a greater-than or less-than difference.

The two values are not equal: mean 1 ≠ mean 2
One value is less than the other: mean 1 < mean 2
One value is greater than the other: mean 1 > mean 2

Within Microsoft Excel, we can use the t test function =TTEST. It takes 4 values:

=TTEST(array1, array2, tails, type)

Array1 = all cells containing results from sample group 1
Array2 = all cells containing results from sample group 2
Tails = 1 or 2, dependent on the alternative hypothesis (1 for a greater-than or less-than hypothesis, 2 for a not-equal hypothesis)
Type = 1, 2, or 3
  1 = paired
  2 = two-sample, equal variance
  3 = two-sample, unequal variance

Because we are not going to test the equality of variance, we will use either type 1 or type 3.

Paired t test: uses the same sample to measure a variable before and after (repeated measures). In this case, the same individuals are measured twice.
Unpaired t test: uses two independent samples to measure the effect of a variable. In this case, each individual is measured only once.

Once you enter =TTEST(array1, array2, tails, type), you will have a resulting p value.
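To show what sits behind the two test types used here, the sketch below computes the t statistic by hand in Python for an unpaired, unequal-variance comparison and for a paired comparison. The function names and the data are made up for illustration, and the unequal-variance version uses Welch's formula, which is assumed here to match type 3. Unlike TTEST, the sketch stops at the t statistic and degrees of freedom; it does not compute the p value.

```python
import math
import statistics


def unpaired_t(sample1, sample2):
    """t statistic and df for two independent samples with unequal variance (Welch)."""
    n1, n2 = len(sample1), len(sample2)
    v1, v2 = statistics.variance(sample1), statistics.variance(sample2)
    standard_error = math.sqrt(v1 / n1 + v2 / n2)
    hypothesized_difference = 0  # null hypothesis: mean 1 = mean 2
    estimate = statistics.mean(sample1) - statistics.mean(sample2)
    t = (estimate - hypothesized_difference) / standard_error
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 / n1 + v2 / n2) ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
    )
    return t, df


def paired_t(before, after):
    """t statistic and df when the same individuals are measured twice."""
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / standard_error, n - 1


# Hypothetical measurements, not data from this handout.
control = [10, 12, 11, 13, 14]
treated = [14, 15, 13, 17, 16]

t_unpaired, df_unpaired = unpaired_t(control, treated)
t_paired, df_paired = paired_t(control, treated)
print(f"unpaired: t = {t_unpaired:.2f}, df = {df_unpaired:.1f}")
print(f"paired:   t = {t_paired:.2f}, df = {df_paired}")
```

The same numbers give a much larger t statistic when treated as paired, because pairing removes the variation between individuals from the standard error. This is why the choice of type should follow the experimental design rather than the result.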

2. You will conduct a t-test using the height data and compare male to female heights. Write corresponding null and alternative hypotheses for this data comparison. Complete the t-test in Excel. Write a conclusion in terms of the null hypothesis.