Exam details. Final Review Session. Things to Review

Exam details
- Short answer, similar to book problems
- Formulae and tables will be given
- You CAN use a calculator
- Date and time: Dec. 7, 2006, 1-1:30 pm
- Location: Osborne Centre, Unit 1 (A)

Things to review
- Concepts
- Basic formulae
- Statistical tests

Concepts, first half
- Populations and samples; random sample
- Parameters and estimates
- Mean, median, mode
- Variance, standard deviation
- Categorical data: nominal, ordinal
- Numerical data: discrete, continuous
- Alternative hypothesis; P-value
- Type I error, Type II error
- Sampling distribution; standard error
- Central limit theorem
- Normal distribution; quantile plot; Shapiro-Wilk test
- Data transformations
- Nonparametric tests
- Independent contrasts

Concepts, second half
- Observations vs. experiments
- Confounding variables
- Control group
- Replication and pseudoreplication
- Blocking
- Factorial design
- Power analysis
- Simulation
- Randomization
- Bootstrap
- Likelihood

Example conceptual questions (you've just done a two-sample t-test comparing body size of lizards on islands and the mainland):
- What is the probability of committing a type I error with this test?
- State an example of a confounding variable that may have affected this result.
- State one alternative statistical technique that you could have used to test the null hypothesis, and describe briefly how you would have carried it out.

Randomization test (schematic): take the sample, randomize the data, then calculate the same test statistic on the randomized data.

Statistical tests
- Binomial test
- Chi-squared goodness-of-fit (proportional, binomial, Poisson)
- Chi-squared contingency test
- t-tests: one-sample t-test, paired t-test, two-sample t-test
- F-test for comparing variances
- Welch's t-test
- Sign test
- Mann-Whitney U
- Correlation; Spearman's r
- Regression
- ANOVA

Quick reference summary: Binomial test
What is it for? Compares the proportion of successes in a sample to a hypothesized value, p0.
What does it assume? Individual trials are randomly sampled and independent.
Test statistic: X, the number of successes.
Distribution under H0: binomial with parameters n and p0.
Formula: P(x) = (n choose x) p^x (1 − p)^(n−x), where P(x) is the probability of a total of x successes, p is the probability of success in each trial, and n is the total number of trials. The P-value sums Pr[X] over all outcomes at least as extreme as the one observed.
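
The binomial-test recipe above can be sketched in a few lines of Python. This is a minimal illustration, not course material: the function name `binomial_test_p` and the 8-successes-in-10-trials example are my own, and the two-tailed P-value is computed by the "sum all outcomes no more probable than the observed one" convention.

```python
from math import comb

def binomial_test_p(x, n, p0):
    """Two-tailed binomial test P-value: sums Pr[k] over every outcome k
    that is no more probable under H0 (p = p0) than the observed count x."""
    probs = [comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(n + 1)]
    return sum(p for p in probs if p <= probs[x] + 1e-12)

# hypothetical example: 8 successes in 10 trials, tested against p0 = 0.5
p_value = binomial_test_p(8, 10, 0.5)
```

With these numbers the only outcomes as improbable as 8 are 0, 1, 2, 8, 9, and 10 successes, so the P-value is their summed binomial probabilities.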

Binomial test
H0: The relative frequency of successes in the population is p0.
HA: The relative frequency of successes in the population is not p0.
Test statistic: x = number of successes, compared to a binomial(n, p0) distribution.

Quick reference summary: Chi-squared goodness-of-fit test
What is it for? Compares observed frequencies in categories of a single variable to the expected frequencies under a random model.
What does it assume? Random samples; no expected values < 1; no more than 20% of expected values < 5.
Test statistic: χ².
Distribution under H0: χ² with df = (number of categories) − (number of estimated parameters) − 1.
Formula: χ² = Σ over all classes of (Observed_i − Expected_i)² / Expected_i

" goodness of fit test Sample Calculate expected values : Data fit a particular Discrete distribution " Goodness-of-Fit test " = ( Observed i # Expected i ) $ all classes Expected i : " With N-1-param. d.f. H 0 : The data come from a certain distribution H A : The data do not come from that distrubition Possible distributions " Pr[x] = $ n% ' p x 1( p # x& ( ) n(x Pr[ X ] = e"µ µ X X! Pr[x] = n * frequency of occurrence Proportional Binomial Poisson Given a number of categories Probability proportional to number of opportunities Days of the week, months of the year Number of successes in n trials Have to know n, p under the null hypothesis Punnett square, many p=0.5 examples Number of events in interval of space or time n not fixed, not given p Car wrecks, flowers in a field

Quick reference summary: Chi-squared contingency test
What is it for? Tests the null hypothesis of no association between two categorical variables.
What does it assume? Random samples; no expected values < 1; no more than 20% of expected values < 5.
Test statistic: χ².
Distribution under H0: χ² with df = (r − 1)(c − 1), where r = number of rows and c = number of columns.
Formulae: χ² = Σ over all classes of (Observed_i − Expected_i)² / Expected_i, with Expected = (RowTotal × ColTotal) / GrandTotal.
Procedure: from the sample, calculate the expected values, then compute χ² as above.
H0: There is no association between these two variables.
HA: There is an association between these two variables.

Quick reference summary: One-sample t-test
What is it for? Compares the mean of a numerical variable to a hypothesized value, μ0.
What does it assume? Individuals are randomly sampled from a population that is normally distributed.
Test statistic: t.
Distribution under H0: t-distribution with n − 1 degrees of freedom.
Formula: t = (Ȳ − μ0) / SE_Ȳ = (Ȳ − μ0) / (s / √n)
H0: The population mean is equal to μ0.
HA: The population mean is not equal to μ0.
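
The one-sample t formula translates directly into code. A minimal sketch with invented data (the function name `one_sample_t` and the sample values are mine):

```python
from math import sqrt

def one_sample_t(ys, mu0):
    """t = (ybar - mu0) / (s / sqrt(n)), compared to t with n - 1 df."""
    n = len(ys)
    ybar = sum(ys) / n
    s = sqrt(sum((y - ybar) ** 2 for y in ys) / (n - 1))   # sample SD
    return (ybar - mu0) / (s / sqrt(n)), n - 1

# hypothetical measurements tested against mu0 = 5.0
t_stat, df = one_sample_t([4.8, 5.1, 5.3, 4.9, 5.4], 5.0)
```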

Paired vs. two-sample comparisons

Quick reference summary: Paired t-test
What is it for? To test whether the mean difference in a population equals a null hypothesized value, μd0.
What does it assume? Pairs are randomly sampled from a population; the differences are normally distributed.
Test statistic: t.
Distribution under H0: t-distribution with n − 1 degrees of freedom, where n is the number of pairs.
Formula: t = (d̄ − μd0) / SE_d̄
H0: The mean difference is equal to 0.
HA: The mean difference is not equal to 0.

Quick reference summary: Two-sample t-test
What is it for? Tests whether two groups have the same mean.
What does it assume? Both samples are random samples; the numerical variable is normally distributed within both populations; the variance of the distribution is the same in the two populations.
Test statistic: t.
Distribution under H0: t-distribution with n1 + n2 − 2 degrees of freedom.
Formulae:
t = (Ȳ1 − Ȳ2) / SE_(Ȳ1−Ȳ2)
SE_(Ȳ1−Ȳ2) = √( s_p² (1/n1 + 1/n2) )
s_p² = (df1 s1² + df2 s2²) / (df1 + df2)
H0: The means of the two populations are equal (μ1 = μ2).
HA: The means of the two populations are not equal.
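
The pooled-variance formulas above can be checked with a short sketch (data and the name `two_sample_t` are hypothetical, chosen only to illustrate the arithmetic):

```python
from math import sqrt

def two_sample_t(y1, y2):
    """Pooled two-sample t statistic, compared to t with n1 + n2 - 2 df."""
    n1, n2 = len(y1), len(y2)
    m1, m2 = sum(y1) / n1, sum(y2) / n2
    ss1 = sum((y - m1) ** 2 for y in y1)
    ss2 = sum((y - m2) ** 2 for y in y2)
    sp2 = (ss1 + ss2) / (n1 + n2 - 2)          # pooled variance
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, n1 + n2 - 2

t_stat, df = two_sample_t([4.0, 5.0, 6.0], [7.0, 8.0, 9.0])
```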

Quick reference summary: F-test for comparing the variance of two groups
H0: The two populations have the same variance (σ1² = σ2²).
HA: σ1² ≠ σ2².
Test statistic: F = s1² / s2², compared to an F distribution with n1 − 1 and n2 − 1 degrees of freedom.

Quick reference summary: Welch's t-test
What is it for? Tests whether two populations have the same mean (μ1 = μ2) without assuming equal variances.
Test statistic: t = (Ȳ1 − Ȳ2) / √( s1²/n1 + s2²/n2 ), compared to a t-distribution with degrees of freedom given by a separate formula.

Parametric tests and their nonparametric counterparts
- One-sample and paired t-test → sign test
- Two-sample t-test → Mann-Whitney U-test

Quick reference summary: Sign test
What is it for? A nonparametric test comparing the median of a group to some constant.
What does it assume? Random samples.
Formula: identical to a binomial test with p0 = 0.5. The test statistic is x = the number of subjects with values greater than the hypothesized median m0, compared to a binomial(n, 0.5) distribution.

Sign test
H0: The median is equal to m0.
HA: The median is not equal to m0.

Quick reference summary: Mann-Whitney U test
What is it for? A nonparametric test comparing the central tendencies of two groups.
What does it assume? Random samples.
Test statistic: U.
Distribution under H0: U distribution, with sample sizes n1 and n2.
Formulae:
U1 = n1 n2 + n1(n1 + 1)/2 − R1
U2 = n1 n2 − U1
where n1 and n2 are the sample sizes of groups 1 and 2, and R1 is the sum of the ranks of group 1. Use the larger of U1 or U2 for a two-tailed test.
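
The U formulas are easy to verify on toy data. This sketch assumes no tied values and uses made-up numbers (the function name `mann_whitney_u` is mine):

```python
def mann_whitney_u(group1, group2):
    """U statistic from rank sums; this sketch assumes all values are distinct."""
    n1, n2 = len(group1), len(group2)
    pooled = sorted(group1 + group2)
    ranks = {v: i + 1 for i, v in enumerate(pooled)}   # rank 1 = smallest
    r1 = sum(ranks[v] for v in group1)                  # sum of group-1 ranks
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 - u1
    return max(u1, u2)   # use the larger U for a two-tailed test

u = mann_whitney_u([1.2, 3.4, 5.6], [2.1, 7.8, 9.9])
```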

Quick reference guide: Correlation coefficient
What is it for? Measuring the strength of a linear association between two numerical variables.
What does it assume? Bivariate normality and random sampling.
Parameter: ρ. Estimate: r.
Formulae:
r = Σ(Xi − X̄)(Yi − Ȳ) / √( Σ(Xi − X̄)² Σ(Yi − Ȳ)² )
SE_r = √( (1 − r²) / (n − 2) )

Quick reference guide: t-test for zero linear correlation
What is it for? To test the null hypothesis that the population parameter, ρ, is zero.
What does it assume? Bivariate normality and random sampling.
Test statistic: t = r / SE_r, compared to a t-distribution with n − 2 degrees of freedom.
H0: ρ = 0.
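
A minimal sketch of r and its t statistic, on invented data (the helper name `pearson_r_t` and the five points are mine):

```python
from math import sqrt

def pearson_r_t(xs, ys):
    """Correlation coefficient r and its t statistic with n - 2 df."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    r = sxy / sqrt(sxx * syy)
    se_r = sqrt((1 - r ** 2) / (n - 2))
    return r, r / se_r

r, t_stat = pearson_r_t([1, 2, 3, 4, 5], [2, 4, 5, 4, 6])
```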

Quick reference guide: Spearman's rank correlation
What is it for? To test for zero correlation between the ranks of two variables.
What does it assume? A linear relationship between ranks, and random sampling.
Test statistic: r_s. Distribution under H0: see the table of critical values (Table H); if n > 100, use the t-distribution.
Formulae: same as linear correlation, but based on ranks.
H0: ρ = 0 (for the ranks).

Assumptions of regression
- At each value of X, there is a population of Y values whose mean lies on the true regression line.
- At each value of X, the distribution of Y values is normal.
- The variance of Y values is the same at all values of X.
- At each value of X, the Y measurements represent a random sample from the population of Y values.

[Residual plot examples: OK; non-linear; non-normal; unequal variance.]

Quick reference summary: Confidence interval for the regression slope
What is it for? Estimating the slope of the linear equation Y = α + βX between an explanatory variable X and a response variable Y.
What does it assume? The relationship between X and Y is linear; each Y at a given X is a random sample from a normal distribution with equal variance.
Parameter: β. Estimate: b. Degrees of freedom: n − 2.
Formulae:
b − t_{α(2),df} SE_b < β < b + t_{α(2),df} SE_b
SE_b = √( MS_residual / Σ(Xi − X̄)² )
MS_residual = [ Σ(Yi − Ȳ)² − b Σ(Xi − X̄)(Yi − Ȳ) ] / (n − 2)

Quick reference summary: t-test for the regression slope
What is it for? To test the null hypothesis that the population parameter β equals a null hypothesized value, usually 0.
What does it assume? Same as the regression slope confidence interval.
Test statistic: t = b / SE_b, compared to a t-distribution with n − 2 degrees of freedom.
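
The slope, its standard error, and the confidence interval can be sketched together. The data, the critical value, and the name `slope_inference` are illustrative assumptions (3.182 is the two-tailed 5% t critical value for 3 df, matching n = 5 points):

```python
from math import sqrt

def slope_inference(xs, ys, t_crit):
    """Slope b, SE_b, t statistic, and a confidence interval for beta."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    b = sxy / sxx
    ms_resid = (syy - b * sxy) / (n - 2)       # residual mean square
    se_b = sqrt(ms_resid / sxx)
    return b, se_b, b / se_b, (b - t_crit * se_b, b + t_crit * se_b)

b, se_b, t_stat, ci = slope_inference([1, 2, 3, 4, 5], [2, 4, 5, 4, 6], 3.182)
```

Note that the t statistic for the slope equals the t statistic for the correlation on the same data, as it should.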

t-test for the regression slope
H0: β = 0. Test statistic: t = b / SE_b, compared to t with n − 2 df.

Quick reference summary: ANOVA (analysis of variance)
What is it for? Testing the difference among k means simultaneously.
What does it assume? The variable is normally distributed with equal standard deviations (and variances) in all k populations; each sample is a random sample.
Test statistic: F.
Distribution under H0: F distribution with k − 1 and N − k degrees of freedom.
Formulae:
F = MS_group / MS_error
MS_group = SS_group / df_group = SS_group / (k − 1), with SS_group = Σ n_i (Ȳ_i − Ȳ)²
MS_error = SS_error / df_error = SS_error / (N − k), with SS_error = Σ s_i² (n_i − 1)
where Ȳ_i is the mean of group i, Ȳ is the overall mean, n_i is the size of sample i, and N is the total sample size.

ANOVA
H0: All of the groups have the same mean.
HA: At least one of the groups has a mean that differs from the others.
Test statistic: F = MS_group / MS_error, compared to an F distribution with k − 1 and N − k df.

ANOVA table
Source of variation | Sum of squares               | df    | Mean squares                    | F ratio
Treatment           | SS_group = Σ n_i (Ȳ_i − Ȳ)²  | k − 1 | MS_group = SS_group / df_group  | MS_group / MS_error
Error               | SS_error = Σ s_i² (n_i − 1)  | N − k | MS_error = SS_error / df_error  |
Total               | SS_group + SS_error          | N − 1 |                                 |

[Diagram: partition of SS_Total (MS_Total) into SS_Group (MS_Group) and SS_Error (MS_Error).]
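
The ANOVA table boils down to a few sums. Here is a minimal sketch computing F from the formulas above, on made-up groups (the name `one_way_anova_f` is mine):

```python
def one_way_anova_f(groups):
    """F = MS_group / MS_error for k groups, with (k - 1, N - k) df."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_group = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_error = sum(sum((y - m) ** 2 for y in g) for g, m in zip(groups, means))
    ms_group = ss_group / (k - 1)
    ms_error = ss_error / (n_total - k)
    return ms_group / ms_error, (k - 1, n_total - k)

f, (df1, df2) = one_way_anova_f([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
```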

Two-factor ANOVA table
Source of variation       | Sum of squares | df                | Mean square                   | F ratio
Treatment 1               | SS_1           | k1 − 1            | SS_1 / (k1 − 1)               | MS_1 / MS_error
Treatment 2               | SS_2           | k2 − 1            | SS_2 / (k2 − 1)               | MS_2 / MS_error
Treatment 1 × Treatment 2 | SS_1×2         | (k1 − 1)(k2 − 1)  | SS_1×2 / [(k1 − 1)(k2 − 1)]   | MS_1×2 / MS_error
Error                     | SS_error       | XXX               | SS_error / XXX                |
Total                     | SS_total       | N − 1             |                               |

[Figure panels, interpretations of 2-way ANOVA terms: effect of temperature but not pH; effect of pH but not temperature.]

[Figure panels, interpretations of 2-way ANOVA terms: effect of pH and temperature with no interaction; effect of pH and temperature with an interaction.]

Quick reference summary: 2-way ANOVA
What is it for? Testing the difference among means from a two-way factorial experiment.
What does it assume? The variable is normally distributed with equal standard deviations (and variances) in all populations; each sample is a random sample.
Test statistic: F (for three different hypotheses).
Distribution under H0: F distribution.
Formulae: just need to know how to fill in the table.

2-way ANOVA
Null hypotheses (three of them), each tested with an F ratio against MS_error:
- No effect of treatment 1 (F = MS_1 / MS_error)
- No effect of treatment 2 (F = MS_2 / MS_error)
- No interaction between the treatments (F = MS_1×2 / MS_error)

General linear models
First step: formulate a model statement. Example: Y = μ + TREATMENT
Second step: make an ANOVA table. Example:
Source of variation | Sum of squares               | df    | Mean squares                    | F ratio
Treatment           | SS_group = Σ n_i (Ȳ_i − Ȳ)²  | k − 1 | MS_group = SS_group / df_group  | MS_group / MS_error
Error               | SS_error = Σ s_i² (n_i − 1)  | N − k | MS_error = SS_error / df_error  |
Total               | SS_group + SS_error          | N − 1 |                                 |

Which test do I use?
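
The randomization test described earlier (shuffle the data, recompute the same test statistic) can be sketched for a two-sample difference in means. The data, iteration count, and function name `randomization_test` are illustrative assumptions:

```python
import random

def randomization_test(y1, y2, n_iter=9999, seed=1):
    """Two-sample randomization test on the difference in means.

    Reshuffles group labels, recomputes the same test statistic each time,
    and reports the fraction of randomized statistics at least as extreme
    as the observed one (the observed arrangement is counted once)."""
    rng = random.Random(seed)
    observed = abs(sum(y1) / len(y1) - sum(y2) / len(y2))
    pooled = list(y1) + list(y2)
    n1 = len(y1)
    count = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        g1, g2 = pooled[:n1], pooled[n1:]
        if abs(sum(g1) / len(g1) - sum(g2) / len(g2)) >= observed:
            count += 1
    return (count + 1) / (n_iter + 1)

p_val = randomization_test([1, 2, 3], [6, 7, 8])
```

With only 3 observations per group there are 20 possible arrangements, and only the observed split and its mirror reach the observed difference, so the P-value settles near 2/20 = 0.1.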

How many variables am I comparing?
1. One variable → methods for a single variable
2. Two variables → methods for comparing two variables
3. Three or more variables → methods for comparing three or more variables

Methods for one variable: is the variable categorical or numerical?
- Categorical, comparing to a single proportion p0: binomial test
- Categorical, comparing to a distribution: χ² goodness-of-fit test
- Numerical: one-sample t-test

Methods for two variables (X = explanatory variable, Y = response variable)
- Categorical X, categorical Y: contingency table, grouped bar graph, mosaic plot; contingency analysis
- Categorical X, numerical Y: multiple histograms, cumulative frequency distributions; t-test, ANOVA
- Numerical X, categorical Y: logistic regression
- Numerical X, numerical Y: scatter plot; correlation, regression