Ten Z-Test Routines from Gopal Kanji's 100 Statistical Tests


Ten Z-Test Routines from Gopal Kanji's 100 Statistical Tests

Maurice HT Ling
School of Chemical and Life Sciences, Singapore Polytechnic, Singapore
Department of Zoology, The University of Melbourne, Australia
mauriceling@acm.org

Abstract

This manuscript presents the implementation and testing of 10 Z-test routines from Gopal Kanji's book entitled 100 Statistical Tests.

1. Introduction

Kanji's (2006) 100 Statistical Tests is a collection of commonly used statistical tests, ranging from single-sample to multi-sample parametric and non-parametric tests, with a sample calculation provided for each test. Written in a recipe style, Kanji (2006) serves as a useful desktop reference for statistics practitioners. It is therefore conceivable that a Python implementation of these tests will be useful on its own or for inclusion into other Python-based statistical packages, such as SalStat (Salmoni, 2008).

This manuscript presents the implementation (file name: ZTests.py) and testing (file name: testfile.py) of 10 Z-test routines from Kanji (2006). The details and limitations of each test are given as docstrings in the code below. A statistical test harness (the test function in ZTests.py) is implemented to standardize the implementation of the hypothesis testing routines. This harness can be used for both 1-tailed and 2-tailed hypothesis tests: it reports the upper and lower critical values for a given confidence level and states whether the calculated test statistic is above the upper critical value (in the upper critical region), below the lower critical value (in the lower critical region), or in neither region. The only limitation of this harness is the assumption that the test is symmetrical (the upper and lower critical regions are equal in terms of probability density). These Z-tests use the standardized normal distribution, with mean zero and variance 1 (class NormalDistribution).
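The decision logic of this harness can be sketched in a few standalone lines. The version below is only an illustration, not part of ZTests.py: it substitutes the standard library's statistics.NormalDist (Python 3.8+) for the paper's NormalDistribution class, and the function name check_statistic is invented for this sketch.

```python
from statistics import NormalDist  # stdlib stand-in for NormalDistribution


def check_statistic(statistic, confidence):
    """Return [left result, left critical, statistic, right critical,
    right result], mirroring the 5-element list produced by the paper's
    test harness."""
    dist = NormalDist()  # standard normal: mean 0.0, sigma 1.0
    left = dist.inv_cdf(1.0 - confidence)   # lower critical value
    right = dist.inv_cdf(confidence)        # upper critical value
    return [statistic <= left, left, statistic, right, statistic >= right]


result = check_statistic(1.8, 0.975)
# result[0] and result[4] are both False: 1.8 lies between the critical
# values of roughly -1.96 and +1.96, so in a 2-tailed test at the 5%
# level the null hypothesis is accepted.
```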
Briefly, the cumulative density function (CDF) calculates the cumulative probability density from negative infinity to a given x-value using the complementary error function (Press et al., 1989), while the inverse CDF (which calculates the x-value given a probability density) searches, by recursive bisection between -10.0 and 10.0 standard deviations (widening the bracket in steps of 5 standard deviations when the target lies outside it), for the point at which the cumulative probability density from CDF matches the given probability density, and reports the x-value at that point.
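The same search can be rendered iteratively in a few lines. This is a sketch for comparison only: it uses math.erf from the standard library for the CDF (ZTests.py derives it from an erfcc approximation instead), omits the bracket-widening step, and the names cdf and inverse_cdf are illustrative.

```python
from math import erf, sqrt


def cdf(x):
    # Standard normal CDF via the stdlib error function; the code below
    # computes the same quantity from a complementary error function
    # approximation (erfcc) instead.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def inverse_cdf(p, lo=-10.0, hi=10.0, error=1e-8):
    # Repeatedly halve the bracket [lo, hi] until the CDF at the midpoint
    # matches the target probability to within 'error'; return the x-value
    # and the cumulative probability actually attained there.
    while True:
        mid = (lo + hi) / 2.0
        c = cdf(mid)
        if abs(c - p) < error:
            return mid, c
        if p < c:
            hi = mid
        else:
            lo = mid


x, cp = inverse_cdf(0.975)   # x is approximately 1.96
```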

These codes are licensed under the Lesser General Public Licence version 3.

2. Code Files

File: ZTests.py

"""
This file contains the implementation of 10 Z-test routines from Gopal
Kanji's book [1] and is published under The Python Papers Source Codes [2].

The implemented statistical tests are:
1. Z-test for a population mean (variance known), test 1 of [1]
2. Z-test for two population means (variances known and equal), test 2 of [1]
3. Z-test for two population means (variances known and unequal), test 3 of [1]
4. Z-test for a proportion (binomial distribution), test 4 of [1]
5. Z-test for the equality of two proportions (binomial distribution),
   test 5 of [1]
6. Z-test for comparing two counts (Poisson distribution), test 6 of [1]
7. Z-test of a correlation coefficient, test 13 of [1]
8. Z-test for two correlation coefficients, test 14 of [1]
9. Z-test for correlated proportions, test 23 of [1]
10. Spearman rank correlation test (paired observations), test 58 of [1]

These codes are licensed under the Lesser General Public Licence version 3.

[1] Kanji, Gopal K. 2006. 100 Statistical Tests, 3rd Edition. Sage
    Publications.
[2] Ling, MHT. 2009. Ten Z-Test Routines from Gopal Kanji's 100 Statistical
    Tests. The Python Papers Source Codes 1:5
"""

from math import exp, sqrt, log

SQRT2 = 1.4142135623730950488016887242096980785696718753769
PI2 = 6.2831853071795864769252867665590057683943387987502


def erfcc(x):
    """Complementary error function, similar to erfc(x) but with fractional
    error less than 1.2e-7.

    @see: Numerical Recipes in Pascal (Recipe 6.2)
    @param x: float number
    @return: float number"""
    z = abs(x)
    t = 1.0 / (1.0 + 0.5 * z)
    r = t * exp(-z * z - 1.26551223 + t * (1.00002368 + t * (0.37409196 +
        t * (0.09678418 + t * (-0.18628806 + t * (0.27886807 +
        t * (-1.13520398 + t * (1.48851587 + t * (-0.82215223 +
        t * 0.17087277)))))))))
    if x >= 0.0:
        return r
    else:
        return 2.0 - r


class NormalDistribution:
    """Class for the standardized normal distribution
    (area under the curve = 1)."""

    def __init__(self):
        self.mean = 0.0
        self.stdev = 1.0

    def CDF(self, x):
        """Calculates the cumulative probability from -infinity to x."""
        return 1.0 - 0.5 * erfcc(x / SQRT2)

    def PDF(self, x):
        """Calculates the density (probability) at x by the formula
        f(x) = 1 / (sqrt(2 pi) sigma) e^-(x^2 / (2 sigma^2))
        where mu is the mean of the distribution and sigma the standard
        deviation.

        @param x: x-value at which to calculate the density
        @return: probability density at x"""
        return (1 / (sqrt(PI2) * self.stdev)) * \
               exp(-(x ** 2 / (2 * self.stdev ** 2)))

    def inverseCDF(self, probability, start=-10.0, end=10.0, error=10e-8):
        """Does the reverse of the CDF() method: takes a probability value
        and returns the corresponding value on the x-axis, together with
        the cumulative probability actually attained.

        @param probability: probability under the curve from -infinity
        @param start: lower boundary of calculation (default = -10)
        @param end: upper boundary of calculation (default = 10)
        @param error: tolerated difference between the given and the
        calculated probabilities (default = 10e-8)
        @return (start, cprob): 'start' is the x-value (in standard
        deviations) for which the area under the curve from -infinity is
        the given 'probability' (+/- error); 'cprob' is the calculated
        area under the curve from -infinity to the returned 'start'."""
        # check for tolerance
        if abs(self.CDF(start) - probability) < error:
            return (start, self.CDF(start))
        # case 1: lower than -10 standard deviations
        if probability < self.CDF(start):
            return self.inverseCDF(probability, start - 5, start, error)
        # case 2: between -10 and 10 standard deviations (bisection method)
        if probability > self.CDF(start) and \
           probability < self.CDF((start + end) / 2):
            return self.inverseCDF(probability, start,
                                   (start + end) / 2, error)
        if probability > self.CDF((start + end) / 2) and \
           probability < self.CDF(end):
            return self.inverseCDF(probability, (start + end) / 2,
                                   end, error)
        # case 3: higher than 10 standard deviations
        if probability > self.CDF(end):
            return self.inverseCDF(probability, end, end + 5, error)


def test(statistic, distribution, confidence):
    """Generates the critical values from the distribution and confidence
    level using the distribution's inverseCDF method, and performs 1-tailed
    and 2-tailed tests by comparing the calculated statistic with the
    critical values.

    Returns a 5-element list:
    [left result, left critical, statistic, right critical, right result]
    where
    left result = True (statistic in lower critical region) or
                  False (statistic not in lower critical region)
    left critical = lower critical value generated from 1 - confidence
    statistic = calculated statistic value
    right critical = upper critical value generated from confidence
    right result = True (statistic in upper critical region) or
                   False (statistic not in upper critical region)

    Therefore, the null hypothesis is accepted if left result and right
    result are both False in a 2-tailed test.

    @param statistic: calculated statistic (float)
    @param distribution: distribution to calculate the critical values
    @type distribution: instance of a statistical distribution
    @param confidence: confidence level of a one-tail test (usually 0.95
    or 0.99); use 0.975 or 0.995 for a 2-tail test
    @type confidence: float of less than 1.0"""
    data = [None, None, statistic, None, None]
    data[1] = distribution.inverseCDF(1.0 - confidence)[0]
    if data[1] < statistic:
        data[0] = False
    else:
        data[0] = True
    data[3] = distribution.inverseCDF(confidence)[0]
    if statistic < data[3]:
        data[4] = False
    else:
        data[4] = True
    return data


def Z1Mean1Variance(**kwargs):
    """Test 1: Z-test for a population mean (variance known)

    To investigate the significance of the difference between an assumed
    population mean and a sample mean when the population variance is
    known.

    Limitations:
    1. Requires the population variance (use Test 7 if the population
    variance is unknown).

    @param smean: sample mean
    @param pmean: population mean
    @param pvar: population variance
    @param ssize: sample size
    @param confidence: confidence level"""
    smean = kwargs['smean']
    pmean = kwargs['pmean']
    pvar = kwargs['pvar']
    ssize = kwargs['ssize']
    statistic = abs(smean - pmean) / \
                (pvar / sqrt(ssize))
    return test(statistic, NormalDistribution(), kwargs['confidence'])


def Z2Mean1Variance(**kwargs):
    """Test 2: Z-test for two population means (variances known and equal)

    To investigate the significance of the difference between the means of
    two samples when the variances are known and equal.

    Limitations:
    1. Population variances must be known and equal (use Test 8 if the
    population variances are unknown).

    @param smean1: sample mean of sample #1
    @param smean2: sample mean of sample #2
    @param pvar: variance of both populations (variances are equal)
    @param ssize1: sample size of sample #1
    @param ssize2: sample size of sample #2
    @param pmean1: population mean of population #1 (optional)
    @param pmean2: population mean of population #2 (optional)
    @param confidence: confidence level"""
    if 'pmean1' not in kwargs:
        pmean1 = 0.0
    else:
        pmean1 = kwargs['pmean1']
    if 'pmean2' not in kwargs:
        pmean2 = 0.0
    else:
        pmean2 = kwargs['pmean2']
    smean1 = kwargs['smean1']
    smean2 = kwargs['smean2']
    pvar = kwargs['pvar']
    ssize1 = kwargs['ssize1']
    ssize2 = kwargs['ssize2']
    statistic = ((smean1 - smean2) - (pmean1 - pmean2)) / \
                (pvar * sqrt((1.0 / ssize1) + (1.0 / ssize2)))
    return test(statistic, NormalDistribution(), kwargs['confidence'])


def Z2Mean2Variance(**kwargs):
    """Test 3: Z-test for two population means (variances known and
    unequal)

    To investigate the significance of the difference between the means of
    two samples when the variances are known and unequal.

    Limitations:
    1. Population variances must be known (use Test 9 if the population
    variances are unknown).

    @param smean1: sample mean of sample #1
    @param smean2: sample mean of sample #2
    @param pvar1: variance of population #1
    @param pvar2: variance of population #2
    @param ssize1: sample size of sample #1
    @param ssize2: sample size of sample #2
    @param pmean1: population mean of population #1 (optional)
    @param pmean2: population mean of population #2 (optional)
    @param confidence: confidence level"""
    if 'pmean1' not in kwargs:
        pmean1 = 0.0
    else:
        pmean1 = kwargs['pmean1']
    if 'pmean2' not in kwargs:
        pmean2 = 0.0
    else:
        pmean2 = kwargs['pmean2']
    smean1 = kwargs['smean1']
    smean2 = kwargs['smean2']
    pvar1 = kwargs['pvar1']
    pvar2 = kwargs['pvar2']
    ssize1 = kwargs['ssize1']
    ssize2 = kwargs['ssize2']
    statistic = ((smean1 - smean2) - (pmean1 - pmean2)) / \
                sqrt((pvar1 / ssize1) + (pvar2 / ssize2))
    return test(statistic, NormalDistribution(), kwargs['confidence'])


def Z1Proportion(**kwargs):
    """Test 4: Z-test for a proportion (binomial distribution)

    To investigate the significance of the difference between an assumed
    proportion and an observed proportion.

    Limitations:
    1. Requires a sufficiently large sample size to use the normal
    approximation to the binomial.

    @param spro: sample proportion
    @param ppro: population proportion
    @param ssize: sample size
    @param confidence: confidence level"""
    spro = kwargs['spro']
    ppro = kwargs['ppro']
    ssize = kwargs['ssize']
    # Note: under Python 2 integer division, the continuity-correction
    # term 1 / (2 * ssize) evaluates to zero for an integer sample size,
    # which is the behaviour the published test values assume.
    statistic = (abs(ppro - spro) - (1 / (2 * ssize))) / \
                sqrt((ppro * (1 - spro)) / ssize)
    return test(statistic, NormalDistribution(), kwargs['confidence'])


def Z2Proportion(**kwargs):
    """Test 5: Z-test for the equality of two proportions (binomial
    distribution)

    To investigate the assumption that the proportions of elements from
    two populations are equal, based on two samples, one from each
    population.

    Limitations:
    1. Requires a sufficiently large sample size to use the normal
    approximation to the binomial.

    @param spro1: sample proportion #1
    @param spro2: sample proportion #2
    @param ssize1: sample size #1
    @param ssize2: sample size #2
    @param confidence: confidence level"""
    spro1 = kwargs['spro1']
    spro2 = kwargs['spro2']
    ssize1 = kwargs['ssize1']
    ssize2 = kwargs['ssize2']
    P = ((spro1 * ssize1) + (spro2 * ssize2)) / (ssize1 + ssize2)
    statistic = (spro1 - spro2) / \
                (P * (1.0 - P) * ((1.0 / ssize1) + (1.0 / ssize2))) ** 0.5
    return test(statistic, NormalDistribution(), kwargs['confidence'])


def Z2Count(**kwargs):
    """Test 6: Z-test for comparing two counts (Poisson distribution)

    To investigate the significance of the difference between two counts.

    Limitations:
    1. Requires sufficiently large counts to use the normal approximation.

    @param time1: first measurement time
    @param time2: second measurement time
    @param count1: counts at first measurement time
    @param count2: counts at second measurement time
    @param confidence: confidence level"""
    time1 = kwargs['time1']
    time2 = kwargs['time2']
    R1 = kwargs['count1'] / float(time1)
    R2 = kwargs['count2'] / float(time2)
    statistic = (R1 - R2) / sqrt((R1 / time1) + (R2 / time2))
    return test(statistic, NormalDistribution(), kwargs['confidence'])


def ZPearsonCorrelation(**kwargs):
    """Test 13: Z-test of a correlation coefficient

    To investigate the significance of the difference between a
    correlation coefficient and a specified value.

    Limitations:
    1. Assumes a linear relationship (regression line as Y = MX + C).
    2. Independence of x-values and y-values.
    Use Test 59 when these conditions cannot be met.

    @param sr: calculated sample Pearson's product-moment correlation
    coefficient
    @param pr: specified Pearson's product-moment correlation coefficient
    to test
    @param ssize: sample size
    @param confidence: confidence level"""
    ssize = kwargs['ssize']
    sr = kwargs['sr']
    pr = kwargs['pr']
    Z1 = 0.5 * log((1 + sr) / (1 - sr))
    meanZ1 = 0.5 * log((1 + pr) / (1 - pr))
    sigmaZ1 = 1.0 / sqrt(ssize - 3)
    statistic = (Z1 - meanZ1) / sigmaZ1
    return test(statistic, NormalDistribution(), kwargs['confidence'])


def Z2PearsonCorrelation(**kwargs):
    """Test 14: Z-test for two correlation coefficients

    To investigate the significance of the difference between the
    correlation coefficients for a pair of variables occurring from two
    different populations.

    @param r1: Pearson correlation coefficient of sample #1
    @param r2: Pearson correlation coefficient of sample #2
    @param ssize1: sample size #1
    @param ssize2: sample size #2
    @param confidence: confidence level"""
    z1 = 0.5 * log((1.0 + kwargs['r1']) / (1.0 - kwargs['r1']))
    z2 = 0.5 * log((1.0 + kwargs['r2']) / (1.0 - kwargs['r2']))
    sigma1 = 1.0 / sqrt(kwargs['ssize1'] - 3)
    sigma2 = 1.0 / sqrt(kwargs['ssize2'] - 3)
    sigma = sqrt((sigma1 ** 2) + (sigma2 ** 2))
    statistic = abs(z1 - z2) / sigma
    return test(statistic, NormalDistribution(), kwargs['confidence'])


def ZCorrProportion(**kwargs):
    """Test 23: Z-test for correlated proportions

    To investigate the significance of the difference between two
    correlated proportions in opinion surveys. It can also be used for
    more general applications.

    Limitations:
    1. The same people are questioned both times (correlated property).
    2. Sample size must be quite large.

    @param ssize: sample size
    @param ny: number answered 'no' in the first poll and 'yes' in the
    second poll
    @param yn: number answered 'yes' in the first poll and 'no' in the
    second poll
    @param confidence: confidence level"""
    ssize = kwargs['ssize']
    ny = kwargs['ny']
    yn = kwargs['yn']
    sigma = (ny + yn) - (((ny - yn) ** 2.0) / ssize)
    sigma = sqrt(sigma / (ssize * (ssize - 1.0)))
    statistic = (ny - yn) / (sigma * ssize)
    return test(statistic, NormalDistribution(), kwargs['confidence'])


def SpearmanCorrelation(**kwargs):
    """Test 58: Spearman rank correlation test (paired observations)

    To investigate the significance of the correlation between two series
    of observations obtained in pairs.

    Limitations:
    1. Assumes the two population distributions to be continuous.
    2. Sample size must be more than 10.

    @param R: sum of squared differences of ranks (optional)
    @param ssize: sample size
    @param series1: ranks of series #1 (not used if R is given)
    @param series2: ranks of series #2 (not used if R is given)
    @param confidence: confidence level"""
    ssize = kwargs['ssize']
    if 'R' not in kwargs:
        series1 = kwargs['series1']
        series2 = kwargs['series2']
        R = [((series1[i] - series2[i]) ** 2)
             for i in range(len(series1))]
        R = sum(R)
    else:
        R = kwargs['R']
    statistic = (6.0 * R) - (ssize * ((ssize ** 2) - 1.0))
    statistic = statistic / (ssize * (ssize + 1.0) * sqrt(ssize - 1.0))
    return test(statistic, NormalDistribution(), kwargs['confidence'])

File: testfile.py

"""
This file contains the test codes for the implementation of 10 Z-test
routines from Gopal Kanji's book [1] and is published under The Python
Papers Source Codes [2].

These codes are licensed under the Lesser General Public Licence version 3.

[1] Kanji, Gopal K. 2006. 100 Statistical Tests, 3rd Edition. Sage
    Publications.
[2] Ling, MHT. 2009. Ten Z-Test Routines from Gopal Kanji's 100 Statistical
    Tests. The Python Papers Source Codes 1:5
"""

import unittest
import ZTests as N


class testNormalDistribution(unittest.TestCase):

    def testZ1Mean1Variance(self):
        """Test 1: Z-test for a population mean (variance known)"""
        self.assertAlmostEqual(
            N.Z1Mean1Variance(pmean=4.0, smean=4.6, pvar=1.0,
                              ssize=9, confidence=0.975)[2], 1.8)
        self.assertFalse(
            N.Z1Mean1Variance(pmean=4.0, smean=4.6, pvar=1.0,
                              ssize=9, confidence=0.975)[4])

    def testZ2Mean1Variance(self):
        """Test 2: Z-test for two population means (variances known and
        equal)"""
        self.assertAlmostEqual(
            N.Z2Mean1Variance(smean1=1.2, smean2=1.7, pvar=1.4405,
                              ssize1=9, ssize2=16,
                              confidence=0.975)[2], -0.83304408)
        self.assertFalse(
            N.Z2Mean1Variance(smean1=1.2, smean2=1.7, pvar=1.4405,
                              ssize1=9, ssize2=16, confidence=0.975)[4])

    def testZ2Mean2Variance(self):
        """Test 3: Z-test for two population means (variances known and
        unequal)"""
        self.assertAlmostEqual(
            N.Z2Mean2Variance(smean1=80.02, smean2=79.98, pvar1=0.000576,
                              pvar2=0.001089, ssize1=13, ssize2=8,
                              confidence=0.975)[2], 2.977847)
        self.assertTrue(
            N.Z2Mean2Variance(smean1=80.02, smean2=79.98, pvar1=0.000576,
                              pvar2=0.001089, ssize1=13, ssize2=8,
                              confidence=0.975)[4])

    def testZ1Proportion(self):
        """Test 4: Z-test for a proportion (binomial distribution)"""
        self.assertAlmostEqual(
            N.Z1Proportion(spro=0.5, ppro=0.4, ssize=100,
                           confidence=0.975)[2], 2.23606798)
        self.assertTrue(
            N.Z1Proportion(spro=0.5, ppro=0.4, ssize=100,
                           confidence=0.975)[4])

    def testZ2Proportion(self):
        """Test 5: Z-test for the equality of two proportions (binomial
        distribution)"""
        self.assertAlmostEqual(
            N.Z2Proportion(spro1=0.00325, spro2=0.0573, ssize1=952,
                           ssize2=1168, confidence=0.025)[2], -6.9265418)
        self.assertFalse(
            N.Z2Proportion(spro1=0.00325, spro2=0.0573, ssize1=952,
                           ssize2=1168, confidence=0.025)[4])

    def testZ2Count(self):
        """Test 6: Z-test for comparing two counts (Poisson distribution)"""
        self.assertAlmostEqual(
            N.Z2Count(time1=22, time2=30, count1=952, count2=1168,
                      confidence=0.975)[2], 2.401630072)
        self.assertTrue(
            N.Z2Count(time1=22, time2=30, count1=952, count2=1168,
                      confidence=0.975)[4])

    def testZPearsonCorrelation(self):
        """Test 13: Z-test of a correlation coefficient"""
        self.assertAlmostEqual(
            N.ZPearsonCorrelation(sr=0.75, pr=0.5, ssize=24,
                                  confidence=0.95)[2], 1.94140329)
        self.assertTrue(
            N.ZPearsonCorrelation(sr=0.75, pr=0.5, ssize=24,
                                  confidence=0.95)[4])

    def testZ2PearsonCorrelation(self):
        """Test 14: Z-test for two correlation coefficients"""
        self.assertAlmostEqual(
            N.Z2PearsonCorrelation(r1=0.5, r2=0.3, ssize1=28, ssize2=35,
                                   confidence=0.975)[2], 0.89832268)
        self.assertFalse(
            N.Z2PearsonCorrelation(r1=0.5, r2=0.3, ssize1=28, ssize2=35,
                                   confidence=0.975)[4])

    def testZCorrProportion(self):
        """Test 23: Z-test for correlated proportions"""
        self.assertAlmostEqual(
            N.ZCorrProportion(ny=15, yn=9, ssize=105,
                              confidence=0.975)[2], 1.22769962)
        self.assertFalse(
            N.ZCorrProportion(ny=15, yn=9, ssize=105,
                              confidence=0.975)[4])

    def testSpearmanCorrelation(self):
        """Test 58: Spearman rank correlation test (paired observations)"""
        self.assertAlmostEqual(
            N.SpearmanCorrelation(R=24, ssize=11,
                                  confidence=0.975)[2], -2.8173019)
        self.assertFalse(
            N.SpearmanCorrelation(R=24, ssize=11,
                                  confidence=0.975)[4])


class testNormal(unittest.TestCase):

    def testCDF1(self):
        self.assertAlmostEqual(N.NormalDistribution().CDF(0), 0.5)

    def testPDF1(self):
        self.assertAlmostEqual(N.NormalDistribution().PDF(0), 0.3989423)

    def testInverseCDF1(self):
        self.assertAlmostEqual(
            N.NormalDistribution().inverseCDF(0.5)[0], 0)


class testErfcc(unittest.TestCase):
    """Test data from Press, William H., Flannery, Brian P., Teukolsky,
    Saul A., and Vetterling, William T. 1992. Numerical Recipes Example
    Book C, 2nd edition. Cambridge University Press."""

    def test1(self):
        self.assertAlmostEqual(N.erfcc(0.0), 1 - 0.000000)

    def test2(self):
        self.assertAlmostEqual(N.erfcc(0.5), 1 - 0.5204999)

    def test3(self):
        self.assertAlmostEqual(N.erfcc(1.0), 1 - 0.8427008)

    def test4(self):
        self.assertAlmostEqual(N.erfcc(1.5), 1 - 0.9661051)


if __name__ == '__main__':
    unittest.main()

3. References

Kanji, Gopal K. 2006. 100 Statistical Tests, 3rd edition. Sage Publications.

Press, William H., Flannery, Brian P., Teukolsky, Saul A., and Vetterling, William T. 1989. Numerical Recipes in Pascal. Cambridge University Press, Cambridge.

Press, William H., Flannery, Brian P., Teukolsky, Saul A., and Vetterling, William T. 1992. Numerical Recipes Example Book C, 2nd edition. Cambridge University Press, Cambridge.

Salmoni, A. 2008. Embedding a Python interpreter into your program. The Python Papers, 3(2): 3.