The locfdr Package

August 19, 2006

Title Computes local false discovery rates
Author Bradley Efron, Brit Turnbull and Balasubramanian Narasimhan
Description Computation of local false discovery rates
Maintainer Bradley Efron
License GPL 2.0

R topics documented: hivdata, lfdrsim, locfdr

hivdata    HIV data set

Description

The data comprises 7680 z-values, each relating to a two-sample t-test. The test compares gene expression values for 4 HIV patients with values for 4 normal subjects; the t-score t[i] for gene i has been transformed to a normal scale, z[i] = qnorm(pt(t[i], df=6)), so that the z[i]'s would theoretically have a standard N(0, 1) distribution under the null hypothesis. The original experiment is described in van 't Wout et al. (2003).

Usage

data(hivdata)

Format

A vector containing 7680 z-values.

References

van 't Wout et al. (2003), "Cellular gene expression upon human immunodeficiency virus type 1 infection of CD4+-T-cell lines", Journal of Virology 77.
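The t-to-z transformation described for hivdata can be sketched in base R. This is only an illustration on simulated null data, not the original HIV measurements: the seed, the number of genes, the simulated expression values and the helper pooled_t() are all assumptions introduced here, and a pooled-variance t-test is assumed because it gives the 6 degrees of freedom quoted in the text.

```r
# Sketch: map two-sample t-statistics to z-values, as described for hivdata.
set.seed(1)                      # assumed seed, for reproducibility only
n_genes <- 1000                  # hypothetical number of genes

# Simulated expression: 4 "patients" and 4 "controls" per gene, all null.
x <- matrix(rnorm(n_genes * 4), nrow = n_genes)
y <- matrix(rnorm(n_genes * 4), nrow = n_genes)

# Hypothetical helper: pooled-variance two-sample t-statistic for each row.
# With 4 + 4 samples this has 4 + 4 - 2 = 6 degrees of freedom.
pooled_t <- function(a, b) {
  na <- ncol(a)
  nb <- ncol(b)
  va <- apply(a, 1, var)
  vb <- apply(b, 1, var)
  sp2 <- ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
  (rowMeans(a) - rowMeans(b)) / sqrt(sp2 * (1 / na + 1 / nb))
}
t_stat <- pooled_t(x, y)

# The transformation from the text: z[i] = qnorm(pt(t[i], df = 6)).
z <- qnorm(pt(t_stat, df = 6))
```

Because every simulated gene is null, a histogram of z should look close to a standard normal curve; the transformation is monotone, so the ordering of the genes by t-statistic is unchanged.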

lfdrsim    Simulated data set for locfdr

Description

A simulated dataset that involves 2000 "genes", each of which has yielded a test statistic "zex", with zex[i] ~ N(mu[i], 1) (independently for i = 1, 2, ..., 2000). The data comprises 2000 mu[i] values and 2000 z-values.

Usage

data(lfdrsim)

Format

A matrix of 2000 rows and 2 columns containing mu and the z-score values (zex).

locfdr    Local False Discovery Rate Calculation

Description

Compute local false discovery rates, following the definitions and description in Efron (2004), JASA, Volume 99; Efron, B. (2005), "Local false discovery rates"; and Efron, B. (2005), "Correlation and large-scale simultaneous significance testing" (stanford.edu/~brad/papers/).

Usage

locfdr(zz, bre=120, df=7, pct=0, pct0=1/4, nulltype=1, type=0, plot=1, mult, main=" ", sw=0)

Arguments

zz    A vector of summary statistics, one for each case under simultaneous consideration. In a microarray experiment there would be one component of zz for each gene, perhaps a t-statistic comparing gene expression levels under two different conditions. Results may be improved by transforming zz so that its components are theoretically distributed as N(0, 1) under the null hypothesis, for example via z[i] = qnorm(pt(t[i], df)) when using t-statistics. This is especially important when the theoretical null option is invoked (see below). Recentering and rescaling zz may be necessary if its central histogram looks very far removed from mean 0 and variance 1. The calculations assume a large number of cases, say length(zz) exceeding 200.

bre    Number of breaks in the discretization of the z-score axis, or a vector of breakpoints fully describing the discretization. If length(zz) is small, such as when the number of cases is less than about 1000, set bre to a number lower than the default of 120.

df    Degrees of freedom for fitting the estimated density f(z). Larger values of df may be required if f(z) has sharp bends or other irregularities. A warning is issued if the fitted curve does not adequately match the histogram counts. It is a good idea to use the plot option to view the histogram and fitted curve.

pct    Excluded tail proportions of zz's when fitting f(z). pct=0 includes the full range of zz's. pct can also be a 2-vector, describing the fitting range.

pct0    Proportion of the zz distribution used in fitting the null density f0(z) by central matching. If a 2-vector, e.g. pct0=c(0.25,0.60), the range [pct0[1], pct0[2]] is used. If a scalar, [pct0, 1-pct0] is used.

nulltype    Type of null hypothesis assumed in estimating f0(z), for use in the fdr calculations:
    0 is the theoretical null N(0, 1) [which assumes that the original zz scores have been scaled to have a N(0, 1) distribution under the null hypothesis];
    1 (the default) is the empirical null with parameters estimated by maximum likelihood;
    2 is the empirical null with parameters estimated by central matching (see second reference);
    3 is a "split normal" version of 2, in which f0(z) is allowed to have different scales on the two sides of the maximum.
    Unless sw == 2 or 3, the theoretical, maximum likelihood, and central matching estimates will all be output in the matrix fp0, and both the theoretical and the specified nulltype will be used in the calculations output in mat, but only the specified nulltype is used in the calculation of the output fdr (local fdr estimates for every case).

type    Type of fitting used for f(z); 0 is a natural spline, 1 is a polynomial, in either case with degrees of freedom df [so total degrees of freedom including the intercept is df+1].

plot    Plots desired.
    plot=0 gives no plots.
    plot=1 gives a single plot showing the histogram of zz and fitted densities f(z) and f0(z); colored histogram bars indicate estimated non-null counts; yellow triangles on the x-axis indicate threshold z-values for fdr <= 0.2.
    plot=2 also gives a plot of fdr, and the right and left tail area Fdr curves.
    plot=3 gives instead the f1 cdf of the estimated fdr curve, as in figure 4 of the second reference.
    plot=4 gives all three plots.

mult    Optional scalar multiple (or vector of multiples) of the sample size for calculation of the corresponding hypothetical Efdr value(s).

main    Main heading for the histogram plot when plot>0.

sw    Determines the type of output desired. sw = 2 gives a list consisting of the last 5 values listed below. sw = 3 gives the square matrix of dimension bre-1 representing the influence function of log(fdr), i.e. the derivative of log(fdr) (for each bin) with respect to the bin counts. Any other value of sw returns a list consisting of the first 5 (6 if mult is supplied) values listed below.

Details

The standard error estimate lfdrse assumes independence of the zz values and should usually be considered as a lower bound on the true standard errors. See the third reference. The density estimates f, f0, f0theo are scaled to add up to approximately the number of zz's. The non-null density f1 is scaled to add up to approximately (1-p0) times the number of zz's, i.e. the estimated number of non-null zz's.

Value

fdr    the estimated local false discovery rate for each case, using the selected options for type and nulltype.

fp0    the estimated parameters delta (mean of f0), sigma (standard deviation of f0), and p0, along with their standard errors. If nulltype<3, fp0 is a 5 by 3 matrix, with columns representing delta, sigma, and p0 and rows representing nulltypes and estimate vs. standard error. If nulltype==3, a fourth column represents the sigma estimate for the right side of f0.

Efdr    the expected false discovery rate for the non-null cases, a measure of the experiment's power as described in Section 3 of the second reference. Large values of Efdr, say Efdr>0.4, indicate low power. Overall Efdr and right and left values are given, both for the specified nulltype and for nulltype 0. If nulltype==0, values are given for nulltypes 1 and 0.

cdf1    a 99x2 matrix giving the estimated cdf of fdr under the non-null distribution f1. Large values of the cdf for small fdr values indicate good power; see Section 3 of the second reference. Set plot to 3 or 4 to see the cdf plot.

mat    A matrix summarizing the estimates of f(z), f0(z), fdr(z), etc. at the bre-1 midpoints "x" of the break discretization. These are convenient for comparisons and plotting; mat includes fdr from nulltype 1, 2, or 3 as specified, estimates of the usual tail-area False Discovery Rates, Fdrleft and Fdrright, and also fdrtheo and f0theo, the fdr and f0 estimates assuming the theoretical null density N(0, 1). If nulltype==0, the fdr and f0 columns of mat are calculated using nulltype 1. The 10th column of mat, "lfdrse", is an estimate of standard error for the curve log(fdr) and is calculated based on the specified nulltype. The 11th column of mat is an estimate p1f1 of the subdensity for the non-null z-scores. Column "counts" gives the histogram counts for zz.

mult    If the argument mult was supplied, vector of the ratios of hypothetical Efdr for the supplied multiples of the sample size to Efdr for the actual sample size.

pds    The estimates of p0, delta, and sigma.

x    The bin midpoints.

f    The values of f(z) at the bin midpoints.

pds.    The derivative of the estimates of p0 (when nulltype==1) or log(p0) (when nulltype==0 or 2), delta, and sigma with respect to the bin counts.

stdev    The delta-method estimates of the standard deviations of the p0, delta, and sigma estimates.

Author(s)

Bradley Efron

References

Efron, B. (2004) "Large-scale simultaneous hypothesis testing: the choice of a null hypothesis", Jour Amer Stat Assoc, 99.
Efron, B. (2006) "Size, Power, and False Discovery Rates".
Efron, B. (2006) "Correlation and Large-Scale Simultaneous Significance Testing".

Examples

    ## HIV data example
    data(hivdata)
    w <- locfdr(hivdata)
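To make the binned quantities above more concrete (the bin midpoints x, the fitted density f, the theoretical null f0theo and the resulting fdr), here is a simplified sketch in base R. It is not the package's implementation. It assumes the standard local fdr definition fdr(z) = p0 * f0(z) / f(z), fits f(z) by Poisson regression of the histogram counts on a natural spline (the type=0, df=7 setting), uses only the theoretical N(0, 1) null, and fixes p0 = 1 as a conservative assumption instead of estimating it. The simulated zz vector and all variable names are hypothetical.

```r
library(splines)   # ns(); distributed with R itself

set.seed(2)        # assumed seed, for reproducibility only
# Hypothetical data: 1800 null z-values and 200 non-null ones centered at 3.
zz <- c(rnorm(1800), rnorm(200, mean = 3))

bre <- 120                                   # number of breaks (the documented default)
breaks <- seq(min(zz), max(zz), length.out = bre)
h <- hist(zz, breaks = breaks, plot = FALSE)
x <- h$mids                                  # the bre - 1 bin midpoints
counts <- h$counts                           # histogram counts per bin

# Fit f(z) by Poisson regression on a natural spline basis with df = 7.
fit <- glm(counts ~ ns(x, df = 7), family = poisson)
f <- fitted(fit)                             # on the count scale, sums to about length(zz)

# Theoretical null N(0, 1) on the same count scale.
binwidth <- diff(breaks)[1]
f0theo <- length(zz) * binwidth * dnorm(x)

# Assumption: p0 = 1 (an upper bound) rather than an estimated null proportion.
p0 <- 1
fdr_bin <- pmin(1, p0 * f0theo / f)

# Local fdr for every case, by interpolating the binned curve at each zz value.
fdr_case <- approx(x, fdr_bin, xout = zz, rule = 2)$y
```

In this sketch, cases with small fdr_case (for instance below 0.2, the threshold marked on the histogram plot) are the ones that would be flagged as likely non-null. The package additionally estimates p0 and the empirical null parameters, which this sketch leaves out.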



Package invgamma. May 7, 2017

Package invgamma. May 7, 2017 Package invgamma May 7, 2017 Type Package Title The Inverse Gamma Distribution Version 1.1 URL https://github.com/dkahle/invgamma BugReports https://github.com/dkahle/invgamma/issues Description Light

More information

Applied Statistics for the Behavioral Sciences

Applied Statistics for the Behavioral Sciences Applied Statistics for the Behavioral Sciences Chapter 8 One-sample designs Hypothesis testing/effect size Chapter Outline Hypothesis testing null & alternative hypotheses alpha ( ), significance level,

More information

Introduction to Business Statistics QM 220 Chapter 12

Introduction to Business Statistics QM 220 Chapter 12 Department of Quantitative Methods & Information Systems Introduction to Business Statistics QM 220 Chapter 12 Dr. Mohammad Zainal 12.1 The F distribution We already covered this topic in Ch. 10 QM-220,

More information

Package noncompliance

Package noncompliance Type Package Package noncompliance February 15, 2016 Title Causal Inference in the Presence of Treatment Noncompliance Under the Binary Instrumental Variable Model Version 0.2.2 Date 2016-02-11 A finite-population

More information

Parametric Empirical Bayes Methods for Microarrays

Parametric Empirical Bayes Methods for Microarrays Parametric Empirical Bayes Methods for Microarrays Ming Yuan, Deepayan Sarkar, Michael Newton and Christina Kendziorski April 30, 2018 Contents 1 Introduction 1 2 General Model Structure: Two Conditions

More information

Package sklarsomega. May 24, 2018

Package sklarsomega. May 24, 2018 Type Package Package sklarsomega May 24, 2018 Title Measuring Agreement Using Sklar's Omega Coefficient Version 1.0 Date 2018-05-22 Author John Hughes Maintainer John Hughes

More information

By Bradley Efron Stanford University

By Bradley Efron Stanford University The Annals of Applied Statistics 2008, Vol. 2, No. 1, 197 223 DOI: 10.1214/07-AOAS141 c Institute of Mathematical Statistics, 2008 SIMULTANEOUS INFERENCE: WHEN SHOULD HYPOTHESIS TESTING PROBLEMS BE COMBINED?

More information

Package factorqr. R topics documented: February 19, 2015

Package factorqr. R topics documented: February 19, 2015 Package factorqr February 19, 2015 Version 0.1-4 Date 2010-09-28 Title Bayesian quantile regression factor models Author Lane Burgette , Maintainer Lane Burgette

More information

STATISTICS 141 Final Review

STATISTICS 141 Final Review STATISTICS 141 Final Review Bin Zou bzou@ualberta.ca Department of Mathematical & Statistical Sciences University of Alberta Winter 2015 Bin Zou (bzou@ualberta.ca) STAT 141 Final Review Winter 2015 1 /

More information

Data analysis and Geostatistics - lecture VII

Data analysis and Geostatistics - lecture VII Data analysis and Geostatistics - lecture VII t-tests, ANOVA and goodness-of-fit Statistical testing - significance of r Testing the significance of the correlation coefficient: t = r n - 2 1 - r 2 with

More information

The lmm Package. May 9, Description Some improved procedures for linear mixed models

The lmm Package. May 9, Description Some improved procedures for linear mixed models The lmm Package May 9, 2005 Version 0.3-4 Date 2005-5-9 Title Linear mixed models Author Original by Joseph L. Schafer . Maintainer Jing hua Zhao Description Some improved

More information

Lecture 3: Inference in SLR

Lecture 3: Inference in SLR Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals

More information

Package lmm. R topics documented: March 19, Version 0.4. Date Title Linear mixed models. Author Joseph L. Schafer

Package lmm. R topics documented: March 19, Version 0.4. Date Title Linear mixed models. Author Joseph L. Schafer Package lmm March 19, 2012 Version 0.4 Date 2012-3-19 Title Linear mixed models Author Joseph L. Schafer Maintainer Jing hua Zhao Depends R (>= 2.0.0) Description Some

More information

Package rnmf. February 20, 2015

Package rnmf. February 20, 2015 Type Package Title Robust Nonnegative Matrix Factorization Package rnmf February 20, 2015 An implementation of robust nonnegative matrix factorization (rnmf). The rnmf algorithm decomposes a nonnegative

More information

Hypothesis testing (cont d)

Hypothesis testing (cont d) Hypothesis testing (cont d) Ulrich Heintz Brown University 4/12/2016 Ulrich Heintz - PHYS 1560 Lecture 11 1 Hypothesis testing Is our hypothesis about the fundamental physics correct? We will not be able

More information

Package MiRKATS. May 30, 2016

Package MiRKATS. May 30, 2016 Type Package Package MiRKATS May 30, 2016 Title Microbiome Regression-based Kernal Association Test for Survival (MiRKAT-S) Version 1.0 Date 2016-04-22 Author Anna Plantinga , Michael

More information

Package ShrinkCovMat

Package ShrinkCovMat Type Package Package ShrinkCovMat Title Shrinkage Covariance Matrix Estimators Version 1.2.0 Author July 11, 2017 Maintainer Provides nonparametric Steinian shrinkage estimators

More information

Package SSPA. December 9, 2018

Package SSPA. December 9, 2018 Type Package Package SSPA December 9, 2018 Title General Sample Size and Power Analysis for Microarray and Next-Generation Sequencing Data Version 2.22.0 Author Maintainer General

More information

HYPOTHESIS TESTING: THE CHI-SQUARE STATISTIC

HYPOTHESIS TESTING: THE CHI-SQUARE STATISTIC 1 HYPOTHESIS TESTING: THE CHI-SQUARE STATISTIC 7 steps of Hypothesis Testing 1. State the hypotheses 2. Identify level of significant 3. Identify the critical values 4. Calculate test statistics 5. Compare

More information

Evaluation. Andrea Passerini Machine Learning. Evaluation

Evaluation. Andrea Passerini Machine Learning. Evaluation Andrea Passerini passerini@disi.unitn.it Machine Learning Basic concepts requires to define performance measures to be optimized Performance of learning algorithms cannot be evaluated on entire domain

More information

Hypothesis Testing for Var-Cov Components

Hypothesis Testing for Var-Cov Components Hypothesis Testing for Var-Cov Components When the specification of coefficients as fixed, random or non-randomly varying is considered, a null hypothesis of the form is considered, where Additional output

More information

FDVAR: R code for variance of the number of false discoveries Beta version

FDVAR: R code for variance of the number of false discoveries Beta version FDVAR: R code for variance of the number of false discoveries Beta version Art B. Owen May 2004 Abstract This report documents a beta version of code for computing the variance of the false discovery rate,

More information

Package gtheory. October 30, 2016

Package gtheory. October 30, 2016 Package gtheory October 30, 2016 Version 0.1.2 Date 2016-10-22 Title Apply Generalizability Theory with R Depends lme4 Estimates variance components, generalizability coefficients, universe scores, and

More information

HYPOTHESIS TESTING. Hypothesis Testing

HYPOTHESIS TESTING. Hypothesis Testing MBA 605 Business Analytics Don Conant, PhD. HYPOTHESIS TESTING Hypothesis testing involves making inferences about the nature of the population on the basis of observations of a sample drawn from the population.

More information

Homework Example Chapter 1 Similar to Problem #14

Homework Example Chapter 1 Similar to Problem #14 Chapter 1 Similar to Problem #14 Given a sample of n = 129 observations of shower-flow-rate, do this: a.) Construct a stem-and-leaf display of the data. b.) What is a typical, or representative flow rate?

More information