ON THE USE OF A CORRELATED BINOMIAL MODEL FOR THE ANALYSIS OF CERTAIN TOXICOLOGICAL EXPERIMENTS

by

L.L. Kupper
Department of Biostatistics
University of North Carolina, Chapel Hill

J.K. Haseman
Biometry Branch
National Institute of Environmental Health Sciences
Research Triangle Park

Institute of Statistics Mimeo Series No. 4
June 1977
L.L. Kupper (1) and J.K. Haseman (2)

ABSTRACT

In certain toxicological experiments with laboratory animals, the outcome of interest is the occurrence of dead or malformed fetuses in a litter. Previous investigations have shown that the simple one-parameter binomial and Poisson models generally provide poor fits to this type of binary data. In this paper, a type of correlated binomial model is proposed for use in this situation. First, the model is described in detail and is shown to have certain theoretical advantages over a beta-binomial model proposed by Williams [8]. These two-parameter models are then contrasted as to goodness of fit to some real-life data. Finally, numerical examples are given in which likelihood ratio tests based on these models are employed to assess the significance of treatment-control differences.

(1) Department of Biostatistics, School of Public Health, University of North Carolina, Chapel Hill, NC 27514.
(2) Biometry Branch, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709.
* This investigation was supported in part by NIH Research Career Development Award (No. 1-K04-ES00003) from the National Institute of Environmental Health Sciences.
1. INTRODUCTION

In laboratory experiments designed to investigate the teratogenic or toxicological effect of certain compounds, the response of interest is frequently binary in nature, namely, the occurrence or not of "affected" fetuses or implantations in a litter. The "effect" under consideration is generally fetal death or the occurrence of some particular malformation. The statistical treatment of such data generally requires that the variations in response be described by some underlying probability model, and hence the quality of any subsequent statistical inferences will necessarily depend on how well such a model represents the phenomenon under study.

Two one-parameter models which have been employed for the analysis of fetal death data are the Poisson (see [2]) and binomial (see [6]) distributions. Investigators who use a Poisson model base their analysis on the number of dead implants per female, and so are unrealistically putting no theoretical restriction on the number of dead implants. Unlike the Poisson model, the binomial model takes the total number of implants per female into account, the basic variable of interest being the proportion of dead implants.

Unfortunately, fetal death in mice rarely follows either a Poisson or binomial distribution. For example, Haseman and Soares [3] considered the distribution of fetal death in three large groups of control mice, and found that the Poisson and binomial models provided poor fits to these data.
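The objection that the Poisson model places no upper limit on the number of dead implants can be made concrete with a minimal Python sketch. It computes the probability mass that a Poisson distribution assigns to counts larger than a litter's number of implants. The function names and the numerical values in the usage note are illustrative assumptions, not quantities from the paper.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k events when the mean is lam."""
    return exp(-lam) * lam ** k / factorial(k)


def poisson_mass_beyond(n_implants: int, lam: float) -> float:
    """Probability that a Poisson(lam) count exceeds n_implants.

    For a litter with n_implants implants this mass sits on impossible
    outcomes (more dead implants than implants).
    """
    return 1.0 - sum(poisson_pmf(k, lam) for k in range(n_implants + 1))
```

For a hypothetical litter of 5 implants and a mean of 1 dead implant per female, `poisson_mass_beyond(5, 1.0)` is small but strictly positive, whereas a binomial model assigns zero probability to such outcomes by construction.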
In Table I below (taken from the Haseman and Soares paper), the observed distribution of fetal death in these three groups has been compared to what would be expected assuming an underlying Poisson or binomial model. There was significant (P < 0.0) lack of fit in all cases. In addition, when sample sizes were sufficiently large, separate binomial models were fit to groups of animals having the same number of total implants. In most cases, these analyses also revealed a significant lack of fit.

A two-parameter alternative to the Poisson and binomial models has been suggested by Williams [8]. He assumes that responses within a litter form a set of independent Bernoulli trials whose success probability varies among litters in the same treatment group according to a two-parameter beta distribution. The parameters of the beta distribution for each treatment group are then estimated by maximum likelihood, and the significance of treatment differences is assessed via asymptotic likelihood ratio tests.
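The beta-binomial construction described above (independent Bernoulli trials within a litter, with a success probability drawn from a beta distribution) has a closed-form probability function. The following is a minimal Python sketch of that standard formula; the parameter names `alpha` and `beta`, the helper that converts a mean and variance into beta parameters, and the example values are assumptions made for illustration and are not Williams' own parameterization.

```python
from math import exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_choose(n: int, x: int) -> float:
    """Logarithm of the binomial coefficient C(n, x)."""
    return lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)


def beta_binomial_pmf(x: int, n: int, alpha: float, beta: float) -> float:
    """P(X = x) when X | p ~ Binomial(n, p) and p ~ Beta(alpha, beta)."""
    return exp(log_choose(n, x)
               + log_beta(x + alpha, n - x + beta)
               - log_beta(alpha, beta))


def beta_params_from_moments(mu: float, var: float) -> tuple:
    """Beta parameters with mean mu and variance var.

    Requires 0 < var < mu * (1 - mu).
    """
    common = mu * (1.0 - mu) / var - 1.0
    return mu * common, (1.0 - mu) * common
```

With a hypothetical litter-level mean of 0.1 and variance of 0.005, `beta_params_from_moments(0.1, 0.005)` gives beta parameters of about 1.7 and 15.3, and the resulting probabilities over x = 0, ..., n sum to one with mean n times 0.1.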
TABLE I
Poisson and Binomial Fit to Control Data (a)

For each of Data Sets 1, 2 and 3, the table lists, by number of dead implants, the observed frequency and the frequencies expected under the Poisson and binomial models; the final row gives the chi-square statistic for the test of fit of each model. The table entries are not recoverable from this copy.

(a) See Haseman and Soares [3] for the complete distributions of fetal death.
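A comparison of observed and expected frequencies of the kind summarized in Table I can be sketched as follows for the simplest situation mentioned in the text, namely litters that all have the same number of implants. This is a minimal Python sketch under stated assumptions: the data are hypothetical, the function names are illustrative, and no pooling of sparse cells is attempted, so it is not a reproduction of the Haseman and Soares analysis.

```python
from math import comb


def binomial_expected(observed: list, n: int) -> tuple:
    """Fit a binomial model to litters that all have n implants.

    observed[x] is the number of litters with x dead implants
    (x = 0, ..., n). Returns the estimated death probability and the
    expected number of litters in each cell.
    """
    total_litters = sum(observed)
    total_dead = sum(x * count for x, count in enumerate(observed))
    p_hat = total_dead / (n * total_litters)
    expected = [
        total_litters * comb(n, x) * p_hat ** x * (1.0 - p_hat) ** (n - x)
        for x in range(n + 1)
    ]
    return p_hat, expected


def pearson_chi_square(observed: list, expected: list) -> float:
    """Pearson statistic: sum over cells of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical counts of litters with 0, 1, 2, 3, 4 dead implants (n = 4).
observed_counts = [60, 20, 10, 6, 4]
p_hat, expected_counts = binomial_expected(observed_counts, 4)
chi_square = pearson_chi_square(observed_counts, expected_counts)
```

In practice, cells with small expected counts would normally be pooled before the statistic is referred to a chi-square distribution.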
McCaughran and Arnold [5] have suggested the use of the negative binomial distribution to model fetal death. If, for each female, the number of deaths is assumed to be Poisson and if the rate of death (i.e., the Poisson parameter) is assumed to vary from female to female according to a gamma distribution, the resulting unconditional distribution of fetal deaths in the population will be negative binomial. Although this particular two-parameter distribution must necessarily fit better than the one-parameter Poisson and binomial distributions, there is little theoretical basis for its use. As mentioned earlier, the Poisson model does not take litter size into account, and, secondly, there is no theoretical justification for the assumption that the death rates have a gamma distribution. Because of these objections, we do not consider the negative binomial model to be appropriate and so will not mention it again.

The purpose of this paper is to present an alternative to the beta-binomial model proposed by Williams, and to compare these two models on both theoretical and practical grounds. As the title of the paper suggests, we are proposing the use of a type of "correlated binomial" model, since we believe that one of the basic assumptions underlying the use of the binomial distribution, namely the assumption of independent trials, is quite possibly being violated in the experimental setting we are considering. In particular, we feel that fetuses in the same litter tend to have an inherent relationship to one another, and that an appropriate model should in some way allow for an assessment of the strength of this possible intra-litter correlation. Our approach involves "correcting" the usual binomial model via a technique suggested by Bahadur [1] to account for such within-litter dependency. We feel that this method of handling extra-binomial variation is intuitively more appealing than Williams' approach, which requires both the assumption of mutual independence among the within-litter Bernoulli responses and the assumption of a beta prior distribution for the Bernoulli parameter.

In Section 2 of this paper, we will discuss Bahadur's work and will describe the correlated-binomial model; the very special case of two implants per litter will be briefly considered for illustrative purposes. In Section 3, the beta-binomial and correlated-binomial models will be contrasted as to goodness of fit to the data sets given in Table I, and illustrations using the models to compare treatment and control groups will be given.

The analysis procedures to be developed in this paper are useful in a variety of experimental situations. For example, Lachenbruch and Perry [4] discuss a physical therapy experiment involving patients with severed nerves in one hand; separate measurements were made on each finger of each patient's hand to assess the relationship between sweat and sensation. Clearly, the observations on the fingers of a hand cannot reasonably be considered to be independent. Similarly, Weil [7] cites a dental study where a number of teeth per child were examined for the presence of caries, and he points out that there is an inherent dependency among the observations on teeth in the same mouth.
2. THE CORRELATED-BINOMIAL MODEL

2.1 General Considerations

To establish notation, let us suppose that there are $l_i$ litters in the i-th group (i = 0 for control group and i = 1 for treatment group), the j-th litter in the i-th group being of size $n_{ij}$, $j = 1, 2, \ldots, l_i$. Let

$$X_{ij} = \sum_{k=1}^{n_{ij}} X_{ijk},$$

where $X_{ijk}$ is a Bernoulli random variable taking the value 1 with probability $p_i$ if the k-th implant in the j-th litter of the i-th group possesses the attribute of interest and taking the value 0 with probability $(1 - p_i)$ if the attribute is not present. Note that we are assuming here that the true underlying probability of possessing the attribute depends only on the group under study and does not vary from litter to litter within a group. In contrast, Williams assumes that this probability depends on j as well as i, and he accounts for this variation in a Bayesian way in terms of a beta prior distribution.
Under the ordinary binomial distribution assumption that $X_{ij1}, X_{ij2}, \ldots, X_{ijn_{ij}}$ are mutually independent, it follows that $X_{ij}$ takes the value $x_{ij}$ with probability

$$P^{(1)}(x_{ij}) = \binom{n_{ij}}{x_{ij}} p_i^{x_{ij}} (1 - p_i)^{n_{ij} - x_{ij}}, \qquad x_{ij} = 0, 1, \ldots, n_{ij}. \tag{1}$$

However, when the assumption of mutual independence is unreasonable, then Bahadur [1] has shown that the correct and most general expression for $\Pr(X_{ij} = x_{ij})$, say $P(x_{ij})$, takes the form

$$P(x_{ij}) = P^{(1)}(x_{ij})\, f(x_{ij1}, x_{ij2}, \ldots, x_{ijn_{ij}}), \tag{2}$$

where the "correction factor" $f(x_{ij1}, x_{ij2}, \ldots, x_{ijn_{ij}})$ is what one multiplies the standard binomial probability distribution by in order to "correct for" the lack of mutual independence among the $X_{ijk}$'s. If $X_{ijk}$ is standardized to

$$Z_{ijk} = (X_{ijk} - p_i) / \sqrt{p_i (1 - p_i)},$$

with observed value $z_{ijk}$, then Bahadur has shown that

$$f(x_{ij1}, x_{ij2}, \ldots, x_{ijn_{ij}}) = 1 + \sum_{k<k'} E(Z_{ijk} Z_{ijk'})\, z_{ijk} z_{ijk'} + \sum_{k<k'<k''} E(Z_{ijk} Z_{ijk'} Z_{ijk''})\, z_{ijk} z_{ijk'} z_{ijk''} + \cdots + E(Z_{ij1} Z_{ij2} \cdots Z_{ijn_{ij}})\, z_{ij1} z_{ij2} \cdots z_{ijn_{ij}}.$$

Thus, $f(x_{ij1}, x_{ij2}, \ldots, x_{ijn_{ij}})$ is a function of the pairwise (or second-order) correlations among the $X_{ijk}$'s, the third-order correlations, etc., up to the $n_{ij}$-th order correlation, as well as being a function of $p_i$ itself.

The general expression (2) is quite complex and motivates one to look for an approximation to $P(x_{ij})$. An obvious procedure for effecting such an approximation is to neglect correlations of order higher than are required for reasonable accuracy. For example, if all correlations are taken to be zero, then we are back to the standard mutually uncorrelated case and so we could say that $P^{(1)}(x_{ij})$ is a "first-order" approximation to (2). If we can reasonably neglect all correlations higher than order two, then

$$P^{(2)}(x_{ij}) = P^{(1)}(x_{ij}) \Big[ 1 + \sum_{k<k'} E(Z_{ijk} Z_{ijk'})\, z_{ijk} z_{ijk'} \Big] \tag{3}$$

is a second-order approximation to $P(x_{ij})$. We caution at this point that $P^{(2)}(x_{ij})$, and indeed $P^{(m)}(x_{ij})$ for $2 < m < n_{ij}$, may fail to be non-negative for some values of $x_{ij}$, even though it is always true that

$$\sum_{x_{ij}=0}^{n_{ij}} P^{(m)}(x_{ij}) = 1.$$

For the situation we are considering, we will write

$$E(Z_{ijk} Z_{ijk'}) = \mathrm{Corr}(X_{ijk}, X_{ijk'}) = \rho_i = \frac{\theta_i}{p_i (1 - p_i)}, \quad \text{where } \mathrm{Cov}(X_{ijk}, X_{ijk'}) = \theta_i,$$

so that (3) then becomes the following explicit function of the two parameters $p_i$ and $\theta_i$:

$$P^{(2)}(x_{ij}) = \binom{n_{ij}}{x_{ij}} p_i^{x_{ij}} (1 - p_i)^{n_{ij} - x_{ij}} \Big\{ 1 + \frac{\theta_i}{2 p_i^2 (1 - p_i)^2} \big[ x_{ij}(x_{ij} - 1)(1 - p_i)^2 + (n_{ij} - x_{ij})(n_{ij} - x_{ij} - 1) p_i^2 - 2 x_{ij} (n_{ij} - x_{ij}) p_i (1 - p_i) \big] \Big\}. \tag{4}$$

Expression (4) is a two-parameter alternative to Williams' beta-binomial model. If needed, one could certainly employ a third or higher-order approximation to $P(x_{ij})$, but it has been our experience that $P^{(2)}(x_{ij})$ performs just as well as Williams' model, in addition to having somewhat more theoretical appeal.

Bahadur has shown that $P^{(2)}(x_{ij})$ will be a valid probability distribution if and only if

$$-\frac{2}{n_{ij}(n_{ij} - 1)} \min\Big( \frac{p_i}{1 - p_i}, \frac{1 - p_i}{p_i} \Big) \le \rho_i \le \frac{2 p_i (1 - p_i)}{(n_{ij} - 1) p_i (1 - p_i) + 0.25 - \gamma_0}, \tag{5}$$

where

$$\gamma_0 = \min_{0 \le x \le n_{ij}} \big[ x - (n_{ij} - 1) p_i - 0.5 \big]^2.$$

Table II below provides values of the lower and upper bounds in (5) for various choices of $n_{ij}$ and $p_i$; because of symmetry, only values for $p_i \le 1/2$ need to be tabulated.

TABLE II
Permissible Ranges of Values for $\rho_i$ Based on (5) for Various Choices of $n_{ij}$ and $p_i$

n_ij    p_i = 0.1           p_i = 0.3           p_i = 0.5
 2      (-0.111, 1.000)     (-0.429, 1.000)     (-1.000, 1.000)
 3      (-0.037, 0.529)     (-0.143, 0.636)     (-0.333, 1.000)
 5      (-0.011, 0.300)     (-0.043, 0.420)     (-0.100, 0.500)
 7      (-0.005, 0.231)     (-0.020, 0.296)     (-0.048, 0.333)
10      (-0.002, 0.200)     (-0.010, 0.200)     (-0.022, 0.200)
15      (-0.001, 0.120)     (-0.004, 0.135)     (-0.010, 0.143)
20      (-0.001, 0.100)     (-0.002, 0.100)     (-0.005, 0.100)

A few comments are in order regarding inequality (5) and Table II. First of all, the upper bound is clearly less restrictive than the lower bound; this is desirable since in most (but not all) situations one would expect $\rho_i$ to be positive. Secondly, both bounds become closer to zero as $n_{ij}$ increases, so that, in practice, the largest $n_{ij}$ in a given data set is associated with the most restrictive and governing set of bounds. Finally, sample-based bounds using (5) should be imposed to ensure that estimates of $p_i$ and $\theta_i$ (e.g., those obtained by maximum likelihood) will not lead to negative estimated probabilities based on (4).

Now, the likelihood under model (4) would be

$$L = \prod_{i=0}^{1} L_i, \quad \text{where } L_i = \prod_{j=1}^{l_i} P^{(2)}(x_{ij}), \quad i = 0, 1.$$
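Expression (4) and the bounds (5) are straightforward to evaluate numerically. The following minimal Python sketch assumes the form of (4) in which the ordinary binomial probability is multiplied by 1 + θ / (2p²(1-p)²) times the bracketed quadratic in x, and the form of (5) with γ₀ taken as the minimum over x = 0, ..., n of [x - (n-1)p - 0.5]². The function names and the parameter values in the usage note are illustrative assumptions.

```python
from math import comb


def correlated_binomial_pmf(x: int, n: int, p: float, theta: float) -> float:
    """Second-order correlated-binomial probability of x responses in n.

    p is the response probability and theta the common pairwise
    covariance between two responses in the same litter. Requires
    0 < p < 1.
    """
    q = 1.0 - p
    binomial = comb(n, x) * p ** x * q ** (n - x)
    bracket = (x * (x - 1) * q ** 2
               + (n - x) * (n - x - 1) * p ** 2
               - 2.0 * x * (n - x) * p * q)
    return binomial * (1.0 + theta * bracket / (2.0 * p ** 2 * q ** 2))


def rho_bounds(n: int, p: float) -> tuple:
    """Lower and upper bounds on rho = theta / (p(1-p)) for n >= 2."""
    q = 1.0 - p
    lower = -2.0 / (n * (n - 1)) * min(p / q, q / p)
    gamma0 = min((x - (n - 1) * p - 0.5) ** 2 for x in range(n + 1))
    upper = 2.0 * p * q / ((n - 1) * p * q + 0.25 - gamma0)
    return lower, upper
```

For n = 3 and p = 0.3, `rho_bounds` returns an upper bound of about 0.636, the value that appears in Table II. For a hypothetical litter with n = 5, p = 0.3 and θ = 0.02, the probabilities sum to one, the mean is np, and the variance is np(1-p) + n(n-1)θ, which exceeds the binomial variance whenever θ is positive.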
Let $\hat{L}^{(0)}$ denote the value of L when maximized subject to the constraint $p_0 = p_1$, and let $\hat{L}^{(1)}$ denote the value of L maximized subject to no constraints on the parameter space (other than (5), of course). Then, an asymptotically valid test of $H_0: p_0 = p_1$ versus $H_a: p_0 \ne p_1$ is obtained by comparing $-2 \ln(\hat{L}^{(0)} / \hat{L}^{(1)})$ with upper percentage points of the $\chi^2$ distribution with 1 D.F. In general, this likelihood ratio test, which we will illustrate by example in Section 3, is best carried out using standard computer function maximization routines, since explicit formulas for the maximum likelihood parameter estimates can only be obtained for the very special case when $n_{ij} = 2$.

As an aside, it is of interest to briefly discuss the likelihood ratio test we propose for examining whether the basic assumption of independent Bernoulli trials is valid. Since the reasonableness of this assumption generally goes unquestioned in most applications involving the binomial distribution, this test should be of some value in this regard. For the i-th group, say, if ${}_0\hat{L}_i$ is the value of $L_i$ maximized subject to the constraint $\theta_i = 0$, and $\hat{L}_i^{(1)}$ is its unrestricted maximum, then the likelihood ratio test of $H_0: \theta_i = 0$ versus $H_1: \theta_i \ne 0$ would be based on the statistic $-2 \ln({}_0\hat{L}_i / \hat{L}_i^{(1)})$, which would have asymptotically a central $\chi^2$ distribution with 1 D.F. under $H_0$.

2.2 The Special Case $n_{ij} = 2$

Since it helps to highlight some of the differences between our procedure and that of Williams, the special case $n_{ij} = 2$ will be given some brief attention. When $n_{ij} = 2$, it follows from (4) that the probability distribution of $X_{ij}$ is:

x_ij             0                        1                             2
P^(2)(x_ij)      (1 - p_i)^2 + θ_i        2[p_i(1 - p_i) - θ_i]         p_i^2 + θ_i

If $\mu_i$ and $\sigma_i^2$ denote the mean and variance of Williams' beta distribution for the i-th group, then it is fairly easy to show in this special case that his $\mu_i$ is equal to our $p_i$ and that his $\sigma_i^2$ corresponds to our $\theta_i$. In fact, the two models would be completely equivalent in this very special case (although clearly not in general), except in one respect. Williams' model only allows for a positive intra-litter association since his variance parameter $\sigma_i^2$ is necessarily restricted to be non-negative, while our $\theta_i$ can, of course, take on negative values. Thus, the beta-binomial model would be inappropriate in the situation when there is a possible negative correlation between responses within a litter, and so in this sense the correlated-binomial model is slightly more general. In particular, a preliminary likelihood ratio test which favors $\theta_i < 0$ would tend to preclude the use of Williams' model.

3. COMPARISON OF THE BETA-BINOMIAL AND CORRELATED-BINOMIAL MODELS

To compare the fits of the beta-binomial and correlated-binomial models to some real-life data, we will again consider the three sets of data given in Table I. In Table III, we have summarized the results of fitting the two models to these three data sets (the required $n_{ij}$ values are given in the Haseman and Soares paper). As can be seen, there is little difference between the fits of the two models, and the improvement in fit relative to the ordinary binomial distribution is quite impressive (see Table I). Although we have not attempted to do so here, we can, of course, improve the fit of the correlated-binomial model by allowing for third and even higher-order correlations; such flexibility with regard to model improvement is not available with Williams' approach.

TABLE III
Beta-Binomial and Correlated-Binomial Fits to Control Data in Table I

For each of Data Sets 1, 2 and 3, the table lists, by number of dead implants, the observed frequency and the frequencies expected under the beta-binomial and correlated-binomial models, followed by the parameter estimates and the chi-square test of fit. Only the parameter estimates are legible in this copy:

                 Beta-binomial                 Correlated-binomial
Data Set 1       μ̂ = .0900, σ̂² = .0056         p̂ = .093, θ̂ = .0037
Data Set 2       μ̂ = .090, σ̂² = .004           p̂ = .086, θ̂ = .007
Data Set 3       μ̂ = .0735, σ̂² = .005          p̂ = .0760, θ̂ = [...]

To illustrate the use of model (4) in comparing treated and control groups, we will consider the data of Weil [7] used by Williams in his paper. We wish to perform the likelihood ratio test described in Section 2, which tests $H_0: p_1 = p_0$ versus $H_a: p_1 \ne p_0$ with $\theta_1$ not necessarily equal to $\theta_0$. The restricted maximized log likelihood under $H_0$, $\ln \hat{L}^{(0)}$, has the value [...] with associated restricted maximum likelihood estimates $\hat{p}$ = .7974, $\hat{\theta}_1$ = .039, and $\hat{\theta}_0$ = .095. Fitting model (4) separately to each of the two groups yields the unrestricted maximized log likelihood value $\ln \hat{L}^{(1)} = \ln \hat{L}_1^{(1)} + \ln \hat{L}_0^{(1)}$ = [...], with associated unrestricted estimates $\hat{p}_1$ = .757, $\hat{\theta}_1$ = .030, $\hat{p}_0$ = .8978, and $\hat{\theta}_0$ = .004. Finally, $-2 \ln(\hat{L}^{(0)} / \hat{L}^{(1)})$ = 9.60, which is significant when compared with $\chi^2_{1,.005}$ = 7.88. The corresponding value obtained using Williams' testing procedure is [...]. It should be mentioned here that it is necessary to consider the boundary conditions (5) when finding $\hat{L}^{(1)}$ and $\hat{L}^{(0)}$, so that the maximum likelihood estimates will not lead to negative estimated probabilities.

As a final example, consider the following set of unpublished laboratory data involving $l_1 = l_0 = 10$ pregnant female mice in each group; the entries below are values of $x_{ij}/n_{ij}$.

CONTROL GROUP: 0/5, [...]/6, 0/7, 0/7, 0/8, 0/8, 0/8, [...]/9, [...]/9, [...]/10.
TREATMENT GROUP: 0/5, [...]/5, [...]/7, 0/8, [...]/8, 3/8, 0/9, 4/9, [...]/10, 6/10.

A preliminary likelihood ratio test of $H_0: \theta = 0$ versus $H_a: \theta \ne 0$ gives $\chi^2_1$ = 0.3 for the control group and $\chi^2_1$ = 4.97 for the treatment group; the presence of this significant intra-litter effect in the treatment group precludes the use of the ordinary binomial distribution to model these data.

Using the beta-binomial model, we obtain under $H_0: \mu_1 = \mu_0$ the restricted maximum likelihood estimates $\hat{\mu}$ = .599, $\hat{\sigma}_0^2$ = .046 and $\hat{\sigma}_1^2$ = .000, with an associated restricted maximized log likelihood value of $\ln \hat{L}^{(0)}$ = [...]; under $H_a: \mu_1 \ne \mu_0$, the unrestricted parameter estimates are $\hat{\mu}_0$ = .0776, $\hat{\sigma}_0^2$ = .007 and $\hat{\mu}_1$ = .36, $\hat{\sigma}_1^2$ = .064, giving an unrestricted maximized log likelihood value of $\ln \hat{L}^{(1)}$ = [...]. Thus, $-2 \ln(\hat{L}^{(0)} / \hat{L}^{(1)})$ = [...], which is significant at the 5% level since $\chi^2_{1,.05}$ = 3.84.
Using the correlated-binomial model, the parameter estimates under $H_0: p_1 = p_0$ are $\hat{p}$ = .653, $\hat{\theta}_0$ = .049 and $\hat{\theta}_1$ = .06, giving $\ln \hat{L}^{(0)}$ = [...]; the parameter estimates under $H_a: p_1 \ne p_0$ are $\hat{p}_0$ = .0765, $\hat{\theta}_0$ = .003 and $\hat{p}_1$ = .60, $\hat{\theta}_1$ = .067, giving $\ln \hat{L}^{(1)}$ = [...]. Thus, $-2 \ln(\hat{L}^{(0)} / \hat{L}^{(1)})$ = [...]. Note that the likelihood under the correlated-binomial model is larger in every instance than the corresponding likelihood based on the beta-binomial model, suggesting that the former model is providing a better description of the data.

ACKNOWLEDGEMENT

The authors have benefited from some helpful discussions with Professor P.K. Sen.
REFERENCES

[1] Bahadur, R.R. A Representation of the Joint Distribution of Responses to n Dichotomous Items. In H. Solomon, ed., Studies in Item Analysis and Prediction. Stanford, Calif.: Stanford University Press, 1961.

[2] Epstein, S.S., Arnold, E., Andrea, J., Bass, W., and Bishop, Y. (1972). Detection of Chemical Mutagens by the Dominant Lethal Assay in the Mouse. Toxicol. Appl. Pharmacol.

[3] Haseman, J.K. and Soares, E.R. (1976). The Distribution of Fetal Death in Control Mice and Its Implications on Statistical Tests for Dominant Lethal Effects. Mutation Res.

[4] Lachenbruch, P.A. and Perry, J. (97). Testing Equality of Proportion of Success of Several Correlated Binomial Variates. Univ. Nor. Car. Inst. Stat. Mimeo Series.

[5] McCaughran, D.A. and Arnold, D.W. (1976). Statistical Models for Numbers of Implantation Sites and Embryonic Deaths in Mice. Toxicol. Appl. Pharmacol. 38.

[6] Salsburg, D.S. (1973). Statistical Considerations for Dominant Lethal Mutagenic Trials. Env. Hlth. Persp.

[7] Weil, C.S. (1970). Selection of the Valid Number of Sampling Units and a Consideration of Their Combination in Toxicological Studies Involving Reproduction, Teratogenesis or Carcinogenesis. Fd. Cosmet. Toxicol.

[8] Williams, D.A. (1975). The Analysis of Binary Responses From Toxicological Experiments Involving Reproduction and Teratogenicity. Biometrics.
Known probability distributions Engineers frequently wor with data that can be modeled as one of several nown probability distributions. Being able to model the data allows us to: model real systems design
More informationApproximate Median Regression via the Box-Cox Transformation
Approximate Median Regression via the Box-Cox Transformation Garrett M. Fitzmaurice,StuartR.Lipsitz, and Michael Parzen Median regression is used increasingly in many different areas of applications. The
More informationApplication of Ghosh, Grizzle and Sen s Nonparametric Methods in. Longitudinal Studies Using SAS PROC GLM
Application of Ghosh, Grizzle and Sen s Nonparametric Methods in Longitudinal Studies Using SAS PROC GLM Chan Zeng and Gary O. Zerbe Department of Preventive Medicine and Biometrics University of Colorado
More informationDiscrete Dependent Variable Models
Discrete Dependent Variable Models James J. Heckman University of Chicago This draft, April 10, 2006 Here s the general approach of this lecture: Economic model Decision rule (e.g. utility maximization)
More informationChapter 8: An Introduction to Probability and Statistics
Course S3, 200 07 Chapter 8: An Introduction to Probability and Statistics This material is covered in the book: Erwin Kreyszig, Advanced Engineering Mathematics (9th edition) Chapter 24 (not including
More informationOutline. Binomial, Multinomial, Normal, Beta, Dirichlet. Posterior mean, MAP, credible interval, posterior distribution
Outline A short review on Bayesian analysis. Binomial, Multinomial, Normal, Beta, Dirichlet Posterior mean, MAP, credible interval, posterior distribution Gibbs sampling Revisit the Gaussian mixture model
More informationSTAT 302 Introduction to Probability Learning Outcomes. Textbook: A First Course in Probability by Sheldon Ross, 8 th ed.
STAT 302 Introduction to Probability Learning Outcomes Textbook: A First Course in Probability by Sheldon Ross, 8 th ed. Chapter 1: Combinatorial Analysis Demonstrate the ability to solve combinatorial
More informationA Probability Primer. A random walk down a probabilistic path leading to some stochastic thoughts on chance events and uncertain outcomes.
A Probability Primer A random walk down a probabilistic path leading to some stochastic thoughts on chance events and uncertain outcomes. Are you holding all the cards?? Random Events A random event, E,
More informationCOMPARING TRANSFORMATIONS USING TESTS OF SEPARATE FAMILIES. Department of Biostatistics, University of North Carolina at Chapel Hill, NC.
COMPARING TRANSFORMATIONS USING TESTS OF SEPARATE FAMILIES by Lloyd J. Edwards and Ronald W. Helms Department of Biostatistics, University of North Carolina at Chapel Hill, NC. Institute of Statistics
More informationMohammed. Research in Pharmacoepidemiology National School of Pharmacy, University of Otago
Mohammed Research in Pharmacoepidemiology (RIPE) @ National School of Pharmacy, University of Otago What is zero inflation? Suppose you want to study hippos and the effect of habitat variables on their
More informationIntroduction to Probabilistic Graphical Models
Introduction to Probabilistic Graphical Models Sargur Srihari srihari@cedar.buffalo.edu 1 Topics 1. What are probabilistic graphical models (PGMs) 2. Use of PGMs Engineering and AI 3. Directionality in
More informationA Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,
A Course in Applied Econometrics Lecture 18: Missing Data Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. When Can Missing Data be Ignored? 2. Inverse Probability Weighting 3. Imputation 4. Heckman-Type
More informationStatistical Methods in Particle Physics
Statistical Methods in Particle Physics Lecture 3 October 29, 2012 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline Reminder: Probability density function Cumulative
More informationCHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)
FROM: PAGANO, R. R. (007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter
More informationA Count Data Frontier Model
A Count Data Frontier Model This is an incomplete draft. Cite only as a working paper. Richard A. Hofler (rhofler@bus.ucf.edu) David Scrogin Both of the Department of Economics University of Central Florida
More informationConstructing Ensembles of Pseudo-Experiments
Constructing Ensembles of Pseudo-Experiments Luc Demortier The Rockefeller University, New York, NY 10021, USA The frequentist interpretation of measurement results requires the specification of an ensemble
More informationCHAPTER 2 Estimating Probabilities
CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2017. Tom M. Mitchell. All rights reserved. *DRAFT OF September 16, 2017* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is
More informationNotes on Mathematical Expectations and Classes of Distributions Introduction to Econometric Theory Econ. 770
Notes on Mathematical Expectations and Classes of Distributions Introduction to Econometric Theory Econ. 77 Jonathan B. Hill Dept. of Economics University of North Carolina - Chapel Hill October 4, 2 MATHEMATICAL
More informationErrata for the ASM Study Manual for Exam P, Fourth Edition By Dr. Krzysztof M. Ostaszewski, FSA, CFA, MAAA
Errata for the ASM Study Manual for Exam P, Fourth Edition By Dr. Krzysztof M. Ostaszewski, FSA, CFA, MAAA (krzysio@krzysio.net) Effective July 5, 3, only the latest edition of this manual will have its
More informationConstrained estimation for binary and survival data
Constrained estimation for binary and survival data Jeremy M. G. Taylor Yong Seok Park John D. Kalbfleisch Biostatistics, University of Michigan May, 2010 () Constrained estimation May, 2010 1 / 43 Outline
More informationProbability Distributions for Continuous Variables. Probability Distributions for Continuous Variables
Probability Distributions for Continuous Variables Probability Distributions for Continuous Variables Let X = lake depth at a randomly chosen point on lake surface If we draw the histogram so that the
More informationlet H1(a,a,b) H2(k,a,b,p') Bin(n,p)ñ Neg.Bin(k,p')p Beta(a,b). The probability generating functions are given by and
A COMPOUND POISSON PROCESS FORMULATION OF THE PARTIY DISTRIBUTION Lester R. Curtin Chirayath M. Suchindran, University of North Carolina. Introduction Various attempts have been made to describe the parity
More informationBasic Probabilistic Reasoning SEG
Basic Probabilistic Reasoning SEG 7450 1 Introduction Reasoning under uncertainty using probability theory Dealing with uncertainty is one of the main advantages of an expert system over a simple decision
More informationApplications of Basu's TheorelTI. Dennis D. Boos and Jacqueline M. Hughes-Oliver I Department of Statistics, North Car-;'lina State University
i Applications of Basu's TheorelTI by '. Dennis D. Boos and Jacqueline M. Hughes-Oliver I Department of Statistics, North Car-;'lina State University January 1997 Institute of Statistics ii-limeo Series
More informationAPC486/ELE486: Transmission and Compression of Information. Bounds on the Expected Length of Code Words
APC486/ELE486: Transmission and Compression of Information Bounds on the Expected Length of Code Words Scribe: Kiran Vodrahalli September 8, 204 Notations In these notes, denotes a finite set, called the
More informationA proof of Bell s inequality in quantum mechanics using causal interactions
A proof of Bell s inequality in quantum mechanics using causal interactions James M. Robins, Tyler J. VanderWeele Departments of Epidemiology and Biostatistics, Harvard School of Public Health Richard
More informationFundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner
Fundamentals CS 281A: Statistical Learning Theory Yangqing Jia Based on tutorial slides by Lester Mackey and Ariel Kleiner August, 2011 Outline 1 Probability 2 Statistics 3 Linear Algebra 4 Optimization
More informationRandom Variables. P(x) = P[X(e)] = P(e). (1)
Random Variables Random variable (discrete or continuous) is used to derive the output statistical properties of a system whose input is a random variable or random in nature. Definition Consider an experiment
More informationModel Estimation Example
Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions
More informationHarvard University. A Note on the Control Function Approach with an Instrumental Variable and a Binary Outcome. Eric Tchetgen Tchetgen
Harvard University Harvard University Biostatistics Working Paper Series Year 2014 Paper 175 A Note on the Control Function Approach with an Instrumental Variable and a Binary Outcome Eric Tchetgen Tchetgen
More informationThe University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80
The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80 71. Decide in each case whether the hypothesis is simple
More informationOptimal SPRT and CUSUM Procedures using Compressed Limit Gauges
Optimal SPRT and CUSUM Procedures using Compressed Limit Gauges P. Lee Geyer Stefan H. Steiner 1 Faculty of Business McMaster University Hamilton, Ontario L8S 4M4 Canada Dept. of Statistics and Actuarial
More informationDuke University. Duke Biostatistics and Bioinformatics (B&B) Working Paper Series. Randomized Phase II Clinical Trials using Fisher s Exact Test
Duke University Duke Biostatistics and Bioinformatics (B&B) Working Paper Series Year 2010 Paper 7 Randomized Phase II Clinical Trials using Fisher s Exact Test Sin-Ho Jung sinho.jung@duke.edu This working
More informationIntroduction to Machine Learning
What does this mean? Outline Contents Introduction to Machine Learning Introduction to Probabilistic Methods Varun Chandola December 26, 2017 1 Introduction to Probability 1 2 Random Variables 3 3 Bayes
More informationII. Analysis of Linear Programming Solutions
Optimization Methods Draft of August 26, 2005 II. Analysis of Linear Programming Solutions Robert Fourer Department of Industrial Engineering and Management Sciences Northwestern University Evanston, Illinois
More informationConservative variance estimation for sampling designs with zero pairwise inclusion probabilities
Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities Peter M. Aronow and Cyrus Samii Forthcoming at Survey Methodology Abstract We consider conservative variance
More informationHANDBOOK OF APPLICABLE MATHEMATICS
HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume II: Probability Emlyn Lloyd University oflancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester - New York - Brisbane
More informationPROGRAM STATISTICS RESEARCH
An Alternate Definition of the ETS Delta Scale of Item Difficulty Paul W. Holland and Dorothy T. Thayer @) PROGRAM STATISTICS RESEARCH TECHNICAL REPORT NO. 85..64 EDUCATIONAL TESTING SERVICE PRINCETON,
More informationIntroduction to Bayesian Statistics
Bayesian Parameter Estimation Introduction to Bayesian Statistics Harvey Thornburg Center for Computer Research in Music and Acoustics (CCRMA) Department of Music, Stanford University Stanford, California
More informationarxiv: v2 [stat.me] 27 Aug 2014
Biostatistics (2014), 0, 0, pp. 1 20 doi:10.1093/biostatistics/depcounts arxiv:1305.1656v2 [stat.me] 27 Aug 2014 Markov counting models for correlated binary responses FORREST W. CRAWFORD, DANIEL ZELTERMAN
More informationThe Bayesian Paradigm
Stat 200 The Bayesian Paradigm Friday March 2nd The Bayesian Paradigm can be seen in some ways as an extra step in the modelling world just as parametric modelling is. We have seen how we could use probabilistic
More informationA General Overview of Parametric Estimation and Inference Techniques.
A General Overview of Parametric Estimation and Inference Techniques. Moulinath Banerjee University of Michigan September 11, 2012 The object of statistical inference is to glean information about an underlying
More informationNon Uniform Bounds on Geometric Approximation Via Stein s Method and w-functions
Communications in Statistics Theory and Methods, 40: 45 58, 20 Copyright Taylor & Francis Group, LLC ISSN: 036-0926 print/532-45x online DOI: 0.080/036092090337778 Non Uniform Bounds on Geometric Approximation
More informationECE 302 Division 2 Exam 2 Solutions, 11/4/2009.
NAME: ECE 32 Division 2 Exam 2 Solutions, /4/29. You will be required to show your student ID during the exam. This is a closed-book exam. A formula sheet is provided. No calculators are allowed. Total
More informationHow likely is Simpson s paradox in path models?
How likely is Simpson s paradox in path models? Ned Kock Full reference: Kock, N. (2015). How likely is Simpson s paradox in path models? International Journal of e- Collaboration, 11(1), 1-7. Abstract
More informationDr. Junchao Xia Center of Biophysics and Computational Biology. Fall /13/2016 1/33
BIO5312 Biostatistics Lecture 03: Discrete and Continuous Probability Distributions Dr. Junchao Xia Center of Biophysics and Computational Biology Fall 2016 9/13/2016 1/33 Introduction In this lecture,
More informationA measure of partial association for generalized estimating equations
A measure of partial association for generalized estimating equations Sundar Natarajan, 1 Stuart Lipsitz, 2 Michael Parzen 3 and Stephen Lipshultz 4 1 Department of Medicine, New York University School
More informationIntroduction to Machine Learning. Lecture 2
Introduction to Machine Learning Lecturer: Eran Halperin Lecture 2 Fall Semester Scribe: Yishay Mansour Some of the material was not presented in class (and is marked with a side line) and is given for
More informationThe Logit Model: Estimation, Testing and Interpretation
The Logit Model: Estimation, Testing and Interpretation Herman J. Bierens October 25, 2008 1 Introduction to maximum likelihood estimation 1.1 The likelihood function Consider a random sample Y 1,...,
More informationParameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn
Parameter estimation and forecasting Cristiano Porciani AIfA, Uni-Bonn Questions? C. Porciani Estimation & forecasting 2 Temperature fluctuations Variance at multipole l (angle ~180o/l) C. Porciani Estimation
More informationDiscrete Random Variables
Discrete Random Variables An Undergraduate Introduction to Financial Mathematics J. Robert Buchanan Introduction The markets can be thought of as a complex interaction of a large number of random processes,
More informationWhat is Probability? Probability. Sample Spaces and Events. Simple Event
What is Probability? Probability Peter Lo Probability is the numerical measure of likelihood that the event will occur. Simple Event Joint Event Compound Event Lies between 0 & 1 Sum of events is 1 1.5
More informationPreliminary Results on Social Learning with Partial Observations
Preliminary Results on Social Learning with Partial Observations Ilan Lobel, Daron Acemoglu, Munther Dahleh and Asuman Ozdaglar ABSTRACT We study a model of social learning with partial observations from
More informationChapter 1. Sets and probability. 1.3 Probability space
Random processes - Chapter 1. Sets and probability 1 Random processes Chapter 1. Sets and probability 1.3 Probability space 1.3 Probability space Random processes - Chapter 1. Sets and probability 2 Probability
More informationPubH 5450 Biostatistics I Prof. Carlin. Lecture 13
PubH 5450 Biostatistics I Prof. Carlin Lecture 13 Outline Outline Sample Size Counts, Rates and Proportions Part I Sample Size Type I Error and Power Type I error rate: probability of rejecting the null
More information