Stochastic orders of variability

Similar documents
VARIABILITY OF THE MAX AND MIN STATISTIC: A THEORY OF THE QUANTILE SPREAD AS A FUNCTION OF SAMPLE SIZE INDIANA UNIVERSITY HANS COLONIUS

Module 3. Function of a Random Variable and its distribution

2. Variance and Higher Moments

arxiv: v1 [math.pr] 3 Apr 2019

Some Recent Results on Stochastic Comparisons and Dependence among Order Statistics in the Case of PHR Model

Stochastic orders: a brief introduction and Bruno s contributions. Franco Pellerey

Distributions of Functions of Random Variables. 5.1 Functions of One Random Variable

Stochastic Comparisons of Weighted Sums of Arrangement Increasing Random Variables

NIH Public Access Author Manuscript J Math Psychol. Author manuscript; available in PMC 2012 June 1.

DISPERSIVE FUNCTIONS AND STOCHASTIC ORDERS

Statistical Methods in HYDROLOGY CHARLES T. HAAN. The Iowa State University Press / Ames

PCMI Introduction to Random Matrix Theory Handout # REVIEW OF PROBABILITY THEORY. Chapter 1 - Events and Their Probabilities

If g is also continuous and strictly increasing on J, we may apply the strictly increasing inverse function g 1 to this inequality to get

Week 2. Review of Probability, Random Variables and Univariate Distributions

t x 1 e t dt, and simplify the answer when possible (for example, when r is a positive even number). In particular, confirm that EX 4 = 3.

1 Exercises for lecture 1

STA205 Probability: Week 8 R. Wolpert

ELEG 833. Nonlinear Signal Processing

Continuous Random Variables and Continuous Distributions

Statistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation

HANDBOOK OF APPLICABLE MATHEMATICS

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Politecnico di Torino. Porto Institutional Repository

Continuous Random Variables

Topic 4: Continuous random variables

The Union and Intersection for Different Configurations of Two Events Mutually Exclusive vs Independency of Events

The expected value E[X] of discrete random variable X is defined by. xp X (x), (6.1) E[X] =

Monte Carlo Studies. The response in a Monte Carlo study is a random variable.

Chapter 7: Special Distributions

Supermodular ordering of Poisson arrays

Lecture 22: Variance and Covariance

Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016

Nonparametric estimation of. s concave and log-concave densities: alternatives to maximum likelihood

Probability and Measure

Some properties of skew-symmetric distributions

Generalized quantiles as risk measures

IAM 530 ELEMENTS OF PROBABILITY AND STATISTICS LECTURE 3-RANDOM VARIABLES

3 Continuous Random Variables

The Kumaraswamy distribution: median-dispersion re-parameterizations for regression modeling and simulation-based estimation

Topic 4: Continuous random variables

Subject CS1 Actuarial Statistics 1 Core Principles

Nonparametric estimation: s concave and log-concave densities: alternatives to maximum likelihood

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows.

2. The Normal Distribution

ECON 5350 Class Notes Review of Probability and Distribution Theory

MATH 217A HOMEWORK. P (A i A j ). First, the basis case. We make a union disjoint as follows: P (A B) = P (A) + P (A c B)

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics

Parameter addition to a family of multivariate exponential and weibull distribution

X n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2)

Actuarial Science Exam 1/P

Ø Set of mutually exclusive categories. Ø Classify or categorize subject. Ø No meaningful order to categorization.

1 Appendix A: Matrix Algebra

ON THE NORMAL MOMENT DISTRIBUTIONS

ORDER RESTRICTED STATISTICAL INFERENCE ON LORENZ CURVES OF PARETO DISTRIBUTIONS. Myongsik Oh. 1. Introduction

Some New Results on Information Properties of Mixture Distributions

Stat410 Probability and Statistics II (F16)

1. Let X and Y be independent exponential random variables with rate α. Find the densities of the random variables X 3, X Y, min(x, Y 3 )

Stat 5101 Lecture Slides Deck 4. Charles J. Geyer School of Statistics University of Minnesota

Convexification and Multimodality of Random Probability Measures

Curriculum Map for Mathematics SL (DP1)

A MODEL-THEORETIC PROOF OF HILBERT S NULLSTELLENSATZ

Review 1: STAT Mark Carpenter, Ph.D. Professor of Statistics Department of Mathematics and Statistics. August 25, 2015

WHEN MORE SELECTION IS WORSE

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner

Part IA Probability. Theorems. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

A new approach for stochastic ordering of risks

Mathematical Statistics

Brief Review of Probability

Maximum likelihood estimation of a log-concave density based on censored data

Introduction to Statistical Analysis

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1

Log-concave distributions: definitions, properties, and consequences

Stochastic Orders in Heterogeneous Samples with Applications

Monotonicity and Aging Properties of Random Sums

ORDER STATISTICS, QUANTILES, AND SAMPLE QUANTILES

Solving Systems (A.5): 5 blocks - Graphing (1 block) - Substitution (2 blocks) - Elimination (2 blocks)

Applications in Differentiation Page 3

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 9 10/2/2013. Conditional expectations, filtration and martingales

arxiv: v1 [math.st] 30 Mar 2015

Introduction & Random Variables. John Dodson. September 3, 2008

Probability and Distributions

5.6 The Normal Distributions

Stat 5101 Notes: Brand Name Distributions

Friedrich-Alexander-Universität. Erlangen-Nürnberg

R. Koenker Spring 2017 Economics 574 Problem Set 1

Weighted Exponential Distribution and Process

THE QUEEN S UNIVERSITY OF BELFAST

Introduction of Shape/Skewness Parameter(s) in a Probability Distribution

2-1. xp X (x). (2.1) E(X) = 1 1

Supplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements

Notes on Mathematical Expectations and Classes of Distributions Introduction to Econometric Theory Econ. 770

Introduction to Statistical Data Analysis Lecture 3: Probability Distributions

Exploratory data analysis: numerical summaries

Extreme Value Analysis and Spatial Extremes

Institute of Actuaries of India

Probability theory basics

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

CHAPTER 10: POLYNOMIALS (DRAFT)

Transcription:

Stochastic orders of variability Hans Colonius Revision August 2015

Variability orders 1 Introduction: Some Variability Orders The fact that the response to one and the same stimulus, under seemingly identical conditions, varies from one instance to the next is a fundamental observation of reaction time (RT) research and, arguably, of behavioral science in general. Moreover, recent neurophysiological modeling approaches seek to identify the sources of variability at the neuronal level (e.g., Lin et al., 2015; Goris et al., 2014; van den Berg et al., 2012) and often try relate them to trial-to-trial variability in behavior. In psychology, the recognition that the complete distribution function of RTs may contain critical information about the underlying stochastic mechanism not accessible by only considering their mean, has led to the development of many RT models predicting specific distribution functions, e.g. race, counter, and diffusion models (Luce, 1986). However, these models are often testable only if one takes specific distributional assumptions for granted, even if these assumption are not part of the models core assumptions (for a recent discussion, see Jones and Dzhafarov, 2014). Much of Townsend s work is in a different tradition, however (e.g., Townsend and Eidels, 2011; Townsend and Wenger, 2004; Townsend and Nozawa, 1995). Order relations between response time (RT) distributions that compare the location or the magnitude of random variables like stochastic order, hazard rate order, or likelihood order have played an important role in his work (e.g., Townsend, 1990). More recently, we addressed the issue of comparing RT distributions with respect to measures of variability (Townsend and Colonius, 2005). In this chapter, I recollect some results of the latter paper and relate them to some recent developments in statistics. While specific distributions are typically described by certain moments (mean and variance) and transforms thereof, in particular skewness and kurtosis, this approach has certain shortcomings. For example, the variance may not be finite. Several notions of variability, weaker and stronger than the variance, have been investigated in the statistical literature (cf., Shaked and Shantikumar, 2007) 1. The first variability order considered here requires the definition of a convex function φ : R R, i.e., for all x, y R and α [0, 1]. φ(αx + (1 α)y) αφ(x) + (1 α)φ(y), Definition 1 (Convex order). 2 Let X and Y be two random variables such that Eφ(X) Eφ(Y ) for all convex functions φ : R R, provided the expectations exist. Then X is said to be smaller than Y in the convex order, denoted as X cx Y. 1 If not indicated otherwise, proofs of all statement made here can be found in this monograph. 2 Definitions, examples, and propositions are numbered jointly and consecutively.

Variability orders 2 Roughly speaking, convex functions take on their (relatively) larger values over regions of the form (, a) (b, + ) for a < b. Therefore, when the condition in the above definition holds, Y is more likely to take on extreme values than X, that is, Y is more variable than X. 3 The convex order has some strong implications. Using the convexity of functions φ(x) = x, φ(x) = x, and φ(x) = x 2, it is can be shown that X cx Y implies EX = EY and VarX VarY. Moreover, when EX = EY, then a condition equivalent to X cx Y is E[max(X, a)] E[max(Y, a)], for all a R. The convex order only compares random variables that have the same means. One way to drop this requirement is to introduce a so-called dilation order by Definition 2 (Dilation order). X dil Y, if [X EX] cx [Y EY ]. For nonnegative random variables X and Y with finite means, one can alternatively define the Lorenz order by X Lorenz Y, if X EX cx Y EY, which can be used to order random variables with respect to the Lorenz curve used, e.g., in economics to measure the inequality of incomes. For the next order type, we need the concept of a quantile function, closely related to that of a distribution function. Definition 3 (Quantile function). Let X be a real-valued random variable with distribution function F (x). Then the quantile function of X is defined as Q(u) = F 1 (u) = inf{x F (x) u}, 0 u 1. (1) For every < x < + and 0 < u < 1, we have F (x) u if, and only if, Q(u) x. Thus, if there exists x with F (x) = u, then F (Q(u)) = u and Q(u) is the smallest value of x satisfying F (x) = u. If F (x) is continuous and strictly increasing, Q(u) is the unique value x such that F (x) = u. The specification of a distribution through its quantile 3 Clearly, it is sufficient to consider only functions φ that are convex on the union of the supports of X and Y rather than over the whole real line.

Variability orders 3 function takes away the need to describe a distribution through its moments. Alternative measures in terms of quantiles are available, e.g. the median as a measure of location, defined as md(x) = Q(0.5), or the interquartile range, as a measure of dispersion, defined as IQR = Q(.75) Q(.25). The following order is based on comparing difference between quantiles of the distribution functions. Definition 4 (Dispersive order). Let X and Y be random variables with quantile functions F 1 and G 1, respectively. If F 1 (β) F 1 (α) G 1 (β) G 1 (α), whenever 0 < α β < 1, then X is said to be smaller than Y in the dispersive order, denoted by X disp Y. In contrast to the convex order, the dispersive order is clearly location free: X disp Y X + c disp Y, for any real c. The dispersive order is also dilative, i.e., X disp ax whenever a 1 and, moreover, X disp Y X disp Y. However, it is not closed under convolutions. Its characterization requires two more definitions. First, a random variable Z is called dispersive if X +Z disp Y +Z whenever X disp Y and Z is independent of X and Y. Second, a density (more general, any nonnegative function) g is called logconcave if log g is concave, or, equivalently, g[αx + (1 α)y] [g(x)] α [g(y)] 1 α. Proposition 5. The random variable X is dispersive if, and only if, X has a logconcave density. 4 It can be shown that the dispersion order implies the dilation order, that is, for random variables with finite means, X disp Y = X dil Y, from which it immediately follows that also VarX VarY. 4 Note that a logconcave density follows from its distribution function G being strongly unimodal, i.e., if the convolution G F is unimodal for every unimodal F.

Variability orders 4 The Quantile Spread All variability orders considered so far are strong enough to imply a corresponding ordering of the variances. Moreover, if X and Y have the same finite support and satisfy X disp Y, then they must have the same distribution. Hence a weaker concept is desirable. In Townsend and Colonius (2005), we developed such a weaker concept of variability order in an attempt to describe the effect of sample size on the shape of the distribution of the extreme order statistics X 1:n (minimum) and X n:n (maximum). It is based on the notion of quantile spread: Definition 6. The quantile spread of random variable X with distribution F is for 0.5 < p < 1. QS X (p) = F 1 (p) F 1 (1 p), With S = 1 F (the survival function), F 1 (p) = S 1 (1 p) implies QS X (p) = S 1 (1 p) S 1 (p). It turns out that the concept of a spread function, as already defined by Balanda and MacGillivray (1990), is equivalent to the quantile spread. The quantile spread of a distribution describes how the probability mass is placed symmetrically about its median and hence can be used to formalize concepts such as peakedness and tailweight traditionally associated with kurtosis. This way, it allows us to separate concepts of kurtosis and peakedness for asymmetric distributions. Example 7 (Weibull). For the Weibull distribution (see Figure 1) defined by the quantile spread is for 0.5 < p < 1. F (x) = 1 exp( βx α ) with α, β > 0, QS X (p) = β 1/α [ ( ln ) 1 1/α ( ) ] 1 1/α ln 1 p p For α = 1 (exponential distribution with parameter β) one gets p QS X (p) = (1/β) ln 1 p. For β = 1/2, this equals the quantile spread function for the (standard) logistic distribution, F (x) = 1, < x < +. 1 + exp( x) NOTE: This show that the quantile spread does not uniquely characterize the distribution function from which it has been generated.

Variability orders 5 Figure 1: Two Weibull distributions with α = 2 and α = 6, and β = 2. The horizontal arrows indicate their quantile spread at p = 0.8. The order QS is defined in the next section. Example 8 (Cauchy distribution). The Cauchy distribution with density f(x) = has no finite variance; the quantile function is yielding the quantile spread for 0.5 < p < 1. γ π(x 2 + γ 2 ), γ > 0, F 1 (p) = γ tan[π(p 0.5)], F 1 (p) F 1 (1 p) = γ{tan[π(p 0.5)] tan[π(0.5 p)]}, Example 9 (Extreme order statistics). For the extreme order statistics, one gets simple expressions for the quantile spread, making it easy to describe their behavior as a function of sample size n: QS min (p) = S 1 [(1 p) 1/n ] S 1 (p 1/n ) and QS max (p) = F 1 (p 1/n ) F 1 [(1 p) 1/n ]. Not surprisingly, the case of a symmetric parent distribution leads to a simplification with respect to the order statistics:

Variability orders 6 Proposition 10. For a symmetric parent distribution, the quantile spread for the maximum is identical to the quantile spread for the minimum. Proof. (For a proof assuming symmetry around zero, see Townsend and Colonius, 2005, p. 765. The general case follows similarly.). The Quantile Spread Order Definition 11. Let X and Y be random variables with quantile spreads QS X and QS Y, respectively. Then X is called smaller than Y in quantile spread order, denoted as X QS Y, if QS X (p) QS Y (p), for all p (0.5, 1). The following properties of the quantile spread order can be determined: 1. The order QS is location-free, i.e., X QS Y iff X + c QS Y, for any real c 2. The order QS is dilative, i.e, X QS ax, whenever a 1. 3. X QS Y if, and only if X QS Y. 4. Assume F X and F Y are symmetric, then X QS Y if, and only if, F 1 X 1 (p) FY (p) for p (0.5, 1). 5. QS implies ordering of the mean absolute deviation around the median, MAD, (md the median), i.e., The last point follows from writing MAD(X) = E[ X md(x) ] X QS Y implies MAD(X) MAD(Y ). MAD(X) = 1 For the Cauchy distribution with quantile spread 0.5 [F 1 (p) F 1 (1 p)] dp. QS(p) = γ{tan[π(p 0.5)] tan[π(0.5 p)]}, it is obvious that parameter γ clearly defines the QS order for this distribution. The next example demonstrates how the QS order can be used to describe the effect of sample size on the minimum order statistic.

Variability orders 7 Example 12. The quantile spread for the Weibull minimum (cf. Example 7) is [ ( ) QS min (p; n) = (nλ) 1/α 1 1/α ( ln ln 1 ) ] 1/α. 1 p p For p (0.5, 1), this is decreasing in n. Thus, X 1:n QS X 1:n 1, i.e., the quantile spread for the Weibull minimum decreases as a function of sample size n. The next section illustrates a recent application of the quantile spread order to define the dispersive properties of a not yet well known distribution. The Quantile Spread Order for the Kumaraswamy Distribution The density for the Beta distribution, frequently used in modeling, is: f(x) = Γ(α + β) Γ(α)Γ(β) xα 1 (1 x) β 1, 0 < x < 1, where Γ is the gamma function, α > 0, and β > 0. It is flexible and has simple closedform expressions for the moments. However, it has no closed-form solution for its quantile function F 1 (p), which is often a drawback in simulation and estimation studies. An alternative distribution is the Kumaraswamy distribution which is, in many respects, very similar to the Beta distribution. Its dispersion properties have recently been studied by Mitnik and Baek (2013). The distribution function is: F (x) = 1 [ 1 ( x c b c ) p ] q where c < x < b and p, q are shape parameters with p > 0, q > 0 (Kumaraswamy, 1980). Let us denote this general form of the distribution by K(p, q, c, b). In contrast to the Beta distribution, it has no closed-form expressions for the mean and variance, but a simple quantile function: [ ] x = Q(u) = c + (b c) 1 (1 u) 1 1/p q, for 0 < u < 1. One question addressed in Mitnik and Baek (2013) was how to reparametrize the distribution so that members of its distribution family can be compared with respect to their variability. From the quantile function, one obtains for the median md(x) = c + (b c)(1 0.5 1 q ) 1/p. From this, the authors derive two re-parametrizations of the distribution, K p (ω, d p, c, b) and K q (ω, d q, c, b), where d p = p 1, d q = q 1, and ω = md(x) (for details, see Mitnik and Baek,

Variability orders 8 2013, p. 181). They go on to show that d p and d q are in fact dispersion parameters with reference to a well-defined dispersion order. Note that this order cannot be the dispersive order disp (cf. Definition 4): the Kumaraswamy distribution has a finite support, so any two distributions with the same b and c parameters ordered by disp would be identical (see Shaked and Shantikumar, 2007, Theorem 3.B.14). This argument, however, does not rule out the dilation order which is derived from the convex order (see Definition 2). In an additional study (see supplement to Mitnik and Baek, 2013), they show, via numerical computations, that the dilation order is not an order consistent with values of parameters d p and d q but that the quantile spread order is. They prove the following Proposition 13. Let X be distributed as K r (ω, d rx, c, b) and Y as K r (ω, d ry, c, b); then X QS Y if, and only if, d rx d ry, with r = p, q. Trivially, as well the interquartile range, defined as QS X (.75) is consistent with values of the parameters d p and d q. Finally, it is also follows from the above that if d rx d ry, then MAD(X) MAD(Y ), for r = p, q. Thus, Mitnik and Baek (2013) conclude that the only known variability order that is appropriate for the Kumaraswamy distribution is the quantile spread order defined in Townsend and Colonius (2005) (see also p. 151, Shaked and Shantikumar, 2007). Acknowledgment This chapter is dedicated to my friend and former colleague, Jim Townsend, at the occasion of a conference held in his honor at Indiana University, Bloomington, in April 2013. Comments by Trisha van Zandt and Joe Houpt on an earlier version, improving readability and correcting errors, are gratefully acknowledged.

Bibliography Balanda, K. and MacGillivray, H. (1990). Kurtosis and spread. The Canadian Journal of Statistics, 18(1):17 30. Goris, R. L. T., Movshon, J. A., and Simoncelli, E. P. (2014). variability. Nature Neuroscience, 17(6):858 865. Partitioning neuronal Jones, M. and Dzhafarov, E. N. (2014). Unfalsifiability and mutual translatability of major modeling schemes for choice reaction time. Psychological Review, 121:1 32. Kumaraswamy, P. (1980). A generalized probability density function for double-bounded random processes. Journal of Hydrology, 46:79 88. Lin, I.-C., Okun, M., Carandini, M., and Harris, K. D. (2015). cortical variability. Neuron, 87:644 656. The nature of shared Luce, R. (1986). Response times: Their role in inferring elementary mental organization. Oxford University Press, New York, NY. Mitnik, P. and Baek, S. (2013). The kumaraswamy distribution: median-dispersion reparameterizations for regression modeling and simulation-based estimation. Statistical Papers, 54:177 192. Shaked, M. and Shantikumar, J. (2007). Stochastic orders. Springer Science+Business Media, New York, NY. Townsend, J. (1990). Truth and consequences of ordinal differences in statistical distributions: toward a theory of hierarchical inference. Journal of Mathematical Psychology, 108(551 567). Townsend, J. and Colonius, H. (2005). Variability of the max and min statistic: a theory of the quantile spread as a function of sample size. Psychometrika, 70(4):759 772. Townsend, J. and Eidels, A. (2011). Workload capacity spaces: A unified methodology for response time measures of efficiency as workload is varied. Psychonomic Bulletin and Review, 18:659 681. 9

Variability orders 10 Townsend, J. and Nozawa, G. (1995). Spatio-temporal properties of elementary perception: an investigation of parallel, serial, and coactive theories. Journal of Mathematical Psychology, 39(4):321 359. Townsend, J. and Wenger, M. (2004). A theory of interactive parallel processing: new capacity measures and predictions for a response time inequality series. Psychological Review, 111(4):1003 1035. van den Berg, R., Shin, H., Chou, W.-C., George, R., and Ma, W. J. (2012). Variability in encoding precision accounts for visual short-term memory limitations. Proceedings of the National Academy of Sciences of the United States of America, 109(22):8780 8785.