Remarks on Random Sequences

Branden Fitelson & Daniel Osherson

1 Setup

We consider evidence relevant to whether a (possibly idealized) physical process is producing its output randomly. For definiteness, we'll consider a coin-flipper C which reports H for heads and T for tails. By C producing its output randomly, we mean that H and T have equal probability and that trials are independent. If C produces its output randomly (in the above sense), then we'll say that C is a random device.

There are many potential reasons to believe that C is a random device (or that it's not). We might know something about its manufacture, or be told that C is random on good authority, etc. Our question is whether there is any information in C's output that bears on whether C is random.

We'll consider two statistics (concerning output sequences generated by C) that are often taken to provide information about C's randomness. The first statistic is the number of runs in an output sequence [6]. The second statistic is the number of heads versus tails. In order to evaluate the utility of these statistical tests for randomness, we'll focus on the following two potential output sequences:

(A) HTTHTHHHT
(B) HHHHHTTTT

A run in a sequence is a maximal non-empty segment consisting of adjacent equal elements. For example, (A) has six runs whereas (B) has just two. If Hs and Ts alternate randomly then the number of runs after N trials is a random variable whose cumulative distribution is given by counting the number of sequences of length N with r or fewer runs (or conversely, r or more runs). Doing the relevant calculations for (A), we deduce:

If C is a random device then the probability is 0.363 of producing this many (viz., six) runs or more in a sequence of length nine.

Because of this, advocates of the runs test say that producing (A) does not strongly disconfirm that C is a random device.
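The runs calculation just described can be sketched in a few lines of Python. This is a minimal illustration, not the authors' own computation: the function names are hypothetical, and it assumes the unconditional distribution of runs (a sequence of length n with r runs has r - 1 symbol changes among its n - 1 adjacent pairs, each change having probability 1/2). That assumption reproduces the figure quoted above for (A), and the same functions give the lower tail for a sequence with few runs such as (B).

```python
from itertools import groupby
from math import comb

def count_runs(seq):
    """Number of maximal blocks of adjacent equal symbols."""
    return sum(1 for _ in groupby(seq))

def runs_pmf(n, r):
    """P(exactly r runs) in n fair, independent flips.

    Assumption: r runs means r - 1 changes among the n - 1 adjacent
    pairs, and each pair changes independently with probability 1/2.
    """
    return comb(n - 1, r - 1) / 2 ** (n - 1)

def prob_runs_at_least(n, r):
    """Upper tail: P(r or more runs) in a sequence of length n."""
    return sum(runs_pmf(n, k) for k in range(r, n + 1))

def prob_runs_at_most(n, r):
    """Lower tail: P(r or fewer runs) in a sequence of length n."""
    return sum(runs_pmf(n, k) for k in range(1, r + 1))

A = "HTTHTHHHT"
B = "HHHHHTTTT"

# (A): six runs, upper tail 93/256
print(count_runs(A), round(prob_runs_at_least(len(A), count_runs(A)), 3))  # 6 0.363
# (B): two runs, lower tail 9/256
print(count_runs(B), round(prob_runs_at_most(len(B), count_runs(B)), 3))   # 2 0.035
```

Note that this sketch does not condition on the observed number of heads and tails, as some formulations of the runs test do; it is chosen here only because it matches the probabilities reported in the text.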
For (B) the same calculations imply:

If C is a random device then the probability is 0.035 of producing this many runs (viz., two) or fewer in a sequence of length nine.

In this case, advocates of the runs test say that C's generating (B) does strongly disconfirm C's randomness.¹

The binomial test gives the probability of throwing at least x heads in n tosses of the coin (or the probability of throwing at most x heads if they are fewer than n/2). In both (A) and (B), we see 5 heads in 9 tosses. We compute that if C is a random device, then producing a sequence with five or more heads has probability 0.5. Because of this, advocates of the binomial test say that the fact that C generates either sequence does not strongly disconfirm the claim that C is a random device.

We've exploited two statistical tests to evaluate evidence regarding whether C is a random device. If (B) is the output, the first test ("runs") classifies this as strongly disconfirmatory of C's randomness. If the output is (A), the first test does not deem this to be strongly disconfirmatory. The second test ("binomial") views neither (A) nor (B) as constituting strong evidence against C's randomness. While these tests may disagree with each other, they each seem to be perfectly self-consistent. But there is a problem...

¹ Standard objections to evidential interpretations of classical statistical tests have been recently surveyed in [5] and [2]. Our objection will be somewhat different from earlier concerns.

2 The Problem

At a given position of the sequence produced by C, there are more potential events than just heads and tails. For example, let X = {1, 4, 9}, and define:

Position i of C's output holds a hail (h) iff either i ∈ X and position i holds a head (H), or i ∉ X and position i holds a tail (T).

Position i of C's output holds a tead (t) iff either i ∈ X and position i holds a tail (T), or i ∉ X and position i holds a head (H).

Given these definitions of teads and hails, we see that C generates (A) iff C generates (a), and C generates (B) iff C generates (b).

(A) HTTHTHHHT    (a) hhhhhtttt
(B) HHHHHTTTT    (b) htththhht

Let us respond at once to the concern that teads and hails are unnatural, position dependent, or otherwise gerrymandered. Such characterizations seem no more applicable to teads/hails than to heads/tails. For we have the following symmetry:

Position i of C's output holds a tail (T) iff either i ∈ X and position i holds a tead (t), or i ∉ X and position i holds a hail (h).

Position i of C's output holds a head (H) iff either i ∈ X and position i holds a hail (h), or i ∉ X and position i holds a tead (t).

Someone who thinks in terms of heads/tails may well find teads/hails to be derivative. But someone who thinks in terms of teads/hails will make the parallel claim about heads/tails. It's not obvious how to break the symmetry.

Moreover, C produces an unbiased, independent sequence of heads/tails iff C produces an unbiased, independent sequence of teads/hails. (This is easy to verify.) Therefore, the runs test applied to teads/hails is as relevant to the randomness of C as the runs test applied to heads/tails. Unfortunately, applying the runs test to teads/hails leads to a reversal of our initial assessment (in terms of heads/tails). To see this, just count the number of runs of teads/hails in (a) and (b), above. We see that (a) has two runs and (b) has six. Doing the relevant calculations for (a), we deduce:

If C is a random device then the probability is 0.035 of producing a sequence (of length nine) with so few t/h runs.

The advocate of the runs test should say that this constitutes strong evidence against the claim that C is a random device. And, for (b), we deduce:

If C is a random device then the probability is 0.363 of producing a sequence (of length nine) with so many t/h runs.

The advocate of the runs test should say that this does not constitute strong evidence against the claim that C is a random device. Thus, the use of teads/hails instead of heads/tails reverses the evidential verdict implied by the runs test!

Underlying this phenomenon is an alteration of the rejection set in the passage from heads/tails to teads/hails. The rejection set is composed of the sequences whose number of runs is too extreme to be easily compatible with C's randomness. A given potential output from C might be considered extreme when the rejection set is reckoned in terms of runs of heads/tails but not teads/hails, and conversely. So the runs test is ambiguous unless some reason can be given to favor one way of counting runs over all the competing ways (and finding such a reason seems problematic).

The same sort of reversal can be achieved for the binomial test as well. To wit, consider the following pair of potential outcome sequences:

(A) HTTHTHHHT
(D) TTTTTTTTT

Then, let Y = {2, 3, 5, 9}, and define:

Position i of C's output holds a schmail (t) iff either i ∈ Y and position i holds a tail (T), or i ∉ Y and position i holds a head (H).

Position i of C's output holds a schmead (h) iff either i ∈ Y and position i holds a head (H), or i ∉ Y and position i holds a tail (T).

Similarly to before, C is a random device for generating heads/tails iff C is a random device for generating schmails/schmeads. Moreover, C produces (A) or (D) iff C produces (c) or (d), respectively. So we can apply the binomial test to both (pairs of) sequences of events:

(A) HTTHTHHHT    (c) ttttttttt
(D) TTTTTTTTT    (d) htththhht

Applying the binomial test to the schmeads and schmails in (c) yields:

If C is a random device then the probability is 0.004 of producing an event with so few hs.

The advocate of the binomial test should therefore view C's generating (c) as strong evidence against the claim that C is a random device. We saw earlier that the binomial test does not imply that (A) is an improbable sequence if generated randomly. As such, advocates of the binomial test should not view C's generation of (A) as strong evidence against C's randomness. The same reversal affects (D) and (d). Once again, the test's implications about evidential relevance depend on which concepts we employ.²

² Such reversals will plague any statistical test for randomness that we have encountered (see [3, Ch. 2] for a recent survey).

3 What the Problem is Not

The teads/hails terminology resonates with Goodman's [1, Ch. 3] use of grue/bleen to question the basis of projections to the future. But this is not the point of the present discussion. Indeed, whether one reckons an output sequence as HTTHTHHHT versus hhhhhtttt has no bearing on predictions about the next coin toss. After the 9th output, heads are invariably teads and tails are invariably hails. So if you expect a head [tail] there is no harm in announcing a tead [hail]. The situation is thus different from Goodman's, since projecting the greenness of emeralds ultimately conflicts with projecting grueness (after time t the two kinds of emeralds look different). The same remarks apply to schmeads and schmails.

In contrast, the choice between heads/tails and teads/hails appears to alter the verdict of standard statistical tests about the here and now, namely, whether C is producing its output randomly. Driving the ambiguity is the fact that C issues heads and tails in a uniform, independent way just in case the same is true for teads and hails; hence, the tests apply equally in the two cases. Preserving the null hypothesis of uniformity and independence across shifts in vocabulary is not a feature of the grue/bleen puzzle.³

Of course, at a more abstract level, both teads/hails and grue/bleen point to the language dependence of inductive inference. If we denoted both heads and tails by "theds", without specialized vocabulary for each, then we might be struck by the fact that C produces nothing but theds. But our point is more specific. Standard statistical tests for the randomness of C yield conflicting results even though C is random with respect to one vocabulary if and only if it is random with respect to the other. Unless a principled choice can be made among candidate vocabularies, the tests are bound to offer equivocal verdicts.

Embracing the language dependence of the tests, moreover, does not seem to be a viable response to the ambiguity. It makes no sense to declare different levels of confidence for C's being random in the sense of heads/tails compared to C's being random in the sense of teads/hails. For (to repeat), C is random in one sense if and only if it is random in the other. A better response, it seems to us, is to abandon the tests altogether, along with any other attempt to harness C's output to compute its likelihood assuming randomness within a null hypothesis framework.

³ In this sense, the present phenomenon is perhaps more similar to Miller's [4, Ch. 11] language-dependencies than Goodman's.
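The relabelings and reversals of Section 2 can be checked mechanically. The sketch below is illustrative only, and its function names are hypothetical. It assumes the unconditional runs distribution (r runs correspond to r - 1 changes among n - 1 adjacent pairs) and a one-sided tail taken in the direction of the observed statistic. On that reading, the one-sided binomial tail for zero hs in nine trials is 1/512, and doubling it gives roughly the 0.004 quoted for (c); treating that figure as a doubled tail is an assumption, not something the text states.

```python
from itertools import groupby
from math import comb

def relabel(seq, index_set):
    """Map H/T to h/t position by position (1-indexed).

    Inside index_set: H -> h, T -> t. Outside it: H -> t, T -> h.
    This covers teads/hails (set X) and schmails/schmeads (set Y).
    """
    return "".join(
        "h" if (symbol == "H") == (i in index_set) else "t"
        for i, symbol in enumerate(seq, start=1)
    )

def count_runs(seq):
    """Number of maximal blocks of adjacent equal symbols."""
    return sum(1 for _ in groupby(seq))

def runs_tail(seq):
    """One-sided tail for the observed number of runs under fair flips."""
    n, r = len(seq), count_runs(seq)
    # Upper tail if r is at or above the mean (n + 1) / 2, else lower tail.
    ks = range(r, n + 1) if 2 * r >= n + 1 else range(1, r + 1)
    return sum(comb(n - 1, k - 1) for k in ks) / 2 ** (n - 1)

def binomial_tail(seq, symbol):
    """One-sided tail for the observed count of `symbol` under fair flips."""
    n, x = len(seq), seq.count(symbol)
    ks = range(x, n + 1) if 2 * x >= n else range(0, x + 1)
    return sum(comb(n, k) for k in ks) / 2 ** n

A, B, D = "HTTHTHHHT", "HHHHHTTTT", "TTTTTTTTT"
X, Y = {1, 4, 9}, {2, 3, 5, 9}

a, b = relabel(A, X), relabel(B, X)   # hhhhhtttt, htththhht
c, d = relabel(A, Y), relabel(D, Y)   # ttttttttt, htththhht

# Runs test: the verdicts on (A)/(a) and (B)/(b) swap.
print(round(runs_tail(A), 3), round(runs_tail(a), 3))  # 0.363 0.035
print(round(runs_tail(B), 3), round(runs_tail(b), 3))  # 0.035 0.363

# Binomial test: 5 Hs in (A) versus 0 hs in (c).
print(binomial_tail(A, "H"))                 # 0.5
print(round(2 * binomial_tail(c, "h"), 3))   # 0.004 (doubled one-sided tail)
```

The same `relabel` function serves both vocabularies because the two definitions have the same form: a symbol keeps its letter at positions inside the index set and swaps it outside.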

4 Lessons Learned

What is the value of a statistical test whose outcome is so sensitive to the concepts used to describe the data? It would appear that this kind of null hypothesis testing in the service of evaluating the randomness of C is of little epistemic value. Indeed, it is often noted that all sequences of a given length have the same probability of being generated by an unbiased independent source. So there's no such thing as an "atypical" sequence that is unlikely to be generated if C is random. All sequences are atypical, surprising, coincidental, etc.⁴

Yet, intuitively, it seems reasonable (in some sense) to be sceptical about the randomness of a source that relentlessly produces heads. What explanation can we offer for such doubt? Prior to seeing any output, there are many alternatives to the hypothesis that C is random. One alternative is that a human mind controls the output. The human-control hypothesis enjoys a relatively elevated prior probability because there are so many human minds in the neighborhood. (If we lived far away, we might be surrounded by teads/hails speakers, leading to different priors about the character of C.) The likelihood of a long initial stretch of heads given human control is relatively high (simply because that's the kind of thing a human would do), so the posterior probability of human control comes to swamp the priors.

On our view, belief that C is random should not be based solely on C's output. Ideally, it is inspection of C's mechanism that grounds convictions about randomness (perhaps because of symmetries discovered, or for deeper reasons involving quantum theory, etc.). On this view, a sequence of events is random iff it has been generated by a random device.

⁴ Ironically, some advocates of statistical tests for randomness [3, §1.8.1] seem to think that this reveals a shortcoming of purely probabilistic assessments of output sequences. On the contrary, we think the present considerations show that there is something wrongheaded about any approach to randomness that appeals solely to properties of output sequences.

References

[1] Goodman, N. (1955), Fact, Fiction and Forecast, Harvard.
[2] Greco, D. (2011), Significance Testing in Theory and Practice, British Journal for the Philosophy of Science 62: 607–637.
[3] Li, M. and Vitanyi, P. (2008), An Introduction to Kolmogorov Complexity and Its Applications, third edition, Springer.
[4] Miller, D. (2006), Out of Error: Essays in Critical Rationalism, Ashgate.
[5] Royall, R. (1997), Statistical Evidence: A Likelihood Paradigm, Chapman & Hall.
[6] Wald, A. and Wolfowitz, J. (1940), On a test whether two samples are from the same population, Annals of Mathematical Statistics 11: 147–162.