Defensive Forecasting: Good probabilities mean less than we thought

Defensive Forecasting: Good probabilities mean less than we thought!
Foundations of Probability Seminar, Departments of Statistics and Philosophy, November 2, 2016
Glenn Shafer, Rutgers University

0. Game-theoretic probability
1. Strategies for Forecaster
2. How Forecaster can win (defensive forecasting)
3. Must Skeptic's tests of Forecaster be continuous?
4. Good probabilities mean less than we thought.

Game-theoretic probability

On each round:
1. Forecaster offers bets (i.e., probabilistic predictions).
2. Skeptic decides which offers to accept.
3. Reality decides the outcome.

Classical mathematical probability and statistical testing are about strategies for Skeptic. Skeptic tests Forecaster by trying to multiply the capital he risks by a large factor. All statistical tests can be put in this form. The game-theoretic Cournot principle says there is no other way to evaluate Forecaster.

Prediction is about strategies for Forecaster. Forecaster wants to give good forecasts. The game-theoretic Cournot principle says forecasts are good if they pass Skeptic's tests. What else could you ask for?
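The round structure can be sketched in code. The sketch below assumes the simplest binary version of the game, which the slides do not spell out: Forecaster announces a probability p for an event, Skeptic buys s tickets (s may be negative) that each pay y - p, and Reality chooses y in {0, 1}. The function names and the capital update rule are illustrative assumptions, not taken from the talk.

```python
def play_round(capital, p, s, y):
    """One round of an assumed binary forecasting game.

    p: Forecaster's probability for the event (0 <= p <= 1)
    s: number of tickets Skeptic buys; each ticket pays y - p
    y: Reality's outcome, 0 or 1
    Returns Skeptic's capital after the round.
    """
    return capital + s * (y - p)


def run_game(forecasts, stakes, outcomes, initial_capital=1.0):
    """Replay a finished game and return Skeptic's capital after each round."""
    capital = initial_capital
    history = [capital]
    for p, s, y in zip(forecasts, stakes, outcomes):
        capital = play_round(capital, p, s, y)
        history.append(capital)
    return history


if __name__ == "__main__":
    # Forecaster says 0.5 twice, Skeptic bets on the event, the event happens twice.
    print(run_game([0.5, 0.5], [1, 1], [1, 1]))
```

If Skeptic's capital grows by a large factor without his ever risking bankruptcy, that counts as evidence against Forecaster; this is the sense in which a betting strategy acts as a statistical test.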

Four ways of using probability games
1. Statistical testing. Take the role of Skeptic and test Forecaster (= theory/algorithm/person making forecasts).
2. Forecasting. Take the role of Forecaster and try to make good probability predictions.
3. Probability judgement. Use a battery of probability games as a scale of canonical examples for measuring the import of evidence.
4. Causal investigation. Hypothesize a hidden game in which Nature plays Forecaster. We see only some of Reality's moves.

Outline of today's talk

Part 1. Strategies for Forecaster
  Forecaster's moves are less than a global probability distribution.
  A global probability distribution is a strategy for Forecaster.
  Bayes is only one way of constructing a strategy for Forecaster.
Part 2. How Forecaster can win (defensive forecasting)
  Winning when Skeptic's strategy is known
  Beating an all-purpose strategy for Skeptic
Part 3. Must Skeptic's tests of Forecaster be continuous?
  Hilary Putnam's and A. P. Dawid's objection
  Randomization as a response
Part 4. Good probabilities mean less than we thought.
  Neyman's inductive behavior
  Probability judgement

Part 1. Strategies for Forecaster
Forecaster's moves are less than a probability distribution for Reality's path (sequence of moves). But a probability distribution for Reality's path is a strategy for Forecaster. Here we understand "probability distribution" in the classical, pre-Kolmogorov sense. Bayes provides one way of constructing strategies for Forecaster. Defensive forecasting is another way.

Part 1. Strategies for Forecaster
Example: Each evening for a year, Bob the weather forecaster gives probabilities for rain the next day.

Part 1. Strategies for Forecaster
[Figure: the path taken.]

Part 1. Strategies for Forecaster
[Figure.]
This is a classical (pre-Kolmogorov) probability distribution. The Kolmogorov probability measure may have less information, because the conditional probability will not be defined if the condition has probability zero.
Classical probability distribution = strategy for Forecaster

Part 1. Strategies for Forecaster
WHAT IS FORECASTER TRYING TO ACCOMPLISH?
Phil Dawid's insight: The properties we expect depend only on the probabilities Forecaster gives. There is no need to impute a strategy to Forecaster. Counterfactuals are irrelevant to testing Forecaster.
(A. P. Dawid, born 1946)

Part 2. How Forecaster can win
Theorem 1: Forecaster can beat Reality and Skeptic if he knows Skeptic's strategy. Forecaster can pass any given test.
Theorem 2: If Forecaster beats the average of many strategies for Skeptic, he beats them all. Forecaster only needs to pass one test.
Leonid Levin (born 1948) was Kolmogorov's student starting in high school and is now at Boston University. He developed the idea of playing against an all-purpose test in the 1970s.

Part 2. How Forecaster can win
One example where Forecaster beats a strategy for Skeptic.

Part 2. How Forecaster can win
Theorem. Forecaster can keep Skeptic from making money.
It is convenient to prove the theorem in a slightly stronger form:
Theorem. Forecaster can keep Skeptic from making money.

Part 2. How Forecaster can win
(Recall the intermediate value theorem for a continuous function.)
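The role of continuity can be illustrated numerically. The sketch below assumes a binary game in which Skeptic's stake is a continuous function S(p) of Forecaster's announced probability p, and Skeptic's gain on the round is S(p)(y - p) for Reality's outcome y in {0, 1}. Under that assumption Forecaster can always pick p so that the gain is at most (approximately) zero whatever y turns out to be. The function name, the bisection search, and the tolerance are illustrative choices, not the construction used in the talk.

```python
def defensive_forecast(skeptic_stake, tol=1e-12):
    """Pick p in [0, 1] so that skeptic_stake(p) * (y - p) <= ~0 for y in {0, 1}.

    skeptic_stake: an assumed continuous function p -> stake on [0, 1].
    """
    if skeptic_stake(1.0) >= 0:
        # Skeptic bets on the event even at p = 1: gain is S(1) * (y - 1) <= 0.
        return 1.0
    if skeptic_stake(0.0) <= 0:
        # Skeptic bets against the event even at p = 0: gain is S(0) * y <= 0.
        return 0.0
    # Now S(0) > 0 > S(1), so a continuous S has a zero in between.
    lo, hi = 0.0, 1.0  # invariant: S(lo) > 0 and S(hi) <= 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if skeptic_stake(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # A Skeptic who believes the true frequency is 0.3 and bets accordingly.
    stake = lambda p: 0.3 - p
    p = defensive_forecast(stake)
    print(p, [stake(p) * (y - p) for y in (0, 1)])
```

Because the search stops at a finite tolerance, the stake at the returned p is only approximately zero; an exact zero exists by continuity but is not computed here.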

Part 2. How Forecaster can win
You can merge all the tests Forecaster needs to pass into a single all-purpose test for Forecaster to pass.
1. If Skeptic has two strategies for multiplying capital risked, he can average them (i.e., divide his capital between them).
2. If Forecaster beats the average, he beats both tests.
3. There are only countably many strategies (Abraham Wald, 1936-1937).
4. You can average countably many strategies.
5. Forecaster can beat any single test (including the average).
Laurent Bienvenu, Glenn Shafer, and Alexander Shen: On the history of martingales in the study of randomness, Electronic Journal for History of Probability and Statistics, www.jehps.net, 5(1), June 2009.
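The averaging argument comes down to a short piece of arithmetic, sketched below. It assumes each strategy for Skeptic keeps its capital nonnegative and that the weights are positive and sum to at most 1 (for countably many strategies, weights such as 1/2, 1/4, 1/8, ... would do). If the averaged capital never exceeds a bound C, then the strategy with weight w can never have capital above C / w. The function names are hypothetical.

```python
def mixture_capital(capitals, weights):
    """Capital of the averaged strategy: Skeptic splits his money by `weights`."""
    return sum(w * k for w, k in zip(weights, capitals))


def bounds_on_components(mixture_bound, weights):
    """Largest capital each component can hold if the mixture stays <= mixture_bound.

    Assumes every component's capital is nonnegative, so a single term
    w * k can never exceed the whole weighted sum.
    """
    return [mixture_bound / w for w in weights]


if __name__ == "__main__":
    weights = [0.5, 0.25, 0.25]
    capitals = [1.0, 0.5, 3.0]  # illustrative capitals of three strategies
    total = mixture_capital(capitals, weights)
    print(total, bounds_on_components(total, weights))
```

So keeping the single averaged test from multiplying its capital by a large factor also keeps every component test from doing so, up to the constant 1 / w attached to that component.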

Part 2. How Forecaster can win
Two ways of finding a relatively robust strategy for Forecaster:
1. Collect a variety of strategies for Forecaster and average them. Asymptotically, the average may be as successful as the most successful of the individual strategies. This can be called Bayesian.
2. Collect a variety of strategies for Skeptic, which enforce different properties we expect when y_n has probability p_n. Average these strategies for Skeptic and play against them. We call this defensive forecasting.

Part 2. How Forecaster can win
We call this method defensive forecasting. The name was introduced in the working paper "Defensive Forecasting," by Vovk, Takemura, and Shafer (September 2004). See also Working Papers 7, 9, 10, 11, 13, 14, 16, 17, 18, 20, 21, 22, and 30 at www.probabilityandfinance.com.
(Akimichi Takemura, born 1952; Volodya Vovk, born 1960)

Part 3. Must tests of Forecaster be continuous?
Putnam's counterexample
Two solutions:
1. Insist that all computable functions are continuous.
2. Randomize.
[Photo: Hilary Putnam (1926-2016), on the right, with Bruno Latour, born 1947.]

Part 3. Must tests of Forecaster be continuous?
THE ARGUMENT FOR CONTINUITY
In practice, the tests (strategies for Skeptic) that Forecaster wants to pass are continuous as functions of Forecaster's last move.
1. The strategies for Skeptic used by Shafer and Vovk in Probability and Finance were continuous (and computable and relatively simple).
2. Conjecture: every high-probability or probability-one result in classical theory can be proven by a strategy for Skeptic that is computable in Brouwer's sense.
3. L. E. J. Brouwer's continuity principle (1916): only continuous functions are computable.
Why can't discontinuous functions be computed? Because you would need infinite precision.

Part 3. Must tests of Forecaster be continuous?
Alternative: Allow Forecaster to hide his precise prediction from Reality using a bit of randomization.
1. Good randomized sequential probability forecasting is always possible, by Vladimir Vovk and Glenn Shafer. Journal of the Royal Statistical Society, Series B, 67:747-763, 2005.
2. Asymptotic calibration, by Dean Foster and Rakesh Vohra. Biometrika, 85:379-390, 1998.

Part 4. Good probabilities mean less than we thought.
  Neyman's inductive behavior
  Probability judgement

Part 4. Good probabilities mean less than we thought.
Giving probabilities for successive events. Think: stochastic process, unknown probabilities, not iid.
Can I assign probabilities that will pass statistical tests?
1. If you insist that I announce all probabilities before seeing any outcomes, NO.
2. If you always let me see the preceding outcomes before I announce the next probability, YES.

Part 4. Good probabilities mean less than we thought.
We knew that a probability can be estimated from a random sample. But this depends on the iid assumption. Defensive forecasting tells us something new.
1. Our opponent is Reality rather than Nature. (Nature follows laws; Reality plays as he pleases.)
2. Defensive forecasting gives probabilities that pass statistical tests regardless of how Reality behaves.
3. The notion of a stochastic process with unknown probabilities loses its empirical content.
4. But the prediction of y_n from x_n depends on the sequence in which we have placed it.
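A small simulation can make the claim about Reality concrete. Everything in it is an assumption chosen for illustration rather than material from the talk: a binary game with Skeptic's gain S(p)(y - p), a continuous kernel-weighted betting rule for Skeptic that looks for miscalibration near the current forecast, and a contrarian Reality that makes the event happen exactly when Forecaster calls it unlikely. Forecaster sees the past outcomes and chooses p by bisection so that Skeptic's stake is approximately zero.

```python
import math


def kernel(p, q, width=0.1):
    """Gaussian similarity between two forecasts (an illustrative choice)."""
    return math.exp(-((p - q) ** 2) / (2 * width ** 2))


def skeptic_stake(p, past):
    """Continuous in p: bet on the event if past forecasts near p were too low."""
    return sum(kernel(p, q) * (y - q) for q, y in past)


def forecast(past, tol=1e-12):
    """Choose p so that Skeptic's gain is at most about zero for either outcome."""
    if skeptic_stake(1.0, past) >= 0:
        return 1.0
    if skeptic_stake(0.0, past) <= 0:
        return 0.0
    lo, hi = 0.0, 1.0  # stake is positive at lo and nonpositive at hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if skeptic_stake(mid, past) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def contrarian_reality(p):
    """Reality plays as he pleases: here, against the forecast."""
    return 1 if p < 0.5 else 0


def simulate(rounds=100):
    past, skeptic_gain = [], 0.0
    for _ in range(rounds):
        p = forecast(past)
        y = contrarian_reality(p)
        skeptic_gain += skeptic_stake(p, past) * (y - p)
        past.append((p, y))
    return past, skeptic_gain


if __name__ == "__main__":
    history, gain = simulate()
    print("Skeptic's total gain:", gain)
```

In this sketch Skeptic's total gain stays at or below zero up to numerical tolerance, even though Reality chooses each outcome after seeing the forecast. The guarantee covers only the one test being played against; a different Reality or a different Skeptic would need its own run.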

Part 4. Good probabilities mean less than we thought.
Jerzy Neyman's inductive behavior
A statistician who makes predictions with 95% confidence has two goals:
  be informative
  be right 95% of the time
Why isn't this good enough for probability judgment?
Answer: Two statisticians who are right 95% of the time may tell the court different and even contradictory things. They are placing the current event in different sequences.