The Best, The Hottest, & The Luckiest: A Statistical Tale of Extremes


Sid Redner, Boston University, physics.bu.edu/~redner
Extreme Events: Theory, Observations, Modeling and Simulations
Palma de Mallorca, November 10-14, 2008

The Best
with S. Maslov, P. Chen, H. Xie
J. Informetrics 1, 8-15 (2007)
Observations about scientific citations
Google page rank analysis for hidden gems

Phys. Rev. Citation Data (as of July 03)
353,268 papers, 3,110,839 cites
⟨# cites⟩ = 8.81, ⟨cite age⟩ = 6.20
N.B.: Internal citations only; undercount by factor of 3-5 (for highly cited HEP papers; SPIRES)

11 papers with > 1000 citations
79 papers with > 500 citations
237 papers with > 300 citations
2340 papers with > 100 citations
8073 papers with > 50 citations
245459 papers with < 10 citations
84144 papers with 1 citation
23421 papers with 0 citations

PR papers with >1000 cites

 #   article                cites   age    title                            author(s)
 1   PR 140 A1133 (1965)    4598    26.7   Self Consistent Equations..      W. Kohn & L. J. Sham
 2   PR 136 B864 (1964)     2460    28.7   Inhomogeneous Electron Gas..     P. Hohenberg & W. Kohn
 3   PRB 23 5048 (1981)     2079    14.4   Self-Interaction Correction to.. J. P. Perdew & A. Zunger
 4   PRL 45 566 (1980)      1781    15.4   Ground State of the Electron..   D. M. Ceperley & B. J. Alder
 5   PR 108 1175 (1957)     1364    20.2   Theory of Superconductivity      Bardeen, Cooper, Schrieffer
 6   PRL 19 1264 (1967)     1306    15.5   A Model of Leptons               S. Weinberg
 7   PRB 12 3060 (1975)     1259    18.4   Linear Methods in Band Theory    O. K. Andersen
 8   PR 124 1866 (1961)     1178    28.0   Effects of Configuration..       U. Fano
 9   RMP 57 287 (1985)      1055     9.2   Disordered Electronic Systems    P. A. Lee & T. V. Ramakrishnan
10   RMP 54 437 (1982)      1045    10.8   Electronic Properties of..       T. Ando, A. B. Fowler, & F. Stern
11   PRB 13 5188 (1976)     1023    20.8   Special Points for Brillouin..   H. J. Monkhorst & J. D. Pack

(The first build of this slide lists 3227 cites for entry 1; later builds list 4598.)

Google Page Rank for Citations
Brin & Page (1999)

Idea: launch many surfers that surf forever until a steady-state surfer number $G_i$ is reached at each site $i$ of the network.

Equation for number of surfers:

$$G_i = (1-d) \sum_{j\,\mathrm{nn}\,i} \frac{G_j}{k_j^{\mathrm{out}}} + \frac{d}{N}$$

First term: random walk propagation. Second term: "manna from heaven".

Brin/Page: d = 0.15 (bored after 6 clicks)
We use: d = 0.50 (don't cite beyond 2 generations)
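The surfer equation can be solved by simple fixed-point iteration. The sketch below is only an illustration on a made-up four-paper network, not the analysis used in the talk. The function name `google_numbers`, the toy data, and the treatment of papers with no references (their weight is spread uniformly so that the total stays normalized) are all assumptions introduced here.

```python
def google_numbers(cites, d=0.5, tol=1e-12, max_iter=1000):
    """Iterate G_i = (1 - d) * sum_j G_j / k_j_out + d / N to a fixed point.

    cites maps each paper to the list of papers it cites (its out-links).
    """
    papers = sorted(set(cites) | {q for refs in cites.values() for q in refs})
    n = len(papers)
    g = {p: 1.0 / n for p in papers}          # start from a uniform guess
    for _ in range(max_iter):
        new = {p: d / n for p in papers}      # "manna from heaven" term
        for p in papers:
            refs = cites.get(p, [])
            # Assumption: a paper with no references passes its weight to
            # every paper equally, so the total stays normalized to 1.
            targets = refs if refs else papers
            share = (1.0 - d) * g[p] / len(targets)
            for q in targets:
                new[q] += share               # random-walk propagation term
        change = sum(abs(new[p] - g[p]) for p in papers)
        g = new
        if change < tol:
            break
    return g


# Hypothetical toy network: A and B cite C, and C cites D.
toy = {"A": ["C"], "B": ["C"], "C": ["D"], "D": []}
ranks = google_numbers(toy)
for paper, value in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(paper, round(value, 4))
```

In this toy network D is cited only once, yet it inherits weight from the well-cited C, which is the sense in which a page-rank measure can differ from a raw citation count.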

Correlation between Google & citation counts
[Figure: log-log plot of average Google number versus number of citations, with a reference line of slope 1.]

[Figure: Google number (0 to 500) versus number of citations (0 to 1500) for individual papers. Labeled points, added build by build: Cabibbo, BCS, K/S, H/K, Onsager, Weinberg, Slater, W/S, F/GM, Anderson, G4, GM/B, W/S, Hi Tc, DLA, Glauber, S, Fano.]

The Hottest
with M. Petersen
Phys. Rev. E 74, 061114 (2006)
Some facts about urban temperatures
Evolution of record temperature events
Role of global warming on records

Philadelphia Temperatures

[Figure, left panel: annual temperature pattern. Daily temperature versus day of the year, with curves for record high, av. high, av. low, and record low. Marked extremes: Aug 7, 1918: 106F; Feb 9, 1934: -11F.]
[Figure, right panel: long-term temperature trends. Average temperature versus year, 1880-2000, with curves for annual high, annual middle, and annual low; plot annotations 3.8 and 1.4.]

IPCC 2001: global warming 1.1° over past century

Evolution of Temperature Records (Arnold et al., Records, 1998)
assumption: daily temperatures are iid continuous variables

[Figure: temperature on a fixed day versus year. Successive record temperatures $T_0, T_1, T_2, T_3$ are set in years $t_0, t_1, t_2, t_3$.]

Goal: compute
  $T_k$, $P_k(T)$
  $t_k$, $P_k(t)$  (distribution independent!)

Record Time Statistics (Glick 1978, Sibani et al 1997, Krug & Jain 05, Majumdar)

define $\sigma_i = 1$ if there is a record in the $i$th year, and $\sigma_i = 0$ otherwise.

Years are labeled 0, 1, 2, ..., i. A record occurs in year $i$ when that year is the largest of the $i+1$ values, so

$$\langle \sigma_i \rangle = \frac{1}{i+1}$$

Records in different years are uncorrelated:

$$\langle \sigma_i \sigma_j \rangle = \langle \sigma_i \rangle \langle \sigma_j \rangle$$

therefore

$$\langle n(t) \rangle = \sum_{i=1}^{t} \langle \sigma_i \rangle \approx \ln t, \qquad P(n) \approx \frac{(\ln t)^n}{n!}\, e^{-\ln t}$$

probability that the current record is broken in the $i$th year (the largest of years 0 to $i$ falls in year $i$ and the 2nd largest in year 0):

$$\frac{1}{i(i+1)}$$

time until next record:

$$\sum_{i \ge 1} i \cdot \frac{1}{i(i+1)} = \infty$$
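The logarithmic growth of the number of records is easy to check by simulation. The sketch below uses iid uniform random numbers as stand-in "temperatures" (any continuous distribution should give the same record counts); the function names and the choice of 100 years are illustrative assumptions, not taken from the talk.

```python
import random


def count_new_records(temps):
    """Count records set after year 0 (year 0 only sets the initial record)."""
    best = temps[0]
    records = 0
    for value in temps[1:]:
        if value > best:
            best = value
            records += 1
    return records


def mean_records(t, trials=5000, seed=1):
    """Monte Carlo estimate of <n(t)> for iid uniform 'temperatures'."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += count_new_records([rng.random() for _ in range(t + 1)])
    return total / trials


def expected_records(t):
    """Exact mean: the sum of <sigma_i> = 1/(i+1) for i = 1..t."""
    return sum(1.0 / (i + 1) for i in range(1, t + 1))


t = 100
print(mean_records(t), expected_records(t))
```

Both numbers should come out close to 4.2 for t = 100, compared with ln 100 ≈ 4.6; the gap is the usual constant offset between a harmonic sum and the logarithm.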

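Records If Global Warming Is Occurring

assume temperature distribution (as a soluble example only):

$$p(T; t) = \begin{cases} e^{-(T - vt)} & T > vt \\ 0 & T < vt \end{cases}$$

exceedance probability:

$$p_>(T_k; t_k + j) = \int_{T_k}^{\infty} e^{-[T - v(t_k + j)]}\, dT = e^{-(T_k - v t_k)}\, e^{jv} \equiv X e^{jv}$$

prob of record at year n (the stationary form $p_>(T_k)\, p_<(T_k)^{n-1}$ with the year-dependent exceedance probability):

$$q_n(T_k) = e^{nv} X \prod_{j=1}^{n-1} \left(1 - e^{jv} X\right)$$

When a factor $(1 - e^{jv} X)$ reaches 0, a record must occur.

(over)estimate of record time:

$$e^{jv} X = 1 \;\Rightarrow\; (t_{k+1} - t_k)\, v = T_k - v t_k \;\Rightarrow\; t_k \sim \frac{k}{v}$$

records ultimately occur at constant rate (Ballerini & Resnick, 85; Borokov, 99)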
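The constant late-time record rate can be seen in a direct simulation of the soluble example. This is a rough sketch under the slide's assumed distribution: each year's temperature is the drift v times the year plus an exponential variate. The function name, the burn-in length, and the value v = 0.05 are illustrative choices made here; v = 0.012 is one of the values that appears in the talk's plots.

```python
import random


def record_rate(v, years=4000, burn_in=1000, runs=20, seed=7):
    """Fraction of years after burn_in that set a record high.

    Each year's temperature is v * year plus an exponential(1) variate,
    which matches p(T; t) = exp(-(T - v t)) for T > v t.
    """
    rng = random.Random(seed)
    records = 0
    for _ in range(runs):
        best = float("-inf")
        for year in range(years):
            temp = v * year + rng.expovariate(1.0)
            if temp > best:
                best = temp
                if year >= burn_in:
                    records += 1
    return records / (runs * (years - burn_in))


for v in (0.0, 0.012, 0.05):
    print(v, record_rate(v))
```

With no drift the late-time rate is tiny and keeps falling with time, while for small positive v it should settle near v, in line with the estimate that records ultimately occur at a constant rate.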

Frequency of Record High Temperatures
[Figure: log-log plot of record probability versus time for v = 0, 0.003, 0.006, and 0.012, with reference levels marked at 0.003, 0.006, and 0.012.]

Frequency of Record Low Temperatures
[Figure: log-log plot of record probability versus time for v = 0, 0.003, 0.006, and 0.012, shown together with the corresponding plot for record highs.]

Record Frequencies: Multiple Stations
[Figure: five log-log panels showing record highs and lows versus time: Potsdam (years after 1893), CET (years after 1878), Brussels (years after 1833), Milan (years after 1763), Prague (years after 1775).]

The Luckiest
with C. Sire
Eur. Phys. J B, to appear very soon
Statistics of team win/loss records
Consecutive-game winning/losing streaks

Fun Facts About Team Win/Loss Records

Best all-time record: 1906 Chicago Cubs: 116/36 (wins/losses), .763
Worst all-time record: 1916 Philadelphia Athletics: 36/117 (wins/losses), .235

Historically most successful, 1901-1960:
  NY Highlanders/Yankees, 1903-1960: 5105/8809 (wins/games), .5795
Historically least successful, 1901-1960:
  Philadelphia Phillies, 1901-1960: 3828/8786 (wins/games), .4357
  Boston Braves, 1905-1952: 2627/5923 (wins/games), .4435

Bradley-Terry Competition Model
Zermelo (1929), Bradley & Terry (1952)

each team $i$ has strength $x_i$, $i = 1, 2, \ldots, N$ (fixed in each season, can change between seasons)

probability that team $i$ beats $j$:

$$p_{ij} = \frac{x_i}{x_i + x_j}$$

actually, assume only:

$$p_{ij} = \frac{f(x_i)}{f(x_i) + f(x_j)}$$

numerical studies: Anthology of Statistics in Sports, Albert et al. (2005)

uniform strength distribution: $x \in [\epsilon, 1]$ with $0 \le \epsilon \le 1$
fundamental parameter: $\epsilon$
common assumption: log-normal distribution
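To make the model concrete, here is a minimal simulation sketch of one season under the Bradley-Terry rule with uniformly distributed strengths. The league size (16 teams), the number of games per pairing, the seeds, and the function names are hypothetical choices for illustration; ε = 0.278 is one of the values quoted in the talk.

```python
import random


def bt_probability(x_i, x_j):
    """Bradley-Terry probability that a team of strength x_i beats x_j."""
    return x_i / (x_i + x_j)


def simulate_season(strengths, games_per_pair, seed=0):
    """Play a balanced schedule and return each team's win fraction."""
    rng = random.Random(seed)
    n = len(strengths)
    wins = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            p = bt_probability(strengths[i], strengths[j])
            for _ in range(games_per_pair):
                if rng.random() < p:
                    wins[i] += 1
                else:
                    wins[j] += 1
    games_per_team = games_per_pair * (n - 1)
    return [w / games_per_team for w in wins]


# Hypothetical league: 16 teams, strengths uniform on [eps, 1], 150 games each.
eps = 0.278
rng = random.Random(42)
strengths = sorted(eps + (1.0 - eps) * rng.random() for _ in range(16))
fractions = simulate_season(strengths, games_per_pair=10)
for x, w in zip(strengths, fractions):
    print(round(x, 3), round(w, 3))
```

Because every game has exactly one winner, the win fractions average to 0.5; the spread around 0.5 is what the parameter ε controls.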

Win Fraction Versus Rank
MLB data for 1905, 10, 15, ..., 1960
[Figure: win fraction W(r) versus rank r, with r running from 0 to 1.]
data from www.shrpsports.com

Average Win Fraction Versus Rank
[Figure: average win fraction W(r) versus rank r for MLB. 1901-60: W runs from 0.32 to 0.67. 1961-2005: W runs from 0.36 to 0.63. Overlaid: simulations of the BT model for 10⁶ seasons with a balanced schedule, using ε = 0.278 (1901-60) and ε = 0.435 (1961-2005).]

Theory for the Average Win Fraction W

for a team of strength x:

$$W(x) = \frac{1}{N} \sum_{j=1}^{N} \frac{x}{x + x_j} \;\to\; \frac{1}{1-\epsilon} \int_{\epsilon}^{1} \frac{x}{x+y}\, dy = \frac{x}{1-\epsilon} \ln\!\left[\frac{x+1}{x+\epsilon}\right]$$

assumptions:
  infinite season length
  infinite number of teams
  balanced schedule
  uniform strengths on [ε, 1]

strength x ↔ rank r: $x = \epsilon + (1-\epsilon)\, r$

$$W(r) = \frac{\epsilon + (1-\epsilon)\, r}{1-\epsilon} \ln\!\left[\frac{1 + \epsilon + (1-\epsilon)\, r}{2\epsilon + (1-\epsilon)\, r}\right]$$
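As a quick consistency check on the closed form, the sketch below compares it with a direct midpoint-rule evaluation of the average over opponents. The function names and the step counts are illustrative assumptions; ε = 0.278 is taken from the fits quoted in the talk.

```python
import math


def win_fraction(x, eps):
    """Closed form W(x) = x / (1 - eps) * ln((x + 1) / (x + eps))."""
    return x / (1.0 - eps) * math.log((x + 1.0) / (x + eps))


def win_fraction_vs_rank(r, eps):
    """W as a function of rank r in [0, 1], using x = eps + (1 - eps) * r."""
    return win_fraction(eps + (1.0 - eps) * r, eps)


def win_fraction_numeric(x, eps, steps=100000):
    """Midpoint-rule average of x / (x + y) over y uniform on [eps, 1]."""
    h = (1.0 - eps) / steps
    return sum(x / (x + eps + (k + 0.5) * h) for k in range(steps)) / steps


eps = 0.278
for r in (0.0, 0.5, 1.0):
    x = eps + (1.0 - eps) * r
    print(r, win_fraction_vs_rank(r, eps), win_fraction_numeric(x, eps))
```

For ε = 0.278 the closed form gives roughly 0.32 at r = 0 and 0.62 at r = 1, and the average of W over all ranks is 0.5, as it must be when every game has one winner and one loser.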

Average Win Fraction Versus Rank
[Figure: the same MLB data (1901-60: 0.32 to 0.67; 1961-2005: 0.36 to 0.63) compared with BT model analytics for ε = 0.278 and ε = 0.435.]

Role of Season Length on Win Fraction
[Figure: W(r) versus r. Curves, added build by build: game data, BT model, 300 game season, 500 game season, 1000 game season, 10000 game season + theory.]

Fun Facts About Winning and Losing Streaks

longest win streaks
26  1916 New York Giants (1 tie)
21  1935 Chicago Cubs
20  2002 Oakland Athletics
19  1906 Chicago White Sox (1 tie)
19  1947 New York Yankees
18  1904 New York Giants
18  1953 New York Yankees
17  1907 New York Giants
17  1912 Washington Senators
17  1916 New York Giants
17  1931 Philadelphia Athletics
16  1909 Pittsburgh Pirates
16  1912 New York Giants
16  1926 New York Yankees
16  1951 New York Giants
16  1977 Kansas City Royals
15  1903 Pittsburgh Pirates
15  1906 New York Highlanders
15  1913 Philadelphia Athletics
15  1924 Brooklyn Dodgers
15  1936 Chicago Cubs
15  1936 New York Giants
15  1946 Boston Red Sox
15  1960 New York Yankees
15  1991 Minnesota Twins
15  2000 Atlanta Braves
15  2001 Seattle Mariners

longest losing streaks
23  1961 Philadelphia Phillies
21  1988 Baltimore Orioles
20  1906 Boston Americans
20  1906 Philadelphia A's
20  1916 Philadelphia A's
20  1969 Montreal Expos (first year)
19  1906 Boston Beaneaters
19  1914 Cincinnati Reds
19  1975 Detroit Tigers
19  2005 Kansas City Royals
18  1920 Philadelphia A's
18  1948 Washington Senators
18  1959 Washington Senators
17  1926 Boston Red Sox
17  1962 NY Mets (first year)
17  1977 Atlanta Braves
16  1911 Boston Braves
16  1907 Boston Doves
16  1907 Boston Americans (2 ties)
16  1944 Brooklyn Dodgers (1 made-up game)
15  1909 St. Louis Browns
15  1911 Boston Rustlers
15  1927 Boston Braves
15  1927 Boston Red Sox
15  1935 Boston Braves
15  1937 Philadelphia A's
15  2002 Tampa Bay
15  1972 Texas Rangers (first year)

Win/loss streaks of 15 or more games:
27 between 1901-30
13 between 1931-60
15 between 1961-present

Winning and Losing Streak Distributions
[Figure: semi-log plot of streak probability $P_n$ versus streak length n (0 to 25). MLB data for 1961-2005 and 1901-60, with a $2^{-n}$ reference line. Overlaid: BT model simulations with ε = 0.435 (1961-2005) and ε = 0.278 (1901-60).]

Streak Length Distribution in BT Model

For a team of strength x, a streak of n wins in a row, bracketed by an initial loss and a final loss:

$$P_n(x) = \frac{x_0}{x + x_0} \left[\prod_{j=1}^{n} \frac{x}{x + x_j}\right] \frac{x_{n+1}}{x + x_{n+1}}$$

Averaging over the opponent strengths $\{x_j\}$:

$$\langle P_n(x) \rangle_{\{x_j\}} = x^n \left\langle \frac{1}{x+y} \right\rangle^{n} \left\langle \frac{y}{x+y} \right\rangle^{2}$$

$$\left\langle \frac{1}{x+y} \right\rangle = \frac{1}{1-\epsilon} \int_{\epsilon}^{1} \frac{dy}{x+y} = \frac{1}{1-\epsilon} \ln\!\left(\frac{x+1}{x+\epsilon}\right)$$

$$\left\langle \frac{y}{x+y} \right\rangle = 1 - \frac{x}{1-\epsilon} \ln\!\left(\frac{x+1}{x+\epsilon}\right)$$

Averaging over the team's own strength:

$$P_n = \frac{1}{1-\epsilon} \int_{\epsilon}^{1} \left[1 - \frac{x}{1-\epsilon} \ln\!\left(\frac{x+1}{x+\epsilon}\right)\right]^{2} e^{n g(x)}\, dx \;\sim\; \frac{1}{n\, g'(1)}\, e^{n g(1)}$$

$$g(x) = \ln x + \ln\!\left[\frac{1}{1-\epsilon} \ln\!\left(\frac{x+1}{x+\epsilon}\right)\right] \quad \text{(monotonic)}$$

Asymptotic Behavior

$$P_n \sim \frac{1}{n\, g'(1)}\, e^{n g(1)}$$

$$g(x) = \ln x + \ln\!\left[\frac{1}{1-\epsilon} \ln\!\left(\frac{x+1}{x+\epsilon}\right)\right]$$

$$g(1) = -\ln(1-\epsilon) + \ln \ln\!\left(\frac{2}{1+\epsilon}\right)$$

fastest decay when ε = 1: $P_n \sim 2^{-n}$
slowest decay when ε = 0: $P_n \sim (\ln 2)^n \approx (0.693)^n$
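The two limiting decay rates follow from evaluating $e^{g(1)}$ at the ends of the allowed range of ε. A short worked version, using only the definitions on the slide (the auxiliary variable δ = 1 - ε is introduced here for the second limit):

```latex
\begin{align*}
e^{g(1)} &= \frac{1}{1-\epsilon}\,\ln\!\left(\frac{2}{1+\epsilon}\right) \\
\epsilon \to 0:\quad e^{g(1)} &\to \ln 2 \approx 0.693 \\
\epsilon = 1-\delta,\ \delta \to 0:\quad
e^{g(1)} &= -\frac{1}{\delta}\,\ln\!\left(1-\frac{\delta}{2}\right)
          = \frac{1}{2} + \frac{\delta}{8} + \dots \;\to\; \frac{1}{2}
\end{align*}
```

So when all teams are equally strong (ε = 1) each game is a coin flip and streaks decay as $2^{-n}$, while the widest spread of strengths (ε = 0) gives the slowest decay.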

Winning and Losing Streak Distributions
[Figure: semi-log plot of $P_n$ versus n (0 to 25) for 1961-2005 and 1901-60, with a $2^{-n}$ reference line.]

numerically integrate

$$P_n = \frac{1}{1-\epsilon} \int_{\epsilon}^{1} \left[1 - \frac{x}{1-\epsilon} \ln\!\left(\frac{x+1}{x+\epsilon}\right)\right]^{2} e^{n g(x)}\, dx$$

same as: 10⁵ randomizations of all 2166 win/loss histories
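The integral for $P_n$ is straightforward to evaluate numerically. The sketch below uses a plain midpoint rule and the identity $e^{n g(x)} = W(x)^n$, where $W(x) = \frac{x}{1-\epsilon}\ln\frac{x+1}{x+\epsilon}$ is the average win probability of a team of strength x. The function name, the step count, and the integration scheme are assumptions made here; the talk does not say which method was used.

```python
import math


def streak_probability(n, eps, steps=20000):
    """Midpoint-rule estimate of P_n for uniform strengths on [eps, 1]."""
    h = (1.0 - eps) / steps
    total = 0.0
    for k in range(steps):
        x = eps + (k + 0.5) * h
        # Average probability that a team of strength x wins one game.
        w = x / (1.0 - eps) * math.log((x + 1.0) / (x + eps))
        # Two bracketing losses times n consecutive wins.
        total += (1.0 - w) ** 2 * w ** n
    return total * h / (1.0 - eps)


for eps in (0.278, 0.435):
    print(eps, [streak_probability(n, eps) for n in (1, 5, 10, 15, 20)])
```

For large n the ratio of successive values approaches $e^{g(1)}$ from below, which is about 0.62 for ε = 0.278, so long streaks are noticeably more likely than the $2^{-n}$ coin-flip estimate.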


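Inter-day Temperature Correlations

$$C_\alpha(t) = \frac{1}{365} \sum_{i=1}^{365} \frac{\langle T_i T_{i+t} \rangle - \langle T_i \rangle \langle T_{i+t} \rangle}{\langle T_i^2 \rangle - \langle T_i \rangle^2}, \qquad \alpha = \text{low, middle, high}$$

slope -4/3 (other work: slope 0.7, e.g., Bunde et al.)

[Figure: log-log plot of $C_\alpha(t)$ versus t for the middle, low, and high temperature series.]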

Seasonal Variance & Record Numbers
[Figure: daily variances (10-day averages) of the low, high, and average temperatures versus day of the year, shown together with the number of record highs (20-day average).]

Summary/Outlook

Citations:
  Page rank analysis helps uncover hidden gems
  Future: contextual analysis, specialization

Climate:
  Global warming seems to affect temperature records
  Open: seasonality, high/low asymmetry, changing variability

Competition:
  Statistical model describes win/loss records and streak distributions
  Keener competition; harder to maintain advantage