Estimating transition times for a model of language change in an age-structured population

Similar documents
On the fast convergence of random perturbations of the gradient flow.

A STOCHASTIC MODEL OF LANGUAGE CHANGE THROUGH SOCIAL STRUCTURE AND PREDICTION-DRIVEN INSTABILITY

2012 NCTS Workshop on Dynamical Systems

Maximum Process Problems in Optimal Control Theory

1 Brownian Local Time

Dynamical systems with Gaussian and Levy noise: analytical and stochastic approaches

Path integrals for classical Markov processes

Differentiation of functions of covariance

A random perturbation approach to some stochastic approximation algorithms in optimization.

Brownian Motion. An Undergraduate Introduction to Financial Mathematics. J. Robert Buchanan. J. Robert Buchanan Brownian Motion

Stochastic differential equations in neuroscience

LECTURE 8: DYNAMICAL SYSTEMS 7

Stochastic Modelling in Climate Science

Lecture 21 Representations of Martingales

Brownian Motion and the Dirichlet Problem

Latent voter model on random regular graphs

The Wright-Fisher Model and Genetic Drift

3 Hydrostatic Equilibrium

Introducing the Normal Distribution

Uncertainty quantification and systemic risk

Malliavin Calculus in Finance

2t t dt.. So the distance is (t2 +6) 3/2

Scaling limits of anisotropic random growth models

Calculus II Practice Test Problems for Chapter 7 Page 1 of 6

Introducing the Normal Distribution

5.9 Representations of Functions as a Power Series

Introduction to Random Diffusions

Introduction to Dynamical Systems Basic Concepts of Dynamics

Residence-time distributions as a measure for stochastic resonance

CONTROL SYSTEMS, ROBOTICS AND AUTOMATION Vol. XI Stochastic Stability - H.J. Kushner

LogFeller et Ray Knight

arxiv: v1 [math.ap] 28 Aug 2018

Fundamentals of Dynamical Systems / Discrete-Time Models. Dr. Dylan McNamara people.uncw.edu/ mcnamarad

Singular perturbation theory

I forgot to mention last time: in the Ito formula for two standard processes, putting

Branching Processes II: Convergence of critical branching to Feller s CSB

Nonparametric Drift Estimation for Stochastic Differential Equations

Path Decomposition of Markov Processes. Götz Kersting. University of Frankfurt/Main

Stochastic Viral Dynamics with Beddington-DeAngelis Functional Response

Lecture 17 Brownian motion as a Markov process

88 CONTINUOUS MARKOV CHAINS

Limits for parametric and polar curves

Recap. Probability, stochastic processes, Markov chains. ELEC-C7210 Modeling and analysis of communication networks

7 Planar systems of linear ODE

Landau s Fermi Liquid Theory

Synchronization and Phase Oscillators

Some Properties of NSFDEs

On Stochastic Adaptive Control & its Applications. Bozenna Pasik-Duncan University of Kansas, USA

Rough paths methods 1: Introduction

Automatic Differentiation and Neural Networks

Large Deviations for Small-Noise Stochastic Differential Equations

Large Deviations for Small-Noise Stochastic Differential Equations

A RANDOM VARIANT OF THE GAME OF PLATES AND OLIVES

Introduction to asymptotic techniques for stochastic systems with multiple time-scales

Notes on Large Deviations in Economics and Finance. Noah Williams

Toward Binary Black Hole Simulations in Numerical Relativity

Cores for generators of some Markov semigroups

1 Lyapunov theory of stability

3 Stability and Lyapunov Functions

Introduction to stochastic multiscale modelling of tumour growth

Simulation of diffusion. processes with discontinuous coefficients. Antoine Lejay Projet TOSCA, INRIA Nancy Grand-Est, Institut Élie Cartan

Stochastic modelling of epidemic spread

2 One-dimensional models in discrete time

Regularization by noise in infinite dimensions

Theory and applications of random Poincaré maps

LAN property for sde s with additive fractional noise and continuous time observation

Stochastic differential equation models in biology Susanne Ditlevsen

Asymptotic Analysis 88 (2014) DOI /ASY IOS Press

Stochastic (intermittent) Spikes and Strong Noise Limit of SDEs.

Dyson series for the PDEs arising in Mathematical Finance I

Outline. A Central Limit Theorem for Truncating Stochastic Algorithms

Simulated Annealing for Constrained Global Optimization

Sequences and infinite series

Random and Deterministic perturbations of dynamical systems. Leonid Koralov

Temperature and Pressure Controls

Stochastic Areas and Applications in Risk Theory

Distributed learning in potential games over large-scale networks

2:2:1 Resonance in the Quasiperiodic Mathieu Equation

Lecture 4: Introduction to stochastic processes and stochastic calculus

Brownian Motion and lorentzian manifolds

VIII. Phase Transformations. Lecture 38: Nucleation and Spinodal Decomposition

Evolutionary Game Theory: Overview and Recent Results

Multi-dimensional Stochastic Singular Control Via Dynkin Game and Dirichlet Form

A Barrier Version of the Russian Option

Metastability for the Ginzburg Landau equation with space time white noise

New ideas in the non-equilibrium statistical physics and the micro approach to transportation flows

Quantum Gravity in 2+1 Dimensions I

Exact Linearization Of Stochastic Dynamical Systems By State Space Coordinate Transformation And Feedback I g-linearization

Index theory on manifolds with corners: Generalized Gauss-Bonnet formulas

DIFFERENTIATION RULES

One-Parameter Processes, Usually Functions of Time

Math 350 Solutions for Final Exam Page 1. Problem 1. (10 points) (a) Compute the line integral. F ds C. z dx + y dy + x dz C

Math 114: Make-up Final Exam. Instructions:

STAT Sample Problem: General Asymptotic Results

Stochastic Calculus. Kevin Sinclair. August 2, 2016

HILL S EQUATION WITH RANDOM FORCING TERMS. Fred C. Adams and Anthony M. Bloch Also: Suzie Butler, Jeff Druce and Jake Ketchum

The Wiener Itô Chaos Expansion

Some Tools From Stochastic Analysis

Universal examples. Chapter The Bernoulli process

Section 2.8: The Power Chain Rule

Transcription:

Estimating transition times for a model of language change in an age-structured population W. Garrett Mitchener MitchenerG@cofc.edu http://mitchenerg.people.cofc.edu

Abstract: Human languages are stable on long time scales but have a tendency to change dramatically in a matter of a few decades. Some changes can be attributed to contact between languages, but others seem to be essentially spontaneous. In this project, I explore part of the machinery of language learning and variation that seems to contribute to spontaneous changes: Children seem to detect correlations between language variation and age and social status, and amplify them as they grow up. These effects are called regularization and incrementation. To model these, I have formulated a discrete Markov chain, and its limit as a stochastic differential equation, for a population with two age groups. The population tends to settle at a stable equilibrium dominated by one language variant, then switch spontaneously to an equilibrium dominated by another. The mean transition time can be calculated using a related convectiondiffusion PDE, but that turns out to be numerically stiff. Alternatively, an asymptotic estimate of the mean time between transitions can be computed using the action functional. I demonstrate the calculation as performed with the Mathematica computer algebra system.

Language change: Middle English (1100 AD to 1500 AD) verb-raising syntax: ˆ Know you what time it is? ˆ I know not what time it is. Early Modern & Modern English do-support syntax: ˆ Do you know what time it is? ˆ I don t know what time it is.

Problem: Human languages are stable on long time scales but can change spontaneously over decades. Forces: Children detect correlations between language variation and age and social status, and amplify them. 1.0 0.8 0.6 0.4 0.2 0.0 1450 1500 1550 1600 1650 1700 Fraction of transitive affirmative questions using do-support instead of verb-raising [2, 4]

Learning model: Children modify language as they learn Regularization: Children sometimes drop rarely used forms Two idealized grammars: G 1 and G 2 only one independent variable: 1.0 0.8 0.6 0.4 0.2 0 Mean learning function q(m) m = mean usage rate of G 2 dm dt = birth {}}{ q(m) death {}}{ m Dynamics of m

Incrementation: Children can detect and amplify trends Two age groups: ξ = mean speech of young generation ζ = mean speech of old generation r(ξ, ζ) = where children predict the population will be dξ dt = birth {}}{ q( r(ξ, ζ) ) }{{} prediction aging {}}{ ξ Prediction function dζ dt = aging {}}{ ξ death {}}{ ζ

Add some random fluctuations: Start with a Markov chain: Resampling YOUNG OLD Death Aging Resampling mean OLD YOUNG NEW (prediction) Birth birthday

Fine print about the Markov chain: N youth and N parents Most general: Each agent in a state k {0, 1,..., K}, uses G 2 at a rate k K Simplify K = 1: Each agent uses G 1 (state 0) or G 2 (state 1) C k = number youth in state k X k = 1 N C k Random sentence from uniformly selected youth is type k with probability X k D k = number parents in state k Y k = 1 N D k Random sentence from uniformly selected parent is type k with probability Y k

Q(m) = binomial distribution, parameters q(m) & K K Q k (m) = q(m) k (1 q(m)) K k k r D = death rate: each time step each adult dies with probability p D = r D, replaced by sampling from youth; each youth N ages with probability p D, replaced by sampling from distribution vector Q(r(X 1, Y 1 )) time-in-generation is geometric, mean = N r D time steps 1 time step = 1 N years average life span L = 2N r D time steps = 2 r D years r R = resampling rate: each time step each agent copies the state of a random agent of the same generation with probability r R

Limit of infinite population N : ξ = fraction of youth using G 2 ζ = fraction of parents using G 2 Rescale time: time in units of 1 = L r D 2 years 1 (1 r R ) 2 Let ϵ = r D dξ = (q(r(ξ, ζ)) ξ) dt + ϵ ξ(1 ξ) db ξ dζ = (ξ ζ) dt + ϵ ζ(1 ζ) db ζ where B ξ and B ζ are Brownian motion From [1, 5]: (X 1, Y 1 ) converges weakly to (ξ, ζ)

Change variables: θ = rcsin(2ξ 1) ϕ = rcsin(2ζ 1) b 1 (θ,ϕ) { }} { 1 2q (r(ξ, ζ)) 1 dθ = 2 ϵ2 1 tn θ + dt + ϵ db ξ cos θ b 2 (θ,ϕ) { }}{ 1 dϕ = 2 ϵ2 1 tn ϕ + sin θ dt + ϵ db ζ cos ϕ Phase space is R = π 2, π π 2 2, π 2

Vector field component: Φ 1.5 1.0 0.5 0.0 0.5 1.0.. 1.0 0.8 0.6 0.4 0.2 1.1 1.2 1.5 1.5 1.0 0.5 0.0 0.5 1.0 1.5 (θ, ϕ) vector field: = stable fixed points where both generations are dominated by G 1 or G 2 ; = saddle, separates the basins Mean usage rate of G 2 in young generation vs. time: The population hovers near one stable fixed point. Fluctuations drive it across the separatrix to the other stable fixed point.

1.5 1.0.. Φ 0.5 0.0 0.5 1.0 1.1 1.5 1.2 1.5 1.0 0.5 0.0 0.5 1.0 1.5

Estimating time between transitions: From [3]: Given an SDE with ϵ noise: dx = b(x) dt + ϵ db The normalized action functional is S(ƒ ) = 1 2 T 0 ƒ (t) b(ƒ (t)) 2 dt Intuition: This measures how strongly a path ƒ : [0, T] R flows against the vector field how much energy the random fluctuations must muster. If ƒ solves = b() then S(ƒ ) = 0.

The quasipotential is V( 0, ) = inf {S(ƒ ) ƒ (0) = 0, ƒ (T) = } Intuition: Minimum energy to go from 0 to. The exit time from a domain D is τ = min{t X t D} The expected exit time satisfies lim ϵ 0 ϵ2 ln E (τ) = min V( 0, y) y D Intuition: Minimum energy to best exit point y pick a small ϵ: E (τ) exp 1 ϵ 2 min y D V( 0, y)

Calculation: Pick a small ϵ and compute V. In Mathematica, set up an InterpolatingFunction for the path ƒ with unknowns for its values & derivatives at various times, then use FindMinimum. What to use for D? We really need the basin of attraction for D, but that s hard to compute. Instead, pin the ends of ƒ to the stable fixed points: D runs against the vector field, large contribution runs with the vector field, no contribution Once the path leaves D, it can follow the vector field, which contributes 0 to the integral for S No need to find exit point y Let finish time T increase and look for convergence Let ϵ = 0.02...

Bottom to top: 0.020 V Top to bottom: 0.025 V 0.015 0.020 0.010 0.015 0.010 0.005 0.005 7 7 min V 0.00162956 E (τ) 58.7871 min V 0.000777568 E (τ) 6.98613

Why two dimensions are necessary: From [3]: For an SDE dx = b(x) dt + ϵdb, if the vector field b is a gradient (always true in one dimension) then everything is simpler. Learning with no age groups, no incrementation, no prediction: dξ = (q(ξ) ξ) dt + ϵ ξ(1 ξ) db Change variables θ = rcsin(2ξ 1) b(θ) { }}{ 1 dθ = 2 ϵ2 1 tn θ + 2q(ξ) 1 dt + ϵ db cos θ

Define U(θ 0, θ) = b(s) ds θ 0 From [3]: There s only one boundary point θ C, and the minimum V is simple: V(θ L, θ C ) = 2U(θ L, θ C ) 0.123923 Mean transition time from left to right E (τ) exp 1 θ ϵ 2 V(θ L, θ C ) 3.53612 10 134 That language isn t going to change any time soon!

Intuition: The fluctuations have to overcome the vector field all the way from the stable fixed point to the unstable tipping point in the center. Eventually it must happen but it takes too much energy to occur on a reasonable time scale. With two dimensions, the separatrix can be very close to the stable fixed point. With less of the vector field to overcome, the fluctuations can push the population state into another basin of attraction more easily.

What goes wrong with PDE approach: From [3]: Given a general SDE dx = b(x) dt + σ(x) db, (X) = σ(x)σ (X) the corresponding differential operator is L = 1 2 The solution to,j () 2,j () + b () (),j L = 1 on D, = 0 on D is () = E (τ start at )

Could set it up on D = a small disk around the stable fixed point, solve with a finite element method But: numerically unstable, results are nonsense Even in one dimension, the calculation is difficult because of the outside boundary conditions: Specifying ( π/2) = 0 results in a numerical singularity. 1.5 10 6 1.0 10 6 u 500 000 0 1.5 1.0 0.5 0.0 Mean exit time from θ L via diff eq, using (θ C ) = 0 and ( (1 10 6 )π/2) = 0 to avoid boundary singularity. Intuition: The solution is a plateau, and the edges are very difficult to resolve numerically.

Conclusion: Formulated population dynamics with learning, including regularization and incrementation Mean time between transitions from one stable fixed point to the other: Represents language change Can be calculated with a PDE, but numerically unstable Better to use an asymptotic approximation using the action functional Example calculation: Mean transition times on the order of 10 lifespans Without age groups and incrementation: The model is first order, but the transition time is many orders of magnitude too large Some sort of second order dynamics is necessary

References [1] Richard Durrett. Stochastic Calculus: A Practical Introduction. CRC Press, New York, 1996. [2] Alvar Ellegård. The Auxiliary do: The Establishment and Regulation of Its Use in English, volume II of Gothenburg Studies in English. Almqvist and Wiksell, 1953. [3] M. I. Freidlin and A. D. Wentzell. Random Perturbations of Dynamical Systems, volume 260 of Grundlehren der mathematischen Wissenschaften. Springer Verlag, New York, 1984. [4] Anthony Kroch. Reflexes of grammar in patterns of language change. Language Variation and Change, 1:199 244, 1989. [5] W. Garrett Mitchener. A stochastic model of language change through social structure and prediction-driven instability. Submitted, 2009.