INFORMATION CRITERIA AND APPROXIMATE BAYESIAN COMPUTING FOR AGENT-BASED MODELLING IN ECOLOGY

Similar documents
DIC, AIC, BIC, PPL, MSPE Residuals Predictive residuals

Approximate Bayesian Computation: a simulation based approach to inference

Biol 206/306 Advanced Biostatistics Lab 12 Bayesian Inference

Lecture 2: Individual-based Modelling

Markov Chain Monte Carlo, Numerical Integration

7. Estimation and hypothesis testing. Objective. Recommended reading

Lecture 6: Model Checking and Selection

Biol 206/306 Advanced Biostatistics Lab 12 Bayesian Inference Fall 2016

All models are wrong but some are useful. George Box (1979)

(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis

Approximate Bayesian computation: methods and applications for complex systems

Departamento de Economía Universidad de Chile

7. Estimation and hypothesis testing. Objective. Recommended reading

Model comparison: Deviance-based approaches

PATTERN RECOGNITION AND MACHINE LEARNING

9. Model Selection. statistical models. overview of model selection. information criteria. goodness-of-fit measures

Bayesian Inference and MCMC

Penalized Loss functions for Bayesian Model Choice

spbayes: An R Package for Univariate and Multivariate Hierarchical Point-referenced Spatial Models

Tutorial on Probabilistic Programming with PyMC3

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo

The Behaviour of the Akaike Information Criterion when Applied to Non-nested Sequences of Models

Fast Likelihood-Free Inference via Bayesian Optimization

Bayesian Phylogenetics:

MCMC 2: Lecture 3 SIR models - more topics. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham

Multimodal Nested Sampling

Model Selection in Bayesian Survival Analysis for a Multi-country Cluster Randomized Trial

Density Estimation. Seungjin Choi

Appendix 1. Overview, Design Concepts, and Details (ODD) of the Chitwan ABM.

Bayesian Inference using Markov Chain Monte Carlo in Phylogenetic Studies

Introduction to Systems Analysis and Decision Making Prepared by: Jakub Tomczak

Model comparison and selection

an introduction to bayesian inference

Probabilistic and Bayesian Machine Learning

Adaptive Monte Carlo methods

Akaike Information Criterion

CPSC 540: Machine Learning

Bayesian Model Diagnostics and Checking

Discrete & continuous characters: The threshold model

An Introduction to Mplus and Path Analysis

Learning in Bayesian Networks

MCMC: Markov Chain Monte Carlo

Computational statistics

Session 5B: A worked example EGARCH model

Analyzing the genetic structure of populations: a Bayesian approach

Exercises Tutorial at ICASSP 2016 Learning Nonlinear Dynamical Models Using Particle Filters

Tutorial on Approximate Bayesian Computation

Bayesian Methods for Machine Learning

The Metropolis-Hastings Algorithm. June 8, 2012

Choosing the Summary Statistics and the Acceptance Rate in Approximate Bayesian Computation

An Introduction to Path Analysis

Bayesian Regression Linear and Logistic Regression

CROSS-VALIDATION OF CONTROLLED DYNAMIC MODELS: BAYESIAN APPROACH

Molecular Evolution & Phylogenetics

Learning Bayesian Networks for Biomedical Data

Monte Carlo in Bayesian Statistics

Probabilistic Graphical Models

Answers and expectations

DIC: Deviance Information Criterion

19 : Slice Sampling and HMC

Bivariate Degradation Modeling Based on Gamma Process

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!

Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P.

Markov Chain Monte Carlo methods

Bayesian RL Seminar. Chris Mansley September 9, 2008

Empirical Likelihood Based Deviance Information Criterion

Variability within multi-component systems. Bayesian inference in probabilistic risk assessment The current state of the art

LECTURE 15 Markov chain Monte Carlo

OVERVIEW. L5. Quantitative population genetics

Learning Bayesian belief networks

MCMC and Gibbs Sampling. Kayhan Batmanghelich

Collective motion: from active matter to swarms in natural and engineered systems

CS Homework 3. October 15, 2009

Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model

An introduction to Bayesian statistics and model calibration and a host of related topics

Bayesian model selection for computer model validation via mixture model estimation

R-INLA. Sam Clifford Su-Yun Kang Jeff Hsieh. 30 August Bayesian Research and Analysis Group 1 / 14

Bayesian Statistics for Personalized Medicine. David Yang

Nonparametric Bayesian modeling for dynamic ordinal regression relationships

Making rating curves - the Bayesian approach

IRT Model Selection Methods for Polytomous Items

13: Variational inference II

Computational statistics

Gaussian kernel GARCH models

Bayesian Hierarchical Models

Agent-Based Models in Ecology: Patterns and Alternative Theories of Adaptive Behaviour

STA 4273H: Statistical Machine Learning

Causal Graphical Models in Systems Genetics

Ecology and evolution. Limnology Lecture 2

Markov Chain Monte Carlo in Practice

Inference in Bayesian Networks

How New Information Criteria WAIC and WBIC Worked for MLP Model Selection

Quantitative Biology II Lecture 4: Variational Methods

Who was Bayes? Bayesian Phylogenetics. What is Bayes Theorem?

Bayesian Phylogenetics

A Bayesian Approach to Phylogenetics

Estimating Evolutionary Trees. Phylogenetic Methods

Lecture 7 and 8: Markov Chain Monte Carlo

Bayesian phylogenetics. the one true tree? Bayesian phylogenetics

Dynamic System Identification using HDMR-Bayesian Technique

Transcription:

INFORMATION CRITERIA AND APPROXIMATE BAYESIAN COMPUTING FOR AGENT-BASED MODELLING IN ECOLOGY New tools to infer on Individual-level processes Cyril Piou Biomath 2015 June 17

Plan Agent Based Modelling Pattern-Oriented Modelling Information criteria in statistics POMIC Case study POMIC & ABC for ABM ABM POM IC POMIC CS ABC

Agent-Based Modelling Reynolds, C. W. (1987) Flocks, herds and schools: a distributed behavioral model. Computer Graphics, 21, 25-34. Cohesion Direction Separation How inter-individual interactions generate group or population emergent properties?

Agent-Based Modelling How individual variability influence population dynamics and structure? Huston, M., D. DeAngelis, and W. Post. (1988). New computer models unify ecological theory. BioScience 38:682-691.

Agent-Based Modelling Trout behavior model: Use of shallow habitat when small; deep habitat when big Shift in habitat when predators, larger competitors are introduced Hierarchical feeding: big guys get the best spots Movement to margins during floods Use of slower, quieter habitat in high turbidity Use of lower velocities at lower temperatures How individual adaptations regulate populations? Railsback, S. F. & Harvey, B. C. (2002) Analysis of habitat-selection rules using an individual-based model. Ecology, 83, 1817-1830.

Agent-Based Modelling Salmon evolutionary model: Piou, C. & Prévost, E. (2012) Ecological Modelling, 231, 37-52. Piou, C. & Prévost, E. (2013) Global Change Biology, 19, 711-723. Piou, C. et al. in press Journal of Applied Ecology How individual genetics allow populations to evolve in front of changes in environmental conditions?

Agent-Based Modelling 5 branches in ecology (DeAngelis & Mooij 2005): Variation & phenotypic plasticity Spatial interactions & movement Life history and ontogenesis Cognitive behavior Genetic variability and evolution 3 main reasons to use the approach (Grimm & Railsback 2005): Individual variability Local interactions Adaptive behaviors Problems of realism and reproducibility

Pattern-Oriented Modelling ABMs (and many other simulation models) were seen as too theoretical in the 1990s Their relation to real world system was often thin ABMs used for management/real world advice purpose Need of sufficiently good representations Need to continue learning from patterns

Pattern-Oriented Modelling POM was elaborated for complex system models (CA, ABMs ) : Pattern or Emergent property = expression of underlying process POM evolved into 3 branches (reviewed in Grimm et al. 2005, Science) : 1. Choose adapted model structure 2. Analyze processes behind the pattern Strong Inference 3. Increment reliability of model & parameters Inverse modelling

Use multiple patterns Medawar zone Payoff Payoff of a model = not only how useful for the problem, but also structural realism (reproduce nature). Model complexity Problem Multiple patterns Data Grimm et al. 2005 Model complexity # nb of rules, parameters and state variables needed in the model. Medawar zone = zone of model complexity when an increase in complexity of the model does not increase the payoff.

E.g. of Strong inference from multiple patterns Trout behavior model: Railsback, S. F. & Harvey, B. C. (2002) Ecology, 83, 1817-1830.

E.g. of Inverse modelling from multiple patterns Salmon evolutionary model (16 time series): 0+ Parrs 1+ Parrs Smolts 1yr Smolts 2yr Numbers Proportions matures L (mm) Field Model ABM POM ABC IC POMIC CS Piou, C. & Prévost, E. (2012)

Akaike s Information Criterion Based on information theory (Akaike 1973) The best model is the one with smallest Kullback-Leibler distance

Akaike s Information Criterion Kullback-Leibler distance f(x) (probability of value in nature) TRUTH density distribution x (variable considered) With underlying unknown probability density function (pdf)

Akaike s Information Criterion Kullback-Leibler distance f(x) (probability of value in nature) g(x) (probability of value with model)? Predicted density distribution x (variable considered) x (variable to be predicted)

Akaike s Information Criterion Kullback-Leibler distance I ( f ( x) ; g( x) ) ( ) ( ) ( ) + f x = f x log dx g x

Akaike s Information Criterion AIC proposition: Model selection dependent on fit and parsimony Assumptions: L( ) k AIC = 2 log θ + 2 k The n is large enough Mean expected likelihood as estimator of K-L distance Maximum likelihood ~ mean expected likelihood

Model selection in Bayesian statistics Proposition of Spiegelhalter et al. (2002 Journal of the Royal Statistical Society B, 64, 583-639) for Bayesian models : Deviance D ( θ ) = 2log( L( θ )) ( θ ) D( θ ) = D( ) pd DIC = 2 D θ + Mean deviance on posterior distribution Prerequisite: Convergence of a MCMC gives posterior distribution Identification of likelihood function for parameters DIC was demonstrated as generalization of AIC Complexity of model ( θ ) D( θ ) pd D = Deviance of model with estimates of parameters from posterior distribution

POMIC proposition Pseudo-likelihood pseudo-deviance Posterior distribution estimation: Metropolis without likelihood (~ABC) Complexity: Adapting DIC proposition Piou, C., Berger, U. & Grimm, V. (2009) Proposing an information criterion for individual-based models developed in a pattern-oriented modelling framework. Ecological Modelling, 220, 1957-1967.

Pseudo-likelihood For (most) ABMs we do not have likelihood functions of the parameters The likelihood of a parameter set given the data = Goodness of fit indicator

Pseudo-likelihood For (most) ABMs we do not have likelihood functions of the parameters One way to bypass this is approximating a likelihood-type function with Z included inθ g(x θ) (probability in model)? b(x) =b(x (θ*+z)) (probability in sample) x (predicted variable) x (sampled variable)

Pseudo-likelihood An estimate of likelihood: g(x 1 θ) -Mesure of x in model -Adjust estimation of density on range of real x -«reading» of pseudo-likelihood of real x from this adjusted density L n ( θ data) g( θ ) = i ( ) x i x 1 x 2

Pseudo-likelihood An estimate of Deviance: g(x 1 θ) -Mesure of x in model -Adjust estimation of density on range of real x -«reading» of pseudo-likelihood of real x from this adjusted density L n ( θ data) g( θ ) POMDEV = = 2 i n i= 1 ( ) log x i ( g( θ )) x i x 1 x 2

Metropolis without likelihood Metropolis Thanks Marjoram et al. (2003, PNAS) Idea 1: estimator of likelihood ratios P ( accept θ ') = Estimate of θ ' Q LR θ t Q ( θt θ ') ( θ ' θ ) ( θ ') ( θ ) Idea 2: production of data D and reject parameters if D D (±error) P ( accept θ ') 1 D = D' Q = 0 otherwise Q ( θt θ ') ( θ ' θ ) t t P P P P ( θ ') ( θ ) t t

Metropolis without likelihood Metropolis + POMDEV: POMDEV ' = 2 n i= 1 log ( g( θ ')) x i P P Idea 1: estimator of likelihood ratios exp POMDEVt POMDEV' accept θ ', POMDEV' = 2 ( ) ( ) Q( θt θ ') Q( θ ' θ ) Idea 2: production of data D and reject parameters if D D (±error) ( accept θ ') 1 = POMDEV ' POMDEVacceptance 0 otherwise Q Q ( θt θ ') ( θ ' θ ) t t P P P P ( θ ') ( θ ) t ( θ ') ( θ ) t

POMIC P( θ D) = POMDEV = POMDEV ( θˆ ) POMDEV 1 L = POMDEV L l l= 1 ( θ ) ( ) = POMDEV pd POMIC = 2 POMDEV POMDEV θˆ + pd = POMDEV POMDEV ( θˆ )

POMIC ~ ABC Hartig, F., Calabrese, J. M., Reineking, B., Wiegand, T. & Huth, A. (2011) Statistical inference for stochastic simulation models - theory and application. Ecology Letters, 14, 816-827. ABM POM ABC IC POMIC CS

Case study Inferring on innate locust behavior with individual based models and an information criteria Piou C. & Maeno K.

Solitarious

Density Gregarization

Niger 1988 Mauritanie 1988 Gregarious Niger 1987

Innate gregarious behavior? Ellis (1956) found a random distribution of hatchlings in cages Simpson et al. (Islam et al. 1994, Bouaïchi et al. 1995, Simpson et al. 1999 ) individual measurements difference of behavior among solitarious and gregarious hatchlings (but mixing activity and attractivity and isolating individuals at hatching time) Guershon & Ayali (2012) group measurements behaving identically: grouping together on sticks (but discarded egg pods producing few individuals, first 2h after hatching and wrong statistical method)

Innate gregarious behavior? Group measure to keep interactions But problem of mixing of processes/behaviors (level of activity, aggregation, food search, resting )

Laboratory experiments 17 experiments: 11 from grouped mothers 6 from isolated mothers 30 eggs positioned on a wet petri-dish ~24h before hatching time in a center of an arena Photo recording every 15min during at least 7h (up to 16h) after first hatching

Laboratory experiments Photo analysis: Count of individuals per stick Count of individuals on other parts of the arena NND (on sticks) Activity (change of nb per sticks) Proportions of individuals on sticks

OVERVIEW ABM analysis Model description Objective: Simulate our experiments Entities & state variables: Eggs ready to hatch (Time to hatch, position) Sticks (attractiveness, number of locusts = n) Hatchling locust (position, age) Scales: 101 time steps (time step = 15min), 16 sticks, N eggs = hatched eggs in experiments

OVERVIEW ABM analysis Model description Process overview: Update-attractivity: d n attractiveness i = 1+ n SetvectorAttractivity: P ( choice = i) j = i j f + attractivenessi attractiveness j j k k neighbors of i n j n j Update locusts: P climb = invlogit ( ) ( a age + b) choice = i? P P ( walk ) = 1 P( climb) ( changestick ) = pc

OVERVIEW ABM analysis Model description Parameters: a, b, d, f, pc (to be adjusted with MCMC per experiment) Forced dynamics: hatching time of eggs (as in experiments) d=0 d=50

ABM analysis Searching for behavior explaining the lab results 3 model versions : ModelO d=f=0 pc=? a=? b=? ModelA d=? f=0 pc=? a=? b=? ModelB d=? f=? pc=? a=? b=?

ABM analysis Searching for behavior explaining the lab results 3 model versions : ModelO d=f=0 pc=? a=? b=? ModelA d=? f=0 pc=? a=? b=? ModelB d=? f=? pc=? a=? b=? With each model and for each experiment (17) parameterization and posterior creation with MCMC (following Piou et al. 2009) POMIC measure to infer on likely behavior that lead to the 3 temporal patterns of: Mean NND through time Proportion of individuals on sticks through time Activity through time

ModelB for experiment 8 a ModelA for experiment 16 a 0.8 b d 0.83 b f d pc pc 0.27

ABM analysis Gregarious Solitarious Experiments 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 Best Model O O B O A O B B A O O O O O O A A Second Model B A O A B B O O O A B A B A A B O Third Model A B A B O A A A B B A B A B B O B dˆ of best model - - 2.5-14 - 2 9 7 - - - - - - 57 37 POMIC 1 st -2 nd 1 1 1.6 32 2.2.0 4 0 4 12.8.9 14.6 5 1 POMIC 1 st -3 rd 1 4 4 42 4.6.2 6 3 13 13 2 1.5 26 1 59 5 Gregarious innate behavior sometimes significantly happening but from both treatments

Conclusions of case study Innate gregarious behavior might exist in Desert Locust Small n of lab experiments ABM integration of different processes POMIC approach applicable to behavioral analyses

POMIC & ABC Potential in behavioral ecology to infer on complex systems dynamics Problem of chain convergence for complex models trade-off of case complexity and inference possibility Other techniques of ABC could be applied (summary statistics, regression ) Study of group behavior of hopper bands to come

THANK YOU! БЛАГОДАРЯ!