Species distribution modelling with MAXENT Mikael von Numers Åbo Akademi

Why model species distribution?

Knowledge about the geographical distribution of species is crucial for conservation and spatial planning. Detailed data on species distributions are usually not available, and collecting such data is costly and labour-intensive. In many cases conservationists have to rely on predictive models to estimate patterns of species distribution and to design conservation strategies. Species distribution models (SDMs) provide one of the best ways to overcome the sparseness typical of distributional data, by relating the data to a set of geographic or environmental predictors.

What do we need for SDM?

- Reliable data on species presences (and absences)
- Environmental data as GIS rasters (predictors)

Typical workflow:

Maxent: a short introduction

Maxent is a presence-only (po) modelling method, which means that no absence data are needed. Maxent (or another po method) might be a good choice, for instance, when:

- No absence data are available, which is often the case (absences are not recorded; data come from museums, herbaria etc.).
- There is reason to believe that the absence data are not reliable.
- Several other reasons apply, for instance:
  - The species is not stationary (satellite-tagged animals, e.g. porpoises; radiotelemetry data).
  - The species is hard to detect (e.g. reptiles).
  - The species is temporarily absent.
  - The species occurs in patches.
  - You have only a single observation within a large suitable territory (for instance singing bird males).

How it works

The Maxent method does not need species absences; instead it uses background environmental data for the entire study area. The method focuses on how the environment where the species is known to occur relates to the environment across the rest of the study area. The idea is to find the probability distribution of maximum entropy (the most spread out), subject to constraints imposed by the information available on the species presences and the environmental conditions across the study area (more in Phillips et al. 2006 and Elith et al. 2011: A statistical explanation of MaxEnt for ecologists). Maxent has similarities to GAM and GLM, but Maxent models a probability distribution over all pixels in the study area. In no sense are pixels without species records interpreted as absences, meaning that pseudo-absences are not used.
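The maximum-entropy idea can be illustrated with a small sketch. This is not the Maxent program itself: it assumes only linear features scaled to [0, 1], uses plain gradient ascent, and leaves out the regularization that Maxent applies. The function name `fit_maxent` and the synthetic data are made up for illustration. Starting from a uniform distribution over all pixels, the weights are adjusted until the model's average predictor values match the averages observed at the presence points.

```python
import numpy as np

def fit_maxent(features, presence_idx, n_iter=3000, lr=1.0):
    """Fit p(pixel) proportional to exp(features @ lam) over all pixels.

    features: (n_pixels, n_features) array, assumed scaled to [0, 1]
    presence_idx: indices of the pixels with a recorded presence
    """
    lam = np.zeros(features.shape[1])   # lam = 0 gives the uniform distribution
    # Constraint: model feature means must equal the means at the presences.
    target = features[presence_idx].mean(axis=0)

    def distribution(weights):
        logits = features @ weights
        logits -= logits.max()          # numerical stability
        p = np.exp(logits)
        return p / p.sum()

    for _ in range(n_iter):
        p = distribution(lam)
        # Gradient of the mean presence log-likelihood with respect to lam.
        lam += lr * (target - p @ features)
    return lam, distribution(lam)

# Synthetic example (assumption): 500 pixels, two predictors, 40 presences.
rng = np.random.default_rng(0)
features = rng.uniform(size=(500, 2))
weights = np.exp(2.0 * features[:, 0] - 1.0 * features[:, 1])
presence_idx = rng.choice(500, size=40, replace=False, p=weights / weights.sum())
lam, p = fit_maxent(features, presence_idx)
```

The resulting `p` is a probability distribution over all pixels, not a set of presence/absence labels, which mirrors the point made above about pixels without records.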

Advantages:

- Maxent can use both continuous and categorical environmental variables (predictors).
- Maxent is able to fit complex relationships between the species and the environmental variables (features in Maxent), including interactions between the predictors.
- Produces test statistics, measures of variable importance and response curves.
- Cross-validation is possible.
- The regularization parameters can be adjusted. These determine how focused the output distribution is; a larger parameter gives a less localized prediction.
- Works well together with, for instance, ArcView.
- Is reported to be effective with a relatively small number of presences.
- The output raster represents a continuous measure of probability of occurrence.
- Maxent is quite a new method, but it has performed excellently in tests against other similar methods.
- It is quite easy to use and has a nice, user-friendly interface.
- Shareware, active discussion group, lots of recently published papers.
- Download from: www.cs.princeton.edu/~schapire/maxent/

Major conclusions drawn from Elith et al. 2006:

- Presence-only data are useful for modelling species distributions.
- Presence-only data can be sufficiently accurate to be used in conservation planning.
- New modelling methods, such as MAXENT, generally outperform established methods.

Drawbacks:

- A "black box": it is not easy to understand how the method works compared to, for instance, GLM or GAM.
- According to the literature, not as mature a statistical method as GAM or GLM.
- Sample selection bias is a bigger problem for presence-only methods than for presence-absence methods. If there is a bias, you will get a model that combines the species distribution with the distribution of sampling effort. There are methods to deal with this problem: you can provide Maxent with a bias raster to correct for the bias in sampling effort.
- If absence data are available, a presence-absence method is a better choice than a po method.

In this case a fitted model might be closer to a model of survey effort than of distribution.

The Maxent user interface

Zostera marina

Species data: 75 presence points of Zostera marina in the S. Archipelago Sea

Species, X_coord, Y_coord
Zostera, 3214710, 6666810
Zostera, 3191860, 6681080
Zostera, 3195940, 6674130
Zostera, 3215030, 6679040
Zostera, 3208580, 6653860
Zostera, 3184780, 6642620
Zostera, 3205750, 6669300
Zostera, 3196800, 6646150
Zostera, 3213730, 6678190
Zostera, 3206280, 6678010
Zostera, 3199600, 6647510
Zostera, 3197280, 6646490
Zostera, 3200910, 6648660
Zostera, 3212160, 6647820
Zostera, 3212160, 6647890
Zostera, 3189660, 6683280
Zostera, 3205810, 6669390
Zostera, 3213530, 6654590
Fucus, 3209220, 6657510
Fucus, 3194840, 6646240
Fucus, 3196250, 6646940
Fucus, 3189310, 6683540

Species data format:

- The data are stored as a comma-delimited *.csv file (use Excel).
- Only 3 columns are needed: species name(s) and coordinates.
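Before a run it can be useful to check that the presence file really has the three expected columns. The following is a minimal sketch using only the Python standard library; the function name `read_presences` is hypothetical and the three sample rows are taken from the listing above.

```python
import csv
import io
from collections import Counter

# Three rows copied from the species data shown above.
sample = """Species,X_coord,Y_coord
Zostera,3214710,6666810
Zostera,3191860,6681080
Fucus,3209220,6657510
"""

def read_presences(handle):
    """Return a list of (species, x, y) tuples from a three-column CSV."""
    reader = csv.reader(handle, skipinitialspace=True)
    next(reader)                      # skip the header row
    records = []
    for row in reader:
        if not row:                   # ignore blank lines
            continue
        if len(row) != 3:
            raise ValueError(f"expected 3 columns, got {row}")
        species, x, y = row
        records.append((species.strip(), float(x), float(y)))
    return records

records = read_presences(io.StringIO(sample))
counts = Counter(species for species, _, _ in records)
print(counts)
```

For a real file, `open("species.csv", newline="")` (file name assumed) would replace the `io.StringIO` wrapper.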

Predictor layers describing the environmental variables

- The grids have to be in ASCII raster format (ESRI .asc).
- The grids must have the same geographic bounds and cell size.
- The layers can be continuous or categorical.

Example of an .asc file (header followed by the first cell values):

ncols 1827
nrows 2044
xllcorner 3176430
yllcorner 6636626
cellsize 25
NODATA_value -9999
-9999 -9999 -9999 -9999 -9999 -9999 -0.1697558 -0.3892355 -0.629083 -0.8858771 -1.15194 -1.418818 -1.683608 -1.943836 -2.19765 -2.453322 -2.724256 -3.016762 -3.336428 -3.700734 -4.129993 -4.631121 -5.202521 -5.847729 -6.573002 -7.368282 -8.198206 -9.017972 -9.795128 -10.51915 -11.18508 -11.76465 -12.1964 -12.40763 -12.36905 -12.19018 -12.21704 -12.41916 -13.14217 -14.7096 -17.03474 -19.10044 -20.86929 -22.32145 -23.51356 -24.54868 -25.52947 -26.52113 -27.53157 -28.51738 -29.42035 -30.20646 -30.87352 -31.43348 -31.87958 -32.1587 -32.1725 -31.80916 -30.98529 -29.68661 -27.99283 -26.07573 -24.18093 -22.60346 -21.62301 -21.31837 -21.2968 -21.14873 -20.38395 -18.66777 -15.95716 -13.17438 -10.75567 -8.945774 -6.735695 -4.542916 -2.320879 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -0.5598915
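The requirement that all grids share the same bounds and cell size can be checked from the headers alone. The sketch below assumes the standard six-line ESRI ASCII header with the keys ncols, nrows, xllcorner, yllcorner, cellsize and NODATA_value; the helper names and the mismatched second layer are made up for illustration.

```python
def read_asc_header(text):
    """Parse the six header lines of an ESRI ASCII grid into a dict."""
    header = {}
    for line in text.splitlines()[:6]:
        key, value = line.split()
        header[key.lower()] = float(value)
    return header

def same_geometry(a, b):
    """True if two headers describe the same extent and cell size."""
    keys = ("ncols", "nrows", "xllcorner", "yllcorner", "cellsize")
    return all(a[k] == b[k] for k in keys)

depth_header = """ncols 1827
nrows 2044
xllcorner 3176430
yllcorner 6636626
cellsize 25
NODATA_value -9999
"""
# Hypothetical second layer with a different cell size.
other_header = depth_header.replace("cellsize 25", "cellsize 50")

depth = read_asc_header(depth_header)
other = read_asc_header(other_header)
print(same_geometry(depth, depth), same_geometry(depth, other))
```

Files that use xllcenter/yllcenter instead of the corner keys would need the key list adjusted.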

Predictors: Depth (DEM)

Predictors: exposure

Predictors: distance from sand. A proxy for sandy substrate (that is not available).

Predictors: Slope (derived from the DEM)
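As an illustration of how a slope layer can be derived from a DEM, here is a minimal sketch using simple finite differences in NumPy. It is an assumption for illustration, not the method used to produce the slide's layer: GIS packages typically use a 3x3 neighbourhood formula, and NODATA cells are not handled here.

```python
import numpy as np

def slope_degrees(dem, cellsize):
    """Slope in degrees from a DEM array with square cells of the given size."""
    # np.gradient returns the rate of change along rows and along columns.
    dz_drow, dz_dcol = np.gradient(dem, cellsize)
    return np.degrees(np.arctan(np.hypot(dz_drow, dz_dcol)))

# Synthetic test surface: depth increases by 25 m per 25 m cell to the east.
cols = np.arange(5)
dem = np.tile(-25.0 * cols, (4, 1))
slope = slope_degrees(dem, cellsize=25.0)
```

On this synthetic plane every cell has a slope of 45 degrees, which is a quick sanity check for the units.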

The Maxent output probability raster is an ASCII (.asc) raster, which is easy to export to ArcView for further analysis and symbolisation.

Substrate data = categorical data

Cormorant fishing areas: substrate included as a categorical variable

Worth remembering when modelling:

1. Garbage in, garbage out.
2. Use a sufficient number of records. No algorithm can model extremely sparse species data. Guideline: > 30 records.
3. Each record should bring new information to the model; reduce clusters of observations to one observation.
4. Samples should be spread across the whole area of interest -> stratified sampling.
5. Beware of sampling bias, especially with po methods.
6. Pre-process the predictors carefully (resolution, collinearity etc.).
7. Check the model fit (AUC, cross-validation, training and test datasets). A large literature is available.
8. There are many sources of error, so predictions will always be uncertain. Be realistic and cautious when interpreting the results.
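To make the AUC check concrete, here is a minimal sketch of how AUC can be computed for presence-only output: the probability that a randomly chosen presence point receives a higher score than a randomly chosen background pixel, with ties counted as one half. The function name and the example scores are made up; Maxent reports its own AUC, so this is only meant to show what the number measures.

```python
import numpy as np

def auc(presence_scores, background_scores):
    """Fraction of presence/background pairs ranked correctly (ties count 0.5)."""
    pres = np.asarray(presence_scores, dtype=float)[:, None]
    back = np.asarray(background_scores, dtype=float)[None, :]
    wins = (pres > back).sum() + 0.5 * (pres == back).sum()
    return wins / (pres.size * back.size)

# Hypothetical model scores at 3 presence points and 4 background pixels.
value = auc([0.9, 0.8, 0.4], [0.1, 0.5, 0.3, 0.4])
print(value)  # 10.5 of 12 pairs ranked correctly -> 0.875
```

A value of 0.5 corresponds to a model that ranks no better than chance, and 1.0 to a model that scores every presence above every background pixel.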

Workflow:

1. The Maxent program
2. The Maxent output
3. Do a Maxent run using the Zostera data and four predictor layers (individually or together)
4. Import the Maxent predictions into ArcView (together)
5. Use ArcView to mask out part of the study area (together)
6. Do a new Maxent run and compare the results
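The exercise does the masking and comparison in ArcView. For readers who prefer a script, the same two steps can be sketched with NumPy arrays; the tiny 2x2 grids, the mask and the function names below are assumptions for illustration only, not output from an actual Maxent run.

```python
import numpy as np

NODATA = -9999.0

def mask_grid(grid, keep):
    """Set every cell outside the boolean mask `keep` to NODATA."""
    return np.where(keep, grid, NODATA)

def mean_abs_difference(a, b):
    """Mean absolute difference over cells that are valid in both grids."""
    valid = (a != NODATA) & (b != NODATA)
    return np.abs(a[valid] - b[valid]).mean()

# Hypothetical predictions from two runs on a 2x2 grid.
run1 = np.array([[0.2, 0.8], [0.5, NODATA]])
run2 = np.array([[0.4, 0.6], [0.5, 0.9]])
keep = np.array([[True, True], [False, True]])   # area kept after masking

run2_masked = mask_grid(run2, keep)
difference = mean_abs_difference(run1, run2_masked)
```

Only cells that carry data in both rasters enter the comparison, so masked or NODATA cells do not distort the result.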