Occupancy models. Gurutzeta Guillera-Arroita University of Kent, UK National Centre for Statistical Ecology
1 Occupancy models. Gurutzeta Guillera-Arroita, University of Kent, UK, National Centre for Statistical Ecology. Advances in Species distribution modelling in ecological studies and conservation, Pavia and Gran Paradiso, Italy, Sept. 2011
2 Outline
Session 1: Introduction to occupancy modelling. S1.1 Introduction; S1.2 Statistical background; S1.3 Single-season occupancy model
Session 2: Occupancy modelling in practice. S2.1 Practical: single-season; S2.2 Study design
Session 3: Occupancy modelling developments. S3.1 Multiple-season occupancy model; S3.2 Practical: multi-season; S3.3 Further models
3 SESSION 2 Occupancy modelling in practice 2.1 Practical: single-season
4 Software packages to fit occupancy models. Package unmarked
5 Practical. We will use the sample data sets that come with PRESENCE. To find them, go to the PRESENCE installation folder and look under the folder sample_data, e.g. C:\program files\presence\sample_data
6 Sample data set 1: Blue-ridge two-lined salamander (Eurycea wilderae). Habitat: temperate forests, rivers, freshwater springs. Endemic to the US; found in the southern Appalachians.
7 Sample data set 1: Blue-ridge two-lined salamander. Data set from a 2001 survey (Blue_Ridge_pg99.csv). Sampling protocol: s = 39 sites surveyed, k = 5 replicates per site. Each site: a 50 m transect on natural cover and coverboard stations. Each site was surveyed once every two weeks from April to mid-June, when the salamanders are believed to be most active.
8 Design matrix. Used to relate the probabilities ψ and p to the regression coefficients (β parameters).
9 Design matrix. Used to relate the probabilities ψ and p to the regression coefficients (β parameters). Columns: β parameters. Row: real parameter (ψ). Grid cells: values that define the regression equation: $\operatorname{logit}(\psi) = a_1 \times 1 = a_1$
10 Design matrix. Used to relate the probabilities ψ and p to the regression coefficients (β parameters). Columns: β parameters; to incorporate covariates, more columns need to be added. Row: real parameter (ψ). Grid cells: values that define the regression equation: $\operatorname{logit}(\psi) = a_1 \times 1 = a_1$
11 Design matrix. Used to relate the probabilities ψ and p to the regression coefficients (β parameters).
12 Design matrix. Used to relate the probabilities ψ and p to the regression coefficients (β parameters). Columns: β parameters. Rows: real parameters ($p_1, \dots, p_5$). Grid cells: values that define the regression equations: $\operatorname{logit}(p_j) = b_1$ for $j = 1, \dots, 5$
13 Design matrix. Used to relate the probabilities ψ and p to the regression coefficients (β parameters). Columns: β parameters; to incorporate covariates, more columns need to be added. Rows: real parameters ($p_1, \dots, p_5$). Grid cells: values that define the regression equations. More columns are also added for a survey-specific p.
14 Design matrix. Used to relate the probabilities ψ and p to the regression coefficients (β parameters). Survey-specific p: $\operatorname{logit}(p_j) = b_j$ for $j = 1, \dots, 5$
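What the design matrix does can be sketched in a few lines of Python. Each row holds the coefficients of one real parameter's regression equation, so multiplying the matrix by the β vector gives the logit-scale values, and the inverse logit turns those into probabilities. The function names and the β values below are made up for illustration; they are not estimates from the salamander data and this is not how PRESENCE is implemented internally.

```python
import math

def inv_logit(x):
    """Back-transform a logit-scale value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def real_parameters(design, betas):
    """Multiply each design-matrix row by the beta vector, then back-transform."""
    return [inv_logit(sum(d * b for d, b in zip(row, betas))) for row in design]

# Constant detection: a single column, so every survey shares b1.
constant_design = [[1], [1], [1], [1], [1]]

# Survey-specific detection: one column per survey (an identity matrix).
survey_design = [[1 if i == j else 0 for j in range(5)] for i in range(5)]

# Hypothetical beta values, for illustration only.
p_constant = real_parameters(constant_design, [-0.5])
p_survey = real_parameters(survey_design, [-1.0, -0.5, 0.0, 0.5, 1.0])

print([round(p, 3) for p in p_constant])
print([round(p, 3) for p in p_survey])
```

Adding a covariate would simply add a column whose cells hold the covariate values for each row.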
15 Practical 1 - results. Part 1a: regression equations: $\operatorname{logit}(p_1) = \operatorname{logit}(p_2) = b_1$; $\operatorname{logit}(p_3) = \operatorname{logit}(p_4) = \operatorname{logit}(p_5) = b_2$
16 Practical 1 - results. Part 1a: regression equations: $\operatorname{logit}(p_1) = \operatorname{logit}(p_2) = b_1$; $\operatorname{logit}(p_3) = \operatorname{logit}(p_4) = \operatorname{logit}(p_5) = b_2$. Back-transformation of the estimates: $p_{1\text{-}2} = \dfrac{1}{1 + \exp(-b_1)}$ and $p_{3\text{-}5} = \dfrac{1}{1 + \exp(-b_2)}$
17 Practical 1 - results. Part 1a: model support. 2.94 AIC units better than the constant model ψ(.) p(.). Likelihood-ratio test: the test statistic exceeds 3.84 (the χ² critical value at 0.05 for 3 - 2 = 1 degree of freedom), so the p-value is < 0.05, i.e. there is support to reject the null hypothesis ψ(.) p(.).
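The link between the AIC difference and the likelihood-ratio test can be sketched in Python. Because AIC = -2 log L + 2K, for two nested models the likelihood-ratio statistic equals the AIC improvement of the larger model plus twice the number of extra parameters. The function names are illustrative, and the chi-square tail probability is computed from its closed form for integer degrees of freedom rather than taken from a statistics library.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * terms
    tail = math.erfc(math.sqrt(half))
    extra = sum(half ** (i - 0.5) / math.gamma(i + 0.5)
                for i in range(1, (df - 1) // 2 + 1))
    return tail + math.exp(-half) * extra

def lrt_from_aic(aic_gain, extra_params):
    """LR statistic and p-value, where aic_gain = AIC(simple) - AIC(complex)."""
    stat = aic_gain + 2 * extra_params
    return stat, chi2_sf(stat, extra_params)

# Part 1a: 2.94 AIC units better than psi(.)p(.), with 3 - 2 = 1 extra parameter.
stat_1a, pval_1a = lrt_from_aic(2.94, 1)
print(f"LR statistic = {stat_1a:.2f}, p-value = {pval_1a:.3f}")
```

A model that is worse by AIC enters with a negative `aic_gain`; the statistic is still non-negative for nested models because the penalty term is added back.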
18 Practical 1 - results. Part 1b: regression equations: $\operatorname{logit}(p_1) = b_1$; $\operatorname{logit}(p_j) = b_1 + b_j$ for $j = 2, \dots, 5$. Note: survey 1 is the reference here.
19 Practical 1 - results. Part 1b: regression equations: $\operatorname{logit}(p_1) = b_1$; $\operatorname{logit}(p_j) = b_1 + b_j$ for $j = 2, \dots, 5$. Estimates reported: $b_1, \dots, b_5$ and the corresponding $p_1, \dots, p_5$. Note: survey 1 is the reference here.
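Coding survey 1 as the reference is only one way of writing a survey-specific p. A short Python sketch, with made-up coefficient values, shows that the reference coding and a coding with one coefficient per survey give identical probabilities once the coefficients are mapped onto each other.

```python
import math

def inv_logit(x):
    """Back-transform a logit-scale value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical coefficients under reference coding:
# logit(p_1) = b_1 and logit(p_j) = b_1 + b_j for j = 2..5.
b_ref = [-0.8, 0.3, 0.5, -0.2, 0.9]
p_ref = [inv_logit(b_ref[0])] + [inv_logit(b_ref[0] + b) for b in b_ref[1:]]

# Equivalent coefficients when each survey has its own coefficient:
# logit(p_j) = c_j, with c_1 = b_1 and c_j = b_1 + b_j.
c = [b_ref[0]] + [b_ref[0] + b for b in b_ref[1:]]
p_own = [inv_logit(cj) for cj in c]

print([round(p, 3) for p in p_ref])
print([round(p, 3) for p in p_own])
```

Under the reference coding, each additional coefficient is read as a difference from survey 1 on the logit scale.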
20 Practical 1 - results. Part 1c: regression equations: $\operatorname{logit}(p_j) = b_j$ for $j = 1, \dots, 5$. Estimates reported: $b_1, \dots, b_5$ and the corresponding $p_1, \dots, p_5$.
21 Practical 1 - results. Part 1c: model support. 4.89 AIC units worse than the best model in the set, ψ(.) p(1-2,3-5). Likelihood-ratio test: the test statistic is below 7.82 (the χ² critical value at 0.05 for 6 - 3 = 3 degrees of freedom), so the p-value is > 0.05, i.e. there is no support to reject the null hypothesis ψ(.) p(1-2,3-5).
22 Sample data set 2: Mahoenui giant weta (Deinacrida mahoenui). Endemic to the King Country in New Zealand's North Island. Only 2 surviving populations (main one in a 240-ha reserve). Uses gorse plants as protection from predators and as a food source. Goats and cattle used to browse the gorse (more foliage).
23 Sample data set 2: Mahoenui giant weta. Data set from a 2004 survey (Weta_pg116.xls). Sampling protocol: each site a 3 m radius circular plot; s = 72 sites surveyed, k = 3-5 replicates per site within a 5-day period; 3 observers (each observer visited each site at least once). Interest in the effect of browsing on ψ.
24 Practical 2 - results. Part 2a: $\operatorname{logit}(\psi_u) = a_1$, $\operatorname{logit}(\psi_b) = a_2$; $\psi_u = 0.481$, $\psi_b =$
25 Practical 2 - results. Part 2a, two codings. First: $\operatorname{logit}(\psi_u) = a_1$, $\operatorname{logit}(\psi_b) = a_2$; $\psi_u = 0.481$, $\psi_b =$ Second: $\operatorname{logit}(\psi_u) = a_1$, $\operatorname{logit}(\psi_b) = a_1 + a_2$; $\psi_u = 0.481$, $\psi_b =$
26 Practical 2 - results. Part 2b: set of models
27 Practical 2 - results. Part 2c: probabilities. ψ for a browsed site: (0.121). p for a site surveyed on day 3 by observer 2: (0.081)
28 Practical 2 - results. Part 2d: Combined model weights for day in p = 0.91. Combined model weights for observer in p = 0.73. Combined model weights for browse in ψ =
29 Practical 2 - results. Part 2e: model averaging. Model-averaged ψ for browsed sites: (0.127). Model-averaged ψ for unbrowsed sites: (0.142)
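Combined model weights and model-averaged estimates can both be derived from the AIC table. The Python sketch below uses an entirely hypothetical model set (names, AIC values, estimates and standard errors are invented, not the weta results). It assumes the usual Akaike-weight formula and the Burnham and Anderson style unconditional standard error; whether PRESENCE reports exactly this standard error is an assumption.

```python
import math

def akaike_weights(aics):
    """Convert AIC values to model weights that sum to one."""
    best = min(aics)
    rel = [math.exp(-0.5 * (a - best)) for a in aics]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical model set: (name, AIC, psi estimate for browsed sites, SE).
models = [
    ("psi(browse) p(day+obs)", 260.0, 0.60, 0.12),
    ("psi(.) p(day+obs)",      261.0, 0.50, 0.10),
    ("psi(browse) p(day)",     262.5, 0.62, 0.13),
    ("psi(.) p(.)",            270.0, 0.48, 0.09),
]
weights = akaike_weights([m[1] for m in models])

# Combined weight of all models whose psi structure includes browse.
w_browse = sum(w for w, m in zip(weights, models) if "psi(browse)" in m[0])

# Model-averaged estimate and unconditional standard error.
psi_avg = sum(w * m[2] for w, m in zip(weights, models))
se_avg = sum(w * math.sqrt(m[3] ** 2 + (m[2] - psi_avg) ** 2)
             for w, m in zip(weights, models))

print(round(w_browse, 3), round(psi_avg, 3), round(se_avg, 3))
```

The unconditional standard error is wider than a weighted average of the individual standard errors because it also carries the spread of the estimates across models.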
30 Worked examples in PRESENCE. If you want to play more with these data sets, PRESENCE includes a file with exercises and detailed explanations: go to Help, then "PRESENCE worked examples and exercises". The occupancy book also discusses the analysis of these data sets (MacKenzie et al., 2006, p. 99 and p. 116).
31 SESSION 2 Occupancy modelling in practice 2.2 Study design
32 Think first! "To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of" (Ronald Fisher, ca. 1938). "The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data..." (John Tukey, 1986)
33 Think first! Lots of things to think about: why-what-how (Yoccoz et al. 2001). Why carry out this study? (i.e. articulate objectives) What would be a suitable state variable? How to do it? Here we concentrate on the how; note that it is related to the what and the why. A note on the what: do not artificially force yourself to use a particular framework/model, e.g. if the raw data consist of counts, collapsing them to detection/nondetection will inevitably lose information! There is a whole literature of models for count data (e.g. N-mixture models).
34 Design issues in occupancy modelling. How to define site? How to choose sites to sample? What is our sampling season? How to obtain replication to estimate detectability? How far apart should my replicate visits be? How to avoid heterogeneity? How to allocate my survey effort into sites and replicates? How much total effort will be needed? What can I expect to obtain with my available resources?
35 How to define site? Sometimes there is a natural definition (discrete units), e.g. ponds, habitat patches. In other cases it is more arbitrary (continuous landscape), e.g. plots within a forest.
36 How to define site? Remember: occupancy is related to my definition of site! $\psi_1 = 2/4 = 0.5$; $\psi_2 = 2/16 = 0.125$. An occupancy estimate is meaningless if we do not know how the sampling site was defined.
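The dependence of occupancy on the site definition is easy to reproduce. The Python sketch below uses a hypothetical 4 x 4 landscape (the cell layout is an assumption, chosen so that the two site sizes give 2/4 and 2/16) and computes the proportion of occupied sites when a site is a 2 x 2 block and when it is a single cell.

```python
# Hypothetical 4 x 4 landscape: 1 = species present in the cell, 0 = absent.
cells = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

def occupancy(grid, block):
    """Proportion of block x block sites containing at least one occupied cell."""
    n = len(grid)
    sites = [
        any(grid[r][c]
            for r in range(r0, r0 + block)
            for c in range(c0, c0 + block))
        for r0 in range(0, n, block)
        for c0 in range(0, n, block)
    ]
    return sum(sites) / len(sites)

psi_large = occupancy(cells, 2)  # 4 large sites
psi_small = occupancy(cells, 1)  # 16 small sites
print(psi_large, psi_small)
```

The same two occupied cells give a fourfold difference in occupancy, which is why an estimate has to be reported together with the site definition.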
37 How to define site? There is no universal truth; to be assessed on a case-by-case basis. Things to consider: At what scale do we want to measure occupancy? At which scale is an observed 0 or 1 meaningful? Is the species territorial? What is the size of its home range?
38 How to choose sites to sample? We usually aim to do inference beyond the specific sites sampled, so select sites so that results can be generalized. This needs a probabilistic sampling scheme: random sampling, stratified random sampling. Selecting sites based on knowledge of their occupancy status is in general not a good idea, unless those sites actually represent the population of interest: estimates of occupancy for the entire population may be biased.
39 How to choose sites to sample? The best approach depends on project objectives, e.g. two possible objectives for an occupancy study: 1. Compare occupancy for two specific habitat types in the area. 2. Obtain an overall estimate of occupancy for the entire area. For objective 1, an efficient design would involve identifying areas within the reserve corresponding to the two habitat types and then randomly selecting sites. However, this design would not be appropriate for objective 2, because the sample is not representative of the whole reserve.
40 What is our sampling season? The window of time in which the system is sampled. Sometimes there is a natural definition, e.g. breeding season, wet season... But it could be something different, e.g. if we survey once a day for 7 days, our sampling season is 1 week. Need to consider how the species moves, as this influences the biological quantity that the sampling is capturing. Remember the closure assumption: is there emigration/immigration? Are we looking at ψ as actual occupancy or as use? If occupancy is a sort of surrogate for population size, a shorter season is better (~snapshot). If the interest is in use, a longer season is better.
41 How to obtain replication to estimate detectability? Repeated visits to each site at different points in time. Multiple surveys within one visit: one observer carries out several independent surveys; simultaneous independent observers; simultaneous independent detection methods. Spatial replication within the site.
42 How to obtain replication to estimate detectability? Which is more appropriate depends on the biology of the species and the factors that affect its detection. If p is constant, repeated surveys within a visit may be more efficient. But if p varies, e.g. daily, heterogeneity may be induced. Multiple visits allow each site to be surveyed under a range of conditions.
43 How far apart should my replicate visits be? So that our estimates have a useful interpretation, e.g. far enough apart to ensure that the detections are independent, and close enough to ensure we do not have closure problems. Once again one needs to consider how the species moves. If ψ is usage, ensure that the species has had the chance to randomly enter/leave the site from one visit to the next. If ψ is occupancy (snapshot), visits should be closer together, so that the species is either present or absent at the site during the whole period.
44 How to avoid heterogeneity? Choose the site size so that there are no great differences in abundance. Collect relevant information and include it as covariates, e.g. habitat information, meteorological data... Avoid always sampling the same sites under the same conditions: if possible, rotate!
45 How to avoid heterogeneity? e.g. monitoring of Alaotran gentle lemur: observer, time of day, meteorological conditions. [Figure: Day 1; labels T1, T2, T3 and village] (Guillera-Arroita et al. 2010a)
46 How to avoid heterogeneity? e.g. monitoring of Alaotran gentle lemur: observer, time of day, meteorological conditions. [Figure: Day 2; labels T1, T2, T3 and village] (Guillera-Arroita et al. 2010a)
47 How to allocate my survey effort? How to allocate our effort into sites and replicate surveys? Is it better to visit fewer sites and carry out more surveys in each one, or to cover more sites and do fewer visits? [Figure: a design with more sites × 3 surveys compared with a design with fewer sites × 6 surveys] Trade-off: increasing s reduces var(ψ̂), BUT with a smaller k we are more likely to miss the species at occupied sites and p is not so well estimated, which increases var(ψ̂).
48 How to allocate my survey effort? Guidelines can be obtained by looking at the estimator properties for the constant occupancy model (no covariates), based on a large-sample (asymptotic) assumption. The optimal allocation depends on the actual ψ and p: need some estimates for those! Sensitivity analysis can be a good idea. Also need: an indication of the maximum number of surveys that can be conducted; the level of acceptable precision (part of the objective).
49 How to allocate my survey effort? We want a good estimate of ψ... so let's look at the estimator variance for different designs (s, k). Assume a standard survey design (i.e. s sites visited k times). For this model there is an explicit expression:
$$\operatorname{var}(\hat{\psi}) = \frac{\psi}{s}\left[(1-\psi) + \frac{1-p^*}{p^* - kp(1-p)^{k-1}}\right], \qquad p^* = 1-(1-p)^k$$
The first term, $(1-\psi)$, comes from the binomial experiment; the second term is the extra variance due to imperfect detection; $p^*$ is Pr(detection in at least one visit).
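The variance expression is straightforward to evaluate. The Python sketch below is one way to code it (function names are illustrative) and compares a design under imperfect detection with the same design under perfect detection, where only the binomial term remains. It assumes k ≥ 2, since the detection term is undefined for a single visit.

```python
def p_star(p, k):
    """Probability of detecting the species at least once in k visits."""
    return 1.0 - (1.0 - p) ** k

def asymptotic_var_psi(psi, p, s, k):
    """Asymptotic variance of the occupancy estimator for s sites and k >= 2 visits."""
    ps = p_star(p, k)
    detection_term = (1.0 - ps) / (ps - k * p * (1.0 - p) ** (k - 1))
    return (psi / s) * ((1.0 - psi) + detection_term)

# Same number of sites and visits, with imperfect and with perfect detection.
var_design = asymptotic_var_psi(psi=0.5, p=0.3, s=30, k=3)
var_perfect = asymptotic_var_psi(psi=0.5, p=1.0, s=30, k=3)

print(f"imperfect detection: {var_design:.4f}")
print(f"perfect detection:   {var_perfect:.4f}")
```

With p = 1 the detection term vanishes and the result reduces to the familiar binomial variance ψ(1 - ψ)/s.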
50 How to allocate my survey effort? Fixed total effort E = s × k. [Figure: asymptotic variance of ψ̂ against number of replicates (k), for E = 2000 and several combinations of ψ and p]
51 How to allocate my survey effort? Note that the optimal replication (k) depends only on ψ and p, not on the total effort E. [Figure: asymptotic variance of ψ̂ against number of replicates, one panel for E = 2000 and one for E = 1000]
52 How to allocate my survey effort? [Table: optimal k as a function of ψ and p] Rare species: more sites, surveyed less intensively. Common species: fewer sites, surveyed more intensively (MacKenzie & Royle 2005).
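The optimal replication can be found numerically by scanning k for a fixed total effort. In the Python sketch below, s = E / k is allowed to be fractional for simplicity, and the search range for k is an arbitrary choice. Under those assumptions the variance is proportional to 1/E, so the minimising k should not change with E.

```python
def asymptotic_var_psi(psi, p, s, k):
    """Asymptotic variance of the occupancy estimator for s sites and k >= 2 visits."""
    ps = 1.0 - (1.0 - p) ** k
    detection_term = (1.0 - ps) / (ps - k * p * (1.0 - p) ** (k - 1))
    return (psi / s) * ((1.0 - psi) + detection_term)

def best_k(psi, p, effort, k_max=20):
    """Number of visits (2..k_max) minimising var(psi-hat) when s = effort / k."""
    return min(range(2, k_max + 1),
               key=lambda k: asymptotic_var_psi(psi, p, effort / k, k))

# Compare the optimum for two total efforts and a few (psi, p) combinations.
for psi, p in [(0.4, 0.3), (0.6, 0.3), (0.7, 0.3), (0.4, 0.6)]:
    print(psi, p, best_k(psi, p, 2000), best_k(psi, p, 1000))
```

Holding p fixed, the optimum grows with ψ, which is the "common species: fewer sites, surveyed more intensively" pattern.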
53 How to allocate my survey effort? We have assumed a constant cost per survey (E = k × s), but other scenarios are possible, e.g. repeat surveys at a site could be cheaper than the first. MacKenzie & Royle (2005) explore different cost scenarios: results on optimal replication are reasonably robust to the effect of cost. We used the precision of the occupancy estimator as the design criterion. Guillera-Arroita et al. (2010b) explore other optimality criteria, incorporating the precision of the detection probability estimator: broadly the same patterns arise.
54 How to allocate my survey effort? Once k is chosen, what about s? That depends on the precision we want to achieve! Study constraints: maximum survey effort available; minimum estimator precision. Two approaches: A) best estimator with the given available effort; B) good-enough estimator with the minimum effort possible.
55 How to allocate my survey effort? Approach A: best estimator with the given available effort. Use all the effort: $s = E/k$. Check: do we achieve the minimum precision with this design?
$$\operatorname{var}(\hat{\psi}) = \frac{\psi}{s}\left[(1-\psi) + \frac{1-p^*}{p^* - kp(1-p)^{k-1}}\right], \qquad p^* = 1-(1-p)^k$$
56 How to allocate my survey effort? Approach B: good-enough estimator with minimum effort. Derive s given ψ, p and k from the variance expression:
$$s = \frac{\psi}{\operatorname{var}(\hat{\psi})}\left[(1-\psi) + \frac{1-p^*}{p^* - kp(1-p)^{k-1}}\right], \qquad p^* = 1-(1-p)^k$$
Check: is the design (s, k) in line with the maximum effort available?
57 How to allocate my survey effort? e.g. we expect ψ ≈ 0.6, p ≈ 0.3 and want to achieve an SE of 0.05. Optimal k = 6 replicates per site (p* = 0.88).
58 How to allocate my survey effort? e.g. we expect ψ ≈ 0.6, p ≈ 0.3 and want to achieve an SE of 0.05. Optimal k = 6 replicates per site (p* = 0.88). The number of sites follows from
$$s = \frac{\psi}{\operatorname{var}(\hat{\psi})}\left[(1-\psi) + \frac{1-p^*}{p^* - kp(1-p)^{k-1}}\right]$$
and the total effort from E = k × s.
59 How to allocate my survey effort? e.g. we expect ψ ≈ 0.6, p ≈ 0.3 and want to achieve an SE of 0.05. Optimal k = 6 replicates per site (p* = 0.88). The same expression gives s = 146, with E = k × s. For comparison, with perfect detection (p = 1) the expression reduces to $s = \psi(1-\psi)/\operatorname{var}(\hat{\psi})$.
60 How to allocate my survey effort? e.g. we expect ψ ≈ 0.6, p ≈ 0.3 and want to achieve an SE of 0.05. What about k = 3 replicates per site? (p* = 0.66)
61 How to allocate my survey effort? e.g. we expect ψ ≈ 0.6, p ≈ 0.3 and want to achieve an SE of 0.05. What about k = 3 replicates per site? (p* = 0.66) The number of sites again follows from
$$s = \frac{\psi}{\operatorname{var}(\hat{\psi})}\left[(1-\psi) + \frac{1-p^*}{p^* - kp(1-p)^{k-1}}\right]$$
and the total effort from E = k × s.
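The worked example can be reproduced by solving the variance expression for s. The Python sketch below (function name illustrative) does this for k = 6 and k = 3 with ψ = 0.6, p = 0.3 and a target SE of 0.05. It uses the unrounded p*, so the k = 6 result may differ by a site or so from the 146 quoted on the slide, which appears consistent with rounding p* to 0.88 before substituting.

```python
import math

def sites_needed(psi, p, k, target_se):
    """Number of sites giving SE(psi-hat) = target_se under the asymptotic formula."""
    ps = 1.0 - (1.0 - p) ** k
    detection_term = (1.0 - ps) / (ps - k * p * (1.0 - p) ** (k - 1))
    return (psi / target_se ** 2) * ((1.0 - psi) + detection_term)

for k in (6, 3):
    s = math.ceil(sites_needed(0.6, 0.3, k, 0.05))
    print(f"k = {k}: s = {s} sites, E = k x s = {k * s} surveys")
```

Comparing the two totals shows how much extra effort the sub-optimal k = 3 costs for the same precision.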
62 Non-standard designs. Other designs explored by MacKenzie & Royle (2005). Double sampling design: repeated surveys at a subset of sites, the rest only surveyed once. In general not more efficient. Removal design: stop surveying at a site after the first detection. Repetition helps i) establish occupancy status and ii) estimate p. Can be slightly more efficient (especially with high p and high ψ), but provides less flexibility for modelling and may be less robust to heterogeneity in p.
63 Further considerations. Simulations as a tool for design: the expressions and tables shown are based on approximations, which may break down if the sample size is small; they provide useful guidance, but it is recommended to verify the properties of the chosen design via simulations. Designing requires some idea about the parameter values: pilot studies can provide helpful information. An optimal design is not necessarily a robust design, e.g. what if there is heterogeneity in p? In that case a larger number of replicates may be better.
64 An R-script to evaluate the standard ψ(.)p(.). Call:
source("occdesign1sp.r") # only needed once
myres <- evaldesign(psi=0.5, p=0.3, s=30, k=3, nits=10000, doprint=1, doplot=1)
Output: information on estimator properties (bias, variance and MSE) and an intuitive plot showing the distribution of the estimator (Guillera-Arroita et al. 2010b).
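The idea behind such a simulation can be sketched independently in Python. This is not the occdesign1sp.r script and makes no claim to match its output. It simulates detection histories for a standard design, computes the maximum-likelihood estimate of ψ for the constant model from two summaries (number of sites with detections, total detections), and reports bias, variance, MSE and the share of estimates on the ψ̂ = 1 boundary. The estimator is my understanding of the constant-model solution: a zero-truncated binomial fit for p, falling back to ψ̂ = 1 when the interior solution exceeds one.

```python
import random
import statistics

def mle_constant_model(s, k, s_d, d):
    """MLE of (psi, p) from s_d sites with detections and d total detections."""
    if s_d == 0:
        return 0.0, float("nan")      # no detections: psi-hat = 0, p not estimable
    mean_det = d / s_d                # mean detections at sites with >= 1 detection
    if mean_det > 1.0:
        lo, hi = 1e-9, 1.0            # solve k p / (1 - (1 - p)^k) = mean_det
        for _ in range(60):
            mid = (lo + hi) / 2.0
            if k * mid / (1.0 - (1.0 - mid) ** k) < mean_det:
                lo = mid
            else:
                hi = mid
        p_hat = (lo + hi) / 2.0
        psi_hat = (s_d / s) / (1.0 - (1.0 - p_hat) ** k)
        if psi_hat <= 1.0:
            return psi_hat, p_hat
    return 1.0, d / (s * k)           # boundary estimate: psi-hat = 1

def simulate_design(psi, p, s, k, nits, seed=1):
    """Simulate nits data sets and return the psi estimates."""
    rng = random.Random(seed)
    estimates = []
    for _rep in range(nits):
        s_d = d = 0
        for _site in range(s):
            if rng.random() < psi:    # site is occupied
                hits = sum(rng.random() < p for _ in range(k))
                d += hits
                s_d += hits > 0
        estimates.append(mle_constant_model(s, k, s_d, d)[0])
    return estimates

est = simulate_design(psi=0.5, p=0.3, s=30, k=3, nits=2000)
bias = statistics.mean(est) - 0.5
var = statistics.pvariance(est)
boundary = sum(e == 1.0 for e in est) / len(est)
print(f"bias = {bias:.4f}, var = {var:.4f}, MSE = {var + bias ** 2:.4f}, "
      f"boundary = {boundary:.1%}")
```

Rerunning with more sites or a higher p should pull the simulated variance towards the asymptotic value and reduce the share of boundary estimates.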
65 An R-script to evaluate the standard ψ(.)p(.): ψ=0.8, p=0.7, s=100, k=4. [Figure: simulated estimates, ψ-hat against p-hat, with simulation Var, MSE and SE and the asymptotic Var] Boundary = 0%
66 An R-script to evaluate the standard ψ(.)p(.): ψ=0.5, p=0.7, s=100, k=4. [Figure: simulated estimates, ψ-hat against p-hat, with simulation Var, MSE and SE and the asymptotic Var] Boundary = 0%
67 An R-script to evaluate the standard ψ(.)p(.): ψ=0.5, p=0.3, s=100, k=4. [Figure: simulated estimates, ψ-hat against p-hat, with simulation Var, MSE and SE and the asymptotic Var] Boundary = 0%
68 An R-script to evaluate the standard ψ(.)p(.): ψ=0.5, p=0.3, s=30, k=4. [Figure: simulated estimates, ψ-hat against p-hat, with simulation Var, MSE and SE and the asymptotic Var] Boundary = 2%
69 An R-script to evaluate the standard ψ(.)p(.): ψ=0.2, p=0.3, s=30, k=4. [Figure: simulated estimates, ψ-hat against p-hat, with simulation Var, MSE and SE and the asymptotic Var] Boundary = 11%
70 An R-script to evaluate the standard ψ(.)p(.): ψ=0.2, p=0.3, s=30, k=3. [Figure: simulated estimates, ψ-hat against p-hat, with simulation Var, MSE and SE and the asymptotic Var] Boundary = 18%
71 An R-script to evaluate the standard ψ(.)p(.). [Figure: two panels of simulated estimates, ψ-hat against p-hat, each with Var, MSE and SE: ψ=0.5, p=0.7, s=100, k=4 and ψ=0.5, p=0.7, s=200]
72 GENPRES: a tool for occupancy survey design
73 Survey design exercise. Using simulations for survey design for the rolling giraffe (Neogiraffa rotatoria). Survey: visit to a 1 ha plot. Six plot visits can be carried out per day. Total survey effort allocated to the study: 60 days. Based on findings of a previous similar study, we assume plausible values for ψ and p (the solutions consider ψ = 0.3 and 0.4, and p = 0.4, 0.5 and 0.6).
74 Survey design exercise (solutions). Variance of ψ̂ based on simulations (10000 simulations), for total effort E = 6 × 60 = 360. [Table: var(ψ̂) for ψ = 0.3 and ψ = 0.4 (columns) by p = 0.4, 0.5, 0.6 and k = 2, 3, 4 with the corresponding s (rows)]
75 Survey design exercise (solutions). k = 2 is not so good; k = 3 or 4 is a better compromise. If we choose k = 4, the most restrictive case (highest variance) is ψ = 0.4, p = 0.4. Its variance is higher than the target (SE(ψ̂) = 0.05, i.e. var(ψ̂) = 0.0025). If we want to achieve that target with this setup (ψ, p, k), we'll need s = 135 sites (using the formula), i.e. a total survey effort of s × k = 135 × 4 = 540 surveys, or 90 days at six plot visits per day.
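The exercise can be approximated with the asymptotic variance formula instead of simulations. The Python sketch below tabulates var(ψ̂) for the candidate designs with E = 360 surveys and then solves for the number of sites needed in the most restrictive case. These are asymptotic values, so they will not match simulation-based figures exactly, and the function name is illustrative.

```python
def asymptotic_var_psi(psi, p, s, k):
    """Asymptotic variance of the occupancy estimator for s sites and k >= 2 visits."""
    ps = 1.0 - (1.0 - p) ** k
    detection_term = (1.0 - ps) / (ps - k * p * (1.0 - p) ** (k - 1))
    return (psi / s) * ((1.0 - psi) + detection_term)

effort = 6 * 60  # six plot visits per day for 60 days

# Asymptotic variance for each candidate design, columns psi = 0.3 and 0.4.
for p in (0.4, 0.5, 0.6):
    for k in (2, 3, 4):
        s = effort // k
        row = [asymptotic_var_psi(psi, p, s, k) for psi in (0.3, 0.4)]
        print(f"p={p} k={k} s={s}  " + "  ".join(f"{v:.4f}" for v in row))

# Most restrictive case for k = 4, and sites needed to reach SE = 0.05.
target_var = 0.05 ** 2
worst = asymptotic_var_psi(0.4, 0.4, effort // 4, 4)
s_needed = asymptotic_var_psi(0.4, 0.4, 1, 4) / target_var
print(f"worst-case variance = {worst:.4f}, sites needed = {s_needed:.1f}")
```

Because the variance is inversely proportional to s, evaluating the formula at s = 1 and dividing by the target variance gives the required number of sites directly.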
Occupancy models. Gurutzeta Guillera-Arroita University of Kent, UK National Centre for Statistical Ecology
Occupancy models Gurutzeta Guillera-Arroita University of Kent, UK National Centre for Statistical Ecology Advances in Species distribution modelling in ecological studies and conservation Pavia and Gran
More informationOccupancy models. Gurutzeta Guillera-Arroita University of Kent, UK National Centre for Statistical Ecology
Occupancy models Gurutzeta Guillera-Arroita University of Kent, UK National Centre for Statistical Ecology Advances in Species distribution modelling in ecological studies and conservation Pavia and Gran
More informationRepresent processes and observations that span multiple levels (aka multi level models) R 2
Hierarchical models Hierarchical models Represent processes and observations that span multiple levels (aka multi level models) R 1 R 2 R 3 N 1 N 2 N 3 N 4 N 5 N 6 N 7 N 8 N 9 N i = true abundance on a
More informationFour aspects of a sampling strategy necessary to make accurate and precise inferences about populations are:
Why Sample? Often researchers are interested in answering questions about a particular population. They might be interested in the density, species richness, or specific life history parameters such as
More informationChapter 1 Statistical Inference
Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations
More informationIntroduction to Occupancy Models. Jan 8, 2016 AEC 501 Nathan J. Hostetter
Introduction to Occupancy Models Jan 8, 2016 AEC 501 Nathan J. Hostetter njhostet@ncsu.edu 1 Occupancy Abundance often most interesting variable when analyzing a population Occupancy probability that a
More informationLectures 5 & 6: Hypothesis Testing
Lectures 5 & 6: Hypothesis Testing in which you learn to apply the concept of statistical significance to OLS estimates, learn the concept of t values, how to use them in regression work and come across
More informationTo hear the seminar, dial (605) , access code
Welcome to the Seminar Resource Selection Functions and Patch Occupancy Models: Similarities and Differences Lyman McDonald Senior Biometrician WEST, Inc. Cheyenne, Wyoming and Laramie, Wyoming lmcdonald@west-inc.com
More informationSTATISTICS 174: APPLIED STATISTICS FINAL EXAM DECEMBER 10, 2002
Time allowed: 3 HOURS. STATISTICS 174: APPLIED STATISTICS FINAL EXAM DECEMBER 10, 2002 This is an open book exam: all course notes and the text are allowed, and you are expected to use your own calculator.
More informationApproach to Field Research Data Generation and Field Logistics Part 1. Road Map 8/26/2016
Approach to Field Research Data Generation and Field Logistics Part 1 Lecture 3 AEC 460 Road Map How we do ecology Part 1 Recap Types of data Sampling abundance and density methods Part 2 Sampling design
More informationLecture 09 - Patch Occupancy and Patch Dynamics
WILD 7970 - Analysis of Wildlife Populations 1 of 11 Lecture 09 - Patch Occupancy and Patch Dynamics Resources Site Occupancy D. I. MacKenzie, J. D. Nichols, G. D. Lachman, S. Droege, J. A. Royle, and
More informationIncorporating Boosted Regression Trees into Ecological Latent Variable Models
Incorporating Boosted Regression Trees into Ecological Latent Variable Models Rebecca A. Hutchinson, Li-Ping Liu, Thomas G. Dietterich School of EECS, Oregon State University Motivation Species Distribution
More information9/2/2010. Wildlife Management is a very quantitative field of study. throughout this course and throughout your career.
Introduction to Data and Analysis Wildlife Management is a very quantitative field of study Results from studies will be used throughout this course and throughout your career. Sampling design influences
More informationIMPROVING INFERENCES IN POPULATION STUDIES OF RARE SPECIES THAT ARE DETECTED IMPERFECTLY
Ecology, 86(5), 2005, pp. 1101 1113 2005 by the Ecological Society of America IMPROVING INFERENCES IN POPULATION STUDIES OF RARE SPECIES THAT ARE DETECTED IMPERFECTLY DARRYL I. MACKENZIE, 1,6 JAMES. D.NICHOLS,
More informationLecture 3: Just a little more math
Lecture 3: Just a little more math Last time Through simple algebra and some facts about sums of normal random variables, we derived some basic results about orthogonal regression We used as our major
More informationParametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1
Parametric Modelling of Over-dispersed Count Data Part III / MMath (Applied Statistics) 1 Introduction Poisson regression is the de facto approach for handling count data What happens then when Poisson
More informationEXERCISE 8: REPEATED COUNT MODEL (ROYLE) In collaboration with Heather McKenney
EXERCISE 8: REPEATED COUNT MODEL (ROYLE) In collaboration with Heather McKenney University of Vermont, Rubenstein School of Environment and Natural Resources Please cite this work as: Donovan, T. M. and
More informationBryan F.J. Manly and Andrew Merrill Western EcoSystems Technology Inc. Laramie and Cheyenne, Wyoming. Contents. 1. Introduction...
Comments on Statistical Aspects of the U.S. Fish and Wildlife Service's Modeling Framework for the Proposed Revision of Critical Habitat for the Northern Spotted Owl. Bryan F.J. Manly and Andrew Merrill
More informationChi Square Analysis M&M Statistics. Name Period Date
Chi Square Analysis M&M Statistics Name Period Date Have you ever wondered why the package of M&Ms you just bought never seems to have enough of your favorite color? Or, why is it that you always seem
More informationInference Methods for the Conditional Logistic Regression Model with Longitudinal Data Arising from Animal Habitat Selection Studies
Inference Methods for the Conditional Logistic Regression Model with Longitudinal Data Arising from Animal Habitat Selection Studies Thierry Duchesne 1 (Thierry.Duchesne@mat.ulaval.ca) with Radu Craiu,
More informationModule 03 Lecture 14 Inferential Statistics ANOVA and TOI
Introduction of Data Analytics Prof. Nandan Sudarsanam and Prof. B Ravindran Department of Management Studies and Department of Computer Science and Engineering Indian Institute of Technology, Madras Module
More informationBiometrics Unit and Surveys. North Metro Area Office C West Broadway Forest Lake, Minnesota (651)
Biometrics Unit and Surveys North Metro Area Office 5463 - C West Broadway Forest Lake, Minnesota 55025 (651) 296-5200 QUANTIFYING THE EFFECT OF HABITAT AVAILABILITY ON SPECIES DISTRIBUTIONS 1 Geert Aarts
More informationAn Introduction to Path Analysis
An Introduction to Path Analysis PRE 905: Multivariate Analysis Lecture 10: April 15, 2014 PRE 905: Lecture 10 Path Analysis Today s Lecture Path analysis starting with multivariate regression then arriving
More informationStatistical inference
Statistical inference Contents 1. Main definitions 2. Estimation 3. Testing L. Trapani MSc Induction - Statistical inference 1 1 Introduction: definition and preliminary theory In this chapter, we shall
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7
MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 1 Random Vectors Let a 0 and y be n 1 vectors, and let A be an n n matrix. Here, a 0 and A are non-random, whereas y is
More informationIntroduction to Econometrics. Heteroskedasticity
Introduction to Econometrics Introduction Heteroskedasticity When the variance of the errors changes across segments of the population, where the segments are determined by different values for the explanatory
More informationAn Introduction to Mplus and Path Analysis
An Introduction to Mplus and Path Analysis PSYC 943: Fundamentals of Multivariate Modeling Lecture 10: October 30, 2013 PSYC 943: Lecture 10 Today s Lecture Path analysis starting with multivariate regression
More informationDarryl I. MacKenzie 1
Aust. N. Z. J. Stat. 47(1), 2005, 65 74 WAS IT THERE? DEALING WITH IMPERFECT DETECTION FOR SPECIES PRESENCE/ABSENCE DATA Darryl I. MacKenzie 1 Proteus Wildlife Research Consultants Summary Species presence/absence
More informationQuantitative Methods Geography 441. Course Requirements
Quantitative Methods Geography 441 Course Requirements Equipment: 1. Calculator with statistical functions 2. Three-ring binder 3. A thumb dive 4. Textbook: Statistical Methods for Geography. By Peter
More informationStatistical Analysis of List Experiments
Statistical Analysis of List Experiments Graeme Blair Kosuke Imai Princeton University December 17, 2010 Blair and Imai (Princeton) List Experiments Political Methodology Seminar 1 / 32 Motivation Surveys
More informationMohammed. Research in Pharmacoepidemiology National School of Pharmacy, University of Otago
Mohammed Research in Pharmacoepidemiology (RIPE) @ National School of Pharmacy, University of Otago What is zero inflation? Suppose you want to study hippos and the effect of habitat variables on their
More informationy response variable x 1, x 2,, x k -- a set of explanatory variables
11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate
More informationInstrumental Variables and the Problem of Endogeneity
Instrumental Variables and the Problem of Endogeneity September 15, 2015 1 / 38 Exogeneity: Important Assumption of OLS In a standard OLS framework, y = xβ + ɛ (1) and for unbiasedness we need E[x ɛ] =
More informationSampling. General introduction to sampling methods in epidemiology and some applications to food microbiology study October Hanoi
Sampling General introduction to sampling methods in epidemiology and some applications to food microbiology study October 2006 - Hanoi Stéphanie Desvaux, François Roger, Sophie Molia CIRAD Research Unit
More informationdf=degrees of freedom = n - 1
One sample t-test test of the mean Assumptions: Independent, random samples Approximately normal distribution (from intro class: σ is unknown, need to calculate and use s (sample standard deviation)) Hypotheses:
More informationClass Notes: Week 8. Probit versus Logit Link Functions and Count Data
Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While
More informationLISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014
LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R Liang (Sally) Shan Nov. 4, 2014 L Laboratory for Interdisciplinary Statistical Analysis LISA helps VT researchers
More informationLab #12: Exam 3 Review Key
Psychological Statistics Practice Lab#1 Dr. M. Plonsky Page 1 of 7 Lab #1: Exam 3 Review Key 1) a. Probability - Refers to the likelihood that an event will occur. Ranges from 0 to 1. b. Sampling Distribution
More informationAnnouncements. Unit 3: Foundations for inference Lecture 3: Decision errors, significance levels, sample size, and power.
Announcements Announcements Unit 3: Foundations for inference Lecture 3:, significance levels, sample size, and power Statistics 101 Mine Çetinkaya-Rundel October 1, 2013 Project proposal due 5pm on Friday,
More informationBrett Skelly, Katharine Lewis, Reina Tyl, Gordon Dimmig & Christopher Rota West Virginia University
CHAPTER 22 Occupancy models multi-species Brett Skelly, Katharine Lewis, Reina Tyl, Gordon Dimmig & Christopher Rota West Virginia University Ecological communities are composed of multiple interacting