Dynamic Inverse Prediction and Sensitivity Analysis With High-Dimensional Responses: Application to Climate-Change Vulnerability of Biodiversity


Supplementary materials for this article are available at /s

James S. CLARK, David M. BELL, Matthew KWIT, Amanda POWELL, and Kai ZHU

Sensitivity analysis (SA) of environmental models is inefficient when there are large numbers of inputs and outputs and interactions cannot be directly linked to input variables. Traditional SA is based on coefficients relating the importance of an input to an output response, generating as many as one coefficient for each combination of model input and output. In many environmental models multiple outputs are part of an integrated response that should be considered synthetically, rather than by separate coefficients for each output. For example, there may be interactions between output variables that cannot be defined by standard interaction terms for input variables. We describe dynamic inverse prediction (DIP), a synthetic approach to SA that quantifies how inputs affect the combined (multivariate) output. We distinguish input interactions (specified as a traditional product of input variables) from output interactions (relationships between outputs not directly linked to inputs). Both contribute to traditional SA coefficients and DIP in ways that permit interpretation of unexpected model results. An application of broad and timely interest, anticipating effects of climate change on biodiversity, illustrates how DIP helps to quantify the important input variables and the role of interactions. Climate affects individual trees in competition with neighboring trees, but interest lies at the scale of species and landscapes. Responses of individuals to climate and competition for resources involve a number of output variables, such as birth rates, growth, and mortality. They are all components of individual health, and, through allocation constraints, they interact in ways that cannot be linked to observed inputs. We show how prior dependence is introduced to aid interpretation of inputs in the context of ecological resource modeling. We further demonstrate that a new approach to multiplicity (multiple-testing) correction can be implemented in such models to filter through the large number of input combinations. DIP provides a synthetic index of important inputs, including climate vulnerability in the context of competition for light and soil moisture, based on the full (multivariate) response. By aggregating in specific ways (over individuals, years, and other input variables) we provide ways to summarize and rank species in terms of their vulnerability to climate change. This article has supplementary material online.

James S. Clark is Professor (jimclark@duke.edu), David M. Bell is Graduate Student, Matthew Kwit is Graduate Student, Amanda Powell is Graduate Student, and Kai Zhu is Graduate Student, Nicholas School of the Environment, Department of Biology, and Department of Statistical Science, Duke University, Durham, NC 27708, USA.

International Biometric Society. Journal of Agricultural, Biological, and Environmental Statistics, Volume 18, Number 3, Pages DOI: /s

Key Words: Biodiversity; Climate change; Forest dynamics; Hierarchical models; Interactions; Model selection; Multiple testing; Risk analysis.

1. INTRODUCTION

Sensitivity analysis (SA) is an important element of environmental modeling. SA is used to quantify how uncertainty in parameter estimates affects uncertainty in predictions (Fieberg and Jenkins 2005), to identify the parameters that most influence predictions (de Kroon, van Groenendael, and Ehrlén 2000), and to determine the input variables and feedbacks that have large effects on response variables (Schwarz 2011).

Complexity can preclude effective SA for at least two reasons. First, a large number of sensitivity coefficients would be needed to fully explore a model with multiple outputs. SA of $q \in \{1,\dots,Q\}$ input variables that influence each of $r \in \{1,\dots,R\}$ response variables could require a minimum of $Q \times R$ coefficients for all input-output combinations. Second, when models include feedbacks and interactions that are themselves of interest, SA can be especially complex. For example, in climate models absorption of solar radiation increases cloudiness, which reflects incoming radiation back to space (Schwarz 2011). This feedback results from an interaction between an input and the current system state. In models of climate and biodiversity, species abundances change due to feedbacks between individual organisms; thus, SA could be profitable at this individual scale. Quantifying the variability in sensitivities to climate among individuals and over time could shed light on how environmental fluctuations impact population health. However, when SA varies between individuals and within individuals over time, complexity can be daunting. Hierarchical modeling allows inference on such high-dimensional problems (Wikle et al. 2001; Gelfand et al. 2005; Cressie et al. 2009), but it introduces another challenge: how to effectively summarize so many relationships, including interactions.

Here we introduce a new approach to evaluate inputs and outputs in large environmental models, based on dynamic prediction of a multivariate response vector. Advantages include inference at the appropriate scale, but in a simpler and more synthetic way than traditional SA. Our approach accommodates not only interactions that can be directly parameterized as combinations of input variables, indicated by the index $qq'$, but also those that arise internally, where one response variable $r$ depends on another response variable $r'$ in ways that cannot be parameterized through inputs. We apply it to one of the most extensive, long-term forest data sets that includes both experimental manipulation and natural variation for 30,000 individual trees tracked from 10 to 20 years for >300,000 tree-years. The data set is unique in three ways: (1) it provides annual resolution, (2) it tracks multiple demographic response variables, and (3) it includes experimental manipulation. This data set contrasts with previously published forest plot data that are purely observational, that record only tree size as a response, that lack manipulation or monitoring of important covariates, and/or that provide temporal resolution too coarse (e.g., five-year frequency) to allow effective climate analysis.

Our analysis includes three elements. First, we motivate SA at the scale of individual organisms, aided by the concepts of input and output interactions. Second, we introduce dynamic inverse prediction (DIP), an integrative approach to SA that is more synthetic and less complex. Finally, we show that a recent innovation for variable selection involving multiple comparisons in high-dimensional models (Scott and Berger 2010) applies well in this setting. The example of biodiversity and climate change is used to illustrate throughout.

2. BACKGROUND MOTIVATION

Biodiversity response to climate change provides a challenging and important application of our approach. Climate changes occurring now pose fundamental questions about future biodiversity (e.g., Hillyer and Silman 2010; Ettinger, Ford, and HilleRisLambers 2011; Clark et al. 2011b; Zhu, Woodall, and Clark 2012). How can models allow for the combination of health responses to a combination of resources and climate? Can models help identify the species that will be strongly affected by climate change in the context of competition for resources? Can they help us anticipate where species will find refuge, given that the best sites will experience the strongest competition?

Effective biodiversity risk assessment should consider the dynamic multivariate and interactive effects of climate variation at the scales at which processes occur: individuals responding to seasonal variation. Current models used to predict biodiversity responses to climate change rely primarily on spatial correlations between regional abundance of species and regional climate. However, climate does not act on species; rather, weather affects individual trees. Climate changes, the aggregate of changes in weather, include increased growing season length and summer drought.
The effects of these changes depend on competition, predation, and disease for individuals of different species and sizes. Water use and light interception by competing neighbors and topographic variation in moisture at fine spatial scales are not resolved in current models. Furthermore, the effects of input variables interact: moisture is best exploited by individuals with access to high light, light capture depends on temperature, and so forth.

Outputs may also be complex. The response, individual health, is multivariate, involving not only growth, but also fecundity and survival (Welp, Randerson, and Liu 2007; Souza et al. 2008; Granier et al. 2008; Valladares et al. 2008; Valladares and Pearcy 2002; Clark et al. 2011a, 2011b). In biodiversity studies a response vector could be as large as all of the species in a region that depend on environmental variation, all of the individuals in a population as they jointly respond to the environment and one another, or all of the variables measured on an individual that help describe its response to weather. In each of these cases interest may focus on the full (multivariate) response as opposed to the individual elements of a response vector.

In our application responses occur at the individual scale, but, as is typical in many such studies, interest focuses on aggregate quantities, such as population distribution and abundance in relation to climate. In other words, the processes that control patterns at the level of interest (species and climate) operate at the scale of individual responses to weather. Our approach is motivated by the need to infer how individuals of different species differ in their relationships with climate, building from long-term health of individuals exposed to natural and experimental variation in risk factors. We describe advantages to a new perspective on the dynamic inverse problem of an organism's multivariate response.

2.1. MOTIVATION FOR DYNAMIC INVERSE PREDICTION

Two problems that arise in sensitivity analysis of environmental models are (i) the effect of input variable $q$ on output variable $r$ is a single coefficient extracted from a large fitted model, and (ii) there are too many combinations to interpret and no obvious way to rank them in terms of importance. Consider output vector $y$ with elements $\{y_r : r = 1,\dots,R\}$ and input vector $x$ with elements $\{x_q : q = 1,\dots,Q\}$. Sensitivity $s(r,q) = dy_r/dx_q$ could be as simple as a regression coefficient. But there are many of them, $Q \times R$, so which do we focus on? Is $q$ important if $s(r,q) > s(r,q')$? That depends on one's view of the importance of $r$ vs. $r'$. Which is more diagnostic under the range of conditions that can occur? If we view the components of $y$ as part of an integrated response it may be difficult to weight their contributions. How do we reduce the dimensionality of this sensitivity analysis? If we invert the problem to a predictive distribution $x_q \mid y, \dots$, then both problems are addressed. Based on a fitted model the full response vector $y$ assigns a score to each $x_q$, weighted by the full fitted model.
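To make the counting argument concrete, the following is a minimal sketch, assuming a plain multivariate linear regression fitted by least squares. The sizes, the simulated data, and the helper name `sensitivity_matrix` are hypothetical and chosen only for illustration; they are not taken from the article's analysis.

```python
import numpy as np

def sensitivity_matrix(X, Y):
    """Least-squares estimate of the Q x R coefficient matrix A in Y ~ X A.

    For a linear model, entry [q, r] plays the role of s(r, q) = dy_r / dx_q.
    """
    A_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return A_hat

rng = np.random.default_rng(0)       # fixed seed, illustration only
n, Q, R = 200, 4, 10                 # hypothetical sizes
X = rng.normal(size=(n, Q))
A_true = rng.normal(size=(Q, R))
Y = X @ A_true + rng.normal(scale=0.1, size=(n, R))

S = sensitivity_matrix(X, Y)
print(S.shape)   # one coefficient per input-output pair
print(S.size)    # Q * R coefficients that would need to be interpreted
```

Even in this small case there are 40 coefficients to compare, with no built-in rule for weighting one response column against another.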
Unlike traditional SA, which relies on coefficients for each combination of input variable $q$ and output or response variable $r$, dynamic inverse prediction (DIP) is synthetic: an input variable $q$ is evaluated based on its effect on the full response vector $y$, rather than each response variable individually. For the application considered here, the response vector for individual $i$ in year $t+1$, $y_{i,t+1}$, contains fecundity, represented here by index $r$, and growth, represented by index $r'$. The influence of input variables in vector $x_{i,t}$ can be most directly assessed from the capacity of the individual $i$ in year $t+1$ to predict these inputs, based on the full response in the length-$R$ vector $y_{i,t+1}$. DIP reduces the analysis from $Q \times R$ sensitivity coefficients to no more than $Q$ predictive distributions. DIP is not only synthetic, it is less complex. It is dynamic in this application, because we are concerned with prediction over time: the prediction changes with changing input variables.

To illustrate, consider a simulated example for the linear model, omitting time $t$ for the moment,

$$y_i \sim N(x_i A, \Sigma),$$

with length-$R$ observation vectors $y_i$, $i = 1,\dots,n$, length-$Q$ input vectors $x_i$, and parameter values in the $Q \times R$ matrix $A$. We defer discussion of prior distributions on $A$ and $\Sigma$ to Section 4.4, but note here that there will be an additional prior density for input variables, $x_i^{(q)} \sim N(\bar{x}^{(q)}, 10)$. We specify a prior density centered on the mean values of input variables. The prior for input variables is needed to ensure that the predictive distribution of input variables is proper. Large error is introduced in covariance matrix $\Sigma$, having diagonal elements selected at random from the interval $(10, 100)$, much wider than the range of $x_i A$. Correlations that determine off-diagonal elements are in the range $(-0.2, 0.2)$. The parameter matrix $A$ has covariance $H = (X^T X)^{-1}$, where $X$ is the $n \times Q$ design matrix having rows $x_i$. There is a simulated data set of sample size $n = 30$ with $Q = 6$ input variables in $x$ and $R = 50$ response variables in vector $y_i$. Main effects represent four of the input variables, the remaining two being an intercept and an interaction term.

Figure 1. At left are predictive means and 95% intervals for simulated data with four main effects in a design matrix that also includes an intercept in position 1 and an interaction in position 6 ($x_4 x_5$, not shown). There are $R = 50$ response variables and $n = 30$ observations. The 1:1 line of agreement and horizontal line at the prior mean are shown. The 300 coefficients in $A$ for $Q = 6$ inputs are shown at lower left. Input 2 is especially informative, all values being far from zero. Inputs 3 and 4 are uninformative, having values close to zero. Input 4 interacts with input 5, which is more informative than 4. At right are histograms of scores for each observation (Equation (19)) with mean scores shown for each panel. At lower right are predictions of $y$ (95% intervals, with summaries for 10 bins).

Before considering the inverse prediction of input variables, we show the standard predictive intervals for vectors $y_i$ at lower right in Figure 1. These predictions marginalize over the posterior distribution of $A$, with additional stochasticity induced by the likelihood,

$$p(y_i \mid X, Y, \Sigma) = \int N(y_i \mid x_i A, \Sigma)\, N\big(\mathrm{vec}(A) \mid \mathrm{vec}((X^T X)^{-1} X^T Y), H\big)\, dA.$$

The notation $N(y_i \mid x_i A, \Sigma)$ indicates that the vector $y_i$ is distributed as a normal distribution with arguments being the mean vector and covariance matrix, and $Y$ is the $n \times R$ response matrix. The $n \times R = 1500$ predictive distributions for $Y$ in Figure 1 (lower right) emphasize the large errors in $\Sigma$. There is also a predictive distribution for input variable $q$ in observation $i$,

$$p(x_i^{(q)} \mid X, Y, \Sigma) \propto \int p(y_i \mid x_i^{(q)}, A, \Sigma)\, N\big(\mathrm{vec}(A) \mid \mathrm{vec}((X^T X)^{-1} X^T Y), H\big)\, p(x_i^{(q)})\, dA.$$

Figure 1 shows 95% predictive intervals for input variable $q$ in each observation $i$. Input variables are identified with the notation $x_{\cdot,q}$ for all $n$ observations in the $q$th column of $X$. In this example input $x_{\cdot,2}$ has disproportionately large effects on all responses; this can be seen as large sensitivity coefficients at lower left. The predictive intervals for $x_{\cdot,2}$ are narrow (top left) and predictive scores high (upper right; we discuss the intervals and scores in Section 4.2). Input $x_{\cdot,3}$ has average effects that are lower than other inputs by a factor of 0.1. Consequently, predictive intervals are broad, many spanning the prior mean of 0, and scores are low. Input $x_{\cdot,4}$ has average effects of the same magnitude as $x_{\cdot,3}$, but it interacts with $x_{\cdot,5}$, which has a larger effect than $x_{\cdot,3}$. The predictions for $x_{\cdot,4}$ and $x_{\cdot,5}$ are intermediate between important $x_{\cdot,2}$ and unimportant $x_{\cdot,3}$.

The predictive intervals of main effects in $X$ (left side of Figure 1) and mean scores (right side) summarize input effects in a way that could not be achieved with $Q \times R = 300$ sensitivity coefficients for a traditional SA. The 300 responses in matrix $Y$ are weighted by the model itself, and they incorporate effects of interactions. The full model is engaged in these prediction scores, rather than individual coefficients extracted from it.
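The contrast between informative and uninformative inputs can be sketched numerically. The example below is a simplified illustration, not the article's procedure: it conditions on a fixed (plug-in) $A$ and $\Sigma$ rather than integrating over the posterior of $A$, it omits interaction columns, and it treats the value 10 in the input prior as a variance. The function name `inverse_predict` and all numerical settings are hypothetical.

```python
import numpy as np

def inverse_predict(y, x, q, A, Sigma, prior_mean, prior_var):
    """Normal predictive mean and variance for input x[q] given response y.

    Assumes fixed A and Sigma, known values for the other inputs,
    and no interaction column involving input q.
    """
    a_q = A[q]                                   # length-R effect of input q
    others = np.delete(np.arange(len(x)), q)     # indices of remaining inputs
    resid = y - x[others] @ A[others]            # response left for q to explain
    Sinv_a = np.linalg.solve(Sigma, a_q)
    post_var = 1.0 / (1.0 / prior_var + a_q @ Sinv_a)
    post_mean = post_var * (prior_mean / prior_var + resid @ Sinv_a)
    return post_mean, post_var

rng = np.random.default_rng(42)                  # fixed seed, illustration only
Q, R = 3, 50
A = rng.normal(size=(Q, R))
A[2] *= 0.1                                      # input 2 has weak effects
Sigma = 25.0 * np.eye(R)                         # large, uncorrelated error
x = np.array([1.0, 0.8, -0.5])                   # intercept, strong input, weak input
y = x @ A + rng.multivariate_normal(np.zeros(R), Sigma)

m1, v1 = inverse_predict(y, x, 1, A, Sigma, prior_mean=0.0, prior_var=10.0)
m2, v2 = inverse_predict(y, x, 2, A, Sigma, prior_mean=0.0, prior_var=10.0)
print(v1, v2)   # the strong input is predicted far more precisely
```

Under these assumptions the predictive variance for the weak input stays close to the prior variance, which mirrors the broad intervals described for $x_{\cdot,3}$.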
In this example of a poorly fitting model with 300 poorly estimated regression coefficients, DIP still identifies input variables that are important in terms of their impact on the full response vectors. In many studies a number of attributes in $Y$ might be measured simply because they happen to be observable. Their diagnostic value might be unknown. The importance of a response variable may differ from one observation to the next. Inverse prediction provides an efficient means for extracting the important input variables, even in cases where many of the response variables contain limited information.

Simulation studies further show that predictions of input variables are insensitive to collinearity in $X$. Figure 2 shows 95% of prediction scores for experiments where $n = 1000$, $Q = 6$, $R = 50$, $\Sigma = \mathrm{diag}(1, R)$, and $A$ is constructed as in the previous example. A random $X$ is generated as before but now with systematic correlation introduced between $x_{\cdot,2}$ (informative) and $x_{\cdot,3}$ (non-informative), ranging upward from 0. The ranges of prediction scores result from random $X$ and the posterior distribution of $A$. Scores for informative $x_{\cdot,2}$ are uniformly high and those for $x_{\cdot,3}$ uniformly low, despite a range of collinearity between these two variables. Prediction scores for the remaining two main effects are likewise unaffected by collinearity.

Figure 2. Ninety-five percent of prediction scores for main effects in $X$ where the correlation between $x_{\cdot,2}$ and $x_{\cdot,3}$ ranges upward from 0.

DIP builds on a tradition of predicting observations based on a fitted model, typically used to check or rank models (Gelfand and Ghosh 1998) or to evaluate their predictive capacity (Gneiting and Raftery 2007). Rather than ask how well the model predicts the data, inverse prediction focuses on how well an observation, in our application an individual tree in a given year, predicts environmental inputs based on all input variables and the fitted model. In the application that follows a fitted state-space model for an imputed response vector $y_{ij,t+1}$ is inverted to predict input variables in the vector $x_{ij,t}$ (Section 4.2). In a state-space model the response vector $y_{ij,t}$ could be treated as a latent state that is not directly observed, but rather is imputed based on observations, contained in a vector $z_{ij,t}$. If the predictive mean is biased or predictive variance large then the variable has limited impact on $y_{ij,t}$. Prediction scores can be aggregated to obtain mean scores for individuals, years, and species. The approach can be viewed as an application of in-sample prediction, applied and interpreted in the context of individual health.

2.2. INTERACTIONS IN THE MODEL

In principle DIP could be applied to many environmental modeling challenges. Our application to biodiversity and climate change poses several considerations related to interactions, prior specification, and variable selection.

Input interactions are defined here as those that can be estimated as traditional interaction terms for combinations of input variables. Input interactions are positive or negative and determine whether changes in resources and temperature tend to amplify or buffer one another. Traditionally, interactive effects of resources have been termed complementary or antagonistic (Tilman 1980; Huisman and Weissing 2001; Revilla and Weissing 2008; Hall 2009). We use the terms amplifying and buffering.
A positive interaction is one where the effect of an input variable is greatest when another is abundant (Figure 3a). On the other hand, many hypothesized effects of climate change are consistent with the notion of buffering, which can be viewed as a negative interaction (Figure 3b). For example, Frelich and Reich (2010) hypothesize that moist locations will provide refuges as aridity increases in the future, a negative interaction between moisture change over time (increasing aridity) and spatial variation in moisture status. In other words, reduced moisture supply during drought has greatest impact on sites where moisture is already low (Figure 3b). This would occur if individuals on wet sites are effectively buffered from drought, responding less than those on dry sites. Competition can change these expectations; a positive interaction (Figure 3a) could result if leaf area and transpiration demand increase to fully exploit the greater moisture supply on wet sites, making them more vulnerable to drought. For resources and climate, results like Figure 3c may violate prior assumptions. To avoid outcomes like Figure 3c it is not enough to impose a prior on main effects. We specify a prior that can be used where the sign of the full effect is known (positive or negative), but the sign of the interaction is not (Section 4.4).

Figure 3. Three interaction examples from regression having positive main effects for both variables. The response surfaces grade from low (red) to high (yellow). Big arrows indicate values of variable 1 where response to variable 2 is large, and vice versa. Resource models are typically constrained to positive interactions between resource variables (a), but acknowledge negative interactions (b). Regression allows both, but is also capable of unrealistic behavior (c). Both (b) and (c) have negative interaction terms.

Our application of the term input interactions distinguishes this interaction between input variables from those that arise internally and/or cannot be attributed to combinations of input variables. Output interactions are defined here to be relationships between response variables that cannot be directly linked to input variables. Unlike input interactions, which can be specified explicitly as relationships between input variables, there will be relationships between response variables that arise from feedbacks that are not observed. We use the term output interaction to refer to relationships between outputs that cannot be specified as traditional (input) interaction terms.
In our application output interactions occur within organisms, due to allocation constraints. If allocation to fecundity comes at the cost of reduced allocation to growth, there can be a relationship (interaction) between fecundity and growth that cannot be fitted to a specific combination of input variables. Output interactions can lead to surprises if not properly identified (Section 4.3). Our approach helps to accommodate and clarify contributions from input and output interactions. We discuss how these two types of interactions are identified and how they affect sensitivity analysis in a later section.

2.3. VARIABLE SELECTION

The unusual size of the data set and number of input combinations present a model selection challenge. When many species each have different geographic distributions there are different numbers of input variables to consider for each species. Seasonal winter and spring temperatures ($w$), summer drought ($m$), local moisture status ($M$), tree size ($D$), previous growth rate ($d$), and availability of light ($C$) affect the demographic rates of 40 dominant species over 20 years. We entertain up to 1062 main effects and two-way interactions, but depending on geographic distribution, some species cannot include all of them.

Multiplicity describes the fact that the number of selected variables scales with the number of input variables that are considered. Size of the selected model can be corrected for the fact that the number of variables considered is not constant (Scott and Berger 2010). In our analysis variable selection is based on the marginal likelihood, which penalizes large models, and a model prior to address the multiplicity posed by variable combinations (Scott and Berger 2010) that differ among species. Model fitting and selection tools help to filter through many potential variables to identify those of consequence (Section 4.5).

In the sections that follow we summarize the data sets and the model (details are in Clark et al. 2010). We then introduce DIP and its relation to SA, followed by application.

3. DATA SETS

Data come from 20-year census plots located in mixed temperate forests from mid-elevation Piedmont to northern hardwoods of the southern Appalachians of North Carolina. Individuals of all tree species are tracked over time as they respond to spatiotemporal variation in climate and local competition for light and moisture (Section 4.1). For a given species there are $i = 1,\dots,n_j$ individuals on plots $j = 1,\dots,12$, modeled over $t = 1,\dots,T$ years. Response variables are demographic rates, including diameter growth $d_{ij,t}$ and fecundity potential $f_{ij,t}$, informed by observations from tree censuses, tree increment cores, remote sensing, and seed traps, using field methods detailed in Clark et al. (2010). Tree-year observations taken during censuses include tree diameter, survival status, crown class, and reproductive status. Censuses are conducted at 2- to 4-year intervals. Additional observations of growth are obtained from increment cores, which provide annual growth data. Remote sensing is used to quantify exposed canopy area (ECA) as an index of light availability.
Seed-year observations come from seed traps, collected two to five times annually. Data submodels are detailed individually for annual fecundity, growth, and mortality in Clark et al. (2010).

Input variables are restricted to climate, competition, and resources known to affect demographic responses (Table 1). Plots were selected to provide a range of climate variation (Piedmont to mountains). Tree canopies were manipulated (pulling down large trees) to provide a full range of light values (Cooper-Ellis et al. 1999; Dietze and Clark 2008; Clark et al. 2010). Exposed canopy cover $C_{ij,t}$ is an index of light availability and ranges from 0 (completely shaded by neighbors) to >100 m$^2$. Summer drought is summarized by the Palmer Drought Severity Index (PDSI) $m_{j,t}$ for June through September for site $j$ in year $t$. PDSI expresses the departure of a given year from the long-term moisture availability for the site, in this case since 1930 (Figure 4b). Spatial variation in moisture availability $M_{ij}$ is taken as the product of annual average precipitation (mm) at site $j$ and the topographic convergence index (Beven and Kirkby 1979) for the location of tree $ij$ (Figure 4c). $M_{ij}$ varies among the 12 stands due to variation in precipitation and within sites due to topography. Thus, $M_{ij}$ represents spatial variation and $m_{j,t}$ represents temporal variation, that is, how the drought index for a given growing season departs from the site average. In addition to climate variables and light, the model includes tree diameter $D_{ij,t}$ and previous growth rate $d_{ij,t-1}$, both of which can explain growth and fecundity (Clark et al. 2010).

Table 1. Hypothesized direct effects and interactions by demographic response variable. Priors are listed for the growth response ln(d ij,t 1) and the fecundity potential ln(f ij,t 1), in that order; where only one entry survives it is given as is.

Input covariate | Reference | Summary rationale | Prior distribution (growth; fecundity)
Minimal model
Intercepts | species | No prior knowledge | NI
Canopy area ln(C ij,t 1) | tree-year | Light is a limiting resource for which plants compete | A qr > 0; A qr > 0
Additional main effects
Diameter ln(D ij,t 1) | tree-year | Fecundity potential can increase allometrically | NI; A qr > 0.5
Large diameter effect ln^2(D ij,t 1) | tree-year | Physiological function may decline, but not improve with old age | A qr < 0; A qr < 0
Previous year growth ln(d ij,t 1) | tree-year | Fecundity may depend on previous growth, beyond effects explained by past climate inputs | A qr = 0
Winter temperature deviation w j,t 1 | site-year | Years with warm winters, long growing seasons increase carbon gain | A qr > 0; A qr > 0
Summer (Jun, Jul, Aug, Sep) drought deviation m j,t | site-year | Drought years decrease carbon gain | A qr > 0; A qr > 0
Average winter (Jan, Feb, Mar) temperature W j | site | Sites with warm winters, long growing seasons increase carbon gain | A qr > 0; A qr > 0
Average moisture index M ij | tree | Moist sites support carbon gain | A qr > 0; A qr > 0
Interactions
Light by winter temperature C ij,t 1 w j,t 1 | tree-year | |
Light by summer drought C ij,t 1 m j,t | tree-year | |
Light by ave winter temperature C ij,t 1 W j | site-year | |
Light by ave moisture C ij,t 1 M ij | tree-year | |
Winter temperature by summer drought w j,t 1 m j,t | site-year | |
Summer drought by ave moisture m j,t M ij | tree-year | | NI

Like moisture, temperature enters as a site effect $W_j$ and a site-year effect $w_{j,t}$. Winter and spring temperatures control bud break, leaf and fruit set and can have a large impact on tree carbon balance. We use the annual temperature for January through March for site $j$ in year $t$.
The site effect is taken to be the average winter/spring temperature $W_j$, and the site-year effect is the departure from that average, $w_{j,t}$ (Figure 4a).

The ranges of input variables in this study are relevant for 21st century climate change predictions. They span the southeastern Piedmont to northern hardwoods in spatial variation. Variation in temperature among sites and over time spans 2 to 5 °C, similar to the temperature increases predicted for 21st century climate change. Variation in summer PDSI over the 20-year study period spans the interval $(-4, 4)$, i.e., several severe droughts to some of the wettest years for this climate. This large temporal variation was experienced by most, but not all, species. The range of local (spatial) moisture values was limited for species that were restricted to one or a few sites, but broad for many. Canopy removal experiments provided a full range of canopy area-tree size combinations, to ensure that effects of both variables could be estimated (Clark et al. 2010). Among the input variables included in this analysis, pairwise correlations were low, mostly less than 0.2 in absolute value (Supplement).

Figure 4. Climate-related input variables. (a) Winter/spring temperature has a spatial component $W_j$ (time-averaged, among sites $j$) and a temporal component $w_{j,t}$ (within $j$, over time). Summer moisture has a temporal component $m_{j,t}$, the Palmer Drought Severity Index (b), and a spatial component, the moisture index $M_{ij}$ (c). The map in (c) shows $M_{ij}$ values for tree locations from moist (blue) to dry (yellow).
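The split of each climate variable into a spatial and a temporal component, described above for temperature and moisture, can be sketched as follows. This is an illustration only: the function names, array layout, and all numerical values are hypothetical and do not come from the study's data.

```python
import numpy as np

def split_site_and_year(values):
    """Split a (sites x years) array into site means and yearly departures."""
    site_mean = values.mean(axis=1)             # spatial component, e.g. W_j
    deviation = values - site_mean[:, None]     # temporal component, e.g. w_{j,t}
    return site_mean, deviation

def moisture_index(precip_mm, tci, site_of_tree):
    """M_ij as site precipitation times the tree's topographic convergence index."""
    return precip_mm[site_of_tree] * tci

# Hypothetical Jan-Mar mean temperatures (deg C) for 2 sites over 3 years
temp = np.array([[4.0, 6.0, 5.0],
                 [1.0, 2.0, 3.0]])
W, w = split_site_and_year(temp)

# Hypothetical annual precipitation (mm) per site and index values for 3 trees
precip = np.array([1200.0, 1800.0])
tci = np.array([5.0, 7.5, 6.0])
site_of_tree = np.array([0, 0, 1])              # site membership of each tree
M = moisture_index(precip, tci, site_of_tree)
```

By construction the deviations sum to zero within each site, so the site effect and the site-year effect carry separate information.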

4. MODEL SUMMARY

Multivariate responses of individuals to multiple inputs are modeled in a state-space framework. The model includes uncertainty in the process, variation among individuals, and observation models. There is process error within and between individuals and over years. Here we summarize the model from Clark et al. (2010), which provides additional detail on prior specification, algorithm development, and diagnostics.

4.1. MODEL DEVELOPMENT

The process model tracks changing states of individuals each year as they grow, reach reproductive maturity, produce seed, and die. There are direct responses of growth and fecundity to fluctuating inputs, as well as interactions that result from tradeoffs in allocation between growth and reproduction (Knops, Koenig, and Carmen 2007; Mund et al. 2010; Sánchez-Humanes, Sork, and Espelta 2011); the latter emerge as output interactions, not directly linked to input variables. Maturation is a partially hidden Markov process, where an individual $ij$ can change from the immature $F_{ij,t} = 0$ to mature $F_{ij,t+1} = 1$ state as it increases in size, depending on access to resources and climate inputs (Table 1). The process equation for the bivariate growth-fecundity response for a mature individual ($F_{ij,t} = 1$) has response vector

$$y_{ij,t} = [\ln d_{ij,t}, \; \ln f_{ij,t}], \tag{4.1}$$

which includes fecundity $f_{ij,t}$ (potential seed productivity) and the diameter growth increment $d_{ij,t}$ (cm), which determines change in diameter $D_{ij,t+1} = D_{ij,t} + d_{ij,t}$. As mentioned previously, data models link this process equation to observations, which consist of increment cores, tree diameter measurements, observations of maturation status, and seeds collected in traps (Clark et al. 2010).
Our interest here is in the latent vector $y_{ij,t}$. The process equation is

$$(y_{ij,t+1} \mid F_{ij,t+1} = 1) \sim N_2(x_{ij,t} A + \alpha_{ij} + \kappa_t, \Sigma), \qquad \alpha_{ij} \sim N_2(0, W) \tag{4.2}$$

where $x_{ij,t}$ is a 1 by $Q$ vector of inputs (main effects and interactions), $A$ is a $Q$ by $R$ matrix of fitted parameters for $R = 2$ response variables, $\alpha_{ij}$ is the random effect associated with individual $ij$, $W$ is the 2 by 2 covariance matrix for random effects, $\kappa_t$ is a fixed year effect, and $\Sigma$ is a 2 by 2 covariance matrix for process error,

$$\Sigma = \begin{bmatrix} \Sigma_{r} & \Sigma_{rr'} \\ \Sigma_{r'r} & \Sigma_{r'} \end{bmatrix}. \tag{4.3}$$

In this case of $R = 2$ the covariance is the scalar quantity $\Sigma_{rr'} = \Sigma_{r'r}$. Equation (4.2) is the process equation of a state-space model that applies to individuals that are reproductively mature. For immature individuals, $F_{ij,t} = 0$, $y_{ij,t}$ is a scalar quantity for growth (i.e., there is no fecundity), and Equation (4.2) is univariate. Note that a given predictor $q \in \{1, \ldots, Q\}$ enters the model in three ways: for adult growth and fecundity (Equation (4.1)) and for juvenile growth. Data models for observations, prior distributions, algorithm development for MCMC, and diagnostics are detailed in Clark et al. (2010).
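The process equation (4.2) can be illustrated with a short simulation. The following is a minimal sketch, not the fitted model of Clark et al. (2010): the function name `simulate_response`, the dimensions, and every numerical value are hypothetical, and the year effect $\kappa_t$ is assumed here to be a length-$R$ vector.

```python
import numpy as np

def simulate_response(x, A, alpha, kappa, Sigma, rng):
    """Draw y_{ij,t+1} ~ N_2(x A + alpha + kappa, Sigma) for one mature tree-year."""
    mean = x @ A + alpha + kappa
    return rng.multivariate_normal(mean, Sigma)

rng = np.random.default_rng(0)
R = 2                                    # responses: ln growth, ln fecundity
x = np.array([1.0, 0.4, 0.7])            # hypothetical inputs: intercept + two predictors
A = np.array([[0.1, 0.5],
              [0.8, 0.3],
              [0.2, 0.6]])               # hypothetical Q-by-R coefficient matrix
W = np.array([[0.05, 0.0],
              [0.0, 0.05]])              # random-effect covariance (assumed)
Sigma = np.array([[0.2, -0.1],
                  [-0.1, 0.3]])          # process covariance with a growth-fecundity tradeoff
alpha = rng.multivariate_normal(np.zeros(R), W)   # individual random effect
kappa = np.zeros(R)                      # year effect, assumed length R

mean_fixed = x @ A                       # fixed-effect part of the mean
y_next = simulate_response(x, A, alpha, kappa, Sigma, rng)
```

The negative off-diagonal element of `Sigma` is what later sections call an output interaction: it is not tied to any column of `x`.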

4.2. IMPLEMENTATION OF DYNAMIC INVERSE PREDICTION

DIP is derived directly from the fitted model. We wish to evaluate the importance of an input variable $q$, based on the full response vector, incorporating all interactions. Let $q'$ represent variable(s) that interact with $q$, and $-q$ represent variables that do not. Consider a predictive distribution for $x_{ij,t(q)}$, i.e., the $q$th element of $x_{ij,t}$. To ensure a proper distribution of predicted $x_{ij,t(q)}$, we have the likelihood, posterior for parameters $\theta$, and prior for $x_{ij,t(q)}$,

$$p(x_{ij,t(q)} \mid X, Y) \propto \int p(y_{ij,t+1} \mid x_{ij,t(q)}, x_{ij,t(q',-q)}, \theta)\, p(\theta \mid X, Y)\, p(x_{ij,t(q)})\, d\theta = p(y_{ij,t+1} \mid x_{ij,t(q)}, x_{ij,t(q',-q)})\, p(x_{ij,t(q)}).$$

Equation (4.2) is the likelihood, which can be reorganized this way:

$$y_{ij,t+1} \sim N_2(x_{ij,t(-q)} A_{-q} + x_{ij,t(q)} B_{ij,t(q)} + \alpha_{ij} + \kappa_t, \Sigma) \tag{4.4}$$

where $B_{ij,t(q)}$ is the length-$R$ vector for the main effect of $q$ (variable of interest; see below), and $A_{-q}$ is the matrix $A$ excluding rows $q$ and $qq'$ (interactions between $q$ and $q'$). The first term in the mean vector includes terms not involving $q$, neither as main effects nor as interactions. The second term is a length-$R$ vector that includes the main effect of $q$ and interactions,

$$B_{ij,t(q)} = A_q + \sum_{p \in q'} x_{ij,t(p)} A_{qp}, \tag{4.5}$$

where $A_{qp}$ is a row of $A$ corresponding to an interaction with $q$. For example, suppose $x = [x_1, x_2, x_3, x_4, x_3 x_4]$, where the design includes an interaction $x_3 x_4$, and subscripts for individual, location, and year are omitted for simplicity. If the input variable of interest is $q = 3$, then $q' = 4$, and $-q = [1, 2, 4]$. The first term of Equation (4.4) is

$$[x_1 \;\; x_2 \;\; x_4] \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \\ A_{41} & A_{42} \end{bmatrix},$$

and the second term is

$$x_3 [A_{31} \;\; A_{32}] + x_3 x_4 [A_{5,1} \;\; A_{5,2}].$$

The subscript 5 in matrix $A$ corresponds to the row containing the interaction between inputs 3 and 4, i.e., $qq'$. The prior on an input $x_q$ is $x_q \sim N(a_q, b_q)$. Now including subscripts, there is a predictive density

$$PD_{ij,t(q)} = N(\hat{x}_{ij,t(q)}, V_{ij,t(q)}) \tag{4.6}$$

with predictive mean and variance

$$\hat{x}_{ij,t(q)} = V_{ij,t} \left[ B_{ij,t} \Sigma^{-1} (y_{ij,t+1} - x_{ij,t(-q)} A_{-q} - \alpha_{ij} - \kappa_t)^T + a_q / b_q \right], \qquad V_{ij,t} = \left( B_{ij,t(q)} \Sigma^{-1} B^T_{ij,t(q)} + b_q^{-1} \right)^{-1}.$$

This predictive distribution incorporates interactions caused by input variables in Equation (4.5) and additional (output) interactions in the response vector absorbed by $\Sigma$. This standard approach to predicting missing $x$ is used here as the basis for evaluating the role of $q$ when the output $y$ is a vector.

4.3. HOW INTERACTIONS AFFECT DIP VS. SENSITIVITY

Here we compare the simplicity of the foregoing predictive distributions in DIP with a more traditional sensitivity analysis, where we incorporate input and output interactions. In SA there is a sensitivity coefficient for each input-output combination $qr$. In the absence of input interactions the sensitivity of response variable $r$ (e.g., growth or fecundity) to input variable $q$ (e.g., temperature, moisture, light availability) is simply

$$s_{(q \to r)} = \frac{dy_{ij,t+1(r)}}{dx_{ij,t(q)}} = \frac{\partial y_{ij,t+1(r)}}{\partial \tilde{x}_{ij,t(q)}} \frac{\partial \tilde{x}_{ij,t(q)}}{\partial x_{ij,t(q)}} = \frac{A_{q,r}}{x^D_q} \tag{4.7}$$

where $A_{q,r}$ is the coefficient for the $r$th response to input variable $q$. In our model this is the proportionate (log) response of growth or fecundity (Equation (4.1)) standardized by the range of variation $x^D_q$ to allow comparison of effects across different input variables, $\tilde{x}_{ij,t(q)} = (x_{ij,t(q)} - \min(x_q)) / x^D_q$, where $x^D_q = \max(x_q) - \min(x_q)$. There is no $ij,t$ subscript on the sensitivity coefficient of Equation (4.7), because there are no interactions: the response to input variable $q$ does not depend on the level of other inputs that an individual experiences that might interact with $q$. This standard approach to sensitivity has two disadvantages: (i) we have many coefficients, and (ii) a given coefficient $A_{q,r}$ has to summarize the entire effect. By contrast, Equation (4.6) engages the entire fitted model, including all sources of uncertainty.
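The predictive mean and variance of Equation (4.6) can be sketched for the five-column design used as the example above, $x = [x_1, x_2, x_3, x_4, x_3 x_4]$ with $q = 3$. This is an illustration under stated assumptions rather than the authors' code: the function name `dip_predict` and all coefficient values are hypothetical, random and year effects are set to zero, and the response is generated without noise so that the inverse prediction can be checked against the known input.

```python
import numpy as np

def dip_predict(y_next, x_rest, A_rest, B, alpha, kappa, Sigma, a_q, b_q):
    """Predictive mean and variance for one input x_q, following Equation (4.6)."""
    Sigma_inv = np.linalg.inv(Sigma)
    # Residual after removing every term that does not involve input q
    resid = y_next - x_rest @ A_rest - alpha - kappa
    # V = (B Sigma^-1 B^T + 1/b_q)^-1 is a scalar because x_q is a scalar
    V = 1.0 / (B @ Sigma_inv @ B + 1.0 / b_q)
    x_hat = V * (B @ Sigma_inv @ resid + a_q / b_q)
    return x_hat, V

# Hypothetical values for the example design, q = 3 and q' = 4
x1, x2, x3, x4 = 1.0, 0.5, 0.6, 0.3
A = np.array([[0.1, 0.2],
              [0.3, -0.1],
              [0.8, 0.5],     # row 3: main effect of x3
              [0.2, 0.4],
              [0.3, -0.2]])   # row 5: x3 * x4 interaction
Sigma = np.array([[0.2, -0.05],
                  [-0.05, 0.3]])
alpha = np.zeros(2)
kappa = np.zeros(2)

B = A[2] + x4 * A[4]                  # Equation (4.5)
x_rest = np.array([x1, x2, x4])       # inputs indexed by -q
A_rest = A[[0, 1, 3]]                 # A without rows q and qq'
y_next = np.array([x1, x2, x3, x4, x3 * x4]) @ A   # noise-free response

# A very diffuse prior (large b_q) lets the response determine the prediction
x_hat, V = dip_predict(y_next, x_rest, A_rest, B, alpha, kappa, Sigma,
                       a_q=0.5, b_q=1e8)
```

With a noise-free response and a diffuse prior, `x_hat` recovers the true $x_3$. If `B` were near zero, `V` would approach `b_q` and the prediction would fall back to the prior mean `a_q`, which is how an unimportant input shows up in DIP.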
We pursue this sensitivity approach further to show that it can expose different types of interactions. When there are interactions among inputs and outputs (allocation within individuals), sensitivity varies among individuals and years,

$$s_{ij,t(qq',\, r' \to r)} = \left( \frac{\partial y_{ij,t+1(r)}}{\partial \tilde{x}_{ij,t(q)}} + \frac{\partial y_{ij,t+1(r)}}{\partial y_{ij,t+1(r')}} \frac{\partial y_{ij,t+1(r')}}{\partial \tilde{x}_{ij,t(q)}} \right) \frac{\partial \tilde{x}_{ij,t(q)}}{\partial x_{ij,t(q)}} = \frac{1}{x^D_q} \left[ 1 \;\; g_{ij,t(rr')} \right] B^T_{ij,t(q)} \tag{4.8}$$

where $g_{ij,t(rr')}$ is the (output) dependence of response $y_{ij,t+1(r)}$ on $y_{ij,t+1(r')}$,

$$g_{ij,t(rr')} = \frac{\partial y_{ij,t(r)}}{\partial y_{ij,t(r')}} = \Sigma_{rr'} \Sigma^{-1}_{r'},$$

and $B_{ij,t(q)}$ is given by Equation (4.5). $g_{rr'}$ is the dependence of $r$ on $r'$ that is not tied to inputs. Input interactions (between input variables $q$ and $q'$) enter through $B_{ij,t(q)}$. Input and output interactions interact in the second term of Equation (4.8). The sensitivity subscript in Equation (4.8) now includes not only $ij,t$, but also $r'$ and $q'$, the interactions. In terms of input and output interactions Equation (4.8) can be interpreted this way:

$$s_{ij,t(qq',\, r' \to r)} \propto \begin{pmatrix} \text{direct} \\ \text{effect} \\ q \to r \end{pmatrix} + \begin{pmatrix} \text{input} \\ \text{interaction} \\ (qq') \to r \end{pmatrix}_{ij,t} + \begin{pmatrix} \text{output} \\ \text{interaction} \\ r' \to r \end{pmatrix}_{ij,t} \begin{pmatrix} \text{input/output} \\ \text{interaction} \\ (qq') \to r' \end{pmatrix}_{ij,t}. \tag{4.9}$$

The first term is the direct effect of Equation (4.7). Input interactions $(qq')$ comprise the second term, those explained by combinations of input variables. Output interactions $(rr')$ modify the effects of inputs, due to allocation constraints between different response variables. There are many such coefficients for individuals $ij$, time $t$, and interactions $r'$ and $q'$. There is no obvious way to simplify all of these coefficients into a meaningful summary.

DIP incorporates output interactions more compactly than traditional SA. For a response vector of length $R = 2$ having covariance matrix

$$\Sigma = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}$$

the predictive variance for an input variable $x_q$ in Equation (4.6), with the prior term $b_q^{-1}$ omitted, is

$$V = \frac{1 - \rho^2}{B_1^2 - 2\rho B_1 B_2 + B_2^2}$$

where $B_1$ and $B_2$ are the two elements of vector $B$ from Equation (4.5). Of course, an important input has large values in vector $B$ and thus contributes to a small predictive variance. If the input variable affects both outputs in the same direction ($B_1$ and $B_2$ have the same sign), then an amplifying output interaction ($\rho > 0$) increases the predictive variance.

In both SA and DIP output interactions contained in $\Sigma$ are distinct from input interactions, contained in $A$. In DIP output interactions have little effect on the mean prediction, having more impact on the predictive variance. In SA, the output interactions play a different role, even leading to surprises. Consider an input variable $q$ that has a positive effect on overall health, such as a limiting resource. We expect that coefficients in row $q$ of $A$ are greater than zero.
However, allocation tradeoffs result in negative elements of $\Sigma$. A response variable $r$ can have a negative sensitivity coefficient $s_{(q \to r)}$ for individuals having a strong positive response for variable $r'$ due to negative covariance $\Sigma_{rr'} < 0$ and negative output interaction $g_{rr'}$. For example, trees partition stored reserves between growth and reproduction (Granier et al. 2008; Knops, Koenig, and Carmen 2007; Mund et al. 2010). Despite fluctuations in conditions that benefit overall health, a negative correlation between growth and fecundity can result in responses that appear paradoxical. When negative covariance arises due to feedbacks that are not captured by input variables, sensitivities can be negative or positive, depending on values of other variables involved in the interaction $q'$ and the responses to them ($r'$). In DIP such negative correlations between response variables tend to decrease the predictive variance if the input variable affects responses in the same direction.
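Both effects of an output interaction can be checked numerically. The sketch below is illustrative only: the function names and all values are hypothetical, the sensitivity follows Equation (4.8) with inputs assumed to lie on a unit range (so the range scaling equals 1), and the variance is the $R = 2$ expression for a correlation matrix with the prior term omitted.

```python
import numpy as np

def sensitivity(B, Sigma, x_range=1.0, r=0, r_other=1):
    """Sensitivity of response r to one input, including the output interaction g."""
    g = Sigma[r, r_other] / Sigma[r_other, r_other]   # dependence of r on r'
    s = (B[r] + g * B[r_other]) / x_range
    return s, g

def predictive_variance_r2(B1, B2, rho):
    """Predictive variance for R = 2 with unit variances and correlation rho."""
    return (1.0 - rho ** 2) / (B1 ** 2 - 2.0 * rho * B1 * B2 + B2 ** 2)

# Hypothetical adult tree: weak positive growth effect, strong positive fecundity effect
B = np.array([0.2, 1.5])                 # [growth, fecundity] elements of B
Sigma = np.array([[0.3, -0.2],
                  [-0.2, 0.4]])          # negative covariance = allocation tradeoff
s_growth, g = sensitivity(B, Sigma)

# Same-sign effects on both responses: compare buffering and amplifying correlations
V_neg = predictive_variance_r2(1.0, 1.0, -0.5)
V_zero = predictive_variance_r2(1.0, 1.0, 0.0)
V_pos = predictive_variance_r2(1.0, 1.0, 0.5)

# Cross-check the closed form against (B Sigma^-1 B^T)^-1
Sigma_rho = np.array([[1.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, 1.0])
V_matrix = 1.0 / (b @ np.linalg.inv(Sigma_rho) @ b)
```

In this example the growth sensitivity is negative even though both elements of `B` are positive, which is the kind of apparent paradox described above, while the predictive variance shrinks as the correlation becomes more negative.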

In summary, SA and DIP respond to combinations of input and output variables in different ways. DIP is synthetic for the effects of an input variable on the full multivariate output. DIP depends implicitly on output interactions, but, because it does not quantify each input-response separately, it does not identify when output interactions lead to surprises. SA is more complex, but it can be used to identify when output interactions could be the cause of surprising results. Finally, implementation requires a prior specification for the coefficients in $A$ to address issues summarized in Figure 3.

4.4. PRIOR SPECIFICATION ON INPUTS WHEN THERE ARE OUTPUT INTERACTIONS

Consider the common situation where prior knowledge indicates that input interactions could be either amplifying or buffering (Figures 3a or 3b, but not 3c). Prior knowledge is limited to the full effect of an input variable $q$ on response variable $r$, $dy_{ij,t(r)} / dx_{ij,t-1(q)}$. In the absence of interactions, parameters can be flat over some positive or negative interval, e.g., truncated at zero. When input and output interactions can both occur, a suitable prior on $A$ is specified conditionally. The interpretation of interactions as amplifying or buffering requires that variables that interact with $q$ do not change sign over the range of $x_q$. This requirement is achieved with a rescaling of input variables and the prior specification that follows. We use flat priors on $A$, truncated at limits defined either by prior information (0 for main effects or non-zero for interactions in the next section), or they are set to positive or negative values near the extreme estimates obtained without truncation (Mitchell and Beauchamp 1988); all priors are proper, and none are defined by arbitrary and unrealistically large truncation values. For those truncated at zero we use model selection to determine whether or not an input variable is important (Section 4.5).
Of course, not all elements of $A$ need to be constrained to positive or negative values, only those where prior knowledge suggests it. The limits affect the height of the prior and thus the marginal likelihood used in model selection. Ignoring for the moment individual and year effects, the expected response vector is $E[y_{ij,t+1(r)}] = x_{ij,t} A_r$, where $y_{ij,t(r)}$ is the $r$th response variable in vector $y$, $A_r$ is the $r$th column of $A$, and

$$x_{ij,t} A_r = \cdots + x_{ij,t(q)} A_{q,r} + x_{ij,t(q')} A_{q',r} + x_{ij,t(q)} x_{ij,t(q')} A_{qq',r} + \cdots$$

includes terms for direct effects of inputs $q$ and $q'$ and their interaction $(qq')$. Further assume that a positive relationship is specified (through the prior distribution) between input variable $q$ and response variable $r$, i.e., $A_{q,r} > 0$. This prior belief is only ensured if $x_{q'} = 0$. But most variables are not measured on scales where 0 has any particular significance. The prior belief applies to the derivative

$$B_{ij,t(q,r)} = \frac{dy_{ij,t+1(r)}}{dx_{ij,t(q)}} = A_{q,r} + x_{ij,t(q')} A_{qq',r}$$

(see Equation (4.9)). The interaction coefficient $A_{qq',r}$ can be positive or negative, but it should not be so negative that it violates the prior belief that $dy_r / dx_q > 0$. To impose the prior belief that $x_q$ has a positive effect on response variable $r$ we specify a prior on $A_{qq',r}$ conditional on main effects in $A$,

$$p(A_{qq',r}, A_{q,r}, A_{q',r}) = p(A_{qq',r} \mid A_{q,r}, A_{q',r})\, p(A_{q,r}, A_{q',r}).$$

For a specific case where there is prior belief that both $q$ and $q'$ have positive effect, the conditional interaction is

$$(A_{q,r}, A_{q',r}) > 0, \qquad A_{qq',r} \mid (A_{q,r}, A_{q',r}) > \max(-A_{q,r}, -A_{q',r}). \tag{4.10}$$

Computationally, this dependence is imposed at the proposal stage of a Metropolis step.

4.5. VARIABLE SELECTION

Although we limit consideration to only those variables known to have important impacts on tree health, the number of potential main effects and interactions is large and different for species having different geographic distributions. The minimal model includes only intercepts and exposed canopy area $C_{ij,t}$ (Table 1), the latter because all species are limited by light when growing in shaded understories. There are $m = 1, \ldots, M$ additional predictors in Table 1, where $M = 13$. The eight main effects in Table 1 contribute up to $2^8 = 256$ combinations. A two-way interaction can be considered only if both main effects are included in the model. We focus on a subset of potential interactions, those most likely to have importance for climate change vulnerability and its interaction with competition (Table 1). The model space is large (1062 variable combinations), given that fitting is done using Metropolis within Gibbs (George and McCullough 1993, 1997). Here we describe an implementation of Scott and Berger's (2010) technique for variable selection. Models are evaluated based on the posterior model probability, derived from the prior for model $k$, $p(k)$, and marginal likelihood, $p(z \mid k)$,

$$p(k(m) \mid z) \propto p(z \mid k(m))\, p(k(m)). \tag{4.11}$$

There are $m$ predictors in model $k(m)$, excluding the minimal model, which contains only intercept and canopy area. As mentioned above, $z$ are observations, and data models are detailed in previous publications.
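The rule that a two-way interaction is eligible only when both of its main effects are in the model determines how fast the model space grows. The following sketch counts models under that rule for a small, hypothetical set of predictors. The names `count_models`, `mains`, and `pairs` are illustrative, and the example does not attempt to reproduce the count reported above, which depends on the specific interactions listed in Table 1.

```python
from itertools import combinations

def count_models(main_effects, interactions):
    """Count models in which an interaction may enter only with both main effects."""
    total = 0
    for size in range(len(main_effects) + 1):
        for subset in combinations(main_effects, size):
            included = set(subset)
            # Interactions whose two parents are both present are optional
            eligible = [pair for pair in interactions if set(pair) <= included]
            total += 2 ** len(eligible)
    return total

# Eight main effects and no interactions give the 2^8 combinations noted in the text
n_main_only = count_models(list(range(8)), [])

# Hypothetical toy case: three main effects and two candidate interactions
mains = ["w", "m", "M"]
pairs = [("w", "m"), ("M", "m")]
n_toy = count_models(mains, pairs)
```

In the toy case the two interactions add only five models to the eight main-effect combinations, far fewer than the 32 that would arise if interactions could enter freely.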
Equation (4.11) is the basis for many model choice criteria, including posterior odds and Bayes factors (Clyde and George 2004; Clyde and Ghosh 2010) and model averaging (Hoeting et al. 1999). Following Scott and Berger (2010) we apply fully Bayesian model choice. This approach has the advantage that it automatically adjusts for multiple comparisons, without imposing ad hoc penalties. The model prior

$$p(k(m)) = p^m (1 - p)^{M - m} \tag{4.12}$$

(George and McCullough 1997), when $p$ is treated as an unknown, provides a multiplicity correction. The marginal likelihood from Equation (4.11) is

$$p(z \mid k(m)) = \int p(z \mid \theta_k, k(m))\, p(\theta_k)\, d\theta_k, \tag{4.13}$$

the integrand containing likelihood and prior. With a flat $\text{beta}(p \mid 1, 1)$ prior the posterior model probability is

$$p(k(m) \mid z) \propto \frac{p(z \mid k(m))}{M + 1} \binom{M}{m}^{-1}, \tag{4.14a}$$

i.e., a mixture. Large models are penalized in two ways. The first penalty comes from the fact that each additional parameter adds a dimension requiring integration in the marginal likelihood (Equation (4.13)). This penalty is roughly multiplicative. Interactions bear this burden thrice, once for each main effect, then again for the interaction term. However, this is a size penalty, but not a multiplicity correction, because it does not depend on the number of models considered (Scott and Berger 2010). Instead multiplicity is handled through the prior.

In our application, model selection focuses on matrix $A$. The marginal likelihood is approximated using the approach of Chib (1995). Inverting Bayes' theorem allows evaluation of

$$p(z \mid k(m)) = \frac{p(z \mid A^*, k(m))\, p(A^*)}{p(A^* \mid z, k(m))}$$

where $A^*$ is a value of $A$ having posterior density simulated by Metropolis-within-Gibbs. Results did not depend on the particular parameter matrix $A^*$ chosen to evaluate it. The posterior inclusion probability for a given variable is

$$p(A_q \neq 0) = \sum_{\{k\}} p(k(m) \mid z)\, I(q \in k(m)) \tag{4.14b}$$

where $\{k\}$ is the set of all models and $I(\cdot)$ is the indicator function, being equal to 1 if its argument is true and 0 otherwise.

We monitored posterior model probabilities within the MCMC to progressively pare down the variable space and arrive at a maximum posterior model probability (Equations (4.14a), (4.14b)). The Metropolis-within-Gibbs algorithm described in Clark et al. (2010) is implemented such that the minimal model and an alternative model, proposed from a uniform distribution, are evaluated based on the log posterior probabilities. The more probable model is selected as the current model. Each model builds up a history of model probabilities.
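The multiplicity-corrected model prior and the inclusion probabilities of Equations (4.14a) and (4.14b) can be sketched for a handful of candidate models. This is a hedged illustration: the log marginal likelihoods are made-up numbers standing in for the Chib (1995) approximation, and the function names are hypothetical. The prior uses the standard result that integrating $p^m (1-p)^{M-m}$ over a uniform $p$ gives $1 / [(M+1)\binom{M}{m}]$.

```python
import numpy as np
from math import comb, exp, log

def log_model_prior(m, M):
    """Log prior of one model with m of M predictors, p integrated over beta(1, 1)."""
    return -log(M + 1) - log(comb(M, m))

def posterior_model_probs(log_marglik, sizes, M):
    """Normalized posterior model probabilities over the supplied models."""
    log_post = np.array([lm + log_model_prior(m, M)
                         for lm, m in zip(log_marglik, sizes)])
    log_post -= log_post.max()          # stabilize before exponentiating
    weights = np.exp(log_post)
    return weights / weights.sum()

def inclusion_probability(post, models, q):
    """Sum of posterior probabilities of the models that contain predictor q."""
    return sum(p for p, k in zip(post, models) if q in k)

# Hypothetical candidate models over M = 3 optional predictors
M = 3
models = [set(), {"w"}, {"m"}, {"w", "m"}]
sizes = [len(k) for k in models]
log_marglik = [-10.0, -8.0, -9.5, -7.5]   # made-up values, not from the paper

post = posterior_model_probs(log_marglik, sizes, M)
p_w = inclusion_probability(post, models, "w")
p_m = inclusion_probability(post, models, "m")

# The prior sums to one over all 2^M models, grouped by size
prior_total = sum(comb(M, m) * exp(log_model_prior(m, M)) for m in range(M + 1))
```

Normalizing over only the models supplied, as done here, mirrors a search that has already pared down the model space. The inclusion probabilities are then conditional on that reduced set.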
A combination of a large number of proposals and low posterior probability is cause for deletion of a model. In other words, model selection is progressive, with especially poor models eliminated first and surviving models requiring more proposals before rejection. Runs of 100,000 Gibbs steps were implemented consecutively, each time reintroducing the best of the models previously rejected. Finally, the last model is fitted alone.

4.6. AGGREGATING THE SELECTED MODEL

Results from the selected model (Section 4.5) are aggregated over individuals, time, or both to obtain summaries of DIP and SA. For DIP, there is a prediction for $x_q$ from every individual in every year, because individuals track an input variable differently at different locations, at different times, and over different ranges of input variables. Summaries are available by aggregating over individuals, years, or both. We apply Gneiting and Raftery's (2007) scoring rule to each individual and year,

$$S_{ij,t(q)} = -\frac{(x_{ij,t(q)} - \hat{x}_{ij,t(q)})^2}{V_{ij,t(q)}} - \ln V_{ij,t(q)}. \tag{4.15a}$$

The score rewards predictions close to the truth and with low predictive variance (second term). The predictive variance in the first term penalizes overconfident predictions. Aggregation over time provides an average score for an individual, identifying which individuals are responding most and least to $q$,

$$S_{ij(q)} = \frac{1}{T - 1} \sum_{t=2}^{T} S_{ij,t(q)}. \tag{4.15b}$$

Aggregation over individuals provides population level results for each year, which can help identify combinations of year $t$ and location $j$ that pose large vulnerabilities,

$$S_{j,t(q)} = \frac{1}{n_j} \sum_{i=1}^{n_j} S_{ij,t(q)}. \tag{4.15c}$$

Aggregation over individuals, locations, and years provides the overall population prediction that can be compared among species,

$$S_{(q)} = \frac{1}{J(T - 1)} \sum_{j=1}^{J} \sum_{t=2}^{T} S_{j,t(q)}. \tag{4.15d}$$

5. APPLICATION

Advantages of the dynamic inverse approach as a component of SA are demonstrated here with example species and summarized with patterns for all species. We begin by considering the size of models selected and the interpretation of interactions. We then show DIP at different levels of aggregation, to evaluate synthetic responses to climate of individuals over time, and we use SA to illuminate the role of output interactions.

5.1. SELECTED MODELS

Variable selection yielded models of intermediate size (Table 2). In some cases few variables were selected due to a limited distribution for a species, in terms of abundance (number of trees) or the range of variation in covariate space. Species that occur on a subset of plots had fewer variables under consideration than species that occur on all plots. For example, average winter temperature $W_j$ is a plot level variable.
Species that occur on only a few plots were not tested for effects of average winter temperature. Rare species (acronyms are divi, ilde, mafr, qust, ulru, and ulun) did not have interactions in the final model. Although a selected model could be small due to limited distribution of a species, it was not the case that large models were selected for the abundant and widespread species (e.g., acru, cofl, litu, qual, quru). Limited model size results from the fact that many species were insensitive to different input variables and not from limited sample size (Section 5.3).

As reported in Clark et al. (2010), correlations between input variables tended to be low. Correlations > 0.5 occurred for only six of the input pairs reported in the Supplement. In fact, few input correlations exceeded 0.2. Even for the highest correlations, variance inflation factors (VIFs) are well below values typically taken as diagnostic of problems, e.g., 5 to 10. VIFs are presented with parameter correlations in the Supplement.

Table 2. Predictors by species. Any combination of d, f, j, or + indicates inclusion in the model. d, f, or j indicates that the 90 % credible interval for adult growth, fecundity, or juvenile growth, respectively, does not include zero, either positive (+) or negative (−). A further marker indicates that the 99 % credible interval does not include zero. X indicates inclusion, but the 90 % credible interval includes zero.

spec | C | D | D^2 | d | w | m | W | M | C × w | C × m | C × W | C × M | w × m | M × m
acru: +d + f + j +d X X+ f + j +d
acsa: +d + f + j
acpe: +d + f + j +d X +d + f + j
acba: +d + f + j
amar: +d + f + j +d X + f + j X + f +d
beal: +d + f + j X + f +d X + f + j X + f + j +d + f
bele: +d + f + j +d X + f + j X + f +d f + j
caca: +d + f + j +d X + f + j +d
cagl: +d + f + j +d X + f + j X + f X + f + j +d + j
caov: +d + f + j +d f X+ f +d + j
cato: +d + f + j X + f d f +d f X+ f X X+ f +d + j
caun: +d + f + j +d + f X+ f + j X + f X + f +d + f +d + j
ceca: +d + f + j +d X + f X + f +d f + j
cofl: +d + f + j +d f X + f +d f + j
divi: +d + f + j
fagr: +d + f + j +d + f X + f X+ f + j +d
fram: +d + f + j +d + f X+ f X + f + j +d f + j +d j +d f + j
ilop: +d + f + j
ilde: +d + f + j
juvi: +d + f + j +d X + f + j +d + j
list: +d + f + j X X X X X X X X X
litu: +d + f + j +d + f X + f X + f + j +d + j +d + j
maac: +d + f + j
mafr: +d + j
nysy: +d + f + j +d f X+ f + j +d + j
oxar: +d + f + j +d + f X + f X + f X X +d + j +d + f + j
piri: +d + f + j +d + f X+ f X + f +d + j
pist: +d + f + j X +d + f +d +d + j
pita: +d + f + j +d f X + f + j X + f + j X + f + j +d f + j +d f + j
piec: +d + f + j +d f X + f + j X + f + j +d f + j
pivi: +d + f + j +d f X+ f + j X + f +d + j
qual: +d + f + j +d + f X+ f +d + j
quco: +d + f + j +d + f X+ f +d + j
qufa: +d + f + j +d + f + j +d + f + j X
quma: +d + f + j
quph: +d + f + j +d X +d + j
qupr: +d + f + j
quru: +d + f + j +d + f X+ f + j +d + f + j
qust: +d + f + j
quve: +d + f + j +d + f X+ f + j +d + j
quun: +d + f + j X X+ f +d + f +d + j
rops: +d + f + j +d X+ f + j X + f X + f + j +d + j
saal: +d + f + j +d X+ f + j X X +d + f + j
tiam: +d + f + j +d X + f + j +d
tsca: +d + j X + f +d + f X + f X + f X + f + j +d + f + j
ulal: +d + f + j +d f X+ j X + f + j +d
ulam: +d + f + j +d X+ j X + f + j +d + f
ulru: +j X X X X X X X
ulun: +d + j

5.2. INTERACTION TERMS

Where retained, interactions were positive more often than not (Table 2), indicating the importance of interactions that tend to amplify the effects of one another. Consider a species with an especially large number of interactions in the selected model, Pinus taeda. We contrast P. taeda with another that is abundant and widespread, yet has only weak climate effects and weak interactions, Liquidambar styraciflua. The selected model for P. taeda included

$$x_{ij,t} = (1 \;\; C_{ij,t} \;\; d_{ij,t} \;\; w_{j,t} \;\; m_{j,t} \;\; M_{ij} \;\; C_{ij,t} w_{j,t} \;\; C_{ij,t} m_{j,t}).$$

Main effects for exposed canopy ($C$) and climate variation ($w$, $m$) were important for all demographic rates, including juvenile and adult growth and fecundity (inputs are summarized in Figure 4). Both $C \times w$ and $C \times m$ interactions were positive for both juvenile and adult growth, and negative for fecundity. Thus, individual growth responded strongly to interannual variation in moisture and winter temperature, but especially so for juveniles exposed to high light levels (Figure 5). This positive interaction describes how the response to climate is amplified by light availability; juveniles shaded in the understory showed lower response than those exposed to high light.
Growth responses to both variables were greatest for canopy individuals with access to full sunlight. The negative interaction between these same variables for fecundity indicates a buffering effect; fecundity responses to climate variation were largest for individuals with limited light access. Although the selected model for Liquidambar styraciflua included climate variables, weak effects are indicated by the fact that 90 % credible intervals included 0 (Table 2).

Figure 5. Examples of amplifying positive interactions (growth) and buffering negative interactions (fecundity) for Pinus taeda between light availability, winter temperature, and summer drought. Plots show input interactions, i.e., just the first term of Equation (4.8). In all panels, contours increase from low at lower left to high at upper right (see Figure 3).

5.3. DYNAMIC INVERSE PREDICTION AND SENSITIVITY

The different responses of Pinus taeda and Liquidambar styraciflua individuals are particularly evident from DIP. Predictive distributions for $w$ and $m$ (Equation (4.6)) for randomly selected individuals of Pinus taeda and Liquidambar styraciflua are shown as predictive means and 95 % predictive intervals (red lines) with true covariate values (black lines) in Figure 6. Individuals of both species closely track the limiting resource, light. However, these species contrast in their responses to drought and winter temperatures. Pinus taeda tracks both variables closely: the demographic health of Pinus taeda individuals is controlled by both variables, across the population, in all years, particularly with increasing droughts since […]. The predictive intervals for individual Liquidambar styraciflua tree years simply recover the prior, with a mean of 0.5 for both variables. The fact that the limiting resource, light, has large impact across individuals and years for both species supports the notion that these different responses to climate represent important species level differences.

The predictive scores for the entire populations by year are shown below each predictive distribution in Figure 6. For Liquidambar, scores are below the scale for climate variables, but they are high for light (bottom right panel). Failure to predict climate variables does not result from limited data. Liquidambar is one of the most abundant species in the data set. Liquidambar does not predict the climate variables, because climate variables do not control the multivariate responses for this species.
For Pinus taeda a negative output interaction (Equation (4.8)) between growth and fecundity explains surprising negative sensitivities (Equation (4.9)) of growth to winter temperature for reproductive adults and positive sensitivities for juveniles (Figure 7). The overall effect of winter temperature is positive, with a long growing season increasing opportunity for photosynthesis. The strong fecundity response (large positive sensitivity in Figure 7b) and negative output interaction (tradeoff between fecundity and growth) explain the negative response of diameter growth, which is especially strong at low light levels (Figure 7a). The fact that juveniles do not reproduce explains the positive growth sensitivities and the increasing response with understory light (Figure 7d).

Aggregation of DIP scores to the population scale (Equation (4.15c)) shows differences between species and the importance of input variables. Most species closely track light availability, shown as mean scores near zero or above (Figure 8a), but other variables affect species differently. Pinus taeda is ranked low for $w$ (Figure 8b) and $m$ (Figure 8d), respectively, showing that the tight tracking for individuals in Figure 6 applies to the population as a whole. Liquidambar styraciflua has low rank for both variables, consistent with patterns observed for individuals. Neither species responds strongly to spatial variables $M$ and $W$.
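The scores aggregated here follow Equations (4.15a) to (4.15d). A minimal sketch of the scoring rule and its aggregation is given below for a single site. The array layout (individuals by predicted years), the function name `dip_score`, and the simulated inputs are assumptions for illustration, not data from the study.

```python
import numpy as np

def dip_score(x_true, x_hat, V):
    """Score of Equation (4.15a): squared error scaled by V, minus log variance."""
    return -(x_true - x_hat) ** 2 / V - np.log(V)

rng = np.random.default_rng(1)
n, n_years = 4, 5      # hypothetical: 4 individuals, 5 predicted years (t = 2..T)
x_true = rng.uniform(0.0, 1.0, size=(n, n_years))          # rescaled inputs
x_hat = x_true + rng.normal(0.0, 0.1, size=(n, n_years))   # predictive means
V = np.full((n, n_years), 0.01)                            # predictive variances

S = dip_score(x_true, x_hat, V)
S_individual = S.mean(axis=1)   # Equation (4.15b): average over years
S_year = S.mean(axis=0)         # Equation (4.15c): average over individuals
S_overall = S_year.mean()       # Equation (4.15d) for a single site

# Overconfidence is penalized: same error, much smaller claimed variance
score_overconfident = dip_score(1.0, 0.0, 0.01)
score_honest = dip_score(1.0, 0.0, 1.0)
```

A species whose predictions simply recover a diffuse prior receives low scores through the log-variance term even when its errors are moderate, which is one way the rankings in Figure 8 can separate responsive from unresponsive species.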

Figure 6. Individual predictive distributions (predictive mean and 95 % interval) for randomly selected individuals over time (red lines) compared with actual input values (black). Below each individual prediction are the aggregate scores (Equations (4.15a), (4.15b), (4.15c), (4.15d)) for the entire population for each year (aggregate mean and 95 % of individual predictions). Individuals are Pinus taeda at left and Liquidambar styraciflua at right. The prior mean is 0.5.

Figure 7. Sensitivity of growth and fecundity to temperatures in the months of January, February, and March, $w_{j,t}$, conditioned on light level $C_{ij,t}$, compared for juveniles and adults of Pinus taeda. Each point is a sensitivity estimate for an individual and year (Equation (4.8)).

6. DISCUSSION

Sensitivity analysis (SA) has been one of the most popular tools to emerge in population studies over the last decade. It is now routinely applied to projection matrices, which are used to extrapolate demographic rates fitted to individual organisms (de Kroon, van Groenendael, and Ehrlén 2000; Caswell 2000). Sensitivity coefficients provide detail, a coefficient for each input variable. However, many environmental models have multiple outputs. In these cases, dynamic inverse prediction (DIP) can complement sensitivity analysis. The individual's prediction of climate and resources spatially and over time, based on the fitted model, provides a direct measure of importance for the variables that affect its multivariate response vector (Figure 6). DIP can be implemented at the scale where climate affects populations (individuals over short intervals), it is synthetic, and it is readily aggregated to explore population, year, and conditional responses. It integrates such concepts as effect size, goodness of fit, and variable interactions, and it does so in a way that has intuitive interpretation: individuals not closely tracking climate variation in space and time are unlikely to show large responses to near-term variation. By contrast, species having a large number of their individuals closely tracking contemporary variation are clearly sensitive to it.

The aggregation of DIP by year, site, and individual provides a basis for anticipating which species will respond in what way to each combination of climate variables, depending on the local competitive context. Evaluating the impacts of climate change could begin with aggregate species differences (Figure 8) to identify those most likely to respond on average to near-term changes in climate. Magnitude of the response can be gauged relative to the risks individuals face constantly, competition for light (Figure 8a).
The approach can be implemented where the number of responses is large; we have applied it to the 100 species on plots across eastern North America in the USFS FIA database (Clark, Gelfand, and Zhu, in preparation). Limitations of DIP, as applied here, include the need for spatiotemporal data on inputs and responses, but alternative approaches to understanding climate sensitivity have limitations of their own. Species distribution models (SDMs) provide only calibrated regressions of spatial patterns in species abundance and climate, rather than dynamic responses to changing risk exposure. Climate dependence is weak in SDMs (Canham and Thomas 2010) because spatial patterns of abundance represent climate effects indirectly, due to competition, and highly aggregated climate data (over years and geographic areas) do not reflect weather experienced by the individual. Disaggregation to the individual scale can reveal large differences between species in terms of the distribution of responses among individuals, depending on their local competitive settings.

Figure 8. Prediction scores for five predictors, with species ranked from low to high.


More information

Imperfect Data in an Uncertain World

Imperfect Data in an Uncertain World Imperfect Data in an Uncertain World James B. Elsner Department of Geography, Florida State University Tallahassee, Florida Corresponding author address: Dept. of Geography, Florida State University Tallahassee,

More information

Ridge regression. Patrick Breheny. February 8. Penalized regression Ridge regression Bayesian interpretation

Ridge regression. Patrick Breheny. February 8. Penalized regression Ridge regression Bayesian interpretation Patrick Breheny February 8 Patrick Breheny High-Dimensional Data Analysis (BIOS 7600) 1/27 Introduction Basic idea Standardization Large-scale testing is, of course, a big area and we could keep talking

More information

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California Texts in Statistical Science Bayesian Ideas and Data Analysis An Introduction for Scientists and Statisticians Ronald Christensen University of New Mexico Albuquerque, New Mexico Wesley Johnson University

More information

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes Andrew O. Finley 1 and Sudipto Banerjee 2 1 Department of Forestry & Department of Geography, Michigan

More information

A Fully Nonparametric Modeling Approach to. BNP Binary Regression

A Fully Nonparametric Modeling Approach to. BNP Binary Regression A Fully Nonparametric Modeling Approach to Binary Regression Maria Department of Applied Mathematics and Statistics University of California, Santa Cruz SBIES, April 27-28, 2012 Outline 1 2 3 Simulation

More information

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes Alan Gelfand 1 and Andrew O. Finley 2 1 Department of Statistical Science, Duke University, Durham, North

More information

Analysing geoadditive regression data: a mixed model approach

Analysing geoadditive regression data: a mixed model approach Analysing geoadditive regression data: a mixed model approach Institut für Statistik, Ludwig-Maximilians-Universität München Joint work with Ludwig Fahrmeir & Stefan Lang 25.11.2005 Spatio-temporal regression

More information

Lecture 5: Spatial probit models. James P. LeSage University of Toledo Department of Economics Toledo, OH

Lecture 5: Spatial probit models. James P. LeSage University of Toledo Department of Economics Toledo, OH Lecture 5: Spatial probit models James P. LeSage University of Toledo Department of Economics Toledo, OH 43606 jlesage@spatial-econometrics.com March 2004 1 A Bayesian spatial probit model with individual

More information

Chapter 6 Problems with the calibration of Gaussian HMMs to annual rainfall

Chapter 6 Problems with the calibration of Gaussian HMMs to annual rainfall 115 Chapter 6 Problems with the calibration of Gaussian HMMs to annual rainfall Hidden Markov models (HMMs) were introduced in Section 3.3 as a method to incorporate climatic persistence into stochastic

More information

Spatial bias modeling with application to assessing remotely-sensed aerosol as a proxy for particulate matter

Spatial bias modeling with application to assessing remotely-sensed aerosol as a proxy for particulate matter Spatial bias modeling with application to assessing remotely-sensed aerosol as a proxy for particulate matter Chris Paciorek Department of Biostatistics Harvard School of Public Health application joint

More information

particular regional weather extremes

particular regional weather extremes SUPPLEMENTARY INFORMATION DOI: 1.138/NCLIMATE2271 Amplified mid-latitude planetary waves favour particular regional weather extremes particular regional weather extremes James A Screen and Ian Simmonds

More information

7. Estimation and hypothesis testing. Objective. Recommended reading

7. Estimation and hypothesis testing. Objective. Recommended reading 7. Estimation and hypothesis testing Objective In this chapter, we show how the election of estimators can be represented as a decision problem. Secondly, we consider the problem of hypothesis testing

More information

Nonlinear Models. and. Hierarchical Nonlinear Models

Nonlinear Models. and. Hierarchical Nonlinear Models Nonlinear Models and Hierarchical Nonlinear Models Start Simple Progressively Add Complexity Tree Allometries Diameter vs Height with a hierarchical species effect Three response variables: Ht, crown depth,

More information

Multinomial Data. f(y θ) θ y i. where θ i is the probability that a given trial results in category i, i = 1,..., k. The parameter space is

Multinomial Data. f(y θ) θ y i. where θ i is the probability that a given trial results in category i, i = 1,..., k. The parameter space is Multinomial Data The multinomial distribution is a generalization of the binomial for the situation in which each trial results in one and only one of several categories, as opposed to just two, as in

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear

More information

CLIMATE OF THE ZUMWALT PRAIRIE OF NORTHEASTERN OREGON FROM 1930 TO PRESENT

CLIMATE OF THE ZUMWALT PRAIRIE OF NORTHEASTERN OREGON FROM 1930 TO PRESENT CLIMATE OF THE ZUMWALT PRAIRIE OF NORTHEASTERN OREGON FROM 19 TO PRESENT 24 MAY Prepared by J. D. Hansen 1, R.V. Taylor 2, and H. Schmalz 1 Ecologist, Turtle Mt. Environmental Consulting, 652 US Hwy 97,

More information

Chris Bishop s PRML Ch. 8: Graphical Models

Chris Bishop s PRML Ch. 8: Graphical Models Chris Bishop s PRML Ch. 8: Graphical Models January 24, 2008 Introduction Visualize the structure of a probabilistic model Design and motivate new models Insights into the model s properties, in particular

More information

Hierarchical Modeling for Multivariate Spatial Data

Hierarchical Modeling for Multivariate Spatial Data Hierarchical Modeling for Multivariate Spatial Data Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department

More information

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Jonathan Gruhl March 18, 2010 1 Introduction Researchers commonly apply item response theory (IRT) models to binary and ordinal

More information

Bayesian Methods in Multilevel Regression

Bayesian Methods in Multilevel Regression Bayesian Methods in Multilevel Regression Joop Hox MuLOG, 15 september 2000 mcmc What is Statistics?! Statistics is about uncertainty To err is human, to forgive divine, but to include errors in your design

More information

BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA

BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA Intro: Course Outline and Brief Intro to Marina Vannucci Rice University, USA PASI-CIMAT 04/28-30/2010 Marina Vannucci

More information

Basics of Modern Missing Data Analysis

Basics of Modern Missing Data Analysis Basics of Modern Missing Data Analysis Kyle M. Lang Center for Research Methods and Data Analysis University of Kansas March 8, 2013 Topics to be Covered An introduction to the missing data problem Missing

More information

Flexible Spatio-temporal smoothing with array methods

Flexible Spatio-temporal smoothing with array methods Int. Statistical Inst.: Proc. 58th World Statistical Congress, 2011, Dublin (Session IPS046) p.849 Flexible Spatio-temporal smoothing with array methods Dae-Jin Lee CSIRO, Mathematics, Informatics and

More information

NGSS Example Bundles. Page 1 of 23

NGSS Example Bundles. Page 1 of 23 High School Conceptual Progressions Model III Bundle 2 Evolution of Life This is the second bundle of the High School Conceptual Progressions Model Course III. Each bundle has connections to the other

More information

The STS Surgeon Composite Technical Appendix

The STS Surgeon Composite Technical Appendix The STS Surgeon Composite Technical Appendix Overview Surgeon-specific risk-adjusted operative operative mortality and major complication rates were estimated using a bivariate random-effects logistic

More information

Climate Change Impact on Air Temperature, Daily Temperature Range, Growing Degree Days, and Spring and Fall Frost Dates In Nebraska

Climate Change Impact on Air Temperature, Daily Temperature Range, Growing Degree Days, and Spring and Fall Frost Dates In Nebraska EXTENSION Know how. Know now. Climate Change Impact on Air Temperature, Daily Temperature Range, Growing Degree Days, and Spring and Fall Frost Dates In Nebraska EC715 Kari E. Skaggs, Research Associate

More information

The Colorado Drought of 2002 in Perspective

The Colorado Drought of 2002 in Perspective The Colorado Drought of 2002 in Perspective Colorado Climate Center Nolan Doesken and Roger Pielke, Sr. Prepared by Tara Green and Odie Bliss http://climate.atmos.colostate.edu Known Characteristics of

More information

7. Estimation and hypothesis testing. Objective. Recommended reading

7. Estimation and hypothesis testing. Objective. Recommended reading 7. Estimation and hypothesis testing Objective In this chapter, we show how the election of estimators can be represented as a decision problem. Secondly, we consider the problem of hypothesis testing

More information

DROUGHT IN MAINLAND PORTUGAL

DROUGHT IN MAINLAND PORTUGAL DROUGHT IN MAINLAND Ministério da Ciência, Tecnologia e Ensino Superior Instituto de Meteorologia, I. P. Rua C Aeroporto de Lisboa Tel.: (351) 21 844 7000 e-mail:informacoes@meteo.pt 1749-077 Lisboa Portugal

More information

Minnesota s Climatic Conditions, Outlook, and Impacts on Agriculture. Today. 1. The weather and climate of 2017 to date

Minnesota s Climatic Conditions, Outlook, and Impacts on Agriculture. Today. 1. The weather and climate of 2017 to date Minnesota s Climatic Conditions, Outlook, and Impacts on Agriculture Kenny Blumenfeld, State Climatology Office Crop Insurance Conference, Sep 13, 2017 Today 1. The weather and climate of 2017 to date

More information

1 Using standard errors when comparing estimated values

1 Using standard errors when comparing estimated values MLPR Assignment Part : General comments Below are comments on some recurring issues I came across when marking the second part of the assignment, which I thought it would help to explain in more detail

More information

Colorado s 2003 Moisture Outlook

Colorado s 2003 Moisture Outlook Colorado s 2003 Moisture Outlook Nolan Doesken and Roger Pielke, Sr. Colorado Climate Center Prepared by Tara Green and Odie Bliss http://climate.atmos.colostate.edu How we got into this drought! Fort

More information

Default Priors and Effcient Posterior Computation in Bayesian

Default Priors and Effcient Posterior Computation in Bayesian Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature

More information

Probabilistic Graphical Models

Probabilistic Graphical Models 2016 Robert Nowak Probabilistic Graphical Models 1 Introduction We have focused mainly on linear models for signals, in particular the subspace model x = Uθ, where U is a n k matrix and θ R k is a vector

More information

Communicating Climate Change Consequences for Land Use

Communicating Climate Change Consequences for Land Use Communicating Climate Change Consequences for Land Use Site: Prabost, Skye. Event: Kyle of Lochalsh, 28 th February 28 Further information: http://www.macaulay.ac.uk/ladss/comm_cc_consequences.html Who

More information

Variability of Reference Evapotranspiration Across Nebraska

Variability of Reference Evapotranspiration Across Nebraska Know how. Know now. EC733 Variability of Reference Evapotranspiration Across Nebraska Suat Irmak, Extension Soil and Water Resources and Irrigation Specialist Kari E. Skaggs, Research Associate, Biological

More information

November 2002 STA Random Effects Selection in Linear Mixed Models

November 2002 STA Random Effects Selection in Linear Mixed Models November 2002 STA216 1 Random Effects Selection in Linear Mixed Models November 2002 STA216 2 Introduction It is common practice in many applications to collect multiple measurements on a subject. Linear

More information

Drought in Southeast Colorado

Drought in Southeast Colorado Drought in Southeast Colorado Nolan Doesken and Roger Pielke, Sr. Colorado Climate Center Prepared by Tara Green and Odie Bliss http://climate.atmos.colostate.edu 1 Historical Perspective on Drought Tourism

More information

Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P.

Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P. Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Melanie M. Wall, Bradley P. Carlin November 24, 2014 Outlines of the talk

More information

Spatio-temporal precipitation modeling based on time-varying regressions

Spatio-temporal precipitation modeling based on time-varying regressions Spatio-temporal precipitation modeling based on time-varying regressions Oleg Makhnin Department of Mathematics New Mexico Tech Socorro, NM 87801 January 19, 2007 1 Abstract: A time-varying regression

More information

The Generalized Likelihood Uncertainty Estimation methodology

The Generalized Likelihood Uncertainty Estimation methodology CHAPTER 4 The Generalized Likelihood Uncertainty Estimation methodology Calibration and uncertainty estimation based upon a statistical framework is aimed at finding an optimal set of models, parameters

More information

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

Bayesian Linear Regression

Bayesian Linear Regression Bayesian Linear Regression Sudipto Banerjee 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. September 15, 2010 1 Linear regression models: a Bayesian perspective

More information

Impacts of Changes in Extreme Weather and Climate on Wild Plants and Animals. Camille Parmesan Integrative Biology University of Texas at Austin

Impacts of Changes in Extreme Weather and Climate on Wild Plants and Animals. Camille Parmesan Integrative Biology University of Texas at Austin Impacts of Changes in Extreme Weather and Climate on Wild Plants and Animals Camille Parmesan Integrative Biology University of Texas at Austin Species Level: Climate extremes determine species distributions

More information

Metropolis-Hastings Algorithm

Metropolis-Hastings Algorithm Strength of the Gibbs sampler Metropolis-Hastings Algorithm Easy algorithm to think about. Exploits the factorization properties of the joint probability distribution. No difficult choices to be made to

More information

Hierarchical Modelling for Multivariate Spatial Data

Hierarchical Modelling for Multivariate Spatial Data Hierarchical Modelling for Multivariate Spatial Data Geography 890, Hierarchical Bayesian Models for Environmental Spatial Data Analysis February 15, 2011 1 Point-referenced spatial data often come as

More information

Introduction: MLE, MAP, Bayesian reasoning (28/8/13)

Introduction: MLE, MAP, Bayesian reasoning (28/8/13) STA561: Probabilistic machine learning Introduction: MLE, MAP, Bayesian reasoning (28/8/13) Lecturer: Barbara Engelhardt Scribes: K. Ulrich, J. Subramanian, N. Raval, J. O Hollaren 1 Classifiers In this

More information

Learning a probabalistic model of rainfall using graphical models

Learning a probabalistic model of rainfall using graphical models Learning a probabalistic model of rainfall using graphical models Byoungkoo Lee Computational Biology Carnegie Mellon University Pittsburgh, PA 15213 byounko@andrew.cmu.edu Jacob Joseph Computational Biology

More information

an introduction to bayesian inference

an introduction to bayesian inference with an application to network analysis http://jakehofman.com january 13, 2010 motivation would like models that: provide predictive and explanatory power are complex enough to describe observed phenomena

More information

Technical note on seasonal adjustment for M0

Technical note on seasonal adjustment for M0 Technical note on seasonal adjustment for M0 July 1, 2013 Contents 1 M0 2 2 Steps in the seasonal adjustment procedure 3 2.1 Pre-adjustment analysis............................... 3 2.2 Seasonal adjustment.................................

More information

Climate also has a large influence on how local ecosystems have evolved and how we interact with them.

Climate also has a large influence on how local ecosystems have evolved and how we interact with them. The Mississippi River in a Changing Climate By Paul Lehman, P.Eng., General Manager Mississippi Valley Conservation (This article originally appeared in the Mississippi Lakes Association s 212 Mississippi

More information

STA 294: Stochastic Processes & Bayesian Nonparametrics

STA 294: Stochastic Processes & Bayesian Nonparametrics MARKOV CHAINS AND CONVERGENCE CONCEPTS Markov chains are among the simplest stochastic processes, just one step beyond iid sequences of random variables. Traditionally they ve been used in modelling a

More information

Appendix: Modeling Approach

Appendix: Modeling Approach AFFECTIVE PRIMACY IN INTRAORGANIZATIONAL TASK NETWORKS Appendix: Modeling Approach There is now a significant and developing literature on Bayesian methods in social network analysis. See, for instance,

More information

Labor-Supply Shifts and Economic Fluctuations. Technical Appendix

Labor-Supply Shifts and Economic Fluctuations. Technical Appendix Labor-Supply Shifts and Economic Fluctuations Technical Appendix Yongsung Chang Department of Economics University of Pennsylvania Frank Schorfheide Department of Economics University of Pennsylvania January

More information

Bayesian Hierarchical Models

Bayesian Hierarchical Models Bayesian Hierarchical Models Gavin Shaddick, Millie Green, Matthew Thomas University of Bath 6 th - 9 th December 2016 1/ 34 APPLICATIONS OF BAYESIAN HIERARCHICAL MODELS 2/ 34 OUTLINE Spatial epidemiology

More information

peak half-hourly New South Wales

peak half-hourly New South Wales Forecasting long-term peak half-hourly electricity demand for New South Wales Dr Shu Fan B.S., M.S., Ph.D. Professor Rob J Hyndman B.Sc. (Hons), Ph.D., A.Stat. Business & Economic Forecasting Unit Report

More information

Multilevel Analysis, with Extensions

Multilevel Analysis, with Extensions May 26, 2010 We start by reviewing the research on multilevel analysis that has been done in psychometrics and educational statistics, roughly since 1985. The canonical reference (at least I hope so) is

More information

Bayesian non-parametric model to longitudinally predict churn

Bayesian non-parametric model to longitudinally predict churn Bayesian non-parametric model to longitudinally predict churn Bruno Scarpa Università di Padova Conference of European Statistics Stakeholders Methodologists, Producers and Users of European Statistics

More information

Statistical Forecast of the 2001 Western Wildfire Season Using Principal Components Regression. Experimental Long-Lead Forecast Bulletin

Statistical Forecast of the 2001 Western Wildfire Season Using Principal Components Regression. Experimental Long-Lead Forecast Bulletin Statistical Forecast of the 2001 Western Wildfire Season Using Principal Components Regression contributed by Anthony L. Westerling 1, Daniel R. Cayan 1,2, Alexander Gershunov 1, Michael D. Dettinger 2

More information

Local Likelihood Bayesian Cluster Modeling for small area health data. Andrew Lawson Arnold School of Public Health University of South Carolina

Local Likelihood Bayesian Cluster Modeling for small area health data. Andrew Lawson Arnold School of Public Health University of South Carolina Local Likelihood Bayesian Cluster Modeling for small area health data Andrew Lawson Arnold School of Public Health University of South Carolina Local Likelihood Bayesian Cluster Modelling for Small Area

More information

Spatial Inference of Nitrate Concentrations in Groundwater

Spatial Inference of Nitrate Concentrations in Groundwater Spatial Inference of Nitrate Concentrations in Groundwater Dawn Woodard Operations Research & Information Engineering Cornell University joint work with Robert Wolpert, Duke Univ. Dept. of Statistical

More information

Bayesian Linear Models

Bayesian Linear Models Bayesian Linear Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Department of Forestry & Department of Geography, Michigan State University, Lansing Michigan, U.S.A. 2 Biostatistics, School of Public

More information

Bayesian variable selection via. Penalized credible regions. Brian Reich, NCSU. Joint work with. Howard Bondell and Ander Wilson

Bayesian variable selection via. Penalized credible regions. Brian Reich, NCSU. Joint work with. Howard Bondell and Ander Wilson Bayesian variable selection via penalized credible regions Brian Reich, NC State Joint work with Howard Bondell and Ander Wilson Brian Reich, NCSU Penalized credible regions 1 Motivation big p, small n

More information

2003 Moisture Outlook

2003 Moisture Outlook 2003 Moisture Outlook Nolan Doesken and Roger Pielke, Sr. Colorado Climate Center Prepared by Tara Green and Odie Bliss http://climate.atmos.colostate.edu Through 1999 Through 1999 Fort Collins Total Water

More information

The Metropolis-Hastings Algorithm. June 8, 2012

The Metropolis-Hastings Algorithm. June 8, 2012 The Metropolis-Hastings Algorithm June 8, 22 The Plan. Understand what a simulated distribution is 2. Understand why the Metropolis-Hastings algorithm works 3. Learn how to apply the Metropolis-Hastings

More information

Progress Report Year 2, NAG5-6003: The Dynamics of a Semi-Arid Region in Response to Climate and Water-Use Policy

Progress Report Year 2, NAG5-6003: The Dynamics of a Semi-Arid Region in Response to Climate and Water-Use Policy Progress Report Year 2, NAG5-6003: The Dynamics of a Semi-Arid Region in Response to Climate and Water-Use Policy Principal Investigator: Dr. John F. Mustard Department of Geological Sciences Brown University

More information

Data Integration Model for Air Quality: A Hierarchical Approach to the Global Estimation of Exposures to Ambient Air Pollution

Data Integration Model for Air Quality: A Hierarchical Approach to the Global Estimation of Exposures to Ambient Air Pollution Data Integration Model for Air Quality: A Hierarchical Approach to the Global Estimation of Exposures to Ambient Air Pollution Matthew Thomas 9 th January 07 / 0 OUTLINE Introduction Previous methods for

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is

More information

Modelling geoadditive survival data

Modelling geoadditive survival data Modelling geoadditive survival data Thomas Kneib & Ludwig Fahrmeir Department of Statistics, Ludwig-Maximilians-University Munich 1. Leukemia survival data 2. Structured hazard regression 3. Mixed model

More information

Bayesian Inference. Chapter 4: Regression and Hierarchical Models

Bayesian Inference. Chapter 4: Regression and Hierarchical Models Bayesian Inference Chapter 4: Regression and Hierarchical Models Conchi Ausín and Mike Wiper Department of Statistics Universidad Carlos III de Madrid Advanced Statistics and Data Mining Summer School

More information

Bayes methods for categorical data. April 25, 2017

Bayes methods for categorical data. April 25, 2017 Bayes methods for categorical data April 25, 2017 Motivation for joint probability models Increasing interest in high-dimensional data in broad applications Focus may be on prediction, variable selection,

More information

Drought History. for Southeast Oklahoma. Prepared by the South Central Climate Science Center in Norman, Oklahoma

Drought History. for Southeast Oklahoma. Prepared by the South Central Climate Science Center in Norman, Oklahoma Drought History for Southeast Oklahoma Prepared by the South Central Climate Science Center in Norman, Oklahoma May 28, 2013 http://southcentralclimate.org/ info@southcentralclimate.org (This page left

More information

Basic Sampling Methods

Basic Sampling Methods Basic Sampling Methods Sargur Srihari srihari@cedar.buffalo.edu 1 1. Motivation Topics Intractability in ML How sampling can help 2. Ancestral Sampling Using BNs 3. Transforming a Uniform Distribution

More information

The Particle Filter. PD Dr. Rudolph Triebel Computer Vision Group. Machine Learning for Computer Vision

The Particle Filter. PD Dr. Rudolph Triebel Computer Vision Group. Machine Learning for Computer Vision The Particle Filter Non-parametric implementation of Bayes filter Represents the belief (posterior) random state samples. by a set of This representation is approximate. Can represent distributions that

More information

P7: Limiting Factors in Ecosystems

P7: Limiting Factors in Ecosystems P7: Limiting Factors in Ecosystems Purpose To understand that physical factors temperature and precipitation limit the growth of vegetative ecosystems Overview Students correlate graphs of vegetation vigor

More information

The effect of wind direction on ozone levels - a case study

The effect of wind direction on ozone levels - a case study The effect of wind direction on ozone levels - a case study S. Rao Jammalamadaka and Ulric J. Lund University of California, Santa Barbara and California Polytechnic State University, San Luis Obispo Abstract

More information

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016
