Mapping under-five mortality in the Wenchuan earthquake using hierarchical Bayesian modeling

International Journal of Environmental Health Research, 2011, 1-8, iFirst article

Yi Hu a,b, Jinfeng Wang b*, Jun Zhu c* and Dan Ren c

a School of Earth & Mineral Resource, China University of Geosciences, Beijing; b State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing; c National Office for Maternal and Child Health Surveillance, West China Second Hospital, Sichuan University, Chengdu, China

(Received 22 October 2010; final version received 4 January 2011)

More than two years after the 2008 earthquake in Wenchuan, China, the total number of lives lost remains unclear, particularly for children under five years old. Mortality for this age group can be estimated using a variety of techniques, but sample proportion estimates may be unreliable in areas with low populations of children under five. To address this problem, we propose a hierarchical Bayesian model to map the distribution of under-five mortality in Wenchuan at the township scale. This model is based on conditional distributions for data conditioned on a spatial process and parameters to capture uncertainties usually identified as either spatially-correlated effects or heterogeneity effects. The method was adapted to obtain reliable estimates of the under-five mortality rate in townships with low under-five populations. The approach was compared to other models and, despite some limitations, was found to outperform other methods in its smoothing effect as well as in exploration of other aspects of spatial patterns.
Keywords: under-five mortality rate; Hierarchical Bayesian (HB) model; Geographic Information System (GIS); smoothing; earthquake

*Corresponding authors. Email: wangjf@lreis.ac.cn and zhujun028@163.com
ISSN 0960-3123 print/ISSN 1369-1619 online. © 2011 Taylor & Francis. DOI: 10.1080/09603123.2011.560250. http://www.informaworld.com

Introduction

On 12 May 2008, at 14:28 h local time, an earthquake registering 8.0 on the Richter scale hit the north-western part of Sichuan Province, China, with the epicenter in Wenchuan County. The devastating earthquake claimed more than 69,000 lives, many of whom were children, particularly children less than five years old (Watts 2008). Under-five mortality, which is an important indicator of a country's or district's overall health and level of development, is of great concern to policy-makers and international organizations that wish to improve public health and living standards. As part of ongoing efforts to understand spatial patterns of child mortality and to help take preventive measures in future reconstruction, small-area maps visualizing under-five mortality rates, such as those at the township level, should be produced and shared with the public.

According to the traditional method, the age-specific mortality rate is calculated as the total deaths of a specific age or age group in a geographic area divided by the population of the same age or age group for a specified time period, usually a calendar year, and multiplied by 1,000. Accordingly, the under-five mortality rate of a township in an earthquake-hit area could be calculated as the total number of earthquake-induced deaths of children under five divided by the under-five population of the same township and multiplied by 1,000. However, this sample proportion has large standard errors for townships with small populations of children under five and thus may indicate much more variability than actually exists.

Different smoothing methods have been developed to address this issue, all based on the assumption that observations close together in space are more likely to share similar properties than those that are far apart (Tobler 1970). While this positive spatial autocorrelation may be problematic for statistical methods that require independent observations, it can also be embraced to help smooth noisy maps by borrowing strength from neighbors for mapping units with small populations (Johnson 2004). Here, we used a Hierarchical Bayesian (HB) model to adjust the sample proportion in any given township by taking into account data from all the townships and each township's spatial contiguity with its neighbors.

Methods

Data

The National Office for Maternal and Child Health Surveillance provided under-five mortality data collected at the township level in the Wenchuan earthquake and the number of children under five as of mid-2008. There were 934 under-five deaths distributed across 115 townships in Sichuan Province. Township boundaries were provided as a shapefile by the State Key Laboratory of Resources and Environmental Information Systems (LREIS) of the Institute of Geographic Sciences and Natural Resources Research (IGSNRR), Chinese Academy of Sciences.
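The instability of the sample proportion described in the Introduction can be illustrated with a minimal Python sketch. The two townships below are hypothetical, not drawn from the study data, and the binomial standard error is an assumption made here for illustration; the paper does not state which error formula it has in mind.

```python
import math


def crude_rate_per_1000(deaths, population):
    """Sample-proportion mortality rate, expressed per 1,000 children."""
    return 1000.0 * deaths / population


def binomial_se_per_1000(deaths, population):
    """Approximate binomial standard error of the rate, per 1,000 children."""
    p = deaths / population
    return 1000.0 * math.sqrt(p * (1.0 - p) / population)


# Two hypothetical townships with the same underlying proportion (2%).
townships = {"small": (1, 50), "large": (100, 5000)}

for name, (deaths, population) in townships.items():
    rate = crude_rate_per_1000(deaths, population)
    se = binomial_se_per_1000(deaths, population)
    print(f"{name}: rate = {rate:.2f} per 1,000, standard error = {se:.2f}")
```

Both townships have the same crude rate, but the standard error of the small township is ten times larger, because the error scales with the inverse square root of the population.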
Statistical inference

The HB model is simply an extension of traditional Bayesian models in which the prior distributions have some form of conditional dependency (Clark 2007). It is a powerful tool for expressing rich statistical models that reflect a given problem more fully than a simpler model could. We postulate the following simple probabilistic model. Let $Z(i) = O(i)$ denote the number of under-five deaths observed in township $i$ during the Wenchuan earthquake. The $O(i)$ are assumed to be independently Poisson distributed with intensity parameter $\lambda(i) = E(i)\,r(i)$, where $E(i)$ denotes the expected number of under-five deaths from the specific cause in township $i$, which is fixed and proportional to the corresponding under-five population $n_i$, and $r(i)$ is the positive township-specific relative risk of under-five mortality in township $i$. That is,

$$O(i) \sim \mathrm{Poisson}\big(E(i)\,r(i)\big)$$

The relative risk parameter $r(i)$ is assigned a log-normal prior distribution, $\log[r(i)] \sim N(\mu_i, \sigma_i^2)$, where the expectation and variance are defined by a linear function of a common value (intercept), $\alpha$, and two independent random effects: a heterogeneous component, $e(i)$, that does not depend on the geographic location of townships, and an autocorrelated component, $v(i)$, that reflects local spatial structure by incorporating the influence of neighboring townships. That is,

$$\log(r(i)) = \alpha + v(i) + e(i) \qquad (1)$$

Prior distributions are then assigned to these linear terms, and hyperprior distributions are in turn assigned to the variance terms as follows, thus creating a hierarchical model:

$$v(i) \sim N(0, \kappa^2), \qquad e(i) \sim N(0, \sigma^2),$$

$$v(i) \mid v(j),\ j \in N(i) \ \sim\ N\!\left(\sum_{j=1}^{n} w^*(i,j)\,v(j),\ \frac{\kappa^2}{\sum_{j=1}^{n} w(i,j)}\right)$$

where $N(i)$ denotes the neighborhood of $i$, $w(i,j)$ is an element of a weights matrix, and $w^*(i,j)$ is its standardized form, defining the relationship between township $i$ and its neighbor township $j$. The weight is defined simply as $w(i,j) = 1$ if the two townships are adjacent (share a common border) and $w(i,j) = 0$ otherwise. The hyperpriors are

$$1/\kappa^2 \sim \mathrm{Gamma}(a, b), \qquad 1/\sigma^2 \sim \mathrm{Gamma}(c, d)$$

where $a$ and $c$ are shape parameters, and $b$ and $d$ are inverse scale parameters. This is the convolution Gaussian model originally proposed by Besag and Newell (1991), where the random effect associated with spatial autocorrelation, $v(i)$, is defined according to the conditional auto-regressive (CAR) model (Besag 1974). Specifically, the intercept $\alpha$ was assigned a flat prior distribution, and the hyperprior distributions for $1/\kappa^2$ and $1/\sigma^2$ were both specified as Gamma(0.5, 0.0005) for this study.

Using geographic information system (GIS) software, health events that have been address-matched can be automatically assigned to any level of census units, allowing conventional mapping and analysis of data. We used ArcGIS 9.3 to map the mortality rate at the township level and to capture, for the spatially structured variation $v(i)$, information about each township's contiguity with its neighbors. Then WinBUGS 1.4, statistical software for Bayesian analysis using Markov Chain Monte Carlo (MCMC) methods, was used to implement the hierarchical Bayesian model.
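To make the conditional autoregressive prior for v(i) more concrete, the following Python sketch computes its conditional mean and variance from a binary adjacency matrix. It illustrates the formula only and is not the authors' WinBUGS code; the function name `car_conditional`, the four-township chain, and the numeric values are all hypothetical.

```python
def car_conditional(weights, v, kappa2, i):
    """Conditional mean and variance of v[i] given the other v[j] under the CAR prior.

    weights : square list of lists, weights[i][j] = 1 if townships i and j
              share a border and 0 otherwise (zero on the diagonal)
    v       : current values of the spatially structured effect
    kappa2  : variance parameter kappa^2
    """
    row = weights[i]
    total = sum(row)  # number of neighbours of township i
    if total == 0:
        raise ValueError("township has no neighbours; the CAR prior is undefined")
    # Row-standardised weights w*(i, j) = w(i, j) / sum_j w(i, j)
    mean = sum(w * vj for w, vj in zip(row, v)) / total
    variance = kappa2 / total
    return mean, variance


# Hypothetical chain of four townships: 0 - 1 - 2 - 3
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
v = [0.2, -0.1, 0.4, 0.0]

mean, variance = car_conditional(W, v, kappa2=1.0, i=1)
print(mean, variance)
```

Township 1 has two neighbours, so its conditional mean is the average of their effects and its conditional variance is half of kappa^2; townships with more neighbours are pulled more tightly toward the local average.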
Finally, the R statistical environment, which calls WinBUGS for the Gibbs sampling algorithm, was employed to conduct convergence diagnostics for the parameters.

Results

Following the Bayesian inference technique, we obtained the marginal posterior distribution for the parameters in model (1). A single-chain sampler with a burn-in of 4,000 iterations was run, followed by 1,000 iterations during which values for α, v(i), and e(i) were stored. Diagnostic tests for convergence of the stored variables were carried out (Table 1), including the Geweke and Heidelberger-Welch tests. The tests show convergence of the chains for most of the parameters.

Table 1. Test statistics for MCMC convergence. Percentage (%) of tests passed.

| Test | α | v | e |
|---|---|---|---|
| Geweke (Z-value) | 100 | 85.34 | 84.56 |
| Heidelberger-Welch | 100 | 98.62 | 98.51 |

The Z-value threshold interval for passing is (-1.96, 1.96).

Table 2. Proportion estimates for under-five mortality at the township scale in the 2008 Wenchuan earthquake, China. Results are summarized over 115 townships.

| Statistic | T | P (‰) | HB (‰) |
|---|---|---|---|
| Max | 4916 | 289.48 | 287.45 |
| Min | 55 | 0.30 | 0.62 |
| Mean | 759 | 19.35 | 17.67 |
| Median | 524 | 5.56 | 4.92 |
| STD | 744.63 | 37.98 | 36.10 |

STD, standard deviation; T, population of children under five; P, sample proportion; HB, estimate using model (1).

Table 2 presents some statistical characteristics of under-five mortality proportion estimates in the 115 townships. The total under-five population ranges from a minimum of 55 to a maximum of 4,916, with a large standard deviation of 744.63. The under-five mortality sample proportion estimates range from 0.30‰ to 289.48‰, with a mean and standard deviation of 19.35‰ and 37.98, respectively. This was not surprising considering the large fluctuation of the under-five population. The HB estimates, on the other hand, tended to be more homogeneous, ranging from 0.62‰ to 287.45‰, with a mean and standard deviation of 17.67‰ and 36.10, respectively. Using ArcGIS 9.3, we mapped the 115 earthquake-hit townships. Figure 1 shows the distribution of the HB estimates of the under-five mortality rate in the study area.

Discussion and conclusion

The sample proportion estimates resulted in a large standard error, showing instability in townships where the under-five population is small. This is because rates based on small populations are more susceptible to data errors than rates computed from large populations (Haining 2003). Specifically, the addition or subtraction of a single death has a greater effect on the computed rate when the population denominator is small than when it is large. This is not, however, a problem in the HB model.
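The sensitivity of the sample proportion to a single death can be checked with a short Python sketch. The populations 55 and 4,916 are the minimum and maximum under-five populations reported in Table 2; the death counts are hypothetical and chosen only for illustration.

```python
def rate_per_1000(deaths, population):
    """Sample-proportion mortality rate per 1,000 children."""
    return 1000.0 * deaths / population


def shift_from_one_death(deaths, population):
    """Change in the rate per 1,000 when one more death is recorded."""
    return rate_per_1000(deaths + 1, population) - rate_per_1000(deaths, population)


# Smallest and largest township populations from Table 2, with
# hypothetical death counts (the shift does not depend on them).
small_shift = shift_from_one_death(deaths=1, population=55)
large_shift = shift_from_one_death(deaths=90, population=4916)

print(f"population 55:   one extra death moves the rate by {small_shift:.2f} per 1,000")
print(f"population 4916: one extra death moves the rate by {large_shift:.2f} per 1,000")
```

The shift equals 1,000 divided by the population, so a single misrecorded death changes the rate of the smallest township roughly 89 times more than that of the largest.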
The HB model-based mortality rate takes into account a mean effect α, an independent component e(i), and a local spatially contiguous dependence v(i) for every subarea (township) when the mortality rate is calculated. It has the appealing feature of providing a whole distribution of possible outcomes that can be used not only for smoothing but also for exploring other aspects of spatial patterns. This method estimates the death rate for any given township by borrowing strength from other townships in the study area, either from neighbors or from all others, as determined by e(i) and v(i). Where spatially structured heterogeneity dominates, the death rates for townships with small populations are shifted towards the average rate of the areas that are geographic neighbors, whereas death rates shrink toward the average rate of all townships in the study area if unstructured variation dominates. This method depends on all the data and on the spatial contiguity of each township with its neighbors, which is typically more meaningful in practice. A benefit is a potential reduction in the mean-squared error of the estimates around the true values.

Figure 1. Hierarchical Bayesian smoothed under-five mortality rate of the 2008 Wenchuan earthquake, China. Thematic categories are based on the Jenks natural breaks method.

While there are areas open to improvement in the HB spatial modeling method, it is a valuable tool for geo-spatial assessment of death patterns that can help identify differences among specified geographic areas. This may in turn indicate patterns of health care access, screening, and diagnostic follow-up, and possibly provide clues about causal relationships.

One of the most important aims of mapping under-five mortality in the earthquake-affected townships in Wenchuan is to help policy-makers and the public understand the spatial pattern of under-five mortality so that preventive measures can be taken in the reconstruction. Tests of spatial patterns, however, usually suffer from small-area problems. If areas vary substantially in spatial support (the population sizes on which the rates are calculated), then any test for spatial autocorrelation that assumes constant variance across the set of areas should be used with caution (Gelman and Price 1999). Moran's I, a commonly used measure of spatial autocorrelation, is one such test, and we used it to explore the spatial pattern of under-five mortality.
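For readers unfamiliar with the statistic, a minimal Python sketch of the global Moran's I is given below. It uses the standard textbook formula with binary contiguity weights, which is assumed here rather than taken from the paper, and it omits the significance test. The function name `morans_i`, the four-township chain, and the rates are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights matrix.

    I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2,
    where S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [x - mean for x in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    total = sum(d * d for d in dev)
    return (n / s0) * (cross / total)


# Hypothetical chain of four townships: 0 - 1 - 2 - 3
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = [10.0, 12.0, 30.0, 34.0]    # similar rates sit next to each other
alternating = [10.0, 30.0, 10.0, 30.0]  # neighbours always differ

print(morans_i(clustered, W))
print(morans_i(alternating, W))
```

The clustered pattern gives a positive value and the alternating pattern a negative one. Note that this plain form treats every township's rate as equally reliable, which is exactly the constant-variance assumption the text cautions against when populations differ widely.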
The global Moran's I, which detects global autocorrelation over the study area, shows a significantly positive autocorrelation for both the sample proportions (Moran's I = 0.16, p = 0.02) and the HB estimates (Moran's I = 0.19, p = 0.02). The local Moran's I, which detects local clusters (known as "hotspots") of under-five mortality, indicates different cluster sites for the sample proportions and the HB estimates. Figure 2 shows the difference: map A shows three high-high pattern clusters whereas map B shows only two. Focusing on map A can mislead risk factor exploration, policy decisions, and safe reconstruction efforts.

Figure 2. Township-level under-five mortality rate clusters in the 2008 Wenchuan earthquake, China. The statistical significance is at 95% confidence. Map A is based on sample proportion estimates and map B is based on HB estimates.

Other models besides the HB model have been proposed to adjust the sample proportion estimate. Earlier applications employed Empirical Bayesian (EB) modeling (Clayton and Kaldor 1987), where parameters in the model are estimated directly from the data instead of being given priors. This approach is limited because it assigns a point estimate to the parameters without allowing for the variability that may be associated with them, and this variability can be large (Bernardinelli and Montomoli 1992).

Agresti used a random effects model with a simulated sample of size 2000 to mimic a poll taken before the 1996 U.S. presidential election (Agresti 2003). The model is:

$$\mathrm{logit}\big[P(Y_{it} = 1 \mid u_i)\big] = \alpha + u_i, \qquad u_i \sim N(0, \sigma^2) \qquad (2)$$

Random effects model (2) treats each subarea $i$ as a cluster drawn from $N(\alpha, \sigma^2)$, assuming that the true proportions vary according to a normal distribution, and the fitting process borrows from the whole: it uses overall data from the study area to estimate the proportion in any given subarea. Model (2) is a special case of model (1) in which v(i), which carries the spatial information related to the mortality rate, equals zero. Consequently, rates based on model (2) are less robust to data errors than rates based on model (1). This has been shown in related research (Johnson 2004; Zhu et al. 2006).

Rushton and Lolonis (1996) used spatial filters (smoothing) through Monte Carlo (MC) simulations to map birth defect rates. The spatial filter refers to a regular lattice of grid points located at a certain interval, and the local incidence rate at each grid location is computed by dividing the number of cases occurring in the geographical vicinity of the grid location by the total number at risk in the same vicinity. This method of spatial filtering smooths incidence rates in an area as a continuous spatial distribution, rather than following the traditional concept of a pattern of units (townships in our study), by including the spatial dependence of neighboring points due to their sharing of observations. However, spatial smoothing based on strict spatial adjacency may not be a good basis on which to borrow information because such neighbors are not necessarily similar (e.g., across urban/rural boundaries or where there are physical barriers) (Haining 2003). In other words, this method considers only the spatial dependence of neighbors, not a heterogeneity effect, and consequently lacks flexibility for situations where only heterogeneous variation (unstructured variation in model (1)) dominates.

Although the HB model outperforms the models above, it still needs improved flexibility to deal with different situations. It may suffer from the situation mentioned above, where considerable heterogeneity in the affecting factors exists between neighbors, though the unstructured variation e in model (1) does offer a compromise.
In this study, for example, there is no guarantee that any township in the study area shares exactly the same socioeconomic characteristics, such as GDP and average income, or physical characteristics, such as soil and geomorphic types, with its neighbors. Such differences would unavoidably enlarge the mean-squared error of the estimates around the true values. Furthermore, consideration might also be given to borrowing strength from other areas that, though not adjacent, are similar in terms of the factors that influence incidence rates, e.g., two urban areas (Haining 2003). The relationship between separated townships that have similar socioeconomic characteristics could be considered in the modeling.

Acknowledgments

This study was supported by the MOST, China (2009 ZX10602-01-04, 2008 BAI56B02, 2007 DFC20180, 2007 AA12Z233; 2006 BAK01A13), NSFC (41023010) and the CAS, China (KZCX2-YW-308). We would like to thank Luke Driskell from Louisiana State University for English language editing.

References

Agresti A. 2003. Categorical data analysis. Gainesville, FL: Wiley-Interscience. 710 p.
Bernardinelli L, Montomoli C. 1992. Empirical Bayes versus fully Bayes analysis of geographical variation in disease risk. Statist Med. 11:983-1007.
Besag JE. 1974. Spatial interaction and the statistical analysis of lattice systems (with discussion). J Royal Statist Soc B. 36:192-236.
Besag JE, Newell J. 1991. The detection of clusters in rare diseases. J Royal Statist Soc A. 154:143-155.
Clark JS. 2007. Models for ecological data: an introduction. Princeton, New Jersey: Princeton University Press. 632 p.
Clayton DG, Kaldor J. 1987. Empirical Bayes estimates of age-standardized relative risks for use in disease mapping. Biometrics. 43:671-681.
Gelman A, Price PN. 1999. All maps of parameter estimates are misleading. Statist Med. 18(23):3221-3234.
Haining R. 2003. Spatial data analysis theory and practice. Cambridge, UK: Cambridge University Press. 432 p.
Johnson GD. 2004. Small area mapping of prostate cancer incidence in New York State (USA) using fully Bayesian hierarchical modelling. Int J Health Geogr. 3(1):29.
Rushton G, Lolonis P. 1996. Exploratory spatial analysis of birth defect rates in an urban population. Statist Med. 15(7-9):717-726.
Tobler WR. 1970. A computer movie simulating urban growth in the Detroit region. Econom Geogr. 46:234-240.
Watts J. 2008 May 23. Chinese quake forced 3m children from homes. Guardian.
Zhu L, Gorman DM, Horel S. 2006. Hierarchical Bayesian spatial models for alcohol availability, drug hot spots and violent crime. Int J Health Geogr. 5:54.