Agro-Ecological Malaria Linkages in Uganda: A Spatial Probit Model
IFPRI Project Title: Environmental management options and delivery mechanisms to reduce malaria transmission in Uganda
Spatial Probit Research Team: Benjamin Wielgosz, Timothy Thomas, Edward Kato
Thanks and Acknowledgment: This project is supported by the German Federal Ministry for Economic Cooperation and Development (BMZ) and Deutsche Gesellschaft für Technische Zusammenarbeit (GTZ) GmbH.
Malaria Parasite Life Cycle
Ecology of Malaria
Malaria Atlas Project
Uganda as a Study Site
UNHS 2005 Malaria Prevalence
Malaria is endemic in over 95% of the country, with the highest malaria transmission intensities in the world. Uganda has the world's highest malaria incidence, at 478 cases per 1,000 population per year, and malaria is the leading cause of morbidity and mortality in Uganda.
Self Reported Malaria Incidence in previous 30 days (UNHS 2005/06 Data)
Agro-Ecological Linkages Identified in Literature
- Social Economic Status / Rural Development
- Land Use and Land Management
- Water Management
- Alternating Wet-Dry Irrigation
- Wetland Cultivation
- Canal-Based Irrigation Systems
- Rice
- Maize
- Cotton
- Pesticide-Intensive Agriculture
- Agro-Forestry
- Natural Ecology (Land Cover)
Spatial Auto-Correlation:
- Household Coordinates
- Agricultural Parcel Coordinates, 3 km Neighborhood
- Land Cover Type within 3 km
Binary Outcome Variable: malaria is either true or false, so the model is a Probit or Logit.
James LeSage has described the mathematics for spatial dependency and spatial error in probit models (available free on the web: http://www.spatial-econometrics.com/).
Bayesian Spatial Probit uses a single global model, but accounts for spatial clustering within the coefficients and the error term.
- Spatial Effects: clustering affects the coefficients and p values.
- Spatial Error: the clustering arises in the error term.
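To make the spatial lag idea concrete, the following is a minimal sketch of how binary outcomes could be generated under a spatial lag probit, assuming the standard latent-variable form y* = rho W y* + X beta + eps with y = 1 when y* > 0. The function name, the toy weights matrix, and the parameter values are all hypothetical and are not taken from the project's code or data.

```python
import numpy as np
from scipy.sparse import csc_matrix, identity
from scipy.sparse.linalg import spsolve


def simulate_spatial_lag_probit(W, X, beta, rho, rng):
    """Draw binary outcomes from y* = rho*W*y* + X*beta + eps, y = 1[y* > 0]."""
    n = W.shape[0]
    eps = rng.standard_normal(n)  # probit assumption: unit-variance normal errors
    A = identity(n, format="csc") - rho * csc_matrix(W)
    y_star = spsolve(A, X @ beta + eps)  # solves (I - rho*W) y* = X*beta + eps
    return (y_star > 0).astype(int), y_star


# Toy example (hypothetical): 4 individuals on a line, row-standardised weights.
W_toy = csc_matrix(np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]))
X_toy = np.column_stack([np.ones(4), [0.2, -1.0, 0.7, 1.5]])  # intercept + one covariate
y_toy, y_star_toy = simulate_spatial_lag_probit(
    W_toy, X_toy, beta=np.array([0.1, 0.8]), rho=0.4, rng=np.random.default_rng(0)
)
```

With rho = 0 the draw reduces to an ordinary (non-spatial) probit; a non-zero rho lets each individual's latent risk depend on the latent risk of its neighbors.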
Moran's I: This test determined that the spatial variation in the correlations could not be random.
Spatial Weights Matrix: We need a matrix which relates individuals to their neighbors using the distance values. BUT we have a dataset of 41,139 potential neighbors, which means a full matrix has to store 41,139 × 41,139 ≈ 1.7 billion values. At 8 bits (1 byte) per value, that requires 1.7 GB of memory just to store the matrix. To perform calculations, we need a Sparse Matrix, which does not store values for individuals who are not neighbors. Unfortunately, ArcGIS does not use this matrix type, which requires us to look into alternatives.
Geographically Weighted Regression: GWR could be expanded so that variance in coefficients and p values could be mapped. But GWR can't be used for probit / logit models.
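The memory arithmetic and the sparse alternative can be sketched as follows. This is an illustration only, not the tool used in the project: it assumes coordinates are already projected to kilometres, uses straight-line distance, and applies a 3 km cutoff with row-standardised binary weights. The function names are hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix, diags
from scipy.spatial import cKDTree


def dense_bytes(n, bytes_per_value=1):
    """Memory needed to store a full n x n matrix."""
    return n * n * bytes_per_value


def sparse_weights(coords_km, radius_km=3.0):
    """Row-standardised neighbour matrix: w_ij > 0 only if 0 < d_ij <= radius_km."""
    coords_km = np.asarray(coords_km, dtype=float)
    n = len(coords_km)
    tree = cKDTree(coords_km)
    pairs = tree.query_pairs(r=radius_km, output_type="ndarray")  # each pair once, i < j
    rows = np.concatenate([pairs[:, 0], pairs[:, 1]])  # mirror pairs so W is symmetric
    cols = np.concatenate([pairs[:, 1], pairs[:, 0]])
    W = csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(n, n))
    row_sums = np.asarray(W.sum(axis=1)).ravel()
    row_sums[row_sums == 0] = 1.0  # individuals with no neighbours keep an all-zero row
    return (diags(1.0 / row_sums) @ W).tocsr()


full_size = dense_bytes(41139)  # 1,692,417,321 bytes at one byte per value

# Toy example (hypothetical coordinates): the third point is 10 km away from the others.
W_small = sparse_weights([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]])
```

Only the neighbour pairs are stored, so memory grows with the number of neighbour relationships rather than with the square of the sample size.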
Spatial Auto-Correlation: The Moran's I test for spatial auto-correlation indicates that there are significant local clustering effects present in the coefficients evaluated in the global regressions.

Global Moran's I Summary

| | Tobit of Ln (HH Malaria Proportion) | Probit |
|---|---|---|
| Moran's Index | 0.039546 | 0.018786 |
| Expected Index | 0.000138 | 0.000024 |
| Variance | 0.000029 | 0.000001 |
| z score | 7.315744 | 25.862101 |
| p value | 0.000000 | 0.000000 |

Tobit: Given the z score of 7.32, there is a less than 1% likelihood that this clustered pattern could be the result of random chance.
Probit: Given the z score of 25.86, there is a less than 1% likelihood that this clustered pattern could be the result of random chance.

This confirms that there is clustered local variation present in the data, which can explain the weak coefficients in the global regressions. The spatial probit model may address localized contributors to transmission that cannot be detected with non-spatial models.
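For readers unfamiliar with the statistic, here is a minimal sketch of the global Moran's I computation using the textbook formula I = (n / S0) (z'Wz) / (z'z), where z holds deviations from the mean and S0 is the sum of all weights. It is not the ArcGIS implementation, it does not compute the variance or z score, and the toy chain of four locations is hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix


def morans_i(x, W):
    """Global Moran's I for values x and a (sparse) spatial weights matrix W."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()  # deviations from the mean
    s0 = W.sum()      # sum of all spatial weights
    return (len(x) / s0) * (z @ (W @ z)) / (z @ z)


def expected_morans_i(n):
    """Expected value of Moran's I under no spatial autocorrelation."""
    return -1.0 / (n - 1)


# Toy example (hypothetical): 4 locations in a chain, neighbours share an edge.
W_chain = csr_matrix(np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float))

i_clustered = morans_i([1, 1, 0, 0], W_chain)    # similar values sit next to each other
i_alternating = morans_i([1, 0, 1, 0], W_chain)  # neighbours always differ
```

The clustered pattern gives a positive index and the alternating pattern a negative one; an index far above its expected value, as in the summaries above, points to clustering.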
Timothy Thomas has programmed a sparse matrix tool and spatial probit using R and translated it for Matlab: http://timthomas.net/
Matlab had to be run on a 64-bit PC with 16 GB of memory and a 100 GB page file; calculating the spatial weights matrix and the model took about 12 hours.
Results

Variable groups: Individual Level Variables; Household Level Variables; Agricultural Parcel Level Variables (as an aggregation of cultivated areas within 3km of household); Land Cover Types within 3km of Household (effects are relative to the "Cropland Natural Vegetation Mosaic").

Columns: Linear Probit Model (Coef., z, P > z); Bayesian Probit Model, No Spatial Effects (Parameter_Est, T Test); Bayesian Probit Model, Spatial Lag and Error Effects (Parameter_Est, T Test).

| malaria | Coef. | z | P > z | Parameter_Est (no spatial) | T Test (no spatial) | Parameter_Est (spatial) | T Test (spatial) |
|---|---|---|---|---|---|---|---|
| nodata | 0.19367 | 6.75 | 0.000 | 0.08387 | 2.98987 | 0.15745 | 2.42143 |
| gender | 0.03444 | 2.13 | 0.034 | 0.01460 | 1.71981 | 0.02938 | 1.62662 |
| months_resident_of12 | 0.03430 | 11.16 | 0.000 | 0.01480 | 3.36470 | 0.02824 | 2.50386 |
| age | 0.00286 | 4.82 | 0.000 | 0.00119 | 2.75355 | 0.00231 | 2.46778 |
| underfive | 0.45047 | 17.26 | 0.000 | 0.18516 | 3.48249 | 0.36752 | 2.63878 |
| highestgradecompleted | 0.01412 | 4.57 | 0.000 | 0.00656 | 2.82117 | 0.01282 | 2.18295 |
| ave_parent_ed | 0.00089 | 0.22 | 0.823 | 0.00008 | 0.04140 | 0.00027 | 0.07220 |
| hoursworked_last7days | 0.00097 | 1.88 | 0.060 | 0.00050 | 1.74522 | 0.00101 | 1.60006 |
| serious_disability | 0.18092 | 5.59 | 0.000 | 0.07222 | 2.94569 | 0.14961 | 2.28379 |
| bednet_lastnight | 0.14171 | 4.71 | 0.000 | 0.06297 | 2.99394 | 0.12124 | 2.21869 |
| itn_bednet | 0.02854 | 0.67 | 0.502 | 0.01285 | 0.70427 | 0.02227 | 0.59681 |
| hhh_gender | 0.04858 | 2.14 | 0.032 | 0.01990 | 1.74622 | 0.03901 | 1.78120 |
| urban | 0.07040 | 2.17 | 0.030 | 0.02818 | 1.74696 | 0.04102 | 1.31268 |
| totalhexp30d | 0.00000 | 5.66 | 0.000 | 0.00000 | 3.45995 | 0.00000 | 2.50790 |
| total_consumption_exp_365 | 0.00000 | 1.81 | 0.070 | 0.00000 | 2.27718 | 0.00000 | 2.21056 |
| house_flat | 0.01514 | 0.59 | 0.558 | 0.00196 | 0.18048 | 0.00403 | 0.20383 |
| total_rooms | 0.01262 | 2.61 | 0.009 | 0.00562 | 2.20916 | 0.00914 | 1.70832 |
| natural_unimproved_watersource | 0.03842 | 1.56 | 0.119 | 0.01659 | 1.51383 | 0.02846 | 1.33790 |
| minutes_waiting_watersource | 0.00049 | 2.44 | 0.015 | 0.00021 | 2.02157 | 0.00034 | 1.67488 |
| chickens | 0.00182 | 2.78 | 0.005 | 0.00073 | 2.05773 | 0.00126 | 1.88421 |
| cattle | 0.00134 | 0.83 | 0.407 | 0.00105 | 1.61400 | 0.00207 | 1.41896 |
| pigs | 0.00060 | 7.84 | 0.000 | 0.00025 | 1.88413 | 0.00045 | 1.57295 |
| ave_parcel_size_ha | 0.00145 | 2.92 | 0.003 | 0.00064 | 2.18959 | 0.00109 | 1.95866 |
| pesticide | 0.00228 | 3.00 | 0.003 | 0.00098 | 2.36597 | 0.00171 | 2.05253 |
| maize_improved | 0.00718 | 1.02 | 0.309 | 0.00350 | 1.09499 | 0.00601 | 1.07011 |
| maize_local | 0.00903 | 1.55 | 0.121 | 0.00388 | 1.05991 | 0.00688 | 1.00819 |
| banana_improved | 0.00630 | 0.65 | 0.514 | 0.00371 | 0.94860 | 0.00658 | 0.89415 |
| banana_local | 0.06571 | 1.80 | 0.071 | 0.03064 | 1.90174 | 0.05392 | 1.92156 |
| coffee_improved | 0.00028 | 0.03 | 0.974 | 0.00028 | 0.07703 | 0.00040 | 0.05785 |
| coffee_local | 0.02303 | 1.03 | 0.304 | 0.01456 | 1.31118 | 0.02786 | 1.24913 |
| rice | 0.00103 | 0.07 | 0.942 | 0.00026 | 0.05369 | 0.00109 | 0.11497 |
| treecrops | 0.00129 | 0.19 | 0.849 | 0.00097 | 0.26923 | 0.00085 | 0.13393 |
| improved_pasture | 0.00081 | 0.63 | 0.528 | 0.00043 | 0.78573 | 0.00073 | 0.86614 |
| local_pasture | 0.00349 | 1.63 | 0.102 | 0.00138 | 1.45207 | 0.00244 | 1.34687 |
| gc14 Rainfed croplands | 0.00012 | 0.41 | 0.684 | 0.00006 | 0.50092 | 0.00008 | 0.32177 |
| gc40 Closed to open (>15%) broadleaved (evergreen or deciduous) forest | 0.00012 | 0.24 | 0.813 | 0.00000 | 0.00208 | 0.00007 | 0.18547 |
| gc50 Closed (>40%) broadleaved deciduous forest | 0.00038 | 1.17 | 0.240 | 0.00014 | 0.96225 | 0.00023 | 0.89449 |
| gc60 Open (15-40%) broadleaved deciduous forest | 0.00093 | 1.77 | 0.076 | 0.00040 | 1.53492 | 0.00065 | 1.36294 |
| gc70 Closed (>40%) needleleaved evergreen forest | 0.00977 | 1.03 | 0.304 | 0.00501 | 1.00419 | 0.01021 | 1.10371 |
| gc90 Open (15-40%) needleleaved (evergreen or deciduous) forest | 0.00074 | 0.17 | 0.865 | 0.00059 | 0.26192 | 0.00078 | 0.21533 |
| gc100 Closed to open (>15%) mixed broadleaved and needleleaved forest | 0.03681 | 0.99 | 0.324 | 0.01301 | 0.66723 | 0.02193 | 0.66894 |
| gc110 Mosaic grassland (20-50%) / forest or shrubland (50-70%) | 0.00105 | 1.19 | 0.235 | 0.00039 | 1.04908 | 0.00062 | 0.99497 |
| gc120 Mosaic grassland (50-70%) / forest or shrubland (20-50%) | 0.00165 | 0.56 | 0.572 | 0.00112 | 0.88280 | 0.00101 | 0.41087 |
| gc130 Closed to open (>15%) shrubland | 0.00054 | 1.59 | 0.112 | 0.00025 | 1.71978 | 0.00037 | 1.32772 |
| gc140 Closed to open (>15%) grassland, savannas (herbaceous vegetation) | 0.00230 | 0.43 | 0.666 | 0.00117 | 0.51820 | 0.00134 | 0.33505 |
| gc150 Sparse (<15%) vegetation | 0.00355 | 0.52 | 0.600 | 0.00111 | 0.32912 | 0.00254 | 0.51009 |
| gc160 Closed to open (>15%) broadleaved forest regularly flooded | 0.00171 | 1.05 | 0.295 | 0.00077 | 1.14034 | 0.00126 | 1.11879 |
| gc170 Closed (>40%) broadleaved forest permanently flooded | 0.12853 | 2.50 | 0.012 | 0.05854 | 2.02794 | 0.09596 | 1.60159 |
| gc180 Closed to open (>15%) herbaceous vegetation regularly flooded | 0.00830 | 0.77 | 0.440 | 0.00457 | 0.93115 | 0.00759 | 0.85745 |
| gc190 Artificial surfaces (Urban areas >50%) | 0.00004 | 0.07 | 0.948 | 0.00002 | 0.05924 | 0.00001 | 0.02438 |
| gc200 Bare areas | 0.01012 | 2.09 | 0.036 | 0.00418 | 1.88850 | 0.00678 | 1.49596 |
| gc210 Water bodies | 0.00114 | 2.77 | 0.006 | 0.00059 | 2.46806 | 0.00099 | 1.94137 |
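To compare significance across the three models, a test statistic can be converted into an approximate two-sided p value. The sketch below assumes the statistic is approximately standard normal; applying that approximation to the Bayesian "T Test" columns is an assumption made here for illustration, not something the results state. The function name is hypothetical, and the two inputs are the `urban` statistics from the linear probit and the spatial Bayesian probit.

```python
from scipy.stats import norm


def two_sided_p(stat):
    """Approximate two-sided p value for a statistic treated as standard normal."""
    return 2.0 * norm.sf(abs(stat))


p_urban_linear = two_sided_p(2.17)      # linear probit z for `urban`
p_urban_spatial = two_sided_p(1.31268)  # spatial Bayesian probit T Test for `urban`
```

Under this approximation the linear probit value rounds to the 0.030 reported in the P > z column, while the spatial model's statistic no longer clears the conventional 5% threshold.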
Socio-Economic Determinants
Wealth and Education
- Wealthier households (as measured through consumption) are less at risk of malaria.
- Wealthier households (as measured through size of home) are less at risk of malaria.
- Better-educated household members report less malaria.
Vulnerability within Households
- Female-headed households are more likely to report malaria.
- Children under 5 are the most at-risk family members.
- The more elderly are more likely to report malaria.
- Serious disability increases malaria risk.
- Longer-resident household members report more malaria.
Elimination of False Positives
UNHS Variables:
- Women are more likely to report malaria (Female = 0, Male = 1).
- Longer working hours increase malaria risk.
- Urban households were less at risk of malaria than rural households.
- Pig farmers were more likely to report malaria.
GlobCover 2005 Variables:
- Open broadleaved forests were associated with lower malaria reporting.
- Closed to open (>15%) shrubland was eliminated as a possible determinant through spatial controls.
- Flooded broadleaved forests were associated with lower malaria reporting.
- Bare areas were associated with higher malaria reporting.
Agro-Ecological Relationships
- Water collection as a task and longer waiting times at water collection sources are associated with higher malaria risk.
- Chicken farmers are less likely to report malaria.
- Smallholder households are more likely to report malaria.
- Higher pesticide usage is associated with higher malaria levels: this could be a response to living in an insect-friendly environment, or it could be due to increased resistance of mosquitoes to pesticides reducing the efficacy of bednets.
- Cultivation of local banana varieties was associated with lower malaria reporting.
- Open water bodies were associated with lower malaria reporting.
WARNING: Reverse Causality!
- Those using bednets are reporting more malaria (bednet use is a response to high malaria risk).
- Higher levels of health expenditure in the past month are associated with higher malaria risk (possibly in response to malaria episodes).
- Higher pesticide usage is associated with higher malaria levels: this could be a response to living in an insect-friendly environment, or it could be due to increased resistance of mosquitoes to pesticides reducing the efficacy of bednets.
- Open water bodies were associated with lower malaria reporting.
Health GIS Lessons Learned
1. Binary Outcome Models need to be developed to the level of GWR for many health analyses.
2. Models need to control for spatial effects & errors, or they can lead to false conclusions due to clustering effects.
3. Spiders Nelson and Jackson 2006.
ArcGIS Wish List:
- Add Spatial Probit or Binary Outcome GWR to increase statistical analysis options.
- Integrate sparse matrices to allow for larger & faster algebraic calculations.
- Lift memory constraints, particularly for the 64-bit environment, to access contiguous blocks of memory > 2 GB.