The role of climate normals in crop specific weather forecasts

K.M. Baker 1, J. Williams 2, T.L. Lake 2 and W.W. Kirk 3
1 Department of Geography, Western Michigan University, Kalamazoo, MI, USA
2 Department of Computer Science, Western Michigan University, Kalamazoo, MI, USA
3 Department of Plant Pathology, Michigan State University, East Lansing, MI, USA
kathleen.baker@wmich.edu

Abstract

Crop specific weather forecasts for risk advisories in the US have been developed for a number of crops. We take the first steps toward increasing transferability of models across regions and crop systems by reducing the overall number of model inputs and substituting readily available climate data for spatial relationship variables which had to be individually calculated regionally in previous incarnations. We use a rigorously tested potato late blight model from the north central US for our analysis. Models that incorporate climate normals produce statistically similar results to models that use spatio-temporal markers. The relationship between models that incorporate climatology and those that do not varies depending on forecast length and the month of the forecast.

Keywords: Artificial neural networks, climatology, plant disease, potato late blight

Introduction

One of the key hopes for research in the development of synoptic weather forecasting for crop specific disease risk is that results from models trained and tested at the regional scale will have widespread applications across spatial regions, and possibly across crop systems. So far, however, forecast development has been specific to a single crop and region. Weather-based prediction models have been used to estimate environmental conditions that are favorable for epidemic risk, and fungicide recommendations appropriate to that risk for more than 50 years (Beaumont, 1947; Cook, 1949; Wallin, 1960).
Forecast models work to limit grower expenditures and reduce the amount of chemical released to the environment while achieving optimal control of disease. The incorporation of extended range forecast data into disease risk systems has made these systems even more valuable in recent years by predicting risk conditions up to several days before they occur. Baker et al. (2007) developed an extended range forecasting model to predict potato late blight disease risk up to five days in the future, now in service at http://www.lateblight.org/neuroweather.php.

A number of economically important crops, in addition to potato, rely heavily on fungicide application as the primary management strategy to control diseases and produce high quality marketable produce. Many plant pathogens have multiple crop hosts and similar temperature and relative humidity requirements for disease initiation. Thus, the extended range forecasting models which have been created specifically for late blight risk management in Michigan have far wider implications for crop diseases in general.

EFITA/WCCA 11 493

In order to increase transferability of forecasting models across regions and crop systems, the overall number of model inputs must be reduced and regionally unique variables must be eliminated. Using a rigorously tested potato late blight model from the north central US, we compare forecast models that use readily available climate normals to replace spatial relationship variables in reduced variable models. Spatial relationship variables were calculated individually for each region in previous incarnations, which limited the widespread applicability of results.

Methods

Extended range forecast model output statistics (MOS), including 192-hour maximum and minimum daily temperatures, have been produced by the National Weather Service (NWS) since 1994 with the Global Forecast System (GFS) numerical model. Since 2003, improvements in NWS forecasting accuracy have made these data considerably more useful for agricultural and environmental modeling (Carrol & Maloney, 2004). Baker and Kirk (2007) developed a method to derive hourly microclimate variables associated with potato late blight risk from the available MOS produced by the NWS. These data were then fed into an artificial neural network (ANN) computer algorithm which generated accurate predictions of late blight disease risk up to five days in the future. Accuracy of predictions for all stations and years was examined through comparison of predicted potato late blight disease severity values with those computed based on the Unedited Local Climatological Data (ULCD). Forecast accuracy was significantly higher than expected values based on climate normals.
A revised version of the model was made available online for grower use during the 2010 growing season (Wharton et al., 2008). The model relied on 48 input variables described previously (Baker & Kirk, 2007), including spatial relationship models. Extensive testing of network representation, learning algorithm, variable representation, ANN platforms, training set characteristics, normalization and simplification of inputs resulted in a much more efficient and more accurate base model (results not shown here).

The final improved base potato late blight model used a resilient propagation (JE-Rprop) learning function in the JavaNNS software package. The model was trained on 80 percent of the five-day forecast data from the 2004-2006 growing seasons (May 1 to Sept 30) for 45 NWS forecast locations in Michigan. Accuracy of predictions was examined through comparison with risk computed from the ULCD, which records the weather that actually occurred on a forecast day. The model was validated on the remaining 20 percent of the data from those seasons. Long term model accuracy was assessed by testing the 7 growing seasons from 2004-2010. Variables were limited at estimated maximum and minimum values and normalized within the range 0 to 1. The final limited base data set of 10 variables (Table 1) was supplemented by either 4 climate normal variables or 12 spatial cluster variables and a month variable. The accuracy of models was compared using a paired t-test across stations and years (n=314).
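The limiting and normalization step described above can be illustrated with a minimal sketch. It assumes simple clamping followed by min-max scaling; the paper does not state the exact formula, and the function name and the temperature bounds in the usage note are hypothetical.

```python
def clamp_and_normalize(value, lower, upper):
    """Limit value to [lower, upper], then rescale it linearly to [0, 1].

    lower and upper stand in for the "estimated minimum and maximum
    values" of a variable; the bounds used in the study are not given,
    so any numbers passed here are assumptions.
    """
    if upper <= lower:
        raise ValueError("upper bound must exceed lower bound")
    clamped = min(max(value, lower), upper)
    return (clamped - lower) / (upper - lower)
```

For example, with hypothetical bounds of -5 and 35 for a minimum temperature in degrees Celsius, a reading of 15 maps to 0.5, and any reading above 35 maps to 1.0.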
Table 1. Variables used in ANN generated crop specific forecast models, using potato late blight as a test case.

Base Model                  Climatology Model     Spatio-Temporal Model
Julian Day                  Normal Max Temp       Spatial Markers
Minimum Temperature         Normal Min Temp       Month
Cloud Cover - AM            Normal Avg Temp
Cloud Cover - PM            Normal Total Precip
Quantity of Precip - AM
Quantity of Precip - PM
Over RH Threshold - Min
Over RH Threshold - Max
Minimum Risk Value
Maximum Risk Value

Results

Accuracy of long term results from the climatology model was remarkably similar to that of the spatio-temporal model (Table 2). For most daily forecasts, from 24-120 hours into the future, and for most months of the growing season, the two models were statistically equivalent. The spatio-temporal model was statistically more accurate for June forecasts of moderate range (48-72 hours).

Table 2. Differences in forecast accuracy between climatology (C) and spatio-temporal (S) models. The letter shown indicates the model of greatest accuracy. (=) is used when the models were statistically similar (p = 0.05).

             24hr   48hr   72hr   96hr   120hr   All
Overall       =      =      =      =      =       =
May           =      =      =      =      =       =
June          =      S      S      =      =       S
July          =      C      =      =      =       =
August        =      =      =      =      =       =
September     =      =      =      =      =       =

When compared with the base model, the climatology model became more accurate as forecast length increased (Table 3). There were no significant differences between base model accuracy and climatology model accuracy in the 24 hour forecasts except in September. By 120 hours, the climatology model was significantly more accurate for all months. When the five forecast days were combined, the base model was equivalent to the climatology model only during the month of August.
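As an illustration of how a single cell of Tables 2 and 3 might be derived, the sketch below computes a paired t statistic from per-station, per-year accuracies and assigns a label. It is not the authors' code: the function names are hypothetical, and the two-sided p-value uses a large-sample normal approximation (reasonable near the study's n=314, not for small samples) rather than the exact t distribution.

```python
import math
from statistics import NormalDist, mean, stdev


def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


def label_cell(acc_first, acc_second, first="C", second="S", alpha=0.05):
    """Label the more accurate model, or "=" if not significantly different.

    Assumption: the p-value comes from a standard normal approximation
    to the t distribution, which is only appropriate for large samples.
    """
    t, _df = paired_t(acc_first, acc_second)
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    if p >= alpha:
        return "="
    return first if t > 0 else second
```

Each input list would hold one accuracy per station and year for a given month and forecast length, in the same order for both models. For small samples, an exact t distribution (for instance from a statistics package) would be the safer choice.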
Overall accuracies of the three models (Table 4) show that the climatology model improves the minimum annual station forecast accuracy by over 10 percent relative to the spatio-temporal model (0.650 versus 0.565). The climatology model also has a greater maximum and minimum annual station forecast accuracy than the base model.

Table 3. Differences in forecast accuracy between climatology (C) and the base model (B). The letter shown indicates the model of greatest accuracy. (=) is used when the models were statistically similar (p = 0.05).

             24hr   48hr   72hr   96hr   120hr   All
Overall       =      C      C      C      C       C
May           =      =      =      C      C       C
June          =      C      =      C      C       C
July          =      =      C      C      C       C
August        =      =      =      =      C       =
September     C      C      C      C      C       C

Table 4. Overall 24-120hr accuracies for the three potato late blight forecast models by station and year.

                   Min     Max     Mean    SD
Climatology        0.650   0.902   0.778   0.045
Spatio-Temporal    0.565   0.906   0.778   0.047
Base               0.642   0.882   0.772   0.044

Discussion

For most of the forecast days and for most of the growing season, no statistical difference was found between the climatology and spatio-temporal models. The impact of normal climate characteristics is equivalent to that of clustering the locations by spatial similarity and breaking up the season by months. This equivalence allows for flexibility in model development depending on data availability. In most US locations, 30-year climate normals are readily available. Developing models with climate normals maximizes transferability of crop specific models to other regions. In areas for which climate data are not distributed, including non-US locations, geographic clusters and temporal markers can easily be used instead. Although it makes logical sense that climate and geography should be closely linked, to our knowledge no previous study has investigated their equivalence for forecast purposes.

As the length of the forecast increases, that is, as the time between when the forecast is made and when it applies increases, the significance of climate normals increases.
This result is expected given the fact that weather forecasts in general decrease significantly in accuracy with length and eventually
are only able to forecast climatology at about 8 days from the day the forecast is made. For 24 hour forecasts, input values are more accurate, so generalization of the regional climate adds nothing to the model. August is the only month when the climatology model is not more accurate than the base model for the complete 5 day forecast. This is probably because potato late blight risk during August in this region is generally driven by summer storms moving through in a spatial pattern that is fairly unpredictable from year to year.

The next step of this research is to use the regionally developed north central US potato late blight forecasts, with the limited variable set and climate normals, to train and test data from other regions. This process is currently ongoing. Some measure of transferability of regional forecasts to other regions will be a major step forward for the crop disease risk modeling community.

The idea of transferability of model characteristics from one crop disease to another is still in its infancy, but many crop diseases result from similar temperature and relative humidity conditions. Variables that are significant in the prediction of temperature-humidity dependent systems should be similar among applications. Preliminary development of forecast models for Fusarium head blight of barley in the Great Plains of the US (Bondalapati et al., 2009) has determined a significant variable set very similar to the climatology model for potato late blight.

Acknowledgements

Funding provided by USDA RAMP 2008-02925. Co-PIs: Jeffrey Stein, South Dakota State U; William Kirk, Michigan State U; Phillip Wharton, U Idaho; Mark Boudreau, U Georgia; Dennis Todey, South Dakota State U. Special thanks to Jason Smith, Douglas Rivet, and Dr. Robert Trenary at Western Michigan University for their technical assistance throughout the project.
References

Baker, K.M., and Kirk, W.W. 2007. Comparative analysis of models for integration of extended range synoptic forecast data into potato late blight risk systems. Computers and Electronics in Agriculture, 57:23-32.

Baker, K.M., Wharton, P., and Kirk, W.W. 2007. Inclusion of synoptic weather forecast models in decision support systems for agriculture. In: Proceedings of MODSIM 07: International Congress on Modeling and Simulation, 10-13 Dec, Christchurch, New Zealand.

Beaumont, A. 1947. The dependence on the weather of the dates of outbreak of potato late blight. Transactions of the British Mycological Society, 31:45-53.

Bondalapati, K.D., Stein, J.M., Baker, K.M., and Chen, D.G. 2009. Using Forecasted Weather Data and Neural Networks for DON Prediction in Barley. Poster: Proceedings of the 2009 National Fusarium Head Blight Forum, Orlando, FL. Canty, S., Clark, A., Mundell, J., Walton, E., Ellis, D., and Van Sanford, D. (Eds.), University of Kentucky, Erlanger, KY. pp. 30-32.

Carrol, K.L., and Maloney, J.C. 2004. Improvements in extended-range temperature and probability of precipitation guidance. Symposium on the 50th Anniversary of Operational Numerical Weather Prediction, College Park, MD, Amer. Meteor. Soc.

Cook, H. 1949. Forecasting late blight epiphytotics of potatoes and tomatoes. Journal of Agricultural Research, 78:54-56.

Wallin, J.R., and Schuster, M.L. 1960. Forecasting potato late blight in western Nebraska. Plant Disease Reporter, 44:896-900.

Wharton, P.S., Kirk, W.W., Baker, K.M., and Duynslager, L. 2008. A web-based interactive system for risk management of Phytophthora infestans in potato canopies in Michigan. Computers and Electronics in Agriculture, 61:136-148.