Is correlation dimension a reliable indicator of low-dimensional chaos in short hydrological time series?

WATER RESOURCES RESEARCH, VOL. 38, NO. 2, 1011, 10.1029/2001WR000333, 2002

Technical Note

Is correlation dimension a reliable indicator of low-dimensional chaos in short hydrological time series?

Bellie Sivakumar
Department of Land, Air and Water Resources, University of California, Davis, California, USA

Magnus Persson, Ronny Berndtsson, and Cintia Bertacchi Uvo
Department of Water Resources Engineering, Lund Institute of Technology, Lund University, Lund, Sweden

Received 21 August 2000; revised 3 October 2001; accepted 3 October 2001; published 16 February 2002

[1] The reliability of correlation dimension estimation in short hydrological time series is investigated using an inverse approach. In this approach, predictions are first made using the phase-space reconstruction technique and artificial neural networks. The correlation dimension is then estimated independently and compared with the prediction results. A short hydrological series, a monthly runoff series of 48 years (a total of only 576 values) observed at the Coaracy Nunes/Araguari River watershed in northern Brazil, is studied. The correlation dimension results are in reasonably good agreement with the optimal embedding dimension obtained from the phase-space method and the optimal number of inputs from the neural networks. No underestimation of the correlation dimension due to the small data size is observed; rather, there seems to be a slight overestimation due to the presence of noise in the data. The results indicate that the accuracy of the correlation dimension should be judged not on the basis of the length of the time series but on whether the time series is long enough to reasonably represent the dynamical changes in the system. Such an observation suggests that the correlation dimension could indeed be a reliable indicator of low-dimensional chaos even in short hydrological time series, which is certainly encouraging news for hydrologists, who often have to deal with short time series.
INDEX TERMS: 1719 History of Geophysics: Hydrology; 1860 Hydrology: Runoff and streamflow; 3220 Mathematical Geophysics: Nonlinear dynamics; 3240 Mathematical Geophysics: Chaos

KEYWORDS: low-dimensional chaos, correlation dimension, data size, reliability, runoff, prediction

Copyright 2002 by the American Geophysical Union. 0043-1397/02/2001WR000333

1. Introduction

[2] Applications of the ideas gained from chaos theory to understand hydrological processes have been gaining considerable momentum lately [e.g., Rodriguez-Iturbe et al., 1989; Islam et al., 1993; Jayawardena and Lai, 1994; Koutsoyiannis and Pachakis, 1996; Porporato and Ridolfi, 1997; Krasovskaia et al., 1999; Sivakumar et al., 1999, 2001; B. Sivakumar et al. (A multivariable time series phase-space reconstruction approach to investigation of chaos in hydrological processes, submitted to Water Resources Research), 2001; Islam and Sivakumar, 2002]. Though research over the past two decades has resulted in a wide variety of methods to identify low-dimensional chaos, the correlation dimension method has been the most widely used in studies dealing with hydrological processes. Although a number of studies have reported the presence of low-dimensional chaos in hydrological processes based on the observation of a low correlation dimension (or on other indicators), such studies have very often been subject to intense debate [e.g., Ghilardi and Rosso, 1990]. This is because the correlation dimension method was developed on the basis of the assumption, among others, that the time series is infinite, whereas hydrological time series are always finite and often short.

[3] An important reason for the suspicion about the existence of low-dimensional chaos in hydrological processes is the belief that the minimum data size (N_min) required for the correlation dimension (d) estimation depends on the embedding dimension (m) used for the phase-space reconstruction [e.g., Smith, 1988].
Such a belief, however, is not entirely true, since the minimum data size may depend largely on the type and dimension of the attractor, which in turn depend on the dynamics of the phenomenon. The study by Sivakumar [2000] reveals that the calculations and derivations relating the data size to the embedding dimension, i.e., N_min as a function f(m), may be valid only for m < d, whereas for m > d the relationship could be N_min as a function f(d), though such a relationship may not be valid for every dynamical system. Lorenz [1991] reported that different variables could yield different estimates of the correlation dimension and that a suitably selected variable could sometimes yield a fairly good estimate even if the number of data points is not large. This was also supported by Islam et al. [1993], on the basis of the physical constraints and thresholds of the variable investigated.

[4] The above observations clearly indicate that the accuracy of the correlation dimension estimation must be evaluated with respect to the characteristics of the time series and the underlying dynamics, rather than the data size. One possible way to achieve this is to study the series using another technique, in which no a priori knowledge of the actual physical process is assumed except for acknowledging a dynamical relationship, and to relate the results to the correlation dimension results. With this in mind, in the present study, a short hydrological time series is analyzed first

by employing two time series prediction methods: (1) the phase-space reconstruction prediction method and (2) artificial neural networks. The former assumes only a possible dynamical relationship in the series, whereas the latter is a data training and learning approach acknowledging only a nonlinear relationship between the input and the output values. The correlation dimension of the series is estimated next using the Grassberger-Procaccia algorithm. Finally, the accuracy of the correlation dimension is evaluated by comparing the dimension with the optimal embedding dimension obtained from the phase-space method and the optimal number of inputs obtained from the neural networks. Therefore the approach adopted in this study is different from the ones presented in earlier studies [e.g., Lambrakis et al., 2000], where the correlation dimension was estimated first, and the optimal dimension and the number of inputs were chosen on the basis of the correlation dimension value. The present approach can therefore be termed an inverse approach to evaluating the accuracy of correlation dimension estimation. The hydrological series studied is a short (in terms of the data size) monthly runoff series observed over a period of 48 years (only 576 values) at the Coaracy Nunes/Araguari River watershed in northern Brazil.

2. Methods Employed

2.1. Phase-Space Reconstruction Prediction Method

[5] The advantage of the phase-space reconstruction prediction method [e.g., Farmer and Sidorowich, 1987] for hydrological series lies in the fact that the method does not require a large data size. The method employs the concept of reconstruction of a single-variable series in a multidimensional phase space to represent the underlying dynamics.
For a scalar time series $X_i$, where $i = 1, 2, \ldots, N$, the phase space can be reconstructed as

$$Y_j = \left(X_j, X_{j+\tau}, X_{j+2\tau}, \ldots, X_{j+(m-1)\tau}\right), \tag{1}$$

where $j = 1, 2, \ldots, N - (m-1)\tau/\Delta t$; $m$ is the dimension of the vector $Y_j$, called the embedding dimension; and $\tau$ is a delay time [e.g., Takens, 1981]. A (correct) phase-space reconstruction in a dimension $m$ allows one to interpret the underlying dynamics in the form of an $m$-dimensional map $f_T$, that is,

$$Y_{j+T} = f_T\left(Y_j\right), \tag{2}$$

where $Y_j$ and $Y_{j+T}$ are vectors of dimension $m$, describing the state of the system at times $j$ (current state) and $j + T$ (future state), respectively. The problem then is to find an appropriate expression for $f_T$ (e.g., $F_T$).

[6] There are several approaches for determining $F_T$. In this study, a local approximation approach [e.g., Farmer and Sidorowich, 1987] is used. In this approach, the $f_T$ domain is subdivided into many subsets (neighborhoods), each of which identifies some approximation $F_T$, valid only in that subset. In this way, the system dynamics is represented step by step locally in the phase space. The identification of the sets in which to subdivide the domain is done by fixing a metric and, given the starting point $Y_j$ from which the forecast is initiated, identifying neighbors $Y_j^p$, $p = 1, 2, \ldots, k$, with $j_p < j$, nearest to $Y_j$, which constitute the set corresponding to $Y_j$. The local functions can then be built, which take each point in the neighborhood to the next neighborhood: $Y_j^p$ to $Y_{j+1}^p$. The local map $F_T$, which does this, is determined by a least squares fit minimizing

$$\sum_{p=1}^{k} \left\| Y_{j+1}^p - F_T Y_j^p \right\|^2. \tag{3}$$

In this study, the local maps are learned in the form of local polynomials [e.g., Abarbanel, 1996], and the predictions are made forward from a new point $Z_0$ using these local maps. For the new point $Z_0$, the nearest neighbor in the learning or training set is found, which is denoted as $Y_q$.
Then the evolution of $Z_0$ is found, which is denoted as $Z_1$ and is given by

$$Z_1 = F_q\left(Z_0\right). \tag{4}$$

The nearest neighbor to $Z_1$ is then found, and the procedure is repeated to predict the subsequent values. The prediction accuracy is evaluated using the correlation coefficient (CC) and the root mean square error (RMSE).

2.2. Artificial Neural Networks (ANNs)

[7] An ANN is a massively parallel-distributed information-processing system that has certain performance characteristics resembling biological neural networks of the human brain. The advantage of the ANN is that with no a priori knowledge of the actual physical process and hence the exact relationship between sets of input and output data, if acknowledged to exist, the network can be trained to learn such a relationship. Since the beginning of the 1990s, ANNs have been successfully applied to many hydrology-related problems, a comprehensive review of which can be found in a recent study undertaken by the ASCE Task Committee on Application of Artificial Neural Networks in Hydrology [ASCE, 2000].

[8] A neural network is characterized by its architecture, which represents the pattern of connection between nodes, its method of determining the connection weights, and the activation function. A typical ANN consists of a number of nodes organized according to a particular arrangement. One way to classify neural networks is by the number of layers: (1) single, (2) bilayer, and (3) multilayer. ANNs can also be categorized on the basis of the direction of information flow and processing: (1) feedforward and (2) recurrent. In a feedforward network, information passes from the input to the output side. In a recurrent network, information flows through the nodes in both directions, from the input to the output side and vice versa.

[9] In the present study, a three-layer (input, hidden, and output) feedforward network is used.
In this network, all input nodes (J) containing input data (X) are connected via certain weights (W_jk) to all hidden nodes (K). At each node the inputs from all connections are summed, and the sum is passed through a nonlinear transformation function to produce an output. This procedure is repeated for the output nodes (L): the weights (W_kl) are multiplied with the hidden-node outputs, and a sigmoid function is used to produce the output data (Y). Network training is done using a back-propagation algorithm, which is essentially a gradient descent technique that minimizes the network error function.

2.3. Correlation Dimension Method

[10] Though there exist a number of algorithms for correlation dimension estimation, the Grassberger-Procaccia algorithm [e.g., Grassberger and Procaccia, 1983] has been the most widely used

in studies investigating hydrological time series, and also in the present study. The algorithm uses the reconstruction of the phase space according to equation (1). For an $m$-dimensional phase space, the correlation function $C(r)$ is given by

$$C(r) = \lim_{N \to \infty} \frac{2}{N(N-1)} \sum_{\substack{i,j \\ (1 \le i < j \le N)}} H\left(r - \left\| Y_i - Y_j \right\|\right), \tag{5}$$

where $H$ is the Heaviside step function, with $H(u) = 1$ for $u > 0$ and $H(u) = 0$ for $u \le 0$, where $u = r - \| Y_i - Y_j \|$, $r$ is the radius of the sphere centered on $Y_i$ or $Y_j$, and $N$ is the number of data points. If the time series is characterized by an attractor, then for positive values of $r$, the correlation function $C(r)$ and the radius $r$ are related according to

$$C(r) \approx \alpha r^{\nu} \quad (r \to 0,\; N \to \infty), \tag{6}$$

where $\alpha$ is a constant and $\nu$ is the correlation exponent, or the slope of the $\log C(r)$ versus $\log r$ plot. If the correlation exponent saturates with an increase in the embedding dimension, then the system is generally considered to exhibit chaos. The saturation value of the correlation exponent is defined as the correlation dimension of the attractor. The nearest integer above the saturation value provides the minimum number of phase-space dimensions, or variables, necessary to model the dynamics of the attractor. If the correlation exponent increases without bound with an increase in the embedding dimension, then the system is generally considered to be stochastic. Important issues, including that of data size, in the application of the Grassberger-Procaccia algorithm to hydrological time series have been discussed in detail by Sivakumar [2000] and therefore are not reported herein.

Figure 1. Monthly runoff prediction results using phase-space reconstruction method: (a) correlation coefficient versus embedding dimension and (b) time series comparison of observed and predicted values.

3. Analyses of Short Hydrological (Monthly Runoff) Time Series

[11] In this study, a monthly runoff series observed at the Coaracy Nunes/Araguari River watershed in northern Brazil is studied.
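As an illustration of the correlation dimension method of section 2.3, the correlation function C(r) and the correlation exponent can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used in the study: the function names (`delay_embed`, `correlation_sum`, `correlation_exponent`) are hypothetical, the exponent is taken as a simple two-point slope of log C(r) versus log r rather than a fit over a scaling region, and the demonstration data are synthetic random numbers rather than the runoff series.

```python
import math
import random


def delay_embed(series, m, tau):
    """Build delay vectors (x[j], x[j+tau], ..., x[j+(m-1)*tau])."""
    n_vectors = len(series) - (m - 1) * tau
    return [tuple(series[j + i * tau] for i in range(m)) for j in range(n_vectors)]


def correlation_sum(vectors, r):
    """Finite-N correlation function: fraction of distinct pairs closer than r."""
    n = len(vectors)
    close_pairs = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            # Heaviside term: counts 1 when r - distance > 0.
            if math.dist(vectors[i], vectors[j]) < r:
                close_pairs += 1
    return 2.0 * close_pairs / (n * (n - 1))


def correlation_exponent(series, m, tau, r_small, r_large):
    """Two-point slope of log C(r) versus log r (assumed simplification)."""
    vectors = delay_embed(series, m, tau)
    c_small = correlation_sum(vectors, r_small)
    c_large = correlation_sum(vectors, r_large)
    return (math.log(c_large) - math.log(c_small)) / (
        math.log(r_large) - math.log(r_small)
    )


if __name__ == "__main__":
    # Synthetic stand-in data (uniform random numbers), NOT the runoff series.
    rng = random.Random(0)
    noise = [rng.random() for _ in range(400)]
    for m in range(1, 4):
        print(m, round(correlation_exponent(noise, m, 1, 0.05, 0.2), 2))
```

For a purely random series such as the one generated here, the exponent would be expected to keep growing with the embedding dimension; saturation at a low value is the behavior the method looks for in a deterministic series.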
The watershed has a catchment area of about 24,200 km^2, and the runoff station is located at latitude 0°55'N and longitude 51°15'W. Details about the watershed can be found in the work of Uvo [1998]. Runoff over a period of 48 years (1945-1992) is analyzed.

3.1. Prediction Using Phase-Space Reconstruction

[12] The phase-space reconstruction prediction method with the local polynomial approach is now applied to the runoff series. The first 480 values in the series are used in the phase-space reconstruction for predicting the next 70 values. One-step-ahead predictions are made for phase spaces reconstructed using embedding dimensions from 1 to 10. Figure 1a shows the relationship between the prediction accuracy and the embedding dimension with respect to the correlation coefficient (CC). The prediction accuracy increases with the increase in the embedding dimension up to a certain value (m = 3) and then seems to saturate (or even slightly decrease) when the dimension is increased further. The time series plots, shown in Figure 1b for six of the 10 embedding dimensions used, for instance, and the scatterplots (not shown) reveal that though the predicted values are in good agreement with those observed for all the embedding dimensions, the best results are achieved only when m = 3 (i.e., m_opt), and the results are almost the same or slightly worse for m > 3 and noticeably worse for m < 3. This seems to indicate that (at least) a three-dimensional (3-D) phase-space reconstruction of the runoff series is necessary to capture the important features of the underlying dynamics. In other words, the runoff process is dependent on only three dominant variables and therefore these variables must be included in the model. On the other hand, the exclusion of any of these three variables could significantly affect the modeling and prediction outcomes, whereas the inclusion of any other variable may be expected to yield only slight improvements.
The slight decrease in prediction accuracy at higher embedding dimensions (m > 3) could possibly be due to the presence of noise in the data, as the influence of noise at higher embedding dimensions could be greater than that at lower dimensions [e.g., Sugihara and May, 1990; Sivakumar et al., 1999].
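The prediction experiment described above (delay embedding, nearest neighbors, one-step-ahead forecasts scored by CC and RMSE) can be sketched as follows. This is only an illustrative sketch under stated assumptions: it averages the successors of the k nearest neighbors (a zeroth-order local approximation) instead of fitting the local polynomials used in the study, all function names are hypothetical, and the series is a synthetic logistic-map sequence standing in for the runoff data. Only the 480/70 split mirrors the text.

```python
import math


def delay_embed(series, m, tau):
    """Build delay vectors (x[j], x[j+tau], ..., x[j+(m-1)*tau])."""
    n_vectors = len(series) - (m - 1) * tau
    return [tuple(series[j + i * tau] for i in range(m)) for j in range(n_vectors)]


def predict_next(train, recent, m, tau, k=3):
    """Average the successors of the k training vectors nearest the current state."""
    span = (m - 1) * tau
    vectors = delay_embed(train, m, tau)
    # Pair each vector with the value one step after its last component;
    # the final vector has no successor inside the training set.
    pairs = [(vec, train[j + span + 1]) for j, vec in enumerate(vectors[:-1])]
    last = len(recent) - 1
    current = tuple(recent[last - span + i * tau] for i in range(m))
    nearest = sorted(pairs, key=lambda p: math.dist(p[0], current))[:k]
    return sum(successor for _, successor in nearest) / len(nearest)


def correlation_coefficient(observed, predicted):
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def evaluate(series, n_train, n_test, m, tau, k=3):
    """One-step-ahead predictions for n_test values following the training set."""
    train = series[:n_train]
    observed = series[n_train:n_train + n_test]
    predicted = [
        predict_next(train, series[:n_train + t], m, tau, k) for t in range(n_test)
    ]
    return correlation_coefficient(observed, predicted), rmse(observed, predicted)


if __name__ == "__main__":
    # Synthetic logistic-map series (assumed stand-in), NOT the runoff data.
    series = [0.3]
    for _ in range(549):
        series.append(3.9 * series[-1] * (1.0 - series[-1]))
    for m in range(1, 6):
        cc, err = evaluate(series, 480, 70, m, tau=1)
        print(m, round(cc, 4), round(err, 4))
```

Plotting CC against m from such a loop gives the kind of curve shown in Figure 1a, from which an optimal embedding dimension can be read off.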

[13] A summary of the prediction results obtained for the runoff series is presented in Table 1. The high CC and the low RMSE values and also the good agreement in the time series plots between the observed and the predicted values indicate the suitability of the phase-space method for predicting the runoff dynamics. Even very high and very low runoff values and also the trends are well predicted. The reconstruction of the single-variable (runoff) series in a multidimensional phase space is therefore found to be capable of capturing the important features of the runoff dynamics. The ability of the local approximation procedure to predict the dynamics lies in representing the dynamics captured in the phase space step by step in local neighborhoods. Also, the presence of an optimal (and low) embedding dimension, with m_opt = 3, seems to suggest that the runoff dynamics exhibit low-dimensional chaotic behavior [e.g., Casdagli, 1989].

3.2. Prediction Using Artificial Neural Networks

[14] The three-layer feedforward network with a back-propagation training algorithm is now used to predict the runoff series. Similar to the phase-space reconstruction method, the first 480 values are used as the training set, and the next 70 values are used as the validation set. The number of inputs (i.e., past monthly runoff) for the network is varied from 1 to 10, consistent with the

embedding dimensions used in the phase-space method. The analysis starts with the normalization of the runoff series. The optimal network structure for each input series is determined by varying the number of nodes in the hidden layer (k) from 2 to 10 and then selecting the network that gives the best prediction results. The number of iterations made for training the network (with each value of k) is 100,000. The optimal network structure thus obtained is then used for validation.

Table 1. Prediction Results for Monthly Runoff at the Coaracy Nunes/Araguari River Watershed Using Phase-Space Reconstruction and Artificial Neural Networks

Phase-Space Reconstruction                 Artificial Neural Networks
Embedding    Correlation    RMSE, mm       Number of    Correlation    RMSE, mm
Dimension    Coefficient                   Inputs       Coefficient
 1           0.8023         41.216          1           0.7841         44.897
 2           0.8375         39.991          2           0.8626         38.111
 3           0.8895         33.138          3           0.8879         33.862
 4           0.8687         35.805          4           0.8890         33.346
 5           0.8744         35.214          5           0.8919         32.871
 6           0.8845         34.928          6           0.8920         32.784
 7           0.8804         34.935          7           0.8998         32.004
 8           0.8805         34.261          8           0.8943         32.387
 9           0.8889         33.122          9           0.8960         32.461
10           0.8879         33.523         10           0.8986         31.833

[15] Figure 2a shows the prediction accuracy with respect to the correlation coefficient against the number of inputs used in the network. An increase in the prediction accuracy is observed with the increase in the number of inputs up to a certain value (inputs = 3), and the prediction accuracy seems to remain the same (or increase only insignificantly) beyond this point irrespective of any additional input. The optimum prediction results achieved with three inputs (input_opt = 3), as can be seen from Figure 2b, which presents the results obtained for six of the ten cases (i.e., number of inputs) used, seem to suggest that the runoff dynamics is dominantly dependent on only three variables, but the influence of any other variable does not seem to be significant.
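The network setup described in sections 2.2 and 3.2 (past values as inputs, one hidden layer, sigmoid output, back-propagation training on a normalized series) can be sketched in plain Python as below. This is a hedged sketch, not the network used in the study: the class and function names are hypothetical, bias terms, the learning rate of 0.2, the initial weight range, and pattern-by-pattern updates are assumptions not stated in the text, and the series is a synthetic logistic-map sequence already lying between 0 and 1.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class ThreeLayerNet:
    """Input layer -> sigmoid hidden layer -> one sigmoid output node."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each row holds one weight per incoming value plus a trailing bias (assumed).
        self.w_hidden = [
            [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
            for _ in range(n_hidden)
        ]
        self.w_output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, inputs):
        hidden = [
            sigmoid(sum(w * x for w, x in zip(row, inputs)) + row[-1])
            for row in self.w_hidden
        ]
        output = sigmoid(
            sum(w * h for w, h in zip(self.w_output, hidden)) + self.w_output[-1]
        )
        return hidden, output

    def train_step(self, inputs, target, rate=0.2):
        """One back-propagation (gradient descent) update on the squared error."""
        hidden, output = self.forward(inputs)
        delta_out = (output - target) * output * (1.0 - output)
        # Hidden deltas use the output weights as they were before this update.
        delta_hidden = [
            delta_out * self.w_output[k] * h * (1.0 - h) for k, h in enumerate(hidden)
        ]
        for k, h in enumerate(hidden):
            self.w_output[k] -= rate * delta_out * h
        self.w_output[-1] -= rate * delta_out
        for k, row in enumerate(self.w_hidden):
            for i, x in enumerate(inputs):
                row[i] -= rate * delta_hidden[k] * x
            row[-1] -= rate * delta_hidden[k]


def make_patterns(series, n_inputs):
    """Pair each window of n_inputs past values with the value that follows it."""
    return [(series[t - n_inputs:t], series[t]) for t in range(n_inputs, len(series))]


def mean_squared_error(net, patterns):
    return sum((net.forward(x)[1] - y) ** 2 for x, y in patterns) / len(patterns)


def train(net, patterns, epochs, rate=0.2):
    for _ in range(epochs):
        for inputs, target in patterns:
            net.train_step(inputs, target, rate)


if __name__ == "__main__":
    # Synthetic logistic-map series in (0, 1) (assumed stand-in), NOT the runoff data.
    series = [0.3]
    for _ in range(299):
        series.append(3.9 * series[-1] * (1.0 - series[-1]))
    for n_inputs in range(1, 4):
        patterns = make_patterns(series, n_inputs)
        net = ThreeLayerNet(n_inputs, n_hidden=4)
        train(net, patterns, epochs=100)
        print(n_inputs, round(mean_squared_error(net, patterns), 5))
```

Repeating the loop over the number of inputs and the number of hidden nodes, and scoring each network on held-out values, corresponds to the model selection described above.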
The reasonably good predictions achieved for the runoff series (Table 1) seem to indicate that the neural networks are capable of capturing the inherent nonlinearities present in the runoff dynamics, even in the absence of any a priori knowledge on the actual physics of the process. Also, the observation of an unchanged or an insignificant increase in prediction accuracy at higher number of inputs seems to indicate that the neural networks are far less sensitive to the presence of noise in the data.

Figure 2. Monthly runoff prediction results using artificial neural networks: (a) correlation coefficient versus number of inputs and (b) time series comparison of observed and predicted values.

3.3. Correlation Dimension Estimation

[16] The correlation functions and the exponents are now computed for the runoff series. The delay time for the phase-space reconstruction is computed using the autocorrelation function method and is taken as the lag time at which the autocorrelation function first crosses the zero line. The first zero value of the autocorrelation function attained is at lag time equal to 4, and therefore this value is used as the delay time. Figure 3a shows the relationship between C(r) and r for embedding dimensions, m, from 1 to 10, and Figure 3b shows the relationship between the correlation exponent values and the embedding dimension values. Figure 3b shows that the correlation exponent value increases with the increase in the embedding dimension up to a certain value and saturates beyond that dimension. The saturation value of the correlation exponent (or correlation dimension) is 3.62. The presence of such a low correlation dimension seems to suggest the possible presence of low-dimensional chaotic behavior in the

runoff dynamics. The correlation dimension value of 3.62 obtained seems to indicate that the number of variables dominantly influencing the runoff dynamics is four.

3.4. Comparison of Prediction and Dimension Results

[17] The optimal embedding dimension from the phase-space method and the optimal number of inputs from the neural networks consistently indicate that the runoff dynamics is dominantly dependent on only three variables, whereas the influence of any other variable is rather insignificant. A process that is dominantly influenced by only three variables is expected to yield reasonably good predictions, and the prediction results also support this point. Such results could form a reasonable basis for viewing the runoff dynamics as a low-dimensional chaotic system, with three dominant variables. On the other hand, an independent analysis of the runoff series using the correlation dimension method also indicates that the runoff dynamics may be viewed as a low-dimensional chaotic system, with four dominant variables. Even though a (slight) discrepancy is observed, the results do not seem to refute the presence of low-dimensional dynamics. With regard to the discrepancy, the correlation dimension results indicate the necessity of an additional dominant variable (overestimation) for modeling the runoff dynamics, which is contrary to the expected results if the short time series (small data size) were to have any influence on the dimension estimation (underestimation). Therefore the use of a small data size has not

resulted in an underestimation of the correlation dimension; instead, there may have been some other factor that has resulted in its overestimation. One possible factor could be the presence of noise in the runoff series, as it may result in an overestimation of the dimension [e.g., Schreiber and Kantz, 1996]. All these results indicate that a large data size may not be necessary for the dimension estimation, and reliable estimation can be achieved even with a small data size provided the series is long enough to capture the important dynamical changes of the system with time.

Figure 3. Correlation dimension results for monthly runoff: (a) relationship between log C(r) and log r and (b) relationship between correlation exponent and embedding dimension.

4. Summary and Conclusions

[18] The present study was aimed at investigating the reliability of the correlation dimension in short hydrological time series, by evaluating the accuracy of its estimation using an inverse approach. This was done by first making predictions of the time series and then independently estimating the correlation dimension and, finally, comparing the dimension with prediction results. The concepts of phase-space reconstruction and artificial neural networks were used for predictions. A short hydrological (monthly runoff) time series observed over a period of 48 years (a total of 576 values) at the Coaracy Nunes/Araguari River watershed in northern Brazil was analyzed.

[19] The prediction results from both the phase-space method and the neural networks indicated that the runoff dynamics was dominantly dependent on only three variables, whereas the corresponding number obtained using the correlation dimension method was four.
Considering the reasonably good predictions of the runoff series, the above results suggested that the correlation dimension might have been overestimated (possibly due to the presence of noise in the data) rather than underestimated (due to the small data size). The results seem to indicate that the accuracy
of the correlation dimension depends primarily on whether the time series is long enough to sufficiently represent the changes that the system undergoes over a period of time, rather than on the data size in terms of the number of values. Therefore the correlation dimension could be a reliable indicator of low-dimensional chaos even in short hydrological time series. On the other hand, the possibility that a finite correlation dimension may also be observed for a linear stochastic process [e.g., Osborne and Provenzale, 1989] may necessitate testing for the presence of nonlinearity when investigating low-dimensional chaos in hydrological time series. However, some of the existing nonlinearity detection methods, such as the surrogate data method, use the correlation dimension as a statistic to detect the presence of nonlinearity, and therefore the evaluation of the accuracy of the correlation dimension may be a fundamental step in studies investigating low-dimensional chaos. The present study presented one possible way for such an evaluation.

[20] Having verified above the reliability of the correlation dimension as an indicator of low-dimensional chaos in the runoff (or any other hydrological) time series, an important next step should be to assess the application of the phase-space reconstruction concept to the runoff time series to represent the dominant features of the underlying hydrological system and to interpret the correlation dimension estimate of the runoff time series toward identifying the (number of) dominant variables (or degrees of freedom) necessary to accurately model the underlying system.
The importance of such tasks lies in the following facts: (1) by definition, if the phase space can be reconstructed accurately from a time series of a given dynamical system, then irrespective of the variable analyzed, it should yield a similar correlation dimension; (2) different variables involved in the same (hydrological or any other natural) system could yield different estimates of correlation dimension, and a suitably selected variable could sometimes yield a fairly good estimate even if the number of data points is not large [Lorenz, 1991]; and (3) the presence of physical constraints and thresholds could yield very different estimates of correlation dimension when analyzing different variables of the same dynamical system [Islam et al., 1993]. The immediate question therefore is whether, for instance, the runoff series studied in the present study is indeed a suitable variable for reconstructing the phase space to represent the underlying hydrological system and also for the correlation dimension estimation. One possible way to answer this question is by studying the other relevant variables involved in the same hydrological system, such as rainfall, evaporation, etc., and verifying, for instance, the correlation dimension estimates. On the other hand, a multivariable phase-space reconstruction approach, involving the time series of all the variables, if available, or at least a few of them, may also be attempted and the results verified. Details of such analyses are reported by B. Sivakumar et al. (submitted manuscript, 2001) and therefore are not reported herein.

[21] Acknowledgments. This study was funded by the Swedish Research Council for Engineering Sciences and the Swedish Natural Science Research Council. The first author wishes to thank the Nils Hörjel Foundation for granting a scholarship for his stay at the Department of Water Resources Engineering, Lund University. The useful comments provided by the two anonymous reviewers are greatly appreciated.

References

Abarbanel, H. D. I., Analysis of Observed Chaotic Data, Springer-Verlag, New York, 1996.
American Society of Civil Engineers (ASCE), Task Committee on Application of Artificial Neural Networks in Hydrology, II, Hydrologic applications, J. Hydrol. Eng., 5(2), 124–137, 2000.
Casdagli, M., Nonlinear prediction of chaotic time series, Phys. D, 35, 335–356, 1989.
Farmer, J. D., and J. J. Sidorowich, Predicting chaotic time series, Phys. Rev. Lett., 59, 845–848, 1987.
Ghilardi, P., and R. Rosso, Comment on "Chaos in rainfall" by I. Rodriguez-Iturbe et al., Water Resour. Res., 26(8), 1837–1839, 1990.
Grassberger, P., and I. Procaccia, Measuring the strangeness of strange attractors, Phys. D, 9, 189–208, 1983.
Islam, M. N., and B. Sivakumar, Characterization and prediction of runoff dynamics: A nonlinear dynamical view, Adv. Water Resour., in press, 2002.
Islam, S., R. L. Bras, and I. Rodriguez-Iturbe, A possible explanation for low correlation dimension estimates for the atmosphere, J. Appl. Meteorol., 32, 203–208, 1993.
Jayawardena, A. W., and F. Lai, Analysis and prediction of chaos in rainfall and streamflow time series, J. Hydrol., 153, 23–52, 1994.
Koutsoyiannis, D., and D. Pachakis, Deterministic chaos versus stochasticity in analysis and modeling of point rainfall series, J. Geophys. Res., 101, 26,441–26,451, 1996.
Krasovskaia, I., L. Gottschalk, and Z. W. Kundzewicz, Dimensionality of Scandinavian river flow regimes, Hydrol. Sci. J., 44(5), 705–723, 1999.
Lambrakis, N., A. S. Andreou, P. Polydoropoulos, E. Georgopoulos, and T. Bountis, Nonlinear analysis and forecasting of a brackish karstic spring, Water Resour. Res., 36(4), 875–884, 2000.
Lorenz, E. N., Dimension of weather and climate attractors, Nature, 353, 241–244, 1991.
Osborne, A. R., and A. Provenzale, Finite correlation dimension for stochastic systems with power-law spectra, Phys. D, 35, 357–381, 1989.
Porporato, A., and L. Ridolfi, Nonlinear analysis of river flow time sequences, Water Resour. Res., 33(6), 1353–1367, 1997.
Rodriguez-Iturbe, I., F. B. De Power, M. B. Sharifi, and K. P. Georgakakos, Chaos in rainfall, Water Resour. Res., 25(7), 1667–1675, 1989.
Schreiber, T., and H. Kantz, Observing and predicting chaotic signals: Is 2% noise too much?, in Predictability of Complex Dynamical Systems, edited by Yu. A. Kravtsov and J. B. Kadtke, pp. 43–65, Springer-Verlag, New York, 1996.
Sivakumar, B., Chaos theory in hydrology: Important issues and interpretations, J. Hydrol., 227(1–4), 1–20, 2000.
Sivakumar, B., S. Y. Liong, C. Y. Liaw, and K. K. Phoon, Singapore rainfall behavior: Chaotic?, J. Hydrol. Eng., 4(1), 38–48, 1999.
Sivakumar, B., R. Berndtsson, and M. Persson, Monthly runoff prediction using phase-space reconstruction, Hydrol. Sci. J., 46(3), 377–387, 2001.
Smith, L. A., Intrinsic limits on dimension calculations, Phys. Lett. A, 133(6), 283–288, 1988.
Sugihara, G., and R. M. May, Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series, Nature, 344, 734–741, 1990.
Takens, F., Detecting strange attractors in turbulence, in Dynamical Systems and Turbulence, Lecture Notes in Mathematics 898, edited by D. A. Rand and L. S. Young, pp. 366–381, Springer-Verlag, New York, 1981.
Uvo, C. B., Influence of sea surface temperature on rainfall and runoff in northeastern South America: Analysis and modeling, D.Sc. thesis, Lund Univ., Lund, Sweden, 1998.

R. Berndtsson, M. Persson, and C. B. Uvo, Department of Water Resources Engineering, Lund Institute of Technology, Lund University, P. O. Box 118, S-221 00 Lund, Sweden.
B. Sivakumar, Department of Land, Air and Water Resources, University of California, Davis, CA 95616, USA. (sbellie@ucdavis.edu)