IJRASET (UGC Approved Journal): All Rights are Reserved

Similar documents
Study of Intra Annual and Intra Seasonal Variations of Indian Summer Monsoon during ARMEX Period

DETECTION OF TREND IN RAINFALL DATA: A CASE STUDY OF SANGLI DISTRICT

Seasonal Rainfall Trend Analysis

Probability models for weekly rainfall at Thrissur

Seasonal and annual variation of Temperature and Precipitation in Phuntsholing

Density Estimation using Polynomial

Testing Statistical Hypotheses

SEVERAL μs AND MEDIANS: MORE ISSUES. Business Statistics

Chapter 4 Inter-Annual and Long-Term Variability

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances

A note on the performance of climatic forecasts for Asia based on the periodic portion of the data:

Testing Statistical Hypotheses

STOCHASTIC MODELING OF MONTHLY RAINFALL AT KOTA REGION

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Role of Statistics in Monitoring Rainfall

A nonparametric two-sample wald test of equality of variances

Application of Parametric Homogeneity of Variances Tests under Violation of Classical Assumption

Stochastic Modeling of Tropical Cyclone Track Data

Stochastic Hydrology. a) Data Mining for Evolution of Association Rules for Droughts and Floods in India using Climate Inputs

Long-Range Forecasting of the Indian Summer Monsoon Onset and Rainfall with Upper Air Parameters and Sea Surface Temperature1, 2

FORECASTING OF COTTON PRODUCTION IN INDIA USING ARIMA MODEL

Rainfall is the major source of water for

Non-parametric tests, part A:

Probabilistic Analysis of Monsoon Daily Rainfall at Hisar Using Information Theory and Markovian Model Approach

Subject CS1 Actuarial Statistics 1 Core Principles

Nonparametric Methods

Weather Outlook. 1 July 2016 METEOROLOGICAL RESEARCH GROUP. INDIA HEADQUART ERS G 31 Quest Offices DLF Golf Course Road Gurgaon India

Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods

Rainfall Trend in Semi Arid Region Yerala River Basin of Western Maharashtra, India

BOOTSTRAP CONFIDENCE INTERVALS FOR PREDICTED RAINFALL QUANTILES

International Journal of Science, Environment and Technology, Vol. 6, No 4, 2017,

Chapter 9. Non-Parametric Density Function Estimation

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

1 Ministry of Earth Sciences, Lodi Road, New Delhi India Meteorological Department, Lodi Road, New Delhi

A comparison study of the nonparametric tests based on the empirical distributions

Long Range Forecast Update for 2014 Southwest Monsoon Rainfall

INDIAN INSTITUTE OF SCIENCE STOCHASTIC HYDROLOGY. Lecture -29 Course Instructor : Prof. P. P. MUJUMDAR Department of Civil Engg., IISc.

Lecture Slides. Elementary Statistics. by Mario F. Triola. and the Triola Statistics Series

Lecture Slides. Section 13-1 Overview. Elementary Statistics Tenth Edition. Chapter 13 Nonparametric Statistics. by Mario F.

A MARKOV CHAIN ANALYSIS OF DAILY RAINFALL OCCURRENCE AT WESTERN ORISSA OF INDIA

Unidirectional trends in rainfall and temperature of Bangladesh

* Tuesday 17 January :30-16:30 (2 hours) Recored on ESSE3 General introduction to the course.

Chapter-1 Introduction

Temporal and Spatial Analysis of Drought over a Tropical Wet Station of India in the Recent Decades Using the SPI Method

Chapter 9. Non-Parametric Density Function Estimation

Why should I use a Kruskal-Wallis test? (With Minitab) Why should I use a Kruskal-Wallis test? (With SPSS)

Rainfall Analysis in Mumbai using Gumbel s Extreme Value Distribution Model

Midwest Big Data Summer School: Introduction to Statistics. Kris De Brabanter

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon

MULTIVARIATE ANALYSIS OF VARIANCE

Frequency analysis of rainfall deviation in Dharmapuri district in Tamil Nadu

Development of wind rose diagrams for Kadapa region of Rayalaseema

ARIMA modeling to forecast area and production of rice in West Bengal

The Trend Analysis Of Rainfall In The Wainganga River Basin, India

An Analysis of College Algebra Exam Scores December 14, James D Jones Math Section 01

SPSS LAB FILE 1

Occurrence of heavy rainfall around the confluence line in monsoon disturbances and its importance in causing floods

A stochastic modeling for paddy production in Tamilnadu

Estimation of cumulative distribution function with spline functions

Summary and Conclusions

Impact of climate change on extreme rainfall events and flood risk in India

Chapter Introduction

Rainfall variation and frequency analysis study of Salem district Tamil Nadu

Intraseasonal Characteristics of Rainfall for Eastern Africa Community (EAC) Hotspots: Onset and Cessation dates. In support of;

Experimental Design and Data Analysis for Biologists

Environmental and Earth Sciences Research Journal Vol. 5, No. 3, September, 2018, pp Journal homepage:

Trends and Variability of Climatic Parameters in Vadodara District

Extending clustered point process-based rainfall models to a non-stationary climate

Spatial and Temporal Analysis of Rainfall Variation in Yadalavagu Hydrogeological unit using GIS, Prakasam District, Andhra Pradesh, India

TIME SERIES ANALYSIS AND FORECASTING USING THE STATISTICAL MODEL ARIMA

Forecasting Drought in Tel River Basin using Feed-forward Recursive Neural Network

Forecasting the Prices of Indian Natural Rubber using ARIMA Model

arxiv: v1 [stat.me] 14 Jan 2019

Directional Statistics

Syllabus For II nd Semester Courses in MATHEMATICS

On prediction and density estimation Peter McCullagh University of Chicago December 2004

Weekly Rainfall Analysis and Markov Chain Model Probability of Dry and Wet Weeks at Varanasi in Uttar Pradesh

Introduction to Signal Detection and Classification. Phani Chavali

Statistics for Managers Using Microsoft Excel Chapter 9 Two Sample Tests With Numerical Data

Seasonality and Rainfall Prediction

Anomaly Construction in Climate Data : Issues and Challenges

PRINCIPLES OF STATISTICAL INFERENCE

Analysis of Rainfall and Other Weather Parameters under Climatic Variability of Parbhani ( )

Stat 5101 Lecture Notes

Rainfall variation and frequency analysis study in Dharmapuri district, India

A Feature Based Neural Network Model for Weather Forecasting

Lecture 06. DSUR CH 05 Exploring Assumptions of parametric statistics Hypothesis Testing Power

A NOTE ON THE CHOICE OF THE SMOOTHING PARAMETER IN THE KERNEL DENSITY ESTIMATE

Hierarchical Bayesian Modeling of Multisite Daily Rainfall Occurrence

ColomboArts. Volume II Issue I Dynamic Trends of Intensity of Rainfall Extremes in Sri Lanka

HANDBOOK OF APPLICABLE MATHEMATICS

CLIMATE CHANGE AND TREND OF RAINFALL IN THE SOUTH-EAST PART OF COASTAL BANGLADESH

Foundations of Probability and Statistics

Local Polynomial Modelling and Its Applications

My data doesn t look like that..

1. TEMPORAL CHANGES IN HEAVY RAINFALL FREQUENCIES IN ILLINOIS

Testing for Regime Switching in Singaporean Business Cycles

WELCOME! Lecture 13 Thommy Perlinger

SUPPLEMENTARY INFORMATION

SUMMARY AND CONCLUSIONS

Transcription:

Shift in Date of Onset of the Southwest Monsoon in India from 146 Years of Data Exploring the Change in Distribution Using Non-Parametric Polynomial Estimators T. Unnikrishnan 1, P. Anilkumar 1 and Kadambot H. M. Siddique 2 1 Dept. of Statistics, Farook College, University of Calicut 2 The UWA Institute of Agriculture and School of Agriculture and Environment, The University of Western Australia, Abstract: The accurate date of onset of the southwest monsoon in India is unpredictable due to its complexity in estimation. However, it appears that in recent years the onset of the southwest monsoon has been occurring earlier than normal. Therefore, we studied the change in distribution of the date of onset of the southwest monsoon in India using a recently developed, highly efficient non-parametric method, the non-parametric polynomial estimator. The polynomial estimator was used to estimate the non-parametric distribution of the frequency of occurrence of the onset of the southwest monsoon in India from 187 to 215. The 146-year data was split in to three time periods and analyzed; removing the middle time periodshoweda clear shift in the median of the observed values towards an earlier date. Keywords: Onset of South West Monsoon, Distribution Fitting, Non-parametric Polynomial Estimation, Climate. I. INTRODUCTION Random variables are characterized by probability density functions. Hence, the estimation of probability density function has great importance. Parametric density estimation pre-assumes shape of the distribution, but not the parameters. Situations may arise where parametric density estimation will not provide an accurate estimate of a probability density function. In these circumstances, nonparametric estimation is important as it does not assume a prior distribution. Here, the data will speak for themselves. Silverman (1981) reported that density estimates are useful at all stages of statistical treatment of data. Density estimators indicate multimodality, skewness, or dispersion of the data in the exploratorys tage. For confirmatory purposes, density estimators can be used, for instance, in non-parametric discriminant analysis (Prakash Rao, 1983). There is extensive literature on the asymptotic properties of the kernel estimate. Aweak uniform consistency is established for kernel estimates (Prakash Rao, 1983). A different and somewhat weaker type of consistency is discussed in Deveroy and Gyorfi(1985). Density estimators provide the best information transformations of the given datafor presentation. Rudin (1976) introduced a polynomial to prove the Stone Weierstrass theorem. The same polynomial was used todevelop the method of the polynomial estimator for fitting distributions using the nonparametric technique. The proposed estimator belongs to the general class of estimators. The appropriate degree of the polynomial (d)will be fixed based on the sample using an objective criterion. The theory of the non-parametric polynomial estimator for fitting distribution was tested on the arrival of the southwest monsoon season in India. This monsoon accounts for 8% of the rainfall in India. The coastal state of Kerala is the first Indian state to receive rain from the southwest monsoon, with the normal onset being 1 June (±7 days).the southwest monsoon reaches Kolkatta and Visakhapattanam on 7 June, Mumbai on 1 June, Delhi on 3 July, and the extreme western part of Rajasthan on 15 July. The onset of the southwestmonsoon is criticalfor Indian agriculture as it irrigates more than half of India s cropping area. The earliest onset of the southwest monsoon was recordedon 11 May 1918 and the latest on 18 June 1972. Many studies have analyzed the date of onset of the southwest monsoon in India. For example, Subbaramayya and Bhanu kumar(1978) and Subbaramayya et al. (1984) discussed different features related to the onset date of the southwest monsoon in India. Based on these studies and the climatic conditions, criteria for onset of southwest monsoon are predicted. Ananthakrishnan and Soman (1988) collated and analyzed data from the Indian Meteorological Department on the date of onset of the southwest monsoon in India from 191 198and found that the mean onset date from191 194 was four days later than that from1941 198. Unnikrishnan and Ajitha (29) developed a time-series model for forecasting the onset of the southwest monsoon in Kerala using data from 187 29. All of the above studies were based on the concepts of parametric statistical analysis. Parametric tests are usually based on assumptions, which may not be valid in certain situations. In this study, we used a 668

novel approach incorporating anon-parametric method to analyze 146 years of data on the onset date of the southwest monsoon in Kerala, India. II. MATERIALS AND METHODS The distribution of onset date of the southwest monsoon in India from 187 to 215 was analyzed using non-parametric statistical tools recently developed for situations where parametric tests miss the mark. Onset dates from 187 to 215 were collected from the Department of Agricultural Meteorology, Kerala Agricultural University. The range of dates spanned 39 days; the earliest monsoon onset on 11Maywas given the code, and the latest monsoon onset on 18 June given the code 38. The frequency of each date of onset was recorded. Nonparametric tests were applied to determine whether the median varied significantly between the observed dates of onset corresponding to the5 years of data at the beginning of the sample period and the 5 years of data at the end of the observed period. To do this, the data from 1921 to 1964were removed, and the two groups either side of these dates were analyzed to determine any change in median date of onset of the southwest monsoon in India. The Kolmogorov two-sample test was applied to determine whether the median of two samples, i.e., median of the dates of onset from 187 to 192 and median of the dates of onset from 1965 to 215, varied significantly. To test whether the sample variancesdiffered significantly, Levene s test for homogeneity of variance was applied using SPSS. he newly developed tool for finding the probability density function using a nonparametric polynomial estimator (Hamza, 29) was tested using the following formula: The c d values were calculated for d=1, 2, 3 using: c d = for x [,1], X i the i th observation. where d is the degree of the polynomial and, in fact, the smoothing parameter.the estimation process is similar to that of the kernel bandwidth parameter of a kernel density estimator (Silverman, 1986). In kernel density estimation, the bandwidthparameter is determined using two procedures: (i) least square cross-validation, and (ii) likelihood cross-validation. In kernel estimation, the bandwidth decreases with increasing sample size,whereas in polynomial estimation, the degree of the polynomial increases as the sample size increases. Polynomial estimation is more robust than kernel density estimation, as it is more precise as the sample size increases. Moreover, bandwidth is highly sensitive, but the degree of the polynomial did not affect the estimation of smaller errors in d values. In polynomial estimation, likelihood cross-validationis used to maximize CV(d) alone, where. Since the calculation of c d values can be difficult due to factorials in the expression, a recurrence formula was determined, where c d =c d 1 *(2d+1)/2d. The choice of d was based on the maximization of CV(d) values, calculated using maximum likelihood estimation of by leaving the i th variable in the frequency data. The d CV(d) curves were plotted and the maximum CV(d) point was selected. The value of d, corresponding to the unique maximum, is the optimum parameter for the distribution. Using the d and c d values, the probability density function was determined. This process is much easier than kernel band width estimation and can be estimated simply in Microsoft Excel. If the domain lies in the interval [a,b], then the formula will be: III. RESULTS AND DISCUSSION The time-series plot showed a cyclical pattern in the data (Fig. 1) as did the 3-year moving average plot (Fig. 2). The cyclical behavior of the time-series data shifted the median value. At times, the periodic behavior can repeat at regular intervals. Also, there may be an upward or downward trend in the data, and hence the distribution may vary over the long run. Here, we aimed to test whether there was any shift in the distribution of the onset of the southwestmonsoon in India. 669

Pattern of Onset dates 4 35 3 25 2 15 1 5 187 1875 188 1885 Date 189 1895 19 195 191 1915 192 1925 193 1935 194 1945 195 1955 196 1965 197 1975 198 1985 199 1995 2 25 21 215 Year Figure1. Time-series pattern for the date of onset of the southwest monsoon in India. Figure2. The 3-year moving average of the date of onset of the southwest monsoon in India. The two groups of data, with 51 observations each, were analyzed using the Kolmogorov Smirnov one-sample test for normality. The test rejected the null hypothesis with Z = 1.43 and p=.39 for the data from 187 192, and Z = 1.539 and p=.18 for the data from1965 215. The two groups were analyzedto determine whether there were any significant changes in the distribution using the Kolmogorov Smirnov two-sample test. While Levene s test for homogeneity of variances showed no significant variation in data, the Komogorov Smirnov two-sampletestshowed a change inthe median, with Z = 1.386 and p=.43. To determinethe distribution of onset dates from 187 to 215 using nonparametric estimation, CV(d) wascalculated for d=1,2,3, using the corresponding c d and the frequency of onset data.thed CV(d) curve identified an optimum d of 83,andacorresponding c d of5.163 (Fig. 3). Using these values,the distribution of the data using polynomial estimation and plotted in Figure 4 is: 67

d- cv(d) Graph for onset distribution from 187 to 215 25 24 23 CV(d) 22 21 2 19 18 1 5 9 13 17 21 25 29 33 37 41 45 49 53 57 61 65 69 73 77 81 Figure3. d CV(d) curve for date of onset of southwest monsoon in India from 187 to 215. d Distribution of onsetdates 187-215.16.14.12.1.8.6.4.2 3 5 8 11 13 Q(x) 16 19 21 24 27 29 32 35 37 x Figure4. Distribution of onset date of the southwest monsoon in India using To determine the distribution of the data from 187 192 using nonparametric estimation, CV(d) was calculated for d=1,2,3, using the corresponding c d and the frequency of onset data.thed CV(d) curve identified an optimum d of 57 and a corresponding c d of4.287 (Fig. 5). Using these values, the distribution of the data using polynomial estimation and plotted in Figure 5 is: 671

C V (d ) d- cv(d) Graph for onset distribution from 187 to 192 9 85 8 75 7 65 1 5 9 13 17 21 25 29 33 37 41 45 49 53 57 61 65 69 73 77 81 C V (d ) d- cv(d) Graph for onset distribution from 1965 to 215 1 9 8 7 6 5 4 3 2 1 1 5 9 13 17 21 25 29 33 37 41 45 49 53 57 61 65 69 73 77 81 d Figure5. The d CV(d) curve for onset date of the southwest monsoon in India. Similarly, to determine the distribution of the data from 1965 215 using nonparametric estimation, CV(d) was calculated for d=1,2,3, using the corresponding c d and the frequency of onset data.thed CV(d) curve identified an optimum d of48andacorrespondingc d of 3.939 (Fig. 5). Using these values, the distribution of the data using polynomial estimation plotted in d Figure 6 is:.14.12 Shift in Probability Density Function of date of onset of South West Monsoon in India in two different Time periods 1965-215 187-192 P ro b a b ility D e n s ity.1.8.6.4.2 1 1 -M a y 1 2 -M a y 1 3 -M a y 1 4 -M a y 1 5 -M a y 1 6 -M a y 1 7 -M a y 1 8 -M a y 1 9 -M a y 2 -M a y 2 1 -M a y 2 2 -M a y 2 3 -M a y 2 4 -M a y 2 5 -M a y 2 6 -M a y 2 7 -M a y 2 8 -M a y 2 9 -M a y 3 -M a y 3 1 -M a y 1 -J u n 2 -J u n 3 -J u n 4 -J u n 5 -J u n 6 -J u n 7 -J u n 8 -J u n 9 -J u n 1 -J u n 1 1 -J u n 1 2 -J u n 1 3 -J u n 1 4 -J u n 1 5 -J u n 1 6 -J u n 1 7 -J u n 1 8 -J u n Date of Onset of South West Monsoon in India Figure6. The shift in the distribution of onset date of the southwest monsoon in India. Superimposing the two graphs, a clear shift in the onset date of the southwest monsoon in India to earlier than normal is evident Fig. 6). 672

IV. CONCLUSIONS The Kolmogorov Smirnov two-sample test showed a shift in the median value of the onset date of the southwest monsoon in India to earlier than normal. Comparing the variance using Lavene s test for homogeneity, there was no significant change in variance. Since the Kolmogorov Smirnov test showed that the frequency was not normally distributed, the optimum probability density function for fitting the data was determined using a polynomial estimator, which showed a shift in the distribution of onset of the southwest monsoon in India. The optimum choice of smoothing parameter d, using the maximum CV(d) score, is unaffected by small changes in the data as the sample size increases. Hence, the approach is recognized as an excellent data-driven method for selecting the smoothing parameter of the proposed estimator. The idea of maximization of CV(d) for finding the value of d was used in the present study to fit the distributions. V. ACKNOWLEDGMENTS The authors thank the Department of Agricultural Meteorology, Kerala Agricultural University for providing the data for the study and the Principal, Farook College, Calicut for making arrangements to undertake the research. REFERENCES [1] Ananthakrishnan, R., Soman, M.K., 1988. The onset of the southwest monsoon over Kerala:191 198.J.Climatol.8, 283 296. [2] Deveroy,L.,Gyorfi,L.,1985. Nonparametric Density Estimation.Wiley, New York. [3] Hamza,K.K., 29. Some contributions to renewal density estimation. Thesis, Department of Statistics, University of Calicut. [4] Prakash Rao, 1983. NonparametricFunctional Estimation. Academic Press, Orlando. [5] Rudin, W.,1976.Principles of Mathematical Analysis. 3rd Edition, McGraw-Hill, New York. [6] Silverman,B.W., 1981. Using kernel density estimate to investigate multi modality, J. R. Stat. Soc.43, 833 836. [7].Silverman,B.W., 1986.Density Estimation for Statistics and Data Analysis. Chapman and Hall, London. [8] Subbaramayya, I., Bhanukumar, O.S.R.U., 1978. The onset and the northern limit of the southwest monsoon over India.Metereol. Mag.17, 37 44. [9] Subbaramayya, I., Babu, S.V., Rao, S.S., 1984. Onset of the summer monsoon over India and its variability.metereol. Mag.113, 127 135. [1] Unnikrishnan, T., Ajitha, T.K., 29. A univariate statistical approach to predict the onset of South West Monsoon in India, in: Proceedings of the National Workshop on Climate and Development, 29 June 29.Vellayani, Trivandrum, pp. 8 16. 673