On Development of Spoke Plot for Circular Variables

Similar documents
A clustering approach to detect multiple outliers in linear functional relationship model for circular data

A. H. Abuzaid a, A. G. Hussin b & I. B. Mohamed a a Institute of Mathematical Sciences, University of Malaya, 50603,

Directional Statistics

Discussion Paper No. 28

The effect of wind direction on ozone levels - a case study

By Bhattacharjee, Das. Published: 26 April 2018

Logistic Regression for Circular Data

Graphical User Interface (GUI) for Torsional Vibration Analysis of Rotor Systems Using Holzer and MatLab Techniques

14.1. Multiple Integration. Iterated Integrals and Area in the Plane. Iterated Integrals. Iterated Integrals. MAC2313 Calculus III - Chapter 14

Outliers Detection in Multiple Circular Regression Model via DFBETAc Statistic

New Family of the t Distributions for Modeling Semicircular Data

( ) g θ denote the Cumulative. G θ and. g ( θ ) can be written in terms of F( x ) and f ( ) x using. tan 2

A NOTE ON THE USE OF DIRECTIONAL STATISTICS IN WEIGHTED EUCLIDEAN DISTANCES MULTIDIMENSIONAL SCALING MODELS

Wrapped Gaussian processes: a short review and some new results

x n+1 = ( x n + ) converges, then it converges to α. [2]

1. (10pts) If θ is an acute angle, find the values of all the trigonometric functions of θ given that tan θ = 1. Draw a picture.

Concepts for Advanced Mathematics (C2) WEDNESDAY 9 JANUARY 2008

A CIRCULAR CIRCULAR REGRESSION MODEL

A SAS/AF Application For Sample Size And Power Determination

4 The Trigonometric Functions

Advanced Higher Mathematics of Mechanics

Math 30-1: Trigonometry One PRACTICE EXAM

Mathematical Models with Maple

Prentice Hall Geometry (c) 2007 correlated to American Diploma Project, High School Math Benchmarks

orbits Moon, Planets Spacecrafts Calculating the and by Dr. Shiu-Sing TONG

Scattered Energy of Vibration a novel parameter for rotating shaft vibration assessment

ECE521 W17 Tutorial 1. Renjie Liao & Min Bai

Relations in epidemiology-- the need for models

DEVIL PHYSICS BADDEST CLASS ON CAMPUS IB PHYSICS

Chap.11 Nonlinear principal component analysis [Book, Chap. 10]

The choice of origin, axes, and length is completely arbitrary.

Test Wednesday, Oct. 12 th 7pm, White Hall A-K: B51 L-Z:G09 Bring your calculator! 20 Multiple choice questions from:

IT S TIME FOR AN UPDATE EXTREME WAVES AND DIRECTIONAL DISTRIBUTIONS ALONG THE NEW SOUTH WALES COASTLINE

HubVis: Software for Gravitational Lens Estimation and Visualization from Hubble Data

Test Wednesday, March 15 th 7pm, Bring your calculator and #2 pencil with a good eraser! 20 Multiple choice questions from:

Chapter 3. Radian Measure and Circular Functions. Copyright 2005 Pearson Education, Inc.

Common Core State Standards for Mathematics Integrated Pathway: Mathematics I

Cambridge International Examinations CambridgeOrdinaryLevel

MEI STRUCTURED MATHEMATICS 4763

Expandable Structures formed by Hinged Plates

Unit Circle: The unit circle has radius 1 unit and is centred at the origin on the Cartesian plane. POA

Find the length of an arc that subtends a central angle of 45 in a circle of radius 8 m. Round your answer to 3 decimal places.

A New Iterative Procedure for Estimation of RCA Parameters Based on Estimating Functions

Mermaid. The most sophisticated marine operations planning software available.

Warm Up = = 9 5 3) = = ) ) 99 = ) Simplify. = = 4 6 = 2 6 3

International General Certificate of Secondary Education CAMBRIDGE INTERNATIONAL EXAMINATIONS PAPER 2 MAY/JUNE SESSION 2002

Topic 6 The Killers LEARNING OBJECTIVES. Topic 6. Circular Motion and Gravitation

Kinematics. Vector solutions. Vectors

Experiment 1: Linear Regression

Appendix F: Projecting Future Sea Level Rise with the SLRRP Model

The Change-of-Variables Formula for Double Integrals

Principal Component Analysis, A Powerful Scoring Technique

A Level. A Level Mathematics. Understanding the Large Dataset (Answers) Edexcel. Name: Total Marks:

ANN and Statistical Theory Based Forecasting and Analysis of Power System Variables

The Solution of Weakly Nonlinear Oscillatory Problems with No Damping Using MAPLE

LECTURE 20: Rotational kinematics

Math Section 4.3 Unit Circle Trigonometry

CBE 6333, R. Levicky 1. Orthogonal Curvilinear Coordinates

Design and Modelling of Generator for Wave Energy Conversion System in Malaysia

Given one trigonometric ratio and quadrant, determining the remaining function values

Numerical Comparisons for Various Estimators of Slope Parameters for Unreplicated Linear Functional Model

Pure Mathematics Year 1 (AS) Unit Test 1: Algebra and Functions

Determination of the Rydberg constant, Moseley s law, and screening constant (Item No.: P )

BNG 495 Capstone Design. Descriptive Statistics

North Carolina Offshore Buoy Data. Christopher Nunalee Deirdre Fateiger Tom Meiners

Analysis of Scattering of Radiation in a Plane-Parallel Atmosphere. Stephanie M. Carney ES 299r May 23, 2007

Notes on Radian Measure

+ 2gx + 2fy + c = 0 if S

Estimating Conditional Probability Densities for Periodic Variables

Single and multiple linear regression analysis

Circular Distributions Arising from the Möbius Transformation of Wrapped Distributions

Rotational Motion. 1 Introduction. 2 Equipment. 3 Procedures. 3.1 Initializing the Software. 3.2 Single Platter Experiment

Implicit Differentiation and Related Rates

Electromagnetic Forces on Parallel Current-

Uncertainty analysis of heliostat alignment at the Sandia solar field

MOVEMENT (II) This unit continues with the study of movement started in Movement (I).

Discriminant Analysis with High Dimensional. von Mises-Fisher distribution and

1. Introduction. Hang Qian 1 Iowa State University

Using Mplus individual residual plots for. diagnostics and model evaluation in SEM

Speed and change of direction in larvae of Drosophila melanogaster

Math 120: Precalculus Autumn 2017 A List of Topics for the Final

Iridescent clouds and distorted coronas

Horizontal Progression Recommendations

BV4.1 Methodology and User-friendly Software for Decomposing Economic Time Series

Mth 133 Trigonometry Review Problems for the Final Examination

Year 12 Maths C1-C2-S1 2016/2017

This section will help you revise previous learning which is required in this topic.

Regression Graphics. 1 Introduction. 2 The Central Subspace. R. D. Cook Department of Applied Statistics University of Minnesota St.

Year 12 Maths C1-C2-S1 2017/2018

DO NOT ROUND YOUR ANSWERS

CatchAll Version 2.0 User Operations Manual. by Linda Woodard, Sean Connolly, and John Bunge Cornell University. June 7, 2011

Cambridge International Examinations Cambridge International General Certificate of Secondary Education

HOW TO USE MIKANA. 1. Decompress the zip file MATLAB.zip. This will create the directory MIKANA.

Principles of Mathematics 12: Explained!

5-Sep-15 PHYS101-2 GRAPHING

Spreadsheet Solution of Systems of Nonlinear Differential Equations

Comparing measures of fit for circular distributions

Lecture 19. ROLLING WITHOUT SLIPPING. Figure 4.8 Gear rolling in geared horizontal guides. Figure 4.9 Wheel rolling on a horizontal surface.

Dennis Bricker Dept of Mechanical & Industrial Engineering The University of Iowa. Forecasting demand 02/06/03 page 1 of 34

Tennessee MATH III (Traditional) Pacing Guide

Transcription:

Chiang Mai J. Sci. 2010; 37(3) 369 Chiang Mai J. Sci. 2010; 37(3) : 369-376 www.science.cmu.ac.th/journal-science/josci.html Contributed Paper On Development of Spoke Plot for Circular Variables Fakhrulrozi Hussain, Abdul G. Hussin *, and Yong Z. Zubairi Centre for Foundation Studies in Science, University of Malaya, 50603 Kuala Lumpur, Malaysia. *Author for correspondence; e-mail: ghapor@um.edu.my Received: 5 January 2010 Accepted: 12 August 2010 ABSTRACT Relationship and simple analysis between two any circular variables are the main focus of this paper. Correlation measure, linear relationship and estimation of concentration parameters are some examples of simple approach in analyzing two circular variables. These analyses could be extended to the graphical plots to give a better understanding. A plot called Spoke plot which have been developed in MATLAB environments have been proposed to achieve this objective. As an illustration, the application of Spoke plot as a simple analysis between two circular variables using graphical plots in investigating the correlation, linear relationship and estimating concentration parameters are given using the Malaysian wind direction data. Keywords: concentration parameters, correlation coefficient, wind direction. 1. INTRODUCTION Circular variables are common in some data such as clock time, compass bearing, moon circle, wind direction, turbine and others. The wind direction data is a type of the circular data. The data is measured on a scale that repeats itself or with angular directions. Unlike the common linear data, the features of circular data warrants appropriate techniques and special inferential tools in the analysis. The analysis of circular data has been studied for many years [1-4] as well as the application of the simple circular regression model was discussed [5]. Another area of circular statistics which is problems of outliers have been studied by Abu Zaid et al. [6]. Further, a number of symmetric probability models have been developed for directional data and these include the von Mises, wrapped Cauchy, Sen Gupta-Rattihalli and many others However, the analysis of directional data is limited as user friendly statistical software that deal with both exploratory data analysis as well as statistical inference are not many in the market. Although theoretical formulations in modeling circular variables have been developed, the evaluations are somewhat complicated due to the nature of the data. Therefore, any development on the analysis of circular data as well as to incorporate further analysis into the available software is indispensable for circular data. This paper focuses on graphical method in analyzing the relationship of two any circular variables via correlation and linear relationship.

370 Chiang Mai J. Sci. 2010; 37(3) 2. CIRCULAR VARIABLES Circular random variable is one which takes values on the circumference of a circle, i.e. they are angles in the range (0, 2π] radians or (0 o,360 o ]. This random variable must be analyzed by techniques differing from those appropriate for the usual Euclidean type variables because the circumference is a bounded closed space, for which the concept of origin is arbitrary or undefined. A continuous linear variable is a random variable with realizations on the straight line or real line which may be analyzed straightforwardly by usual techniques. For example, if one wants to compare the difference between two circular data that have value 10 o and 350 o, by using linear technique will give the answer of 340 o ; but in reality, the difference between these two data is only 20 o which suggests the need of special approach in dealing with such variables. 2.1 Correlation coefficient of circular variables The correlation measure will be used in the development of the analysis of two circular variables. Correlation or also known as a measure of a correlation coefficient indicates the strength and direction of a linear relationship between two random variables. In general statistical usage, correlation or co-relation refers to the departure of two variables from independence. When the data are linear there are several coefficients, measuring the degree of correlation, adapted to the nature of data. Given n pairs of circular data (θ 1,ϕ 1 ),..., (θ n,ϕ n ), where 0 < (θ i,ϕ i ) < 2π, the circular correlation coefficient [3] is defined by where -1< ρ^τ < + 1., 2.2 Linear relationship between two circular variables The regression model when both variables are circular produces is a very interesting form. The model is given by ϕ = α + βθ + ε (mod2π), where ε is a circular random error having a von Mises distribution with circular mean 0, and concentration parameter κ, which can be written as ε VM(0,κ ). This model have been discussed in detail for β =1 [7] and also for β close to unity [8]. The estimates of α, β and κ namely α^, β^ and κ^ are given by where S =Σ sin (ϕ i - β^θ i ) and C=Σ cos (ϕ i - β^θ i ). Due to the nonlinear nature of the first partial derivative of the log likelihood function, then β^ is obtained by iterative procedure according to the formula The estimate of κ is given by which can be approximated by 3. THE DEVELOPMENT OF SPOKE PLOT Two circular variables (θ,ϕ) may be represented in a Spoke plot of two concentric non intersecting circles with any specific radius. To understand this idea, suppose we have a pair of observation (45 o, 90 o ) for variable θ and ϕ respectively. In the Spoke plot these two observations may be represented by a.

Chiang Mai J. Sci. 2010; 37(3) 371 line that connects these two points θ and ϕ from inner circle (45 o ) to outer circle (90 o ) respectively as shown in Figure 1. The choice of points for the inner and outer circle is arbitrary; that is θ may very well be chosen as the points in the inner and ϕ the points on outer circle respectively. Based on this idea, a Spoke plot is Figure 1. Plot of a line that connecting two points (θ,ϕ) from inner circle (45 o ) to outer circle (90 o ). developed that allows us to look at the pattern of any two circular variables. Therefore, given a set of data with size n where the coordinates (θ i, ϕ i ) for i = 1,2,, n are approximately equal (i.e. θ i ϕ i (mode 2π )) or linearly related, the set of n lines formed by connecting points from the inner circle to outer circle results in a spoke-like image. This pattern suggests a linear association between the two circular variables. As an illustration, we generated values of θ and ϕ from the von Mises distribution with mean π and concentration parameter 2 for both θ and ϕ. A sample measure in degree of the simulated data is given in Table 1. Table 1. Sample of circular data (in degree). Θ 3.56 104.69 72.28 104.51 321.22 93.11 348.57 63.62 152.67 13.88 Φ 110.15 85.40 86.24 51.55 223.60 19.96 100.48 16.98 44.51 0.90 The Spoke plot may be used to represent the pair of points graphically. A program called Spoke(spoke_data) is developed with output as shown in Figure 2. The Spoke plot in Figure 2 could be an alternative approach to look at the relationship between two circular variables θ and ϕ using diagrammatical representation. As for comparison to linear, a scatter plot that is normally used in linear analysis have been plotted as shown in Figure 3. The scatter plot, however could be misleading due to the wrap around properties of the circular data and the Spoke plot seems to be the better alternative.

372 Chiang Mai J. Sci. 2010; 37(3) Figure 2. A sample of Spoke plot for variables θ and ϕ. Figure 3. A sample of Scatter plot for variables θ and ϕ. 3.1 Spoke plot: Correlation coefficient Another application of Spoke plot are for graphical representation for correlation coefficient and linear association. For the illustration purpose, a dataset called spoke_data is generated with size n = 30 based on a simple circular regression model, which is ϕ = α + βθ + ε (mod2π). Without loss of generality, α = 0 and β = 1 have been chosen. Variable θ have been generated from VM(π/4,1.5) and ε from VM(0,30) respectively. The correlation coefficient is used to ρ^τ measure of linear association between those two circular variables using a subprogram called corr(spoke_data). The subprogram is designed in such a way that the first output gives a graphical view of the relationship of two circular variables. This is followed by a numerical measure that describes the strength of linearity. As an illustration the call function spoke(spoke_ data) have been executed followed by call function corr(spoke_data). From the output as shown in Figure 4, the Spoke plot strongly suggests the presence of linear association between two circular variables. This finding is further supported by a numerical measure of correlation value equals to 0.9555 as shown at the bottom of the plot.

Chiang Mai J. Sci. 2010; 37(3) 373 Figure 4. The Spoke plot and correlation coefficient output of spoke_data. 3.2 Spoke plot: linear relationship using simple circular regression Once a linear association is established from the graphical Spoke plot and supported by a numerical measure of, the next step ρ^τ of the analysis is to estimate the parameters of linear relationship of ϕ = α + βθ +ε (mod2π), namely α and β respectively. Another subprogram called alpha_regress (spoke_data) is written to give the estimated value of α and β using the maximum likelihood estimation. By using the similar dataset, we found that the estimation parameters are α = 6.2792 radian and β = 0.9981 respectively as shown in Figure 5. These estimated values are very close to the true values used in the simulation. Another applicability of the Spoke plot Figure 5. The Spoke plot shows linear association of α and β.

374 Chiang Mai J. Sci. 2010; 37(3) is to visually asses the concentration parameter κ of the data. A subprogram called kappa_ spoke(spoke_data) have been written to estimate and display the concentration parameter for both θ and ϕ which are κ θ and κ ϕ respectively. In summary, the four commands given below produce a comprehensive output of spoke plot as shown in Figure 6. >> spoke(spoke_data) >> corr(spoke_data) >> alpha_regress(spoke_data) >> kappa_spoke(spoke_data) The program and subprograms to obtain Spoke plot for circular variables are available at URL: http://asasi.um.edu.my/download/ Spoke_plot_program.doc. Figure 6. A sample of comprehensive Spoke plot. 4. APPLICATION OF SPOKE PLOT As an illustration, the Spoke plots are used to two different real dataset as described below: 4.1wind direction data recorded at 850 hectopascals (hpa) and 1,000 hectopascals (hpa) at time 12.00A.M. from Bayan Lepas Airport, Malaysia, in July and August 2005, with the objective to compare two sets of wind direction data at two different pressures. 4.2 wind direction data from the Holderness Coastline, which is the Humberside coast of the North Sea, United Kingdom in October 1994 that were measured by HF radar and anchored wave buoy. The dataset have 49 measurements recorded over the period 22.7 days. The results are shown in Figures 7 and 8 respectively. From the Spoke plot in Figure 7 of data set (i), it can be seen that a number of lines crossing the inner ring implies that there is no correlation between the variables. To support the finding, the calculated correlation value is 0.1316 which indicates a very weak correlation. The linear relationship value, however shows a strong of one to one linear relationship between the two circular variables. The concentration parameter estimate shows a small concentration for both θ and ϕ.

Chiang Mai J. Sci. 2010; 37(3) 375 Figure 7. Spoke Plot of wind direction data recorded at 850 hpa and 1,000 hpa, from Bayan Lepas Airport, Malaysia, in July and August 2005. Figure 8. Spoke plot of wind direction measured by HF radar and anchored wave buoy, Holderness Coastline, Humberside coast of the North Sea, U.K., in October 1994. From the Spoke plot in Figure 8 of data set (ii), it can be seen that none of the line crosses the inner ring which indicates a strong correlation between the variables. To support the finding, the calculated correlation value of 0.8680 is obtained. The linear association measure shows a strong of one to one linear relationship between the two circular variables. Further, concentration parameters show a small concentration for both θ and ϕ.

376 Chiang Mai J. Sci. 2010; 37(3) 5. DISCUSSION In this paper, a pragmatic approach in the exploratory data analysis of circular variables is developed. The novelty of the methodology is that it combines both visual methodology and numerical approach in the analysis. With the name Spoke plot, the analysis focuses on identifying the relationship of two circular variables. Plot of Spoke graph, correlation measure, linear relationship and concentration parameter value provides a comprehensive exploratory data analysis. In summary, the method of analysis developed in this study has great potential being developed into comprehensive statistical software dedicated to circular variables. With the proper interface, option like graphical-userinterface (GUI), point-and-click window and multiple windows can make the software as sophisticated as those designed for linear data. REFERENCES [1] Mardia K.V., Statistics of Directional Data, London: Academic Press Inc, 1972. [2] Batschelet E., Circular Statistics in Biology, London: Academic Press Inc, 1981. [3] Fisher N.I., Statistical Analysis of Circular Data, Cambridge University Press, 1993. [4] Mardia K.V., and Jupp P.E., Directional Statistics, London: Academic Press Inc, 2000. [5] Jammalamadaka S.R., Topics in Circular Statistics, World Scientific Publishing Co. Pte. Ltd, 2001. [6] Abu Zaid A.H.M., Mohamed I. and Hussin A.G., A new test of discordancy in circular data. Communications in Statistics Simulation and Computation, 2009; 38: 682-691. [7] Caires S. and Wyatt L.R., A linear functional relationship model for circular data with an application to the assessment of ocean wave measurements. Journal of Agricultural, Biological, and Environmental Statistics, 2003; 8: 153-169. [8] Hussin A.G., Fieller N.R.J. and Stillman E.C., Linear regression for circular variables with application to directional data. Journal of Applied Science and Technology, 2004; 9: 1-6.