Principal Component Analysis, an Aid to Interpretation of Data. A Case Study of Oil Palm (Elaeis guineensis Jacq.)


Journal of Emerging Trends in Engineering and Applied Sciences (JETEAS) 4(2): 237-241
Scholarlink Research Institute Journals, 2013 (ISSN: 2141-7016)
jeteas.scholarlinkresearch.org

Ekezie Dan Dan
Department of Statistics, Imo State University, PMB 2000, Owerri, Nigeria.

Abstract
Principal Component Analysis provides an objective way of finding indices so that the variation in the data can be accounted for as concisely as possible. It may well turn out that two or three Principal Components provide a good summary of all the original variables. Consideration of the values of the Principal Components instead of the values of the original variables may then make it much easier to understand what the data have to say. In short, Principal Components Analysis is a means of simplifying data by reducing the number of variables. Although Principal Components Analysis has been well described in a number of texts, the emphasis of the descriptions has been on the underlying theory of the methods and on the methods of computation. Only a limited number of practical examples of the technique have been published in sufficient detail to enable the reader to gain any facility in the interpretation of the results of the analysis, and the necessary background knowledge to the problems is not usually available. In this paper, a case study of the application of Principal Component Analysis to a practical problem is presented, and it is suggested that there is a need for the extensive application of the existing methods of multivariate analysis over a wide range of problems and subjects, especially in agriculture, in order to test the practical value of the techniques.

Keywords: multivariate analysis, principal component analysis, agriculture, oil palm (Elaeis guineensis Jacq.)
INTRODUCTION

Principal Component Analysis is a descriptive procedure for analyzing relationships that may exist in a set of quantitative variables. It is designed to reduce the number of variables that need to be considered to a small number of indices (called the principal components) that are linear combinations of the original variables. Usually the technique is not utilized as an end in itself but as a method for illustrating, modeling, and combining variables for further analysis. For example, much of the variation in the body measurements of oil palm progeny (EWS) shown in Appendix 1 will be related to the general size of the palms, and the total

\[ I_1 = X_1 + X_2 + X_3 + X_4 + X_5 + X_6 + X_7 \tag{1} \]

will measure this quite well. This accounts for one dimension in the data. Another index is

\[ I_2 = X_1 + X_2 + X_3 + X_4 + X_5 - X_6 - X_7 \tag{2} \]

which is a contrast between the first five measurements and the last two. This reflects another dimension in the data.

Principal Component Analysis provides an objective way of finding indices of this type so that the variation in the data can be accounted for as concisely as possible. It may well turn out that two or three Principal Components provide a good summary of all the original variables. Consideration of the values of the Principal Components instead of the values of the original variables may then make it much easier to understand what the data have to say. In short, Principal Components Analysis is a means of simplifying data by reducing the number of variables. Although principal components analysis, and indeed most other multivariate techniques, have been well described in a number of texts, the emphasis of these descriptions has been on the underlying theory of the methods and on the methods of computation.
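As a small illustration of what such indices look like in practice, the sketch below evaluates a plain total and a "first five minus last two" contrast for the first three palms listed in Appendix 1. The names `linear_index`, `TOTAL_WEIGHTS` and `CONTRAST_WEIGHTS` are hypothetical and chosen only for this example; the weights are fixed by hand here, whereas principal component analysis would derive them from the data.

```python
# Illustrative sketch: two hand-built indices for a few palms from Appendix 1.
# Each row holds the seven measurements X1..X7 of one palm.
palms = [
    [18, 1.52, 125, 4.00, 8.70, 28.25, 6.19],
    [18, 1.45, 132, 5.00, 10.00, 45.08, 6.26],
    [18, 1.68, 121, 4.50, 9.90, 40.00, 11.41],
]

# Hypothetical weight vectors: a plain total, and a contrast between the
# first five measurements and the last two.
TOTAL_WEIGHTS = [1, 1, 1, 1, 1, 1, 1]
CONTRAST_WEIGHTS = [1, 1, 1, 1, 1, -1, -1]


def linear_index(values, weights):
    """Return the weighted sum of one palm's measurements."""
    return sum(w * x for w, x in zip(weights, values))


totals = [linear_index(p, TOTAL_WEIGHTS) for p in palms]
contrasts = [linear_index(p, CONTRAST_WEIGHTS) for p in palms]

for total, contrast in zip(totals, contrasts):
    print(f"total = {total:.2f}, contrast = {contrast:.2f}")
```

Because the variables are on very different scales (leaf counts above 100, heights near 1.5 m), the raw total is dominated by the largest measurements, which is one reason the choice between the dispersion and the correlation matrix matters later in the analysis.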
Only a limited number of practical examples of the technique have been published in sufficient detail to enable the reader to gain any facility in the interpretation of the results of the analysis, and the necessary background knowledge to the problems is not usually available. The interpretation of the results of the analysis is therefore left to the reader and no clear advice is given as to how this may be done. One of the aims of this paper, therefore, is to suggest that there is a need for the extensive application of the present methods of multivariate analysis, including principal components analysis, over a wide range of problems and subjects, in order to test the practical value of the techniques. The theory of multivariate methods has far outrun the practical application of the techniques, with the result that only a handful of examples are available in the published literature. Some of this lack of practical examples has stemmed from difficulties in computation, but with greater encouragement and better guidance as to the interpretation of the results, more research workers might be prepared to attempt multivariate analysis of their data.

This paper presents a case study in the application of principal components analysis, taken from the Ph.D. dissertation of the author on Multivariate Analysis of Nursery and Yield Characters of the Oil Palm (Elaeis guineensis Jacq.) Data. The experiment for this study was carried out at the Nigerian Institute for Oil Palm Research (NIFOR), Benin City, in 1964-65. NIFOR is one of the Research Institutes under the Federal Ministry of Science and Technology, with a research mandate for the palms sub-sector. Its major activities are centered on the Oil Palm, Coconut, Raphia, Dates and Ornamental palms.

The layout adopted for this experiment is unreplicated blocks of trays of progenies. A total of 620 palms were randomly selected from a total of 1470 oil palm seedlings. The selection of the planted seedlings was by the stratified sampling method with optimum allocation based on height variation within trays, which form the strata. The progenies are:

1. 1st Grade EWS (Extension Works Seeds)
2. 3.415T X 6.37D (Wet Heat Treated)
3. 3.669D X 3.365D
4. 1.2209D X 1.2209D
5. 14.1258D X 14.484D

Records obtained: leaf count, height measurements, flowering observations, yield bunches.

The object is to choose whole trays of seedlings planted in the pre-nursery from one or two crosses and to follow the testing of them through to yielding in the field (Field 33). Not less than 14 trays were planted out. Every stand in each tray was marked and measurements were made at monthly intervals in the pre-nursery and nursery. Normal leaf and flowering observations started immediately after planting in the field. Up to the end of the nursery stage, the identity of each palm was known so that a true cross section of each population could be planted into the field. The seedlings were transplanted from the pre-nursery trays to beds in the nursery in early April, keeping each seedling in the same relative position to its neighbours.
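The paper does not give the tray-level figures behind the stratified selection, so the following is only a minimal sketch of how optimum (Neyman) allocation is usually computed: each stratum receives a share of the sample proportional to its size times its standard deviation. The tray sizes, the height standard deviations and the function name `neyman_allocation` are all hypothetical and are not taken from the experiment.

```python
def neyman_allocation(total_sample, stratum_sizes, stratum_sds):
    """Split total_sample across strata in proportion to N_h * S_h."""
    weights = [n * s for n, s in zip(stratum_sizes, stratum_sds)]
    weight_sum = sum(weights)
    return [total_sample * w / weight_sum for w in weights]


# Hypothetical example: three trays (strata) of equal size whose seedling
# heights vary by different amounts. These numbers are NOT from the paper.
tray_sizes = [100, 100, 100]
height_sds = [0.10, 0.20, 0.30]

allocation = neyman_allocation(120, tray_sizes, height_sds)

for tray, share in enumerate(allocation, start=1):
    print(f"tray {tray}: sample about {share:.1f} seedlings")
```

In practice the shares would be rounded to whole seedlings, and trays with more variable heights would contribute more palms to the sample than trays with uniform heights.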
Monthly height measurements and identification of the youngest fully open leaf continued. Height measurements and leaf counts of the seedlings in the main nursery continued until their planting into the field.

Principal Component Analysis

The basic technique of principal components analysis is well described by Kendall (1957), Seal (1964), Quenouille (1962) and many others. In order to define precisely the technique as it has been employed in the case study described in this paper, however, the following stages are distinguished:

(a) choice of the variables to be included in the analysis;
(b) construction of the basic data matrix;
(c) transformation of the basic data, if required;
(d) calculation of the dispersion or correlation matrix;
(e) calculation of eigenvalues and eigenvectors of the dispersion or correlation matrix;
(f) examination and interpretation of the eigenvalues;
(g) interpretation of the eigenvectors;
(h) calculation of the transformed values;
(i) plotting or further analysis of transformed values.

A fuller commentary on these separate stages is given by Jeffers (1964), but there are several points which it is worth stressing before beginning the account of this case study. First, there are two major decisions to be taken during the course of the analysis. At stage (c), the data may be analysed without transformation, or they may be transformed, usually into the logarithms of the original values. Some exponents of principal component analysis advocate the transformation of all data, partly so as to satisfy the assumptions that may be made in the appeal to probability distributions and partly because they wish to consider hypotheses about ratios of the basic variables rather than about linear functions of the variables. The second decision is concerned with the choice between finding the eigenvalues and eigenvectors of the dispersion matrix or of the correlation matrix. This choice depends on whether or not the scale of the original observations is important in the interpretation of the results.
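The stages and the two decisions described above can be sketched in a few lines of NumPy. This is an illustrative outline rather than the program used in the original study: the function name `principal_components` and its two flags are assumptions made for this example, and the data are just the first six palms of Appendix 1 (columns X2 to X6, with the constant X1 left out because a correlation cannot be computed for a variable that never varies).

```python
import numpy as np

# Stages (a)-(b): choose the variables and build the basic data matrix.
# Rows: first six palms of Appendix 1. Columns: X2..X6.
data = np.array([
    [1.52, 125, 4.00, 8.70, 28.25],
    [1.45, 132, 5.00, 10.00, 45.08],
    [1.68, 121, 4.50, 9.90, 40.00],
    [1.40, 134, 3.50, 7.80, 24.58],
    [1.52, 140, 4.50, 9.30, 25.40],
    [1.40, 130, 5.00, 8.20, 32.46],
])


def principal_components(x, use_correlation=True, log_transform=False):
    """Return eigenvalues, eigenvectors and scores, largest component first."""
    # Stage (c): optional logarithmic transformation of the basic data.
    if log_transform:
        x = np.log(x)
    # Stage (d): dispersion (covariance) or correlation matrix.
    if use_correlation:
        matrix = np.corrcoef(x, rowvar=False)
    else:
        matrix = np.cov(x, rowvar=False)
    # Stage (e): eigenvalues and eigenvectors of the symmetric matrix.
    eigenvalues, eigenvectors = np.linalg.eigh(matrix)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Stage (h): transformed values (component scores).
    centred = x - x.mean(axis=0)
    if use_correlation:
        centred = centred / x.std(axis=0, ddof=1)
    scores = centred @ eigenvectors
    return eigenvalues, eigenvectors, scores


corr_values, corr_vectors, corr_scores = principal_components(data, use_correlation=True)
cov_values, cov_vectors, cov_scores = principal_components(data, use_correlation=False)

print("correlation-matrix eigenvalues:", np.round(corr_values, 4))
print("dispersion-matrix eigenvalues: ", np.round(cov_values, 4))
```

Running both variants on the same rows shows the practical effect of the second decision: with the dispersion matrix the variable with the largest variance dominates the leading component, while the correlation matrix puts every variable on an equal footing.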
It is also worth stressing here that, in principal components analysis, the principal diagonal of the correlation matrix is never replaced by a vector of communalities, as it is in factor analysis.

KEY:
DURA (D): the main variety of oil palm fruit found in the groups; it has a large nut with a thick shell (endocarp) and a thin mesocarp.
PISIFERA (P): a small fruit with no shell.
TENERA (T): the hybrid that arises from a cross between Dura and Pisifera. It has a thick mesocarp containing much more oil and fat (chemically saturated oil) than either of its parents. The Tenera nut is small and easily shelled to release the palm kernel. The Tenera palm kernel is smaller than the Dura kernel, although the Tenera bunch is much larger than the Dura bunch. In all, the Tenera is a much better variety for industrial and economic purposes.

Practical Value of Principal Components Analysis

The practical objectives of the use of principal components analysis may be summarized by the following list:

(a) the examination of the correlations between the variables of a selected set, as has been done in Table 1;
(b) the reduction of the basic dimensions of the variability in the measured set to the smallest number of meaningful dimensions, as has been done in Table 2;
(c) the elimination of variables which contribute relatively little extra information;
(d) the examination of the grouping of individuals in n-dimensional space;
(e) determination of the objective weighting of measured variables in the construction of meaningful indices;
(f) the allocation of individuals to previously demarcated groups;
(g) the recognition of misidentified individuals;
(h) orthogonalization of regression calculations.

Not all of these objectives will be of equal importance in any given study and some will be entirely absent. Nevertheless, the method provides one solution to such problems and, if an electronic digital computer is available, is easy to apply with a minimum of assumptions.

Physical Characteristics of Oil Palm (Elaeis guineensis Jacq.) (Appendix 1)

In the oil palm data collection, the following biometric characteristics were recorded:

X1  leaf count in the nursery (dropped from the analysis because it took the same value for every palm)
X2  height of the oil palm seedlings in the nursery
X3  leaf count in the field
X4  height of the oil palm seedlings in the field
X5  canopy spread (measured in metres)
X6  sex ratio (measured in percentage)
X7  average yield of oil palm (measured in kilograms, kg)

The extracts of the correlation matrix among the different biometric characters, taken from the computer printouts, are presented in Table 1 (see Appendix). The eigenvalues (latent roots) and eigenvectors were obtained from the correlation matrix by solving the characteristic equation

\[ |A - \lambda I| = 0 \tag{3} \]

where A is the correlation matrix, \lambda is a latent root and I is the identity matrix.
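To make the characteristic equation concrete, the sketch below finds the latent roots of a 5 x 5 correlation matrix numerically and checks that each root makes the determinant of A minus root times I vanish. The entries are transcribed from Table 1 as far as the printed table can be read; in particular, taking the leaf count / sex ratio entry as -0.167 is an assumption, since the two printed copies of that entry disagree in sign. The resulting roots are those of this transcribed matrix only and are not claimed to reproduce Table 2.

```python
import numpy as np

# Correlations among the five characters used in the analysis, in the order:
# height in nursery, leaf count in field, height in field, canopy spread,
# sex ratio. Assumption: the leaf count / sex ratio entry is taken as -0.167.
A = np.array([
    [1.000, 0.315, 0.417, 0.443, -0.008],
    [0.315, 1.000, 0.355, 0.506, -0.167],
    [0.417, 0.355, 1.000, 0.549, -0.146],
    [0.443, 0.506, 0.549, 1.000, -0.103],
    [-0.008, -0.167, -0.146, -0.103, 1.000],
])

# eigh is the appropriate solver for a symmetric matrix; it returns the
# latent roots in ascending order with matching unit-length vectors.
latent_roots, latent_vectors = np.linalg.eigh(A)

identity = np.eye(A.shape[0])

# Each latent root should satisfy the characteristic equation |A - root*I| = 0.
determinants = [np.linalg.det(A - root * identity) for root in latent_roots]

# Equivalently, each pair should satisfy A v = root * v.
residuals = [
    np.linalg.norm(A @ latent_vectors[:, k] - latent_roots[k] * latent_vectors[:, k])
    for k in range(A.shape[0])
]

print("latent roots (descending):", np.round(latent_roots[::-1], 4))
print("largest |det(A - root*I)|:", max(abs(d) for d in determinants))
```

The latent roots of any correlation matrix sum to the number of variables, because the trace of the matrix is a row of ones; that property is a convenient check on a hand or machine calculation.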
The eigenvalues (latent roots) and eigenvectors, together with the percentage of variability accounted for by each component, are given in Table 2.

Table 2. Eigenvectors for Components

| Variable | Component 1 | Component 2 | Component 3 | Component 4 | Component 5 |
|---|---|---|---|---|---|
| 1 Height in the nursery | 0.00013 | 0.00075 | 0.00379 | -0.00896 | 0.99995 |
| 2 Height in the field | -0.99985 | -0.01139 | -0.01084 | -0.00701 | 0.00012 |
| 3 Leaf count in the field | -0.00675 | -0.02161 | -0.00104 | 0.99970 | 0.00898 |
| 4 Canopy spread | -0.00979 | -0.08750 | 0.99611 | -0.00088 | -0.00371 |
| 5 Sex ratio | 0.01244 | -0.99586 | -0.08737 | -0.02154 | 0.00089 |
| Latent root | 16.1610 | 3.6997 | 0.7123 | 0.1061 | 0.0017 |
| Percentage of variability | 78.15 | 17.89 | 3.44 | 0.51 | 0.01 |
| Cumulative percentage of variability | 78.15 | 96.04 | 99.48 | 99.99 | 100.00 |

One important problem in the application of principal components is to decide on the number of components that have any practical significance. Bartlett (1950) has suggested an appropriate test for arriving at this decision, but this test will not be used in the present case. Instead we shall adopt the arbitrary rule of thumb which Jeffers (1966) utilized in his two case studies in the application of principal component analysis. This rule of thumb suggests that we consider only those components which have eigenvalues of 1.00 or greater as having any practical significance. Using this method, we see from the computer printout of the principal components analysis and the extracted values in Table 2 that only the first two components might be of any practical significance. The first component accounts for 78.15 per cent of the total variability in the oil palm progeny. The second component accounts for a further 17.89 per cent. However, if we take in the third component, which accounts for 3.44 per cent of the total variability, we find that the first three components account for 99.48 per cent of the total variability for the oil palm progeny (see Table 2). We may therefore neglect the remaining two components as being of little practical significance.
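The percentages quoted from Table 2 and the rule of thumb can be checked with a short calculation on the five latent roots. The sketch below uses only the latent roots printed in Table 2; the variable names are illustrative, and small differences in the second decimal place are to be expected because the printed roots are themselves rounded.

```python
# Latent roots as printed in Table 2.
latent_roots = [16.1610, 3.6997, 0.7123, 0.1061, 0.0017]

total = sum(latent_roots)

# Percentage of total variability carried by each component.
percentages = [100 * root / total for root in latent_roots]

# Running (cumulative) percentage.
cumulative = []
running = 0.0
for share in percentages:
    running += share
    cumulative.append(running)

# Rule of thumb: keep components whose latent root is 1.00 or greater.
retained = [index + 1 for index, root in enumerate(latent_roots) if root >= 1.0]

for index, (root, share, cum) in enumerate(zip(latent_roots, percentages, cumulative), start=1):
    print(f"component {index}: root = {root:.4f}, {share:.2f}% (cumulative {cum:.2f}%)")
print("components retained by the rule of thumb:", retained)
```

On these roots the rule of thumb keeps the first two components, while adding the third brings the cumulative share to just under 99.5 per cent, which is the trade-off the text weighs.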
We therefore reduce the number of biometric characters from seven to three and proceed to apply Discriminant and Canonical Analysis to the data.

Practical Implication of the Findings

In view of the strong inter-correlation between the variables, a smaller number of variables may be sufficient to distinguish between the measurement characters of the oil palm. In the present case, we had only seven variables and the analysis has suggested that two were sufficient to work with. The application of the analysis has suggested how the time and labour spent in measuring a large number of variables may be saved by measuring a few of the variables that give the same information as the totality of the variables. The analysis therefore suggests that there are probably two (three) major components of the physical variables, accounting for about 99.48 per cent of the total variability of the oil palm. This gives some clear indications as to the nature of the differences between the characters/parameters of the oil palm being considered. The fact that, by definition, these components are mutually uncorrelated greatly simplifies the interpretation of the variability measured by the physical variables, and focuses the attention of the researcher on the basic dimensions of which his variables are only first approximations.

The principal component analysis of the correlation matrix confirms that very few components of variation have been included in the data, the first two components together accounting for 96.04 per cent of the total variability of the oil palm. Only these two components are likely to have any practical significance. The remaining variation described by the measured variables is relatively unimportant. In further work only one of the main group of variables, for example height of the oil palm seedlings in the nursery, together with height in the field, need be retained, although new variables believed to be uncorrelated with those included in this study should be sought.

Summary of the Principal Component Analysis (PCA)

The practical objectives of the use of principal component analysis in this study may be summarized by the following list:

(a) the examination of the correlations between the variables of a selected set;
(b) the reduction of the basic dimensions of the variability in the measured set to the smallest number of meaningful dimensions;
(c) the elimination of variables which contribute relatively little extra information;
(d) orthogonalization of regression calculations.
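The property that makes principal components useful for orthogonalizing regression calculations is that their scores are uncorrelated with one another. The sketch below checks this numerically on the first eight palms of Appendix 1 (columns X2 to X6); it is a small illustration using standard NumPy calls, not a re-analysis of the full data set, and the variable names are chosen only for this example.

```python
import numpy as np

# First eight palms of Appendix 1, columns X2..X6.
measurements = np.array([
    [1.52, 125, 4.00, 8.70, 28.25],
    [1.45, 132, 5.00, 10.00, 45.08],
    [1.68, 121, 4.50, 9.90, 40.00],
    [1.40, 134, 3.50, 7.80, 24.58],
    [1.52, 140, 4.50, 9.30, 25.40],
    [1.40, 130, 5.00, 8.20, 32.46],
    [1.55, 133, 4.00, 9.80, 31.58],
    [1.70, 134, 4.00, 9.60, 27.97],
])

# Standardise each variable (zero mean, unit sample standard deviation).
standardised = (measurements - measurements.mean(axis=0)) / measurements.std(axis=0, ddof=1)

# Eigen-decomposition of the correlation matrix (roots in ascending order).
correlation = np.corrcoef(measurements, rowvar=False)
roots, vectors = np.linalg.eigh(correlation)

# Component scores: standardised data projected onto the eigenvectors.
scores = standardised @ vectors

# Covariance matrix of the scores: diagonal entries equal the latent roots,
# off-diagonal entries should vanish up to rounding error.
score_covariance = np.cov(scores, rowvar=False)
off_diagonal = score_covariance - np.diag(np.diag(score_covariance))

print("largest off-diagonal covariance:", np.abs(off_diagonal).max())
print("score variances:", np.round(np.diag(score_covariance), 4))
```

Regressing a response such as yield on these scores instead of on the original, inter-correlated measurements would give coefficient estimates that do not change when other components are added to or dropped from the model.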
CONCLUSION

Principal Component Analysis has enabled us to concentrate attention on the basic dimensions of the variability of the physical properties of the oil palm progeny and then to use this information to determine the relative importance of these dimensions in predicting the comprehensive yielding capacity of the improved varieties of the oil palm. Principal Component Analysis has given us clear guidance as to the selection of the necessary variables in further work on the oil palm. Especially in selection studies, there would be little point in including more than two of the measurements of the physical properties, and these two variables should include height of the oil palm seedlings in the nursery and in the field. This ensures that no time is wasted by continuing to measure variables which contribute relatively little to the study.

RECOMMENDATION

The practical implications would seem to be relatively simple. There is a need for the technique to be more widely applied and, perhaps even more important, for the results of these applications to be more widely reported, within the contexts of their original problems, so that the value of the technique can be assessed in practice.

REFERENCES

BARTLETT, M.S. (1950). Tests of significance in factor analysis. Brit. J. Psych. 77-85.
EKEZIE, D.D. (2011). Multivariate Analysis of Nursery and Yield Characters of the Oil Palm (Elaeis guineensis Jacq.) Data. Unpublished Ph.D. Dissertation.
JEFFERS, J.N.R. (1966). Principal component analysis in taxonomic research (Forestry Commission Statistics Section Paper No. 83). (1966) Correspondence. Statistician, 15, 207-208.
KENDALL, M.G. (1957). A Course in Multivariate Analysis. London: Griffin.
QUENOUILLE, M.H. (1962). Associated Measurements. London: Butterworths.
SEAL, H. (1964). Multivariate Statistical Analysis for Biologists. London: Methuen.

APPENDIX 1

BIOMETRIC MEASUREMENTS ON THE OIL PALM (Elaeis guineensis Jacq.)

| | X1 Leaf count (nursery) | X2 Height (m) (nursery) | X3 Leaf count (field) | X4 Height (m) (field) | X5 Canopy spread (metres) | X6 Sex ratio (%) | X7 Yield (4 yrs) (1970-1973) |
|---|---|---|---|---|---|---|---|
| | 18 | 1.52 | 125 | 4.00 | 8.70 | 28.25 | 6.19 |
| | 18 | 1.45 | 132 | 5.00 | 10.00 | 45.08 | 6.26 |
| | 18 | 1.68 | 121 | 4.50 | 9.90 | 40.00 | 11.41 |
| | 18 | 1.40 | 134 | 3.50 | 7.80 | 24.58 | 5.29 |
| | 18 | 1.52 | 140 | 4.50 | 9.30 | 25.40 | 7.97 |
| | 18 | 1.40 | 130 | 5.00 | 8.20 | 32.46 | 5.14 |
| | 18 | 1.55 | 133 | 4.00 | 9.80 | 31.58 | 7.01 |
| | 18 | 1.70 | 134 | 4.00 | 9.60 | 27.97 | 7.30 |
| | 18 | 1.50 | 128 | 4.50 | 9.40 | 18.26 | 7.81 |
| | 18 | 1.45 | 140 | 6.00 | 9.80 | 30.17 | 6.27 |
| | 18 | 0.99 | 131 | 4.50 | 11.10 | 30.83 | 7.54 |
| | 18 | 1.47 | 149 | 2.50 | 8.90 | 39.85 | 5.96 |
| | 18 | 1.47 | 140 | 5.00 | 10.00 | 42.06 | 9.41 |
| | 18 | 1.52 | 138 | 4.50 | 8.70 | 24.80 | 7.58 |
| | 18 | 1.27 | 143 | 4.50 | 8.90 | 21.01 | 6.46 |
| | 18 | 1.27 | 137 | 4.00 | 9.80 | 33.61 | 10.77 |
| | 18 | 1.37 | 132 | 5.00 | 10.00 | 28.21 | 7.48 |
| | 18 | 1.47 | 143 | 7.00 | 8.30 | 33.33 | 6.40 |
| | 18 | 1.14 | 105 | 3.50 | 7.80 | 21.36 | 6.91 |
| | 18 | 1.40 | 145 | 4.50 | 8.50 | 26.83 | 6.98 |
| | 18 | 1.24 | 131 | 4.00 | 9.10 | 38.46 | 8.70 |
| | 18 | 1.40 | 127 | 4.00 | 8.90 | 41.67 | 9.36 |
| | 18 | 1.50 | 144 | 4.50 | 8.90 | 24.43 | 7.43 |
| | 18 | 1.17 | 141 | 4.50 | 8.90 | 25.00 | 5.54 |
| | 18 | 1.27 | 131 | 4.50 | 9.30 | 32.08 | 6.53 |
| | 18 | 0.94 | 144 | 5.00 | 9.30 | 21.67 | 6.66 |
| | 18 | 1.19 | 142 | 3.00 | 8.90 | 32.23 | 5.57 |
| | 18 | 1.60 | 138 | 3.50 | 10.20 | 30.09 | 9.41 |
| | 18 | 1.88 | 144 | 6.00 | 9.90 | 24.17 | 9.10 |
| | 18 | 0.97 | 144 | 4.00 | 9.30 | 30.00 | 5.04 |
| | 18 | 1.19 | 136 | 4.50 | 9.10 | 32.74 | 7.33 |
| | 18 | 1.52 | 139 | 4.00 | 9.40 | 21.24 | 8.59 |
| | 18 | 1.65 | 144 | 5.50 | 9.10 | 23.33 | 5.85 |
| | 18 | 0.89 | 153 | 4.00 | 9.10 | 27.13 | 7.25 |
| | 18 | 1.57 | 138 | 4.00 | 9.60 | 23.68 | 9.38 |
| | 18 | 1.55 | 150 | 5.00 | 9.80 | 33.06 | 12.71 |
| | 18 | 1.50 | 140 | 4.50 | 8.90 | 31.30 | 4.05 |
| | 18 | 0.71 | 147 | 3.00 | 7.30 | 61.06 | 3.41 |
| | 18 | 0.61 | 146 | 4.50 | 9.30 | 20.66 | 6.91 |
| | 18 | 1.40 | 143 | 8.00 | 9.30 | 55.08 | 5.17 |
| | 18 | 1.55 | 126 | 4.00 | 9.30 | 36.63 | 6.85 |
| | 18 | 0.94 | 144 | 4.00 | 8.80 | 25.83 | 7.07 |
| | 18 | 1.35 | 128 | 4.50 | 9.30 | 28.16 | 7.89 |
| | 18 | 1.45 | 154 | 8.00 | 8.90 | 16.79 | 6.40 |
| | 18 | 1.47 | 139 | 5.00 | 9.30 | 36.84 | 7.07 |
| | 18 | 1.32 | 142 | 4.00 | 8.90 | 23.08 | 6.18 |
| | 18 | 1.37 | 126 | 3.50 | 9.40 | 39.39 | 6.19 |
| | 18 | 1.47 | 128 | 4.00 | 9.40 | 19.21 | 8.82 |
| TOTAL | 864 | 65.21 | 6798 | 217 | 441.20 | 1460.97 | 347.20 |
| MEAN | 18 | 1.34 | 138 | 4.50 | 9.19 | 30.37 | 5.19 |
| RANGE | | 1.17 | 49 | 5.5 | 3.80 | 44.27 | 9.30 |

Source: Ekezie (2011) Ph.D. Dissertation, extracted from EXP. 33-13, N.I.F.O.R., Benin City, Nigeria.

Table 1. Correlation Matrix

| | Height in nursery | Leaf count in field | Height in field | Canopy spread | Sex ratio | Yield |
|---|---|---|---|---|---|---|
| Height in nursery | 1 | 0.315* | 0.417** | 0.443** | -0.008 | 0.183 |
| Leaf count in field | 0.315* | 1 | 0.355* | 0.506** | 0.167 | 0.183 |
| Height in field | 0.417** | 0.355* | 1 | 0.549** | -0.146 | 0.515** |
| Canopy spread | 0.443** | 0.506** | 0.549** | 1 | -0.103 | 0.204 |
| Sex ratio | -0.008 | -0.167 | -0.146 | -0.103 | 1 | -0.053 |
| Yield | 0.183 | 0.183 | 0.515** | 0.204 | -0.053 | 1 |

* Correlation is significant at the 0.05 level (2-tailed).
** Correlation is significant at the 0.01 level (2-tailed).
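The significance flags attached to the correlations can be understood through the usual t-test for a correlation coefficient, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below is a minimal illustration under two labelled assumptions: that the correlations were computed from the 48 palms listed in Appendix 1 (the paper does not state the sample size explicitly), and that the two-tailed critical values of t with 46 degrees of freedom are approximately 2.013 (5 per cent) and 2.687 (1 per cent), as read from standard tables. The names `t_statistic` and `flag` are hypothetical.

```python
import math

N_PALMS = 48       # assumption: number of palms listed in Appendix 1
T_CRIT_05 = 2.013  # approximate two-tailed 5% point of t with 46 d.f.
T_CRIT_01 = 2.687  # approximate two-tailed 1% point of t with 46 d.f.


def t_statistic(r, n):
    """t-statistic for testing that a correlation coefficient is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def flag(r, n=N_PALMS):
    """Return '**', '*' or '' following the footnote convention of Table 1."""
    t = abs(t_statistic(r, n))
    if t >= T_CRIT_01:
        return "**"
    if t >= T_CRIT_05:
        return "*"
    return ""


for r in (0.315, 0.355, 0.417, 0.515, 0.183, -0.146):
    print(f"r = {r:+.3f}  t = {t_statistic(r, N_PALMS):+.3f}  {flag(r)}")
```

Under these assumptions the computed flags agree with the asterisks printed for the same coefficients, which is consistent with the correlations having been based on the 48 palms of Appendix 1.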