G × E INTERACTION USING JMP: AN OVERVIEW

Sukanta Dash
I.A.S.R.I., Library Avenue, New Delhi-110012
sukanta@iasri.res.in

1. Introduction

Genotype × Environment interaction (G × E) is a common phenomenon in agricultural research. Differences between genotypic values may increase or decrease from one environment to another, and genotypes may even rank differently between environments. G × E studies are somewhat complicated because they require integrated approaches that combine many fields, including agriculture, biology, statistics, computer science, and genetics.

A basic principle indicated by G × E interaction is that even if all animals or plants were created equal (same genotype), they would not necessarily express their genetic potential in the same way when environmental conditions (drought, temperature, disease pressure, stress, etc.) vary. This important concept may require genetic engineering of plants or animals specifically tailored to their environmental conditions.

Statistically, G × E interaction occurs if the performance of genotypes varies significantly across environments. Consider two genotypes (G1 and G2) tested in two environments (E1 and E2). Figure 1 indicates the presence of G × E interaction, since G1 is phenotypically superior to G2 in E1 but inferior to G2 in E2. In Figure 2, the phenotypic difference between G1 and G2 remains the same in the two environments, representing no interaction between genotype and environment.

[Figure 1: G1 and G2 plotted against Environment (E1, E2)]
[Figure 2: G1 and G2 plotted against Environment (E1, E2)]

2. Model

Expressing the phenotypic value (P) as a function of the genotype (G) and the environment (E), the equation P = G + E describes the situation in which environmental factors influence each genotype equally (Figure 2). However, when the environment influences some genotypes more than others (Figure 1), the phenotypic relationship changes to P = G + E + I_GE, where I_GE is the G × E interaction term.
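As a small numeric illustration of the two situations, the following Python sketch compares a table with crossing genotype lines against a table with parallel lines. All phenotypic values and the function names are hypothetical and chosen only for illustration; they are not taken from the figures.

```python
def interaction_contrast(table):
    """Difference between the G1 - G2 gap in E1 and the G1 - G2 gap in E2.

    `table` maps (genotype, environment) to a phenotypic value.
    A value of zero means the two genotype lines are parallel (P = G + E).
    """
    gap_e1 = table[("G1", "E1")] - table[("G2", "E1")]
    gap_e2 = table[("G1", "E2")] - table[("G2", "E2")]
    return gap_e1 - gap_e2


def ranks_change(table):
    """True when the better genotype in E1 is the worse one in E2."""
    gap_e1 = table[("G1", "E1")] - table[("G2", "E1")]
    gap_e2 = table[("G1", "E2")] - table[("G2", "E2")]
    return gap_e1 * gap_e2 < 0


# Hypothetical values: G1 beats G2 in E1 but loses in E2 (as in Figure 1).
crossing = {("G1", "E1"): 5.0, ("G1", "E2"): 3.0,
            ("G2", "E1"): 4.0, ("G2", "E2"): 4.5}

# Hypothetical values: the G1 - G2 gap is the same in both environments
# (as in Figure 2).
parallel = {("G1", "E1"): 5.0, ("G1", "E2"): 6.0,
            ("G2", "E1"): 4.0, ("G2", "E2"): 5.0}

for name, table in (("crossing", crossing), ("parallel", parallel)):
    print(name, interaction_contrast(table), ranks_change(table))
```

A non-zero contrast signals that an interaction term I_GE is needed; a change in ranking is the stronger case in which the best genotype depends on the environment.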
The variance (V) of the effects follows V(P) = V(G) + V(E) + 2 Cov(G, E), showing that variance components analyses can be used to partition the phenotypic variance into its genotypic, environmental, and interaction components. One way to determine the importance of V(G) or V(E) is to minimize one of the two effects experimentally: V(G) can be minimized by using identical genotypes, and V(E) by using controlled environment chambers and random allocation of genotypes to environmental conditions. Genotype-environment covariance (Cov) occurs when better genotypes are provided with better environments.

For a simple analysis of variance of a randomized complete block design, the following model can be applied:

Y_ijk = µ + G_i + E_j + GE_ij + B_jk + ε_ijk

where µ is the mean, G_i is the effect of the i-th genotype, E_j is the effect of the j-th environment, GE_ij is the interaction of the i-th genotype with the j-th environment, B_jk is the effect of the k-th replication in the j-th environment, and ε_ijk is the random error.

3. Methods

A variety of statistical methods have been proposed to analyze G × E interaction data. These methods include Analysis of Variance (e.g., Least Squares, Restricted Maximum Likelihood (REML)), Regression (e.g., Joint Regression Analysis, Partial Least-Squares Regression, Factorial Regression), the Shifted Multiplicative Model (SHMM), Variance Components, Cluster Analysis, Factor Analysis, and the Additive Main Effects and Multiplicative Interaction (AMMI) model. G × E data can often be arranged in a two-way layout with genotypes in rows and environments in columns. To apply the AMMI model, the conventional analysis of variance for the additive main effects (µ + G_i + E_j) is combined with a principal component analysis of the multiplicative interaction (non-additive residual) effects to analyze the matrix of two-way means.

Example 3.1: An experimenter is interested in analyzing the performance of different genotypes in a multi-environment trial with 10 genotypes grown in 15 environments, each with 4 replications. The analysis should display stability measures, genotype and G × E least squares means from linear-bilinear models, PCA biplots, and heritability. The data are available in the sample data sets of JMP.
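Independently of JMP, the AMMI idea described in this section (fit the additive main effects first, then apply a principal component decomposition to the interaction residuals) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the procedure JMP runs: the table of means is hypothetical and is not the peanut data, the name ammi_first_term is illustrative, and a simple power iteration stands in for a full singular value decomposition, so only the first multiplicative term is recovered.

```python
import math


def ammi_first_term(means, iters=200):
    """Additive main effects plus the leading multiplicative interaction term.

    `means` is a genotype-by-environment table of cell means (list of rows).
    """
    n_g, n_e = len(means), len(means[0])
    grand = sum(sum(row) for row in means) / (n_g * n_e)
    g_eff = [sum(row) / n_e - grand for row in means]
    e_eff = [sum(means[i][j] for i in range(n_g)) / n_g - grand
             for j in range(n_e)]
    # Interaction residuals: what the additive part mu + G_i + E_j leaves over.
    resid = [[means[i][j] - grand - g_eff[i] - e_eff[j] for j in range(n_e)]
             for i in range(n_g)]
    total_ss = sum(x * x for row in resid for x in row)
    fit = {"grand": grand, "g_eff": g_eff, "e_eff": e_eff,
           "singular_value": 0.0, "g_scores": [0.0] * n_g,
           "e_scores": [0.0] * n_e, "share": 0.0}
    if total_ss < 1e-12:
        return fit  # essentially additive table: nothing to decompose

    # Power iteration on R^T R (assumption: stand-in for a full SVD).
    # Start from the residual row with the largest norm.
    v = max(resid, key=lambda row: sum(x * x for x in row))
    for _ in range(iters):
        u = [sum(r * c for r, c in zip(row, v)) for row in resid]   # R v
        w = [sum(resid[i][j] * u[i] for i in range(n_g))            # R^T u
             for j in range(n_e)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    u = [sum(r * c for r, c in zip(row, v)) for row in resid]
    sing = math.sqrt(sum(x * x for x in u))
    fit.update(singular_value=sing,
               g_scores=[x / sing for x in u],   # genotype PC1 scores
               e_scores=v,                       # environment PC1 scores
               share=sing ** 2 / total_ss)       # interaction SS explained
    return fit


# Hypothetical 3 genotypes x 3 environments table of means.
example_means = [[7.5, 8.0, 11.5],
                 [8.0, 10.0, 12.0],
                 [8.5, 12.0, 12.5]]
example_fit = ammi_first_term(example_means)
print(example_fit["g_eff"], example_fit["e_eff"])
print(example_fit["singular_value"], example_fit["share"])
```

The genotype and environment scores returned here correspond to the coordinates that a PC1 biplot would display; further components would require deflating the residual matrix or using a full SVD routine.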
Solution: The G × E Interaction module is used to analyze the performance of different genotypes in a multi-environment trial. It also produces stability measures, genotype and G × E least squares means from linear-bilinear models, PCA biplots, and heritability. Select Genomics > Genetics > Breeding Analysis > G × E Interaction to open the dialog. The dialog is made up of two tabs, General and Option, as shown in Figure 3.

Figure 3

Step 1: Choose the genetics example as the data for the study. Click Choose to select the input data set; an Open Data window opens. Navigate to the data file peanut.sas7bdat and select it. Note: once peanut.sas7bdat has been selected, clicking Open displays the data set at the given path.

Examine the variables listed in the Available Variables field. Select yield as the quantitative trait variable, and select Genotype, Environment, and rep as the genotype, environment, and replicate variables, respectively.

To specify an output folder, click Choose, navigate to the folder where you wish to place the output (or create a new folder), and click OK. The completed General tab appears as shown in Figure 3.

Step 2: Click the Option tab. Click linear-bilinear and choose the method (AMMI). Enter the alpha value for the LS means confidence interval; alpha ranges from 0 to 1. The completed Option tab appears as shown in Figure 4.

Figure 4

Click Run.

Results
Running this process using the Peanut Example sample setting generates the tabbed Results window shown below. Refer to the G × E Interaction process description for more information. Output from the process is organized into tabs. Each tab contains one or more plots, data panels, data filters, and so on, that facilitate your analysis.

Figure 5

The following tabs are generated by this process, as shown in Figure 5.

Stability: This tab shows a plot of various stability measures for each of the genotypes.

G LSMeans: This tab displays estimates of the LS means of the trait for each genotype, with confidence limits.

G × E LSMeans: This tab displays estimates of the LS means of the trait for the Genotype × Environment interaction effect, with a distinctly colored line for each environment.

2D Biplots: This tab displays 2-D plots of PC2 by PC1, PC3 by PC1, and PC3 by PC2, where PC1 to PC3 are the first three principal components from the linear-bilinear model (AMMI, for example). In these plots and the ones on the next two tabs, genotypes are represented by blue triangles and environments by red circles.

3D Biplot: This tab shows a 3-D plot of PC1, PC2, and PC3, the first three principal components from the linear-bilinear model.

PC1 Mean Trait: This tab displays a biplot of PC1, the first principal component from the linear-bilinear model, by the mean of the Trait Variable.
Scree Plot (G × E Interaction): This tab displays the scree plot for the principal components from the linear-bilinear model, plotting the eigenvalue of each principal component in red and the cumulative proportion explained by the PCs in blue.

SAS Output: When only one Quantitative Trait is selected, this tab shows the heritability and the ANOVA results from fitting the G × E model, as shown in Figure 6.

Figure 6

Output Data

This process generates the following data sets, as shown in Figures 7 and 8.

ANOVA Data Set: This data set contains the ANOVA statistics from fitting the G × E model.

PCA Data Set: This data set contains the principal component eigenvectors from the linear-bilinear model.

ANOVA for Linear-Bilinear Model Data Set: This data set contains the ANOVA from fitting the linear-bilinear model that includes principal components to represent the G × E component.

Figure 7

Figure 8

Click View Data to reveal the underlying data table associated with the current tab.
Click Reopen Dialog to reopen the completed process dialog used to generate this output.
Click Create Report to generate a PDF- or RTF-formatted report containing the plots and charts of selected tabs.
Click Close All to close all graphics windows and underlying data sets associated with the output.
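The two quantities drawn on the Scree Plot tab described above, the eigenvalue of each principal component and the cumulative proportion explained, can be reproduced from any list of eigenvalues. The Python sketch below assumes the usual definition (each eigenvalue divided by the sum of all eigenvalues); the eigenvalues and the function name scree_table are hypothetical and are not taken from the peanut output.

```python
from itertools import accumulate


def scree_table(eigenvalues):
    """Return (PC number, eigenvalue, proportion, cumulative proportion) rows."""
    total = sum(eigenvalues)
    proportions = [ev / total for ev in eigenvalues]
    cumulative = list(accumulate(proportions))
    return [(k + 1, ev, p, c)
            for k, (ev, p, c) in enumerate(zip(eigenvalues, proportions,
                                               cumulative))]


# Hypothetical eigenvalues for three principal components.
for pc, ev, prop, cum in scree_table([6.0, 3.0, 1.0]):
    print(f"PC{pc}: eigenvalue={ev:.2f} proportion={prop:.2f} cumulative={cum:.2f}")
```

Reading the cumulative column is one common way to judge how many principal components are worth keeping in the linear-bilinear model.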