Objectives of genotype by environment data analysis

Performance trials have to be conducted in multiple environments because of genotype by environment interaction (GE). For the same reason, the analysis of genotype by environment data must start with an examination of the magnitude and nature of GE (Fig. 1). The first question to ask is whether there is significant GE in the data. If not, genotypes can be reliably evaluated in any single environment. Unfortunately, this situation rarely exists, except perhaps for certain traits that are under simple genetic control. If GE exists, it is necessary to determine whether there are important crossovers, i.e., rank changes of the genotypes across environments such that different winners are selected in different environments. If there are none, superior genotypes can be identified in any of the environments, although there is an ideal test environment in which the best genotypes are most easily identified. If crossover interactions exist, it is necessary to determine whether the crossover GE patterns are repeatable across years. Data from multiple years are necessary to address this question. If the interactions are repeatable (i.e., geographical locations or other “fixed” factors are consistent in their ranking of genotypes), then the target environments should be divided into different mega-environments, and genotype evaluation should be conducted separately for each mega-environment. Dividing target environments into meaningful mega-environments is the only way that complex GE can be exploited (Yan and Tinker 2005a). If there is no recognizable pattern of GE, then the target environment is a single mega-environment with unpredictable GE, and models addressing random sources of variation may be appropriate.
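The sequence of questions above can be sketched as a small decision routine. This is only an illustration, not part of GGEbiplot: the function names and the yield values are hypothetical, a crossover is taken here to mean simply that different environments have different winners, and the answers about GE significance and repeatability across years are assumed to come from separate analyses.

```python
def winners_by_environment(means):
    """Return the top genotype in each environment.

    means: dict mapping genotype name -> list of trait means, one per environment.
    """
    genotypes = list(means)
    n_env = len(next(iter(means.values())))
    return [max(genotypes, key=lambda g: means[g][j]) for j in range(n_env)]


def has_crossover(means):
    """True if different environments pick different winners."""
    return len(set(winners_by_environment(means))) > 1


def next_step(significant_ge, crossover, repeatable):
    """Map the answers to the three questions onto the recommended action."""
    if not significant_ge:
        return "evaluate genotypes in any single environment"
    if not crossover:
        return "evaluate genotypes in any environment; look for an ideal test environment"
    if repeatable:
        return "divide the target environments into mega-environments"
    return "treat the target environment as a single mega-environment with unpredictable GE"


# Hypothetical mean yields of three genotypes in three environments
means = {
    "G1": [5.0, 4.0, 6.0],
    "G2": [4.5, 4.6, 5.0],
    "G3": [3.0, 3.5, 4.0],
}

print(winners_by_environment(means))  # ['G1', 'G2', 'G1']
print(has_crossover(means))           # True
# significant_ge and repeatable are assumed inputs from other analyses
print(next_step(significant_ge=True, crossover=has_crossover(means), repeatable=True))
```

Whether the crossover pattern is repeatable cannot be judged from a single table like this one; as noted above, that requires data from multiple years.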

Fig. 1. Objectives of multi-environment trial data analysis

Within a single mega-environment, the objectives of data analysis are two-fold: genotype evaluation, to identify genotypes with both high performance and high stability, and test environment evaluation, to identify test environments that are both informative (discriminating) and representative. In addition, whenever there is significant GE, its potential causes should be explored.
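As a rough numerical illustration of these two objectives, the sketch below computes simple proxies from a genotype by environment table of means. The data are hypothetical, and the indices are assumptions chosen for illustration rather than the measures a biplot displays: stability is approximated by the spread of a genotype's GE residuals, discriminating ability by the spread of genotype means within an environment, and representativeness by the correlation of an environment with the across-environment genotype means.

```python
import numpy as np

# Hypothetical table of means (rows: genotypes, columns: environments)
Y = np.array([
    [5.0, 4.0, 6.0],
    [4.5, 4.6, 5.0],
    [3.0, 3.5, 4.0],
])

# Genotype evaluation: mean performance across environments
genotype_mean = Y.mean(axis=1)

# GE residuals: remove the genotype and environment main effects
ge = Y - Y.mean(axis=1, keepdims=True) - Y.mean(axis=0, keepdims=True) + Y.mean()

# Smaller spread of GE residuals = more stable genotype (illustrative proxy)
instability = ge.std(axis=1, ddof=1)

# Test environment evaluation
# Discriminating ability: spread of genotype means within each environment
discriminating = Y.std(axis=0, ddof=1)

# Representativeness: correlation of each environment with the genotype means
representative = np.array(
    [np.corrcoef(Y[:, j], genotype_mean)[0, 1] for j in range(Y.shape[1])]
)

print("genotype means:", genotype_mean)
print("instability:", instability)
print("discriminating:", discriminating)
print("representative:", representative)
```

A genotype with a high mean and a small instability value, and an environment with high values on both environment indices, would be the candidates of interest under these assumptions.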

To summarize, genotype by environment data analysis should address the following four questions:

1) Can the target environment be divided into meaningful mega-environments so that some of the GE can be exploited or avoided? Multi-year data are essential to address this question.

2) What are the causes of GE? Data on genetic and environmental covariates are required to address this question.

3) What are the best test environments (representative and discriminating)?

4) What are the superior genotypes (both high and stable performance within a mega-environment)?

Given sufficient data, biplot analysis implemented by GGEbiplot can help address these questions effectively and conveniently.
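For readers who want to see the kind of computation that underlies such a biplot, here is a minimal sketch. It assumes the commonly described approach of applying a singular value decomposition to the environment-centered table of means; it is not the GGEbiplot software's own code, the data are hypothetical, and the equal split of singular values between genotype and environment scores is only one of several possible scalings.

```python
import numpy as np

# Hypothetical table of means (rows: genotypes, columns: environments)
Y = np.array([
    [5.0, 4.0, 6.0],
    [4.5, 4.6, 5.0],
    [3.0, 3.5, 4.0],
])

# Environment-centering removes the environment main effect,
# leaving genotype main effect plus GE in the table
centered = Y - Y.mean(axis=0, keepdims=True)

# Singular value decomposition of the centered table
U, s, Vt = np.linalg.svd(centered, full_matrices=False)

# Proportion of the centered variation captured by each component
explained = s**2 / np.sum(s**2)

# First two components, with singular values split equally between
# genotype and environment scores (an assumed scaling choice)
genotype_scores = U[:, :2] * np.sqrt(s[:2])
environment_scores = Vt[:2, :].T * np.sqrt(s[:2])

# Rank-2 reconstruction of the centered table from the two sets of scores
approx = genotype_scores @ environment_scores.T

print("explained by PC1, PC2:", explained[:2])
print("genotype scores:\n", genotype_scores)
print("environment scores:\n", environment_scores)
```

Plotting the genotype and environment scores on the same pair of axes gives the biplot. In this tiny example two components reproduce the centered table exactly, because column-centering a table with three genotypes leaves at most two independent dimensions; with real data the first two components give only an approximation.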