Canonical Correspondence Analysis

Canonical Correspondence Analysis (CCA) is an ordination method widely used by ecologists. It analyses a species abundance matrix with respect to a second table of explanatory environmental variables. Both tables must refer to the same set of samples and must therefore contain the same number of samples. CCA is a correspondence analysis of the species data in which the sample scores are constrained to be linear combinations of the environmental variables; the constraint is imposed through a multiple linear regression of the sample scores on the environmental variables. The result is an ordination that shows the relationships between the species, the samples (sites) and the environmental variables. It is termed a constrained ordination method because it extends correspondence analysis with the environmental data acting as the constraint.
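
The constrained calculation described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration of one common formulation (weighted regression of the chi-squared standardised residuals on the environmental variables, followed by a singular value decomposition of the fitted values). It is not the code used by any particular package. The function name cca_sketch and the data are made up for the example, and every sample and species is assumed to have a non-zero total.

```python
import numpy as np

def cca_sketch(Y, X):
    """Return (constrained eigenvalues, total inertia).

    Y: samples x species abundances; X: samples x environmental variables.
    Hypothetical helper for illustration only.
    """
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)

    P = Y / Y.sum()                      # relative abundances
    r = P.sum(axis=1)                    # sample (row) weights, sum to 1
    c = P.sum(axis=0)                    # species (column) weights, sum to 1
    expected = np.outer(r, c)
    Q = (P - expected) / np.sqrt(expected)   # chi-squared standardised residuals

    # Centre each environmental variable on its weighted mean and apply
    # the sample weights, so that the regression below is a weighted one.
    Xw = np.sqrt(r)[:, None] * (X - r @ X)

    # Multiple linear regression of the residuals on the environment.
    # The fitted values are the part of Q that the environment can reproduce.
    B, *_ = np.linalg.lstsq(Xw, Q, rcond=None)
    Q_fit = Xw @ B

    # Squared singular values of the fitted values are the eigenvalues
    # of the constrained (canonical) axes.
    singular_values = np.linalg.svd(Q_fit, compute_uv=False)
    return singular_values ** 2, (Q ** 2).sum()

# Made-up data: 12 samples, 6 species, 2 environmental variables.
rng = np.random.default_rng(0)
Y = rng.poisson(3.0, size=(12, 6)) + 1   # +1 keeps every total non-zero
X = rng.normal(size=(12, 2))

eigenvalues, total_inertia = cca_sketch(Y, X)
print("constrained eigenvalues:", eigenvalues[:2])
print("total inertia:", total_inertia)
```

With two environmental variables there can be at most two constrained axes, so only the first two eigenvalues differ from zero; their sum can never exceed the total inertia.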

CCA is suited to community data sets where: (1) species responses to environmental variables are unimodal (hump-shaped), and (2) the important underlying environmental variables have been measured. If you have not measured the most important environmental variables, or your environmental variable measurements are subject to large errors, CCA will not yield satisfactory, or believable, results.

CCA is currently one of the most popular ordination techniques in community ecology. It is, however, one of the most dangerous in the hands of people who do not take the time to understand this relatively complex method. The dangers lie principally in three areas:

(1) Because it includes multiple regression of community gradients on environmental variables, it is subject to all of the hazards of multiple regression. These are well documented in the statistical literature, but often not fully appreciated by newcomers to multiple regression.

(2) As the number of environmental variables increases relative to the number of observations, the results become increasingly dubious, even though an appearance of very strong relationships is inevitable.

(3) Statistics indicating the "percentage of variance explained" can be calculated in several ways, each for a different question, but users frequently confuse these statistics when reporting their results.
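
Danger (2) can be demonstrated with a small experiment. The sketch below, in Python with NumPy, constrains made-up species data with environmental variables that are pure random noise and reports the constrained inertia as a fraction of the total inertia. That ratio is only one of the several "variance explained" statistics mentioned in (3). The function name constrained_fraction and all data are hypothetical, and every sample and species is assumed to have a non-zero total.

```python
import numpy as np

def constrained_fraction(Y, X):
    """Constrained inertia divided by total inertia (illustrative helper)."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    P = Y / Y.sum()
    r = P.sum(axis=1)                        # sample weights
    c = P.sum(axis=0)                        # species weights
    expected = np.outer(r, c)
    Q = (P - expected) / np.sqrt(expected)   # chi-squared standardised residuals
    Xw = np.sqrt(r)[:, None] * (X - r @ X)   # weighted, centred environment
    B, *_ = np.linalg.lstsq(Xw, Q, rcond=None)
    return ((Xw @ B) ** 2).sum() / (Q ** 2).sum()

rng = np.random.default_rng(1)
n_samples = 10
Y = rng.poisson(3.0, size=(n_samples, 8)) + 1          # made-up abundances
noise = rng.normal(size=(n_samples, n_samples - 1))    # meaningless "environment"

# Add the random variables one at a time and track the apparent fit.
fractions = [constrained_fraction(Y, noise[:, :k]) for k in range(1, n_samples)]
for k, fraction in enumerate(fractions, start=1):
    print(f"{k} random variables: {fraction:.2f} of total inertia 'explained'")
```

The fraction can only grow as variables are added, and it reaches 1 once the number of variables equals the number of samples minus one, even though the variables carry no information at all.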

CCA does not explicitly calculate a distance matrix. However, like CA, it is implicitly based on the chi-squared distance measure, in which samples are weighted according to their totals. This distance gives high weight to species whose total abundance in the data matrix is low, and so exaggerates the distinctiveness of samples containing several rare species.
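
The weight given to rare species is easy to see by computing the chi-squared distance directly. The sketch below, in Python with NumPy, uses the usual definition: the squared differences between two sample profiles, each divided by the species' share of the grand total. The function name chi2_contributions and the abundance table are made up for the example.

```python
import numpy as np

def chi2_contributions(Y, i, k):
    """Per-species contributions to the squared chi-squared distance
    between samples i and k (illustrative helper)."""
    Y = np.asarray(Y, dtype=float)
    profiles = Y / Y.sum(axis=1, keepdims=True)   # each sample sums to 1
    species_mass = Y.sum(axis=0) / Y.sum()        # species share of grand total
    return (profiles[i] - profiles[k]) ** 2 / species_mass

# Made-up table: 3 samples x 3 species; the third species is rare (total = 2).
Y = np.array([[50, 50, 0],
              [50, 48, 2],
              [40, 60, 0]])

contrib = chi2_contributions(Y, 0, 1)
distance = np.sqrt(contrib.sum())
print("contributions per species:", contrib)
print("chi-squared distance between samples 0 and 1:", distance)
```

Samples 0 and 1 differ by the same amount (0.02 of the sample total) in the second and third species, yet the rare third species contributes 79 times as much to the squared distance, because the species totals are 158 and 2.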