Department of Agronomy and Plant Breeding, Faculty of Agriculture, University of Guilan, Rasht, Iran
Key words: Peanut, canonical correlation analysis, oil quality.

doi: http://dx.doi.org/10.12692/ijb/3.8.1-10

Article published on August 20, 2013

Abstract
Seed quality traits of plants may depend directly or indirectly on agro-morphological traits, so determining the
relationship between agronomical and oil quality characters may be important for plant scientists. The relationship
between agronomical and oil quality traits was studied in peanut genotypes using Canonical Correlation Analysis
(CCA). CCA is a multivariate statistical model that facilitates the study of interrelationships among sets of
multiple dependent variables and multiple independent variables. The canonical correlation for the first canonical
variate pair was 0.897. The five canonical functions obtained from morphological and agronomical traits accounted
for about 70% of the variation in the oil quality traits. It can be concluded that CCA can be used to simplify the
relationship between agro-morphological and oil quality traits of peanut.
* Corresponding

Canonical Correlation Analysis (CCA), one of the most direct ways of analyzing relationships between sets of variables, has been widely applied in various fields such as the plant sciences, biology, chemistry, and the social and management sciences (Young, 1981; Keskin and Yasar, 2007; Liu et al., 2009; Norman et al., 2012). This analysis allows investigation of the relationship between two sets of variables, one representing a set of independent variables and the other a set of dependent variables (Green, 1978).

according to instructions stated in the peanut descriptor (Anonymous, 1981). Determinations of peanut oil content and fatty acids were performed by Soxhlet and gas chromatography methods, respectively (Soxhlet, 1879; Hisil, 1988). The X and Y variable sets are defined as follows: X1: 100-grain weight; X2: 100-pod weight; X3: Grain length; X4: Grain width; X5: Pod length; X6: Pod width; X7: Grain:pod volume ratio; X8: Leaflet length; X9: Leaflet width; X10: Plant height; and X11: Pods per plant. Y1: Iodine index; Y2: Saponification value; Y3: Oleic acid; Y4: Linoleic acid; and Y5: Acid value.

2 Safari et al.

Int. J. Biosci.

2013

Statistical analysis

CCA was used to examine dependencies between important agronomical traits of peanuts and their oil quality measurements. Developed by Hotelling (Hotelling, 1935; Hotelling, 1936) as a generalization of multiple regression analysis, CCA seeks to identify and quantify the associations between two sets of variables where each set consists of at least two variables (Kshirsagar, 1972). It is used to investigate the relationship between a linear combination of the set of X variables and a linear combination of the set of Y variables.

Consider two groups of variables (X and Y) such that one has p variables ($X_1, X_2, \dots, X_p$) and the other has q variables ($Y_1, Y_2, \dots, Y_q$). Linear combinations of the original variables can be defined as canonical variates ($U_m$ and $V_m$) as follows:

$$U_m = a_{m1}X_1 + a_{m2}X_2 + \dots + a_{mp}X_p$$

$$V_m = b_{m1}Y_1 + b_{m2}Y_2 + \dots + b_{mq}Y_q$$

The correlation between $U_m$ and $V_m$ is called the canonical correlation ($C_m$). The squared canonical correlation (canonical root or eigenvalue) represents the amount of variance in one canonical variate accounted for by the other canonical variate (Hair et al., 1998).

The goal is to determine the coefficients, or canonical weights ($a_{ij}$ and $b_{ij}$), that maximize the correlation between canonical variates $U_i$ and $V_i$. The first canonical correlation, Corr($U_1$, $V_1$), is the highest possible correlation between any linear combination of the variables in the exposure set and any linear combination of the variables in the outcome set. Further pairs of maximally correlated linear combinations are chosen in turn, and they are orthogonal to those already identified. The maximum number of canonical correlations is equal to the number of variables in the smaller set, which is five in this study (the number of oil quality traits). Canonical correlation maximizes the correlation between linear combinations of variables in the X and Y variable sets. To determine how much of the variance in one set of variables is accounted for by the other set of variables, a redundancy measure is calculated (Sharma, 1996). Redundancy determines the amount of variance accounted for in one set of variables by the other set of variables. A more detailed description of CCA is given by Hair et al. (1998). The three possible methods for interpretation are (1) canonical weights (standardized coefficients), (2) canonical loadings (structure correlations), and (3) canonical cross-loadings (Hair et al., 1998). All the computational work was performed using the SAS for Windows software package (SAS, 1985).

Results and discussion

Canonical Correlation

The canonical correlation analysis was restricted to deriving five canonical functions because the dependent variable set contained five variables. To determine the number of canonical functions to include in the interpretation, the analysis focused on the level of statistical significance.
Table 1. Canonical correlation analysis.

Canonical function   Canonical correlation   Canonical R2   F statistic   Probability
1                    0.8975                  0.8055         1.19          0.0173
2                    0.8539                  0.7292         0.91          0.6342
3                    0.7440                  0.5535         0.75          0.7895
4                    0.6545                  0.4284         0.43          0.9511
5                    0.2941                  0.0865         0.14          0.9927


The test statistics for the canonical correlation analysis are presented in Table 1. The canonical correlation for the first pair (0.897) was found to be significant (p < 0.05) by the likelihood ratio test. The remaining canonical correlations are not statistically significant.

In addition to tests of each canonical function separately, multivariate tests of all functions simultaneously were also performed. The test statistics employed are Wilks' lambda, Pillai's criterion, Hotelling's trace, and Roy's greatest characteristic root (gcr). Table 2 details the multivariate test statistics, which all indicate that the canonical functions, taken collectively, are statistically significant at the 0.05 level.
Table 2. Multivariate tests of significance.

Statistic                 Value     Approximate F statistic   Probability
Wilks' Lambda             0.0122    1.37                      0.0173
Pillai's Trace            2.6034    1.20                      0.0446
Hotelling-Lawley Trace    8.9217    1.67                      0.0082
Roy's Greatest Root       4.1436    3.77                      0.0228
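The four multivariate statistics in Table 2 are all functions of the squared canonical correlations. The sketch below recomputes them from the canonical correlations in Table 1 using the standard textbook definitions, with Roy's greatest root taken as the largest eigenvalue r²/(1 − r²), which is assumed here to be the convention behind the tabulated value. The results agree with Table 2 up to rounding of the four-decimal inputs; the approximate F statistics and probabilities are not reproduced because they depend on degrees of freedom not given in the text.

```python
import numpy as np

# Canonical correlations from Table 1.
canonical_r = np.array([0.8975, 0.8539, 0.7440, 0.6545, 0.2941])
r2 = canonical_r ** 2  # canonical R2 (squared canonical correlations)

wilks_lambda = float(np.prod(1.0 - r2))           # product of unexplained shares
pillai_trace = float(np.sum(r2))                  # sum of canonical R2
hotelling_lawley = float(np.sum(r2 / (1.0 - r2))) # sum of eigenvalues
roy_root = float(r2[0] / (1.0 - r2[0]))           # largest eigenvalue

print(f"Wilks' lambda:          {wilks_lambda:.4f}")
print(f"Pillai's trace:         {pillai_trace:.4f}")
print(f"Hotelling-Lawley trace: {hotelling_lawley:.4f}")
print(f"Roy's greatest root:    {roy_root:.4f}")
```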

Table 3. Redundancy analysis of dependent variates for the first canonical function.
Standardized variance of the dependent variables explained by their own canonical variate (shared variance) and by the opposite canonical variate (redundancy).

Canonical   Shared variance              Canonical   Redundancy
function    Proportion   Cumulative      R2          Proportion   Cumulative
1           0.5330       0.5330          0.8056      0.4294       0.4294
2           0.2251       0.7581          0.7293      0.1642       0.5935
3           0.1719       0.9300          0.5536      0.0951       0.6887
4           0.0424       0.9724          0.4285      0.0182       0.7069
5           0.0276       1.0000          0.0865      0.0024       0.7092
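The redundancy columns of Table 3 can be recovered from the other two: for each canonical function, the redundancy is the shared variance multiplied by the canonical R². The sketch below applies that product to the tabulated values; small differences in the last decimal place are expected because the inputs are already rounded. The variable names are illustrative only.

```python
import numpy as np

# Values from Table 3 (dependent variable set).
shared_variance = np.array([0.5330, 0.2251, 0.1719, 0.0424, 0.0276])
canonical_r2 = np.array([0.8056, 0.7293, 0.5536, 0.4285, 0.0865])

# Redundancy index: variance in the dependent set explained by the
# opposite (independent) canonical variate.
redundancy = shared_variance * canonical_r2
cumulative_redundancy = np.cumsum(redundancy)

for k, (red, cum) in enumerate(zip(redundancy, cumulative_redundancy), start=1):
    print(f"function {k}: redundancy = {red:.4f}, cumulative = {cum:.4f}")
```

The cumulative value after the fifth function, roughly 0.71, corresponds to the statement in the abstract that the five canonical functions account for about 70% of the variation in the oil quality traits.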

Redundancy analysis is calculated for the independent and dependent variates in Tables 3 and 4. As can be seen, the redundancy index for the dependent variate is 0.4294 (shared variance 0.5330; Table 3). The independent variate, however, has a markedly lower redundancy index (.0126). The low redundancy of the independent variate results from the relatively low shared variance in the independent variate (.0156), not the canonical R2. From the redundancy analysis and the statistical significance tests, the first function should be accepted.

Canonical Correlation Interpretation

Table 5 contains the standardized canonical weights for each canonical variate for both dependent and independent variables. The magnitude of the weights represents their relative contribution to the variate. Standardized canonical coefficients show the variation in the canonical variate that accompanies a one standard deviation increase in the original variables. In other words, these coefficients represent the relative contributions of the original variables to the related variate. Based on the size of the weights, the order of contribution of the independent variables to the first variate is X11, X3, X2, X10, X1, X9, X7, X5, X4, X8 and X6, and the dependent variable order on the first variate is Y3, Y4, Y5, Y1 and then Y2. Because canonical weights can be unstable due to small sample size or the presence of multicollinearity in the data, the canonical loadings and cross-loadings are considered more appropriate (Hair et al., 1998). A canonical loading gives the product-moment correlation between the original variable and its corresponding canonical variate.


Canonical loadings, also called canonical structure correlations, measure the simple linear correlation between an original observed variable in the dependent or independent set and the set's canonical variate (Hair et al., 1998). The canonical loading reflects the variance that the observed variable shares with the canonical variate and can be interpreted like a factor loading in assessing the relative contribution of each variable to each canonical function. The methodology considers each independent canonical function separately and computes the within-set variable-to-variate correlation. The larger the coefficient, the more important it is in the canonical variate (Hair et al., 1998). Canonical loadings, like weights, may be subject to considerable variability from one sample to another. This variability suggests that loadings, and hence the relationships ascribed to them, may be sample-specific, resulting from chance or extraneous factors (Lambert and Durand, 1975).
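A canonical loading, as described above, is simply the correlation between one original variable and the canonical variate of its own set. The sketch below shows that computation for the independent set. The weights are the standardized coefficients reported for U1, but the data matrix is synthetic (the study's measurements are not available here) and the sample size of 50 is an assumption, so the printed loadings are illustrative and are not the values of Table 6.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50  # hypothetical number of genotypes

# Synthetic, mutually correlated stand-ins for the 11 agronomic traits.
common = rng.normal(size=(n, 1))
X = 0.6 * common + rng.normal(size=(n, 11))

# Standardize each trait, since the weights are standardized coefficients.
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Standardized canonical weights for U1 (X1 ... X11) as reported in Table 5.
weights = np.array([-0.3571, -0.5200, 0.5674, 0.0736, 0.0799, -0.0329,
                    0.0940, 0.0468, 0.2966, 0.3730, 0.6330])

# Canonical variate scores: weighted sum of the standardized traits.
u1 = Xs @ weights

# Loading of trait j = product-moment correlation between trait j and U1.
loadings = np.array([np.corrcoef(Xs[:, j], u1)[0, 1] for j in range(Xs.shape[1])])

for j, (w, l) in enumerate(zip(weights, loadings), start=1):
    print(f"X{j}: weight = {w:+.4f}, loading = {l:+.4f}")
```

Because the traits are correlated with one another, a variable's loading can differ noticeably from its weight, which is one reason loadings and weights can rank variables differently.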
Table 4. Redundancy analysis of independent variates for the first canonical function.
Standardized variance of the independent variables explained by their own canonical variate (shared variance) and by the opposite canonical variate (redundancy).

Canonical   Shared variance              Canonical   Redundancy
function    Proportion   Cumulative      R2          Proportion   Cumulative
1           0.0156       0.0156          0.8056      0.0126       0.0126
2           0.0086       0.0242          0.7293      0.0063       0.0188
3           0.0051       0.0293          0.5536      0.0028       0.0216
4           0.0828       0.1120          0.4285      0.0355       0.0571
5           0.2544       0.3665          0.0865      0.0220       0.0791

Table 5. Canonical weights for the first function.

Standardized canonical coefficients for the independent variables
Variable   Trait                      U1
X1         100-grain weight           -0.3571
X2         100-pod weight             -0.5200
X3         Grain length                0.5674
X4         Grain width                 0.0736
X5         Pod length                  0.0799
X6         Pod width                  -0.0329
X7         Grain:pod volume ratio      0.0940
X8         Leaflet length              0.0468
X9         Leaflet width               0.2966
X10        Plant height                0.3730
X11        Pods per plant              0.6330

Standardized canonical coefficients for the dependent variables
Variable   Trait                      V1
Y1         Iodine index               -0.2454
Y2         Saponification value        0.1664
Y3         Oleic acid                  0.7404
Y4         Linoleic acid              -0.6561
Y5         Acid value                 -0.5019
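The order of contribution quoted in the text follows from sorting the variables in Table 5 by the absolute size of their standardized weights. A minimal sketch of that ranking, using only the tabulated coefficients (the dictionary and variable names are illustrative):

```python
# Standardized canonical weights for the first function (Table 5).
weights_x = {
    "X1": -0.3571, "X2": -0.5200, "X3": 0.5674, "X4": 0.0736,
    "X5": 0.0799, "X6": -0.0329, "X7": 0.0940, "X8": 0.0468,
    "X9": 0.2966, "X10": 0.3730, "X11": 0.6330,
}
weights_y = {
    "Y1": -0.2454, "Y2": 0.1664, "Y3": 0.7404, "Y4": -0.6561, "Y5": -0.5019,
}

def rank_by_magnitude(weights):
    """Variable names ordered from largest to smallest absolute weight."""
    return sorted(weights, key=lambda name: abs(weights[name]), reverse=True)

order_x = rank_by_magnitude(weights_x)
order_y = rank_by_magnitude(weights_y)

print("independent set:", ", ".join(order_x))
print("dependent set:  ", ", ".join(order_y))
```

The sign of a weight is ignored in the ranking; it indicates only the direction in which the variable moves the variate.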

Table 6 contains the canonical loadings for the dependent and independent variates for the first canonical function. The first independent variate has loadings ranging from .0092 to .5600, with three independent variables (X4, X2 and X6) even having a negative loading. The two variables with the highest loadings on the independent variate are X3 (grain length) and X8 (leaflet length). Three of the variables (Y3, Y2 and Y5) have positive loadings and two of the variables (Y1 and Y4) have negative loadings on the dependent set. The two variables with the highest loadings on the dependent variate are Y3 (oleic acid) and Y1 (iodine value).
Table 6. Canonical loadings for the first function.
X1
X2
X3
X4
X5
X6
X7
X8
X9
X10
X11

Study of relationship between oil quality traits with agromorphological traits in peanut
