Maximum loss of efficiency occurring on any treatment contrast if the analysis is done by regression

EFLIMIT = scalar

Limit on the loss of efficiency for the analysis to be done by regression; default 0.1

EXIT = scalar

Exit code indicating the recommended method of analysis

Parameters

Y = variates

Data values to be analysed

RESIDUALS = variates

Variate to save the residuals from each analysis

FITTEDVALUES = variates

Variate to save the fitted values from each analysis

SAVE = identifiers

To save details of each analysis to use subsequently with the AOVDISPLAY procedure

Description

AOVANYHOW assesses a data set to select the most appropriate method for analysis of variance. If the design is orthogonal or balanced it uses the ANOVA directive. Otherwise, if there is no blocking in the design (i.e. there is only one random term) it uses the Genstat regression facilities through procedure A2WAY or AUNBALANCED. Finally, if there are additional random terms, it looks to see if these contain any useful information about the treatments in order to choose between regression and REML. The EFLIMIT option sets a limit on the amount of information that may be lost on any of the treatment contrasts if the analysis is to be done by regression instead of REML; the default of 0.1 implies that no more than 10% of the information on any contrast may be estimated between the random terms.
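The EFLIMIT rule can be illustrated with a minimal Python sketch. This is not the procedure's own code: the function name and the list of per-contrast losses are hypothetical, and each loss is assumed to be the proportion of information on a contrast that is estimated between the random terms.

```python
def choose_for_blocked_design(losses, eflimit=0.1):
    """Choose between regression and REML for an unbalanced blocked design.

    losses  -- hypothetical list giving, for each treatment contrast, the
               proportion of information that regression would lose
    eflimit -- largest acceptable loss (0.1 mirrors the EFLIMIT default)
    """
    max_loss = max(losses)  # the quantity that the EFLOSS option would save
    method = "regression" if max_loss <= eflimit else "REML"
    return method, max_loss


# At most 8% lost on any contrast: regression is acceptable.
print(choose_for_blocked_design([0.02, 0.08]))
# One contrast would lose 25%: REML is preferred.
print(choose_for_blocked_design([0.02, 0.25]))
```

A loss exactly equal to the limit is treated here as acceptable, following the wording "no more than 10%".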

The method of use is similar to that for ANOVA. The treatment terms to be fitted must be specified, before calling the procedure, by the TREATMENTSTRUCTURE directive. Similarly, any covariates must be indicated by the COVARIATE directive. Any blocking structure must be specified by the BLOCKSTRUCTURE directive.

The parameters of the procedure are identical to those of ANOVA. The variates to be analysed are specified by the Y parameter. Residuals and fitted values can be saved using the RESIDUALS and FITTEDVALUES parameters respectively. Finally, the SAVE parameter allows details of the analysis to be saved so that further output can be obtained using the AOVDISPLAY procedure.

Printed output is controlled by the PRINT option. The settings are limited to those that can produce analogous output from any of the analysis methods:

aovtable

analysis-of-variance table from ANOVA or regression, or Wald and F tests for fixed effects from REML,

information

design type, efficiency factors and name of the command used for the analysis,

means

tables of (predicted) means, and

residuals

residuals (fitted values are printed too for analyses by regression or REML).

Probabilities can be printed for variance ratios by setting option FPROBABILITY=yes.

The SAVE parameter allows you to save a pointer containing information about the analysis. You can use this as the input for the SAVE parameter of the AOVDISPLAY procedure to print (or reprint) any of the information provided by the PRINT option above. Alternatively, the first element of the pointer is the save structure from the command that was used for the analysis. So, if you use this with the display commands associated with that analysis command, you can display the more specialized output from the command (for example, variance components from REML).

Tables of means from regression and REML are calculated using the PREDICT and VPREDICT directives, respectively. The first step (A) of their calculations forms the full table of predictions, classified by every factor in the model. The second step (B) averages the full table over the factors that do not occur in the table of means. The COMBINATIONS option specifies which cells of the full table are to be formed in Step A. The default setting, estimable, fills in all the cells other than those that involve parameters that cannot be estimated, for example because of aliasing. Alternatively, setting COMBINATIONS=present excludes the cells for factor combinations that do not occur in the data. The ADJUSTMENT option then defines how the averaging is done in Step B. The default setting, marginal, forms a table of marginal weights for each factor, containing the proportion of observations with each of its levels; the full table of weights is then formed from the product of the marginal tables. The setting equal weights all the combinations equally. Finally, for regression analyses, the setting observed uses the WEIGHTS option of PREDICT to weight each factor combination according to its own individual replication in the data.
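The three ADJUSTMENT settings can be compared with a small Python sketch for a model with two factors. It only illustrates the weighting arithmetic described above, not how PREDICT or VPREDICT are implemented; the function names, the table of replications and the predictions are all invented for the example.

```python
def weight_table(counts, adjustment="marginal"):
    """Build the full table of weights for a two-factor table (Step B).

    counts -- replication of each factor combination (rows: levels of
              factor A, columns: levels of factor B); hypothetical data
    """
    n_rows, n_cols = len(counts), len(counts[0])
    total = sum(sum(row) for row in counts)
    if adjustment == "marginal":
        # Proportion of observations at each level of each factor,
        # then the product of the two marginal tables.
        row_p = [sum(row) / total for row in counts]
        col_p = [sum(counts[i][j] for i in range(n_rows)) / total
                 for j in range(n_cols)]
        return [[rp * cp for cp in col_p] for rp in row_p]
    if adjustment == "equal":
        return [[1.0 / (n_rows * n_cols)] * n_cols for _ in range(n_rows)]
    if adjustment == "observed":
        # Each combination weighted by its own replication.
        return [[c / total for c in row] for row in counts]
    raise ValueError("unknown adjustment: " + adjustment)


def means_for_factor_a(predictions, weights):
    """Average the full table of predictions over factor B.

    Assumes every row has a positive total weight.
    """
    return [sum(p * w for p, w in zip(p_row, w_row)) / sum(w_row)
            for p_row, w_row in zip(predictions, weights)]


counts = [[4, 2], [1, 1]]                 # unequal replication
predictions = [[10.0, 20.0], [30.0, 40.0]]  # full table from Step A
for setting in ("marginal", "equal", "observed"):
    print(setting, means_for_factor_a(predictions, weight_table(counts, setting)))
```

With this unequal replication the three settings give different means for the first level of factor A (13.75, 15 and about 13.33 respectively), which is why the choice of ADJUSTMENT matters for unbalanced data.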

The PSE option controls the types of standard errors that are produced to accompany the tables of means, with settings:

differences

summary of standard errors for differences between pairs of means,

alldifferences

standard errors for differences between all pairs of means,

lsd

summary of least significant differences between pairs of means,

alllsd

least significant differences between all pairs of means,

means

effective standard errors for analyses by ANOVA, or approximate effective standard errors for analyses by regression or REML – these are formed by procedure SED2ESE with the aim of allowing good approximations to the standard errors for differences to be calculated by the usual formula of $\mathrm{sed}_{i,j} = \sqrt{\mathrm{ese}_i^2 + \mathrm{ese}_j^2}$.

The default is differences. The LSDLEVEL option sets the significance level (as a percentage) for the least significant differences.
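As an illustration of how the effective standard errors from the means setting are intended to be used, the following Python sketch applies the usual formula for the standard error of a difference. The function name and the numerical values are invented; for analyses by regression or REML the result is only an approximation.

```python
import math


def approx_sed(ese_i, ese_j):
    """Standard error of the difference between means i and j,
    recovered from their effective standard errors."""
    return math.sqrt(ese_i ** 2 + ese_j ** 2)


# Hypothetical effective standard errors of 0.3 and 0.4
# give a standard error of difference of about 0.5.
print(approx_sed(0.3, 0.4))
```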

The PLOT option allows various residual plots to be requested: fittedvalues for a plot of residuals against fitted values, normal for a Normal plot, halfnormal for a half Normal plot, and histogram for a histogram of residuals.

The FACTORIAL option sets a limit on the number of factors that a higher-order term, such as an interaction, can contain; any terms with more factors are deleted from the analysis. The WEIGHTS option allows a variate of weights to be specified for a weighted analysis of variance.

You can save a scalar indicating the recommended method of analysis by using the EXIT option. The scalar can take values with the following meanings.

2. The design is unbalanced. It has 1 or 2 treatment factors and no blocking. Analyse by A2WAY.

3. The design is unbalanced and has 1 or 2 treatment factors. No more than a proportion defined by the EFLIMIT option of the information on any treatment contrast is estimated between block terms. Analyse by A2WAY.

4. The design is unbalanced, and there are either weights or more than 2 treatment factors. There is no blocking. Analyse by AUNBALANCED.

5. The design is unbalanced, and there are either weights or more than 2 treatment factors. No more than a proportion defined by the EFLIMIT option of the information on any treatment contrast is estimated between block terms. Analyse by AUNBALANCED.

6. The design is unbalanced with several block (i.e. random) terms. Analyse by REML.
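The way the exit codes listed above partition the unbalanced designs can be sketched in Python. The function and argument names are hypothetical, and the sketch covers only the codes listed here; it assumes that the design has already been found to be neither orthogonal nor balanced, and that a blocked design with an acceptable loss of efficiency is analysed by regression.

```python
METHOD_FOR_CODE = {2: "A2WAY", 3: "A2WAY",
                   4: "AUNBALANCED", 5: "AUNBALANCED", 6: "REML"}


def exit_code(n_treatment_factors, has_weights, has_blocking,
              loss_within_eflimit=True):
    """Return the exit code (2 to 6) for an unbalanced design.

    loss_within_eflimit -- True if no treatment contrast loses more than
                           the EFLIMIT proportion of its information
                           (only relevant when has_blocking is True)
    """
    if has_blocking and not loss_within_eflimit:
        return 6  # too much information between block terms: REML
    if has_weights or n_treatment_factors > 2:
        return 5 if has_blocking else 4  # AUNBALANCED
    return 3 if has_blocking else 2      # A2WAY


code = exit_code(n_treatment_factors=3, has_weights=False, has_blocking=True)
print(code, METHOD_FOR_CODE[code])
```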

The EFLOSS option can save the maximum loss of efficiency that would occur on any treatment contrast if the analysis is done by regression.

You can set option METHOD=recommend to request that AOVANYHOW just forms a recommendation for the command to be used if the analysis cannot be done by ANOVA. The only available setting of the PRINT option is then information, which tells you which command is recommended. You can still use the EXIT and EFLOSS options, but residuals and fitted values will be saved (by the RESIDUALS and FITTEDVALUES parameters) only if the analysis should be done by ANOVA.

Method

The EXIT option of the ANOVA directive is used to ascertain whether or not the design is orthogonal or balanced; if so it can be analysed by ANOVA. (For details, see the Guide to the Genstat Command Language, Part 2 Statistics, Section 4.7.) If the design is not orthogonal or balanced and there are several random terms, the AEFFICIENCY procedure is used to calculate the efficiency factors for the treatment terms, in order to decide whether to use regression or REML.