TUKEY MEAN-DIFFERENCE PLOT

Name:

TUKEY MEAN-DIFFERENCE PLOT

Type:

Graphics Command

Purpose:

Generates a Tukey mean-difference plot.

Description:

The Tukey mean-difference plot is an adaptation of the
quantile-quantile plot.

A quantile-quantile plot (or q-q plot) is a graphical data
analysis technique for comparing the distributions of 2 data
sets. The quantile-quantile plot is a graphical alternative
for the various classical 2-sample tests (e.g., t for location,
F for dispersion).

The "quantiles" of a distribution are the distribution's
"percent points" (e.g., .5 quantile = 50% point = median).
The advantage of the quantile-quantile plot is 2-fold:

the sample sizes do not need to be identical;

many distributional aspects can be tested simultaneously
(for example, shifts in location, shifts in dispersion,
changes in symmetry/skewness, and outliers).

The quantile-quantile plot has 2 components:

the quantile points themselves;

a 45 degree reference line.

Given a q-q plot whose y coordinates are T(i) and whose
x coordinates are D(i), the Tukey mean-difference plot is
defined as:

Vertical axis = T(i) - D(i);
Horizontal axis = (T(i) + D(i))/2.

The Tukey mean-difference plot also plots a horizontal
reference line at zero.

That is, it plots the difference of the quantiles against
their average. The advantage of the Tukey mean-difference
plot compared to the q-q plot is that it converts
interpretation of the differences around a 45 degree diagonal
line to interpretation of differences around a horizontal
zero line. However, the Tukey mean-difference plot should
only be applied if the two variables are on a common scale,
since the difference of the quantiles is otherwise not
meaningful.
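
The computation described above can be sketched outside of
Dataplot. The Python fragment below is only an illustration under
stated assumptions: the function names are hypothetical, T(i) is
assumed to come from the first response variable and D(i) from the
second, and unequal sample sizes are handled by linearly
interpolating both samples to the size of the smaller one. Dataplot
may pair the quantiles differently.

```python
def evenly_spaced_quantiles(values, n):
    """Return n quantiles of values, evenly spaced from the minimum
    to the maximum, using linear interpolation between order statistics."""
    if n < 2:
        raise ValueError("need at least 2 quantiles")
    s = sorted(values)
    m = len(s)
    out = []
    for i in range(n):
        pos = i * (m - 1) / (n - 1)   # fractional index into s
        lo = int(pos)
        hi = min(lo + 1, m - 1)
        out.append(s[lo] + (pos - lo) * (s[hi] - s[lo]))
    return out


def tukey_mean_difference(y1, y2):
    """Return (means, diffs): horizontal and vertical plot coordinates.

    Assumption: T(i) is taken from y1 and D(i) from y2.
    """
    n = min(len(y1), len(y2))
    t = evenly_spaced_quantiles(y1, n)   # q-q plot y coordinates, T(i)
    d = evenly_spaced_quantiles(y2, n)   # q-q plot x coordinates, D(i)
    diffs = [ti - di for ti, di in zip(t, d)]        # T(i) - D(i)
    means = [(ti + di) / 2 for ti, di in zip(t, d)]  # (T(i) + D(i))/2
    return means, diffs


# A pure location shift of 2 gives points lying on a horizontal line at 2.
y2 = [1.0, 3.0, 2.0, 5.0]
y1 = [v + 2.0 for v in y2]
means, diffs = tukey_mean_difference(y1, y2)
print(diffs)   # [2.0, 2.0, 2.0, 2.0]
print(means)   # [2.0, 3.0, 4.0, 6.0]
```

In this sketch a shift in location appears as a constant offset
from the zero reference line, while a difference in dispersion
would appear as a sloped pattern.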

As usual, the appearance of the 2 components is controlled by
the first 2 settings of the CHARACTERS and LINES commands. It is
typical for the response points to be represented as some
character, say X's, with no connecting line, and the reference
line as a connected line with no character. This is
demonstrated in the sample program below.

Syntax 1:

TUKEY MEAN DIFFERENCE PLOT <y1> <y2>
<SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
<y2> is the second response variable;
and where the <SUBSET/EXCEPT/FOR qualification>
is optional.

Syntax 2:

HIGHLIGHT TUKEY MEAN DIFFERENCE PLOT <y1> <y2>
<tag>
<SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
<y2> is the second response variable;
<tag> is the group-id variable that defines the
highlighting;
and where the <SUBSET/EXCEPT/FOR qualification>
is optional.

This syntax can be used to plot different plot points with
different attributes. For example, it can be used to highlight
groups in the data or to emphasize the extremes.

This same technique can be used for other distributions (use
the appropriate PPF function).

Note:

For large data sets, it may be impractical to generate the plot for
each individual point. As an alternative, you can generate the plot
for a user-specified number of quantiles. To do this, enter the
command

SET QUANTILE QUANTILE PLOT NUMBER OF PERCENTILES ...
<value>

where <value> specifies the desired number of quantiles. This
is demonstrated in the Program 2 example below.
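
The idea of plotting a fixed number of quantiles rather than every
point can be sketched in Python as follows. This is an illustration
only: the function name is hypothetical, and the choice of evenly
spaced cut points from the standard library's statistics.quantiles
(with the "inclusive" method) is an assumption, not a description of
how Dataplot selects its quantiles.

```python
import random
import statistics


def reduced_mean_difference(y1, y2, n_quantiles):
    """Mean-difference coordinates based on n_quantiles quantiles
    of each sample instead of every individual point."""
    # statistics.quantiles returns n - 1 cut points, hence the + 1.
    t = statistics.quantiles(y1, n=n_quantiles + 1, method="inclusive")
    d = statistics.quantiles(y2, n=n_quantiles + 1, method="inclusive")
    diffs = [ti - di for ti, di in zip(t, d)]        # vertical axis
    means = [(ti + di) / 2 for ti, di in zip(t, d)]  # horizontal axis
    return means, diffs


# 10,000 points per sample reduced to 99 plotted points.
random.seed(0)
y2 = [random.gauss(0.0, 1.0) for _ in range(10000)]
y1 = [v + 1.0 for v in y2]
means, diffs = reduced_mean_difference(y1, y2, 99)
print(len(diffs))   # 99
```

Here the two samples may also differ in size, since each is
summarized by the same number of quantiles before differencing.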