Linear discriminant analysis (LDA) classifies an object into one of two or more
categories based on measured properties of that object. LDA tests whether the
attributes measured in an experiment predict how the objects are categorized.

The research team divided web pages from the Webby Awards into two categories
based on the score each page earned in the first round of the Webby competition.
Category 1 included web pages with scores in the top 33% of the competition,
and category 2 included web pages with scores in the bottom 67%. Next, using an
automated tool, the team measured how well each web page in the competition
applied the 11 design metrics identified earlier. They then used LDA to predict
whether each page belonged to the Webby Award ‘top 33%’ or ‘bottom 67%’
category based on its performance on the 11 metrics. Overall, their automated
tool was 67% accurate in categorizing the web pages.
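
The procedure described above (split pages by score, measure metrics, predict the category, report accuracy) can be illustrated with a minimal sketch. It is not the research team's tool: the data are synthetic, only two hypothetical metrics stand in for the 11 design metrics, the helper names (label_top_third, fit_lda, predict) are invented for illustration, and accuracy is computed on the same pages used to fit the model.

```python
import math
import random

def label_top_third(scores):
    """Label 1 for scores in the top third, 2 otherwise (assumed cutoff rule)."""
    ranked = sorted(scores, reverse=True)
    n_top = max(1, round(len(scores) / 3))
    cutoff = ranked[n_top - 1]
    return [1 if s >= cutoff else 2 for s in scores]

def mean_vec(rows):
    """Mean of each of the two metric columns."""
    return [sum(r[j] for r in rows) / len(rows) for j in range(2)]

def pooled_cov(groups, means):
    """Pooled within-class 2x2 covariance, shared by all classes (the LDA assumption)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    n_total = 0
    for rows, mu in zip(groups, means):
        for r in rows:
            d = [r[0] - mu[0], r[1] - mu[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
        n_total += len(rows)
    dof = n_total - len(groups)
    return [[v / dof for v in row] for row in s]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def fit_lda(X, y):
    """Return one (label, weights, intercept) linear discriminant per class."""
    classes = sorted(set(y))
    groups = [[x for x, lab in zip(X, y) if lab == c] for c in classes]
    means = [mean_vec(g) for g in groups]
    sinv = inv2(pooled_cov(groups, means))
    model = []
    for c, g, mu in zip(classes, groups, means):
        w = [sinv[0][0] * mu[0] + sinv[0][1] * mu[1],
             sinv[1][0] * mu[0] + sinv[1][1] * mu[1]]
        # Intercept includes the log prior, estimated from class frequency.
        b = -0.5 * (w[0] * mu[0] + w[1] * mu[1]) + math.log(len(g) / len(X))
        model.append((c, w, b))
    return model

def predict(model, x):
    """Assign x to the class whose linear discriminant score is largest."""
    return max(model, key=lambda m: m[1][0] * x[0] + m[1][1] * x[1] + m[2])[0]

# Synthetic stand-in data: 90 pages, two hypothetical metrics, noisy scores.
rng = random.Random(0)
pages = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(90)]
scores = [0.8 * a + 0.5 * b + rng.gauss(0, 1) for a, b in pages]

labels = label_top_third(scores)
model = fit_lda(pages, labels)
predicted = [predict(model, x) for x in pages]
# Accuracy on the fitting data; a held-out set would give a fairer estimate.
accuracy = sum(p == t for p, t in zip(predicted, labels)) / len(labels)
print(f"training accuracy: {accuracy:.2f}")
```

With a 33%/67% split, a rule that always predicts the larger category is already right about two thirds of the time, so an accuracy figure is easier to interpret when read against that baseline.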

In this example, the research team demonstrates a valid use of linear
discriminant analysis. First, the independent variables, the 11 design metrics,
had continuous numeric values, which is the type of independent variable LDA
requires. Second, the dependent variable, Webby Award score, once converted into
a dichotomous categorical variable for the experiment, satisfied the LDA
requirement for the dependent variable. LDA is not commonly used in HCI
experiments, so it proved an innovative experimental design tactic, and its use
shows creativity on the part of the research team.

As alternatives, the research team could have used either quadratic discriminant
analysis (QDA) or logistic regression. QDA takes the same inputs as LDA and
returns the same kind of result, a predicted category for each object. The
difference is that QDA estimates a separate covariance matrix for each category,
which produces quadratic rather than linear decision boundaries, whereas LDA
assumes the categories share a single covariance matrix. The two methods
therefore give matching results only when the categories have similar covariance
structure. QDA is more flexible but has more parameters to estimate, so the
choice depends on the data and sample size as well as on the availability of
software to support the analysis. Logistic regression also takes the same inputs
and returns a predicted category. Logistic regression is preferable when the
groups are very unequal in size. For example, if the researchers expect only 1%
(0.01) of the cases to be categorized in group 1 and 99% (0.99) in group 2,
logistic regression would be the better choice because it handles unequal
groupings better than LDA and QDA do.[i]
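
How the three methods relate can be written out as a sketch under the usual textbook assumptions, which the source does not state: each category k is modeled as Gaussian with mean \mu_k, covariance \Sigma_k, and prior probability \pi_k, and an object x is assigned to the category with the largest discriminant score \delta_k(x). The symbols are illustrative notation, not taken from the study.

```latex
% QDA: each category k has its own covariance matrix \Sigma_k,
% so the score is quadratic in x
\delta_k^{\mathrm{QDA}}(x) = -\tfrac{1}{2}\log\lvert\Sigma_k\rvert
  - \tfrac{1}{2}(x-\mu_k)^{\top}\Sigma_k^{-1}(x-\mu_k) + \log\pi_k

% LDA: \Sigma_k = \Sigma for every k; terms common to all categories
% drop out, leaving a score that is linear in x
\delta_k^{\mathrm{LDA}}(x) = x^{\top}\Sigma^{-1}\mu_k
  - \tfrac{1}{2}\mu_k^{\top}\Sigma^{-1}\mu_k + \log\pi_k

% Logistic regression (two categories): the log-odds are modeled
% directly as a linear function of x, with no Gaussian assumption
\log\frac{P(y=1\mid x)}{P(y=2\mid x)} = \beta_0 + \beta^{\top}x
```

Both LDA and logistic regression yield a linear boundary between two categories. They differ in how the coefficients are estimated, which is one reason they can behave differently when the groups are very unequal in size.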

[i] From personal communication with Michael Peascoe at the University of
Minnesota Statistical Consulting Clinic.