Effects are relationships between variables. The magnitude of an
effect has an essential role in sample-size estimation, statistical
inference, and clinical or practical decisions about utility of the
effect. Virtually every effect in research, clinical and practical
settings arises from a linear model, an equation in which a dependent
variable equals a sum of predictor variables and/or their products.
Linear models allow for the effect of one predictor to be adjusted for
the effects of other predictors and for the modeling of non-linearity
via polynomials. Effects and models used to estimate them depend on the
nature of the dependent variable (continuous, count, nominal) and the
predictor variables (numeric, nominal). A continuous dependent gives
rise to a difference in a mean with a nominal predictor and a slope or
correlation with a numeric predictor. Default magnitude thresholds for
difference in a mean come from standardization (dividing by the
between-subject standard deviation): 0.2, 0.6, 1.2, 2.0 and 4.0 for
small, moderate, large, very large and extremely large. The same
thresholds apply to a slope, provided the slope is evaluated as the
difference for 2 SD of the predictor. Thresholds for correlations are
0.1, 0.3, 0.5, 0.7 and 0.9. Many effects and errors are uniform across
the range of the dependent variable when expressed as percents or
factors, and these should be estimated via log transformation.
Non-uniformity of error arising from repeated measurement or from
different subject groups should be addressed via within-subject modeling
or mixed modeling, which also provide estimates of individual responses
to treatments. Effects on nominal variables and counts are analyzed with
various generalized linear models, where the dependent is the log of
either the odds of a classification, the hazard (incidence rate) of an
event, or the mean count. The effect is estimated initially as a factor
representing a ratio between two groups (or per unit or per 2 SD of a
numeric predictor) of either odds of a classification, hazards of an
event, counts, or count rates. Effects involving common classifications
or events can be converted to differences in percent risk and
interpreted with magnitude thresholds of 10, 30, 50, 70 and 90;
equivalent odds ratios are 1.5, 3.4, 9.0, 32 and 360. Thresholds for
common events can also be derived from standardization of log of time to
the event. Both sets of thresholds are similar and correspond to hazard
ratios of 1.3, 2.3, 4.5, 10 and 100. For counts and rare events, a
consideration of proportions attributable to an effect gives rise to
ratio thresholds for counts, hazards, risks or odds of 1.1, 1.4, 2.0,
3.3, and 10. Proportional hazards regression is an advanced form of
linear modeling for use with events when hazards change with time but
their ratio is constant. KEYWORDS: correlation. count ratio, hazard
ratio, minimum clinically important difference, odds ratio, relative
risk, risk difference, standardization, transformation.

After presenting the Magnitude Matters slideshow recently in
several workshops, I realized that it needed more on the role played by
linear modeling in estimation of effects. The additive nature of the
linear model is the basis of adjustment for the effects of other factors
to get pure or un-confounded effects and to identify potential mediators
or mechanisms of an effect. The additive nature of linear models also
explains why we should use the log of the dependent variable to estimate
uniform percent or factor effects. A consideration of the error term in
a linear model provides further justification for the use of log
transformation, along with the use of the unequal-variances t statistic
or mixed modeling in analyses where the error term differs between or
within subjects. Finally, the analyses for counts and binary dependent
variables make little sense without understanding how the underlying
linear models require such strange dependent variables as the log of the
odds of a classification or the log of the hazard of a time-dependent
event. The new slideshow addresses all these issues and more, using
material from the recent progressive statistics article (Hopkins et al.,
2009) and a book chapter on injury statistics (Hopkins, 2009). The
slideshow hopefully represents a useful combination of theory and
practical advice for anyone who wants to understand and estimate effects
in their research.

For more on the way we infer causality, deal with confounders, and
account for mechanisms in the relationships between variables, see the
slideshow/article on research designs (Hopkins, 2008). My article and
spreadsheets on understanding stats via simulations (Hopkins, 2007a) are
useful for learning more about log transformation, straightforward
analyses, and inferential statistics. Follow this link to a slideshow
that details the various approaches to repeated measures and random
effects; I presented it at a conference in 2003, but it is still up to
date.

When it comes to actual data analysis, you will need extra help
with the practicalities of the use of a spreadsheet or stats package.
Peruse the article on comparing two group means and play with the
associated spreadsheet to come to terms with simple comparisons of means
and adjustment for a covariate (Hopkins, 2007b). The article on the
various controlled trials and the associated spreadsheets are a little
more advanced and also full of useful material (Hopkins, 2006). See my
item on Sad Stats for an overview of some of the stats packages and for
a set of files that are useful for SPSS users. If you already have some
experience with the SAS package but need specific advice on Proc Mixed
or Proc Genmod, contact me.

Reviewer's Commentary

The reprint pdf contains this article with a printer-friendly
version of the slideshow (six slides per page).

Update 28 Aug 2010. Odds-ratio thresholds of 1.5, 3.4, 9.0, 32 and
360 now included as an adjunct to proportion-difference thresholds of
10, 30, 50, 70 and 90 percent when modeling and interpreting common
time-independent classifications. These odds-ratio thresholds, which I
computed directly from the proportion differences centered on 50% (55 vs
45, 65 vs 35, etc.), agree well with a formula devised by Chinn (2000)
to convert an odds ratio to a standardized difference in means (ln(odds
ratio)/1.81).
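As a check on that correspondence, the odds ratios for proportion
differences centered on 50% can be computed directly, along with
Chinn's conversion. This is a minimal sketch (the code is illustrative,
not part of the original slideshow):

```python
import math

# Proportions in the "other" group for risk differences of 10, 30, 50,
# 70 and 90% centered on 50% (55 vs 45, 65 vs 35, etc.)
props = [0.55, 0.65, 0.75, 0.85, 0.95]
odds_ratios = [(p / (1 - p)) / ((1 - p) / p) for p in props]  # = p^2/(1-p)^2
standardized = [math.log(r) / 1.81 for r in odds_ratios]      # Chinn (2000)

for p, r, d in zip(props, odds_ratios, standardized):
    print(f"{p:.2f} vs {1 - p:.2f}: OR = {r:6.1f}, standardized = {d:.2f}")
```

The odds ratios come out at approximately 1.5, 3.4, 9.0, 32 and 361,
and the first four standardized differences land close to the default
thresholds of 0.2, 0.6, 1.2 and 2.0.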

Chinn S (2000). A simple method for converting an odds ratio to
effect size for use in meta-analysis. Statistics in Medicine 19,
3127-3131.

* An effect arises from a dependent variable and one or more
predictor (independent) variables.

* The relationship between the values of the variables is expressed
as an equation or model.

* Example of one predictor: Strength = a + b*Age

* This has the same form as the equation of a line, Y = a + b*X,
hence the term linear model.

* The model is used as if it means: Strength ← a + b*Age.

* If Age is in years, the model implies that older subjects are
stronger.

* The magnitude comes from the "b" coefficient or
parameter.

* Real data won't fit this model exactly, so what's the
point?

* Well, it might fit quite well for children or old folks, and if
so...

* We can predict the average strength for a given age.

* And we can assess how far off the trend a given individual falls.
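Fitting such a model takes only a couple of lines in any stats
environment. Here is a minimal sketch in Python; the ages, strengths
and coefficients are invented purely for illustration:

```python
import numpy as np

# Invented data: Age in years, Strength in arbitrary units
age = np.array([20, 30, 40, 50, 60])
strength = np.array([60.0, 64.0, 70.0, 74.0, 82.0])

# Fit Strength = a + b*Age by least squares
b, a = np.polyfit(age, strength, 1)
print(f"a = {a:.1f}, b = {b:.2f} units per year")

# Predict the average strength for a given age...
predicted = a + b * 45
# ...and see how far off the trend an individual falls (the residual)
residual = 73.0 - predicted
```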

* With kids, inclusion of Size would reduce the effect of Age. To
that extent, Size is a mechanism or mediator of Age.

* But sometimes a covariate is a confounder rather than a mediator.

* Example: Physical Activity (predictor) has a strong relationship
with Health (dependent) in a sample of old folk. Age is a confounder of
the relationship, because Age causes bad health and inactivity.

* Again, including potential confounders as covariates produces the
pure effect of a predictor.

* Think carefully when interpreting the effect of including a
covariate: is the covariate a mechanism or a confounder?

* If you are concerned that the effect of Age might differ for
subjects of different Size, you can add an interaction.

* Example of an interaction:

Strength = a + b*Age + c*Size + d*Age*Size

* This model implies that the effect of Age on Strength changes
with Size in some simple proportional manner (and vice versa).

* It's still known as a linear model.

Background: The Rise of Magnitude of Effects

* Research is all about the effect of something on something else.

* The somethings are variables, such as measures of physical
activity, health, training, performance.

* An effect is a relationship between the values of the variables,
for example between physical activity and health.

* We think of an effect as causal: more active → more healthy.

* But it may be only an association: more active ↔ more healthy.

* Effects provide us with evidence for changing our lives.

* The magnitude of an effect is important.

* In clinical or practical settings: could the effect be harmful or
beneficial? Is the benefit likely to be small, moderate, large...?

* In research settings:

* Effect magnitude determines sample size.

* Meta-analysis is all about averaging magnitudes of study-effects.

* So various research organizations now emphasize magnitude.

* Example of two predictors: Strength = a + b*Age + c*Size

* Additional predictors are sometimes known as covariates.

* This model implies that Age and Size have effects on strength.

* It's still called a linear model (but it's a plane in
3-D).

* Linear models have an incredible property: they allow us to work
out the "pure" effect of each predictor.

* By pure here I mean the effect of Age on Strength for subjects of
any given Size.

* That is, what is the effect of Age if Size is held constant?

* That is, yeah, kids get stronger as they get older, but is it
just because they're bigger, or does something else happen with
age?

* The something else is given by the "b": if you hold
Size constant and change Age by one year, Strength increases by exactly
"b".

* We also refer to the effect of Age on Strength adjusted for Size,
controlled for Size, or (recently) conditioned on Size.

* Likewise, "c" is the effect of one unit increase in
Size for subjects of any given Age.

* You still use this model to adjust the effect of Age for the
effect of Size, but the adjusted effect changes with different values of
Size.
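The adjustment can be seen numerically. In this sketch (invented data,
constructed so that Size drives Strength and itself tracks Age), the
simple regression of Strength on Age shows a substantial slope, but the
slope for Age vanishes once Size is in the model:

```python
import numpy as np

age  = np.array([8, 10, 12, 8, 10, 12], dtype=float)
size = np.array([25, 30, 35, 28, 33, 38], dtype=float)  # kg, tracks age
strength = 5 + 0.8 * size       # by construction, Age has no pure effect

# Unadjusted: Strength on Age alone
b_unadjusted = np.polyfit(age, strength, 1)[0]

# Adjusted: Strength = a + b*Age + c*Size
X = np.column_stack([np.ones_like(age), age, size])
a, b, c = np.linalg.lstsq(X, strength, rcond=None)[0]
print(f"unadjusted b = {b_unadjusted:.1f}, adjusted b = {b:.1f}, c = {c:.1f}")
```

The unadjusted slope (2.0 units per year) is entirely the work of Size;
holding Size constant, the effect of Age is zero.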

* Another example of an interaction:

Strength = a + b*Age + c*Age*Age = a + b*Age + c*Age²

* By interacting Age with itself, you get a non-linear effect of
Age, here a quadratic.

* If c turns out to be negative, this model implies strength rises
to a maximum, then comes down again for older subjects.
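A quadratic of this kind is still fitted by ordinary least squares. A
sketch with invented noise-free data (negative c, so strength peaks in
mid life; the peak is at Age = -b/(2c), the vertex of the parabola):

```python
import numpy as np

age = np.array([20, 30, 40, 50, 60, 70], dtype=float)
strength = 100 - 0.02 * (age - 45) ** 2   # peak at age 45, by construction

# Fit Strength = a + b*Age + c*Age^2
c, b, a = np.polyfit(age, strength, 2)
age_at_peak = -b / (2 * c)                # vertex of the parabola
print(f"c = {c:.3f} (negative), peak strength at age {age_at_peak:.0f}")
```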

* We have been treating Age as a number of years, but we could
instead use AgeGroup, with several levels; e.g., child, adult, elderly.

* Stats packages turn each level into a dummy variable with values
of 0 and 1, then treat each as a numeric variable. Example:

* Strength = a + b*AgeGroup is treated as Strength = a + b₁*Child +
b₂*Adult + b₃*Elderly, where Child=1 for children and 0 otherwise,
Adult=1 for adults and 0 otherwise, and Elderly=1 for old folk and 0
otherwise.

* The model estimates the mean value of the dependent for each
level of the predictor: mean strength of children = a + b₁.

* And the difference in strength of adults and children is b₂ - b₁.
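The dummy coding can be reproduced by hand. In this sketch (invented
strength values; a variant of the model without the intercept, so the
coefficients are the group means themselves), regressing on the three
0/1 dummy variables returns the means directly:

```python
import numpy as np

group = np.array(["child", "child", "adult", "adult", "elderly", "elderly"])
strength = np.array([18.0, 22.0, 48.0, 52.0, 38.0, 42.0])

# One 0/1 dummy variable per level of AgeGroup
levels = ["child", "adult", "elderly"]
X = np.column_stack([(group == lev).astype(float) for lev in levels])

# With no intercept in the model, b1, b2, b3 are the group means
b1, b2, b3 = np.linalg.lstsq(X, strength, rcond=None)[0]
print(f"means: child {b1:.0f}, adult {b2:.0f}, elderly {b3:.0f}")
print(f"adult minus child = {b2 - b1:.0f}")
```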

* You don't usually have to know about coding of dummies, but
you do when using SPSS for some mixed models and controlled trials.

* Dummy variables can also be very useful for advanced modeling.

* For simple analyses of differences between group means with
t-tests, you don't have to think about models at all!

* Or you can model change scores between pairs of trials. Example:

* Strength = a + b*Group*Trial, where b has four values, is
equivalent to StrengthChange = a + b*Group, where b has just two values
(expt and cont) and StrengthChange is the post-pre change scores.

* You can include subject characteristics as covariates to estimate
the way they modify the effect of the treatment. Such modifiers or
moderators account for individual responses to the treatment.

* A popular modifier is the baseline (pre) score of the dependent:
StrengthChange = a + b*Group + c*Group*StrengthPre.

* Here the two values of c estimate the modifying effect of
baseline strength on the change in strength in the two groups.

* And c₂ - c₁ is the net modifying effect of baseline on the change.
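As a sketch of that model (invented, noise-free data, so the
coefficients are recovered exactly), here is StrengthChange = a +
b*Group + c*Group*StrengthPre fitted with a separate intercept and
baseline slope for each group:

```python
import numpy as np

pre  = np.array([100.0, 110.0, 120.0, 100.0, 110.0, 120.0])
expt = np.array([0, 0, 0, 1, 1, 1], dtype=float)  # 0 = control, 1 = expt
cont = 1 - expt

# Invented change scores: control gains 1 unit; experimental gains
# 11 - 0.05*pre (weaker subjects at baseline respond more)
change = cont * 1.0 + expt * (11.0 - 0.05 * pre)

# Separate intercept and baseline slope per group (no overall intercept)
X = np.column_stack([cont, expt, cont * pre, expt * pre])
a_cont, a_expt, c_cont, c_expt = np.linalg.lstsq(X, change, rcond=None)[0]

net_modifier = c_expt - c_cont   # net modifying effect of baseline
print(f"c_cont = {c_cont:.2f}, c_expt = {c_expt:.2f}, net = {net_modifier:.2f}")
```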

* Bonus: a baseline covariate improves precision of estimation when
the dependent variable is noisy.

* Modeling of change scores with a covariate is built into the
controlled-trial spreadsheets at Sportscience.

* Number needed to treat (NNT) = number you would have to treat or
sample for one subject to have an outcome attributable to the effect.

* Promoted in some clinical journals, but not widely used?

* Can't estimate directly with linear models.

* Magnitude scale (if you ever use it) is given by 1/(risk
difference).
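The relationship to the risk-difference thresholds is then immediate (a
quick calculation, not from the original slide):

```python
# NNT = 1/(risk difference), so the risk-difference thresholds of
# 10, 30, 50, 70 and 90% map onto an NNT scale
risk_differences = [0.10, 0.30, 0.50, 0.70, 0.90]
nnt = [1 / rd for rd in risk_differences]
print([round(x, 1) for x in nnt])   # 10.0, 3.3, 2.0, 1.4, 1.1
```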

Odds ratio = (a/c)/(b/d).

* Hard to interpret, but must use to express effects and confidence
limits for time-independent classifications, including some case-control
designs.

* Use hazard ratio for time-dependent risks.

* Magnitudes for common time-independent classifications.

* Either convert to difference in risk between the reference
(comparison or control) group and other group. Example shown: if 50% of
reference group is affected, and odds ratio is 4.9, then by simple
algebra, 83% of other group is affected. Therefore risk difference = 33%
(i.e., moderate).
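The simple algebra in the example goes like this (a sketch; the 50%
reference risk and odds ratio of 4.9 are the figures from the slide):

```python
risk_ref = 0.50                  # 50% of reference group affected
odds_ratio = 4.9

odds_ref = risk_ref / (1 - risk_ref)        # risk to odds: 1.0
odds_other = odds_ratio * odds_ref          # 4.9
risk_other = odds_other / (1 + odds_other)  # odds back to risk
risk_difference = risk_other - risk_ref
print(f"risk in other group = {risk_other:.0%}, "
      f"difference = {risk_difference:.0%}")
```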

* Or use this scale for odds ratios, which correspond to risk
differences of 10, 30, 50, 70 and 90% "centered" on 50%:

<1.5 trivial; 1.5-3.4 small; 3.4-9.0 moderate; 9.0-32 large; 32-360
very large; >360 ext. large

* Magnitudes for rare classifications: see later.

[GRAPHIC OMITTED]

Ratio of mean time to event = t₂/t₁.

* Easier for an individual to interpret.

* If the hazards are constant, it's also the inverse of the hazard
ratio.

* Example: if hazard ratio is 2.5, there is 2.5x the risk of
injury. But 1/2.5 = 0.4, so injury occurs in less than half the time,
on average.
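With constant hazards, mean time to the event is the inverse of the
hazard, so the mean-time ratio is the inverse of the hazard ratio. A
quick simulation (exponentially distributed times, invented hazards)
illustrates:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
hazard_ref, hazard_other = 1.0, 2.5        # hazard ratio = 2.5

# Constant hazard -> exponential time to event with mean 1/hazard
t_ref = rng.exponential(1 / hazard_ref, n)
t_other = rng.exponential(1 / hazard_other, n)

ratio = t_other.mean() / t_ref.mean()      # close to 1/2.5 = 0.4
print(f"mean-time ratio = {ratio:.2f}")
```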

Difference in mean time to event = t₂ - t₁.

* Also easy to interpret, but can't model directly.

* Standardization of the log of individual values of time to event
leads to another scale for hazard ratios or mean-time ratios of common
events: 1.3, 2.2, 4.5, 13, 100.
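One way to see where such a scale can come from is a sketch under my
own assumption of a constant hazard: time to event is then
exponentially distributed, and the SD of its log is π/√6 ≈ 1.28.
Applying the standardized thresholds of 0.2, 0.6, 1.2 and 2.0 to log
time and back-transforming gives ratios close to 1.3, 2.2, 4.5 and 13:

```python
import math

sd_log_time = math.pi / math.sqrt(6)  # SD of log of an exponential variate
ratios = [math.exp(d * sd_log_time) for d in [0.2, 0.6, 1.2, 2.0]]
print([round(r, 1) for r in ratios])  # roughly 1.3, 2.2, 4.7, 13.0
```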

* This scale is similar to that given by consideration of maximum
risk difference for common events. Averaging the two and simplifying...

* Hazard-ratio thresholds for common events:_ trivial 1.3 small 2.3
moderate 4.5 large 10 very large 100 ext. large