Normalizing negatively skewed Likert data

New Member

I'm researching whether several chatbot-aided online shopping scenarios have different effects on customer experience and satisfaction.
All data is collected through a survey attached to my experiment. Customer experience is measured using four variables, each on a 7-point Likert scale. Customer satisfaction is measured on a 7-point Likert scale as well.

The problem is that this data tends to be negatively skewed, so much so that none of my variables are normally distributed. I tried transforming the data with a reverse-score log transformation, log(highest score + 1 - score). Although the data ends up less negatively skewed, it is still far from normal.
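For what it's worth, here is a minimal sketch of that reverse-score log transform in Python (NumPy/SciPy; the sample scores below are invented for illustration), showing how reversing the scores before taking the log flips the direction of the skew:

```python
import numpy as np
from scipy.stats import skew

# Hypothetical 7-point Likert responses, negatively skewed (many 6s and 7s)
scores = np.array([7, 7, 6, 7, 5, 6, 7, 4, 6, 7, 3, 6, 7, 7, 5])

# Reverse-score log transform: subtract each score from (highest + 1)
# so the long tail moves to the right, then take the log.
transformed = np.log(scores.max() + 1 - scores)

print(skew(scores))       # negative (left-skewed)
print(skew(transformed))  # skew direction flipped
```

Note that after the transform, high satisfaction maps to low transformed values, which is exactly the interpretation problem raised below: a "small" transformed score now means a very satisfied respondent.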

My questions to you:
1. Are there any common measures which can be taken to try to normalize data in this context?

2. Even if it were possible to normalize the data, is it recommended, or does it decrease the reliability of the test results?

3. Following question 2, I find myself in a trade-off between trying to normalize the data and moving on to analyze it with non-parametric tests. Which would you recommend, and why?

4. In case you recommend non-parametric tests, is there a substitute for Dunnett's post-hoc test? That test would be extremely useful for several of my hypotheses.

TS Contributor

This is a bit confusing. Did you mean 4 single Likert-type items, each with a 7-point response scale?
"Likert scale" is the term for a scale composed of such Likert-type items. Did you mean that you add these 4 variables up to obtain a scale score?

The problem is that this data tends to be negatively skewed, so much so that none of my variables are normally distributed.

Do not transform measurements just for mathematical reasons. You'll end up in a mess, or at least with huge interpretation problems (what do you think a subject's log-reverse-transformed response actually represents?).

1. Are there any common measures which can be taken to try to normalize data in this context?

New Member

This is a bit confusing. Did you mean 4 single Likert-type items, each with a 7-point response scale?
"Likert scale" is the term for a scale composed of such Likert-type items. Did you mean that you add these 4 variables up to obtain a scale score?

I meant that I have 4 separate variables, each containing multiple items. The score for each variable is the average of the 7-point Likert-scale scores of the items belonging to that variable.

Non-normality is hardly ever a problem. Why do you think it is a problem here?
If you just have Likert-type items, then one could argue that they are ordinal scaled, i.e. normality is not an issue at all.
But anyway, why do you want to normalize something here? To what end?

I need to run multiple one-way ANOVAs as well as a two-way ANOVA, in which I need to look at the interaction effect of my two independent variables that influence the online shopping scenarios. From what I've read in the literature, there is no non-parametric equivalent of the two-way ANOVA (independent samples). On top of that, many post-hoc tests cannot be performed after non-parametric tests. Non-normality implies that I need to run non-parametric tests, which is therefore problematic.

Since I'm using the average Likert score of multiple items per variable, the result is a reasonable approximation of interval data -- at least, for the thesis I'm writing I am allowed to treat it that way.

Do not transform measurements just for mathematical reasons. You'll end up in a mess, or at least with huge interpretation problems (what do you think a subject's log-reverse-transformed response actually represents?).

I found in Field's book (Discovering Statistics Using SPSS, 3rd edition, 2009) that a reverse-score log transformation of a variable can be used to reduce the extent to which the data is negatively skewed.

TS Contributor

This is wrong. Who told you this? ANOVA (and also linear regression) does not assume that the dependent variable is normally distributed. Rather, the F test assumes that the values within each group are normally distributed (in the population, by the way; the sample data are only a means to check this assumption), or, even simpler, that the residuals from the model are normally distributed (in the population). Moreover, the ANOVA F test is robust against violations of this assumption if the sample size is large enough (n > 30 or so; look up the "Central Limit Theorem").
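To make concrete what the assumption applies to, here is a small Python sketch (the group means, sizes, and the data-generating process are all invented for illustration) that fits a one-way ANOVA and then checks normality of the residuals, rather than of the pooled dependent variable:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Three hypothetical scenario groups of Likert-average scores (n = 40 each),
# clipped to the 1-7 range of the scale
groups = [np.clip(rng.normal(loc=m, scale=1.0, size=40), 1, 7)
          for m in (5.2, 5.5, 6.0)]

# One-way ANOVA F test across the three groups
f_stat, p_value = stats.f_oneway(*groups)

# The normality assumption concerns these residuals (each score minus its
# group mean), not the distribution of the dependent variable as a whole.
residuals = np.concatenate([g - g.mean() for g in groups])
shapiro_stat, shapiro_p = stats.shapiro(residuals)

print(f_stat, p_value)
print(shapiro_p)
```

The pooled dependent variable here can look skewed simply because the group means differ, while the residuals are still approximately normal, which is why testing the raw scores for normality can be misleading.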