The data

For this post, we’ll use data from a Big 5 measure of personality that is freely available from Personality Tests. You can download the data yourself HERE, or run the following code, which handles the download and saves the data as an object called d:

temp
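
A download step like this can be sketched roughly as below, assuming the data are distributed as a zip archive containing a tab-separated file. The helper name read_zipped_table and its arguments url and inner_path are placeholders for illustration; you would need to supply the real link and the file's path inside the archive yourself.

```r
# Download a zip archive to a temporary file and read one delimited file from it.
# ASSUMPTION: `url` and `inner_path` are supplied by the reader; neither value
# is taken from the original post.
read_zipped_table <- function(url, inner_path, sep = "\t") {
  temp <- tempfile(fileext = ".zip")
  on.exit(unlink(temp), add = TRUE)        # remove the temporary file afterwards
  download.file(url, temp, mode = "wb")    # binary mode so the zip is not corrupted
  read.table(unz(temp, inner_path), header = TRUE, sep = sep)
}

# Usage (not run here): d <- read_zipped_table(data_url, inner_path)
```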

At the time this post was written, this data set contained data for 19719 people, starting with some demographic information and then their responses on 50 items: 10 for each Big 5 dimension. This is a bit much, so let’s cut it down to work on the first 500 participants and the Extraversion items (E1 to E10):
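
A minimal sketch of that subsetting step, assuming the item columns are named E1 to E10 as described. keep_extraversion is a made-up helper name, and the toy data frame is simulated only so the example runs on its own; it is not the real data.

```r
# Keep the first `n` participants and only the extraversion items E1..E10.
keep_extraversion <- function(data, n = 500) {
  item_names <- paste0("E", 1:10)
  rows <- seq_len(min(n, nrow(data)))
  data[rows, item_names, drop = FALSE]
}

# Simulated stand-in for the downloaded data (NOT real responses):
# two demographic-style columns followed by E1..E10 rated 1 to 5.
set.seed(1)
toy <- data.frame(age = sample(18:60, 600, replace = TRUE),
                  gender = sample(1:3, 600, replace = TRUE))
for (item in paste0("E", 1:10)) {
  toy[[item]] <- sample(1:5, 600, replace = TRUE)
}

toy_e <- keep_extraversion(toy)
dim(toy_e)  # 500 rows, 10 columns
```

With the real data, the equivalent call would be d <- keep_extraversion(d).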

Here is a list of the extraversion items that people are rating from 1 = Disagree to 5 = Agree:

E1 I am the life of the party.

E2 I don’t talk a lot.

E3 I feel comfortable around people.

E4 I keep in the background.

E5 I start conversations.

E6 I have little to say.

E7 I talk to a lot of different people at parties.

E8 I don’t like to draw attention to myself.

E9 I don’t mind being the center of attention.

E10 I am quiet around strangers.

You can see that there are five items that need to be reverse scored (E2, E4, E6, E8, E10). Because ratings range from 1 to 5, subtracting a rating from 6 reverses it (1 becomes 5, 2 becomes 4, and so on), so we can do the following:

d[, paste0("E", c(2, 4, 6, 8, 10))] <- 6 - d[, paste0("E", c(2, 4, 6, 8, 10))]

We’ve now got a data frame of responses with each column being an item (scored in the correct direction) and each row being a participant. Let’s get started!

Average inter-item correlation

The average inter-item correlation is an easy place to start. To calculate this statistic, we need the correlations between all pairs of items, which we then average. Let’s use my corrr package to get these correlations as follows (no bias here!):
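
As a rough base-R equivalent (swapping cor() in for corrr), the statistic is the mean of the lower triangle of the item correlation matrix. The helper name avg_inter_item_cor and the toy data are made up for illustration.

```r
# Mean of all pairwise correlations between items (base R sketch).
# `avg_inter_item_cor` is an illustrative name, not a corrr function.
avg_inter_item_cor <- function(items) {
  r <- cor(items, use = "pairwise.complete.obs")  # item-by-item correlation matrix
  mean(r[lower.tri(r)])                           # each pair once, diagonal excluded
}

# Tiny made-up example (4 people, 3 items) to show the call
toy <- data.frame(E1 = c(1, 2, 4, 5), E2 = c(2, 1, 5, 4), E3 = c(1, 3, 3, 5))
avg_inter_item_cor(toy)
```

With the data frame from this post, the call would be avg_inter_item_cor(d).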

Average item-total correlation

We can investigate the average item-total correlation in a similar way to the inter-item correlations. The first thing we need to do is calculate the total score. Let’s say that a person’s score is the mean of their responses to all ten items:
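
A minimal sketch of this calculation in base R, assuming one column per item. item_total_cor is a hypothetical helper name, and the correlations are uncorrected: each item is included in the total it is correlated with.

```r
# Correlate each item with the total score (here, the mean of all items).
item_total_cor <- function(items) {
  items <- as.data.frame(items)
  total <- rowMeans(items, na.rm = TRUE)  # each person's mean across all items
  sapply(items, function(x) cor(x, total, use = "complete.obs"))
}

# Tiny made-up example (3 people, 2 items), just to show the shape of the output
toy <- data.frame(E1 = c(1, 3, 5), E2 = c(2, 3, 5))
item_total_cor(toy)        # one correlation per item
mean(item_total_cor(toy))  # the average item-total correlation
```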

Cronbach’s alpha

Cronbach’s alpha is one of the most widely reported measures of internal consistency. Although it’s possible to implement the maths behind it, I’m lazy and like to use the alpha() function from the psych package. This function takes a data frame or matrix of data in the structure that we’re using: each column is a test/questionnaire item, each row is a person. Let’s test it out below. Note that alpha() is also the name of a function in the ggplot2 package, which creates a conflict when both packages are loaded. To specify that we want alpha() from the psych package, we will use psych::alpha().

This function provides a range of output, and generally what we’re interested in is std.alpha, which is “the standardised alpha based upon the correlations”. Also note that we get “the average interitem correlation”, average_r, and various versions of “the correlation of each item with the total score” such as raw.r, whose values match our earlier calculations.
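
For the curious, the maths can be sketched in a few lines of base R. This is a simplified illustration under stated assumptions (rows with missing responses are dropped, no automatic reverse keying), not a reimplementation of psych::alpha(); cronbach_alpha is a made-up name.

```r
# Raw and standardised Cronbach's alpha from a data frame of items.
cronbach_alpha <- function(items) {
  items <- na.omit(as.data.frame(items))  # ASSUMPTION: listwise deletion
  k <- ncol(items)

  # Raw alpha: based on item variances and the variance of the sum score
  item_vars <- sapply(items, var)
  total_var <- var(rowSums(items))
  raw <- (k / (k - 1)) * (1 - sum(item_vars) / total_var)

  # Standardised alpha: based on the average inter-item correlation
  r <- cor(items)
  r_bar <- mean(r[lower.tri(r)])
  std <- (k * r_bar) / (1 + (k - 1) * r_bar)

  c(raw_alpha = raw, std_alpha = std)
}

# Usage with the post's data frame (not run here): cronbach_alpha(d)
```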

If you’d like to access the alpha value itself, you can do the following:
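
A sketch of one way to do this, assuming the psych package is installed: the object returned by psych::alpha() is a list whose total element holds the summary statistics, including std.alpha. The wrapper name std_alpha is made up for illustration.

```r
# Pull the standardised alpha out of the psych::alpha() result.
# ASSUMPTION: the psych package is installed.
std_alpha <- function(items) {
  fit <- psych::alpha(items)  # full alpha() output (a list)
  fit$total$std.alpha         # the standardised alpha as a single number
}

# Usage with the post's data frame (not run here): std_alpha(d)
```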

Split-half reliability

There are times when we can’t calculate internal consistency using item responses. For example, I often work with a decision-making variable called recklessness. This variable is calculated after people answer questions (e.g., “What is the longest river in Asia?”), and then decide whether or not to bet on their answer being correct. Recklessness is calculated as the proportion of incorrect answers that a person bets on.

If you think about it, it’s not possible to calculate internal consistency for this variable using any of the above measures. The reason for this is that the items that contribute to two people’s recklessness scores could be completely different. One person could give incorrect answers on questions 1 to 5 (thus these questions go into calculating their score), while another person might incorrectly respond to questions 6 to 10. Thus, calculating recklessness for many individuals isn’t as simple as summing across items. Instead, we need an item pool from which to pull different combinations of questions for each person.

To overcome this sort of issue, an appropriate method for calculating internal consistency is split-half reliability. This entails splitting your test items in half (e.g., into odd and even items) and calculating your variable for each person with each half. For example, I typically calculate recklessness for each participant from odd items and then from even items. These scores are then correlated and adjusted using the Spearman-Brown prophecy/prediction formula, 2r / (1 + r), where r is the correlation between the two halves (for examples, see some of my publications such as this or this). Similar to Cronbach’s alpha, a value closer to 1 and further from zero indicates greater internal consistency.
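
To make the recklessness case concrete, here is a hedged sketch. It assumes two logical matrices with one row per participant and one column per question: correct (was the answer right?) and bet (did they bet on it?). These names, the layout, and the helper functions are all hypothetical.

```r
# Proportion of incorrect answers that a person bet on.
recklessness <- function(correct, bet) {
  incorrect <- !correct
  rowSums(incorrect & bet) / rowSums(incorrect)  # NaN if nobody answered incorrectly
}

# Spearman-Brown adjustment for a correlation between two half-tests
spearman_brown <- function(r) 2 * r / (1 + r)

# Split-half reliability of recklessness using odd vs even questions
split_half_recklessness <- function(correct, bet) {
  odd  <- seq(1, ncol(correct), by = 2)
  even <- seq(2, ncol(correct), by = 2)
  r <- cor(recklessness(correct[, odd, drop = FALSE],  bet[, odd, drop = FALSE]),
           recklessness(correct[, even, drop = FALSE], bet[, even, drop = FALSE]),
           use = "complete.obs")  # drops people with no incorrect answers in a half
  spearman_brown(r)
}
```

Because each half is scored separately for each person, it doesn't matter that different people contribute different questions to their scores.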

We can still calculate split-half reliability for variables that do not have this problem! So let’s do this with our extraversion data as follows:
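
A minimal base-R sketch of an odd/even split, assuming one column per item in the order E1 to E10. The odd/even split is only one of many possible splits, and split_half is a made-up helper name.

```r
# Split-half reliability with the Spearman-Brown adjustment.
split_half <- function(items) {
  items <- as.data.frame(items)
  odd  <- rowMeans(items[, seq(1, ncol(items), by = 2), drop = FALSE])  # items 1, 3, 5, ...
  even <- rowMeans(items[, seq(2, ncol(items), by = 2), drop = FALSE])  # items 2, 4, 6, ...
  r <- cor(odd, even, use = "complete.obs")  # correlation between the two halves
  (2 * r) / (1 + r)                          # Spearman-Brown adjustment
}

# Usage with the post's data frame (not run here): split_half(d)
```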

Thus, in this case, the split-half reliability approach yields an internal consistency estimate of .87.

Composite reliability

The final method for calculating internal consistency that we’ll cover is composite reliability. Where possible, my personal preference is to use this approach. Although it’s not perfect, it takes care of many inappropriate assumptions that measures like Cronbach’s alpha make. If the specifics interest you, I suggest reading this post.

Composite reliability is based on the factor loadings in a confirmatory factor analysis (CFA). In the case of a unidimensional scale (like extraversion here), we define a one-factor CFA, and then use the factor loadings to compute our internal consistency estimate. I won’t go into the detail, but we can interpret a composite reliability score similarly to any of the other metrics covered here (closer to one indicates better internal consistency). We’ll fit our CFA model using the lavaan package as follows:
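
The calculation can be sketched as below, assuming lavaan is installed and the items are columns of a data frame. Composite reliability is computed here as the squared sum of the standardised loadings divided by that same quantity plus the summed residual variances. Both helper names and the factor name "extraversion" are chosen for illustration.

```r
# Composite reliability from the standardised loadings of a one-factor model
composite_reliability <- function(loadings) {
  residual_var <- 1 - loadings^2  # error variance of each standardised item
  sum(loadings)^2 / (sum(loadings)^2 + sum(residual_var))
}

# Fit a one-factor CFA and return the standardised loadings.
# ASSUMPTION: lavaan is installed; every column of `items` loads on one factor.
one_factor_loadings <- function(items) {
  model <- paste("extraversion =~", paste(names(items), collapse = " + "))
  fit <- lavaan::cfa(model, data = items)
  std <- lavaan::standardizedSolution(fit)
  std$est.std[std$op == "=~"]     # keep only the factor loading rows
}

# Usage with the post's data frame (not run here):
# composite_reliability(one_factor_loadings(d))
```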

There you have it. The composite reliability for the extraversion factor is .90.

One appealing aspect of composite reliability is that we can calculate it for multiple factors in the same model. For example, if we had included all personality items in a CFA with five factors, we could do the above calculations separately for each factor and obtain their composite reliabilities.

Just to finish off, I’ll mention that you can use the standardised factor loadings to visualise more information like we did earlier with the correlations. I’ll leave this part up to you!