The value of Likert scales in measuring attitudes of online learners

Hilary Page-Bucci - February 2003

Attitude is an important concept that is often used to understand and predict
people's reaction to an object or change and how behaviour can be influenced
(Fishbein and Ajzen, 1975)

Introduction

Online learning has grown alongside the progress of digital technology
over the last 15 years, and the reasons why students become absorbed in, practise
and complete a variety of tasks and exercises, or why they avoid others, are always
of interest to the effectors and evaluators of the learning process.

Establishing the characteristics of distance and online learners, how they
become motivated and how they feel about learning online, will yield useful
information that could strengthen teaching practices and thus ultimately enhance
student retention and achievement.

A review of some of the available literature has revealed research already
undertaken in various areas of learning online, such as 'training effectiveness
and user attitudes' (Torkzadeh et al, 1999). Torkzadeh et al suggest, 'to
achieve successful training we need to be cognizant of the user's attitudes
towards computers'. Further investigation revealed other factors that should
be taken into consideration; Miltiadou (1999) suggests that 'it is important
to identify motivational characteristics of online students'. Investigating
and defining their motivation would lead to an understanding of 'self-efficacy
beliefs about their own abilities to engage, persist and accomplish specific
tasks' (Bandura, 1986; Stipek, 1988 cited by Miltiadou).

The measurement of attitude is found in many areas, including social psychology
and the social sciences. Attitudes can be complex and difficult to measure, and
a number of different measuring instruments have been developed to
assess them.

'Scaling is the science of determining measuring instruments for human
judgment' (McIver 1981). One needs to make use of appropriate scaling methods
to aid in improving the accuracy of subjective estimation and voting procedures
(Turoff & Hiltz 1997). Torgerson (1958) pointed out that scaling, as a
science of measuring human judgment, is as fundamental as collecting data
on well-developed natural sciences. Nobody would refute the fact that all
science advances by the development of its measurement instruments. Researchers
are constantly attempting to obtain more effective scaling methods that could
be applied to the less well developed yet more complicated social sciences.
Scaling models can be distinguished according to whether they are intended
to scale persons, stimuli, or both (McIver 1981). For example, Likert scale
is a subject-centered approach since only subjects receive scale scores. Thurstone
scaling is considered a method to evaluate the stimuli with respect to some
designated attributes. It is the stimuli rather than the persons that are
scaled (Torgerson 1958). Guttman scaling is an approach in which both subjects
and stimuli can be assigned scale values (McIver 1981). (Li et al, 2001)

The purpose of this study is to explore the particular method of measuring attitude
known as Likert Scales (Likert, 1932), and to determine their effectiveness and
value in researching the attitudes, views and experiences of online learners. These
scales, according to Taylor and Heath (1996), have become one of the dominant
methods of measuring social and political attitudes.

Methodology and Measurement

The methodology used for this research is a critique of previous research
methodologies. Before setting it out, it is first necessary to clarify the
term 'attitude'.

Attitude is an important concept that is often used to understand and predict
people's reaction to an object or change and how behaviour can be influenced
(Fishbein and Ajzen, 1975)

An attitude is a mental and neural state of readiness, organised through experience,
exerting a directive or dynamic influence upon the individual's response to
all objects and situations to which it is related (Allport, 1935 cited by Gross)

A learned orientation, or disposition, toward an object or situation, which
provides a tendency to respond favourably or unfavourably to the object or situation
(Rokeach, 1968 cited by Gross)

Three of the generally accepted components of the term 'attitude' (Triandis,
1971) appear in some of the above definitions. These are:

Affective - the person's feelings about the attitude object

Cognitive - the person's beliefs or knowledge about the attitude object

Behavioural - the person's inclination to act toward the attitude object
in a particular way

Analysing these components, and bearing in mind the suggestion by Gross (1968)
that attitude is a 'hypothetical construct', it becomes apparent that attitude
cannot be directly measured and that the use of only a single statement or
question to assess it will not be effective in gaining reliable responses.

Attitude scales attempt to determine what an individual believes, perceives
or feels. Attitudes can be measured toward self, others, and a variety of
other activities, institutions, and situations (Gay, 1996)

There are several types of scales that have been developed to measure attitude:

Thurstone Scales

This is described by Thurstone & Chave (1929) as a method of equal-appearing
intervals. Thurstone scaling is 'based on the law of comparative judgment'
(Neuman, 2000). It requires the individual to either agree or disagree with
a large number of statements about an issue or object, usually by ticking a
true/false or agree/disagree box, i.e. a choice of two possible responses.
Although it was one of the first scaling methods to be developed, the
questionnaires are mostly generated by face to face interviews and are rarely
used in attitude measurement today; the example below (figure 1) is therefore
not relevant to online learners.

An example of a Thurstone Scale (figure 1):

ATTITUDE TOWARD WAR

An individual is asked to check those items which represent his views.

1. A country cannot amount to much without a national honor, and war
is the only means of preserving it.
2. When war is declared, we must enlist.
3. Wars are justifiable only when waged in defense of weaker nations.
4. Peace and war are both essential to progress.
5. The most that we can hope to accomplish is the partial elimination
of war.
6. The disrespect for human life and rights involved in a war is a cause
of crime waves.
7. All nations should disarm immediately.

Guttman Scales (Cumulative scales)

Guttman developed this scale in the 1940s in order to determine if a relationship
existed within a group of items. The items are ordered from low to high according
to difficulty so that to approve or correctly answer the last item implies approval
or success of all prior ones (e.g. self-efficacy scale). The respondent selects
an item that best applies. The list contains items that are cumulative, so the
respondent either agrees or disagrees; if he/she agrees to one, he/she probably
agrees to the previous statements. Arguably this scale does not capture enough
variation in feelings and perceptions; the author therefore suggests that it would
not be appropriate for measuring the attitudes of online learners.
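
The cumulative property described above can be illustrated with a short sketch.
It assumes responses are coded 1 (agree) and 0 (disagree) and that the items are
already ordered from the easiest to the hardest to endorse; the function name is
hypothetical and chosen only for illustration.

```python
def is_cumulative(responses):
    """Return True if a 0/1 response pattern is perfectly cumulative.

    Items are assumed to be ordered from easiest to hardest to endorse,
    so a cumulative pattern is a run of 1s followed only by 0s.
    """
    seen_disagree = False
    for answer in responses:
        if answer == 0:
            seen_disagree = True
        elif seen_disagree:
            # Agreement after an earlier disagreement breaks the pattern.
            return False
    return True


print(is_cumulative([1, 1, 1, 0]))  # agrees with the first three items only
print(is_cumulative([1, 0, 1, 0]))  # agrees with item 3 but not item 2
```

A respondent whose pattern fails this check does not fit the cumulative model,
which is one reason the scale is restrictive in practice.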

An example of a Guttman Scale (figure 2):

Please indicate what you think about new information
technology (IT) by ticking ONE box to identify the statement that most closely
matches your opinion (Wilson, 1997)

Agree

IT has no place in the office.

IT needs experts to use it in the office.

IT can be used in the office by those with training.

I'd be happy to have someone use IT to do things for me in the office.

Advantages

Disadvantages

Scalogram analysis may be too restrictive, only a narrow universe of
content can be used

Cornell technique questionable

Results no better than summated Likert scales

Semantic Differential Scaling

This is concerned with the 'measurement of meaning', the idea or association
that individuals attach to words or objects. The respondent is required to mark
on a scale between two opposing opinions (bipolar adjectives) the position they
feel the object holds on that scale for them. It is often used in market research
to determine how consumers feel about certain products.

Although this scale is comparatively easy for the respondent to complete, the
author argues that this would not be suitable for measuring attitude of online
learners as it tends to relate more to material associations than cognizance
of feelings.

An example of a Semantic Differential Scale (figure 3):

figure 3

Advantages: simple to construct; easy for subjects to answer; allows for
several types of analyses to take place

Disadvantages: analyses can be complex

Likert Scale (Summated scale)

This was developed by Rensis Likert in 1932. It requires the individuals to
make a decision on their level of agreement with a statement, generally on a
five-point scale (e.g. Strongly Agree, Agree, Neither, Disagree, Strongly
Disagree). The number beside each response becomes the value for that response
and the total score is obtained by adding the values for each response, hence
the reason why they are also called 'summated scales' (the respondent's score is
found by summing the values of the responses). Dumas (1999) suggests, 'this is
the most commonly used question format for assessing participants' opinions of
usability'.
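
As a minimal sketch of summated scoring, the example below assumes a five-point
coding from 1 (Strongly Disagree) to 5 (Strongly Agree). The mapping of labels
to numbers and the function name are assumptions made for illustration; they
are not taken from Likert (1932).

```python
# Assumed coding for a five-point scale (illustrative only).
SCALE = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neither": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}


def summated_score(answers):
    """Add up the numeric values of one respondent's answers."""
    return sum(SCALE[answer] for answer in answers)


answers = ["Agree", "Strongly Agree", "Neither", "Disagree"]
print(summated_score(answers))  # 4 + 5 + 3 + 2 = 14
```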

Two examples of Likert Scales (figures 4 & 5):

figure 4

figure 5

Advantages: simple to construct; each item of equal value so that respondents
are scored rather than items; likely to produce a highly reliable scale; easy
to read and complete

Disadvantages: lack of reproducibility; absence of one-dimensionality or
homogeneity; validity may be difficult to demonstrate

Reliability and Validity

Likert scale measures are fundamentally at the ordinal level of measurement
because responses indicate a ranking only.
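
Because the responses are ranks, summaries that rely only on order, such as the
median and the mode, are a cautious choice for a single item. The sketch below
uses Python's standard statistics module; the 1 to 5 coding and the responses
themselves are invented for illustration.

```python
from statistics import median, mode

# Responses to one item, coded 1 (Strongly Disagree) to 5 (Strongly Agree).
responses = [4, 5, 4, 2, 3, 4, 1]

print(median(responses))  # middle value once the responses are ranked
print(mode(responses))    # most frequently chosen response
```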

As the number of scale steps is increased from 2 up through 20, the increase
in reliability is very rapid at first. It tends to level off at about 7, and
after 11 steps, there is little gain in reliability from increasing the number
of steps (Nunnally, 1978, cited by Neuman)

Interestingly, Dyer (1995) states,

'attitude scales do not need to be factually accurate - they simply need to
reflect one possible perception of the truth ... [respondents] will
not be assessing the factual accuracy of each item, but will be responding to
the feelings which the statement triggers in them'

In line with the above statement, when constructing a Likert scale a pool of
statements needs to be generated that are relevant to the attitude, though not
necessarily factual (figure 6). The choices on the scale should be evenly balanced
to retain a continuum of positive and negative statements with which the respondent
is likely to agree or disagree, although the actual number of choices can be
increased. This will help avoid the problem of bias (figure 7) and improve
reliability, as anyone who answers 'agree' all the time will appear to answer
inconsistently.

figure 6

figure 7
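
The point about balanced wording can be sketched as follows. The example assumes
a 1 to 5 coding, reverse-scores the negatively worded items, and then compares
the two halves of the scale; a respondent who answers 'agree' to everything ends
up with halves that contradict each other. The item names, the function names
and the gap threshold of 2 are all hypothetical.

```python
def reverse(value, points=5):
    """Reverse-score a response on a scale running from 1 to `points`."""
    return points + 1 - value


def looks_acquiescent(answers, negative_items, points=5, max_gap=2):
    """Flag a respondent whose answers to positively and negatively
    worded items contradict each other after reverse scoring.

    `answers` maps item name -> raw response; `negative_items` is the
    set of negatively worded items. `max_gap` is an arbitrary threshold.
    """
    positive = [v for k, v in answers.items() if k not in negative_items]
    negative = [reverse(v, points) for k, v in answers.items()
                if k in negative_items]
    if not positive or not negative:
        return False  # nothing to compare
    gap = abs(sum(positive) / len(positive) - sum(negative) / len(negative))
    return gap >= max_gap


negative_items = {"q2", "q4"}
yea_sayer = {"q1": 4, "q2": 4, "q3": 4, "q4": 4}   # 'agree' throughout
consistent = {"q1": 4, "q2": 2, "q3": 5, "q4": 1}  # disagrees with negatives

print(looks_acquiescent(yea_sayer, negative_items))
print(looks_acquiescent(consistent, negative_items))
```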

As early as 1967, Tittle et al suggested,

The Likert Scale is the most widely used method of scaling in the social
sciences today. Perhaps this is because they are much easier to construct
and because they tend to be more reliable than other scales with the same
number of items (Tittle et al, 1967)

There still seems to be some contention within research, however, as to whether
Likert Scales are a good instrument for measuring attitude. Gal et al (1994)
suggest 'Likert-type scales reveal little about the causes for answers ... it
appears they have limited usefulness'. Helgeson (1993) states that major reviews
'repeatedly point to two problems: lack of conceptual clarity in defining
attitudes ... technical limitations of the instrument used to assess attitude'
(Helgeson, 1993 cited by Gal et al 1994). The author suggests that some of these
'major' reviews took place prior to 1993 and that, along with the progress in
technology, the reasons for measuring attitude may also have changed. It should
also be taken into account that this type of scale was not developed to provide
any kind of diagnostic information that shows underlying issues of concern to the
individual respondents. Since students are asked to complete so many
questionnaires in the course of their studies, the interface and usability should
also be taken into consideration. There are now also researchers who are in
favour of using Likert Scales: Robson (1993) suggests that Likert Scales 'can look
interesting to respondents and people often enjoy completing a scale of this kind.
This means that answers are more likely to be considered rather than perfunctory',
and Neuman (2000) states, 'the simplicity and ease of use of the Likert scale is
its real strength'.

Reservations on the use of a central Neutral Point

Arguments exist both for and against including a neutral point, and it would
be reasonable to ask what effect adding a neutral point has on the responses
received. It is possible that some respondents are genuinely neutral, in which case
it could be argued that by not including a neutral point in a scale, the respondent
is compelled to make a decision. Kline (cited by Eysenck, 1998) argues for a
middle point, 'even though some participants will very often opt out by remaining
indecisive'.

In contrast to this opinion, it has been suggested,

the traditional idea suggests that the qualitative results between the
two scales are unaffected since if the respondents are truly neutral, then
they will randomly choose one or the other, so forcing them to choose should
not bias the overall results (Kahn et al, 2000)
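
The reasoning in this quotation can be checked with simple arithmetic. The counts
below are invented for illustration, and the even split of neutral respondents
is exactly the assumption that the 'traditional idea' makes.

```python
def net_agreement(agree, disagree):
    """Difference between the numbers agreeing and disagreeing."""
    return agree - disagree


# Hypothetical counts on a scale that offers a neutral point.
agree, neutral, disagree = 50, 20, 30

# Forced choice: assume neutral respondents split evenly between the sides.
forced_agree = agree + neutral // 2
forced_disagree = disagree + neutral // 2

print(net_agreement(agree, disagree))                # with a neutral point
print(net_agreement(forced_agree, forced_disagree))  # without one
```

If the neutral respondents do not split evenly, the two figures differ.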

It is also suggested that the exclusion of a neutral point will draw the respondent
to make a decision one way or the other. This, states Dumas (1999), 'means that
by eliminating a neutral level it is providing a better measure of the intensity
of participants' attitudes or opinions'. The author suggests that preventing
the respondent from remaining neutral, and thus causing them to either 'agree' or
'disagree', could reduce the reliability of the scale, as the results will not
necessarily be true.

Review of Literature

Torkzadeh et al (2001) describe the construction of a scale to measure an individual's
self-perception and self-competency in interacting with the Internet. They consulted
five practitioners and four academics and developed a five-point Likert-type
scale (from 1, strongly disagree, to 5, strongly agree) using a list of
24 items with objectives to explore responses relating to 'unidimensionality,
reliability, brevity and simplicity of the factor structure'. The survey was
administered at a university in the Southwest region of the United States to
a total of 227 students, with an age range from 17 to 57 years.

They used two main criteria for eliminating items that were not considered
valid and reliable. Firstly, an item was eliminated if its correlation with the
sum of the other items in its category was less than 0.50, on the assumption
that 'if all items in a measure are drawn from the domain of a single construct,
responses to those items should be highly intercorrelated'. The second criterion
concerned reliability: Cronbach's alpha was used to examine each dimension
to see if 'additional items could be eliminated without substantially lowering
the reliability'. 'Items were eliminated if the reliability of the remaining
items would be at least 0.90.'
The resulting figures showed evidence of reliability and construct validity;
the overall coefficient alpha reliability score for the scale was 0.96.
The final recommendation, taking into consideration that this was their
first exploratory model, stated that 'the instrument should also be validated across
other variables such as age, education level and profession in order to assess
the generalisability of the scale to a more heterogeneous population', but this
was not a reflection on the instrument itself. In conclusion, they stated the
'instrument is useful in its present form', although one must always be aware
of the ever changing technologies on the World Wide Web and the need to keep
up to date with progress.

"this instrument is short, easy to use, reliable and appropriate
for use by academics and practitioners to measure Internet-related self-efficacy."
(Torkzadeh et al, 2001)
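
The two statistics behind the elimination criteria described above can be
sketched in plain Python. This is not the procedure or software used by
Torkzadeh et al; the responses are invented, the function names are
hypothetical, and only the 0.50 threshold comes from the study as reported here.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def item_total_correlations(items):
    """Correlate each item with the sum of the *other* items.

    `items` is a list of columns: items[i][r] is respondent r's
    answer to item i.
    """
    result = []
    for i, column in enumerate(items):
        others = [col for j, col in enumerate(items) if j != i]
        rest_total = [sum(values) for values in zip(*others)]
        result.append(pearson(column, rest_total))
    return result


def cronbach_alpha(items):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance)."""
    k = len(items)
    totals = [sum(values) for values in zip(*items)]
    item_variance = sum(pvariance(col) for col in items)
    return k / (k - 1) * (1 - item_variance / pvariance(totals))


# Invented responses: three items answered by five respondents.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]

# Keep an item only if its item-total correlation reaches 0.50.
keep = [r >= 0.50 for r in item_total_correlations(items)]
print(keep)
print(round(cronbach_alpha(items), 2))
```

With real data, alpha would be recomputed after each candidate item is dropped
to see whether the reliability of the remaining items stays at the chosen level.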

Shaw et al (2000) used a questionnaire arranged in a Likert format
to determine attitudes, views and experiences of a group of nutrition students
using an asynchronous learning network. The data was obtained through an online
'IT Appreciation' questionnaire completed in class during week 12 of the course.
'The text match questions allowed students to express opinions in their own
words and the multiple choice format consisted of 5 possible responses (some
reversed to counteract response sets) to the given statement arranged in a Likert
format' (Shaw et al, 2000).

It was concluded that the ALN paradigm could be considered a success as the
majority of the respondents agreed with the statements that they had become
more independent learners. But it was also noted that the largely positive responses
to the Likert questions were contradicted by the student responses to the open
ended questions.

From this it was decided that further study should investigate the discrepancy
between the responses to the Likert questions and the open-ended questions; it
was also considered possible that this could be due to the Hawthorne effect
(behaviour may be altered because the respondents know they are being studied).
The author therefore suggests that, although the questionnaire was considered
a success, its initial construction, along with how it is
presented (i.e. online in the classroom with other students or away from the
class situation), needs to be considered carefully. The apparent acquiescence
could be because some of the questions were single-sided (although
it was stated otherwise), or perhaps there was a large number of 'don't knows'
or 'non-responses'; the results do not include any information on this.

Rovai (2002) used a Likert-type scale, referred to as the 'Classroom Community
Scale', in his study of 314 distance learners using Blackboard as the mode of
delivery. The research was 'to determine if a significant relationship exists
between sense of community and cognitive learning in an online educational environment',
with the premise that if online learners feel an 'emotional connectedness' to
a community, their learning and motivation will be increased. Twenty statements
were used (some reverse scored), with a five-point scale of responses: strongly
agree, agree, neutral, disagree and strongly disagree. Cronbach's coefficient
alpha was used to calculate the reliability, which was 0.93. Content validity
was examined by a panel of experts comprising three university professors of
educational psychology. Although there is an in-depth discussion with regard
to further research, and an assumption that the respondents were typical of
students who participate in online distance education, the overall conclusion showed
that the Classroom Community Scale 'allowed for the hypothesised relationships
between the sense of community and cognitive learning'. The author suggests that
this Likert-type scale, which has been adapted and renamed, shows there is considerable
scope for the use of Likert scales in an e-learning environment.

Conclusions

Moving questionnaires with Likert scales onto the World Wide Web brings a whole
new meaning to questionnaires. They could almost be another source of activity
for the online learner. A form of scale that is frequently used is the 'graphic
scale', in which the respondent indicates his/her rating by placing a mark at the
appropriate point on a line that runs from one extreme of the attribute to the
other. For this to be a true Likert scale, after the series of items has been
developed using a graphic rating scale it is then necessary to determine which
items have the highest correlation with a specific criterion measure; only these
will be included in the scale.

Although not a graphic scale, figure 8 shows an example of how a Likert scale
could be presented in a web page. The use of radio buttons makes it easy to
complete, and as there is only one choice, difficult to invalidate by ticking
two boxes.

It was easy for me to remember how to perform tasks using spreadsheets

( ) Strongly Disagree   ( ) Disagree   ( ) Neither   ( ) Agree   ( ) Strongly Agree

figure 8
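
As a rough illustration of how such an item could be produced, the sketch below
builds the HTML for one radio-button group. It is not the markup behind figure 8:
the function name, the item name "q1" and the 1 to 5 values are assumptions.

```python
from html import escape

LABELS = ["Strongly Disagree", "Disagree", "Neither", "Agree", "Strongly Agree"]


def render_likert_item(name, statement, labels=LABELS):
    """Return an HTML fragment with one radio button per response label.

    Buttons that share the same `name` form a group, so the browser
    allows only one of them to be selected at a time.
    """
    lines = ["<p>" + escape(statement) + "</p>"]
    for value, label in enumerate(labels, start=1):
        lines.append(
            '<label><input type="radio" name="{0}" value="{1}"> {2}</label>'.format(
                escape(name, quote=True), value, escape(label)
            )
        )
    return "\n".join(lines)


fragment = render_likert_item(
    "q1", "It was easy for me to remember how to perform tasks using spreadsheets"
)
print(fragment)
```

Giving every button in the item the same name is what prevents a respondent from
selecting two answers at once.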

Other methods of presenting Likert scales in a web page are by using slider
controls (figures 9, 10, 11 & 12). A “slider control” (also
known as a trackbar) is a window containing a slider and optional tick marks.
Sliders are useful when you want the respondent to select a discrete value or a
set of consecutive values in a range. When the user moves the slider, using
either the mouse or the direction keys, the control sends notification messages
to indicate the change.

figure 11

figure 12

The slider moves in increments that you specify when you create it. For example,
if you specify that the slider should have a range of five, the slider can only
occupy six positions: a position at the left side of the slider control and
one position for each increment in the range.
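
The relationship between range and positions can be shown in a few lines. The
function name and the starting value of 0 are assumptions made for illustration.

```python
def slider_positions(range_size, minimum=0):
    """List the positions a slider can occupy for a given range.

    A range of n gives n + 1 positions: the starting position plus
    one position for each increment.
    """
    return list(range(minimum, minimum + range_size + 1))


positions = slider_positions(5)
print(positions)       # [0, 1, 2, 3, 4, 5]
print(len(positions))  # 6
```

On this reading, a five-point Likert item would need a slider with a range of
four.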

From a technical aspect, a basic knowledge of programming is useful if the
designer of the survey or questionnaire wishes to include slider controls in
a web page. Radio buttons (figure 8) require only a knowledge of HTML, making them
an easier option for the less technically minded.

Although there is some question over the reliability of Likert scales
and their analytical capacity, the general consensus is in favour of using Likert
scales; this is reinforced by the majority of the more recent literature
reviewed.

Maurer and Pierce (cited by Maurer and Andrews, 2000) investigated the
effectiveness of a Likert scale measure of self-efficacy for academic performance.
They suggested the Likert scale can be considered a measure of both magnitude
and confidence, and they concluded, based on reliability, predictive validity,
and factor analysis data, that a Likert scale measure of self-efficacy is
an acceptable alternative to the traditional measure.

Taylor, B & Heath, A (1996) The Use of Double-sided Items in Scale
Construction. Centre for Research into Elections and Social Trends; Working
Paper no. 37. Abstract available online: http://www.crest.ox.ac.uk/p37.htm [11.01.03]