Exploring children's learning

2.3 Operant conditioning

According to behaviourism, all behaviour is learned and maintained by its consequences. B. F. Skinner (1905–1990) devised apparatus and methods for studying these effects. Figure 3 shows a ‘Skinner Box’ designed for use with a rat. The early behaviourists often studied animal learning and extrapolated their findings to human learning, because they proposed that the same fundamental principles underpin learning in all species.

Figure 3: A Skinner Box (adapted from Crain, 2000, p. 179)

The animal in the box can behave in a variety of ways. The box contains a lever that delivers a food pellet when pressed. Initially, while moving about the box, the animal discovers by accident that pressing the lever makes food appear. Over time the rate at which the animal presses the lever increases, and other behaviours decrease by comparison. This suggests that the animal has learned to associate pressing the lever with the appearance of food. In Skinner's terminology, the lever-pressing behaviour has been reinforced: the consequences of pressing the lever make lever pressing more likely to occur in the future. If pressing the lever instead results in an unpleasant experience, such as an electric shock, lever-pressing behaviour occurs less often. This is an example of punishment. (Punishment is an environmental stimulus that results in a decrease in a given behaviour.) The important point to remember is that reinforcement always refers to something that increases the frequency of a given behaviour, whereas punishment always refers to something that reduces the frequency of a given behaviour. ‘Punishment’ is therefore used here as a technical term with a precise meaning that differs from its everyday meaning.

Reinforcement, an environmental stimulus that results in an increase in a given behaviour, has both positive and negative forms. The terms ‘positive’ and ‘negative’ refer to the presentation or removal of an environmental stimulus. So, for example, ‘positive reinforcement’ refers to the presentation of a stimulus that increases the occurrence of a behaviour. ‘Negative reinforcement’ refers to an increase in a behaviour following the removal of an unpleasant (‘aversive’) stimulus. For example, a child may increase the frequency of ‘room-cleaning behaviour’ because it results in the removal of parental disapproval.

Punishment can take one of three forms. ‘Positive punishment’ refers to the presentation of an unpleasant stimulus that decreases the occurrence of the behaviour it follows. ‘Time-out’ is where a child is isolated from a reinforcing stimulus in their environment, with the aim of producing a decrease in the target behaviour. Finally, ‘response cost’ is where a penalty is applied every time an undesired behaviour is produced, again resulting in a decrease in that behaviour. The penalty may be, for example, the removal of ‘tokens’: items that are valued by the person, such as reward stickers or money. Table 1 summarises reinforcement and punishment.

Table 1: Reinforcement and punishment

| Category | Form | Change in stimulus | Effect on behaviour |
| --- | --- | --- | --- |
| Reinforcement | Positive reinforcement | Positive stimulus presented | Behaviour increases |
| Reinforcement | Negative reinforcement | Aversive stimulus removed | Behaviour increases |
| Punishment | Positive punishment | Aversive stimulus presented | Behaviour decreases |
| Punishment | Time-out | Isolation from reinforcer | Behaviour decreases |
| Punishment | Response cost | Valued item removed (for example, a token) | Behaviour decreases |
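
The classification in Table 1 can also be read as a simple lookup on two questions: was a stimulus presented or removed, and did the behaviour then increase or decrease? The sketch below is only an illustration of that reading. The function name `classify_consequence` and the string values it accepts are hypothetical choices made for this example, and time-out and response cost share one entry because both involve removing something the person values.

```python
def classify_consequence(stimulus_change: str, behaviour_change: str) -> str:
    """Return the Table 1 label for a consequence (illustrative only).

    stimulus_change: "presented" or "removed"
    behaviour_change: "increases" or "decreases"
    """
    labels = {
        ("presented", "increases"): "positive reinforcement",
        ("removed", "increases"): "negative reinforcement",
        ("presented", "decreases"): "positive punishment",
        # Time-out and response cost both remove something valued.
        ("removed", "decreases"): "time-out or response cost",
    }
    key = (stimulus_change, behaviour_change)
    if key not in labels:
        raise ValueError(f"unrecognised combination: {key}")
    return labels[key]


# Parental disapproval is removed and room cleaning becomes more frequent.
print(classify_consequence("removed", "increases"))
```

The lookup uses the effect on behaviour, not how pleasant the stimulus seems, to decide between reinforcement and punishment.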

As with classical conditioning, extinction can occur if the behaviour is no longer reinforced. However, it should be noted that extinction is usually preceded by an extinction burst, which is a period of increased production of a previously reinforced behaviour following the withdrawal of that reinforcement.

Activity 1 Understanding punishment and reinforcement terms

5 minutes

This activity will help you to understand the meaning of the different types of reinforcement and punishment.

(3) Watching your parents walk away when you are having a tantrum, and eventually calming down to run after them.

(4) Stopping hitting your brother after you have a favourite toy taken away every time you hit him.

(5) Having had no opportunity to eat all day, you start eating a large chocolate bar and stop after eating three-quarters of it.

Comment

The important thing to note in all these examples is what happened to the person's behaviour in relation to the environmental change, as it is the actual effect on behaviour that defines something as reinforcing or punishing. So, being burned in (1) is an example of positive punishment, as the presence of the burning sensation reduced the future incidence of the behaviour. (2) is an example of positive reinforcement, as being given the star increased the production of neat writing. (3) is an example of time-out: the removal of parental attention resulted in reduced tantrum behaviour. (4) is an example of response cost: the favourite toy was systematically removed every time the undesired behaviour was produced. (5) is an example of negative reinforcement: your hunger (an aversive stimulus) is removed by eating three-quarters of the chocolate bar.

However, ideally we should consider all these behaviours over time. For example, in (5) if your future consumption of chocolate decreased, then your ‘chocolate-eating behaviour’ was punished (eating three-quarters of a bar of chocolate may have made you feel unwell). If this behaviour increased in future then it was reinforced – either negatively (by reducing hunger) or positively (because you love chocolate!). This highlights one of the difficulties in identifying reinforcers and punishers in practice: they are defined by their outcomes, which may vary from individual to individual. For example, what is ‘reinforcing’ for one person may be ‘aversive’ for another.
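
Because reinforcers and punishers are defined by their outcomes, one way to picture the point above is to compare how often a behaviour occurs before and after a consequence. The following is a minimal sketch under that assumption. The function `label_by_outcome` and the weekly counts are hypothetical and are not taken from any study.

```python
def label_by_outcome(rate_before: float, rate_after: float) -> str:
    """Label a consequence by what happened to the behaviour afterwards."""
    if rate_after > rate_before:
        return "reinforced"
    if rate_after < rate_before:
        return "punished"
    return "no change"


# Hypothetical counts of chocolate bars eaten per week for two people
# who had the same experience in example (5).
print(label_by_outcome(rate_before=2, rate_after=1))  # ate less afterwards
print(label_by_outcome(rate_before=2, rate_after=4))  # ate more afterwards
```

The same event receives different labels for the two people, which is why a consequence cannot be classified in advance from the stimulus alone.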

In addition to reinforcement and punishment, Skinner examined the effect that different schedules of reinforcement have on the production of a behaviour: does it matter if a reward or punishment is not presented every time a behaviour is produced? (A schedule of reinforcement is the frequency and/or regularity of a given reinforcement or punishment in a setting.) Of particular significance is the predictability of the environment: the more unpredictable the pattern of reinforcement or punishment, the more resilient the behaviour will be to extinction. Consider the example of a child who has learned to expect a gold star every time she produces good work; as soon as the stars stop appearing she will quickly become de-motivated. However, if she learns that she occasionally gets gold stars for good work, she will be more likely to sustain good work in the expectation that she will, eventually, get a star again.
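
The gold star example can be illustrated with a deliberately simple toy model. It assumes, purely for illustration, that a learner gives up once a run of unrewarded responses becomes longer than any run she experienced while rewards were still available. This rule is an assumption of the sketch, not a claim made by Skinner or by this course, and the function names and reward histories are hypothetical.

```python
def longest_unrewarded_run(rewards):
    """Length of the longest run of unrewarded responses in a history."""
    longest = current = 0
    for rewarded in rewards:
        if rewarded:
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest


def responses_before_giving_up(training_rewards):
    """Toy rule (an assumption): the learner stops once the unrewarded
    run exceeds the longest one seen during training."""
    return longest_unrewarded_run(training_rewards) + 1


# A star for every piece of good work.
every_time = [True] * 12
# Stars arrive only occasionally and irregularly.
occasional = [False, True, False, False, False, True,
              True, False, False, True, False, True]

print(responses_before_giving_up(every_time))  # 1
print(responses_before_giving_up(occasional))  # 4
```

Under this toy rule the child who was rewarded every time stops after a single unrewarded piece of work, while the child who was rewarded occasionally keeps going for longer, which mirrors the pattern described above.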
