But Isn’t it Punishment to Withhold the Treat?

It would probably be good to decrease this behavior. (Photo credit: Wikimedia Commons)

Lots and lots of people think that if you withhold the treat you are punishing the dog. Some will ask the above question in a gleeful, challenging way, feeling certain that they have caught positive-reinforcement-based trainers in an inconsistency. But let’s see what is really happening.

Here is a scenario. In the past, you have given your puppy attention and played with him when he jumped on you. But he’s getting big and you really don’t want him jumping on you anymore. You decide to teach him to sit to greet you. He already has a good reinforcement history for sitting, so the likelihood that he will do it in any given situation is fairly high.

So here you are with your excited pup and you are clicking and giving a treat whenever he sits.

Pup sits. Click/treat.

Pup sits. Click/treat.

Pup jumps on you. Nothing.

Pup sits. Click/treat.

Pup sits. Click/treat.

OK, what happened when the pup chose to jump instead of sitting? You didn’t click. The treats stayed in your hand, your pocket, or the bowl. (You meanie!) You stood still and didn’t react. You are paying for sits, not jumping up.

But lo and behold, the jumping up starts to decrease! Decreasing behavior means punishment, right? You must have punished your puppy for jumping!

No. Let’s look at the definitions of positive and negative punishment.

Punishment

Positive punishment: Something is added after a behavior, which results in the behavior happening less often. Example:

Antecedent: You approach your dog.

Behavior: Dog jumps on you.

Consequence: You step on the dog’s back foot, hard. (I’m not recommending this, of course. Just want a clear example of positive punishment.)

Prediction: Jumping up on you will decrease.

Negative punishment: Something is removed after a behavior, which results in the behavior happening less often. Example:

Antecedent: You approach your dog.

Behavior: Dog jumps on you.

Consequence: You turn around and leave.

Prediction: Jumping up on you will decrease.

In the positive punishment example you added painful pressure to your dog’s foot. (Please don’t ever do this.) If the dog finds having his feet stepped on sufficiently painful, jumping will decrease. In the negative punishment example you removed your presence and attention from the dog. If he likes your presence and attention well enough, and if you are consistent (and if there is no competing reinforcer, which is a big if!), this will also cause jumping on you to decrease.

So that’s what positive and negative punishment look like. Now back to our original example. Let’s map it out as well.

Antecedent: You approach your dog.

Behavior: Dog jumps on you.

Consequence: You just stand there.

You don’t respond with physical actions or increase or decrease your attention. Admittedly, this is hard to do, and remember, the lack of response has to be from the dog’s point of view. Even looking down at them is a response. Future blog on this point!

(By the way, some people who are very new to learning theory think that the above example is negative reinforcement. Sit, give treat = positive reinforcement. Then jump, withhold treat = negative reinforcement. No, no, no! It has an attractive symmetry, but that is not what the term means at all. Here’s a review.)

So What Is Happening?

OK, back to the first scenario, where you are working on sits with your puppy. Let’s say that after that one time when the puppy jumped and you didn’t treat, the puppy didn’t jump up again. Jumping on you decreased during training. Let’s also say that that decrease continues over time. Why isn’t that punishment again?

Because punishment is not the only process that involves a decrease in behavior. There is another: extinction.

Extinction is the nonreinforcement of a previously reinforced response, the result of which is a decrease in the strength of that response.

In other words, extinction is what happens when the behavior you used to do to achieve something doesn’t work anymore. So you stop doing it.
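For readers who like to see definitions written out as rules, here is a minimal sketch of the three processes discussed so far. It is only an illustration of the definitions in this post, not a tool for analyzing real training sessions. The names `Consequence` and `classify_decrease` are made up for this example, and the sketch assumes you already know what changed from the dog’s point of view and whether the behavior actually decreased, which in real life has to be observed.

```python
from enum import Enum


class Consequence(Enum):
    """What happens right after the behavior, from the dog's point of view."""
    ADDED = "something is added"            # e.g. painful pressure on the foot
    REMOVED = "something is removed"        # e.g. you turn around and leave
    NOTHING = "nothing changes"             # e.g. you just stand there


def classify_decrease(consequence, behavior_decreased, previously_reinforced=True):
    """Label the process using the definitions given in the post.

    All names here are hypothetical; this is a sketch of the definitions only.
    """
    if not behavior_decreased:
        # Punishment and extinction are both defined by a decrease in behavior.
        return "neither punishment nor extinction"
    if consequence is Consequence.ADDED:
        return "positive punishment"
    if consequence is Consequence.REMOVED:
        return "negative punishment"
    if previously_reinforced:
        # Nonreinforcement of a previously reinforced response.
        return "extinction"
    return "unclassified"


# The three jumping scenarios from the post, assuming jumping decreases in each.
print(classify_decrease(Consequence.ADDED, True))    # stepping on the foot
print(classify_decrease(Consequence.REMOVED, True))  # turning and leaving
print(classify_decrease(Consequence.NOTHING, True))  # just standing there
```

The point of the sketch is the last branch: when nothing is added and nothing is taken away, and a previously reinforced behavior fades, the label is extinction rather than either kind of punishment.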

So here comes the big question, especially for those folks who think they’ve somehow caught us out on the withholding the treat business.

How Humane is Extinction?

As with so many things, the answer is, “It depends.” But in this case there is a pretty clear demarcation. In the Humane Hierarchy, extinction by itself is at the same level as negative reinforcement (which involves an aversive) and negative punishment (which involves a penalty for behavior). Not great as first choices. We know that from life. If a machine we use all the time stops working, or a way we usually interact with a person we care about suddenly gets no response with no explanation, we are left high and dry. It is not fun.

However, extinction also happens in tandem with a process called Differential Reinforcement of Alternative Behaviors (DRA). This is how trainers who aim to train primarily with positive reinforcement use it. (There are other differential reinforcement methods, but this is a good general one to discuss right now.) It consists of reinforcement of an alternative behavior while reinforcement for the target behavior is withheld. Done with some care and skill, it can involve very little frustration for the animal, and it is one step closer to the “most humane” end of the Humane Hierarchy. And this is what is happening in the example above. As long as the trainer is being quite clear that sits are being paid for, the fact that jumping up on her no longer gets attention is not so hard on the pup. He has another thing he can do to get something good. He gets attention and food.

The trainer has communicated to the pup a new behavior to “fill the hole” where jumping used to be.

I’m borrowing this great example of how DRA works from my friend Kim Pike. Let’s say the soda machine at a workplace is not working. People will push the button repeatedly. Some will perhaps pound on the machine or kick it. This is typical when extinction is in play by itself. The people have no alternative, and get frustrated. (I’ll be covering extinction bursts and extinction aggression in a later post.) Gradually people will stop going to the machine and give up pushing the buttons. Individuals will probably forget, and now and then go try the machine again, then perhaps give it another kick or shove. But after a while no one goes to the machine anymore.

But when the soda machine is fixed, there will likely be a crowd of people ready to buy their sodas. It’s easier than going to the corner store, and involves less planning than bringing drinks from home. The behaviors attendant to getting a soda are all still fluent and easy for people to perform. And they once again get reinforced.

However! What if, when the machine broke, someone immediately set up a system where folks could buy a soda they liked as well or better for less money? Perhaps there was a cooler, or an honor system with soda in the fridge. If that alternative were in place immediately, would the thirsty people typically have experienced the same level of frustration at the broken machine? Nope! (Except perhaps for the engineers and mechanics, grin.)

And the most important question: What will the folks who just want a soda do when the machine gets fixed? As long as the cheaper, better alternative is still available, they will keep heading for it. The machine will have become irrelevant. Maybe once in a while someone will forget, and go to the machine. But they’d then remember that they can get a better drink, cheaper, out of the fridge.

This is what we are doing when we allow an extinction process in tandem with positive reinforcement of an alternative behavior. We clearly offer the animal an attractive alternative and remind them of it to keep it front and center. It’s important that the reinforcer for the new behavior be the same or better than that of the old behavior. This makes for a process with much less frustration.

Extinction in a Specific Circumstance

In my post “How Do I Tell My Dog She’s Wrong?” I address “failing to click” during a training session. I feature a short video example from the great trainer Sue Ailsby teaching her young Portuguese Water Dog, Sync, to stand and stay. In the video you can see Sync’s immediate bounce back after the couple of times she tries something other than a stand and doesn’t earn a click.

In that case, sits and downs are not going to decrease into oblivion in every situation, as we might want the jumping up to do in our other example. But they will go into extinction during training sessions of “Stand” and later when Sync learns a cue for it. Since dogs can discriminate this easily, it also tells us that when we want a behavior to go away completely, we need to practice reinforcing our alternative behavior in many locations and situations.

Conclusion

So in answer to the critics, no, withholding the cookie in itself is not punishment. And if used in tandem with reinforcing another behavior, it is quite humane. If we put even a moderate amount of thought and planning into the situation, we can set the dog up to succeed. There will be minimal frustration when he does miss the mark on occasion and fails to earn the treat.

Stay tuned for Part 2 on extinction. I’ll be talking in more detail about what happens when extinction is used by itself, and comparing that with differential reinforcement in some human and dog case studies.

36 Responses to But Isn’t it Punishment to Withhold the Treat?

I was always taught when I did my horse behaviour qualification that withholding a treat, i.e. something the animal wanted, was seen as negative punishment, and that removing something, i.e. pressure, was negative reinforcement. I understand you saying that in your example nothing was taken away when the treat was withheld, but I’ve always understood that withholding something (a treat, attention) was negative punishment?

I think there is a fine line. I have an example in my movie about the processes of operant learning where my dog sits, I start to hand her a treat, then she jumps for it and I pull it away. My mentors would say that is negative punishment (if the behavior decreased) because there was a contingency. The treat was right there in front of the dog, the dog did something, and the treat went farther away. On the other hand, if I hadn’t been handing her the treat in the first place and it was still sitting in my pocket, and she jumped after sitting, and I didn’t give her the treat, that would be extinction (again if the behavior reduced). Likewise, with attention, it’s the difference between withholding and withdrawing. If you were just sitting around staring into space and your horse came and mugged you for what you might have in your pockets and if you didn’t react and continued to stare into space: extinction. However if you are already interacting with your horse and it comes and mugs your pocket and you walk away: negative punishment. (With all the usual assumptions about what the horse wants and the behavior reducing, of course.)

I think the word withholding can lead us down the wrong track sometimes, since it sounds sort of like the animal should have earned it, or almost had it, and the person is being unfair. I think I would say, on thinking about it, that extinction is when the thing “stays unavailable,” and negative punishment is when the thing is “withdrawn.” What do you think? Are we closer to the same page?

I have been working on this with one of my dogs but I must be doing something that is still somehow rewarding the jumping up (yes, it is almost exactly your example!) When I walk into the house – or any time when he is feeling anxious/excited, he immediately jumps up. After a year or so of consistently turning away combined with giving him the command to sit (or more recently “go to mat”), he now gets a sheepish look on his face (I swear!), and then immediately does as requested. But I am still getting that first jump – and from a 96-pound dog, it is a very painful experience! I am at a loss as to how to get him to leave behind the jumping for good. I suspect that the positive rewards he receives in that initial jump (face-to-face contact, good feelings) are still reinforcing the behavior.

Dogs continue to do behaviors they are reinforced for. Even though you are turning away, are you perhaps making eye contact with the dog before you do that? Believe it or not, if he’s seeking attention from you by jumping, he just got what he wanted if you made eye contact ;-)
Whenever you have a problem behavior that keeps recurring, it’s useful to ask yourself, “How is my dog getting reinforced for this behavior?” Reinforcement drives behavior, so if he’s repeating it, there’s something in it for him. The trainer’s job is to figure out what that is, so that it can be removed. And, of course, to train the incompatible default behavior you want.

I’m glad Anne (pawsforpraise) chimed in. I was hoping some pro trainers would. Sounds like that initial jump got reinforced and is now part of a pretty solid chain. Whenever I have something like that, where it’s the very beginning of the behavior that is the problem, I see if I can teach something else (in this case Differential Reinforcement of an Incompatible behavior, similar to what I was talking about in the post) that is cued instead. And not on a verbal cue, but cued by the very thing that triggers the jump in the first place. I’m wondering if it would be worth it for you to teach him to lie down (with or without a mat) when you enter a room. Have you seen my post and video about teaching Clara an incompatible behavior for trying to mug my face? Only your cue could be coming in the room instead of bending over (which was Clara’s trigger).

Also, when you say he does it when he is anxious, and the thing about the sheepish look (I get it!) I wonder how much of it is stress driven. With Clara it took me ages to perceive that the jumping up wasn’t nearly as “happy” as I thought. I think she was actually relieved when I gave her a job to do instead.

Just some thoughts. Good luck! And my next episode on the topic will be about the whole thing of inadvertently reinforcing (and punishing) dog behaviors by the micro-things we do.

Thanks for your feedback, Eileen & Anne! Good idea to work on training him to respond to my entering as a cue rather than waiting for me to say something. Anxiety is definitely behind the jumping. I think with him it is a constant but we’ve been working on it and he is getting better at self-calming. Using the mat has helped a lot and I’ve noticed that he goes to his often when he is feeling nervous.

I also meant to tell you that I took your advice on the window film as well. My neighbors across the street recently installed a basketball hoop that has been drawing neighborhood kids in droves and driving my two crazy!! I had started just keeping my drapes closed, which wasn’t a good solution because I hate not getting light, and when the dogs heard the kids, they would nose the drapes aside to see and still barked. After reading your article, I decided to try wax paper as a temporary fix to see if it helped, and it has! Now the bottom half of my windows are “frosted” and it seems to have done the trick in keeping the dogs from obsessing about what’s going on out there.

As always, your posts have been very helpful. I always learn so much! Thank you!

From my experience, jumping can be rewarding too; just the act of paws touching things can be rewarding.
Sometimes teaching from the other side of a baby gate can be an option too when starting out, to avoid practicing the ‘bad’ behavior.

I’ve read extensively about training, and I totally agree that withholding a treat is not considered “punishment” within the terminology of behavior. However, I would say that very sensitive dogs can become worried when the click/treat does not occur when they think it should. E.g., when shaping a behavior, you have to withhold the click/treat sometimes to teach your dog what you want them to do. If a dog is too sensitive and easily worried, this type of shaping is a very precarious balancing act. In fact, it is sometimes even impossible (fortunately not very often).

I do understand that you are talking about the definitions used by academics, and I am talking about the emotions felt by a very sensitive dog – those are two different things. However, I’ve learned from having a sensitive dog (to the extreme) that she certainly views withholding the click/treat as a very bad outcome.

I do love your alternative behavior example as a way of setting up any dog for success. That works in so many situations but not in shaping a new behavior.

I agree 100%, KB. And I sure didn’t intend to diminish the stress that can come with shaping. I’m really glad you mentioned it. I’ve certainly had the experience of my dogs thinking it was the end of the world when they didn’t get a click. Zani and Summer have mostly grown out of that, but Clara is still moderately sensitive in that way.

I think in many situations the lack of a consequence can be harder on a person or animal than a mildly unpleasant consequence.

I wrote a post related to this (shaping and stress) a while back. Rereading that post I would add one thing to it. And that is, one doesn’t really have to shape that often, especially when we are talking about basic pet manners. When Clara was a puppy I captured, captured, and captured some more. To me that is fun. And it was low stress for her.

Thanks for your excellent comment. I love it that you stuck up for the dogs’ feelings.

> I’ve read extensively about training, and I totally agree that withholding a treat is not considered “punishment” within the terminology of behavior.

Actually, you are talking about “Behaviourism” as defined by Skinner. Though even there I DO think that ‘withholding an expected consequence’ MUST be considered ‘punishing’, as it causes the preceding behaviour to be less likely to be repeated (after the extinction burst, of course).

Although Skinnerianism is simple, I do not think that outside a Skinner box we can slot everything into his Quadrants. There are always ‘other things happening’ – so that while we might be reinforcing something, we are at the same time, punishing something else. And while all this is happening the environment, which we cannot control, is reinforcing or punishing something, possibly entirely different, at the same time as increasing or decreasing the ‘learner’s’ motivation or the salience of what we are doing.

I far and away prefer Skinner’s predecessor, Thorndike, who said simply
“that any behavior that is followed by pleasant consequences is likely to be repeated, and any behavior followed by unpleasant consequences is likely to be stopped.” http://www.simplypsychology.org/edward-thorndike.html

No. I am not talking about behaviorism as defined by Skinner. I am talking about behaviorism (behavior analysis, actually) as can be found in any beginner’s textbook in 2014.

In this blog I seek to explain current terminology as it is currently in use by professionals in the discipline, and help people understand its relevance to their lives. You would be hard pressed to find anything controversial here. If I make a mistake I own up to it, correct it publicly, and credit the person who brought my attention to it if they choose.

You can prefer anyone you want to, and Thorndike was one of the biggies, but using different terminology from what degreed professionals in the field use is confusing to beginners who are trying to learn about learning and behavior. Behavior analysis is a field with complexity and beautiful nuance, but one thing I have learned is that you have to get the basics down first. And that’s what I seek to help people to do here.

In Beyond Freedom and Dignity (1972) Skinner writes, “A [subject] who has been punished is not thereby simply less inclined to behave in a given way; at best, he learns how to avoid punishment.”

Whether a consequence increases or decreases the likelihood of a behavior can only be determined by measuring what happens. In ClearKrystal’s story, jumping is rewarded with proximity and contact; even if the handler then leaves the area, the reward was already delivered. Teaching an incompatible behavior (sitting) to the same cue that often results in jumping is one technique to change the behavior. Is it aversive for the handler to leave? Is it punishment for the dog to fail to earn a treat for sitting? Is the approach of the handler positive? It doesn’t matter.

Labeling punishment with + or – is jargon that causes people to choose sides, especially when one side has laid claim to the humane domain. I attended a Karen Pryor Academy and understand what you are fighting for but we’ve lost sight of the problem. No one should hurt a dog to teach him anything. A handler should use consequences that change behavior such that a dog does not learn to avoid the handler or the training experience. Doesn’t matter what you call it.

I think P- and extinction can be interrelated. But I would not consider a time out extinction because it is an action and it is contingent. The dog does something, you respond by removing them from opportunities for reinforcement. If you do something like the LRS that marine trainers do, a “Least Reinforcing Scenario,” that can be very very close to a non response. Just my opinion, but I think that one might tread the line. Interesting that you bring this up because my next post is going to talk about these borderline cases where we think we aren’t reacting but we are. Thanks for the comment! Good points.

Punishment is action and is contingent. I agree, though. Extinction is supposed to be an absence of consequence. The reason it works to “ignore” a behavior is that with no reward or punishment a behavior will disappear from the repertoire. The problem is that removing results from jumping is almost impossible. It’s rewarded by energy expended while aroused, contact with a human (even if the human leaves), and proximity to the hands and face of a human. So it is a least reinforcing choice by the person but doesn’t extinguish the behavior if the particular dog is rewarded by any of the above. The reward can’t be removed. Once you add a consequence that is likely to diminish jumping, well, now +P or -R are the only things those actions can be. The problem with time out is that it usually involves giving a cue or moving yourself. Either is a reward in a chain, as you pointed out at the start.

“Extinction is the nonreinforcement of a previously reinforced response, the result of which is a decrease in the strength of that response.”

And so is going from a fixed schedule of reinforcement to a random schedule of reinforcement.

Extinction is only really extinguishing when the unwanted behavior is no longer exercised. It’s a little far-fetched to speak of spontaneous regeneration of the unwanted behavior. So if you have a series of 10 repetitions, and the third is wrong (extinction – no reinforcer), as well as the 5th and the 8th and the 10th, extinction didn’t work. But you DID withhold the reinforcer.

Now, if you have a series of 10 repetitions, and the third is not reinforced, although it was correct, as was the 5th and the 8th and the 10th, we all know what we intend this to be. One set of approximations set out to a random rate of reinforcement. For the next set it could be the 2nd, 5th, 8th and 9th that are not rewarded, although the behaviors were correct.

Now my question is, how does the dog “get it”, that in the one example, the wrong behavior was punished through extinction, but in the other example the right behavior was simply not reinforced?

Add to the mix, that some trainers teach what they call NRM (seen on DVD, they are actually used as negative interruptors, but that’s another topic). How do these trainers use extinction which is basically NO input – neither negative nor positive punisher – if they are using NRM. Extinction does not combine with NRM that I’ve ever heard of….

In your second example, dog is doing behaviors, right, right, right, right, right, right, etc. You use variable reinforcement (ostensibly to make the behavior stronger).

The dog “gets it” because in the second example (if you are a good trainer) you haven’t started using variable reinforcement until the behavior is very strong, i.e., the dog is *positive* he knows what to do. So when he doesn’t get the reinforcement he tries *the same thing* over and over. He’s sure it’s right.

In the first example, the dog isn’t sure what you want, he’s still guessing. So he believes you when you “tell” him yes, yes, no, yes, no, etc.

That’s why it can be hard to extinguish a behavior the dog has been getting reinforcement for for a long time. They’re *sure* it’s right. But of course, you wouldn’t then be using variable reinforcement to fix that, you’d be using no reinforcement at all. So eventually they’d get it.

(Excuse the anthropomorphism.) And I know I saw somewhere that Bob Bailey said he didn’t think variable reinforcement schedules were ever necessary in pet dog training anyway.

Eileen, I just want to say thank you for your posts. My head is swimming at the moment, but I’m starting to get the answers I need. So many other things I have been reading have such cut and dried stock standard examples that fit very neatly and logically into their little boxes. You address the very important questions and confusions and the blurry lines. I haven’t got it all sussed yet, but I’m sure I will with the help of these posts. Thank you!