When it comes to encouraging people to work together for the greater good, carrots work better than sticks. That’s the message from a new study showing that rewarding people for good behaviour is better at promoting cooperation than punishing them for offences.

David Rand from Harvard University asked teams of volunteers to play “public goods games”, where they could cheat or cooperate with each other for real money. After many rounds of play, the players were more likely to work together if they could reward each other for good behaviour or punish each other for offences. But of these two strategies, the carrot was better for the group than the stick, earning them far greater rewards.

Public goods games, albeit in a more complex form, are part and parcel of modern life. We play them when we decide to take personal responsibility for reducing carbon emissions, or rely on others to do so. We play them when we choose to do our share of the household chores, or when we rely on our housemates or partners to sort it out.

These sorts of games are useful for understanding our tendency to help unrelated strangers even if we stand to gain nothing in return. The big question is why such selflessness exists when altruists can be easily exploited by cheats and slackers, who reap common benefits without contributing anything of their own. How does selflessness persist in the face of such vulnerabilities?

Many researchers have turned to punishment for an answer, testing the idea that societies are glued together by the ability to mete out penalties to freeloaders, often at some personal cost. Long-time readers of this blog may remember a set of articles on this topic.

The first, from Simon Gachter’s group, showed that the ability to punish freeloaders stabilises cooperative behaviour, but a second study from Martin Nowak showed that this boost in cooperation carries a cost – by escalating conflicts, the ability to punish leaves groups with smaller rewards than those that shun punishments altogether. Gachter disagreed, and in a third study, his group suggested that in the long run, both groups and individuals are better off if punishment is an option. Now, Rand, part of Nowak’s group at Harvard University, is back with another take on the debate and this time, he has focused on punishment’s cuddlier counterpart – reward.

This time, Rand wanted to make the games more realistic. Often, they are played for single rounds or with constantly switching anonymous partners. But that’s hardly an accurate reflection of real life where, as Rand says, “repetition is often possible, and reputation is usually at stake”. We often interact with the same people again, and our attitudes toward them depend on our past dealings. So Rand wanted to tweak the public goods game so that reputation could play a role.

Rand recruited 192 volunteers who played the game anonymously over computer screens. They were split into groups of four who stayed together throughout the experiment. The games lasted for 50 rounds, although the players didn’t know this. In each round, the players decided how many of 20 units to contribute to a public pool and how many to keep for themselves. The common pot was multiplied by 1.6 and divided evenly. At the end of it all, the units were converted into real cash.

It’s a classic dilemma. The group does best if everyone puts in their full allocation and all the players get 32 units back. But for a single player, the best tactic is to hold everything back and reap a share of the other players’ contributions – by doing so, they could end up with a princely 44 units.
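The arithmetic behind those two figures can be checked with a short sketch. The function name and its defaults are illustrative assumptions rather than anything from Rand's paper; only the numbers (four players, 20 units each, a 1.6 multiplier) come from the description above.

```python
def public_goods_payoffs(contributions, endowment=20, multiplier=1.6):
    """Return each player's payoff for one round of the public goods game.

    Hypothetical helper: each player keeps whatever they did not contribute,
    plus an equal share of the multiplied common pot.
    """
    pot = sum(contributions) * multiplier
    share = pot / len(contributions)
    # Round to 2 decimals to hide floating-point noise from the 1.6 multiplier.
    return [round(endowment - c + share, 2) for c in contributions]


# Everyone contributes their full 20 units.
full_cooperation = public_goods_payoffs([20, 20, 20, 20])

# Player 0 holds everything back while the other three contribute fully.
one_free_rider = public_goods_payoffs([0, 20, 20, 20])

print(full_cooperation)  # 32 units each
print(one_free_rider)    # 44 for the free-rider, 24 for each contributor
```

Under these assumptions, every unit a player contributes returns only 0.4 units to that player (1.6 divided by 4), which is why holding back is individually tempting even though the group as a whole gains from each contribution.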

On top of this basic model, some of the recruits played games with a second stage to each round. During this stage, one group could punish a specific player, paying 4 of their own units to deprive the victim of 12 of theirs. A second group could reward their peers with 12 units at the cost of 4 of their own. And a third group could choose between rewards, punishment or inaction. At the end of it, each player learns what their peers did to them, but not to the others.
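To see how the second stage changes the group's total wealth, here is a minimal sketch. The `targeted_stage` function and the tuple format for actions are hypothetical; only the cost of 4 units and the effect of 12 units come from the description above.

```python
COST = 4     # units paid by the player who rewards or punishes
EFFECT = 12  # units gained (reward) or lost (punishment) by the target


def targeted_stage(actions, n_players=4):
    """Return each player's change in units after the second stage.

    `actions` is a list of (actor, target, kind) tuples, where kind is
    "reward" or "punish". This format is an illustrative assumption.
    """
    delta = [0] * n_players
    for actor, target, kind in actions:
        if kind not in ("reward", "punish"):
            raise ValueError(f"unknown action: {kind}")
        delta[actor] -= COST
        delta[target] += EFFECT if kind == "reward" else -EFFECT
    return delta


one_reward = targeted_stage([(0, 1, "reward")])
one_punishment = targeted_stage([(0, 1, "punish")])

print(one_reward, sum(one_reward))          # group total rises by 8
print(one_punishment, sum(one_punishment))  # group total falls by 16

# Every player rewards each of the other three.
everyone_rewards = targeted_stage(
    [(a, t, "reward") for a in range(4) for t in range(4) if a != t]
)
print(everyone_rewards)  # each player nets 3 * 12 - 3 * 4 = 24
```

A single reward adds 8 units to the group's total while a single punishment removes 16, so the two options are not symmetric in what they do to overall wealth.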

With neither carrots nor sticks on the cards, Rand found that the players’ average contributions fell over time from 14 units per round to a mere 8. Rewards and punishment were equally good at promoting cooperation; when these options for interaction were available, the contributions stayed high. However, the two groups that could reward each other earned much higher payoffs than those that could only punish, or those that could do neither.

People also seemed to like playing the do-gooder; in both groups where rewards were available, players used them more and more as the games went on. Punishments, however, decayed over time. The cost of retaliations meant that players hardly ever doled out penalties by the experiment’s end. In fact, when both options were on hand, the groups that largely rewarded each other ended up wealthier than those that favoured punishments.

A few questions remain. Rand’s modified public goods game was designed to allow reputation to influence how people hand out reward and punishment. This reflects many of our most important interactions – with friends, family and colleagues – and it certainly reflects the world of our ancestors, who lived in tightly knit communities. Whether it applies to our globalised world, where we may only have one-off encounters with others and where online communication offers the boon of anonymity, remains to be seen.

But all in all, Rand’s results suggest that when people repeatedly cross each other’s paths, carrots are far better than sticks at fostering behaviour for the greater good. Not only do they lead to greater payoffs for everyone concerned but they minimise the threat of antisocial punishment, where freeloaders vengefully castigate the altruists. This behaviour has the ability to derail cooperation and while fairly rare in countries like the US or the UK, it is far more common in places like Greece and Oman. In such countries, the relative merits of rewards may be even greater.

Rand sums it up best in his own conclusions and I will let his words speak for themselves:

“Sometimes it is argued that it is easier to punish people than to reward them. We think this is not the case. Life is full of opportunities for mutually beneficial trade, as well as situations where we can help others, be they friends, neighbors, office mates, or strangers. We regularly spend time and effort, as well as money, to assist people around us. This assistance can be minor, like helping a friend to move furniture, working extra shifts to cover for an ill co-worker, or giving directions to a tourist. It can also be more important, like recommending a colleague for promotion or speaking out to support a victim of discrimination. These sorts of productive interactions are the building blocks of our society and should not be disregarded.”

Up front caveat, I need to read the paper to see if they investigated the effect…

That said, the results presented here are beyond trivial. Punishment subtracts 16 units from the sum total while reward adds 8. Hardly a surprise that rewarding leads to higher payoffs.

When rewarding is an option, the second phase is a non-zero-sum game unto itself. In fact, it is from a class of trust games which are pretty well studied. The interaction with the first phase game may well be interesting, but it is totally swamped by the extra inflow of resources.

Concur with travc@6. A net difference of 24 points is highly significant when you’re starting with 20. If they didn’t (somehow) control for that statistically, this result is garbage.

If they DID control, they didn’t isolate -why- the score shift happens; all they did was determine that people like to give out (and receive) points. That -could- point to a mechanism for developing cooperation, but it isn’t necessarily so.

And keep in mind this is all occurring under the happy auspices of a guaranteed x1.6 multiplier on returns. The lower that multiplier, the more significant punishment becomes as a response, as defectors who receive punishment drop below sustenance levels and either get eliminated from the population or get with the program.

The conclusion also mentions that “[s]ometimes it is argued that it is easier to punish people than to reward them” and that the presented results contradict this. I think the reason that punishment is argued to be easier is that in general its cost is lower (at least in the short term). The experiment doesn’t reflect this. Whereas it may sometimes be the case that punishing someone has a cost (although fining someone will give you his money), it is likely not nearly as high as rewarding that person. Normally when you want to reward someone $12, it means you lose $12, not $4.

In cases where reward and punishment have the same cost, I expect people would choose reward much more often, simply because it improves relations and makes you feel better about yourself.

This whole “carrot or stick” thing is wrong, by the way. It’s “carrot and stick”. A carrot and stick approach is one thing – the carrot and stick. The carrot is tied to a string dangling on the end of the stick, so that it can be positioned ahead of the donkey. You can’t do it with just a carrot – you can’t reach far enough to motivate it forward, it’ll just turn round and eat it without moving. The stick is purely to distance the carrot from you and the donkey to make the donkey go.

Yes, most of the world is utterly wrong on this, most of the world thinks the phrase implies a duality with carrot at one end of a scale and stick at the other, but they’re wrong.

I completely agree that rewards will obviously lead to higher payoffs than punishment as long as rewards can effectively stabilize contributions in the public goods game. As is pointed out above, this is because rewards create the possibility of additional benefit, whereas punishments do not. The novel result in our study is that rewards are in fact able to maintain public cooperation as well as punishments – previous studies, which limited reputation and repetition, had found rewards to be relatively ineffective. To quote from the paper:

“In the RN and RNP treatments [where reward is possible], there is the possibility of generating additional income during the targeted interactions. Thus, it follows naturally from [the ability of rewards to stabilize public goods game contributions] that the reward treatments, RN and RNP, generate larger absolute payoffs than the punishment-only treatment, PN. Groups that have the opportunity to reward do better than groups that can only punish. The point we want to make is this: If several targeted interactions can promote cooperation in the public goods game, then those that generate additional positive payoff will result in the best outcomes.”

In terms of the availability/realistic-ness of non-zero sum rewards, you have to remember that these economic games are metaphorical – they are not supposed to literally represent situations where you give up $4 and magically the other person gains $12. Instead, this type of interaction (formalized as the Prisoner’s Dilemma) represents a wide range of situations where you can help someone at a cost to yourself – not necessarily an actual financial cost, could be time/effort/information etc. The basic concept of the public goods game is predicated on the availability of non-zero sum interactions (the 1.6x multiplier on contributions), as is the extremely widely-discussed Prisoner’s Dilemma.

Our reward treatment is exactly equivalent to having each person play a public goods game, and then play a Prisoner’s Dilemma with each other group member (where their actions in the PD are influenced by the other’s action in the public goods game). The existence of these Prisoner’s Dilemma type interactions is generally accepted, and that is what we are capturing in our reward setup.

Could you explain the basis for the values (4,20,1.6,4,12) you’ve assigned to parameters? The max average payoffs in both PD as well as PGG parts (obtained when everyone cooperates) are close to each other at 8 and 12 units. But the max loss per round for PD is much smaller at 4 units (which gives a potential average RoI of 200%), compared to 15 for the PGG (making max average RoI 80%). The max payoff in PD (36) is larger than the max payoff in the PGG (24). Even the max payoff with reputation intact in PD (32) is larger than the max payoff in PGG which requires defection.

Given the relative sizes of these numbers, it’s probably not fair to characterize your results as “rewards stabilizing cooperation in a PGG”. In fact, the iterated PD with your parameters might produce stable co-operation in and of itself. If so, your results might best be described as cooperation in PGG riding piggyback on cooperation in PD.

We used the Public Goods Game (PGG) multiplier of 1.6 so as to be comparable to previous PGG studies – 1.6x has become a sort of standard for 4 player PGGs.

We used a 1-to-3 cost-to-effect ratio for reward and punishment, again to be consistent with previous punishment experiments, many of which have used a 1:3 punishment technology.

We chose the absolute values of -4 and +12 for rewards such that the total payoff received in a PGG round of all cooperators (32 per player) was greater than, but not so much greater than, the total payoff received when a player rewarded, and was rewarded by, the 3 others in his group (24).

It’s most likely true that if we had just had people play the iterated PD separate from the PGG, we would have seen stable cooperation in the PD. Getting cooperation in an iterated PD is easy because of reciprocity – if you don’t cooperate with me, I won’t cooperate with you in the future.

What makes the PGG so challenging is that you can’t have that kind of reciprocity – if some players contribute and others don’t, you can’t selectively withhold your contributions from the free-riders. Adding punishment (or reward) fundamentally changes this, because now targeted action is possible.

In our experiment, it seems that the actions in the reward round were largely based on actions in the PGG, and not the other way around. We say this based on the conditional probabilities of reward shown in the supplementary materials Figure S3 – average or above average contributors in the PGG were much more likely to be rewarded than below average contributors. If PGG behavior was tagging along with the PD, we would not expect to see this type of relationship.

I really want to thank Dave for taking the time to answer questions here. Very few scientists ever make the effort to engage with the public about their research like this and it’s only to be encouraged.

@Dave
Hi Dave, I don’t have access to Science at home so I can’t dig through the data, but I have a question: what were the total payouts for each version of the game, expressed as a fraction of the total possible payouts for each game when all players use the optimal strategy? Wouldn’t this be a better measure of cooperation/strategy than just the raw scores?

Here’s the full Figure 1 comparing contributions and payoffs between the different treatments (Ed showed two of the 3 panels in his posting above), along with the accompanying text. We examined contributions, % of maximum possible payoff, and actual absolute payoff.

“Figure 1A shows the average contribution to the public goods game in each round. Consistent with previous findings, we observe that the average contribution declines in the control experiment but stays high in the punishment treatment, PN. However, we also observe that the two other treatments, RN and RNP, are equally effective in maintaining cooperation in the public goods game. Therefore, it is not punishment per se that is important for sustaining contributions but rather the possibility of targeted interactions. This option is present in all three treatments but absent in the control experiment.

Figure 1B shows the percentage of the maximum possible payoff achieved in each round. The maximum payoff is obtained for full cooperation in the public goods game, no punishment use in the PN treatment, and full rewarding in the targeted rounds of the RN and RNP treatments. All three treatments in which targeted interactions are possible outperform the control after an initial period of adjustment. We again find that reward works as well as punishment, with no significant difference in percentage of maximum possible payoff between the three targeted treatments.

Figure 1C shows the average payoff in each round, summed over the public goods game and the targeted interaction. In the RN and RNP treatments, there is the possibility of generating additional income during the targeted interactions. Thus, it follows naturally from Fig. 1B that the reward treatments, RN and RNP, generate larger absolute payoffs than the punishment-only treatment, PN. Groups that have the opportunity to reward do better than groups that can only punish. The point we want to make is this: If several targeted interactions can promote cooperation in the public goods game, then those that generate additional positive payoff will result in the best outcomes.”

Also, I should point out that the other main result of the paper was that in the treatment where both reward and punishment were possible, (i) groups which chose to reward contributors earned much higher overall contributions than groups which did not reward as much, whereas (ii) groups which chose to punish free-riders saw no such benefits – the probability of punishing free-riders had no significant relationship with overall contribution level. So when both options are available, we found reward to be effective and punishment not.