Split or Steal? Cooperative Behavior When the Stakes Are Large


Martijn J. van den Assem
Erasmus School of Economics, Erasmus University Rotterdam

Dennie van Dolder
Erasmus School of Economics/Tinbergen Institute, Erasmus University Rotterdam

Richard H. Thaler
Booth School of Business, University of Chicago

Published in Management Science, January 2012 (58:1), 2–20

Abstract: We examine cooperative behavior when large sums of money are at stake, using data from the TV game show Golden Balls. At the end of each episode, contestants play a variant on the classic Prisoner's Dilemma for large and widely ranging stakes averaging over $20,000. Cooperation is surprisingly high for amounts that would normally be considered consequential but look tiny in their current context, what we call a "big peanuts" phenomenon. Utilizing the prior interaction among contestants, we find evidence that people have reciprocal preferences. Surprisingly, there is little support for conditional cooperation in our sample. That is, players do not seem to be more likely to cooperate if their opponent might be expected to cooperate. Further, we replicate earlier findings that males are less cooperative than females, but this gender effect reverses for older contestants because men become increasingly cooperative as their age increases.

JEL: C72, C93, D03

Version: September 2011 (first draft: April 2010)

Keywords: natural experiment, game show, prisoner's dilemma, cooperation, cooperative behavior, social behavior, social preferences, reciprocity, reciprocal behavior, context effects, anchoring

Postal address for manuscript correspondence: Erasmus University of Rotterdam, P.O. Box 1738, 3000 DR, Rotterdam, the Netherlands.

We are grateful to Guido Baltussen, Han Bleichrodt, Uri Gneezy and the anonymous reviewers for their many constructive and valuable comments. We thank Endemol UK and in particular Tara Ali and Tom Blakeson for providing us with information and recordings of Golden Balls, and Shirley Kremer, Thomas Meijer and Sanne van Bemmel for their skillful research assistance. The paper benefited from discussions with seminar participants at the Erasmus University of Rotterdam and Tilburg University, and with participants of the Decision & Uncertainty Workshop 2010 at HEC Paris, BDRM 2010 Pittsburgh, SABE 2010 San Diego, and SPUDM 2011 Kingston upon Thames. We gratefully acknowledge support from the Erasmus Research Institute of Management, the Netherlands Organisation for Scientific Research (NWO), and the Tinbergen Institute.

Cooperation is vital for the functioning of society, and the organizations and communities that form its fabric. Not surprisingly, cooperative behavior is the focus of many studies across a wide range of scientific disciplines, including psychology (Dawes, 1980; Dawes and Messick, 2000), sociology (Marwell and Ames, 1979, 1980; Raub and Snijders, 1997), economics (Ledyard, 1995; Fehr and Gächter, 2000a; Fischbacher and Gächter, 2010), political science (Ostrom, Walker and Gardner, 1992) and biology (Gardner and West, 2004; West, Griffin and Gardner, 2007). The key question in this literature is why humans cooperate even in situations in which doing so is not in line with their material self-interest.

While cooperation is ubiquitous in social life and an important topic for all kinds of economic interaction, field data rarely allow for a clean discrimination among competing theories. Because carefully designed laboratory experiments do allow for such rigorous comparisons, laboratory experiments have provided numerous important insights into cooperative behavior, and the resulting rich literature forms the basis of most of our knowledge on human cooperation.

Still, laboratory settings inevitably have limitations that some argue may hinder the generalization of findings to situations beyond the context of the lab (Levitt and List, 2007, 2008). Subjects are often volunteering students who thus constitute a non-random sample of the population at large. Also, they generally have less familiarity with decision tasks in the laboratory than with those in everyday life, no opportunity to seek advice from friends or experts, and they know that their behavior is examined in detail. From an economic perspective, another obvious drawback to lab studies is that the financial stakes employed tend to be relatively small.
Even those experiments that utilize relatively large payoffs do not involve amounts in excess of a few hundred dollars (e.g., Hoffman, McCabe and Smith, 1996a; List and Cherry, 2000; Carpenter, Verhoogen and Burks, 2005), raising the question of to what extent findings will generalize to situations of significant economic importance. One solution is to perform experiments in low-income countries, where small nominal amounts carry a larger value. In the domain of social interaction, such experiments are, for example, employed by Slonim and Roth (1998), Cameron (1999), Fehr, Fischbacher and Tougareva (2002), Munier and Zaharia (2002), Johansson-Stenman, Mahmud and Martinsson (2005), and Kocher, Martinsson and Visser (2008). While this might appear an ideal approach, it has its own drawbacks. Culture, for example, has been shown to play an important role in social interaction (Henrich et al., 2001, 2004; Herrmann, Thöni and Gächter, 2008), making it difficult to generalize findings from low-income countries.¹ And while the stakes in these experiments are larger than commonly employed, they still rarely exceed a few months' wages.

¹ Interestingly, though not generally acknowledged, this argument at the same time questions the universal applicability of the many findings from higher-income countries, including ours. We refer to Henrich, Heine and Norenzayan (2010) for a discussion on this issue.

In the current paper, we study cooperative behavior using another source of data, namely the behavior of contestants on the British TV game show Golden Balls. Although the game show setting is an unusual environment, it has the benefit of employing large and varying stakes. Furthermore, game shows are markedly different from laboratory experiments in terms of participant selection, scrutiny, and familiarity of participants with the decision task. Combined with the strict and well-defined rules, game shows can therefore provide unique opportunities to investigate the robustness of existing laboratory findings.

Because game shows are often competitive in nature and ask contestants to make risky or strategic choices, it is not surprising that they have mostly been used to study decision making under risk (e.g., Gertner, 1993; Metrick, 1995; Beetsma and Schotman, 2001; Post et al., 2008) or strategic reasoning (e.g., Bennett and Hickman, 1993; Berk, Hughson and Vandezande, 1996; Tenorio and Cason, 2002). More recently, however, game shows have also been used to study social interaction, in particular discrimination (Levitt, 2004; Antonovics, Arcidiacono and Walsh, 2005) and cooperative behavior (List, 2004, 2006; Belot, Bhaskar and van de Ven, 2010a; Oberholzer-Gee, Waldfogel and White, 2010). The current paper is in the latter category.

In the final stage of Golden Balls, contestants make a choice on whether or not to cooperate in a variant of the famous Prisoner's Dilemma. In particular, the two final contestants independently have to decide whether they want to "split" or "steal" the jackpot. If both contestants choose "split", they share the jackpot equally. If one chooses "split" and the other chooses "steal", the one who steals takes the jackpot and the other gets nothing. If they both "steal", both go home empty-handed. On average, the jackpot is over $20,000.
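The payoff rule of the final can be written down directly. The following is a minimal sketch in Python, assuming that a player cares only about her own monetary payoff; the function and constant names are illustrative and are not taken from the study. It encodes the three outcomes and checks how "steal" compares with "split" against each possible choice of the opponent.

```python
SPLIT, STEAL = "split", "steal"


def payoff(own, other, jackpot):
    """Own monetary payoff in the final (illustrative helper, not from the paper)."""
    if own == SPLIT and other == SPLIT:
        return jackpot / 2      # both split: the jackpot is shared equally
    if own == STEAL and other == SPLIT:
        return jackpot          # the lone stealer takes everything
    return 0.0                  # split against steal, or both steal


def weakly_dominates(a, b, jackpot):
    """True if choice a never pays less than b and pays more at least once."""
    diffs = [payoff(a, o, jackpot) - payoff(b, o, jackpot) for o in (SPLIT, STEAL)]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


def strictly_dominates(a, b, jackpot):
    """True if choice a pays strictly more than b against every opponent choice."""
    return all(payoff(a, o, jackpot) > payoff(b, o, jackpot) for o in (SPLIT, STEAL))
```

For any positive jackpot, `weakly_dominates(STEAL, SPLIT, jackpot)` holds while `strictly_dominates(STEAL, SPLIT, jackpot)` does not: against an opponent who steals, both choices pay nothing.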
The variation in the jackpot is large: from a few dollars to about $175,000. If we assume that each player only cares about maximizing her immediate financial payoff, the choice problem in Golden Balls can be labeled as a "weak" form of the Prisoner's Dilemma.² Where in the classic form of the Prisoner's Dilemma defecting strictly dominates cooperating, here defecting only weakly dominates cooperating: choosing "steal" always does at least as well as, and sometimes better than, choosing "split".

² Rapoport (1988) introduced this terminology. For the sake of brevity, we will simply use the term Prisoner's Dilemma to refer to the game studied here.

Of course, contestants may consider other factors aside from their own monetary payoff when deciding which choice to make. Much experimental research suggests that people have social preferences in the sense that the payoffs to others enter their utility functions. For discussions, see, for example, Fehr and Gächter (1998, 2000b), Fehr and Schmidt (1999), Bolton and Ockenfels (2000), Charness and Rabin (2002), Camerer (2003), Fehr and Gintis (2007), and Cooper and Kagel (2009).

The fact that the show is aired on TV of course creates another set of rather special circumstances that could affect our results, although there is little existing theory to suggest what the effect of a large TV audience would be. One might argue that players would not want to be seen as a jerk on national television and so would be more likely to cooperate, but one can also argue that a player would not want to be seen as a sucker (or someone who cannot detect the weakly dominant solution to a simple game) in public.³ The public nature of the choice could also magnify subtle features created by the fact that the game is a weak form of the Prisoner's Dilemma. Specifically, if a player thinks that the other player will steal, she might decide to split on the grounds that it costs her nothing to appear "nice" on TV. These complications do not render our results uninteresting, but do need to be incorporated in any attempt to evaluate how our results should be interpreted in the context of existing theories and experimental findings on cooperation.

Although Golden Balls is unique in its format, the show shares the Prisoner's Dilemma element with a few game shows from other countries including Friend or Foe (US, aired in […]) and Deelt ie 't of deelt ie 't niet? (in English: "Will he share or not?"; Netherlands, 2002). These two shows have been studied in four different papers. List (2004, 2006) and Oberholzer-Gee, Waldfogel and White (2010) analyze data from Friend or Foe, and Belot, Bhaskar and van de Ven (2010a) use the Dutch show. List focuses on the effects of demographic variables such as gender, race and age. Studying the same game show, Oberholzer-Gee, Waldfogel and White compare the behavior in the first season of the show with later seasons in which the contestants have had a chance to observe prior episodes.
Finally, Belot, Bhaskar and van de Ven find that making a promise to cooperate prior to the decision is positively related to cooperation if the promise was voluntary, but not if it has been elicited by the host.

In this paper we replicate many of the earlier investigations, but also undertake several novel analyses that are possible due to some unique features in the format of Golden Balls. The way the stakes are determined and the very wide range they cover provide the basis for new insights into the effect of stakes and context. The dynamic setting of the show enables us to look at reciprocity in cooperative behavior, and also at the effect of earlier deceitful behavior.

³ Most studies related to the issue of observability indicate that people display more other-regarding behavior when they are or feel more subject to public scrutiny (see, for example, Hoffman, McCabe and Smith, 1996b; Rege and Telle, 2004; Haley and Fessler, 2005), but there is also contradictory evidence (Dufwenberg and Muren, 2006). Kerr (1999) suggests that the effect depends on conditions related to social expectations and sanctions.

In our sample, individual players on average cooperate 53 percent of the time. Although this rate is similar to earlier findings from the experimental literature (Dawes and Thaler, 1988; Sally, 1995), direct comparisons are hampered by systematic differences in the stakes, the visibility of decisions, characteristics of the subjects, and preceding opportunities for communication or other social interaction.

We find only limited support for the notion that cooperation will decrease if the stakes get significant. The cooperation rate is unusually high when the stakes lie in the low range of our sample, perhaps because contestants think that for so little money (relatively speaking) they might as well cooperate in public. Cooperation does decline with the stakes for stakes below the median, but plateaus at around 45 percent for medium to large amounts.

The high cooperation rate for relatively small stakes suggests that context can convert money amounts that would normally be considered consequential or "big" into amounts that are perceived to be small, just "peanuts". This idea is supported by our finding that cooperation is not only based on the actual stakes but also on what the jackpot potentially could have been. This effect is especially pronounced for those who appeared in the earlier episodes of the show and had little or no opportunity to watch the show on TV and learn what sizes are large or small in the context of this game.

A special property of Golden Balls is the interaction that occurs among contestants prior to the final. Utilizing the dynamic setting, we find evidence that contestants show some tendency toward reciprocity. Among contestants whose final opponent has attempted to vote them off the show, the propensity to cooperate is significantly lower. Contestants do not appear to reciprocate against opponents who have lied earlier in the game. Lying seems to be accepted here, similar to bluffing in poker.
A possible reason for this is that, in contrast to a vote cast against someone, lying is a defensive act that is not aimed at anyone in particular.

Surprisingly, we find little evidence that contestants' propensity to cooperate depends positively on the likelihood that their opponent will cooperate. While an opponent's promise to cooperate is a strong predictor of her actual choice, contestants appear not to be more likely to cooperate if their opponent might be expected to cooperate.

Our final main result is that young males cooperate less than young females. This difference decreases and even reverses as age increases and men become increasingly cooperative.

The paper proceeds as follows. Section I describes the game show in more detail, discusses our data and presents descriptive statistics. Sections II–VI cover the various possible factors behind cooperative behavior included in our analysis. Each of these sections provides related literature and other background, explains the variables that we use, and discusses the results of our Probit regression analyses. Section VII concludes.

I. Game Show and Data

Description of Golden Balls

The TV game show Golden Balls was developed by the Dutch production company Endemol. It debuted on the ITV network in the United Kingdom in June 2007 and ran until December […]. Each episode consists of four rounds and starts with four contestants, usually two men and two women.

In Round 1, twelve golden balls are randomly drawn from the "Golden Bank", a lottery machine containing one hundred golden balls. Each of these balls has a hidden cash amount inside, ranging from a minimum of £10 to a maximum of £75,[…].⁴ Contestants know that this is the range for the amounts in the balls, but they do not know the precise distribution (though this becomes clearer over time as the show is aired). At a later stage of the game, a subset of the cash balls drawn will contribute to the final jackpot. Also, four balls hiding the word "killer" inside are mixed with the twelve cash balls. Killer balls are undesirable in a way we will explain below. From the sixteen balls, each contestant receives four balls at random. For each contestant, two are placed on the front row with their contents (either a cash amount or the word "killer") openly displayed; the other two are placed on the back row and their contents are known by the particular contestant alone. (Poker players can think of this as two "up" cards and two "down" cards.) The contestants now have to decide by vote which player will be kicked off the show. Because the balls of voted-off contestants are removed from the game and the remaining contestants' balls matter for the ultimate jackpot, there is a strong incentive to retain players with high value balls and kick off players with low value balls or killer balls.
Before the voting starts, each contestant publicly announces the contents of the balls on her back row (knowing that these values will subsequently be revealed, but only after the vote). Then, the four contestants together have an open discussion in which they can voice their evaluation of other players' statements and their opinion of who should be voted off. Each player then anonymously casts a vote against one specific opponent. After the votes are tallied, the player who received the most votes leaves the game.⁵ Lastly, all the players reveal the values of their back row balls, and differences between the actual values and the previous claims are noted.

⁴ Values in British pounds can be translated into US dollars using a rate of $1.75 per pound, an approximate average of the exchange rate during the period in which the show ran.

In Round 2, two additional cash balls from the lottery machine and one extra killer ball are added to the twelve remaining balls from Round 1. The fifteen balls are then randomly allocated to the three contestants. Each of them receives two balls on her front row and three on her (hidden) back row. Similar to Round 1, contestants make (cheap talk) statements on the balls on their back row, a round of banter follows, votes are cast anonymously and tallied, the player who receives the most votes leaves the game, and all hidden ball values are revealed.⁶ Two players and their ten balls proceed to Round 3.

Round 3 determines the size of the final jackpot. First, one additional killer ball is mixed with the ten balls from Round 2. Then five of the balls are selected sequentially at random. If a ball selected is a cash ball, its face value is added to the jackpot. If a killer ball is drawn, the current cumulative jackpot is divided by ten. For example, if the first two balls were £50,000 and £1,000 and the third is a killer ball, the level of the jackpot is reduced from £51,000 to £5,100. A killer ball does not affect the jackpot contribution of cash balls drawn thereafter. If the fourth and fifth ball in our example were another killer ball and £25,000, respectively, then the actual jackpot would be £25,510. Note that this round is a completely stochastic process, and that contestants have full information on the balls that are in play. Before the five balls are drawn, special attention is always paid to the highest possible jackpot (that is, the sum of the five largest cash values). This value, and the number of killer balls, are explicitly stressed by the game show host.

After Round 3 has determined the jackpot, in the fourth and final round the contestants play a variant of the Prisoner's Dilemma.
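The Round 3 jackpot rule can be traced step by step. The sketch below is a minimal illustration in Python; the function names and the `KILLER` marker are our own labels and not part of the show or the data set. It applies the rule to an ordered sequence of drawn balls and also computes the highest possible jackpot as the sum of the five largest cash values in play.

```python
KILLER = "killer"  # illustrative marker for a killer ball


def round3_jackpot(draws):
    """Apply the Round 3 rule to the balls in the order they are drawn."""
    jackpot = 0.0
    for ball in draws:
        if ball == KILLER:
            jackpot /= 10       # a killer ball divides the running total by ten
        else:
            jackpot += ball     # a cash ball adds its face value
    return jackpot


def highest_possible_jackpot(balls_in_play, n_draws=5):
    """Sum of the n_draws largest cash values among the balls in play."""
    cash = sorted((b for b in balls_in_play if b != KILLER), reverse=True)
    return sum(cash[:n_draws])


# The example from the text: 50,000 and 1,000, then two killer balls, then 25,000.
example = round3_jackpot([50_000, 1_000, KILLER, KILLER, 25_000])
```

Because a killer ball only scales the amount accumulated so far, the order of the draw matters: the same five balls with the 25,000 ball drawn first would leave a jackpot of 760.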
Each contestant receives two golden balls. One of the balls says "split" and the other says "steal" on the inside. The contestants then simultaneously have to decide which ball they want to play. If both choose "split", they share the jackpot equally. If one chooses "split" and the other chooses "steal", the contestant who steals takes the whole jackpot and the other gets nothing. If they both choose "steal", both go home empty-handed. Before each contestant makes her actual decision, a brief time period is reserved for a discussion between the players in which they can make non-binding promises, ask about intentions, or attempt to get assurances of cooperative behavior. This is the final round of cheap talk. Importantly, the contestants have not met before the game starts, and have no opportunity before or during the show to make any kind of collusive agreement.

⁵ If two contestants receive two votes each, their opponents openly discuss who they want to keep in the show. If they cannot decide, a decision is made at random. If all four contestants receive one vote each, contestants openly attempt to form a coalition against one specific contestant. Again, if they cannot decide, a decision is made at random.

⁶ The procedure in the case of a tie is similar to that in Round 1. Tie-breaking occurs by discussion or by random draw if no agreement is reached.

A relevant question is how the contestants are selected. A spokeswoman of Endemol informed us that anyone can apply to be on Golden Balls by submitting a detailed application form. Shortlisted contestants are then invited to an audition in order to determine their skills at playing the game, their character and their suitability to appear on a TV show such as Golden Balls. Producers watch tapings of these auditions and put together shows such that, according to the producers, "a good mix of characters is represented on each show". Thus while the contestants are not a random sample of society, the selection process does not seem to create any obvious confounds with the analyses we conduct here.

Data and descriptive game characteristics

We examine the "split" and "steal" decisions of 574 final contestants appearing in 287 episodes aired between June 2007 (when the show was introduced) and December […]. During this period, 288 episodes were aired, and, at the time of writing, no further episodes were aired thereafter. Recordings from the show and additional information such as recording and airing dates were kindly provided by Endemol's local production company Endemol UK. The one missing episode could not be supplied because it was not present in their archives.⁷

For each episode we collected data on the relevant observables in the show, such as the hidden and visible ball values, statements made by contestants, the votes, the jackpot size, and the decision to split or steal at the end. Some variables were estimated based on contestants' physical appearance and on information provided in the introductory talk and other conversations during the show.

Table 1 displays some descriptive characteristics of the game. Cash balls drawn from the lottery machine during the first two rounds have a mean value of £5,654 and a median of £1,500.
Clearly, the distribution is positively skewed. The mean value of the cash balls taken to Round 3 is £6,775, which is statistically significantly greater than the average value of all cash balls in the show, implying that the contestants are successful in using their votes to keep high-value balls in play and eliminate small ones. The average number of killer balls in the game at the start of Round 3 is 3.14, significantly less than the 3.67 we would statistically expect if voting were random (a random vote leaves 4 × 12/16 = 3 killer balls after Round 1 and (3 + 1) × 10/15 ≈ 2.67 after Round 2 in expectation, to which Round 3 adds one more). Contestants thus also seem successful in eliminating killer balls from the game. Unreported analyses of contestants' voting behavior clearly show that contestants indeed try to vote off the opponents that have the worst set of balls on their front row.

⁷ Sixteen episodes in our data set feature returning contestants. In twelve of these, players who previously had lost in the final (opponents stole while they themselves chose to "split") get a second chance. In four, "unlucky" players who had been voted off in the first game round receive a second chance. We do not find that returning contestants behave differently, and, unless stated otherwise hereafter, excluding them from our analyses does not materially affect our results.

At the start of Round 3, special attention is paid to the highest possible jackpot. Dependent on the cash balls and killer balls taken to this stage, this maximum varies between £5,000 and £168,100, with a mean of £51,493 and a median of £41,150. The actual jackpot for which contestants play the Prisoner's Dilemma game is generally considerably smaller due to the skewed distribution of cash ball values and the effect of killer balls, but still has a mean size of £13,416 and a median of £4,300. These amounts are many times the amounts typically used in laboratory experiments, and also large sums relative to the median gross weekly earnings of £397 in the UK in April 2009 (Source: Statistical Bulletin, Office for National Statistics, 12 November 2009). About half of the time, the jackpot in our show exceeds three months of median UK earnings, and 21 percent of the contestants decide over a jackpot that is even larger than a median annual salary (the third quartile in our sample is at £18,350). The stakes are also large compared to the two other game shows employed in earlier analyses of cooperative behavior: in Friend or Foe, the average is about $3,500 (List, 2004, 2006; Oberholzer-Gee, Waldfogel and White, 2010), and for the Dutch show, Belot, Bhaskar and van de Ven (2010a) report a median of €1,683.

The wide range of the jackpot in our sample is caused by its random construction, by the highly skewed distribution of cash ball values, and by the effect of killer balls. The largest jackpot was played for in an exhilarating episode from March 2008; trainee accountant Sarah stole the entire jackpot of £100,150 from collection agent Stephen.⁸

For the jackpot to be awarded, at least one player needs to cooperate.
We find that 52.8 percent of the contestants decide to "split". While this might seem high, the rate is actually remarkably similar to earlier experimental evidence (see, for example, Sally, 1995). In our sample, both players split the jackpot 31 percent of the time, one splits while the other steals in 44 percent of the shows, and in the remaining 25 percent of the shows both players steal. The efficiency rate in terms of the percentage of jackpots that is actually awarded thus amounts to 75 percent. The efficiency rate obtained by dividing the sum of earnings across all episodes by the sum of all jackpots is slightly lower at 72 percent. (The difference in efficiency results from contestants' lower propensity to cooperate when the stakes are larger; we explore this effect in detail later.)

These simple statistics are a first indication that contestants do not condition their behavior on that of their opponent. Given the average cooperation rate, we would expect to observe (split, split) in 28 percent of the cases (0.528² ≈ 0.28) and (steal, steal) 22 percent of the time (0.472² ≈ 0.22) if the individual decisions in our sample were randomly matched. Although the actual percentages are higher (31 and 25), the differences are relatively small considering that each pair of contestants operates under highly similar conditions (same jackpot, same potential jackpot, and many shared unobserved conditions).

⁸ A video clip of this episode is widely available on the Internet, for example through YouTube.

Table 1: Selected Game Show Characteristics
The table shows selected characteristics for the British TV game show Golden Balls, extracted from our sample of 287 episodes. Cash ball (overall) is the monetary value of a cash ball drawn from the lottery machine in the first or second round of the game. Cash ball (Round 3) is the monetary value of a cash ball that is in play at the start of the third round. No. of killer balls (Round 3) describes the number of killer balls that are in play at the start of the third round. Potential jackpot (Round 3) is the jackpot size that is attained during the third round if the best-case scenario would occur. Jackpot describes the actual size of the jackpot. Decision is a contestant's decision in the Prisoner's Dilemma at the end of the show, with a value of 1 for "split" and 0 for "steal". Prize won (if non-zero) records the take home prize for a contestant who made it to the final (if she did not leave empty-handed). All monetary values are in UK Pounds (£1.00 ≈ $1.75). Entries shown as … are not legible in this copy.

                                 N       Mean     Stdev   Min     Median   Max
Cash ball (overall)              4,018   5,654    …       …       1,500    …
Cash ball (Round 3)              2,257   6,775    …       …       …        …
No. of killer balls (Round 3)    …       3.14     …       …       …        …
Potential jackpot (Round 3)      …       51,493   …       5,000   41,150   168,100
Jackpot                          …       13,416   …       …       4,300    100,150
Decision (split=1)               …       …        …       …       …        …
Prize won                        574     4,851    …       …       39       …
Prize won if non-zero            303     9,189    …       …       2,175    …

On average, a finalist goes home with £4,851, but the median prize is only £39 because 47 percent of the contestants get nothing. The 303 contestants who end up with a non-zero prize take home £9,189 on average, with a median of £2,175.
It is worth noting that had we run this show as an experiment ourselves, the total costs in subject payoffs alone would have been £2.8 million.

Modeling the decision to split or steal

In the following sections, we will analyze the decisions to "split" or "steal" the jackpot using a binary Probit model. We assume that when people enter the final round they have a latent propensity to split $y^*$, where $y^* \in (-\infty, \infty)$. Furthermore, we assume that this latent propensity is a linear function of personal demographic characteristics $x$ and context characteristics $z$, in the form $y^* = x'\beta + z'\gamma + u$, where $\beta$ and $\gamma$ are parameter vectors and $u$ represents an unobserved stochastic component. We do not observe the latent propensity to split directly, but only the actual decision $y$, where $y = 1$ if a contestant chooses "split" and $y = 0$ if a contestant chooses "steal". We impose the observation criterion $y = 1(y^* > 0)$, where $1(\cdot)$ is the indicator function taking the value of 1 if $y^* > 0$ and 0 otherwise. Assuming that the stochastic component has a standard normal distribution, or $u \sim N(0,1)$, leads to the binary Probit model of the form

$$\Pr(y = 1 \mid x, z) = \Phi(x'\beta + z'\gamma),$$

where $\Phi(\cdot)$ is the standard normal cumulative distribution function. Using this framework we estimate the parameter vectors $\beta$ and $\gamma$ using maximum likelihood estimation. We allow for the possibility that the decisions of contestants within the same episode are correlated by performing a clustering correction on the standard errors (see, for example, Wooldridge, 2003).

Because coefficients in a Probit model do not have an immediate intuitive economic meaning due to the inherent nonlinearities, we will follow the common approach of reporting marginal effects instead. More specifically, the marginal effects we report apply to the medians of the explanatory variables, with two exceptions. To highlight some of the interaction effects that we find in our data, we set Age to be 20 and Transmissions (the number of times the show has aired at the time of recording) to be 0. The resulting "representative agent" is a 20-year-old white female without higher education, who lives in a relatively small town and plays the final of our game for a jackpot of £4,300, which potentially could have been £41,150. For dummy variables we consider the effect of a discrete change from 0 to 1. As noted by Ai and Norton (2003), the traditional way of calculating marginal effects and their standard errors is not valid for interaction terms, and we therefore apply the alternative method they propose. For the sake of consistency, we report significance levels that apply to the marginal effects, though these levels do not differ materially from the significance levels for the original regression coefficients. Original coefficients and their significance levels are available from the authors upon request.

II. Demographic Characteristics

First, we investigate how various demographic characteristics are related to the propensity to cooperate. Our later analyses include these demographic variables as control variables.
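Since the regression results are reported as marginal effects rather than Probit coefficients, it may help to see how the two relate. The following is a minimal sketch in Python using only the standard library. The coefficient values are hypothetical numbers chosen purely for illustration; they are not estimates from this paper, and the variable names are our own. For a continuous regressor the marginal effect at a given point is the standard normal density at the index times the coefficient; for a dummy variable it is the change in the predicted probability when the dummy moves from 0 to 1. The correction of Ai and Norton (2003) for interaction terms is not covered here.

```python
import math


def std_normal_cdf(v):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def std_normal_pdf(v):
    """Standard normal density function."""
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)


def probit_index(coefs, values):
    """Linear index: sum of coefficient times regressor value."""
    return sum(coefs[name] * values[name] for name in coefs)


def split_probability(coefs, values):
    """Pr(y = 1 | regressors) under the Probit model."""
    return std_normal_cdf(probit_index(coefs, values))


def marginal_effect_continuous(coefs, values, name):
    """Density at the index times the coefficient of the named regressor."""
    return std_normal_pdf(probit_index(coefs, values)) * coefs[name]


def marginal_effect_dummy(coefs, values, name):
    """Change in Pr(y = 1) for a discrete change of the dummy from 0 to 1."""
    at_one = {**values, name: 1}
    at_zero = {**values, name: 0}
    return split_probability(coefs, at_one) - split_probability(coefs, at_zero)


# HYPOTHETICAL coefficients, for illustration only (not results from the paper).
coefs = {"const": 0.0, "male": -0.3, "age": 0.01}
# Evaluation point: a 20-year-old female, mirroring the age of the representative agent.
agent = {"const": 1, "male": 0, "age": 20}

effect_male = marginal_effect_dummy(coefs, agent, "male")
effect_age = marginal_effect_continuous(coefs, agent, "age")
```

Evaluating the effects at one fixed point, as here, is what makes the choice of the representative agent matter: the same coefficient yields a different marginal effect at a different index value.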
In previous studies examining the relations between demographic characteristics and cooperative behavior, most attention has been directed to gender. Psychologists have a long history of investigating the relation between gender and behavior, and, over the past decade, economists have become increasingly interested in gender effects as well. The standard finding is that women act more pro-socially than men, but the reverse is also found. 9 For contextual settings similar to ours, List (2004, 2006), Oberholzer-Gee, Waldfogel and White (2010) and Belot, Bhaskar and van de Ven (2010a) report that women are more cooperative than men, although some results are only marginally significant.

9 One possible cause for the varying results is that males and females respond differently to specific contextual settings of the experiments (Croson and Gneezy, 2009). For example, if women are more risk averse than men, this may lead to different social behavior in situations in which risk is involved (Eckel and Grossman, 2008).

For other demographic characteristics, the experimental findings are also mixed. Carpenter, Daniere and Takahashi (2004), for example, run public good experiments with symbolic but costly punishment in Bangkok and Ho Chi Minh City. They find that in Bangkok males and higher educated subjects contribute more, while there is no significant age effect. The same experiment in Ho Chi Minh City, however, shows the opposite findings: males and higher educated subjects cooperate less and age increases cooperation. Gächter, Herrmann and Thöni (2004) find no influence of background characteristics in a one-shot public good experiment with Russian subjects.

In order to add to this literature we explore the effect of various demographic characteristics on cooperative behavior. We employ the following set of variables:

- Gender is a dummy variable indicating whether a contestant is male (1) or female (0).
- Age is a continuous variable measuring the contestant's age in years. In many instances the contestant's age is not explicitly mentioned during the show. In these cases we estimate age on the basis of physical appearance and other helpful information such as the age of children.
- Race is a dummy variable indicating whether a contestant is white (1) or non-white (0). We apply such a broad distinction because the large majority of contestants are white.
- City and London are two dummy variables that are constructed in order to distinguish contestants who live in major urban areas from those who reside in more rural surroundings. Contestants' city or county of residence is always an integral part of the introductory talk. City indicates whether a contestant lives in a large urban area (1) or not (0). We define a large urban area as a conurbation with a population exceeding 250,000 inhabitants. 10 For some contestants we only know their region and not their exact town or city; we then assume a small domicile. London indicates whether a contestant lives inside (1) or outside (0) the Greater London Urban Area.
- Education is a dummy variable for the level of education and differentiates between those with at least a bachelor's degree (1) and those without (0). Players generally do not talk about their education during the show. We therefore estimate a contestant's level of education on the basis of her occupation, which is always explicitly mentioned when she is introduced, and on the basis of other information given in talks. Contestants who are currently enrolled in higher education and people whose job title suggests work experience equivalent to the bachelor level or higher are included in the higher education category. From the information that we have about each contestant, the proper binary values are generally clear.
- Student is a dummy variable indicating whether the contestant currently is a higher education (undergraduate or postgraduate) student (1) or not (0).

10 For England and Wales, the population data and the definitions of conurbations are taken from the UK Office for National Statistics (www.statistics.gov.uk). Similar information for Scotland, Northern Ireland and Ireland is from the General Register Office for Scotland (www.gro-scotland.gov.uk), the Northern Ireland Statistics and Research Agency (www.nisra.gov.uk) and the Central Statistics Office Ireland (www.cso.ie), respectively.

Estimates for Age and Education are based on the independent judgments of three research assistants, where each value is based on the assessments of two of them. When the estimates for Education were different, we decided on the most appropriate value ourselves. For Age we took the mean of the two judgments, and included our own assessment as a third input if the values of the coders diverged by more than five years.

We have also attempted to collect data on contestants' marital status and the existence of children. These topics were, however, not systematically discussed in the program and values would therefore be unknown for the large majority of our contestants.

Table 2 summarizes all the variables that are included in our analyses, including the demographic characteristics. Table 3, Model 1 shows the regression results for a model that includes demographic characteristics only. To be able to distinguish both general gender and age effects as well as a possible interaction effect, the interaction of gender and age is also included. The results show that, relative to our representative 20-year-old female agent, young males are 22 percentage points less likely to cooperate (p = 0.002). In line with past results by List (2004) and Carpenter, Connolly and Myers (2008), this difference disappears when age increases.
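The reconciliation rule for the coded Age variable described above can be sketched in a few lines of Python. This is an illustrative reading rather than the authors' procedure: the function name `reconcile_age` and its parameters are hypothetical, and the sketch assumes that the third assessment, when it is needed, enters the mean with equal weight, which the text does not state explicitly.

```python
from statistics import mean
from typing import Optional


def reconcile_age(coder_a: float, coder_b: float,
                  author_estimate: Optional[float] = None,
                  max_gap: float = 5.0) -> float:
    """Combine two coders' age estimates into one value.

    If the coders differ by more than `max_gap` years, a third
    assessment is required and (by assumption) averaged in with
    equal weight; otherwise the mean of the two coders is used.
    """
    if abs(coder_a - coder_b) > max_gap:
        if author_estimate is None:
            raise ValueError("coders diverge by more than max_gap; "
                             "a third assessment is needed")
        return mean([coder_a, coder_b, author_estimate])
    return mean([coder_a, coder_b])


print(reconcile_age(30, 34))      # coders agree closely: plain mean
print(reconcile_age(30, 40, 38))  # coders diverge: third input included
```

A gap of exactly five years does not trigger the third input here, following the wording "more than five years".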
The effect of age is significantly different for males and females (p = 0.001). Women do not become significantly more or less cooperative as age increases (p = 0.422). Men, on the other hand, have a higher propensity to cooperate when they are older: their cooperation rate increases by more than one percentage point per year (p = 0.000; untabulated). 11 Contrary to the two previous studies, we find that the gender difference not only disappears as age increases, but actually reverses; males become significantly more likely to split from age 46 onwards. Figure 1 displays observed cooperation rates at different age levels for males, females and the aggregate, clearly depicting an age effect for men. Further analyses show that there is no evidence of a quadratic age effect, either for men or for women. We have also experimented with specifications where the (semi-)continuous age variable is replaced by a set of dummy variables that represent various age groups. The results are economically and statistically similar.

11 The effect of age for males could be related to increasing dependence on others (van Lange et al., 1997), or to hormonal or neurological changes as men grow older, but we are hesitant to draw conclusions in these directions because we cannot exclude that a generational effect (van Lange et al., 1997; List, 2004) or a wealth effect is (partly) driving our finding.

When it comes to race, we find weak evidence that whites are more likely (by about 13 percentage points) to cooperate than non-whites (p = 0.101; p < 0.10 in the models discussed hereafter). List (2004, 2006) and Oberholzer-Gee, Waldfogel and White (2010) report a similar pattern, yet in more conventional experiments the reverse is often found (see, for example, Cox, Lobel and McLeod, 1991). Since possible but unobservable wealth effects could contribute to this result, it should be interpreted with caution.

Table 2: Summary Statistics

The table shows descriptive statistics for the explanatory variables in our analyses of cooperative behavior based on the decisions of 574 contestants to either split or steal the jackpot in the Prisoner's Dilemma at the end of the British TV game show Golden Balls. Age is the contestant's age measured in years. Gender, Race, City, London, Education and Student are dummy variables taking the value of 1 if the contestant is male (Gender), is white (Race), lives in a conurbation with a population exceeding 250,000 inhabitants (City), is a resident of the Greater London Urban Area (London), has completed or is enrolled in higher education (bachelor's degree or higher) or has equivalent working experience (Education), or is a student (Student), respectively. Actual stakes is the natural logarithm of the size of the jackpot in the Prisoner's Dilemma game. Potential stakes is the natural logarithm of the highest possible jackpot at the start of the third round. Transmissions expresses the number of episodes that had already aired when the current episode was recorded in the studio. Vote received from opp. is a dummy variable taking the value of 1 if the contestant's final opponent has tried to vote her off the program at an earlier stage of the game. Promise is a dummy variable taking the value of 1 if the contestant explicitly promised her opponent to split (or not to steal) the jackpot. Lie Round 1 (Round 2) is a dummy variable taking the value of 1 if the contestant has misrepresented her back row balls either by overstating a cash ball or by hiding a killer ball in the first (second) round. Lie cash (killer) ball Round 1 (Round 2) is a dummy variable taking the value of 1 if the contestant has overstated a cash ball (hidden a killer ball) in the first (second) round. Standard deviations for the two stakes variables and the transmissions variable are calculated across episodes (N = 287) to avoid the effect of clusters at the episode level. All monetary values are in UK Pounds (£1.00 ≈ $1.75).

Columns: Mean, Stdev, Min, Median, Max

Demographic Characteristics
  Age
  Gender (male=1)
  Race (white=1)
  City (large=1)
  London (London=1)
  Education (high=1)
  Student (student=1)
Stakes and Context
  Actual stakes (log)
  Potential stakes (log)
  Transmissions
Reciprocal Preferences
  Vote received from opp. (yes=1)
Expectational Conditional Cooperation
  Promise (promise=1)
Past Deceitful Behavior
  Lie Round 1 (lie=1)
  Lie Round 2 (lie=1)
  Lie cash ball Round 1 (lie=1)
  Lie cash ball Round 2 (lie=1)
  Lie killer ball Round 1 (lie=1)
  Lie killer ball Round 2 (lie=1)

Table 3: Binary Probit Regression Results [1/2]

The table displays results from the Probit regression analyses of contestants' decisions to split (1) or steal (0) the jackpot in the Prisoner's Dilemma at the end of the British TV game show Golden Balls. First (Second) half is a dummy variable taking the value of 1 if fewer (more) than 50 percent of the episodes in our sample had already aired when the current episode was recorded in the studio. Definitions of other variables are as in Table 2. For each explanatory variable, the marginal effect is shown for a representative agent who takes the median value on all variables, except for Age and Transmissions, which are set to 20 and 0, respectively. Standard errors are corrected for clustering at the episode level; p-values are in parentheses.

                                    Model 1    Model 2    Model 3    Model 4    Model 5
Demographic Characteristics
  Age                               (0.422)    (0.345)    (0.311)    (0.352)    (0.403)
  Gender (male=1)                   (0.002)    (0.000)    (0.001)    (0.001)    (0.001)
  Race (white=1)                    (0.101)    (0.082)    (0.091)    (0.089)    (0.065)
  City (large=1)                    (0.396)    (0.359)    (0.335)    (0.405)    (0.439)
  London (London=1)                 (0.402)    (0.348)    (0.400)    (0.444)    (0.406)
  Education (high=1)                (0.062)    (0.053)    (0.050)    (0.049)    (0.058)
  Student (student=1)               (0.888)    (0.983)    (0.923)    (0.884)    (0.933)
  Age x Gender                      (0.001)    (0.001)    (0.001)    (0.001)    (0.001)
Stakes and Context
  Actual stakes (log)               -          (0.000)    (0.000)    (0.000)    (0.000)
  Potential stakes (log)            -          -          (0.139)    (0.006)    -
  Transmissions                     -          -          -          (0.722)    -
  Potential stakes x Transmissions  -          -          -          (0.037)    -
  Second half (second=1)            -          -          -          -          (0.508)
  Potential stakes x First half     -          -          -          -          (0.005)
  Potential stakes x Second half    -          -          -          -          (0.543)
Wald chi2 (df)                      34.87(8)   51.61(9)   52.57(10)  57.87(12)  62.35(12)
Log pseudo-likelihood
Pseudo R2
N
Number of clusters

Figure 1: Age and the Propensity to Cooperate for Males and Females. The figure displays the relative frequency of contestants who decide to split across various age intervals. Bars depict the percentage of cooperators within specific age brackets for males, females and the aggregate, respectively. For each category, the number of contestants is displayed at the bottom of the bar.

Higher educated contestants are about 9 percentage points more cooperative (p = 0.062), although this effect is only marginally significant in the current model and not consistently significant across the various regression models discussed hereafter (0.041 < p < 0.070). Similar to the effect of race, the effect of education could be spurious due to an unobservable wealth effect.

Students are frequently used as subjects in experiments, and the reliance on such a specific subject pool is often criticized. Sears (1986), for example, extensively describes how the use of student subjects might produce misleading or mistaken conclusions about social behavior. It is therefore interesting to investigate whether there is evidence that students behave differently from others, holding other observable characteristics constant. This turns out not to be the case. Controlling for demographics such as age and education, our regression results yield no indications of a different attitude toward cooperation among students (p = 0.888). 12

12 In a similar vein, van Lange et al. (1997) and Bellemare and Kröger (2007) do not detect a difference, whereas Carpenter, Connolly and Myers (2008) and Egas and Riedl (2008) do report a negative bias.

None of the residence dummy variables has a significant effect. Possibly, relatively small social differences between urban and more rural areas in the UK explain this null result. 13

13 In experiments conducted in a region of Russia where the urban-rural gap is large, Gächter and Herrmann (2011) do find that rural residents are more cooperative than urban residents.

III. Stakes and Context

Economists typically argue that behavior will converge toward the prediction of rational self-interest if the stakes increase (e.g., Rabin, 1993; Telser, 1995; Levitt and List, 2007). The evidence from lab and field experiments is, however, not generally supportive of this view. Except for the finding that people seem to become more willing to accept relatively low offers in ultimatum bargaining games when the stakes are high, empirical research generally finds no evidence that stake size affects behavior, even when the stakes are increased up to several months' wages. 14

14 See, for example, Hoffman, McCabe and Smith (1996a), Slonim and Roth (1998), Cameron (1999), List and Cherry (2000, 2008), Fehr, Fischbacher and Tougareva (2002), Munier and Zaharia (2002), Carpenter, Verhoogen and Burks (2005), Johansson-Stenman, Mahmud and Martinsson (2005), and Kocher, Martinsson and Visser (2008). For TV game show data, List (2004, 2006) and Oberholzer-Gee, Waldfogel and White (2010) also find that cooperative behavior is practically invariant to the stakes. Belot, Bhaskar and van de Ven (2010a) report some counter-intuitive evidence that cooperation actually increases with stakes.

Given that the stakes in Golden Balls are widely ranging, and, on average, considerably larger than in previous studies, the show provides an excellent opportunity to re-examine the relation between cooperation and stakes. In addition, compared to earlier game-show studies on cooperation, an advantage of Golden Balls is that the stakes are mainly built up by a random process and not by contestants' answers to trivia questions. The latter may lead to a spurious correlation because the ability to answer trivia questions may be related to unobserved background characteristics such as income, which in turn may well be related to the propensity to cooperate.

The variable that we use in our regressions is labeled Actual stakes and defined as the natural logarithm of the size of the jackpot. Model 2 in Table 3 displays the regression results when the stakes are included. Clearly, cooperative behavior in our show is sensitive to the amount that is at stake. To illustrate this effect, Figure 2 depicts the actual and estimated cooperation rates for different stake levels. The fitted line based on our full regression model (Model 6 presented later on) appears to capture the pattern rather well. Cooperation is high when the stakes are relatively small: for amounts up to £500, people on average cooperate 73.4 percent of the time. The rate drops to approximately 45 percent as the stakes increase and remains relatively stable for the largest amounts. An unreported test shows that we cannot reject that the relation becomes essentially flat for stakes larger than £1,500.

Figure 2: Stakes and the Propensity to Cooperate. The figure displays the relative frequency of contestants who decide to split across various stake intervals. Tick mark values represent the endpoints of the intervals. Each bar depicts the percentage of cooperators within a specific stake bracket. The dashed line reflects the average cooperation rate across our full sample, while the solid line connects the average estimate of the propensity to cooperate for each stake bracket. The estimates are computed using our full model, which is Model 6 in Table 4. For each interval, the number of contestants is displayed at the bottom of the bar.

While the absolute level of the stakes thus appears to have some influence on the propensity to cooperate, behavioral research suggests that people do not always evaluate prospects just in absolute terms, but rather they sometimes use relative comparisons to determine subjective values. This way, what comprises the context can strongly influence choices (Kahneman, Ritov and Schkade, 1999). In choice tasks, for example, one can increase the likelihood that a given option is chosen by adding an alternative to the choice set that is dominated by the given option but not by the other alternatives available (Huber, Payne and Puto, 1982). Also, as a consequence of the use of relative judgments, seemingly irrelevant anchors can influence how people value goods of various kinds (Green et al., 1998; Ariely, Loewenstein and Prelec, 2003; Simonson and Drolet, 2004) and even (risky) monetary prospects (Johnson and Schkade, 1989).

For our purposes, the question of interest is whether the game show influences the contestants' perceptions of what constitutes serious money. Suppose that some contestants decide that for serious money they are willing to bear the reputational costs, if any, of defecting on national TV, but if the stakes are small, so-called peanuts, then they will just cooperate to look good. In this scenario, we would observe the pattern of cooperation in our data: high cooperation rates for low stakes and lower cooperation for high stakes. The interesting point, however, is that the small stakes on this show, several hundred Pounds, are quite large relative to most experiments. So even when the contestants are playing for what seems to be peanuts, these are big peanuts!

In Deal or No Deal, another game show that has even larger stakes than Golden Balls, Post et al. (2008) also find strong evidence of such a big peanuts phenomenon. Namely, when unlucky contestants faced decisions near the end of the show that were merely for thousands of euros, they displayed little or no risk aversion. In fact, some of their contestants made risk-seeking choices in such situations. The authors provide further evidence for this behavior in classroom experiments designed to mimic the show at two levels of stakes, call them low and medium. In the low stakes treatment the average prize was €40 with a maximum of €500, while in their medium stakes treatment the average prize was €400 with a maximum of €5,000. While risk aversion increased with stakes within each treatment, such an effect was not found across treatments: despite the very different money amounts, risky choices were similar for the low and medium stakes sessions. Choices in both conditions were even remarkably similar to those made in the actual TV show, despite the huge stakes used there (average €400,000, maximum €5,000,000).
These results suggest that a context can convert a sum of money that would normally be considered consequential into perceived peanuts. In the Golden Balls scenario, earlier expectations about the jackpot size or a specific value from the game might operate as an anchor or reference value by which the actual size of the jackpot is evaluated. 15 The most obvious benchmark contestants may use seems to be the maximum possible jackpot at the beginning of Round 3. Though the expected jackpot size might be an alternative candidate, it is neither salient nor easily calculated. In fact, even a rough assessment is rather complicated, particularly because of the influence of killer balls. The maximum potential jackpot, however, is always visually displayed and explicitly stressed by the game show host.

15 We intentionally do not use the term reference point in order to avoid associations with prospect theory here, since we are hesitant to translate the elements of prospect theory to preferences in this game and to derive testable predictions.

In order to test for such an effect, we include the variable Potential stakes in our analyses, defined as the highest possible jackpot at the start of Round 3. As with the actual stakes, we take the natural logarithm. Not surprisingly, the actual and the potential stakes variables are significantly correlated, but due to the effect of killer balls and the skewed distribution of cash balls the degree is rather limited; the Pearson correlation coefficient is ρ = [...]. Model 3 shows the new results. The positive sign of the potential stakes coefficient is in line with what we would expect, but the effect is statistically insignificant for the entire sample (p = 0.139). Interestingly, the effect becomes marginally significant if we exclude the (returning) contestants who already appeared in a previous episode of the show (p = 0.066; untabulated).

This gives rise to the idea that the anchoring effect of the potential jackpot may decrease over time as contestants become more familiar with the show by watching it on TV. Prior shows will give contestants an impression of expected payoffs, which may help them to evaluate whether the stakes they face themselves are high or low in the context of the game and reduce the role of an episode-specific reference value such as the maximum possible jackpot size. We explore the effect of experience by testing whether the effect of the potential jackpot changes as contestants have watched more episodes on TV. As a (noisy) proxy for how many shows a contestant has watched, we define the variable Transmissions as the number of different episodes broadcast on TV prior to the studio recording of the current episode. Model 4 in Table 3 displays the results. There is no significant main effect of this variable (p = 0.722), indicating that there is no evidence of a trend in the cooperation rate over time. However, the interaction effect of the number of transmissions and the potential stakes is significantly negative (p = 0.037), and implies that the anchoring effect of the maximum possible jackpot decreases by 0.10 percentage points for each previously aired episode.
Controlling for this interaction effect, the effect of the maximum potential jackpot is highly significant in the early episodes (p = 0.006), where doubling the maximum potential jackpot increases cooperation by more than 12 percentage points. 16 The pattern of a pronounced effect of the potential jackpot size in the earlier but not in the later shows also becomes apparent if we include dummy variables that subdivide our sample. Model 5, for example, uses a natural subdivision and employs a dummy variable for the first 149 episodes (0-112 transmissions prior to the recordings) and for the remaining 138 ([...] prior transmissions). Clearly, the effect is significant across the first half of our data (p = 0.005) and insignificant thereafter (p = 0.543).

16 Because the potential stakes and the actual stakes are correlated, the interaction of Potential stakes and Transmissions might pick up an effect of the interaction of Actual stakes and Transmissions. As a robustness check, we have therefore also added the latter to Model 4. The effect of this additional control variable is insignificant (p = 0.573; untabulated), confirming our interpretation.
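Because Potential stakes enters the regressions as a natural logarithm, a statement about "doubling" the jackpot corresponds to a change of ln 2 in the regressor. The short Python sketch below shows that conversion under a first-order (linear) approximation. The function names are hypothetical, and the 0.12 input is only the lower bound quoted in the text ("more than 12 percentage points"), not a reported coefficient; the authors may have computed the doubling effect as an exact discrete change rather than with this approximation.

```python
import math


def effect_of_scaling(me_per_log_unit: float, factor: float) -> float:
    """First-order change in the cooperation probability when the
    jackpot is multiplied by `factor`, given a marginal effect that
    is expressed per unit of ln(stakes)."""
    return me_per_log_unit * math.log(factor)


def implied_me_per_log_unit(effect_of_doubling: float) -> float:
    """Invert the relation: marginal effect per log unit implied by a
    stated effect of doubling the jackpot."""
    return effect_of_doubling / math.log(2.0)


# Illustrative input: the 12-percentage-point lower bound from the text.
me = implied_me_per_log_unit(0.12)
print(f"implied marginal effect per log unit: {me:.3f}")
print(f"implied effect of a 50% larger jackpot: {effect_of_scaling(me, 1.5):.3f}")
```

Under these assumptions, an effect of 12 percentage points per doubling corresponds to roughly 17 percentage points per unit of the logged variable.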
