Archive for the ‘Probability’ Category

Sebastian Thrun is a professor of computer science and electrical engineering at Stanford, and director of the Stanford Artificial Intelligence Laboratory. He was the leader of Stanford’s team which won the $2 million first prize in the 2005 DARPA Grand Challenge, which was a race of driver-less robotic cars across the desert, and also leads Stanford’s entry into the 2007 DARPA Urban Challenge.

One of the ingredients in the Stanford team’s win was their use of “probabilistic robotics,” which is an approach based on the recognition that all sensor readings and models of the world are inherently subject to uncertainty and noise. Thrun, together with Wolfram Burgard and Dieter Fox, has written the definitive text on probabilistic robotics, which will be a standard for years to come. If you are seriously interested in robotics, you should read this book. (The introductory first chapter, which clearly explains the basic ideas of probabilistic robotics, is available as a download here.)

The Laboratory of Intelligent Systems at the Swiss École Polytechnique Fédérale de Lausanne (EPFL) hosts the superb “Talking Robots” web-site, which consists of a series of podcast interviews with leading robotics researchers. I noticed that the latest interview is with Thrun, and liked it quite a bit; it is well worth downloading to your iPod or computer.

You can watch Thrun speaking about the DARPA Grand Challenge at this Google TechTalk.

What does it mean to say that “I believe that Hillary Clinton has a 42% chance to win the 2008 U.S. Presidential election?” It used to be that some academics (called “frequentists”) had problems with this statement, because they only wanted to talk about probabilities for experiments that, at least in principle, could be run many times, to permit a reasonable estimate of the frequency of some event. I think hardly anybody is a frequentist anymore.

One good operational definition of the above statement is that if you offer me a choice between $1.00 if Hillary Clinton wins, or $0.42 regardless of whether or not she wins, I am indifferent. If you offer me less than the $0.42, I’ll take the chance on Senator Clinton; if you offer me more, I’ll go with the sure thing.
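To make the indifference point concrete, here is a minimal sketch in Python. It assumes a risk-neutral chooser (only the expected dollar amount matters), and the function names are my own illustrations, not part of any prediction-market software.

```python
def expected_bet_value(belief: float, payoff: float = 1.00) -> float:
    """Expected dollars from a contract that pays `payoff` if the event happens."""
    return belief * payoff


def preferred_option(belief: float, sure_amount: float, payoff: float = 1.00) -> str:
    """Choose between the contract and a guaranteed `sure_amount`.

    Assumes a risk-neutral chooser; a small tolerance absorbs rounding error.
    """
    bet_value = expected_bet_value(belief, payoff)
    if abs(bet_value - sure_amount) < 1e-9:
        return "indifferent"
    return "bet" if bet_value > sure_amount else "sure thing"


if __name__ == "__main__":
    # A 42% belief compared against three different sure-thing offers.
    for offer in (0.30, 0.42, 0.55):
        print(offer, preferred_option(belief=0.42, sure_amount=offer))
```

A risk-averse person would be indifferent at a somewhat lower sure amount, so a stated probability matches this definition only to the extent that the stakes are small enough for risk not to matter.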

Nowadays, we can get good estimates of the probabilities of many interesting events from prediction markets like TradeSports for sporting events, or Intrade for political and other events. In these markets, participants buy and sell contracts of the above variety, so that the market provides a consensus probability for a particular event.

I find these markets fascinating. You can find out, for example by looking at the chart above, that Senator Obama’s apparent chance of winning the Democratic nomination has fallen from around 39% to around 17% over the last few weeks (presumably because of the fallout over his statements on foreign policy). Or that the New England Patriots’ off-season acquisitions have increased their chances of winning the next Super Bowl from about 7% to 17%.

There’s some bias towards American sports and politics, but events like the recent French Presidential election are also heavily traded.

I want to expand on what I wrote previously in “A Simple But Challenging Game: Part II,” this time focusing on Rosenthal’s Centipede Game. To remind you of the rules, in that game there are two players. The players, named Mutt and Jeff, start out with $2 each, and they alternate rounds. On the first round, Mutt can defect by stealing $2 from Jeff, and the game is over. Otherwise, Mutt cooperates by not stealing, and Nature gives Mutt $1. Then Jeff can defect and steal $2 from Mutt, and the game is over, or he can cooperate and Nature gives Jeff $1. This continues until one or the other defects, or each player has $100.

As I previously wrote, in this game, the Nash equilibrium is that Mutt should immediately defect on his first turn. This result is obtained by induction. When both players have $99, it is clearly in Mutt’s interest to steal from Jeff, so that he will end with $101, and Jeff will end with $97. But that means that when Jeff has $98 and Mutt has $99, Jeff knows what Mutt will do if he cooperates, and can see that he should steal from Mutt, so that he will end with $100 and Mutt will end with $97. But of course that means that when both players have $98, Mutt can see that he should steal from Jeff, and so on, until one reaches the conclusion that Mutt should start the game by stealing from Jeff.
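The induction can be carried out mechanically. The following is a minimal sketch, assuming the rules exactly as stated above ($2 each to start, a $2 steal, a $1 gift from Nature, and a stop at $100 each) and assuming each player cares only about his own dollars; the function name and the representation of the game as numbered decision points are my own.

```python
def solve_centipede(start: int = 2, cap: int = 100):
    """Backward induction for the centipede game described in the text.

    Returns ((mutt_payoff, jeff_payoff), plan), where plan[t] is the action
    the mover would pick at decision point t (Mutt moves at even t).
    """
    STEAL, GIFT = 2, 1
    decision_points = 2 * (cap - start)  # each player can cooperate (cap - start) times
    outcome = (cap, cap)                 # payoffs if nobody ever defects
    plan = []
    for t in reversed(range(decision_points)):
        mutt = start + GIFT * ((t + 1) // 2)  # Mutt's money before decision point t
        jeff = start + GIFT * (t // 2)        # Jeff's money before decision point t
        mover = t % 2                         # 0 = Mutt, 1 = Jeff
        if mover == 0:
            if_defect = (mutt + STEAL, jeff - STEAL)
        else:
            if_defect = (mutt - STEAL, jeff + STEAL)
        # `outcome` currently holds what follows if the mover cooperates here.
        if if_defect[mover] > outcome[mover]:
            outcome = if_defect
            plan.append("defect")
        else:
            plan.append("cooperate")
    plan.reverse()
    return outcome, plan


if __name__ == "__main__":
    outcome, plan = solve_centipede()
    print(outcome, set(plan))
```

With the default rules, every decision point comes out as "defect", so the predicted outcome is that Mutt ends with $4 and Jeff with $0, compared with $100 each if both had cooperated throughout.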

Of course, this Nash equilibrium behavior doesn’t really seem very wise (not to mention ethical), and experiments show that humans will not follow it. Instead they usually will cooperate until the end or near the end of the game, and thus obtain much more money than would “Nashists” who rigorously follow the conclusions of theoretical game theory.

Game theorists often like to characterize the behavior of Nashists as “rational,” which means that they need to explain the “irrational” behavior of humans in the Rosenthal Centipede Game. See for example, this economics web-page, which gives the following “possible explanations of ‘irrational’ behavior”:

There are two types of explanation to account for the divergence. The first assumes that the subject pool contains a certain proportion of altruists who place a positive weight in their utility function on the payoff of their opponent. Also to the extent that selfish players believe that there is some probability that other players are altruists, they have an incentive to mimic altruistic behaviour by passing.

The second explanation considers the possibility of action errors. Errors in action, or ‘noisy’ play, may result from subjects experimenting with different strategies. Or simply from subjects pressing the wrong key.

Let’s step back for a second and consider what “rational” behavior should mean. A standard definition from economics is that a rational agent will act so as to maximize his expected utility. Let’s accept this definition of “rational.”

The first thing we should note is that “utility” is not usually the same as “pay-off” in a game. As noted in the first explanation above, many people get utility from helping other people get a pay-off. But there are many other differences between pay-offs and utility. You might lose utility from performing actions that seem unethical or unjust, and gain utility from performing actions that seem virtuous or just. You might want to minimize the risk in your pay-off as well as maximize the expected pay-off. You might value pay-offs in a non-linear way, so that the difference between $101 and $100 is very small in terms of utility.

Of course, this difference between pay-off and utility is very annoying theoretically. We’d really like the pay-offs to strictly represent utilities, but unfortunately for experiments, it is only possible to hand out dollars, not some abstract “utils.”

But suppose that the pay-offs in the Rosenthal Centipede Game really did represent utils. Would the game theory result really be “rational” even in that case? Would the only remaining explanation of cooperating behavior be that the players just don’t understand the situation and are making an error?

No. Remember that to be “rational,” an agent should maximize his expected utility. But he can only do that conditioned on some belief about the nature of the person he is playing with. That belief should take the form of a probability distribution for the possible strategies of his opponent. A Nashist rigidly reasons by backward induction that his opponent must always defect at the first opportunity. He even believes this if he plays second, and his opponent cooperates on the first turn! But is this the most accurate belief possible, or the one that will serve to maximize utility? Probably not.

A much more accurate belief could be based on the understanding that even people who understand the backward induction argument can reason beyond it and see that many of their opponents are nevertheless likely to cooperate for a long time, and therefore it pays to cooperate. If you believe that your opponent is likely to cooperate, it is completely “rational” to cooperate. And if this belief that other players are likely to cooperate is backed by solid evidence such as the fact that they started the game by cooperating, then the behavior of the Nashist, based on inaccurate beliefs that cannot be updated, is in fact quite “irrational,” because it does not maximize his utility.
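Here is a minimal sketch of that expected-utility comparison, treating the dollar pay-offs as utils. It rests on a deliberately simple assumption of my own: I compare defecting now against cooperating once and then defecting on my next turn, with a single number `p_defect` for my belief that the opponent defects in between. It ignores the last moves of the game, where there is no next turn.

```python
STEAL, GIFT = 2, 1  # dollar amounts from the rules of the game


def defect_now(my_money: float) -> float:
    """Payoff from stealing immediately."""
    return my_money + STEAL


def cooperate_then_defect(my_money: float, p_defect: float) -> float:
    """Expected payoff if I cooperate now and steal on my next turn.

    p_defect is my (assumed) belief that the opponent steals from me on the
    move in between.
    """
    robbed = my_money + GIFT - STEAL      # opponent defected after my gift
    not_robbed = my_money + GIFT + STEAL  # opponent cooperated, then I steal
    return p_defect * robbed + (1 - p_defect) * not_robbed


def should_cooperate(my_money: float, p_defect: float) -> bool:
    """True when cooperating once has the higher expected payoff."""
    return cooperate_then_defect(my_money, p_defect) > defect_now(my_money)


if __name__ == "__main__":
    for p in (0.0, 0.1, 0.25, 0.5, 1.0):
        print(p, should_cooperate(my_money=50, p_defect=p))
```

In this simplified comparison the break-even belief is 1/4: cooperating is the better choice whenever I think the chance of the opponent defecting on his next move is below 25%, and the Nashist’s choice is only correct if that chance is higher. Cooperating for longer can only add to the value of not defecting now, so this is a lower bound on the case for cooperation.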

Sophisticated game theorists do in fact understand these points very well, but they muddy the waters by unnecessarily overloading the term “rational” with a second meaning beyond the definition above; they in essence say that “rational” beliefs are those of the Nashist. For example, take a look at this 1995 paper about the centipede game by Nobel Laureate Robert Aumann. Aumann proves that “Common Knowledge of Rationality” (by which he essentially means the certain knowledge that all players must always behave as Nashists) will imply backward induction. He specifically adds the following disclaimer at the end of his paper:

We have shown that common knowledge of rationality (CKR) implies backward induction. Does that mean that in perfect information games, only the inductive choices are appropriate or wise? Would we always recommend the inductive choice?

Certainly not. CKR is an ideal (this is not a value judgement; “ideal” is meant as in “ideal gas”) condition that is rarely met in practice; when it is not met, the inductive choice may be not only unreasonable and unwise, but quite simply irrational. In Rosenthal’s (1982) centipede games, for example, even minute departures from CKR may make it incumbent on rational players to “stay in” until quite late in the game (Aumann, 1992); the resulting outcome is very far from that of backward induction. What we have shown is that if there is CKR, then one gets the backward induction outcome; we do not claim that CKR obtains or “should” obtain, and we make no recommendations.

This is all well and good, but why use the horribly misleading name “Common Knowledge of Rationality” for something that would be more properly called “Universal Insistence on Idiocy?”

I hope it is obvious by now why I am skeptical of explanations of various types of human behavior that are based on assuming that all humans are always Nashists, and even more skeptical of recommendations about how we should behave that are based on those same assumptions.

[Acknowledgement: I thank my son Adam for discussions about these issues.]

My 15-year-old son Adam likes game theory. He invented the following simple game, and asked me about it when I got on the phone with him while I was away at a conference last month (I’ve simplified and formalized the set-up slightly):

There are two players, each of whom is given a real number which is chosen randomly from a uniform distribution between 0.0 and 1.0. The players know their own number but not their opponent’s. One player moves first and has the choice of passing or challenging. If he challenges, both players reveal their number, and the player with the higher number receives a payoff of 1, while the other player receives a payoff of 0. If the first player passes, the second player has a choice of challenging or passing. If he challenges, again both players reveal their numbers and the player with the higher number receives a payoff of 1, while the other player receives a payoff of 0. If the second player also passes, both players receive a payoff of 1/2. They play the game one time, and are interested in maximizing their expected payoff.
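If you would like to experiment with the game, here is a minimal simulation sketch. The representation of a strategy as a function from a player’s number to a True/False “challenge” decision is my own choice, and the treatment of exact ties (which have probability zero) as a win for the second player is an arbitrary assumption.

```python
import random


def play_round(x, y, first_challenges, second_challenges):
    """Play one round; return (first_payoff, second_payoff).

    x, y: the numbers dealt to the first and second player.
    first_challenges, second_challenges: functions mapping a player's own
    number to True (challenge) or False (pass). The second player is only
    consulted if the first player passes.
    """
    if first_challenges(x) or second_challenges(y):
        return (1.0, 0.0) if x > y else (0.0, 1.0)
    return (0.5, 0.5)


def average_first_payoff(first_challenges, second_challenges,
                         rounds=100_000, seed=0):
    """Monte Carlo estimate of the first player's expected payoff."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(rounds):
        x, y = rng.random(), rng.random()
        total += play_round(x, y, first_challenges, second_challenges)[0]
    return total / rounds


if __name__ == "__main__":
    always = lambda number: True
    above_half = lambda number: number > 0.5
    print(average_first_payoff(always, always))
    print(average_first_payoff(above_half, above_half))
```

Any pair of strategies can be plugged in, which makes it easy to test a hunch before reading further.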

What is the right strategy? For example, if you received the number 0.17, would you pass or challenge if you were the first player? What about if you were the second player? What would you do if the number you received was 0.0017?

I’ll tell you more in a later post, but for now why don’t you think about it….

Here’s the promised followup:

It is clear that if any player has the advantage, it’s the second player, because he gets some information from the first player, and can use it to make his decision. Nevertheless, the first player can adopt the strategy of always challenging, and thereby guarantee that he wins half the time. So apparently he should always challenge. This is the answer that was given by “Optionalstopping” in the comments. The same answer was given to me by the evolutionary game theorist Arne Traulsen (who has recently worked on a ground-breaking theory for the emergence of punishment), after I asked him about the game at a lunch conversation.

There is another way to arrive at the same answer. Assume that each player chooses a strategy parameterized by a single value; if he receives a number above that value, he challenges, while if he receives a number below that value he passes. If you work out the Nash equilibrium (basically, that means that both you and your opponent pick the strategy that gives you the best payoff assuming that the opponent is maximizing their payoff), you’ll find (I won’t bore you with the math) that the value for both players is zero–they should always challenge. My son Adam gave an intuitive version of this argument, without the math. Of course it’s true that probabilistic strategies are also possible. I haven’t proven it, but I strongly doubt that introducing probabilistic strategies will change the result that the Nash equilibrium is to always challenge for both players.
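For readers who do want to see the math, here is a minimal sketch. It assumes threshold strategies as described: the first player challenges above `a`, and the second player (after a pass) challenges above `b`. The closed-form payoff is my own derivation from the uniform distribution, and the grid search is just a crude way to find best responses.

```python
def first_player_payoff(a: float, b: float) -> float:
    """Expected payoff to the first player under threshold strategies.

    a: the first player challenges when his number exceeds a.
    b: after a pass, the second player challenges when his number exceeds b.
    The two payoffs always sum to 1, so the second player gets 1 minus this.
    """
    challenge_part = (1 - a * a) / 2         # x > a: first challenges, wins with probability x
    both_pass_part = a * b / 2               # x < a and y < b: each player gets 1/2
    overlap = max(a - b, 0.0)
    second_challenges_part = overlap * overlap / 2  # x < a, y > b, and still x > y
    return challenge_part + both_pass_part + second_challenges_part


def best_threshold(payoff, grid: int = 1001) -> float:
    """Return the grid point in [0, 1] that maximizes payoff(threshold)."""
    candidates = [i / (grid - 1) for i in range(grid)]
    return max(candidates, key=payoff)


if __name__ == "__main__":
    a = 0.8
    for _ in range(5):
        # Second player's best reply to a, then first player's best reply to that.
        b = best_threshold(lambda t: 1 - first_player_payoff(a, t))
        a = best_threshold(lambda t: first_player_payoff(t, b))
        print(round(a, 3), round(b, 3))
```

Under these assumptions the best reply to a threshold is always half of that threshold, so alternating best replies shrink both values toward zero. Only at zero does neither player want to change, which is the always-challenge equilibrium.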

So at first I thought the matter was settled. But still, there is something very weird about this result. Would you really challenge if you were the first player and you received a .001? Would you really?

And what if you were the second player, and the first player passed, and you had a .000001? You know that the first player is not “following the rules” of Nash equilibrium. Are you really going to challenge because Nash tells you to? It’s obviously a crazy result! There must be a hole in the arguments.

So what’s wrong with the above arguments?

First let’s start with the Nash equilibrium arguments. By the way, many authors use the term “rational” for players that use strategies dictated by Nash equilibrium arguments, but I think “rational” and “irrational” are excessively loaded terms, so I prefer to instead say that a player that follows a strategy dictated by Nash equilibrium arguments is a “Nashist.”

If you are the second player, and the first player has passed, you can deduce that the first player is not a Nashist. So in order to make a correct play (maximize your expected winnings) you need to choose some probability distribution for what the first player’s strategy is, and then compute whether you will win more or less depending on whether you challenge. There doesn’t seem to be any obvious way to choose the probability distribution for the first player’s strategy, but you can definitely say that he is not certainly a Nashist! A Nashist is stuck (by definition of being a Nashist) believing that all other players are always Nashists, even in the face of clear evidence that they are not (you see why I don’t like to call Nashists “rational”) and would choose the strategy that followed from that obviously wrong belief; he would always challenge as the second player.

A non-Nashist, on the other hand, can come up with a reasonable probability distribution for the first player’s strategy, and come to the conclusion that he should pass if he is the second player and he has a .000001.

OK, so we can see why the second player might want to pass if he receives a .000001. What about the first player: is it wrong to be a Nashist? Should you pass or challenge if you get a .001, or a .000001, or a .000000001? A Nashist would be compelled, by the force of his “ideology,” to challenge in each case. But you can make a good argument that that’s wrong. Instead, let’s say I am the first player and I have a .000001. I know that if I pass, the second player will be able to deduce that I’m not a Nashist, and will go through the argument given above for the second player. Now “all” I have to do is form my probability distribution for what his probability distribution of my strategy will be, and compute whether I will have more chance of winning depending on whether I challenge or not. Again there’s no obvious way to choose these probability distributions, but it seems pretty clear to me that reasonable probability distributions will give a result that says don’t challenge if you are the first player and you have a sufficiently low number.

Well what about the other argument, that says that the first player should always challenge, since he is at a disadvantage and if he always challenges, he’ll win half the time? It seems paradoxical to think that there is a strategy that can do better than winning half the time for the first player.

Of course, all Nashists will always win exactly half the time, whether they play first or second. If you are playing against someone who you know is a Nashist, it actually doesn’t matter what you do! But suppose instead that you are playing against an ordinary human. You should play forming the best possible probability distribution of what they will do. Many humans will challenge if they have a number above .5 and pass otherwise, whether or not they play first (a very bad strategy by the way). It is perfectly possible, indeed likely, that playing against a population of ordinary humans, there exists a strategy that wins more than half the time for the first player. I can’t prove that strategy exists by arguing in this way (one would need to run experiments to determine the probability distributions of strategies, and then it would be easy to compute), but I’m actually pretty confident that it does exist, and I’m also pretty confident that the strategy involves passing when you are given a .000001 as the first player.
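As an illustration only, here is a minimal sketch of that computation under one hypothetical belief: that the opponent is certainly one of the humans who, as second player, challenges above .5 and passes otherwise. Real populations would be a mixture of strategies, so the numbers below are not a claim about actual players; the function names are mine.

```python
def challenge_value(x: float) -> float:
    """First player challenges: he wins when the opponent's number is below x."""
    return x


def pass_value(x: float, b: float) -> float:
    """First player passes against an opponent who challenges only above b."""
    both_pass = b * 0.5                    # opponent's number is below b: 1/2 each
    win_after_challenge = max(x - b, 0.0)  # opponent's number lies between b and x
    return both_pass + win_after_challenge


def best_first_player_payoff(b: float, steps: int = 100_000) -> float:
    """Average payoff when the first player takes the better action for each x.

    Uses the midpoint rule over x in [0, 1].
    """
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) / steps
        total += max(challenge_value(x), pass_value(x, b))
    return total / steps


if __name__ == "__main__":
    print(pass_value(0.000001, 0.5), challenge_value(0.000001))
    print(best_first_player_payoff(0.5))
```

Against this hypothetical opponent, passing with a .000001 is worth 0.25 while challenging is worth 0.000001, and passing is the better choice for any number below .25. Playing that way, the first player averages about 0.531, which is better than winning half the time.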

So I don’t believe either argument for always challenging holds up, which is comforting, because always challenging does seem intuitively wrong. Unfortunately, I can’t tell you exactly what the optimal strategy is either, at least until you tell me what the true probability distribution is for player strategies.

By the way, Arne recommended that I pick up Game Theory Evolving, by Herbert Gintis, for my son. It’s a wonderful book, full of interesting games and solved problems in game theory. Adam and I both love it. Gintis gives other examples showing that Nashists (he calls them “self-regarding agents”) can choose bizarre strategies, including “Rosenthal’s Centipede Game:”

The players, Mutt and Jeff, start out with $2 each, and they alternate rounds. On the first round, Mutt can defect by stealing $2 from Jeff, and the game is over. Otherwise, Mutt cooperates by not stealing, and Nature gives Mutt $1. Then Jeff can defect and steal $2 from Mutt, and the game is over, or he can cooperate and Nature gives Jeff $1. This continues until one or the other defects, or each player has $100.

In this game, the Nash equilibrium, obtained by induction by working backwards from the end of the game, when it is clearly “correct” to defect, is that Mutt should immediately defect on his first turn. So that’s what a Nashist would do, but fortunately humans are much more “rational” than Nashists!

Much of my own work is at the intersection of statistical mechanics and algorithms, in particular understanding and developing new algorithms using ideas originating in statistical mechanics. Werner Krauth also works at the intersection of the two fields, but coming from a very different angle: he is a leading expert on the development and application of algorithms to compute and understand the properties of physical systems.

In his recently published book, “Statistical Mechanics: Algorithms and Computations,” targeted at advanced undergraduates or graduate students, he covers a very wide range of interesting algorithms. To give you an idea of the coverage, I’ll just list the chapters: “Monte Carlo methods,” “Hard disks and spheres,” “Density matrices and path integrals,” “Bosons,” “Order and disorder in spin systems,” “Entropic forces,” and “Dynamic Monte Carlo methods.”

Krauth’s presentation is leavened by his humor, and he often uses the results obtained using his algorithms to make surprising points about physics that would otherwise be hard to convey.

I am often asked by computer science or electrical engineering scientists and researchers for good introductions to physics, and particularly statistical mechanics, and I’m now happy to be able to recommend this book.

“Computational algorithms are used to communicate precisely some of the methods used in the analysis of dynamical phenomena. Expressing the methods of variational mechanics in a computer language forces them to be unambiguous and computationally effective. Computation requires us to be precise about the representation of mechanical and geometric notions as computational objects and permits us to represent explicitly the algorithms for manipulating these objects. Also, once formalized as a procedure, a mathematical idea becomes a tool that can be used directly to compute results.”

But while Sussman and Wisdom’s book focuses in great detail on classical mechanics, Krauth’s book covers more broadly subjects in classical mechanics, statistical mechanics, quantum mechanics, and even quantum statistical mechanics. Another difference is that Sussman and Wisdom specify their algorithms in executable Scheme code, while Krauth uses pseudo-code. Of course, both choices have their advantages, just as both of these books are worth your time.

In 2002, I gave a lecture at the Mathematical Sciences Research Institute on the work I did, together with Bill Freeman and Yair Weiss, on Generalized Belief Propagation, and the correspondence between free energy approximations and message passing algorithms. The lecture is available as a streaming video, together with a pdf for the slides, here.

It’s worth mentioning that there are many other interesting research lectures available in MSRI’s video archive, and that the more recent ones are of higher production quality.

Here is our most recent and comprehensive paper on this subject, published in the July 2005 issue of IEEE Transactions on Information Theory, which gives many additional details compared to the lecture: MERL TR2004-040.

If that paper is too difficult, you should probably start with this earlier paper, which was more tutorial in nature: MERL TR2001-22.

If you’re looking for generalized belief propagation software, your best bet is this package written by Yair’s student Talya Meltzer.

P.S.: I realized I haven’t told those of you who don’t know anything about it what generalized belief propagation is. Well, one answer to that is to look at the above material! But here’s a little background text that I’ve copied from my research statement to explain why you might be interested:

Most of my current research involves the application of statistical methods to “inference” problems. Some important fields which are dominated by the issue of inference are computer vision, speech recognition, natural language processing, error-control coding and digital communications. Essentially, any time you are receiving a noisy signal, and need to infer what is really out there, you are dealing with an inference problem.

A productive way to deal with an inference problem is to formalize it as a problem of computing probabilities in a “graphical model.” Graphical models, which are referred to in various guises as “Markov random fields,” “Bayesian networks,” or “factor graphs,” provide a statistical framework to encapsulate our knowledge of a system and to infer from incomplete information.

Physicists who use the techniques of statistical mechanics to study the behavior of disordered magnetic spin systems are actually studying a mathematically equivalent problem to the inference problem studied by computer scientists or electrical engineers, but with different terminology, goals, and perspectives. My own research has focused on the surprising relationships between methods that are used in these communities, and on powerful new techniques and algorithms, such as Generalized Belief Propagation, that can be understood using those relationships.

This is an easy book for me to recommend. David J.C. MacKay is a professor in the physics department of Cambridge University, and he is a polymath who has made important contributions in a wide variety of fields. This textbook is an excellent introduction to modern error-correcting codes, compression, statistical physics, and neural networks. It is tied together by a recurring appeal to the power of Bayesian methods.

David wrote this book over the course of many years, publishing his drafts on the web. You can still view the entire book on the web here. But the book is very inexpensive; unless you’re very poor, you’ll really want to buy a copy.

As Bob McEliece (a professor at Caltech and Shannon medalist) wrote, “you’ll want two copies of this astonishing book, one for the office and one for the fireside at home.” I know this is true because I actually have two copies; I bought my own copy as soon as the book was published, and then found that David had kindly sent me a copy.