Sunday, February 10, 2008

Malice, Anarchy, and Regret

Its been a long time since my last post -- I've been busy doing a lot of things, including attending part of this very interesting workshop on data privacy, and laboriously adding my own random coins to the PhD admissions process at CMU. Now that the next class has been admitted (Welcome!), I will hopefully have some time to get back to research.

There are some interesting things going on in the world of privacy, but that is a topic for another post. Since the terminology used by the algorithmic game theory community is much more colorful, I'll write a little about something I have been thinking about recently: The price of malice!

Recall that the price of anarchy is a measure of the degradation of social welfare in the face of selfish behavior. Specifically, it is the worst case ratio of the optimal social welfare to the social welfare in a Nash equilibrium. One of the problems with this measure is that it is really quite brittle -- it assumes that everyone is perfectly self interested and rational. But in actual games (say, computer networks), although many people are selfish and wish to minimize their own delay, others exhibit different behaviors. Some are oblivious to network traffic, and so may seem to be irrational. Some are explicitly malicious, and seek to increase the delay of others (see denial of service attacks). Enter the price of malice: an attempt to define a quantity that measures the effect on social welfare of these malicious players. So how should we define it?

Well, its been studied in several papers, and they define it differently! Moscibroda, Schmid, and Wattenhofer define a price of malice that is parameterized by the number of malicious players, k. PoM(k) is the ratio of the cost in a game with k malicious players, and n-k rational (selfish) players, to the worst case social cost with n rational (selfish) players. Babaioff, Kleinberg, and Papadimitriou give a different, unparameterized definition (also, of course, called the price of malice!) Their price of malice is essentially the first derivative of PoM(k), evaluated at k=0: That is, it is the worst case marginal cost of converting an epsilon fraction of the players from selfish to malicious.

But what does it mean to have a game with rational and malicious players? Both papers define an equilibrium model by assigning the malicious players a utility function equal to the negation of the social welfare function. This is natural when we are talking about Nash Equilibrium, but leads to perhaps an awkward definition of a malicious player: because the malicious players are myopically seeking to minimize social welfare, sometimes malicious players can fall into equilibria with rational players that actually improve social welfare (as compared to equilibria with rational players alone)! Both papers observe this phenomenon, and Babaioff, Kleinberg, and Papadimitriou call it "the windfall of malice". This is a pretty cool and counterintuitive phenomenon, but somehow it doesn't seem like a satisfying definition of malicious play -- after all, if a truly malicious adversary was helping you by acting adversarialy, he could (if nothing else) act `rationally' and avoid giving you this boost. What we really want is an adversary who we allow to behave arbitrarily -- we want to take the worst case over all possible adversaries, who might engage in sophisticated counterspeculation.

So, to add to the clutter, in a recent paper, I have given yet another definition of the price of malice. I use the framework from the recent paper on the price of total anarchy to define the price of malice (an analogue of Moscibroda et al.'s definition), and the differential price of malice (an analogue of Babaioff et al.'s definition) with weaker assumptions. First, imagine that players are playing in a repeated game, rather than a one-shot game. This allows us to weaken the assumption that players are playing according to a one-shot Nash equilibrium and consider each player individually, assuming only that each rational player experiences no "regret". That is, no player "regrets" not playing some fixed action -- each player is doing at least as well as she would have been had she retroactively changed her play history to repeatedly play any fixed action. Note that by the definition of a Nash equilibrium, if all players were repeatedly playing according to a Nash, all players would be experiencing no regret (and so this is a weaker assumption than that players play according to a Nash equilibrium). This weakened assumption has two benefits. First, it makes the predicted behavior computationally plausable: while finding a Nash equilibrium is PPAD complete in general games, there are efficient algorithms players can use to unilaterally guarantee that they have regret quickly tending to 0. But that isn't the main point here: It also allows us to define rational play on a player by player basis, rather than defining rational play over an entire population (by saying that they all play according to a Nash equilibrium). This allows us to analyze games in which only a fraction of the players are `rational', and the rest are behaving arbitrarily. So we can now study the price of malice -- in which k players are Byzantine -- players about whom we make no assumptions, and who may engage in sophisticated malicious play. Now that we are taking the worst case over all possible adversaries, we can no longer observe a windfall of malice. This should be what we want -- after all, we want to bound the cost of malicious players who may be smart enough not to help us out.

Of course, depending on the situation, our adversaries might be myopic, and so the equilibria models of the price of malice also are interesting quantities to study. Fortunately, the no-regret definitions are backwards compatable: since they make strictly weaker assumptions, upper bounds on the price of malice and differential price of malice in the no-regret model also serve as upper bounds on Moscibroda et al.'s and Babaioff et al.'s price of malice (respectively).

I haven't looked into it very much, apart from having heard Silvio Micali give a talk on the subject once. This is one of the many things that I should read about when I get the chance... (I see you have a survey article -- I will print it out). Another interesting "crypto"/game theory connection is the McSherry/Talwar FOCS 07 paper which points out that "differential privacy" and "incentive compatibility" are very closely related concepts.