1
No Regret Algorithms in Games
Georgios Piliouras
Georgia Institute of Technology, Johns Hopkins University

4
No Regret Algorithms in Social Interactions
Georgios Piliouras
Georgia Institute of Technology, Johns Hopkins University

6
No Regret Behavior in Social Interactions
Georgios Piliouras
Georgia Institute of Technology, Johns Hopkins University

8
“Reasonable” Behavior in Social Interactions
Georgios Piliouras
Georgia Institute of Technology, Johns Hopkins University

9
No Regret Learning (review)
Regret(T) in a history of T periods:
Regret(T) = (total profit of best fixed action in hindsight) - (total profit of algorithm)
An algorithm is characterized as “no regret” if for every input sequence the regret grows sublinearly in T. [Blackwell 56], [Hannan 57], [Fudenberg, Levine 94], …
No single action significantly outperforms the dynamic.
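The regret definition can be illustrated with a minimal sketch. The function name `regret`, the payoff table, and the sequence of chosen actions are hypothetical and only serve as an example; payoffs are profits, so larger is better.

```python
def regret(payoffs, chosen):
    """Regret of a play sequence against the best fixed action in hindsight.

    payoffs[t][a] is the profit of action a in period t (hypothetical data);
    chosen[t] is the action the algorithm actually played in period t.
    """
    periods = range(len(payoffs))
    actions = range(len(payoffs[0]))
    # Total profit collected by the algorithm.
    algo_total = sum(payoffs[t][chosen[t]] for t in periods)
    # Total profit of the best single action, chosen with hindsight.
    best_fixed = max(sum(payoffs[t][a] for t in periods) for a in actions)
    return best_fixed - algo_total


payoffs = [[1, 0], [0, 1], [1, 0], [1, 0]]
chosen = [1, 1, 0, 1]
total_regret = regret(payoffs, chosen)
```

Here the algorithm earns 2 while always playing action 0 would have earned 3. "No regret" asks that this gap, divided by T, vanishes as T grows.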

11
Games (i.e. Social Interactions)
Interacting entities
Pursuing their own goals
Lack of centralized control

12
Games (review)
n players
Set of strategies $S_i$ for each player $i$
Possible states (strategy profiles): $S = \times_i S_i$
Utility $u_i : S \to \mathbb{R}$
Social welfare $Q : S \to \mathbb{R}$
Extend to allow probabilities $\Delta(S_i)$, $\Delta(S)$:
$u_i(\Delta(S)) = E(u_i(S))$, $Q(\Delta(S)) = E(Q(S))$

20
No Regret Algorithms & CCE
A history of no-regret algorithms is a sequence of outcomes s.t. no agent has a single deviating action that can increase her average payoff.
A Coarse Correlated Equilibrium is a probability distribution over outcomes s.t. no agent has a single deviating action that can increase her expected payoff.
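The CCE condition can be checked directly for a small game. The sketch below assumes a two-player game given as payoff tables; the function name `is_cce` and the coordination-game example are hypothetical illustrations, not taken from the talk.

```python
def is_cce(u, dist, tol=1e-9):
    """Check the coarse correlated equilibrium condition for two players.

    u[i][a][b] is the payoff of player i when player 0 plays a and
    player 1 plays b; dist[a][b] is the probability of outcome (a, b).
    """
    rows, cols = range(len(dist)), range(len(dist[0]))
    expected = [
        sum(dist[a][b] * u[i][a][b] for a in rows for b in cols)
        for i in range(2)
    ]
    # Player 0 commits to one fixed action while player 1 follows dist.
    for a_dev in rows:
        dev = sum(dist[a][b] * u[0][a_dev][b] for a in rows for b in cols)
        if dev > expected[0] + tol:
            return False
    # Player 1 commits to one fixed action while player 0 follows dist.
    for b_dev in cols:
        dev = sum(dist[a][b] * u[1][a][b_dev] for a in rows for b in cols)
        if dev > expected[1] + tol:
            return False
    return True


# Hypothetical coordination game: both players get 1 when they match.
coord = [[1, 0], [0, 1]]
u = [coord, coord]
matched = is_cce(u, [[0.5, 0.0], [0.0, 0.5]])
mismatched = is_cce(u, [[0.0, 0.5], [0.5, 0.0]])
```

The matched distribution passes because any fixed deviation earns only 0.5 in expectation. The mismatched one fails because a fixed action earns 0.5 against an expected payoff of 0.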

21
How good are the CCE?
It depends…
Which class of games are we interested in?
Which notion of social welfare?
Today:
Class of games: potential games
Social welfare: makespan
[Kleinberg, P., Tardos STOC 09]

22
Congestion Games
n players and m resources (“edges”)
Each strategy corresponds to a set of resources (“paths”)
Each edge has a cost function $c_e(x)$ that determines the cost as a function of the number of players using it.
Cost experienced by a player = sum of edge costs
[Figure: example network with edge costs x and 2x; Cost(red)=6, Cost(green)=8]
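A player's cost in a congestion game can be computed from edge loads, as in the minimal sketch below. The edge names, cost functions, and strategies are hypothetical and do not reproduce the network in the slide's figure.

```python
from collections import Counter


def player_costs(strategies, cost_fns):
    """Cost of every player in a congestion game.

    strategies[i] is the set of edges used by player i;
    cost_fns[e] maps the number of players on edge e to its cost.
    """
    # Load of an edge = number of players whose strategy contains it.
    load = Counter(e for s in strategies for e in s)
    return [sum(cost_fns[e](load[e]) for e in s) for s in strategies]


# Hypothetical instance: three edges with costs x, 2x and x.
cost_fns = {"a": lambda x: x, "b": lambda x: 2 * x, "c": lambda x: x}
strategies = [{"a", "b"}, {"b", "c"}]
costs = player_costs(strategies, cost_fns)
```

Both players share edge "b", so each pays 2·2 = 4 on it plus 1 on their private edge.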

31
(Multiplicative Weights) Algorithm in (Potential) Games
Infinite Markov Chains with Infinite States
The current state of the system at time t is a tuple of randomized strategies, one for each player. Each player tosses their coins and a specific outcome is realized. Depending on the outcome of these random events, we transition to the next state.
[Figure: the state space Δ(S) with the states at times t and t+1, and a step labeled O(ε)]
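One transition of this chain can be sketched as follows. The sketch assumes a load-balancing game (each player picks one bin, and a bin's cost equals its load) and the update rule weight ← weight·(1-ε)^cost; both the game and the exact update rule are assumptions made for illustration, and the names `mw_step`, `probs`, and `eps` are hypothetical.

```python
import random


def mw_step(probs, eps, rng):
    """One round: sample an outcome, then update every mixed strategy.

    probs[i][b] is the probability that player i picks bin b.
    Assumed update (not confirmed by the slides): p_b <- p_b * (1 - eps) ** cost_b.
    """
    bins = range(len(probs[0]))
    # Each player tosses their coins; a specific outcome is realized.
    choices = [rng.choices(list(bins), weights=p)[0] for p in probs]
    new_probs = []
    for i, p in enumerate(probs):
        # Cost of bin b for player i: load of b if i were placed there.
        costs = [
            1 + sum(1 for j, c in enumerate(choices) if j != i and c == b)
            for b in bins
        ]
        weights = [p[b] * (1 - eps) ** costs[b] for b in bins]
        total = sum(weights)
        new_probs.append([w / total for w in weights])
    return new_probs, choices


rng = random.Random(0)
probs = [[0.5, 0.5], [0.5, 0.5]]
new_probs, choices = mw_step(probs, 0.01, rng)
```

With a small ε the new state stays within O(ε) of the old one, which is why the chain moves through Δ(S) in small random steps.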

32
(Multiplicative Weights) Algorithm in (Potential) Games
Infinite Markov Chains with Infinite States
Problem 1: Hard to get intuition about the problem, let alone analyze it. Let’s try to come up with a “discounted” version of the problem. Ideas?
[Figure: the state space Δ(S) with the states at times t and t+1, and a step labeled O(ε)]

37
(Multiplicative Weights) Algorithm in (Potential) Games
$f(x(t), \varepsilon) - f(x(t), 0) \approx \varepsilon\, f'(x(t), 0)$, where $x(t) \in \Delta(S)$ denotes the state at time $t$ and $f'$ is the derivative with respect to $\varepsilon$.
As ε→0, the equation specifies a vector $f'(x(t), 0)$ at each point of our state space (i.e. a vector field). This vector field defines a system of ODEs which we are going to analyze.

39
Deriving the ODE
Taking expectations:
Differentiate w.r.t. ε, take expected value:
This is the replicator dynamic studied in evolutionary game theory.

43
Motivating Example
Even in the simplest case of two balls and two bins with linear utility, the replicator equation has a nonlinear form.
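The nonlinearity can be seen in a small numerical sketch. It assumes the standard cost-based replicator form (a strategy's probability grows in proportion to how much cheaper it is than the player's average cost) and bin cost c(x)=x. With p and q the probabilities that players 1 and 2 choose bin 1, bin 1 costs player 1 an expected 1+q and bin 2 costs 2-q, which gives dp/dt = p(1-p)(1-2q) and, symmetrically, dq/dt = q(1-q)(1-2p). The function names and the starting point are hypothetical.

```python
def replicator_two_bins(p, q, dt=0.01, steps=2000):
    """Euler integration of dp/dt = p(1-p)(1-2q), dq/dt = q(1-q)(1-2p).

    p, q: probabilities that player 1 and player 2 choose bin 1.
    The ODE form is an assumption (standard replicator on costs).
    """
    for _ in range(steps):
        dp = p * (1 - p) * (1 - 2 * q)
        dq = q * (1 - q) * (1 - 2 * p)
        p, q = p + dt * dp, q + dt * dq
    return p, q


def expected_potential(p, q):
    """Expected Rosenthal potential: 3 if the balls collide, 2 otherwise."""
    collide = p * q + (1 - p) * (1 - q)
    return 2 + collide


p_end, q_end = replicator_two_bins(0.6, 0.3)
```

From this starting point the trajectory heads to the pure equilibrium in which the two balls use different bins, and the expected potential falls from 2.46 toward 2.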

44
The potential function
The congestion game has a potential function Φ. [Monderer-Shapley ’96]
Let Ψ=E[Φ]. A calculation yields
Hence Ψ decreases except when every player randomizes over paths of equal expected cost (i.e. Ψ is a Lyapunov function of the dynamics).
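The defining property of a potential function can be checked on a small instance. The sketch assumes Rosenthal's potential, Φ = Σ_e Σ_{k=1..x_e} c_e(k), which is the standard choice for congestion games; the slide's own formula did not survive, so this form, the edge names, and the cost functions are assumptions for illustration.

```python
from collections import Counter


def edge_loads(strategies):
    """Number of players using each edge."""
    return Counter(e for s in strategies for e in s)


def rosenthal_potential(strategies, cost_fns):
    """Sum over edges of c_e(1) + ... + c_e(load of e)."""
    load = edge_loads(strategies)
    return sum(cost_fns[e](k) for e, x in load.items() for k in range(1, x + 1))


def player_cost(i, strategies, cost_fns):
    """Cost of player i: sum of the costs of the edges on their path."""
    load = edge_loads(strategies)
    return sum(cost_fns[e](load[e]) for e in strategies[i])


# Hypothetical instance: edges with costs x, 2x and x.
cost_fns = {"a": lambda x: x, "b": lambda x: 2 * x, "c": lambda x: x}
before = [{"a", "b"}, {"b", "c"}]
after = [{"a", "c"}, {"b", "c"}]  # player 0 swaps edge b for edge c

delta_phi = rosenthal_potential(after, cost_fns) - rosenthal_potential(before, cost_fns)
delta_cost = player_cost(0, after, cost_fns) - player_cost(0, before, cost_fns)
```

The change in potential equals the change in the deviating player's own cost, which is what makes the expected potential a natural candidate for a Lyapunov function.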

52
Weakly stable Nash equilibrium
Definition: A weakly stable Nash equilibrium is a mixed Nash equilibrium $(\sigma_1, \ldots, \sigma_n)$ such that, for all players i, j, if i switches to using any pure strategy in support($\sigma_i$), then j remains indifferent between all the strategies in support($\sigma_j$).
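For two players the indifference condition in this definition reduces to a check on cost tables, sketched below. The function only tests the indifference condition and assumes the given profile is already a Nash equilibrium; the names and the two-bin example with c(x)=x are hypothetical.

```python
def support(sigma, tol=1e-12):
    """Indices of the pure strategies played with positive probability."""
    return [a for a, pr in enumerate(sigma) if pr > tol]


def is_weakly_stable(cost1, cost2, sigma1, sigma2, tol=1e-9):
    """Indifference test for a two-player mixed profile.

    cost1[a][b], cost2[a][b]: costs of players 1 and 2 when player 1
    plays a and player 2 plays b. Does NOT verify the Nash property.
    """
    s1, s2 = support(sigma1), support(sigma2)
    # Player 1 switches to a pure strategy a in her support:
    # player 2 must stay indifferent across his support.
    for a in s1:
        vals = [cost2[a][b] for b in s2]
        if max(vals) - min(vals) > tol:
            return False
    # Symmetric check with the roles of the players exchanged.
    for b in s2:
        vals = [cost1[a][b] for a in s1]
        if max(vals) - min(vals) > tol:
            return False
    return True


# Two balls, two bins, c(x) = x: cost 2 when sharing a bin, 1 otherwise.
cost = [[2, 1], [1, 2]]
mixed = is_weakly_stable(cost, cost, [0.5, 0.5], [0.5, 0.5])
pure = is_weakly_stable(cost, cost, [1.0, 0.0], [0.0, 1.0])
```

The uniform mixed equilibrium fails the test: once one player commits to a bin, the other strictly prefers the empty one. Any pure equilibrium passes trivially because each support has a single strategy.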

59
Mixed weakly stable NE are due to rare coincidences
Example: [Figure: two edges with cost c(x)=x, used with probabilities p and 1-p]
The example relies on a coincidence: two edges having equal cost functions. If we perturb the cost functions (e.g. scale each one by an independent random factor between 1-δ and 1+δ), then the game has no mixed equilibrium with the same support sets.

63
Sard’s Theorem
The solution set of these polynomials is an algebraic variety $X$ in $\mathbb{R}^{\{p\}} \times \mathbb{R}^{\{c\}}$. Project it onto $\mathbb{R}^{\{c\}}$. Is the image dense?
Sard’s Theorem: If $f : X \to Y$ is smooth, the image of the set of critical points of $f$ has measure zero.
Use an algebraic-geometry version of Sard’s Theorem and prove the derivative of $f$ is rank-deficient everywhere.

64
Summary
[Figure: scale starting at 1 with the labels, in order: Price of Pure Anarchy; ????; Multiplicative Updates in Potential Games; Price of Anarchy; Price of Total Anarchy]

70
Learning Algs + Social Welfare
[Figure labels: Price of Pure Anarchy, Price of Anarchy]
How low can we go?
[Kleinberg, P., Tardos STOC 09]
[Roughgarden STOC 09] Any no-regret alg in potential games, when social welfare is the sum of utilities.