Game Theory

Decision Theory

Preference, Utility and Rationality

Def. Preferences are binary relations on pairs of choices \(x,y\in X\) for a specific agent. \(x\succeq y\) means the agent weakly prefers \(x\) to \(y\).

Def. (Complete preference) A preference relation on \(X\) is complete if for any \(x,y\in X\), we have either \(x \succeq y\), or \(y\succeq x\), or both.

Def. (Transitive preference) A preference relation on \(X\) is transitive if given \(x \succeq y\) and \(y \succeq z\), then \(x \succeq z\).

Def. (Best Choice) Define the agent's best choice on a choice set \(X\) as

\[C(X; \succeq) = \{x \in X \mid x \succeq y, \forall y \in X\}. \]

Prop. (Existence of best choice) Suppose the preference relation \(\succeq\) is complete and transitive. Then,

  • for every finite non-empty set \(B\), \(C(B; \succeq) \neq \emptyset\);
  • if \(x, y \in A \cap B\), and \(x \in C(A; \succeq)\) and \(y \in C(B; \succeq)\), then \(x \in C(B; \succeq)\) and \(y \in C(A; \succeq)\).

Def. A utility function \(u : X \to \mathbb{R}\) is consistent with a preference relation \(\succeq\) on \(X\) if for all \(x, y \in X\), \(x \succeq y\) if and only if \(u(x) \geq u(y)\).

Thm. (Representation theorem) If \(X\) is finite, and the preference relation \(\succeq\) on \(X\) is complete and transitive, there exists a utility function that is consistent.

Ex. (Lexicographic Preference) For two points \(x = (x_1, x_2)\) and \(y = (y_1, y_2)\), we have the following lexicographic preference:

\[x \succ_L y \iff \begin{cases} x_1 > y_1 \\ \text{or} \\ x_1 = y_1 \text{ and } x_2 > y_2 \end{cases} \]

If \(x_1,x_2\in\{0,1,...,99\}\), we can find a consistent utility function, e.g., \(u(x)=100x_1+x_2\).

But, if \(x_1,x_2\in[0,1]\), no consistent utility function exists. Suppose one did, say \(u\). For each \(x\in[0,1]\), consider the interval \([u(x,0),u(x,1)]\); since \((x',0)\succ_L(x,1)\) whenever \(x'>x\), these uncountably many intervals are pairwise disjoint. Picking a rational number \(f(x)\) in each interval gives an injection from \([0,1]\) into \(\mathbb{Q}\), implying that the rationals are uncountable, a contradiction.
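As a quick sanity check for the finite case, here is a small Python sketch showing that the encoding \(u(x)=100x_1+x_2\) (one possible choice) represents the lexicographic order on the grid:

```python
from itertools import product

# On the finite grid {0, ..., 99}^2, u(x) = 100*x1 + x2 is consistent with
# the lexicographic order: comparing u-values equals comparing lexicographically.
def u(x):
    return 100 * x[0] + x[1]

def lex_prefer(x, y):
    # strict lexicographic preference x >_L y
    return x[0] > y[0] or (x[0] == y[0] and x[1] > y[1])

points = [(3, 97), (4, 0), (3, 98), (0, 0), (0, 99), (1, 0)]
for x, y in product(points, repeat=2):
    assert lex_prefer(x, y) == (u(x) > u(y))
```

The encoding works precisely because the second coordinate is bounded by 100, which is what fails on the continuum \([0,1]^2\).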

Def. (Rationality) An agent is rational if the agent always maximizes his utility.

Lottery Model

Def. A simple lottery over outcomes \(\chi = \{x_1, \dots, x_n\}\) is a vector \(p = (p_1, \dots, p_n)\) with \(p_i \geq 0\) for all \(i\) and \(\sum_{i=1}^n p_i = 1\), where \(p_i\) is the probability that outcome \(x_i\) occurs.

Def. A utility function \(U : P \to \mathbb{R}\) is called an expected utility function (or equivalently, a von Neumann-Morgenstern (VNM) expected utility function) if there is an assignment of numbers \((u_1, \dots, u_n)\) to the \(n\) outcomes \((x_1, \dots, x_n)\) such that for every simple lottery \(p \in P\),

\[U(p) = u_1 p_1 + \dots + u_n p_n. \]

Def. (Independence) A preference relation \(\succeq\) on the space of lotteries \(\mathcal{P}\) satisfies independence if for all \(p, p', p'' \in \mathcal{P}\) and \(\alpha \in (0, 1]\), we have:

\[p \succeq p' \iff \alpha p + (1 - \alpha) p'' \succeq \alpha p' + (1 - \alpha) p''. \]

Def. (Continuity) A preference relation \(\succeq\) on the space of lotteries \(\mathcal{P}\) is continuous if for any \(p, p', p'' \in \mathcal{P}\) with \(p \succeq p' \succeq p''\), there exists some \(\alpha \in [0, 1]\) such that

\[\alpha p + (1 - \alpha) p'' \sim p'. \]

Thm. (Expected Utility Theorem) A preference relation \(\succeq\) on lottery space \(P\) admits an expected utility representation if and only if it satisfies axioms of completeness, transitivity, continuity and independence.

Pf. Find the most and least preferred lotteries \(\overline{p},\underline{p}\). For any \(p\in P\), the continuity axiom gives some \(U(p)\) (unique, by independence) s.t.

\[p\sim U(p)\overline{p}+(1-U(p))\underline{p}. \]

We can easily verify that \(U\) is a utility function. Furthermore, we have:

\[\alpha p+(1-\alpha)p'\\ \sim\alpha(U(p)\overline{p}+(1-U(p))\underline{p})+(1-\alpha)(U(p')\overline{p}+(1-U(p'))\underline{p})\\ \sim (\alpha U(p)+(1-\alpha)U(p'))\overline{p}+(1-\alpha U(p)-(1-\alpha)U(p'))\underline{p}. \]

Thus

\[U(\alpha p+(1-\alpha)p')=\alpha U(p)+(1-\alpha) U(p'). \]

Prop. Suppose that \(U : \mathcal{P} \rightarrow \mathbb{R}\) is a VNM expected utility function for the preference relation \(\succeq\) on \(\mathcal{P}\). Then \(V : \mathcal{P} \rightarrow \mathbb{R}\) is another VNM utility function for \(\succeq\) if and only if there are scalars \(a\) and \(b > 0\) such that \(V(p) = a + bU(p)\) for all \(p \in \mathcal{P}\).

Def. Risk-averse: \(\delta_{E(p)}\succeq p\). Risk-neutral: \(\delta_{E(p)}\sim p\). Risk-loving/risk-seeking: \(\delta_{E(p)}\preceq p\).
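These attitudes can be illustrated numerically: for a concave Bernoulli utility, Jensen's inequality gives \(u(E(p)) \ge E(u)\), i.e., risk aversion. A small sketch; the \(\sqrt{\cdot}\) utility and the 50/50 lottery are made-up illustrative choices:

```python
import math

# A 50/50 lottery over monetary outcomes 0 and 100 (illustrative numbers).
outcomes = [0.0, 100.0]
probs = [0.5, 0.5]

mean = sum(p * x for p, x in zip(probs, outcomes))   # E(p) = 50
u = math.sqrt                                        # a concave Bernoulli utility

expected_utility = sum(p * u(x) for p, x in zip(probs, outcomes))  # E[u] = 5
utility_of_mean = u(mean)                            # u(E(p)) = sqrt(50)

# Concave u => u(E(p)) >= E[u]: the sure amount E(p) is weakly preferred
# to the lottery p, i.e., the agent is risk-averse.
assert utility_of_mean >= expected_utility
```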

Strategic Form Game

Def. (Strategic Form Game)

A strategic form game, or a normal form game, is a triplet \(G = \{\mathcal{I}, (S_i)_{i \in \mathcal{I}}, (u_i)_{i \in \mathcal{I}}\}\) such that:

  • Players: \(\mathcal{I}\) is a finite set of players, i.e., \(\mathcal{I} = \{1, \dots, I\}\).

  • Strategy Space:

    • The pure-strategy space \(S_i\) is the set of available actions for player \(i\);
    • \(s_i \in S_i\) is an action for player \(i\).
  • Payoff Functions:

    • \((u_i)_{i \in I}: S_1 \times S_2 \times \dots \times S_I \to \mathbb{R}\);
    • \(u_i(s)\) is player \(i\)'s von Neumann-Morgenstern utility function for each strategy profile \(s = (s_1, \dots, s_I)\) of all players.

Rem. We assume that no agent can observe the other agents' actions: a strategic form game is a simultaneous-move (one-shot) game. Some notation: \(s_{-i}=\{s_j\}_{j\neq i}\), \(S_{-i}=\prod_{j\neq i}S_j\), and \(s=(s_1,...,s_I)=(s_i,s_{-i})\) is a strategy profile.

Assm. (Common knowledge) We assume that every agent knows all agents' available actions and utilities, and that every agent is rational. Moreover, every agent knows that every agent knows this, every agent knows that every agent knows that every agent knows this, and so on.

Dominance

Def. (Dominant strategy) A strategy \(s_i \in S_i\) is a dominant strategy for player \(i\) if:

\[u_i(s_i, s_{-i}) \geq u_i(s'_i, s_{-i}), \]

for all \(s'_i \in S_i\) and for all \(s_{-i} \in S_{-i}\).

Rem. A dominant strategy is the “best” choice no matter what others play!

Def. (Dominant strategy equilibrium) A strategy profile \(s^*\) is a dominant strategy equilibrium if \(s^*_i\) is a dominant strategy for every player \(i\).

Def. (Strictly dominated strategy) For player \(i\), a strategy \(s_i \in S_i\) is strictly dominated (by a pure strategy) if there exists \(s'_i \in S_i\) such that:

\[u_i(s'_i, s_{-i}) > u_i(s_i, s_{-i}), \forall s_{-i} \in S_{-i}. \]

Def. (Weakly dominated strategy) For player \(i\), a strategy \(s_i \in S_i\) is weakly dominated (by a pure strategy) if there exists \(s'_i \in S_i\) such that:

\[u_i(s'_i, s_{-i}) \geq u_i(s_i, s_{-i}),\forall s_{-i} \in S_{-i},\\ u_i(s'_i, s_{-i}) > u_i(s_i, s_{-i}),\exists s_{-i} \in S_{-i}. \]

Def. (Iterated Elimination of Strictly Dominated Strategies) We define pure strategy iterated elimination of strictly dominated strategies:

  • Step \(0\): Define \(S_i^0 = S_i\) for all \(i\).
  • Step \(1\): Define

\[S_i^1 = \{s_i \in S_i^0 \mid \nexists s'_i \in S_i^0 \text{ such that } u_i(s'_i, s_{-i}) > u_i(s_i, s_{-i}) \text{ for all } s_{-i} \in S_{-i}^0 \}. \]

  • Step \(k\): Define

\[S_i^k = \{s_i \in S_i^{k-1} \mid \nexists s'_i \in S_i^{k-1} \text{ such that } u_i(s'_i, s_{-i}) > u_i(s_i, s_{-i}) \text{ for all } s_{-i} \in S_{-i}^{k-1} \}. \]

  • Step \(\infty\): Define

\[S^\infty_i = \bigcap_{k=0}^\infty S_i^k. \]

\(S^\infty_i\) is the set of player \(i\)'s pure strategies that survive iterated deletion of strictly dominated strategies.
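The procedure can be sketched in Python for two-player games; the helper names (`iesds`, `strictly_dominated`) are mine, and the test game is the standard prisoner's dilemma with the payoffs used later in these notes:

```python
def strictly_dominated(s, Si, Sj, u):
    # Is pure strategy s strictly dominated by some pure s' in Si,
    # against every opponent strategy in Sj? (u(own, opponent) -> payoff)
    return any(all(u(sp, t) > u(s, t) for t in Sj)
               for sp in Si if sp != s)

def iesds(u1, u2, S1, S2):
    # Iterated elimination of strictly dominated pure strategies for a
    # two-player game with payoff functions u1(s1, s2), u2(s1, s2).
    S1, S2 = list(S1), list(S2)
    while True:
        keep1 = [s for s in S1 if not strictly_dominated(s, S1, S2, u1)]
        keep2 = [s for s in S2
                 if not strictly_dominated(s, S2, S1, lambda a, b: u2(b, a))]
        if keep1 == S1 and keep2 == S2:
            return S1, S2
        S1, S2 = keep1, keep2

# Prisoner's dilemma: 'C' = confess, 'D' = don't confess (symmetric game).
U1 = {('C', 'C'): -5, ('C', 'D'): 0, ('D', 'C'): -20, ('D', 'D'): -1}
surviving = iesds(lambda a, b: U1[(a, b)], lambda a, b: U1[(b, a)],
                  ['C', 'D'], ['C', 'D'])
assert surviving == (['C'], ['C'])   # only (Confess, Confess) survives
```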

Ex. (Cournot Competition) Two firms compete on production levels: each firm \(i \in \{1, 2\}\) chooses its own production level \(s_i \in [0, \infty)\). The market price is

\[p = \max\{0, 2 - q\}, \]

where \(q = s_1 + s_2\) is the market production quantity. For each firm \(i\), the utility function is:

\[u_i = s_i\left(\max\{0, 2 - q\} \right) - c s_i. \]

Let \(c = 1\). First, any \(s_i> 1\) is strictly dominated by \(s_i=0\), since for \(s_i>1\)

\[u_i\le s_i\max\{0,2-s_i\}-s_i<0=u_i(s_i=0). \]

Now we know \(s_i\in[0,1]\) for \(i\in\{1,2\}\), so \(u_i=s_i(2-s_i-s_{-i})-s_i=s_i(1-s_i-s_{-i}).\) Taking the derivative with respect to \(s_i\), we get

\[u_i'=1-2s_i-s_{-i}. \]

The best response is

\[s_i^*=\frac{1-s_{-i}}{2}\le\frac{1}{2} \]

and any \(s_i>\frac{1}{2}\) is strictly dominated by \(s_i=\frac{1}{2}\). Thus \(s_i\in[0,\frac{1}{2}]\). Applying the best response again,

\[s_i^*=\frac{1-s_{-i}}{2}\ge\frac{1}{4}, \]

so we can narrow the scope to \(s_i\in[\frac{1}{4},\frac{1}{2}]\). Repeating this argument finally gives

\[s_1^*=s_2^*=\frac{1}{3}. \]
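The shrinking bounds above can be iterated numerically; this sketch tracks the surviving interval \([lo, hi]\) under the best-response bound \(s^*=(1-s_{-i})/2\) and confirms convergence to \(1/3\):

```python
# Interval iteration for the Cournot example (c = 1): each round, strategies
# outside [(1 - hi)/2, (1 - lo)/2] are eliminated.
lo, hi = 0.0, 1.0
for _ in range(60):
    lo, hi = (1 - hi) / 2, (1 - lo) / 2
    # first rounds: [0, 1] -> [0, 1/2] -> [1/4, 1/2] -> [1/4, 3/8] -> ...

# Both bounds converge to the unique surviving strategy 1/3.
assert abs(lo - 1/3) < 1e-9 and abs(hi - 1/3) < 1e-9
```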

Ex. (Location Game) Two vendors are choosing from locations \(1, \dots, 11\) to open their shops. At each location, there is one customer. Each customer goes to the nearest vendor. If the two shops are at the same distance from a customer, each shop "shares" half of that customer.

Location \(1\) is dominated by \(2\) and \(11\) by \(10\); after eliminating them, \(2\) is dominated by \(3\) and \(10\) by \(9\), and so on. Finally both shops will locate at \(6\).

Rem. Order matters during weakly dominated strategy elimination!

Nash Equilibrium

Pure Strategy

Def. A strategy profile \(s^*\) is a (pure strategy) Nash equilibrium if, for all players \(i\):

\[u_i(s^*_i, s^*_{-i}) \geq u_i(s'_i, s^*_{-i}), \forall s'_i \in S_i. \]

Rem. It means that one will not be better off by unilateral deviation.

Thm. If \(s^*\) is a pure strategy Nash equilibrium of game \(G\), then \(s^* \in S^\infty\).

Rem. A PSNE may not exist, and when it does exist there may be multiple equilibria.

Def. Given a belief of others' strategy profile \(s_{-i}\), define best response as:

\[B_i(s_{-i}) = \arg\max_{s'_i \in S_i} u_i(s'_i, s_{-i}). \]

Prop. At Nash equilibrium, every agent is at its best response to others.

Mixed strategy

Def. (Mixed strategy) For player \(i\), a mixed strategy \(\sigma_i\) is a probability distribution over pure strategies.

Rem. Let \(\Sigma_i\) denote the set of probability measures over the pure strategy set \(S_i\). The space of mixed strategy profiles of the game is \(\Sigma = \prod_i \Sigma_i\) and a mixed strategy profile \(\sigma \in \Sigma\). Let \(\sigma_i(s_i)\) denote the probability that \(\sigma_i\) assigns to \(s_i \in S_i\). Given von Neumann-Morgenstern utility, player \(i\)'s expected payoff \(u_i\) is:

\[u_i(\sigma) = \sum_{s \in S} \left( \prod_{j=1}^I \sigma_j(s_j) \right) u_i(s). \]

Def. (Support of a mixed strategy) We define the support of a mixed strategy \(\sigma_i\) as:

\[\text{support}(\sigma_i) = \{ s_i \in S_i \mid \sigma_i(s_i) > 0 \}. \]

Def. (Mixed Strategy Nash equilibrium) A mixed strategy profile \(\sigma^*\) is a Nash equilibrium if for every player \(i\):

\[u_i(\sigma^*_i, \sigma^*_{-i}) \geq u_i(\sigma'_i, \sigma^*_{-i}), \forall \sigma'_i \in \Sigma_i. \]

Prop. A mixed strategy profile \(\sigma^*\) is a Nash equilibrium if and only if for each player \(i \in I\), every pure strategy in the support of \(\sigma^*_i\) is a best response to \(\sigma^*_{-i}\).

Prop. If \(\sigma^*\) is a Nash equilibrium, every pure strategy in the support of \(\sigma^*_i\) yields the same payoff.

Ex. Battle of Sexes:

Football Ballet
Football \((2, 1)\) \((0, 0)\)
Ballet \((0, 0)\) \((1, 2)\)

Obviously, the two pure strategy Nash equilibria are \((F,F)\) and \((B,B)\). Besides, we can verify that the only strictly mixed Nash equilibrium is \((\sigma_1=\frac{2}{3}F+\frac{1}{3}B,\sigma_2=\frac{1}{3}F+\frac{2}{3}B).\)
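The indifference conditions of this mixed equilibrium can be verified numerically; a sketch using exact rationals:

```python
from fractions import Fraction as F

# Battle of the Sexes payoffs; rows/cols ordered (Football, Ballet).
u1 = [[2, 0], [0, 1]]
u2 = [[1, 0], [0, 2]]

def expected(u, p, q):
    # Expected payoff when the row player mixes with p, column player with q.
    return sum(p[i] * q[j] * u[i][j] for i in range(2) for j in range(2))

p = [F(2, 3), F(1, 3)]   # player 1: (2/3) F + (1/3) B
q = [F(1, 3), F(2, 3)]   # player 2: (1/3) F + (2/3) B

# Each player's two pure strategies earn the same payoff against the other's
# mix (the indifference condition), so no unilateral deviation helps.
assert expected(u1, [1, 0], q) == expected(u1, [0, 1], q) == F(2, 3)
assert expected(u2, p, [1, 0]) == expected(u2, p, [0, 1]) == F(2, 3)
```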

Def. A pure strategy \(s_i\) is strictly dominated if there exists a mixed strategy \(\sigma'_i \in \Sigma_i\) such that:

\[u_i(\sigma'_i, s_{-i}) > u_i(s_i, s_{-i}) \text{ for all } s_{-i} \in S_{-i}. \]

Def. Iterative Elimination of Strictly Dominated Strategies with Mixed Strategies:

  • Step \(0\): Define \(S_0^i = S_i\) and \(\Sigma_0^i = \Sigma_i\) for all \(i\).
  • Step \(k\): Define

\[S_i^k = \{ s_i \in S_i^{k-1} \mid \nexists \sigma'_i \in \Sigma_i^{k-1} \text{ such that } u_i(\sigma'_i, s_{-i}) > u_i(s_i, s_{-i}), \forall s_{-i} \in S_{-i}^{k-1} \}. \]

and define

\[\Sigma_i^k = \{ \sigma_i \in \Sigma_i \mid \sigma_i(s_i) > 0 \text{ only if } s_i \in S_i^k \}. \]

This means eliminating all strictly dominated pure strategies at each step and keeping only mixtures over the surviving strategies.

  • Step \(\infty\): Define

\[S_i^\infty = \bigcap_{k=0}^\infty S_i^k. \]

Ex.

L C R
U \(1,1\) \(0,2\) \(0,4\)
M \(0,2\) \(5,0\) \(1,6\)
D \(0,2\) \(1,1\) \(2,1\)

First, \(C\) is strictly dominated by \(\frac{1}{2}L+\frac{1}{2}R\). Then \(M\) is dominated by \(\frac{1}{3}U+\frac{2}{3}D\). Finally, we can verify that the only MSNE is \((\sigma_1=\frac{1}{4}U+\frac{3}{4}D,\sigma_2=\frac{2}{3}L+\frac{1}{3}R).\)
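The two domination claims can be checked mechanically; a sketch with the payoff matrices transcribed from the table above:

```python
from fractions import Fraction as F

# Payoffs u1[row][col], u2[row][col]; rows U, M, D and columns L, C, R.
u1 = [[1, 0, 0], [0, 5, 1], [0, 1, 2]]
u2 = [[1, 2, 4], [2, 0, 6], [2, 1, 1]]

# C (column 1) is strictly dominated by the mix (1/2) L + (1/2) R:
for row in range(3):
    assert F(1, 2) * u2[row][0] + F(1, 2) * u2[row][2] > u2[row][1]

# After deleting C, M (row 1) is strictly dominated by (1/3) U + (2/3) D
# against the remaining columns L and R:
for col in (0, 2):
    assert F(1, 3) * u1[0][col] + F(2, 3) * u1[2][col] > u1[1][col]
```

Note that neither domination uses a pure strategy: this is where allowing mixed dominators genuinely enlarges the set of eliminations.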

Rationalizability

Def. (Belief) A belief \(\mu_i\) of player \(i\) about the other players' strategies is a probability measure on \(\prod_{j \neq i} \Sigma_j.\)

Def. (Never-best response) A pure strategy \(s_i\) is a never-best response if for all beliefs \(\sigma_{-i}\), there exists \(\sigma_i \in \Sigma_i\) such that:

\[u_i(\sigma_i, \sigma_{-i}) > u_i(s_i, \sigma_{-i}). \]

Rem. Strictly dominated strategy \(\Rightarrow\) Never-best response.

But... Never-best response \(\not\Rightarrow\) Strictly dominated strategy.

Def. Iterative elimination of never-best response strategies:

  • Start with \(\tilde{S}^0_i = S_i\), \(\tilde{\Sigma}^0_i = \Sigma_i\).
  • For each player \(i \in I\) and for each step \(k \geq 1\),

\[\tilde{S}^k_i = \left\{ s_i \in \tilde{S}^{k-1}_i \mid \exists \sigma_{-i} \in \prod_{j \neq i} \tilde{\Sigma}^{k-1}_j \text{ such that } u_i(s_i, \sigma_{-i}) \geq u_i(s'_i, \sigma_{-i}), \forall s'_i \in \tilde{S}^{k-1}_i \right\}. \]

  • Independently mix over \(\tilde{S}^k_i\) to obtain \(\tilde{\Sigma}^k_i\), i.e., \(\tilde{\Sigma}^k_i = \Delta(\tilde{S}_i^k)\).
  • Let \(R^\infty_i = \bigcap_{k=1}^\infty \tilde{S}_i^k\).

Def. (Rationalizable strategies) \(R^\infty_i\) is the set of rationalizable strategies.

Thm. Let \(NE_i\) denote the set of pure strategies of player \(i\) used with positive probability in some mixed strategy Nash equilibrium. Then

\[NE_i\subseteq R_i^\infty\subseteq S_i^\infty. \]

Ex. Consider a game where player 1 has actions \(A, B, C, D\) and player 2 has actions \(x, y\):

x y
A 3 0
B 0 3
C 2 2
D 1 1

Strategy \(A\) is the best response to \(x\), strategy \(B\) is the best response to \(y\), and \(C\) is the best response to the belief \(\frac{1}{2}x+\frac{1}{2}y\); \(D\) is dominated by \(C\).

Thus, \(A\), \(B\) and \(C\) survive the elimination of never-best responses. Note, however, that the mixed strategy \(\frac{1}{2}A+\frac{1}{2}B\) is strictly dominated by \(C\), and hence is not a best response to any strategy of player 2, even though \(A\) and \(B\) each are.

Thm. (Pearce, 1984) Rationalizability and iterated strict dominance coincide in two-player games.

Rem. Consider following three-player game:

LL LR RL RR
A 0 0 0 0
B 0 1 1 0
C 1 0 0 -1
D -1 0 0 1

We can show that \(A\) is a never-best response, but not strictly dominated. For the former, let \(\sigma_2(L)=p,\sigma_3(L)=q\):

  • If \(p(1-q)>0\) or \(q(1-p)>0\), \(B\) is strictly better than \(A\).
  • Otherwise \(p=q=0\) or \(p=q=1\); then \(D\) (if \(p=q=0\)) or \(C\) (if \(p=q=1\)) is strictly better than \(A\).

Existence of Nash Equilibrium

Nash's Theorem and Three Lemmas

Lem (Brouwer). Let \(C\) be a bounded, convex and closed subset of Euclidean space. If \(f : C \to C\) is a continuous function, then there exists \(x \in C\) such that \(f(x) = x\).

Def. Suppose the strategy profile \(\sigma \in \Sigma\) is given. For a player \(i\) and a pure strategy \(s \in S_i\), define the gain function of player \(i\) as

\[G_i(s,\sigma)=\max\{u_i(s,\sigma_{-i})-u_i(\sigma),0\}. \]

Thm (Nash's Theorem). Every finite game has at least one mixed strategy Nash equilibrium.

Pf. Define a function \(f : \Sigma \to \Sigma\) as follows. For all \(\sigma \in \Sigma\), \(\sigma \overset{f}{\to} \sigma'\), where for all \(i\) and \(s_i \in S_i\):

\[ \sigma'_i(s_i) = \frac{\sigma_i(s_i) + G_i(s_i, \sigma)}{1 + \sum_{s'_i \in S_i} G_i(s'_i, \sigma)}. \]

Intuition: function \(f\) maps \(\sigma\) to \(\sigma'\) by boosting the probability mass on pure strategies that could lead to higher gain.

\(f\) is continuous, and \(\Sigma\) is bounded and closed. Since each \(\Sigma_i\) is convex and \(\Sigma\) is the product of the \(\Sigma_i\), \(\Sigma\) is also convex. Brouwer's fixed point theorem then ensures the existence of a fixed point.

Next we show that any fixed point of \(f\) is a Nash equilibrium. It suffices to show that a fixed point \(\sigma = f(\sigma)\) satisfies \(G_i(s, \sigma) = 0\), \(\forall i, s \in S_i\).

Assume the contrary: at the fixed point \(\sigma\), there exists some \(s'\) such that \(G_i(s', \sigma) > 0\). The fixed point equation \(\sigma_i(s')=\sigma'_i(s')\) then forces \(\sigma_i(s')\sum_{s_i}G_i(s_i,\sigma)=G_i(s',\sigma)>0\), so \(\sigma_i(s') > 0\).

Since \(u_i(\sigma) = \sum_{s_i} \sigma_i(s_i) u_i(s_i, \sigma_{-i})\) is a weighted average, and \(u_i(s', \sigma_{-i}) > u_i(\sigma)\) with \(\sigma_i(s')>0\), there must exist some \(s''\) such that \(\sigma_i(s'') > 0\) and \(u_i(s'', \sigma_{-i})<u_i(\sigma)\). Thus \(G_i(s'', \sigma) = 0\), and

\[\sigma_i'(s'') = \frac{\sigma_i(s'')}{1 + \sum_{s'_i \in S_i} G_i(s'_i, \sigma)} < \sigma_i(s''), \]

which contradicts the fact that \(\sigma\) is a fixed point.
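At a mixed equilibrium all gains vanish, which is exactly the fixed-point condition. A sketch checking this for the Battle of the Sexes equilibrium computed earlier (the helper names are mine):

```python
from fractions import Fraction as F

# Battle of the Sexes payoffs; rows/cols ordered (Football, Ballet).
u1 = [[2, 0], [0, 1]]
u2 = [[1, 0], [0, 2]]

def payoff(u, p, q):
    return sum(p[i] * q[j] * u[i][j] for i in range(2) for j in range(2))

def gain(u, p, q, s, row_player):
    # G_i(s, sigma) = max{u_i(s, sigma_{-i}) - u_i(sigma), 0}
    pure = [1 if i == s else 0 for i in range(2)]
    dev = payoff(u, pure, q) if row_player else payoff(u, p, pure)
    return max(dev - payoff(u, p, q), 0)

p = [F(2, 3), F(1, 3)]
q = [F(1, 3), F(2, 3)]

# Every gain is zero, so sigma is a fixed point of the map f above.
assert all(gain(u1, p, q, s, True) == 0 for s in range(2))
assert all(gain(u2, p, q, s, False) == 0 for s in range(2))
```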

Lem (2D-Sperner). Consider a triangle \(T\) with vertices \(v_0, v_1, v_2\). Let \(\mathcal{T}\) be a triangulation of \(T\) and \(V(\mathcal{T})\) denote its set of vertices. Consider any coloring of \(V(\mathcal{T})\) with \(\{0, 1, 2\}\) such that:

  1. \(v_i\) is colored with \(i\) \((i \in \{0, 1, 2\})\).
  2. If \(v \in V(\mathcal{T})\) lies on the line joining \(v_i\) and \(v_j\) for \(i, j \in \{0, 1, 2\}\), the color of \(v\) is either \(i\) or \(j\).

Then, there exists at least one triangle in \(T\) whose vertices are colored by all three colors, i.e., a tri-chromatic triangle.

Pf. Think of the big triangle as a “house” and the small triangles as “rooms”. Each edge whose endpoints are colored \(0\) and \(1\) is a “door”. Enter through a door on the boundary edge \(v_0v_1\).

Walking from door to door, a path either ends at a tri-chromatic room or exits through another boundary door. All boundary doors lie on the edge \(v_0v_1\), and their number is odd, so the boundary doors cannot all be matched in entry/exit pairs; some path must end inside, at a tri-chromatic triangle.

Ex. (Fair Rent Sharing) Three roommates share a total rent of $3,000, but the rooms are of different sizes. How can they choose rooms and divide the rent fairly?

Construct a triangulation of the simplex of price profiles summing to $3,000; each vertex represents a price profile for the three rooms (according to its coordinates). Label each vertex with an owner A, B, or C, such that every small triangle has all three owners at its vertices. At each vertex, the labelled person picks his favorite room under the price profile of that vertex.

Color the vertex by the room the labelled person prefers. Assume that everyone prefers a free room if one is available. Then each boundary side only carries the colors of its two endpoints. Applying Sperner's lemma, we obtain a tri-chromatic triangle: a small region of prices at which the three people prefer three different rooms.

Lem (Kakutani). Let \(\phi\) be a correspondence on set \(S\), with \(x \in S\) and \(\phi(x) \subseteq S\). Suppose we have the following conditions:

  • \(S\) is a non-empty, compact, and convex subset of Euclidean space \(\mathbb{R}^n\).
  • \(\phi(x)\) is non-empty and convex-valued for all \(x \in S\).
  • \(\phi\) has a closed graph.

Then \(\phi\) has a fixed point. That is, there exists some \(x \in S\) such that \(x \in \phi(x)\).

  • Compact: A set in \(\mathbb{R}^n\) is compact if it is both closed and bounded.
  • Convex-valued correspondence: \(\phi\) is convex-valued for all \(x \in S\) means \(\phi(x)\) is a convex set for all \(x \in S\).
  • Graph: The graph of a correspondence \(\phi\) is the set \(\{(x, y) \mid y \in \phi(x)\}\). If \(\phi\) is a real function, the graph is the plot of the function.
  • Closed graph: A correspondence has a closed graph if the graph of the correspondence is a closed set. That is, for any sequence \((x_n, y_n) \to (x^*, y^*)\) as \(n \to \infty\) with \(y_n \in \phi(x_n)\), we have \(y^* \in \phi(x^*)\).

Pf. (Prove Nash’s Theorem via Kakutani’s Theorem)

(Nash, 1950) The idea is to apply Kakutani's fixed point theorem to the best-response correspondence \(BR : \Sigma \to 2^\Sigma\).

Recall the definition of best response

\[BR_i(\sigma_{-i}) = \arg \max_{\sigma_i \in \Sigma_i} u_i(\sigma_i, \sigma_{-i}). \]

Define the correspondence as \(BR(\sigma) = \{\sigma' \mid \sigma'_i \in BR_i(\sigma_{-i})\ \forall i\}\). We can check:

  • \(\Sigma\) is compact and non-empty.
  • \(BR(\sigma)\) is non-empty and convex-valued.
  • \(BR(\sigma)\) has a closed graph.

Thus, a fixed point exists, and the fixed point is a Nash equilibrium.

Other results

Thm (Debreu, Glicksberg, Fan). For a strategic form game \(G = \{I, (S_i)_{i \in I}, (u_i)_{i \in I}\}\), if:

  • \(I\) is a finite set;
  • \(S_i\) is compact and convex;
  • \(u_i(s)\) is continuous in \(s\);
  • \(u_i(s_i, s_{-i})\) is quasi-concave in \(s_i\);

then there exists a pure strategy Nash equilibrium.

Thm (Glicksberg). For a strategic game \(G = \{I, (S_i)_{i \in I}, (u_i)_{i \in I}\}\), if:

  • \(I\) is a finite set;
  • \(S_i\) is nonempty and compact;
  • \(u_i(s)\) is continuous in \(s\);

then there exists a mixed strategy Nash equilibrium.

Def. A metric space \((X, d)\) is a set \(X\) with a notion of distance:

  • \(d(x, y) \geq 0\) with \(d(x, y) = 0\) iff \(x = y, \forall x, y \in X\).
  • \(d(x, y) = d(y, x), \forall x, y \in X\).
  • \(d(x, y) + d(y, z) \geq d(x, z), \forall x, y \in X\).

Def. A mapping \(T : X \to X\) is called a contraction mapping on metric space \((X, d)\) if for some \(0 \leq \beta < 1\) and \(\forall x, y \in X\):

\[ d(Tx, Ty) \leq \beta d(x, y). \]

Thm. Let \((X, d)\) be a non-empty complete metric space, then any contraction \(f : X \to X\) has a unique fixed point.

Thm. If the best response mapping is a contraction on the entire strategy space, then there is a unique NE in the game.
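For the Cournot example, the joint best-response map \((q_1,q_2) \mapsto ((1-q_2)/2,(1-q_1)/2)\) is a contraction with modulus \(1/2\) in the sup norm, so iterating from any starting point converges to the unique NE; a sketch:

```python
# Best-response iteration in the Cournot duopoly: BR(q) = (1 - q)/2.
# The joint map is a contraction with modulus 1/2, so by the Banach
# fixed point theorem iteration converges to the unique NE q* = 1/3.
q1, q2 = 0.9, 0.05   # arbitrary starting quantities
for _ in range(100):
    q1, q2 = (1 - q2) / 2, (1 - q1) / 2

assert abs(q1 - 1/3) < 1e-9 and abs(q2 - 1/3) < 1e-9
```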

Optima and Correlated Equilibria

Optima

Def. A strategy profile \(s\) that maximizes social welfare \(\sum_{i \in \mathcal{I}} u_i(s)\) is a social optimum.

Def (Pareto dominance). A strategy profile \(s\) Pareto dominates \(s'\) if:

  • \(\forall i \in I, u_i(s) \geq u_i(s')\), and
  • \(\exists i \in I, u_i(s) > u_i(s')\).

Def (Pareto optimal). A strategy profile \(s\) is Pareto optimal if it is not Pareto dominated, i.e., there is no strategy profile \(s'\) such that:

\[\forall i \in \{1, ..., I\}, u_i(s') \geq u_i(s)\text{ and }\exists i \in \{1, ..., I\}, u_i(s') > u_i(s). \]

Prop. There always exists at least one Pareto optimal strategy profile in a finite game.

Prop. A social optimum strategy profile \(s\) is Pareto optimal.

Ex. Prisoner’s dilemma

Confess Don’t confess
Confess (−5, −5) (0, −20)
Don’t confess (−20, 0) (−1, −1)

  • Nash equilibrium: \((C,C)\)
  • Social optimum: \((D,D)\)
  • Pareto optimal: \((C,D),(D,C),(D,D)\)

Correlated Equilibrium

Ex. Battle of Sexes

Football Ballet
Football (2, 1) (0, 0)
Ballet (0, 0) (1, 2)

  • The mixed strategy Nash equilibrium gives a payoff of \(\frac{2}{3}\) to each player.
  • If they flip a coin, with heads meaning both football and tails meaning both ballet, the expected payoff becomes \(\frac{3}{2}\) for each player.

Def (Correlated Equilibrium).
A correlated equilibrium of a finite game is a joint probability distribution \(p \in \Delta(S)\) over strategy profiles such that:

\[\sum_{s_{-i} \in S_{-i}} p(s_{-i} | s_i) u_i(s_i, s_{-i}) \geq \sum_{s_{-i} \in S_{-i}} p(s_{-i} | s_i) u_i(s_i', s_{-i}), \quad \forall s_i' \in S_i \]

for every player \(i\) and every \(s_i\) with \(p(s_i) > 0.\)

Prop. Every mixed strategy Nash equilibrium is a correlated equilibrium.

Ex.

L R
U (5, 1) (0, 0)
D (4, 4) (1, 5)

  • Mixed strategy Nash equilibrium: \(s_1=\frac{U+D}{2},s_2=\frac{L+R}{2}\); the payoff is \((2.5,2.5)\).
  • If they play according to a coin flip: with probability \(\frac{1}{2}\) (heads), player 1 plays \(U\) and player 2 plays \(L\); with probability \(\frac{1}{2}\) (tails), player 1 plays \(D\) and player 2 plays \(R\). The payoff is \((3,3)\).
  • If they play with correlated signals \(UL, DL, DR\), each with probability \(\frac{1}{3}\), the payoff is \((\frac{10}{3},\frac{10}{3})\).
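The obedience constraints of the third distribution can be verified directly; a sketch (the helper `obeys` is mine):

```python
from fractions import Fraction as F

# Payoffs of the example and the correlated signal distribution
# p(UL) = p(DL) = p(DR) = 1/3.
u1 = {('U', 'L'): 5, ('U', 'R'): 0, ('D', 'L'): 4, ('D', 'R'): 1}
u2 = {('U', 'L'): 1, ('U', 'R'): 0, ('D', 'L'): 4, ('D', 'R'): 5}
p = {('U', 'L'): F(1, 3), ('D', 'L'): F(1, 3), ('D', 'R'): F(1, 3)}

def obeys(u, player):
    # Check, for each recommended action, that no deviation improves
    # the conditional expected payoff (the CE inequality).
    acts = ['U', 'D'] if player == 1 else ['L', 'R']
    for rec in acts:
        for dev in acts:
            diff = 0
            for (a1, a2), prob in p.items():
                mine, other = (a1, a2) if player == 1 else (a2, a1)
                if mine != rec:
                    continue
                played = (rec, other) if player == 1 else (other, rec)
                deviated = (dev, other) if player == 1 else (other, dev)
                diff += prob * (u[played] - u[deviated])
            if diff < 0:
                return False
    return True

assert obeys(u1, 1) and obeys(u2, 2)   # the distribution is a CE
```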

Coarse Correlated Equilibrium

Def. A coarse correlated equilibrium of a finite game is a joint probability distribution \(p \in \Delta(S)\) over strategy profiles such that:

\[\sum_{s \in S} p(s) u_i(s_i, s_{-i}) \geq \sum_{s \in S} p(s) u_i(s_i', s_{-i}), \quad \forall s_i' \in S_i \]

for every player \(i\).

Rem. A coarse correlated equilibrium only requires that following the suggested action be a best response in expectation, before seeing the specific signal \(s_i\). A correlated equilibrium requires that, given the signal \(s_i\), the pure strategy \(s_i\) is indeed a best response. The set of coarse correlated equilibria (CCE) contains the set of CE, and the containment can be strict.

Comparison

  • Dominant strategy equilibrium (DSE)
  • Pure strategy Nash equilibrium (PSNE)
  • Mixed strategy Nash equilibrium (MSNE)
  • Correlated equilibrium (CE)
  • Coarse correlated equilibrium (CCE)

We have the following containments, each of which can be strict:

\[\text{DSE} \subseteq \text{PSNE} \subseteq \text{MSNE} \subseteq \text{CE} \subseteq \text{CCE}. \]

From MSNE onward, the solution concept is guaranteed to exist in every finite game (but may still be hard to compute).

Extensive Form Games

Game Tree, Extensive Form Games and Information Set

Def. (Game Tree) The extensive form game can be represented by a rooted tree \(T\).

  • Nodes: Each node in \(T\) represents a possible state in the game.

    • Terminal nodes (outcomes): Each leaf state results in a certain payoff for each player.
    • Decision nodes: Each decision node \(v\) in \(T\) is associated with one of the players, indicating that it is his turn to play when \(v\) is reached.
    • Exogenous events can be treated as decisions of the player of “Nature”, i.e., the “chance nodes”.
  • Edges: The edges from an internal node to its resulting child node represent an action and the corresponding next state. All edges from a node show the possible moves of the player when the game reaches that state.

Def. (Extensive Form Game) An extensive form game contains the following information (can be represented by the game tree):

  1. The set of players,

  2. The order of moves, i.e., who moves at each state,

  3. The players' payoffs as a function of the moves,

  4. What the players’ choices are when they move,

  5. The probability distributions over any exogenous events,

  6. What each player knows when he makes his choices.

Assm. (Perfect recall) Each player remembers all the information he has received and all the moves he has made during the play.

Def. (Perfect Information) Perfect information means that, during play, each player knows the complete history of previous moves, and hence which state of the game he is currently at. Conversely, imperfect information means that a player may not observe all previous moves.

Def. (Information Set) An information set \(h_i \in H_i\) of player \(i\) is a set of decision nodes of the game tree. The information sets \(h \in \bigcup_i H_i\) partition the decision nodes, i.e., every decision node is in exactly one information set.

For an information set \(h_i\) of player \(i\):

  • Player \(i\) makes the move for any state \(u \in h_i\),
  • Available actions for any \(u, v \in h_i\) are the same,
  • Given any \(u, v \in h_i\), player \(i\) cannot distinguish which state he is currently at.

Rem. A game of perfect information is the special case when all information sets are singletons.

Rem. Perfect recall rules out the possibility that \(u\) is the predecessor of \(v\) while they are in the same information set.

Ex. A Simple Card Game:

Consider a two-player game. At the beginning, players 1 and 2 each put a dollar in the pot.

Next, player 1 draws a card from a shuffled deck in which half the cards are red and half are black. Player 1 looks at his card privately and decides whether to raise or fold.

  • If player 1 folds, he shows the card to player 2 and the game ends; in this case, player 1 takes the money in the pot if the card is red, but player 2 takes the money in the pot if the card is black.
  • If player 1 raises, then he adds another dollar to the pot, and player 2 must decide whether to meet or pass. If player 2 passes, then the game ends, and player 1 takes the money in the pot.
  • If player 2 meets, then she also must add another dollar to the pot, and then player 1 shows the card to player 2 and the game ends; in this case, again, player 1 takes the money in the pot if the card is red, and player 2 takes the money otherwise.

Def. A pure strategy for player \(i\) is a map \(s_i : H_i \to A_i\) such that \(s_i(h) \in A(h)\) for all \(h \in H_i\).

Rem. Every finite extensive form game can be converted into a normal form game, with pure strategies given by maps from information sets to actions.

Mixed Strategy and Behavioral Strategy

Def. (Mixed strategy) A mixed strategy \(\sigma_i\) of a player \(i\) in an extensive form game is a distribution over pure strategies, i.e.,

\[\sigma_i \in \Delta S_i = \Delta \left( \prod_{h \in H_i} A(h) \right). \]

Def. (Behavioral Strategy) A behavioral strategy \(b_i\) of player \(i\) specifies an independent randomization over actions at each information set:

\[b_i \in \prod_{h \in H_i} \Delta(A(h)). \]

Def. (Realization equivalence) Two strategies \(\sigma_i\) and \(\sigma_i'\) for player \(i\) in an extensive form game are realization equivalent if for each strategy \(\sigma_{-i}\) of the opponents and every node \(v\) in the game tree, the probability of reaching \(v\) under \((\sigma_i, \sigma_{-i})\) equals the probability of reaching \(v\) under \((\sigma'_i, \sigma_{-i})\).

Thm. In a finite extensive form game of perfect recall, mixed and behavioral strategies are equivalent.

Pf. Let \(\sigma_i\) be a mixed strategy. Let \(R_i(h_i)\) be the set of player \(i\)'s pure strategies that do not preclude \(h_i\), so that for each \(s_i \in R_i(h_i)\) there is an opponent profile \(s_{-i}\) that reaches \(h_i\).

If \(\sigma_i\) assigns positive probability to some \(s_i \in R_i(h_i)\), define the behavioral strategy at \(h_i\) to be the conditional distribution over actions, i.e., the probability that \(b_i\) assigns to \(a_i \in A(h_i)\) is

\[b_i(a_i|h_i) = \frac{\sum_{s_i \in R_i(h_i),\, s_i(h_i) = a_i} \sigma_i(s_i)}{\sum_{s_i \in R_i(h_i)} \sigma_i(s_i)}. \]

If the denominator is zero, let \(b_i(\cdot|h_i)\) be the uniform distribution over the actions at \(h_i\) (in fact, any distribution works).

In either case, the \(b_i(\cdot|h_i)\) are nonnegative, and

\[\sum_{a_i \in A(h_i)} b_i(a_i|h_i) = 1. \]
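The construction can be illustrated on a toy example with two information sets, assuming no pure strategy precludes either set (so \(R_i(h) = S_i\)); the strategy probabilities are made up:

```python
from fractions import Fraction as F

# Player i has information sets h1 (actions a, b) and h2 (actions c, d).
# A pure strategy is a pair (action at h1, action at h2); no pure strategy
# precludes either set, so R_i(h) is all of S_i and the formula simplifies.
sigma = {('a', 'c'): F(1, 2), ('a', 'd'): F(1, 6),
         ('b', 'c'): F(1, 6), ('b', 'd'): F(1, 6)}

def behavioral(sigma, h_index, actions):
    # b_i(a | h) = P(s_i(h) = a) / P(R_i(h)), with R_i(h) = S_i here.
    total = sum(sigma.values())
    return {a: sum(pr for s, pr in sigma.items() if s[h_index] == a) / total
            for a in actions}

b1 = behavioral(sigma, 0, ['a', 'b'])   # mix at h1
b2 = behavioral(sigma, 1, ['c', 'd'])   # mix at h2

assert b1 == {'a': F(2, 3), 'b': F(1, 3)}
assert b2 == {'c': F(2, 3), 'd': F(1, 3)}
```

Note that the behavioral strategy plays the two sets independently even though `sigma` correlates them; by the theorem, the two are realization equivalent under perfect recall.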

Subgame Perfection and Backward Induction

Def. A subgame \(G'\) of an extensive form game \(G\) is a subtree of the game tree of \(G\) that:

  • begins at a singleton information set;
  • includes all subsequent nodes;
  • does not cut any information sets.

Def. (Subgame Perfect Equilibrium) A (behavioral) strategy profile \(\sigma^*\) is a subgame perfect equilibrium of \(G\) if its restriction to every subgame of \(G\) is a Nash equilibrium of that subgame.

Ex. (Mutual Assured Destruction)

  • NE: (Escalate, Back down) and (Peace, Retaliate)
  • SPE: (Escalate, Back down)

Def. (Backward Induction) For finite games, we introduce the method of backward induction:

  1. Find terminal subgames:
    • Terminal subgames are subgames that contain no further subgames.
    • If no terminal subgame is found, output the outcome payoff.
  2. Solve the terminal subgames:
    • They contain no “smaller” subgames.
    • Nash equilibrium is guaranteed to exist (finite game).
  3. Calculate the Nash equilibrium payoffs of the terminal subgames and replace these subgames with the Nash payoffs:
    • By calculation, a “node” of subgame becomes an outcome.
  4. Go back to Step 1.
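For perfect-information games the procedure reduces to a simple recursion; a sketch where the tree encoding and the entry-game payoffs are illustrative choices of mine:

```python
# Backward induction on a finite perfect-information game tree.
# Terminal nodes are payoff tuples; decision nodes are dicts.
def backward_induction(node):
    if isinstance(node, tuple):          # terminal: payoff profile
        return node, []
    player = node['player']
    best = None
    for action, child in node['moves'].items():
        payoffs, plan = backward_induction(child)
        # keep the action maximizing the moving player's payoff
        if best is None or payoffs[player] > best[0][player]:
            best = (payoffs, [(player, action)] + plan)
    return best

# A hypothetical entry game: player 0 chooses Out (payoffs (1, 3)) or In;
# after In, player 1 chooses Fight (0, 0) or Accommodate (2, 1).
tree = {'player': 0,
        'moves': {'Out': (1, 3),
                  'In': {'player': 1,
                         'moves': {'Fight': (0, 0),
                                   'Accommodate': (2, 1)}}}}

payoffs, plan = backward_induction(tree)
assert payoffs == (2, 1)                          # SPE outcome
assert plan == [(0, 'In'), (1, 'Accommodate')]    # SPE path
```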

Ex. (Centipede) Two players are deciding on a division of an investment. A pot of money starts with amount \(4\), the two players take turns, and the pot gains \(1\) each turn.

At each player's turn, when the pot contains \(p\), the player can choose to split the money, taking the share \(\left\lfloor \frac{p+4}{2} \right\rfloor\) and leaving the rest to the other player; this ends the game. Or, the player can do nothing and pass his turn. When the pot reaches \(100\), the two players split it evenly and the game ends.

The SPE has each player choosing to split at every node, so the first player splits immediately.

Ex. (Matching pennies)

SPE: \(s_1^*=(A,\frac{1}{2}H+\frac{1}{2}T,\frac{1}{2}H+\frac{1}{2}T),s_2^*=(\frac{1}{2}H+\frac{1}{2}T,\frac{1}{2}H+\frac{1}{2}T)\).

Thm. Any finite extensive form game has a subgame perfect equilibrium.

Prop. (Zermelo’s Theorem) Any finite game of perfect information has a pure strategy SPE.

Stackelberg Game

Ex. Cournot as a Stackelberg Game: each firm chooses \(q_i\in[0,1]\), and the price is \(p=1-q_1-q_2\). Firm 1 moves first, then firm 2 moves. Utility is \(u_i=q_i(1-q_1-q_2)\), and the best response is \(BR(q_{-i})=(1-q_{-i})/2\).

  • NE for normal form: \(q_1=q_2=\frac{1}{3}\).

  • NE for Stackelberg: \(q_1=0,q_2(q_1=0)=\frac{1}{2},q_2(q_1>0)=1\).

  • SPE for Stackelberg: \(q_1=\frac{1}{2},q_2=\frac{1}{4}\).
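The three solutions above can be checked numerically. A minimal sketch, assuming the linear demand \(p = 1 - q_1 - q_2\) and zero cost from the example (the grid search is an illustration, not an exact method):

```python
# Stackelberg Cournot sketch: firm 2 best-responds, firm 1 commits first.
# Assumes linear inverse demand p = 1 - q1 - q2 and zero marginal cost.

def br(q_other):
    """Best response BR(q) = (1 - q) / 2 from the first-order condition."""
    return (1.0 - q_other) / 2.0

def leader_payoff(q1):
    q2 = br(q1)                      # follower best-responds to the commitment
    return q1 * (1.0 - q1 - q2)

# Simultaneous-move (Cournot) NE: fixed point of the best responses.
q = 0.5
for _ in range(100):
    q = br(q)                        # symmetric fixed-point iteration
print(round(q, 4))                   # -> 0.3333 (= 1/3)

# Stackelberg SPE: leader maximizes over a grid of commitments.
grid = [i / 10000 for i in range(10001)]
q1_star = max(grid, key=leader_payoff)
print(round(q1_star, 4), round(br(q1_star), 4))   # -> 0.5 0.25
```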

Def. Stackelberg Game consists of a leader (\(l\)) and a follower (\(f\)). The objective of the follower is to choose her strategy to best respond to the leader:

\[BR(\sigma_l) = \arg \max_{\sigma_f \in \Delta_f} u_f(\sigma_l, \sigma_f). \]

The goal of the leader is to maximize the leader's utility subject to the follower’s best response:

\[\max_{\sigma_l \in \Delta_l} u_l(\sigma_l, \sigma_f) \quad \text{s.t.} \quad \sigma_f \in BR(\sigma_l). \]

Rem. The issue is that $ BR(\sigma_l) $ may be set-valued, and $ u_l(\sigma_l, \sigma_f) $ generally differs depending on which $ \sigma_f \in BR(\sigma_l) $ is chosen.

Def. In Strong Stackelberg equilibrium (SSE), we assume that the follower breaks ties in favor of the leader, i.e.,

\[\max_{\sigma_l \in \Delta_l, \sigma_f \in BR(\sigma_l)} u_l(\sigma_l, \sigma_f). \]

Def. In Weak Stackelberg equilibrium (WSE), we assume that the follower breaks ties adversarially,

\[ \max_{\sigma_l \in \Delta_l} \min_{\sigma_f \in BR(\sigma_l)} u_l(\sigma_l, \sigma_f) . \]

Rem. We can restrict attention to SSE: the leader can approximately enforce the SSE outcome by slightly perturbing her strategy so that the follower's tie is broken strictly (note that a WSE may not even exist).

Thm. In a Stackelberg game, the leader achieves weakly more utility in SSE than in any Nash equilibrium.

Pf. Consider the Nash equilibrium (\(\sigma^*_l\), \(\sigma^*_f\)) that yields the highest utility for the leader.

Since the follower breaks ties in favor of the leader, we get that if the leader commits to $ \sigma^* _l $, then the follower can at worst pick $ \sigma^* _f $ from $ BR(\sigma^* _l) $.

Otherwise, the follower must pick something that yields even better utility for the leader.

Ex. (Inspection Game) An inspector chooses whether or not to inspect, and the inspectee chooses whether or not to cheat.

Cheat Not Cheat
Inspect (-6,-9) (-1,0)
Not Inspect (-10,1) (0,0)
  • NE: the inspector inspects with probability 1/10, and the inspectee cheats with probability 1/5, yielding an expected payoff of (−2,0) for the two players.

  • SSE: the inspector to inspect with probability 1/10 and the inspectee to “not cheat”. This yields expected utilities (−1/10,0).

  • WSE: does not exist.

Rem. It is enough to consider only pure strategies for the follower.
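To illustrate the remark, the SSE of the inspection game can be found by searching over the leader's mixed strategies and, for each, letting the follower pick a pure best response with ties broken in the leader's favor. A numerical sketch using the payoffs from the table above (the grid search is an illustration, not an exact algorithm):

```python
# Inspection game: leader (inspector) commits to inspect with prob q.
# Payoffs (leader, follower): I/N = Inspect / Not Inspect, C/N = Cheat / Not Cheat.
U_L = {("I", "C"): -6, ("I", "N"): -1, ("N", "C"): -10, ("N", "N"): 0}
U_F = {("I", "C"): -9, ("I", "N"): 0, ("N", "C"): 1, ("N", "N"): 0}

def payoffs(q, f):
    """Expected payoffs when the leader inspects w.p. q and the follower plays f."""
    ul = q * U_L[("I", f)] + (1 - q) * U_L[("N", f)]
    uf = q * U_F[("I", f)] + (1 - q) * U_F[("N", f)]
    return ul, uf

def leader_value(q):
    # Follower's pure best responses; ties broken in favor of the leader (SSE).
    best = max(payoffs(q, f)[1] for f in "CN")
    return max(payoffs(q, f)[0] for f in "CN"
               if abs(payoffs(q, f)[1] - best) < 1e-9)

grid = [i / 1000 for i in range(1001)]
q_sse = max(grid, key=leader_value)
print(q_sse, round(leader_value(q_sse), 3))   # -> 0.1 -0.1
```

The maximum is attained exactly at the follower's indifference point \(q = 1/10\), matching the SSE computed above.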

Ex. (Security Game) A defender (the leader) is interested in protecting a set of targets using limited resource $ r \in R $, while an attacker (the follower) is able to observe the strategy of the leader, and best respond to it.

Let $ u^c_d(t) $ denote the defender's utility if target $ t $ is attacked and covered, and $ u^u_{d}(t) $ if $ t $ is attacked and uncovered. Similarly, $ u^c_a(t) $ denotes the attacker’s utility if target $ t $ is attacked and covered, and $ u^u_a(t) $ if $ t $ is attacked and uncovered.

Imagine a mixed strategy for the defender to allocate guards to cover $ T $ gates.

We maximize the defender utility by choosing the coverage strategy \(c_{r,t}\), subject to making some target \(t^*\) a best response for the attacker. Specifically, each resource can cover exactly one target (e.g., a guard on a gate), \(c_{r,t}\) is the probability of using resource \(r\) to cover target \(t\), and \(c_t\) is the total probability of target \(t\) being covered. Let

\[ u_a(t | c) = c_t u_a^c(t) + (1 - c_t) u_a^u(t), \]

and \(u_d(t | c)\) defined analogously. We solve the following optimization problem:

\[\max_c u_d(t^* | c) \]

subject to:

\[c_t = \sum_{r \in R} c_{r,t} \leq 1, \quad \forall t \in T \quad (\text{probability constraints}) \]

\[\sum_{t \in T} c_{r,t} \leq 1, \quad \forall r \in R \quad (\text{resource constraints}) \]

\[u_a(t | c) \leq u_a(t^* | c), \quad \forall t \in T \quad (\text{make } t^* \text{ as BR for attacker}) \]

We can then solve for each \(t^* \in T\), and pick the best for the defender. An intuition is that the attacker is induced to attack the target "desired" by the defender.
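For a toy instance — one guard, two identical gates, with hypothetical payoffs \(u_d^c = 0\), \(u_d^u = -1\), \(u_a^c = -1\), \(u_a^u = 1\) (all numbers are assumptions for illustration) — the program reduces to choosing a coverage vector on a line, which we can sketch by brute force instead of an LP solver:

```python
# Security game sketch: 1 resource, 2 targets, hypothetical payoffs.
# Defender: 0 if the attacked target is covered, -1 if uncovered.
# Attacker: -1 if covered, 1 if uncovered.
def u_d(t, c): return c[t] * 0 + (1 - c[t]) * (-1)
def u_a(t, c): return c[t] * (-1) + (1 - c[t]) * 1

best = None
for i in range(1001):
    c = (i / 1000, 1 - i / 1000)          # spend the whole resource
    # attacker best-responds; ties broken in the defender's favor
    top = max(u_a(t, c) for t in (0, 1))
    val = max(u_d(t, c) for t in (0, 1) if u_a(t, c) == top)
    if best is None or val > best[0]:
        best = (val, c)

print(best)   # -> (-0.5, (0.5, 0.5)): split coverage evenly
```

As expected, the defender spreads coverage evenly over the two symmetric gates.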

Repeated Games

We study the repeated strategic form game \(G = \{I, \{A_i\}_{i \in I}, \{g_i\}_{i \in I}\}\) for \(T\) periods. Denote the repeated game as \(G^T\). We assume perfect monitoring: the actions of all past periods are observed by all players.

Payoff of player \(i\) at stage \(t\): \(g_i(a_i^t, a_{-i}^t)\), where \(a^t = (a_i^t, a_{-i}^t)\) is the action profile at period \(t\). Usually, the total payoff is given by:

\[u_i(a_i,a_{-i})=\sum\limits_{t=0}^{T}\delta^tg_i(a_i^t,a_{-i}^t) \]

where \(\delta\in[0,1)\) is the discount factor.

Finitely Repeated Games

Thm. Consider repeated game \(G^T(\delta)\) for \(T < \infty\). Suppose that the stage game \(G\) has a unique pure strategy equilibrium \(a^*\). Then the repeated game \(G^T\) has a unique SPE. In this unique SPE, \(a^t = a^*\) for each \(t = 0, 1, \dots, T\) regardless of history.

Pf. Backward induction.

Infinite Game

Def. An infinite game is a multi-stage game that may contain infinitely many stages.

Def. An infinite extensive form game \(G\) is continuous at \(\infty\) if for any two strategies \(\sigma\) and \(\sigma'\) such that \(\sigma(h) = \sigma'(h)\) for all histories \(h\) in stages \(t \leq T\), the payoff function \(u_i\) satisfies

\[\lim_{T \to \infty} \sup_{i,\sigma,\sigma'} |u_i(\sigma) - u_i(\sigma')| = 0. \]

Thm. (One-Shot Deviation Principle for Infinite Game) In an infinite-horizon multi-stage game with observed actions that is continuous at infinity, profile \(\sigma\) is subgame perfect if and only if there is no player \(i\) and strategy \(\hat{\sigma}_i\) that agrees with \(\sigma_i\) except at a single stage \(t\) and history \(h^t\), and such that \(\hat{\sigma}_i\) is a better response to \(\sigma_{-i}\) conditional on history \(h^t\) being reached.

Ex. (Rubinstein Bargaining Model) Two players divide one dollar by proposing alternatively.

  • In periods \(0,2,4,\dots\) player 1 offers the division \((x_1, 1 - x_1)\). Player 2 then accepts and the game ends, or he rejects and the play continues.
  • In periods \(1,3,5,\dots\) player 2 offers the division \((1 - x_2, x_2)\). Player 1 then accepts and the game ends, or rejects and the play continues.
  • Assume that if the division \((y, 1 - y)\) is agreed to in period \(t\), then the payoffs are \(\delta^t y\) and \(\delta^t (1 - y)\).

Essentially, each player faces the same game in alternating periods with roles interchanged. Therefore, at a stationary equilibrium, player 1 proposes \((x, 1 - x)\), while player 2 proposes \((1 - x, x)\).

At equilibrium, we should have that player 1’s proposal to player 2 makes player 2 indifferent between accepting immediately or waiting for the next round.

Thus, we have: \(1 - x = \delta x\), which gives \(x^* = \frac{1}{1 + \delta}\).
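The indifference condition \(1 - x = \delta x\) can also be reached by iterating the alternating-offer logic directly — a minimal sketch:

```python
# Rubinstein bargaining: the proposer's share x solves x = 1 - delta * x.
# Iterate x <- 1 - delta * x; this contraction converges for delta < 1.
def proposer_share(delta, iters=200):
    x = 1.0                      # start from "proposer takes everything"
    for _ in range(iters):
        x = 1.0 - delta * x      # responder must get what waiting is worth
    return x

print(round(proposer_share(0.5), 6))   # -> 0.666667 (= 1/(1 + 0.5))
```

Note that the proposer's share \(1/(1+\delta)\) exceeds \(1/2\): moving first confers an advantage that shrinks as \(\delta \to 1\).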

Def. (Trigger Strategies) A trigger strategy threatens other players with a punishment action if they deviate from an (implicitly agreed) action profile. A non-forgiving trigger strategy (or grim trigger strategy) carries out this punishment forever after a single deviation by the other players.

Rem. A non-forgiving trigger strategy (for player \(i\)) takes the following form:

\[a_i^t = \begin{cases} \bar{a}_i & \text{if } a_\tau = \bar{a} \text{ for all } \tau < t \\ \underline{a}_i & \text{if } a_\tau \neq \bar{a} \text{ for some } \tau < t \end{cases} \]

Here \(\bar{a}\) is the implicitly agreed action profile and \(\underline{a}_i\) is the punishment action.

This strategy is non-forgiving since a single deviation from \(\bar{a}\) induces player \(i\) to switch to punishment action \(\underline{a}_i\) forever.

Ex. (Infinitely Repeated Prisoner’s Dilemma)

Cooperate Defect
Cooperate (1, 1) (-1, 2)
Defect (2, -1) (0, 0)

Prop. In the infinitely repeated Prisoner's Dilemma with \(\delta \geq 1/2\), there exists an SPE in which the outcome is that both players cooperate in every period.

Pf. Consider the following symmetric trigger strategy profile:

\[s_i(h_t) = \begin{cases} C & \text{if both players have played C in every period} \\ D & \text{if either player has ever played D in the past} \end{cases}. \]
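The threshold \(\delta \geq 1/2\) in the proposition can be verified by comparing discounted payoff streams; a sketch using the payoff matrix above:

```python
# Grim trigger in the repeated Prisoner's Dilemma (payoffs from the table):
# cooperate forever: 1 each period; best one-shot deviation: 2 once, then 0.
def coop_value(delta):
    return 1.0 / (1.0 - delta)          # 1 + delta + delta^2 + ...

def deviation_value(delta):
    return 2.0                          # 2 today, punished to 0 forever after

for delta in (0.4, 0.5, 0.6):
    sustains = coop_value(delta) >= deviation_value(delta)
    print(delta, sustains)
# -> 0.4 False
#    0.5 True
#    0.6 True
```

Cooperation is sustainable exactly when \(1/(1-\delta) \ge 2\), i.e. \(\delta \ge 1/2\).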

Def. Set of feasible payoffs is defined as:

\[V = \text{Conv}\{v \in \mathbb{R}^I \mid \text{there exists } a \in A \text{ such that } g(a) = v\}. \]

Def. Player \(i\)'s minmax payoff is the lowest payoff that player \(i\)'s opponent can hold him/her to:

\[\underline{v}_i = \min_{\alpha_{-i}} \max_{\alpha_i} g_i(\alpha_i, \alpha_{-i}). \]

Minmax strategy profile against \(i\) (\(m^i\)):

\[m_{-i}^i = \arg\min_{\alpha_{-i}} \left[ \max_{\alpha_i} g_i(\alpha_i, \alpha_{-i}) \right]. \]

Let \(m_i^i\) denote the corresponding strategy of player \(i\) such that

\[g_i(m_i^i, m_{-i}^i) = g_i(m^i) = \underline{v}_i. \]

Prop. Individual rationality (IR): In any Nash equilibrium, player \(i\) must receive at least \(\underline{v}_i\).

Pf. Suppose \((\alpha_i, \alpha_{-i})\) is a Nash equilibrium of the stage game \(G\). Let \(\alpha'_i\) be a static best response to \(\alpha_{-i}\). Then \((\alpha'_i, \alpha_{-i})\) gives \(i\) a payoff of at least \(\underline{v}_i\) (by definition of the minmax payoff, no \(\alpha_{-i}\) can hold \(i\) below \(\underline{v}_i\) when \(i\) best responds), and playing \(\alpha_i\) must give at least this much.

Thm. (Nash Folk Theorem) If \((v_1, \dots, v_I)\) is feasible and strictly individual rational, then there exists \(\underline{\delta} < 1\) such that for all \(\delta \geq \underline{\delta}\), there is a Nash equilibrium of \(G^\infty(\delta)\) with payoffs \((v_1, \dots, v_I)/(1 - \delta)\).

Pf. (1) Assume there exists a pure strategy profile \(a = (a_1, \dots, a_I)\) such that \(g_i(a) = v_i\) for all \(i\). Consider the following strategies:

  • I. Play \((a_1, \dots, a_I)\) as long as no one deviates.
  • II. If some player \(j\) deviates, play \(m_{-j}^{j}\) thereafter.

(2) If the payoff cannot be generated by pure actions, then we replace the action profile \(a\) with a public randomization \(a(\omega)\) yielding payoffs with expected value \((v_1, \dots, v_I)\).

To ensure player \(i\) follows the rule after observing the public randomization (even for the worst realization of \(a(\omega)\)), a safe (sufficient) condition on the discount factor is

\[\max_a g_i(a) + \frac{\delta \cdot \underline{v}_i}{1 - \delta} \leq \min_a g_i(a) + \frac{\delta \cdot v_i}{1 - \delta}. \]

Thm. (Friedman) Let \(a^{NE}\) be a static equilibrium of the stage game with payoffs \(e^{NE}\). For any feasible payoff \(v\) with \(v_i > e^{NE}_i\) for all \(i \in I\), there exists some \(\underline{\delta} < 1\) such that for all \(\delta \geq \underline{\delta}\), there exists a subgame perfect equilibrium of \(G^\infty(\delta)\) with payoffs \(v\).

Pf. Similarly, construct the non-forgiving trigger strategies with punishment by the static Nash Equilibrium. Punishments are therefore subgame perfect.

Ex. The Bertrand game:

  • Two firms have marginal production cost \(c > 0\). The market has a decreasing demand curve \(q = D(p)\). Firms can set any price \(p \geq 0\).
  • Customers choose the firm with lower price. (firms with equal prices equally split the market).

The unique NE is \(p_1 = p_2 = c.\)

Now consider Infinitely Repeated Bertrand Game with discount factor \(\delta\).

Denote the maximum profit of monopoly:

\[\Pi^M = \max_{p \geq 0} (p - c)D(p). \]

The following trigger strategy profile is an SPE in which the two firms together earn the monopoly profit.

  • Each firm sets its price to the profit-maximizing \(p^m\) in each period starting from \(t = 1\), as long as no firm has deviated from this price in previous periods.
  • After a deviation, each firm sets its price equal to \(c\) forever.

To check subgame perfection, consider one shot deviation. There are two types of histories:

  • On the equilibrium path (i.e., no deviation so far), the best deviation is to undercut the other firm slightly, capture the entire market for one period, and earn zero profit afterwards. The deviation must not be profitable:

    \[\Pi^M\le \frac{1}{1-\delta}\frac{\Pi^M}{2}, \]

    which gives \(\delta\ge \frac{1}{2}\).

  • Off the equilibrium path, both firms price at \(c\); no matter how a firm deviates, it can earn at most \(0\). Thus, there is no profitable deviation.

When the market fluctuates between booms and recessions — the monopoly profit in each period is high (\(\Pi_H^M\), w.p. \(p\)) or low (\(\Pi_L^M\)) — the binding no-deviation condition is in a boom:

\[\Pi_H^M\le \frac{\Pi_H^M}{2}+\frac{\delta}{1-\delta}\left(p\frac{\Pi_H^M}{2}+(1-p)\frac{\Pi_L^M}{2}\right). \]

Ex. (Efficiency Wage)

A firm and a worker play a two period game. In the first period the firm sets a wage \(w\). In the second period, the worker observes the wage and decides whether to accept or reject the job.

If she rejects she has an outside option \(w_0\). If she accepts she can exert effort or exert no effort.

If she exerts no effort she will produce output \(y > 0\) with probability \(p\) and \(0\) otherwise. If she exerts effort she will produce y for sure (this implies that output being \(0\) is a sure sign of shirking). Exerting effort has cost \(e\) to the worker.

To prevent shirking, the firm has to pay the worker a higher wage \(w^* > w_0\) and commit to firing her once she is detected shirking (i.e., once output \(0\) is observed).

Bayesian Games

Def. A Bayesian game consists of:

  • A set of players \(I\);
  • A set of actions (pure strategies) for each player \(i: S_i\);
  • A set of types for each player \(i: \theta_i \in \Theta_i\);
  • A payoff function for each player \(i: u_i(s_1, \dots, s_I, \theta_1, \dots, \theta_I)\) (can be extended to mixed strategy);
  • A (joint) probability distribution \(p(\theta_1, \dots, \theta_I)\) over types (or \(P(\theta_1, \dots, \theta_I)\) when types are not finite).

Rem. Essentially any private information (more precisely, an information that is not common knowledge to all players) can be included in the description of the type.

Thm. (Harsanyi’s transformation) Games of incomplete information can be thought of as games of complete but imperfect information where nature makes the first move and not everyone is informed about nature’s move, i.e. nature chooses a \(\theta\) but only reveal \(\theta_i\) to player \(i\).

Def. (Bayesian Nash Equilibrium) The strategy profile \(\sigma\) is a Bayesian Nash equilibrium if for all \(i \in I\) and for all \(\theta_i \in \Theta_i\), we have:

\[\sigma_i(\theta_i) \in \arg \max_{\sigma_i' \in \Sigma_i} \sum_{\theta_{-i}} p(\theta_{-i} | \theta_i) u_i(\sigma_i', \sigma_{-i}(\theta_{-i}), \theta_i, \theta_{-i}) \]

Ex1. (Public Good I)

Contribute Don't
Contribute \((1 - c_1, 1 - c_2)\) \((1 - c_1, 1)\)
Don't \((1, 1 - c_2)\) \((0, 0)\)

Types: \(\Theta_1=\{c_1\}\), \(\Theta_2=\{\underline{c},\overline{c}\}\), where \(p\) denotes the probability that \(c_2=\underline{c}\). Assume that \(c_1 < \frac{1}{2}\), \(0 < \underline{c} < 1 < \bar{c}\), and \(p < \frac{1}{2}\).

The unique BNE is \(s_1^* = contribute\) for player 1, and \(s^*_2 = don't\) for all \(c\in\Theta_2\) of player 2.

Ex2. (Public Good II)

Same payoff matrix but both \(c_1\) and \(c_2\) are drawn independently from a uniform distribution on \([0,2]\).

There exists \(c^*\) so that the player will contribute if \(c\le c^*\), otherwise don't. We have

\[u_1(contribute,c=c^*)=1-c^*=\frac{c^*}{2}=u_1(don't,c=c^*). \]

Solving it out gives \(c^*=\frac{2}{3}.\)

Ex3. (Incomplete Information Cournot)

Suppose that two firms both produce at constant marginal cost. Inverse demand function (i.e., price) is given by \(P(Q)\) as in Cournot.

Firm 1 has marginal cost equal to \(C\) (and this is common knowledge).

Firm 2's marginal cost is private information. It is equal to \(C_L\) with probability \(\theta\) and to \(C_H\) with probability \((1 - \theta)\), where \(C_L < C_H\). \(\Theta_2=\{L,H\}\)

The possible actions of each player are \(q_i \in [0, \infty)\). Payoff function:

\[u_1(q_1, q_2) = q_1 (P(q_1 + q_2) - C) \]

\[u_2((q_1, q_2), \theta_2) = q_2 (P(q_1 + q_2) - C_{\theta_2}) \]

An equilibrium \((q_1,q_L,q_H)\) should satisfy:

\[q_1=BR_1(q_L,q_H)=\arg\max_{q_1\ge 0}\{\theta q_1(P(q_1+q_L)-C)+(1-\theta)q_1(P(q_1+q_H)-C)\}\\ q_L=BR_L(q_1)=\arg\max_{q_L\ge 0}\{q_L(P(q_1+q_L)-C_L)\}\\ q_H=BR_H(q_1)=\arg\max_{q_H\ge 0}\{q_H(P(q_1+q_H)-C_H)\}. \]
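With linear demand \(P(Q) = 1 - Q\), the three first-order conditions become linear and can be solved by best-response iteration. A sketch with illustrative parameter values (\(C = 0.1\), \(C_L = 0.05\), \(C_H = 0.2\), \(\theta = 0.5\) are assumptions, not from the notes):

```python
# Incomplete-information Cournot with linear demand P(Q) = 1 - Q.
# FOCs: q1 = (1 - C - (theta*qL + (1-theta)*qH)) / 2,
#       qL = (1 - C_L - q1) / 2,  qH = (1 - C_H - q1) / 2.
C, C_L, C_H, theta = 0.1, 0.05, 0.2, 0.5   # illustrative parameters

q1 = qL = qH = 0.0
for _ in range(200):                        # best-response iteration (contraction)
    q1 = (1 - C - (theta * qL + (1 - theta) * qH)) / 2
    qL = (1 - C_L - q1) / 2
    qH = (1 - C_H - q1) / 2

# Closed form: q1 = (1 - 2C + Cbar) / 3 with Cbar = theta*C_L + (1-theta)*C_H.
Cbar = theta * C_L + (1 - theta) * C_H
print(round(q1, 4), round((1 - 2 * C + Cbar) / 3, 4))   # both -> 0.3083
```

Note that firm 1 best-responds to the *expected* quantity \(\theta q_L + (1-\theta) q_H\), while each type of firm 2 best-responds to \(q_1\) exactly.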

Ex4. (First Price Auction)

Consider a first-price auction, where some risk-neutral players bid for one item.

Each player’s valuation of the item is independently and identically distributed on the interval \([0, \bar{v}]\) with cumulative distribution function \(F\), with continuous density \(f\) and full support on \([0, \bar{v}]\). The distribution \(F\) is common knowledge.

The bidder with higher bid wins (tie breaks equally), and pays his bid. Thus, the winner \(i\) gets payoff \(v_i - b_i\), others get payoff \(0\).

Bidder \(i\)'s strategy is \(\beta_i : [0, \bar{v}] \to [0, \infty)\). We will then characterize \(\beta\) such that when all other bidders play \(\beta\), \(\beta\) is also a best response for bidder 1.

We can calculate that

\[u_1(b_1,v_1)=(v_1-b_1)\text{Pr}[b_j=\beta(v_j)\le b_1,\forall j\neq 1]. \]

Thus

\[b_1=\arg\max_{b_1}F^{n-1}(\beta^{-1}(b_1))(v_1-b_1). \]

The first order condition gives:

\[(v_1 - b_1)(n - 1) F^{n-2}(\beta^{-1}(b_1)) f(\beta^{-1}(b_1)) \frac{1}{\beta'(\beta^{-1}(b_1))} - F^{n-1}(\beta^{-1}(b_1)) = 0\\ \Rightarrow[\beta(v)F^{n-1}(v)]^\prime=v[F^{n-1}(v)]^\prime \]

Solving it with the boundary condition \(\beta(0)=0\) gives

\[\beta(v)=v-\frac{\int_0^vF^{n-1}(x)\mathrm{d}x}{F^{n-1}(v)}=\mathbb{E}[\max(v_2,...,v_n)|v_2,...,v_n<v]. \]
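For \(F\) uniform on \([0,1]\) the formula specializes to \(\beta(v) = \frac{n-1}{n} v\); a numerical sketch checks the integral expression against this closed form:

```python
# First-price auction bid function for F uniform on [0, 1]:
# beta(v) = v - (integral_0^v F^{n-1}(x) dx) / F^{n-1}(v) = (n-1)/n * v.
def beta(v, n, steps=100_000):
    if v == 0:
        return 0.0
    h = v / steps
    # trapezoid rule for integral_0^v x^(n-1) dx (since F(x) = x)
    integral = h * (sum((i * h) ** (n - 1) for i in range(1, steps))
                    + 0.5 * (0.0 + v ** (n - 1)))
    return v - integral / v ** (n - 1)

for n, v in [(2, 0.5), (3, 0.6)]:
    print(round(beta(v, n), 4), round((n - 1) / n * v, 4))
# -> 0.25 0.25
#    0.4 0.4
```

Intuitively, each bidder shades her bid down to her conditional expectation of the highest rival value, and the shading shrinks as \(n\) grows.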

Perfect Bayesian Equilibrium

Def. A Perfect Bayesian Equilibrium is a strategy profile \(\sigma^*\) together with a belief system \(\mu\) such that:

  • At every information set, strategies are optimal given beliefs and opponents' strategies (sequential rationality).

\[\sigma_i^*\left(h_i\right) \in \arg \max \mathbb{E}_{x\sim\mu_i\left(x \mid h_i\right)} u_i\left(\sigma_i^*, \sigma_{-i}^* \mid x, \theta_i, \theta_{-i}\right) \]

  • Beliefs are always updated according to Bayes rule when applicable.

Rem. Let’s have a digest of the requirements in the definition:

  • Requirement 1: At each information set, the player with the move must have a belief about which node in the information set has been reached by the play of the game. For a non-singleton information set, a belief is a probability distribution over the nodes in the information set.
  • Requirement 2: Given their beliefs, at each information set, the action taken by the player with the move (and the player's subsequent strategy) must be optimal given the player's belief at that information set and the other players' subsequent strategies (where a "subsequent strategy" is a complete plan of action covering every contingency that might arise after the given information set has been reached).
  • Requirement 3: At information sets on the equilibrium path, beliefs are determined by Bayes rule and the players' equilibrium strategies.
  • Requirement 4: At information sets off the equilibrium path (i.e., information sets reached with zero probability), beliefs can be arbitrarily chosen.

Rem. A refinement of PBE, "Sequential equilibrium," further requires that beliefs on off-equilibrium paths should be determined by Bayes' rule and the players' equilibrium strategies as if they are reached by "trembling hands".

Signaling Games

Ex. (Signaling Games)

  • A signaling game involves two players: a Sender (S) and a Receiver (R).
  • Nature draws a type \(t_i\) for the Sender from a set of feasible types \(T=\{t_1, ...,t_I\}\) according to a probability distribution where \(p\left(t_i\right)>0\) for every \(i\) and \(p\left(t_1\right)+\ldots+p\left(t_I\right)=1\).
  • The Sender observes \(t_i\) and then chooses a message \(m_j\) from a set of feasible messages \(M=\{m_1, ..., m_J\}\).
  • The Receiver observes \(m_j\) (but not \(t_i\)) and then chooses an action \(a_k\) from a set of feasible actions \(A=\{a_1, ..., a_K\}\).
  • Payoffs are given by \(u_S\left(t_i, m_j, a_k\right)\) and \(u_R\left(t_i, m_j, a_k\right)\).

Now, let's check the requirements.

  • Signaling Requirement 1: After observing any message \(m_j\), the Receiver must have a belief about which types could have sent \(m_j\), denoted as \(\mu\left(t_i \mid m_j\right) \geq 0\), and

    \[\sum_{t_i \in T} \mu\left(t_i \mid m_j\right)=1 \]

  • Signaling Requirement 2R: For each \(m_j\) in \(M\), the Receiver's action \(a^*(m_j)\) must maximize the Receiver's expected utility, given the belief \(\mu\left(t_i \mid m_j\right)\) about which types could have sent \(m_j\): \(a^*(m_j)\) solves

    \[\max_{a_k \in A} \sum_{t_i \in T} \mu\left(t_i \mid m_j\right) u_R\left(t_i, m_j, a_k\right). \]

  • Signaling Requirement 2S: For each \(t_i\) in \(T\), the Sender's message \(m^*\left(t_i\right)\) must maximize the Sender's utility, given the Receiver's strategy \(a^*\left(m_j\right)\). That is, \(m^*(t_i)\) solves

    \[\max _{m_j \in M} u_S\left(t_i, m_j, a^*\left(m_j\right)\right) \]

  • Signaling Requirement 3: For each \(m_j\) in \(M\), if there exists \(t_i\) in \(T\) such that \(m^*\left(t_i\right)=m_j\), then the Receiver's belief at the information set corresponding to \(m_j\) must follow from Bayes rule and the Sender's strategy:

    \[\mu\left(t_i \mid m_j\right)=\frac{p\left(t_i\right)}{\sum_{t' \in T_j} p\left(t'\right)} \]

Here \(T_j\) denotes the set of types that send the message \(m_j\) in equilibrium.

Ex. (Spence’s Job-Market Signaling: Nobel Prize Laureate 2001)

  • Stage 0: Nature chooses the ability \(\theta\) of a worker. Suppose \(\Theta = \{ 2,3 \}\) and that \(\text{Prob}(\theta = 2) = p\) and \(\text{Prob}(\theta = 3) = 1 - p\).

  • Stage I: Player 1 (worker) observes his type and chooses education level \(e \in \{0,1\}\). Education has cost \(\frac{ce}{\theta}\). Note, that higher ability workers have lower cost and that getting no education is costless.

  • Stage II: Player 2 (the competitive labor market) chooses the wage rate \(w(e)\) of workers after observing the education level.

  • Utility Functions: \(u_1(e, w; \theta) = w - \frac{ce}{\theta}\) and \(u_2(e, w; \theta) = -(w - \theta)^2\).

Two-player Zero-sum Game

Def. A two-player zero-sum game is a two-player normal form game, where the two players, the row player and the column player, have \(m\) and \(n\) pure strategies, respectively. The game is specified by a pair \((R,C)\) of \(m\times n\) (finite) payoff matrices, satisfying \(R=-C\).

Players may have mixed strategies. Denote the simplex of available mixed strategies:

  • Row player:

    \[\Delta_m = \left\{ \mathbf{x} \in \mathbb{R}^m : x_i \geq 0, \sum_{i=1}^m x_i = 1 \right\} \]

  • Column player:

    \[\Delta_n = \left\{ \mathbf{y} \in \mathbb{R}^n : y_j \geq 0, \sum_{j=1}^n y_j = 1 \right\} \]

Given mixed strategies \(\mathbf{x} \in \Delta_m\) and \(\mathbf{y} \in \Delta_n\), the expected payoff of the row player and column player are \(\mathbf{x}^T R \mathbf{y}\) and \(\mathbf{x}^T C \mathbf{y}\), respectively.

Thm (Von Neumann's Minimax Theorem). For any two-player zero-sum game with \(m \times n\) payoff matrices \(R,C=-R\), there exists a number \(z\) (value of the game) satisfying:

\[\max_{\mathbf{x} \in \Delta_m} \min_{\mathbf{y} \in \Delta_n} \mathbf{x}^T R \mathbf{y} = z=\min_{\mathbf{y} \in \Delta_n} \max_{\mathbf{x} \in \Delta_m} \mathbf{x}^T R \mathbf{y} \]

Cor. There exists a Nash equilibrium in every two-player zero-sum game.

Prop. If the payoff matrix of a zero-sum game is antisymmetric, i.e., \(R=-R^T\), then the game has value \(0\).
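For \(2\times 2\) games the value can be computed directly: check for a pure saddle point, and otherwise use the standard mixed-strategy formula \(z = \frac{r_{11}r_{22} - r_{12}r_{21}}{r_{11}+r_{22}-r_{12}-r_{21}}\). A sketch:

```python
# Value of a 2x2 zero-sum game (R = row player's payoff matrix).
def value_2x2(R):
    maximin = max(min(row) for row in R)              # row player's guarantee
    minimax = min(max(R[0][j], R[1][j]) for j in range(2))
    if maximin == minimax:                            # pure saddle point
        return maximin
    # no saddle point: both players mix; closed form from indifference
    det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
    return det / (R[0][0] + R[1][1] - R[0][1] - R[1][0])

matching_pennies = [[1, -1], [-1, 1]]
print(value_2x2(matching_pennies))        # -> 0.0 (antisymmetric => value 0)
print(value_2x2([[2, 1], [0, -1]]))       # -> 1 (pure saddle point)
```

Matching pennies is antisymmetric, so its value is \(0\), consistent with the proposition above.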

Mechanism Design

Design a set of rules that interact with selfish agents and implement an outcome.

Social Choice

Consider a set of alternatives \(A\) (the candidates) and a set \(I\) of \(n\) voters. Denote by \(L\) the set of linear orders on \(A\) (\(L\) is isomorphic to the set of permutations on \(A\)). Every \(\succ \in L\) is a complete and transitive order on \(A\). Let \(\succ_i \in L\) denote voter \(i\)'s preference.

Def. A function \(W: L^n \rightarrow L\) is called a social welfare function. A function \(f: L^n \rightarrow A\) is called a social choice function.

Desired Properties for Social Welfare Functions:

  • Condorcet winner criterion.
  • Unanimity: if \(a\succ_i b\) for all voter \(i\), then in the social choice \(a\succ b\).
  • Independent of irrelevant alternatives (IIA): the social preference between any two alternatives \(a\) and \(b\) depends only on the voters’ preferences between \(a\) and \(b\).
  • Non-dictatorial: no such \(i\) exists that for all \(\succ_1,...,\succ_n\), \(W(\succ_1,...,\succ_n)=\succ_i\).
  • Monotone: moving only candidate A higher in one’s ranking should not move A down in the society ranking

Thm (Arrow’s Impossibility Theorem). Every social welfare function over a set of more than \(2\) candidates (\(|A|\ge 3\)) that satisfies unanimity and independence of irrelevant alternatives is a dictatorship.

Pf. First, we have the following lemma:

Lem. For any candidate \(b\), if for all \(\succ_1,...,\succ_n\), \(b\) is ranked either strictly best or strictly worst, then in \(\succ^*=W(\succ_1,...,\succ_n)\), \(b\) is also ranked either strictly best or strictly worst.

Pf. Suppose not, i.e., \(a\succ^*b\succ^*c\) for some \(a,c\). Move \(c\) above \(a\) in every voter's ranking; since \(b\) is extremal for every voter, no \((a,b)\) or \((b,c)\) comparison changes, so by IIA we still have \(a\succ^*b\succ^*c\), hence \(a\succ^*c\) by transitivity. But unanimity now requires \(c\succ^*a\), a contradiction. \(\qquad \square\)

Now let's find the pivotal voter. For some candidate \(b\), consider \(n+1\) profiles: in the \(i\)-th profile, \(b\) is ranked best in \(\succ_1,...,\succ_{i-1}\), and worst in \(\succ_i,...,\succ_n\). By the Lemma (and unanimity), \(b\) is ranked worst in the outcome of the first profile and best in the outcome of the last profile. So there must be some voter \(i\) such that \(b\) is ranked worst in the outcome of the \(i\)-th profile and best in the outcome of the \((i+1)\)-th profile. Call these two profiles I and II, and the pivotal voter Bob.

Consider any two candidates \(a\) and \(c\), and any profile IV. Suppose \(a\succ_{Bob}c\) in IV. Moving \(a\) to the top of Bob's ranking and \(b\) to second place gives profile III. By IIA:

  • \((a,c)\) is same in III and IV
  • \((a,b)\) is same in I and III (\(a\succ^*b\))
  • \((b,c)\) is same in II and III (\(b\succ^*c\))

By transitivity, \(a\succ^* c\) in IV. Thus Bob dictates all pairs not involving \(b\). Finally, we show that Bob also dictates pairs involving \(b\): it suffices to show that the dictator over pairs not involving some other candidate \(c\neq b\) is also Bob. \(\qquad\qquad \square\)

Here are some examples of voting rules; most of them violate some of the desired properties.

  • Pairwise comparison: choose the candidate who defeats all others in pairwise elections using majority rule. Condorcet winner criterion: If a candidate beats all other candidates in pairwise contests, then he should be the winner of the election. Condorcet loser criterion: The system should never select a candidate that loses to all others in pairwise contests.

  • Plurality Voting: chooses the candidate who is ranked first by the largest number of voters.

  • Runoff: If no candidate has a majority in the first round, then the two leading candidates compete in a second round (India, Brazil, and France). Satisfy Condorcet loser criterion but not winner criterion.

  • Borda Count: Assign points to each position on a ballot (e.g., with \(k\) candidates, positions receive \(k-1, k-2, \dots, 0\) points from top to bottom). Add up the points from all voters; the candidate with the most points wins.

  • Approval Voting: Voters can approve as many candidates as they wish.

Potential Game

Potential Game

Def. A game \(G = (N, A, U)\) is an (exact) potential game if there exists a function \(P: A \rightarrow \mathbb{R}\) such that for all \(i \in N\), all \(a_{-i} \in A_{-i}\) and \(a_i, a_i^{\prime} \in A_i\),

\[u_i(a_i, a_{-i}) - u_i(a_i^{\prime}, a_{-i}) = P(a_i, a_{-i}) - P(a_i^{\prime}, a_{-i}) . \]

Ex. A specific example of Prisoner's dilemma:

C D
C \((1,1)\) \((9,0)\)
D \((0,9)\) \((6,6)\)

Potential function \(P\):

C D
C \(4\) \(3\)
D \(3\) \(0\)
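We can verify mechanically that \(P\) satisfies the exact-potential condition for every unilateral deviation:

```python
# Verify the exact potential condition for the Prisoner's Dilemma above.
U = {("C", "C"): (1, 1), ("C", "D"): (9, 0),
     ("D", "C"): (0, 9), ("D", "D"): (6, 6)}
P = {("C", "C"): 4, ("C", "D"): 3, ("D", "C"): 3, ("D", "D"): 0}

ok = True
for a1 in "CD":
    for a2 in "CD":
        for a1p in "CD":   # row player deviates a1 -> a1p
            ok &= (U[(a1, a2)][0] - U[(a1p, a2)][0]
                   == P[(a1, a2)] - P[(a1p, a2)])
        for a2p in "CD":   # column player deviates a2 -> a2p
            ok &= (U[(a1, a2)][1] - U[(a1, a2p)][1]
                   == P[(a1, a2)] - P[(a1, a2p)])
print(ok)   # -> True
```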

Thm. Every (finite) potential game has a pure-strategy Nash equilibrium.

Pf. A maximum point of the potential function is a pure-strategy Nash equilibrium.

Congestion Game

Def. A congestion game is defined on a tuple \((N, R, A, c)\), where \(N\) is a set of \(n\) agents; \(R\) is a set of \(r\) resources; \(A = A_1\times...\times A_n\), where \(A_i\subseteq 2^R\setminus \{\emptyset\}\) is the set of actions for agent \(i\); and \(c = (c_1, ..., c_r)\), where \(c_k:\mathbb{N}\to\mathbb{R}\) is the cost function for resource \(k\in R\).

Given a pure-strategy profile \(a=(a_i,a_{-i})\), let \(\#(r,a)\) denote the number of agents using resource \(r\) under \(a\). The utility function is

\[u_i(a)=-\sum\limits_{r\in a_i} c_r(\#(r,a)). \]

Thm. Every congestion game is a potential game.

Pf. By constructing a potential function

\[P(a) = \sum_{r \in R} \sum_{j=1}^{\#(r,a)} c_r(j). \]

Prop. Every congestion game has a pure-strategy Nash equilibrium.

Myopic Best Response Dynamics

A straightforward way of computing Nash equilibrium in a normal form game is the myopic best response dynamics.

Def (Myopic Best Response). While there exists an agent \(i\) that is not best responding to \(a_{-i}\) do

  • \(a_i^{\prime} \leftarrow\) some best response by \(i\) to \(a_{-i}\)
  • \(a \leftarrow (a_i^{\prime}, a_{-i})\).

Thm. The Myopic Best Response procedure is guaranteed to find a pure-strategy Nash equilibrium of a potential game.

Pf. \(P(a)\) strictly increases at every best-response update, and \(A\) is finite.
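A sketch of the dynamics on a tiny congestion game (two agents, two resources, cost \(c_r(k) = k\), each action a single resource — all choices here are illustrative assumptions). Working with costs, the Rosenthal potential decreases at every improving step:

```python
# Myopic best response in a tiny congestion game.
# 2 agents each pick one of 2 resources; cost of a resource = its load.
def cost(i, a):
    return sum(1 for r in a if r == a[i])   # c_r(k) = k, one resource per agent

def potential(a):                            # Rosenthal potential (for costs)
    return sum(sum(range(1, list(a).count(r) + 1)) for r in set(a))

a = [0, 0]                                   # both start on resource 0
trace = [potential(tuple(a))]
improved = True
while improved:
    improved = False
    for i in range(2):
        best = min(range(2), key=lambda r: cost(i, a[:i] + [r] + a[i+1:]))
        if cost(i, a[:i] + [best] + a[i+1:]) < cost(i, a):
            a[i] = best                      # improving move
            trace.append(potential(tuple(a)))
            improved = True

print(a, trace)   # -> [1, 0] [3, 2]: a pure NE, potential fell monotonically
```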

Ex (Game of Consensus). Consider a graph \(G = (V, E)\), where each node is a player. Player \(i\)'s action is to choose a bit \(b_i \in \{0,1\}\). Let \(N(i)\) be the set of \(i\)'s neighbors, and \(\mathbf{b} = ( b_1, \ldots, b_n )\). Player \(i\)'s loss is the number of neighbors that she disagrees with:

\[D_i(\mathbf{b}) = \sum_{j \in N(i)} |b_i - b_j| . \]

We can show that \(\phi(\mathbf{b})=\frac{1}{2}\sum_iD_i(\mathbf{b})\) is a potential function, so this game is a potential game.

Myopic best response: Select the majority among neighbors.

Matching

Stable Matching

Prob. Two disjoint sets of agents, \(M\) (men) and \(W\) (women), each of size \(n\); each agent has a strict preference list over the other set.

A matching is a set of \(n\) distinct pairs of \((m,w)\). A matching is called unstable if there are two men \(m,m'\) and two women \(w,w'\) such that: \(m\) is matched to \(w\) and \(m'\) is matched to \(w'\), but \(m\succ_{w'} m'\) and \(w'\succ_m w\). Such a pair \((m,w')\) is called a blocking pair. A matching is stable if it has no blocking pairs.

Alg (Gale and Shapley). In step \(k\), each man proposes to his most preferred woman who has not rejected him yet (or gives up if he’s been rejected by all women), and each woman is tentatively matched to her favorite among her proposers and rejects the rest. Repeat these steps until a round in which there are no rejections.

Thm. The Gale-Shapley algorithm produces a stable matching.

Pf. Prove by contradiction. Suppose not. Then there exists a blocking pair \((m_1, w_1)\) with \(m_1\) matched to \(w_2\) and \(w_1\) matched to \(m_2\); in particular \(m_1 \succ_{w_1} m_2\) and \(w_1 \succ_{m_1} w_2\). Since \(w_1 \succ_{m_1} w_2\), \(m_1\) would have proposed to \(w_1\) before \(w_2\), so \(w_1\) must have received a proposal from \(m_1\) and rejected him for a man she ranks higher. Since women only ever trade up, her final partner satisfies \(m_2 \succ_{w_1} m_1\). Contradiction.
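The men-proposing algorithm can be implemented directly; a sketch with an illustrative 3-by-3 instance (the preference lists are assumptions for the example):

```python
# Men-proposing Gale-Shapley. Preferences are lists from most to least preferred.
def gale_shapley(men_pref, women_pref):
    n = len(men_pref)
    next_prop = [0] * n                  # next index each man will propose to
    engaged = {}                         # woman -> man (tentative)
    free = list(range(n))
    rank = [{m: k for k, m in enumerate(p)} for p in women_pref]
    while free:
        m = free.pop()
        w = men_pref[m][next_prop[m]]    # best woman who hasn't rejected m yet
        next_prop[m] += 1
        if w not in engaged:
            engaged[w] = m
        elif rank[w][m] < rank[w][engaged[w]]:
            free.append(engaged[w])      # w trades up; old partner is rejected
            engaged[w] = m
        else:
            free.append(m)               # w rejects m
    return {m: w for w, m in engaged.items()}

# Illustrative instance (3 men, 3 women, indices 0..2).
men = [[0, 1, 2], [1, 0, 2], [0, 1, 2]]
women = [[1, 0, 2], [0, 1, 2], [0, 1, 2]]
print(gale_shapley(men, women))   # a stable (and male-optimal) matching
```

Each man proposes at most \(n\) times, so the algorithm runs in \(O(n^2)\) proposals.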

Def. A matching \(\mu\) is male-optimal if there is no stable matching \(\nu\) such that \(\nu(m)\succeq_m\mu(m)\) for all \(m\) with \(\nu(j)\succ_j\mu(j)\) for at least one \(j\in M\).

Thm. The Gale-Shapley algorithm (man-propose) produces a stable matching that is male-optimal.

Pf. Let \(\mu\) be the matching returned by the male-propose algorithm. Suppose \(\mu\) is not male-optimal. Then there is a stable matching \(\nu\) such that \(\nu(m) \succeq_m \mu(m)\) for all \(m\), with \(\nu(j) \succ_j \mu(j)\) for some \(j \in M\). Any such \(j\) must have been rejected by \(\nu(j)\) during the run producing \(\mu\). Consider the first such rejection: \(j\) proposes to \(\nu(j)\) before \(\mu(j)\) and is rejected because \(\nu(j)\) prefers another man \(i\). Since this is the first time any man is rejected by his \(\nu\)-partner, \(i\) has not yet been rejected by \(\nu(i)\) at that point, so \(i\) must rank \(\nu(j)\) higher than \(\nu(i)\). This gives \(i \succ_{\nu(j)} j\) and \(\nu(j) \succ_i \nu(i)\), so \((i, \nu(j))\) blocks \(\nu\), implying that \(\nu\) is not stable. Contradiction.

Def. We say a woman \(w\) is attainable for a man \(m\) if there exists a stable matching \(\mu\) with \(\mu(m)=w\).

Thm. Let \(\mu\) be the stable matching produced by the men-proposing algorithm. Then:

  • Every man is matched in \(\mu\) to his most preferred attainable woman.
  • Every woman is matched in \(\mu\) to her least preferred attainable man.

Pf. By contradiction, using similar techniques.

Prop. If a woman \(w\) is assigned to the same man \(m\) in both the men-proposing and the women-proposing versions of the GS algorithm, then \(m\) is the only attainable man for her.

Def (Preferences by Compatibility). Suppose we have a matrix \(A=(a_{i,j})_{n\times n}\) whose entries are distinct within each row and within each column. If the \(i\)-th row satisfies \(a_{i,j_1}>a_{i,j_2}>\dots>a_{i,j_n}\), then man \(i\)'s preference is \(j_1\succ_i j_2\succ_i\dots\succ_i j_n\). Similarly, if the \(j\)-th column satisfies \(a_{i_1,j}>\dots>a_{i_n,j}\), then woman \(j\)'s preference is \(i_1\succ_j i_2\succ_j\dots\succ_j i_n\).

Lem. Under preferences by compatibility, there exists a unique stable matching. (Sketch: the pair attaining the largest entry of \(A\) is each other's top choice, so it must be matched in every stable matching; remove its row and column and induct.)

Thm. A man can never improve his match in the (man-propose) Gale-Shapley algorithm by misreporting his preferences over women.

House Allocation Problem

Prob. A set \(N\) of \(n\) agents, each owning a unique house and a strict preference ordering over all \(n\) houses. An allocation is an \(n\)-dimensional vector \(a\) whose \(i\)-th component, \(a_i\), is the ID of the house assigned to agent \(i\).

Let \(A\) denote the set of all feasible allocations (\(a_i \neq a_j\) for all \(i \neq j\)). The initial allocation is \(a_i=i\) for all \(i\). For every \(S \subseteq N\), let \(A(S)=\{z \in A: z_i \in S, \forall i \in S\}\) denote the set of allocations that the agents in \(S\) can achieve by trading among themselves alone. A set \(S\) of agents is a blocking coalition for an allocation \(a\) if there exists a \(z \in A(S)\) such that:

  • for all \(i \in S\), \(z_i \succeq_i a_i\);
  • there exists \(j \in S\) such that \(z_j \succ_j a_j\).

An allocation that is not blocked by any coalition is called a stable allocation.

Alg (Top Trading Cycle Algorithm, TTC).

  • Each agent \(i\) points to his most preferred house (possibly \(i\)'s own, we use the same node to identify the agent and its own house).
  • Give each agent in a cycle the house he points at and remove the agent from the market.
  • Repeat above steps if unmatched agents remain.
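The three steps above can be sketched directly. A minimal Python sketch, assuming agents and houses share integer IDs and each agent \(i\) initially owns house \(i\):

```python
def top_trading_cycle(prefs):
    """prefs[i]: agent i's ranking of houses, most preferred first.
    Returns the allocation dict: agent -> house."""
    remaining = set(prefs)
    allocation = {}
    while remaining:
        # Each remaining agent points to his most preferred remaining house.
        points_to = {i: next(h for h in prefs[i] if h in remaining)
                     for i in remaining}
        # Walk the pointers from an arbitrary agent; a cycle must appear
        # because the graph is finite and every node has out-degree 1.
        seen, i = [], next(iter(remaining))
        while i not in seen:
            seen.append(i)
            i = points_to[i]
        for j in seen[seen.index(i):]:   # trade along the cycle, then remove it
            allocation[j] = points_to[j]
            remaining.remove(j)
    return allocation

# Agents 1 and 2 want each other's houses; agent 3 keeps his own in round 2.
top_trading_cycle({1: [2, 1, 3], 2: [1, 2, 3], 3: [1, 2, 3]})
```

This removes one cycle per pass rather than all cycles at once; since every removed cycle gets its final houses either way, the resulting allocation is the same.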

Thm. TTC generates a stable allocation, i.e., one blocked by no coalition.

Pf.

  • Cannot make any agent matched in the first round better off;
  • Cannot make any agent matched in the second round better off without making some of the agents matched in the first round worse off;
  • Inductively, cannot make any agent matched in round \(k\) better off without making some agent matched in an earlier round worse off.

Thm. The stable allocation is unique.

Thm. No agent has an incentive to misreport his preferences (TTC is strategy-proof).

Cooperative Game Theory

Cooperative Games

Cooperative games model cooperative scenarios, where the actions are coalitions formed by groups of agents.

The focus is on the distribution of the benefits from the cooperation:

  • Transferable utility games: payoffs are given to the group and then divided/transferred among group members
  • Non-transferable utility games: group actions result in payoffs to individual group members, which are not transferable.

Even in cooperative games, players are still selfish.

Thus, one key problem is to find a proper solution concept that is stable, i.e., one from which no player (or group of players) has an incentive to deviate. Another key problem is how to guarantee a fair distribution within a group.

Transferable Utility Games

Ex. 3 kids are buying ice cream. There are three types of the ice cream:

  • Type 1: small, cost $7, contains 500g.
  • Type 2: medium, cost $9, contains 750g.
  • Type 3: large, cost $11, contains 1kg.

A has $4; B and C each have $3. The children want as much ice cream as possible and do not care about money. The payoff to a group is the maximum quantity of ice cream the group can buy; the ice cream will be shared within the group.

Def. A transferable utility game is a pair \((N, v)\), where \(N=\{1, \ldots, n\}\) is the set of players and \(v: 2^{N} \rightarrow \mathbb{R}\) is the characteristic function. For each subset of players \(C\), \(v(C)\) is the utility that the members of \(C\) can earn by working together. \(N\) itself is called the grand coalition. We assume that \(v\) is

  • normalized: \(v(\emptyset)=0\)
  • non-negative: \(v(C) \geq 0\) for any \(C \subseteq N\)
  • monotone: \(v(C) \leq v(D)\) for any \(C, D\) such that \(C \subseteq D\)

Ex. \(v(\emptyset)=v(\{A\})=v(\{B\})=v(\{C\})=0,\ v(\{A,B\})=v(\{A,C\})=500,\ v(\{B,C\})=0,\ v(\{A,B,C\})=750.\)
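These values can be reproduced by an unbounded knapsack over a coalition's pooled budget. A small sketch (the helper name `coalition_value` is my own, not from the source):

```python
def coalition_value(budget, prices=(7, 9, 11), grams=(500, 750, 1000)):
    """Max grams of ice cream a coalition can buy with its pooled budget,
    allowing any multiset of purchases (unbounded knapsack)."""
    best = [0] * (budget + 1)
    for b in range(1, budget + 1):
        for p, g in zip(prices, grams):
            if p <= b:
                best[b] = max(best[b], best[b - p] + g)
    return best[budget]

# {B,C} pool $6, {A,B} and {A,C} pool $7, the grand coalition pools $10:
[coalition_value(b) for b in (6, 7, 10)]   # -> [0, 500, 750]
```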

Def. An outcome of a TU game \((N, v)\) is a pair \((CS, x)\) where

  • \(CS=(C_1, \ldots, C_k)\) is a coalition structure, i.e., partition of the set \(N\) into disjoint coalitions: \(\cup_i C_i=N\), and \(C_i \cap C_j=\emptyset\) for \(i \neq j\);
  • and \(x=(x_1, \ldots, x_n)\) is the payoff vector that distributes the payoff of each coalition to all players, satisfying \(x_i \geq 0\) for all \(i \in N\) and \(\sum_{i \in C} x_i = v(C)\) for each \(C \in CS\).

Def. An outcome \((CS, x)\) is called an imputation if it satisfies individual rationality: \(x_i \geq v(\{i\})\) for all \(i \in N\).

Ex. \((750,0,0)\) and \((0,375,375)\) are imputations.

Def. A game \((N, v)\) is called superadditive if \(v(C \cup D) \geq v(C) + v(D)\) for any two disjoint coalitions \(C\) and \(D\).

Core

Def. The core of a game is the set of all stable outcomes, i.e., outcomes that no coalition wants to deviate from:

\[\text{core}(N, v) = \left\{(CS, x) \mid \sum_{i \in C} x_i \geq v(C), \forall C \subseteq N \right\} \]

Ex. \((750,0,0)\) is in the core. But \((0,375,375)\) is not: the coalition \(C=\{A,B\}\) wants to deviate, since \(v(C)=500>0+375\).

Fact. Some games have empty core.

Pf. Consider Game \((N, v)\): \(N=\{1,2,3\}\), \(v(C)=1\) if \(|C|>1\), and \(v(C)=0\) otherwise.

Consider an outcome \((CS, x)\).

  • If \(CS=(\{1\},\{2\},\{3\})\), the grand coalition can deviate.
  • If \(CS=(\{1,2\},\{3\})\), either 1 or 2 gets less than 1, so can deviate with 3. Same argument for \(CS=(\{1,3\},\{2\})\) or \(CS=(\{2,3\},\{1\})\).
  • If \(CS=(\{1,2,3\})\), then \(x_i > 0\) for some \(i\), so the agents in \(N \setminus \{i\}\) earn strictly less than 1 in total, yet \(v(N \setminus \{i\})=1\).

Def. The \(\varepsilon\)-core of a game is the set of outcomes from which no coalition can gain more than \(\varepsilon\) by deviating:

\[\text{core}_{\varepsilon}(N, v) = \left\{(CS, x) \mid \sum_{i \in C} x_i \geq v(C)-\varepsilon, \forall C \subseteq N \right\} \]

Ex. In the above game, the outcome \(CS=(\{1,2,3\}),x=(\frac{1}{3},\frac{1}{3},\frac{1}{3})\) is in the \(\frac{1}{3}\)-core.

Def. The least core of a game: \(\varepsilon^*(G)=\inf\left\{\varepsilon\mid\varepsilon\text{-core is not empty}\right\}\).

Ex. In the above game, \(\varepsilon^*(G)=\frac{1}{3}\).
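Membership in the \(\varepsilon\)-core can be checked by brute force over all coalitions. A sketch for the payoff vector of the grand coalition (this only certifies candidate values of \(\varepsilon\), it does not solve for \(\varepsilon^*\) directly):

```python
from itertools import combinations

def in_eps_core(x, v, eps):
    """Check sum_{i in C} x_i >= v(C) - eps for every non-empty coalition C.
    x is a payoff vector; v maps frozensets of player indices to reals."""
    players = range(len(x))
    return all(sum(x[i] for i in C) >= v(frozenset(C)) - eps - 1e-9
               for r in range(1, len(x) + 1)
               for C in combinations(players, r))

# The 3-player game from the empty-core proof: v(C) = 1 iff |C| > 1.
v = lambda C: 1.0 if len(C) > 1 else 0.0
x = (1 / 3, 1 / 3, 1 / 3)
in_eps_core(x, v, 1 / 3)   # True: every pair gets 2/3 >= 1 - 1/3
in_eps_core(x, v, 1 / 4)   # False: 2/3 < 1 - 1/4, so eps* = 1/3 is tight
```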

Def. Let \(( N, v )\) be a coalitional game. A set of weights \(w(S)\), where \(0 \leq w(S) \leq 1\), for all \(S \subseteq N\), is a balanced sequence (or balancing set of weights) if \(\forall i \in N\):

\[\sum_{S, i \in S} w(S) = 1. \]

Def. A coalitional game \((N, v)\) is balanced if and only if, for every balancing set of weights \(w\), we have

\[\sum_{\emptyset\neq S \subseteq N} w(S) v(S) \leq v(N). \]

Thm (Bondareva-Shapley). The coalitional game \((N, v)\) has a non-empty core if and only if it is balanced.

Shapley Value

Outcomes in the core may be unfair.

Ex. Consider game \((N, v)\): \(N=\{1,2\}\), \(v(\emptyset)=0\), \(v(\{1\})=v(\{2\})=5\), \(v(\{1,2\})=20\). \((15,5)\) is in the core, which is unfair since 1 and 2 are symmetric.

Def. The Shapley value of player \(i\) in a game \((N, v)\) is

\[\phi_i = \frac{1}{n!} \sum_{\pi \in P(N)} \delta_i(S_{\pi}(i)). \]

where \(P(N)\) denotes the set of all permutations of \(N\), \(S_{\pi}(i)\) denotes the set of predecessors of \(i\) in \(\pi \in P(N)\), and \(\delta_i(C) = v(C \cup \{i\}) - v(C)\) for all \(C \subseteq N\).

Rem. \(\phi_{i}\) is \(i\)'s average marginal contribution to the coalition of its predecessors, over all permutations. This definition has many good properties.

Prop. In any game \((N,v)\), \(\phi_{1}+\ldots+\phi_{n}=v(N)\).

Def. A player \(i\) is called a dummy in a game \((N,v)\) if \(v(C)=v(C\cup\{i\})\) for any \(C\subseteq N.\)

Prop. If a player \(i\) is a dummy in a game \((N,v)\) then \(\phi_{i}=0\).

Def. Two players \(i,j\) are said to be symmetric if \(v(C\cup\{i\})=v(C\cup\{j\})\) for any \(C\subseteq N\backslash\{i,j\}\).

Prop. If \(i\) and \(j\) are symmetric then \(\phi_{i}=\phi_{j}\).

Def. Let \(G_{1}=(N, u)\) and \(G_{2}=(N, v)\) be two games with the same set of players. Then \(G=G_{1}+G_{2}\) is the game with the set of players \(N\) and characteristic function \(w\) given by \(w(C)=u(C)+v(C)\) for all \(C\subseteq N\).

Prop. \(\phi_{i}(G)=\phi_{i}(G_{1})+\phi_{i}(G_{2}).\)

Rem. Properties of Shapley Value:

  • Efficiency: \(\phi_{1}+...+\phi_{n}=v(N).\)

  • Dummy: if a player \(i\) is a dummy, then \(\phi_i = 0\).

  • Symmetry: If \(i\) and \(j\) are symmetric then \(\phi_{i}=\phi_{j}\).

  • Additivity: \(\phi_{i}(G)=\phi_{i}(G_{1})+\phi_{i}(G_{2}).\)

Thm. Shapley value is the only payoff distribution scheme that has the above four properties.

But... the Shapley value may NOT be in the core. A direct example is any game with an empty core.

Ex. Another example, even with a non-empty core: \(N=\{1,2,3\}\), \(v(\{1\})=v(\{2\})=v(\{3\})=0\), \(v(\{1,2\})=90\), \(v(\{1,3\})=80\), \(v(\{2,3\})=70\), \(v(\{1,2,3\})=120.\) The Shapley value is \((45,40,35)\), but the unique core point is \((50,40,30)\); in particular, \(\{1,2\}\) blocks the Shapley payoff since \(45+40<90\).
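The permutation-average definition translates directly to code. A minimal sketch, applied to this example with the players renamed \(0,1,2\):

```python
from itertools import permutations

def shapley(n, v):
    """Shapley value by averaging marginal contributions over all n! orders.
    v maps frozensets of player indices to reals."""
    phi = [0.0] * n
    perms = list(permutations(range(n)))
    for order in perms:
        pred = frozenset()                      # predecessors S_pi(i)
        for i in order:
            phi[i] += v(pred | {i}) - v(pred)   # marginal contribution delta_i
            pred = pred | {i}
    return [p / len(perms) for p in phi]

vals = {frozenset(): 0, frozenset({0}): 0, frozenset({1}): 0, frozenset({2}): 0,
        frozenset({0, 1}): 90, frozenset({0, 2}): 80, frozenset({1, 2}): 70,
        frozenset({0, 1, 2}): 120}
shapley(3, vals.get)   # -> [45.0, 40.0, 35.0]
```

The \(n!\) enumeration is only practical for small \(n\); for larger games the sum is typically estimated by sampling random permutations.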

Two-person Bargaining

Two players try to reach an agreement on splitting one unit of a good. We first define the possible outcomes. Let \(X\) denote the possible split agreements, and \(D\) denote the disagreement outcome. Define utilities \(u_{i}\) on the outcomes, and denote the set of possible payoffs as:

\[U=\{(v_1,v_2)\mid v_1=u_1(x), v_2=u_2(x)\text{ for some } x\in X\},d=(u_1(D),u_2(D)). \]

The pair \((U,d)\), where \(U\subset \mathbb{R}^{2}\) and \(d\in U\), defines a bargaining problem. We assume that \(U\) is closed and bounded, and that there exists some \(v\in U\) with \(v>d\) componentwise.

A solution to a bargaining problem is a mapping \(f\) from each instance \((U,d)\) to an agreement point, \(f(U,d)=v\in U.\)

Def. The Nash bargaining solution \(f^{N}(U,d)=v=(v_{1},v_{2})\) is the solution of the following optimization problem:

\[\begin{aligned} \max_{v_{1},v_{2}}\quad&\prod_{i=1}^{2}(v_{i}-d_{i})\\ \text{s.t.}\quad&(v_{1},v_{2})\in U,\\ &(v_{1},v_{2})\geq(d_{1},d_{2}). \end{aligned} \]

Rem. The solution exists since \(U\) is closed and bounded and contains a point \(v>d\). The solution is also unique.
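For intuition, the maximizer can be approximated numerically. A brute-force sketch over a grid on \([0,1]^2\) (illustrative only, not a general solver), applied to splitting one unit with linear utilities:

```python
def nash_solution(feasible, d, grid=400):
    """Maximize the Nash product (v1-d1)(v2-d2) over grid points of [0,1]^2
    that are feasible and dominate the disagreement point d."""
    best, best_val = None, -1.0
    for i in range(grid + 1):
        for j in range(grid + 1):
            v = (i / grid, j / grid)
            if feasible(v) and v[0] >= d[0] and v[1] >= d[1]:
                p = (v[0] - d[0]) * (v[1] - d[1])
                if p > best_val:
                    best_val, best = p, v
    return best

# U = {v >= 0 : v1 + v2 <= 1}, d = (0, 0): the solution is the even split.
nash_solution(lambda v: v[0] + v[1] <= 1 + 1e-12, (0.0, 0.0))   # -> (0.5, 0.5)
```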

Def. Nash bargaining axioms:

  • I. Pareto efficiency: If \(f(U,d)=v\triangleq(v_{1},v_{2})\), and \(v^{\prime}\triangleq(v_{1}^{\prime},v_{2}^{\prime})\in U\) satisfies \(v_{1}^{\prime}\geq v_{1}\) and \(v_{2}^{\prime}\geq v_{2}\), then \(v=v^{\prime}\).

  • II. Symmetry: For an instance \((U,d)\) where \((v_{1},v_{2})\in U\) if and only if \((v_{2},v_{1})\in U\) and \(d_{1}=d_{2}\), we have \(f_{1}(U,d)=f_{2}(U,d)\). If the players are indistinguishable after permutation, then the payoffs should be equal.

  • III. Invariance to affine transformations: For an instance \((U,d)\) and constants \(\alpha_i>0,\beta_i\), consider the transformed instance \(U'=\{(\alpha_1 v_1+\beta_1,\alpha_2 v_2+\beta_2)\mid(v_1, v_2)\in U\}\), \(d'=(\alpha_1 d_1+\beta_1,\alpha_2 d_2+\beta_2)\). Then \(f_i(U',d')=\alpha_i f_i(U,d)+\beta_i\). That is, a positive affine transformation of the utilities should not alter the outcome.

  • IV. Independence of Irrelevant Alternatives (IIA): Let \((U,d)\) and \((U^{\prime},d)\) be two bargaining problems with \(U^{\prime}\subseteq U\). If \(f(U,d)\in U^{\prime}\), then \(f(U^{\prime},d)=f(U,d).\)

Thm. The Nash bargaining solution \(f^{N}(U,d)\) is the unique solution satisfying the Nash bargaining axioms.

Auction

Types:

  • Single-item auction.
  • \(k\)-unit auction: to allocate \(k\) identical units to bidders, and decide their payments.
  • Combinatorial auctions: selling \(m\) items and each bidder \(i\) has a private valuation \(v_i(S)\) for every subset \(S\) of these items.

Def. Direct mechanisms are the class of mechanisms that require bidders to simply claim their types (i.e., report their private information completely and at once, though not necessarily truthfully).

Thm (Revelation principle). For every mechanism \(M\) in which every agent has a dominant strategy (no matter what their private information), there is an equivalent direct-revelation DSIC mechanism \(M'\).

Rem. Further, for any \(M\) that achieves an equilibrium, we can construct an equivalent direct mechanism \(M'\) in which truth-telling is likewise an equilibrium (achieving identical outcomes).

Def. Suppose the auctioneer sells \(1\) item to \(N\) bidders. The second-price auction allocates the item to the bidder with the highest bid, but charges this bidder the second-highest bid. It is also called the Vickrey auction.

Thm. Truth-telling is a dominant strategy in a second-price auction.
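The theorem can be checked by simulation: fixing the other bids, no deviation from the true value ever earns more than bidding truthfully. A small sketch (tie-breaking to the lower index is my own assumption):

```python
def second_price(bids):
    """Winner is the highest bidder (ties go to the lower index);
    the price is the highest bid among the others."""
    winner = max(range(len(bids)), key=lambda i: (bids[i], -i))
    price = max(b for i, b in enumerate(bids) if i != winner)
    return winner, price

def utility(i, value, bids):
    winner, price = second_price(bids)
    return value - price if winner == i else 0.0

# Bidder 0 has true value 5 and faces fixed bids [3, 7]:
others = [3.0, 7.0]
truthful = utility(0, 5.0, [5.0] + others)
assert all(utility(0, 5.0, [b] + others) <= truthful + 1e-9
           for b in [k / 2 for k in range(21)])   # try bids 0.0, 0.5, ..., 10.0
```

Here every deviation either leaves the outcome unchanged or wins the item at a price above bidder 0's value, which is exactly the intuition behind the proof.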
Consider a mechanism whose outcome is \((\mathbf{x}, \mathbf{p})=\left(\mathbf{x},\left(p_1, p_2, \ldots, p_n\right)\right)\), representing the allocation and the payments. The utility of agent \(i\) is \(u_i(\mathbf{x}, \mathbf{p})=v_i(\mathbf{x})-p_i\). A direct mechanism is run on the agents' declarations \(\hat{v}=\left(\hat{v}_1, \hat{v}_2,\ldots,\hat{v}_n\right)\).

Def (VCG). A mechanism is called a Vickrey-Clarke-Groves (VCG) mechanism if

\[\begin{aligned} \mathbf{x}(\hat{v})\in&\ \arg \max\limits_{\mathbf{x}'} \sum_i\hat{v}_i(\mathbf{x}'),\\ p_i(\hat{v}) =&\ h_i(\hat{v}_{-i}) -\sum_{j \neq i} \hat{v}_j(\mathbf{x}(\hat{v})) \end{aligned} \]

where \(h_i:V_{-i}\to\mathbb{R}\) is an arbitrary function of the others' declarations.

Thm. VCG is dominant-strategy incentive compatible and maximizes social welfare.

Pf. It suffices to show that for all \(i\) and \(\hat{v}\), writing \(\hat{v}'=(v_i,\hat{v}_{-i})\),

\[v_i(\mathbf{x}(\hat{v}'))-p_i(\hat{v}')\ge v_i(\mathbf{x}(\hat{v}))-p_i(\hat{v}). \]

Equivalently,

\[v_i(\mathbf{x}(\hat{v}'))+\sum\limits_{j\neq i}\hat{v}_j(x(\hat{v}'))\ge v_i(\mathbf{x}(\hat{v}))+\sum\limits_{j\neq i}\hat{v}_j(x(\hat{v})) \]

Notice that \(i\)'s declaration affects this inequality only through \(\mathbf{x}\); in other words, it suffices that

\[\mathbf{x}(\hat{v}')=\arg\max_{\mathbf{x}'}\left(v_i(\mathbf{x}')+\sum\limits_{j\neq i}\hat{v}_j(\mathbf{x}')\right) \]

which is immediately given by definition.

Def (Clarke pivot rule). Choose \(h_i(\hat{v}_{-i})=\max\limits_{\mathbf{x}'}\sum_{j \neq i} \hat{v}_j\left(\mathbf{x}^{\prime}\right)\).

Rem. Such a VCG is individually rational (IR): \(u_i(\mathbf{x}, \mathbf{p}) \ge 0\) for all \(i\). Moreover, the designer does not have to pay to run VCG: \(\sum_i p_i\ge 0\).
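For a single item, VCG with the Clarke pivot reduces to the second-price auction: the welfare-maximizing allocation gives the item to the highest bidder, and the pivot term makes the winner pay the runner-up bid while losers pay nothing. A small sketch (tie-breaking to the lower index is my own assumption):

```python
def vcg_single_item(bids):
    """VCG with the Clarke pivot for one item. Returns (winner, payments)."""
    winner = max(range(len(bids)), key=lambda i: (bids[i], -i))
    payments = []
    for i in range(len(bids)):
        h_i = max(b for j, b in enumerate(bids) if j != i)  # best welfare without i
        # Others' welfare at the chosen allocation: the winner's bid,
        # unless i himself is the winner (then the others get nothing).
        welfare_others = 0.0 if i == winner else bids[winner]
        payments.append(h_i - welfare_others)
    return winner, payments

vcg_single_item([5.0, 3.0, 7.0])   # -> (2, [0.0, 0.0, 5.0])
```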
