Reading Notes: An Introduction to Game Theory - 02 - Introducing Uncertainty and Time

## Preface

These are study notes on Game Theory An Introduction (by Steven Tadelis).

## Terminology

- Probability distribution function
A simple lottery (action $a \in A$) over the outcomes $X = \{x_1, x_2, \cdots, x_n\}$ is defined by the probability distribution

$$
p = (p(x_1|a), p(x_2|a), \cdots, p(x_n|a)), \\
\text{where} \\
p(x_k|a) \geq 0 \text{: the probability that } x_k \text{ occurs when action } a \text{ is taken} \\
\sum_{k=1}^n p(x_k|a) = 1
$$

- Cumulative distribution function
A simple lottery (action $a \in A$) over the outcome interval $X = [\underline{x}, \overline{x}]$ is defined by the cumulative distribution function

$$
F : X \to [0, 1] \\
\text{where} \\
F(\hat{x} | a) = \Pr\{x \leq \hat{x}\} \text{: the probability that the outcome is less than or equal to } \hat{x}
$$
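The two conditions on $p$ and the definition of $F$ can be checked numerically. The following is a minimal sketch, not code from the book: the function names and the example lottery are made up for illustration, and the cumulative distribution is computed for a finite outcome set (a step-function analogue of the interval case above).

```python
from itertools import accumulate


def is_valid_lottery(probs, tol=1e-9):
    """Check p(x_k|a) >= 0 for every k and sum_k p(x_k|a) == 1 (up to tol)."""
    return all(p >= 0 for p in probs) and abs(sum(probs) - 1.0) <= tol


def discrete_cdf(outcomes, probs):
    """Return (x, F(x|a)) pairs, with outcomes sorted in ascending order."""
    pairs = sorted(zip(outcomes, probs))
    cumulative = accumulate(p for _, p in pairs)
    return [(x, c) for (x, _), c in zip(pairs, cumulative)]


# Hypothetical lottery: outcomes 0, 10, 20 with probabilities 0.5, 0.3, 0.2.
outcomes = [0, 10, 20]
probs = [0.5, 0.3, 0.2]
cdf = discrete_cdf(outcomes, probs)
```

The last entry of `cdf` should carry a cumulative probability of 1 (up to floating-point rounding), which mirrors $F(\overline{x}|a) = 1$.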

- Expected payoff (expected payoff from the lottery function)
For a simple lottery (action $a \in A$) over the outcomes $X = \{x_1, x_2, \cdots, x_n\}$, the expected payoff is

$$
E[u(x)|a] = \sum_{k=1}^n p_k u(x_k) \\
\text{where} \\
u(x) \text{: the payoff function} \\
p = (p_1, p_2, \cdots, p_n) \text{: the probability distribution, with } p_k = p(x_k|a)
$$
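As a small illustration of the sum above, here is a sketch in Python. The function name `expected_payoff`, the example lottery, and the linear payoff function are assumptions chosen for the example, not taken from the book.

```python
def expected_payoff(outcomes, probs, u):
    """Compute E[u(x)|a] = sum_k p_k * u(x_k) for a finite lottery."""
    return sum(p * u(x) for x, p in zip(outcomes, probs))


# Hypothetical lottery: outcomes 0, 10, 20 with probabilities 0.5, 0.3, 0.2.
outcomes = [0, 10, 20]
probs = [0.5, 0.3, 0.2]

# Assumed payoff function u(x) = x (payoff equals the outcome itself).
value = expected_payoff(outcomes, probs, lambda x: x)
```

With $u(x) = x$ the result is the plain expected value of the lottery, $0.5 \cdot 0 + 0.3 \cdot 10 + 0.2 \cdot 20 = 7$.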

- Continuous case: expected payoff (expected payoff from the lottery function)
For a simple lottery (action $a \in A$) over the outcome interval $X = [\underline{x}, \overline{x}]$, the expected payoff is

$$
E[u(x) | a] = \int_{\underline{x}}^{\overline{x}} u(x) f(x|a) dx \\
\text{where} \\
u(x) \text{: the payoff function} \\
f(x|a) \text{: the probability density function associated with the cumulative distribution function } F(x|a)
$$
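The integral can be approximated numerically. The sketch below uses a simple midpoint rule in plain Python; the function name, the step count, and the uniform density on $[0, 20]$ are assumptions made for the example.

```python
def expected_payoff_continuous(u, f, lower, upper, steps=10_000):
    """Approximate E[u(x)|a] = integral of u(x) * f(x|a) dx with the midpoint rule."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width  # midpoint of the i-th subinterval
        total += u(x) * f(x) * width
    return total


# Hypothetical lottery: outcomes uniformly distributed on [0, 20].
LOWER, UPPER = 0.0, 20.0


def uniform_density(x):
    """Density f(x|a) of the uniform distribution on [LOWER, UPPER]."""
    return 1.0 / (UPPER - LOWER)


# Assumed payoff function u(x) = x.
value = expected_payoff_continuous(lambda x: x, uniform_density, LOWER, UPPER)
```

For the uniform density and $u(x) = x$, the exact answer is the midpoint of the interval, 10, so `value` should be very close to that.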

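- Homo economicus 2
We call a decision maker rational if he or she chooses an action that maximizes the expected payoff:

$$
\text{choose } a^* \in A \iff v(a^*) = E[u(x)|a^*] \geq E[u(x)|a] = v(a) \text{ for all } a \in A
$$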
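The rationality condition amounts to taking an argmax of $v(a)$ over the action set. A minimal sketch follows; the action names, the lotteries, and the payoff function are hypothetical.

```python
def expected_payoff(lottery, u):
    """v(a) = E[u(x)|a] for a lottery given as {outcome: probability}."""
    return sum(p * u(x) for x, p in lottery.items())


def rational_choice(actions, u):
    """Return an action a* with v(a*) >= v(a) for all a in the action set.

    On ties, max() returns the first maximizer it meets; the definition
    allows any of them.
    """
    return max(actions, key=lambda a: expected_payoff(actions[a], u))


# Hypothetical decision problem with two actions.
actions = {
    "safe": {10: 1.0},            # 10 for sure
    "risky": {0: 0.5, 30: 0.5},   # 0 or 30 with equal probability
}

# Assumed payoff function u(x) = x.
best = rational_choice(actions, lambda x: x)
```

Here $v(\text{safe}) = 10$ and $v(\text{risky}) = 15$, so the sketch picks `"risky"`. A different payoff function $u$ can change the choice.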

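## Considering Sequence and Time

- Backward induction
Also known as dynamic programming. When a decision problem consists of a sequence of random stages, work from the last stage back to the first: at each simple lottery, pick the action with the highest expected payoff, treat that expected payoff as the value of reaching that stage, and carry it one step earlier.

- Discounted sum of future payoffs

$$
v(x_1, x_2, \cdots, x_T) = \sum_{t=1}^{T} \delta^{t-1} u(x_t) \\
\text{where} \\
T \text{: the number of periods} \\
u(x) \text{: the payoff function of outcome } x \\
\delta \in (0, 1) \text{: the discount factor}
$$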
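Both ideas in this section can be sketched in a few lines of Python. The tree encoding (`"decision"` and `"chance"` tuples, numbers as terminal payoffs), the function names, and the example values are assumptions made for illustration, not notation from the book.

```python
def discounted_sum(payoffs, delta):
    """v = sum_{t=1..T} delta^(t-1) * u(x_t); payoffs[0] is the period-1 payoff."""
    return sum(delta ** t * u for t, u in enumerate(payoffs))


def backward_induction(node):
    """Value of a decision-tree node, computed from the leaves back to the root."""
    if isinstance(node, (int, float)):   # terminal node: payoff u(x)
        return float(node)
    kind, children = node
    if kind == "chance":                 # Nature moves: take the expectation
        return sum(p * backward_induction(child) for p, child in children)
    if kind == "decision":               # player moves: take the best action
        return max(backward_induction(child) for child in children.values())
    raise ValueError(f"unknown node kind: {kind!r}")


# Hypothetical payoff stream: 1 per period for three periods, delta = 0.5.
stream_value = discounted_sum([1, 1, 1], 0.5)

# Hypothetical tree: stop now for 5, or continue into a lottery that leads
# either to a second decision (2 or 8) or to a payoff of 4.
tree = ("decision", {
    "stop": 5,
    "continue": ("chance", [
        (0.5, ("decision", {"low": 2, "high": 8})),
        (0.5, 4),
    ]),
})
tree_value = backward_induction(tree)
```

In the tree, the last decision is worth 8, so continuing is worth $0.5 \cdot 8 + 0.5 \cdot 4 = 6$, which beats stopping for 5. The payoff stream is worth $1 + 0.5 + 0.25 = 1.75$.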

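## Risk Attitudes

- Risk neutral
Regards options with the same expected return as equally valuable.
- Risk averse
Prefers a certain payoff to an uncertain option with the same expected return.
- Risk loving
Strictly prefers a gamble to a certain payoff with the same expected return.

> Up to this point, this is basically reinforcement learning.

## References

* Game Theory An Introduction (by Steven Tadelis)
* [Reading Notes: An Introduction to Game Theory - 01 - The Single-Person Decision Problem](http://www.cnblogs.com/steven-yang/p/8075901.html)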

posted @ 2017-12-20 21:23 SNYang
