The Law of Large Numbers: Principle and Python Implementation
This article uses coin tossing as an example to explain the law of large numbers, and implements a simulation of it in Python.
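Principle
Put simply, the law of large numbers says that as the number of coin tosses grows, the ratio of heads converges to the probability of heads. The convergence is not monotonic: a single series can drift away from the probability for a while, but large deviations become increasingly unlikely as the tosses accumulate.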
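To make the statement concrete, here is a minimal sketch that uses only the standard library. The function name `running_heads_ratio`, the checkpoints, and the fixed seed are illustrative assumptions rather than part of the original example; the heads probability 0.51 matches the value used later in this article.

```python
import random


def running_heads_ratio(num_tosses, heads_prob, checkpoints, seed=0):
    """Toss a biased coin `num_tosses` times and record the ratio of heads
    at each toss count listed in `checkpoints` (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable runs
    wanted = set(checkpoints)
    heads = 0
    ratios = {}
    for n in range(1, num_tosses + 1):
        # rng.random() is uniform on [0, 1), so this is heads with probability heads_prob
        if rng.random() < heads_prob:
            heads += 1
        if n in wanted:
            ratios[n] = float(heads) / n
    return ratios


if __name__ == "__main__":
    ratios = running_heads_ratio(100000, 0.51, [10, 100, 1000, 10000, 100000])
    for n in sorted(ratios):
        print("%6d tosses: heads ratio = %.4f" % (n, ratios[n]))
```

With few tosses the ratio can be far from 0.51; at the larger checkpoints it should sit close to it, although the exact values depend on the seed.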
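Python Implementation
The example code assumes that the probability of heads is 0.51 and simulates 10 series of coin tosses, with 10000 tosses per series. It then plots the ratio of heads in each series against the number of tosses and saves the figure to the images/ directory. The code is as follows: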
# -*- coding: utf-8 -*-
from __future__ import print_function

import os

import numpy as np
import matplotlib.pyplot as plt


def law_of_large_numbers(num_series=10, num_tosses=10000, heads_prob=0.51, display=True):
    """ Get `num_series` series of biased coin tosses, each of which has `num_tosses` tosses,
    and the probability of heads in each toss is `heads_prob`."""
    # 1.0 (heads) where the uniform sample is below heads_prob; 0.0 (tails) otherwise
    coin_tosses = (np.random.rand(num_tosses, num_series) < heads_prob).astype('float32')
    # Running heads count in each series divided by the number of tosses so far
    cumulative_heads_ratio = np.cumsum(coin_tosses, axis=0) / np.arange(1, num_tosses + 1).reshape(-1, 1)
    if display:
        plot_fig(cumulative_heads_ratio, heads_prob)


def save_fig(fig_id, dirname="images/", tight_layout=True):
    print("Saving figure", fig_id)
    if tight_layout:
        plt.tight_layout()
    # First, ensure the directory exists
    if not os.path.isdir(dirname):
        os.makedirs(dirname)
    # Then, save the figure as <dirname>/<fig_id>.png
    image_path = "%s.png" % os.path.join(dirname, fig_id)
    plt.savefig(image_path, format='png', dpi=300)


def plot_fig(cumulative_heads_ratio, heads_prob, save=True):
    # Get the number of tosses in a series
    num_tosses = cumulative_heads_ratio.shape[0]
    # Set the width and height in inches
    plt.figure(figsize=(8, 3.5))
    # Plot the cumulative heads ratio, one line per series
    plt.plot(cumulative_heads_ratio)
    # Plot a horizontal black dashed line at `heads_prob`
    plt.plot([0, num_tosses], [heads_prob, heads_prob], "k--", linewidth=2,
             label="{}%".format(round(heads_prob * 100, 1)))
    # Plot a horizontal black solid line at 0.5
    plt.plot([0, num_tosses], [0.5, 0.5], "k-", label="50.0%")
    plt.xlabel("Number of coin tosses")
    plt.ylabel("Heads ratio")
    plt.legend(loc="lower right")
    # Set the x and y ranges
    xmin, xmax, ymin, ymax = 0, num_tosses, 0.42, 0.58
    plt.axis([xmin, xmax, ymin, ymax])
    if save:
        save_fig("law_of_large_numbers_plot")
    plt.show()


if __name__ == '__main__':
    num_series, num_tosses = 10, 10000
    heads_proba = 0.51
    law_of_large_numbers(num_series, num_tosses, heads_proba)
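The key step in `law_of_large_numbers` is the running ratio: `np.cumsum(..., axis=0)` counts the heads so far in each series (one series per column), and dividing by a column vector of toss counts relies on NumPy broadcasting. The tiny hand-written array below is an illustrative assumption, chosen so that the result can be checked by hand.

```python
import numpy as np

# Four tosses (rows) in two series (columns); 1 = heads, 0 = tails.
# These values are made up for illustration, not simulated.
coin_tosses = np.array([[1, 0],
                        [0, 0],
                        [1, 1],
                        [1, 0]], dtype='float32')

# Running number of heads down each column
cumulative_heads = np.cumsum(coin_tosses, axis=0)

# Toss counts 1..4 as a (4, 1) column, broadcast across both series
toss_counts = np.arange(1, coin_tosses.shape[0] + 1).reshape(-1, 1)

cumulative_heads_ratio = cumulative_heads / toss_counts
print(cumulative_heads_ratio)
# Series 0: 1/1, 1/2, 2/3, 3/4; series 1: 0/1, 0/2, 1/3, 1/4
```

Without the `reshape(-1, 1)`, the toss counts would have shape (4,) and could not be broadcast against the (4, 2) array of running counts.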
The resulting plot is shown in the figure below.
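As a numeric complement to the plot, the sketch below measures how far the series stray from `heads_prob` at a few toss counts and prints that next to the standard error sqrt(p(1-p)/n) of a sample proportion. The function name `max_deviation_by_checkpoint`, the checkpoints, and the seeded `np.random.default_rng` (available in NumPy 1.17 and later) are assumptions made for this illustration; the script above uses the unseeded `np.random.rand`.

```python
import numpy as np


def max_deviation_by_checkpoint(num_series=10, num_tosses=10000, heads_prob=0.51,
                                checkpoints=(10, 100, 1000, 10000), seed=0):
    """Return {n: largest |heads ratio - heads_prob| across series after n tosses}.

    Hypothetical helper for illustration only."""
    rng = np.random.default_rng(seed)  # seeded so that runs are repeatable
    coin_tosses = rng.random((num_tosses, num_series)) < heads_prob
    toss_counts = np.arange(1, num_tosses + 1).reshape(-1, 1)
    cumulative_heads_ratio = np.cumsum(coin_tosses, axis=0) / toss_counts
    deviations = {}
    for n in checkpoints:
        # Row n - 1 holds the ratios after exactly n tosses
        row = cumulative_heads_ratio[n - 1]
        deviations[n] = float(np.max(np.abs(row - heads_prob)))
    return deviations


if __name__ == "__main__":
    p = 0.51
    for n, dev in sorted(max_deviation_by_checkpoint(heads_prob=p).items()):
        std_error = np.sqrt(p * (1 - p) / n)
        print("n = %5d  max deviation = %.4f  standard error = %.4f" % (n, dev, std_error))
```

The standard error shrinks in proportion to 1/sqrt(n), so the deviations should typically shrink at a similar rate. At n = 10000 the standard error is about 0.005, which is half the gap between 0.51 and 0.5, so a series can still sit near or below the 50% line at that point.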
References
[1] Aurélien Géron. Hands-On Machine Learning with Scikit-Learn and TensorFlow. O'Reilly Media, 2017.