Given the mean and variance of two sequences, how to quickly compute the mean and variance of their concatenation

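Problem:

Sequence A is [1, 2, 3, 4, 5, 6, 7, 8, 9]. Its mean, variance, and element count are known: mean_x, var_x, size_x.

Sequence B is [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]. Its mean, variance, and element count are known: mean_y, var_y, size_y.

Concatenating A and B gives sequence Z: [1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]

The task is to find the mean and variance of Z.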



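The point is to get the mean and variance of Z quickly. Put another way: we know the mean, variance, and count of A, and the same three numbers for B, but we do not have the individual values of either sequence, and we still want the mean and variance of the concatenated sequence Z.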
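For contrast, here is the direct computation that the fast method is meant to avoid. It is only a reference sketch and assumes NumPy is available; it needs every element of A and B, which is exactly what we said we do not have. Note that `np.var` returns the population variance (divide by n) by default, which is the convention used throughout this post.

```python
import numpy as np

# Direct approach: requires all the raw values of both sequences.
a = np.arange(1, 10)    # sequence A: 1..9
b = np.arange(20, 30)   # sequence B: 20..29
z = np.concatenate([a, b])

# Population mean and variance of the concatenated sequence (ddof=0).
mean_z = z.mean()
var_z = z.var()
print(mean_z, var_z)
```

These two numbers are what any moment-merging formula has to reproduce from (mean_x, var_x, size_x) and (mean_y, var_y, size_y) alone.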


Code (source: https://github.com/openai/baselines/blob/master/baselines/common/vec_env/vec_normalize.py):

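import torch

def update_mean_var_count_from_moments(mean, var, count, batch_mean, batch_var, batch_count):
    # mean, var, batch_mean, batch_var are expected to be torch tensors;
    # var and batch_var are population variances (divided by count, not count - 1).
    delta = batch_mean - mean
    tot_count = count + batch_count

    # Weighted mean of the two groups.
    new_mean = mean + delta * batch_count / tot_count
    # Sum of squared deviations inside each group.
    m_a = var * count
    m_b = batch_var * batch_count
    # Add the between-group term to get the combined sum of squared deviations.
    M2 = m_a + m_b + torch.pow(delta, 2).double() * count * batch_count / tot_count
    new_var = M2 / tot_count
    new_count = tot_count

    return new_mean, new_var, new_count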

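Note: in the code above, mean, var, count are the mean, variance, and count of sequence A; batch_mean, batch_var, batch_count are the mean, variance, and count of sequence B; new_mean, new_var, new_count are the mean, variance, and count of the target sequence Z.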
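Why this works can be written out in a few lines. The notation below is my own and not from the original code: n_x, μ_x, σ_x² stand for count, mean, var of A; n_y, μ_y, σ_y² for those of B; δ is `delta`; and variances are population variances (divide by n).

```latex
\begin{aligned}
n &= n_x + n_y, \qquad \delta = \mu_y - \mu_x \\
\mu_z &= \frac{n_x \mu_x + n_y \mu_y}{n} = \mu_x + \delta \,\frac{n_y}{n} \\
M_{2,z} &= \sum_{i \in A} (a_i - \mu_z)^2 + \sum_{j \in B} (b_j - \mu_z)^2 \\
        &= n_x \sigma_x^2 + n_x (\mu_x - \mu_z)^2 + n_y \sigma_y^2 + n_y (\mu_y - \mu_z)^2 \\
        &= n_x \sigma_x^2 + n_y \sigma_y^2
           + n_x \left(\frac{\delta\, n_y}{n}\right)^2
           + n_y \left(\frac{\delta\, n_x}{n}\right)^2 \\
        &= n_x \sigma_x^2 + n_y \sigma_y^2 + \delta^2 \,\frac{n_x n_y}{n} \\
\sigma_z^2 &= \frac{M_{2,z}}{n}
\end{aligned}
```

The second line of M_{2,z} uses the fact that deviations from a group's own mean sum to zero, so the cross terms vanish. The three terms of the last M_{2,z} line correspond to `m_a`, `m_b`, and the `delta`-squared term in the code.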


Verification:

image
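The check can also be reproduced in a few lines. This is a sketch, not the original screenshot: it swaps torch for NumPy (`np.square` in place of `torch.pow`) so it runs on plain floats, and `merge_moments` is a name I made up for this NumPy variant.

```python
import numpy as np

def merge_moments(mean, var, count, batch_mean, batch_var, batch_count):
    """NumPy variant of the merge function (hypothetical name, same arithmetic)."""
    delta = batch_mean - mean
    tot_count = count + batch_count
    new_mean = mean + delta * batch_count / tot_count
    # Combined sum of squared deviations: within A + within B + between groups.
    m2 = (var * count + batch_var * batch_count
          + np.square(delta) * count * batch_count / tot_count)
    return new_mean, m2 / tot_count, tot_count

a = np.arange(1, 10, dtype=np.float64)    # sequence A
b = np.arange(20, 30, dtype=np.float64)   # sequence B

# Merge using only the summary statistics of A and B.
merged_mean, merged_var, merged_count = merge_moments(
    a.mean(), a.var(), a.size, b.mean(), b.var(), b.size)

# Compare with the direct computation on the concatenated data.
z = np.concatenate([a, b])
print(merged_mean, z.mean())
print(merged_var, z.var())
```

Both pairs of printed values should agree up to floating-point rounding.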





Another way to compute the variance when data is added incrementally:

https://www.johndcook.com/blog/standard_deviation/
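The linked article covers a running update that takes one value at a time, usually known as Welford's method. Below is a minimal sketch in that style. The class name `RunningStats` and the `ddof` parameter are my own choices for illustration, not taken from the article. With `ddof=0` it gives the population variance used earlier in this post; `ddof=1` gives the sample variance.

```python
class RunningStats:
    """Running mean and variance, updated one value at a time (illustrative sketch)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0   # running sum of squared deviations from the current mean

    def push(self, x):
        self.count += 1
        delta = x - self.mean              # deviation from the old mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)  # old deviation times new deviation

    def variance(self, ddof=0):
        # Return 0.0 when there are too few values to define the variance.
        if self.count - ddof <= 0:
            return 0.0
        return self.m2 / (self.count - ddof)

stats = RunningStats()
for value in list(range(1, 10)) + list(range(20, 30)):   # sequence Z
    stats.push(value)

print(stats.count, stats.mean, stats.variance())
```

This is the single-element special case of the merge above: pushing one value is the same as merging a batch with batch_count = 1 and batch_var = 0.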


image



posted on 2024-02-25 09:54  Angry_Panda