分布拟合——正态/拉普拉斯/对数高斯/瑞利 分布
作者:桂。
时间:2017-03-16 20:30:20
链接:http://www.cnblogs.com/xingshansi/p/6561536.html
声明:欢迎被转载,记得注明出处~
前言
本文为曲线与分布拟合的一部分,主要介绍正态分布、拉普拉斯分布等常用分布拟合的理论推导以及代码实现。
一、理论推导
假设数据独立同分布。对于任意数据点$x_i$,对应概率密度为$f(x_i)$,最大似然函数:
$J = \mathop \prod \limits_{i = 1}^N f({x_i})$
表示成参数,并写成对数形式:
$L\left( \theta \right) = \ln J\left( \theta \right) = \sum\limits_{i = 1}^N {f({x_i};\theta )} $
A-正态分布
对于正态分布:
$f(x) = \frac{1}{{\sqrt {2\pi } \sigma }}{e^{ - \frac{{{{(x - \mu )}^2}}}{{2{\sigma ^2}}}}}$
求偏导得参数估计:
$\hat \mu = \frac{{\sum\limits_{i = 1}^N {{x_i}} }}{N}$
${\hat \sigma ^2} = \frac{{\sum\limits_{i = 1}^N {{{\left( {{x_i} - \mu } \right)}^2}} }}{N} = \frac{{{{\left( {{\bf{x}} - \mu } \right)}^T}\left( {{\bf{x}} - \mu } \right)}}{N}$
B-拉普拉斯分布
对于拉普拉斯分布:
$f(x) = \frac{1}{{2b}}{e^{ - \frac{{\left| {x - \mu } \right|}}{b}}}$
由于其概率密度曲线为对称分布,因此均值估计可用统计均值直接表示:
$\hat \mu = \frac{{\sum\limits_{i = 1}^N {{x_i}} }}{N}$
最大似然函数求偏导,得出$b$的估计:
$\hat b = \frac{{\sum\limits_{i = 1}^N {\left| {{x_i} - \mu } \right|} }}{N}$
C-对数正态分布
对数正态分布:
$f(x) = \frac{1}{{x\sqrt {2\pi } \sigma }}{e^{ - \frac{{{{(\ln x - \mu )}^2}}}{{2{\sigma ^2}}}}}$
事实上,令$t = lnx$,则参数求解与正态分布完全一致。
$\hat \mu = \frac{{\sum\limits_{i = 1}^N {{t_i}} }}{N}$
${\hat \sigma ^2} = \frac{{\sum\limits_{i = 1}^N {{{\left( {{t_i} - \mu } \right)}^2}} }}{N} = \frac{{{{\left( {{\bf{t}} - \mu } \right)}^T}\left( {{\bf{t}} - \mu } \right)}}{N}$
D-瑞利分布
瑞利分布:
$f(x) = \frac{x}{{{\sigma ^2}}}{e^{ - \frac{{{x^2}}}{{2{\sigma ^2}}}}}$
最大似然求导,得出参数估计:
${\hat \sigma ^2} = \frac{{\sum\limits_{i = 1}^N {x_i^2} }}{{2N}}$
二、代码实现
A-正态分布
x = x(:); % should be column vectors ! N = length(x); u = sum(x)/N; sig2 = (x-u)'*(x-u)/N;
B-拉普拉斯分布
x = x(:); % should be column vectors ! N = length(x); u = sum( x )/N; b = sum(abs(x-u))/N;
C-对数正态分布
t = log(x(:)); % should be column vectors ! N = length(x); m = sum( t )/N; sig2 = (t-m)'*(t-m)/N;
D-瑞利分布
x = real(x(:)); % should be column vectors ! N = length(x); s = sum(x.^2)/(2*N);
三、应用举例
以正态分布为例:
rng('default') % for reproducibility x = 3*randn(100000,1)-2; %fitting x = x(:); % should be column vectors ! N = length(x); u = sum(x)/N; sig2 = (x-u)'*(x-u)/N; %Plot figure; %Bar subplot 311 numter = [-15:.2:10]; [histFreq, histXout] = hist(x, numter); binWidth = histXout(2)-histXout(1); bar(histXout, histFreq/binWidth/sum(histFreq)); hold on;grid on; %Fitting plot subplot 312 y = 1/sqrt(2*pi*sig2)*exp(-(numter-u).^2/2/sig2); plot(numter,y,'r','linewidth',2);grid on; %Fitting result subplot 313 bar(histXout, histFreq/binWidth/sum(histFreq)); hold on;grid on; plot(numter,y,'r','linewidth',2);
结果图:
单个分布以本文为例。