核密度估计(parzen窗密度估计)
matlab中提供了核平滑密度估计函数ksdensity(x):
[f, xi] = ksdensity(x)
返回矢量或两列矩阵x中的样本数据的概率密度估计f。 该估计基于高斯核函数,并且在等间隔的点xi处进行评估,覆盖x中的数据范围。
ksdensity估计单变量数据的100点密度,或双变量数据的900点密度。
ksdensity适用于连续分布的样本。
也可以指定评估点:
[
specifies points (f
,xi
] = ksdensity(x
,pts
)pts
) to evaluate f
. Here, xi
and pts
contain identical values。
核密度估计方法:
代码实现,对比:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | clear all ; clc ; n=100; % 生成一些正态分布的随机数 x=normrnd(0,1,1,n); minx = min (x); maxx = max (x); dx = (maxx-minx)/n; x1 = minx:dx:maxx-dx; h=0.5; f= zeros (1,n); for j = 1:n for i =1:n f( j )=f( j )+ exp (-(x1( j )-x( i ))^2/2/h^2)/ sqrt (2* pi ); end f( j )=f( j )/n/h; end plot (x1,f); %用系统函数计算比较 [f2,x2] = ksdensity(x); hold on; plot (x2,f2, 'r' ); %红色为参考 |
PRML上核密度估计(parzen窗密度估计)的实现:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 | %% Demonstrate a non-parametric (parzen) density estimator in 1D %% % This file is from pmtk3.googlecode.com function parzenWindowDemo2 % setSeed(2); %[data, f] = generateData; % http://en.wikipedia.org/wiki/Kernel_density_estimation data = [-2.1 -1.3 -0.4 1.9 5.1 6.2]'; f = @(x) 0; n = numel (data); domain = -5:0.001:10; kernels = { 'gauss' , 'unif' }; for kk=1:2 kernel = kernels{kk}; switch kernel case 'gauss' , hvals = [1, 2]; case 'unif' , hvals = [1, 2]; end for i =1: numel (hvals) h = hvals( i ); figure ; hold on handle = plot (data, 0.01* ones (1,n), 'x' , 'markersize' , 14); set ( handle , 'markersize' ,14, 'color' , 'k' ); g = kernelize(h, kernel, data); plot (domain,g(domain '),' -b'); if strcmp (kernel, 'gauss' ) for j =1:n k = @(x) (1/h)*gaussProb(x,data( j ),h^2); plot (domain, 0.1*k(domain '), ' :r'); end end title ( sprintf ( '%s, h=%5.3f' , kernel, h), 'fontsize' , 16); %printPmtkFigure(sprintf('parzen-%s-%d', kernel, h)); end %placeFigures('nrows',3,'ncols',1,'square',false); end end function [data, f] = generateData mix = [0.35,0.65]; sigma = [0.015,0.01]; mu = [0.25,0.75]; n = 50; %% The true function, we are trying to recover f = @(x)mix(1)*gaussProb(x, mu(1), sigma(1)) + mix(2)*gaussProb(x, mu(2), sigma(2)); %Generate data from a mixture of gaussians. model1 = struct ( 'mu' , mu(1), 'Sigma' , sigma(1)); model2 = struct ( 'mu' , mu(2), 'Sigma' , sigma(2)); pdf1 = @(n)gaussSample(model1, n); pdf2 = @(n)gaussSample(model2, n); data = rand (n,1); nmix1 = data <= mix(1); data(nmix1) = pdf1( sum (nmix1)); data(~nmix1) = pdf2( sum (~nmix1)); end function g = kernelize(h,kernel,data) %Use one gaussian kernel per data point with smoothing parameter h. n = size (data,1); g = @(x)0; %unif = @(x, a, b)exp(uniformLogprob(structure(a, b), x)); for i =1:n switch kernel case 'gauss' , g = @(x)g(x) + (1/(h*n))*gaussProb(x,data( i ),h^2); %case 'unif', g = @(x)g(x) + (1/n)*unif(x,data(i)-h/2, data(i)+h/2); case 'unif' , g = @(x)g(x) + (1/(h*n))*unifpdf(x,data( i )-h/2, data( i )+h/2); end end end function setupFig(h) figure ; hold on; axis ([0,1,0,5]); set ( gca , 'XTick' ,0:0.5:1, 'YTick' ,[0,5], 'box' , 'on' , 'FontSize' ,16); title ([ 'h = ' , num2str (h)]); scrsz = get (0, 'ScreenSize' ); left = 20; right = 20; lower = 50; upper = 125; width = scrsz(3)-left-right; height = (scrsz(4)- lower - upper )/3; set ( gcf , 'Position' ,[left,scrsz(4)/2,width, height]); end function p = gaussProb(X, mu, Sigma) % Multivariate Gaussian distribution, pdf % X(i,:) is i'th case % *** In the univariate case, Sigma is the variance, not the standard % deviation! *** % This file is from pmtk3.googlecode.com d = size (Sigma, 2); X = reshape (X, [], d); % make sure X is n-by-d and not d-by-n X = bsxfun (@minus, X, rowvec(mu)); logp = -0.5* sum ((X/(Sigma)).*X, 2); logZ = (d/2)* log (2* pi ) + 0.5*logdet(Sigma); logp = logp - logZ; p = exp (logp); end function x = rowvec(x) % Reshape a matrix into a row vector % Return x as a row vector. This function is useful when a function returns a % column vector or matrix and you want to immediately reshape it in a functional % way. Suppose f(a,b,c) returns a column vector, matlab will not let you write % f(a,b,c)(:)' - you would have to first store the result. With this function you % can write rowvec(f(a,b,c)) and be assured that the result is a row vector. % This file is from pmtk3.googlecode.com x = x(:)'; end function y = logdet(A) % Compute log(det(A)) where A is positive-definite % This is faster and more stable than using log(det(A)). % %PMTKauthor Tom Minka % (c) Microsoft Corporation. All rights reserved. % This file is from pmtk3.googlecode.com try U = chol (A); y = 2* sum ( log ( diag (U))); catch % #ok y = 0; warning ( 'logdet:posdef' , 'Matrix is not positive definite' ); end end |
对比发现与MATLAB自带的函数的估计结果仍有一定的差异?
参考:
https://ww2.mathworks.cn/help/stats/ksdensity.html
https://zhidao.baidu.com/question/176640010444765324.html
快去成为你想要的样子!
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】凌霞软件回馈社区,博客园 & 1Panel & Halo 联合会员上线
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】博客园社区专享云产品让利特惠,阿里云新客6.5折上折
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 开发中对象命名的一点思考
· .NET Core内存结构体系(Windows环境)底层原理浅谈
· C# 深度学习:对抗生成网络(GAN)训练头像生成模型
· .NET 适配 HarmonyOS 进展
· .NET 进程 stackoverflow异常后,还可以接收 TCP 连接请求吗?
· 本地部署 DeepSeek:小白也能轻松搞定!
· 基于DeepSeek R1 满血版大模型的个人知识库,回答都源自对你专属文件的深度学习。
· 在缓慢中沉淀,在挑战中重生!2024个人总结!
· Tinyfox 简易教程-1:Hello World!
· 大人,时代变了! 赶快把自有业务的本地AI“模型”训练起来!