PIC2, Histograms
As a first guess, you can start with Scott’s rule for the bin width:
w = 3.5σ / n^(1/3),
where σ is the standard deviation for the entire data set and n is the number of points.
This rule assumes that the data follows a Gaussian distribution;
otherwise, it is likely to give a bin width that is too wide.
See the end of this chapter for more information on the standard deviation.
Note: this is an estimate of the histogram bin width, and it presupposes a Gaussian distribution.
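A minimal sketch of Scott's rule as stated above, using only the Python standard library. The function name scott_bin_width and the sample data are made up for illustration, and using the population standard deviation (rather than the sample one) is an assumption, since the notes do not say which is meant.

```python
import math
import statistics

def scott_bin_width(data):
    """First-guess bin width from Scott's rule: w = 3.5 * sigma / n**(1/3)."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two points to estimate sigma")
    # Assumption: population standard deviation over the entire data set.
    sigma = statistics.pstdev(data)
    return 3.5 * sigma / n ** (1.0 / 3.0)

# Hypothetical sample data, only to show the call.
data = [2.1, 2.5, 3.0, 3.2, 3.8, 4.1, 4.4, 5.0]
w = scott_bin_width(data)
# Number of bins of width w needed to cover the data range.
num_bins = math.ceil((max(data) - min(data)) / w)
print(f"bin width = {w:.3f}, bins needed = {num_bins}")
```

Treat the result as a starting point only: if the data is clearly not Gaussian, try narrower bins as well.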
The other parameter that we need to fix (whether we realize it or not) is the alignment of the bins on the x axis.
Let’s say we fixed the width of the bins at 1. Where do we now place the first bin?
We could put it flush left, so that its left edge is at 0, or we could center it at 0.
In fact, we can move all bins by half a bin width in either direction.
In short: where do we place the first bin?
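The effect of bin alignment can be sketched with a small counting function. The name bin_counts, the origin parameter, and the sample data are hypothetical. With a bin width of 1, origin=0 puts the first bin flush left at 0, while origin=-0.5 centers a bin on 0.

```python
import math
from collections import Counter

def bin_counts(data, width, origin=0.0):
    """Count points per bin; bin k covers [origin + k*width, origin + (k+1)*width)."""
    counts = Counter(math.floor((x - origin) / width) for x in data)
    # Map each bin index back to the left edge of that bin.
    return {origin + k * width: c for k, c in sorted(counts.items())}

# Hypothetical sample data.
data = [0.2, 0.4, 0.6, 0.9, 1.1, 1.4, 1.6, 2.3]

flush_left = bin_counts(data, width=1.0, origin=0.0)   # bins [0,1), [1,2), [2,3)
centered = bin_counts(data, width=1.0, origin=-0.5)    # bins [-0.5,0.5), [0.5,1.5), [1.5,2.5)

print(flush_left)  # {0.0: 4, 1.0: 3, 2.0: 1}
print(centered)    # {-0.5: 2, 0.5: 4, 1.5: 2}
```

The same data and the same bin width give a skewed-looking histogram in one case and a symmetric one in the other, purely because of where the bins start.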
A few points to note:
1. Histograms can be either normalized or unnormalized.
2. The bin width needs to be tuned.
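As a rough illustration of the first point, the sketch below turns raw counts into fractions of the total. This is one common normalization convention and is an assumption here, since the notes do not define the term; another convention also divides by the bin width so that the total area is 1. The function name normalize and the counts are made up.

```python
def normalize(counts):
    """Convert raw bin counts into fractions of the total number of points."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cannot normalize an empty histogram")
    return [c / total for c in counts]

counts = [4, 3, 1]              # unnormalized: absolute number of points per bin
fractions = normalize(counts)   # normalized: fraction of points per bin

print(fractions)       # [0.5, 0.375, 0.125]
print(sum(fractions))  # 1.0
```

Normalized histograms are easier to compare across data sets of different sizes, while unnormalized ones keep the absolute counts visible.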
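Drawbacks:
1. Binning loses information.
2. The representation is not unique.
3. The picture it gives is coarse.
4. Outliers are handled poorly.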