过雁

--每天都被梦想唤醒--

   :: 首页  :: 新随笔  :: 联系 :: 订阅 订阅  :: 管理
  80 随笔 :: 0 文章 :: 4 评论 :: 16万 阅读
定义:
In statistical surveys, when subpopulations within an overall population vary, it is advantageous to sample each subpopulation (stratum) independently.Stratification is the process of dividing members of the population into homogeneous subgroups before sampling. The strata should be mutually exclusive: every element in the population must be assigned to only one stratum. 
简言之,将数据集划分为相同标签的子集,然后再在每个子集进行独立的抽样

Advantages[edit]

优点是:即使在样本空间的概率密度急剧变化的情况,层次抽样也能保证不同(概率密度)层次的样本的抽取概率的精确性。

If population density varies greatly within a region, stratified sampling will ensure that estimates can be made with equal accuracy in different parts of the region, and that comparisons of sub-regions can be made with equal statistical power.

Randomized stratification can also be used to improve population representativeness in a study.

Disadvantages[edit]

Stratified sampling is not useful when the population cannot be exhaustively partitioned into disjoint subgroups. It would be a misapplication of the technique to make subgroups' sample sizes proportional to the amount of data available from the subgroups, rather than scaling sample sizes to subgroup sizes (or to their variances, if known to vary significantly 


 





posted on   过雁  阅读(793)  评论(0编辑  收藏  举报
编辑推荐:
· AI与.NET技术实操系列:使用Catalyst进行自然语言处理
· 分享一个我遇到过的“量子力学”级别的BUG。
· Linux系列:如何调试 malloc 的底层源码
· AI与.NET技术实操系列:基于图像分类模型对图像进行分类
· go语言实现终端里的倒计时
阅读排行:
· 历时 8 年,我冲上开源榜前 8 了!
· 物流快递公司核心技术能力-海量大数据处理技术
· 四大AI编程工具组合测评
· 关于能否用DeepSeek做危险的事情,DeepSeek本身给出了答案
· 几个技巧,教你去除文章的 AI 味!
点击右上角即可分享
微信分享提示