np.percentile()

Description:

A percentile is a value below which a given percentage of the observations fall.

numpy.percentile(a, q, axis)

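Parameters:

a: the input array.
q: the percentile, or sequence of percentiles, to compute; each must lie between \(0\) and \(100\) inclusive.
axis: the axis along which the percentiles are computed. If omitted, the percentile is computed over the flattened array.
interpolation: \(str\), the method used to estimate the percentile. Defaults to \(linear\). Since NumPy 1.22 this keyword is named method, and interpolation is kept only as a deprecated alias. Options: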
- inverted_cdf
- averaged_inverted_cdf
- closest_observation
- interpolated_inverted_cdf
- hazen
- weibull
- linear \((default)\)
- median_unbiased
- normal_unbiased
Note: the first three methods are discontinuous. \(NumPy\) also defines the following discontinuous variants of the default linear:
- lower
- higher
- midpoint
- nearest

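First, a precise definition of a percentile:

The \(p\)-th percentile is a value such that at least \(p\%\) of the data items are less than or equal to it, and at least \((100-p)\%\) of the data items are greater than or equal to it.

Example: a Chinese-language exam score of \(54\) says little on its own about how good the result is. If \(54\) is the \(70\)th percentile, then roughly \(70\%\) of the candidates scored lower and roughly \(30\%\) scored higher.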
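The definition can be checked directly by counting. The sketch below is plain Python and does not use NumPy; the helper name `satisfies_percentile_definition` and the sample data are made up for illustration.

```python
def satisfies_percentile_definition(data, value, p):
    """Return True if `value` meets the definition of the p-th percentile of `data`.

    At least p% of the items must be <= value,
    and at least (100 - p)% of the items must be >= value.
    """
    n = len(data)
    at_or_below = sum(1 for x in data if x <= value)
    at_or_above = sum(1 for x in data if x >= value)
    # Integer arithmetic avoids floating-point rounding in the comparisons.
    return at_or_below * 100 >= p * n and at_or_above * 100 >= (100 - p) * n


scores = [1, 2, 3, 4, 8]  # illustrative data, not taken from a real exam
print(satisfies_percentile_definition(scores, 3, 50))    # True
print(satisfies_percentile_definition(scores, 2, 40))    # True
print(satisfies_percentile_definition(scores, 2.6, 40))  # True
print(satisfies_percentile_definition(scores, 4, 40))    # False
```

For \(p = 40\), every value from 2 to 3 passes the check, which is why an interpolation rule is needed to pick a single answer.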

Example:

import numpy as np

a = np.array([[10, 7, 4], [3, 2, 1]])
print(a)
# 50th percentile: the median of all elements of a after sorting
print(np.percentile(a, 50))
# axis=0: compute down each column
print(np.percentile(a, 50, axis=0))
# axis=1: compute along each row
print(np.percentile(a, 50, axis=1))
# keepdims=True: keep the reduced axis with length 1
print(np.percentile(a, 50, axis=1, keepdims=True))

[[10  7  4]
 [ 3  2  1]]

3.5

[6.5 4.5 2.5]

[7. 2.]

[[7.]
 [2.]]

interpolation defaults to \(linear\):

nums = np.array([1, 2, 3, 4, 8])
print(np.percentile(nums, 50))
print(np.percentile(nums, 40))     # 2 + 1/25*15 = 2.6
print(np.percentile(nums, 90))     # 4 + 4/25*15 = 6.4

3.0
2.6
6.4
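The arithmetic in the comments above comes from linear interpolation between neighbouring sorted values. As a rough sketch of that calculation for a 1-D input and a single q (the function `linear_percentile` is a hypothetical helper written for this note, not NumPy's actual implementation):

```python
import math


def linear_percentile(data, q):
    """Sketch of the 'linear' rule for a 1-D sequence and one q in [0, 100]."""
    values = sorted(data)
    # Fractional position of the q-th percentile among the n sorted values.
    pos = q * (len(values) - 1) / 100
    lo = math.floor(pos)
    hi = min(lo + 1, len(values) - 1)
    frac = pos - lo
    # Move `frac` of the way from the lower neighbour to the upper one.
    return values[lo] + frac * (values[hi] - values[lo])


nums = [1, 2, 3, 4, 8]
print(linear_percentile(nums, 50))  # position 2.0, so exactly the value 3
print(linear_percentile(nums, 40))  # position 1.6, so 2 + 0.6 * (3 - 2), about 2.6
print(linear_percentile(nums, 90))  # position 3.6, so 4 + 0.6 * (8 - 4), about 6.4
```

With five values, the sorted positions 0 to 4 correspond to the 0th, 25th, 50th, 75th and 100th percentiles, so each step spans 25 percentile points. That is where the 1/25*15 and 4/25*15 terms in the comments come from.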

With interpolation = 'nearest':

print(np.percentile(nums, 40, interpolation='nearest'))     # 3

With interpolation = 'midpoint':

print(np.percentile(nums, 40, interpolation='midpoint'))    # (2+3)/2 = 2.5

2.5
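To see the discontinuous variants side by side, the sketch below evaluates the 40th percentile of the same array under each one. It assumes NumPy is installed. The wrapper `percentile_with` is a hypothetical helper: it passes the name through the `method` keyword (NumPy 1.22 and later) and falls back to the older `interpolation` keyword when `method` is not accepted.

```python
import numpy as np


def percentile_with(a, q, name):
    """Call np.percentile with the estimation rule `name`, using whichever keyword is supported."""
    try:
        return float(np.percentile(a, q, method=name))
    except TypeError:
        # Older NumPy releases only accept the `interpolation` keyword.
        return float(np.percentile(a, q, interpolation=name))


nums = np.array([1, 2, 3, 4, 8])
# The 40th percentile falls at sorted position 1.6, between the values 2 and 3.
results = {name: percentile_with(nums, 40, name)
           for name in ("linear", "lower", "higher", "midpoint", "nearest")}
for name, value in results.items():
    print(name, value)
```

The variants differ only in how they choose between the two neighbours: lower takes 2, higher takes 3, midpoint averages them to 2.5, and nearest picks 3 because position 1.6 is closer to index 2 than to index 1.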

posted @ 2022-12-08 14:24 做梦当财神