DataFrame: random row selection and vertical concatenation
DataFrame
Random row selection
(1) Create the DataFrame
Example:
import pandas as pd

city_data = {'city': ['beijing', 'shanghai', 'xining', 'dalian', 'xian', 'chongqing'],
             'location': ['north', 'south', 'northwest', 'northeast', 'west', 'southwest'],
             'level': ['first', 'first', 'third', 'second', 'second', 'second'],
             'if-to-sea': ['no', 'yes', 'no', 'yes', 'no', 'no']}
city_df = pd.DataFrame(city_data)
The resulting DataFrame looks like this:
city location level if-to-sea
0 beijing north first no
1 shanghai south first yes
2 xining northwest third no
3 dalian northeast second yes
4 xian west second no
5 chongqing southwest second no
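(2) Random row selection, method 1: by fraction
city_df_2_3 = city_df.sample(frac=0.6)
# Sample a random 60% of the rows; 0.6 * 6 = 3.6 is rounded to 4 rows, i.e. 2/3 of the data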
city location level if-to-sea
4 xian west second no
0 beijing north first no
5 chongqing southwest second no
3 dalian northeast second yes
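Because `sample` draws at random, the rows printed above will differ from run to run. The following is a minimal sketch, assuming you want repeatable output: passing `random_state` fixes the draw (the seed value 42 and the names `first_draw` and `second_draw` are arbitrary choices for illustration, not part of the original example). It also checks that `frac=0.6` on six rows yields four rows.

```python
import pandas as pd

# Reduced version of the table above (only the 'city' column is kept here).
city_df = pd.DataFrame(
    {'city': ['beijing', 'shanghai', 'xining', 'dalian', 'xian', 'chongqing']}
)

# frac=0.6 of 6 rows is 3.6, which pandas rounds to 4 rows.
first_draw = city_df.sample(frac=0.6, random_state=42)
# The same seed reproduces the same rows in the same order.
second_draw = city_df.sample(frac=0.6, random_state=42)

print(len(first_draw))
print(first_draw.equals(second_draw))
```

Without `random_state`, each call is free to return a different subset, which is usually what you want for a one-off shuffle but makes results hard to compare across runs.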
(3) Random row selection, method 2: by row count
city_df_2_3_1 = city_df.sample(n=4)
city location level if-to-sea
0 beijing north first no
2 xining northwest third no
4 xian west second no
3 dalian northeast second yes
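With `n`, rows are drawn without replacement by default, so no row appears twice and `n` cannot exceed the number of rows. The sketch below illustrates that behaviour under those defaults; the variable names (`picked`, `too_many_error`, `with_repeats`) are made up for this illustration.

```python
import pandas as pd

# Reduced version of the table above (only the 'city' column is kept here).
city_df = pd.DataFrame(
    {'city': ['beijing', 'shanghai', 'xining', 'dalian', 'xian', 'chongqing']}
)

# Default sampling is without replacement: every index label is distinct.
picked = city_df.sample(n=4)
print(picked.index.is_unique)

# Asking for more rows than exist raises ValueError unless replace=True.
too_many_error = None
try:
    city_df.sample(n=10)
except ValueError as err:
    too_many_error = str(err)
    print('ValueError:', err)

# With replacement, rows may repeat, so any n is allowed.
with_repeats = city_df.sample(n=10, replace=True)
print(len(with_repeats))
```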
Getting the remaining rows after random sampling
Continuing the example above: having sampled 2/3 of the original DataFrame, we now want the remaining 1/3.
city_df_2_3_index = city_df_2_3.index.to_list()
city_df_1_3 = city_df[~city_df.index.isin(city_df_2_3_index)]
For the sample shown in method 1 (rows 4, 0, 5, 3), this gives:
city location level if-to-sea
1 shanghai south first yes
2 xining northwest third no
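An equivalent way to get the remainder is `DataFrame.drop` with the sampled index labels. The sketch below compares it with the `isin` mask and checks that the two parts do not overlap. It assumes the index labels are unique, as they are in this example; the seed and the names `sampled`, `rest_isin` and `rest_drop` are illustrative only.

```python
import pandas as pd

# Reduced version of the table above (only the 'city' column is kept here).
city_df = pd.DataFrame(
    {'city': ['beijing', 'shanghai', 'xining', 'dalian', 'xian', 'chongqing']}
)

sampled = city_df.sample(frac=0.6, random_state=0)

# Boolean-mask approach: keep rows whose label is NOT in the sample.
rest_isin = city_df[~city_df.index.isin(sampled.index)]
# drop() approach: remove the sampled labels directly.
rest_drop = city_df.drop(sampled.index)

print(rest_isin.equals(rest_drop))
# The sample and the remainder share no labels and together cover every row.
print(len(sampled.index.intersection(rest_drop.index)))
print(len(sampled) + len(rest_drop) == len(city_df))
```

If the index contained duplicate labels, both approaches would remove every row carrying a sampled label, so the remainder could be smaller than expected.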
Vertically concatenating the two DataFrames
city_df_concat = pd.concat([city_df_1_3, city_df_2_3])
For the sample shown in method 1 (rows 4, 0, 5, 3) and its remainder (rows 1, 2), this gives:
city location level if-to-sea
1 shanghai south first yes
2 xining northwest third no
4 xian west second no
0 beijing north first no
5 chongqing southwest second no
3 dalian northeast second yes
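Note that `pd.concat` keeps the original index labels, so the result is in "remainder first, sample second" order. As a hedged sketch of two common follow-ups: `sort_index()` restores the original row order, while `ignore_index=True` renumbers the rows from 0. The seed and the names `restored` and `renumbered` are assumptions made for this illustration.

```python
import pandas as pd

# Reduced version of the table above (only the 'city' column is kept here).
city_df = pd.DataFrame(
    {'city': ['beijing', 'shanghai', 'xining', 'dalian', 'xian', 'chongqing']}
)

sampled = city_df.sample(frac=0.6, random_state=0)
rest = city_df.drop(sampled.index)

# Stack the two parts vertically; original index labels are preserved.
combined = pd.concat([rest, sampled])

# Option 1: sort by label to recover the original row order.
restored = combined.sort_index()
print(restored.equals(city_df))

# Option 2: discard the old labels and renumber 0..n-1 in the new order.
renumbered = pd.concat([rest, sampled], ignore_index=True)
print(list(renumbered.index))
```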