qiezi_online

December 11, 2020

Summary: pd.read_csv(file, nrows=100) df.iloc[:100, :].to_csv(out_file, index=False) To view the first 100 lines on Linux: head -100 123.txt
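A minimal sketch of the two pandas approaches named in the summary. The file names `input.csv` and `first_100.csv` and the generated data are hypothetical, chosen only so the example runs on its own.

```python
import os
import tempfile

import pandas as pd

# Build a throwaway 250-row CSV (hypothetical data) so the sketch is self-contained.
tmp_dir = tempfile.mkdtemp()
file = os.path.join(tmp_dir, "input.csv")
out_file = os.path.join(tmp_dir, "first_100.csv")
pd.DataFrame({"id": range(250), "value": range(250)}).to_csv(file, index=False)

# Option 1: parse only the first 100 data rows (the header line is not counted).
df_head = pd.read_csv(file, nrows=100)

# Option 2: load the whole file, slice the first 100 rows, and write them out.
df = pd.read_csv(file)
df.iloc[:100, :].to_csv(out_file, index=False)

# Read the written file back to confirm both options give the same rows.
first_100 = pd.read_csv(out_file)
```

Note that `head -100` counts physical lines, so on a CSV with a header row it shows the header plus 99 data rows, whereas `nrows=100` yields 100 data rows.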

posted @ 2020-12-11 09:09 qiezi_online Views (263) Comments (0) Recommendations (0)

December 8, 2020

Lessons from Zhang Yiming's career

Summary: Collected references: 1. Baidu Baike: https://baike.baidu.com/item/%E5%BC%A0%E4%B8%80%E9%B8%A3/15898544?fr=aladdin 2. https://www.tmtpost.com/123328.html 《张一鸣，他的每一句话都在被挑错》 ("Zhang Yiming: Every Word He Says Gets Picked Apart")

posted @ 2020-12-08 23:54 qiezi_online Views (466) Comments (0) Recommendations (0)

PyTorch loss functions for classification problems (F.cross_entropy and F.binary_cross_entropy_with_logits)

Summary: Recommended reference: https://www.freesion.com/article/4488859249/ In practice, note that F.binary_cross_entropy_with_logits() corresponds to the class torch.nn.BCEWithLogitsLoss; it applies sigmoid automatically and then computes the lo ...
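The point about the built-in sigmoid can be checked with a short sketch. The tensor shapes and random inputs here are assumptions for illustration, not taken from the post.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)                     # raw model outputs, no sigmoid applied
targets = torch.randint(0, 2, (4, 3)).float()  # binary targets must be floats

# The functional form and the module form compute the same loss.
loss_functional = F.binary_cross_entropy_with_logits(logits, targets)
loss_module = torch.nn.BCEWithLogitsLoss()(logits, targets)

# Both match applying sigmoid by hand and then plain binary cross-entropy.
loss_manual = F.binary_cross_entropy(torch.sigmoid(logits), targets)

# F.cross_entropy also takes raw logits, but with integer class indices as targets;
# it combines log_softmax and nll_loss.
class_targets = torch.tensor([0, 2, 1, 2])
loss_ce = F.cross_entropy(logits, class_targets)
loss_ce_manual = F.nll_loss(F.log_softmax(logits, dim=1), class_targets)
```

Because both functions expect raw logits, adding a sigmoid or softmax layer before them would apply the activation twice.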

posted @ 2020-12-08 11:13 qiezi_online Views (2033) Comments (0) Recommendations (0)

PyTorch SubsetRandomSampler: usage and notes

Summary: Official docs: https://pytorch.org/docs/stable/data.html?highlight=subsetrandomsampler#torch.utils.data.SubsetRandomSampler Recommended reference: https://www.sohu.com/a/291959747_ ...
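One common use of `SubsetRandomSampler` is splitting a single dataset into training and validation loaders by index. The sketch below assumes a toy 10-sample `TensorDataset` and a hypothetical 80/20 split; neither comes from the post.

```python
import torch
from torch.utils.data import DataLoader, SubsetRandomSampler, TensorDataset

# Toy dataset (assumption): 10 samples whose label equals their index.
dataset = TensorDataset(torch.arange(10).float().unsqueeze(1), torch.arange(10))

# Hypothetical 80/20 split over the dataset indices.
indices = list(range(len(dataset)))
train_idx, valid_idx = indices[:8], indices[8:]

# Each sampler draws only from its own index list, in random order, without replacement.
train_loader = DataLoader(dataset, batch_size=4, sampler=SubsetRandomSampler(train_idx))
valid_loader = DataLoader(dataset, batch_size=4, sampler=SubsetRandomSampler(valid_idx))

# Collect the labels each loader produced during one pass.
train_seen = sorted(int(y) for _, labels in train_loader for y in labels)
valid_seen = sorted(int(y) for _, labels in valid_loader for y in labels)
```

`sampler` and `shuffle=True` cannot be combined in `DataLoader`; the sampler already randomizes the order within its subset.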

posted @ 2020-12-08 10:55 qiezi_online Views (4860) Comments (0) Recommendations (1)

December 7, 2020

scikit-learn LabelEncoder: what to do when the test set contains values never seen in the training set (fixing "y contains previously unseen labels")

Summary: for i in categorical_ix: le = joblib.load(f"./LabelEncoder/{i}_LabelEncoder.model") # the test set may contain new labels that never appeared in train, so those new labels are also converted to <unk> test_labels = d ...
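The idea in the comment, mapping unseen test labels to `<unk>` before calling `transform`, might look like the sketch below. It assumes `<unk>` was already among the values the encoder was fitted on; the color values are hypothetical, and the encoder is fitted inline rather than loaded with `joblib` to keep the example self-contained.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Assumption: "<unk>" was present at fit time, so the encoder knows it.
train = pd.Series(["red", "green", "blue", "<unk>"])
test = pd.Series(["green", "purple", "red"])  # "purple" never appeared in train

le = LabelEncoder().fit(train)

# Keep known labels and replace everything else with "<unk>".
known = set(le.classes_)
test_mapped = test.where(test.isin(known), "<unk>")

# transform() no longer raises "y contains previously unseen labels".
encoded = le.transform(test_mapped)
```

If `<unk>` was not part of the fitted classes, `transform` would still fail on it, so the placeholder has to be introduced before fitting.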

posted @ 2020-12-07 19:55 qiezi_online Views (1876) Comments (0) Recommendations (0)

PyTorch Dataset and DataLoader usage (an example)

Summary: from torch.utils.data import Dataset from torch.utils.data import DataLoader from torch.utils.data import sampler import numpy as np import torch clas ...
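The summary cuts off before the class definition, so the following is only a sketch of what a custom `Dataset` paired with a `DataLoader` typically looks like. The class name `ArrayDataset` and the toy arrays are hypothetical and are not the author's code.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class ArrayDataset(Dataset):
    """Hypothetical dataset wrapping a NumPy feature matrix and label vector."""

    def __init__(self, features, labels):
        self.features = torch.from_numpy(features).float()
        self.labels = torch.from_numpy(labels).long()

    def __len__(self):
        # Number of samples; DataLoader uses this to build index batches.
        return len(self.labels)

    def __getitem__(self, idx):
        # Return one (features, label) pair; DataLoader stacks them into batches.
        return self.features[idx], self.labels[idx]


features = np.arange(20, dtype=np.float32).reshape(10, 2)
labels = np.arange(10) % 2
loader = DataLoader(ArrayDataset(features, labels), batch_size=4, shuffle=False)

# With 10 samples and batch_size=4, the last batch holds the remaining 2 samples.
batch_shapes = [(tuple(x.shape), tuple(y.shape)) for x, y in loader]
```

Passing `drop_last=True` to `DataLoader` would discard the final short batch instead.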

posted @ 2020-12-07 19:27 qiezi_online Views (359) Comments (0) Recommendations (0)

December 6, 2020

Python: how to save a list and a map (dict)

Summary: 1. Saving a list: import numpy as np a = [1,2,3,4,5] np.save("number.npy", a) k = np.load("number.npy") 2. Saving a map: import json data = {} data["a"] = 1 data["b"] ...
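A runnable version of both steps, as a sketch. The JSON file name `data.json` and the value assigned to `data["b"]` are assumptions, since the summary is cut off at that point; files are written to a temporary directory.

```python
import json
import os
import tempfile

import numpy as np

tmp_dir = tempfile.mkdtemp()

# 1. Save a list with NumPy; it is converted to an array on save.
a = [1, 2, 3, 4, 5]
list_path = os.path.join(tmp_dir, "number.npy")
np.save(list_path, a)
k = np.load(list_path)  # comes back as a numpy.ndarray, not a list
k_list = k.tolist()     # convert back to a plain Python list if needed

# 2. Save a dict as JSON (the value of "b" is an assumption).
data = {}
data["a"] = 1
data["b"] = 2
map_path = os.path.join(tmp_dir, "data.json")  # hypothetical file name
with open(map_path, "w", encoding="utf-8") as f:
    json.dump(data, f)
with open(map_path, encoding="utf-8") as f:
    loaded = json.load(f)
```

JSON object keys are always strings, so a dict with integer keys would come back with string keys after a round trip.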

posted @ 2020-12-06 18:54 qiezi_online Views (4272) Comments (0) Recommendations (1)

scikit-learn LabelEncoder: encoding categorical values and saving the encoder

Summary: # 3. Apply LabelEncoder to categorical values # Three data-processing steps: deduplicate, handle missing values, LabelEncoder for categorical values from sklearn import preprocessing from sklearn.externals import joblib categorical_ix = ["1", ...
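A sketch of fitting one `LabelEncoder` per categorical column and persisting each with `joblib`. The column names and data are hypothetical. `sklearn.externals.joblib` was removed in newer scikit-learn releases, so this version imports `joblib` directly; the file naming follows the `{i}_LabelEncoder.model` pattern that the loading code in the later post uses.

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn import preprocessing

# Hypothetical training frame and categorical column list.
df_train = pd.DataFrame({"color": ["red", "green", "red"], "size": ["S", "M", "L"]})
categorical_ix = ["color", "size"]

save_dir = os.path.join(tempfile.mkdtemp(), "LabelEncoder")
os.makedirs(save_dir)

for i in categorical_ix:
    le = preprocessing.LabelEncoder()
    # Classes are sorted, then each value is replaced by its class index.
    df_train[i] = le.fit_transform(df_train[i])
    # Persist the fitted encoder so the test set can reuse the same mapping.
    joblib.dump(le, os.path.join(save_dir, f"{i}_LabelEncoder.model"))

# Reload one encoder to confirm the mapping survived the round trip.
reloaded = joblib.load(os.path.join(save_dir, "color_LabelEncoder.model"))
```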

posted @ 2020-12-06 18:52 qiezi_online Views (1096) Comments (0) Recommendations (0)

pandas: handling missing values (fill continuous values with the mean, fill categorical values with "<unk>")

Summary: # 2.1 Handle missing values; fill continuous values with the mean continuous_fillna_number = [] for i in train_null_ix: if(i in continuous_ix): mean_v = df_train[i].mean() continuous_fillna_number ...
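The fill strategy in the title can be sketched as follows. The toy frame and the column lists are hypothetical, and the recorded means are kept in a dict keyed by column name, which is a substitution for the list the summary starts with.

```python
import numpy as np
import pandas as pd

# Hypothetical training frame with one continuous and one categorical column.
df_train = pd.DataFrame({
    "age": [20.0, np.nan, 40.0],
    "city": ["Beijing", None, "Shanghai"],
})
continuous_ix = ["age"]
categorical_ix = ["city"]

# Columns that contain at least one missing value.
train_null_ix = [c for c in df_train.columns if df_train[c].isnull().any()]

# Record each fill value so the same number can be applied to the test set later.
continuous_fillna_number = {}
for i in train_null_ix:
    if i in continuous_ix:
        mean_v = df_train[i].mean()  # NaN values are skipped by default
        continuous_fillna_number[i] = mean_v
        df_train[i] = df_train[i].fillna(mean_v)
    elif i in categorical_ix:
        df_train[i] = df_train[i].fillna("<unk>")
```

Filling the test set with the training means, rather than recomputing them on the test data, keeps the two sets on the same scale.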

posted @ 2020-12-06 18:34 qiezi_online Views (168) Comments (0) Recommendations (0)

DataFrame: checking for missing values

Summary: s = df.isnull().any() # returns a Series; you can print s with enumerate # True means the column contains null values null_index = [] for i,j in enumerate(s): print(i,j,s.index[i]) if(j): null_index.append(s ...
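A runnable sketch of the check described above, on a hypothetical three-column frame. It appends to `null_index`, the list the snippet initializes, and also shows a shorter equivalent using boolean indexing.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: columns "a" and "c" contain missing values, "b" does not.
df = pd.DataFrame({"a": [1.0, np.nan], "b": ["x", "y"], "c": [None, "z"]})

# One boolean per column: True if that column has at least one null.
s = df.isnull().any()

# Loop form, collecting the names of columns that contain nulls.
null_index = []
for i, j in enumerate(s):
    if j:
        null_index.append(s.index[i])

# Equivalent one-liner: index the Series by itself to keep only the True entries.
null_index_short = s[s].index.tolist()
```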

posted @ 2020-12-06 16:53 qiezi_online Views (722) Comments (0) Recommendations (0)
