December 9, 2023

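Model Evaluation Metrics

Summary: Accuracy and recall. URL: https://blog.csdn.net/seagal890/article/details/105059498 True Positive (TP): a true positive. The sample's actual class is positive, and the model also predicts positive. False Negative (FN): a false negative. The sample's actual class is positive...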

posted @ 2023-12-09 19:38 广爷天下无双
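
The TP/FN definitions in the excerpt above can be turned into a small worked example. This is a minimal sketch, not the linked article's code: the function names and the toy labels are made up for illustration, and it assumes binary labels where 1 is the positive class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for binary labels (illustrative helper)."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive and predicted == positive:
            tp += 1  # actual positive, predicted positive
        elif actual != positive and predicted == positive:
            fp += 1  # actual negative, predicted positive
        elif actual == positive and predicted != positive:
            fn += 1  # actual positive, predicted negative
        else:
            tn += 1  # actual negative, predicted negative
    return tp, fp, fn, tn


def basic_metrics(y_true, y_pred):
    """Accuracy, precision and recall derived from the four counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy labels, invented for this example only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
scores = basic_metrics(y_true, y_pred)
```

Recall only looks at the actual positives (TP and FN), which is why it can stay low even when accuracy is high on an imbalanced sample.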

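Logistic Regression Approach

Summary: Overall modeling approach. 1. Once you have the sample, first split it into a training set and a test set (0.7 or 0.8); decide whether to add a validation set based on the sample size. No validation set is added this time. 2. Run a first screening on the training set: IV greater than 0.01, correlation coefficient below 80%, and missing rate no higher than 80% unless the variable's Chinese name indicates a special case. 3. Bin the training set into 6-8 bins with a decision tree first, then screen the variables a second time, mainly looking at whether the variable is...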

posted @ 2023-12-09 14:36 广爷天下无双
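
Steps 1 and 2 of the approach above could look roughly like the following. This is a sketch under assumptions: the data is synthetic, `screen_features` is a hypothetical helper, the IV filter is left out, and when two variables are highly correlated the one that appears first is kept (the excerpt does not say which one to keep).

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split


def screen_features(train, target, max_missing=0.8, max_corr=0.8):
    """Drop features that are mostly missing, then drop the later one of
    each highly correlated pair (hypothetical helper, IV filter omitted)."""
    features = [c for c in train.columns if c != target]
    missing_rate = train[features].isna().mean()
    kept = [c for c in features if missing_rate[c] <= max_missing]
    corr = train[kept].corr().abs()
    selected = []
    for col in kept:
        # Keep the column only if it is not too correlated with any kept one.
        if all(corr.loc[col, s] < max_corr for s in selected):
            selected.append(col)
    return selected


# Synthetic data, invented for this example only.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({"a": rng.normal(size=n), "b": rng.normal(size=n)})
df["a_copy"] = df["a"] * 2 + rng.normal(scale=0.01, size=n)   # near duplicate of "a"
df["mostly_missing"] = np.where(rng.random(n) < 0.9, np.nan, 1.0)
df["target"] = (df["a"] + rng.normal(size=n) > 0).astype(int)

# Step 1: 0.7 of the sample goes to training, no validation set.
train, test = train_test_split(
    df, train_size=0.7, random_state=0, stratify=df["target"]
)

# Step 2: screen on the training set only, so the test set stays untouched.
selected = screen_features(train, "target")
```

Screening on the training set only keeps information from the test set out of the variable selection.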

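My Own Logistic Regression Attempt

Summary: My own logistic regression attempt. 1. Fix the good and bad samples and the random seed, then look at the binning first: data_sd = X1; num_cols = X1.columns; import pycard as pc; num_iv_woedf = pd.DataFrame(); clf = pc.NumBin(max_bins_num=7, min_b...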

posted @ 2023-12-09 14:34 广爷天下无双
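
The excerpt above bins numeric variables with pycard, whose full call is cut off. As a stand-in that does not use pycard, the sketch below bins one variable with a scikit-learn decision tree (at most 7 bins, fixed random seed) and computes WOE and IV. The helper names, the 0.5 smoothing term, the WOE sign convention ln(bad share / good share), and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier


def tree_bin_edges(x, y, max_bins=7, min_leaf_frac=0.05, random_state=0):
    """Use the split points of a shallow decision tree as bin edges."""
    tree = DecisionTreeClassifier(
        max_leaf_nodes=max_bins,
        min_samples_leaf=min_leaf_frac,  # fraction of samples per bin
        random_state=random_state,
    )
    tree.fit(np.asarray(x).reshape(-1, 1), y)
    # Internal nodes have feature >= 0; leaves are marked with a negative value.
    thresholds = tree.tree_.threshold[tree.tree_.feature >= 0]
    return [-np.inf] + sorted(thresholds.tolist()) + [np.inf]


def woe_iv(x, y, edges):
    """WOE per bin and total IV, with y == 1 meaning a bad sample."""
    table = pd.DataFrame(
        {"bin": pd.cut(np.asarray(x), bins=edges), "bad": np.asarray(y)}
    )
    stats = table.groupby("bin", observed=True)["bad"].agg(["sum", "count"])
    bad = stats["sum"]
    good = stats["count"] - bad
    # 0.5 smoothing avoids log(0) when a bin has no good or no bad samples.
    bad_share = (bad + 0.5) / (bad.sum() + 0.5 * len(stats))
    good_share = (good + 0.5) / (good.sum() + 0.5 * len(stats))
    woe = np.log(bad_share / good_share)
    iv = float(((bad_share - good_share) * woe).sum())
    return woe, iv


# Synthetic variable whose bad rate rises with x, invented for this example.
rng = np.random.default_rng(42)
x = rng.normal(size=2000)
y = (rng.random(2000) < 1 / (1 + np.exp(-2 * x))).astype(int)

edges = tree_bin_edges(x, y, max_bins=7)
woe, iv = woe_iv(x, y, edges)
```

Fixing `random_state` makes the bin edges reproducible between runs, which is the point of fixing the seed before inspecting the bins.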

Logistic Regression Attempt with toad

Summary: from sklearn.model_selection import train_test_split; train, test = train_test_split(dd, test_size=0.6); toad.detect(dd); toad.quality(dd, target='target', iv_...

posted @ 2023-12-09 14:33 广爷天下无双

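XGB Tuning Approach

Summary: (1) Choose a relatively high learning rate, for example learning_rate=0.1, to reduce iteration time. (2) Then tune (max_depth, min_child_weight). (3) With the values from step 2 fixed, tune gamma. (4) Tune subsample and colsample_bytree. (...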

posted @ 2023-12-09 14:33 广爷天下无双
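
The stage-by-stage tuning order above can be sketched with a generic helper that searches one parameter group at a time and carries the best values forward. To keep the example runnable without xgboost, it uses scikit-learn's GradientBoostingClassifier as a stand-in, with min_samples_leaf, min_impurity_decrease and max_features as rough analogues of min_child_weight, gamma and colsample_bytree. The helper name, the grids and the data are illustrative assumptions.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV


def staged_search(estimator, stages, X, y, scoring="roc_auc", cv=3):
    """Tune parameter groups one after another, fixing earlier winners."""
    best = {}
    for grid in stages:
        # Start each stage from the best values found in earlier stages.
        base = clone(estimator).set_params(**best)
        search = GridSearchCV(base, grid, scoring=scoring, cv=cv)
        search.fit(X, y)
        best.update(search.best_params_)
    return best


# Synthetic data, invented for this example only.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Stage order mirrors the excerpt: tree shape, then split penalty, then sampling.
stages = [
    {"max_depth": [2, 3], "min_samples_leaf": [1, 20]},
    {"min_impurity_decrease": [0.0, 0.1]},
    {"subsample": [0.7, 1.0], "max_features": [0.7, 1.0]},
]

# A relatively high learning rate keeps each fit cheap, as in step (1).
model = GradientBoostingClassifier(
    learning_rate=0.1, n_estimators=30, random_state=0
)
best_params = staged_search(model, stages, X, y)
```

With xgboost installed, the same helper should accept XGBClassifier(learning_rate=0.1) with stages built from max_depth and min_child_weight, then gamma, then subsample and colsample_bytree. Searching in stages is cheaper than one full grid, at the cost of possibly missing interactions between groups.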

LGBM

Summary: import pandas as pd; from lightgbm import LGBMClassifier; from sklearn.metrics import accuracy_score; df3 = pd.concat([df1, df2], axis=1); model = LGBMClassif...

posted @ 2023-12-09 14:32 广爷天下无双

XGB

Summary: from xgboost import XGBClassifier; model = XGBClassifier(learning_rate=0.1, max_depth=5, alpha=0.2); model.fit(x_train, y_train); p = model.predict_proba(x_te...

posted @ 2023-12-09 14:32 广爷天下无双

Random Forest

Summary: from sklearn.ensemble import RandomForestClassifier; model = RandomForestClassifier(n_estimators=22, max_depth=7, min_samples_split=33, min_samples_leaf=18)...

posted @ 2023-12-09 14:31 广爷天下无双
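
The random forest excerpt above is cut off after the constructor. Below is a self-contained sketch of how a model with those hyperparameters might be fitted and scored. The synthetic data, the split ratio, the fixed random_state and the use of AUC are assumptions, not part of the original post.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic data, invented for this example only.
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5, random_state=0
)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Hyperparameters copied from the excerpt; random_state added for repeatability.
model = RandomForestClassifier(
    n_estimators=22,
    max_depth=7,
    min_samples_split=33,
    min_samples_leaf=18,
    random_state=0,
)
model.fit(x_train, y_train)

# Probability of the positive class for each test sample.
p = model.predict_proba(x_test)[:, 1]
auc = roc_auc_score(y_test, p)
```

The fairly large min_samples_split and min_samples_leaf values limit how finely each tree can partition the data, which tends to reduce overfitting on small samples.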

Decision Tree

Summary: 2. Decision tree: from sklearn import tree; clf = tree.DecisionTreeClassifier(criterion="gini", max_depth=5, min_samples_split=2, min_samples_leaf=52); clf.fit(x_train...

posted @ 2023-12-09 14:31 广爷天下无双
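
To see what the decision tree settings in the excerpt above actually constrain, the sketch below fits a tree with the same hyperparameters on synthetic data and inspects its depth and leaf sizes. The data, the feature names and the random_state are assumptions made for illustration.

```python
from sklearn import tree
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic data, invented for this example only.
X, y = make_classification(
    n_samples=2000, n_features=6, n_informative=4, random_state=0
)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Hyperparameters copied from the excerpt; random_state added for repeatability.
clf = tree.DecisionTreeClassifier(
    criterion="gini",
    max_depth=5,
    min_samples_split=2,
    min_samples_leaf=52,
    random_state=0,
)
clf.fit(x_train, y_train)

# Leaves are the nodes without a left child (marked with -1).
is_leaf = clf.tree_.children_left == -1
smallest_leaf = int(clf.tree_.n_node_samples[is_leaf].min())
depth = clf.get_depth()

# Text dump of the fitted rules, useful for checking splits by eye.
feature_names = [f"x{i}" for i in range(X.shape[1])]
rules = tree.export_text(clf, feature_names=feature_names)
```

With min_samples_leaf=52, the min_samples_split=2 setting has no practical effect, because a node needs at least 104 samples before it can be split into two valid leaves.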

Logistic Regression

Summary: 2. Logistic regression. 2.1 The standard approach, but class balance has to be considered: import matplotlib.pyplot as plt; x = z.iloc[:, 0:7]; y = z.iloc[:, 7:]; from sklearn.model_selection import train_test_split; from sklea...

posted @ 2023-12-09 14:30 广爷天下无双
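
The excerpt above mentions that class balance needs attention but is cut off before showing how. One common option in scikit-learn is class_weight="balanced". The sketch below compares it with the default on synthetic imbalanced data. It is an assumption that the original post used something similar, and the data and names here are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic data with roughly 10% positives, invented for this example only.
X, y = make_classification(
    n_samples=3000, n_features=7, n_informative=4,
    weights=[0.9, 0.1], random_state=0,
)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Default model: every sample has the same weight.
plain = LogisticRegression(max_iter=1000).fit(x_train, y_train)

# Balanced model: samples are reweighted inversely to class frequency.
balanced = LogisticRegression(max_iter=1000, class_weight="balanced")
balanced.fit(x_train, y_train)

recall_plain = recall_score(y_test, plain.predict(x_test))
recall_balanced = recall_score(y_test, balanced.predict(x_test))
```

Reweighting usually raises recall on the minority class at the cost of more false positives, and it shifts the predicted probabilities, so scores from a balanced model should not be read as calibrated default rates.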

Plotting the KS Curve to Get the Threshold and the Exact KS

Summary: Trying out model code. 1. Plot the p values to compute KS: from sklearn.metrics import roc_curve; from sklearn.pipeline import make_pipeline; import matplotlib; import matplotlib.pyplot as pl...

posted @ 2023-12-09 14:29 广爷天下无双
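
The KS value and its threshold can be read directly from the output of roc_curve, which the excerpt above imports. The sketch below computes both without plotting. The helper name and the four toy scores are assumptions made for illustration.

```python
import numpy as np
from sklearn.metrics import roc_curve


def ks_and_threshold(y_true, p):
    """Return the KS statistic and the score threshold where it occurs."""
    fpr, tpr, thresholds = roc_curve(y_true, p)
    gap = tpr - fpr                 # distance between the two cumulative curves
    idx = int(np.argmax(gap))       # first position of the largest gap
    return float(gap[idx]), float(thresholds[idx])


# Toy labels and predicted probabilities, invented for this example only.
y_true = np.array([0, 0, 1, 1])
p = np.array([0.1, 0.4, 0.35, 0.8])
ks, threshold = ks_and_threshold(y_true, p)
```

Plotting tpr and fpr against the thresholds gives the KS curve itself, and the KS value is the largest vertical gap between the two lines.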
