XGBoost parameter tuning

I've been doing Kaggle competitions lately, and tuning XGBoost is a real problem: it eats time and effort, with a single parameter easily taking half an hour.
The idea is straightforward: narrow each parameter's value range step by step.
Suggestions:
Tune one parameter at a time.
For each parameter, start with three candidates spaced an order of magnitude apart around the default. For example, if the default is 1, try [0.1, 1, 10]. The order-of-magnitude spacing brackets the useful range; then shrink the step size so the value gets progressively more precise.
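The coarse-then-fine idea above can be sketched in plain Python. Here `score` is a hypothetical stand-in for a cross-validated model evaluation (in practice it would be a GridSearchCV run), and the helper name is made up for illustration:

```python
# Coarse-to-fine search for one hyperparameter: pick the best of an
# order-of-magnitude grid, then repeatedly re-grid around the winner
# with a smaller step.
def coarse_to_fine(score, candidates, rounds=2, factor=4):
    best = max(candidates, key=score)
    for _ in range(rounds):
        step = best / factor          # shrink the grid around the current best
        candidates = [best - step, best, best + step]
        best = max(candidates, key=score)
    return best

# Toy objective that peaks at 1.0, standing in for CV score.
best = coarse_to_fine(lambda x: -(x - 1.0) ** 2, [0.1, 1, 10])
```

The same loop applies to each parameter in turn, which is exactly the one-parameter-at-a-time workflow described above.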

param = {'subsample':[0.000001,0.00001,0.0001,0.0005]}
#           'learning_rate':[0.001,0.005,0.007,0.01,0.02,0.03]            
#          'colsample_bytree':[0.6,0.7,0.8,0.9,1,1.1,1.2],
#          'n_estimators':[400,500,600,700,1000],
#          'min_child_weight':[1,2,3,4,5,6],
#          'max_depth':[3,4,5,6,7,8,9,10],
#          'gamma': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
#          'reg_alpha': [0.05, 0.1, 1, 2, 3],
#          'reg_lambda': [0.05, 0.1, 1, 2, 3]

from xgboost import XGBRegressor
from sklearn.model_selection import GridSearchCV

XGBR = XGBRegressor(gpu_id=0,
                    single_precision_histogram=True,
                    n_estimators=500,
                    min_child_weight=1,
                    tree_method='gpu_hist',
                    eval_metric='mae',
                    objective='reg:squarederror',  # 'reg:linear' is deprecated
                    booster='gbtree',  # the tree parameters below require gbtree, not gblinear
                    verbosity=0,       # 'silent' is deprecated
                    n_jobs=-1,
                    learning_rate=0.02,
                    gamma=0,
                    subsample=0.8,
                    colsample_bytree=0.8,
                    max_depth=5,
                    reg_alpha=0,
                    reg_lambda=1)

model = GridSearchCV(XGBR, param_grid=param, cv=5, scoring='neg_mean_absolute_error')
model.fit(X_train, Y_train)

print("Best score: %0.3f" % model.best_score_)
print("Best parameters set:")
best_parameters = model.best_estimator_.get_params()
for param_name in sorted(best_parameters.keys()):
    print("\t%s: %r" % (param_name, best_parameters[param_name]))

LightGBM parameter tuning


from lightgbm import LGBMRegressor

param = {'alpha': [0.0001, 0.001, 0.002]}
#        'max_depth': range(3, 8, 2),
#        'num_leaves': range(50, 170, 30),
#        'min_child_samples': [18, 19, 20, 21, 22],
#        'min_child_weight': [0.001, 0.002],
#        'feature_fraction': [0.5, 0.6, 0.7, 0.8, 0.9],
#        'bagging_fraction': [0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

modelb = LGBMRegressor(device='gpu',
                       alpha=0.002,
                       learning_rate=0.1,
                       num_leaves=140,
                       min_child_samples=19,
                       min_child_weight=0.001,
                       objective='regression',
                       n_estimators=43,  # sets the number of boosting rounds in the sklearn API
                       max_depth=7,
                       bagging_fraction=0.8,  # alias of subsample; don't set both
                       feature_fraction=0.8,  # alias of colsample_bytree
                       metric='mae')
LGBMR = GridSearchCV(modelb, param_grid=param, cv=5, n_jobs=-1, verbose=1)

gs_results = LGBMR.fit(X_train, Y_train)

print("BEST PARAMETERS: " + str(gs_results.best_params_))
print("BEST CV SCORE: " + str(gs_results.best_score_))
    
with open('best_parameters.txt', 'w') as f:  # text mode; 'wb' would reject str
    f.write(str(gs_results.best_params_) + '\n')
    f.write(str(gs_results.best_score_) + '\n')
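If the saved parameters need to be reloaded later, JSON is sturdier than a raw `str()` dump. A minimal sketch, where the dict below is a hypothetical placeholder for `gs_results.best_params_`:

```python
import json
import os
import tempfile

# Stand-ins for gs_results.best_params_ / gs_results.best_score_ (made-up values).
best_params = {'alpha': 0.002, 'num_leaves': 140}
best_score = -0.123

path = os.path.join(tempfile.gettempdir(), 'best_parameters.json')
with open(path, 'w') as f:  # text mode: json.dump emits str, not bytes
    json.dump({'params': best_params, 'score': best_score}, f)

# Round-trip: the parameters come back as a dict, ready to pass to a model.
with open(path) as f:
    restored = json.load(f)
```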

posted on 2020-04-02 23:20 耀扬