xgboost
Basic concepts
Given a dataset $\mathcal{D} = \{(\mathbf{x}_i, y_i)\}$ with $n$ examples and $m$ features, a tree ensemble model uses $K$ additive functions to predict the output:

$$\hat{y}_i = \phi(\mathbf{x}_i) = \sum_{k=1}^{K} f_k(\mathbf{x}_i), \quad f_k \in \mathcal{F}$$

where

$$\mathcal{F} = \{f(\mathbf{x}) = w_{q(\mathbf{x})}\} \quad \big(q: \mathbb{R}^m \to \{1,\dots,T\},\ w \in \mathbb{R}^T\big)$$

is the set of CART regression trees: $q$ maps an example to a leaf index, $T$ is the number of leaves, and $w$ is the vector of leaf weights.
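As a minimal illustration (toy hand-written trees, not the xgboost library API), the additive prediction $\hat{y} = \sum_k f_k(\mathbf{x})$ can be sketched as:

```python
import numpy as np

# Toy sketch: each "tree" is a function mapping x to a leaf weight w_{q(x)}.
# Real CARTs learn the partition q from data; here the splits are hand-picked.
def tree_1(x):
    return 0.5 if x[0] < 1.0 else -0.5   # depth-1 tree, two leaves

def tree_2(x):
    return 0.2 if x[1] < 0.0 else -0.1

trees = [tree_1, tree_2]  # K = 2 additive functions f_k

def predict(x):
    # y_hat = sum over k of f_k(x)
    return sum(f(x) for f in trees)

print(predict(np.array([0.0, 1.0])))  # 0.5 + (-0.1) = 0.4
```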
Optimization objective

The ensemble is learned by minimizing a regularized objective:

$$\mathcal{L}(\phi) = \sum_i l(\hat{y}_i, y_i) + \sum_k \Omega(f_k)$$

where

$$\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^2$$

is the regularization term ($T$ is the number of leaves and $w$ the leaf weights of tree $f$), penalizing complex trees.
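Evaluating this objective numerically is straightforward; the sketch below uses made-up numbers, squared-error loss for $l$, and a single two-leaf tree:

```python
import numpy as np

# Hypothetical example values, not real training output.
y     = np.array([1.0, 0.0, 1.0])   # labels
y_hat = np.array([0.8, 0.2, 0.6])   # ensemble predictions
w     = np.array([0.5, -0.3])       # leaf weights of one tree, so T = 2
gamma, lam = 1.0, 1.0               # regularization hyperparameters

loss  = np.sum((y - y_hat) ** 2)                  # sum_i l(y_hat_i, y_i)
omega = gamma * len(w) + 0.5 * lam * np.dot(w, w)  # gamma*T + (lambda/2)*||w||^2
objective = loss + omega
print(objective)
```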
When training the model in an additive manner, at step $t$ we minimize the objective

$$\mathcal{L}^{(t)} = \sum_{i=1}^{n} l\big(y_i,\ \hat{y}_i^{(t-1)} + f_t(\mathbf{x}_i)\big) + \Omega(f_t)$$

with respect to the new tree $f_t$. That is, $f_t$ is fit to the difference between $y_i$ and $\hat{y}_i^{(t-1)}$, the prediction of the first $t-1$ trees.
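For squared-error loss this residual-fitting step is easy to see concretely. In the sketch below (toy numbers), $f_t$ is reduced to the simplest possible regression tree, a single leaf whose weight is the mean residual:

```python
import numpy as np

y          = np.array([3.0, 2.0, 4.0, 5.0])   # labels
y_hat_prev = np.array([2.5, 2.5, 2.5, 2.5])   # prediction of the first t-1 trees

residual = y - y_hat_prev     # the target f_t should fit (y_i - y_hat_i^{(t-1)})
f_t = residual.mean()         # one-leaf "tree": a single constant weight
y_hat = y_hat_prev + f_t      # updated additive prediction

print(residual, f_t, y_hat)
```

A real boosting step would grow a multi-leaf tree on these residuals instead of a single constant, but the additive update $\hat{y}^{(t)} = \hat{y}^{(t-1)} + f_t(\mathbf{x})$ is the same.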
Based on a second-order Taylor expansion

Expanding $l$ to second order around $\hat{y}_i^{(t-1)}$ gives

$$\mathcal{L}^{(t)} \simeq \sum_{i=1}^{n}\left[l\big(y_i, \hat{y}_i^{(t-1)}\big) + g_i f_t(\mathbf{x}_i) + \frac{1}{2} h_i f_t^2(\mathbf{x}_i)\right] + \Omega(f_t)$$

where $g_i = \partial_{\hat{y}^{(t-1)}} l\big(y_i, \hat{y}^{(t-1)}\big)$ and $h_i = \partial^2_{\hat{y}^{(t-1)}} l\big(y_i, \hat{y}^{(t-1)}\big)$. The bracketed expression is a quadratic curve passing through the point $\big(\hat{y}_i^{(t-1)},\ l(y_i, \hat{y}_i^{(t-1)})\big)$; it approximates $l(y_i, \cdot)$ in a neighborhood of $\hat{y}_i^{(t-1)}$, so we can optimize this quadratic approximation of $l$ instead of the loss itself.
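As a concrete example (logistic loss for binary classification, toy scores), the gradient and Hessian statistics $g_i$ and $h_i$ reduce to the familiar closed forms $p_i - y_i$ and $p_i(1 - p_i)$:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# l(y, s) = -[y*log(p) + (1-y)*log(1-p)] with p = sigmoid(s),
# where s is the raw additive score y_hat^{(t-1)}.
y = np.array([1.0, 0.0, 1.0])    # labels
s = np.array([0.0, 1.0, -2.0])   # current raw scores y_hat^{(t-1)}

p = sigmoid(s)
g = p - y            # g_i: first derivative of l w.r.t. s
h = p * (1.0 - p)    # h_i: second derivative of l w.r.t. s

print(g, h)
```

These per-example `g` and `h` arrays are exactly what a boosting step needs from the loss; the tree structure and leaf weights of $f_t$ are then chosen to minimize the quadratic objective built from them.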
Simplifying further by dropping $l\big(y_i, \hat{y}_i^{(t-1)}\big)$, which is constant with respect to $f_t$:

$$\tilde{\mathcal{L}}^{(t)} = \sum_{i=1}^{n}\left[g_i f_t(\mathbf{x}_i) + \frac{1}{2} h_i f_t^2(\mathbf{x}_i)\right] + \Omega(f_t)$$

where