Boosting is a very powerful algorithm-ensembling technique. Its outstanding performance comes from combining several (or many) weak classifiers into one strong classifier. Like bagging, it decides a sample's category by voting, but there is a significant difference: the base classifiers that make up the strong classifier usually have different 'voting rights'. The most widely used form of boosting is adaptive boosting (AdaBoost), which is the one we will talk about in detail.
A key difference between bagging and AdaBoost lies in training: the base classifiers must be trained in sequence, because the performance of the previous classifiers determines both the sample weights seen by the next classifier and that classifier's 'voting right'.
Points that are misclassified are assigned larger weights and correctly classified points smaller weights, so the next base classifier pays more 'attention' to the 'error' points. This process is repeated; once all the base classifiers have been trained, the final prediction combines all of their votes and assigns each point to the category with the largest total weight, as illustrated by the toy sketch below.
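To make the idea of different 'voting rights' concrete, here is a toy sketch; the classifier weights and votes are made up for illustration, not taken from any real training run:

```python
# Hypothetical example: three trained base classifiers vote on one sample.
# alphas are their "voting rights", votes are their individual predictions.
import numpy as np

alphas = np.array([0.2, 0.9, 0.3])   # per-classifier voting rights
votes = np.array([+1, -1, +1])       # each base classifier's prediction
decision = np.sign(np.sum(alphas * votes))
print(decision)                      # -1.0: the heavier dissenting vote wins
```

Even though two of the three classifiers vote +1, the ensemble outputs -1, because the dissenting classifier carries the largest weight.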
Consider a two-class classification problem, in which the training data comprise input vectors $x_1, \dots, x_N$ along with corresponding binary target variables $t_1, \dots, t_N$, where $t_n \in \{-1, +1\}$. We have a procedure available for training a base classifier $m$ using weighted data to give a function $y_m(x) \in \{-1, +1\}$. Each data point is given an associated weighting parameter $w_n$, which is initially set to $1/N$ for all data points.
AdaBoost Process:
1. Initialize the data weighting coefficients by setting $w_n^{(1)} = 1/N$ for $n = 1, \dots, N$.
2. For m = 1,2,...,M:
(a) Fit a classifier $y_m(x)$ to the training data by minimizing the weighted error function $J_m = \sum_{n=1}^{N} w_n^{(m)} I\big(y_m(x_n) \neq t_n\big)$, where $I(\cdot)$ is the indicator function. The corresponding weighted error rate is $\epsilon_m = \sum_{n=1}^{N} w_n^{(m)} I\big(y_m(x_n) \neq t_n\big) \big/ \sum_{n=1}^{N} w_n^{(m)}$.
PS: since classifier $m$ is only a weak classifier, we can naturally assume $\epsilon_m$ is always greater than 0 (and, for a learner better than random guessing, less than 1/2).
(b) Calculate classifier $m$'s coefficient: $\alpha_m = \frac{1}{2} \ln \frac{1 - \epsilon_m}{\epsilon_m}$.
PS: the coefficient $\frac{1}{2}$ disappears in some references. $\alpha_m$ has two functions: it is used to update the sample weights, and it then serves as classifier $m$'s coefficient in the final prediction function.
From the formula, we can see that $\alpha_m$ is inversely related to $\epsilon_m$: the smaller a classifier's weighted error, the larger its 'voting right'.
(c) Update the weights of the sample points: $w_n^{(m+1)} = \dfrac{w_n^{(m)} \exp\!\big(-\alpha_m\, t_n\, y_m(x_n)\big)}{Z_m}$, where $Z_m = \sum_{n=1}^{N} w_n^{(m)} \exp\!\big(-\alpha_m\, t_n\, y_m(x_n)\big)$ is a normalization factor that makes the updated weights sum to 1.
3. Make predictions by ensembling all the classifiers, each weighted by its own $\alpha_m$ as coefficient: $Y(x) = \operatorname{sign}\!\Big(\sum_{m=1}^{M} \alpha_m\, y_m(x)\Big)$. A minimal implementation sketch of the whole procedure is given below.
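The procedure above can be written in a few lines. Below is a minimal sketch, assuming numpy and scikit-learn are available and using depth-1 decision trees (stumps) as the weak learners; the function names `adaboost_fit` and `adaboost_predict` are just illustrative, not part of any library:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, t, M=50):
    """Train M weighted decision stumps; t must contain labels in {-1, +1}."""
    N = X.shape[0]
    w = np.full(N, 1.0 / N)                      # step 1: w_n^(1) = 1/N
    stumps, alphas = [], []
    for m in range(M):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, t, sample_weight=w)         # step 2(a): fit on weighted data
        y = stump.predict(X)
        eps = np.sum(w * (y != t)) / np.sum(w)   # weighted error rate eps_m
        eps = np.clip(eps, 1e-10, 1 - 1e-10)     # keep the log well defined
        alpha = 0.5 * np.log((1 - eps) / eps)    # step 2(b): alpha_m
        w = w * np.exp(-alpha * t * y)           # step 2(c): reweight the samples
        w = w / np.sum(w)                        # normalize by Z_m
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    """Step 3: sign of the alpha-weighted sum of the base predictions."""
    agg = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
    return np.sign(agg)
```

Each round simply repeats steps 2(a)-(c); the clip on $\epsilon_m$ only guards against a stump that is perfect (or completely wrong) on the weighted data.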
Actually, AdaBoost is an instance of the additive model (AM): $f(x) = \sum_{m=1}^{M} \beta_m\, b(x; \gamma_m)$,
where $b(x; \gamma_m)$ is called the base function, $\gamma_m$ holds the parameters of the base function, and $\beta_m$ is the coefficient of the base function.
Given data and a loss function $L$, the target of the AM is to minimize the combined loss function:
$\min_{\beta_m, \gamma_m} \sum_{i=1}^{N} L\!\Big(t_i, \sum_{m=1}^{M} \beta_m\, b(x_i; \gamma_m)\Big)$,
and this can be simplified by the forward stagewise algorithm:
at every step we only need to solve $(\beta_m, \gamma_m) = \arg\min_{\beta, \gamma} \sum_{i=1}^{N} L\!\big(t_i, f_{m-1}(x_i) + \beta\, b(x_i; \gamma)\big)$, where $f_{m-1}(x) = \sum_{k=1}^{m-1} \beta_k\, b(x; \gamma_k)$ is fixed from the previous steps.
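To see why AdaBoost is a special case, take the exponential loss $L(t, f(x)) = \exp(-t\, f(x))$ and plug it into the forward stagewise step; this is the standard derivation (see the PRML reference below), sketched here without the intermediate algebra:

```latex
% Forward stagewise step under exponential loss L(t, f) = exp(-t f):
\begin{aligned}
(\beta_m, y_m)
  &= \arg\min_{\beta,\, y} \sum_{n=1}^{N}
     \exp\!\bigl(-t_n\,[\,f_{m-1}(x_n) + \beta\, y(x_n)\,]\bigr) \\
  &= \arg\min_{\beta,\, y} \sum_{n=1}^{N} w_n^{(m)}
     \exp\!\bigl(-\beta\, t_n\, y(x_n)\bigr),
  \qquad w_n^{(m)} := \exp\!\bigl(-t_n f_{m-1}(x_n)\bigr).
\end{aligned}
```

Splitting the sum into correctly and incorrectly classified points and setting the derivative with respect to $\beta$ to zero gives $\beta_m = \frac{1}{2}\ln\frac{1-\epsilon_m}{\epsilon_m}$, which is exactly the $\alpha_m$ of step 2(b), and the definition of $w_n^{(m)}$ reproduces the weight update of step 2(c) up to normalization.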
References:
1. http://blog.csdn.net/v_july_v/article/details/40718799.
2. Pattern Recognition and Machine Learning, pp. 657-662.