AdaBoost - Python Implementation

  • Algorithm characteristics:
    ①. Weighted cascading of models (cascade weights); ②. feature selection on the samples; ③. updating of sample weights (association weights)
  • Algorithm principle:
    Part Ⅰ:
    Given the following original data set:
    \begin{equation}
    D = \{(x^{(1)}, \bar{y}^{(1)}), (x^{(2)}, \bar{y}^{(2)}), \cdots, (x^{(n)}, \bar{y}^{(n)})\}, \quad\text{where }\bar{y}^{(i)} \in \{-1, +1\}
    \label{eq_1}
    \end{equation}
    Let the maximum number of cascaded, weighted weak models be $T$. For the $t$-th weak model $h_t(x)$, the distribution of sample association weights is:
    \begin{equation}
    W_t = \{ w_t^{(1)}, w_t^{(2)}, \cdots, w_t^{(n)} \}
    \label{eq_2}
    \end{equation}
    The main procedure is as follows (a minimal code sketch of this loop is given at the end of Part Ⅰ):
    Step 1. Initialize the sample association weights uniformly:
    \begin{equation}
    W_1 = \{ w_1^{(1)}, w_1^{(2)}, \cdots, w_1^{(n)}\}, \quad\text{where }w_1^{(i)} = \frac{1}{n}
    \label{eq_3}
    \end{equation}
    Step 2. For iterations $t = 1, 2, \cdots, T$, fit a weak model $h_t(x) \in \{-1, +1\}$ on the training samples under the association-weight distribution $W_t$.
    Notes:
    ①. Building this weak model relies on a feature-selection operation over the training samples;
    ②. If the weak learner supports weighted samples, update the model by re-weighting (adjusting the loss function); otherwise, update it by re-sampling (sampling with replacement).
    Step 3. Compute the weighted error rate of the weak model $h_t(x)$ on the training samples:
    \begin{equation}
    \epsilon_t = \sum_{i=1}^nw_t^{(i)}I(h_t(x^{(i)}) \neq \bar{y}^{(i)})
    \label{eq_4}
    \end{equation}
    Then compute the cascade weight of the weak model $h_t(x)$ in the final decision model:
    \begin{equation}
    \alpha_t = \frac{1}{2}\mathrm{ln}\frac{1-\epsilon_t}{\epsilon_t}
    \label{eq_5}
    \end{equation}
    Step 4. Update the sample association weights:
    \begin{equation}
    \begin{split}
    &W_{t+1} = \{ w_{t+1}^{(1)}, w_{t+1}^{(2)}, \cdots, w_{t+1}^{(n)} \}   \\
    &w_{t+1}^{(i)} = \frac{w_{t}^{(i)}}{Z_t}\mathrm{exp}(-\alpha_t\bar{y}^{(i)}h_t(x^{(i)}))   \\
    &Z_t = \sum_{i=1}^{n}w_{t}^{(i)}\mathrm{exp}(-\alpha_t\bar{y}^{(i)}h_t(x^{(i)}))
    \end{split}
    \label{eq_6}
    \end{equation}
    Step 5. Return to Step 2.
    Step 6. The final decision model is:
    \begin{equation}
    \begin{split}
    f(x) &= \sum_{t=1}^{T}\alpha_th_t(x)   \\
    h_{final}(x) &= \mathrm{sign}(f(x)) = \mathrm{sign}(\sum_{t=1}^T\alpha_th_t(x))
    \end{split}
    \label{eq_7}
    \end{equation}
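    As an illustration of Steps 1-6, here is a minimal sketch of the training loop. It assumes a hypothetical helper fit_weak_learner(X, y, w) (not part of the implementation below) that fits any weighted weak learner and returns a callable h with h(X) taking values in {-1, +1}:

    import numpy

    def adaboost_train(X, y, fit_weak_learner, T=100):
        # X: (n, d) feature matrix; y: (n,) labels in {-1, +1}
        n = X.shape[0]
        w = numpy.ones(n) / n                            # Step 1: uniform association weights
        models, alphas = [], []
        for t in range(T):
            # Step 2: fit a weak model under the current weight distribution W_t.
            # If the learner cannot handle sample weights, a weighted bootstrap
            # resample of (X, y) could be drawn here instead (as the SVM-based
            # implementation below does).
            h = fit_weak_learner(X, y, w)
            pred = h(X)                                  # predictions in {-1, +1}
            eps = numpy.sum(w * (pred != y))             # Step 3: weighted error rate
            if eps == 0:                                 # perfect weak model: keep it alone and stop
                return [h], [1.0]
            if eps >= 0.5:                               # no better than chance: stop cascading
                break
            alpha = 0.5 * numpy.log((1 - eps) / eps)     # cascade weight
            models.append(h)
            alphas.append(alpha)
            w = w * numpy.exp(-alpha * y * pred)         # Step 4: re-weight the samples
            w = w / numpy.sum(w)                         # normalize by Z_t
        return models, alphas

    def adaboost_predict(X, models, alphas):
        # Step 6: sign of the weighted vote f(x) = sum_t alpha_t * h_t(x)
        f = sum(alpha * h(X) for h, alpha in zip(models, alphas))
        return numpy.where(f >= 0, 1, -1)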
    Part Ⅱ:
    Theorem:
    If each weak model $h_t(x)$ has weighted error rate $\epsilon_t < 0.5$ on the training set, then as $T$ increases, the error rate $E$ of the final AdaBoost decision model $h_{final}(x)$ on the training set becomes ever smaller; specifically:
    \begin{equation*}
    E = \frac{1}{n}\sum_{i=1}^{n}I(h_{final}(x^{(i)}) \neq \bar{y}^{(i)}) \leq \frac{1}{n}\sum_{i=1}^{n}\mathrm{exp}(-\bar{y}^{(i)}f(x^{(i)})) = \prod_{t=1}^{T}Z_t
    \end{equation*}
    Proof:
    Whenever $h_{final}(x^{(i)}) \neq \bar{y}^{(i)}$ we have $\bar{y}^{(i)}f(x^{(i)}) \leq 0$ and hence $\mathrm{exp}(-\bar{y}^{(i)}f(x^{(i)})) \geq 1 = I(h_{final}(x^{(i)}) \neq \bar{y}^{(i)})$, which gives the first inequality below:
    \begin{equation*}
    \begin{split}
    E &\leq \frac{1}{n}\sum_{i=1}^n\mathrm{exp}(-\bar{y}^{(i)}f(x^{(i)}))   \\
    &=\frac{1}{n}\sum_{i=1}^n\mathrm{exp}(-\bar{y}^{(i)}\sum_{t=1}^T\alpha_th_t(x^{(i)}))   \\
    &=\frac{1}{n}\sum_{i=1}^n\mathrm{exp}(\sum_{t=1}^T-\alpha_t\bar{y}^{(i)}h_t(x^{(i)}))   \\
    &=\sum_{i=1}^nw_1^{(i)}\prod_{t=1}^T\mathrm{exp}(-\alpha_t\bar{y}^{(i)}h_t(x^{(i)}))   \\
    &=\sum_{i=1}^nw_1^{(i)}\mathrm{exp}(-\alpha_1\bar{y}^{(i)}h_1(x^{(i)}))\prod_{t=2}^T\mathrm{exp}(-\alpha_t\bar{y}^{(i)}h_t(x^{(i)}))   \\
    &=\sum_{i=1}^nZ_1w_2^{(i)}\prod_{t=2}^T\mathrm{exp}(-\alpha_t\bar{y}^{(i)}h_t(x^{(i)}))   \\
    &=Z_1\sum_{i=1}^nw_2^{(i)}\prod_{t=2}^T\mathrm{exp}(-\alpha_t\bar{y}^{(i)}h_t(x^{(i)}))   \\
    &=\prod_{t=1}^TZ_t
    \end{split}
    \end{equation*}
    \begin{equation*}
    \begin{split}
    Z_t &= \sum_{i=1}^{n}w_{t}^{(i)}\mathrm{exp}(-\alpha_t\bar{y}^{(i)}h_t(x^{(i)}))   \\
    &=\sum_{i=1; \bar{y}^{(i)}=h_t(x^{(i)})}^nw_t^{(i)}\mathrm{exp}(-\alpha_t) + \sum_{i=1; \bar{y}^{(i)} \neq h_t(x^{(i)})}^nw_t^{(i)}\mathrm{exp}(\alpha_t)   \\
    &=\mathrm{exp}(-\alpha_t)\sum_{i=1; \bar{y}^{(i)}=h_t(x^{(i)})}^nw_t^{(i)} + \mathrm{exp}(\alpha_t)\sum_{i=1; \bar{y}^{(i)} \neq h_t(x^{(i)})}^nw_t^{(i)}   \\
    &=\mathrm{exp}(-\alpha_t)(1 - \epsilon_t) + \mathrm{exp}(\alpha_t)\epsilon_t   \\
    &=2\sqrt{\epsilon_t(1 - \epsilon_t)}
    \end{split}
    \end{equation*}
    For a weak model $h_t(x)$ whose weighted error rate on the training set satisfies $\epsilon_t < 0.5$, we have $Z_t = 2\sqrt{\epsilon_t(1-\epsilon_t)} < 1$. Therefore, as $T$ increases, the upper bound $\prod_{t=1}^{T}Z_t$ on the training error $E$ of the final AdaBoost decision model $h_{final}(x)$ decreases exponentially, and $E$ becomes ever smaller. Q.E.D.
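    For intuition on how fast this bound shrinks, suppose for illustration that every round attains the same weighted error $\epsilon_t = 0.3$. Then
    \begin{equation*}
    Z_t = 2\sqrt{0.3 \times 0.7} \approx 0.917, \qquad E \leq \prod_{t=1}^{T}Z_t \approx 0.917^{T}, \qquad 0.917^{50} \approx 0.013
    \end{equation*}
    so about fifty such weak models already push the training-error bound close to 1%.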
  • Code implementation:
    Here, AdaBoost is implemented with a linear SVM as the weak model. For details on the SVM itself, see:
    Smooth Support Vector Machine - Python Implementation
    # Implementation of AdaBoost
    # Note: implemented here by cascading weighted SVMs


    import numpy
    from matplotlib import pyplot as plt



    # Generate one point of each class on a pair of interleaved spirals
    def spiral_point(val, center=(0, 0)):
        rn = 0.4 * (105 - val) / 104
        an = numpy.pi * (val - 1) / 25

        x0 = center[0] + rn * numpy.sin(an)
        y0 = center[1] + rn * numpy.cos(an)
        z0 = -1
        x1 = center[0] - rn * numpy.sin(an)
        y1 = center[1] - rn * numpy.cos(an)
        z1 = 1

        return (x0, y0, z0), (x1, y1, z1)



    # Assemble the two classes of spiral data; each row is (x1, x2, label)
    def spiral_data(valList):
        dataList = list(spiral_point(val) for val in valList)
        data0 = numpy.array(list(item[0] for item in dataList))
        data1 = numpy.array(list(item[1] for item in dataList))
        return data0, data1


    class SSVM(object):

        def __init__(self, trainingSet, c=1, mu=1, beta=100):
            self.__trainingSet = trainingSet                     # training data, one row per sample: (x1, x2, label)
            self.__c = c                                         # weight of the error term
            self.__mu = mu                                       # kernel parameter (used by the Gaussian kernel)
            self.__beta = beta                                   # smoothing parameter

            self.__A, self.__D = self.__get_AD()


        def get_cls(self, x, alpha, b):
            A, D = self.__A, self.__D
            mu = self.__mu

            x = numpy.array(x).reshape((-1, 1))
            KAx = self.__get_KAx(A, x, mu)
            clsVal = self.__calc_hVal(KAx, D, alpha, b)
            if clsVal >= 0:
                return 1
            else:
                return -1


        def optimize(self, maxIter=1000, epsilon=1.e-9):
            '''
            maxIter: maximum number of iterations
            epsilon: convergence criterion; converged once the gradient approaches zero
            '''
            A, D = self.__A, self.__D
            c = self.__c
            mu = self.__mu
            beta = self.__beta

            alpha, b = self.__init_alpha_b((A.shape[1], 1))
            KAA = self.__get_KAA(A, mu)

            JVal = self.__calc_JVal(KAA, D, c, beta, alpha, b)
            grad = self.__calc_grad(KAA, D, c, beta, alpha, b)
            DMat = self.__init_D(KAA.shape[0] + 1)

            for i in range(maxIter):
                # print("iterCnt: {:3d},   JVal: {}".format(i, JVal))
                if self.__converged1(grad, epsilon):
                    return alpha, b, True

                dCurr = -numpy.matmul(DMat, grad)
                ALPHA = self.__calc_ALPHA_by_ArmijoRule(alpha, b, JVal, grad, dCurr, KAA, D, c, beta)

                delta = ALPHA * dCurr
                alphaNew = alpha + delta[:-1, :]
                bNew = b + delta[-1, -1]
                JValNew = self.__calc_JVal(KAA, D, c, beta, alphaNew, bNew)
                if self.__converged2(delta, JValNew - JVal, epsilon ** 2):
                    return alphaNew, bNew, True

                gradNew = self.__calc_grad(KAA, D, c, beta, alphaNew, bNew)
                DMatNew = self.__update_D_by_BFGS(delta, gradNew - grad, DMat)

                alpha, b, JVal, grad, DMat = alphaNew, bNew, JValNew, gradNew, DMatNew
            else:
                if self.__converged1(grad, epsilon):
                    return alpha, b, True
            return alpha, b, False


        def __update_D_by_BFGS(self, sk, yk, D):
            rk = 1 / (numpy.matmul(yk.T, sk)[0, 0] + 1.e-30)

            term1 = rk * numpy.matmul(sk, yk.T)
            term2 = rk * numpy.matmul(yk, sk.T)
            I = numpy.identity(term1.shape[0])
            term3 = numpy.matmul(I - term1, D)
            term4 = numpy.matmul(term3, I - term2)
            term5 = rk * numpy.matmul(sk, sk.T)

            DNew = term4 + term5
            return DNew


        def __calc_ALPHA_by_ArmijoRule(self, alphaCurr, bCurr, JCurr, gCurr, dCurr, KAA, D, c, beta, C=1.e-4, v=0.5):
            i = 0
            ALPHA = v ** i
            delta = ALPHA * dCurr
            alphaNext = alphaCurr + delta[:-1, :]
            bNext = bCurr + delta[-1, -1]
            JNext = self.__calc_JVal(KAA, D, c, beta, alphaNext, bNext)
            while True:
                if JNext <= JCurr + C * ALPHA * numpy.matmul(dCurr.T, gCurr)[0, 0]: break
                i += 1
                ALPHA = v ** i
                delta = ALPHA * dCurr
                alphaNext = alphaCurr + delta[:-1, :]
                bNext = bCurr + delta[-1, -1]
                JNext = self.__calc_JVal(KAA, D, c, beta, alphaNext, bNext)
            return ALPHA


        def __converged1(self, grad, epsilon):
            if numpy.linalg.norm(grad, ord=numpy.inf) <= epsilon:
                return True
            return False


        def __converged2(self, delta, JValDelta, epsilon):
            val1 = numpy.linalg.norm(delta, ord=numpy.inf)
            val2 = numpy.abs(JValDelta)
            if val1 <= epsilon or val2 <= epsilon:
                return True
            return False


        def __init_D(self, n):
            D = numpy.identity(n)
            return D


        def __calc_grad(self, KAA, D, c, beta, alpha, b):
            grad_J1 = self.__calc_grad_J1(alpha)
            grad_J2 = self.__calc_grad_J2(KAA, D, c, beta, alpha, b)
            grad = grad_J1 + grad_J2
            return grad


        def __calc_grad_J2(self, KAA, D, c, beta, alpha, b):
            grad_J2 = numpy.zeros((KAA.shape[0] + 1, 1))
            Y = numpy.matmul(D, numpy.ones((D.shape[0], 1)))
            YY = numpy.matmul(Y, Y.T)
            KAAYY = KAA * YY

            z = 1 - numpy.matmul(KAAYY, alpha) - Y * b
            p = numpy.array(list(self.__p(z[i, 0], beta) for i in range(z.shape[0]))).reshape((-1, 1))
            s = numpy.array(list(self.__s(z[i, 0], beta) for i in range(z.shape[0]))).reshape((-1, 1))
            term = p * s

            for k in range(grad_J2.shape[0] - 1):
                val = -c * Y[k, 0] * numpy.sum(Y * KAA[:, k:k+1] * term)
                grad_J2[k, 0] = val
            grad_J2[-1, 0] = -c * numpy.sum(Y * term)
            return grad_J2


        def __calc_grad_J1(self, alpha):
            grad_J1 = numpy.vstack((alpha, [[0]]))
            return grad_J1


        def __calc_JVal(self, KAA, D, c, beta, alpha, b):
            J1 = self.__calc_J1(alpha)
            J2 = self.__calc_J2(KAA, D, c, beta, alpha, b)
            JVal = J1 + J2
            return JVal


        def __calc_J2(self, KAA, D, c, beta, alpha, b):
            tmpOne = numpy.ones((KAA.shape[0], 1))
            x = tmpOne - numpy.matmul(numpy.matmul(numpy.matmul(D, KAA), D), alpha) - numpy.matmul(D, tmpOne) * b
            p = numpy.array(list(self.__p(x[i, 0], beta) for i in range(x.shape[0])))
            J2 = numpy.sum(p * p) * c / 2
            return J2


        def __calc_J1(self, alpha):
            J1 = numpy.sum(alpha * alpha) / 2
            return J1


        def __get_KAA(self, A, mu):
            KAA = numpy.zeros((A.shape[1], A.shape[1]))
            for rowIdx in range(KAA.shape[0]):
                for colIdx in range(rowIdx + 1):
                    x1 = A[:, rowIdx:rowIdx+1]
                    x2 = A[:, colIdx:colIdx+1]
                    val = self.__calc_gaussian(x1, x2, mu)
                    KAA[rowIdx, colIdx] = KAA[colIdx, rowIdx] = val
            return KAA


        def __get_KAx(self, A, x, mu):
            KAx = numpy.zeros((A.shape[1], 1))
            for rowIdx in range(KAx.shape[0]):
                x1 = A[:, rowIdx:rowIdx+1]
                val = self.__calc_gaussian(x1, x, mu)
                KAx[rowIdx, 0] = val
            return KAx


        def __calc_hVal(self, KAx, D, alpha, b):
            hVal = numpy.matmul(numpy.matmul(alpha.T, D), KAx)[0, 0] + b
            return hVal


        def __calc_gaussian(self, x1, x2, mu):
            # val = numpy.exp(-mu * numpy.linalg.norm(x1 - x2) ** 2)    # Gaussian kernel
            # val = (numpy.sum(x1 * x2) + 1) ** 1                       # polynomial kernel
            val = numpy.sum(x1 * x2)                                    # linear kernel
            return val


        def __init_alpha_b(self, shape):
            '''
            initialization of alpha and b
            '''
            alpha, b = numpy.zeros(shape), 0
            return alpha, b


        def __get_AD(self):
            A = self.__trainingSet[:, :2].T
            D = numpy.diag(self.__trainingSet[:, 2])
            return A, D


        def __p(self, x, beta):
            term = x * beta
            if term > 10:
                val = x + numpy.log(1 + numpy.exp(-term)) / beta
            else:
                val = numpy.log(numpy.exp(term) + 1) / beta
            return val


        def __s(self, x, beta):
            term = x * beta
            if term > 10:
                val = 1 / (numpy.exp(-beta * x) + 1)
            else:
                term1 = numpy.exp(term)
                val = term1 / (1 + term1)
            return val


    class AdaBoost(object):

        def __init__(self, trainingSet):
            self.__trainingSet = trainingSet            # training data set

            self.__W = self.__init_weight()             # initialize the association (sample) weights


        def get_weakModels(self, T=100):
            '''
            T: maximum number of weak models
            '''
            T = T if T >= 1 else 1
            W = self.__W
            trainingSet = self.__trainingSet

            weakModels = list()                               # list of cascaded weak models (SVMs)
            alphaList = list()                                # list of cascade weights
            for t in range(T):
                print("getting the {}th weak model...".format(t))
                weakModel, alpha = self.__get_weakModel(trainingSet, W, 0.49)
                weakModels.append(weakModel)
                alphaList.append(alpha)

                W = self.__update_W(W, weakModel, alpha)      # update the association weights
            else:
                realErr = self.__calc_realErr(weakModels, alphaList, trainingSet)  # error rate of the full cascade
                print("Final error rate: {}".format(realErr))
            return weakModels, alphaList


        def get_realErr(self, weakModels, alphaList, dataSet=None):
            '''
            compute the error rate of AdaBoost on the given data set
            weakModels: list of weak models
            alphaList: list of cascade weights
            dataSet: data set to evaluate on
            '''
            if dataSet is None:
                dataSet = self.__trainingSet

            realErr = self.__calc_realErr(weakModels, alphaList, dataSet)
            return realErr


        def get_cls(self, x, weakModels, alphaList):
            hVal = self.__calc_hVal(x, weakModels, alphaList)
            if hVal >= 0:
                return 1
            else:
                return -1


        def __calc_realErr(self, weakModels, alphaList, dataSet):
            cnt = 0
            num = dataSet.shape[0]
            for sample in dataSet:
                x, y_ = sample[:-1], sample[-1]
                y = self.get_cls(x, weakModels, alphaList)
                if y_ != y:
                    cnt += 1
            err = cnt / num
            return err


        def __calc_hVal(self, x, weakModels, alphaList):
            if len(weakModels) == 0:
                raise Exception("Weak model list is empty!")

            hVal = 0
            for (ssvmObj, ssvmRet), alpha in zip(weakModels, alphaList):
                hVal += ssvmObj.get_cls(x, ssvmRet[0], ssvmRet[1]) * alpha
            return hVal


        def __update_W(self, W, weakModel, alpha):
            ssvmObj, ssvmRet = weakModel
            trainingSet = self.__trainingSet

            WNew = list()
            for sample in trainingSet:
                x, y_ = sample[:-1], sample[-1]
                val = numpy.exp(-alpha * y_ * ssvmObj.get_cls(x, ssvmRet[0], ssvmRet[1]))
                WNew.append(val)
            WNew = numpy.array(WNew) * W
            WNew = WNew / numpy.sum(WNew)
            return WNew


        def __get_weakModel(self, trainingSet, W, maxEpsilon=0.5, maxIter=100):
            '''
            obtain a weak model together with its cascade weight
            W: association weights
            maxEpsilon: maximum acceptable weighted error rate
            maxIter: maximum number of attempts
            '''
            roulette = self.__build_roulette(W)
            for idx in range(maxIter):
                dataSet = self.__get_dataSet(trainingSet, roulette)
                weakModel = self.__build_weakModel(dataSet)

                epsilon = self.__calc_weightedErr(trainingSet, weakModel, W)
                if epsilon == 0:
                    raise Exception("The model is not weak enough with epsilon = 0")
                elif epsilon < maxEpsilon:
                    alpha = self.__calc_alpha(epsilon)
                    return weakModel, alpha
            else:
                raise Exception("Failed to get a weak model after {} iterations!".format(maxIter))


        def __calc_alpha(self, epsilon):
            '''
            compute the cascade weight
            '''
            alpha = numpy.log(1 / epsilon - 1) / 2
            return alpha


        def __calc_weightedErr(self, trainingSet, weakModel, W):
            '''
            compute the weighted error rate
            '''
            ssvmObj, (alpha, b, tab) = weakModel

            epsilon = 0
            for idx, w in enumerate(W):
                x, y_ = trainingSet[idx, :-1], trainingSet[idx, -1]
                y = ssvmObj.get_cls(x, alpha, b)
                if y_ != y:
                    epsilon += w
            return epsilon


        def __build_weakModel(self, dataSet):
            '''
            build an SVM weak model
            '''
            ssvmObj = SSVM(dataSet, c=0.1, mu=250, beta=100)
            ssvmRet = ssvmObj.optimize()
            return (ssvmObj, ssvmRet)


        def __get_dataSet(self, trainingSet, roulette):
            # roulette-wheel resampling with replacement according to the association weights
            randomDart = numpy.sort(numpy.random.uniform(0, 1, trainingSet.shape[0]))
            dataSet = list()
            idxRoulette = idxDart = 0
            while idxDart < len(randomDart):
                if randomDart[idxDart] > roulette[idxRoulette]:
                    idxRoulette += 1
                else:
                    dataSet.append(trainingSet[idxRoulette])
                    idxDart += 1
            return numpy.array(dataSet)


        def __build_roulette(self, W):
            roulette = list()
            val = 0
            for ele in W:
                val += ele
                roulette.append(val)
            roulette[-1] = 1.0        # guard against floating-point round-off in the cumulative sum
            return roulette


        def __init_weight(self):
            num = self.__trainingSet.shape[0]
            W = numpy.ones(num) / num
            return W


    class AdaBoostPlot(object):

        @staticmethod
        def data_plot(trainingData0, trainingData1):
            fig = plt.figure(figsize=(5, 5))
            ax1 = plt.subplot()

            ax1.scatter(trainingData1[:, 0], trainingData1[:, 1], c="red", marker="o", s=10, label="Positive")
            ax1.scatter(trainingData0[:, 0], trainingData0[:, 1], c="blue", marker="o", s=10, label="Negative")

            ax1.set(xlim=(-0.5, 0.5), ylim=(-0.5, 0.5), xlabel="$x_1$", ylabel="$x_2$")
            ax1.legend(fontsize="x-small")

            fig.savefig("data.png", dpi=100)
            # plt.show()
            plt.close()


        @staticmethod
        def pred_plot(trainingData0, trainingData1, adaObj, weakModels, alphaList):
            x = numpy.linspace(-0.5, 0.5, 500)
            y = numpy.linspace(-0.5, 0.5, 500)
            x, y = numpy.meshgrid(x, y)
            z = numpy.zeros(x.shape)
            for rowIdx in range(x.shape[0]):
                print("on the {}th row".format(rowIdx))
                for colIdx in range(x.shape[1]):
                    z[rowIdx, colIdx] = adaObj.get_cls((x[rowIdx, colIdx], y[rowIdx, colIdx]), weakModels, alphaList)

            errList = list()
            for idx in range(len(weakModels)):
                tmpWeakModels = weakModels[:idx+1]
                tmpAlphaList = alphaList[:idx+1]
                realErr = adaObj.get_realErr(tmpWeakModels, tmpAlphaList)
                print("idx = {}; realErr = {}".format(idx, realErr))
                errList.append(realErr)

            fig = plt.figure(figsize=(10, 3))
            ax1 = plt.subplot(1, 2, 1)
            ax2 = plt.subplot(1, 2, 2)

            ax1.plot(numpy.arange(len(errList))+1, errList, linestyle="--", marker=".")
            ax1.set(xlabel="T", ylabel="error rate")

            ax2.contourf(x, y, z, levels=[-1.5, 0, 1.5], colors=["blue", "red"], alpha=0.3)
            ax2.scatter(trainingData1[:, 0], trainingData1[:, 1], c="red", marker="o", s=10, label="Positive")
            ax2.scatter(trainingData0[:, 0], trainingData0[:, 1], c="blue", marker="o", s=10, label="Negative")
            ax2.set(xlim=(-0.5, 0.5), ylim=(-0.5, 0.5), xlabel="$x_1$", ylabel="$x_2$")
            ax2.legend(loc="upper left", fontsize="x-small")
            fig.tight_layout()
            fig.savefig("pred.png", dpi=100)
            # plt.show()
            plt.close()


    if __name__ == "__main__":
        # generate the training data set
        trainingValList = numpy.arange(1, 101, 1)
        trainingData0, trainingData1 = spiral_data(trainingValList)
        trainingSet = numpy.vstack((trainingData0, trainingData1))

        adaObj = AdaBoost(trainingSet)
        weakModels, alphaList = adaObj.get_weakModels(200)

        AdaBoostPlot.data_plot(trainingData0, trainingData1)
        AdaBoostPlot.pred_plot(trainingData0, trainingData1, adaObj, weakModels, alphaList)

    The distribution of the training data set used here is shown below (data.png, generated by AdaBoostPlot.data_plot): two interleaved spirals of 100 points each.

    Clearly, this data set is not linearly separable, so a single linear SVM on its own would give poor classification results; a quick baseline check is sketched below.
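    For reference, this baseline can be checked with a quick sketch that reuses the SSVM class above (assuming the listing is available in the same module; the reported number depends on the optimizer's convergence):

    # Single linear-SVM baseline on the full spiral training set
    trainingData0, trainingData1 = spiral_data(numpy.arange(1, 101, 1))
    trainingSet = numpy.vstack((trainingData0, trainingData1))

    ssvmObj = SSVM(trainingSet, c=0.1, mu=250, beta=100)   # same hyper-parameters as __build_weakModel
    alpha, b, converged = ssvmObj.optimize()

    # Unweighted error rate of this single SVM on the training set
    errCnt = sum(1 for sample in trainingSet if ssvmObj.get_cls(sample[:-1], alpha, b) != sample[-1])
    print("single linear SVM training error: {}".format(errCnt / trainingSet.shape[0]))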
  • Results:
    In pred.png, the left panel shows how the training error $E$ changes with the number of weak models $T$, and the right panel shows AdaBoost's final classification result on this training set. Compared with a single linear SVM, AdaBoost, by cascading many linear SVMs, reduces the training error from the initial 0.45 down to 0.12, greatly enhancing the expressive power of the weak model.
  • Usage notes:
    ①. Take care to distinguish the cascade weight $\alpha$ from the association weight $w$;
    ②. Take care to distinguish the error rate $E$ from the weighted error rate $\epsilon$ (both distinctions are recapped in the formulas below).
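    For convenience, the quantities behind these two distinctions, as defined in Part Ⅰ:
    \begin{equation*}
    \begin{split}
    &\text{per weak model: } \epsilon_t = \sum_{i=1}^n w_t^{(i)} I(h_t(x^{(i)}) \neq \bar{y}^{(i)}), \qquad \alpha_t = \frac{1}{2}\mathrm{ln}\frac{1-\epsilon_t}{\epsilon_t}   \\
    &\text{per sample: } w_{t+1}^{(i)} \propto w_t^{(i)}\mathrm{exp}(-\alpha_t\bar{y}^{(i)}h_t(x^{(i)})), \qquad \text{final model: } E = \frac{1}{n}\sum_{i=1}^n I(h_{final}(x^{(i)}) \neq \bar{y}^{(i)})
    \end{split}
    \end{equation*}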
  • References:
    Boosting之AdaBoost算法