Model trees: replace the regression tree's piecewise-constant prediction with linear regression, which predicts nonlinear data better

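Having covered tree regression, let me briefly mention model trees. In a regression tree, each internal node holds a feature and a split value, chosen so that the variance of the target values in the resulting subsets is as small as possible, and each leaf predicts a constant. If the leaves are replaced with linear functions, so that the prediction becomes piecewise linear, the result is a model tree, as shown in (Figure 6):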

(Figure 6)

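The data in (Figure 6) clearly consists of two straight lines: one segment over X in (0.0-0.3) and another over (0.3-1.0). If we use two leaf nodes to hold two linear regression models, this data is fit. The implementation is also fairly simple; the code is as follows: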


import numpy as np

def linearSolve(dataSet):  # helper function used in two places
    # dataSet is expected to be a NumPy matrix whose last column is the target
    m, n = np.shape(dataSet)
    # build X with a column of 1s in position 0 (the intercept term)
    X = np.asmatrix(np.ones((m, n)))
    X[:, 1:n] = dataSet[:, 0:n-1]
    Y = dataSet[:, -1]  # strip out Y
    xTx = X.T * X
    if np.linalg.det(xTx) == 0.0:
        raise NameError('This matrix is singular, cannot do inverse,\n'
                        'try increasing the second value of ops')
    ws = xTx.I * (X.T * Y)  # normal-equation solution
    return ws, X, Y

def modelLeaf(dataSet):  # fit a linear model and return its coefficients
    ws, X, Y = linearSolve(dataSet)
    return ws

def modelErr(dataSet):  # squared error of the linear fit on this subset
    ws, X, Y = linearSolve(dataSet)
    yHat = X * ws
    return np.sum(np.power(Y - yHat, 2))

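The code is similar to tree regression, except that modelLeaf has to fit a linear regression when it returns a leaf node, which linearSolve does. The last function, modelErr, plays the same role as the regression tree's regErr function.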
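To make the difference between the two kinds of leaves concrete, here is a minimal sketch, not the book's code. It assumes made-up, noise-free data with a break at x = 0.3 (this is not ex00.txt), uses np.linalg.lstsq instead of the explicit inverse in linearSolve, and the helper names fit_line and predict are hypothetical. It compares the squared error of a single line, of two constant leaves, and of two linear leaves.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y ~ w0 + w1*x; returns (weights, squared error)."""
    X = np.column_stack([np.ones_like(x), x])  # column of 1s for the intercept
    ws, *_ = np.linalg.lstsq(X, y, rcond=None)
    err = float(np.sum((y - X @ ws) ** 2))
    return ws, err

def predict(x, split, ws_left, ws_right):
    """Evaluate a two-leaf model tree: choose the leaf by comparing x to the split."""
    x = np.asarray(x, dtype=float)
    left = ws_left[0] + ws_left[1] * x
    right = ws_right[0] + ws_right[1] * x
    return np.where(x <= split, left, right)

# Synthetic data, invented for illustration: two line segments meeting near x = 0.3.
split = 0.3
x = np.linspace(0.0, 1.0, 101)
left_mask = x <= split
y = np.where(left_mask, 1.0 + 4.0 * x, 3.0 - 2.0 * x)

# Model-tree leaves: one linear model per side of the split.
ws_left, err_left = fit_line(x[left_mask], y[left_mask])
ws_right, err_right = fit_line(x[~left_mask], y[~left_mask])
err_model = err_left + err_right

# Regression-tree leaves with the same split: one constant (the mean) per side.
err_const = float(np.sum((y[left_mask] - y[left_mask].mean()) ** 2)
                  + np.sum((y[~left_mask] - y[~left_mask].mean()) ** 2))

# A single straight line over all the data, for comparison.
_, err_single = fit_line(x, y)

print("single line      :", err_single)
print("constant leaves  :", err_const)
print("linear leaves    :", err_model)
print("predictions      :", predict([0.1, 0.8], split, ws_left, ws_right))
```

On data like this, the two linear leaves recover each segment almost exactly, while constant leaves would need many more splits to get close. That is the reason for swapping regErr for modelErr when the data looks piecewise linear.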

Thankfully, not a single formula appears in this article; I hope the explanation is still clear without the language of mathematics.

Data, ex00.txt:

0.036098 0.155096

xxx

References:

[1] Peter Harrington. Machine Learning in Action.

posted @ 2017-07-26 20:30 bonelee

