OLE DB for DM实践 —— 用数据挖掘实现交叉销售(转)
OLE DB for DM实践 —— 用数据挖掘实现交叉销售
我们在访问一些销售网站时,常常会碰上这种情况:浏览某商品信息的同时,网页上会打出促销广告,购买此商品加另一商品可以享受折扣,就像下面图片中的那样。
实现这种功能关键步骤是找到商品间的关联规则——我们会向购买了童车的客户推荐儿童玩具,而不会推荐汽车这种与童车毫不相干的商品。不过一个超市或网站上通常会销售上千种商品,通过人去识别商品相关性是不可能的,所以要使用计算机进行模式识别,计算机找到的规则有些是可预知的,但未知的规则更有意思,比较经典的是沃尔玛的尿布和啤酒故事(也有人说这个案例是编造的)。
这里我做了个购物页面,模拟一个交叉销售的过程。
页面上方有两个列表,左边的列出了可供客户选购的商品,客户选中了某个商品后加入到右边的购物篮列表,同时下方的单选列表,询问客户是否愿意再选购一个商品和当前的商品一起打包购买,打包购买可以享受折扣。
要做的工作很简单,在SQL Server中使用样本数据库AdventureWorksDW建立一个关联规则挖掘模型(这部分在SQL Server 2005联机丛书中有详细的教程),部署模型后,在网页的后台代码中通过ADOMD.NET查询关联模型,预测可能实现交叉销售的商品。查询挖掘模型需要使用Microsoft.AnalysisServices.AdomdClient命名空间,其中包含与Analysis Services对话的客户端对象AdomdConnection用于连接SSAS数据库,AdomdCommand用于执行DMX查询,查询返回AdomdDataReader对象保存查询结果,有了查询结果就知道客户可能会打包购买哪些商品。
使用AdventureWorksDW数据库中的视图vAssocSeqLineItems和vAssocSeqOrders建立挖掘模型。这两个视图,一个是订单数据,一个是订单明细行数据。应用Microsoft_Association_Rules算法,因为只要两个商品打包销售,将算法参数Maximun_Itemset_Size改成2。模型训练后就可以得到各种关联规则,例如图中第一行规则:购买了Touring Tire Tube同时可能会购买Touring Tire
设计一个页面,很简单,四个Label,两个ListBox,三个Button,一个RadioButtonList,一个SqlDataSource
页面代码如下:
2
3<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
4
5<html xmlns="http://www.w3.org/1999/xhtml" >
6<head runat="server">
7 <title>无标题页</title>
8</head>
9<body>
10 <form id="form1" runat="server">
11 <div>
12 <asp:Label ID="Label1" runat="server" Height="1px" Text="选择您想要的商品"
13 Width="200px" style="z-index: 108; left: 10px; position: absolute; top: 15px"></asp:Label>
14 <asp:Label ID="Label2" runat="server" Height="1px" Style="left: 270px;
15 position: absolute; top: 15px; z-index: 101;" Text="您已选购的商品" Width="270px"></asp:Label><br />
16 <br />
17 <asp:ListBox ID="ListBox1" runat="server" DataSourceID="AdventureWorksDW"
18 DataTextField="Model" DataValueField="Model" Height="353px" Width="200px" style="z-index: 102; left: 10px; position: absolute; top: 53px"></asp:ListBox>
19 <asp:ListBox ID="ListBox2" runat="server" Height="353px" Style="z-index: 103; left: 270px;
20 position: absolute; top: 53px" Width="270px" AutoPostBack="True"></asp:ListBox>
21 <asp:Button ID="Button1" runat="server" Font-Bold="True" Font-Size="16pt" Style="left: 218px;
22 position: absolute; top: 160px; z-index: 104;" Text=">" Width="41px" Height="41px" />
23 <asp:Button ID="Button2" runat="server" Font-Bold="True" Font-Size="16pt" Style="left: 218px;
24 position: absolute; top: 221px; z-index: 105;" Text="<" Width="41px" Height="41px" />
25 <asp:SqlDataSource ID="AdventureWorksDW" runat="server" ConnectionString="Data Source=127.0.0.1;Initial Catalog=AdventureWorksDW;Integrated Security=True"
26 ProviderName="System.Data.SqlClient"
27 SelectCommand="SELECT DISTINCT Model FROM vAssocSeqLineItems"></asp:SqlDataSource>
28
29 </div>
30
31 <asp:Panel ID="Panel1" runat="server" BorderStyle="None" BorderWidth="1px" Enabled="False"
32 Height="182px" Style="z-index: 106; left: 10px; position: absolute; top: 420px"
33 Width="380px">
34 <asp:Label ID="Label3" runat="server" Height="62px" Style="z-index: 100; left: 6px;
35 position: absolute; top: 3px" Text="您可以捆绑购买下面的一种商品,捆绑购买将享受9折优惠" Width="370px"></asp:Label>
36 <asp:RadioButtonList ID="RadioButtonList1" runat="server" Height="71px" RepeatLayout="Flow"
37 Style="z-index: 101; left: 6px; position: absolute; top: 72px" Width="370px">
38 </asp:RadioButtonList>
39 <asp:Button ID="Button3" runat="server" Style="z-index: 103; left: 6px; position: absolute;
40 top: 151px" Text="确定" Width="370px" />
41 </asp:Panel>
42 <asp:Label ID="Label4" runat="server" Height="50px" Style="z-index: 107; left: 10px;
43 position: absolute; top: 610px" Width="380px" Font-Size="12pt"></asp:Label>
44 </form>
45</body>
46</html>
47
向项目中添加Microsoft.AnalysisServices.AdomdClient引用,然后写后台代码:
2Partial Class _Default
3 Inherits System.Web.UI.Page
4
5 '客户向订单中加入一种商品
6 Protected Sub Button1_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles Button1.Click
7 Me.line += 1 '增加订单中行数
8 Dim newPackage As Package = New Package(Me.ListBox1.SelectedItem.Text)
9 Me.order.Add(newPackage, line.ToString, , ) '单据中加入一商品包
10 Me.selectedLine = Me.line '指定当前加入的行作为挖掘模型查询行
11 Me.ListBox2.Items.Add(New ListItem(newPackage.name, line.ToString)) '加入已购商品列表
12 Me.generateSelectionList() '查询挖掘模型,产生供客户选购的打包商品
13 End Sub
14
15 '客户从已购商品列表中删除一行
16 Protected Sub Button2_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles Button2.Click
17 Me.selectedLine = 0
18 Me.Panel1.Enabled = False
19 Dim removedLine As String = Me.ListBox2.SelectedValue
20 Me.ListBox2.Items.Remove(Me.ListBox2.SelectedItem)
21 Me.order(removedLine).removed = True
22 End Sub
23
24 '客户在已购列表中选择一行,如果此行的包只有一件商品,产生供客户选购的打包商品
25 Protected Sub ListBox2_SelectedIndexChanged(ByVal sender As Object, ByVal e As System.EventArgs) Handles ListBox2.SelectedIndexChanged
26 If Me.order(Me.ListBox2.SelectedValue).models.count = 1 Then
27 Me.Panel1.Enabled = True
28 Me.selectedLine = CType(Me.ListBox2.SelectedValue, Integer)
29 Me.generateSelectionList()
30 Else
31 Me.Panel1.Enabled = False
32 End If
33 End Sub
34
35 '客户从单选列表中选中一个商品,将选中商品加入包中
36 Protected Sub Button3_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles Button3.Click
37 If Me.RadioButtonList1.SelectedIndex <> -1 AndAlso Me.order(Me.selectedLine.ToString).models.count = 1 Then
38 Me.Panel1.Enabled = False
39 CType(Me.order(Me.selectedLine), Package).models.Add(Me.RadioButtonList1.SelectedItem.Text)
40 CType(Me.order(Me.selectedLine), Package).discount = 9
41 Me.ListBox2.Items.FindByValue(Me.selectedLine.ToString).Text = CType(Me.order(Me.selectedLine), Package).name
42 End If
43 End Sub
44
45 '查询挖掘模型,在单选列表控件产生供客户选择的商品
46 Private Sub generateSelectionList()
47 If Me.selectedLine <> 0 Then
48 Dim cnDataSource As String = ""
49 Dim cnPassword As String = "xxxxxx"
50 Dim cnUserId As String = "cheney"
51 Dim cnInitialCatalog As String = "test"
52 Dim cnLocation As String = "127.0.0.1"
53 Dim cnMiningModel As String = "v Assoc Seq Orders"
54 Dim cnNestedTable As String = "v Assoc Seq Line Items"
55
56 Dim cnString As String = Me.getCnString(cnDataSource, cnPassword, cnUserId, cnInitialCatalog, cnLocation)
57 Dim cmdText As String = Me.getCmdText(cnMiningModel, cnNestedTable)
58 Dim cn As AdomdConnection = New AdomdConnection(cnString)
59 Dim cmd As AdomdCommand = cn.CreateCommand
60 cmd.CommandText = cmdText
61 cmd.Parameters.Add("ProductModel", Me.order(Me.selectedLine.ToString).models(1).ToString)
62 Dim Reader As AdomdDataReader
63 cn.Open()
64 Try
65 Reader = cmd.ExecuteReader()
66 Catch ex As Exception
67 Me.Panel1.Enabled = False
68 Me.Label4.Text = ex.Message
69 Exit Sub
70 End Try
71 Reader.Read()
72 Dim nestedReader As AdomdDataReader = Reader.GetDataReader(0)
73 Me.RadioButtonList1.Items.Clear()
74 While nestedReader.Read
75 Me.RadioButtonList1.Items.Add(nestedReader(0))
76 End While
77 cn.Close()
78 Me.Label3.Text = "您已经购买了" + Me.order(Me.selectedLine.ToString).models(1) + ",您可以捆绑购买下面一种商品,且享受9折优惠"
79 Me.Panel1.Enabled = True
80 End If
81 End Sub
82
83 '获取数据挖掘连接字符串
84 Private Function getCnString(ByVal dataSource As String, ByVal password As String, ByVal userId As String, ByVal initialCatalog As String, ByVal location As String) As String
85 Dim conxtString As String = ""
86 If dataSource = "" OrElse IsNothing(dataSource) Then
87 conxtString += ""
88 Else
89 conxtString += "Data Source=" + dataSource + ";"
90 End If
91 If password = "" OrElse IsNothing(password) Then
92 conxtString += ""
93 Else
94 conxtString += "Password=" + password + ";"
95 End If
96 If userId = "" OrElse IsNothing(userId) Then
97 conxtString += ""
98 Else
99 conxtString += "User ID=" + userId + ";"
100 End If
101 If initialCatalog = "" OrElse IsNothing(initialCatalog) Then
102 conxtString += ""
103 Else
104 conxtString += "Initial Catalog=" + initialCatalog + ";"
105 End If
106 If location = "" OrElse IsNothing(location) Then
107 conxtString += ""
108 Else
109 conxtString += "Location=" + location
110 End If
111 Return conxtString
112 End Function
113
114 '获取DMX查询语句
115 Private Function getCmdText(ByVal miningModel As String, ByVal nestedTable As String) As String
116 Return "select predictassociation([" + miningModel + "].[" + nestedTable + "],exclusive,3,$adjustedprobability) from [" + miningModel + "] prediction Join (select (select @ProductModel as [Model]) as Product) as t on t.product.[Model]=[" + miningModel + "].[" + nestedTable + "].[Model]"
117 End Function
118
119 Public Sub on_load(ByVal sender As Object, ByVal e As EventArgs) Handles Me.Load
120 If Me.IsPostBack = False Then
121 Me.order = New Collection
122 Me.line = 0
123 Me.selectedLine = 0
124 End If
125 End Sub
126
127 '订单中的行数,一行代表一个商品销售包
128 Property line() As Integer
129 Get
130 Return Me.ViewState("line")
131 End Get
132 Set(ByVal value As Integer)
133 If IsNothing(Me.ViewState("line")) Then
134 Me.ViewState.Add("line", value)
135 Else
136 Me.ViewState("line") = value
137 End If
138 End Set
139 End Property
140
141 '用于查询挖掘模型的订单中的行。将查询数据挖掘模型中,在购买了此行的包中商品后,客户还可能购买哪些商品
142 Property selectedLine() As Integer
143 Get
144 Return Me.ViewState("selectedLine")
145 End Get
146 Set(ByVal value As Integer)
147 If IsNothing(Me.ViewState("selectedLine")) Then
148 Me.ViewState.Add("selectedLine", value)
149 Else
150 Me.ViewState("selectedLine") = value
151 End If
152 End Set
153 End Property
154
155 '订单中包括多行,每一行代表一个销售包,包中含有多个商品。多个商品可以打包销售,也可以单独销售,如果单独销售,包中只有一个商品
156 Property order() As Collection
157 Get
158 Return Me.ViewState("order")
159 End Get
160 Set(ByVal value As Collection)
161 If IsNothing(Me.ViewState("order")) Then
162 Me.ViewState.Add("order", value)
163 Else
164 Me.ViewState("order") = value
165 End If
166 End Set
167 End Property
168End Class
169
170'商品可以打包销售,包中有多个商品
171<Serializable()> _
172Class Package
173 Public ReadOnly Property name() As String '包的名称,格式是:“商品型号名 + 商品型号名,折扣”
174 Get
175 Dim v As String = ""
176 For i As Integer = 1 To Me.models.Count
177 If i = 1 Then
178 v += models(i)
179 Else
180 v += " + " + models(i)
181 End If
182 Next
183 If Me.discount <> 10 Then
184 v += " , " + Me.discount.ToString + "折"
185 End If
186 Return v
187 End Get
188 End Property
189 Public discount As Integer '折扣
190 Public removed As Boolean '包从已选商品列表中删除标识
191 Public models As Collection '商品型号集合
192 Public Sub New(ByVal firstModel As String)
193 Me.models = New Collection
194 Me.models.Add(firstModel)
195 Me.discount = 10
196 Me.removed = False
197 End Sub
198End Class
总结:从代码中可以看出Adomd与Ado在查询方面的使用方法区别不大,关键是对于OLE DB for DM中嵌套表及DMX的理解。我这里仅仅是简单的实践而已,实际环境中的就不可能这么简单,商务网站一般不会是在客户选择了一件商品后就打折促销,而是等客户选购完他需要的商品后,再将其他商品打折推销给客户;另外对复杂的挖掘模型查询是非常费时的,代码中还需要考虑线程异步的问题。
OLAP server: SQL Server
ETL: Power mart
展现工具: BO