机器学习 —— 概率图模型(Homework: Factors)
Talk is cheap, I show you the code
第一章的作业主要是关于PGM的因子操作。实际上,因子是整个概率图的核心。对于有向图而言,因子对应的是CPD(条件分布);对无向图而言,因子对应的是势函数。总而言之,因子是一个映射,将随机变量空间映射到实数空间。因子表现的是对变量之间关系的一种设计。每个因子都编码了一定的信息。
因子的数据结构:
1 phi = struct('var', [3 1 2], 'card', [2 2 2], 'val', ones(1, 8));
在matlab中,因子被定义为一个结构体。结构体中有三个变量,分别是 var : variable, card : cardinate, val : value.这三个变量构成了因子表。其中,var 后面是变量名 X_3,X_2,X_1. card 是每个变量所对应的取值范围。[2 2 2]表示这三个变量都是二值变量。如果是骰子,则应该写成[6 6 6]。val 后面接的是一个列向量。该向量的长度应该为 prod(card).
1、因子相乘
两个Scope相同或不同或部分不同的因子可以相乘,称为因子联合。对于有向图而言,因子相乘就是贝耶斯的链式法则。对于无向图而言,因子相乘就是合并两个相似的信息。因子相乘的原则是能够冲突,也就是只有对应项相乘(因子都是表)。
1 % FactorProduct Computes the product of two factors. 2 % C = FactorProduct(A,B) computes the product between two factors, A and B, 3 % where each factor is defined over a set of variables with given dimension. 4 % The factor data structure has the following fields: 5 % .var Vector of variables in the factor, e.g. [1 2 3] 6 % .card Vector of cardinalities corresponding to .var, e.g. [2 2 2] 7 % .val Value table of size prod(.card) 8 % 9 % See also FactorMarginalization.m, IndexToAssignment.m, and 10 % AssignmentToIndex.m 11 12 function C = FactorProduct(A, B) 13 %A = struct('var', [1], 'card', [2], 'val', [0.11, 0.89]); 14 %B = struct('var', [2, 1], 'card', [2, 2], 'val', [0.59, 0.41, 0.22, 0.78]); 15 % Check for empty factors 16 if (isempty(A.var)), C = B; return; end; 17 if (isempty(B.var)), C = A; return; end; 18 19 % Check that variables in both A and B have the same cardinality 20 [dummy iA iB] = intersect(A.var, B.var); 21 if ~isempty(dummy) 22 % A and B have at least 1 variable in common 23 assert(all(A.card(iA) == B.card(iB)), 'Dimensionality mismatch in factors'); 24 end 25 26 % Set the variables of C 27 C.var = union(A.var, B.var); 28 29 % Construct the mapping between variables in A and B and variables in C. 30 % In the code below, we have that 31 % 32 % mapA(i) = j, if and only if, A.var(i) == C.var(j) 33 % 34 % and similarly 35 % 36 % mapB(i) = j, if and only if, B.var(i) == C.var(j) 37 % 38 % For example, if A.var = [3 1 4], B.var = [4 5], and C.var = [1 3 4 5], 39 % then, mapA = [2 1 3] and mapB = [3 4]; mapA(1) = 2 because A.var(1) = 3 40 % and C.var(2) = 3, so A.var(1) == C.var(2). 41 42 [dummy, mapA] = ismember(A.var, C.var); 43 [dummy, mapB] = ismember(B.var, C.var); 44 45 % Set the cardinality of variables in C 46 C.card = zeros(1, length(C.var)); 47 C.card(mapA) = A.card; 48 C.card(mapB) = B.card; 49 50 % Initialize the factor values of C: 51 % prod(C.card) is the number of entries in C 52 C.val = zeros(1, prod(C.card)); 53 54 % Compute some helper indices 55 % These will be very useful for calculating C.val 56 % so make sure you understand what these lines are doing. 57 assignments = IndexToAssignment(1:prod(C.card), C.card); 58 indxA = AssignmentToIndex(assignments(:, mapA), A.card); 59 indxB = AssignmentToIndex(assignments(:, mapB), B.card); 60 61 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 62 % YOUR CODE HERE: 63 % Correctly populate the factor values of C 64 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 65 for i = 1:size(indxA) 66 C.val(i) = A.val(indxA(i))*B.val(indxB(i)); 67 end 68 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 69 70 end
2、变量边际化
变量边际化是概率图模型的核心操作,其目的是消除其他变量,获得单个变量的边缘分布。例如有CPD: struct('var', [3 1 2], 'card', [2 2 2], 'val', ones(1, 8)), 如果我们需要知道x_1 最本质的分布,则需要对x_2,x_3进行边际化操作。变量边际化最大的难点在于算法。比如要边际 x_2,那么应该把x_1,x_3取值相同的组合求和。此函数用了一个巧妙的方法来解决:先由var 求assignment,之后抽去assignment里需要消除的变量所对应列,在将assignment转成index.此时index里数字相同的编码就是需要求和的。
1 % FactorMarginalization Sums given variables out of a factor. 2 % B = FactorMarginalization(A,V) computes the factor with the variables 3 % in V summed out. The factor data structure has the following fields: 4 % .var Vector of variables in the factor, e.g. [1 2 3] 5 % .card Vector of cardinalities corresponding to .var, e.g. [2 2 2] 6 % .val Value table of size prod(.card) 7 % 8 % The resultant factor should have at least one variable remaining or this 9 % function will throw an error. 10 % 11 % See also FactorProduct.m, IndexToAssignment.m, and AssignmentToIndex.m 12 13 function B = FactorMarginalization(A, V) 14 % A = Joint; 15 % V = [2]; 16 % Check for empty factor or variable list 17 if (isempty(A.var) || isempty(V)), B = A; return; end; 18 19 % Construct the output factor over A.var \ V (the variables in A.var that are not in V) 20 % and mapping between variables in A and B 21 [B.var, mapB] = setdiff(A.var, V); 22 23 % Check for empty resultant factor 24 if isempty(B.var) 25 error('Error: Resultant factor has empty scope'); 26 end; 27 28 % Initialize B.card and B.val 29 B.card = A.card(mapB); 30 B.val = zeros(1, prod(B.card)); 31 32 % Compute some helper indices 33 % These will be very useful for calculating B.val 34 % so make sure you understand what these lines are doing 35 assignments = IndexToAssignment(1:length(A.val), A.card); 36 indxB = AssignmentToIndex(assignments(:, mapB), B.card); 37 38 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 39 % YOUR CODE HERE 40 % Correctly populate the factor values of B 41 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 42 43 shunxu = 1:size(indxB); 44 hunhe = [indxB,shunxu']; 45 hunhe_paixu = sortrows(hunhe,1); 46 tmp_1 = hunhe_paixu(:,1); 47 tmp_2 = hunhe_paixu(:,2); 48 k = 1; 49 for i = 1:length(tmp_1)-1 50 if tmp_1(i) == tmp_1(i+1) 51 B.val(k) = B.val(k) + A.val(tmp_2(i)); 52 else 53 B.val(k) = B.val(k) + A.val(tmp_2(i)); 54 k = k+1; 55 end 56 end 57 B.val(k) = B.val(k) + A.val(i+1); 58 59 60 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 61 end
3、变量观测
变量观测是将未知的变量置为已知。那么该变量其他不符合观测条件的取值都应该置为0. 该算法的核心思想是首先获得变量的assignment,之后选出被观测变量所占assignment的列。最后依次读取该列的值,将不符合观测结果的value置0.
1 % ObserveEvidence Modify a vector of factors given some evidence. 2 % F = ObserveEvidence(F, E) sets all entries in the vector of factors, F, 3 % that are not consistent with the evidence, E, to zero. F is a vector of 4 % factors, each a data structure with the following fields: 5 % .var Vector of variables in the factor, e.g. [1 2 3] 6 % .card Vector of cardinalities corresponding to .var, e.g. [2 2 2] 7 % .val Value table of size prod(.card) 8 % E is an N-by-2 matrix, where each row consists of a variable/value pair. 9 % Variables are in the first column and values are in the second column. 10 11 12 13 function F = ObserveEvidence(F, E) 14 15 % Iterate through all evidence 16 17 for i = 1:size(E, 1) 18 v = E(i, 1); % variable 19 x = E(i, 2); % value 20 21 % Check validity of evidence 22 if (x == 0), 23 warning(['Evidence not set for variable ', int2str(v)]); 24 continue; 25 end; 26 27 for j = 1:length(F), 28 % Does factor contain variable? 29 indx = find(F(j).var == v); 30 31 if (~isempty(indx)), 32 33 % Check validity of evidence 34 if (x > F(j).card(indx) || x < 0 ), 35 error(['Invalid evidence, X_', int2str(v), ' = ', int2str(x)]); 36 end; 37 38 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 39 % YOUR CODE HERE 40 % Adjust the factor F(j) to account for observed evidence 41 % Hint: You might find it helpful to use IndexToAssignment 42 % and SetValueOfAssignment 43 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 44 for k = 1:prod(F(j).card) 45 Assignment_ = IndexToAssignment(k,F(j).card); 46 if Assignment_(indx) ~= x 47 indx_ = AssignmentToIndex(Assignment_,F(j).card); 48 F(j).val(indx_) = 0; 49 end 50 end 51 52 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 53 54 % Check validity of evidence / resulting factor 55 if (all(F(j).val == 0)), 56 warning(['Factor ', int2str(j), ' makes variable assignment impossible']); 57 end; 58 59 end; 60 end; 61 end; 62 63 end
4、计算联合分布
计算联合分布就是将若干个变量进行相乘。联合分布的输入是因子集。将因子集中的变量依次相乘,所得最后结果则为联合分布。
1 %ComputeJointDistribution Computes the joint distribution defined by a set 2 % of given factors 3 % 4 % Joint = ComputeJointDistribution(F) computes the joint distribution 5 % defined by a set of given factors 6 % 7 % Joint is a factor that encapsulates the joint distribution given by F 8 % F is a vector of factors (struct array) containing the factors 9 % defining the distribution 10 % 11 12 13 % FACTORS.INPUT(1) = struct('var', [1], 'card', [2], 'val', [0.11, 0.89]); 14 % 15 % % FACTORS.INPUT(2) contains P(X_2 | X_1) 16 % FACTORS.INPUT(2) = struct('var', [2, 1], 'card', [2, 2], 'val', [0.59, 0.41, 0.22, 0.78]); 17 % 18 % % FACTORS.INPUT(3) contains P(X_3 | X_2) 19 % FACTORS.INPUT(3) = struct('var', [3, 2], 'card', [2, 2], 'val', [0.39, 0.61, 0.06, 0.94]); 20 % 21 % F = FACTORS.INPUT; 22 23 24 function Joint = ComputeJointDistribution(F) 25 26 % Check for empty factor list 27 if (numel(F) == 0) 28 warning('Error: empty factor list'); 29 Joint = struct('var', [], 'card', [], 'val', []); 30 return; 31 end 32 33 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 34 % YOUR CODE HERE: 35 % Compute the joint distribution defined by F 36 % You may assume that you are given legal CPDs so no input checking is required. 37 % 38 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 39 F_ = F; 40 %Joint = struct('var', [], 'card', [], 'val', []); % Returns empty factor. Change this. 41 for i = 2 : numel(F_) 42 F_(i) = FactorProduct(F_(i),F_(i-1)); 43 end 44 Joint = F_(i); 45 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 46 end
5、计算边缘分布
边缘分布是在联合分布的基础上更进一步的处理,对于给定联合分布,某些变量可能被观测到。某些变量可能需要边际掉,最后的结果就是边缘分布。边缘分布得名于其在因子表中处以边缘的位置。其关键操作在于得到联合分布后必须归一化。因为边缘分布的和总是1.
1 %ComputeMarginal Computes the marginal over a set of given variables 2 % M = ComputeMarginal(V, F, E) computes the marginal over variables V 3 % in the distribution induced by the set of factors F, given evidence E 4 % 5 % M is a factor containing the marginal over variables V 6 % V is a vector containing the variables in the marginal e.g. [1 2 3] for 7 % X_1, X_2 and X_3. 8 % F is a vector of factors (struct array) containing the factors 9 % defining the distribution 10 % E is an N-by-2 matrix, each row being a variable/value pair. 11 % Variables are in the first column and values are in the second column. 12 % If there is no evidence, pass in the empty matrix [] for E. 13 14 % % FACTORS.INPUT(1) contains P(X_1) 15 % FACTORS.INPUT(1) = struct('var', [1], 'card', [2], 'val', [0.11, 0.89]); 16 % 17 % % FACTORS.INPUT(2) contains P(X_2 | X_1) 18 % FACTORS.INPUT(2) = struct('var', [2, 1], 'card', [2, 2], 'val', [0.59, 0.41, 0.22, 0.78]); 19 % 20 % % FACTORS.INPUT(3) contains P(X_3 | X_2) 21 % FACTORS.INPUT(3) = struct('var', [3, 2], 'card', [2, 2], 'val', [0.39, 0.61, 0.06, 0.94]); 22 % 23 % V = [3]; 24 % F = FACTORS.INPUT; 25 % E = []; 26 function M = ComputeMarginal(V, F, E) 27 28 % Check for empty factor list 29 if (numel(F) == 0) 30 warning('Warning: empty factor list'); 31 M = struct('var', [], 'card', [], 'val', []); 32 return; 33 end 34 35 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 36 % YOUR CODE HERE: 37 % M should be a factor 38 % Remember to renormalize the entries of M! 39 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 40 Joint = ComputeJointDistribution(F); 41 Obser = ObserveEvidence(Joint,E); 42 To_be_Mglzed = setdiff(Joint.var,V); 43 if ~isempty(To_be_Mglzed) 44 M = FactorMarginalization(Obser,To_be_Mglzed); 45 else 46 M = Obser; 47 end 48 M.val = M.val/sum(M.val); 49 % M = struct('var', [], 'card', [], 'val', []); % Returns empty factor. Change this. 50 51 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 52 53 M = StandardizeFactors(M); 54 end
最后,所有代码请点这里