A Brief Look at Two Methods for Non-negative Matrix Factorization
1. Non-negative matrix factorisation using non-negative least squares
Problem
Given a matrix \(A\), factorise it into two non-negative factors:
\[
A_{M \times N} \approx W_{M \times K} \times H_{K \times N}, \quad \text{such that} \quad W_{M \times K} \geq 0 \quad \text{and} \quad H_{K \times N} \geq 0
\]
Solution
Our approach alternates between two steps. First, with A given, fix W and solve for H; then fix H and solve for W. Repeating this process iteratively, where each subproblem is solved by least squares, gives the method its name: alternating least squares (ALS). Our problem has the extra constraint that W and H must be non-negative, so we replace ordinary least squares with non-negative least squares (NNLS).
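The building block here is `scipy.optimize.nnls`, which solves a single problem \(\min_x \|Cx - d\|_2\) subject to \(x \geq 0\). A minimal sketch with a made-up system (the matrices below are purely illustrative):

```python
import numpy as np
from scipy.optimize import nnls

# A small made-up system: find x >= 0 minimising ||C x - d||_2
C = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])
d = np.array([2.0, 1.0, -1.0])

x, residual = nnls(C, d)
# The unconstrained least-squares solution would be [2, -1];
# NNLS instead clips the second coordinate to the boundary.
print(x, residual)  # x == [1.5, 0]; residual == sqrt(1.5)
```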
Code example
import numpy as np
from scipy.optimize import nnls
M, N = 20, 10
K = 4
np.random.seed(2019)
A_orig = np.abs(np.random.uniform(low=0.0, high=1.0, size=(M,N)))
A = A_orig.copy()
# In practice A often has missing entries, especially in collaborative-filtering problems
A[0, 0] = np.nan
A[3, 1] = np.nan
A[6, 3] = np.nan
A[3, 6] = np.nan
W = np.abs(np.random.uniform(low=0, high=1, size=(M, K)))
H = np.abs(np.random.uniform(low=0, high=1, size=(K, N)))
def cost(A, W, H):
    # Ignore the missing entries of A when computing the cost
    mask = ~np.isnan(A)
    WH = np.dot(W, H)
    WH_mask = WH[mask]  # WH_mask is a vector containing only the non-NaN positions
    A_mask = A[mask]
    A_WH_mask = A_mask - WH_mask
    return np.linalg.norm(A_WH_mask, 2)
num_iter = 1000
for i in range(num_iter):
    if i % 2 == 0:
        # Fix W, solve for H
        for j in range(N):  # note: H is solved column by column
            mask_rows = ~np.isnan(A[:, j])
            H[:, j] = nnls(W[mask_rows], A[:, j][mask_rows])[0]
    else:
        # Fix H, solve for W
        for j in range(M):  # W is solved row by row
            mask_rows = ~np.isnan(A[j, :])
            W[j, :] = nnls(H.T[mask_rows], A[j, :][mask_rows])[0]
    if i % 100 == 0:
        print(i, cost(A, W, H))
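As a self-contained sanity check of the scheme above (smaller sizes, no missing values; all names here are illustrative), note that each NNLS sweep solves its subproblem exactly, so the reconstruction error can never increase between sweeps:

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.RandomState(0)
M, N, K = 8, 6, 3
# Build a matrix that admits an exact rank-K non-negative factorisation
A = rng.rand(M, K) @ rng.rand(K, N)
W = rng.rand(M, K)
H = rng.rand(K, N)

def frob_err(A, W, H):
    return np.linalg.norm(A - W @ H)

errs = [frob_err(A, W, H)]
for _ in range(20):
    for j in range(N):        # fix W, solve H column by column
        H[:, j] = nnls(W, A[:, j])[0]
    for i in range(M):        # fix H, solve W row by row
        W[i, :] = nnls(H.T, A[i, :])[0]
    errs.append(frob_err(A, W, H))

print(errs[0], "->", errs[-1])  # the error is non-increasing across sweeps
```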
2. Using TensorFlow
https://nipunbatra.github.io/blog/2017/nnmf-tensorflow.html
This approach mainly relies on the principle of gradient descent.
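The idea is to take a plain gradient step on the squared reconstruction error and then project W and H back onto the non-negative orthant by clipping at zero, which is what the clip operations in the TensorFlow code do. A sketch of the same principle in plain NumPy (the sizes and learning rate here are illustrative choices):

```python
import numpy as np

rng = np.random.RandomState(0)
M, N, K = 10, 8, 3
A = rng.rand(M, K) @ rng.rand(K, N)   # target with an exact non-negative factorisation
W = rng.rand(M, K)
H = rng.rand(K, N)

err0 = np.linalg.norm(A - W @ H)
lr = 0.01
for _ in range(2000):
    R = W @ H - A                      # residual
    gW = R @ H.T                       # gradient of ||A - WH||_F^2 w.r.t. W (up to a factor of 2)
    gH = W.T @ R                       # gradient w.r.t. H
    W = np.maximum(W - lr * gW, 0.0)   # gradient step, then project onto W >= 0
    H = np.maximum(H - lr * gH, 0.0)

err_final = np.linalg.norm(A - W @ H)
print(err0, "->", err_final)           # the error should shrink substantially
```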
Code example
import tensorflow as tf
import numpy as np
np.random.seed(2019)
A = np.array([[np.nan, 4, 5, 2],
              [4, 4, np.nan, 3],
              [5, 5, 4, 4]], dtype=np.float32).T  # 4 users, 3 movies
# Boolean mask for computing the cost only on non-missing values
tf_mask = tf.Variable(~np.isnan(A))
shape = A.shape
A = tf.constant(A)
# latent factors
rank = 3
H = tf.Variable(np.random.randn(rank,shape[1]).astype(np.float32))
W = tf.Variable(np.random.randn(shape[0],rank).astype(np.float32))
WH = tf.matmul(W,H)
# Cost: squared Frobenius norm restricted to the observed (non-NaN) entries
cost = tf.reduce_sum(tf.pow(tf.boolean_mask(A, tf_mask)
                            - tf.boolean_mask(WH, tf_mask), 2))
learning_rate = 0.001
steps=1000
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
init = tf.global_variables_initializer()
# Clipping operation. This ensures that the learnt W and H are non-negative
clip_W = W.assign(tf.maximum(tf.zeros_like(W),W))
clip_H = H.assign(tf.maximum(tf.zeros_like(H),H))
clip = tf.group(clip_W,clip_H)
with tf.Session() as sess:
    sess.run(init)
    for i in range(steps):
        sess.run(train_step)
        sess.run(clip)
        if i % 100 == 0:
            print("Cost:", sess.run(cost))
    learnt_W = sess.run(W)
    learnt_H = sess.run(H)
posted on 2019-01-13 15:35 by Frank_Allen