A Brief Look at Two Methods for Non-negative Matrix Factorization

1. Using non-negative least squares

Non-negative matrix factorisation using non-negative least squares

Problem

Given a matrix \(A\), factor it into two non-negative factors:

\[A_{M \times N} \approx W_{M \times K} \times H_{K \times N}, \quad \text{such that } W_{M \times K} \geq 0 \text{ and } H_{K \times N} \geq 0\]
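The symbol \(\approx\) leaves open how closeness is measured. One common choice, and the one that matches the cost function in the code further down, is the Frobenius norm of the residual. The formulation below is a sketch under that assumption; the code additionally skips missing entries of \(A\), which is not reflected here.

```latex
\[
\min_{W \geq 0,\; H \geq 0} \; \lVert A - WH \rVert_F^2
= \sum_{i=1}^{M} \sum_{j=1}^{N} \Bigl( A_{ij} - \sum_{k=1}^{K} W_{ik} H_{kj} \Bigr)^2
\]
```

Minimizing the norm and minimizing its square give the same \(W\) and \(H\), so the two forms are interchangeable here.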

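Solution

Our approach alternates between two steps. First, with A given, fix W and solve for H. Then fix H and solve for W. This process is repeated, and each subproblem is solved by least squares, which is why the method is called alternating least squares (ALS). Our problem has one extra requirement, however: W and H are constrained to be non-negative, so we use non-negative least squares (NNLS) in place of ordinary least squares.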
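To see what the non-negativity constraint changes, the following minimal sketch solves a single subproblem (one column of A with W fixed) both ways. The matrix sizes and random values are made up for illustration; `scipy.optimize.nnls` returns the solution together with the residual norm.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
W = rng.uniform(size=(6, 3))   # fixed non-negative factor (hypothetical values)
a = rng.uniform(size=6)        # one column of A (hypothetical values)

# Ordinary least squares: entries of the solution may be negative
h_ls = np.linalg.lstsq(W, a, rcond=None)[0]

# Non-negative least squares: every entry of the solution is >= 0
h_nnls, residual = nnls(W, a)

print("least squares:", h_ls)
print("NNLS:         ", h_nnls, "residual norm:", residual)
```

Because NNLS searches a smaller feasible set, its residual can never be smaller than that of the unconstrained solution.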

Code example

import numpy as np
from scipy.optimize import nnls

M, N = 20, 10
K = 4

np.random.seed(2019)
A_orig = np.abs(np.random.uniform(low=0.0, high=1.0, size=(M,N)))

A = A_orig.copy()
# In practice A often has missing values, especially in collaborative filtering
A[0, 0] = np.nan
A[3, 1] = np.nan
A[6, 3] = np.nan
A[3, 6] = np.nan

W = np.abs(np.random.uniform(low=0, high=1, size=(M, K)))
H = np.abs(np.random.uniform(low=0, high=1, size=(K, N)))

def cost(A, W, H):
    # Ignore the missing entries of A when computing the cost
    mask = ~np.isnan(A)
    WH = np.dot(W, H)
    WH_mask = WH[mask] # Now WH_mask is a vector, only include the non-nan values
    A_mask = A[mask]
    A_WH_mask = A_mask-WH_mask
    return np.linalg.norm(A_WH_mask, 2)

num_iter = 1000

for i in range(num_iter):
    if i%2 ==0:
        # Fix W and solve for H
        for j in range(N): # note: H is solved one column at a time
            mask_rows = ~np.isnan(A[:,j])
            H[:,j] = nnls(W[mask_rows], A[:,j][mask_rows])[0]
    else:
        # Fix H and solve for W
        for j in range(M): # W is solved one row at a time
            mask_rows = ~np.isnan(A[j,:])
            W[j,:] = nnls(H.T[mask_rows], A[j,:][mask_rows])[0]
    if i%100 == 0:
        print(i,cost(A,W,H))
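The same procedure can be packaged as a function, which makes it easier to check two properties of the result: the factors stay non-negative, and the masked cost does not increase from sweep to sweep. This is a sketch rather than a reference implementation. The names `nnls_als`, `masked_cost`, and `num_sweeps` are introduced here for illustration, and unlike the loop above, each sweep updates H and then W.

```python
import numpy as np
from scipy.optimize import nnls

def masked_cost(A, W, H):
    """2-norm of the residual over the observed (non-NaN) entries of A."""
    mask = ~np.isnan(A)
    return np.linalg.norm(A[mask] - (W @ H)[mask])

def nnls_als(A, K, num_sweeps=50, seed=0):
    """Alternate NNLS updates of H and W, skipping NaN entries of A."""
    rng = np.random.default_rng(seed)
    M, N = A.shape
    W = rng.uniform(size=(M, K))
    H = rng.uniform(size=(K, N))
    observed = ~np.isnan(A)
    history = []
    for _ in range(num_sweeps):
        for j in range(N):   # fix W, update H column by column
            rows = observed[:, j]
            H[:, j] = nnls(W[rows], A[rows, j])[0]
        for i in range(M):   # fix H, update W row by row
            cols = observed[i, :]
            W[i, :] = nnls(H[:, cols].T, A[i, cols])[0]
        history.append(masked_cost(A, W, H))
    return W, H, history

rng = np.random.default_rng(2019)
A = rng.uniform(size=(20, 10))
A[0, 0] = A[3, 1] = A[6, 3] = A[3, 6] = np.nan   # simulate missing entries

W, H, history = nnls_als(A, K=4)
print("cost after first / last sweep:", history[0], history[-1])
print("smallest entry of W and H:", W.min(), H.min())
```

Each NNLS call minimizes the masked squared error exactly for the column or row it updates, so the cost after a sweep should not exceed the cost after the previous one, up to floating-point tolerance.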

2. Using TensorFlow

https://nipunbatra.github.io/blog/2017/nnmf-tensorflow.html

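This approach is based on gradient descent: minimize the squared error on the observed entries of A, and after every step clip any negative entries of W and H to zero.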
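The idea can be sketched without TensorFlow by writing the gradients out by hand. With R the residual WH - A restricted to observed entries, the gradient of the summed squared error is 2·R·Hᵀ with respect to W and 2·Wᵀ·R with respect to H. The function name `projected_gd_nmf` and its defaults are assumptions made for this illustration, not part of any library.

```python
import numpy as np

def projected_gd_nmf(A, rank, learning_rate=0.001, steps=1000, seed=0):
    """Gradient descent on the masked squared error, clipping W and H at zero."""
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(A)
    A_filled = np.where(observed, A, 0.0)   # placeholder zeros are masked out below
    W = np.abs(rng.standard_normal((A.shape[0], rank)))
    H = np.abs(rng.standard_normal((rank, A.shape[1])))
    costs = []
    for _ in range(steps):
        # Residual on observed entries only; missing entries contribute zero
        R = np.where(observed, W @ H - A_filled, 0.0)
        grad_W = 2.0 * R @ H.T
        grad_H = 2.0 * W.T @ R
        # Gradient step followed by projection onto the non-negative orthant
        W = np.maximum(W - learning_rate * grad_W, 0.0)
        H = np.maximum(H - learning_rate * grad_H, 0.0)
        costs.append(np.sum(np.where(observed, W @ H - A_filled, 0.0) ** 2))
    return W, H, costs

A = np.array([[np.nan, 4, 5, 2],
              [4, 4, np.nan, 3],
              [5, 5, 4, 4]], dtype=float).T   # 4 users, 3 movies

W, H, costs = projected_gd_nmf(A, rank=3)
print("cost after first / last step:", costs[0], costs[-1])
```

The clipping step is what distinguishes this from plain gradient descent; without it the factors would be free to go negative.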

Code example

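# This example uses the TensorFlow 1.x graph API (tf.Session, tf.train).
# Under TensorFlow 2.x, replace the import below with:
#   import tensorflow.compat.v1 as tf
#   tf.disable_v2_behavior()
import tensorflow as tf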
import numpy as np

np.random.seed(2019)

A = np.array([[np.nan, 4, 5, 2],
              [4, 4, np.nan, 3],
              [5, 5, 4, 4]], dtype=np.float32).T # 4 users,3 movies

# Boolean mask so that the cost is computed only on non-missing values
tf_mask = tf.Variable(~np.isnan(A))

shape = A.shape
A = tf.constant(A)

# latent factors
rank = 3

H = tf.Variable(np.random.randn(rank,shape[1]).astype(np.float32))
W = tf.Variable(np.random.randn(shape[0],rank).astype(np.float32))

WH = tf.matmul(W,H)

# Cost: squared Frobenius norm of the residual, over non-missing entries only
cost = tf.reduce_sum(tf.pow(tf.boolean_mask(A,tf_mask)\
                            - tf.boolean_mask(WH,tf_mask),2))

learning_rate = 0.001
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
init = tf.global_variables_initializer()

# Clipping operation. This ensures that the learnt W and H stay non-negative
clip_W = W.assign(tf.maximum(tf.zeros_like(W),W))
clip_H = H.assign(tf.maximum(tf.zeros_like(H),H))
clip = tf.group(clip_W,clip_H)

steps = 1000
with tf.Session() as sess:
    sess.run(init)
    for i in range(steps):
        sess.run(train_step)
        sess.run(clip)
        if i%100==0:
            print("Cost: ",sess.run(cost))
    learnt_W = sess.run(W)
    learnt_H = sess.run(H)
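Once the factors are learnt, their product gives a prediction for every entry of A, including the missing ones, which is the usual goal in collaborative filtering. The helper below is a hypothetical illustration of that last step; `fill_missing` is not part of the original post, and the small matrices stand in for A, `learnt_W`, and `learnt_H`.

```python
import numpy as np

def fill_missing(A, W, H):
    """Return a copy of A whose NaN entries are replaced by the matching entries of W @ H."""
    return np.where(np.isnan(A), W @ H, A)

# Hypothetical small example; in the post these would be A, learnt_W and learnt_H
A = np.array([[np.nan, 4.0],
              [5.0, 2.0]])
W = np.array([[1.0, 0.0],
              [0.0, 1.0]])
H = np.array([[3.0, 4.0],
              [5.0, 2.0]])

filled = fill_missing(A, W, H)
print(filled)
```

Observed entries are kept as they are, so only the gaps are filled from the factorization.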

posted on 2019-01-13 15:35 by Frank_Allen
