[Machine Learning] Collaborative Filtering

Collaborative Filtering Recommender Systems

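Solving the similarity problem

Concepts

Accuracy: \(accuracy = \frac{\text{correctly predicted samples}}{\text{all samples}}\)

Precision: \(precision = \frac{\text{correctly predicted positives}}{\text{all predicted positives}}\) [penalizes false positives]

Recall: \(recall = \frac{\text{correctly predicted positives}}{\text{all actual positives}}\) [penalizes missed positives]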
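The three definitions above can be checked on a small set of binary labels. This is a minimal sketch: the function name `classification_metrics` and the sample labels are made up for illustration, and returning 0 when a denominator is empty is an assumed convention.

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary 0/1 labels."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = np.sum(y_true & y_pred)    # predicted positive, actually positive
    fp = np.sum(~y_true & y_pred)   # predicted positive, actually negative
    fn = np.sum(y_true & ~y_pred)   # predicted negative, actually positive
    accuracy = np.mean(y_true == y_pred)
    # Assumed convention: report 0.0 when there is nothing to divide by.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return float(accuracy), float(precision), float(recall)

# Hypothetical labels: 4 actual positives, 5 predicted positives.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1, 0, 1]
accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(accuracy, precision, recall)  # 0.625 0.6 0.75
```

Here the two false positives pull precision down to 3/5, while the single missed positive leaves recall at 3/4.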

Similarity

Cosine similarity

\[Cosine = \frac{\sum^{n}_{i=1}A_i \times B_i}{\sqrt{\sum^{n}_{i=1}(A_i)^2} \times \sqrt{\sum ^{n}_{i=1}(B_i)^2}} \]

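import numpy as np
def compute_cos(a, b):
    # Cosine of the angle between a and b: dot product over the product of the L2 norms.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))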
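In a recommender setting the same formula is usually applied to every pair of users at once. The sketch below is one way to do that; the name `cosine_similarity_matrix` and the rating matrix are hypothetical, and treating an unrated item as 0 is an assumption of this example, not something the notes prescribe.

```python
import numpy as np

def cosine_similarity_matrix(ratings):
    """Pairwise cosine similarity between the rows of a ratings matrix."""
    ratings = np.asarray(ratings, dtype=float)
    norms = np.linalg.norm(ratings, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # avoid dividing by zero for all-zero rows
    unit = ratings / norms           # each row scaled to unit length
    return unit @ unit.T             # entry (u, v) is cos(row u, row v)

# Hypothetical data: 3 users x 4 items, 0 meaning "not rated".
ratings = np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
])
sim = cosine_similarity_matrix(ratings)
print(np.round(sim, 3))
```

User 0 comes out closer to user 1 than to user 2, which matches the intuition that the first two users both favor the first item.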

Pearson correlation coefficient

Subtract each vector's own mean from it, then compute the cosine similarity of the centered vectors.

\[sim(i, j) = \frac{\sum_{p\in P}(R_{i, p} - \overline{R_i})(R_{j, p} - \overline{R_j})}{\sqrt{\sum_{p\in P}(R_{i, p} - \overline{R_i})^2} \sqrt{\sum_{p\in P}(R_{j, p} - \overline{R_j})^2}} \]

def compute_sim(a, b):
    # Center each vector on its own mean, then take the cosine of the centered vectors.
    a = a - np.mean(a)
    b = b - np.mean(b)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
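A quick way to sanity-check the centering step is to compare against NumPy's built-in `np.corrcoef`. The sketch below assumes two made-up rating vectors in which one user scores every item exactly one point lower; `pearson_sim` is a hypothetical helper name, and returning 0 for a constant vector is an assumed convention.

```python
import numpy as np

def pearson_sim(a, b):
    """Pearson correlation as the cosine of the mean-centered vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_centered = a - a.mean()
    b_centered = b - b.mean()
    denom = np.linalg.norm(a_centered) * np.linalg.norm(b_centered)
    # Assumed convention: a constant vector has no defined correlation, return 0.
    return float(np.dot(a_centered, b_centered) / denom) if denom > 0 else 0.0

user_i = [5.0, 4.0, 3.0, 1.0]
user_j = [4.0, 3.0, 2.0, 0.0]   # same preferences, rated one point lower

sim_ij = pearson_sim(user_i, user_j)
reference = np.corrcoef(user_i, user_j)[0, 1]
print(sim_ij, reference)
```

Because centering removes each user's personal rating offset, the two users are perfectly correlated even though their raw scores differ.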

Cost Function

Formula

\[J = \frac{1}{2}\sum_{j=0}^{n_u-1} \sum_{i=0}^{n_m-1}r(i,j)*(\mathbf{w}^{(j)} \cdot \mathbf{x}^{(i)} + b^{(j)} - y^{(i,j)})^2 +\text{regularization} \]
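The regularization term is not spelled out in the formula. Judging from the code in the next section, which adds lambda_ times the squared entries of W and X before halving, it appears to be the following. The feature index \(k\) and the feature count \(n\) are symbols introduced here for illustration.

\[\text{regularization} = \frac{\lambda}{2}\sum_{j=0}^{n_u-1}\sum_{k=0}^{n-1}(\mathbf{w}^{(j)}_k)^2 + \frac{\lambda}{2}\sum_{i=0}^{n_m-1}\sum_{k=0}^{n-1}(\mathbf{x}^{(i)}_k)^2 \]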

Code

# GRADED FUNCTION: cofi_cost_func
# UNQ_C1
import numpy as np

def cofi_cost_func(X, W, b, Y, R, lambda_):
    """
    Returns the cost for collaborative filtering
    Args:
      X (ndarray (num_movies,num_features)): matrix of item features
      W (ndarray (num_users,num_features)) : matrix of user parameters
      b (ndarray (1, num_users))           : vector of user parameters
      Y (ndarray (num_movies,num_users))   : matrix of user ratings of movies
      R (ndarray (num_movies,num_users))   : matrix, where R(i, j) = 1 if the i-th movie was rated by the j-th user
      lambda_ (float): regularization parameter
    Returns:
      J (float) : Cost
    """
    nm, nu = Y.shape
    J = 0
    ### START CODE HERE ###
    for j in range(nu):
        w = W[j, :]
        b_j = b[0, j]
        for i in range(nm):
            x = X[i, :]
            y = Y[i, j]
            r = R[i, j]
            # Squared prediction error, counted only where a rating exists (r == 1)
            J += r * np.square(np.dot(w, x) + b_j - y)

    # Add the regularization term, then halve the total
    J += lambda_ * (np.sum(np.square(W)) + np.sum(np.square(X)))
    J /= 2
    ### END CODE HERE ###

    return J
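
The double loop above costs one Python-level iteration per (movie, user) pair. A vectorized NumPy version could look like the sketch below; the name `cofi_cost_func_vectorized` and the random test data are hypothetical, and the shapes are assumed to match the docstring of `cofi_cost_func`.

```python
import numpy as np

def cofi_cost_func_vectorized(X, W, b, Y, R, lambda_):
    """Same cost as the loop version, computed with matrix operations."""
    # (num_movies, num_users) prediction errors; b of shape (1, num_users)
    # broadcasts across rows. Multiplying by R zeroes out unrated entries.
    err = (X @ W.T + b - Y) * R
    reg = (lambda_ / 2) * (np.sum(W ** 2) + np.sum(X ** 2))
    return float(0.5 * np.sum(err ** 2) + reg)

# Made-up data purely for a shape and consistency check.
rng = np.random.default_rng(0)
num_movies, num_users, num_features = 4, 3, 2
X = rng.normal(size=(num_movies, num_features))
W = rng.normal(size=(num_users, num_features))
b = rng.normal(size=(1, num_users))
Y = rng.integers(0, 6, size=(num_movies, num_users)).astype(float)
R = rng.integers(0, 2, size=(num_movies, num_users))

J_vec = cofi_cost_func_vectorized(X, W, b, Y, R, lambda_=1.5)
print(J_vec)
```

Since R only contains 0 and 1, masking the error before squaring gives the same result as multiplying the squared error by r inside the loop.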
posted @ 2023-07-31 15:45  码农要战斗