Dimensionality Reduction Example: Principal Component Analysis

Dataset source: https://www.kaggle.com/psparks/instacart-market-basket-analysis

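Approach:

Merge the four tables (prior order items, products, orders, aisles) on their shared key columns, build a cross-tabulation of users against aisles so that each row counts how many items a user bought from each aisle, then apply PCA and keep enough components to retain 90% of the variance.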
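The merge and cross-tabulation steps can be tried on a few toy rows before running them on the full dataset. The sketch below is only an illustration: the values are made up, the column names are the ones used in the example code, and the `validate="many_to_one"` argument is an optional extra check that is not part of the original code.

```python
import pandas as pd

# Toy stand-ins for the four CSV files (made-up values, not real Instacart data).
prior = pd.DataFrame({"order_id": [1, 1, 2, 3], "product_id": [10, 11, 10, 12]})
products = pd.DataFrame({"product_id": [10, 11, 12], "aisle_id": [100, 101, 100]})
orders = pd.DataFrame({"order_id": [1, 2, 3], "user_id": [7, 7, 8]})
aisles = pd.DataFrame({"aisle_id": [100, 101], "aisle": ["fresh fruits", "yogurt"]})

# Chain the three joins; validate raises if a lookup table has duplicate keys.
merged = (
    prior.merge(products, on="product_id", validate="many_to_one")
    .merge(orders, on="order_id", validate="many_to_one")
    .merge(aisles, on="aisle_id", validate="many_to_one")
)

# One row per user, one column per aisle, each cell a purchase count.
cross = pd.crosstab(merged["user_id"], merged["aisle"])
print(cross)
```

Each row of `cross` is the feature vector for one user, which is what PCA later compresses.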

Example code:

import pandas as pd
from sklearn.decomposition import PCA

def main():
    '''
    Dimensionality reduction example: principal component analysis
    :return: None
    '''
    # Read the data
    prior = pd.read_csv("order_products__prior.csv")
    products = pd.read_csv("products.csv")
    orders = pd.read_csv("orders.csv")
    aisles = pd.read_csv("aisles.csv")
    # Merge the tables, joining each pair on its shared key column
    _mg = pd.merge(prior, products, on='product_id')
    _mg = pd.merge(_mg, orders, on='order_id')
    mt = pd.merge(_mg, aisles, on='aisle_id')
    # print(mt.head(10))
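    # Cross-tabulation: one row per user, one column per aisle, cells are purchase counts
    cross = pd.crosstab(mt['user_id'], mt['aisle'])
    # print(cross)
    # Keep as many principal components as needed to retain 90% of the variance
    pca = PCA(n_components=0.9)
    data = pca.fit_transform(cross)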
    print(data)
    print(data.shape)
    return None

if __name__ == '__main__':
    main()
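The prior-orders file is large, and the example only uses a handful of columns, so one optional way to lower memory use is to load just those columns. This is a sketch under assumptions: `load_tables` is a hypothetical helper name, and it assumes the four CSV files sit in one directory and contain the column names used in the code above.

```python
from pathlib import Path

import pandas as pd


def load_tables(data_dir="."):
    """Read only the columns that the merge and crosstab steps need.

    Hypothetical helper, not part of the original post.
    """
    data_dir = Path(data_dir)
    prior = pd.read_csv(data_dir / "order_products__prior.csv",
                        usecols=["order_id", "product_id"])
    products = pd.read_csv(data_dir / "products.csv",
                           usecols=["product_id", "aisle_id"])
    orders = pd.read_csv(data_dir / "orders.csv",
                         usecols=["order_id", "user_id"])
    aisles = pd.read_csv(data_dir / "aisles.csv",
                         usecols=["aisle_id", "aisle"])
    return prior, products, orders, aisles
```

The four returned frames can be passed through the same merge and crosstab steps as in `main()`.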

Output:

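The output shows that the data was reduced from one column per aisle to 27 dimensions (the second value of data.shape), which is the number of principal components needed to retain 90% of the variance.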
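To see how a fractional n_components determines the number of output columns, here is a minimal sketch on synthetic data (random numbers generated for illustration, not the Instacart dataset). When n_components is a float between 0 and 1, scikit-learn keeps the smallest number of components whose cumulative explained variance ratio exceeds that fraction.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic data: 200 samples, 10 features driven by 3 latent factors plus small noise.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 10))

# A float n_components asks PCA to retain that fraction of the total variance.
pca = PCA(n_components=0.9)
reduced = pca.fit_transform(X)

# Cumulative share of variance explained by the components that were kept.
cumulative = np.cumsum(pca.explained_variance_ratio_)

print(reduced.shape)       # (n_samples, number of components kept)
print(pca.n_components_)   # number of components PCA selected
print(cumulative)          # last entry is at least 0.9
```

On the real crosstab, the same attributes (`pca.n_components_` and `pca.explained_variance_ratio_`) can be inspected after `fit_transform` to confirm where the reported component count comes from.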

posted @ 2018-12-24 23:55  wydxry