Reading an HDFS file into a DataFrame

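import pyarrow.csv as pv
from pyarrow import fs

# Resolve the Hadoop file system object and the file path from the HDFS URI
hadoop_path = "hdfs://<your-hdfs-address>/<csv-file>"
hdfs, path = fs.FileSystem.from_uri(hadoop_path)

# Read the CSV file from HDFS and convert it to a DataFrame
with hdfs.open_input_stream(path) as f:
    table = pv.read_csv(f)
dataframe = table.to_pandas()
print(dataframe)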

 

posted @ 2024-01-07 16:03  myrj