2. Installing Spark and Python Practice

I. Installing Spark

1. Check the base environment: Hadoop and the JDK
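As a rough sketch of this check, the snippet below looks for the two tools on the current PATH. The helper name `locate_tools` is hypothetical, and the command names `java` and `hadoop` are assumptions about how the tools are installed; it only reports whether the commands can be found, not whether the versions are suitable.

```python
import shutil


def locate_tools(names):
    """Map each command name to its full path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


# Assumed command names; adjust them if your installation uses different ones.
for name, path in locate_tools(["java", "hadoop"]).items():
    print(f"{name}: {path or 'not found on PATH'}")
```

If a tool is reported as not found, it may still be installed but missing from PATH.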

 

2. Download Spark

 

 

3. Environment variables
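One way to confirm the variables took effect is to read them back from a fresh shell. The sketch below is only illustrative: the helper `report_env` is hypothetical, and the variable names (`JAVA_HOME`, `HADOOP_HOME`, `SPARK_HOME`, `PYTHONPATH`) are assumptions about a typical setup, not necessarily the ones configured here.

```python
import os


def report_env(names, env=None):
    """Return {name: value or None} for each variable name in the given environment."""
    env = os.environ if env is None else env
    return {name: env.get(name) for name in names}


# Assumed variable names for a typical Hadoop + Spark setup.
for name, value in report_env(
    ["JAVA_HOME", "HADOOP_HOME", "SPARK_HOME", "PYTHONPATH"]
).items():
    print(f"{name} = {value if value is not None else '(not set)'}")
```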

 

 

 

 

4. Test-run Python code

 

 

II. Python programming practice: word frequency count for English text

Prepare the text file (f1.txt):

Carter's devotion to her ancestor is about more than personal pride: it is about family honor。 For Josiah Henson has lived on through the character in American fiction that he helped inspire: Uncle Tom, the long-suffering slave in Harriet Beecher Stowe's Uncle Tom's Cabin。 Ironically, that character has come to symbolize everything Henson was not。 A racial sellout unwilling to stand up for himself? Carter gets angry at the thought。 "Josiah Henson was a man of principle," she said firmly。

Code:

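    # Path to the sample text file
    path = '/home/hadoop/cc/f1.txt'
    with open(path) as f:
        text = f.read()
    # Split on whitespace; punctuation stays attached to the words
    words = text.split()
    # Count how many times each word occurs
    cc = {}
    for word in words:
        cc[word] = cc.get(word, 0) + 1
    # Sort the (word, count) pairs by count, highest first
    cclist = list(cc.items())
    cclist.sort(key=lambda x: x[1], reverse=True)
    print(cclist)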
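Because the script above splits only on whitespace, tokens such as "Tom," and "Tom" are counted separately, and so are capitalized and lowercase forms. A possible variant, offered as a sketch and not as the method used in this exercise, lowercases the text and extracts words with a regular expression before counting with `collections.Counter`. The function name `count_words` and the inline sample string are illustrative assumptions.

```python
import re
from collections import Counter


def count_words(text):
    """Count lowercase words, ignoring surrounding punctuation."""
    # Letters with optional internal apostrophes, so "tom's" stays one token.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(tokens)


# Short inline sample instead of reading f1.txt, so the sketch runs anywhere.
sample = "Uncle Tom, the long-suffering slave in Uncle Tom's Cabin."
counts = count_words(sample)
for word, n in counts.most_common(3):
    print(word, n)
```

To apply it to the exercise file, pass the `text` read from f1.txt to `count_words`. Note that this pattern splits hyphenated words such as "long-suffering" into two tokens.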

 

Output:

 

posted @ 2022-03-04 16:18  彭翠清