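2. Installing Spark and Python Practice
I. Installing Spark
1. Check the base environment (Hadoop, JDK)
2. Download Spark
3. Configure environment variables
4. Test-run Python code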
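The checks in steps 1 and 3 can be sketched in Python. This is a minimal sketch, not the procedure used in the original post: the command names (java, hadoop, spark-submit) and the variable names (JAVA_HOME, HADOOP_HOME, SPARK_HOME) are conventional defaults assumed here for illustration, and your installation may use others.

```python
import os
import shutil


def check_environment(
    tools=("java", "hadoop", "spark-submit"),                  # assumed command names
    variables=("JAVA_HOME", "HADOOP_HOME", "SPARK_HOME"),      # assumed variable names
):
    """Return a dict mapping each tool or variable to its value, or None if missing."""
    report = {}
    for tool in tools:
        # shutil.which returns the full path of an executable on PATH, or None.
        report[tool] = shutil.which(tool)
    for var in variables:
        # os.environ.get returns None when the variable is not set.
        report[var] = os.environ.get(var)
    return report


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value if value else 'NOT FOUND'}")
```

Any entry reported as NOT FOUND suggests that the corresponding install or environment-variable step still needs attention.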
II. Python programming exercise: word frequency count for English text
Prepare the text file (f1.txt):
Carter's devotion to her ancestor is about more than personal pride: it is about family honor. For Josiah Henson has lived on through the character in American fiction that he helped inspire: Uncle Tom, the long-suffering slave in Harriet Beecher Stowe's Uncle Tom's Cabin. Ironically, that character has come to symbolize everything Henson was not. A racial sellout unwilling to stand up for himself? Carter gets angry at the thought. "Josiah Henson was a man of principle," she said firmly.
Code:
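# Read the whole text file into one string.
path = '/home/hadoop/cc/f1.txt'
with open(path) as f:
    text = f.read()

# Split on whitespace; punctuation stays attached to each word.
words = text.split()

# Count occurrences of each word.
cc = {}
for word in words:
    cc[word] = cc.get(word, 0) + 1

# Sort (word, count) pairs by count, highest first, and print them.
cclist = list(cc.items())
cclist.sort(key=lambda x: x[1], reverse=True)
print(cclist)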
Output (a list of (word, count) tuples, sorted by count in descending order):
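
Because the code above splits only on whitespace, tokens such as "Tom," and "Tom" or "Uncle" and "uncle" are counted separately. The following is a hedged variant, not part of the original exercise, that lowercases the text and strips surrounding punctuation before counting. The function name count_words and the regular expression are illustrative choices; note that this variant also splits hyphenated words such as "long-suffering" into two words.

```python
import re
from collections import Counter


def count_words(text):
    """Count words case-insensitively, ignoring surrounding punctuation."""
    # Match runs of letters, allowing internal apostrophes (e.g. "carter's").
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(words)


if __name__ == "__main__":
    sample = "Uncle Tom, the long-suffering slave in Uncle Tom's Cabin."
    # most_common(n) returns the n highest counts as (word, count) pairs.
    for word, count in count_words(sample).most_common(3):
        print(word, count)
```

To apply it to the exercise file, read f1.txt as before and pass the resulting string to count_words.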