String operations, file operations, and English word-frequency preprocessing

Assignment source: https://edu.cnblogs.com/campus/gzcc/GZCC-16SE1/homework/2684

1. String operations:

  • Parse an ID card number: birthday, sex, place of birth, etc.
# -*- coding: utf-8 -*-
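#get the region, date of birth, and sex from an ID card number
ID=input("Please enter your ID card number: ")
while len(ID)!=18:
    print("The ID card number you entered is invalid")
    ID=input("Please re-enter your ID card number: ")
year=ID[6:10]
month=ID[10:12]
day=ID[12:14]
province=ID[0:2]
#the first two digits are the province-level region code
area={'11':'Beijing','12':'Tianjin','13':'Hebei','14':'Shanxi','15':'Inner Mongolia',
      '21':'Liaoning','22':'Jilin','23':'Heilongjiang',
      '31':'Shanghai','32':'Jiangsu','33':'Zhejiang','34':'Anhui','35':'Fujian','36':'Jiangxi','37':'Shandong',
      '41':'Henan','42':'Hubei','43':'Hunan','44':'Guangdong','45':'Guangxi','46':'Hainan',
      '50':'Chongqing','51':'Sichuan','52':'Guizhou','53':'Yunnan','54':'Tibet',
      '61':'Shaanxi','62':'Gansu','63':'Qinghai','64':'Ningxia','65':'Xinjiang',
      '71':'Taiwan','81':'Hong Kong','82':'Macau'}
#area.get() returns None for an unknown code, so pass a default to avoid a TypeError
print("This ID card is registered in: "+area.get(province,"unknown region"),"   Date of birth: {}-{}-{}".format(year,month,day))
#the 17th digit encodes sex: even means female, odd means male
sex=ID[-2]
if int(sex)%2==0:
    print("Sex: female")
else:
    print("Sex: male")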

      Screenshot of the output:
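
The same parsing can also be wrapped in a function that checks more than the length. The sketch below is an illustration, not part of the assignment: `parse_id` and `check_digit` are hypothetical names, the weights and check characters assume the ISO 7064 MOD 11-2 scheme that GB 11643-1999 specifies for the 18th character, and the sample number is a test value only.

```python
from datetime import date

# Weights and check characters of the MOD 11-2 scheme (assumed per GB 11643-1999)
WEIGHTS = [7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2]
CHECK_CHARS = "10X98765432"

def check_digit(first17):
    """Return the expected 18th character for the first 17 digits."""
    total = sum(int(d) * w for d, w in zip(first17, WEIGHTS))
    return CHECK_CHARS[total % 11]

def parse_id(id_number):
    """Hypothetical helper: return region code, birthday and sex, or raise ValueError."""
    id_number = id_number.strip().upper()
    if len(id_number) != 18 or not id_number[:17].isdigit():
        raise ValueError("expected 17 digits followed by a digit or X")
    if check_digit(id_number[:17]) != id_number[17]:
        raise ValueError("check digit does not match")
    # date() raises ValueError for impossible dates such as month 13
    birthday = date(int(id_number[6:10]), int(id_number[10:12]), int(id_number[12:14]))
    # the 17th digit (index 16) is even for female, odd for male
    sex = "female" if int(id_number[16]) % 2 == 0 else "male"
    return {"province_code": id_number[:2], "birthday": birthday, "sex": sex}

if __name__ == "__main__":
    print(parse_id("11010519491231002X"))
```

Returning a dictionary instead of printing keeps the parsing reusable, and raising `ValueError` lets the caller decide how to prompt again.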

      

  • Caesar cipher encoding and decoding
def encryption():
  str_raw = input("Enter the plaintext: ")
  k = int(input("Enter the shift value: ")) % 26  #keep the shift within one alphabet
  str_list_encry = []
  for ch in str_raw.lower():
    if 'a' <= ch <= 'z':
      #shift within a-z, wrapping around past 'z'
      str_list_encry.append(chr((ord(ch) - 97 + k) % 26 + 97))
    else:
      #leave spaces, digits, and punctuation unchanged
      str_list_encry.append(ch)
  print("Encrypted result: " + "".join(str_list_encry))
def decryption():
  str_raw = input("Enter the ciphertext: ")
  k = int(input("Enter the shift value: ")) % 26
  str_list_decry = []
  for ch in str_raw.lower():
    if 'a' <= ch <= 'z':
      #shift back within a-z, wrapping around before 'a'
      str_list_decry.append(chr((ord(ch) - 97 - k) % 26 + 97))
    else:
      str_list_decry.append(ch)
  print("Decrypted result: " + "".join(str_list_decry))
while True:
  print("1. Encrypt")
  print("2. Decrypt")
  print("3. Exit")
  choice = input("Please choose: ")
  if choice == "1":
    encryption()
  elif choice == "2":
    decryption()
  elif choice == "3":
    break
  else:
    print("Invalid input!")

  Screenshot of the output:
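
Encoding and decoding are the same shift applied in opposite directions, so one function can serve both. The following is a minimal sketch under that assumption; `caesar` is a hypothetical name, and unlike the menu program above it preserves upper case rather than lowercasing the input.

```python
def caesar(text, k):
    """Shift ASCII letters by k positions, wrapping within the alphabet."""
    result = []
    for ch in text:
        if "a" <= ch <= "z":
            base = ord("a")
        elif "A" <= ch <= "Z":
            base = ord("A")
        else:
            result.append(ch)  # non-letters pass through unchanged
            continue
        # the modulo handles wrap-around and negative shifts alike
        result.append(chr((ord(ch) - base + k) % 26 + base))
    return "".join(result)

if __name__ == "__main__":
    secret = caesar("Hello, World!", 3)
    print(secret)              # Khoor, Zruog!
    print(caesar(secret, -3))  # Hello, World!
```

Decoding is simply a call with the negated shift, which is why no separate decryption branch is needed here.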

      

  • Observing URL patterns and generating URLs in batches
for i in range(3,8):
    url='http://news.gzcc.cn/html/xiaoyuanxinwen/{}.html'.format(i)
    print(url)

    Screenshot of the output:
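
If the generated addresses are needed later, for example to fetch each page, it may be more convenient to collect them in a list than to print them. A small sketch, assuming every numbered page follows the same pattern as in the loop above; `page_urls` and `BASE` are hypothetical names.

```python
# URL pattern taken from the loop above; {} is replaced by the page number
BASE = "http://news.gzcc.cn/html/xiaoyuanxinwen/{}.html"

def page_urls(first, last):
    """Return the URLs for page numbers first..last, inclusive."""
    return [BASE.format(page) for page in range(first, last + 1)]

if __name__ == "__main__":
    for url in page_urls(3, 7):
        print(url)
```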

    

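2. English word-frequency preprocessing

  • Download the lyrics of an English song, or an English article or novel.
  • Convert all uppercase letters to lowercase.
  • Replace all other separator characters (,.?!) with spaces.
  • Split the text into individual words.
  • Count how many times each word occurs.

     Code: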

#English lyrics:
str1='''I will not make the same mistakes that you did 
I will not let myself cause my heart so much misery 
I will not break the way you did 
You fell so hard 
I learned the hard way, to never let it get that far 
-
Because of you 
I never stray too far from the sidewalk 
Because of you 
I learned to play on the safe side 
So I don't get hurt 
Because of you 
I find it hard to trust 
Not only me, but everyone around me 
Because of you 
I am afraid 
-
I lose my way 
And it's not too long before you point it out 
I cannot cry 
Because I know that's weakness in your eyes 
I'm forced to fake a smile, a laugh 
Every day of my life 
My heart can't possibly break 
When it wasn't even whole to start with 
-
Because of you 
I never stray too far from the sidewalk 
Because of you 
I learned to play on the safe side 
So I don't get hurt 
Because of you 
I find it hard to trust 
Not only me, but everyone around me 
Because of you 
I am afraid 
-
I watched you die 
I heard you cry 
Every night in your sleep 
I was so young 
You should have known better than to lean on me 
You never thought of anyone else 
You just saw your pain 
And now I cry 
In the middle of the night 
Over the same damn thing 
-
Because of you 
I never stray too far from the sidewalk 
Because of you 
I learned to play on the safe side so I don't get hurt 
Because of you 
I tried my hardest just to forget everything 
Because of you 
I don't know how to let anyone else in 
Because of you 
I'm ashamed of my life because it's empty 
Because of you 
I am afraid 
-
Because of you'''
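#convert all words to lowercase
s1=str1.lower()
print(s1)
#replace the separators ,.?! with spaces
for sep in ",.?!":
    s1=s1.replace(sep," ")
print(s1)
#split the lyrics into a list of words
print("Lyrics split into a list of words:")
strList=s1.split()
print(strList)
#count how many times each word occurs
print("Number of occurrences of each word:")
strSet=set(strList)
for word in strSet:
   print(word,strList.count(word))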

  Screenshot of the output:
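
Calling `strList.count(word)` rescans the whole list for every distinct word. For longer texts a single pass with a dictionary does the same job; the sketch below shows one way to do it, with `word_counts` as a hypothetical name and a short sample string standing in for the full lyrics.

```python
def word_counts(text, separators=",.?!"):
    """Lowercase text, turn separators into spaces, and count each word."""
    text = text.lower()
    for sep in separators:
        text = text.replace(sep, " ")
    counts = {}
    for word in text.split():
        # get() returns 0 the first time a word is seen
        counts[word] = counts.get(word, 0) + 1
    return counts

if __name__ == "__main__":
    sample = "Because of you, I am afraid. Because of you!"
    counts = word_counts(sample)
    # sort by count (descending), then alphabetically for a stable listing
    for word, n in sorted(counts.items(), key=lambda item: (-item[1], item[0])):
        print(word, n)
```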

     

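3. File operations

  • Word-frequency count: download the lyrics of an English song, or an English article or novel, and save it as a UTF-8 file. Read the text from the file and process it.

Code: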

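print("Word-frequency count")
#the file is saved as UTF-8, so state the encoding explicitly;
#the with statement closes the file even if reading fails
with open("E:\\Shape of you.txt", encoding="utf8") as file:
    soy=file.read()
#replace the separators with spaces
s=",.?!"
for i in s:
    soy=soy.replace(i," ")
lyric=soy.lower().split()
print(soy)
#count each word; get() supplies 0 for a word not seen before
count={}
for i in lyric:
    count[i]=count.get(i,0)+1
print(count)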

  Screenshot of the output:
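
The standard library's `collections.Counter` can replace the hand-written counting loop and also rank the words. The following is a sketch under that substitution, not the code used for the screenshot; `top_words` is a hypothetical name, and the demo writes a small temporary file so it runs without the real lyrics file.

```python
from collections import Counter

def top_words(path, n=10, separators=",.?!"):
    """Read a UTF-8 text file and return its n most frequent words."""
    with open(path, encoding="utf8") as f:
        text = f.read().lower()
    for sep in separators:
        text = text.replace(sep, " ")
    # most_common(n) returns (word, count) pairs, highest count first
    return Counter(text.split()).most_common(n)

if __name__ == "__main__":
    import os
    import tempfile
    # write a tiny stand-in file for the demonstration
    with tempfile.NamedTemporaryFile("w", suffix=".txt", encoding="utf8",
                                     delete=False) as tmp:
        tmp.write("The club isn't the best place. The bar is the place!")
    print(top_words(tmp.name, 2))
    os.remove(tmp.name)
```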

       

 

4. Function definitions

  • Encryption function
    def get_text():
        plaincode = 'abcd'
        cipher=''
        #shift each character forward by 3 code points
        for i in plaincode:
            cipher=cipher+chr(ord(i) + 3)
        return cipher
    bigstr = get_text()
    print(bigstr)

      

  • Decryption function
    def get_text():
        plaincode = 'defg'
        cipher=''
        #shift each character back by 3 code points
        for i in plaincode:
            cipher=cipher+chr(ord(i) -3)
        return cipher
    bigstr = get_text()
    print(bigstr)

      

  • Text-reading function
    def get_text():
        #errors='ignore' skips bytes that are not valid UTF-8
        with open('yw.txt', 'r', encoding='utf8', errors='ignore') as f:
            text = f.read()
        return text
    bigstr = get_text()
    print(bigstr)
posted on 2019-03-08 17:54  3fufu