Python Study Notes: Day 17
I. The re module

import re

print(re.findall('alex', 'hahaha alex is alex is dsb'))  # ['alex', 'alex']

# \w matches one word character, \W one non-word character
print(re.findall(r'\w', 'Aah123 +-_'))
print(re.findall(r'\w\w', 'Aah123 +-_'))  # ['Aa', 'h1', '23']
print(re.findall(r'\w9\w', 'Aa9h123 aaa9c+-_'))  # ['a9h', 'a9c']
print(re.findall(r'\W', 'Aah123 +-_'))

# \s / \S: whitespace / non-whitespace; \d / \D: digit / non-digit
print(re.findall(r'\s', 'Aah\t12\n3 +-_'))
print(re.findall(r'\S', 'Aah\t12\n3 +-_'))
print(re.findall(r'\d', 'Aah\t12\n3 +-_'))
print(re.findall(r'\D', 'Aah\t12\n3 +-_'))
print(re.findall(r'\w\w\d\d', 'adaweweaafaa001safsafff0002fdfafd01 ew02'))
print(re.findall(r'\t', 'Aah\t12\n3 +-_'))
print(re.findall(r'\n', 'Aah\t12\n3 +-_'))

^: matches only at the start of the string
print(re.findall('^alex', ' alex is alex is alex'))  # [] (the string starts with a space)

$: matches only at the end of the string
print(re.findall('alex$', ' alex is alex is alex1'))  # [] (the string ends with '1')

.: matches one character, which can be any character except a newline (with re.DOTALL it also matches a newline)
print(re.findall('a.c', 'a alc aaac a c asfdsaf a\nc', re.DOTALL))  # ['alc', 'aac', 'a c', 'a\nc']

[]: matches one character, taken from a set that we define
print(re.findall('a[0-9]c', 'a,c a a1c a9c aaac a c asfdsaf a\nc', re.DOTALL))
print(re.findall('a[a-zA-Z]c', 'a,c aAc a1c a9c aaac a c asfdsaf a\nc', re.DOTALL))
# A literal '-' inside [] must come first or last, or be escaped
print(re.findall('a[+*/-]c', 'a,c a+c a-c a*c a/c aAc a1c a9c aaac a c asfdsaf a\nc', re.DOTALL))
print(re.findall(r'a[+*\-/]c', 'a,c a+c a-c a*c a/c aAc a1c a9c aaac a c asfdsaf a\nc', re.DOTALL))
# A leading ^ inside [] negates the set
print(re.findall('a[^0-9]c', 'a,c a alc a9c aaac a c asfdsaf a\nc', re.DOTALL))

Repetition

?: the single character to its left appears 0 to 1 times
print(re.findall('ab?', 'a ab abb abbbb a123b a123bbbb'))  # ['a', 'ab', 'ab', 'ab', 'a', 'a']

*: the single character to its left appears 0 to unlimited times
print(re.findall('ab*', 'a ab abb abbbb a123b a123bbbb'))  # ['a', 'ab', 'abb', 'abbbb', 'a', 'a']

+: the single character to its left appears 1 to unlimited times
print(re.findall('ab+', 'a ab abb abbbb a123b a123bbbb'))  # ['ab', 'abb', 'abbbb']

{n,m}: the single character to its left appears n to m times
print(re.findall('ab{1,3}', 'a ab abb abbbb a123b a123bbbb'))  # ['ab', 'abb', 'abbb']
# {1,} is the same as +
print(re.findall('ab{1,}', 'a ab abb abbbb a123b a123bbbb'))
print(re.findall('ab+', 'a ab abb abbbb a123b a123bbbb'))
# {0,} is the same as *
print(re.findall('ab{0,}', 'a ab abb abbbb a123b a123bbbb'))
print(re.findall('ab*', 'a ab abb abbbb a123b a123bbbb'))
# {3} means exactly 3 times
print(re.findall('ab{3}', 'a ab abb abbbb a123b a123bbbb'))

.*: matches any 0 to unlimited characters, greedy (as long as possible)
print(re.findall('a.*c', 'a123213123asdfasdfc123123123123+-0)((c123123'))

.*?: matches any 0 to unlimited characters, non-greedy (as short as possible)
print(re.findall('a.*?c', 'a123213123asdfasdfc123123123123+-0)((c123123'))

|: alternation (or)
print(re.findall('companies|company', 'Too many companies have gone bankrupt,c and the next one is my company'))

(): grouping
# With a capturing group, compan(ies|y), findall returns only the group contents;
# (?:...) groups without capturing, so the whole match is returned.
print(re.findall('compan(?:ies|y)', 'Too many companies have gone bankrupt,c and the next one is my company'))
# Here the group captures only the URL inside href="..."
print(re.findall('href="(.*?)"', '<p>动感视频</p><a href="https://www.douniwan.com/1.mp4">逗你玩呢</a><a href="https://www.xxx.com/2.mp4">葫芦娃</a>'))

Matching a literal backslash
# The regex engine needs a\\c to match the text a\c. In a normal string every
# backslash must be doubled again ('a\\\\c'); a raw string avoids that.
print(re.findall('a\\\\c', r'a\c aac'))
print(re.findall(r'a\\c', r'a\c aac'))

re.I: ignore case
print(re.findall('alex', 'my name is alex Alex is dsb aLex ALeX', re.I))

re.M: multi-line mode, so $ also matches at the end of each line
msg = """my name is egon
asdfsadfadfsadf egon
123123123123123egon"""
# msg == 'my name is egon\nasdfsadfadfsadf egon\n123123123123123egon'
print(re.findall('egon$', msg, re.M))

Other methods of the re module

# findall returns every match; with several groups, each match is a tuple
res = re.findall('(href)="(.*?)"', '<p>动感视频</p><a href="https://www.douniwan.com/1.mp4">逗你玩呢</a><a href="https://www.xxx.com/2.mp4">葫芦娃</a>')
print(res)

# search returns a match object for the first match only (or None)
res = re.search('(href)="(.*?)"', '<p>动感视频</p><a href="https://www.douniwan.com/1.mp4">逗你玩呢</a><a href="https://www.xxx.com/2.mp4">葫芦娃</a>')
print(res)
print(res.group(0))  # the whole match
print(res.group(1))  # first group
print(res.group(2))  # second group

# match only matches at the start of the string, like re.search('^abc', '123abc')
res = re.match('abc', '123abc')
print(res)  # None

print(re.findall('alex', 'alex is alex is alex'))
print(re.search('alex', 'alex is alex is alex'))
print(re.match('alex', 'alex is alex is alex'))

# compile a pattern once and reuse it
pattern = re.compile('alex')
print(pattern.findall('alex is alex is alex'))
print(pattern.search('alex is alex is alex'))
print(pattern.match('alex is alex is alex'))

# Exercise: extract every number from an arithmetic expression.
# Expected result: ['1', '2', '60', '-40.35', '5', '-40', '3']
msg = "1-2*(60+(-40.35/5)-(-40*3))"
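print(re.findall(r'\D?(-?\d+\.?\d*)', msg))
# Pattern: \D?(-?\d+\.?\d*)
# \D? consumes an optional preceding non-digit (an operator or a parenthesis);
# the group keeps only the number itself, with its optional sign and decimals.

Pattern   Description
\w        matches a letter, digit, or underscore
\W        matches any character that is not a letter, digit, or underscore
\s        matches any whitespace character; for ASCII text, equivalent to [ \t\n\r\f\v]
\S        matches any non-whitespace character
\d        matches any digit; for ASCII text, equivalent to [0-9]
\D        matches any non-digit
\A        matches the start of the string
\Z        matches the end of the string; in Python's re it matches only at the very end (in Perl-style engines it also matches before a final newline)
\z        matches the very end of the string in Perl-style engines; older versions of Python's re reject it, so use \Z
\G        matches the position where the previous match ended in Perl-style engines; not supported by Python's re
\n        matches a newline
\t        matches a tab
^         matches the start of the string
$         matches the end of the string
.         matches any character except a newline; with the re.DOTALL flag it matches any character, including a newline
[...]     a set of characters, listed individually: [amk] matches 'a', 'm', or 'k'
[^...]    any character not in the set: [^abc] matches any character other than a, b, or c
*         matches 0 or more of the preceding expression
+         matches 1 or more of the preceding expression
?         matches 0 or 1 of the preceding expression; placed after another quantifier (*?, +?, {n,m}?) it makes that quantifier non-greedy
{n}       matches exactly n of the preceding expression
{n,m}     matches n to m of the preceding expression, greedy
a|b       matches a or b
()        matches the expression inside the parentheses and also forms a group

II. The hashlib module

1. What is a hash
A hash is an algorithm that takes input data and computes a hash value from it. A hash value has three properties:
1. The same input always produces the same hash value.
2. For a given hash algorithm, the hash value has a fixed length no matter how large the input is.
3. The hash value is irreversible: the input cannot be derived back from the hash value.

2. Why use a hash
Properties 1 + 2 => file integrity checking
Property 3 => storing passwords as hash values instead of plaintext

import hashlib

# Feeding the data in pieces gives the same digest as feeding it all at once
m = hashlib.md5()
m.update('你好'.encode('utf-8'))
m.update('hello'.encode('utf-8'))
print(m.hexdigest())

m1 = hashlib.md5()
m1.update('你好hello'.encode('utf-8'))
print(m1.hexdigest())  # same digest as m
print(len(m1.hexdigest()))  # 32

# A different algorithm gives a different, but still fixed, length
m2 = hashlib.sha512()
m2.update(b'asdfassssssssssssssssssssssssssss')
print(m2.hexdigest())
print(len(m2.hexdigest()))  # 128

# File integrity check: hash the file piece by piece
with open(r'D:\脱产5期内容\day17\今日内容', mode='rb') as f:
    m = hashlib.md5()
    for line in f:
        m.update(line)
    print(m.hexdigest())

# Password hashing with a salt: fixed strings are mixed in before and after the password
pwd = input('password>>> ').strip()
m = hashlib.md5()
m.update('天王盖地虎'.encode('utf-8'))
m.update(pwd.encode('utf-8'))
m.update('一行白鹭上青天'.encode('utf-8'))
print(m.hexdigest())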
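
The two hashlib uses above (file integrity checking and salted password hashing) can be wrapped in small reusable functions. The following is only a sketch: the names salted_digest, md5_of_file, and chunk_size are hypothetical and do not come from the notes, and MD5 is used only because the notes use it. The file helper reads fixed-size chunks instead of lines, which is an assumption made so that binary files without newlines do not end up in memory all at once.

```python
import hashlib


def salted_digest(password, prefix='天王盖地虎', suffix='一行白鹭上青天'):
    """Hypothetical helper: MD5 of prefix + password + suffix, as in the notes."""
    m = hashlib.md5()
    m.update(prefix.encode('utf-8'))
    m.update(password.encode('utf-8'))
    m.update(suffix.encode('utf-8'))
    return m.hexdigest()


def md5_of_file(path, chunk_size=8192):
    """Hypothetical helper: MD5 of a file, read in fixed-size chunks."""
    m = hashlib.md5()
    with open(path, mode='rb') as f:
        # iter() with a sentinel keeps calling f.read until it returns b''
        for chunk in iter(lambda: f.read(chunk_size), b''):
            m.update(chunk)
    return m.hexdigest()


# Checking a password: hash the attempt the same way and compare the digests
stored = salted_digest('my-password')
print(stored == salted_digest('my-password'))   # same input, same digest
print(stored == salted_digest('wrong-guess'))   # different input, different digest
```

Only the digest needs to be stored; a login attempt is verified by hashing it with the same salt strings and comparing the results.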