Regular Expressions

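  • Purpose: pattern (fuzzy) matching on strings
  • In CPython the re matching engine is implemented in C, so matching is fast
  • . matches any single character except a newline
  • ^ anchors the match at the start of the string
  • $ anchors the match at the end of the string
  • * matches the preceding item 0 or more times
  • + matches the preceding item 1 or more times
  • ? matches the preceding item 0 or 1 times
  • {} sets a custom repeat count: {n} exactly n times, {m,n} from m to n times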
  • import re
    print(re.findall("alex*","464ale") )  #['ale']
    print(re.findall("alex+","464ale") )  #[]
    print(re.findall("alex?","5646alex") )#['alex']
    print(re.findall("alex{3,5}","asdasdalexxx") )#['alexxx']
    print(re.findall("alex{3,5}?","asdasdalexxxxx") )  #['alexxx']  a trailing ? makes the quantifier lazy
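The metacharacters listed above can be tried side by side. The following is a minimal sketch; the sample strings are made up for illustration and are not from the original notes.

```python
import re

# '.' matches any character except a newline, so "a\ne" is skipped.
dot = re.findall(r"a.e", "ale a\ne axe")

# '^' and '$' pin the match to the start and the end of the string.
start = re.findall(r"^al", "alex al")
end = re.findall(r"ex$", "ex alex")

# Greedy {3,5} takes as many x's as it can; lazy {3,5}? stops at the minimum.
greedy = re.findall(r"alex{3,5}", "asdasdalexxxxx")
lazy = re.findall(r"alex{3,5}?", "asdasdalexxxxx")

print(dot)     # ['ale', 'axe']
print(start)   # ['al']
print(end)     # ['ex']
print(greedy)  # ['alexxxxx']
print(lazy)    # ['alexxx']
```

The same trailing ? works on *, + and ? as well (*?, +?, ??).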
     

     

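  • [] is a character set and matches any one of the listed characters. Inside a set only ^, - and \ keep a special meaning (a leading ^ negates the set)
    print(re.findall("a[a-z]*","hiahdsih") ) #['ahdsih']
    print(re.findall(r"\([^()]*\)","1515+565*565(556*54644)") )  #['(556*54644)']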
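A negated set is a handy way to find the innermost parentheses of a nested expression. This is a small sketch with a made-up expression; it also shows when - is a range and when it is a literal character.

```python
import re

expr = "2*(3+(4-1))*(5+6)"

# [^()]* matches a run of characters that are neither '(' nor ')',
# so the pattern only fits parentheses with nothing nested inside.
innermost = re.findall(r"\([^()]*\)", expr)
print(innermost)  # ['(4-1)', '(5+6)']

# '-' between two characters is a range; placed last it is a literal '-'.
as_range = re.findall(r"[a-c]", "a-d-c")
as_literal = re.findall(r"[ac-]", "a-d-c")
print(as_range)    # ['a', 'c']
print(as_literal)  # ['a', '-', '-', 'c']
```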

     

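  • \d matches any digit 0 to 9, equivalent to [0-9] for ASCII text
  • \D matches any non-digit character
  • \s matches any whitespace character
  • \S matches any non-whitespace character
  • \w matches any word character (letter, digit or underscore)
  • \W matches any non-word character
  • print(re.findall(r"\d","sadsasdasd12") ) #['1', '2']
    print(re.findall(r"\d+","sadasd54654") )  #['54654']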
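The shorthand classes can be compared on one string. The sample text below is an assumption chosen for illustration. The last two lines show that with Python 3 str patterns \d also accepts non-ASCII decimal digits unless the re.ASCII flag is given.

```python
import re

sample = "id_42 = 3.14\tok"

digits = re.findall(r"\d+", sample)    # runs of digits
words = re.findall(r"\w+", sample)     # runs of letters, digits, underscore
spaces = re.findall(r"\s", sample)     # single whitespace characters
non_words = re.findall(r"\W", sample)  # everything that is not a word character

print(digits)     # ['42', '3', '14']
print(words)      # ['id_42', '3', '14', 'ok']
print(spaces)     # [' ', ' ', '\t']
print(non_words)  # [' ', '=', ' ', '.', '\t']

# U+0663 is ARABIC-INDIC DIGIT THREE, a Unicode decimal digit.
arabic_three = "\u0663"
unicode_digits = re.findall(r"\d", "12" + arabic_three)
ascii_digits = re.findall(r"\d", "12" + arabic_three, flags=re.ASCII)
print(len(unicode_digits), len(ascii_digits))  # 3 2
```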

     

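  • \b matches a word boundary; a backslash in front of a special character makes it match literally
    print(re.findall("i\\b","hello i am aia") ) #['i']
    print(re.findall("e\\\\",r"he\l") ) #['e\\']  without a raw string every backslash is escaped twice: once for Python, once for the regex
    print(re.findall(r"e\\",r"he\l") ) #['e\\']  a raw string only needs the regex-level escape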
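Why the backslashes have to be doubled becomes clearer when the same pattern is written with and without a raw string. A short sketch (the Windows-style path is a made-up example):

```python
import re

# In an ordinary string literal "\b" is the backspace character (one char),
# so the regex engine never sees a word-boundary escape.
lengths = (len("\b"), len(r"\b"))
print(lengths)  # (1, 2)

text = "hello i am aia"
plain = re.findall("i\b", text)  # looks for 'i' followed by a backspace
raw = re.findall(r"i\b", text)   # looks for 'i' at the end of a word
print(plain)  # []
print(raw)    # ['i']

# Matching one literal backslash: the regex needs \\, written here as a raw string.
path = r"C:\temp\log"
backslashes = re.findall(r"\\", path)
print(len(backslashes))  # 2
```

Writing every pattern as a raw string avoids this whole class of surprises.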

     

  • | is alternation: it matches the expression on its left or the one on its right
    print(re.findall(r"ka|b","hsaudhka|b") )#['ka', 'b']
    print(re.findall(r"ka|bc","hsaudhka|b") )#['ka']
    print(re.findall(r"ka|bc","hsaudhka|bc") )#['ka', 'bc']
    print(re.search(r"(?P<name>[a-z]+)\d+","alex36wusir34").group()  )  #alex36  (?P<name>...) defines a named group
    print(re.search(r"(?P<name>[a-z]+)\d+","alex36wusir34").group("name")  )  #alex
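Two points from the examples above are worth a closer look: | splits the whole pattern unless a group limits it, and named groups can be read back by name. In this sketch the second group name, age, is an illustrative addition that does not appear in the original notes.

```python
import re

# '|' has the lowest precedence: this pattern means "ka" OR "bc".
whole = re.findall(r"ka|bc", "kac kbc")
# A non-capturing group limits the alternation to a single position.
scoped = re.findall(r"k(?:a|b)c", "kac kbc")
print(whole)   # ['ka', 'bc']
print(scoped)  # ['kac', 'kbc']

# Named groups: 'age' is a hypothetical second group added for illustration.
m = re.search(r"(?P<name>[a-z]+)(?P<age>\d+)", "alex36wusir34")
print(m.group())        # alex36
print(m.group("name"))  # alex
print(m.groupdict())    # {'name': 'alex', 'age': '36'}
```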

     

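  • findall returns every match in a list
  • search returns only the first match, as a match object; call group() on it to get the matched text
  • match works like search, but only matches at the start of the string
  • print(re.split("[ab]","abc") )   #['', '', 'c']  splits on a, then splits the pieces on b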
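The difference between findall, search, match and split is easiest to see on one input. A minimal sketch with a made-up string; note that search and match return None when nothing matches, so calling group() directly on the result is only safe when a match is guaranteed.

```python
import re

text = "abc123def456"

all_numbers = re.findall(r"\d+", text)         # every match, as a list
first_number = re.search(r"\d+", text)         # first match anywhere
number_at_start = re.match(r"\d+", text)       # only tries position 0
letters_at_start = re.match(r"[a-z]+", text)

print(all_numbers)               # ['123', '456']
print(first_number.group())      # 123
print(number_at_start)           # None
print(letters_at_start.group())  # abc

# Guard against None before calling group().
missing = re.search(r"xyz", text)
print(missing is None)  # True

# split keeps empty strings where separators touch or sit at an edge.
pieces = re.split(r"\d+", text)
print(pieces)  # ['abc', 'def', '']
```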

     

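  • print(re.sub(r"\d","A","saljdl6545645",count=4) )  #saljdlAAAA645   count=4 limits the number of replacements to 4
    print(re.subn(r"\d","A","saljdl6545645") )  #('saljdlAAAAAAA', 7)   subn also returns how many replacements were made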
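The replacement limit has to be passed to re.sub itself, not to print. A short sketch of the three variants, reusing the string from the example above:

```python
import re

text = "saljdl6545645"

replaced_all = re.sub(r"\d", "A", text)             # replace every digit
replaced_four = re.sub(r"\d", "A", text, count=4)   # replace at most 4 digits
new_text, how_many = re.subn(r"\d", "A", text)      # also report the count

print(replaced_all)        # saljdlAAAAAAA
print(replaced_four)       # saljdlAAAA645
print(new_text, how_many)  # saljdlAAAAAAA 7
```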

     

  • compile
    com = re.compile(r"\d+")
    print(com.findall("sauhdy54564"))   #['54564']  a compiled pattern is convenient to reuse
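A compiled pattern pays off when the same expression is applied to many strings. A minimal sketch; the list of lines is invented for illustration.

```python
import re

number = re.compile(r"\d+")  # compile once

lines = ["sauhdy54564", "no digits", "a1b22"]
# Reuse the same pattern object for every line.
results = [number.findall(line) for line in lines]

print(results)         # [['54564'], [], ['1', '22']]
print(number.pattern)  # \d+
```

The pattern object offers the same methods as the module-level functions (findall, search, match, sub, split, finditer), without the pattern argument.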

     

  • finditer
    print(re.finditer(r"\d","sads4s5"))  #returns an iterator of match objects
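Printing the iterator only shows the object itself; to see the matches, loop over it. Each item is a match object, so the position is available as well as the text. A short sketch:

```python
import re

found = []
for m in re.finditer(r"\d", "sads4s5"):
    # group() is the matched text, span() is its (start, end) position.
    found.append((m.group(), m.span()))

print(found)  # [('4', (4, 5)), ('5', (6, 7))]
```

Because matches are produced one at a time, finditer suits large inputs where building the full findall list is wasteful.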

     

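  • Grouping
    print(re.findall(r"www\.(baidu)\.com","jiiojijwww.baidu.com") )  #['baidu']  with a capturing group, findall returns only the group
    print(re.findall(r"www\.(?:baidu)\.com","jiiojijwww.baidu.com") )#['www.baidu.com']  (?:...) groups without capturing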
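How findall treats groups depends on how many capturing groups the pattern has. A minimal sketch using the same input string; the two-group pattern is an added illustration.

```python
import re

text = "jiiojijwww.baidu.com"

one_group = re.findall(r"www\.(baidu)\.com", text)     # one group: list of that group
no_group = re.findall(r"www\.(?:baidu)\.com", text)    # no capture: whole match
two_groups = re.findall(r"(www)\.(baidu)\.com", text)  # several groups: list of tuples

print(one_group)   # ['baidu']
print(no_group)    # ['www.baidu.com']
print(two_groups)  # [('www', 'baidu')]
```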

     

posted on 2018-02-25 15:27  python_an
