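Python Web Scraping 003: A Brief Introduction to Regular Expressions
Put simply, a regular expression is a "string" that describes a pattern, which is then used to check whether another "string" matches that pattern.
Online regular expression tester: http://tool.chinaz.com/regex
Example 1: check whether a string consists entirely of lowercase letters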
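#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Variant 1: re.match only matches at the start of the string,
# so the pattern needs just a trailing '$'.
import re

if __name__ == '__main__':
    str1 = '2asdfsfwdsfsfwk'
    an = re.match(r'[a-z]+$', str1)
    print(type(an))  # the type of None here, because str1 starts with the digit '2'
    if an:
        print(u'All lowercase')
    else:
        print(u'Not all lowercase')
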
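#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Variant 2: re.search scans the whole string, so the pattern
# must be anchored at both ends with '^' and '$'.
import re

if __name__ == '__main__':
    str1 = '2asdfsfwdsfsfwk'
    an = re.search(r'^[a-z]+$', str1)
    print(type(an))  # the type of None here, because str1 starts with the digit '2'
    if an:
        print(u'All lowercase')
    else:
        print(u'Not all lowercase')
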
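#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Variant 3: compile the pattern once, then call search on the compiled object.
import re

if __name__ == '__main__':
    str1 = 'asdfsfwdsfsfwk'
    regex = re.compile(r'^[a-z]+$')
    an = regex.search(str1)
    print(type(an))  # a match object here, because str1 is all lowercase
    if an:
        print(u'All lowercase')
    else:
        print(u'Not all lowercase')
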
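One caveat with the three variants above: `$` also matches just before a trailing newline, so 'abc\n' would pass the check. Below is a minimal sketch of a stricter check, assuming Python 3.4 or later, where `re.fullmatch` is available. The helper name `is_all_lowercase` is hypothetical and does not appear in the original scripts.

```python
import re

# Compile once and reuse, as in the third variant above.
LOWERCASE_RE = re.compile(r'[a-z]+')


def is_all_lowercase(text):
    """Return True if text is non-empty and consists only of a-z."""
    # fullmatch requires the whole string to match, so no ^ or $ is needed
    # and a trailing newline is not silently accepted.
    return LOWERCASE_RE.fullmatch(text) is not None


if __name__ == '__main__':
    print(is_all_lowercase('asdfsfwdsfsfwk'))   # True
    print(is_all_lowercase('2asdfsfwdsfsfwk'))  # False
    print(is_all_lowercase('abc\n'))            # False
```

If the scripts must also run on Python 2, replacing `$` with `\Z` in the original patterns has a similar effect, since `\Z` matches only at the very end of the string.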
Example 2: extract mobile phone numbers from a string
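#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re

if __name__ == '__main__':
    # Sample text mixing Chinese characters, an e-mail address, and digit runs.
    str1 = '从字符串中15011891096abc@qq.com提取1368678804手机13710819640号码'
    # Commas inside [...] are literal characters, so the original classes
    # [0,2,5-9] and [^4,\D] are written here as [025-9] and [0-35-9].
    # Assumption: the bare "14" alternative was missing its third digit
    # (it would match only 10 digits), so 14[0-9] is used here.
    regex_phone = re.compile(r'(?:13[0-9]|14[0-9]|15[0-35-9]|18[025-9])\d{8}')
    # Alternative with one capturing group and without the 14x prefix:
    # regex_phone = re.compile(r'((?:13[0-9]|15[0-35-9]|18[025-9])\d{8})')
    # The 10-digit run 1368678804 is too short and is skipped.
    print(regex_phone.findall(str1))  # ['15011891096', '13710819640']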
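
The pattern above will also match the first 11 digits of a longer digit run, such as an order number. Below is a minimal sketch that rejects such partial matches by adding the lookarounds `(?<!\d)` and `(?!\d)`. The function name `extract_phones` and the prefix list are assumptions for illustration (14[0-9] is assumed where the original has a bare 14); this is not an authoritative list of valid prefixes.

```python
import re

PHONE_RE = re.compile(
    r'(?<!\d)'                                   # not preceded by a digit
    r'(?:13[0-9]|14[0-9]|15[0-35-9]|18[025-9])'  # assumed 3-digit prefixes
    r'\d{8}'                                     # remaining 8 digits
    r'(?!\d)'                                    # not followed by a digit
)


def extract_phones(text):
    """Return all standalone 11-digit numbers with an accepted prefix."""
    return PHONE_RE.findall(text)


if __name__ == '__main__':
    sample = '从字符串中15011891096abc@qq.com提取1368678804手机13710819640号码'
    print(extract_phones(sample))                 # ['15011891096', '13710819640']
    print(extract_phones('id=9915011891096123'))  # []
```

Because the group is non-capturing (`(?:...)`), `findall` returns the full matched numbers rather than only the prefix.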