0821 1336 Importing modules and packages, and an introduction to commonly used modules

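1. Definition of a module

Module: a way to organize Python code (variables, functions, classes) logically in order to implement one or more pieces of functionality. In essence a module is a Python file ending in .py (the file test.py corresponds to the module name test).

Package: in essence a directory. A regular package is a directory that contains an __init__.py file. Since Python 3.3 a directory without __init__.py can still be imported as a namespace package, but keeping __init__.py is the conventional way to mark a directory as a package.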

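2. Import methods

2.1 import module_name or import module_name, module2_name

This imports the module module_name. In essence, importing runs the contents of module_name once and binds the resulting module object, which holds all of its variables, functions and so on, to the name module_name. Other modules then access a variable as module_name.name and call a function as module_name.func1(). In other words, with this form of import every access must be prefixed with the module name.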

2.2 from module_name import *

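This imports everything public from module_name (every name without a leading underscore, or the names listed in __all__ if the module defines it). The contents are not bound to the name module_name; they are copied directly into the current module's namespace, so the current module can use the variables and functions of module_name directly. For example, if module_name defines a variable name, the current module can simply write newname = name.

This form has a problem. If module_name has a log() function and the current module also defines log(), the two names collide and whichever binding is made later wins: a log() defined after the import silently replaces the imported one, and an import placed after the definition replaces the local one. For that reason this form is not recommended.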

2.3 from module_name import m1,m2,m3

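This imports m1, m2 and m3 from module_name into the current module, which can then call them directly, as m1() or m3(), without the module name.

2.4 Name clashes between the current module and the imported module

If module_name has a log() function, the current module also has a log() function, and you need the one from module_name as well, import it under an alias: from module_name import log as waibu_log. The current module can then call waibu_log() to reach log() from module_name, and log() to reach its own function.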
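The import forms from 2.1, 2.3 and 2.4 can be tried side by side. This is a small sketch that uses the standard math module in place of module_name; the local log() function and the alias math_log are made up for the illustration.

```python
import math                          # 2.1: access through the module name
from math import sqrt, floor         # 2.3: names are bound directly
from math import log as math_log     # 2.4: alias avoids the clash below


def log(message):
    """A local function that shares its name with math.log."""
    return "LOG: " + message


root_via_module = math.sqrt(16)      # needs the module prefix
root_direct = sqrt(16)               # no prefix needed
local_result = log("started")        # the local log()
imported_result = math_log(math.e)   # math.log under its alias

print(root_via_module, root_direct, floor(2.7))
print(local_result, imported_result)
```

Both log functions stay reachable because each one has its own name in the current namespace.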

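3. What import really does (path search and the search path)

Importing a module in essence means running the Python file once.

Importing a package in essence means running the __init__.py file in that package.

The module or package to be imported must be found under one of the directories in sys.path. If it is not, add the directory by hand with sys.path.append('new path'). Note that append puts the directory at the end of the list: if an earlier directory contains a module with the same name, the search never reaches the last entry and the earlier module is imported instead, which causes trouble. In that case use sys.path.insert(0, 'new path') to put the new directory at the head of the list.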
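The difference between append and insert(0, ...) can be seen with two throwaway directories. This is only a sketch: the module name demo_mod and the temporary directories are assumptions made for the demonstration.

```python
import importlib
import os
import sys
import tempfile

# Two hypothetical directories that both contain a module named demo_mod.
old_dir = tempfile.mkdtemp()
new_dir = tempfile.mkdtemp()
for directory, label in ((old_dir, "old"), (new_dir, "new")):
    with open(os.path.join(directory, "demo_mod.py"), "w") as f:
        f.write("WHICH = %r\n" % label)

sys.path.insert(0, old_dir)   # old_dir is searched first
sys.path.append(new_dir)      # appended last, so it is shadowed by old_dir
importlib.invalidate_caches()
demo_mod = importlib.import_module("demo_mod")
first_choice = demo_mod.WHICH

sys.path.insert(0, new_dir)   # now new_dir is at the head of the list
del sys.modules["demo_mod"]   # forget the cached module so the search runs again
importlib.invalidate_caches()
demo_mod = importlib.import_module("demo_mod")
second_choice = demo_mod.WHICH

print(first_choice, second_choice)
```

The first import finds the copy in old_dir even though new_dir is on the path; only after insert(0, new_dir) does the new copy win.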

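4. Importing a module from another package

Suppose main.py in packageA wants to call li.py in packageB. Writing import packageB in main.py and then calling packageB.li.test() raises an error. As explained above, import packageB only runs the __init__.py of packageB and has nothing to do with li.py. Since importing a package runs __init__.py, we can write from . import li in the __init__.py of packageB, and the call then works. A plain import li inside __init__.py does not work in Python 3, because it is an absolute import: it looks for a top-level module named li on sys.path rather than inside packageB. Writing import packageB.li in main.py is another option, since it loads the submodule explicitly.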
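Here is a rough, self-contained way to check the from . import li approach. The package name demo_pkg and the temporary directory are assumptions for the sketch; li and test() follow the names used above.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk; demo_pkg is an illustrative name.
base_dir = tempfile.mkdtemp()
pkg_dir = os.path.join(base_dir, "demo_pkg")
os.mkdir(pkg_dir)

with open(os.path.join(pkg_dir, "li.py"), "w") as f:
    f.write("def test():\n    return 'hello from li'\n")

# __init__.py pulls the submodule in with a relative import.
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("from . import li\n")

sys.path.insert(0, base_dir)          # make the parent directory importable
importlib.invalidate_caches()
demo_pkg = importlib.import_module("demo_pkg")

result = demo_pkg.li.test()           # works because __init__.py imported li
print(result)
```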

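5. Optimizing imports

When a function is called very often, from module_name import test can be slightly faster than import module_name. For example:

With import module_name

Every call is written module_name.test(), so every call first looks up the attribute test on the module object. The module itself is loaded only once (it is cached in sys.modules); only the repeated attribute lookup costs a little time.

With from module_name import test

The function test is bound to a name in the current module, so test() is called directly without the extra lookup. The gain is small and usually matters only in hot loops.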
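A quick way to check both points is to compare the objects and time the two call styles. This sketch uses math.sqrt as a stand-in for module_name.test; the timings vary from machine to machine, so no particular numbers should be expected.

```python
import sys
import timeit

import math
from math import sqrt

# Both names refer to the very same function object; nothing is reloaded.
same_function = sqrt is math.sqrt
cached_once = sys.modules["math"] is math

# Rough timing of the two call styles; absolute numbers vary by machine.
t_attr = timeit.timeit("math.sqrt(2.0)", setup="import math", number=100000)
t_name = timeit.timeit("sqrt(2.0)", setup="from math import sqrt", number=100000)

print(same_function, cached_once)
print("module.attr: %.4fs  bare name: %.4fs" % (t_attr, t_name))
```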

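6. Kinds of modules

Built-in modules (the standard library)
Open-source (third-party) modules
Custom modules

7. A closer look at built-in modules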

7.1 time

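In Python, time is usually represented in one of these ways: 1) a timestamp, 2) a formatted time string, 3) a tuple (struct_time) with nine elements. Because the time module is mostly implemented by calling the C library, behavior may differ between platforms.

UTC (Coordinated Universal Time) is the world's standard time; for everyday purposes it corresponds to Greenwich Mean Time. China is at UTC+8. DST (Daylight Saving Time) is summer time.

Timestamp: generally, a timestamp is the offset in seconds from 00:00:00 UTC on January 1, 1970. Running type(time.time()) shows that the value is a float. The main function that returns a timestamp is time(). Older material also lists clock(), but time.clock() was removed in Python 3.8; use time.perf_counter() or time.process_time() for measuring intervals.

Tuple (struct_time): a struct_time has 9 elements. The main functions that return a struct_time are gmtime(), localtime() and strptime().

Examples with time and datetime: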

import time, datetime

print(time.time())    # seconds elapsed since 1970
print(time.localtime(1471774284))    # timestamp -> struct_time in the local time zone (very large values such as 423432423432 can fail on some platforms)
print(time.gmtime())     # current time -> struct_time in UTC
print(time.mktime(time.localtime()))    # struct_time -> timestamp
print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))    # struct_time -> formatted string
print(time.strptime('2016-08-21 18:11:24', "%Y-%m-%d %H:%M:%S"))    # formatted string -> struct_time
print('datetime'.center(40, '-'))
print(datetime.datetime.now())    # current time
print(datetime.datetime.now() + datetime.timedelta(3))  # current time + 3 days
print(datetime.datetime.now() + datetime.timedelta(-3))  # current time - 3 days
print(datetime.datetime.now() + datetime.timedelta(hours=3))  # current time + 3 hours
print(datetime.datetime.now() + datetime.timedelta(minutes=30))  # current time + 30 minutes

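Format reference:

%a    Locale's abbreviated weekday name
%A    Locale's full weekday name
%b    Locale's abbreviated month name
%B    Locale's full month name
%c    Locale's appropriate date and time representation
%d    Day of the month (01 - 31)
%H    Hour (24-hour clock, 00 - 23)
%I    Hour (12-hour clock, 01 - 12)
%j    Day of the year (001 - 366)
%m    Month (01 - 12)
%M    Minute (00 - 59)
%p    Locale's equivalent of AM or PM
%S    Second (00 - 61)
%U    Week number of the year (00 - 53), with Sunday as the first day of the week. All days before the first Sunday fall in week 0.
%w    Day of the week (0 - 6, 0 is Sunday)
%W    Basically the same as %U, except that Monday is the first day of the week.
%x    Locale's appropriate date representation
%X    Locale's appropriate time representation
%y    Year without century (00 - 99)
%Y    Year with century
%Z    Time zone name (empty string if no time zone exists)
%%    A literal '%' character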

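Conversions between the time representations: localtime() and gmtime() turn a timestamp into a struct_time, mktime() turns a struct_time back into a timestamp, strftime() turns a struct_time into a formatted string, and strptime() turns a formatted string into a struct_time.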
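A full round trip through the three representations might look like the sketch below. It works in UTC so that the result does not depend on the local time zone, and for that reason uses calendar.timegm() (the UTC counterpart of mktime()); the timestamp value is just an example.

```python
import calendar
import time

FMT = "%Y-%m-%d %H:%M:%S"

stamp = 1471774284                         # an arbitrary example timestamp
as_struct = time.gmtime(stamp)             # timestamp -> struct_time (UTC)
as_text = time.strftime(FMT, as_struct)    # struct_time -> string
back_struct = time.strptime(as_text, FMT)  # string -> struct_time
back_stamp = calendar.timegm(back_struct)  # struct_time (UTC) -> timestamp

print(as_text)
print(back_stamp == stamp)
```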

7.2 random

Practical usage

#!/usr/bin/env python
# encoding: utf-8
import random

# Random integer (both endpoints included):
print(random.randint(0, 99))  # e.g. 70

# Random even number between 0 and 100:
print(random.randrange(0, 101, 2))  # e.g. 4

# Random floats:
print(random.random())  # e.g. 0.2746445568079129
print(random.uniform(1, 10))  # e.g. 9.887001463194844

# Random character:
print(random.choice('abcdefg&#%^*f'))  # choice accepts any sequence: a string, list or tuple

# Pick a given number of characters from several:
print(random.sample('abcdefghij', 3))  # e.g. ['f', 'h', 'd']

# Random string from a list:
print(random.choice(['apple', 'pear', 'peach', 'orange', 'lemon']))  # e.g. apple

# Shuffle
items = [1, 2, 3, 4, 5, 6, 7]
print(items)  # [1, 2, 3, 4, 5, 6, 7]
random.shuffle(items)
print(items)  # e.g. [1, 4, 7, 2, 5, 3, 6]

Generating a verification code:

import random
checkcode = ''
for i in range(4):
    current = random.randrange(0, 4)
    if current != i:
        temp = chr(random.randint(65, 90))    # an uppercase letter
    else:
        temp = random.randint(0, 9)    # a digit
    checkcode += str(temp)
print(checkcode)
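The same idea can be written more compactly by drawing from one alphabet. This is an alternative sketch, not the code above: make_code and ALPHABET are names invented here, and every character is picked uniformly from uppercase letters and digits.

```python
import random
import string

ALPHABET = string.ascii_uppercase + string.digits


def make_code(length=4, rng=random):
    """Return a code of `length` characters drawn uniformly from ALPHABET."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


code = make_code()
print(code)
```

The random module is fine for display codes like this one; for anything security-sensitive the standard secrets module is the usual choice.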

7.3 os

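os.getcwd()  # get the current working directory, i.e. the directory the Python script is working in
os.chdir("dirname")  # change the current working directory, like cd in a shell; write os.chdir("c:\\user\\zsc") or os.chdir(r"c:\user\zsc"). In the first form the first backslash of each pair is the escape character; the second form is recommended. Other functions that take directory paths need the same escaping or the r prefix
os.curdir  # the string for the current directory: ('.')
os.pardir  # the string for the parent of the current directory: ('..')
os.makedirs('dirname1/dirname2')    # create nested directories recursively, like mkdir -p /a/b/c/d
os.removedirs('dirname1')    # delete the directory if it is empty, then move up to the parent and delete it too if it is empty, and so on
os.mkdir('dirname')   # create a single directory, like mkdir dirname in a shell
os.rmdir('dirname')    # delete a single empty directory; raises an error if it is not empty; like rmdir dirname in a shell
os.listdir('dirname')   # list all files and subdirectories directly inside the directory (not recursive), hidden files included, and return them as a list
os.remove()  # delete a file; pass the file path
os.rename("oldname","newname")  # rename a file or directory
os.stat('path/filename')  # get information about a file or directory
os.sep   # the path separator of the operating system: "\\" on Windows, "/" on Linux
os.linesep    # the line terminator of the current platform: "\r\n" on Windows, "\n" on Linux
os.pathsep   # the string that separates entries in a list of paths: ";" on Windows, ":" on Linux
os.name    # a string naming the current platform: Windows -> 'nt'; Linux -> 'posix'
os.system("bash command")  # run a shell command; its output is shown directly
os.environ  # the system environment variables
os.path.abspath(path)  # return the normalized absolute version of path
os.path.split(path)  # split path into a two-element tuple (directory, file name)
os.path.dirname(path)  # return the directory of path, i.e. the first element of os.path.split(path)
os.path.basename(path)  # return the final file name of path; if path ends with / or \ an empty string is returned; i.e. the second element of os.path.split(path)
os.path.exists(path)  # True if path exists, False otherwise
os.path.isabs(path)  # True if path is an absolute path
os.path.isfile(path)  # True if path is an existing file, False otherwise
os.path.isdir(path)  # True if path is an existing directory, False otherwise
os.path.join(path1[, path2[, ...]])  # join several path components; components before the last absolute path are discarded
os.path.getatime(path)  # return the last access time of the file or directory
os.path.getmtime(path)  # return the last modification time of the file or directory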
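Several of these calls can be exercised together inside a scratch directory. The sketch below assumes nothing beyond the standard library; the directory and file names are made up.

```python
import os
import tempfile

base = tempfile.mkdtemp()                      # scratch directory for the demo
nested = os.path.join(base, "dirname1", "dirname2")
os.makedirs(nested)                            # like mkdir -p

file_path = os.path.join(nested, "note.txt")
with open(file_path, "w") as f:
    f.write("hello")

head, tail = os.path.split(file_path)          # (directory, file name)
listing = os.listdir(nested)                   # only the direct children
is_file = os.path.isfile(file_path)

os.remove(file_path)
os.removedirs(nested)                          # removes dirname2, then each empty parent in turn
still_there = os.path.exists(os.path.join(base, "dirname1"))

print(tail, listing, is_file, still_there)
```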

7.4 sys

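sys.argv           # list of command-line arguments; the first element is the path of the program itself
sys.exit(n)        # exit the program; use exit(0) for a normal exit
sys.version        # version information of the Python interpreter
sys.maxsize        # largest value of a native integer index (sys.maxint existed only in Python 2; Python 3 ints are unbounded)
sys.path           # the module search path, initialized from the script directory, the PYTHONPATH environment variable and installation defaults
sys.platform       # name of the operating system platform
sys.stdout.write('please:')
val = sys.stdin.readline()[:-1]    # read one line from standard input and drop the trailing newline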

7.5 shutil

7.5.1 shutil.copyfileobj(fsrc, fdst[, length])

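Copies the contents of one file object to another. length is the buffer size used for each read; copying starts at the current position of fsrc, so seeking first lets you copy only part of a file. The library source (Python 2 era) is roughly:

def copyfileobj(fsrc, fdst, length=16*1024):
    """copy data from file-like object fsrc to file-like object fdst"""
    while 1:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)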

7.5.2 shutil.copyfile(src, dst)

Copies a file: src is copied to dst, much like the cp command.

def copyfile(src, dst):
    """Copy data from src to dst"""
    if _samefile(src, dst):
        raise Error("`%s` and `%s` are the same file" % (src, dst))

    for fn in [src, dst]:
        try:
            st = os.stat(fn)
        except OSError:
            # File most likely does not exist
            pass
        else:
            # XXX What about other special files? (sockets, devices...)
            if stat.S_ISFIFO(st.st_mode):
                raise SpecialFileError("`%s` is a named pipe" % fn)

    with open(src, 'rb') as fsrc:
        with open(dst, 'wb') as fdst:
            copyfileobj(fsrc, fdst)

7.5.3 shutil.copymode(src, dst)

Copies only the permission bits. The content, group and owner are unchanged.

def copymode(src, dst):
    """Copy mode bits from src to dst"""
    if hasattr(os, 'chmod'):
        st = os.stat(src)
        mode = stat.S_IMODE(st.st_mode)
        os.chmod(dst, mode)

7.5.4 shutil.copystat(src, dst)

Copies the stat information: mode bits, atime, mtime, flags.

def copystat(src, dst):
    """Copy all stat info (mode bits, atime, mtime, flags) from src to dst"""
    st = os.stat(src)
    mode = stat.S_IMODE(st.st_mode)
    if hasattr(os, 'utime'):
        os.utime(dst, (st.st_atime, st.st_mtime))
    if hasattr(os, 'chmod'):
        os.chmod(dst, mode)
    if hasattr(os, 'chflags') and hasattr(st, 'st_flags'):
        try:
            os.chflags(dst, st.st_flags)
        except OSError as why:
            for err in 'EOPNOTSUPP', 'ENOTSUP':
                if hasattr(errno, err) and why.errno == getattr(errno, err):
                    break
            else:
                raise

7.5.5 shutil.copy(src, dst)

Copies the file and its permission bits.

def copy(src, dst):
    """Copy data and mode bits ("cp src dst").

    The destination may be a directory.

    """
    if os.path.isdir(dst):
        dst = os.path.join(dst, os.path.basename(src))
    copyfile(src, dst)
    copymode(src, dst)

7.5.6 shutil.copy2(src, dst)

Copies the file and its stat information.

def copy2(src, dst):
    """Copy data and all stat info ("cp -p src dst").

    The destination may be a directory.

    """
    if os.path.isdir(dst):
        dst = os.path.join(dst, os.path.basename(src))
    copyfile(src, dst)
    copystat(src, dst)

7.5.7 shutil.copytree(src, dst, symlinks=False, ignore=None)

Recursively copies a directory tree.

7.5.8 shutil.rmtree(path[, ignore_errors[, onerror]])
Recursively deletes a directory tree.

7.5.9 shutil.move(src, dst)
Recursively moves a file or directory.
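The tree-level functions can be tried safely inside a temporary directory. This is a small sketch with invented file and directory names; it runs on Python 3, where shutil.copy2 returns the destination path.

```python
import os
import shutil
import tempfile

work = tempfile.mkdtemp()
src_dir = os.path.join(work, "src")
os.makedirs(os.path.join(src_dir, "sub"))
with open(os.path.join(src_dir, "sub", "a.txt"), "w") as f:
    f.write("data")

copy_dir = os.path.join(work, "copy")
shutil.copytree(src_dir, copy_dir)                  # the destination must not exist yet
copied = os.path.exists(os.path.join(copy_dir, "sub", "a.txt"))

# copy2 copies one file plus its metadata; the destination may be a directory.
single = shutil.copy2(os.path.join(src_dir, "sub", "a.txt"), work)

moved_dir = os.path.join(work, "moved")
shutil.move(copy_dir, moved_dir)                    # moves the whole tree
moved = os.path.isdir(moved_dir) and not os.path.exists(copy_dir)

shutil.rmtree(work)                                 # deletes non-empty directories too
cleaned = not os.path.exists(work)

print(copied, moved, cleaned)
```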

7.5.10 shutil.make_archive(base_name, format,...)

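Creates an archive (for example zip or tar) and returns the path of the archive file.

base_name: the name of the archive file without the format-specific extension; it may include a path. With only a file name the archive is saved in the current directory, otherwise in the given path.
e.g. www => saved in the current directory
e.g. /Users/wupeiqi/www => saved in /Users/wupeiqi/
format: the archive type: "zip", "tar", "bztar" or "gztar"
root_dir: the directory to archive (defaults to the current directory)
owner: the user, defaults to the current user
group: the group, defaults to the current group
logger: used for logging, usually a logging.Logger object

# Archive the files under /Users/wupeiqi/Downloads/test and put the archive in the current working directory

import shutil
ret = shutil.make_archive("wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')


# Archive the files under /Users/wupeiqi/Downloads/test and put the archive in /Users/wupeiqi/
import shutil
ret = shutil.make_archive("/Users/wupeiqi/wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')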
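For a version that runs anywhere, the paths can be replaced by a temporary directory. The sketch below also unpacks the archive again with shutil.unpack_archive; the names site, backup and restored are placeholders.

```python
import os
import shutil
import tempfile

work = tempfile.mkdtemp()
root_dir = os.path.join(work, "site")
os.mkdir(root_dir)
with open(os.path.join(root_dir, "index.html"), "w") as f:
    f.write("<h1>hi</h1>")

# base_name carries no extension; make_archive adds ".tar.gz" for gztar.
archive_path = shutil.make_archive(os.path.join(work, "backup"), "gztar", root_dir=root_dir)

restore_dir = os.path.join(work, "restored")
shutil.unpack_archive(archive_path, restore_dir)   # the format is inferred from the extension
restored = sorted(os.listdir(restore_dir))

print(os.path.basename(archive_path), restored)
shutil.rmtree(work)
```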

7.5.11 shutil handles archives by calling the zipfile and tarfile modules (the ZipFile and TarFile classes). In detail:

import zipfile

# Compress
z = zipfile.ZipFile('laxi.zip', 'w')
z.write('a.log')
z.write('data.data')
z.close()

# Extract
z = zipfile.ZipFile('laxi.zip', 'r')
z.extractall()
z.close()

# To archive several individual files, zipfile.ZipFile() is the most flexible choice.

import tarfile

# Pack
tar = tarfile.open('your.tar','w')
tar.add('/Users/wupeiqi/PycharmProjects/bbs2.zip', arcname='bbs2.zip')
tar.add('/Users/wupeiqi/PycharmProjects/cmdb.zip', arcname='cmdb.zip')
tar.close()

# Unpack
tar = tarfile.open('your.tar','r')
tar.extractall()  # the target directory can be given with path=
tar.close()
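Both modules also work as context managers, which closes the archive even when an error occurs. The following sketch uses a temporary directory and made-up file names, and adds gzip compression to the tar example.

```python
import os
import tarfile
import tempfile
import zipfile

work = tempfile.mkdtemp()
log_path = os.path.join(work, "a.log")
with open(log_path, "w") as f:
    f.write("line 1\n")

# zip: the with-statement closes the archive automatically.
zip_path = os.path.join(work, "demo.zip")
with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as z:
    z.write(log_path, arcname="a.log")
with zipfile.ZipFile(zip_path) as z:
    zip_names = z.namelist()
    zip_text = z.read("a.log").decode("utf-8")

# tar: "w:gz" adds gzip compression on top of plain tar.
tar_path = os.path.join(work, "demo.tar.gz")
with tarfile.open(tar_path, "w:gz") as tar:
    tar.add(log_path, arcname="a.log")
with tarfile.open(tar_path, "r:gz") as tar:
    tar_names = tar.getnames()

print(zip_names, tar_names)
```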

7.6 shelve

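The shelve module is a simple key/value store that persists in-memory data to a file. It can persist any Python data that pickle supports. Because it is built on pickle, it can also store data beyond the basic types, such as class instances or datetime values.

import shelve, datetime
f = shelve.open('test')

name = ['zhangshanci','qing','xi']

class test(object):
    def __init__(self,n):
        self.n = n

t1 = test(1)
t2 = test(2)

f['name'] = name
f['t1'] = t1
f['t2'] = t2
f['date'] = datetime.datetime.now()    # call now() to store the current time rather than the function itself
f.close()

# After running the code above, one or more files whose names start with test appear in the same directory (the exact files depend on the dbm backend). They are all needed; just leave them alone.

Reading the data back (to read t1 or t2, the class test must be importable in the reading script):

import shelve, datetime
f = shelve.open('test')
print(f.get("date"))
print(f.get("name"))
f.close()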
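A shelf can also be opened in a with-statement so that it is always closed and flushed. This sketch stores its file under a temporary directory; the file name demo_shelf and the fixed datetime value are arbitrary choices for the example.

```python
import datetime
import os
import shelve
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "demo_shelf")

# Write: the with-statement closes (and flushes) the shelf automatically.
with shelve.open(db_path) as shelf:
    shelf["name"] = ["zhangshanci", "qing", "xi"]
    shelf["date"] = datetime.datetime(2016, 8, 21, 18, 11, 24)

# Read it back in a separate open() call, as another script would.
with shelve.open(db_path) as shelf:
    keys = sorted(shelf.keys())
    name = shelf["name"]
    date = shelf.get("date")
    missing = shelf.get("no-such-key", "default")

print(keys, name, date, missing)
```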

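8. The xml processing module

XML is a format for exchanging data between different languages or programs, much like JSON, although JSON is simpler to use. In the dark ages before JSON existed, XML was the only choice, and to this day many systems at traditional companies, for example in the finance industry, still expose interfaces that are mainly XML.

The XML format looks like this; the data structure is expressed with <> nodes: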

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

XML is supported in every language. In Python the following module can be used to work with it. Save the content above in a file named xmltest.xml; the code below reads it:

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")
root = tree.getroot()
print(root.tag)

# Walk the whole document
for child in root:
    print(child.tag, child.attrib)
    for i in child:
        print(i.tag,i.text)

# Visit only the year nodes
for node in root.iter('year'):
    print(node.tag,node.text)

Creating an XML file:

import xml.etree.ElementTree as ET

new_xml = ET.Element("namelist")
name = ET.SubElement(new_xml,"name",attrib={"enrolled":"yes"})
age = ET.SubElement(name,"age",attrib={"checked":"no"})
sex = ET.SubElement(name,"sex")
sex.text = '33'
name2 = ET.SubElement(new_xml,"name",attrib={"enrolled":"no"})
age = ET.SubElement(name2,"age")
age.text = '19'

et = ET.ElementTree(new_xml)  # build the document object
et.write("test.xml", encoding="utf-8",xml_declaration=True)

ET.dump(new_xml)  # print the generated XML
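Besides reading and creating, ElementTree can modify and delete nodes. The sketch below parses a shortened copy of the sample document from a string, so it does not need xmltest.xml; the "rank above 50" rule is just an example condition.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the sample document, parsed from a string.
SAMPLE = """<data>
    <country name="Liechtenstein"><rank>2</rank><year>2008</year></country>
    <country name="Singapore"><rank>5</rank><year>2011</year></country>
    <country name="Panama"><rank>69</rank><year>2011</year></country>
</data>"""

root = ET.fromstring(SAMPLE)

# Modify: add one to every year and mark the node as updated.
for node in root.iter("year"):
    node.text = str(int(node.text) + 1)
    node.set("updated", "yes")

# Delete: drop every country whose rank is above 50.
for country in root.findall("country"):
    if int(country.find("rank").text) > 50:
        root.remove(country)

names = [c.get("name") for c in root.findall("country")]
years = [c.find("year").text for c in root.findall("country")]
print(names, years)
```

findall() returns a list, so removing elements inside the loop does not disturb the iteration.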

9. The PyYAML module

http://pyyaml.org/wiki/PyYAMLDocumentation

10. The ConfigParser module

Used to create and modify common configuration files. The module was renamed to configparser in Python 3.x.

Here is a configuration format that many programs use:

[DEFAULT]
ServerAliveInterval = 45
Compression = yes
CompressionLevel = 9
ForwardX11 = yes
 
[bitbucket.org]
User = hg
 
[topsecret.server.com]
Port = 50022
ForwardX11 = no

How would you generate a file like this with Python?

import configparser

config = configparser.ConfigParser()
config["DEFAULT"] = {'ServerAliveInterval': '45',
                     'Compression': 'yes',
                     'CompressionLevel': '9'}

config['bitbucket.org'] = {}
config['bitbucket.org']['User'] = 'hg'
config['topsecret.server.com'] = {}
topsecret = config['topsecret.server.com']
topsecret['Port'] = '50022'     # mutates the parser
topsecret['ForwardX11'] = 'no'  # same here
config['DEFAULT']['ForwardX11'] = 'yes'
with open('example.ini', 'w') as configfile:
    config.write(configfile)

Once written, the file can be read back:

>>> import configparser
>>> config = configparser.ConfigParser()
>>> config.sections()
[]
>>> config.read('example.ini')
['example.ini']
>>> config.sections()
['bitbucket.org', 'topsecret.server.com']
>>> 'bitbucket.org' in config
True
>>> 'bytebong.com' in config
False
>>> config['bitbucket.org']['User']
'hg'
>>> config['DEFAULT']['Compression']
'yes'
>>> topsecret = config['topsecret.server.com']
>>> topsecret['ForwardX11']
'no'
>>> topsecret['Port']
'50022'
>>> for key in config['bitbucket.org']: print(key)
...
user
serveraliveinterval
compression
compressionlevel
forwardx11
>>> config['bitbucket.org']['ForwardX11']
'yes'

configparser syntax for adding, deleting, changing and querying. Given a file i.cfg with this content:

[section1]
k1 = v1
k2:v2

[section2]
k1 = v1

the operations look like this (uncomment the ones you need):

import configparser

config = configparser.ConfigParser()
config.read('i.cfg')

# ########## Read ##########
#secs = config.sections()
#print(secs)
#options = config.options('section2')
#print(options)

#item_list = config.items('section2')
#print(item_list)

#val = config.get('section1', 'k1')
#val = config.getint('section1', 'k1')    # only works if the value is an integer

# ########## Modify ##########
#sec = config.remove_section('section1')
#config.write(open('i.cfg', "w"))

#sec = config.has_section('wupeiqi')
#sec = config.add_section('wupeiqi')
#config.write(open('i.cfg', "w"))

#config.set('section2', 'k1', '11111')    # values must be strings in Python 3
#config.write(open('i.cfg', "w"))

#config.remove_option('section2', 'k1')
#config.write(open('i.cfg', "w"))
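The typed getters and the fallback argument are easy to try without touching any file by parsing from a string. This is a sketch on Python 3; the Timeout key is an invented option that is deliberately missing from the sample.

```python
import configparser

SAMPLE = """
[DEFAULT]
ServerAliveInterval = 45
Compression = yes

[topsecret.server.com]
Port = 50022
ForwardX11 = no
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)

section = config["topsecret.server.com"]
port = section.getint("Port")                        # the string '50022' becomes an int
forward = section.getboolean("ForwardX11")           # 'no' becomes False
interval = section.getint("ServerAliveInterval")     # inherited from DEFAULT
timeout = section.getint("Timeout", fallback=30)     # missing key, so the fallback is used

config.set("topsecret.server.com", "Port", "2222")   # values must be strings
config.remove_option("topsecret.server.com", "ForwardX11")
remaining = config.options("topsecret.server.com")   # includes the DEFAULT keys

print(port, forward, interval, timeout, remaining)
```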

11. The hashlib module

Used for hashing (message digests). In 3.x it replaces the md5 and sha modules and mainly provides the SHA1, SHA224, SHA256, SHA384, SHA512 and MD5 algorithms.

import hashlib

m = hashlib.md5()
m.update(b"Hello")
m.update(b"It's me")
print(m.digest())
m.update(b"It's been a long time since last time we ...")

print(m.digest())  # hash as raw bytes
print(len(m.hexdigest()))  # length of the hexadecimal form of the hash (32 for MD5)
'''
def digest(self, *args, **kwargs): # real signature unknown
    """ Return the digest value as a string of binary data. """
    pass

def hexdigest(self, *args, **kwargs): # real signature unknown
    """ Return the digest value as a string of hexadecimal digits. """
    pass

'''
import hashlib

# ######## md5 ########

hash = hashlib.md5()
hash.update(b'admin')
print(hash.hexdigest())

# ######## sha1 ########

hash = hashlib.sha1()
hash.update(b'admin')
print(hash.hexdigest())

# ######## sha256 ########

hash = hashlib.sha256()
hash.update(b'admin')
print(hash.hexdigest())


# ######## sha384 ########

hash = hashlib.sha384()
hash.update(b'admin')
print(hash.hexdigest())

# ######## sha512 ########

hash = hashlib.sha512()
hash.update(b'admin')
print(hash.hexdigest())

# In Python 3.x update() only accepts bytes, so a str (for example Chinese text) has to be encoded first
hash = hashlib.sha512()
hash.update('是我的'.encode(encoding='utf_8'))
print(hash.hexdigest())
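Calling update() several times is equivalent to hashing the concatenated data, which is what makes it practical to hash large files in chunks. The helper sha256_of_stream below is a name made up for this sketch; it reads from any binary file-like object.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=8192):
    """Hash a binary file-like object chunk by chunk and return the hex digest."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


data = b"Hello" + b"It's me"

# Feeding the data in pieces gives the same digest as hashing it in one go.
piecewise = hashlib.md5()
piecewise.update(b"Hello")
piecewise.update(b"It's me")
one_shot = hashlib.md5(data).hexdigest()

streamed = sha256_of_stream(io.BytesIO(data), chunk_size=4)
print(piecewise.hexdigest() == one_shot)
print(streamed == hashlib.sha256(data).hexdigest())
```

For a real file, pass an object opened with open(path, 'rb') instead of the BytesIO stand-in.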

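12. The re module

Remember: a non-None return value means there was a match (a successful match returns a match object), and None means there was no match. Use the group() method to see the matched string.

Common regular-expression symbols

'.'     by default matches any single character except \n; with flags=re.S it also matches \n. Flags change how a pattern matches in general, e.g. re.search(r'[a-z]+','abDCDv',flags=re.I) ignores case because of re.I.
'^'     matches the start of the string; with flags=re.MULTILINE it also matches at the start of each line, so re.search(r"^a","\nabc\neee",flags=re.MULTILINE) succeeds
'$'     matches the end of the string; with flags=re.MULTILINE it also matches at the end of each line, so re.search("foo$","bfoo\nsdfsf",flags=re.MULTILINE).group() works too
'*'     matches the preceding character 0 or more times, re.findall("ab*","cabb3abcbbac") gives ['abb', 'ab', 'a']
'+'     matches the preceding character 1 or more times, re.findall("ab+","ab+cd+abb+bba") gives ['ab', 'abb']
'?'     matches the preceding character 1 or 0 times
'{m}'   matches the preceding character m times
'{n,m}' matches the preceding character n to m times, re.findall("ab{1,3}","abb abc abbcbbb") gives ['abb', 'ab', 'abb']
'|'     matches the expression on the left or on the right of |, re.search("abc|ABC","ABCBabcCD").group() gives 'ABC'
'(...)' grouping, re.search("(abc){2}a(123|456)c", "abcabca456c").group() gives abcabca456c


'\A'    matches only at the start of the string, re.search(r"\Aabc","alexabc") finds nothing
'\Z'    matches only at the end of the string; like $, except that $ also matches just before a trailing newline
'\d'    matches a digit 0-9
'\D'    matches a non-digit
'\w'    matches [A-Za-z0-9_]
'\W'    matches anything not in [A-Za-z0-9_]
'\s'    matches whitespace such as space, \t, \n, \r, re.search(r"\s+","ab\tc1\n3").group() gives '\t'

'(?P<name>...)' named grouping, re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})","371481199306143242").groupdict() gives {'province': '3714', 'city': '81', 'birthday': '1993'}

The most commonly used matching functions

re.match    match from the start of the string
re.search   match anywhere in the string
re.findall  return all matches as the elements of a list
re.split    split the string, using the matches as separators
re.sub      match and replace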

Examples of split and sub:

import re

print(re.split(r'\d+', 'fdsa123dffd2312fdr2132d2312'))
print(re.sub(r'\d+', '|', 'fdsa123dffd2312fdr2132d2312'))   # replace every match
print(re.sub(r'\d+', '|', 'fdsa123dffd2312fdr2132d2312', count=2))  # replace only the first two matches

Result:
['fdsa', 'dffd', 'fdr', 'd', '']
fdsa|dffd|fdr|d|
fdsa|dffd|fdr2132d2312

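The backslash plague
As in most programming languages, regular expressions use "\" as the escape character, which can lead to a plague of backslashes. To match a literal "\" in text, the regular expression written as an ordinary string literal needs four backslashes, "\\\\": the string literal turns each pair into one backslash, and the resulting two backslashes are then read by the regular-expression engine as one escaped backslash. Python's raw strings solve this neatly: the regular expression in this example can be written r"\\". Likewise "\\d", which matches a digit, can be written r"\d". With raw strings you no longer need to worry about missing a backslash, and the expressions are easier to read.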
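The equivalence of the two spellings can be checked directly. In this sketch the sample path and sentence are invented; the patterns are the ones discussed above.

```python
import re

path = "C:\\temp\\notes"            # the text itself contains single backslashes

# Both patterns describe "one literal backslash" to the regex engine.
plain_pattern = "\\\\"              # ordinary literal: four characters typed, two stored
raw_pattern = r"\\"                 # raw literal: two characters typed, two stored

same_pattern = plain_pattern == raw_pattern
parts = re.split(raw_pattern, path)
digits = re.findall(r"\d+", "room 101, floor 7")

print(same_pattern, parts, digits)
```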

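A few matching flags worth knowing

re.I (re.IGNORECASE): ignore case (the full name is in parentheses, likewise below)
re.M (re.MULTILINE): multi-line mode, changes the behavior of '^' and '$' (see the symbol list above)
re.S (re.DOTALL): dot-matches-all mode, changes the behavior of '.'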
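A short experiment shows what each flag changes, together with the named-group example from the symbol list. The sample text is made up for the sketch.

```python
import re

text = "first line\nSecond line\nthird LINE"

case_insensitive = re.findall(r"line", text, flags=re.I)     # finds 'line' and 'LINE'
line_starts = re.findall(r"^\w+", text, flags=re.M)          # '^' matches after every newline
without_dotall = re.search(r"first.*third", text)            # '.' stops at '\n', so no match
with_dotall = re.search(r"first.*third", text, flags=re.S)   # '.' also matches '\n'

# Named groups, using the ID-number layout from the symbol list.
id_match = re.search(
    r"(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})",
    "371481199306143242",
)
fields = id_match.groupdict()

print(case_insensitive, line_starts, without_dotall is None, with_dotall is not None)
print(fields)
```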

13. The subprocess module

The subprocess module allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes. This module intends to replace several older modules and functions:

os.system
os.spawn*

The recommended approach to invoking subprocesses is to use the run() function for all use cases it can handle. For more advanced use cases, the underlying Popen interface can be used directly.

The run() function was added in Python 3.5; if you need to retain compatibility with older versions, see the Older high-level API section.

subprocess.run(args, *, stdin=None, input=None, stdout=None, stderr=None, shell=False, timeout=None, check=False)

Run the command described by args. Wait for command to complete, then return a CompletedProcess instance.

The arguments shown above are merely the most common ones, described below in Frequently Used Arguments (hence the use of keyword-only notation in the abbreviated signature). The full function signature is largely the same as that of the Popen constructor - apart from timeout, input and check, all the arguments to this function are passed through to that interface.

This does not capture stdout or stderr by default. To do so, pass PIPE for the stdout and/or stderr arguments.

The timeout argument is passed to Popen.communicate(). If the timeout expires, the child process will be killed and waited for. The TimeoutExpired exception will be re-raised after the child process has terminated.

The input argument is passed to Popen.communicate() and thus to the subprocess's stdin. If used it must be a byte sequence, or a string if universal_newlines=True. When used, the internal Popen object is automatically created with stdin=PIPE, and the stdin argument may not be used as well.

If check is True, and the process exits with a non-zero exit code, a CalledProcessError exception will be raised. Attributes of that exception hold the arguments, the exit code, and stdout and stderr if they were captured.

>>> subprocess.run(["ls", "-l"])  # doesn't capture output
CompletedProcess(args=['ls', '-l'], returncode=0)
 
>>> subprocess.run("exit 1", shell=True, check=True)
Traceback (most recent call last):
  ...
subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1
 
>>> subprocess.run(["ls", "-l", "/dev/null"], stdout=subprocess.PIPE)
CompletedProcess(args=['ls', '-l', '/dev/null'], returncode=0,
stdout=b'crw-rw-rw- 1 root root 1, 3 Jan 23 16:23 /dev/null\n')

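Calling subprocess.run(...) is the recommended approach and covers most needs. If you need more complex interaction with the system, you can use subprocess.Popen(). The syntax is:

p = subprocess.Popen(r"find / -size +1000000 -exec ls -shl {} \;", shell=True, stdout=subprocess.PIPE)
print(p.stdout.read())

Available parameters:

args: the shell command, either a string or a sequence (e.g. a list or tuple)
bufsize: buffering. 0 means unbuffered, 1 line buffered, any other positive value a buffer of that size, a negative value the system default
stdin, stdout, stderr: the program's standard input, output and error handles
preexec_fn: Unix only; a callable object that is called in the child process just before the command is executed
close_fds: on Windows, if close_fds is True the newly created child process does not inherit the parent's input, output and error pipes,
so close_fds cannot be set to True while also redirecting the child's standard input, output and error (stdin, stdout, stderr).
shell: as above
cwd: sets the working directory of the child process
env: the environment variables of the child process. If env is None, the child inherits the parent's environment.
universal_newlines: line endings differ between systems; True -> use \n uniformly
startupinfo and creationflags: Windows only;
they are passed to the underlying CreateProcess() function to set properties of the child process, such as the appearance of the main window or the process priority

Commands typed in a terminal fall into two kinds:

Those that produce output right away, e.g. ifconfig
Those that enter an environment and then wait for further input, e.g. python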

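An example of a command that needs interaction

import subprocess
import sys

# sys.executable is the interpreter running this script; universal_newlines=True opens the pipes in text mode, so str can be written
obj = subprocess.Popen([sys.executable], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
obj.stdin.write('print(1)\n')
obj.stdin.write('print(2)\n')
obj.stdin.write('print(3)\n')
obj.stdin.write('print(4)\n')

out_error_list = obj.communicate(timeout=10)    # closes stdin and returns (stdout, stderr)
print(out_error_list)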
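For most cases subprocess.run() is enough. The sketch below shows capturing output, the effect of check=True and feeding stdin through input=. It launches the current interpreter (sys.executable) so that it does not depend on ls or any other external command; the one-line child programs are invented for the demonstration.

```python
import subprocess
import sys

# Run a child Python process and capture its output as text.
completed = subprocess.run(
    [sys.executable, "-c", "print('hello from child')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)
child_output = completed.stdout.strip()

# check=True turns a non-zero exit status into an exception.
try:
    subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"], check=True)
    failed_code = 0
except subprocess.CalledProcessError as exc:
    failed_code = exc.returncode

# input= feeds the child's stdin; the result arrives through stdout.
echoed = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"],
    input="abc",
    stdout=subprocess.PIPE,
    universal_newlines=True,
).stdout.strip()

print(child_output, failed_code, echoed)
```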

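14. The logging module

Many programs need to keep logs, and the logs contain both normal access records and output such as errors and warnings. Python's logging module provides a standard logging interface through which you can store logs in various formats. Log messages fall into 5 levels: debug(), info(), warning(), error() and critical(). Let's see how to use it.

Simplest usage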

import logging
 
logging.warning("user [alex] attempted wrong password more than 3 times")
logging.critical("server is down")
 
# Output
WARNING:root:user [alex] attempted wrong password more than 3 times
CRITICAL:root:server is down

Writing the log to a file is just as simple:

import logging
 
logging.basicConfig(filename='example.log',level=logging.INFO)
logging.debug('This message should go to the log file')
logging.info('So should this')
logging.warning('And this, too')

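In the line below, level=logging.INFO sets the logging level to INFO, which means that only messages at INFO level or higher are written to the file. In this example the first message is not recorded. To record debug messages as well, change the level to DEBUG.

logging.basicConfig(filename='example.log',level=logging.INFO)

The log format above is missing the time, and a log without timestamps is not much use, so let's add it!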

import logging
logging.basicConfig(format='%(asctime)s %(message)s', datefmt='%m/%d/%Y %I:%M:%S %p')
logging.warning('is when this event was logged.')
 
# Output
12/12/2010 11:46:36 AM is when this event was logged.

To print the log to the screen and write it to a log file at the same time, a little more background is needed.

The logging library takes a modular approach and offers several categories of components: loggers, handlers, filters, and formatters.

Loggers expose the interface that application code directly uses.
Handlers send the log records (created by loggers) to the appropriate destination.
Filters provide a finer grained facility for determining which log records to output.
Formatters specify the layout of log records in the final output.

import logging
 
#create logger
logger = logging.getLogger('TEST-LOG')
logger.setLevel(logging.DEBUG)
 
 
# create console handler and set level to debug
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)
 
# create file handler and set level to warning
fh = logging.FileHandler("access.log")
fh.setLevel(logging.WARNING)
# create formatter
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
 
# add formatter to ch and fh
ch.setFormatter(formatter)
fh.setFormatter(formatter)
 
# add ch and fh to logger
logger.addHandler(ch)
logger.addHandler(fh)
 
# 'application' code
logger.debug('debug message')
logger.info('info message')
logger.warning('warn message')
logger.error('error message')
logger.critical('critical message')
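How the handler levels filter messages can be verified without creating files by pointing both handlers at in-memory streams. This is a sketch: the logger name DEMO-LOG is made up, and the two StringIO objects stand in for the console and for access.log.

```python
import io
import logging

logger = logging.getLogger("DEMO-LOG")      # hypothetical logger name
logger.setLevel(logging.DEBUG)              # the logger lets everything through
logger.propagate = False                    # keep the demo output off the root logger

formatter = logging.Formatter("%(name)s - %(levelname)s - %(message)s")

# Two in-memory streams stand in for the console and the log file.
console_stream, file_stream = io.StringIO(), io.StringIO()
console_handler = logging.StreamHandler(console_stream)
console_handler.setLevel(logging.DEBUG)
file_handler = logging.StreamHandler(file_stream)
file_handler.setLevel(logging.WARNING)

for handler in (console_handler, file_handler):
    handler.setFormatter(formatter)
    logger.addHandler(handler)

logger.debug("debug message")
logger.warning("warn message")
logger.error("error message")

console_lines = console_stream.getvalue().splitlines()
file_lines = file_stream.getvalue().splitlines()
print(len(console_lines), len(file_lines))
print(file_lines)
```

The "console" receives all three messages, while the "file" handler, set to WARNING, keeps only the warning and the error.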

posted @ 2016-08-22 14:43 freedom_dog