day05_20170530_Functions (Part 3) / Modules and Packages

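Supplement to the previous lesson:

I. The expression form of yield

Coroutine functions (generators: another use of the yield keyword, the expression form).

# Simulate a person eating at a restaurant
def eater(name):
    print('%s ready to eat' %name)
    while True:
        food = yield  # food receives whatever value is sent in through yield; the print below then shows who is eating what
        print('%s start to eat %s' %(name,food))
g = eater('alex')
print(g)  # <generator object eater at 0x007EBBD0>: calling eater() returns a generator because the body contains yield, here used as an expression
next(g)  # a generator only runs when advanced; this first next() runs the body until it pauses at food = yield
next(g)  # resumes from food = yield; nothing was sent, so food is None and this prints: alex start to eat None
# The calls above never give food a value. This is where the expression form, food = yield, comes in:
g.send('西红柿')  # like next(), send() resumes the function from where it paused, and it also passes a value: send hands the value to yield, and yield assigns it to food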

If we comment out the next() calls in the code above and use send right away, execution fails with this error:

Traceback (most recent call last):
  File "D:/python2017/20170423/s17_zc/day05/上节课复习.py", line 14, in <module>
    g.send('西红柿')
TypeError: can't send non-None value to a just-started generator

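The message says that a just-started generator can only be sent None. So instead of next(g) we can also call g.send(None). In short: a generator in expression form must first be sent None (or advanced with next) so that it runs to its first yield; only after that can real values be sent to it.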
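The priming rule can be checked directly. The sketch below uses a hypothetical generator named counter (not part of the lesson's code) to show that send(None) works on a fresh generator just like next(), while sending any other value first raises TypeError.

```python
def counter():
    """Hypothetical coroutine: yields how many values it has received so far."""
    received = 0
    while True:
        item = yield received   # item is whatever send() passed in
        received += 1

gen = counter()
first = gen.send(None)       # primes the generator, same effect as next(gen)
second = gen.send('tomato')  # now a real value can be sent

fresh = counter()
try:
    fresh.send('tomato')     # not primed yet, so this raises TypeError
except TypeError as exc:
    error_message = str(exc)

print(first, second)
print(error_message)
```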

In the example above we get a generator that has not been primed and must prime it by hand. We can instead write a decorator so that what we get back is already primed:

def deco(func):
    def wrapper(*args,**kwargs):
        res=func(*args,**kwargs)
        next(res)
        return res
    return wrapper
@deco
def eater(name):
    print('%s ready to eat' %name)
    while True:
        food = yield  # food receives the value passed in by send()
        print('%s start to eat %s' %(name,food))
g = eater('alex')
g.send('西红柿')

Now extend it: each send should also return the list of dishes served so far. A list works for this.

def deco(func):
    def wrapper(*args,**kwargs):
        res=func(*args,**kwargs)
        next(res)
        return res
    return wrapper
@deco
def eater(name):
    print('%s ready to eat' %name)
    food_list=[]
    while True:
        food = yield food_list
        food_list.append(food)
        print('%s start to eat %s' %(name,food))
g = eater('alex')
print(g.send('西红柿'))
Output:
alex ready to eat
alex start to eat 西红柿
['西红柿']

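To sum up the expression form of yield (x = yield): g.send('1111') first passes '1111' to yield, which assigns it to x; execution then continues until the next yield is reached, and the value after that yield becomes the return value of send.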
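As a further illustration of the send/yield round trip, here is a small running-average coroutine. The names primed and averager are hypothetical and not from the lesson; functools.wraps is added only so the decorated function keeps its own name.

```python
from functools import wraps

def primed(func):
    """Decorator: advance a new generator to its first yield."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)
        return gen
    return wrapper

@primed
def averager():
    total = 0
    count = 0
    average = None
    while True:
        value = yield average   # send() delivers value; the next yield hands back average
        total += value
        count += 1
        average = total / count

avg = averager()
results = [avg.send(10), avg.send(20), avg.send(60)]
print(results)
```

Each send both feeds a value in and receives the updated average back, which is the two-way traffic the summary describes.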

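An application of the yield expression: use coroutine functions to emulate grep -rl 'python' /root, which lists the files under /root that contain a line with 'python'.

Before that, a quick look at os.walk.

First create a directory structure like this (it can be read off the output below): a contains a.txt, a1.txt and a subdirectory b; b contains b.txt and c; c contains c.txt and d; d contains d.txt.

Walk the files under a, level by level: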

import os
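g = os.walk(r'D:\python2017\20170423\s17_zc\day05\a')  # Windows paths use \ as the separator; the r prefix makes this a raw string so the backslashes are not treated as escapes
print(g)  # <generator object walk at 0x0057BBA0>: a generator, so we can loop over it with for
for i in g:
    print(i)
# os.walk descends through the directory tree level by level
Output: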
<generator object walk at 0x0062BBA0>
('D:\\python2017\\20170423\\s17_zc\\day05\\a', ['b'], ['a.txt', 'a1.txt'])
('D:\\python2017\\20170423\\s17_zc\\day05\\a\\b', ['c'], ['b.txt'])
('D:\\python2017\\20170423\\s17_zc\\day05\\a\\b\\c', ['d'], ['c.txt'])
('D:\\python2017\\20170423\\s17_zc\\day05\\a\\b\\c\\d', [], ['d.txt'])

To get the full path of every file:

import os
g = os.walk(r'D:\python2017\20170423\s17_zc\day05\a')  # raw string, so the Windows backslashes are kept as they are
print(g)  # <generator object walk at 0x0057BBA0>: a generator, so we can loop over it with for
for par_dir,_,files in g:
    for file in files:
        file_abs_path=r'%s\%s' %(par_dir,file)
        print(file_abs_path)
Output:
<generator object walk at 0x005BBBA0>
D:\python2017\20170423\s17_zc\day05\a\a.txt
D:\python2017\20170423\s17_zc\day05\a\a1.txt
D:\python2017\20170423\s17_zc\day05\a\b\b.txt
D:\python2017\20170423\s17_zc\day05\a\b\c\c.txt
D:\python2017\20170423\s17_zc\day05\a\b\c\d\d.txt

The same thing wrapped in a function:

import os
def search(search_path):
    g = os.walk(search_path)
    for par_dir,_,files in g:
        for file in files:
            file_abs_path=r'%s\%s' %(par_dir,file)
            print(file_abs_path)
search(r'D:\python2017\20170423\s17_zc\day05\a')

But we do not want to pass the path as an argument on every call. With the expression form of yield we can keep sending paths in:

import os
def init(func):
    def wrapper(*args,**kwargs):
        res = func(*args,**kwargs)
        next(res)
        return res
    return wrapper
@init
def search():
    while True:  # loop forever so that paths can keep being sent in
        search_path = yield
        g = os.walk(search_path)
        for par_dir,_,files in g:
            for file in files:
                file_abs_path=r'%s\%s' %(par_dir,file)
                print(file_abs_path)
g = search()
x=r'D:\python2017\20170423\s17_zc\day05\a'
g.send(x)

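Listing files works. Next step: receive the results of the previous stage and open the files one by one. How do we keep feeding values into a function from outside?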

import os
def init(func):
    def wrapper(*args,**kwargs):
        res = func(*args,**kwargs)
        next(res)
        return res
    return wrapper
@init
def search(target):
    while True:  # loop forever so that paths can keep being sent in
        search_path = yield
        g = os.walk(search_path)
        for par_dir,_,files in g:
            for file in files:
                file_abs_path=r'%s\%s' %(par_dir,file)
                #print(file_abs_path)
                target.send(file_abs_path)
# receive the results of the previous stage and open the files one by one
@init  # prime the generator
def opener():
    while True:
        file_abs_path = yield
        print('openner func==>',file_abs_path)
        with open(file_abs_path,encoding='utf-8') as f:
            pass
g = search(opener())  # calling search returns a generator, so nothing happens until a value is sent in
g.send(r'D:\python2017\20170423\s17_zc\day05\a')

Once values are passed along, how does each stage keep receiving them? Here is the complete code:

import os
def init(func):
    def wrapper(*args,**kwargs):
        res = func(*args,**kwargs)
        next(res)
        return res
    return wrapper
@init
def search(target):
    while True:  # loop forever so that paths can keep being sent in
        search_path = yield
        g = os.walk(search_path)
        for par_dir,_,files in g:
            for file in files:
                file_abs_path=r'%s\%s' %(par_dir,file)
                #print(file_abs_path)
                target.send(file_abs_path)
# receive the results of the previous stage and open the files one by one
@init  # prime the generator
def opener(target):
    while True:
        file_abs_path = yield
        #print('openner func==>',file_abs_path)
        with open(file_abs_path,encoding='utf-8') as f:
            target.send((file_abs_path,f))
@init
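# keep receiving (path, file object) pairs from the previous stage and iterate over each file:
def cat(target):
    while True:
        file_abs_path,f = yield #(file_abs_path,f)
        for line in f :  # iterate over the lines of the file
            target.send((file_abs_path,line))
@init
# filter the lines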
def grep(target,pattern):
    while True:
        file_abs_path,line = yield
        if pattern in line:
            target.send(file_abs_path)
@init
def printer():
    while True:
        file_abs_path=yield
        print(file_abs_path)

# The whole pipeline is defined; how do we call it?

x = r'D:\python2017\20170423\s17_zc\day05\a'
g = search(opener(cat(grep(printer(),'python'))))
print(g)  # <generator object search at 0x007AACF0>: this is only a generator
g.send(x)

Output:
<generator object search at 0x0036ACF0>
D:\python2017\20170423\s17_zc\day05\a\a.txt
D:\python2017\20170423\s17_zc\day05\a\b\b.txt
D:\python2017\20170423\s17_zc\day05\a\b\c\d\d.txt

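This style is called procedural programming: an assembly-line way of thinking. On an assembly line each stage is responsible for its own job, the stages are connected, and the result of one stage is handed to the next. If you think of a function's arguments as its input and its return value as its output, procedural code written with functions becomes such a line. The benefit is clarity: a complex task is broken into simple steps, one stage feeding the next. So when you write procedural code, design it in your head as a pipeline and split it into stages until they cannot be split further. The drawback is that a pipeline does one job only, so it is hard to extend.

Summary: procedural programming is an assembly-line, mechanical way of designing programs. Advantage: the program structure is clear and complex problems are made simple. Disadvantage: poor extensibility. Typical uses: the Linux kernel, git and httpd are good examples of procedural design.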
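The assembly-line idea can also be tried without touching the file system. The following is a minimal sketch with three hypothetical stages (splitter, grep, collector) that work on an in-memory string; only the init decorator follows the lesson's code.

```python
def init(func):
    def wrapper(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)               # prime the coroutine
        return gen
    return wrapper

@init
def splitter(target):
    """Stage 1: receive a block of text and pass it on line by line."""
    while True:
        text = yield
        for line in text.splitlines():
            target.send(line)

@init
def grep(target, pattern):
    """Stage 2: forward only the lines that contain pattern."""
    while True:
        line = yield
        if pattern in line:
            target.send(line)

@init
def collector(bucket):
    """Stage 3: store whatever arrives in the given list."""
    while True:
        bucket.append((yield))

matches = []
pipeline = splitter(grep(collector(matches), 'python'))
pipeline.send('i like python\njava too\npython again')
print(matches)
```

Each stage knows only the stage after it, which is why the structure is easy to follow and also why adding a different kind of behaviour means building a new line.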

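Question: in the example above, if one file contains three lines with 'python', how many times is its name printed?

Output (b.txt has three matching lines, so its name is printed three times):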

<generator object search at 0x006BBCF0>
D:\python2017\20170423\s17_zc\day05\a\a.txt
D:\python2017\20170423\s17_zc\day05\a\b\b.txt
D:\python2017\20170423\s17_zc\day05\a\b\b.txt
D:\python2017\20170423\s17_zc\day05\a\b\b.txt
D:\python2017\20170423\s17_zc\day05\a\b\c\d\d.txt

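How do we fix the code? In the version below, grep reports back through yield whether the line matched, and cat stops reading the current file as soon as it gets a match: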

import os
def init(func):
    def wrapper(*args,**kwargs):
        res = func(*args,**kwargs)
        next(res)
        return res
    return wrapper
@init
def search(target):
    while True:  # loop forever so that paths can keep being sent in
        search_path = yield
        g = os.walk(search_path)
        for par_dir,_,files in g:
            for file in files:
                file_abs_path=r'%s\%s' %(par_dir,file)
                #print(file_abs_path)
                target.send(file_abs_path)
# receive the results of the previous stage and open the files one by one
@init  # prime the generator
def opener(target):
    while True:
        file_abs_path = yield
        #print('openner func==>',file_abs_path)
        with open(file_abs_path,encoding='utf-8') as f:
            target.send((file_abs_path,f))
@init
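# keep receiving (path, file object) pairs from the previous stage and iterate over each file:
def cat(target):
    while True:
        file_abs_path,f = yield #(file_abs_path,f)
        for line in f :  # iterate over the lines of the file
            tag = target.send((file_abs_path,line))
            if tag:
                break
@init
# filter the lines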
def grep(target,pattern):
    tag = False
    while True:
        file_abs_path,line = yield tag
        tag = False
        if pattern in line:
            tag = True
            target.send(file_abs_path)
@init
def printer():
    while True:
        file_abs_path=yield
        print(file_abs_path)

# The whole pipeline is defined; how do we call it?

x = r'D:\python2017\20170423\s17_zc\day05\a'
g = search(opener(cat(grep(printer(),'python'))))
print(g)  # <generator object search at 0x007AACF0>: this is only a generator
g.send(x)

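II. Anonymous functions

A named function:

# named function:
def func(x,y):
    return x + y
func(1,2)

An anonymous function:

# anonymous function, for places where no name is needed:
f = lambda x,y:x+y
print(f)  # <function <lambda> at 0x00851150>: the function object that f refers to, shown with its memory address
# the benefit of lambda: it replaces simple functions like the one above
print(f(1,2))  # call it

Before the analysis, see the diagram below:

An anonymous function that nobody references after it is defined has a reference count of 0; it is garbage and is reclaimed immediately. So where is it useful? That ties in with the built-in functions covered next.

# using max, min and zip
# define a dict of salaries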
salaries = {
    'egon':3000,
    'alex':100000000,
    'wupeiqi':10000,
    'yuanhao':2000
}
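# find the person with the highest salary
#print(max(salaries))  # why 'yuanhao'? max over a dict compares the keys, character by character from the first letter
# how do we get the largest value?
#print(max(salaries.values()))  # this prints only the largest salary, not the person
res = zip(salaries.values(),salaries.keys())
#print(res)  # <zip object at 0x004E5468>: a zip object
#print(list(res))  # the contents can be viewed with for or with list
print(max(res))  # keep print(list(res)) commented out, otherwise this raises ValueError: max() arg is an empty sequence. res is an iterator: once list() has consumed it, nothing is left for max()

Output: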
(100000000, 'alex')
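The ValueError mentioned in the comment comes from the fact that a zip object can be consumed only once. A short sketch of that behaviour, reusing the salaries dict from the example:

```python
salaries = {'egon': 3000, 'alex': 100000000, 'wupeiqi': 10000, 'yuanhao': 2000}

pairs = zip(salaries.values(), salaries.keys())
first_pass = list(pairs)    # consumes the iterator
second_pass = list(pairs)   # nothing is left the second time

# Build a fresh zip object for each use instead of reusing an exhausted one
top = max(zip(salaries.values(), salaries.keys()))

print(first_pass)
print(second_pass)
print(top)
```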

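But the zip version still does not express exactly what I want: I only want the name of the highest earner. Use the key argument of max:

# define a dict of salaries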
salaries = {
    'egon':3000,
    'alex':100000000,
    'wupeiqi':10000,
    'yuanhao':2000
}
def func(k):
    return salaries[k]
print(max(salaries,key=func))  # max iterates over salaries, and key says what to compare by: here, each person's salary

The function above is trivial: it is a single return and it is used only once. This is exactly where an anonymous function fits:

salaries = {
    'egon':3000,
    'alex':100000000,
    'wupeiqi':10000,
    'yuanhao':2000
}
def func(k):
    return salaries[k]
#print(max(salaries,key=func))  # max iterates over salaries, and key says what to compare by: here, each person's salary
print(max(salaries,key=lambda k:salaries[k]))
print(min(salaries,key=lambda k:salaries[k]))

Now sort the dict:

salaries = {
    'egon':3000,
    'alex':100000000,
    'wupeiqi':10000,
    'yuanhao':2000
}
print(sorted(salaries))  # sorting a dict sorts its keys; the default order is ascending
print(sorted(salaries,key=lambda x:salaries[x]))  # ascending by salary
print(sorted(salaries,key=lambda x:salaries[x],reverse=True))  # descending by salary

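Another programming paradigm: functional programming. It has many properties that Python does not fully support. 1) Calling a function should not modify anything outside it. 2) There are no loops; all iteration is done by recursion, and that recursion should be tail recursion, but Python does not optimize tail calls. To study functional programming properly, look at a functional language rather than Python. Procedural and functional programming are ways of thinking that do not depend on any one language.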
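The point about tail recursion can be observed directly. In this sketch the function sum_to is hypothetical and written in tail-recursive style; CPython still creates a stack frame for every call, so a deep call fails. The example assumes the interpreter's default recursion limit has not been raised.

```python
def sum_to(n, acc=0):
    """Tail-recursive style: the recursive call is the last thing the function does."""
    if n == 0:
        return acc
    return sum_to(n - 1, acc + n)

small = sum_to(100)            # shallow recursion works

try:
    sum_to(10 ** 6)            # far deeper than the recursion limit
    hit_limit = False
except RecursionError:
    hit_limit = True

loop_total = sum(range(10 ** 6 + 1))  # the loop-based equivalent has no depth limit
print(small, hit_limit, loop_total)
```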

What the global keyword does: it lets a function modify state outside itself while it runs.

x = 1000
def f1():
    global x  # without this line, x = 0 would create a local variable and the final value of x would still be 1000
    x = 0
f1()
print(x)

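Higher-order function: a function that takes a function as an argument, or returns a function, is called a higher-order function.

# Example below. This is a feature from functional programming; Python is not a functional language, but we can borrow its good features.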

def func(f):
    return f
res = func(max)  # the argument is the built-in max
print(res)  # what comes back is the max function object itself

Now map, reduce and filter.

map:

Scenario 1): append the suffix _sb to every element of a list

l = ['alex','wupeiqi','yuanhao']
res = map(lambda x:x+'_sb',l)
print(res)  # a map object, <map object at 0x00339770>; it has a __next__ method, so it is an iterator
print(list(res))
Output:
<map object at 0x00839770>
['alex_sb', 'wupeiqi_sb', 'yuanhao_sb']

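Scenario 2): given nums = (2,4,9,18), produce a new list containing the square of each value.

nums = (2,4,9,18)
res1=map(lambda x:x**2,nums)  # each element is passed to the lambda as x
print(list(res1))
print(nums)  # the original data is not modified
Output:
[4, 16, 81, 324]
(2, 4, 9, 18)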

reduce:

Scenario: sum the elements of a list

# in Python 3, reduce lives in the functools module
from functools import reduce
l = [1,2,3,4,5]
print(reduce(lambda x,y:x+y,l))
print(reduce(lambda x,y:x+y,l,10))  # with an initial value of 10, the result is the initial value plus the sum of the elements
Output:
15
25

filter:

Scenario: select the elements of the list below that end with 'sb'

l = ['alex_sb','wupeiqi_sb','yuanhao_sb','egon']
res = filter(lambda x:x.endswith('sb'),l)
print(res)
print(list(res))
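For comparison, the same three tasks can be written with list comprehensions and the built-in sum, which many Python programmers find easier to read. This is an alternative sketch, not a replacement for the map/reduce/filter examples above.

```python
names = ['alex', 'wupeiqi', 'yuanhao']
tagged = [name + '_sb' for name in names]             # same result as the map example

nums = (2, 4, 9, 18)
squares = [n ** 2 for n in nums]                      # same result as map with x**2

total = sum([1, 2, 3, 4, 5])                          # same result as reduce with +

mixed = ['alex_sb', 'wupeiqi_sb', 'yuanhao_sb', 'egon']
ending_sb = [s for s in mixed if s.endswith('sb')]    # same result as the filter example

print(tagged, squares, total, ending_sb)
```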

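This lesson

III. Recursion

Recursive call: when, in the course of a function call, the function calls itself directly or indirectly, that is recursion.

Direct recursion: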

#!/usr/bin/python
# -*- coding:utf-8 -*-
def f1():
    print('from f1')
    f1()
f1()

There is also indirect recursion, which is used much less often:

def f1():
    print('f1')
    f2()
def f2():
    f1()
f1()

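Setting the technical side aside, here is an analogy: asking for directions. I ask A how to get to Tiananmen, and A asks B. While A is asking B, I wait, which means my state has to be kept in memory. B does not know either, so B asks C; while B asks C, A waits too, and A's state is kept in memory as well. Every one of these waits takes up memory. Finally C finds out where Tiananmen is; C tells B, B tells A, and A tells me. What if unbounded recursion were allowed? Memory would stay occupied until it ran out, so Python does not allow it. The sys module has a function, sys.getrecursionlimit(), that reports how many levels deep recursion may go. The default is 1000, and the value can be changed.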

import sys
print(sys.getrecursionlimit())

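Setting a custom limit with sys.setrecursionlimit: raising it does not really help and only makes the program riskier, because the depth you can actually reach depends on available memory.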

import sys
print(sys.getrecursionlimit())
sys.setrecursionlimit(100000)  # returns None, so there is nothing to print
print(sys.getrecursionlimit())

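Use case: how old is A? A says he is 2 years older than B, B is 2 years older than C, C is 2 years older than D, D is 2 years older than E, and E is 18.

Analysis: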

age(5) = age(4)+2
age(4) = age(3)+2
age(3) = age(2)+2
age(2) = age(1)+2
age(1) = 18

age(n) = age(n-1)+2 n>1
age(n) = 18 n=1

Implementation:

def age(n):
    if n == 1:
        return 18
    return age(n-1) +2
print(age(5))

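Walking through the code:

After the analysis, let us summarize recursion:

1) There must be a clear termination condition (the descent has to stop before the results can be passed back up).

2) Each deeper level of recursion should work on a smaller problem than the previous one (the point of recursion: if this level cannot solve the problem, go one level down, until it is solved).

3) Recursion is not efficient, and too many levels cause a stack overflow (function calls are implemented with a stack: each call pushes a stack frame and each return pops one. The stack is finite, so too many nested calls overflow it).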
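Because of point 3, a recursion this simple is often rewritten as a loop. The sketch below puts the lesson's age function next to a hypothetical loop version, age_loop, that gives the same answer without growing the call stack.

```python
def age(n):
    if n == 1:
        return 18
    return age(n - 1) + 2

def age_loop(n):
    """Same result as age(n), computed with a loop instead of recursion."""
    result = 18
    for _ in range(n - 1):
        result += 2
    return result

print(age(5), age_loop(5))
print(age_loop(5000))  # deeper than the default recursion limit would allow for age()
```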

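An application of recursion:

Binary search: suppose a sequence holds ten thousand values and you must decide whether some value is in it. A for loop compares them one by one; if the value sits at position nine thousand, everything before it has to be checked first. Binary search avoids this. It requires the elements to be sorted in ascending order. Split the list in half: the middle value is larger than everything to its left and smaller than everything to its right. Compare with the middle value first; if the target is larger, it must be in the right half, so slice the list to keep only that half and repeat with its middle value, until the value is found or nothing is left.

Binary search written recursively: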

l = [1,2,10,33,53,71,73,75,77,85,101,201,202,999,11111]
def search(find_num,seq):
    if len(seq) == 0:
        print('not exist')
        return  # the value is not in the list: print not exist and end the recursion
    mid_index=len(seq)//2  # index of the middle element; note that for a two-element list this is 1, the right-hand element
    mid_num = seq[mid_index]
    print(seq,mid_num)
    if find_num > mid_num:
        #in the right
        seq = seq[mid_index+1:]
        search(find_num,seq)
    elif find_num < mid_num:
        #in the left
        seq = seq[:mid_index]
        search(find_num,seq)
    else:
        print('find it')
search(77,l)
search(72,l)
Output:
[1, 2, 10, 33, 53, 71, 73, 75, 77, 85, 101, 201, 202, 999, 11111] 75
[77, 85, 101, 201, 202, 999, 11111] 201
[77, 85, 101] 85
[77] 77
find it
[1, 2, 10, 33, 53, 71, 73, 75, 77, 85, 101, 201, 202, 999, 11111] 75
[1, 2, 10, 33, 53, 71, 73] 33
[53, 71, 73] 71
[73] 73
not exist
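The recursive version copies part of the list on every slice. A common variation, sketched here with the hypothetical names binary_search and contains, keeps two indices instead; the standard library's bisect module offers the same lookup ready-made.

```python
from bisect import bisect_left

def binary_search(seq, target):
    """Return the index of target in the sorted sequence seq, or -1 if absent."""
    low, high = 0, len(seq) - 1
    while low <= high:
        mid = (low + high) // 2
        if seq[mid] < target:
            low = mid + 1      # target can only be in the right half
        elif seq[mid] > target:
            high = mid - 1     # target can only be in the left half
        else:
            return mid
    return -1

def contains(seq, target):
    """Membership test built on bisect_left from the standard library."""
    i = bisect_left(seq, target)
    return i < len(seq) and seq[i] == target

l = [1,2,10,33,53,71,73,75,77,85,101,201,202,999,11111]
print(binary_search(l, 77), binary_search(l, 72))
print(contains(l, 77), contains(l, 72))
```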

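IV. Modules

What is a module? A module is a file containing Python definitions and statements; the file name is the module name plus the .py suffix.

Why use modules? If you quit the Python interpreter and start it again, the functions and variables you defined are lost. So we normally write programs in files to keep them, and run them with python test.py when needed; test.py is then called a script. As a program grows, we split it into several files to keep the structure clear and manageable. These files can be run as scripts, and they can also be imported as modules into other modules, so their functionality is reused.

How are modules used? First way: run the .py file directly as a script. Second way: import it as a module.

import:

Create a directory named 模块与包 (modules and packages) and in it create spam.py with the following code: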

#spam.py
print('from the spam.py')
money=1000
def read1():
    print('spam->read1->money',money)
def read2():
    print('spam->read2 calling read')
    read1()
def change():
    global money
    money=0

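In the same directory create test.py with the following code:

import spam  # like def and class, the import statement defines a name
# running this prints from the spam.py, which shows that importing a module executes its code

# access a name defined in the imported module, for example money
money = 10  # defining money in this file does not affect spam.money, which keeps the module's value
print(spam.money)  # use the spam. prefix
print(spam.read1)  # the function object (shown with its memory address), which can then be called
spam.read1()
def read1():
    print('from test.py')
spam.read2()  # read2 in spam.py calls spam's own read1, not the read1 defined above
spam.change()
print(money)  # this is the money of the current file (10), not the one in the module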

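What import does:
1. It creates a new namespace.
2. It executes the module file's code with that namespace as the global namespace.
3. It binds the name spam to the namespace created for spam.py.
If the name of a module is very long, you can give it an alias: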

import spam as x
print(x.money)

Several modules can be imported on one line, separated by commas, e.g. import spam,os,sys. This is discouraged; one import per line is clearer.

from ...import...

from spam import money
print(money)
Output:
from the spam.py
1000

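With this form the (module name.) prefix is no longer needed; the name can be used directly.

What from ... import does:
1. It creates a new namespace.
2. It executes the module file's code with that namespace as the global namespace.
3. It does not bind the module name spam; it binds the imported names from spam.py's namespace directly in the current file.

Advantage of this form: convenient, no prefix. Disadvantage: the names easily clash with names in the current file.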

How do we get a function from spam.py?

from spam import money,read1
print(money)
read1()
Output:
from the spam.py
1000
spam->read1->money 1000

What does the following code print?

from spam import money,read1
money = 10
print(money)
Output:
from the spam.py
10

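It prints the money of the current file. The import first bound the name money to the module's value; the later assignment rebinds money to 10.

What does the following code print?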

from spam import money,read1,read2,change
def read1():
    print('=============from test.py read1')
read2()
Output:
from the spam.py
spam->read2 calling read
spam->read1->money 1000

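This has nothing to do with how the name was imported: a function always runs in the namespace of the file where it was defined.

Consider this: if spam.py contains more than thirty functions, do we have to list every one in the import? Importing the module is one option, but what if we want neither the prefix nor a long list of names? Write from spam import *. A special variable controls what * imports: define __all__=['money','read1'] anywhere at the top level of the module; the list entries must be strings. Then from spam import * imports only the two names listed in __all__.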
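The effect of __all__ can be checked without creating files. This sketch builds a throwaway module object in memory (the name demo_all_mod is hypothetical and stands in for spam.py) and runs a star import against it.

```python
import sys
import types

source = '''
__all__ = ['money', 'read1']
money = 1000
def read1():
    return money
def read2():
    return 'not exported'
'''

demo = types.ModuleType('demo_all_mod')   # stand-in for spam.py
exec(source, demo.__dict__)               # run the module body in its own namespace
sys.modules['demo_all_mod'] = demo        # make it importable by name

namespace = {}
exec('from demo_all_mod import *', namespace)
imported = sorted(name for name in namespace if not name.startswith('__'))
print(imported)

del sys.modules['demo_all_mod']           # clean up the demo module
```

Only the names listed in __all__ arrive in the importing namespace; read2 stays reachable through the module object but is not copied by the star import.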

from...import also supports as, to give the imported name an alias

from spam import read1 as read  # alias for read1

from...import can also span several lines

from spam import (read1,
                    read2,
                    money)

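As mentioned earlier, a Python file can be used in two ways: run directly as a script, or imported as a module from another file.

The built-in variable __name__ tells which of the two is happening.

# spam.py run as a script: __name__ == '__main__'
# spam.py imported as a module: __name__ == the module name

print('当前文件的用途是：',__name__)  # the message means: this file is being used as
Output when run as a script:
当前文件的用途是： __main__
Output when imported:
当前文件的用途是： spam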
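The usual way to put __name__ to work is a guard at the bottom of the file. A minimal sketch, in which the function main is a hypothetical entry point:

```python
def main():
    """Hypothetical entry point; put the script's real work here."""
    return 'running as a script'

if __name__ == '__main__':
    # Executed when the file is run directly, skipped when it is imported
    print(main())
```

With this guard the file can be imported for its functions without triggering the script behaviour.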

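V. The module search path

What does importing a module have to do with paths? Importing means executing a file, and a file has a path; if the file cannot be found the import fails. So let us look at the module search path.

When I import a module, where is it looked for? First among the modules already loaded in memory (sys.modules), then among the built-in modules, and finally in the directories listed in sys.path.

A demonstration of the in-memory lookup. With spam.py and test2.py in place, put the code below in test2.py. It imports time and spam; run test2.py and delete spam.py during the 10-second sleep. The second import still works without an error, which shows that the spam module is already in memory.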

#!/usr/bin/python
# -*- coding:utf-8 -*-
import time
import spam
time.sleep(10)
import spam as aa
print(aa.money)
Output:
from the spam.py
当前文件的用途是： spam
1000

Module search order: memory ---> built-in -----> sys.path
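The "memory first" step corresponds to the sys.modules dictionary. The sketch below repeats the delete-the-file experiment automatically, using a temporary directory and a hypothetical module name, demo_cached_mod.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, 'demo_cached_mod.py'), 'w', encoding='utf-8') as fh:
        fh.write('money = 1000\n')
    sys.path.insert(0, tmp)          # make the temporary directory searchable
    importlib.invalidate_caches()
    try:
        first = importlib.import_module('demo_cached_mod')
    finally:
        sys.path.remove(tmp)
# At this point the directory and the source file no longer exist.

in_memory = 'demo_cached_mod' in sys.modules
second = importlib.import_module('demo_cached_mod')  # served from sys.modules
print(in_memory, second is first, second.money)
```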

What is sys.path?

import sys
print(sys.path)
Output:
['D:\\python2017\\20170423\\s17_zc\\day05\\模块与包', 
'D:\\python2017\\20170423\\s17_zc', 
'D:\\tools\\python\\python35.zip', 
'D:\\tools\\python\\DLLs', 
'D:\\tools\\python\\lib', 
'D:\\tools\\python', 
'D:\\tools\\python\\lib\\site-packages']

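Problem: after typing spam. in PyCharm, the completions shown are not the contents of spam.py (see figure). Why?

These suggestions come from PyCharm itself: the directory of spam.py is not on the search path PyCharm knows about. To have it show the contents of spam.py: spam.py is in the directory 模块与包, so right-click that directory:

As the figure shows, spam is no longer flagged as an error. In effect this just adds the directory to the search path.

Note: if the module to import lives in another directory, not the directory of the importing file, its directory has to be added to sys.path, as follows:

Now test2.py runs and prints the correct result. append puts the directory at the end of sys.path; to have it searched first, insert it at the front instead: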

sys.path.insert(0,r'D:\python2017\20170423\s17_zc\day05\dir1')
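A minimal sketch of the two options side by side. It only manipulates the sys.path list, so the directory from the text does not need to exist for the demonstration; the variable names are assumptions.

```python
import sys

extra_dir = r'D:\python2017\20170423\s17_zc\day05\dir1'  # directory named in the text

sys.path.append(extra_dir)                 # appended: searched last
appended_last = sys.path[-1] == extra_dir
sys.path.remove(extra_dir)

sys.path.insert(0, extra_dir)              # inserted at index 0: searched first
inserted_first = sys.path[0] == extra_dir
sys.path.remove(extra_dir)                 # leave sys.path as it was

print(appended_last, inserted_first)
```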

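Note: importing spam.py produced a file, which we can see with a command:

The folder contains:

spam.cpython-35.pyc is a bytecode file. When is it created? Only when the file is imported. On later imports the source does not need to be compiled again (as long as it has not changed); the bytecode is read directly, which speeds up the import.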
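Where the bytecode file goes can be asked of the interpreter itself. A small sketch using importlib.util.cache_from_source; the tag in the file name depends on the Python version that runs it, so it will differ from cpython-35 on newer interpreters.

```python
import importlib.util
import sys

pyc_path = importlib.util.cache_from_source('spam.py')  # where the .pyc for spam.py would be written
tag = sys.implementation.cache_tag                       # for example 'cpython-35' on Python 3.5

print(pyc_path)
print(tag)
```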

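VI. Packages

A package is a way of structuring Python's module namespace by using dotted module names ('.module').

1. Whether with import or from...import, whenever a dot appears in the import statement itself (not at the point of use), pay attention: this is import syntax that exists only for packages.

2. A package works at the directory (folder) level; the folder groups .py files (a package is essentially a directory containing an __init__.py file).

3. When a file is imported, the names in the resulting namespace come from that file. When a package is imported, the names likewise come from a file: the __init__.py in the package. Importing a package is essentially importing that file.

First build the package with the following directory structure: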

glance/           #Top-level package
|--__init__.py     #Initialize the glance package
|-- api            #Subpackage for api
|   |--__init__.py
|   |-- policy.py
|   |-- versions.py
|-- cmd            #Subpackage for cmd
|   |--__init__.py
|   |-- manage.py
|-- db            #Subpackage for db
    |--__init__.py
    |-- models.py

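The resulting directory is shown in the figure:

glance and test.py are in the same directory. To test.py, glance is just a module, one that happens to contain many smaller modules. When test.py imports glance, Python looks in memory first, then among the built-ins, and when neither has it, in sys.path, where it is found in the current directory.

With the structure in place, here are the contents of the files: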

# file contents
#policy.py
def get():
    print('from policy.py')
#versions.py
def create_resource(conf):
    print('from version.py:',conf)
#manage.py
def main():
    print('from manage.py')
#models.py
def register_models(engine):
    print('from models.py:',engine)

Run the following in test.py:

import glance
# importing the package glance actually runs the __init__.py inside it
# how can we verify that?
Put print('from glance.__init__.py') in glance's __init__.py; running test.py then prints: from glance.__init__.py

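Run the following in test.py:

import glance
glance.api.policy.get()
# Error: AttributeError: module 'glance' has no attribute 'api'
# Explanation: import glance looks as if it imports glance, but what is actually imported is its __init__.py. Every name accessed as glance.xxx is looked up among the names in __init__.py; api is not defined there, hence the error.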

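Run the following in test.py:

import glance.api.policy  # import the specific module
print("========")  # shows that from glance.__init__.py is printed by the previous line
glance.api.policy.get()  # prints from policy.py

Note: when import glance.api.policy runs, api is a package and glance is a package, so the __init__.py of both glance and api is executed too.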

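Run the following in test.py:

import glance.api.policy.get
glance.api.policy.get()
# Error: ImportError: No module named 'glance.api.policy.get'; 'glance.api.policy' is not a package
# Explanation: in an import statement, whatever is to the left of a dot must be a package. What precedes the dot before policy is a package, but what precedes the dot before get (policy) is a module, not a package, hence the error.

Consider this problem: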

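Put import api in glance's __init__.py and import glance in test.py. Running test.py fails with ImportError: No module named 'api'. Why?

Explanation: follow the flow. test.py runs and imports glance, so glance must be found; it is in the same directory as test.py. It turns out to be a package, so its __init__.py is executed, and that in turn runs import api. Now api must be found, but where? Is the lookup relative to the directory of __init__.py or to the directory of the executed file test.py? sys.path is always that of the file being executed. So the api in import api is searched for in the directory of test.py; api is not there (it sits inside glance), so it is not found. How do we fix this?

Run the following in test.py: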

import glance  # triggers glance's __init__.py
glance.policy.get()
glance.models.register_models('mysql')

Contents of glance's __init__.py:

from glance.api import policy,versions
#from glance import api.versions  # syntax error
from glance.cmd import manage
from glance.db import models

With this, test.py runs and prints:

from policy.py
from models.py: mysql

Note: importing a package really imports its __init__.py file.

Consider this:

If test.py contains:

import glance
glance.get()
glance.register_models('mysql')

how should glance's __init__.py be changed? Like this:

from glance.api.policy import get
from glance.db.models import register_models

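Next requirement: modules inside the package referring to each other. Say policy.py needs create_resource from versions.py, which is in the same api package. How?

Edit policy.py as follows, adding a __main__ block so the file can be run directly (useful for internal testing):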

if __name__ == '__main__':
    import versions
    versions.create_resource('a.conf')
Output: from version.py: a.conf

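Inside a package, avoid the plain import form for sibling modules, because import looks the module up through sys.path, which depends on the directory of the executed file. To import other modules from inside the package, use from...import.

Edit policy.py as follows: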

from glance.api import versions
versions.create_resource('a.conf')

This has to be triggered by running test.py:

import glance.api.policy

What about using something from a different subpackage? Say policy.py needs register_models from models.py in the db package. How?

Edit policy.py as follows:

from glance.db import models
models.register_models('mysql')

Triggered by running test.py:

import glance.api.policy

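A module inside a package can import other modules of its own package in two ways: absolute import and relative import.

Absolute import: use from...import with the top-level package name, e.g. glance, after from. Starting from the top of the package, the target can always be found. The drawback: if the package is renamed or moved, every such import has the wrong path and fails. Hence relative imports. The same requirement again:

policy.py needs register_models from models.py in the db package. How?

Edit policy.py as follows: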

from ..db import models
models.register_models('mysql')

Run test.py as follows:

import glance.api.policy

Advantage: the code does not need to change even if the package path changes. For deeper nesting, keep adding dots (.), one per level up.
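To try a relative import without building the directories by hand, the sketch below writes a tiny package into a temporary directory and imports it. The package name glance_demo is hypothetical (chosen to avoid clashing with any installed glance), and the functions return strings instead of printing so the result can be inspected.

```python
import importlib
import os
import sys
import tempfile

files = {
    'glance_demo/__init__.py': '',
    'glance_demo/api/__init__.py': '',
    'glance_demo/api/policy.py': (
        'from ..db import models\n'          # relative import: up one level, then into db
        '\n'
        'def get():\n'
        '    return models.register_models("mysql")\n'
    ),
    'glance_demo/db/__init__.py': '',
    'glance_demo/db/models.py': (
        'def register_models(engine):\n'
        '    return "from models.py: " + engine\n'
    ),
}

with tempfile.TemporaryDirectory() as root:
    for rel, source in files.items():
        full = os.path.join(root, *rel.split('/'))
        os.makedirs(os.path.dirname(full), exist_ok=True)
        with open(full, 'w', encoding='utf-8') as fh:
            fh.write(source)
    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        policy = importlib.import_module('glance_demo.api.policy')
        result = policy.get()
    finally:
        sys.path.remove(root)

print(result)
```

Renaming glance_demo would not require touching policy.py, because the relative import never mentions the top-level name.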

Question: modules support import *. Is there an equivalent for packages? Yes.

Syntax: from glance.api import *

What * imports is controlled by __all__, defined in __init__.py. A short example:

The __init__.py under glance/api:

__all__=['x']
x = 1 
y = 2

test.py:

from glance.api import *
print(x)
Output:
1

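VII. Using the re module

A regular expression is a combination of symbols with special meanings that describes characters or strings; put another way, a regex is a rule for describing a class of things. In Python it is built in and provided through the re module. A regex pattern is compiled into a series of bytecodes, which are then executed by a matching engine written in C.

Regular expressions are everywhere in everyday life:
    Say we describe something as: four legs
        You might think of four-legged animals, or tables, chairs and so on
    Narrow the description: four legs, alive
        Only four-legged animals remain

To take the fear out of regular expressions, here is an online tool: http://tool.oschina.net/regex/#

Enter text on the left, for example to match email addresses:

In Python, paste in the generated regex: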

#!/usr/bin/python
# -*- coding:utf-8 -*-
import re
s = '''
http://www.baidu.com
1011010101
egon@oldboyedu.com
你好
21213
010-3134
egon@163.com
'''
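res = re.findall(r"[\w!#$%&'*+/=?^_`{|}~-]+(?:\.[\w!#$%&'*+/=?^_`{|}~-]+)*@(?:[\w](?:[\w-]*[\w])?\.)+[\w](?:[\w-]*[\w])?",s)  # the r prefix is there because Python gives \ its own meaning; the pattern is wrapped in double quotes because it contains a single quote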
print(res)

An introduction to the regex syntax:

1) \w: matches a letter, digit or underscore

import re
print(re.findall(r'\w','as213df_*|'))
# Output: ['a', 's', '2', '1', '3', 'd', 'f', '_']
# * and | are not matched

2) \W: matches anything that is not a letter, digit or underscore

import re
print(re.findall(r'\W','as213df_*|'))
# Output: ['*', '|']

Note the result of the following:

import re
print(re.findall(r'a\wb','a_b a3b aEb a*b'))
Output: ['a_b', 'a3b', 'aEb']

3) \s: matches any whitespace character, equivalent to [ \t\n\r\f\v]

import re
print(re.findall(r'\s','a b\nc\td'))
Output:
[' ', '\n', '\t']

4) \S: matches any non-whitespace character

import re
print(re.findall(r'\S','a b\nc\td'))
Output:
['a', 'b', 'c', 'd']

5) \d: matches any digit, equivalent to [0-9]

import re
print(re.findall(r'\d','a123bcdef'))
Output:
['1', '2', '3']

6) \D: matches any non-digit

import re
print(re.findall(r'\D','a123bcdef'))
Output:
['a', 'b', 'c', 'd', 'e', 'f']

7) \n: matches a newline

import re
print(re.findall('\n','a123\nbcdef'))
Output:
['\n']

8) \t: matches a tab

import re
print(re.findall('\t','a123\tbcdef'))
Output:
['\t']

9) ^: matches the start of the string

import re
print(re.findall('^h','hello egon hao123'))
print(re.findall('^h','ello egon hao123'))  # no match gives an empty list
Output:
['h']
[]

10) $: matches the end of the string

import re
print(re.findall('3$','ello egon hao123'))
print(re.findall('3$','ello egon hao123ww'))
Output:
['3']
[]
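By default ^ and $ anchor to the whole string. With the re.M (re.MULTILINE) flag they anchor to every line instead. A short sketch on a hypothetical three-line string:

```python
import re

text = 'hello\nhi\nbye'

start_default = re.findall(r'^h', text)        # only the start of the string
start_multi = re.findall(r'^h', text, re.M)    # the start of every line
end_default = re.findall(r'o$', text)          # the string ends with 'e', not 'o'
end_multi = re.findall(r'o$', text, re.M)      # 'hello' ends a line

print(start_default, start_multi, end_default, end_multi)
```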

11) .: matches any character except a newline

import re
print(re.findall('a.c','abc a1c a*c a|c abd aed a\nc'))  # the newline is not matched
Output:
['abc', 'a1c', 'a*c', 'a|c']
# What if the dot should match a newline as well?
import re
print(re.findall('a.c','abc a1c a*c a|c abd aed a\nc',re.S))  # with re.S the result also includes 'a\nc'

12) [...]: like the dot it matches a single character, but you specify which characters are allowed

import re
print(re.findall('a[12]c','abc a1c a*c a|c abd aed a\nc'))
Output:
['a1c']

13) -: inside [], a hyphen between a start and an end denotes a range, from ... to ...

import re
print(re.findall('a[0-9]c','abc a1c a*c a|c abd aed a\nc'))
Output:
['a1c']

To match a literal -, put it at the start or end of the set:

import re
print(re.findall('a[0-9a-zA-Z*-]c','a1c abc a*c a-c aEc'))
Output:
['a1c', 'abc', 'a*c', 'a-c', 'aEc']

14) [^...]: a ^ at the start of the set negates it

import re
print(re.findall('a[^0-9]c','a1c abc a*c a-c aEc'))
Output:
['abc', 'a*c', 'a-c', 'aEc']

15) * + ? {n,m}: these all express repetition

1)*: matches zero or more occurrences of the preceding expression
import re
print(re.findall('ab*','a'))
print(re.findall('ab*','abbbbbb'))
print(re.findall('ab*','bbbbbb'))
Output:
['a']
['abbbbbb']
[]
2)+: the character to the left of + occurs one or more times, at least once
import re
print(re.findall('ab+','a'))
print(re.findall('ab+','abbbbb'))
print(re.findall('ab+','bbbb'))
Output:
[]
['abbbbb']
[]
# Note the two cases below:
print(re.findall('ab[123]','abbbbb123'))
print(re.findall('ab[123]','ab1 ab2 ab3 abc1'))
# ab[123]+ means ab followed by one or more characters from the set 1, 2, 3
print(re.findall('ab[123]+','ab1111 ab2 ab3 abc1 ab122'))
Output:
[]
['ab1', 'ab2', 'ab3']
['ab1111', 'ab2', 'ab3', 'ab122']
3){}: specifies how many times the character to its left occurs
print(re.findall('ab{3}','ab1 abbbbbbbbb2 abbbb3 ab4 ab122'))
Output:
['abbb', 'abbb']
# a range of repetition counts can also be given
print(re.findall('ab{3,4}','ab1 abbb123 abbbb123 abbbbbt'))
Output:
['abbb', 'abbbb', 'abbbb']
# a minimum number of repetitions for the character after a
print(re.findall('ab{3,}','ab1 abbb123 abbbb123 abbbbbt'))  # Output: ['abbb', 'abbbb', 'abbbbb']
4)?: matches zero or one occurrence of the character to its left
print(re.findall('ab?c','ac abc aec a1c'))  # matches ac and abc
Output:
['ac', 'abc']
5) .* greedy matching: matches any characters, as many as possible
print(re.findall('a.*c','ac abc aec a1c'))  # Output: ['ac abc aec a1c']
6) .*? non-greedy matching
print(re.findall('a.*?c','ac abc aec a1c a'))  # Output: ['ac', 'abc', 'aec', 'a1c']
print(re.findall('a.*?c','ac abc aec a1c a\nc',re.S))  # without re.S the dot does not match \n; with re.S it does. Output: ['ac', 'abc', 'aec', 'a1c', 'a\nc']
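The difference between greedy and non-greedy matching shows most clearly on delimited text. A small sketch on a hypothetical string with two tagged words:

```python
import re

text = '<b>one</b> and <b>two</b>'

greedy = re.findall(r'<b>.*</b>', text)    # .* runs to the last </b>
lazy = re.findall(r'<b>.*?</b>', text)     # .*? stops at the first </b> each time

print(greedy)
print(lazy)
```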

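16) a|b: matches a or b; the right side is tried only if the left side fails

import re
print(re.findall('compan(?:y|ies)','Too many companies have gone bankrupt，next company is my company'))
Output:
['companies', 'company', 'company']

17) () grouping

# () grouping
print(re.findall('ab+123','ababab123'))
print(re.findall('(ab)+123','ababab123'))  # ['ab']: the group holds its last repetition, the ab in the trailing ab123
print(re.findall('(?:ab)+123','ababab123'))  # when the pattern contains a group, findall returns the group's content instead of the whole match; ?: makes the group non-capturing, so the whole match is returned
Output:
['ab123']
['ab']
['ababab123']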
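Two related behaviours are worth seeing side by side: with several groups findall returns tuples, and re.search gives access to both the whole match and each group. A sketch reusing the email addresses from the earlier sample text:

```python
import re

pairs = re.findall(r'(\w+)@(\w+)\.com', 'egon@oldboyedu.com egon@163.com')

m = re.search(r'(ab)+123', 'ababab123')
whole = m.group(0)     # the entire match
last_ab = m.group(1)   # the last repetition captured by the group

print(pairs)
print(whole, last_ab)
```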

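18) \ the escape character

import re
#print(re.findall('a\\c',r'a\c'))  # the regex a\\c would match a\c, but Python first turns the string 'a\\c' into a\c before handing it to re, and \c is not a valid regex escape, so this line raises an exception
print(re.findall(r'a\\c',r'a\c'))  # r tells the interpreter to use a raw string: backslashes in the pattern are left as they are instead of being processed by Python
print(re.findall('a\\\\c',r'a\c'))  # same meaning and same result as the line above: ['a\\c']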
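When the text to match is known literally, re.escape can add the backslashes for you. A minimal sketch:

```python
import re

dot_pattern = re.escape('a.c')                        # the dot is escaped, so it matches only a literal dot
dot_hits = re.findall(dot_pattern, 'a.c abc')

slash_hits = re.findall(re.escape('a\\c'), r'a\c')    # a literal backslash, escaped automatically

print(dot_pattern, dot_hits, slash_hits)
```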

----------------End of this section

posted @ 2017-07-06 17:52 zhaichao
