Python socket进阶多线程/进程

Posted on 2016-03-15 18:04 善恶美丑阅读(6149) 评论(0) 编辑收藏举报

#首先，什么场合下用进程，什么场合下用线程：

　　. 计算密集型的用进程。

　　. IO密集型的用进程。

xSocket语法及相关

Socket Families(地址簇)

socket.AF_UNIX unix本机进程间通信

socket.AF_INET　IPV4　

socket.AF_INET6 IPV6

上面的这些内容代表地址簇，创建socket必须指定，默认为IPV4

Socket Types

socket.SOCK_STREAM #for tcp

socket.SOCK_DGRAM #for udp

socket.SOCK_RAW #原始套接字，普通的套接字无法处理ICMP、IGMP等网络报文，而SOCK_RAW可以；其次，SOCK_RAW也可以处理特殊的IPv4报文；此外，利用原始套接字，可以通过IP_HDRINCL套接字选项由用户构造IP头。

socket.SOCK_RDM #是一种可靠的UDP形式，即保证交付数据报但不保证顺序。SOCK_RAM用来提供对原始协议的低级访问，在需要执行某些特殊操作时使用，如发送ICMP报文。SOCK_RAM通常仅限于高级用户或管理员运行的程序使用。

socket.SOCK_SEQPACKET #废弃了

Socket 方法

`socket.socket`(family=AF_INET, type=SOCK_STREAM, proto=0, fileno=None)

Create a new socket using the given address family, socket type and protocol number. The address family should be AF_INET (the default), AF_INET6, AF_UNIX, AF_CAN or AF_RDS. The socket type should beSOCK_STREAM (the default), SOCK_DGRAM, SOCK_RAW or perhaps one of the other SOCK_ constants. The protocol number is usually zero and may be omitted or in the case where the address family is AF_CAN the protocol should be one of CAN_RAW or CAN_BCM. If fileno is specified, the other arguments are ignored, causing the socket with the specified file descriptor to return. Unlike socket.fromfd(), fileno will return the same socket and not a duplicate. This may help close a detached socket using socket.close().

socket.socketpair([family[, type[, proto]]])

Build a pair of connected socket objects using the given address family, socket type, and protocol number. Address family, socket type, and protocol number are as for the socket() function above. The default family is AF_UNIX if defined on the platform; otherwise, the default is AF_INET.

（）

socket.create_connection(address[, timeout[, source_address]])

Connect to a TCP service listening on the Internet address (a 2-tuple (host, port)), and return the socket object. This is a higher-level function than socket.connect(): if host is a non-numeric hostname, it will try to resolve it for both AF_INET and AF_INET6, and then try to connect to all possible addresses in turn until a connection succeeds. This makes it easy to write clients that are compatible to both IPv4 and IPv6.

Passing the optional timeout parameter will set the timeout on the socket instance before attempting to connect. If no timeout is supplied, the global default timeout setting returned by getdefaulttimeout() is used.

If supplied, source_address must be a 2-tuple (host, port) for the socket to bind to as its source address before connecting. If host or port are ‘’ or 0 respectively the OS default behavior will be used.

socket.getaddrinfo(host, port, family=0, type=0, proto=0, flags=0) #获取要连接的对端主机地址

sk.bind(address)

　　s.bind(address) 将套接字绑定到地址。address地址的格式取决于地址族。在AF_INET下，以元组（host,port）的形式表示地址。

sk.listen(backlog)

　　开始监听传入连接。backlog指定在拒绝连接之前，可以挂起的最大连接数量。

backlog等于5，表示内核已经接到了连接请求，但服务器还没有调用accept进行处理的连接个数最大为5
这个值不能无限大，因为要在内核中维护连接队列

sk.setblocking(bool)

　　是否阻塞（默认True），如果设置False，那么accept和recv时一旦无数据，则报错。

sk.accept()

　　接受连接并返回（conn,address）,其中conn是新的套接字对象，可以用来接收和发送数据。address是连接客户端的地址。

　　接收TCP 客户的连接（阻塞式）等待连接的到来

sk.connect(address)

　　连接到address处的套接字。一般，address的格式为元组（hostname,port）,如果连接出错，返回socket.error错误。

sk.connect_ex(address)

　　同上，只不过会有返回值，连接成功时返回 0 ，连接失败时候返回编码，例如：10061

sk.close()

　　关闭套接字

sk.recv(bufsize[,flag])

　　接受套接字的数据。数据以字符串形式返回，bufsize指定最多可以接收的数量。flag提供有关消息的其他信息，通常可以忽略。

sk.recvfrom(bufsize[.flag])

　　与recv()类似，但返回值是（data,address）。其中data是包含接收数据的字符串，address是发送数据的套接字地址。

sk.send(string[,flag])

　　将string中的数据发送到连接的套接字。返回值是要发送的字节数量，该数量可能小于string的字节大小。即：可能未将指定内容全部发送。

sk.sendall(string[,flag])

　　将string中的数据发送到连接的套接字，但在返回之前会尝试发送所有数据。成功返回None，失败则抛出异常。

内部通过递归调用send，将所有内容发送出去。

sk.sendto(string[,flag],address)

　　将数据发送到套接字，address是形式为（ipaddr，port）的元组，指定远程地址。返回值是发送的字节数。该函数主要用于UDP协议。

sk.settimeout(timeout)

　　设置套接字操作的超时期，timeout是一个浮点数，单位是秒。值为None表示没有超时期。一般，超时期应该在刚创建套接字时设置，因为它们可能用于连接的操作（如 client 连接最多等待5s ）

sk.getpeername()

　　返回连接套接字的远程地址。返回值通常是元组（ipaddr,port）。

sk.getsockname()

　　返回套接字自己的地址。通常是一个元组(ipaddr,port)

sk.fileno()

　　套接字的文件描述符

socket.sendfile(file, offset=0, count=None)

发送文件，但目前多数情况下并无什么卵用。

SocketServer （每一个连接过来都创建一个线程）

The socketserver module simplifies the task of writing network servers.

There are four basic concrete server classes:

class socketserver.TCPServer(server_address, RequestHandlerClass, bind_and_activate=True): This uses the Internet TCP protocol, which provides for continuous streams of data between the client and server. If bind_and_activate is true, the constructor automatically attempts to invoke server_bind() andserver_activate(). The other parameters are passed to the BaseServer base class.

class socketserver.UDPServer(server_address, RequestHandlerClass, bind_and_activate=True): This uses datagrams, which are discrete packets of information that may arrive out of order or be lost while in transit. The parameters are the same as for TCPServer.

class socketserver.UnixStreamServer(server_address, RequestHandlerClass, bind_and_activate=True) 本机
class socketserver.UnixDatagramServer(server_address, RequestHandlerClass,bind_and_activate=True) 本机: These more infrequently used classes are similar to the TCP and UDP classes, but use Unix domain sockets; they’re not available on non-Unix platforms. The parameters are the same as for TCPServer.

These four classes process requests synchronously; each request must be completed before the next request can be started. This isn’t suitable if each request takes a long time to complete, because it requires a lot of computation, or because it returns a lot of data which the client is slow to process. The solution is to create a separate（独立） process or thread（线程） to handle each request; the ForkingMixIn and ThreadingMixIn mix-in classes can be used to support asynchronous behaviour.

There are five classes in an inheritance diagram, four of which represent synchronous servers of four types:

这里有五个类，在下面的继承图里

+------------+
| BaseServer |
+------------+
      |
      v
+-----------+        +------------------+
| TCPServer |------->| UnixStreamServer |
+-----------+        +------------------+
      |
      v
+-----------+        +--------------------+
| UDPServer |------->| UnixDatagramServer |
+-----------+        +--------------------+

Note that UnixDatagramServer derives from UDPServer, not from UnixStreamServer — the only difference between an IP and a Unix stream server is the address family, which is simply repeated in both Unix server classes.

class socketserver.ForkingMixIn

class socketserver.ThreadingMixIn

Forking and threading versions of each type of server can be created using these mix-in classes. For instance, ThreadingUDPServer is created as follows:

class ThreadingUDPServer(ThreadingMixIn, UDPServer):
    pass

The mix-in class comes first, since it overrides a method defined in UDPServer. Setting the various attributes also changes the behavior of the underlying server mechanism.

常用的为下面的四个

class socketserver.ForkingTCPServer
class socketserver.ForkingUDPServer
class socketserver.ThreadingTCPServer
class socketserver.ThreadingUDPServer
　　These classes are pre-defined using the mix-in classes.

Request Handler Objects

class socketserver.BaseRequestHandler

This is the superclass of all request handler objects. It defines the interface, given below. A concrete request handler subclass must define a new handle() method, and can override any of the other methods. A new instance of the subclass is created for each request.

setup()

Called before the handle() method to perform any initialization actions required. The default implementation does nothing.

handle() 所有客户端的请求都是在handle处理

This function must do all the work required to service a request. The default implementation does nothing. Several instance attributes are available to it; the request is available as self.request; the client address as self.client_address; and the server instance as self.server, in case it needs access to per-server information.

The type of self.request is different for datagram or stream services. For stream services,self.request is a socket object; for datagram services, self.request is a pair of string and socket.

finish()

Called after the handle() method to perform any clean-up actions required. The default implementation does nothing. If setup() raises an exception, this function will not be called.

写一个简单的socketserver 和客户端client的事例

import socketserver
class MyTcpServer(socketserver.BaseRequestHandler):
    def handle(self): 
        while True:
            print("NEW:",self.client_address)
            Server_recv=self.request.recv(1024)
            print("client：",Server_recv.decode())
            self.request.send(Server_recv)

if __name__ == "__main__":
    HOST,PORT = "localhost",5007
 # 把刚才写的类当作一个参数传给ThreadingTCPServer这个类，下面的代码就创建了一个多线程socket server   Server=socketserver.ThreadingTCPServer((HOST,PORT),MyTcpServer)
# 启动这个server,这个server会一直运行，除非按ctrl-C停止
    Server.serve_forever()
#--------------看与之前写的单线程的socket服务端有什么区别呢
import socket

ip_port = ('127.0.0.1',9999)

sk = socket.socket()
sk.bind(ip_port)
sk.listen(5)

while True:
    print ('server waiting...')
    conn,addr = sk.accept()#conn表示实例，addr包含地址端口
    client_data = conn.recv(1024)
    print (str(client_data,'utf8'))
    conn.sendall(bytes('不要回答,不要回答,不要回答','utf8'))
    conn.close()
#少了等待用户连接，获取用户的地址和端口以及创建实例的操作，
socketserver用Server.serve_forever()实现了类似用户连接的操作，不同的是可以多用户连接，socketserver利用创建类在handle方法里实现用户连接自动调用对客户端的操作
# 之前的单线程的socket server 只能一个用户连接，其余的连接都被阻塞了，等待上一次连接释放 才可以进行下一次连接

socketserver

client的与上一节的socketclient并无不同之处
import socket
HOST,PORT=("localhost",5007)
client = socket.socket()
client.connect((HOST,PORT))
while True:
    User_input = input(">>>:")
    client.send(bytes(User_input,"utf8"))
    Server_recv=client.recv(1024)
    print("server:",Server_recv.decode())

socketclient

上述类似于server = socketserver.ThreadingTCPServer((HOST, PORT), MyTCPHandler)这种， socketserver.ThreadingTCPServer是系统本身定义好的一个方法，我们只需要传入参数就行，参数包含了连接地址，端口以及我们自定义的方法，自定义方法内必须指定handle函数，一般我们在handle函数里实现用户的交互

线程与进程

Python GIL(Global Interpreter Lock)　　

In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython’s memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.)

上面的核心意思就是，无论你启多少个线程，你有多少个cpu, Python在执行的时候会淡定的在同一时刻只允许一个线程运行，擦。。。

首先需要明确的一点是GIL并不是Python的特性，它是在实现Python解析器(CPython)时所引入的一个概念。就好比C++是一套语言（语法）标准，但是可以用不同的编译器来编译成可执行代码。有名的编译器例如GCC，INTEL C++，Visual C++等。Python也一样，同样一段代码可以通过CPython，PyPy，Psyco等不同的Python执行环境来执行。像其中的JPython就没有GIL。然而因为CPython是大部分环境下默认的Python执行环境。所以在很多人的概念里CPython就是Python，也就想当然的把GIL归结为Python语言的缺陷。所以这里要先明确一点：GIL并不是Python的特性，Python完全可以不依赖于GIL

这篇文章透彻的剖析了GIL对python多线程的影响，强烈推荐看一下：http://www.dabeaz.com/python/UnderstandingGIL.pdf

很多同学都听说过，现代操作系统比如Mac OS X，UNIX，Linux，Windows等，都是支持“多任务”的操作系统。

什么叫“多任务”呢？简单地说，就是操作系统可以同时运行多个任务。打个比方，你一边在用浏览器上网，一边在听MP3，一边在用Word赶作业，这就是多任务，至少同时有3个任务正在运行。还有很多任务悄悄地在后台同时运行着，只是桌面上没有显示而已。

现在，多核CPU已经非常普及了，但是，即使过去的单核CPU，也可以执行多任务。由于CPU执行代码都是顺序执行的，那么，单核CPU是怎么执行多任务的呢？

答案就是操作系统轮流让各个任务交替执行，任务1执行0.01秒，切换到任务2，任务2执行0.01秒，再切换到任务3，执行0.01秒……这样反复执行下去。表面上看，每个任务都是交替执行的，但是，由于CPU的执行速度实在是太快了，我们感觉就像所有任务都在同时执行一样。

真正的并行执行多任务只能在多核CPU上实现，但是，由于任务数量远远多于CPU的核心数量，所以，操作系统也会自动把很多任务轮流调度到每个核心上执行。

对于操作系统来说，一个任务就是一个进程（Process），比如打开一个浏览器就是启动一个浏览器进程，打开一个记事本就启动了一个记事本进程，打开两个记事本就启动了两个记事本进程，打开一个Word就启动了一个Word进程。

有些进程还不止同时干一件事，比如Word，它可以同时进行打字、拼写检查、打印等事情。在一个进程内部，要同时干多件事，就需要同时运行多个“子任务”，我们把进程内的这些“子任务”称为线程（Thread）。

由于每个进程至少要干一件事，所以，一个进程至少有一个线程。当然，像Word这种复杂的进程可以有多个线程，多个线程可以同时执行，多线程的执行方式和多进程是一样的，也是由操作系统在多个线程之间快速切换，让每个线程都短暂地交替运行，看起来就像同时执行一样。当然，真正地同时执行多线程需要多核CPU才可能实现。

我们前面编写的所有的Python程序，都是执行单任务的进程，也就是只有一个线程。如果我们要同时执行多个任务怎么办？

有两种解决方案：

一种是启动多个进程，每个进程虽然只有一个线程，但多个进程可以一块执行多个任务。

还有一种方法是启动一个进程，在一个进程内启动多个线程，这样，多个线程也可以一块执行多个任务。

当然还有第三种方法，就是启动多个进程，每个进程再启动多个线程，这样同时执行的任务就更多了，当然这种模型更复杂，实际很少采用。

总结一下就是，多任务的实现有3种方式：

多进程模式；
多线程模式；
多进程+多线程模式。
同时执行多个任务通常各个任务之间并不是没有关联的，而是需要相互通信和协调，有时，任务1必须暂停等待任务2完成后才能继续执行，有时，任务3和任务4又不能同时执行，所以，多进程和多线程的程序的复杂度要远远高于我们前面写的单进程单线程的程序。

因为复杂度高，调试困难，所以，不是迫不得已，我们也不想编写多任务。但是，有很多时候，没有多任务还真不行。想想在电脑上看电影，就必须由一个线程播放视频，另一个线程播放音频，否则，单线程实现的话就只能先把视频播放完再播放音频，或者先把音频播放完再播放视频，这显然是不行的。

Python既支持多进程，又支持多线程，我们会讨论如何编写这两种多任务程序。

小结

线程是最小的执行单元，而进程由至少一个线程组成。如何调度进程和线程，完全由操作系统决定，程序自己不能决定什么时候执行，执行多长时间。

多进程和多线程的程序涉及到同步、数据共享的问题，编写起来更复杂。

线程与进程介绍

如果还是不理解，复制url访问看图文并茂！

http://www.ruanyifeng.com/blog/2013/04/processes_and_threads.html

总结

1、线程共享创建它的进程的地址空间,进程有自己的地址空间

2、线程可以访问进程所有的数据，线程可以相互访问

3、线程之间的数据是独立的

4、子进程复制线程的数据

5、子进程启动后是独立的，父进程只能杀掉子进程，而不能进行数据交换

6、修改线程中的数据，都是会影响其他的线程，而对于进程的更改，不会影响子进程

多线程实例：

创建多线程有两种方式，一种是直接调用，一种是继承式调用，即：通过继承Thread类，重写它的run方法；另一种是创建一个threading.Thread对象，在它的初始化函数（__init__）中将可调用对象作为参数传入。下面分别举例说明

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import threading
import time

def sayhi(num): #定义每个线程要运行的函数
    print("running on number:%s" %num)
    time.sleep(3)
if __name__ == '__main__':
    '''
    t1 = threading.Thread(target=sayhi,args=(1,)) #生成一个线程实例
    t2 = threading.Thread(target=sayhi,args=(2,)) #生成另一个线程实例
    t1.start() #启动线程
    t2.start() #启动另一个线程
    print(t1.getName()) #获取线程名
    print(t2.getName())
    t1.join() #t2.wait()
    t2.join() #t2.wait()'''
    t_list = [] # 定义一个空列表，每启动一个实例，将实例添加到列表
    for i in range(10):
        t = threading.Thread(target=sayhi,args=[i,])
        t.start()
        t_list.append(t)
    for i in t_list: 循环列表 ，等待列表中的每一个线程执行完毕
        i.join()
    #t.join() 这种做法只是等待最后一个线程执行完毕了，不等待其他线程
    print('---main---')

直接调用方式

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import threading
import time

class MyThread(threading.Thread):
    def __init__(self,num):
        super(MyThread,self).__init__() 新式类继承原有方法写法
        #threading.Thread.__init__(self)#经典类写法
        self.num = num

    def run(self):#定义每个线程要运行的函数
        print("running on number:%s" %self.num)
        time.sleep(3)

if __name__ == '__main__':

    t1 = MyThread(1)
    t2 = MyThread(2)
    t1.start()
    t2.start()

继承式调用

Thread类还定义了以下常用方法与属性：

Thread.getName() 获取线程名称
Thread.setName() 设置线程名称
Thread.name

Thread.ident 获取线程的标识符。线程标识符是一个非零整数，只有在调用了start()方法之后该属性才有效，否则它只返回None

判断线程是否是激活的（alive）。从调用start()方法启动线程，到run()方法执行完毕或遇到未处理异常而中断这段时间内，线程是激活的

Thread.is_alive()
Thread.isAlive()

Thread.join([timeout]) 调用Thread.join将会使主调线程堵塞，直到被调用线程运行结束或超时。参数timeout是一个数值类型，表示超时时间，如果未提供该参数，那么主调线程将一直堵塞到被调线程结束（测试有疑问）

Join & Daemon

主线程A中，创建了子线程B，并且在主线程A中调用了B.setDaemon(),这个的意思是，把主线程A设置为守护线程，这时候，要是主线程A执行结束了，就不管子线程B是否完成,一并和主线程A退出.这就是setDaemon方法的含义，这基本和join是相反的。此外，还有个要特别注意的：必须在start() 方法调用之前设置，如果不设置为守护线程，程序会被无限挂起。

import time
import threading

def run(n):

    print('[%s]------running----\n' % n)
    time.sleep(2)
    print('--done--')

def main():
    for i in range(5):
        t = threading.Thread(target=run,args=[i,])
        t.start()
        # t.join()
        print('starting thread', t.getName())


m = threading.Thread(target=main,args=[])
m.setDaemon(True) #将主线程设置为Daemon线程,它退出时,其它子线程会同时退出,不管是否执行完任务
m.start()
# m.join()
print("---main thread done----")
”“”
执行效果，首先m.start()启动守护进程，不join main函数里又启动了5个子线程,调用run函数 run函数里sleep了2秒钟，因此程序执行后会直接打印---main thread done----或者在加上running..不会再有其他"""

Daemon实例

线程锁(互斥锁Mutex)

一个进程下可以启动多个线程，多个线程共享父进程的内存空间，也就意味着每个线程可以访问同一份数据，此时，如果2个线程同时要修改同一份数据，会出现什么状况？

import time
import threading
num = 100  #设定一个共享变量

def addNum():
    global num #在每个线程中都获取这个全局变量
    print('--get num:',num )
    time.sleep(2)
    num  -=1 #对此公共变量进行-1操作

thread_list = []
for i in range(100):
    t = threading.Thread(target=addNum)
    t.start()
    thread_list.append(t)

for t in thread_list: #等待所有线程执行完毕
    t.join()


print('final num:', num )
""" 获取到的get num每次都是100 是因为100个线程同事执行，获取的都是100，线程内部都排队去吃 每次取的值-1，但是在2.7中就会出错，最后print final num 的值不为1
"""

正常来讲，这个num结果应该是0，但在python 2.7上多运行几次，会发现，最后打印出来的num结果不总是0，为什么每次运行的结果不一样呢？哈，很简单，假设你有A,B两个线程，此时都要对num 进行减1操作，由于2个线程是并发同时运行的，所以2个线程很有可能同时拿走了num=100这个初始变量交给cpu去运算，当A线程去处完的结果是99，但此时B线程运算完的结果也是99，两个线程同时CPU运算的结果再赋值给num变量后，结果就都是99。那怎么办呢？很简单，每个线程在要修改公共数据时，为了避免自己在还没改完的时候别人也来修改此数据，可以给这个数据加一把锁，这样其它线程想修改此数据时就必须等待你修改完毕并把锁释放掉后才能再访问此数据。

*注：不要在3.x上运行，不知为什么，3.x上的结果总是正确的，可能是自动加了锁

加锁版本

import time
import threading
 
def addNum():
    global num #在每个线程中都获取这个全局变量
    print('--get num:',num )
    time.sleep(1)
    lock.acquire() #修改数据前加锁
    num  -=1 #对此公共变量进行-1操作
    lock.release() #修改后释放
 
num = 100  #设定一个共享变量
thread_list = []
lock = threading.Lock() #生成全局锁
for i in range(100):
    t = threading.Thread(target=addNum)
    t.start()
    thread_list.append(t)
 
for t in thread_list: #等待所有线程执行完毕
    t.join()
 
print('final num:', num )

RLock（递归锁）

说白了就是在一个大锁中还要再包含子锁

import threading,time
 
def run1():
    print("grab the first part data")
    lock.acquire()
    global num
    num +=1
    lock.release()
    return num
def run2():
    print("grab the second part data")
    lock.acquire()
    global  num2
    num2+=1
    lock.release()
    return num2
def run3():
    lock.acquire()
    res = run1()
    print('--------between run1 and run2-----')
    res2 = run2()
    lock.release()
    print(res,res2)
 
 
if __name__ == '__main__':
 
    num,num2 = 0,0
    lock = threading.RLock()
    for i in range(10):
        t = threading.Thread(target=run3)
        t.start()
 
while threading.active_count() != 1:
    print(threading.active_count())
else:
    print('----all threads done---')
    print(num,num2)

Rlock与Lock的区别：

RLock允许在同一线程中被多次acquire。而Lock却不允许这种情况。否则会出现死循环，程序不知道解哪一把锁。注意：如果使用RLock，那么acquire和release必须成对出现，即调用了n次acquire，必须调用n次的release才能真正释放所占用的锁

Semaphore(信号量)

互斥锁同时只允许一个线程更改数据，而Semaphore是同时允许一定数量的线程更改数据，比如厕所有3个坑，那最多只允许3个人上厕所，后面的人只能等里面有人出来了才能再进去

import threading,time

def run(n):
    semaphore.acquire()
    time.sleep(1)
    print("run the thread: %s\n" %n)
    semaphore.release()

if __name__ == '__main__':

    num= 0
    semaphore  = threading.BoundedSemaphore(3) #最多允许3个线程同时运行
    for i in range(20):
        t = threading.Thread(target=run,args=(i,))
        t.start()

while threading.active_count() != 1:
    pass #print threading.active_count()
else:
    print('----all threads done---')
    print(num)

Events

Python提供了Event对象用于线程间通信，它是由线程设置的信号标志，如果信号标志位真，则其他线程等待直到信号接触。

Event对象实现了简单的线程通信机制，它提供了设置信号，清除信号，等待等用于实现线程间的通信。

event = threading.Event() 创建一个event

1 设置信号

event.set()

使用Event的set（）方法可以设置Event对象内部的信号标志为真。Event对象提供了isSet（）方法来判断其内部信号标志的状态。当使用event对象的set（）方法后，isSet（）方法返回真

2 清除信号

event.clear()

使用Event对象的clear（）方法可以清除Event对象内部的信号标志，即将其设为假，当使用Event的clear方法后，isSet()方法返回假

3 等待

event.wait()

Event对象wait的方法只有在内部信号为真的时候才会很快的执行并完成返回。当Event对象的内部信号标志位假时，则wait方法一直等待到其为真时才返回。也就是说必须set新号标志位真

下面用红绿灯事件来讲解event

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import threading,time
import random
def light(): # 设定一个灯函数
    if not event.isSet(): # 检查event有没有被设定 ，如果没设定 就设定
        event.set() #wait就不阻塞 #绿灯状态
    count = 0
    while True:
        if count < 10: # 绿灯
            print('\033[42;1m--green light on---\033[0m')
        elif count <13:# 黄灯3秒
            print('\033[43;1m--yellow light on---\033[0m')
        elif count <20:#7秒红灯
            if event.isSet():
                event.clear() #清楚setwait阻塞
            print('\033[41;1m--red light on---\033[0m')
        else:
            count = 0
            event.set() #打开绿灯
        time.sleep(1)
        count +=1

def car(n): #no bug version
    while 1:
        time.sleep(1) #sleep(1)在这里是让车出现的慢一点 没其他作用
        if  event.isSet(): #绿灯
            print("car [%s] is running.." % n)
        else:#红灯
            print("car [%s] is waiting for the red light.." %n)
            event.wait()

if __name__ == '__main__':
    event = threading.Event() # 创建event事件
    Light = threading.Thread(target=light)
    Light.start()
    for i in range(3):
        t = threading.Thread(target=car,args=(i,))
        t.start()
"""
首先创建一个事件，启动灯事件的实例
灯函数
第一次判断event有没有被设定，如果没设定就设定，第一次为绿灯
设定一个随机数count 用来定义灯的时间，否则每次count+1大于20了就一直为红灯了
循环每次count+1 设定红绿灯的判断值，大于20就重新设定count值，再次event.set()打开绿灯
车函数
循环，每过一秒出现一辆车，总共为三辆车，下面的for循环创建了三个车的实例
判断event.set()有没有被设定，即：count的值大于13小于20
设定了即为绿或者黄灯，否则为红灯，event.wait在这里的作用是车等待，即不在循环了 直到event被设定
"""

红绿灯event

多进程multiprocessing

multiprocessing is a package that supports spawning processes using an API similar to the threading module. The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine. It runs on both Unix and Windows.

多进程模块使用一个API类似线程模块。多处理方案啊提供了本地和远程的并发性，有效的避开了全局解释器GIL锁的线程，因此，multiprocessing允许程序员利用多个处理器在一台特定的机器上，它运行在windows和linux

#!/usr/bin/env python
# -*- coding:utf-8 -*-
from multiprocessing import Process
import os

def info(title):
    print(title)
    print('module name:', __name__)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())
    print("\n\n")

def f(name):
    info('\033[31;1mfunction f\033[0m')
    print('hello', name)

if __name__ == '__main__':
    info('\033[32;1mmain process line\033[0m')
    p = Process(target=info, args=('bob',))
    p2 = Process(target=info, args=('bob',))
    p.start()
    p2.start()
    p.join()

multiprocessing

进程间通讯　　

不同进程间内存是不共享的，要想实现两个进程间的数据交换，可以用以下方法：

Queues

使用方法跟threading里的queue差不多

from multiprocessing import Process, Queue

def f(q):
     q.put([42, None, 'hello'])

if __name__ == '__main__':
    que  = Queue()
    p = Process(target=f, args=(que,))
    p2 = Process(target=f, args=(que,))
    p.start()
    p2.start()
    print('from parent:',que.get())    # prints "[42, None, 'hello']"
    print('from parent2:',que.get())    # prints "[42, None, 'hello']"
    p.join()

Pipes

The Pipe() function returns a pair of connection objects connected by a pipe which by default is duplex (two-way). For example:

管道函数返回一个对在管道连接的连接对象,默认情况下是双工(双向)。例如:

#!/usr/bin/env python
# -*- coding:utf-8 -*-
from multiprocessing import Process, Pipe

def f(conn):
    conn.send([42, None, 'hello']) #用传过来的实例send到另一端的地址 也就是parent_conn
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))#传入管道一端的实例，
    p2 = Process(target=f, args=(child_conn,))
    p2.start()
    p.start()
    print(parent_conn.recv())   # prints "[42, None, 'hello']" #打印对端传出过来的数据
    print(parent_conn.recv())   # prints "[42, None, 'hello']"
    p.join()

Pipe

Managers

A manager object returned by Manager() controls a server process which holds Python objects and allows other processes to manipulate them using proxies.

A manager returned by Manager() will support types list, dict, Namespace, Lock, RLock, Semaphore, BoundedSemaphore, Condition, Event, Barrier, Queue, Value and Array. For example,

Python中进程间共享数据，处理基本的queue，pipe和value+array外，还提供了更高层次的封装。使用multiprocessing.Manager可以简单地使用这些高级接口。

Manager()返回的manager对象控制了一个server进程，此进程包含的python对象可以被其他的进程通过proxies来访问。从而达到多进程间数据通信且安全。

Manager支持的类型有list,dict,Namespace,Lock,RLock,Semaphore,BoundedSemaphore,Condition,Event,Queue,Value和Array。

from multiprocessing import Process, Manager

def f(d, l,n):
    d[n] =n
    d['2'] = 2
    d[0.25] = None
    l.append(n)
    print(l)
    print(d)

if __name__ == '__main__':
    with Manager() as manager:
        d = manager.dict() #使用manager生成一个dict,对dict进行封装
        l = manager.list(range(5))#使用manager生成一个list,对list进行封装 """ 如果不用manager对数据进行封装，效果就是会生成10个列表，而不是对一个列表进行修改数据
        p_list = []
        for i in range(10):
            p = Process(target=f, args=(d, l,i))
            p.start()
            p_list.append(p)
        for res in p_list:
            res.join()
        print(d)
        print(l)

manager实例

进程同步

Without using the lock output from the different processes is liable to get all mixed up.

（不使用锁的输出不同的过程很容易得到全搞混了。）和线程锁基本是一样的

from multiprocessing import Process, Lock

def f(l, i):
    l.acquire()
    try:
        print('hello world', i)
    finally:
        l.release()

if __name__ == '__main__':
    lock = Lock()
    for num in range(10):
        Process(target=f, args=(lock, num)).start()

进程池　　

进程池内部维护一个进程序列，当使用时，则去进程池中获取一个进程，如果进程池序列中没有可供使用的进进程，那么程序就会等待，直到进程池中有可用进程为止。

进程池中有两个方法：

apply 同步执行（串行）

apply_async （异步执行并行）

#!/usr/bin/env python
# -*- coding:utf-8 -*-
from  multiprocessing import Process,Pool,freeze_support
import time

def Foo(i):
    time.sleep(2)
    print('exec...')
    return i+100

def Bar(arg):
    print('-->exec done:',arg)


if __name__ == '__main__':
    freeze_support() # 这块代码是为了防止在windows下出错而定义，freeze_support模块需导入
    pool = Pool(3) #进程池设定三个进程
    for i in range(10):
        pool.apply_async(func=Foo, args=(i,),callback=Bar) #启动进程，callback为回调机制，表示执行完foo函数后会把值传送 给bar函数，apply_async表示异步
        #pool.apply(func=Foo, args=(i,)) #apply表示同步（串行），一个个执行，，同步的时候不能callback
    print('end')
    pool.close()# 必须先关闭在join
    pool.join()#进程池中进程执行完毕后再关闭，如果注释，那么程序直接关闭。

刷新页面返回顶部

登录后才能查看或发表评论，立即登录或者逛逛博客园首页

阅读排行：
· 震惊！C++程序真的从main开始吗？99%的程序员都答错了
· 【硬核科普】Trae如何「偷看」你的代码？零基础破解AI编程运行原理
· 单元测试从入门到精通
· 上周热点回顾（3.3-3.9）
· winform 绘制太阳，地球，月球运作规律

善恶美丑

公告

搜索

常用链接

随笔档案

阅读排行榜

推荐排行榜