Python Basics - Network Programming
Chapter index
- socketserver
- Multithreading and multiprocessing
socketserver
The SocketServer module simplifies the task of writing network servers, and it is also the foundation of many server frameworks in the Python standard library (in Python 3 the module is named `socketserver`). Python abstracts a network service into two main classes:
- a Server class, which handles the connection-related network operations;
- a RequestHandler class, which handles the data-related operations.
The module also provides two MixIn classes that extend a Server to be multi-process or multi-threaded.
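The MixIn classes combine with a server class through plain multiple inheritance; for example, `ThreadingTCPServer` in the standard library is simply `ThreadingMixIn` combined with `TCPServer`, so an equivalent class can be composed by hand:

```python
import socketserver

# ThreadingTCPServer is just ThreadingMixIn + TCPServer; composing the same
# pair by hand yields a server that handles each request in a new thread.
class MyThreadingTCPServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    pass
```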
```python
# Server
import socketserver

class MyServer(socketserver.BaseRequestHandler):
    def handle(self):
        print('server started...')
        conn = self.request              # the client socket object
        print(self.client_address)
        while True:
            client_data = conn.recv(1024)
            if not client_data:          # empty bytes: the client closed the connection
                break
            print(str(client_data, 'utf8'))
            print('waiting...')
            conn.sendall(client_data)
        conn.close()

if __name__ == '__main__':
    server = socketserver.ThreadingTCPServer(('127.0.0.1', 8000), MyServer)
    server.serve_forever()
```

```python
# Client
import socket

sk = socket.socket()
address = ('127.0.0.1', 8000)
sk.connect(address)
while True:
    inp = input('>>>')
    if inp == 'exit':
        break
    sk.send(bytes(inp, 'utf8'))   # the payload must be bytes
    data = sk.recv(1024)          # recv blocks until data arrives
    print(str(data, 'utf8'))
sk.close()
```
Creating a socketserver takes at least the following steps:
1. First, you must create a request handler class by subclassing the `BaseRequestHandler` class and overriding its `handle()` method; this method will process incoming requests.
2. Second, you must instantiate one of the server classes, passing it the server's address and the request handler class.
3. Then call the `handle_request()` or `serve_forever()` method of the server object to process one or many requests.
4. Finally, call `server_close()` to close the socket.
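The four steps above can be sketched with a minimal echo server (the `EchoHandler` name is illustrative; binding to port 0 asks the OS for any free port):

```python
import socketserver

# Step 1: subclass BaseRequestHandler and override handle()
class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        data = self.request.recv(1024)   # self.request is the client socket
        self.request.sendall(data)

if __name__ == '__main__':
    # Step 2: instantiate a server class with an address and the handler class
    server = socketserver.TCPServer(('127.0.0.1', 0), EchoHandler)
    print('listening on', server.server_address)
    # Step 3: process exactly one request (serve_forever() would loop instead)
    server.handle_request()
    # Step 4: close the listening socket
    server.server_close()
```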
Multithreading and multiprocessing
Process:
An executing instance of a program is called a process.
Each process provides the resources needed to execute a program. A process has a virtual address space, executable code, open handles to system objects, a security context, a unique process identifier, environment variables, a priority class, minimum and maximum working set sizes, and at least one thread of execution. Each process is started with a single thread, often called the primary thread, but can create additional threads from any of its threads.
Differences between processes and threads:
- Threads share the address space of the process that created them; processes have their own address space.
- Threads have direct access to the data segment of their process; processes have their own copy of the parent's data segment.
- Threads can communicate directly with other threads of their process; processes must use inter-process communication to communicate with sibling processes.
- New threads are easily created; new processes require duplication of the parent process.
- Threads can exercise considerable control over other threads of the same process; processes can only exercise control over child processes.
- Changes to the main thread (cancellation, priority change, etc.) may affect the behavior of the other threads of the process; changes to the parent process do not affect child processes.
Multithreading:
In Python, multithreading helps with I/O-bound tasks; for CPU-bound tasks the best option is to move the hot code to C (or use multiple processes).
Parallelism: splitting a large amount of work into small pieces that literally execute at the same instant on multiple cores or threads.
Concurrency: making progress on several tasks in overlapping time periods, even if only one runs at any given instant.
```python
import time
import threading

begin = time.time()

def foo(n):
    print('foo %s' % n)
    time.sleep(1)
    print('end foo')

def bar(n):
    print('bar %s' % n)
    time.sleep(2)
    print('end bar')

t1 = threading.Thread(target=foo, args=(1,))   # child thread 1
t2 = threading.Thread(target=bar, args=(2,))   # child thread 2
t1.start()
t2.start()
print('-------- in the main --------')         # main thread
t1.join()
t2.join()
end = time.time()
print(end - begin)   # ~2.0 s: the two sleeps overlap
```python
import threading
import time

class MyThread(threading.Thread):
    def __init__(self, num):
        threading.Thread.__init__(self)
        self.num = num

    def run(self):
        print('running on number: %s' % self.num)
        time.sleep(3)

if __name__ == '__main__':
    t1 = MyThread(1)
    t2 = MyThread(2)
    t1.start()
    t2.start()
```
GIL:
The GIL (Global Interpreter Lock): at any given instant, only one thread in a process can execute Python bytecode. Common workarounds: coroutines for I/O-bound work, plus multiprocessing for CPU-bound work.
```python
import time
import threading

begin = time.time()

def foo(n):
    total = 0
    for i in range(n):
        total += i
    print(total)

# Running sequentially:
# foo(100000000)
# foo(100000000)      # ~8.87 s

# Running in two threads:
t1 = threading.Thread(target=foo, args=(100000000,))
t1.start()
t2 = threading.Thread(target=foo, args=(100000000,))
t2.start()            # ~9.02 s
t1.join()
t2.join()
end = time.time()
print(end - begin)    # two threads are no faster (even slightly slower)
                      # than one: the GIL serializes the bytecode execution
```
A few concepts:
1. When a process starts, it gets a main thread by default, since a thread is the smallest unit of execution flow; the main thread can then create child threads. By default in Python (i.e. `daemon=False`), the main thread exits once its own work is done, while the child threads keep running until their own tasks finish.
2. When a child thread is marked as a daemon thread (`t.daemon = True`, formerly `setDaemon(True)`), all daemon threads are terminated as soon as the main thread finishes, so a daemon thread may be stopped before its task completes.
3. This is where `join()` comes in: it provides thread synchronization. After the main thread finishes its own work, it blocks and waits for the joined child threads to finish before it terminates.
4. `join()` takes a `timeout` parameter:
With daemon threads, the main thread waits up to `timeout` for each joined child thread and then exits, killing the remaining daemon threads. So with 10 child threads, the total wait is the sum of the individual timeouts: each child gets its `timeout` slice to run, and when time is up it is killed whether or not its task finished.
Without daemon threads, the main thread likewise waits up to the sum of the timeouts; when time is up the main thread finishes, but the child threads are not killed: they keep running until they all complete, and only then does the program exit.
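A minimal sketch of these rules (the sleep durations are illustrative):

```python
import threading
import time

results = []

def worker():
    time.sleep(0.2)
    results.append('done')

# Default (non-daemon): join() blocks until the worker finishes.
t = threading.Thread(target=worker)
t.start()
t.join()                  # the main thread waits here
print(results)            # ['done']

# Daemon thread with join(timeout): the wait gives up after 50 ms,
# while the worker still needs ~200 ms, so it is still running.
d = threading.Thread(target=worker, daemon=True)
d.start()
d.join(timeout=0.05)
print(d.is_alive())       # True: the timeout expired first
# If the program ended here, the daemon thread would be killed mid-task.
```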
Synchronization locks:
```python
import time
import threading

def addNum():
    global num            # every thread reads and writes this global
    lock.acquire()
    temp = num
    print('--get num:', num)
    # time.sleep(0.1)     # without the lock, a sleep here exposes the race
    num = temp - 1        # decrement the shared variable
    lock.release()

num = 100                 # a shared variable
thread_list = []
lock = threading.Lock()
for i in range(100):
    t = threading.Thread(target=addNum)
    t.start()
    thread_list.append(t)
for t in thread_list:     # wait for all threads to finish
    t.join()
print('final num:', num)
```
How do synchronization locks relate to the GIL?
Python threads run under the control of the GIL: between threads, access to the interpreter and to the C API it exposes is mutually exclusive. This can be seen as interpreter-level ("kernel-level") mutual exclusion, but it is not under our control. We also need a controllable mutual-exclusion mechanism: user-level locks. Just as the GIL protects the interpreter's shared state, user-level locks protect the shared resources in user code.
The problem: an operation like `x += 1` compiles to several bytecodes, and the interpreter may switch threads between those bytecodes, so data races can still occur.
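This can be seen with the `dis` module: even a single `x += 1` is a load, an add, and a store, and a thread switch can happen between any two of them (exact opcode names vary by Python version):

```python
import dis

x = 0

def bump():
    global x
    x += 1            # not atomic: a LOAD, an in-place add, then STORE_GLOBAL

dis.dis(bump)         # prints the individual bytecodes of the += statement
```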
The following example deadlocks, because `doA` and `doB` acquire the two locks in opposite orders:
```python
import threading
import time

class myThread(threading.Thread):
    def doA(self):
        lockA.acquire()
        print(self.name, 'got lockA', time.ctime())
        time.sleep(3)
        lockB.acquire()        # may block forever: another thread holds lockB
        print(self.name, 'got lockB', time.ctime())
        lockB.release()
        lockA.release()

    def doB(self):
        lockB.acquire()
        print(self.name, 'got lockB', time.ctime())
        time.sleep(2)
        lockA.acquire()        # may block forever: another thread holds lockA
        print(self.name, 'got lockA', time.ctime())
        lockA.release()
        lockB.release()

    def run(self):
        self.doA()
        self.doB()

if __name__ == '__main__':
    lockA = threading.Lock()
    lockB = threading.Lock()
    threads = []
    for i in range(5):
        threads.append(myThread())
    for t in threads:
        t.start()
    for t in threads:
        t.join()    # this program hangs: the locks are taken in opposite orders
```
Solution: a recursive lock (`threading.RLock`). Replace the two locks with a single re-entrant lock:
```python
lockA = threading.Lock()
lockB = threading.Lock()
# -------------->
lock = threading.RLock()
```
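Rewriting the deadlock example with one re-entrant lock (a sketch: an `RLock` may be acquired again by the thread that already owns it, so the opposite-order acquisition problem disappears):

```python
import threading

lock = threading.RLock()    # one re-entrant lock replaces lockA and lockB

class MyThread(threading.Thread):
    def doA(self):
        with lock:
            with lock:      # re-entrant: the same thread may re-acquire
                print(self.name, 'in doA')

    def doB(self):
        with lock:
            with lock:
                print(self.name, 'in doB')

    def run(self):
        self.doA()
        self.doB()

if __name__ == '__main__':
    threads = [MyThread() for _ in range(5)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()            # completes: no deadlock
```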