Python Web - Week 3 - Networks and Sockets (Using Python to Access Web Data)

1.Networked Programs


1.Internet

We are now studying the Internet part, which is what a browser does for us every day; the client side comes afterwards.


2.TCP (Transmission Control Protocol)


3.Socket


 


Port 80 is the HTTP port: web servers listen on it for requests from browsers.


4.Sockets in Python

import socket
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # like opening a file
# AF_INET: we are making an Internet socket
# SOCK_STREAM: we are making a stream (TCP) socket
mysock.connect(('www.py4inf.com', 80))
# connects a socket between this program and port 80 on www.py4inf.com
Python has built-in support for TCP sockets; see
docs.python.org/library/socket.html
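
To try the socket() and connect() calls without depending on a remote host, the sketch below runs a tiny server and a client on the same machine. The helper name serve_once, the loopback address 127.0.0.1 and the message text are assumptions made for illustration; they are not part of the course material.

```python
import socket
import threading

def serve_once(server_sock):
    """Accept one connection and send back what was received, in upper case."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())

# Server side: bind to the loopback interface; port 0 lets the OS pick a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))
server.listen(1)
host, port = server.getsockname()

worker = threading.Thread(target=serve_once, args=(server,))
worker.start()

# Client side: the same two calls as in the notes, pointed at the local server.
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect((host, port))
mysock.sendall(b'hello socket')
reply = mysock.recv(1024)
mysock.close()

worker.join()
server.close()
print(reply)
```

The client half is the part that matters here: once connect() returns, the socket behaves like an open file that can be written to and read from.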

2.From Sockets to Applications


1.HTTP (HyperText Transfer Protocol)


http://www.dr-chuck.com/page1.htm

protocol: http    host: www.dr-chuck.com    document: /page1.htm
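
The same three parts can be pulled out of a URL with the standard library. This is a small sketch; the variable names are chosen to match the labels in these notes, and the fallback to port 80 assumes a plain http URL with no explicit port.

```python
from urllib.parse import urlsplit

parts = urlsplit('http://www.dr-chuck.com/page1.htm')

protocol = parts.scheme      # the part before ://
host = parts.hostname        # the server to connect to
document = parts.path        # the document to ask that server for
# No explicit port appears in the URL, so fall back to the HTTP default, port 80.
port = parts.port or 80

print(protocol, host, port, document)
```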

2.Sockets


Clicking the "Second Page" link is just a socket at work: the browser connects to the server and sends a GET request for the new page.

3.Hacking HTTP


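Use telnet plus a GET request to fetch the content of a web page (Windows 7 does not come with telnet enabled by default).

Every visit to a web page takes a dozen or two GET requests: GET the HTML, GET the CSS, GET the images, and so on.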
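
To see where the extra GETs come from, the sketch below scans a page for the resources a browser would request after the HTML itself. The class name ResourceLister and the sample page are made up for illustration, and only img, script and link tags are checked, which is a simplification of what a real browser does.

```python
from html.parser import HTMLParser

class ResourceLister(HTMLParser):
    """Collect URLs that a browser would fetch with additional GET requests."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ('img', 'script') and attrs.get('src'):
            self.resources.append(attrs['src'])
        elif tag == 'link' and attrs.get('href'):
            self.resources.append(attrs['href'])

# Hypothetical page content, made up for this example.
sample_html = """
<html><head>
<link rel="stylesheet" href="style.css">
<script src="app.js"></script>
</head><body>
<img src="logo.png">
<a href="page2.htm">Second Page</a>
</body></html>
"""

lister = ResourceLister()
lister.feed(sample_html)
print(lister.resources)
```

The link to page2.htm is not in the list: it costs a GET only when the user clicks it.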

3.Let's Write a Web Browser


1.An HTTP Request in Python

import socket
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # like opening a file
# AF_INET: we are making an Internet socket
# SOCK_STREAM: we are making a stream (TCP) socket
mysock.connect(('www.py4inf.com', 80))
# connects a socket between this program and port 80 on www.py4inf.com
# HTTP lines end with \r\n, and a blank line ends the request
toSend = 'GET http://www.py4inf.com/code/romeo.txt HTTP/1.0\r\n\r\n'
mysock.send(toSend.encode('ascii'))
while True:
    data = mysock.recv(65)  # 65 is the buffer size: at most 65 bytes come back per call
    if len(data) < 1:  # an empty result means the server closed the connection
        break
    print(data)
mysock.close()
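
The receive loop prints the response in raw pieces, headers and document mixed together. As a sketch of the next step, the code below collects the pieces and separates the headers from the body. The helper names read_all and split_response and the canned response are assumptions for illustration; a pair of connected local sockets stands in for the real server so the example runs without network access.

```python
import socket

def read_all(sock, bufsize=512):
    """Call recv() until the peer closes the connection; return all bytes."""
    chunks = []
    while True:
        data = sock.recv(bufsize)
        if len(data) < 1:
            break
        chunks.append(data)
    return b''.join(chunks)

def split_response(raw):
    """Split a raw HTTP response into (status line, header dict, body)."""
    head, _, body = raw.partition(b'\r\n\r\n')
    lines = head.decode('ascii').split('\r\n')
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(':')
        headers[name.strip()] = value.strip()
    return lines[0], headers, body

# A connected pair of local sockets stands in for the browser/server link.
server_end, client_end = socket.socketpair()
# Hypothetical canned response, made up for this example.
server_end.sendall(b'HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\n'
                   b'But soft what light through yonder window breaks\n')
server_end.close()

status, headers, body = split_response(read_all(client_end))
client_end.close()
print(status)
print(headers)
print(body.decode('ascii'))
```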

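2.The encoding error and how to fix it

In Python 3, send() expects bytes rather than str, so convert the request with encode() as follows:

toSend = 'GET http://www.py4inf.com/code/romeo.txt HTTP/1.0\r\n\r\n'
mysock.send(toSend.encode('ascii'))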
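
The error itself can be reproduced locally. The sketch below uses a pair of connected local sockets (an assumption made so that no network is needed) and shows that send() refuses a str but accepts the encoded bytes.

```python
import socket

a, b = socket.socketpair()   # two connected local sockets, no network needed

request = 'GET http://www.py4inf.com/code/romeo.txt HTTP/1.0\r\n\r\n'

try:
    a.send(request)                 # a str is rejected in Python 3
    error = None
except TypeError as exc:
    error = exc

a.send(request.encode('ascii'))     # bytes are accepted
received = b.recv(1024)

print(type(error).__name__)
print(received.decode('ascii') == request)

a.close()
b.close()
```

On the receiving side the reverse step, decode(), turns the bytes back into a str.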

3.Making HTTP Easier With urllib

socket is closer to the low level than urllib; in other words, urllib is simpler to use.

socket works at the Transport Layer, while urllib works at the Application Layer.

 

Note: Python 2.x uses import urllib, but Python 3.x uses import urllib.request

import urllib.request
fhand = urllib.request.urlopen('http://www.py4inf.com/code/romeo.txt')
for line in fhand:
    print(line.decode().strip())  # each line arrives as bytes; decode it to str
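
For readers who cannot reach www.py4inf.com, the following sketch serves a two-line text from a throwaway local server and reads it back through urllib. The handler class OneFileHandler, the text and the local address are assumptions for illustration. An opener with an empty ProxyHandler is used in place of a bare urlopen only so that a system proxy cannot intercept the loopback request; the object it returns is read the same way.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical document body, made up for this example.
TEXT = b'But soft what light\nthrough yonder window breaks\n'

class OneFileHandler(BaseHTTPRequestHandler):
    """Answer every GET with the same small text document."""

    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain')
        self.send_header('Content-Length', str(len(TEXT)))
        self.end_headers()
        self.wfile.write(TEXT)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet

server = HTTPServer(('127.0.0.1', 0), OneFileHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

url = 'http://127.0.0.1:%d/romeo.txt' % server.server_address[1]
# An empty ProxyHandler makes the request go straight to the local server.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

lines = []
with opener.open(url) as fhand:
    for line in fhand:
        lines.append(line.decode().strip())   # bytes -> str

server.shutdown()
server.server_close()
print(lines)
```

No socket(), connect(), send() or recv() appears in the client half: urllib does all of that behind the scenes.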

4.Like a file

urllib turns URLs into file-like objects, so we can work with them just as we would with a file.
import urllib.request
fhand = urllib.request.urlopen('http://www.py4inf.com/code/romeo.txt')
counts = dict()
for line in fhand:
    words = line.decode().split()  # decode bytes to str before splitting
    for word in words:
        counts[word] = counts.get(word, 0) + 1
print(counts)
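
Because the counting loop only needs something it can iterate over line by line, it can be tried offline on any file-like object. In the sketch below, io.BytesIO stands in for the object returned by urlopen; the function name count_words and the sample text are made up for illustration.

```python
import io

def count_words(fhand):
    """Count words in any file-like object that yields lines of bytes or str."""
    counts = dict()
    for line in fhand:
        if isinstance(line, bytes):
            line = line.decode()     # network data arrives as bytes
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts

# Hypothetical sample text, standing in for a downloaded document.
sample = io.BytesIO(b'the clown ran after the car\nand the car ran into the tent\n')
counts = count_words(sample)

most_common = max(counts, key=counts.get)
print(most_common, counts[most_common])
```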

Words:

subtlety: a fine or delicate distinction

posted @ 2016-01-08 09:34  只追昭熙