urllib

1. API

1.1 Sending a POST request:


import urllib.request
import urllib.parse

url = "http://www.httpbin.org/post"

# Request data: urlencode the form fields, then convert to bytes
data = bytes(urllib.parse.urlencode({"name": "张飞"}), encoding="utf-8")
resp = urllib.request.urlopen(url, data=data)
print(resp.read().decode("utf-8"))


out:

{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "name": "\u5f20\u98de"
  }, 
  "headers": {
    "Accept-Encoding": "identity", 
    "Content-Length": "23", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "www.httpbin.org", 
    "User-Agent": "Python-urllib/3.10", 
    "X-Amzn-Trace-Id": "Root=1-61f13a4c-0a4e177c2091473a7c9bf70d"
  }, 
  "json": null, 
  "origin": "223.73.1.176", 
  "url": "http://www.httpbin.org/post"
}
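
A small offline sketch may help explain two details of the output above: why Content-Length is 23, and why the name comes back as "\u5f20\u98de". It uses only the standard library and sends nothing over the network; the `body` string is a trimmed copy of the response above, not a fresh reply from the server.

```python
import json
import urllib.parse

form = {"name": "张飞"}

# urlencode percent-encodes non-ASCII text as UTF-8 by default.
query = urllib.parse.urlencode(form)
data = query.encode("utf-8")  # same bytes as bytes(query, encoding="utf-8")

print(query)      # name=%E5%BC%A0%E9%A3%9E
print(len(data))  # 23, which matches the Content-Length header above

# The response is JSON, and "\u5f20\u98de" is only the escaped spelling of
# the original characters; json.loads turns it back into the same string.
body = '{"form": {"name": "\\u5f20\\u98de"}}'  # trimmed copy of the output above
name = json.loads(body)["form"]["name"]
print(name == form["name"])  # True
```

In other words, the server received the original name intact; the escapes are just how the JSON output is written.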

1.2 Sending a GET request:

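Note the default request header "User-Agent": "Python-urllib/3.10". When crawling, if this header is not changed to mimic a browser,
some sites will detect the client and return HTTP 418.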

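import urllib.error
import urllib.request

url = 'http://www.httpbin.org/get'

try:
    # timeout is the maximum wait time in seconds; exceeding it raises an exception.
    # 0.01 is short enough to trigger the error; use a larger value (e.g. 5) to get the output below.
    resp = urllib.request.urlopen(url, timeout=0.01)
    print(resp.read().decode('utf-8'))
except urllib.error.URLError as e:
    print(e)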

out:

{
  "args": {}, 
  "headers": {
    "Accept-Encoding": "identity", 
    "Host": "www.httpbin.org", 
    "User-Agent": "Python-urllib/3.10", 
    "X-Amzn-Trace-Id": "Root=1-61f13d11-6884fbc355aa3d4b5d7e85cc"
  }, 
  "origin": "223.73.1.176", 
  "url": "http://www.httpbin.org/get"
}
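
The User-Agent note and the try/except pattern in this section can be tried without touching a real site. The sketch below is only an illustration under stated assumptions: `PickyHandler` is a hypothetical local server that answers 418 to urllib's default User-Agent, `fetch` is a helper name invented here, and "Mozilla/5.0" is a shortened stand-in for a real browser string. It goes through an opener with proxies disabled, rather than plain urlopen, so the request always reaches the local server.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class PickyHandler(BaseHTTPRequestHandler):
    """Hypothetical site that rejects urllib's default User-Agent."""

    def do_GET(self):
        agent = self.headers.get("User-Agent", "")
        blocked = agent.startswith("Python-urllib")
        body = b"go away" if blocked else b"hello"
        self.send_response(418 if blocked else 200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch(url, headers=None, timeout=5):
    """Return (status, text); status is None when no HTTP response arrived."""
    # An empty ProxyHandler ignores proxy environment variables.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    req = urllib.request.Request(url, headers=headers or {})
    try:
        with opener.open(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as e:
        # HTTPError is a subclass of URLError, so it has to be caught first.
        return e.code, e.read().decode("utf-8")
    except urllib.error.URLError as e:
        # Connection-level problems: refused, DNS failure, connect timeout.
        return None, str(e.reason)


# Start the local server on a free port chosen by the OS.
server = HTTPServer(("127.0.0.1", 0), PickyHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

default_result = fetch(url)
browser_result = fetch(url, headers={"User-Agent": "Mozilla/5.0"})

server.shutdown()
server.server_close()

print(default_result)  # (418, 'go away')
print(browser_result)  # (200, 'hello')
```

Catching HTTPError separately gives access to the status code and the error body, which a bare URLError handler would flatten into a single message.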

1.3 Reading response headers

url = 'https://baidu.com'

resp = urllib.request.urlopen(url)

# Get all response headers
# print(resp.getheaders())

# Get a single response header; note that this is getheader(), not getheaders()
print(resp.getheader("Set-Cookie"))
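
One detail worth knowing: a response may carry several Set-Cookie headers, and getheader() joins repeated headers into one string separated by ", ". The sketch below shows one way to read them separately and parse them. It works offline on a hypothetical header block (the cookie names and values are made up), parsed into the same `http.client.HTTPMessage` type that `resp.headers` provides.

```python
import email
import http.client
from http.cookies import SimpleCookie

# Hypothetical header block standing in for a real response.
raw = (
    "Content-Type: text/html\n"
    "Set-Cookie: sid=abc123; Path=/\n"
    "Set-Cookie: lang=en; Path=/; Max-Age=3600\n"
    "\n"
)
headers = email.message_from_string(raw, _class=http.client.HTTPMessage)

print(headers.get("Content-Type"))  # text/html

# get_all returns every occurrence of a repeated header as a list.
cookies_raw = headers.get_all("Set-Cookie")
print(cookies_raw)

# SimpleCookie splits each header into a name, a value and attributes.
jar = SimpleCookie()
for line in cookies_raw:
    jar.load(line)
cookies = {name: morsel.value for name, morsel in jar.items()}
print(cookies)                 # {'sid': 'abc123', 'lang': 'en'}
print(jar["lang"]["max-age"])  # 3600
```

With a live response, the same calls would be `resp.headers.get_all("Set-Cookie")` followed by the SimpleCookie loop.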

1.4 Building a Request object

Custom header: set User-Agent to mimic a browser

url = "http://www.httpbin.org/post"
headers = {
    'User-Agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Mobile Safari/537.36'
}

data = bytes(urllib.parse.urlencode({'name': 'eric'}), encoding='utf-8')
# Build the request
req = urllib.request.Request(url=url, data=data, headers=headers, method='POST')
# Send the request
resp = urllib.request.urlopen(req)
print(resp.read().decode('utf-8'))

out:

{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "name": "eric"
  }, 
  "headers": {
    "Accept-Encoding": "identity", 
    "Content-Length": "9", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "www.httpbin.org", 
    "User-Agent": "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Mobile Safari/537.36", 
    "X-Amzn-Trace-Id": "Root=1-61f14b4e-246c1a46750e5efc3546f11c"
  }, 
  "json": null, 
  "origin": "223.73.1.176", 
  "url": "http://www.httpbin.org/post"
}
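
A Request object can also be inspected before it is sent, which is a cheap way to check what will go over the wire. The following is a minimal sketch that makes no network call; "Mozilla/5.0" is a shortened stand-in for the full browser string used above, and the Accept header is added purely for illustration.

```python
import urllib.parse
import urllib.request

url = "http://www.httpbin.org/post"
data = urllib.parse.urlencode({"name": "eric"}).encode("utf-8")

# No method argument here: urllib infers it from the presence of data.
req = urllib.request.Request(url, data=data, headers={"User-Agent": "Mozilla/5.0"})

print(req.get_method())              # POST
print(len(req.data))                 # 9, the Content-Length seen above

# Header names are stored capitalized ("User-agent"), so lookups on the
# Request object need that exact spelling.
print(req.get_header("User-agent"))  # Mozilla/5.0
print(req.get_header("User-Agent"))  # None

# Without data, the inferred method is GET.
plain = urllib.request.Request("http://www.httpbin.org/get")
print(plain.get_method())            # GET

# Headers can also be added after construction.
req.add_header("Accept", "application/json")
print(req.header_items())
```

Passing method='POST' explicitly, as in the example above, is still a reasonable choice because it states the intent directly.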
posted by chuangzhou