scrapy/w3lib · GitHub - lexus

w3lib
Overview
This is a Python library of web-related functions, such as:
remove comments or tags from HTML snippets
extract base url from HTML snippets
translate entities in HTML strings
encode multipart/form-data
convert raw HTTP headers to dicts and vice-versa
construct HTTP auth header
convert HTML pages to unicode
RFC-compliant url joining
sanitize urls (like browsers do)
extract arguments from urls
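
To make a few of the operations listed above concrete, here is a minimal sketch of URL joining, query-argument extraction, and building an HTTP Basic auth header. It uses only the Python standard library and is not w3lib's own code or API; the helper names join_url, query_argument, and make_basic_auth_header are hypothetical and chosen only for illustration.

```python
import base64
from urllib.parse import parse_qs, urljoin, urlsplit


def join_url(base, reference):
    """Resolve a possibly relative reference against a base URL."""
    # Hypothetical helper: delegates to the standard library's urljoin.
    return urljoin(base, reference)


def query_argument(url, name, default=None):
    """Return the first value of query argument `name`, or `default`."""
    # Hypothetical helper: parse the query string into a dict of lists.
    values = parse_qs(urlsplit(url).query).get(name)
    return values[0] if values else default


def make_basic_auth_header(username, password):
    """Build the value of an HTTP Basic `Authorization` header."""
    # Hypothetical helper: Basic auth is base64("username:password").
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


if __name__ == "__main__":
    print(join_url("http://example.com/a/b.html", "../c.html"))
    print(query_argument("http://example.com/?id=200&x=1", "id"))
    print(make_basic_auth_header("user", "pass"))
```

w3lib packages this kind of functionality in its own modules, so in practice the library's functions would be used instead of helpers like these.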
Modules
The w3lib package consists of five modules:
w3lib.url - functions for working with URLs
w3lib.html - functions for working with HTML
w3lib.http - functions for working with HTTP
w3lib.encoding - functions for working with character encoding
w3lib.form - functions for working with web forms

posted on 2013-01-03 23:18 lexus
