python 从url中提取域名和path
使用Python 内置的模块 urlparse
from urlparse import *
url = 'https://docs.google.com/spreadsheet/ccc?key=blah-blah-blah-blah#gid=1'
result = urlparse(url)
result 包含了URL的所有信息
>>> from urlparse import * >>> url = 'https://docs.google.com/spreadsheet/ccc?key=blah-blah-blah-blah#gid=1' >>> result = urlparse(url) >>> print result ParseResult(scheme='https', netloc='docs.google.com', path='/spreadsheet/ccc', params='', query='key=blah-blah-blah-blah', fragment='gid=1') >>> url='http://pkunews.pku.edu.cn/xwzh/2018-04/29/content_302272.htmhttp://pkunews.pku.edu.cn/xwzh/2018-04/29/content_302272.htm' >>> result = urlparse(url) >>> print result ParseResult(scheme='http', netloc='pkunews.pku.edu.cn', path='/xwzh/2018-04/29/content_302272.htmhttp://pkunews.pku.edu.cn/xwzh/2018-04/29/content_302272.htm', params='', query='', fragment='')