Downloading Web Page Images with Python
#coding:utf-8
import requests
from bs4 import BeautifulSoup

# Folder where the images are saved (it must already exist)
PWD = "/jiaoben/python/meizitu/pic/"
head = {'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'}
TimeOut = 5
PhotoName = 0
c = '.jpeg'

# Walk through list pages 1 to 3
for x in range(1, 4):
    site = "http://www.meizitu.com/a/qingchun_3_%d.html" % x
    Page = requests.session().get(site, headers=head, timeout=TimeOut)
    Content = Page.content  # .decode(Page.encoding).encode('utf-8')
    ContentSoup = BeautifulSoup(Content, "html.parser")
    # Lazy-loaded images keep their real address in the data-original attribute
    jpg = ContentSoup.find_all('img', {'class': 'scrollLoading'})
    for photo in jpg:
        PhotoAdd = photo.get('data-original')
        PhotoName += 1
        Name = str(PhotoName) + c
        r = requests.get(PhotoAdd, headers=head, timeout=TimeOut, stream=True)
        # Write the image to disk in chunks instead of loading it all at once
        with open(PWD + Name, 'wb') as fd:
            for chunk in r.iter_content(chunk_size=8192):
                fd.write(chunk)
print("You have downloaded %d photos" % PhotoName)
# -*- coding:utf-8 -*-
import urllib.request

url = "http://pic2.sc.chinaz.com/files/pic/pic9/201309/apic520.jpg"
name = "D:\\download\\1.jpg"
# When saving, make sure the file type matches: if the image is a jpg,
# the file you open must also be named .jpg, otherwise the saved image
# can end up unusable
conn = urllib.request.urlopen(url)
with open(name, 'wb') as f:
    f.write(conn.read())
print('Pic Saved!')
It is simple: open the URL, then save the response into a folder.
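The comment in the script above asks that the file extension match the image type. One way to check this is to look at the first few bytes of the downloaded data, since common image formats start with a fixed signature. The following is only a minimal sketch: `guess_image_extension` is a hypothetical helper name, not part of the original script, and it covers just three formats.

```python
def guess_image_extension(data):
    """Guess a file extension from the leading bytes of image data.

    Hypothetical helper for illustration; returns None for unknown data.
    """
    if data.startswith(b"\xff\xd8\xff"):          # JPEG signature
        return ".jpg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):     # PNG signature
        return ".png"
    if data.startswith((b"GIF87a", b"GIF89a")):   # GIF signatures
        return ".gif"
    return None


# Example with made-up byte strings rather than a real download
print(guess_image_extension(b"\xff\xd8\xff\xe0" + b"\x00" * 16))
print(guess_image_extension(b"<html>not an image</html>"))
```

With the result of `conn.read()` stored in a variable, the extension could be chosen before opening the output file, instead of being hard-coded in the file name.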
Sometimes you may not want to type the full path. In that case, use the os module to change the current working directory:
import os

os.chdir("D:\\download")
print(os.getcwd())  # confirm the current working directory
After that, only the file name is needed when saving:
f = open('1.jpg', 'wb')
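If changing the working directory is not desirable, another option is to build the full target path from the folder and the file name taken from the URL. This is a rough sketch only: `filename_from_url`, `target_path` and the fallback name `download.jpg` are assumptions made for illustration and do not appear in the original scripts.

```python
import os
from urllib.parse import urlparse


def filename_from_url(url, default="download.jpg"):
    """Return the last path segment of the URL, or a default if it is empty."""
    path = urlparse(url).path
    name = os.path.basename(path)
    return name or default


def target_path(url, folder):
    """Join the destination folder with the file name derived from the URL."""
    return os.path.join(folder, filename_from_url(url))


print(filename_from_url("http://pic2.sc.chinaz.com/files/pic/pic9/201309/apic520.jpg"))
```

The result of `target_path(url, folder)` can be passed straight to `open(..., 'wb')`, so the script never depends on the current directory.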
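The URL in the script above is hard-coded, so only one image is downloaded. To download in bulk, you need a loop that works through the different URLs.
Below is an example I found elsewhere. It changes the image name inside the image URL and then saves each image in a loop, although it also starts from a URL that is known in advance.
Source: http://www.oschina.net/code/snippet_1016509_21961 (OSChina, 开源中国社区). My own change was to pull the repeated code out into a function.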
import os
import urllib.request

BASE_URL = 'http://bgimg1.meimei22.com/list/2012-5-24/2/sa'

def rename(name):
    # Zero-pad the number to three digits and append the extension,
    # e.g. '1' -> '001.jpg', '12' -> '012.jpg'
    return name.zfill(3) + '.jpg'

os.chdir("D:\\download")

# Download images 001.jpg to 014.jpg, stopping at the first one that fails
for count in range(1, 15):
    name = rename(str(count))
    url = BASE_URL + name
    try:
        a = urllib.request.urlopen(url)
    except Exception as e:
        print(e)
        print(url + ' not found')
        break
    with open(name, "wb") as f:
        f.write(a.read())
    print(url + ' Saved!')
Of course, you can also open the HTTP connection yourself and then pick up the .jpg images dynamically.
import http.client

url = "desk.zol.com.cn"
conn = http.client.HTTPConnection(url)
conn.request("GET", "/dongman/")
r = conn.getresponse()
print(r.status, r.reason)
data1 = r.read()  # .decode('utf-8')  # pick the encoding according to the actual page
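When I first wrote this, I kept getting an error saying the target machine actively refused the connection. It turned out I had used HTTPSConnection(), which connects to port 443 by default, so the refusal was to be expected. Keep this in mind and use HTTPConnection() for a plain HTTP site.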
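The connection code above stops once the page bytes are in `data1`. One possible next step is to pull the .jpg addresses out of those bytes with a regular expression. This is a simplified sketch and not the original author's method: `find_jpg_urls` is a hypothetical helper, it only finds absolute URLs, and the sample HTML is made up. For real pages an HTML parser such as BeautifulSoup is usually more robust.

```python
import re

# Absolute http(s) URLs ending in .jpg, stopping at quotes, spaces or angle brackets
JPG_URL = re.compile(rb"https?://[^\s'\"<>]+?\.jpg", re.IGNORECASE)


def find_jpg_urls(html_bytes):
    """Return the distinct absolute .jpg URLs found in the page, in page order."""
    seen = []
    for match in JPG_URL.findall(html_bytes):
        url = match.decode("ascii", errors="ignore")
        if url not in seen:
            seen.append(url)
    return seen


# Made-up sample standing in for the bytes returned by r.read()
sample = (b'<img src="http://example.com/a/001.jpg">'
          b'<img src="http://example.com/a/002.JPG">'
          b'<a href="http://example.com/page.html">next</a>')
print(find_jpg_urls(sample))
```

Each URL in the returned list can then be downloaded in a loop with `urllib.request.urlopen`, as in the earlier scripts.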