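I wanted to download a large number of files with a Python script, but a few of them kept failing. I added some error handling that skips the failures and saves the links that failed.
The next day I looked into what the error actually was.
One of the errors looked like this: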
PS C:\temp> python .\DownloadFromList.py
Downloading https://github.com/Unity-Technologies/ScriptableRenderPipeline/archive/master.zip
Traceback (most recent call last):
  File ".\DownloadFromList.py", line 20, in <module>
    r = requests.get(url)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\api.py", line 72, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 512, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 644, in send
    history = [resp for resp in gen] if allow_redirects else []
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 644, in <listcomp>
    history = [resp for resp in gen] if allow_redirects else []
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 222, in resolve_redirects
    **adapter_kwargs
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 662, in send
    r.content
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\models.py", line 827, in content
    self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
MemoryError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File ".\DownloadFromList.py", line 28, in <module>
    print("Error happened:", e.message)
AttributeError: 'MemoryError' object has no attribute 'message'
PS C:\temp>
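The second traceback in the output above is a separate problem from the download itself: Python 3 exceptions have no `message` attribute, so printing `e.message` inside the handler raises AttributeError. Below is a minimal sketch of a skip-and-record loop that prints the exception object instead. The names `download_all`, `fetch`, and `failed_urls.txt` are made up for illustration and are not taken from the original DownloadFromList.py.

```python
def download_all(urls, fetch, failed_path="failed_urls.txt"):
    """Try every URL; skip failures and record the links that failed.

    `fetch` is any callable that downloads one URL (hypothetical hook).
    """
    failed = []
    for url in urls:
        print("Downloading", url)
        try:
            fetch(url)
        except Exception as e:  # MemoryError is a subclass of Exception
            # Python 3 exceptions have no .message attribute;
            # repr(e) works for every exception, even one with no text.
            print("Error happened:", repr(e))
            failed.append(url)

    # Save the failed links so they can be inspected or retried later.
    with open(failed_path, "w", encoding="utf-8") as f:
        for url in failed:
            f.write(url + "\n")
    return failed
```

`repr(e)` is used rather than `str(e)` because a bare `MemoryError()` has an empty string form, which would print nothing useful.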
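I searched online and found a solution.
Reference pages have a habit of disappearing (it has happened to me many times), so I copied the code here for future use. (Copying, yes, but with the source credited I can copy with a clear conscience.)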
Using requests
import requests

def download_file(url):
    local_filename = url.split('/')[-1]
    # NOTE the stream=True parameter
    r = requests.get(url, stream=True)
    with open(local_filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:  # filter out keep-alive new chunks
                f.write(chunk)
                f.flush()
    return local_filename
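The first traceback ends in `r.content`, which joins the whole response body into a single bytes object, so a large archive has to fit in memory at once. With `stream=True` the body is read chunk by chunk instead. Below is a slightly hardened sketch of the same idea, assuming a recent version of requests in which the response can be used as a context manager. The `dest_dir`, `chunk_size`, and `timeout` defaults are arbitrary choices, and the `get` parameter is a hypothetical hook added only so the function can be exercised without a network connection.

```python
import os


def download_file(url, dest_dir=".", chunk_size=64 * 1024, timeout=30, get=None):
    """Stream `url` to a file in `dest_dir` without loading it all into memory."""
    if get is None:
        import requests  # third-party: pip install requests
        get = requests.get

    local_filename = os.path.join(dest_dir, url.split("/")[-1])
    # stream=True defers reading the body until iter_content() is called.
    with get(url, stream=True, timeout=timeout) as r:
        r.raise_for_status()  # turn HTTP 4xx/5xx responses into exceptions
        with open(local_filename, "wb") as f:
            for chunk in r.iter_content(chunk_size=chunk_size):
                if chunk:  # skip empty keep-alive chunks
                    f.write(chunk)
    return local_filename
```

Calling `download_file(url)` with a real URL uses `requests.get`. The timeout and the `raise_for_status()` call are additions to the copied snippet, so that a stalled connection or an error page fails loudly instead of producing a broken file.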
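Using urllib2
import urllib2  # Python 2 only; in Python 3 this lives in urllib.request

# 'url' and 'filename' are placeholders
file = urllib2.urlopen('url')
with open('filename', 'wb') as f:  # binary mode, so the bytes are written unchanged
    while True:
        tmp = file.read(1024)
        if not tmp:
            break
        f.write(tmp)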
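The traceback earlier shows Python 3.7, where `urllib2` no longer exists. Below is a rough Python 3 equivalent of the urllib2 snippet, using only the standard library. The function name `download_with_urllib` and the 1 MiB buffer size are my own choices for illustration, not part of the referenced code.

```python
import shutil
import urllib.request


def download_with_urllib(url, local_filename, chunk_size=1024 * 1024):
    """Copy the response body to disk in fixed-size chunks."""
    with urllib.request.urlopen(url) as response, open(local_filename, "wb") as f:
        # copyfileobj reads and writes chunk_size bytes at a time,
        # so the whole file is never held in memory.
        shutil.copyfileobj(response, f, chunk_size)
    return local_filename
```

`shutil.copyfileobj` performs the same read-until-empty loop as the hand-written `while True` version, just in less code.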
References
==================
https://ox0spy.github.io/post/python/python-download-large-file-without-out-of-memory/
The code quoted in that reference comes from the following two links.
http://stackoverflow.com/questions/16694907/how-to-download-large-file-in-python-with-requests-py
http://stackoverflow.com/questions/27053028/how-to-download-large-file-without-memoryerror-in-python