Downloading files directly on App Engine and saving them to Google Drive
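I've always been interested in downloading files. A while ago I stumbled on a Google service that downloads a file straight into Google Drive (the prototype is here), though it has a quota limit. That got me thinking: could I set up a service on App Engine and have Google do the downloading for me? (You may ask: you need a proxy just to reach Google Drive, and then you still have to download the file from Google Drive, so what is wrong with you? Answer: I want to; I can't live without tinkering.) Here is what I found: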
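I'll skip how to apply for an app ID on App Engine and start with the permission setup.
Suppose you have already created an app whose app ID is abcd. Go to the console from here (the console has changed recently; I'll follow the new interface). Click abcd to open the project settings for abcd. In the left menu go to "APIs & auth" and then "APIs", and turn on the Drive API. Make sure it is the Drive API and not the Cloud Storage API. I took a detour here; the key point is that the Cloud Storage API is a paid service.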
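Next, get the OAuth key. In the left menu go to "APIs & auth" and then "Credentials", and under "OAuth" click "Create new Client ID". On the page that appears choose "Web application", and under "Authorized JavaScript origins" enter your App Engine domain, usually
http://appid.appspot.com (note: http, not https), where appid is "abcd" here. Once this is filled in, the "Authorized redirect URI" field below fills itself in. Then click "Create Client ID" to generate the new client ID. Back on the "Credentials" page you can see the newly generated client ID. Click "Download JSON" below it to download the client ID as a .json file; the file is usually named "client_secrets.json".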
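To double-check that the downloaded file really is the web-application kind of client ID, a minimal sketch like the following may help. It assumes the usual layout in which the file has a top-level "web" or "installed" key; client_type is a hypothetical helper name, and all values in the sample are placeholders rather than real credentials.

```python
import json


def client_type(secrets_text):
    """Return 'web' or 'installed' for the text of a client_secrets.json file."""
    data = json.loads(secrets_text)
    for kind in ("web", "installed"):
        if kind in data:
            # Both fields are needed for the OAuth flow to work.
            missing = [k for k in ("client_id", "client_secret")
                       if k not in data[kind]]
            if missing:
                raise ValueError("missing fields: %s" % ", ".join(missing))
            return kind
    raise ValueError("neither 'web' nor 'installed' found")


# Placeholder content, shaped like a web-application client ID file.
sample = json.dumps({
    "web": {
        "client_id": "PLACEHOLDER.apps.googleusercontent.com",
        "client_secret": "PLACEHOLDER",
        "redirect_uris": ["http://abcd.appspot.com/oauth2callback"],
    }
})

print(client_type(sample))
```

For a real file you would pass open('client_secrets.json').read() instead of the sample string; for use on App Engine the result should be "web".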
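All of that is tedious, and you can let Google do it for you: at the very bottom of this page, under "Select an API" choose Drive API, under "Select a platform" choose Google App Engine, click "Configure project", pick the app ID you want to set up, then click "Continue", and the setup is done. Click "Download the starter application"; with a few changes, the downloaded skeleton can do the job. Note that the client_secrets.json included there is a local file meant for local testing, so be sure to replace it with the one set up above, that is, the "Client ID for web application".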
Now modify the files in the skeleton. Replace the existing client_secrets.json with your own.
Change main.html to
<html>
  <head>
    <title>Upload file to Google Drive from url Demo</title>
  </head>
  <body>
    {% if has_credentials %}
    <form name="input" action="/upload" method="post">
      Url: <input type="text" name="urls" style="width: 350px; height: 60px;">
      <input type="submit" value="Upload">
    </form>
    {% else %}
    <p>
      You should follow the link below and grant this application
      permission to access your data using the Drive API.
    </p>
    <blockquote>
      <a href="{{ url }}">{{ url }}</a>
    </blockquote>
    {% endif %}
  </body>
</html>
The HTML is ugly, so please go easy on me.
Then modify main.py, mainly MainHandler, which handles "/", and UploadHandler, which handles "/upload".
class MainHandler(webapp2.RequestHandler):
    @decorator.oauth_required
    def get(self):
        variables = {
            'url': decorator.authorize_url(),
            'has_credentials': decorator.has_credentials()
        }
        template = JINJA_ENVIRONMENT.get_template('main.html')
        self.response.write(template.render(variables))
# Extra imports needed at the top of main.py, beyond those already in the
# starter application (Python 2.7 runtime).
import cgi
import io
import logging
import re
import time
from urlparse import urlparse

from apiclient import errors
from apiclient.http import MediaIoBaseUpload
from google.appengine.api import urlfetch
from google.appengine.runtime import apiproxy_errors

# These two constants are not defined in the original post; the values
# here are assumptions.
URLFETCH_MAXSIZE = 4 * 1024 * 1024
URLFETCH_TIMEOUT = 30


class UploadHandler(webapp2.RequestHandler):
    @decorator.oauth_required
    def post(self):
        # Fetch with the raw URL; escaping it first would turn '&' in a
        # query string into '&amp;'. Escape only for the HTML output.
        url = self.request.get('urls')
        safe_url = cgi.escape(url)
        # Use the last path segment as the file name on Drive.
        filename = urlparse(url).path.split('/')[-1]
        headers = {}
        deadline = 5
        response = None
        # Try up to 10 times, backing off briefly after transient errors.
        for i in range(0, 10):
            try:
                response = urlfetch.fetch(url, headers=headers, deadline=deadline)
                break
            except apiproxy_errors.OverQuotaError:
                response = None
                time.sleep(4)
            except urlfetch.DeadlineExceededError:
                logging.error('DeadlineExceededError(deadline=%s, url=%r)', deadline, url)
                response = None
                time.sleep(1)
            except urlfetch.DownloadError:
                logging.error('DownloadError(deadline=%s, url=%r)', deadline, url)
                response = None
                time.sleep(1)
            except urlfetch.InvalidURLError as e:
                logging.error('Invalid URL: %s' % e)
                response = None
            except urlfetch.ResponseTooLargeError as e:
                # Keep the truncated response, then retry with a Range
                # header that limits the size and a longer deadline.
                response = e.response
                logging.error('ResponseTooLargeError(deadline=%s, url=%r) response(%r)', deadline, url, response)
                m = re.search(r'=\s*(\d+)-', headers.get('Range') or headers.get('range') or '')
                if m is None:
                    headers['Range'] = 'bytes=0-%d' % URLFETCH_MAXSIZE
                else:
                    headers.pop('Range', '')
                    headers.pop('range', '')
                    start = int(m.group(1))
                    headers['Range'] = 'bytes=%s-%d' % (start, start + URLFETCH_MAXSIZE)
                deadline = URLFETCH_TIMEOUT * 2
            except Exception as e:
                logging.error('Exception %s(deadline=%s)' % (e, deadline))
                response = None
        if response:
            # Wrap the downloaded bytes in a file-like object for the upload.
            data = io.BytesIO(response.content)
            filemimetype = response.headers.get('Content-Type', 'application/octet-stream')
            media = MediaIoBaseUpload(data, mimetype=filemimetype, chunksize=1024*1024, resumable=True)
            body = {'title': filename, 'mimeType': filemimetype}
            try:
                end = service.files().insert(body=body, media_body=media, convert=False).execute(http=decorator.http())
            except errors.HttpError as error:
                logging.error('An error occurred: %s' % error)
                end = None
            # Report success only if the Drive insert actually returned.
            if end is not None:
                self.response.write("<p>download %s success.</p>" % safe_url)
                self.response.write(end)
            else:
                self.response.write("<p>upload of %s to Drive failed.</p>" % safe_url)
        else:
            self.response.write("<p>download %s failed.</p>" % safe_url)
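MainHandler is simple: it mainly obtains authorization. UploadHandler is the part that does the downloading and uploading. The download goes through urlfetch, and most of the remaining code there handles the various exceptions. The upload has two steps: first the downloaded content is wrapped in an io.BytesIO object and handed to MediaIoBaseUpload, then files().insert() is executed. The rest is again exception handling.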
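The Range handling in the ResponseTooLargeError branch is the least obvious part, so here it is pulled out as a standalone sketch. limit_range is a hypothetical helper name, and the value of URLFETCH_MAXSIZE is an assumption, since the post does not define it. It mirrors the handler's logic: keep the start offset of any existing Range header and cap the end at start + max size.

```python
import re

# Assumed value; the post does not say what URLFETCH_MAXSIZE is.
URLFETCH_MAXSIZE = 4 * 1024 * 1024


def limit_range(headers, max_size=URLFETCH_MAXSIZE):
    """Return a copy of headers with a Range header capped at max_size.

    The end offset of an HTTP byte range is inclusive, so the resulting
    range covers max_size + 1 bytes, as in the handler code.
    """
    current = headers.get('Range') or headers.get('range') or ''
    # Drop any existing Range header regardless of its capitalisation.
    new_headers = dict((k, v) for k, v in headers.items()
                       if k.lower() != 'range')
    m = re.search(r'=\s*(\d+)-', current)
    start = int(m.group(1)) if m else 0
    new_headers['Range'] = 'bytes=%d-%d' % (start, start + max_size)
    return new_headers


print(limit_range({}, 100))
print(limit_range({'range': 'bytes=500-'}, 100))
```

Note that this only fetches one capped slice; fetching the rest of a large file would need a loop that advances the start offset and joins the pieces, which neither the handler nor this sketch does.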
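The problem is that, for reasons I don't understand, the MIME type of the uploaded file always ends up as "application/msword". Can any expert out there help me sort this out?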
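One thing that might be worth ruling out (a guess, not a confirmed cause) is the Content-Type value being passed along as is, since it can carry parameters such as a charset or be missing entirely. The sketch below strips the parameters and falls back to guessing from the file name. pick_mimetype is a hypothetical helper, not part of the starter application or the Drive client library.

```python
import mimetypes


def pick_mimetype(content_type, filename, default='application/octet-stream'):
    """Choose a bare MIME type from a Content-Type header or a file name."""
    if content_type:
        # 'text/html; charset=UTF-8' -> 'text/html'
        bare = content_type.split(';', 1)[0].strip().lower()
        if bare:
            return bare
    # No usable header: guess from the file extension.
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or default


print(pick_mimetype('text/html; charset=UTF-8', 'index.html'))
print(pick_mimetype(None, 'report.pdf'))
```

In the handler this would replace the direct response.headers.get('Content-Type', ...) lookup; whether it changes the "application/msword" result is something I have not verified.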