Python/Django: Downloading Excel 2007 Files
1. Background
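My previous post covered downloading Excel 2003 (.xls) files. I am writing a separate post on Excel 2007 (.xlsx) for three reasons:
First, Excel 2003 is essentially obsolete.
Second, Excel 2003 files are too large, which makes them a poor fit for transfer over the network.
Third, xlwt refuses to write a cell string longer than 32,767 characters (interested readers can check its source). I first took this for a bug in the library, but 32,767 characters is also Excel's own documented per-cell limit, so very long text still needs care in .xlsx files.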
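Since the 32,767-character limit comes from Excel itself, one option is to clip long strings before they reach the writer. The following is only a minimal sketch: the helper name clip_cell_text is hypothetical, and silently truncating may not suit every report.

```python
# Excel's documented maximum number of characters in a single cell.
EXCEL_MAX_CELL_CHARS = 32767


def clip_cell_text(value, limit=EXCEL_MAX_CELL_CHARS):
    """Return value unchanged unless it is a string longer than limit.

    Hypothetical helper for illustration; non-string values pass through.
    """
    if isinstance(value, str) and len(value) > limit:
        return value[:limit]
    return value


# Usage: clip every value of a column before building the DataFrame.
descriptions = ["short text", "x" * 40000]
clipped = [clip_cell_text(text) for text in descriptions]
```

Whether to truncate, split the text across cells, or reject the row is a design choice that depends on who reads the report.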
That is enough preamble; let's get to the point.
2. Installation
This post uses pandas. Install it with: pip install pandas
3. Usage
First, import the library, for example: import pandas
Second, build the workbook:
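Create the Excel document:
out = BytesIO() creates the in-memory output stream (BytesIO comes from the standard io module).
excel = pandas.ExcelWriter(out, engine='xlsxwriter') creates the writer with the xlsxwriter engine and binds the file directly to the output stream out. Installing xlsxwriter is just as simple: pip install xlsxwriter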
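Create a sheet:
summary_df = pandas.DataFrame({}) creates an empty DataFrame, so the sheet generated by the next line is empty.
summary_df.to_excel(excel, sheet_name="Summary", index=False, header=False) creates a sheet named Summary in the workbook. index=False keeps the DataFrame index out of the first column; writing the index is the default behavior of pandas' to_excel, not of xlsxwriter. See the pandas documentation for the other parameters.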
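Get a sheet: worksheet = excel.sheets["Summary"] returns the worksheet object for the sheet named Summary.
Set a column width: worksheet.set_column('D:D', 18, long_text_format) sets the width of the fourth column to 18 and applies the format long_text_format (you have to define this format yourself beforehand, with workbook.add_format).
Save the Excel file: excel.save(). excel.close() does the same thing and is the call to use on current pandas, because save() was deprecated in pandas 1.5 and removed in 2.0.
The workbook is written straight into the output stream out created above.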
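The steps above can be put together into one small script. This is a sketch under assumptions, not the code from my project: the sample rows and the contents of long_text_format (wrapped, top-aligned text) are made up for illustration, and it needs both pandas and xlsxwriter installed.

```python
from io import BytesIO

import pandas

# In-memory stream that receives the finished .xlsx bytes.
out = BytesIO()
excel = pandas.ExcelWriter(out, engine="xlsxwriter")

# Hypothetical sample data; a real view would fill this from the database.
df = pandas.DataFrame({
    "Domain": ["music", "weather"],
    "Pass": [3, 5],
    "Fail": [1, 0],
    "Detail": ["a long description", "another long description"],
})
# index=False keeps the DataFrame index out of the first column.
df.to_excel(excel, sheet_name="Summary", index=False)

workbook = excel.book
worksheet = excel.sheets["Summary"]

# Assumed style for long cells: wrap the text and align it to the top.
long_text_format = workbook.add_format({"text_wrap": True, "valign": "top"})
# Column D is the fourth column; 18 is the column width.
worksheet.set_column("D:D", 18, long_text_format)

# close() writes the workbook into `out`.
excel.close()

xlsx_bytes = out.getvalue()
```

An .xlsx file is a zip archive, so xlsx_bytes begins with the zip signature PK. That gives a quick sanity check that the stream really holds a workbook.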
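Finally, send the output:
    # The original used 'application/vnd.ms-excel', which is the .xls type;
    # the type below is the registered one for .xlsx.
    response = HttpResponse(
        out.getvalue(),
        content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
    dt = datetime.datetime.now()
    # quote is urllib.parse.quote. The original called Django's urlquote,
    # which was removed in Django 4.0.
    # The file name contains a space, so it is wrapped in double quotes.
    response['Content-Disposition'] = 'attachment; filename="{} {}.xlsx"'.format(
        quote(domain_name), dt.strftime('%Y-%m-%d %H-%M-%S'))
    print("End downloading...")
    return response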
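The file name here can contain non-ASCII characters (the default domain name is '全部'), so it helps to see what ends up in the header. The sketch below builds the header value with the standard library only. The function name build_content_disposition is hypothetical, and it uses the filename* form from RFC 6266 instead of the plain filename parameter, which is a swap on my part and not what the view above does.

```python
import datetime
from urllib.parse import quote

# Registered media type for .xlsx workbooks.
XLSX_CONTENT_TYPE = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"


def build_content_disposition(domain_name, now):
    """Return a Content-Disposition value for '<domain_name> <timestamp>.xlsx'.

    Hypothetical helper for illustration only.
    """
    filename = "{} {}.xlsx".format(domain_name, now.strftime("%Y-%m-%d %H-%M-%S"))
    # Percent-encode the UTF-8 bytes; safe="" also encodes "/".
    encoded = quote(filename, safe="")
    # filename*=UTF-8''... carries a percent-encoded UTF-8 file name.
    return "attachment; filename*=UTF-8''{}".format(encoded)


header = build_content_disposition("全部", datetime.datetime(2020, 1, 2, 3, 4, 5))
print(header)
```

In a Django view the result would be assigned to response['Content-Disposition'], with XLSX_CONTENT_TYPE passed as content_type.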
Here is the full source:
    import datetime
    from io import BytesIO
    from urllib.parse import quote

    import pandas
    from django.http import HttpResponse

    # AITask, Classification and style are this project's own models and style
    # module; their import lines were not part of the original listing.


    def download_report(request, task_id, domain_name='全部'):
        # '全部' means "all domains".
        print("start downloading xlsx...", task_id)
        print("start downloading...", domain_name)
        domains = [{'domain_name': domain_name}]
        ai_task = AITask.objects.get(id=task_id)
        if domain_name == '全部' and ai_task.type in (1, 2):
            domains = Classification.objects.values('domain_name').distinct().filter(
                type=ai_task.type).order_by("domain_name")

        out = BytesIO()
        excel = pandas.ExcelWriter(out, engine='xlsxwriter')

        summary_title = ['Domain', 'Pass', 'Fail']
        summary_dict = {title: [] for title in summary_title}
        domain_title = ['Domain', 'One level', 'Two level', 'Semantic', 'Priority',
                        'Intent group', 'Intent', 'Result', 'Handle time',
                        'Response time', 'Server Domain', 'Detail']

        # Create the (still empty) Summary sheet before the per-domain sheets,
        # so that it comes first in the workbook.
        summary_df = pandas.DataFrame({})
        summary_df.to_excel(excel, sheet_name="Summary", index=False, header=False)

        workbook = excel.book
        body_format = workbook.add_format(style.body_style)
        header_format = workbook.add_format(style.head_style)
        long_text_format = workbook.add_format(style.long_text_style)
        large_text_format = workbook.add_format(style.large_text_style)

        for domain in domains:
            current_domain = domain["domain_name"]
            reports = ai_task.report.filter(
                semantic__classification__domain_name__exact=current_domain)
            if not reports:
                continue  # a domain without reports gets no sheet and no summary row

            # One list per column, filled report by report.
            columns = {column_name: [] for column_name in domain_title}
            pass_no = fail_no = 0
            for report in reports:
                semantic = report.semantic
                classification = semantic.classification
                # Values in the same order as domain_title.
                row = [classification.domain_name, classification.first_classification,
                       classification.second_classification, semantic.name,
                       classification.semantic_property, classification.intent_group,
                       classification.intent, report.result, report.in_handle_time,
                       report.ex_handle_time, report.server_domain, report.description]
                for column_name, value in zip(domain_title, row):
                    columns[column_name].append(value)
                if report.result == "pass":
                    pass_no += 1
                elif report.result == "fail":
                    fail_no += 1

            sheet_df = pandas.DataFrame(columns)
            # Data starts at row 1; row 0 is written below with the header format.
            sheet_df.to_excel(excel, sheet_name=current_domain, index=False,
                              header=False, startrow=1)
            worksheet = excel.sheets[current_domain]
            worksheet.set_column('A:C', None, body_format)
            worksheet.set_column('D:D', 18, long_text_format)
            worksheet.set_column('E:E', None, body_format)
            worksheet.set_column('F:G', 30, long_text_format)
            worksheet.set_column('H:H', None, body_format)
            worksheet.set_column('I:K', None, body_format)
            worksheet.set_column('L:L', 50, large_text_format)
            for col, title in enumerate(sheet_df.columns.values):
                worksheet.write(0, col, title, header_format)

            summary_dict['Domain'].append(current_domain)
            summary_dict['Pass'].append(pass_no)
            summary_dict['Fail'].append(fail_no)

        summary_df = pandas.DataFrame(summary_dict)
        summary_df.to_excel(excel, sheet_name='Summary', index=False,
                            header=False, startrow=1)
        worksheet = excel.sheets['Summary']
        for col, title in enumerate(summary_df.columns.values):
            worksheet.write(0, col, title, header_format)
            # Rewrite each value so that the cell carries the body format.
            for row, value in enumerate(summary_dict[title]):
                worksheet.write(row + 1, col, value, body_format)

        # The original called excel.save(), which pandas 2.0 removed.
        excel.close()

        # 'application/vnd.ms-excel' (used originally) is the .xls type.
        response = HttpResponse(
            out.getvalue(),
            content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
        dt = datetime.datetime.now()
        # quote replaces Django's urlquote, which was removed in Django 4.0.
        response['Content-Disposition'] = 'attachment; filename="{} {}.xlsx"'.format(
            quote(domain_name), dt.strftime('%Y-%m-%d %H-%M-%S'))
        print("End downloading...")
        return response