EasyExcel实现百万级数据导入导出
实现思路
1、由于Easyexcel读功能是对excel一行一行进行读的,这是为了保证不过多占用我们内存。如果我们系统需要对数据进行入库的话则需要对数据进行缓存,比如1w每批次入库。虽然会损失一定的内存,但是写库时间大大降低了呀;
2、如果传入的excel有多个sheet,可以考虑开启多个线程进行读excel。比如每个sheet一个线程,但是线程需要做好管理,如使用线程池等等。但是,一般大数据量都不使用excel来保存,而是使用csv来储存数据,因为这个格式简单、体积小、易于使用、可被多种软件打开和编辑。当然,Easyexel也是可以读取csv文件的,但因为要兼容csv文件就不采用多sheet的方式,因为csv没有sheet。
3、excel导出目前Easyexcel最新版本是不支持多线程写数据的,只能单线程进行写excel。为了保证写excel效率,我采用20w数据一批一次写入excel。
4、由于excel数据行数超过100w打开时间特别长,所以我们在导出的时候对数据进行切割,每个sheet最多只保存100w数据,其他数据写入下个sheet。
代码及步骤如下
<dependency> <groupId>com.alibaba</groupId> <artifactId>easyexcel</artifactId> <version>3.3.2</version> </dependency>
mapper:
List<SocialChannelExcelVo> selectDataByPage(int offset, int size); int insertBatch(List<SocialChannelExcelVo> list);
xml:
<!--查询分页数据--> <select id="selectDataByPage" parameterType="map" resultType="com.zhhs.SocialChannelExcelVo"> select terminal_code as terminalCode, terminal_brand as terminalBrand, terminal_model as terminalModel from table1 limit #{offset},#{size} </select> <insert id="insertBatch" parameterType="java.util.List"> insert into table1 (terminal_code, terminal_brand, terminal_model) values <foreach collection="list" item="item" index="index" separator=","> (#{item.terminalCode}, #{item.terminalBrand}, #{item.terminalModel}) </foreach> </insert>
一、cvs文件一次性入库
/** * readExcel * @author senfel * @date 2024/7/1 17:17 * @return void */ @Test public void readExcel(){ //800w+的csv文件,每批次读取10000条 long startTime = System.currentTimeMillis(); System.err.println("readExcel开始执行时间:"+startTime); String filePath = "D:\\测试导出openid.csv"; EasyExcel.read(filePath, SocialChannelExcelVo.class, new PageReadListener<SocialChannelExcelVo>(dataList ->{ if(!CollectionUtils.isEmpty(dataList)){ //数据存储 baseSocialChannelMapper.insertBatch(dataList); } }, 10000)).sheet().doRead(); System.err.println("readExcel结束执行时间:"+(System.currentTimeMillis()-startTime)); }
二、excel多sheet页,多线程入库
@Test public void doImport() { long startTime = System.currentTimeMillis(); log.info("进入导入逻辑"); //需要读取的sheet数量 int numberOfSheets = 4; String fileName = "D:\\blank\\测试导出openid.xlsx"; ExecutorService executorService = Executors.newFixedThreadPool(numberOfSheets); for (int sheetNo = 0; sheetNo < numberOfSheets; sheetNo++){ int sheetNumber = sheetNo; executorService.submit(()->{ EasyExcel.read(fileName, SocialChannelExcelVo.class, new PageReadListener<SocialChannelExcelVo>(dataList -> { //这里的dataList 数据量是根据BATCH_COUNT参数的大小定的 //这里可以对数据进行校验和处理 baseSocialChannelMapper.insertBatch(dataList);;//批量入库 },10000)).sheet(sheetNumber).doRead(); //这里下面这个BATCH_COUNT是设置每次读取sheet页的多少条数据到内存 }); } executorService.shutdown();//关闭线程池 try { executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS); }catch (InterruptedException e){ e.printStackTrace(); } System.err.println("exportExcel结束执行时间:"+(System.currentTimeMillis()-startTime)); }
三、导出
@Test public void exportExcel(){ long startTime = System.currentTimeMillis(); System.err.println("exportExcel:"+startTime); String excelName = "测试导出openid"; String exportPath = "D:\\blank\\"; boolean isRun = true; int size = 20 * 10000; int page = 0; int sheetDataSize = 0; int sheetNo = 0; FileOutputStream outputStream = null; try { outputStream = new FileOutputStream(exportPath + excelName + ".xlsx"); ExcelWriter excelWriter = EasyExcel.write(outputStream).build(); do { page++; List<SocialChannelExcelVo> openList = baseSocialChannelMapper.selectDataByPage((page - 1) * size, size); if(CollectionUtils.isEmpty(openList)){ isRun = false; break; } sheetDataSize += openList.size(); if(sheetDataSize > 1000000){ sheetNo++; sheetDataSize = openList.size(); } //写入文件流 WriteSheet writeSheet = EasyExcel.writerSheet(sheetNo, "channel-"+sheetNo).head(SocialChannelExcelVo.class) .registerWriteHandler(new LongestMatchColumnWidthStyleStrategy()).build(); excelWriter.write(openList, writeSheet); }while (isRun); excelWriter.finish(); outputStream.flush(); }catch (IOException e){ e.printStackTrace(); }finally { if (outputStream != null) { try { outputStream.close(); } catch (IOException e) { e.printStackTrace(); } } System.err.println("exportExcel结束执行时间:"+(System.currentTimeMillis()-startTime)); } }@Test public void exportExcel(){ long startTime = System.currentTimeMillis(); System.err.println("exportExcel:"+startTime); String excelName = "测试导出openid"; String exportPath = "D:\\blank\\"; boolean isRun = true; int size = 20 * 10000; int page = 0; int sheetDataSize = 0; int sheetNo = 0; FileOutputStream outputStream = null; try { outputStream = new FileOutputStream(exportPath + excelName + ".xlsx"); ExcelWriter excelWriter = EasyExcel.write(outputStream).build(); do { page++; List<SocialChannelExcelVo> openList = baseSocialChannelMapper.selectDataByPage((page - 1) * size, size); if(CollectionUtils.isEmpty(openList)){ isRun = false; break; } sheetDataSize += openList.size(); if(sheetDataSize > 1000000){ sheetNo++; sheetDataSize = openList.size(); } //写入文件流 WriteSheet writeSheet = EasyExcel.writerSheet(sheetNo, "channel-"+sheetNo).head(SocialChannelExcelVo.class) .registerWriteHandler(new LongestMatchColumnWidthStyleStrategy()).build(); excelWriter.write(openList, writeSheet); }while (isRun); excelWriter.finish(); outputStream.flush(); }catch (IOException e){ e.printStackTrace(); }finally { if (outputStream != null) { try { outputStream.close(); } catch (IOException e) { e.printStackTrace(); } } System.err.println("exportExcel结束执行时间:"+(System.currentTimeMillis()-startTime)); } }
转自:https://blog.csdn.net/qq_42251944/article/details/135518279
https://blog.csdn.net/weixin_39970883/article/details/140127982
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】凌霞软件回馈社区,博客园 & 1Panel & Halo 联合会员上线
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】博客园社区专享云产品让利特惠,阿里云新客6.5折上折
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· CSnakes vs Python.NET:高效嵌入与灵活互通的跨语言方案对比
· DeepSeek “源神”启动!「GitHub 热点速览」
· 我与微信审核的“相爱相杀”看个人小程序副业
· Plotly.NET 一个为 .NET 打造的强大开源交互式图表库
· 上周热点回顾(2.17-2.23)