随笔 - 195  文章 - 0  评论 - 5  阅读 - 20万

EasyExcel实现百万级数据导入导出

实现思路
1、由于Easyexcel读功能是对excel一行一行进行读的,这是为了保证不过多占用我们内存。如果我们系统需要对数据进行入库的话则需要对数据进行缓存,比如1w每批次入库。虽然会损失一定的内存,但是写库时间大大降低了呀;

2、如果传入的excel有多个sheet,可以考虑开启多个线程进行读excel。比如每个sheet一个线程,但是线程需要做好管理,如使用线程池等等。但是,一般大数据量都不使用excel来保存,而是使用csv来储存数据,因为这个格式简单、体积小、易于使用、可被多种软件打开和编辑。当然,Easyexel也是可以读取csv文件的,但因为要兼容csv文件就不采用多sheet的方式,因为csv没有sheet。

3、excel导出目前Easyexcel最新版本是不支持多线程写数据的,只能单线程进行写excel。为了保证写excel效率,我采用20w数据一批一次写入excel。

4、由于excel数据行数超过100w打开时间特别长,所以我们在导出的时候对数据进行切割,每个sheet最多只保存100w数据,其他数据写入下个sheet。

代码及步骤如下

<dependency>
   <groupId>com.alibaba</groupId>
   <artifactId>easyexcel</artifactId>
   <version>3.3.2</version>
</dependency>

mapper:

List<SocialChannelExcelVo> selectDataByPage(int offset, int size);

int insertBatch(List<SocialChannelExcelVo> list);

xml:

复制代码
<!--查询分页数据-->
    <select id="selectDataByPage" parameterType="map" resultType="com.zhhs.SocialChannelExcelVo">
        select terminal_code as terminalCode, terminal_brand as terminalBrand, terminal_model as terminalModel from table1 limit #{offset},#{size}
    </select>


    <insert id="insertBatch" parameterType="java.util.List">
        insert into table1 (terminal_code, terminal_brand, terminal_model) values
        <foreach collection="list" item="item" index="index" separator=",">
            (#{item.terminalCode}, #{item.terminalBrand}, #{item.terminalModel})
        </foreach>
    </insert>
复制代码

一、cvs文件一次性入库

复制代码
    /**
     * readExcel
     * @author senfel
     * @date 2024/7/1 17:17
     * @return void
     */
    @Test
    public void readExcel(){
        //800w+的csv文件,每批次读取10000条
        long startTime = System.currentTimeMillis();
        System.err.println("readExcel开始执行时间:"+startTime);
        String filePath = "D:\\测试导出openid.csv";

        EasyExcel.read(filePath, SocialChannelExcelVo.class, new PageReadListener<SocialChannelExcelVo>(dataList ->{
            if(!CollectionUtils.isEmpty(dataList)){
                //数据存储
                baseSocialChannelMapper.insertBatch(dataList);
            }
        }, 10000)).sheet().doRead();
        System.err.println("readExcel结束执行时间:"+(System.currentTimeMillis()-startTime));
    }
复制代码

二、excel多sheet页,多线程入库

复制代码
 @Test
 public void doImport() {
        long startTime = System.currentTimeMillis();
        log.info("进入导入逻辑");
        //需要读取的sheet数量
        int numberOfSheets = 4;
        String fileName = "D:\\blank\\测试导出openid.xlsx";

        ExecutorService executorService = Executors.newFixedThreadPool(numberOfSheets);

        for (int sheetNo = 0; sheetNo < numberOfSheets; sheetNo++){
            int sheetNumber = sheetNo;
            executorService.submit(()->{
                EasyExcel.read(fileName, SocialChannelExcelVo.class, new PageReadListener<SocialChannelExcelVo>(dataList -> { //这里的dataList 数据量是根据BATCH_COUNT参数的大小定的
                   //这里可以对数据进行校验和处理
                    baseSocialChannelMapper.insertBatch(dataList);;//批量入库
                },10000)).sheet(sheetNumber).doRead(); //这里下面这个BATCH_COUNT是设置每次读取sheet页的多少条数据到内存

            });
        }

        executorService.shutdown();//关闭线程池
        try {
            executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        }catch (InterruptedException e){
            e.printStackTrace();
        }
        System.err.println("exportExcel结束执行时间:"+(System.currentTimeMillis()-startTime));

    }
复制代码

三、导出

复制代码
@Test
    public void exportExcel(){
        long startTime = System.currentTimeMillis();
        System.err.println("exportExcel:"+startTime);
        String excelName = "测试导出openid";
        String exportPath = "D:\\blank\\";
        boolean isRun = true;
        int size = 20 * 10000;
        int page = 0;
        int sheetDataSize = 0;
        int sheetNo = 0;
        FileOutputStream outputStream = null;
        try {
            outputStream = new FileOutputStream(exportPath + excelName + ".xlsx");
            ExcelWriter excelWriter = EasyExcel.write(outputStream).build();
            do {
                page++;
                List<SocialChannelExcelVo> openList = baseSocialChannelMapper.selectDataByPage((page - 1) * size, size);
                if(CollectionUtils.isEmpty(openList)){
                    isRun = false;
                    break;
                }
                sheetDataSize += openList.size();
                if(sheetDataSize > 1000000){
                    sheetNo++;
                    sheetDataSize = openList.size();
                }
                //写入文件流
                WriteSheet writeSheet = EasyExcel.writerSheet(sheetNo, "channel-"+sheetNo).head(SocialChannelExcelVo.class)
                        .registerWriteHandler(new LongestMatchColumnWidthStyleStrategy()).build();
                excelWriter.write(openList, writeSheet);
            }while (isRun);
            excelWriter.finish();
            outputStream.flush();
        }catch (IOException e){
            e.printStackTrace();
        }finally {
            if (outputStream != null) {
                try {
                    outputStream.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            System.err.println("exportExcel结束执行时间:"+(System.currentTimeMillis()-startTime));
        }

    }@Test
    public void exportExcel(){
        long startTime = System.currentTimeMillis();
        System.err.println("exportExcel:"+startTime);
        String excelName = "测试导出openid";
        String exportPath = "D:\\blank\\";
        boolean isRun = true;
        int size = 20 * 10000;
        int page = 0;
        int sheetDataSize = 0;
        int sheetNo = 0;
        FileOutputStream outputStream = null;
        try {
            outputStream = new FileOutputStream(exportPath + excelName + ".xlsx");
            ExcelWriter excelWriter = EasyExcel.write(outputStream).build();
            do {
                page++;
                List<SocialChannelExcelVo> openList = baseSocialChannelMapper.selectDataByPage((page - 1) * size, size);
                if(CollectionUtils.isEmpty(openList)){
                    isRun = false;
                    break;
                }
                sheetDataSize += openList.size();
                if(sheetDataSize > 1000000){
                    sheetNo++;
                    sheetDataSize = openList.size();
                }
                //写入文件流
                WriteSheet writeSheet = EasyExcel.writerSheet(sheetNo, "channel-"+sheetNo).head(SocialChannelExcelVo.class)
                        .registerWriteHandler(new LongestMatchColumnWidthStyleStrategy()).build();
                excelWriter.write(openList, writeSheet);
            }while (isRun);
            excelWriter.finish();
            outputStream.flush();
        }catch (IOException e){
            e.printStackTrace();
        }finally {
            if (outputStream != null) {
                try {
                    outputStream.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            System.err.println("exportExcel结束执行时间:"+(System.currentTimeMillis()-startTime));
        }

    }
复制代码

转自:https://blog.csdn.net/qq_42251944/article/details/135518279

https://blog.csdn.net/weixin_39970883/article/details/140127982

posted on   大山008  阅读(704)  评论(0编辑  收藏  举报
相关博文:
阅读排行:
· CSnakes vs Python.NET:高效嵌入与灵活互通的跨语言方案对比
· DeepSeek “源神”启动!「GitHub 热点速览」
· 我与微信审核的“相爱相杀”看个人小程序副业
· Plotly.NET 一个为 .NET 打造的强大开源交互式图表库
· 上周热点回顾(2.17-2.23)
< 2025年2月 >
26 27 28 29 30 31 1
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 1
2 3 4 5 6 7 8

点击右上角即可分享
微信分享提示