[Java/JDBC] How ResultSet.setFetchSize significantly speeds up reading large volumes of data

One of my programs contains code like this:

final int BATCH_SIZE=10000;
final String sql=buildSql(...);

try(Connection conn=DbUtil.getConn();
    PreparedStatement pstmt=conn.prepareStatement(sql);
    ResultSet rs=pstmt.executeQuery()){

    final int colCnt=10;
    List<String[]> rowList=new ArrayList<>(BATCH_SIZE);

    int count=0;
    while(rs.next()){
        String[] arr=new String[colCnt];

        for(int i=0;i<colCnt;i++){
            arr[i]=rs.getString(i+1);
        }

        rowList.add(arr);

        count++;
        if(count==BATCH_SIZE){
            // process rowList with the writer thread...

            count=0;
            rowList=new ArrayList<>(BATCH_SIZE);
        }
    }

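    // rs is closed by try-with-resources; only the leftover rows need handling here
    if(!rowList.isEmpty()){
        // process the final, partially filled rowList with the writer thread...
    }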
}catch(Exception ex){
    ex.printStackTrace();
}
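
The loop above hands rows over in groups of BATCH_SIZE. A batching loop of this kind also has to deal with the rows left over when the total is not a multiple of the batch size. Below is a minimal sketch of that pattern, not the author's code: BatchSplitter, forEachBatch and the sink callback are hypothetical names, and a plain Iterator stands in for the ResultSet so the example runs without a database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: groups rows into fixed-size batches and flushes the remainder.
final class BatchSplitter {

    /** Passes batches of at most batchSize rows to sink and returns how many batches were sent. */
    static <T> int forEachBatch(Iterator<T> rows, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> batch = new ArrayList<>(batchSize);
        int batches = 0;
        while (rows.hasNext()) {
            batch.add(rows.next());
            if (batch.size() == batchSize) {
                sink.accept(batch);                 // full batch: hand it over
                batches++;
                batch = new ArrayList<>(batchSize); // start a fresh list, the sink owns the old one
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(batch);                     // final partial batch
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        Iterator<Integer> rows = Arrays.asList(1, 2, 3, 4, 5, 6, 7).iterator();
        int sent = forEachBatch(rows, 3, b -> System.out.println(b));
        System.out.println(sent + " batches");
    }
}
```

With seven rows and a batch size of three, the sink receives two full batches and one batch holding the single remaining row.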

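This code, together with its supporting code, takes about 20 minutes to process a table of 13 million rows with 21 columns, 10 of which need to be masked.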

After adding rs.setFetchSize(10000), the code looks like this:

final int BATCH_SIZE=10000;
final String sql=buildSql(...);

try(Connection conn=DbUtil.getConn();
    PreparedStatement pstmt=conn.prepareStatement(sql);
    ResultSet rs=pstmt.executeQuery()){

    final int colCnt=10;
    List<String[]> rowList=new ArrayList<>(BATCH_SIZE);

    int count=0;
    
    rs.setFetchSize(BATCH_SIZE);
    while(rs.next()){
        String[] arr=new String[colCnt];

        for(int i=0;i<colCnt;i++){
            arr[i]=rs.getString(i+1);
        }

        rowList.add(arr);

        count++;
        if(count==BATCH_SIZE){
            // process rowList with the writer thread...

            count=0;
            rowList=new ArrayList<>(BATCH_SIZE);
        }
    }

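    // rs is closed by try-with-resources; only the leftover rows need handling here
    if(!rowList.isEmpty()){
        // process the final, partially filled rowList with the writer thread...
    }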
}catch(Exception ex){
    ex.printStackTrace();
}

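With just that one added line, this code and its supporting code need only about 10 minutes to process the same table (13 million rows, 21 columns, 10 of them masked), roughly half the time!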
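
The snippet above sets the hint on the ResultSet after the query has already been executed. A possible variant, which is not the author's code and was not measured here, sets it on the statement before executeQuery(), so that the hint is in place when the driver does its first fetch. FetchSizeOnStatement and countRows are hypothetical names; Statement.setFetchSize is part of the standard JDBC API, but it is only a hint and some drivers ignore it or need extra settings before they honour it.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper showing where the fetch-size hint can be set on the statement.
final class FetchSizeOnStatement {

    /** Runs sql, reads every row, and returns the number of rows seen. */
    static long countRows(Connection conn, String sql, int fetchSize) throws SQLException {
        try (PreparedStatement pstmt = conn.prepareStatement(sql)) {
            // Hint for all ResultSets produced by this statement; the driver may ignore it.
            pstmt.setFetchSize(fetchSize);
            try (ResultSet rs = pstmt.executeQuery()) {
                long rows = 0;
                while (rs.next()) {
                    rows++; // real code would read the columns here
                }
                return rows;
            }
        }
    }
}
```

Whether this makes any difference compared with calling rs.setFetchSize depends on the driver, so it is worth timing both against the actual database.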

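The likely reason is that once fetchSize is set, the driver pulls a whole batch of rows from the DB in one round trip, the program then reads them from memory, and the driver goes back to the DB only when that batch is used up. This greatly reduces the number of round trips to the DB and so speeds up the read. Note that fetchSize is only a hint to the JDBC driver, so the actual gain depends on the driver and on the default it would otherwise use.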
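
To get a feel for the numbers, the round-trip count can be estimated as the row count divided by the fetch size, rounded up. The sketch below is only arithmetic, not a measurement: FetchRoundTrips and estimate are hypothetical names, and the default of 10 rows per fetch is an assumption (it is the documented default of some drivers, such as Oracle's, but the driver used in this article is not stated).

```java
// Rough estimate of how many fetches are needed to stream a result set.
final class FetchRoundTrips {

    /** Ceiling of totalRows / fetchSize: each fetch returns at most fetchSize rows. */
    static long estimate(long totalRows, int fetchSize) {
        if (totalRows < 0 || fetchSize <= 0) {
            throw new IllegalArgumentException("totalRows must be >= 0 and fetchSize > 0");
        }
        return (totalRows + fetchSize - 1) / fetchSize;
    }

    public static void main(String[] args) {
        long rows = 13_000_000L;   // row count quoted in the article
        int assumedDefault = 10;   // ASSUMPTION: driver default, not stated in the article
        System.out.println(estimate(rows, assumedDefault)); // prints 1300000
        System.out.println(estimate(rows, 10_000));         // prints 1300
    }
}
```

Under that assumption the number of fetches drops by a factor of 1000, while the total time only halves, which suggests that per-row work such as masking and writing accounts for much of the remaining 10 minutes.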

END

posted @ 2022-07-22 16:21 逆火狂飙