在Oracle中快速创建一张百万级别的表,一张十万级别的表 并修改两表中1%的数据 全部运行时间66秒
万以下小表做性能优化没有多大意义,因此我需要创建大表;
创建大表有三种方法,一种是insert into table selec..connect by.的方式,它最快但是数据要么是连续值,要么是随机值或是系统值,并不好用,而且总量上受到限制;另一种方法是用程序,借助Oracle的批量插值语法插入数据,它好在数据可以用程序掌控,总量也没有限制,缺点是速度慢;还有一种方法是有一张大表,把它的数据倒腾进来,语法是 insert into newtable select * from oldtable,这种方法暂时不在我的考虑之列。
为了节约时间,我采用了两种结合的办法,即用第一种方式先大量建立数据,再用程序修改其中一部分。
下面进入正题:
百万级别的表建表语句是这样的:
CREATE TABLE bigtable ( id NUMBER not null primary key, name NVARCHAR2(60) not null, score NUMBER(4,0) NOT NULL, createtime TIMESTAMP (6) not null )
给它塞入百万数据可以这样做:
Insert into bigtable select rownum,dbms_random.string('*',dbms_random.value(6,20)),dbms_random.value(0,20),sysdate from dual connect by level<=1000000
使用上面这个方法,在我的T440p机器上实验,一次性创建的记录数大约在两百万到三百万之间,再多就会报“
第 1 行出现错误:
ORA-30009: CONNECT BY 操作内存不足
”看来机器性能在制约我的研究。
看看它的运行数据如何:
已创建1000000行。 已用时间: 00: 00: 39.87
近四十秒创建百万数据,还不错。
十万级别的表结构如下:
create table smalltable ( id NUMBER not null primary key, name NVARCHAR2(60) not null, createtime TIMESTAMP (6) not null )
同样的方式给它塞入数据:
Insert into smalltable select rownum,dbms_random.string('*',dbms_random.value(6,20)),sysdate from dual connect by level<=100000 order by dbms_random.random
发现4秒不到就搞定了:
已创建100000行。 已用时间: 00: 00: 03.71
当然这样的数据还不够,于是下面的程序登场了,它的作用是将某表中百分之一记录的name字段改写成设定数组中的随机值:
package com.ufo; public class DBParam { public final static String Driver = "oracle.jdbc.driver.OracleDriver"; public final static String DbUrl = "jdbc:oracle:thin:@127.0.0.1:1521:orcl"; public final static String User = "ufo"; public final static String Pswd = "1234"; }
package com.ufo.bigsmall; import java.sql.Connection; import java.sql.DriverManager; import java.sql.ResultSet; import java.sql.SQLException; import java.sql.Statement; import java.text.DecimalFormat; import java.util.Random; import java.util.Set; import java.util.TreeSet; import com.ufo.DBParam; public class RecordChanger { public boolean changeOnePencent(String table) { Connection conn = null; Statement stmt = null; try{ Class.forName(DBParam.Driver).newInstance(); conn = DriverManager.getConnection(DBParam.DbUrl, DBParam.User, DBParam.Pswd); stmt = conn.createStatement(); long startMs = System.currentTimeMillis(); int totalCount=fetchExistCount(table,stmt); System.out.println("There are "+toEastNumFormat(totalCount)+" records in the table:'"+table+"'."); int changeCount=totalCount/100; System.out.println("There are "+toEastNumFormat(changeCount)+" records should be changed."); Set<Integer> idSet=fetchIdSet(totalCount,changeCount,table,stmt); System.out.println("There are "+toEastNumFormat(idSet.size())+" records in idSet."); int changed=updateRecords(idSet,table,stmt); System.out.println("There are "+toEastNumFormat(changed)+" records have been changed."); long endMs = System.currentTimeMillis(); System.out.println("It takes "+ms2DHMS(startMs,endMs)+" to update 1% records of table:'"+table+"'."); } catch (Exception e) { e.printStackTrace(); } finally { try { stmt.close(); conn.close(); } catch (SQLException e) { System.out.print("Can't close stmt/conn because of " + e.getMessage()); } } return false; } private int updateRecords(Set<Integer> idSet,String tableName,Statement stmt) throws SQLException{ int updated=0; for(int id:idSet) { String sql="update "+tableName+" set name='"+getRNDName()+"' where id='"+id+"' "; updated+= stmt.executeUpdate(sql); } return updated; } private String getRNDName() { String[] arr= {"Andy","Bill","Cindy","张三","张飞","张好古","李四","王五","赵六","孙七","钱八","岳飞","关羽","刘备","曹操","张辽","虚竹","王语嫣"}; int index=getRandom(0,arr.length); return arr[index]; } // fetch a set of id which should be changed private Set<Integer> fetchIdSet(int totalCount,int changeCount,String tableName,Statement stmt) throws SQLException{ Set<Integer> idSet=new TreeSet<Integer>(); while(idSet.size()<changeCount) { int id=getRandom(0,totalCount); if(idSet.contains(id)==false && isIdExist(id,tableName,stmt)) { idSet.add(id); } } return idSet; } private boolean isIdExist(int id,String tableName,Statement stmt) throws SQLException{ String sql="select count(*) as cnt from "+tableName+" where id='"+id+"' "; ResultSet rs = stmt.executeQuery(sql); while (rs.next()) { int cnt = rs.getInt("cnt"); return cnt==1; } rs.close(); return false; } // get a random num between min and max private static int getRandom(int min, int max){ Random random = new Random(); int s = random.nextInt(max) % (max - min + 1) + min; return s; } // fetch exist record count of a table private int fetchExistCount(String tableName,Statement stmt) throws SQLException{ String sql="select count(*) as cnt from "+tableName+""; ResultSet rs = stmt.executeQuery(sql); while (rs.next()) { int cnt = rs.getInt("cnt"); return cnt; } rs.close(); return 0; } // 将整数在万分位以逗号分隔表示 public static String toEastNumFormat(long number) { DecimalFormat df = new DecimalFormat("#,####"); return df.format(number); } // change seconds to DayHourMinuteSecond format private static String ms2DHMS(long startMs, long endMs) { String retval = null; long secondCount = (endMs - startMs) / 1000; String ms = (endMs - startMs) % 1000 + "ms"; long days = secondCount / (60 * 60 * 24); long hours = (secondCount % (60 * 60 * 24)) / (60 * 60); long minutes = (secondCount % (60 * 60)) / 60; long seconds = secondCount % 60; if (days > 0) { retval = days + "d" + hours + "h" + minutes + "m" + seconds + "s"; } else if (hours > 0) { retval = hours + "h" + minutes + "m" + seconds + "s"; } else if (minutes > 0) { retval = minutes + "m" + seconds + "s"; } else { retval = seconds + "s"; } return retval + ms; } public static void main(String[] args) { RecordChanger rc=new RecordChanger(); rc.changeOnePencent("bigtable"); } }
以下是修改两个表的运行结果反馈:
There are 10,0000 records in the table:'smalltable'. There are 1000 records should be changed. There are 1000 records in idSet. There are 1000 records have been changed. It takes 2s276ms to update 1% records of table:'smalltable'. There are 100,0000 records in the table:'bigtable'. There are 1,0000 records should be changed. There are 1,0000 records in idSet. There are 1,0000 records have been changed. It takes 20s251ms to update 1% records of table:'bigtable'.
全部运行时间加起来一分钟出点头,还是可以的。
--END-- 2020年1月5日09点32分
分类:
Oracle.千万级表
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· Linux系列:如何用heaptrack跟踪.NET程序的非托管内存泄露
· 开发者必知的日志记录最佳实践
· SQL Server 2025 AI相关能力初探
· Linux系列:如何用 C#调用 C方法造成内存泄露
· AI与.NET技术实操系列(二):开始使用ML.NET
· 无需6万激活码!GitHub神秘组织3小时极速复刻Manus,手把手教你使用OpenManus搭建本
· C#/.NET/.NET Core优秀项目和框架2025年2月简报
· Manus爆火,是硬核还是营销?
· 终于写完轮子一部分:tcp代理 了,记录一下
· 【杭电多校比赛记录】2025“钉耙编程”中国大学生算法设计春季联赛(1)