SQL*Loader实验笔记【一】
SQL*Loader是oracle提供的可以从多种平面文件中向数据库中加载数据的工具,使用SQLLDR工具可以在很短的时间内向数据库中加载大量的数据,像把制作好的excel表格导入数据库,可以说非常方便,相关的数据加载和卸载工具还有外部表,IMP/EXP,数据泵等,其实呢~ 关于SQL*Loader的学习多数时间是花在了琢磨sqlldr控制文件的写法上,下面来总结我的SQL*Loader学习过程和一些实验案例:
实验环境:Red Hat Enterprise Linux Server release 5.2 + Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit
1.先看一下sqlldr命令的帮助信息
[oracle@bean ~]$ sqlldr SQL*Loader: Release 11.2.0.1.0 - Production on Sat Oct 13 01:03:09 2012 Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved. Usage: SQLLDR keyword=value [,keyword=value,...] Valid Keywords: userid -- ORACLE username/password control -- control file name log -- log file name bad -- bad file name data -- data file name discard -- discard file name discardmax -- number of discards to allow (Default all) skip -- number of logical records to skip (Default 0) load -- number of logical records to load (Default all) errors -- number of errors to allow (Default 50) rows -- number of rows in conventional path bind array or between direct path data saves (Default: Conventional path 64, Direct path all) bindsize -- size of conventional path bind array in bytes (Default 256000) silent -- suppress messages during run (header,feedback,errors,discards,partitions) direct -- use direct path (Default FALSE) parfile -- parameter file: name of file that contains parameter specifications parallel -- do parallel load (Default FALSE) file -- file to allocate extents from skip_unusable_indexes -- disallow/allow unusable indexes or index partitions (Default FALSE) skip_index_maintenance -- do not maintain indexes, mark affected indexes as unusable (Default FALSE) commit_discontinued -- commit loaded rows when load is discontinued (Default FALSE) readsize -- size of read buffer (Default 1048576) external_table -- use external table for load; NOT_USED, GENERATE_ONLY, EXECUTE (Default NOT_USED) columnarrayrows -- number of rows for direct path column array (Default 5000) streamsize -- size of direct path stream buffer in bytes (Default 256000) multithreading -- use multithreading in direct path resumable -- enable or disable resumable for current session (Default FALSE) resumable_name -- text string to help identify resumable statement resumable_timeout -- wait time (in seconds) for RESUMABLE (Default 7200) date_cache -- size (in entries) of date conversion cache (Default 1000) no_index_errors -- abort load on any index errors (Default FALSE) PLEASE NOTE: Command-line parameters may be specified either by position or by keywords. An example of the former case is 'sqlldr scott/tiger foo'; an example of the latter is 'sqlldr control=foo userid=scott/tiger'. One may specify parameters by position before but not after parameters specified by keywords. For example, 'sqlldr scott/tiger control=foo logfile=log' is allowed, but 'sqlldr scott/tiger control=foo log' is not, even though the position of the parameter 'log' is correct.
2.先通过一个简单的加载示例来了解一下SQL*Loader的操作流程
- 初始化
CJ@db11g>create table t as select * from scott.dept where 1=2; Table created. CJ@db11g>alter table t add constraint pri_t primary key (deptno); Table altered. CJ@db11g>desc t; Name Null? Type ----------------------------------------- -------- ---------------------------- DEPTNO NOT NULL NUMBER(2) DNAME VARCHAR2(14) LOC VARCHAR2(13)
- 控制文件
[oracle@bean ~]$ vi demol.ctl
LOAD DATA INFILE * APPEND INTO TABLE t FIELDS TERMINATED BY "," (DEPTNO,dname,loc) begindata 10,ACCOUNTING,NEW YORK 20,RESEARCH,DALLAS 30,SALES,CHICAGO 40,OPERATIONS,BOSTON
- 执行导入语句
注意命令行中指定的参数总会覆盖控制文件的参数
[oracle@bean ~]$ sqlldr cj/cj control=demol.ctl log=demo.log SQL*Loader: Release 11.2.0.1.0 - Production on Mon Oct 8 21:37:12 2012 Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved. Commit point reached - logical record count 4
- 查看log文件检验
[oracle@bean ~]$ cat demo.log SQL*Loader: Release 11.2.0.1.0 - Production on Mon Oct 8 21:37:12 2012 Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved. Control File: demol.ctl Data File: demol.ctl Bad File: demol.bad Discard File: none specified (Allow all discards) Number to load: ALL Number to skip: 0 Errors allowed: 50 Bind array: 64 rows, maximum of 256000 bytes Continuation: none specified Path used: Conventional Table T, loaded from every logical record. Insert option in effect for this table: APPEND Column Name Position Len Term Encl Datatype ------------------------------ ---------- ----- ---- ---- --------------------- DEPTNO FIRST * , CHARACTER DNAME NEXT * , CHARACTER LOC NEXT * , CHARACTER Table T: 4 Rows successfully loaded. 0 Rows not loaded due to data errors. 0 Rows not loaded because all WHEN clauses were failed. 0 Rows not loaded because all fields were null. Space allocated for bind array: 49536 bytes(64 rows) Read buffer bytes: 1048576 Total logical records skipped: 0 Total logical records read: 4 Total logical records rejected: 0 Total logical records discarded: 0 Run began on Mon Oct 08 21:37:12 2012 Run ended on Mon Oct 08 21:37:12 2012 Elapsed time was: 00:00:00.19 CPU time was: 00:00:00.03
- 验证结果
CJ@db11g>select * from t; DEPTNO DNAME LOC ---------- -------------- ------------- 10 ACCOUNTING NEW YORK 20 RESEARCH DALLAS 30 SALES CHICAGO 40 OPERATIONS BOSTON
3.实验案例总结
现在来正式看看这个SQL*Loader的一系列用法,其实就是一系列sqlldr控制文件~ ~,这里记录了我觉得较为典型的示例,以备参考。
1)使用optionally enclosed分隔字符
- 初始化
CJ@db11g>desc t Name Null? Type ----------------------------------------- -------- ---------------------------- DEPTNO NOT NULL NUMBER(2) DNAME VARCHAR2(14) LOC VARCHAR2(13)
- 控制文件
LOAD DATA INFILE * APPEND INTO TABLE t FIELDS TERMINATED BY "," optionally enclosed by '"' (DEPTNO,dname,loc) begindata 10,ACCOUNTING,"NEW YORK" 20,RESEARCH,"CJ,""DALLAS""" 30,SALES,CHICAGO 40,OPERATIONS,BOSTON
- 验证结果
CJ@db11g>select * from t; DEPTNO DNAME LOC ---------- -------------- ------------- 10 ACCOUNTING NEW YORK 20 RESEARCH CJ,"DALLAS" 30 SALES CHICAGO 40 OPERATIONS BOSTON
2)指定字段间分隔,空格或制表符
- 初始化
CJ@db11g>desc t Name Null? Type ----------------------------------------- -------- ---------------------------- DEPTNO NOT NULL NUMBER(2) DNAME VARCHAR2(14) LOC VARCHAR2(13)
- 控制文件
LOAD DATA INFILE * INTO TABLE t REPLACE FIELDS TERMINATED BY WHITESPACE (DEPTNO,DNAME,LOC) BEGINDATA 10 Sales Virginia
- 验证结果
CJ@db11g>select * from t; DEPTNO DNAME LOC ---------- -------------- ------------- 10 Sales Virginia
- 备注
- FIELDS TERMINATED BY x'09' 以制表符分隔
- FIELDS TERMINATED BY WHITESPACE 以空格分隔
3)过滤指定字段
- 初始化
CJ@db11g>desc t Name Null? Type ----------------------------------------- -------- ---------------------------- DEPTNO NOT NULL NUMBER(2) DNAME VARCHAR2(14) LOC VARCHAR2(13)
- 控制文件
LOAD DATA INFILE * INTO TABLE t REPLACE FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ( DEPTNO, FILLER_1 FILLER, DNAME, LOC ) BEGINDATA 20,Blank,Accounting,"Virginia,USA"
- 验证结果
SQL> select * from t; DEPTNO DNAME LOC ---------- -------------- ------------- 20 Accounting Virginia,USA
- 备注
- FILLER_1 对应虚拟列名 filler指定该列为虚拟列
4)加载定界字符,指定字段位置加载
- 初始化
CJ@db11g>desc t Name Null? Type ----------------------------------------------------- -------- ------------------------------------ DEPTNO NOT NULL NUMBER(2) DNAME VARCHAR2(14) LOC VARCHAR2(13) LAST_UPTIME DATE ALL_LINE VARCHAR2(40)
- 控制文件
LOAD DATA INFILE * INTO TABLE t REPLACE ( DEPTNO position(1:2), DNAME position(*:16), LOC position(*:29), all_line position(1:29) ) BEGINDATA 101234567890 123456789,0
- 验证结果
SQL> select * from t; DEPTNO DNAME LOC ALL_LINE ---------- -------------- ------------- ---------------------------- 10 1234567890 123 456789,0 101234567890 123456789,0
备注
- LOC1 position(1:29) 表示loc1字段的位置开始于第一个字符到第29个字符
- LOC1 position(*+5:29) 表示前一个字段末位置加1,再加5开始到第29个字符注意 *+5不能大29(即前一字符不能大于后一字符)
5)使用日期,函数加载
- 初始化
SQL> desc t; Name Null? Type ----------------------------------------- -------- -------------- DEPTNO NOT NULL NUMBER(2) DNAME VARCHAR2(14) LOC VARCHAR2(13) LAST_UPTIME DATE ALL_LINE VARCHAR2(40)
- 控制文件
LOAD DATA INFILE * INTO TABLE t REPLACE FIELDS TERMINATED BY ',' TRAILING NULLCOLS (DEPTNO, DNAME "upper(:dname)", LOC "upper(:loc)", LAST_UPTIME date 'dd/mm/yyyy', ALL_LINE ":deptno||:dname||:loc||:last_uptime" ) BEGINDATA 10,Sales,Virginia,1/5/2000 20,Accounting,Virginia,21/6/1999 30,Consulting,Virginia,5/1/2000 40,Finance,Virginia,15/3/2001
- 验证结果
CJ@db11g>select * from t; DEPTNO DNAME LOC LAST_UPTI ALL_LINE ---------- -------------- ------------- --------- ---------------------------------------- 10 SALES VIRGINIA 01-MAY-00 10SalesVirginia1/5/2000 20 ACCOUNTING VIRGINIA 21-JUN-99 20AccountingVirginia21/6/1999 30 CONSULTING VIRGINIA 05-JAN-00 30ConsultingVirginia5/1/2000 40 FINANCE VIRGINIA 15-MAR-01 40FinanceVirginia15/3/2001
- 备注
- TRAILING NULLCOLS 如果不指定trailing nullcols 将导致加载失败,这是因为all_line列在加载数据中没有指定
- ALL_LINE ":deptno||:dname||:loc||:last_uptime" 这个地方同样可以指定自定义的函数如my_func(:loc)
6)合并多行记录为一行
- 初始化
SQL> desc t; Name Null? Type ----------------------------------------- -------- -------------- DEPTNO NOT NULL NUMBER(2) DNAME VARCHAR2(14) LOC VARCHAR2(13) LAST_UPTIME DATE ALL_LINE VARCHAR2(40)
- 控制文件
LOAD DATA INFILE * concatenate 3 INTO TABLE t replace FIELDS TERMINATED BY ',' ( DEPTNO, DNAME "upper(:dname)", LOC "upper(:loc)", LAST_UPTIME date 'dd/mm/yyyy' ) BEGINDATA 10, Sales,Virginia, 1/5/2000 10,Sales, //(不满足三行条件的会导入失败)
- 验证结果
CJ@db11g>select * from t; DEPTNO DNAME LOC LAST_UPTI ALL_LINE ---------- -------------- ------------- --------- ---------------------------------------- 10 SALES VIRGINIA 01-MAY-00
- 备注
- concatenate 3 表示合并三行为一行
- 10,Sales 不满足三行条件的会加载失败
7)使用特殊字符合并多行数据
- 初始化
SQL> desc t; Name Null? Type ----------------------------------------- -------- -------------- DEPTNO NOT NULL NUMBER(2) DNAME VARCHAR2(14) LOC VARCHAR2(13) LAST_UPTIME DATE
- 控制文件
LOAD DATA INFILE * continueif this(1:1) = '%' INTO TABLE t replace FIELDS TERMINATED BY ',' TRAILING NULLCOLS ( DEPTNO, DNAME "upper(:dname)", LOC "upper(:loc)", LAST_UPTIME date 'dd/mm/yyyy' ) BEGINDATA %10,Sales, Virginia,1/5/2000 %20, Finance,Virginia,13/04/2001 %30,Sales,Virginia, 01/5/2000 40,Usales,Newyork,13/04/2001
- 验证结果
CJ@db11g>select * from t; DEPTNO DNAME LOC LAST_UPTI ---------- -------------- ------------- --------- 10 SALES IRGINIA 01-MAY-00 20 INANCE VIRGINIA 13-APR-01 30 SALES VIRGINIA 01-MAY-00 0 USALES NEWYORK 13-APR-01
- 备注
- continueif this(1:1) = '%' 表示合并开始字符为’%’的行和它的下一行
- 合并行的开始部分必须有’%’才可以成功,sqlldr会截取每行的第一个字符判断,所以一定要保证输入正确的数据,参见官档中的示例:
Example 9-3 CONTINUEIF THIS Without the PRESERVE Parameter
Assume that you have physical records 14 bytes long and that a period represents a space:
%%aaaaaaaa....
%%bbbbbbbb....
..cccccccc....
%%dddddddddd..
%%eeeeeeeeee..
..ffffffffff..
In this example, the CONTINUEIF THIS clause does not use the PRESERVE parameter:
CONTINUEIF THIS (1:2) = '%%'
Therefore, the logical records are assembled as follows:
aaaaaaaa....bbbbbbbb....cccccccc....
dddddddddd..eeeeeeeeee..ffffffffff..
Note that columns 1 and 2 (for example, %% in physical record 1) are removed from the physical records when the logical records are assembled.
所有SQL*Loader实验笔记
实验案例总结(1-7): SQL*Loader实验笔记【一】
实验案例总结(8-13): SQL*Loader实验笔记【二】
实验案例总结(14-19): SQL*Loader实验笔记【三】