SQL*Loader之CASE11
CASE11
1. SQL脚本
[oracle@node3 ulcase]$ cat ulcase11.sql
set termout off rem host write sys$output "Building demonstration tables for case study 11. Please wait" drop table emp; create table emp (empno number(4) not null, ename char(10), job char(9), mgr number(4), hiredate date, sal number(7,2), comm number(7,2), deptno number(2), projno number, loadseq number); exit
2. 控制文件
[oracle@node3 ulcase]$ cat ulcase11.ctl
-- Copyright (c) 1991, 2004 Oracle. All rights reserved. -- NAME -- ulcase11.ctl - SQL*Loader Case Study 11: Load Data in the Unicode -- Character Set UTF-16 -- -- DESCRIPTION -- This case study demonstrates the following: -- Using SQL*Loader to load data in the Unicode character set, UTF16. -- -- Using SQL*Loader to load data in a fixed-width, multibyte character set. -- -- Using character-length semantics. -- -- Using SQL*Loader to load data in little-endian byte order. SQL*Loader -- checks the byte order of the system on which it is running. If necessary, -- SQL*Loader swaps the byte order of the data to ensure that any -- byte-order-dependent data is correctly loaded. -- -- TO RUN THIS CASE STUDY: -- 1. Before executing this control file, log in to SQL*Plus as -- scott/tiger. Enter @ulcase11 to execute the SQL script for -- this case study. This prepares and populates tables and -- then returns you to the system prompt. -- -- 2. At the system prompt, invoke the case study as follows: -- sqlldr USERID=scott/tiger CONTROL=ulcase11.ctl LOG=ulcase11.log -- -- NOTES ABOUT THIS CONTROL FILE -- -- The character set specified with the CHARACTERSET keyword is UTF16. -- SQL*Loader will convert the data from the UTF16 character set to -- the database character set. Because UTF16 is specified as the -- character set, character-length semantics are used for the load. -- -- BYTEORDER LITTLE tells SQL*Loader that the data in the datafile is -- in little-endian byte order. SQL*Loader checks the byte order of the -- system on which it is running to determine if any byte-swapping is -- necessary. In this example, all the character data in UTF16 is -- byte-order dependent. -- -- The TERMINATED BY and OPTIONALLY ENCLOSED BY clauses both specify -- hexadecimal strings. The X'002c' is the encoding for a comma (,) in -- UTF-16 big-endian format. The X'0022' is the encoding for a double -- quotation mark (") in big-endian format. Because the datafile is in -- little-endian format, SQL*Loader swaps the bytes before checking for -- a match. If these clauses were specified as character strings instead -- of hexadecimal strings, SQL*Loader would convert the strings to the -- datafile character set (UTF16) and byte-swap as needed before checking -- for a match. -- -- Because character-length semantics are used, the maximum length for -- the empno, hiredate, and deptno fields is interpreted as characters, -- not bytes. -- -- The TERMINATED BY clause for the deptno field is specified using the -- character string ":". SQL*Loader converts the string to the datafile -- character set (UTF16) and byte-swaps as needed before checking for a match. LOAD DATA CHARACTERSET utf16 BYTEORDER little INFILE ulcase11.dat REPLACE INTO TABLE EMP FIELDS TERMINATED BY X'002c' OPTIONALLY ENCLOSED BY X'0022' (empno integer external (5), ename, job, mgr, hiredate DATE(20) "DD-Month-YYYY", sal, comm, deptno CHAR(5) TERMINATED BY ":", projno, loadseq SEQUENCE(MAX,1) )
3. 数据文件
数据文件因为是UTF16编码,在文本文件中显示为乱码,在这里不贴出。
执行后结果:
[oracle@node3 ulcase]$ sqlplus scott/tiger @ulcase11.sql
[oracle@node3 ulcase]$ sqlldr userid=scott/tiger control=ulcase11.ctl
SQL> select * from emp; EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO PROJNO LOADSEQ ----- ------ --------- ----- --------- ------- ----- ------ ------ ---------- 7782 Clark Manager 7839 09-JUN-81 2573 10 101 1 7839 King President 17-NOV-81 5500 10 102 2 7934 Miller Clerk 7782 23-JAN-82 920 10 102 3 7566 Jones Manager 7839 02-APR-81 3124 20 101 4
7499 Allen Salesman 7698 20-FEB-81 1600 300 30 103 5 7654 Martin Salesman 7698 28-SEP-81 1313 1400 30 103 6 7658 Chan Analyst 7566 03-MAY-82 3450 20 101 7 7 rows selected.
查看一下日志文件:
[oracle@node3 ulcase]$ cat ulcase11.log
SQL*Loader: Release 11.2.0.1.0 - Production on Fri Sep 19 04:13:59 2014 Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved. Control File: ulcase11.ctl Character Set utf16 specified for all input. Using character length semantics. Byteorder little endian specified. Data File: ulcase11.dat Bad File: ulcase11.bad Discard File: none specified (Allow all discards) Number to load: ALL Number to skip: 0 Errors allowed: 50 Bind array: 64 rows, maximum of 256000 bytes Continuation: none specified Path used: Conventional Table EMP, loaded from every logical record. Insert option in effect for this table: REPLACE Column Name Position Len Term Encl Datatype ------------------------------ ---------- ----- ---- ---- --------------------- EMPNO FIRST 10 , O(") CHARACTER ENAME NEXT * , O(") CHARACTER JOB NEXT * , O(") CHARACTER MGR NEXT * , O(") CHARACTER HIREDATE NEXT 40 , O(") DATE DD-Month-YYYY SAL NEXT * , O(") CHARACTER COMM NEXT * , O(") CHARACTER DEPTNO NEXT 10 : O(") CHARACTER PROJNO NEXT * , O(") CHARACTER LOADSEQ SEQUENCE (MAX, 1) Table EMP: 7 Rows successfully loaded. 0 Rows not loaded due to data errors. 0 Rows not loaded because all WHEN clauses were failed. 0 Rows not loaded because all fields were null. Space allocated for bind array: 104768 bytes(64 rows) Read buffer bytes: 1048576 Total logical records skipped: 0 Total logical records read: 7 Total logical records rejected: 0 Total logical records discarded: 0 Run began on Fri Sep 19 04:13:59 2014 Run ended on Fri Sep 19 04:13:59 2014 Elapsed time was: 00:00:00.41 CPU time was: 00:00:00.24
总结:在本例中
1> CHARACTERSET utf16指数据文件的字符集是utf16
2> BYTEORDER little指the data in the datafile is in little-endian byte order。