SQL*Loader之CASE11

CASE11

1. SQL脚本

[oracle@node3 ulcase]$ cat ulcase11.sql

set termout off

rem host write sys$output "Building demonstration tables for case study 11.  Please wait"

drop table emp;

create table emp
       (empno number(4) not null,
        ename char(10),
        job char(9),
        mgr number(4),
        hiredate date,
        sal number(7,2),
        comm number(7,2),
        deptno number(2),
    projno number,
    loadseq number);

exit

2. 控制文件

[oracle@node3 ulcase]$ cat ulcase11.ctl

-- Copyright (c) 1991, 2004 Oracle.  All rights reserved.
-- NAME
-- ulcase11.ctl - SQL*Loader Case Study 11: Load Data in the Unicode
--                Character Set UTF-16
--
-- DESCRIPTION
-- This case study demonstrates the following:
-- Using SQL*Loader to load data in the Unicode character set, UTF16.
--
-- Using SQL*Loader to load data in a fixed-width, multibyte character set.
--
-- Using character-length semantics.
--
-- Using SQL*Loader to load data in little-endian byte order. SQL*Loader 
-- checks the byte order of the system on which it is running. If necessary, 
-- SQL*Loader swaps the byte order of the data to ensure that any 
-- byte-order-dependent data is correctly loaded.
--
-- TO RUN THIS CASE STUDY:
-- 1. Before executing this control file, log in to SQL*Plus as
--    scott/tiger. Enter @ulcase11 to execute the SQL script for
--    this case study. This prepares and populates tables and
--    then returns you to the system prompt.
--      
-- 2. At the system prompt, invoke the case study as follows:
-- sqlldr USERID=scott/tiger CONTROL=ulcase11.ctl LOG=ulcase11.log
-- 
-- NOTES ABOUT THIS CONTROL FILE
--
-- The character set specified with the CHARACTERSET keyword is UTF16. 
-- SQL*Loader will convert the data from the UTF16 character set to 
-- the database character set. Because UTF16 is specified as the 
-- character set, character-length semantics are used for the load.
--
-- BYTEORDER LITTLE tells SQL*Loader that the data in the datafile is 
-- in little-endian byte order. SQL*Loader checks the byte order of the 
-- system on which it is running to determine if any byte-swapping is 
-- necessary. In this example, all the character data in UTF16 is 
-- byte-order dependent.
--
-- The TERMINATED BY and OPTIONALLY ENCLOSED BY clauses both specify 
-- hexadecimal strings. The X'002c' is the encoding for a comma (,) in 
-- UTF-16 big-endian format. The X'0022' is the encoding for a double 
-- quotation mark (") in big-endian format. Because the datafile is in 
-- little-endian format, SQL*Loader swaps the bytes before checking for 
-- a match. If these clauses were specified as character strings instead 
-- of hexadecimal strings, SQL*Loader would convert the strings to the 
-- datafile character set (UTF16) and byte-swap as needed before checking 
-- for a match.
--
-- Because character-length semantics are used, the maximum length for 
-- the empno, hiredate, and deptno fields is interpreted as characters, 
-- not bytes.
--
-- The TERMINATED BY clause for the deptno field is specified using the 
-- character string ":". SQL*Loader converts the string to the datafile 
-- character set (UTF16) and byte-swaps as needed before checking for a match.

LOAD DATA 
CHARACTERSET utf16
BYTEORDER little
INFILE ulcase11.dat
REPLACE

INTO TABLE EMP
FIELDS TERMINATED BY X'002c' OPTIONALLY ENCLOSED BY X'0022'
(empno integer external (5), ename, job, mgr,
 hiredate DATE(20) "DD-Month-YYYY",
 sal, comm,
 deptno   CHAR(5) TERMINATED BY ":",
 projno,
 loadseq  SEQUENCE(MAX,1) )    

3. 数据文件

数据文件因为是UTF16编码,在文本文件中显示为乱码,在这里不贴出。

执行后结果:

[oracle@node3 ulcase]$ sqlplus scott/tiger @ulcase11.sql

[oracle@node3 ulcase]$ sqlldr userid=scott/tiger control=ulcase11.ctl

SQL> select * from emp;

EMPNO ENAME  JOB       MGR HIREDATE       SAL    COMM  DEPTNO PROJNO    LOADSEQ
----- ------ --------- ----- --------- -------   ----- ------ ------ ----------
 7782 Clark  Manager   7839 09-JUN-81      2573          10    101        1

 7839 King   President      17-NOV-81      5500          10    102        2

 7934 Miller Clerk     7782 23-JAN-82       920          10    102        3

 7566 Jones  Manager   7839 02-APR-81      3124          20    101        4 

7499 Allen Salesman 7698 20-FEB-81 1600 300 30 103 5 7654 Martin Salesman 7698 28-SEP-81 1313 1400 30 103 6 7658 Chan Analyst 7566 03-MAY-82 3450 20 101 7 7 rows selected.

查看一下日志文件:

[oracle@node3 ulcase]$ cat ulcase11.log 

SQL*Loader: Release 11.2.0.1.0 - Production on Fri Sep 19 04:13:59 2014

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

Control File:   ulcase11.ctl
Character Set utf16 specified for all input.
Using character length semantics.
Byteorder little endian specified.

Data File:      ulcase11.dat
  Bad File:     ulcase11.bad
  Discard File:  none specified
 
 (Allow all discards)

Number to load: ALL
Number to skip: 0
Errors allowed: 50
Bind array:     64 rows, maximum of 256000 bytes
Continuation:    none specified
Path used:      Conventional

Table EMP, loaded from every logical record.
Insert option in effect for this table: REPLACE

   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
EMPNO                               FIRST    10   ,  O(") CHARACTER            
ENAME                                NEXT     *   ,  O(") CHARACTER            
JOB                                  NEXT     *   ,  O(") CHARACTER            
MGR                                  NEXT     *   ,  O(") CHARACTER            
HIREDATE                             NEXT    40   ,  O(") DATE DD-Month-YYYY   
SAL                                  NEXT     *   ,  O(") CHARACTER            
COMM                                 NEXT     *   ,  O(") CHARACTER            
DEPTNO                               NEXT    10   :  O(") CHARACTER            
PROJNO                               NEXT     *   ,  O(") CHARACTER            
LOADSEQ                                                   SEQUENCE (MAX, 1)


Table EMP:
  7 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.


Space allocated for bind array:                 104768 bytes(64 rows)
Read   buffer bytes: 1048576

Total logical records skipped:          0
Total logical records read:             7
Total logical records rejected:         0
Total logical records discarded:        0

Run began on Fri Sep 19 04:13:59 2014
Run ended on Fri Sep 19 04:13:59 2014

Elapsed time was:     00:00:00.41
CPU time was:         00:00:00.24

总结:在本例中

   1> CHARACTERSET utf16指数据文件的字符集是utf16

   2> BYTEORDER little指the data in the datafile is in little-endian byte order。

posted @ 2014-09-19 15:59  iVictor  阅读(517)  评论(0编辑  收藏  举报