NLS DEBUGGING SUCCESS GUIDE (From NetWork)
This document is intended for WWSUP analysts and provides guidelines
and helpful examples on how to quickly and accurately identify and
diagnose technical issues related to NLS usage. NLS debugging topics are
divided into 5 sections: (A)Defining the NLS Environment, (B)Character
Set Conversion, (C)Data Formatting, (D)Migration, and (E)V6 NLS Issues.
A copy of this document can also be located on the RTSS Bulletin Board
#106319.922.
A. DEFINING THE NLS ENVIRONMENT
-------------------------------
A1. Verify Server NLS character encoding scheme:
A1a. What is the database version?
(Example, V7.0.16, V6.0.37)
A1b. What is the NLS release or porting kit version, if any?
(Example, A4.0.2, A3.7.2)
A1c. What is the character set of this database?
(Example, JA16EUC, US7ASCII)
For V6, find the LANGUAGE parameter in the init.ora file.
(Unix example, $ORACLE_HOME/dbs/init{ORACLE_SID}.ora)
For V7 there are two methods:
1. Use the SQL statement
SELECT USERENV ('LANGUAGE') FROM DUAL;
This returns the current session's language, territory and
database character encoding scheme in the format:
<language>_<territory>.<database character set>
AMERICAN_AMERICA.WE8ISO8859P1
The server session cannot display the client session's
character set as defined in NLS_LANG.
2. Query the Data Dictionary view NLS_DATABASE_PARAMETERS.
SELECT * FROM NLS_DATABASE_PARAMETERS;
The selection returns the values of the NLS parameters set
when the database was created.
PARAMETER VALUE
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CURRENCY $
NLS_ISO_CURRENCY AMERICA
NLS_NUMERIC _CHARACTERS .,
NLS_DATE_FORMAT DD-MON-YY
NLS_DATE_LANGUAGE AMERICAN
NLS_CHARACTERSET US7ASCII
NLS_SORT BINARY
Please note the above are default values. NLS_CHARACTERSET
(established at create database time) identifies the encoding
scheme used to store data. This specification should be
equivalent to or a superset of all client encoding schemes
that may connect to a given database.
A2. Does the user want to change the database character set (ie, allowed
NLS_CHARACTERSET to default to US7ASCII, NLS_CHARACTERSET is not a
superset of the client character sets)?
Once the database is created, the database character set cannot be
changed without re-creating the database. It is important to care-
fully consider which character set to use prior to install/create
database. If the user is willing to recreate the database, use
export/import as documented in D5.
A3. How can NLS parameters be defined?
NLS parameters can be defined as default parameters in the initialization
file (INIT.ORA), as an OS environment variable (NLS_LANG) to define
the client environment or altered dynamically within a session using
the ALTER SESSION command.
Please note, an ALTER SESSION statement is automatically executed
when a session connects to a database to override the values of
the INIT.ORA parameters NLS_LANGUAGE and NLS_TERRITORY to those
specified by the <language> and <territory> arguments of NLS_LANG.
If NLS_LANG is not defined, an ALTER SESSION statement is not executed
and NLS session values normally default to AMERICAN_AMERICA.US7ASCII.
A4. Which NLS parameters can be defined in the INIT.ORA?
In the INIT.ORA, additional NLS parameters can be explicitly defined
on a instance-wide basis. These parameters provide flexibility in
controlling specific localized (language-dependent) features.
They are:
NLS_DATE_FORMAT -date format
NLS_DATE_LANGUAGE -day and month names
NLS_NUMERIC_CHARACTERS -decimal character and group separator
NLS_CURRENCY -local currency symbol
NLS_ISO_CURRENCY -ISO currency symbol
NLS_SORT -sort sequence
These parameters can be explicitly overridden for a user session by
issuing 'ALTER SESSION SET NLS_......'. If specified, they override
the corresponding values defined implicitly by NLS_LANGUAGE and
NLS_TERRITORY.
A5. Which NLS parameters can be specificed as OS environment variables?
NLS_LANG is defined for each client session as an environment variable.
Different sessions connected to the same database can specify different
values. Prior to 7.2, NLS_LANG only overrides the default values
for the INIT.ORA parameters NLS_LANUGAGE and NLS_TERRITORY. These two
parameters implicitly set default values for the more granular NLS
parameters (ie, NLS_SORT, NLS_NUMERIC_CHARACTERS, NLS_CURRENCY).
As an example,
setenv NLS_LANG FRENCH_FRANCE.WE8ISO8859P1
SELECT * FROM NLS_SESSION_PARAMETERS;
PARAMETER VALUE
------------------------------------------
NLS_LANGUAGE FRENCH
NLS_TERRITORY FRANCE
NLS_CURRENCY F
NLS_ISO_CURRENCY FRANCE
NLS_NUMERIC_CHARACTERS ,.
NLS_DATE_FORMAT DD/MM/YY
NLS_DATE_LANGUAGE FRENCH
NLS_SORT FRENCH
ALTER SESSION SET NLS_LANGUAGE=GERMAN;
SELECT * FROM NLS_SESSION_PARAMETERS;
PARAMETER VALUE
------------------------------------------
NLS_LANGUAGE GERMAN
NLS_TERRITORY FRANCE
NLS_CURRENCY F
NLS_ISO_CURRENCY FRANCE
NLS_NUMERIC_CHARACTERS ,.
NLS_DATE_FORMAT DD/MM/YY
NLS_DATE_LANGUAGE GERMAN
NLS_SORT GERMAN
Note the explicit change of NLS_LANGUAGE implicitly altered
the values of NLS_DATE_LANGUAGE and NLS_SORT.
In 7.2, the following NLS parameters (in addition to NLS_LANG)
can be specified as environment variables:
NLS_DATE_FORMAT
NLS_DATE_LANGUAGE
NLS_SORT
NLS_NUMERIC_CHARACTERS
NLS_CURRENCY
NLS_ISO_CURRENCY
Refer to bug 18325 for more details.
A6. How is NLS_LANG defined?
The NLS_LANG environment variable is composed of three components in
the form:
NLS_LANG = <language>_<territory>.<character set>
= JAPANESE_JAPAN.JA16EUC
= FRENCH_CANADA.WE8DEC
where <language> specifies language used for Oracle
messages, names of day and month.
<territory> specifies default date format, decimal
character used for numbers and currency
symbol.
<character set> specifies the client-side character set.
Note <language> and <territory> are equivalent to the INIT.ORA
NLS_LANGUAGE and NLS_TERRITORY parameters and can be used to override
these instance-wide values for a given session.
A7. Does the user want to dynamically change the NLS settings for a
given session?
The various NLS parameters can be overridden for a given session
with:
ALTER SESSION SET NLS_LANGUAGE=FRENCH
NLS_TERRITORY=FRANCE
NLS_DATE_LANGUAGE=FRENCH
A8. Does the user understand the difference between NLS_DATABASE_PARAMETERS,
NLS_INSTANCE_PARAMETERS and NLS_SESSION_PARAMETERS?
The values displayed for the data dictionary view NLS_DATABASE_PARAMETERS
indicate the NLS parameter values in use at the time of database
creation. They are used as the default NLS parameters (if any are
required) when evaluating integrity constraints. NLS_INSTANCE_PARAMETERS
display the default NLS parameters for the instance. These values
represent the default NLS values for the database session when the
client-side NLS_LANG environment variable is not defined.
These values are only effective for the database server environment
and do not define the client-side environment. The data dictionary
view NLS_SESSION_PARAMETERS displays the NLS values established for
that session whether they have been overridden or not. Thus, if
the user has not defined NLS_LANG nor issued an 'ALTER SESSION SET
NLS_...' statement, NLS_SESSION_PARAMETERS will be equivalent to
NLS_INSTANCE_PARAMETERS. Note, NLS_DATABASE_PARAMETERS
contains the parameter NLS_CHARACTERSET and identifies the character
encoding scheme used to define the database. NLS_SESSION_PARAMETERS
does not contain this parameter nor does it capture the character set
identified in NLS_LANG. In addition, NLS_DATABASE_PARAMETERS cannot
be changed after the database is created. These parameters are used
to process constraints.
A9. How was the client-side NLS_LANG parameter set?
Unix:
Bourne Shell
NLS_LANG=FRENCH_FRANCE.WE8ISO8859P1
export NLS_LANG
or
C Shell
setenv NLS_LANG FRENCH_FRANCE.WE8ISO8859P1
VMS:
define NLS_LANG FRENCH_FRANCE.WE8ISO8859P1
A10. Verify specified client-side NLS_LANG parameter:
A10a. For V6, what is the environment variable LANGUAGE set to?
Unix:
echo $LANGUAGE
env LANGUAGE
VMS:
show logical LANGUAGE
AMERICAN_AMERICA.US7ASCII
DUTCH_THE NETHERLANDS.WE8ISO8859P1
CANADIAN FRENCH_FRANCE.WE8ISO8859P1
A10b. For V7, what is the environment variable NLS_LANG set to?
Unix:
echo $NLS_LANG
env NLS_LANG
VMS:
show logical NLS_LANG
A10c. Is NLS_LANG set to an invalid specification?
Currently there is no direct way of confirming valid values.
Validity can be checked indirectly:
For <language> and <territory>, use
ALTER SESSION SET NLS_LANGUAGE=<language>;
ALTER SESSION SET NLS_TERRITORY=<territory>;
If an ORA-12705 error message is returned, the specified value
is invalid.
For <character set> use the CONVERT function:
SELECT CONVERT('a','<character set>','us7ascii') from dual;
If an ORA-01482 error message is returned, the <character set>
specification is invalid. This procedure is dependent on whether
the <character set> is supported by the customer's machine.
A10d. What are the possible reasons for invalid values?
--value is misspelled
--value is not valid for customer's platform
--syntax is incorrect
Note some language and territory names use more than one word
separated by a space. In these cases, double quotes should be
used. For example on UNIX,
$ NLS_LANG="CANADIAN FRENCH_FRANCE.WE8ISO8859P1"
$ NLS_LANG="DUTCH_THE NETHERLANDS.WE8ISO8859P1"
A10e. In the configuration file for MS-WINDOWS (ORACLE.INI) and
MACINTOSH (CONFIG.ORA), have quotes incorrectly been placed
around language/territory specifications which include
spaces (ie,BRAZILIAN PORTUGUESE)?
When NLS_LANG is placed in ORACLE.INI, quotes are 'not' necessary
for language/territory specifications which include spaces.
However, quotes are necessary when NLS_LANGUAGE with spaces is
specified in the INIT.ORA.
A10f. Have the values for the NLS parameters, as determined by NLS_LANG,
been correctly defined for the user session?
SELECT * FROM NLS_SESSION_PARAMETERS;
PARAMETER VALUE
NLS_LANGUAGE DUTCH
NLS_TERRITORY THE NETHERLANDS
NLS_CURRENCY f
NLS_ISO_CURRENCY THE NETHERLANDS
NLS_NUMERIC _CHARACTERS .,
NLS_DATE_FORMAT DD-MM-YY
NLS_DATE_LANGUAGE DUTCH
NLS_SORT DUTCH
Note this query does not return the client-side character
set encoding scheme since this information is not recorded
at the server side.
A11.What is NLS_LANG default behavior?
The application (client) NLS environment is defined by NLS_LANG, and if
NLS_LANG is not specified the following default occurs:
If <language> is invalid or not specified, it defaults to American.
If <territory is invalid or not specified, the default specified
by <language> is used.
If <character set> is invalid or not specified, the default
specified by <language> is used.
This typically results in American_America.US7ASCII.
If NLS_LANG is not specified on the client side and is allowed
to default it has no effect on the server environment. The INIT.ORA
parameters are in effect. Thus database server messages will appear
in the language specified by the INIT.ORA parameter NLS_LANGUAGE
yet application type messages will appear in American.
However, if INIT.ORA NLS parameters aren't specified the NLS_LANG
defaults for language and territory are used.
Please note the setting of NLS_LANG ensures both server and application
NLS environments are configured equally.
If the above values default, there is no warning or error message
and logon proceeds normally. In V7.1, an error message will be
displayed to indicate an invalid value. However in Oracle7 it is
important that NLS_LANG always be specified in full and checked for
valid values.
A12.What are the effects of specifying NLS_LANG in the client session?
This controls, for example, the default formatting of numbers and
dates. For example, NLS_LANG was originally defined as
$setenv NLS_LANG ITALIAN_ITALY.WE8ISO8859P1
SELECT ENAME,HIREDATE,ROUND(SAL/12,2) SAL FROM EMP;
ENAME HIREDATE SAL
---------- --------- ----------
Clark 09-DIC-88 4195,83
Turner 17-FEB-87 5416,67
Miller 23-MAR-87 4366,67
Mller 01-APR-89 3795,83
Hscht 10-MAG-90 2933,33
Hlne 01-NOV-91 4066,67
Then
ALTER SESSION SET NLS_DATE_LANGUAGE=GERMAN
NLS_DATE_FORMAT='DD.MON.YYYY'
NLS_NUMERIC_CHARACTERS='.,';
SELECT ENAME,HIREDATE,ROUND(SAL/12,2) SAL FROM EMP;
ENAME HIREDATE SAL
---------- --------- ----------
Clark 09.DEZ.1988 4,195.83
Turner 17.FEB.1987 5,416.67
Miller 23.MER.1987 4,366.67
Mller 01.APR.1989 3,795.83
Hscht 10.MAI.1990 2,933.33
Hlne 01.NOV.1991 4,066.67
A13.During an existing user session, was the NLS environment correctly
changed?
The NLS environment can also be changed during a session using the
ALTER SESSION statement
ALTER SESSION SET <nls_parameter> = <value>
ALTER SESSION SET NLS_DATE_LANGUAGE = FRENCH
NLS_ISO_CURRENCY = FRANCE;
SELECT TO_CHAR(COST,'C99G999D99') FROM ORDERS
WHERE CUSTNO=586;
COST
------------
FRF12.673,49
B. CHARACTER SET CONVERSION
---------------------------
B1. Is the character set defined for the database equivalent or a superset
of the client-side character set defined by NLS_LANG?
The database encoding scheme is defined by CREATE DATABASE and is
fixed for the life of the database. A client encoding scheme is
defined by NLS_LANG on a per-session basis. Data is automatically
converted between client and server if their character sets differ.
However, the database encoding scheme should be equivalent to or a
superset of all client encoding schemes that may connect to a given
database. If not, characters may not convert correctly or may display
with a replacement character symbol.
When feasible, a suitable alternate character will be assigned
during conversion. If not a replacement character is used. A
specific replacement character can be defined for a 'non-defined'
character; otherwise a standard replacement character is used.
In the case of US7ASCII, the default replacement character is '?'.
For example, the 7-bit American character set (US7ASCII) does not
contain an 'n tilde lowercase'. The 8-bit character set WE8PC850
does and this character is assigned a binary code of 164.
If the database is created in US7ASCII and the client's NLS_LANG
is defined as 'AMERICAN_AMERICA.WE8PC850', character set conversion
will occur on input from WE8PC850 to US7ASCII and on output from
US7ASCII to WE8PC850. An 'n tilde lowercase' cannot realistically be
converted from WE8PC850 to US7ASCII since US7ASCII does not define this
character as part of its encoding scheme. Instead the 'n tilde lowercase'
is replaced by a suitable alternate character which in this case is the
letter 'n'.
B2. During client/server character set conversion, are replacement
characters displayed?
For complete conversion, the target encoding scheme must contain all
characters in the source data. If not, alternate/replacement characters
are used. Specific replacement characters can be defined for the target
encoding scheme. Otherwise a default replacement character is used.
Conversion is not reversible when alternate/replacement characters are
used. In the following example ISO8859-1 is a 8-bit character set and
US7ASCII is 7-bit.
input output
ISO8859-1 US7ASCII ISO8859-1
(an acute) a --> a --> a
(sharp s) B --> ? --> ?
In the above example, a suitable alternate character 'a' was
defined in US7ASCII for 'a acute'. Since a specific replacement
character was not identified for 'sharp s', a standard default '?'
is used. Note for 'non-English-like' characters, conversion on
output is not reversible to the values originally input.
B3. Has character conversion been tested?
Character conversion can be tested using SQL DUMP and CHR functions
by specifying the numeric (binary) value of a character in a given
encoding scheme. This method provides an important testing
technique since a terminal using a designated character encoding scheme
is not necessary.
SELECT DUMP (CHR(x), '<target_char_set>','<source_char_set>')
FROM DUAL;
SELECT DUMP (CHR(246),'WE8ISO8859P1','WE8PC850') FROM DUAL;
This technique can also be used to test case conversion
SELECT DUMP(UPPER(CHR(231))) FROM DUAL;
and to directly insert and select character data independent of
character set conversion.
INSERT INTO <table> (<column>) VALUES (CHR(231)||CHR(232)||CHR(233));
SELECT DUMP (<column>) FROM <table>;
B4. Can character data be converted explicitly?
Yes. Applications can use the CONVERT function to explicitly convert
data from one encoding scheme to another.
CONVERT (<column>,<dest_char_set>,<src_char_set>)
where <dest_char_set> is the encoding scheme to convert data to
and <src_char_set> is the encoding scheme of the data to be con-
verted. For example,
SELECT CONVERT (ENAME,'WE8HPHP','WE8DEC') FROM EMP;
generates a report from data entered as DEC 8-bit character set for
output to a laserjet printer using HP Western European 8-bit
character set.
B5. Is the problem related to client vs. server?
Oracle products may perform language sensitive functions 'locally'.
For example, SQLFORMS directly uppercases character data input into
a field defined with the UPPERCASE attribute. It also directly
validates charcters placed into a CHAR or ALPHA field.
If there are problems with this type of case conversion and data
entry within SQLFORMS, then it is a client-side problem pointing
to an invalid codeset specified with NLS_LANG. Even if NLS_LANG
specifies a 7-bit character encoding set (ie,US7ASCII) 8-bit
characters can still be entered into the database. Character data
in 7-bit encoding schemes is 8-bit but the 8th bit encoding space
is not used.
B6. Can the same problem be duplicated in an US7ASCII database with
NLS_LANG=AMERICAN_AMERICA.US7ASCII?
If yes, it may be an indication the problem is generic rather
than NLS specific.
B7. In situations where there isn't an US7ASCII database available,
can the problem be reproduced if the test data were in 7-bit
ASCII characters and the character set specification for NLS_LANG
is US7ASCII? If yes, it may be an indication the problem is generic
rather than NLS specific.
B8. Does the problem occur when the client's character set specification
in NLS_LANG is different than the database server's NLS_CHARACTERSET?
If so, change NLS_LANG to be equivalent to NLS_CHARACTERSET and
retest. This avoids character set conversion between client and
server. Please note this may not always be possible and is directly
dependent upon the character sets supported by the terminal hardware.
B9. Have the OS, terminal and printer been configured correctly to process
8-bit character data?
In addition to having the client character set correctly specified
by NLS_LANG, it is also necessary that the terminal/printer and
operating system software be defined appropriately to use the
desired encoding scheme. Note terminal setup can be affected by Oracle
Terminal definitions used by the Oracle tools. In addition, there
may be issues related to network software (TCP/IP) and terminal
emulators that may affect 8-bit character set processing.
In general, these settings can be validated if the characters
defined by the encoding scheme can be displayed by the terminal
at the operating system level. Please be aware there is no method
the Oracle software can use to validate a terminal has been physically
set to the correct character set as specified by NLS_LANG.
Confirm:
--Terminal and printer definitions are set to the correct character
encoding scheme.
--Oracle terminal definitions are defined correctly.
--Operating System has been defined to process 8-bit character data.
VMS:
$ SET TERM/EIGHT
UNIX:
% stty -istrip cs8 -parenb
B10.Was ASCII<-->EBCDIC character set conversion performed correctly?
V6 provides only limited support for heterogeneous client/server
environments where one side is ASCII-based and the other EBCDIC. In
these cases, correct EBCDIC/ASCII conversion is only performed for the
98 basic characters comprising the 'Oracle Universal Character Set'.
Other characters (referred to an 'non-English') are converted
according to an 'internal' coding scheme that only recovers these
characters correctly when retrieved from a client using the same
encoding scheme as used to create the data.
Thus, there is no problem accessing a V6 MVS (EBCDIC) database from
a V6 ASCII client as long as that particular ASCII client accesses
the MVS server directly. However it isn't possible for a V6 ASCII
client to access non-English character data created by a V6 MVS
client and vice versa.
Oracle7 uses a different data conversion methodology. The client
translates each source character (not byte) into a corresponding
'workbench character' code. The client then determines if the
server's designated encoding scheme also contains the same 'work-
bench character'. If so, the character is translated to the corre-
sponding server value. If not, a replacement character is inserted
into the output. The Oracle7 server will use the V6 method if a V6
client connects to the database. If an Oracle7 client connects to
an Oracle7 server, client and server adhere to the Oracle7 methodo-
logy even though the data may have been created under V6. Incompatible
translations may occur.
B11.Does the problem replicate on other Oracle tools (ie,sqlplus,sqldba)
than the tool reported?
Within SQLPLUS and SQLDBA, UPPER() and NLS_UPPER() convert correctly
from lower case to upper case for all 8-bit characters. However
with SQLFORMS3.0, if the field attribute UPPERCASE is set the
input data may be truncated to 7-bit and then converted to UPPER.
See bug 196402 for details.
C. DATA FORMATTING
------------------
C1. Are the text components of the end-user interface (ie,messages)
displayed incorrectly or not in the expected character set?
NLS uses the LMS message handler for text that comprises the
product end-user interface. LMS uses the following naming convention:
<product id><language id>.MSB
Language specific message files are created per product. For example,
ORAF.MSB identifies the kernel (ORA) French (F) error message file.
MSB files are in a special binary format and cannot be edited
directly. The messages in an MSB file are stored in one specific
character encoding scheme. Some platforms also support corresponding
MSG text files. The MSB file for the product in use is automatically
converted at runtime if the client specifies a different character
encoding scheme with NLS_LANG. Designated character encoding scheme/
product associations may differ between products. See bulletin
105626.837.
Check:
--Message files are not installed or are in the wrong file path.
If the expected message file cannot be located or opened, LMS
uses the US message file by default (if present) with no error
indication. On UNIX systems, message files are installed in
$ORACLE_HOME/<product>/mesg.
--Message has not been translated.
Rare and/or system error messages may be left in English. These
message are not intended for end users. Alternatively, the
absence of translation may be a result of a product error.
--Characters are not displayed correctly.
This implies a problem with the conversion of message text.
Try to 'dump' an MSB file using a terminal set to the character
encoding scheme used in the message file. Refer to RTSS bulletin
105626.837 for a listing of message file configurations.
Note RDBMS error messages are first converted into the database
character set if different from the encoding scheme used in the ORA
message files. Then subsequently into the client encoding scheme.
C2. If the database's encoding scheme is KO16KSCC5601 (Korean) and
the client's NLS_LANG' character set component is also set to
KO16KSCC5601, will client-side messages appear in Korean?
The 'language' component of NLS_LANG determines the language in
which messages will be displayed. In other words, character set
does not determine the language of messages.
C3. Has the ouput data been sorted correctly?
NLS provides two types of sorting: binary and linguistic. A binary
sort is based on binary values assigned within a given character
encoding scheme. A linguistic sort produces a 'localized' sort sequence
that adheres to specific linguistic conventions.
8-bit multi national encoding schemes that support several languages
(ie, ISO8859-1) cannot provide multiple localized linguistic sort
sequences. For appropriate sorting, the INIT.ORA parameter NLS_SORT
should be set to the desired linguistic sort sequence.
NLS_SORT=BINARY | <name>
Note <name> references a linguistic sort sequence and has no direct
connection to <language>.
NLS_SORT=GERMAN
NLS_SORT=WEST_EUROPEAN
NLS_SORT=BINARY
$setenv NLS_LANG GERMAN
SELECT LETTER FROM LETTERS ORDER BY LETTER;
LETTER
------
a
a (with umlat)
b
z
ALTER SESSION SET NLS_SORT=SWEDISH;
SELECT LETTER FROM LETTERS ORDER BY LETTER;
LETTER
------
a
b
z
a (umlat)
NLS_SORT is implicitly defined by NLS_LANGUAGE.
Refer to RTSS bulletins 105679.993 and 105629.958 for more detailed
information on linguistic sorting within Oracle7.
Note the NLS_LANG environment variable influences the NLS_SORT behavior
for the user session. NLS_SORT will be changed to the default value
as defined by the NLS_LANG <language> component and will supersede the
INIT.ORA NLS_SORT parameter. Currently, no additional NLS parameters
(except NLS_LANG) can be defined in the user environment. Refer to
enhancement request 183251 and A4.
C4. Can uppercase characters be sorted prior to lowercase characters?
You can specify any of the sorts irrespective of whether the data is
in ASCII or EBCDIC, as the linguitsic sorts are independent of the
actual character encoding scheme. NLS_SORT=WEST_EUROPEAN sorts
uppercase before lowercase.
C5. How does NLS in Oracle7 provide localized case conversion?
The functions NLS_UPPER, NLS_LOWER AND NLS_INITCAP provide this
functionality.
SELECT NLS_INITCAP ('ijsland','NLS_SORT=XDUTCH') CAP
FROM DUAL;
CAP
---
IJsland
C6. How does the function NLSSORT affect character comparison?
When using comparison operators, characters are compared according
to their binary values in the designated encoding scheme. A
character is greater than another if it has a higher binary value.
Since the binary sequence of characters may not match the linguistic
sequence for a particular language, such comparisons may not be
'linguistically correct'. The NLSSORT function allows such com-
parisons to reflect linguistic conventions.
$setenv NLS_LANG GERMAN
SELECT LETTER FROM LETTERS
WHERE NLSSORT(LETTER) > NLSSORT('b') ORDER BY LETTER;
LETTER
------
z
ALTER SESSION SET NLS_SORT=SWEDISH;
SELECT LETTER FROM LETTERS
WHERE NLSSORT(LETTER) > NLSSORT('b') ORDER BY LETTER;
LETTER
------
z
a (umlat)
Note NLSSORT has to be identified on both sides of the comparison
operator.
C7. How does NLS handle linguistic special cases like diagraphs (double
characters)?
Extended linguistic sorts are designed to accommodate language-specific
special cases involving diagraphs: a)sorting of diagraphs as a single
character and b) converting uppercase diagraphs into lowercase single
characters and vice versa. For example, the linguistic sort sequence
'XSpanish' specifies that the double characters 'ch' and 'll' are
sorted as single characters.
ALTER SESSION SET NLS_SORT=XSPANISH;
SELECT ENAME FROM EMP ORDER BY NAME:
ENAME
-----
Caldern
Lamb
Lowe
Llames
Martinez
C8. Are numbers and dates formatted as expected?
Language dependent features is controlled by defining the environment
variable NLS_LANG at the start of a user session. The syntax for
NLS_LANG has three independent components.
NLS_LANG = <language>_<territory>.<character set>
<language> specifies the values for language-dependent conventions:
- language for messages
- language for day and month names
- default sort sequence
<territory> specifies values for territory-dependent conventions:
- default date format
- decimal character
- group seperator
- local currency symbol
- ISO currency symbol
<character set> specifies the character encoding scheme of the end-users's
terminal.
Note <language> also defines default values for <territory> and
<character set> if omitted (see A10).
The following is an illustration of NLS default formatting of numbers
and dates using
setenv NLS_LANG AMERICAN_AMERICA.WE8ISO8859P1
SELECT ENAME,HIREDATE,ROUND(SAL/12,2) SAL FROM EMP;
ENAME HIREDATE SAL
---------- --------- ----------
Clark 09-DEC-88 4195.83
Turner 17-FEB-87 5416.67
Miller 23-MAR-87 4366.67
Mller 01-APR-89 3795.83
Hscht 10-MAY-90 2933.33
Hlne 01-NOV-91 4066.67
When specifying another language:
setenv NLS_LANG FRENCH_FRANCE.WE8ISO8859P1
SELECT ENAME,HIREDATE,ROUND(SAL/12,2) SAL FROM EMP;
ENAME HIREDATE SAL
---------- --------- ----------
Clark 09/12/88 4195,83
Turner 17/02/87 5416,67
Miller 23/03/87 4366,67
Mller 01/04/89 3795,83
Hscht 10/05/90 2933,33
Hlne 01/11/91 4066,67
C9. Is the customer aware of the change of behavior in the WW format
mask?
In V6 the WW format mask can return the ISO week number controlled
by <territory>. In Oracle7 WW always returns the non-ISO week number
and a new format mask, IW, returns the ISO week number.
C10.What are the date and number format masks provided by Oracle7
in the TO_CHAR and TO_DATE functions?
Formatting Dates
IW returns the ISO week number
I,IY,IYY ,IYYY return the year relating to the ISO week number
RM,rm returns the month as a Roman numeral
If sysdate were 1st Jan 1988:
SELECT TO_CHAR (SYSDATE,'IW,IYYY') TODAY FROM DUAL;
TODAY
-----
53 1987
Formatting Numbers
D returns the decimal character
G returns the group seperator
L returns the local currency symbol
C returns the international currency symbol
(as per ISO Specification 4217 1987-07-15)
RN,rn returns a number in its roman numeral equivalent
For example,
SELECT TO_CHAR(6,'rn') PAGENO FROM DUAL;
PAGENO
------
vi
Another example,
ALTER SESSION SET NLS_NUMERIC_CHARACTERS=',.';
SELECT ENAME,TO_CHAR(ROUND(SAL/12,2),'99G999D99') SAL
FROM EMP;
ENAME SAL
CLARK 4.195,83
HLNE 4.066,67
HSCHT 2.933,33
MILLER 4.366,67
The decimal character and group separator have to be different.
The syntax for NLS_NUMERIC_CHARACTERS is
NLS_NUMERIC_CHARACTERS = '<decimal character><group separator>'
C11.Do check constraints in Oracle7 result in the expected values?
In evaluating integrity constraints containing SQL statements with
NLS parameters as arguments (TO_CHAR, TO_DATE), the default values
for these parameters are taken from the NLS parameters set at data-
base creation. These values can be obtained from the data dictionary
view NLS_DATABASE_PARAMETERS. Dependance on default values can be
avoided by specifying NLS parameters explicitly.
C12.How are NLS parameters specified directly in SQL functions?
Specification of NLS parameters in SQL functions enables NLS
behavior to be independent of the current session's NLS settings.
This allows 'language-independent' integrity constraints, stored
procedures and views to be defined where these use string literals
containing dates and numbers.
TO_DATE('1-JAN-89','DD-MON-YY','NLS_DATE_LANGUAGE=AMERICAN')
D. MIGRATION
------------
D1. Is the customer experiencing V6 to Oracle7 database conversion
problems?
In V6, were inconsistent database character sets used?
An Oracle7 database is assumed to use one character encoding scheme
as specified by CREATE DATABASE. However a V6 database may have used
multiple encoding schemes since the init.ora LANGUAGE parameter
may have been set to multiple settings. Thus, a single database
character set concept was not enforced. If multiple encoding schemes
were used, data must be converted to a single encoding scheme
before an Oracle7 conversion. See D5 for information concerning
Export/Import.
If the V6 database is not converted to a single encoding scheme
prior to Oracle7 migration, client NLS environments with different
character set specifications may experience unpredictable results
when displaying data on output from the database. Correct character
set conversion from client to server cannot be guaranteed.
D2. Are there problems with the Migration Utility?
The Oracle7 Migration Utility uses the <character set> specified by
LANGUAGE in the V6 INIT.ORA file as the Oracle7 database encoding
scheme. As a result, this defined character set becomes the
Oracle7 database character set (NLS_CHARACTERSET). Thus, it is
essential the correct character set is specified in the INIT.ORA
parameter LANGUAGE prior to migration.
D3. Can MIGRATION be used to change the character set of a database when
converting from V6 to Oracle7?
No, there is no facility within the migration utility for converting
data to another character set.
D4. Can the user determine the character set of an Export file?
No. Export stores the character set ID (not the text string).
As a result, users may be unaware of which character set should be
specified for the Import's command line parameter 'CHARSET'.
This can result in data being incorrectly converted. See bugs 224161,
220349.
D5. Has the user exported/imported database or table(s) and is now
encountering character set conversion problems?
Confirm whether the export/import procedure was performed correctly:
--Export database using NLS_LANG to identify the source database's
character set. Export stores the character set ID (not the text
string) in the dump file.
--Establish NLS_LANG environment variable for Import session.
--Import reads the Export character set ID from the dump file and
compares it with the session's character set as defined in NLS_LANG.
--No conversion occurs if the Export's character set and the Import's
session character set are the same. If they are not the same,
conversion is performed from the Export character set to the Import's
session character set prior to the data being inserted into the
database.
--The Import's session character set should be a superset of the Export's
character set otherwise special characters will not be correctly
converted.
--Include the parameter 'CHARSET' when defining the Import parameter set.
CHARSET identifies the character set of the Export file. Currently
in V7, the code expects the value in CHARSET to match the Export's
file character set. If they do not match, IMP-42 will result.
The CHARSET option was developed to import older export files
which hadn't stored character set ID information.
--After the data has been converted to the Import's session character
set, it is then converted to the database character set if they
differ. The database character set should be a superset of the
Import's session character set otherwise special characters will not
be correctly converted.
It's important to note Import will do up to 2 character set conversions
depending on:
(a) character set of export file, (b)NLS_LANG of import session and
(c) character set of database.
D6. Did SQL*LOADER fail to interpret the character set of the input file?
SQL*LOADER has the capability to convert data from the input data
file's character set to the database character set if they differ.
The parameter CHARACTERSET in the control file identifies the
character set of the input data file.
SQLLDR CONTROL=<name> CHARACTERSET=WE8PC850
Note NLS_LANG must also be set to the data file's character set.
E. V6 NLS ISSUES
----------------
E1. Is the customer using 'V6' products (ie,SQLFORMS3.0,SQLMENU50,SRW)
with Oracle7?
These 'V6' products do not support the Oracle7 language parameter
NLS_LANG nor the new Oracle7 NLS features. For these products
the V6 NLS parameter LANGUAGE must be defined to set the
language environment for the product. In addition, NLS_LANG still
needs to be defined to establish the language environment for the
V7 database server.
E2. Has the V6 NLS 'LANGUAGE' parameter been defined correctly?
The NLS 'LANGUAGE' parameter is comprised of <language>_<territory>.
<character set> and is defined in the INIT.ORA file to specify
the server language environment. Note <character set> in this
regard specifies the database character encoding scheme. This
parameter is also defined in the user environment to identify the
client application environment in which <character set> now
pertains to the character encoding scheme of the user's terminal.
E3. Is the V6 NLS user viewing kernel messages displayed in a character
set different than specified in the 'user environment' LANGUAGE
parameter?
In V6, the init.ora NLS parameter LANGUAGE is applied instance-wide
to all sessions. It controls:
--Kernel RDBMS error messages
--Utility-specific messages (ie,SQLFORMS)
--Language used for month and day names
--Week and day number calculation
--Behavior of UPPER, LOWER, INITCAP and CONVERT
--Sort method used for an ORDER BY query.
This LANGUAGE parameter can also be specified as a user defined
environment variable/symbol to override certain effects of the
instance-wide LANGUAGE parameter. In this case, the value set
for LANGUAGE affects only utility-specific messages. Kernel RDBMS
messages are returned in the language and character set specified
in the instance-wide init.ora LANGUAGE parameter.
In conclusion, ask the user if they have read the bulletins stored
under the NLS category on the RTSS Bulletin Board?