Original link: http://www.eygle.com/special/nls_character_set_06.htm Original published in ITPUB Technology Series "Oracle Database DBA Special Technology Jing" Usually in our real-world environment, there are 3 character set settings. First: Client Application Character Set 2: Client NLS_LANG Parameter Settings Third: Server, Database Character Sets We said, a character is applying in client application (such as SQLPlus, cmd , NOTEPAD, etc., how to display what kind of character display depends on the client operating system, how to display these characters in the app, as for these characters to store normally in the database, and additional Two character set settings are closely related. During transmission, client NLS_LANG is primarily used to perform conversion judgment If NLS_LANG is equal to the database character set, no conversion directly inserts the character into the database if the conversion is converted, and the conversion has two tasks.
If there is a corresponding relationship, the corresponding binary coding is mapped (after this step mapping, the character may be converted) is passed to the database if there is no corresponding relationship, then pass a replacement character (many platforms?)
Database character set, at different times and client NLS_LANG, the character passed through NLS_LANG conversion will be further processed
For? (Ie, there is no corresponding relationship) directly? The form is stored in the database for other characters, and the conversion is converted between the NLS_LANG and the database character set.
Let's take a look at the most common character set and garbled: 1. When the NLS_LANG character set is different from the database character set, the NLS_LANG is different from the Server terminal character set setting in this case, there are two possibilities:
The character entered by the client does not have a corresponding character in the NLS_LANG, which cannot be converted, NLS_LANG uses replacement characters to replace these unable mapped characters (this step is completed in TTS), in many character sets this alternative character is "?" When the client's characters correspond to different characters in the NLS_LANG, the transition is transmitted to the database, stored characters, but the metadata has been lost, and the characters in the database are no longer on behalf of the client. And this process is irreversible, which is why many times the normal code is entered in the client, and the unknown character will be obtained after the query.
We briefly explain this process by the above picture, enter the euro symbol: €, when the client is in the WE8ISO8859P15 character set, then the client NLS_LANG and the database end character set are different, the first conversion, the client € symbol code is A4, when the NLS_LANG conversion, A4 corresponds to '¤' in NLS_LANG ', which conversion has generated an error mapping. Since the database character set is different from NLS_LANG settings, further transformations occur, the encoding of the database is stored into C2A4, although the correct conversion is performed with NLS_lang, but the data entered by the client has been damaged or lost. We can use our familiar characters to do a simple test: Test environment: client application is Chinese 18030 character set NLS_LANG set to US7ASCII character set database Character set is zhs16GBK
C: /> set nls_lang = american_america.us7ascii
C: /> SQLPlus Eygle / Eygle
SQL * Plus: Release 9.2.0.4.0 - Production on Tue Nov 4 01:19:57 2003
Copyright (C) 1982, 2002, Oracle Corporation. All Rights Reserved.Connected TO:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS
JServer Release 9.2.0.4.0 - Production
SQL> INSERT INTO TEST VALUES ('Test');
1 row created.
SQL> SELECT NAME, DUMP (NAME) from test;
Name dump (name)
--------------------------------------------------
2bjt Typ = 1 LEN = 4: 50, 98, 74, 84
At this time, we found that the query is the chaotic character, we convert these characters to 2 credits.
110010 1100010 1001010 1010100
Built out 8 is 00110010 01100010 01001010 01010100
We replace the first one 10110010 11100010 11001010 11010100
Let's see the right store: c: /> set nls_lang = American_america.zHS16GBK
C: /> SQLPlus Eygle / Eygle
SQL * Plus: Release 9.2.0.4.0 - Production on Tue Nov 4 01:40:18 2003
CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.
Connected to: Oracle9i Enterprise Edition Release 9.2.0.4.0 - Productionwith The Partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONSERRERVER Release 9.2.0.4.0 - Production
SQL> INSERT INTO TEST VALUES ('Test');
1 row created.
SQL> Col Dump (Name) for A30SQL> SELECT NAME, DUMP (NAME) from test;
Name dump (name) ---------- ------------------------------ Test TYP = 1 LEN = 4: 178, 226, 202, 212
1 row selected.
We convert this result to 2 credits 10110010 11100010 11001010 11010100
This result is the result of our first completion of 1.
This test indicates that the first one is removed when the US7ASCII is converted in Chinese, so that the metadata is lost, resulting in garbled, the conversion of NLS_LANG can be added!
3. In this case, the NLS_LANG and Database character set is in this case, the database end is not converted to the client (this can improve performance), and directly store the database, then there is the same problem in this time, if The character set passed by the client can be stored correctly in the database. If not, it will be replaced by replacement character? It is yet this.
As shown in the figure above, when NLS_LANG and database character set settings are UTF8, the encoding A4 of the client's euro symbol will not be inserted into the database through any conversion, and in the database of UTF8, A4 represents an illegal character. Let's take a simple test
test environment:
Client character set application is Chinese GB18030
Client NLS_LANG is US7ASCII
The database character set is US7ASCII. We know this time, the data stored, the database does not make any conversion, in the following tests, we see that Chinese is displayed correctly under the US7ASCII character set.
C: /> set nls_lang = american_america.us7ascii
C: /> SQLPlus Eygle / Eygle
SQL * Plus: Release 9.2.0.4.0 - Production on Tue Nov 4 01:02:04 2003
CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS
JServer Release 9.2.0.4.0 - Production
SQL> INSERT INTO TEST VALUES ('Test');
1 row created.
SQL> commit;
COMMIT COMPLETE.
SQL> SELECT *.
Name
------------
test
1 row selected.
SQL> Col Dump (Name) for A30
SQL> SELECT NAME, DUMP (NAME) from test;
Name dump (name)
---------- ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Test TYP = 1 LEN = 4: 178, 226, 202, 212
1 row selected.
SQL> Select * from NLS_DATABASE_PARETERS;
Parameter Value
------------------------------------------------------------------------------------------------------------------------------------------------------------------ --------------------
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CURRENCY $
NLS_ISO_CURRENCY AMERICA
NLS_NUMERIC_CHARACTERS.,
NLS_CharacterSet US7ASCII
NLS_CALENDAR GREGORIAN
NLS_DATE_FORMAT DD-MON-RR
NLS_DATE_LANGUAGE AMERICAN
NLS_SORT BINARY
NLS_TIME_FORMAT HH.MI.SSXFF AM
Parameter Value
------------------------------------------------------------------------------------------------------------------------------------------------------------------ -------------------- NLS_TimeStamp_Format DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
NLS_TimeStamp_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY $
NLS_Comp Binary
NLS_LENGTH_SEMANTICS BYTE
NLS_NCHAR_CONV_EXCP FALSE
NLS_NCHAR_CHARACTERSET AL16UTF16
NLS_RDBMS_VERSION 9.2.0.4.0
20 rows selected.
SQL>
Conclusion:
For DBA, there is a very important principle that don't place your database in dangerous situations!
This requires us to do a valid backup before performing any operations that may change the database structure, and many DBA has a painful lesson in the operation of the backup.