Preliminary discussion of character set issues (6) ----- garbled

xiaoxiao2021-03-06  82

Original link: http://www.eygle.com/special/nls_character_set_06.htm Original published in ITPUB Technology Series "Oracle Database DBA Special Technology Jing" Usually in our real-world environment, there are 3 character set settings. First: Client Application Character Set 2: Client NLS_LANG Parameter Settings Third: Server, Database Character Sets We said, a character is applying in client application (such as SQLPlus, cmd , NOTEPAD, etc., how to display what kind of character display depends on the client operating system, how to display these characters in the app, as for these characters to store normally in the database, and additional Two character set settings are closely related. During transmission, client NLS_LANG is primarily used to perform conversion judgment If NLS_LANG is equal to the database character set, no conversion directly inserts the character into the database if the conversion is converted, and the conversion has two tasks.

If there is a corresponding relationship, the corresponding binary coding is mapped (after this step mapping, the character may be converted) is passed to the database if there is no corresponding relationship, then pass a replacement character (many platforms?)

Database character set, at different times and client NLS_LANG, the character passed through NLS_LANG conversion will be further processed

For? (Ie, there is no corresponding relationship) directly? The form is stored in the database for other characters, and the conversion is converted between the NLS_LANG and the database character set.

Let's take a look at the most common character set and garbled: 1. When the NLS_LANG character set is different from the database character set, the NLS_LANG is different from the Server terminal character set setting in this case, there are two possibilities:

The character entered by the client does not have a corresponding character in the NLS_LANG, which cannot be converted, NLS_LANG uses replacement characters to replace these unable mapped characters (this step is completed in TTS), in many character sets this alternative character is "?" When the client's characters correspond to different characters in the NLS_LANG, the transition is transmitted to the database, stored characters, but the metadata has been lost, and the characters in the database are no longer on behalf of the client. And this process is irreversible, which is why many times the normal code is entered in the client, and the unknown character will be obtained after the query.

We briefly explain this process by the above picture, enter the euro symbol: €, when the client is in the WE8ISO8859P15 character set, then the client NLS_LANG and the database end character set are different, the first conversion, the client € symbol code is A4, when the NLS_LANG conversion, A4 corresponds to '¤' in NLS_LANG ', which conversion has generated an error mapping. Since the database character set is different from NLS_LANG settings, further transformations occur, the encoding of the database is stored into C2A4, although the correct conversion is performed with NLS_lang, but the data entered by the client has been damaged or lost. We can use our familiar characters to do a simple test: Test environment: client application is Chinese 18030 character set NLS_LANG set to US7ASCII character set database Character set is zhs16GBK

C: /> set nls_lang = american_america.us7ascii

C: /> SQLPlus Eygle / Eygle

SQL * Plus: Release 9.2.0.4.0 - Production on Tue Nov 4 01:19:57 2003

Copyright (C) 1982, 2002, Oracle Corporation. All Rights Reserved.Connected TO:

Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

SQL> INSERT INTO TEST VALUES ('Test');

1 row created.

SQL> SELECT NAME, DUMP (NAME) from test;

Name dump (name)

--------------------------------------------------

2bjt Typ = 1 LEN = 4: 50, 98, 74, 84

At this time, we found that the query is the chaotic character, we convert these characters to 2 credits.

110010 1100010 1001010 1010100

Built out 8 is 00110010 01100010 01001010 01010100

We replace the first one 10110010 11100010 11001010 11010100

Let's see the right store: c: /> set nls_lang = American_america.zHS16GBK

C: /> SQLPlus Eygle / Eygle

SQL * Plus: Release 9.2.0.4.0 - Production on Tue Nov 4 01:40:18 2003

CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.

Connected to: Oracle9i Enterprise Edition Release 9.2.0.4.0 - Productionwith The Partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONSERRERVER Release 9.2.0.4.0 - Production

SQL> INSERT INTO TEST VALUES ('Test');

1 row created.

SQL> Col ​​Dump (Name) for A30SQL> SELECT NAME, DUMP (NAME) from test;

Name dump (name) ---------- ------------------------------ Test TYP = 1 LEN = 4: 178, 226, 202, 212

1 row selected.

We convert this result to 2 credits 10110010 11100010 11001010 11010100

This result is the result of our first completion of 1.

This test indicates that the first one is removed when the US7ASCII is converted in Chinese, so that the metadata is lost, resulting in garbled, the conversion of NLS_LANG can be added!

3. In this case, the NLS_LANG and Database character set is in this case, the database end is not converted to the client (this can improve performance), and directly store the database, then there is the same problem in this time, if The character set passed by the client can be stored correctly in the database. If not, it will be replaced by replacement character? It is yet this.

As shown in the figure above, when NLS_LANG and database character set settings are UTF8, the encoding A4 of the client's euro symbol will not be inserted into the database through any conversion, and in the database of UTF8, A4 represents an illegal character. Let's take a simple test

test environment:

Client character set application is Chinese GB18030

Client NLS_LANG is US7ASCII

The database character set is US7ASCII. We know this time, the data stored, the database does not make any conversion, in the following tests, we see that Chinese is displayed correctly under the US7ASCII character set.

C: /> set nls_lang = american_america.us7ascii

C: /> SQLPlus Eygle / Eygle

SQL * Plus: Release 9.2.0.4.0 - Production on Tue Nov 4 01:02:04 2003

CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.

Connected to:

Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

SQL> INSERT INTO TEST VALUES ('Test');

1 row created.

SQL> commit;

COMMIT COMPLETE.

SQL> SELECT *.

Name

------------

test

1 row selected.

SQL> Col ​​Dump (Name) for A30

SQL> SELECT NAME, DUMP (NAME) from test;

Name dump (name)

---------- ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Test TYP = 1 LEN = 4: 178, 226, 202, 212

1 row selected.

SQL> Select * from NLS_DATABASE_PARETERS;

Parameter Value

------------------------------------------------------------------------------------------------------------------------------------------------------------------ --------------------

NLS_LANGUAGE AMERICAN

NLS_TERRITORY AMERICA

NLS_CURRENCY $

NLS_ISO_CURRENCY AMERICA

NLS_NUMERIC_CHARACTERS.,

NLS_CharacterSet US7ASCII

NLS_CALENDAR GREGORIAN

NLS_DATE_FORMAT DD-MON-RR

NLS_DATE_LANGUAGE AMERICAN

NLS_SORT BINARY

NLS_TIME_FORMAT HH.MI.SSXFF AM

Parameter Value

------------------------------------------------------------------------------------------------------------------------------------------------------------------ -------------------- NLS_TimeStamp_Format DD-MON-RR HH.MI.SSXFF AM

NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR

NLS_TimeStamp_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR

NLS_DUAL_CURRENCY $

NLS_Comp Binary

NLS_LENGTH_SEMANTICS BYTE

NLS_NCHAR_CONV_EXCP FALSE

NLS_NCHAR_CHARACTERSET AL16UTF16

NLS_RDBMS_VERSION 9.2.0.4.0

20 rows selected.

SQL>

Conclusion:

For DBA, there is a very important principle that don't place your database in dangerous situations!

This requires us to do a valid backup before performing any operations that may change the database structure, and many DBA has a painful lesson in the operation of the backup.

转载请注明原文地址:https://www.9cbs.com/read-108577.html

New Post(0)