Preliminary discussion of character sets (5)

xiaoxiao2021-03-06  49

Original link:

http://www.eygle.com/special/nls_character_set_05.htm

Originally published in the ITPUB Technology Series "Oracle Database DBA Special Technology Jing", which is strictly prohibited to reprint this article without permission.

We know that in the export file, we record the character set ID of the exported used, by viewing the 2nd, 3rd bytes of the export file header, we can find the 16-en-generated character set ID, on Windows, we can use UltraEdit Open the DMP file to see its export character set ::

On UNIX we can view:

CAT Expdat.dmp | OD -X | HEAD

Oracle provides standard functions, converts character set names and ID:

SQL> SELECT NLS_CHARSET_ID ('zhs16gbk') from dual;

NLS_CHARSET_ID ('zhs16GBK')

----------------------------

852

1 row selected.

SQL> SELECT NLS_CHARSET_NAME (852) from DUAL

NLS_CHAR

------------

ZHS16GBK

1 row selected.

Ten reformed conversion hexadecimal:

SQL> SELECT TO_CHAR ('852', 'XXXX') from Dual;

TO_CH

-----

354

1 row selected.

In response to the 2nd and 3rd bytes of the above figure, we know that the exported file character set is zhs16GBK.

Query the valid character set in the database can use the following script:

COL NLS_CHARSET_ID for 9999

COL NLS_CHARSET_NAME FOR A30

COL HEX_ID FOR A20

SELECT

NLS_CHARSET_ID (Value) NLS_CHARSET_ID,

Value NLS_Charset_name,

TO_CHAR (NLS_Charset_ID (Value), 'XXXX') HEX_ID

From V $ nls_valid_values

WHERE parameter = 'characterset'

ORDER BY NLS_CHARSET_ID (VALUE)

/

The output is like:

NLS_CHARSET_ID NLS_CHARSET_NAME HEX_ID --------------------------------------------- ---------- 1 US7ASCII 1 2 WE8DEC 2 3 WE8HP 3 4 US8PC437 4 5 WE8EBCDIC37 5 6 WE8EBCDIC500 6 7 WE8EBCDIC 1140 7 8 WE8EBCDIC285 8 ................ ... 850 ZHS16CGB231280 352 851 ZHS16MACCGB231280 353 852 ZHS16GBK 354 853 ZHS16DBCS 355 860 ZHT32EUC 35c 861 ZHT32SOPS 35d 862 ZHT16DBT 35e 863 ZHT32TRIS 35f 864 ZHT16DBCS 360 865 zht16big5 361 866 zht16ccdc 362 867 zht16mswin950 363 868 zht16hkscs 364 870 Al24utffss 366 871 UTF8 367 872 UTFE 368 ............................. .....

In many cases, when we conducted an import operation, you have left the source database. At this time, if the character set and export file of the target database are inconsistent, it is necessary to perform special processing. The following introduces several methods, mainly US7ASCII and ZHS16GBK Take advantage of

1. The source database character set is US7ASCII, exports the file character set to US7ASCII or zHS16GBK, the target database character set is zhs16GBK in Oracle92, we found that this export file cannot be properly imported into the Oracle9i database, this It may be because Oracle9i's coding scheme has changed a large change.

The following is a simple test we do, in which the file name rules are: S-Server, follow the Server character set C-Client, and then the client character set when the export operation

When importing, the client character set is set to the command line, which is limited to the space, and we omit some test procedures. For Oracle9ir2, our test results are US7ASCII character sets, no matter how conversion, you cannot correctly import the database of the zhs16GBK character set.

When the import operation is performed, if the character cannot be converted normally, the Oracle database will automatically use one "?" Instead, that is, encoding 63.

E: / NLS2> SET NLS_LANG = American_america.us7ascii

E: / NLS2> IMP EYGLE / EYGLE FILE = SUS7ASCII- CUS7ASCII.DMP fromUser = Eygle Touser = Eygle Tables = TEST

Import: Release 9.2.0.4.0 - Production on Mon Nov 3 17:14:39 2003

CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.

Connected to: Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

Export File Created by Export: V09.02.00 Via Conventional Path

Import Done in US7ASCII Character Set and Al16UTF16 Nchar Character Set

Import Server Uses ZHS16GBK Character Set (Possible Charset Conversion)

Importing Table "Test" 2 ROWS IMPORTED

Import Terminated SuccessFully without Warnings.

E: / NLS2> SQLPlus Eygle / Eygle

SQL * Plus: Release 9.2.0.4.0 - Production ON MON NOV 3 17:14:50 2003

CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.

Connected to:

Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

SQL> SELECT NAME, DUMP (NAME) from test;

Name dump (name)

-----------------------------

???? TYP = 1 LEN = 4: 63, 63, 63, 63

TEST TYP = 1 len = 4: 116, 101, 115, 116

2 rows selected.

SQL> EXIT

Disconnected from Oracle9i Enterprise Edition Release 9.2.0.4.0 - ProductionWith The Partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

E: / NLS2> SET NLS_LANG = American_america.zHS16GBK

E: / NLS2> IMP EYGLE / EYGLE FILE = SUS7ASCII- CUS7ASCII.DMP fromUser = Eygle Touser = EYGLE TABLES = TEST IGNORE = Y

Import: Release 9.2.0.4.0 - Production ON Mon Nov 3 17:15:28 2003

CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.

Connected to: Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

Export File Created by Export: V09.02.00 Via Conventional Path

Import done in zhs16gbk character set and al16utf16 nchar character set

Export Client Uses US7ASCII Character Set (Possible Charset Conversion)

Importing Table "Test" 2 ROWS IMPORTED

Import Terminated SuccessFully without Warnings.

E: / NLS2> SQLPlus Eygle / Eygle

SQL * Plus: Release 9.2.0.4.0 - Production ON MON NOV 3 17:15:34 2003

CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.

Connected to:

Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

SQL> SELECT NAME, DUMP (NAME) from test;

Name dump (name)

-------------------------------------------------- ------------------------------

???? TYP = 1 LEN = 4: 63, 63, 63, 63

TEST TYP = 1 len = 4: 116, 101, 115, 116

???? TYP = 1 LEN = 4: 63, 63, 63, 63

TEST TYP = 1 len = 4: 116, 101, 115, 116

4 Rows SELECTED.SQL> Drop Table Test;

Table Dropped.

SQL> EXIT

Disconnected from Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

E: / NLS2> SET NLS_LANG = American_america.zHS16GBK

E: / NLS2> IMP EYGLE / EYGLE FILE = SUS7ASCII-CZHS16GBK.DMP fromUser = Eygle Touser = Eygle Tables = Test Ignore = Y

Import: Release 9.2.0.4.0 - Production on Mon Nov 3 17:17:21 2003

CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.

Connected to: Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

Export File Created by Export: V09.02.00 Via Conventional Path

Import done in zhs16gbk character set and al16utf16 nchar character set

Importing Table "Test" 2 ROWS IMPORTED

Import Terminated SuccessFully without Warnings.

E: / NLS2> SQLPlus Eygle / Eygle

SQL * Plus: Release 9.2.0.4.0 - Production ON MON NOV 3 17:17:30 2003

CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.

Connected to:

Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

SQL> SELECT NAME, DUMP (NAME) from test;

Name dump (name)

----------------------------------------------

???? TYP = 1 LEN = 4: 63, 63, 63, 63

TEST TYP = 1 len = 4: 116, 101, 115, 116

2 rows selected.

SQL> EXIT

Disconnected from Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONSJSERVER RELEASE 9.2.0.4.0 - Production

E: / NLS2> SET NLS_LANG = American_america.us7ascii

E: / NLS2> IMP EYGLE / EYGLE FILE = SUS7ASCII-CZHS16GBK.DMP fromUser = Eygle Touser = Eygle Tables = Test Ignore = Y

Import: Release 9.2.0.4.0 - Production on Mon Nov 3 17:18:00 2003

CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.

Connected to: Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

Export File Created by Export: V09.02.00 Via Conventional Path

Import Done in US7ASCII Character Set and Al16UTF16 Nchar Character Set

Import Server Uses ZHS16GBK Character Set (Possible Charset Conversion)

Export Client Uses ZHS16GBK Character Set (Possible Charset Conversion)

Importing Table "Test" 2 ROWS IMPORTED

Import Terminated SuccessFully without Warnings.

E: / NLS2> SQLPlus Eygle / Eygle

SQL * Plus: Release 9.2.0.4.0 - Production on Mon Nov 3 17:18:08 2003

CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.

Connected to:

Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

SQL> SELECT NAME, DUMP (NAME) from test;

Name dump (name)

----------------------------------------

???? TYP = 1 LEN = 4: 63, 63, 63, 63

TEST TYP = 1 len = 4: 116, 101, 115, 116

???? TYP = 1 LEN = 4: 63, 63, 63, 63

TEST TYP = 1 len = 4: 116, 101, 115, 116

4 rows selected.

SQL>

For this case, we can set the export character set to US7ASCII by using Oracle8i's export tool, and then modify the second, third characters, and modify 0001 is 0354, so that the data of the US7ASCII character set can be correctly imported into the zhs16GBK database. in. Modify the export file:

Import a modified export file:

E: / NLS2> SET NLS_LANG = American_america.zHS16GBK

E: / NLS2> IMP EYGLE / EYGLE FILE = SUS7ASCII- CUS7ASCII-Exp817.dmp fromuser = EYGLE TOUSER = Eygle Tables = TEST

Import: Release 9.2.0.4.0 - Production on Mon Nov 3 17:37:17 2003

CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.

Connected to: Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

Export File Created by Export: V08.01.07 Via Conventional Path

Import done in zhs16gbk character set and al16utf16 nchar character set

Export Server Uses Utf8 Nchar Character Set (Possible Ncharset Conversion)

Importing Table "Test" 2 ROWS IMPORTED

Import Terminated SuccessFully without Warnings.

E: / NLS2> SQLPlus Eygle / Eygle

SQL * Plus: Release 9.2.0.4.0 - Production on Mon Nov 3 17:37:23 2003

CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.

Connected to:

Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

SQL> SELECT NAME, DUMP (NAME) from test;

Name dump (name)

-------------------------------------------------- ------------------------------

Test TYP = 1 LEN = 4: 178, 226, 202, 212

TEST TYP = 1 len = 4: 116, 101, 115, 116

2 rows selected.

SQL>

2. Method using Create Database If the character set used by the export file is US7ASCII, the character set of the target database is zHS16GBK, we can use the Create Database method to modify, the specific: SQL> Col ​​Parameter for A30

SQL> Col ​​Value For A30

SQL> SELECT * FROM V $ NLS_PARAMETERS;

Parameter Value

------------------------------------------------------------------------------------------------------------------------------------------------------------------ ------------

NLS_LANGUAGE AMERICAN

NLS_TERRITORY AMERICA

NLS_CURRENCY $

NLS_ISO_CURRENCY AMERICA

NLS_NUMERIC_CHARACTERS.,

NLS_CALENDAR GREGORIAN

NLS_DATE_FORMAT DD-MON-RR

NLS_DATE_LANGUAGE AMERICAN

NLS_CharacterSet ZHS16GBK

NLS_SORT BINARY

.................

19 rows selected.

SQL> CREATE DATABASE Character Set US7ASCII;

Create Database Character Set US7ASCII

*

Error At Line 1:

ORA-01031: Insufficient Privileges

SQL> SELECT * FROM V $ NLS_PARAMETERS;

Parameter Value

------------------------------------------------------------------------------------------------------------------------------------------------------------------ ------------

NLS_LANGUAGE AMERICAN

NLS_TERRITORY AMERICA

NLS_CURRENCY $

NLS_ISO_CURRENCY AMERICA

NLS_NUMERIC_CHARACTERS.,

NLS_CALENDAR GREGORIAN

NLS_DATE_FORMAT DD-MON-RR

NLS_DATE_LANGUAGE AMERICAN

NLS_CharacterSet US7ASCII

NLS_SORT BINARY

............ .....

19 rows selected.

SQL> EXIT

Disconnected from Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

E: / NLS2> SET NLS_LANG = American_america.us7ascii

E: / NLS2> IMP Eygle / Eygle File = Sus7ASCII-Cus7ASCII.dmp fromuser = Eygle Touser = EygleImport: Release 9.2.0.4.0 - Production on Sun Nov 2 14:53:26 2003

CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.

Connected to: Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

Export File Created by Export: V09.02.00 Via Conventional Path

Import Done in US7ASCII Character Set and Al16UTF16 Nchar Character Set

Import Server Uses ZHS16GBK Character Set (Possible Charset Conversion)

Importing Table "Test" 2 ROWS IMPORTED

Import Terminated SuccessFully without Warnings.

E: / NLS2> SQLPlus Eygle / Eygle

SQL * Plus: Release 9.2.0.4.0 - Production on Sun Nov 2 14:53:35 2003

CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.

Connected to:

Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

SQL> SELECT *.

Name

------------

test

Test

2 rows selected.

We see that when the CREATE DATABASE Character set US7ASCII; command, the character set setting in the database V $ nls_Parameters changes, which affects the import process. After the changes can be imported correctly, then the database will restore it.

Tip: V $ nls_paemters comes from X $ NLS_Parameters, which affects the import operation; and NLS_DATABASE_PARAMETERS is derived from the PROPS $ data table, affecting the data storage.

3. Character scanning tools provided by Oracle CSScan

We say that the above method is only in the case of nothing, its essence is deceived database, forcing import data, may lose metadata. If you want to ensure the integrity of the data, you should use the CSSCAN to scan the database, find all incompatible characters, and then update the corresponding script and code, and ensure the correctness of the data. Let's take a look at the use of CSSCAN.

To create a corresponding data dictionary object as a SYS user as a SYS user: E: / NLS2> SQLPLUS "/ as sysdba"

SQL * Plus: Release 9.2.0.4.0 - Production on Sun Nov 2 19:42:07 2003

CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.

Connected to:

Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

SQL> SELECT Instance_name from V $ Intance;

SELECT Instance_name from V $ Intance

*

Error At Line 1:

ORA-00942: Table or view does not exist

SQL> SELECT Instance_name from V $ Instance;

INSTANCE_NAME

----------------

Penny

1 row selected.

SQL> @? / Rdbms / admin / csminst.sql

User created.

Grant succeeded.

........

This script creates the corresponding user (CSMIG) and data dictionary object, and the scan information is recorded in the corresponding data dictionary table.

We can scan the database on the command line:

E: / NLS2> CSSCAN FULL = y fromchar = zhs16gbk tochar = US7ASCII log = US7check.log capture = y arrAy = 1000000 process = 2

Character Set Scanner V1.1: Release 9.2.0.1.0 - Production on Sun Nov 2 20:24:45 2003

CopyRight (C) 1982, 2002, Oracle Corporation. All Rights Reserved.

UserName: Eygle / Eygle

Connected to:

Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production

With the partitioning, Oracle Label Security, OLAP AND ORACLE DATA MINING OPTIONS

JServer Release 9.2.0.4.0 - Production

ENUMERANTING TABLES TO SCAN ...

Process 1 Scanning Sys.Source $ [Aaaabhaabaaairaaa]

Process 2 scanning sys.attribute $ [AAAAEOAABAAAAAHZAAA]

Process 2 Scanning Sys.Parameter $ [Aaaeoaabaaaahzaaa]

Process 2 scanning sys.method $ [AAAAAAAABAAAAHZAAA]

...... ..

Process 2 scanning system.def $ _AQERROR [AAAA8FAABAAACWJAAA]

Process 1 Scanning WMSYS.WM $ env_vars [Aaabewaabaaafmzaaa]

....................

Process 2 scanning sys.ugroup $ [aaaaa5aabaaaagpaaa]. Process 2 Scanning Sys.con $ [Aaaaacaabaaaaacpaaa]

Process 1 Scanning Sys.File $ [Aaaaaaabaaaabxaaa]

Creating Database Scan Summary Report ...

Creating Individual Exception Report ...

Scanner Terminated SuccessFully.

Then we can check the output log to view the database scan:

Database Scan Individual Exception Report

[Database Scan Parameters]

Parameter Value

------------------------------------------------------------------------------------------------------------------------------------------------------------------ ----------------------------

Scan Type Full Database

SCAN CHAR DATA? YES

Current Database Character Set ZHS16GBK

New Database Character Set US7ASCII

Scan nchar data? No

Array Fetch Buffer Size 1000000

Number of processes 2

CAPTURE CONVERTIBLE DATA? YES

------------------------------------------------------------------------------------------------------------------------------------------------------------------ ----------------------------

[Data Dictionary Individual Exceptions]

[Application Data Individual Exception]]

User: Eygle

Table: Test

Column: Name

TYPE: VARCHAR2 (10)

Number of Exceptions: 1

Max Post Conversion Data Size: 4

Rowid Exception Type Size Cell Data (First 30 Bytes)

----------------------------------- ---------------------

Aaabpiaadaaaaamaaa Lossy Conversion Test

----------------------------------- ---------------------

The data that cannot be converted will be recorded, and we can update the data according to this information, and ensure that the conversion is correct.

转载请注明原文地址:https://www.9cbs.com/read-111983.html

New Post(0)