Copy from: http://www-900.ibm.com/developerWorks/cn/dmdd/library/techarticles/0209kline/0209kline.shtml
Setting up a mixed-byte character set (MBCS) database in DB2 UDB Version 8 on an English operating system
David J. Kline, Gabor Wieser
DB2® Vendor Enablement, PartnerWorld® for Developers
IBM Developer Technical Support (DTS) Center - Dallas
September 2002
Contents
Introduction
Character set encoding: a concise summary
Setting the environment
Configuration on Windows
Configuration on AIX
Creating an MBCS database
Connecting to the database and performing basic SQL tests
Conclusion
© 2002 International Business Machines Corporation. All Rights Reserved.
Introduction
This article shows developers how to test, on an English Windows or AIX operating system, against a running database that uses a mixed-byte character set (MBCS). Many developers now face this requirement: they must test databases in various languages, but they do not have, or do not want to buy, additional hardware to test each language on a dedicated machine. Running multiple heterogeneous databases on a single English operating system minimizes the hardware required, which reduces cost and improves efficiency.
At the Developer Technical Support Center in Dallas, we have found that most independent software vendors (ISVs) working with DB2 need help with the same question: how to set up the database and the operating system environment for MBCS support. This paper is therefore written as a guide for developers and other readers (such as the ISVs we work with) who use DB2 Universal Database™ Version 8 in an English operating system environment. The basic guidelines in this article also apply to any single-byte character set (SBCS) locale.
We provide detailed information that lets you create an MBCS database, connect to it, and run SQL tests from the DB2 Command Window in Windows® and AIX® environments. A UTF-8 database can also be used in multilingual environments, but we do not cover that topic because it is beyond the scope of this article. Note, however, that UTF-8 has some benefits that are worth careful consideration. At the end of this article you will find links to information about UTF-8 and other code page topics.
We use Japanese as the example throughout this article to set up the environment, create a database, and run SQL tests.
Character set encoding: a concise summary
European languages are written with alphabetic scripts, whose symbols represent sounds. Far Eastern languages are written with ideographs, which represent the meaning of a word in abstract form. An ideographic character set is much larger than an alphabetic one, so mixed-byte character set (MBCS) support was introduced to overcome the 256-character limit of a single-byte character set (SBCS).
Two methods are used to overcome the character limit: double-byte character sets (DBCS) and mixed-byte character sets (MBCS). These character sets fall into two categories:
Universal language support - a single character set supports many languages (UTF and UCS).
Single language support - each character set is dedicated to a specific language and platform.
This article covers single language support. We use Japanese support as the example.
You can encode Japanese characters with one of two methods:
Shift-JIS (SJIS) encoding scheme. SJIS is used for the IBM-932 and the newer IBM-943 code sets. These code sets are available on most platforms. The first byte of each character determines the number of bytes in the character. Values from 0x20 to 0x7F and from 0xA0 to 0xDF are single bytes that encode ASCII and katakana characters. Values from 0x81 to 0x9F and from 0xE0 to 0xFC are reserved for use as the first byte of a multi-byte character. The JIS X 0208 characters are mapped to multi-byte values starting at 0x8140. The second byte of a multi-byte character can fall inside or outside the ASCII range (in standard Shift-JIS, 0x40 to 0x7E and 0x80 to 0xFC).
Extended UNIX® Code (EUC) encoding scheme. The IBM-eucJP code set is available on UNIX platforms. The EUC encoding scheme defines a set of encoding rules that support one to four character sets. If you plan to store certain graphic characters in the database, this support for multiple character sets may be necessary. A code set based on EUC follows the EUC encoding rules and also identifies the specific character sets associated with a particular instance. For example, IBM-eucJP for Japanese refers to the encoding of the Japanese Industrial Standard (JIS) characters according to the EUC encoding rules.
The first set, code set 0 (CS0), always contains the ASCII character set. All other sets must have the most significant bit (MSB) set to 1 and can use any number of bytes to encode a character. In addition, all characters within a set must have:
The same number of bytes to encode each character.
The same column display width (number of columns on a fixed-width terminal).
Every character in the third set (CS2) is always preceded by the control character SS2 (single-shift 2, 0x8E). Code sets that conform to EUC use the SS2 control character only to identify the third set. Every character in the fourth set (CS3) is always preceded by the control character SS3 (single-shift 3, 0x8F). Code sets that conform to EUC use the SS3 control character only to identify the fourth set.
The following shows the formats of the different code sets supported by EUC:
CS0 - 0xxxxxxx
CS1 - 1xxxxxxx or 1xxxxxxx 1xxxxxxx or 1xxxxxxx 1xxxxxxx 1xxxxxxx
CS2 - 10001110 1xxxxxxx or 10001110 1xxxxxxx 1xxxxxxx or 10001110 1xxxxxxx 1xxxxxxx 1xxxxxxx
CS3 - 10001111 1xxxxxxx or 10001111 1xxxxxxx 1xxxxxxx or 10001111 1xxxxxxx 1xxxxxxx 1xxxxxxx
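The first-byte rules described in this section for SJIS and EUC can be sketched in a few lines of Python. This is an illustration only and is not part of DB2: the function names are hypothetical, and Python's shift_jis and euc_jp codecs are assumed to be close enough stand-ins for the IBM-943 and IBM-eucJP code sets for the sample characters used here.

```python
# Illustrative sketch only; the function names are hypothetical.
# Python's "shift_jis" and "euc_jp" codecs stand in for IBM-943 and IBM-eucJP.

SS2 = 0x8E  # single-shift 2: prefixes every CS2 character
SS3 = 0x8F  # single-shift 3: prefixes every CS3 character


def sjis_is_lead_byte(b):
    """Return True if byte value b starts a double-byte SJIS character."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def sjis_split(data):
    """Split SJIS-encoded bytes into characters using only the first-byte rule."""
    chars = []
    i = 0
    while i < len(data):
        width = 2 if sjis_is_lead_byte(data[i]) else 1
        chars.append(data[i:i + width])
        i += width
    return chars


def euc_code_set(first_byte):
    """Name the EUC code set selected by the first byte of a character."""
    if first_byte < 0x80:
        return "CS0"  # MSB clear: ASCII
    if first_byte == SS2:
        return "CS2"
    if first_byte == SS3:
        return "CS3"
    return "CS1"      # MSB set, no single-shift prefix


if __name__ == "__main__":
    # Show each SJIS character as hex bytes next to its decoded form.
    for ch in sjis_split("DB2日本".encode("shift_jis")):
        print(ch.hex(), ch.decode("shift_jis"))
    # Show which EUC code set the first byte of each character selects.
    for ch in "A日":
        first = ch.encode("euc_jp")[0]
        print(ch, hex(first), euc_code_set(first))
```

In SJIS the second byte of 本 is 0x7B, which lies in the ASCII range. This is why a parser has to decide the character width from the first byte instead of simply looking for bytes with the high bit set.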
Setting the environment
Assuming that you have a DB2 UDB edition with server capabilities installed, we can move on to the environment setup. The setup differs depending on the operating system you choose. We first explain the setup process on Windows and then provide instructions for AIX.
Note: The following instructions apply to setting up your environment for testing from the DB2 Command Window. If you are setting up the environment for testing from an application (not a DB2 Command Window session), the steps below do not apply. Instead, run the following command from the DB2 Command Window, replacing codepage with the code page you need:
db2set DB2CODEPAGE=codepage
We strongly recommend that you reset the DB2CODEPAGE variable to its initial state as soon as testing is complete. Leaving this variable set may cause problems. To unset the variable, enter the following command:
db2set DB2CODEPAGE=
If you plan to use DB2CODEPAGE, you can skip ahead to Creating an MBCS database.
Configuration on Windows
To configure Windows, first install the DB2 message file set, and then set up the code page conversion file set.
Installing DB2 messages
So that DB2 can display error, warning, and informational messages after you run DB2 commands or statements, you must install the DB2 message file set for the language you want. Because DB2 is distributed in different language-based editions, verify that the language you want is on your installation CD-ROM. On Windows, you can choose to add the message set for a specific language when you install DB2. Version 7 lets you install only one language, but Version 8 lets you install several. This new feature is particularly useful if you plan to switch among several language environments. Installation time is the only opportunity to install DB2 messages for a specific language. Note: During installation, you must choose Custom Install rather than Typical Install.
After you install the message file set, DB2 displays error, warning, and informational messages in that language after you run DB2 commands or SQL statements. Here is an example of an English message:
C:\PROGRA~1\SQLLIB\BIN> db2 "CONNECT TO JPDB"
SQL0332N There is no available conversion for the source
code page "1252" to the target code page "943". Reason Code "1".
SQLSTATE=57017
You do not have to install DB2 messages in a specific language in order to create and run an MBCS database. Doing so simply provides the convenience of seeing messages in the language you choose.
Setting up the code page conversion file set
The next step in setting up your environment depends on the language you choose. In our example, we set up a Japanese environment. To let the Windows environment handle Japanese code page conversion, we must change some settings in Regional Options on the Windows machine.
Click Start -> Settings -> Control Panel, and then click Regional Options to open the dialog shown in Figure 1.
Figure 1. Regional Options in the Windows environment
At the top of the dialog, select Japanese as your locale. The locale affects how numbers, currency, time, and dates are displayed and which language is used for text input.
At the bottom of the dialog, click Set Default, and then click OK. (See Figure 2.) This setting lets applications display menus and dialogs in their native language without affecting the Windows menus and dialogs.
Figure 2. Language environment dialog box
Click OK in the Regional Options dialog to open the dialog shown in Figure 3.
Figure 3. Installation dialog
The required files are probably already on your machine, so you can click Yes to install them from your hard drive. If the files are not on your drive, you are prompted to insert the Windows installation CD-ROM.
Because installing these files changes the operating system, you must reboot the machine before you are ready to create a database.
Configuration on AIX
To configure AIX, first install the DB2 message file set, and then set up the code page conversion file set.
Installing DB2 messages
On AIX, you can install DB2 messages by running db2setup from the CD-ROM or through SMIT. db2setup is an executable that lets you install DB2 components (such as the DB2 message files). DB2 provides these messages on AIX to display error, warning, and informational messages after you run a DB2 command or SQL statement, just as it does on Windows. After you install DB2, you can add further language file sets at any time, which lets you display DB2 messages that match the current locale.
Version 7 and Version 8 both let you install message files for multiple languages, which provides flexibility when you need to switch from one locale to another. When DB2 detects a change in the LANG environment variable, it automatically switches the message file set. By default, English messages are installed.
The message language does not have to match the language of your database. For example, you do not have to install the Japanese message file set in order to use a Japanese database. If DB2 cannot find the Japanese message file set, it uses English messages.
Setting up the code page conversion file set
The next step is to check whether you have the operating-system-specific file sets that enable code page conversion from the source code page to the MBCS code page. Run the following commands and examine the output to check whether those file sets are installed:
lslpp -l | grep bos.loc.pc.Ja_JP (for SJIS)
bos.loc.pc.Ja_JP 4.3.3.0 COMMITTED Base System Locale PC Code Set
lslpp -l | grep bos.loc.iso.ja_JP
bos.loc.iso.ja_JP 4.3.3.0 COMMITTED Base System Locale ISO Code
As you can see, the Japanese file sets are already installed, so nothing more needs to be installed. The following is an example of a command that installs the file sets from the AIX 4.3.3 installation CD-ROM:
installp -acNqwX -d /dev/cd0 -f file 2>&1
file:
bos.loc.iso.ja_JP 4.3.3.0
bos.loc.pc.Ja_JP 4.3.3.0
After you apply the file sets, you must set the LANG environment variable to the matching locale. The following command sets a Japanese environment (in Table 1, Ja_JP is the AIX locale for code set IBM-943 and ja_JP is the AIX locale for IBM-eucJP):
$ export LANG=Ja_JP
Make sure that the file sets are installed and that the LANG environment variable is set correctly. If these requirements are not met, trying to connect to the MBCS database triggers the following error:
$ db2 "CONNECT TO JPDB"
SQL0332N There is no available conversion for the source code page "819" to
the target code page "943". Reason Code "1". SQLSTATE=57017
If you switch locales often, run db2 terminate to end any existing back-end process that may still remember the old LANG setting, and then connect to the database.
Creating an MBCS database
You can create your MBCS database before or after you enable a specific locale. If you create the database first, you must provide additional information when you run the CREATE DATABASE command. Otherwise, DB2 uses the default code page of the locale.
Below, we create a Japanese database before enabling the Japanese locale:
db2 "CREATE DATABASE JPDB USING CODESET IBM-943 TERRITORY JP"
The CODESET keyword tells DB2 to create the database with code page 943 (a Japanese code page).
If you create the database after setting up the locale, or after using db2set to set the DB2CODEPAGE variable to the desired code page (see Setting the environment), all you have to do is enter the following command:
db2 "CREATE DATABASE JPDB"
Because the Japanese environment is already set, DB2 knows to create a Japanese database using code set IBM-943 on Windows. On AIX, DB2 consults the LANG environment variable to determine the code page of the database. If DB2CODEPAGE is set, DB2 determines the code page for the new database from this DB2 variable instead of checking the LANG environment variable (on AIX) or the operating system code page (on Windows).
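The order in which these settings are consulted can be summarized in a small Python sketch. It does not call DB2, and the function and parameter names are hypothetical. Placing an explicit USING CODESET clause ahead of DB2CODEPAGE is our reading of the steps above, not something this article states outright.

```python
# Hypothetical helper that models the decision order described above.
# It does not call DB2; all names are made up for illustration.

def effective_code_page(using_codeset=None, db2codepage=None, locale_code_page=None):
    """Return (value, source) for the code page a new database would get."""
    if using_codeset:
        # CREATE DATABASE ... USING CODESET ... names the code set directly.
        return using_codeset, "CREATE DATABASE USING CODESET"
    if db2codepage:
        # db2set DB2CODEPAGE=... replaces the locale lookup.
        return db2codepage, "DB2CODEPAGE"
    if locale_code_page:
        # LANG on AIX, or the operating system code page on Windows.
        return locale_code_page, "locale"
    raise ValueError("no code page information available")


if __name__ == "__main__":
    print(effective_code_page(using_codeset="IBM-943", locale_code_page="1252"))
    print(effective_code_page(db2codepage="943", locale_code_page="819"))
    print(effective_code_page(locale_code_page="943"))
```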
DB2 provides a number of code sets, so you can create a database using the correct code page on your own platform. Table 1 shows the supported Japanese code sets.
Table 1. Supported Japanese code sets
Code page  Group  Code set   Territory ID  Country code  Locale       OS     Country
932        D-1    IBM-932    JP            81            -            OS2    Japan
942        D-1    IBM-942    JP            81            -            OS2    Japan
943        D-1    IBM-943    JP            81            -            OS2    Japan
954        D-1    IBM-eucJP  JP            81            ja_JP        AIX    Japan
943*       D-1    IBM-943    JP            81            Ja_JP        AIX    Japan
954        D-1    eucJP      JP            81            ja_JP.eucJP  HP     Japan
5039       D-1    SJIS       JP            81            ja_JP.SJIS   HP     Japan
954        D-1    eucJP      JP            81            ja           SCO    Japan
954        D-1    eucJP      JP            81            ja_JP        SCO    Japan
954        D-1    eucJP      JP            81            ja_JP.EUC    SCO    Japan
954        D-1    eucJP      JP            81            ja_JP.eucJP  SCO    Japan
954        D-1    eucJP      JP            81            ja           Sun    Japan
943        D-1    IBM-943    JP            81            ja_JP.PCK    Sun    Japan
954        D-1    EUC_JP     JP            81            ja_JP        Linux  Japan
943        D-1    IBM-943    JP            81            -            WIN    Japan
930        D-1    IBM-930    JP            81            -            HOST   Japan
939        D-1    IBM-939    JP            81            -            HOST   Japan
5026       D-1    IBM-5026   JP            81            -            HOST   Japan
5035       D-1    IBM-5035   JP            81            -            HOST   Japan
1390       D-1    -          JP            81            -            HOST   Japan
1399       D-1    -          JP            81            -            HOST   Japan
1394**     D-1    -          JP            81            -                   Japan
* The code page is 943 on AIX 4.3 or later. If you use AIX 4.2 or earlier, the code page is 932.
** Code page 1394 is available only when you move data from code page 1394 into a DB2 Unicode database, or when you use the Export utility to move data from a DB2 Unicode database to code page 1394. For more information, see the Data Movement Utilities Guide and Reference section of the Version 7.2 FixPak 4 Release Notes.
For a complete list of code sets, see the Reference section at the end of this article.
To check whether the database uses the code page you expect, query the database configuration with the following command, where database-alias is the name of your database:
db2 "GET DATABASE CONFIGURATION FOR database-alias"
Near the top of the output, you will see information related to the language and the code page:
Database territory = JP
Database code page = 943
Database code set = IBM-943
Database country code = 81
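If you want to script this check, the "name = value" lines can be picked out of the saved command output. The following Python sketch assumes the output layout shown above. The helper names and the SAMPLE text are ours, and how you capture the output of the db2 command is left to your environment.

```python
# Minimal sketch: parse "name = value" lines from saved
# GET DATABASE CONFIGURATION output. Helper names are hypothetical.

def parse_db_cfg(output):
    """Return a dict of lower-cased setting names to their values."""
    settings = {}
    for line in output.splitlines():
        name, sep, value = line.partition("=")
        if sep:  # skip lines that contain no "="
            settings[name.strip().lower()] = value.strip()
    return settings


def uses_code_page(output, expected):
    """Return True if the 'Database code page' line matches expected."""
    return parse_db_cfg(output).get("database code page") == str(expected)


# Sample text modeled on the lines shown above.
SAMPLE = """\
 Database territory     = JP
 Database code page     = 943
 Database code set      = IBM-943
 Database country code  = 81
"""

if __name__ == "__main__":
    print(parse_db_cfg(SAMPLE))
    print(uses_code_page(SAMPLE, 943))
```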
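Connecting to the database and performing basic SQL tests
Connecting to the database
The following example shows how to connect to the database from the DB2 Command Window on the DB2 server, where database-alias is the name of your database:
db2 "CONNECT TO database-alias"
To connect to a remote database, you need to include the USER and USING keywords, followed by your user ID and password:
db2 "CONNECT TO database-alias USER userid USING password"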
Executing basic SQL tests
To run SQL tests such as INSERT and SELECT statements, you need to be able to enter MBCS data from the keyboard and display it on the screen. In this example, we use an English keyboard for MBCS text input. We again use Japanese to show you how this is done.
Windows
For Windows, open a DB2 Command Window session and enter the chcp command followed by the code page you need. In our example:
chcp 932
chcp is a Windows command that stands for "change code page". You must run it so that your DB2 Command Window session can accept and display MBCS characters. In the example, we change the code page to 932; we have confirmed that 932 input works with a database that uses code page 943.
After running the chcp command, you can start testing the MBCS database. You should see one of the following icons in the lower right corner of the Windows screen:
Japanese icon
English icon
One of these two icons is displayed, and it lets you switch between the English and Japanese input locales. If you use a different input locale, you will see the icon for that locale, but the English icon is the same as the one shown above. Make sure that the non-English icon is displayed. If the English icon is displayed, click it, and a list like the following lets you choose a non-English setting:
If you have an active editor that supports displaying the fonts required by your language, you will also see a toolbar similar to the following. This toolbar is located in the lower right corner of your Windows screen.
You need to set up the toolbar so that you can enter the correct character set. Japanese, for example, has several character sets, such as hiragana and katakana. In the toolbar, click the icon below and select the character set you need:
Now let's perform some operations after connecting to the Japanese database. First, we create a table that allows strings to be inserted:
db2 "CREATE TABLE JPTBL (NAME VARCHAR(20))"
So far we have not used any Japanese characters, so our input has been in English. Now we insert a row into the table:
After completing the steps above, you can type the characters in any editor (such as Notepad) and then copy and paste the INSERT statement into the DB2 Command Window session. Make sure that the font selected in your editor supports the characters of your language.
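One thing to watch when you prepare test strings: in a non-Unicode DB2 database the length of a VARCHAR column is counted in bytes, so NAME VARCHAR(20) holds at most ten double-byte Japanese characters. The following Python sketch checks a string before you paste it. The helper names are hypothetical, and Python's cp932 codec is assumed to be a reasonable stand-in for the IBM-943 code set.

```python
# Sketch: check whether a test string fits a byte-limited VARCHAR column.
# Helper names are hypothetical; "cp932" approximates IBM-943.

def encoded_length(value, codec="cp932"):
    """Number of bytes the string occupies in the given code page."""
    return len(value.encode(codec))


def fits_varchar(value, max_bytes=20, codec="cp932"):
    """Return True if the encoded string fits in max_bytes bytes."""
    return encoded_length(value, codec) <= max_bytes


if __name__ == "__main__":
    for text in ["DB2", "日本語", "日" * 11]:
        print(text, encoded_length(text), fits_varchar(text))
```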
The following is the result of the INSERT in the DB2 Command Window:
Now, SELECT from the table:
AIX
On AIX, the easiest way to test SQL statements against the MBCS database is to connect to the database from a Windows machine. To do this, you must run the DB2 client on the Windows machine and catalog the MBCS database that resides on the server. The preferred way to do this is to use the DB2 Client Configuration Assistant, a GUI tool that guides users step by step through the database cataloging process. Once this process is complete, you can enter SQL statements from the Windows client machine just as you would on the local DB2 server.
Conclusion
Developers who need quick solutions often run into new challenges (such as using MBCS databases). The goal of this article is to make your life easier by providing high-quality content that extends your skills in setting up and running MBCS databases. As technical support staff, we are always interested in your feedback and will actively address any questions, comments, or corrections you provide. For more information about MBCS, see the following online references.
Reference
Code page conversion for all platforms: http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2uix/support/document.d2w/report?Fn=DB2A0CODEPAG.HTM#TBLSUPCDPG
Unicode/UCS-2 and UTF-8 information: http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2w/support/document.d2w/report?Last_page=list.d2w&fn=DB2V7D0DB2D0338.HTM
Character conversion: http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2w2w/support/document.d2w/report?Last_page=list.d2w&fn=DB2V7S0CHARCON.HTM
Performance considerations for code page conversion: http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2w/support/document.d2w/report?Last_page=list.d2w&fn=db2v7d0db2d0170.htm
About the authors
David Kline and Gabor Wieser are DB2 technical support representatives for PartnerWorld for Developers. Together with ten other team members, David and Gabor help independent software vendors (ISVs) resolve a wide range of development and administration issues. David holds DB2 certifications in both application development and administration. He spends most of his time helping ISVs resolve DBA (database administration) issues. You can reach him at djkline@us.ibm.com. Gabor has more than 10 years of DB2 experience and works on a variety of DB2 issues, including developing C and Java™ programs, connectivity between clients and host databases, and many database administration issues. You can reach him at gaborw@us.ibm.com.