Oracle Full Text Index Setting Procedure
Li Shouliang sorted out SLLI@founder.com.cn
Last Version: 2003-2-24
Several item groups have begun to use Oracle's full-text index. Oracle's lexical analyzer is not intelligent for the lexical analysis of the ZHS16GBK character set, and can only be mechanically matched. There is a new Chinese analyzer for the UTF8 character set with a new Chinese analyzer Chinese_LEXER, which has great improvement. The documentation that verifies and organizes how to perform Oracle full-text index settings is sent to you, for 8i, 9i all platform versions.
For DB2, there is a plug-in called TEXT Information Extend to achieve full-text index, but it is said that its grammatical analysis is also original. It is reported that the use of CM (Content Manager) can do a better full-text search, but it is necessary to install a lot of software, which is not suitable for simple applications.
One. Create a database
1. Select two options for JServer and InterMedia when you create a database using DBAssist.
2. Check if your database is installed, you can have CTXSYS users and CTXApp roles (Role) by checking.
3. If there is no such user and role, it means that the intermediate function is not installed when the database is created. The database must be modified to install this feature. Modifying process:
Run $ ORACLE_HOME / BIN / DBASSIST, select 'Modify Database', then select both JServer and InterMedia when selecting the database function (install InterMedia must install JServer at the same time).
two. Set ExtProc
Oracle implements InterMedia via 'external call function' (External Procedure), so it is critical to the correct setting of EXTPROC. The general database is installed in Listener.ora and Tnsnames.ora after installation of JServer and InterMedia. Extproc has been set.
1. Test if extproc is normal
Restart Listener, then, use TNSPing to test whether configured correctly,
Under the command line
TNSPING EXTPROC_CONNECTION_DATA or
TNSPING EXTPROC_CONNECTION_DATA.WORLD
If the configuration is correct, it will appear:
Attempting to contact (de - = extproc) OK (140 ms)
If it is correct, you can skip 2,3 steps. Otherwise, follow step 2, 3 to set the listener.ora and tnsnames.ora files. Be sure to restart Listener after the modification, but does not need to restart the database.
2, set Listerner.ora
If TNSPing fails, you need to configure Listener to make it listening to the intermedia call. You can configure the $ ORACLE_HOME / BIN / NetAssit, or manually modify the configuration file: $ Oracle_Home / Network / Admin / Listener.ora, then restart the listener.
The following example tells how to manually modify the configuration file:
Open the listener.ora file, usually there is the following before modifying (assuming the default listener):
Listener =
(Description =
(Address = (protocol = tcp) (host = mydatabase) (port = 1521)))
)
SID_LIST_LISTENER = (SID_DESC =
(Global_dbname = mydatabase.world)
(Oracle_Home = /u01/app/oracle/Product/8.1.6)
(SID_NAME = MyDatabase)
)
This Listener has not configured extproc, so you need to increase the monitoring of extproc, and the method is to add DESCRIPTION and SID_DESC. The modified listner.ora is as follows:
Listener =
(Description_List =
(Description =
(Address = (protocol = tcp) (host = mydatabase) (port = 1521)))
)
(Description =
(Address = (protocol = IPC) (key = extproc))
)
)
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(Global_dbname = mydatabase.world)
(Oracle_Home = /u01/app/oracle/Product/8.1.6)
(SID_NAME = MyDatabase)
)
(SID_DESC =
(Program = extproc)
(SID_NAME = PLSEXTPROC)
(Oracle_Home = /u01/app/oracle/Product/8.1.6)
)
)
Note the above Host, Global_DBName, SID_NAME, Oracle_Home should fill in the actual value of your database, but Program must fill in the extproc.
3, set tnsnames.ora
Second, to configure the server-side TNSNames.ora file. The location of the file is below $ oracle_home / network / admin. It can also be configured by running Netasst.
In the TNSNames.ora file, you need to add as follows:
EXTPROC_CONNECTION_DATA, EXTPROC_CONNECTION_DATA.WORLD =
(Description =
(Address_list =
(Address = (protocol = IPC) (key = extproc))
)
(Connect_data =
(SID = PLSEXTPROC)
)
)
Note that KEY and SID must be the same as Key and SID_NAME in Listener.ora.
three. Set the lexical analyzer (Lexer)
Oracle defaults this analyzer using Basic_LEXER. Basic_lexer is against English. To specify using Chinese analyzers, steps:
1. Use the CTXSYS users to log in to Intermedia Text Manager, password CTXSYS:
2. Select Preferences -> Language Indicator -> Create, Enter the name of the indicator such as CHINESE_LEXER, select Chinese_vgrnm_lexer under Lexer.
3. Establish an intermedia index, specify an index name, a field of fields, such as the currentText field in the DOM_1_DOCLIB under the System scheme, and select Chinese_LEXER in the Preferences.
The full-text search index is used, which will use Chinese_vgram_lexer as the analyzer. 4. After the index is built, the following tables will automatically generate the following tables below, you can use DBA Studio to view: (assuming index name MyIndex):
Dr $ MyIndex $ K, DR $ MyIndex $ R, DR $ MyIndex $ N
The most important in the I Table, query the table:
Select token_text, token_count from DR $ I_RSK1 $ I where rownum <= 20;
It can be seen that Oracle is preserved in this table, after you analyzing your document, the generated TERM record, including the location, number of times, HASH values, etc. of Term.
four. Use Job Timed Synchronization and Optimization
After the intermedia index is built, if the data in the table changes, increases or modifies the record, the index is automatically modified due to any DML statement that happens in the table, so it is necessary to synchronize (SYNC) and optimization (Optimize) ) Index to correctly reflect the changes in data.
Synchronization: Save the new Term to i;
Optimize: Clear the garbage of the I table, mainly to delete the TERM that has been removed from the I table.
Oracle offers a CTX Server to do this synchronization and optimization, just run this process in the background, which will monitor the changes in data, synchronize in time. But there are many problems. You can use the two JOBs to complete (this job is to be built in the same one by one user):
- Sync:
Variable Jobno Number;
Begin
DBMS_Job.Submit (: jobno, 'ctx_ddl.sync_index (' 'myindex'); ',
Sysdate, 'sysdate (1/24/4)');
COMMIT;
END;
- Optimizer
Variable Jobno Number;
Begin
DBMS_Job.Submit (: jobno, 'ctx_ddl.optimize_index (' 'myindex', '' full ''); ',
Sysdate, 'sysdate 1';
COMMIT;
END;
Among them, the first Job sysdate (1/24/4) refers to a synchronization of once every 15 minutes, and the second JOB sysdate 1 is a full optimization every 1 day. Specific interval, you can set according to your own application. At this point, your full-text search function has been set.
Li Shouliang verified, organized SLLi@founder.com.cn
Last Version: 2003-2-24