Notes on converting between MARC data and database formats


First of all, I would like to thank Djkhym (HYM) on 9CBS, who gave me a great deal of help. I drew on the ideas in his program.

MARC (Machine-Readable Cataloging) data is catalog data in machine-readable form. Converting between the MARC format and a database is an important part of a library system and one of its core techniques. There is currently little information online about MARC data, and even less about converting it to and from a database. I mainly referred to the "Chinese MARC Format User Manual" and HYM's program. I wrote up these notes so that others can avoid some detours. (Oh, and the going rate online for converting MARC data is now 2 cents.)

The basic format of MARC data can be looked up in the "Chinese MARC Format User Manual", so I will only cover it briefly here. The MARC data format follows the standard GB/T 2901 (ISO 2709), and the file extension is .iso. The following is one line from an .iso file, that is, one MARC record.

00806nam0 2200229 ?? 450? 001000900000010003500009092002000044100004100064101000800105102001500113105001800128106000600146200003100152210003300183215001500216330020500231333008500436606000500521690000800526701001500534801002700549? S7240011 ?? -a7-5034-1525-8-b hardcover -dCNY130.00 ?? -aCN-b01-724-0011 ?? -a20021211d2002 ??? ekmy0chiy0121 ??? ea ?? -Achi ?? -ACN-B110000 ?? -Ar? 1 -A Deng Xiaoping Theory Dictionary - F Li Changfu editor ?? -a Beijing-C China Wen Shi Publishing House - D2004.7 ?? -A0720-D16 ?? -a this book is a comprehensive, systematic, accurate reflection of Deng Xiaoping theoretical scientific system and retrieval Deng Xiaoping theoretical important view, and research, research, research, and promotion of Deng Xiaoping theory. The book is reflected, and the core content of Marxism-Leninism, Mao Zedong ?? -a thought and "three representatives", reflecting the development of the scientific socialist theory. ?? -a ?? -a-v4? 0-a Li Changfu editor ?? -ACN-BS7240011-C20040709?

Record structure: record leader (header), directory (address entry area), data field area, record terminator

00806nam0 through 450# is the record leader (# represents a space)

001000900000 through 801002700549@ is the directory, or address entry area (@ represents the field separator IS2)

S7240011 through 20040709@ is the data field area

% is the record terminator (% represents the record end character IS3)
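The four parts listed above can be separated mechanically. The following is a minimal Python sketch, not the author's program: it assumes the record is available as raw bytes, relies on the fact that an ISO 2709 leader is 24 bytes long, and uses a hypothetical helper name `split_record`.

```python
# Control characters defined by ISO 2709.
FIELD_SEP = b"\x1e"   # IS2, written as @ in this article
RECORD_END = b"\x1d"  # IS3, written as % in this article
LEADER_LEN = 24       # an ISO 2709 leader is always 24 bytes


def split_record(record: bytes):
    """Split one raw record into (leader, directory, data_area).

    Hypothetical helper for illustration only.
    """
    if not record.endswith(RECORD_END):
        raise ValueError("record does not end with IS3")
    leader = record[:LEADER_LEN]
    # The directory runs from the end of the leader to the first IS2.
    dir_end = record.index(FIELD_SEP, LEADER_LEN)
    directory = record[LEADER_LEN:dir_end]
    # Everything after that IS2, up to but not including IS3, is the data area.
    data_area = record[dir_end + 1:-1]
    return leader, directory, data_area
```

The leader also stores the starting address of the data area in positions 12-16 (00229 in the sample record), which could be used instead of searching for the first IS2.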

Every byte has a specific meaning, and you can read the details in the manual. Here I will describe my experience with the conversion.

1. Calculating lengths: positions 0-4 of the leader hold the total length of the record. Note that spaces and separators are included. An English character counts as 1 byte and a Chinese character as 2. Note that in ASP a Chinese character also counts as 1, so the length to be written should be calculated in VB on the result of StrConv(str1, vbFromUnicode), for example with LenB(StrConv(str1, vbFromUnicode)), and then formatted for output with Format(str1, "00000").
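The same byte counting can be sketched in Python. This is only an illustration, assuming the data is stored in GBK (so a Chinese character occupies 2 bytes, as described above); the function names `byte_length` and `format_length` are hypothetical.

```python
def byte_length(text: str, encoding: str = "gbk") -> int:
    """Length of text in bytes, not characters (GBK is an assumption)."""
    return len(text.encode(encoding))


def format_length(n: int) -> str:
    """Format a record length as the 5-digit value of leader positions 0-4."""
    if not 0 <= n <= 99999:
        raise ValueError("record length must fit in 5 digits")
    return f"{n:05d}"
```

For example, `byte_length("中文")` is 4 although the string has only 2 characters, which is exactly the difference that causes wrong lengths when characters are counted instead of bytes.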

2. Several symbols:

$: subfield identifier, IS1

@: field separator, IS2

%: record terminator, IS3

These are human-readable stand-ins that make records easier to study and read. In an actual system, use these characters instead:

$ -------- Chr(31)

@ -------- Chr(30)

% -------- Chr(29)
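A small Python sketch of this substitution follows. It assumes the readable text uses $, @, % and # (for a space) only as stand-ins and never as real data; a price written with a literal $ would be corrupted. The name `to_wire` is hypothetical.

```python
# Map the readable stand-ins to the real characters.
VISIBLE_TO_CONTROL = str.maketrans({
    "$": "\x1f",  # IS1, Chr(31)
    "@": "\x1e",  # IS2, Chr(30)
    "%": "\x1d",  # IS3, Chr(29)
    "#": " ",     # the article writes a space as #
})


def to_wire(visible: str) -> str:
    """Convert the readable notation into the characters actually stored."""
    return visible.translate(VISIBLE_TO_CONTROL)
```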

Spaces are also very important. The spaces within each field are strictly regulated, so while learning, write # for each space and change it back to a real space afterward. For example, the leader: 01071nam0#2200277###450#

3. Understanding the address entry area (directory): adding separators by hand makes the structure obvious.

001,0013,00000; 005,0017,00013; ......

Here 001,0013,00000; describes the first field of the data field area: field tag 001, length 0013, starting position 00000. The rest follow by analogy: each starting position is the sum of the lengths before it (00013 = 00000 + 0013).
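As an illustration, the 12-character entries can be split and checked against this cumulative rule in Python. This is a sketch under the assumption that fields are stored back to back in directory order, as in the example; `DirEntry` and `parse_directory` are hypothetical names.

```python
from typing import List, NamedTuple

ENTRY_LEN = 12  # 3-digit tag + 4-digit length + 5-digit starting position


class DirEntry(NamedTuple):
    tag: str
    length: int
    start: int


def parse_directory(directory: str) -> List[DirEntry]:
    """Split a directory string (without the trailing IS2) into entries."""
    if len(directory) % ENTRY_LEN:
        raise ValueError("directory length is not a multiple of 12")
    entries = []
    expected_start = 0
    for i in range(0, len(directory), ENTRY_LEN):
        chunk = directory[i:i + ENTRY_LEN]
        entry = DirEntry(chunk[0:3], int(chunk[3:7]), int(chunk[7:12]))
        # Each starting position should equal the sum of the earlier lengths.
        if entry.start != expected_start:
            raise ValueError(f"field {entry.tag}: unexpected start {entry.start}")
        entries.append(entry)
        expected_start += entry.length
    return entries
```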

For DB-to-MARC, the approach is to write the data field area first and tally each field as you go. You can declare an array Block(i, 3), where Block(i, 0) is the field tag, Block(i, 1) is the field length, Block(i, 3) is the starting position, and i is the index of the field.
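A rough Python equivalent of this bookkeeping is sketched below. It uses a list of tuples in place of the Block array and assumes each field's content is already encoded as bytes; `build_directory` is a hypothetical name, not part of the author's or HYM's program.

```python
FIELD_SEP = b"\x1e"  # IS2, written as @ in this article


def build_directory(fields):
    """Build (directory, data_area) from a list of (tag, content_bytes).

    content_bytes excludes the trailing field separator; it is added here
    and counted in the field length.
    """
    block = []       # (tag, length, start) per field, like Block(i, ...)
    data_area = b""
    position = 0
    for tag, content in fields:
        field = content + FIELD_SEP
        block.append((tag, len(field), position))
        data_area += field
        position += len(field)
    directory = "".join(
        f"{tag}{length:04d}{start:05d}" for tag, length, start in block
    )
    return directory, data_area
```

Writing the data area first means every length and position in the directory is measured from bytes that have actually been produced, rather than estimated in advance.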

MARC-to-DB is the reverse: read the address entry area (directory), then use it to cut up the data field area.

The directory must be calculated correctly. Otherwise the computer cannot locate the fields, and all of the data that follows comes out wrong.
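The reading direction can be sketched in Python as follows. The sketch assumes the directory and data field area have already been separated from the record and that the text is GBK-encoded; `cut_fields` is a hypothetical name.

```python
FIELD_SEP = b"\x1e"  # IS2, written as @ in this article


def cut_fields(directory: bytes, data_area: bytes, encoding: str = "gbk"):
    """Use the directory to slice the data area into (tag, text) pairs."""
    entries = directory.decode("ascii")
    fields = []
    for i in range(0, len(entries), 12):
        tag = entries[i:i + 3]
        length = int(entries[i + 3:i + 7])
        start = int(entries[i + 7:i + 12])
        # Lengths and positions are byte counts, so slice the bytes first
        # and decode afterward.
        raw = data_area[start:start + length]
        if not raw.endswith(FIELD_SEP):
            raise ValueError(f"field {tag} does not end with IS2")
        fields.append((tag, raw[:-1].decode(encoding)))
    return fields
```

The check for the trailing IS2 is a cheap way to notice a miscounted directory early, before wrong data reaches the database.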

4. Data field area:

Writing (or reading) it one field at a time is all it takes. Pay attention to one point:

012001022343@20020928000000.0@##$a7-80142-191-4$dCNY46.00@......@%

That point is the number of spaces and separators. Be careful: if they are off, the record is wrong and cannot be read. I learned this the hard way.
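To make the spaces and separators concrete, here is a Python sketch that takes apart one field such as ##$a7-80142-191-4$dCNY46.00 in its real form (two indicator characters followed by subfields introduced by IS1). It applies to fields that carry indicators; control fields such as 001 and 005 hold plain data. The name `parse_field` is hypothetical.

```python
SUBFIELD_SEP = "\x1f"  # IS1, written as $ in this article


def parse_field(body: str):
    """Split a field body (without the trailing IS2) into its parts.

    Returns (indicators, [(subfield_code, value), ...]).
    """
    head, *parts = body.split(SUBFIELD_SEP)
    if len(head) != 2:
        raise ValueError("expected exactly two indicator characters")
    # The first character after IS1 is the subfield code.
    subfields = [(part[0], part[1:]) for part in parts if part]
    return head, subfields
```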

5. Put the mapping of database fields into a configuration file so that the fields to convert can be selected. This makes the program more general-purpose.

Also define conventions for the data. For example, if a database record stores "book title [edition]", the book title has to be extracted, and you have to decide whether it is a series. There is also the limit of no more than 3 in a book ... It all depends on how general you design your program to be.
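One possible shape for such a mapping is sketched below in Python. The column names (`isbn`, `price`, `title`, `author`) are hypothetical, the tags and subfield codes are taken from the sample record at the top of this article, and the two-space indicators are a simplification.

```python
# Hypothetical mapping: database column -> (MARC tag, subfield code).
# In a real program this would be loaded from a configuration file.
FIELD_MAP = {
    "isbn": ("010", "a"),
    "price": ("010", "d"),
    "title": ("200", "a"),
    "author": ("200", "f"),
}


def row_to_fields(row: dict, field_map: dict = FIELD_MAP):
    """Turn one database row into a list of (tag, field_body) pairs."""
    grouped = {}
    for column, (tag, code) in field_map.items():
        value = row.get(column)
        if value:  # skip columns that are missing or empty
            grouped.setdefault(tag, []).append((code, value))
    return [
        # Two blank indicators are assumed here for simplicity.
        (tag, "  " + "".join("\x1f" + code + value for code, value in subs))
        for tag, subs in sorted(grouped.items())
    ]
```

Keeping the mapping as data means a different database layout only needs a different configuration, not a change to the conversion code.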

6. For the pinyin, I implemented it by lookup. I don't know whether there is a better way.

7. To repeat: the important thing in this data conversion is to understand the format requirements thoroughly. Everything else is just algorithms for querying and writing, which depend on your own skill. I like to use arrays, heh.

That's all for now. :)

One wind and one cloud (http://blog.9cbs.net/wzgme)

2004.08.22

---------------- Today is Tanabata, Happy Valentine's Day!!! ^_^
