In large database development projects, the data source is often a flat file (such as a text file). Such data cannot be managed effectively by the database, nor queried and manipulated with SQL statements, so there is a pressing need to import these flat files into the database, where they can then be operated on efficiently.
This article introduces several common data import methods in the hope that readers will find them useful. The databases discussed here are Oracle databases; for other databases, the methods are similar.
First, SQL*Loader
This method is one of the most important ways of importing data into an Oracle database. The tool is provided with the Oracle client. Its basic working principle is: first, a control file is prepared for the data source file; the control file explains how the source file is to be parsed, including the data format of the source file and the fields of the target database table. A typical control file looks like this:
LOAD DATA
INFILE '/ora9i/fengjie/agent/data/ipaagentdetail200410.txt'
TRUNCATE
INTO TABLE fj_ipa_agentdetail
FIELDS TERMINATED BY ","
TRAILING NULLCOLS
(AGENT_NO      char,
 AGENT_NAME    char,
 AGENT_ADDRESS char,
 AGENT_LINKNUM char,
 AGENT_LINKMAN char)
Here, INFILE '/ora9i/fengjie/agent/data/ipaagentdetail200410.txt' names the source file to be imported (the source file can also be supplied directly on the command line); fj_ipa_agentdetail is the name of the target table; FIELDS TERMINATED BY "," states that the fields of the source file are separated by commas; TRAILING NULLCOLS means that missing trailing fields are still written to the table as nulls; and the final five entries describe the column structure of the target table. As this typical control file shows, the control file must be consistent with the format of the source file, otherwise errors will occur when the data is imported.
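As a rough sketch of how such a control file is run (the control file name ipaagentdetail.ctl, the log file name and the scott/tiger account are placeholders, not taken from the original example), the sqlldr utility shipped with the Oracle client is invoked from the command line:

sqlldr userid=scott/tiger control=ipaagentdetail.ctl log=ipaagentdetail.log bad=ipaagentdetail.bad

To supply the source file on the command line instead of relying on the INFILE clause, the DATA parameter can be used:

sqlldr userid=scott/tiger control=ipaagentdetail.ctl data=/ora9i/fengjie/agent/data/ipaagentdetail200410.txt log=ipaagentdetail.log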
In addition to the control file, SQL*Loader also requires a data file, i.e. the source file. Depending on their format, source files fall into two broad classes, fixed-length fields and delimited fields, which are explained separately below:
Text file with fixed-length fields
That is, each field has a fixed length, for example:
602530005922 1012
602538023138 1012
602536920355 1012
602531777166 1012
602533626494 1012
602535700601 1012
Text file with delimiters
That is, the fields are separated by the same delimiter, for example:
1001, Shanghai Long-distance Telecom Integrated Development Company, Room 140, No. 34 Nanjing East Road
1002, Shanghai Huchuchi Communication Technology Co., Ltd., Room 1902, No. 19 Wuning Road
1003, Shanghai Bang Zheng Technology Development Co., Ltd., Room 903, No. 61 Nanjing East Road
SQL*Loader can handle both file formats. The fixed-length text file shown above is used here as an example:
Since this file has only two fields, a device number and a branch number, with lengths of 20 and 5 characters respectively, the control file can be written as follows:
LOAD DATA
INFILE '/ora9i/fengjie/agent/data/ipaagent200410.txt'
TRUNCATE
INTO TABLE fj_ipa_agent
(device_no POSITION(1:20)  char,
 branch_no POSITION(21:25) char)
Here '/ora9i/fengjie/agent/data/ipaagent200410.txt' is the full path of the data file, and POSITION(m:n) indicates that the field occupies character positions m through n.
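Before such a control file can be run, the target table must already exist in the database. A minimal sketch of a definition matching the fixed-length example, assuming plain VARCHAR2 columns sized to the 20- and 5-character fields described above (the column types are an assumption, not taken from the original article):

CREATE TABLE fj_ipa_agent (
    device_no VARCHAR2(20),   -- device number, positions 1-20 of each line
    branch_no VARCHAR2(5)     -- branch number, positions 21-25 of each line
);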
For data files with delimiters, an example has already been given above and is not repeated here. In summary, SQL*Loader makes it easy to import data files into the database, and it is also the most commonly used approach.
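Once a load finishes, sqlldr records the number of loaded and rejected rows in its log file; a simple sanity check from SQL*Plus, using the target table of the first example, is to compare the row count against the source file:

-- count the rows actually loaded into the target table
SELECT COUNT(*) FROM fj_ipa_agentdetail;

-- spot-check a few imported rows
SELECT agent_no, agent_name FROM fj_ipa_agentdetail WHERE ROWNUM <= 5;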
Second, use professional data extraction tools
At present, extraction, transformation and loading (ETL) is an important technology in the data warehouse field, and it is particularly suitable for files containing large volumes of data. Here is a brief introduction to one of the current mainstream data extraction tools, Informatica.
The tool is operated mainly through a graphical interface. Its main workflow is: first import the structure (format) of the source data file into Informatica, then apply certain transformations to that structure according to business rules, and finally map it to the target table.
The above process is only a mapping from source to target; the actual extraction and loading of the data must be carried out in a Workflow.
With a professional data extraction tool, multiple data sources can be combined according to business logic using operations such as JOIN, UNION and INTERSECT, which makes this approach well suited to large databases and data warehouses.
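Informatica expresses these combinations as transformations inside its graphical mappings; stated in plain SQL terms, with purely hypothetical staging tables stage_agents_a and stage_agents_b holding two already-loaded source extracts, the operations correspond to:

-- UNION: agents appearing in either extract (duplicates removed)
SELECT agent_no FROM stage_agents_a
UNION
SELECT agent_no FROM stage_agents_b;

-- INTERSECT: agents appearing in both extracts
SELECT agent_no FROM stage_agents_a
INTERSECT
SELECT agent_no FROM stage_agents_b;

-- JOIN: enrich one extract with attributes from the other
SELECT a.agent_no, a.agent_name, b.agent_address
  FROM stage_agents_a a
  JOIN stage_agents_b b ON a.agent_no = b.agent_no;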
Third, import with the Access tool
A text file can be opened directly in Access, imported into an Access database by following the wizard, and then transferred into the final target database programmatically.
This approach is cumbersome, but it places relatively low demands on the system's software configuration, so it still has a certain range of use.
Fourth, summary
In summary, converting flat files into database format is beneficial for data processing; the powerful data processing capability of a database is clearly far more efficient than direct file I/O. It is hoped that this article can serve as a starting point for further discussion in this field.