Source: e800.com.cn
The technical foundation of search engines and of classified (directory) search engines is full-text retrieval technology, which has been studied abroad since the 1960s. Full-text retrieval usually means retrieval over the full text of documents, and it covers the storage, organization, presentation, querying, and access of information. Its core is the indexing and retrieval of text, and it is generally used by enterprises and institutions. As information on the Internet grew, search engines gradually developed on top of full-text retrieval technology and came into wide use, but a search engine is still not the same thing as full-text retrieval. The main differences between search engines and full-text retrieval in the conventional sense are as follows:

1. Data volume. A full-text retrieval system works on an enterprise's own data or on data related to the enterprise. Its index is usually on the order of gigabytes, and even a large collection holds only a few million records. Internet web search, by contrast, has to handle billions of web pages, so search engines rely on server clusters and distributed computing.

2. Content relevance. With so much information, precision and ranking become especially important. Google and other search engines use web link analysis, taking the number of links that point to a page as a measure of its importance. The data sources of a full-text retrieval system are not heavily interlinked, so links cannot serve as a basis for judging importance, and results can only be ranked by content.

3. Security. The data source of an Internet search engine is information published openly on the Internet, and little apart from the body text matters. The data sources of enterprise full-text retrieval, however, are the company's own information, subject to classification levels, permissions, and other restrictions, and there are stricter requirements on how queries may be made. Such data is therefore generally stored securely and centrally in a data warehouse to meet data security and management requirements.

4. Personalization and intelligence. Search engines serve Internet visitors at large. Because of the volume of data and the number of users, computationally intensive techniques such as natural language processing, knowledge retrieval, and knowledge mining are difficult to apply, and this is the direction in which search engine technology is currently working. Full-text retrieval deals with less data, clearer retrieval needs, and fewer users, so it can go further in intelligence and personalization.

Beyond these differences, search engines and full-text retrieval have developed into three different types:

Full-text search engines: A full-text search engine is a search engine in the true sense of the term. Overseas representatives include Google (http://www.google.com), Yahoo (http://search.yahoo.com), and AlltheWeb (http://www.alltheweb.com); well-known domestic ones include Baidu (http://www.baidu.com) and Zhongsou (http://www.zhongsou.com). They all retrieve the records that match the user's query conditions and return the results to the user in a certain order, and they are what is currently meant by a search engine in the conventional sense.

Directory search engines: A directory index has a search function, but strictly speaking it is not a true search engine; it is just a list of website links organized by category. Users can find what they need through the category directory alone, without issuing keyword queries. Well-known directory search engines abroad include Yahoo (http://www.yahoo.com), Open Directory Project (http://www.dmoz.com/), and LookSmart (http://www.looksmart.com).
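
The indexing-and-retrieval core of full-text search, and the two ranking approaches contrasted in the second difference above, can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the function names, the toy corpus, and the link graph are hypothetical, term frequency stands in for "ranking by content", and a raw count of inbound links is a deliberate simplification of link analysis rather than the method any real search engine uses.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a real system would do far more."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to {doc_id: term frequency in that document}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, freq in Counter(tokenize(text)).items():
            index[term][doc_id] = freq
    return index


def search_by_content(index, query):
    """Rank matching documents by summed term frequency (content only)."""
    scores = Counter()
    for term in tokenize(query):
        for doc_id, freq in index.get(term, {}).items():
            scores[doc_id] += freq
    return sorted(scores, key=lambda d: (-scores[d], d))


def search_by_links(index, query, links):
    """Rank matching documents by how many other documents link to them."""
    inbound = Counter(t for targets in links.values() for t in targets)
    matches = {d for term in tokenize(query) for d in index.get(term, {})}
    return sorted(matches, key=lambda d: (-inbound[d], d))


# Hypothetical toy corpus and link graph, for illustration only.
docs = {
    "a": "search engine index",
    "b": "full text search search",
    "c": "directory of links",
}
links = {"b": ["a"], "c": ["a"]}  # page -> pages it links to

index = build_inverted_index(docs)
content_ranking = search_by_content(index, "search")
link_ranking = search_by_links(index, "search", links)
```

For the query "search", the content-based ranking puts document "b" first because the term occurs twice there, while the link-based ranking puts "a" first because two other pages link to it. In a collection with few links between documents, the second signal carries little information, which is why content is the only usable basis for ranking there.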