Why Is Lucene So Popular?


With the help of my good friend LHELPER, I have started learning the full-text search engine Lucene. After searching for material on full-text search, I found that there are many products in this area: http://www.searchtools.com alone lists more than 100 search tools (including source code). If you search Baidu with "full-text search" as the keyword, you will find that, apart from theoretical introductions, most of the examples come from Lucene. Why is Lucene so widely accepted?

My impression is that Lucene is, so to speak, even more popular than Keso.

I think a product becomes popular for two reasons: its technology is advanced and it is well promoted.

Lucene has both.

First, the technology. Lucene's contributor Doug Cutting is a veteran expert in full-text indexing and retrieval, and a product developed by such a big name naturally earns a good reputation by word of mouth. But I do not think this is the main reason Lucene is so popular. I think the main reasons are:

1. Lucene is not a complete full-text indexing application but a full-text indexing engine toolkit written in Java, which can easily be embedded in all kinds of applications to provide application-specific full-text indexing and retrieval. This positioning gives Lucene a high level of abstraction, so it is easy to extend and to integrate into existing systems. For most full-text search applications, what we need is a development kit rather than a finished product (although many search engines can also be extended with extra features). This is also the level of packaging that programmers are most willing to accept.

2. Lucene's API is designed to be fairly general, and its input and output structures closely resemble a database's table ==> record ==> field, so the files, databases, and other data of many traditional applications can be mapped quite easily onto Lucene's storage structure and interfaces. (The statements above come from the article "Adding Full-Text Search to Applications: The Java-Based Full-Text Index Engine Lucene".)
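To make the table ==> record ==> field analogy concrete, here is a minimal sketch of a toy in-memory index. It is not Lucene's API: the class `ToyIndex`, its methods, and the sample rows are all hypothetical, written only to show how database-style records can map onto an index ==> document ==> field structure.

```python
from collections import defaultdict


class ToyIndex:
    """In-memory inverted index: index -> documents -> fields (not Lucene's API)."""

    def __init__(self):
        self.documents = []  # each document is a dict of field name -> text
        # postings[(field, term)] -> set of ids of documents containing the term
        self.postings = defaultdict(set)

    def add_document(self, fields):
        """Store one record and index every field by whitespace-separated term."""
        doc_id = len(self.documents)
        self.documents.append(dict(fields))
        for name, text in fields.items():
            for term in text.lower().split():
                self.postings[(name, term)].add(doc_id)
        return doc_id

    def search(self, field, term):
        """Return the documents whose given field contains the term."""
        ids = sorted(self.postings.get((field, term.lower()), ()))
        return [self.documents[i] for i in ids]


# Hypothetical rows, as they might come back from a database table.
rows = [
    {"title": "Lucene in five minutes", "body": "a full-text index toolkit"},
    {"title": "Database tuning", "body": "index design for a relational table"},
]

index = ToyIndex()
for row in rows:
    index.add_document(row)  # one table row becomes one document

hits = index.search("body", "index")
titles = [doc["title"] for doc in hits]
```

Each row becomes one document and each column becomes one field, which is the kind of mapping the second point describes. A real engine would add tokenization, scoring, and on-disk storage, none of which this sketch attempts.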

Second, the promotion of Lucene. Lucene is very popular in China, and I think Che Dong's series of search-related articles has played a large part in that. Che Dong has not only promoted and introduced Lucene; he has also contributed greatly to its Chinese-language support and its use in Web applications.

Although there are also articles introducing other search engines, their influence is much smaller.

After reading many articles about Lucene, I found that most of the introductions are similar to Che Dong's article (much of the wording is nearly identical, so it was presumably either plagiarized or "borrowed"), and most of them stay at the Hello World level.

The article mentions that the index files for Chinese text take up almost as much space as the original text! This is hard to accept; even for English data, the index takes up 30% to 50% of the size of the original text. So the developers of Google and Baidu must be the largest customers for high-capacity hard drives. Clearly, optimizing the index files is an important part of a search engine.
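To put those ratios in perspective, here is a small back-of-the-envelope calculation. The 100 GiB corpus size and the function name `estimate_index_bytes` are assumptions chosen for illustration; only the percentages come from the text above.

```python
def estimate_index_bytes(original_bytes, percent_of_original):
    """Estimate index size as a whole-number percentage of the original text."""
    return original_bytes * percent_of_original // 100


GIB = 1024 ** 3
corpus = 100 * GIB  # hypothetical corpus size, not a figure from the article

english_low = estimate_index_bytes(corpus, 30)   # lower bound quoted for English
english_high = estimate_index_bytes(corpus, 50)  # upper bound quoted for English
chinese = estimate_index_bytes(corpus, 100)      # "almost as big as the original"

print(english_low // GIB, english_high // GIB, chinese // GIB)
```

Under these assumptions, indexing the Chinese corpus roughly doubles the total storage needed, while the English index adds between a third and a half.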

In addition, if you want Lucene to become a distributed search engine, you should also start from the index files: either extend the index storage into a distributed file system, or put the index files into a database and use the database's distributed capabilities to provide a distributed search service.
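As a rough illustration of the second option, the sketch below keeps inverted-index postings in a database table instead of in index files. It uses SQLite from the Python standard library purely as a stand-in; a real deployment would need a database that is actually distributed. The table layout and the function names are assumptions, and this is not Lucene's index format.

```python
import sqlite3


def open_index(path=":memory:"):
    """Create the postings table; ':memory:' keeps the sketch self-contained."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS postings ("
        "term TEXT NOT NULL, doc_id INTEGER NOT NULL, "
        "PRIMARY KEY (term, doc_id))"
    )
    return conn


def add_document(conn, doc_id, text):
    """Insert one (term, doc_id) row per distinct term in the text."""
    terms = set(text.lower().split())
    conn.executemany(
        "INSERT OR IGNORE INTO postings (term, doc_id) VALUES (?, ?)",
        [(term, doc_id) for term in terms],
    )
    conn.commit()


def search(conn, term):
    """Return the ids of documents containing the term, in ascending order."""
    rows = conn.execute(
        "SELECT doc_id FROM postings WHERE term = ? ORDER BY doc_id",
        (term.lower(),),
    )
    return [doc_id for (doc_id,) in rows]


conn = open_index()
add_document(conn, 1, "Lucene is a full-text index toolkit")
add_document(conn, 2, "The index file can live in a database")
add_document(conn, 3, "Distributed search needs a shared index")

index_hits = search(conn, "index")
database_hits = search(conn, "database")
```

Because the postings are ordinary rows, replication and partitioning become the database's job, which is the appeal of this approach. The trade-off is that the compact on-disk layout of a purpose-built index is lost.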

With all this in mind, I decided to analyze the Lucene index file format.

I will focus on the file format of the Lucene index in future articles.

Lucene now has standard documentation of its index file format, and many people have developed ports of Lucene in a variety of languages. ref: http://java2.5341.com/1_98.html

A demo example for DAO

Research and Implementation of Search Engine Based on Java Technology

http://udoo.51.net/mt/archives/000089.html

http://www.Theserverside.com/news/thread.tss?thread_id=23043

http://www.tbray.org/ongoing/when/200x/2003/07/30/onsearchtoc

Lots of Interest in Lucene Desktop

http://www.getopt.org/luke/

Tian Chunfeng

2004-12-23

