Using Lucene-1.3-final to build a full-text search application on the web

zhaozj2021-02-16  101


Author: lotus

I read through Lucene-1.3-final some time ago but had no time to write it up. The previous article, "Lucene-1.3-final fully supports Chinese word segmentation and retrieval", showed that Lucene-1.3-final can support Chinese search without any extension or modification. Since Lucene-1.3-final supports Chinese full-text retrieval, it should also support full-text retrieval on the web. This article briefly describes how to build a web full-text search application without modifying Lucene-1.3-final at all.

I. Setting up the environment

1. System environment. My setup is: Windows 2000 Professional, Resin 2.1.6 (why Resin is explained below), lucene-1.3-final.jar, lucene-demos-1.3-final.jar, and luceneweb.war for the web application.

2. Application environment. Copy luceneweb.war to the webapps directory under the Resin directory and start Resin. luceneweb.war is automatically unpacked into a directory named luceneweb; you can then delete the luceneweb.war file. Copy the two files lucene-1.3-final.jar and lucene-demos-1.3-final.jar into the lib directory under WEB-INF in the newly created luceneweb directory. The Lucene-1.3-final application environment is now in place.

II. Configuration

1. Build the index. Suppose the files to be searched are in d:/resin-2.1.6/webapps/index. Copy a number of Chinese files into it; subdirectories are allowed. Note: only files in HTM/HTML/TXT format can be searched. Searching other formats, such as Word or PDF files, requires extending Lucene, which is outside the scope of this article.
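Before indexing, it can help to see which files under the source directory actually qualify. The following is a minimal sketch using only the Java standard library; it is a separate check, not the demo indexer's own code, and the class name IndexableFiles and its methods are hypothetical names chosen for illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: lists the files the article says can be searched as-is.
final class IndexableFiles {
    // Extensions named in the article: HTM / HTML / TXT.
    private static final String[] EXTENSIONS = {".htm", ".html", ".txt"};

    // True if the file name ends with one of the supported extensions (case-insensitive).
    static boolean isIndexable(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        for (String ext : EXTENSIONS) {
            if (lower.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    // Recursively collects supported files below root; returns an empty list if root is not a directory.
    static List<File> collect(File root) {
        List<File> found = new ArrayList<File>();
        File[] children = root.listFiles(); // null when root is not a readable directory
        if (children == null) {
            return found;
        }
        for (File child : children) {
            if (child.isDirectory()) {
                found.addAll(collect(child));
            } else if (isIndexable(child.getName())) {
                found.add(child);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Default path is the example directory used in this article.
        File root = new File(args.length > 0 ? args[0] : "d:/resin-2.1.6/webapps/index");
        for (File f : collect(root)) {
            System.out.println(f.getPath());
        }
    }
}
```

Any Word or PDF files in the directory are simply absent from the listing, which matches the restriction noted above.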

To index the files in d:/resin-2.1.6/webapps/index and store the index in d:/resin-2.1.6/webapps/luceneindex, run the following commands at the CMD prompt:

    D:\> cd \resin-2.1.6\webapps\luceneindex
    D:\resin-2.1.6\webapps\luceneindex> java org.apache.lucene.demo.IndexHTML -create -index d:/resin-2.1.6/webapps/luceneindex ../index

(Note: lucene-1.3-final.jar and lucene-demos-1.3-final.jar must be on the system classpath.) This indexes the files in d:/resin-2.1.6/webapps/index, including those in all of its subdirectories, and stores the index files in d:/resin-2.1.6/webapps/luceneindex. The index files have names such as _6.f1, _6.f2, _6.f3, _6.f4, ...

2. Configure the index directory in the web application. In the luceneweb directory, find the JSP file configuration.jsp, open it in an editor, and change line 5, which has the following form, so that it points to your own index directory:

    String indexLocation = "d:/resin-2.1.6/webapps/luceneindex";

3. A small modification. In the luceneweb directory, find the JSP file result.jsp, open it in an editor, and change line 68 from

    Analyzer analyzer = new StopAnalyzer();        // construct our usual analyzer

to

    Analyzer analyzer = new StandardAnalyzer();

Note: Because the StandardAnalyzer class (org.apache.lucene.analysis.standard.StandardAnalyzer) is now used, you must first import it in this JSP file. Without this change only English content can be searched; with it, Chinese content can be searched as well.
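Why the analyzer matters can be illustrated with a simplified model. The sketch below does not use Lucene; it assumes, as I understand the two analyzers, that StopAnalyzer splits text into runs of letters (so an unbroken run of Chinese characters becomes one long token) while StandardAnalyzer emits each CJK character as its own token. The class TokenizerSketch and its methods are hypothetical, and the sample text is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of two tokenizing strategies; NOT Lucene's actual implementation.
final class TokenizerSketch {
    // Letter-run model: a token is a maximal run of letters, lowercased.
    static List<String> letterRuns(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder run = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                run.append(Character.toLowerCase(c));
            } else if (run.length() > 0) {
                tokens.add(run.toString());
                run.setLength(0);
            }
        }
        if (run.length() > 0) {
            tokens.add(run.toString());
        }
        return tokens;
    }

    // Assumption for this sketch: only the CJK Unified Ideographs block counts as CJK.
    static boolean isCjk(char c) {
        return Character.UnicodeBlock.of(c) == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS;
    }

    // Per-character model: each CJK ideograph is its own token; other letter runs stay whole.
    static List<String> cjkPerChar(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder run = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean cjk = isCjk(c);
            if (!cjk && Character.isLetter(c)) {
                run.append(Character.toLowerCase(c));
                continue;
            }
            if (run.length() > 0) {
                tokens.add(run.toString());
                run.setLength(0);
            }
            if (cjk) {
                tokens.add(String.valueOf(c));
            }
        }
        if (run.length() > 0) {
            tokens.add(run.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String sample = "\u8bed\u6cd5\u68c0\u67e5"; // four Chinese characters, hypothetical sample
        System.out.println(letterRuns(sample).size() + " token(s) with letter runs");
        System.out.println(cjkPerChar(sample).size() + " token(s) with per-character CJK");
    }
}
```

Under the letter-run model a query for two of the four characters finds nothing, because the index only holds the whole run as a single term; with one token per character the query can match.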

Also: add <%@ page language="java" contentType="text/html; charset=GBK" %> as the first line of result.jsp so that Chinese results display correctly.
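The effect of the charset setting can be seen outside the JSP container. A JSP page without an explicit charset defaults to ISO-8859-1, which cannot represent Chinese characters. The sketch below is a standalone illustration, not part of the web application; CharsetSketch and roundTrip are hypothetical names, the sample text is made up, and it assumes the JVM ships the GBK charset.

```java
import java.nio.charset.Charset;

// Standalone illustration of why the response charset matters for Chinese text.
final class CharsetSketch {
    // Encodes text with the named charset and decodes it again with the same charset.
    static String roundTrip(String text, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String keyword = "\u8bed\u6cd5\u68c0\u67e5"; // hypothetical sample: four Chinese characters
        // GBK can represent the characters, so the text survives unchanged.
        System.out.println("GBK preserved: " + roundTrip(keyword, "GBK").equals(keyword));
        // ISO-8859-1 cannot, so each character is replaced during encoding.
        System.out.println("ISO-8859-1 preserved: " + roundTrip(keyword, "ISO-8859-1").equals(keyword));
    }
}
```

With ISO-8859-1 every unmappable character is replaced by a question mark during encoding, which is the kind of garbled output the page directive avoids.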

OK, everything is done and the application is ready to run.
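Before restarting the server, it may be worth confirming that the directory named in configuration.jsp really holds an index. The following is a rough sanity check using only the Java standard library; it assumes that a Lucene 1.x index directory contains a file named segments, and IndexDirCheck is a hypothetical name.

```java
import java.io.File;

// Rough pre-flight check for the index directory configured in configuration.jsp.
final class IndexDirCheck {
    // Assumption: a Lucene 1.x index directory contains a file named "segments".
    static boolean looksLikeIndex(File dir) {
        return dir.isDirectory() && new File(dir, "segments").isFile();
    }

    public static void main(String[] args) {
        // Default path is the example index directory used in this article.
        File dir = new File(args.length > 0 ? args[0] : "d:/resin-2.1.6/webapps/luceneindex");
        if (looksLikeIndex(dir)) {
            System.out.println("Index found in " + dir.getPath());
        } else {
            System.out.println("No index found in " + dir.getPath() + "; rerun the indexing command.");
        }
    }
}
```

If the check fails, the usual causes are a mistyped path in indexLocation or an indexing command run against a different directory.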

III. Results

Restart Resin and enter the address http://localhost/luceneweb/index.jsp in the browser. The search page appears as shown. Enter the keywords you want to search for, for example "syntax check", and click the Search button. The results are displayed as shown: there are two hits, one an HTML file and the other a TXT file.

When reposting, please cite the original address: https://www.9cbs.com/read-13443.html
