Lucene is an API for full-text retrieval, and it comes with a good number of examples and application cases.
For an introduction to Lucene, see the references of this article.
This study is a practical one, in four parts: the first is a simple application, the second is a web application, the third deals with Chinese, and the fourth covers related applications (see the Sandbox on the Lucene homepage).
0, preparation
Download the current stable version, lucene-1.2.tar.gz, from the Lucene homepage and unpack it. Copy the two JAR files in the lucene-1.2 directory, lucene-1.2.jar and lucene-demos-1.2.jar, to a suitable directory, then add them to the CLASSPATH environment variable.
tar zxvf lucene-1.2.tar.gz <---- unpack
cd lucene-1.2
cp *.jar $dp
<--- $dp is the directory that holds the JAR files; replace it with the actual directory for your setup
CLASSPATH=$CLASSPATH:$dp/lucene-1.2.jar:$dp/lucene-demos-1.2.jar; export CLASSPATH
If you don't want to set this every time you log in, edit /etc/profile or the .profile in your home directory and add the CLASSPATH line above as the last line of the file. On Windows, right-click "My Computer" on the desktop, select "Properties" -> "Advanced" -> "Environment Variables" -> select CLASSPATH -> "Edit", and add the full path names of the two JAR files in the input box. Note that the separator on Windows is a semicolon (;). See the figure on the right.
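To confirm that the CLASSPATH setting took effect, a small check such as the following can help. It is only a minimal sketch and not part of the Lucene distribution: the class ClasspathCheck and its helper isOnClasspath are names made up for this illustration, and the default class name assumes that lucene-1.2.jar provides org.apache.lucene.index.IndexWriter, the class used by the demo code in this article.

```java
/** Hypothetical helper, not part of Lucene: reports whether a class can be loaded. */
class ClasspathCheck {

    /** Returns true if the named class is visible on the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed default: the IndexWriter class from lucene-1.2.jar
        String name = args.length > 0 ? args[0] : "org.apache.lucene.index.IndexWriter";
        System.out.println(name + (isOnClasspath(name) ? " found" : " NOT found on the classpath"));
    }
}
```

If the check reports that the class was not found, the usual suspects are a wrong directory in $dp or the wrong separator (":" on Unix, ";" on Windows).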
1, run the demo
$ java org.apache.lucene.demo.IndexFiles /usr/local/man/man1/
<- build an index of the man files
adding /usr/local/man/man1/mysql.1
...........
adding /usr/local/man/man1/cvs.1
1614 total milliseconds
$ java org.apache.lucene.demo.SearchFiles
<- search
Query: password
Searching for: password
7 total matching documents
0. /usr/local/man/man1/mysql.1
......
6. /usr/local/man/man1/mysqlshow.1
Query:
OK! The Lucene demo ran successfully.
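To give a feel for what the two demo commands do conceptually (first build an index, then answer queries from it), here is a toy inverted index in plain Java. It is an illustration only and does not show how Lucene is implemented: the class ToyIndex, its methods, and the sample file contents are all made up for this sketch.

```java
import java.util.*;

/** Toy inverted index (illustration only; this is not how Lucene is implemented). */
class ToyIndex {
    // term -> names of the documents that contain it, in insertion order
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** "Indexing": split the text into lowercase words and record the document under each word. */
    void add(String docName, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new LinkedHashSet<>()).add(docName);
            }
        }
    }

    /** "Searching": look the word up in the map instead of scanning every document. */
    List<String> search(String word) {
        Set<String> hits = postings.get(word.toLowerCase(Locale.ROOT));
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        // Hypothetical file contents, made up for this example
        index.add("mysql.1", "connect with user name and password");
        index.add("cvs.1", "check out sources from the repository");
        index.add("mysqlshow.1", "the password to use when connecting");
        System.out.println(index.search("Password")); // prints [mysql.1, mysqlshow.1]
    }
}
```

The point of the design is that the expensive work (reading and splitting every file) happens once at indexing time, so each query is a single map lookup.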
The main API calls made by this demo program:
/* Main calls for indexing */
File file = new File(argv[0]);  // the path given on the command line
IndexWriter writer = new IndexWriter("index", new StandardAnalyzer(), true);  // true: create a new index
Document doc = new Document();
doc.add(Field.Text("path", file.getPath()));  // stored and searchable
doc.add(Field.Keyword("modified", DateField.timeToString(file.lastModified())));  // stored, not tokenized
FileInputStream is = new FileInputStream(file);
Reader reader = new BufferedReader(new InputStreamReader(is));
doc.add(Field.Text("contents", reader));  // indexed from the reader
writer.addDocument(doc);
writer.optimize();
writer.close();
/* Main calls for searching */
Searcher searcher = new IndexSearcher("index");
Analyzer analyzer = new StandardAnalyzer();
Query query = QueryParser.parse(lineForSearch, "contents", analyzer);
Hits hits = searcher.search(query);
for (int i = start; i ...
3, run luceneweb
Assume that Tomcat is installed in the $TOMCATHOME directory; replace $TOMCATHOME with the real directory when you apply these steps.
cd $TOMCATHOME/webapps
mkdir lucenedb
cd lucenedb
java org.apache.lucene.demo.IndexHTML -create -index $TOMCATHOME/webapps/lucenedb ../examples
<- The relative path ".." serves two purposes: it points to the location of the files to be indexed, and it makes the URLs of the indexed files display correctly, because the JSP search program is in the luceneweb subdirectory. In a real application, examples can be replaced by another directory name.
cd ..
cp ~/lucene-1.2/luceneweb.war .
<- luceneweb.war is in the lucene-1.2 directory you unpacked
../bin/shutdown.sh
../bin/startup.sh
Then access http://yourdomain.com:8080/luceneweb from a client; the browser should show the page in the figure on the right.
Go back to the server:
cd luceneweb
vi configuration.jsp
<- change the value of indexLocation to "$TOMCATHOME/webapps/lucenedb"
cd ..
jar -ur luceneweb.war luceneweb
Go back to the client, refresh the page, then enter a word to search for it. Unfortunately, this can only search for English words, and if the title of an HTML page is in Chinese characters, it is not displayed correctly. See the figure.
The IndexHTML used here can index files of type htm, html, and txt. It uses an HTMLParser; apart from that, it is basically the same as the previous example.
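On the remark that only English words can be searched: this article does not show how Lucene 1.2's analyzers handle Chinese text, so the following is only a sketch of one general reason why a tokenizer designed for English can fail on Chinese. It uses a toy whitespace tokenizer (the class ToyTokenizer is made up for this illustration and is not Lucene's StandardAnalyzer). English words are separated by spaces, Chinese words are not, so a whole Chinese phrase ends up as one token and a query for a word inside it finds nothing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Toy tokenizer (illustration only; not Lucene's StandardAnalyzer). */
class ToyTokenizer {

    /** Splits text into lowercase tokens at whitespace, as a naive English tokenizer would. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String english = "full text search";
        // The four Chinese characters for "full-text retrieval" (quan wen jian suo),
        // written as Unicode escapes so the source compiles regardless of file encoding
        String chinese = "\u5168\u6587\u68c0\u7d22";

        System.out.println(tokenize(english).size()); // prints 3
        System.out.println(tokenize(chinese).size()); // prints 1
        // A query for the two-character word "retrieval" (jian suo) matches no token exactly
        System.out.println(tokenize(chinese).contains("\u68c0\u7d22")); // prints false
    }
}
```

Handling Chinese therefore needs a tokenizer that can split text without relying on spaces, for example by emitting single characters or overlapping character pairs.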