coffee_cn


I recently had to learn Java Lucene 2.0 for work. I am just getting started, so here are some brief notes for the record.

 

1. Download Lucene 2.0
http://lucene.apache.org/

http://archive.apache.org/dist/lucene/java/

lucene-2.0.0.zip
lucene-core-2.0.0.jar

 

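2. Set the CLASSPATH environment variable

Put the Lucene jar on CLASSPATH. Keep the current directory (.) on it as well, otherwise java will not find the classes compiled in the steps below:

.:/home/tomcat/lucene-core-2.0.0.jar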
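
To confirm that the CLASSPATH setting took effect, a small check along these lines may help. This is only a sketch: ClasspathCheck and isOnClasspath are hypothetical names made up for illustration and are not part of Lucene. It relies only on the standard Class.forName call.

```java
// Hypothetical helper, not part of Lucene: reports whether a class is visible on the classpath.
class ClasspathCheck {
    // Returns true if the named class can be located by this program's class loader.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "org.apache.lucene.index.IndexWriter";
        if (isOnClasspath(name)) {
            System.out.println(name + " found");
        } else {
            System.out.println(name + " NOT found, check CLASSPATH");
        }
    }
}
```

Compile and run it the same way as the programs below (javac ClasspathCheck.java, then java ClasspathCheck). If it reports that the class was not found, the jar path in CLASSPATH is probably wrong.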

3. Write a small index-building program for practice: write it, compile it, and run it in one go.

#vi IndexTest.java

import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.Date;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;


public class IndexTest {
        public static void main(String[] args) throws Exception{
                Date start = new Date();

                // The file to index
                String path = "/lucene_test/doc/1.txt";

                try {
                        // true = create a new index, replacing any existing one in this directory
                        IndexWriter indexWriter = new IndexWriter("/lucene_test/data/", new StandardAnalyzer(), true);

                        System.out.println("Indexing file " + path);

                        Document document = new Document();
                        Reader reader = new FileReader(path);

                        // "path" is stored so it can be printed at search time, but it is not indexed
                        document.add(new Field("path", path, Field.Store.YES, Field.Index.NO));
                        // "contents" is read from the Reader: tokenized and indexed, but not stored
                        document.add(new Field("contents", reader));

                        indexWriter.addDocument(document);
                        indexWriter.optimize();
                        indexWriter.close();
                } catch(IOException e) {
                        e.printStackTrace();
                }

                Date end = new Date();

                System.out.print(end.getTime() - start.getTime());
                System.out.println(" total milliseconds");
        }
}

#javac IndexTest.java

#java IndexTest

 

4. Write a small search program

#vi SearchTest.java

import java.io.File;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.FSDirectory;

public class SearchTest {
        public static void main(String[] args) throws Exception {
                // Must be the same directory that IndexTest wrote the index to
                File indexDir = new File("/lucene_test/data/");
                // Check before opening: opening a missing index directory throws an exception
                if(!indexDir.exists()) {
                        System.out.println("index does not exist");
                        return;
                }
                FSDirectory directory = FSDirectory.getDirectory(indexDir, false);
                IndexSearcher searcher = new IndexSearcher(directory);

                // A TermQuery is not analyzed. StandardAnalyzer lowercased the indexed
                // tokens, so the term must be given in lowercase.
                Term term = new Term("contents", "twinkle");
                TermQuery query = new TermQuery(term);
                Hits hits = searcher.search(query);
                for(int i=0; i<hits.length(); i++) {
                        Document document = hits.doc(i);
                        // Print the hit's score; getBoost() on a retrieved document always returns 1.0
                        System.out.println("File: " + document.get("path") + " " + String.valueOf(hits.score(i)));
                }
                searcher.close();
        }
}
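
Why the search term has to be the lowercase "twinkle" can be pictured with a toy inverted index. The sketch below is an illustration only and is NOT how Lucene stores or searches its index: ToyInvertedIndex, add and search are hypothetical names, the sample sentence is made up (the real contents of 1.txt are not shown here), and the tokenizing rule is a rough stand-in for an analyzer. It uses only the Java standard library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy illustration only: this is NOT Lucene's index format or search logic.
class ToyInvertedIndex {
    // term -> paths of the documents that contain it
    private final Map<String, List<String>> postings = new HashMap<String, List<String>>();

    // Index time: split the text on non-alphanumeric characters and lowercase
    // each token, a rough stand-in for what an analyzer does.
    void add(String path, String text) {
        for (String token : text.split("[^A-Za-z0-9]+")) {
            if (token.isEmpty()) {
                continue;
            }
            String term = token.toLowerCase(Locale.ROOT);
            List<String> paths = postings.get(term);
            if (paths == null) {
                paths = new ArrayList<String>();
                postings.put(term, paths);
            }
            if (!paths.contains(path)) {
                paths.add(path);
            }
        }
    }

    // Search time: exact lookup of the term as given, with no analysis applied.
    List<String> search(String term) {
        List<String> paths = postings.get(term);
        return paths == null ? Collections.<String>emptyList() : paths;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        // Made-up sample text for illustration
        index.add("/lucene_test/doc/1.txt", "Twinkle, twinkle, little star");
        System.out.println(index.search("twinkle")); // prints [/lucene_test/doc/1.txt]
        System.out.println(index.search("Twinkle")); // prints []
    }
}
```

The lookup for "Twinkle" finds nothing because only lowercased terms were put into the map. The same mismatch would occur when a capitalized term is passed to a query that is not analyzed, while the index was built with a lowercasing analyzer.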

 

5. I have no time to go any deeper this week. Next week I plan to study how setBoost controls document scores in Lucene.

 

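Reference sites:

http://www.chedong.com/tech/lucene.html Lucene: an introduction to the Java-based full-text search engine

http://www.javaeye.com/topic/33241 Getting started with Lucene

http://kb.cnblogs.com/b/243888/ Lucene source code analysis notes: [org.apache.lucene.document] (part 4)

Book: 征服Ajax+Lucene: 构建搜索引擎 (Conquering Ajax + Lucene: Building a Search Engine)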

 

 

 

posted on 2009-12-24 15:01 by coffee