Lucene in Practice: IndexFiles

I have finally started learning Lucene, which had been on my mind for several days.

Game Starts

References:

  1. http://lucene.apache.org/core/4_9_0/demo/src-html/org/apache/lucene/demo/IndexFiles.html

  2. http://www.ibm.com/developerworks/cn/java/j-lo-lucene1/

  3. http://www.cnblogs.com/likehua/archive/2012/02/16/2354532.html

Required JARs

  1) lucene-core-4.6.0.jar

  2) lucene-analyzers-common-4.6.0.jar

  3) lucene-queryparser-4.6.0.jar

  http://archive.apache.org/dist/lucene/java/

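Main classes (see [Reference 2])

  a) Document: wraps the content that is to be indexed

  b) Field: describes one attribute of a document

  c) Directory: abstracts the location where the index is stored, for example a folder on disk via FSDirectory

  d) Analyzer: the tokenizer that splits text into terms

  e) IndexWriterConfig: holds the configuration for an IndexWriter

  f) IndexWriter: the core class for creating the index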
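To make the division of labour among these classes concrete, here is a minimal plain-Java sketch of the same pipeline. It is a toy model, not Lucene code: `ToyIndex`, `analyze`, `addDocument` and `search` are hypothetical stand-ins invented for illustration, and a real Analyzer and IndexWriter do far more.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: these are NOT Lucene classes, just stand-ins for their roles.
class ToyIndex {
    // Stand-in for stored Documents: each one is a bag of named fields.
    private final List<Map<String, String>> documents = new ArrayList<>();
    // Inverted index: "field:term" -> ids of the documents containing that term.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Stand-in for Analyzer: lower-case the text and split on non-alphanumerics.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Stand-in for IndexWriter.addDocument: store the document, index every field.
    int addDocument(Map<String, String> doc) {
        int id = documents.size();
        documents.add(doc);
        for (Map.Entry<String, String> field : doc.entrySet()) {
            for (String term : analyze(field.getValue())) {
                postings.computeIfAbsent(field.getKey() + ":" + term, k -> new TreeSet<>()).add(id);
            }
        }
        return id;
    }

    // Return the ids of all documents whose given field contains the term.
    Set<Integer> search(String field, String term) {
        return postings.getOrDefault(field + ":" + term.toLowerCase(Locale.ROOT), new TreeSet<>());
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        Map<String, String> doc = new HashMap<>();
        doc.put("name", "notes.txt");
        doc.put("content", "Lucene builds an inverted index");
        index.addDocument(doc);
        System.out.println(index.search("content", "lucene"));
    }
}
```

In the real library the in-memory map is replaced by a Directory, and the choice between StringField and TextField decides whether a field value goes through the Analyzer at all.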

What's Up

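 How to update the index

After calling config.setOpenMode(OpenMode.CREATE); I assumed I was all set, so I wrote an IndexSearcher to test the index I had built.

I ran main in Indexer several times (which produced several index files instead of overwriting the old ones), and searching with IndexSearcher then returned duplicate results.

I had clearly set the open mode, so what was going on? I went looking for information.

Someone online suggested that the index might be locked. Sure enough, indexWriter.isLocked(indexDir) returned true, so I promptly unlocked it with indexWriter.unlock(indexDir); and got an error. Lucene does not seem to like people doing that.

I kept reading about Lucene's locking and came across [Reference 3].

That is where the problem was. I then copied the line index(new File(data)); several times in the code so that the same IndexWriter executed it repeatedly, and sure enough the index was overwritten.

But what about after a restart? Call indexWriter.deleteAll(); to wipe the index and rebuild it!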
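Duplicate hits after several runs are what one would expect if each run opens the index in an appending mode. In Lucene 4.x an IndexWriterConfig defaults to CREATE_OR_APPEND, and an OpenMode set on a config only takes effect if that same config object is the one handed to the IndexWriter constructor. The following plain-Java sketch models the three modes. It is a toy, not Lucene's implementation: `ToyStore` and `runIndexer` are hypothetical names, and a list stands in for the index on disk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: mimics the documented OpenMode semantics, not Lucene itself.
class ToyStore {
    enum OpenMode { CREATE, APPEND, CREATE_OR_APPEND }

    // Stands in for the index on disk; it survives across runs of the indexer.
    private final List<String> storedDocs = new ArrayList<>();
    private boolean indexExists = false;

    // One run of the indexer: open in the given mode, add every file, return the doc count.
    int runIndexer(OpenMode mode, List<String> files) {
        if (mode == OpenMode.APPEND && !indexExists) {
            throw new IllegalStateException("APPEND requires an existing index");
        }
        if (mode == OpenMode.CREATE) {
            storedDocs.clear(); // CREATE discards whatever was indexed before
        }
        indexExists = true;
        storedDocs.addAll(files);
        return storedDocs.size();
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("a.txt", "b.txt");

        ToyStore appending = new ToyStore();
        appending.runIndexer(OpenMode.CREATE_OR_APPEND, files);
        // The second run appends the same files again, so every document is duplicated.
        System.out.println("CREATE_OR_APPEND twice: " + appending.runIndexer(OpenMode.CREATE_OR_APPEND, files));

        ToyStore creating = new ToyStore();
        creating.runIndexer(OpenMode.CREATE, files);
        // The second run starts from an empty index, so the count stays the same.
        System.out.println("CREATE twice: " + creating.runIndexer(OpenMode.CREATE, files));
    }
}
```

Under this model, calling deleteAll() at the start of every run has the same visible effect as opening with CREATE: both begin from an empty index.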

 Always Be Coding

The code is based on [Reference 1].

package lucene;

import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class Indexer {
    private static IndexWriter indexWriter;

    // index: directory where the index is stored; data: directory of the documents to index
    public static void index(String index, String data) {
        try {
            Directory indexDir = FSDirectory.open(new File(index));      // where the index is stored
            Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_46); // tokenizer
            IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_46, analyzer);
            config.setOpenMode(OpenMode.CREATE);

            /* OpenMode controls whether an existing index is overwritten:
                 APPEND            Opens an existing index.
                 CREATE            Creates a new index or overwrites an existing one.
                 CREATE_OR_APPEND  Creates a new index if one does not exist, otherwise
                                   it opens the index and documents will be appended.
            */

            // Pass the configured object so that OpenMode.CREATE actually takes effect.
            indexWriter = new IndexWriter(indexDir, config);
            indexWriter.deleteAll(); // important: clear any previously indexed documents before rebuilding
            //System.out.println(IndexWriter.isLocked(indexDir));
            //IndexWriter.unlock(indexDir);
            index(new File(data));   // build the index
            indexWriter.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private static void index(File dataFile) {
        if (dataFile.isDirectory()) { // recurse into folders
            File[] files = dataFile.listFiles();
            if (files == null) {      // listFiles() returns null if the directory cannot be read
                return;
            }
            for (File file : files) {
                index(file);
            }
        } else {
            // try-with-resources closes the reader once the document has been added
            try (Reader reader = new FileReader(dataFile)) {
                Document doc = new Document();
                // Field(name, value, store): Store.YES indexes and stores the value, Store.NO only indexes it
                Field name = new StringField("name", dataFile.getName(), Store.YES); // note: StringField is not tokenized!
                doc.add(name);
                Field path = new StringField("path", dataFile.getAbsolutePath(), Store.YES);
                doc.add(path);
                Field content = new TextField("content", reader); // a Reader-based TextField is never stored (Store.NO)
                doc.add(content);
                indexWriter.addDocument(doc); // add the document to the index
            } catch (IOException e) {         // also covers FileNotFoundException
                e.printStackTrace();
            }
        }
    }

    public static void main(String[] args) {
        index("C:/Users/Administrator/Desktop/df", "E:/data/data");
    }
}
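The recursive index(File) method depends on File.listFiles(), which returns null when the path is not a directory or cannot be read. As a small stand-alone sketch of the same traversal without any Lucene dependency, the following collects every regular file under a root and guards against that null. `FileWalker` and `collectFiles` are hypothetical names used only for this illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

class FileWalker {
    // Recursively collect every regular file under root (or root itself if it is a file).
    static List<File> collectFiles(File root) {
        List<File> result = new ArrayList<>();
        collect(root, result);
        return result;
    }

    private static void collect(File file, List<File> result) {
        if (file.isDirectory()) {
            File[] children = file.listFiles();
            if (children == null) {
                return; // unreadable directory or I/O error: skip it instead of throwing
            }
            for (File child : children) {
                collect(child, result);
            }
        } else if (file.isFile()) {
            result.add(file);
        }
        // Paths that do not exist fall through and contribute nothing.
    }

    public static void main(String[] args) {
        File root = new File(args.length > 0 ? args[0] : ".");
        for (File f : collectFiles(root)) {
            System.out.println(f.getPath());
        }
    }
}
```

Collecting the files first and indexing them afterwards also makes it easy to report how many documents a run is about to add.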

TO BE CONTINUED ……

 

Posted on 2014-08-06 21:02 by Erbin
