Lucene: Merging an In-Memory Index with a File Index

Posted on 2017-01-06 18:06 by mdong


IndexWriter.addIndexes(ramDirectory);

http://blog.csdn.net/qq_28042463/article/details/51538283

When creating a Lucene index, there are two kinds of index store to choose from.

1. File index

final Path docDir = Paths.get("index");
Directory directory = FSDirectory.open(docDir);

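This creates an index folder on the local disk and stores the index in it, which is why it is called a file index.
Advantage: the index is persisted to disk and survives restarts.
Disadvantage: reads are slower than with an in-memory index.

2. In-memory index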

Directory directory = new RAMDirectory();

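A single line of code creates an in-memory index.
Advantage: fast reads.
Disadvantage: no persistence; the index is gone as soon as the program exits.
Note that RAMDirectory was deprecated in Lucene 8 and removed in Lucene 9, where ByteBuffersDirectory takes its place.

3. Combining the two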

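Given these trade-offs, the two kinds of index can be combined. The idea is:

When the program starts, copy the index from the file index into an in-memory index and let the program work against the in-memory index. When the work is done, persist the in-memory index back to the file index.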

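import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.RAMDirectory;

        /*
         * The statements below belong in a method declared "throws IOException".
         * 1. Create the two indexes
         * 2. Create two IndexWriters
         * 3. Copy the contents of the file index into the in-memory index
         * 4. Let the in-memory index interact with the client
         * 5. Write the contents of the in-memory index back to the file index
         */
        final Path docDir = Paths.get("index");
        // Open the file index
        FSDirectory fileDirectory = FSDirectory.open(docDir);

        // Create the in-memory index as a copy of the file index
        Directory ramDirectory = new RAMDirectory(fileDirectory, IOContext.DEFAULT);

        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
        // CREATE discards the old on-disk commit. Without it, the documents that were
        // already copied into memory would be added a second time by addIndexes below.
        iwc.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
        // IndexWriter for the file index
        IndexWriter fileIndexWriter = new IndexWriter(fileDirectory, iwc);

        // IndexWriter for the in-memory index
        Analyzer analyzer1 = new StandardAnalyzer();
        IndexWriterConfig iwc1 = new IndexWriterConfig(analyzer1);
        IndexWriter ramIndexWriter = new IndexWriter(ramDirectory, iwc1);

        // Article is the author's own POJO (aid, title, content); its source is not shown in this post
        Article article = new Article();
        article.setAid(1L);
        article.setTitle("lucene是一个全文检索引擎");
        article.setContent("baidu,google都是很好的全文检索引擎");

        // Build the document
        Document document = new Document();
        Field idField = new Field("aid", article.getAid().toString(),
                TextField.TYPE_STORED);
        Field titleField = new Field("title", article.getTitle(),
                TextField.TYPE_STORED);
        Field contentField = new Field("content", article.getContent(),
                TextField.TYPE_STORED);
        document.add(idField);
        document.add(titleField);
        document.add(contentField);

        // Add the document to the in-memory index, then close the writer so that
        // its write lock is released before addIndexes runs
        ramIndexWriter.addDocument(document);
        ramIndexWriter.close();

        // Merge the contents of the in-memory index into the file index
        fileIndexWriter.addIndexes(ramDirectory);
        fileIndexWriter.close();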

The Lucene Javadoc describes IndexWriter.addIndexes(Directory...) as follows:

```
public void addIndexes(Directory... dirs) throws IOException
```
Adds all segments from an array of indexes into this index.
This may be used to parallelize batch indexing. A large document collection can be broken into sub-collections. Each sub-collection can be indexed in parallel, on a different thread, process or machine. The complete index can then be created by merging sub-collection indexes with this method.
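
To make the "index sub-collections in parallel, then merge" idea concrete, here is a toy sketch that uses only the JDK. It is not Lucene code and says nothing about how Lucene stores or merges segments: the class ToyParallelIndexing, its methods, and the map-based "index" are all made up for illustration. Each sub-collection gets its own tiny inverted index on a separate thread, and mergeInto plays the role that addIndexes plays in Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical toy class, not part of Lucene.
class ToyParallelIndexing {

    // Build a tiny inverted index (term -> sorted doc ids) for one sub-collection.
    static Map<String, SortedSet<Integer>> indexSubCollection(Map<Integer, String> docs) {
        Map<String, SortedSet<Integer>> index = new TreeMap<>();
        for (Map.Entry<Integer, String> doc : docs.entrySet()) {
            for (String term : doc.getValue().toLowerCase().split("\\W+")) {
                if (!term.isEmpty()) {
                    index.computeIfAbsent(term, k -> new TreeSet<>()).add(doc.getKey());
                }
            }
        }
        return index;
    }

    // Merge every sub-index into the target index (the toy stand-in for addIndexes).
    static void mergeInto(Map<String, SortedSet<Integer>> target,
                          List<Map<String, SortedSet<Integer>>> subIndexes) {
        for (Map<String, SortedSet<Integer>> sub : subIndexes) {
            for (Map.Entry<String, SortedSet<Integer>> e : sub.entrySet()) {
                target.computeIfAbsent(e.getKey(), k -> new TreeSet<>()).addAll(e.getValue());
            }
        }
    }

    // Index each sub-collection on its own thread, then merge the results.
    static Map<String, SortedSet<Integer>> buildInParallel(List<Map<Integer, String>> subCollections) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, subCollections.size()));
        try {
            List<Future<Map<String, SortedSet<Integer>>>> futures = new ArrayList<>();
            for (Map<Integer, String> sub : subCollections) {
                Callable<Map<String, SortedSet<Integer>>> task = () -> indexSubCollection(sub);
                futures.add(pool.submit(task));
            }
            List<Map<String, SortedSet<Integer>>> subIndexes = new ArrayList<>();
            for (Future<Map<String, SortedSet<Integer>>> f : futures) {
                subIndexes.add(f.get());
            }
            Map<String, SortedSet<Integer>> merged = new TreeMap<>();
            mergeInto(merged, subIndexes);
            return merged;
        } catch (Exception e) {
            throw new IllegalStateException("parallel indexing failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<Integer, String> first = new HashMap<>();
        first.put(1, "Lucene is a search library");
        Map<Integer, String> second = new HashMap<>();
        second.put(2, "Lucene merges index segments");

        Map<String, SortedSet<Integer>> merged = buildInParallel(Arrays.asList(first, second));
        System.out.println(merged.get("lucene"));   // prints [1, 2]
        System.out.println(merged.get("segments")); // prints [2]
    }
}
```

In real Lucene code each worker would write to its own Directory with its own IndexWriter, close it, and a final IndexWriter would then call addIndexes on those directories.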

NOTE: this method acquires the write lock in each directory, to ensure that no IndexWriter is currently open or tries to open while this is running.

This method is transactional in how Exceptions are handled: it does not commit a new segments_N file until all indexes are added. This means if an Exception occurs (for example disk full), then either no indexes will have been added or they all will have been.

Note that this requires temporary free space in the Directory up to 2X the sum of all input indexes (including the starting index). If readers/searchers are open against the starting index, then temporary free space required will be higher by the size of the starting index (see forceMerge(int) for details).
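
As a rough worked example of that sizing rule, the sketch below simply turns the sentence above into arithmetic. The class and method names are made up for illustration and are not part of Lucene, the sizes are hypothetical, and the result is an upper bound read straight from the Javadoc wording rather than a measurement.

```java
// Hypothetical helper, not part of Lucene: worst-case temporary space for addIndexes.
class AddIndexesSpaceEstimate {

    // Sizes may be in any unit (bytes, MB, GB) as long as it is used consistently.
    static long worstCaseTempSpace(long startingIndexSize, long[] inputIndexSizes, boolean readersOpen) {
        long sum = startingIndexSize;           // the starting index counts too
        for (long size : inputIndexSizes) {
            sum += size;
        }
        long estimate = 2 * sum;                // "up to 2X the sum of all input indexes"
        if (readersOpen) {
            estimate += startingIndexSize;      // open readers add the starting index size
        }
        return estimate;
    }

    public static void main(String[] args) {
        long[] inputs = {2, 3};                 // two input indexes of 2 GB and 3 GB
        System.out.println(worstCaseTempSpace(10, inputs, false)); // prints 30
        System.out.println(worstCaseTempSpace(10, inputs, true));  // prints 40
    }
}
```

So a 10 GB starting index plus 2 GB and 3 GB inputs may need up to 30 GB of temporary free space, or up to 40 GB while readers are open against the starting index.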

This requires this index not be among those to be added.
