Daily Study Notes (20)

1, There are two ways to merge index data in Solr. The first was introduced in version 1.4 and goes through the CoreAdminHandler, for example:

http://localhost:8983/solr/admin/cores?action=mergeindexes&core=core0&indexDir=/opt/solr/core1/data/index&indexDir=/opt/solr/core2/data/index

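The command above merges the indexes of core1 and core2 into core0. The key point to remember: once the merge completes, you must issue a commit on core0. Otherwise the index changes stay invisible to searchers until the next time core0 is reloaded.

The implementation looks like this: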

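    /**
     * Merges index data from other cores into the target core.
     *
     * @param otherIndexPathList paths of the other core directories; "/data/index" is appended to each
     * @param coreName name of the target Solr core
     */
    private void mergeIndexData(List<String> otherIndexPathList, String coreName) {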
        if (null != otherIndexPathList && otherIndexPathList.size() > 0) {
            HttpClient client = new HttpClient();
            client.setConnectionTimeout(20000);
            client.setTimeout(20000);
            client.setHttpConnectionFactoryTimeout(20000);

            StringBuilder sb = new StringBuilder();
            for (String indexPath : otherIndexPathList) {
                sb.append("&indexDir=").append(indexPath).append("/data/index");
            }
            String mergeIndexCMD = "http://" + Constants.LOCAL_ADDRESS + ":" + this.port  + "/admin/cores?action=mergeindexes&core="+ coreName;
            if (sb.length() > 0) {
                mergeIndexCMD += sb.toString();
            }
            HttpMethod method = new GetMethod(mergeIndexCMD);
            method.getParams().setContentCharset("GBK");
            method.getParams().setHttpElementCharset("GBK");
            method.getParams().setCredentialCharset("GBK");

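            // Execute the merge request.
            try {
                int status = client.executeMethod(method);
                if (status == 200) {
                    String response = method.getResponseBodyAsString();
                    if (logger.isInfoEnabled()) {
                        logger.info("merge result: " + response);
                    }
                } else {
                    logger.error("Merge request for " + coreName + " returned HTTP status " + status);
                }
            } catch (Exception e) {
                logger.error("Failed to merge other index data into " + coreName + ", index directories: " + otherIndexPathList, e);
            } finally {
                // Release the connection once the response has been read.
                method.releaseConnection();
            }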

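            // Commit so that the merged index becomes visible to searchers.
            StreamingUpdateSolrServer httpSolrServer = getSolrServer(Constants.LOCAL_ADDRESS, this.port, coreName);
            try {
                httpSolrServer.commit();
            } catch (Exception e) {
                // Do not swallow the failure: without a commit the merge stays invisible.
                logger.error("Commit after merge failed on " + coreName, e);
            }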
        }
    }

The second method was introduced in Solr 3.3 and also goes through the CoreAdminHandler, for example:

http://localhost:8983/solr/admin/cores?action=mergeindexes&core=core0&srcCore=core1&srcCore=core2

As with the first method, once the merge completes you must issue a commit on core0. Otherwise the index changes stay invisible to searchers until the next time core0 is reloaded.
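
The two request shapes differ only in the repeated parameter ("indexDir" or "srcCore"), so one helper can assemble either URL. Below is a minimal sketch that uses only the JDK. The class name MergeUrlSketch, the method names, and the base URL http://localhost:8983/solr are assumptions for illustration and are not part of Solr. It also builds the follow-up commit URL, assuming the core exposes the usual /update handler and accepts commit=true as a request parameter.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.List;

public class MergeUrlSketch {

    /** URL-encodes one query value as UTF-8 (hypothetical helper). */
    private static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Builds a CoreAdmin mergeindexes URL.
     *
     * @param solrBase   base URL such as http://localhost:8983/solr (assumed)
     * @param targetCore core that receives the merged data
     * @param paramName  "indexDir" or "srcCore"
     * @param sources    index directories or source core names
     */
    public static String buildMergeUrl(String solrBase, String targetCore,
                                       String paramName, List<String> sources) {
        StringBuilder url = new StringBuilder(solrBase)
                .append("/admin/cores?action=mergeindexes&core=")
                .append(enc(targetCore));
        for (String source : sources) {
            // Encoding keeps paths with spaces or '&' from breaking the query string.
            url.append('&').append(paramName).append('=').append(enc(source));
        }
        return url.toString();
    }

    /** Builds the URL for the commit that must follow the merge. */
    public static String buildCommitUrl(String solrBase, String targetCore) {
        return solrBase + "/" + enc(targetCore) + "/update?commit=true";
    }

    public static void main(String[] args) {
        String base = "http://localhost:8983/solr";
        System.out.println(buildMergeUrl(base, "core0", "srcCore",
                Arrays.asList("core1", "core2")));
        System.out.println(buildMergeUrl(base, "core0", "indexDir",
                Arrays.asList("/opt/solr/core1/data/index")));
        System.out.println(buildCommitUrl(base, "core0"));
    }
}
```

Note that URLEncoder turns "/" into "%2F" in the indexDir values; the servlet container decodes it back when it parses the query string.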

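Differences between using "srcCore" and "indexDir":

1) With "indexDir", you can merge indexes that are not associated with any Solr core, such as indexes created directly with Lucene.

2) With "indexDir", you have to make sure the index is not being written to. If it belongs to a Solr core, that means the IndexWriter must be closed, which triggers a commit.

3) "indexDir" must point to a path on the disk of the host where the Solr core lives, which is fairly restrictive. With "srcCore", by contrast, you only give the name of a Solr core and do not need to care where its index directory actually is.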

4) With "srcCore", you are assured that the merged index will not be corrupted even if the source index is being written to at the same time.

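2, When Solr merges indexes, the work is actually done by Lucene underneath, and Lucene knows nothing about the uniqueKey configured in your schema.xml. So if you merge two indexes that contain the same documents (as identified by uniqueKey), you end up with twice as many documents. Solr really ought to handle this; it is supposed to be more than a thin wrapper around Lucene, after all...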
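
To make the duplicate problem concrete, here is a small JDK-only sketch. It does not call Solr or Lucene: documents are modeled as plain maps, the field name "id" stands in for the uniqueKey, and the class and method names are hypothetical. A raw merge is modeled as simple concatenation, while a key-aware merge keeps only the last document seen for each key, similar to how a normal Solr add overwrites a document with the same uniqueKey.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MergeDuplicatesSketch {

    /** Creates a toy document with an "id" (the assumed uniqueKey) and a title. */
    public static Map<String, String> doc(String id, String title) {
        Map<String, String> d = new LinkedHashMap<String, String>();
        d.put("id", id);
        d.put("title", title);
        return d;
    }

    /** Models a raw index merge: every document from both sides is kept. */
    public static List<Map<String, String>> rawMerge(
            List<Map<String, String>> a, List<Map<String, String>> b) {
        List<Map<String, String>> merged = new ArrayList<Map<String, String>>(a);
        merged.addAll(b);
        return merged;
    }

    /** Models a key-aware merge: a later document replaces an earlier one with the same key. */
    public static List<Map<String, String>> mergeByKey(
            String keyField, List<Map<String, String>> a, List<Map<String, String>> b) {
        Map<String, Map<String, String>> byKey =
                new LinkedHashMap<String, Map<String, String>>();
        for (Map<String, String> d : a) {
            byKey.put(d.get(keyField), d);
        }
        for (Map<String, String> d : b) {
            byKey.put(d.get(keyField), d);
        }
        return new ArrayList<Map<String, String>>(byKey.values());
    }

    public static void main(String[] args) {
        // Both "cores" hold the same two documents.
        List<Map<String, String>> core1 = Arrays.asList(doc("1", "a"), doc("2", "b"));
        List<Map<String, String>> core2 = Arrays.asList(doc("1", "a"), doc("2", "b"));

        System.out.println(rawMerge(core1, core2).size());         // prints 4
        System.out.println(mergeByKey("id", core1, core2).size()); // prints 2
    }
}
```

In practice this means that source indexes should not overlap on the uniqueKey before a merge, or duplicates have to be cleaned up afterwards.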

posted on 2011-09-28 14:54 Phinecos(洞庭散人)
