Solr incremental (delta) indexing

Reference: http://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command

Edit qiye-data-config.xml:

<dataConfig>
  <dataSource type="JdbcDataSource" 
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/qiye"
              user="root" 
              password="root"/>
  <document>
    <entity name="id" pk="id"
            query="select id,shopdesc from testshop"
            deltaImportQuery="select * from testshop where ID='${dih.delta.id}'"
            deltaQuery="select id from testshop where last_modified &gt; '${dih.last_index_time}'">
    </entity>
  </document>
</dataConfig>
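
For Solr to pick up this file, solrconfig.xml has to register a DataImportHandler that points at it. A minimal sketch, assuming the handler is mounted at /dataimport (the path used in the URL further down) and that the DataImportHandler jar is already on the core's classpath:

<!-- solrconfig.xml: the handler path /dataimport is an assumption taken from the delta-import URL -->
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <!-- file name of the data config edited above -->
    <str name="config">qiye-data-config.xml</str>
  </lst>
</requestHandler>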

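A last_modified column of type timestamp, with default CURRENT_TIMESTAMP, was added to the testshop table. Note that the default only stamps rows when they are inserted; for the deltaQuery to also pick up rows that were updated later, the column needs to change on update as well (in MySQL, for example, by declaring it with ON UPDATE CURRENT_TIMESTAMP).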

The query gives the data needed to populate fields of the Solr document in a full-import.
The deltaImportQuery gives the data needed to populate fields when running a delta-import.
The deltaQuery gives the primary keys of the current entity that have changed since the last index time.
The parentDeltaQuery uses the changed rows of the current table (fetched with deltaQuery) to give the changed rows in the parent table. This is necessary because whenever a row in the child table changes, we need to re-generate the document which has that field.
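
The config in this post has a single entity, so it never needs parentDeltaQuery. As a rough sketch of where it would go, assume a hypothetical child table testshop_item(shop_id, itemname, last_modified) that is not part of the original setup. The entity names shop and item are also made up for illustration:

<entity name="shop" pk="id"
        query="select id,shopdesc from testshop"
        deltaImportQuery="select id,shopdesc from testshop where id='${dih.delta.id}'"
        deltaQuery="select id from testshop where last_modified &gt; '${dih.last_index_time}'">
  <!-- hypothetical child entity: a changed item row marks its parent shop for re-indexing -->
  <entity name="item" pk="shop_id"
          query="select itemname from testshop_item where shop_id='${shop.id}'"
          deltaQuery="select shop_id from testshop_item where last_modified &gt; '${dih.last_index_time}'"
          parentDeltaQuery="select id from testshop where id='${item.shop_id}'"/>
</entity>

Here the child's deltaQuery finds changed item rows, and parentDeltaQuery maps each one back to the testshop row whose document must be rebuilt.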

Run:

http://localhost:8080/solr/dataimport?command=delta-import imports the newly added data.

http://localhost:8080/solr/#/collection1/query can then be used to query the data that was just added.

For scheduled indexing, see the documentation: http://wiki.apache.org/solr/DataImportHandler#Scheduling

Someone has also wrapped this up as a separate project: https://code.google.com/p/solr-dataimport-scheduler/

posted on 2013-04-20 17:59 by 游鱼