mmseg4j-solr

 

code:

https://github.com/chenlb/mmseg4j-solr

https://code.google.com/p/mmseg4j/

 

mmseg4j for Lucene or Solr. Example field type definitions for Solr's schema.xml:

<fieldtype name="textComplex" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="complex" dicPath="dic"/>
    </analyzer>
</fieldtype>
<fieldtype name="textMaxWord" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="max-word" />
    </analyzer>
</fieldtype>
<fieldtype name="textSimple" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="simple" dicPath="n:/custom/path/to/my_dic" />
    </analyzer>
</fieldtype>

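Tokenizer parameters:

  • dicPath - sets a custom extension dictionary; relative paths are supported (resolved against solr_home).
  • mode - the segmentation mode (complex, max-word, or simple, as used in the field types above).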
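
To show how these field types might be put to use, here is a minimal sketch of field declarations that reference them from the same schema.xml. The field names `title` and `content` are hypothetical and appear nowhere in the original post; only the `type` values come from the definitions above. The `indexed` and `stored` settings are illustrative choices, not recommendations.

```xml
<!-- Hypothetical fields: the names "title" and "content" are assumptions. -->
<!-- The type values refer to the textMaxWord and textComplex field types defined above. -->
<field name="title" type="textMaxWord" indexed="true" stored="true"/>
<field name="content" type="textComplex" indexed="true" stored="true"/>
```

Each field is then tokenized by MMSegTokenizerFactory using the mode configured on its field type.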

 

links:

http://blog.chenlb.com/category/mmseg4j

https://groups.google.com/forum/#!forum/mmseg4j

http://technology.chtsai.org/mmseg/

http://www.coreseek.cn/opensource/mmseg/

http://lifegoo.pluskid.org/?p=261

posted @ 2014-10-31 18:53  xiaotou745