自定义中文全文索引

一、中文分词插件

NEO4J中文全文索引,分词组件使用IKAnalyzer。为了支持高版本LUCENE,IKAnalyzer需要做一些调整。

IKAnalyzer-3.0 旧版本实现参考

ELASTICSEARCH-IKAnlyzer 高版本实现参考

1、分词组件的调整

调整之后的分词组件 casia.isiteam.zdr.wltea

// 调整之后的实现
public final class IKAnalyzer extends Analyzer {
<span class="token comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment">// 默认细粒度切分 true-智能切分 false-细粒度切分</span></span></span></span></span>
<span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">private</span></span></span></span></span> Configuration configuration <span class="token operator">=</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">Configuration</span><span class="token punctuation">(</span><span class="token boolean"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">false</span></span></span></span></span><span class="token punctuation">)</span><span class="token punctuation">;</span>

<span class="token comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment">/**
 * IK分词器Lucene  Analyzer接口实现类
 * &lt;p&gt;
 * 默认细粒度切分算法
 */</span></span></span></span></span>
<span class="token keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword">public</span></span></span></span></span></span></span></span></span><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"> </span></span></span></span><span class="token function"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title">IKAnalyzer</span></span></span></span></span></span></span></span></span><span class="token punctuation"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params">(</span></span></span></span></span></span></span></span></span><span class="token punctuation"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params">)</span></span></span></span></span></span></span></span></span><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"> </span></span></span></span><span class="token punctuation">{</span>
<span class="token punctuation">}</span>

<span class="token comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment">/**
 * IK分词器Lucene Analyzer接口实现类
 *
 * </span></span></span><span class="hljs-doctag"><span class="hljs-comment"><span class="hljs-doctag"><span class="hljs-comment"><span class="hljs-doctag"><span class="hljs-comment"><span class="hljs-doctag">@param</span></span></span></span></span></span></span><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment"> configuration IK配置
 */</span></span></span></span></span>
<span class="token keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword">public</span></span></span></span></span></span></span></span></span><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"> </span></span></span></span><span class="token function"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title">IKAnalyzer</span></span></span></span></span></span></span></span></span><span class="token punctuation"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params">(</span></span></span></span></span></span></span></span></span><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params">Configuration configuration</span></span></span></span></span></span></span></span><span class="token punctuation"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params">)</span></span></span></span></span></span></span></span></span><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"> </span></span></span></span><span class="token punctuation">{</span>
    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">super</span></span></span></span></span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">this</span></span></span></span></span><span class="token punctuation">.</span>configuration <span class="token operator">=</span> configuration<span class="token punctuation">;</span>
<span class="token punctuation">}</span>

<span class="token comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment">/**
 * 重载Analyzer接口,构造分词组件
 */</span></span></span></span></span>
<span class="token annotation punctuation"><span class="hljs-meta"><span class="hljs-meta"><span class="hljs-meta"><span class="hljs-meta">@Override</span></span></span></span></span>
<span class="token keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword"><span class="hljs-function"><span class="hljs-keyword">protected</span></span></span></span></span></span></span></span></span><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"> TokenStreamComponents </span></span></span></span><span class="token function"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title"><span class="hljs-function"><span class="hljs-title">createComponents</span></span></span></span></span></span></span></span></span><span class="token punctuation"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params">(</span></span></span></span></span></span></span></span></span><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params">String fieldName</span></span></span></span></span></span></span></span><span class="token punctuation"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params"><span class="hljs-function"><span class="hljs-params">)</span></span></span></span></span></span></span></span></span><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"><span class="hljs-function"> </span></span></span></span><span class="token punctuation">{</span>
    Tokenizer _IKTokenizer <span class="token operator">=</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">IKTokenizer</span><span class="token punctuation">(</span>configuration<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">return</span></span></span></span></span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">TokenStreamComponents</span><span class="token punctuation">(</span>_IKTokenizer<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>

}

2、分词测试

自定义分词函数

RETURN zdr.index.iKAnalyzer('复联终章快上映了好激动,据说知识图谱与人工智能技术应用到了那部电影!吖啶基氨基甲烷磺酰甲氧基苯胺是一种药嘛?',true) AS words

在这里插入图片描述

 /**
     * @param text:待分词文本
     * @param useSmart:true 用智能分词,false 细粒度分词
     * @return
     * @Description: TODO(支持中英文本分词)
     */
    @UserFunction(name = "zdr.index.iKAnalyzer")
    @Description("Fulltext index iKAnalyzer - RETURN zdr.index.iKAnalyzer({text},true) AS words")
    public List<String> iKAnalyzer(@Name("text") String text, @Name("useSmart") boolean useSmart) {
    PropertyConfigurator<span class="token punctuation">.</span><span class="token function">configureAndWatch</span><span class="token punctuation">(</span><span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">"dic"</span></span></span></span></span> <span class="token operator">+</span> File<span class="token punctuation">.</span>separator <span class="token operator">+</span> <span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">"log4j.properties"</span></span></span></span></span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    Configuration cfg <span class="token operator">=</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">Configuration</span><span class="token punctuation">(</span>useSmart<span class="token punctuation">)</span><span class="token punctuation">;</span>

    StringReader input <span class="token operator">=</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">StringReader</span><span class="token punctuation">(</span>text<span class="token punctuation">.</span><span class="token function">trim</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    IKSegmenter ikSegmenter <span class="token operator">=</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">IKSegmenter</span><span class="token punctuation">(</span>input<span class="token punctuation">,</span> cfg<span class="token punctuation">)</span><span class="token punctuation">;</span>

    List<span class="token generics function"><span class="token punctuation">&lt;</span>String<span class="token punctuation">&gt;</span></span> results <span class="token operator">=</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">ArrayList</span><span class="token operator">&lt;</span><span class="token operator">&gt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">try</span></span></span></span></span> <span class="token punctuation">{</span>
        <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">for</span></span></span></span></span> <span class="token punctuation">(</span>Lexeme lexeme <span class="token operator">=</span> ikSegmenter<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> lexeme <span class="token operator">!=</span> <span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">null</span></span></span></span><span class="token punctuation">;</span> lexeme <span class="token operator">=</span> ikSegmenter<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
            results<span class="token punctuation">.</span><span class="token function">add</span><span class="token punctuation">(</span>lexeme<span class="token punctuation">.</span><span class="token function">getLexemeText</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">}</span>
    <span class="token punctuation">}</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">catch</span></span></span></span></span> <span class="token punctuation">(</span><span class="token class-name">IOException</span> e<span class="token punctuation">)</span> <span class="token punctuation">{</span>
        e<span class="token punctuation">.</span><span class="token function">printStackTrace</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">}</span>
    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">return</span></span></span></span></span> results<span class="token punctuation">;</span>
<span class="token punctuation">}</span>

二、样例数据准备

# 构造样例数据
MERGE (a:Loc {name:'A'}) SET a.description='复联终章快上映了好激动,据说知识图谱与人工智能技术应用到了那部电影!吖啶基氨基甲烷磺酰甲氧基苯胺是一种药嘛?'
MERGE (b:Loc {name:'B'}) SET b.description='复联终章快上映了好激动,据说知识图谱与人工智能技术应用到了那部电影!吖啶基氨基甲烷磺酰甲氧基苯胺是一种药嘛?'
MERGE (c:Loc {name:'C'}) SET c.description='复联终章快上映了好激动,据说知识图谱与人工智能技术应用到了那部电影!吖啶基氨基甲烷磺酰甲氧基苯胺是一种药嘛?'
MERGE (d:Loc {name:'D'}) SET d.description='复联终章快上映了好激动,据说知识图谱与人工智能技术应用到了那部电影!吖啶基氨基甲烷磺酰甲氧基苯胺是一种药嘛?'
MERGE (e:Loc {name:'E'}) SET e.description='复联终章快上映了好激动,据说知识图谱与人工智能技术应用到了那部电影!吖啶基氨基甲烷磺酰甲氧基苯胺是一种药嘛?'
MERGE (f:Loc {name:'F'}) SET f.description='复联终章快上映了好激动,据说知识图谱与人工智能技术应用到了那部电影!吖啶基氨基甲烷磺酰甲氧基苯胺是一种药嘛?'
MERGE (a)-[:ROAD {cost:50}]->(b)
MERGE (a)-[:ROAD {cost:50}]->(c)
MERGE (a)-[:ROAD {cost:100}]->(d)
MERGE (b)-[:ROAD {cost:40}]->(d)
MERGE (c)-[:ROAD {cost:40}]->(d)
MERGE (c)-[:ROAD {cost:80}]->(e)
MERGE (d)-[:ROAD {cost:30}]->(e)
MERGE (d)-[:ROAD {cost:80}]->(f)
MERGE (e)-[:ROAD {cost:40}]->(f);

三、通过中文全文分词组件创建节点索引

自定义创建索引过程

CALL zdr.index.addChineseFulltextIndex('IKAnalyzer', 'Loc', ['description']) YIELD message RETURN message

在这里插入图片描述

 @Procedure(value = "zdr.index.addChineseFulltextIndex", mode = Mode.WRITE)
    @Description("CALL zdr.index.addChineseFulltextIndex(String indexName, String labelName, List<String> propKeys) YIELD message RETURN message," +
            "为一个标签下的所有节点的指定属性添加索引")
    public Stream<NodeIndexMessage> addChineseFulltextIndex(@Name("indexName") String indexName,
                                                            @Name("labelName") String labelName, @Name("properties") List<String> propKeys) {
        Label label = Label.label(labelName);
    List<span class="token generics function"><span class="token punctuation">&lt;</span>NodeIndexMessage<span class="token punctuation">&gt;</span></span> output <span class="token operator">=</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">ArrayList</span><span class="token operator">&lt;</span><span class="token operator">&gt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

// // 按照标签找到该标签下的所有节点
ResourceIterator<Node> nodes = db.findNodes(label);
System.out.println("nodes:" + nodes.toString());

    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">int</span></span></span></span></span> nodesSize <span class="token operator">=</span> <span class="token number"><span class="hljs-number"><span class="hljs-number"><span class="hljs-number"><span class="hljs-number">0</span></span></span></span></span><span class="token punctuation">;</span>
    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">int</span></span></span></span></span> propertiesSize <span class="token operator">=</span> <span class="token number"><span class="hljs-number"><span class="hljs-number"><span class="hljs-number"><span class="hljs-number">0</span></span></span></span></span><span class="token punctuation">;</span>
    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">while</span></span></span></span></span> <span class="token punctuation">(</span>nodes<span class="token punctuation">.</span><span class="token function">hasNext</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
        nodesSize<span class="token operator">++</span><span class="token punctuation">;</span>
        Node node <span class="token operator">=</span> nodes<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        System<span class="token punctuation">.</span>out<span class="token punctuation">.</span><span class="token function">println</span><span class="token punctuation">(</span><span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">"current nodes:"</span></span></span></span></span> <span class="token operator">+</span> node<span class="token punctuation">.</span><span class="token function">toString</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

        <span class="token comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment">// 每个节点上需要添加索引的属性</span></span></span></span></span>
        Set<span class="token operator">&lt;</span>Map<span class="token punctuation">.</span>Entry<span class="token generics function"><span class="token punctuation">&lt;</span>String<span class="token punctuation">,</span> Object<span class="token punctuation">&gt;</span></span><span class="token operator">&gt;</span> properties <span class="token operator">=</span> node<span class="token punctuation">.</span><span class="token function">getProperties</span><span class="token punctuation">(</span>propKeys<span class="token punctuation">.</span><span class="token function">toArray</span><span class="token punctuation">(</span><span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">String</span><span class="token punctuation">[</span><span class="token number"><span class="hljs-number"><span class="hljs-number"><span class="hljs-number"><span class="hljs-number">0</span></span></span></span></span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">entrySet</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        System<span class="token punctuation">.</span>out<span class="token punctuation">.</span><span class="token function">println</span><span class="token punctuation">(</span><span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">"current node properties"</span></span></span></span></span> <span class="token operator">+</span> properties<span class="token punctuation">)</span><span class="token punctuation">;</span>

        <span class="token comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment">// 查询该节点是否已有索引,有的话删除</span></span></span></span></span>
        Index<span class="token generics function"><span class="token punctuation">&lt;</span>Node<span class="token punctuation">&gt;</span></span> index <span class="token operator">=</span> db<span class="token punctuation">.</span><span class="token function">index</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">forNodes</span><span class="token punctuation">(</span>indexName<span class="token punctuation">,</span> FULL_INDEX_CONFIG<span class="token punctuation">)</span><span class="token punctuation">;</span>
        System<span class="token punctuation">.</span>out<span class="token punctuation">.</span><span class="token function">println</span><span class="token punctuation">(</span><span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">"current node index"</span></span></span></span></span> <span class="token operator">+</span> index<span class="token punctuation">)</span><span class="token punctuation">;</span>
        index<span class="token punctuation">.</span><span class="token function">remove</span><span class="token punctuation">(</span>node<span class="token punctuation">)</span><span class="token punctuation">;</span>

        <span class="token comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment"><span class="hljs-comment">// 为了该节点的每个需要添加索引的属性添加全文索引</span></span></span></span></span>
        <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">for</span></span></span></span></span> <span class="token punctuation">(</span>Map<span class="token punctuation">.</span>Entry<span class="token generics function"><span class="token punctuation">&lt;</span>String<span class="token punctuation">,</span> Object<span class="token punctuation">&gt;</span></span> property <span class="token operator">:</span> properties<span class="token punctuation">)</span> <span class="token punctuation">{</span>
            propertiesSize<span class="token operator">++</span><span class="token punctuation">;</span>
            index<span class="token punctuation">.</span><span class="token function">add</span><span class="token punctuation">(</span>node<span class="token punctuation">,</span> property<span class="token punctuation">.</span><span class="token function">getKey</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> property<span class="token punctuation">.</span><span class="token function">getValue</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">}</span>
    <span class="token punctuation">}</span>

    String message <span class="token operator">=</span> <span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">"IndexName:"</span></span></span></span></span> <span class="token operator">+</span> indexName <span class="token operator">+</span> <span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">",LabelName:"</span></span></span></span></span> <span class="token operator">+</span> labelName <span class="token operator">+</span> <span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">",NodesSize:"</span></span></span></span></span> <span class="token operator">+</span> nodesSize <span class="token operator">+</span> <span class="token string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string"><span class="hljs-string">",PropertiesSize:"</span></span></span></span></span> <span class="token operator">+</span> propertiesSize<span class="token punctuation">;</span>
    NodeIndexMessage indexMessage <span class="token operator">=</span> <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">new</span></span></span></span></span> <span class="token class-name">NodeIndexMessage</span><span class="token punctuation">(</span>message<span class="token punctuation">)</span><span class="token punctuation">;</span>
    output<span class="token punctuation">.</span><span class="token function">add</span><span class="token punctuation">(</span>indexMessage<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword"><span class="hljs-keyword">return</span></span></span></span></span> output<span class="token punctuation">.</span><span class="token function">stream</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>

四、中文分词索引查询

自定义查询索引过程

CALL zdr.index.chineseFulltextIndexSearch('IKAnalyzer', 'description:吖啶基氨基甲烷磺酰甲氧基苯胺', 100) YIELD node RETURN node

在这里插入图片描述

CALL zdr.index.chineseFulltextIndexSearch('IKAnalyzer', 'description:复联* AND year:1999', 100) YIELD node,weight RETURN node.name,node.year,node.description,weight

在这里插入图片描述

  @Procedure(value = "zdr.index.chineseFulltextIndexSearch", mode = Mode.WRITE)
    @Description("CALL zdr.index.chineseFulltextIndexSearch(String indexName, String query, long limit) YIELD node RETURN node," +
            "执行LUCENE全文检索,返回前{limit个结果}")
    public Stream<ChineseHit> chineseFulltextIndexSearch(@Name("indexName") String indexName,
                                                         @Name("query") String query, @Name("limit") long limit) {
        if (!db.index().existsForNodes(indexName)) {
            log.debug("如果索引不存在则跳过本次查询:`%s`", indexName);
            return Stream.empty();
        }
        return db.index()
                .forNodes(indexName, FULL_INDEX_CONFIG)
                .query(new QueryContext(query).sortByScore().top((int) limit))
                .stream()
                .map(ChineseHit::new);  // provider
    }

【跨标签类型检索】使用addChineseFulltextIndex给标签下节点属性添加的索引,默认可以使用chineseFulltextIndexSearch合并检索出来

// 增加一个非Loc标签的节点,然后使用检索
CREATE (n:LocProvince {name:'P'})  SET n.description='复联终章快上映了好激动,据说知识图谱与人工智能技术应用到了那部电影!' RETURN n
// 节点增加索引(索引名与已有相同)
CALL zdr.index.addChineseFulltextIndex('IKAnalyzer', 'LocProvince', ['description','year']) YIELD message RETURN message
// 通过属性检索节点
CALL zdr.index.chineseFulltextIndexSearch('IKAnalyzer', 'description:复联', 100) YIELD node,weight RETURN node

在这里插入图片描述

五、总结

上述NEO4J中文全文索引解决方法,索引不会自动更新,修改节点属性以及新增节点时都需要重新建立索引。
NEO4J默认索引实现参考:neo4j-lucene-index
在这里插入图片描述

原文地址:https://blog.csdn.net/superman_xxx/article/details/89204456
posted @ 2019-08-22 14:35  星朝  阅读(283)  评论(0编辑  收藏  举报