Testing tokenization in Elasticsearch

1. Open the Kibana Dev Tools console:

GET /scddb/_analyze
{
  "text": "蓝瘦香菇",
  "analyzer": "ik_max_word"   // or "ik_smart" for coarser-grained tokens
}
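
If Kibana is not handy, the same request can be issued with curl; this assumes Elasticsearch is listening on the default http://localhost:9200:

curl -XGET 'http://localhost:9200/scddb/_analyze' \
  -H 'Content-Type: application/json' \
  -d '{"text": "蓝瘦香菇", "analyzer": "ik_max_word"}'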

The result is as follows and is not ideal: the phrase is split character by character ("蓝", "瘦") instead of being kept as one word:

{
  "tokens" : [
    {
      "token" : "蓝",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "CN_CHAR",
      "position" : 0
    },
    {
      "token" : "瘦",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "香菇",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 2
    }
  ]
}

Add a custom dictionary:

See this guide for adding a custom IK dictionary: https://blog.csdn.net/makang456/article/details/79211255
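
In short, the process is: create a dictionary file under the IK plugin's config directory, then register it in IKAnalyzer.cfg.xml. A minimal sketch follows; the plugin path and the file name custom.dic are assumptions based on a typical IK installation, so adjust them to yours:

# Assumed plugin location; adjust to your install.
cd /usr/share/elasticsearch/plugins/ik/config

# One word per line, UTF-8 encoded.
echo "蓝瘦香菇" > custom.dic

Then reference the file in IKAnalyzer.cfg.xml in the same directory:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- custom.dic is the file created above -->
    <entry key="ext_dict">custom.dic</entry>
</properties>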

Restart Elasticsearch: service elasticsearch restart
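
Once the service is back up, you can confirm the node responds before re-running the analyze request (again assuming the default localhost:9200):

curl -XGET 'http://localhost:9200/'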

Test again; this time the whole phrase comes back as a single CN_WORD token:

{
  "tokens" : [
    {
      "token" : "蓝瘦香菇",
      "start_offset" : 0,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 0
    }
  ]
}
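
To have an index actually use these analyzers, they are typically declared in the field mapping. A minimal sketch, in which the index name scddb_demo and the field name content are hypothetical:

PUT /scddb_demo
{
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart"
      }
    }
  }
}

A common convention with IK is ik_max_word for indexing (finer-grained tokens, better recall) and ik_smart for search queries.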

 
