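Elasticsearch Structured Search
Filters
Use filters for exact-value lookups.
- term: matches a single value
- terms: matches any of several values
- bool: compound filter (must/must_not/should)
- range: range query (gt/lt/gte/lte)
- exists: null check (whether a field has at least one non-null value)
Wrap a filter in constant_score to run it in non-scoring mode.
Sample data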
curl -X DELETE -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products"
curl -X POST -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_bulk" -d'
{ "index": { "_id": 1 }}
{ "price" : 10, "productID" : "XHDK-A-1293-#fJ3" }
{ "index": { "_id": 2 }}
{ "price" : 20, "productID" : "KDKE-B-9947-#kL5" }
{ "index": { "_id": 3 }}
{ "price" : 30, "productID" : "JODL-X-1937-#pV7" }
{ "index": { "_id": 4 }}
{ "price" : 30, "productID" : "QQPX-R-3956-#aD8" }
'
curl -X GET -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products?pretty"
term: exact single-value query
term query on a number
Find products priced at 20
curl -X GET -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_search?pretty" -d'
{
"query": {
"constant_score": {
"filter": {
"term": {
"price": 20
}
}
}
}
}
'
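Every non-scoring request in these notes has the same shape: a filter wrapped in constant_score. The following is a minimal Python sketch of a helper that builds such a body. The function name `constant_score_body` is hypothetical and not part of any Elasticsearch client; the sketch only produces the JSON and does not send it.

```python
import json

def constant_score_body(filter_clause):
    """Wrap a filter clause so that it runs in non-scoring mode."""
    return {"query": {"constant_score": {"filter": filter_clause}}}

# Same body as the curl request for products priced at 20.
body = constant_score_body({"term": {"price": 20}})
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with -d, as in the request above.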
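term query on text
ES analyzes (tokenizes) text fields, so an exact-match term query against them does not work directly; the field has to be mapped as an exact-value type.
The following query returns no hits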
curl -k -X GET -H 'Content-Type: application/json' -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_search?pretty" -d'
{
"query": {
"constant_score": {
"filter": {
"term": {
"productID": "XHDK-A-1293-#fJ3"
}
}
}
}
}
'
Check how ES analyzes (tokenizes) the field
curl -k -X GET -H 'Content-Type: application/json' -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_analyze?pretty" -d'
{
"field": "productID",
"text": "XHDK-A-1293-#fJ3"
}
'
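To see why the term query finds nothing, it can help to imitate the analysis step locally. The sketch below is only a rough approximation of the default analyzer (split on non-alphanumeric characters, then lowercase); the real analyzer follows Unicode text segmentation rules, so the _analyze request above remains the authoritative check. The function name `rough_standard_analyze` is made up for this illustration.

```python
import re

def rough_standard_analyze(text):
    """Rough stand-in for the default analyzer: split on anything that is
    not an ASCII letter or digit, then lowercase each token."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

tokens = rough_standard_analyze("XHDK-A-1293-#fJ3")
print(tokens)  # ['xhdk', 'a', '1293', 'fj3']

# A term query compares its input, unanalyzed, against the indexed tokens,
# so the full product ID never equals any single token.
print("XHDK-A-1293-#fJ3" in tokens)  # False
```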
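Re-index the data: delete the index (required, because ES cannot change the type of a field that is already mapped), create it with the new mapping, then load the sample data again with the bulk request.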
curl -X DELETE -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products"
Since ES 5.0, the keyword type is used for strings that should not be analyzed
curl -k -X PUT -H 'Content-Type: application/json' -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products" -d'
{
"mappings": {
"properties": {
"productID": {
"type": "keyword"
}
}
}
}
'
curl -X GET -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_mappings?pretty"
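Combining filters
bool compound filter
- must: every clause must match, equivalent to AND
- must_not: no clause may match, equivalent to NOT
- should: at least one clause must match, equivalent to OR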
{
"bool": {
"must": [],
"must_not": [],
"should": []
}
}
The filtered query was deprecated in ES 2.0 and removed in 5.0; combine filters with a bool query instead
curl -k -X GET -H 'Content-Type: application/json' -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_search?pretty" -d'
{
"query": {
"bool": {
"should": [
{"term": {"price": 20}},
{"term": {"productID": "XHDK-A-1293-#fJ3"}}
],
"must_not": [
{"term": {"price": 30}}
]
}
}
}
'
Nested bool filters
curl -k -X GET -H 'Content-Type: application/json' -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_search?pretty" -d'
{
"query": {
"bool": {
"should": [
{"term": {"productID": "KDKE-B-9947-#kL5"}},
{"bool": {
"must": [
{"term": {"productID": "JODL-X-1937-#pV7"}},
{"term": {"price": 30}}
]
}}
]
}
}
}
'
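The two bool queries above can be checked against the sample products without a cluster. The following is a minimal sketch that evaluates term and bool clauses in plain Python, assuming productID is mapped as keyword so that term compares whole values. The helper names `matches` and `search` are hypothetical, and only the clause types used in these notes are supported.

```python
PRODUCTS = {
    1: {"price": 10, "productID": "XHDK-A-1293-#fJ3"},
    2: {"price": 20, "productID": "KDKE-B-9947-#kL5"},
    3: {"price": 30, "productID": "JODL-X-1937-#pV7"},
    4: {"price": 30, "productID": "QQPX-R-3956-#aD8"},
}

def matches(clause, doc):
    """Evaluate a term or bool clause against one document."""
    if "term" in clause:
        field, value = next(iter(clause["term"].items()))
        return doc.get(field) == value
    if "bool" in clause:
        b = clause["bool"]
        must = b.get("must", [])
        should = b.get("should", [])
        must_not = b.get("must_not", [])
        if not all(matches(c, doc) for c in must):
            return False
        if any(matches(c, doc) for c in must_not):
            return False
        # Without a must clause, at least one should clause has to match.
        if should and not must:
            return any(matches(c, doc) for c in should)
        return True
    raise ValueError(f"unsupported clause: {clause}")

def search(clause):
    """Return the IDs of all sample products matching the clause."""
    return [doc_id for doc_id, doc in PRODUCTS.items() if matches(clause, doc)]

flat = {"bool": {
    "should": [{"term": {"price": 20}},
               {"term": {"productID": "XHDK-A-1293-#fJ3"}}],
    "must_not": [{"term": {"price": 30}}],
}}
nested = {"bool": {"should": [
    {"term": {"productID": "KDKE-B-9947-#kL5"}},
    {"bool": {"must": [{"term": {"productID": "JODL-X-1937-#pV7"}},
                       {"term": {"price": 30}}]}},
]}}

print(search(flat))    # [1, 2]
print(search(nested))  # [2, 3]
```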
terms: exact multi-value query
query -> constant_score -> filter -> term/terms
query -> bool -> should/must/must_not -> term
curl -k -X GET -H 'Content-Type: application/json' -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_search?pretty" -d'
{
"query": {
"constant_score": {
"filter": {
"terms": {
"price": [20, 30]
}
}
}
}
}
'
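For tag-like data (array fields), terms returns every document that contains any of the given terms; to return only documents whose array equals the given set exactly, add a field that records the number of terms in the array and filter on it as well.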
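The difference between "contains" and "equals" can be sketched locally. The field name `tag_count` and both helper functions below are assumptions made for this illustration; in ES the exact variant would correspond to a bool filter whose must clauses combine a term on the tag with a term on the count field.

```python
DOCS = {
    1: {"tags": ["search"], "tag_count": 1},
    2: {"tags": ["search", "open_source"], "tag_count": 2},
}

def terms_match(doc, field, values):
    """terms semantics: the field contains at least one of the values."""
    return any(v in doc.get(field, []) for v in values)

def exact_match(doc, values):
    """'Equals' emulation: every value is present and the count matches."""
    return all(v in doc["tags"] for v in values) and doc["tag_count"] == len(values)

contains = [i for i, d in DOCS.items() if terms_match(d, "tags", ["search"])]
equals = [i for i, d in DOCS.items() if exact_match(d, ["search"])]
print(contains)  # [1, 2]
print(equals)    # [1]
```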
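Range queries
The range filter accepts the following operators
- gt: greater than
- lt: less than
- gte: greater than or equal to
- lte: less than or equal to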
curl -k -X GET -H 'Content-Type: application/json' -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_search?pretty" -d'
{
"query": {
"constant_score": {
"filter": {
"range": {
"price": {
"gte": 20,
"lte": 40
}
}
}
}
}
}
'
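As a quick check of which sample products the range above selects, the bounds can be applied locally. This is a minimal sketch; `in_range` is a hypothetical helper whose keyword arguments mirror the four range operators.

```python
PRICES = {1: 10, 2: 20, 3: 30, 4: 30}  # document ID -> price

def in_range(value, gt=None, gte=None, lt=None, lte=None):
    """Return True when value satisfies every bound that was given."""
    if gt is not None and not value > gt:
        return False
    if gte is not None and not value >= gte:
        return False
    if lt is not None and not value < lt:
        return False
    if lte is not None and not value <= lte:
        return False
    return True

hits = [doc_id for doc_id, price in PRICES.items() if in_range(price, gte=20, lte=40)]
print(hits)  # [2, 3, 4]
```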
range supports date math
Subtract 1 hour from the current time
"range" : {
"timestamp" : {
"gt" : "now-1h"
}
}
Add 1 month to a fixed date
"range" : {
"timestamp" : {
"gt" : "2014-01-01 00:00:00",
"lt" : "2014-01-01 00:00:00||+1M"
}
}
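The two date-math expressions above can be resolved by hand to see which instants they denote. The sketch below handles only the forms used here ("now" or an anchor date followed by "||", then +/- offsets in h, d, or M); the names `resolve` and `add_months` are hypothetical, the current time is passed in explicitly so the result is reproducible, and clamping the day when shifting months is an assumption of this sketch. Whether ES accepts an anchor written with a space depends on the date format configured for the field.

```python
import calendar
import re
from datetime import datetime, timedelta

def add_months(dt, months):
    """Shift dt by whole months, clamping the day to the target month's length."""
    index = dt.year * 12 + (dt.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    day = min(dt.day, calendar.monthrange(year, month)[1])
    return dt.replace(year=year, month=month, day=day)

def resolve(expr, now):
    """Resolve 'now-1h' or '<date>||+1M' style expressions (units h, d, M only)."""
    if expr.startswith("now"):
        base, math = now, expr[3:]
    else:
        anchor, _, math = expr.partition("||")
        base = datetime.strptime(anchor, "%Y-%m-%d %H:%M:%S")
    for sign, amount, unit in re.findall(r"([+-])(\d+)([hdM])", math):
        n = int(amount) if sign == "+" else -int(amount)
        if unit == "h":
            base += timedelta(hours=n)
        elif unit == "d":
            base += timedelta(days=n)
        else:
            base = add_months(base, n)
    return base

now = datetime(2024, 5, 1, 12, 0, 0)  # fixed "current time" for the example
print(resolve("now-1h", now))                      # 2024-05-01 11:00:00
print(resolve("2014-01-01 00:00:00||+1M", now))    # 2014-02-01 00:00:00
```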
range supports lexicographic (dictionary) order on strings
"range" : {
"title" : {
"gte" : "a",
"lt" : "b"
}
}
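Lexicographic ranges compare terms character by character rather than numerically. A small sketch with made-up terms shows the ordering and what the range from "a" (inclusive) to "b" (exclusive) selects; Python compares strings by code point, which agrees with byte order for the ASCII terms used here.

```python
TERMS = ["5", "50", "6", "B", "a", "ab", "apple", "b", "banana"]

def in_term_range(term, gte, lt):
    """True when gte <= term < lt in lexicographic order."""
    return gte <= term < lt

# Digits sort before uppercase letters, which sort before lowercase letters.
print(sorted(TERMS))
# ['5', '50', '6', 'B', 'a', 'ab', 'apple', 'b', 'banana']

selected = [t for t in TERMS if in_term_range(t, "a", "b")]
print(selected)  # ['a', 'ab', 'apple']
```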
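Handling null values
Sample data (these documents reuse IDs 1 to 4, so they overwrite the product documents indexed earlier)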
curl -X POST -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_bulk" -d'
{ "index": { "_id": "1" }}
{ "tags" : ["search"] }
{ "index": { "_id": "2" }}
{ "tags" : ["search", "open_source"] }
{ "index": { "_id": "3" }}
{ "other_field" : "some data" }
{ "index": { "_id": "4" }}
{ "tags" : null }
{ "index": { "_id": "5" }}
{ "tags" : ["search", null] }
'
Find documents that have a tags field (documents where the field is present but its value is null are not returned)
curl -X GET -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_search?pretty" -d'
{
"query": {
"constant_score": {
"filter": {
"exists": {"field": "tags"}
}
}
}
}
'
The missing query was removed in ES 5.0; use exists inside must_not instead
curl -X GET -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_search?pretty" -d'
{
"query": {
"bool": {
"must_not": [
{"exists": {"field": "tags"}}
]
}
}
}
'
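The behaviour of exists and its negation on the five sample documents can be sketched as follows. This models the rule that a field "exists" only when it holds at least one non-null value; the helper name `exists` is chosen for this illustration.

```python
DOCS = {
    "1": {"tags": ["search"]},
    "2": {"tags": ["search", "open_source"]},
    "3": {"other_field": "some data"},
    "4": {"tags": None},
    "5": {"tags": ["search", None]},
}

def exists(doc, field):
    """True when the field holds at least one non-null value."""
    value = doc.get(field)
    values = value if isinstance(value, list) else [value]
    return any(v is not None for v in values)

has_tags = [i for i, d in DOCS.items() if exists(d, "tags")]
no_tags = [i for i, d in DOCS.items() if not exists(d, "tags")]
print(has_tags)  # ['1', '2', '5']
print(no_tags)   # ['3', '4']
```

Document 5 counts as having tags because one of its array elements is non-null, while documents 3 (field absent) and 4 (explicit null) are indistinguishable to exists.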
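When you need to tell whether a field was explicitly set to null, map null to a placeholder value with the null_value mapping parameter; for a string field, for example, the placeholder can be a string that never occurs in real data.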
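A sketch of the idea, under stated assumptions: the placeholder "NULL" is an arbitrary choice for this example, the mapping body shows the null_value parameter on a keyword field, and `indexed_values` is a hypothetical helper that imitates the substitution of explicit nulls at index time so the effect on the sample documents can be seen without a cluster.

```python
import json

NULL_PLACEHOLDER = "NULL"  # assumed placeholder; pick a value absent from real data

# Mapping body that would be sent when creating the index.
mapping = {"mappings": {"properties": {
    "tags": {"type": "keyword", "null_value": NULL_PLACEHOLDER}
}}}
print(json.dumps(mapping))

DOCS = {
    "1": {"tags": ["search"]},
    "2": {"tags": ["search", "open_source"]},
    "3": {"other_field": "some data"},
    "4": {"tags": None},
    "5": {"tags": ["search", None]},
}

def indexed_values(doc, field, null_value=None):
    """Values indexed for a field, with explicit nulls replaced by null_value."""
    if field not in doc:
        return []  # an absent field is not replaced
    value = doc[field]
    values = value if isinstance(value, list) else [value]
    out = []
    for v in values:
        if v is None:
            if null_value is not None:
                out.append(null_value)
        else:
            out.append(v)
    return out

# A term filter on the placeholder finds the documents with an explicit null.
explicit_null = [i for i, d in DOCS.items()
                 if NULL_PLACEHOLDER in indexed_values(d, "tags", NULL_PLACEHOLDER)]
print(explicit_null)  # ['4', '5']
```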