Elasticsearch Structured Search

Filters

Filters are used for exact-value lookups.

  • term: match a single value
  • terms: match any of several values
  • bool: compound filter (must/must_not/should)
  • range: range query (gt/lt/gte/lte)
  • exists: check whether a field has a non-null value

Wrap the filter in constant_score to run the query in non-scoring mode.
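
As a minimal sketch of what non-scoring mode looks like in the request body, the helper below wraps any filter clause in constant_score. The function name constant_score_query is hypothetical and only used for illustration; the constant_score and filter keys are the ones used in the requests in this post.

```python
import json

def constant_score_query(filter_clause):
    """Wrap a filter clause so it runs without relevance scoring (hypothetical helper)."""
    return {"query": {"constant_score": {"filter": filter_clause}}}

# Build the request body for "price equals 20".
body = constant_score_query({"term": {"price": 20}})
print(json.dumps(body, indent=4))
```

The printed JSON can be passed to curl with -d, in the same way as the hand-written bodies in this post.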

Sample data

curl -X DELETE -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products"
curl -X POST -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_bulk" -d'
{ "index": { "_id": 1 }}
{ "price" : 10, "productID" : "XHDK-A-1293-#fJ3" }
{ "index": { "_id": 2 }}
{ "price" : 20, "productID" : "KDKE-B-9947-#kL5" }
{ "index": { "_id": 3 }}
{ "price" : 30, "productID" : "JODL-X-1937-#pV7" }
{ "index": { "_id": 4 }}
{ "price" : 30, "productID" : "QQPX-R-3956-#aD8" }
'
curl -X GET -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products?pretty"

Exact single-value lookup with term

term query on a number

Find products priced at 20

curl -X GET -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_search?pretty" -d'
{
    "query": {
        "constant_score": {
            "filter": {
                "term": {
                    "price": 20
                }
            }
        }
    }
}
'

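term query on text

ES analyzes (tokenizes) text fields, so an exact match against the original string does not work directly; the field has to be mapped as an exact-value type.

The following query returns no hits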

curl -k -X GET -H 'Content-Type: application/json' -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_search?pretty" -d'
{
    "query": {
        "constant_score": {
            "filter": {
                "term": {
                    "productID": "XHDK-A-1293-#fJ3"
                }
            }
        }
    }
}
'

Check how ES analyzes (tokenizes) the field

curl -k -X GET -H 'Content-Type: application/json' -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_analyze?pretty" -d'
{
    "field": "productID",
    "text": "XHDK-A-1293-#fJ3"
}
'
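
To see why the term lookup above finds nothing, here is a rough local approximation of the analysis step. It assumes the field uses the default standard analyzer and imitates it with a simple "split on non-alphanumerics, then lowercase" rule; the real analyzer follows Unicode text segmentation, so rough_standard_analyze is only an illustrative stand-in, not the actual implementation.

```python
import re

def rough_standard_analyze(text):
    """Approximate the standard analyzer: split on non-alphanumerics, then lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]

tokens = rough_standard_analyze("XHDK-A-1293-#fJ3")
print(tokens)  # ['xhdk', 'a', '1293', 'fj3']

# A term query looks up the search value verbatim among the indexed terms,
# and the original mixed-case string is not one of them.
print("XHDK-A-1293-#fJ3" in tokens)  # False
```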

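Re-index the data. The index has to be deleted first, because ES cannot change the mapping of a field that already exists. After creating the index with the new mapping below, re-run the _bulk request from the sample data section.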

curl -X DELETE -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products"

Starting with ES 5.0, the keyword type is used for strings that should not be analyzed

curl -k -X PUT -H 'Content-Type: application/json' -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products" -d'
{
    "mappings": {
        "properties": {
            "productID": {
                "type": "keyword"
            }
        }
    }
}
'
curl -X GET -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_mappings?pretty"

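Combining filters

bool compound filter

  • must: all clauses must match, equivalent to AND
  • must_not: no clause may match, equivalent to NOT
  • should: at least one clause must match, equivalent to OR

{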
    "bool": {
        "must": [],
        "must_not": [],
        "should": []
    }
}

ES 5.0 removed the filtered syntax, so the filter clauses go directly inside a bool query

curl -k -X GET -H 'Content-Type: application/json' -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_search?pretty" -d'
{
    "query": {
        "bool": {
            "should": [
                {"term": {"price": 20}},
                {"term": {"productID": "XHDK-A-1293-#fJ3"}}
            ],
            "must_not": [
                {"term": {"price": 30}}
            ]
        }
    }
}
'

Nested bool filter

curl -k -X GET -H 'Content-Type: application/json' -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_search?pretty" -d'
{
    "query": {
        "bool": {
            "should": [
                {"term": {"productID": "KDKE-B-9947-#kL5"}},
                {"bool": {
                    "must": [
                        {"term": {"productID": "JODL-X-1937-#pV7"}},
                        {"term": {"price": 30}}
                    ]
                }}
            ]
        }
    }
}
'
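
The bool semantics can be checked locally with a small in-memory model. This is a simplified sketch, not how ES evaluates queries: it assumes productID is mapped as keyword (so term compares whole strings), supports only term and bool clauses, and assumes that should requires at least one match when the same bool has no must clause. The names PRODUCTS, matches and search are hypothetical.

```python
PRODUCTS = {
    1: {"price": 10, "productID": "XHDK-A-1293-#fJ3"},
    2: {"price": 20, "productID": "KDKE-B-9947-#kL5"},
    3: {"price": 30, "productID": "JODL-X-1937-#pV7"},
    4: {"price": 30, "productID": "QQPX-R-3956-#aD8"},
}

def matches(doc, clause):
    """Return True if doc satisfies a term or bool clause (simplified model)."""
    if "term" in clause:
        (field, value), = clause["term"].items()
        return doc.get(field) == value
    if "bool" in clause:
        b = clause["bool"]
        must = b.get("must", [])
        should = b.get("should", [])
        if not all(matches(doc, c) for c in must):
            return False
        if any(matches(doc, c) for c in b.get("must_not", [])):
            return False
        # Assumption: without must clauses, at least one should clause has to match.
        if should and not must:
            return any(matches(doc, c) for c in should)
        return True
    raise ValueError("unsupported clause")

def search(clause):
    """Return the sorted ids of all sample documents matching the clause."""
    return sorted(doc_id for doc_id, doc in PRODUCTS.items() if matches(doc, clause))

flat = {"bool": {
    "should": [{"term": {"price": 20}},
               {"term": {"productID": "XHDK-A-1293-#fJ3"}}],
    "must_not": [{"term": {"price": 30}}],
}}
nested = {"bool": {"should": [
    {"term": {"productID": "KDKE-B-9947-#kL5"}},
    {"bool": {"must": [{"term": {"productID": "JODL-X-1937-#pV7"}},
                       {"term": {"price": 30}}]}},
]}}

print(search(flat))    # [1, 2]
print(search(nested))  # [2, 3]
```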

Exact multi-value lookup with terms

The two query shapes used so far:

query -> constant_score -> filter -> term/terms

query -> bool -> should/must/must_not -> term

curl -k -X GET -H 'Content-Type: application/json' -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_search?pretty" -d'
{
    "query": {
        "constant_score": {
            "filter": {
                "terms": {
                    "price": [20, 30]
                }
            }
        }
    }
}
'

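For tag-like data (array fields), terms returns every document whose array contains one of the given terms, not only documents whose array equals them. To return only exact matches, add an extra field that stores the number of terms in the array and filter on that count as well.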
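
A minimal local sketch of the difference between "contains" and "equals", assuming each document carries an extra tag_count field maintained by the application at index time. The field name tag_count and the helper term_matches are hypothetical.

```python
DOCS = {
    1: {"tags": ["search"], "tag_count": 1},
    2: {"tags": ["search", "open_source"], "tag_count": 2},
}

def term_matches(doc, field, value):
    """A term filter on an array field matches if any element equals the value."""
    stored = doc.get(field)
    values = stored if isinstance(stored, list) else [stored]
    return value in values

# "Contains": every document whose tags include "search".
contains = [i for i, d in DOCS.items() if term_matches(d, "tags", "search")]

# "Equals": additionally require that the array holds exactly one term.
exact = [i for i, d in DOCS.items()
         if term_matches(d, "tags", "search") and term_matches(d, "tag_count", 1)]

print(contains)  # [1, 2]
print(exact)     # [1]
```

In ES terms, the second case would correspond to a bool filter with two must clauses: one term on tags and one term on tag_count.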

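Range queries

The range filter accepts the following operators

  • gt: greater than
  • lt: less than
  • gte: greater than or equal to
  • lte: less than or equal to

curl -k -X GET -H 'Content-Type: application/json' -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_search?pretty" -d'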
{
    "query": {
        "constant_score": {
            "filter": {
                "range": {
                    "price": {
                        "gte": 20,
                        "lte": 40
                    }
                }
            }
        }
    }
}
'
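
The four operators map directly onto ordinary comparisons. As a small sketch (the names RANGE_OPS, in_range and PRICES are hypothetical, and the prices are the ones from the sample data), the query above selects the following documents:

```python
import operator

# Each range operator corresponds to one comparison function.
RANGE_OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}

def in_range(value, bounds):
    """Return True if value satisfies every bound, e.g. {"gte": 20, "lte": 40}."""
    return all(RANGE_OPS[name](value, limit) for name, limit in bounds.items())

PRICES = {1: 10, 2: 20, 3: 30, 4: 30}
hits = [doc_id for doc_id, price in PRICES.items()
        if in_range(price, {"gte": 20, "lte": 40})]
print(hits)  # [2, 3, 4]
```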

range supports date math

Documents from the last hour (now minus 1 hour)

"range" : {
    "timestamp" : {
        "gt" : "now-1h"
    }
}

Documents within one month after a given date (date plus 1 month)

"range" : {
    "timestamp" : {
        "gt" : "2014-01-01 00:00:00",
        "lt" : "2014-01-01 00:00:00||+1M" 
    }
}
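
To make the two expressions concrete, here is a sketch that resolves a small subset of date math locally. It assumes only the pieces used above: an anchor that is either now or a "YYYY-MM-DD HH:MM:SS" string followed by ||, and steps of the form +N or -N with the units h, d and M. The function names and the fixed value used for now are hypothetical; ES supports more units and rounding that are not modelled here.

```python
import calendar
import re
from datetime import datetime, timedelta

def add_months(moment, months):
    """Shift by whole months, clamping the day to the target month's length."""
    index = moment.year * 12 + (moment.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    day = min(moment.day, calendar.monthrange(year, month)[1])
    return moment.replace(year=year, month=month, day=day)

def resolve_date_math(expression, now):
    """Resolve an anchor followed by +/-N{h,d,M} steps (simplified subset)."""
    if expression.startswith("now"):
        moment, steps = now, expression[len("now"):]
    else:
        anchor, _, steps = expression.partition("||")
        moment = datetime.strptime(anchor, "%Y-%m-%d %H:%M:%S")
    for sign, amount, unit in re.findall(r"([+-])(\d+)([hdM])", steps):
        n = int(amount) if sign == "+" else -int(amount)
        if unit == "h":
            moment += timedelta(hours=n)
        elif unit == "d":
            moment += timedelta(days=n)
        else:
            moment = add_months(moment, n)
    return moment

now = datetime(2024, 1, 1, 12, 0, 0)  # arbitrary fixed "now" for the example
print(resolve_date_math("now-1h", now))                    # 2024-01-01 11:00:00
print(resolve_date_math("2014-01-01 00:00:00||+1M", now))  # 2014-02-01 00:00:00
```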

range supports lexicographic order

"range" : {
    "title" : {
        "gte" : "a",
        "lt" :  "b"
    }
}
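
Python compares strings by code point, which is close enough to illustrate how a lexicographic range on terms behaves. The term list below is made up for the example, and the comparison is only an analogy for how ES orders indexed terms.

```python
# Hypothetical indexed terms; the range is "gte": "a", "lt": "b".
terms = ["apple", "a", "b", "banana", "Apple", "ab"]

in_range = sorted(t for t in terms if "a" <= t < "b")
print(in_range)  # ['a', 'ab', 'apple']

# "Apple" is excluded because uppercase letters sort before lowercase ones,
# and "b" itself is excluded because the upper bound is exclusive (lt).
```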

Handling null values

Sample data

curl -X POST -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_bulk" -d'
{ "index": { "_id": "1"              }}
{ "tags" : ["search"]                }  
{ "index": { "_id": "2"              }}
{ "tags" : ["search", "open_source"] }  
{ "index": { "_id": "3"              }}
{ "other_field" : "some data"        }  
{ "index": { "_id": "4"              }}
{ "tags" : null                      }  
{ "index": { "_id": "5"              }}
{ "tags" : ["search", null]          } 
'

Documents that have a tags field (documents where the field is present but its value is null are not returned)

curl -X GET -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_search?pretty" -d'
{
    "query": {
        "constant_score": {
            "filter": {
                "exists": {"field": "tags"}
            }
        }
    }
}
'

The missing query is no longer available as of ES 5.0; wrap exists in must_not instead

curl -X GET -H 'Content-Type: application/json' -k -u elastic:KJNQ2rMeC481nFwSsqyf "https://localhost:9200/products/_search?pretty" -d'
{
    "query": {
        "bool": {
            "must_not": [
                {"exists": {"field": "tags"}}
            ]
        }
    }
}
'
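
The two queries above can be checked against the sample data with a local model of exists. The sketch assumes that a field counts as existing when it holds at least one non-null value, which matches the note above that null values are not returned; the helper name field_exists is hypothetical.

```python
DOCS = {
    1: {"tags": ["search"]},
    2: {"tags": ["search", "open_source"]},
    3: {"other_field": "some data"},
    4: {"tags": None},
    5: {"tags": ["search", None]},
}

def field_exists(doc, field):
    """Return True if the field holds at least one non-null value."""
    value = doc.get(field)
    values = value if isinstance(value, list) else [value]
    return any(v is not None for v in values)

with_tags = [i for i, d in DOCS.items() if field_exists(d, "tags")]
without_tags = [i for i, d in DOCS.items() if not field_exists(d, "tags")]

print(with_tags)     # [1, 2, 5]
print(without_tags)  # [3, 4]
```

Note that documents 3 (field absent) and 4 (field explicitly null) end up in the same group, which is why a separate technique is needed to tell them apart.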

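To find documents in which a field was explicitly set to null, have ES index a placeholder value in place of null by setting the null_value mapping parameter; for a string field the placeholder can be any string that does not occur in the real data.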
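
As a sketch of how this could look, the snippet below builds a mapping body and a matching query body. The placeholder string "NULL" is an arbitrary choice made for this example, and the mapping assumes tags is a keyword field; null_value itself is the mapping parameter named above.

```python
import json

# Mapping body: explicit nulls in "tags" are indexed as the placeholder "NULL".
mapping = {
    "mappings": {
        "properties": {
            "tags": {"type": "keyword", "null_value": "NULL"}
        }
    }
}

# Query body: find documents whose tags were explicitly set to null.
query = {
    "query": {
        "constant_score": {
            "filter": {"term": {"tags": "NULL"}}
        }
    }
}

print(json.dumps(mapping, indent=4))
print(json.dumps(query, indent=4))
```

As with the keyword mapping earlier, the mapping has to be in place before the documents are indexed.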

posted @ 2024-10-09 19:39  廖子博