Elasticsearch Query DSL: Term-level queries

Term level queries

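Term-level queries do not analyze (tokenize) the search input; the value is compared with the indexed terms exactly as given.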

Exists: returns documents that contain an indexed value for a field, which filters out documents where the field does not exist.

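A field counts as "not exists" when:

The field in the _source is null or [].
The field has index: false set in the mapping.
The length of the field value exceeds the ignore_above setting.
The field value is malformed and ignore_malformed is defined in the mapping.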
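The first rule above (null or []) can be sketched in a few lines of Python. This is only an illustration of the _source-level rule: the helper name has_indexed_value is made up for this post, and the three mapping-driven cases (index: false, ignore_above, ignore_malformed) are not modeled.

```python
def has_indexed_value(value):
    """Return True if a _source value leaves something for `exists` to find.

    Hypothetical helper: it models only the null / empty-array rule.
    """
    if value is None:
        return False
    if isinstance(value, list):
        # An empty array, or an array holding only nulls, indexes nothing.
        return any(has_indexed_value(item) for item in value)
    # Any other scalar, including an empty string, still counts as a value.
    return True


print(has_indexed_value(None))           # null
print(has_indexed_value([]))             # empty array
print(has_indexed_value([None, "foo"]))  # array with one real value
```

Note that an empty string is still a value here; only null and arrays without any non-null element are treated as missing.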

GET /_search
{
  "query": {
    "exists": {
      "field": "user"
    }
  }
}

GET /_search
{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "user.id"
        }
      }
    }
  }
}

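Fuzzy: matches terms that are similar to the search term, measured by edit distance (the number of single-character edits). Fuzzy queries are not executed when search.allow_expensive_queries is set to false.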

Examples of a single edit:

Changing a character (box → fox)
Removing a character (black → lack)
Inserting a character (sic → sick)
Transposing two adjacent characters (act → cat)
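To make "number of edits" concrete, here is a minimal Python sketch of an edit distance that optionally counts a swap of two adjacent characters as one edit (the optimal string alignment variant). It illustrates the idea only and is not Lucene's implementation; the function name edit_distance is chosen for this post.

```python
def edit_distance(a: str, b: str, transpositions: bool = True) -> int:
    """Edit distance between a and b; adjacent swaps cost 1 if enabled."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # removal
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # change (or no-op)
            )
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent swap
    return d[-1][-1]


for pair in [("box", "fox"), ("black", "lack"), ("sic", "sick"), ("act", "cat")]:
    print(pair, edit_distance(*pair))
```

Each of the four pairs above is one edit apart. With transpositions=False, act → cat costs two edits, because the swap has to be expressed as two character changes.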

GET /_search
{
  "query": {
    "fuzzy": {
      "user.id": {
        "value": "ki",
        "fuzziness": "AUTO",
        "max_expansions": 50,
        "prefix_length": 0,
        "transpositions": true,
        "rewrite": "constant_score"
      }
    }
  }
}

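Parameters for fuzzy:

<field>: (Required, object) The field to search.

`Parameters for <field>`

value: (Required, string) The term to search for.
fuzziness: (Optional, string) Maximum edit distance allowed for a match. When omitted, the fuzzy query uses AUTO.
  It can be given directly as an integer (0, 1, or 2), which is the maximum edit distance.
  It can also be given as AUTO:[low],[high]. For example, with AUTO:3,6 (the values AUTO uses by default):
    a value shorter than 3 characters must match exactly;
    a value of 3 to 5 characters allows 1 edit;
    a value of 6 or more characters allows 2 edits.
max_expansions: (Optional, integer) Maximum number of variations created. Defaults to 50.

Avoid using a high value in the max_expansions parameter, especially if the prefix_length parameter value is 0. High values in the max_expansions parameter can cause poor performance due to the high number of variations examined.
prefix_length: (Optional, integer) Number of leading characters of value that must stay unchanged. Defaults to 0.
transpositions: (Optional, Boolean) Whether swapping two adjacent characters counts as a single edit. Defaults to true.
rewrite: (Optional, string) Method used to rewrite the query; see the rewrite parameter.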
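The AUTO:[low],[high] rule for fuzziness is easy to misread, so here is a small Python sketch of it. The function name auto_fuzziness is hypothetical; it only restates the length thresholds described above.

```python
def auto_fuzziness(value: str, low: int = 3, high: int = 6) -> int:
    """Edits allowed for `value` under AUTO:low,high (sketch of the rule)."""
    length = len(value)
    if length < low:
        return 0  # too short: must match exactly
    if length < high:
        return 1  # medium length: one edit allowed
    return 2      # long terms: two edits allowed


for term in ["ki", "kimch", "kimchy"]:
    print(term, auto_fuzziness(term))
```

One consequence: in the request above, the value "ki" has only two characters, so under this rule "fuzziness": "AUTO" permits no edits for it at all.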

IDs: returns documents based on their _id.

GET /_search
{
  "query": {
    "ids" : {
      "values" : ["1", "4", "100"]
    }
  }
}

Prefix: returns documents that contain a term starting with the given prefix.

GET /_search
{
  "query": {
    "prefix": {
      "user.id": {
        "value": "ki"
      }
    }
  }
}

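Parameters for prefix:
<field>: (Required, object) The field to search.

`Parameters for <field>:`

value: (Required, string) The prefix to look for.
rewrite: (Optional, string) Method used to rewrite the query. For valid values and more information, see the rewrite parameter.
case_insensitive: (Optional, Boolean) Allows case-insensitive matching of the value. Defaults to false.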

Range: returns documents with values inside the given range. Range queries on text and keyword fields are not executed when search.allow_expensive_queries is set to false.

GET /_search
{
  "query": {
    "range": {
      "age": {
        "gte": 10,
        "lte": 20,
        "boost": 2.0
      }
    }
  }
}

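Parameters for range:
<field>: (Required, object) The field to search.
Parameters for <field>:

gt, gte, lt, lte: (Optional) Greater than, greater than or equal to, less than, less than or equal to.
format: (Optional, string) Date format used to parse date values in the query.

relation: (Optional, string) Indicates how the range query matches values for range fields. Valid values are:

INTERSECTS (Default): Matches documents with a range field value that intersects the query’s range.
CONTAINS: Matches documents with a range field value that entirely contains the query’s range.
WITHIN: Matches documents with a range field value entirely within the query’s range.

time_zone: (Optional, string) UTC offset or IANA time zone used to convert date values in the query to UTC.

boost: (Optional, float) Defaults to 1.0.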
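The three relation values can be read as plain interval tests. The Python sketch below assumes closed intervals given as (low, high) tuples; the function name relation_matches is made up for illustration and says nothing about how Elasticsearch evaluates range fields internally.

```python
def relation_matches(relation, field_range, query_range):
    """Check a stored range against a query range (both closed intervals)."""
    f_lo, f_hi = field_range
    q_lo, q_hi = query_range
    if relation == "INTERSECTS":
        return f_lo <= q_hi and q_lo <= f_hi  # the two ranges overlap
    if relation == "CONTAINS":
        return f_lo <= q_lo and q_hi <= f_hi  # field range covers the query
    if relation == "WITHIN":
        return q_lo <= f_lo and f_hi <= q_hi  # field range sits inside the query
    raise ValueError(f"unknown relation: {relation}")


stored = (5, 30)  # value of a range field in some document
query = (10, 20)  # the gte / lte bounds of the query
for rel in ("INTERSECTS", "CONTAINS", "WITHIN"):
    print(rel, relation_matches(rel, stored, query))
```

With a stored range of 5 to 30 and a query of 10 to 20, INTERSECTS and CONTAINS match while WITHIN does not.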

Range on a date field

GET /_search
{
  "query": {
    "range": {
      "timestamp": {
        "gte": "now-1d/d",
        "lt": "now/d"
      }
    }
  }
}
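The date math in this request selects all of yesterday: now-1d/d is "now minus one day, rounded down to the start of the day", and now/d is the start of today, excluded by lt. A rough Python equivalent, assuming UTC and using a helper name (yesterday_window) invented for this post:

```python
from datetime import datetime, timedelta, timezone


def floor_to_day(moment: datetime) -> datetime:
    """Round a timestamp down to 00:00:00 of the same day."""
    return moment.replace(hour=0, minute=0, second=0, microsecond=0)


def yesterday_window(now: datetime):
    """Return (gte, lt) bounds matching gte: now-1d/d and lt: now/d."""
    gte = floor_to_day(now - timedelta(days=1))  # now-1d/d
    lt = floor_to_day(now)                       # now/d
    return gte, lt


now = datetime(2021, 11, 4, 10, 9, tzinfo=timezone.utc)
gte, lt = yesterday_window(now)
print(gte.isoformat(), "<= timestamp <", lt.isoformat())
```

For a "now" of 2021-11-04 10:09 UTC the window runs from 2021-11-03 00:00 (inclusive) to 2021-11-04 00:00 (exclusive).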

Regexp: regular expression query.

GET /_search
{
  "query": {
    "regexp": {
      "user.id": {
        "value": "k.*y",
        "flags": "ALL",
        "case_insensitive": true,
        "max_determinized_states": 10000,
        "rewrite": "constant_score"
      }
    }
  }
}

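The example above matches terms that start with k and end with y; .* stands for any characters of any length, so it matches ky, kay, and kimchy.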
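The expression has to match the whole term, not just part of it. Python's re.fullmatch behaves the same way for this simple pattern, so it makes a quick sanity check. This is an analogy only: Lucene's regular expression syntax differs from Python's, and regexp_matches is a hypothetical helper.

```python
import re


def regexp_matches(pattern: str, term: str, case_insensitive: bool = False) -> bool:
    """True if `pattern` matches the entire term (Python re, not Lucene)."""
    flags = re.IGNORECASE if case_insensitive else 0
    return re.fullmatch(pattern, term, flags) is not None


for term in ["ky", "kay", "kimchy", "sky", "kimchys"]:
    print(term, regexp_matches("k.*y", term))

# With case_insensitive enabled, as in the request above, "KY" matches too.
print("KY", regexp_matches("k.*y", "KY", case_insensitive=True))
```

Here "sky" and "kimchys" are rejected because the match is anchored at both ends of the term.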

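Parameters for regexp:

<field>: (Required, object) The field to search.

Parameters for <field>:


value: (Required, string) The regular expression.
flags: (Optional, string) Enables optional operators of the regular expression syntax.
case_insensitive: (Optional, Boolean) Defaults to false.
max_determinized_states: (Optional, integer) Maximum number of automaton states required for the query. Defaults to 10000.
rewrite: (Optional, string) Method used to rewrite the query; see the rewrite parameter.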

Term: exact match. Avoid using term on text fields, because their values are analyzed at index time; use match instead.

GET /_search
{
  "query": {
    "term": {
      "user.id": {
        "value": "kimchy",
        "boost": 1.0
      }
    }
  }
}

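Parameters for term:
<field>: (Required, object) The field to search.

Parameters for <field>:
value: (Required, string) The term to search for.
boost: (Optional, float) Defaults to 1.0.
case_insensitive: (Optional, Boolean) Defaults to false.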
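Why term is a poor fit for text fields becomes clear once the analysis step is simulated. The sketch below uses a deliberately crude stand-in for the standard analyzer (split on non-word characters, lowercase); the real analyzer does more, and both function names are invented for illustration.

```python
import re


def analyze(text: str) -> list:
    """Very rough stand-in for the standard analyzer: split and lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def term_query_matches(indexed_tokens, value: str) -> bool:
    """A term query compares `value` with the indexed tokens unchanged."""
    return value in indexed_tokens


tokens = analyze("Quick Brown Foxes!")  # what a text field would index
print(tokens)
print(term_query_matches(tokens, "Quick Brown Foxes!"))  # original string
print(term_query_matches(tokens, "Quick"))               # original casing
print(term_query_matches(tokens, "quick"))               # an indexed token
```

Only the lowercase token is found, since neither the full original string nor the capitalized word exists in the index. A match query would analyze the input the same way and therefore find the document.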

Terms: the multi-value version of term.

GET /_search
{
  "query": {
    "terms": {
      "user.id": [ "kimchy", "elkbee" ],
      "boost": 1.0
    }
  }
}

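Parameters for terms:

<field>: (Optional, object) An array of terms. A document matches if it contains at least one of them (similar to OR). By default the array may hold up to 65,536 terms; the limit can be changed with the index.max_terms_count setting.

boost: (Optional, float) Defaults to 1.0.

Highlighting is best effort for terms queries, so highlight results may not be returned.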

Terms set: same as terms, but lets you specify the minimum number of terms a document must match.

An example shows it best:

Create the mapping

PUT /job-candidates
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      },
      "programming_languages": {
        "type": "keyword"
      },
      "required_matches": {
        "type": "long"
      }
    }
  }
}

Index the documents

PUT /job-candidates/_doc/1?refresh
{
  "name": "Jane Smith",
  "programming_languages": [ "c++", "java" ],
  "required_matches": 2
}

PUT /job-candidates/_doc/2?refresh
{
  "name": "Jason Response",
  "programming_languages": [ "java", "php" ],
  "required_matches": 2
}

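Search for documents whose programming_languages match at least required_matches of the query terms. The required_matches field is stored with each document at index time and states how many terms must match for that document, here 2. If the query supplied only one term, neither of the two documents above would match.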

GET /job-candidates/_search
{
  "query": {
    "terms_set": {
      "programming_languages": {
        "terms": [ "c++", "java", "php" ],
        "minimum_should_match_field": "required_matches"
      }
    }
  }
}

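Parameters for terms_set:

<field>: The field to search.

Parameters for <field>:

terms: (Required, array of strings) The terms to search for.

minimum_should_match_field: (Optional, string) Name of the numeric field that holds the number of terms a document must match.

minimum_should_match_script: (Optional, string) Script that returns the number of terms a document must match; params.num_terms is the number of terms supplied in the query. For example:

"minimum_should_match_script": {
  "source": "Math.min(params.num_terms, doc['required_matches'].value)"
}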
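The matching rule can be replayed in plain Python against the two documents indexed above. This is a simulation of the semantics only; the terms_set function and its min_field / min_script arguments are names made up for this post.

```python
docs = {
    "1": {"programming_languages": ["c++", "java"], "required_matches": 2},
    "2": {"programming_languages": ["java", "php"], "required_matches": 2},
}


def terms_set(docs, field, terms, min_field=None, min_script=None):
    """Return ids of docs matching at least the required number of terms."""
    hits = []
    for doc_id, doc in docs.items():
        matched = len(set(doc[field]) & set(terms))
        if min_script is not None:
            required = min_script(len(terms), doc)  # like params.num_terms
        else:
            required = doc[min_field]
        if matched >= required:
            hits.append(doc_id)
    return hits


field = "programming_languages"
print(terms_set(docs, field, ["c++", "java", "php"], min_field="required_matches"))
print(terms_set(docs, field, ["java"], min_field="required_matches"))
# Same idea as Math.min(params.num_terms, doc['required_matches'].value)
script = lambda num_terms, doc: min(num_terms, doc["required_matches"])
print(terms_set(docs, field, ["java"], min_script=script))
```

With three query terms both documents match. With a single term neither does, unless the script caps the requirement at the number of terms supplied.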

Wildcard: wildcard pattern query. For example, * matches zero or more characters. Wildcard queries are not executed when search.allow_expensive_queries is set to false.

GET /_search
{
  "query": {
    "wildcard": {
      "user.id": {
        "value": "ki*y",
        "boost": 1.0,
        "rewrite": "constant_score"
      }
    }
  }
}

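Parameters for wildcard:

<field>: (Required, object) The field to search.

Parameters for <field>:

boost: (Optional, float) Defaults to 1.0.

case_insensitive: (Optional, Boolean) Defaults to false.

rewrite: (Optional, string) Method used to rewrite the query; see the rewrite parameter.

value: (Required, string) The wildcard pattern. Avoid starting the pattern with * or ?, which makes the query more expensive.

?: matches exactly one character
*: matches zero or more characters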
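The two operators can be tried out with a few lines of Python that translate a wildcard pattern into a regular expression and match it against the whole term. This is a sketch of the pattern semantics only, with helper names invented for this post; it ignores escaping rules and is not how Elasticsearch executes the query.

```python
import re


def wildcard_to_regex(pattern: str):
    """Translate ? and * into a compiled regex; other characters are literal."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")  # zero or more characters
        elif ch == "?":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def wildcard_matches(pattern: str, term: str) -> bool:
    """True if the whole term matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None


for term in ["kiy", "kity", "kimchy", "kimchi"]:
    print(term, wildcard_matches("ki*y", term))
print("kay", wildcard_matches("k?y", "kay"))
print("ky", wildcard_matches("k?y", "ky"))
```

The pattern ki*y accepts kiy, kity and kimchy but not kimchi, and k?y needs exactly one character between k and y.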

posted on 2021-11-04 10:09 icodegarden