Elasticsearch: Basic Concepts & Usage

 

 

Formatted version: https://www.wolai.com/tKv51LVKTRAH11tirCMwgg

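Analysis (tokenization)

Standard Analyzer (the default)

  1. English text is split on whitespace and on most special characters (-!@$#%^&*()+= and so on), and the tokens are lowercased. An underscore between letters does not split a token, so request_out stays whole.
  2. Chinese text is split into single characters.
  3. Special characters are never emitted as tokens: the text is split where they occur and the characters themselves are dropped.
  • Example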
POST _analyze
{
  "analyzer": "standard",
  "text": "logTag=request_out-test!gantanghao(kuohao,中文"
}

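Simple Analyzer

Splits the text at every character that is not a letter (Chinese characters count as letters, digits do not) and lowercases the tokens.

  • Example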
POST _analyze
{
  "analyzer": "simple",
  "text": "logTag=request_out-test!gantanghao(kuohao,中文s1ss"
}

Whitespace Analyzer

Splits on whitespace only. Case and punctuation are left untouched.

  • Example
POST _analyze
{
  "analyzer": "whitespace",
  "text": "logTag=request_out-test !gantanghao(kuohao,中文s1ss"
}

Inspecting the analysis result

POST _analyze
{
  "analyzer": "simple",
  "text": "logTag=request_out,中文测试"
}
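
To make the difference between the three analyzers concrete, here is a rough Python approximation. It is a sketch, not Lucene's implementation: the real standard analyzer follows Unicode text segmentation rules, which the regular expressions below only imitate for simple input like the examples above. The function names and the CJK range are assumptions made for illustration.

```python
import re

# Rough stand-ins for three built-in analyzers. Illustration only:
# the real tokenizers live in Lucene and handle many more cases.

CJK = "\u4e00-\u9fff"  # basic CJK ideograph range (assumed to be enough for these examples)

def whitespace_analyze(text):
    # Split on whitespace only; case and punctuation are kept.
    return text.split()

def simple_analyze(text):
    # Split at every non-letter character and lowercase the result.
    return re.findall(r"[^\W\d_]+", text.lower())

def standard_like_analyze(text):
    # Lowercase, emit each CJK ideograph as its own token, and keep
    # runs of letters, digits and underscores together.
    return re.findall(rf"[{CJK}]|[^\W{CJK}]+", text.lower())

text = "logTag=request_out,中文测试"
print(whitespace_analyze(text))     # ['logTag=request_out,中文测试']
print(simple_analyze(text))         # ['logtag', 'request', 'out', '中文测试']
print(standard_like_analyze(text))  # ['logtag', 'request_out', '中', '文', '测', '试']
```

For real data, the _analyze requests above remain the authoritative way to see what a given analyzer produces.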

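Queries

Simple queries

match & match_phrase

match is ordinary analyzed matching: the query string is analyzed, and a document is returned as long as at least one of the resulting tokens matches a token of the target field.

match_phrase is positional matching: if the query string is split into several tokens, they must appear in the target field in the same order and with the same gaps between them. The allowed gap (slop) defaults to 0, so the tokens have to be adjacent.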
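
The difference can be modelled in a few lines of Python. This is a simplified sketch that ignores scoring and any slop other than 0, and it assumes both the query and the document have already been turned into token lists; the function names are made up for illustration.

```python
def match(query_tokens, doc_tokens):
    # match (default OR operator): at least one query token occurs in the document.
    return any(tok in doc_tokens for tok in query_tokens)

def match_phrase(query_tokens, doc_tokens):
    # match_phrase with slop 0: the query tokens must appear
    # consecutively and in the same order.
    n = len(query_tokens)
    if n == 0:
        return False
    return any(doc_tokens[i:i + n] == query_tokens
               for i in range(len(doc_tokens) - n + 1))

doc = "according to error message from server".split()
print(match("error timeout".split(), doc))           # True: "error" is present
print(match_phrase("error message".split(), doc))    # True: adjacent, same order
print(match_phrase("message error".split(), doc))    # False: wrong order
print(match_phrase("according error".split(), doc))  # False: "to" sits in between
```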

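term

term matches the exact, unanalyzed value of a field. Avoid running term against text fields where possible: the query value is not analyzed, so it is compared as-is with each individual token that was indexed for the field.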
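
Why term behaves surprisingly on text fields can be shown with a toy model. The sketch below assumes an analyzer that only lowercases and splits on whitespace, which is a simplification; the helper names are hypothetical.

```python
def analyze(text):
    # Stand-in for the analyzer of a text field: lowercase + split on whitespace.
    return text.lower().split()

def term_on_text(value, original):
    # A text field is indexed as separate tokens; term compares the
    # unanalyzed query value with each of those tokens.
    return value in analyze(original)

def term_on_keyword(value, original):
    # A keyword field is indexed as one unmodified value.
    return value == original

stored = "Quick Brown Fox"
print(term_on_text("Quick", stored))               # False: the indexed token is "quick"
print(term_on_text("quick", stored))               # True
print(term_on_text("quick brown", stored))         # False: no single token equals it
print(term_on_keyword("Quick Brown Fox", stored))  # True
```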

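Compound queries

must & filter

A document must match the sub-queries of both. The difference is that filter ignores scoring and its results can be cached.

should

Equivalent to OR in MySQL.

  • When should sits at the same level as must or filter, it has no effect on which documents match by default, because minimum_should_match defaults to 0 in that case. To make it take effect, either set "minimum_should_match": 1 explicitly (the parameter is the number of should clauses a document has to satisfy), or wrap the should in a nested bool under must.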
//select _id,dtTime,appName,dateTime from plume_log_run_202205* where dtTime >= now-7d and dtTime < now and (appName = 'score-new' or appName = 'device-http' ) 

// Using minimum_should_match
GET plume_log_run_202205*/_search
{
  "size": 100,
  "from": 0,
  "_source": [
    "_id",
    "dtTime",
    "appName",
    "dateTime"
  ],
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "dtTime": {
              "gte": "now-7d",
              "lt": "now"
            }
          }
        }
      ],
      "should": [
        {
          "term": {
            "appName": "score-new"
          }
        },
        {
          "match": {
            "appName": "device-http"
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "highlight": {
    "fields": {
      "content": {
        "fragment_size": 2147483647
      }
    }
  },
  "sort": [
    {
      "dtTime": "desc"
    }
  ]
}

// Wrapping should in a bool nested under must
GET plume_log_run_202205*/_search
{
  "size": 100,
  "from": 0,
  "_source": [
    "_id",
    "dtTime",
    "appName",
    "dateTime"
  ],
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "dtTime": {
              "gte": "now-7d",
              "lt": "now"
            }
          }
        }
      ],
      "must": {
        "bool": {
          "should": [
            {
              "term": {
                "appName": "score-new"
              }
            },
            {
              "match": {
                "appName": "device-http"
              }
            }
          ]
        }
      }
    }
  },
  "highlight": {
    "fields": {
      "content": {
        "fragment_size": 2147483647
      }
    }
  },
  "sort": [
    {
      "dtTime": "desc"
    }
  ]
}
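
The minimum_should_match default described above can be checked with a small model of how a bool query decides whether a document matches. This is a sketch of the matching logic only (no scoring), with each clause represented as a Python predicate; bool_matches and the sample document are made up for illustration.

```python
def bool_matches(doc, must=(), filter_=(), should=(), must_not=(),
                 minimum_should_match=None):
    # Each clause is a predicate: a function taking the document and returning a bool.
    if minimum_should_match is None:
        # Default: 1 when should is the only positive clause, otherwise 0.
        if should and not must and not filter_:
            minimum_should_match = 1
        else:
            minimum_should_match = 0
    if not all(p(doc) for p in must):
        return False
    if not all(p(doc) for p in filter_):
        return False
    if any(p(doc) for p in must_not):
        return False
    hits = sum(1 for p in should if p(doc))
    return hits >= minimum_should_match

recent = lambda d: d["recent"]
is_score = lambda d: d["appName"] == "score-new"
is_device = lambda d: d["appName"] == "device-http"

doc = {"appName": "other-app", "recent": True}
print(bool_matches(doc, filter_=[recent], should=[is_score, is_device]))  # True: should is optional here
print(bool_matches(doc, filter_=[recent], should=[is_score, is_device],
                   minimum_should_match=1))                               # False
print(bool_matches(doc, should=[is_score, is_device]))                    # False: default is 1
```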

must_not

None of the sub-queries may match: a document that matches any of them is excluded.
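
As an illustration of must_not, the sketch below builds a search body that keeps documents from the last seven days and drops those of one application. The field names dtTime and appName are borrowed from the earlier examples; the helper exclude_app is hypothetical.

```python
import json

def exclude_app(app_name):
    # Hypothetical helper: recent documents whose appName is NOT app_name.
    return {
        "query": {
            "bool": {
                "filter": [
                    {"range": {"dtTime": {"gte": "now-7d", "lt": "now"}}}
                ],
                "must_not": [
                    {"term": {"appName": app_name}}
                ],
            }
        }
    }

# The printed JSON can be pasted as the body of a _search request.
print(json.dumps(exclude_app("score-new"), indent=2))
```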

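SQL queries

Direct queries

Response format documentation

Pagination documentation

Query syntax

  • Example 1: plain query (only supported on keyword fields)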
POST /_sql?format=txt
{
  "query": "SELECT appName,dtTime,dateTime FROM \"plume_log_run_20220527_*\"  order by dtTime desc limit 50"
}
  • Example 2: match (ordinary matching)
POST /_sql/translate
{
  "query": "select * from m_all_plume_log_run_202205 where match (content,'according to error message')"
}
  • Example 3: query string + GROUP BY
POST /_sql?format=txt
{
  "query": "select appName,count(*) from m_all_plume_log_run_202205 where query ('content:\"according to error message\"') group by appName"
}

Translating SQL to DSL

  • Example
POST /_sql/translate
{
  "query": "select * from test1 where a='a' and b='b'"
}

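Personal summary

Where ordinary databases use SQL (Structured Query Language), Elasticsearch expresses queries in its Query DSL (Domain Specific Language).

Rules:

  1. The compound clauses (must/filter/should/must_not) have to be combined through bool. Even a single clause must be wrapped in bool.
  2. The siblings of a compound clause (must/filter/should/must_not) can only be other compound clauses; likewise, the siblings of a basic query can only be basic queries.
  3. Multiple basic queries have to be grouped inside a compound clause (must/filter/should/must_not). They cannot sit directly under query or bool.
  4. Prefer the array form [] for every compound clause (must/filter/should/must_not), so that adding conditions later does not get messy.
  • Creating test data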
PUT test1
{

  "mappings": {

    "properties": {
      "a": {
        "type": "keyword"
      },
      "b": {
        "type": "keyword"
      },
      "c": {
        "type": "keyword"
      },
      "d": {
        "type": "keyword"
      },
      "e": {
        "type": "keyword"
      },
      "f": {
        "type": "keyword"
      },
      "dtTime": {
        "type": "date",
        "format": "strict_date_optional_time||epoch_millis"
      }
    }
  }
}


POST /test1/_bulk
{ "index":{} }
{"a":"a","dtTime":1653536394590,"b":"b","c":"c","d":"d","e":"e","f":"f"}
{ "index":{} }
{"a":"1","dtTime":1653536394590,"b":"b","c":"c","d":"d","e":"e","f":"f"}
{ "index":{} }
{"a":"1","dtTime":1653536394590,"b":"2","c":"c","d":"d","e":"e","f":"f"}
{ "index":{} }
{"a":"a","dtTime":1653536394590,"b":"3","c":"c","d":"d","e":"e","f":"f"}

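Put simply, converting SQL to DSL mostly means lifting the keywords up into clauses, for example:

  • select * from test1 where a='a' and b='b'
# and becomes must
# must has to live inside bool
# so the final result is: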

GET test1/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "a": "a"
          }
        },
        {
          "term": {
            "b": "b"
          }
        }
      ]
    }
  }
}
  • select * from test1 where (a='a' or a='1') and b='b'
# first turn and into must
# then turn or into should
# should cannot be a direct child of must, so wrap it in bool
GET test1/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              {
                "term": {
                  "a": {
                    "value": "a"
                  }
                }
              },
              {
                "term": {
                  "a": {
                    "value": "1"
                  }
                }
              }
            ]
          }
        },
        {
          "term": {
            "b": "b"
          }
        }
      ]
    }
  }
}

  • select * from test1 where (a='1' and b='2') or b='3'
# first turn or into should
# then nest a bool with must inside should
GET test1/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "b": {
              "value": "3"
            }
          }
        },
        {
          "bool": {
            "must": [
              {
                "term": {
                  "a": {
                    "value": "1"
                  }
                }
              },
              {
                "term": {
                  "b": {
                    "value": "2"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}
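
The "lift the keywords" rule from the three examples above can be sketched as a small recursive function. It is a minimal illustration, not a SQL parser: the condition is assumed to be already parsed into nested tuples, only = comparisons on keyword fields are handled, and the names to_dsl and to_search_body are made up.

```python
def to_dsl(cond):
    # cond is ("eq", field, value), ("and", [conds]) or ("or", [conds]).
    op = cond[0]
    if op == "eq":
        _, field, value = cond
        return {"term": {field: value}}
    # and -> must, or -> should; either one has to be wrapped in bool.
    clause = {"and": "must", "or": "should"}[op]
    return {"bool": {clause: [to_dsl(c) for c in cond[1]]}}

def to_search_body(cond):
    return {"query": to_dsl(cond)}

# (a='a' or a='1') and b='b'
cond = ("and", [
    ("or", [("eq", "a", "a"), ("eq", "a", "1")]),
    ("eq", "b", "b"),
])
print(to_search_body(cond))
```

Because each nested bool contains only should clauses, its minimum_should_match defaults to 1, so the OR keeps its meaning without any extra parameter.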

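Creating and inserting

Creating an index

sort.field: index sorting, the order in which documents are stored on disk within each segment

number_of_shards: number of primary shards

number_of_replicas: number of replicas

refresh_interval: how often the index is refreshed; newly written data only becomes searchable after a refresh

dynamic_templates: mapping rules for dynamically added fields

  • Example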
PUT test2
{
  "settings": {
    "index": {
      "sort.field": [
        "dtTime",
        "seq"
      ],
      "sort.order": [
        "desc",
        "desc"
      ]
    },
    "number_of_shards": 10,
    "number_of_replicas": 0,
    "refresh_interval": "30s"
  },
  "mappings": {
    "dynamic_templates": [
      {
        "test_float": {
          "match_mapping_type": "string",
          "mapping": {
            "norms": false
          }
        }
      }
    ],
    "properties": {
      "appName": {
        "type": "keyword"
      },
      "env": {
        "type": "keyword"
      },
      "appNameWithEnv": {
        "type": "keyword"
      },
      "logLevel": {
        "type": "keyword"
      },
      "serverName": {
        "type": "keyword"
      },
      "traceId": {
        "type": "keyword"
      },
      "dtTime": {
        "type": "date",
        "format": "strict_date_optional_time||epoch_millis"
      },
      "seq": {
        "type": "long"
      }
    }
  }
}

Inserting data

Single document

  • Example
POST /test1/_doc
{
    "a": "2022-05-26 11:39:54.590",
    "dtTime": 1653536394590,
    "b": "INFO",
    "c": "run(ClientWorker.java:522)",
    "d": "bms-_-dev",
    "e": "bms",
    "f": "192.168.10.47"
}

Bulk insert

  • Example
POST /test1/_bulk
{ "index":{} }
{"a":"a","dtTime":1653536394590,"b":"b","c":"c","d":"d","e":"e","f":"f"}
{ "index":{} }
{"a":"a","dtTime":1653536394590,"b":"b","c":"c","d":"d","e":"e","f":"f"}
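
When the bulk request is sent from code rather than typed into a console, the body has to be assembled as newline-delimited JSON: one action line followed by one source line per document, with a newline after the last line. A minimal sketch, with bulk_body as a hypothetical helper name:

```python
import json

def bulk_body(docs):
    # One {"index": {}} action line plus one source line per document.
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {}}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"

docs = [
    {"a": "a", "dtTime": 1653536394590, "b": "b"},
    {"a": "1", "dtTime": 1653536394590, "b": "2"},
]
print(bulk_body(docs), end="")
```

The resulting string would be sent as the body of POST /test1/_bulk with the Content-Type header set to application/x-ndjson.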

Other settings

Disabling automatic index creation

PUT _cluster/settings
{
    "persistent": {
        "action.auto_create_index": "false" 
    }
}
posted @ 2022-05-31 11:33  MRLL