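Elasticsearch: index, document, compound query, sort query, and filter operations

Elasticsearch: index operations

# The ES inverted index (see the further-reading file 扩展阅读.md)

- Split each document into terms (tokenization) and build an index entry for every term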
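To make the idea concrete, here is a toy inverted index in Python. It is only a sketch: the whitespace tokenizer and the sample sentences are assumptions for illustration, and a real Elasticsearch analyzer does far more (stemming, stop words, language-specific segmentation).

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive whitespace tokenizer; a real analyzer does much more.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

# Hypothetical sample documents, keyed by document id.
docs = {1: "quick brown fox", 2: "quick red fox", 3: "lazy brown dog"}
index = build_inverted_index(docs)
print(index["quick"])  # [1, 2]
print(index["brown"])  # [1, 3]
```

Looking up a term is then a single dictionary access, which is why search engines index terms rather than scanning documents.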

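See the official documentation for the details of each operation:

https://www.elastic.co/guide/en/elasticsearch/reference/7.5/indices.html

Official Chinese documentation (written for version 2.x):

https://www.elastic.co/guide/cn/elasticsearch/guide/current/index-settings.html

1. Initializing an index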

# Create an index named lqz2 with 5 primary shards and 1 replica per primary shard
PUT lqz2
{
  "settings": {
    "index":{
      "number_of_shards":5,
      "number_of_replicas":1
    }
  }
}
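'''
number_of_shards
Number of primary shards per index. The default is 5 before version 7.0 and 1 from 7.0 onward. This setting cannot be changed after the index is created.
number_of_replicas
Number of replica copies of each primary shard. The default is 1. This setting can be changed at any time on a live index.
'''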
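As a quick sanity check on these two settings, the sketch below builds the same settings body in Python and computes how many shard copies the cluster ends up holding. The helper names are made up for illustration; nothing here talks to a cluster.

```python
def index_settings(shards, replicas):
    """Build the request body for PUT <index> with explicit shard settings."""
    return {"settings": {"index": {"number_of_shards": shards,
                                   "number_of_replicas": replicas}}}

def total_shard_copies(shards, replicas):
    """Each primary shard exists once, plus `replicas` extra copies."""
    return shards * (1 + replicas)

body = index_settings(5, 1)
print(body)
print(total_shard_copies(5, 1))  # 10
```

With 5 primaries and 1 replica, the lqz2 index is stored as 10 shards in total.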

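2. Querying index settings

# Get the settings of the lqz2 index
GET lqz2/_settings
# Get the settings of all indices
GET _all/_settings
# Same as above
GET _settings
# Get the settings of the lqz and lqz2 indices
GET lqz,lqz2/_settings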

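3. Updating an index

# Change the number of replicas to 2
PUT lqz/_settings
{
  "number_of_replicas": 2
}
# If this fails with cluster_block_exception, the usual cause is that the data directory of an ES node is running out of disk space. To protect the data, ES automatically marks the affected indices read-only (index.blocks.read_only_allow_delete). Free up disk space, then clear the block:

PUT _all/_settings
{
  "index": {
    "blocks": {
      "read_only_allow_delete": false
    }
  }
}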

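4. Deleting an index

# Delete the lqz index
DELETE lqz

Elasticsearch: document operations

1. Creating a document

# Create a book with id 1 (with an explicit id, both POST and PUT work)
POST lqz/_doc/1/_create
# Alternative: POST lqz/_doc/1 (overwrites an existing document with the same id, whereas _create fails if it already exists)
# Alternative: POST lqz/_doc (the id is generated automatically; this form must use POST)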
{
  "title":"红楼梦",
  "price":12,
  "publish_addr":{
    "province":"黑龙江",
    "city":"鹤岗"
  },
  "publish_date":"2013-11-11",
  "read_num":199,
  "tag":["古典","名著"]
}

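2. Retrieving a document

# Get the document with id 1 from the lqz index
GET lqz/_doc/1
# Get the document with id 7 from the lqz index, returning only the title field
GET lqz/_doc/7?_source=title
# Get the document with id 7 from the lqz index, returning only the title and price fields
GET lqz/_doc/7?_source=title,price
# Get the document with id 7 from the lqz index, returning all fields
GET lqz/_doc/7?_source

3. Updating a document

# Full update (overwrite): any field missing from the new body is lost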
PUT lqz/_doc/1
{
  "title":"xxxx",
  "price":333,
  "publish_addr":{
    "province":"黑龙江",
    "city":"福州"
  }
}
# Partial update: only the listed fields change (note that this uses POST, and the fields must be wrapped in doc)
POST lqz/_update/1
{
  "doc":{
    "title":"修改"
  }
}
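The difference between the two update styles can be emulated on plain Python dictionaries. This is a local sketch, not a client call: `overwrite` and `partial_update` are hypothetical helpers, and the merge rule (nested objects merge, everything else is replaced) is a simplified model of the partial update.

```python
import copy

def overwrite(stored, new_doc):
    """Model of PUT <index>/_doc/<id>: the new body replaces the stored document."""
    return copy.deepcopy(new_doc)

def partial_update(stored, partial):
    """Model of POST <index>/_update/<id> with "doc": merge fields into the stored document."""
    merged = copy.deepcopy(stored)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = partial_update(merged[key], value)  # nested objects merge
        else:
            merged[key] = copy.deepcopy(value)  # scalars and arrays are replaced
    return merged

stored = {"title": "old", "price": 12, "read_num": 199}
print(overwrite(stored, {"title": "xxxx", "price": 333}))
print(partial_update(stored, {"title": "new"}))
```

After the overwrite, `read_num` is gone; after the partial update it is still there and only `title` has changed.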

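4. Deleting a document

# Delete the document with id 10
DELETE lqz/_doc/10

5. Batch operations: _mget

# Fetch the document with id 2 from the lqz index and the document with id 1 from the lqz2 index (both of type _doc) in one request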
GET _mget
{
  "docs":[
    {
      "_index":"lqz",
      "_type":"_doc",
      "_id":2
    },
    {
      "_index":"lqz2",
      "_type":"_doc",
      "_id":1
    }
    ]
}

# Fetch the documents with ids 1 and 2 from the lqz index
GET lqz/_mget
{
  "docs":[
    {
      "_id":2
    },
    {
      "_id":1
    }
    ]
}
# Same as above
GET lqz/_mget
{
  "ids":[1,2]
}

6. Batch operations: bulk

PUT test/_doc/2/_create
{
  "field1" : "value22"
}
POST _bulk
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }
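The _bulk body is newline-delimited JSON: one action line, optionally followed by one source line, and the whole body must end with a newline. A minimal sketch of assembling such a body in Python follows; the `bulk_body` helper and its tuple layout are assumptions for illustration, not part of any client library.

```python
import json

def bulk_body(operations):
    """Serialize (action, metadata, source) triples into a _bulk NDJSON body."""
    lines = []
    for action, meta, source in operations:
        lines.append(json.dumps({action: meta}))
        if source is not None:  # a delete action has no source line
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"  # the body must end with a newline

ops = [
    ("index", {"_index": "test", "_id": "1"}, {"field1": "value1"}),
    ("delete", {"_index": "test", "_id": "2"}, None),
    ("update", {"_index": "test", "_id": "1"}, {"doc": {"field2": "value2"}}),
]
body = bulk_body(ops)
print(body, end="")
```

Each JSON object has to stay on a single line, which is why the body is built line by line instead of being pretty-printed.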

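Elasticsearch: the two ways to query

1. Preface

Elasticsearch offers two ways to query:

Query string: a simple search in which the query is passed the same way as URL parameters. It is known as simple search or query-string search.
Query DSL: a rich and flexible query language provided by Elasticsearch. The query is written as a JSON request body and sent to Elasticsearch through its RESTful API.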
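The two styles differ only in where the condition travels. The sketch below builds both requests for the same condition without sending them; the helper names are hypothetical, and the index and field names are the ones used in the examples of this post.

```python
import json
from urllib.parse import urlencode

def query_string_request(index, field, value):
    """Query-string search: the condition travels in the URL's q parameter."""
    return f"/{index}/_search?" + urlencode({"q": f"{field}:{value}"})

def dsl_request(index, field, value):
    """Query DSL search: the condition travels in a JSON request body."""
    return f"/{index}/_search", json.dumps({"query": {"match": {field: value}}})

print(query_string_request("lqz", "from", "gu"))  # /lqz/_search?q=from%3Agu
path, body = dsl_request("lqz", "from", "gu")
print(path, body)
```

The query string is convenient for quick checks, while the DSL body scales to the sorting, paging, and boolean combinations that a URL parameter cannot express comfortably.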

2. Preparing the data

PUT lqz/doc/1
{
  "name":"顾老二",
  "age":30,
  "from": "gu",
  "desc": "皮肤黑、武器长、性格直",
  "tags": ["黑", "长", "直"]
}

PUT lqz/doc/2
{
  "name":"大娘子",
  "age":18,
  "from":"sheng",
  "desc":"肤白貌美，娇憨可爱",
  "tags":["白", "富","美"]
}

PUT lqz/doc/3
{
  "name":"龙套偏房",
  "age":22,
  "from":"gu",
  "desc":"mmp，没怎么看，不知道怎么形容",
  "tags":["造数据", "真","难"]
}

PUT lqz/doc/4
{
  "name":"石头",
  "age":29,
  "from":"gu",
  "desc":"粗中有细，狐假虎威",
  "tags":["粗", "大","猛"]
}

PUT lqz/doc/5
{
  "name":"魏行首",
  "age":25,
  "from":"广云台",
  "desc":"仿佛兮若轻云之蔽月,飘飘兮若流风之回雪,mmp，最后竟然没有嫁给顾老二！",
  "tags":["闭月","羞花"]
}


3. Query string

GET lqz/doc/_search?q=from:gu

This is still a GET request, this time against _search. The query condition is from:gu, in other words: which people belong to the gu family?

The result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.6931472,
    "hits" : [
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "4",
        "_score" : 0.6931472,
        "_source" : {
          "name" : "石头",
          "age" : 29,
          "from" : "gu",
          "desc" : "粗中有细，狐假虎威",
          "tags" : [
            "粗",
            "大",
            "猛"
          ]
        }
      },
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "顾老二",
          "age" : 30,
          "from" : "gu",
          "desc" : "皮肤黑、武器长、性格直",
          "tags" : [
            "黑",
            "长",
            "直"
          ]
        }
      },
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "3",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "龙套偏房",
          "age" : 22,
          "from" : "gu",
          "desc" : "mmp，没怎么看，不知道怎么形容",
          "tags" : [
            "造数据",
            "真",
            "难"
          ]
        }
      }
    ]
  }
}


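Let's focus on hits: it is the returned result set, here every document whose from field is gu. The most important item is _score, the relevance score. Elasticsearch computes how well each document matches the query, and a better match gets a higher score. How the scoring algorithm works is covered later.

4. Structured query (Query DSL)

Now let's repeat the same search with the DSL and list the people from the Gu family.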

GET lqz/doc/_search
{
  "query": {
    "match": {
      "from": "gu"
    }
  }
}

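In the example above the query is built up step by step: the condition goes inside match, and match returns every document whose from field contains gu. As expected, the result is the same: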

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.6931472,
    "hits" : [
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "4",
        "_score" : 0.6931472,
        "_source" : {
          "name" : "石头",
          "age" : 29,
          "from" : "gu",
          "desc" : "粗中有细，狐假虎威",
          "tags" : [
            "粗",
            "大",
            "猛"
          ]
        }
      },
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "顾老二",
          "age" : 30,
          "from" : "gu",
          "desc" : "皮肤黑、武器长、性格直",
          "tags" : [
            "黑",
            "长",
            "直"
          ]
        }
      },
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "3",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "龙套偏房",
          "age" : 22,
          "from" : "gu",
          "desc" : "mmp，没怎么看，不知道怎么形容",
          "tags" : [
            "造数据",
            "真",
            "难"
          ]
        }
      }
    ]
  }
}


Elasticsearch: sorting queries with sort

Descending: desc

For example, find everyone from the Gu household and sort them by the age field in descending order:

GET lqz/doc/_search
{
  "query": {
    "match": {
      "from": "gu"
    }
  },
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    }
  ]
}

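In the example above we add sort on top of the condition query. Results are sorted by the age field; the order key controls the direction, and desc means descending.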

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "name" : "顾老二",
          "age" : 30,
          "from" : "gu",
          "desc" : "皮肤黑、武器长、性格直",
          "tags" : [
            "黑",
            "长",
            "直"
          ]
        },
        "sort" : [
          30
        ]
      },
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "4",
        "_score" : null,
        "_source" : {
          "name" : "石头",
          "age" : 29,
          "from" : "gu",
          "desc" : "粗中有细，狐假虎威",
          "tags" : [
            "粗",
            "大",
            "猛"
          ]
        },
        "sort" : [
          29
        ]
      },
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "name" : "龙套偏房",
          "age" : 22,
          "from" : "gu",
          "desc" : "mmp，没怎么看，不知道怎么形容",
          "tags" : [
            "造数据",
            "真",
            "难"
          ]
        },
        "sort" : [
          22
        ]
      }
    ]
  }
}


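In the result above the hits come back in descending order of age. Because the sort is on a field rather than on relevance, _score is null and each hit carries a sort value instead.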
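What the query computes can be emulated locally on the five sample documents. This is a sketch under simplifying assumptions: `search_sorted` is a hypothetical helper, and exact equality on `from` stands in for the match query.

```python
# The name, age, and from fields of the five sample documents.
PEOPLE = [
    {"name": "顾老二", "age": 30, "from": "gu"},
    {"name": "大娘子", "age": 18, "from": "sheng"},
    {"name": "龙套偏房", "age": 22, "from": "gu"},
    {"name": "石头", "age": 29, "from": "gu"},
    {"name": "魏行首", "age": 25, "from": "广云台"},
]

def search_sorted(docs, from_value, order="desc"):
    """Keep docs whose from field equals from_value, then sort them by age."""
    hits = [doc for doc in docs if doc["from"] == from_value]
    return sorted(hits, key=lambda doc: doc["age"], reverse=(order == "desc"))

ages = [doc["age"] for doc in search_sorted(PEOPLE, "gu")]
print(ages)  # [30, 29, 22]
```

The ages come out in the same order as the sort values in the response above.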

Ascending: asc

GET lqz/doc/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "age": {
        "order": "asc"
      }
    }
  ]
}

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "2",
        "_score" : null,
        "_source" : {
          "name" : "大娘子",
          "age" : 18,
          "from" : "sheng",
          "desc" : "肤白貌美，娇憨可爱",
          "tags" : [
            "白",
            "富",
            "美"
          ]
        },
        "sort" : [
          18
        ]
      },
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "name" : "龙套偏房",
          "age" : 22,
          "from" : "gu",
          "desc" : "mmp，没怎么看，不知道怎么形容",
          "tags" : [
            "造数据",
            "真",
            "难"
          ]
        },
        "sort" : [
          22
        ]
      },
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "5",
        "_score" : null,
        "_source" : {
          "name" : "魏行首",
          "age" : 25,
          "from" : "广云台",
          "desc" : "仿佛兮若轻云之蔽月,飘飘兮若流风之回雪,mmp，最后竟然没有嫁给顾老二！",
          "tags" : [
            "闭月",
            "羞花"
          ]
        },
        "sort" : [
          25
        ]
      },
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "4",
        "_score" : null,
        "_source" : {
          "name" : "石头",
          "age" : 29,
          "from" : "gu",
          "desc" : "粗中有细，狐假虎威",
          "tags" : [
            "粗",
            "大",
            "猛"
          ]
        },
        "sort" : [
          29
        ]
      },
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "name" : "顾老二",
          "age" : 30,
          "from" : "gu",
          "desc" : "皮肤黑、武器长、性格直",
          "tags" : [
            "黑",
            "长",
            "直"
          ]
        },
        "sort" : [
          30
        ]
      }
    ]
  }
}


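Note: not every field type can be sorted. Numeric and date fields sort out of the box, and so do keyword fields. Analyzed text fields cannot be sorted by default (you would have to enable fielddata or sort on a keyword sub-field).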

Elasticsearch: paginated queries

Pagination: from/size

GET lqz/doc/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    }
  ], 
  "from": 2,
  "size": 1
}

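In the example above we query everything, sorted by `age` in descending order, and add two properties, `from` and `size`, to control which slice of the result set is returned.

- from: the offset of the first hit to return
- size: how many hits to return

# How do we paginate with this?
With 10 items per page:
Page 1:
  "from": 0,
  "size": 10
Page 2:
  "from": 10,
  "size": 10
Page 3:
  "from": 20,
  "size": 10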

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "5",
        "_score" : null,
        "_source" : {
          "name" : "魏行首",
          "age" : 25,
          "from" : "广云台",
          "desc" : "仿佛兮若轻云之蔽月,飘飘兮若流风之回雪,mmp，最后竟然没有嫁给顾老二！",
          "tags" : [
            "闭月",
            "羞花"
          ]
        },
        "sort" : [
          25
        ]
      }
    ]
  }
}
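The page arithmetic for from/size generalizes to a one-line formula, from = (page - 1) * size. A minimal sketch, assuming 1-based page numbers; `page_params` is a hypothetical helper.

```python
def page_params(page, size=10):
    """Translate a 1-based page number into from/size search parameters."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {"from": (page - 1) * size, "size": size}

for page in (1, 2, 3):
    print(page, page_params(page))
```

The returned dictionary can be merged into a search body next to query and sort. Keep in mind that from + size is capped by the index.max_result_window setting (10,000 by default), so very deep pages need a different mechanism.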


Elasticsearch: boolean (compound) queries

Combining multiple conditions

- must (and)
- should (or)
- must_not (not)
- filter

Compound query: must

# Find documents where from is gu and age is 30
GET lqz/doc/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "from": "gu"
          }
        },
        {
          "match": {
            "age": "30"
          }
        }
      ]
    }
  }
}

{
  "took" : 8,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.287682,
    "hits" : [
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "1",
        "_score" : 1.287682,
        "_source" : {
          "name" : "顾老二",
          "age" : 30,
          "from" : "gu",
          "desc" : "皮肤黑、武器长、性格直",
          "tags" : [
            "黑",
            "长",
            "直"
          ]
        }
      }
    ]
  }
}


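Note: any clause whose value is a list (such as must above) can hold several conditions side by side.

Compound query: should

# Find documents where `from` is `gu` or `tags` contains `闭月`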
GET lqz/doc/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "from": "gu"
          }
        },
        {
          "match": {
            "tags": "闭月"
          }
        }
      ]
    }
  }
}

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 0.6931472,
    "hits" : [
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "4",
        "_score" : 0.6931472,
        "_source" : {
          "name" : "石头",
          "age" : 29,
          "from" : "gu",
          "desc" : "粗中有细，狐假虎威",
          "tags" : [
            "粗",
            "大",
            "猛"
          ]
        }
      },
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "5",
        "_score" : 0.5753642,
        "_source" : {
          "name" : "魏行首",
          "age" : 25,
          "from" : "广云台",
          "desc" : "仿佛兮若轻云之蔽月,飘飘兮若流风之回雪,mmp，最后竟然没有嫁给顾老二！",
          "tags" : [
            "闭月",
            "羞花"
          ]
        }
      },
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "顾老二",
          "age" : 30,
          "from" : "gu",
          "desc" : "皮肤黑、武器长、性格直",
          "tags" : [
            "黑",
            "长",
            "直"
          ]
        }
      },
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "3",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "龙套偏房",
          "age" : 22,
          "from" : "gu",
          "desc" : "mmp，没怎么看，不知道怎么形容",
          "tags" : [
            "造数据",
            "真",
            "难"
          ]
        }
      }
    ]
  }
}


Compound query: must_not

# Find documents where `from` is not `gu`, `tags` does not contain `可爱`, and `age` is not `18`

GET lqz/doc/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match": {
            "from": "gu"
          }
        },
        {
          "match": {
            "tags": "可爱"
          }
        },
        {
          "match": {
            "age": 18
          }
        }
      ]
    }
  }
}

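filter query

filter applies a condition that narrows the result set without contributing to the score. A range condition is expressed with `range`; here `gt` means greater than.
gt: greater than; lt: less than; gte: greater than or equal to; lte: less than or equal to

# Find documents where `from` is `gu` and `age` is greater than `25`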

GET lqz/doc/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "from": "gu"
          }
        }
      ],
      "filter": {
        "range": {
          "age": {
            "gt": 25
          }
        }
      }
    }
  }
}

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.6931472,
    "hits" : [
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "4",
        "_score" : 0.6931472,
        "_source" : {
          "name" : "石头",
          "age" : 29,
          "from" : "gu",
          "desc" : "粗中有细，狐假虎威",
          "tags" : [
            "粗",
            "大",
            "猛"
          ]
        }
      },
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "顾老二",
          "age" : 30,
          "from" : "gu",
          "desc" : "皮肤黑、武器长、性格直",
          "tags" : [
            "黑",
            "长",
            "直"
          ]
        }
      }
    ]
  }
}
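The range operators map directly onto ordinary comparisons. The sketch below emulates the range filter locally on the ages of the three gu documents; `RANGE_OPS` and `in_range` are names invented for this illustration.

```python
import operator

# Range operator names as used inside a range clause.
RANGE_OPS = {"gt": operator.gt, "gte": operator.ge,
             "lt": operator.lt, "lte": operator.le}

def in_range(value, bounds):
    """Return True when value satisfies every bound, e.g. {"gt": 25}."""
    return all(RANGE_OPS[name](value, bound) for name, bound in bounds.items())

# Ages of the sample documents whose from field is gu.
gu_ages = {"顾老二": 30, "龙套偏房": 22, "石头": 29}
matches = [name for name, age in gu_ages.items() if in_range(age, {"gt": 25})]
print(matches)
```

Several bounds can be combined in one clause, for example {"gte": 25, "lte": 30} for a closed interval.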


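Summary:

must: AND relation, equivalent to and in a relational database.
should: OR relation, equivalent to or in a relational database.
must_not: NOT relation, equivalent to not in a relational database.
filter: filter condition.
range: range condition.
gt: greater than, equivalent to > in a relational database.
gte: greater than or equal to, equivalent to >= in a relational database.
lt: less than, equivalent to < in a relational database.
lte: less than or equal to, equivalent to <= in a relational database.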
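When queries are generated by a program, the four clause types can be assembled with a small helper. This is a sketch only: `bool_query` and its parameter names are hypothetical, and the function merely builds the request body as a dictionary.

```python
def bool_query(must=None, should=None, must_not=None, filter_=None):
    """Assemble a bool query body, omitting clauses that were not supplied."""
    clauses = {"must": must, "should": should,
               "must_not": must_not, "filter": filter_}
    return {"query": {"bool": {name: value
                               for name, value in clauses.items() if value}}}

# from is gu AND age > 25, with the range applied as a filter.
body = bool_query(
    must=[{"match": {"from": "gu"}}],
    filter_=[{"range": {"age": {"gt": 25}}}],
)
print(body)
```

The parameter is called `filter_` only to avoid shadowing Python's built-in `filter`; the key in the generated body is still `filter`.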

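Elasticsearch: filtering query results

1. Preface

In practice a document may have many fields, and a query returns all of them by default, which is wasteful when the data volume is large. If I only want someone's phone number, there is no point in also returning their hobbies and every other detail. So we filter the result and tell Elasticsearch exactly which fields we want.

2. Preparing the data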

PUT lqz/doc/1
{
  "name":"顾老二",
  "age":30,
  "from": "gu",
  "desc": "皮肤黑、武器长、性格直",
  "tags": ["黑", "长", "直"]
}

3. Result filtering: _source

Now, out of everything in the result, I only want to see the name and age fields and nothing else. How?

GET lqz/doc/_search
{
  "query": {
    "match": {
      "name": "顾老二"
    }
  },
  "_source": ["name", "age"]
}

{
  "took" : 8,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.8630463,
    "hits" : [
      {
        "_index" : "lqz",
        "_type" : "doc",
        "_id" : "1",
        "_score" : 0.8630463,
        "_source" : {
          "name" : "顾老二",
          "age" : 30
        }
      }
    ]
  }
}


When the data volume is large, return only the fields you actually need. This shrinks the response and makes queries more efficient.
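The effect of _source filtering on a single hit can be pictured as a projection of the stored document onto the requested field names. A local sketch, where `project_source` is a hypothetical helper that handles top-level fields only:

```python
def project_source(source, fields):
    """Keep only the requested top-level fields of a document."""
    return {name: source[name] for name in fields if name in source}

doc = {
    "name": "顾老二",
    "age": 30,
    "from": "gu",
    "desc": "皮肤黑、武器长、性格直",
    "tags": ["黑", "长", "直"],
}
print(project_source(doc, ["name", "age"]))
```

The output contains just name and age, matching the `_source` object in the response above.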

posted @ 2020-05-07 15:09 Hank·Paul
