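ES Index Operations & ES Data Types: The Difference Between the text and keyword String Types

1. Viewing indices and deleting the earlier test indices

1. View the indices and their summary information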

liqiang@root MINGW64 ~/Desktop
$ curl -X GET http://127.0.0.1:9200/_cat/indices
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   415  100   415    0     0   8829      0 --:--:-- --:--:-- --:--:--  8829
yellow open .kibana_task_manager_1   lXR5nwrFSiCplqY52qoG5g 1 1  2 0 12.4kb 12.4kb
yellow open .apm-agent-configuration bPcoddBFSEa_ZR9mTuVEYA 1 1  0 0   283b   283b
yellow open orders                   bZ1MarlySOCNFrK5NRX-9Q 1 1 21 0 15.8kb 15.8kb
yellow open accounts                 mqSfqnX5Rt2O-rmVbqOXyQ 1 1  2 0    9kb    9kb
yellow open .kibana_1                bznW8eeKSC-kSfAhBhPD4w 1 1 12 4 39.8kb 39.8kb
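
The listing above is plain text with one index per line. As a minimal sketch, the lines could be parsed into dictionaries as follows. The column names are an assumption based on the default `_cat/indices` column order (health, status, index, uuid, pri, rep, docs.count, docs.deleted, store.size, pri.store.size); `parse_cat_indices` is a hypothetical helper, not part of any ES client.

```python
# Assumed default column order of the _cat/indices output shown above.
CAT_INDICES_COLUMNS = [
    "health", "status", "index", "uuid", "pri", "rep",
    "docs.count", "docs.deleted", "store.size", "pri.store.size",
]

def parse_cat_indices(text):
    """Turn plain-text _cat/indices output into a list of dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and curl progress-meter lines.
        if len(parts) != len(CAT_INDICES_COLUMNS):
            continue
        if parts[0] not in ("green", "yellow", "red"):
            continue
        rows.append(dict(zip(CAT_INDICES_COLUMNS, parts)))
    return rows

sample = """\
yellow open orders                   bZ1MarlySOCNFrK5NRX-9Q 1 1 21 0 15.8kb 15.8kb
yellow open accounts                 mqSfqnX5Rt2O-rmVbqOXyQ 1 1  2 0    9kb    9kb
"""
for row in parse_cat_indices(sample):
    print(row["index"], row["docs.count"], row["store.size"])
```

Read this way, `orders` holds 21 documents and `accounts` holds 2, each with 1 primary shard and 1 replica configured.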

2. Delete accounts and orders

1. First option: delete with Kibana

DELETE accounts

2. Second option: delete with the curl command

liqiang@root MINGW64 ~/Desktop
$ curl -X DELETE http://localhost:9200/orders
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    21  100    21    0     0     18      0  0:00:01  0:00:01 --:--:--    18
{"acknowledged":true}

liqiang@root MINGW64 ~/Desktop
$ curl -X GET http://127.0.0.1:9200/_cat/indices
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   249  100   249    0     0   5413      0 --:--:-- --:--:-- --:--:--  8032
yellow open .kibana_task_manager_1   lXR5nwrFSiCplqY52qoG5g 1 1  2 0 12.4kb 12.4kb
yellow open .apm-agent-configuration bPcoddBFSEa_ZR9mTuVEYA 1 1  0 0   283b   283b
yellow open .kibana_1                bznW8eeKSC-kSfAhBhPD4w 1 1 13 2 40.1kb 40.1kb

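Note: Kibana can also display index information.

2. Creating a new index

0. Shards and replicas

For a given index, number_of_shards can only be set once, when the index is created, whereas number_of_replicas can be raised or lowered at any time through the update index settings API.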
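
As a rough sketch of the second half of that statement, the request below changes the replica count of an existing index through the update index settings API (`PUT /<index>/_settings`). The index name `my_index`, the base URL, and the helper `build_update_replicas_request` are assumptions for illustration; the request is only built, not sent.

```python
import json
import urllib.request

def build_update_replicas_request(index, replicas, base_url="http://localhost:9200"):
    """Build (but do not send) a PUT /<index>/_settings request."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

# "my_index" is a hypothetical index name.
req = build_update_replicas_request("my_index", 0)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req).read()
```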

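Shard.

An Elasticsearch cluster lets the system store more data than a single machine can hold; it achieves this through sharding. Within an index, the data (documents) is split (sharded) across multiple shards. Elasticsearch hides the complexity of managing shards, so that the shards together look like one large index.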
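
To make the sharding idea concrete, here is a toy sketch of how a document could be routed to a shard. It assumes the simplified routing rule `shard = hash(routing) % number_of_primary_shards` and uses `zlib.crc32` purely as a stand-in hash (Elasticsearch itself uses Murmur3, and newer versions add a routing factor), so the shard numbers it prints will not match a real cluster. `route_to_shard` is a hypothetical name.

```python
import zlib

def route_to_shard(routing_value, number_of_primary_shards):
    """Pick a shard for a document (toy version of ES routing)."""
    # Stand-in hash for illustration only; ES uses Murmur3.
    h = zlib.crc32(routing_value.encode("utf-8"))
    return h % number_of_primary_shards

doc_ids = [f"doc-{i}" for i in range(100)]
with_3 = [route_to_shard(d, 3) for d in doc_ids]
with_4 = [route_to_shard(d, 4) for d in doc_ids]

# Count documents whose shard would change if the shard count changed.
moved = sum(a != b for a, b in zip(with_3, with_4))
print(f"{moved} of {len(doc_ids)} documents would land on a different shard")
```

Because the shard count is part of the modulus, changing it would send existing documents to the wrong shard, which is one way to see why number_of_shards is fixed once the index exists.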

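Replica.

To cope with request loads too heavy for a single machine to serve, Elasticsearch clusters introduce replicas. The replica strategy creates redundant copies of every shard in an index; when serving queries, these copies can be treated just like the primary shard. Replicas also provide high availability and data safety: when the machine holding a shard goes down, Elasticsearch can recover from its replica, which avoids data loss.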
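
The arithmetic behind shards and replicas is simple enough to sketch. The helper names below are hypothetical. The second function relies on the fact that a replica is never allocated on the same node as its primary; if the cluster used in this post is a single node (an assumption), that would explain why every index is reported as yellow in the listings above.

```python
def total_shard_copies(number_of_shards, number_of_replicas):
    """Primaries plus all replica copies across the whole index."""
    return number_of_shards * (1 + number_of_replicas)

def min_nodes_for_green(number_of_replicas):
    """Each copy of a shard needs its own node, so green health
    requires at least one node per copy."""
    return 1 + number_of_replicas

print(total_shard_copies(1, 1))  # 2
print(total_shard_copies(3, 2))  # 9
print(min_nodes_for_green(1))    # 2
```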

1. Without specifying the shard count, replica count, or fields

liqiang@root MINGW64 ~/Desktop
$ curl -X PUT "localhost:9200/empty?pretty"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    81  100    81    0     0     32      0  0:00:02  0:00:02 --:--:--    33
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "empty"
}

(1) View the index settings:

liqiang@root MINGW64 ~/Desktop
$ curl -X GET http://localhost:9200/empty/_settings?pretty
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   328  100   328    0     0  10250      0 --:--:-- --:--:-- --:--:--  320k
{
  "empty" : {
    "settings" : {
      "index" : {
        "creation_date" : "1596946640278",
        "number_of_shards" : "1",
        "number_of_replicas" : "1",
        "uuid" : "UVn4Da93RjK4uwkMtAkcjA",
        "version" : {
          "created" : "7060299"
        },
        "provided_name" : "empty"
      }
    }
  }
}

(2) View the field mappings

liqiang@root MINGW64 ~/Desktop
$ curl -X GET http://localhost:9200/empty/_mapping?pretty=true
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    43  100    43    0     0   1387      0 --:--:-- --:--:-- --:--:--  2687
{
  "empty" : {
    "mappings" : { }
  }
}

2. Specifying the shard count, replica count, and field mappings

(1) Create

PUT http://localhost:9200/empty2?pretty=true

Request body:

{
    "settings": {
        "number_of_shards": 3,
        "number_of_replicas": 2
    },
    "mappings": {
        "properties": {
            "userid": {
                "type": "long"
            },
            "username": {
                "type": "text"
            },
            "fullname": {
                "type": "keyword"
            },
            "age": {
                "type": "double"
            }

        }
    }
}

I ran it with Postman and got the following response (it can of course also be run in Kibana):

{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "empty2"
}
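
For readers without Postman or Kibana, the same create-index call could be assembled in a script. This is a sketch only: `build_create_index_request` is a hypothetical helper, the base URL assumes the local node used throughout this post, and the request is built but not sent.

```python
import json
import urllib.request

def build_create_index_request(index, shards, replicas, properties,
                               base_url="http://localhost:9200"):
    """Build (but do not send) a PUT /<index> request with settings and mappings."""
    body = {
        "settings": {"number_of_shards": shards, "number_of_replicas": replicas},
        "mappings": {"properties": properties},
    }
    return urllib.request.Request(
        f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

# Same fields as the request body above.
properties = {
    "userid": {"type": "long"},
    "username": {"type": "text"},
    "fullname": {"type": "keyword"},
    "age": {"type": "double"},
}
req = build_create_index_request("empty2", 3, 2, properties)
print(req.get_method(), req.full_url)
print(json.dumps(json.loads(req.data), indent=2))
# To actually send it: urllib.request.urlopen(req).read()
```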

(2) View:

liqiang@root MINGW64 ~/Desktop
$ curl -X GET http://localhost:9200/empty2/_settings?pretty
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   330  100   330    0     0   7021      0 --:--:-- --:--:-- --:--:-- 20625
{
  "empty2" : {
    "settings" : {
      "index" : {
        "creation_date" : "1596951227408",
        "number_of_shards" : "3",
        "number_of_replicas" : "2",
        "uuid" : "lC4z_xeqQ7uYUEJZwtXBBw",
        "version" : {
          "created" : "7060299"
        },
        "provided_name" : "empty2"
      }
    }
  }
}


liqiang@root MINGW64 ~/Desktop
$ curl -X GET http://localhost:9200/empty2/_mapping?pretty=true
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   316  100   316    0     0  10193      0 --:--:-- --:--:-- --:--:--  308k
{
  "empty2" : {
    "mappings" : {
      "properties" : {
        "age" : {
          "type" : "double"
        },
        "fullname" : {
          "type" : "keyword"
        },
        "userid" : {
          "type" : "long"
        },
        "username" : {
          "type" : "text"
        }
      }
    }
  }
}

Note: in ES 7, the default type is _doc.

3. Creating data (run in Kibana)

1. Create a document in empty

POST /empty/_doc
{
  "name": "zhi",
  "lastName": "qiao",
  "job": "enginee"
}

Result:

{
  "_index" : "empty",
  "_type" : "_doc",
  "_id" : "AJe80XMBntNcepW1OmVE",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

View the field mappings:

liqiang@root MINGW64 ~/Desktop
$ curl -X GET http://localhost:9200/empty/_mapping?pretty=true
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   683  100   683    0     0  22032      0 --:--:-- --:--:-- --:--:--  666k
{
  "empty" : {
    "mappings" : {
      "properties" : {
        "job" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "lastName" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "name" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}
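
The mapping above was generated by dynamic mapping, since empty was created without any fields. The sketch below approximates the default rules for flat documents: strings become text with a keyword sub-field, integers become long, floating-point numbers become float, and booleans become boolean. `infer_dynamic_mapping` is a hypothetical helper, and it deliberately ignores date detection, nested objects, and arrays.

```python
def infer_dynamic_mapping(document):
    """Approximate the field types dynamic mapping would pick for a flat document."""
    properties = {}
    for field, value in document.items():
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            properties[field] = {"type": "boolean"}
        elif isinstance(value, int):
            properties[field] = {"type": "long"}
        elif isinstance(value, float):
            properties[field] = {"type": "float"}
        elif isinstance(value, str):
            properties[field] = {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
            }
        else:
            raise TypeError(f"unsupported value for {field!r}: {value!r}")
    return properties

doc = {"name": "zhi", "lastName": "qiao", "job": "enginee"}
mapping = infer_dynamic_mapping(doc)
print(mapping["job"])
```

For the document posted above, every field comes out as text with a keyword sub-field, which matches the mapping returned by ES.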

2. Create a document in empty2

POST /empty2/_doc
{
  "name": "zhi",
  "lastName": "qiao",
  "job": "enginee"
}

Result: (when a document contains a field that is not yet in the mapping, ES adds the field to the existing type)

{
  "_index" : "empty2",
  "_type" : "_doc",
  "_id" : "AZe90XMBntNcepW1N2Vv",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 3,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}
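
The `_shards` section differs between the two responses: total is 2 for empty and 3 for empty2. A plausible reading, sketched below, is that total counts the shard copies the write should reach (one primary plus its replicas), while successful counts the copies that actually took it. The helper name and the `assigned_replicas` parameter are assumptions; the default of 0 assumes no replica is currently allocated, as on a single-node setup.

```python
def expected_write_shards(number_of_replicas, assigned_replicas=0):
    """Expected _shards block of an index response for one document."""
    return {
        "total": 1 + number_of_replicas,      # primary + configured replicas
        "successful": 1 + assigned_replicas,  # copies that actually exist
        "failed": 0,
    }

print(expected_write_shards(1))  # {'total': 2, 'successful': 1, 'failed': 0}
print(expected_write_shards(2))  # {'total': 3, 'successful': 1, 'failed': 0}
```

Note that total depends only on number_of_replicas, not on number_of_shards: a single document lives on exactly one primary shard.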

4. ES data types

1. Field data types: the types available for user-defined fields

Alias, Arrays, Binary, Boolean, Date, Date nanoseconds, Dense vector, Histogram, Flattened, Geo-point, Geo-shape, IP, Join, Keyword, Nested, Numeric, Object, Percolator, Range, Rank feature, Rank features, Search-as-you-type, Sparse vector, Text, Token count, Shape, Constant keyword

2. Metadata fields: default fields generated by ES

_field_names field, _ignored field, _id field, _index field, _meta field, _routing field, _source field, _type field

3. The difference between the keyword and text string data types

Quoting the official documentation:

1. keyword

A field to index structured content such as IDs, email addresses, hostnames, status codes, zip codes or tags.

They are typically used for filtering (Find me all blog posts where status is published), for sorting, and for aggregations. Keyword fields are only searchable by their exact value.

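If you need to index full text content such as email bodies or product descriptions, it is likely that you should rather use a text field.

Put simply, the keyword data type is meant for indexing values such as email addresses, hostnames, zip codes, and tags. These values are not analyzed (tokenized) and can only be matched by their exact value. Keyword fields can be used for filtering, sorting, and aggregations.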

2. text

A field to index full-text values, such as the body of an email or the description of a product. These fields are analyzed, that is they are passed through an analyzer to convert the string into a list of individual terms before being indexed. The analysis process allows Elasticsearch to search for individual words within each full text field. Text fields are not used for sorting and seldom used for aggregations (although the significant text aggregation is a notable exception).

Put simply: the text data type is used to index long text, such as the body of an email or the description of a product. The text is analyzed: before indexing, it is split into individual terms, and those terms are what gets indexed, which lets ES search for individual words. text fields are not used for sorting and are seldom used for aggregations.

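Note: when a string field is encountered without an explicit mapping, the system maps it as text by default (dynamic mapping also adds a keyword sub-field, as the mapping of empty above shows). The string is analyzed at search time. So to search only by the exact value of the field, the field has to be mapped as keyword, as above.

For example (run in Kibana):

1. Create a user index as follows:

PUT /u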
{
    "mappings": {
        "properties": {
            "full_name": {
                "type": "text"
            },
            "idcard": {
                "type": "keyword"
            }
        }
    }
}

Result:

{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "u"
}

2. View the field mappings

GET /u/_mapping?pretty=true

Result:

{
  "u" : {
    "mappings" : {
      "properties" : {
        "full_name" : {
          "type" : "text"
        },
        "idcard" : {
          "type" : "keyword"
        }
      }
    }
  }
}

3. Create the following data, then search:

POST /u/_doc
{
  "full_name": "张三",
  "idcard": "zhang san"
}

Search:

(1) Search for the keyword 张

GET /u/_search?q=张

Result:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.9808291,
    "hits" : [
      {
        "_index" : "u",
        "_type" : "_doc",
        "_id" : "BJfc0XMBntNcepW1F2Vj",
        "_score" : 0.9808291,
        "_source" : {
          "full_name" : "张三",
          "idcard" : "zhang san"
        }
      }
    ]
  }
}

(2) Search for the keyword zhang:

GET /u/_search?q=zhang

Result:

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

(3) Search for the keyword zhang san

GET /u/_search?q=zhang san

Result:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.9808291,
    "hits" : [
      {
        "_index" : "u",
        "_type" : "_doc",
        "_id" : "BJfc0XMBntNcepW1F2Vj",
        "_score" : 0.9808291,
        "_source" : {
          "full_name" : "张三",
          "idcard" : "zhang san"
        }
      }
    ]
  }
}

Explanation: full_name is analyzed (tokenized), whereas idcard is not.
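
The `q=` searches above do not name a field, so they run against all fields. To test one field at a time, the request could instead carry a query body. The sketch below builds two such bodies with the standard `match` and `term` queries; the helper names are hypothetical, and the bodies would be sent to `/u/_search`.

```python
import json

def match_query(field, text):
    """Full-text query: the input is analyzed before matching."""
    return {"query": {"match": {field: text}}}

def term_query(field, value):
    """Exact-value query: the input is compared as a single term."""
    return {"query": {"term": {field: value}}}

# Analyzed lookup on the text field, exact lookup on the keyword field.
print(json.dumps(match_query("full_name", "张"), ensure_ascii=False))
print(json.dumps(term_query("idcard", "zhang san")))
```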

4. Delete the data above

DELETE /u/_doc/BJfc0XMBntNcepW1F2Vj

5. Add data again to test the search the other way round

POST /u/_doc
{
  "full_name": "zhang san",
  "idcard": "张三"
}

Result:

{
  "_index" : "u",
  "_type" : "_doc",
  "_id" : "BZfk0XMBntNcepW1L2Xg",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 4,
  "_primary_term" : 1
}

(1) Search for the keyword zhang

GET /u/_search?q=zhang

Result: the document is found:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.9808291,
    "hits" : [
      {
        "_index" : "u",
        "_type" : "_doc",
        "_id" : "BZfk0XMBntNcepW1L2Xg",
        "_score" : 0.9808291,
        "_source" : {
          "full_name" : "zhang san",
          "idcard" : "张三"
        }
      }
    ]
  }
}

(2) Search for the keyword 张

GET /u/_search?q=张

Result: nothing is found.

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
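
One way to check these results directly, rather than inferring them from searches, is the `_analyze` API, which reports the terms a given field would produce for a piece of text. The sketch below only builds the requests; `build_analyze_request` is a hypothetical helper and the base URL assumes the local node used in this post.

```python
import json
import urllib.request

def build_analyze_request(index, field, text, base_url="http://localhost:9200"):
    """Build (but do not send) a POST /<index>/_analyze request for one field."""
    body = json.dumps({"field": field, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{index}/_analyze",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )

for field in ("full_name", "idcard"):
    req = build_analyze_request("u", field, "zhang san")
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
# To actually send one: urllib.request.urlopen(req).read()
```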

posted @ 2020-08-09 14:30 QiaoZhi