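Basic Elasticsearch Operations

1. Index Operations (HTTP Requests)

(Elasticsearch 7.8.x)

1.1 Creating an Index

Creating an index and adding data can happen in a single step: indexing a document into an index that does not exist yet creates the index. When the document is inserted with an explicit id, a PUT request works. Without an id, that is, when Elasticsearch is expected to generate one, PUT fails and a POST request must be used instead.

PUT /demo_01/_doc/1  ---> /index name/type/id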
{
  "name" : "张三",  
  "age" : 25
}

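An index can also be created with explicit mappings, which define the field names under the index, the field types, and the settings related to the inverted index: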

PUT /demo_01
{
  "mappings": {
    "properties": {
      "name": {
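        "type": "text",  -> field type (e.g. text, short, date, integer, object)
        "index": true,   -> whether the field is indexed (searchable); defaults to true
        "store": true,   -> whether the field value is stored separately from _source; defaults to false
        "analyzer": "analyzer_name"  -> name of the analyzer to use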
      },
      "age": {
        "type": "long"
      },
      "birthday":{
        "type": "date"
      }
    }
  }
}

////////////////// Now suppose a document whose values do not match the mapped types is added ////////////////

POST /demo_01/_doc/2
{
  "name":"张三",
  "age":"二十五岁",
  "birthday":"1996-01-01"
}

////////////////////// The server returns the following error ///////////////////////
{
  "error" : {
    "root_cause" : [
      {
        "type" : "mapper_parsing_exception",
        "reason" : "failed to parse field [age] of type [long] in document with id '2'. Preview of field's value: '二十五岁'"
      }
    ],
    "type" : "mapper_parsing_exception",
    "reason" : "failed to parse field [age] of type [long] in document with id '2'. Preview of field's value: '二十五岁'",
    "caused_by" : {
      "type" : "illegal_argument_exception",
      "reason" : "For input string: \"二十五岁\""
    }
  },
  "status" : 400
}

1.2 Querying an Index

GET /demo_01/

################ Index information returned by the query ##################

{
  "demo_01" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "age" : {
          "type" : "long"
        },
        "birthday" : {
          "type" : "date"
        },
        "name" : {
          "type" : "text"
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1628175732914",
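        "number_of_shards" : "1",  ## number of primary shards per index; the default is 1 since 7.0 (5 in earlier versions). It cannot be changed after the index is created
        "number_of_replicas" : "1", ## number of replicas per primary shard; the default is 1. It can be changed at any time on a live index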
        "uuid" : "UiGEkRRySRu2_ZFlwaj7-A",
        "version" : {
          "created" : "7060299"
        },
        "provided_name" : "demo_01"
      }
    }
  }
}

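1.3 Deleting an Index

DELETE /demo_01

1.4 Modifying an Index

First of all, Elasticsearch does not recommend modifying the mapping of an existing index. The usual approach is to create a new index and reindex all data from the old index into it. I have not yet tried changing a mapping with zero downtime for the business; the link below is recorded here to try when needed: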

https://www.cnblogs.com/createboke/p/12234184.html
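
The create-a-new-index-and-copy approach mentioned above maps onto the _reindex API. The following is a minimal sketch, assuming Java 11+ (java.net.http), a node at localhost:9200, and a hypothetical target index demo_02 that has already been created with the new mapping. The class and method names are made up for illustration, and the request is only built here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class ReindexSketch {
    // JSON body for POST /_reindex: copy every document from sourceIndex to destIndex.
    static String reindexBody(String sourceIndex, String destIndex) {
        return "{\"source\":{\"index\":\"" + sourceIndex + "\"},"
             + "\"dest\":{\"index\":\"" + destIndex + "\"}}";
    }

    // Builds (but does not send) the HTTP request.
    static HttpRequest buildReindexRequest(String baseUrl, String sourceIndex, String destIndex) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_reindex"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(reindexBody(sourceIndex, destIndex)))
                .build();
    }

    public static void main(String[] args) {
        // demo_02 is a hypothetical index name, not one used elsewhere in these notes.
        HttpRequest request = buildReindexRequest("http://localhost:9200", "demo_01", "demo_02");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(reindexBody("demo_01", "demo_02"));
        // To run it against a live node:
        // java.net.http.HttpClient.newHttpClient()
        //         .send(request, java.net.http.HttpResponse.BodyHandlers.ofString());
    }
}
```

After the copy finishes, clients would be pointed at the new index (typically through an alias), and the old index can be deleted.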

A few cases in which the mapping can be modified:

  • Adding a new field

    • POST /demo_01/_mapping
      {
        "properties":{
          "hobby":{
            "type":"text"
          }
        }
      }
      
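  • Turning a field into a multi-field. For some fields the existing type information cannot be changed, and a new type can only be added through fields. In the example below the name field is of type text, and adding fields gives it an additional keyword sub-field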

    • POST /demo_01/_mapping
      {
        "properties":{
          "name":{
            "type": "text",
            "fields":{
              "keyword":{
                "type":"keyword",
                "ignore_above":10
              }
            }
          }
        }
      }
      
  • Adding new properties to an object field (setting properties inside a field of the mapping lets that field store object data)

    • POST /demo_01/_mapping/
      {
        "properties": {
           "friends":{
              "properties": {
                  "ttt": {
                    "type": "text"
                  },
                  "aaa": {
                    "type": "text"
                  }
              }
           }
        }
      }
      

2. Document Operations (HTTP Requests)

2.1 Get by ID

GET /demo_01/_doc/1
{
  "_index" : "demo_01",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "_seq_no" : 3,
  "_primary_term" : 3,
  "found" : true,
  "_source" : {
    "name" : "lisi",
    "age" : 19,
    "birthday" : "1996-01-01",
    "hobby" : "play game",
    "friends" : {
      "aaa" : "zhangsan",
      "ttt" : "wanger"
    }
  }
}

2.2 Querying All Documents

GET /demo_01/_doc/_search

2.3 Conditional Queries & Pagination

The query condition can be given in the URL (?q=key:value):

GET demo_01/_doc/_search?q=name:lisi

The request parameters can also be placed in the request body:

GET /demo_01/_doc/_search
{
  "query":{
    "match":{
      "name":"lisi"
    }
  }
}

Leaving the condition out of the request body (match_all) returns all documents

GET /demo_01/_doc/_search
{
  "query":{
    "match_all":{
    }
  }
}

With very large data sets, however, querying everything is not appropriate, and pagination should be considered

GET /demo_01/_doc/_search
{
  "query":{
    "match_all":{
    }
  },
  "from" : 0,
  "size" : 1,
  "_source":["title"], ## _source selects which fields are returned
  "sort":{  ## sort names the field to sort by, and whether the order is ascending or descending
    "age":{
      "order":"desc"
    }
  }
}

### Query result ###

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,   ## there are 4 documents in total, but hits contains only 1 of them because size is 1
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "demo_01",
        "_type" : "_doc",
        "_id" : "20",
        "_score" : 1.0,
        "_source" : {
          "name" : "李四",
          "age" : 20,
          "birthday" : "1996-01-01",
          "hobby" : "玩游戏",
          "friends" : {
            "aaa" : "张三",
            "ttt" : "王二"
          }
        }
      }
    ]
  }
}
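
The from and size parameters above are a zero-based offset and a page length, so a 1-based page number has to be converted. Below is a small sketch of that arithmetic. The class and method names are hypothetical, and the 10,000 limit is Elasticsearch's default index.max_result_window, which an index may override.

```java
final class PagingSketch {
    // Default index.max_result_window: from + size may not exceed it.
    static final int MAX_RESULT_WINDOW = 10_000;

    // Converts a 1-based page number into the zero-based "from" offset.
    static int fromOffset(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        long from = (long) (page - 1) * size; // long avoids int overflow
        if (from + size > MAX_RESULT_WINDOW) {
            throw new IllegalArgumentException("from + size exceeds " + MAX_RESULT_WINDOW);
        }
        return (int) from;
    }

    // Request body for a paged match_all search.
    static String searchBody(int page, int size) {
        return "{\"query\":{\"match_all\":{}},\"from\":" + fromOffset(page, size)
             + ",\"size\":" + size + "}";
    }

    public static void main(String[] args) {
        // Page 1 with one hit per page corresponds to the request shown above.
        System.out.println(searchBody(1, 1));
        System.out.println(searchBody(3, 10));
    }
}
```

Paging deeper than the result window needs a different mechanism, such as search_after or the scroll API.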

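2.4 Multi-Condition Queries

Note that match analyzes (tokenizes) the search text, whereas term does not

In a bool query, must corresponds to AND and should corresponds to OR

must (find people whose hobby is play game and whose age is 19)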

GET /demo_01/_doc/_search
{
  "query":{
    "bool":{
      "must":[
        {
          "match":{
            "hobby":"play game"
          }
        },
        {
          "match":{
            "age":19
          }
        }
      ]
    }
  }
}

## The stored hobby values are all play game. A match query analyzes ss game into the tokens ss and game, and game hits the game entry in the inverted index
GET /demo_01/_doc/_search
{
  "query":{
    "bool":{
      "must":[
        {
          "match":{
            "hobby":"ss game" 
          }
        },
        {
          "match":{
            "age":19
          }
        }
      ]
    }
  }
}

## With term, ss game is not analyzed but looked up as a single term, so nothing is found
GET /demo_01/_doc/_search
{
  "query":{
    "bool":{
      "must":[
        {
          "term":{
            "hobby":"ss game" 
          }
        },
        {
          "match":{
            "age":19
          }
        }
      ]
    }
  }
}
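
The difference between the last two requests can be reproduced with a toy inverted index. This is only a sketch of the idea, not how Elasticsearch is implemented: the analyzer here simply lowercases and splits on whitespace, and all class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class MatchVsTermSketch {
    // token -> ids of the documents containing that token
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Toy analyzer: lowercase, then split on whitespace.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Index time: the field value is analyzed and every token is recorded.
    void index(String id, String text) {
        for (String token : analyze(text)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(id);
        }
    }

    // match: the query text is analyzed too; a document matching any token is a hit.
    Set<String> match(String query) {
        Set<String> hits = new TreeSet<>();
        for (String token : analyze(query)) {
            hits.addAll(postings.getOrDefault(token, Collections.emptySet()));
        }
        return hits;
    }

    // term: the query text is looked up as one unanalyzed term.
    Set<String> term(String query) {
        return new TreeSet<>(postings.getOrDefault(query, Collections.emptySet()));
    }

    public static void main(String[] args) {
        MatchVsTermSketch index = new MatchVsTermSketch();
        index.index("1", "play game");
        System.out.println("match 'ss game': " + index.match("ss game"));
        System.out.println("term  'ss game': " + index.term("ss game"));
        System.out.println("term  'game':    " + index.term("game"));
    }
}
```

The index holds the tokens play and game, never the string "ss game", which is why the term lookup comes back empty while match still finds the document through game.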

should (find people whose hobby is play game or 玩游戏)

GET /demo_01/_doc/_search
{
  "query":{
    "bool":{
      "should":[
        {
          "match":{
            "hobby":"play game"
          }
        },
        {
          "match":{
            "hobby":"玩游戏"
          }
        }
      ]
    }
  }
}

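filter: range condition: find people whose hobby is play game or 玩游戏, and whose age is greater than 18. Once a bool query contains a filter (or must) clause, its should clauses become optional by default, so minimum_should_match is set to 1 to keep the OR condition mandatory

GET /demo_01/_doc/_search
{
  "query":{
    "bool":{
      "should":[
        {
          "match":{
            "hobby":"play game"
          }
        },
        {
          "match":{
            "hobby":"玩游戏"
          }
        }
      ],
      "minimum_should_match": 1,
      "filter":{
        "range":{
          "age":{
            "gt":18
          }
        }
      }
    }
  }
}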

2.5 Full-Text Search & Exact Match & Highlighting

match: full-text search

match_phrase: exact phrase match

highlight: highlighted results

GET /demo_01/_doc/_search
{
  "query":{
    "bool":{
      "should":[
        {
          "match":{
            "hobby":"play game"
          }
        },
        {
          "match":{
            "hobby":"玩游戏"
          }
        }
      ],
      "minimum_should_match": 1,  ## without this, the filter clause makes the should clauses optional
      "filter":{
        "range":{
          "age":{
            "gt":18
          }
        }
      }
    }
  },
  "highlight":{
    "fields":{
      "hobby":{}
    }
  }
}

############### Query result ###############

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 4.6003523,
    "hits" : [
      {
        "_index" : "demo_01",
        "_type" : "_doc",
        "_id" : "20",
        "_score" : 4.6003523,
        "_source" : {
          "name" : "李四",
          "age" : 20,
          "birthday" : "1996-01-01",
          "hobby" : "玩游戏",
          "friends" : {
            "aaa" : "张三",
            "ttt" : "王二"
          }
        },
        "highlight" : {
          "hobby" : [
            "<em>玩</em><em>游</em><em>戏</em>" ### matched terms are wrapped in <em> tags so that they show up highlighted when rendered as HTML
          ]
        }
      },
      {
        "_index" : "demo_01",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.37363437,
        "_source" : {
          "name" : "李四",
          "age" : 20,
          "birthday" : "1996-01-01",
          "hobby" : "play game",
          "friends" : {
            "aaa" : "zhangsan",
            "ttt" : "wanger"
          }
        },
        "highlight" : {
          "hobby" : [
            "<em>play</em> <em>game</em>"   ### matched terms are wrapped in <em> tags so that they show up highlighted when rendered as HTML
          ]
        }
      }
    ]
  }
}

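2.6 Aggregation Queries

terms: group-by counts

GET /demo_01/_doc/_search
{
  "aggs":{    ## aggregation section
    "age_group":{    ## aggregation name, chosen freely
      "terms":{    ## group into buckets
        "field": "age"    ## field to group by
      }
    }
  },
  "size":0 ## return no hits, only the aggregation results
}

### Query result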
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "age_group" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 19,   ##### the result shows 2 documents with age 19
          "doc_count" : 2
        },
        {
          "key" : 20,
          "doc_count" : 2
        },
        {
          "key" : 15,
          "doc_count" : 1
        },
        {
          "key" : 16,
          "doc_count" : 1
        },
        {
          "key" : 17,
          "doc_count" : 1
        },
        {
          "key" : 18,
          "doc_count" : 1
        }
      ]
    }
  }
}

avg: average value

GET /demo_01/_doc/_search
{
  "aggs":{
    "age_avg":{
      "avg":{
        "field": "age"
      }
    }
  },
  "size":0
}
######## Query result #######
{
  "took" : 277,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "age_avg" : {
      "value" : 18.0   ##### the average age is 18
    }
  }
}
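
To make the two aggregations concrete, here is a plain-Java sketch that computes the same numbers in memory from the eight ages behind the responses above. It only illustrates what terms and avg calculate. The names are hypothetical, and the bucket ordering (count descending, then key ascending) is chosen to mirror the response shown earlier.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class AggregationSketch {
    // terms aggregation: one bucket per distinct value, with its document count.
    static LinkedHashMap<Integer, Long> termsBuckets(List<Integer> ages) {
        Map<Integer, Long> counts = ages.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        LinkedHashMap<Integer, Long> buckets = new LinkedHashMap<>();
        counts.entrySet().stream()
                // highest doc_count first, ties broken by ascending key
                .sorted(Map.Entry.<Integer, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<Integer, Long>comparingByKey()))
                .forEach(e -> buckets.put(e.getKey(), e.getValue()));
        return buckets;
    }

    // avg aggregation: arithmetic mean of the field over all documents.
    static double average(List<Integer> ages) {
        return ages.stream().mapToInt(Integer::intValue).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // The ages of the 8 documents, as implied by the bucket counts above.
        List<Integer> ages = Arrays.asList(19, 19, 20, 20, 15, 16, 17, 18);
        System.out.println(termsBuckets(ages)); // {19=2, 20=2, 15=1, 16=1, 17=1, 18=1}
        System.out.println(average(ages));      // 18.0
    }
}
```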

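2.7 Updating Documents

A PUT request replaces the whole JSON document. A POST request to the _update endpoint only changes the submitted fields, leaves the other data untouched, and adds any submitted field that does not exist yet

  1. The document with id 1 in demo_01 looks like this:
## Get by ID
GET /demo_01/_doc/1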

{
  "_index" : "demo_01",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "_seq_no" : 3,
  "_primary_term" : 3,
  "found" : true,
  "_source" : {
    "name" : "lisi",
    "age" : 19,
    "birthday" : "1996-01-01",
    "hobby" : "play game",
    "friends" : {
      "aaa" : "zhangsan",
      "ttt" : "wanger"
    }
  }
}

  2. Full replacement with PUT
PUT /demo_01/_doc/1
{
  "name": "李四",
  "age": 20,
  "birthday": "1996-01-01"
}

## Result after the change

{
  "_index" : "demo_01",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 4,
  "_seq_no" : 6,
  "_primary_term" : 3,
  "found" : true,
  "_source" : {
    "name" : "李四",
    "age" : 20,
    "birthday" : "1996-01-01"
  }
}

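After the PUT request only three fields are left in the document: every field that was not part of the PUT body has been removed

  3. Partial update with POST

Merely switching the PUT request above to POST gives the same result as PUT, a full replacement. A partial update with POST has to be done as follows:

Use the _update endpoint and wrap the fields to change in a "doc" object. If a field that did not exist before is included, it is added to the original document by the update

POST /demo_01/_update/1  ## typeless form; POST /demo_01/_doc/1/_update still works in 7.x but is deprecated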
{
  "doc":{
    "name": "李四",
    "age": 20,
    "birthday": "1996-01-01"
  }
}

##### Query result #####
{
  "_index" : "demo_01",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 12,
  "_seq_no" : 14,
  "_primary_term" : 3,
  "found" : true,
  "_source" : {
    "name" : "李四",
    "age" : 20,
    "birthday" : "1996-01-01",
    "hobby" : "play game",
    "friends" : {
      "aaa" : "zhangsan",
      "ttt" : "wanger"
    }
  }
}
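
The two behaviours compared in this section come down to "replace the map" versus "merge into the map". The sketch below models a document as a java.util.Map to show the difference. It is an illustration under that assumption, not the server's implementation, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class UpdateSemanticsSketch {
    // PUT /index/_doc/id: the stored document becomes exactly the request body.
    static Map<String, Object> replace(Map<String, Object> existing, Map<String, Object> body) {
        return new LinkedHashMap<>(body);
    }

    // POST /index/_update/id with {"doc": ...}: submitted fields are merged into
    // the stored document; nested objects are merged recursively.
    @SuppressWarnings("unchecked")
    static Map<String, Object> partialUpdate(Map<String, Object> existing, Map<String, Object> doc) {
        Map<String, Object> merged = new LinkedHashMap<>(existing);
        for (Map.Entry<String, Object> entry : doc.entrySet()) {
            Object old = merged.get(entry.getKey());
            Object submitted = entry.getValue();
            if (old instanceof Map && submitted instanceof Map) {
                merged.put(entry.getKey(),
                        partialUpdate((Map<String, Object>) old, (Map<String, Object>) submitted));
            } else {
                merged.put(entry.getKey(), submitted);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> existing = new LinkedHashMap<>();
        existing.put("name", "lisi");
        existing.put("age", 19);
        existing.put("hobby", "play game");

        Map<String, Object> body = new LinkedHashMap<>();
        body.put("name", "Li Si"); // ASCII stand-in for the name used in the requests above
        body.put("age", 20);

        System.out.println("PUT     -> " + replace(existing, body));
        System.out.println("_update -> " + partialUpdate(existing, body));
    }
}
```

With replace the hobby field disappears, while partialUpdate keeps it and only overwrites name and age, which is the same pattern as the two responses above.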

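2.8 Deleting a Document

A DELETE request on the document is all that is needed, for example DELETE /demo_01/_doc/1

3. Index Operations (Java API)

Establishing a connection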

RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("192.168.0.104",9200,"http")
                )
        );

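3.1 Creating an Index

// 1. Build the create-index request
CreateIndexRequest request = new CreateIndexRequest("user_info"); // the index is named user_info

// Set the mapping (optional; without it Elasticsearch infers the types dynamically)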
XContentBuilder builder = XContentFactory.jsonBuilder();
builder.startObject();
{
    builder.startObject("properties");
    {
        // user name field
        builder.startObject("user_name");
        {
            builder.field("type", "keyword");
        }
        builder.endObject();
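        // user hobby field
        builder.startObject("user_hobby");
        {
            builder.field("type", "text")
                // analyzer used at index time (ik_max_word and ik_smart come from the IK analysis plugin, which must be installed)
                .field("analyzer", "ik_max_word")
                // analyzer used at search time
                .field("search_analyzer", "ik_smart");
        }
        builder.endObject();
        // user age
        builder.startObject("user_age");
        {
            builder.field("type", "long");
        }
        builder.endObject();
        // user gender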
        builder.startObject("user_gender");
        {
            builder.field("type","keyword");
        }
        builder.endObject();
    }
    builder.endObject();
}
builder.endObject();


request.mapping(builder);
// 2. Execute the request through the client
CreateIndexResponse createIndexResponse = client.indices().create(request, RequestOptions.DEFAULT);
System.out.println(createIndexResponse);
client.close();

3.2 Deleting an Index

DeleteIndexRequest request = new DeleteIndexRequest("user_info");
AcknowledgedResponse delete = client.indices().delete(request, RequestOptions.DEFAULT);
client.close();

3.3 Querying an Index

GetIndexRequest request = new GetIndexRequest("mzlee_index");
GetIndexResponse response = client.indices().get(request, RequestOptions.DEFAULT);
client.close();

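4. Document Operations (Java API)

4.1 Inserting a Document

IndexRequest request = new IndexRequest(); // the constructors that also take a type are deprecated
request.index("user_info").id("1");
User user = new User("坤坤",23,"唱、跳、rapper","男");

// To index a document the data must be converted to JSON; a Map can also be passed directly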
ObjectMapper mapper = new ObjectMapper();
String userJson = mapper.writeValueAsString(user);
request.source(userJson, XContentType.JSON);

IndexResponse response = client.index(request, RequestOptions.DEFAULT);
System.out.println(response.getResult());
client.close();

4.2 Getting a Document

// Fetch the document
GetRequest request = new GetRequest();
request.index("user_info").id("1");

GetResponse response = client.get(request, RequestOptions.DEFAULT);
System.out.println(response.getSourceAsString());
client.close();

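4.3 Updating a Document

UpdateRequest updateRequest = new UpdateRequest();
updateRequest.index("user_info").id("1");

// the mapped field is user_gender; writing to "gender" would add a new, separate field
updateRequest.doc(XContentType.JSON,"user_gender","女");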
UpdateResponse response = client.update(updateRequest, RequestOptions.DEFAULT);
System.out.println(response.getResult());
client.close();

4.4 Deleting a Document

DeleteRequest request = new DeleteRequest();
request.index("user_info").id("1");

DeleteResponse response = client.delete(request, RequestOptions.DEFAULT);
System.out.println(response.getResult());
client.close();

4.5 Bulk Operations

BulkRequest bulkRequest = new BulkRequest();

bulkRequest.add(new IndexRequest().index("user_info").id("2").
                source(XContentType.JSON,"user_name","张三","user_age",24,"user_hobby","打篮球,看电视","user_gender","男"));

bulkRequest.add(new IndexRequest().index("user_info").id("3").
                source(XContentType.JSON,"user_name","李四","user_age",25,"user_hobby","打游戏,看电视","user_gender","女"));

bulkRequest.add(new IndexRequest().index("user_info").id("4").
                source(XContentType.JSON,"user_name","王二","user_age",26,"user_hobby","下棋,弹琴,打游戏","user_gender","男"));

bulkRequest.add(new IndexRequest().index("user_info").id("5").
                source(XContentType.JSON,"user_name","麻子","user_age",27,"user_hobby","上班","user_gender","男"));

bulkRequest.add(new IndexRequest().index("user_info").id("6").
                source(XContentType.JSON,"user_name","翠花","user_age",28,"user_hobby","逛街,打游戏,射箭","user_gender","女"));

bulkRequest.add(new IndexRequest().index("user_info").id("7").
                source(XContentType.JSON,"user_name","鸭蛋","user_age",29,"user_hobby","蹦极,跳伞,射箭,滑雪","user_gender","女"));

bulkRequest.add(new IndexRequest().index("user_info").id("8").
                source(XContentType.JSON,"user_name","小红","user_age",30,"user_hobby","上班","user_gender","女"));

client.bulk(bulkRequest, RequestOptions.DEFAULT);

client.close();

4.6 Full Query & Exact Query & Pagination & Sorting

SearchRequest request = new SearchRequest();
request.indices("user_info");
// 1. Match all documents
SearchSourceBuilder builder = new SearchSourceBuilder().query(QueryBuilders.matchAllQuery());

// 2. Exact (term) query
// SearchSourceBuilder builder = new SearchSourceBuilder().query(QueryBuilders.termQuery("user_age", 25));

// 3. Pagination
builder.from(0);
builder.size(2);

// 4. Sorting
builder.sort("user_age", SortOrder.ASC);
    
request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);

4.7 Filtering & Selecting Returned Fields

SearchRequest request = new SearchRequest();
request.indices("user_info");
// Restrict the fields returned in _source
SearchSourceBuilder builder = new SearchSourceBuilder().query(QueryBuilders.matchAllQuery());
// 1. Fields to exclude
String[] excludes = {"user_age"};
// 2. Fields to include
String[] include = {"user_gender","user_name"};
builder.fetchSource(include,excludes);
request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);

4.8 Compound Queries & Range Queries

SearchRequest request = new SearchRequest();
request.indices("user_info");

SearchSourceBuilder builder = new SearchSourceBuilder();
// 1. Compound query: boolQuery corresponds to bool in the JSON DSL
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
// 2. Range query: males aged between 23 and 30
boolQueryBuilder.must(QueryBuilders.rangeQuery("user_age").gte(23).lte(30));
boolQueryBuilder.must(QueryBuilders.matchQuery("user_gender","男"));
builder.query(boolQueryBuilder);

request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);

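must means AND and should means OR. When must and should are used at the same level, the should clauses no longer restrict the results: minimum_should_match defaults to 0 in that case, so they only influence scoring

Suppose we want the users who are male and aged 24 or 25

////// Written this way the should clauses have no filtering effect, and all male users are returned //////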
SearchSourceBuilder builder = new SearchSourceBuilder();
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();

boolQueryBuilder.must(QueryBuilders.matchQuery("user_gender","男"));
boolQueryBuilder.should(QueryBuilders.termQuery("user_age",24));
boolQueryBuilder.should(QueryBuilders.termQuery("user_age",25));

builder.query(boolQueryBuilder);

request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);

The correct way to write it:

Use two must clauses at the top level, and nest another BoolQueryBuilder inside the second must

SearchSourceBuilder builder = new SearchSourceBuilder();
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();

boolQueryBuilder.must(QueryBuilders.matchQuery("user_gender","男"));
// nested BoolQueryBuilder
BoolQueryBuilder boolQueryBuilder2 = QueryBuilders.boolQuery();
boolQueryBuilder2.should(QueryBuilders.termQuery("user_age",24));
boolQueryBuilder2.should(QueryBuilders.termQuery("user_age",25));
boolQueryBuilder.must(boolQueryBuilder2);

builder.query(boolQueryBuilder);

request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
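
Why the flat version returns every male user can be checked with a small in-memory model of bool matching. This is a sketch under simplifying assumptions: scoring is ignored, only must and should are modelled, and the rule is that should needs at least one match only when there is no must clause. The class names are hypothetical, and the data mirrors ids 2 to 8 from the bulk example in 4.5, with "M"/"F" standing in for the gender values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class BoolQuerySketch {
    static final class User {
        final String id;
        final int age;
        final String gender;
        User(String id, int age, String gender) { this.id = id; this.age = age; this.gender = gender; }
    }

    // Minimal model of a bool query's matching rule (no scoring).
    static final class Bool implements Predicate<User> {
        private final List<Predicate<User>> mustClauses = new ArrayList<>();
        private final List<Predicate<User>> shouldClauses = new ArrayList<>();

        Bool must(Predicate<User> clause) { mustClauses.add(clause); return this; }
        Bool should(Predicate<User> clause) { shouldClauses.add(clause); return this; }

        @Override
        public boolean test(User user) {
            for (Predicate<User> clause : mustClauses) {
                if (!clause.test(user)) return false;
            }
            // Default minimum_should_match: 1 if there are only should clauses, else 0.
            int required = (mustClauses.isEmpty() && !shouldClauses.isEmpty()) ? 1 : 0;
            long matched = shouldClauses.stream().filter(c -> c.test(user)).count();
            return matched >= required;
        }
    }

    static final List<User> USERS = Arrays.asList(
            new User("2", 24, "M"), new User("3", 25, "F"), new User("4", 26, "M"),
            new User("5", 27, "M"), new User("6", 28, "F"), new User("7", 29, "F"),
            new User("8", 30, "F"));

    // must(gender) with sibling should(age) clauses: the should clauses are optional.
    static Bool flatQuery() {
        return new Bool().must(u -> u.gender.equals("M"))
                .should(u -> u.age == 24).should(u -> u.age == 25);
    }

    // must(gender) plus must(nested bool with only should clauses).
    static Bool nestedQuery() {
        Bool ageIs24Or25 = new Bool().should(u -> u.age == 24).should(u -> u.age == 25);
        return new Bool().must(u -> u.gender.equals("M")).must(ageIs24Or25);
    }

    static List<String> search(List<User> users, Predicate<User> query) {
        return users.stream().filter(query).map(u -> u.id).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("flat:   " + search(USERS, flatQuery()));   // [2, 4, 5]
        System.out.println("nested: " + search(USERS, nestedQuery())); // [2]
    }
}
```

An alternative to nesting is to keep the flat structure and set minimumShouldMatch(1) on the BoolQueryBuilder, which makes the sibling should clauses mandatory again.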

4.9 Fuzzy Queries & Highlighting

SearchSourceBuilder builder = new SearchSourceBuilder();
// With Fuzziness.AUTO the maximum allowed edit distance depends on the length of the term
FuzzyQueryBuilder fuzziness = QueryBuilders.fuzzyQuery("user_name", "张S").fuzziness(Fuzziness.AUTO);
builder.query(fuzziness);

request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
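
How Fuzziness.AUTO picks the allowed edit distance can be sketched in a few lines. The thresholds below are the documented defaults (AUTO:3,6): terms of length 0 to 2 must match exactly, 3 to 5 allow one edit, longer terms allow two. The rest is assumption: the length is counted in code points, the distance is plain Levenshtein (Lucene also counts a transposition as a single edit), and the names are hypothetical.

```java
final class FuzzinessSketch {
    // Maximum edits allowed by AUTO with the default thresholds 3 and 6.
    static int autoMaxEdits(String term) {
        int length = term.codePointCount(0, term.length());
        if (length < 3) return 0;
        if (length < 6) return 1;
        return 2;
    }

    // Plain Levenshtein distance over code points (insert, delete, substitute).
    static int editDistance(String a, String b) {
        int[] s = a.codePoints().toArray();
        int[] t = b.codePoints().toArray();
        int[] previous = new int[t.length + 1];
        for (int j = 0; j <= t.length; j++) previous[j] = j;
        for (int i = 1; i <= s.length; i++) {
            int[] current = new int[t.length + 1];
            current[0] = i;
            for (int j = 1; j <= t.length; j++) {
                int cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
                current[j] = Math.min(Math.min(current[j - 1] + 1, previous[j] + 1),
                                      previous[j - 1] + cost);
            }
            previous = current;
        }
        return previous[t.length];
    }

    // Would the query term match the indexed term under AUTO fuzziness?
    static boolean fuzzyMatches(String queryTerm, String indexedTerm) {
        return editDistance(queryTerm, indexedTerm) <= autoMaxEdits(queryTerm);
    }

    public static void main(String[] args) {
        // "\u5f20S" is the two-character query term from the snippet above,
        // "\u5f20\u4e09" is the user_name of document 2 (escapes keep the source ASCII-only).
        System.out.println(autoMaxEdits("\u5f20S"));
        System.out.println(fuzzyMatches("\u5f20S", "\u5f20\u4e09"));
        System.out.println(fuzzyMatches("zhangs", "zhangsan"));
    }
}
```

If these rules hold, the two-character term 张S gets zero allowed edits under AUTO, so it would have to match exactly and may not find 张三. An explicit Fuzziness.ONE would be the thing to try in that case.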

// Highlight query (a separate example)
SearchSourceBuilder builder = new SearchSourceBuilder();
TermQueryBuilder termQuery = QueryBuilders.termQuery("user_gender", "男");
builder.query(termQuery);

HighlightBuilder highlightBuilder = new HighlightBuilder();
highlightBuilder.preTags("<font color='red'>");
highlightBuilder.postTags("</font>");
highlightBuilder.field("user_gender");
builder.highlighter(highlightBuilder);

request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);

4.10 Aggregation Queries

SearchSourceBuilder builder = new SearchSourceBuilder();
// Maximum age (the max aggregation returns the value, not the document)
MaxAggregationBuilder maxAgeAggregation = AggregationBuilders.max("maxAge").field("user_age");
builder.aggregation(maxAgeAggregation);
// Group by gender
TermsAggregationBuilder genderAggregation = AggregationBuilders.terms("genderGroup").field("user_gender");
builder.aggregation(genderAggregation);
posted @ 2021-08-15 15:44  mz-wesley