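Elasticsearch Basic Operations
1. Index Operations (HTTP Requests)
(Elasticsearch 7.8.x)
1.1 Creating an Index
Creating an index and adding data can be done in one step. When a document id is supplied, a PUT request works. Without an id, a PUT that relies on Elasticsearch to generate the id fails; use a POST request to add the data instead.
PUT /demo_01/_doc/1 ---> /index name/type/document id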
{
"name" : "张三",
"age" : 25
}
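The two request shapes described above can be sketched in Java with the JDK's built-in java.net.http types. This is a minimal sketch rather than part of the original notes: the class IndexRequests and its method names are hypothetical, and the requests are only built, not sent, so no running cluster is assumed.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Builds (but does not send) requests for the document index API. Hypothetical helper. */
final class IndexRequests {
    private IndexRequests() {}

    /** PUT /{index}/_doc/{id}: the caller chooses the document id. */
    static HttpRequest withId(String baseUrl, String index, String id, String json) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_doc/" + id))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    /** POST /{index}/_doc: no id in the path, so Elasticsearch generates one. */
    static HttpRequest withGeneratedId(String baseUrl, String index, String json) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_doc"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }
}
```

Sending either request (for example with HttpClient.newHttpClient().send(...)) would need a reachable node; an address such as http://localhost:9200 is an assumption.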
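You can also define mappings when creating the index to set its field names, field types, and inverted-index settings:
PUT /demo_01
{
"mappings": {
"properties": {
"name": {
"type": "text", -> field type (for example text, short, date, integer, object)
"index": true, -> whether the field is indexed (searchable); defaults to true
"store": true, -> whether the field value is stored separately from _source; defaults to false
"analyzer": "analyzer_name" -> placeholder: the name of the analyzer used to tokenize the field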
},
"age": {
"type": "long"
},
"birthday":{
"type": "date"
}
}
}
}
////////////////// Now suppose we index a document whose values do not match the mapped types //////////////////
POST /demo_01/_doc/2
{
"name":"张三",
"age":"二十五岁",
"birthday":"1996-01-01"
}
////////////////////// The server returns the following error ///////////////////////
{
"error" : {
"root_cause" : [
{
"type" : "mapper_parsing_exception",
"reason" : "failed to parse field [age] of type [long] in document with id '2'. Preview of field's value: '二十五岁'"
}
],
"type" : "mapper_parsing_exception",
"reason" : "failed to parse field [age] of type [long] in document with id '2'. Preview of field's value: '二十五岁'",
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "For input string: \"二十五岁\""
}
},
"status" : 400
}
1.2 Querying an Index
GET /demo_01/
################ Index information returned by the query ##################
{
"demo_01" : {
"aliases" : { },
"mappings" : {
"properties" : {
"age" : {
"type" : "long"
},
"birthday" : {
"type" : "date"
},
"name" : {
"type" : "text"
}
}
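},
"settings" : {
"index" : {
"creation_date" : "1628175732914",
"number_of_shards" : "1", ## number of primary shards per index; the default is 1 since Elasticsearch 7.0 (5 in earlier versions). It cannot be changed after the index is created
"number_of_replicas" : "1", ## number of replicas per primary shard; the default is 1. It can be changed at any time on a live index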
"uuid" : "UiGEkRRySRu2_ZFlwaj7-A",
"version" : {
"created" : "7060299"
},
"provided_name" : "demo_01"
}
}
}
}
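1.3 Deleting an Index
DELETE /demo_01
1.4 Modifying an Index
Elasticsearch does not recommend changing the mapping of an existing index. The usual approach is to create a new index and copy all data from the old index into it. I have not tried changing a mapping with zero downtime myself; the link below is recorded here to try when needed: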
https://www.cnblogs.com/createboke/p/12234184.html
The mapping can be modified in a few specific cases:
- Adding a new field
  POST /demo_01/_mapping { "properties":{ "hobby":{ "type":"text" } } }
- Turning a field into a multi-field. The existing type of a field cannot be changed; new type information can only be added through fields. In the example below, name is mapped as text, and adding fields gives it an additional keyword sub-field.
  POST /demo_01/_mapping { "properties":{ "name":{ "type": "text", "fields":{ "keyword":{ "type":"keyword", "ignore_above":10 } } } } }
- Adding new properties to an object field (nesting properties inside a field's mapping lets the field hold object data)
  POST /demo_01/_mapping { "properties": { "friends":{ "properties": { "ttt": { "type": "text" }, "aaa": { "type": "text" } } } } }
2. Document Operations (HTTP Requests)
2.1 Querying by Primary Key
GET /demo_01/_doc/1
{
"_index" : "demo_01",
"_type" : "_doc",
"_id" : "1",
"_version" : 3,
"_seq_no" : 3,
"_primary_term" : 3,
"found" : true,
"_source" : {
"name" : "lisi",
"age" : 19,
"birthday" : "1996-01-01",
"hobby" : "play game",
"friends" : {
"aaa" : "zhangsan",
"ttt" : "wanger"
}
}
}
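2.2 Querying All Documents
GET /demo_01/_doc/_search
(In 7.x, including _doc in the search path still works but is deprecated; the typeless form GET /demo_01/_search is preferred.)
2.3 Conditional Queries & Pagination
The query condition can be given in the URL (?q=field:value):
GET /demo_01/_doc/_search?q=name:lisi
The query can also be written in the request body: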
GET /demo_01/_doc/_search
{
"query":{
"match":{
"name":"lisi"
}
}
}
Using match_all in the request body, with no condition, returns all documents:
GET /demo_01/_doc/_search
{
"query":{
"match_all":{
}
}
}
With a very large data set, returning everything at once is impractical; use pagination instead:
GET /demo_01/_doc/_search
{
"query":{
"match_all":{
}
},
"from" : 0,
"size" : 1,
"_source":["title"], ## _source selects which fields are returned
"sort":{ ## sort selects the field to sort by and whether the order is ascending or descending
"age":{
"order":"desc"
}
}
}
### Query result ###
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4, ## 4 documents match in total, but hits contains only 1 because size is 1
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "demo_01",
"_type" : "_doc",
"_id" : "20",
"_score" : 1.0,
"_source" : {
"name" : "李四",
"age" : 20,
"birthday" : "1996-01-01",
"hobby" : "玩游戏",
"friends" : {
"aaa" : "张三",
"ttt" : "王二"
}
}
}
]
}
}
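The from value in a paginated search is an offset, not a page number. A minimal sketch of the usual conversion follows, assuming 1-based page numbers; the class Paging and its methods are hypothetical helpers, not part of any Elasticsearch client.

```java
/** Converts a 1-based page number into the from/size pair used by a search request. */
final class Paging {
    private Paging() {}

    /** Offset of the first hit on the given page: (page - 1) * size. */
    static int fromFor(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        return (pageNumber - 1) * pageSize;
    }

    /** JSON body of a match_all search for the given page. */
    static String searchBody(int pageNumber, int pageSize) {
        return String.format("{\"query\":{\"match_all\":{}},\"from\":%d,\"size\":%d}",
                fromFor(pageNumber, pageSize), pageSize);
    }
}
```

For example, page 3 with 10 hits per page corresponds to from 20 and size 10.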
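2.4 Multi-Condition Queries
Note that match analyzes (tokenizes) the search text, whereas term does not.
In a bool query, must corresponds to AND and should corresponds to OR.
must (find people whose hobby is play game and whose age is 19)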
GET /demo_01/_doc/_search
{
"query":{
"bool":{
"must":[
{
"match":{
"hobby":"play game"
}
},
{
"match":{
"age":19
}
}
]
}
}
}
## The stored hobby values are all play game. match tokenizes ss game into ss and game, and the token game matches the game entries in the inverted index
GET /demo_01/_doc/_search
{
"query":{
"bool":{
"must":[
{
"match":{
"hobby":"ss game"
}
},
{
"match":{
"age":19
}
}
]
}
}
}
## term does not tokenize ss game; it looks up ss game as a single term, so no documents are found
GET /demo_01/_doc/_search
{
"query":{
"bool":{
"must":[
{
"term":{
"hobby":"ss game"
}
},
{
"match":{
"age":19
}
}
]
}
}
}
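The difference between match and term can be illustrated with a toy inverted index. This is a simplified model under stated assumptions, not Elasticsearch code: the class ToyInvertedIndex is hypothetical, and its analyze step (lowercase plus whitespace split) only approximates what a real analyzer does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index: token -> ids of the documents containing it. */
final class ToyInvertedIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Simplified analyzer: lowercase, then split on whitespace. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Indexing analyzes the text and records each token separately. */
    void index(int docId, String text) {
        for (String token : analyze(text)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    /** match: analyze the query text and return documents containing any of its tokens. */
    Set<Integer> match(String queryText) {
        Set<Integer> result = new TreeSet<>();
        for (String token : analyze(queryText)) {
            result.addAll(postings.getOrDefault(token, Set.of()));
        }
        return result;
    }

    /** term: look the raw text up as a single term, without analysis. */
    Set<Integer> term(String rawTerm) {
        return new TreeSet<>(postings.getOrDefault(rawTerm, Set.of()));
    }
}
```

With a single document "play game" indexed, match("ss game") finds it through the token game, while term("ss game") finds nothing because no single token equals "ss game".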
should (find people whose hobby is play game or 玩游戏)
GET /demo_01/_doc/_search
{
"query":{
"bool":{
"should":[
{
"match":{
"hobby":"play game"
}
},
{
"match":{
"hobby":"玩游戏"
}
}
]
}
}
}
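filter with a range: find people whose hobby is play game or 玩游戏, but only those older than 18. When a bool query also contains a filter (or must) clause, its should clauses are optional by default, so minimum_should_match is set to 1 to require at least one of them.
GET /demo_01/_doc/_search
{
"query":{
"bool":{
"should":[
{
"match":{
"hobby":"play game"
}
},
{
"match":{
"hobby":"玩游戏"
}
}
],
"minimum_should_match": 1,
"filter":{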
"range":{
"age":{
"gt":18
}
}
}
}
}
}
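2.5 Full-Text Search & Phrase Matching & Highlighting
match: full-text search
match_phrase: phrase match (the analyzed terms must appear together and in the same order)
highlight: highlighting of matched terms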
GET /demo_01/_doc/_search
{
"query":{
"bool":{
"should":[
{
"match":{
"hobby":"play game"
}
},
{
"match":{
"hobby":"玩游戏"
}
}
],
"minimum_should_match": 1,
"filter":{
"range":{
"age":{
"gt":18
}
}
}
}
},
"highlight":{
"fields":{
"hobby":{}
}
}
}
############### Query result ###############
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 4.6003523,
"hits" : [
{
"_index" : "demo_01",
"_type" : "_doc",
"_id" : "20",
"_score" : 4.6003523,
"_source" : {
"name" : "李四",
"age" : 20,
"birthday" : "1996-01-01",
"hobby" : "玩游戏",
"friends" : {
"aaa" : "张三",
"ttt" : "王二"
}
},
"highlight" : {
"hobby" : [
"<em>玩</em><em>游</em><em>戏</em>" ### matched terms are wrapped in <em> tags, so they render highlighted in HTML
]
}
},
{
"_index" : "demo_01",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.37363437,
"_source" : {
"name" : "李四",
"age" : 20,
"birthday" : "1996-01-01",
"hobby" : "play game",
"friends" : {
"aaa" : "zhangsan",
"ttt" : "wanger"
}
},
"highlight" : {
"hobby" : [
"<em>play</em> <em>game</em>" ### matched terms are wrapped in <em> tags, so they render highlighted in HTML
]
}
}
]
}
}
2.6 Aggregation Queries
terms: group and count
GET /demo_01/_doc/_search
{
"aggs":{ ## aggregation section
"age_group":{ ## aggregation name, chosen freely
"terms":{ ## group into buckets
"field": "age" ## field to group by
}
}
},
"size":0 ## return no hits, only the aggregation results
}
### Query result
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 8,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"age_group" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 19, ##### the result shows 2 documents with age 19
"doc_count" : 2
},
{
"key" : 20,
"doc_count" : 2
},
{
"key" : 15,
"doc_count" : 1
},
{
"key" : 16,
"doc_count" : 1
},
{
"key" : 17,
"doc_count" : 1
},
{
"key" : 18,
"doc_count" : 1
}
]
}
}
}
avg: average
GET /demo_01/_doc/_search
{
"aggs":{
"age_avg":{
"avg":{
"field": "age"
}
}
},
"size":0
}
######## Query result #######
{
"took" : 277,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 8,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"age_avg" : {
"value" : 18.0 ##### the average age is 18
}
}
}
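What the terms and avg aggregations compute can be mirrored in plain Java with streams. This is an illustrative sketch only: the class AgeAggregations is hypothetical, and the computation runs in memory rather than inside Elasticsearch. The ages used in the note after the code are the eight values implied by the bucket counts above.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** In-memory counterparts of the terms and avg aggregations on an age field. */
final class AgeAggregations {
    private AgeAggregations() {}

    /** Like a terms aggregation: one bucket per distinct age with its document count. */
    static Map<Integer, Long> countByAge(List<Integer> ages) {
        return ages.stream()
                .collect(Collectors.groupingBy(age -> age, TreeMap::new, Collectors.counting()));
    }

    /** Like an avg aggregation; returns NaN for an empty list (a choice made for this sketch). */
    static double averageAge(List<Integer> ages) {
        return ages.stream().mapToInt(Integer::intValue).average().orElse(Double.NaN);
    }
}
```

For the ages 19, 19, 20, 20, 15, 16, 17, 18 the sum is 144, so the average over 8 documents is 18.0, and the bucket for 19 has a count of 2, which agrees with the responses above.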
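2.7 Updating Documents
A PUT request replaces the whole JSON document. A POST request to the _update endpoint changes only the submitted fields, leaves the other data untouched, and adds any submitted field that does not exist yet.
- The document with id 1 in demo_01 currently looks like this:
## Query by primary key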
GET /demo_01/_doc/1
{
"_index" : "demo_01",
"_type" : "_doc",
"_id" : "1",
"_version" : 3,
"_seq_no" : 3,
"_primary_term" : 3,
"found" : true,
"_source" : {
"name" : "lisi",
"age" : 19,
"birthday" : "1996-01-01",
"hobby" : "play game",
"friends" : {
"aaa" : "zhangsan",
"ttt" : "wanger"
}
}
}
- Full replacement with PUT
PUT /demo_01/_doc/1
{
"name": "李四",
"age": 20,
"birthday": "1996-01-01"
}
## Result after the change
{
"_index" : "demo_01",
"_type" : "_doc",
"_id" : "1",
"_version" : 4,
"_seq_no" : 6,
"_primary_term" : 3,
"found" : true,
"_source" : {
"name" : "李四",
"age" : 20,
"birthday" : "1996-01-01"
}
}
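After the PUT request, only three fields remain in the document; the fields that were not in the PUT body have been removed.
- Partial update with POST
Simply changing the PUT above to POST gives the same result: a full replacement. To update only some fields with POST, use the approach below.
Append _update to the request path and wrap the fields to change in a "doc" object. If the request contains a field that did not exist before, the field is added to the document. (In 7.x the typeless path POST /demo_01/_update/1 is preferred; the form below still works but is deprecated.)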
POST /demo_01/_doc/1/_update
{
"doc":{
"name": "李四",
"age": 20,
"birthday": "1996-01-01"
}
}
##### Query result #####
{
"_index" : "demo_01",
"_type" : "_doc",
"_id" : "1",
"_version" : 12,
"_seq_no" : 14,
"_primary_term" : 3,
"found" : true,
"_source" : {
"name" : "李四",
"age" : 20,
"birthday" : "1996-01-01",
"hobby" : "play game",
"friends" : {
"aaa" : "zhangsan",
"ttt" : "wanger"
}
}
}
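The contrast between full replacement and partial update can be modeled with plain maps. This is a toy model under stated assumptions, not Elasticsearch code: the class DocumentStore is hypothetical, and update merges only top-level fields, whereas Elasticsearch merges nested objects recursively.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy document store contrasting PUT (replace) with _update (merge). */
final class DocumentStore {
    private final Map<String, Map<String, Object>> docs = new LinkedHashMap<>();

    /** Models PUT /index/_doc/{id}: the stored source is replaced as a whole. */
    void put(String id, Map<String, Object> source) {
        docs.put(id, new LinkedHashMap<>(source));
    }

    /** Models an _update with a "doc" body: fields are merged into the existing source. */
    void update(String id, Map<String, Object> partialDoc) {
        Map<String, Object> existing = docs.get(id);
        if (existing == null) {
            throw new IllegalStateException("document missing: " + id);
        }
        existing.putAll(partialDoc); // overwrites matching fields, adds new ones, keeps the rest
    }

    /** Returns the stored source, or null if the id is unknown. */
    Map<String, Object> get(String id) {
        return docs.get(id);
    }
}
```

After put followed by update, fields absent from the partial document (such as hobby) survive; after a second put they are gone.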
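2.8 Deleting a Document
Send a DELETE request for the document, for example DELETE /demo_01/_doc/1.
3. Index Operations (Java API)
Establishing a connection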
RestHighLevelClient client = new RestHighLevelClient(
RestClient.builder(
new HttpHost("192.168.0.104",9200,"http")
)
);
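3.1 Creating an Index
// 1. Build the create-index request
CreateIndexRequest request = new CreateIndexRequest("user_info"); // create an index named user_info
// Set the mapping (optional; without it Elasticsearch infers field types dynamically)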
XContentBuilder builder = XContentFactory.jsonBuilder();
builder.startObject();
{
builder.startObject("properties");
{
// user name field
builder.startObject("user_name");
{
builder.field("type", "keyword");
}
builder.endObject();
// user hobby field
builder.startObject("user_hobby");
{
builder.field("type", "text")
// analyzer used at index time
.field("analyzer", "ik_max_word")
// analyzer used at search time
.field("search_analyzer", "ik_smart");
}
builder.endObject();
// user age field
builder.startObject("user_age");
{
builder.field("type", "long");
}
builder.endObject();
// user gender field
builder.startObject("user_gender");
{
builder.field("type","keyword");
}
builder.endObject();
}
builder.endObject();
}
builder.endObject();
request.mapping(builder);
// 2. Execute the request with the client
CreateIndexResponse createIndexResponse = client.indices().create(request, RequestOptions.DEFAULT);
System.out.println(createIndexResponse);
client.close();
3.2 Deleting an Index
DeleteIndexRequest request = new DeleteIndexRequest("user_info");
AcknowledgedResponse delete = client.indices().delete(request, RequestOptions.DEFAULT);
client.close();
3.3 Querying an Index
GetIndexRequest request = new GetIndexRequest("mzlee_index");
GetIndexResponse response = client.indices().get(request, RequestOptions.DEFAULT);
client.close();
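4. Document Operations (Java API)
4.1 Inserting a Document
IndexRequest request = new IndexRequest(); // the constructors that also take a type are deprecated, so index and id are set below
request.index("user_info").id("1");
User user = new User("坤坤",23,"唱、跳、rapper","男");
// The document must be sent as JSON (serialized here with Jackson's ObjectMapper); a Map can also be passed directly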
ObjectMapper mapper = new ObjectMapper();
String userJson = mapper.writeValueAsString(user);
request.source(userJson, XContentType.JSON);
IndexResponse response = client.index(request, RequestOptions.DEFAULT);
System.out.println(response.getResult());
client.close();
4.2 Fetching a Document
// Fetch the document by id
GetRequest request = new GetRequest();
request.index("user_info").id("1");
GetResponse response = client.get(request, RequestOptions.DEFAULT);
System.out.println(response.getSourceAsString());
client.close();
4.3 Updating a Document
UpdateRequest updateRequest = new UpdateRequest();
updateRequest.index("user_info").id("1");
updateRequest.doc(XContentType.JSON,"user_gender","女");
UpdateResponse response = client.update(updateRequest, RequestOptions.DEFAULT);
System.out.println(response.getResult());
client.close();
4.4 Deleting a Document
DeleteRequest request = new DeleteRequest();
request.index("user_info").id("1");
DeleteResponse response = client.delete(request, RequestOptions.DEFAULT);
System.out.println(response.getResult());
client.close();
4.5 Bulk Operations
BulkRequest bulkRequest = new BulkRequest();
bulkRequest.add(new IndexRequest().index("user_info").id("2").
source(XContentType.JSON,"user_name","张三","user_age",24,"user_hobby","打篮球,看电视","user_gender","男"));
bulkRequest.add(new IndexRequest().index("user_info").id("3").
source(XContentType.JSON,"user_name","李四","user_age",25,"user_hobby","打游戏,看电视","user_gender","女"));
bulkRequest.add(new IndexRequest().index("user_info").id("4").
source(XContentType.JSON,"user_name","王二","user_age",26,"user_hobby","下棋,弹琴,打游戏","user_gender","男"));
bulkRequest.add(new IndexRequest().index("user_info").id("5").
source(XContentType.JSON,"user_name","麻子","user_age",27,"user_hobby","上班","user_gender","男"));
bulkRequest.add(new IndexRequest().index("user_info").id("6").
source(XContentType.JSON,"user_name","翠花","user_age",28,"user_hobby","逛街,打游戏,射箭","user_gender","女"));
bulkRequest.add(new IndexRequest().index("user_info").id("7").
source(XContentType.JSON,"user_name","鸭蛋","user_age",29,"user_hobby","蹦极,跳伞,射箭,滑雪","user_gender","女"));
bulkRequest.add(new IndexRequest().index("user_info").id("8").
source(XContentType.JSON,"user_name","小红","user_age",30,"user_hobby","上班","user_gender","女"));
client.bulk(bulkRequest, RequestOptions.DEFAULT);
client.close();
4.6 Match-All & Term Queries & Pagination & Sorting
SearchRequest request = new SearchRequest();
request.indices("user_info");
// 1. Match-all query
SearchSourceBuilder builder = new SearchSourceBuilder().query(QueryBuilders.matchAllQuery());
// 2. Term (exact-value) query
// SearchSourceBuilder builder = new SearchSourceBuilder().query(QueryBuilders.termQuery("user_age", 25));
// 3. Pagination
builder.from(0);
builder.size(2);
// 4. Sorting
builder.sort("user_age", SortOrder.ASC);
request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
4.7 Source Filtering (Including & Excluding Fields)
SearchRequest request = new SearchRequest();
request.indices("user_info");
// Filter the returned fields
SearchSourceBuilder builder = new SearchSourceBuilder().query(QueryBuilders.matchAllQuery());
// 1. Fields to exclude
String[] excludes = {"user_age"};
// 2. Fields to include
String[] include = {"user_gender","user_name"};
builder.fetchSource(include,excludes);
request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
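The effect of the include and exclude lists can be pictured with a small in-memory filter. This is a simplified sketch, not the client's implementation: the class ToySourceFilter is hypothetical, it handles only flat field names, and it ignores the wildcard patterns that real source filtering accepts.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Toy version of source filtering on a flat document. */
final class ToySourceFilter {
    private ToySourceFilter() {}

    /** Keeps a field if it is included (or no includes are given) and not excluded. */
    static Map<String, Object> apply(Map<String, Object> source, String[] includes, String[] excludes) {
        Set<String> include = new HashSet<>(Arrays.asList(includes));
        Set<String> exclude = new HashSet<>(Arrays.asList(excludes));
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : source.entrySet()) {
            boolean included = include.isEmpty() || include.contains(entry.getKey());
            if (included && !exclude.contains(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }
}
```

With the lists used in the snippet above, only user_gender and user_name would remain in each hit's source; the exclude list matters mainly when the include list is empty or uses broader patterns.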
4.8 Compound Queries & Range Queries
SearchRequest request = new SearchRequest();
request.indices("user_info");
SearchSourceBuilder builder = new SearchSourceBuilder();
// 1. Compound query: boolQuery corresponds to bool in the JSON DSL
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
// 2. Range query: males aged 23 to 30
boolQueryBuilder.must(QueryBuilders.rangeQuery("user_age").gte(23).lte(30));
boolQueryBuilder.must(QueryBuilders.matchQuery("user_gender","男"));
builder.query(boolQueryBuilder);
request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
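must means AND and should means OR. When must and should appear at the same level of a bool query, the should clauses become optional (minimum_should_match defaults to 0): they only influence scoring and no longer restrict the results.
Suppose we want the users who are male and aged 24 or 25.
////// Written this way, the should clauses do not restrict the results, so every male user is returned //////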
SearchSourceBuilder builder = new SearchSourceBuilder();
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.must(QueryBuilders.matchQuery("user_gender","男"));
boolQueryBuilder.should(QueryBuilders.termQuery("user_age",24));
boolQueryBuilder.should(QueryBuilders.termQuery("user_age",25));
builder.query(boolQueryBuilder);
request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
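A correct way to write it:
Use two must clauses at the top level and nest a second BoolQueryBuilder, holding the should clauses, inside the second must. (Calling minimumShouldMatch(1) on the outer bool query is an alternative.)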
SearchSourceBuilder builder = new SearchSourceBuilder();
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.must(QueryBuilders.matchQuery("user_gender","男"));
// Nested BoolQueryBuilder holding the OR conditions
BoolQueryBuilder boolQueryBuilder2 = QueryBuilders.boolQuery();
boolQueryBuilder2.should(QueryBuilders.termQuery("user_age",24));
boolQueryBuilder2.should(QueryBuilders.termQuery("user_age",25));
boolQueryBuilder.must(boolQueryBuilder2);
builder.query(boolQueryBuilder);
request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
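Why the flat must-plus-should query returns every male user can be seen in a toy evaluator for bool semantics. This is a simplified model, not Elasticsearch code: ToyUser and ToyBoolQuery are hypothetical, scoring is ignored, and only the documented default for minimum_should_match (1 when there are should clauses but no must or filter clauses, otherwise 0) is reproduced.

```java
import java.util.List;
import java.util.function.Predicate;

/** Minimal stand-in for a user document. */
final class ToyUser {
    final String gender;
    final int age;

    ToyUser(String gender, int age) {
        this.gender = gender;
        this.age = age;
    }
}

/** Toy bool query: all must clauses have to match, plus at least minimumShouldMatch should clauses. */
final class ToyBoolQuery {
    private final List<Predicate<ToyUser>> must;
    private final List<Predicate<ToyUser>> should;
    private final int minimumShouldMatch;

    /** Applies the default: should clauses are required only when there is no must clause. */
    ToyBoolQuery(List<Predicate<ToyUser>> must, List<Predicate<ToyUser>> should) {
        this(must, should, (must.isEmpty() && !should.isEmpty()) ? 1 : 0);
    }

    ToyBoolQuery(List<Predicate<ToyUser>> must, List<Predicate<ToyUser>> should, int minimumShouldMatch) {
        this.must = must;
        this.should = should;
        this.minimumShouldMatch = minimumShouldMatch;
    }

    boolean matches(ToyUser user) {
        for (Predicate<ToyUser> clause : must) {
            if (!clause.test(user)) {
                return false;
            }
        }
        long matchedShould = should.stream().filter(clause -> clause.test(user)).count();
        return matchedShould >= minimumShouldMatch;
    }
}
```

In this model a 26-year-old male matches the flat query, because the should clauses are optional next to must, but stops matching once minimumShouldMatch is 1, which is the same effect the nested BoolQueryBuilder achieves.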
4.9 Fuzzy Queries & Highlighting
SearchSourceBuilder builder = new SearchSourceBuilder();
// With Fuzziness.AUTO, the maximum allowed edit distance depends on the length of the term
FuzzyQueryBuilder fuzziness = QueryBuilders.fuzzyQuery("user_name", "张S").fuzziness(Fuzziness.AUTO);
builder.query(fuzziness);
request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
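The length rule behind Fuzziness.AUTO can be written out as a small function. This sketch reproduces the documented default thresholds (terms of 0 to 2 characters must match exactly, 3 to 5 allow one edit, longer terms allow two); the class AutoFuzziness is hypothetical, and counting length in Unicode code points is an assumption of this sketch.

```java
/** Default AUTO fuzziness rule: the allowed edit distance grows with the term length. */
final class AutoFuzziness {
    private AutoFuzziness() {}

    /** Maximum number of edits allowed for a term of the given length. */
    static int maxEdits(int termLength) {
        if (termLength < 0) {
            throw new IllegalArgumentException("termLength must be >= 0");
        }
        if (termLength <= 2) {
            return 0; // very short terms must match exactly
        }
        if (termLength <= 5) {
            return 1;
        }
        return 2;
    }

    /** Convenience overload; length is counted in code points (an assumption of this sketch). */
    static int maxEdits(String term) {
        return maxEdits(term.codePointCount(0, term.length()));
    }
}
```

Under this rule a two-character term gets an edit distance of 0, so a fuzzy query on such a short term behaves like an exact match; an explicit value such as Fuzziness.ONE may be needed if one edit should be tolerated.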
// Highlighting (a separate snippet from the fuzzy query above)
SearchSourceBuilder builder = new SearchSourceBuilder();
TermQueryBuilder termQuery = QueryBuilders.termQuery("user_gender", "男");
builder.query(termQuery);
HighlightBuilder highlightBuilder = new HighlightBuilder();
highlightBuilder.preTags("<font color='red'>");
highlightBuilder.postTags("</font>");
highlightBuilder.field("user_gender");
builder.highlighter(highlightBuilder);
request.source(builder);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
4.10 Aggregation Queries
SearchSourceBuilder builder = new SearchSourceBuilder();
// Maximum age
MaxAggregationBuilder maxAgeAggregation = AggregationBuilders.max("maxAge").field("user_age");
// Group by gender
TermsAggregationBuilder genderAggregation = AggregationBuilders.terms("genderGroup").field("user_gender");
builder.aggregation(maxAgeAggregation);
builder.aggregation(genderAggregation);