Creating Elasticsearch Indices in Spring Boot

Java 8

Spring Boot 2.5.3

Elasticsearch 7.14.0 (Windows)

---

 

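Contents

Creating index 1: @Document annotation only

Creating index 2: @Document annotation + @Setting annotation

Creating index 3: the settingPath attribute of @Setting

How do you write the JSON file for the settingPath attribute?

Creating index 4: combining several @Setting attributes

Testing the sort attributes

Testing the useServerConfiguration attribute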

 

Creating index 1

@Document annotation only

// Uses only the @Document annotation
import lombok.Data;
import org.springframework.data.elasticsearch.annotations.Document;

@Document(indexName = "news")
@Data
public class News {

    //...
    
}

Starting the application prints the following log:

org.elasticsearch.client.RestClient : request [HEAD http://localhost:9200/news?ignore_throttled=false&ignore_unavailable=false&expand_wildcards=open%2Cclosed&allow_no_indices=false] returned 1
org.elasticsearch.client.RestClient : request [PUT http://localhost:9200/news?master_timeout=30s&timeout=30s] returned 1
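
The two log lines show a HEAD request that checks whether the index exists, followed by a PUT that creates it. The sequence can be sketched with the JDK alone. This is a minimal illustration, not Spring Data Elasticsearch's own code: the class and method names are hypothetical, and the PUT here sends no body, so the index would be created with server defaults.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper that mirrors the HEAD-then-PUT sequence seen in the log.
final class EnsureIndexSketch {

    // Sends a request without a body and returns the HTTP status code.
    static int call(String method, String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod(method);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    // HEAD /{index} answers 200 when the index exists and 404 when it does not.
    static boolean shouldCreate(int headStatus) {
        return headStatus == 404;
    }

    // Creates the index (server defaults only) if it is missing.
    static void ensureIndex(String baseUrl, String index) throws IOException {
        String url = baseUrl + "/" + index;
        if (shouldCreate(call("HEAD", url))) {
            call("PUT", url);
        }
    }
}
```

Calling EnsureIndexSketch.ensureIndex("http://localhost:9200", "news") assumes an unsecured local node, as in the log above.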

Run GET /news in Kibana:

{
  "news" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "_class" : {
          "type" : "keyword",
          "index" : false,
          "doc_values" : false
        }
      }
    },
    "settings" : {
      "index" : {
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "refresh_interval" : "1s",
        "number_of_shards" : "1",
        "provided_name" : "news",
        "creation_date" : "1631870238379",
        "store" : {
          "type" : "fs"
        },
        "number_of_replicas" : "1",
        "uuid" : "tNNIub-dTqqPnKq6uiwdlA",
        "version" : {
          "created" : "7140099"
        }
      }
    }
  }
}

Adding and querying data both succeed.

 

Creating index 2

@Document + @Setting, first approach

// Uses the @Setting annotation
@Document(indexName = "news")
@Setting(shards = 2, replicas = 2, refreshInterval = "2s", indexStoreType = "mmapfs")
@Data
public class News {
}

Run GET /news in Kibana:

{
  "news" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "_class" : {
          "type" : "keyword",
          "index" : false,
          "doc_values" : false
        }
      }
    },
    "settings" : {
      "index" : {
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "refresh_interval" : "2s",
        "number_of_shards" : "2",
        "provided_name" : "news",
        "creation_date" : "1631870945579",
        "store" : {
          "type" : "mmapfs"
        },
        "number_of_replicas" : "2",
        "uuid" : "nSXP0GowT4eCHmHJS-kKGQ",
        "version" : {
          "created" : "7140099"
        }
      }
    }
  }
}

Adding and querying data both succeed.

 

Creating index 3

@Document + @Setting, second approach: the settingPath attribute of @Setting

@Document(indexName = "news")
@Setting(settingPath = "indices/news.setting.json")
@Data
public class News {
}

The news.setting.json file lives under /src/main/resources/indices and contains:

{
	"number_of_shards": 2,
	"refresh_interval": "2s",
	"analyze": {
		"max_token_count": 500
	}
}

Run GET /news in Kibana:

{
  "news" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "_class" : {
          "type" : "keyword",
          "index" : false,
          "doc_values" : false
        }
      }
    },
    "settings" : {
      "index" : {
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "refresh_interval" : "2s",
        "number_of_shards" : "2",
        "provided_name" : "news",
        "analyze" : {
          "max_token_count" : "500"
        },
        "creation_date" : "1631871387436",
        "number_of_replicas" : "1",
        "uuid" : "zEDrWnOSQkG_necQitELUw",
        "version" : {
          "created" : "7140099"
        }
      }
    }
  }
}

Adding and querying data both succeed.

 

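How do you write the JSON file for the settingPath attribute?

See the Index Modules page of the official documentation, which spells this out clearly.

https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules.html

The index-modules.html page above is only the starting point; it links to the pages for the other index-level settings.

According to that page, settings can be configured per index rather than relying on cluster-wide configuration, and index-level settings come in two kinds:

static: can only be set when the index is created or on a closed index, for example the number of shards (index.number_of_shards);

dynamic: can be changed on a live index through the update-index-settings API, for example the number of replicas (index.number_of_replicas);

Neither kind is tied to any particular index module.

The official documentation warns that changing index settings can be risky, so it is best to plan them in advance and avoid changing them afterwards.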
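
To make the static/dynamic distinction concrete, here is a minimal sketch of the request that changes a dynamic setting on a live index. It only builds the path and JSON body for the update-index-settings API; the class and method names are hypothetical, and sending the request is left to whatever HTTP client is in use. A static setting such as index.number_of_shards cannot be changed this way on an open index.

```java
// Hypothetical helper; builds the request pieces for the update-index-settings API.
final class UpdateSettingsSketch {

    // Path of the update-index-settings endpoint for one index.
    static String path(String index) {
        return "/" + index + "/_settings";
    }

    // JSON body that changes a dynamic setting, here the replica count.
    static String replicasBody(int replicas) {
        return "{\"index\":{\"number_of_replicas\":" + replicas + "}}";
    }

    public static void main(String[] args) {
        // Intended to be sent as a PUT to the path, with the body below
        System.out.println("PUT " + path("news"));
        System.out.println(replicasBody(2));
    }
}
```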

 

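Beyond the settings above, the other index modules can also be configured:

Analysis (important): analyzers, tokenizers, and so on

Index shard allocation

Mapping (important)

Merging

Similarities

Slowlog

Store

Translog

History retention

Indexing pressure

Index lifecycle management

The official documentation covers each of them in detail.

I marked two of them as important because, in my experience so far, they are the ones most commonly configured.

Compared with setting attributes directly on the annotation, a JSON file can express a fuller range of settings.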

 

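Creating index 4

When approaches 2 and 3 are combined, does index creation succeed? If so, whose settings win?

The annotation attributes use 3; the file uses 2.

Note: every value set directly on the annotation has been changed to 3.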

// News.java
@Setting(shards = 3, replicas = 3, refreshInterval = "3s", indexStoreType = "mmapfs", 
	settingPath = "indices/news.setting.json")
    
/*
Contents of news.setting.json:
{
	"number_of_shards": 2,
	"number_of_replicas": 2,
	"refresh_interval": "2s",
	"analyze": {
		"max_token_count": 500
	},
	"store": {
		"type": "hybridfs"
	}
}
*/

Run GET /news in Kibana:

{
  "news" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "_class" : {
          "type" : "keyword",
          "index" : false,
          "doc_values" : false
        }
      }
    },
    "settings" : {
      "index" : {
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "refresh_interval" : "2s",
        "number_of_shards" : "2",
        "provided_name" : "news",
        "analyze" : {
          "max_token_count" : "500"
        },
        "creation_date" : "1631871707807",
        "store" : {
          "type" : "hybridfs"
        },
        "number_of_replicas" : "2",
        "uuid" : "uZt7irx9RhKMoYQtuqEt1A",
        "version" : {
          "created" : "7140099"
        }
      }
    }
  }
}

The response above shows that the JSON file takes precedence.

Adding and querying data both succeed.

 

Moving settingPath to the front makes no difference:

@Setting(settingPath = "indices/news.setting.json", 
	shards = 3, replicas = 3, refreshInterval = "3s", indexStoreType = "mmapfs")

 

@Setting also has several attributes whose names start with sort, plus useServerConfiguration; the experiments with them follow.

 

Testing the sort attributes

Sort test 1: startup fails because the field has no @Field annotation

@Document(indexName = "news")
@Setting(
	sortFields = { "id" },
	sortOrders = {Setting.SortOrder.asc}
)
@Data
public class News {

	@Id
	private String id;
    
}

Application startup fails:
Caused by: org.springframework.beans.BeanInstantiationException: Failed to instantiate 
[org.springframework.data.elasticsearch.repository.support.SimpleElasticsearchRepository]: 
Constructor threw exception; nested exception is java.lang.IllegalStateException: Required annotation 
interface org.springframework.data.elasticsearch.annotations.Field not found for id!


Sort test 2: startup fails; a bare @Field on the id field is not accepted either

Add @Field to the id field:
	@Id
	@Field
	private String id;
    
Application startup fails:
Constructor threw exception; nested exception is java.lang.IllegalArgumentException: 
field type Auto not allowed for sortField



Change: use title instead
@Setting(
	sortFields = { "title" },
	sortOrders = {Setting.SortOrder.asc}
)
@Data
public class News {
	@Field
	private String title;
}

Startup fails:
Constructor threw exception; nested exception is java.lang.IllegalArgumentException: 
field type Auto not allowed for sortField

 

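Note: the type Auto in the message above refers to the type attribute of @Field, whose values are defined in org.springframework.data.elasticsearch.annotations.FieldType. Auto is the default when type is not specified.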
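
The check behind this message can be modeled roughly as follows. This is a sketch, not Spring Data Elasticsearch's implementation: the enum is a hypothetical stand-in for a few FieldType values, and the set of accepted types is an assumption taken from the Elasticsearch index-sorting documentation, which allows boolean, numeric, date, and keyword fields.

```java
import java.util.EnumSet;
import java.util.Set;

// Models the sort-field type check; NOT the library's actual code.
final class SortFieldTypeSketch {

    // Hypothetical stand-in for a few values of Spring Data's FieldType enum.
    enum FieldType { Auto, Text, Keyword, Long, Date, Boolean }

    // Assumption: only these types may back an index sort field.
    private static final Set<FieldType> SORTABLE =
            EnumSet.of(FieldType.Keyword, FieldType.Long, FieldType.Date, FieldType.Boolean);

    // Throws with the same message as the startup error when the type is not sortable.
    static void checkSortable(FieldType type) {
        if (!SORTABLE.contains(type)) {
            throw new IllegalArgumentException("field type " + type + " not allowed for sortField");
        }
    }
}
```

Under this model a field left at the default Auto is rejected, while one declared with an explicit sortable type passes.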

 

Sort test 3: following the official documentation, set the type attribute of @Field

Both the id and title fields:
@Setting(
	sortFields = { "id", "title" },
	sortOrders = {Setting.SortOrder.asc, Setting.SortOrder.asc}
)
@Data
public class News {

	@Id
	@Field(type=FieldType.Keyword)
	private String id;

	/**
	 * Title
	 */
	@Field(type=FieldType.Keyword)
	private String title;
    
}

Application startup succeeds!

Running GET /news in Kibana shows:

properties now contains the id and title nodes;

settings now also contains a sort node;

{
  "news" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "_class" : {
          "type" : "keyword",
          "index" : false,
          "doc_values" : false
        },
        "id" : {
          "type" : "keyword"
        },
        "title" : {
          "type" : "keyword"
        }
      }
    },
    "settings" : {
      "index" : {
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "refresh_interval" : "1s",
        "number_of_shards" : "1",
        "provided_name" : "news",
        "creation_date" : "1631965403582",
        "sort" : {
          "field" : [
            "id",
            "title"
          ],
          "order" : [
            "asc",
            "asc"
          ]
        },
        "store" : {
          "type" : "fs"
        },
        "number_of_replicas" : "1",
        "uuid" : "ZyHlxd1sQAuTnR4by-lReg",
        "version" : {
          "created" : "7140099"
        }
      }
    }
  }
}

The configuration works.

Adding and querying data both succeed.

 

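Running GET /news in Kibana again shows that, after adding data, the sort node under settings is unchanged, but mappings has gained properties for the other fields, each considerably more complex than those of id and title (they use the default dynamic mapping).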

{
  "news" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "_class" : {
          "type" : "keyword",
          "index" : false,
          "doc_values" : false
        },
        "contentHtml" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "contentText" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "id" : {
          "type" : "keyword"
        },
        "postTime" : {
          "type" : "long"
        },
        "source" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "title" : {
          "type" : "keyword"
        },
        "url" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "refresh_interval" : "1s",
        "number_of_shards" : "1",
        "provided_name" : "news",
        "creation_date" : "1631965403582",
        "sort" : {
          "field" : [
            "id",
            "title"
          ],
          "order" : [
            "asc",
            "asc"
          ]
        },
        "store" : {
          "type" : "fs"
        },
        "number_of_replicas" : "1",
        "uuid" : "ZyHlxd1sQAuTnR4by-lReg",
        "version" : {
          "created" : "7140099"
        }
      }
    }
  }
}

 

Open question:

What is the benefit of configuring this? Sorting? Faster queries? TODO
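
One documented use of index sorting, according to the Elasticsearch reference, is early termination: when a search sorts by the same fields as the index sort and does not need the total hit count, each segment can stop after collecting its first matching documents. Below is a minimal sketch of a search body that matches the id/title index sort configured above. The class and method names are hypothetical, and whether this helps in practice depends on the workload.

```java
// Hypothetical helper; builds a search body aligned with the index sort (id asc, title asc).
final class EarlyTerminationQuerySketch {

    // The sort mirrors the index sort, and total-hit tracking is switched off.
    static String searchBody(int size) {
        return "{"
                + "\"size\":" + size + ","
                + "\"sort\":[{\"id\":\"asc\"},{\"title\":\"asc\"}],"
                + "\"track_total_hits\":false"
                + "}";
    }
}
```

The body would be sent to GET /news/_search.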

 

I also tested FieldType.Text; it cannot be used as a sort field either.

 

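Testing the useServerConfiguration attribute

Note: delete the previously created index before this test.

The default is false. Setting it to true should mean that the Elasticsearch server's configuration is used instead of the configuration in the Java code?

Testing the useServerConfiguration attribute of @Setting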

@Document(indexName = "news")
@Setting(useServerConfiguration = true,
	shards = 3, replicas = 3, refreshInterval = "3s", indexStoreType = "mmapfs",
	settingPath = "indices/news.setting.json",
	sortFields = { "id", "title" },
	sortOrders = {Setting.SortOrder.asc, Setting.SortOrder.asc},
	sortModes = {Setting.SortMode.min, Setting.SortMode.max}
)
@Data
public class News {
//...
}

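Application startup succeeds.

Run GET /news in Kibana:

mappings contains the id and title nodes, presumably because of @Field.

But settings no longer has a sort node.

Judging by the values under settings, the configuration from settingPath was used.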

{
  "news" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "_class" : {
          "type" : "keyword",
          "index" : false,
          "doc_values" : false
        },
        "id" : {
          "type" : "keyword"
        },
        "title" : {
          "type" : "keyword"
        }
      }
    },
    "settings" : {
      "index" : {
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "refresh_interval" : "2s",
        "number_of_shards" : "2",
        "provided_name" : "news",
        "analyze" : {
          "max_token_count" : "500"
        },
        "creation_date" : "1631966277191",
        "store" : {
          "type" : "hybridfs"
        },
        "number_of_replicas" : "2",
        "uuid" : "viPNAWZDT2iV-2BEjkd2bw",
        "version" : {
          "created" : "7140099"
        }
      }
    }
  }
}

 

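The Spring Data Elasticsearch documentation says that no settings parameters are sent. Does the settings file not count?

Still, the attribute-based configuration was indeed suppressed: the sort configuration did not take effect.

Testing again with the settingPath attribute removed: the server's configuration really does appear to be used, and the other @Setting attributes are ignored entirely.

The settings node also contains far less information!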

@Setting(useServerConfiguration = true,
	shards = 3, replicas = 3, refreshInterval = "3s", indexStoreType = "mmapfs",
	sortFields = { "id", "title" },
	sortOrders = {Setting.SortOrder.asc, Setting.SortOrder.asc},
	sortModes = {Setting.SortMode.min, Setting.SortMode.max}
)

Run GET /news in Kibana:
{
  "news" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "_class" : {
          "type" : "keyword",
          "index" : false,
          "doc_values" : false
        },
        "id" : {
          "type" : "keyword"
        },
        "title" : {
          "type" : "keyword"
        }
      }
    },
    "settings" : {
      "index" : {
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "number_of_shards" : "1",
        "provided_name" : "news",
        "creation_date" : "1631966831563",
        "number_of_replicas" : "1",
        "uuid" : "mbnoBSH0Rk6STmegSHEZqg",
        "version" : {
          "created" : "7140099"
        }
      }
    }
  }
}
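
The outcomes of the experiments so far can be summarized as a small decision function. This models only what the GET /news responses showed and is not Spring Data Elasticsearch's actual code; the class name, method name, and map representation are hypothetical. The combination of settingPath with sort attributes and useServerConfiguration = false was not tested, so that path is an extrapolation.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Models the behavior observed in the experiments; NOT the library's implementation.
final class SettingResolutionSketch {

    /**
     * @param useServerConfiguration value of the @Setting attribute of the same name
     * @param attributeSettings settings derived from attributes such as shards or sortFields
     * @param fileSettings parsed settingPath file, or null when settingPath is not set
     * @return the settings that end up on the created index, according to the observations
     */
    static Map<String, Object> settingsToSend(boolean useServerConfiguration,
                                              Map<String, Object> attributeSettings,
                                              Map<String, Object> fileSettings) {
        if (fileSettings != null) {
            // Observed: the file replaces the attribute values, even with the flag set
            return new LinkedHashMap<>(fileSettings);
        }
        if (useServerConfiguration) {
            // Observed: nothing from the attributes reaches the server
            return Collections.emptyMap();
        }
        return new LinkedHashMap<>(attributeSettings);
    }
}
```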

 

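Using the news.setting.json configuration above directly in Kibana: it failed!

Following the error messages, I removed settings one by one, but in the end the index was still not created.

Comparing with the configuration that created the index successfully earlier:

This part needs a closer read of the official documentation. TODO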
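
The failing request and its error message are not shown, so the cause is unknown. One thing worth checking, offered as an assumption rather than a diagnosis, is the request shape: the create-index API expects index settings nested under a top-level settings key, whereas news.setting.json contains only the settings themselves. A minimal sketch of that wrapping, with a hypothetical class name:

```java
// Hypothetical helper; wraps a settings-only JSON document for the create-index API.
final class CreateIndexBodySketch {

    // Nests the given settings JSON under the top-level "settings" key.
    static String wrap(String settingsJson) {
        return "{\"settings\":" + settingsJson.trim() + "}";
    }

    public static void main(String[] args) {
        // Body intended for PUT /news in Kibana
        System.out.println(wrap("{\"number_of_shards\":2,\"refresh_interval\":\"2s\"}"));
    }
}
```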

 

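Final question:

Is it better to create the index up front by calling the REST API, or to create it from the application with annotations such as @Setting?

What is the best practice?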

 

>>> End of article <<<

 

What is a mapping for?

The official documentation really matters, and reading it saves a lot of time and effort!

 

posted @ 2021-09-18 20:24  快乐的欧阳天美1114