springboot整合elasticsearch7.2(基于官方high level client)

前言

最近写的一个个人项目(传送门：全终端云书签)中需要用到全文检索功能，目前 mysql，es 都可以做全文检索，mysql 胜在配置方便很快就能搞定上线(参考这里),不考虑上手难度，es 在全文检索方面是完胜 mysql 的。

最后决定使用 es。使用最新的 7.2 版本。java 客户端使用 es 官方的 high level client(官方文档),为什么用这个有以下几点原因：

jest 毕竟不是官方的，更新速度较慢
transportClient,速度太慢，连官方都嫌弃它了。在 7.x 中已经被弃用，8.x 中将完全删除
high level client 的官方文档写的很清楚明了，虽然目前相关的中文资料还很少，也能够上手用起来

本文主要内容如下：

docker 部署 es(支持 ik 中文分词)
在 springboot 中进行增删改查

docker 部署 es(基于 linux)

es 的中文分词目前比较流行的分词插件为 ik(github 地址)。由于手写 docker 命令太繁杂，这里用 docker-compose 来管理。假定当前在/root 目录下

下载 ik release 到/root/es/ik 目录下，并解压到当前文件夹。
创建/root/es/data 目录，并将读写权限给所有用户.本目录用于存放 es 数据。由于 es 不能以 root 用户执行，所以对于此目录需要将读写权限给其他用户。

编写 es 配置文件，7.2 的配置文件变化还是较大的（之前用的是 2.x 版本），一个简单的配置如下：

cluster.name: elasticsearch

 # 配置的集群名称，默认是 elasticsearch，es 服务会通过广播方式自动连接在同一网段下的 es 服务，通过多播方式进行通信，同一网段下可以有多个集群，通过集群名称这个属性来区分不同的集群。

 node.name: bookmark-world

 # 当前配置所在机器的节点名，你不设置就默认随机指定一个 name 列表中名字，该 name 列表在 es 的 jar 包中 config 文件夹里 name.txt 文件中，其中有很多作者添加的有趣名字。

 node.master: true

 # 指定该节点是否有资格被选举成为 node（注意这里只是设置成有资格， 不代表该 node 一定就是 master），默认是 true，es 是默认集群中的第一台机器为 master，如果这台机挂了就会重新选举 master。

 node.data: true

 # 指定该节点是否存储索引数据，默认为 true。

 bootstrap.memory_lock: false

 # 设置为 true 来锁住内存不进行 swapping。因为当 jvm 开始 swapping 时 es 的效率 会降低，所以要保证它不 swap，可以把 ES_MIN_MEM 和 ES_MAX_MEM 两个环境变量设置成同一个值，并且保证机器有足够的内存分配给 es。 同时也要允许 elasticsearch 的进程可以锁住内存，linux 下启动 es 之前可以通过`ulimit -l unlimited`命令设置。
 # 设置为 true，会导致报警告实际未锁定内存，进而退出进程(es在生产模式下有警告就会退出)

 network.bind_host: 0.0.0.0

 # 设置绑定的 ip 地址，可以是 ipv4 或 ipv6 的，默认为 0.0.0.0，绑定这台机器的任何一个 ip。
 # 集群配置
 discovery.seed_hosts:
   - bookmark-es
 cluster.initial_master_nodes:
   - bookmark-world

编写/root/docker-compose.yml

version: "2"
services:
  bookmark-es:
  image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
container_name: bookmark-es
volumes:
  - /etc/localtime:/etc/localtime
  - ./es/data:/usr/share/elasticsearch/data
  - ./es/ik:/usr/share/elasticsearch/plugins/ik
  - ./es/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
ports:
  - 9200:9200
  - 9300:9300

执行 docker-compose up -d启动 es

详细可参考这里：云书签 docker 部署。

springboot 整合

创建 springboot 项目

首先创建一个 springboot 项目，然后引入high level client的依赖，pom 文件如下：

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.1.6.RELEASE</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.fanxb</groupId>
    <artifactId>es-demo</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>es-demo</name>
    <description>Elasticsearch Demo project for Spring Boot</description>

    <properties>
        <java.version>1.8</java.version>
    </properties>

    <!--注意：如果使用了parent那么需要在此定义es版本号,因为spring-boot-start-parent中已经定义了es相关依赖的版本号
    ，high-level-client中的部分依赖会被覆盖成低版本的,导出出现莫名其妙的错误-->
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.elasticsearch.client</groupId>
                <artifactId>elasticsearch-rest-high-level-client</artifactId>
                <version>7.2.0</version>
            </dependency>
            <!-- https://mvnrepository.com/artifact/org.elasticsearch/elasticsearch -->
            <dependency>
                <groupId>org.elasticsearch</groupId>
                <artifactId>elasticsearch</artifactId>
                <version>7.2.0</version>
            </dependency>
            <!--&lt;!&ndash; https://mvnrepository.com/artifact/org.elasticsearch.client/elasticsearch-rest-client &ndash;&gt;-->
            <dependency>
                <groupId>org.elasticsearch.client</groupId>
                <artifactId>elasticsearch-rest-client</artifactId>
                <version>7.2.0</version>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>7.2.0</version>
        </dependency>

        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.56</version>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>

注意:这里有一个依赖的大坑，要注意!

如果定义了<parent>，就必须在<dependencyManagement>中指定部分依赖的版本，否则会因为依赖版本不对出现各种莫名其妙的错误,上面注释中已经指出。

创建 util/EsUtil.java 工具类

主要功能函数如下：

预创建 index

虽然 es 在插入数据时会自动根据字段类型来创建字段定义，但是自动创建并不总是和需要相符的，比如想让某个字段不分词，或者使用其他的分词器。所以在代码中先判断 index(es7 中已经废弃了 mapping，也就是一个 index 相当于一个表)是否存在，如果不存在就创建 index.

主要代码如下：

//被@PostConstruct注释的方法将会在对应类注入到Spring后调用,确保index的生成
@PostConstruct
public void init() {
    try {
        if (client != null) {
            client.close();
        }
        client = new RestHighLevelClient(RestClient.builder(new HttpHost(host, port, scheme)));
        if (this.indexExist(INDEX_NAME)) {
            return;
        }
        CreateIndexRequest request = new CreateIndexRequest(INDEX_NAME);
        request.settings(Settings.builder().put("index.number_of_shards", 3).put("index.number_of_replicas", 2));
        request.mapping(CREATE_INDEX, XContentType.JSON);
        CreateIndexResponse res = client.indices().create(request, RequestOptions.DEFAULT);
        if (!res.isAcknowledged()) {
            throw new RuntimeException("初始化失败");
        }
    } catch (Exception e) {
        e.printStackTrace();
        System.exit(0);
    }
}

插入或者更新一个对象

通过指定 id，如果此 id 存在那么就是更新，否则是插入。

public void insertOrUpdateOne(String index, EsEntity entity) {
    IndexRequest request = new IndexRequest(index);
    request.id(entity.getId());
    request.source(JSON.toJSONString(entity.getData()), XContentType.JSON);
    try {
        client.index(request, RequestOptions.DEFAULT);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

批量插入

high level client 提供了方便的批量操作接口，如下所示：

public void insertBatch(String index, List<EsEntity> list) {
    BulkRequest request = new BulkRequest();
    list.forEach(item -> request.add(new IndexRequest(index).id(item.getId())
            .source(JSON.toJSONString(item.getData()), XContentType.JSON)));
    try {
        client.bulk(request, RequestOptions.DEFAULT);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

批量删除

和上面一样同样用到了BulkRequest

public <T> void deleteBatch(String index, Collection<T> idList) {
    BulkRequest request = new BulkRequest();
    idList.forEach(item -> request.add(new DeleteRequest(index, item.toString())));
    try {
        client.bulk(request, RequestOptions.DEFAULT);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

搜索

通过构建SearchSourceBuilder查询参数

public <T> List<T> search(String index, SearchSourceBuilder builder, Class<T> c) {
    SearchRequest request = new SearchRequest(index);
    request.source(builder);
    try {
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        SearchHit[] hits = response.getHits().getHits();
        List<T> res = new ArrayList<>(hits.length);
        for (SearchHit hit : hits) {
            res.add(JSON.parseObject(hit.getSourceAsString(), c));
        }
        return res;
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

delete by query

es 插入数据容易，删除就比较麻烦了，特别是根据条件删除。

public void deleteByQuery(String index, QueryBuilder builder) {
    DeleteByQueryRequest request = new DeleteByQueryRequest(index);
    request.setQuery(builder);
    //设置批量操作数量,最大为10000
    request.setBatchSize(10000);
    request.setConflicts("proceed");
    try {
        client.deleteByQuery(request, RequestOptions.DEFAULT);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}