Connecting a Spring Boot project to Elasticsearch with RestHighLevelClient

Querying Elasticsearch data from Spring Boot

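I have recently been working on a project that needs to query large volumes of data. After some evaluation I chose Elasticsearch (ES) to store the data and integrated it into a Spring Boot project, which queries ES and returns the results through REST endpoints. Fetching the data and inserting it into ES is done by Python scripts, so this post covers only the query side.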

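I. Connecting to Elasticsearch

I use RestHighLevelClient, the client recommended on the official site; it wraps the CRUD operations.

Assuming ES is already deployed on the server, connecting from a Spring Boot project takes roughly three steps:

1. Add the dependencies

For brevity, the unrelated parts of the pom file have been removed. The main point is to choose a suitable ES version; 7.6.0 is used here.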

    <properties>
        <java.version>1.8</java.version>
        <es.version>7.6.0</es.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>        
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>${es.version}</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>${es.version}</version>
        </dependency>
    </dependencies>

2. Add the configuration to the YAML file

This mainly configures the addresses of the ES nodes and the credentials.

spring:
  elasticsearch:
    rest:
      connection-timeout: 6s
      uris: test-cluster-01:9200,test-cluster-02:9200
      read-timeout: 10s
      # If the cluster can be accessed without a username and password, remove the two fields below
      username: estest
      password: estest

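3. Create a configuration class that builds the RestHighLevelClient at startup

With the @Configuration annotation, a RestHighLevelClient bean is created when the service starts; afterwards you only need to inject it wherever it is used.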

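import java.util.List;

import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ESConfig {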

    @Value("${spring.elasticsearch.rest.uris}")
    private List<String> uris;

    // If the cluster can be accessed without a username and password, the userName and password fields can be removed
    @Value("${spring.elasticsearch.rest.username}")
    private String userName;

    @Value("${spring.elasticsearch.rest.password}")
    private String password;

    @Bean
    public RestHighLevelClient restHighLevelClient() {
        HttpHost[] httpHosts = createHosts();
        RestClientBuilder restClientBuilder = RestClient.builder(httpHosts)
                .setHttpClientConfigCallback(httpClientBuilder -> {
                    CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
                    // If the cluster can be accessed without a username and password, the line below can be removed
                    credentialsProvider.setCredentials(AuthScope.ANY,new UsernamePasswordCredentials(userName,password));
                    return httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
                });
        return new RestHighLevelClient(restClientBuilder);
    }

    // Supports a multi-node ES cluster: one HttpHost per configured address
    private HttpHost[] createHosts() {
        HttpHost[] httpHosts = new HttpHost[uris.size()];
        for (int i = 0; i < uris.size(); i++) {
            String hostStr = uris.get(i);
            String[] host = hostStr.split(":");
            httpHosts[i] = new HttpHost(host[0].trim(),Integer.parseInt(host[1].trim()));
        }
        return httpHosts;
    }
}
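
The createHosts method above splits each entry on ":" and therefore assumes every address is written exactly as host:port. As a hedged sketch of a more tolerant parser, the plain Java below accepts an optional scheme and a missing port. The class name EsHostParser, the Endpoint holder, and the default port 9200 are my own assumptions for illustration and are not part of the original project.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the original project.
final class EsHostParser {

    // Assumed default: 9200 is the usual ES HTTP port.
    static final int DEFAULT_PORT = 9200;

    /** Simple value holder standing in for org.apache.http.HttpHost. */
    static final class Endpoint {
        final String scheme;
        final String host;
        final int port;

        Endpoint(String scheme, String host, int port) {
            this.scheme = scheme;
            this.host = host;
            this.port = port;
        }

        @Override
        public String toString() {
            return scheme + "://" + host + ":" + port;
        }
    }

    /** Parses "host", "host:port" or "scheme://host:port". */
    static Endpoint parse(String raw) {
        String text = raw.trim();
        // java.net.URI only recognises the host part when a scheme is present.
        URI uri = URI.create(text.contains("://") ? text : "http://" + text);
        if (uri.getHost() == null) {
            throw new IllegalArgumentException("invalid ES address: " + raw);
        }
        int port = uri.getPort() == -1 ? DEFAULT_PORT : uri.getPort();
        return new Endpoint(uri.getScheme(), uri.getHost(), port);
    }

    static List<Endpoint> parseAll(List<String> uris) {
        List<Endpoint> endpoints = new ArrayList<>();
        for (String uri : uris) {
            endpoints.add(parse(uri));
        }
        return endpoints;
    }

    public static void main(String[] args) {
        System.out.println(parseAll(Arrays.asList(
                "test-cluster-01:9200", "https://test-cluster-02:9243", "localhost")));
    }
}
```

Each Endpoint could then be turned into new HttpHost(host, port, scheme) inside the configuration class.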

II. Queries

The documents stored in ES have the structure below; the queries that follow are all based on it.

class Entity {

    private String id;

    private String summary;

    private String name;

    private String introduction;
}

1. Query by id (single index)

GetRequest is used here. It is very convenient, but it can target only one index and one id per request, so it cannot do batch lookups.

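import java.io.IOException;

import com.alibaba.fastjson.JSONObject;
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.annotation.Autowired;

// `log` below is assumed to be an SLF4J logger, e.g. one generated by Lombok's @Slf4j
public class EsEntityClient {

    @Autowired
    private RestHighLevelClient restHighLevelClient;

    // Index name
    private static final String INDEX_NAME = "entity";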

    public Entity queryEntityById(String id) {
        GetRequest getRequest = new GetRequest(INDEX_NAME).id(id);

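        Entity entity = null;
        try {
            GetResponse response = restHighLevelClient.get(getRequest, RequestOptions.DEFAULT);
            // The source is null when the document does not exist, so check isExists() first
            if (response.isExists()) {
                entity = JSONObject.parseObject(response.getSourceAsString(), Entity.class);
            } else {
                log.warn("can't find entity, id:{}", id);
            }
        } catch (IOException e) {
            log.warn("query entity failed, id:{}", id, e);
        }
        return entity;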
    }

2. Query by ids across one or more indices

IdsQueryBuilder is used to build the query.

    public List<Entity> queryByIds(List<String> ids) {
        // addIds adds the ids through the builder's API instead of mutating the set returned by ids()
        IdsQueryBuilder idsQueryBuilder = QueryBuilders.idsQuery()
                .addIds(ids.toArray(new String[0]));

        SearchSourceBuilder searchSourceBuilder = SearchSourceBuilder.searchSource()
                .query(idsQueryBuilder);
        SearchRequest searchRequest = new SearchRequest().source(searchSourceBuilder)
                // Multiple indices can be set here
                .indices("idx1","idx2","idx3");

        List<Entity> entities = new ArrayList<>();
        try {
            SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            SearchHit[] hits = searchResponse.getHits().getHits();
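            // Sort by score in descending order (relevance order); Float.compare avoids
            // the truncation caused by casting a float difference to int
            Arrays.sort(hits, (h1, h2) -> Float.compare(h2.getScore(), h1.getScore()));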
            for (SearchHit hit : hits) {
                String jsonString = hit.getSourceAsString();
                entities.add(JSONObject.parseObject(jsonString, Entity.class));
            }
        } catch (IOException e) {
            log.warn("search by ids failed. ids:{}", ids.toString(), e);
        }
        return entities;
    }
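
A note on sorting hits by score: a comparator that casts the float difference of two scores to int treats any two scores less than 1 apart as equal, so nearby scores are never reordered. The plain Java sketch below shows the effect with made-up score values; the class name ScoreSortDemo and the two comparator constants are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical demo class; the scores are made-up values, not real ES output.
final class ScoreSortDemo {

    // Casting the difference to int truncates toward zero: (int) 0.7f == 0.
    static final Comparator<Float> TRUNCATING_DESC = (a, b) -> (int) (b - a);

    // Float.compare returns a proper negative / zero / positive result.
    static final Comparator<Float> SAFE_DESC = (a, b) -> Float.compare(b, a);

    /** Returns a sorted copy so the input array is left untouched. */
    static Float[] sorted(Float[] scores, Comparator<Float> order) {
        Float[] copy = scores.clone();
        Arrays.sort(copy, order);
        return copy;
    }

    public static void main(String[] args) {
        Float[] scores = {0.2f, 0.9f, 0.5f};
        System.out.println(Arrays.toString(sorted(scores, TRUNCATING_DESC)));
        System.out.println(Arrays.toString(sorted(scores, SAFE_DESC)));
    }
}
```

ES already returns hits ordered by score (descending) when no explicit sort is set on the request, so a client-side sort like this mostly matters when hits from several responses are merged.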

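3. Query by id (multiple indices)

The problem I ran into in this project is that the crawler stores data from different sources in different indices, so a single id sent by the front end may have to be looked up in several indices. The GetRequest approach above no longer works here (you could loop over the indices, but with many indices the I/O overhead is high and the endpoint responds slowly).

So the approach I chose is...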

    public Entity queryById(String id) {
        List<Entity> result = queryByIds(Collections.singletonList(id));
        if (CollectionUtils.isEmpty(result)) {
            log.warn("can't find entity by id:{}", id);
            return null;
        }
        // Returning result.get(0) directly should be enough here, but it did not work without this conversion; it looks like a fastjson bug
        return JSONObject.parseObject(JSONObject.toJSONString(result.get(0)), Entity.class);
    }

4. Exact match on name

    public List<Entity> queryEntityByName(String name) {
        BoolQueryBuilder queryBuilder = new BoolQueryBuilder();

        // With termQuery, passing "<field name>.keyword" as the first argument gives an exact match on that field
        queryBuilder.filter(QueryBuilders.termQuery("name" + ".keyword", name));
        SearchSourceBuilder searchSourceBuilder = SearchSourceBuilder.searchSource().query(queryBuilder).size(20);
        SearchRequest searchRequest = new SearchRequest().source(searchSourceBuilder).indices(INDEX_NAME);

        List<Entity> entities = new ArrayList<>();
        try {
            SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            SearchHit[] hits = searchResponse.getHits().getHits();
            // Float.compare avoids the truncation caused by casting a float difference to int
            Arrays.sort(hits, (h1, h2) -> Float.compare(h2.getScore(), h1.getScore()));
            for (SearchHit hit : hits) {
                String jsonString = hit.getSourceAsString();
                entities.add(JSONObject.parseObject(jsonString, Entity.class));
            }
        } catch (IOException e) {
            log.warn("search entities failed. name:{}", name, e);
        }
        return entities;
    }
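
The size(20) call above simply caps the number of hits. If paging is needed later, from and size can be derived from a page number. The sketch below is plain Java under my own assumptions: PageWindow is a hypothetical helper, and the 10,000 limit mirrors the default value of the index.max_result_window setting, which your cluster may have changed.

```java
// Hypothetical paging helper; not part of the original project.
final class PageWindow {

    // Assumed limit: the default of index.max_result_window is 10,000.
    static final int MAX_RESULT_WINDOW = 10_000;

    final int from;
    final int size;

    private PageWindow(int from, int size) {
        this.from = from;
        this.size = size;
    }

    /** Converts a zero-based page number into from/size values. */
    static PageWindow of(int page, int pageSize) {
        if (page < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize > 0");
        }
        // Use long arithmetic so large page numbers cannot overflow int.
        long from = (long) page * pageSize;
        if (from + pageSize > MAX_RESULT_WINDOW) {
            throw new IllegalArgumentException("from + size exceeds " + MAX_RESULT_WINDOW);
        }
        return new PageWindow((int) from, pageSize);
    }

    public static void main(String[] args) {
        PageWindow window = PageWindow.of(2, 20);
        System.out.println("from=" + window.from + ", size=" + window.size);
    }
}
```

The result would be applied with searchSourceBuilder.from(window.from).size(window.size).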

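5. Multi-field fuzzy search

Match each field against its own keyword. Note that matchPhraseQuery does analyzed phrase matching (the terms must appear in the same order), which is not the same as typo-tolerant fuzzy matching.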

    public List<Entity> query(String name, String summary, String introduction) {
        BoolQueryBuilder queryBuilder = buildFuzzQueryBuilder(name, summary, introduction);
        // Hard-coded to return at most 100 hits for now
        SearchSourceBuilder searchSourceBuilder = SearchSourceBuilder.searchSource().query(queryBuilder).size(100);
        SearchRequest searchRequest = new SearchRequest().source(searchSourceBuilder);
        // Set the indices to search
        searchRequest.indices("idx1","idx2","idx3");

        List<Entity> entities = new ArrayList<>();
        try {
            SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
            SearchHit[] hits = searchResponse.getHits().getHits();
            // Float.compare avoids the truncation caused by casting a float difference to int
            Arrays.sort(hits, (h1, h2) -> Float.compare(h2.getScore(), h1.getScore()));
            for (SearchHit hit : hits) {
                String jsonString = hit.getSourceAsString();
                entities.add(JSONObject.parseObject(jsonString, Entity.class));
            }
        } catch (IOException e) {
            log.warn("search failed.", e);
        }
        return entities;
    }

    // Build the query: one filter clause per non-empty argument
    private BoolQueryBuilder buildFuzzQueryBuilder(String name, String summary, String introduction) {
        BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
        if (Strings.isNotEmpty(name)) {
            // Phrase match on name
            MatchPhraseQueryBuilder queryBuilder = QueryBuilders.matchPhraseQuery("name", name);
            boolQueryBuilder.filter(queryBuilder);
        }

        if (Strings.isNotEmpty(summary)) {
            // Phrase match on summary
            MatchPhraseQueryBuilder queryBuilder = QueryBuilders.matchPhraseQuery("summary", summary);
            boolQueryBuilder.filter(queryBuilder);
        }

        if (Strings.isNotEmpty(introduction)) {
            // Phrase match on introduction
            MatchPhraseQueryBuilder queryBuilder = QueryBuilders.matchPhraseQuery("introduction", introduction);
            boolQueryBuilder.filter(queryBuilder);
        }
        return boolQueryBuilder;
    }
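
To make it easier to see what buildFuzzQueryBuilder sends to ES, here is a rough plain Java sketch that renders the equivalent query DSL as a JSON string: a bool query whose filter array holds one match_phrase clause per non-empty field. The class PhraseFilterDsl and its methods are hypothetical and written only for this illustration; the real request body is generated by the ES client and may include extra default parameters.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper that mimics the shape of the generated query DSL.
final class PhraseFilterDsl {

    /** Renders a bool/filter query with one match_phrase clause per non-empty value. */
    static String toJson(Map<String, String> fieldToKeyword) {
        StringJoiner clauses = new StringJoiner(",", "[", "]");
        for (Map.Entry<String, String> entry : fieldToKeyword.entrySet()) {
            String keyword = entry.getValue();
            if (keyword == null || keyword.isEmpty()) {
                continue; // empty arguments add no clause, as in the method above
            }
            clauses.add("{\"match_phrase\":{" + quote(entry.getKey()) + ":" + quote(keyword) + "}}");
        }
        return "{\"bool\":{\"filter\":" + clauses + "}}";
    }

    /** Minimal JSON string escaping: quotes, backslashes and control characters. */
    static String quote(String text) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : text.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "Elastic");
        fields.put("summary", "");
        fields.put("introduction", "search engine");
        System.out.println(toJson(fields));
    }
}
```

Because every clause sits in filter context, the hits are not scored by relevance, and when all three arguments are empty the bool query has no clauses and matches every document in the target indices.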

A simple demo implementation has been uploaded to https://github.com/bupt-yanch/spring-elasticsearch-demo

posted @ 2021-06-11 17:58  BuptWade