Implementing an Elasticsearch gateway compatible with multiple ES versions for rolling upgrades: functional verification

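Continued from the previous post:

Implementing an Elasticsearch gateway compatible with read/write requests across different Elasticsearch versions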


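Verification goals

The initial goal is version compatibility between elasticsearch 7.10.2 and elasticsearch 6.8.14.

Using remote cluster, the two ES versions run side by side, with the higher version as primary and the lower version as secondary. Data in the lower version is phased out gradually until everything has transitioned to elasticsearch 7.10.2.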

Project repository

https://github.com/cclient/elasticsearch-multi-cluster-compat-proxy

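Criteria for distinguishing new and old indices

New indices are written to elasticsearch 7.10.2; old indices are written to elasticsearch 6.8.14.

  • Configured in a MySQL database

    Each original index name maps to exactly one dest index, which points to the corresponding ES version.

  • If MySQL has no record, the decision falls back to the normalized index name

    For example, when the index name contains a date, as in filebeat_202101_log, an agreed date boundary determines whether the index is new or old.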
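The two-step decision above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the class `IndexRouter`, the in-memory map standing in for the MySQL table, the yyyyMM integer boundary, and the default for names without a date are all assumptions made for the example.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper (not taken from the project): picks the cluster for an index name.
class IndexRouter {
    private static final Pattern YEAR_MONTH = Pattern.compile("\\d{6}");

    private final Map<String, String> explicitMapping; // stand-in for the MySQL table
    private final int dateBoundary;                    // yyyyMM; earlier months count as old

    IndexRouter(Map<String, String> explicitMapping, int dateBoundary) {
        this.explicitMapping = explicitMapping;
        this.dateBoundary = dateBoundary;
    }

    /** Returns "es6" for an old index and "es7" for a new one. */
    String route(String index) {
        // 1. An explicit record always wins.
        String configured = explicitMapping.get(index);
        if (configured != null) {
            return configured;
        }
        // 2. Otherwise look for a yyyyMM segment in the normalized name.
        for (String part : index.split("_")) {
            if (YEAR_MONTH.matcher(part).matches()) {
                return Integer.parseInt(part) < dateBoundary ? "es6" : "es7";
            }
        }
        // 3. No record and no date: treated as new here (an assumption of this sketch).
        return "es7";
    }

    public static void main(String[] args) {
        IndexRouter router = new IndexRouter(Collections.singletonMap("legacy_orders", "es6"), 202103);
        System.out.println(router.route("filebeat_202101_log"));
        System.out.println(router.route("filebeat_202104_log"));
    }
}
```

Keeping the explicit mapping ahead of the name-based rule means an individual index can be pinned to a cluster without changing the date boundary.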

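Development is complete and basic functional verification has passed (the features work; performance and load testing have not been covered yet).

Because this is only a first-phase feasibility check, the code is fairly minimal.

A few points to note: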

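  • spring-boot 2.4.4 provides APPLICATION_NDJSON_VALUE and 2.3.9.RELEASE does not, but this is not a blocker: MediaType.APPLICATION_JSON_VALUE can be used instead

        package org.springframework.http;
        public static final String APPLICATION_NDJSON_VALUE = "application/x-ndjson";

  • spring-boot-starter-webflux is used instead of spring-boot-starter-web

    The code does not fully follow WebFlux conventions and is still written in the standard MVC style. WebFlux was chosen mainly because it runs on Netty, which should handle concurrency somewhat better than spring-web. The service has little business logic and does not need to integrate with other web components.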

  • spring-boot-starter-data-elasticsearch was not used in the end

    	<parent>
    		<groupId>org.springframework.boot</groupId>
    		<artifactId>spring-boot-starter-parent</artifactId>
    		<version>2.4.4</version>
    	</parent>	
    		<dependency>
    			<groupId>org.springframework.boot</groupId>
    			<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
    		</dependency>
    With 2.4.4, spring-boot-starter-data-elasticsearch has a bug
    With 2.3.9.RELEASE, spring-boot-starter-data-elasticsearch works normally
    
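  • elasticsearch-rest-high-level-client was not used in the end

    Sending _search requests through elasticsearch-rest-high-level-client is fine, but _bulk is tedious to implement, although not technically difficult.

    The project keeps the code that accesses ES through the SDK as a reference; anyone interested can switch to an implementation based on the official SDK.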

        public String doPostBulk(EsHost[] esHosts, List<String> lines, String auth, boolean isSSL) throws IOException {
            RestHighLevelClient client=getClient(esHosts,auth,isSSL);
            BulkRequest bulkRequest=new BulkRequest();
            //index
            IndexRequest indexRequest=new IndexRequest("index","type","id");
            indexRequest.source(new HashMap<String,Object>(1){{
                put("field1", "value1");
            }});
            //delete
            DeleteRequest deleteRequest=new DeleteRequest("index","type","id");
            //update
            UpdateRequest updateRequest=new UpdateRequest("index","type","id");
            updateRequest.doc(new HashMap<String,Object>(1){{
                put("field2", "value2");
            }});
            bulkRequest.add(indexRequest);
            bulkRequest.add(deleteRequest);
            bulkRequest.add(updateRequest);
            BulkResponse bulkResponse=client.bulk(bulkRequest,RequestOptions.DEFAULT);
            return Strings.toString(bulkResponse.toXContent(JsonXContent.contentBuilder(), ToXContent.EMPTY_PARAMS).humanReadable(true));
        }
    

    With plain HTTP requests, the lines can simply be submitted as a string (one hazard here is the handling of \n).
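The \n hazard mentioned above comes from the bulk format itself: every JSON document must sit on a single line, and the body must end with a newline. A small guard like the one below is one possible way to catch the problem before the request is sent. The class `BulkBody` and its method name are made up for this sketch and are not part of the project.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: joins bulk lines into a newline-delimited JSON body.
class BulkBody {
    static String toNdjson(List<String> lines) {
        StringBuilder body = new StringBuilder();
        for (String line : lines) {
            // A raw line break inside a line would split one JSON document in two.
            // An escaped "\n" inside a JSON string (backslash followed by n) is fine.
            if (line.indexOf('\n') >= 0 || line.indexOf('\r') >= 0) {
                throw new IllegalArgumentException("bulk line contains a raw line break");
            }
            // Every line, including the last one, is terminated by '\n'.
            body.append(line).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "{\"index\":{\"_index\":\"test\",\"_id\":\"1\"}}",
                "{\"field1\":\"value1\"}");
        System.out.print(toNdjson(lines));
    }
}
```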

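  • No load distribution across client nodes

    An Elasticsearch deployment may have multiple client nodes.

    okhttp accesses a single address only, so multiple ES client nodes are not supported for now.

    elasticsearch-rest-high-level-client was originally tried for multi-client support, but was dropped because _bulk took too much effort to implement.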
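If multi-client support were added on top of the plain HTTP approach, one simple option would be to rotate through the client addresses per request. The sketch below shows round-robin selection only. The class `RoundRobinHosts` and the addresses are hypothetical, and health checks or retries are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: hands out ES client addresses in rotation, safe for concurrent callers.
class RoundRobinHosts {
    private final List<String> hosts;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinHosts(List<String> hosts) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        this.hosts = new ArrayList<>(hosts);
    }

    String next() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int i = Math.floorMod(counter.getAndIncrement(), hosts.size());
        return hosts.get(i);
    }

    public static void main(String[] args) {
        RoundRobinHosts hosts = new RoundRobinHosts(
                Arrays.asList("http://es-client-1:9200", "http://es-client-2:9200"));
        for (int n = 0; n < 3; n++) {
            System.out.println(hosts.next() + "/_bulk");
        }
    }
}
```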

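  • HTTPS support is implemented, mainly by skipping certificate verification, but it has not been tested

  • A single node has limited capacity, so multiple nodes are needed for a distributed deployment; the distributed design has not been done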

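  • Parsing _bulk

    My first, simplistic assumption was that even-numbered lines (counting from 0) carry the index information {_index,_type,_id,_opera} and need to be parsed and rewritten, while odd-numbered lines need no processing.

    The operations are index, create, update and delete. A delete has no following content line, so an odd/even check gives wrong results.

    From personal experience, delete operations against ES should be restricted. First, physical deletion can make troubleshooting difficult in some cases. Second, big-data pipelines often dual-write or multi-write to other stores besides ES, and deletes are hard to keep in sync across those stores.

    My suggestion is to turn physical deletes into logical deletes: add the fields isDeleted and deleteDate, filter on isDeleted in queries, and have a background job periodically purge documents whose deleteDate has expired. This can also be aligned with the ES ILM cycle, running the physical delete before the ILM merge.

    Once ES delete is blocked and physical deletion is replaced by logical deletion, the odd/even ambiguity disappears: only even-numbered lines need JSON parsing and odd-numbered lines stay unchanged. For full compatibility with ES, however, the implementation still assumes that delete can occur.

    The current implementation follows this pattern and processes the lines sequentially, so not every line has to be parsed. Bulk parsing is also the relatively most complex part of the implementation.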
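As an aside on the logical-delete suggestion above: a gateway could, in principle, turn each delete action into an update that sets the two marker fields. The sketch below only builds the two resulting bulk lines. The class `SoftDelete` is hypothetical, the field names isDeleted and deleteDate follow the text, and the escaping covers quotes and backslashes only.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: expresses a delete as an update that marks the document as deleted.
class SoftDelete {
    /** Returns the action line and the source line that replace a single delete line. */
    static List<String> toSoftDelete(String index, String id, Instant deleteDate) {
        String action = "{\"update\":{\"_index\":\"" + escape(index)
                + "\",\"_id\":\"" + escape(id) + "\"}}";
        // Instant.toString() yields an ISO-8601 timestamp in UTC.
        String source = "{\"doc\":{\"isDeleted\":true,\"deleteDate\":\"" + deleteDate + "\"}}";
        return Arrays.asList(action, source);
    }

    // Minimal escaping for this sketch: backslashes and double quotes only.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        for (String line : toSoftDelete("test", "2", Instant.now())) {
            System.out.println(line);
        }
    }
}
```

With this rewrite every operation has exactly one action line and one source line, which is the property that makes the simple odd/even parsing safe again.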

    @PostMapping(
            path = "/_bulk",
            consumes = MediaType.APPLICATION_NDJSON_VALUE, //spring-boot 2.4.4
            produces = MediaType.APPLICATION_JSON_VALUE
    )
    @ResponseStatus(HttpStatus.OK) // Elasticsearch itself answers _bulk with 200 OK
    public String bulk(@RequestBody String requestBody, @RequestParam(required = false) Map<String, String> params) throws IOException {
        String urlParamsString = String.join("&", params.entrySet().stream().map(kv -> kv.getKey() + "=" + kv.getValue()).collect(Collectors.toList()));
        InputStream inputStream = new ByteArrayInputStream(requestBody.getBytes(UTF_8));
        InputStreamReader inputStreamReader = new InputStreamReader(inputStream, UTF_8);
        BufferedReader reader = new BufferedReader(inputStreamReader);
        boolean nextLineIsOperaTarget = true;
        List<String> bulkToEs7 = new ArrayList<>(2000);
        List<String> bulkToEs6 = new ArrayList<>(2000);
        boolean isEs6 = false;
        String preLine = null;
        ObjectMapper objectMapper = Jackson2ObjectMapperBuilder.json().build();
        while (reader.ready()) {
            String line = reader.readLine();
            // The first line is always an action line (operaTarget)
            if (nextLineIsOperaTarget) {
                JsonNode jsonNode = objectMapper.readTree(line);
                // An action line has exactly one field, so next() is enough; only "_index" and "_type" (if present) matter here
                Map.Entry<String, JsonNode> field = jsonNode.fields().next();
                String operaName = field.getKey();
                // Decide whether the target index belongs to ES 6
                ObjectNode targetPoint = (ObjectNode) field.getValue();
                String index = targetPoint.get("_index").asText();
                IndexStringInfo indexStringInfo = new IndexStringInfo(index);
                indexStringInfo.loadIndex(compatConfiguration.getIndexSplitBy(), compatConfiguration.getDateBoundary());
                isEs6 = indexDispatch.checkIndexIsV6(index);
                if (isEs6) {
                    String type = CompatConfiguration.DEFAULT_TYPE;
                    if (compatConfiguration.getIsExtraType()) {
                        type = indexStringInfo.getType();
                    }
                    // if _index=filebeat_202103_log  >> _type=log
                    targetPoint.put("_type", type);
                } else {
                    // No mapping type is sent to ES 7
                    targetPoint.remove("_type");
                }
                // Re-serialize the node instead of concatenating strings, so a missing "_id"
                // no longer throws and other metadata (e.g. routing) is kept
                line = jsonNode.toString();
                if ("delete".equals(operaName)) {
                    nextLineIsOperaTarget = true;
                    // A delete consists of the action line only, e.g. { "delete" : { "_index" : "test", "_id" : "2" } }
                    if (isEs6) {
                        bulkToEs6.add(line);
                    } else {
                        bulkToEs7.add(line);
                    }
                } else {
                    preLine = line;
                    nextLineIsOperaTarget = false;
                    continue;
                }
            } else {
                // Non-delete operations have both an action line and a source line
                // action, e.g. { "update" : {"_id" : "1", "_index" : "test"} }
                // source, e.g. { "doc" : {"field2" : "value2"} }
                if (isEs6) {
                    bulkToEs6.add(preLine);
                    bulkToEs6.add(line);
                } else {
                    bulkToEs7.add(preLine);
                    bulkToEs7.add(line);
                }
                nextLineIsOperaTarget = true;
            }
        }
        if (bulkToEs6.size() == 0 && bulkToEs7.size() == 0) {
            return "{\"took\": 30,\"errors\": false}";
        }
        String es6Response = null;
        String es7Response = null;
        if (bulkToEs6.size() > 0) {
            es6Response = doPostBulk(compatConfiguration.getEs6Uri() + "/_bulk", urlParamsString, bulkToEs6, compatConfiguration.getEs6Auth(), compatConfiguration.getIsEs6SSL());
            if (bulkToEs7.size() == 0) {
                return es6Response;
            }
        }
        if (bulkToEs7.size() > 0) {
            es7Response = doPostBulk(compatConfiguration.getEs7Uri() + "/_bulk", urlParamsString, bulkToEs7, compatConfiguration.getEs7Auth(), compatConfiguration.getIsEs7SSL());
            if (bulkToEs6.size() == 0) {
                return es7Response;
            }
        }
        return mergeEsBulkResponse(objectMapper, es7Response, es6Response);
    }


    private String doPostBulk(String uri, String urlParamsString, List<String> lines, String auth, boolean isSSL) throws IOException {
        if (!urlParamsString.isEmpty()) {
            uri = uri + "?" + urlParamsString;
        }
        lines.add("");
        String esResponse = HttpUtil.post(uri, String.join("\n", lines), auth, isSSL);
        return esResponse;
    }

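    /**
     * Merges the bulk responses from ES 6 and ES 7 into one response.
     * "took" is the sum of both values (avg, min or max would be alternatives).
     * Items are concatenated, ES 7 first and then ES 6, so they do not follow the original request order.
     *
     * @param objectMapper Jackson mapper used to parse both responses
     * @param es7Response  raw _bulk response body from ES 7
     * @param es6Response  raw _bulk response body from ES 6
     * @return the merged response as a JSON string
     * @throws IOException if either response cannot be parsed
     */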
    private String mergeEsBulkResponse(ObjectMapper objectMapper, String es7Response, String es6Response) throws IOException {
        JsonNode es7ResponseJson = objectMapper.readTree(es7Response);
        JsonNode es6ResponseJson = objectMapper.readTree(es6Response);
        if (es7ResponseJson.get("errors").asBoolean() || es6ResponseJson.get("errors").asBoolean()) {
            ((ObjectNode) es7ResponseJson).set("errors", BooleanNode.getTrue());
        }
        ((ObjectNode) es7ResponseJson).set("took", IntNode.valueOf(es7ResponseJson.get("took").asInt() + es6ResponseJson.get("took").asInt()));
        es6ResponseJson.get("items").elements().forEachRemaining(item -> ((ArrayNode) (es7ResponseJson.get("items"))).add(item));
        return es7ResponseJson.toString();
    }
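
Elasticsearch returns bulk items in the same order as the actions in the request, and some clients match items to requests by position. The merge above appends the ES 6 items after the ES 7 items, so the order can differ from the request. One possible refinement, sketched below under the assumption that the gateway records for each operation which cluster it was sent to, is to interleave the two item lists back into request order. The class `BulkItemMerger` and its parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper: restores the original request order of bulk response items.
class BulkItemMerger {
    /**
     * @param routedToEs6 one flag per operation in request order, true if it went to ES 6
     * @param es6Items    items from the ES 6 response, in the order they were sent
     * @param es7Items    items from the ES 7 response, in the order they were sent
     */
    static <T> List<T> mergeInRequestOrder(List<Boolean> routedToEs6, List<T> es6Items, List<T> es7Items) {
        if (routedToEs6.size() != es6Items.size() + es7Items.size()) {
            throw new IllegalArgumentException("item count does not match the number of operations");
        }
        Iterator<T> es6 = es6Items.iterator();
        Iterator<T> es7 = es7Items.iterator();
        List<T> merged = new ArrayList<>(routedToEs6.size());
        for (boolean toEs6 : routedToEs6) {
            // Take the next item from whichever cluster handled this operation.
            merged.add(toEs6 ? es6.next() : es7.next());
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> merged = mergeInRequestOrder(
                Arrays.asList(true, false, true),
                Arrays.asList("es6-item-1", "es6-item-2"),
                Arrays.asList("es7-item-1"));
        System.out.println(merged);
    }
}
```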

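Other notes

The original purpose was to verify connecting es7.10.2 and es6.8.14.

In practice the approach is not limited to es7.10.2 and es6.8.14: any pair of versions that remote cluster supports can be connected.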

https://www.elastic.co/guide/en/elasticsearch/reference/7.10/modules-remote-clusters.html

Version compatibility table (rows: remote cluster version, columns: local cluster version)

                  Local cluster
Remote cluster    5.0→5.5   5.6   6.0→6.6   6.7   6.8   7.0   7.1→7.x
5.0→5.5           Yes       Yes   No        No    No    No    No
5.6               Yes       Yes   Yes       Yes   Yes   No    No
6.0→6.6           No        Yes   Yes       Yes   Yes   No    No
6.7               No        Yes   Yes       Yes   Yes   Yes   No
6.8               No        Yes   Yes       Yes   Yes   Yes   Yes
7.0               No        No    No        Yes   Yes   Yes   Yes
7.1→7.x           No        No    No        No    Yes   Yes   Yes

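As far as I know, Elasticsearch officially requires no license for up to 100 nodes and charges beyond that; remote cluster could also be used to scale past 100 nodes (provided the performance issues are solved).

The project implements only a single remote, but it can be extended to multiple remotes as needed.

The basic scaffolding for rewriting _search URIs and dispatching _bulk is in place, so extending and adapting it is easy.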
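
For illustration, the _search rewrite could look roughly like the sketch below. It assumes the rewrite uses the cross-cluster search syntax, where an index on a remote cluster is addressed as alias:index. The class `SearchIndexRewriter`, the alias name es6 and the predicate for detecting old indices are placeholders, not the project's actual code.

```java
import java.util.Arrays;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical helper: prefixes old indices with the remote-cluster alias.
class SearchIndexRewriter {
    static String rewrite(String indexExpression, Predicate<String> isOldIndex, String remoteAlias) {
        return Arrays.stream(indexExpression.split(","))
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                // Old indices live on the remote (ES 6) cluster, new ones on the local cluster.
                .map(name -> isOldIndex.test(name) ? remoteAlias + ":" + name : name)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        String rewritten = rewrite(
                "filebeat_202101_log,filebeat_202104_log",
                name -> name.contains("202101"), // placeholder for the real old/new check
                "es6");
        System.out.println("/" + rewritten + "/_search");
    }
}
```

Supporting several remotes would then mostly mean returning a different alias per index instead of a single fixed one.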

posted @ 2021-03-28 16:01  cclient