Making a lower-version Flume compatible with a higher-version Elasticsearch

Flume releases come slowly, while Elasticsearch moves very fast, so switching to a newer Elasticsearch version quickly runs into compatibility problems.

apache-flume-1.6.0 pairs perfectly with elasticsearch 1.5.1; here the Elasticsearch side is upgraded to 6.3.2.

A low-version and a high-version Elasticsearch are connected in completely different ways, so the Sink has to be rewritten.

Download the Flume source code, flume-ng-sinks\flume-ng-elasticsearch-sink\ElasticSearchSink.java, and read through the original implementation.

I simply started a fresh project and rewrote it.

POM

<dependencies>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>3.8.1</version>
        <scope>test</scope>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.elasticsearch/elasticsearch -->
    <dependency>
        <groupId>org.elasticsearch</groupId>
        <artifactId>elasticsearch</artifactId>
        <version>6.4.3</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.elasticsearch.client/transport -->
    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>transport</artifactId>
        <version>6.4.3</version>
    </dependency>
    <!-- <dependency>
        <groupId>io.netty</groupId>
        <artifactId>netty-all</artifactId>
        <version>4.1.32.Final</version>
    </dependency> -->
    <!-- https://mvnrepository.com/artifact/org.apache.flume.flume-ng-sinks/flume-ng-elasticsearch-sink -->
    <dependency>
        <groupId>org.apache.flume.flume-ng-sinks</groupId>
        <artifactId>flume-ng-elasticsearch-sink</artifactId>
        <version>1.6.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.google.code.gson/gson -->
    <dependency>
        <groupId>com.google.code.gson</groupId>
        <artifactId>gson</artifactId>
        <version>2.8.5</version>
    </dependency>
</dependencies>

The rewritten Sink class

package com.jachs.sink.elasticsearch;

import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.CLUSTER_NAME;
import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.HOSTNAMES;
import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.INDEX_NAME;

import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.Map;

import org.apache.flume.Channel;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Transaction;
import org.apache.flume.conf.Configurable;
import org.apache.flume.sink.AbstractSink;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

public class ElasticSearchSink extends AbstractSink implements Configurable {
    private String hostNames;   // "host:port" of the Elasticsearch transport endpoint
    private String indexName;   // target index
    private String clusterName; // cluster.name of the Elasticsearch cluster
    static TransportClient client;

    // Read hostNames / indexName / clusterName from the Flume agent configuration.
    public void configure(Context context) {
        hostNames = context.getString(HOSTNAMES);
        indexName = context.getString(INDEX_NAME);
        clusterName = context.getString(CLUSTER_NAME);
    }

    @Override
    public void start() {
        // Build a 6.x transport client; PreBuiltTransportClient replaces the old 1.x constructor.
        Settings settings = Settings.builder().put("cluster.name", clusterName).build();
        try {
            String host = hostNames.split(":")[0];
            int port = Integer.parseInt(hostNames.split(":")[1]);
            client = new PreBuiltTransportClient(settings)
                    .addTransportAddress(new TransportAddress(InetAddress.getByName(host), port));
        } catch (UnknownHostException e) {
            e.printStackTrace();
        }
        super.start();
    }

    @Override
    public void stop() {
        // Release the transport client before shutting the sink down.
        if (client != null) {
            client.close();
        }
        super.stop();
    }

    public Status process() throws EventDeliveryException {
        Status status = null;
        Channel ch = getChannel();
        Transaction txn = ch.getTransaction();
        txn.begin();
        try {
            Event event = ch.take();
            if (event == null) {
                // The channel is empty at the moment: commit the empty transaction and back off.
                txn.commit();
                return Status.BACKOFF;
            }

            Map<String, String> head = event.getHeaders();
            Map<String, Object> map = new HashMap<String, Object>();

            // Flatten the event headers: each header key is written as "topic" and its value
            // as "timestamp" (the last header wins), and the event body is stored under "data".
            for (String key : head.keySet()) {
                map.put("topic", key);
                map.put("timestamp", head.get(key));
                map.put("data", new String(event.getBody()));
            }

            // Index the document synchronously under type "text".
            client.prepareIndex(indexName, "text").setSource(map).execute().actionGet();

            txn.commit();
            status = Status.READY;
        } catch (Throwable t) {
            txn.rollback();
            status = Status.BACKOFF;
            if (t instanceof Error) {
                throw (Error) t;
            }
        } finally {
            txn.close();
        }
        return status;
    }
}
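
Before wiring the sink into a Flume agent, it can help to sanity-check the 6.x transport client with a small standalone program. This is only a minimal sketch: the class name TransportClientCheck, the host, port, cluster name, and index name below are placeholder assumptions, not part of the original setup.

package com.jachs.sink.elasticsearch;

import java.net.InetAddress;
import java.util.Collections;

import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

public class TransportClientCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder cluster name and address; replace with your own values.
        Settings settings = Settings.builder().put("cluster.name", "my-es-cluster").build();
        TransportClient client = new PreBuiltTransportClient(settings)
                .addTransportAddress(new TransportAddress(InetAddress.getByName("127.0.0.1"), 9300));
        try {
            // Index one test document the same way the sink does (index + type + source map).
            IndexResponse response = client.prepareIndex("flume_index", "text")
                    .setSource(Collections.singletonMap("data", "hello from flume"))
                    .get();
            System.out.println(response.getResult()); // expect CREATED on success
        } finally {
            client.close();
        }
    }
}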
mvn install -DskipTests

Package it, then replace the old flume-ng-elasticsearch-sink jar under Flume's lib directory with the newly built jar.

Modify the Flume configuration file so that the sink type below points at your own class:

a1.sinks.elasticsearch.type=com.jachs.sink.elasticsearch.ElasticSearchSink
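
The sink reads its remaining settings through the keys defined in ElasticSearchSinkConstants (hostNames, indexName, clusterName). A minimal sketch of the sink section, with placeholder agent name, address, index, and cluster values:

a1.sinks.elasticsearch.type = com.jachs.sink.elasticsearch.ElasticSearchSink
a1.sinks.elasticsearch.hostNames = 192.168.1.100:9300
a1.sinks.elasticsearch.indexName = flume_index
a1.sinks.elasticsearch.clusterName = my-es-cluster
a1.sinks.elasticsearch.channel = c1

Note that start() above parses only a single host:port pair out of hostNames, so list just one transport address.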

My pipeline is FileBeat → Kafka → Flume → Elasticsearch, so data is taken from Kafka and written to Elasticsearch; adapt the source side to your own setup. Also copy the Kafka and Elasticsearch client jars into Flume's lib directory, and keep the jar versions consistent so they do not conflict.
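
For the source side, a sketch of what the Kafka half of the agent might look like with Flume 1.6.0's Kafka source; the component names, ZooKeeper address, and topic are placeholders, and the exact properties should be checked against the Flume 1.6.0 user guide. The sink section shown earlier completes the agent.

a1.sources = kafkaSource
a1.channels = c1
a1.sinks = elasticsearch

a1.sources.kafkaSource.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.kafkaSource.zookeeperConnect = 192.168.1.100:2181
a1.sources.kafkaSource.topic = filebeat-logs
a1.sources.kafkaSource.groupId = flume
a1.sources.kafkaSource.channels = c1

a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000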

Official references

http://flume.apache.org/releases/content/1.9.0/FlumeDeveloperGuide.html#sink
http://flume.apache.org/releases/content/1.6.0/apidocs/index.html

The Channel object is the pipeline: process() opens a Transaction on it, takes an Event whose headers and body carry the data delivered by the source, applies its own logic, sets the status code accordingly, and finally returns that status.

 
