低版本Flume兼容高版本elasticsearch

Flume更新比较慢,而elasticsearch更新非常快所以当涉及更换elasticsearch版本时会出现不兼容问题。

apache-flume-1.6.0+elasticsearch1.5.1是可以完美结合的,这里将elasticsearch版本升级到6.3.2。

低版本elasticsearch和高版本elasticsearch连接方式完全不一样所以需要重写Sink。

下载源码flume-ng-sinks\flume-ng-elasticsearch-sink\ElasticSearchSink.java,查看人家的源码。

我直接起个项目重写了

POM

<dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.elasticsearch/elasticsearch -->
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>6.4.3</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.elasticsearch.client/transport -->
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>transport</artifactId>
            <version>6.4.3</version>
        </dependency>
        <!-- <dependency> <groupId>io.netty</groupId> <artifactId>netty-all</artifactId> 
            <version>4.1.32.Final</version> </dependency> -->
        <!-- https://mvnrepository.com/artifact/org.apache.flume.flume-ng-sinks/flume-ng-elasticsearch-sink -->
        <dependency>
            <groupId>org.apache.flume.flume-ng-sinks</groupId>
            <artifactId>flume-ng-elasticsearch-sink</artifactId>
            <version>1.6.0</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/com.google.code.gson/gson -->
        <dependency>
            <groupId>com.google.code.gson</groupId>
            <artifactId>gson</artifactId>
            <version>2.8.5</version>
        </dependency>

    </dependencies>

重写的Sink类

package com.jachs.sink.elasticsearch;

import org.apache.flume.Channel;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Transaction;
import org.apache.flume.conf.Configurable;
import org.apache.flume.sink.AbstractSink;
import org.elasticsearch.action.index.IndexRequestBuilder;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.CLUSTER_NAME;
import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.INDEX_NAME;

import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.Map;

import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.HOSTNAMES;

public class ElasticSearchSink extends AbstractSink implements Configurable {
    private String hostNames;
    private String indexName;
    private String clusterName;
    static TransportClient client;

    public void configure(Context context) {
        hostNames = context.getString(HOSTNAMES);
        indexName = context.getString(INDEX_NAME);
        clusterName = context.getString(CLUSTER_NAME);
    }

    @Override
    public void start() {      
        Settings settings = Settings.builder().put("cluster.name", clusterName).build();
        try {
            client = new PreBuiltTransportClient(settings).addTransportAddress(new TransportAddress(
                    InetAddress.getByName(hostNames.split(":")[0]), Integer.parseInt(hostNames.split(":")[1])));
        } catch (UnknownHostException e) {
            e.printStackTrace();
        }

    }

    @Override
    public void stop() {
        super.stop();
    }

    public Status process() throws EventDeliveryException {
        Status status = null;
        Channel ch = getChannel();
        Transaction txn = ch.getTransaction();
        txn.begin();
        try {
            Event event = ch.take();
            Map<String, String> head = event.getHeaders();
            Map<String, Object> map = new HashMap<String, Object>();

            for (String key : head.keySet()) {
                map.put("topic", key);
                map.put("timestamp", head.get(key));
                map.put("data", new String(event.getBody()));
            }

            IndexRequestBuilder create = client.prepareIndex(indexName, "text").setSource(map);
            IndexResponse response = create.execute().actionGet();

            txn.commit();
            status = Status.READY;
        } catch (Throwable t) {
            txn.rollback();
            status = Status.BACKOFF;
            if (t instanceof Error) {
                throw (Error) t;
            }
        } finally {
            txn.close();
        }
        return status;
    }
}
mvn install -DskipTests

打包,然后将Flume下的flume-ng-kafka-sink.jar替换掉。

修改Flume配置文件将下面修改为自己的类位置

a1.sinks.elasticsearch.type=com.jachs.sink.elasticsearch.ElasticSearchSink

我这里使用的是FileBeat-kafka-flume-elasticsearch,所以是从kafka取数到elasticsearch,根据自己sources修改自己连接。然后将kafka和elasticsearch的jar包Copy到Flume下注意版本冲突保持JAR版本正确不要冲突。

官方参考

http://flume.apache.org/releases/content/1.9.0/FlumeDeveloperGuide.html#sink
http://flume.apache.org/releases/content/1.6.0/apidocs/index.html

Channel对象是管道,可以创建Transaction事务,采用回调方式将sources数据放进了Data,启动个Even事件,然后根据自己逻辑代码动态设置状态码最后返回状态码。

 

posted @ 2019-01-18 13:28  Jachs  阅读(1629)  评论(1编辑  收藏  举报