arroyo+redpanda 集成试用

arroyo 对于kafka 有着很不错的集成支持(目前版本可以说是优先支持的),使用原生kafka 是一个选择,但是部署以及管理感觉比较费事
以前简单介绍过redpanda,所以尝试下集成

环境准备

  • docker-compose
    包含了redpandadata console,connect,以及一个测试,对于arroyo 使用了与redpandadata 同一个网络
 
version: '3'
networks:
  redpanda_network:
    driver: bridge
volumes:
  redpanda: null
services:
  redpanda:
    image: docker.redpanda.com/redpandadata/redpanda:v23.1.4
    command:
      - redpanda start
      - --smp 1
      - --overprovisioned
      - --kafka-addr PLAINTEXT://0.0.0.0:29092,OUTSIDE://0.0.0.0:9092
      - --advertise-kafka-addr PLAINTEXT://redpanda:29092,OUTSIDE://localhost:9092
      - --pandaproxy-addr 0.0.0.0:8082
      - --advertise-pandaproxy-addr localhost:8082
    ports:
      - 8081:8081
      - 8082:8082
      - 9092:9092
      - 9644:9644
      - 29092:29092
    volumes:
      - redpanda:/var/lib/redpanda/data
    networks:
      - redpanda_network
 
  console:
    image: docker.redpanda.com/redpandadata/console:v2.2.3
    entrypoint: /bin/sh
    command: -c "echo \"$$CONSOLE_CONFIG_FILE\" > /tmp/config.yml; /app/console"
    environment:
      CONFIG_FILEPATH: /tmp/config.yml
      CONSOLE_CONFIG_FILE: |
        kafka:
          brokers: ["redpanda:29092"]
          schemaRegistry:
            enabled: true
            urls: ["http://redpanda:8081"]
        redpanda:
          adminApi:
            enabled: true
            urls: ["http://redpanda:9644"]
        connect:
          enabled: true
          clusters:
            - name: local-connect-cluster
              url: http://connect:8083
    ports:
      - 8080:8080
    networks:
      - redpanda_network
    depends_on:
      - redpanda
 
  owl-shop:
    image: quay.io/cloudhut/owl-shop:latest
    networks:
      - redpanda_network
    #platform: 'linux/amd64'
    environment:
      - SHOP_KAFKA_BROKERS=redpanda:29092
      - SHOP_KAFKA_TOPICREPLICATIONFACTOR=1
      - SHOP_TRAFFIC_INTERVAL_RATE=1
      - SHOP_TRAFFIC_INTERVAL_DURATION=0.1s
    depends_on:
      - redpanda
 
  connect:
    image: docker.redpanda.com/redpandadata/connectors:latest
    hostname: connect
    container_name: connect
    networks:
      - redpanda_network
    #platform: 'linux/amd64'
    depends_on:
      - redpanda
    ports:
      - "8083:8083"
    environment:
      CONNECT_CONFIGURATION: |
          key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
          value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
          group.id=connectors-cluster
          offset.storage.topic=_internal_connectors_offsets
          config.storage.topic=_internal_connectors_configs
          status.storage.topic=_internal_connectors_status
          config.storage.replication.factor=-1
          offset.storage.replication.factor=-1
          status.storage.replication.factor=-1
          offset.flush.interval.ms=1000
          producer.linger.ms=50
          producer.batch.size=131072
      CONNECT_BOOTSTRAP_SERVERS: redpanda:29092
      CONNECT_GC_LOG_ENABLED: "false"
      CONNECT_HEAP_OPTS: -Xms512M -Xmx512M
      CONNECT_LOG_LEVEL: info
  app:
     image: ghcr.io/arroyosystems/arroyo-single:multi-arch
     networks:
      - redpanda_network
     ports:
       - "8000:8000"
       - "8001:8001"

启动&测试

  • 启动
docker-compose up -d
  • 创建kafka topic

 

 

  • 配置arroyo connections、source 以及sink
    我们测试的时候因为部署了一个owl-shop 服务,可以直接将owl-shop 的topic 作为source,然后写入到demo topic中(sink)
    connections 配置

 

 

kafka source 配置

 

 

json schema 定义


 
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "Product",
  "type": "object",
  "properties": {
    "version": { "type": "number" },
    "id": { "type": "string" },
    "customer": {
      "type": "object",
      "properties": { "id": { "type": "string" }, "type": { "type": "string" } }
    },
    "type": { "type": "string" },
    "firstName": { "type": "string" },
    "lastName": { "type": "string" },
    "state": { "type": "string" },
    "street": { "type": "string" },
    "houseNumber": { "type": "string" },
    "city": { "type": "string" },
    "zip": { "type": "string" },
    "latitude": { "type": "number" },
    "longitude": { "type": "number" },
    "phone": { "type": "string" },
    "additionalAddressInfo": { "type": "string" },
    "createdAt": { "type": "string" },
    "revision": { "type": "number" }
  }
}

 

 


kafka sink 创建

 

 

  • 创建job (pipeline)

sql 查询

job 执行

redpanda console 效果

 

 

说明

arroyo+redpanda 的pipeline 还是一种不错的选择,至少redpanda 部署以及使用还是比较简单的,而且支持不少kafaka 周边的兼容,是一个不错的选择,
arroyo 测试目前来说支持的source 以及sink 不是很多(核心还是kafaka)但是基于sql 的处理模式真的很方便,相比kafaka 的stream 简单不少,值得尝试下

参考资料

https://github.com/rongfengliang/arroyo-redpanda-learning
https://www.cnblogs.com/rongfengliang/p/17302703.html
https://github.com/ArroyoSystems/arroyo
https://github.com/redpanda-data/redpanda
https://redpanda.com/

posted on   荣锋亮  阅读(287)  评论(0编辑  收藏  举报

相关博文:
阅读排行:
· 全程不用写代码,我用AI程序员写了一个飞机大战
· DeepSeek 开源周回顾「GitHub 热点速览」
· 记一次.NET内存居高不下排查解决与启示
· MongoDB 8.0这个新功能碉堡了,比商业数据库还牛
· .NET10 - 预览版1新功能体验(一)
历史上的今天:
2021-04-11 cube.js 配置自定义basePath 扩展cube.js 多租户处理
2021-04-11 cube.js 测试Query 的方法
2021-04-11 apache kylin 大数据olap 方案
2021-04-11 k6 如何进行api 测试(demo)
2021-04-11 k6 运行大规模测试
2020-04-11 几款不错的java规则引擎
2019-04-11 Announcing the Operate Preview Release: Monitoring and Managing Cross-Microservice Workflows

导航

< 2025年3月 >
23 24 25 26 27 28 1
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31 1 2 3 4 5
点击右上角即可分享
微信分享提示