吹静静

Questions and discussion are welcome on QQ: 592590682

Test environment:

CDH 5.15.1

CentOS 7

Python 3.7.0

kafka 1.1.1

kafka-python: https://pypi.org/project/kafka-python/#files

Goal:

Use a Python thread to repeatedly fetch data from a given interface and keep sending that data to the Kafka service.
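As a rough outline of this goal, the sketch below runs a polling loop in a background thread. `fetch` and `send` are hypothetical stand-ins for the interface request and for `producer.send`, so nothing here depends on Kafka being installed. The names `poll_and_forward`, `start_poller` and `interval` are assumptions made for illustration, not part of any library.

```python
import threading


def poll_and_forward(fetch, send, stop_event, interval=5.0):
    """Call fetch() repeatedly and pass each non-empty result to send().

    fetch and send are hypothetical callables standing in for the
    interface request and for producer.send.
    """
    while not stop_event.is_set():
        msg = fetch()
        if msg:  # skip failed or empty reads instead of crashing
            send(msg.encode("utf-8"))
        # wait() returns early once stop_event is set, unlike time.sleep()
        stop_event.wait(interval)


def start_poller(fetch, send, interval=5.0):
    """Start the loop in a daemon thread and return (thread, stop_event)."""
    stop_event = threading.Event()
    worker = threading.Thread(
        target=poll_and_forward,
        args=(fetch, send, stop_event, interval),
        daemon=True,
    )
    worker.start()
    return worker, stop_event
```

To stop the loop, call `stop_event.set()` and then `worker.join()`. Using an `Event` instead of a bare `while 1` gives the thread a clean way to exit.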

Step 1:

First, download and install kafka-python.

Then run a simple test of calling Kafka from Python.

Open a python3 interpreter:

>>> from kafka import KafkaProducer
>>> producer = KafkaProducer(bootstrap_servers=["master:9092"])
>>> producer.send("test",b"Hello world")
<kafka.producer.future.FutureRecordMetadata object at 0x7f4bf56fbda0>
>>> producer.send("test",b"Hello world")
<kafka.producer.future.FutureRecordMetadata object at 0x7f4bf5715438>

Start a Kafka console consumer:

kafka-console-consumer  --zookeeper master:2181 --from-beginning --topic test

Output:

Hello world
Hello world
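
`send()` is asynchronous, which is why the interpreter prints a `FutureRecordMetadata` object rather than a result. In a script, as opposed to an interactive session, it can help to block on that future before exiting so the message is known to have reached the broker. A minimal sketch, assuming the broker at `master:9092` and the `test` topic used above; `send_and_confirm` is a hypothetical helper name, not part of kafka-python:

```python
def send_and_confirm(producer, topic, payload, timeout=10):
    """Send one message and wait until the broker acknowledges it.

    producer is expected to behave like kafka.KafkaProducer: send() returns
    a future whose get(timeout=...) yields record metadata or raises.
    """
    future = producer.send(topic, payload)
    metadata = future.get(timeout=timeout)  # blocks; raises on failure
    return metadata.topic, metadata.partition, metadata.offset


def main():
    # Imported here so the helper above can be tried without a broker.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=["master:9092"])
    try:
        print(send_and_confirm(producer, "test", b"Hello world"))
    finally:
        producer.close()
```

Calling `main()` against a running broker prints the topic, partition and offset of the stored message.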

Step 2:

Code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @File  : ParsePS.py
# @Author: cjj
# @Date  : 2019/6/4
# @Desc  : Request the interface, fetch the data, and clean it


import re
import threading
import time
from urllib.error import URLError

from kafka import KafkaProducer
from kafka.errors import KafkaError
from suds.client import Client

class Data_clean:
    # Fetch the data for one observation point and clean it
    @staticmethod
    def get_data(observation_point_name):

        try:
            # Fetch the data from the interface
            user_url = 'http://xxx.xxx.xxx.xxx/ServiceSL/ServiceGetInsqlData.svc?wsdl'
            client = Client(user_url)
            result = client.service.GetSingleTagInfo(observation_point_name)
            # 1. Clean the data
            # 1.1 Convert the result to a string
            str1 = str(result)
            # 1.2 Extract everything inside double quotes and convert the list to a string
            pattern = re.compile('"(.*)"')
            str2 = str(pattern.findall(str1))
            # 1.3 Remove the single quotes
            str3 = str2.replace('\'', '')
            # 1.4 Replace the comma separators with tabs
            str4 = str3.replace(', ', '\t')
            # 1.5 Strip the surrounding []
            str5 = str4[1:-1]

            return str5
        except TimeoutError as e:
            print("\033[1;31;0m>>>>>>TimeoutError ->->->->->-> request to the interface timed out<<<<<<\033[0m")
        except URLError as e:
            print("\033[1;31;0m>>>>>>URLError ->->->->->-> cannot connect to the SQL server<<<<<<\033[0m")
        except Exception as e:
            print("\033[1;31;0m>>>>>>failed for another reason<<<<<<\033[0m", e)
        # Every failure path falls through to here
        return None

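producer = None
try:
    producer = KafkaProducer(bootstrap_servers='master:9092')
    while True:

        msg = Data_clean.get_data("SLWS_ps_1hzybqz_WD.PV")
        print(msg)

        # get_data returns None when the request fails; skip that round
        if msg is not None:
            # Send the data to Kafka, specifying the topic and the payload
            producer.send('test', msg.encode('utf-8'))
        time.sleep(5)

except KafkaError as e:
    print(e)
finally:
    # producer is still None if the constructor itself failed
    if producer is not None:
        producer.close()
    print('done!!!')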
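
The cleaning steps in `get_data` (extract the quoted values, then join them with tabs) can also be written without converting the list to a string first. A sketch, assuming the service result prints as one `key = "value"` pair per line; the sample text below is made up for illustration and is not real output from the interface:

```python
import re

QUOTED = re.compile(r'"(.*)"')


def clean_result(raw):
    """Return the double-quoted values found in raw, joined by tabs."""
    return "\t".join(QUOTED.findall(raw))


# Hypothetical sample; the real response format may differ.
sample = 'TagName = "SLWS_ps_1hzybqz_WD.PV"\nValue = "23.5"'
print(clean_result(sample))
```

Joining the matches directly also avoids altering values that happen to contain a single quote or a comma followed by a space.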

Upload the code to the Linux server.

Run the code: python3 ParsePS.py

Check the result in the Kafka consumer.
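
The console consumer started in step 1 will show the new messages as they arrive. As an alternative, the topic can be read from Python with kafka-python's `KafkaConsumer`. A minimal sketch, assuming the same `master:9092` broker and `test` topic; `read_messages` is a hypothetical helper name:

```python
def read_messages(records, limit=10):
    """Decode up to limit record values from an iterable of consumer records."""
    decoded = []
    for record in records:
        decoded.append(record.value.decode("utf-8"))
        if len(decoded) >= limit:
            break
    return decoded


def main():
    # Imported here so read_messages can be tried without a broker.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "test",
        bootstrap_servers=["master:9092"],
        auto_offset_reset="earliest",  # start from the oldest message if no offset is committed
        consumer_timeout_ms=10000,     # stop iterating after 10 s without new messages
    )
    try:
        for line in read_messages(consumer):
            print(line)
    finally:
        consumer.close()
```

Each printed line should be one tab-separated record produced by ParsePS.py.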

 

Posted on 2019-06-12 10:05 by 吹静静