Using Python to add data to a Hive database and write query statistics into a MySQL database

Versions used:

Python 3.9.9

PyCharm 20.1.3

Environment:

Libraries:

requests 

sasl 

thrift

thrift-sasl 

PyHive

Installing these may print errors, but in my case the libraries still worked.
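Since installation can print errors even when the libraries end up usable, a quick import check can confirm what is actually available. This is only a minimal sketch: the helper name `missing_modules` is hypothetical, and the import names (`thrift_sasl`, `pyhive`) are assumed to be the module names that the packages listed above install.

```python
import importlib.util

# Import names assumed to correspond to the packages listed above.
REQUIRED_MODULES = ["requests", "sasl", "thrift", "thrift_sasl", "pyhive"]


def missing_modules(names):
    """Return the names from `names` that Python cannot locate for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All required modules can be imported.")
```

If the list comes back empty, the install errors were probably harmless for this script.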

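Remember to add the imports and change the parameters to match your own setup. The MySQL code also needs the mysql-connector-python package, which provides mysql.connector.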

The Hive connection parameters are below.

from pyhive import hive

hive_conn = hive.Connection(host='192.168.88.151', port=10000, username='root', database='py')
hive_cursor = hive_conn.cursor()

The MySQL connection parameters are below.

import mysql.connector

mysql_conn = mysql.connector.connect(
    host="localhost",
    user="root",
    password="root",
    database="amtls"
)
mysql_cursor = mysql_conn.cursor()
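Because the connection parameters above have to be changed for every environment, one option is to read them from environment variables instead of editing the script. The sketch below is an assumption, not part of the original code: the function name `load_db_settings` and the variable names (`HIVE_HOST`, `MYSQL_PASSWORD`, and so on) are hypothetical, and the defaults simply mirror the values used in this post.

```python
import os


def load_db_settings(env=os.environ):
    """Build keyword arguments for the Hive and MySQL connections.

    The environment variable names are hypothetical; the defaults mirror
    the hard-coded values used in this post.
    """
    hive_settings = {
        "host": env.get("HIVE_HOST", "192.168.88.151"),
        "port": int(env.get("HIVE_PORT", "10000")),
        "username": env.get("HIVE_USER", "root"),
        "database": env.get("HIVE_DATABASE", "py"),
    }
    mysql_settings = {
        "host": env.get("MYSQL_HOST", "localhost"),
        "user": env.get("MYSQL_USER", "root"),
        "password": env.get("MYSQL_PASSWORD", "root"),
        "database": env.get("MYSQL_DATABASE", "amtls"),
    }
    return hive_settings, mysql_settings


if __name__ == "__main__":
    hive_settings, mysql_settings = load_db_settings()
    print(hive_settings)
    # Leave the password out of anything that gets printed or logged.
    print({k: v for k, v in mysql_settings.items() if k != "password"})
```

The two dictionaries can then be unpacked into the same calls used above, for example `hive.Connection(**hive_settings)` and `mysql.connector.connect(**mysql_settings)`.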

 

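Below are the statements that query Hive and insert the query results into the MySQL database. They are for reference only. You can have an AI generate code for your own requirements; Python code like this is quick to generate.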


# Function to execute a Hive query and insert results into MySQL
def transfer_data(hive_query, mysql_table):
    try:
        # Execute Hive query
        hive_cursor.execute(hive_query)
        results = hive_cursor.fetchall()
    except Exception as e:
        print(f"Error executing Hive query: {e}")
        raise

    # Prepare and execute MySQL insert queries based on the table name
    if mysql_table == 'popular_contents':
        insert_query = "INSERT INTO popular_contents (type, id, views) VALUES (%s, %s, %s)"
    elif mysql_table == 'top_courses_by_city':
        insert_query = "INSERT INTO top_courses_by_city (ip, id, views) VALUES (%s, %s, %s)"
    elif mysql_table == 'top_courses_by_traffic':
        insert_query = "INSERT INTO top_courses_by_traffic (id, total_traffic) VALUES (%s, %s)"
    else:
        raise ValueError("Unknown table name")

    try:
        for row in results:
            mysql_cursor.execute(insert_query, row)
        mysql_conn.commit()
    except Exception as e:
        print(f"Error inserting into MySQL: {e}")
        raise
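As a possible variation on the function above, the table-to-statement mapping can live in a dictionary, the table name can be checked before the Hive query runs, and the rows can be inserted in one `executemany` call with a rollback on failure. This is a sketch under assumptions rather than the original approach: the name `transfer_rows` and the choice to pass the cursor and connection as arguments are hypothetical.

```python
# Insert statements keyed by target table, taken from the function above.
INSERT_STATEMENTS = {
    "popular_contents":
        "INSERT INTO popular_contents (type, id, views) VALUES (%s, %s, %s)",
    "top_courses_by_city":
        "INSERT INTO top_courses_by_city (ip, id, views) VALUES (%s, %s, %s)",
    "top_courses_by_traffic":
        "INSERT INTO top_courses_by_traffic (id, total_traffic) VALUES (%s, %s)",
}


def transfer_rows(hive_cursor, mysql_conn, hive_query, mysql_table):
    """Run `hive_query` and insert its rows into `mysql_table`.

    Returns the number of rows fetched from Hive.
    """
    # Validate the table name first so no Hive query runs for a bad target.
    if mysql_table not in INSERT_STATEMENTS:
        raise ValueError(f"Unknown table name: {mysql_table}")
    insert_query = INSERT_STATEMENTS[mysql_table]

    hive_cursor.execute(hive_query)
    rows = hive_cursor.fetchall()

    mysql_cursor = mysql_conn.cursor()
    try:
        # Send all rows in one call, then commit once.
        mysql_cursor.executemany(insert_query, rows)
        mysql_conn.commit()
    except Exception:
        # Undo any partial insert before re-raising the error.
        mysql_conn.rollback()
        raise
    finally:
        mysql_cursor.close()
    return len(rows)
```

With the connections from this post it would be called as `transfer_rows(hive_cursor, mysql_conn, query, 'popular_contents')`.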


# Transfer the first analysis result - Top 10 videos/articles by views
transfer_data("""
SELECT type, id, SUM(traffic) AS views
FROM data
WHERE type IN ('video', 'article')
GROUP BY type, id
ORDER BY views DESC
LIMIT 10
""", 'popular_contents')

# Transfer the second analysis result - Top 10 courses by city using IP as city name
transfer_data("""
SELECT ip, id, SUM(traffic) AS views
FROM data
GROUP BY ip, id
ORDER BY views DESC
LIMIT 10
""", 'top_courses_by_city')

# Transfer the third analysis result - Top 10 courses by traffic
transfer_data("""
SELECT id, SUM(traffic) AS total_traffic
FROM data
GROUP BY id
ORDER BY total_traffic DESC
LIMIT 10
""", 'top_courses_by_traffic')

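To make clear what the aggregation queries above compute, here is the third one (sum traffic per id, sort descending, keep the top rows) expressed in plain Python. It is only an illustration: the function name `top_by_traffic` and the sample rows are made up and do not come from the real `data` table.

```python
from collections import defaultdict


def top_by_traffic(rows, limit=10):
    """Mimic: SELECT id, SUM(traffic) ... GROUP BY id ORDER BY ... DESC LIMIT n."""
    totals = defaultdict(int)
    for course_id, traffic in rows:
        totals[course_id] += traffic  # GROUP BY id with SUM(traffic)
    # ORDER BY total_traffic DESC, then LIMIT
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


if __name__ == "__main__":
    # Hypothetical (id, traffic) rows, for illustration only.
    sample_rows = [("c1", 100), ("c2", 250), ("c1", 300), ("c3", 50)]
    print(top_by_traffic(sample_rows, limit=2))  # [('c1', 400), ('c2', 250)]
```

In practice Hive should do this work, since it aggregates the data where it is stored and only the top rows travel to Python.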
Finally, close the connections.

# Close connections
hive_cursor.close()
hive_conn.close()
mysql_cursor.close()
mysql_conn.close()
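The explicit close calls above are skipped if an earlier statement raises. One way to guarantee cleanup is `contextlib.closing`, which calls `close()` when the block exits, even on error. The sketch below is an assumption layered on top of the post: the helper `run_with_connections` and its callable arguments are hypothetical.

```python
from contextlib import closing


def run_with_connections(open_hive, open_mysql, work):
    """Open both connections, run work(hive_conn, mysql_conn), always close them.

    `open_hive` and `open_mysql` are zero-argument callables supplied by the
    caller, each returning an object that has a close() method.
    """
    with closing(open_hive()) as hive_conn, closing(open_mysql()) as mysql_conn:
        return work(hive_conn, mysql_conn)
```

With the libraries from this post, the callables could be `lambda: hive.Connection(...)` and `lambda: mysql.connector.connect(...)`, and `work` would create its cursors, run the transfers, and close the cursors itself.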

 

Posted by HA_wind
