Using Python to insert data into a Hive database, query it, and write the statistics to a MySQL database
Versions used:
python3.9.9
pycharm20.1.3
Environment:
Libraries:
requests
sasl
thrift
thrift-sasl
PyHive
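Errors may be reported after installation, but in my case they did not affect usage.
Remember to add the imports and change the parameters to your own. Note that the MySQL code below also needs mysql-connector-python, which provides the mysql.connector module.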
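Since installation may print errors, a quick check of which modules can actually be imported might help. The following is a minimal sketch, not part of the original script: the helper name missing_modules is made up for illustration, and the list assumes the import names sasl, thrift, thrift_sasl, pyhive, requests and mysql.connector.

```python
import importlib.util

# Import names (not pip package names) for the libraries used in this post.
# "mysql.connector" is provided by the mysql-connector-python package.
REQUIRED_MODULES = [
    "requests", "sasl", "thrift", "thrift_sasl", "pyhive", "mysql.connector",
]


def missing_modules(names):
    """Return the entries of `names` that cannot be located for import.

    Hypothetical helper for illustration; it only locates modules and
    does not verify that they work against a real Hive server.
    """
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "mysql") is itself missing.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Missing modules:", missing_modules(REQUIRED_MODULES))
```

If the printed list is empty, the installation errors were probably harmless for this script.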
The Hive connection parameters are below:
from pyhive import hive
hive_conn = hive.Connection(host='192.168.88.151', port=10000, username='root', database='py')
hive_cursor = hive_conn.cursor()
The MySQL connection parameters are below:
import mysql.connector
mysql_conn = mysql.connector.connect(
    host="localhost",
    user="root",
    password="root",
    database="amtls"
)
mysql_cursor = mysql_conn.cursor()
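Because everyone has to change these parameters, one option is to read them from environment variables instead of editing the script. This is only a sketch: the function name connection_params and the variable names such as HIVE_HOST and MYSQL_PASSWORD are assumptions made up for illustration, and the defaults are simply the values used in this post.

```python
import os


def connection_params(env=None):
    """Build keyword arguments for the Hive and MySQL connections.

    Hypothetical helper: the environment variable names are invented
    for this example; the defaults are the values from this post.
    """
    env = os.environ if env is None else env
    hive_params = {
        "host": env.get("HIVE_HOST", "192.168.88.151"),
        "port": int(env.get("HIVE_PORT", "10000")),
        "username": env.get("HIVE_USER", "root"),
        "database": env.get("HIVE_DATABASE", "py"),
    }
    mysql_params = {
        "host": env.get("MYSQL_HOST", "localhost"),
        "user": env.get("MYSQL_USER", "root"),
        "password": env.get("MYSQL_PASSWORD", "root"),
        "database": env.get("MYSQL_DATABASE", "amtls"),
    }
    return hive_params, mysql_params
```

The two dictionaries can then be unpacked into the calls, for example hive.Connection(**hive_params) and mysql.connector.connect(**mysql_params).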
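Below is the code that queries Hive and inserts the query results into the MySQL database. It is for reference only; you can have an AI generate code to suit your own needs, and this kind of Python code is quick to generate.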
# Function to execute a Hive query and insert results into MySQL
def transfer_data(hive_query, mysql_table):
    try:
        # Execute Hive query
        hive_cursor.execute(hive_query)
        results = hive_cursor.fetchall()
    except Exception as e:
        print(f"Error executing Hive query: {e}")
        raise
    # Prepare and execute MySQL insert queries based on the table name
    if mysql_table == 'popular_contents':
        insert_query = "INSERT INTO popular_contents (type, id, views) VALUES (%s, %s, %s)"
    elif mysql_table == 'top_courses_by_city':
        insert_query = "INSERT INTO top_courses_by_city (ip, id, views) VALUES (%s, %s, %s)"
    elif mysql_table == 'top_courses_by_traffic':
        insert_query = "INSERT INTO top_courses_by_traffic (id, total_traffic) VALUES (%s, %s)"
    else:
        raise ValueError("Unknown table name")
    try:
        for row in results:
            mysql_cursor.execute(insert_query, row)
        mysql_conn.commit()
    except Exception as e:
        print(f"Error inserting into MySQL: {e}")
        raise
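As a variation on the function above, the table-to-statement mapping can live in a dictionary and the rows can be sent in one executemany call, with a rollback if the insert fails. This is a sketch under assumptions, not the original method: the name transfer_rows is made up, and the cursors and connection are passed in as arguments so the function can be tried without a live database.

```python
# Map each target table to its parameterized INSERT statement
# (the same three statements used in transfer_data).
INSERT_QUERIES = {
    "popular_contents":
        "INSERT INTO popular_contents (type, id, views) VALUES (%s, %s, %s)",
    "top_courses_by_city":
        "INSERT INTO top_courses_by_city (ip, id, views) VALUES (%s, %s, %s)",
    "top_courses_by_traffic":
        "INSERT INTO top_courses_by_traffic (id, total_traffic) VALUES (%s, %s)",
}


def transfer_rows(source_cursor, target_cursor, target_conn, query, table):
    """Run `query` on the source cursor and bulk-insert the rows into `table`.

    Hypothetical helper: returns the number of rows transferred.
    """
    if table not in INSERT_QUERIES:
        raise ValueError(f"Unknown table name: {table}")
    insert_query = INSERT_QUERIES[table]

    source_cursor.execute(query)
    rows = source_cursor.fetchall()

    try:
        if rows:
            # One call for all rows instead of one execute() per row.
            target_cursor.executemany(insert_query, rows)
        target_conn.commit()
    except Exception:
        # Undo a partially applied batch before re-raising.
        target_conn.rollback()
        raise
    return len(rows)
```

With the connections from this post it would be called as transfer_rows(hive_cursor, mysql_cursor, mysql_conn, hive_query, 'popular_contents').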
# Transfer the first analysis result - Top 10 videos/articles by views
transfer_data("""
SELECT type, id, SUM(traffic) AS views
FROM data
WHERE type IN ('video', 'article')
GROUP BY type, id
ORDER BY views DESC
LIMIT 10
""", 'popular_contents')
# Transfer the second analysis result - Top 10 courses by city using IP as city name
transfer_data("""
SELECT ip, id, SUM(traffic) AS views
FROM data
GROUP BY ip, id
ORDER BY views DESC
LIMIT 10
""", 'top_courses_by_city')
# Transfer the third analysis result - Top 10 courses by traffic
transfer_data("""
SELECT id, SUM(traffic) AS total_traffic
FROM data
GROUP BY id
ORDER BY total_traffic DESC
LIMIT 10
""", 'top_courses_by_traffic')
Finally, close the connections:
# Close connections
hive_cursor.close()
hive_conn.close()
mysql_cursor.close()
mysql_conn.close()
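The close calls above are skipped if transfer_data raises first. One possible way to make sure everything is closed is contextlib.closing together with ExitStack. The sketch below assumes a made-up wrapper named run_with_cleanup that receives two functions for opening the connections and one job function; none of these names come from the original script.

```python
from contextlib import ExitStack, closing


def run_with_cleanup(open_hive, open_mysql, job):
    """Open both connections, run `job`, and always close everything.

    Hypothetical wrapper: `open_hive` and `open_mysql` are zero-argument
    callables returning connections; `job` receives
    (hive_cursor, mysql_cursor, mysql_conn).
    """
    with ExitStack() as stack:
        # closing() calls .close() on exit, even if job() raises.
        hive_conn = stack.enter_context(closing(open_hive()))
        mysql_conn = stack.enter_context(closing(open_mysql()))
        hive_cursor = stack.enter_context(closing(hive_conn.cursor()))
        mysql_cursor = stack.enter_context(closing(mysql_conn.cursor()))
        return job(hive_cursor, mysql_cursor, mysql_conn)
```

ExitStack unwinds in reverse order, so the cursors are closed before the connections that created them.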