cup_leo

Summary: #!/bin/bash # set the Spark parameters to match your environment hive -e " SET mapreduce.job.queuename=batch; set hive.execution.engine=spark; set spark.executor.memory=4g; set spark.exe ...

posted @ 2021-12-29 09:39 cup_leo Views (587) Comments (0) Recommendations (0)

December 15, 2021

Hive to MySQL: importing Hive data into MySQL

Summary: date=`date -d "-1 day" +%F` spark-submit \ --name "suanfa_zjk_tyc_tjy_export" \ --master yarn \ --deploy-mode cluster \ --driver-memory 3G \ --executo ...

posted @ 2021-12-15 14:28 cup_leo Views (10) Comments (0) Recommendations (0)

December 10, 2021

Thoughts on machine learning optimization

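Summary:
1. Questions you cannot avoid when optimizing machine learning models - Cloud+ Community - Tencent Cloud (tencent.com)
2. How should machine learning models be tuned? Three major improvement strategies (thepaper.cn)
3. Hyperparameter optimization for machine learning models (baidu.com)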

posted @ 2021-12-10 17:57 cup_leo Views (19) Comments (0) Recommendations (0)

Feature engineering in Python: generating variable names in bulk

Summary: features = [] diff_windowns = [1,3,6,12] groups = ['sum','mean','std','max','min','count'] for d in diff_windowns: exec("""last_{}_month = df[df['diff ...

posted @ 2021-12-10 11:42 cup_leo Views (454) Comments (0) Recommendations (0)
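
The feature-engineering summary above is cut off partway through its exec() call, so the full method is not visible here. As a rough sketch only, and not the post's actual code, the same kind of bulk naming can be done without exec() by building the names as strings and keeping the per-window objects in a dict. The window list, the aggregation list, and the last_{}_month pattern come from the summary. The _{aggregation} suffix and the helper build_feature_names are assumptions made for illustration.

```python
from itertools import product

# Window sizes and aggregations copied from the summary
# (the spelling `diff_windowns` is the original identifier).
diff_windowns = [1, 3, 6, 12]
groups = ['sum', 'mean', 'std', 'max', 'min', 'count']


def build_feature_names(windows, aggs, template="last_{}_month_{}"):
    """Return one feature name per (window, aggregation) pair.

    The "_{aggregation}" suffix in the template is a hypothetical naming
    scheme; only the "last_{}_month" part appears in the summary.
    """
    return [template.format(d, g) for d, g in product(windows, aggs)]


features = build_feature_names(diff_windowns, groups)

# Instead of creating variables with exec(), keep one entry per window in
# a dict keyed by the generated name. The None values are placeholders for
# whatever filtered DataFrame the full post builds for each window.
frames = {"last_{}_month".format(d): None for d in diff_windowns}

print(len(features))
print(features[:3])
```

A dict keeps the generated names inspectable and avoids executing strings as code, at the cost of writing frames["last_3_month"] instead of a bare variable name.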

December 9, 2021

Spark performance tuning: basic and advanced guides, reposted from the Meituan technical team

Summary:
Spark Performance Optimization Guide: Basics - Meituan Technical Team (meituan.com)
Spark Performance Optimization Guide: Advanced - Meituan Technical Team (meituan.com)

posted @ 2021-12-09 17:05 cup_leo Views (92) Comments (0) Recommendations (0)
