Solving the dynamic partition problem
| In dataClean.sh, create a dynamic date variable before cleaning the data |
| timeStr=`date -d "yesterday" "+%Y%m%d"` |
| |
| In dataAnaly.sh, run |
| yesterday=`date -d "yesterday" "+%Y%m%d"` |
| hive --hiveconf yesterday=${yesterday} -f /opt/project/dataClean/Script-1.sql |
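Both scripts must produce the same date string, otherwise the Hive partition name will not match the directory that was cleaned. The following is a minimal sketch, assuming GNU `date`; the helper names `partition_date` and `is_valid_partition` are hypothetical and do not appear in the original scripts. It formats yesterday's date as `YYYYMMDD` and checks that the result parses back to the same calendar date.

```shell
#!/bin/bash
# Hypothetical helper (not in the original scripts): print yesterday's date
# as YYYYMMDD. Relies on the -d option of GNU date.
partition_date() {
    date -d "yesterday" "+%Y%m%d"
}

# Hypothetical helper: succeed only if the argument is eight digits AND
# GNU date parses it back to the same string (i.e. it is a real date).
is_valid_partition() {
    [[ "$1" =~ ^[0-9]{8}$ ]] || return 1
    [[ "$(date -d "$1" "+%Y%m%d" 2>/dev/null)" == "$1" ]]
}

yesterday=$(partition_date)
is_valid_partition "$yesterday" && echo "partition value: $yesterday"
```

A check like this catches a swapped day and month only when the day is greater than 12, so it complements rather than replaces using the same format string in both scripts.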
Data cleaning script
| [root@node1 dataClean]# pwd |
| /opt/project/dataClean |
| [root@node1 dataClean]# cat dataClean.sh |
| |
| |
| echo "=============================Project data cleaning started successfully==============================" |
| timeStr=`date -d "yesterday" "+%Y%m%d"` |
| |
| inpath="/project/$timeStr/*" |
| echo "Input path for the MR cleaning job is defined; cleaning data under $inpath" |
| outpath="/dataClean" |
| hadoop jar /opt/project/dataClean/dataClean.jar DataCleanDriver $inpath $outpath |
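Because `outpath` is fixed at `/dataClean`, a MapReduce job that uses the standard file output format will normally refuse to start if that directory is left over from an earlier run. The sketch below is one possible way to handle this, assuming it is acceptable to discard the previous output and that `DataCleanDriver` does not already remove it itself; `build_inpath` and `clean_outpath` are hypothetical helper names.

```shell
#!/bin/bash
# Hypothetical helper: build the HDFS input glob for a YYYYMMDD date string.
build_inpath() {
    echo "/project/$1/*"
}

# Hypothetical helper: remove the HDFS output directory if it already exists.
# Assumption: the previous run's output is no longer needed.
clean_outpath() {
    local outpath="$1"
    if hadoop fs -test -d "$outpath"; then
        hadoop fs -rm -r "$outpath"
    fi
}

timeStr=$(date -d "yesterday" "+%Y%m%d")
inpath=$(build_inpath "$timeStr")
outpath="/dataClean"
echo "input: $inpath, output: $outpath"

# Only touch HDFS when the hadoop CLI is actually available on this machine.
if command -v hadoop >/dev/null 2>&1; then
    clean_outpath "$outpath"
    # Quoting keeps the local shell from expanding the glob; Hadoop expands it.
    hadoop jar /opt/project/dataClean/dataClean.jar DataCleanDriver "$inpath" "$outpath"
fi
```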
SQL file
| [root@node1 dataClean]# pwd |
| /opt/project/dataClean |
| [root@node1 dataClean]# cat Script-1.sql |
| create database if not exists project; |
| use project; |
| |
| create external table if not exists web_origin( |
| ipaddr string comment "ip address", |
| visit_time string comment "time the log entry was generated", |
| request_url string comment "requested URL", |
| status int comment "HTTP response status code", |
| body_bytes int comment "response size in bytes", |
| referer_url string comment "site the request was referred from", |
| user_agent string comment "user's browser information", |
| province string comment "province the user visited from", |
| latitude string comment "latitude", |
| longitude string comment "longitude", |
| age int comment "age" |
| ) |
| partitioned by(logdate string) |
| row format delimited |
| fields terminated by ','; |
| |
| desc formatted web_origin; |
| |
| load data inpath "/dataClean/part-m-00000" into table web_origin partition(logdate="${hiveconf:yesterday}"); |
| load data inpath "/dataClean/part-m-00001" into table web_origin partition(logdate="${hiveconf:yesterday}"); |
| select * from web_origin limit 1; |
Run the script to load the cleaned data into the Hive table defined in the SQL file
| [root@node1 dataClean]# pwd |
| /opt/project/dataClean |
| [root@node1 dataClean]# cat dataAnaly.sh |
| |
| |
| yesterday=`date -d "yesterday" "+%Y%m%d"` |
| hive --hiveconf yesterday=${yesterday} -f /opt/project/dataClean/Script-1.sql |
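Since the partition value reaches the SQL file only through `--hiveconf yesterday=...`, the same script can in principle load a different day by passing the date in. This is a minimal sketch under that assumption, with GNU `date` assumed; `resolve_date` is a hypothetical helper, and reloading an older day would also require the cleaning step to have produced output for that day.

```shell
#!/bin/bash
# Hypothetical helper: use the given argument as the partition date when one
# is supplied, otherwise fall back to yesterday (GNU date assumed).
resolve_date() {
    if [ -n "${1:-}" ]; then
        echo "$1"
    else
        date -d "yesterday" "+%Y%m%d"
    fi
}

yesterday=$(resolve_date "${1:-}")
echo "loading partition logdate=$yesterday"

# Only call Hive when the CLI is actually installed on this machine.
if command -v hive >/dev/null 2>&1; then
    hive --hiveconf yesterday="$yesterday" -f /opt/project/dataClean/Script-1.sql
fi
```

Inside Script-1.sql the value is still read as `${hiveconf:yesterday}`, so the SQL file itself does not need to change.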