7. Notes on Data Migration (Project 1)

Initial data load

[root@node1 dataExport]# cat export.sh 
#!/bin/bash
echo "====================Starting export of age_pvs table data (overwrite)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table age_pvs --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/age_pvs" --input-fields-terminated-by ","
echo "===================age_pvs metric table exported successfully========================="

echo "====================Starting export of day_pvs table data (overwrite)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table day_pvs --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/day_pvs" --input-fields-terminated-by ","
echo "====================day_pvs metric table exported successfully========================"

echo "====================Starting export of hour_pvs table data (overwrite)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table hour_pvs --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/hour_pvs" --input-fields-terminated-by ","
echo "====================hour_pvs metric table exported successfully========================"


echo "====================Starting export of month_pvs table data (overwrite)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table month_pvs --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/month_pvs" --input-fields-terminated-by ","
echo "====================month_pvs metric table exported successfully========================"

echo "====================Starting export of area_pvs table data (append)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table area_pvs --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/area_pvs" --input-fields-terminated-by ","
echo "====================area_pvs metric table exported successfully========================"

Subsequent data

Notes on adding subsequent data

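Option          Purpose
--update-key    Column(s) of the target table used to match existing rows; separate multiple columns with commas
--update-mode   Export mode: updateonly (default) or allowinsert
--columns       Target-table columns to export into, listed in the same order as the fields in the Hive data; use it when the column order differs
Overwrite: the export only updates existing rows and never appends. Set --update-key to the matching column(s) and --update-mode to updateonly.
Append: omit --update-key so that every record is inserted as a new row. If --update-key is set, --update-mode allowinsert updates the matching rows and inserts the rest.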
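The write modes described above can be summarized as a small helper that prints the flags for each one. This is only a sketch: mode_flags and its mode names (insert, update, upsert) are made up for illustration and are not part of sqoop; only the --update-key and --update-mode flags come from sqoop itself.

```shell
#!/bin/bash
# Hypothetical helper: print the sqoop export flags that select a write mode.
# Usage: mode_flags <insert|update|upsert> [update_key_columns]
mode_flags() {
  local mode="$1" keys="${2:-}"
  case "$mode" in
    insert)
      # No --update-key at all: every exported record becomes an INSERT.
      echo ""
      ;;
    update)
      # Rows matching the key columns are updated; unmatched records are skipped.
      echo "--update-key $keys --update-mode updateonly"
      ;;
    upsert)
      # Rows matching the key columns are updated; the rest are inserted.
      echo "--update-key $keys --update-mode allowinsert"
      ;;
    *)
      echo "unknown mode: $mode" >&2
      return 1
      ;;
  esac
}
```

For example, mode_flags update visit_year,visit_month prints the flags used for month_pvs in the overwrite case.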
[root@node1 dataExport]# cat a.sh 
#!/bin/bash
echo "====================Starting export of age_pvs table data (overwrite)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table age_pvs --update-key age_range --update-mode updateonly --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/age_pvs" --input-fields-terminated-by ","
echo "===================age_pvs metric table exported successfully========================="


echo "====================Starting export of day_pvs table data (overwrite)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table day_pvs --update-key visit_year,visit_month,visit_day --update-mode updateonly --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/day_pvs" --input-fields-terminated-by ","
echo "====================day_pvs metric table exported successfully========================"


echo "====================Starting export of hour_pvs table data (overwrite)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table hour_pvs --update-key visit_year,visit_month,visit_day,visit_hour --update-mode updateonly --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/hour_pvs" --input-fields-terminated-by ","
echo "====================hour_pvs metric table exported successfully========================"


echo "====================Starting export of month_pvs table data (overwrite)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table month_pvs --update-key visit_year,visit_month --update-mode updateonly --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/month_pvs" --input-fields-terminated-by ","
echo "====================month_pvs metric table exported successfully========================"


echo "====================Starting export of area_pvs table data (append)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table area_pvs --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/area_pvs" --input-fields-terminated-by ","
echo "====================area_pvs metric table exported successfully========================"
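
The script above prints its success line whether or not sqoop actually succeeded, and it repeats the same long command for each table. One possible way to tighten this is sketched below. It assumes two hypothetical variables that are not in the original scripts: SQOOP_CMD (so the function can be dry-run with echo) and MYSQL_PASSWORD (standing in for the password on the command line).

```shell
#!/bin/bash
# Sketch only: SQOOP_CMD and MYSQL_PASSWORD are hypothetical variables.
SQOOP_CMD="${SQOOP_CMD:-sqoop}"
JDBC_URL="jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8"
WAREHOUSE="/user/hive/warehouse/project.db"

# export_table <table> [update_key_columns]
# With key columns the export runs in updateonly mode; without them it inserts.
export_table() {
  local table="$1" keys="${2:-}"
  # Build the argument list in the positional parameters.
  set -- export --connect "$JDBC_URL" --username root --password "${MYSQL_PASSWORD:-}" \
         --table "$table" --num-mappers 1 \
         --export-dir "$WAREHOUSE/$table" --input-fields-terminated-by ","
  if [ -n "$keys" ]; then
    set -- "$@" --update-key "$keys" --update-mode updateonly
  fi
  echo "==== exporting $table ===="
  # Report success only when the command exits with status 0.
  if "$SQOOP_CMD" "$@"; then
    echo "==== $table exported successfully ===="
  else
    echo "==== $table export failed ====" >&2
    return 1
  fi
}
```

With this in place, the overwrite tables would be exported with calls such as export_table month_pvs visit_year,visit_month, and the append table with export_table area_pvs. Setting SQOOP_CMD=echo prints each command instead of running it.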

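Appendix

  • Problem: new data that arrives later cannot be added to the month_pvs table, because updateonly only modifies rows that already exist and a new month has no earlier row to match.
  • Solution: for the monthly visit metric, append a new row on the 2nd of each month, then update that row from the 3rd through the 31st.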
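
The schedule in the solution above could be sketched as a small date check. month_pvs_mode is a hypothetical helper, and treating the 1st as an update (of the previous month's row) is an assumption, since the note only covers the 2nd and the 3rd through the 31st. As an alternative, --update-mode allowinsert together with --update-key may remove the need for the date switch; with MySQL this most likely depends on the table having a unique key on those columns.

```shell
#!/bin/bash
# Hypothetical helper: choose the write mode for month_pvs from the day of the month.
# Usage: month_pvs_mode <day>   (accepts "2" or "02", as printed by `date +%d`)
month_pvs_mode() {
  local day="${1#0}"   # strip one leading zero so "08" compares as 8
  if [ "$day" -eq 2 ]; then
    # 2nd of the month: the new month has no row yet, so append one.
    echo "append"
  else
    # 3rd-31st: update the existing row (the 1st is assumed to behave the same).
    echo "update"
  fi
}

# Possible use inside the export script:
#   mode="$(month_pvs_mode "$(date +%d)")"
```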
posted @ 2022-08-11 08:13  jsqup