4.1 Big Data Roundup
# Environment & Installation
## HBase installation issues
https://www.cnblogs.com/shan333/p/15386771.html
## PySpark installation
https://www.cnblogs.com/rhgaiymm/p/12892710.html
https://stackoverflow.com/questions/64169977/modulenotfounderror-no-module-named-pyspark
## MapReduce fundamentals
https://blog.csdn.net/weixin_45366499/article/details/106892489
https://blog.csdn.net/weixin_43542605/article/details/122288056
https://blog.csdn.net/Shockang/article/details/117970151
https://blog.csdn.net/qq_45725767/article/details/120956256
## Spark vs. MapReduce
https://www.zhihu.com/question/31930662
## MapReduce inverted index
See https://www.cnblogs.com/zll20153246/p/9334857.html
## MapReduce table joins
See https://blog.csdn.net/chuyouyinghe/article/details/78845364
## Introduction to the CAP theorem
See http://www.ruanyifeng.com/blog/2018/07/cap.html
https://www.zhihu.com/question/54105974
## Xiamen University course
See http://dblab.xmu.edu.cn/blog/?s=python
http://dblab.xmu.edu.cn/post/spark-python/
## Importing pyspark in the Python shell
See https://stackoverflow.com/questions/23256536/importing-pyspark-in-python-shell
## Basic Spark usage
https://spark.apache.org/docs/latest/quick-start.html
## Wide and narrow dependencies in Spark
https://blog.csdn.net/houmou/article/details/52531205
## RDDs in Spark
https://blog.csdn.net/Zsusan7/article/details/121920810
https://blog.csdn.net/Python_Ai_Road/article/details/111940472
https://blog.csdn.net/olizxq/article/details/118276930
## DataFrames in Spark
https://spark.apache.org/docs/latest/api/python/getting_started/quickstart_df.html
https://blog.csdn.net/ljp7759325/article/details/124135234
## Spark SQL
https://blog.csdn.net/m0_46917254/article/details/123959257
## Writing out DataFrame data
https://blog.csdn.net/feizuiku0116/article/details/121527042
## Data skew
https://zhuanlan.zhihu.com/p/376286414
https://zhuanlan.zhihu.com/p/449471866
## HiveSQL compilation process
https://tech.meituan.com/2014/02/12/hive-sql-to-mapreduce.html
## Fixing Spark OOM errors
https://www.jianshu.com/p/1e3472cb033d
https://cloud.tencent.com/developer/article/2109043
## Merging small files produced by Spark SQL
https://blog.csdn.net/Jerry_991/article/details/95773902
## Overwriting specific partitions when writing a table in Spark
https://huaweicloud.csdn.net/63357acfd3efff3090b58903.html
https://blog.csdn.net/lovetechlovelife/article/details/114544073
## Hadoop shell commands
https://blog.csdn.net/m0_52879657/article/details/124633808
## PickleException: expected zero arguments for construction of ClassDict (for numpy.dtype)
Python's built-in `float` and `np.float64` are not the same type, and Spark expects the built-in one. See https://stackoverflow.com/questions/53800062/expected-zero-arguments-for-construction-of-classdict-for-numpy-dtype-when-c
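A minimal sketch of the usual workaround, assuming the error comes from NumPy scalars ending up in data handed to Spark. The helper name `to_builtin` and the sample rows are made up for illustration; the idea is just to turn `np.float64` and friends into plain Python values before Spark sees them.

```python
import numpy as np


def to_builtin(value):
    """Return a built-in Python scalar for NumPy scalars; pass anything else through."""
    if isinstance(value, np.generic):
        # .item() converts a NumPy scalar to the closest built-in Python type
        return value.item()
    return value


# Hypothetical sample data: the second column holds np.float64 values
rows = [(1, np.float64(0.5)), (2, np.float64(1.25))]

# Convert every field so only built-in types remain
clean_rows = [tuple(to_builtin(v) for v in row) for row in rows]

print([type(v).__name__ for v in clean_rows[0]])
```

The cleaned rows can then be passed to `spark.createDataFrame`. Inside a UDF the same idea applies: wrap the return value, for example `return float(result)`, so the UDF returns a built-in `float` rather than a NumPy scalar.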
Posted on 2022-04-07 22:36 by Hiteration