4.1 Big Data Roundup

# Environment & Installation

## HBase installation issues

　　https://www.cnblogs.com/shan333/p/15386771.html

## PySpark installation

　　https://www.cnblogs.com/rhgaiymm/p/12892710.html

　　https://stackoverflow.com/questions/64169977/modulenotfounderror-no-module-named-pyspark

## MapReduce fundamentals

　　https://blog.csdn.net/weixin_45366499/article/details/106892489

　　https://blog.csdn.net/weixin_43542605/article/details/122288056

　　https://blog.csdn.net/Shockang/article/details/117970151

　　https://blog.csdn.net/qq_45725767/article/details/120956256

## Spark vs. MapReduce

　　https://www.zhihu.com/question/31930662

## MapReduce inverted index

　　See https://www.cnblogs.com/zll20153246/p/9334857.html

## MapReduce table joins

　　See https://blog.csdn.net/chuyouyinghe/article/details/78845364

## Introduction to the CAP theorem

　　See http://www.ruanyifeng.com/blog/2018/07/cap.html

　　https://www.zhihu.com/question/54105974

## Xiamen University course

　　See http://dblab.xmu.edu.cn/blog/?s=python

　　http://dblab.xmu.edu.cn/post/spark-python/

## Importing pyspark in the Python shell

　　See https://stackoverflow.com/questions/23256536/importing-pyspark-in-python-shell

## Spark basic usage

　　https://spark.apache.org/docs/latest/quick-start.html

## Wide and narrow dependencies in Spark

　　https://blog.csdn.net/houmou/article/details/52531205

## RDDs in Spark

　　https://blog.csdn.net/Zsusan7/article/details/121920810

　　https://blog.csdn.net/Python_Ai_Road/article/details/111940472

　　https://blog.csdn.net/olizxq/article/details/118276930

## DataFrames in Spark

　　https://spark.apache.org/docs/latest/api/python/getting_started/quickstart_df.html

　　https://blog.csdn.net/ljp7759325/article/details/124135234

## Spark SQL

　　https://blog.csdn.net/m0_46917254/article/details/123959257

## Writing out DataFrame data

　　https://blog.csdn.net/feizuiku0116/article/details/121527042

## Data skew

　　https://zhuanlan.zhihu.com/p/376286414

　　https://zhuanlan.zhihu.com/p/449471866

## HiveSQL compilation process

　　https://tech.meituan.com/2014/02/12/hive-sql-to-mapreduce.html

## Fixing Spark OOM errors

　　https://www.jianshu.com/p/1e3472cb033d

　　https://cloud.tencent.com/developer/article/2109043

## Merging small files produced by Spark SQL

　　https://blog.csdn.net/Jerry_991/article/details/95773902

## Overwriting specific partitions when writing a table in Spark

　　https://huaweicloud.csdn.net/63357acfd3efff3090b58903.html

　　https://blog.csdn.net/lovetechlovelife/article/details/114544073

## Hadoop shell commands

　　https://blog.csdn.net/m0_52879657/article/details/124633808

## PickleException: expected zero arguments for construction of ClassDict (for numpy.dtype)

　　`float` and `np.float64` are different types; see https://stackoverflow.com/questions/53800062/expected-zero-arguments-for-construction-of-classdict-for-numpy-dtype-when-c
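
　　One common way around this error is to hand Spark only built-in Python values. The sketch below is a minimal illustration, not taken from the linked answer: `to_native` is a hypothetical helper name, and it assumes the offending values are NumPy scalars or arrays, which expose a `.tolist()` method that returns plain Python objects (for example `np.float64` becomes `float`).

```python
def to_native(value):
    """Recursively replace NumPy scalars/arrays with built-in Python objects.

    Hypothetical helper. It relies on duck typing: NumPy scalars and arrays
    expose .tolist(), which returns plain Python values.
    """
    if isinstance(value, dict):
        # Convert dict values; keys are left untouched.
        return {k: to_native(v) for k, v in value.items()}
    if isinstance(value, list):
        return [to_native(v) for v in value]
    if isinstance(value, tuple):
        return tuple(to_native(v) for v in value)
    tolist = getattr(value, "tolist", None)
    # Anything without .tolist() (str, int, float, None, ...) passes through.
    return tolist() if callable(tolist) else value
```

　　For instance, `rows = [to_native(r) for r in rows]` could be applied to a list of tuples before passing it to `spark.createDataFrame`. Inside a UDF, returning `float(x)` instead of an `np.float64` serves the same purpose.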

Posted on 2022-04-07 22:36 by Hiteration
