PySpark is much nicer to use; dropping Scala.
Note: PySpark 2.4 may have problems on Windows; use 2.3 instead.
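A quick guard for the version note above, as a rough sketch only. The function name `pyspark_version_ok` is made up for illustration and is not part of PySpark; it only encodes the "2.4 on Windows" warning from this note.

```python
import platform


def pyspark_version_ok(version, system=None):
    """Return False only for the combination flagged above: PySpark 2.4.x on Windows.

    `version` is a string such as "2.3.4"; `system` defaults to platform.system().
    Hypothetical helper, not a PySpark API.
    """
    system = system or platform.system()
    # Compare only the major.minor part of the version string.
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return not (system == "Windows" and major_minor == (2, 4))
```

In a real script the version string would presumably come from `pyspark.__version__`.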
py4j: the bridge between Python and Java
https://www.py4j.org/advanced_topics.html#accessing-java-collections-and-arrays-from-python
https://www.jianshu.com/p/013fe44422c9?from=timeline&isappinstalled=0
https://raufer.github.io/2018/02/08/custom-spark-models-with-python-wrappers/
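A minimal sketch of what the py4j link on Java collections covers: a `java.util.ArrayList` created through the gateway can be used from Python much like a list. The helper names `launch_jvm` and `sum_via_java_list` are invented for this note; running it for real assumes py4j is installed and a Java runtime is on the PATH.

```python
def launch_jvm():
    """Start a JVM with a py4j gateway and return its `jvm` view.

    Assumes py4j is installed and `java` is available on the PATH.
    """
    from py4j.java_gateway import JavaGateway  # imported lazily: needs py4j

    gateway = JavaGateway.launch_gateway()
    return gateway.jvm


def sum_via_java_list(jvm, values):
    """Copy `values` into a java.util.ArrayList, then sum it from Python."""
    java_list = jvm.java.util.ArrayList()
    for value in values:
        java_list.append(value)  # py4j exposes Java lists with list-like methods
    return sum(java_list)        # iterating pulls each element back to Python
```

Possible usage, once a JVM is available: `sum_via_java_list(launch_jvm(), [1, 2, 3])`.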
OpenJDK
http://jdk.java.net/11/