Integrating Spring Boot with Spark for Big Data
1. Important note: check your JDK version first. JDK 1.8 is the safest choice; JDK 17 causes many errors out of the box, because Spark reflects into JDK internals that are closed off in newer releases.
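Since the JDK version decides everything that follows, it is worth confirming which JDK the application actually runs on before adding dependencies. A minimal stdlib sketch (the `JdkCheck` class and its version-parsing helper are illustrative names, not part of the original project):

```java
public class JdkCheck {
    // Returns the major Java version: 8 for "1.8.0_381", 17 for "17.0.2", etc.
    static int majorVersion(String version) {
        String[] parts = version.split("\\.");
        int first = Integer.parseInt(parts[0]);
        // Pre-9 versions are reported as "1.x"; later versions start with the major number
        return first == 1 ? Integer.parseInt(parts[1]) : first;
    }

    public static void main(String[] args) {
        int major = majorVersion(System.getProperty("java.version"));
        if (major > 8) {
            System.out.println("JDK " + major + ": extra dependencies and --add-opens flags are needed for Spark");
        } else {
            System.out.println("JDK " + major + ": the Spark dependencies alone are enough");
        }
    }
}
```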
2. Add the Maven dependencies. On JDK 1.8, the Spark dependencies alone are enough:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.13</artifactId>
    <version>3.4.1</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.13</artifactId>
    <version>3.4.1</version>
</dependency>
3. On JDK 17, the following additional dependencies are required (skip them on JDK 1.8):
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-nop</artifactId>
    <version>1.6.0</version>
</dependency>
<dependency>
    <groupId>javax.servlet</groupId>
    <artifactId>javax.servlet-api</artifactId>
    <version>4.0.1</version>
</dependency>
4. The controller code:
@Operation(summary = "Spark processing")
@PostMapping("sparkHander")
public Resp sparkTest() throws IOException {
    // On Windows you may also need:
    // System.setProperty("hadoop.home.dir", "C:\\hadoop-common-2.2.0");
    SparkConf sparkConf = new SparkConf();
    sparkConf.set("spark.ui.enabled", "false");
    sparkConf.set("spark.some.config.option", "some-value");
    SparkSession spark = SparkSession
            .builder()
            .master("local")
            .appName("SparkSQLTest4")
            .config(sparkConf)
            .getOrCreate();

    // Local .json file. The sample file below is a pretty-printed array,
    // so the multiLine option is required (by default Spark expects JSON Lines,
    // i.e. one complete object per line)
    Dataset<Row> df = spark.read().option("multiLine", "true").json("C:\\test.json");
    df.printSchema();
    df.show();

    // The temp view name is arbitrary, e.g. "test"
    df.createOrReplaceTempView("test");

    // Query against whatever view name you registered above, e.g. select * from test
    Dataset<Row> dataset = spark.sql("select * from test");
    dataset.show();

    // Collect each row as a JSON string
    List<String> strings = dataset.toJSON().collectAsList();
    spark.stop();
    return Resp.of(strings);
}
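Note that `toJSON().collectAsList()` returns one independent JSON string per row, not a single JSON document. If your response wrapper needs one JSON array, the row strings can be joined like this (a pure-stdlib sketch; the `RowJoiner` class and its method name are hypothetical, and `Resp` is assumed from the original project):

```java
import java.util.Arrays;
import java.util.List;

public class RowJoiner {
    // Joins per-row JSON strings (as returned by Dataset.toJSON().collectAsList())
    // into a single JSON array document.
    static String toJsonArray(List<String> rows) {
        return "[" + String.join(",", rows) + "]";
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
                "{\"port\":\"1\",\"name\":\"Test server 1\"}",
                "{\"port\":\"2\",\"name\":\"Test server 2\"}");
        System.out.println(toJsonArray(rows));
    }
}
```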
5. The JSON file looks like this:
[
  {
    "port": "1",
    "name": "Test server 1",
    "showPassword": false,
    "status": "0"
  },
  {
    "port": "2",
    "name": "Test server 2",
    "showPassword": false,
    "status": "0"
  },
  {
    "port": "3",
    "name": "Test server 3",
    "showPassword": false,
    "status": "0"
  }
]
6. (Not needed on JDK 1.8.) On JDK 17, you must also add the following VM options:
--add-opens
java.base/java.io=ALL-UNNAMED
--add-opens
java.base/java.nio=ALL-UNNAMED
--add-exports
java.base/sun.nio.ch=ALL-UNNAMED
--add-opens
java.base/java.lang=ALL-UNNAMED
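When the application is launched through Maven (`spring-boot:run`) rather than from the IDE's run configuration, the same flags can be passed through the Spring Boot Maven plugin. A sketch, assuming the project already uses `spring-boot-maven-plugin`:

```xml
<plugin>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-maven-plugin</artifactId>
    <configuration>
        <jvmArguments>
            --add-opens java.base/java.io=ALL-UNNAMED
            --add-opens java.base/java.nio=ALL-UNNAMED
            --add-exports java.base/sun.nio.ch=ALL-UNNAMED
        </jvmArguments>
    </configuration>
</plugin>
```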