|NO.Z.00046|——————————|BigDataEnd|——|Hadoop&Spark.V07|——|Spark.v07|spark sql|Programming & Input/Output|
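I. Input and Output
### --- Input and output
~~~ Spark SQL has built-in support for the following data sources:
~~~ Parquet, JSON, CSV, Avro, Images, and BinaryFiles (Spark 3.0). Parquet is the default data source (configurable through spark.sql.sources.default).
### --- Input/output experiment
~~~ # Internal form
DataFrameReader.format(args).option("key", "value").schema(args).load()
~~~ # Developer API
SparkSession.read
### --- Source code excerpt
~~~ # Source excerpt: DataFrameReader.scala
~~~ # Line 51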
class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
  /**
   * Specifies the input data source format.
   *
   * @since 1.4.0
   */
  def format(source: String): DataFrameReader = {
    this.source = source
    this
  }
  // ... (remainder of the class omitted)
}
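The excerpt shows that `format` stores its argument and returns `this`, which is what makes the chained `format(...).option(...).load()` style possible. The following is a minimal plain-Scala sketch of that pattern. `MiniReader`, its fields, and `describe` are hypothetical names for illustration only; they are not Spark classes, and `describe` merely stands in for `load()`.

```scala
// Hypothetical miniature of the builder style used by DataFrameReader.
// MiniReader is NOT a Spark class; it only illustrates "mutate, then return this".
final class MiniReader {
  private var source: String = "parquet" // mirrors Parquet being the default source
  private var options: Map[String, String] = Map.empty

  // Stores the format and returns the same instance so calls can be chained.
  def format(source: String): MiniReader = {
    this.source = source
    this
  }

  // Records one option and returns the same instance.
  def option(key: String, value: String): MiniReader = {
    options = options + (key -> value)
    this
  }

  // Stand-in for load(): reports what would be read instead of reading it.
  def describe(path: String): String = {
    val opts = options.toSeq.sorted.map { case (k, v) => s"$k=$v" }.mkString(",")
    s"$source:$path[$opts]"
  }
}
```

Because every setter returns the reader itself, binding the result of an intermediate call (for example in a REPL) binds the reader, not the loaded data; only the final `load()` produces a DataFrame.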
II. Input/Output Experiments
### --- Input/output experiments
~~~ # Upload the files to HDFS, then verify them
[root@hadoop02 ~]# hdfs dfs -ls /user/root/data/users.parquet
-rw-r--r-- 5 root supergroup 615 2021-10-20 16:43 /user/root/data/users.parquet
[root@hadoop02 ~]# hdfs dfs -ls /user/root/data/emp.json
-rw-r--r-- 5 root supergroup 1776 2021-10-20 16:45 /user/root/data/emp.json
scala> val df1 = spark.read.format("parquet").load("data/users.parquet")
df1: org.apache.spark.sql.DataFrame = [name: string, favorite_color: string ... 1 more field]
# --- Use Parquet; you can omit format("parquet") if you wish as it's the default
scala> val df2 = spark.read.load("data/users.parquet")
df2: org.apache.spark.sql.DataFrame = [name: string, favorite_color: string ... 1 more field]
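# --- Use CSV
# --- Note: the chain below was entered line by line, and the REPL applies each `.method` line to the previous result. As a result df3 is bound to the DataFrameReader and the loaded DataFrame is res119. Enter the chain with :paste (or on one line) to bind the DataFrame to df3.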
scala> val df3 = spark.read.format("csv")
df3: org.apache.spark.sql.DataFrameReader = org.apache.spark.sql.DataFrameReader@6f9b1347
scala> .option("inferSchema", "true")
res117: org.apache.spark.sql.DataFrameReader = org.apache.spark.sql.DataFrameReader@6f9b1347
scala> .option("header", "true")
res118: org.apache.spark.sql.DataFrameReader = org.apache.spark.sql.DataFrameReader@6f9b1347
scala> .load("data/people1.csv")
res119: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]
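# --- Use JSON
# --- Note: as with the CSV example, df4 is bound to the DataFrameReader; the loaded DataFrame is res121.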
scala> val df4 = spark.read.format("json")
df4: org.apache.spark.sql.DataFrameReader = org.apache.spark.sql.DataFrameReader@e33b191
scala> .load("data/emp.json")
res121: org.apache.spark.sql.DataFrame = [COMM: bigint, DEPTNO: bigint ... 6 more fields]
~~~ # Internal form
DataFrameWriter.format(args)
  .option(args)
  .bucketBy(args)
  .partitionBy(args)
  .save(path)
~~~ # Developer API
DataFrame.write
### --- Parquet files:
scala> spark.sql(
| """
| |CREATE OR REPLACE TEMPORARY VIEW users
| |USING parquet
| |OPTIONS (path "data/users.parquet")
| |""".stripMargin
| )
res128: org.apache.spark.sql.DataFrame = []
scala> spark.sql("select * from users").show
+------+--------------+----------------+
| name|favorite_color|favorite_numbers|
+------+--------------+----------------+
|Alyssa| null| [3, 9, 15, 20]|
| Ben| red| []|
+------+--------------+----------------+
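# --- Write a DataFrame as Snappy-compressed Parquet (df1 is the DataFrame loaded from data/users.parquet earlier; any DataFrame works)
scala> df1.write.format("parquet").mode("overwrite").option("compression", "snappy").save("data/parquet")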
### --- JSON files:
scala> val fileJson = "data/emp.json"
fileJson: String = data/emp.json
scala> val df6 = spark.read.format("json").load(fileJson)
df6: org.apache.spark.sql.DataFrame = [COMM: bigint, DEPTNO: bigint ... 6 more fields]
# --- Register the view first, then query it
scala> spark.sql(
| """
| |CREATE OR REPLACE TEMPORARY VIEW emp
| | USING json
| | options(path "data/emp.json")
| |""".stripMargin)
res142: org.apache.spark.sql.DataFrame = []
scala> spark.sql("SELECT * FROM emp").show()
+----+------+-----+------+--------------------+---------+----+----+
|COMM|DEPTNO|EMPNO| ENAME| HIREDATE| JOB| MGR| SAL|
+----+------+-----+------+--------------------+---------+----+----+
|null| 20| 7369| SMITH|2001-01-02T22:12:...| CLERK|7902| 800|
| 300| 30| 7499| ALLEN|2002-01-02T22:12:...| SALESMAN|7698|1600|
| 500| 30| 7521| WARD|2003-01-02T22:12:...| SALESMAN|7698|1250|
|null| 20| 7566| JONES|2004-01-02T22:12:...| MANAGER|7839|2975|
|1400| 30| 7654|MARTIN|2005-01-02T22:12:...| SALESMAN|7698|1250|
|null| 30| 7698| BLAKE|2005-04-02T22:12:...| MANAGER|7839|2850|
|null| 10| 7782| CLARK|2006-03-02T22:12:...| MANAGER|7839|2450|
|null| 20| 7788| SCOTT|2007-03-02T22:12:...| ANALYST|7566|3000|
|null| 10| 7839| KING|2006-03-02T22:12:...|PRESIDENT|null|5000|
| 0| 30| 7844|TURNER|2009-07-02T22:12:...| SALESMAN|7698|1500|
|null| 20| 7876| ADAMS|2010-05-02T22:12:...| CLERK|7788|1100|
|null| 30| 7900| JAMES|2011-06-02T22:12:...| CLERK|7698| 950|
|null| 20| 7902| FORD|2011-07-02T22:12:...| ANALYST|7566|3000|
|null| 10| 7934|MILLER|2012-11-02T22:12:...| CLERK|7782|1300|
+----+------+-----+------+--------------------+---------+----+----+
# --- Write the query result back out as JSON
scala> spark.sql("SELECT * FROM emp").write
res144: org.apache.spark.sql.DataFrameWriter[org.apache.spark.sql.Row] = org.apache.spark.sql.DataFrameWriter@3ab1e8c2
scala> .format("json")
res145: org.apache.spark.sql.DataFrameWriter[org.apache.spark.sql.Row] = org.apache.spark.sql.DataFrameWriter@3ab1e8c2
scala> .mode("overwrite")
res146: org.apache.spark.sql.DataFrameWriter[org.apache.spark.sql.Row] = org.apache.spark.sql.DataFrameWriter@3ab1e8c2
scala> .save("data/json")

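### --- CSV files:
~~~ # CSV
scala> val fileCSV = "data/people1.csv"
fileCSV: String = data/people1.csv
# --- Note: the chain is entered line by line again, so df is bound to the DataFrameReader and the loaded DataFrame is res150.
scala> val df = spark.read.format("csv")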
df: org.apache.spark.sql.DataFrameReader = org.apache.spark.sql.DataFrameReader@51888660
scala> .option("header", "true")
res148: org.apache.spark.sql.DataFrameReader = org.apache.spark.sql.DataFrameReader@51888660
scala> .option("inferschema", "true")
res149: org.apache.spark.sql.DataFrameReader = org.apache.spark.sql.DataFrameReader@51888660
scala> .load(fileCSV)
res150: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]
scala> spark.sql(
| """
| |CREATE OR REPLACE TEMPORARY VIEW people
| | USING csv
| |options(path "data/people1.csv",
| | header "true",
| | inferschema "true")
| |""".stripMargin)
res151: org.apache.spark.sql.DataFrame = []
scala> spark.sql("select * from people")
res152: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more field]
scala> .write
res153: org.apache.spark.sql.DataFrameWriter[org.apache.spark.sql.Row] = org.apache.spark.sql.DataFrameWriter@2fbf9abc
scala> .format("csv")
res154: org.apache.spark.sql.DataFrameWriter[org.apache.spark.sql.Row] = org.apache.spark.sql.DataFrameWriter@2fbf9abc
scala> .mode("overwrite")
res155: org.apache.spark.sql.DataFrameWriter[org.apache.spark.sql.Row] = org.apache.spark.sql.DataFrameWriter@2fbf9abc
scala> .save("data/csv")
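The CSV examples rely on the `inferSchema` option (written as both `inferSchema` and `inferschema` in these notes; Spark treats data source option keys case-insensitively) to get `age: int` instead of a string column. As a rough illustration of the idea only, the sketch below picks the narrowest of three types that fits every value in a column. `InferTypeSketch` and `inferColumnType` are hypothetical names, and this is a deliberately simplified stand-in: Spark's real inference covers more types and is more involved.

```scala
import scala.util.Try

// Simplified, hypothetical model of per-column type inference for CSV data.
// This is NOT Spark's algorithm; it only considers int, double and string.
object InferTypeSketch {
  // Returns the narrowest type name that fits all non-empty values.
  def inferColumnType(values: Seq[String]): String = {
    val present = values.map(_.trim).filter(_.nonEmpty) // empty cells are treated as nulls
    if (present.isEmpty) "string"
    else if (present.forall(v => Try(v.toInt).isSuccess)) "int"
    else if (present.forall(v => Try(v.toDouble).isSuccess)) "double"
    else "string"
  }
}
```

Inference of this kind needs to look at the data before the schema is known, which is why supplying an explicit schema through `.schema(...)` is the usual alternative when the column types are already known.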

III. Connecting Spark SQL to External Data Sources via JDBC
### --- Prepare the database tables
~~~ # Create the table
mysql> create table yanqi_product_info_back as select * from yanqi_product_info;
~~~ # Check the table's character set
mysql> show create table yanqi_product_info_back;
+-------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| yanqi_product_info_back | CREATE TABLE `yanqi_product_info_back` (
`productId` bigint(11) NOT NULL DEFAULT '0' COMMENT '商品id',
`productName` varchar(200) NOT NULL COMMENT '商品名称',
`shopId` bigint(11) NOT NULL COMMENT '门店ID',
`price` decimal(11,2) NOT NULL DEFAULT '0.00' COMMENT '门店价',
`isSale` tinyint(4) NOT NULL DEFAULT '1' COMMENT '是否上架 0:不上架 1:上架',
`status` tinyint(4) NOT NULL DEFAULT '0' COMMENT '是否新品 0:否 1:是',
`categoryId` int(11) NOT NULL COMMENT 'goodsCatId 最后一级商品分类ID',
`createTime` varchar(25) NOT NULL,
`modifyTime` datetime DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP COMMENT '修改时间'
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+-------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> show create table yanqi_product_info;
+--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| yanqi_product_info | CREATE TABLE `yanqi_product_info` (
`productId` bigint(11) NOT NULL DEFAULT '0' COMMENT '商品id',
`productName` varchar(200) NOT NULL COMMENT '商品名称',
`shopId` bigint(11) NOT NULL COMMENT '门店ID',
`price` decimal(11,2) NOT NULL DEFAULT '0.00' COMMENT '门店价',
`isSale` tinyint(4) NOT NULL DEFAULT '1' COMMENT '是否上架 0:不上架 1:上架',
`status` tinyint(4) NOT NULL DEFAULT '0' COMMENT '是否新品 0:否 1:是',
`categoryId` int(11) NOT NULL COMMENT 'goodsCatId 最后一级商品分类ID',
`createTime` varchar(25) NOT NULL,
`modifyTime` datetime DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP COMMENT '修改时间'
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
~~~ # Change the table's character set
mysql> alter table yanqi_product_info_back convert to character set utf8;
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
### --- Spark SQL can also connect to external data sources through JDBC:
~~~ # The URL must be a single string literal with no line break; &useUnicode=true can optionally be appended to it
~~~ # com.mysql.jdbc.Driver is the driver class of Connector/J 5.x (this project uses mysql-connector-java 5.1.44); Connector/J 8.x renames it to com.mysql.cj.jdbc.Driver
val jdbcDF = spark
  .read
  .format("jdbc")
  .option("url", "jdbc:mysql://hadoop03:3306/ebiz?useSSL=false")
  .option("driver", "com.mysql.jdbc.Driver")
  .option("dbtable", "yanqi_product_info")
  .option("user", "hive")
  .option("password", "12345678")
  .load()
jdbcDF.show()
jdbcDF.write
  .format("jdbc")
  .option("url", "jdbc:mysql://hadoop03:3306/ebiz?useSSL=false&characterEncoding=utf8")
  .option("user", "hive")
  .option("password", "12345678")
  .option("driver", "com.mysql.jdbc.Driver")
  .option("dbtable", "yanqi_product_info_back")
  .mode("append")
  .save()
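The read and write URLs differ only in their query parameters (`useSSL`, `characterEncoding`). One way to keep such URLs on a single line and avoid stray line breaks or separators is to assemble them from their parts. The helper below is a small sketch under that assumption; `JdbcUrlSketch` and `mysqlUrl` are hypothetical names, not part of Spark or the MySQL driver, and no escaping of parameter values is attempted.

```scala
// Hypothetical helper for assembling a MySQL JDBC URL from its parts.
object JdbcUrlSketch {
  // Builds jdbc:mysql://host:port/db?k1=v1&k2=v2, keeping the parameter order given.
  def mysqlUrl(host: String, port: Int, db: String, params: Seq[(String, String)]): String = {
    val base = s"jdbc:mysql://$host:$port/$db"
    if (params.isEmpty) base
    else base + params.map { case (k, v) => s"$k=$v" }.mkString("?", "&", "")
  }
}
```

For example, `JdbcUrlSketch.mysqlUrl("hadoop03", 3306, "ebiz", Seq("useSSL" -> "false", "characterEncoding" -> "utf8"))` yields the URL used for the write, and the result can be passed directly to `.option("url", ...)`.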
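### --- Note: if the data contains Chinese text, check the table's character set; otherwise the text will be garbled
~~~ SaveMode.ErrorIfExists (default). If the table exists, an exception is thrown and no data is written to the database.
~~~ SaveMode.Append. If the table exists, the data is appended to it; if it does not exist, the table is created first and the data is then inserted.
~~~ SaveMode.Overwrite. The existing table and all of its data are dropped, the table is recreated, and the new data is inserted.
~~~ SaveMode.Ignore. If the table does not exist, it is created and the data is saved; if it exists, the write is skipped silently without an error.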
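The four modes amount to a small decision table keyed on the mode and on whether the target table already exists. The sketch below models that table in plain Scala so the cases can be compared side by side. `SaveModeSketch`, `Mode`, `Action`, and `decide` are hypothetical names for illustration; Spark's real enum is `org.apache.spark.sql.SaveMode`, and this code does not call Spark.

```scala
// Plain-Scala model of the save-mode behaviour described in the notes above.
// All names here are illustrative; nothing in this object touches Spark or a database.
object SaveModeSketch {
  sealed trait Mode
  case object ErrorIfExists extends Mode
  case object Append extends Mode
  case object Overwrite extends Mode
  case object Ignore extends Mode

  sealed trait Action
  case object Fail extends Action               // throw, write nothing
  case object CreateAndInsert extends Action    // table missing: create it, then insert
  case object InsertIntoExisting extends Action // append rows to the existing table
  case object DropRecreateInsert extends Action // drop table and data, recreate, insert
  case object Skip extends Action               // do nothing, raise no error

  def decide(mode: Mode, tableExists: Boolean): Action = (mode, tableExists) match {
    case (_, false)            => CreateAndInsert
    case (ErrorIfExists, true) => Fail
    case (Append, true)        => InsertIntoExisting
    case (Overwrite, true)     => DropRecreateInsert
    case (Ignore, true)        => Skip
  }
}
```

Seen this way, the modes only differ when the table already exists, which is why the JDBC example uses `append` to write into the pre-created `yanqi_product_info_back` table.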


IV. Programmatic Implementation
### --- Implementation
package cn.yanqi.sparksql

import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}

object InputOutputFileDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("Demo1")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext
    sc.setLogLevel("warn")

    // parquet
    // Importing the session's members lets the examples call sql(...) directly
    import spark._
    // val df1: DataFrame = spark.read.load("data/users.parquet")
    // df1.createOrReplaceTempView("t1")
    // df1.show
    // sql(
    //   """
    //     |create or replace temporary view users
    //     | using parquet
    //     | options (path "data/users.parquet")
    //     |""".stripMargin)
    // sql(
    //   """
    //     |select * from users
    //     |""".stripMargin).show
    // df1.write
    //   .mode("overwrite")
    //   .save("data/parquet")

    // json
    // val df3: DataFrame = spark.read.format("json").load("data/emp.json")
    // df3.show()
    //
    // sql(
    //   """
    //     |create or replace temporary view emp
    //     | using json
    //     |options (path "data/emp.json")
    //     |""".stripMargin)
    // sql(
    //   """
    //     |select * from emp
    //     |""".stripMargin).write
    //   .format("json")
    //   .mode("overwrite")
    //   .save("data/json")

    // csv
    // val df2 = spark.read.format("csv")
    //   .option("header", "true")
    //   .option("inferschema", "true")
    //   .load("data/people1.csv")
    // df2.show()
    //
    // sql(
    //   """
    //     |create or replace temporary view people
    //     | using csv
    //     |options (path "data/people1.csv",
    //     | header "true",
    //     | inferschema "true")
    //     |""".stripMargin)
    //
    // sql("select * from people").write
    //   .format("csv")
    //   .mode("overwrite")
    //   .save("data/csv")

    // jdbc
    val jdbcDF: DataFrame = spark.read
      .format("jdbc")
      .option("url", "jdbc:mysql://hadoop03:3306/ebiz?useSSL=false")
      .option("user", "hive")
      .option("password", "12345678")
      .option("driver", "com.mysql.jdbc.Driver")
      .option("dbtable", "yanqi_product_info")
      .load()
    jdbcDF.show()

    jdbcDF.write
      .format("jdbc")
      .option("url", "jdbc:mysql://hadoop03:3306/ebiz?useSSL=false&characterEncoding=utf8")
      .option("user", "hive")
      .option("password", "12345678")
      .option("driver", "com.mysql.jdbc.Driver")
      .option("dbtable", "yanqi_product_info_back")
      .mode(SaveMode.Append)
      .save()

    spark.close()
  }
}
### --- Build and run
~~~ # Prepare the data files
~~~ data/users.parquet
~~~ data/emp.json
### --- Console output
D:\JAVA\jdk1.8.0_231\bin\java.exe "-javaagent:D:\IntelliJIDEA\IntelliJ IDEA 2019.3.3\lib\idea_rt.jar=53187:D:\IntelliJIDEA\IntelliJ IDEA 2019.3.3\bin" -Dfile.encoding=UTF-8 -classpath <JDK jars>;E:\NO.Z.80000.Hadoop.spark\SparkBigData\target\classes;<Maven dependency jars, including scala-library-2.12.10, spark-core_2.12-2.4.5, spark-sql_2.12-2.4.5, spark-hive_2.12-2.4.5, mysql-connector-java-5.1.44> cn.yanqi.sparksql.InputOutputFileDemo
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
+---------+-----------+------+------+------+------+----------+-------------------+-------------------+
|productId|productName|shopId| price|isSale|status|categoryId| createTime| modifyTime|
+---------+-----------+------+------+------+------+----------+-------------------+-------------------+
| 100101| 四川xxx/个|100056| 36.80| 1| 0| 72|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100102|单果xxx单果|100061| 58.70| 1| 1| 72|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100103|红颜xxx水果|100055| 48.70| 1| 0| 251|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100104|智利xxx水果|100050| 77.80| 1| 0| 72|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100105| Zexxx水果|100062| 58.70| 1| 1| 251|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100106|花果xxx蔬菜|100052| 25.60| 1| 1| 320|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100107|福建xxx蔬菜|100054| 25.70| 1| 1| 256|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100108|花果xxx礼盒|100057| 44.70| 1| 0| 72|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100109|花果xxx水果|100061| 23.70| 3| 1| 76|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100110|花果xxx水果|100056| 33.70| 1| 0| 73|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100111|黄冠xxx水果|100050|297.80| 1| 2| 249|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100112|花果xxx蔬菜|100055| 9.60| 1| 3| 254|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100113|上海xxx蔬菜|100052| 6.70| 1| 2| 254|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100114|花果xxx蔬菜|100058| 14.60| 1| 2| 255|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100115|花果xxx蒜头|100060| 19.30| 1| 1| 256|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100116|花果xxx蔬菜|100054| 7.30| 1| 1| 252|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100117|广西xxx水果|100050| 48.70| 1| 0| 72|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100118|红旗xxx水果|100060| 97.80| 1| 1| 73|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100119|进口xxx水果|100062| 38.70| 1| 1| 76|2020-07-12 13:22:22|2020-07-12 13:22:22|
| 100120| 九家xxx0g|100062| 18.70| 1| 1| 250|2020-07-12 13:22:22|2020-07-12 13:22:22|
+---------+-----------+------+------+------+------+----------+-------------------+-------------------+
only showing top 20 rows
Process finished with exit code 0