A problem hit when inserting data from Spark into MySQL: org.apache.spark.SparkException: Task not serializable
The error:
Exception in thread "main" org.apache.spark.SparkException: Task not serializable
Caused by: java.io.NotSerializableException: org.apache.commons.dbcp2.PoolingDataSource$PoolGuardConnectionWrapper
The code that triggered the error:
def saveMonthToMysql(everymonth_avg: RDD[(String, Float, String)]) = {
  DBs.setup()
  DB.localTx { implicit session =>
    everymonth_avg.foreach(r => {
      SQL("insert into price_month(name, avgprice, uploaddate) values(?,?,?)")
        .bind(r._1, r._2, r._3)
        .update()
        .apply()
    })
  }
}
My guess at the cause: because the RDD is passed in, the foreach closure runs on the executors, but it captures the implicit session opened by DB.localTx on the driver. That session wraps a DBCP connection (PoolGuardConnectionWrapper), which is not serializable, so Spark fails when it tries to serialize the closure to ship the task.
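The same failure can be reproduced with any non-serializable object captured inside an RDD closure. A minimal sketch (the class and names here are made up for illustration, standing in for the DB session):

```scala
import org.apache.spark.sql.SparkSession

object NotSerializableDemo {
  // A plain class that does NOT extend Serializable, playing the role of
  // PoolGuardConnectionWrapper / the implicit ScalikeJDBC session.
  class ConnectionLike {
    def send(s: String): Unit = println(s)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("demo").getOrCreate()
    val conn = new ConnectionLike // created on the driver

    // The closure below captures `conn`. Spark must serialize the closure to
    // send the task to executors, and that serialization throws
    // org.apache.spark.SparkException: Task not serializable.
    spark.sparkContext.parallelize(Seq(1, 2, 3)).foreach(n => conn.send(n.toString))
  }
}
```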
The fix:
Create a separate Scala object, move saveMonthToMysql into it, and change the parameter type to
(String, Float, String)
instead of
RDD[(String, Float, String)]
so the function handles one record at a time and no RDD or driver-side session is captured:
object Save {
  DBs.setup()
  def saveMonthToMysql(everymonth_avg: (String, Float, String)) = {
    DB.localTx { implicit session =>
      SQL("insert into price_month_copy1(name, avgprice, uploaddate) values(?,?,?)")
        .bind(everymonth_avg._1, everymonth_avg._2, everymonth_avg._3)
        .update()
        .apply()
    }
  }
}
To use it, just call it from the RDD, one record at a time:
everymonth_avg.foreach(x => Save.saveMonthToMysql(x))
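One caveat with this per-record call: each element opens its own transaction, which can be slow for large RDDs. A common refinement is foreachPartition, which sets up the connection once per partition and reuses it for all rows in that partition. A sketch under the same ScalikeJDBC setup as above (same table price_month_copy1):

```scala
everymonth_avg.foreachPartition { part =>
  // Runs on the executor; initialize the connection pool here, not on the driver,
  // so nothing non-serializable is captured by the closure.
  DBs.setup()
  DB.localTx { implicit session =>
    part.foreach { r =>
      SQL("insert into price_month_copy1(name, avgprice, uploaddate) values(?,?,?)")
        .bind(r._1, r._2, r._3)
        .update()
        .apply()
    }
  }
}
```

This keeps the serialization fix (the session is created inside the executor-side closure) while amortizing connection setup over a whole partition.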