java.lang.IllegalArgumentException: requirement failed: Column features must be of type struct

I ran into the following error while learning Spark's machine learning library:

Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Column features must be of type struct<type:tinyint,size:int,indices:array<int>,values:array<double>> but was actually struct<type:tinyint,size:int,indices:array<int>,values:array<double>>.
	at scala.Predef$.require(Predef.scala:281)
	at org.apache.spark.ml.util.SchemaUtils$.checkColumnType(SchemaUtils.scala:44)
	at org.apache.spark.ml.PredictorParams.validateAndTransformSchema(Predictor.scala:51)
	at org.apache.spark.ml.PredictorParams.validateAndTransformSchema$(Predictor.scala:46)
	at org.apache.spark.ml.regression.LinearRegression.org$apache$spark$ml$regression$LinearRegressionParams$$super$validateAndTransformSchema(LinearRegression.scala:177)
	at org.apache.spark.ml.regression.LinearRegressionParams.validateAndTransformSchema(LinearRegression.scala:120)
	at org.apache.spark.ml.regression.LinearRegressionParams.validateAndTransformSchema$(LinearRegression.scala:108)
	at org.apache.spark.ml.regression.LinearRegression.validateAndTransformSchema(LinearRegression.scala:177)
	at org.apache.spark.ml.Predictor.transformSchema(Predictor.scala:144)
	at org.apache.spark.ml.PipelineStage.transformSchema(Pipeline.scala:74)
	at org.apache.spark.ml.Predictor.fit(Predictor.scala:100)
	at sparkML.Regression$.delayedEndpoint$sparkML$Regression$1(Regression.scala:34)
	at sparkML.Regression$delayedInit$body.apply(Regression.scala:10)
	at scala.Function0.apply$mcV$sp(Function0.scala:39)
	at scala.Function0.apply$mcV$sp$(Function0.scala:39)
	at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
	at scala.App.$anonfun$main$1$adapted(App.scala:80)
	at scala.collection.immutable.List.foreach(List.scala:392)
	at scala.App.main(App.scala:80)
	at scala.App.main$(App.scala:78)
	at sparkML.Regression$.main(Regression.scala:10)
	at sparkML.Regression.main(Regression.scala)

Process finished with exit code 1

The cause is that your Spark version is 2.0+ but the code calls types from the mllib package; since Spark 2.0, the DataFrame-based ml package is recommended instead. The message is confusing because the expected and actual struct types print identically: org.apache.spark.ml.linalg.VectorUDT and org.apache.spark.mllib.linalg.VectorUDT share the same underlying schema but are different user-defined types, so the check fails.

The original, incorrect imports:

import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint

should be changed to:

import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.feature.LabeledPoint
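
With the corrected imports, a minimal end-to-end sketch of the fix might look like the following. The tiny in-memory training set, column names, and app name are illustrative assumptions, not the original Regression.scala:

```scala
// Minimal sketch (assumed data): LinearRegression with ml-package types.
import org.apache.spark.ml.feature.LabeledPoint
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.sql.SparkSession

object Regression {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("Regression")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // ml.feature.LabeledPoint wraps a label and an ml.linalg.Vector,
    // which is the VectorUDT that ml.regression.LinearRegression expects.
    val training = Seq(
      LabeledPoint(1.0, Vectors.dense(1.0, 0.5)),
      LabeledPoint(2.0, Vectors.dense(2.0, 1.0)),
      LabeledPoint(3.0, Vectors.dense(3.0, 1.5))
    ).toDF()  // columns: "label", "features"

    // fit() no longer throws: the features column is the ml VectorUDT.
    val model = new LinearRegression().fit(training)
    println(s"coefficients: ${model.coefficients}")

    spark.stop()
  }
}
```

Had the mllib Vectors/LabeledPoint been used to build `training`, the same `fit` call would reproduce the exception above.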
posted @ 2020-08-31 15:04 by 鹤望兰号