PredictionIO + Universal Recommender: Problems Encountered While Quickly Developing and Deploying a Recommendation Engine (2)
1. pio build succeeds for Universal Recommender, but reports "No engine found"
Building and delpoying model
[INFO] [Engine$] Using command '/home/vagrant/pio_elastic1/PredictionIO-0.11.1-SNAPSHOT/sbt/sbt' at /home/vagrant/workspace/universal-recommender to build.
[INFO] [Engine$] If the path above is incorrect, this process will fail.
[INFO] [Engine$] Uber JAR disabled. Making sure lib/pio-assembly-0.11.1-SNAPSHOT.jar is absent.
[INFO] [Engine$] Going to run: /home/vagrant/pio_elastic1/PredictionIO-0.11.1-SNAPSHOT/sbt/sbt package assemblyPackageDependency in /home/vagrant/workspace/universal-recommender
[INFO] [Engine$] Compilation finished successfully.
[INFO] [Engine$] Looking for an engine...
[ERROR] [Engine$] No engine found. Your build might have failed. Aborting.
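This is caused by a Scala version mismatch. If you look inside target, the build output directory of universal-recommender, you will find a directory named scala-2.10.
Our PredictionIO, however, was built with Scala 2.11 specified, so it looks for the engine jar under scala-2.11, finds nothing there, and reports No engine found.
There is a temporary workaround: rename or copy scala-2.10 to scala-2.11, and PredictionIO will run normally.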
[vagrant@master universal-recommender]$ cd /home/vagrant/workspace/universal-recommender/target
[vagrant@master target]$ ls
resolution-cache  scala-2.10  streams
[vagrant@master target]$ cp -r scala-2.10 scala-2.11
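Before copying directories around, it can help to confirm which Scala binary versions actually have output under target. The following is a minimal sketch, not part of PredictionIO or Universal Recommender; the function names built_scala_versions and has_scala_version are made up for illustration, and the only assumption is the target/scala-X.Y directory layout shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of pio): inspect an engine's target/ directory.

# Print the Scala binary versions that have build output under the given
# target directory, one per line (for example "2.10").
built_scala_versions() {
  local target_dir="$1"
  local d
  for d in "$target_dir"/scala-*; do
    # Skip the unexpanded glob and anything that is not a directory.
    [ -d "$d" ] || continue
    basename "$d" | sed 's/^scala-//'
  done
}

# Exit status 0 if the target directory has output for the expected version.
has_scala_version() {
  local target_dir="$1" expected="$2"
  built_scala_versions "$target_dir" | grep -qxF "$expected"
}
```

For the setup in this post, has_scala_version /home/vagrant/workspace/universal-recommender/target 2.11 would fail before the copy, which matches the No engine found error.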
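2. Fixing the Scala version of Universal Recommender
The approach above is only a stopgap. The Scala versions of PredictionIO and Universal Recommender still need to be unified.
We can set the Scala version of Universal Recommender by editing build.sbt: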
name := "universal-recommender"

scalaVersion := "2.11.8"
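However, the build then fails during dependency resolution. The reason is that some of Universal Recommender's dependencies, such as the mahout packages, have no Scala 2.11 builds.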
[vagrant@master universal-recommender]$ pio build
[INFO] [Engine$] Using command '/home/vagrant/pio_elastic1/PredictionIO-0.11.1-SNAPSHOT/sbt/sbt' at /home/vagrant/workspace/universal-recommender to build.
[INFO] [Engine$] If the path above is incorrect, this process will fail.
[INFO] [Engine$] Uber JAR disabled. Making sure lib/pio-assembly-0.11.1-SNAPSHOT.jar is absent.
[INFO] [Engine$] Going to run: /home/vagrant/pio_elastic1/PredictionIO-0.11.1-SNAPSHOT/sbt/sbt package assemblyPackageDependency in /home/vagrant/workspace/universal-recommender
[ERROR] [Engine$] [error] (*:update) sbt.ResolveException: unresolved dependency: org.apache.mahout#mahout-math-scala_2.11;0.13.0: not found
[ERROR] [Engine$] [error] unresolved dependency: org.apache.mahout#mahout-spark_2.11;0.13.0: not found
[ERROR] [Engine$] [error] Total time: 69 s, completed Sep 8, 2017 10:06:41 AM
[ERROR] [Engine$] Return code of build command: /home/vagrant/pio_elastic1/PredictionIO-0.11.1-SNAPSHOT/sbt/sbt package assemblyPackageDependency is 1. Aborting.
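When the log is long, the unresolved coordinates are easy to miss. As a rough sketch, the filter below pulls them out of a saved build log. The function name unresolved_deps is hypothetical, and it assumes the log keeps one "unresolved dependency: ...: not found" message per line, as in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a build log on stdin and print each unresolved
# dependency as "group#artifact;version", one per line, without duplicates.
unresolved_deps() {
  sed -n 's/.*unresolved dependency: \([^:]*\): not found.*/\1/p' | sort -u
}
```

Fed the log above, it should list the two mahout artifacts with the _2.11 suffix, which are exactly the ones that have no Scala 2.11 build.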
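In the end I had to perform major surgery on build.sbt. The basic principles were:
1) If a dependency can be upgraded to a Scala 2.11 build, upgrade it.
2) If a dependency has no 2.11 build, as with mahout, pin it explicitly to its 2.10 artifact.
3) If a transitive dependency appears in both 2.10 and 2.11 builds and they conflict, exclude the 2.10 one.
The final result looks like this: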
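Rule 3 requires knowing which artifacts show up under both suffixes. A small filter along these lines can find them in any list of jar or artifact names. This is only a sketch: the function name mixed_scala_artifacts is made up, the input is assumed to be one file name per line (without directory parts), and how that list is obtained from your build is left open.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read artifact or jar file names on stdin (one per line)
# and print the base names that occur with both a _2.10 and a _2.11 suffix.
mixed_scala_artifacts() {
  # Turn "json4s-core_2.10-3.2.10.jar" into "json4s-core 2.10";
  # names without a _2.10/_2.11 suffix are dropped.
  sed -n 's/^\(.*\)_\(2\.1[01]\).*$/\1 \2/p' | sort -u |
    awk '{ seen[$1] = seen[$1] " " $2 }
         END {
           for (name in seen)
             if (seen[name] ~ /2\.10/ && seen[name] ~ /2\.11/) print name
         }' |
    sort
}
```

Each name it prints is a candidate for an exclude of the _2.10 variant on the dependency that pulls it in.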
import scalariform.formatter.preferences._
import com.typesafe.sbt.SbtScalariform
import com.typesafe.sbt.SbtScalariform.ScalariformKeys
import sbt.Keys.scalaVersion

name := "universal-recommender"

version := "0.6.1-SNAPSHOT"

organization := "com.actionml"

scalaVersion := "2.11.8"

val mahoutVersion = "0.13.0"

val pioVersion = "0.11.0-incubating"

val elasticsearch1Version = "1.7.6"

//val elasticsearch5Version = "5.1.2"

libraryDependencies ++= Seq(
  "org.apache.predictionio" %% "apache-predictionio-core" % pioVersion % "provided",
  "org.apache.predictionio" %% "apache-predictionio-data-elasticsearch1" % pioVersion % "provided",
  "org.apache.spark" % "spark-core_2.11" % "2.1.0" % "provided",
  "org.apache.spark" % "spark-mllib_2.11" % "1.4.0" % "provided",
  "org.xerial.snappy" % "snappy-java" % "1.1.1.7",
  // Mahout's Spark libs
  "org.apache.mahout" % "mahout-math-scala_2.10" % mahoutVersion
    exclude("com.github.scopt", "scopt_2.10")
    exclude("org.spire-math", "spire_2.10")
    exclude("org.scalanlp", "breeze_2.10")
    exclude("org.spire-math", "spire-macros_2.10")
    exclude("org.apache.spark", "spark-mllib_2.10")
    exclude("org.json4s", "json4s-ast_2.10")
    exclude("org.json4s", "json4s-core_2.10")
    exclude("org.json4s", "json4s-native_2.10")
    exclude("org.scalanlp", "breeze-macros_2.10")
    exclude("com.esotericsoftware.kryo", "kryo")
    exclude("com.twitter", "chill_2.10"),
  "org.apache.mahout" % "mahout-spark_2.10" % mahoutVersion
    exclude("com.github.scopt", "scopt_2.10")
    exclude("org.spire-math", "spire_2.10")
    exclude("org.scalanlp", "breeze_2.10")
    exclude("org.spire-math", "spire-macros_2.10")
    exclude("org.apache.spark", "spark-mllib_2.10")
    exclude("org.json4s", "json4s-ast_2.10")
    exclude("org.json4s", "json4s-core_2.10")
    exclude("org.json4s", "json4s-native_2.10")
    exclude("com.twitter", "chill_2.10")
    exclude("org.scalanlp", "breeze-macros_2.10")
    exclude("com.esotericsoftware.kryo", "kryo")
    exclude("org.apache.spark", "spark-launcher_2.10")
    exclude("org.apache.spark", "spark-unsafe_2.10")
    exclude("org.apache.spark", "spark-tags_2.10")
    exclude("org.apache.spark", "spark-core_2.10")
    exclude("org.apache.spark", "spark-network-common_2.10")
    exclude("org.apache.spark", "spark-streaming_2.10")
    exclude("org.apache.spark", "spark-graphx_2.10")
    exclude("org.apache.spark", "spark-catalyst_2.10")
    exclude("org.apache.spark", "spark-sql_2.10"),
  "org.apache.mahout" % "mahout-math" % mahoutVersion,
  "org.apache.mahout" % "mahout-hdfs" % mahoutVersion
    exclude("com.thoughtworks.xstream", "xstream")
    exclude("org.apache.hadoop", "hadoop-client"),
  //"org.apache.hbase" % "hbase-client" % "0.98.5-hadoop2" % "provided",
  // exclude("org.apache.zookeeper", "zookeeper"),
  // other external libs
  "com.thoughtworks.xstream" % "xstream" % "1.4.4"
    exclude("xmlpull", "xmlpull"),
  // possible build for es5
  //"org.elasticsearch" %% "elasticsearch-spark-13" % elasticsearch5Version % "provided",
  "org.elasticsearch" % "elasticsearch" % "1.7.5" % "provided",
  "org.elasticsearch" % "elasticsearch-spark-20_2.11" % "5.5.1",
  // exclude("org.apache.spark", "spark-launcher_2.11")
  // exclude("org.apache.spark", "spark-unsafe_2.11")
  // exclude("org.apache.spark", "spark-tags_2.11")
  // exclude("org.apache.spark", "spark-core_2.11")
  // exclude("org.apache.spark", "spark-network-common_2.11")
  // exclude("org.apache.spark", "spark-streaming_2.11")
  // exclude("org.apache.spark", "spark-catalyst_2.11")
  // exclude("org.apache.spark", "spark-sql_2.11"),
  "org.json4s" % "json4s-native_2.11" % "3.2.10")
  .map(_.exclude("org.apache.lucene","lucene-core")).map(_.exclude("org.apache.lucene","lucene-analyzers-common"))

resolvers += Resolver.mavenLocal

SbtScalariform.scalariformSettings

ScalariformKeys.preferences := ScalariformKeys.preferences.value
  .setPreference(AlignSingleLineCaseStatements, true)
  .setPreference(DoubleIndentClassDeclaration, true)
  .setPreference(DanglingCloseParenthesis, Prevent)
  .setPreference(MultilineScaladocCommentsStartOnFirstLine, true)

assemblyMergeStrategy in assembly := {
  case "plugin.properties" => MergeStrategy.discard
  case PathList(ps @ _*) if ps.last endsWith "package-info.class" =>
    MergeStrategy.first
  case PathList(ps @ _*) if ps.last endsWith "UnusedStubClass.class" =>
    MergeStrategy.first
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
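Open-source products like PredictionIO and Universal Recommender do suffer from official documentation that is incomplete or not updated promptly. The odds of succeeding on the first try by following the official manual are low. It takes repeated experiments and investigation, gathering clues from the official sites, the mailing lists, and elsewhere on the internet, before a problem is finally solved.
That said, the PredictionIO community is quite active. The developer of Universal Recommender is himself an important PredictionIO developer, is willing to actively maintain his own product and does so, and the technical support on the mailing list is solid.
The road ahead is still long.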