Exception: Timeout while feeding partition
21/12/19 16:38:54 INFO scheduler.TaskSetManager: Starting task 3.0 in stage 1.0 (TID 6, slave1, executor 3, partition 3, NODE_LOCAL, 8011 bytes)
21/12/19 16:38:54 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 1.0 (TID 3, slave1, executor 3): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 372, in main
    process()
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 367, in process
    serializer.dump_stream(func(split_index, iterator), outfile)
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 2499, in pipeline_func
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 2499, in pipeline_func
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 2499, in pipeline_func
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 352, in func
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 801, in func
  File "/yarn/nm/usercache/root/appcache/application_1639558616940_0032/container_1639558616940_0032_01_000001/tfspark.zip/tensorflowonspark/TFSparkNode.py", line 511, in _train
Exception: Timeout while feeding partition

	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:456)
	at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:592)
	at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:575)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:410)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator$class.foreach(Iterator.scala:891)
	at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
	at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
	at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
	at org.apache.spark.InterruptibleIterator.to(InterruptibleIterator.scala:28)
	at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
	at org.apache.spark.InterruptibleIterator.toBuffer(InterruptibleIterator.scala:28)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
	at org.apache.spark.InterruptibleIterator.toArray(InterruptibleIterator.scala:28)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:945)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:945)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2121)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2121)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:121)
	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$11.apply(Executor.scala:407)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1408)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:413)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
21/12/19 16:38:54 INFO scheduler.TaskSetManager: Starting task 0.1 in stage 1.0 (TID 7, slave2, executor 1, partition 0, NODE_LOCAL, 8011 bytes)
21/12/19 16:38:54 INFO scheduler.TaskSetManager: Lost task 1.0 in stage 1.0 (TID 4) on slave2, executor 1: org.apache.spark.api.python.PythonException (Traceback (most recent call last):
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 372, in main
    process()
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 367, in process
    serializer.dump_stream(func(split_index, iterator), outfile)
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 2499, in pipeline_func
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 2499, in pipeline_func
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 2499, in pipeline_func
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 352, in func
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1580995/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 801, in func
  File "/yarn/nm/usercache/root/appcache/application_1639558616940_0032/container_1639558616940_0032_01_000001/tfspark.zip/tensorflowonspark/TFSparkNode.py", line 511, in _train
Exception: Timeout while feeding partition
) [duplicate 1]
This error was thrown while running TensorFlowOnSpark code.
Solution:
Modify TensorFlowOnSpark-2.2.4/tensorflowonspark/TFCluster.py.
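The post does not show the exact edit, so the following is only a sketch of the usual fix. The traceback ends in TFSparkNode.py's _train, which raises "Timeout while feeding partition" when a Spark task cannot push its RDD partition into the TensorFlow input queue within feed_timeout seconds, i.e. the training function is consuming data too slowly or has stalled. In TensorFlowOnSpark 2.x, that timeout is the feed_timeout parameter of TFCluster.train(), with a default of 600 seconds:

    # tensorflowonspark/TFCluster.py (unmodified)
    def train(self, dataRDD, num_epochs=0, feed_timeout=600, qname='input'):
        ...

Raising that feed_timeout default (for example to 1800) is presumably the change the post has in mind. The same effect can be had without patching the library by passing feed_timeout from the driver; in the sketch below, sc, map_fun, args, num_executors, num_ps, and dataRDD are placeholders for your own SparkContext, training function, arguments, executor counts, and input RDD:

    from tensorflowonspark import TFCluster

    # Start the cluster in Spark input mode, where RDD partitions are fed
    # to the TensorFlow workers through a queue.
    cluster = TFCluster.run(sc, map_fun, args, num_executors, num_ps,
                            tensorboard=False,
                            input_mode=TFCluster.InputMode.SPARK)

    # Allow up to 1800 seconds per partition instead of the default 600
    # before "Timeout while feeding partition" is raised.
    cluster.train(dataRDD, num_epochs=args.epochs, feed_timeout=1800)

    cluster.shutdown()

If the timeout still fires with a generous value, the training loop itself is likely the bottleneck (or has crashed), so it is also worth checking the executor logs for errors from the TensorFlow side.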