蓝天

Hudi - Could not create payload for class

Set a wrong payload class (note the trailing X: this class does not exist):

set hoodie.datasource.write.payload.class=org.apache.hudi.common.model.PartialUpdateAvroPayloadX;
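Assuming the intent was Hudi's built-in partial-update payload, the working setting simply drops the trailing X (class name as shipped in recent Hudi releases; verify it exists in your Hudi version's jar before relying on it):

```sql
set hoodie.datasource.write.payload.class=org.apache.hudi.common.model.PartialUpdateAvroPayload;
```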

Executing an INSERT then fails with:

2023-05-18 15:50:23 An error occurred while calling o42161071.execute.
: org.apache.calcite.avatica.AvaticaSqlException: Error -1 (00000) : Error while executing SQL: Remote driver error: CalciteSQLException: Failed to fetch query result (code=2002) -> RuntimeException: While executing SQL [INSERT INTO `test_db`.`t31` partition (ds) (`_hoodie_commit_time`, `_hoodie_commit_seqno`, `_hoodie_record_key`, `_hoodie_partition_path`, `_hoodie_file_name`, `ut`, `pk`, `f0`, `f1`, `f2`, `f3`, `f4`, `ds`)
(SELECT CAST(NULL AS STRING) `_hoodie_commit_time`, CAST(NULL AS STRING) `_hoodie_commit_seqno`, CAST(NULL AS STRING) `_hoodie_record_key`, CAST(NULL AS STRING) `_hoodie_partition_path`, CAST(NULL AS STRING) `_hoodie_file_name`, CURRENT_TIMESTAMP `ut`, 1006 `pk`, 1 `f0`, CAST(NULL AS BIGINT) `f1`, CAST(NULL AS BIGINT) `f2`, CAST(NULL AS BIGINT) `f3`, CAST(NULL AS BIGINT) `f4`, 20230101 `ds`)] on JDBC sub-schema -> SQLException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time 20230518155010549
	at org.apache.hudi.table.action.commit.BaseWriteHelper.write(BaseWriteHelper.java:75)
	at org.apache.hudi.table.action.deltacommit.SparkUpsertDeltaCommitActionExecutor.execute(SparkUpsertDeltaCommitActionExecutor.java:45)
	at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsert(HoodieSparkMergeOnReadTable.java:88)
	at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsert(HoodieSparkMergeOnReadTable.java:80)
	at org.apache.hudi.client.SparkRDDWriteClient.upsert(SparkRDDWriteClient.java:140)
	at org.apache.hudi.DataSourceUtils.doWriteOperation(DataSourceUtils.java:206)
	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:363)
	at org.apache.spark.sql.hudi.command.InsertIntoHoodieTableCommand$.run(InsertIntoHoodieTableCommand.scala:107)
	at org.apache.spark.sql.hudi.command.InsertIntoHoodieTableCommand.run(InsertIntoHoodieTableCommand.scala:60)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
	at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:231)
	at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3699)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:105)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:172)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:92)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:801)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:66)
	at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3697)
	at org.apache.spark.sql.Dataset.<init>(Dataset.scala:231)
	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:102)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:801)
	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
	at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:623)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:801)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:616)
	at org.apache.livy.thriftserver.session.SqlJob.executeSql(SqlJob.java:93)
	at org.apache.livy.thriftserver.session.SqlJob.call(SqlJob.java:73)
	at org.apache.livy.thriftserver.session.SqlJob.call(SqlJob.java:40)
	at org.apache.livy.rsc.driver.JobWrapper.call(JobWrapper.java:84)
	at org.apache.livy.rsc.driver.JobWrapper.call(JobWrapper.java:34)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3) (9.180.24.230 executor 2): java.io.IOException: Could not create payload for class: org.apache.hudi.common.model.PartialUpdateAvroPayloadX
	at org.apache.hudi.DataSourceUtils.createPayload(DataSourceUtils.java:130)
	at org.apache.hudi.DataSourceUtils.createHoodieRecord(DataSourceUtils.java:228)
	at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$createHoodieRecordRdd$4(HoodieSparkSqlWriter.scala:1104)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:194)
	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
	at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
	at org.apache.spark.scheduler.Task.run(Task.scala:131)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1483)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

Caused by: java.lang.ClassNotFoundException: org.apache.hudi.common.model.PartialUpdateAvroPayloadX
	at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:264)
	at org.apache.hudi.common.util.ReflectionUtils.getClass(ReflectionUtils.java:54)
	... 18 more

Caused by: org.apache.hudi.exception.HoodieException: Unable to load class
	at org.apache.hudi.common.util.ReflectionUtils.getClass(ReflectionUtils.java:57)
	at org.apache.hudi.common.util.ReflectionUtils.loadClass(ReflectionUtils.java:77)
	at org.apache.hudi.DataSourceUtils.createPayload(DataSourceUtils.java:127)
	... 16 more

Relevant source code:

/**
 * Utilities used throughout the data source.
 */
public class DataSourceUtils {
  /**
   * Create a payload class via reflection, passing in an ordering/precombine value.
   */
  public static HoodieRecordPayload createPayload(String payloadClass, GenericRecord record, Comparable orderingVal)
      throws IOException {
    try {
      return (HoodieRecordPayload) ReflectionUtils.loadClass(payloadClass,
          new Class<?>[] {GenericRecord.class, Comparable.class}, record, orderingVal);
    } catch (Throwable e) {
      throw new IOException("Could not create payload for class: " + payloadClass, e);
    }
  }
}
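The failure chain in the trace can be reproduced outside Hudi with a minimal sketch: `Class.forName` throws `ClassNotFoundException` for the misspelled name, and a `createPayload`-style wrapper rethrows it as `IOException`, producing the "Could not create payload for class" message. The class below is a hypothetical stand-in, not Hudi code:

```java
import java.io.IOException;

public class PayloadLoadDemo {

    // Mimics DataSourceUtils.createPayload: any reflection failure
    // (including ClassNotFoundException) is wrapped in an IOException.
    static Object createPayload(String payloadClass) throws IOException {
        try {
            Class<?> clazz = Class.forName(payloadClass); // throws ClassNotFoundException
            return clazz.getDeclaredConstructor().newInstance();
        } catch (Throwable e) {
            throw new IOException("Could not create payload for class: " + payloadClass, e);
        }
    }

    public static void main(String[] args) {
        try {
            createPayload("org.apache.hudi.common.model.PartialUpdateAvroPayloadX");
        } catch (IOException e) {
            // Same message shape as the executor-side error in the trace above.
            System.out.println(e.getMessage());
            System.out.println("cause: " + e.getCause().getClass().getSimpleName());
        }
    }
}
```

Because the executors only attempt this reflective load while materializing records, the misconfiguration surfaces at write time as a task failure, not at `set` time.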

In insert testing, hoodie.datasource.write.payload.class affects the outcome, while hoodie.compaction.payload.class has no effect even when set to an invalid value; the compaction payload class is only consulted during compaction, so the insert path never attempts to load it.

posted on 2023-05-18 16:11