Hudi - Could not create payload for class
Set a wrong payload class:
set `hoodie.datasource.write.payload.class`=`org.apache.hudi.common.model.PartialUpdateAvroPayloadX`;
The subsequent insert then fails with:
2023-05-18 15:50:23 An error occurred while calling o42161071.execute.
: org.apache.calcite.avatica.AvaticaSqlException: Error -1 (00000) : Error while executing SQL: Remote driver error: CalciteSQLException: Failed to fetch query result (code=2002) -> RuntimeException: While executing SQL [INSERT INTO `test_db`.`t31` partition (ds) (`_hoodie_commit_time`, `_hoodie_commit_seqno`, `_hoodie_record_key`, `_hoodie_partition_path`, `_hoodie_file_name`, `ut`, `pk`, `f0`, `f1`, `f2`, `f3`, `f4`, `ds`)
(SELECT CAST(NULL AS STRING) `_hoodie_commit_time`, CAST(NULL AS STRING) `_hoodie_commit_seqno`, CAST(NULL AS STRING) `_hoodie_record_key`, CAST(NULL AS STRING) `_hoodie_partition_path`, CAST(NULL AS STRING) `_hoodie_file_name`, CURRENT_TIMESTAMP `ut`, 1006 `pk`, 1 `f0`, CAST(NULL AS BIGINT) `f1`, CAST(NULL AS BIGINT) `f2`, CAST(NULL AS BIGINT) `f3`, CAST(NULL AS BIGINT) `f4`, 20230101 `ds`)] on JDBC sub-schema -> SQLException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time 20230518155010549
at org.apache.hudi.table.action.commit.BaseWriteHelper.write(BaseWriteHelper.java:75)
at org.apache.hudi.table.action.deltacommit.SparkUpsertDeltaCommitActionExecutor.execute(SparkUpsertDeltaCommitActionExecutor.java:45)
at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsert(HoodieSparkMergeOnReadTable.java:88)
at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsert(HoodieSparkMergeOnReadTable.java:80)
at org.apache.hudi.client.SparkRDDWriteClient.upsert(SparkRDDWriteClient.java:140)
at org.apache.hudi.DataSourceUtils.doWriteOperation(DataSourceUtils.java:206)
at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:363)
at org.apache.spark.sql.hudi.command.InsertIntoHoodieTableCommand$.run(InsertIntoHoodieTableCommand.scala:107)
at org.apache.spark.sql.hudi.command.InsertIntoHoodieTableCommand.run(InsertIntoHoodieTableCommand.scala:60)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:231)
at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3699)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:105)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:172)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:92)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:801)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:66)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3697)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:231)
at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:102)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:801)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:623)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:801)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:616)
at org.apache.livy.thriftserver.session.SqlJob.executeSql(SqlJob.java:93)
at org.apache.livy.thriftserver.session.SqlJob.call(SqlJob.java:73)
at org.apache.livy.thriftserver.session.SqlJob.call(SqlJob.java:40)
at org.apache.livy.rsc.driver.JobWrapper.call(JobWrapper.java:84)
at org.apache.livy.rsc.driver.JobWrapper.call(JobWrapper.java:34)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3) (9.180.24.230 executor 2): java.io.IOException: Could not create payload for class: org.apache.hudi.common.model.PartialUpdateAvroPayloadX
at org.apache.hudi.DataSourceUtils.createPayload(DataSourceUtils.java:130)
at org.apache.hudi.DataSourceUtils.createHoodieRecord(DataSourceUtils.java:228)
at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$createHoodieRecordRdd$4(HoodieSparkSqlWriter.scala:1104)
at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:194)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1483)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.ClassNotFoundException: org.apache.hudi.common.model.PartialUpdateAvroPayloadX
at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.hudi.common.util.ReflectionUtils.getClass(ReflectionUtils.java:54)
... 18 more
Caused by: org.apache.hudi.exception.HoodieException: Unable to load class
at org.apache.hudi.common.util.ReflectionUtils.getClass(ReflectionUtils.java:57)
at org.apache.hudi.common.util.ReflectionUtils.loadClass(ReflectionUtils.java:77)
at org.apache.hudi.DataSourceUtils.createPayload(DataSourceUtils.java:127)
... 16 more
The bottom of the trace shows the root cause: a java.lang.ClassNotFoundException for org.apache.hudi.common.model.PartialUpdateAvroPayloadX, a class that does not exist (note the stray trailing "X"). The relevant source:
import java.io.IOException;

import org.apache.avro.generic.GenericRecord;

import org.apache.hudi.common.model.HoodieRecordPayload;
import org.apache.hudi.common.util.ReflectionUtils;

/**
 * Utilities used throughout the data source.
 */
public class DataSourceUtils {

  /**
   * Create a payload class via reflection, passing in an ordering/precombine value.
   */
  public static HoodieRecordPayload createPayload(String payloadClass, GenericRecord record, Comparable orderingVal)
      throws IOException {
    try {
      // ReflectionUtils.loadClass resolves the class by name and invokes its
      // (GenericRecord, Comparable) constructor; a nonexistent class name fails
      // inside ReflectionUtils.getClass with a ClassNotFoundException, which is
      // wrapped in HoodieException("Unable to load class").
      return (HoodieRecordPayload) ReflectionUtils.loadClass(payloadClass,
          new Class<?>[] {GenericRecord.class, Comparable.class}, record, orderingVal);
    } catch (Throwable e) {
      // Any failure is rethrown as the IOException seen in the executor log.
      throw new IOException("Could not create payload for class: " + payloadClass, e);
    }
  }
}
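To see why the error surfaces as three nested exceptions, here is a minimal standalone sketch (no Hudi dependency required; the class name PayloadLoadRepro is made up for this illustration) that walks the same chain with plain reflection:

import java.io.IOException;

public class PayloadLoadRepro {
  public static void main(String[] args) throws Exception {
    String payloadClass = "org.apache.hudi.common.model.PartialUpdateAvroPayloadX";
    try {
      // Same first step as ReflectionUtils.getClass: resolve the class by name.
      Class.forName(payloadClass);
    } catch (ClassNotFoundException e) {
      // In Hudi, ReflectionUtils.getClass wraps this in
      // HoodieException("Unable to load class"), and DataSourceUtils.createPayload
      // wraps it once more into the IOException that shows up in the executor log.
      throw new IOException("Could not create payload for class: " + payloadClass, e);
    }
  }
}

Running it prints an IOException whose "Caused by" is the same ClassNotFoundException as in the log above.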
Testing confirms that for insert, hoodie.datasource.write.payload.class matters, while hoodie.compaction.payload.class has no effect even when set to a bogus value: the trace above shows the write path instantiating the payload per record (createHoodieRecord -> createPayload), whereas the compaction payload class is only consulted once a compaction actually runs.
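The fix is to reference a payload class that actually exists on the classpath. Assuming the intent here was Hudi's built-in partial-update payload, the corrected setting would be:

set `hoodie.datasource.write.payload.class`=`org.apache.hudi.common.model.PartialUpdateAvroPayload`;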