[Flink] Flink Job运行状态正常,但日志中偶报“FlinkException: The file LOG does not exist on the TaskExecutor.”

0 序言

  • Flink : 1.12
job start running time : 2022-12-27 17:40:47
problem throw time : 2023-05-11 16:41:29,394
flink cdc : mysql --> redis

在此之前,本flink cdc job运行一切正常(功能正常、日志正常)

1 问题描述

2023-05-11 16:41:29,394 ERROR org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerLogFileHandler [] - Failed to transfer file from TaskExecutor flink-231840-taskmanager-1-1-7a7b81ea-1cec-4fc6-88b3-e0983c42b824.
java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException: The file LOG does not exist on the TaskExecutor.
	at org.apache.flink.runtime.taskexecutor.TaskExecutor.lambda$requestFileUploadByFilePath$25(TaskExecutor.java:2031) ~[flink-dist_2.11-1.12.2-h0.cbu.dli.233.r4.jar:1.12.2-h0.cbu.dli.233.r4]
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604) ~[?:1.8.0_322]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_322]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_322]
	at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_322]
Caused by: org.apache.flink.util.FlinkException: The file LOG does not exist on the TaskExecutor.
	... 5 more
2023-05-11 16:41:29,395 ERROR org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerLogFileHandler [] - Unhandled exception.
org.apache.flink.util.FlinkException: The file LOG does not exist on the TaskExecutor.
	at org.apache.flink.runtime.taskexecutor.TaskExecutor.lambda$requestFileUploadByFilePath$25(TaskExecutor.java:2031) ~[flink-dist_2.11-1.12.2-h0.cbu.dli.233.r4.jar:1.12.2-h0.cbu.dli.233.r4]
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604) ~[?:1.8.0_322]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_322]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_322]
	at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_322]
2023-05-11 16:41:39,026 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator    [] - Triggering checkpoint 192727 (type=CHECKPOINT) @ 1683794498517 for job e091d022d9992d5c17eac075507ff1a2.

2 原因分析

该报错主要是找不到STDOUT文件,原因是程序中没有sout输出,当你去web-ui点击stdout目录,就会报一个这儿样的接口请求错误,并不影响程序运行,可忽略。

3 解决方法

  • 方式1:不影响程序正常运行,忽略此ERROR即可

目前博主的做法 (经验证,确实不影响程序功能正常运行)

  • 方式2:修改Flink源码 (未验证)
如果一定要修复,提供以下方案:
  1. 修复flink runtime源码
  2. 如果没有sout输出,不要随便点击查看 stdout 目录
  3. 随便加一点sout输出在程序里
  • 方式3 修改日志参数配置(未验证)

网友:发现flink客户端提交的任务,jobManager中多了两个日志相关参数

  • $internal.deployment.config-dir
  • $internal.yarn.log-config-file
    网友测验:手动在程序中参考 YarnLogConfigUtil.discoverLogConfigFile方法设置 $internal.yarn.log-config-file 参数,最终日志成功出现!

X 参考文献

posted @ 2023-05-11 17:18  千千寰宇  阅读(1439)  评论(0编辑  收藏  举报