Fixing the Flink deployment error "Could not resolve ResourceManager address"
Problem description:
After the cluster is started, the TaskManager cannot connect to the ResourceManager, and its log shows:
2019-08-06 13:38:54,733 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address 'flink001/132.232.80.134': Cannot assign requested address
2019-08-06 13:38:54,734 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/127.0.0.1': Connection refused
2019-08-06 13:38:54,734 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/172.27.16.6': Connection refused
2019-08-06 13:38:54,734 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/127.0.0.1': Connection refused
2019-08-06 13:38:54,734 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/172.27.16.6': Connection refused
2019-08-06 13:38:54,735 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/127.0.0.1': Connection refused
2019-08-06 13:38:55,145 INFO org.apache.flink.runtime.net.ConnectionUtils - Trying to connect to address localhost/127.0.0.1:6123
2019-08-06 13:38:55,145 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address 'flink001/132.232.80.134': Cannot assign requested address
2019-08-06 13:38:55,146 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address 'flink001/132.232.80.134': Cannot assign requested address
.............................................
2019-08-06 13:39:08,888 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor - Could not resolve ResourceManager address akka.tcp://flink@localhost:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.tcp://flink@localhost:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
The full TaskManager log:
2019-08-06 13:38:51,547 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - --------------------------------------------------------------------------------
2019-08-06 13:38:51,548 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Starting TaskManager (Version: 1.7.2, Rev:ceba8af, Date:11.02.2019 @ 14:17:09 UTC)
2019-08-06 13:38:51,548 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - OS current user: root
2019-08-06 13:38:52,801 WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-08-06 13:38:53,127 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Current Hadoop/Kerberos user: root
2019-08-06 13:38:53,127 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 1.8/25.11-b03
2019-08-06 13:38:53,128 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Maximum heap size: 922 MiBytes
2019-08-06 13:38:53,128 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - JAVA_HOME: /opt/modules/jdk1.8.0_11
2019-08-06 13:38:53,131 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Hadoop version: 2.7.5
2019-08-06 13:38:53,131 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - JVM Options:
2019-08-06 13:38:53,131 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -XX:+UseG1GC
2019-08-06 13:38:53,131 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -Xms922M
2019-08-06 13:38:53,131 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -Xmx922M
2019-08-06 13:38:53,155 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -Dlog4j.configuration=file:/opt/modules/flink-1.7.2/conf/log4j.properties
2019-08-06 13:38:53,155 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -Dlogback.configurationFile=file:/opt/modules/flink-1.7.2/conf/logback.xml
2019-08-06 13:38:53,155 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - --------------------------------------------------------------------------------
2019-08-06 13:38:53,157 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Registered UNIX signal handlers for [TERM, HUP, INT]
2019-08-06 13:38:53,161 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Maximum number of open file descriptors is 100002.
2019-08-06 13:38:53,198 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, localhost
2019-08-06 13:38:53,199 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123
2019-08-06 13:38:53,199 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.size, 1024m
2019-08-06 13:38:53,199 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.heap.size, 1024m
2019-08-06 13:38:53,199 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2019-08-06 13:38:53,199 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 1
2019-08-06 13:38:54,306 WARN org.apache.flink.configuration.Configuration - Config uses deprecated configuration key 'jobmanager.rpc.address' instead of proper key 'rest.address'
2019-08-06 13:38:54,311 INFO org.apache.flink.runtime.util.LeaderRetrievalUtils - Trying to select the network interface and address to use by connecting to the leading JobManager.
2019-08-06 13:38:54,312 INFO org.apache.flink.runtime.util.LeaderRetrievalUtils - TaskManager will try to connect for 10000 milliseconds before falling back to heuristics
2019-08-06 13:38:54,314 INFO org.apache.flink.runtime.net.ConnectionUtils - Retrieved new target address localhost/127.0.0.1:6123.
2019-08-06 13:38:54,733 INFO org.apache.flink.runtime.net.ConnectionUtils - Trying to connect to address localhost/127.0.0.1:6123
2019-08-06 13:38:54,733 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address 'flink001/132.232.80.134': Cannot assign requested address
2019-08-06 13:38:54,734 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/127.0.0.1': Connection refused
2019-08-06 13:38:54,734 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/172.27.16.6': Connection refused
2019-08-06 13:38:54,734 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/127.0.0.1': Connection refused
2019-08-06 13:38:54,734 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/172.27.16.6': Connection refused
2019-08-06 13:38:54,735 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address '/127.0.0.1': Connection refused
2019-08-06 13:38:55,145 INFO org.apache.flink.runtime.net.ConnectionUtils - Trying to connect to address localhost/127.0.0.1:6123
2019-08-06 13:38:55,145 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address 'flink001/132.232.80.134': Cannot assign requested address
2019-08-06 13:38:55,146 INFO org.apache.flink.runtime.net.ConnectionUtils - Failed to connect from address 'flink001/132.232.80.134': Cannot assign requested address
2019-08-06 13:38:55,146 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - TaskManager will use hostname/address 'VM_16_6_centos' (127.0.0.1) for communication.
2019-08-06 13:38:55,148 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Trying to start actor system at vm_16_6_centos:0
2019-08-06 13:38:56,632 INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
2019-08-06 13:38:56,828 INFO akka.remote.Remoting - Starting remoting
2019-08-06 13:38:57,411 INFO akka.remote.Remoting - Remoting started; listening on addresses :[akka.tcp://flink@vm_16_6_centos:41270]
2019-08-06 13:38:57,430 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Actor system started at akka.tcp://flink@vm_16_6_centos:41270
2019-08-06 13:38:57,436 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Trying to start actor system at vm_16_6_centos:0
2019-08-06 13:38:57,557 INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
2019-08-06 13:38:57,591 INFO akka.remote.Remoting - Starting remoting
2019-08-06 13:38:57,649 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Actor system started at akka.tcp://flink-metrics@vm_16_6_centos:45213
2019-08-06 13:38:57,659 INFO akka.remote.Remoting - Remoting started; listening on addresses :[akka.tcp://flink-metrics@vm_16_6_centos:45213]
2019-08-06 13:38:57,671 INFO org.apache.flink.runtime.metrics.MetricRegistryImpl - No metrics reporter configured, no metrics will be exposed/reported.
2019-08-06 13:38:57,677 INFO org.apache.flink.runtime.blob.PermanentBlobCache - Created BLOB cache storage directory /tmp/blobStore-aeceeed4-d0fd-4a7e-88ee-db119c5d87de
2019-08-06 13:38:57,712 INFO org.apache.flink.runtime.blob.TransientBlobCache - Created BLOB cache storage directory /tmp/blobStore-bf7b2df0-ea20-4ef1-a0cc-b633904dfbdc
2019-08-06 13:38:57,726 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Starting TaskManager with ResourceID: e99285a420122256039587659f0825d4
2019-08-06 13:38:57,731 INFO org.apache.flink.runtime.io.network.netty.NettyConfig - NettyConfig [server address: VM_16_6_centos/127.0.0.1, server port: 0, ssl enabled: false, memory segment size (bytes): 32768, transport type: NIO, number of server threads: 1 (manual), number of client threads: 1 (manual), server connect backlog: 0 (use Netty's default), client connect timeout (sec): 120, send/receive buffer size (bytes): 0 (use Netty's default)]
2019-08-06 13:38:57,805 INFO org.apache.flink.runtime.taskexecutor.TaskManagerServices - Temporary file directory '/tmp': total 49 GB, usable 43 GB (87.76% usable)
2019-08-06 13:38:58,034 INFO org.apache.flink.runtime.io.network.buffer.NetworkBufferPool - Allocated 102 MB for network buffer pool (number of memory segments: 3278, bytes per segment: 32768).
2019-08-06 13:38:58,222 INFO org.apache.flink.runtime.query.QueryableStateUtils - Could not load Queryable State Client Proxy. Probable reason: flink-queryable-state-runtime is not in the classpath. To enable Queryable State, please move the flink-queryable-state-runtime jar from the opt to the lib folder.
2019-08-06 13:38:58,222 INFO org.apache.flink.runtime.query.QueryableStateUtils - Could not load Queryable State Server. Probable reason: flink-queryable-state-runtime is not in the classpath. To enable Queryable State, please move the flink-queryable-state-runtime jar from the opt to the lib folder.
2019-08-06 13:38:58,223 INFO org.apache.flink.runtime.io.network.NetworkEnvironment - Starting the network environment and its components.
2019-08-06 13:38:58,338 INFO org.apache.flink.runtime.io.network.netty.NettyClient - Successful initialization (took 114 ms).
2019-08-06 13:38:58,495 INFO org.apache.flink.runtime.io.network.netty.NettyServer - Successful initialization (took 156 ms). Listening on SocketAddress /127.0.0.1:43413.
2019-08-06 13:38:58,496 INFO org.apache.flink.runtime.taskexecutor.TaskManagerServices - Limiting managed memory to 0.7 of the currently free heap space (641 MB), memory will be allocated lazily.
2019-08-06 13:38:58,500 INFO org.apache.flink.runtime.io.disk.iomanager.IOManager - I/O manager uses directory /tmp/flink-io-6941b527-e8f1-4783-9e63-1a3055f23222 for spill files.
2019-08-06 13:38:58,728 INFO org.apache.flink.runtime.taskexecutor.TaskManagerConfiguration - Messages have a max timeout of 10000 ms
2019-08-06 13:38:58,767 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.taskexecutor.TaskExecutor at akka://flink/user/taskmanager_0 .
2019-08-06 13:38:58,856 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor - Connecting to ResourceManager akka.tcp://flink@localhost:6123/user/resourcemanager(00000000000000000000000000000000).
2019-08-06 13:38:58,908 INFO org.apache.flink.runtime.taskexecutor.JobLeaderService - Start job leader service.
2019-08-06 13:38:58,965 INFO org.apache.flink.runtime.filecache.FileCache - User file cache uses directory /tmp/flink-dist-cache-18b7f19b-c711-41ad-87ed-abe980dda51b
2019-08-06 13:39:08,888 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor - Could not resolve ResourceManager address akka.tcp://flink@localhost:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.tcp://flink@localhost:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
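The "Connection refused" lines in the log mean that TCP connections to localhost:6123 were rejected. A quick way to confirm this outside of Flink is a plain TCP probe. The following is only a minimal sketch: the helper name can_connect is made up for illustration, and the host and port are the jobmanager.rpc.address and jobmanager.rpc.port values printed in the log. It only tells you whether something accepts connections on that port, not whether that something is a healthy JobManager.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Values taken from the log: jobmanager.rpc.address=localhost, jobmanager.rpc.port=6123
    for host in ("localhost", "127.0.0.1"):
        state = "reachable" if can_connect(host, 6123) else "not reachable"
        print(f"{host}:6123 is {state}")
```

If the probe reports the port as not reachable on the machine that should run the JobManager, check that the JobManager process is running and which address it listens on before changing the TaskManager settings.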
------------------------------------------------------------------------
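Solution:
In the log above, the TaskManager selected the hostname 'VM_16_6_centos' (127.0.0.1) on its own. Add the following setting to the configuration file flink-conf.yaml to set its address explicitly:
# Specify the TaskManager's address; for a single-machine deployment, use localhost
taskmanager.host: localhost
Then restart Flink.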
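
Before restarting, it can help to confirm that the setting actually ended up in the file. The following is a minimal sketch, not Flink's own configuration loader: the function names parse_flink_conf and missing_host_keys are hypothetical, and the parser only handles flat "key: value" lines like the ones shown in the log, treating everything after a # as a comment.

```python
import sys


def parse_flink_conf(text: str) -> dict:
    """Parse flat 'key: value' lines; text after '#' is treated as a comment."""
    conf = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        # Split on the first colon only, so values such as file:/path stay intact.
        key, value = line.split(":", 1)
        conf[key.strip()] = value.strip()
    return conf


def missing_host_keys(conf: dict,
                      required=("jobmanager.rpc.address", "taskmanager.host")) -> list:
    """Return the required keys that are absent or empty in the parsed config."""
    return [key for key in required if not conf.get(key)]


if __name__ == "__main__":
    # Pass the path to flink-conf.yaml as the first argument; the log above
    # suggests it lives under /opt/modules/flink-1.7.2/conf/ on this machine.
    with open(sys.argv[1], encoding="utf-8") as handle:
        parsed = parse_flink_conf(handle.read())
    missing = missing_host_keys(parsed)
    print("missing keys:", ", ".join(missing) if missing else "none")
```

With the configuration shown in the log (no taskmanager.host entry), the script would list taskmanager.host as missing; after the change above it should list none.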
-----------------------------The end----------------------------------------