[Storm exception] Supervisor reports Connection refused after a topology is submitted to Nimbus
After starting the Storm cluster, check the supervisor log:
2020-09-26 10:26:41.289 o.a.s.s.o.a.c.u.Compatibility main [INFO] Running in ZooKeeper 3.4.x compatibility mode
2020-09-26 10:26:41.355 o.a.s.z.Zookeeper main [INFO] Staring ZK Curator
2020-09-26 10:26:41.355 o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl main [INFO] Starting
2020-09-26 10:26:41.373 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2020-09-26 10:26:41.373 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:host.name=192.168.171.135
2020-09-26 10:26:41.373 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:java.version=1.8.0_261
2020-09-26 10:26:41.373 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:java.vendor=Oracle Corporation
2020-09-26 10:26:41.373 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:java.home=/home/java/jdk1.8.0_261/jre
2020-09-26 10:26:41.374 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:java.class.path=/home/storm/apache-storm-1.1.2/lib/slf4j-api-1.7.21.jar:/home/storm/apache-storm-1.1.2/lib/kryo-3.0.3.jar:/home/storm/apache-storm-1.1.2/lib/clojure-1.7.0.jar:/home/storm/apache-storm-1.1.2/lib/reflectasm-1.10.1.jar:/home/storm/apache-storm-1.1.2/lib/ring-cors-0.1.5.jar:/home/storm/apache-storm-1.1.2/lib/log4j-core-2.8.2.jar:/home/storm/apache-storm-1.1.2/lib/storm-rename-hack-1.1.2.jar:/home/storm/apache-storm-1.1.2/lib/minlog-1.3.0.jar:/home/storm/apache-storm-1.1.2/lib/storm-core-1.1.2.jar:/home/storm/apache-storm-1.1.2/lib/log4j-slf4j-impl-2.8.2.jar:/home/storm/apache-storm-1.1.2/lib/log4j-over-slf4j-1.6.6.jar:/home/storm/apache-storm-1.1.2/lib/disruptor-3.3.2.jar:/home/storm/apache-storm-1.1.2/lib/log4j-api-2.8.2.jar:/home/storm/apache-storm-1.1.2/lib/objenesis-2.1.jar:/home/storm/apache-storm-1.1.2/lib/servlet-api-2.5.jar:/home/storm/apache-storm-1.1.2/lib/asm-5.0.3.jar:/home/storm/apache-storm-1.1.2/conf
2020-09-26 10:26:41.374 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:java.library.path=/usr/local/lib:/opt/local/lib:/usr/lib
2020-09-26 10:26:41.374 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:java.io.tmpdir=/tmp
2020-09-26 10:26:41.374 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:java.compiler=<NA>
2020-09-26 10:26:41.374 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:os.name=Linux
2020-09-26 10:26:41.374 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:os.arch=amd64
2020-09-26 10:26:41.374 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:os.version=4.8.0-36-generic
2020-09-26 10:26:41.374 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:user.name=root
2020-09-26 10:26:41.375 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:user.home=/root
2020-09-26 10:26:41.375 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:user.dir=/home/storm/zookeeper-3.4.14/bin
2020-09-26 10:26:41.376 o.a.s.s.o.a.z.ZooKeeper main [INFO] Initiating client connection, connectString=nimbus:2181,supervisor1:2181,supervisor2:2181 sessionTimeout=20000 watcher=org.apache.storm.shade.org.apache.curator.ConnectionState@6b0615ae
2020-09-26 10:26:41.417 o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl main [INFO] Default schema
2020-09-26 10:26:41.413 o.a.s.s.o.a.z.ClientCnxn main-SendThread(192.168.171.135:2181) [INFO] Opening socket connection to server 192.168.171.135/192.168.171.135:2181. Will not attempt to authenticate using SASL (unknown error)
2020-09-26 10:26:41.438 o.a.s.s.o.a.z.ClientCnxn main-SendThread(192.168.171.135:2181) [INFO] Socket connection established to 192.168.171.135/192.168.171.135:2181, initiating session
2020-09-26 10:26:41.447 o.a.s.s.o.a.z.ClientCnxn main-SendThread(192.168.171.135:2181) [INFO] Session establishment complete on server 192.168.171.135/192.168.171.135:2181, sessionid = 0x300004b1a3a0003, negotiated timeout = 20000
2020-09-26 10:26:41.453 o.a.s.s.o.a.c.f.s.ConnectionStateManager main-EventThread [INFO] State change: CONNECTED
2020-09-26 10:26:41.472 o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl Curator-Framework-0 [INFO] backgroundOperationsLoop exiting
2020-09-26 10:26:41.481 o.a.s.s.o.a.z.ClientCnxn main-EventThread [INFO] EventThread shut down
2020-09-26 10:26:41.482 o.a.s.s.o.a.z.ZooKeeper main [INFO] Session: 0x300004b1a3a0003 closed
2020-09-26 10:26:41.482 o.a.s.z.Zookeeper main [INFO] Staring ZK Curator
2020-09-26 10:26:41.483 o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl main [INFO] Starting
2020-09-26 10:26:41.487 o.a.s.s.o.a.z.ZooKeeper main [INFO] Initiating client connection, connectString=nimbus:2181,supervisor1:2181,supervisor2:2181/storm sessionTimeout=20000 watcher=org.apache.storm.shade.org.apache.curator.ConnectionState@11b377c5
2020-09-26 10:26:41.495 o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl main [INFO] Default schema
2020-09-26 10:26:41.501 o.a.s.s.o.a.z.ClientCnxn main-SendThread(nimbus:2181) [INFO] Opening socket connection to server nimbus/127.0.1.1:2181. Will not attempt to authenticate using SASL (unknown error)
2020-09-26 10:26:41.502 o.a.s.s.o.a.z.ClientCnxn main-SendThread(nimbus:2181) [INFO] Socket connection established to nimbus/127.0.1.1:2181, initiating session
2020-09-26 10:26:41.507 o.a.s.s.o.a.z.ClientCnxn main-SendThread(nimbus:2181) [INFO] Session establishment complete on server nimbus/127.0.1.1:2181, sessionid = 0x300004b1a3a0004, negotiated timeout = 20000
2020-09-26 10:26:41.507 o.a.s.s.o.a.c.f.s.ConnectionStateManager main-EventThread [INFO] State change: CONNECTED
2020-09-26 10:26:41.534 o.a.s.l.Localizer main [INFO] Reconstruct localized resource: /home/storm/apache-storm-1.1.2/data/supervisor/usercache
2020-09-26 10:26:41.539 o.a.s.l.Localizer main [WARN] No left over resources found for any user during reconstructing of local resources at: /home/storm/apache-storm-1.1.2/data/supervisor/usercache
2020-09-26 10:26:41.553 o.a.s.d.s.Supervisor main [INFO] Starting supervisor for storm version '1.1.2'.
2020-09-26 10:26:41.553 o.a.s.d.s.Supervisor main [INFO] Starting Supervisor with conf {storm.messaging.netty.min_wait_ms=100, storm.zookeeper.auth.user=null, storm.messaging.netty.buffer_size=5242880, storm.exhibitor.port=8080, pacemaker.auth.method=NONE, ui.filter=null, worker.profiler.enabled=false, ui.http.creds.plugin=org.apache.storm.security.auth.DefaultHttpCredentialsPlugin, topology.bolts.outgoing.overflow.buffer.enable=false, supervisor.supervisors.commands=[], logviewer.cleanup.age.mins=10080, topology.tuple.serializer=org.apache.storm.serialization.types.ListDelegateSerializer, drpc.port=3772, topology.max.spout.pending=null, topology.transfer.buffer.size=1024, logviewer.port=8000, worker.childopts=-Xmx768m, topology.component.cpu.pcore.percent=10.0, storm.daemon.metrics.reporter.plugins=[org.apache.storm.daemon.metrics.reporters.JmxPreparableReporter], drpc.childopts=-Xmx768m, nimbus.task.launch.secs=120, logviewer.childopts=-Xmx128m, storm.zookeeper.servers=[nimbus, supervisor1, supervisor2], topology.disruptor.batch.timeout.millis=1, storm.messaging.transport=org.apache.storm.messaging.netty.Context, storm.messaging.netty.authentication=false, topology.kryo.factory=org.apache.storm.serialization.DefaultKryoFactory, worker.heap.memory.mb=768, storm.network.topography.plugin=org.apache.storm.networktopography.DefaultRackDNSToSwitchMapping, supervisor.slots.ports=[6700, 6701, 6702, 6703], resource.aware.scheduler.eviction.strategy=org.apache.storm.scheduler.resource.strategies.eviction.DefaultEvictionStrategy, topology.stats.sample.rate=0.05, storm.local.dir=/home/storm/apache-storm-1.1.2/data, pacemaker.host=localhost, storm.messaging.netty.max_retries=300, topology.testing.always.try.serialize=false, storm.principal.tolocal=org.apache.storm.security.auth.DefaultPrincipalToLocal, java.library.path=/usr/local/lib:/opt/local/lib:/usr/lib, worker.gc.childopts=, storm.group.mapping.service.cache.duration.secs=120, zmq.linger.millis=5000, 
topology.multilang.serializer=org.apache.storm.multilang.JsonSerializer, drpc.request.timeout.secs=600, zmq.threads=1, nimbus.blobstore.class=org.apache.storm.blobstore.LocalFsBlobStore, topology.state.synchronization.timeout.secs=60, topology.worker.shared.thread.pool.size=4, topology.executor.receive.buffer.size=1024, supervisor.monitor.frequency.secs=3, storm.nimbus.retry.
After a topology is submitted, the supervisor log reports Connection refused, and the Storm supervisor process then stops on its own.
2020-09-26 10:36:25.071 o.a.s.d.s.Slot SLOT_6700 [INFO] STATE EMPTY msInState: 583470 -> WAITING_FOR_BASIC_LOCALIZATION msInState: 0
2020-09-26 10:36:25.094 o.a.s.u.StormBoundedExponentialBackoffRetry Async Localizer [WARN] WILL SLEEP FOR 2001ms (NOT MAX)
2020-09-26 10:36:27.098 o.a.s.u.StormBoundedExponentialBackoffRetry Async Localizer [WARN] WILL SLEEP FOR 2003ms (NOT MAX)
2020-09-26 10:36:29.103 o.a.s.u.StormBoundedExponentialBackoffRetry Async Localizer [WARN] WILL SLEEP FOR 2006ms (NOT MAX)
2020-09-26 10:36:31.111 o.a.s.u.StormBoundedExponentialBackoffRetry Async Localizer [WARN] WILL SLEEP FOR 2008ms (NOT MAX)
2020-09-26 10:36:33.120 o.a.s.u.StormBoundedExponentialBackoffRetry Async Localizer [WARN] WILL SLEEP FOR 2028ms (NOT MAX)
2020-09-26 10:36:35.150 o.a.s.u.NimbusClient Async Localizer [WARN] Ignoring exception while trying to get leader nimbus info from nimbus. will retry with a different seed host.
java.lang.RuntimeException: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: 拒绝连接 (Connection refused)
at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:108) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.security.auth.ThriftClient.<init>(ThriftClient.java:69) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.utils.NimbusClient.<init>(NimbusClient.java:128) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:84) [storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:58) [storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268) [storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.utils.Utils.getClientBlobStoreForSupervisor(Utils.java:538) [storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.downloadBaseBlobs(AsyncLocalizer.java:121) [storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:148) [storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:101) [storm-core-1.1.2.jar:1.1.2]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_261]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_261]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_261]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]
Caused by: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: 拒绝连接 (Connection refused)
at org.apache.storm.security.auth.TBackoffConnect.retryNext(TBackoffConnect.java:64) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:56) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:100) ~[storm-core-1.1.2.jar:1.1.2]
... 13 more
Caused by: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: 拒绝连接 (Connection refused)
at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:226) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.thrift.transport.TFramedTransport.open(TFramedTransport.java:81) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:105) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:100) ~[storm-core-1.1.2.jar:1.1.2]
... 13 more
Caused by: java.net.ConnectException: 拒绝连接 (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_261]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:476) ~[?:1.8.0_261]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:218) ~[?:1.8.0_261]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:200) ~[?:1.8.0_261]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:394) ~[?:1.8.0_261]
at java.net.Socket.connect(Socket.java:606) ~[?:1.8.0_261]
at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:221) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.thrift.transport.TFramedTransport.open(TFramedTransport.java:81) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:105) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:100) ~[storm-core-1.1.2.jar:1.1.2]
... 13 more
2020-09-26 10:36:35.162 o.a.s.l.AsyncLocalizer Async Localizer [WARN] Failed to download basic resources for topology-id wordcount-1-1601087782
2020-09-26 10:36:35.163 o.a.s.d.s.AdvancedFSOps Async Localizer [INFO] Deleting path /home/storm/apache-storm-1.1.2/data/supervisor/tmp/7d299dc3-04c5-4169-a72b-c83102ad0060
2020-09-26 10:36:35.165 o.a.s.d.s.AdvancedFSOps Async Localizer [INFO] Deleting path /home/storm/apache-storm-1.1.2/data/supervisor/stormdist/wordcount-1-1601087782
2020-09-26 10:36:35.165 o.a.s.l.AsyncLocalizer Async Localizer [WARN] Caught Exception While Downloading (rethrowing)...
org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [nimbus]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?
at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:112) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:58) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.utils.Utils.getClientBlobStoreForSupervisor(Utils.java:538) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.downloadBaseBlobs(AsyncLocalizer.java:121) ~[storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:148) [storm-core-1.1.2.jar:1.1.2]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:101) [storm-core-1.1.2.jar:1.1.2]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_261]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_261]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_261]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]
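The trace above shows the supervisor failing to open a Thrift socket to the Nimbus seed host. A minimal sketch for reproducing that check by hand from the supervisor node is given here. It assumes Nimbus listens on the default nimbus.thrift.port of 6627 (the port is not shown in the log excerpt), and the function names can_connect and diagnose are illustrative, not part of Storm.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


def diagnose(host: str, port: int) -> str:
    """Report what the hostname resolves to and whether the port accepts connections."""
    try:
        resolved = socket.gethostbyname(host)
    except socket.gaierror as exc:
        return f"{host}: cannot resolve ({exc})"
    state = "open" if can_connect(host, port) else "refused or unreachable"
    return f"{host} -> {resolved}, port {port}: {state}"
```

Calling `print(diagnose("nimbus", 6627))` on the supervisor node prints the address that "nimbus" resolves to on that machine, which makes a loopback mapping easy to spot.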
Settings in the configuration file:
storm.local.dir: "/home/storm/apache-storm-1.1.2/data"
storm.zookeeper.servers:
- "nimbus"
- "supervisor1"
- "supervisor2"
nimbus.seeds: ["nimbus"]
nimbus.childopts: "-Xmx1024m"
supervisor.childopts: "-Xmx1024m"
worker.childopts: "-Xmx768m"
ui.childopts: "-Xmx768m"
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
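The likely cause is a hostname resolution error.
Opening the hosts file showed that nimbus was mapped to 127.0.0.1, a loopback address (the supervisor log above likewise shows nimbus resolving to 127.0.1.1). After adding the addresses of all three machines to the hosts file, the problem was resolved.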
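To catch this kind of mapping before starting the cluster, the hosts file can be scanned for cluster hostnames that point at a loopback address. The following is a rough sketch of that idea. The function name loopback_hosts is illustrative, and the sample hosts content (including the 192.168.171.136 address) is made up for demonstration and is not taken from the original machines.

```python
import ipaddress
from typing import Dict, Iterable


def loopback_hosts(hosts_text: str, names: Iterable[str]) -> Dict[str, str]:
    """Return {hostname: ip} for each given name that is mapped to a loopback address."""
    wanted = set(names)
    found: Dict[str, str] = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *aliases = line.split()
        try:
            is_loopback = ipaddress.ip_address(ip).is_loopback
        except ValueError:
            continue  # skip entries whose first field is not a plain IP address
        if is_loopback:
            for alias in aliases:
                if alias in wanted and alias not in found:
                    found[alias] = ip
    return found


# Hypothetical hosts file content, for illustration only.
SAMPLE = """
127.0.0.1        localhost
127.0.1.1        nimbus
192.168.171.136  supervisor1
"""

print(loopback_hosts(SAMPLE, ["nimbus", "supervisor1", "supervisor2"]))
```

On a real node the text would come from reading /etc/hosts, and any hostname the function returns is one that other daemons on that machine will try to reach on the local loopback interface instead of on the network.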