Jenkins connection loss

While using Jenkins, I ran into the connection loss shown below. Jenkins remoting may fail to keep the connection between master and slave alive when running long jobs.

FATAL: java.io.IOException: Unexpected termination of the channel
hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:295)
at hudson.remoting.Channel.terminate(Channel.java:814)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
at ......remote call to alkaid(Native Method)
at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1356)
at hudson.remoting.Request.call(Request.java:171)
at hudson.remoting.Channel.call(Channel.java:751)
at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:179)
at $Proxy41.join(Unknown Source)
at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:979)
at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:137)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:97)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:770)
at hudson.model.Build$BuildExecution.build(Build.java:199)
at hudson.model.Build$BuildExecution.doRun(Build.java:160)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:533)
at hudson.model.Run.execute(Run.java:1759)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:89)
at hudson.model.Executor.run(Executor.java:240)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2279)
at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2748)
at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:780)
at java.io.ObjectInputStream.<init>(ObjectInputStream.java:280)
at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:40)
at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)

I found this page, which gives some hints about where to look for detailed logs.

https://wiki.jenkins-ci.org/display/JENKINS/Remoting+issue

You'll have to find the archived slave launch logs under $JENKINS_HOME/logs/slaves/*/slave.log*, which record the 10 most recent connection logs for that slave.

One common cause of connection loss is a forced connection shutdown by the ping thread. When the ping thread detects that it has not received a reply within 4 minutes, it terminates the connection to prevent an infinite hang. This leaves a message like the following in the slave launch log:

Ping failed. Terminating
Nov 01, 2014 1:22:35 PM hudson.slaves.ChannelPinger$1 onDead
INFO: Ping failed. Terminating the channel.
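
To check whether the ping thread is what closed the channel, the archived launch logs can be scanned for that message. The following is a minimal Python sketch, not part of Jenkins: the function name `find_ping_failures` is hypothetical, and it assumes the log layout $JENKINS_HOME/logs/slaves/*/slave.log* and the message text shown above.

```python
import glob
import os


def find_ping_failures(jenkins_home, marker="Ping failed. Terminating"):
    """Return {log_path: [matching lines]} for slave launch logs containing marker."""
    # Assumed layout: <jenkins_home>/logs/slaves/<slave name>/slave.log*
    pattern = os.path.join(jenkins_home, "logs", "slaves", "*", "slave.log*")
    hits = {}
    for path in sorted(glob.glob(pattern)):
        # Replace undecodable bytes instead of failing on an odd log file.
        with open(path, encoding="utf-8", errors="replace") as fh:
            matches = [line.rstrip("\n") for line in fh if marker in line]
        if matches:
            hits[path] = matches
    return hits


if __name__ == "__main__":
    home = os.environ.get("JENKINS_HOME", ".")
    for path, lines in find_ping_failures(home).items():
        print(path)
        for line in lines:
            print("    " + line)
```

Any slave whose log shows up in the result had its channel closed by the ping thread at least once among the retained launch logs.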

If you've identified that this is how your connection is lost, you now need to dig deeper and understand why the ping reply is so late. This diagnosis usually requires disabling the ping thread, so that a slave can hang and exhibit the problem without the connection being killed. Then use tools like jmap and jstack to obtain diagnostic information from the slave JVM. The wiki page linked above points to guides on how to use those tools.
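
Once the slave is hanging, a thread dump can be saved for later inspection. Below is a rough Python sketch, assuming the JDK's `jstack` is on the PATH and that you already know the process ID of the slave JVM. The helper name `capture_thread_dump` and the output file naming are hypothetical.

```python
import subprocess
import time
from pathlib import Path


def capture_thread_dump(pid, out_dir=".", tool="jstack"):
    """Run `<tool> <pid>` and save its output to a timestamped file; return the path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    out_path = out_dir / f"threaddump-{pid}-{stamp}.txt"
    # check=True raises CalledProcessError if the tool exits with a non-zero status.
    result = subprocess.run(
        [tool, str(pid)], capture_output=True, text=True, check=True
    )
    out_path.write_text(result.stdout)
    return out_path
```

Taking several dumps a few seconds apart, for example by calling `capture_thread_dump(12345, "dumps")` in a loop with a placeholder PID, makes it easier to see which threads stay blocked between samples.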

In the end, I resolved the issue by setting two Java system properties when starting the Jenkins server:

java -Dhudson.remoting.Launcher.pingTimeoutSec=600 -Dhudson.remoting.Launcher.pingIntervalSec=1200 -jar jenkins.war --httpPort=7070 >> jenkins.log &

posted @ 2015-04-02 16:54  wangjianxa