Description

When executing a free-style job on a Windows agent, this error appears:

//wc/psp/Branches/Release/PSP/content/EaTrax/MichaelFranti.wav#1 - added as P:\Release\PSP\content\EaTrax\MichaelFranti.wav
//wc/psp/Branches/Release/PSP/content/EaTrax/MiikeSnow.wav#1 - added as P:\Release\PSPFATAL: command execution failed
hudson.util.IOException2: Failed to join the process
at hudson.Proc$RemoteProc.join(Proc.java:311)
at hudson.Launcher$ProcStarter.join(Launcher.java:275)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:83)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:577)
at hudson.model.Build$RunnerImpl.build(Build.java:165)
at hudson.model.Build$RunnerImpl.doRun(Build.java:133)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:417)
at hudson.model.Run.run(Run.java:1176)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:123)
Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.io.EOFException
at hudson.remoting.Request$1.get(Request.java:218)
at hudson.remoting.Request$1.get(Request.java:172)
at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
at hudson.Proc$RemoteProc.join(Proc.java:303)
... 12 more
Caused by: hudson.remoting.RequestAbortedException: java.io.EOFException
at hudson.remoting.Request.abort(Request.java:257)
at hudson.remoting.Channel.terminate(Channel.java:594)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:872)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:852)
FATAL: Unable to delete script file C:\DOCUME~1\pcfarm08\LOCALS~1\Temp\2\hudson2101488692771117152.bat
hudson.util.IOException2: remote file operation failed
at hudson.FilePath.act(FilePath.java:677)
at hudson.FilePath.act(FilePath.java:665)
at hudson.FilePath.delete(FilePath.java:922)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:93)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:577)
at hudson.model.Build$RunnerImpl.build(Build.java:165)
at hudson.model.Build$RunnerImpl.doRun(Build.java:133)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:417)
at hudson.model.Run.run(Run.java:1176)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:123)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:408)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:547)
at hudson.FilePath.act(FilePath.java:672)
... 13 more
[locks-and-latches] Releasing all the locks
[locks-and-latches] All the locks released
FATAL: channel is already closed
hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:408)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:547)
at hudson.Launcher$RemoteLauncher.kill(Launcher.java:734)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:422)
at hudson.model.Run.run(Run.java:1176)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:123)

The job executes a batch command, and it took 1 h 17 min to fail in Hudson 1.336.

deccico
added a comment - 14/Dec/09 7:46 AM Hi, we noticed that this happens only in a job that runs for around 2 hours and includes a step that produces no output for a long time (we don't know exactly how long).
thanks
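
The observation above suggests one workaround to try: keep the build step printing something while the long command runs. The thread does not confirm that silence is the actual cause, so this is only a minimal sketch under that assumption. The function name `run_with_heartbeat`, the heartbeat text, and the 300-second default are all hypothetical choices, not anything Hudson provides.

```python
import subprocess
import sys
import threading
import time


def run_with_heartbeat(cmd, interval_seconds=300.0, out=sys.stdout):
    """Run cmd and print a heartbeat line every interval_seconds until it exits.

    Hypothetical helper: the idea is only to keep the console log non-silent.
    Returns the exit code of the child process.
    """
    proc = subprocess.Popen(cmd)
    done = threading.Event()

    def beat():
        started = time.monotonic()
        # Event.wait returns False on timeout and True once done is set.
        while not done.wait(interval_seconds):
            elapsed = int(time.monotonic() - started)
            print(f"[heartbeat] still running after {elapsed}s", file=out, flush=True)

    worker = threading.Thread(target=beat, daemon=True)
    worker.start()
    try:
        return proc.wait()
    finally:
        # Stop the heartbeat thread as soon as the command has finished.
        done.set()
        worker.join()


if __name__ == "__main__":
    # Usage: python heartbeat.py <command> [args...]
    sys.exit(run_with_heartbeat(sys.argv[1:]))
```

The build step would then call the wrapper with the original command as its arguments, so the job's exit code is still the command's exit code.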

erwan_q
added a comment - 23/Dec/09 5:02 AM I have the same problem on another platform (IBM AIX) with Hudson 1.338 and JDK 1.6.
This seems to be a general bug:
FATAL: command execution failed
hudson.util.IOException2: Failed to join the process
at hudson.Proc$RemoteProc.join(Proc.java:311)
at hudson.Launcher$ProcStarter.join(Launcher.java:275)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:83)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:583)
at hudson.model.Build$RunnerImpl.build(Build.java:165)
at hudson.model.Build$RunnerImpl.doRun(Build.java:133)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:417)
at hudson.model.Run.run(Run.java:1179)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:122)
Caused by: java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: hudson.remoting.ProxyOutputStream$Flush
at hudson.remoting.Channel$1.adapt(Channel.java:580)
at hudson.remoting.Channel$1.adapt(Channel.java:575)
at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
at hudson.Proc$RemoteProc.join(Proc.java:303)
... 12 more
Caused by: java.lang.NoClassDefFoundError: hudson.remoting.ProxyOutputStream$Flush
at java.lang.J9VMInternals.verifyImpl(Native Method)
at java.lang.J9VMInternals.verify(J9VMInternals.java:72)
at java.lang.J9VMInternals.initialize(J9VMInternals.java:134)
at hudson.remoting.RemoteOutputStream.readObject(RemoteOutputStream.java:90)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:48)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
at java.lang.reflect.Method.invoke(Method.java:600)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:986)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1865)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1769)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1345)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1963)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1887)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1769)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1345)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:364)
at hudson.remoting.UserRequest.deserialize(UserRequest.java:168)
at hudson.remoting.UserRequest.perform(UserRequest.java:88)
at hudson.remoting.UserRequest.perform(UserRequest.java:48)
at hudson.remoting.Request$2.run(Request.java:270)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:453)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:315)
at java.util.concurrent.FutureTask.run(FutureTask.java:150)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:736)
Caused by: java.lang.ClassNotFoundException: hudson.remoting.ProxyOutputStream$Flush
at java.net.URLClassLoader.findClass(URLClassLoader.java:421)
at java.lang.ClassLoader.loadClass(ClassLoader.java:643)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:346)
at java.lang.ClassLoader.loadClass(ClassLoader.java:609)
... 27 more
Notifying upstream projects of job completion
Finished: FAILURE

erwan_q
added a comment - 23/Dec/09 7:51 AM Thanks for your comment.
I think Hudson-5141 is more about "ClassNotFoundException: hudson.model.job".
My problem is specifically with hudson.remoting.ProxyOutputStream$Flush.
I created a new bug, JENKINS-5149, for this specific issue.
I'm trying to find a way to get a stable environment, but with no success so far.
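
One thing that can be checked when a ClassNotFoundException like this appears is whether the jar on the failing side actually contains the class. The thread does not establish that a stale or mismatched jar is the cause here, so treat this as a diagnostic sketch only. The helper name `jar_has_class` and the jar path in the usage note are hypothetical.

```python
import zipfile


def jar_has_class(jar_path, class_name):
    """Return True if the jar at jar_path contains the given class.

    class_name is a fully qualified name; nested classes keep their '$',
    for example 'hudson.remoting.ProxyOutputStream$Flush'.
    """
    # A class a.b.C$D is stored in the jar as the entry a/b/C$D.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

For example, `jar_has_class("slave.jar", "hudson.remoting.ProxyOutputStream$Flush")` could be run against the jar used on the agent and against the one shipped with the master, to see whether the two differ.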

sini_m
added a comment - 10/Feb/10 2:44 AM We have encountered this exact same issue with a Windows XP master/slave setup (Hudson 1.342). It occurs about 2 hours into the build process, but not consistently. Usually builds of the same project succeed without any problems. Here's a snippet of our failures (omitting some of the lines, since they are exactly the same as in the original post).
FATAL: command execution failed
hudson.util.IOException2: Failed to join the process
at hudson.Proc$RemoteProc.join(Proc.java:312)
at hudson.Launcher$ProcStarter.join(Launcher.java:275)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:83)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:582)
at hudson.model.Build$RunnerImpl.build(Build.java:165)
at hudson.model.Build$RunnerImpl.doRun(Build.java:132)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1198)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:122)
Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.io.EOFException
...
FATAL: channel is already closed
hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:412)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:551)
at hudson.Launcher$RemoteLauncher.kill(Launcher.java:735)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:421)
at hudson.model.Run.run(Run.java:1198)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:122)
@deccico: Thanks for the tip about lack of output; we will see if adding some output (if possible) helps.

levante
added a comment - 13/May/10 11:26 AM We've seen similar join-related issues on our Hudson deployment (master/slaves all running 64-bit Windows 2003 Server, slaves as Windows Service).
It usually happens on a job running for a little longer than an hour. The latest Hudson version on which I can report this happening is 1.356.
Output:
FATAL: command execution failed.
hudson.util.IOException2: Failed to join the process
at hudson.Proc$RemoteProc.join(Proc.java:312)
at hudson.Launcher$ProcStarter.join(Launcher.java:278)
at hudson.tasks.Ant.perform(Ant.java:212)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:600)
at hudson.model.Build$RunnerImpl.build(Build.java:174)
at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1244)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:122)
Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
at hudson.remoting.Request$1.get(Request.java:218)
at hudson.remoting.Request$1.get(Request.java:172)
at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
at hudson.Proc$RemoteProc.join(Proc.java:304)
... 11 more
Caused by: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
at hudson.remoting.Request.abort(Request.java:257)
at hudson.remoting.Channel.terminate(Channel.java:598)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:880)
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(Unknown Source)
at java.io.FilterInputStream.read(Unknown Source)
at java.io.BufferedInputStream.fill(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at java.io.ObjectInputStream$PeekInputStream.peek(Unknown Source)
at java.io.ObjectInputStream$BlockDataInputStream.peek(Unknown Source)
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:856)
FATAL: channel is already closed
hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:412)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:551)
at hudson.Launcher$RemoteLauncher.kill(Launcher.java:738)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:421)
at hudson.model.Run.run(Run.java:1244)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:122)
Is this a problem with Hudson or the Join plugin?

hudson.util.IOException2: Failed to join the process
at hudson.Proc$RemoteProc.join(Proc.java:312)
at hudson.Launcher$ProcStarter.join(Launcher.java:278)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:83)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:584)
at hudson.model.Build$RunnerImpl.build(Build.java:174)
at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1244)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:122)
Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request$1.get(Request.java:218)
at hudson.remoting.Request$1.get(Request.java:172)
at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
at hudson.Proc$RemoteProc.join(Proc.java:304)
... 12 more
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:257)
at hudson.remoting.Channel.terminate(Channel.java:598)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:880)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:862)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:856)
FATAL: Unable to delete script file /tmp/hudson4172490882779075596.sh
hudson.util.IOException2: remote file operation failed: /tmp/hudson4172490882779075596.sh at hudson.remoting.Channel@4b4e563e:hudson_slave
at hudson.FilePath.act(FilePath.java:743)
at hudson.FilePath.act(FilePath.java:729)
at hudson.FilePath.delete(FilePath.java:984)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:93)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:584)
at hudson.model.Build$RunnerImpl.build(Build.java:174)
at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1244)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:122)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:412)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:551)
at hudson.FilePath.act(FilePath.java:736)
... 13 more
FATAL: channel is already closed
hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:412)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:551)
at hudson.Launcher$RemoteLauncher.kill(Launcher.java:738)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:421)
at hudson.model.Run.run(Run.java:1244)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:122)
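
The traces in this thread share the same top-level "Failed to join the process" message but end in different innermost causes (EOFException, SocketException: Connection reset, NoClassDefFoundError). A small script can pull out the last "Caused by:" line of each trace in a console log, which makes it easier to tell the reports apart. This is a rough sketch that assumes the plain-text layout shown above; `root_causes` is a hypothetical helper name.

```python
def root_causes(console_log):
    """Return the innermost 'Caused by:' entry of each stack trace in the log.

    A trace is treated as finished when a line appears that is not a stack
    frame ('at ...'), not an elision ('... N more'), and not 'Caused by:'.
    """
    causes = []
    last = None
    for raw in console_log.splitlines():
        line = raw.strip()
        if line.startswith("Caused by:"):
            # Remember the deepest cause seen so far in the current trace.
            last = line[len("Caused by:"):].strip()
        elif line.startswith("at ") or line.startswith("..."):
            continue
        else:
            # Any other line ends the current trace.
            if last is not None:
                causes.append(last)
                last = None
    if last is not None:
        causes.append(last)
    return causes
```

Traces without any "Caused by:" line, such as the trailing "channel is already closed" block, contribute nothing to the result.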

davida2009
added a comment - 17/Jun/10 1:53 AM
I occasionally get a similar error when Hudson checks out a large SVN project. It happens at night, when system backups are also running on the network, so the checkout may take a very long time.

I am running Hudson v1.360. The master runs on Windows XP; the slaves are Linux. The failure is seen on jobs running on either the master or a slave.

ERROR: Failed to check out https://<snip>
org.tmatesoft.svn.core.SVNException: svn: SSL peer shut down incorrectly
svn: REPORT request failed on '/subversion/zodiac/!svn/vcc/default'
at org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:103)
at org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:87)
at org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:616)
at org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:273)
at org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:261)
at org.tmatesoft.svn.core.internal.io.dav.DAVConnection.doReport(DAVConnection.java:266)
at org.tmatesoft.svn.core.internal.io.dav.DAVRepository.runReport(DAVRepository.java:1263)
at org.tmatesoft.svn.core.internal.io.dav.DAVRepository.update(DAVRepository.java:820)
at org.tmatesoft.svn.core.wc.SVNUpdateClient.update(SVNUpdateClient.java:558)
at org.tmatesoft.svn.core.wc.SVNUpdateClient.doCheckout(SVNUpdateClient.java:934)
at hudson.scm.SubversionSCM$CheckOutTask.invoke(SubversionSCM.java:742)
at hudson.scm.SubversionSCM$CheckOutTask.invoke(SubversionSCM.java:660)
at hudson.FilePath.act(FilePath.java:753)
at hudson.FilePath.act(FilePath.java:735)
at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:653)
at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:601)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1044)
at hudson.model.AbstractBuild$AbstractRunner.checkout(AbstractBuild.java:479)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:411)
at hudson.model.Run.run(Run.java:1241)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:124)
Caused by: javax.net.ssl.SSLException: SSL peer shut down incorrectly
at com.sun.net.ssl.internal.ssl.InputRecord.readV3Record(Unknown Source)
at com.sun.net.ssl.internal.ssl.InputRecord.read(Unknown Source)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(Unknown Source)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readDataRecord(Unknown Source)
at com.sun.net.ssl.internal.ssl.AppInputStream.read(Unknown Source)
at java.io.BufferedInputStream.read1(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at org.tmatesoft.svn.core.internal.util.ChunkedInputStream.read(ChunkedInputStream.java:70)
at sun.nio.cs.StreamDecoder.readBytes(Unknown Source)
at sun.nio.cs.StreamDecoder.implRead(Unknown Source)
at sun.nio.cs.StreamDecoder.read(Unknown Source)
at java.io.InputStreamReader.read(Unknown Source)
at org.tmatesoft.svn.core.internal.io.dav.http.XMLReader.read(XMLReader.java:39)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipChar(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.readData(HTTPConnection.java:726)
at org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.readData(HTTPConnection.java:691)
at org.tmatesoft.svn.core.internal.io.dav.http.HTTPRequest.dispatch(HTTPRequest.java:216)
at org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:345)
... 20 more
Email was triggered for: Failure
Sending email for trigger: Failure

nirmal_patel
added a comment - 22/Jun/10 3:02 AM - edited
I am seeing the same on my Windows XP master-slave setup. I am running the latest Hudson, version 1.363.
I am using the clone-workspace-scm plugin to copy my workspace from the master to the slave (150).

Started by user anonymous
Building remotely on 150
FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.call(Request.java:137)
at hudson.remoting.Channel.call(Channel.java:555)
at hudson.FilePath.act(FilePath.java:742)
at hudson.FilePath.act(FilePath.java:735)
at hudson.FilePath.unzip(FilePath.java:415)
at hudson.FileSystemProvisioner$Default$WorkspaceSnapshotImpl.restoreTo(FileSystemProvisioner.java:227)
at hudson.plugins.cloneworkspace.CloneWorkspaceSCM$Snapshot.restoreTo(CloneWorkspaceSCM.java:344)
at hudson.plugins.cloneworkspace.CloneWorkspaceSCM.checkout(CloneWorkspaceSCM.java:126)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1044)
at hudson.model.AbstractBuild$AbstractRunner.checkout(AbstractBuild.java:479)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:411)
at hudson.model.Run.run(Run.java:1253)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:127)
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:257)
at hudson.remoting.Channel.terminate(Channel.java:602)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:893)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:875)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:869)
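
The innermost cause in the trace above is a java.io.EOFException thrown from ObjectInputStream.readObject, which is what a reader sees when the byte stream ends while it is waiting for the next serialized object. The following is a minimal sketch of that behaviour using only the JDK; it is not Hudson's remoting code, and the class and method names (ChannelEofSketch, serialize, readUntilEof) are made up for illustration. A byte array stands in for whatever the peer sent before the connection went away.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class ChannelEofSketch {

    /** Serializes the messages into a byte array (hypothetical stand-in for data sent by the peer). */
    public static byte[] serialize(String... messages) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            for (String message : messages) {
                out.writeObject(message);
            }
        }
        return buffer.toByteArray();
    }

    /** Reads objects until the stream ends and returns how many were read before EOFException. */
    public static int readUntilEof(byte[] bytes) throws IOException, ClassNotFoundException {
        int count = 0;
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            while (true) {
                in.readObject(); // throws EOFException once no more data is available
                count++;
            }
        } catch (EOFException e) {
            // The stream ended where the next object was expected.
            return count;
        }
    }

    public static void main(String[] args) throws Exception {
        int read = readUntilEof(serialize("command-1", "command-2"));
        System.out.println("objects read before EOF: " + read);
    }
}
```

In the trace, the reader thread reports this condition as "Unexpected termination of the channel", so the EOFException itself only says that the other end stopped sending; it does not say why.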

williamleara
added a comment - 25/Jun/10 2:56 PM
I think I'm seeing the same thing, or at least the same thing that levante reported.

Hudson 1.362 in Tomcat 5.5 on Server 2003 32-bit, JRE 1.6. The slave is Server 2008 R2, 64-bit, JRE 1.6, connected via the hudsonslave service. When a lot of jobs come in at once, the hudsonslave service crashes.

hudson.util.IOException2: Failed to join the process
at hudson.Proc$RemoteProc.join(Proc.java:312)
at hudson.Launcher$ProcStarter.join(Launcher.java:280)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:83)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601)
at hudson.model.Build$RunnerImpl.build(Build.java:174)
at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1253)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:124)
Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
at hudson.remoting.Request$1.get(Request.java:218)
at hudson.remoting.Request$1.get(Request.java:172)
at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
at hudson.Proc$RemoteProc.join(Proc.java:304)
... 12 more
Caused by: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
at hudson.remoting.Request.abort(Request.java:257)
at hudson.remoting.Channel.terminate(Channel.java:602)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:893)
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(Unknown Source)
at java.io.BufferedInputStream.fill(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at java.io.ObjectInputStream$PeekInputStream.peek(Unknown Source)
at java.io.ObjectInputStream$BlockDataInputStream.peek(Unknown Source)
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:869)
FATAL: Unable to delete script file C:\Windows\TEMP\hudson750406825412804255.bat
hudson.util.IOException2: remote file operation failed: C:\Windows\TEMP\hudson750406825412804255.bat at hudson.remoting.Channel@1db6bbe:weyoun1
at hudson.FilePath.act(FilePath.java:749)
at hudson.FilePath.act(FilePath.java:735)
at hudson.FilePath.delete(FilePath.java:990)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:93)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601)
at hudson.model.Build$RunnerImpl.build(Build.java:174)
at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1253)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:124)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:412)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:555)
at hudson.FilePath.act(FilePath.java:742)
... 13 more
FATAL: channel is already closed
hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:412)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:555)
at hudson.Launcher$RemoteLauncher.kill(Launcher.java:744)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:421)
at hudson.model.Run.run(Run.java:1253)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:124)

Linus Tolke
added a comment - 22/Jul/10 8:26 AM - edited
I notice the same (or a similar) problem on my Debian/Linux host with Hudson 1.357. I have the remote job running on the same host, started with "Launch slave via execution of command on the master":

(The same problems also occur without the -Xdebug -Xrunjdwp... arguments.) The purpose of starting it this way is to let some long-running jobs run at a lower priority (via nice) so that other jobs finish quicker.

For me this has only happened when there are two jobs running on the slave. Both jobs fail with the same error message at the same time. Then the slave(s) is restarted and new jobs are allocated on the slave. I have previously noticed that the jobs on the old slave are still running, so I assume that the problem is just the connection between the master and the slave.

Today I have noticed this twice and managed to capture the output from the master Hudson. The first time, two jobs were running on one slave and probably none on the other; the second time involved just one of the slaves, without any jobs running.

Here is the first time with the output from the master. Notice that both slaves (updateshow and nightly) are crashing, taken down, or failing at the same time. You can also see that I run the master Hudson with the -verbosegc argument.

This is the backtrace from the log of one of the jobs on the nightly slave:

FATAL: command execution failed
hudson.util.IOException2: Failed to join the process
at hudson.Proc$RemoteProc.join(Proc.java:312)
at hudson.Launcher$ProcStarter.join(Launcher.java:278)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:83)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:600)
at hudson.model.Build$RunnerImpl.build(Build.java:174)
at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1241)
at hudson.matrix.MatrixRun.run(MatrixRun.java:130)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:122)
Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request$1.get(Request.java:218)
at hudson.remoting.Request$1.get(Request.java:172)
at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
at hudson.Proc$RemoteProc.join(Proc.java:304)
... 12 more
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:257)
at hudson.remoting.Channel.terminate(Channel.java:598)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:880)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:862)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2552)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:856)
FATAL: Unable to delete script file /tmp/hudson18796.sh
hudson.util.IOException2: remote file operation failed: /tmp/hudson18796.sh at hudson.remoting.Channel@2e646f:nightly
at hudson.FilePath.act(FilePath.java:752)
at hudson.FilePath.act(FilePath.java:738)
at hudson.FilePath.delete(FilePath.java:993)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:93)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:600)
at hudson.model.Build$RunnerImpl.build(Build.java:174)
at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1241)
at hudson.matrix.MatrixRun.run(MatrixRun.java:130)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:122)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:412)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:551)
at hudson.FilePath.act(FilePath.java:745)
... 13 more
FATAL: channel is already closed
hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:412)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:551)
at hudson.Launcher$RemoteLauncher.kill(Launcher.java:738)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:421)
at hudson.model.Run.run(Run.java:1241)
at hudson.matrix.MatrixRun.run(MatrixRun.java:130)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:122)

The second time today only the updateshow job got this problem and I cannot find any failed jobs. The jobs on updateshow are a lot shorter (around 5m). Here is the output from the master process:

In this second case I guess that the following happened: the full GC on the master took 225 seconds (almost four minutes). During that time the master could not answer the ping, so the slave gave up on the master and aborted. Eventually the master detected this and restarted the slave. This happened after the job started at 12:20 had completed and before the job starting at 12:30, so no jobs were affected. For this case, the simple fix would be to increase the TIME_OUT of the ping (end of the hudson.remoting.PingThread class).
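
The reasoning above (a peer that is stalled longer than the ping timeout is treated as dead) can be sketched with plain java.util.concurrent classes. This is only an illustration under stated assumptions, not the hudson.remoting.PingThread implementation: the class and method names (PingTimeoutSketch, pingAnswered) and the millisecond values are hypothetical, and a sleeping task stands in for a master paused by a full GC.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PingTimeoutSketch {

    /**
     * Returns true if the simulated peer answers within timeoutMillis.
     * peerPauseMillis models how long the peer is stalled before it can reply.
     */
    public static boolean pingAnswered(long peerPauseMillis, long timeoutMillis)
            throws InterruptedException {
        ExecutorService peer = Executors.newSingleThreadExecutor();
        try {
            Callable<String> reply = () -> {
                Thread.sleep(peerPauseMillis); // the peer is stalled, e.g. by a long GC pause
                return "pong";
            };
            Future<String> pending = peer.submit(reply);
            try {
                pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return true;  // reply arrived in time
            } catch (TimeoutException | ExecutionException e) {
                return false; // no usable reply: the caller would give up on the peer
            }
        } finally {
            peer.shutdownNow(); // interrupt the simulated peer if it is still sleeping
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Short pause with a generous timeout, then a long pause with a short timeout.
        System.out.println("short pause answered: " + pingAnswered(50, 1000));
        System.out.println("long pause answered:  " + pingAnswered(1000, 100));
    }
}
```

With a pause longer than the timeout the ping is reported as unanswered even though the peer is still alive, which matches the guess that raising the timeout above the longest expected GC pause would avoid the disconnect.
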
In the first case I don't understand what has happened.
This is a complicated problem to investigate because it just happens occasionally and the log is overwritten.
To allow us to investigate this further, let me suggest that the master-slave function is improved with a function to move or save the old log every time a slave starts so the log of the slave that crashed, failed, or terminated is preserved.
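
The ping reasoning above can be illustrated with a small sketch. This is not the code of hudson.remoting.PingThread: the class PingTimeoutSketch, the ping helper, and the 300 ms stall (standing in for a long full GC) are all made up for illustration. It only shows why a peer that stalls longer than the timeout gets treated as dead, and why a longer timeout would ride out the same stall.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PingTimeoutSketch {

    /** Returns true if the (hypothetical) remote ping answered within timeoutMillis. */
    public static boolean ping(Callable<Void> remotePing, long timeoutMillis)
            throws InterruptedException {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        try {
            Future<Void> reply = exec.submit(remotePing);
            reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // No answer in time: the caller would now give up on the peer.
            return false;
        } catch (ExecutionException e) {
            // The ping itself failed.
            return false;
        } finally {
            exec.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated peer that is unresponsive for 300 ms (assumption: stands in for a full GC pause).
        Callable<Void> stalledPeer = () -> { Thread.sleep(300); return null; };

        System.out.println(ping(stalledPeer, 100));   // prints false: timeout shorter than the stall
        System.out.println(ping(stalledPeer, 1000));  // prints true: timeout longer than the stall
    }
}
```

With the 225-second pause reported above, any ping timeout shorter than the pause would give up in the same way.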

Linus Tolke
added a comment - 25/Jul/10 8:37 AM It happened again (at 04:59 AM this morning). I guess it was the long garbage collection (221 seconds) that made the client close the connection because the ping was not answered.
This time I had moved the slave-nightly.log file to another name. Unfortunately, nothing was written to the file. I had hoped to see the "No Ping response received" log error. I guess the slave-nightly.log file is written by the master on the master, and since the connection has already been closed by the client at this point, the master cannot write anything.
I still think the most frustrating part of this is that the logs of the slave cannot be accessed. I will see if I can work out how to start the slaves logging to a file.
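
For the wish to have slave-side logs survive a restart, one possible approach is a java.util.logging FileHandler that appends and rotates instead of overwriting. This is only a sketch: the class SlaveLogSketch, the logger name, and the file pattern are invented for illustration, and the thread does not confirm that the slave agent offers a hook for installing such a handler.

```java
import java.io.IOException;
import java.util.logging.FileHandler;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

public class SlaveLogSketch {

    /** Returns a logger that also writes to files matching the given pattern. */
    public static Logger fileLogger(String name, String pattern) throws IOException {
        // Rotate through up to 5 files of roughly 1 MB each.
        // append=true keeps the existing log on restart instead of overwriting it.
        FileHandler handler = new FileHandler(pattern, 1_000_000, 5, true);
        handler.setFormatter(new SimpleFormatter());
        Logger logger = Logger.getLogger(name);
        logger.addHandler(handler);
        return logger;
    }

    public static void main(String[] args) throws IOException {
        // "%g" is replaced by the rotation generation number (0 is the newest file).
        Logger log = fileLogger("slave.sketch", "slave-sketch-%g.log");
        log.info("slave started");
        log.warning("No ping response received (simulated message)");
    }
}
```

Because the handler lives in the slave JVM and writes locally, a message like the simulated one would still reach the file after the channel to the master is gone.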

mistafunk
added a comment - 26/Jul/10 12:59 AM are you sure that it was fixed in 1.367? i run this version and i still see this problem. additionally, this fix was not yet mentioned in the release notes...
i'll wait for the next release...

Tzuchien
added a comment - 29/Jul/10 3:25 AM I think that this "fix" may only be a workaround instead of a real fix.
I experienced this problem in 1.363. Yesterday I upgraded to 1.368 and executed the same job in question - a matrix job with about 40 configurations, each of which took from 3 to 4 hours. Most of them used to fail in 1.363, but after switching to 1.368, only one of them failed. So I think the problem still exists.

mistafunk
added a comment - 29/Jul/10 4:25 AM - edited yes, i think there is still something broken in this regard. i am running 1.368 on a linux64 master with two 32bit virtualbox slave VMs (win7/linux) and another linux64 node with three 64bit virtualbox slave VMs (win7/linux/osx).
below is the typical stacktrace i see regularly (2 out of 3 jobs) on the windows slaves (32bit and 64bit win7, not on linux or osx). it does not seem to be related to the host machines. the exceptions almost always occur on longish linker, nsis or file transfer (archiving artifacts) operations.
update: the slave log does not contain anything special when it happens.
i appreciate any help to debug this.
kind regards,
simon
FATAL: command execution failed
hudson.util.IOException2: Failed to join the process
at hudson.Proc$RemoteProc.join(Proc.java:355)
at hudson.Launcher$ProcStarter.join(Launcher.java:280)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:83)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601)
at hudson.model.Build$RunnerImpl.build(Build.java:174)
at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1257)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:129)
Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request$1.get(Request.java:218)
at hudson.remoting.Request$1.get(Request.java:172)
at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
at hudson.Proc$RemoteProc.join(Proc.java:347)
... 12 more
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:257)
at hudson.remoting.Channel.terminate(Channel.java:602)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:893)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:875)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:869)
FATAL: Unable to delete script file C:\Windows\TEMP\hudson1632955036060142154.bat
hudson.util.IOException2: remote file operation failed: C:\Windows\TEMP\hudson1632955036060142154.bat at hudson.remoting.Channel@72945e31:sidian_win7_32
at hudson.FilePath.act(FilePath.java:749)
at hudson.FilePath.act(FilePath.java:735)
at hudson.FilePath.delete(FilePath.java:990)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:93)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601)
at hudson.model.Build$RunnerImpl.build(Build.java:174)
at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1257)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:129)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:412)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:555)
at hudson.FilePath.act(FilePath.java:742)
... 13 more
FATAL: channel is already closed
hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:412)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:555)
at hudson.Launcher$RemoteLauncher.kill(Launcher.java:744)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:421)
at hudson.model.Run.run(Run.java:1257)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:129)

kflorian
added a comment - 09/Aug/10 8:09 AM We see this on a Linux slave, running on the same machine as the master.
Seems to happen more frequently under high load.
Stacktrace is slightly different:

FATAL: command execution failed
hudson.util.IOException2: Failed to join the process
at hudson.Proc$RemoteProc.join(Proc.java:312)
at hudson.Launcher$ProcStarter.join(Launcher.java:278)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:83)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601)
at hudson.model.Build$RunnerImpl.build(Build.java:174)
at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1241)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:124)
Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request$1.get(Request.java:218)
at hudson.remoting.Request$1.get(Request.java:172)
at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
at hudson.Proc$RemoteProc.join(Proc.java:304)
... 12 more
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:257)
at hudson.remoting.Channel.terminate(Channel.java:598)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:880)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:862)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.readFully(ObjectInputStream.java:2700)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1648)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1323)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:856)
FATAL: Unable to delete script file /tmp/hudson700909144823232476.sh
hudson.util.IOException2: remote file operation failed: /tmp/hudson700909144823232476.sh at hudson.remoting.Channel@7f5a75a2:hudson_slave
at hudson.FilePath.act(FilePath.java:749)
at hudson.FilePath.act(FilePath.java:735)
at hudson.FilePath.delete(FilePath.java:990)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:93)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601)
at hudson.model.Build$RunnerImpl.build(Build.java:174)
at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1241)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:124)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:412)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:551)
at hudson.FilePath.act(FilePath.java:742)
... 13 more
FATAL: channel is already closed
hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:412)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:551)
at hudson.Launcher$RemoteLauncher.kill(Launcher.java:738)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:421)
at hudson.model.Run.run(Run.java:1241)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:124)

mistafunk
added a comment - 11/Aug/10 2:16 PM a random guess: could this issue be related to high slave response times? on my 5 slaves i observe 5-40ms for the linux32/linux64/osx slaves and 200-600ms for the two win32/win64 slaves. the former three slaves run rock solid, on the two windows slaves i regularly (75%) get "failed to join the process".
note that our linux32/win32 and linux64/win64 slave pairs run on the same virtualbox hosts (linux64) each. i guess, this rules out the hosts as problem sources. so is this a problem of hudson slave communication in combination with virtualbox!? i'm going to try different vbox network setups, maybe the vbox nat is the offender...
i am veeery motivated to finally resolve this one

Also having the same problem. Running Hudson 1.378 on CentOS 5.5 connecting to a CentOS 5.2 slave via ssh.

Hi, we noticed that this happens just in a job that execute a job with a duration of around 2hours that has a step with no output in a long time (don't know exactly how long)

-deccico
This is when it fails for me, too. The error occurs while running a script that takes quite a while and does not produce output. However, if I make the script print to the console frequently as it's running, this error doesn't pop up.

Is it because the ssh connection seems idle and isn't using a "keep-alive"?

FATAL: command execution failed
hudson.util.IOException2: Failed to join the process
at hudson.Proc$RemoteProc.join(Proc.java:355)
at hudson.Launcher$ProcStarter.join(Launcher.java:280)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:82)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601)
at hudson.model.Build$RunnerImpl.build(Build.java:174)
at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1280)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:137)
Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request$1.get(Request.java:218)
at hudson.remoting.Request$1.get(Request.java:172)
at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
at hudson.Proc$RemoteProc.join(Proc.java:347)
... 12 more
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:257)
at hudson.remoting.Channel.terminate(Channel.java:658)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:950)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:931)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:925)
FATAL: Unable to delete script file /tmp/hudson8197175197320902892.sh
hudson.util.IOException2: remote file operation failed: /tmp/hudson8197175197320902892.sh at hudson.remoting.Channel@7069b861:slave
at hudson.FilePath.act(FilePath.java:749)
at hudson.FilePath.act(FilePath.java:735)
at hudson.FilePath.delete(FilePath.java:990)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601)
at hudson.model.Build$RunnerImpl.build(Build.java:174)
at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1280)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:137)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:451)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:607)
at hudson.FilePath.act(FilePath.java:742)
... 13 more
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:931)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:925)
FATAL: channel is already closed
hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:451)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:607)
at hudson.Launcher$RemoteLauncher.kill(Launcher.java:744)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:421)
at hudson.model.Run.run(Run.java:1280)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:137)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:931)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:925)
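
The workaround described above (making a long, silent step print to the console regularly) can be sketched as a small wrapper. This is a sketch under assumptions, not a confirmed fix: the thread does not establish that an idle connection is the root cause, and the class HeartbeatSketch, the 30-second interval, and the heartbeat message are invented for illustration.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

public class HeartbeatSketch {

    /**
     * Runs the command and prints a heartbeat line every intervalMillis
     * until it exits. Returns the command's exit code.
     */
    public static int runWithHeartbeat(long intervalMillis, String... command)
            throws IOException, InterruptedException {
        // inheritIO lets the child's own output go straight to the console.
        Process process = new ProcessBuilder(command).inheritIO().start();
        long start = System.currentTimeMillis();
        // waitFor returns false when the interval elapses before the process exits.
        while (!process.waitFor(intervalMillis, TimeUnit.MILLISECONDS)) {
            long elapsedSeconds = (System.currentTimeMillis() - start) / 1000;
            System.out.println("[heartbeat] still running after " + elapsedSeconds + " s");
        }
        return process.exitValue();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java HeartbeatSketch <command> [args...]");
            return;
        }
        // 30 seconds is an arbitrary choice for the sketch.
        System.exit(runWithHeartbeat(30_000, args));
    }
}
```

The build step would then call the wrapper with the original command as its arguments, so the console sees output even while the command itself is silent.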

When executing a free-style job on a Windows agent, this error appears:

{noformat}
//wc/psp/Branches/Release/PSP/content/EaTrax/MichaelFranti.wav#1 - added as P:\Release\PSP\content\EaTrax\MichaelFranti.wav
//wc/psp/Branches/Release/PSP/content/EaTrax/MiikeSnow.wav#1 - added as P:\Release\PSPFATAL: command execution failed
hudson.util.IOException2: Failed to join the process
at hudson.Proc$RemoteProc.join(Proc.java:311)
at hudson.Launcher$ProcStarter.join(Launcher.java:275)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:83)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:577)
at hudson.model.Build$RunnerImpl.build(Build.java:165)
at hudson.model.Build$RunnerImpl.doRun(Build.java:133)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:417)
at hudson.model.Run.run(Run.java:1176)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:123)
Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.io.EOFException
at hudson.remoting.Request$1.get(Request.java:218)
at hudson.remoting.Request$1.get(Request.java:172)
at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
at hudson.Proc$RemoteProc.join(Proc.java:303)
... 12 more
Caused by: hudson.remoting.RequestAbortedException: java.io.EOFException
at hudson.remoting.Request.abort(Request.java:257)
at hudson.remoting.Channel.terminate(Channel.java:594)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:872)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:852)
FATAL: Unable to delete script file C:\DOCUME~1\pcfarm08\LOCALS~1\Temp\2\hudson2101488692771117152.bat
hudson.util.IOException2: remote file operation failed
at hudson.FilePath.act(FilePath.java:677)
at hudson.FilePath.act(FilePath.java:665)
at hudson.FilePath.delete(FilePath.java:922)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:93)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:577)
at hudson.model.Build$RunnerImpl.build(Build.java:165)
at hudson.model.Build$RunnerImpl.doRun(Build.java:133)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:417)
at hudson.model.Run.run(Run.java:1176)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:123)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:408)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:547)
at hudson.FilePath.act(FilePath.java:672)
... 13 more
[locks-and-latches] Releasing all the locks
[locks-and-latches] All the locks released
FATAL: channel is already closed
hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:408)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:547)
at hudson.Launcher$RemoteLauncher.kill(Launcher.java:734)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:422)
at hudson.model.Run.run(Run.java:1176)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:123)
{noformat}

The job executes a batch command, and it took 1h 17min to fail in Hudson 1.336.

More recent versions of Hudson, like the one Mike_L used, show the root cause of the "hudson.remoting.ChannelClosedException: channel is already closed". If you are experiencing this problem, please report your stack trace so that we can collect more data.

So far the error seems to indicate that the network connection between the master and the slave is terminated abruptly.

I guess I need to first improve Hudson so that it captures the failure from the slave process before we can understand the cause of the problem.
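
As a rough illustration of why these traces bottom out in java.io.EOFException: ObjectInputStream.readObject() throws EOFException when the underlying stream ends where the next object should begin, which is what a reader sees if the peer goes away without a clean shutdown. The sketch below is not Hudson's remoting code. The class AbruptEofSketch and the methods readLoop and rootCause are hypothetical names, and an in-memory byte array stands in for the network connection. The wrapper message is copied from the traces in this issue.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class AbruptEofSketch {

    /** Follows getCause() links to the innermost exception, like reading the last "Caused by:" line. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    /** Hypothetical stand-in for a reader loop: keeps reading objects until the stream fails. */
    static void readLoop(ObjectInputStream in) throws IOException {
        try {
            while (true) {
                in.readObject();
            }
        } catch (EOFException e) {
            // The stream ended where another object was expected; wrap it so the cause is kept.
            throw new IOException("Unexpected termination of the channel", e);
        } catch (ClassNotFoundException e) {
            throw new IOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Serialize exactly one object, then let the stream simply end.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject("one command");
        }

        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            readLoop(in);
        } catch (IOException e) {
            System.out.println(e.getMessage());
            System.out.println(rootCause(e).getClass().getName());
        }
    }
}
```

Running main prints the wrapper message followed by java.io.EOFException. The EOFException only says that the stream ended; it does not say why the other side stopped sending, which is why the slave-side failure still needs to be captured.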

Kohsuke Kawaguchi
added a comment - 01/Oct/10 2:41 PM

Regarding the request for more info: the issue started plaguing us about 2 weeks ago. We are running 1.375 on a 32-bit Windows 2003 Server, with an additional slave on the same server and one on a 64-bit Windows 2008 server. Tomcat 6.0.20, Java 1.6.0_06. Java 1.6.0_20 on the 64-bit 2008 server. The slave.jar files are in sync with the server version.

I have the master and 3 slaves running as services on Windows. This particular issue strikes every 2-4 days, usually in the middle of actual build work, while logs are actively being written. Recovery requires restarting the master and all slaves. The builds we do are large C and C++ builds, and they take a long time, between 3 and 5 hours.

The instability is eroding my dev team's confidence in Hudson's usefulness. Hopefully this issue can be addressed soon.

FATAL: command execution failed
hudson.util.IOException2: Failed to join the process
at hudson.Proc$RemoteProc.join(Proc.java:355)
at hudson.Launcher$ProcStarter.join(Launcher.java:280)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:82)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601)
at hudson.model.Build$RunnerImpl.build(Build.java:174)
at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1273)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:129)
Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request$1.get(Request.java:218)
at hudson.remoting.Request$1.get(Request.java:172)
at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
at hudson.Proc$RemoteProc.join(Proc.java:347)
... 12 more
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:257)
at hudson.remoting.Channel.terminate(Channel.java:608)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:899)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:881)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:875)
FATAL: Unable to delete script file C:\Users\lgbuild\AppData\Local\Temp\hudson8508007568280050491.bat
hudson.util.IOException2: remote file operation failed: C:\Users\lgbuild\AppData\Local\Temp\hudson8508007568280050491.bat at hudson.remoting.Channel@367b19:Schwartz
at hudson.FilePath.act(FilePath.java:749)
at hudson.FilePath.act(FilePath.java:735)
at hudson.FilePath.delete(FilePath.java:990)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601)
at hudson.model.Build$RunnerImpl.build(Build.java:174)
at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1273)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:129)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:414)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:557)
at hudson.FilePath.act(FilePath.java:742)
... 13 more
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:881)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:875)
FATAL: channel is already closed
hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:414)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:557)
at hudson.Launcher$RemoteLauncher.kill(Launcher.java:744)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:421)
at hudson.model.Run.run(Run.java:1273)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:129)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:881)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:875)

jed624
added a comment - 12/Oct/10 11:56 AM - edited

When executing a free-style job on a Windows agent, this error appears:

2> WINVER not defined. Defaulting to 0x0502 (Windows Server 2003)
11>CDllCall.cpp
FATAL: command execution failed
hudson.util.IOException2: Failed to join the process
at hudson.Proc$RemoteProc.join(Proc.java:355)
at hudson.Launcher$ProcStarter.join(Launcher.java:280)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:82)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601)
at hudson.model.Build$RunnerImpl.build(Build.java:174)
at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1280)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:139)
Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
at hudson.remoting.Request$1.get(Request.java:218)
at hudson.remoting.Request$1.get(Request.java:172)
at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
at hudson.Proc$RemoteProc.join(Proc.java:347)
... 12 more
Caused by: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
at hudson.remoting.Request.abort(Request.java:257)
at hudson.remoting.Channel.terminate(Channel.java:681)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:972)
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:185)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2265)
at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2558)
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2568)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1314)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:368)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:948)
FATAL: Unable to delete script file C:\WINDOWS\TEMP\hudson7211599763862507406.bat
hudson.util.IOException2: remote file operation failed: C:\WINDOWS\TEMP\hudson7211599763862507406.bat at hudson.remoting.Channel@60ac71:FXBuild
at hudson.FilePath.act(FilePath.java:749)
at hudson.FilePath.act(FilePath.java:735)
at hudson.FilePath.delete(FilePath.java:990)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601)
at hudson.model.Build$RunnerImpl.build(Build.java:174)
at hudson.model.Build$RunnerImpl.doRun(Build.java:138)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1280)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:139)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:467)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:630)
at hudson.FilePath.act(FilePath.java:742)
... 13 more
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:185)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2265)
at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2558)
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2568)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1314)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:368)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:948)
FATAL: channel is already closed
hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:467)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:630)
at hudson.Launcher$RemoteLauncher.kill(Launcher.java:744)
at hudson.plugins.cygpath.CygpathLauncherDecorator$1.kill(CygpathLauncherDecorator.java:76)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:421)
at hudson.model.Run.run(Run.java:1280)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:139)
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:185)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2265)
at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2558)
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2568)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1314)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:368)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:948)

tapiomtr
added a comment - 19/Oct/10 9:59 PM

FATAL: command execution failed
hudson.util.IOException2: Failed to join the process
at hudson.Proc$RemoteProc.join(Proc.java:355)
at hudson.Launcher$ProcStarter.join(Launcher.java:280)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:82)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.plugins.m2extrasteps.M2ExtraStepsWrapper.executeBuildSteps(M2ExtraStepsWrapper.java:166)
at hudson.plugins.m2extrasteps.M2ExtraStepsWrapper.access$200(M2ExtraStepsWrapper.java:43)
at hudson.plugins.m2extrasteps.M2ExtraStepsWrapper$1.tearDown(M2ExtraStepsWrapper.java:137)
at hudson.maven.MavenModuleSetBuild$RunnerImpl.doRun(MavenModuleSetBuild.java:494)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1280)
at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:293)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:139)
Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request$1.get(Request.java:218)
at hudson.remoting.Request$1.get(Request.java:172)
at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
at hudson.Proc$RemoteProc.join(Proc.java:347)
... 12 more
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:257)
at hudson.remoting.Channel.terminate(Channel.java:681)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:972)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:954)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:948)
FATAL: Unable to delete script file /tmp/hudson1272208775265435208.sh
hudson.util.IOException2: remote file operation failed: /tmp/hudson1272208775265435208.sh at hudson.remoting.Channel@35267641:vmlnxbld1buch-axhudson-x64
at hudson.FilePath.act(FilePath.java:749)
at hudson.FilePath.act(FilePath.java:735)
at hudson.FilePath.delete(FilePath.java:990)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.plugins.m2extrasteps.M2ExtraStepsWrapper.executeBuildSteps(M2ExtraStepsWrapper.java:166)
at hudson.plugins.m2extrasteps.M2ExtraStepsWrapper.access$200(M2ExtraStepsWrapper.java:43)
at hudson.plugins.m2extrasteps.M2ExtraStepsWrapper$1.tearDown(M2ExtraStepsWrapper.java:137)
at hudson.maven.MavenModuleSetBuild$RunnerImpl.doRun(MavenModuleSetBuild.java:494)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1280)
at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:293)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:139)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:467)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:630)
at hudson.FilePath.act(FilePath.java:742)
... 13 more
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:954)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:948)
FATAL: channel is already closed
hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:467)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:630)
at hudson.Launcher$RemoteLauncher.kill(Launcher.java:744)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:421)
at hudson.model.Run.run(Run.java:1280)
at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:293)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:139)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:954)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2554)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:948)

This does not happen every time and seems random so far, but it is highly annoying and sometimes blocks releases.

dogfood
added a comment - 20/Jan/11 12:46 PM
Integrated in hudson_main_trunk #450
JENKINS-5073 Hudson was failing to record the connection termination problem in slave logs.
JENKINS-5073 if the build fails because the node went offline, point the user to the slave log where the details are.
JENKINS-5073 point the user to the slave log if the build failed because the slave went offline during the build.
noting these changes that are related to JENKINS-5073
Kohsuke Kawaguchi :
Files :
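
As a rough illustration of the behaviour described in the commit notes above (recognising that a build failed because the connection to the slave was lost, and pointing the user to the slave log), here is a minimal sketch using only JDK exception types. It is not the actual Hudson change: the class name ConnectionLossHint, both method names, and the wording of the hint are hypothetical. It only assumes what the stack traces in this issue show, namely that the failure bottoms out in an EOFException or a SocketException.

```java
import java.io.EOFException;
import java.io.IOException;
import java.net.SocketException;

// Hypothetical helper, not part of Hudson/Jenkins.
final class ConnectionLossHint {

    // Walks the cause chain (bounded, in case of a cyclic chain) and reports
    // whether any link is an EOFException or SocketException, the two root
    // causes seen in the traces in this issue.
    static boolean looksLikeConnectionLoss(Throwable t) {
        int depth = 0;
        for (Throwable c = t; c != null && depth < 32; c = c.getCause(), depth++) {
            if (c instanceof EOFException || c instanceof SocketException) {
                return true;
            }
        }
        return false;
    }

    // Builds a user-facing message; the wording is an assumption for illustration.
    static String hintFor(Throwable t, String nodeName) {
        if (looksLikeConnectionLoss(t)) {
            return "Connection to " + nodeName + " was lost; see the slave log for details.";
        }
        return "Build step failed: " + t.getMessage();
    }
}
```

For example, hintFor(new IOException("Unexpected termination of the channel", new EOFException()), "win7") would produce the slave-log hint, while an unrelated IOException would fall through to the generic message.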

lacostej
added a comment - 04/Feb/11 10:09 AM - edited
I was looking at the Proc class, and there seems to be a bug in the StdinCopyThread that could cause the associated LocalProc to close its stream too early.
See :

If the read returns 0, e.g. because of a buffering delay, then the while loop exits before we are done.
Same for StreamCopyThread. There may be other occurrences in the code. I will search for them.
I am not sure how these classes are used, but this could definitely affect a long build, which would explain the lack of reproducibility.
I created JENKINS-8686 to track and deal with this issue. We will see whether that solves this problem. Otherwise we might have to ask for some logs (there are a few well-placed log calls in that part of the code).
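
To make the suspected failure mode concrete, here is a minimal sketch of the two loop conditions side by side. It is not the Hudson source: CopyLoopSketch, StutteringStream, copyStoppingOnZero and copyUntilEof are hypothetical names chosen for illustration. Note that the InputStream contract says read(byte[]) blocks until at least one byte is available when the buffer is non-empty, so a 0 return should only come from a stream that does not follow the contract; StutteringStream simulates exactly that case.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration, not the actual Proc/StreamCopyThread code.
final class CopyLoopSketch {

    // Test stream that returns 0 on the first bulk read, then behaves normally.
    static final class StutteringStream extends InputStream {
        private final InputStream delegate;
        private boolean stuttered;

        StutteringStream(byte[] data) {
            this.delegate = new ByteArrayInputStream(data);
        }

        @Override
        public int read() throws IOException {
            return delegate.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            if (!stuttered) {
                stuttered = true;
                return 0; // simulated zero-length read
            }
            return delegate.read(b, off, len);
        }
    }

    // Suspect pattern: treats a zero-length read as end of stream.
    static long copyStoppingOnZero(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int len;
        while ((len = in.read(buf)) > 0) {
            out.write(buf, 0, len);
            total += len;
        }
        return total;
    }

    // Safer pattern: only -1 signals end of stream.
    static long copyUntilEof(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int len;
        while ((len = in.read(buf)) != -1) {
            out.write(buf, 0, len);
            total += len;
        }
        return total;
    }
}
```

With a StutteringStream as input, copyStoppingOnZero gives up before copying anything, while copyUntilEof copies the whole payload. Whether a real slave connection ever produces a zero-length read is the open question that JENKINS-8686 is meant to settle.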

lukas rytz
added a comment - 30/Mar/11 4:07 PM - edited
[Edit 2: We only had the issues with JNLP slaves. Instead, we have now installed Cygwin+OpenSSH on the Windows slave and connect using the SSH Slaves plugin. This works fine.]
We see similar problems.
Master: Ubuntu x64, 2.6.32-29-server
Slave: Windows 7 SP1 x64, VM running in VirtualBox 4.0.4, Java 1.6.0_24-x64 (also tried a 32-bit JVM, same problem).
[Edit: I disabled antivirus, Windows Defender, and UAC, and I run the Jenkins slave as administrator. None of these help.]
We also have a Windows XP 32-bit VM with a similar configuration, and there it never happens. On the Win7 slave it happens at every build.
Here are 5 builds; the stack traces are exactly identical: 1 2 3 4 5
It does not seem to happen at random points, but at certain points in the build.

FATAL: Unable to delete script file C:\Windows\TEMP\hudson182664989132439196.bat
hudson.util.IOException2: remote file operation failed: C:\Windows\TEMP\hudson182664989132439196.bat at hudson.remoting.Channel@42728b7:win7
at hudson.FilePath.act(FilePath.java:753)
at hudson.FilePath.act(FilePath.java:739)
at hudson.FilePath.delete(FilePath.java:994)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:649)
at hudson.model.Build$RunnerImpl.build(Build.java:177)
at hudson.model.Build$RunnerImpl.doRun(Build.java:139)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:423)
at hudson.model.Run.run(Run.java:1362)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:145)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:466)
at hudson.remoting.Request.call(Request.java:105)
at hudson.remoting.Channel.call(Channel.java:630)
at hudson.FilePath.act(FilePath.java:746)
... 13 more
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:985)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2553)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1296)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:979)
FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.call(Request.java:137)
at hudson.remoting.Channel.call(Channel.java:630)
at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
at $Proxy26.join(Unknown Source)
at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:850)
at hudson.Launcher$ProcStarter.join(Launcher.java:336)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:82)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:649)
at hudson.model.Build$RunnerImpl.build(Build.java:177)
at hudson.model.Build$RunnerImpl.doRun(Build.java:139)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:423)
at hudson.model.Run.run(Run.java:1362)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:145)
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:257)
at hudson.remoting.Channel.terminate(Channel.java:681)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:1003)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Channel$ReaderThread.run(Channel.java:985)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2553)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1296)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
at hudson.remoting.Channel$ReaderThread.run(Channel.java:979)