[Issue 4611] New - hudson.proc.RemoteProc.kill() does not work

jglick-2

[Issue 4611] New - hudson.proc.RemoteProc.kill() does not work

https://hudson.dev.java.net/issues/show_bug.cgi?id=4611
                 Issue #|4611
                 Summary|hudson.proc.RemoteProc.kill() does not work
               Component|hudson
                 Version|current
                Platform|All
              OS/Version|All
                     URL|
                  Status|NEW
       Status whiteboard|
                Keywords|
              Resolution|
              Issue type|DEFECT
                Priority|P3
            Subcomponent|core
             Assigned to|issues@hudson
             Reported by|jglick






------- Additional comments from [hidden email] Tue Oct  6 16:32:29 +0000 2009 -------
A few days ago we had a Mercurial server outage, with the result that all Hg
processes running at the time hung. (For technical reasons relating to network
config, the connections do not time out - they just hang forever.)

For those jobs running on master, the Hg polling was killed after an hour due to
issue #4461.

But for those jobs running on a slave,
SCMTrigger.DescriptorImpl.queue.inProgress shows them still active, even though
their polling log claims they were killed after an hour. A thread dump on master
confirms this:

"SCM polling for hudson.model.FreeStyleProject@164e3e2[apitest]" prio=10
tid=0xa0e0a400 nid=0x746e in Object.wait() [0xf77ff000..0xf77ff554]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:485)
        at hudson.remoting.Request$1.get(Request.java:185)
        - locked <0x69424868> (a hudson.remoting.UserRequest)
        at hudson.remoting.Request$1.get(Request.java:165)
        at hudson.remoting.FutureAdapter.get(FutureAdapter.java:55)
        at hudson.Proc$RemoteProc.join(Proc.java:290)
        at hudson.plugins.mercurial.MercurialSCM.joinWithTimeout(MercurialSCM.java:233)
        at hudson.plugins.mercurial.MercurialSCM.pollChanges(MercurialSCM.java:192)
        at hudson.model.AbstractProject.pollSCMChanges(AbstractProject.java:1032)
        at hudson.triggers.SCMTrigger$Runner.runPolling(SCMTrigger.java:317)
        at hudson.triggers.SCMTrigger$Runner.run(SCMTrigger.java:344)
        at hudson.util.SequentialExecutionQueue$QueueEntry.run(SequentialExecutionQueue.java:114)

It seems that even though proc.kill() was called in another thread, proc.join()
is still waiting.

Looking at the implementation, it is no wonder kill() does not work:
Request.callAsync's Future.cancel just returns false and does nothing!

Shouldn't it call channel.send(new Cancel(id)) or abort(...) or something like this?
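
To illustrate the failure mode described above, here is a minimal, self-contained
Java sketch. It is not the Hudson remoting code: PendingResponse, honorCancel and
waiterReleasedByCancel are hypothetical names invented for this illustration. It
contrasts a Future whose cancel() only returns false with one that records the
cancellation and wakes any thread blocked in get().

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CancelSketch {

    /** Hypothetical stand-in for a pending remote request (assumption, not hudson.remoting.Request). */
    static class PendingResponse implements Future<String> {
        private final boolean honorCancel;
        private String response;
        private boolean done;
        private boolean cancelled;

        PendingResponse(boolean honorCancel) {
            this.honorCancel = honorCancel;
        }

        @Override
        public synchronized boolean cancel(boolean mayInterruptIfRunning) {
            if (!honorCancel) {
                return false; // the behavior reported in the issue: nothing happens
            }
            if (done || cancelled) {
                return false;
            }
            cancelled = true;
            notifyAll(); // wake every thread blocked in get()
            return true;
        }

        synchronized void complete(String value) {
            if (!done && !cancelled) {
                response = value;
                done = true;
                notifyAll();
            }
        }

        @Override
        public synchronized boolean isCancelled() {
            return cancelled;
        }

        @Override
        public synchronized boolean isDone() {
            return done || cancelled;
        }

        @Override
        public synchronized String get() throws InterruptedException {
            while (!done && !cancelled) {
                wait();
            }
            if (cancelled) {
                throw new CancellationException();
            }
            return response;
        }

        @Override
        public synchronized String get(long timeout, TimeUnit unit)
                throws InterruptedException, TimeoutException {
            long deadline = System.nanoTime() + unit.toNanos(timeout);
            while (!done && !cancelled) {
                long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMs <= 0) {
                    throw new TimeoutException();
                }
                wait(remainingMs);
            }
            if (cancelled) {
                throw new CancellationException();
            }
            return response;
        }
    }

    /** Blocks a thread in get(), calls cancel(), and reports whether the waiter was released. */
    static boolean waiterReleasedByCancel(boolean honorCancel) throws InterruptedException {
        PendingResponse pending = new PendingResponse(honorCancel);
        Thread waiter = new Thread(() -> {
            try {
                pending.get();
            } catch (CancellationException | InterruptedException e) {
                // released from the wait
            }
        });
        waiter.setDaemon(true);
        waiter.start();
        Thread.sleep(100);          // give the waiter time to block
        pending.cancel(true);
        waiter.join(500);           // wait briefly for the waiter to finish
        boolean released = !waiter.isAlive();
        waiter.interrupt();         // clean up the thread if it is still stuck
        return released;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("no-op cancel releases waiter:   " + waiterReleasedByCancel(false));
        System.out.println("working cancel releases waiter: " + waiterReleasedByCancel(true));
    }
}
```

The sketch only covers the local side. A real fix would presumably also have to
tell the remote end to stop the work, which is not attempted here.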

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

jglick-2

[Issue 4611] hudson.proc.RemoteProc.kill() does not work

https://hudson.dev.java.net/issues/show_bug.cgi?id=4611



User jglick changed the following:

                What    |Old value                 |New value
================================================================================
OtherIssuesDependingOnThis|                          |4461
--------------------------------------------------------------------------------




------- Additional comments from [hidden email] Tue Oct  6 16:33:34 +0000 2009 -------
.


jglick-2

[Issue 4611] hudson.proc.RemoteProc.kill() does not work

In reply to this post by jglick-2
https://hudson.dev.java.net/issues/show_bug.cgi?id=4611






------- Additional comments from [hidden email] Tue Oct  6 16:39:17 +0000 2009 -------
Workaround: identify the hung jobs, then

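// Groovy; replace job-1|job-2|... with the names of the hung jobs.
for (thread in Thread.currentThread().threadGroup.threads) {
  // Skip empty (null) slots in the thread array.
  if (thread != null && thread.name.matches('SCM polling for .*(job-1|job-2|...).*')) {
    thread.interrupt()
  }
}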
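
For reference, the same idea as a self-contained Java sketch. It is an
illustration under assumptions, not code from Hudson: InterruptByName,
interruptMatching and demo are hypothetical names, and the job names in the
pattern are placeholders. It uses the public Thread.getAllStackTraces() call to
enumerate live threads and shows that interrupting a thread blocked in
Object.wait() releases it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class InterruptByName {

    /** Interrupts every live thread whose name matches the pattern; returns the matched names. */
    static List<String> interruptMatching(Pattern namePattern) {
        List<String> interrupted = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (namePattern.matcher(t.getName()).matches()) {
                t.interrupt();
                interrupted.add(t.getName());
            }
        }
        return interrupted;
    }

    /** Starts a thread stuck in Object.wait() and reports whether the interrupt freed it. */
    static boolean demo() throws InterruptedException {
        Object lock = new Object();
        Thread stuck = new Thread(() -> {
            synchronized (lock) {
                try {
                    while (true) {
                        lock.wait(); // nobody ever notifies: simulates the hung join()
                    }
                } catch (InterruptedException e) {
                    // released by interrupt()
                }
            }
        }, "SCM polling for demo[job-1]"); // placeholder thread name
        stuck.setDaemon(true);
        stuck.start();
        Thread.sleep(100); // give the thread time to block
        interruptMatching(Pattern.compile("SCM polling for .*(job-1|job-2).*"));
        stuck.join(2000);
        return !stuck.isAlive();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("stuck thread released: " + demo());
    }
}
```

Interrupting only unblocks the waiting thread in the local JVM. It presumably
does nothing about the hg process still running on the slave.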


jglick-2

[Issue 4611] hudson.proc.RemoteProc.kill() does not work

In reply to this post by jglick-2
https://hudson.dev.java.net/issues/show_bug.cgi?id=4611



User jglick changed the following:

                What    |Old value                 |New value
================================================================================
                Priority|P3                        |P2
--------------------------------------------------------------------------------




------- Additional comments from [hidden email] Wed Oct 28 11:28:23 +0000 2009 -------
This is turning out to be a major problem requiring at least weekly intervention
to log in to the slave and run 'killall hg'. Worse than the immediate impact of
the bug is the fact that it is not obvious from the Hudson GUI that anything is
wrong: the jobs are all blue, and you have to notice that they have not run in
days.


scm_issue_link

[Issue 4611] hudson.proc.RemoteProc.kill() does not work

In reply to this post by jglick-2
https://hudson.dev.java.net/issues/show_bug.cgi?id=4611



User scm_issue_link changed the following:

                What    |Old value                 |New value
================================================================================
                  Status|NEW                       |RESOLVED
--------------------------------------------------------------------------------
              Resolution|                          |FIXED
--------------------------------------------------------------------------------




------- Additional comments from [hidden email] Fri Nov  6 18:34:35 +0000 2009 -------
Code changed in hudson
User: jglick
Path:
 trunk/hudson/main/core/src/main/java/hudson/Proc.java
 trunk/hudson/main/core/src/test/java/hudson/LauncherTest.java
 trunk/hudson/main/remoting/src/main/java/hudson/remoting/Request.java
 trunk/hudson/main/remoting/src/test/java/hudson/remoting/SimpleTest.java
http://fisheye4.cenqua.com/changelog/hudson/?cs=23526
Log:
[FIXED HUDSON-4611] hudson.proc.RemoteProc.kill() was a no-op.



scm_issue_link

[Issue 4611] hudson.proc.RemoteProc.kill() does not work

In reply to this post by jglick-2
https://hudson.dev.java.net/issues/show_bug.cgi?id=4611






------- Additional comments from [hidden email] Fri Nov  6 18:36:09 +0000 2009 -------
Code changed in hudson
User: jglick
Path:
 trunk/www/changelog.html
http://fisheye4.cenqua.com/changelog/hudson/?cs=23527
Log:
[HUDSON-4611] Noting.

