From: Eric PAIRE
Date: Wed, 08 Dec 1999 11:14:13 +0100
Subject: SIGCONT misbehaviour in Linux

Hi Linux gurus,

Michael Snyder is currently integrating my linuxthreads debugging support
inside the source tree of GDB at Cygnus, and he notified what I think is a
generic kernel bug in the signal handling:

When a process blocked in the kernel receives a stopping signal (POSIX says
SIGSTOP, SIGTSTP, SIGTTIN and SIGTTOU), then the process stops, and this is
correctly implemented by Linux. *BUT*, when such a process receives a SIGCONT,
then it must continue, whatever signal handling is configured in the process.

The specific problem here is that, if the process is blocked in
sys_nanosleep(), then receiving a SIGSTOP will make it exit from
sys_nanosleep() and enter into TASK_STOPPED state in do_signal().
When it will be awakened via a SIGCONT, then it will exit immediately
from the kernel, whatever time it remains to sleep, even if no signal
handler is attached to SIGCONT, which is not the correct POSIX semantics
(It should only return if there is a signal handler attached to SIGCONT).
Notice also that the remaining time does not take into account the time
during which the process has been stopped.

The general problem here is that the kernel seems to *ALWAYS* return EINTR
when signals have been sent during system calls, *EVEN* when there is no
signal handler attached to the signal, which seems to be in contradiction
with the generic POSIX semantics of EINTR. I have added the glibc-bug
mailing list because I don't know whether the POSIX behaviour should be
handled correctly in the libc or in the kernel.

BTW, a funny user test to show this misbehaviour is to type the following
commands in bash:

    sleep 1000
    ^Z
    fg

and the process running sleep 1000 immediately returns on Linux. I tested it
on other systems and it works correctly (the sleep continues).

Best regards,
-Eric

P.S.
The original problem of Michael was with PTRACE_ATTACH, whose side effect is
to make a process executing nanosleep() immediately exit from nanosleep()
when attached by GDB, which makes gdb intrusive in the process behaviour....

+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
Eric PAIRE
Web  : http://www.ri.silicomp.com/~paire  | Group SILICOMP - Research Institute
Email: eric.paire@ri.silicomp.com         | 2, avenue de Vignate
Phone: +33 (0) 476 63 48 71               | F-38610 Gieres
Fax  : +33 (0) 476 51 05 32               | FRANCE

Please read the FAQ at http://www.tux.org/lkml/

------------------------------

From: Simon Kirby
Date: Wed, 8 Dec 1999 10:02:19 -0500
Subject: Re: SIGCONT misbehaviour in Linux

On Wed, Dec 08, 1999 at 11:14:13AM +0100, Eric PAIRE wrote:

> Hi Linux gurus,
>
> Michael Snyder is currently integrating my linuxthreads debugging support
> inside the source tree of GDB at Cygnus, and he notified what I think is a
> generic kernel bug in the signal handling:
>
> When a process blocked in the kernel receives a stopping signal (POSIX says
> SIGSTOP, SIGTSTP, SIGTTIN and SIGTTOU), then the process stops, and this is
> correctly implemented by Linux. *BUT*, when such a process receives a SIGCONT,
> then it must continue, whatever signal handling is configured in the process.
>
> The specific problem here is that, if the process is blocked in
> sys_nanosleep(), then receiving a SIGSTOP will make it exit from
> sys_nanosleep() and enter into TASK_STOPPED state in do_signal().
> When it will be awakened via a SIGCONT, then it will exit immediately
> from the kernel, whatever time it remains to sleep, even if no signal
> handler is attached to SIGCONT, which is not the correct POSIX semantics
> (It should only return if there is a signal handler attached to SIGCONT).
> Notice also that the remaining time does not take into account the time
> during which the process has been stopped.
>
> The general problem here is that the kernel seems to *ALWAYS* return EINTR
> when signals have been sent during system calls, *EVEN* when there is no
> signal handler attached to the signal, which seems to be in contradiction
> with the generic POSIX semantics of EINTR. I have added the glibc-bug
> mailing list because I don't know whether the POSIX behaviour should be
> handled correctly in the libc or in the kernel.
>
> BTW, a funny user test to show this misbehaviour is to type the following
> commands in bash:
>
>     sleep 1000
>     ^Z
>     fg
>
> and the process running sleep 1000 immediately returns on Linux. I tested it
> on other systems and it works correctly (the sleep continues).

Hmm...This works properly on libc5 systems, btw. (glibc2.0 and glibc2.1
use nanosleep(), libc5 uses alarm() and sigsuspend()).

Simon-

[  Stormix Technologies Inc.  ][  NetNation Communications Inc. ]
[       sim@stormix.com       ][       sim@netnation.com        ]
[ Opinions expressed are not necessarily those of my employers. ]

------------------------------

From: Eric Paire
Date: Wed, 08 Dec 1999 16:49:48 +0100
Subject: Re: SIGCONT misbehaviour in Linux

> On Wed, Dec 08, 1999 at 11:14:13AM +0100, Eric PAIRE wrote:
>
> > Hi Linux gurus,
> >
> > Michael Snyder is currently integrating my linuxthreads debugging support
> > inside the source tree of GDB at Cygnus, and he notified what I think is a
> > generic kernel bug in the signal handling:
> >
> > When a process blocked in the kernel receives a stopping signal (POSIX says
> > SIGSTOP, SIGTSTP, SIGTTIN and SIGTTOU), then the process stops, and this is
> > correctly implemented by Linux. *BUT*, when such a process receives a SIGCONT,
> > then it must continue, whatever signal handling is configured in the process.
> >
> > The specific problem here is that, if the process is blocked in
> > sys_nanosleep(), then receiving a SIGSTOP will make it exit from
> > sys_nanosleep() and enter into TASK_STOPPED state in do_signal().
> > When it will be awakened via a SIGCONT, then it will exit immediately
> > from the kernel, whatever time it remains to sleep, even if no signal
> > handler is attached to SIGCONT, which is not the correct POSIX semantics
> > (It should only return if there is a signal handler attached to SIGCONT).
> > Notice also that the remaining time does not take into account the time
> > during which the process has been stopped.
> >
> > The general problem here is that the kernel seems to *ALWAYS* return EINTR
> > when signals have been sent during system calls, *EVEN* when there is no
> > signal handler attached to the signal, which seems to be in contradiction
> > with the generic POSIX semantics of EINTR. I have added the glibc-bug
> > mailing list because I don't know whether the POSIX behaviour should be
> > handled correctly in the libc or in the kernel.
> >
> > BTW, a funny user test to show this misbehaviour is to type the following
> > commands in bash:
> >
> >     sleep 1000
> >     ^Z
> >     fg
> >
> > and the process running sleep 1000 immediately returns on Linux. I tested it
> > on other systems and it works correctly (the sleep continues).
>
> Hmm...This works properly on libc5 systems, btw. (glibc2.0 and glibc2.1
> use nanosleep(), libc5 uses alarm() and sigsuspend()).
>
This works for the special case of sleep(), which is the example I took,
just because the libc5 sleep implementation looks at the return value;
but what about the other blocking system calls (like nanosleep)? Do they
properly check, on an EINTR errno, that the received SIGCONT had a signal
handler installed at the time it was delivered, and automagically restart
the system call that should not have been interrupted?
This is the reason why my guess is that this feature should be fixed in
the kernel (if Linux is to be POSIX-compliant).

-Eric

+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
Eric PAIRE
Web  : http://www.ri.silicomp.com/~paire  | Group SILICOMP - Research Institute
Email: eric.paire@ri.silicomp.com         | 2, avenue de Vignate
Phone: +33 (0) 476 63 48 71               | F-38610 Gieres
Fax  : +33 (0) 476 51 05 32               | FRANCE

------------------------------

From: hpa@transmeta.com (H. Peter Anvin)
Date: 8 Dec 1999 12:22:20 -0800
Subject: Re: SIGCONT misbehaviour in Linux

Followup to: <19991208100219.A18129@stormix.com>
By author: Simon Kirby
In newsgroup: linux.dev.kernel
> >
> > and the process running sleep 1000 immediately returns on Linux. I tested it
> > on other systems and it works correctly (the sleep continues).
>
> Hmm...This works properly on libc5 systems, btw. (glibc2.0 and glibc2.1
> use nanosleep(), libc5 uses alarm() and sigsuspend()).
>

It really could be argued what is the right behaviour here. When a
system call is interrupted by the signal, the normal thing is to
return EINTR.

    -hpa
-- 
at work, in private!
"Unix gives you enough rope to shoot yourself in the foot."

------------------------------

From: Ulrich Drepper
Date: 08 Dec 1999 13:01:17 -0800
Subject: Re: SIGCONT misbehaviour in Linux

hpa@transmeta.com (H. Peter Anvin) writes:

> > Hmm...This works properly on libc5 systems, btw. (glibc2.0 and glibc2.1
> > use nanosleep(), libc5 uses alarm() and sigsuspend()).
> >
>
> It really could be argued what is the right behaviour here. When a
> system call is interrupted by the signal, the normal thing is to
> return EINTR.

Right. The problem is that the ptrace() call to continue the process
(which implicitly sends a SIGCONT) also wakes up the process. We have a
test program which, if you'd run it normally, would not finish in aeons.
If you run it under gdb with all the ptrace() calls to stop and continue
all the threads, it finishes. This change in behaviour is not wanted nor
can it be avoided by gdb without a kernel change.

-- 
---------------.      drepper at gnu.org  ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Cygnus Solutions `--' drepper at cygnus.com   `------------------------

------------------------------

From: "Richard B. Johnson"
Date: Wed, 8 Dec 1999 16:41:37 -0500 (EST)
Subject: Re: SIGCONT misbehaviour in Linux

On 8 Dec 1999, H. Peter Anvin wrote:

> Followup to: <19991208100219.A18129@stormix.com>
> By author: Simon Kirby
> In newsgroup: linux.dev.kernel
> > >
> > > and the process running sleep 1000 immediately returns on Linux. I tested it
> > > on other systems and it works correctly (the sleep continues).
> >
> > Hmm...This works properly on libc5 systems, btw. (glibc2.0 and glibc2.1
> > use nanosleep(), libc5 uses alarm() and sigsuspend()).
> >
>
> It really could be argued what is the right behaviour here. When a
> system call is interrupted by the signal, the normal thing is to
> return EINTR.
>
>     -hpa
>

It becomes a definition of BSD_SIGNALS. If I remember correctly, they,
by default, use SA_RESTART as a flag. This way, sleep() and other system
calls automatically restart after a signal.

At the kernel level, any signal delivered to a process causes a
co-pending system call to return to the caller with -EINTR. It
is the 'C' runtime library that decides, based upon this flag,
if the system call should be restarted or if -1 should be returned
to the caller with errno set to EINTR.

Cheers,
Dick Johnson

Penguin : Linux version 2.3.13 on an i686 machine (400.59 BogoMips).
Warning : The end of the world as we know it requires a new calendar.
Seconds : 2013503 (until Y2K)

------------------------------

From: Andrea Arcangeli
Date: Wed, 8 Dec 1999 23:26:04 +0100 (CET)
Subject: Re: SIGCONT misbehaviour in Linux

On Wed, 8 Dec 1999, Richard B. Johnson wrote:

>co-pending system call to return to the caller with -EINTR. It
>is the 'C' runtime library that decides, based upon this flag,
>if the system call should be restarted or if -1 should be returned
>to the caller with errno set to EINTR.

glibc could also return to run the syscall without waiting again from the
beginning by looking at the 'struct timespec *rem'. If there wouldn't be
the `rem` parameter in nanosleep, glibc couldn't wrap the -EINTR
transparently. But there is.

NOTE: I can as well fix the kernel for this, but I agree with Peter that
returning -EINTR looks like the right thing to do. (I don't know which is
the official semantic for the syscall though)

Andrea

------------------------------

From: "Richard B. Johnson"
Date: Wed, 8 Dec 1999 17:32:32 -0500 (EST)
Subject: Re: SIGCONT misbehaviour in Linux

> By author: Simon Kirby
> In newsgroup: linux.dev.kernel
>
> and the process running sleep 1000 immediately returns on Linux.
> I tested it on other systems and it works correctly (the sleep
> continues).

This shows the operation of the SA_RESTART flag. If you don't want
the system call to return to the caller with -1 and EINTR, you have
to use this.

    #include <errno.h>
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* SIGALRM handler: just report that the alarm fired. */
    void foo(int unused)
    {
        (void)unused;
        puts("\7Alarm");
    }

    int main(int argc, char *argv[])
    {
        struct sigaction sa;
        char buf[1];
        int i;

        (void)argv;
        memset(&sa, 0x00, sizeof(sa));
        if (argc > 1)               /* any argument: restart the read() */
            sa.sa_flags = SA_RESTART;
        sa.sa_handler = foo;
        sigaction(SIGALRM, &sa, NULL);
        alarm(1);
        i = read(0, buf, 1);        /* blocks until input or SIGALRM */
        printf("%d, %s\n", i, strerror(errno));
        return 0;
    }

Cheers,
Dick Johnson

Penguin : Linux version 2.3.13 on an i686 machine (400.59 BogoMips).
Warning : The end of the world as we know it requires a new calendar.
Seconds : 2010448 (until Y2K)

------------------------------

From: "Richard B. Johnson"
Date: Wed, 8 Dec 1999 17:37:29 -0500 (EST)
Subject: Re: SIGCONT misbehaviour in Linux

On Wed, 8 Dec 1999, Andrea Arcangeli wrote:

> On Wed, 8 Dec 1999, Richard B. Johnson wrote:
>
> >co-pending system call to return to the caller with -EINTR. It
> >is the 'C' runtime library that decides, based upon this flag,
> >if the system call should be restarted or if -1 should be returned
> >to the caller with errno set to EINTR.
>
> glibc could also return to run the syscall without waiting again from the
> beginning by looking at the 'struct timespec *rem'. If there wouldn't be
> the `rem` parameter in nanosleep, glibc couldn't wrap the -EINTR
> transparently. But there is.
>
> NOTE: I can as well fix the kernel for this, but I agree with Peter that
> returning -EINTR looks like the right thing to do. (I don't know which is
> the official semantic for the syscall though)
>
> Andrea
>

I think the kernel provides the correct result. The caller either has
to use '_BSD_SIGNALS_' or use code like this:

    #include <errno.h>
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* SIGALRM handler: just report that the alarm fired. */
    void foo(int unused)
    {
        (void)unused;
        puts("\7Alarm");
    }

    int main(int argc, char *argv[])
    {
        struct sigaction sa;
        char buf[1];
        int i;

        (void)argv;
        memset(&sa, 0x00, sizeof(sa));
        if (argc > 1)               /* any argument: restart the read() */
            sa.sa_flags = SA_RESTART;
        sa.sa_handler = foo;
        sigaction(SIGALRM, &sa, NULL);
        alarm(1);
        i = read(0, buf, 1);        /* blocks until input or SIGALRM */
        printf("%d, %s\n", i, strerror(errno));
        return 0;
    }

Depending upon whether anything is on the command-line, the SA_RESTART
flag is set. This allows one to get both kinds of behavior with no
problems. I think the kernel code is correct.

Cheers,
Dick Johnson

Penguin : Linux version 2.3.13 on an i686 machine (400.59 BogoMips).
Warning : The end of the world as we know it requires a new calendar.
Seconds : 2010151 (until Y2K)

------------------------------

From: Ulrich Drepper
Date: 08 Dec 1999 14:41:31 -0800
Subject: Re: SIGCONT misbehaviour in Linux

Andrea Arcangeli writes:

> NOTE: I can as well fix the kernel for this, but I agree with Peter that
> returning -EINTR looks like the right thing to do. (I don't know which is
> the official semantic for the syscall though)

You don't understand the initial problem. This is that

    kill(SIGSTOP); ptrace(PTRACE_CONTINUE)

is interrupting syscalls as well. It is fine if signals in general
interrupt syscalls. But SIGSTOP & friends, undone by a ptrace() call
should not return since these kind of things happen because of
reasons outside the program. A user hitting ^Z or gdb stopping and
restarting a process. The behaviour of the program is changed
dramatically.

-- 
---------------.      drepper at gnu.org  ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Cygnus Solutions `--' drepper at cygnus.com   `------------------------

------------------------------

From: Andrea Arcangeli
Date: Thu, 9 Dec 1999 00:30:13 +0100 (CET)
Subject: Re: SIGCONT misbehaviour in Linux

On 8 Dec 1999, Ulrich Drepper wrote:

>You don't understand the initial problem. This is that

I am not even considering it now. I was considering what the kernel should
do after a:

    kill(SIGSTOP); kill(SIGCONT);

Richard was talking about what happens after a _signal_ and not after a
ptrace_continue. These are two different things and we can make them
behave in completely different ways inside the kernel. I don't think you
should compare SIGSTOP+SIGCONT with SIGSTOP+PTRACE_CONTINUE.

>reasons outside the program. A user hitting ^Z or gdb stopping and

I think we should distinguish between ^Z and gdb. The signal code is
full of ugly special cases exactly because they are different things
AFAIK.
Do you agree that ^Z is just correct returning -EINTR immediately at
SIGCONT time (aka `fg` time)?

Should we make PTRACE_CONTINUE to force nanosleep to continue (unlike the
SIGCONT case?)?

BTW, I am not sure if nanosleep is the only place that you may like
to change in this respect...

Andrea

------------------------------

From: Ulrich Drepper
Date: 08 Dec 1999 16:20:40 -0800
Subject: Re: SIGCONT misbehaviour in Linux

Andrea Arcangeli writes:

> Do you agree that ^Z is just correct returning -EINTR immediately at
> SIGCONT time (aka `fg` time)?

This is not what happens on other platforms. At least with my limited
testing I found that if you do on Solaris

    sleep 10
    ^Z
    fg

the process will continue to sleep.

> Should we make PTRACE_CONTINUE to force nanosleep to continue (unlike the
> SIGCONT case?)?

This is the least that has to happen.

> BTW, I am not sure if nanosleep is the only place that you may like
> to change in this respect...

No, it's not the only place (e.g., blocking read call). I think this
is a general change. Whenever the continue happens through
PTRACE_CONTINUE no EINTR should be generated.

-- 
---------------.      drepper at gnu.org  ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Cygnus Solutions `--' drepper at cygnus.com   `------------------------

------------------------------

From: Jason Gunthorpe
Date: Wed, 8 Dec 1999 17:24:44 -0700 (MST)
Subject: Re: SIGCONT misbehaviour in Linux

On 8 Dec 1999, Ulrich Drepper wrote:

> is interrupting syscalls as well. It is fine if signals in general
> interrupt syscalls. But SIGSTOP & friends, undone by a ptrace() call
> should not return since these kind of things happen because of

I've noticed some general dysfunction with Linux and attaching strace to
running processes.
It seems that strace cannot attach without affecting
the state of the process it is attaching to - I never had time to trace
the particular problem down, but from this it sounds like a plausible
explanation [strace causes a slow system call to return?].

Jason

------------------------------

From: Andrea Arcangeli
Date: Thu, 9 Dec 1999 01:38:32 +0100 (CET)
Subject: Re: SIGCONT misbehaviour in Linux

On Wed, 8 Dec 1999, Jason Gunthorpe wrote:

>I've noticed some general dysfunction with Linux and attaching strace to
>running processes. It seems that strace cannot attach without affecting

There are things that you should expect to break. For example if you
SIGSTOP your parent (that is always strace) while you are traced, then
you'll deadlock the first time you try to return to userspace
immediately after you sent the signal to strace. This is normal and it's
not trivial to fix it.

Andrea

------------------------------

From: Andrea Arcangeli
Date: Thu, 9 Dec 1999 01:32:34 +0100 (CET)
Subject: Re: SIGCONT misbehaviour in Linux

On 8 Dec 1999, Ulrich Drepper wrote:

>This is not what happens on other platforms. At least with my limited
>testing I found that if you do on Solaris
>
>    sleep 10
>    ^Z
>    fg
>
>the process will continue to sleep.

That's not enough to tell what the kernel is doing, maybe they have a bit
smarter sleep(1) program. `sleep` can be changed to run nanosleep again if
it received -EINTR and `req` is not null. You only have to pass as `req`
the `rem` that you got back from the previous nanosleep call.

>> Should we make PTRACE_CONTINUE to force nanosleep to continue (unlike the
>> SIGCONT case?)?
>
>This is the least that has to happen.

Ok.

>> BTW, I am not sure if nanosleep is the only place that you may like
>> to change in this respect...
>
>No, it's not the only place (e.g., blocking read call). I think this
>is a general change.
>Whenever the continue happens through
>PTRACE_CONTINUE no EINTR should be generated.

Ok.

Andrea

------------------------------

From: Ulrich Drepper
Date: 08 Dec 1999 16:47:31 -0800
Subject: Re: SIGCONT misbehaviour in Linux

Andrea Arcangeli writes:

> That's not enough to tell what the kernel is doing, maybe they have a bit
> smarter sleep(1) program. `sleep` can be changed to run nanosleep again if
> it received -EINTR and `req` is not null. You only have to pass as `req`
> the `rem` that you got back from the previous nanosleep call.

I ran it under truss, you can do the same. The syscall does not return.

-- 
---------------.      drepper at gnu.org  ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Cygnus Solutions `--' drepper at cygnus.com   `------------------------

------------------------------

From: sp@albion.engr.sgi.com (Simon Patience)
Date: Fri, 10 Dec 1999 07:29:25 -0800 (PST)
Subject: Re: SIGCONT misbehaviour in Linux

In article <82mvls$aqtrr@fido.engr.sgi.com>, you write:
|> On 8 Dec 1999, Ulrich Drepper wrote:
|> >This is not what happens on other platforms. At least with my limited
|> >testing I found that if you do on Solaris
|> >
|> >    sleep 10
|> >    ^Z
|> >    fg
|> >
|> >the process will continue to sleep.
|>
|> That's not enough to tell what the kernel is doing, maybe they have a bit
|> smarter sleep(1) program. `sleep` can be changed to run nanosleep again if
|> it received -EINTR and `req` is not null. You only have to pass as `req`
|> the `rem` that you got back from the previous nanosleep call.

No, the problem is that you shouldn't have interrupted it in the first
place. What is the point of interrupting a blocked process so that you
can block it?

Simon.
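[Editor's sketch, not part of the thread.] The user-space workaround that
Andrea describes in the text quoted above (call nanosleep() again, passing
the `rem` it returned as the new `req`) might look roughly like the C below.
The function name sleep_fully() is hypothetical. As Eric notes at the start
of the thread, the remaining time does not account for time spent stopped,
so this resumes the remaining sleep time rather than honouring a wall-clock
deadline.

```c
#define _POSIX_C_SOURCE 199309L   /* expose nanosleep() under strict -std= */
#include <errno.h>
#include <time.h>

/*
 * Hypothetical helper: sleep for the whole interval, resuming after each
 * interruption. Returns 0 once the interval has elapsed, or -1 with errno
 * set if nanosleep() fails for any reason other than EINTR.
 */
int sleep_fully(time_t seconds, long nanoseconds)
{
    struct timespec req = { .tv_sec = seconds, .tv_nsec = nanoseconds };
    struct timespec rem;

    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;      /* real error, e.g. EINVAL for a bad tv_nsec */
        req = rem;          /* interrupted: sleep only the time still left */
    }
    return 0;
}
```

A caller would write sleep_fully(1000, 0) where it previously called
sleep(1000). This only hides the interruption from the caller; it does not
address Simon's objection that the call should not have been interrupted.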
-- 
Simon Patience                  Phone: (650) 933-4644
Silicon Graphics, Inc           FAX:   (650) 962-8404
1600 Amphitheatre Pkwy          Email: sp@sgi.com
Mountain View, CA 94043-1389

------------------------------

From: sp@albion.engr.sgi.com (Simon Patience)
Date: Fri, 10 Dec 1999 07:42:52 -0800 (PST)
Subject: Re: SIGCONT misbehaviour in Linux

Eric Paire wrote:
|> My reading of the POSIX philosophy is that it is not legal for a
|> blocking system call to return EINTR when it has been interrupted by a
|> signal that does not have a signal handler attached to it at the time
|> the signal has been delivered to the process.

I agree. If you look at the description of EINTR, that is quite clear.

[snip]

|> IMHO, the SIGSTOP management (which is much simpler than the others since
|> the signal can never be ignored nor caught) should be taken into account
|> in the schedule loop, and not in the signal management on syscall return.

You really don't want job control to be implemented in the scheduler! It
should be implemented in the job control code on syscall/trap return. I
know there isn't such code at the moment but that is why you are seeing the
problems :-)

|> Notice that SIGTSTP, SIGTTIN and SIGTTOU should be handled at the same
|> place when the default signal behaviour is applied, as well as some other

Agreed.

|> special cases like ignored SIGCHLD,... Part of this code is currently in
|> machine-dependent do_signal() function.
|> The advantage of such modification is that a blocking system call will
|> remain in the actual schedule loop whenever SIGSTOP/SIGTSTP and SIGCONT
|> are sent to him (thus eliminating the EINTR problem, and being POSIX
|> compatible). The other advantage is that for a traced process, the SIGSTOP
|> handling may also be managed in the schedule loop, thus avoiding the side
|> effect of being awakened by PTRACE_ATTACH/PTRACE_CONTINUE.

I don't see this as an advantage.
Stop signals should stop the process from
advancing in user space. You don't need to do anything to them while they
are in the kernel.

Simon.

-- 
Simon Patience                  Phone: (650) 933-4644
Silicon Graphics, Inc           FAX:   (650) 962-8404
1600 Amphitheatre Pkwy          Email: sp@sgi.com
Mountain View, CA 94043-1389

------------------------------

From: Ulrich Drepper
Date: 10 Dec 1999 07:49:48 -0800
Subject: Re: SIGCONT misbehaviour in Linux

sp@albion.engr.sgi.com (Simon Patience) writes:

> |> >    sleep 10
> |> >    ^Z
> |> >    fg
> |> >
> |> >the process will continue to sleep.
> [...]
> No, the problem is that you shouldn't have interrupted it in the first
> place. What is the point of interrupting a blocked process so that you
> can block it?

I agree, that is what seems to happen. With one little addition: at
least on Solaris the syscall in the end nevertheless returns EINTR.
I'm not sure whether this is useful but it might be ok since

a) code today already has to handle EINTR

b) it provides the user more information (e.g., that she could find out
   that the process has possibly slept for a long time)

-- 
---------------.      drepper at gnu.org  ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Cygnus Solutions `--' drepper at cygnus.com   `------------------------

------------------------------

From: Simon Patience
Date: Fri, 10 Dec 99 08:18:50 -0800
Subject: Re: SIGCONT misbehaviour in Linux

Ulrich Drepper wrote:
|> Andrea Arcangeli writes:
|>
|> > Do you agree that ^Z is just correct returning -EINTR immediately at
|> > SIGCONT time (aka `fg` time)?
|>
|> This is not what happens on other platforms. At least with my limited
|> testing I found that if you do on Solaris
|>
|>     sleep 10
|>     ^Z
|>     fg
|>
|> the process will continue to sleep.

I am with Ulrich on this one.
The problem with job control signals is that
they are not really signals directed towards the process, they are signals
directed to the kernel to do something to the process. In the case of
STOP/CONT it is a request to not allow/allow the process to make forward
progress _in user space_. Sending the signal should not interrupt the
process at all.

Sending SIGSTOP should simply mark the process as not to return to user
space. If the process happens to be already blocked in the kernel waiting
for something, there is no reason to interrupt it so it can be blocked
somewhere else in the kernel. If it wakes up then it can complete the
system call successfully but then block before returning to user space.
SIGCONT would simply clear that flag and wake the process up if it was in
a job control stop.

Simon.

----
Simon Patience                  Phone: (650) 933-4644
Silicon Graphics, Inc           FAX:   (650) 962-8404
1600 Amphitheatre Pkwy          Email: sp@sgi.com
Mountain View, CA 94043-1389

------------------------------

From: Brian Pomerantz
Date: Fri, 10 Dec 1999 10:17:08 -0800
Subject: Re: SIGCONT misbehaviour in Linux

On Fri, Dec 10, 1999 at 07:29:25AM -0800, Simon Patience wrote:
> In article <82mvls$aqtrr@fido.engr.sgi.com>, you write:
> |> On 8 Dec 1999, Ulrich Drepper wrote:
> |> >This is not what happens on other platforms. At least with my limited
> |> >testing I found that if you do on Solaris
> |> >
> |> >    sleep 10
> |> >    ^Z
> |> >    fg
> |> >
> |> >the process will continue to sleep.
> |>
> |> That's not enough to tell what the kernel is doing, maybe they have a bit
> |> smarter sleep(1) program. `sleep` can be changed to run nanosleep again if
> |> it received -EINTR and `req` is not null. You only have to pass as `req`
> |> the `rem` that you got back from the previous nanosleep call.
>
> No, the problem is that you shouldn't have interrupted it in the first
> place. What is the point of interrupting a blocked process so that you
> can block it?
>

Isn't a process in a blocked state when it is waiting on I/O? I often
will hit ^Z for a long tarball extraction and run it in the
background. When I hit ^Z, the process could be waiting for I/O when
the signal comes through, thus a time when I want to interrupt a
blocked process to block it.

BAPper

linux-kernel-digest V1 #4897

------------------------------------------------------------------------

From: sp@albion.engr.sgi.com (Simon Patience)
Date: Fri, 10 Dec 1999 15:37:51 -0800 (PST)
Subject: Re: SIGCONT misbehaviour in Linux

Brian Pomerantz wrote:
|> On Fri, Dec 10, 1999 at 07:29:25AM -0800, Simon Patience wrote:
|> > In article <82mvls$aqtrr@fido.engr.sgi.com>, you write:
|> > |> On 8 Dec 1999, Ulrich Drepper wrote:
|> > |> >This is not what happens on other platforms. At least with my limited
|> > |> >testing I found that if you do on Solaris
|> > |> >
|> > |> >    sleep 10
|> > |> >    ^Z
|> > |> >    fg
|> > |> >
|> > |> >the process will continue to sleep.
|> > |>
|> > |> That's not enough to tell what the kernel is doing, maybe they have a bit
|> > |> smarter sleep(1) program. `sleep` can be changed to run nanosleep again if
|> > |> it received -EINTR and `req` is not null. You only have to pass as `req`
|> > |> the `rem` that you got back from the previous nanosleep call.
|> >
|> > No, the problem is that you shouldn't have interrupted it in the first
|> > place. What is the point of interrupting a blocked process so that you
|> > can block it?
|>
|> Isn't a process in a blocked state when it is waiting on I/O? I often
|> will hit ^Z for a long tarball extraction and run it in the
|> background. When I hit ^Z, the process could be waiting for I/O when
|> the signal comes through, thus a time when I want to interrupt a
|> blocked process to block it.

My point was that you could just leave it blocked waiting for the I/O.
If you type fg before the I/O completes then the process will just execute
and return as if nothing had happened when the I/O finally completes. If
the I/O completes first then the process winds its way back to the return
from the system call, notices that it is stopped and blocks itself there.
When the SIGCONT arrives, the process unblocks and returns from the system
call normally.

This is far better (less code, less complexity) than unblocking the I/O
(which may have partial results), getting the process to block somewhere
else and then trying to work out what on earth the right thing to do is
when SIGCONT arrives.

Simon.

------------------------------------------------------------------------

From: Simon Patience
Date: Wed, 15 Dec 99 08:36:52 -0800
Subject: Re: SIGCONT misbehaviour in Linux

Eric Paire wrote:
> > Eric Paire wrote:
> > |> IMHO, the SIGSTOP management (which is much simpler than the others since
> > |> the signal can never be ignored nor caught) should be taken into account
> > |> in the schedule loop, and not in the signal management on syscall return.
> >
> > You really don't want job control to be implemented in the scheduler! It
> > should be implemented in the job control code on syscall/trap return. I
> > know there isn't such code at the moment but that is why you are seeing the
> > problems :-)
> >
> No. my opinion was to locate only STOP/START management in the scheduling loop
> in order to avoid exiting it for being managed very lately (just before
> returning in user mode). So that if a process is stopped and then restarted
> without any signal handler, then it will remain blocked in the scheduler
> (which is transparent for functions that blocks a process).

Why are you trying to do this? I can't see the objection to code just before
return to user space that says, if I am stopped, wait for sigcont. As you
haven't interrupted the process you won't get EINTR.
You don't have to muck with the scheduler, which is always a tricky thing
to do, and everything works wonderfully.

> > I don't see this as an advantage. Stop signals should stop the process from
> > advancing in user space. You don't need to do anything to them while they
> > are in the kernel.
> >
> My point is that processes that are stopped and restarted, exit from the
> main scheduler loop, and prepare themselves for returning EINTR in user space
> (which is *not* POSIX-compliant, and make GDB very intrusive), since the

But you don't need to change the scheduler to fix that, just don't
interrupt the process when it gets the STOP signal in the first place.
Mark the process as stopped (having SIGSTOP in the pending set is good
enough) but don't wake the process up. Then in do_signal() you special
case STOP signals and wait on a semaphore or something (actually a
synchronization/condition variable would be good for this situation but
Linux doesn't have them). When someone sends SIGCONT, they clear the STOP
signal from the pending set (as today) and then signal the semaphore.
No interrupt, no scheduler hack, POSIX compliant, simple.

> current implementation of restart does not force them to return to the
> scheduler loop for those in INTERRUPTED state. The idea of managing stop
> restart without signal handlers within schedule() is to make a simple
> machine-independent modification to correct this signal mishandling.
>
> Any scheduler guru opinion ???
Simon

Simon Patience                  Phone: (650) 933-4644
Silicon Graphics, Inc           FAX:   (650) 962-8404
1600 Amphitheatre Pkwy          Email: sp@sgi.com
Mountain View, CA 94043-1389

------------------------------

From: Eric Paire
Date: Thu, 16 Dec 1999 14:04:56 +0100
Subject: Re: SIGCONT misbehaviour in Linux

> >
> Eric Paire wrote:
> > > Eric Paire wrote:
> > > |> IMHO, the SIGSTOP management (which is much simpler than the others since
> > > |> the signal can never be ignored nor caught) should be taken into account
> > > |> in the schedule loop, and not in the signal management on syscall return.
> > >
> > > You really don't want job control to be implemented in the scheduler! It
> > > should be implemented in the job control code on syscall/trap return. I
> > > know there isn't such code at the moment but that is why you are seeing the
> > > problems :-)
> > >
> > No. my opinion was to locate only STOP/START management in the scheduling loop
> > in order to avoid exiting it for being managed very lately (just before
> > returning in user mode). So that if a process is stopped and then restarted
> > without any signal handler, then it will remain blocked in the scheduler
> > (which is transparent for functions that blocks a process).
>
> Why are you trying to do this? I can't see the objection to code just before
> return to user space that says, if I am stopped, wait for sigcont. As you
> haven't interrupted the process you won't get EINTR. You don't have to
> muck with the scheduler, which is always a tricky thing to do, and
> everything works wonderfully.
>
> > > I don't see this as an advantage. Stop signals should stop the process from
> > > advancing in user space. You don't need to do anything to them while they
> > > are in the kernel.
> > >
> > My point is that processes that are stopped and restarted, exit from the
> > main scheduler loop, and prepare themselves for returning EINTR in user space
> > (which is *not* POSIX-compliant, and make GDB very intrusive), since the
>
> But you don't need to change the scheduler to fix that, just don't
> interrupt the process when it gets the STOP signal in the first place.
> Mark the process as stopped, having SIGSTOP in the pending set is good
> enough but don't wake the process up. Then in do_signal() you special
> case STOP signals and wait on a semaphore or something (actually a
> synchronization/condition variable would be good for this situation but
> Linux doesn't have them). When someone sends SIGCONT, they clear the STOP
> signal from the pending set (as today) and then signal the semaphore.
> No interrupt, no scheduler hack, POSIX compliant, simple.
>
I agree that your idea of moving the STOP/CONT management into the calling
process rather than the managed process seems good. But you will also have
to move the STOP/CONT management of a traced process (which is similar to
STOP/CONT) from do_signal() into ptrace(), so that ptrace does not modify
the process scheduling (gdb would be intrusive otherwise).

> > current implementation of restart does not force them to return to the
> > scheduler loop for those in INTERRUPTED state. The idea of managing stop
> > restart without signal handlers within schedule() is to make a simple
> > machine-independent modification to correct this signal mishandling.
> >
> > Any scheduler guru opinion ???
>
-Eric

+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
Eric PAIRE
Web  : http://www.ri.silicomp.com/~paire | Group SILICOMP - Research Institute
Email: eric.paire@ri.silicomp.com        | 2, avenue de Vignate
Phone: +33 (0) 476 63 48 71              | F-38610 Gieres
Fax  : +33 (0) 476 51 05 32              | FRANCE

------------------------------
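
For reference, the user-space workaround mentioned earlier in this thread
(re-running nanosleep() after EINTR, passing the returned `rem` as the next
`req`) might look roughly like the sketch below. The function name
sleep_fully() is hypothetical and is not taken from any sleep(1)
implementation; only nanosleep(), struct timespec and EINTR are standard.
It is a minimal illustration of the idea, not a fix for the kernel
behaviour under discussion.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <errno.h>
#include <time.h>

/*
 * sleep_fully() is a hypothetical helper name used for illustration.
 * It sleeps for the requested interval and restarts nanosleep() each
 * time the call is cut short with EINTR.
 * Returns 0 once the whole interval has elapsed, or -1 on any other
 * error (errno is left as set by nanosleep()).
 */
int sleep_fully(time_t sec, long nsec)
{
    struct timespec req = { .tv_sec = sec, .tv_nsec = nsec };
    struct timespec rem;

    /* nanosleep() rejects nanosecond values outside [0, 1e9). */
    assert(nsec >= 0 && nsec < 1000000000L);

    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        /* Interrupted: feed the unslept time back in as the new request. */
        req = rem;
    }
    return 0;
}
```

A caller would use sleep_fully(1000, 0) where it previously called
nanosleep() once. As noted at the start of the thread, the remaining time
does not account for the period during which the process was stopped, so a
loop like this only resumes the unslept part of the interval; it does not
make the sleep end at the originally intended wall-clock time.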