From: Eric PAIRE <eric.paire@ri.silicomp.fr>
Date: Wed, 08 Dec 1999 11:14:13 +0100
Subject: SIGCONT misbehaviour in Linux

Hi Linux gurus,

Michael Snyder is currently integrating my linuxthreads debugging support
inside the source tree of GDB at Cygnus, and he noticed what I think is a
generic kernel bug in the signal handling:

When a process blocked in the kernel receives a stopping signal (POSIX says
SIGSTOP, SIGTSTP, SIGTTIN and SIGTTOU), then the process stops, and this is
correctly implemented by Linux. *BUT*, when such a process receives a
SIGCONT, then it must continue, whatever signal handling is configured in
the process.

The specific problem here is that, if the process is blocked in
sys_nanosleep(), then receiving a SIGSTOP will make it exit from
sys_nanosleep() and enter into TASK_STOPPED state in do_signal().
When it is awakened by a SIGCONT, it exits immediately from the kernel,
however much time remains to sleep, even if no signal handler is attached
to SIGCONT, which is not the correct POSIX semantics (it should only
return if there is a signal handler attached to SIGCONT). Notice also that
the remaining time does not take into account the time during which the
process has been stopped.

The general problem here is that the kernel seems to *ALWAYS* return EINTR
when signals have been sent during system calls, *EVEN* when there is no
signal handler attached to the signal, which seems to be in contradiction
with the generic POSIX semantics of EINTR. I have added the glibc-bug
mailing list because I don't know whether the POSIX behaviour should be
handled correctly in the libc or in the kernel.

BTW, a funny user test to show this misbehaviour is to type the following
commands in bash:

sleep 1000
^Z
fg

and the process running sleep 1000 immediately returns on Linux. I tested it
on other systems and it works correctly (the sleep continues).

Best regards,
-Eric
P.S. Michael's original problem was with PTRACE_ATTACH, whose side effect
     is to make a process executing nanosleep() immediately exit from
     nanosleep() when attached by GDB, which makes gdb intrusive in the
     process behaviour.
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ Eric PAIRE
Web  : http://www.ri.silicomp.com/~paire  | Group SILICOMP - Research Institute
Email: eric.paire@ri.silicomp.com         | 2, avenue de Vignate
Phone: +33 (0) 476 63 48 71               | F-38610 Gieres
Fax  : +33 (0) 476 51 05 32               | FRANCE

Please read the FAQ at http://www.tux.org/lkml/

------------------------------

From: Simon Kirby <sim@stormix.com>
Date: Wed, 8 Dec 1999 10:02:19 -0500
Subject: Re: SIGCONT misbehaviour in Linux

On Wed, Dec 08, 1999 at 11:14:13AM +0100, Eric PAIRE wrote:

> Hi Linux gurus,
> 
> Michael Snyder is currently integrating my linuxthreads debugging support
> inside the source tree of GDB at Cygnus, and he notified what I think is a
> generic kernel bug in the signal handling:
> 
> When a process blocked in the kernel receives a stopping signal (POSIX says
> SIGSTOP, SIGTSTP, SIGTTIN and SIGTTOU), then the process stops, and this is
> correctly implemented by Linux. *BUT*, when such a process receives a SIGCONT,
> then it must continue, whatever signal handling is configured in the process.
> 
> The specific problem here is that, if the process is blocked in
> sys_nanosleep(), then receiving a SIGSTOP will make it exit from
> sys_nanosleep() and enter into TASK_STOPPED state in do_signal(). 
> When it will be awaken via a SIGCONT, then it will exit immediately
> from the kernel, whatever time it remains to sleep, even if no signal
> handler is attached to SIGCONT, which is not the correct POSIX semantics
> (It should only return if there is a signal handler attached to SIGCONT).
> Notice also that the remaining time does not take into account the time
> during which the process has been stopped.
> 
> The general problem here is that the kernel seems to *ALWAYS* return EINTR
> when signals have been sent during system calls, *EVEN* when there is no
> signal handler attached to the signal, which seems to be in contradiction
> with the generic POSIX semantics of EINTR. I have added the glibc-bug
> mailing list because I don't know whether the POSIX behaviour should be
> handled correctly in the libc or in the kernel.
> 
> BTW, a funny user test to show this misbehaviour is to type the following
> commands in bash:
> 
> sleep 1000
> ^Z
> fg
> 
> and the process running sleep 1000 immediatly returns on Linux. I tested it
> on other systems and it works correctly (the sleep continue).

Hmm...This works properly on libc5 systems, btw.  (glibc2.0 and glibc2.1
use nanosleep(), libc5 uses alarm() and sigsuspend()).

Simon-

[  Stormix Technologies Inc.  ][ NetNation Communications Inc.  ]
[       sim@stormix.com       ][       sim@netnation.com        ]
[ Opinions expressed are not necessarily those of my employers. ]


------------------------------


From: Eric Paire <paire@ri.silicomp.fr>
Date: Wed, 08 Dec 1999 16:49:48 +0100
Subject: Re: SIGCONT misbehaviour in Linux 

> On Wed, Dec 08, 1999 at 11:14:13AM +0100, Eric PAIRE wrote:
> 
> > Hi Linux gurus,
> > 
> > Michael Snyder is currently integrating my linuxthreads debugging support
> > inside the source tree of GDB at Cygnus, and he notified what I think is a
> > generic kernel bug in the signal handling:
> > 
> > When a process blocked in the kernel receives a stopping signal (POSIX says
> > SIGSTOP, SIGTSTP, SIGTTIN and SIGTTOU), then the process stops, and this is
> > correctly implemented by Linux. *BUT*, when such a process receives a SIGCONT,
> > then it must continue, whatever signal handling is configured in the process.
> > 
> > The specific problem here is that, if the process is blocked in
> > sys_nanosleep(), then receiving a SIGSTOP will make it exit from
> > sys_nanosleep() and enter into TASK_STOPPED state in do_signal(). 
> > When it will be awaken via a SIGCONT, then it will exit immediately
> > from the kernel, whatever time it remains to sleep, even if no signal
> > handler is attached to SIGCONT, which is not the correct POSIX semantics
> > (It should only return if there is a signal handler attached to SIGCONT).
> > Notice also that the remaining time does not take into account the time
> > during which the process has been stopped.
> > 
> > The general problem here is that the kernel seems to *ALWAYS* return EINTR
> > when signals have been sent during system calls, *EVEN* when there is no
> > signal handler attached to the signal, which seems to be in contradiction
> > with the generic POSIX semantics of EINTR. I have added the glibc-bug
> > mailing list because I don't know whether the POSIX behaviour should be
> > handled correctly in the libc or in the kernel.
> > 
> > BTW, a funny user test to show this misbehaviour is to type the following
> > commands in bash:
> > 
> > sleep 1000
> > ^Z
> > fg
> > 
> > and the process running sleep 1000 immediatly returns on Linux. I tested it
> > on other systems and it works correctly (the sleep continue).
> 
> Hmm...This works properly on libc5 systems, btw.  (glibc2.0 and glibc2.1
> use nanosleep(), libc5 uses alarm() and sigsuspend()).
> 
This works for the special case of sleep(), which is the example I took,
only because the libc5 sleep implementation checks the return value.
But what about the other blocking system calls (like nanosleep)? Do they
check, on an EINTR errno, that the received SIGCONT had a signal handling
function installed at the time it was received, and otherwise automagically
restart the system call that should not have been interrupted?
This is why my guess is that this should be fixed in the kernel (if Linux
is to be POSIX-compliant).

-Eric
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ Eric PAIRE
Web  : http://www.ri.silicomp.com/~paire  | Group SILICOMP - Research Institute
Email: eric.paire@ri.silicomp.com         | 2, avenue de Vignate
Phone: +33 (0) 476 63 48 71               | F-38610 Gieres
Fax  : +33 (0) 476 51 05 32               | FRANCE



------------------------------


From: hpa@transmeta.com (H. Peter Anvin)
Date: 8 Dec 1999 12:22:20 -0800
Subject: Re: SIGCONT misbehaviour in Linux

Followup to: <19991208100219.A18129@stormix.com>
By author: Simon Kirby <sim@stormix.com>
In newsgroup: linux.dev.kernel
> >
> > and the process running sleep 1000 immediatly returns on Linux. I tested it
> > on other systems and it works correctly (the sleep continue).
>
> Hmm...This works properly on libc5 systems, btw. (glibc2.0 and glibc2.1
> use nanosleep(), libc5 uses alarm() and sigsuspend()).
>

It really could be argued what is the right behaviour here. When a
system call is interrupted by the signal, the normal thing is to
return EINTR.

       -hpa

--
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."



------------------------------


From: Ulrich Drepper <drepper@cygnus.com>
Date: 08 Dec 1999 13:01:17 -0800
Subject: Re: SIGCONT misbehaviour in Linux

hpa@transmeta.com (H. Peter Anvin) writes:

> > Hmm...This works properly on libc5 systems, btw. (glibc2.0 and glibc2.1
> > use nanosleep(), libc5 uses alarm() and sigsuspend()).
> >
>
> It really could be argued what is the right behaviour here. When a
> system call is interrupted by the signal, the normal thing is to
> return EINTR.

Right. The problem is that the ptrace() call to continue the process
(which implicitly sends a SIGCONT) also wakes up the process. We have
a test program which, if you'd run it normally, would not finish in
aeons. If you run it under gdb with all the ptrace() calls to stop
and continue all the threads, it finishes. This change in behaviour
is not wanted nor can it be avoided by gdb without a kernel change.

--
---------------. drepper at gnu.org ,-. 1325 Chesapeake Terrace
Ulrich Drepper \ ,-------------------' \ Sunnyvale, CA 94089 USA
Cygnus Solutions `--' drepper at cygnus.com `------------------------


------------------------------


From: "Richard B. Johnson" <root@chaos.analogic.com>
Date: Wed, 8 Dec 1999 16:41:37 -0500 (EST)
Subject: Re: SIGCONT misbehaviour in Linux

On 8 Dec 1999, H. Peter Anvin wrote:

> Followup to: <19991208100219.A18129@stormix.com>
> By author: Simon Kirby <sim@stormix.com>
> In newsgroup: linux.dev.kernel
> > >
> > > and the process running sleep 1000 immediatly returns on Linux. I tested it
> > > on other systems and it works correctly (the sleep continue).
> >
> > Hmm...This works properly on libc5 systems, btw. (glibc2.0 and glibc2.1
> > use nanosleep(), libc5 uses alarm() and sigsuspend()).
> >
>
> It really could be argued what is the right behaviour here. When a
> system call is interrupted by the signal, the normal thing is to
> return EINTR.
>
> -hpa
>

It becomes a definition of BSD_SIGNALS. If I remember correctly,
they, by default, use SA_RESTART as a flag. This way, sleep()
and other system calls automatically restart after a signal. At
the kernel level, any signal delivered to a process, causes a
co-pending system call to return to the caller with -EINTR. It
is the 'C' runtime library that decides, based upon this flag,
if the system call should be restarted or if -1 should be returned
to the caller with errno set to EINTR.

Cheers,
Dick Johnson

Penguin : Linux version 2.3.13 on an i686 machine (400.59 BogoMips).
Warning : The end of the world as we know it requires a new calendar.
Seconds : 2013503 (until Y2K)


------------------------------


From: Andrea Arcangeli <andrea@suse.de>
Date: Wed, 8 Dec 1999 23:26:04 +0100 (CET)
Subject: Re: SIGCONT misbehaviour in Linux

On Wed, 8 Dec 1999, Richard B. Johnson wrote:

>co-pending system call to return to the caller with -EINTR. It
>is the 'C' runtime library that decides, based upon this flag,
>if the system call should be restarted or if -1 should be returned
>to the caller with errno set to EINTR.

glibc could also re-run the syscall, without waiting again from the
beginning, by looking at the 'struct timespec *rem'. If there weren't
the `rem` parameter in nanosleep, glibc couldn't wrap the -EINTR
transparently. But there is.

NOTE: I can as well fix the kernel for this, but I agree with Peter that
returning -EINTR looks like the right thing to do. (I don't know what
the official semantics of the syscall are, though.)

Andrea


------------------------------

From: "Richard B. Johnson" <root@chaos.analogic.com>
Date: Wed, 8 Dec 1999 17:32:32 -0500 (EST)
Subject: Re: SIGCONT misbehaviour in Linux

> By author: Simon Kirby <sim@stormix.com>
> In newsgroup: linux.dev.kernel
>
> and the process running sleep 1000 immediatly returns on Linux.
> I tested it on other systems and it works correctly (the sleep
> continue).

This shows the operation of the SA_RESTART flag. If you don't want
the system call to return to the caller with -1 and EINTR, you
have to use this.

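#include <stdio.h>
#include <signal.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>     /* alarm(), read() */

/* SIGALRM handler: ring the bell and report the alarm. */
void foo(int unused) { (void)unused; puts("\7Alarm"); }

int main(int argc, char **argv)
{
    struct sigaction sa;
    char buf[1];
    int i;
    (void)argv;
    memset(&sa, 0x00, sizeof(sa));
    if(argc > 1)                    /* any argument: restart the read() */
        sa.sa_flags = SA_RESTART;
    sa.sa_handler = foo;
    sigaction(SIGALRM, &sa, NULL);
    alarm(1);
    i = read(0, buf, 1);            /* blocks until input or the alarm */
    printf("%d, %s\n", i, strerror(errno));
    return 0;
}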

Cheers,
Dick Johnson

Penguin : Linux version 2.3.13 on an i686 machine (400.59 BogoMips).
Warning : The end of the world as we know it requires a new calendar.
Seconds : 2010448 (until Y2K)


------------------------------

From: "Richard B. Johnson" <root@chaos.analogic.com>
Date: Wed, 8 Dec 1999 17:37:29 -0500 (EST)
Subject: Re: SIGCONT misbehaviour in Linux

On Wed, 8 Dec 1999, Andrea Arcangeli wrote:

> On Wed, 8 Dec 1999, Richard B. Johnson wrote:
>
> >co-pending system call to return to the caller with -EINTR. It
> >is the 'C' runtime library that decides, based upon this flag,
> >if the system call should be restarted or if -1 should be returned
> >to the caller with errno set to EINTR.
>
> glibc could also return to run the syscall without waiting again from the
> beginning by looking at the 'struct timespec *rem'. If there wouldn't be
> the `rem` parameter in nanosleep, glibc couldn't wrap the -EINTR
> trasparently. But there is.
>
> NOTE: I can as well fix the kernel for this, but I agree with Peter that
> returning -INTR looks like the right thing to do. (I don't know which is
> the official semantic for the syscall though)
>
> Andrea
>
I think the kernel provides the correct result. The caller either has
to use '_BSD_SIGNALS_' or use code like this:

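#include <stdio.h>
#include <signal.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>     /* alarm(), read() */

/* SIGALRM handler: ring the bell and report the alarm. */
void foo(int unused) { (void)unused; puts("\7Alarm"); }

int main(int argc, char **argv)
{
    struct sigaction sa;
    char buf[1];
    int i;
    (void)argv;
    memset(&sa, 0x00, sizeof(sa));
    if(argc > 1)                    /* any argument: restart the read() */
        sa.sa_flags = SA_RESTART;
    sa.sa_handler = foo;
    sigaction(SIGALRM, &sa, NULL);
    alarm(1);
    i = read(0, buf, 1);            /* blocks until input or the alarm */
    printf("%d, %s\n", i, strerror(errno));
    return 0;
}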

Depending upon whether anything is on the command-line, the SA_RESTART
flag is set. This allows one to get both kinds of behavior with no
problems. I think the kernel code is correct.

Cheers,
Dick Johnson

Penguin : Linux version 2.3.13 on an i686 machine (400.59 BogoMips).
Warning : The end of the world as we know it requires a new calendar.
Seconds : 2010151 (until Y2K)


------------------------------


From: Ulrich Drepper <drepper@cygnus.com>
Date: 08 Dec 1999 14:41:31 -0800
Subject: Re: SIGCONT misbehaviour in Linux

Andrea Arcangeli <andrea@suse.de> writes:

> NOTE: I can as well fix the kernel for this, but I agree with Peter that
> returning -INTR looks like the right thing to do. (I don't know which is
> the official semantic for the syscall though)

You don't understand the initial problem. This is that

        kill(SIGSTOP);
        ptrace(PTRACE_CONTINUE)

is interrupting syscalls as well. It is fine if signals in general
interrupt syscalls. But SIGSTOP & friends, undone by a ptrace() call,
should not make the syscall return, since these kinds of things happen
for reasons outside the program: a user hitting ^Z, or gdb stopping and
restarting a process. The behaviour of the program is changed
dramatically.

--
---------------. drepper at gnu.org ,-. 1325 Chesapeake Terrace
Ulrich Drepper \ ,-------------------' \ Sunnyvale, CA 94089 USA
Cygnus Solutions `--' drepper at cygnus.com `------------------------


------------------------------


From: Andrea Arcangeli <andrea@suse.de>
Date: Thu, 9 Dec 1999 00:30:13 +0100 (CET)
Subject: Re: SIGCONT misbehaviour in Linux

On 8 Dec 1999, Ulrich Drepper wrote:

>You don't understand the initial problem. This is that

I am not even considering it now. I was considering what the kernel should
do after a:

        kill(SIGSTOP);
        kill(SIGCONT);

Richard was talking about what happens after a _signal_ and not after a
ptrace_continue. These are two different things and we can make them
behave in completely different ways inside the kernel. I don't think you
should compare SIGSTOP+SIGCONT with SIGSTOP+PTRACE_CONTINUE.

>reasons outside the program. I user hitting ^Z or gdb stopping and

I think we should distinguish between ^Z and gdb. The signal code is
full of ugly special cases exactly because they are different things,
AFAIK.

Do you agree that ^Z is correct in returning -EINTR immediately at
SIGCONT time (aka `fg` time)?

Should we make PTRACE_CONTINUE force nanosleep to continue (unlike the
SIGCONT case)? BTW, I am not sure if nanosleep is the only place that you
may like to change in this respect...

Andrea


------------------------------


From: Ulrich Drepper <drepper@cygnus.com>
Date: 08 Dec 1999 16:20:40 -0800
Subject: Re: SIGCONT misbehaviour in Linux

Andrea Arcangeli <andrea@suse.de> writes:

> Do you agree that ^Z is just correct returning -EINTR immediatly at
> SIGCONT time (aka `fg` time)?

This is not what happens on other platforms. At least with my limited
testing I found that if you do on Solaris

        sleep 10
        ^Z
        fg

the process will continue to sleep.

> Should we make PTRACE_CONTINUE to force nanosleep to continue (unlike the
> SIGCONT case?)?

This is the least that has to happen.

> BTW, I am not sure if nanosleep is the only place that you may like
> to change in this respect...

No, it's not the only place (e.g., blocking read call). I think this
is a general change. Whenever the continue happens through
PTRACE_CONTINUE no EINTR should be generated.

--
---------------. drepper at gnu.org ,-. 1325 Chesapeake Terrace
Ulrich Drepper \ ,-------------------' \ Sunnyvale, CA 94089 USA
Cygnus Solutions `--' drepper at cygnus.com `------------------------


------------------------------


From: Jason Gunthorpe <jgg@ualberta.ca>
Date: Wed, 8 Dec 1999 17:24:44 -0700 (MST)
Subject: Re: SIGCONT misbehaviour in Linux

On 8 Dec 1999, Ulrich Drepper wrote:

> is interrupting syscalls as well. It is fine if signals in general
> interrrupt syscalls. But SIGSTOP & friends, undone by a ptrace() call
> should not return since these kind of things happen when because of

I've noticed some general dysfunction with Linux and attaching strace to
running processes. It seems that strace cannot attach without affecting
the state of the process it is attaching to. I never had time to track
the particular problem down, but this sounds like a plausible
explanation [strace causes a slow system call to return?].

Jason


------------------------------


From: Andrea Arcangeli <andrea@suse.de>
Date: Thu, 9 Dec 1999 01:38:32 +0100 (CET)
Subject: Re: SIGCONT misbehaviour in Linux

On Wed, 8 Dec 1999, Jason Gunthorpe wrote:

>I've noticed some general dysfunction with Linux and attaching strace to
>running processes. It seems that strace cannot attach without effecting

There are things that you should expect to break. For example, if you
SIGSTOP your parent (which is always strace) while you are traced, then
you'll deadlock the first time you try to return to userspace
immediately after you sent the signal to strace. This is normal and it's
not trivial to fix.

Andrea


------------------------------


From: Andrea Arcangeli <andrea@suse.de>
Date: Thu, 9 Dec 1999 01:32:34 +0100 (CET)
Subject: Re: SIGCONT misbehaviour in Linux

On 8 Dec 1999, Ulrich Drepper wrote:

>This is not what happens on other platforms. At least with my limited
>testing I found that if you do on Solaris
>
> sleep 10
> ^Z
> fg
>
>the process will continue to sleep.

That's not enough to tell what the kernel is doing, maybe they have a bit
smarter sleep(1) program. `sleep` can be changed to run nanosleep again if
it received -EINTR and `req` is not null. You only have to pass as `req`
the `rem` that you got back from the previous nanosleep call.

>> Should we make PTRACE_CONTINUE to force nanosleep to continue (unlike the
>> SIGCONT case?)?
>
>This is the least what has to happen.

Ok.

>> BTW, I am not sure if nanosleep is the only place that you may like
>> to change in this respect...
>
>No, it's not the only place (e.g., blocking read call). I think this
>is a general change. Whenever the continue happens throug
>PTRACE_CONTINUE no EINTR should be generated.

Ok.

Andrea


------------------------------


From: Ulrich Drepper <drepper@cygnus.com>
Date: 08 Dec 1999 16:47:31 -0800
Subject: Re: SIGCONT misbehaviour in Linux

Andrea Arcangeli <andrea@suse.de> writes:

> That's not enough to tell what the kernel is doing, maybe they have a bit
> smarter sleep(1) program. `sleep` can be changed to run nanosleep again if
> it received -EINTR and `req` is not null. You only have to pass as `req`
> the `rem` that you got back from the previous nanosleep call.

I ran it under truss, you can do the same. The syscall does not return.

--
---------------. drepper at gnu.org ,-. 1325 Chesapeake Terrace
Ulrich Drepper \ ,-------------------' \ Sunnyvale, CA 94089 USA
Cygnus Solutions `--' drepper at cygnus.com `------------------------


------------------------------


From: sp@albion.engr.sgi.com (Simon Patience)
Date: Fri, 10 Dec 1999 07:29:25 -0800 (PST)
Subject: Re: SIGCONT misbehaviour in Linux

In article <82mvls$aqtrr@fido.engr.sgi.com>, you write:
|> On 8 Dec 1999, Ulrich Drepper wrote:
|> >This is not what happens on other platforms. At least with my limited
|> >testing I found that if you do on Solaris
|> >
|> > sleep 10
|> > ^Z
|> > fg
|> >
|> >the process will continue to sleep.
|>
|> That's not enough to tell what the kernel is doing, maybe they have a bit
|> smarter sleep(1) program. `sleep` can be changed to run nanosleep again if
|> it received -EINTR and `req` is not null. You only have to pass as `req`
|> the `rem` that you got back from the previous nanosleep call.

No, the problem is that you shouldn't have interrupted it in the first
place. What is the point of interrupting a blocked process so that you
can block it?

Simon.

--
  Simon Patience Phone: (650) 933-4644
  Silicon Graphics, Inc FAX: (650) 962-8404
  1600 Amphitheatre Pkwy Email: sp@sgi.com
  Mountain View, CA 94043-1389


------------------------------


From: sp@albion.engr.sgi.com (Simon Patience)
Date: Fri, 10 Dec 1999 07:42:52 -0800 (PST)
Subject: Re: SIGCONT misbehaviour in Linux

Eric Paire wrote:
|> My reading of the POSIX philosophy is that it is not legal for a
|> blocking system call to return EINTR when it has been interrupted by a
|> signal that does not have a signal handler attached to it at the time
|> the signal has been delivered to the process.

I agree. If you look at the description of EINTR, that is quite clear.

[snip]

|> IMHO, the SIGSTOP management (which is much simpler than the others since
|> the signal can never be ignored nor caught) should be taken into account
|> in the schedule loop, and not in the signal management on syscall return.

You really don't want job control to be implemented in the scheduler! It
should be implemented in the job control code on syscall/trap return. I
know there isn't such code at the moment but that is why you are seeing the
problems :-)

|> Notice that SIGTSTP, SIGTTIN and SIGTTOU should be handled at the same
|> place when the default signal behaviour is applied, as well as some other

Agreed.

|> special cases like ignored SIGCHLD,... Part of this code is currently in
|> machine-dependent do_signal() function.

|> The advantage of such modification is that a blocking system call will
|> remain in the actual schedule loop whenever SIGSTOP/SIGTSTP and SIGCONT
|> are sent to him (thus eliminating the EINTR problem, and being POSIX
|> compatible). The other advantage is that for a traced process, the SIGSTOP
|> handling may also be managed in the schedule loop, thus avoiding the side
|> effect of being awaken by PTRACE_ATTACH/PTRACE_CONTINUE.

I don't see this as an advantage. Stop signals should stop the process from
advancing in user space. You don't need to do anything to them while they
are in the kernel.

Simon.

--
  Simon Patience Phone: (650) 933-4644
  Silicon Graphics, Inc FAX: (650) 962-8404
  1600 Amphitheatre Pkwy Email: sp@sgi.com
  Mountain View, CA 94043-1389


------------------------------


From: Ulrich Drepper <drepper@cygnus.com>
Date: 10 Dec 1999 07:49:48 -0800
Subject: Re: SIGCONT misbehaviour in Linux

sp@albion.engr.sgi.com (Simon Patience) writes:

> |> > sleep 10
> |> > ^Z
> |> > fg
> |> >
> |> >the process will continue to sleep.
> [...]
> No, the problem is that you shouldn't have interrupted it in the first
> place. What is the point of interrupting a blocked process so that you
> can block it?

I agree, that is what seems to happen. With one little addition: at
least on Solaris the syscall in the end nevertheless returns EINTR.
I'm not sure whether this is useful but it might be ok since
a) code today already has to handle EINTR
b) it provides the user more information (e.g., that she could find out
   that the process has possibly slept for a long time)

--
---------------. drepper at gnu.org ,-. 1325 Chesapeake Terrace
Ulrich Drepper \ ,-------------------' \ Sunnyvale, CA 94089 USA
Cygnus Solutions `--' drepper at cygnus.com `------------------------


------------------------------


From: Simon Patience <sp@albion.engr.sgi.com>
Date: Fri, 10 Dec 99 08:18:50 -0800
Subject: Re: SIGCONT misbehaviour in Linux

Ulrich Drepper wrote:
|> Andrea Arcangeli <andrea@suse.de> writes:
|>
|> > Do you agree that ^Z is just correct returning -EINTR immediatly at
|> > SIGCONT time (aka `fg` time)?
|>
|> This is not what happens on other platforms. At least with my limited
|> testing I found that if you do on Solaris
|>
|> sleep 10
|> ^Z
|> fg
|>
|> the process will continue to sleep.

I am with Ulrich on this one. The problem with job control signals is
that they are not really signals directed towards the process, they are
signals directed to the kernel to do something to the process. In the
case of STOP/CONT it is a request to not allow/allow the process to
make forward progress _in user space_. Sending the signal should not
interrupt the process at all.

Sending SIGSTOP should simply mark the process as not to return to user
space. If the process happens to be already blocked in the kernel
waiting for something, there is no reason to interrupt so it can be
blocked somewhere else in the kernel. If it wakes up then it can
complete the system call successfully but then block before returning
to user space. SIGCONT would simply clear that flag and wake the
process up if it was in a job control stop.

Simon.

----
  Simon Patience Phone: (650) 933-4644
  Silicon Graphics, Inc FAX: (650) 962-8404
  1600 Amphitheatre Pkwy Email: sp@sgi.com
  Mountain View, CA 94043-1389


------------------------------


From: Brian Pomerantz <bapper@piratehaven.org>
Date: Fri, 10 Dec 1999 10:17:08 -0800
Subject: Re: SIGCONT misbehaviour in Linux

On Fri, Dec 10, 1999 at 07:29:25AM -0800, Simon Patience wrote:
> In article <82mvls$aqtrr@fido.engr.sgi.com>, you write:
> |> On 8 Dec 1999, Ulrich Drepper wrote:
> |> >This is not what happens on other platforms. At least with my limited
> |> >testing I found that if you do on Solaris
> |> >
> |> > sleep 10
> |> > ^Z
> |> > fg
> |> >
> |> >the process will continue to sleep.
> |>
> |> That's not enough to tell what the kernel is doing, maybe they have a bit
> |> smarter sleep(1) program. `sleep` can be changed to run nanosleep again if
> |> it received -EINTR and `req` is not null. You only have to pass as `req`
> |> the `rem` that you got back from the previous nanosleep call.
>
> No, the problem is that you shouldn't have interrupted it in the first
> place. What is the point of interrupting a blocked process so that you
> can block it?
>

Isn't a process in a blocked state when it is waiting on I/O? I often
will hit ^Z for a long tarball extraction and run it in the
background. When I hit ^Z, the process could be waiting for I/O when
the signal comes through, thus a time when I want to interrupt a
blocked process to block it.

BAPper


                        linux-kernel-digest V1 #4897

  ------------------------------------------------------------------------


From: sp@albion.engr.sgi.com (Simon Patience)
Date: Fri, 10 Dec 1999 15:37:51 -0800 (PST)
Subject: Re: SIGCONT misbehaviour in Linux

Brian Pomerantz wrote:
|> On Fri, Dec 10, 1999 at 07:29:25AM -0800, Simon Patience wrote:
|> > In article <82mvls$aqtrr@fido.engr.sgi.com>, you write:
|> > |> On 8 Dec 1999, Ulrich Drepper wrote:
|> > |> >This is not what happens on other platforms. At least with my limited
|> > |> >testing I found that if you do on Solaris
|> > |> >
|> > |> > sleep 10
|> > |> > ^Z
|> > |> > fg
|> > |> >
|> > |> >the process will continue to sleep.
|> > |>
|> > |> That's not enough to tell what the kernel is doing, maybe they have a bit
|> > |> smarter sleep(1) program. `sleep` can be changed to run nanosleep again if
|> > |> it received -EINTR and `req` is not null. You only have to pass as `req`
|> > |> the `rem` that you got back from the previous nanosleep call.
|> >
|> > No, the problem is that you shouldn't have interrupted it in the first
|> > place. What is the point of interrupting a blocked process so that you
|> > can block it?
|>
|> Isn't a process in a blocked state when it is waiting on I/O? I often
|> will hit ^Z for a long tarball extraction and run it in the
|> background. When I hit ^Z, the process could be waiting for I/O when
|> the signal comes through, thus a time when I want to interrupt a
|> blocked process to block it.

My point was that you could just leave it blocked waiting for the I/O.
If you type fg before the I/O completes then the process will just
execute and return as if nothing had happened when the I/O finally
completes. If the I/O completes first then the process winds its way
back to the return from the system call, notices that it is stopped and
blocks itself there. When the SIGCONT arrives, the process unblocks and
returns from the system call normally.

This is far better (less code, less complexity) than unblocking the I/O
(which may have partial results) getting the process to block somewhere
else and then trying to work out what on earth the right thing to do is
when SIGCONT arrives.

Simon.
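
[A minimal sketch of the user-space workaround quoted above: call
nanosleep() again with the returned remainder whenever it fails with
EINTR. The helper name sleep_fully is hypothetical and does not come from
this thread. As noted earlier in the thread, the remainder does not account
for the time the process spent stopped, and a loop like this only hides the
kernel behaviour under discussion rather than fixing it.]

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* expose nanosleep() under strict modes */
#endif
#include <errno.h>
#include <time.h>

/* Hypothetical helper: sleep for the whole interval even if a signal
 * interrupts nanosleep(). Returns 0 on success, -1 on any error other
 * than EINTR (errno is left as set by nanosleep). */
int sleep_fully(time_t sec, long nsec)
{
    struct timespec req = { .tv_sec = sec, .tv_nsec = nsec };
    struct timespec rem;

    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem;   /* pass the previous `rem` back in as the new `req` */
    }
    return 0;
}
```

[Calling sleep_fully(10, 0) in place of a single nanosleep() call would
keep a sleep(1)-like program asleep across ^Z and fg on a kernel that
returns EINTR after SIGCONT.]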

  ------------------------------------------------------------------------


From: Simon Patience <sp@albion.engr.sgi.com>
Date: Wed, 15 Dec 99 08:36:52 -0800
Subject: Re: SIGCONT misbehaviour in Linux 

Eric Paire wrote:
> > Eric Paire wrote:
> > |> IMHO, the SIGSTOP management (which is much simpler than the others since
> > |> the signal can never be ignored nor caught) should be taken into account
> > |> in the schedule loop, and not in the signal management on syscall return.
> > 
> > You really don't want job control to be implemented in the scheduler! It
> > should be implemented in the job control code on syscall/trap return. I
> > know there isn't such code at the moment but that is why you are seeing the
> > problems :-)
> > 
> No. My opinion was to locate only STOP/START management in the scheduling loop
> in order to avoid exiting it for being managed very late (just before
> returning in user mode). So that if a process is stopped and then restarted
> without any signal handler, then it will remain blocked in the scheduler
> (which is transparent for functions that block a process).

Why are you trying to do this? I can't see the objection to code just before
return to user space that says, if I am stopped, wait for SIGCONT. As you
haven't interrupted the process you won't get EINTR. You don't have to
muck with the scheduler, which is always a tricky thing to do, and
everything works wonderfully.

> > I don't see this as an advantage. Stop signals should stop the process from
> > advancing in user space. You don't need to do anything to them while they
> > are in the kernel.
> > 
> My point is that processes that are stopped and restarted, exit from the
> main scheduler loop, and prepare themselves for returning EINTR in user space
> (which is *not* POSIX-compliant, and makes GDB very intrusive), since the

But you don't need to change the scheduler to fix that, just don't
interrupt the process when it gets the STOP signal in the first place.
Mark the process as stopped, having SIGSTOP in the pending set is good
enough but don't wake the process up. Then in do_signal() you special
case STOP signals and wait on a semaphore or something (actually a
synchronization/condition variable would be good for this situation but
Linux doesn't have them). When someone sends SIGCONT, they clear the STOP
signal from the pending set (as today) and then signal the semaphore.
No interrupt, no scheduler hack, POSIX compliant, simple.
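
[A rough user-space model of the scheme just described, for illustration
only. It is not kernel code: threads stand in for processes, and every name
here (struct task, task_stop, task_cont, task_checkpoint) is hypothetical.
It uses a pthread condition variable in place of the semaphore, which is
the primitive the paragraph above says would suit the situation.]

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space model of a task; this is not kernel code. */
struct task {
    pthread_mutex_t lock;
    pthread_cond_t  resumed;  /* signalled when the "SIGCONT" arrives */
    bool            stopped;  /* stands in for SIGSTOP in the pending set */
};

void task_init(struct task *t)
{
    pthread_mutex_init(&t->lock, NULL);
    pthread_cond_init(&t->resumed, NULL);
    t->stopped = false;
}

/* "Send SIGSTOP": mark the task only; a blocked task is not woken up. */
void task_stop(struct task *t)
{
    pthread_mutex_lock(&t->lock);
    t->stopped = true;
    pthread_mutex_unlock(&t->lock);
}

/* "Send SIGCONT": clear the stop mark, then wake the task if it waits. */
void task_cont(struct task *t)
{
    pthread_mutex_lock(&t->lock);
    t->stopped = false;
    pthread_cond_broadcast(&t->resumed);
    pthread_mutex_unlock(&t->lock);
}

/* Called by the task itself just before "returning to user space".
 * Returns true if it had to wait for a continue, false otherwise. */
bool task_checkpoint(struct task *t)
{
    bool waited = false;

    pthread_mutex_lock(&t->lock);
    while (t->stopped) {
        waited = true;
        pthread_cond_wait(&t->resumed, &t->lock);
    }
    pthread_mutex_unlock(&t->lock);
    return waited;
}
```

[In this model a task that is stopped and continued while it is still
blocked passes through task_checkpoint() without waiting, so the blocking
call it was in completes normally and never reports an interruption.]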

> current implementation of restart does not force them to return to the
> scheduler loop for those in INTERRUPTED state. The idea of managing stop
> restart without signal handlers within schedule() is to make a simple
> machine-independent modification to correct this signal mishandling.
> 
> Any scheduler guru opinion ???

Simon

  Simon Patience				Phone: (650) 933-4644
  Silicon Graphics, Inc				FAX:   (650) 962-8404
  1600 Amphitheatre Pkwy			Email: sp@sgi.com
  Mountain View, CA 94043-1389


------------------------------


From: Eric Paire <paire@ri.silicomp.fr>
Date: Thu, 16 Dec 1999 14:04:56 +0100
Subject: Re: SIGCONT misbehaviour in Linux

> Eric Paire wrote:
> > > Eric Paire wrote:
> > > |> IMHO, the SIGSTOP management (which is much simpler than the others since
> > > |> the signal can never be ignored nor caught) should be taken into account
> > > |> in the schedule loop, and not in the signal management on syscall return.
> > >
> > > You really don't want job control to be implemented in the scheduler! It
> > > should be implemented in the job control code on syscall/trap return. I
> > > know there isn't such code at the moment but that is why you are seeing the
> > > problems :-)
> > >
> > No. My opinion was to locate only STOP/START management in the scheduling loop
> > in order to avoid exiting it for being managed very late (just before
> > returning in user mode). So that if a process is stopped and then restarted
> > without any signal handler, then it will remain blocked in the scheduler
> > (which is transparent for functions that block a process).
>
> Why are you trying to do this? I can't see the objection to code just before
> return to user space that says, if I am stopped, wait for SIGCONT. As you
> haven't interrupted the process you won't get EINTR. You don't have to
> muck with the scheduler, which is always a tricky thing to do, and
> everything works wonderfully.
>
> > > I don't see this as an advantage. Stop signals should stop the process from
> > > advancing in user space. You don't need to do anything to them while they
> > > are in the kernel.
> > >
> > My point is that processes that are stopped and restarted, exit from the
> > main scheduler loop, and prepare themselves for returning EINTR in user space
> > (which is *not* POSIX-compliant, and makes GDB very intrusive), since the
>
> But you don't need to change the scheduler to fix that, just don't
> interrupt the process when it gets the STOP signal in the first place.
> Mark the process as stopped, having SIGSTOP in the pending set is good
> enough but don't wake the process up. Then in do_signal() you special
> case STOP signals and wait on a semaphore or something (actually a
> synchronization/condition variable would be good for this situation but
> Linux doesn't have them). When someone sends SIGCONT, they clear the STOP
> signal from the pending set (as today) and then signal the semaphore.
> No interrupt, no scheduler hack, POSIX compliant, simple.
>
I agree that your idea of moving the STOP/CONT management into the calling
process rather than the managed process seems good. But you will also have to
move the STOP/CONT management of a traced process (which is similar to
STOP/CONT) from do_signal() into ptrace(), so that ptrace does not modify
the process scheduling (gdb would be intrusive otherwise).

> > current implementation of restart does not force them to return to the
> > scheduler loop for those in INTERRUPTED state. The idea of managing stop
> > restart without signal handlers within schedule() is to make a simple
> > machine-independent modification to correct this signal mishandling.
> >
> > Any scheduler guru opinion ???
>

-Eric
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ Eric PAIRE
Web : http://www.ri.silicomp.com/~paire | Group SILICOMP - Research Institute
Email: eric.paire@ri.silicomp.com | 2, avenue de Vignate
Phone: +33 (0) 476 63 48 71 | F-38610 Gieres
Fax : +33 (0) 476 51 05 32 | FRANCE


------------------------------