#! magic, details about the shebang/hash-bang mechanism
on various Unix flavours
2010-04-06
Here you'll find
See an old mail from Dennis Ritchie introducing the new feature,
quoted in 4.0BSD, /usr/src/sys/newsys/sys1.c.
So this mechanism was invented between Version 7 and
Version 8. It was then available in 4BSD but not activated by
default until 4.2BSD.
(pointed out by Gunnar Ritter in
<3B5B0BA4.XY112IX2@bigfoot.de>, de.comp.os.unix.shell.)
In 4.3BSD Net/2 the code was removed due to the license war and had to be reimplemented for the descendants (e.g., NetBSD, 386BSD, BSDI).
The paragraph
"3.16) Why do some scripts start with #! ... ?"
emphasizes the history concerning shells, not the kernel.
That document is wrong about two details:
#! was not invented at Berkeley (but they implemented
it first in widely distributed releases), see above.
#! required?
There is a rumor that a very few, very special, earlier
Unix versions (particularly 4.2BSD derivatives) require you to
separate the "#!" from the following path with a blank.
You may also read that (allegedly) such a kernel parses
"#! /" as a 32-bit (long) magic.
But it turns out that it is virtually impossible to find
a Unix which actually required this.
4.2BSD in fact doesn't require it, although
previous versions
of the GNU autoconf tutorial wrongly claimed this
("10. Portable Shell Programming", corrected with release 2.64, 2009-07-26).
Instead, see (again, 4.0BSD /usr/src/sys/newsys/sys1.c from above, and)
the first regular occurrence in
4.2BSD,
/usr/src/sys/sys/kern_exec.c.
The source accepted a blank, but never required it.
All this pointed out by Gunnar Ritter in
<3B5B0BA4.XY112IX2@bigfoot.de>
(and thanks to the new Caldera license, the code can be cited here now.)
Instead, the origin of this myth "of the required blank"
might be a particular release of 4.1 BSD:
There is a manpage in a "4.1.snap" snapshot of 4.1BSD on the CSRG CDs,
/usr/man/man2/exec.2 (4/1/81),
where a space/tab after the #! is mentioned as mandatory.
However, this is not true: the source itself remained unchanged.
(Hint to the existence of such a manpage from Bruce Barnett in
<ae3m9l$rti$0@208.20.133.66>).
It's not clear whether this is a bug in documentation or if Berkeley planned to modify the BSD source but eventually did not.
DYNIX is mentioned in the autoconf documentation, too.
It's unclear whether this variant implemented it in a few releases
(perhaps following the above-mentioned manual page).
However, later releases did not implement it according to
Usenet discussions.
I asked David MacKenzie, the author of the autoconf
documentation, about the actual origin of the autoconf note.
Unfortunately, neither the reporting author nor
the system in question is recorded anymore.
Even an intensive search of Usenet archives didn't reveal any further hints to me.
I have found no evidence yet that there's an implementation
which forbids a blank after #!.
env(1) is often used with the #! mechanism to start
an interpreter, which then only needs to be somewhere
in your PATH, e.g. "#!/usr/bin/env perl".
However, the location of env(1) may vary.
Free-, Net-, OpenBSD and some Linux distributions (Debian)
only come with /usr/bin/env.
On the other hand, there's only /bin/env at least on
OpenServer 5.0.6 and Unicos 9.0.2.
(On some other Linux distributions (Red Hat) it's located in /bin, and there's
a symbolic link from /usr/bin/env to it.)
The env mechanism greatly increases convenience,
but it cannot strictly assure "portability" of a script.
In practice, env must not be a script, because the #! mechanism usually
accepts only binary executables (except on a few implementations like Minix and Linux 2.6.27.9 ff.).
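The PATH lookup that makes "#!/usr/bin/env perl" work can be sketched roughly as follows. This is only an illustration of the idea, not the code of any particular env(1); the function name find_in_path() is made up for this example, and the sketch ignores details such as a trailing empty PATH entry or checking that the match is a regular file.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: search the colon-separated list `path` for an
 * executable called `name`.  On success the full path is stored in
 * `out` and 0 is returned, otherwise -1.
 */
int find_in_path(const char *name, const char *path, char *out, size_t outlen)
{
    while (*path != '\0') {
        size_t len = strcspn(path, ":");   /* length of this PATH entry */
        int n;

        if (len == 0)                      /* empty entry: current directory */
            n = snprintf(out, outlen, "./%s", name);
        else
            n = snprintf(out, outlen, "%.*s/%s", (int)len, path, name);

        /* accept the candidate only if it fit into `out` and is executable */
        if (n > 0 && (size_t)n < outlen && access(out, X_OK) == 0)
            return 0;

        path += len;
        if (*path == ':')
            path++;
    }
    return -1;
}
```

A caller would typically pass getenv("PATH") as the second argument and hand the result to execve(). Note that the location of env itself still has to be written as an absolute path in the #! line, which is exactly the portability limit described above.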
Some systems split up the arguments like a shell to fill up argv[], e.g.,
For Linux, a patch was suggested on the Linux kernel mailing list, with an interesting discussion of some portability issues.
execve() implementation due to the license issue.)
| [1] |
NetBSD already implements it in the first cvs entry for
exec_script.c
(01/94), some time before release 1.0.
The file descriptor filesystem ("fdescfs") had been added with release 0.8 (04/93). NetBSD was influenced by 386BSD, but I couldn't find it there (including patchkit 0.2.4, 06/93). FreeBSD, which is a direct descendant of 386BSD, doesn't implement it either. OpenBSD forked off from NetBSD later (10/95) and thus implements it like NetBSD. |
Set user id support is implemented (always by means of the fd filesystem) for instance on:
-p. Without this flag, the EUID is set back
to the UID if different.
#! mechanism,
because you have to be aware of numerous issues. Keywords are:
#!?
Most probably there isn't any Bell-Labs- or Berkeley-derived Unix
that accepts "nested" #!.
However, Minix and Linux since 2.6.27.9 [2] accept this.
Be careful to distinguish whether the kernel accepts an interpreted script after #!,
or whether your shell silently takes over because the kernel refused it or
returned an error for another reason:
#! mechanism was not present at compile time
(probably only in unix-like environments like cygwin).
#!, but only if "BSD" was not defined at compile time.
Later variants de-facto do not recognize it.
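The "shell takes over" behaviour can be pictured with the following sketch: the caller tries execve() first and, only if the kernel answers ENOEXEC, runs the file as a script for /bin/sh. The function names and the choice of /bin/sh are assumptions made for this illustration; real shells differ in which interpreter they use and in which errors trigger the fallback.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: build { "sh", script, argv[1], ..., NULL }.
 * Returns a malloc()ed array (the caller frees it) or NULL.
 */
char **build_fallback_argv(const char *script, char *const argv[])
{
    static char sh_name[] = "sh";
    size_t n = 0, extra, i;
    char **out;

    while (argv[n] != NULL)
        n++;
    extra = (n > 0) ? n - 1 : 0;          /* arguments after argv[0] */

    out = malloc((extra + 3) * sizeof *out);
    if (out == NULL)
        return NULL;

    out[0] = sh_name;
    out[1] = (char *)script;              /* the script becomes sh's operand */
    for (i = 0; i < extra; i++)
        out[2 + i] = argv[1 + i];
    out[2 + extra] = NULL;
    return out;
}

/*
 * Try the kernel first; fall back to /bin/sh only on ENOEXEC.
 * Returns -1 with errno set if nothing could be executed.
 */
int exec_with_sh_fallback(const char *path, char *const argv[],
                          char *const envp[])
{
    char **sh_argv;
    int saved;

    execve(path, argv, envp);             /* only returns on failure */
    if (errno != ENOEXEC)
        return -1;                        /* e.g. ENOENT, EACCES: no fallback */

    sh_argv = build_fallback_argv(path, argv);
    if (sh_argv == NULL)
        return -1;
    execve("/bin/sh", sh_argv, envp);
    saved = errno;
    free(sh_argv);
    errno = saved;
    return -1;
}
```

With such a fallback in place, a nested #! script may appear to work even though the kernel rejected it, which is why a test for nested #! should call execve() directly rather than go through a shell.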
| [2] |
For more information about nested #! on Linux, see the
kernel patch (applied to
2.6.27.9) and especially
binfmt_script.c
which contains the important parts.
Linux allows at most BINPRM_MAX_RECURSION, that is 4, levels of nesting.
(hint to me about the change by Mantas Mikulėnas.) |
FreeBSD 4.0 introduced a comment-like handling of "#" in the arguments, but release 6.0 revoked this (see also a discussion on freebsd-arch).
MacOS X introduced comment-like handling of "#" with release 10.3 (xnu-517, Panther).
#! line:
sizeof(struct a.out/exec)".
union, which contains both struct
a.out/exec and a string with the same length (to access the #! line).
imgact_shell.c and
<sys/imgact.h> before 6.0
<machine/param.h> since 6.0
(MAXSHELLCMDLEN became PAGESIZE, which depends on the architecture),
kern/exec_script.c
(MAXINTERP in
<sys/param.h> or
PATH_MAX in
<sys/syslimits.h>,
respectively).
kern/exec_script.c (MAXINTERP in
<sys/param.h>).
load_script() in
linux/fs/binfmt_script.c (and binfmts.h).
On Linux, #! was introduced with kernel release 0.09 or 0.10
(0.08 had not implemented it yet).
In fact, the original maximum length was 1022,
see linux/fs/exec.c from Linux 0.10.
But on Linux 0.12,
this was changed to 127 (parts of a diff).
limits.h or syslimits.h
on the respective system.
Exceptions are BIG-IP4.2 (BSD/OS4.1) with 4096 and FreeBSD since 6.0 (PAGE_SIZE) with 4096 or 8192 depending on the architecture.
Minix also uses the limit of PATH_MAX characters
(255 here) but the actual limit is 257 characters,
because patch_stack() in src/mm/exec.c
first skips the "#!" with an lseek() and then reads in the rest.
#! only as a possible extension:
Shell Introduction
[...]
If the first line of a file of shell commands starts with the
characters #!, the results are unspecified.
The construct #! is reserved for implementations wishing to provide
that extension. A portable application cannot use #! as the first
line of a shell script; it might not be interpreted as a comment.
[...]
Command Search and Execution
[...]
This description requires that the shell can execute shell
scripts directly, even if the underlying system does not support
the common #! interpreter convention. That is, if file foo contains
shell commands and is executable, the following will execute foo:
./foo
There was a Working Group Resolution trying to define the mechanism.
On the other hand, speaking about "#!/bin/sh" on any Unix:
by tradition, this is a really rock-solid and portable convention
if you expect anything from the Bourne shell family and its descendants
to be called.
ENOENT.
This error can be misleading, because the shell applies it to the script called
and not to the interpreter in its #! line.
As an exception, bash-3 subsequently reads the first line itself and gives a
diagnostic concerning the interpreter:
"bash: ./script.sh: /bin/notexistent: bad interpreter: No such file or directory"
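A diagnostic of this kind can be produced by re-reading the first line of the script after execve() has failed with ENOENT. The following is a minimal sketch of that idea and not bash's actual code; both function names are invented for this example, and the line buffer of 1024 bytes is an arbitrary choice.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy the interpreter path from the #! line of
 * `script` into `buf`.  Returns 0 on success, -1 if the file cannot be
 * read, has no #! line, or the path does not fit.
 */
int shebang_interpreter(const char *script, char *buf, size_t buflen)
{
    char line[1024];
    const char *p;
    size_t len;
    char *got;
    FILE *fp = fopen(script, "r");

    if (fp == NULL)
        return -1;
    got = fgets(line, sizeof line, fp);
    fclose(fp);
    if (got == NULL || strncmp(line, "#!", 2) != 0)
        return -1;

    p = line + 2;
    p += strspn(p, " \t");                /* an optional blank after #! */
    len = strcspn(p, " \t\r\n");          /* the path ends at whitespace */
    if (len == 0 || len >= buflen)
        return -1;
    memcpy(buf, p, len);
    buf[len] = '\0';
    return 0;
}

/*
 * Returns 1 and prints a message if the interpreter named in the
 * #! line does not exist, 0 otherwise.
 */
int report_bad_interpreter(const char *script)
{
    char interp[1024];

    if (shebang_interpreter(script, interp, sizeof interp) != 0)
        return 0;
    if (access(interp, F_OK) == 0 || errno != ENOENT)
        return 0;
    fprintf(stderr, "%s: %s: bad interpreter: %s\n",
            script, interp, strerror(ENOENT));
    return 1;
}
```

A shell could call report_bad_interpreter() when execve() fails with ENOENT although the script itself exists, and fall back to the plain error message otherwise.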
#! line is too long, at least three things can happen:
E2BIG
(IRIX, SCO OpenServer)
or ENAMETOOLONG
(FreeBSD, BIG-IP4.2 (BSD/OS4.1))
ENOEXEC.
In some shells this results in a silent failure.
Some shells subsequently try to interpret the script themselves.
I used the following as program "showargs":
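#include <stdio.h>
/* K&R-style definition, so it also compiles with pre-ANSI compilers */
int main(argc, argv)
    int argc; char** argv;
{
    int i;
    /* print each argument on its own line, enclosed in double quotes */
    for (i=0; i<argc; i++)
        fprintf(stdout, "argv[%d]: \"%s\"\n", i, argv[i]);
    return(0);
}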
and a one line script named "invoker.sh" to call it, similar to this,
#!/tmp/showargs -1 -2 -3
to produce the following results (I tried them myself, but I'd like to add your results from other systems).
Typically, a result from the above would look like this:
argv[0]: "/tmp/showargs"
argv[1]: "-1 -2 -3"
argv[2]: "./invoker.sh"
... but the following table lists the variations. The meaning of the columns is explained below.
| OS (arch) | maximum length of #! line | cut-off (c), error (err) or ENOEXEC () | only the 1st arg passed on | each arg in its own argv[x] | handle # like a comment | argv[0]: invoker, instead of interpreter | not full path in argv[0] | remove trailing whitespace | convert tabulator to space | accept interpreter | search current directory | (x) allow suid (o) optional [suid] |
| 4.0BSD | 32 | [orig] | X | X | X | |||||||
| AIX 3.2.5/4.3.2 (rs6k) | 256 | X | X | X | ||||||||
| BIG-IP4.2 [big-ip] | 4096 | err | X | ? | ? | X | n/a | X | ||||
| EP/IX 2.2.1 (mips) | 1024 | X | ? | ? | ||||||||
| FreeBSD 1.1- / 4.0-4.4 | 64 | X | - / X | X | n/a | ? | ||||||
| FreeBSD 4.5- | 128 | err | X | X | X | n/a | ? | |||||
| FreeBSD 6.0- (i386/amd64) | 4096 | c | X | X | X | |||||||
| FreeBSD 6.0- (ia64/sparc64/alpha) | 8192 | c | X | X | X | |||||||
| HP-UX A.08.07/B.09.03 | 32 | X | ? | ? | ? | |||||||
| HP-UX B.10.10 | 128 | X | X | ? | ? | |||||||
| HP-UX B.10.20-11.31 | 128 | X | X | X | ||||||||
| IRIX 4.0.5 (mips) | 64 | ? | ? | X | X | X | ||||||
| IRIX 5.3/6.5 (mips) | 256 | err | X | ? | X | |||||||
| Linux 0.10 / 0.12-0.99.1 | 1022 / 127 | [early-linux] | X | ? | ||||||||
| Linux 0.99.2-2.2.26 | 127 | c | X | X | ? | |||||||
| Linux 2.4.0-2.6.27.8 / 2.6.27.9- | 127 | c | X | / X | X | |||||||
| MacOS X 10.0/.1/.2, xnu 123.5-344 | 512 | ? | ? | X | ? | ? | ? | |||||
| MacOS X 10.3, xnu 517 | 512 | X | ? | ? | X | X | ? | ? | ? | |||
| MacOS X 10.4/.5/.6, xnu 792-1504 | 512 | X | X | X | n/a | X | o | |||||
| Minix 2.0.3-3.1.1 | 257 | X | X | n/a | X | X | X | |||||
| MUNIX 3.1 (svr3.x, 68k) | 32 | X | ? | ? | ? | |||||||
| NetBSD 0.8-1.6Q / 1.6R- | 64 / 1024 | X | o | |||||||||
| OpenBSD 2.0-3.4 | 64 | X | o | |||||||||
| OSF1 V4.0B-T5.1 | 1024 | X | X | X | ||||||||
| OpenServer 5.0.6/6.0.0 [sco] | 256 | err | X | X | X | X | ? | |||||
| SINIX 5.20 (mx300/nsc) | 32 | ? | ? | |||||||||
| SunOS 4.1.4 (sparc) | 32 | c | X | ? | ? | ? | ||||||
| SunOS 5.x (sparc) | 1024 | X | X | X | X | X | ||||||
| Ultrix 4.0 (µvax 3900) | 31 | X | X | X | X | |||||||
| Ultrix 4.3 (all), 4.5 (vax3100) | 32 | c | X | ? | ? | |||||||
| Ultrix 4.5 (risc) | 80 | c | X | ? | ? | |||||||
| Unicos 9.0.2.2 (cray) | 32 | X | ? | ? | ||||||||
| Unixware 2, 7 | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | X |
| GNU Hurd cvs-20020529 | 4096 | c | X | ? | ? | ? |
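only the 1st arg passed on: argv[1] from above then contains only "-1".
each arg in its own argv[x]: argv[1]: "-1", argv[2]: "-2", etc.
argv[0]: invoker, instead of interpreter: argv[0] doesn't contain "/tmp/showargs" but "./invoker.sh".
not full path in argv[0]: argv[0] contains the basename of the called program instead of its full path.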
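The typical result and the "each arg in its own argv[x]" variation can be modelled with a small parser. This is a sketch for illustration only and is not taken from any kernel; the function name build_script_argv() and its split_args switch are assumptions of this example, and variations such as length limits, comment handling or removal of trailing whitespace are left out.

```c
#include <string.h>

/*
 * Hypothetical model: turn a #! line into the argv[] handed to the
 * interpreter.  `line` is modified in place.  With split_args == 0 the
 * rest of the line becomes one single argument (the typical case),
 * otherwise it is split at blanks and tabs.  `argv` must have room for
 * `max` pointers.  Returns the new argc, or -1 on error.
 */
int build_script_argv(char *line, char *script, int split_args,
                      char *argv[], int max)
{
    int argc = 0;
    char *p;

    if (max < 4 || strncmp(line, "#!", 2) != 0)
        return -1;
    line[strcspn(line, "\n")] = '\0';     /* only the first line counts */

    p = line + 2;
    p += strspn(p, " \t");                /* a blank after #! is accepted */
    if (*p == '\0')
        return -1;

    argv[argc++] = p;                     /* argv[0]: the interpreter */
    p += strcspn(p, " \t");
    if (*p != '\0') {
        *p++ = '\0';
        p += strspn(p, " \t");
    }

    if (*p != '\0' && !split_args) {
        argv[argc++] = p;                 /* whole rest as one argument */
    } else {
        while (*p != '\0') {              /* one argv[] entry per word */
            if (argc >= max - 2)
                return -1;                /* keep room for script + NULL */
            argv[argc++] = p;
            p += strcspn(p, " \t");
            if (*p != '\0') {
                *p++ = '\0';
                p += strspn(p, " \t");
            }
        }
    }

    argv[argc++] = script;                /* the script path comes last */
    argv[argc] = NULL;
    return argc;
}
```

For the line "#!/tmp/showargs -1 -2 -3" and the script "./invoker.sh", the model yields three entries with split_args == 0 and five entries with split_args == 1, matching the two behaviours listed in the table.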
| [orig] | 4.0BSD and 386BSD-0.1 don't hand over any argument at all.
The called interpreter only receives argv[0] with its own path and argv[1] with the script. |
| [big-ip] | This BIG-IP 4.2 (vendor is F5) is based on BSDi BSD/OS 4.1,
probably even with very few modifications:
The tools contain the string "BSD/OS 4.1" and there's also a kernel /bsd-generic, which contains "BSDi BSD/OS 4.1". I had no compiler available on this system, thus some tests are pending. |
| [sco] | John H. DuBois told me that
#! was introduced in SCO UNIX 3.2v4.0, but was disabled by
default. If you wanted to use it, it had to be enabled by setting
hashplingenable in kernel/space.c ("hashpling" because it was
implemented by programmers in Britain). It was apparently enabled by
default in 3.2v4.2, but even then there were no #! scripts shipped with
the OS as a customer might disable it. The first #! scripts (tcl)
were shipped in 3.2v5.0 then.
|
| [suid] |
|
| [early-linux] | On Linux 0.10 until 0.99.1, argv[0] contains both the interpreter and the arguments:
argv[0]: "/tmp/showargs -1 -2 -3"
|
And why shebang? In music, '#' means sharp. So just
shorten #! to sharp-bang. Or it might be derived from "shell
bang". All this probably under the influence of the American slang
idiom "the whole shebang" (everything, the works, everything
involved in what is under consideration).
See also the jargon dictionary or Merriam-Webster's for the slang idiom.
Sometimes it's also called hash-bang, pound-bang, sha-bang/shabang, hash-exclam, or hash-pling (British, isn't it?).
<http://www.in-ulm.de/~mascheck/various/shebang/>