#! magic - details about the shebang mechanism
on various Unix flavours
2009-12-07 (see recent changes)
Here you'll find
See an old mail from Dennis Ritchie introducing the new feature,
quoted in 4.0BSD, /usr/src/sys/newsys/sys1.c.
So this mechanism was invented between Version 7 and
Version 8. It was then available in 4BSD but not activated per
default until 4.2BSD.
(All this pointed out by Gunnar Ritter in
<3B5B0BA4.XY112IX2@bigfoot.de>
in de.comp.os.unix.shell.)
The paragraph
"3.16)
Why do some scripts start with #! ... ?"
(local copy),
emphasizes the history concerning shells, not the kernel.
That document is wrong about two details:
There is a rumor that a very few, very special, earlier
Unix versions (particularly 4.2BSD derivatives) require you to
separate the "#!" from the following path with a blank.
You may also read that (allegedly) such a kernel parses
"#! /" as a 32-bit (long) magic.
But it turns out that it is virtually impossible to find
a Unix which actually required this.
4.2BSD in fact doesn't require it, although
previous versions
of the GNU autoconf tutorial wrongly claimed this
("10. Portable Shell Programming", corrected with release 2.64, 2009-07-26).
Instead, see (again) 4.0BSD,
/usr/src/sys/newsys/sys1.c from above, and
the first regular occurrence in
4.2BSD,
/usr/src/sys/sys/kern_exec.c.
The source accepted a blank, but never required it.
All this pointed out by Gunnar Ritter in
<3B5B0BA4.XY112IX2@bigfoot.de>
(and thanks to the new Caldera license, the code can be cited here now.)
Instead, the origin of this myth "of the required blank" might be a particular release of 4.1 BSD: There is a manpage in a "4.1.snap" snapshot of 4.1BSD on the CSRG CDs, /usr/man/man2/exec.2 (4/1/81), where a space/tab after the "#!" is mentioned as mandatory. However, this is not true: the source itself remained unchanged. (Hint to the existence of such a manpage from Bruce Barnett in <ae3m9l$rti$0@208.20.133.66>).
It's not clear whether this is a bug in documentation or if Berkeley planned to modify the BSD source but eventually did not. DYNIX is mentioned in the autoconf documentation, too. It's also unclear if this variant implemented it in a few releases (perhaps following the abovementioned manual page). However, later releases did not implement it according to Usenet discussions.
I asked David MacKenzie, the author of the autoconf
documentation, about the actual origin of this note.
But unfortunately neither the reporting author nor
the very system are recorded anymore.
Even an intensive search of Usenet archives didn't reveal any further hints to me.
I have found no evidence yet that there is an implementation which forbids a blank after #!.
env(1) is often used with the #!-mechanism to start
an interpreter, which then only needs to be somewhere
in your PATH, e.g. "#!/usr/bin/env perl".
However, the location of env(1) may vary.
Free-, Net-, OpenBSD and some Linux distributions (Debian)
only come with /usr/bin/env.
On the other hand, there's only /bin/env at least on
OpenServer 5.0.6 and Unicos 9.0.2.
(On some other Linux distributions (Redhat) it's located in /bin and there's
a symbolic link from /usr/bin/env to it.)
The env mechanism greatly increases convenience,
but cannot strictly guarantee the "portability" of a script.
In practice, env must not be a script, because the #! mechanism usually accepts only binary executables (except on a few implementations like UWIN, Minix and Linux 2.6.27.9 ff.).
FreeBSD before 6.0 (change pointed out to me by Akinori Musha),
BIG-IP4.2 (BSD/OS4.1) and Minix split up the
arguments like a shell to fill up argv[].
execve() implementation due to the license issue.)
| [1] |
NetBSD already implements it in the first cvs entry for
exec_script.c
(01/94), some time before release 1.0.
The file descriptor filesystem ("fdescfs") had been added with release 0.8 (04/93). NetBSD was influenced by 386BSD, but I couldn't find it there (including patchkit 0.2.4, 06/93). FreeBSD, which is a direct descendant of 386BSD, doesn't implement it either. OpenBSD forked off from NetBSD later (10/95) and thus implements it like NetBSD. |
Set user id support is implemented by means of the fd filesystem for instance on:
-p. Without this flag, the EUID is set back
to the UID if different.
Most probably there isn't any Bell-Labs- or Berkeley-derived Unix
that accepts "nested" #!.
However, Minix, the UWIN environment, and Linux since
2.6.27.9 [2] accept this.
Be careful not to confuse whether the kernel accepts an interpreted script after #!,
or if your shell silently tries to take over if the kernel refused it or
returned with an error for another reason:
| [2] |
For more information about nested #! on Linux, see the
kernel patch (applied to
2.6.27.9) and especially
binfmt_script.c
which contains the important parts.
Linux allows at most BINPRM_MAX_RECURSION, that is 4, levels of nesting.
(Thanks to Mantas Mikulėnas for informing me that Linux changed here.) |
FreeBSD 4.0 introduced a comment-like handling of "#" in the arguments.
Release 6.0 revoked this (see also a discussion
on freebsd-arch).
sizeof(struct a.out/exec)".
The reason is a union, which contains both struct
a.out/exec and a string with the same length (to access the #! line).
imgact_shell.c and
<sys/imgact.h> before 6.0
<machine/param.h> since 6.0
(MAXSHELLCMDLEN became PAGESIZE, which depends on the architecture),
kern/exec_script.c
(MAXINTERP in
<sys/param.h> or
PATH_MAX in
<sys/syslimits.h>,
respectively).
kern/exec_script.c (MAXINTERP in
<sys/param.h>).
load_script() in
linux/fs/binfmt_script.c
(link to lxr.linux.no).
limits.h or syslimits.h
on the respective system.
Exceptions are BIG-IP4.2 (BSD/OS4.1) with 4096 and FreeBSD since 6.0 (PAGE_SIZE) with 4096 or 8192 depending on the architecture.
Minix also uses the limit of PATH_MAX characters
(255 here) but the actual limit is 257 characters, because
patch_stack() in src/mm/exec.c
first skips the "#!" with an lseek() and then reads in the rest.
#! only as a possible extension:
Shell Introduction
[...]
If the first line of a file of shell commands starts with the
characters #!, the results are unspecified.
The construct #! is reserved for implementations wishing to provide
that extension. A portable application cannot use #! as the first
line of a shell script; it might not be interpreted as a comment.
[...]
Command Search and Execution
[...]
This description requires that the shell can execute shell
scripts directly, even if the underlying system does not support
the common #! interpreter convention. That is, if file foo contains
shell commands and is executable, the following will execute foo:
./foo
There was a Working Group Resolution trying to define the mechanism.
On the other hand, speaking about "#!/bin/sh" on any Unix:
This is a really rocksolid and portable convention by tradition,
if you expect anything from the Bourne shell family and its descendants
to be called.
ENOENT.
This error can be misleading, because the shell applies it to the script called
and not to the interpreter in its #! line.
As an exception, bash-3 subsequently reads the first line itself and gives a
diagnostic concerning the interpreter
"bash: ./script.sh: /bin/notexistent: bad interpreter: No such file or directory"
E2BIG
(IRIX, SCO OpenServer)
or ENAMETOOLONG
(FreeBSD, BIG-IP4.2 (BSD/OS4.1))
ENOEXEC.
In some shells this results in a silent failure.
Some shells subsequently try to interpret the script themselves.
I used the following as program "showargs":
#include <stdio.h>

/* Print every argument exactly as received, one per line. */
int main(int argc, char **argv) {
    int i;
    for (i = 0; i < argc; i++)
        fprintf(stdout, "argv[%d]: \"%s\"\n", i, argv[i]);
    return 0;
}
and a one line script named "invoker.sh" to call it, similar to this,
#!/tmp/showargs -1 -2 -3
to produce the following results (I tried these myself, but I'd like to add your results from other systems).
Typically, a result from the above would look like this:
argv[0]: "/tmp/showargs"
argv[1]: "-1 -2 -3"
argv[2]: "./invoker.sh"
... but the following table lists the variations. The meaning of the columns is explained below.
| OS (arch) | maximum length of #! line | cut-off (c), error (err) or ENOEXEC () | only the 1st arg passed on | each arg in its own argv[x] | handle # like a comment | argv[0]: invoker, instead of interpreter | not full path in argv[0] | remove trailing whitespace | convert tabulator to space | accept interpreter | search current directory | (x) allow suid (o) optional [suid] |
| AIX 3.2.5/4.3.2 (rs6k) | 256 | X | X | X | ||||||||
| BIG-IP4.2 [big-ip] | 4096 | err | X | ? | ? | X | ? | X | ||||
| EP/IX 2.2.1 (mips) | 1024 | X | ? | ? | ||||||||
| FreeBSD 1.1- / 4.0-4.4 | 64 | X | - / X | X | ? | ? | ||||||
| FreeBSD 4.5- | 128 | err | X | X | X | ? | ? | |||||
| FreeBSD 6.0- (i386/amd64) | 4096 | c | X | X | X | |||||||
| FreeBSD 6.0- (ia64/sparc64/alpha) | 8192 | c | X | X | X | |||||||
| HP-UX A.08.07/B.09.03 | 32 | X | ? | ? | ? | |||||||
| HP-UX B.10.10 | 128 | X | X | ? | ? | |||||||
| HP-UX B.10.20-11.31 | 128 | X | X | X | ||||||||
| IRIX 4.0.5 (mips) | 64 | ? | ? | X | X | X | ||||||
| IRIX 5.3/6.5 (mips) | 256 | err | X | ? | X | |||||||
| Linux 2.2.9-2.6.27.8 | 127 | c | X | X | X | |||||||
| Linux 2.6.27.9- | 127 | c | X | X | X | X | ||||||
| Minix 2.0.3-3.1.1 | 257 | X | X | ? | X | ? | ? | |||||
| MUNIX 3.1 (svr3.x, 68k) | 32 | X | ? | ? | ? | |||||||
| NetBSD 0.8-1.6Q / 1.6R- | 64 / 1024 | ? | ? | o | ||||||||
| OpenBSD 2.0-3.4 | 64 | ? | ? | o | ||||||||
| OSF1 V4.0B-T5.1 | 1024 | X | ? | ? | ? | |||||||
| OpenServer 5.0.6/6.0.0 [sco] | 256 | err | X | X | X | X | ? [sco-suid] | |||||
| SINIX 5.20 (mx300/nsc) | 32 | ? | ? | |||||||||
| SunOS 4.1.4 (sparc) | 32 | c | X | ? | ? | ? | ||||||
| SunOS 5.x (sparc) | 1024 | X | X | X | X | X | ||||||
| Ultrix 4.3 (all), 4.5 (vax3100) | 32 | c | X | ? | ? | |||||||
| Ultrix 4.5 (risc) | 80 | c | X | ? | ? | |||||||
| Unicos 9.0.2.2 (cray) | 32 | X | ? | ? | ||||||||
| Unixware 2, 7 | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | X |
| UWIN (i386) | 255 | c | ? | X | ? | X | ? | ? | ||||
| GNU Hurd cvs-20020529 | 4096 | c | ? | ? | X | ? | ? | ? |
argv[1] from above contains
only "-1" then
argv[1]:"-1",
argv[2]:"-2", etc.
argv[0] doesn't contain
"/tmp/showargs" but "./invoker.sh"
argv[0] contains
the basename of the called program instead of its full path.
| [suid] | On Net- and OpenBSD the kernel option SETUIDSCRIPTS must be activated to allow for the setuid/gid-bit with the #! mechanism. |
| [sco-suid] | The SCO OpenServer 6.0 documentation is ambiguous about whether setuid scripts are supported:
"SUID, SGID, and sticky bit clearing on writes" (via Security online docs/Maintaining System Security) states that the suid/sgid bits don't work on shell scripts (not mentioning the #! mechanism), but chmod(1) states that they do, but only if the #! convention is used. |
| [big-ip] | This BIG-IP 4.2 (vendor is F5) is based on BSDi BSD/OS 4.1,
probably even with very few modifications:
The tools contain the string "BSD/OS 4.1" and there's also a kernel /bsd-generic, which contains "BSDi BSD/OS 4.1". I had no compiler available on this system, thus some tests are pending. |
| [sco] | John H. DuBois told me that #! was introduced in SCO UNIX 3.2v4.0, but was disabled by default. If you wanted to use it, it had to be enabled by setting hashplingenable in kernel/space.c ("hashpling" because it was implemented by programmers in Britain). It was apparently enabled by default in 3.2v4.2, but even then there were no #! scripts shipped with the OS as a customer might disable it. The first #! scripts (tcl) were shipped in 3.2v5.0 then. |
And why shebang? In music, '#' means sharp. So just shorten '#!' to sharp-bang. Or it might be derived from "shell bang". All this probably under the influence of the American slang idiom "the whole shebang" (everything, the works, everything involved in what is under consideration). See also the jargon dictionary or Merriam-Webster's for the slang idiom.
Sometimes it's also called hashbang, pound-bang, sha-bang/shabang, hashexclam, or hashpling (very British, isn't it?).
<http://www.in-ulm.de/~mascheck/various/shebang/>