From: Ted Timar
Message-ID:
Newsgroups: comp.unix.questions,comp.unix.shell,comp.answers,news.answers
Subject: Unix - Frequently Asked Questions (1/7) [Frequent posting]
Date: 29 Feb 2000 12:49:02 GMT

[...]

Version: $Id: part1,v 2.9 1996/06/11 13:07:56 tmatimar Exp $

[...]

Subject: Why do some scripts start with #! ... ?
>From: chip@chinacat.unicom.com (Chip Rosenthal)
Date: Tue, 14 Jul 1992 21:31:54 GMT

3.16) Why do some scripts start with #! ... ?

Chip Rosenthal has answered a closely related question in
comp.unix.xenix in the past.

I think what confuses people is that there exist two different
mechanisms, both spelled with the letter `#'. They both solve the
same problem over a very restricted set of cases -- but they are
none the less different.

Some background. When the UNIX kernel goes to run a program (one
of the exec() family of system calls), it takes a peek at the
first 16 bits of the file. Those 16 bits are called a `magic
number'. First, the magic number prevents the kernel from doing
something silly like trying to execute your customer database
file. If the kernel does not recognize the magic number then it
complains with an ENOEXEC error. It will execute the program only
if the magic number is recognizable.

Second, as time went on and different executable file formats
were introduced, the magic number not only told the kernel *if*
it could execute the file, but also *how* to execute the file.
For example, if you compile a program on an SCO XENIX/386 system
and carry the binary over to a SysV/386 UNIX system, the kernel
will recognize the magic number and say `Aha! This is an x.out
binary!' and configure itself to run with XENIX compatible system
calls.

Note that the kernel can only run binary executable images. So
how, you might ask, do scripts get run? After all, I can type
`my.script' at a shell prompt and I don't get an ENOEXEC error.
Script execution is done not by the kernel, but by the shell.
The code in the shell might look something like:

        /* try to run the program */
        execl(program, basename(program), (char *)0);

        /* the exec failed -- maybe it is a shell script? */
        if (errno == ENOEXEC)
            execl ("/bin/sh", "sh", "-c", program, (char *)0);

        /* oh no mr bill!! */
        perror(program);
        return -1;

(This example is highly simplified. There is a lot more involved,
but this illustrates the point I'm trying to make.)

If execl() is successful in starting the program then the code
beyond the execl() is never executed. In this example, if we can
execl() the `program' then none of the stuff beyond it is run.
Instead the system is off running the binary `program'.

If, however, the first execl() failed then this hypothetical
shell looks at why it failed. If the execl() failed because
`program' was not recognized as a binary executable, then the
shell tries to run it as a shell script.

The Berkeley folks had a neat idea to extend how the kernel
starts up programs. They hacked the kernel to recognize the magic
number `#!'. (Magic numbers are 16 bits, and two 8-bit characters
make 16 bits, right?) When the `#!' magic number was recognized,
the kernel would read in the rest of the line and treat it as a
command to run upon the contents of the file. With this hack you
could now do things like:

        #! /bin/sh

        #! /bin/csh

        #! /bin/awk -F:

This hack has existed solely in the Berkeley world, and has
migrated to USG kernels as part of System V Release 4. Prior to
V.4, unless the vendor did some special value added, the kernel
does not have the capability of doing anything other than loading
and starting a binary executable image.

Now, let's rewind a few years, to the time when more and more
folks running USG based unices were saying `/bin/sh sucks as an
interactive user interface! I want csh!'. Several vendors did
some value added magic and put csh in their distribution, even
though csh was not a part of the USG UNIX distribution.

This, however, presented a problem.
Let's say you switch your login shell to /bin/csh. Let's further
suppose that you are a cretin and insist upon programming csh
scripts. You'd certainly want to be able to type `my.script' and
get it run, even though it is a csh script. Instead of pumping it
through /bin/sh, you want the script to be started by running:

        execl ("/bin/csh", "csh", "-c", "my.script", (char *)0);

But what about all those existing scripts -- some of which are
part of the system distribution? If they started getting run by
csh then things would break. So you needed a way to run some
scripts through csh, and others through sh.

The solution introduced was to hack csh to take a look at the
first character of the script you are trying to run. If it was a
`#' then csh would try to run the script through /bin/csh,
otherwise it would run the script through /bin/sh. The example
code from above might now look something like:

        /* try to run the program */
        execl(program, basename(program), (char *)0);

        /* the exec failed -- maybe it is a shell script? */
        if (errno == ENOEXEC && (fp = fopen(program, "r")) != NULL) {
            /* peek at the first character of the script */
            i = getc(fp);
            (void) fclose(fp);
            if (i == '#')
                execl ("/bin/csh", "csh", "-c", program, (char *)0);
            else
                execl ("/bin/sh", "sh", "-c", program, (char *)0);
        }

        /* oh no mr bill!! */
        perror(program);
        return -1;

Two important points. First, this is a `csh' hack. Nothing has
been changed in the kernel and nothing has been changed in the
other shells. If you try to execl() a script, whether or not it
begins with `#', you will still get an ENOEXEC failure. If you
try to run a script beginning with `#' from something other than
csh (e.g. /bin/sh), then it will be run by sh and not csh.

Second, the magic is that either the script begins with `#' or it
doesn't begin with `#'. What makes stuff like `:' and `: /bin/sh'
at the front of a script magic is the simple fact that they are
not `#'.
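
As a rough illustration of the two mechanisms described so far, the
sketch below puts the csh `#' test next to a simplified version of
what a kernel with `#!' support does with the first line. The
function names csh_hack_shell and shebang_interpreter are made up
for this example, and the parsing is an assumption: real kernels
differ in how they treat blanks, arguments and line length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the shell that the csh `#' hack would pick,
 * given the first character of a script that failed with ENOEXEC. */
const char *csh_hack_shell(int first_char)
{
    return first_char == '#' ? "/bin/csh" : "/bin/sh";
}

/* Hypothetical helper: a simplified stand-in for kernel `#!' handling.
 * If `line' starts with "#!", copy the interpreter path into `interp'
 * and return 1.  Otherwise (or if the path is empty or does not fit
 * in `size' bytes) return 0.  Interpreter arguments are ignored. */
int shebang_interpreter(const char *line, char *interp, size_t size)
{
    size_t len;

    assert(line != NULL && interp != NULL && size > 0);

    if (line[0] != '#' || line[1] != '!')
        return 0;                     /* not the `#!' magic number */

    line += 2;
    line += strspn(line, " \t");      /* skip blanks after "#!" */
    len = strcspn(line, " \t\n");     /* path ends at blank or newline */
    if (len == 0 || len >= size)
        return 0;

    memcpy(interp, line, len);
    interp[len] = '\0';
    return 1;
}
```

Given a first line of `#! /bin/sh', shebang_interpreter() reports
/bin/sh while csh_hack_shell('#') reports /bin/csh. Without kernel
support, nothing after the first character is consulted: the choice
depends only on whether that character is `#'.
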
Therefore, all of the following are identical at the start of a
script:

        :

        : /bin/sh

                        <--- a blank line

        : /usr/games/rogue

        echo "Gee...I wonder what shell I am running under???"

In all these cases, all shells will try to run the script with
/bin/sh.

Similarly, all of the following are identical at the start of a
script:

        #

        # /bin/csh

        #! /bin/csh

        #! /bin/sh

        # Gee...I wonder what shell I am running under???

All of these start with a `#'. This means that the script will be
run by csh *only* if you try to start it from csh, otherwise it
will be run by /bin/sh.

(Note: if you are running ksh, substitute `ksh' for `sh' in the
above. The Korn shell is theoretically compatible with Bourne
shell, so it tries to run these scripts itself. Your mileage may
vary on some of the other available shells such as zsh, bash,
etc.)

Obviously, if you've got support for `#!' in the kernel then the
`#' hack becomes superfluous. In fact, it can be dangerous
because it creates confusion over what should happen with
`#! /bin/sh'.

The `#!' handling is becoming more and more prevalent. System V
Release 4 picks up a number of the Berkeley features, including
this. Some System V Release 3.2 vendors are hacking in some of
the more visible V.4 features such as this and trying to convince
you this is sufficient and you don't need things like real,
working streams or dynamically adjustable kernel parameters.

XENIX does not support `#!'. The XENIX /bin/csh does have the `#'
hack. Support for `#!' in XENIX would be nice, but I wouldn't
hold my breath waiting for it.

------------------------------