From: Ted Timar
Message-ID: <unix-faq/faq/part1_951828526@rtfm.mit.edu>
Newsgroups: comp.unix.questions,comp.unix.shell,comp.answers,news.answers
Subject: Unix - Frequently Asked Questions (1/7) [Frequent posting]
Date: 29 Feb 2000 12:49:02 GMT

[...]

Version: $Id: part1,v 2.9 1996/06/11 13:07:56 tmatimar Exp $

[...]


Subject: Why do some scripts start with #! ... ?
>From: chip@@chinacat.unicom.com (Chip Rosenthal)
Date: Tue, 14 Jul 1992 21:31:54 GMT

3.16) Why do some scripts start with #! ... ?

      Chip Rosenthal has answered a closely related question in
      comp.unix.xenix in the past.

      I think what confuses people is that there exist two different
      mechanisms, both spelled with the letter `#'.  They both solve the
      same problem over a very restricted set of cases -- but they are
      nonetheless different.

      Some background.  When the UNIX kernel goes to run a program (one
      of the exec() family of system calls), it takes a peek at the
      first 16 bits of the file.  Those 16 bits are called a `magic
      number'.  First, the magic number prevents the kernel from doing
      something silly like trying to execute your customer database
      file.  If the kernel does not recognize the magic number then it
      complains with an ENOEXEC error.  It will execute the program only
      if the magic number is recognizable.

      Second, as time went on and different executable file formats were
      introduced, the magic number not only told the kernel *if* it
      could execute the file, but also *how* to execute the file.  For
      example, if you compile a program on an SCO XENIX/386 system and
      carry the binary over to a SysV/386 UNIX system, the kernel will
      recognize the magic number and say `Aha!  This is an x.out
      binary!' and configure itself to run with XENIX compatible system
      calls.
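
      To make the `first 16 bits' concrete, here is a minimal sketch
      that reads the first two bytes of a file, which is the same
      peek the kernel takes.  The name read_magic() is made up for
      this illustration; it is not a system interface, and a real
      kernel does this internally as part of exec().

```c
#include <stdio.h>

/* Read the first two bytes (16 bits) of `path' into magic[0..1].
 * Returns 0 on success, -1 if the file cannot be opened or is
 * shorter than two bytes.  read_magic() is a hypothetical helper
 * written for this sketch. */
int read_magic(const char *path, unsigned char magic[2])
{
    FILE *fp = fopen(path, "rb");
    size_t n;

    if (fp == NULL)
        return -1;
    n = fread(magic, 1, 2, fp);     /* the 16-bit magic number */
    fclose(fp);
    return (n == 2) ? 0 : -1;
}
```

      What the two bytes mean depends entirely on the system: each
      kernel has its own list of magic numbers it is willing to run.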

      Note that the kernel can only run binary executable images.  So
      how, you might ask, do scripts get run?  After all, I can type
      `my.script' at a shell prompt and I don't get an ENOEXEC error.
      Script execution is done not by the kernel, but by the shell.  The
      code in the shell might look something like:

        /* try to run the program */
        execl(program, basename(program), (char *)0);

        /* the exec failed -- maybe it is a shell script? */
        if (errno == ENOEXEC)
            execl ("/bin/sh", "sh", "-c", program, (char *)0);

        /* oh no mr bill!! */
        perror(program);
        return -1;

            (This example is highly simplified.  There is a lot
            more involved, but this illustrates the point I'm
            trying to make.)

      If execl() is successful in starting the program then the code
      beyond the execl() is never executed.  In this example, if we can
      execl() the `program' then none of the stuff beyond it is run.
      Instead the system is off running the binary `program'.

      If, however, the first execl() failed then this hypothetical shell
      looks at why it failed.  If the execl() failed because `program'
      was not recognized as a binary executable, then the shell tries to
      run it as a shell script.
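
      If you want to watch the exec fail for yourself, something like
      the following sketch should do.  It forks a child, tries the
      execl() there, and hands the child's errno back to the parent.
      The helper name exec_errno() is hypothetical, and passing errno
      through the exit status is just a convenience of this sketch
      (it assumes the errno value fits in the 8-bit exit status).

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that tries execl() on `path'.  If the exec fails, the
 * child exits with errno as its status and the parent returns that
 * value.  If the exec succeeds, the return value is whatever the new
 * program exits with, so a 0 here is ambiguous; this is only a
 * demonstration.  Returns -1 if fork() or waitpid() fails. */
int exec_errno(const char *path)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execl(path, path, (char *)0);
        /* reached only if execl() failed */
        _exit(errno);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

      Pointed at an executable text file that the kernel does not
      recognize, exec_errno() should come back with ENOEXEC, which is
      exactly the case the hypothetical shell above checks for.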

      The Berkeley folks had a neat idea to extend how the kernel starts
      up programs.  They hacked the kernel to recognize the magic number
      `#!'.  (Magic numbers are 16 bits, and two 8-bit characters make
      16 bits, right?)  When the `#!' magic number was recognized, the
      kernel would read in the rest of the line and treat it as a
      command to run upon the contents of the file.  With this hack you
      could now do things like:

        #! /bin/sh

        #! /bin/csh

        #! /bin/awk -F:
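
      As a rough sketch of what `read in the rest of the line' might
      amount to, the function below splits a `#!' line into an
      interpreter path and one optional argument.  The name
      parse_hashbang() is hypothetical, and treating everything after
      the interpreter as a single argument is an assumption; real
      kernels differ in the details (line length limits, argument
      splitting), so take this as an illustration only.

```c
#include <string.h>

/* Hypothetical helper: split a `#!' line into an interpreter path and
 * one optional argument.  Returns 0 on success, -1 if the line does
 * not begin with `#!', names no interpreter, or a field does not fit
 * in the buffer provided. */
int parse_hashbang(const char *line, char *interp, size_t isz,
                   char *arg, size_t asz)
{
    const char *p, *end;
    size_t len;

    if (line[0] != '#' || line[1] != '!')
        return -1;                      /* wrong magic number */
    p = line + 2;
    p += strspn(p, " \t");              /* skip blanks after `#!' */
    len = strcspn(p, " \t\n");          /* interpreter ends at a blank */
    if (len == 0 || len >= isz)
        return -1;
    memcpy(interp, p, len);
    interp[len] = '\0';

    p += len;
    p += strspn(p, " \t");
    end = p + strcspn(p, "\n");         /* argument runs to end of line */
    while (end > p && (end[-1] == ' ' || end[-1] == '\t'))
        end--;                          /* trim trailing blanks */
    len = (size_t)(end - p);
    if (len >= asz)
        return -1;
    memcpy(arg, p, len);
    arg[len] = '\0';                    /* empty string if no argument */
    return 0;
}
```

      For the line `#! /bin/awk -F:' this yields the interpreter
      /bin/awk and the argument -F:, and the kernel then runs that
      interpreter with the script's own path tacked on the end.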

      This hack long existed solely in the Berkeley world, and has
      since migrated to USG kernels as part of System V Release 4.
      Prior to V.4, unless the vendor did some special value added,
      the kernel did not have the capability of doing anything other
      than loading and starting a binary executable image.

      Now, let's rewind a few years, to the time when more and more
      folks running USG based unices were saying `/bin/sh sucks as an
      interactive user interface!  I want csh!'.  Several vendors did
      some value added magic and put csh in their distribution, even
      though csh was not a part of the USG UNIX distribution.

      This, however, presented a problem.  Let's say you switch your
      login shell to /bin/csh.  Let's further suppose that you are a
      cretin and insist upon programming csh scripts.  You'd certainly
      want to be able to type `my.script' and get it run, even though it
      is a csh script.  Instead of pumping it through /bin/sh, you want
      the script to be started by running:

        execl ("/bin/csh", "csh", "-c", "my.script", (char *)0);

      But what about all those existing scripts -- some of which are
      part of the system distribution?  If they started getting run by
      csh then things would break.  So you needed a way to run some
      scripts through csh, and others through sh.

      The solution introduced was to hack csh to take a look at the
      first character of the script you are trying to run.  If it was a
      `#' then csh would try to run the script through /bin/csh,
      otherwise it would run the script through /bin/sh.  The example
      code from above might now look something like:

        /* try to run the program */
        execl(program, basename(program), (char *)0);

        /* the exec failed -- maybe it is a shell script? */
        if (errno == ENOEXEC && (fp = fopen(program, "r")) != NULL) {
            i = getc(fp);       /* first character picks the shell */
            (void) fclose(fp);
            if (i == '#')
                execl ("/bin/csh", "csh", "-c", program, (char *)0);
            else
                execl ("/bin/sh", "sh", "-c", program, (char *)0);
        }

        /* oh no mr bill!! */
        perror(program);
        return -1;

      Two important points.  First, this is a `csh' hack.  Nothing has
      been changed in the kernel and nothing has been changed in the
      other shells.  If you try to execl() a script, whether or not it
      begins with `#', you will still get an ENOEXEC failure.  If you
      try to run a script beginning with `#' from something other than
      csh (e.g. /bin/sh), then it will be run by sh and not csh.

      Second, the magic is that either the script begins with `#' or it
      doesn't begin with `#'.  What makes stuff like `:' and `: /bin/sh'
      at the front of a script magic is the simple fact that they are
      not `#'.  Therefore, all of the following are identical at the
      start of a script:

        :

        : /bin/sh

                        <--- a blank line

        : /usr/games/rogue

        echo "Gee...I wonder what shell I am running under???"

      In all these cases, all shells will try to run the script with /bin/sh.

      Similarly, all of the following are identical at the start of a script:

        #

        # /bin/csh

        #! /bin/csh

        #! /bin/sh

        # Gee...I wonder what shell I am running under???

      All of these start with a `#'.  This means that the script will be
      run by csh *only* if you try to start it from csh, otherwise it
      will be run by /bin/sh.
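
      The two lists above boil down to a one-character test.  As a
      minimal sketch (csh_hack_shell() is a hypothetical name, and it
      takes the script text as a string rather than opening a file,
      purely to keep the example short), the decision csh makes when
      there is no kernel `#!' support looks like this:

```c
/* Hypothetical helper mirroring the csh `#' hack: given the text of
 * a script, return the shell that csh would hand it to.  Only the
 * first character matters; the rest of the line is never examined. */
const char *csh_hack_shell(const char *script_text)
{
    return (script_text[0] == '#') ? "/bin/csh" : "/bin/sh";
}
```

      Feed it `:', `: /usr/games/rogue' or a blank line and you get
      /bin/sh; feed it `#', `# /bin/csh' or even `#! /bin/sh' and you
      get /bin/csh.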

            (Note:  if you are running ksh, substitute `ksh' for
            `sh' in the above.  The Korn shell is theoretically
            compatible with Bourne shell, so it tries to run these
            scripts itself.  Your mileage may vary on some of the
            other available shells such as zsh, bash, etc.)

      Obviously, if you've got support for `#!' in the kernel then the
      `#' hack becomes superfluous.  In fact, it can be dangerous
      because it creates confusion over what should happen with `#! /bin/sh'.

      The `#!' handling is becoming more and more prevalent.  System V
      Release 4 picks up a number of the Berkeley features, including
      this.  Some System V Release 3.2 vendors are hacking in some of
      the more visible V.4 features such as this and trying to convince
      you this is sufficient and you don't need things like real,
      working streams or dynamically adjustable kernel parameters.

      XENIX does not support `#!'.  The XENIX /bin/csh does have the `#'
      hack.  Support for `#!' in XENIX would be nice, but I wouldn't
      hold my breath waiting for it.

------------------------------