Article by David Korn

The following is taken from "ksh - An Extensible High Level Language" by David G. Korn.

It is the second section of that paper. The original link has expired; instead see www.usenix.org/conference/usenix-1994-very-high-level-languages-symposium/ksh-extensible-high-level-language

1. INTRODUCTION

[...]

2. HISTORY

The original UNIX system shell was a simple program written by Ken
Thompson at Bell Laboratories, as the interface to the new UNIX
operating system. It allowed the user to invoke single commands, or
to connect commands together by having the output of one command pass
through a special file called a pipe and become input for the next
command. The Thompson shell was designed as a command interpreter, not
a programming language. While one could put a sequence of commands in
a file and run them, i.e., create a shell script, there was no support
for traditional language facilities such as flow control, variables,
and functions. When the need for some flow control surfaced, the
commands /bin/if and /bin/goto were created as separate commands. The
/bin/if command evaluated its first argument and, if true, executed the
remainder of the line. The /bin/goto command read the script from its
standard input, looked for the given label, and set the seek position at
that location. When the shell returned from invoking /bin/goto, it read
the next line from standard input from the location set by /bin/goto.

Unlike most earlier systems, the Thompson shell command language was a
user-level program that did not have any special privileges. This meant
that new shells could be created by any user, which led to a succession
of improved shells. In the mid-1970s, John Mashey at Bell Laboratories
extended the Thompson shell by adding commands so that it could be used
as a primitive programming language. He made commands such as if and
goto built-ins for improved performance, and also added shell variables.

At the same time, Steve Bourne at Bell Laboratories wrote a version of
the shell which included programming language techniques. A rich set
of structured flow control primitives was part of the language; the
shell processed commands by building a parse tree and then evaluating
the tree. Because of the rich flow control primitives, there was no
need for a goto command. Bourne introduced the "here-document" whereby
the contents of a file are inserted directly into the script. One
of the often overlooked contributions of the Bourne shell is that it
helped to eliminate the distinction between programs and shell scripts.
Earlier versions of the shell read input from standard input, making it
impossible to use shell scripts as part of a pipeline.
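
As a minimal sketch of the here-document idea described above, the
following uses POSIX shell syntax. The function name render_note and
the message text are hypothetical, chosen only for illustration; they
do not come from the paper.

```shell
#!/bin/sh
# render_note is a hypothetical helper, not something from the paper.
# Everything between "<<EOF" and the closing "EOF" line is fed to cat
# on standard input; $1 is expanded because the delimiter is unquoted.
render_note() {
    cat <<EOF
Dear $1,
this text is stored inside the script itself.
EOF
}

# The here-document supplies cat's input, so the script's own standard
# input is untouched and the output can feed a pipeline.
render_note "reader" | tr 'a-z' 'A-Z'
```

Quoting the delimiter (<<'EOF') would suppress the expansion of $1 and
pass the text through literally.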

By the late 1970s, each of these shells had sizable followings within
Bell Laboratories. The two shells were not compatible, leading to a
division as to which should become the standard shell. Steve Bourne and
John Mashey argued their respective cases at three successive UNIX user
group meetings. Between meetings, each enhanced their shell to have the
functionality available in the other. A committee was set up to choose
a standard shell. It chose the Bourne shell as the standard.

At the time of these so-called "shell wars", I worked on a project
at Bell Laboratories that needed a form entry system. We decided to
build a form interpreter, rather than writing a separate program for
each form. Instead of inventing a new script language, we built a form
entry system by modifying the Bourne shell, adding built-in commands
as necessary. The application was coded as shell scripts. We added
a built-in to read form template description files and create shell
variables, and a built-in to output shell variables through a form mask.
We also added a built-in named let to do arithmetic using a small subset
of the C language expression syntax. An array facility was added to
handle columns of data on the screen. Shell functions were added to
make it easier to write modular code, since our shell scripts tended to
be larger than most shell scripts at that time. Since the Bourne shell
was written in an Algol-like variant of C, we converted our version of
it to a more standard K&R version of C. We removed the restriction
that disallowed I/O redirection of built-in commands, and added echo,
pwd, and test built-in commands for improved performance. Finally, we
added a capability to run a command as a coprocess so that the command
that processed the user-entered data and accessed the database could be
written as a separate process.

At the same time, at the University of California at Berkeley, Bill Joy
put together a new shell called the C shell. Like the Mashey shell, it
was implemented as a command interpreter, not a programming language.
While the C shell contained flow control constructs, shell variables,
and an arithmetic facility, its primary contribution was a better
command interface. It introduced the idea of a history list and an
editing facility, so that users didn't have to retype commands that they
had entered incorrectly.

I created the first version of ksh soon after I moved to a research
position at Bell Laboratories. Starting with the form scripting
language, I removed some of the form-specific code, and added useful
features from the C shell such as history, aliases, and job control.

In 1982, the UNIX System V shell was converted to K&R C, echo and
pwd were made built-in commands, and the ability to define and use
shell functions was added. Unfortunately, the System V syntax for
function definitions was different from that of ksh. In order to
maintain compatibility with the System V shell and preserve backward
compatibility, I modified ksh to accept either syntax.
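
The two definition styles can be sketched side by side. This is an
illustration under assumptions: the function names and messages are
hypothetical, and the snippet assumes ksh or another shell that accepts
both forms (bash does); a strictly POSIX sh may reject the function
keyword.

```shell
# Assumes ksh or a shell that accepts both forms (bash does); a strictly
# POSIX sh may reject the "function" keyword.

# System V (later POSIX) style definition.
greet_sysv() {
    printf 'hello from %s\n' "$1"
}

# ksh style definition using the function keyword.
function greet_ksh {
    printf 'hello from %s\n' "$1"
}

greet_sysv "the System V form"
greet_ksh "the ksh form"
```

Both functions are called the same way once defined, which is what
lets a single shell accept scripts written in either style.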

The popular inline editing features (vi and emacs mode) of ksh were
created by software developers at Bell Laboratories; the vi line editing
mode by Pat Sullivan, and the emacs line editing mode by Mike Veach.
Each had independently modified the Bourne shell to add these features,
and both were in organizations that wanted to use ksh only if ksh had
their respective inline editor. Originally the idea of adding command
line editing to ksh was rejected in the hope that line editing would
move into the terminal driver. However, when it became clear that this
was not likely to happen soon, both line editing modes were integrated
into ksh and made optional so that they could be disabled on systems
that provided editing as part of the terminal interface.

As more and more software developers at AT&T switched to ksh, it became
the de facto standard shell at AT&T. As developers left AT&T to go
elsewhere, the demand for ksh led AT&T to make ksh source code available
to external customers via the UNIX System Toolchest, an electronic
software distribution system. For a one-time fixed cost, any company
could buy rights to distribute an unlimited number of ksh binaries.
Most UNIX system providers have taken advantage of this and now ship ksh
as part of their systems. The wider availability of ksh contributed
significantly to its success.

As use of ksh grew, the need for more functionality became apparent.
Like the original shell, ksh was first used primarily for setting up
processes and handling I/O redirection. Newer uses required more string
handling capabilities to reduce the number of process creations. The
1988 version of ksh, the one most widely distributed at the time this
is written, extended the pattern matching capability of ksh to be
comparable to that of the regular expression matching found in sed and
grep.

In spite of its wide availability, ksh source is not in the public
domain. This has led to the creation of bash, the "Bourne again
shell", by the Free Software Foundation; and pdksh, a public domain
version of ksh. Unfortunately, neither is compatible with ksh.

In 1992, the IEEE POSIX 1003.2 and ISO/IEC 9945-2 shell and utilities
standards were ratified. These standards describe a shell language
that was based on the UNIX System V shell and the 1988 version of ksh.
The 1993 version of ksh is a superset of the POSIX and ISO/IEC shell
standards. With few exceptions, it is backward
compatible with the 1988 version of ksh.

The awk command was developed in the late 1970s by Al Aho, Brian
Kernighan, and Peter Weinberger of Bell Laboratories as a report
generation language. A second-generation awk developed in the early
1980s was a more general-purpose scripting language, but lacked some
shell features. It became very common to combine the shell and awk
to write script applications. For many applications, this had the
disadvantage of being slow because of the time consumed in each
invocation of awk. The perl language, developed by Larry Wall in the
mid-1980s, is an attempt to combine the capabilities of the shell and
awk into a single language. Because perl is freely available and
performs better than combined shell and awk, perl has a large user
community, primarily at universities.

3. REQUIREMENTS FOR A REUSABLE SCRIPTING LANGUAGE

[...]