Email archive for list austin-group-l, item 03065

Subject: 	Re: Re: find + xargs
From: 	David Korn 
Date: 	Mon, 26 Mar 2001 11:09:54 -0500 (EST)

> Paul Eggert  wrote, on Sat 24 Mar 2001:
> >
> > Here's an example of why Solaris 8 find's behavior is incompatible
> > with the current POSIX standard:
> > 
> > find / -prune -exec echo {} + - ';'
> > 
> > POSIX requires this to print "/ + -" but Solaris 8 find prints an
> > error diagnostic instead.
> 
> It was this very issue that prompted me to start the discussion.  I was
> contemplating filing a POSIX.2 interp request asking whether the SVR4
> use of "+" is a valid extension.  But when I started to think more
> about it, it seemed like the problem was really that there is a gap in
> the standard, and that standardizing the SVR4 behaviour would be a
> neat way of fixing both the question mark over its legality and the
> problem with the standard.  (Then I realised that GNU's -print0 is an
> alternative solution to the same problem...)
> 
> > Admittedly this is a contrived example but
> > I can see the possibility of real scripts running into compatibility
> > problems with this incompatible change to "find".
> 
> It appears that the "+" is only treated as special if it immediately
> follows "{}".  My guess is that it was done this way to minimise the
> chances of causing compatibility problems.
> 
> > I'm not saying that
> > I oppose the entire idea; it's just that the SVR4 syntax is not
> > compatible with POSIX, and that is a strike against it.
> 
> The SVR4 syntax is long-established existing practice that predates
> POSIX.2, so it's arguable that if it is not allowed by POSIX.2 then
> that is an oversight in the standard - it should have explicitly
> allowed the SVR4 behaviour as an extension.  Standardizing the
> behaviour would neatly solve the issue.
> 
> > I still favor the GNU extension of NUL termination instead.  It's more
> > reliable in the presence of multiple locales and encoding errors, and
> 
> NUL termination certainly has advantages over adding backslashes,
> but I don't see any advantage over internal argument aggregation
> by "find".
> 
> > it's easier to explain.  Perhaps that's why it is documented and the
> > Solaris behavior is not.
> 
> The SVR4 behaviour is documented in Unixware man pages, so it's odd
> that it isn't documented by Solaris.  Here is the description from a
> Unixware 2.1 man page:
> 
>     -exec cmd    True if the executed cmd returns a zero value as
>                exit status.  A command argument {} is replaced
>                by the current path name.  The end of cmd must be
>                punctuated by an escaped semicolon or a plus sign
>                (+).  When a plus sign is used, cmd aggregates a
>                set of pathnames and executes on the set; when a
>                semicolon is used, cmd executes on one pathname
>                at a time.  The reason for preferring + to a
>                semicolon is vastly improved performance.
> 
> It shouldn't be too hard to update the XCUd5 description of "find"
> based on this.
> 
> -- 
> Geoff Clare                         yyy@xxxxxxxxxxx
> UniSoft Limited, London, England.   yyy@xxxxxxxxxx
> 
> 

I wrote the version of find that went into System V Release 4 back
in 1987, and added the feature of accepting + in place of ; to allow
xargs-style grouping without the quoting problems.  I was concerned that
xargs did not handle newlines.  Earlier versions of find had their
own file tree walker code which did not handle symbolic links.
I wrote ftwalk() (later renamed nftw()) and then wrote find using that
interface.
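
The quoting problem can be illustrated with a minimal sketch, assuming a
find that supports the "+" terminator and an xargs with default
(whitespace-delimited) input parsing.  The temporary directory, the file
name, and the variable names plus_count and xargs_count are invented for
the illustration:

```shell
#!/bin/sh
# Sketch only: compare "find | xargs" with "find -exec ... {} +" on a
# pathname that contains an embedded newline.
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT

# Create one file whose name contains a newline character.
touch "$tmp/a
b"

# With "+", find hands each pathname to the command as a single argument,
# so the inner sh reports one argument.
plus_count=$(find "$tmp" -type f -exec sh -c 'echo $#' sh {} +)

# Through a pipe, xargs splits its input on blanks and newlines, so the
# same pathname arrives as two separate (and wrong) arguments.
xargs_count=$(find "$tmp" -type f | xargs sh -c 'echo $#' sh)

echo "exec +: $plus_count argument(s)"
echo "xargs:  $xargs_count argument(s)"
```

The first count should be 1 and the second 2, since the single pathname
is split at its newline before xargs builds the command line.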

Our current find implementation (in the AST software) also supports
-print0, but I would recommend against it, since the output is no
longer a text file that can be processed by many of the
standard utilities.
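
A small sketch of that last point, assuming a find that implements the
-print0 extension (it was not in the standard at the time of this
thread).  The directory, file names, and the variables lines_nul and
lines_text are hypothetical:

```shell
#!/bin/sh
# Sketch only: NUL-terminated output contains no newline characters, so
# line-oriented utilities do not see one record per pathname.
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT

touch "$tmp/one" "$tmp/two" "$tmp/three"

# wc -l counts newline characters; -print0 output has none here.
lines_nul=$(find "$tmp" -type f -print0 | wc -l | tr -d ' ')

# Converting each NUL to a newline turns the stream back into text.
lines_text=$(find "$tmp" -type f -print0 | tr '\0' '\n' | wc -l | tr -d ' ')

echo "lines seen in -print0 output:  $lines_nul"
echo "lines after NUL-to-newline:    $lines_text"
```

Consumers of -print0 output therefore need to be NUL-aware themselves,
or the stream has to be translated back into lines first.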

David Korn
research!dgk
yyy@xxxxxxxxxxxxxxxx