2017-03-29 (see recent changes)
If you build a chain of filters, the output may be delayed unexpectedly:
while sleep 0.1; do date; done|grep .         # immediate output each tenth of a second
while sleep 0.1; do date; done|grep .|grep .  # delayed output in big chunks, about every 10 seconds
A real world example is searching growing logfiles with tail -f feeding several invocations of grep.
The explanation can be found in the C library, which provides different output buffering methods:
unbuffered, line buffered and block (or fully) buffered. See setbuf(3).
There is no such buffering if write(2) is used instead of library functions like printf(3).
The C library offers buffering simply for performance reasons.
The highest throughput is achieved with block buffering due to the lowest overhead.
The slower line buffering is typically used only if a command is directly connected to a tty, where immediate output is expected.
In a pipeline, the full buffering usually happens in blocks of PIPE_BUF bytes.
Common values are 4096 bytes (4K) or 5120 bytes (10 blocks of 512 bytes).
$ cpp<<EOF|egrep -v '^#|^$'
#include <limits.h>
PIPE_BUF
EOF
4096
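The same limit can also be queried without a C preprocessor. A minimal sketch, assuming the POSIX getconf utility is installed; the helper name pipe_buf is made up for this example:

```shell
# pipe_buf: print the PIPE_BUF limit for a given directory (default: /).
# The helper name is hypothetical; getconf itself is specified by POSIX.
pipe_buf() {
    getconf PIPE_BUF "${1:-/}"
}

pipe_buf    # prints the limit in bytes for the root directory
```

POSIX only guarantees a minimum of 512 bytes, so the value may differ between systems and file systems.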
Keep in mind that the cause is not the pipe itself, but the buffering method which the utility chooses when it doesn't see a TTY.
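Whether standard output is a TTY can be checked from the shell with test -t, which is comparable to the isatty(3) check a C program could make. A small sketch; the function name stdout_kind is hypothetical:

```shell
# stdout_kind: report what standard output is connected to.
# "test -t 1" succeeds only if file descriptor 1 refers to a terminal.
stdout_kind() {
    if [ -t 1 ]; then
        echo "tty"
    else
        echo "not a tty"
    fi
}

stdout_kind          # "tty" when run directly in an interactive terminal
stdout_kind | cat    # "not a tty": stdout is now a pipe
```

A utility in the middle of a pipeline is in the second situation, which is why it may switch to block buffering.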
There's no universal way to avoid the buffering of a utility. However, here are some possible options:
grep knows --line-buffered: GNU since 2.5 (03/'02), FreeBSD since 5.3, NetBSD since 2.0, OpenBSD since 3.6
sed (-u/--unbuffered), since 3.02.80 (08/'99)
awk (-W interactive or per fflush())
tcpdump (-l)
stdbuf, which uses the LD_PRELOAD method described below
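For the tail -f case mentioned at the beginning, these flags could be applied as sketched below. The function names and the patterns are placeholders; --line-buffered is assumed to be supported by the installed grep, and fflush() by the installed awk:

```shell
# Only the first filter needs special treatment, because it writes into a pipe.
# The last grep is line buffered anyway if it writes to the terminal.
grep_twice() {
    grep --line-buffered -- "$1" | grep -- "$2"
}

# The same first stage with awk, flushing explicitly after every match.
# The pattern is passed as an awk variable and used as a regular expression.
awk_then_grep() {
    awk -v pat="$1" '$0 ~ pat { print; fflush() }' | grep -- "$2"
}

# Usage (the log file name is a placeholder):
#   tail -f logfile | grep_twice error disk
#   tail -f logfile | awk_then_grep error disk
```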
cat knows the flag -u. But some variants do not buffer in a pipe, e.g. Solaris 2.9 and GNU.
By the way, tee even produces line buffered output by default. That's why tail -f behaves intuitively on the resulting file.
Example with stdbuf:
while sleep 0.1; do date; done| grep .|grep .            # delayed output
while sleep 0.1; do date; done|stdbuf -o0 grep .|grep .  # immediate output
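stdbuf can also request line buffering (-oL) instead of no buffering at all. A sketch with a fallback for systems without stdbuf; the function name linebuf is hypothetical:

```shell
# linebuf: run a command with line buffered stdout if stdbuf is available,
# otherwise run it unchanged (output may then be delayed in a pipeline).
linebuf() {
    if command -v stdbuf >/dev/null 2>&1; then
        stdbuf -oL "$@"
    else
        "$@"
    fi
}

# Usage:
#   while sleep 0.1; do date; done | linebuf grep . | grep .
```

This only helps for utilities which use the C library's stdio with its default buffering; a program that sets up its own buffering is not affected.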
If the utility is dynamically linked against libc and your system supports LD_PRELOAD, you can preload code which modifies the buffering before the utility runs.
Compile this code to a shared library:
/* linux$   gcc -fpic -c unbuffer.c; ld -shared -o libunbuffer.so unbuffer.o
 * solaris$ cc -Kpic -c unbuffer.c; ld -G -o libunbuffer.so unbuffer.o
 */
#include <stdio.h>
void _init() { setbuf(stdout, NULL); }
and create a wrapper named "unbuffer":
#!/bin/sh
LD_PRELOAD=$HOME/lib/libunbuffer.so
export LD_PRELOAD
exec "$@"
Coming back to the initial example:
while sleep 0.1; do date; done| grep .|grep .          # delayed output
while sleep 0.1; do date; done|unbuffer grep .|grep .  # immediate output
Using _init() is only a hack (possible clash with real usage of this internal function?).
This item was inspired by a usenet posting from Stephane Chazelas.
Meanwhile GNU coreutils since 7.5 (08/'09) provides this functionality with the tool "stdbuf".
Let the shell process the input line by line:
while sleep 0.1; do date; done| grep . |grep .  # delayed output
while sleep 0.1; do date; done| while IFS= read -r line; do printf '%s\n' "$line"|grep -q . && printf '%s\n' "$line"; done |grep .  # immediate output
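The loop above starts a new grep for every line. If the filter is a simple substring match, the shell can do the matching itself. A sketch, with line_filter as a made-up name:

```shell
# line_filter: print every input line which contains the given fixed string.
# Matching is done with a shell pattern, so no external process is started.
line_filter() {
    while IFS= read -r line; do
        case $line in
            *"$1"*) printf '%s\n' "$line" ;;
        esac
    done
}

# Usage:
#   while sleep 0.1; do date; done | line_filter : | grep .
```

Quoting "$1" inside the pattern makes the shell treat it literally, so this is closer to grep -F than to a regular expression match.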
Using perl ($| controls buffering), courtesy Stefan Reuther:
while sleep 0.1; do date; done| grep . | grep .                         # delayed output
while sleep 0.1; do date; done| perl -ne '$|=1; print if /./' | grep .  # immediate output
unbuffer (part of Expect): it simulates an interactive TTY, which makes the command choose line buffering. It is used like the stdbuf and LD_PRELOAD wrapper examples above, as a prefix to the command.
<http://www.in-ulm.de/~mascheck/various/buffering/>