Personal website of Martin Tournoij (“arp242”); writing about programming (CV) and various other things.


You would expect this to work, no?

bash% echo $(( .1 + .2 ))
bash: .1 + .2 : syntax error: operand expected (error token is ".1 + .2 ")

Well, bash says no, but zsh just works:

zsh% echo $(( .1 + .2 ))
0.30000000000000004      # Well, "works" insofar as IEEE-754 works.

There is simply no way you can do calculations with fractions in bash without relying on bc, dc, or some hacks. Compared to simply being able to use a + b it’s ugly, slow, and difficult.
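As a minimal sketch of what such a workaround looks like in bash, the arithmetic can be handed to awk (bc would do just as well). The fadd name and the two-decimal output format are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# fadd is a hypothetical helper: bash's $(( )) is integer-only, so the
# addition is done by awk and the result is rounded to two decimals.
fadd() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a + b }'
}

fadd .1 .2    # prints 0.30
```

Every call forks a process, which is exactly the “ugly, slow” part.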

There are other pretty frustrating omissions in bash; NUL bytes are another fun one:

zsh% x=$(printf 'N\x00L'); printf $x | xxd -g1 -c3
00000000: 4e 00 4c  N.L

bash% x=$(printf 'N\x00L'); printf $x | xxd -g1 -c3
bash: warning: command substitution: ignored null byte in input
00000000: 4e 4c     NL

It looks like bash added a warning recently-ish (4.4-patch 2). This one bit me pretty hard a few years ago, when it would just get silently discarded without warning. I guess a warning is an “improvement” of sorts (fixing it, however, would be an actual improvement[1]).

NUL bytes aren’t that uncommon; think of e.g. find -print0, xargs -0, etc. That all works grand, right up to the point you try to assign it to a variable. You can use NUL bytes for array assignments though, if you invoke the right incantation:

bash% readarray -d '' arr < <(find . -type f -print0)   # -d needs bash 4.4 or newer
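A more verbose alternative is a loop around read -d '', which also works on bash versions that lack readarray -d. This is only a sketch; collect_files and the global files array are hypothetical names.

```shell
#!/usr/bin/env bash
# collect_files and the global "files" array are hypothetical names.
collect_files() {
    local f
    files=()
    # -d '' makes read stop at each NUL byte; IFS= and -r keep the
    # path otherwise untouched (leading spaces, backslashes).
    while IFS= read -r -d '' f; do
        files+=("$f")
    done < <(find "$1" -type f -print0)
}

tmp=$(mktemp -d)
touch "$tmp/plain.txt" "$tmp/with space.txt"
collect_files "$tmp"
echo "${#files[@]}"    # prints 2
rm -r "$tmp"
```

The NUL bytes never end up inside a variable; they only serve as delimiters on the way in.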

There are all sorts of edge-cases where you need to resort to read or readarray rather than being able to just assign it. In zsh it’s:

zsh% arr=(**/*(.))

zsh% IFS=$'\x00' arr=($(find . -type f -print0)) # If you must use find (rarely needed)

zsh% arr=( "${(0)$(find . -type f -print0)}" )

That (.) is a “glob qualifier” to include only regular files – more on that later.

Don’t even think of doing something like:

img=$(curl https://example.com/image.png)
if [[ $cond ]]; then
    optpng <<<"$img" > out.png
else
    cat <<<"$img" > out.png
fi

Of course you can refactor this to avoid the variable (and the example is a bit contrived), but it really ought to work. I once wrote a script to import emails from the Mailgun API. It worked great, yet sometimes images were mangled and I just couldn’t figure out why. It turns out Mailgun “helpfully” decodes attachments (i.e. removes the base64) and sends binary content, whose NUL bytes bash (at the time) would silently discard. It took me a very long time to figure out. I forget why, but it was hard to avoid storing the response in a variable. I ended up rewriting it in Python because of this, which was just a waste of time really. It’s this, specifically, that really soured me on bash and prompted me to write The shell scripting trap in 2017. However, many items listed there are solved by zsh, including this one.
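For scripts that have to stay on bash, one possible workaround is to keep binary data base64-encoded for as long as it lives in a variable. This is a sketch under the assumption that a base64 command with a -d (decode) flag is available, as it is in GNU coreutils; the img variable is just an illustration.

```shell
#!/usr/bin/env bash
# Store the bytes base64-encoded, so no NUL ever ends up in the variable.
img=$(printf 'N\x00L' | base64)
echo "$img"                                      # prints TgBM

# Decode only at the point where the raw bytes are needed again.
printf '%s\n' "$img" | base64 -d | od -An -tx1   # hex dump: 4e 00 4c
```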


zsh also fixes most of the quoting:

zsh% for f in *; ls -l $f
-rw-rw-r-- 1 martin martin 0 Oct 19 06:51 asd.txt
-rw-rw-r-- 1 martin martin 0 Oct 19 06:51 with space.txt

bash% for f in *; do ls -l $f; done
-rw-rw-r-- 1 martin martin 0 Oct 19 06:51 asd.txt
ls: cannot access 'with': No such file or directory
ls: cannot access 'space.txt': No such file or directory

It’s not POSIX compatible, but who cares? bash doesn’t follow POSIX in all sorts of ways by default either because it just makes more sense, and with both you can still tell them to be POSIX-compatible if you must for one reason or the other.

Also note the convenient short version of the for loop: no need for do, done and muckery with ; before the done, which is much more convenient for quick one-liners you interactively type in. You can still do word splitting, but you need to do it explicitly:

zsh% for i in *; ls -l $=i
-rw-rw-r-- 1 martin martin 0 Oct 19 06:51 asd.txt
ls: cannot access 'with': No such file or directory
ls: cannot access 'space.txt': No such file or directory

[[ is supposed to fix [, but it still has weird quoting quirks:

zsh% a=foo; b=*;
zsh% if [[ $a = $b ]]; then
       print 'Equal!'
     else
       print 'Nope!'
     fi
Nope!

bash% a=foo; b=*
bash% if [[ $a = $b ]]; then
        echo 'Equal!'
      else
        echo 'Nope!'
      fi
Equal!

This is equal because without quotes it still interprets the right-hand side as a pattern (i.e. glob). In zsh you need to use $~var to explicitly enable pattern matching, which is a much better model than remembering when you do and don’t need to quote things – sometimes you do want the pattern matching and then you don’t want quotes; it’s not always immediately obvious if an if [[ ... statement is correct if it’s lacking quotes.
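The bash side of this can be boiled down to two tiny functions. The names glob_eq and literal_eq are hypothetical; the only difference between them is one pair of quotes.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that make the intent explicit in bash.
glob_eq()    { [[ $1 = $2 ]]; }     # right-hand side is a pattern
literal_eq() { [[ $1 = "$2" ]]; }   # right-hand side is a plain string

glob_eq foo '*'    && echo "pattern: equal"   # printed
literal_eq foo '*' || echo "literal: nope"    # printed
```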

“But Martin, you should always quote things, you’re being disingenuous!” Well, I could make a comfortable living if I were paid to add quotes to other people’s shell scripts. Telling people to “always quote things” is what we’ve been doing for 40 years now and irrefutable observational evidence has demonstrated that it just does not work.

Most people aren’t shell scripting wizards; they make a living writing Python or C or Go or PHP programs, or maybe they’re sysadmins or scientists, and oh, occasionally they also write a shell script. They just see something that works and assume it has sane behaviour and don’t realize the subtle differences between $@, $*, and "$@". I think that’s actually quite reasonable, because the behaviour is odd, surprising, and confusing.

It’s also a lot more complex than just “quote your variables”, especially if you use $(..) since command substitution often needs quotes too, as well as any variables inside it. Before you know it you’ve got double, triple, or more levels of nested quotes and if you forget one set of them you’re in trouble.
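A small sketch of how quickly the layers pile up in bash, using only dirname and basename; parent_name is a hypothetical helper.

```shell
#!/usr/bin/env bash
# parent_name is a hypothetical helper: name of the directory holding a file.
parent_name() {
    # Three layers: quotes around the outer $(..), around the inner $(..),
    # and around the variable itself. Dropping any one of them breaks
    # paths that contain spaces.
    printf '%s\n' "$(basename "$(dirname "$1")")"
}

parent_name '/tmp/my dir/some file.txt'    # prints: my dir
```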

It’s such a fertile source of real-world bugs that it would merit entomologist examination. If there’s a system that causes this many bugs then that system is at fault, and not the people using it. Computers and software should adapt to humans, not the other way around.

And “always quote things!” isn’t even right either, because you should always quote things except when you shouldn’t:

zsh% a=foo; b=.*;
zsh% if [[ "$a" =~ "$b" ]]; then
       print 'Equal!'
     else
       print 'Nope!'
     fi
Equal!

bash% a=foo; b=.*
bash% if [[ "$a" =~ "$b" ]]; then
        echo 'Equal!'
      else
        echo 'Nope!'
      fi
Nope!

If there are quotes around a regexp then it’s treated as a literal string. I mean, it’s consistent with = pattern matching, but also confusing because I explicitly use =~ to match a regular expression.
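In bash the usual way around this is to put the regexp in a variable and leave that variable unquoted. A sketch, with re_match and literal_match as hypothetical names:

```shell
#!/usr/bin/env bash
# Hypothetical helpers showing the bash behaviour of =~.
re_match()      { local re=$2; [[ $1 =~ $re ]]; }   # unquoted: regular expression
literal_match() { [[ $1 =~ "$2" ]]; }               # quoted: literal substring

re_match foo '.*'         && echo "regex: match"       # printed
literal_match foo '.*'    || echo "literal: no match"  # printed
literal_match 'a.*b' '.*' && echo "literal: match"     # printed, ".*" occurs verbatim
```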

Another famous quoting trap is $@ vs. "$@" vs. $* vs. "$*":

zsh% cat args
echo "unquoted @:"
for a in $@; do echo "  => $a"; done

echo "quoted @:"
for a in "$@"; do echo "  => $a"; done

echo "unquoted *:"
for a in $*; do echo "  => $a"; done

echo "quoted *:"
for a in "$*"; do echo "  => $a"; done

bash% bash args hello world 'test space' '*'
unquoted @:
  => hello
  => world
  => test
  => space
  => Guust1129.jpg
  => IEEESTD.2019.8766229.pdf
  [.. rest of my $HOME ..]
quoted @:
  => hello
  => world
  => test space
  => *
unquoted *:
  => hello
  => world
  => test
  => space
  => Guust1129.jpg
  => IEEESTD.2019.8766229.pdf
  [.. rest of my $HOME ..]
quoted *:
  => hello world test space *

Experienced shellers will know to (almost) always use "$@", but how often do you see it being done wrong? It’s not that strange that people do it wrong either; if you learned about quoting and word splitting then $@ without quotes is actually the logical thing to use because you would expect "$@" to be treated as one argument (as "$*"). You tell people to “always add quotes to prevent word splitting and treat things as a single argument”, and then you tell them “oh, except in this one special case when the addition of quotes invokes a special kind of splitting and doesn’t follow any of the rules we previously told you about”.
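The same difference shows up when a bash function forwards its arguments to another command. The wrapper names below are hypothetical, and the counts assume bash with the default IFS.

```shell
#!/usr/bin/env bash
count_args() { echo "$#"; }

# Three hypothetical wrappers that forward their arguments differently.
forward_quoted_at()   { count_args "$@"; }   # every argument kept as-is
forward_bare_at()     { count_args $@; }     # re-split on whitespace (and globbed)
forward_quoted_star() { count_args "$*"; }   # everything joined into one word

forward_quoted_at   hello 'test space'       # prints 2
forward_bare_at     hello 'test space'       # prints 3
forward_quoted_star hello 'test space'       # prints 1
```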

In zsh, $@ and $* (and $argv) are all (read-only) arrays and it all behaves identically, as you would expect, with no surprises:

zsh% zsh args hello world 'test space' '*'
unquoted @:
  => hello
  => world
  => test space
  => *
quoted @:
  => hello
  => world
  => test space
  => *
unquoted *:
  => hello
  => world
  => test space
  => *
quoted *:
  => hello world test space *

Actually in bash you can do argv=("$@") and then you have an array. This is really how it should work by default.

You still need to loop over it with:

bash% for a in "${argv[@]}"; do
        echo "=> $a"
      done

Rather than just for a in $argv like in zsh. Aside from the pointless [@], why would you ever want to word-split every element of an array? There is probably some use case somewhere, but it’s exceedingly rare. Better to just skip all of that by default unless explicitly enabled with = and/or ~.

Oh, here’s another interesting tidbit:

zsh% n=3; for i in {1..$n}; print $i
1
2
3

bash% n=3; for i in {1..$n}; do echo "$i"; done
{1..3}

bash% n=3; for i in {1..3}; do echo "$i"; done
1
2
3

Why does it work like that? That’s left as an exercise for the reader ;-)
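One way to read the bash result: brace expansion is performed before parameter expansion, so at the time the braces are looked at, {1..$n} is not a valid sequence and is left alone. A common way to get the loop anyway is the C-style for; count_to is a hypothetical helper used as a sketch.

```shell
#!/usr/bin/env bash
# count_to is a hypothetical helper; the C-style loop evaluates n at run time.
count_to() {
    local n=$1 i
    for ((i = 1; i <= n; i++)); do
        echo "$i"
    done
}

count_to 3    # prints 1, 2 and 3 on separate lines
```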


Aside from all sorts of caveats that are handled much better, a lot of common things are just much easier in zsh:

zsh% arr=(123 5 1 9)
zsh% echo ${(o)arr}     # Lexical order
1 123 5 9
zsh% echo ${(on)arr}    # Numeric order
1 5 9 123

bash% IFS=$'\n'; echo "$(sort <<<"${arr[*]}")"; unset IFS
1 123 5 9
bash% IFS=$'\n'; echo "$(sort -n <<<"${arr[*]}")"; unset IFS
1 5 9 123

I had to look up how to do it in bash; the Stack Overflow answer for that one starts with “you don’t really need all that much code”. lol? I guess that’s in reply to some of the other horrendously complex answers which implement “pure bash” sorting algorithms and the like. I guess compared to that this is “not that much code”. And of course the entire thing is a minefield of expansion again; and if you forget a set of nested quotes you end up in trouble.
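If the sorted result is needed as an array again rather than just printed, something along these lines should work in bash. It is a sketch that assumes bash 4 or newer (for readarray) and elements that contain no newlines; sorted is a name chosen here.

```shell
#!/usr/bin/env bash
arr=(123 5 1 9)

# One element per line into sort, then back into an array.
# Assumes bash 4+ (readarray) and elements without newlines.
readarray -t sorted < <(printf '%s\n' "${arr[@]}" | sort -n)

echo "${sorted[@]}"    # prints 1 5 9 123
```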

Arrays in general are just awkward in bash:

bash% arr=(first second third fourth)

bash% echo ${arr[0]}
first
bash% echo ${arr[@]::2}
first second

I mean, it works, and it’s not that much typing, but why do I need that [@]? Probably for some (historical) reason, but zsh implements it in a much more readable and easier way:

zsh% arr=(first second third fourth)

zsh% print ${arr[1]}        # Yeah, it's 1-based. Deal with it.
first
zsh% print ${arr[1,2]}
first second

The bash array syntax was copied from ksh; so I guess we have to blame David Korn (zsh supports it too, if you must use it). But regular subscripts are just so much easier.

And then there are the useful features:

zsh% ls *.go
format.go  format_test.go  gen.go  old.go  uni.go  uni_test.go

zsh% ls *.go~*_test.go
format.go  gen.go  old.go  uni.go

zsh% ls *.go~*_test.go~f*
gen.go  old.go  uni.go

*.go gets expanded, and the pattern after the ~ (*_test.go in this case) filters the result; this needs setopt extended_glob. It looks a bit obscure at first glance, but bash’s ksh-style extglobs are far harder:

bash% ls !(*_test).go
format.go  gen.go  old.go  uni.go

bash% ls !(*_test|f*).go
gen.go  old.go  uni.go

!(..) is “match anything except the pattern”; the * is implied here (zsh supports !(..) if you set ksh_glob). While it works, the pattern~filter~filter model is much easier, and also more flexible since you don’t need to start with all matches.

There are many useful things you can do with globbing; you can replace many uses of find with it, and you don’t need to worry about the caveats, -print0 hacks, etc. For example to recursively list all regular files:

zsh% ls **/*(.)
LICENSE         go.sum           unidata/gen.go             wasm/make*
README.md       old.go           unidata/gen_codepoints.go  wasm/srv*
[..]

Or directories:

zsh% ls -d /etc/**/*(/)
/etc/OpenCL/                      /etc/runit/runsvdir/default/dnscrypt-proxy/log/
/etc/OpenCL/vendors/              /etc/runit/runsvdir/default/ntpd/log/
/etc/X11/                         /etc/runit/runsvdir/default/postgresql/supervise/
[..]

Or files that were changed in the last week:

zsh% ls -l /etc/***(.m-7)    # *** is a shortcut for ***/* (like **/*, but follows symlinks); needs GLOB_STAR_SHORT
-rw-r--r-- 1 root root 28099 Oct 13 03:47 /etc/dnscrypt-proxy.toml.new-2.1.1_1
-rw-r--r-- 1 root root    97 Oct 13 03:47 /etc/environment
-rw-r--r-- 1 root root 37109 Oct 17 10:34 /etc/ld.so.cache
-rw-r--r-- 1 root root 77941 Oct 19 01:01 /etc/public-resolvers.md
-rw-r--r-- 1 root root  6011 Oct 19 01:01 /etc/relays.md
-rw-r--r-- 1 root root   142 Oct 19 07:57 /etc/shells

You can even order them by modified date with om (***(.m-7om)), although that’s a bit pointless here as ls will reorder them again, but if you’re looping over files it’s useful.

There is no way to do any of this in bash; you’ll have to use something like:

bash% find /etc -type f -mtime 7 -exec ls -l {} +
find: ‘/etc/sv/docker/supervise’: Permission denied
find: ‘/etc/sv/docker/log/supervise’: Permission denied
find: ‘/etc/sv/bluetoothd/log/supervise’: Permission denied
find: ‘/etc/sv/postgresql/supervise’: Permission denied
find: ‘/etc/sv/runsvdir-martin/supervise’: Permission denied
find: ‘/etc/wpa_supplicant/supervise’: Permission denied
find: ‘/etc/lvm/cache’: Permission denied
-rw-rw-r-- 1 root root  167 Oct 12 22:17 /etc/default/postgresql
-rw-r--r-- 1 root root  817 Oct 12 09:11 /etc/fstab
-rw-r--r-- 1 root root 1398 Oct 12 22:19 /etc/passwd
-rw-r--r-- 1 root root 1397 Oct 12 22:19 /etc/passwd.OLD
-rw-r--r-- 1 root root  307 Oct 12 23:10 /etc/public-resolvers.md.minisig
-rw-r--r-- 1 root root  297 Oct 12 23:10 /etc/relays.md.minisig
-r-------- 1 root root  932 Oct 12 09:57 /etc/shadow
-rwxrwxr-x 1 root root  397 Oct 12 22:23 /etc/sv/postgresql/run

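I’m not sure how to make it ignore these errors without redirecting stderr (more typing!). Note too that the find above isn’t quite the same query: -mtime 7 matches files modified seven days ago, whereas the counterpart of m-7 is -mtime -7. And if you think adding single letters in (..) after a pattern is hard then try understanding find’s weird flag syntax. Glob qualifiers are great.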
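For completeness, a sketch of the find version wrapped in a function, with stderr sent to /dev/null. recent_files is a hypothetical name, and the temporary directory only exists to make the example self-contained.

```shell
#!/usr/bin/env bash
# recent_files is a hypothetical helper: regular files below a directory
# that were modified less than 7 days ago. Errors such as "Permission
# denied" go to /dev/null.
recent_files() {
    find "$1" -type f -mtime -7 2>/dev/null
}

tmp=$(mktemp -d)
touch "$tmp/new.txt"
touch -t 200001010000 "$tmp/old.txt"   # pretend this one is from the year 2000
recent_files "$tmp"                    # prints only the path of new.txt
rm -r "$tmp"
```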

csh-style parameter substitution is pretty useful:

zsh% for f in ~/photos/*.png; convert $f ${f:t:r}.jpeg

:t to get the tail, and :r to get the root (without extension). csh could do this before I was even born, but bash can’t (it can for history expansion, but not variables). According to the bash FAQ “Posix has specified a more powerful, albeit somewhat more cryptic, mechanism cribbed from ksh”, which I find a somewhat curious statement as the above in bash is:

bash% for f in ~/photos/*.png; do convert "$f" "$(basename "${f%%.*}").jpeg"; done

Technically “more powerful” in the sense that you can do other things with it, but not really “more useful for common operations” (zsh, of course, implements % and # as well).
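Without forking basename, the :t:r pair can be approximated in bash with two expansions and an intermediate variable. This is a sketch, and tail_root is a hypothetical name; note that %.* removes only the last extension, unlike the %%.* above, which cuts at the first dot anywhere in the path.

```shell
#!/usr/bin/env bash
# tail_root is a hypothetical helper mimicking zsh's ${f:t:r}.
tail_root() {
    local name="${1##*/}"       # like :t, drop everything up to the last /
    printf '%s\n' "${name%.*}"  # like :r, drop the last extension only
}

tail_root ~/photos/holiday.2021.png    # prints holiday.2021
```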

Note you can’t nest ${..} in bash; e.g. "${${f%%.*}##*/}" is an error:

zsh% f=~/asd.png; print "${${f%%.*}##*/}"
asd

bash% f=~/a.png; echo "${${f%%.*}##*/}"
bash: ${${f%%.*}##*/}: bad substitution

While this can quickly lead to very unreadable ASCII vomit, it’s useful on occasion, when used with care and wisdom. Below is a more advanced example, not suitable for children under 18!

For example, this can be used to show the longest element in an array:

print ${array[(r)${(l.${#${(O@)array//?/X}[1]}..?.)}]}

Cribbed from the zsh User's Guide.


There are many more things; I’m not going to list them all here. None of this is new; much (if not all?) of this has been around for 20 years, if not longer. I don’t know why bash is the de-facto default, or why people spend time on complex solutions to work around bash problems when zsh solves them. I guess it’s because Linux used a lot of GNU stuff and bash came with it, and GNU stuff was (and is) using bash. Not a very good reason, certainly not one 30 years later.

zsh still has plenty of limitations; the syntax isn’t always something you’d want to show your mother for starters, as well as a number of other things. Still, it’s clearly better. I genuinely can’t find a single thing bash does better beyond “it’s installed on many systems already”.

Ubiquitousness is overrated anyway; zsh has no dependencies beyond libc and curses and is 970K on my system[2] and is available for pretty much all systems. Compared to most other interpreters it’s tiny, with only Lua being smaller (275K). “Stick to POSIX sh for compatibility” was good advice in 1990 when you had some SunOS system with some sun-sh and that’s what you were stuck with. Those days are long gone, and while there are a few poor souls still stuck on those systems (sometimes even with csh!) chances are they’re not going to try and run your Docker or Arch Linux shell script or whatnot on those systems anyway.

Contorting yourself in all sorts of strange bends to perhaps possibly maybe make it work for a tiny fraction of users who are unlikely to use your script anyway does not seem like a good trade-off, especially since these kinds of limitations tend to be organisational rather than technical, which is not my problem anyway to be honest.

Using zsh is also more portable, since it allows you to avoid many shell tools and their (potential) incompatibilities, and by explicitly setting zsh as the interpreter you can rely on zsh behaviour, rather than hoping the /bin/sh on $random_system behaves the same (even dash has some extensions to POSIX sh, such as local).

What I typically do is save files as script.zsh with:

#!/usr/bin/env zsh
[ "${ZSH_VERSION:-}" = "" ] && echo >&2 "Only works with zsh" && exit 1

This makes sure it gets run by zsh when used as ./script.zsh, and gives an error if people type sh script.zsh or bash script.zsh, in case the .zsh extension isn’t enough of a clue.

So in conclusion: s/bash/zsh/g and everything will be just a little bit better.

P.S. maybe fish is even better by the way, but I could never get over the bright colouring and all these things popping in and out of my screen; it's such a "busy" chaotic experience! I'd have to spend time disabling all that stuff to make it usable for me, and I never bothered – and if you're disabling the key selling points of a program then it's probably not intended for you anyway. Maybe I'll have a crack at it at some point though.

Footnotes
  1. Which I assume isn’t so easy, otherwise it would have been done already. The reason it doesn’t work is an artifact from C’s NUL-terminated strings, but this kind of stuff really shouldn’t be exposed in a high-level language like the shell. It’s also a bit ironic since one of Stephen Bourne’s original goals with his shell was to get rid of arbitrary size limits on strings, which were common at the time. 

  2. A full install is ~8M, mostly in the optional completion functions it ships with. Bash is about 1.3M by the way.