David A. Wheeler's Blog

Thu, 22 Apr 2010

Filenames and Pathnames in Shell - Doing it Correctly

Traditionally, Unix/Linux/POSIX filenames and pathnames can be almost any sequence of bytes. Unfortunately, most developers and users of Bourne shells (including bash, dash, ash, and ksh) don’t handle filenames and pathnames correctly. Even good textbooks on shell programming get filename and pathname processing completely wrong. As a result, many shell scripts are buggy and fail in surprising ways. In fact, mishandling of filenames is a significant source of security vulnerabilities.

So I’ve created a short essay on how to correctly process filenames in Bourne shells as used in Unix, Linux, and various POSIX systems. It presumes that you already know how to write Bourne shell scripts.

The essay is “Filenames and Pathnames in Shell: How to do it correctly”. Please take a look!

Frankly, it would be better if filenames weren’t so permissive. In particular, filenames with control characters, a leading dash (“-”), or a non-UTF-8 encoding cause a lot of grief. For more about that, please see my essay “Fixing Unix/Linux/POSIX Filenames”. If filenames weren’t so permissive, correct programs would be much easier to write.
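To give a flavor of the care involved, here is a minimal sketch (mine for this post, not an excerpt from the essay) of a loop that copes with the kinds of filenames mentioned above. The function name count_regular_files is hypothetical, and the approach shown, a glob with a directory prefix plus double quotes around every expansion, is only one common way to do it.

```shell
#!/bin/sh
# Hypothetical helper: print the number of regular files directly inside
# the directory given as $1. Plain POSIX sh has no "local", so the
# variables dir, count, and f are global.
count_regular_files() {
  dir=$1
  count=0
  # The pattern carries a directory prefix, so a file named "-rf" expands
  # to "$dir/-rf" rather than a bare "-rf" that a command could mistake
  # for an option. Note that "*" does not match names starting with ".".
  for f in "$dir"/*; do
    # Quoting "$f" keeps spaces, newlines, and other odd bytes in one word.
    # If nothing matched, the pattern is left as literal text, which this
    # check also skips.
    [ -f "$f" ] || continue
    count=$((count + 1))
  done
  printf '%s\n' "$count"
}
```

Called on a directory that holds three files named “-rf”, “two words”, and a name containing a newline, this should print 3. An unquoted $f, or a loop over the output of ls, would split or misread those same names.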

So, “Filenames and Pathnames in Shell: How to do it correctly” explains how to handle filenames properly in shell programs, given the current situation. Please take a look; I hope you find it useful.
