The Secure BROWSER (SB) Specification

David A. Wheeler
Version 0.95.1 (DRAFT)
February 24, 2001

Introduction

Eric Raymond has proposed the BROWSER convention for Unix-like systems, which lets users specify their browser preferences and lets developers easily invoke those browsers. In general, this is a great idea. Unfortunately, as specified it has horrendous security flaws; a document containing a hypertext link like "; /bin/rm -fr ~" will erase all of a user's files when the user selects that link! Also, it's ambiguously specified, it doesn't permit a colon (:) in commands, and it doesn't discuss some implementation issues (especially whether filenames are allowed and how to handle non-absolute references).

This document defines the ``secure BROWSER'' (SB) convention, which is backwards-compatible with the original BROWSER specification, fixes these problems, and is still easy to implement. It also provides a simple example, implementation notes, a detailed discussion of how to implement the convention, a discussion on how to convert non-absolute references into absolute references, discussion of how this could be expanded to serve non-Unix-like systems, and "tricks" that users can use to perform all sorts of tasks using the Secure BROWSER convention. Technically URLs are a subset of URIs, so in this document the term "URI" is used instead of URL (with occasional references to URLs).

This is a DRAFT specification. Comments are welcome; please email dwheeler@dwheeler.com (no spam please). This specification and related information (including a security analysis) are available at http://www.dwheeler.com/browse.

In particular, a major debate is whether it's worth calling the shell. The shell call may be removed as being too dangerous. Options include: (1) BROWSER only having a list of program names, (2) BROWSER listing programs with constant arguments (no need for %s), and (3) using "%" replacements but avoiding the shell. The "%" replacements are increasingly looking undesirable; they take more work to program, and handling Netscape/Mozilla properly requires writing a short program anyway (so they don't seem to be helpful). Below are "compatible" and "alternative" definitions: the "compatible" one is compatible with the original definition, but it's complicated to implement; the "alternative" one is simpler, but requires that the "BROWSER" variable have a different format. Comments welcome.

Compatible Secure BROWSER Definition

The BROWSER environment variable contains a colon-separated series of browser command parts. These command parts are tried in order, executing each using /bin/sh, until one succeeds (returns 0); empty command parts (e.g., "::") are ignored. In a command part, %% becomes a single "%", %c becomes a single ":", and "%s" becomes the ``escaped absolute reference'' to be viewed; a command part not having a "%s" has " %s" appended to it.
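The splitting and substitution rules above can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the function names are hypothetical, and passing unrecognized "%" sequences through unchanged is an assumption, since the text does not say what to do with them. The reference passed in is assumed to already be an escaped absolute reference.

```python
def split_command_parts(browser_value):
    """Split a BROWSER value on ':' and ignore empty command parts."""
    return [part for part in browser_value.split(":") if part]


def expand_command_part(part, escaped_ref):
    """Expand %%, %c and %s in one command part.

    If the part has no %s, " %s" is treated as appended to it.
    """
    out = []
    saw_s = False
    i = 0
    while i < len(part):
        ch = part[i]
        if ch == "%" and i + 1 < len(part):
            nxt = part[i + 1]
            if nxt == "%":
                out.append("%")
                i += 2
                continue
            if nxt == "c":
                out.append(":")
                i += 2
                continue
            if nxt == "s":
                out.append(escaped_ref)
                saw_s = True
                i += 2
                continue
        # Assumption: any other character, including an unrecognized
        # "%x" sequence or a trailing "%", is copied through unchanged.
        out.append(ch)
        i += 1
    result = "".join(out)
    if not saw_s:
        result += " " + escaped_ref
    return result
```

Because the colon is the separator, a command that needs a literal colon has to write it as %c; the split happens first, and %c is only turned back into ":" during expansion.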

An implementation of this convention must convert any reference to an ``escaped absolute reference''. An ``absolute reference'' is either an absolute local pathname (a filename beginning with "/" and not containing any high-bit characters or control characters) or an absolute URI/URL (which begins with lower case letters followed by a ":"). In an absolute URI/URL, all characters except for the following must be URL-escaped (replaced with %hh, where hh is its hexadecimal value):

; / ? : @ & = + $ , # % A-Z a-z 0-9 - _ . ! ~ * ' ( )

An absolute reference must not include the NUL character (0). Any other kind of reference must be either rejected or converted into an absolute reference before executing the command parts. For example, filenames with high-bit characters must be converted to a "file:" URI. This absolute reference is then converted into an ``escaped absolute reference'' by inserting a backslash before each of the following characters (to prevent certain shell-based security attacks):

& ; ` ' \ " | * ? ~ < > ^ ( ) [ ] { } $ \n \r (space) \t
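As a rough sketch of the two escaping steps, the following Python functions URL-escape a URI and then backslash-escape the result for the shell. The function names are hypothetical. Encoding the URI as UTF-8 before escaping and emitting upper-case hexadecimal digits are both assumptions; the text above only says "%hh".

```python
import string

# Characters that may appear unescaped in an absolute URI/URL (from the list above).
URI_SAFE = set(";/?:@&=+$,#%-_.!~*'()" + string.ascii_letters + string.digits)

# Characters that must be preceded by a backslash before reaching the shell.
SHELL_SPECIAL = set("&;`'\\\"|*?~<>^()[]{}$\n\r \t")


def url_escape(uri):
    """Replace every byte outside URI_SAFE with %hh."""
    out = []
    for byte in uri.encode("utf-8"):  # assumption: UTF-8 encoding
        ch = chr(byte)
        out.append(ch if ch in URI_SAFE else "%%%02X" % byte)
    return "".join(out)


def shell_escape(absolute_ref):
    """Turn an absolute reference into an escaped absolute reference."""
    if "\0" in absolute_ref:
        raise ValueError("absolute reference must not contain character 0")
    return "".join("\\" + ch if ch in SHELL_SPECIAL else ch
                   for ch in absolute_ref)


if __name__ == "__main__":
    print(shell_escape(url_escape("http://www.google.com/search?q=url")))
```

Note that "%" is in the safe list, so a URI that already contains %hh escapes is not escaped a second time.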

Alternative Secure BROWSER Definition

The previous definition is complex, because sending text directly through a shell is hard to secure. By changing how BROWSER works, it's easier to secure:

The BROWSER environment variable contains a colon-separated series of browser command parts. These command parts are tried in order, executing each directly (e.g., using execvp(3), and not using system(3) or /bin/sh), until one succeeds (returns 0); empty command parts (e.g., "::") are ignored. A command part begins with the filename of the command to run, followed by 0 or more parameters (separated by one or more spaces). After the command part, the absolute reference (e.g., URI) to be browsed is added.

An implementation of this convention must convert any reference to an ``absolute reference''. An ``absolute reference'' is either an absolute local pathname (a filename beginning with "/") or an absolute URI/URL (which begins with lower case letters followed by a ":"). An absolute reference must not include the NUL character (0). An implementation may choose to URL-escape any characters not legal in URLs, but it doesn't have to do so. Any other kind of reference must be either rejected or converted into an absolute reference before executing the command parts.
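A minimal Python sketch of the alternative definition follows. It is an illustration under stated assumptions, not a reference implementation: the names is_absolute_reference and browse are hypothetical, non-absolute references are simply rejected rather than converted, and a command that cannot be started is treated as a failure so the next command part is tried. subprocess.call is given an argument list and no shell, so the reference is never interpreted by /bin/sh.

```python
import re
import subprocess


def is_absolute_reference(ref):
    """True for an absolute pathname or a URI starting with lower-case letters and ':'."""
    if "\0" in ref:
        return False
    return ref.startswith("/") or re.match(r"[a-z]+:", ref) is not None


def browse(browser_value, absolute_ref, run=subprocess.call):
    """Try each command part in order; return True once one returns 0."""
    if not is_absolute_reference(absolute_ref):
        # This sketch rejects such references instead of converting them.
        raise ValueError("not an absolute reference")
    for part in browser_value.split(":"):
        # Parameters are separated by one or more spaces.
        argv = [word for word in part.split(" ") if word]
        if not argv:
            continue  # empty command parts are ignored
        try:
            status = run(argv + [absolute_ref])
        except OSError:
            continue  # assumption: a command that cannot start counts as failed
        if status == 0:
            return True
    return False
```

The run parameter exists only so the function can be exercised without launching a real browser; for example, browse("lynx", "http://www.dwheeler.com/browse") would run lynx with the URI as its single argument.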

Simple Example

For example, users can set their BROWSER variable by running the following command in a Bourne-like shell:

    BROWSER='netscape -raise -remote "openURL(%s,new-window)":lynx'
    export BROWSER

Then, when a program wants to view "http://www.google.com/search?q=url", it'll look at the BROWSER variable. It will first try by running the following command through /bin/sh (as one line):

    netscape -raise -remote "openURL(http://www.google.com/search\?q=url,new-window)"

and if that fails, it'll try:

    lynx http://www.google.com/search\?q=url

Note that the "%s" was changed to the escaped absolute reference in the first case, and that an implied " %s" was added in the second case. Note also that the "?" is preceded by a backslash when handed to the shell, to prevent it from being (mis)interpreted. The shell will remove the backslash and hand the final browser the intended value.

Implementation Notes

Before discussing implementation details, a few notes on how to implement the SB convention may help: