David A. Wheeler's Blog

Mon, 02 May 2005

Trend: Simple, readable text markup languages

Here’s a new(?) trend that shows everything old really is sometimes new again. What’s the trend? Simple, highly readable markup languages.

In some situations, typical document formats (such as OpenDocument or Word .doc format) simply don’t work well. These situations include massive collaboration over the Internet and the creation of relatively simple or short documents. Although existing markup languages like DocBook, HTML/XHTML, LaTeX, and nroff/man all work, they’re often complicated to write and read. You could use SGML or XML to create your own markup language, but that doesn’t really address the need for simplicity. None of these work very well if you expect to have many users who don’t understand computers deeply (HTML comes closest, but complicated HTML documents become unreadable in a hurry).

Thus, there’s been a resurgence of new markup languages that are really easy to read and write, which can then be automatically translated to other formats. Two especially capable examples of this trend seem to be AsciiDoc and MediaWiki:

  1. AsciiDoc looks very reasonable if you want to create documents or websites; it can generate HTML, XML, DocBook, PDF, and man pages (DocBook can in turn generate other formats; AsciiDoc can also generate the obsolete LinuxDoc format). This is no trivial tool; it can handle cross-links, tables, and so on. Technically, AsciiDoc processing requires an implementation to look ahead to the next line to understand the current one; some find this annoying, but if it makes the language easy to read, I think that’s quite reasonable.
  2. Wikipedia’s markup language (supported by MediaWiki) has grown a lot of capabilities (to support creating an encyclopedia), yet it’s still easy to use (and thus is a really capable example of this trend). There are a vast number of users of this notation, but setting up a processor for it isn’t so easy.
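
The one-line lookahead mentioned for AsciiDoc can be pictured with a small Python sketch. This is not AsciiDoc's actual implementation: the function name find_titles, the mapping of underline characters to levels, and the "length within 2 characters" rule are all assumptions made for illustration. The point is only that a line cannot be classified as a title until the line after it has been seen.

```python
def find_titles(lines):
    """Return (level, title) pairs for lines underlined by the following line.

    Simplified, assumed rule: a non-empty line is a title if the next line
    consists of one repeated underline character and has about the same length.
    """
    underline_levels = {"=": 0, "-": 1, "~": 2}  # assumed mapping for this sketch
    titles = []
    i = 0
    while i < len(lines):
        line = lines[i].rstrip()
        # Look ahead one line; treat end of input as an empty line.
        nxt = lines[i + 1].rstrip() if i + 1 < len(lines) else ""
        is_underlined = (
            line != ""
            and nxt != ""
            and nxt[0] in underline_levels
            and set(nxt) == {nxt[0]}
            and abs(len(nxt) - len(line)) <= 2
        )
        if is_underlined:
            titles.append((underline_levels[nxt[0]], line))
            i += 2  # consume both the title and its underline
        else:
            i += 1
    return titles


sample = [
    "My Document",
    "===========",
    "",
    "Some text.",
    "Section",
    "-------",
    "More text.",
]
print(find_titles(sample))
```

The source text stays readable as plain text, which is the trade-off described above: slightly more work for the processor in exchange for markup that looks like an ordinary underlined heading.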

The various Wiki languages, such as MoinMoin’s, are also examples of this trend. But there are a lot of different ones, all incompatible. Here’s some text on StructuredText, reStructuredText, and WikiText. Unfortunately, many Wiki languages use CamelCase to create links; a lot of people (including me) find that convention ugly and awkward (MediaWiki dumped CamelCase years ago; MediaWiki internal links look like this: [[MediaWiki link]]). Most Wiki languages are too limiting for wider use.
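
To make the [[MediaWiki link]] notation concrete, here is a rough Python sketch of how a double-bracket link might be turned into HTML. It is not MediaWiki's real parser; the function name render_links, the /wiki/ URL prefix, and the handling of the optional "|label" part are assumptions for illustration only.

```python
import html
import re
from urllib.parse import quote

# [[Target]] or [[Target|label]]; nested brackets are not handled in this sketch.
LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def render_links(text, base="/wiki/"):
    """Replace double-bracket links in text with HTML anchors.

    Only the link itself is escaped; the surrounding text is passed through
    unchanged, so this is not safe for untrusted input as written.
    """
    def repl(match):
        target = match.group(1).strip()
        label = (match.group(2) or target).strip()
        # Assumed convention: spaces in the page name become underscores.
        href = base + quote(target.replace(" ", "_"))
        return '<a href="{}">{}</a>'.format(html.escape(href), html.escape(label))

    return LINK.sub(repl, text)


print(render_links("See [[MediaWiki link]] for details."))
```

Compared with CamelCase, the brackets let a link contain spaces and ordinary capitalization, so the page name reads the same in the source as it does in the rendered page.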

No doubt there are others. One I learned about recently is Markdown. Markdown is a notation for simply writing text and generating HTML or XHTML; it seems to be focused on helping bloggers.

Anyway, it’s an interesting trend! I’ve created a new essay about this at http://www.dwheeler.com/essays/simple-markup.html; if I learn about interesting new links related to this, I’ll add them there.
