html2wikipedia Home Page

This is the home page for html2wikipedia, a simple tool (filter) that translates HTML into the Wikipedia wiki format. It's designed for Unix-like systems (including GNU/Linux), and I've been told it runs fine under Cygwin if you install that first. It should run on Windows, but that is untested; you should be able to compile it with any C compiler (such as MinGW, the gcc port for Windows). If you want to change the code, you'll also need a lex implementation; these come standard with Unix systems, but Windows users will have to install one.

It handles HTML heading levels 1 to 3, ordered and unordered lists (including nested ones), "pre" sections, bold, italics, centering, horizontal rules, tables, and Microsoft's bizarre left and right quotes (both single and double). It also translates HTML links: links to Wikipedia entries are translated into the special Wikipedia cross-references, and all other links are translated as normal hypertext links.
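
To make the kind of translation described above concrete, here is a minimal sketch in Python. It is not html2wikipedia's actual code (the real tool is written in C with lex) and it covers only headings, bold, italics, horizontal rules, and links. The function names, the regular expressions, the heading mapping (h1 to "=", h2 to "==", h3 to "==="), and the URL prefix used to recognize Wikipedia links are all assumptions made for illustration; the real tool may choose differently.

```python
import re
from urllib.parse import unquote

# Assumption: a link counts as a "Wikipedia entry" if it starts with this prefix.
WIKIPEDIA_PREFIX = re.compile(r"^https?://en\.wikipedia\.org/wiki/", re.IGNORECASE)


def convert_link(match):
    """Turn one <a href="...">text</a> match into wiki link syntax."""
    url, text = match.group(1), match.group(2)
    if WIKIPEDIA_PREFIX.match(url):
        # Wikipedia entry: emit a cross-reference such as [[Article]].
        article = unquote(WIKIPEDIA_PREFIX.sub("", url)).replace("_", " ")
        if article == text:
            return "[[" + article + "]]"
        return "[[" + article + "|" + text + "]]"
    # Any other link: emit a normal external link, [url text].
    return "[" + url + " " + text + "]"


def html_to_wiki(html):
    """Translate a small subset of HTML into wiki markup (illustrative only)."""
    text = html
    # Headings: assumed mapping of <hN> to N equals signs on each side.
    for level in (1, 2, 3):
        marks = "=" * level
        text = re.sub(
            r"<h%d[^>]*>(.*?)</h%d>" % (level, level),
            lambda m, marks=marks: marks + " " + m.group(1).strip() + " " + marks,
            text,
            flags=re.IGNORECASE | re.DOTALL,
        )
    # Bold and italics use runs of apostrophes in wiki markup.
    text = re.sub(r"</?b>", "'''", text, flags=re.IGNORECASE)
    text = re.sub(r"</?i>", "''", text, flags=re.IGNORECASE)
    # Horizontal rule.
    text = re.sub(r"<hr\s*/?>", "----", text, flags=re.IGNORECASE)
    # Links, handled last so the link text is already converted.
    text = re.sub(
        r'<a\s+href="([^"]*)"[^>]*>(.*?)</a>',
        convert_link,
        text,
        flags=re.IGNORECASE | re.DOTALL,
    )
    return text


if __name__ == "__main__":
    sample = (
        '<h2>Shells</h2>'
        '<b>See</b> <a href="http://en.wikipedia.org/wiki/Unix_shell">Unix shell</a>'
        ' or <a href="http://example.com/">this site</a>.'
    )
    print(html_to_wiki(sample))
```

A regular-expression approach like this breaks down quickly on nested lists and tables, which is one reason a lexer-based filter is a more natural fit for the full feature set.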

It is covered by the GNU General Public License (GPL), version 2 or later. At one time, I had an additional requirement ("if you put it on a website, you also have to make the database publicly downloadable"); as of June 11, 2006, I have removed this restriction, so it's simply under the GPL version 2 or later. (Since I wrote the program, Wikipedia has set up automated systems to make its database available, as I'd hoped.)

If you wish, you can first see the README file, which includes installation instructions. The installation steps require that you understand how to install and compile a program; if these instructions are confusing, you'll probably need someone else to help you.

Download it now! (tarball format)

Feel free to see my home page.