David A. Wheeler's Blog

Thu, 06 Dec 2007

Readable s-expressions (sweet-expressions) draft 0.2 for Lisp-like languages

Back in 2006 I posted my basic ideas about “sweet-expressions”. Lisp-based programming languages normally represent programs as s-expressions, and though s-expressions are regular, most people find them hard to read. I hope to create an alternative to s-expressions that has their advantages without their disadvantages. You can see more at my readable Lisp page. I’ve gotten lots of feedback, based both on my prototype of the idea and on the mailing list discussing it.

I’ve just posted a draft of version 0.2 of sweet-expressions. This takes that feedback into account; in particular, it’s now much more backwards-compatible. There’s still a big question about whether or not infix should be a default; see the page for more details.

Here are the improvements over version 0.1:

  1. This version is much more compatible with existing Lisp code. The big change is that an unprefixed “(” immediately calls the underlying s-expression reader. This way, people can quietly replace their readers with a sweet-reader, without harming most existing code. In fact, many implementations could quietly switch to a sweet-reader and users might not notice until they use the new features. Instead of using (…), this uses {…} and […] for grouping expressions without disabling sweet-expressions.
  2. It can work more cleanly with macros that provide infix precedence (for those who want precedence rules).
  3. It extends version 0.1’s “name-prefixing” into “term-prefixing”. This is not only more general, it also makes certain kinds of functional programming much more pleasant.
  4. It adds syntax for the common case of accessing maps (such as indexed or associative arrays) - now a[j] is translated into (bracketaccess a j).
  5. Infix default supports arbitrarily-spelled infix operators, and it automatically accepts “and” and “or”.
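
Item 4’s translation is easy to model. As a rough sketch (this is my own illustrative Python, not the demo reader; a real sweet-reader works on a token stream, not regular expressions), rewriting a[j]-style accesses into the (bracketaccess a j) form might look like:

```python
import re

# Illustrative sketch only: rewrite map-access syntax like a[j] into the
# (bracketaccess a j) form described above. The regex and function name
# are my own; the actual sweet-reader operates on tokens, not strings.
_ACCESS = re.compile(r'([A-Za-z_][A-Za-z0-9_-]*)\[([^\[\]]+)\]')

def translate_bracket_access(text):
    """Repeatedly rewrite NAME[expr] as (bracketaccess NAME expr)."""
    while _ACCESS.search(text):
        text = _ACCESS.sub(r'(bracketaccess \1 \2)', text)
    return text

print(translate_bracket_access("a[j]"))     # (bracketaccess a j)
print(translate_bracket_access("a[b[k]]"))  # (bracketaccess a (bracketaccess b k))
```

Nested accesses work because the innermost brackets are rewritten first; the loop then picks up the enclosing access.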

Here’s an example of (ugly) s-expressions:

 (defun factorial (n)
   (if (<= n 1)
       1
       (* n (factorial (- n 1)))))

Here’s sweet-expressions version 0.1:

 defun factorial (n)
   if (n <= 1)
       1
       n * factorial(n - 1)

Here is sweet-expressions version 0.2 (draft), with infix default (it figures out when you have an infix operation from the spelling of the operation):

 defun factorial (n)
   if {n <= 1}
       1
       n * factorial(n - 1)

Here is sweet-expressions version 0.2 (draft), with infix non-default (you must surround every infix operator with {…}):

 defun factorial (n)
   if {n <= 1}
       1
       {n * factorial{n - 1}}

I’m still taking comments. If you’re interested, take a look at http://www.dwheeler.com/readable. And if you’re really interested, please join the readable-discuss mailing list.

path: /misc | Current Weblog | permanent link to this entry

Mon, 05 Nov 2007

Please point me to High-Assurance Free-Libre / Open Source Software (FLOSS) Components

I’m looking for High Assurance and Free-Libre / Open Source Software (FLOSS) components. Can anyone point me to ones I don’t know about? A little context might help, I suppose…

A while back I posted a paper, High Assurance (for Security or Safety) and Free-Libre / Open Source Software (FLOSS). For purposes of the paper, I define “high assurance software” as software where there’s an argument that could convince skeptical parties that the software will always perform or never perform certain key functions without fail. That means you have to show convincing evidence that there are absolutely no software defects that would interfere with the software’s key functions. Almost all software built today is not high assurance; developing high assurance software is currently a specialist’s field.

High assurance and FLOSS are potentially a great match. We achieve high assurance in scientific analysis and mathematical proofs by subjecting them to peer review, and then worldwide review. FLOSS programs, unlike proprietary programs, can receive similar kinds of review, and many FLOSS programs achieve really good reliability figures for medium assurance. So it can be easily argued that stepping up to high assurance should be easier for FLOSS than for proprietary software. In addition, there are a vast number of FLOSS tools that support developing high assurance components, including PVS, ACL2, Isabelle/HOL, prover9, and Alloy (there are lots more; see the paper for details).

Yet it’s hard to find High Assurance Free-Libre / Open Source Software (FLOSS) components. To be fair, high assurance software components are exceedingly rare in the non-FLOSS world as well. But I suspect there are more than I’ve found, and I hope that people will help me by pointing them out to me. I’d like to know about such components for direct use, and also simply for use as demonstrations of how to actually develop high assurance software. Today, it’s nearly impossible to explain how to develop high assurance software, because there are almost no fully-published examples. Existing “formal methods successes” papers generally don’t publish everything (specs, proofs, and code), which makes it impossible to really learn how to do it. And no, I don’t require that proofs prove every line of machine code; but showing how correspondence demonstrations are made would be valuable, and that’s currently not well demonstrated by working examples.

If you’re curious about the general topic, take a peek at High Assurance (for Security or Safety) and Free-Libre / Open Source Software (FLOSS). I’ve collected lots of interesting information; hopefully it will be useful to some of you. And let me know of high-assurance FLOSS components that I don’t already know about.

path: /oss | Current Weblog | permanent link to this entry

Sun, 04 Nov 2007

Added “MapReduce” to the “Software Innovations” list

Ken Krugler’s recent blog post said that my article The Most Important Software Innovations was “very good”, but he was surprised that I hadn’t included MapReduce as an important software innovation. Basically, MapReduce makes writing certain kinds of programs that process huge amounts of data, on vast distributed clusters, remarkably easy and efficient. (Wikipedia explains MapReduce, including links to alternative implementations like the open source Hadoop.)
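
The programming model itself is easy to sketch in miniature. The following toy, single-process Python stand-in (all names here are mine, not Google’s API or Hadoop’s) shows the shape of a MapReduce word count:

```python
from collections import defaultdict
from itertools import chain

# Toy, single-process sketch of the MapReduce model: the user supplies a
# map function (record -> list of (key, value) pairs) and a reduce
# function (key, values -> result); the framework handles the grouping.
def map_reduce(records, map_fn, reduce_fn):
    groups = defaultdict(list)
    for key, value in chain.from_iterable(map_fn(r) for r in records):
        groups[key].append(value)  # the "shuffle" step: group by key
    return {key: reduce_fn(key, values) for key, values in groups.items()}

# Classic example: counting words across many documents.
def word_map(doc):
    return [(word, 1) for word in doc.split()]

def word_reduce(word, counts):
    return sum(counts)

docs = ["to be or not to be", "be here now"]
print(map_reduce(docs, word_map, word_reduce))
# {'to': 2, 'be': 3, 'or': 1, 'not': 1, 'here': 1, 'now': 1}
```

The real framework’s contribution is that the grouping step runs across thousands of machines with fault tolerance; the user-visible interface is just these two small functions.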

It’s not because I didn’t know about MapReduce; I read about it almost immediately after it got published. I thought it was very promising, and even forwarded the original paper to some co-workers. I think MapReduce is especially promising because, now that we have cheap commodity computers, having a way to easily exploit their capabilities is really valuable. But even with something this promising, I didn’t want to add it to my list of innovations right away - after all, maybe after a little while it turns out to be not so helpful.

Currently, there aren’t many who have Google-sized clusters of computers available. But it’s clear that this approach is useful in many other circumstances as well. It’s new, but I think it’s stood the test of time enough that it’s a worthy addition… so I’ve just added it.

One interesting issue is that the MapReduce framework is itself built primarily on the “map” and “reduce” functions, which are far, far older. So, is MapReduce really a new idea, or is it just a high-quality implementation of an old idea? I’ll accept that it’s a new idea, but that can be difficult to judge. This judgment doesn’t really matter unless you think software patents are a good idea (since every software patent in theory prevents progress for 20 years). But I think it’s quite clear that software patents are a foolish idea, and it’s clear that others have come to the same conclusion. Eric S. Maskin, an economist who has long criticised the patenting of software, recently received the 2007 Nobel Prize for Economics. Here’s a nice quote: “… when patent protection was extended to software in the 1980s, […] standard arguments would predict that R&D intensity and productivity should have increased among patenting firms. Consistent with our model, however, these increases did not occur.” Someone who correctly predicted that software patents were harmful to innovation just received a Nobel prize. I hope to someday see people receive other prizes because they ended software patenting in the United States.

path: /misc | Current Weblog | permanent link to this entry

Mon, 22 Oct 2007

Donald Macleay

Don Macleay was my mentor and friend, and he just passed away (Oct. 15, 2007). So, this is a small blog entry in his honor.

Here’s what I said at his funeral: “In 1980, Don was the manager of a computer store. I was only 15, but he took a chance on employing me, and I’m grateful. He taught me much, in particular, showing by example that you could be in business (even as a salesman!) and be an honest person. He later moved to other companies, and I moved twice with him, because I found that good bosses were hard to find. Don was honest, reliable, a good friend, and an inspiration to me. I will miss him, and I look forward to seeing him again in heaven.” I should add that he spoke at my Eagle scout ceremony. Later on, when he moved out to the country, it was always a pleasure to visit him and his family.

Here’s a part of his biography, as printed in the funeral bulletin: “Born in Washington, D.C., on October 27, 1934, Donald Macleay was raised in Falls Church. He attained the rank of Eagle Scout and graduated at the top of the first class of St. Stephen’s School in 1952. In 1956, he graduated with a Bachelor of Arts in English from the Virginia Military Institute (VMI).

After serving as a Marine Corps officer, Donald Macleay spent many years in the business world before becoming a Parole Officer for the Department of Juvenile Justice in Stafford County. As well, in 1992, he was a candidate for the U.S. Congress as an Independent.”

The biography goes on to note that he “valued being a Christian, a husband, a father and grandfather, and a friend.” Much of his last years were spent helping troubled youth in his area (Fredericksburg, VA), and from all accounts he was extraordinarily successful at helping them and their families.

path: /misc | Current Weblog | permanent link to this entry

Thu, 11 Oct 2007

Readable s-expressions (sweet-expressions) for Lisp-like languages

Back in 2006 I posted my basic ideas about “sweet-expressions”. Here’s a basic recap, before I discuss what’s new. Lisp-based programming languages normally represent programs as s-expressions, where an operation and its parameters are surrounded by parentheses. The operation to be performed is identified first, and each parameter afterwards is separated by whitespace. So the traditional “2+3” is written as “(+ 2 3)” instead. This is regular, but most people find this hard to read. Here’s a longer example of an s-expression - notice the many parentheses and the lack of infix operations:

 (defun factorial (n)
   (if (<= n 1)
       (* n (factorial (- n 1)))))

Lisp-based systems are very good at symbol manipulation tasks, including program analysis. But many software developers avoid Lisp-based languages, even in cases where they would be a good tool to use, because most software developers find s-expressions really hard to read. I think I’ve found a better solution, which I call “sweet-expressions”. Here’s that same program written using sweet-expressions:

 defun factorial (n)         ; Parameters can be indented, but need not be
   if (n <= 1)               ; Supports infix, prefix, & function <=(n 1)
       1                     ; This has no parameters, so it's an atom.
       n * factorial(n - 1)  ; Function(...) notation supported

Sweet-expressions add the following abilities:

  1. Indentation. Indentation may be used instead of parentheses to start and end expressions: any indented line is a parameter of its parent, later terms on a line are parameters of the first term, lists of lists are marked with GROUP, and a function call with 0 parameters is surrounded or followed by a pair of parentheses [e.g., (pi) and pi()]. A “(” disables indentation until its matching “)”. Blank lines at the beginning of a new expression are ignored. A term that begins at the left edge and is immediately followed by newline is immediately executed, to make interactive use pleasant.
  2. Name-ending. Terms of the form ‘NAME(x y…)’, with no whitespace before ‘(’, are interpreted as ‘(NAME x y…)’. Parameters are space-separated inside. If the content is an infix expression, it’s considered one parameter instead (so f(2 + 3) computes 2 + 3 and passes its result, 5, to f).
  3. Infix. Optionally, expressions are automatically interpreted as infix if their second parameter is an infix operator (by matching an “infix operator” pattern of symbols), the first parameter is not an infix operator, and the expression has at least three parameters. Otherwise, expressions are interpreted as normal “function first” prefix notation. To disable infix interpretation, surround the second parameter with as(…). Infix expressions must have an odd number of parameters with the even ones being the same binary infix operator. You must separate each infix operator with whitespace on both sides. Precedence is not supported; just use parens (a lot more about that in a moment). Use the “name-ending” form for unary operations, e.g., -(x) for “negate x”. Thus “2 + (y * -(x))” is a valid expression, equivalent to (+ 2 (* y (- x))). “(2 + 3 + 4)” is fine too. Infix operators must match a simple symbol pattern (and in Scheme cannot be =>).
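
To make rule 3 concrete, here is a rough Python model of the infix decision (my own sketch of the rules above, not the demo reader’s code; the operator pattern is a simplified stand-in for the draft’s real one):

```python
import re

# Simplified stand-in for the draft's infix-operator pattern: one or
# more punctuation-style symbol characters. (The real pattern differs.)
INFIX_OP = re.compile(r'^[+\-*/<>=&|]+$')

def is_infix_op(term):
    return isinstance(term, str) and bool(INFIX_OP.match(term))

def transform(terms):
    """Rewrite one list of terms per the sketched infix rule."""
    if (len(terms) >= 3 and len(terms) % 2 == 1           # odd, at least 3
            and not is_infix_op(terms[0])                 # first term not an op
            and is_infix_op(terms[1])
            and all(t == terms[1] for t in terms[1::2])): # all ops identical
        # Infix: operator first, operands after. Chaining like
        # (2 + 3 + 4) collapses into one call, since Lisp's + is n-ary.
        return [terms[1]] + terms[0::2]
    return terms  # otherwise: ordinary prefix notation, unchanged

print(transform(['2', '+', '3', '+', '4']))  # ['+', '2', '3', '4']
print(transform(['n', '<=', '1']))           # ['<=', 'n', '1']
print(transform(['f', 'x', 'y']))            # unchanged: not infix
```

Mixed operators like (2 + 3 * 4) fail the “all operators identical” check, so they are rejected rather than silently guessing a precedence, which matches the “no precedence; just use parens” rule.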

For more information on sweet-expressions or on making s-expressions more readable in general, see my website page at http://www.dwheeler.com/readable. For example, I provide a demo sweet-expression reader in Scheme (under the MIT license), as well as an indenting pretty-printer in Common Lisp. In particular, you can see my lengthy paper about why sweet-expressions do what they do, and some plausible alternatives. You can also download some other implementation code. I’ve also set up a SourceForge project named “readable” to discuss options in making s-expressions more readable, and to distribute open source software to implement them (unimplemented ideas don’t go far!).

Okay, but all of that was true in 2006 - what’s new? What’s new is a change of heart about precedence. I’ve been occasionally trying to figure out how to “flesh out” sweet-expressions with operator precedence, and I just kept hitting one annoying complication after another. Precedence is nearly universal among programming languages; it’s very useful, and only a few infix-supporting languages (such as Smalltalk) lack it. “Everyone” knows that 2+3*4 is 14, not 20, because of years of training in math classes that you multiply before you add. Precedence is also pretty easy to code (it’s an old, solved problem). But I’ve discovered that in the typical use cases of a Lisp-like language’s expression reader, supporting precedence (in the general case) has some significant downsides that are irrelevant in other situations. That’s interesting, given how widespread precedence is elsewhere, so let’s see why that is.

First, let’s talk about a big advantage to not supporting precedence in sweet-expressions: It makes the creation of every new list obvious in the text. That’s very valuable in a list processing language; the key advantage of list processing languages is that you can process programs like data, and data like programs, in a very fluid way, so having clear markers of new lists using parentheses and indentation is very valuable.

Now let me note the downsides to supporting precedence in the specific cases of a Lisp-like language, which leads me to believe that it’s a bad idea for this particular use. Basically, adding precedence rules to a general-purpose list expression processor creates a slippery slope of complexity. There are two basic approaches to defining precedence: dynamic and static.

It’s easier to add precedence later, if that turns out to be important after more experimentation. But after the experimentation I’ve done so far, it appears that precedence simply isn’t worth it in this case. Precedence creates complexity in this case, and it hides where the lists begin/end. It’s not hard to work without it; you can even argue that (2 + (5 * 6)) is actually clearer than (2 + 5 * 6). Precedence is great in many circumstances - I’d hate to lose it in other languages - but in this particular set of use cases, it seems to be more of a hurt than a help.

Of course, you can write code in some Lisp dialect to implement a language that includes precedence. Many programs written in Lisp, including PVS and Maxima, do just that. But when you’re implementing another language, you know what the operators are, and you’re probably implementing other syntactic sugar too, so adding precedence is a non-problem. Also, if you’re really happy with s-expressions as they are, and just want precedence in a few places in your code, a simple macro to implement them (such as infpre) works very well. But sweet-expressions are intended to be a universal representation in Lisp-like languages, just like s-expressions are, so their role is different. In that different role, precedence causes problems that don’t show up in most other uses. I think not supporting precedence turns out to be much better for this different role.

Here are some more examples, this time in Scheme (another Lisp dialect). Each sweet-expression is followed by its (ugly) s-expression equivalent:

Sweet-expression:

 define factorial(n)
    if (n <= 1)
        1
        n * factorial(n - 1)

S-expression:

 (define (factorial n)
    (if (<= n 1)
        1
        (* n (factorial (- n 1)))))

Sweet-expression:

 substring("Hello" (1 + 1) string-length("Hello"))

S-expression:

 (substring "Hello" (+ 1 1) (string-length "Hello"))

Sweet-expression:

 define move-n-turn(angle)
    tortoise-move(100)
    tortoise-turn(angle)

S-expression:

 (define (move-n-turn angle)
    (tortoise-move 100)
    (tortoise-turn angle))

Sweet-expression:

 if (0 <= 5 <= 10)
    display("True\n")
    display("Uh oh\n")

S-expression:

 (if (<= 0 5 10)
    (display "True\n")
    (display "Uh oh\n"))

Sweet-expression:

 define int-products(x y)
   if (x = y)
     x
     x * int-products((x + 1) y)

 int-products(3 5)

S-expression:

 (define (int-products x y)
   (if (= x y)
       x
       (* x (int-products (+ x 1) y))))

 (int-products 3 5)

Sweet-expression:

 (2 + 3 + (4 * 5) + 7.1)
 (2 + 3 + (4 / (5 * 6)))
 *(2 3 4 5)

S-expression:

 (+ 2 3 (* 4 5) 7.1)
 (+ 2 3 (/ 4 (* 5 6)))
 (* 2 3 4 5)

So I’ve modified my demo program so that it supports infix operator chaining, such as (2 + 3 + 4). Since I no longer need to implement precedence, the addition of chaining means that I now have a working model of the whole idea, ready for experimentation. My demo isn’t ready for “serious use” in development yet; it has several known bugs and weaknesses. But it’s good enough for experimentation, to see if the basic idea is sensible - and I think it is. You can actually sit down and play with it, and see if it has merit. There are still some whitespace rules I’d like to fiddle with, to make both long files and interactive use as comfortable as possible, but these are at the edges of the definitions… not at its core.

I’m suggesting the use of && for “logical and”, and || for “logical or”. These are common symbols in other languages, and using the same symbols aids readability. Now, in Common Lisp and some Scheme implementations, || is “the symbol with 0-length name”. Oddly enough, this doesn’t seem to be a problem; Lisps can generally bind to the symbol with the 0-length name, and print it the same way, so it works perfectly well! In Scheme this is trivially done by running this:

 define(&& and)
 define(|| or)

Then you can do this:

 (#t && #t)
 if ((a > b) || ((a * 2) < (c + d + e))) ...

Instead of the hideous s-expressions:

 (and #t #t)
 (if (or (> a b) (< (* a 2) (+ c d e))) ...)

Here are some quotable quotes, by the way, showing that I’m not the only one who thinks there’s room for improvement:

Lisp-based languages are all over the place. There are a vast number of implementations of Common Lisp and Scheme. GNU guile is a Scheme implementation embedded into many other programs, for example. GNU emacs is a widely-used text editor, built on its own dialect of Lisp. AutoCAD has its own variant under the covers, too. Programs like PVS are implemented in Lisp, and interacting with it currently requires using s-expressions. It’d be great if all of these supported an alternative, simpler syntax. With sweet-expressions, typical s-expressions are legal too. So I think this is a widely-useful idea.

So if you’re interested, take a look at http://www.dwheeler.com/readable.

path: /misc | Current Weblog | permanent link to this entry

Thu, 04 Oct 2007

Software Assurance 2007

Lots of interesting things are happening with the various efforts to eliminate or counter software vulnerabilities. The Software Security Assurance (SwA) State-of-the-Art Report (SOAR) tries to list what’s going on, especially in things related to the U.S. government. As with any such document, it’s incomplete, and it’s only a snapshot (things keep changing!). But if you haven’t been following this world, and want to know “what’s going on”, it’s the best place I know of to start. Of course, you can also look at sites such as the U.S. DHS / CERT “build security in” site.

The U.S. National Vulnerability Database tracks specific vulnerabilities in specific products; they identify each vulnerability using the unique id defined by Common Vulnerabilities and Exposures (CVE). But if the world is going to prevent these kinds of vulnerabilities from happening in the future, we need to categorize them so that everyone agrees on what the categories are. Informally, there are lots of ways to categorize them, but their meanings differ between people. That’s a real problem when comparing tools; different tools find different problems, but without agreed-on terminology, it’s hard to even describe their differences. MITRE is currently developing a way to categorize all vulnerabilities that everyone can agree on, called Common Weakness Enumeration (CWE). The U.S. National Vulnerability Database and MITRE have worked out a set of CWEs that they will use to categorize vulnerabilities. The CWE is still being developed, but at least some common terminology is getting worked out.

path: /security | Current Weblog | permanent link to this entry

Mon, 24 Sep 2007

FLOSS License Slide - Modified for LGPL

My “Free-Libre / Open Source Software (FLOSS) FLOSS License Slide” helps people figure out if the most widely-used FLOSS licenses are compatible (that is, if you can combine their software to produce new works). The main body of the PDF text fits all on one page, which can be handy.

My thanks to Olaf Schmidt, who found an error of omission in the previous version of my “slide”. He pointed out to me that the GNU Lesser General Public License (LGPL) 2.1 is even more compatible with various versions of the GPL than the plain reading of my “slide” originally noted. That’s because the LGPL has explicit text noting that you can switch its license to GPL version 2 or later, and similarly LGPL version 3 explicitly says that you can also use GPL version 3 or later. I fixed this by adding one more arrow to my diagram, which was enough to capture this. I also noted that the previous version of the LGPL is version 2.1, not 2 (previous versions of the LGPL exist but are becoming uncommon). I also added some additional text, so I hope that the LGPL-related text is even clearer now.


path: /oss | Current Weblog | permanent link to this entry

Wed, 18 Jul 2007

Navy: Open Source Software IS Commercial Software

On June 5, 2007, the U.S. Navy released some guidance on Open Source Software. In particular, they noted that if Open Source Software (OSS) met the U.S. law definition of “commercial item”, it was a commercial item. (They actually say OSS that meets that definition is commercial off-the-shelf (COTS), not just a “commercial item” - presumably because they had in mind the off-the-shelf open source software.) I’m delighted to see this guidance, because I’ve been saying the same thing. This Navy memo was pretty clear, yet some people seemed to have really odd interpretations of it. In particular, GCN reported that “After several years of evaluation, the Navy Department has approved the use of open-source software in all Navy and Marine Corps information technology systems.” This GCN article makes it seem like there’s been some big change in direction, and I think that’s a terrible misunderstanding of what’s going on here.

I believe this 2007 Navy memo is not a change in policy or direction. This Navy memo merely tries to counter a widespread misunderstanding that is sometimes resulting in a failure to obey U.S. law, the U.S. government’s Federal Acquisition Regulations (aka the FAR), and (by implication) the U.S. Department of Defense’s (DoD) Defense Federal Acquisition Regulation Supplement (DFARS). This memo also serves as a restatement that Navy policy continues to obey existing DoD and U.S. government policies regarding open source software (OSS), as were already formally established in 2003-2004.

Below are some supporting details to justify those statements. I hope they will help put this memo in context. As is usual in any blog, my conclusions are just my own opinion, not the official position of any organization. On the other hand, I think I have really good evidence! So let’s see…

Years ago some people had the strange idea that OSS was prohibited in the DoD or U.S. federal government, even though there was no such prohibition. This was particularly bizarre in the DoD, since a MITRE report (final publication early 2003) found that OSS use was already widespread and very helpful to the DoD. That MITRE report concluded that “Neither the survey nor the analysis supports the premise that banning or seriously restricting [Free / Open Source Software (FOSS)] would benefit DoD security or defensive capabilities. To the contrary, the combination of an ambiguous status and largely ungrounded fears that it cannot be used with other types of software are keeping FOSS from reaching optimal levels of use.”: http://www.terrybollinger.com/dodfoss/dodfoss_pdf_hyperlinked.pdf

So in May 2003 an official DoD policy memo (“OSS in DoD”) was released. It affirmed that OSS was fine as long as it met the applicable DoD requirements, just as any other kind of software had to meet the applicable DoD requirements: http://www.egovos.org/rawmedia_repository/822a91d2_fc51_4e6e_8120_1c2d4d88fa06?/document.pdf

This problem was government-wide, so in July 2004, OMB released a similar policy memo (M-04-16), which explicitly stated that U.S. federal government acquisition policy was neutral about using OSS vs. proprietary software. In particular, it said that government “policies are intentionally technology and vendor neutral, and to the maximum extent practicable, agency implementation should be similarly neutral.”: http://www.whitehouse.gov/omb/memoranda/fy04/m04-16.html

Yet there still seemed to be some strange misunderstandings, in spite of these 2003 and 2004 policy memos explicitly stating that U.S. DoD and federal acquisition policies were neutral on the question of using OSS vs. proprietary software (they merely had to obey the usual requirements). More recently, these misunderstandings seem to revolve around a failure to read and understand the term “commercial item” as defined by U.S. Code Title 41, Chapter 7, section 403, as well as its corresponding FAR text. These define a “commercial item” as an item that is “customarily used by the general public or by non-governmental entities” (i.e., one with uses not unique to a government) and has been “sold, leased, or licensed to the general public”. It would seem obvious that if OSS meets the U.S. law/FAR definition for a commercial item, it is a commercial item for government acquisition purposes. And nearly all extant OSS does indeed meet this definition; nearly all extant OSS has non-government uses and is licensed to the public. In addition, almost all already-existing OSS also meets the definition of commercial off-the-shelf (COTS), since it consists of commercial items that are ALREADY available to the public (“off the shelf”).

The problem was that some acquisition programs were redefining the term “commercial item” (and COTS) to exclude OSS competitors. These redefinitions were in contradiction to the existing DoD and federal government-wide explicit policy for neutrality regarding OSS, and in contradiction to the clear definition of “commercial item” given in U.S. law, the FAR, and by implication the DFARS. The Navy memo simply tries to correct this misunderstanding, as well as re-iterating that the existing DoD and federal government policy on OSS continues. This Navy memo was signed by Department of the Navy CIO Robert J. Carey on June 5, 2007, and titled “Department of the Navy Open Source Software Guidance”. You can find the Navy memo here: http://oss-institute.org/Navy/DONCIO_OSS_User_Guidance.pdf

The main implication of this definition of “commercial item” is that (as required by law and the FAR) contractors and their subcontractors at all tiers must do market research of the commercial market and consider ALL their commercial options… including the OSS options. This is certainly NOT a special preference for OSS, and ALL evaluation characteristics for software are still valid (e.g., functionality, total cost of ownership, quality, security, support, and flexibility). But in cases where the OSS option is the better option, by policy the U.S. government intends to take advantage of it.

This approach makes sense, given the major changes that are happening in the software industry. In many market segments OSS programs are the #1 or #2 product by market share, and OSS in aggregate now represents billions of dollars of development effort. Many companies are developing OSS and/or selling commercial support for OSS, including Red Hat, Novell, Sun Microsystems, IBM, and MySQL AB. Microsoft competes with some OSS programs, business models, and licenses, but in other areas Microsoft uses, develops, and encourages OSS (Microsoft’s Windows includes BSD-developed networking applications; Microsoft OSS projects include WiX and IronPython; Microsoft runs the “Codeplex” website to encourage OSS development on Windows). In areas where they are appropriate some major OSS programs have received the relevant Common Criteria or FIPS 140-2 IT security certificates.

OSS potentially affects how acquisition programs acquire software, but acquisition programs should expect to be affected by changes in relevant commercial industries. This was anticipated; the DoD policy memo “Commercial Acquisitions” (Jan. 5, 2001) explains that the benefits of commercial item acquisition include “increased competition; use of market and catalog prices; and access to leading edge technology and ‘non-traditional’ business segments”. In other words, DoD policy anticipates that there WILL be “non-traditional business segments” - and its policy is to embrace and exploit such changes where appropriate. (Given its growth and breadth, it’s become increasingly difficult to argue that OSS is “non-traditional” anyway.) AT&L’s “Commercial Item Handbook” (November 2001) explains that this broad definition of “commercial item” is intentional, because it “enables the Government to take greater advantage of the commercial marketplace.” See: http://www.acq.osd.mil/dpap/Docs/cihandbook.pdf

In other words, U.S. contractors must consider all their options, and then select the best one. They are not allowed to arbitrarily ignore a relevant commercial industry sector, and are specifically not allowed to ignore OSS options.

If you’re interested in this topic, you might also be interested in some related articles of mine, such as Open Source Software (OSS) in U.S. Government Acquisitions and “Commercial” is not the opposite of Free-Libre / Open Source Software (FLOSS): Nearly all FLOSS is Commercial.

path: /oss | Current Weblog | permanent link to this entry

Wed, 11 Jul 2007

FLOSS License Slide - Released!

There are a large number of Free-Libre / Open Source Software (FLOSS) licenses, but only a few are widely used. Unsurprisingly, the widely-used licenses tend to be compatible — that is, it’s possible to combine software under the different licenses to produce a larger work. But many people have trouble figuring out when they can be combined, or how.

So I’ve created a little figure which I call the “FLOSS license slide” to make it easier to see if FLOSS licenses can be combined in many common cases, and if so, what the basic ramifications are. I’ve crafted it so that the figure and explanatory text all fit in a page, which can be handy.
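As a rough illustration of what “compatible” means here, license compatibility can be modeled as a directed graph: an edge from A to B means code under license A can be incorporated into a work released under license B. The sketch below is a hypothetical, simplified graph; the license set and edges are illustrative only, not the slide’s actual content, and certainly not legal advice:

```python
# Sketch: license compatibility as a directed graph.
# An edge A -> B means "code under A may be combined into a work under B".
# The edges below are illustrative, not legal advice; consult the slide
# (and an attorney) for the real rules.
COMPAT = {
    "Public domain": {"MIT/X11", "BSD-new", "LGPLv2.1+", "GPLv2+", "GPLv3"},
    "MIT/X11": {"BSD-new", "LGPLv2.1+", "GPLv2+", "GPLv3"},
    "BSD-new": {"LGPLv2.1+", "GPLv2+", "GPLv3"},
    "LGPLv2.1+": {"GPLv2+", "GPLv3"},
    "GPLv2+": {"GPLv3"},
    "GPLv3": set(),
}

def can_flow(src, dst):
    """True if code under license `src` can end up in a work under `dst`,
    possibly through intermediate licenses (transitive closure)."""
    if src == dst:
        return True
    seen, stack = set(), [src]
    while stack:
        lic = stack.pop()
        for nxt in COMPAT.get(lic, ()):
            if nxt == dst:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

print(can_flow("MIT/X11", "GPLv3"))   # True: permissive code can join a GPL work
print(can_flow("GPLv3", "MIT/X11"))   # False: not in the other direction
```

The key intuition the slide captures is that compatibility is directional: permissive code can flow “toward” copyleft works, but not back.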

You can look at the FLOSS license slide in one of three formats:

I had released a draft of this earlier, and got some nice feedback. I added “public domain”; there seemed to be enough questions about public domain software that it made sense to add it to the figure. Yes, I know that technically “public domain” isn’t a license, but it’s much easier to understand and explain by treating it as if it were one. Some of the earlier text was not clear enough; hopefully this version is clearer. For example, the GPLv2+ and Affero GPL 3 licenses are compatible, but some people had trouble understanding why; the text now notes that they are compatible via the GPL version 3. I also added the HTML format.

I’m not a lawyer, and if you need formal legal advice you need to consult your own attorney. But for many people, this is the information you need, in a conveniently small format, so here it is. Enjoy!

path: /oss | Current Weblog | permanent link to this entry

Sun, 08 Jul 2007

Increasing Government Interest in Free-Libre / Open Source Software (FLOSS)

There’s more on the web now about governments and Free-Libre / Open Source Software (FLOSS), both on my own site and other sites. I think this suggests more and more interest by governments in FLOSS.

I’ve now posted Open Source Software (OSS) in U.S. Government Acquisitions, which is a slightly-updated version of my article which was published in the “DoD Software Tech News” issue devoted to free-libre / open source software (FLOSS). As I noted earlier, this issue has lots of good articles. I was asked to create the article (not the other way around), so I think the very existence of this issue is evidence of increasing interest, not just a personal interest of mine.

I should probably mention why I’m posting a revision. After I submitted my article, the Department of the Navy CIO Robert J. Carey signed a June 5, 2007 memorandum titled “Department of the Navy Open Source Software Guidance” which notes that FLOSS needs to be considered a commercial item when it meets the U.S. government’s standard definition of a commercial item (nearly all extant FLOSS does). That is the same basic point that I raised in my presentation and paper, and I would have gladly mentioned the Navy memo… had it been released at the time. So my modified version now points to the Navy memo, as well as making a few tweaks based on other feedback. This realization that FLOSS programs are (almost always) commercial items is an important point for the U.S. government. Why? Because as noted in Linux.com, that means that extant FLOSS software must be considered when the U.S. government acquires software the same way as other commercial software is considered (i.e., it must be considered before starting a new project to write its own). The Navy memo’s assertion of this makes it worth posting an update.

But think of this updated essay as only a sampler… for the rest of the articles, you’ll still need to go read the “DoD Software Tech News” issue. Getting the issue does require registration, but registration is free and in this case it’s worthwhile.

Anyway, it’s not just one issue about FLOSS in one magazine. Here’s more evidence it’s not just me - on July 5, 2007, the article Open Source Government: good-will needed was posted on Roberto Galoppini’s Commercial Open Source Software blog. He points in turn to various articles, including Matt Asay’s “Open source in government: Leadership needed”, which then leads you back to a very interesting research paper: “Open-Source Collaboration in the Public Sector: The Need for Leadership and Value” by Michael P. Hamel. There’s lots of interesting discussion in those articles about how governments can use FLOSS, and in particular how governments can use FLOSS components and approaches more effectively. I’ll throw in my own two cents here. I would certainly agree that leadership is important in any project, including FLOSS projects; the leadership of Linus Torvalds of the Linux kernel is well-documented, for example. But there are different kinds and levels of leadership. Roberto Galoppini says, “I think that first we need politicians with good-will, willing to put their intellectual potential to work for the overall desires of the general public…”; while that’s good, often what matters most is the ability to carry a concept into practice. It’s easy to “lead” by saying words, but I think that often what’s needed is the kind of leadership that rolls up its sleeves and makes actual projects produce useful results. Focusing on specific, measurable products can often get better results. Certainly politicians with goodwill are a good thing, but they provide value primarily by giving “cover” to those who actually do the hands-on leading of projects; it is the latter who make or break such efforts.

While I can quibble with some of the stuff Hamel says, some of his statements seem dead-on with what I see. Hamel notes that “participants in both groups also believe that the creation of value, or products that are appropriate and effective in addressing members’ wants and needs, is critical”… to which I say, Amen. Hamel concludes that “Collaborations with a strong leadership structure, and more importantly a single leader who is persistent, passionate and willing to spend a great deal of time maintaining and improving the organization are much more likely to succeed. Value is also a critical component, and requires that efforts meet the wants and needs of members and clients, whether they be in the form of software, documentation, research or even policy advocacy.” Focusing on a few most useful projects is critical: “a conscious effort to focus energy on a small number of projects in early stages may be an important component in creating value for members of collaborative efforts.” A FLOSS project requires collaboration to be successful, and collaboration requires that the project gain the trust of potential users/developers; “In this research I found that leadership, face-to-face contact, and the legal framework were the primary factors leading to trust. A willingness and ability to evolve, which may be tied to creating products of value to clients and members, might also be an important factor in developing a successful collaboration.” Those statements, at least, seem very sound. These aren’t new ideas, to be sure, but it sure is easy to lose sight of them.

I think there’s a vast opportunity here for governments to use FLOSS. Which is odd in a sense, because in fact, FLOSS is already widely used in governments. MITRE determined back in 2003 that FLOSS was already critically important to the U.S. Department of Defense (DoD). But that widespread use is small compared to its potential, and to how many commercial organizations use FLOSS, and there are a lot of reasons for that. Governments have long acquisition times, often tend to avoid doing anything “risky” (aka “different than the way we did it before”, abetted because there’s no competitor to force them to improve), and many proprietary vendors target governments to try to prevent them from using competing FLOSS products. Which means that even when it makes sense to use FLOSS, it can often take much longer for FLOSS components and approaches to enter into government use. But it has already entered into significant use, and I expect that to accelerate over the years.

There are some other goodies on my web site if you’re interested in this topic (government and FLOSS). My essay is basically a text version of my March 2007 presentation on Free-libre / Open Source Software (FLOSS). It includes snippets from much longer papers that give statistics about FLOSS, explain how to evaluate FLOSS, discuss FLOSS and security, and explain why most FLOSS is commercial software. There’s even a hint of my essay Make Your Open Source Software GPL-Compatible. Or Else, since I note the widespread use of the GPL.

Again, if you have a true PHB, I can’t help you. But many managers are trying to do the right thing, and just need reasonable information so that they can do it.

path: /oss | Current Weblog | permanent link to this entry

Sat, 23 Jun 2007

ISPs Inserting Ads Into Your Pages: Unlawful Modifications (Copyright Violation)

Here’s a nasty trick. Apparently some shady Internet Service Providers (ISPs) are inserting advertisements into other people’s web pages without their permission.

I believe these are usually illegal modifications. Whoever creates a work automatically has a copyright on that work. (If you’re doing work for someone else, or if you sell the copyright, they have the copyright.) Anyone who releases a public web page obviously grants others the right to copy that work to view it, but unless otherwise stated, you don’t have the right to modify the work. U.S. law also has “fair use” provisions that let you quote another work without being sued, and so on, but those can’t possibly be stretched into letting someone insert ads. The copyright owner can release works permitting such actions (that’s how Free-libre / open source software works), but they have to give that permission… you can’t just unilaterally modify it. What really angers me is that they might insert material I am morally opposed to, sullying my reputation in the process.

If you run a website, here’s a sample letter that you can send to inform these ISPs that what they’re doing is wrong.

Dear (name here),

I am concerned that you appear to be violating copyright laws,
by unlawfully modifying and then redistributing copyrighted works
of mine and many others.

In particular, your "terms of service" say... (quote them, if you can).

My website, (www.dwheeler.com), distributes many copyrighted works
at no charge, but I do _NOT_ give the right to modify many of
those works.  Many other websites do the same.  (Some can add:
In addition, I have ads on my site; by adding your own ads, you are
diluting the value of MY ads and in essence stealing from me.)

Please stop modifying my web pages immediately.  It is my hope that
this was just a misunderstanding, and that we can settle this amicably.

Thank you very much.

path: /misc | Current Weblog | permanent link to this entry

Tue, 19 Jun 2007

“DoD Software Tech News” posts open source software issue

The U.S. Department of Defense (DoD)’s software technology magazine “DoD Software Tech News” has posted a whole issue devoted to free-libre / open source software (FLOSS). If you’re trying to get FLOSS seriously considered by acquisition or management people, this may be what you need. This issue includes essays by David A. Wheeler (that’s me!), Terry Bollinger, John M. Weathersby, Mark Lucas (on Geospatial FLOSS), Peter Gallagher, Matt Asay (Alfresco), and Andrew Gordon. Free registration required. My essay is basically a text version of my March 2007 presentation on Free-libre / Open Source Software (FLOSS); it includes snippets from my papers that give statistics about FLOSS, explain how to evaluate FLOSS, discuss FLOSS and security, and explain why most FLOSS is commercial software. There’s even a hint of my essay Make Your Open Source Software GPL-Compatible. Or Else, since I note the widespread use of the GPL.

After I submitted my article, but before it got published, the Department of the Navy CIO Robert J. Carey signed a June 5, 2007 memorandum titled “Department of the Navy Open Source Software Guidance” which notes that FLOSS needs to be considered a commercial item when it meets the U.S. government’s standard definition of a commercial item (and nearly all extant FLOSS meets that definition). That is the same basic point that I raised in my presentation. It feels nice to have made a key statement, and then find that the U.S. Navy officially confirms it. As noted in Linux.com, that means that extant FLOSS software must be considered when the U.S. government acquires software, the same way as other commercial software is considered (i.e., it must be considered before starting a new project to write its own).

If you have a true PHB, I can’t help you. But many managers just need some honest information, and this is at least one way to get it to them.

path: /oss | Current Weblog | permanent link to this entry

Mon, 11 Jun 2007

Reviews of Books, Movies, etc.

I read voraciously, and occasionally I’ve even been known to see a movie. So I’ve started to write down Reviews of Books, Movies and Other Stuff. There you can see what I think of various books, movies, etc. It’s not huge yet, but I intend to make it grow.

In many cases you can click on an item to get your own copy via Amazon.com. If you buy something that way, I even get a small cut. I won’t get rich that way, but if you choose to buy something by doing that, thanks very much!

path: /misc | Current Weblog | permanent link to this entry

FLOSS License Slide

There are a large number of Free-Libre / Open Source Software (FLOSS) licenses, but only a few are widely used. Unsurprisingly, the widely-used licenses tend to be compatible — that is, it’s possible to combine software under the different licenses to produce a larger work. But many people have trouble figuring out when they can be combined, or how.

So I’ve created a little figure which I call the “FLOSS license slide” to make it easier to see if FLOSS licenses can be combined in many common cases, and if so, what the basic ramifications are. I’ve crafted it so that the figure and explanatory text all fit in a page, which can be handy.

You can look at the FLOSS license slide in one of two formats:

This is currently a draft, because I’d like to hear comments before “finishing” it. Also, it’s based on the “final draft” of GPLv3; things could change before that revision is complete (though I doubt it). There’s no end to the other licenses that could be added, but if there’s a big mistake in this document, I’d like to know. I’m not a lawyer, and if you need formal legal advice you need to consult your own attorney. But for many people, this is the information you need, so here it is.

path: /oss | Current Weblog | permanent link to this entry

Tue, 05 Jun 2007

Comparing OpenDocument (ODF) with MS-XML (OOXML/EOOXML) - and why Multiple Competing Standards are a Bad Idea

Microsoft continues to give its bizarre argument that multiple competing (conflicting) standards for the same purpose are a good thing. They are dreadfully confused. Having multiple competing products is a good thing, but having multiple competing standards is terrible.

History shows that multiple competing standards risk massive property loss, many lives, and even the loss of your country. My presentation Open Standards and Security notes two of the many historical examples, the 1904 Baltimore fire and the U.S. Civil War:

  1. In 1904, a huge (80-block) area of Baltimore burned to the ground. Fire fighters from other cities came but couldn’t effectively connect their fire hoses to the fire hydrants, because every city had its own incompatible standard. That resulted in 2,500 burned buildings over a 30-hour period.
  2. Perhaps even more strikingly, one of the important reasons the U.S. South (the Confederacy) lost the U.S. Civil War was that the southern states had incompatible rail gauges. The U.S. North had a single rail gauge standard (for the most part), and could move troops and materiel to battles far more quickly than the U.S. South, even though the U.S. North had far greater distances to travel. Ned Harrison’s paper “’States Rights’ doomed Confederate nation” (Nov. 12, 2005) cites the problems with railroads, including the rail gauge incompatibility, as an important factor in the U.S. South’s loss (it certainly was not the only reason, but it was an important factor). Indeed, when the North conquered an area, one of the first things it did was move the rails to the Northern standard… so it could continue to exploit that advantage as it advanced.

But what about document converters, don’t they make it easy to have multiple competing standards? Well, it’s true that there are converters like ODF Converter. But while converters are very useful for one-time transitions, they are lousy long-term solutions… they make it clear that there has been a failure, not a success, in standards-making. Look at the problems multiple standards cause in other areas. We can easily convert between metric and English units, but NASA lost a $125 million Mars orbiter because a Lockheed Martin engineering team used English units of measurement while the agency’s team used the more conventional metric system. Another example is the Gimli Glider, a Boeing 767 that ran out of fuel in Canada in 1983 due, in large part, to confusion about English/metric conversion. Why do we want this problem in office documents too? Also, as of June 5, 2007, there are 6 pages of problems converting MS XML to ODF, but only 2 pages of problems converting ODF to MS XML. In general, ODF appears to be a much more capable format; just from that list it appears that it’ll be easier to improve ODF than to try to use MS XML for all documents. Which is unsurprising; the ODF work included membership from a variety of office suite implementors, while MS XML is the work of just one. That’s even more true when you consider that OpenDocument uses existing standards, instead of creating a trade barrier through nonsense one-off specifications. Microsoft claims that OOXML is necessary to “fully” capture old binary documents, but I have not seen much evidence this is actually true. A quick look at the CONVERT function in OOXML revealed many absolutely incorrect unit measures, for example. More generally, I have no reason to believe that the OOXML spec actually includes what is needed to capture the older binary formats like .doc - no mapping has been presented to prove this claim, and it’d be easy for the OOXML spec to omit lots of important information.
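The Mars orbiter failure is easy to model: one team’s software emitted thruster impulse in pound-force seconds, while the receiving software interpreted the same numbers as newton-seconds. A toy sketch (the function names and values are hypothetical, purely for illustration):

```python
# Sketch of the Mars Climate Orbiter class of failure: one side emits
# pound-force seconds (lbf*s), the other reads the raw number as
# newton-seconds (N*s). Function names and values are hypothetical.
LBF_S_TO_N_S = 4.448222  # 1 pound-force second in newton-seconds

def ground_software_impulse(firing_impulse_lbf_s):
    # (hypothetical) ground software reports impulse in lbf*s ...
    return firing_impulse_lbf_s

def navigation_model_reads(value):
    # ... while the navigation model treats the bare number as N*s.
    return value

reported   = ground_software_impulse(100.0)   # really 100 lbf*s
assumed    = navigation_model_reads(reported) # misread as 100 N*s
actual_n_s = reported * LBF_S_TO_N_S          # what it really was
error_factor = actual_n_s / assumed
print(round(error_factor, 3))  # 4.448: every maneuver off by more than 4x
```

No individual line is “wrong” here; the failure lives entirely in the unstated unit convention between the two sides, which is exactly the hazard of two standards for one job.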

Regardless of the facts, Microsoft continues to press for ISO standardization of its format, aka “Microsoft XML” (MS-XML), OOXML, or EOOXML. There are now several papers about problems with OOXML. Groklaw’s EOOXML objections has lots of good information. Edward Macnaghten’s Technical Distinctions of ODF and OOXML: A Consultation Document (ODFA UKAG) is interesting because it shows what actual documents look like using the different specs - and it exposes a lot of problems in MS XML that have not been widely discussed before. Sam Hiser has an interesting ODF / OOXML comparison as well. Rob Weir also has interesting comments.

Perhaps a more elegant demonstration that OOXML is absurd is this picture of their 6000-page spec, printed. This is a single spec; no wonder there’s been so little review compared to OpenDocument. Yet even at this hideous size, MS XML (OOXML) still fails to give the important details that a spec really needs. It is so long, yet still incomplete, because it ignores a vast number of already-existing standards; by re-inventing them, it ends up conflicting with many of them. They even ignored MathML, a widely-used standard that even they supported, and redid things from scratch.

Even on practical grounds this pressure for OOXML makes little sense. Magazines like Science and Nature reject Office 2007 documents. Macintoshes still can’t read the .docx (etc.) format, nor can Pocket PCs.

I guess one good result is that Microsoft has encouraged voting for OpenDocument, because that’s the only logical thing it can do if it really believes that “many conflicting formats are a good thing”. In contrast, there’s no reason that someone who wants a truly open single format needs to vote for OOXML. It’s perfectly reasonable to reject OOXML on the grounds that it conflicts with an already-existing ISO standard (OpenDocument). If there’s something that OOXML does that OpenDocument doesn’t, it would be much easier to add that tweak to OpenDocument, because OpenDocument builds on existing standards while OOXML fails to do so.

Microsoft is not a “universal evil”, and I praise them when they do good things. But encouraging multiple conflicting standards for the same area is not a good thing. In some sense, I don’t care whether MS XML or ODF becomes “the” format for office documents, as long as the final specification is truly open. But the materials noted above lead me to believe that MS XML is not really open; it appears to be effectively controlled by one vendor, both in its current and future forms, as one obvious example. So MS XML isn’t really an option, and we already have a nice working solution.

What I want is a single document format that is fully open. What’s that mean? See Is OpenDocument an Open Standard? Yes! to see what the phrase “open standard” really means. And let’s look at it in practice. Currently I can edit text documents using the program “vim”, and I don’t even bother to ask if the other person uses emacs, or Notepad… just by saying “simple text format” we can exchange our files. Similarly, I can edit a GIF or PNG file without wondering what originally created the file - or who will edit it next. That’s generally true with other standards like HTML, HTTP, and TCP/IP. That’s the beauty of open standards - real open standards enable a thriving industry of competing products, allowing users to choose and re-choose between them. I want to see that beautiful sunlight in office suites as well.

path: /misc | Current Weblog | permanent link to this entry

Sat, 26 May 2007

Funny “I’m Linux” ads from Novell

This came out a little while ago, but I just saw them and they’re hilarious. Here are three ads from Novell that spoof Apple’s “I’m a Mac” ads, and I thought they were really funny and well-done.

Here are links to the Hi-Res Novell Videos:

  1. Meet Linux mpg ogg Flash on YouTube
  2. New Duds mpg ogg Flash on YouTube
  3. Running Linux mpg ogg Flash on YouTube

Here’s some background about how these ads were created.

path: /oss | Current Weblog | permanent link to this entry

Wed, 18 Apr 2007

George Mason University (GMU) Thesis/Dissertation Sample Document in OpenDocument Format

I’ve released a “sample document” that implements all the requirements of George Mason University (GMU)’s Thesis/Dissertation format in the OpenDocument format. You can get it in OpenDocument or PDF formats.

If you’re a student at GMU who needs it, you really need this template… but even if you’re not, there’s lots that can be shown from it.

First, if you’re a student at GMU who needs it, let me explain why you really need it. Most universities have their own formats with many detailed requirements, so by using a pre-created format, you immediately comply with lots of details that are meaningless, yet you can’t graduate without meeting them. GMU requires page numbers to be on the top right, except at chapter headings (where they are centered on the bottom)… except that appendix chapter headings aren’t considered chapter headings. Got all that? GMU requires that there be a horizontal line between any footnotes and the main text, and it must be exactly 2” long. Oh, and there are lots of margin requirements, which you must get exactly right. Every university has its own oddnesses; this format, for example, is explicitly single-sided, uses U.S. customary measurement units everywhere, and has its own odd placement rules (e.g., appendixes must be placed after, not before, the bibliography). Headings are 2” from the top… except that level 1 appendix headings aren’t considered headings. It took me a long time of back-and-forth discussions with the GMU dissertation and thesis coordinator to get all the details right. (The problem wasn’t that OpenDocument couldn’t do it; the problem was understanding what the GMU requirements actually were.) You can spend many, many hours to redo these details… or just grab this sample document and have the problems solved for you.

But whether you’re a GMU student or not, there’s lots that can be shown from this template. It certainly shows that OpenDocument is fully capable of representing fairly complicated (and odd!) formats, for large documents, completely automatically. That shouldn’t be surprising; one of the OpenDocument developers was Boeing, which develops so many large documents to build an airplane that the documents (when printed) outweigh the plane.

In particular, this document shows that an OpenDocument document can automate all sorts of things, easing development:

  1. Just create a “Heading 1” (control-1 in OpenOffice.org) and the page format automatically switches to first chapter format (with page numbering in a different place).
  2. The spacing and text flow all happen automatically, without weird artifacts like undesired “extra” vertical space on the top of a page. In fact, nearly everything is very automatic - you can concentrate on writing a paper, instead of fixing formats. The entire document is based on “paragraph styles” - just make sure each paragraph has the right style (which is nearly always correct), and the document will look right.
  3. Tables of Contents, of tables, and figures can all be automatically regenerated.
  4. Even the bibliography can be regenerated automatically, so that only documents actually cited in the paper are listed. You can even determine what order is appropriate (e.g., alphabetically or in citation order).
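Part of why this automation works is that an ODF file is just zipped XML: in its content.xml, every heading carries an outline level and every paragraph a named style, so tools can find, restyle, and collect them mechanically. Here is a simplified, illustrative fragment (not a complete, valid ODF document) built with Python’s standard library:

```python
# Minimal illustration of why ODF styling can be automated: inside an ODF
# package, content.xml marks each heading with an outline level and each
# paragraph with a named style. This builds a simplified fragment only,
# not a complete, valid ODF document.
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
ET.register_namespace("text", TEXT_NS)

body = ET.Element(f"{{{TEXT_NS}}}body")
h = ET.SubElement(body, f"{{{TEXT_NS}}}h",
                  {f"{{{TEXT_NS}}}outline-level": "1",
                   f"{{{TEXT_NS}}}style-name": "Heading_20_1"})
h.text = "Chapter 1: Introduction"
p = ET.SubElement(body, f"{{{TEXT_NS}}}p",
                  {f"{{{TEXT_NS}}}style-name": "Text_20_body"})
p.text = "Body paragraphs carry a style name, not hand-applied formatting."

print(ET.tostring(body, encoding="unicode"))
```

Because formatting lives in named styles rather than in each paragraph, changing a style in one place restyles the whole document, and a table of contents is just a query over the outline levels.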

What’s particularly amusing is to compare the OpenDocument template to the GMU Word templates, because their Word templates are horrible to use. GMU’s Word templates are a bunch of individual files, completely inappropriate for actual use by a writer. Even the first page of a chapter and the rest of a chapter are in separate documents, and the table of contents has to be regenerated by hand. But even merging these files together won’t completely solve the problem; Word sometimes fails to correctly generate tables of contents (it’s happened to me!), which is one reason why so many people hand-create tables of contents. And Word certainly doesn’t match other OpenDocument capabilities, such as automated bibliography management. What’s worse, even though Word does have paragraph styles, Word seems to work especially hard to subvert them and make their use difficult. Word seems to love redefining all your carefully-crafted styles, making it painful to use as the document gets long. In contrast, OpenDocument is a breeze to use (at least in OpenOffice.org, and probably other OpenDocument-based systems as well) - setting paragraph styles is trivial (and for the most part completely automatic), once set they stay set, and they can make all the other formatting decisions automatic. Creating useful sample documents in Word is also painful; I started and completed one in OpenDocument quite easily, while GMU is still struggling to create a Word template.

You don’t have to use OpenOffice.org to use OpenDocument, which is great - choice and competition are good things. But OpenOffice.org is a reasonable choice. Bruce Byfield’s articles Replacing FrameMaker with OOo Writer and OpenOffice.org Writer vs. Microsoft Word showed that OpenOffice.org is remarkably capable, especially in its word processor. Byfield’s comparison of OpenOffice.org with the widely-lionized FrameMaker is especially enlightening: “I began comparing FrameMaker and Writer when a regular on the OpenOffice.org User’s list asked what it would take to give Writer the power of FrameMaker. When I started, I mentally pictured a scale with Microsoft Word on one end and FrameMaker on another, with Writer in the middle, but closer to Microsoft Word. As I proceeded, I found Writer was a much stronger contender than I had expected. At the end of the comparison, I had to conclude that the two products compare quite closely, depending on what features are more important to a given user… [OpenOffice.org users] can be in little doubt that they are using software that competes with FrameMaker on its own terms, and wins as often as it loses. Even ignoring the cost and philosophical differences, OpenOffice.org is clearly an acceptable alternative to FrameMaker.”

In short, if you’re creating a thesis at GMU, use OpenDocument. I’ve used Word, Word Perfect, OpenOffice.org, and FrameMaker to write large documents. FrameMaker is nice but hideously overpriced, and because it’s overpriced, non-standard, and poorly supported, it’s mostly disappearing from the marketplace. Word works well for 1-2 page documents, but its weaknesses become apparent as your documents get larger, and it’s based on proprietary formats that lock you into a single product. It’s painful to use for larger documents; GMU has yet to create a Word template that is even slightly non-painful. In contrast, in short order I created an OpenDocument format that did everything they wanted, with lots of automation. OpenDocument is an ISO standard, with nice products to support it, and specifically designed for large documents. If you need to make large documents, use the right tool for the job.

path: /misc | Current Weblog | permanent link to this entry

Thu, 12 Apr 2007

April 2007 release of “Why OSS/FS? Look at the Numbers!”

Finally, I’ve released a new version of “Why Open Source Software / Free Software (OSS/FS, FLOSS, FOSS)? Look at the Numbers!” This paper continues to provide “quantitative data that, in many cases, using open source software / free software (abbreviated as OSS/FS, FLOSS, or FOSS) is a reasonable or even superior approach to using their proprietary competition according to various measures. This paper’s goal is to show that you should consider using OSS/FS when acquiring software.”

It’s been a while; my last release was November 14, 2005. The ChangeLog has all the details, but here are some of the highlights:

  1. Updated webserver stats, and noted issues with the Go Daddy change and lighttpd.
  2. Noted Kenneth van Wyk’s article about Linux security
  3. Added quotes from Microsoft’s Bill Hilf, from “Cracking Open the Door to Open Source” by Carolyn A. April, “Redmond” magazine, March 2007, pp. 26-36.
  4. Added link to Andy Tanenbaum’s article about Ken Brown and ADTI.
  5. Added a link to an approved European Parliament resolution, A5-0264/2001, which calls “on the Commission and Member States to promote software projects whose source text is made public (open-source software), as this is the only way of guaranteeing that no backdoors are built into programmes [and calls] on the Commission to lay down a standard for the level of security of e-mail software packages, placing those packages whose source code has not been made public in the ‘least reliable’ category” (5 September, 2001; 367 votes for, 159 against and 39 abstentions).
  6. Added a reference to the Forrester report “Open Source Becoming Mission-Critical In North America And Europe” by Michael Goulde that says “Firms Should Consider Open Source Options For Mission-Critical Applications”.
  7. Added references to a major new European Commission-sponsored study, “Study on the Economic impact of open source software on innovation and the competitiveness of the Information and Communication Technologies (ICT) sector in the EU”, November 20, 2006. This is a major new study; “Our findings show that, in almost all the cases, a transition toward open source reports of savings on the long term”. There is LOTS of quantitative information here.
  8. Added reference to Communications of the ACM (CACM), Jan. 2007, “Increased Security through Open Source”. It doesn’t say anything new, and it omits the many quantitative studies cited here, but it’s a prestigious journal that says it.
  9. Added reference to a mail server market survey: Sendmail and Postfix are #1 and #2 in the market.
  10. Added references to defectivebydesign.org and to Raymond/Landley’s “World Domination 201” into desktop section.
  11. IE vs. Firefox unsafe days in 2006. Eek… it’s scary.
  12. Added Survey - Linux use on mission-critical systems
  13. Added Danish cities demand more openness
  14. Added “The war is over and Linux won” (Server war)
  15. Added Evergreen, an open source, enterprise-class library management system developed by the Georgia Public Library Service.
  16. Added reference to TCO savings on OSS/FS databases, from “Open source databases ‘60 percent cheaper’” article
  17. Added info on Firefox use, which keeps growing. See http://marketshare.hitslink.com/report.aspx?qprid=3 and http://www.techweb.com/wire/security/193104314
  18. Added reference to IDC survey
  19. Referenced “Trusting Trust” attack. Here’s the text: “An Air Force evaluation by Karger and Schell first publicly described this very nasty computer attack, which Ken Thompson ably demonstrated and described in his classic 1984 paper “Reflections on Trusting Trust”. Thompson showed that because we use software to create other software, if an attacker subverts the software-creating programs, no amount of auditing any program can help you - the subverted programs can hide whatever they want to! This has been called the “uncounterable attack”, and some have said that it’s impossible to secure computers simply because this attack is possible. Some have even said that all those security audits of OSS/FS are worthless, because subverted tools could insert attacks the auditors couldn’t see. But it turns out that the trusting trust attack can be countered. My 2005 paper Countering Trusting Trust through Diverse Double-Compiling (DDC), published by ACSAC, shows how the “uncounterable” trusting trust attack can be countered. But there’s a catch: the DDC defense only works if you can get the source code for your software creation tools, including the operating system, compiler, and so on. That kind of information is typically only available for OSS/FS programs! Thus, even in the case of the dangerous “trusting trust” attack, OSS/FS has a security advantage.”
  20. Added a note about Symphony OS (innovative user interface).
  21. Added quote from Bellovin to history section. OSS was the norm in many communities before the mid-1970s.
  22. Added stats from onestat.com re: Firefox usage
  23. Added EMA study
  24. Added Spyware stats, IE vs. Firefox, from University of Washington.
  25. Added new reports on security flaw fixing time: http://blogs.washingtonpost.com/securityfix/2006/02/a_time_to_patch.html and http://www.heinz.cmu.edu/%7Ertelang/disclosure_jan_06.pdf.
  26. Added “Deliverable D3: Results and policy paper from survey of government authorities”. There’s lots of other good stuff there.
  27. Added reference to another paper on innovation.
  28. Added reference to “Why open source projects are not publicised” by Ingrid Marson, ZDNet UK, November 25, 2005.
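
The diverse double-compiling (DDC) defense quoted in item 19 can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the paper’s actual procedure: compilers are modeled as Python functions, “binaries” as strings, and the names `honest`, `suspect`, `as_compiler`, and `ddc_check` are all invented for illustration.

```python
# Toy model of Diverse Double-Compiling (DDC).  A "compiler" is a
# Python function mapping source text to "binary" text; these names
# and encodings are illustrative only.

COMPILER_SRC = "compiler-source"

def honest(src):
    """What COMPILER_SRC actually specifies: a clean compiler."""
    return "bin:" + src

def suspect(src):
    """A trusting-trust-subverted binary: it compiles everything
    honestly EXCEPT the compiler's own source, where it quietly
    re-inserts a backdoor."""
    out = "bin:" + src
    if src == COMPILER_SRC:
        out += "+backdoor"
    return out

def as_compiler(binary):
    """'Run' a binary as a compiler: a backdoored binary behaves
    like `suspect`, a clean one like `honest`."""
    return suspect if binary.endswith("+backdoor") else honest

def ddc_check(suspect_binary, trusted_compiler):
    # Stage 1: compile the compiler's source with a trusted,
    # *diverse* compiler.
    stage1 = trusted_compiler(COMPILER_SRC)
    # Stage 2: use that result to compile the source again.
    stage2 = as_compiler(stage1)(COMPILER_SRC)
    # Compare bit-for-bit against the suspect binary's own
    # self-regeneration; a mismatch reveals the subversion.
    return stage2 == as_compiler(suspect_binary)(COMPILER_SRC)

clean_binary = honest(COMPILER_SRC)
bad_binary = suspect(COMPILER_SRC)
print(ddc_check(clean_binary, honest))  # True  - no subversion found
print(ddc_check(bad_binary, honest))    # False - backdoor detected
```

The key point survives the simplification: stage 2 comes from a trusted, diverse toolchain, so comparing it bit-for-bit against the suspect compiler’s self-regeneration exposes a trusting-trust backdoor - which is why access to the toolchain’s source code matters.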

As I mentioned earlier, I wish I’d used the term “FLOSS” (Free-Libre / Open Source Software) as my all-encompassing term in this paper. FLOSS is much easier to say than some of the alternatives, and the term “Free Software” is widely misunderstood as being “no cost”. However, I’ve used the term OSS/FS all over in the paper, and it’s awkward to change now (and people might not find the document they were looking for), so I haven’t changed it here.


path: /oss | Current Weblog | permanent link to this entry

Tue, 20 Mar 2007

Audio for “Open Standards and Security” online

Last year in Boston I gave a presentation titled “Open Standards and Security”, explaining why open standards are needed for security; here is “Open Standards and Security” as a PDF. You can also get it in OpenDocument format (for the OpenDocument version, make sure you have the fonts you need). I had earlier posted a blog entry about it, and Newsforge had some very nice things to say about my talk. I used several stories in my talk, which the reporter called “parables”. I didn’t use that word, but I wish I had, because that’s exactly what those stories were.

Many people never got to hear it, so I’ve finally made an audio version of it and posted it here in several formats: [OGG (Vorbis)], [MP3], and [FLAC]. Download and enjoy! You should be able to understand the talk just from listening to the audio, but if you listen to the audio while reading the slides, all the better!

Of course, having to post multiple audio formats shows how immature the audio standards area is. While ISO has a standard (MP3), MP3 is not an open standard because it’s patent-encumbered. I recommend using the Ogg Vorbis format instead - it’s the smallest file, and it has very good quality. Ogg Vorbis produces smaller files with better sound than MP3, so the only real reason to use MP3s is because your equipment can’t handle anything else. The FLAC format is lossless, and is useful for recoding later (it’s much smaller than a WAV or AIFF while still being lossless).

The solution to this nonsense is not to have no standards. The solution is to either (1) get countries to stop permitting software patents (the best solution), or at least (2) get standards organizations to stop publishing closed standards like MP3 for software. I think the tide has already started turning for option 2. After all, when MP3 was created, many still thought that patents in IT standards were okay, and relatively few understood the problems that patents could cause. Fundamentally, of course, this made no sense; the whole point of a patent is to create temporary monopolies, while the whole point of an open standard is to enable competition (the opposite of monopolies). People have tried to make compromises that don’t really work, such as having so-called RAND policies. But I think these are clear failures; all royalty-bearing patents discriminate (for example, they prevent open source and no-cost implementations). The point of patents is to prevent competition, and thus they have no place in software standards. Now that software patents have been shown to be a “Wild West” where anyone can be sued for billions, the need for unencumbered standards should be quite clear. The W3C has already changed its policies to make it very hard to publish patent-encumbered standards, and the IETF has already thrown out several proposals specifically because they were encumbered by patents.

One of the people at my talk made the claim that, “today, every successful open standard is implemented by FLOSS.” That should be easy to disprove — all I need is a counter-example. Except that counter-examples seem to be hard to find; I can’t find even one, and even if I eventually find one, this difficulty suggests that there’s something deeper going on. So as a result of thinking about this mystery, I wrote a new essay, titled Open Standards, Open Source. It discusses how open standards aid free-libre / open source software (FLOSS) projects, how FLOSS aids open standards, and then examines this mystery. It appears that it is true — today, essentially every successful open standard really is implemented by FLOSS. I consider why that is, and what it means if this is predictive. In particular, this observation suggests that an open standard without a FLOSS implementation is probably too risky for users to require, and that developers of open standards should encourage the development of at least one FLOSS implementation. The point of the “Open Standards and Security” talk was on open standards, not on FLOSS, but there’s much to be learned from their inter-relationships.

path: /security | Current Weblog | permanent link to this entry

Presentation and audio of “Open Source Software” online

Earlier this month I gave a presentation about open source software (aka OSS, Free Software, or FLOSS) at a conference near Washington, DC. You can now download the March 2007 presentation “Open Source Software” in PDF format; you can also get it in OpenDocument format. For the OpenDocument version, make sure you have the fonts you need. Those are just the slides; I’ve separately made the audio available in several formats: [OGG (Vorbis)], [MP3], and [FLAC]. You should be able to understand the presentation just from the audio, but looking at the slides while listening to the audio is even better. For the audio, I recommend using the Ogg Vorbis format - it’s the smallest file, and it has very good quality. The FLAC format is lossless, and is useful for recoding later (it’s much smaller than WAV or AIFF while still not losing anything). The MP3 format is useful if your player can’t handle Ogg Vorbis yet (complain to your manufacturer!) - while MP3 is an ISO standard, MP3 isn’t an open standard because it’s patent-encumbered.

The conference was titled “Open Source - Open Standards - Open Architecture”, and was put on by the non-profit Association for Enterprise Integration (AFEI) (a member of the NDIA family of associations). A lot of people were particularly surprised to learn that essentially all open source software (FLOSS) is commercial off-the-shelf (COTS) software, a point I make in more detail in my essay ‘Commercial’ is not the opposite of Free-Libre / Open Source Software (FLOSS). Basically, the U.S. government’s own laws (particularly Title 10 section 101) and regulations (particularly the Federal Acquisition Regulation) make it clear that nearly all open source software is commercial off-the-shelf (COTS). There are two kinds of COTS software products: proprietary software and open source software.

path: /oss | Current Weblog | permanent link to this entry

Mon, 05 Feb 2007

Internet Explorer 7: Still a security problem, keep using Firefox

Microsoft’s Internet Explorer (IE) is a major security problem. The Washington Post found some horrific statistics that justify this claim pretty well: “For a total 284 days in 2006 (or more than nine months out of the year), exploit code for known, unpatched critical flaws in pre-IE7 versions of the browser was publicly available on the Internet. Likewise, there were at least 98 days last year in which no software fixes from Microsoft were available to fix IE flaws that criminals were actively using to steal personal and financial data from users… In contrast, Internet Explorer’s closest competitor in terms of market share — Mozilla’s Firefox browser — experienced a single period lasting just nine days last year in which exploit code for a serious security hole was posted online before Mozilla shipped a patch to remedy the problem.”

Let’s sum that up: in 2006, IE was unsafe 78% (284/365) of the time - 27% (98/365) had known criminal use - compared to Firefox’s 2% (9/365). This is an improvement for IE; in 2004, it was unsafe 98% of the time, and 54% of the time there was known active exploitation. But Firefox is improving too; in 2004 it was unsafe 15% of the time (with 0% known exploitation), and half of that time only affected Macintosh users. (I blogged on these Internet Explorer / Firefox security statistics in 2005.) You really want to be using the safer product, and now we have two different years with the same result. But none of these studies considered IE version 7… so has it all changed?
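
The percentages above are easy to recompute from the raw day counts in the Washington Post data (284, 98, and 9 days, with a year counted as 365 days):

```python
# Recompute the unsafe-day percentages quoted above (2006 figures;
# a year counted as 365 days).
days_in_year = 365

def pct(days):
    return round(100 * days / days_in_year)

print(pct(284))  # 78 - IE unsafe 78% of 2006
print(pct(98))   # 27 - known criminal exploitation 27% of the year
print(pct(9))    # 2  - Firefox unsafe 2% of 2006
```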

IE version 7 is finally out, and I’d like to think it’s better than IE 6. Indeed, I suspect IE 7 is better than its predecessor; Microsoft did try to improve IE security, and IE 6’s security was so bad that it was hard to get worse. But IE is not the only browser available, and early signs suggest that IE is still far behind Firefox.

In particular, there are already signs that Microsoft still isn’t addressing vulnerabilities aggressively the way that the Mozilla/Firefox team have been doing for years. Why? Because recent “Full disclosure” and Bugtraq postings give room for worry. Michal Zalewski’s “Web 2.0 backdoors made easy with MSIE & XMLHttpRequest” (3 Feb 2007) noted that the XMLHttpRequest object (used by many so-called “web 2.0” applications) allows “client-side web scripts to send nearly arbitrary HTTP requests, and then freely analyze and manipulate the returned response, including HTTP headers. This gives an unprecedented level of control over your browser to the author of a visited site. For this reason, to prevent various types of abuse, XMLHttpRequest is restricted to interacting only with the site from where the script originated, based on protocol, port, and host name observed. Unfortunately, due to a programming error, Microsoft’s Msxml2.XMLHTTP ActiveX object that MSIE relies on allows you to bypass this restriction with the use of - BEHOLD - a highly sophisticated newline-and-tab technology.” (This last bit about being “highly sophisticated” is quite sarcastic; security problems with control characters like newline and tab are as old as computer security problems.)
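
The class of bug Zalewski describes is easy to sketch in miniature. To be clear, this is not MSIE’s actual code nor his proof of concept - just a generic, hypothetical illustration of how an unfiltered newline lets a request that passed a host-name check carry a different effective Host header:

```python
def build_request(host, path):
    # a lower layer that builds the HTTP request by naive string
    # concatenation, without filtering control characters
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"

def xml_http_request(page_host, host, path):
    # the same-origin check: only talk to the page's own host
    if host != page_host:
        raise ValueError("cross-site request blocked")
    return build_request(host, path)

# The check passes... but a newline smuggled through an unchecked
# field rewrites the request that actually goes over the wire:
req = xml_http_request("example.com", "example.com",
                       "/x HTTP/1.1\r\nHost: attacker.net\r\n\r\nGET /y")
print("Host: attacker.net" in req)  # True - origin check bypassed
```

The fix is equally old: reject or escape control characters in every field before any layer below the security check can see them.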

One poster found a previous May 2006 article about this problem: “IE + some popular forward proxy servers = XSS, defacement (browser cache poisoning)”. Indeed, the basic information goes back to September 2005. (There are hints in January 2003, but to be fair few noticed its implications at the time.)

Now it turns out that this kind of error is easy to make; even the Mozilla/Firefox people made this kind of error. In particular, this basic problem (differing in some details) was identified and fixed in Mozilla in 2005 as bug 297078.

The problem in this case isn’t that the Microsoft people made an error, and the Mozilla/Firefox people didn’t. Certainly, there’s evidence that Mozilla’s policy of releasing the source code for people to review, combined with worldwide development/review and a “bug bounty” to encourage additional review, really does produce good results. But in this case, both Microsoft and Mozilla made the error; what’s different is what happened next. Mozilla fixed it in 2005, the same year the issues had become clear, yet Microsoft still hasn’t fixed it in 2007. (And no, this particular defect isn’t included in the Washington Post study above; it sure wouldn’t improve IE’s statistics if it had been.)

If a supplier won’t quickly fix known security problems, that’s a really big warning sign. The Washington Post earlier found that Microsoft took far longer to fix a vulnerability than Mozilla, and this latest report is consistent with that sad news. I do not understand why Microsoft hasn’t addressed this; hopefully this will turn out to be a false alarm (that seems unlikely) or they will fix it soon.

The only way to really see which browser is more secure is to examine its vulnerability pattern over time into the future - for example, does it have more vulnerabilities over time (of a certain criticality), and how fast are reported vulnerabilities repaired? But note a key issue: unless you throw away the entire program and start over from scratch, it’s difficult to turn an insecure program into a secure one. Thus, while past performance is no guarantee of future results, it’s a good way to bet. It appears that Microsoft still hasn’t fixed all the problems in IE 7 that were publicly known at least two years earlier (in some of the most widely publicized vulnerability discussion groups!). If that’s true, it’s a really bad sign… how can they have removed most vulnerabilities not publicly known, if they haven’t even addressed the ones already publicly known?

I continue recommending that users switch to Firefox and not use IE for security reasons. And I highly recommend that web developers ensure that their systems conform to web standards so that users can choose and switch their browsers. These are only my personal opinions, but I think you can see why I think it makes sense. Even ignoring this particular issue, IE has a terrible track record. I’m glad that Microsoft is starting to take security seriously (they are at least saying the right things), and I’d delight in a race between suppliers to see who can produce the most secure software. But these recent reports reinforce the supposition that IE is still too dangerous to use safely. There’s nothing “user friendly” about a program that is easily subverted.

path: /security | Current Weblog | permanent link to this entry

Tue, 16 Jan 2007

Flawfinder version 1.27 released!

I’ve released yet another new version of flawfinder - now it’s version 1.27. Flawfinder is a simple program that examines C/C++ source code and reports on likely security flaws in the program, ranked by risk level.

The big functional addition is that flawfinder can now examine just the changes in a program. If you’re modifying a big program, it can be overwhelming to view all of the warnings flawfinder can produce… but if you can look at only the ones relevant to the change you are making, it can be easier to handle. My thanks to Sebastien Tandel - he suggested the feature, I replied in a brief email describing how I thought it could be done, and in the same day he replied with code to implement it. Wow, that’s truly amazing. His original patch only worked with Subversion; I modified it so that it also works with GNU diff. For this to work, you use the new “--patch” option and give flawfinder a patch file (in unified diff format) that describes the changes… and flawfinder will only report on the potential vulnerabilities on the changed lines (or the lines immediately above and below them).
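
Mechanically, limiting warnings to a change is just a matter of reading the hunk headers of the unified diff. Flawfinder’s real implementation may differ in its details; this is only a sketch of the idea, where the `context=1` parameter models the “lines immediately above and below” behavior:

```python
import re

def changed_lines(patch_text, context=1):
    """Map each file named in a unified diff to the set of new-file
    line numbers that were added, widened by `context` lines above
    and below (roughly what a --patch-style filter needs)."""
    changed = {}
    current_file = None
    lineno = 0
    for line in patch_text.splitlines():
        if line.startswith("+++ "):
            current_file = line[4:].split("\t")[0]  # name as recorded
            changed.setdefault(current_file, set())
        elif line.startswith("@@"):
            m = re.match(r"@@ -\d+(?:,\d+)? \+(\d+)", line)
            lineno = int(m.group(1))  # first line of the new hunk
        elif current_file is not None:
            if line.startswith("+"):
                for n in range(lineno - context, lineno + context + 1):
                    changed[current_file].add(n)
                lineno += 1
            elif not line.startswith("-"):
                lineno += 1  # context line; deletions don't advance

    return changed

patch = """--- a/demo.c
+++ b/demo.c
@@ -10,3 +10,4 @@
 int main(void) {
+    gets(buf);          /* new, risky line */
     return 0;
 }
"""
print(changed_lines(patch))  # {'b/demo.c': {10, 11, 12}}
```

A warning filter then just drops any warning whose (file, line) pair isn’t in that map - which is why only the changed lines and their immediate neighbors get reported.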

An administrative change is that flawfinder is now hosted on SourceForge.net, with a mailing list and a Subversion repository for code changes. This should make it easier for people to discuss the program, submit changes, and generally keep track of things. And it also deals nicely with the “what happens if he’s hit by a bus” problem.

You can view the Flawfinder ChangeLog for the details on the other changes. It deals more gracefully with unreadable files and when there are zero lines of code. Also, it now skips by default any directories beginning with “.”; this makes it work nicely with many SCM systems (use “--followdotdir” if you WANT it to enter such directories). My thanks to Steve Kemp, cmorgan, and others.

For more info, or a copy, just go to my original flawfinder home page or the new flawfinder page on SourceForge.net. Enjoy!

path: /security | Current Weblog | permanent link to this entry

Sun, 07 Jan 2007

DRM Nonsense, HD DVD, AACS, and BackupHDDVD - why “content protection” doesn’t

Hollywood wants to prevent piracy - and that is understandable. But in their zeal it sometimes appears that some who create movies or music don’t care what privacy, security, legal rights, or laws of physics they try to violate. And that is a real problem. DRM proponents want to release digital information to people, yet make it impossible to copy it. Yet the whole point of digital processing is to enable perfect copies. DRM (Digital Rights Management or Digital Restrictions Management) is truly “defective by design”. As others have said, DRM is an attempt to change water so it’s not wet.

The recent reports about HD DVD are showing the folly of DRM in general. HD DVD encrypts a movie, and then encrypts that movie’s key many different times on the DVD as well - once for each player. The theory here is that the movie industry could then revoke a player key by simply not including that key on future DVDs. I think the first time they try to actually do this, they’ll see the folly of it — it would mean that millions of customers would suddenly no longer have access to future movies through a device they purchased that they expected to work with them. Can anyone say “class action lawsuit”? I knew you could!
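
The per-player key wrapping described above can be modeled in a few lines. This is a deliberately simplified sketch: real AACS uses AES and a subset-difference broadcast-encryption scheme, not this toy XOR wrap, and the player names here are made up.

```python
from secrets import token_bytes

def wrap(key, wrapping_key):
    # toy XOR stand-in for the real cipher (AACS actually uses AES)
    return bytes(a ^ b for a, b in zip(key, wrapping_key))

unwrap = wrap  # XOR is its own inverse

title_key = token_bytes(16)
player_keys = {name: token_bytes(16)
               for name in ("player_A", "player_B", "player_C")}

# The disc carries the title key wrapped once per licensed player...
disc = {name: wrap(title_key, k) for name, k in player_keys.items()}

# ...so any single player can recover it:
assert unwrap(disc["player_B"], player_keys["player_B"]) == title_key

# Revocation: a future disc simply omits the compromised player's slot.
new_disc = {name: c for name, c in disc.items() if name != "player_B"}
assert "player_B" not in new_disc
```

The catch falls straight out of this model: revocation only works if the studios know which slot to drop - and an attacker who extracts a player key without revealing which player it came from leaves them with nothing to revoke.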

But it turns out that this idea has a fatal flaw technically, as shown by BackupHDDVD (you can see the code, comments, NY Times article, and Slashdot discussion). The code itself is no big deal - it just implements the decryption protocol, which is publicly known anyway. But the interesting trick is that the released software requires the master decryption key for that specific movie, and the implementor is claiming that he has found a way to get this key from a player. To be fair, he hasn’t proven he can get such keys by actually sharing any real keys, but let’s presume that he is telling the truth; his described method for getting them is very plausible. Yet the implementor is not revealing the player that he got this from or the exact details of how he got them.

That’s more clever than it first appears. The creators of the DRM scheme assumed that anyone who broke a player would reveal the player’s private key. But because BackupHDDVD’s creator doesn’t reveal that key, he never reveals the player he’s broken into. Since the DRM scheme masters don’t know which player was broken into, their revocation scheme won’t work. Many other revocation schemes for media use the same basic approach, and so they would fall the same way.

Some Blu-ray folks are claiming that this shows their scheme works better, because they can include additional crypto stuff on the media. But this shows that they don’t understand the nature of the problem; it’s not hard to implement the crypto interpreter, and since you wouldn’t know which player to revoke, you would give all the crypto interpreter information away too. They’d just need to send around the decrypted decryptor… which would be trivially acquired. Once again, DRM is doing nothing to stop piracy, but it’s certainly interfering in legitimate use. Sorry, but water stays wet.

I do not approve of piracy. I don’t approve of murder, either, yet I approve of the sale of steak knives and cleaning supplies… and would oppose trying to halt their sales. The costs of DRM measures to consumers are considerable; DRM is actively against the interests of customers… and it even fails to do the one thing it is supposed to do. DRM proponents are often laughingly referred to as the MAFIAA (Music And Film Industry Association of America), in part because their actions towards their own customers seem actively hostile. DRM seems to be primarily about preventing people from using in legitimate ways the products they’ve already purchased, and has nothing to do with actually preventing illegal activities. Why can’t I transfer that music or movie I bought to a new device I just bought? Or to an old CD so I can play it on older equipment? Why can’t I watch what I bought using GNU/Linux or BSD systems? Why can’t I use a $3000 display’s full resolution at all times for movies I have legitimately bought? Measures this extreme that create monopolies and inhibit legal activities are not a good thing, and are worse than the piracy that DRM measures are trying to prevent.

What’s worse, the anti-consumer impacts of DRM don’t even inhibit piracy. The big piracy operations will just continue to make direct copies of the bits using specialized equipment, and DRM cannot affect that at all. Individuals can make recordings of the displays or sounds… again, DRM can’t really counteract that (there are anti-sync measures for video, but they are easily foiled). So DRM will fail against individuals, and against large-scale piracy, period. Since DRM tries to prevent many legitimate uses, it also gives law-abiding citizens an incentive to break it… and given that additional incentive, every DRM scheme so far has fallen. The fact that DRM is not even successful at doing what it’s supposed to do is just icing on the cake. Even if DRM worked, it would still be worse than the problems it is trying to stop.

DRM is the disease, not the cure. It’s time for content industries to take advantage of technology, instead of trying to halt the use of technology. Instead of DRM, they should sell non-DRMed content using standards that everyone can implement… and then they can sell their content to a very large unified market.

path: /security | Current Weblog | permanent link to this entry