California: Open Source Software is Okay!
The California state government has officially declared that it’s okay to use open source software inside the California state government. On January 7, 2010, the California Office of the State Chief Information Officer (OCIO) released Information Technology Policy Letter (ITPL) 10-01, titled “Open Source Software Policy”. A key purpose of ITPL 10-01 is to “formally establish the use of Open Source Software (OSS) in California state government as an acceptable practice”, and the first sentence of its policy statement is that “The OCIO permits the use of OSS”. It even includes the ten-point open source definition (OSD) as promulgated by the Open Source Initiative, to make sure that there’s no misunderstanding.
I think this is a big deal. Officially saying “it’s okay to use free/libre/open source software (FLOSS)” is really important before FLOSS can get widespread use in governments. Most technologists already understand the potential advantages of FLOSS, but they encounter a lot of resistance when they try to use or develop FLOSS in large organizations like governments. Far too many middle managers are instinctively afraid of change from “the way we’ve always done it”. For example, they may be afraid of unseen problems, or afraid their bosses will rake them over the coals later. Far too often the middle managers have misunderstandings about FLOSS, too. For example, many managers still believe the myth that “you can’t get support” and are unaware of the many companies that do provide such support. Companies that make competing proprietary products are delighted (of course) when governments don’t consider their competition... but in an era of tight budgets, it doesn’t make sense for governments to ignore competing (and often less expensive) products. When top officials give official “top cover” permission to consider FLOSS, then the technologists and middle managers are far more likely to fairly and honestly consider it.
Also, the fact that it’s California matters. The economy of California is larger than that of most countries (if it were a country, it would be third through tenth in the world depending on how you measure it). Anything the state of California does can influence other states and countries; acts like this further legitimize the use of Free/Libre/Open Source Software (FLOSS).
Of course, the state of California isn’t the only government organization to release a memo officially declaring that it’s okay to use free/libre/open source software (FLOSS). Just looking inside the U.S., the U.S. DoD did this in 2003, the Office of Management and Budget (OMB) released a somewhat similar memo in 2004 that applied to the entire U.S. federal government, the U.S. Navy did this in 2007, and the U.S. DoD released clarifying guidance in 2009 re-emphasizing this point. And that’s only a few examples from U.S. government organizations; the examples from around the world are legion. It’s really difficult to get people to change what they do... as you can tell from the number of times that various U.S. federal government organizations have had to state and re-state it. Still, these memos really do have an effect. Official policy statements that FLOSS may be used, such as the one California just released, are a necessary first step to changing things from “the way we’ve always done things”.
path: /oss | Current Weblog | permanent link to this entry
Eben Moglen has a very interesting presentation on patents (including comments on Bilski) that was originally presented on Nov. 2, 2009. Software patents and business method patents have been a disaster for the U.S. and world economy, and he has some interesting things to say about how we got here (and how it could be fixed).
One interesting point he made, which I hadn’t heard before, is that there is a fundamental conflict between the patent system and the Administrative Procedure Act of 1946 (aka the APA). Nearly all of the U.S. government must obey the APA before creating new rules and regulations. According to the APA, U.S. agencies must keep the public informed, provide for public participation in the rulemaking process, establish uniform standards for rulemaking and adjudication, and provide for judicial review. In particular, agencies normally have to perform a cost-benefit analysis.
But the patent system pre-existed the APA. Patents, since they are government-created monopolies, can constrain people in the same ways that any other rule or regulation can. However, the government does not follow the APA to determine if each proposed patent should be granted. Instead, the old patent process was essentially grandfathered in as a special exception to the APA. Because the APA is not considered when examining each patent, no one in government asks the normally-required question “How will each proposed patent be publicly reviewed before it is granted?”. Patents on ideas that are patently obvious are routinely granted, in part because there is no public review before they are granted and because the patent office (by policy) ignores most information available to the public. All of this happens because the patent-granting process is not required to enable public participation in the rulemaking process (in this case, the process for permitting the granting of a patent). Also, when examining a patent to determine if it should be granted, no one asks the normally-obvious questions about the harm it could do to society.
Because the patent system predates the APA, all potential harms to society from a patent are completely ignored during the patent examination process. If patents were individually considered as new regulations under the APA, such questions would need to be carefully considered. That’s an interesting point Moglen makes.
It’s my hope that the Supreme Court will clearly stop software patents. We shall see.
U.S. research should be open access
The Office of Science and Technology Policy (OSTP) has launched a “public consultation on Public Access Policy”, to see if research funded by U.S. grants should be made available as open access results. I think this is important — I believe publicly-funded unclassified research should actually be made available to the public.
Historically, the U.S. pays a fortune for research, the results are written up as papers for journals, and then various publishers acquire total rights to these papers and charge exorbitant monopoly fees for them. The result: Most U.S. citizens cannot afford to see the research their taxes pay for.
The basic question here is really straightforward: Should publicly-funded research results be made available directly to the public instead? Or, should private companies continue to gain ownership over publicly-funded results, for either nothing or a tiny fraction of the public’s costs?
A small number of journal publishers and societies strongly want to keep things the way they are, of course. It makes sense from their point of view; everybody likes free (or nearly free) money! Historically, this arrangement was created because it can be expensive to publish and manage paper. However, that rationale has become completely obsolete. Few people want the paper any more — they want the research, on-line, without a paywall. And don’t give me nonsense about the “costs” of peer review. Many journals don’t pay their reviewers (the reviewers do it gratis), and even if they did, the total control they gain is still unjustified; the U.S. government spends far more per paper than they do for review.
The current sequestering of research is not good for science or the country. I’m currently reading the interesting book “Are We Rome?” by Cullen Murphy, and I can’t help but see some parallels. Chapter 3 is all about “when public good meets private opportunity”. Private organizations may pay for private research, and then keep their results private. But when the public pays for research, it should be shocking if it does not get released back to the public. And by “released back”, I mean released back at no fee at all.
So who will pay for the printing, complex peer review, storage, and fancy indexing of these research results? I think the very question shows a failure to understand current technology, but let’s answer it anyway. Most peer review isn’t paid-for anyway, and if it is, it’s a tiny cost compared to the research itself. Storage? Don’t make me laugh; for $100 I can buy storage for all of the U.S. research papers for a year. Indexing? The government shouldn’t be doing serious indexing at all!! Just put it on a government site with a basic form filled out (title, authors, date, keywords, abstract, and a link to the actual paper on the government site). If it’s not behind a paywall, the many commercial search systems will index it for you.
I do think there should be a centralized government repository of such papers. If it’s distributed, then papers could be lost without anyone knowing it. I think they should be freely redistributable, so others can copy what they want, but a centralized repository would make sure that we keep all of them available forever. Also, bandwidth costs can be reduced by scale. There’s a risk that they all get lost at once, but it’s easier to copy everything if there’s one place to start from. If it’s a complicated site, then they’ve done it wrong.... for each paper there should be a simple “summary” page with title, authors, etc., and the actual paper itself.
OSTP cites the experience of NIH; NIH did wonderful work in releasing research as open access, and in my mind the real problem is that they didn’t go far enough. First, NIH has a one-year embargo... if I already paid for it (and I did), why should wealthy people and organizations get the results first? Second, NIH only considers the actual papers, not the data and software programs that support the works... yet often those are more important. If they were funded by the public, then the public should get them (unless they’re classified, of course, but then they shouldn’t be released at all). I’m sure there are complications and exceptions, but a “default open access” policy would go a long way.
So please, tell the OSTP that the U.S. should release government-funded research as open access publications, available to anyone on the Internet without a paywall. In short, if “we the people” paid for it, then “we the people” should get it. For more information, see this Request for Information (RFI).
Success on Fully Countering Trusting Trust through Diverse Double-Compiling
My November 23 public defense of Fully Countering Trusting Trust through Diverse Double-Compiling went well. This was my 2009 PhD dissertation that expands on how to counter the “trusting trust” attack by using the “Diverse Double-Compiling” (DDC) technique.
Most importantly (to me), my PhD committee agreed that I successfully defended my dissertation. Whew! As a result, I'm essentially done with my PhD.
I learned a lot about creating formal proofs using computers by doing this dissertation. I wanted to give the strongest possible evidence that DDC counters the trusting trust attack, and formal proofs are the strongest form of proof that I know of... which is why I created them. Frankly, creating proofs was kind of fun once I knew what I was doing, but getting there was more painful than it needed to be. Many books are on the underlying mathematics (e.g., giving you extreme detail about various logic systems)... which is great if you're a mathematician, but not so helpful if you are simply trying to use the mathematics. Some books explain how to do things by hand, but that is an unnecessary amount of pain; one of my proofs is 30 steps long, and I sure wouldn't have wanted to create that by hand. Some books seemed to assume that you already knew everything the book covered, which is an odd assumption to me :-).
Here's a trivial example: Most logic systems can prove anything if you give them inconsistent assumptions. That's bad! You can get rid of that problem by sending the assumptions to a model-builder like mace4... if it can create a model, then the assumptions are consistent. So, make sure you send your assumptions through a model-builder to see if your assumptions are consistent.
I've posted detailed data from my dissertation so that people can reproduce my results. I think it's really important that results be reproducible; otherwise, it's not science. As part of that data, I've included a few files that may help potential proof tool users get started. In particular, I've posted prover9 input to prove that Socrates is mortal, prover9 input to prove that the square root of 2 is irrational, and prover9 input showing how to easily declare that terms in a list are distinct.
The "trusting trust" attack has historically been considered the "uncounterable" attack. Now the attack can be effectively detected — and thus countered.
Fully Countering Trusting Trust through Diverse Double-Compiling
A last-minute reminder — my public defense of Fully Countering Trusting Trust through Diverse Double-Compiling is coming up on November 23, 1-3pm. This is my 2009 PhD dissertation that expands on how to counter the “trusting trust” attack by using the “Diverse Double-Compiling” (DDC) technique.
It will be at George Mason University, Fairfax, Virginia, Innovation Hall, room 105. Anyone is welcome!
I've made a few small tweaks over the last few weeks. I modified proof #2 to reduce its requirements even further (making it even easier to do); I had mentioned in text that this was possible, but now the formal proof shows it. I also used mace4 to show that the assumptions of each proof are consistent. Formal proofs aren't easy to create, or trivial to read, but the reason I went to that trouble is to show that it's not just my opinion that I've countered the trusting trust attack... I want to show, conclusively, that the trusting trust attack has been countered. I know of no stronger method to show that than a formal proof.
The "trusting trust" attack has historically been considered the "uncounterable" attack. Nuts to that. Now the attack can be effectively detected — and thus countered.
Trusting Trust, DDC, and Free-Libre/Open Source Software (FLOSS)
As I noted in my blog, I’ve just released my dissertation “Fully Countering Trusting Trust through Diverse Double-Compiling” (DDC). But what does that mean for Free-Libre/Open Source Software (FLOSS)? In short, it’s fantastic news for FLOSS, but to explain why that’s so, I need to backtrack first.
The “trusting trust” attack is a nasty computer attack that involves creating a subverted compiler in such a way that it even subverts compilers. It was originally reported in a 1974 security evaluation of Multics, but most people heard about it from Ken Thompson’s 1984 Turing Award presentation (Ken Thompson is a creator of Unix). This attack is incredibly nasty, and what’s worse, until now there’s been no effective countermeasure to it. Indeed, some have claimed that it could not ever be countered, making the whole idea of “computer security” a non-starter.
The “trusting trust” attack appears to be especially devastating to FLOSS. The problem is that with the trusting trust attack, the source code that people review does not correspond to the executable that’s actually running, and that seems to completely torpedo the “many eyes” review that FLOSS makes possible. The whole world could carefully review a program’s source code, but it wouldn’t matter if the compiler turns it undetectably into something malicious.
Thankfully, there is an effective countermeasure, which I’ve named Diverse Double-Compiling (DDC). You can see my dissertation which explains what it is, proves that it works, and even demonstrates it with several compilers including GCC. (I will be giving a public defense of it on November 23, 2009, if you’d like to come.) This means that source code review, such as mass review of FLOSS code, can now actually work.
But there’s more, because there’s an interesting catch with DDC. DDC counters the trusting trust attack, but it’s only useful for people who have access to the compiler source code. Fundamentally, DDC is a technique for determining if a compiler executable corresponds with its source code, but only people who have the source code can apply DDC to see if that’s true. What’s more, only people who have access to the source code will find the statement “the source and executable correspond” particularly useful. (You could use trusted intermediaries, but this requires total trust in those intermediaries, making such claims far weaker than claims that anyone can check.) What’s more, DDC is actually useful beyond what we normally think of as compilers, because you can redefine “compiler” as including other parts (such as the operating system). In that case, you can even show that the system’s executables all correspond to their source code. But you can only use DDC to counter the trusting trust attack if you have access to the source code.
So we now have a radical change. Now that DDC has been shown to work, we can see that software with available source code (including FLOSS) has a fundamental security advantage over other software. That doesn’t mean that all FLOSS is more secure than all proprietary software, of course. But FLOSS already had a general security advantage because it better meets Saltzer & Schroeder’s “Open design principle” (as explained in their 1974-1975 papers). Now we have an attack — the trusting trust attack — for which FLOSS has a fundamental security advantage. The time of ignoring FLOSS options, because of misplaced notions that FLOSS cannot be as secure as proprietary software, needs to come to an end.
New PhD Dissertation: Fully Countering Trusting Trust through Diverse Double-Compiling
An Air Force evaluation of Multics, and Ken Thompson’s Turing award lecture (“Reflections on Trusting Trust”), showed that compilers can be subverted to insert malicious Trojan horses into critical software, including themselves. If this “trusting trust” attack goes undetected, even complete analysis of a system’s source code will not find the malicious code that is running. Previously-known countermeasures have been grossly inadequate. If this attack cannot be countered, attackers can quietly subvert entire classes of computer systems, gaining complete control over financial, infrastructure, military, and/or business system infrastructures worldwide.
Thankfully, there is a countermeasure to the “trusting trust” attack. In 2005 I wrote a paper on Diverse Double-Compiling (DDC), published by ACSAC, where I explained DDC and why it is an effective countermeasure. But some people still raised concerns. Would DDC really counter the attack? Would DDC scale up to real-world compilers? Also, the ACSAC paper required “self-parenting” compilers — can DDC handle compilers that are not self-parenting?
I’m now releasing Fully Countering Trusting Trust through Diverse Double-Compiling, my 2009 PhD dissertation that expands on how to counter the “trusting trust” attack by using the “Diverse Double-Compiling” (DDC) technique. This dissertation was accepted by my PhD committee on October 26, 2009.
On November 23, 2009, 1-3pm, I will be giving a public defense of this dissertation. If you’re interested, please come! It will be at George Mason University, Fairfax, Virginia, Innovation Hall, room 105.
This dissertation’s thesis is that the trusting trust attack can be detected and effectively countered using the “Diverse Double-Compiling” (DDC) technique, as demonstrated by (1) a formal proof that DDC can determine if source code and generated executable code correspond, (2) a demonstration of DDC with four compilers (a small C compiler, a small Lisp compiler, a small maliciously corrupted Lisp compiler, and a large industrial-strength C compiler, GCC), and (3) a description of approaches for applying DDC in various real-world scenarios. In the DDC technique, source code is compiled twice: once with a second (trusted) compiler (using the source code of the compiler’s parent), and then the compiler source code is compiled using the result of the first compilation. If the result is bit-for-bit identical with the untrusted executable, then the source code accurately represents the executable.
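The two compilations and the final comparison can be sketched as a small shell function. This is only an illustration, not the procedure used in the dissertation: the name ddc_check is hypothetical, and it assumes every compiler involved can be invoked as "COMPILER SOURCE OUTPUT", which real compilers such as GCC do not do (they need a full build procedure).

```sh
#!/bin/sh
# ddc_check: illustrative sketch of a diverse double-compiling check.
# ASSUMPTION: each compiler is run as "COMPILER SOURCE OUTPUT".
# Usage: ddc_check TRUSTED_COMPILER PARENT_SOURCE COMPILER_SOURCE UNTRUSTED_EXE
# Returns 0 if the regenerated executable is bit-for-bit identical to
# UNTRUSTED_EXE, 1 if it differs, 2 if a compilation step failed.
ddc_check() {
    trusted=$1
    parent_src=$2
    compiler_src=$3
    untrusted=$4
    work=$(mktemp -d) || return 2

    # Stage 1: compile the parent's source code with the trusted compiler.
    "$trusted" "$parent_src" "$work/stage1" || { rm -rf "$work"; return 2; }

    # Stage 2: compile the compiler's source code with the stage 1 result.
    "$work/stage1" "$compiler_src" "$work/stage2" || { rm -rf "$work"; return 2; }

    # Compare the stage 2 result with the executable under test.
    if cmp -s "$work/stage2" "$untrusted"; then rc=0; else rc=1; fi
    rm -rf "$work"
    return "$rc"
}
```

For a self-parenting compiler, PARENT_SOURCE and COMPILER_SOURCE would be the same file. A return value of 1 does not by itself prove an attack; it only says the executable does not match what the source code produces under these assumptions.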
Many people commented on my previous 2005 ACSAC paper on the topic. Bruce Schneier wrote an article on ‘Countering “Trusting Trust”’, which I think is one of the best independent articles describing my work on DDC.
This 2009 dissertation significantly extends my previous 2005 ACSAC paper. For example, I now have a formal proof that DDC is effective (the ACSAC paper only had an informal justification). I also have additional demonstrations, including one with GCC (to show that it scales up) and one with a maliciously corrupted compiler (to show that it really does detect them in the real world). The dissertation is also more general; the ACSAC paper only considered the special case of a “self-parenting” compiler, while the dissertation eliminates that assumption.
So if you’re interested in countering the “trusting trust” attack, please take a look at my work on countering trusting trust through diverse double-compiling (DDC).
Notes about the DoD and OSS memo
Yesterday I posted about the new 2009 DoD memo about open source software. I'm delighted to see that the word is getting out. Slashdot, Linux Weekly News, and LXer.com all mentioned the new memo and even pointed to my post. Others are noting the new memo too, including CNet's Matt Asay, InformationWeek's J. Nicholas Hoover, InformationWeek's Serdar Yegulalp, NetworkWorld, and The H. Dan Risacher has posted on Slashdot some background and history for this new 2009 DoD memo. He notes, for example, that "The lawyers were by far the biggest delay" in getting this memo released.
There's some supporting information for this memo at the DoD Free Open Source Software (FOSS) Communities of Interest (COI) site, which posts the memo itself and a supporting DoD Open Source Software Frequently Asked Questions (FAQ) document.
To help potential users, I've updated my presentation Open Source Software (OSS) and the U.S. Department of Defense (DoD), which I hope will clarify some things. I should also remind people about the 2003 MITRE study "Use of Free and Open Source Software (FOSS) in the U.S. Department of Defense", which showed that in 2003 Free/libre/open source software (FLOSS, FOSS, or OSS) was already widely used in the DoD.
New DoD memo on Open Source Software
The U.S. Department of Defense (DoD) has just released Clarifying Guidance Regarding Open Source Software (OSS), a new official memo about open source software (OSS). This 2009 memo should soon be posted on the list of ASD(NII)/DoD CIO memorandums. This 2009 memo is important for anyone who works with the DoD (including contractors) on software and systems that include software... and I suspect it will influence many other organizations as well. Let me explain why this new memo exists, and what it says.
Back in 2003 the DoD released a formal memo titled Open Source Software (OSS) in the Department of Defense. This older memo was supposed to make it clear that it was fine to use and develop OSS in the DoD. Unfortunately, as the new 2009 memo states, "there have been misconceptions and misinterpretations of the existing laws, policies and regulations that deal with software and apply to OSS that have hampered effective DoD use and development of OSS".
This new 2009 memo simply explains "the implications and meaning of existing laws, policies and regulations", hopefully eliminating many of those misconceptions and misinterpretations. A lot of the "meat" is in the Attachment 2, section 2 (guidance), so let's walk through that:
But perhaps most important is this memo's opening statement: "To effectively achieve its missions, the Department of Defense must develop and update its software-based capabilities faster than ever, to anticipate new threats and respond to continuously changing requirements. The use of Open Source Software (OSS) can provide advantages in this regard...". As with the later part (b), here we have an official government document acknowledging that OSS can have a significant advantage. What's more, these potential advantages aren't necessarily just minor cost savings; OSS can in some cases provide a military advantage. Which is a more-than-adequate justification for considering OSS, as I have been advocating for years.
I'm really delighted that this memo has finally been released. I participated in the original brainstorming meeting to create this memo (as did John Scott), and I reviewed many versions of it, but many, many other hands have stirred this pot since it began. It took over 18 months to create it and get it out; getting this coordinated was a very long and drawn-out process. My thanks to everyone who worked to help make this happen. In particular, congrats go to Dan Risacher, who led this project to its successful completion.
By the way, if you're interested in the issue of open source software in the U.S. military/national defense, you probably should look at Mil-OSS (at least, join their mailing list, and consider going to their upcoming conference; I was a speaker at their last one). If you're interested in the connection between open source software and the U.S. government (including the military), you might also be interested in the upcoming GOSCON conference on November 5, 2009 (I'm one of the speakers there too).
path: /oss | Current Weblog | permanent link to this entry
CVC3 is one of the better automated theorem provers. Given certain mathematical assertions, it can in many cases prove that certain claims follow from them. Some tools that can prove properties about programs use CVC3 (and/or similar programs). For example, the Frama-C Jessie plug-in for C and Krakatoa for Java use Why, which can build on one of several programs including CVC3.
The trouble is, CVC3's license has historically been a problem. I understand that its authors intended for CVC3 to be Free/Libre/Open Source Software (FLOSS), but unfortunately, it was released with additional license clauses that resulted in yet another non-standard license. This was an unfortunate mistake; as I note in my essay on GPL-compatible licenses, it is absolutely critical to choose a standard FLOSS license when releasing FLOSS. In this case, the big problem was the addition of an "indemnification" clause that was really scary; to some, at least, it seemed to imply that if the CVC3 authors were sued, anyone who used or copied the program was obligated to pay their legal bills. Interpreted that way, no one wanted to touch the program... how could any user possibly know their risks? Fedora eventually ruled that this license was non-free (aka not FLOSS), and thus could not be included in Fedora. There was a less-serious problem that if you made a change to the program, you had to change the name... since the program couldn't even compile without a change (at the time), this meant that you had to change the name almost instantly. There is a reason that people have converged on standard FLOSS licenses; if your lawyer says you need to add non-standard clauses, be wary, because the result may be that few people can use your program.
I'm delighted to report that this has a happy ending. CVC3's license has just been changed to a straight BSD license - a well-known license that is universally acknowledged as being FLOSS. This means that there are no licensing problems for Linux distributions. Only about a day after he found this out, Jerry James submitted a CVC3 package to Fedora. So, I expect that in a relatively short time we'll see CVC3 available directly in common Linux distribution repositories.
I think this is a helpful step towards open proofs, which are cases where an implementation, its proofs, and the necessary tools are all FLOSS. Having a good tool like CVC3 to build on makes it easier to develop useful tools. My hope is to mature formal methods tools so that they can be more scalable, applicable, and effective than they are today. It's clear that a single little tool cannot possibly do the job; we need suites of tools that can work together. And this is a promising step in that direction.
path: /oss | Current Weblog | permanent link to this entry
I’ve just released Auto-DESTDIR, a software package which helps automate program installation on POSIX/Unix/Linux systems from source code. If you have the problem it solves — automatic support for DESTDIR — you want this!
A little background: Many programs for Unix/Linux are provided as source code. Such programs must be configured, built, and installed, and that last step is normally performed by typing “make install”. The “make install” step normally writes directly to privileged directories like “/usr/bin” to perform the installation. Unfortunately, most modern packaging systems (such as those for .rpm and .deb files) require that files be written to some intermediate directory instead, even though when run they will be in a different filesystem location (because of security issues). This redirection is easy to do if the installation script supports the “DESTDIR convention”; simply set DESTDIR to the intermediate directory’s value and run “make install”. Supporting DESTDIR is a best practice when releasing software. Unfortunately, many source packages don’t support the DESTDIR convention. Auto-DESTDIR causes “make install” to support DESTDIR, even if the provided “makefile” doesn’t support the DESTDIR convention. Auto-DESTDIR is released under the “MIT” license, so it is Free-libre/open source software (FLOSS).
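To make the DESTDIR convention concrete, here is a minimal sketch of an install step that honors it. The function name install_sketch and the prefix variable are illustrative assumptions, not part of Auto-DESTDIR or of any particular makefile; the only point is that the destination path is always built as DESTDIR followed by the final location.

```sh
#!/bin/sh
# install_sketch FILE: copy FILE into "$DESTDIR$prefix/bin".
# "install_sketch" and "prefix" are hypothetical names for illustration.
install_sketch() {
    prefix=${prefix:-/usr/local}
    # With DESTDIR unset or empty, this is the real install location.
    # With DESTDIR=/tmp/stage, it becomes /tmp/stage/usr/local/bin.
    bindir="${DESTDIR:-}$prefix/bin"
    mkdir -p "$bindir" && cp "$1" "$bindir/"
}
```

A packaging tool would set DESTDIR to its intermediate directory before running the install step, then collect the files from underneath that directory.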
Auto-DESTDIR is implemented using a set of bash shell scripts that wrap typical install commands (such as install, cp, ln, and mkdir). These wrappers are placed in a special directory. The run-redir command modifies the PATH so that the directory with these scripts is listed first, and then runs the given command. The make-redir command invokes “make” using run-redir, along with some extra settings to simplify things. For more information on this approach, and why this is a good way to automate DESTDIR, see the paper Automating DESTDIR, especially its section on wrappers.
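The PATH trick itself is small enough to sketch. The function below is not the real run-redir; the name run_redir_sketch and its first argument (the wrapper directory) are assumptions made for illustration. It only shows how putting a directory first in PATH makes a command like cp resolve to a wrapper instead of the real program.

```sh
#!/bin/sh
# run_redir_sketch WRAPPER_DIR COMMAND [ARGS...]
# Hypothetical illustration of the approach, not Auto-DESTDIR's own code.
run_redir_sketch() {
    wrapper_dir=$1
    shift
    # Use a subshell so the caller's PATH is left untouched.
    (
        PATH="$wrapper_dir:$PATH"
        export PATH
        # Any "cp", "install", etc. found in WRAPPER_DIR now wins.
        exec "$@"
    )
}
```

Anything the command runs in turn (for example, the install commands run by make) inherits the modified PATH, which is why the wrappers see every call without the makefile being edited.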
So please take a look at the Auto-DESTDIR software package, if you have the problem it solves.
Limiting Unix/Linux/POSIX filenames simplifies things: Lowercasing filenames
My essay Fixing Unix/Linux/POSIX Filenames: Control Characters (such as Newline), Leading Dashes, and Other Problems argues that adding some limitations on legal Unix/Linux/POSIX filenames would be an improvement. In particular, a few minor limitations (which most people assume anyway) would eliminate certain kinds of bugs, some of which end up being security vulnerabilities. Forbidding crazy things (like control characters in filenames) simplifies creating programs that work all the time.
Here’s a little example of this. I wanted to convert all the filenames inside a directory tree to all lowercase letters. I didn’t want to lose any files without checking on them first, so I wanted it to ask before doing a rename in a way that would eliminate a file (i.e., I wanted to use mv -i). I didn’t find such a program built into my distro, so I wrote a short script to do it (which is just as well, because it makes a nice simple example). I wanted it to be portable, since I might need it again later.
So how do we write this? A simple glob like “*” won’t work, because it needs to recursively descend through a tree of directories, and simple globs will skip hidden filesystem objects too (and I want to include them). I could write a more complex glob that included hidden files and directories, and recursed down through subdirectories, but the naive way of recursing down subdirectories can have many problems (e.g., it could get stuck in endless loops created by symbolic links). If we need to handle a tree recursively, there’s a better tool designed for the purpose — find.
Unfortunately, an ordinary find . has an interesting problem — it will pick the upper-level names first, and if we rename the upper-level names first, find will fail when it tries to enter them (since they will no longer exist). No problem — if we are manipulating the tree structure (including renames), we can use the -depth option of find, which will process each directory’s contents before the directory itself. We can then rename just the basename of what find returns, so we won’t change anything before find descends into it.
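A quick way to see the ordering that -depth gives is to build a tiny throwaway tree and list it. The directory names below are made up for the demonstration.

```shell
#!/bin/sh
# Show that "find -depth" reports a directory's contents before the directory.
# ASSUMPTION: Top/Sub/File is a hypothetical tree created just for this demo.
demo=$(mktemp -d)
mkdir -p "$demo/Top/Sub"
: > "$demo/Top/Sub/File"

list_depth_first() {
    ( cd "$1" && find . -depth )
}

list_depth_first "$demo"
# Prints, in this order:
#   ./Top/Sub/File
#   ./Top/Sub
#   ./Top
#   .
```

Because each name is reported only after everything beneath it, renaming just the last path component never invalidates a path that find still has to visit.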
Now, if we could assume that newlines and tabs cannot be in filenames, as recommended in Fixing Unix/Linux/POSIX Filenames..., then we can do a simple for loop around the results of find. My shell script mklowercase renames filenames to lowercase letters recursively. Here is its essence:
#!/bin/sh
# mklowercase - change all filenames to lowercase recursively from "." down.
# Will prompt if there's an existing file of that name (mv -i)
# Presumes that filenames don't include newline or tab.
set -eu
IFS=`printf '\n\t'`
for file in `find . -depth` ; do
  [ "." = "$file" ] && continue # Skip "." entry.
  dir=`dirname "$file"`
  base=`basename "$file"`
  oldname="$dir/$base"
  newbase=`printf "%s" "$base" | tr A-Z a-z`
  newname="$dir/$newbase"
  if [ "$oldname" != "$newname" ] ; then
    mv -i "$file" "$newname"
  fi
done
This script skips “.”, which is not strictly necessary, but I thought it would be a good idea to point out that you may need to skip “.” sometimes.
Yes, this could be modified to handle literally all possible Unix/Linux/POSIX filenames, but those modifications make it more complicated and uglier. One approach would be to use one program to use find...-exec, which then invokes another script to do the renaming. But then you have to maintain two scripts, and keep them in sync. You could embed the command into find, but then the find command becomes hideously complicated.
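For comparison, here's a rough sketch of what embedding the command into find might look like. It is not the mklowercase script; lowercase_tree is a made-up name, and the trailing-"x" trick is only there because command substitution strips trailing newlines.

```shell
#!/bin/sh
# Sketch: the rename logic embedded in find via -exec sh -c.
# ASSUMPTION: lowercase_tree is a hypothetical name; this is not mklowercase.
lowercase_tree() {
    (
        cd "$1" || exit 1
        find . -depth -exec sh -c '
            for file do
                [ "." = "$file" ] && continue
                dir=${file%/*}          # everything before the last slash
                base=${file##*/}        # last path component
                # Append "x" so a trailing newline in the name survives $(...).
                newbase=$(printf "%sx" "$base" | tr A-Z a-z)
                newbase=${newbase%x}
                if [ "$base" != "$newbase" ]; then
                    mv -i "$file" "$dir/$newbase"
                fi
            done
        ' sh {} +
    )
}
```

It copes with spaces and newlines in names because find hands each name over as a separate argument, but the quoting inside quoting and the sentinel character make it much harder to read than the simple loop.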
Another solution to handling all filenames would be to change the loop to:
find . -depth -print0 | while IFS="" read -r -d '' file ; do ...
However, this requires non-standard GNU extensions to find (-print0) and bash (read -d), as well as being uglier and more complicated. Also, if “mv” is implemented as required by the Single UNIX Specification, then the “mv -i” will fail badly if it tries to rename a file into an existing name. That’s because when it tries to get an answer, it will send a prompt to stderr, but it will expect a RESPONSE from stdin... and yet, stdin is where it gets the list of filenames!!
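One possible workaround for the stdin collision is to save the original stdin on another file descriptor before the pipeline starts, and point “mv -i” at it. This is a sketch under assumptions: lowercase_via_fd3 is a made-up name, and it uses plain "read -r", so it still presumes no newlines in filenames.

```shell
#!/bin/sh
# Sketch: keep the caller's stdin on fd 3 so "mv -i" can still ask questions
# while the loop reads filenames from the pipe.
# ASSUMPTION: lowercase_via_fd3 is hypothetical; names must not contain newlines.
lowercase_via_fd3() {
    (
        cd "$1" || exit 1
        exec 3<&0                       # duplicate the original stdin as fd 3
        find . -depth | while IFS= read -r file; do
            [ "." = "$file" ] && continue
            dir=${file%/*}
            base=${file##*/}
            newbase=$(printf '%s' "$base" | tr A-Z a-z)
            if [ "$base" != "$newbase" ]; then
                # Answers to the prompt come from fd 3, not from the pipe.
                mv -i "$file" "$dir/$newbase" <&3
            fi
        done
    )
}
```

It works, but it is one more piece of plumbing that exists only to defend against filenames almost nobody creates on purpose.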
And it’s all silly anyway. If you put newlines in filenames, lots of scripts fail. It’s simply too much of a pain to deal with them “correctly”. Which is the point of Fixing Unix/Linux/POSIX Filenames — adding some limitations on legal Unix/Linux/POSIX filenames would be an improvement. At the least, by default let’s forbid control characters (so simple “find” and filename display is safe), forbid leading dash characters (so simple globbing is safe), require that all filenames be UTF-8 (so displaying filenames always works), and perhaps forbid trailing spaces (since these are dangerously misleading to end-users). I would like to see kernels build in the mechanisms to forbid certain kinds of filenames, so that administrators can then specify the specific “bad filename” policy they would like to use.
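Until kernels can enforce such a policy, you can at least look for offending names. Here's a minimal sketch that reports names with a leading dash, a control character, or a trailing space; find_bad_names is a made-up name, and it does not attempt the UTF-8 check.

```shell
#!/bin/sh
# Sketch: list filenames that break the proposed rules.
# ASSUMPTION: find_bad_names is a hypothetical helper; UTF-8 validity is not checked.
find_bad_names() {
    find "${1:-.}" \( -name '-*' \
                   -o -name '*[[:cntrl:]]*' \
                   -o -name '* ' \) -print
}
```

Note the irony: the output is newline-separated, so a name containing a newline shows up as two lines. That is exactly the kind of problem the proposed limits would remove.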
So please take a look at: Fixing Unix/Linux/POSIX Filenames: Control Characters (such as Newline), Leading Dashes, and Other Problems. I’ve made a few recent additions, thanks to some interesting comments people have sent, but the basic message is the same.
SPARK released as FLOSS (Free/Libre/Open Source Software)!
The SPARK toolsuite has just been released as FLOSS (Free/Libre/Open Source Software) by Praxis (its creator). This is great news for those who want to make software safer, more reliable, and more secure. In particular, this means that Tokeneer is now an open proof. If you haven’t been following this, here’s some background.
Software is now a part of really critical systems (ones that need “high assurance”), yet often that software is not as safe, reliable, or secure as it needs to be. I believe that in the long term, we will need to start proving that our very important programs are correct. Testing by itself isn’t enough; completely testing the trivial “add three 64-bit integers” program would take far longer than the age of the universe (it would take about 2x10^39 years). The basic idea of using mathematics to prove that programs are correct — aka “formal methods” — has been around for decades. There are a number of cases where formal methods have been applied successfully, and I’m glad about that. And yet, applying formal methods is still relatively rare. There are many reasons for this, such as inadequate maturation and capabilities of many formal methods tools, and the fact that relatively few people know how to apply formal methods when developing real programs. But what, in turn, is causing those problems? It’s true that applying formal methods is a hard problem that hasn’t received the level of funding it needs, but still, it’s been decades!
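The testing estimate is easy to reproduce with a little arithmetic. The test rate is not stated above, so the figure of 10^11 tests per second below is purely an assumption picked for illustration; with it, exhaustively trying all 2^192 input combinations comes out at roughly 2x10^39 years.

```shell
#!/bin/sh
# Years needed to try every input of "add three 64-bit integers".
# ASSUMPTION: the tests-per-second rate is an illustrative guess, not a
# figure taken from the text; years_to_test_all is a hypothetical helper.
years_to_test_all() {
    awk -v rate="$1" 'BEGIN {
        inputs = 2 ^ 192                        # three 64-bit operands
        seconds_per_year = 365.25 * 24 * 3600
        printf "%.1e\n", inputs / rate / seconds_per_year
    }'
}

years_to_test_all 1e11    # prints 2.0e+39
```

Even a machine a billion times faster only trims nine zeros off the exponent, which is why exhaustive testing is not an option.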
I believe one problem hindering the maturation and spread of formal methods is a “culture of secrecy”. Details of formal method use are often unpublished (e.g., because the implementations are proprietary or classified). Similarly, details about formal methods tools are often unshared and lost (or have to be constantly re-invented). Biere’s “The Evolution from LIMMAT to NANOSAT” (Apr 2004) gives an example: “From the publications alone, without access to the source code, various details were still unclear... Only [when CHAFF’s source code became available did] our unfortunate design decision became clear... The lesson learned is, that important details are often omitted in publications and can only be extracted from source code. It can be argued, that making source code of SAT solvers available is as important to the advancement of the field as publications”
This “culture of secrecy” means that researchers/toolmakers often don’t receive adequate feedback, researchers/toolmakers waste time and money rebuilding tools, educators have difficulty explaining formal methods (they have no examples to show!), developers don’t understand how to apply it (and it has an uncertain value to them), and evaluators/end-users don’t know what to look for.
I believe that a way to break through this “culture of secrecy” is to develop “open proofs”. But what are they? An “open proof” is software or a system where all of the following are free-libre / open source software (FLOSS): the implementation, its proofs, and the required tools.
Imagine if we had a number of open proofs available. There could be small open proofs that could be used for learning (e.g., as examples and use in class exercises). There could be proofs of various useful functions and small applications, so developers could see how to scale up these techniques, directly reuse them as components, or use them as starting points but add additional (proven) capabilities to them. When problems come up (and they will!), toolmakers and developers could work together to find ways to mature the tools and technology so that they’d be easier to use (e.g., so more could be automated). In short, imagine there was a working ecosystem where researchers/toolmakers/educators, developers of implementations to be proved, and evaluators/end-users could work together by sharing information. I believe that would greatly speed up the maturing of formal methods, resulting in more reliable and secure software.
In this context, Praxis has just released the SPARK GPL Edition. This is their SPARK toolsuite (a formal methods tool) released under the GNU General Public License aka GPL (the most common FLOSS license). So, what’s that?
SPARK is a variant of the Ada programming language, designed to enable proofs about programs (by adding and removing some features of Ada). The additions are in special comments, so SPARK programs can be compiled by a normal Ada compiler like GNAT (which is part of gcc). The Open Proofs page on SPARK has some information on SPARK. The page What is Special About SPARK Contracts? gives a nice quick introduction to SPARK, which I will quote here. It points out that the Ada line:
procedure Inc (X : in out Integer);
just says there is some procedure “Inc” that may read a value X, and may write it out, but that’s it. In SPARK, you can add much more precise information, and the SPARK tools can then check to see if they are true. For example, if you say this using SPARK:
procedure Inc (X : in out Integer);
--# global in out CallCount;
--# pre X < Integer'Last and
--# CallCount < Integer'Last;
--# post X = X~ + 1 and
--# CallCount = CallCount~ + 1;
then the SPARK tools will ensure at compile-time (not run-time) that:
You can learn more about SPARK from the book “High Integrity Software: The SPARK Approach to Safety and Security” by John Barnes. Sample text of Barnes’ book is available online. The open proofs page on SPARK has more information.
This means that the “Tokeneer” program is now an open proof. Remember, to be an open proof, a program’s implementation, proofs, and required tools have to be open source software. Tokeneer was a sample program written to show how to apply these kinds of techniques to actual systems (instead of trivial 5-line programs). The Tokeneer program itself, and its proofs, have already been released as open source software. Many of the tools it required are already FLOSS (e.g., fuzz and LaTeX for its formal specifications, and an Ada compiler to compile it). Now that SPARK has been released as FLOSS, people can examine this entire stack of software to make improvements in all the technologies, as well as learn from them and create improved implementations. No, this doesn’t suddenly make it trivial to make proofs about complex programs, but it’s a step forward.
If you are interested in making future software better, please help the open proofs project. You don’t need to be a math whiz. For example, if you know how to do shell scripting, please help us package some promising formal methods tools (like SPARK) so they are easy to install. It’s hard to get people to try out these tools (and give feedback) if they’re too hard to install. If you know of formal methods software that is rotting in some warehouse, try to get it released as FLOSS. I think all government-funded unclassified research software should be released as FLOSS by default, since “we the people” paid for it! If you’re interested in the latest software technology, try out a few of these formal methods tools, and release as FLOSS any small programs and proofs you develop with them. Send the toolmakers feedback, or write down their strengths and weaknesses to help others understand them. SPARK is a tool that can be used, right now, in certain circumstances. I have no illusions that today’s formal methods tools are ready for arbitrary 20 million line programs. But if we want future software to be better than today, we need to figure out how to mature formal methods technology and make it better-understood so that it can mature and scale. I think making top-to-bottom worked examples and starting points can help us get there.
Parchment: Running the Z-machine
I just learned of a fun web application called Parchment. Parchment lets you play interactive fiction (I.F., aka "text adventure games") using just your web browser. It only works with I.F. in "Z-machine" format, but that's a very common format.
So go to the parchment site and try out something from their long list of interactive fiction... now you don't need to install anything! That includes my small replayable puzzle "Accuse" (my Accuse source code is already available).
If you want more information about it, here's a brief post about Parchment by its author, Atul Varma. Atul built this based on an existing program, Thomas Thurman's Gnusto. Both are open source software (using the GPLv2 license). Once again, this demonstrates the neat thing about community-developed software; one person developed a program for one circumstance, and another extended it for a different circumstance.
There are several tools available for creating interactive fiction. I've been watching Inform 7 for a while, with interest, because it takes a radically different approach to writing code. Inform 7 is a natural-language programming language that tries to actively exploit features of natural language to make developing these kinds of things easier. You can see a brief Inform 7 tutorial if you're curious, as well as the full Writing with Inform documentation. Inform 7 isn't itself OSS, though significant portions are; Inform 6 (a key substrate) and many other portions including the Inform 7 standard rules are released under the Artistic License 2.0. The extensions are released under the "Attribution Creative Commons licence"; that's not normally a license used for software, but I think it'd meet the criteria for OSS, and Fedora approves of this license for content. I hope that someday the rest will be released as OSS as well. The logic behind Inform 7 is described in "Natural Language, Semantic Analysis and Interactive Fiction" by Graham Nelson. If you're interested in some of the technical stuff behind it, the text of the Standard Rules, the text of the extensions, Inform 7 for programmers, and the Chart of Rules can tell you more.
The Wikimedia Foundation (WMF) will change the licensing terms on all its materials — including Wikipedia. Now, all of its existing material will be released under the Creative Commons Attribution-ShareAlike (CC-BY-SA) license in addition to the current GNU Free Documentation License (GFDL). The WMF says “This change is meant to advance the WMF’s mission by increasing the compatibility and availability of free content.” This means that Wikipedia material can now be combined with the vast amount of CC-BY-SA licensed material, and Wikipedia can now include the volumes of CC-BY-SA material (that material will just be CC-BY-SA). It also makes it easier to use Wikipedia material (and other material from the Wikimedia Foundation).
I think this is a good thing overall. Incompatible licenses are a real scourge on community-developed works. Past experience shows that license incompatibility can be a real problem for free-libre/open source software (FLOSS or OSS), in particular. Bruce Perens warned about FLOSS license incompatibility back in 1999! As I argue in Make Your Open Source Software GPL-Compatible. Or Else, you should release free-libre/open source software (FLOSS) using a GPL-compatible license. You don’t need to use the GPL, but using a GPL-compatible license (like the MIT, BSD-new, LGPL, or GPL) means that people can combine your software with other software to create larger works. I show how this works in The Free-Libre / Open Source Software (FLOSS) License Slide, which has a simple graph showing how common FLOSS licenses can work together. Wikipedia articles aren’t software, but the principles still apply - licenses need to enable community-developed works, not disable them.
Now, nothing is perfect. One nice benefit of the GNU Free Documentation License (GFDL) is that it requires that readers be able to get editable versions whose format specification is available to the public (for details, see its text on “transparent” copies). This is a really nice feature of the GFDL; it counters some of the problems of proprietary formats.
The GFDL has many problems, though, when used for short works like Wikipedia articles or images. Most obviously, it requires that you include the entire text of the license with each work (see GFDL 1.3 section 2). That’s no problem for large manuals, which is what the GFDL was designed for, but it’s a big problem for short works. Nobody likes having a license longer than the article it’s attached to! This is one reason why CC-BY-SA is so widely used for short works - and since Wikipedia is primarily a large set of short works, it makes sense. Which is why I (and many others) voted to approve this change.
Now it’s certainly true that people also complain that the GFDL allows the addition of unmodifiable sections. But many GFDL items don’t have them, and Debian determined through a formal vote that “GFDL-licensed works without unmodifiable sections are free [as in freedom]”.
I should also give credit to the Wikimedia Foundation (WMF), Richard Stallman of the FSF, and Lawrence Lessig, who worked together to make this possible.
For more on the Wikimedia license modification, you can see Wikimedia license FAQ, Lawrence Lessig’s post on GFDL 1.3, GFDL 1.3: Wikipedia's exit permit, FDL 1.3 FAQ, and An open response to Chris Frey regarding GFDL 1.3.
Government-developed Unclassified Software: Default release as Open Source Software
I’d like to see this idea seriously considered and discussed: By default, unclassified software which the government paid to develop should be released to the public as open source software (unless there’s a good reason not to).
Why? Well, if “we the people” paid to develop it, then “we the people” should get it! I think this idea fits into the good government ideal of data transparency; after all, software is data. Currently, we have a lot of waste and unnecessary costs due to loss, re-development, and/or government-created monopolies. The government is not a venture capitalist (VC); people who need a VC should go to a VC.
Let me focus specifically on the United States. I think this idea easily fits into the broader ideas of transparency and open government, including the Memorandum on Transparency and Open Government. Look at all the excitement over data.gov; indeed, Apps for America is holding a contest to develop software that uses data from data.gov.
Indeed, there’s a long history of U.S. laws specifically set up to make data available. Most obviously, Freedom of Information Act (FOIA) requests make it possible to extract information from the U.S. government. 17 USC 105 and 17 USC 101 prevent the U.S. government from claiming U.S. copyright on a work “prepared by an officer or employee of the United States Government as part of that person’s official duties”. So this idea would be an extension of what’s already gone on.
Let me focus on research, and how this idea could help advance technology. Think of all the advantages if software developed by U.S.-funded research could be reused by other research projects and commercial firms. For example, imagine if other researchers could simply extend previous work by modifying previously-developed software, instead of re-building yet another version from scratch. Anyone could commercialize the research, making it more likely that it would be commercialized instead of being lost in the archives shown at the end of Raiders of the Lost Ark. Some argue that giving sole rights is the only way to commercialization, but that’s just not true; open source software is commercial software, so this is simply a different and fairer path to commercialization. In contrast, the current system inhibits all kinds of technical progress; Biere’s “The Evolution from LIMMAT to NANOSAT” (Apr 2004) found that “important details are often omitted in [research] publications and can only be extracted from source code... [Making source code available] is as important to the advancement of the field as publications”. Originally I thought of this idea for research software, and it’s not hard to see why. But when I started thinking about the reasons for doing this (especially “if ‘we the people’ paid to develop it, then ‘we the people’ should get it”), I realized that this principle applies much more broadly.
An open government directive isn’t out yet, but they’re clearly working on it. Please submit this - and other ideas like it - to them. I think there’s a lot of promise, but they can only enact and refine ideas that they’ve heard of. If you like this idea, please vote for it.
If this happened, I envision a two-stage process: (1) release of the software as an archive (so it can be downloaded), and (2) some of it will get picked up and used to start an active OSS project. The second stage might not happen for many years after the first, and that’s okay. Some will ask “how will people find it”, but I think that’s the wrong question. There are many commercial search engines that can find code, but they can only find stuff that’s web-accessible; let’s give them something to find.
Perhaps this should be done in stages. For example, perhaps it'd be best to start with software developed by research. Researchers are supposed to share their results anyway (under most cases), and the lack of software release often inhibits research (e.g., it's harder to check or repeat results). You could then broaden this to other types of software.
I’m sure there will need to be exceptions. There would need to be some sort of guidelines to figure out when to grant those exceptions, and those guidelines should be developed through lively discussion. Most obviously, if it’s a special ingredient necessary for national security, then it should be classified and not revealed in any form. I would not expect weapon systems or intelligence software to be released (though sometimes generic functions developed in them could be released). Export controls would still apply. But the exceptions should be that: Exceptions.
Wikipedia for children's schools
Wikipedia is a cool project. But if you want to hand an encyclopedia to younger children or to schools, Wikipedia is not a great choice. Wikipedia is not “child-safe”, nor is it intended to be; it includes a lot of “adult” content. Also, Wikipedia constantly suffers vandalism; the vandalism is often repaired quickly, but that’s little comfort to parents and teachers. There’s also the problem of Internet access; schools typically employ blocking software, and blocking software is fundamentally not smart. Since Wikipedia mixes material that’s okay for children with stuff that is not, Wikipedia often gets blocked by schools for children. Some schools for children just don’t have Internet access at all, for a variety of reasons. All of this makes it hard for such schools to directly use Wikipedia.
Wikipedia for schools is a cool project that compensates for this. It’s a free, hand-checked, non-commercial selection from Wikipedia, targeted around the UK National Curriculum and useful for much of the English speaking world. The current version has about 5500 articles (as much as can be fit on a DVD with good size images) and is “about the size of a twenty volume encyclopaedia (34,000 images and 20 million words)”. It was developed by carefully selecting for content, then checking for vandalism and suitability by “SOS Children volunteers”. You can download it for free from the website, or as a free 3.5GB DVD.
I also see this as a future model for Wikipedia — allow people to edit, but have a separate vetting process that identifies particular versions of an article as vetted. Then, people can choose if they want to see the latest version or the most recent vetted version. To some, this is very controversial, but I don’t see it that way. A vetting process doesn’t prevent future edits, and it creates a way for people to get what they want... material that they can have increased confidence in. The trick is to develop a good-enough vetting process (or perhaps multiple vetting/rating processes for different purposes). This didn’t make sense back when Wikipedia was first starting (the problem was to get articles written at all!), but now that Wikipedia is more mature, it shouldn’t be surprising that there’s a new need to identify vetted articles. Yes, you have to worry about countries to whom “democracy” is a dirty word, but I think such problems can be resolved. This is hardly a new idea; see Wikimedia’s article on article validation, Wikipedia’s pushing to 1.0, WikiQA by Eloquence, and FlaggedRevs. I am sure that a vetting/validation process will take time to develop, and it will be imperfect... but that doesn’t make it a bad idea.
So anyway, if you know or have younger kids, check out Wikipedia for schools. This is a project that more people should know about.
FLOSS doubles every 14 months!
I just took a look at Red Hat's 2009 brief to the European Patent Office on why software patents should not be allowed. It's a nice brief, noting that software patents hinder software innovation, and that there is a sound legal basis not to expand availability of such patents in Europe. (Here's Red Hat's press release, and Glyn Moody's comments (ComputerWorld UK) on it).
Their brief points to another paper with very interesting results: "The Total Growth of Open Source" by Amit Deshpande and Dirk Riehle (Proceedings of the Fourth Conference on Open Source Systems (OSS 2008). Springer Verlag, 2008. Page 197-209). In this paper, they analyze the growth of more than 5000 open source software projects, and show that "the total amount of source code as well as the total number of open source projects is growing at an exponential rate." In their conclusion they state that the "total amount of source code and the total number of projects double about every 14 months."
That is an extraordinary rate of growth. Exponential growth can start small, but when it continues it will completely flatten anything not growing exponentially (or growing as fast). This result is consistent with my earlier work, More than a Gigabuck: Estimating GNU/Linux's Size, which also found very rapid growth in free/libre/open source software (FLOSS).
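To get a feel for what doubling every 14 months means, here's a small sketch that turns it into a growth factor over any number of months. It assumes the doubling rate stays perfectly steady, which the paper does not promise, and growth_factor is a made-up name.

```shell
#!/bin/sh
# Growth factor after a given number of months.
# ASSUMPTION: steady doubling every 14 months; growth_factor is hypothetical.
growth_factor() {
    awk -v months="$1" 'BEGIN { printf "%.2f\n", 2 ^ (months / 14) }'
}

growth_factor 12     # one year: prints 1.81
growth_factor 120    # ten years
```

So a year of growth at that rate is roughly an 80% increase, and a decade of it multiplies the total several hundred times over.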
So if you're interested in software trends, take a look at "The Total Growth of Open Source" and Red Hat's brief to the EPO on software patents. I think they're both worth reading.
Geocities, a web hosting site sponsored by Yahoo, is shutting down. Which means that, barring lots of work by others, all of its information will be disappearing forever. Jason Scott is trying to coordinate efforts to archive GeoCities' information, but it's not easy. He estimates they're archiving about 2 Gigabytes/hour, pulling in about 5 Geocities sites per second... and they don't know if it'll be enough. What's more, the group has yet to figure out how to serve it: "It is more important to me to grab the data than to figure out how to serve it later.... I don't see how the final collection won’t end up online, but how is elusive..."
This sort of thing happens all the time, sadly. Some company provides a free service for your site / blog / whatever... and so you take advantage of it. That's fine, but if you care about your site, make sure you own your data sufficiently so that you can move somewhere else... because you may have to. Yahoo is a big, well-known company, which paid $3.5 billion for Geocities... and now it's going away.
Please own your own site — both its domain name and its content — if it's important to you. I've seen way too many people have trouble with their sites because they didn't really own them. Too many scams are based on folks who "register" your domain for you, but actually register it in their own names... and then hold your site as a hostage. Similarly, many organizations provide wonderful software that is unique to their site for managing your data... but then you either can't get your own data, or you can't use your data because you can't separately get and re-install the software to use it. Using open standards and/or open source software can help reduce vendor lock-in — that way, if the software vendor/website disappears or stops supporting the product/service, you can still use the software or a replacement for it. And of course, continuously back up your data offsite, so if the hosting service disappears without notice, you still have your data and you can get back on.
I practice what I preach. My personal site, www.dwheeler.com, has moved several times, without problems. I needed to switch my web hosting service (again) earlier in 2009, and it was essentially no problem. I just used "rsync" to copy the files to my new hosting service, changed the domain information so people would use the new hosting service instead, and I was up and running. I've switched web servers several times, but since I emphasize using ordinary standards like HTTP, HTML, and so on, I haven't had any trouble. The key is to (1) own the domain name, and (2) make sure that you have your data (via backups) in a format that lets you switch to another provider or vendor. Do that, and you'll save yourself a lot of agony later.
Why copyright damage limits don't hurt FLOSS
There's a move afoot to argue that copyright infringement penalties should bear a rational relationship to the value of what was infringed. You might think that this could harm Free/Libre/Open Source Software (FLOSS), but I don't think so. Here's why.
First: This is all being brought to a head by the current file-sharing lawsuit against Boston University graduate student Joel Tenenbaum, which raises a number of interesting questions. One issue that I find particularly interesting is the issue of statutory damages: Are fines from $750 to $150,000 per song (worth at most $1), non-commercially shared without permission, even legal under the US Constitution? Or, are these fines so excessive that they are unconstitutional? Ars Technica gives a brief summary of the case, if you haven't been following it. The Free Software Foundation (FSF)'s Amicus Brief in Connection with defendant's motion to dismiss on grounds of unconstitutionality of copyright act statutory damages as applied to infringement of single MP3 files argues that these penalties grossly exceed the crime; the FSF argues that the "State Farm/Gore due process test applicable to punitive damage awards is likewise applicable to statutory damages, and in particular bars the suggestion that each infringement of an MP3 file having a retail value of 99 cents or less may be punishable by statutory damages of from $750 to $150,000 -- or from 2,100 to 425,000 times the actual damages".
Frankly, I think the FSF and Tenenbaum have a reasonable argument on this point. People who shoplift a CD from a store would definitely pay penalties when caught, but those penalties would bear some relationship to the value of the property stolen, and would be far smaller than a file-sharer's. This notion that the "punishment should fit the crime" is certainly not new; Proverbs 6:30-31 talks about thieves paying sevenfold if they are caught. That doesn't make such actions right - but unjust penalties aren't right either. I think a lot of the problem is that copyright laws were originally written when only rich people with printing presses could really make and distribute many copies of material. Today, 8-year-olds can distribute as much information as the New York Times, and the law hasn't caught up.
But does the FSF risk subverting Free/Libre/Open Source Software (FLOSS) by making this argument? After all, FLOSS developers also depend on copyright law to enforce certain conditions, and often charge $0 for copies of their software. If the penalties would be limited to "7 times the original cost", would that make FLOSS development impossible?
I don't think there's any problem, but for some people that may not be obvious. The difference is that in a typical music copyright infringement case, the file-sharer could have purchased the right to do what they're doing for a relatively low price, something typically not true for FLOSS. For example, under normal circumstances it's perfectly legal to buy a song for $1, and then transfer that song to someone else (as long as you destroy your own copies), so sharing that song with 10 people is legal after paying $10.
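The arithmetic behind that comparison can be sketched in a few lines of Python. This is a toy comparison, not a legal calculation: the function names are hypothetical, and the sevenfold cap is just the "7 times the original cost" thought experiment from above, not anything in current law.

```python
PRICE_PER_COPY = 1.00        # dollars; the example price for one song
RECIPIENTS = 10              # people the song is shared with
SEVENFOLD = 7                # HYPOTHETICAL cap from the thought experiment
STATUTORY_MIN = 750.00       # per-song statutory range quoted earlier
STATUTORY_MAX = 150_000.00


def licensed_cost(price, copies):
    """Cost of simply buying one legal copy per recipient."""
    return price * copies


def capped_penalty(price, copies, multiple=SEVENFOLD):
    """Penalty if damages were capped at a multiple of the purchase cost."""
    return multiple * licensed_cost(price, copies)


cost = licensed_cost(PRICE_PER_COPY, RECIPIENTS)
cap = capped_penalty(PRICE_PER_COPY, RECIPIENTS)
print(f"Buying the copies legally:  ${cost:,.2f}")
print(f"Hypothetical sevenfold cap: ${cap:,.2f}")
print(f"Statutory range, one song:  ${STATUTORY_MIN:,.2f} to ${STATUTORY_MAX:,.2f}")
```

With these numbers the legal route costs $10 and the hypothetical sevenfold cap comes to $70, while the statutory range for that one song starts at $750.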
In contrast, violations of FLOSS licenses often can't be made legal by simply buying the rights. If you violate the revised BSD license by removing all credits to the original author, there's typically no "alternative" legal version available for sale without the author credits. (Indeed, under legal systems with strict "moral rights" it may not even be possible.) Similarly, if you violate the GPL by releasing binary software yet refusing to release its source code, there's often no way to pay additional money to the original authors for that privilege. In some cases, GPL'ed software is released under a dual license (e.g., "GPL or proprietary"), with the proprietary version costing additional money; in those cases you do have a value that you can compare against, and you should use that value to help determine the penalty. Otherwise, a much stiffer penalty is justified, because there is no way for infringers to "buy" their way out, and their actions risk making functional products (not just entertainment) unsupportable. In United States Court of Appeals for the Federal Circuit case 2008-1001, Jacobsen v. Katzer, the court essentially found that failing to obey the conditions of an open source software license led to copyright infringement. (For more on this particular case, see New Open Source Legal Decision: Jacobsen & Katzer and How Model Train Software Will Have an Important Effect on Open Source Licensing.)
So I think that it does make sense to limit copyright penalties based on the value of the original infringed item... but that doing so does not (necessarily) put FLOSS development processes at risk.