The good news: There's lots of digital audio and video available through the Internet (some free, some pay-for). The bad news: Lots of audio and video is locked up in formats that aren't open standards. This makes it impractical for people to use them on arbitrary devices, shift the media between devices, and so on. This hurts product developers too; they've become vulnerable to massive lawsuits. Even though the MPEG standards are ratified by ISO and are often used - MP3 is particularly common for audio - they are not open standards. In particular, they are subject to a raft of patents, which prevent arbitrary use (e.g., by free-libre / open source software). Things are even worse if you use a format with DRM (aka "Digital Restrictions Management"). DRM tries to arbitrarily restrict how you can use the media you've paid for; when the company decides to abandon support for that DRM format, you've effectively lost all the money you spent on the audio and video media (examples of DRM abandonment include Microsoft's MSN Music, Microsoft's PlaysForSure which is not supported by Microsoft Zune, Yahoo! Music Store, and Walmart's DRM-encumbered music).
Thankfully, there's a solution, and that's Ogg (as maintained by the Xiph.org foundation). Ogg is a "container format" that can contain audio, video, and related material. Audio and video can be encoded inside Ogg using one of several encodings, but usually audio is encoded with "Vorbis" and video is encoded with "Theora". For perfect sound reproduction, you can use "FLAC" instead of Vorbis (but for most circumstances, Vorbis is the better choice).
I encourage you to use Ogg, and I'm not the only one. Wikipedia requires that audio and video be in Ogg Vorbis and Ogg Theora format (respectively); according to Alexa, Wikipedia is the 8th most popular website in the U.S. (as of Oct 2, 2008). The Free Software Foundation (FSF)'s "Play Ogg" campaign is encouraging the use of Ogg, too. Xiph.org's 2007 press release and its "About Xiph" page explain some of the reasons for preferring Ogg.
So, please seek out and create Ogg files! Their file extensions are easily recognized: ".ogg" (Ogg Vorbis sound), ".oga" (Ogg audio using other codecs like FLAC), and ".ogv" (Ogg video, typically Theora plus Vorbis). If you need to download software to play Ogg files, FSF Ogg's "how" page or Xiph.org's home page will explain how to download and install software to play Ogg files (they're free, in all senses!). Many video players can play Ogg already; among them, VLC (from VideoLAN) is often recommended as a player.
Probably the big news is that the next version of Mozilla's Firefox will include Ogg - built in! So soon, you can just install Firefox, and you'll have Ogg support. That should encourage even more use of Ogg, because there will be so many more people who have Ogg (or can install it easily), as well as lots of reasons to install such software.
If you want more technical details, you can see the Wikipedia article on Ogg. You can also see Internet standard RFC 5334, which discusses the basic file extensions and MIME types, as well as pointing to other technical documents.
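As a small illustration of the extension-to-MIME-type mapping that RFC 5334 describes, here is a minimal Python sketch. The function name `ogg_mime_type` and the dictionary are hypothetical names chosen for illustration, not part of any library; the `.spx` and `.ogx` entries come from RFC 5334 itself rather than from the extensions discussed above.

```python
import os

# Extension-to-MIME-type mapping as given in RFC 5334.
OGG_MIME_TYPES = {
    ".ogg": "audio/ogg",        # Ogg Vorbis audio
    ".oga": "audio/ogg",        # Ogg audio using other codecs such as FLAC
    ".spx": "audio/ogg",        # Ogg Speex audio
    ".ogv": "video/ogg",        # Ogg video (typically Theora plus Vorbis)
    ".ogx": "application/ogg",  # general multiplexed Ogg
}


def ogg_mime_type(filename):
    """Return the RFC 5334 MIME type for filename, or None if it is not Ogg."""
    extension = os.path.splitext(filename)[1].lower()
    return OGG_MIME_TYPES.get(extension)


print(ogg_mime_type("lecture.ogv"))
print(ogg_mime_type("song.mp3"))
```

A web server or media library could use a table like this to label Ogg files correctly, so that browsers and players recognize them.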
Currently there is a babel of formats out there, and most of the more common ones are not open standards. I have no illusions that this babel will instantly disappear, with everyone using Ogg by tomorrow. Getting a new audio or video format used is a difficult chicken-and-egg problem: People don't want to release audio or video until everyone can play them, and people don't want to install format players until there's something to play.
But with Wikipedia, Firefox, and many others all working to encourage the Ogg format, I think the chicken-and-egg problem has been overcome. I'm now discovering that all sorts of organizations support Ogg, such as Metavid (who provide video footage from the U.S. Congress in Ogg Theora format). Groklaw interviewed Richard Hulse of Radio New Zealand, who explained why they recently added support for Ogg Vorbis. Many other radio stations support Ogg; I've confirmed support by the Canadian Broadcasting Corporation (CBC) (Radio feeds 1 and 2), WPCE, and WBUR (Xiph.org has a much longer list of stations supporting Ogg). Ogg is widely used in games; there's Ogg support in the engines for Doom 3, Unreal Tournament 2004, Halo: Combat Evolved, Myst IV: Revelation, Serious Sam: The Second Encounter, Lineage 2, Vendetta Online, and the Grand Theft Auto engines (Xiph.org has a longer list of games). In short, there are now enough Ogg players, and Ogg media, to get the ball rolling.
In particular: Don't buy a portable audio (music) player unless it can play Ogg Vorbis. Xiph has a list of audio players that support Ogg Vorbis (read the details for the player you're considering!). If a manufacturer doesn't support Ogg, complain to them until they fix the problem.
path: /oss | Current Weblog | permanent link to this entry
Developers: Use System Libraries!
The packagers from a variety of GNU/Linux distributions are informally uniting to tell software developers a simple story: "Use system libraries - don't create local copies of libraries!"
The latest push came from Toshio Kuratomi's email "Uniting to get upstreams to use system libraries". Fedora, like most distributions, has a guideline that "a package should not link against a local copy of a library... libraries should be included in the system and applications should link against that [instead]". Toshio lists two reasons why this guideline exists (I know there are other reasons too):
I'm big on security, so reason #1 is a good-enough reason to me. The Fedora packaging rules note that the fixes aren't actually limited to security issues; not duplicating system libraries "prevents old bugs and security holes from living on after the core system libraries have been fixed." But I think the more important reason is hinted at in the last part of reason #2. No one - not even a big FLOSS project - has infinite resources. Different people will find different problems when they use a library. If the many different applications that use a library report problems back to the library maintainers, the library maintainers can fix the problem. Then, the fix will benefit everyone who depends on the library. If every application has its own local variant of a library, then each one will have defects that were fixed in other variants.
Toshio then notes: "In the world of C applications and libraries, we don't often run into this problem anymore. Most C application developers have learned the same lessons we have. However, in the java, mono/.net, and web application worlds, this [duplication of libraries is still] a common practice. Sometimes our packagers find themselves trying to convince upstream to change what they do without success -- upstream is convinced that they need to include these local copies." In some cases (particularly for Java), there were historical reasons that they had to do this due to licensing. But as those reasons have diminished, the practices haven't gone away.
Fedora, Debian, openSUSE, Gentoo, and Mandriva all have policies/guidelines specifically recommending or requiring that packages not have their own special copies of libraries. All of these distributions clearly explain that applications should use normal libraries instead. Unfortunately, software developers for non-C programs don't seem to be hearing the message. That makes it really hard to package those programs for use by end-users. As a result, applications are often harder to install, or the easily-installed versions are much delayed, because of unnecessary difficulties in packaging the program for end-users.
Yes, in a few cases a special copy of a library may be necessary. Granted. But it's often unnecessary, and it should be the exception, not the rule. At the very least, it should be trivial to build a FLOSS application from source code so that it uses the system's libraries instead of some local copy of the libraries.
So developers, please, try to work with the standard libraries instead of creating your own modified copy. Packagers - and users - around the world will thank you.
Challenges for securing closed source software
I've just learned of a really interesting article by Chad Perrin, "10 security challenges facing closed source software". He starts with my Secure Programming for Linux and Unix HOWTO book's list of "core requirements for developing secure software", which was part of the section on developing secure open source software. My list was really simple:
At the time I made that list, I was primarily thinking about it as a set of requirements for open source software. Chad Perrin had the interesting insight that the list applies to closed source software too... and then examined what the challenges are. It's a really interesting list; I suggest taking a look at it! He closes with a very interesting claim: "None of these disadvantages for closed source software are inflexible or absolute. There’s no reason closed source software developed by a corporate vendor can’t be as secure as an open source equivalent. It should be pretty obvious that, all else being equal, the trend is for circumstances to favor the security of open source software — at least as far as these principles of software security are concerned."
FLOSS License Proliferation: Still a problem
License proliferation in free-libre / open source software (FLOSS) licenses is less than it used to be, but it's still a serious problem. There are, thankfully, some interesting rumblings to try to make things better.
Russ Nelson at the Open Source Initiative (OSI) wants to restart a FLOSS license anti-proliferation committee to address the problem that there are too many FLOSS licenses. He wants to set up a process to establish two tiers, "recommended" and "compliant". There's no telling if the work will be successful, but the basic concept sounds very reasonable to me.
Matt Asay counters that "Someone needs to tell the Open Source Initiative, Google, and others who fret about license proliferation that the market has already cut down the number of actively used licenses to just a small handful: L/GPL, BSD/Apache, MPL, and a few others (EPL, CPL)... It's a worthy cause, but one that has already been effectively fought and settled by the free market. I would hazard a guess that upwards of 95 percent of all open-source projects are licensed under less than 5 percent of open-source licenses. (The last time I checked, 88 percent of Sourceforge projects were L/GPL or BSD. It's been a non-issue for many years.) There is no open-source proliferation problem. Do we have a lot of open-source licenses? Yes, just as we have a lot of proprietary licenses (in fact, we have many more of those). But we don't have a license proliferation problem, because very few open-source licenses actually get used on a regular basis. This is a phantom. It seems scary, but it's not real."
Asay is right that "the market" has mostly settled the issue, but I think Asay is quite wrong that there is no problem. I quite agree with Asay that there is a very short list of standard FLOSS licenses... but there's still a lot of people who, even in 2008, keep creating new incompatible FLOSS (or intended to be FLOSS) licenses for their newly-released programs. And although it's true that "very few actually get used on a regular basis", it's also true that a large number of people are still creating new, one-off FLOSS licenses that are incompatible with many widely-used licenses. Why? I think the problem is that there are still a lot of lawyers and developers who "didn't get the memo" from users and potential co-developers that new FLOSS licenses are decidedly unwelcome. As a result, new programs are still being released under new non-standard licenses.
I can even speculate why there are so many people still creating incompatible licenses, even though users and distributors don't want them. A lot of new programs are developed by people who know a lot about their technical specialty, but very little about copyright law, and also very little about FLOSS norms (both in licensing and community development processes). So they go to lawyers of their organizations. As far as I can tell, many lawyers think it's fun to create new licenses and have absolutely no clue that using a nonstandard FLOSS-like license will relegate the program to oblivion. (The primary thing that matters to a lawyer is if they or their organization can be sued; if the license causes the program to be useless, well, too bad, the lawyer still gets paid.) Indeed, many lawyers still don't even know what the requirements for FLOSS licenses are - never mind that there are license vetting procedures, or that using non-standard FLOSS licenses is widely considered harmful. So we have developers, who know they want to collaborate but don't realize that they need to follow community standards to make that work, and we have lawyers, who often don't realize that there are community standards for the licenses (and their non-selection will affect their clients).
Let me give some specific examples from recent work I'm doing, to show that this is still a problem. Right now I'm trying to get some software packaged to more rigorously prove that software does (or doesn't) do something important. I tried to get CVC3 packaged; it has "almost a BSD license", and I believe the developer intended for it to be FLOSS. Problem is, somebody thought it'd be fun to add some new nonstandard clauses. The worst clause - and I'm highly paraphrasing here - could be interpreted as, "If we developers did lots of illegal activities in creating the software, you're required to pay for our legal expenses to defend our illegal activities, even if the only thing that you did is provide copies of this software to other people, or used it incidentally." Certainly that's how I interpret it, though I'm no lawyer. When I brought this license text to Fedora legal, let's just say that they were less than enthused about endorsing this license or including the program in the distribution. Indeed, CVC3's license may make it too dangerous for anyone to use. After all, how could I possibly determine the risk that you (the developer) did something illegal? CVC3 also has another annoying incompatible license addition (compared to the BSD-new license), a "must change name if you change the code" type clause. Of course, it won't compile as-is; the only way to compile it is to change the code :-). Here's hoping that they fix this by switching to a standard license. CVC3 is not the only offender, either; there are legions of them. I examined Alt-Ergo, a somewhat similar program. It uses a FLOSS license, but it uses the remarkably weird and non-standard CeCILL-C license (this is even less well known than its cousin the CeCILL; according to Fedora it's FLOSS but GPL-incompatible, and a GPL-incompatible FLOSS license is a remarkably bad choice).
Third example - over this weekend I had a private email conversation with a developer who's about to release their software with a license; the developer intended to create (as a non-lawyer!) yet another license with incompatible non-FLOSS terms. Which would have been a big mistake.
Frankly, I think Asay is being excessively generous in his list of acceptable licenses. The standard FLOSS licenses are, I believe, simply MIT, revised BSD (BSD-new), LGPL (versions 2.1 and 3), and GPL (versions 2 and 3), and possibly the Apache 2.0 license. All of these licenses have a very large set of projects that use them, are widely understood, have been deeply analyzed by legal experts, and yet are comprehensible to both developers and users. An especially important property of this set, as you can see from my FLOSS license slide, is that they are generally compatible (with the problem that Apache 2.0 and GPLv2 aren't compatible). Compatibility is critical; if you want to use FLOSS to build serious applications, you often need to combine them in novel ways, and license incompatibilities often prevent that. As I note in Make Your Open Source Software GPL-Compatible. Or Else, the GPL is by far the most popular FLOSS license; most FLOSS software is under the GPL. So choosing a GPL-incompatible license is, in most cases, foolish. Which is a key reason I don't include the MPL in that set; not only do these licenses have vanishingly small market share compared to the set above, but their incompatibilities make their use foolish. Even Mozilla, the original creator of the MPL, essentially no longer uses the MPL (they tri-license with the GPL/LGPL/MPL, because GPL-incompatibility was a bad idea).
Having a short "OSI recommended" or "FSF recommended" list of licenses is unlikely to completely solve the problem of license proliferation. But having a semi-formal, more obviously endorsed, and easy-to-reference site that identified the short list of recommended licenses, and explained why license proliferation is bad, would help. While those well-versed in FLOSS understand things, the problem is those others who are just starting out to develop a FLOSS project. After all, the license is chosen at the very beginning of a project, when the lead developer may have the least experience with FLOSS. Anyone beginning a new project is likely to make mistakes, but there's a difference; just about any other mistake in starting a FLOSS project can be fixed fairly easily. Don't like the CM system? Switch! Don't like your hosting environment? Move! But a bad license is often extremely difficult to change; it may require agreement by a vast army of people, or those (e.g., organizational lawyers) who have no incentive to cooperate later. Yes, projects like vim and Python have done it, but only with tremendous effort.
The license mistakes of one project can even hurt other projects. Squeak is still trying to transition from early licensing mistakes, and it's still not done even though it's been working on it for years. This has impeded the packaging and wider use of nice programs like Scratch, which depend on Squeak. The Java Trap discusses some of the challenges when FLOSS requires proprietary software to run; when the FLOSS licenses are incompatible, many of the same problems apply. In short, when FLOSS licenses are incompatible, they cause problems for everyone. And when there are more than a few FLOSS licenses, it also becomes very hard to understand, keep track of, and comply with them.
Asay and Nelson have no trouble understanding the license proliferation issues; they've been analyzing FLOSS for years. But they are not the ones who need this information, anyway. It's the newcomers - the innovators coming up with the new software ideas, but who don't fully understand collaborative development and how FLOSS licensing enables it - who need this information. I don't really mean to pick on Asay in this article; it's just in this case, I think Asay knows too much, and has forgotten how many people don't yet understand FLOSS.
Documenting a short list of the "recommended licenses" would be a great boon, because it would help those innovative newcomers to FLOSS approaches avoid one of the costliest mistakes of all: Using a nonstandard license.
Free-Libre/Open Source Software (FLOSS) licenses legally enforceable - and more
The U.S. Court of Appeals for the Federal Circuit has ruled in Jacobsen v. Katzer (August 13, 2008) that Free-Libre/Open Source Software (FLOSS) licenses are legally enforceable. Specifically, it determined that in the U.S. disobeying a FLOSS license is copyright infringement (unless there are other arrangements), and not just a contract violation. This makes it much easier to enforce FLOSS licenses in the United States. It has some other very interesting things to say, too, as I show below.
Frankly, I thought this was a very obvious ruling; I find it bizarre that some people thought there was another possibility (and that this had to be appealed). After all, U.S. copyright law clearly says that the copyright holder can determine the conditions for (most) copying, and doing anything else (unless specially permitted by law) is copyright infringement. This ruling simply states that the law is what it says it is, and that FLOSS licenses are a perfectly valid set of conditions. This eliminates, in one stroke, the argument "is a license a contract or a license?" silliness. A license is, well, a license! I've thought it was quite obvious that a license is not a contract; Eben Moglen and Groklaw have both written articles on this that I find extremely persuasive. In some countries, this distinction may make no difference, but in the U.S. there is a big difference. As Andy Updegrove noted, "Under contract law, the remedy is monetary damages, which aren't likely to amount to anything involving open-source software that is given away...", but statutory damages (money awarded for a violation of law) "can be awarded for copyright infringement without requiring proof of monetary damages... people can recover attorney fees for copyright infringement cases... [and] most importantly for licenses such as the [GNU General Public License], it means that your rights to use the copyrighted work at all disappear".
You can find more about the legal implications in Groklaw's article on Jacobsen v. Katzer, the announcement on Jmri-legal-announce, and LinuxInsider. JMRI has a set of links to related articles.
The court also had many very interesting things to say about FLOSS. I suspect many will quote it because it's an official U.S. court ruling that cuts to the essence of FLOSS licensing and why it is the way it is. Let me pull out a few interesting quotes:
"We consider here the ability of a copyright holder to dedicate certain work to free public use and yet enforce an 'open source' copyright license to control the future distribution and modification of that work... Public licenses, often referred to as 'open source' licenses, are used by artists, authors, educators, software developers, and scientists who wish to create collaborative projects and to dedicate certain works to the public. Several types of public licenses have been designed to provide creators of copyrighted materials a means to protect and control their copyrights. Creative Commons, one of the amici curiae, provides free copyright licenses to allow parties to dedicate their works to the public or to license certain uses of their works while keeping some rights reserved."
"Open source licensing has become a widely used method of creative collaboration that serves to advance the arts and sciences in a manner and at a pace that few could have imagined just a few decades ago. For example, the Massachusetts Institute of Technology ('MIT') uses a Creative Commons public license for an OpenCourseWare project that licenses all 1800 MIT courses. Other public licenses support the GNU/Linux operating system, the Perl programming language, the Apache web server programs, the Firefox web browser, and a collaborative web-based encyclopedia called Wikipedia. Creative Commons notes that, by some estimates, there are close to 100,000,000 works licensed under various Creative Commons licenses. The Wikimedia Foundation, another of the amici curiae, estimates that the Wikipedia website has more than 75,000 active contributors working on some 9,000,000 articles in more than 250 languages."
"Open Source software projects invite computer programmers from around the world to view software code and make changes and improvements to it. Through such collaboration, software programs can often be written and debugged faster and at lower cost than if the copyright holder were required to do all of the work independently. In exchange and in consideration for this collaborative work, the copyright holder permits users to copy, modify and distribute the software code subject to conditions that serve to protect downstream users and to keep the code accessible. By requiring that users copy and restate the license and attribution information, a copyright holder can ensure that recipients of the redistributed computer code know the identity of the owner as well as the scope of the license granted by the original owner. The Artistic License in this case also requires that changes to the computer code be tracked so that downstream users know what part of the computer code is the original code created by the copyright holder and what part has been newly added or altered by another collaborator.
"Traditionally, copyright owners sold their copyrighted material in exchange for money. The lack of money changing hands in open source licensing should not be presumed to mean that there is no economic consideration, however. There are substantial benefits, including economic benefits, to the creation and distribution of copyrighted works under public licenses that range far beyond traditional license royalties. For example, program creators may generate market share for their programs by providing certain components free of charge. Similarly, a programmer or company may increase its national or international reputation by incubating open source projects. Improvement to a product can come rapidly and free of charge from an expert not even known to the copyright holder. The Eleventh Circuit has recognized the economic motives inherent in public licenses, even where profit is not immediate.... (Program creator 'derived value from the distribution [under a public license] because he was able to improve his Software based on suggestions sent by end-users. . . . It is logical that as the Software improved, more end-users used his Software, thereby increasing [the programmer's] recognition in his profession and the likelihood that the Software would be improved even further.')."
"... The conditions set forth in the Artistic License are vital to enable the copyright holder to retain the ability to benefit from the work of downstream users. By requiring that users who modify or distribute the copyrighted material retain the reference to the original source files, downstream users are directed to Jacobsen's website. Thus, downstream users know about the collaborative effort to improve and expand the SourceForge project once they learn of the 'upstream' project from a 'downstream' distribution, and they may join in that effort."
"... Copyright holders who engage in open source licensing have the right to control the modification and distribution of copyrighted material. As the Second Circuit explained in Gilliam v. ABC, 538 F.2d 14, 21 (2d Cir. 1976), the 'unauthorized editing of the underlying work, if proven, would constitute an infringement of the copyright in that work similar to any other use of a work that exceeded the license granted by the proprietor of the copyright.' Copyright licenses are designed to support the right to exclude; money damages alone do not support or enforce that right. The choice to exact consideration in the form of compliance with the open source requirements of disclosure and explanation of changes, rather than as a dollar-denominated fee, is entitled to no less legal recognition. Indeed, because a calculation of damages is inherently speculative, these types of license restrictions might well be rendered meaningless absent the ability to enforce through injunctive relief."
"... The clear language of the Artistic License creates conditions to protect the economic rights at issue in the granting of a public license. These conditions govern the rights to modify and distribute the computer programs and files included in the downloadable software package. The attribution and modification transparency requirements directly serve to drive traffic to the open source incubation page and to inform downstream users of the project, which is a significant economic goal of the copyright holder that the law will enforce. Through this controlled spread of information, the copyright holder gains creative collaborators to the open source project; by requiring that changes made by downstream users be visible to the copyright holder and others, the copyright holder learns about the uses for his software and gains others' knowledge that can be used to advance future software releases."
In short, this court ruling makes it clear that FLOSS licenses really are legally enforceable... so it's safe for businesses to rely on them. It also makes a number of clear statements that FLOSS really does have economic value, even when money doesn't change hands - a point I make in my article Free-Libre / Open Source Software (FLOSS) is Commercial Software.
Linus Torvalds is thinking about changing the Linux kernel version numbering scheme [Kernel Release Numbering Redux]. He said: "I _am_ considering changing just the [version] numbering... because a constantly increasing minor number leads to big numbers. I'm not all that thrilled with '26' as a number: it's hard to remember... If the version were to be date-based, instead of releasing 2.6.26, maybe we could have 2008.7 instead... I personally don't have any hugely strong opinions on the numbering. I suspect others do, though, and I'm almost certain that this is an absolutely _perfect_ 'bikeshed-painting' subject... let the bike-shed-painting begin."
Here's my proposal: Offset 2000 version numbers, i.e., "(y-2000).mm[.dd]". The first number is the year minus 2000, followed by "." and a two-digit month, optionally followed by "." and a two-digit day when there's more than one release in a month. So version 8.07 would be the first release in July 2008. If you made a later release on July 17, that later release would be 8.07.17 (so if a project makes many releases in a month, you can again determine how old a particular copy is).
Date-based version numbers have a lot going for them, because at a glance you know when it was released (and thus you can determine how old something is). If you choose the ISO order YYYY.MM.DD, the numbers sort very nicely; Debian packages often use YYYYMMDD for versioning. But there's a problem: full year numbers, or full dates in this format, are annoyingly large. For example, version numbers 2008.07.16 and 20080716 are painfully long version numbers to remember.
So, use dates, but shorten them. Since nothing today can be released before 2000, shorten it by subtracting 2000. Note that this is subtracting - there's no Y2K-like rollover problem, because the year 2100 becomes 100 and the year 3000 becomes 1000. The second number is the month; using a two-digit month means you don't have the ambiguity of determining if "2.2" is earlier or later than "2.10" (you would use "2.02" instead). If you need to disambiguate day releases (or you make additional releases in the same month), add "." and a two-digit day.
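Here is a minimal Python sketch of how a release date could be turned into a version number under this "(y-2000).mm[.dd]" scheme. The function name `offset_version` and its `include_day` parameter are hypothetical names chosen for illustration.

```python
import datetime


def offset_version(release_date, include_day=False):
    """Format a date as (year - 2000).mm, optionally followed by .dd."""
    year = release_date.year - 2000
    if year < 0:
        raise ValueError("the scheme assumes releases in 2000 or later")
    # Two-digit month, so that February is "02" and cannot be confused with "2".
    version = f"{year}.{release_date.month:02d}"
    if include_day:
        # Two-digit day, for additional releases within the same month.
        version += f".{release_date.day:02d}"
    return version


print(offset_version(datetime.date(2008, 7, 1)))
print(offset_version(datetime.date(2008, 7, 17), include_day=True))
```

Because the year is a plain subtraction rather than a two-digit truncation, a release in 2100 simply gets the year part 100.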
These version numbers are short, they're easy to compare, and they give you a clue about when it was released. Ubuntu already uses this scheme for the first two parts, so this scheme is already in use and familiar to many. This works perfectly with "natural sort" (e.g., with GNOME's Nautilus file manager or with GNU ls's "-v" option).
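To show what "easy to compare" means in practice, here is a minimal Python sketch that orders such version numbers by splitting them into integers. The helper name `version_key` and the sample versions are illustrative assumptions; this is one simple way to compare them, not the algorithm that Nautilus or GNU ls uses.

```python
def version_key(version):
    """Split a version such as '8.07.17' into a tuple of integers (8, 7, 17)."""
    return tuple(int(part) for part in version.split("."))


versions = ["10.04", "8.07.17", "8.04", "100.01", "8.07"]

# Sorting by the integer tuples puts the versions in release order.
ordered = sorted(versions, key=version_key)
print(ordered)
```

Note that a version without a day part, such as "8.07", sorts before any day-stamped release from the same month, which matches treating it as the first release of that month.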
If you use a time-based release system (see this summary of Martin Michlmayr's thesis for why you would), using this version numbering scheme is easy, and you can even talk about future releases the same way. But what if you release software based on when the features are ready - how, then, can you talk about the system under development? In that case, you can't easily call it by the version number, since you don't know it yet. But that's not really a problem. In many cases, you can just talk about the "development" branch or give a special name to the development branch (e.g., "Rawhide" for Fedora). If you need to distinguish between multiple development branches, just give each of them a name (e.g., "Hardy Heron" for Ubuntu); on release you can announce the version number of a named branch (e.g., "Hardy Heron is 8.04"). This is more-or-less what many people do now, but if a lot of us used the same system, version numbers would have more meaning than they do now.
YEARFRAC Incompatibilities between Excel 2007 and OOXML (OXML)
In theory, the OOXML (OXML) specification is supposed to define what Excel 2007 reads and writes. In practice, it's not true at all; the latest public drafts of OOXML are unable to represent many actual Excel 2007 files.
For example, at least 26 Excel financial functions depend on a parameter called "Basis", which controls how the calendar is interpreted. The YEARFRAC function is a good example of this; it returns the fraction of years between two dates, given a "basis" for interpreting the calendar. Errors in these functions can have large financial stakes.
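To give a feel for what a "basis" parameter does, here is a rough sketch covering only the two simplest conventions, usually described as Actual/360 (basis 2) and Actual/365 (basis 3). The function name `simple_year_fraction` is hypothetical. It deliberately does not attempt basis 0, 1, or 4, and it is not a statement of what Excel 2007 actually computes.

```python
from datetime import date


def simple_year_fraction(start: date, end: date, basis: int) -> float:
    """Illustrative year fraction for the two simplest day-count conventions.

    Assumption: basis 2 means actual days / 360 and basis 3 means
    actual days / 365. Other basis values are intentionally unsupported.
    """
    days = abs((end - start).days)
    if basis == 2:
        return days / 360
    if basis == 3:
        return days / 365
    raise NotImplementedError("only basis 2 and 3 are sketched here")


start, end = date(2008, 1, 1), date(2008, 7, 1)   # 182 actual days apart
print(simple_year_fraction(start, end, basis=2))  # 182 / 360
print(simple_year_fraction(start, end, basis=3))  # 182 / 365
```

Even in this simple case the same pair of dates yields two different answers depending on the basis, which is why an imprecise or incorrect definition of a basis can change financial results.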
I've posted a new document, YEARFRAC Incompatibilities between Excel 2007 and OOXML (OXML), and the Definitions Actually Used by Excel 2007 ([OpenDocument version]), which shows that the definitions of OOXML and Excel 2007 aren't the same at all. "This document identifies incompatibilities between the YEARFRAC function, as implemented by Microsoft Excel 2007, compared to how it is defined in the Office Open Extensible Mark-up Language (OOXML), final draft ISO/IEC 29500-1:2008(E) as of 2008-05-01 (aka OXML). It also identifies the apparent definitions used by Excel 2007 for YEARFRAC, which to the author’s knowledge have never been fully documented anywhere. They are not defined in the OOXML specification, because OOXML’s definitions are incompatible with the apparent definition used by Excel 2007."
"This incompatibility means that, given OOXML’s current definition, OOXML cannot represent any Excel spreadsheet that uses financial functions using “basis” date calculations, such as YEARFRAC, if they use common “basis” values (omitted, 0, 1, or 4). Excel functions that depend upon "basis" date calculations include: ACCRINT, ACCRINTM, AMORDEGRC, AMORLINC, COUPDAYBS, COUPDAYS, COUPDAYSNC, COUPNCD, COUPNUM, COUPPCD, DISC, DURATION, INTRATE, MDURATION, ODDFPRICE, ODDFYIELD, ODDLPRICE, ODDLYIELD, PRICE, PRICEDISC, PRICEMAT, RECEIVED, YEARFRAC, YIELD, YIELDDISC, and YIELDMAT (26 functions)."
I have much more information about YEARFRAC if you want it.
Oracle letter to Universities: Educate software developers on security/assurance!
I am delighted to point out a really interesting letter to Universities by Mary Ann Davidson, the Chief Security Officer of Oracle Corporation. It basically tells colleges and universities to stop ignoring security, and to instead include software security principles in their computer science curricula. I'm so delighted to see this letter, which has just been released to the public (it had been privately sent to many colleges and universities). Let me point out and comment on some great points in this letter, because I think this letter is really important.
In this letter, she notes that "many security vulnerabilities can be traced to a relatively few types of common coding errors". I've noted that myself, by the way; simply educating developers on what the common (past) mistakes are goes a long way towards eliminating vulnerabilities. She then notes, "most developers we hire have not been adequately trained in basic secure coding principles in their undergraduate or graduate computer science programs." I agree and think it's horrific; more on that in a moment. She clarifies that this is a really important problem: "Security flaws are widely recognized as a threat to national security and to the privacy and financial well being of individual citizens, in addition to the costs they impose on us and our customers." They haven't just let this be; as they note, "We have therefore had to develop and roll out our own in-house security training program at significant time and expense." Kudos to Oracle for doing such training, by the way; far too many organizations don't do that, which explains why software continues to have the same old vulnerabilities as it did 30 years ago. But clearly Oracle cannot train the world, nor is it reasonable to expect that they do so.
She also states that "We believe that the ability to recognize and avoid common errors that can result in catastrophic security failures should be a core part of computer science curricula and that the above measures will foster such change. We strongly recommend that universities adopt secure coding practices as part of their computer science curricula, to improve the security of all commercial software, and ensure that their graduates remain competitive in the job market." To that I say, Amen.
By itself, that's great, but here's the kicker: "In the future, Oracle plans to give hiring preference to students who have received such training and can demonstrate competence in software security principles." Do you see this? Students at colleges and universities that fail to properly prepare them will be at a competitive disadvantage!
Today, almost all computer science and software engineering graduates will develop software that connects to a network, or must take data from a network... yet almost all are absolutely clueless about how to do so. Not because they don't know what a "socket" is, but because they don't know how to counter attacks. And if you're hooked to a network, or take data from one, you will get attacked.
Yet the education community (with a few wonderful exceptions) still completely ignores the need to educate software developers on how to develop secure software. "It's not my job" is not just wrong; it's almost criminal. Society is depending on the educational community to educate students in the fundamentals of what they need to know. Society depends on software, and essentially every student in a software-related field will, after they graduate, write software that will be attacked. Attacks are no longer a surprise - they are a guarantee. Yet the educational system that's supposed to prepare our developers fundamentally fails to do so. Since attacks are guaranteed, and the students are guaranteed to not know how to counter them, what other results would you expect? The basics of developing secure software should be a mandatory part of computer science and software engineering undergraduate curricula. The vulnerabilities that the students will embed in software, if they do not get this education, will lead to great loss of life and the loss of billions of dollars. Sure, schools already have a lot of material to cover, but practically nothing in a computer science curriculum is as important as how to develop secure software; I can think of no other omission in the CS curriculum that causes so much damage. Don't tell me that you only teach the "fundamentals"; programming languages change, but the need for security will never go away; it is fundamental. I think computer science and software engineering departments that do not explain the basics of developing secure software to all of their undergraduate and graduate students should be shut down, as a menace to society, until they change their ways.
Oh, if you want to see more about this letter, see Mary Ann Davidson's blog article about it, "The Supply Chain Problem", where she talks about what led up to the letter, and the follow-on from it: "Last year, I got fed up enough with Oracle having to train otherwise bright and capable CS grads in secure coding 101 that I sent letters to the top 10 or so universities we recruit from (my boss came up with the idea and someone on my team executed on it - teamwork is a wonderful thing)... I am sorry to state that only one of those universities we wrote to responded to my letter... We need a revolution - an upending of the way we think about security -and that means upsetting the supply chain of software developers... To universities, I cannot but contrast the education of engineers with that of computer science majors. Engineers know that their work product must above all be safe, secure and reliable. They are trained to think this way (not pawn off 'safety' on 'testers') and their curricula builds and reinforces the techniques and mindset of safe, secure and reliable product. (A civil engineer who ignores the principles of basic structures - a core course - in an upper level class is not going to graduate, and can't dismiss structures as a 'legacy problem.')"
I would love to see many organizations banding together to sign a letter like this one. If enough organizations band together, I think many universities and colleges will finally get the message. I would expand it beyond computer science, to any curriculum with a significant amount of software development (such as software engineering, MIS, and so on), but that's a quibble. My goal is not to shut down any departments (I hope that's clear); it's to repair a serious omission in our educational system. Kudos to Mary Ann Davidson, for writing the letter and sending it to a number of Universities. When I learned of it, I begged her to please post it publicly. To her great credit, she's now done so. Thanks, from the bottom of my heart! Now colleges and universities have even fewer reasons to claim the nonsense, "well, no one wants information on developing secure software." The companies that will hire your students know otherwise.
Defining "open standards": The Digital Standards Organization (digistan.org)
Lots of people agree that we need "open standards" in information technology. The problem is, there are a lot of snake-oil salesmen who are trying to (re)define that term to mean "whatever proprietary product I'm selling".
Will we be able to choose what products we use? Will we even be able to exercise our rights (as citizens) at all? These are important questions about our future. The answers to those questions depend on whether or not we have real open standards in place for critical areas of our lives. A vendor who controls critical standards could easily decide that something that is manifestly not in our interest could be in theirs, and force us to submit to their malevolent actions. This is already a concern, and through globalization it will only get worse. We are dependent on information systems, and those who control their standards control those systems... and thus, us. It's about power; should we have any? This means that understanding what real open standards are about is vital.
In my essay "Is OpenDocument an Open Standard? Yes!", I addressed this problem of multiple different definitions by finding three widely-used definitions (Perens', Krechmer's, and the European Commission's) and merging them. After all, if a specification meets all three definitions of "open standard", then it's far more likely to be a true open standard. Problem is, with all those trees, it's hard to see the forest.
So I'm delighted to have discovered the Digital Standards Organization (digistan.org). They have a wonderfully brief definition of "open standard": "a published specification that is immune to vendor capture at all stages in its life-cycle". That can be a little mystifying, so they also provide a slightly longer definition of "open standard" that clarifies what that means:
That's a remarkably clear and simple definition, and good definitions are hard! Even better, they have posted a rationale for this definition that cuts through all the noise and nonsense, and instead gets to the heart of the matter. For example, it explains the real goals of open standards: "An open standard must be aimed at creating unrestricted competition between vendors and unrestricted choice for users. Any barrier - including RAND, FRAND, and variants - to vendor competition or user choice is incompatible with the needs of the market at large." Here's a quote from the rationale's abstract, which I think makes a lot of sense:
"Many groups and individuals have provided definitions for 'open standard' that reflect their economic interests in the standards process. We see that the fundamental conflict is between vendors who seek to capture markets and raise costs, and the market at large, which seeks freedom and lower costs. There are thus only two types of standard: franchise standards, and open standards. Vendors work hard to turn open standards into franchise standards. They work to change the statutory language so they can cloak franchise standards in the sheep's clothing of 'open standard'. Our canonical definition of open standard derives from the conclusion that this conflict lies at the heart of the matter. We define an open standard as 'a published specification that is immune to vendor capture at all stages in its life-cycle'. A full definition of 'open standard' must take into account the direct economic conflict between vendors and the market at large. Such conflicts do not end when a standard is published, so an open standard must also be immune from attack long after it has been widely implemented."
Digistan is currently asking people to sign "The Hague Declaration" by 2008-05-21. This one states why open standards are important to human liberty, in ways that non-technical people can understand. As Pieter Hintjens argues in his "Open letter to Standards Professionals, Developers, and Activists", "The Hague Declaration argues that international law and national constitutions of most democracies oblige governments to adopt open standards." If the text of this letter looks a little like Andrew Updegrove's A Proposal to Recognize the Special Status of "Civil ICT Standards" or his testimony in Texas, that's no accident; Andrew Updegrove is one of Digistan's founders.
Standards are vitally important. If we allow individual companies to control standards, then we have ensured that they will control us - and what we may do - through them. Being a non-profit helps, but even a non-profit's no guarantee; is the organization interested in maximizing implementation and competition between potential suppliers, or does it have some other motivation (such as maximizing publication revenue)?
I think making standards available at no charge is no longer a nicety; it is a necessity for a specification to be a truly open standard. When there were only a few standards, and all products were developed by large big-budget corporations, a $100 standard was not a big deal. But today there is a vast array of standards; simply buying "all relevant standards" is becoming prohibitive even for large companies with massive budgets. And those big budgets are increasingly rare; suppliers are often small organizations or individuals collaborating together, or are in countries where those kinds of funds are unavailable. Because the world now includes so many new suppliers, anything that prevents those suppliers from using standards is simply unacceptable. Don't give me the nonsense that the money is needed to help develop standards; it's not true. I've helped to develop many standards, and I never received a penny from the publication royalties. The IETF, W3C, OASIS, and many other organizations manage to publish their standards, and have for years. The world has changed. In today's world, "publish" means "freely available over the Internet without having to register for it"; if you can't Google it, it doesn't exist. The cost of putting a specification on a public web server is essentially petty cash, and not doing so means that many (if not most) of the specification's potential users cannot use it.
Open standards and free-libre / open source software (FLOSS) are not the same thing - not at all! There are some similarities, though. From a customer's point of view, both open standards and FLOSS are strategies for enabling supplier switching (by preventing lock-in). In addition, customers often don't switch to a FLOSS product, even if it's technologically superior or has lower total costs, solely because the customer is locked into an existing product due to proprietary standards (in data formats, APIs, and so on). You can choose to use open standards and not use FLOSS products, but if you use an open standard, it enables you to select a FLOSS product (now or later).
I believe, very much, in the power of competition to produce lower-cost, higher-quality, and innovative components. But competition is easily stymied through lock-in via "franchise" standards. Open standards are necessary to eliminate lock-in and bring to everyone the advantages of competition: lower cost, higher quality, and greater innovation.
Bilski: Information is physical!?
The US Court of Appeals for the Federal Circuit in Washington, DC just heard arguments in the Bilski case, where the appellant (Bilski) is arguing that a completely mental process should get a patent. The fact that this was even entertained demonstrates why the patent system has truly descended into new levels of madness. At least the PTO rejected the application; the problem is that the PTO now allows business method patents and software patents. Once they allowed them, there's no rational way to say "stop! That's ridiculous!" without being arbitrary.
Mr. David Hanson (Webb Law Firm) argued for the appellant (Bilski), and got peppered with questions. "Is a curve ball patentable?", for example. At the end, he finally asked the court to think of "information as physical"; it is therefore tangible and can be transformed.
That is complete lunacy, and it clearly demonstrates why the patent office is in real trouble.
Information is not physical, it is fundamentally different, and that difference has been understood for centuries. If I give you my car, I no longer have that car. If I give you some information, I still have the information. That is a fundamental difference in information, and always has been. The fact that Bilski's lawyer can't understand this difference shows why our patent office is so messed up.
This fundamental difference between information and physical objects was well-understood by the U.S. founding fathers. Here's what Thomas Jefferson said: "That ideas should freely spread from one to another over the globe, for the moral and mutual instruction of man, and improvement of his condition, seems to have been peculiarly and benevolently designed by nature, when she made them, like fire, expansible over all space, without lessening their density at any point, and like the air in which we breathe, move, and have our physical being, incapable of confinement or exclusive appropriation. Inventions then cannot, in nature, be a subject of property." Thomas Jefferson was a founder, and an inventor. No, they didn't have computers then, but computers merely automate the processing of information; the essential difference between information and physical/tangible objects was quite clear then.
Our laws need to distinguish between information and physical objects, because they have fundamentally different characteristics.
Basically, by failing to understand the differences, the PTO let in software patents and business method patents, which have been grossly harmful to the United States.
Even if you thought they were merely "neutral", that's not enough. There's a famous English speech about the trade-offs of copyright law, whose principles also apply here: "It is good that authors should be remunerated; and the least exceptionable way of remunerating them is by a monopoly. Yet monopoly is an evil. For the sake of the good we must submit to the evil; but the evil ought not to last a day longer than is necessary for the purpose of securing the good." - Thomas Babington Macaulay, speech to the House of Commons, February 5, 1841.
I believe that software patents need to be abolished, pronto. As I've discussed elsewhere, software patents harm software innovation, not help it.
But here in the Bilski case we see why some people have managed to sneak software patents into the patent process. In short, too many people do not understand the fundamental differences between information and physical objects. People whose thinking is that fuzzy are easily duped. Though clearly many people aren't as confused as Bilski's lawyer, I think too many people in the patent process have become so confused about the difference between physical objects and information that they don't understand why software patents are a serious problem. Patents should only apply to processes that directly change physical objects, and their scope should only cover the specifics of those changes. I add that latter part because yes, changing the number on a display does change something physical, but that is irrelevant. If you have a wholly new process for making displays (say, using a new chemical compound), that could be patentable, but changing a "5" to a "6" should not be patentable because "changing a 5 to a 6" is not fundamentally a change in nature. Taking something unpatentable and adding the phrase "doing it with a computer" should not change an unpatentable invention into a patentable one; the Supreme Court understood that, but the PTO still fails to understand that.
I think pharmaceutical companies are afraid of any patent reform laws, because they're afraid that a change in the patent system might hurt them. But if the patent system isn't fixed - by eliminating business method patents and software patents - the entire patent system might become too overwhelmed to function, and thus eventually scrapped. I don't know if pharma patents are more help than hindrance; I'm not an expert in that area. But I make my living with software, and it's obvious to me (and most other software practitioners) that software patents and business patents are becoming a massive drag on innovation. If we can't fix the patent system, we'll have to abolish the patent system completely. A lot of lawyers will be unhappy if the patent system is eliminated, but there are more non-lawyers than lawyers. If the pharma companies want to have a working patent system, then they'll need to help rein in patents in other areas, or the whole system may collapse.
Open Source Computer Emergency Response Team (oCERT)
Here's something new and interesting: the Open Source Computer Emergency Response Team (oCERT). Here's how they describe themselves: "The oCERT project is a public effort providing security handling support to Open Source projects affected by security incidents or vulnerabilities...".
They promise to keep things moving. They do permit embargo periods (where vulnerabilities are not publicly disclosed, giving time for developers to fix the problem first). More importantly, though, they have a maximum embargo time of two months; I think that's great, and important, because a lot of suppliers have abused embargo periods and failed to fix critical vulnerabilities as long as they're embargoed. These abuses often resulted in customers being exploited through mechanisms that the supplier knew about, but refused to fix in a timely manner.
Google is backing oCERT, which is certainly encouraging. Google even mentions my "three conditions" for securing software (thanks!):
This ComputerWorld article on oCERT makes some interesting points. One minor point: They worry that oCERT is using the term "CERT" without permission, but oCERT reports that they do indeed have that permission.
Securing Open Source Software (OSS)
I've just posted my presentation titled "Securing Open Source Software (OSS or FLOSS)", which is to be presented at the 8th Semi-Annual Software Assurance Forum, May 6-8, 2008, Sheraton Premiere, Tyson's Corner in Vienna, Virginia. In it, I discuss how to improve the security of an OSS component by modifying its environment, as well as securing the OSS component itself (by selecting a secure component, building a secure component from scratch, or modifying an existing component). I include a number of examples; they're necessarily incomplete, but I hope they will help people who are developing or deploying systems. (Here is "Securing Open Source Software (OSS or FLOSS)" in OpenDocument format.) Enjoy!
Microsoft Office XML (OOXML) massively defective
Robert Weir has been analyzing Microsoft's Office XML spec (aka OOXML) to determine how defective it is, with disturbing results.
Most standards today are relatively small, build on other standards, and are developed publicly over time with lots of opportunity for correction. Not OOXML; Ecma submitted Office Open XML for "Fast Track" as a massive 6,045 page specification, developed in an absurdly rushed way, behind closed doors, using a process controlled by a single vendor. It's huge primarily because it does everything in a non-standard way, instead of referring to other standards where practical as standards are supposed to do (e.g., for mathematical equations they created their own incompatible format instead of using the MathML standard). All by itself, its failure to build on other standards should have disqualified OOXML, but it was accepted for review anyway, and what happened next was predictable.
No one can seriously review such a massive document in a short time, though ISO tried; ISO's process did find 3,522 defects. It's not at all clear that the defects were fixed - there's been no time to really check, because the process for reviewing the standard simply wasn't designed to handle that many defects. But even if they were fixed - a doubtful claim - Robert Weir has asked another question, "did they find nearly all of the defects?". The answer is: Almost all of the original defects remain. By sampling pages, he's found error after error, none of which were found by the ISO process. The statistics from the sample are very clear: practically all serious errors have not been found. It's true that good standards sometimes have a few errors left in them, after review, but this isn't "just a few errors"; these clearly show that the specification is intensely defect-ridden. Less than 2% of the defects have been found, by the data we have so far, which suggests that there are over 172,000 important defects (49x3522) left to find. That's ridiculous.
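The arithmetic behind that estimate can be spelled out in a few lines. This is only a sketch of the extrapolation, assuming (as the "less than 2%" figure implies) that roughly 49 defects remain unfound for each one the ISO process caught; the function name is my own.

```python
def estimate_remaining_defects(found: int, remaining_per_found: int) -> int:
    """Extrapolate unfound defects from the number found.

    Assumption: for every defect found, `remaining_per_found` others
    were missed (49 corresponds to a 2% detection rate).
    """
    return found * remaining_per_found


found_by_iso = 3522
remaining = estimate_remaining_defects(found_by_iso, 49)
print(remaining)                                # 172578
print(found_by_iso / (found_by_iso + remaining))  # fraction found: 0.02
```

Because the sample suggests the detection rate is below 2%, the true multiplier is at least 49, so 172,578 is best read as a floor rather than a precise count.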
Want more evidence that it's defect-ridden? Look at Inigo Surguy's "Technical review of OOXML", where he examines just the WordProcessingML section's 2300 XML examples. He wrote code to check for well-formedness and validation errors, and found that more than 10% (about 300) were in error even given this trivial test. Conclusion? "While a certain number of errors is understandable in any large specification, the sheer volume of errors indicates that the specification has not been through a rigorous technical review before becoming an Ecma standard, and therefore may not be suitable for the fast-track process to becoming an ISO standard." This did not include the other document sections, and this is a lower bound on the error rate (XML could validate and still be in error). (He also confirmed that Word 2007 does not implement the extensibility requirements of the Ecma specification, so as a result it would be hard to "write an interoperable word processor with Word" using OOXML.)
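To show how trivial a well-formedness test is, here is a minimal sketch using Python's standard library. This is not Surguy's code, and the two sample fragments are made up for illustration; a real check of the specification would also need to handle namespace prefixes and schema validation, which this sketch skips.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Hypothetical examples, not taken from the OOXML specification.
samples = [
    "<p><r><t>Hello</t></r></p>",   # properly nested
    "<p><r><t>Hello</r></t></p>",   # closing tags in the wrong order
]

bad = [s for s in samples if not is_well_formed(s)]
print(f"{len(bad)} of {len(samples)} examples are not well-formed")
```

A check like this only catches the crudest mistakes, which is why an error rate above 10% on such a test is so telling.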
I think that all by itself, this vast number of errors in OOXML proves that the "Fast Track" process is completely inappropriate for OOXML. The "Fast Track" process was intended to be used when there was already a widely-implemented, industry-accepted standard that had already had its major problems addressed. That's just not the case here.
These huge error rates were predictable, too. The committee for creating OOXML wasn't even created until OpenDocument was complete, so they had to do a massive rush job to produce anything. (Doug Mahugh admitted that "Microsoft... had to rush this standard through.") They didn't reuse existing mature standards, so they ended up creating much more work for themselves. Most developers (who could have helped find and fix the defects) stayed away from the Ecma process in the first place; its rules gave one vendor complete control over what was allowed, and there was already a vendor-independent standard in place, which gave most experts no reason to participate. The Ecma process was also almost entirely closed-door (OpenDocument's mailing lists are public, in contrast), which predictably increased the error rate too.
The GNOME Foundation has been involved in OOXML's development, and here's what they say in the GNOME Foundation Annual Report 2007: "The GNOME Foundation’s involvement in ECMA TC45-M (OOXML) was the main discussion point during the last meeting.... [the] Foundation does not support this file format as the main format or as a standard..." I don't think this is as widely touted as it should be. Here's an organization directly involved in OOXML development, and it thinks OOXML should not be a standard at all.
India has already voted "no" to OOXML. I hope others do the same. Countries with the appropriate rights have until March 29 to decide. It's quite plausible that the final vote will be "no", and indeed, based on what's published, it should be "no". Open Malaysia reported on the March 2008 BRM meeting, for example. It reports that everybody "did their darnest to improve the spec... The final day was absolute mayhem. We had to submit decisions on over 500 items which we hadn't [had] the time to review. All the important issues which have been worked on repeatedly happened to appear on this final day. So it was non-stop important matters... It was a failure of the Fast Track process, and Ecma for choosing it. It should have been obvious to the administrators that submitting a 6000+ page document which failed the contradiction period, the 5 month ballot vote and poor resolution dispositions, should be pulled from the process. It should have been blatantly obvious that if you force National Bodies to contribute in the BRM and end up not deliberating on over 80% of their concerns, you will make a lot of people very unhappy... judging from the reactions from the National Bodies who truly tried to contribute on a positive manner, without having their concerns heard let alone resolved, they leave the BRM with only one decision in their mind come March 29th. The Fast Tracking process is NOT suitable for ISO/IEC DIS 29500. It will fail yet again. And this time it will be final."
In my opinion, the OOXML specification should not become an international standard, period. I think it clearly doesn't meet the criteria for "fast track" - but more importantly, it doesn't meet the needs for being a standard at all. It completely contradicts the goal of "One standard, one test - Accepted everywhere", and it simply is not an open standard. I've blogged before that having multiple standards for office documents is a terrible idea. There's nothing wrong with a vendor publishing their internal format; in fact, ISO's "type 2 technical report" or "ISO agreement" are pre-existing mechanisms for documenting the format of a single vendor and product line specification. But when important data is going to be exchanged between parties, it should be exchanged using an open standard. We already have an open standard for office documents that was developed by consensus and implemented by multiple vendors: OpenDocument (ISO/IEC 26300). For more clarification about what an open standard is, or why OpenDocument is an open standard, see my essay "Is OpenDocument an Open Standard? Yes!" OpenDocument works very well; I use it often. In contrast, it seems clear that OOXML will never be a specification that everyone can fully implement. Its technical problems alone are serious, but even more importantly, the Software Freedom Law Center's "Microsoft's Open Specification Promise: No Assurance for GPL" makes it clear that OOXML cannot be legally implemented by anyone using any license. And this matters greatly.
Andy Updegrove calls for recognition of "Civil ICT Standards", which I think helps put this technical stuff into a broader and more meaningful context. He notes that in our new "interconnected world, virtually every civic, commercial, and expressive human activity will be fully or partially exercisable only via the Internet, the Web and the applications that are resident on, or interface with, them. And in the third world, the ability to accelerate one’s progress to true equality of opportunity will be mightily dependent on whether one has the financial and other means to lay hold of this great equalizer... [and thus] public policy relating to information and communications technology (ICT) will become as important, if not more, than existing policies that relate to freedom of travel (often now being replaced by virtual experiences), freedom of speech (increasingly expressed on line), freedom of access (affordable broadband or otherwise), and freedom to create (open versus closed systems, the ability to create mashups under Creative Commons licenses, and so on)... This is where standards enter the picture, because standards are where policy and technology touch at the most intimate level. Much as a constitution establishes and balances the basic rights of an individual in civil society, standards codify the points where proprietary technologies touch each other, and where the passage of information is negotiated... what will life be like in the future if Civil ICT Rights are not recognized and protected, as paper and other fixed media disappear, as information becomes available exclusively on line, and as history itself becomes hostage to technology? I would submit that a vote to adopt OOXML would be a step away from, rather than a way to advance towards, a future in which Civil ICT Rights are guaranteed".
Ms. Geraldine Fraser-Moleketi, Minister of Public Service and Administration, South Africa, gave an interesting presentation at the Idlelo African Conference on FOSS and the Digital Commons. She said, "The adoption of open standards by governments is a critical factor in building interoperable information systems which are open, accessible, fair and which reinforce democratic culture and good governance practices. In South Africa we have a guiding document produced by my department called the Minimum Interoperability Standards for Information Systems in Government (MIOS). The MIOS prescribes the use of open standards for all areas of information interoperability, including, notably, the use of the Open Document Format (ODF) for exchange of office documents... It is unfortunate that the leading vendor of office software, which enjoys considerable dominance in the market, chose not to participate and support ODF in its products, but rather to develop its own competing document standard which is now also awaiting judgement in the ISO process. If it is successful, it is difficult to see how consumers will benefit from these two overlapping ISO standards... The proliferation of multiple standards in this space is confusing and costly." She also said, "One cannot be in Dakar without being painfully aware of the tragic history of the slave trade... As we find ourselves today in this new era of the globalised Knowledge Economy there are lessons we can and must draw from that earlier era. That a crime against humanity of such monstrous proportions was justified by the need to uphold the property rights of slave owners and traders should certainly make us more than a little cautious about what should and should not be considered suitable for protection as property."
You can get more detail from the Groklaw ODF-MSOOXML main page, but I think the point is clear. The world doesn't need the confusion of a specification controlled by a single vendor being labelled as an international standard. NoOOXML has a list of reasons to reject OOXML.
path: /misc | Current Weblog | permanent link to this entry
Twisted Mind of the Security Pro
Bruce Schneier's "Inside the Twisted Mind of the Security Professional" is highly recommended reading - he explains the different kind of thinking required to be good at making things secure. Security pros are able to see the bigger picture, and in particular, they are able to see things from an attacker's perspective.
For example, "SmartWater is a liquid with a unique identifier linked to a particular owner. 'The idea is for me to paint this stuff on my valuables as proof of ownership,' I wrote when I first learned about the idea. 'I think a better idea would be for me to paint it on your valuables, and then call the police.'" Similarly, on opening up an ant farm, his friend was surprised that the manufacturer would send you ants by mail; Bruce thought it was interesting that "these people will send a tube of live ants to anyone you tell them to."
Being able to think like an attacker is so important that in my book on writing secure programs, I gave it its own heading: paranoia is a virtue. It's still true. My thanks to Bruce Schneier for expressing this need so eloquently.
We would live in a better world if all of us could see the world as attackers do - or at least make the effort to try. In particular, we'd stop doing many foolish things in the name of "security", and instead do things that actually secured our world.
path: /security
OSS and the U.S. DoD - Questions and Answers
I've just posted Questions and Answers for 2008 "Open Source Software and DoD" Webinar. These are my attempts to answer the questions people sent me at my February "Open Source Software (OSS) and the U.S. Department of Defense (DoD)" webinar. Some of the questions were easy to answer, but some were surprisingly difficult. In some cases, I asked lawyers and got conflicting answers. But this is the best information that I could find on the topic.
For example, I explain in detail why it appears fairly clear that both the government and government contractors can release their results as open source software under the default DoD contract terms for software development (DFARS contracting clause 252.227-7014).
I also point out that even when the government isn't the copyright holder, if it releases software under an OSS license it can still enforce its license. That's because, even when it's not the copyright holder, it can still enforce the license... and because the doctrine of unclean hands will impact those who refuse to obey the license.
Several people had questions about software developed by a government employee (which can't be copyrighted in the U.S.) and how that impacts OSS. The short answer is that there's no problem; government employees can still contribute to OSS projects, for example. I also discuss some of the export control issues (especially ITAR), and how to address them.
If there are mistakes, please let me know. Thanks!
path: /oss
OSS and the U.S. DoD - Webinar
I'm going to present a webinar on "Open Source Software (OSS) and the U.S. Department of Defense (DoD)" on Feb 11, 2008, 3:00-4:30pm EST. It is open to the public, at no charge. To find out how to sign up, see http://www.dwheeler.com/oss-dod-webinar2008.html.
Here's the summary: "Open source software (OSS) has become widespread, but there are many misconceptions about it - resulting in numerous missed opportunities. This presentation will clarify what OSS is (and isn't), rebut common misunderstandings about OSS, discuss the relationship of OSS and security, discuss how to find and evaluate OSS, and explain OSS licensing (including how to combine products and select a license). It will show why nearly all extant OSS is COTS software, and thus why it's illegal (as well as foolish) to ignore OSS options."
This presentation is hosted by the Data & Analysis Center for Software (DACS), which is technically managed by the Air Force Research Laboratory - Information Directorate (AFRL/IF).
Please sign up quickly, if you're interested. There were 45 registrants in the first half hour of its announcement.
path: /oss
Readable s-expressions (sweet-expressions) draft 0.2 for Lisp-like languages
Back in 2006 I posted my basic ideas about "sweet-expressions". Lisp-based programming languages normally represent programs as s-expressions, and though they are regular, most people find them hard to read. I hope to create an alternative to s-expressions that has their advantages, and not their disadvantages. You can see more at my readable Lisp page. I've gotten lots of feedback, based both on my prototype of the idea and on the mailing list discussing it.
I've just posted a draft of version 0.2 of sweet-expressions. This takes that feedback into account; in particular, it's now much more backwards-compatible. There's still a big question about whether or not infix should be a default; see the page for more details.
Here are the improvements over version 0.1:
Here's an example of (ugly) s-expressions:
(defun factorial (n)
  (if (<= n 1)
      1
      (* n (factorial (- n 1)))))
Here's sweet-expressions version 0.1:
defun factorial (n)
  if (n <= 1)
    1
    n * factorial(n - 1)
Here is sweet-expressions version 0.2 (draft), with infix default (it figures out when you have an infix operation from the spelling of the operation):
defun factorial (n)
  if {n <= 1}
    1
    n * factorial(n - 1)
Here is sweet-expressions version 0.2 (draft), with infix non-default (you must surround every infix operator with {...}):
defun factorial (n)
  if {n <= 1}
    1
    {n * factorial{n - 1}}
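To make the {...} idea concrete, here is a minimal sketch in Python (not Lisp, and not the draft's actual reader) of how the contents of a curly-brace infix list could be rewritten into an ordinary prefix list. The function names curly_infix_to_prefix and to_sexpr are hypothetical, and the rule used here (an odd number of items, with the same operator in every even position) is an assumption made for illustration; see the draft itself for the real rules.

```python
def curly_infix_to_prefix(items):
    """Rewrite the contents of {...} as a prefix list.

    `items` is a list such as ["n", "<=", "1"]; operands may themselves be
    nested lists that are already in prefix form. ASSUMPTION: a simple
    infix list has an odd length of at least 3 and uses one operator.
    """
    if len(items) < 3 or len(items) % 2 == 0:
        raise ValueError("expected operand, operator, operand, ...")
    operators = items[1::2]          # every second item, starting at index 1
    if any(op != operators[0] for op in operators):
        raise ValueError("mixed operators need explicit nesting")
    return [operators[0]] + items[0::2]   # operator first, then the operands


def to_sexpr(tree):
    """Render a nested Python list as traditional s-expression text."""
    if isinstance(tree, list):
        return "(" + " ".join(to_sexpr(t) for t in tree) + ")"
    return str(tree)


# Build the last line of the factorial example: {n * factorial{n - 1}}
inner = curly_infix_to_prefix(["n", "-", "1"])
outer = curly_infix_to_prefix(["n", "*", ["factorial", inner]])
print(to_sexpr(outer))
```

The point of the sketch is only that the infix notation is a surface convenience: what the reader hands to the rest of the Lisp system is still a regular s-expression.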
I'm still taking comments. If you're interested, take a look at http://www.dwheeler.com/readable. And if you're really interested, please join the readable-discuss mailing list.
path: /misc
Please point me to High-Assurance Free-Libre / Open Source Software (FLOSS) Components
I'm looking for High Assurance and Free-Libre / Open Source Software (FLOSS) components. Can anyone point me to ones I don't know about? A little context might help, I suppose...
A while back I posted a paper, High Assurance (for Security or Safety) and Free-Libre / Open Source Software (FLOSS). For purposes of the paper, I define “high assurance software” as software where there’s an argument that could convince skeptical parties that the software will always perform or never perform certain key functions without fail. That means you have to show convincing evidence that there are absolutely no software defects that would interfere with the software’s key functions. Almost all software built today is not high assurance; developing high assurance software is currently a specialist’s field.
High assurance and FLOSS are potentially a great match. We achieve high assurance in scientific analysis and mathematical proofs by subjecting them to peer review, and then worldwide review. FLOSS programs, unlike proprietary programs, can receive similar kinds of review, and many FLOSS programs achieve really good reliability figures for medium assurance. So it can be easily argued that stepping up to high assurance should be easier for FLOSS than for proprietary software. In addition, there are a vast number of FLOSS tools that support developing high assurance components, including PVS, ACL2, Isabelle/HOL, prover9, and Alloy (there's lots more, see the paper for more details).
Yet it's hard to find High Assurance Free-Libre / Open Source Software (FLOSS) components. To be fair, high assurance software components are exceedingly rare in the non-FLOSS world as well. But I suspect there are more than I've found, and I hope that people will help me by pointing them out to me. I'd like to know about such components for direct use, and also simply for use as demonstrations of how to actually develop high assurance software. Today, it's nearly impossible to explain how to develop high assurance software, because there are almost no fully-published examples. Existing "formal methods successes" papers generally don't publish everything including their specs, proofs, and code... which makes it impossible to really learn how to do it. And no, I don't require that proofs prove every line of machine code; but showing how correspondence demonstrations are made would be valuable, and that's currently not well demonstrated by working examples.
If you're curious about the general topic, take a peek at High Assurance (for Security or Safety) and Free-Libre / Open Source Software (FLOSS). I've collected lots of interesting information; hopefully it will be useful to some of you. And let me know of high-assurance FLOSS components that I don't already know about.
path: /oss
Added "MapReduce" to the "Software Innovations" list
Ken Krugler's recent blog post said that my article The Most Important Software Innovations was "very good", but he was surprised that I hadn't included MapReduce as an important software innovation. Basically, MapReduce makes writing certain kinds of programs that process huge amounts of data, on vast distributed clusters, remarkably easy and efficient. (Wikipedia explains MapReduce, including links to alternative implementations like the open source Hadoop.)
It's not because I didn't know about MapReduce; I read about it almost immediately after it got published. I thought it was very promising, and even forwarded the original paper to some co-workers. I think MapReduce is especially promising because, now that we have cheap commodity computers, having a way to easily exploit their capabilities is really valuable. But even with something this promising, I didn't want to add it to my list of innovations right away - after all, maybe after a little while it turns out to be not so helpful.
Currently, there aren't many who have Google-sized clusters of computers available. But it's clear that this approach is useful in many other circumstances as well. It's new, but I think it's stood the test of time enough that it's a worthy addition... so I've just added it.
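For readers who haven't seen the pattern, here is a minimal single-process sketch in Python of the MapReduce style, using the customary word-count example. It is an illustration only, not Google's implementation or Hadoop's API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a real framework would run the map and reduce steps in parallel across many machines and handle failures.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key; a real framework does this between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all the values seen for one key into a total."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(documents):
    """Run the three steps in sequence, in one process, over all documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["the quick fox", "The lazy dog and the fox"])
print(counts["the"], counts["fox"])
```

The programmer writes only the map and reduce steps; the appeal of the approach is that distribution, grouping, and recovery are left to the framework.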
One interesting issue is that the MapReduce framework is itself built primarily on the "map" and "reduce" functions, which are far, far older. So, is MapReduce really a new idea, or is it just a high-quality implementation of an old idea? I'll accept that it's a new idea, but that can be difficult to judge. This judgment doesn't really matter unless you think software patents are a good idea (since every software patent in theory prevents progress for 20 years). But I think it's quite clear that software patents are a foolish idea, and it's clear that others have come to the same conclusion. Eric S. Maskin, an economist who has long criticised the patenting of software, recently received the 2007 Nobel Prize for Economics. Here's a nice quote: "... when patent protection was extended to software in the 1980s, [...] standard arguments would predict that R&D intensity and productivity should have increased among patenting firms. Consistent with our model, however, these increases did not occur." Someone who correctly predicted that software patents were harmful to innovation just received a Nobel prize. I hope to someday see people receive other prizes because they ended software patenting in the United States.
path: /misc
Don Macleay was my mentor and friend, and he just passed away (Oct. 15, 2007). So, this is a small blog entry in his honor.
Here's what I said at his funeral:
"In 1980, Don was the manager of a computer store.
I was only 15, but he took a chance on employing me, and I'm grateful.
He taught me much, in particular, showing by example
that you could be in business (even as a salesman!) and be an honest person.
He later moved to other companies, and I moved twice with him,
because I found that good bosses were hard to find.
Don was honest, reliable, a good friend, and an inspiration to me.
I will miss him, and I look forward to seeing him again in heaven."
I should add that he spoke at my Eagle Scout ceremony. Later on, when he moved out to the country, it was always a pleasure to visit him and his family.
Here's a part of his biography, as printed in the funeral bulletin: "Born in Washington, D.C., on October 27, 1934, Donald Macleay was raised in Falls Church. He attained the rank of Eagle Scout and graduated at the top of the first class of St. Stephen's School in 1952. In 1956, he graduated with a Bachelor of Arts in English from the Virginia Military Institute (VMI).
After serving as a Marine Corps officer, Donald Macleay spent many years in the business world before becoming a Parole Officer for the Department of Juvenile Justice in Stafford County. As well, in 1992, he was a candidate for the U.S. Congress as an Independent."
The biography goes on to note that he "valued being a Christian, a husband, a father and grandfather, and a friend." He spent much of his last years helping troubled youth in his area (Fredericksburg, VA), and from all accounts he was extraordinarily successful at helping them and their families.
path: /misc