David A. Wheeler's Blog

Thu, 20 Feb 2014

Presenting at American Society for Quality

On February 25, 2014, I will be presenting on “Open Source Software and Government” at the American Society for Quality (ASQ) Software SIG. You can join in person in McLean, Virginia; there will also be various video tele-conferencing sites, and you can join by phone or online as well.

If you’re interested, you’re welcome to join us, but you’ll need to pre-register.

path: /oss | Current Weblog | permanent link to this entry

Fri, 07 Feb 2014

William W. McCune: He made the world a better place through source code

Here I want to honor the memory of William W. (“Bill”) McCune, who helped change the world for the better by releasing software source code. I hope that many other researchers and government policy-makers will follow his lead… and below I intend to show why.

But first, I should explain my connection to him. My PhD dissertation involved countering the so-called “trusting trust” attack. In this attack, an attacker subverts the tools that developers use to create software. This turns out to be a really nasty attack. If a software developer’s tools are subverted, then the attacker actually controls the computer system running the software. This is no idle concern, either; we know that computers are under constant attack, and that some of these attacks are very sophisticated. Such subversions could allow attackers to essentially control all computers worldwide, including the global financial system, militaries, electrical systems, dams, you name it. That kind of power makes this kind of attack potentially worthwhile, but only if it cannot be detected and countered. For many years there were no good detection mechanisms or countermeasures. Then Henry Spencer suggested a potential solution… but there was no agreement that his idea would really counter attackers. That matters; how can you be absolutely certain about some claim?

The “gold standard” for knowing if something is true is a formal mathematical proof. Many important questions cannot be proved this way, all proofs depend on assumptions, and creating a formal proof is often hard. Still, a formal mathematical proof is the best guarantee we have for being certain about something. And there were a lot of questions about whether or not Henry Spencer’s approach would really counter this attack. So, I went about trying to prove that Henry Spencer’s idea really would counter the attack (if certain assumptions held).

After trying several other approaches, I found that the tools developed by Bill McCune (in particular prover9, mace4, and ivy) were perfect for my needs. These tools made my difficult work far easier, because his tools managed to mostly-automatically prove claims mathematically once they were described using mathematical statements. In the end, I managed to mathematically prove that Henry Spencer’s approach really did counter the subverted compiler problem. The tools Bill McCune developed and released made a real difference in helping to solve this challenging real-world problem. I didn’t need much help (because his tools were remarkably easy to use and well-documented), but he responded quickly when I emailed him too.
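To give a flavor of what those mathematical statements look like, here is a tiny, hypothetical prover9 input file (a made-up example, far simpler than anything in my dissertation): it lists assumptions, states a goal, and prover9 then searches for a proof automatically.

```
% Hypothetical prover9 input: assumptions plus a goal to prove.
formulas(assumptions).
  all x (man(x) -> mortal(x)).   % every man is mortal
  man(socrates).                 % Socrates is a man
end_of_list.

formulas(goals).
  mortal(socrates).              % prover9 searches for a proof of this
end_of_list.
```

Running prover9 on a file like this either reports a proof or keeps searching; mace4 does the complementary job of searching for counterexamples (models), and ivy can check the resulting proofs.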

Sadly, Bill McCune suddenly died on May 4, 2011, leaving the field of automated reasoning deprived of one of its founders (particularly in the subfields of practical theorem proving and model building). In 2013 an academic book was released in his honor (“Automated Reasoning and Mathematics: Essays in Memory of William W. McCune”, Lecture Notes in Artificial Intelligence 7788). That book’s preface has a nice tribute to Bill McCune, listing some of his personal accomplishments (e.g., the development of Otter) and other accomplishments that his tools enabled.

Bill McCune released many tools as open source software (including prover9, mace4, ivy, and the older tool Otter). This means that anyone could use the software (for any purpose), modify it, and distribute it (with or without modification). These freedoms had far-reaching effects, accelerating research in automated proving of claims, as well as speeding the use of these techniques. That book’s preface notes several of Bill McCune’s accomplishments, including the impact he had by releasing the code.

All too often the U.S. government spends a fortune on research, and then that same research has to be recreated from scratch, several times over, by other researchers (sometimes unsuccessfully). This is a tremendous waste of government money, and it can delay work by years (if the work happens at all), resulting in far less progress for the money spent. Bill McCune instead ensured that these results got out to people who could use and improve upon them. In this specific area he made software research available to many others, so that they could use it, verify it, and build on top of it.

Of course, he was not alone in recognizing the value of sharing research when it is implemented as software. The paper “The Evolution from LIMMAT to NANOSAT” by Armin Biere (April 2004) makes the same point, based on an attempt to reproduce others’ work. That paper states, “From the publications alone, without access to the source code, various details were still unclear… what we did not realize, and which hardly could be deduced from the literature, was [an optimization] employed in GRASP and CHAFF [was critically important]… Only [when CHAFF’s source code became available did] our unfortunate design decision became clear… The lesson learned is, that important details are often omitted in publications and can only be extracted from source code. It can be argued, that making source code … available is as important to the advancement of the field as publication.”

More generally, Free the Code.org argues that if government pays to develop software, then it should be available to others for reuse and sharing. That makes sense to me; if “we the people” paid to develop software, then by default “we the people” should receive it. I think it especially makes sense in science and research; without the details of how software works, results are not reproducible. Currently much of science is not reproducible (and thus not really science), though open science efforts are working to change this.

I think Bill McCune made great contributions to many, many, others. I am certainly one of the beneficiaries. Thank you, Bill McCune, so very much for your life’s work.

path: /oss | Current Weblog | permanent link to this entry

Sun, 01 Dec 2013

Shellcheck

I just learned about shellcheck, a tool that reports on common mistakes in (Bourne) shell scripts. If you write shell scripts, you should definitely check out this static analyzer. You can try it out by pasting shell scripts into its website. It is open source software, so you can also download and use it to your heart’s content.
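For a (hypothetical) taste of what it catches, consider this small script, which triggers one of shellcheck’s most common warnings (SC2086, an unquoted variable expansion that causes word splitting):

```shell
#!/bin/sh
# Hypothetical demo of the bug class behind shellcheck warning SC2086.
dir="$(mktemp -d)"
cd "$dir" || exit 1
file="my file.txt"
touch "$file"

# Unquoted: $file expands to two words, "my" and "file.txt", so ls fails.
# shellcheck would flag this line with SC2086.
if ls $file >/dev/null 2>&1; then unquoted=found; else unquoted=failed; fi

# Quoted: "$file" stays one word, so ls succeeds.
if ls "$file" >/dev/null 2>&1; then quoted=found; else quoted=failed; fi

echo "unquoted: $unquoted"   # prints "unquoted: failed"
echo "quoted: $quoted"       # prints "quoted: found"
```

shellcheck flags the unquoted `ls $file` line and suggests quoting it; as the script shows, the unquoted form actually misbehaves on filenames containing spaces.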

It even covers some of the issues identified in Filenames and Pathnames in Shell: How to do it Correctly. If you are interested in static analyzers for software, you can also see my Flawfinder home page which identifies many other static analysis tools.

path: /oss | Current Weblog | permanent link to this entry

Sat, 16 Nov 2013

Vulnerability bidding wars and vulnerability economics

I worry that the economics of software vulnerability reporting is seriously increasing the risks to society. The problem is the rising bidding wars for vulnerability information, leading to a rapidly-growing number of vulnerabilities known only to attackers. These kinds of vulnerabilities, when exploited, are sometimes called “zero-days” because users and suppliers had zero days of warning. I suspect we should create laws limiting the sale of vulnerability information, similar to the limits we place on organ donation, to change the economics of vulnerability reporting. To see why, let me go over some background first.

A big part of the insecure software problem today is that relatively few of today’s software developers know how to develop software that resists attack (e.g., via the Internet). Many schools don’t teach it at all. I think that’s ridiculous; you’d think people would have heard about the Internet by now. I do have some hope that this will get better. I teach a graduate course on how to develop secure software at George Mason University (GMU), and attendance has increased over time. But today, most software developers do not know how to create secure software.

In contrast, there is an increasing bidding war for vulnerability information by organizations who intend to exploit those vulnerabilities. This incentivizes people to search for vulnerabilities, but not report them to the suppliers (who could fix them) and not alert the public. As Bruce Schneier reports in “The Vulnerabilities Market and the Future of Security” (June 1, 2012), “This new market perturbs the economics of finding security vulnerabilities. And it does so to the detriment of us all.” Forbes ran an article about this in 2012, Meet The Hackers Who Sell Spies The Tools To Crack Your PC (And Get Paid Six-Figure Fees). The Forbes article describes what happened when French security firm Vupen broke the security of the Chrome web browser. Vupen would not tell Google how they broke in, because the $60,000 award from Google was not enough. Chaouki Bekrar, Vupen’s chief executive, said that they “wouldn’t share this [information] with Google for even $1 million… We want to keep this for our customers.” These customers do not plan to fix security bugs; they purchase exploits or techniques with the “explicit intention of invading or disrupting”. Vupen even “hawks each trick to multiple government agencies, a business model that often plays its customers against one another as they try to keep up in an espionage arms race.” Just one part of the Flame espionage software (exploiting Microsoft Update) was estimated to be worth $1 million before it became publicly known.

This imbalance in economic incentives creates a dangerous and growing mercenary subculture. You now have a growing number of people looking for vulnerabilities, keeping them secret, and selling them to the highest bidder… which will encourage more to look for, and keep secret, these vulnerabilities. After all, they are incentivized to do it. In contrast, the original developer typically does not know how to develop secure software, and there are fewer economic incentives to develop secure software anyway. This is a volatile combination.

Some think the solution is for suppliers to pay people when they report security vulnerabilities to suppliers (“bug bounties”). I do not think bug bounty systems (by themselves) will be enough, though suppliers are trying.

There has been a lot of discussion about Yahoo and bug bounties. On September 30, 2013, the article What’s your email security worth? 12 dollars and 50 cents according to Yahoo reported that Yahoo paid only $12.50 USD for each vulnerability. Even worse, this was not actual money; it was “a discount code that can only be used in the Yahoo Company Store, which sell Yahoo’s corporate t-shirts, cups, pens and other accessories”. Ilia Kolochenko, High-Tech Bridge CEO, says: “Paying several dollars per vulnerability is a bad joke and won’t motivate people to report security vulnerabilities to them, especially when such vulnerabilities can be easily sold on the black market for a much higher price. Nevertheless, money is not the only motivation of security researchers. This is why companies like Google efficiently play the ego card in parallel with [much higher] financial rewards and maintain a ‘Hall of Fame’ where all security researchers who have ever reported security vulnerabilities are publicly listed. If Yahoo cannot afford to spend money on its corporate security, it should at least try to attract security researchers by other means. Otherwise, none of Yahoo’s customers can ever feel safe.” Brian Martin, President of Open Security Foundation, said: “Vendor bug bounties are not a new thing. Recently, more vendors have begun to adopt and appreciate the value it brings their organization, and more importantly their customers. Even Microsoft, who was the most notorious hold-out on bug bounty programs, realized the value and jumped ahead of the rest, offering up to $100,000 for exploits that bypass their security mechanisms. Other companies should follow their example and realize that a simple ‘hall of fame’, credit to buy the vendor’s products, or a pittance in cash is not conducive to researcher cooperation. Some of these companies pay their janitors more money to clean their offices than they do security researchers finding vulnerabilities that may put thousands of their customers at risk.” Yahoo has since decided to establish a bug bounty system with larger rewards.

More recently, the Internet Bug Bounty Panel (founded by Microsoft and Facebook) has begun awarding bounties for public research into vulnerabilities with potentially severe security implications for the public. It has a minimum bounty of $5,000. However, it certainly does not cover everything; they only intend to pay for vulnerabilities that are widespread (affecting a wide range of products or end users), and they plan to limit bounties to severe vulnerabilities that are novel (new or unusual in an interesting way). I think this could help, but it is no panacea.

Bug bounty systems are typically drastically outbid by attackers, and I see no reason to believe this will change.

Indeed, I do not think we should mandate, or even expect, that suppliers will pay people when people report security vulnerabilities to suppliers (aka bug bounties). Such a mandate or expectation could kill small businesses and open source software development, and it would almost certainly chill software development in general. Such payments also would not deal with what I see as a key problem: the people who sell vulnerabilities to the highest bidder. Mandating payment by suppliers would get most people to send them problem reports only if the bug bounty payments were required to be larger than the payments offered by those who would exploit the vulnerability. That would be absurd; given current prices, such a requirement would almost certainly prevent a lot of software development.

I think people who find a vulnerability in software should normally be free to tell the software’s supplier, so that the supplier can rapidly repair the software (and thus fix it before it is exploited). Some people call this “responsible disclosure”, though some suppliers misuse this term. Some suppliers say they want “responsible disclosure”, but they instead appear to irresponsibly abuse the term to stifle warning those at risk (including customers and the public), as well as irresponsibly delay the repair of critical vulnerabilities (if they repair the vulnerabilities at all). After all, if a supplier convinces the researcher to not alert users, potential users, and the public about serious security defects in their product, then these irresponsible suppliers may believe they don’t need to fix it quickly. People who are suspicious about “responsible disclosure” have, unfortunately, excellent reasons to be suspicious. Many suppliers have shown themselves untrustworthy, and even trustworthy suppliers need to have a reason to stay that way. For that and other reasons, I also think people should be free to alert the public in detail, at no charge, about a software vulnerability (so-called “full disclosure”). Although it’s not ideal for users, full disclosure is sometimes necessary; it can be especially justifiable when a supplier has demonstrated (through past or current actions) that he will not rapidly fix the problem that he created. In fact, I think it’d be an inappropriate constraint of free speech to prevent people from revealing serious problems in software products to the public.

But if we don’t want to mandate bug bounties, or so-called “responsible disclosure”, then where does that leave us? We need to find some way to change the rules so that economics works with, not against, computer security.

Well, here is an idea… at least one to start with. Perhaps we should criminalize selling vulnerability information to anyone other than the supplier or the reporter’s government. Basically, treat vulnerability information like organ donation: intentionally eliminate economic incentives in a specific area for a greater social good.

That would mean that suppliers can set up bug bounty programs, and researchers can publish information about vulnerabilities to the public, but this would sharply limit who else can legally buy the vulnerability information. In particular, it would be illegal to sell the information to organized crime, terrorist groups, and so on. Yes, governments can do bad things with the information; this particular proposal does nothing directly to address it. But I think it’s impossible to prevent a citizen from telling his country’s government about a software vulnerability; a citizen could easily see it as his duty. I also think no government would forbid buying such information for itself. However, by limiting sales to that particular citizen’s government, it becomes harder to create bidding wars between governments and other groups for vulnerability information. Without the bidding wars, there’s less incentive for others to find the information and sell it to them. Without the incentives, there would be fewer people working to find vulnerabilities that they would intentionally hide from suppliers and the public.

I believe this would not impinge on freedom of speech. You can tell no one, everyone, or anyone you want about the vulnerability. What you cannot do is receive financial benefit from selling vulnerability information to anyone other than the supplier (who can then fix it) or your own government (and that at least reduces bidding wars).

Of course, you always have to worry about unexpected consequences or easy workarounds for any new proposed law. An organization could set itself up specifically to find vulnerabilities and then exploit them itself… but that’s already illegal, so I don’t see a problem there. A trickier problem is that a malicious organization (say, the mob) could create a “supplier” (e.g., a reseller of proprietary software, or a downstream open source software package) that vulnerability researchers could sell their information to, working around the law. This could probably be handled by requiring, in law, that suppliers report (in a timely manner) any vulnerability information they receive to their relevant suppliers.

Obviously some people will do illegal things anyway, but some people will avoid doing illegal things on principle, and others will avoid illegal activities because they fear getting caught. You don’t need to stop all possible cases, just enough to change the economics.

I fear that the current “vulnerability bidding wars” - left unchecked - will create an overwhelming tsunami of zero-days available to a wide variety of malicious actors. The current situation might impede the peer review of open source software (OSS), since currently people can make more money selling an exploit than in helping the OSS project fix the problem. Thankfully, OSS projects are still widely viewed as public goods, so there are still many people who are willing to take the pay cut and help OSS projects find and fix vulnerabilities. I think proprietary and custom software are actually in much more danger than OSS; in those cases it’s a lot easier for people to think “well, they wrote this code for their financial gain, so I may as well sell my vulnerability information for my financial gain”. The problem for society is that this attitude completely ignores the users and those impacted by the software, who can get hurt by the later exploitation of the vulnerability.

Maybe there’s a better way. If so, great… please propose it! My concern is that economics currently makes it hard - not easy - to have computer security. We need to figure out ways to get Adam Smith’s invisible hand to work for us, not against us.

Standard disclaimer: As always, these are my personal opinions, not those of my employer, my government, or my (deceased) guinea pig.

path: /security | Current Weblog | permanent link to this entry

Mon, 14 Oct 2013

Readable Lisp version 1.0.0 released!

Lisp-based languages have been around a long time. They have some interesting properties, especially when you want to write programs that analyze or manipulate programs. The problem with Lisp is that the traditional Lisp notation - s-expressions - is notoriously hard to read.

I think I have a solution to the problem. I looked at past (failed) solutions and found that they generally failed to be general or homoiconic. I then worked to find notations with these key properties. My solution is a set of notation tiers that make Lisp-based languages much more pleasant to work with. I’ve been working with many others to turn this idea of readable notations into a reality. If you’re interested, you can watch a short video or read our proposed solution.

The big news is that we have reached version 1.0.0 in the readable project. We now have an open source software (MIT license) implementation for both (guile) Scheme and Common Lisp, as well as a variety of support tools. The Scheme portion implements the SRFI-105 and SRFI-110 specs, which we wrote. One of the tools, unsweeten, makes it possible to process files in other Lisps as well.

So what do these tools do? Fundamentally, they implement the 3 notation tiers we’ve created: curly-infix-expressions, neoteric-expressions, and sweet-expressions. Sweet-expressions have the full set of capabilities.
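As a rough sketch (assuming a reader that supports these notations, such as our guile Scheme implementation), the tiers map onto ordinary s-expressions like this:

```
; Tier 1 - curly-infix: {a op b op c} is read as (op a b c)
{2 + 3 + 4}        ; reads as (+ 2 3 4)
; Tier 2 - neoteric: f(x y) is read as (f x y), and f{...} as (f {...})
cos(0)             ; reads as (cos 0)
factorial{n - 1}   ; reads as (factorial (- n 1))
; Tier 3 - sweet-expressions add indentation-based grouping,
; as in the factorial example below.
```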

Here’s an example of (awkward) traditional s-expression format:

(define (factorial n)
  (if (<= n 1)
    1
    (* n (factorial (- n 1)))))

Here’s the same thing, expressed using sweet-expressions:

define factorial(n)
  if {n <= 1}
    1
    {n * factorial{n - 1}}

I even briefly mentioned sweet-expressions in my PhD dissertation “Fully Countering Trusting Trust through Diverse Double-Compiling” (see section A.3).

So if you are interested in how to make Lisp-based languages easier to read, watch our short video about the readable notations or download the current version of the readable project. We hope you enjoy them.

path: /misc | Current Weblog | permanent link to this entry

Thu, 26 Sep 2013

Welcome, those interested in Diverse Double-Compiling (DDC)!

A number of people have recently been discussing or referring to my PhD work, “Fully Countering Trusting Trust through Diverse Double-Compiling (DDC)”, which counters Trojan Horse attacks on compilers. Last week’s discussion on reddit, based on a short slide show, discussed it directly, for example. There have also been related discussions such as Tor’s work on creating deterministic builds.

For everyone who’s interested in DDC… welcome! I intentionally posted my dissertation, and a video about it, directly on the Internet with no paywall. That way, anyone who wants the information can immediately get it. Enjoy!

I even include enough background material so other people can independently repeat my experiments and verify my claims. I believe that if you cannot reproduce the results, it is not science… and a lot of computational research has stopped being a science. This is not a new observation; “Reproducible Research: Addressing the Need for Data and Code Sharing in Computational Science” by Victoria C. Stodden (Computing in Science & Engineering, 2010) summarizes a roundtable on this very problem. The roundtable found that “Progress in computational science is often hampered by researchers’ inability to independently reproduce or verify published results” and, along with a number of specific steps, “reproducibility must be embraced at the cultural level within the computational science community.” “Does computation threaten the scientific method” (by Leslie Hatton and Adrian Giordani) and “The case for open computer programs” in Nature (by Darrel C. Ince, Leslie Hatton, and John Graham-Cumming) make similar points. For one of many examples, the paper “The Evolution from LIMMAT to NANOSAT” by Armin Biere (Technical Report #444, 15 April 2004) reported that they could not reproduce results because “From the publications alone, without access to the source code, various details were still unclear.” In the end they realized that “making source code… available is as important to the advancement of the field as publications”. I think we should not pay researchers, or their institutions, if they fail to provide the materials necessary to reproduce the work.

I do have a request, though. There is no patent on DDC, nor is there a legal requirement to report using it. Still, if you apply my approach, please let me know; I’d like to hear about it. Alternatively, if you are seriously trying to use DDC but are having some problems, let me know.

Again - enjoy!

path: /security | Current Weblog | permanent link to this entry

Wed, 21 Aug 2013

Open security

Modern society depends on computer systems. Yet computer security problems let attackers subvert the very systems that society depends on. This is a serious problem.

I think one approach that could help is “open security” - applying open source software (OSS) approaches to help solve computer security problems. To see why, let’s look at some background.

Back in the 1970s people collaboratively developed software that today we would call open source software or free-libre software. At the time many assumed these approaches could not scale up to big systems… but they were wrong. Software systems that would cost over a billion U.S. dollars to redevelop have been developed as open source software, and Wikipedia has used similar approaches to collaboratively develop the world’s largest encyclopedia.

So… if we can collaboratively develop multi-billion-dollar software systems and large encyclopedias, can we use the same kinds of collaborative approaches to improve computer security? I believe we can… but if we are going to do this, we need to define a term for it (so that we can agree on what we are doing!).

I propose that open security is the application of open source software (OSS) approaches to help solve cyber security problems. OSS approaches collaboratively develop and maintain intellectual works (including software and documentation) by enabling users to use them for any purpose, as well as study, create, change, and redistribute them (in whole or in part). Cyber security problems are a lack of security (confidentiality, integrity, and/or availability), or potential lack of security (a vulnerability), in computer systems and/or the networks they are a part of. In short, open security improves security through collaboration.

You can see more details in my paper What is open security? [PDF] [DOC]. I intentionally built on previous work such as the Free Software Definition by the Free Software Foundation (FSF), the Open Source Definition (Annotated) by the Open Source Initiative (OSI), the Creative Commons license work, and the Definition of Free Cultural Works by Freedom Defined (the last one is, for example, the basis of the Wikimedia/Wikipedia licensing policy).

The Open security site was recently set up so that you and others can join and get involved. So please - get involved! We are only just starting, and the direction we go depends on the feedback we get.

Further reading:

path: /oss | Current Weblog | permanent link to this entry

Tue, 06 Aug 2013

Don’t anthropomorphize computers, they hate that

A lot of people who program computers or live in the computing world - including me - talk about computer hardware and software as if they are people. Why is that? This is not as obvious as you’d think.

After all, if you read the literature about learning how to program, you’d think that programmers would never use anthropomorphic language. “Separating Programming Sheep from Non-Programming Goats” by Jeff Atwood discusses teaching programming and points to the intriguing paper “The camel has two humps” by Saeed Dehnadi and Richard Bornat. This paper reported experimental evidence on why some people can learn to program, while others struggle. Basically, to learn to program you must fully understand that computers mindlessly follow rules, and that computers just don’t act like humans. As their paper said, “Programs… are utterly meaningless. To write a computer program you have to come to terms with this, to accept that whatever you might want the program to mean, the machine will blindly follow its meaningless rules and come to some meaningless conclusion… the consistent group [of people] showed a pre-acceptance of this fact: they are capable of seeing mathematical calculation problems in terms of rules, and can follow those rules wheresoever they may lead. The inconsistent group, on the other hand, looks for meaning where it is not. The blank group knows that it is looking at meaninglessness, and refuses to deal with it. [The experimental results suggest] that it is extremely difficult to teach programming to the inconsistent and blank groups.” Later work by Saeed Dehnadi and sometimes others expands on this earlier work. The intermediate paper “Mental models, Consistency and Programming Aptitude” (2008) seemed to have refuted the idea that consistency (and ignoring meaning) was critical to programming, but the later “Meta-analysis of the effect of consistency on success in early learning of programming” (2009) added additional refinements and then re-confirmed this hypothesis. The reconfirmation involved a meta-analysis of six replications of an improved version of Dehnadi’s original experiment, and again showed that understanding that computers were mindlessly consistent was key in successfully learning to program.

So the good programmers know darn well that computers mindlessly follow rules. But many use anthropomorphic language anyway. Huh? Why is that?

Some do object to anthropomorphism, of course. Edsger Dijkstra certainly railed against anthropomorphizing computers. For example, in EWD854 (1983) he said, “I think anthropomorphism is the worst of all [analogies]. I have now seen programs ‘trying to do things’, ‘wanting to do things’, ‘believing things to be true’, ‘knowing things’ etc. Don’t be so naive as to believe that this use of language is harmless.” He believed that analogies (like these) led to a host of misunderstandings, and that those misunderstandings led to repeated multi-million-dollar failures. It is certainly true that misunderstandings can lead to catastrophe. But I think one reason Dijkstra railed particularly against anthropomorphism was (in part) that it is a widespread practice, even among those who do understand things - and I see no evidence that anthropomorphism is going away.

The Jargon file specifically discusses anthropomorphization: “one rich source of jargon constructions is the hackish tendency to anthropomorphize hardware and software. English purists and academic computer scientists frequently look down on others for anthropomorphizing hardware and software, considering this sort of behavior to be characteristic of naive misunderstanding. But most hackers anthropomorphize freely, frequently describing program behavior in terms of wants and desires. Thus it is common to hear hardware or software talked about as though it has homunculi talking to each other inside it, with intentions and desires… As hackers are among the people who know best how these phenomena work, it seems odd that they would use language that seems to ascribe consciousness to them. The mind-set behind this tendency thus demands examination. The key to understanding this kind of usage is that it isn’t done in a naive way; hackers don’t personalize their stuff in the sense of feeling empathy with it, nor do they mystically believe that the things they work on every day are ‘alive’.”

Okay, so others have noticed this too. The Jargon file even proposes some possible reasons for anthropomorphizing computer hardware and software:

  1. It reflects a “mechanistic view of human behavior.” “In this view, people are biological machines - consciousness is an interesting and valuable epiphenomenon, but mind is implemented in machinery which is not fundamentally different in information-processing capacity from computers… Because hackers accept that a human machine can have intentions, it is therefore easy for them to ascribe consciousness and intention to other complex patterned systems such as computers.” But while the materialistic view of humans has respectable company, this “explanation” fails to explain why humans would use anthropomorphic terms about computer hardware and software, since they are manifestly not human. Indeed, as the Jargon file acknowledges, even hackers who have contrary religious views will use anthropomorphic terminology.
  2. It reflects “a blurring of the boundary between the programmer and his artifacts - the human qualities belong to the programmer and the code merely expresses these qualities as his/her proxy. On this view, a hacker saying a piece of code ‘got confused’ is really saying that he (or she) was confused about exactly what he wanted the computer to do, the code naturally incorporated this confusion, and the code expressed the programmer’s confusion when executed by crashing or otherwise misbehaving. Note that by displacing from “I got confused” to “It got confused”, the programmer is not avoiding responsibility, but rather getting some analytical distance in order to be able to consider the bug dispassionately.”
  3. “It has also been suggested that anthropomorphizing complex systems is actually an expression of humility, a way of acknowledging that simple rules we do understand (or that we invented) can lead to emergent behavioral complexities that we don’t completely understand.”

The Jargon file claims that “All three explanations accurately model hacker psychology, and should be considered complementary rather than competing.” I think the first “explanation” is completely unjustified. The second and third explanations do have some merit. However, I think there’s a simpler and more important reason: Language.

When we communicate with a human, we must use some language that will be more-or-less understood by the other human. Over the years people have developed a variety of human languages that do this pretty well (again, more-or-less). Human languages were not particularly designed to deal with computers, but languages have been honed over long periods of time to discuss human behaviors and their mental states (thoughts, beliefs, goals, and so on). The sentence “Sally says that Linda likes Tom, but Tom won’t talk to Linda” would be understood by any normal seven-year-old girl (well, assuming she speaks English).

I think a primary reason people use anthropomorphic terminology is that it’s much easier to communicate that way when discussing computer hardware and software using existing languages. Compare “the program got confused” with the overly long “the program executed a different path than the one expected by the program’s programmer”. Human languages have been honed to discuss human behaviors and mental states, so it is much easier to use languages this way. As long as both the sender and receiver of the message understand the message, the fact that the terminology is anthropomorphic is not a problem.

It’s true that anthropomorphic language can confuse some people. But the primary reason it confuses some people is that they still have trouble understanding that computers are mindless - that computers simply do whatever their instructions tell them. Perhaps this is an innate weakness in some people, but I think that addressing this weakness head-on can help counter it. This is probably a good reason for ensuring that people learn a little programming as kids - not because they will necessarily do it later, but because computers are so central to the modern world that people should have a basic understanding of them.

path: /misc | Current Weblog | permanent link to this entry

Thu, 20 Jun 2013

Industry-wide Misunderstandings of HTTPS (SSL/TLS)

Industry-wide Misunderstandings of HTTPS describes a nasty security problem involving HTTPS (SSL/TLS) and caching. The basic problem is that developers of web applications do not know or understand web standards. The result: 70% of sites tested expose private data on users’ machines by recording data that is supposed to be destroyed.

Here’s the abstract: “Most web browsers, historically, were cautious about caching content delivered over an HTTPS connection to disk - to a greater degree than required by the HTTP standard. In recent years, in response to the increased use of HTTPS for non-sensitive data, and the proliferation of bandwidth-hungry AJAX and Web 2.0 sites, some browsers have been changed to strictly follow the standard, and cache HTTPS content far more aggressively than before. HTTPS web servers must explicitly include a response header to block standards-compliant browsers from caching the response to disk - and not all web developers have caught up to the new browser behavior. ISE identified 21 (70% of sites tested) financial, healthcare, insurance and utility account sites that failed to forbid browsers from storing cached content on disk, and as a result, after visiting these sites, unencrypted sensitive content is left behind on end-users’ machines.”

This vulnerability isn’t as easy to exploit as some other problems; it just means that data that should have been destroyed lingers on disk. But it still sets up serious problems: sensitive information that should have been destroyed remains on users’ machines, where anyone with later access to those machines can recover it.

This is really just yet another example of the security problems that can happen when people assume, “the only web browser is Internet Explorer 6”. That was never true, and by ignoring standards, they set themselves up for disaster. This isn’t even a new standard; HTTP version 1.1 was released in 1999, so there’s been plenty of time to fix things. Today many systems use AJAX, and SSL/TLS encryption is far more widely used; given these changed conditions, web browsers are changing in standards-compliant ways. Web application developers who followed the standard are doing just fine. The web application developers who ignored the standards are, once again, putting their users at risk.
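The server-side fix is simple: send the response headers that forbid caching. As a minimal sketch (the function name and framework hooks here are mine for illustration, not from the article), here is how a web application might stamp those headers onto a sensitive HTTPS response:

```python
def add_no_store_headers(headers):
    """Return a copy of a response-header dict with caching forbidden.

    "Cache-Control: no-store" is the HTTP/1.1 directive that tells
    standards-compliant browsers not to write any part of the response
    to disk; "Pragma: no-cache" covers old HTTP/1.0 intermediaries.
    """
    headers = dict(headers)  # don't mutate the caller's dict
    headers["Cache-Control"] = "no-store, no-cache, must-revalidate"
    headers["Pragma"] = "no-cache"
    headers["Expires"] = "0"
    return headers

# Example: headers for a page showing account statements
sensitive = add_no_store_headers({"Content-Type": "text/html"})
```

However your framework exposes response headers, the key point is the same: without an explicit `Cache-Control: no-store`, a standards-compliant browser is allowed to cache the HTTPS response to disk.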

path: /security | Current Weblog | permanent link to this entry

Tue, 30 Apr 2013

OSS License Clinic

If you’re interested in understanding the legal, contract, or government acquisition issues in applying free / libre / open source software (FLOSS), come to the “Open Source License Clinic” on May 9, 2013, 9am-noon (EDT), in Washington, DC. This clinic will be hosted by the non-profit Open Source Initiative (OSI), and is “designed as a cross-industry, cross-community workshop for legal, contract, acquisition and program professionals who wish to deepen their understanding of open source software licenses, and raise their proficiency to better serve their organizations objectives as well as identify problems which may be unique to government. Discussion of licenses and issues in straight-forward terms make the clinic of value to anyone involved in the lifecycle of a technology decision/acquisition or strategy for internal software development.”

I’m one of the speakers, along with:

The location for the license clinic will be:

101 Independence Ave SE
Madison Building, 6th Floor, Dining Room A
Washington, DC 20540

You might also be interested in the Open Source Community Summit on May 10 (the following day) in Washington, DC.

path: /oss | Current Weblog | permanent link to this entry

Thu, 21 Mar 2013

French government OSS policy

Free/libre/open source software (FLOSS) continues to grow around the world, and governments around the world are trying to establish policies about it. Yet in the U.S. we often don’t hear about them. I just posted about a UK policy; here’s a recent French policy, translated into English.

The French administration, in September 2012, established a set of guidelines and recommendations on the proper use of Free Software (aka open source software) in the French government. This is called the “Ayrault Memorandum” (circulaire Ayrault, in French) and was signed in September 2012 by the French Prime Minister. The document was mainly produced by the DISIC (the Department of Interministerial Systems Information and Communication) and the CIOs of some departments. The DISIC is in charge of coordinating the administration actions on information systems.

path: /oss | Current Weblog | permanent link to this entry

Mon, 18 Mar 2013

UK Government prefers OSS

The UK government is mandating a “preference” for open source software in its Government Service Design Manual Open Source section, to be effective April 2013. The draft manual says, “Use open source software in preference to proprietary or closed source alternatives, in particular for operating systems, networking software, web servers, databases and programming languages.”

path: /oss | Current Weblog | permanent link to this entry

Sun, 10 Mar 2013

Readable Lisp: Sweet-expressions

I’ve used Lisp-based programming languages for decades, but while they have some nice properties, their traditional s-expression notation is not very readable. Even the original creator of Lisp did not particularly like its notation! However, this problem turns out to be surprisingly hard to solve.

After reviewing the many past failed efforts, I think I have figured out why they failed. Past solutions typically did not work because they were not general (a general notation is independent of any underlying semantic) or not homoiconic (in a homoiconic notation, the underlying data structure is clear from the syntax). Once I realized that, I devised (with a lot of help from others!) a new notation, called sweet-expressions (t-expressions), that is general and homoiconic. I think this creates a real solution for an old problem.

You can download and try out sweet-expressions as released by the Readable Lisp S-expressions Project by downloading our new version 0.7.0 release.

If you’re interested, please participate! In particular, please participate in the SRFI-110 sweet-expressions (t-expressions) mailing list. SRFIs let people write specifications for extensions to the Scheme programming language (a Lisp), and this SRFI lets people in the Scheme community discuss it.

The following shows an example of traditional (ugly) Lisp s-expressions, the same thing in sweet-expressions (t-expressions), and a line-by-line explanation.

Traditional s-expressions:

(define (fibfast n)
  (if (< n 2)
    n
    (fibup n 2 1 0)))

The same thing as sweet-expressions (t-expressions):

define fibfast(n)
  if {n < 2}
    n
    fibup n 2 1 0

Line-by-line explanation:

  1. Typical function notation
  2. Indentation, infix {...}
  3. Single expr = no new list
  4. Simple function calls
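To make the infix part of the notation concrete, here is a toy sketch (my illustration of the idea, not the project’s actual reader) of the curly-infix rewrite used in `{n < 2}` above: a curly-infix list with one repeated operator becomes the ordinary prefix s-expression.

```python
def transform_curly_infix(seq):
    """Toy curly-infix transform: ['n', '<', 2] -> ['<', 'n', 2].

    A list like {a op b op c} with one repeated infix operator
    becomes the prefix form (op a b c).  Mixed operators are an
    error and would require explicit grouping.
    """
    if len(seq) == 1:            # {x} is just x
        return seq[0]
    operators = seq[1::2]        # every second element is an operator
    if len(set(operators)) != 1:
        raise ValueError("mixed infix operators need explicit grouping")
    return [operators[0]] + seq[0::2]
```

So `{n < 2}` reads as `(< n 2)`, and `{a + b + c}` reads as `(+ a b c)`; a real reader also handles nesting and the other tiers, which this sketch omits.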

path: /misc | Current Weblog | permanent link to this entry

Tue, 22 Jan 2013

Speaking at ACM DC Chapter

FYI, on 2013-03-04 I plan to speak about “Open Source Software, Government, and Cyber Security” at the Association for Computing Machinery (ACM), Washington, DC Chapter. It will be at 1203 19th St, 3rd Floor, Washington, DC. See the link for more information.

path: /oss | Current Weblog | permanent link to this entry

Ozone Widget Framework (OWF) released as OSS!

The Ozone Widget Framework (OWF) has recently been released as open source software (OSS) by the U.S. government. OWF is useful but a little tricky to explain; as their website explains, OWF is a web application that “allows users to easily access all their online tools from one location… [users can] access websites and applications with widgets [and] group them and configure … applications to interact with each other via widget intents”. Go see their website to learn more about it; here, I’ll talk about the wider implications of OWF.

To me, OWF is interesting on several fronts.

From a potential user’s point of view, this is great news. If you want something like this, well, now you can easily get it. If you’re outside the U.S. government, you previously had no access to this program at all. But even for those inside the U.S. government, this release makes OWF far easier to get, use, and improve if necessary.

But from the point-of-view of collaborative software development, this is a much bigger deal. The government all too often pays to develop software on one project, and then pays again to develop the same software on other projects that need it. Even in the rare cases where reuse happens, it is often hard for others in the government to improve the software as needed. The government often talks about “public/private partnerships”, and such partnerships are a good idea… but all too often this doesn’t happen in software development.

Here we have an awesome change. Per their original plans and a Congressional mandate, OWF is now released to the public. This means that instead of the government having to re-develop the code for every use, and for the public to have to re-develop it as well, “we the people” who paid to develop the software can actually get it.

What’s more, OWF has avoided some of the terrible mistakes that have hurt some past efforts:

  1. Sometimes software developed via government funding gets “captured” by one vendor, so that even though the government paid to have it developed, essentially no one else has the right or ability to maintain it. Once it’s captured, the cost of maintaining the software skyrockets. By releasing the software as OSS, the OWF project has avoided that problem. Instead, the OWF project can get wide use and improvements from around the world.
  2. OWF has wisely released the software under an industry-standard OSS license (in this case, Apache 2.0), instead of writing some government-unique non-standard license. Nearly all OSS is licensed under a few licenses (GPL, LGPL, BSD-new, MIT, Apache 2.0); using nonstandard or incompatible licenses greatly impedes any possibility of collaboration.
  3. OWF has wisely chosen to use a widely-used repository and development infrastructure (in this case, GitHub), instead of unnecessarily developing and maintaining its own.

The U.S. federal government was formed by “we the people”. It’s great to see the government releasing software back to the people; in the end, we’re the ones who paid to develop it. I wish the OWF project the best of success, and I hope that there will be many similar OSS projects to come.

path: /oss | Current Weblog | permanent link to this entry

Sun, 12 Aug 2012

Readable s-expressions for Lisp-based languages: Lots of progress!

Lots has been happening recently in my effort to make Lisp-based languages more readable. A lot of programming languages are Lisp-based, including Scheme, Common Lisp, emacs Lisp, Arc, Clojure, and so on. But many software developers reject these languages, at least in part because their basic notation (s-expressions) is very awkward.

The Readable Lisp s-expressions project has a set of potential solutions. We now have much more robust code (you can easily download, install, and use it, thanks to autoconfiscation), and we have a video that explains our solutions. The video on readable Lisp s-expressions is also available on YouTube.

We’re now at version 0.4. This version is very compatible with existing Lisp code; the new notations are simply a set of additional abbreviations. There are three tiers: curly-infix expressions (which add infix), neoteric-expressions (which add a more conventional call format), and sweet-expressions (which deduce parentheses from indentation, reducing the number of required parentheses).

Here’s an example of (awkward) traditional s-expression format:

(define (factorial n)
  (if (<= n 1)
    1
    (* n (factorial (- n 1)))))

Here’s the same thing, expressed using sweet-expressions:

define factorial(n)
  if {n <= 1}
    1
    {n * factorial{n - 1}}

A sweet-expression reader could accept either format, actually, since these tiers are simply additional abbreviations and adjustments that you can make to an existing Lisp reader. If you’re interested, please go to the Readable Lisp s-expressions project web page for more information and an implementation - and please join us!
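The neoteric tier can likewise be sketched in a few lines. This toy transformer (again my illustration of the idea, not the project’s implementation) rewrites the conventional call form `name(args)` into the prefix form `(name args)`:

```python
import re

def neoteric_to_sexpr(text):
    """Toy neoteric transform: rewrite name(args...) as (name args...).

    A real reader handles strings, comments, and the other tiers;
    this regex sketch only shows the basic rewrite.
    """
    # an identifier immediately followed by "(" becomes "(identifier "
    return re.sub(r"([A-Za-z_][A-Za-z0-9_!?*-]*)\(", r"(\1 ", text)
```

For example, `factorial(n)` becomes `(factorial n)`, and nested calls like `f(g(x))` become `(f (g x))`, matching how neoteric-expressions read as ordinary s-expressions.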

path: /misc | Current Weblog | permanent link to this entry

Fri, 20 Jul 2012

Release government-developed software as OSS

I encourage people to sign the white house petition to Maximize the public benefit of federal technology by sharing government-developed software under an open source license. I, at least, interpret this to include software developed by contractors (since they receive government funding). I think this proposal makes sense. Sure, some software is classified, or export-controlled, or for some other specific reason should not be released to the public. But those should be exceptions. If we the people paid to have it developed, then we the people should get it!

It is true that many petitions do not get action right away, but that objection fails to take the long view. Often an issue must be raised repeatedly before anything happens, so the lack of an immediate response does not make the effort a waste of time. The Consumer Financial Protection Bureau already has a “default share” policy, so such a policy is clearly possible.

path: /oss | Current Weblog | permanent link to this entry

Tue, 17 Jul 2012

Interview at opensource.com

FYI, opensource.com just posted an interview with me: “5 Questions with David A. Wheeler” by Melanie Chernoff, Opensource.com, 2012-07-17.

path: /oss | Current Weblog | permanent link to this entry

Sun, 08 Jul 2012

How to have a successful open source software (OSS) project: Internet Success

The world of the future belongs to the collaborators. But how, exactly, can you have a successful project with collaborators? Can we quantitatively analyze past projects to figure out what works, instead of just using our best guesses? The answer, thankfully, is yes.

I just finished reading the amazing Internet Success: A Study of Open-Source Software Commons. This landmark book by Charles M. Schweik and Robert C. English of the University of Massachusetts Amherst presents the results of five years of painstaking quantitative research to answer this question: “What factors lead some open source software (OSS) commons (aka projects) to success, and others to abandonment?”

If you’re doing serious research in how collaborative development projects succeed (or not), you have to get this book. If you’re running a project, you should apply its results, and frankly, you’d probably get quite a bit of insight about collaboration from reading it. The book focuses specifically on the development of OSS, but as the authors note, many of its lessons probably apply elsewhere. Here’s a quick review and summary.

Schweik and English examined over 100,000 projects on SourceForge, using data from SourceForge and developer surveys. Their approach to data collection and analysis is spelled out in detail in the book; the key is that they took the time to deeply dive into it. Many previous studies have focused on just a few projects, and they summarize those; while those are useful, they don’t tell the whole story. Schweik and English instead cover a broad array of projects, using quantitative analysis instead of guesswork.

Fair warning: The book is quite technical. People who are not used to statistical analysis will find some parts quite mysterious, and they answer a lot of questions you might not even have thought to ask. Because this is serious scientific research, they carefully define terms, walk through a variety of data, and present an avalanche of data. The key, though, is that they managed to find useful answers from the data, and their results are actually quite understandable.

They spend a whole chapter (chapter 7) defining the terms “success” and “abandonment”. The definitions of these terms are key to the whole study, so it makes sense that they spend time to define them. Interestingly, they switched to the term “abandonment” instead of the more common term “failure”; they found that “many projects that had ceased collaborating would not be seen as failed projects”, e.g., because that project code had been absorbed into another project or the developer had improved their development skills (where this was their purpose).

They use a very simple project lifecycle model — projects begin in initiation, and once the project has made its first software release, it switches to growth. They also categorized projects as success, abandonment, or indeterminate. Combining these produces 6 categories of project: success initiation (SI); abandonment initiation (AI); success growth (SG); abandonment growth (AG); indeterminate initiation (II); and indeterminate growth (IG). Their operational definition of success initiation (SI) is oversimplified but easy to understand: an SI project has at least one release. Their operational definition for a success growth (SG) project is very generous: at least 3 releases, at least 6 months between releases, and more than 10 downloads. Chapter 7 gives details on these; I note them here because it’s hard to follow most of the book without knowing these categories. I could argue that these definitions of success are really too generous, but even with these definitions many projects did not meet them, and it is important to understand why (so that future projects are more likely to succeed).
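Because the operational definitions are precise, they can be stated directly as checks. This sketch (my encoding of the SI and SG criteria as summarized above; the function names are mine) makes the definitions concrete:

```python
def is_success_initiation(num_releases):
    """SI: the project has made at least one release."""
    return num_releases >= 1

def is_success_growth(num_releases, months_between_releases, downloads):
    """SG (the book's generous definition): at least 3 releases,
    at least 6 months between releases, and more than 10 downloads."""
    return (num_releases >= 3
            and months_between_releases >= 6
            and downloads > 10)
```

Note how low the SG bar is; as the text argues, even with such generous criteria many projects failed to meet it, which is what makes the "why" question interesting.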

They had so much data that even supercomputers could not directly process it. Given today’s computing capabilities, that’s pretty amazing.

So, what did they learn? Quite a bit. A few specific points are described in chapter 12. For example, they had presumed that OSS projects with limited competition would be more successful, but the effect is actually mildly the other way; “successful growth (SG) projects are more frequently found in environments where there is more competition, not less”. Unsurprisingly, projects with financial backing are “much more likely to be successful than those that are not” once they are in growth stage; although financing had an effect, its effects were not as strong in initiation.

As with any research material, if you don’t have time for the details, it’s a good idea to jump to the conclusions, which in this book is chapter 13. So what does it say?

One of the key results is that during initiation (before first release), the following are the most important issues, in order of importance, for success in an OSS project:

  1. “Put in the hours. Work hard toward creating your first release.” The details in chapter 11 tell the story: If the leader put in more than 1.5 hours per week (on average), the project was successful 73% of the time; if the leader did not, the project was abandoned 65% of the time. They are not saying that leaders should only put in 2 hours a week; instead, the point is that the leader must consistently put in time for the project to get to its first release.
  2. “Practice leadership by administering your project well, and thinking through and articulating your vision as well as goals for the project. Demonstrate your leadership through hard work…”
  3. “Establish a high-quality Web site to showcase and promote your project.”
  4. “Create good documentation for your (potential) user and developer community.”
  5. “Advertise and market your project, and communicate your plans and goals with the hope of getting help from others.”
  6. “Realize that successful projects are found in both GPL-based and non-GPL-compatible situations.”
  7. “Consider, at the project’s outset, creating software that has the potential to be useful to a substantial number of users.” Remarkably, the minimum number of users is surprisingly small; they estimate that successful growth stage projects typically have at least 200 users. In general, the more potential users, the better.

None of these are earth-shattering surprises, but now they are confirmed by data instead of being merely guessed at. In particular, some items that people have claimed are important, such as keeping complexity low, were not really supported as important. In fact, successful projects tended to have a little more complexity. That is probably not because a project should strive for complexity. Instead, I suspect both successful and abandoned projects often strive to reduce complexity — so it not really something that distinguishes them — and I suspect sometimes a project that focuses on user needs has to have more complexity than one that does not, simply because user needs can sometimes require some complexity.

Similarly, they had guidance for growth projects, in order of importance:

  1. “Your goal should be to create a virtuous circle where others help to improve the software, thereby attracting more users and other developers, which in turn leads to more improvements in the software…” Do this the same way it is done in initiation: spend time, maintain goals and plans, communicate the plans, and maintain a high-quality project web site. The user community should actively interact with the development team.
  2. “Advertise and market your project.” In particular, successful growth projects are frequently projects that have added at least one new developer in the growth stage.
  3. Have some small tasks available for contributors with limited time.
  4. Welcome competition. The authors were surprised, but noted that “competition seems to favor success”. Personally, I do not find this surprising at all. Competition often encourages others to do better; we have an entire economic system based on that premise.
  5. Consider accepting offers of financing or paid developers (they can greatly increase success rates). This one, in particular, should surprise no one — if you want to increase success, pay someone to do it.
  6. “Keep institutions (rules and project governance) as lean and informal as possible, but do not be afraid to move toward more formalization if it appears necessary.”

They also have some hints on how potential OSS users (consumers) can choose OSS that is more likely to endure. Successful OSS projects have characteristics like more than 1000 downloads, users participating in the bug tracker and email lists, goals/plans listed, a development team that responds quickly to questions, a good web site, good user documentation, and good developer documentation. A larger development team is a good sign, too.

These are just some of the research highlights. For details, well, get the book!

If you’re looking for more detailed guidance on how to run an OSS project, then a good place to go is “Producing Open Source Software: How to Run a Successful Free Software Project” by Karl Fogel. If you want to do it with or in the U.S. government, you might look at Open Technology Development (OTD): Lessons Learned & Best Practices for Military Software - OSD Report, May 2011 (full disclosure: I am co-author). Both of them were written before these research results were reported, but I think they are all quite consistent with each other.

I want to give some extra kudos to the authors: They have made a vast amount of their data available so that the analysis can be re-done, and so that additional analysis can be done. (They held back some survey data due to personally-identifying information issues, which is reasonable enough.) Science depends on repeatability, yet much of today’s so-called “science” does not publish its data or analysis software, and thus cannot be repeated… and thus is not science.

The book is not perfect. It’s big and rather technical in some spots, which will make it hard reading for some. An unfortunate blot is that, while they’re usually extremely precise, there are serious ambiguities in their discussion of licensing. In particular, they have fundamentally inconsistent definitions for the terms “GPL-compatible” and “GPL-incompatible” throughout the book, making their license analysis results suspect. On page 22, they define the term “GPL-incompatible” in an extremely bizarre and non-standard way; they define “GPL-incompatible” as software in which “firms can derive new works from OSS, but are not obliged to license new derivatives under the GPL [and] are not obligated to expose the code logic in [derivative products].” In short, they seem to be using the term “GPL-compatible” as a synonym for what the rest of the world would call a “reciprocal” or “protective” license. Similarly, they seem to be defining the term “GPL-incompatible” to mean a “permissive” license. I don’t like non-standard terminology, but as long as unusual terms are defined clearly, I can deal with bizarre terminology. Yet later, on page 157, they define “GPL-compatible” completely differently, and give it its conventional meaning instead. That is, they define “GPL-compatible” as software that can be combined with the GPL (which includes not just the reciprocal GPL license, but also many permissive licenses like the MIT license). My initial guess is that the page 22 text is just wrong, but it’s hard to be sure. There is another wrinkle, too, presuming that they meant the term “GPL-compatible” in the usual sense (and that page 22 is just wrong). One of the more popular licenses, the Apache License 2.0, recently became GPL-compatible (on release of GPL version 3), even though it wasn’t before.
It’s not clear from the book that this is reflected in their data (at least I didn’t see it), if they actually used the term “GPL-compatible” in its usual sense, and there is enough Apache-licensed software that this would matter. This may just be a poor explanation of terms, but until this is cleared up, I would be cautious about its comments on licensing. Hopefully they will clear this up, and in addition, it would probably be very useful to re-run the licensing analysis to examine (1) GPL-compatible vs. GPL-incompatible, and (2) to examine the typical 3 license categories (permissive, weakly protective/reciprocal, and strongly protective/reciprocal).

So if you are interested in the latest research on how OSS projects become successful (or not), pick up Internet Success: A Study of Open-Source Software Commons. This book is a milestone in the serious study of collaborative development approaches.

What’s especially intriguing is that success is very achievable. While initiating your project, keep at it and communicate (articulate the vision and goals, have a high-quality web site to showcase and promote the project, create good documentation, and advertise). Once it’s growing, work to attract more users and developers.

path: /oss | Current Weblog | permanent link to this entry

Wed, 27 Jun 2012

Antideficiency Act and the Apache License

Some people are claiming that the U.S. federal government law called the “antideficiency act” means that the U.S. government cannot use any software released under the Apache 2.0 license. This is nonsense, but it’s a good example of the nonsense that impedes government use and co-development of some great software. Here’s why this is nonsense.

First, I should note that in my earlier post, Open Source Software volunteers forbidden in government? (Antideficiency Act), I explained that the US government rule called the “antideficiency act” (ADA) doesn’t interfere with the government’s use of open source software (OSS), even if it is created by people who are “volunteers”. As long as the volunteers intend or agree that their work is gratuitous (no-charge), there’s no problem. The antideficiency act says that you can’t create a moral obligation to pay without Congress’ consent; the government can accept materials even if they are provided at no cost.

The GAO has a summary describing the Antideficiency Act (ADA), Pub.L. 97-258, 96 Stat. 923. It explains that the ADA prohibits “federal employees from:

Software licenses sometimes include indemnification clauses, and those clauses can run afoul of this act if they require the government to grant a possibly unlimited future liability (or any liability not already appropriated). But some lawyers act as if the word “indemnification” were some kind of magic curse word. The word “indemnification” is not a magic word that makes a license automatically unacceptable for government use. As always, whether a license is okay or not depends on what the license actually says.

The license that seems to trigger problems with some lawyers is the Apache 2.0 license, a popular OSS license. Yet the Apache License version 2.0 does not require such broad indemnification. Clause 9 of the Apache 2.0 license (“Accepting Warranty or Additional Liability”) instead requires a redistributor to provide indemnification only when an additional condition is met - in this case, when the redistributor itself chooses to offer a warranty or indemnification. Clause 9 says in full, “While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.”

In short (and to oversimplify), “if you indemnify your downstream (recipients), you have to indemnify your upstream (those you received it from)”. There is a reason for clauses like this; they help counter clever shenanigans by competitors who might want to harm a project. If a competitor set up a situation that legally protected the software’s users while legally exposing its developers to heightened risk, after a while there would be no developers. This clause prevents that. (This is yet another example of why you should reuse a widely-used OSS license instead of writing your own; most people would never have thought of this issue.)

It is extremely unlikely that any government agency would trigger this clause by warrantying the software or indemnifying a recipient, so it is quite unlikely that this clause would ever be triggered by government action. But in any case, it would be this latter action, not mere acceptance of the Apache 2.0 license, that would potentially run afoul of the ADA. This is simply business as usual; the government typically does not warranty or indemnify software it releases, and if it did, it would have to determine that value and lawfully receive funding to do it.

There’s an additional wrinkle on this stuff. The legal field, like the software field, is so large that many people specialize, and sometimes the right specialists don’t get involved. Reviewing software licenses is normally the domain of so-called “intellectual property” lawyers, who really should be called “data rights” lawyers. (I’ve commented elsewhere that the term “intellectual property” is dangerously misleading, but that is a different topic.) But I’ve been told that at least in some organizations, the people who really understand the antideficiency act are a different group of lawyers (e.g., those who specialize in finance). So if a data rights lawyer comes back with antideficiency act questions, find out if that lawyer is the right person to talk to; it may be that the question really should be forwarded to a lawyer who specializes in that instead.

Now I am no lawyer, and this blog post is not legal advice. Even if I were a lawyer, I am not your lawyer — specific facts can create weird situations. There is no formal ruling on this matter, either, more’s the pity. However, the conclusion I’m describing has been reached previously by others; in particular, see “Army lawyers dismiss Apache license indemnification snafu”, Fierce Government IT, March 8, 2012. What’s more, other lawyers I’ve talked to have agreed that this makes sense. Basically, the word “indemnification” is not a magic curse word when it appears in a license — you have to actually read the license, and then determine whether it is a problem or not.

More broadly, this is (yet another) example of a misunderstanding in the U.S. federal government that impedes the use and collaborative development of open source software (aka OSS or FLOSS). I believe the U.S. federal government does not use or co-develop OSS to the extent that it should, and in some cases it is because of misunderstandings like this. So if this matters to you, spread the word — often rules that appear to be problems are not problems at all.

I’ve put this information in the MIL-OSS FAQ so others can find out about this.

path: /oss | Current Weblog | permanent link to this entry