David A. Wheeler's Blog

Mon, 20 Oct 2014

Open Source Software in U.S. Government

The report “Open Source Software in Government: Challenges and Opportunities” is available to the public (you can jump to the “Download full report” link at the bottom). This paper, which I co-authored, discusses key challenges and opportunities in the U.S. government’s application of open source software (OSS). It became publicly available only recently, even though it was finished a while back; I hope it’s been worth the wait. If you’re interested in the issues of OSS and government, I think you’ll find this report very illuminating.

path: /oss | Current Weblog | permanent link to this entry

Tue, 14 Oct 2014

POODLE attack against SSLv3

There is a new POODLE attack against SSLv3. See my page for more info.

path: /security | Current Weblog | permanent link to this entry

Mon, 13 Oct 2014

Twitter

My username on Twitter is drdavidawheeler, for those on Twitter who want occasional comments on computer security, open source software, software development, and so on.

path: /website | Current Weblog | permanent link to this entry

Thu, 09 Oct 2014

A tester walks into a bar

A tester walks into a bar and orders a beer. Then he orders 999,999 beers. Orders 0 beers. Orders -1 beers. Orders a coyote. Orders a qpoijzcpx. Then he insults the bartender.

This joke (with variations) is making the rounds, but it also has a serious point. It’s a nice example of how testing should work, including software testing.

Too many of today’s so-called software “tests” only check for correct data. This has led to numerous vulnerabilities, including Heartbleed and Apple’s “goto fail; goto fail;” vulnerability. The paper “The Most Dangerous Code in the World: Validating SSL Certificates in Non-Browser Software” found that the security of a disturbingly large number of programs depends on SSL certificate validation, yet those programs are insecure because no one actually tested them with invalid certificates. The authors note that “most of the vulnerabilities we found should have been discovered during development with proper unit testing”.

Good software testing must include negative testing (tests with data that should be rejected) to ensure that the software protects itself against bad data. This must be part of an automated regression test suite (re-run constantly) to prevent problems from creeping in later. For example, if your program accepts numbers, don’t just test “correct” input; also test wrong input such as values that are too big, zero or negative values where they are not allowed, values that are too small, and non-numbers. Testing “just around” the too-big and too-small boundaries is often helpful, as is testing that tries to bypass the interface. Your users won’t know how you did it, but they’ll know your program “just works” reliably.
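To make this concrete, here is a minimal sketch (using Python’s unittest) of what such negative tests might look like; the parse_order_count function and its limits are hypothetical, invented just for this example:

import unittest

def parse_order_count(text):
    # Hypothetical input handler: accept a count between 1 and 999999.
    # Reject non-numbers, zero, negatives, and absurdly large values.
    try:
        value = int(text)
    except (TypeError, ValueError):
        raise ValueError("not a number")
    if value < 1:
        raise ValueError("count must be at least 1")
    if value > 999999:
        raise ValueError("count is unreasonably large")
    return value

class TestOrderCount(unittest.TestCase):
    def test_correct_input(self):
        self.assertEqual(parse_order_count("1"), 1)
        self.assertEqual(parse_order_count("999999"), 999999)  # boundary value

    def test_bad_input_is_rejected(self):
        # Negative tests: bad data must be rejected, not silently accepted.
        for bad in ["0", "-1", "1000000", "coyote", "qpoijzcpx", "", None]:
            with self.subTest(bad=bad):
                self.assertRaises(ValueError, parse_order_count, bad)

if __name__ == "__main__":
    unittest.main()

Run this as part of your automated regression suite (e.g., “python -m unittest”) so the protection is re-checked on every change.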

path: /misc | Current Weblog | permanent link to this entry

Sun, 05 Oct 2014

Shellshock

I have posted a new paper about Shellshock. In particular, it includes a detailed timeline about Shellshock, which counters a number of myths and misunderstandings. It also shows a correct way to detect if your system is vulnerable to Shellshock (many postings get it wrong and only detect part of the problem).
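As an illustration only (this is not the full check from the paper), here is a minimal Python sketch of the idea behind the most widely posted test: it asks whether bash executes a command smuggled in after an environment-variable function definition. Note that it probes only the original flaw (CVE-2014-6271); a complete check must also cover the related follow-on CVEs, which is exactly where many postings fall short.

import os
import subprocess

def bash_has_original_shellshock(bash_path="/bin/bash"):
    # Return True if this bash runs code appended to a function definition
    # imported from the environment (the original CVE-2014-6271 only).
    env = dict(os.environ)
    env["testvar"] = "() { :;}; echo SHELLSHOCK-VULNERABLE"
    output = subprocess.check_output([bash_path, "-c", "echo probe done"], env=env)
    return b"SHELLSHOCK-VULNERABLE" in output

if __name__ == "__main__":
    if bash_has_original_shellshock():
        print("vulnerable to CVE-2014-6271")
    else:
        print("not vulnerable to CVE-2014-6271 (related CVEs not checked)")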

I also briefly discuss how to detect or prevent future Shellshock-like attacks. At the moment this list is short, because these kinds of vulnerabilities are known to be difficult to detect ahead of time. Still, I think it is worth trying to do this. My goal is to eventually end up with something similar to the list of countermeasures for Heartbleed-like attacks that I developed earlier.

path: /security | Current Weblog | permanent link to this entry

Tue, 19 Aug 2014

Software SOAR released!!

The Software SOAR (which I co-authored) has finally been released to the public! This document - whose full name is State-of-the-Art Resources (SOAR) for Software Vulnerability Detection, Test, and Evaluation (Institute for Defense Analyses Paper P-5061, July 2014) - is now available to everyone. It defines and describes an overall process for selecting and using appropriate analysis tools and techniques for evaluating software for software (security) assurance. In particular, it identifies types of tools and techniques available for evaluating software, as well as the technical objectives those tools and techniques can meet. A key point it makes is that in most cases you need to use a variety of different tools if you are trying to evaluate software (e.g., to find vulnerabilities).

The easy way to get the document is via the Program Protection and System Security Engineering web page; scroll to the bottom to find it (it is co-authored by David A. Wheeler and Rama S. Moorthy). You can jump directly to the main report of the Software SOAR and Appendix E (Software State-of-the-Art Resources (SOAR) Matrix). You can also get the Software SOAR report via IDA.

I don’t normally mention things I’ve done at work, but this is publicly available, some people have been waiting for it, and I’ve found that some people have had trouble finding it. For example, the article “Pentagon rates software assurance tools” by David Perera (Politico, 2014-08-19) is about this paper, but it does not tell people how to actually get it. I’m hoping that this announcement will give people a hand.

path: /security | Current Weblog | permanent link to this entry

Sun, 13 Jul 2014

Flawfinder version 1.28 released!

I’ve released yet another new version of flawfinder - now it’s version 1.28. Flawfinder is a simple program that examines C/C++ source code and reports on likely security flaws in the program, ranked by risk level.

This new version has some new capabilities. Common Weakness Enumeration (CWE) references are now included in most hits (this makes it easier to use in conjunction with other tools, and it also makes it easier to find general information about a weakness). The new version of flawfinder also has a new option to only produce reports that match a regular expression (e.g., you can report only hits with specific CWE values). This version also adds support for the git diff format.

This new version also has a number of bug fixes. For example, it handles files that do not end in a newline, and it more gracefully handles unbalanced double-quotes in sprintf. A bug in reporting the time executed has also been fixed.

For more information, or a copy, just go to my original flawfinder home page or the flawfinder project page on SourceForge.net. Enjoy!

path: /security | Current Weblog | permanent link to this entry

Tue, 10 Jun 2014

Interview on Application Security

A new interview of me is available: David A. Wheeler on the Current State of Application Security (by the Trusted Software Alliance) (alternate link). In this interview I discuss a variety of topics with Mark Miller, including the need for education in developing secure software, the need to consider security throughout the lifecycle, and the impact of componentization. I warn that many people do not include security (including software assurance) when they ask for quality; while I agree in principle that security is generally part of quality, in practice you have to specifically ask for security or you won’t get it.

This interview is part of their 50 in 50 interviews series, along with Joe Jarzombek (Department of Homeland Security), Steve Lipner (Microsoft), Bruce Schneier, Jeff Williams (Aspect Security and OWASP), and many others. It was an honor and pleasure to participate, and I hope you enjoy the results.

path: /security | Current Weblog | permanent link to this entry

Wed, 21 May 2014

On Dave and Gunnar show

There is now an interview of me on the Dave and Gunnar show (episode #51). I talk mostly about How to prevent the next Heartbleed. I also talk about my FLOSS numbers database (as previously discussed) and vulnerability economics. There was even a mention of my Fully Countering Trusting Trust through Diverse Double-Compiling work.

Since the time of the interview, more information has surfaced about Heartbleed. Traditional fuzzing could not find Heartbleed, but it looks like some fuzzing variants could have found it, even if the OpenSSL code was unchanged; see the latest version for more information. If you learn more information relevant to the paper, let me know!

path: /oss | Current Weblog | permanent link to this entry

Thu, 08 May 2014

FLOSS numbers database!

If you are doing research related to Free / Libre / Open Source Software (FLOSS), then I have something that may be useful to you: the FLOSS numbers database.

My paper Why Open Source Software / Free Software (OSS/FS, FLOSS, or FOSS)? Look at the Numbers! is a big collection of quantitative studies about FLOSS. Too big, in fact. There have been a lot of quantitative studies about FLOSS over the years! A lot of people want to query this information for specific purposes, and it is hard to pull out just the parts you want from a flat document. I had thought that as FLOSS became more and more common, fewer people would want this information… but I still get requests for it.

So I am announcing the FLOSS numbers database; it provides the basic information in spreadsheet format, making it easy to query for just the parts you want. My special thanks go to Paul Rotilie, who worked to get the data converted from my document format into the spreadsheet.

If you want to discuss this database, I have set up a discussion group: Numbers about Free Libre Open Source Software. If you are doing research and need or use this kind of information, please feel free to join. If you just need a presentation based on this, you might like my Presentation: Why Free-Libre / Open Source Software (FLOSS or OSS/FS)? Look at the Numbers!.

This database is the sort of thing that if you need it, you really need it. I am sure it is incomplete… but I am also sure that with your help, we can make it better.

path: /oss | Current Weblog | permanent link to this entry

Sat, 03 May 2014

How to Prevent the next Heartbleed

My new article How to Prevent the next Heartbleed describes why the Heartbleed vulnerability in OpenSSL was so hard to find… and what could be done to prevent something like it next time.

path: /security | Current Weblog | permanent link to this entry

Thu, 24 Apr 2014

Opensource.com interview

Opensource.com has posted an interview of me, titled “US government accelerating development and release of open source”. In this interview I describe the current state of the use of open source software by the US federal government, the challenges of the Federal acquisition system, and I also discuss what may happen next. Enjoy!

path: /oss | Current Weblog | permanent link to this entry

Thu, 20 Feb 2014

Presenting at American Society for Quality

On February 25, 2014, I will be presenting on “Open Source Software and Government” at the American Society for Quality (ASQ) Software SIG. You can join in person in McLean, Virginia; there will also be various video tele-conferencing sites, and you can join by phone or online as well.

If you’re interested, you’re welcome to join us, but you’ll need to pre-register.

path: /oss | Current Weblog | permanent link to this entry

Fri, 07 Feb 2014

William W. McCune: He made the world a better place through source code

Here I want to honor the memory of William W. (“Bill”) McCune, who helped change the world for the better by releasing software source code. I hope that many other researchers and government policy-makers will follow his lead… and below I intend to show why.

But first, I should explain my connection to him. My PhD dissertation involved countering the so-called “trusting trust” attack. In this attack, an attacker subverts the tools that developers use to create software. This turns out to be a really nasty attack. If a software developer’s tools are subverted, then the attacker actually controls the computer system running the software. This is no idle concern, either; we know that computers are under constant attack, and that some of these attacks are very sophisticated. Such subversions could allow attackers to essentially control all computers worldwide, including the global financial system, militaries, electrical systems, dams, you name it. That kind of power makes this kind of attack potentially worthwhile, but only if it cannot be detected and countered. For many years there were no good detection mechanisms or countermeasures. Then Henry Spencer suggested a potential solution… but there was no agreement that his idea would really counter attackers. That matters; how can you be absolutely certain about some claim?

The “gold standard” for knowing if something is true is a formal mathematical proof. Many important questions cannot be proved this way, all proofs depend on assumptions, and creating a formal proof is often hard. Still, a formal mathematical proof is the best guarantee we have for being certain about something. And there were a lot of questions about whether or not Henry Spencer’s approach would really counter this attack. So, I went about trying to prove that Henry Spencer’s idea really would counter the attack (if certain assumptions held).

After trying several other approaches, I found that the tools developed by Bill McCune (in particular prover9, mace4, and ivy) were perfect for my needs. These tools made my difficult work far easier, because they could mostly-automatically prove claims once the claims were described as mathematical statements. In the end, I managed to mathematically prove that Henry Spencer’s approach really did counter the subverted compiler problem. The tools Bill McCune developed and released made a real difference in helping to solve this challenging real-world problem. I didn’t need much help (because his tools were remarkably easy to use and well-documented), but when I did email him, he responded quickly.

Sadly, Bill McCune suddenly died on May 4, 2011, leaving the field of automated reasoning deprived of one of its founders (particularly in the subfields of practical theorem proving and model building). In 2013 an academic book was released in his honor (“Automated Reasoning and Mathematics: Essays in Memory of William W. McCune”, Lecture Notes in Artificial Intelligence 7788). That book’s preface has a nice tribute to Bill McCune, listing some of his personal accomplishments (e.g., the development of Otter) and other accomplishments that his tools enabled.

Bill McCune released many tools as open source software (including prover9, mace4, ivy, and the older tool Otter). This means that anyone could use the software (for any purpose), modify it, and distribute it (with or without modification). These freedoms had far-reaching effects, accelerating research in automated proving of claims, as well as speeding the use of these techniques. That book’s preface notes several of Bill McCune’s accomplishments, including the impact he had by releasing the code.

All too often the U.S. government spends a fortune on research, and then that same research has to be recreated from scratch several times by other researchers (sometimes unsuccessfully). This is a tremendous waste of government money, and it can delay work by years (if the work happens at all), resulting in far less progress for the money spent. Bill McCune instead ensured that his results got out to people who could use and improve upon them. In this specific area Bill McCune made software research available to many others, so that those others could use it, verify it, and build on top of those results.

Of course, he was not alone in recognizing the value of sharing research when it is implemented as software. The paper “The Evolution from LIMMAT to NANOSAT” by Armin Biere (April 2004) makes the same point, based on an attempt to reproduce others’ work. That paper states, “From the publications alone, without access to the source code, various details were still unclear… what we did not realize, and which hardly could be deduced from the literature, was [that an optimization] employed in GRASP and CHAFF [was critically important]… Only [when CHAFF’s source code became available,] our unfortunate design decision became clear… The lesson learned is, that important details are often omitted in publications and can only be extracted from source code. It can be argued, that making source code … available is as important to the advancement of the field as publication.”

More generally, Free the Code.org argues that if government pays to develop software, then it should be available to others for reuse and sharing. That makes sense to me; if “we the people” paid to develop software, then by default “we the people” should receive it. I think it especially makes sense in science and research; without the details of how software works, results are not reproducible. Currently much of science is not reproducible (and thus not really science), though open science efforts are working to change this.

I think Bill McCune made great contributions to many, many, others. I am certainly one of the beneficiaries. Thank you, Bill McCune, so very much for your life’s work.

path: /oss | Current Weblog | permanent link to this entry

Sun, 01 Dec 2013

Shellcheck

I just learned about shellcheck, a tool that reports on common mistakes in (Bourne) shell scripts. If you write shell scripts, you should definitely check out this static analyzer. You can try it out by pasting shell scripts into their website. It is open source software, so you can also download and use it to your heart’s content.

It even covers some of the issues identified in Filenames and Pathnames in Shell: How to do it Correctly. If you are interested in static analyzers for software, you can also see my Flawfinder home page which identifies many other static analysis tools.

path: /oss | Current Weblog | permanent link to this entry

Sat, 16 Nov 2013

Vulnerability bidding wars and vulnerability economics

I worry that the economics of software vulnerability reporting is seriously increasing the risks to society. The problem is the rising bidding wars for vulnerability information, leading to a rapidly-growing number of vulnerabilities known only to attackers. These kinds of vulnerabilities, when exploited, are sometimes called “zero-days” because users and suppliers had zero days of warning. I suspect we should create laws limiting the sale of vulnerability information, similar to the limits we place on organ donation, to change the economics of vulnerability reporting. To see why, let me go over some background first.

A big part of the insecure software problem today is that relatively few of today’s software developers know how to develop software that resists attack (e.g., via the Internet). Many schools don’t teach it at all. I think that’s ridiculous; you’d think people would have heard about the Internet by now. I do have some hope that this will get better. I teach a graduate course on how to develop secure software at George Mason University (GMU), and attendance has increased over time. But today, most software developers do not know how to create secure software.

In contrast, there is an increasing bidding war for vulnerability information by organizations who intend to exploit those vulnerabilities. This incentivizes people to search for vulnerabilities, but not report them to the suppliers (who could fix them) and not alert the public. As Bruce Schneier reports in “The Vulnerabilities Market and the Future of Security” (June 1, 2012), “This new market perturbs the economics of finding security vulnerabilities. And it does so to the detriment of us all.” Forbes ran an article about this in 2012, Meet The Hackers Who Sell Spies The Tools To Crack Your PC (And Get Paid Six-Figure Fees). The Forbes article describes what happened when French security firm Vupen broke the security of the Chrome web browser. Vupen would not tell Google how they broke in, because the $60,000 award from Google was not enough. Chaouki Bekrar, Vupen’s chief executive, said that they “wouldn’t share this [information] with Google for even $1 million… We want to keep this for our customers.” These customers do not plan to fix security bugs; they purchase exploits or techniques with the “explicit intention of invading or disrupting”. Vupen even “hawks each trick to multiple government agencies, a business model that often plays its customers against one another as they try to keep up in an espionage arms race.” Just one part of the Flame espionage software (exploiting Microsoft Update) has been estimated as being worth $1 million before it became publicly known.

This imbalance in economic incentives creates a dangerous and growing mercenary subculture. You now have a growing number of people looking for vulnerabilities, keeping them secret, and selling them to the highest bidder… which will encourage more to look for, and keep secret, these vulnerabilities. After all, they are incentivized to do it. In contrast, the original developer typically does not know how to develop secure software, and there are fewer economic incentives to develop secure software anyway. This is a volatile combination.

Some think the solution is for suppliers to pay people when they report security vulnerabilities to suppliers (“bug bounties”). I do not think bug bounty systems (by themselves) will be enough, though suppliers are trying.

There has been a lot of discussion about Yahoo and bug bounties. On September 30, 2013, the article What’s your email security worth? 12 dollars and 50 cents according to Yahoo reported that Yahoo paid only $12.50 USD for each vulnerability. Even worse, this was not actual money; it was “a discount code that can only be used in the Yahoo Company Store, which sell Yahoo’s corporate t-shirts, cups, pens and other accessories”. Ilia Kolochenko, High-Tech Bridge CEO, says: “Paying several dollars per vulnerability is a bad joke and won’t motivate people to report security vulnerabilities to them, especially when such vulnerabilities can be easily sold on the black market for a much higher price. Nevertheless, money is not the only motivation of security researchers. This is why companies like Google efficiently play the ego card in parallel with [much higher] financial rewards and maintain a ‘Hall of Fame’ where all security researchers who have ever reported security vulnerabilities are publicly listed. If Yahoo cannot afford to spend money on its corporate security, it should at least try to attract security researchers by other means. Otherwise, none of Yahoo’s customers can ever feel safe.” Brian Martin, President of Open Security Foundation, said: “Vendor bug bounties are not a new thing. Recently, more vendors have begun to adopt and appreciate the value it brings their organization, and more importantly their customers. Even Microsoft, who was the most notorious hold-out on bug bounty programs realized the value and jumped ahead of the rest, offering up to $100,000 for exploits that bypass their security mechanisms. Other companies should follow their example and realize that a simple ‘hall of fame’, credit to buy the vendor’s products, or a pittance in cash is not conducive to researcher cooperation. Some of these companies pay their janitors more money to clean their offices, than they do security researchers finding vulnerabilities that may put thousands of their customers at risk.” Yahoo has since decided to establish a bug bounty system with larger rewards.

More recently, the Internet Bug Bounty Panel (founded by Microsoft and Facebook) will reward public research into vulnerabilities that have potentially severe security implications for the public. It has a minimum bounty of $5,000. However, it certainly does not cover everything; they only intend to pay for widespread vulnerabilities (ones affecting a wide range of products or end users), and they plan to limit bounties to severe vulnerabilities that are novel (new or unusual in an interesting way). I think this could help, but it is no panacea.

Bug bounty systems are typically drastically outbid by attackers, and I see no reason to believe this will change.

Indeed, I do not think we should mandate, or even expect, that suppliers will pay people when people report security vulnerabilities to suppliers (aka bug bounties). Such a mandate or expectation could kill small businesses and open source software development, and it would almost certainly chill software development in general. Such payments also would not deal with what I see as a key problem: the people who sell vulnerabilities to the highest bidder. Mandating payment by suppliers would get most people to send them problem reports… but only if the bug bounty payments were required to be larger than the payments offered by those who would exploit the vulnerability. That would be absurd, because given current prices, such a requirement would almost certainly prevent a lot of software development.

I think people who find a vulnerability in software should normally be free to tell the software’s supplier, so that the supplier can rapidly repair the software (and thus fix it before it is exploited). Some people call this “responsible disclosure”, though some suppliers misuse this term. Some suppliers say they want “responsible disclosure”, but they instead appear to irresponsibly abuse the term to stifle warning those at risk (including customers and the public), as well as irresponsibly delay the repair of critical vulnerabilities (if they repair the vulnerabilities at all). After all, if a supplier convinces the researcher to not alert users, potential users, and the public about serious security defects in their product, then these irresponsible suppliers may believe they don’t need to fix it quickly. People who are suspicious about “responsible disclosure” have, unfortunately, excellent reasons to be suspicious. Many suppliers have shown themselves untrustworthy, and even trustworthy suppliers need to have a reason to stay that way. For that and other reasons, I also think people should be free to alert the public in detail, at no charge, about a software vulnerability (so-called “full disclosure”). Although it’s not ideal for users, full disclosure is sometimes necessary; it can be especially justifiable when a supplier has demonstrated (through past or current actions) that he will not rapidly fix the problem that he created. In fact, I think it’d be an inappropriate constraint of free speech to prevent people from revealing serious problems in software products to the public.

But if we don’t want to mandate bug bounties, or so-called “responsible disclosure”, then where does that leave us? We need to find some way to change the rules so that economics works more closely with and not against computer security.

Well, here is an idea… at least one to start with. Perhaps we should criminalize selling vulnerability information to anyone other than the supplier or the reporter’s government. Basically, treat vulnerability information like organ donation: intentionally eliminate economic incentives in a specific area for a greater social good.

That would mean that suppliers can set up bug bounty programs, and researchers can publish information about vulnerabilities to the public, but this would sharply limit who else can legally buy the vulnerability information. In particular, it would be illegal to sell the information to organized crime, terrorist groups, and so on. Yes, governments can do bad things with the information; this particular proposal does nothing directly to address it. But I think it’s impossible to prevent a citizen from telling his country’s government about a software vulnerability; a citizen could easily see it as his duty. I also think no government would forbid buying such information for itself. However, by limiting sales to that particular citizen’s government, it becomes harder to create bidding wars between governments and other groups for vulnerability information. Without the bidding wars, there’s less incentive for others to find the information and sell it to them. Without the incentives, there would be fewer people working to find vulnerabilities that they would intentionally hide from suppliers and the public.

I believe this would not impinge on freedom of speech. You can tell no one, everyone, or anyone you want about the vulnerability. What you cannot do is receive financial benefit from selling vulnerability information to anyone other than the supplier (who can then fix it) or your own government (and that at least reduces bidding wars).

Of course, you always have to worry about unexpected consequences or easy workarounds for any new proposed law. An organization could set itself up specifically to find vulnerabilities and then exploit them itself… but that’s already illegal, so I don’t see a problem there. A trickier problem is that a malicious organization (say, the mob) could create a “supplier” (e.g., a reseller of proprietary software, or a downstream open source software package) that vulnerability researchers could sell their information to, working around the law. This could probably be handled by requiring, in law, that suppliers report (in a timely manner) any vulnerability information they receive to their relevant suppliers.

Obviously some people will do illegal things anyway, but some people will avoid doing illegal things on principle, and others will avoid illegal activities because they fear getting caught. You don’t need to stop all possible cases, just enough to change the economics.

I fear that the current “vulnerability bidding wars” - left unchecked - will create an overwhelming tsunami of zero-days available to a wide variety of malicious actors. The current situation might impede the peer review of open source software (OSS), since currently people can make more money selling an exploit than in helping the OSS project fix the problem. Thankfully, OSS projects are still widely viewed as public goods, so there are still many people who are willing to take the pay cut and help OSS projects find and fix vulnerabilities. I think proprietary and custom software are actually in much more danger than OSS; in those cases it’s a lot easier for people to think “well, they wrote this code for their financial gain, so I may as well sell my vulnerability information for my financial gain”. The problem for society is that this attitude completely ignores the users and those impacted by the software, who can get hurt by the later exploitation of the vulnerability.

Maybe there’s a better way. If so, great… please propose it! My concern is that economics currently makes it hard - not easy - to have computer security. We need to figure out ways to get Adam Smith’s invisible hand to work for us, not against us.

Standard disclaimer: As always, these are my personal opinions, not those of my employer, government, or (deceased) guinea pig.

path: /security | Current Weblog | permanent link to this entry

Mon, 14 Oct 2013

Readable Lisp version 1.0.0 released!

Lisp-based languages have been around a long time. They have some interesting properties, especially when you want to write programs that analyze or manipulate programs. The problem with Lisp is that the traditional Lisp notation - s-expressions - is notoriously hard to read.

I think I have a solution to the problem. I looked at past (failed) solutions and found that they generally failed to be general or homoiconic. I then worked to find notations with these key properties. My solution is a set of notation tiers that make Lisp-based languages much more pleasant to work with. I’ve been working with many others to turn this idea of readable notations into a reality. If you’re interested, you can watch a short video or read our proposed solution.

The big news is that we have reached version 1.0.0 in the readable project. We now have an open source software (MIT license) implementation for both (guile) Scheme and Common Lisp, as well as a variety of support tools. The Scheme portion implements the SRFI-105 and SRFI-110 specs, which we wrote. One of the tools, unsweeten, makes it possible to process files in other Lisps as well.

So what do these tools do? Fundamentally, they implement the 3 notation tiers we’ve created: curly-infix-expressions, neoteric-expressions, and sweet-expressions. Sweet-expressions have the full set of capabilities.

Here’s an example of (awkward) traditional s-expression format:

(define (factorial n)
  (if (<= n 1)
    1
    (* n (factorial (- n 1)))))

Here’s the same thing, expressed using sweet-expressions:

define factorial(n)
  if {n <= 1}
    1
    {n * factorial{n - 1}}

I even briefly mentioned sweet-expressions in my PhD dissertation “Fully Countering Trusting Trust through Diverse Double-Compiling” (see section A.3).

So if you are interested in how to make Lisp-based languages easier to read, watch our short video about the readable notations or download the current version of the readable project. We hope you enjoy them.

path: /misc | Current Weblog | permanent link to this entry

Thu, 26 Sep 2013

Welcome, those interested in Diverse Double-Compiling (DDC)!

A number of people have recently been discussing or referring to my PhD work, “Fully Countering Trusting Trust through Diverse Double-Compiling (DDC)”, which counters Trojan Horse attacks on compilers. Last week’s discussion on reddit, based on a short slide show, discussed it directly, for example. There have also been related discussions such as Tor’s work on creating deterministic builds.

For everyone who’s interested in DDC… welcome! I intentionally posted my dissertation, and a video about it, directly on the Internet with no paywall. That way, anyone who wants the information can immediately get it. Enjoy!

I even include enough background material so other people can independently repeat my experiments and verify my claims. I believe that if you cannot reproduce the results, it is not science… and a lot of computational research has stopped being a science. This is not a new observation; “Reproducible Research: Addressing the Need for Data and Code Sharing in Computational Science” by Victoria C. Stodden (Computing in Science & Engineering, 2010) summarizes a roundtable on this very problem. The roundtable found that “Progress in computational science is often hampered by researchers’ inability to independently reproduce or verify published results” and, along with a number of specific steps, that “reproducibility must be embraced at the cultural level within the computational science community.” “Does computation threaten the scientific method” (by Leslie Hatton and Adrian Giordani) and “The case for open computer programs” in Nature (by Darrel C. Ince, Leslie Hatton, and John Graham-Cumming) make similar points. For one of many examples, the paper “The Evolution from LIMMAT to NANOSAT” by Armin Biere (Technical Report #444, 15 April 2004) reported that its author could not reproduce results because “From the publications alone, without access to the source code, various details were still unclear.” In the end the author realized that “making source code… available is as important to the advancement of the field as publications”. I think we should not pay researchers, or their institutions, if they fail to provide the materials necessary to reproduce the work.

I do have a request, though. There is no patent on DDC, nor is there a legal requirement to report using it. Still, if you apply my approach, please let me know; I’d like to hear about it. Alternatively, if you are seriously trying to use DDC but are having some problems, let me know.

Again - enjoy!

path: /security | Current Weblog | permanent link to this entry

Wed, 21 Aug 2013

Open security

Modern society depends on computer systems. Yet computer security problems let attackers subvert the very systems that society depends on. This is a serious problem.

I think one approach that could help is “open security” - applying open source software (OSS) approaches to help solve computer security problems. To see why, let’s look at some background.

Back in the 1970s people collaboratively developed software that today we would call open source software or free-libre software. At the time many assumed these approaches could not scale up to big systems… but they were wrong. Software systems that would cost over a billion U.S. dollars to redevelop have been developed as open source software, and Wikipedia has used similar approaches to collaboratively develop the world’s largest encyclopedia.

So… if we can collaboratively develop multi-billion software systems, and large encyclopedias, can we use the same kinds of collaborative approaches to improve computer security? I believe we can… but if we are going to do this, we need to define a term for this (so that we can agree on what we are doing!).

I propose that open security is the application of open source software (OSS) approaches to help solve cyber security problems. OSS approaches collaboratively develop and maintain intellectual works (including software and documentation) by enabling users to use them for any purpose, as well as study, create, change, and redistribute them (in whole or in part). Cyber security problems are a lack of security (confidentiality, integrity, and/or availability), or potential lack of security (a vulnerability), in computer systems and/or the networks they are a part of. In short, open security improves security through collaboration.

You can see more details in my paper What is open security? [PDF] [DOC]. I intentionally built on previous work such as the Free Software Definition by the Free Software Foundation (FSF), the Open Source Definition (Annotated) by the Open Source Initiative (OSI), the Creative Commons license work, and the Definition of Free Cultural Works by Freedom Defined (the last one is, for example, the basis of the Wikimedia/Wikipedia licensing policy).

The Open security site has been recently set up so that you and others can join and get involved. So please - get involved! We are only just starting, and the direction we go depends on the feedback we get.


path: /oss | Current Weblog | permanent link to this entry

Tue, 06 Aug 2013

Don’t anthropomorphize computers, they hate that

A lot of people who program computers or live in the computing world - including me - talk about computer hardware and software as if they are people. Why is that? This is not as obvious as you’d think.

After all, if you read the literature about learning how to program, you’d think that programmers would never use anthropomorphic language. “Separating Programming Sheep from Non-Programming Goats” by Jeff Atwood discusses teaching programming and points to the intriguing paper “The camel has two humps” by Saeed Dehnadi and Richard Bornat. This paper reported experimental evidence on why some people can learn to program, while others struggle. Basically, to learn to program you must fully understand that computers mindlessly follow rules, and that computers just don’t act like humans. As their paper said, “Programs… are utterly meaningless. To write a computer program you have to come to terms with this, to accept that whatever you might want the program to mean, the machine will blindly follow its meaningless rules and come to some meaningless conclusion… the consistent group [of people] showed a pre-acceptance of this fact: they are capable of seeing mathematical calculation problems in terms of rules, and can follow those rules wheresoever they may lead. The inconsistent group, on the other hand, looks for meaning where it is not. The blank group knows that it is looking at meaninglessness, and refuses to deal with it. [The experimental results suggest] that it is extremely difficult to teach programming to the inconsistent and blank groups.” Later work by Saeed Dehnadi (sometimes with others) expands on this earlier work. The intermediate paper “Mental models, Consistency and Programming Aptitude” (2008) seemed to refute the idea that consistency (and ignoring meaning) was critical to learning programming, but the later “Meta-analysis of the effect of consistency on success in early learning of programming” (2009) added refinements and re-confirmed the original hypothesis. The re-confirmation involved a meta-analysis of six replications of an improved version of Dehnadi’s original experiment, and it again showed that understanding that computers are mindlessly consistent is key to successfully learning to program.

So the good programmers know darn well that computers mindlessly follow rules. But many use anthropomorphic language anyway. Huh? Why is that?

Some do object to anthropomorphism, of course. Edsger Dijkstra certainly railed against anthropomorphizing computers. For example, in EWD854 (1983) he said, “I think anthropomorphism is the worst of all [analogies]. I have now seen programs ‘trying to do things’, ‘wanting to do things’, ‘believing things to be true’, ‘knowing things’ etc. Don’t be so naive as to believe that this use of language is harmless.” He believed that analogies (like these) led to a host of misunderstandings, and that those misunderstandings led to repeated multi-million-dollar failures. It is certainly true that misunderstandings can lead to catastrophe. But I think one reason Dijkstra railed particularly against anthropomorphism is that it is a widespread practice, even among those who do understand how computers work - and I see no evidence that anthropomorphism is going away.

The Jargon file specifically discusses anthropomorphization: “one rich source of jargon constructions is the hackish tendency to anthropomorphize hardware and software. English purists and academic computer scientists frequently look down on others for anthropomorphizing hardware and software, considering this sort of behavior to be characteristic of naive misunderstanding. But most hackers anthropomorphize freely, frequently describing program behavior in terms of wants and desires. Thus it is common to hear hardware or software talked about as though it has homunculi talking to each other inside it, with intentions and desires… As hackers are among the people who know best how these phenomena work, it seems odd that they would use language that seems to ascribe consciousness to them. The mind-set behind this tendency thus demands examination. The key to understanding this kind of usage is that it isn’t done in a naive way; hackers don’t personalize their stuff in the sense of feeling empathy with it, nor do they mystically believe that the things they work on every day are ‘alive’.”

Okay, so others have noticed this too. The Jargon file even proposes some possible reasons for anthropomorphizing computer hardware and software:

  1. It reflects a “mechanistic view of human behavior.” “In this view, people are biological machines - consciousness is an interesting and valuable epiphenomenon, but mind is implemented in machinery which is not fundamentally different in information-processing capacity from computers… Because hackers accept that a human machine can have intentions, it is therefore easy for them to ascribe consciousness and intention to other complex patterned systems such as computers.” But while the materialistic view of humans has respectable company, this “explanation” fails to explain why humans would use anthropomorphic terms about computer hardware and software, since they are manifestly not human. Indeed, as the Jargon file acknowledges, even hackers who hold contrary religious views will use anthropomorphic terminology.
  2. It reflects “a blurring of the boundary between the programmer and his artifacts - the human qualities belong to the programmer and the code merely expresses these qualities as his/her proxy. On this view, a hacker saying a piece of code ‘got confused’ is really saying that he (or she) was confused about exactly what he wanted the computer to do, the code naturally incorporated this confusion, and the code expressed the programmer’s confusion when executed by crashing or otherwise misbehaving. Note that by displacing from “I got confused” to “It got confused”, the programmer is not avoiding responsibility, but rather getting some analytical distance in order to be able to consider the bug dispassionately.”
  3. “It has also been suggested that anthropomorphizing complex systems is actually an expression of humility, a way of acknowledging that simple rules we do understand (or that we invented) can lead to emergent behavioral complexities that we don’t completely understand.”

The Jargon file claims that “All three explanations accurately model hacker psychology, and should be considered complementary rather than competing.” I think the first “explanation” is completely unjustified. The second and third explanations do have some merit. However, I think there’s a simpler and more important reason: Language.

When we communicate with a human, we must use some language that will be more-or-less understood by the other human. Over the years people have developed a variety of human languages that do this pretty well (again, more-or-less). Human languages were not particularly designed to deal with computers, but languages have been honed over long periods of time to discuss human behaviors and their mental states (thoughts, beliefs, goals, and so on). The sentence “Sally says that Linda likes Tom, but Tom won’t talk to Linda” would be understood by any normal seven-year-old girl (well, assuming she speaks English).

I think a primary reason people use anthropomorphic terminology is that it’s much easier to communicate that way when discussing computer hardware and software using existing languages. Compare “the program got confused” with the overly long “the program executed a different path than the one expected by the program’s programmer”. Human languages have been honed to discuss human behaviors and mental states, so it is much easier to use languages this way. As long as both the sender and receiver of the message understand the message, the fact that the terminology is anthropomorphic is not a problem.

It’s true that anthropomorphic language can confuse some people. But the primary reason it confuses some people is that they still have trouble understanding that computers are mindless - that computers simply do whatever their instructions tell them. Perhaps this is an innate weakness in some people, but I think that addressing this weakness head-on can help counter it. This is probably a good reason for ensuring that people learn a little programming as kids - not because they will necessarily do it later, but because computers are so central to the modern world that people should have a basic understanding of them.

path: /misc | Current Weblog | permanent link to this entry