David A. Wheeler's Blog

Thu, 30 Jul 2020

If you contribute to Free/Open Source Software, please take the FOSS Contributor Survey!

This survey is a collaboration between the Linux Foundation’s Core Infrastructure Initiative and the Laboratory for Innovation Science at Harvard. Some of the questions are specific to those who write software; if you contribute, but don’t write software, just skip those questions. The goal is to get a better understanding of how FOSS is developed so that we can best work out how to improve its security and sustainability.

Also: please tell others who develop this software about the survey!

One interesting complication about this survey is that it’s difficult to get the word out about such a general survey. People talk about the “open source software community”, but in practice there isn’t one such community; there are many communities with some overlap. I don’t want to spam people who have never expressed any interest in information like this.

I’m currently talking with some folks in the Linux Foundation leadership about sending a one-time email only to developers who are already signed up for Linux Foundation mailing lists that are focused on developing open source software. We don’t want to spam people, but I think it’s reasonable to believe that people on those mailing lists are interested in information related to the development of open source software. One problem with sending to multiple mailing lists is that we don’t want to annoy people by having them receive multiple copies, so we want to work out a way so an individual only gets one copy.

I’ve never done this before, and I hate spam myself. So I’m first checking with Linux Foundation leaders and program managers to see if they think this is reasonable. I think it is, but it’s easy to justify anything to yourself, so I’m waiting to hear from others about what they think.

So getting back to the point - if you contribute to Free/Open Source Software, please take the FOSS Contributor Survey!

path: /oss | Current Weblog | permanent link to this entry

Fri, 12 Jun 2020

Linux kernel earns gold!

The Linux kernel has earned the CII Best Practices gold badge. The CII Best Practices badge has three badge levels: passing, silver, and gold. Gold badges are especially hard to get, and I congratulate them! More info here: Linux kernel earns CII best practices gold badge

Fri, 03 Apr 2020

I am at the Linux Foundation!

On April 1, 2020, I started working at the Linux Foundation!

My new title is “Director, Open Source Supply Chain Security”. I’ll be working to improve the security of open source software. I look forward to working with many others on this important problem.

So please wish me luck… and stay tuned for more.

Tue, 18 Feb 2020

Census II Report on Open Source Software

The Linux Foundation and the Laboratory for Innovation Science at Harvard have just released a new report: “Vulnerabilities in the Core: Preliminary Report and Census II of Open Source Software” by Frank Nagle, Jessica Wilkerson, James Dana, and Jennifer L. Hoffman, 2020-02-14. Just click on “Download Report” when you get there. A summary is available from Harvard. Here’s a quick introduction to the paper.

Their long-term goal is to figure out what FOSS packages are most critical through data analysis. This turns out to be extremely difficult, as discussed in the paper, and they expressly state that their current results “cannot - and do not purport to - be a definitive claim of which FOSS packages are the most critical”. That said, they have developed a method as a “proof of concept” to start working towards that answer.

They describe their approach in detail. Here’s a quick summary. First they use data from Software Composition Analysis (SCA) and application security companies, including Snyk and Synopsys Cybersecurity Research Center, to identify components used in actual systems. They then use dependency analysis (via libraries.io) to identify indirect (transitive) dependencies. Finally, they average the Z-scores to provide normalized rankings.
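To make the last step concrete, here is a minimal Python sketch of averaging Z-scores across datasets to get a normalized ranking. It is only an illustration of the general idea, not the report's actual procedure: the package names and usage counts are made up, and averaging only over the datasets in which a package appears is an assumption of this sketch.

```python
from statistics import mean, pstdev


def z_scores(usage):
    """Map each package to its Z-score within a single dataset."""
    mu = mean(usage.values())
    sigma = pstdev(usage.values())
    if sigma == 0:
        # Every package has the same count, so none stands out.
        return {name: 0.0 for name in usage}
    return {name: (count - mu) / sigma for name, count in usage.items()}


def rank_by_average_z(datasets):
    """Average each package's Z-scores across datasets, highest first.

    Assumption of this sketch: a package missing from a dataset is
    simply skipped for that dataset rather than counted as zero.
    """
    per_dataset = [z_scores(d) for d in datasets]
    packages = set().union(*(d.keys() for d in datasets))
    averaged = {
        pkg: mean(z[pkg] for z in per_dataset if pkg in z)
        for pkg in packages
    }
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


# Hypothetical usage counts from two hypothetical data providers.
dataset_a = {"libfoo": 900, "libbar": 300, "libbaz": 300}
dataset_b = {"libfoo": 50, "libbar": 10, "libbaz": 30}

ranking = rank_by_average_z([dataset_a, dataset_b])
for name, score in ranking:
    print(f"{name}: {score:+.3f}")
```

Converting each dataset to Z-scores first means that a provider reporting large raw counts does not automatically dominate one reporting small counts.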

Here are some key lessons learned from the report (Chapter 7):

There’s a need for a standardized naming scheme for software components.
There’s an increasing importance of individual developer account security.
Legacy software persists in the open source space.

Also, here’s an interesting nugget: “These statistics illustrate an interesting pattern: a high correlation between being employed and being a top contributor to one of the FOSS packages identified as most used.”

I’m on the CII Steering Committee, so I did comment on an earlier draft, but credit goes to the actual authors.

Sat, 12 Oct 2019

Gource visualization (including set.mm)

Software and mathematics are often difficult for others to visualize. Computer hardware engineers can often have cool props to distribute during their talks, but software developers and mathematicians work with ideas of the mind - no physical objects involved.

This can sometimes make it difficult to explain important ideas like open source software (OSS). The idea of “people collaborating to produce something” is easy enough, but getting a true visceral understanding of what happens can be hard.

Gource is a cool visualization tool that makes it easy to see “collaboration in action”. The Gource project even has a web page showing some examples of Gource visualization.

I recently created a Gource visualization of the Metamath set.mm project. Some context is important here. In mathematics, claims are supposed to be rigorously proven, but humans are fallible; they make mistakes, and others often miss those mistakes. The solution to this problem is to rigorously describe mathematics in a formal way so that every step can be rigorously and automatically checked by a computer. This turns out to be difficult, and requires that a lot of people work together. Now… how can you visualize people working together to rigorously prove mathematical claims? One way is to use Gource… because while it doesn’t show everything, you at least get a sense of the collaboration. In this case, 48 people have contributed so far.

This visualization shows a common feature: in many cases, a single person starts and makes all the contributions for a while. The same thing happens if you view, for example, a Gource visualization of the Python programming language.

Gource is itself OSS, so you can download it and use it to create your own visualizations. I strongly recommend that you automate doing it as much as possible. For example, if you process data first, use a script to automate processing the data. You’ll need to give Gource various options; store options in its config file or a script.
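As one possible way to keep the options in a script, here is a small Python sketch that builds a Gource command line and pipes its frames to ffmpeg. The repository path, title, timing values, and output file name are placeholders for illustration, and the specific option choices are assumptions rather than the settings used for the set.mm video; check them against the Gource and ffmpeg documentation for your installed versions.

```python
import shlex
import subprocess


def gource_command(repo_path, title, seconds_per_day=0.1, resolution="1280x720"):
    """Build the Gource invocation as an argument list (no shell quoting issues)."""
    return [
        "gource",
        repo_path,
        "--viewport", resolution,
        "--seconds-per-day", str(seconds_per_day),
        "--auto-skip-seconds", "1",
        "--title", title,
        "--output-framerate", "60",
        "--output-ppm-stream", "-",  # write raw PPM frames to stdout
    ]


def ffmpeg_command(output_file):
    """Build an ffmpeg invocation that encodes PPM frames read from stdin."""
    return [
        "ffmpeg", "-y",
        "-r", "60",
        "-f", "image2pipe",
        "-vcodec", "ppm",
        "-i", "-",
        "-vcodec", "libx264",
        "-pix_fmt", "yuv420p",
        output_file,
    ]


def render(repo_path, title, output_file):
    """Run Gource and pipe its frames into ffmpeg; returns ffmpeg's exit code."""
    gource = subprocess.Popen(gource_command(repo_path, title), stdout=subprocess.PIPE)
    ffmpeg = subprocess.Popen(ffmpeg_command(output_file), stdin=gource.stdout)
    gource.stdout.close()  # let Gource see a closed pipe if ffmpeg exits early
    ffmpeg.wait()
    gource.wait()
    return ffmpeg.returncode


# Show the command that would be run ("set.mm" is a placeholder for a local clone).
print(shlex.join(gource_command("set.mm", "Metamath set.mm")))
```

Calling `render("set.mm", "Metamath set.mm", "setmm.mp4")` would then produce the video, provided both programs are installed. Keeping the whole invocation in one script makes it easy to regenerate the video later with the same settings.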

If you create a Gource video, I strongly recommend adding some music or at least an audio commentary. If you add music, make sure it’s legal to add; the safe route is to use music released under open licenses such as Creative Commons Attribution (CC-BY) or CC0 Public Domain Dedication (CC0). Beware of the “non-commercial use” licenses - your releases might count as “commercial” even if you don’t think they do (talk to a lawyer if you want to go down that path). A great place to start for Gource music is audionautix.com, which has released lots of music under the Creative Commons Attribution 3.0 Unported License; you can select from lots of different styles and get some great options. Improving Gource Videos with Background and Audio has some tips and instructions.

In conclusion: enjoy my Gource visualization of the Metamath set.mm project… and perhaps it will inspire you to do something similar. I’ve embedded the video below so you can easily view it (if you like):

Sat, 25 May 2019

GitHub Maintainer Security Advisories

GitHub just made a change that I think will make a big improvement to the security of open source software (OSS). It’s now possible to privately report vulnerabilities to OSS projects on GitHub via maintainer security advisories! This wasn’t possible before, and you can blame me (in part), because I’m the one who got this ball rolling. I also want to give a big congrats to the GitHub team, who actually made it happen.

Here are some details, in case you’re curious.

As you probably know, there are more OSS projects on GitHub than any other hosting service. However, there has been no way to privately report security vulnerabilities on OSS projects. It’s hard to fault GitHub too much (they’re providing a service for free!), yet because so much software is maintained on GitHub this has led to widespread problems in reporting and handling vulnerabilities. It can be worked around, but this has been a long-standing systemic problem with GitHub.

Why is this a problem? In a word: attackers. Ideally software would have no defects, including vulnerabilities. Since vulnerabilities can harm users, developers should certainly be using a variety of techniques to limit the number and impact of vulnerabilities in the software they develop. If you’re developing OSS, a great way to see if you’re doing that (and show others the same) is to get a CII Best Practices badge from the Linux Foundation’s Core Infrastructure Initiative (I lead this effort). But mistakes sometimes happen, no matter what you do, so you need to be prepared for them. It’s hard to respond to vulnerability reports if it’s hard to get the vulnerability reports or discuss them within a project. Of course, a project needs to rapidly fix a vulnerability once it is reported, but we need to make that first step easy.

In September 2018 I went to a meeting at Harvard to discuss OSS security (in support of the Linux Foundation). There I met Devon Zuegel, who was helping Microsoft with their recently-announced acquisition of GitHub. I explained the problem to her, and she agreed that this was a problem that needed to be fixed. She shared it with Nat Friedman (who was expected to become the GitHub CEO), who also agreed that it made sense. They couldn’t do anything until the acquisition was complete, but they planned to make that change once it was. The acquisition did complete, so the obvious question is, did they make the change? Well…

I am very happy to report that GitHub has just announced the beta release of maintainer security advisories, which allow people to privately report vulnerabilities without immediately alerting every attacker out there. My sincere thanks to Devon Zuegel, Nat Friedman, and the entire team of developers at GitHub for making this happen.

This seems to be part of a larger effort by GitHub to support security (including for OSS). GitHub’s security alerts make it easy for GitHub-hosted projects to learn about vulnerable dependencies (that is, a version of a software component that you depend on but that is vulnerable).

It’s easy to get discouraged about software security, because the vulnerabilities keep happening. Part of the problem is that most software developers know very little about developing secure software. After all, almost no one is teaching them how to do it (I teach a graduate class at George Mason University to try to counter that problem). I hope that over time more developers will learn how to do it. I also hope that more and more developers will use more and more tools that will help them create secure software, such as my flawfinder and Railroader tools. Tools can’t replace knowledge, but they are a necessary piece of the puzzle; putting tools into a CI/CD pipeline (and an auditing process if you can afford one) can eliminate a vast number of problems.

These changes show that it is possible to make systemic changes to improve security. Let’s keep at it!

Fri, 10 May 2019

The year of Linux on the desktop

For those who know their computer history, wild things are going on regarding Linux this year.

Linux is already in widespread use. For years the vast majority of smartphones have run Android, and Android runs on Linux, so most smartphones run on Linux. As of November 2018, 100% of the top 500 supercomputers worldwide run on Linux. Best estimates for servers using Linux are around 66.7%, and Linux is widely used in the cloud and in embedded devices.

But something different is going on in 2019. All Chromebooks are also going to be Linux laptops going forward. Later this year Microsoft will include the Linux kernel as a component in Windows. In a sense, 2019 is the year of the Linux desktop. This is not the way it was envisioned in the past, but perhaps that’s what makes it most interesting. No, it does not mean that everyone is interacting directly with Linux as their main laptop OS, and so you can certainly argue that this doesn’t count. But increasingly that measurement is less important; people today access computers via browsers, not the underlying OS, and that system is often running and/or developed using Linux.

Wed, 10 Apr 2019

Subversion of bootstrap-sass

A malicious backdoor has been found in the popular open source software library bootstrap-sass. Its impact was limited - but the next attack might not be. Thankfully, there are things we can learn and do to reduce those risks… but that requires people to think them through.

See my essay Subversion of bootstrap-sass for more about that!

Sat, 09 Feb 2019

Railroader: Security static analysis tool for Ruby on Rails (Brakeman fork)

I’ve kicked off the Railroader project to maintain a security static analysis tool for Ruby on Rails that is open source software. If you are developing with Ruby on Rails, please consider using Railroader. We would also really love contributions, so please contribute!

A security static analysis tool (analyzer) examines software to help you identify vulnerabilities (without running the possibly-vulnerable program). This helps you find and fix vulnerabilities before you field your web application. Ruby on Rails is a popular framework for developing web applications; sites that use Rails include GitHub, Airbnb, Bloomberg, Soundcloud, Groupon, Indiegogo, Kickstarter, Scribd, MyFitnessPal, Shopify, Urban Dictionary, Twitch.tv, GitLab, and the Core Infrastructure Initiative (CII) Best Practices Badge.

In the past the obvious tool for this purpose was Brakeman. However, Brakeman has switched to the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 Public License (CC-BY-NC-SA-4.0). This is not an open source software license since it cannot be used commercially (an OSS license cannot discriminate against a field of endeavor). Similarly, it is not a free software license (since you cannot run the program as you wish / for any purpose). You can verify this by looking at the Brakeman 4.4.0 release announcement, the SPDX license list, Debian’s “The Debian Free Software Guidelines (DFSG) and Software Licenses”, Various Licenses and Comments about Them (Free Software Foundation), and Fedora’s Licensing:Main (Bad Licenses list). Railroader continues using the original licenses: MIT for code and CC-BY-3.0 for the website. MIT, of course, is a very well-known and widely-used open source software license.

If you are currently using Brakeman, do not update to Brakeman version 4.4.0 or later until you first talk with your lawyer. At the very least, if you plan to use newer versions of Brakeman, check their new license carefully to make sure that there is no possibility of a legal issue. This license change was part of a purchase of Brakeman by Synopsys. Synopsys is a big company, and they definitely have the resources to sue people who don’t obey their legal terms. Even if they didn’t, it is not okay to use software when you don’t have the right to do so. Either make sure that you have no legal issues… or just switch to Railroader, where nothing has changed.

Unfortunately, it is really easy to “just upgrade to the latest release” of Brakeman without realizing that this is a major license change. I suspect a lot of people will just automatically download and run the latest version, and have no idea that this is happening. I only noticed because I routinely use software license checkers (license_finder in my case) so that I immediately notice license changes in a newer version. I strongly recommend adding static source code analyzers and license checkers as part of your continuous integration (CI).
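To illustrate the kind of check a license checker performs in CI, here is a minimal Python sketch that compares the licenses recorded before and after a dependency update and flags anything outside an approved set. It is a simplified stand-in, not how license_finder works internally; the dictionaries and the approved-license list are hypothetical inputs you would have to produce yourself (for example, from your dependency manager's metadata).

```python
# Hypothetical project policy: licenses that need no further review.
APPROVED_LICENSES = {"MIT", "Apache-2.0", "BSD-3-Clause"}


def license_changes(old, new):
    """Return {package: (old_license, new_license)} for packages whose license changed."""
    return {
        name: (old[name], new[name])
        for name in sorted(old.keys() & new.keys())
        if old[name] != new[name]
    }


def unapproved(licenses, approved=APPROVED_LICENSES):
    """Return the packages whose license is not in the approved set."""
    return {name: lic for name, lic in licenses.items() if lic not in approved}


# Illustrative snapshots of a project's dependencies before and after an update.
before_update = {"brakeman": "MIT", "rails": "MIT"}
after_update = {"brakeman": "CC-BY-NC-SA-4.0", "rails": "MIT"}

changed = license_changes(before_update, after_update)
blocked = unapproved(after_update)

for name, (old_license, new_license) in changed.items():
    print(f"{name}: license changed from {old_license} to {new_license}")

# In CI, a non-empty result would typically fail the build, e.g.:
#     raise SystemExit(1 if blocked else 0)
```

Failing the build on a license change forces a human to look at it before the new version is adopted, which is the point of running such a check automatically.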

We assume that “Brakeman” is now trademarked by Synopsys, Inc, so we’ve tried to rename everything so that the projects are clearly distinct. If we’ve missed something, please let us know and we’ll fix it. The term “Railroader” is a play on the word Rails, but it is obviously a completely different word. Railroader shares a common code base historically with Brakeman, and that’s important to explain, but they are not the same projects and we are expressly trying to not infringe on any Brakeman trademark. It’s obviously legal to copy and modify materials licensed under the MIT and CC-BY-3.0 licenses (that’s the purpose of these licenses), so we believe there is no legal problem.

I think I have a reasonable background for starting this project. I created flawfinder, a security static analysis tool for C/C++, and have maintained it since 2001. I literally wrote the book on developing secure software; see my book Secure Programming HOWTO. I even teach a graduate class at George Mason University (GMU) on how to develop secure software. For an example of how I approach securing software in an affordable way, see my video How to Develop Secure Applications: The BadgeApp Example (2017-09-18) or the related document BadgeApp Security: Its Assurance Case. I have also long analyzed software licenses, e.g., see The Free-Libre / Open Source Software (FLOSS) License Slide, Free-Libre / Open Source Software (FLOSS) is Commercial Software, and Publicly Releasing Open Source Software Developed for the U.S. Government.

While Railroader is a project fork, we hope that this is not a hostile fork. We will not accept software licensed only under CC-BY-NC-SA-4.0, since that is not an OSS license. But we’ll gladly accept good contributions from anyone if they are released under the original OSS licenses (MIT for software, CC-BY-3.0 for website content). If the Brakeman project wants to cooperate in some way, we’d love to talk! We are all united in our desire to squash out vulnerabilities before they are deployed. In addition, we’re grateful for all the work that the Brakeman community has done.

So, again: If you are developing with Ruby on Rails, please consider using Railroader. We would also really love contributions, so please contribute!

Mon, 19 Nov 2018

Get your CII best practices badge!

Are you developing open source software (OSS)? Selecting some? If you’re developing OSS, earn a best practices badge from the Linux Foundation Core Infrastructure Initiative (CII). If you’re selecting OSS, prefer OSS that has earned a badge. The badge shows that the project is applying the best practices for today’s projects. Check out this short YouTube video summary or the CII Best Practices badge website.

Wed, 04 May 2016

Get your CII best practices badge!

If you’re involved in a free / libre / open source software (FLOSS) project, go to bestpractices.coreinfrastructure.org and get your best practices badge!

The Linux Foundation’s Core Infrastructure Initiative (CII) has just announced its CII best practices badging program for FLOSS projects. It’s a free program that lets developers explain how they follow best practices, and if they do, they can get a badge that they can show on their GitHub page or anywhere else. Early badge earners include the Linux kernel, Curl, GitLab, OpenBlox, OpenSSL, Node.js and Zephyr.

The idea is straightforward. The Heartbleed vulnerability in OpenSSL made it obvious that there are widely-accepted best practices that not everyone is following - and that even includes important projects. This isn’t just speculation; if you compare OpenSSL before Heartbleed with current OpenSSL the difference is striking. I think it’s clear that if more projects would apply generally-accepted best practices, we’d have more secure software. This badging process helps projects identify those best practices, determine if they meet them, and show everyone else that they’re meeting them.

The web application and criteria are being maintained as an open source software project, so we’d love to have you! I say “we” because I’m leading this project… but it’s not just me, and we would love to have you involved.

More detail is in the Linux Foundation press release about the best practices badging project.

Thu, 10 Mar 2016

US government - Reusable and Open Source Software

The US White House has announced (in its blog) Leveraging American Ingenuity through Reusable and Open Source Software. They state that, “Today, we’re releasing for public comment a draft policy to support improved access to custom software code developed for the Federal Government.” They are accepting comments on this draft policy via GitHub pull requests, GitHub issues, or email. I definitely plan to take a look, and I’m sure they would like feedback from many people.

Note that I also posted this information on Twitter.

Mon, 01 Feb 2016

Using open source software to help technology transition of research

If you’re doing software research and development (especially on how to improve computer security), and are thinking about using an open source software (OSS) approach but don’t know a lot about it, here’s something that may help: Using an Open Source Software Approach for Cybersecurity Technology Transition (IDA paper P-5279, aka the “PI guide”). If you’re an old hand at developing free / libre / open source software (FLOSS or OSS), you probably know most of this information. However, I’ve found that a lot of people could use a hand. Here’s that helping hand.

Fri, 09 Oct 2015

Government adoption of OSS

If you’re interested in open source software (OSS), or in how governments can work better, take a look! Mark Bohannon has posted the article “U.S. report highlights positive elements of government open source adoption” on Opensource.com. This discusses a paper Tom Dunn and I wrote, “Open Source Software in Government: Challenges and Opportunities”, and discusses a few things that have happened since. Enjoy!

Wed, 30 Sep 2015

Redeveloping open source software in Linux Foundation projects: $5 billion and 30 years

The Linux Foundation now estimates it would cost $5 billion and 30 years to redevelop “the software residing in The Linux Foundation’s collaborative projects”. That’s not even all free / libre / open source software (FLOSS). Of course, there are many caveats, but that’s still an intriguing number; it provides a simple view of just how big FLOSS has become. They also credit me, since they applied the same general process I developed earlier in my “More than a Gigabuck” paper. Thanks! If you’re interested in FLOSS, I think you’ll find this paper intriguing.

Fri, 27 Mar 2015

Z3 is OSS!

Z3 has been released as open source software under the MIT license! This is great news. Z3 is a good satisfiability modulo theories (SMT) solver / theorem prover from Microsoft Research. An SMT solver accepts a set of constraints (such as “a<5 and a>1”) and tries to produce values that satisfy all the constraints. A satisfiability (SAT) solver does this too, but SAT solvers can only work with boolean variables; SMT solvers can handle other types, such as integers. Here is a Z3 tutorial.
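To show what “produce values that satisfy all the constraints” means, here is a tiny Python sketch that searches a small range of integers for a value meeting every constraint. This is plain exhaustive search, not how Z3 or any real SMT solver works (they use far more sophisticated decision procedures and are not limited to a finite range); the function name and the search range are made up for illustration.

```python
def find_model(constraints, candidates):
    """Return the first candidate satisfying every constraint, or None if none does."""
    for value in candidates:
        if all(check(value) for check in constraints):
            return value
    return None


# The constraints from the text: a < 5 and a > 1.
satisfiable = [lambda a: a < 5, lambda a: a > 1]
model = find_model(satisfiable, range(-10, 11))
print("a =", model)

# Contradictory constraints have no solution in any range.
unsatisfiable = [lambda a: a < 1, lambda a: a > 5]
no_model = find_model(unsatisfiable, range(-10, 11))
print("a =", no_model)
```

A real solver gives the same kind of answer (a satisfying assignment, or a report that none exists) but without having to enumerate candidates.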

SMT solvers are basically lower-level tools that have many uses for building larger capabilities, because many problems require solving logical formulas to find a solution.

I am particularly interested in the use of SMT solvers to help prove that programs do something or do not do something. Why3 is a platform that lets you write programs and their specifications, and then calls out to various provers to try to determine if the claims are true. By itself Why3 only supports its WhyML language, but Why3 can be combined with other tools to prove statements in other languages. Those include C (using Frama-C and a plug-in), Java, and Ada. People have been able to prove tiny programs for decades, but scaling up to bigger programs in practice requires a lot of automation. I think this approach of combining many different tools, with different strengths, is very promising.

The more tools that are available to Why3, the more likely it will solve problems automatically. That’s because different tools use different heuristics and focus on different issues, resulting in different ones being good at different things. There are already several good SMT solvers available as OSS, including CVC4 and alt-ergo.

Now that Microsoft has released Z3 as OSS, there is yet another strong OSS SMT solver that tools like Why3 can use directly. In short, the collection of OSS SMT solvers has just become even stronger. There’s a standard for SMT solver inputs, the SMT-LIB format, so it’s not hard to take advantage of many SMT solvers. My hope is that this will be another step in making it easier to have strong confidence in software.
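As an illustration of the SMT-LIB format, here is a short Python sketch that generates an SMT-LIB version 2 script for the earlier “a<5 and a>1” example. The helper function is hypothetical, and the choice of logic (QF_LIA, quantifier-free linear integer arithmetic) is an assumption that fits this small example rather than a general recommendation.

```python
def smtlib_script(declarations, assertions, logic="QF_LIA"):
    """Build an SMT-LIB v2 script from (name, sort) pairs and assertion strings."""
    lines = [
        "(set-option :produce-models true)",
        f"(set-logic {logic})",
    ]
    lines += [f"(declare-const {name} {sort})" for name, sort in declarations]
    lines += [f"(assert {expr})" for expr in assertions]
    lines += ["(check-sat)", "(get-model)"]
    return "\n".join(lines)


# One integer variable with the two constraints from the text, in prefix notation.
script = smtlib_script([("a", "Int")], ["(< a 5)", "(> a 1)"])
print(script)
```

Because the format is shared, the same text could be saved to a file and handed to different solvers that accept SMT-LIB input, which is what makes it practical for a tool to try several of them.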

Mon, 20 Oct 2014

Open Source Software in U.S. Government

The report “Open Source Software in Government: Challenges and Opportunities” is available to the public (you can jump to the “Download full report” link at the bottom). This paper, which I co-authored, discusses key challenges and opportunities in the U.S. government application of open source software (OSS). It became publicly available only recently, even though it was finished a while back; I hope it’s been worth the wait. If you’re interested in the issues of OSS and government, I think you’ll find this report very illuminating.

Wed, 21 May 2014

On Dave and Gunnar show

There is now an interview of me on the Dave and Gunnar show (episode #51). I talk mostly about How to prevent the next Heartbleed. I also talk about my FLOSS numbers database (as previously discussed) and vulnerability economics. There was even a mention of my Fully Countering Trusting Trust through Diverse Double-Compiling work.

Since the time of the interview, more information has surfaced about Heartbleed. Traditional fuzzing could not find Heartbleed, but it looks like some fuzzing variants could even if the OpenSSL code was unchanged; see the latest version for more information. If you learn more information relevant to the paper, let me know!

Thu, 08 May 2014

FLOSS numbers database!

If you are doing research related to Free / Libre / Open Source Software (FLOSS), then I have something that may be useful to you: the FLOSS numbers database.

My paper Why Open Source Software / Free Software (OSS/FS, FLOSS, or FOSS)? Look at the Numbers! is a big collection of quantitative studies about FLOSS. Too big, in fact. There have been a lot of quantitative studies about FLOSS over the years! A lot of people want to query this information for specific purposes, and it is hard to pull out just the parts you want from a flat document. I had thought that as FLOSS became more and more common, fewer people would want this information… but I still get requests for it.

So I am announcing the FLOSS numbers database; it provides the basic information in spreadsheet format, making it easy to query for just the parts you want. My special thanks go to Paul Rotilie, who worked to get the data converted from my document format into the spreadsheet.

If you want to discuss this database, I have set up a discussion group: Numbers about Free Libre Open Source Software. If you are doing research and need or use this kind of information, please feel free to join. If you just need a presentation based on this, you might like my Presentation: Why Free-Libre / Open Source Software (FLOSS or OSS/FS)? Look at the Numbers!.

This database is the sort of thing that if you need it, you really need it. I am sure it is incomplete… but I am also sure that with your help, we can make it better.

Thu, 24 Apr 2014

Opensource.com interview

Opensource.com has posted an interview of me, titled “US government accelerating development and release of open source”. In this interview I describe the current state of the use of open source software by the US federal government and the challenges of the Federal acquisition system, and I also discuss what may happen next. Enjoy!
