David A. Wheeler's Blog

Fri, 01 Sep 2006

Planet definition problems, and Pluto too

Well, at the last minute International Astronomical Union (IAU) changed their proposed definition of the term “planet”, and voted on it. Pluto is no longer a planet!

Well, maybe.

Usually definitions like this go through a lot of analysis; I think this one was rushed through at the last minute. I see three problems: It was an irregular vote, it’s vague as written, and it doesn’t handle faraway planets well. Let’s look at each issue in turn.

This was a pretty irregular vote, I think. As I noted, at the last minute the proposal changed, with no time for deeply examining it. There were 2,700 attendees, but only 424 astronomers (about 10%) voted on the proposal that “demoted” Pluto. And only a few days after the vote, 300 astronomers have signed a petition saying they do not accept this IAU definition - almost as many as voted in the first place. That doesn’t sound like consensus to me.

More importantly, it’s too vague. That’s not just my opinion; Space.com notes that there’s a lot of uncertainty about it. Now a planet has to control its zone… but Earth doesn’t, there are lots of objects that cross Earth’s orbit. Does this mean that Earth is not a planet? I haven’t seen any published commentary on it, but I think there’s even a more obvious problem - Neptune is clearly not a planet, because it hasn’t cleared out Pluto and Charon. A definition which is that vague is not an improvement.

But in my mind, the worst problem with this definition is a practical one: it doesn’t handle other planets around other stars well. We are too far away to observe small objects around other stars, and I think we will always be able to detect larger objects but not smaller ones in many faraway orbits. So when we detect an object in another galaxy with the mass of Jupiter, and it’s orbiting a star, is it a planet? Well, under this current definition we don’t know if it’s a planet or not. Why? Because we may not be able to know what else is there in orbit. And that is a real problem. I think it’s clear that we will always be able to observe some larger objects without being able to detect the presence of smaller ones. If we can’t use the obvious word, then the definition is useless - so we need a better definition instead.

I thought the previous proposal (orbits a star, enough mass to become round) was a good one, as I noted earlier in What’s a planet? Why I’m glad there’s an argument. I think they should return to that previous definition, or find some other definition that is (1) much more precise and (2) lets us use the term “planet” in a sensible way to discuss large non-stars that are orbiting faraway stars. Whether Pluto is in, or not. Of course, none of this affects reality; this is merely a definition war. But clear terminology is important in any science.

I still think that what’s great about this debate is that it has caused many people to discuss and think about what’s happening in the larger universe, instead of focusing on the transient. That is probably the most positive result of all.

path: /misc | Current Weblog | permanent link to this entry

GPL, BSD, and NetBSD - why the GPL rocketed Linux to success

Charles M. Hannum (one of the 4 originators of NetBSD) has posted a sad article about serious problems in the NetBSD project, saying “the NetBSD Project has stagnated to the point of irrelevance.” You can see the article or an LWN article about it.

There are still active FreeBSD and OpenBSD communities, and there’s much positive to say about FreeBSD and OpenBSD. I use them occasionally, and I always welcome a chance to talk to their developers - they’re sharp folks. Perhaps NetBSD will partly revive. But systems based on the Linux kernel (“Linux”) absolutely stomp the *BSDs (FreeBSD, OpenBSD, and NetBSD) in market share. And Linux-based systems will continue to stomp on the *BSDs into the foreseeable future.

I think there is one primary reason Linux-based systems completely dominate the *BSDs’ market share - Linux uses the protective GPL license, and the *BSDs use the permissive (“BSD-style”) licenses. The BSD license has been a lot of trouble for all the *BSDs, even though they keep protesting that it’s good for them. But look what happens. Every few years, for many years, someone has said, “Let’s start a company based on this BSD code!” BSD/OS in particular comes to mind, but Sun (SunOS) and others have done the same. They pull the *BSD code in, and some of the best BSD developers, and write a proprietary derivative. But as a proprietary vendor, their fork becomes expensive to self-maintain, and eventually the company founders or loses interest in that codebase (BSD/OS is gone; Sun switched to Solaris). All that company work is then lost forever, and good developers were sucked away during that period. Repeat, repeat, repeat. That’s enough by itself to explain why the BSDs don’t maintain the pace of Linux kernel development. But wait - it gets worse.

In contrast, the GPL has enforced a consortia-like arrangement on any major commercial companies that want to use it. Red Hat, Novell, IBM, and many others are all contributing as a result, and they feel safe in doing so because the others are legally required to do the same. Just look at the domain names on the Linux kernel mailing list - big companies, actively paying for people to contribute. In July 2004, Andrew Morton addressed a forum held by U.S. Senators, and reported that most Linux kernel code was generated by corporate programmers (37,000 of the last 38,000 changes were contributed by those paid by companies to do so; see my report on OSS/FS numbers for more information). BSD license advocates claim that the BSD is more “business friendly”, but if you look at actual practice, that argument doesn’t wash. The GPL has created a “safe” zone of cooperation among companies, without anyone having to sign complicated legal documents. A company can’t feel safe contributing code to the BSDs, because its competitors might simply copy the code without reciprocating. There’s much more corporate cooperation in the GPL’ed kernel code than with the BSD’d kernel code. Which means that in practice, it’s actually been the GPL that’s most “business-friendly”.

So while the BSDs have lost energy every time a company gets involved, the GPL’ed programs gain every time a company gets involved. And that explains it all.

That’s not the only issue, of course. Linus Torvalds makes mistakes, but in general he’s a good leader; leadership issues are clearly an issue for some of the BSDs. And Linux’s ability early on to support dual-boot computers turned out to be critical years ago. Some people worried about the legal threats that the BSDs were under early on, though I don’t think it had that strong an effect. But the early Linux kernel had a number of problems (nonstandard threads, its early network stack was terrible, etc.), which makes it harder to argue that it was “better” at first. And the Linux kernel came AFTER the *BSDs - the BSDs had a head start, and a lot of really smart people. Yet the Linux kernel, and operating systems based on it, jumped quickly past all of them. I believe that’s in large part because Linux didn’t suffer the endless draining of people and effort caused by the BSD license.

Clearly, some really excellent projects can work well on BSD-style licenses; witness Apache, for example. It would be a mistake to think that BSD licenses are “bad” licenses, or that the GPL is always the “best” license. But others, like Linux, gcc, etc., have done better with copylefting / “protective” licenses. And some projects, like Wine, have switched to a protective (copylefting) license to stem the tide of loss from the project. Again, it’s not as simple as “BSD license bad” - I don’t think we fully understand exactly when each license’s effects truly have the most effect. But clearly the license matters; this as close to an experiment in competing licenses as you’re likely to get.

Obviously, a license choice should depend on your goals. But let’s look more carefully at that statement, maybe we can see what type of license tends to be better for different purposes.

If your goal is to get an idea or approach widely used to the largest possible extent, a permissive license like the BSD (or MIT) license has much to offer. Anyone can quickly snap up the code and use it. Much of the TCP/IP code (at least for tools) in Windows was originally from BSD, I believe; there are even some copyright statements still in it. BSD code is widely used, and even when it isn’t used (the Linux kernel developers wrote their own TCP/IP code) it is certainly studied. But don’t expect the public BSD-licensed code to be maintained by those with a commercial interest in it. I haven’t noticed a large number of Microsoft developers being paid to improve any of the *BSDs, even though they share the same code ancestries in some cases.

If your goal is to have a useful program that stays useful long-term, then a protective (“copylefting”) license like the LGPL or GPL licenses has much to offer. Protective licenses force the cooperation that is good for everyone in the long term, if a long-term useful project is the goal. For example, I’ve noticed that GPL projects are far less likely to fork than BSD-licensed projects; the GPL completely eliminates any financial advantage to forking. The power of the GPL license is so strong that even if you choose to not use a copylefting license, it is critically important that an open source software project use a GPL-compatible license.

Yes, companies could voluntarily cooperate without a license forcing them to. The *BSDs try to depend on this. But it today’s cutthroat market, that’s more like the “Prisoner’s Dilemma”. In the dilemma, it’s better to cooperate; but since the other guy might choose to not cooperate, and exploit your naivete, you may choose to not cooperate. A way out of this dilemma is to create a situation where you must cooperate, and the GPL does that.

Again, I don’t think license selection is all that simple when developing a free-libre/open source software (FLOSS) program. Obviously the Apache web server does well with its BSD-ish license. But packages like Linux, gcc, Samba, and so on all show that the GPL does work. And more interestingly, they show that a lot of competing companies can cooperate, when the license requires them to.

path: /oss | Current Weblog | permanent link to this entry