Open Source Mergers and the Circle of UNIX

Lloyd Budd wrote a post today entitled “GPL Encourages Collaboration” that was highly serendipitous and that triggered a flood of thoughts, some of which I’ll try to capture here. Earlier today, I was rereading an interview I did with Sunir Shah a few months back for a paper I’m writing, and I latched onto a point that Sunir made about the proliferation of Wiki communities:    (MHR)

I see so many people start their own Wikis. They go to these other Wikis that are disorganized. They don’t feel like jumping in and learning everything because everything’s in different places, and they are not coherent. So they start their own.    (MHS)

There are now thousands of public Wikis, and they are all balkanized, because no one wants to collaborate. I understand that. It’s human nature. You collaborate within your group, but you don’t collaborate outside of that group.    (MHT)

On the one hand, I encourage the proliferation of Wikis, because if you disagree with someone, you should be able to do your own thing. On the other hand, true collaboration is showing a willingness to actually pull these communities together, a willingness to subsume your immediate need to be a true leader, build relationships, and pull people together.    (MHU)

Sunir’s insight is an important one. The existence of a commons isn’t in itself enough to guarantee collaboration. Ultimately, collaboration is intentional. You have to want to collaborate with others, and you have to be proactive about it. And it only takes two for starters.    (MHV)

That said, if you want to encourage collaboration, you have to start by eliminating as many barriers as possible. The first barriers are Shared Understanding and Shared Language. In the Open Source community, both of these have been codified in the form of Open Source licenses. The main reason for the effectiveness of these licenses as vehicles for Shared Understanding is Richard Stallman.    (MHW)

When RMS originally wrote the GPL, he embedded the philosophy behind the terms in the license itself. It was not simply a legal document, but an essay on the principles underlying software freedom. Regardless of how you feel about the GPL or RMS, these remain the fundamental principles behind why Open Source works, and more importantly, why collaboration is so rampant in Open Source communities.    (MHX)

However, as Sunir notes, this isn’t enough to guarantee collaboration. We see a lot of redundancy in Open Source, just as we see lots of redundancy in Wiki communities. And that’s okay. There are sometimes good reasons for this redundancy, and it’s healthy to facilitate it. But eventually, you want convergence also.    (MHY)

This is the crux of Lloyd’s post. He writes, “Forking is good, but not as good as merging.”    (MHZ)

Lloyd also cites a few examples (WordPress, gcc) and asks for others. There are tons of good ones, but my favorite is a bit obscure: UNIX.    (MI0)

That’s right, UNIX. Yes, the history of UNIX is fraught with ugliness, some of which still continues today. But it’s also one of the greatest examples of Open Source forking and merging. This is important, because it’s also the birthplace of the modern Open Source movement.    (MI1)

Here’s the quick summary. UNIX came from AT&T’s Bell Labs, and the original license terms were ambiguous (a massive understatement if there ever was one) for a number of reasons, the main one being that most people weren’t thinking about software as IP at the time. Stallman, of course, was one of the exceptions, and in 1984, he started the GNU project, whose goal was to reimplement UNIX as free software (the GPL itself came a few years later, in 1989).    (MI2)

A few years earlier at UC Berkeley, Bill Joy had taken a different approach. He simply forked AT&T UNIX into what is now known as BSD UNIX. The new code eventually was distributed under the BSD license, which essentially allowed anyone to do whatever they wanted with the code, as long as they gave credit to BSD and didn’t sue anybody. This led to many, many new forks, especially proprietary ones from Sun (cofounded by Joy), IBM, DEC, SGI, and many, many others.    (MI3)

With the advent of the 386 came a number of PC forks in the late 1980s and early 1990s, including 386BSD (which begat FreeBSD, NetBSD, and OpenBSD), BSDI, SCO, and MINIX (Andrew Tanenbaum’s from-scratch teaching clone, which inspired Linux). At this point, most versions of UNIX were also borrowing heavily from the GNU project.    (MI4)

By the late 1990s, we started seeing massive convergence. IBM, Sun, DEC (now part of HP), SGI, and SCO (yes, SCO) all embraced Linux, and started merging some of their proprietary capabilities into the Linux kernel. Apple built Mac OS X on a FreeBSD foundation.    (MI5)

What the heck happened? You could attribute this phenomenon to market consolidation, but that would only be telling part of the story. Market forces indeed played a strong role, just as they play a strong role in anything that involves… well, markets. But the important points are these. The ability to fork UNIX into both proprietary and free versions allowed the market to innovate rapidly, and that innovation was critical to both the success of UNIX and to the growth of the industry as a whole. However, the existence of a free version of UNIX was critical in enabling the consolidation of all of these different UNIXes, even if they weren’t necessarily formal mergers. Why were all of these competing companies willing to throw their hat into the Linux ring? Because of the Open Source license, which protected companies from their competitors hijacking the commons and using their own code against them.    (MI6)

Ironically, the story of UNIX is also one of the reasons I believe that the GPL’s terms are unnecessary for the proliferation of software freedom. The BSD license enabled proprietary forks, which arguably enabled innovation that wouldn’t have happened otherwise. In fact, it’s still enabling this today; look at Mac OS X. In the end, however, culture won out; these same proprietary companies are all supporting Open Source software, not because they had to (although they do now), but because they wanted to. Empowerment is stronger than enforcement when it comes to communities and collaboration.    (MI7)

I can’t resist one more small rant, a teaser if you will. Most Web 2.0 companies seem to have learned the superficial lessons of Open Source, but they’ve missed the main point. The argument that Open APIs obviate the need for Open Source completely ignores what history has taught us over and over again. One of these days, I’ll blog about this.    (MI8)

Montreal WikiWednesday Today

I’ll be dropping by WikiWednesday in Palo Alto tonight, but I can’t stay for long. My softball team dominated the regular season, and the playoffs are tonight (assuming it doesn’t rain). Although I’m bummed I’ll miss most of the festivities in Palo Alto, I really wish I could be in Montreal, home of next year’s RecentChangesCamp. Tonight’s Montreal WikiWednesday will be excellent, with Wiki celebrities such as Sunir Shah, Evan Prodromou, Seb Paquet, Alain Desilets, and Anne Goldenberg (who’s doing an awesome job organizing the next RecentChangesCamp) in attendance. If you live in Montreal, definitely drop by. Have fun, folks, and I’ll look forward to reading about it on your blogs!    (LB4)

Creole Wiki Markup, WikiOhana, and More Ward Wisdom

Several folks asked me on Saturday what I thought about the conference, and I kept saying, “It’s been great, except I haven’t had any interesting conversations about Wikis.” That changed in an unexpectedly generative way on Saturday afternoon, resulting in a new little cooperative effort we’re calling WikiOhana.    (KZ1)

At Thursday night’s conference party, I got reacquainted with Chuck Smith, whom I had met at Wikimania last year. Chuck is the founder of Esperanto Wikipedia and was the one who first introduced Brion Vibber — now the lead developer of Mediawiki — to the Wikipedia community. At the conference last year, Chuck met Christoph Sauer, a German Wiki researcher, and the two of them started working together.    (KZ2)

At the party, Chuck told me that he and Christoph had been working on a proposed WikiMarkup standard, and I loudly grimaced in response. At WikiSym last October, I made it clear to several people that I thought trying to standardize WikiMarkup was a noble waste of time.    (KZ3)

First, people are really attached to their own markup. Previous discussions about standardization usually started and ended with, “Great idea. Let’s just standardize on mine.”    (KZ4)

Second, WYSIWYG in theory obviates the value of a standard markup from a user’s perspective. It seemed more valuable to work out a standard interchange language, which WYSIWYG widgets could use to interact with different Wiki engines and which would make data migration between different Wiki engines easier. It also seemed like it would be easier to come to agreement on an interchange language. In fact, folks have already worked out a good proposal for an XHTML interchange language on the Interwiki mailing list (now migrated to the wiki-standards list) and on CommunityWiki.    (KZ5)

What made all this talk worse for me was that I thought that people were trying to go about this the wrong way. We were not going to come to agreement by group authoring a spec, then getting everyone to agree on it. The only way this was going to work was if a few Wikis came to some agreement and actually implemented the changes. This wouldn’t necessarily result in a standard, but it could act as a catalyst that could lead to a standard. Sunir Shah has been advocating this approach for a long time, to no avail. I would have organized such an effort myself, except that I didn’t think it was important enough to spend any time on it.    (KZ6)

Chuck and Christoph changed my mind. Chuck explained that he and Christoph had extensively analyzed existing Wiki markup, and they had taken pains to come up with a small subset of things that would be as noncontroversial as possible. (There had been a similar effort by others to do such an analysis before, but it hadn’t gotten very far.) They were going to present the results at WikiSym in a few weeks, but they had prepared a poster for Wikimania. Still clucking my disapproval, I promised Chuck that I would check out his poster.    (KZ7)

On Saturday, I finally made some time to check out the poster. I studied it and found myself thinking, “Hmm. This isn’t bad.” Chuck came over later, and we started going over it in depth. I still have some issues with it (discussed below), but I told him, “You know what, this is almost good enough for us to adopt in PurpleWiki. But you really should find one or two other Wikis who might say the same.” We kicked a few ideas around, and then I said, “Let’s go see what Ward thinks.”    (KZ8)

So we rounded up Ward Cunningham, and he took a look and started asking questions. He seemed to be okay with the answers, because he said that if someone were to write a Perl function that converted from his markup to the proposed markup and vice versa, he would consider incorporating it into his Wiki.    (KZ9)

Creole Markup    (KZA)

Well, you know me. Low-hanging fruit, a spare hour before the next party. The long and the short of it is, half of the converter now exists. It’s called Creole, and it’s written in Perl. Currently, it consists of a brain-dead simple script that converts Ward’s markup into Creole Markup (Chuck and Christoph’s proposal) and some tests.    (KZB)
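To give a flavor of how simple the script is, here’s a stripped-down sketch of the line-by-line conversion. The emphasis mappings below (Ward’s quote-based markup to ** and //) are my own assumptions for illustration, not necessarily the proposal’s final rules, and the real script handles more constructs than this:

    #!/usr/bin/perl
    # ward2creole.pl -- toy sketch of a Ward-markup-to-Creole converter
    # (illustrative only; the mappings are assumptions, not the final rules).
    use strict;
    use warnings;

    while (my $line = <>) {
        # Order matters: convert '''bold''' before ''italic'', or the
        # triple quotes would be eaten by the italics rule.
        $line =~ s{'''(.*?)'''}{**$1**}g;   # '''bold'''  -> **bold**
        $line =~ s{''(.*?)''}{//$1//}g;     # ''italic''  -> //italic//
        print $line;
    }

It runs as a simple filter (perl ward2creole.pl < WardPage.txt, with the filename made up here), which also makes it easy to hook up to the tests.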

The name was Ward’s suggestion. Rather than call it “pidgin,” which is what it actually is (a non-native language cobbled together from two or three other languages), we chose to optimistically call it “creole” (a native language that emerges from a pidgin), which is what we hope it will become.    (KZC)

I have three issues with the current proposal. First, I don’t like the exclamation point syntax for headers, not because of the character itself, but because of the numbering:    (KZD)

    !!!Heading 1
    !!Heading 2
    !Heading 3    (KZE)

As you can see, you need to remember that a top-level header has three exclamation points, not one, and you can only have three levels of headers. The advantage of the equals syntax:    (KZF)

    = Heading 1 =
    == Heading 2 ==
    === Heading 3 ===    (KZP)

is that the number of equals signs corresponds directly to the heading level.    (KZQ)

Second, there needs to be a proposal for preformatted blocks.    (KZR)

Third, the spec isn’t rigorous enough. You have to answer questions like, “Will italics work across multiple lines or blocks?” The reason I didn’t write a Creole-to-Ward converter was that these questions were not yet answered.    (KZS)
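For instance, assuming the proposal uses double slashes for italics, a rigorous spec has to say what the following renders as:

    This sentence starts //italic here
    and the marker doesn't close until this line.//

Does the emphasis span the line break, stop at the end of the first line, or fall back to literal slashes? A converter has to pick one behavior, so the spec needs to pick first.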

Most of the time developing the Creole converter was actually spent reverse engineering Ward’s markup. Ward pointed me to a Literate Programming version of his source code — implemented in a Wiki, of course — which helped a lot. As it turns out, Ward’s been thinking a bit about how to write a good markup parser recently. Parsers are often on my mind as well, as we’ve been talking about rewriting the PurpleWiki parser forever. Plus, the topic came up at Hacking Days among the Mediawiki developers before the conference. We’ll probably noodle on this problem a bit both before and during WikiSym. Fortunately, Ward knows a lot more about writing parsers than I, as anyone who’s seen the PurpleWiki code can attest.    (KZT)

WikiOhana    (KZJ)

When we were discussing names, Christoph proposed “ohana,” which means “family” in Hawaiian. It has a wonderful connotation — respecting our individuality while emphasizing the bonds that keep us together in a positive, welcoming fashion. It’s also consistent with Ward’s original naming convention, as “WikiWiki” means “quick” in Hawaiian.    (KZU)

We decided to call our little cooperative effort WikiOhana; Creole Markup is its first project. We hope the family grows over time.    (KZV)

More Ward Wisdom    (KZK)

While retelling an experience he had writing a parser, Ward said, “That’s when I learned how to write beautiful code: Get it working, then spend the same amount of time making it better.”    (KZW)

WikiSym 2006 Program

The WikiSym 2006 program is set. Guess who’s keynoting (with Doug Engelbart). That’s right, I’ll be talking Wiki philosophy and showing off some HyperScope goodness. I’ll also be moderating an interactive session on the Future of Wikis, featuring the other WikiSym keynoters (Ward Cunningham, Angela Beesley, Mark Bernstein) and the illustrious Sunir Shah.    (KY0)

I got back from Wikimania late last night with much news to report, and I’m really looking forward to WikiSym in two weeks. I was originally skeptical about having two Wiki conferences in a month, but now, I’m looking forward to continuing some of the conversations we had this past weekend as well as seeing many other core members of the Wiki community. Plus, the program looks fantastic and there will be an Open Space component as well, organized by Ted Ernst and facilitated by Gerard Muller.    (KY1)

To top it all off, it’ll be in Odense, Denmark. I’ll be in Copenhagen from August 17-20, so if you’d like to meet up earlier, drop me a line. Thomas Madsen-Mygdal, the creator of Reboot, has graciously offered to organize a meetup. More on that as details emerge.    (KY2)

Fighting WikiSpam: Eaton and Shared Blacklists

WikiSym 2005 was awesome. Massive props to Dirk Riehle and the program committee for throwing an outstanding event and drawing tons of great, great people. With Wikimania last August and WikiSym this past week, the Wiki community is really starting to gel. And it’s about time. Can you believe Wikis are 10 years old?    (JXD)

Now the bad news: I walked away with some action items. How do I get myself into these messes?!    (JXE)

The first action item can be traced back to an ad hoc meeting that happened at Wikimania regarding WikiSpam. On August 6, a group of Wiki developers — me (PurpleWiki), Alex Schroeder (OddMuse), Brion Vibber (Mediawiki), Thomas Waldmann (MoinMoin), Sven Dowideit (TWiki), Janne Jalkanen (JSPWiki) — along with John Breslin and Jochen Topf, got together to discuss ways we could collaborate on fighting WikiSpam. Our goal was to identify the simplest possible first step and not to get mired in process discussions.    (JXF)

Since all of us were already maintaining URL blacklists, we decided to merge them and host the result as a Sourceforge project. We agreed on a standard format (which I’ll document and post soon), and we agreed to send our respective lists to Alex, who already has scripts to slice, dice, and merge.    (JXG)
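To give a rough sense of the merge step, here’s a minimal sketch in Perl. This is not Alex’s actual script, and the one-pattern-per-line format with '#' comments is an assumption, since the real format isn’t documented yet:

    #!/usr/bin/perl
    # merge-blacklists.pl -- hypothetical sketch of merging per-engine URL
    # blacklists into one deduplicated list (not Alex's actual scripts).
    # Assumes one URL pattern per line; lines starting with '#' are comments.
    use strict;
    use warnings;

    my %seen;
    while (my $line = <>) {                   # reads every file on the command line
        chomp $line;
        $line =~ s/^\s+|\s+$//g;              # trim surrounding whitespace
        next if $line eq '' || $line =~ /^#/; # skip blanks and comments
        $seen{$line} = 1;                     # dedupe identical patterns
    }
    print "$_\n" for sort keys %seen;

You’d run it over everyone’s lists (the filenames here are made up): perl merge-blacklists.pl purplewiki.txt oddmuse.txt moinmoin.txt > merged.txt.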

One of my action items then was to create the Sourceforge project. I did that immediately, but for some reason, the project was rejected. Thus began a month-long go-around with Sourceforge support where I tried to discover why they had rejected the proposal. In the end, the project was approved, and I never got an answer as to why it was rejected in the first place. At that point, I was mired in other work, and so I never followed up.    (JXH)

WikiSym was the kick in the butt I needed to follow up. On Sunday, Sunir Shah hosted an antispam workshop, which about 40 people attended. First, Sunir reviewed techniques (many of which are listed at MeatBall:WikiSpam). Then we broke out into smaller groups.    (JXI)

In my breakout, I described what we had agreed on at Wikimania. Then Peter Kaminski described a very cute idea he had for making it easy to fight WikiSpam. In a nutshell, Peter suggested we write a simple drop-in replacement CGI wrapper that would filter a POST payload for spam and call the real CGI script — be it a Wiki, a blog, or anything else — if the payload were spam-free. Such a wrapper would enable users to install spam-protection for any CGI script without having to write a single line of code and without having to do any complex configuration. It wouldn’t require any special access to your web server, since it would just be a CGI script. And you could easily add other spam-fighting measures, such as throttling and IP blacklists.    (JXJ)
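In sketch form, the wrapper might look something like the following. This is not the actual Eaton code; the wrapped script path, the blacklist filename, and the regex-per-line format are all assumptions for illustration:

    #!/usr/bin/perl
    # Sketch of a drop-in CGI spam filter in the spirit of Eaton
    # (illustrative only; not the actual Eaton code).
    use strict;
    use warnings;

    my $real_cgi  = './wiki.pl';      # hypothetical: the script being wrapped
    my $blacklist = 'blacklist.txt';  # hypothetical: one URL regex per line

    # Slurp the POST body so we can inspect it before passing it along.
    my $payload = '';
    read(STDIN, $payload, $ENV{CONTENT_LENGTH} || 0);

    open my $fh, '<', $blacklist or die "can't read $blacklist: $!";
    while (my $pattern = <$fh>) {
        chomp $pattern;
        next unless length $pattern;
        if ($payload =~ /$pattern/) {
            # Spam detected: reject without ever invoking the real script.
            print "Status: 403 Forbidden\r\nContent-type: text/plain\r\n\r\n";
            print "Rejected: request matched the spam blacklist.\n";
            exit 0;
        }
    }
    close $fh;

    # Payload looks clean: replay it on the real script's STDIN. The child
    # inherits the CGI environment, and its output flows to our STDOUT.
    open my $pipe, '|-', $real_cgi or die "can't run $real_cgi: $!";
    print $pipe $payload;
    close $pipe;

Because the wrapper is itself just a CGI script, installing it amounts to renaming the real script and pointing the wrapper at it, which is exactly why the idea requires no special server access.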

I thought it was a brilliant idea. So Peter and I sat down afterwards and whipped it up. Took about an hour. It’s called Eaton, it works, and it’s Public Domain. Peter Kaminski has already blogged about it, and there’s some important commentary there from Jay Allen, the creator of MT-Blacklist.    (JXK)

It’s a proof of concept, and it won’t scale. It can and should be improved, and I’d encourage folks to do so. Nevertheless, it’s pretty cool. Bravo to Peter for a very clever idea.    (JXL)

By the way, the first person to figure out the origins of the name “Eaton” wins a cookie.    (JXM)