Response to Marc Canter

I meet a lot of interesting people in this business, but a few stand out more than others. Anyone who’s met Marc Canter knows what I’m talking about. Marc is a big, boisterous fellow, mostly good-natured, and very outspoken. The guy is sharp and passionate, and has a proven track record of getting things done in the Valley, including cofounding MacroMind (which became Macromedia). He’s also got an incredibly cute baby daughter, which is a bit unfair, because it makes it difficult for me to get mad at him.    (B5)

Marc recently complained about Blue Oxen Associates. Among his complaints were:    (B6)

Now I have nothing againist Eugene Kim and his stalwarts, but what they created and what they do is pretty lame – compared to what’s going on out there – for free, everyday. What was it about Blue Oxen that Socialtext, Phil Pearson, Dave Winer, Paolo Valdemarin, Clay Shirky, Cory Doctorow, Danny Ayers, Dan Brickley, Lisa Rein, Mitch Kapor, Mark Pilgrim, Aaron Swartz, and hundreds more aren’t doing everyday?    (B7)

Why did we need a white paper on why open source is a good thing? Why do we need yet another Wiki – ever if it is purple?    (B8)

I just really think that Pierre’s money could be put to better use.    (B9)

There are lots of problems with what Marc says, the biggest being that he’s got most of his facts wrong.    (BA)

First, Omidyar Foundation has not invested in us. They funded our first research report on the open source software community. We are completely self-funded (by me), surviving on my initial investment and on clients.    (BB)

Second, we are not in the business of open source software development. As a company that uses, evangelizes, and works to better understand collaborative tools, we naturally do some development. Our policy is that everything we produce is open source, be it software or research. PurpleWiki happens to be the first of those tools, and so we make it available.    (BC)

Is PurpleWiki the Wiki to end all Wikis? That’s not our intent. Our intent is to understand and explore ideas that will help us improve collaboration, and then to disseminate those ideas as widely as possible. PurpleWiki is more than a bunch of purple hash marks spread out on a page. There is a lot of deep thinking behind its architecture, much of which was inspired by Doug Engelbart‘s earlier work. The Purple Numbers are the most visible manifestation, but there are others that will become more apparent eventually. The point is, we want to make those ideas accessible, and if they are useful, we want to help spread those ideas. That goes for all collaborative tools, not just Wikis.    (BD)

We’ve done this. For example, we worked closely with Mike Mell who worked closely with Simon Michael to implement Purple Numbers in ZWiki. We host an active community of tool developers who are working together to make their tools more useful and interoperable. We’re always exploring new ways to make a difference.    (BE)

Third, our research report was not about why open source software is a good thing. It was a preliminary exploration into why open source communities work. We’re not the first to explore this question, but I think we are especially qualified to explore it for a number of reasons. More importantly, our target audience is not open source developers. These folks don’t need persuading. Our audience is people who know nothing about open source software, except perhaps that it exists. There is an enormous language and experience barrier that prevents these folks from understanding what makes open source communities tick. Our thesis is that all communities — independent of their domain — could benefit from understanding and emulating open source communities. In order to test this thesis, we first have to overcome the language barrier.    (BF)

Fourth, Marc listed a bunch of people doing great work in this space as if they were our competitors. They are not. I know several of these folks, have worked with some of them, and hope to work with many more. Our goal at Blue Oxen Associates is to better understand collaboration, so we can help improve it. In order to do that, we need to put our money where our mouths are and collaborate with others working towards the same goal.    (BG)

When I founded Blue Oxen Associates, I consciously avoided seeking seed money and going for the big splash up-front. Some of the things we’re trying to accomplish are not obvious, and require further thinking and development. Starting small and growing slowly means we have to focus, sometimes at the expense of interesting projects and work, often at the expense of attention.    (BH)

Nevertheless, I’m surprised and flattered by the attention we’ve received thus far. The participants in the communities that we host are incredibly supportive. We have an amazing advisory board. I constantly run into people who know about us and compliment us on our work. And, the blogging community has said some great things about us as well. I’m even flattered that Marc thought enough of us to devote some space on his blog to call us “lame.” Like I said, it’s hard to get mad at a guy who means well and has a cute baby daughter. I hope he’ll continue to follow our work; perhaps “lame” will evolve into something better once he actually understands what we do.    (BI)

PurpleWiki v0.9 Released

PurpleWiki v0.9 is now available. Chris Dent‘s announcement covers the basics. A few words on how we got here, and where we’re going.    (83)

The Wiki Experiment    (84)

When we launched Blue Oxen Associates last December, we made Wikis a core part of our infrastructure. We had two reasons for choosing Wikis. The first was practical. We needed a system for sharing documents and collaborative authoring, and Wikis fit the bill quite nicely.    (85)

The second reason was more philosophical. We wanted a knowledge management system like Doug Engelbart‘s Open Hyperdocument System (OHS), and felt that Wikis already resembled the OHS in many ways. It was immediately usable and useful, while also offering the perfect platform for coevolution.    (86)

PurpleWiki fulfills the following OHS requirements:    (87)

  • Backlinks. This is a core feature of all Wikis, but also one of its most underutilized. In a future version of PurpleWiki, we will create an open API to its Backlink engine, so that other applications (such as blogs) may use it.    (88)
  • Granular Addressability. PurpleWiki‘s site-wide, automatic Purple Number management had the additional benefit of enabling us to quickly experiment and evolve the feature. Node IDs are now document-independent, which has improved usability and enabled features like Transclusions.    (89)
  • Link Types. We’ve added a syntax for specifying link types, and have implemented our first new link type: Transclusions.    (8A)
  • View Control. We can easily add new or customize existing output formats, thanks to PurpleWiki‘s modular architecture. More importantly, we can implement dynamic views, such as a collapsible outline view, of any document on PurpleWiki.    (8B)

PurpleWiki‘s most visible feature is Purple Numbers, but its most important feature is its data architecture. In general, Wikis view the world as a graph of linked documents. PurpleWiki views the world as a graph of linked documents with a unified data model. The data model is sufficiently general and simple to apply to any kind of document. When you use this data model in other applications, you automatically inherit PurpleWiki‘s features. Integrating PurpleWiki with blogs was a piece of cake as a result, and we’ve only begun to explore the ramifications.    (8C)

Next Steps    (8D)

Our roadmap lists several enterprise features we plan on implementing: templates, pluggable database back-ends, mod_perl controller, etc. I consider these important, but relatively mundane. Blue Oxen Associates isn’t in the business of software development; we’re implementing these features because we need them.    (8E)

Blue Oxen Associates is in the business of research and improvement, and of facilitating coevolution. PurpleWiki is an amazing platform for this. One of Chris’s pet projects is to create universal ID space for nodes, possibly based on handles. Think of this as persistent URIs for nodes. Among other things, this will enable us to support Transclusions of content from other sites, not just the site on which PurpleWiki is installed.    (8F)

My long-term interests lie in three areas: the aforementioned open Backlink engine, Wiki page types, and Wiki namespaces. I’ll discuss the latter a bit here.    (8G)

One of the reasons I dislike TWiki is its notion of “webs.” You can partition your Wiki into multiple webs, and each web is contained in its own namespace. The problem with this is that it encourages people to balkanize their Wiki content. That, in my opinion, runs counter to the spirit of Wikis. Early balkanization prevents evolution.    (8H)

Several months ago, Richard Gabriel passed along some insight he had learned from Ward Cunningham: In computer science, you want to keep namespaces separate. With Wikis, you want namespaces to clash. Ward’s idea for SisterSites is one way to create namespace clash. This idea could be taken a step further by allowing site administrators to establish a system of namespace resolution. For example, if a certain page doesn’t exist on a local namespace, the Wiki would search another namespace for that page. If it existed there, the Wiki would simply take the user to that page.    (8I)

A Brief History of Purple Numbers

A few weeks ago, Chris Dent posted a brief history of Purple Numbers, noting, “This is likely full of errors as the story as I’ve heard it is incomplete and I was unable to check some things because the network path to California was busted while I was writing.” His account is pretty good, but there are a few holes here and there. Most of my clarifications are nitpicky. In case you, my dear readers, haven’t realized it yet, I am very anal.    (7W)

Chris differentiates NLS from the mouse, hypertext, GUIs, and so forth. NLS was actually the totality of all of these things. Chris also says that NLS had a “graph based document storage model.” This is somewhat of an ambiguous description, and depending on how you read it, is not strictly true. Finally, NLS did not have transclusions, although its architecture could easily have supported them.    (7X)

In between NLS and the first appearance of Purple Numbers on the Bootstrap Institute’s web site, there was a graphical Augment client written in Smalltalk. The first Augment PC client was MS-DOS-based. Doug still uses this! In the early 1990s, a contractor wrote a GUI client for Windows, which displayed statement IDs (what we call node IDs) in purple. According to Doug, the choice of color was either his daughter’s, Christina, or the author of the client. In any case, this is why Purple Numbers are purple.    (7Y)

Bootstrap Institute first used Purple Numbers to display structural location numbers (what we call hierarchical IDs) for Augment documents converted to HTML. Soon afterwards, Frode Heglund suggested making the numbers a live link, so that it would be easy to copy and paste the node’s address.    (7Z)

On January 31, 2001, I released the first version of Purple, which was a collection of Perl and XSLT scripts for adding node and hierarchical IDs to documents and generating HTML with Purple Numbers. I believe my most important contribution at the time was recognizing that node IDs were more useful in a Web context, where documents were largely dynamic, than hierarchical IDs. On April 24, 2001, Murray Altheim released plink, a Java program similar to Purple, except that it worked on XHTML files.    (80)

My OHS Launch Community experiment (June 15 – November 9, 2001) proved to be a fruitful time for Purple Numbers. Murray and I agreed on a standard addressing scheme for Purple Numbers. I implemented a MHonArc filter for adding Purple Numbers to and extracting Backlinks from mailing list archives, and tool that generated HTML with Purple Numbers from Quest Map files.    (81)

I also started thinking about PurpleWiki for the first time, and hacked a first version based on TWiki. This experience gave me a better understanding of the right way to implement PurpleWiki, and it also gave me a healthy distaste for TWiki. When Chris joined Blue Oxen Associates, we made PurpleWiki a priority, and the rest is history. Chris’s account from there is pretty complete.    (82)

Purple Numbers and Link Integrity

Danny Ayers is looking to implement Purple Numbers in his Wiki, and had the following question:    (66)

But is the expectation that the anchor will always refer to the same information item?    (67)

If we’re going for coolness, I think this may cause problems in the context of Wikis. Ok, pages come and go but the URI will (usually) always address something sensible – edit new page if the one originally addressed has gone.    (68)

But the Purple anchors are pointing to info-snippets that may be modified (no problem – it’s still conceptualy the same item) or be deleted (problem).    (69)

The expectation is that the information to which an anchor points may change. This is obviously not ideal.    (6A)

In March 2001, I wrote some notes for the Open Hyperdocument System entitled, “Thoughts on Link Integrity.” I had posted those notes to a mailing list, but those archives no longer exist, so I reproduce those thoughts below.    (6B)

Danny also mentioned using trackback as an aggregator of comments. There is such a system, which Chris Dent also mentions in his response: Internet Topic Exchange. We used it for the 2003 PlaNetwork Conference to aggregate blogs about the conference.    (6C)

Thoughts on Link Integrity    (6D)

We want the OHS to maintain link integrity across all documents. In other words, once you create a link to something, it should never break.    (6E)

The first requirement for link integrity is that documents are never deleted from the system. If you link to a document, and that document is subsequently removed, the link breaks. The only way to fix that link is to put the document back into the system.    (6F)

The second requirement is to have a logical naming scheme that is separate from the physical name and location of a document. On the web, if you have the document http://foo.com/bar.html, and you move it to http://foo.com/new/bar.html, links to the first URL break. You need a name for that document that will always point to the right place, even if the document is physically moved to a different part of the system.    (6G)

The third requirement is version control. This is where things start to get a little hairy. Version controlled systems are insert-only. In theory, nothing is ever removed. This satisfies the first requirement.    (6H)

However, in a useful DKR, links don’t just not break, they also evolve. Suppose you have a document, foo.txt, that contains the following text:    (6I)

  These are the dasy that try men's souls.    (6J)
  Example. foo.txt, version 1.    (6K)

Note that there’s a typo — “dasy” should be “days.”    (6L)

Now suppose someone creates a link to this sentence in this version of the document. Suppose that afterwards, you notice the typo and correct it. This results in a new version of the document:    (6M)

  These are the days that try men's souls.    (6N)
  Example. foo.txt, version 2.    (6O)

If your links neither broke nor evolved, then the original link would continue to point to version 1 of the document, not this new version. However, this does not always seem to be desirable behavior. If I created a link to this sentence — essentially designating it interesting and relevant content — when the typo is corrected, I’d prefer that the link now point to the corrected document, version 2.    (6P)

This is certainly doable. The system could automatically assume that the link pointing to the first sentence in version 1 should now point to the first sentence in version 2.    (6Q)

However, there are two scenarios when this would not be the correct behavior. First, what if, instead of fixing the typo, the sentence was changed to:    (6R)

  Livin' la vida loca.    (6S)
  Example. foo.txt, version 3.    (6T)

If the purpose of the link is to designate the target content as relevant, then the content of the first sentence of this third version no longer applies, because the meaning of the sentence has completely reversed.    (6U)

Second, what if the link is from an annotation that says, “There’s a typo in this sentence”? In this case, you would want the link to point only to version 1, since the typo does not exist in version 2 (and, for that matter, in version 3).    (6V)

How can we accomodate these scenarios? One solution would be to allow the user to define how the link should evolve with new versions of the document. So, for example, you could specify that the link that points to the first sentence in version 1 should also point to the first sentence in some number of subsequent versions of foo.txt.    (6W)

Another solution would be to have the system automatically notify everyone who has linked to a document (or who has otherwise registered for notification) that the document has changed, and have those people manually update their links, possibly providing suggestions as to how to update the links.    (6X)

OHS Launch Community: Experimenting with Ontologies

My review of The Semantic Web resulted in some very interesting comments. In particular, Danny Ayers challenged my point about focusing on human-understandable ontologies rather than machine-understandable ones:    (5D)

But…”I think it would be significantly cheaper and more valuable to develop better ways of expressing human-understandable ontologies”. I agree with your underlying point here, but think it’s just the kind of the Semantic Web technologies can help with. The model used is basically very human-friendly – just saying stuff about things, using (triple) statements.    (5E)

Two years ago, I set out to test this very claim by creating an ad-hoc community — the OHS Launch Community — and by making the creation of a shared ontology one of our primary goals. I’ll describe that experience here, and will address the comments in more detail later. (For now, see Jay Fienberg’s blog entry, “Semantic web 2003: not unlike TRS-80 in the 1970’s.” Jay makes a point that I want to echo in a later post.)    (5F)

“Ontologies?!”    (5G)

I first got involved with Doug Engelbart‘s Open Hyperdocument System (OHS) project in April 2000. For the next six months, a small group of committed volunteers met weekly with Doug to spec out the project and develop strategy.    (5H)

While some great things emerged from our efforts, we ultimately failed. There were a lot of reasons for that failure — I could write a book on this topic — but one of the most important reasons was that we never developed Shared Language.    (5I)

We had all brought our own world-views to the project, and — more problematically — we used the same terms differently. We did not have to agree on a single world-view — on the contrary, that would have hurt the collaborative process. However, we did need to be aware of each other’s world-views, even if we disagreed with them, and we needed to develop a Shared Language that would enable us to communicate more effectively.    (5J)

I think many people understand this intuitively. I was lucky enough to have it thrown in my face, thanks to the efforts of Jack Park and Howard Liu. At the time, Jack and Howard worked at Vertical Net with Leo Obrst, one of the authors of The Semantic Web. Howard, in fact, was an Ontologist. That was his actual job title! I had taken enough philosophy in college to know what an ontology was in that context, but somehow, I didn’t think that had any relevance to Howard’s job at Vertical Net.    (5K)

At our meetings, Jack kept saying we needed to work out an ontology. Very few of us knew what he meant, and both Jack and Howard did a very poor job of explaining what an ontology was. I mention this not to dis Jack and Howard — both of whom I like and respect very much — but to make a point about the entire ontology community. In general, I’ve found that ontologists are very poor at explaining what an ontology is. This is somewhat ironic, given that ontologies are supposed to clarify meaning in ways that a simple glossary can not.    (5L)

Doug himself made this same point in his usual ridiculously lucid manner. He often asked, “How does the ontology community use ontologies?” If ontologies were so crucial to effective collaboration, then surely the ontology community used ontologies when collaborating with each other. Sadly, nobody ever answered his question.    (5M)

OHS Launch Community    (5N)

At some point, something clicked. I finally understood what ontologies (in an information sciences context) were, and I realized that developing a shared ontology was an absolute prerequisite for collaboration to take place. Every successful communities of practice had developed a shared ontology, whether they were aware of it or not.    (5O)

Not wanting our OHS efforts to fade into oblivion, I asked a subset of the volunteers to participate in a community experiment, which — at Doug’s suggestion — we called the OHS Launch Community. Our goal was not to develop the OHS. Our goal was to figure out what we all thought the OHS was. We would devote six-months towards this goal, and then decide what to do afterwards. My theory was that collectively creating an explicit ontology would be a tipping point in the collaborative process. Once we had an ontology, progress on the OHS would flow naturally.    (5P)

My first recruits were Jack and Howard, and Howard agreed to be our ontology master. We had a real, live Ontologist as our ontology master! How could we fail?!    (5Q)

Mixed Results    (5R)

Howard suggested using Protege as our tool for developing an ontology. He argued that the group would find the rigor of a formally expressed ontology useful, and that we could subsequently use the ontology for developing more-intelligent search mechanisms into our knowledge repository.    (5S)

We agreed. Howard and I then created a highly iterative process for developing the formal ontology. Howard would read papers and follow mailing list discussions carefully, construct the ontology, and post updated versions early and often. He would also use Protege on an overhead projector during face-to-face discussions, so that people could watch the ontology evolve in real-time.    (5T)

Howard made enough progress to make things interesting. He developed some preliminary ontologies from some papers he had read, and he demonstrated and explained this work at one of our first meetings. Unfortunately, things suddenly got very busy for him, and he had to drop out of the group.    (5U)

That was the end of the formal ontology part of the experiment, but not of the experiment itself. First, we helped ourselves by collectively agreeing that developing a shared ontology was a worthwhile goal. This, and picking an end-date, helped us eliminate some of the urgency and anxiety about “making progress.” Developing Shared Language can be a frustrating experience, and it was extremely valuable to have group buy-in about its importance up-front.    (5V)

Second, we experimented with a facilitation technique called Dialogue Mapping. Despite my complete lack of experience with this technique (I had literally learned it from Jeff Conklin, its creator, the day before our first meeting), it turned out to be extremely useful. We organized a meeting called, “Ask Doug Anything,” which I facilitated and captured using a tool called Quest Map. It was essentially the Socratic Method in reverse. We asked questions, and Doug answered them. The only way we were allowed to challenge him or make points of our own was in the form of a question.    (5W)

That meeting was a watershed for me, because I finally understood Doug’s definition of a Dynamic Knowledge Repository. (See the dialog map of that discussion.) One of the biggest mistakes people make when discussing Doug’s work is conflating Open Hyperdocument System with Dynamic Knowledge Repository. Most of us had made that mistake, which prevented us from communicating clearly with Doug, and from making progress overall.    (5X)

Epilogue    (5Y)

We ended the Launch Community on November 9, 2001, about five months after it launched. We never completed the ontology experiment to my satisfaction, but I definitely learned many things. We also accomplished many of our other goals. We wanted to be a bootstrapping community, Eating Our Own Dogfood, running a lot of experiments, and keeping records of our experiences. We also wanted to facilitate collaboration between our members, most of whom were tool developers. Among our many accomplishments were:    (5Z)

The experiment was successful enough for me to propose a refined version of the group as an official entity of the Bootstrap Alliance, called the OHS Working Group. The proposal was accepted, but sadly, got sidetracked. (Yet another story for another time.) In many ways, the Blue Oxen collaboratories are the successors to the OHS Working Group experiment. We’ve adopted many of the same principles, especially the importance of Shared Language and bootstrapping.    (64)

I believe, more than ever, that developing shared ontology needs to be an explicit activity when collaborating in any domain. I’ll discuss where or whether Semantic Web technologies fit in, in a later post.    (65)