Transclusions, Path-Based Addressing, and Version Control

The PurpleWiki community has been rumbling recently, thanks largely to the contributions of two members of the Canonical Hackers, Jason Cook and Matthew O’Connor. Jason wrote perplog, an IRC logger that supports Purple Numbers and Transclusions. Matthew hacked a PurpleWiki node manager, then started adding and fixing other stuff, including an XML-RPC interface. Additionally, John Sechrest developed an experimental email interface to PurpleWiki. Lots of great stuff. It’s forcing us to get off our butts and make some long-promised changes and explain some long-undocumented things. Open Source is a wonderful thing.    (117)

A lot of the excitement is because of PurpleWiki’s support for Transclusions. We had Transclusions in mind when we architected PurpleWiki, but we (or I, at least) didn’t think Transclusions would actually be implemented until much later. However, exactly one year ago today, Chris Dent got the itch and started playing. A few months later, Chris committed some code, and suddenly, we had Transclusions. It was a total hack, but it worked, and it was unexpectedly cool.    (118)

It’s still a hack, and it needs to be cleaned up, but it’s suddenly become a higher priority. First, we have a pretty good idea of how to support Transclusions “correctly.” Second, having had the chance to use Transclusions regularly, we are starting to recognize their utility and want to take greater advantage of that. (See, for example, my early specs for Abelard.) Third, people are starting to get excited about them.    (119)

Transcluding Multiple Chunks    (11A)

Currently, we support Transclusions of individual nodes (paragraphs, headers, list items, etc.) via the following syntax:    (11B)

  [t nid]    (11C)

where nid is the ID of the node you want to transclude. This works fine when you want to transclude small chunks, but at times, it’s useful to be able to transclude multiple chunks on a page. Rather than specify a transclusion for each individual node, it would be nice to have a syntax for specifying a collection of nodes in a single transclusion.    (11D)
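To make the current behavior concrete, here is a minimal Python sketch of content-only replacement. It is not PurpleWiki's actual code: the `NODES` dictionary, the NID pattern, and the `expand_transclusions` function are all hypothetical names I made up for illustration.

```python
import re

# Hypothetical node store mapping NID -> text content.
# This is an assumption, not PurpleWiki's real storage.
NODES = {
    "1": "Plan for World Domination",
    "2": "Finish PurpleWiki.",
}

# Matches the inline form [t nid]; the NID alphabet here is an assumption.
TRANSCLUSION = re.compile(r"\[t ([0-9A-Z]+)\]")


def expand_transclusions(line, nodes):
    """Replace each [t nid] in a line with the content of node nid.

    Unknown NIDs are left untouched so the gap stays visible.
    """
    def lookup(match):
        return nodes.get(match.group(1), match.group(0))

    return TRANSCLUSION.sub(lookup, line)
```

Calling `expand_transclusions("* [t 2]", NODES)` swaps in the node's text and leaves the surrounding list markup alone. Only content moves; the structure of the source node is ignored.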

Chris proposed the following syntax:    (11E)

  [t nid,nid,nid,...]    (11F)

This is problematic. The current implementation suggests that the transclusion command be replaced by the content identified by the specified NID. This proposal suggests that the command be replaced by both the content and the structure of the content. If you had a document like:    (11G)

  = Plan for World Domination {nid 1} =    (11H)
  # Finish PurpleWiki. {nid 2}    (11I)

and you tried to transclude this content with:    (11J)

  * [t 1, 2]    (11K)

what is the parser supposed to translate this to?    (11L)

A proper solution must be treated as its own structural element within a PurpleWiki document. More importantly, the syntax should capture document-specific context. This contrasts with the current syntax, which ignores document context entirely.    (11M)

Why is document context important? Suppose you have the following task list:    (11N)

  = To Do {nid 3} =    (11O)
  * Buy milk. {nid 4}
  * Feed iguana. {nid 5}
  * Implement distributed Transclusions. {nid 6}
  * Vote in primaries. {nid 7}
  * Expose API to Backlinks. {nid 8}    (11P)

Suppose you want to start a PurpleWiki-specific task list, transcluding all of the list items relevant to PurpleWiki (in this case, nodes 6 and 8). The resulting document might look like:    (11Q)

  {title PurpleWikiToDo}    (11R)
  = PurpleWiki To Do {nid 9} =    (11S)
  * [t 6] {nid A}
  * [t 8] {nid B}    (11T)

Now, suppose you want to replicate this task list on another page. You could transclude all of the items individually, just as you do on the PurpleWiki To Do page. In this case, a slight variation of Chris’s proposed syntax (a standalone structural element) would simplify that process. (It also raises an interesting question: Which NIDs do you use for the transclusions: 6 and 8, or A and B? Does it make a difference?)    (11U)

However, what I really want to do is say, “Transclude all of the list items on the ‘PurpleWiki To Do’ page.” For this, you want something like XPointer:    (11V)

  [transclude PurpleWikiToDo#xpointer(id("9")/li)]    (11W)

A few observations: First, the command should be on a line by itself, and it should be interpreted as an independent structural element that will be replaced by a set of structural elements and content. I used “transclude” instead of “t” to make the point that these are two different commands. Second, the Transclusion command specifies a range of nodes within a document, as opposed to a document-independent list of nodes. Third, this combines a path-based address with an ID-based address.    (11X)

Fourth (and this is an implementation detail), if we want to support such syntax, it would behoove us to use an XML data model rather than the home grown model we’re currently using. This way, we could easily plug in existing XPointer implementations to do the queries.    (11Y)
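As a rough illustration of that last point, here is a sketch using Python's standard `xml.etree.ElementTree`, which supports only a limited XPath subset rather than full XPointer. The XML rendering of the page is entirely hypothetical (the element names, the `id` attribute, and the already-resolved list item text are my assumptions), as is the `select_list_items` function.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML rendering of the PurpleWikiToDo page. The element and
# attribute names are assumptions, not PurpleWiki's actual data model.
PAGE = """
<page title="PurpleWikiToDo">
  <section id="9">
    <h>PurpleWiki To Do</h>
    <li id="A">Implement distributed Transclusions.</li>
    <li id="B">Expose API to Backlinks.</li>
  </section>
</page>
"""


def select_list_items(page_xml, nid):
    """Return the text of every li directly under the node with the given ID."""
    root = ET.fromstring(page_xml)
    # Roughly equivalent in spirit to xpointer(id("9")/li), expressed in the
    # XPath subset that ElementTree understands.
    return [li.text for li in root.findall(f".//*[@id='{nid}']/li")]
```

With this model, `select_list_items(PAGE, "9")` returns the list items in document order, and it would pick up new items added to the page later without any change to the query.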

Version Control    (11Z)

The fact that Wiki pages are dynamic throws a kink into all of this. The syntax I propose above takes this into account for the most part. Barring major changes to the PurpleWiki To Do page, the transcluded content will include all of the PurpleWiki tasks, even if more items are added later. If you had to list a set of NIDs, then you would have to be diligent about updating that list manually every time the To Do page changed.    (120)

In addition to supporting path-based addressing, we also need to allow people to specify versions in the address. In other words, you may want to transclude a specific version of a node or a set of nodes from a specific version of a document.    (121)

This shouldn’t be too difficult, but there are some complications. The biggie is whether to transclude a node if the node no longer exists in any document. My instinct right now is telling me that yes, it should, but it should make it clear somehow that it’s an orphan node. (See my previous entry on link integrity for more on this.)    (122)
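Here is one way the lookup might behave, sketched in Python under my own assumptions. The per-node version history, the set of live NIDs, and the `resolve` function are hypothetical; nothing like this exists in PurpleWiki today.

```python
from dataclasses import dataclass


@dataclass
class Resolved:
    text: str
    orphan: bool  # True if no current document contains the node


# Hypothetical store: NID -> list of versions, oldest first.
HISTORY = {
    "6": ["Implement Transclusions.", "Implement distributed Transclusions."],
    "8": ["Expose API to Backlinks."],
}
# Hypothetical set of NIDs that still appear in some current document.
LIVE = {"8"}


def resolve(nid, version, history, live):
    """Return a node's content at a 1-based version (latest when None).

    The node is still returned when it is an orphan, but it is flagged so
    that the renderer can make that clear to the reader.
    """
    versions = history[nid]
    if version is None:
        text = versions[-1]
    elif 1 <= version <= len(versions):
        text = versions[version - 1]
    else:
        raise LookupError(f"node {nid} has no version {version}")
    return Resolved(text=text, orphan=nid not in live)
```

In this sketch, node 6 has been removed from every page, so `resolve("6", 1, HISTORY, LIVE)` still returns its first version but marks it as an orphan.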

“Low-Focus Thought” in Knowledge Management Systems

David Gelernter wrote an essay called “The Logic of Dreams” (a chapter in Denning and Metcalfe’s Beyond Calculation: The Next Fifty Years of Computing), where he discussed the creative process. Gelernter suggested that there are two kinds of thought: high-focus (analytical, logical) and low-focus (free association). The former we understand well (according to Gelernter); the latter we barely comprehend.    (TI)

Low-focus thought is the story of weak ties, not just in the context of Social Networks but of ideas in general. It’s a story that is told over and over again. A poet smells a rose, and is reminded of his lover. Friedrich Kekule dreams about a snake biting its tail, wakes up, and solves the structure of benzene. Grace Hopper remembers an old play from her college basketball days and figures out a memory-efficient algorithm for her A-0 compiler.    (TJ)

Gelernter was interested in implementing low-focus thought in Artificial Intelligence software. I’m interested in facilitating low-focus thought via Knowledge Management systems.    (TK)

In the past year, my tools and processes have revealed a number of unexpected connections. For example, last August, I blogged two interesting articles about Marc Smith and Josh Tyler. The following morning, I happened to be rifling through some old articles, and discovered papers written by Smith and Tyler that I had previously archived.    (TL)

Old-fashioned tools and a little bit of karma led to these discoveries. I wanted to eliminate one of the stacks of papers on my floor, which was how I accidentally came across the Smith article. Later that morning, I was searching for an email that a friend had sent me earlier, and it just so happened that the same email contained the reference to Tyler’s paper.    (TM)

These discoveries were largely due to luck, although the fact that I keep archives in the first place and that I review them on occasion also played a role. I don’t want to oversell this point, but I don’t want to undersell it either. Many people don’t archive their email, for example. Many groups don’t archive their mailing lists, a phenomenon that baffles me. More importantly, many people never review their old notes or archives, which is about the same as not keeping them in the first place. All of that knowledge is, for all intents and purposes, lost.    (TN)

Good Knowledge Management tools facilitate the discovery of these weak connections and make us less reliant on luck. Blogging is great, because it encourages people to link, which encourages bloggers to search through old entries, both their own and those of others. This is an example of tools facilitating a pattern, and it’s one reason why blogs are a powerful Knowledge Management tool.    (TO)

I’m excited about the work we’ve done integrating blogs and Wikis using Backlinks and WikiWords, because I believe these tools will further facilitate low-focus thought, which will ultimately lead to bigger and better things.    (TP)

Blog Backlinks Enabled on PurpleWiki

If you view the Backlinks on any of my Wiki pages, you will now see Backlinks from both the Wiki and this blog. For example, if you view the Backlinks to “DougEngelbart”, you will see a list of all of my Wiki pages and blog entries that mention Doug.    (SG)

The beautiful thing about this feature is that it maintains context for all of the different concepts described on my Wiki. I list several Patterns on my Wiki, with some level of detail on each page. But when you look at the Backlinks to those Patterns, you see a list of all the stories where the Patterns are mentioned. I tell the stories as I have before, and the tool explicitly ties the concept to the stories that describe the context. That’s augmentation! As Chris Dent said, it “makes the universe bigger.”    (SH)
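The idea is simple enough to sketch in a few lines of Python. This is not the plugin's actual code: the simplified WikiWord pattern, the `SOURCES` sample data, and the `build_backlinks` function are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Simplified WikiWord pattern: two or more capitalized segments run together.
# This is an assumption; PurpleWiki's real rule may differ.
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")

# Hypothetical sample content: source name -> {title: text}.
SOURCES = {
    "wiki": {"AugmentSystem": "Built by DougEngelbart and his lab."},
    "blog": {"Blog Backlinks Enabled": "See DougEngelbart on augmentation."},
}


def build_backlinks(sources):
    """Map each WikiWord to the sorted (source, title) pairs that mention it."""
    index = defaultdict(set)
    for source, documents in sources.items():
        for title, text in documents.items():
            for word in WIKI_WORD.findall(text):
                index[word].add((source, title))
    return {word: sorted(refs) for word, refs in index.items()}
```

Looking up `build_backlinks(SOURCES)["DougEngelbart"]` lists the blog entry and the Wiki page side by side, because both tools feed the same index.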

My essay, Wikis As Topic Maps, describes this phenomenon in further (and slightly more technical) detail.    (SI)

Open Source At Work    (SJ)

How this feature finally got implemented is a wonderful example of what makes Open Source so great. We’ve wanted it for a while, but didn’t have time to implement it. Last month, I started thinking more seriously about implementing the feature, because I wanted to demonstrate it to some potential clients. Unfortunately, I was swamped, and didn’t have time to do it myself.    (SK)

David Fannin to the rescue. David had installed PurpleWiki and the MovableType plugin, and liked it. However, he also wanted the Backlink feature. So, he wrote it, and contributed it back to us. Neither Chris nor I nor anyone else in the small PurpleWiki community knew David beforehand, but as you can imagine, we welcomed his contribution.    (SL)

David’s patch was just a hack. Chris had some ideas for refactoring the PurpleWiki code to better integrate this feature. So, he implemented them, and released a preview of the code. Chris’s refactoring made it very easy for me to write a similar plugin for blosxom. Suddenly, we had the feature I had been pining for.    (SM)

As an aside, I had grander plans for how to implement this feature, and those plans haven’t gone away. (See my notes on TPVortex for a preview.) The important thing is, David and Chris’s approach worked. It may not do all of the whiz bang things I eventually want it to do, but it does what I want it to do right now. More importantly, it may very well inspire others to implement some of the grander ideas. Release Early And Often is an extremely important pattern of Open Source development, because it enables collaboration, which accelerates the implementation and dissemination of ideas.    (SN)

Precedents    (SO)

Ours is not the first integrated Wiki and blog. Notable precedents include Kwiki and Bill Seitz’s Wiki Log. These tools all had the integrated Backlinks feature before we did.    (SP)

The key difference between these tools and ours is that they require you to use a single tool. You have to use Kwiki as both your blogging tool and Wiki to get all of the features. Our approach integrates PurpleWiki with MovableType, blosxom, and conceivably any other blogging tool. This is consistent with our overall philosophy of improving interoperability between tools using Doug Engelbart’s ideas as a unifying framework.    (SQ)

We’ve only taken baby steps so far. We plan bigger and better things. More importantly, we want to encourage other tool developers to adopt a similar approach, and to collaborate with each other to do so.    (SR)

PurpleWiki v0.9 Released

PurpleWiki v0.9 is now available. Chris Dent’s announcement covers the basics. A few words on how we got here, and where we’re going.    (83)

The Wiki Experiment    (84)

When we launched Blue Oxen Associates last December, we made Wikis a core part of our infrastructure. We had two reasons for choosing Wikis. The first was practical. We needed a system for sharing documents and collaborative authoring, and Wikis fit the bill quite nicely.    (85)

The second reason was more philosophical. We wanted a knowledge management system like Doug Engelbart’s Open Hyperdocument System (OHS), and felt that Wikis already resembled the OHS in many ways. Wikis were immediately usable and useful, while also offering the perfect platform for coevolution.    (86)

PurpleWiki fulfills the following OHS requirements:    (87)

  • Backlinks. This is a core feature of all Wikis, but also one of their most underutilized. In a future version of PurpleWiki, we will create an open API to its Backlink engine, so that other applications (such as blogs) may use it.    (88)
  • Granular Addressability. PurpleWiki’s site-wide, automatic Purple Number management had the additional benefit of enabling us to quickly experiment and evolve the feature. Node IDs are now document-independent, which has improved usability and enabled features like Transclusions.    (89)
  • Link Types. We’ve added a syntax for specifying link types, and have implemented our first new link type: Transclusions.    (8A)
  • View Control. We can easily add new or customize existing output formats, thanks to PurpleWiki’s modular architecture. More importantly, we can implement dynamic views, such as a collapsible outline view, of any document on PurpleWiki.    (8B)

PurpleWiki’s most visible feature is Purple Numbers, but its most important feature is its data architecture. In general, Wikis view the world as a graph of linked documents. PurpleWiki views the world as a graph of linked documents with a unified data model. The data model is sufficiently general and simple to apply to any kind of document. When you use this data model in other applications, you automatically inherit PurpleWiki’s features. Integrating PurpleWiki with blogs was a piece of cake as a result, and we’ve only begun to explore the ramifications.    (8C)

Next Steps    (8D)

Our roadmap lists several enterprise features we plan on implementing: templates, pluggable database back-ends, mod_perl controller, etc. I consider these important, but relatively mundane. Blue Oxen Associates isn’t in the business of software development; we’re implementing these features because we need them.    (8E)

Blue Oxen Associates is in the business of research and improvement, and of facilitating coevolution. PurpleWiki is an amazing platform for this. One of Chris’s pet projects is to create a universal ID space for nodes, possibly based on handles. Think of this as persistent URIs for nodes. Among other things, this will enable us to support Transclusions of content from other sites, not just the site on which PurpleWiki is installed.    (8F)

My long-term interests lie in three areas: the aforementioned open Backlink engine, Wiki page types, and Wiki namespaces. I’ll discuss the latter a bit here.    (8G)

One of the reasons I dislike TWiki is its notion of “webs.” You can partition your Wiki into multiple webs, and each web is contained in its own namespace. The problem with this is that it encourages people to balkanize their Wiki content. That, in my opinion, runs counter to the spirit of Wikis. Early balkanization prevents evolution.    (8H)

Several months ago, Richard Gabriel passed along some insight he had learned from Ward Cunningham: In computer science, you want to keep namespaces separate. With Wikis, you want namespaces to clash. Ward’s idea for SisterSites is one way to create namespace clash. This idea could be taken a step further by allowing site administrators to establish a system of namespace resolution. For example, if a certain page doesn’t exist on a local namespace, the Wiki would search another namespace for that page. If it existed there, the Wiki would simply take the user to that page.    (8I)
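A system of namespace resolution like this could be as simple as an ordered search. The following Python sketch assumes each namespace is just a dictionary of pages; the `resolve_page` function and the sample namespaces are hypothetical, not part of PurpleWiki or SisterSites.

```python
# Hypothetical namespaces, listed in the order the administrator wants them
# searched (local namespace first). Page contents are placeholders.
NAMESPACES = [
    ("local", {"PurpleWiki": "Our notes on PurpleWiki."}),
    ("sister", {"PurpleWiki": "Their notes.", "SisterSites": "Ward's idea."}),
]


def resolve_page(name, namespaces):
    """Return (namespace, page) for the first namespace that contains `name`.

    Returns None when no namespace has the page.
    """
    for namespace, pages in namespaces:
        if name in pages:
            return namespace, pages[name]
    return None
```

Here, `resolve_page("SisterSites", NAMESPACES)` falls through to the sister namespace, while a page that exists locally always wins the clash.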

A Brief History of Purple Numbers

A few weeks ago, Chris Dent posted a brief history of Purple Numbers, noting, “This is likely full of errors as the story as I’ve heard it is incomplete and I was unable to check some things because the network path to California was busted while I was writing.” His account is pretty good, but there are a few holes here and there. Most of my clarifications are nitpicky. In case you, my dear readers, haven’t realized it yet, I am very anal.    (7W)

Chris differentiates NLS from the mouse, hypertext, GUIs, and so forth. NLS was actually the totality of all of these things. Chris also says that NLS had a “graph based document storage model.” This is a somewhat ambiguous description, and depending on how you read it, is not strictly true. Finally, NLS did not have transclusions, although its architecture could easily have supported them.    (7X)

In between NLS and the first appearance of Purple Numbers on the Bootstrap Institute’s web site, there was a graphical Augment client written in Smalltalk. The first Augment PC client was MS-DOS-based. Doug still uses this! In the early 1990s, a contractor wrote a GUI client for Windows, which displayed statement IDs (what we call node IDs) in purple. According to Doug, the color was chosen either by his daughter, Christina, or by the author of the client. In any case, this is why Purple Numbers are purple.    (7Y)

Bootstrap Institute first used Purple Numbers to display structural location numbers (what we call hierarchical IDs) for Augment documents converted to HTML. Soon afterwards, Frode Hegland suggested making the numbers a live link, so that it would be easy to copy and paste the node’s address.    (7Z)

On January 31, 2001, I released the first version of Purple, which was a collection of Perl and XSLT scripts for adding node and hierarchical IDs to documents and generating HTML with Purple Numbers. I believe my most important contribution at the time was recognizing that node IDs were more useful in a Web context, where documents were largely dynamic, than hierarchical IDs. On April 24, 2001, Murray Altheim released plink, a Java program similar to Purple, except that it worked on XHTML files.    (80)
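A toy Python example shows why node IDs hold up better when documents change. Both functions and the sample paragraphs are hypothetical, and the positional numbering is a deliberately flattened stand-in for real hierarchical IDs.

```python
def hierarchical_ids(paragraphs):
    """Address each paragraph by its position: "1", "2", and so on."""
    return {
        str(position): text
        for position, (_nid, text) in enumerate(paragraphs, start=1)
    }


def node_ids(paragraphs):
    """Address each paragraph by the ID assigned when it was created."""
    return dict(paragraphs)


# Hypothetical document as (node ID, text) pairs, before and after someone
# inserts a new paragraph in the middle.
BEFORE = [("A", "Introduction"), ("B", "Details")]
AFTER = [("A", "Introduction"), ("C", "Background"), ("B", "Details")]
```

After the edit, the positional address "2" points at the new Background paragraph, while node ID "B" still points at Details. A link made with the node ID survives the change.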

My OHS Launch Community experiment (June 15 – November 9, 2001) proved to be a fruitful time for Purple Numbers. Murray and I agreed on a standard addressing scheme for Purple Numbers. I implemented a MHonArc filter for adding Purple Numbers to, and extracting Backlinks from, mailing list archives, as well as a tool that generated HTML with Purple Numbers from Quest Map files.    (81)

I also started thinking about PurpleWiki for the first time, and hacked a first version based on TWiki. This experience gave me a better understanding of the right way to implement PurpleWiki, and it also gave me a healthy distaste for TWiki. When Chris joined Blue Oxen Associates, we made PurpleWiki a priority, and the rest is history. Chris’s account from there is pretty complete.    (82)