Transclusions, Path-Based Addressing, and Version Control

The PurpleWiki community has been rumbling recently, thanks largely to the contributions of two members of the Canonical Hackers, Jason Cook and Matthew O’Connor. Jason wrote perplog, an IRC logger that supports Purple Numbers and Transclusions. Matthew hacked a PurpleWiki node manager, then started adding and fixing other stuff, including an XML-RPC interface. Additionally, John Sechrest developed an experimental email interface to PurpleWiki. Lots of great stuff. It’s forcing us to get off our butts and make some long-promised changes and explain some long-undocumented things. Open Source is a wonderful thing.    (117)

A lot of the excitement is because of PurpleWiki‘s support for Transclusions. We had Transclusions in mind when we architected PurpleWiki, but we (or I, at least) didn’t think Transclusions would actually be implemented until much later. However, exactly one year ago today, Chris Dent got the itch and started playing. A few months later, Chris committed some code, and suddenly, we had Transclusions. It was a total hack, but it worked, and it was unexpectedly cool.    (118)

It’s still a hack, and it needs to be cleaned up, but it’s suddenly become a higher priority. First, we have a pretty good idea of how to support Transclusions “correctly.” Second, having had the chance to use Transclusions regularly, we are starting to recognize their utility and want to take greater advantage of that. (See, for example, my early specs for Abelard.) Third, people are starting to get excited about them.    (119)

Transcluding Multiple Chunks    (11A)

Currently, we support Transclusions of individual nodes (paragraphs, headers, list items, etc.) via the following syntax:    (11B)

  [t nid]    (11C)

where nid is the ID of the node you want to transclude. This works fine when you want to transclude small chunks, but at times, it’s useful to be able to transclude multiple chunks on a page. Rather than specify a transclusion for each individual node, it would be nice to have a syntax for specifying a collection of nodes in a single transclusion.    (11D)

Chris proposed the following syntax:    (11E)

  [t nid,nid,nid,...]    (11F)

This is problematic. The current implementation suggests that the transclusion command be replaced by the content identified by the specified NID. This proposal suggests that the command be replaced by both the content and the structure of the content. If you had a document like:    (11G)

  = Plan for World Domination {nid 1} =    (11H)
  # Finish PurpleWiki. {nid 2}    (11I)

and you tried to transclude this content with:    (11J)

  * [t 1, 2]    (11K)

what is the parser supposed to translate this to?    (11L)

A proper solution must be treated as its own structural element within a PurpleWiki document. More importantly, the syntax should capture document-specific context. This contrasts with the current syntax, which ignores document context entirely.    (11M)

Why is document context important? Suppose you have the following task list:    (11N)

  = To Do {nid 3} =    (11O)
  * Buy milk. {nid 4}   * Feed iguana. {nid 5}   * Implement distributed Transclusions. {nid 6}   * Vote in primaries. {nid 7}   * Expose API to Backlinks. {nid 8}    (11P)

Suppose you want to start a PurpleWiki-specific task list, transcluding all of the list items relevant to PurpleWiki (in this case, nodes 6 and 8). The resulting document might look like:    (11Q)

  {title PurpleWikiToDo}    (11R)
  = PurpleWiki To Do {nid 9} =    (11S)
  * [t 6] {nid A}   * [t 8] {nid B}    (11T)

Now, suppose you want to replicate this task list on another page. You could transclude all of the items individually, just as you do on the PurpleWiki To Do page. In this case, a slight variation of Chris’s proposed syntax (a standalone structural element) would simplify that process. (It also raises an interesting question: Which NIDs do you use for the transclusions: 6 and 8, or A and B? Does it make a difference?)    (11U)

However, what I really want to do is say, “Transclude all of the list items on the ‘PurpleWiki To Do’ page.” For this, you want something like XPointer:    (11V)

  [transclude PurpleWikiToDo#xpointer(id("9")/li)]    (11W)

A few observations: First, the command should be on a line by itself, and it should be interpreted as an independent structural element that will be replaced by a set of structural elements and content. I used “transclude” instead of “t” to make the point that these are two different commands. Second, the Transclusion command specifies a range of nodes within a document, as opposed to a document-independent list of nodes. Third, this combines a path-based address with an ID-based address.    (11X)

Fourth (and this is an implementation detail), if we want to support such syntax, it would behoove us to use an XML data model rather than the home grown model we’re currently using. This way, we could easily plug in existing XPointer implementations to do the queries.    (11Y)

Version Control    (11Z)

The fact that Wiki pages are dynamic throws a kink into all of this. The syntax I propose above takes this into account for the most part. Barring major changes to the PurpleWiki To Do page, the transcluded content will include all of the PurpleWiki tasks, even if more items are added later. If you had to list a set of NIDs, then you would have to be diligent about updating that list manually every time the To Do page changed.    (120)

In addition to supporting path-based addressing, we also need to allow people to specify versions in the address. In other words, you may want to transclude a specific version of a node or a set of nodes from a specific version of a document.    (121)

This shouldn’t be too difficult, but there are some complications. The biggie is whether to transclude a node if the node no longer exists in any document. My instinct right now is telling me that yes, it should, but it should make it clear somehow that it’s an orphan node. (See my previous entry on link integrity for more on this.)    (122)