A Brief History of Purple Numbers

A few weeks ago, Chris Dent posted a brief history of Purple Numbers, noting, “This is likely full of errors as the story as I’ve heard it is incomplete and I was unable to check some things because the network path to California was busted while I was writing.” His account is pretty good, but there are a few holes here and there. Most of my clarifications are nitpicky. In case you, my dear readers, haven’t realized it yet, I am very anal.    (7W)

Chris differentiates NLS from the mouse, hypertext, GUIs, and so forth. NLS was actually the totality of all of these things. Chris also says that NLS had a “graph based document storage model.” This is a somewhat ambiguous description and, depending on how you read it, is not strictly true. Finally, NLS did not have transclusions, although its architecture could easily have supported them.    (7X)

In between NLS and the first appearance of Purple Numbers on the Bootstrap Institute’s web site, there was a graphical Augment client written in Smalltalk. The first Augment PC client was MS-DOS-based. Doug still uses this! In the early 1990s, a contractor wrote a GUI client for Windows, which displayed statement IDs (what we call node IDs) in purple. According to Doug, the color was chosen either by his daughter, Christina, or by the author of the client. In any case, this is why Purple Numbers are purple.    (7Y)

Bootstrap Institute first used Purple Numbers to display structural location numbers (what we call hierarchical IDs) for Augment documents converted to HTML. Soon afterwards, Frode Hegland suggested making the numbers live links, so that it would be easy to copy and paste a node’s address.    (7Z)

On January 31, 2001, I released the first version of Purple, which was a collection of Perl and XSLT scripts for adding node and hierarchical IDs to documents and generating HTML with Purple Numbers. I believe my most important contribution at the time was recognizing that node IDs were more useful than hierarchical IDs in a Web context, where documents were largely dynamic. On April 24, 2001, Murray Altheim released plink, a Java program similar to Purple, except that it worked on XHTML files.    (80)

My OHS Launch Community experiment (June 15 – November 9, 2001) proved to be a fruitful time for Purple Numbers. Murray and I agreed on a standard addressing scheme for Purple Numbers. I implemented a MHonArc filter for adding Purple Numbers to and extracting Backlinks from mailing list archives, and a tool that generated HTML with Purple Numbers from Quest Map files.    (81)

I also started thinking about PurpleWiki for the first time, and hacked a first version based on TWiki. This experience gave me a better understanding of the right way to implement PurpleWiki, and it also gave me a healthy distaste for TWiki. When Chris joined Blue Oxen Associates, we made PurpleWiki a priority, and the rest is history. Chris’s account from there is pretty complete.    (82)

Purple Numbers and Link Integrity

Danny Ayers is looking to implement Purple Numbers in his Wiki, and had the following question:    (66)

But is the expectation that the anchor will always refer to the same information item?    (67)

If we’re going for coolness, I think this may cause problems in the context of Wikis. Ok, pages come and go but the URI will (usually) always address something sensible – edit new page if the one originally addressed has gone.    (68)

But the Purple anchors are pointing to info-snippets that may be modified (no problem – it’s still conceptually the same item) or be deleted (problem).    (69)

The expectation is that the information to which an anchor points may change. This is obviously not ideal.    (6A)

In March 2001, I wrote some notes for the Open Hyperdocument System entitled, “Thoughts on Link Integrity.” I had posted those notes to a mailing list, but those archives no longer exist, so I reproduce those thoughts below.    (6B)

Danny also mentioned using trackback as an aggregator of comments. There is such a system, which Chris Dent also mentions in his response: Internet Topic Exchange. We used it for the 2003 PlaNetwork Conference to aggregate blogs about the conference.    (6C)

Thoughts on Link Integrity    (6D)

We want the OHS to maintain link integrity across all documents. In other words, once you create a link to something, it should never break.    (6E)

The first requirement for link integrity is that documents are never deleted from the system. If you link to a document, and that document is subsequently removed, the link breaks. The only way to fix that link is to put the document back into the system.    (6F)

The second requirement is to have a logical naming scheme that is separate from the physical name and location of a document. On the web, if you have the document http://foo.com/bar.html, and you move it to http://foo.com/new/bar.html, links to the first URL break. You need a name for that document that will always point to the right place, even if the document is physically moved to a different part of the system.    (6G)
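As a rough illustration of the second requirement, here is a minimal sketch of a registry that maps a stable logical name to whatever the current physical location happens to be. The class name `NameRegistry`, its methods, and the logical name `doc:bar` are all hypothetical; nothing here reflects an actual OHS design.

```python
class NameRegistry:
    """Maps stable logical names to current physical locations (hypothetical)."""

    def __init__(self):
        self._locations = {}

    def register(self, logical_name, location):
        # A logical name is assigned once and never reused.
        if logical_name in self._locations:
            raise ValueError(f"{logical_name} is already registered")
        self._locations[logical_name] = location

    def move(self, logical_name, new_location):
        # Moving a document changes only its physical location.
        if logical_name not in self._locations:
            raise KeyError(logical_name)
        self._locations[logical_name] = new_location

    def resolve(self, logical_name):
        # Links store the logical name and resolve it at follow time.
        return self._locations[logical_name]


registry = NameRegistry()
registry.register("doc:bar", "http://foo.com/bar.html")
registry.move("doc:bar", "http://foo.com/new/bar.html")
print(registry.resolve("doc:bar"))  # http://foo.com/new/bar.html
```

A link that stores `doc:bar` rather than the URL keeps working after the move, because only the registry entry changes.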

The third requirement is version control. This is where things start to get a little hairy. Version-controlled systems are insert-only. In theory, nothing is ever removed. This satisfies the first requirement.    (6H)
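To make "insert-only" concrete, here is a small sketch of a document store that only ever appends versions. The class `VersionedDocument` and its methods are assumptions made for illustration, not part of any real version control system.

```python
class VersionedDocument:
    """An insert-only document: versions are appended, never removed (hypothetical)."""

    def __init__(self, first_text):
        self._versions = [first_text]

    def commit(self, text):
        # Append a new version and return its 1-based version number.
        self._versions.append(text)
        return len(self._versions)

    def get(self, version):
        # Every version ever committed remains retrievable.
        if not 1 <= version <= len(self._versions):
            raise KeyError(version)
        return self._versions[version - 1]

    @property
    def latest(self):
        return len(self._versions)


doc = VersionedDocument("first draft")
doc.commit("second draft")
print(doc.get(1))   # first draft
print(doc.latest)   # 2
```

Because there is no delete operation, a link to version 1 can always be followed, which is how version control satisfies the first requirement.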

However, in a useful Dynamic Knowledge Repository (DKR), links don’t just not break, they also evolve. Suppose you have a document, foo.txt, that contains the following text:    (6I)

  These are the dasy that try men's souls.    (6J)
  Example. foo.txt, version 1.    (6K)

Note that there’s a typo — “dasy” should be “days.”    (6L)

Now suppose someone creates a link to this sentence in this version of the document. Suppose that afterwards, you notice the typo and correct it. This results in a new version of the document:    (6M)

  These are the days that try men's souls.    (6N)
  Example. foo.txt, version 2.    (6O)

If your links neither broke nor evolved, then the original link would continue to point to version 1 of the document, not this new version. However, this does not always seem to be desirable behavior. If I created a link to this sentence — essentially designating it interesting and relevant content — when the typo is corrected, I’d prefer that the link now point to the corrected document, version 2.    (6P)

This is certainly doable. The system could automatically assume that the link pointing to the first sentence in version 1 should now point to the first sentence in version 2.    (6Q)

However, there are two scenarios in which this would not be the correct behavior. First, what if, instead of fixing the typo, the sentence were changed to:    (6R)

  Livin' la vida loca.    (6S)
  Example. foo.txt, version 3.    (6T)

If the purpose of the link is to designate the target content as relevant, then the content of the first sentence of this third version no longer applies, because the meaning of the sentence has completely changed.    (6U)

Second, what if the link is from an annotation that says, “There’s a typo in this sentence”? In this case, you would want the link to point only to version 1, since the typo does not exist in version 2 (and, for that matter, in version 3).    (6V)

How can we accommodate these scenarios? One solution would be to allow the user to define how the link should evolve with new versions of the document. So, for example, you could specify that the link that points to the first sentence in version 1 should also point to the first sentence in some number of subsequent versions of foo.txt.    (6W)
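One way to picture a user-defined evolution policy is a link that records the last version it is allowed to follow. The sketch below uses the three versions of foo.txt from the examples above. The `Link` class, its fields, and the convention that `last_version=None` means "follow every later version" are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional

# The three versions of foo.txt from the examples, keyed by version number.
FOO_VERSIONS = {
    1: ["These are the dasy that try men's souls."],
    2: ["These are the days that try men's souls."],
    3: ["Livin' la vida loca."],
}


@dataclass(frozen=True)
class Link:
    """A link to one sentence, valid over a range of versions (hypothetical)."""
    sentence: int                        # 0-based sentence index
    first_version: int                   # version the link was created against
    last_version: Optional[int] = None   # None: follow all later versions

    def resolve(self, versions):
        # Follow the newest version the policy allows.
        latest = max(versions)
        if self.last_version is None:
            version = latest
        else:
            version = min(self.last_version, latest)
        return version, versions[version][self.sentence]


typo_note = Link(sentence=0, first_version=1, last_version=1)
relevant = Link(sentence=0, first_version=1, last_version=2)
always_latest = Link(sentence=0, first_version=1)

print(typo_note.resolve(FOO_VERSIONS))      # stays on version 1
print(relevant.resolve(FOO_VERSIONS))       # evolves to version 2, stops there
print(always_latest.resolve(FOO_VERSIONS))  # follows through to version 3
```

The annotation about the typo pins itself to version 1, while the "relevant content" link follows the correction into version 2 but not the rewrite in version 3. The sketch leaves open how the user would decide where the range ends.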

Another solution would be to have the system automatically notify everyone who has linked to a document (or who has otherwise registered for notification) that the document has changed, and have those people manually update their links, possibly providing suggestions as to how to update the links.    (6X)
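The notification approach could be sketched as a simple registry of who has linked to which document. The names `ChangeNotifier`, `register_link`, and `document_changed`, along with the message wording, are hypothetical. A real system would also need to deliver the messages and suggest link updates.

```python
from collections import defaultdict


class ChangeNotifier:
    """Tracks who links to each document and reports changes (hypothetical)."""

    def __init__(self):
        self._linkers = defaultdict(set)

    def register_link(self, document, person):
        # Called when someone links to a document or asks to be notified.
        self._linkers[document].add(person)

    def document_changed(self, document, new_version):
        # Build one message per interested person, in a stable order.
        people = sorted(self._linkers.get(document, ()))
        return [
            f"{person}: {document} is now at version {new_version}; "
            "please review your links."
            for person in people
        ]


notifier = ChangeNotifier()
notifier.register_link("foo.txt", "alice")
notifier.register_link("foo.txt", "bob")
messages = notifier.document_changed("foo.txt", 2)
for message in messages:
    print(message)
```

Unlike the version-range policy, this keeps the decision with the people who created the links, at the cost of asking them to act every time a document changes.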