Visualizing Wiki Life Cycles

On the first day of WikiSym in Denmark last August, I spotted Alex Schroeder before the workshop began and went over to say hello. Pleasantries naturally evolved into a discussion about Purple Numbers. (Yes, I’ve got problems.) Alex suggested that unique node identifiers were more trouble than they were worth, because in practice, nodes that you wanted to link to were static. Me being me, my response was, “Let’s look at the numbers.” Alex being Alex, he went off and did the measurements right away for Community Wiki, and he did some followup measurements based on further discussions after the conference.    (LSP)

As it turned out, the numbers didn’t tell us anything useful, but our discussions firmly implanted some ideas in my head about Wiki decay rates — the time it takes for information in a Wiki page to stop being useful.    (LSQ)

I had toyed with this concept before. A few years ago, I came up with the idea of changing the background color of a page to correspond to the age of the page. A stale page would be yellowed; an active page would be bright white. I had originally envisioned the color to be based on number of edits. However, I realized this past week that I was mixing up my metaphors. There have been a few studies indicating a strong correlation between frequent edits and content quality, so it makes sense to indicate edit frequencies ambiently. However, just because content has not been edited recently does not mean the information itself is stale. You need to account for how often the page is accessed as well.    (LSR)

(At the Wikithon last week, Kirsten Jones implemented the page coloring idea. She came up with a metric that combined edits and accesses, which she will hopefully document on the Wiki soon! It’s cool, and it should be easy to deploy and study. Ingy dot Net suggested that the page should become moldy, a suggestion I fully endorse.)    (LSS)

This past Sunday, I had brunch with the Socialtext Bloomington Boys. Naturally, pleasantries evolved into Matthew and me continuing along our Wiki Analytics track, this time with help from Shawn Devlin and Matt Liggett. We broke Wiki behavior into a number of different archetypes, then brainstormed ways to visually represent the behavior of each of these types. We came up with this:    (LST)

https://i0.wp.com/farm1.static.flickr.com/149/388587151_3f730b0a5c_m.jpg?w=700    (LSU)

The x-axis represents time. The blue line is accesses; the green line is edits. Edits are normalized (edits per view) so that, under normal circumstances, the green line will always be below the blue (because users will usually access a page before editing it). The exception is when software is interacting with the Wiki more than people. The whole graph should consist of a representative time-slice in that Wiki’s lifespan.    (LSV)

The red line indicates the median “death” rate of Wiki pages. After much haggling, we decided that the way to measure page death was to determine the amount of time it takes for a page to reach some zero-level of accesses. We’ll need to look at actual data to see what the baseline should be and whether this is a useful measurement.    (LSW)

The red line helps distinguish between archetypes that may have the same access/edit ratio and curve. For example, on the upper left, you see idealized Wiki behavior. Number of edits are close to number of accesses, both of which are relatively constant across the entire Wiki over time. Because it’s a healthy Wiki, you’ve got a healthy page death rate.    (LSX)

On the upper right, you see a Wiki that is used for process support. A good example of this is a Wiki used to support a software development process. At the beginning of the process, people might be capturing user stories and requirements. Later in the process, they might be capturing bugs. Once a cycle is complete, those pages rapidly become stale as the team creates new pages to support a new cycle. The death line in this case is much shorter than it is for the idealized Wiki.    (LSY)

Again, one use of the Wiki isn’t better than the other. They’re both good in that they’re both augmenting human processes. The purpose of the visualization is to help identify the archetypes so that you can adjust your facilitation practices and tools to best support these behaviors.    (LSZ)

This is all theory at this point. We need to crunch on some real data. I’d love to see others take these ideas and run with them as well.    (LT0)

Leave a Reply