Do We Need the Semantic Web?

The Semantic Web, by Michael DaConta, Leo Obrst, and Kevin Smith (Wiley 2003), is a good book. I’ve worked with Michael a bit in an editorial context, and I’ve enjoyed some of his other writing. He thinks and explains things clearly, and this book is no exception. I especially enjoyed how The Semantic Web crisply defines a number of hairy concepts — ontologies, taxonomies, semantics, and so on. With some restructuring and condensing — there is some technical detail that isn’t that important, and the sections on ontologies could be more cohesive and should come earlier — this book could go from good to great.    (4V)

My goal here, however, is not to review The Semantic Web. My goal here is to complain about its premise.    (4W)

The authors say that the Semantic Web is about making data smarter. If we expend some extra effort making our data machine-understandable, then machines can do a better job of helping us with that data. By “machine-understandable,” the authors mean making the machines understand the data the same way we humans do. However, the authors make a point early in the book of separating their claims from those of AI researchers in the 1960s and 1970s. They are not promising to make machines as smart as humans. They are claiming that we can exploit machine capabilities more fully, presumably so that machines can better augment human capabilities.    (4X)

The authors believe that the Semantic Web will have an enormous positive effect on society, just as soon as it catches on. There’s the rub. It hasn’t. The question is why.    (4Y)

The answer lies with two related questions: What’s the cost, and what’s the return?    (4Z)

Consider the return first. Near the end of the book, the authors say:    (50)

With the widespread development and adoption of ontologies, which explicitly represent domain and cross-domain knowledge, we will have enabled our information technology to move upward — if not a quantum leap, then at least a major step — toward having our machines interact with us at our human conceptual level, not forcing us human beings to interact at the machine level. We predict that the rise in productivity at exchanging meaning with our machines, rather than semantically uninterpreted data, will be no less revolutionary for information technology as a whole. (238)    (51)

The key phrase above is, “having our machines interact with us at our human conceptual level, not forcing us human beings to interact at the machine level.” There are two problems with this conclusion. First, machines interacting with humans at a human conceptual level sounds awfully like artificial intelligence. Second, the latter part of this phrase contradicts the premise of the book. To make the Semantic Web happen, humans have to make their data “smarter” by interacting at the machine level.    (52)

That leads to the cost question: How much effort is required to make data smarter? The answer depends on how you read the book, but it seems to require quite a bit. Put aside the difficulties with RDF syntax — those can be addressed with better tools. I’m concerned about the human problem of constructing semantic models. This is a hard problem, and tools aren’t going to solve it. Who’s going to be building ontologies? I don’t think regular folks will, and if I’m right, it will be very hard to achieve a network effect on the order of the World Wide Web.    (53)
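To get a feel for that effort, here is a toy sketch in Python of a single English sentence hand-encoded as explicit triples. Every term in it (Dog, hasOwner, Person) is invented for illustration; a real modeler would also have to pick or design a shared vocabulary, which is exactly the hard human work I’m worried about.

```python
# One English sentence, as a human reads it:
sentence = "Rex is a dog owned by Alice."

# The same statement made "smarter": hand-translated into explicit
# subject-predicate-object triples. The terms (Dog, hasOwner, Person)
# are invented here; in practice they must come from a shared ontology.
triples = [
    ("Rex", "type", "Dog"),
    ("Rex", "hasOwner", "Alice"),
    ("Alice", "type", "Person"),
]

# The payoff: a machine can now answer a narrow question that the
# raw sentence, as a mere string, could not support.
owners = [obj for subj, pred, obj in triples if subj == "Rex" and pred == "hasOwner"]
print(owners)  # ['Alice']
```

Note that the translation step above is entirely manual: someone had to decide that ownership is a predicate, that Rex is an instance of Dog, and so on. Better tools can hide the syntax, but not that modeling decision.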

Human-Understandable Ontologies    (54)

There were three paragraphs in the book that really struck me:    (55)

Semantic interpretation is the mapping between some structured subset of data and a model of some set of objects in a domain with respect to the intended meaning of those objects and the relationships between those objects.    (56)

Typically, the model lies in the mind of the human. We as humans “understand” the semantics, which means we symbolically represent in some fashion the world, the objects of the world, and the relationships among those objects. We have the semantics of (some part of) the world in our minds; it is very structured and interpreted. When we view a textual document, we see symbols on a page and interpret those with respect to what they mean in our mental model; that is, we supply the semantics (meaning). If we wish to assist in the dissemination of the knowledge embedded in a document, we make that document available to other human beings, expecting that they will provide their own semantic interpreter (their mental models) and will make sense out of the symbols on the document pages. So, there is no knowledge in that document without someone or something interpreting the semantics of that document. Semantic interpretation makes knowledge out of otherwise meaningless symbols on a page.    (57)

If we wish, however, to have the computer assist in the dissemination of the knowledge embedded in a document — truly realize the Semantic Web — we need to at least partially automate the semantic interpretation process. We need to describe and represent in a computer-usable way a portion of our mental models about specific domains. Ontologies provide us with that capability. This is a large part of what the Semantic Web is all about. The software of the future (including intelligent agents, Web services, and so on) will be able to use the knowledge encoded in ontologies to at least partially understand, to semantically interpret, our Web documents and objects. (195-197)    (58)

To me, these paragraphs beautifully explain semantics and describe the motivation for the Semantic Web. I absolutely agree with what is being said and how. My concerns are with scope — the cost and benefit questions — and with priority.    (59)
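The interpretation step the authors describe — mapping raw data to a model by way of an ontology — can be sketched in a few lines of Python. The class hierarchy and facts below are invented for this sketch:

```python
# A toy ontology: an explicit subclass hierarchy (child -> parent).
# The classes are invented for illustration.
ontology = {
    "Dog": "Mammal",
    "Cat": "Mammal",
    "Mammal": "Animal",
    "Bird": "Animal",
}

# Raw data: facts tagged with classes from the ontology.
facts = [("Rex", "Dog"), ("Felix", "Cat"), ("Tweety", "Bird")]

def is_a(cls, target):
    """Walk up the subclass chain to decide whether cls is a kind of target."""
    while cls is not None:
        if cls == target:
            return True
        cls = ontology.get(cls)
    return False

# Without the ontology, a literal query for "Mammal" matches nothing in
# the data; with it, the machine can infer that dogs and cats qualify.
mammals = [name for name, cls in facts if is_a(cls, "Mammal")]
print(mammals)  # ['Rex', 'Felix']
```

The point of the sketch is that the “understanding” lives entirely in the ontology: the program only works because a human first wrote down the class hierarchy.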

The Semantic Web is only important insofar as it helps us humans with our problems. The problem that the Semantic Web is tackling is information overload. In order to tackle that problem, the Semantic Web has to solve the problem of getting machines to understand human semantics. This is related to the problem of getting humans to understand human semantics. To me, solving the problem of humans understanding each other is far more important than getting machines to understand humans.    (5A)

Ontologies are crucial for solving both problems. Explicit awareness of ontologies helps humans communicate. Explicit expression of ontologies helps machines interpret humans. The difference between the two boils down, once again, to costs and returns. The latter costs much more, but the return does not seem to be proportionately greater. I think it would be significantly cheaper and more valuable to develop better ways of expressing human-understandable ontologies.    (5B)

I’m not saying that the Semantic Web is a waste of time. Far from it. I think it’s a valuable pursuit, and I hope that we achieve what the authors claim we will achieve. Truth be told, my inner gearhead is totally taken by some of the work in this area. My concern is that our collective inner gearhead is causing us to lose sight of the original goal. To paraphrase Doug Engelbart, we’re trying to make machines smarter. How about trying to make humans smarter?    (5C)

Santa Maria Steaks at The Hitching Post

About a month ago, my friend Justin mentioned a town near Santa Barbara, California that claimed to have the world’s best barbecue. As I explained a few weeks ago, I claim to be somewhat of an authority on barbecue, having eaten it outside of California. To be so near (well, about 250 miles away) yet so ignorant of a place claiming to be the cradle of barbecue civilization was somewhat of a shock to me.    (4N)

I attempted to right that wrong yesterday at The Hitching Post, a steakhouse in Casmalia. Casmalia is a former mining town in the Santa Maria Valley, about 75 miles north of Santa Barbara.    (4O)

Santa Maria barbecue traces its roots to the Spanish ranchers who populated the region in the 1850s. To reward los vaqueros after a successful cattle roundup, the ranchers would throw a feast of top sirloin crusted with garlic salt and pepper and cooked slowly over a red oak fire, served with salsa and pinquitos, a small pinkish bean. Both the beans and the wood are native to Santa Maria.    (4P)

As its boastful claim suggests, Santa Maria takes its meat seriously. My challenge was to find a restaurant that specialized in the local fare. The city’s Chamber of Commerce web site was somewhat unhelpful. I couldn’t find a place whose menu jumped out at me as the real deal. Justin suggested The Hitching Post, which had a good reputation and also produced wine under its own label.    (4Q)

The steaks at The Hitching Post were excellent, the salsa was fresh, the servings were large, the wine (The Hitching Post Pinot Noir Santa Maria 2000) was good, and the price was reasonable. But, I wasn’t satisfied. I had a beef with their beef; namely, I don’t think The Hitching Post served true Santa Maria barbecue.    (4R)

Most people think that barbecue is food cooked over a hot fire. That’s actually grilling. Barbecue is food cooked slowly over a cool fire. The process tenderizes the meat while imbuing it with a delicious, smoky flavor. It’s what makes barbecued ribs or pulled pork literally fall off the bone.    (4S)

The Hitching Post served steaks, not barbecue. True, they used the correct cut of meat — top block sirloin. True, they rubbed it with garlic salt and pepper. True, they cooked it over a red oak fire. True, the steaks were delicious. But, it still wasn’t barbecue. The kicker was that they did not serve pinquitos.    (4T)

I could only conclude that I did not experience the true Santa Maria dining experience. That wrong still needs to be righted. I suspect that next Sunday, I will once again find myself in Santa Maria, searching, hoping, eating. Stay tuned.    (4U)

Clustering in the Margins

Seb Paquet posits the following theory: “people who pioneer group-forming practices are those who have a marked interest in something that is not generally shared by the rest of the population.” He cites evidence from a First Monday paper entitled, “A social network caught in the Web.”    (4L)

Said another way: Niche topics generate greater Shared Intensity.    (4M)

William Kent at Extreme Markup 2003

For my blog entry on e-mail clients, I had to look up the link for the Extreme Markup conference. In so doing, I noticed that William Kent will be keynoting this year’s conference.    (4I)

Bill Kent’s book, Data and Reality, is awesome. I consider it a must-read for anyone interested in data modeling. I won’t be able to make it to the conference this year, but I’d recommend it for those of you who can. It’s August 4-8 in Montreal, Quebec, Canada.    (4J)