Semantic web: dull as dishwater edition

Recovered from the Wayback Machine.

Mathew Ingram has decided that the problem with the semantic web is that it’s as boring as dry toast. Of course, by Mathew’s standard, all the stuff that makes the web work is also boring as hell. It’s probably a good thing, then, that some people looked beyond the need for immediate titillation when it comes to the tech underlying this environment, or Mathew’s audience for his opinions would be his immediate family members, and perhaps those neighbors not quick enough to run away when seeing him approach.

He also writes:

It’s all about plumbing and widgets and data standards, all of which have names like FOAF and TOTP and SIOC and whatnot. It’s right off the dork-o-meter. The Lone Gunmen from The X-Files would have a hard time getting interested in this stuff, let alone anyone who isn’t married to their slide rule or their pocket protector.

Now, taking Mathew’s complaints of No glitter! No glitter! Mama, Mama, where’s my glitter! seriously, I decided to put my slide rule down for a sec and see if I couldn’t respond to his one statement about no one knowing what this all means.

First, there was the web. The web was dumb, but it was hyperlinked.

Then, there was search. Search followed hyperlinks, scraped pages, massaged keywords and tested the strength of the links. The web was still dumb, but number crunching helped generate some smarts. Think of your favorite dog. Yeah, that smart.

Next, there was the semantic web. The semantic web says, You and I can derive understanding from this blob of text on this page, but applications can’t. Applications can pull keywords and run algorithms, but can only approximate what this blob of text is all about. What if we add a little information to this blob of text so that applications don’t have to crunch numbers or make guesses as to what we mean?

How do we add a little information? A hundred different ways. We can use microformats, or RDFa, or RDF, or whatever the HTML5 people cook up for us. With this little bit of extra information, an application can access a web page list that’s created with UL/LI elements, but instead of having to look at the text in the list and try to guess what the list is all about, it can read that little bit of data and know that the list consists of recommended books. Perhaps it can take that little list of books and use another application to look up these books at Amazon. Or at the library. Or better yet, let us click a button and load all the books into our Kindle. (Assuming that Mathew doesn’t subscribe to the Steve Jobs school of thought: “We don’t read, we ain’t got no books, gimme the vids”.)
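For the slide rule crowd, here is a minimal sketch of that book list idea. It assumes a page fragment marked up with RDFa-style `typeof` and `property` attributes; the vocabulary URL, the type names, the property names, and the placeholder books are all made up for illustration, and a real application would use a proper RDFa or microformats parser rather than this toy one.

```python
from html.parser import HTMLParser

# Hypothetical page fragment: an ordinary UL/LI list plus RDFa-style
# attributes. The vocabulary URL, type names, and property names are
# illustrative assumptions, not taken from any real specification.
PAGE = """
<ul vocab="http://example.org/vocab/" typeof="RecommendedBooks">
  <li typeof="Book"><span property="title">Example Book One</span>
      by <span property="author">Author A</span></li>
  <li typeof="Book"><span property="title">Example Book Two</span>
      by <span property="author">Author B</span></li>
</ul>
"""


class BookListParser(HTMLParser):
    """Toy reader that collects the annotated items instead of guessing."""

    def __init__(self):
        super().__init__()
        self.list_type = None   # what the list says it is
        self.books = []         # one dict per annotated list item
        self._current = None    # the book currently being read
        self._property = None   # the property whose text comes next

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "ul" and "typeof" in attrs:
            self.list_type = attrs["typeof"]
        elif tag == "li" and attrs.get("typeof") == "Book":
            self._current = {}
        elif self._current is not None and "property" in attrs:
            self._property = attrs["property"]

    def handle_data(self, data):
        # Only text that directly follows a property-bearing tag is kept.
        if self._current is not None and self._property:
            self._current[self._property] = data.strip()
            self._property = None

    def handle_endtag(self, tag):
        if tag == "li" and self._current is not None:
            self.books.append(self._current)
            self._current = None


def read_books(html):
    """Return (list type, list of book dicts) found in the markup."""
    parser = BookListParser()
    parser.feed(html)
    parser.close()
    return parser.list_type, parser.books


if __name__ == "__main__":
    kind, books = read_books(PAGE)
    print(kind)
    for book in books:
        print(book["title"], "/", book["author"])
```

The point is that the application never looks at the words “by” or tries to work out which bit of text is a title. The markup tells it, and the resulting list of dicts is what could be handed off to a bookstore or library lookup.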

The little bit of information might, instead, be an address for an event, triggering the browser to add that event information to a desktop calendar application.

It could be information about people we know and how we know them, so that when we move from Facebook, which is today’s darling, to MyPowerBase, we can tell MyPowerBase to add all people who we have defined as friends, but not those defined as just contacts.

If the information is embedded in a photo–wow, information embedded in a photo, how dull–when we upload the photo to a site like Flickr, it could automatically be added to a map, with all the other photos from the same location. It can be pulled up on a search someday, when we ask the web to show us all photos for St. Louis, or for a certain block in St. Louis. Perhaps it can even help us find photos that are licensed Creative Commons so we can steal them.

I might write about a product or company, and the little bit of information I add to my post might help others who are thinking of doing business with the company, or buying that product. Sure, search engines can scrape the content and try to glean useful bits based on keywords such as the product or company name, but we’ve all had enough really strange search results to know how far search can go, no matter how brainy the algorithm.

Someday, I’ll be able to write about movies and add just a little bit of extra information, and we can do the same for movies. Or music. Or cooking recipes (“give me all recipes on the web that use apricot jam and bourbon, but I don’t want chicken”). Or even poetry, though don’t mention poetry around Sir Tim–it makes him peevish.
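As a rough sketch of what that recipe request could look like once the ingredients are published as data rather than buried in prose: the recipe records below are invented placeholders, and `find_recipes` is a hypothetical helper, not part of any real service.

```python
# Hypothetical recipe records, as an application might hold them after
# reading structured ingredient data from several sites.
RECIPES = [
    {"name": "Recipe A", "ingredients": {"apricot jam", "bourbon", "pork"}},
    {"name": "Recipe B", "ingredients": {"apricot jam", "bourbon", "chicken"}},
    {"name": "Recipe C", "ingredients": {"apricot jam", "butter"}},
]


def find_recipes(recipes, must_have, must_not_have):
    """Names of recipes using every wanted ingredient and no unwanted one."""
    wanted = set(must_have)
    unwanted = set(must_not_have)
    return [
        recipe["name"]
        for recipe in recipes
        if wanted <= recipe["ingredients"]            # all wanted present
        and not (unwanted & recipe["ingredients"])    # none of the unwanted
    ]


if __name__ == "__main__":
    print(find_recipes(RECIPES, ["apricot jam", "bourbon"], ["chicken"]))
```

No keyword matching, no guessing whether “chicken” showed up in the recipe or in a comment about what the cook had for lunch. The query is just set arithmetic over data the author already supplied.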

Mathew is very addicted to FriendFeed, which allows him to pull in all the activities of his friends in various places. I bet if we scratched the surface of this application, a lot of the data that makes the application tick comes courtesy of the semantic web dorks.

I could go on and on, but I’ve already been away from my slide rule too long. Instead, I’ll save the best for last: because all of these different ways of adding that tiny little bit of useful information to blocks of text or photos or video files or what have you are based on agreed-upon specifications, we can use applications to merge this data and use it for something new; something we haven’t thought of yet. See, now that’s when it really gets exciting, because rather than coming up with an idea and then taking five years to get enough data to test it, we’ll already have the data, at no extra effort or cost.

Maybe I’ve been cooped up in my cube with my computers and code for too long, but that strikes me as kind of interesting. In a dorky sort of way.
