25 March 2003 Archives

Recovered from the Wayback Machine.

Such a long time since I’ve written about technology. Feels odd – like wearing clothes from the 80’s that still fit, but you’re not sure about leaving the house with them on.

Today I hope to finish the last of the final edits for the Practical RDF book, which means no longer putting off the edits that, frankly, frustrate the hell out of me. Two years ago when I began my interest in RDF, I felt that the concept was sound but the discussion about it was obscure. I still feel that way now, hence the frustration. It also doesn’t help that this obscurity is matched with what can only be called intellectual elitism and its attendant arrogance, both of which drive away any potential audience and interest in RDF.

Apropos, Sean McGrath just wrote about RDF and the dangers of abstraction and losing one’s audience. Specifically he talks about RDF as the focus of the W3C’s Semantic Web effort, and how this is forcing ‘the masses’ into basically tuning out, losing interest in what could be a Good Thing. Bad, he says:

I think it’s time for the Semantic Web proponents to stop trying to teach us all to think at their level of abstraction. We can’t (or won’t). Instead, the Semantic Web proponents should look at mapping transparently from the RSS 0.91, XFML 1.0 specifications that 94% of us are happy with, into the more abstract, generalized models that the other 6% need, for the applications they are all dying to take advantage of.

In other words, let everyone do their own XML thing and just transform the bloody mess into RDF/XML. Everyone can have their chocolate and eat it, too. Unfortunately, he uses that misbegotten, old, tired, and basically inconsequential, as well as absolutely boring RSS debate as the basis of his argument.
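To make the idea concrete, here is a minimal sketch of the kind of transformation Sean is describing: take a plain RSS 0.91 style feed and turn it into subject-predicate-object statements. The sample feed, the `rss_to_triples` function, and the `http://example.org/terms#` vocabulary are all made up for illustration; a real mapping would target an actual RDF vocabulary and emit RDF/XML rather than Python tuples.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary, invented for this sketch only.
EX = "http://example.org/terms#"

# A made-up feed in the simple, non-RDF style of RSS 0.91.
SAMPLE_FEED = """<rss version="0.91">
  <channel>
    <title>Example Weblog</title>
    <link>http://example.com/</link>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""


def rss_to_triples(xml_text):
    """Return (subject, predicate, object) tuples for a simple feed."""
    channel = ET.fromstring(xml_text).find("channel")
    channel_uri = channel.findtext("link")
    triples = [(channel_uri, EX + "title", channel.findtext("title"))]
    for item in channel.findall("item"):
        item_uri = item.findtext("link")
        # The channel publishes the item, and the item has a title.
        triples.append((channel_uri, EX + "hasItem", item_uri))
        triples.append((item_uri, EX + "title", item.findtext("title")))
    return triples


triples = rss_to_triples(SAMPLE_FEED)
for triple in triples:
    print(triple)
```

Note how little there is to map: a source, its items, and their order in the file.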

Sean is a wiz at transformation, and a bright light in the XML world; but he’s missed the boat with this one. The problem doesn’t lie with the specification and the W3C’s effort to promote it. The problem lies within the W3C, itself.

If you’ll forgive me a little digression, I want to talk a moment about another data model that faced stiff opposition. Years ago, a man named Codd wrote a paper proposing a new way of viewing data – the relational data model. Considering that he worked for IBM at the time, which was making considerable profit from non-relational data storage mechanisms, one would have expected that this paper, and the concept, would meet resistance, and it did. But the concept Codd proposed, a standardized model and view of data that would allow one to focus on the essentials of the business domain rather than the implementation of physical storage, was a sound one. DARPA became interested and so did some folks at UC-Berkeley who created a system called Ingres, which formed the inspiration for the first of the commercial databases in use today. Commercial databases created by people who not only knew about technology, but knew how to sell that technology.

They succeeded. You can’t swing a dead cat without hitting a relational database in the world today.

Yet, Codd’s data model can be considered very esoteric for the average person. Very “abstract”. However, rather than abandon the abstraction necessary to ensure that data is consistent, valid, and can be merged with data from other systems, the creators of relational databases provided tools and technologies to handle most of the implementation details of relational databases, allowing company technologists to focus on their own specific business needs.

I want to build a claims system for an insurance company. Okay, I start by mapping the business domain data to the relational data model and then have the DBAs implement it. That’s my start. It saves me a great deal of time because without the relational data model and the database implementation, I first of all would have to decide what model of data I’ll use, and then figure out the optimal implementation of that data store, and then build a prototype, test it, cross my fingers and hope it doesn’t result in invalid data – all before building something that actually meets the needs of the business.

The concepts underlying RDF are basically the same as those underlying the relational data model – a model for data that supports multiple business domains in such a way that the data from the domains can be merged and manipulated, consistently and efficiently. As an added bonus, both come with lots of tools that support that data management and manipulation so you don’t have to build your own.
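A minimal sketch of what that merging looks like, assuming nothing more than Python sets: each statement is a (subject, predicate, object) tuple, and two graphs from different business domains combine with a plain set union because they name the same thing with the same URI. Every URI and the `describe` helper below are hypothetical, and a real system would lean on an RDF toolkit rather than bare tuples.

```python
# Hypothetical identifier shared by both domains.
PERSON = "http://example.org/person/42"

# Statements from an insurance claims domain (made-up vocabulary).
claims = {
    (PERSON, "http://example.org/insurance#hasClaim", "http://example.org/claim/7"),
    ("http://example.org/claim/7", "http://example.org/insurance#amount", "1200"),
}

# Statements from an unrelated weblog domain (made-up vocabulary).
weblog = {
    (PERSON, "http://example.org/weblog#wrote", "http://example.org/post/1"),
}

# Merging two graphs is a set union: no schema migration, no mapping table.
merged = claims | weblog


def describe(graph, subject):
    """Return the sorted (predicate, object) pairs recorded for a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


for predicate, obj in describe(merged, PERSON):
    print(predicate, obj)
```

The person now carries statements from both domains, and neither source had to know about the other's vocabulary.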

Now, tell me: what’s so hard to understand about that?

You might be thinking that I’m supporting Sean’s assertions with my analogy comparing RDF to the relational data model and the implementors hiding most of the detail, but I’m not. One key factor in all of this is that people today design systems for the relational data model. They don’t throw the data out using their own unique variation of data store and then tell the DBAs and programmers to map the data to the database.

In other words, a decision is made to follow the relational data model from the beginning, using whatever tools, technologies, and experts necessary to use the data model correctly. There ain’t no free ride. If you want the job done right, do it right from the beginning. Don’t give me no Excel spreadsheet and tell me to slap it into Oracle and expect the database to support 10,000 people. It don’t work that way. I know. You might need to transform the data from an old system into the new system in the beginning – but you don’t try and support both at the same time. Not and expect it to scale.

Sure you can transform RSS 2.0 to RSS 1.0 and back. But RSS is basically a brain dead business model. You have a source, the source publishes items, here are the items, in this order. Even my mama can figure this one out. Of course you can make it more complicated, which the dear hearts associated with RSS do at the spill of a latte; but the underlying business model is the same. RSS is not a good ‘example’ on which to make a stand either for or against RDF.

I agree with Sean in that the W3C shouldn’t be forcing pure RDF model theory on the masses; I disagree when he says to continue to use whatever, transform it, and just bung it in when it suits us to map to RDF. If we want to do the job right, let’s do it right, from the beginning. Which means that at some point, we’re going to have to understand how to map the data of the domain to the RDF data model. RDF must be made accessible.

Unfortunately, the W3C is its own worst enemy when it comes to promoting RDF and the Semantic Web, and understanding the concerns of just plain folks when it comes to ‘abstraction’. Why? Because there are no street smarts at the W3C.

The W3C has representatives from some of the best research labs in the world. They come from the best universities, the most prestigious R & D centers at the largest corporations, and the most influential standards organizations in the industry. In many industries.

However, few, if any, of the members have been woken up with a call in the middle of the night by the SysOp because a database system failed during a quarter systems run, and then had to try and debug the problem over the phone for a non-programmer. Or looked into the face of a customer service rep who is trying to figure out how a multi-screen application is going to make their job better, when before they had a simple one page form.

They’ve never been faced with a business manager who tells them to do the application using XML. Why? Doesn’t matter. This manager knows that XML is the Big Thang – therefore shut up and use it. And, oh by the way, use Python for the application, she’s heard that it’s the language to use. Why? Doesn’t matter, just shut up and use the language. Oh, by the way, here’s the specs. These two pages are specs? Sure, we’re using that new iterative approach to development. Don’t need all the requirements up front. Improvise.

And then, when you’re done creating that ultra-modern Python/XML, extreme, iterative, ultra-hip application that’s guaranteed tested, bug free, on budget and on time, using GoF object patterns, and UML, and Rational Rose, and CVS, and what not, document it so a monkey can run the application. However, make sure the monkeys aren’t made to feel like monkeys.

Street smarts. In a more formal parlance – accountability to the using community. Somewhere along the way, the W3C has forgotten its accountability to the using community. The monkeys. Us.

Mark Pilgrim touched on this relatively recently when he said the following of XHTML 2.0 and its lack of backwards compatibility:

Standards are bullshit. XHTML is a crock. The W3C is irrelevant.

Now see – a monkey can understand this.

What’s interesting is that the W3C’s XHTML 2.0 reminds me of Oracle when it changed its underlying database foundation from partitions to tablespaces long ago. Sure it was the right thing to do, but it still almost caused the company to fail, the customers were that irate. You can’t just tell people to throw out their hard work because you have a ‘better’ way to do things. Not and maintain any credibility. Or customers. If you do have a ‘better’ way of doing things, it’s up to you to meet the community, not have the community meet you.

Two years I’ve worked with RDF in one form or another. After all this time, I still don’t understand half of what the RDF Core Working Group says in their little semantic debates. Is that a shocking thing to say since I’m writing a book on RDF? Could be. My editor is probably slapping his forehead right now as he reads this. However, when you consider that no two members of the group seem to either understand or agree with each other, either, I find myself in good company.

I’m not knocking Tim Berners-Lee and the RDF Core Working Group or the other W3C folks. They’re people who believe in what they do, have a vision, and the smarts and the drive to try and implement this vision. They genuinely love this stuff, and want to see it work. But somewhere along the way, they seem to have forgotten about us. The monkeys.