Categories
RDF

Why a processor rather than a transform

I have spent a little time looking at other approaches to mapping RDF to a web document created as XHTML; approaches such as GRDDL, which uses XSLT to transform basic concepts from (X)HTML into RDF/XML and then provides a link to the transform.

(RAP just released a GRDDL parser, though it’s based on PHP 5.x, which means don’t expect it out on the streets too soon.)

This works if all you’re doing is pulling out data that can be mapped to valid XHTML structure elements. But it doesn’t work if you want to capture meaning that can’t be constructed from headers and paragraphs, or DIV blocks with specific class names. Still, it meets the criterion of minimal human intervention, which finds favor among Semantic Web developers. If the user is providing a page anyway, we might as well glean the meaning of it.
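To make "gleaning" from structure concrete, here is a rough sketch in Python rather than XSLT. It is not GRDDL itself, just the same idea: pull out whatever the XHTML structure already exposes (here, the title and the headings). The sample page and the function name `glean_structure` are made up for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample page; any well-formed XHTML document would do.
SAMPLE_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Poems about rivers</title></head>
  <body>
    <h1>Poems about rivers</h1>
    <p>A short list of favorites.</p>
    <h2>River Song</h2>
    <p>Some thoughts on why this one stays with me.</p>
  </body>
</html>"""

# Fully qualified tag names for h1 through h6.
HEADING_TAGS = {f"{{{XHTML_NS}}}h{level}" for level in range(1, 7)}


def glean_structure(xhtml_text):
    """Collect only what the markup structure gives away for free."""
    root = ET.fromstring(xhtml_text)
    title = root.findtext("h:head/h:title", default="", namespaces={"h": XHTML_NS})
    headings = [
        (el.tag.split("}")[1], "".join(el.itertext()).strip())
        for el in root.iter()          # document order
        if el.tag in HEADING_TAGS
    ]
    return {"title": title, "headings": headings}


if __name__ == "__main__":
    print(glean_structure(SAMPLE_PAGE))
```

Note what the result does not contain: nothing in it says the page is about loss, or rivers as metaphor, or anything else a poetry finder would care about. The structure gives you a title and an outline, and that is all.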

However, as we’ve found with Google, which does basically the same thing except it performs its magic after the material is accessed, automated mechanisms only uncover part of the story. This is why I get people searching on the oddest things coming to my site – accidental groupings of words pulled from my pages just happen to meet a word combination on which they’re searching.

In other words, hoping to discover semantics accidentally only goes so far.

One reason I use a poetry finder as a test of any new semantic web technologies and approaches is that any solution that would work to help people find the right subset of poetry won’t do so because of accidental semantics.

Let’s look at two popular RDF vocabularies: RSS and FOAF. RSS is an accidentally semantic application. The same data that drives an application such as a weblogging tool can be used to create RSS without much intervention on the part of the user. I could also use the same mechanism that drives RSS to drive out something like my Post Content vocabulary, PostCon.

(Though some of the information I capture in PostCon, such as the fact that a page has been pulled and why it’s been pulled, cannot be captured in RSS; RSS implies a specific state for a document: “I exist.”)

FOAF, on the other hand, requires that the user sit down and identify relationships. There really are little or no accidental semantics for this vocabulary, unless you follow some people’s idea that FOAF and blogrolls are one and the same (a hint: they’re not).

So what drives out the need for FOAF? Well, much of it is driven out by people attracted to bright, new, shiny objects. Still, one can see how something like FOAF could be used to drive out systems of social networks, or even *shudder* webs of trust, so there is an added benefit to doing the work for FOAF beyond it being cool and fun.

The key to attracting human intervention, beyond getting someone influential and well known to push it, is to make it easy for the end user–the non-XML, non-RDF end user–to provide the necessary data, and then to provide good reasons why they would do so. The problem with this approach, though, is that many Semantic Web technologists don’t want to work on approaches that require the human as an initial part of the equation. Rightfully so: a solution that requires effort from people, and that won’t have a payback until critical mass is reached, is not something that’s easy to sell.

Still, I think FOAF has shown a direction to follow – keep it simple, uncomplicated, and perhaps enough people will buy in at first to reach the critical mass needed to bring in others. The question, though, is whether it can attract the interest of the geeks, because it’s not based on XSLT.

With GRDDL, one can attach a class name to a DIV or SPAN element, and then use XSLT to generate matching RDF/XML. This removes some of the accidental discovery by explicitly stating something of interest with that DIV element. More, this doesn’t require that the data be kept separate from the document – it would be embedded directly in the document.
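Here is roughly what that looks like in practice, sketched in Python instead of XSLT so it can be run without a transform engine. It is not a GRDDL implementation. The class names, the `http://example.org/poetry#` vocabulary, the sample markup, and the function name `classes_to_rdfxml` are all hypothetical; the point is only to show class names on SPAN elements becoming properties in RDF/XML.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/poetry#"  # hypothetical vocabulary, for illustration

# Hypothetical mapping from class names in the page to vocabulary properties.
CLASS_TO_PROPERTY = {"poem-title": "title", "poet": "poet", "theme": "theme"}

# Hypothetical annotated fragment, as a weblogger might write it.
ANNOTATED = """<div xmlns="http://www.w3.org/1999/xhtml" class="poem">
  <p>I keep returning to <span class="poem-title">River Song</span> by
  <span class="poet">A. N. Author</span>, a poem about
  <span class="theme">loss</span>.</p>
</div>"""


def classes_to_rdfxml(xhtml_text, about):
    """Turn class-annotated elements into properties of one RDF description."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("ex", EX_NS)
    root = ET.fromstring(xhtml_text)
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    for el in root.iter():
        for class_name in el.get("class", "").split():
            prop = CLASS_TO_PROPERTY.get(class_name)
            if prop is None:
                continue  # a class we have no mapping for, such as "poem"
            text = " ".join("".join(el.itertext()).split())  # collapse whitespace
            ET.SubElement(description, f"{{{EX_NS}}}{prop}").text = text
    return ET.tostring(rdf, encoding="unicode")


if __name__ == "__main__":
    print(classes_to_rdfxml(ANNOTATED, "http://example.org/posts/1"))
```

The data never leaves the document, which is the appeal. But someone still had to wrap each phrase in a SPAN and pick the right class name.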

However, rather than making this less complicated, the whole thing strikes me as making the discovery of information much more complicated than it need be.

Now, not only would the end user have to write the text, they would have to go through that text and mark specific classes of information on each element within the XHTML. This then exposes the end user to the XHTML, unless one starts getting into a fairly complicated user interface.

Still, this is another approach that could be interesting, especially when one considers the use of Markdown and other HTML transforms used in weblogging tools. How to do something like this and have it map to multiple data models could be challenging.

Don’t mind me, still thinking out loud.

Categories
RDF

I need to keep up more

…with the Semantic Web doings at the W3C, though doing so precludes doing much else at times.

However, not keeping up means that I’m losing important bits of information; such as this bit that Danny Ayers named his new kitten after a proposed new query language for RDF.

Ah well, at least he didn’t name her Ontaria.

Categories
Technology

Bye bye Windows Bye bye Linux

Recovered from the Wayback Machine.

Over the last several months, I’ve been moving more and more of my work from my Windows/Linux dual-boot laptop to my Mac. Now with the open source development environment working so effortlessly in my PowerBook, there’s little reason to stay with my other machine.

I still get Excel spreadsheets and Word documents, but today I upgraded my OpenOffice environment to 1.1.2, and the performance and ease of use with this application has now reached a point that I can do without Office. My printer doesn’t work with my Mac, but to be honest the print drivers don’t work with many of my Windows applications. Besides my roommate’s printer just died, and he could use a new one.

Next week I’ll take my PowerBook down to the Apple Store, to the so-called Genius Bar (how pretentious can one get?) and have them fix the battery, and tell me how much it would cost to upgrade my hard drive to a larger size. After that, I’ll spend the next week cleaning out my Windows laptop and re-installing the software from scratch; giving the box and the printer to my roommate and buying a new photo-capable printer, desktop keyboard and possibly a stylus and pad for my Mac. (Suggestions on all of these would be welcome.)

At that point, for the first time since I was a tester of the earliest beta release of Windows, back in the 80’s, I won’t be using a Windows box as my primary work machine.

A few years ago, I never would have thought this could occur. I had written a best-selling book on COM/COM+ and ASP for O’Reilly, I was a member of the Microsoft Developer Network, had passed several Windows certification tests, attended Windows conferences almost exclusively, and programmed primarily in VB and VC++ and just a little Java. In addition, I scoffed at the Macs with their cute graphics, and decided if I were to go with a second environment, away from my beloved Windows, it would be Linux.

This weekend, though, I was able to install several open source applications far more easily than I ever could on Linux, primarily because Mac users won’t tolerate piecemeal packages, cryptic installation instructions, and a hacker’s attitude of “well, if you have to ask how something works, you shouldn’t use it”. Best of all, they work out of the box on the Mac, with no mucking around with Windows ‘tweaks’.

I am now become one of the Mac people I used to look askance at years ago; you know, the starry-eyed ones who wax on and on enthusiastically about their machines. However, I draw the line at standing in line for an Apple Store opening, or spending time in the Mac forums comparing the size of my local Genius Bar with those of other members.

Categories
Political

All is relative

Note from 2023 when this was recovered: No, I was wrong. Brooks is the pawn of the devil, and he’s not worth listening to.

Loren Webster talks about too many bridges being burnt, and I can identify with this. I am at that point now where I am thinking of burning some bridges, an impulse brought on by reading others’ implications that those of us who don’t share the sound and fury about the ‘reds’ winning, are somehow compromising our beliefs.

Per the Tracy Chapman song Loren quoted:

All the bridges that you burn
Come back one day to haunt you
One day you’ll find you’re walking
Lonely

Scott Hanson pointed to a NY Times editorial by David Brooks very worth reading. I know, I know — Brooks is the pawn of the devil. But he’s also one of the ‘reds’ we should be listening to:

But the same insularity that caused many liberals to lose touch with the rest of the country now causes them to simplify, misunderstand and condescend to the people who voted for Bush. If you want to understand why Democrats keep losing elections, just listen to some coastal and university town liberals talk about how conformist and intolerant people in Red America are. It makes you wonder: why is it that people who are completely closed-minded talk endlessly about how open-minded they are?

(Here’s another interesting NY Times article on this issue, but I don’t necessarily agree with all the opinions expressed.)

Looking at the vote counts in the states that passed anti-gay marriage initiatives, the numbers they’re getting are only possible if Democrats voted for these in addition to Republicans. Remember my post, An Actual Conversation? Both of the people featured in it voted for Kerry. Nothing is ever as black and white as it first appears.

I wrote in comments in another weblog that I hope every fear I have about what could happen under Bush doesn’t materialize. I have no greater desire now, than to be proven wrong about all of it, and will do everything in my power to ensure this.

Last post on politics for a while. I think we all need to take a deep breath and give this subject some space. I for one happen to like most of the people on the other side of the bridges I’ve been thinking of putting to the flame; too much so to follow the impulses of the moment. They’ll have to make their own decisions as regards their own fires.