Categories
RDF

Jena Week

Recovered from the Wayback Machine.

When I wrote Practical RDF, the folks at HP’s Semantic Web Research Lab were in the process of creating the second major release of Jena, the popular and extremely comprehensive Java RDF API. At that point, however, the release was in a pre-alpha state and wasn’t stable enough for inclusion in the book. With the release of the first formal beta of Jena2, the product is now ready for prime-time discussion.

In the next week, I’m going to explore differences in the Java classes by porting the book examples over to Jena2. In addition, I’m going to take a look at some of the new features, including the new ontology API that supports OWL. I’ll also run all of my existing RDF/XML documents through the Jena2 parser, ARP2, to see how they fare with the updated parser. Are they still valid with recent RDF specification updates from the Last Call effort? They should be: they validate with the RDF Validator, and it’s built on Jena.

I am tempted with this release to install Tomcat on the Wayward Weblogger co-op, so I can use Jena2 with some of my RDF applications. I hesitate, though, because Tomcat can be a drag on resources.

FOAF page and specification update

Dan Brickley and Libby Miller have updated the FOAF Specification Page, and have done a very nice job of it, too. This becomes a good schema page/documentation page model for others to use with their RDF vocabularies. This also reminds me that I need to focus on both PostCon and the RDF Poetry Finder, along with their associated vocabularies, and get these finished. Lately I’ve been too distracted with other things.

Speaking of RDF vocabularies, I was reminded this week – and I’m not sure where I read this – that there really are no ‘vocabularies’ in RDF/XML, just sub-graphs, and that’s why namespacing and ‘smushing/smooshing’ of data from different schemas is so effortless in RDF/XML. Good point. But I still think of them as ‘vocabularies’. I guess a rose by any other name and all that.
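
The sub-graph point can be sketched in a few lines of Python, treating a graph as nothing more than a set of (subject, predicate, object) triples. This is only an illustration: the example.org URIs and the person's name are made up, and a real application would use an RDF API such as Jena rather than bare tuples. Only the FOAF and Dublin Core namespace URIs are real.

```python
# Namespace URIs for two independently designed vocabularies.
FOAF = "http://xmlns.com/foaf/0.1/"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical resources, invented for this sketch.
ME = "http://example.org/people#me"
WEBLOG = "http://example.org/weblog/"

# One sub-graph written with FOAF terms...
foaf_graph = {
    (ME, FOAF + "name", "Example Person"),
    (ME, FOAF + "weblog", WEBLOG),
}

# ...and another that uses Dublin Core and repeats one FOAF statement.
dc_graph = {
    (WEBLOG, DC + "creator", ME),
    (ME, FOAF + "name", "Example Person"),
}

# Merging is a set union: repeated statements collapse, and terms from
# different namespaces cannot collide because each is a full URI.
merged = foaf_graph | dc_graph


def describe(graph, subject):
    """Return the sorted (predicate, object) pairs recorded for a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)
```

This covers only the plain merge of statements that already share URIs; matching up nodes that have no shared URI takes more work than a set union.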

(Thanks to Danny for heads up on FOAF.)

Tax? Or Precision?

I read the comments about “RDF tax” and how we must “prove” RDF’s worth, yet when I look at so many plain XML feeds, all I can see is the improvement that could be added because of the precision of using RDF/XML. Not all XML feeds, but any that are purposed beyond being generated and consumed by one product.

For instance, in an IRC conversation yesterday about the Pie/Echo/Atom syndication feed, a question arose about how to interpret the ordering and grouping of the contributors and entries within the plain vanilla XML feed. Yesterday’s exercise was focused on doing a straight port of functionality, which meant, according to the participants, that both the entries and contributors needed to be placed within an RDF Seq container. Why? Because order is implied in XML. Or part of the specs, I’m not sure which.

A discussion then commenced that just because XML enforces a grouping and a sequencing on child elements by some form of default, doesn’t mean we have to constrain the same elements within RDF/XML; we have other options built directly in the syntax. We could use repeating properties, which means there is no implied grouping. We could use a Seq, which means there is a grouping, and a sequence to the elements. Or we could have used a List, which also includes a nil factor, meaning that these elements are in a group, and there are no other members for this group. There’s a lot less ‘slop’ built into RDF/XML than there is in just plain XML by default.
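
To make the three options concrete, here is a minimal Python sketch of the triples each one would produce. It is an illustration under assumptions, not anything from the Pie/Echo/Atom discussion: the entry URI, the `contributor` and `contributors` properties, and the blank node labels are all invented. Only the `rdf:` terms (`Seq`, `_1`, `first`, `rest`, `nil`) come from the RDF vocabulary.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/terms#"      # hypothetical vocabulary
ENTRY = "http://example.org/entry/1"  # hypothetical feed entry


def repeated_properties(people):
    """One statement per contributor: no grouping and no order."""
    return {(ENTRY, EX + "contributor", p) for p in people}


def as_seq(people):
    """rdf:Seq: a group with an order, but nothing says the group is complete."""
    seq = "_:seq"
    triples = {
        (ENTRY, EX + "contributors", seq),
        (seq, RDF + "type", RDF + "Seq"),
    }
    for i, person in enumerate(people, start=1):
        triples.add((seq, f"{RDF}_{i}", person))  # rdf:_1, rdf:_2, ...
    return triples


def as_list(people):
    """RDF List: a group with an order, closed off by rdf:nil."""
    if not people:
        return {(ENTRY, EX + "contributors", RDF + "nil")}
    triples = {(ENTRY, EX + "contributors", "_:n0")}
    for i, person in enumerate(people):
        node = f"_:n{i}"
        rest = f"_:n{i + 1}" if i + 1 < len(people) else RDF + "nil"
        triples.add((node, RDF + "first", person))
        triples.add((node, RDF + "rest", rest))
    return triples
```

Swapping the order of the input changes nothing for `repeated_properties`, but it does change the output of `as_seq` and `as_list`, and only `as_list` ends with a statement that there are no further members.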

That “RDF/XML” tax, as it’s called, opened up a door to a conversation that forced the members of the Pie/Echo/Atom group to look at implicit behavior in XML and decide if this is the type of behavior they want to enforce, or whether this is the type of behavior imposed by the tree-structured nature of XML’s hierarchy. As Sam wrote:

(T)his effort provided an alternate insight into this data, which surfaced a number of questions I never pondered before. For example: is the order of contributors significant? This needs to be answered and documented.

Of course, we can use XML DTDs and XML Schema and a host of other XML ancillary specifications to provide the same precision as we have within the RDF/XML model and syntax; but it doesn’t strike me that the vanilla XML is all that ‘readable’ at this point. In fact, seems to me that you’d have to spend at least as much time reading and working with the XML specifications to do XML ‘properly’ as you would with RDF/XML.

In other words, there’s a ‘tax’ to XML, too, but it can be ignored if you don’t care whether your feed is imprecise. In fact, if my memory serves me, one of the reasons why people had trouble with RSS 2.0 is they felt there were ambiguities in it – that constraints were too loosely defined and it was too easy to generate a ‘sloppy’ XML feed. One of the reasons for Pie/Echo/Atom was to create a tight, well-defined feed that also allowed room for future growth. This type of rigor implies that you either use DTDs or XML Schema to ensure similar behavior, or you use RDF/XML.

Folks ask me to prove the worth of RDF/XML. I think at this point I’ll turn this one around – I’ll ask the “RDF Tax” folks to prove to me that vanilla XML provides the same precision in both meaning and implemented behavior as RDF/XML – and is still ‘readable’.

I am not the church and RDF is not the earth

The discussion continues on using RDF/XML for the new Pie/Echo/Atom syndication feed, in Sam’s comments and in the email list. I even had a very fun time in the echo IRC yesterday, though I’m not a particularly adept IRC person.

(I did find out about the use of /me, and went crazy using it as a result.)

I’m glad these conversations are happening now. I would like to work with the Pie/Echo/Atom folks as much as possible promoting the idea of using RDF/XML for the syndication feed, but not at the expense of hiding what this means for the feed in the long run. I do have interests in showing how using RDF/XML can be helpful, beneficial, and not that complicated; but I have no interest in sneaking it in through the backdoor.

In the first chapter of Practical RDF, I wrote:

RDF is a wonderful technology, and I’ll be at the front in its parade of fans. However, I don’t consider it a replacement for other technologies, and I don’t consider its use appropriate in all circumstances. Just because data is on the Web, or accessed via the Web, doesn’t mean it has to be organized with RDF. Forcing RDF into uses that don’t realize its potential will only result in a general push back against RDF in its entirety—including push back in uses in which RDF positively shines.

RDF and RDF/XML aren’t for every person or every project. The most I can do is gently work with those reluctant in its use, suggest it where appropriate, demonstrate it here and elsewhere, and be philosophical if its use is rejected.

The editor for Practical RDF is my friend Simon St. Laurent, a person whom I admire and greatly respect. He was the perfect editor not only because he’s adept and skilled, and a great writer in his own right, but also because he is not an obsessive fan of RDF. Neither one of us wanted Practical RDF to be a ‘fan book’. Both of us realize the problems associated with the perception of the specification, and more specifically the constraints of the markup.

Simon recently wrote a rant, as he styled it, on RDF/XML. I link to it here not to chastise or disagree, but because I found it to be well written and concise about where the pushback against RDF is arising.

I think the reason why I don’t have as much of a problem with RDF/XML as others do is that I’ve been working with RDF/XML about as long as I’ve been working with plain XML. To me, there is no problem with the syntax because I’m so comfortable with it, pure and simple. I need to be reminded that others are less so, and I’m grateful when they write their reasons, clearly and bluntly.

Bray and Symbols and Grounding

Tim Bray on the namespace fooflah that’s been happening:

Right now, in the context of the Pie/Echo/Atom/whatever project, people assert that crystallizing the meaning of embedded namespaces is the key to interoperability, the central problem, and so on. Huh? When someone proposes markup from another namespace for inclusion in a syndication feed, there are three possible outcomes:

Nobody pays attention and it isn’t much adopted.

It gets widely adopted, with semantics along the lines originally proposed.

It gets widely adopted, with some semantic drift away from the original proposal becoming evident in the implementations. (Note that this has already happened with some RSS 2.0 markup).

Oddly enough, this is exactly what will happen with proposed tags and attributes that aren’t in a different namespace.

I agree with Tim when he summarizes his essay with “…we shouldn’t try to kid ourselves that meaning is inherent in those pointy brackets, and we really shouldn’t pretend that namespaces make a damn bit of difference”. There is no ‘meaning’ behind markup, there is no ‘meaning’ behind namespaces.

But there are behavioral assumptions associated with both – behavior that can be programmed into both producers and consumers of the markup. In the recent discussions about namespaces, as per Jon Udell, the programmatic behavior and assumptions might be getting a bit blurred with talk about the meaning of it all, but within the Pie/Echo/Atom world, the discussion of namespaces is concrete: what signals a change, what works, what doesn’t, and what should be ignored.

For me, namespaces say:

I mark things that belong in a specific schema. This schema isn’t an extension to diddly squat – it can live on its own, thank you. If it didn’t, we’d have these pseudo-schemas floating around because the originators of Important Schema didn’t take the time to do their job right in the beginning. The opposite of analysis paralysis is … broken bits of schema floating around, desperately holding on to Big Brother, hoping to be acknowledged as part of the family and not the bastard add-on that crept in after darkness.

If you see a name like one of mine somewhere else that has a different namespace, this means that the two things aren’t the same. How they differ is up to the organic side of this relationship to figure out. I personally don’t care. Because all I do is mark things.

Come to think of it, there is a lot of ‘meaning’ in my understanding of namespaces, isn’t there?
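
The marking behavior described above, where the same name in a different namespace is a different thing, is something a parser enforces mechanically. Here is a small sketch using Python’s standard `xml.etree.ElementTree` module; the two namespace URIs and the feed content are invented for the example.

```python
import xml.etree.ElementTree as ET

# Two elements share the local name "title" but live in different
# (hypothetical) namespaces.
doc = """<feed xmlns:a="http://example.org/ns/a#"
      xmlns:b="http://example.org/ns/b#">
  <a:title>First</a:title>
  <b:title>Second</b:title>
</feed>"""

root = ET.fromstring(doc)

# ElementTree expands each prefix to its URI, so a tag looks like
# "{namespace-uri}localname".
tags = [child.tag for child in root]


def local_name(tag):
    """Strip the "{uri}" part and return only the local name."""
    return tag.rsplit("}", 1)[-1]
```

The two entries in `tags` compare as different even though `local_name` returns the same string for both. What the difference means is left to the people reading the markup.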