Categories
RDF

Mixing Vanilla XML with RDF/XML

Recovered from the Wayback Machine.

What would it be like to add the ability to create RDF/XML “sub-trees” within a plain vanilla XML document? It would be like the following:

xoxoxoxoxoxxoxoxox xoxoxoxo xxoxoxooxoxxo foaf:knows xoxoxoxox xox xoxoxoxox oxoxoxoxox xxoxoxoxoxoxxo rss:item xoxoxoxoxoxox xoxoxoxo xxoxoxoxox xoxo xxxxoxo xxox x foaf:lastname xoxoxoxo xoxoxoxoxox oxxx oxoxox oxoxoxox xoxoxox postcon:reason xoxoxoxox xxxxoxoxo xoxoxox job:title xoxoxox xoxoxoxoxxx xoxo xxxxxxxxxxxx xooxoxox ooooooo xoxoxoxooxx oxxoxo xoxoxo xooxoxox xoxoxooo oooxoxoxo oxoxoxo xoxoxoxo oxoxox ooooooxoxox cc:license
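One way to picture the idea, as a rough sketch rather than anything this post specifies: a plain XML document that carries an `rdf:RDF` island, and a few lines of Python that pull out just the islands and leave the vanilla XML alone. The document shape, the element names `page`, `body`, and `para`, and the function name `rdf_islands` are all invented for illustration; only the RDF and FOAF namespace URIs are real.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical plain XML document with one embedded RDF/XML sub-tree.
DOC = """<page xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <title>xoxoxo</title>
  <body>
    <para>xoxoxoxo</para>
    <rdf:RDF>
      <foaf:Person>
        <foaf:name>Example Person</foaf:name>
      </foaf:Person>
    </rdf:RDF>
    <para>oxoxoxox</para>
  </body>
</page>"""


def rdf_islands(xml_text):
    """Return every rdf:RDF element found anywhere in the document."""
    root = ET.fromstring(xml_text)
    return list(root.iter("{%s}RDF" % RDF_NS))


islands = rdf_islands(DOC)
for island in islands:
    # Each island could, in principle, be handed to an RDF/XML parser on its own.
    print([child.tag for child in island])
```

The sketch only locates the sub-trees. Whether what sits inside them is valid RDF/XML is a separate question that a real RDF parser would have to answer.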

Categories
RDF

You have your peanut butter in my chocolate!

Recovered from the Wayback Machine.

Jon Udell has been exploring the concept of mixing, in his words, RDF-isms with RSS 2.0, which is a non-RDF, single use XML vocabulary.

First, important note – when RDF people talk about RSS, they usually mean RSS 1.0, which is an RDF-enabled vocabulary.

Second important note – RDF puts certain constraints and requirements on an XML document for it to be valid RDF/XML, and to be usable within RDF applications and APIs.

The question began with namespaces, with Dan Brickley pointing out one of the major strengths of RDF – if a vocabulary is RDF compliant, then its namespace will work within other RDF-compliant vocabularies. As Dan wrote in comments in the original thread:

In the RSS1 design (via the love-it-or-loathe-it RDF approach) we had a more loosly coupled, de-centralised design: a namespace worked with RSS1 if it worked with RDF. If someone created an RDF vocab for Jobs, then it worked with RSS1. If someone else creates an RDF vocab for locations (to talk in more detail about where the jobs are), then that too just worked. Same goes for a skills vocab (there’ll likely be several). Or person-descriptions (not just FOAF, vCard but the several others people have created to qualify those).

Jon took this tidbit and stretched it – waaaay out of shape – into a discussion about using RDF in conjunction with RSS 2.0. He comes up with a sample feed. It validates with the RSS validator. Fine and good – but it’s not going to validate as RDF, which provides those aforementioned rules and criteria and constraints that keep any old XML from being used as RDF and violating the RDF model that forms the basis of RDF/XML.

There is a mathematical model we all use that says when we add 2 plus 2, we get 4. In addition, within this model a value of ‘3’ is greater than a value of ‘2’. You can create your own mathematical model, and in it, 2 plus 2 could equal 5, or 3, or 6 if you want, and a value of ‘2’ is greater than ‘3’ – but it won’t work with the existing model.

You could take your model and document it nicely and say, “Please use it. It’s better and more simple”, and you might be able to get people to use it – but then they would never be on time for trains, and they would never have correct change, and they would most likely be in trouble with the income tax folks.

The point is that the numbers, the operators, and even the syntax that we use within our mathematical system aren’t what’s important – it’s the model, not the pictures, that counts.

Jon today states:

Actually, I’m not saying that I want to put RDF into RSS. I’m trying to ask and answer two questions: 1) Is it feasible? and 2) What benefits would it confer?

Jon, in a nutshell: 1) no, 2) many, but it would require that you abandon the single-use RSS 2.0 architecture in favor of a more universally defined architecture, which is the aforementioned RDF/XML.

Lots of new RDF/XML vocabularies coming down the road, Jon, and not just the new RSSJobs which is this week’s hot spot. (Is this RSS 1.0 or RSS 2.0 – can’t find info on this.) Many excellent RDF/XML vocabularies, all of which will not work with RSS 2.0 because RSS 2.0 is a single-use, single-purpose simple syndication feed that doesn’t care whether it mixes with other vocabularies. That’s cool. That’s your choice.

update: 

Did find a comment thread where Dan talks about mixing RDF namespaces in with regular XML.

Dan’s a patient man. He wants to allow the XML freeforall world to benefit from RDF/XML. I admire him for this. I disagree with even attempting to do this.

What’s the point of the model, the rigor, the challenges associated with RDF/XML, if we’re going to say, “Oh, well, we don’t want to force you to use our model. Don’t worry your pretty little heads about it – just go ahead and just take what you want, we don’t mind.”

It’s the same as saying use your mathematical model, we’ll learn it so that we can give you the proper change when you buy that beer you so obviously need for doing something so daft.

You want to combine plain vanilla XML and RDF/XML? Fine. Use XSLT. Or a sledgehammer. Or drugs.

Categories
Semantics

FOAF, Flocking, and the Semantics of Starlings

Recovered from the Wayback Machine.

This weekend I played a bit more with the attachment that allows me to take photos of slides with my digital camera. The ones shown here I took years ago when I lived in Portland, Oregon. The subject is a flock of European Starlings at sunset, just after a storm.

Every year our apartment complex in Portland would be overrun with flocks of starlings that swooped and swirled about, covering the trees and darkening the sky, or whatever part of the sky wasn’t already darkened by the rain clouds that were a part of our life in Portland. Pretty as they may seem from the photos, the starlings were a pest: a species that didn’t belong in the area, and one that would take food and habitat away from native birds. Their waste was corrosive to cars, and damaging to buildings and streets; additionally, the birds are known to carry and spread disease.

swallows1.jpg

The apartment would bring in a bird specialist who had this explosive air cannon, which he would shoot at the trees to scare the birds off. (Rather unnerving for tenants in addition to birds.) The starlings would leave for a time, but they always managed to find their way back; they are nothing if not tenacious.

Starlings are flockers, following lead birds almost obsessively, and it was fascinating to watch as one flock of starlings would meet head-on with another flock: thousands of birds racing towards each other in what you would expect to be a collision, but which would coalesce into this wonderful ballet of birds flying over and around each other, literally riding the wake in the air each other caused.

What do starlings and their behavior have to do with the Semantic Web? Only that I was reminded of this ‘head-on’ behavior this weekend when I was quietly reading the various entries out at the W3C’s TAG (Technical Architecture Group) email list, all about URIs and resources, and what it all means…and doesn’t mean. The discussions spilled out into weblogging when Tim Bray wrote a posting titled On Resources. Focus on the web of now, Tim says, and document what we have now:

So, explaining the Web-as-it-is would be enough to make me happy. Clearly, we should have an eye to the future, and, in writing down the architecture, try to avoid making life difficult for any others who are working to make something new and important involving the Web. Obvious examples are the Semantic-Web and Web-Services efforts.

But at the end of the day, the success criterion for me is having the success criteria for the Web-as-it-is explained clearly and convincingly.

In other words, the focus of TAG should be on what exists now, not what might exist some day. As for resources and their identifiers — those pesky little devils — Tim wrote:

We could just not talk about resources in the Architecture document. That wouldn’t get in the way of any software that I know of. But I suspect that this would impair the document’s usefulness as people paged frantically back and forth trying to figure out what URIs identify. Perhaps there’s a middle ground, where we say that the nature of resources is outside the scope of this document, aside from the fact that they are what is named by URIs.

Tim Berners-Lee wasn’t particularly happy with Tim Bray’s essay. In the TAG email list he wrote:

You say that the TAG should concentrate on the web as it has been
before the semantic web and web services, and that you will be happy if
the architecture works for that, even if it does not work for web
services and semantic web.

That is a pity, partly because the web is no good unless it can be a
sound foundation for the semantic web and web services too. WSDL (ed. Web Services Description Language) and RDF (ed. Resource Description Framework) have real serious issues on the table, working groups which need a consistent framework.

At first glance, it seems as if the two Tims were at opposite ends of a circle — the web of the now versus the web of the future. Mathematically defining resources as compared to basically ignoring the concept as one that can’t be effectively defined. One could then assume that their opinions cancel each other out, leaving us a big fat zero in understanding. On the contrary: like the two flocks of starlings converging together from opposite directions — resulting in a thing of great beauty and great destructiveness — Tim B and TimBL have articulated the dichotomy behind the debate of what is a ‘resource’, and how is it identified within the Semantic Web (as introduced earlier this week). But that’s a dry summation — what they’ve really done is articulate the challenges of the Semantic Web:

To be a Semantic Web, it must be mechanical, and therefore precise, mathematical, and ultimately unambiguous. But to be a Semantic Web, it must also encapsulate meaning, context, and embrace ambiguity. Ignore the discontinuities, embrace the discontinuities.

swallows2.jpg

What does this all mean? If a resource is defined to be anything, including something abstract, then how can it have an identifier on the web, in the form of a URI? But if a resource within the context of the Semantic Web is defined to be something on the web, then how can it not have a URI? If we limit resources to things on the web, how can we identify things as disparate as a person, a galaxy, and an abstraction such as a metaphor in a poem? And how can one global set of URIs work for all items, at all granularities?

If a resource is a representation of something, and one that exists on the web, then software can be designed with an assumption that if you access the URI, something is returned. But can all ‘resources’ of interest within the Semantic Web be represented with something on the web, and identified by a URI? What about peace — can it be constrained within a representation? It’s hard enough identifying it in ‘real life’, how would it be represented on the web? How about you and I? Can we be represented on the web?

Questions! So many questions. And topmost in your mind might be: Why should I care?

Frankly, I’m not sure you should care about this debate and the Semantic Web — you cannot eat it, sleep with it, or use it to rear your young. However, you might care because what’s being discussed is the scope of what will be a part of the architecture of the Semantic Web. If the Internet and the Web, and all of its simple hyperlinkness has invaded your life to a degree now, how much more so will it if it becomes richer, more complex, and more meaningful?

I personally care about this debate because I want to make sure my metaphor, my syllogism, and my analogy are represented effectively or my own Turing Test for the Semantic Web will never come about. I don’t want these abstract concepts to be discarded because they can’t be mathematically defined.

“What we need to understand may only be expressible in a language that we do not know.”

Anthony Judge

That would be a pity.

swallows4.jpg

In the title, I introduced FOAF, and you might be wondering where this simple RDF-based vocabulary fits into this grand debate. I could wish that the membership of the W3C wasn’t so averse to webloggers — our seeming arrogance and assumptions of our importance on the web, and our messiness — because the issues the TAG members are discussing are related to what’s happening with FOAF among the weblogging community. It is a microcosm of the Semantic Web, with its rich possibilities and its many ambiguities and misunderstandings.

To return to FOAF: FOAF represents both people and relationships, the former being concrete but difficult to physically put on the web, the latter being an abstract concept.

Me. The representation of “me” in this context is that which is described in a FOAF file. I am identified primarily by a hash of my email address — in this case, in this microcosm, I am known as:

cd2b130288f7c417b7321fb51d240d570c520720

You may call me “2b” or “Bb” for short.

In addition, my current FOAF file has a property defined within it — knows — and the object of this property is another person — Simon St. Laurent. In this file, I say, “I know Simon St. Laurent”, and to identify Simon within whatever FOAF system might exist, I use the hash of his email address:

65d7213063e1836b1581de81793bfcb9ad596974

I suppose you could call Simon “e183” for short.
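For readers curious where such a hash comes from: FOAF's `mbox_sha1sum` is the SHA-1 digest, written as 40 hexadecimal characters, of the person's mailbox URI including the `mailto:` prefix. A minimal sketch in Python follows; the function name and the placeholder address are illustrative assumptions, not anything taken from the FOAF files discussed here.

```python
import hashlib


def mbox_sha1sum(email):
    """SHA-1 hex digest of the mailto: URI, as used for foaf:mbox_sha1sum."""
    uri = "mailto:" + email.strip()
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


# Placeholder address for illustration only.
digest = mbox_sha1sum("someone@example.org")
print(digest)       # 40 hexadecimal characters
print(len(digest))
```

Because the hash is one-way, the file identifies a mailbox without publishing the address itself, which is a large part of why FOAF files lean on it.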

Both Tims should be unhappy with my FOAF file, I would think, following from their arguments described earlier. For instance, there is a resource with a representation on the web (myself), but there is no URI for it. Not only that, we’re not completely sure what the resource is, yet we’re not ignoring it, either.

Within a Semantic Web of moving parts and grinding bits, FOAF doesn’t fit.

Tim Bray took on an action item recently to draft language surrounding information resources as compared to resources. As he wrote in another TAG email:

Many existing Web servers and clients (for example web browsers) do not have any notion of what the Resource identified by a URI is. However, humans and Semantic Web software are strongly concerned with this issue. Some resources are perceived as falling into a class called “Information Resources”. That is to say, they are on-line units of electronic information or service. Examples would include a photograph, a news story, and a weather forecast for Oaxaca. Other resources named by URIs may exist entirely apart from the Web. Examples include an edition of some book identified by urn:isbn:0-395-36341-1, a person identified in an RDF assertion using http://example.com/foaf#Dan, and an XML namespace such as http://www.w3.org/1999/02/22-rdf-syntax-ns#. The Web may be used to obtain representations of both kinds of resources.

What Tim is saying is that either a resource exists on the web, or a representation of it exists on the web. If the URI has an associated protocol, as the FOAF identifier in Tim’s example does, then a representation is accessible on the web even if the resource itself isn’t.

Or is it?

Not one FOAF file I have seen uses a URI to represent the person. They either don’t use anything, or they use what is known as a blank node identifier, which is only relevant within the file. However, the lack of a URI hasn’t hurt FOAF, because each person within the context of FOAF is identified through one or both of two alternative keys: the mbox_sha1sum, which is the hashed representation of the person’s email address, and the URL (URI) of the person’s FOAF file, which for me is http://burningbird.net/foaf.rdf.
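To make the workaround concrete, here is a small sketch of how a tool might treat descriptions from different files as the same person when they share an mbox_sha1sum, even though no file gives the person a URI. The dictionary layout, the `source` and `nick` keys, the example.org URLs, and the function name `merge_people` are assumptions made up for this illustration; the two hashes are the ones quoted earlier in this post.

```python
def merge_people(descriptions):
    """Group property dictionaries that share the same mbox_sha1sum."""
    merged = {}
    for desc in descriptions:
        key = desc.get("mbox_sha1sum")
        if key is None:
            continue  # nothing to match on; a real tool might try other keys
        person = merged.setdefault(key, {})
        for prop, value in desc.items():
            # Collect every value seen for a property, across all files.
            person.setdefault(prop, set()).add(value)
    return merged


descriptions = [
    {"mbox_sha1sum": "cd2b130288f7c417b7321fb51d240d570c520720",
     "source": "http://example.org/a.rdf", "nick": "Bb"},
    {"mbox_sha1sum": "cd2b130288f7c417b7321fb51d240d570c520720",
     "source": "http://example.org/b.rdf"},
    {"mbox_sha1sum": "65d7213063e1836b1581de81793bfcb9ad596974",
     "source": "http://example.org/a.rdf"},
]

people = merge_people(descriptions)
for key, props in people.items():
    print(key, sorted(props["source"]))
```

The merge works only because everyone agrees, informally, that the hash stands in for the person. Nothing in the data says what kind of thing is actually being identified.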

Neither key is officially a URI of my representation within the context of either the existing or the semantic web. We have simply worked around the debate and issue of how one can identify a representation of ourselves on the Net by not using an identifier. We could use an analogy of parents arguing about proper diet, while we hungry children raid the fridge and eat all the pie. We could, but there’s no URI for analogies, either, and therefore they must not have a proper place in a proper discussion of the proper Semantic Web.

Then there’s the issue of what’s being identified: is the person the resource? Or is the FOAF information the resource, and I’m defined by many such? Additionally, FOAF files also denote other resources (or what we’re assuming are resources because they are, after all, defined within the Resource Description Framework), and they’re parameterized, if that’s the right word, by using the RDF property ‘knows’. However, we don’t have a good understanding of what it is we’re defining with ‘knows’. Is it denoting a relationship? Or is it nothing more than an acknowledgment that I literally know who Simon St. Laurent is? Is Simon a friend, because he’s in a FOAF file? If so, then what would happen if I weren’t in Simon’s FOAF file?

If I were to remove Simon from my FOAF file, am I disavowing the friendship? Or am I ‘pretending’ that I don’t ‘know’ Simon? With FOAF, we not only assert truth, we assert a lie, because I know Simon, and his being in my FOAF file, or not, doesn’t change this. If I don’t list him, am I lying by omission? What does it mean to be in one of these files? What does it mean when you are not?

Will you ‘feel’ it, when you’re not?

The fact is that FOAF is being used as a representation of something; we’re making assertions about something, but we’re not sure of what. Whatever it is, though, it’s loaded with connotations.

Within FOAF we’re representing information about ourselves, but it’s not us — too flat, too two-dimensional to be a representation of us. Additionally, we’re representing relationships with other people, but we’re each bringing our own interpretations of these relationships along for the ride. In other words, we’re making assertions of relationships and attaching social context to them.

In the RDF Concepts working draft, there was a section that discussed the social context of assertions. It is the one and only section of all the RDF documents that brings up the issue of social context about RDF statements. The only one. And of course, it is this section that the Semantic Web Architecture recommended be struck.

Why the cut? From the meeting minutes where the recommendation arose, it would seem to come back to our old debate of URI and context. There was too much confusion about what was meant by ‘identify’, by URI, by resource. As Dan Brickley said in IRC notes:

21:10:54 [bwmscribe]
authoritive definition of URI’s: i.e. who gets to say what a URI denotes
21:12:48 [danbri]
something like “RDF graphs have propositional content. Their meaning is fixed by a bunch of hairy stuff only partly understood and documented (eg. implicit theory of reference associated with URIs). Minor health warning. The End.”

But that doesn’t stop the confusion — ignoring the concept of ‘resource’, postponing the issue of identity, and ignoring social context because it’s too hard to define, won’t prevent problems when people act to fill the void that’s left. As Kendall Clark wrote:

This way of carrying on the social meaning debate was unlikely to lead to a satisfactory resolution, since it was possible to strike the problematic language without solving or addressing the substantive issues which animate the debate in the first place.

swallows5.jpg

Consider FOAF files again: Marc Canter and Eric Sigler are working on this thing that Marc is calling a “PeopleAggregator”. From bits and pieces I’ve picked up at their weblogs, in emails, and in comments elsewhere, this application will be able to create and consume and maintain FOAF files as well as networks of interlinked people who ‘know’ each other, as defined in these files. More, if someone within the network designates you a ‘friend’ in their FOAF file, the PeopleAggregator sends you an email asking for some form of confirmation.

(Again, this is based on casual discussion in comments and may be incorrect in whole or part.)

Rather than the network of friends being maintained behind walls à la Friendster, it’s out in the open with decentralized FOAF files that anyone can read. Now, what will become the social context of the relationships denoted as resources within these FOAF files? And what can be the social consequences of same?

Personally, I expect the first “Technorati of FOAF popularity” before the year is out. I wonder, what crown will we give to the man and woman voted most popular? Prom king and queen? I also wonder, how soon will we get emails saying, “Please remove me from your FOAF file; you don’t really know me”? How soon will we get emails saying, “Why am I not in your FOAF file?”

If you doubt this, then look no further for proof than the plain, ordinary, unsemantic hypertext links that form our blogrolls. Remember public delinking, and how in the past this has been used as a measure of censorship, and as a form of punishment and control? I’ve been delinked, publicly and privately, from friend and foe, and believe me when I say there is more to this than a simple hypertext link, and the removal thereof.

Remember also the discussions of the power that these links provide within this communication medium because — as Clay Shirky has demonstrated with his power laws — those with a disproportionate share of weblogging links also have a disproportionate share of attention, and even respect?

Power and pain, reward and punishment, all encapsulated in a simple hypertext link, in a simple blogroll — what can happen within the socially explosive context of FOAF?

Both Tims might say that the FOAF example isn’t relevant — weblogging is its own problem and isn’t really representative of the web as a whole. After all, there are billions of pages on the web, and only about a half million webloggers, if that.

But webloggers are becoming the Semantic Web lab rats — through our curiosity and our interest, we’re the first to test these Semantic Web tools outside of labs and universities. We’re the ones that propagate the data and the technologies. When faced with confusion, we’ll wing it. We did so with RSS 1.0, we’re doing so with Pie/Echo/Atom and now we’re continuing the trend with FOAF.

FOAF is becoming the bastard child that grew from the seeds that fell between the cracks of W3C debates or were discarded with all the other messy ‘touchy-feely’ stuff, such as social context surrounding URIs. It’s the wolf child tempered in the pack, surviving on an existence of “keep what works, throw out the rest”. One can’t blame it, then, if it, and we, don’t behave properly when invited to the Semantic Web tea.

And the more I look at these photos the more I think some are upside down. I can’t really tell for sure, the slides aren’t properly marked — but the images are pretty and my representing them upside down on the web doesn’t stop the birds from flying.

swallows3.jpg

Categories
RDF

FOAF:knows a clarification

Recovered from the Wayback Machine.

Dan Brickley just came out with an explanation of why there’s a foaf:knows but not a foaf:friend. The better explanation occurs in the comments:

Because the concepts of ‘knowing’, ‘knowing well’, ‘friend’ etc. are both slippery and because people vary (personality, use of language etc.) in how they’re comfortable using those concepts, you get into situations such as X’s foaf file says that X has friends Y, and Z whereas Y’s foaf says X is ‘just’ a knows or knowsWell (knowsWell being particularly awkward as it suggests significant familiarity without affection, ie. no “would like to know better” wiggleroom). Z’s foaf might list neither as friends, and risk being taken (despite ommission not implying negation in RDF or FOAF) as suggesting that Z doesn’t consider either X or Y to be friends. Although Z might protest that the absence of a claim from a FOAF file is consistent with it still being true, X and Y could fairly counter-protest that Z could have made the effort to mention them since they made the effort to mention him/her. And so on…

You see similar economies of expected reciprocation in closed-world systems like Friendster or LinkedIn, especially where they offer endorsement and commenting facilities. Not something to blunder into with FOAF without some careful thought, so we retreated to the safer ground of ‘foaf:knows’.

Glad, am I, that Dan came out with this.
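Dan's X, Y, and Z scenario can be sketched in a few lines. The set of pairs below stands in for foaf:knows statements gathered from several files; the representation and the function names `unreciprocated` and `describe` are assumptions for illustration only. The thing to notice is that the code can report a statement as missing, but never as false, since omission doesn't imply negation.

```python
# (subject, object) pairs standing for foaf:knows statements
# collected from the hypothetical files of X, Y, and Z.
knows = {("X", "Y"), ("X", "Z"), ("Y", "X")}


def unreciprocated(statements):
    """Statements 'a knows b' for which no file says 'b knows a'."""
    return sorted((a, b) for (a, b) in statements if (b, a) not in statements)


def describe(a, b, statements):
    """Open-world reading: a claim is either stated or simply not stated."""
    return "stated" if (a, b) in statements else "not stated"


print(unreciprocated(knows))
print(describe("Z", "X", knows))
```

A popularity ranking or a confirmation email built on top of `unreciprocated` is exactly the kind of social reading Dan is wary of. The data supports "not stated" and nothing stronger.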

Categories
RDF Writing

RDF: Ready for Prime Time

Originally published at O’Reilly, and recovered from the Wayback Machine.

Not long ago, Marc Canter, one of the early founders of Macromedia, talked about RDF and the Semantic Web in his weblog. Specifically, he wrote:

“I’ve been spending more and more time trying to grok the RDF folks. I have to say I like what I see and hear, but what I don’t see are many apps and services actually up and running and working.

We have a saying over here: “put up or shut up.” I’m still looking for two different RDF apps or services to work together in some meaningful way. Then bring on the books.”

Considering that I’m “bringing on a book” on RDF this month, I thought it appropriate to answer Marc’s plea for meaningful, working examples of RDF apps and services, especially those that work with other RDF-based services. My problem, though, is that I have only a limited amount of time and space in this article; I can only cover a few of them. Best to just start, then; but first, a little digression into RDF and XML.

RDF/XML: The Syntax That Could

You probably know that RDF has both a defined model and a preferred serialization, RDF/XML. In many ways there’s been far less criticism of RDF than there has been of the RDF/XML syntax. Tim Bray, one of the creators of XML, has said:

“Speaking only for myself, I have never actually managed to write down a chunk of RDF/XML correctly, even when I had the triples laid out quite clearly in my head. Furthermore, once again speaking for myself, I find most existing RDF/XML entirely unreadable. And I think I understand the theory reasonably well.”

Tim even went so far as to offer his own version of RDF/XML, which he called RPV.

I’ve found that the more a person works with markup such as XML, the more they dislike RDF/XML. I’ve also found that no matter the alternative proposed, someone else will dislike it just as much, which makes RDF/XML a bit of a “damned if you do, damned if you don’t” proposition.

Ultimately, if RDF is ready for prime time, then so is RDF/XML. Regardless of our views of it, it’s official, it’s real, and it’s here now. So on to the RDF applications, starting with the basics: the APIs.

RDF APIs

For every programming language you’re interested in, there’s most likely an RDF API and a library implementing it. If you’re interested in Java, one of the most popular RDF libraries is Jena, from HP’s Semantic Web Research Lab. The current version of Jena is 1.6.1, which is the one I’ve used, but there is a beta release of a new version (Jena2), and it’s the one you’ll most likely want to investigate. As you’ll see later, Jena is used for several utilities and applications.

For those interested in Python, the most popular RDF library — which also includes a triplestore with several different backends — is Daniel Krech’s RDFLib. Want something a little more unusual? Try Wilbur, a Common Lisp RDF library, written by Ora Lassila, one of the creators of RDF.

For those who work primarily with Microsoft development environments, there is a C# RDF Parser called Drive, which provides an API to parse RDF/XML into an in-memory RDF graph for manipulation. It’s fully compatible with the .NET platform, and it can also be used with the open source variant of .NET, Mono.

If Perl is more your thing, there’s Ginger Alliance’s PerlRDF, a library I’ve used in several small applications at my site. Other popular applications, such as Six Apart’s weblogging application, Movable Type, also use it. Six Apart extended the PerlRDF module by creating a new module, XML::FOAF, which enables autodiscovery and processing of FOAF files. FOAF, or Friend-of-a-Friend, is an RDF vocabulary for defining hierarchies of acquaintances and is now one of the most popular uses of RDF/XML.

If you want support for multiple programming languages as well as a more sophisticated framework and data persistence, you’ll want to check out Dave Beckett’s Redland. In addition to providing a persistent data store, as well as multiple language support (Python, Perl, Java, Tcl, and Ruby), Redland also provides support for an independent RDF parser called Raptor. Raptor has been used, independently, in other applications, including several FOAF apps, as well as RDF Gateway, a commercial product I’ll discuss later in this article.

RDF Vocabularies

FOAF is one of the more popular vocabularies of RDF/XML. A quick perusal of the FOAF web site will show dozens of uses of FOAF in tools ranging from a FOAFBot, created by Edd Dumbill and used to provide services within chat forums, to uses of FOAF in desktop tools within the OS X environment for managing contacts. My own FOAF file is at http://weblog.burningbird.net/foaf.rdf, and consists of pointers to friends I know online, though the list is incomplete.

The beauty of FOAF lies in its simple way of describing personal information, including our work and academic affiliations. The power of FOAF lies in its ability to list acquaintances who themselves may have FOAF files. Over time, this interlinked network can expand until it’s a simple matter of mapping out who is connected, directly or indirectly, to each other.
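As a rough illustration of that mapping, assume the acquaintance lists have already been read out of a set of FOAF files into a plain dictionary (the names and the dictionary shape here are invented for the example). A breadth-first search then gives the number of hops from one person to everyone reachable through chains of acquaintances.

```python
from collections import deque


def degrees_from(start, knows):
    """Breadth-first search: map each reachable person to hops from start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for friend in knows.get(person, ()):
            if friend not in distance:
                distance[friend] = distance[person] + 1
                queue.append(friend)
    return distance


# Hypothetical acquaintance lists, one entry per FOAF file.
knows = {
    "alice": ["bob"],
    "bob": ["carol", "alice"],
    "carol": ["dave"],
    "erin": ["alice"],
}

hops = degrees_from("alice", knows)
print(hops)
```

The links are followed in one direction only, so "erin", who lists "alice" but is listed by no one, never shows up in the result for "alice".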

Another RDF vocabulary in popular use is RSS 1.0. Webloggers and other online publications use RSS 1.0 to provide information about updates at their web sites, including the date of the update, the author, an excerpt of the material and so on.

A third RDF vocabulary is the RDF/XML used to describe Creative Commons licenses, a new way to provide more detailed information about use of copyrighted material.

All three vocabularies use, in one way or another, elements from the Dublin Core Metadata Initiative (DCMI), as defined in RDF/XML. However, these vocabularies aren’t the only ones available using RDF/XML. In fact, the W3C uses RDF/XML to define the underlying syntax for its own Web Ontology Language (OWL) effort. With RDF providing the underlying model, and OWL adding higher-level ontology support, it’s only a matter of time before a host of sophisticated, domain-specific ontologies spring up, all of them interoperable because of the underlying use of RDF/XML.

In fact, there’s a host of tools and utilities people can use right now to work with RDF/XML directly or with OWL.

Tools and Utilities to Work with RDF/XML

As much as I like RDF/XML, even I’ll admit that it requires time to understand and work with, and not everyone has either a desire or an inclination for this effort. Thankfully, there are plenty of tools available to allow people to manually create or read RDF/XML.

The most commonly used RDF utility is the RDF Validator, a tool to check your RDF/XML to ensure that it’s valid, as well as to generate different views of the model data. I find that when working with an API, I’ll use the Validator to validate my sample RDF/XML, view the model to ensure I’ve created the appropriate one, and then create the triples to use as a pattern with my RDF/XML API calls, in whatever language I’m coding.

Another handy utility for working with RDF/XML is the BrownSauce RDF Browser. This web application uses Jena. It can open an RDF/XML document and provide easily readable and hypertext-linked pages of the RDF data contained in the document. Best of all, the browser also opens any associated RDF Schema documents that provide information about the RDF elements themselves, through the relationships described in the schema, and through comments provided with the schema elements.

A long-time advocate of RDF and a friend of mine, Danny Ayers, has been busy at work on Ideagraph, a tool for visually mapping ideas and then generating RDF/XML from the results. In addition to this effort, the tool can also act as a RDF-based weblogging tool, as well as an RSS aggregator.

IsaViz is another popular visual-editing tool for creating, importing, and working with RDF documents in RDF/XML and in other serialization formats such as Notation 3 and N-Triples. It’s particularly useful when you’re creating a new RDF vocabulary and want to use a visual tool for this effort rather than trying to create the vocabulary in RDF/XML manually. However, I prefer to use the tool to work with existing RDF/XML documents, particularly larger ones, because it lets you zoom in on components of a model, create snapshots of particular paths, and query on specific elements. In particular, if you’re documenting an existing RDF/XML vocabulary, IsaViz can be useful for providing snapshots of particular instances of data.

Most of these tools are geared more for working directly with RDF/XML vocabularies. If you’re working with an ontology instead, then you must look at Protege, from Stanford University. This tool not only allows you to define an ontology through an easy-to-use interface, it also lets you create forms to capture the ontology data. Once the forms are defined, the tool can then be used to capture instances of data based on the ontology. Currently, an effort is underway to provide support for OWL files and for mapping between Protege’s own ontology language and the W3C language. Regardless, the data captured by Protege can be output in multiple formats, most particularly RDF/XML.

Peripheral RDF Support in Other Tools and Utilities

Of course, tools that focus purely on RDF, whether to create RDF or to consume RDF, are handy when you’re starting work with RDF–but what about RDF in the real world?

Probably one of the first uses of RDF/XML was by those involved in the Mozilla effort, which still uses RDF/XML for all of its automated Table of Contents data and processing. In fact, it was through my interest in the Mozilla development environment that I became exposed to RDF/XML (see www.mozilla.org/rdf/doc/).

If you’ve worked with Linux then you’re most likely familiar with RPM, a way of packaging Linux applications for easy installation. What you may not know is that RDF has been used with RPM to provide metadata about the package being installed. A utility created by Daniel Veillard, rpmfind, uses RDF to discover RPM installations on Rpmfind.Net, a database of RPM packages maintained by the W3C. Though the original creator of the product is no longer maintaining rpmfind directly, the source is now located at sources.redhat.com, and I’m still using rpmfind for my own server.

Earlier I mentioned Movable Type and its use of RDF for autodiscovery of FOAF files. The application also uses RDF/XML to annotate weblog entries with trackback information, which can be used to document links from one weblog to another and provide reverse link information. This same functionality has been isolated for use by other tools, weblogging or otherwise.

Spring, a Mac OS X desktop tool created by Robb Beal, provides support for dragging and dropping FOAF files. Find an FOAF link in a web page? Click on it and drag it to Spring to automatically import the FOAF contents into the tool.

As ubiquitous as RDF is becoming, creeping its way into a favorite tool or utility near you, the power of the RDF model’s inferential capability is particularly apparent when you look at some of the larger applications that are being built on RDF.

Larger Applications

People at MIT are working on an application, called DSpace, which will maintain a digital repository of information. The application is geared to any larger organization, such as a college or university, that wants to maintain a searchable index of publications from its members. DSpace is a freely available, open source application that makes use of an ontology, Harmony/ABC, and RDF to maintain the historical subsystem.

RDF Gateway is a Semantic Web application server that uses RDF as the core of all of its services. With the Gateway, you get access to a persistent data store that can be queried using an inferential engine that goes beyond normal SQL-like queries. Included with the application is support for server-side scripting similar in nature to both ASP (Active Server Pages) and JSP (JavaServer Pages, the Java counterpart).

Siderean Software’s Seamark is another commercial application that makes use of RDF and a persistent data source, but Seamark focuses primarily on site navigation. Plugged In Software’s Tucana Knowledge Store provides sophisticated knowledge-based querying of large stores of data, again based on RDF.

These companies are just the first to start looking at RDF and the RDF data model for use in large-scale, sophisticated applications. And then there’s the Semantic Web.

The Semantic Web

It’s funny in a way, but I can sit down and rattle off a dozen uses for the RDF data model and the associated RDF/XML without once mentioning its primary purpose, which is to provide support for the Semantic Web efforts. All uses of RDF for any purpose are good because they increase our familiarity with the specification as well as the syntax. In addition, applications that increase the level of RDF/XML out on the web add to the pool of accessible data on which we are slowly building the Semantic Web. Through the use of RDF, we know that all of the vocabularies are compatible.

Beyond the good and practical uses of RDF I’ve described earlier in the article, the RDF model and its associated syntax, unlike XML or HTML or XHTML, bring with them the ability to define statements about data, rather than to just record pieces of data. Add to this the use of OWL, and we begin to have the ability to mine for knowledge, not just words.

Consider poetry. My favorite poem is Walt Whitman’s “Song of the Open Road”, with its friendly and positive imagery of life as an adventure, a road to follow with glee. In fact, the use of “road” as a metaphor for life and life’s journey is quite common in poetry. (See an excellent article, “Poetry of the Open Road.”) However, it’s the very use of imagery and metaphor in poetry that defeats traditional web discovery techniques.

Currently we have the ability to use keyword searches within search engines such as Google, and with this we can find poems that mention the word “road”. This is all well and good, but in the future, as the use of RDF and RDF/XML expands, we’ll be able to do searches that not only provide links to poems that have used “road”, but also know which poems use the word as a metaphor for life, which have used it metaphorically to describe freedom, and which are just talking about roads as roads.
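The difference between the two kinds of search can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the poems, their example.org identifiers, and the usesRoadAs property are all hypothetical, and no real vocabulary or search engine is implied. The point is that a keyword match sees only the word, while a query over statements can tell the uses apart.

```python
EX = "http://example.org/poetry#"  # hypothetical namespace, for illustration only

# Made-up poems: the identifiers and lines are placeholders, not real data.
POEM_TEXT = {
    "http://example.org/poems/1": "the road ahead is every year I have left",
    "http://example.org/poems/2": "no fence, no gate, only the road and the wind",
    "http://example.org/poems/3": "the road to town was paved last spring",
}

# Statements about the poems, as (subject, predicate, object) triples.
STATEMENTS = [
    ("http://example.org/poems/1", EX + "usesRoadAs", EX + "MetaphorForLife"),
    ("http://example.org/poems/2", EX + "usesRoadAs", EX + "MetaphorForFreedom"),
    ("http://example.org/poems/3", EX + "usesRoadAs", EX + "LiteralRoad"),
]


def keyword_search(word):
    """Return every poem whose text contains the word, whatever it means there."""
    return sorted(uri for uri, text in POEM_TEXT.items() if word in text.lower())


def statement_search(predicate, obj):
    """Return every poem that is the subject of a matching statement."""
    return sorted(s for s, p, o in STATEMENTS if p == predicate and o == obj)


KEYWORD_HITS = keyword_search("road")
LIFE_HITS = statement_search(EX + "usesRoadAs", EX + "MetaphorForLife")
print(len(KEYWORD_HITS), "keyword hits;", len(LIFE_HITS), "metaphor-for-life hit")
```

The keyword search returns all three poems, while the statement query returns only the one that has been described as using the road as a metaphor for life.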

Eventually as RDF insinuates itself throughout the web, as it has already started, you’ll be able to search on “road” and “poem” and “metaphor for life” and not get this article back as a result. As much as I like the thought of people reading this article, that search result will be a good thing because this article is not about poems, metaphors, and life. It’s about RDF and how it is now more than ready for prime time.