Categories
RDF

Newest RDF goodies and challenges

Recovered from the Wayback Machine.

I spent the last several days reading through the six RDF documents currently under final review. During the last few days I acted the minor irritant to some members of the W3C RDF Working Group, primarily getting clarification on some confusing or complex aspects of the documents. I also spent time trying to bring concerns expressed by fellow webloggers to the attention of the group members — trying to bring the viewpoint of non-RDF folks to the RDF table. I find myself in the interesting position of being an RDF supporter who doesn’t necessarily support all aspects of RDF. Which means I can’t claim kinship with any ‘side’ in the RDF debate.

Here are some of the items I couldn’t cover in the article (due to space considerations), and their possible impacts:

Containers: Containers such as Bag, Seq, and Alt are still included, but without additional semantics attached. What does this mean? It means that a container is a grouping of resources, but the RDF specification attaches no additional assumptions about how container elements relate to each other, or how applications process the data. As far as the specification is concerned, elements in a Bag are treated no differently than elements in a Seq. It’s up to each individual application to determine what ‘Bag’ or ‘Seq’ means.

Sound familiar? Can we all say “HTML”?
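To make the container point concrete, here is a minimal Python sketch (my own illustration, not anything from the Working Group documents) of how a container flattens into triples. The function name, the blank node label, and the example.org URIs are assumptions made for the example. The only thing that separates a Seq from a Bag is the type triple; the numbered membership triples are the same.

```python
# Sketch: how a container flattens into triples (all names are illustrative).
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def container_triples(node, kind, members):
    """Return the triples for a container of the given kind (Bag, Seq, or Alt)."""
    triples = [(node, RDF + "type", RDF + kind)]
    for position, member in enumerate(members, start=1):
        # Membership properties are numbered rdf:_1, rdf:_2, ... for every kind.
        triples.append((node, f"{RDF}_{position}", member))
    return triples

items = ["http://example.org/a", "http://example.org/b"]
seq = container_triples("_:c", "Seq", items)
bag = container_triples("_:c", "Bag", items)

# Collect the positions where the two lists of triples disagree.
differing = [(s, b) for s, b in zip(seq, bag) if s != b]
print(differing)
```

Anything beyond that one differing triple, such as whether order matters, is left to the application reading the data.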

RSS 1.0, the same RSS 1.0 generated by most weblogging tools, uses a container, a Seq (sequence), to group items. I have never been particularly happy with this, as I believe that ordering or other processing should result from the data rather than the structure, such as using the posting date to determine sequence of display. In addition, containers within RSS add redundancy. Individual items are contained in Items, which is contained in Channel. Added redundancy is equivalent to added complexity.

In fact, we can simplify RSS, as I tried to demonstrate once before. Unfortunately, this example won’t validate as RSS.

However, this one does.

By putting the onus of semantics (the behavior, if you will) of containers on to the applications, there will be differing results based on different applications’ interpretations of ‘Bag’, ‘Seq’, or ‘Alt’. The RSS 1.0 group can specify that RSS 1.0 uses a Container, but there is no guarantee that the data within the container will be processed in any specific way by any specific aggregator, or generated a specific way by a specific application.

That’s what happens when precision of meaning is lost…or deliberately withheld.

(I think I’m going to change my Movable Type RSS template to support the new and improved SORSS — Shelley’s Own RSS. I could unite disparate RSS sides under one banner. Instead of a king, a Queen, Dark and Beautiful. All will Love me and Despair…. )

nodeID: For those of you who ran into problems with blank nodes (bnodes) when trying to work with RDF, you’re going to love this: the WG has created the nodeID attribute, which allows you to apply a label to blank nodes. This means you can use whatever you want as a bnode label and the document will validate. No more having to figure out how to create a fake URI in order to specifically access the relationship denoted by a bnode.
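As a rough illustration of the labeling, the following Python sketch reads a small RDF/XML document that uses rdf:nodeID and lists the labels it finds. The ex: vocabulary, the example.org URIs, and the label ‘writer’ are made up for the example, and the code uses only the standard library XML parser rather than a real RDF parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# A hypothetical document: the author is a blank node labeled "writer".
doc = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/post">
    <ex:author rdf:nodeID="writer"/>
  </rdf:Description>
  <rdf:Description rdf:nodeID="writer">
    <ex:name>Example Author</ex:name>
  </rdf:Description>
</rdf:RDF>"""

root = ET.fromstring(doc)
node_id = f"{{{RDF_NS}}}nodeID"  # ElementTree spells namespaced attributes as {uri}name

# Gather every blank node label, in document order.
labels = [el.get(node_id) for el in root.iter() if el.get(node_id)]
print(labels)
```

Both the reference to the blank node and its description carry the same label, which is what ties the two together without a made-up URI.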

Collection: A new container-like construct has been created: the Collection. This is used with groups of resources to create a list-like structure. However, just as with Containers and Reification, there is no assumption about the semantics of a Collection; it is up to each application. If I use this in my own applications, it will only be because there’s an absolute need for it, and I can’t avoid its use.

datatype: There’s a new RDF attribute, rdf:datatype, that can be attached to a property to give a specific datatype URI reference, usually to XML Schema datatypes. Adding in support for datatyping is a goodness, but I’m absolutely appalled that the Working Group is adding this in at the property instance, rather than into the vocabulary definition. This means that one person can define create-date as an integer, the number of seconds since January 1, 1972. Another person could define create-date as a Schema date, with a value of 1999-10-01. Both would be valid instances of the same RDF vocabulary.

Of course, the designers can provide documentation about what format is to be used, but I would prefer something other than the honor system.
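The datatype concern can be sketched in a few lines of Python. The create-date property URI, the example.org documents, and the integer value are hypothetical, picked only to mirror the two cases described above; typed literals are modeled as simple (value, datatype) pairs.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"
CREATE_DATE = "http://example.org/terms#create-date"  # hypothetical property

# Two instances of the same property, each carrying its own datatype.
# The integer is an arbitrary placeholder, not a computed timestamp.
triples = [
    ("http://example.org/doc1", CREATE_DATE, ("875664000", XSD + "integer")),
    ("http://example.org/doc2", CREATE_DATE, ("1999-10-01", XSD + "date")),
]

# Nothing in the data itself forces the two instances to agree.
datatypes = {obj[1] for _, prop, obj in triples if prop == CREATE_DATE}
print(len(datatypes))
```

A consumer of this data has to check the datatype on every instance, since the vocabulary gives it nothing to rely on.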

Embedding: The WG did come out specifically on the issue of embedding RDF in HTML and XHTML: don’t do it, use link instead.

There’s a lot more, but you’ll have to buy the book. Literally.

Categories
Diversity

Girlism?

Recovered from the Wayback Machine.

Halley Suitt wrote the following at Blog Sisters in response to the question, “Whatever happened to feminism”:

“There is no more feminism,” I explain. Game Over. But it took me a day or two to name the new game. It’s “girlism” — women want to be sexy girls and use all the tricks girls use. Crying, flirting, begging, winking, stomping their feet when they don’t get their way, general trotting around showing off their long legs and whatever else they decide to show off thereby distracting and derailing men.

 

It’s about power — the girl power we’ve always had, but forgot about, combined with all the stuff we’ve learned in the workplace. Needless to say, if you’re a man and you call us on it, we deny it. The new double double standard. We learned how to stop playing fair

In my computer technology field, which is one of the most heavily male-dominated professions, I have never once seen a woman use flirting, begging, winking, stomping their feet, showing off their long legs, dressing sexy, or anything of this nature to get their way. If anything, women are less likely to display emotion on the job in my field than the men. Why? Because of statements such as these, saying that there is a double double standard and that women are using ‘girly’ ways to succeed.

Once I was so frustrated at being continually undercut by a male co-worker that I shed tears. Another of my co-workers, a woman, said that I needed to stop crying, because I was falling into the ‘women cry, men swear’ stereotype. I have never cried at work since. However, I have learned to pound the desk in anger, and swear a lot. Are these better? Well, at least they aren’t stereotypical.

Girlism. A label to discount women’s human experience and expression.

When women cry, they’re resorting to ‘girlism’, but when men cry, they’re being sensitive. Men can be hurt and receive understanding and compassion, but when women are hurt, they’re being overly emotional. Is that how it works now? Women dress for sex, but men dress for success. And when women get angry, they’re being ‘girly’, but when men get angry, they’re being assertive.

Categories
RDF

C2C Datahead

Recovered from the Wayback Machine.

Dorothea received an email from Simon St. Laurent, the editor of my RDF book. I appreciate her respect for Simon and match it with considerable respect of my own, which will cause him no end of discomfort, I’m sure. However, I have to push back at the sentence:

But Simon really is cool, one of the sadly few voices for document-oriented XML howling in the vast wilderness of C2C (computer-to-computer) dataheads.

It is the C2C ‘dataheads’ that ensure that XML documents don’t document crap for all of their cleanliness and pristine eloquence. It is the C2C ‘dataheads’ that provide the proofs behind the seemingly simple XML vocabularies to ensure that the data documented within them is always consistent and reliable. And it is this particular C2C datahead that spent several days this last week locked in debate, difficult debate, with members of the RDF Working Group, the XML community, the weblogging community, and others, trying my best to ensure that I understand the concerns of the non-RDF community; that RDF/XML is as simple as it can be, or, failing that, that we work with the XML community to come up with a feasible alternative; that the RDF specification documents are comprehensive and clear; and that I understand the concepts and semantics of RDF well enough that I may write cleanly about them. Perhaps even cleanly enough for the D2D markup heads.

Of course, this was a lot more work than writing out “RDF/XML sucks”. I think next time I won’t go through this effort. When someone says, “RDF/XML sucks”, I’ll respond with “No it doesn’t” and leave it at that.

Categories
Weblogging

Happy birthday, Mark

Recovered from the Wayback Machine.

Today, November 24, 2002, Mark Pilgrim has a milestone birthday: he turns 30.

And since this is Mark, how else does one express Birthday greetings?

<item>
<title>Happy Birthday</title>
<link>http://diveintomark.org</link>
<description> Happy Birthday to you, Mark</description>
<dc:subject>Greeting</dc:subject>
<dc:date>2002-11-24T00:00:30-06:00</dc:date>
</item>

Better yet:

<item rdf:about="http://weblog.burningbird.net/fires/000673.htm">
<title>Happy Birthday</title>
<link>http://diveintomark.org</link>
<description> Happy Birthday to you, Mark</description>
<dc:subject>Greeting</dc:subject>
<dc:creator>Bb</dc:creator>
<dc:date>2002-11-24T00:00:30-06:00</dc:date>
</item>

Categories
RDF

RDF Query-O-Matic light

Recovered from the Wayback Machine.

I slaved away this afternoon, persevering in my work in spite of numerous obstacles (sunshine, cat on lap, languor) to bring you RDF Query-o-Matic Light – the PHP-based RDFQL machine. A grueling six or so lines of code. I sit in exhaustion on my stool, fanning myself with old green bar computer paper.

Speaking of stools, that reminds me of another nursery rhyme associated with RDF.

Little Miss Muffet, sat on a tuffet,
Eating her curds and whey;
Along came a spider,
Who sat down beside her
And frightened Miss Muffet away.

Chances are, the stool referenced in this rhyme was a three legged one, similar to the milk stools still used today. Three is the perfect number of legs for a stool: just enough legs to provide stability, but without the need for the additional material for an extraneous fourth leg.

Returning to the subject of RDF, it, like the milk stool, is based on the principle that ‘three’ is the magic number – in this case three pieces of information are all that’s needed in order to fully define a single bit of knowledge. Fewer than three, and all you have is fact without context; more, and you’re being redundant.

Of the three pieces of information, the first is the RDF subject. After all, when discussing a property such as name, it can belong to a dog, cat, book, plant, person, car, nation, or insect. To make finite an infinite universe, you must set boundaries, and that’s what subject does for RDF.

The second piece of information is the predicate, more commonly thought of as the RDF ‘property’. There are many facts about any individual subject; for instance, I have a sex, a height, a hair color, eye color, degree, relationships, and so on. To focus on that aspect of me that we’re interested in at any one point in time, we need to specifically focus on one of my ‘properties’.

If you look at the intersection of ‘subject’ and ‘property’, you’ll find the final bit of information quietly waiting to be discovered – the value of the property. X marks the spot.

I am me. I have a name (Shelley Powers). I have a height (close to six feet). I have an attitude (sweet tempered and quite easy going). Together, these bits of knowledge form a picture, and that picture is me.

All from RDF triples strung together in precise ways.
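The three statements above can be written down as triples in a short Python sketch. The subject URI and the bare property names are placeholders of my own; a real vocabulary would use full URIs for the properties as well.

```python
from collections import namedtuple

# One bit of knowledge = subject + property + value.
Triple = namedtuple("Triple", "subject predicate value")

ME = "http://example.org/people#shelley"  # hypothetical subject URI

facts = [
    Triple(ME, "name", "Shelley Powers"),
    Triple(ME, "height", "close to six feet"),
    Triple(ME, "attitude", "sweet tempered and quite easy going"),
]

def value_of(triples, subject, predicate):
    """Find the value sitting at the intersection of a subject and a property."""
    for triple in triples:
        if triple.subject == subject and triple.predicate == predicate:
            return triple.value
    return None

print(value_of(facts, ME, "name"))
```

Drop the subject and ‘name’ could belong to anything; drop the property and the value has no meaning.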

On to the new version of the RDF Query-o-Matic, the PHP-based Query-o-Matic Light. This version, like the JSP version, can apply a valid RDFQL query against a valid RDF file, printing out a target value. However, there are some minor syntactic differences between the two.

With the PHP classes that provide the functionality for Light (PHP XML rdql), the file name and the explicit namespace declarations are included within the query itself rather than supplied as separate elements. For instance, the following query will access titles from all elements contained within my resume.rdf file – a file with an experimental resume RDF vocabulary:

SELECT ?b
FROM <http://weblog.burningbird.net/resume.rdf>
WHERE (?a, <bbd:title>, ?b)
USING bbd for <http://www.burningbird.net/resume_schema#>

The first line is the same SELECT clause, as discussed in the last RDFQL posting, but this is followed by a FROM clause, which lists the RDF file’s URL within angle brackets. Following is the WHERE clause containing the query, and again, this is no different than the JSP version, except that an alias is used instead of the full namespace. The namespace itself is listed in the last clause, delimited with the USING keyword.

Regardless of some syntactic differences, the query still returns the same result.

Taking the Light version of Query-o-matic out for a spin, I went looking for more complex queries, and found one in Phil’s Comments RDF. Though deceptively simple looking, Phil’s RDF file, in fact any RSS 1.0 RDF file, has one nasty little complication: containers.

An RDF container is an RDF object that groups related items together, usually with some implied processing as to order. An RDF container can group ordered items (SEQ), alternative items (ALT), or just a collection of unordered items (BAG). An RDF container is also a bit of a bugger when it comes to processing or generating RDF, one reason that they lack popularity.

However, the key to overcoming the difficulties associated with containers is the same as the one used with RDFQL queries – work with it one step at a time.

Container elements can be accessed individually by knowing that each item appears as the subject of a (subject, predicate, object) triple whose predicate is TYPE (http://www.w3.org/1999/02/22-rdf-syntax-ns#type with the namespace written out) and whose object is the item class. To access all container elements using RDFQL, you would need to have a WHERE clause similar to:

(?subject, <rdf:type>, "http://purl.org/rss/1.0/item")

This will return all container elements within the RDF document for the JSP version of Query-o-Matic, but not the Light version. The PHP version doesn’t allow for literals (the “http://purl.org/rss/1.0/item” value) directly within the query triple. Instead, you use a filter, designated by the keyword AND:

WHERE (?subject, <rdf:type>, ?object)
AND ?object=="http://purl.org/rss/1.0/item"

This triple query filters the elements returned, giving us a target set of subjects that are equal to all of the container elements in the document. With Phil’s comments RDF/RSS file, this is all the comments.

Once we have the container elements, the subject values are then passed into the next triple query, to access the DESCRIPTION property for each (the description holds the actual comment in RDF/RSS Comments). The value of the DESCRIPTION predicate is our target value, which gets printed out.

Pulling this all together, the query to access all of the actual comment text in the RDF document is:

SELECT ?desc
FROM <http://philringnalda.com/comments.rdf>
WHERE (?subject, <rdf:type>, ?object),
(?subject, <rss:description>, ?desc)
AND ?object=="http://purl.org/rss/1.0/item"
USING rdf for <http://www.w3.org/1999/02/22-rdf-syntax-ns#>,
rss for <http://purl.org/rss/1.0/>

The mapped values are the subjects: the ?subject variable that appears in both triples. The subjects found in the first triple query are passed as subjects to the next.
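For anyone curious about what the chaining amounts to, here is a small Python sketch of the idea. It is not how Jena or the PHP classes work internally, just a toy matcher over a handful of made-up triples standing in for a comments file; the function names and the example.org URIs are my own.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RSS = "http://purl.org/rss/1.0/"

# Made-up stand-in data: two comment items and one channel.
triples = [
    ("http://example.org/c1", RDF_TYPE, RSS + "item"),
    ("http://example.org/c1", RSS + "description", "First comment"),
    ("http://example.org/c2", RDF_TYPE, RSS + "item"),
    ("http://example.org/c2", RSS + "description", "Second comment"),
    ("http://example.org/feed", RDF_TYPE, RSS + "channel"),
    ("http://example.org/feed", RSS + "description", "Feed summary"),
]

def match(pattern, triple, bindings):
    """Extend bindings so the pattern fits the triple, or return None."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was given by an earlier triple.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result

def query(patterns, data):
    """Chain the patterns: bindings from each one feed the next."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for bindings in solutions:
            for triple in data:
                candidate = match(pattern, triple, bindings)
                if candidate is not None:
                    extended.append(candidate)
        solutions = extended
    return solutions

patterns = [
    ("?subject", RDF_TYPE, "?object"),
    ("?subject", RSS + "description", "?desc"),
]
# The AND filter: keep only the rows whose ?object is the RSS item class.
rows = [b for b in query(patterns, triples) if b["?object"] == RSS + "item"]
descriptions = [b["?desc"] for b in rows]
print(descriptions)
```

The channel has a description too, but the filter on ?object drops it, leaving only the comment text.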

Check out the results.

I’m actually not fond of container elements myself, precisely because there are processing semantics integrated into the element – sequence is assumed to be an ordered list of items, while a bag is not. I would rather provide the information necessary to order elements – such as date or some other characteristic – and then let the tool creators decide how they want the elements ordered.
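Ordering from the data rather than the structure could look something like this Python sketch. The item titles and dates are invented, and the dictionary layout is only an assumption about how a tool might hold items after parsing.

```python
from datetime import datetime

# Hypothetical items, deliberately stored out of order.
items = [
    {"title": "Second post", "date": "2002-11-24T09:30:00-06:00"},
    {"title": "First post", "date": "2002-11-20T18:00:00-06:00"},
    {"title": "Third post", "date": "2002-11-25T07:15:00-06:00"},
]

# The tool picks the order it wants (newest first here) from the date alone.
newest_first = sorted(
    items,
    key=lambda item: datetime.fromisoformat(item["date"]),
    reverse=True,
)
titles = [item["title"] for item in newest_first]
print(titles)
```

Another tool could just as easily sort oldest first, or by title, without the document having to say anything about sequence.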

Regardless, the trick to working with container elements is to use the TYPE predicate to discover the container elements, pull the subject associated with each, and then use these with relatively standard RDFQL for the rest of the query.

You can use both the JSP-based Query-o-matic and the PHP-based Query-o-Matic Light to try out different queries on whatever valid RDF documents you know of. Documentation for the RDFQL syntax used with the JSP based version can be found here, and the RDFQL syntax for the Light version can be found here. Remember that though there are syntactic differences between the two, the actual RDFQL used in the WHERE clause is logically the same – one or more chained triples, with the results of the first triple being passed to the second and so on.

Now that I have my query engines and can test my RDFQL, the next step is to pull these queries into an actual application, covered in the next of these essays into RDF and RDFQL.

To try the JSP Query-o-matic yourself, download and install Jena into your own environment. The actual o-matic JSP page can be downloaded here.

To try out o-Matic light, download and install the PHP XML classes. The PHP I used can be downloaded here.

Remember, these are for fun. So, have fun.