Right tool for the right job: XML formats redux

In the last post, I said I was a pusher of code, not a designer. As a pusher of code, then, I do feel comfortable commenting on the use of Atom or RSS for an import/export format.

Danny Ayers recently pointed out that there’s a new Atom format spec. Good, clean writeup with an interesting twist in the Introduction:

Atom is an XML-based document format intended to allow lists of related information, known as “feeds”, to be synchronised between publishers and consumers. Feeds are composed of a number of items, known as “entries”, each with an extensible set of attached metadata. For example, each entry has a title.

The primary use case that Atom addresses is the syndication of Web content such as Weblogs and news headlines to Web sites as well as directly to user agents. However, nothing precludes it from being used for other purposes and kinds of content.

That’s a bit like saying, “Here now, we have a specification for the banking industry wot would be a good spec for those of you who run petrol stations, what say?”

In my opinion, Atom, like RSS, makes a great syndication format, but too much of each format is tied to that underlying purpose for either to be exceptionally good for universal weblog transforms, including pushing weblog data from one tool to another. For instance:

  • Each item in Atom, or RSS for that matter, has a link associated with it. I suppose one could use this to hold a slug, or filename, but the two are not the same information.
  • Atom has the concept of an identifier, atom:id, which doesn’t translate well in weblogging terms. Each tool would have its own unique identification system.
  • Too many of the fields are associated with the mechanics of the feed, such as atom:generator. While this is essential for syndication feeds, there’s no need for this in weblog migration. Though you can say it’s optional, if you find there is no fit in the business for most (all?) of the optional bits, then you may be looking at a poor fit, overall, between the spec and the use.
  • There is a lot of data missing in Atom. Keyword-value pairs are something I think a format has to support. There isn’t anything in the specification to do with categories, or how hierarchical categories would be managed. One could say the same for comments – right now they have to be artificially transformed into little feed items, when they are really comments on a post, not individual feeds.
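To make the list above concrete, here is a minimal sketch of what happens when a weblog post is squeezed into a feed entry. Everything in it is an assumption made for illustration: the Post record and its field names are hypothetical, not any tool's real data model, and the mapping covers only the two entry elements that have an obvious counterpart in the discussion above (title and link), not the full Atom spec.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class Post:
    """Hypothetical weblog post record; field names are illustrative only."""
    title: str
    permalink: str
    slug: str                      # filename, not the same thing as a link
    tool_id: int                   # each tool has its own identification system
    categories: list = field(default_factory=list)   # e.g. "Tech/XML"
    keywords: dict = field(default_factory=dict)     # keyword-value pairs
    comments: list = field(default_factory=list)


# Post field -> entry element. Assumption: only fields with an obvious
# home in a feed entry are mapped; everything else is left over.
FIELD_TO_ELEMENT = {"title": "atom:title", "permalink": "atom:link"}


def split_for_entry(post):
    """Return (entry, leftover): what fits in a feed entry and what does not."""
    entry, leftover = {}, {}
    for name, value in asdict(post).items():
        element = FIELD_TO_ELEMENT.get(name)
        if element is not None:
            entry[element] = value
        else:
            leftover[name] = value
    return entry, leftover


post = Post(
    title="Hello",
    permalink="http://example.com/archives/hello/",
    slug="hello",
    tool_id=42,
    categories=["Tech/XML"],
    keywords={"mood": "curious"},
    comments=[{"author": "Reader", "body": "Nice post"}],
)
entry, leftover = split_for_entry(post)
print(sorted(entry))     # elements the entry can carry directly
print(sorted(leftover))  # data that needs an extension or a different format
```

In this sketch the leftover pile (slug, tool id, categories, keyword-value pairs, comments) is larger than the entry itself, which is the poor fit described above.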

The last item is the kicker for me. If you say that one can extend the model to include this extra data since Atom supports namespaces, why not take this a step further and go with a new model that focuses specifically on migrating data between tools? A syndication feed is not the same thing as porting an entire weblog between tools.

Of course, saying something like “Atom is not a good fit for this purpose” is similar to invoking the Lazy Web to have it done, and I’m sure a dozen feeds will be created that use Atom, or RSS, to produce and consume migration data. However, I’m not saying it can’t be done; I’m saying that forcing a specification designed for one purpose into another will, in the long run, be more trouble than it’s worth. Especially when you consider the political ramifications of using a syndication feed.

One could write a tool that both exports and imports data directly into the database, rather than interfacing through the tool, but this is not a comfortable option for many non-geeks. They could be concerned, and rightfully, that the underlying data model could change for the tools, and what worked one time may not work the next. The best approach is to use something that tools support, so that users have a degree of comfort with the port.

What we don’t need is one tool using an RSS-formatted import mechanism, while another uses an Atom-formatted export. Asking all tools to support all syndication formats for weblog imports and exports is a bit much; generating multiple syndication feeds is a matter of a new arrangement of tags, but consuming the different feeds is a whole different game.

At the same time, telling people who are already apprehensive about learning a new set of template tags that they need to transform the output generated by one tool before it can be used by another (oh, and there will probably be loss of data between the two) is a Geek’s Choice response.


Right tool for the right job: designs

I spent time yesterday working with the three sites ported from Movable Type. While I port each individual’s current look, I set up a PHP-based switcher they could use to try out some of the available WordPress Themes. There has been criticism that WP doesn’t look that great out of the box. I think the issue is that it doesn’t look like Apple wrote it – rounded corners and shaded lines, with bright lollypoppy or balloony colors. What the current style is, is functional. Best of all, it has a very clean layout that makes it easy to play around with different looks.

Loren gave me an okay to show what his site will look like under different themes. Here it is, in just a selection of the styles I moved into their sandbox environment:

Extreme Dark Time

Outback Dark Time

Silver Dark Time

Dots Time

Wp New

Serene Time


kinder, gentler Loren

The switcher only works if you add the URL parameter wpstyle=whateverstyle to each page.
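As a rough illustration of what adding the parameter to each page involves, here is a small Python sketch that rewrites a link so it carries wpstyle. It is not the PHP switcher itself; the function name, the example URLs, and the style values are all assumptions made for the example.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def with_style(url, style):
    """Return url with its wpstyle query parameter set to style."""
    parts = urlsplit(url)
    # Keep the other query parameters and drop any existing wpstyle value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "wpstyle"
    ]
    query.append(("wpstyle", style))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical URLs and style names, for illustration only.
print(with_style("http://example.com/?p=12", "dots"))
print(with_style("http://example.com/about/?wpstyle=old", "serene"))
```

Every internal link on a page would need the same treatment for the chosen style to stick as a visitor clicks around.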

There are a whole lot more themes, but I noticed that some of the styles suited Loren’s writing style, and some did not. For instance, some of the styles had writing blocks that fit across most of the page. These did not suit the poetry that Loren embeds in his page, leaving far too much white space to the right. Loren’s own style extends to fit the page, but he has a large enough sidebar to balance it.

The narrow writing space of Pink Lilies would work for his style of writing, with his symmetrical paragraphing and good use of visual breaks. However, it would not work well with Loren’s use of photographs. If he used smaller photos, it might work within the layout, but the graphics would compete with the image. And if you used photos embedded directly into the text, a triple column format with images could be too fussy, unless you kept the use of geegaws to a minimum.

Weblog style is more than just a matter of taste and what colors you like, or whether you’re into that Apple thing with the rounded and the shaded and the aluminized faux future looks. The style has to work not only with what you write but with how you write, and with the other material you include. Shorter paragraphs demand narrower columns; larger photos need a style that can accommodate them without the sidebar being pushed down below the main content; poetry shouldn’t have excessive white space showing on any one side.

However, I’m not a page designer, so take my opinion with a grain of salt.


Going…going…soon to be gone

Recovered from the Wayback Machine.

Well, all good things come to an end, or some such thing. I had hoped to revive this weblog, but with having to drastically cut costs, car payments take precedence over web sites. This means it’s time for me to start disengaging from this lovely online world.

For any updates to the book, you’re welcome to contact me directly via email or check at the O’Reilly book site. For general RDF issues, the links in this post are a good place to start.

Other than that, this weblog is going down end of September.

It’s been fun.