Categories
Burningbird Technology Web

A major site redesign

I’ve finished the re-organization of my web site, though I still have odds and ends to wrap up. I have two major changes featuring SVG and RDFa left to incorporate, but the structure and the web site designs are finished.

Thanks to Drupal’s non-aggressive use of .htaccess, I’ve been able to create a top-level Drupal installation to act as a “feeder” to all of the sub-sites. I tried this once before with WordPress, but the .htaccess entries necessary for that CMS made it impossible to have the sub-sites, much less static pages in sub-directories.
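For context, the relevant rules in a stock Drupal .htaccess hand a request off to Drupal only when it doesn’t match a real file or directory, which is why sub-site installations and static sub-directories survive untouched. Lightly trimmed:

# From a stock Drupal .htaccess: rewrite a request to index.php only
# when it doesn't correspond to an existing file or directory.
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]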

Rather than use Planet or Venus software to aggregate feed entries for all of my sites, I’m manually creating an excerpt describing each new entry, and posting it at Burningbird, with a link back to the full article. I also keep a listing of the last few months’ stories for each sub-site in the sidebar, in addition to a random display of images.

There is no longer any commenting directly on a story. One of the drawbacks of XHTML and an unforgiving browser such as Firefox is that a small error is enough to render the page useless. I incorporate Drupal modules to protect comments, but I also allow people to enter some markup. This combination handles most of the accidentally bad markup, but not all, and it doesn’t protect against those determined to inject invalid markup. The only way to eliminate all problems is to not allow any markup, which I find too restrictive.
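The closest thing to a general fix, repairing comment markup server-side before it’s stored, still isn’t airtight either. A hypothetical sketch using PHP’s Tidy extension (not the modules I actually run):

<?php
// Hypothetical mitigation, not what this site uses: repair submitted
// markup into well-formed XHTML before storing it. This catches the
// accidents, but a determined poster can still slip bad content past.
$comment_text = '<em>an unclosed emphasis tag';
$clean = tidy_repair_string($comment_text, array('output-xhtml' => TRUE), 'utf8');
?>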

Comments are, however, supported at the Burningbird main site. To allow for discussion on a story, I’ve embedded a link in every story that leads back to the topmost Burningbird entry, where people can comment. Now, in those infrequent times when a comment causes a problem with a page, the story is still accessible. And there is a single Comment RSS feed that now encompasses all site comments.

The approach may not be ideal, but commentary is now splintered across weblog, twitter, and what not anyway—what’s another link among friends?

I call my web site design “Silhouette” and will release it as a Drupal theme as soon as it’s fully tested. It’s a very simple two-column design, with the sidebar column either to the right (the standard) or easily adjusted to fall to the left. It’s an accessible design, with only the top navigation bar coming between the top of the page and the first story. It is valid markup as-is, with the XHTML+RDFa Doctype, because I’ve embedded RDFa into the design. It is not valid, however, when you also add SVG silhouettes, as I do with all but the topmost site.

The design is also valid XHTML 5.0, except for a hard-coded meta element that was added to Drupal because of security issues. I don’t serve the pages up as HTML 5, though, because the RDFa Doctype triggers certain behaviors in RDFa tools. I’m also not using any of the new HTML 5 structural elements.

The site design is plain, but it suits me, and that’s what matters. The content is legible, easy to locate, and easy to navigate, and that’s my second criterion. I will be adding some accessibility improvements in the next few months, but they won’t impact the overall design.

What differs between the sites is the header graphic and the SVG silhouettes, which I change to suit the topic or mood of each site. The silhouettes were a lot of fun, but they aren’t essential, and you won’t be able to see them if you use a browser that doesn’t support inline SVG. Which means you IE users will need to use another browser to see the images.

I also incorporate some newer CSS features, including subtle use of text-shadow on headers (to add richness to the stark use of black text on pastel graphics) and rgba() background colors for semi-transparent backgrounds. The effects are not viewable in browsers that don’t yet support these newer CSS styles, but the loss of decoration does not impact access to the material.
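The styling amounts to only a few lines. A sketch of the idea (not Silhouette’s exact rules), including the opaque fallback that keeps older browsers presentable:

/* Sketch, not the theme's exact CSS. */
h1, h2 {
  text-shadow: 2px 2px 3px #999;               /* subtle depth on headers */
}
#sidebar {
  background-color: #f5f0e8;                   /* opaque fallback first */
  background-color: rgba(255, 255, 255, 0.5);  /* ignored where unsupported */
}

Browsers that don’t understand rgba() simply skip that declaration and keep the opaque fallback, so nothing is lost but the transparency.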

Now, for some implementation basics:

  • I manually reviewed all my old stories (from the last 8 years), and added 410 status codes for those I decided to permanently remove.* (See the sketch following this list.)
  • For the older stories I kept, I fixed up the markup and links, and added them as new Drupal entries in the appropriate sub-site. I changed the dates to match the older entries, and then added a redirect between the old URL and the new.
  • By using one design for all of the sites, when I make a change for one, it’s a snap to make the change for all. The only thing that differs is the inline SVG in the page.tpl.php page, and the background.png image used for the header bar.
  • I use the same set of Drupal modules at all sub-sites, which again makes it very easy to make updates. I can update all of my 7 Drupal sites (including my restricted access book site) to a new Drupal release in less than ten minutes.
  • I use the Drupal Aggregator module to aggregate site entries in the Burningbird sidebar.
  • I manually created menu entries for the sub-site major topic entries in Burningbird. I also created views to display terms and stories by vocabulary, which I use in all of my sub-sites.
  • The site design incorporates a footer that expands the Primary navigation menu to show the secondary topic entries. I’ve also added back in a monthly archive, as well as recent writings links, to enable easier access of site contents.
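The 410s and redirects from the first two items are plain Apache directives, along these lines (the paths here are made up for illustration):

# Sketch with made-up paths: removed stories answer 410 Gone, while
# kept stories redirect permanently from the old URL to the new one.
Redirect gone /fires/000123.htm
Redirect permanent /fires/000124.htm http://burningbird.net/node/42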

The expanded primary menu footer was simple, using Drupal’s API:


<?php
// Build the complete tree for the primary links menu, including items
// below the currently active page, then render it as nested lists.
$tree = menu_tree_all_data('primary-links');
print menu_tree_output($tree);
?>

To implement the “Comment on this story” link for each story, I installed the Content Construction Kit (CCK), with the additional link module, and expanded the story content type to add the new “comment on this story” field. When I add the entry, I type in the URL for the comment post at Burningbird, which automatically gets linked in with the text “Comment on this story” as the title.
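Displaying the field is then a one-liner in the node template. A sketch assuming Drupal 6 CCK conventions and a hypothetical field name of field_comment_link:

<?php
// Sketch, assuming a CCK link field named field_comment_link: CCK
// attaches the rendered field to the node object when it is viewed.
print $node->field_comment_link[0]['view'];
?>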

I manually manage the link from the Burningbird site to the sub-site writing, both because the text and circumstance of the link differ, and because the CCK field isn’t included as part of the feed. I may play around with automating this process, but I don’t plan on writing entries so frequently that I find this workflow to be a burden.

The images were tricky. I have implemented both the piclens and mediaRSS Drupal modules, and if you access any of my image galleries with an application such as Cooliris, you’ll get that wonderful image management capability. (I wish more people would use this functionality for their image libraries.)

I also display sub-site specific random images within the sub-site sidebars, but I wanted the additional capability to display random images from across all of the sites in the topmost Burningbird sidebar.

To get this cross-site functionality, I installed Gallery2 at http://burningbird.net/gallery2, and synced it with the images from all of my sub-sites. I then installed the Gallery2 Drupal module at Burningbird (which you can view directly) and used Gallery2 plug-ins to provide random images within the Drupal sidebar blocks.

Drupal prevented direct access from Gallery2 to the image directories, but it was a simple matter to just copy the images and do a bulk upload. When I add a new image, I’ll just pull the image directly from the Drupal Gallery page using Gallery2’s image extraction functionality. Again, I don’t add so many images that I find this workflow to be onerous, but if others have implemented a different approach, I’d enjoy hearing of alternatives.

One problem that arose is that none of the Gallery2 themes is XHTML compliant because of HTML entity use. All I can say is: folks, please stop using &nbsp;. Use &#160; instead, if you’re really, really generating XHTML, not just HTML pretending to be XHTML.
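If rewriting the theme isn’t an option, one blunt, hypothetical workaround is to swap the entity in the final output, somewhere like a theme-level wrapper function:

<?php
// Hypothetical workaround, not what I did here: replace the named
// entity with its numeric equivalent before the page is sent.
function silhouette_fix_entities($page) {
  return str_replace('&nbsp;', '&#160;', $page);
}
?>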

What I actually did to fix the non-compliant XHTML problem was copy my site design into a separate “Silhouette for HTML” theme, and remove the PHP that serves pages up as XHTML to XHTML-capable browsers. The Gallery2 Drupal modules allow you to specify a different theme for the Gallery2 pages, so I use the new HTMLated theme for the Gallery2 pages, and my XHTML-compliant theme for the rest of the site. Over time, I can probably add conditional tests to my main theme to check for the presence of Gallery blocks, but what I have is simple and works for now.
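For reference, the XHTML-serving PHP I stripped out of the HTML variant is, in essence, the usual content negotiation check. A minimal sketch, not the theme’s exact code:

<?php
// Send application/xhtml+xml only to browsers that advertise support
// for it in their Accept header; everyone else gets plain text/html.
if (isset($_SERVER['HTTP_ACCEPT']) &&
    strpos($_SERVER['HTTP_ACCEPT'], 'application/xhtml+xml') !== FALSE) {
  header('Content-Type: application/xhtml+xml; charset=utf-8');
}
else {
  header('Content-Type: text/html; charset=utf-8');
}
?>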

Lastly, I redirected the old Planet/Venus-based feed locations to the Burningbird feed. You can still access full feeds from all of my sub-sites, and get full entries for all but the larger stories and books, but the entries at Burningbird will be excerpts, except for Burningbird-only posts. Speaking of which, all of my smaller status updates and general chit-chat will be posted directly at Burningbird; I’m leaving the sub-sites for longer, more in-depth, “stand alone” writings.

As I mentioned earlier, I still have some work with SVG and RDFa to finish before I’m completely done with the redesign. I also have some additional tweaks to make to the existing infrastructure. For instance, I have custom 404, 403, and 410 error pages, but Drupal overrides the 403 and 404 pages. You can redirect the error handling to specific pages, but only to pages within the Drupal system, not to static pages. However, I’m not too worried about this issue, as I’m finding that there’s typically a Drupal module for any problem, just waiting to be discovered.

I know I must come across as a Drupal fangirl in this writing, but after using the application for over a year, and especially after this site redesign, I have found that no other piece of software matches my needs so well as Drupal. It’s not perfect software—there is no such thing as perfect software—but it works for me.

* This process convinced me to switch fully from using Firefox to using Safari. It was so much simpler to fix pages with XHTML errors using Safari than with Firefox’s overly aggressive XHTML error handling.

Categories
Technology

What’s shorter than 140 characters?

What can possibly top Twitter and its immediacy, as well as brevity of contact? I think we found out this week, with Google Wave. Tim O’Reilly describes it as what email would be like if invented today. My first reaction, and judging from other responses not only mine, is that it’s remarkably similar to Ray Ozzie’s Groove, before Groove became little more than a ghost appendage to Microsoft.

Folks immediately started rumbling about “twitter killer”, but I look at it and see the answer to the question, “What can beat out 140 characters?” The answer is, evidently, echoed keystrokes as people make them.

I watched the presentation video (thank you for that, Google). Technologically, Google Wave is intriguing. What was also intriguing was Google’s strong emphasis on HTML5 during the presentation, including a reference to additions to the HTML5 spec. But the part that caught my attention is that Wave is actually echoing keystrokes. I can imagine the following discussion, happening live:

A: I just saw the demo of Google Wave …

B: Oh, yeah, that was terrific

A:….and it sucked

B: Oh, um, well I thought…

A: You liked it! Are you…

B: …it was innovative

A: …cracked?

Google Wave is ADD heroin.

I was thinking about Google Wave yesterday, as I ran the gauntlet known as Watson Street, here in St. Louis. As I dodged little old ladies who pull into the road without looking and the 30-something guy who cut me off when he should have yielded, and contemplated the new ding in my car from some mother’s precious child opening his or her car door too hard and too wide, I began to appreciate what Twitter, Google Wave, blogging, Facebook, and other social media are: alternative communities to real life.

Because in real life, we’re all pricks.

Categories
RDF Standards XHTML/HTML

A Loose Set of Notes on RDFa, XHTML, and HTML5

There’s been a great deal of discussion about RDFa, HTML5, and microdata the last few days, on email lists and elsewhere. I wanted to write down notes of the discussions here, for future reference. Those working issues with RDFa in Drupal 7 should pay particular attention, but the material is relevant to anyone incorporating RDFa.

Shane McCarron released a proposal for RDFa in HTML4, which is based on creating a DTD that extends support for RDFa in HTML4. He does address some issues related to the differences in how certain data is handled in HTML4 and XHTML, but for the most part, his document refers processing issues to the original RDFaSyntax document.

Philip Taylor responded with some questions, specifically about how xml:lang is handled by HTML5 parsers, as compared to XML parsers. His second concern was how to handle XMLLiteral in HTML5, because the assumption is that RDFa extractors in JavaScript would be getting their data from the DOM, not processing the characters in the page.

“If the object of a triple would be an XMLLiteral, and the input to the processor is not well-formed [XML]” – I don’t understand what that means in an HTML context. Is it meant to mean something like “the bytes in the HTML file that correspond to the contents of the relevant element could be parsed as well-formed XML (modulo various namespace declaration issues)”? If so, that seems impossible to implement. The input to the RDFa processor will most likely be a DOM, possibly manipulated by the DOM APIs rather than coming straight from an HTML parser, so it may never have had a byte representation at all.

There’s a lively little sub-thread related to this one issue, but the one response I’ll focus on is Shane’s: that RDFa does not pre-suppose a processing model in which there is a DOM. The issue of xml:lang is also still under discussion, but I want to move on to new issues.

While the discussion related to Shane’s document was ongoing, Philip released his own first look at RDFa in HTML5. Concern was immediately expressed about Philip’s copying of some of Shane’s material in order to create a new processing rule section. The concern wasn’t because of any issue to do with copyright, but because of the problems that can occur when you have two sets of processing rules for the same data and the same underlying data model. No matter how careful you are, at some point the two are likely to diverge, and the underlying data model becomes corrupted.

Rather than spend time on Philip’s specification directly at this time, I want to focus, instead, on a note he attached to the email entry providing the link to the spec proposal. In it he wrote:

There are several unresolved design issues (e.g. handling of case-sensitivity, use of xmlns:* vs other mechanisms that cause fewer problems, etc) – I haven’t intended to make any decisions on such issues, I’ve just attempted to define the behaviour with sufficient detail that it should make those issues visible.

More on case sensitivity in a moment.

Discussion started a little more slowly for Philip’s document, but is ongoing. In addition, both Philip and Manu Sporny released test suites. Philip’s is focused on highlighting problems when parsing RDFa in HTML as compared to XHTML; the one that Manu posted, created by Shane, focused on a basic set of test cases for RDFa generally, but migrated into the RDFa in HTML4 document space.

Returning to Philip’s issue with case sensitivity, I took one of Shane’s RDFa in HTML test cases, and the rdfquery JavaScript from Philip’s test suite, and created pages demonstrating the case sensitivity issue. One such is the following:

<!DOCTYPE HTML PUBLIC "-//ApTest//DTD HTML4+RDFa 1.0//EN" "http://www3.aptest.com/standards/DTD/html4-rdfa-1.dtd">
<html
xmlns:t="http://test1.org/something/"
xmlns:T="http://test2.org/something/"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<head>
<title>Test 0011</title>
</head>
<body>
<div about="">
Author: <span property="dc:creator t:apple T:banana">Albert Einstein</span>
<h2 property="dc:title">E = mc<sup>2</sup>: The Most Urgent Problem of Our Time</h2>
</div>
</body>
</html>

Notice the two namespace declarations, one for “t” and one for “T”. Both are used to provide properties for the object being described in the document: t:apple and T:banana. Parsing the document with an RDFa application that applies XML rules treats the namespaces “t” and “T” as two different namespaces, and has no problem with the RDFa annotation.
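In Turtle, the triples an XML-rules processor extracts look roughly like this (abbreviated; the dc:title XMLLiteral triple is omitted):

@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix t:  <http://test1.org/something/> .
@prefix T:  <http://test2.org/something/> .

<> dc:creator "Albert Einstein" ;
   t:apple    "Albert Einstein" ;
   T:banana   "Albert Einstein" .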

However, using the rdfquery JavaScript library, which treats “t” and “T” the same because of HTML case insensitivity, an exception results: Malformed CURIE: No namespace binding for T in CURIE T:banana. Stripping away the RDFa aspects and focusing on the namespaces, you can see how browsers handle namespace case in an HTML document and in a document served up as XHTML. To make matters more interesting, check out the two pages using Opera 10, Firefox 3.5, and the latest Safari. Opera preserves the case, while both Safari and Firefox lowercase the prefix. Even within the HTML world, the browsers handle namespace case in HTML differently. However, all handle the prefixes the same, and correctly, in XHTML. So does the rdfquery JavaScript library, as this test page demonstrates.

Returning to the discussion, there is some back and forth on how to handle case sensitivity issues related to HTML, with suggestions varying as widely as: tossing the RDFa in XHTML spec out and creating a new one; tossing RDFa out in favor of Microdata; creating a best practices document that details the problem and provides appropriate warnings; or creating a new RDFa in HTML document (or modifying the existing profile document) specifying that all conforming applications must treat prefix names as case insensitive in HTML, possibly cross-referencing the RDFa in XHTML document, which allows case sensitive prefixes. I am not in favor of the first two options. I do favor the latter two options, though I think the best practices document should strongly recommend using lowercase prefix names, and definitely not using two prefixes that differ only by case. During the discussion, a new conforming RDFa test case was proposed that tests based on case. This has now started its own discussion.

I think the problem of case and namespace prefixes (not to mention xmlns as compared to XMLNS) is very much an edge issue, not a show stopper. However, until a solution is formalized, be aware that xmlns prefix case is handled differently in XHTML and HTML. All things being equal, consider using only lowercase prefixes when embedding RDFa (or any other namespace-based functionality). In addition, do not use XMLNS. Ever. If not for yourself, do it for the kittens.

Speaking of RDFa in HTML issues, there is now a new RDFa in HTML issues wiki page. Knock yourselves out.

Update: a new version of the RDFa in HTML4 profile has been released. It addresses some of the concerns expressed earlier, including the issue of case and XMLLiteral. Though HTML5 doesn’t support DTDs, as HTML4 does, the conformance rules should still be good for HTML5.

Categories
Web

Cite not link

I do have considerable sympathy for Thomas Crampton [1], when he discovered that all of his stories at the International Herald Tribune have been pulled from the web because of a merger with the New York Times.

So, what did the NY Times do to merge these sites?

They killed the IHT and erased the archives.

1- Every one of the links ever made to IHT stories now points back to the generic NY Times global front page.

2- Even when I go to the NY Times global page, I cannot find my articles. In other words, my entire journalistic career at the IHT – from war zones to SARS wards – has been erased.

At the same time, though, I don’t have as much sympathy for Wikipedia losing its links to the same stories, as detailed by Crampton in a second posting [2].

The issue: Wikipedia – one of the highest traffic websites on the Internet – makes reference to a large number of IHT stories, but those links are now all dead. They need to delete them all and find new references or use another solution.

As I wrote in comments at Teleread:

I do have sympathy, I know I would be frustrated if my stories disappeared from the web, but at the same time, there is a certain level of karma about all of this.

How many times have webloggers chortled at the closure of another newspaper? How many times have webloggers gloated about how we will win over Big Media?

The thing is, when Big Media is gone, who will we quote? Who will we link? Where will the underlying credibility for our stories be found?

Isn’t this exactly what webloggers have wanted, all along?

I have sympathy for a writer losing his work, though I assume he kept copies of his writings. If they can’t be found in hard copies of the newspaper, then I’m assuming the paper is releasing its copyright on the items, and that Mr. Crampton will be able to re-publish these on his own. That’s the agreement I have with O’Reilly: when it no longer actively publishes one of my works, the copyright is returned to me. In addition, with some of the books, we have a mutual agreement that when the book is no longer published, the work will be released to the public domain.

I don’t have sympathy for Wikipedia, though, because the way many citations are made at the site doesn’t follow Wikipedia’s own citation policy. Links are a lazy form of citation. The relevant passage should be quoted in the Wikipedia article, matched with a citation listing the author, the title of the work, the publication, and the publication date, not a quick link to an external site over which Wikipedia has no control.

I’m currently fixing one of my stories, Tyson Valley, a Lone Elk, and the Bomb, because the original material was moved without redirection. But as I fix the article, what I’m doing is making copies of all of the material for my own reference. Saving the web page is no different than making a photocopy of an article in the days before the web.

In addition, I will be adding a formal citation for the source, as well as the link, so if the article moves again, whoever reads my story will know how to search for the article’s new location. At a minimum, they’ll know where the article was originally found.

I’m also repackaging the public domain writing and images for serving at my site, again with a text citation expressing appreciation to the site that originally published the images.

By using this approach, the stories I consider “timeless”, in whatever context that word means in this ephemeral environment, would not require my constant intervention.

Authors posting to Wikipedia should be doing the same, and this policy should be enforced: provide a direct quote of relevant material (allowed under Fair Use), and provide a formal citation, in addition to the link. Or perhaps, instead of the link. Because when the newspapers disappear, they’ll have no reason to keep the old archives. No reason at all. And then, where will Wikipedia be?

[1] Crampton, Thomas, “Reporter to NY Times Publisher: You Erased My Career”, thomascrampton.com, May 8, 2009.
[2] Crampton, Thomas, “Wikipedia Grappling with Deletion of IHT.com”, thomascrampton.com, May 8, 2009.

Categories
SVG XHTML/HTML

Whipping boy

I noticed a passing Twitter message from Laura Scott: “One word: standards. Firefox follows w3c standards. Internet Explorer does not.” She wrote it in response to another Twitter message from tutu4lu, who was having problems with a web page appearing differently in IE than in Firefox.

It is true that Firefox implements more standards than IE, especially when it comes to some of my favorites, such as SVG. And I appreciate that fact.

Firefox does not necessarily get an A+ for all of its effort, though. In particular, if Microsoft’s lack of implementation of XHTML has been one force against broader use of XHTML at web sites, Firefox’s own handling of XML errors in XHTML is another, more subtle force against it.

Here’s an example. I added a raw ampersand (&) to a URL in one of my posts, which generates an XHTML well-formedness error, since ampersands must be escaped as &amp; in XHTML. The following are three screen shots, from Chrome, Opera, and Safari respectively, that demonstrate how they handle the error:

XHTML error in Chrome
Opera XHTML error
Safari error

Safari and Chrome are both built on WebKit, which handles XHTML errors by parsing, and rendering, the document up to the error. This has the advantage of providing some content, as well as being able to more quickly find the error when you’re debugging.

Opera doesn’t render the document, but it does provide a display of the source, with highlighting where the error occurs. This is extremely helpful when you’re debugging a larger document. In addition, Opera provides an option to render the document as HTML rather than XHTML, which is helpful for everyone else.

Contrast and compare these screenshots with the following, from Firefox.

Firefox error handling

The Firefox XHTML error handling is also known as YSOD, or Yellow Screen of Death. It’s harsh, abrupt, and somewhat punishing in nature, with its sickly yellow background, and bright red text. The message is typically cut off by the edge of the browser window, so one can’t easily see where the error has occurred. It’s most definitely intimidating for readers who accidentally stumble on to an XHTML page currently in a broken state.

All four of the browsers do support the XHTML standard, and all stop processing the XHTML when an error occurs, as is proper. But where Safari/Webkit, Chrome/Webkit, and Opera try to provide a useful web page, Firefox picks up a ruler and gives the owner of the web site a good whacking.

It’s easy to fall into the trap of blaming all web development and design problems on Microsoft and IE, and to use IE as a whipping boy—to the exclusion of looking, critically, at the other browsers in the web space. If the lack of support for XHTML in IE is a primary inhibitor of the spread of XHTML, Firefox’s YSOD has to take the second place prize. Support for XHTML doesn’t end at the parser.