Categories
Burningbird Technology Web

A major site redesign

I’ve finished the re-organization of my web site, though I still have odds and ends to wrap up. I have two major changes featuring SVG and RDFa left to incorporate, but the structure and the web site designs are finished.

Thanks to Drupal’s non-aggressive use of .htaccess, I’ve been able to create a top-level Drupal installation to act as a “feeder” to all of the sub-sites. I tried this once before with WordPress, but the .htaccess entries necessary for that CMS made it impossible to have the sub-sites, much less static pages in sub-directories.
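For context, the relevant rules in a stock Drupal .htaccess hand a request off to Drupal only when it doesn’t match a real file or directory, which is why sub-site installations and static sub-directories survive untouched. Lightly trimmed:

# From a stock Drupal .htaccess: rewrite a request to index.php only
# when it doesn't correspond to an existing file or directory.
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]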

Rather than use Planet or Venus software to aggregate feed entries for all of my sites, I’m manually creating an excerpt describing each new entry, and posting it at Burningbird, with a link back to the full article. I also keep a listing of the last few months’ stories for each sub-site in the sidebar, in addition to a random display of images.

There is no longer any commenting directly on a story. One of the drawbacks of XHTML and an unforgiving browser such as Firefox is that a small error is enough to render the page useless. I incorporate Drupal modules to protect comments, but I also allow people to enter some markup. This combination handles most of the accidentally bad markup, but not all, and it doesn’t protect against those determined to inject invalid markup. The only way to eliminate all problems is to not allow any markup, which I find too restrictive.
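The closest thing to a general fix, repairing comment markup server-side before it’s stored, still isn’t airtight either. A hypothetical sketch using PHP’s Tidy extension (not the modules I actually run):

<?php
// Hypothetical mitigation, not what this site uses: repair submitted
// markup into well-formed XHTML before storing it. This catches the
// accidents, but a determined poster can still slip bad content past.
$comment_text = '<em>an unclosed emphasis tag';
$clean = tidy_repair_string($comment_text, array('output-xhtml' => TRUE), 'utf8');
?>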

Comments are, however, supported at the Burningbird main site. To allow for discussion on a story, I’ve embedded a link in every story that leads back to the topmost Burningbird entry, where people can comment. Now, in those infrequent times when a comment causes a problem with a page, the story is still accessible. And there is a single Comment RSS feed that now encompasses all site comments.

The approach may not be ideal, but commentary is now splintered across weblog, twitter, and what not anyway—what’s another link among friends?

I call my web site design “Silhouette” and will release it as a Drupal theme as soon as it’s fully tested. It’s a very simple two-column design, with the sidebar column either to the right (the standard) or easily adjusted to fall to the left. It’s an accessible design, with only the top navigation bar coming between the top of the page and the first story. It is valid markup as-is, with the XHTML+RDFa Doctype, because I’ve embedded RDFa into the design. It is not valid, however, when you also add SVG silhouettes, as I do with all but the topmost site.

The design is also valid XHTML 5.0, except for a hard-coded meta element that was added to Drupal because of security issues. I don’t serve the pages up as HTML 5, though, because the RDFa Doctype triggers certain behaviors in RDFa tools. I’m also not using any of the new HTML 5 structural elements.

The site design is plain, but it suits me, and that’s what matters. The content is legible, easy to locate, and easy to navigate, and that’s my second criterion. I will be adding some accessibility improvements in the next few months, but they won’t impact the overall design.

What differs between the sites is the header graphic and the SVG silhouettes, which I change to suit the topic or mood of each site. The silhouettes were a lot of fun, but they aren’t essential, and you won’t be able to see them if you use a browser that doesn’t support inline SVG. Which means you IE users will need to use another browser to see the images.

I also incorporate some newer CSS features, including subtle use of text-shadow on headers (to add richness to the stark use of black text on pastel graphics) and rgba() background colors for semi-transparent backgrounds. The effects are not viewable in browsers that don’t yet support these newer CSS styles, but the loss of decoration does not impact access to the material.
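The styling amounts to only a few lines. A sketch of the idea (not Silhouette’s exact rules), including the opaque fallback that keeps older browsers presentable:

/* Sketch, not the theme's exact CSS. */
h1, h2 {
  text-shadow: 2px 2px 3px #999;               /* subtle depth on headers */
}
#sidebar {
  background-color: #f5f0e8;                   /* opaque fallback first */
  background-color: rgba(255, 255, 255, 0.5);  /* ignored where unsupported */
}

Browsers that don’t understand rgba() simply skip that declaration and keep the opaque fallback, so nothing is lost but the transparency.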

Now, for some implementation basics:

  • I manually reviewed all my old stories (from the last 8 years), and added 410 status codes for those I decided to permanently remove.* (See the sketch following this list.)
  • For the older stories I kept, I fixed up the markup and links, and added them as new Drupal entries in the appropriate sub-site. I changed the dates to match the older entries, and then added a redirect between the old URL and the new.
  • By using one design for all of the sites, when I make a change for one, it’s a snap to make the change for all. The only thing that differs is the inline SVG in the page.tpl.php page, and the background.png image used for the header bar.
  • I use the same set of Drupal modules at all sub-sites, which again makes it very easy to make updates. I can update all of my 7 Drupal sites (including my restricted access book site) to a new Drupal release in less than ten minutes.
  • I use the Drupal Aggregator module to aggregate site entries in the Burningbird sidebar.
  • I manually created menu entries for the sub-site major topic entries in Burningbird. I also created views to display terms and stories by vocabulary, which I use in all of my sub-sites.
  • The site design incorporates a footer that expands the Primary navigation menu to show the secondary topic entries. I’ve also added back in a monthly archive, as well as recent writings links, to enable easier access of site contents.
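The 410s and redirects from the first two items are plain Apache directives, along these lines (the paths here are made up for illustration):

# Sketch with made-up paths: removed stories answer 410 Gone, while
# kept stories redirect permanently from the old URL to the new one.
Redirect gone /fires/000123.htm
Redirect permanent /fires/000124.htm http://burningbird.net/node/42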

The expanded primary menu footer was simple, using Drupal’s API:


<?php
// Build the complete tree for the primary links menu, including items
// below the currently active page, then render it as nested lists.
$tree = menu_tree_all_data('primary-links');
print menu_tree_output($tree);
?>

To implement the “Comment on this story” link for each story, I installed the Content Construction Kit (CCK), with the additional link module, and expanded the story content type to add the new “comment on this story” field. When I add the entry, I type in the URL for the comment post at Burningbird, which automatically gets linked in with the text “Comment on this story” as the title.
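Displaying the field is then a one-liner in the node template. A sketch assuming Drupal 6 CCK conventions and a hypothetical field name of field_comment_link:

<?php
// Sketch, assuming a CCK link field named field_comment_link: CCK
// attaches the rendered field to the node object when it is viewed.
print $node->field_comment_link[0]['view'];
?>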

I manually manage the link from the Burningbird site to the sub-site writing, both because the text and circumstance of the link differ, and because the CCK field isn’t included as part of the feed. I may play around with automating this process, but I don’t plan on writing entries so frequently that I find this workflow to be a burden.

The images were tricky. I have implemented both the piclens and mediaRSS Drupal modules, and if you access any of my image galleries with an application such as Cooliris, you’ll get that wonderful image management capability. (I wish more people would use this functionality for their image libraries.)

I also display sub-site specific random images within the sub-site sidebars, but I wanted the additional capability to display random images from across all of the sites in the topmost Burningbird sidebar.

To get this cross-site functionality, I installed Gallery2 at http://burningbird.net/gallery2, and synced it with the images from all of my sub-sites. I then installed the Gallery2 Drupal module at Burningbird (which you can view directly) and used Gallery2 plug-ins to provide random images within the Drupal sidebar blocks.

Drupal prevented direct access from Gallery2 to the image directories, but it was a simple matter to just copy the images and do a bulk upload. When I add a new image, I’ll just pull the image directly from the Drupal Gallery page using Gallery2’s image extraction functionality. Again, I don’t add so many images that I find this workflow to be onerous, but if others have implemented a different approach, I’d enjoy hearing of alternatives.

One problem that arose is that none of the Gallery2 themes is XHTML compliant because of HTML entity use. All I can say is: folks, please stop using &nbsp;. Use &#160; instead, if you’re really, really generating XHTML, not just HTML pretending to be XHTML.
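If rewriting the theme isn’t an option, one blunt, hypothetical workaround is to swap the entity in the final output, somewhere like a theme-level wrapper function:

<?php
// Hypothetical workaround, not what I did here: replace the named
// entity with its numeric equivalent before the page is sent.
function silhouette_fix_entities($page) {
  return str_replace('&nbsp;', '&#160;', $page);
}
?>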

What I actually did to fix the non-compliant XHTML problem was copy my site design into a separate “Silhouette for HTML” theme, and remove the PHP that serves pages up as XHTML to XHTML-capable browsers. The Gallery2 Drupal modules allow you to specify a different theme for the Gallery2 pages, so I use the new HTMLated theme for the Gallery2 pages, and my XHTML-compliant theme for the rest of the site. Over time, I can probably add conditional tests to my main theme to check for the presence of Gallery blocks, but what I have is simple and works for now.
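For reference, the XHTML-serving PHP I stripped out of the HTML variant is, in essence, the usual content negotiation check. A minimal sketch, not the theme’s exact code:

<?php
// Send application/xhtml+xml only to browsers that advertise support
// for it in their Accept header; everyone else gets plain text/html.
if (isset($_SERVER['HTTP_ACCEPT']) &&
    strpos($_SERVER['HTTP_ACCEPT'], 'application/xhtml+xml') !== FALSE) {
  header('Content-Type: application/xhtml+xml; charset=utf-8');
}
else {
  header('Content-Type: text/html; charset=utf-8');
}
?>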

Lastly, I redirected the old Planet/Venus-based feed locations to the Burningbird feed. You can still access full feeds from all of my sub-sites, and get full entries for all but the larger stories and books, but the entries at Burningbird will be excerpts, except for Burningbird-only posts. Speaking of which, all of my smaller status updates and general chit-chat will be posted directly at Burningbird; I’m leaving the sub-sites for longer, more in-depth, “stand alone” writings.

As I mentioned earlier, I still have some work with SVG and RDFa to finish before I’m completely done with the redesign. I also have some additional tweaks to make to the existing infrastructure. For instance, I have custom 404, 403, and 410 error pages, but Drupal overrides the 403 and 404 pages. You can redirect the error handling to specific pages, but only to pages within the Drupal system, not to static pages. However, I’m not too worried about this issue, as I’m finding that there’s typically a Drupal module for any problem, just waiting to be discovered.

I know I must come across as a Drupal fangirl in this writing, but after using the application for over a year, and especially after this site redesign, I have found that no other piece of software matches my needs so well as Drupal. It’s not perfect software—there is no such thing as perfect software—but it works for me.

* This process convinced me to switch fully from using Firefox to using Safari. It was so much simpler to fix pages with XHTML errors using Safari than with Firefox’s overly aggressive XHTML error handling.

Categories
Technology

What’s shorter than 140 characters?

What can possibly top Twitter and its immediacy, as well as brevity of contact? I think we found out this week, with Google Wave. Tim O’Reilly describes it as what email would be like if invented today. My first reaction, and judging from other responses not only mine, is that it’s remarkably similar to Ray Ozzie’s Groove, before Groove became little more than a ghost appendage to Microsoft.

Folks immediately started rumbling about “twitter killer”, but I look at it and see the answer to the question, “What can beat out 140 characters?” The answer is, evidently, echoed keystrokes as people make them.

I watched the presentation video (thank you for that, Google). Technologically, Google Wave is intriguing. What was also intriguing was Google’s strong emphasis on HTML5 during the presentation, including a reference to additions to the HTML5 spec. But the part that caught my attention is that Wave is actually echoing keystrokes. I can imagine the following discussion, happening live:

A: I just saw the demo of Google Wave …

B: Oh, yeah, that was terrific

A:….and it sucked

B: Oh, um, well I thought…

A: You liked it! Are you…

B: …it was innovative

A: …cracked?

Google Wave is ADD heroin.

I was thinking about Google Wave yesterday, as I ran the gauntlet known as Watson Street, here in St. Louis. As I dodged little old ladies who pull into the road without looking and the 30-something guy who cut me off when he should have yielded, and contemplated the new ding in my car from some mother’s precious child opening his or her car door too hard and too wide, I began to appreciate what Twitter, Google Wave, blogging, Facebook, and other social media are: alternative communities to real life.

Because in real life, we’re all pricks.

Categories
RDF Standards XHTML/HTML

A Loose Set of Notes on RDFa, XHTML, and HTML5

There’s been a great deal of discussion about RDFa, HTML5, and microdata the last few days, on email lists and elsewhere. I wanted to write down notes of the discussions here, for future reference. Those working issues with RDFa in Drupal 7 should pay particular attention, but the material is relevant to anyone incorporating RDFa.

Shane McCarron released a proposal for RDFa in HTML4, which is based on creating a DTD that extends support for RDFa in HTML4. He does address some issues related to the differences in how certain data is handled in HTML4 and XHTML, but for the most part, his document refers processing issues to the original RDFaSyntax document.

Philip Taylor responded with some questions, specifically about how xml:lang is handled by HTML5 parsers, as compared to XML parsers. His second concern was how to handle XMLLiteral in HTML5, because the assumption is that RDFa extractors in JavaScript would be getting their data from the DOM, not processing the characters in the page.

“If the object of a triple would be an XMLLiteral, and the input to the processor is not well-formed [XML]” – I don’t understand what that means in an HTML context. Is it meant to mean something like “the bytes in the HTML file that correspond to the contents of the relevant element could be parsed as well-formed XML (modulo various namespace declaration issues)”? If so, that seems impossible to implement. The input to the RDFa processor will most likely be a DOM, possibly manipulated by the DOM APIs rather than coming straight from an HTML parser, so it may never have had a byte representation at all.

There’s a lively little sub-thread related to this one issue, but the one response I’ll focus on is Shane’s: that RDFa does not pre-suppose a processing model in which there is a DOM. The issue of xml:lang is also still under discussion, but I want to move on to new issues.

While the discussion related to Shane’s document was ongoing, Philip released his own first look at RDFa in HTML5. Concern was immediately expressed about Philip’s copying of some of Shane’s material in order to create a new processing rule section. The concern wasn’t because of any issue to do with copyright, but because of the problems that can occur when you have two sets of processing rules for the same data and the same underlying data model. No matter how careful you are, at some point the two are likely to diverge, and the underlying data model becomes corrupted.

Rather than spend time on Philip’s specification directly at this time, I want to focus, instead, on a note he attached to the email entry providing the link to the spec proposal. In it he wrote:

There are several unresolved design issues (e.g. handling of case-sensitivity, use of xmlns:* vs other mechanisms that cause fewer problems, etc) – I haven’t intended to make any decisions on such issues, I’ve just attempted to define the behaviour with sufficient detail that it should make those issues visible.

More on case sensitivity in a moment.

Discussion started a little more slowly for Philip’s document, but is ongoing. In addition, both Philip and Manu Sporny released test suites. Philip’s is focused on highlighting problems when parsing RDFa in HTML as compared to XHTML; the one that Manu posted, created by Shane, focused on a basic set of test cases for RDFa generally, but migrated into the RDFa in HTML4 document space.

Returning to Philip’s issue with case sensitivity, I took one of Shane’s RDFa in HTML test cases, and the rdfquery JavaScript from Philip’s test suite, and created pages demonstrating the case sensitivity issue. One such is the following:

<!DOCTYPE HTML PUBLIC "-//ApTest//DTD HTML4+RDFa 1.0//EN" "http://www3.aptest.com/standards/DTD/html4-rdfa-1.dtd">
<html
xmlns:t="http://test1.org/something/"
xmlns:T="http://test2.org/something/"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<head>
<title>Test 0011</title>
</head>
<body>
<div about="">
Author: <span property="dc:creator t:apple T:banana">Albert Einstein</span>
<h2 property="dc:title">E = mc<sup>2</sup>: The Most Urgent Problem of Our Time</h2>
</div>
</body>
</html>

Notice the two namespace declarations, one for “t” and one for “T”. Both are used to provide properties for the object being described in the document: t:apple and T:banana. Parsing the document with an RDFa application that applies XML rules treats the namespaces “t” and “T” as two different namespaces, and has no problem with the RDFa annotation.
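In Turtle, the triples an XML-rules processor extracts look roughly like this (abbreviated; the dc:title XMLLiteral triple is omitted):

@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix t:  <http://test1.org/something/> .
@prefix T:  <http://test2.org/something/> .

<> dc:creator "Albert Einstein" ;
   t:apple    "Albert Einstein" ;
   T:banana   "Albert Einstein" .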

However, using the rdfquery JavaScript library, which treats “t” and “T” the same because of HTML case insensitivity, an exception results: Malformed CURIE: No namespace binding for T in CURIE T:banana. Stripping away the RDFa aspects and focusing on the namespaces, you can see how browsers handle namespace case in an HTML document and in a document served up as XHTML. To make matters more interesting, check out the two pages using Opera 10, Firefox 3.5, and the latest Safari. Opera preserves the case, while both Safari and Firefox lowercase the prefix. Even within the HTML world, the browsers handle namespace case in HTML differently. However, all handle the prefixes the same, and correctly, in XHTML. So does the rdfquery JavaScript library, as this test page demonstrates.

Returning to the discussion, there is some back and forth on how to handle case sensitivity issues related to HTML, with suggestions varying as widely as: tossing the RDFa in XHTML spec out and creating a new one; tossing RDFa out in favor of Microdata; creating a best practices document that details the problem and provides appropriate warnings; or creating a new RDFa in HTML document (or modifying the existing profile document) specifying that all conforming applications must treat prefix names as case insensitive in HTML, possibly cross-referencing the RDFa in XHTML document, which allows case sensitive prefixes. I am not in favor of the first two options. I do favor the latter two options, though I think the best practices document should strongly recommend using lowercase prefix names, and definitely not using two prefixes that differ only by case. During the discussion, a new conforming RDFa test case was proposed that tests based on case. This has now started its own discussion.

I think the problem of case and namespace prefixes (not to mention xmlns as compared to XMLNS) is very much an edge issue, not a show stopper. However, until a solution is formalized, be aware that xmlns prefix case is handled differently in XHTML and HTML. All things being equal, consider using only lowercase prefixes when embedding RDFa (or any other namespace-based functionality). In addition, do not use XMLNS. Ever. If not for yourself, do it for the kittens.

Speaking of RDFa in HTML issues, there is now a new RDFa in HTML issues wiki page. Knock yourselves out.

Update: a new version of the RDFa in HTML4 profile has been released. It addresses some of the concerns expressed earlier, including the issue of case and XMLLiteral. Though HTML5 doesn’t support DTDs, as HTML4 does, the conformance rules should still be good for HTML5.

Categories
Web

Cite not link

I do have considerable sympathy for Thomas Crampton [1], when he discovered that all of his stories at the International Herald Tribune have been pulled from the web because of a merger with the New York Times.

So, what did the NY Times do to merge these sites?

They killed the IHT and erased the archives.

1- Every one of the links ever made to IHT stories now points back to the generic NY Times global front page.

2- Even when I go to the NY Times global page, I cannot find my articles. In other words, my entire journalistic career at the IHT – from war zones to SARS wards – has been erased.

At the same time, though, I don’t have as much sympathy for Wikipedia losing its links to the same stories, as detailed by Crampton in a second posting [2].

The issue: Wikipedia – one of the highest traffic websites on the Internet – makes reference to a large number of IHT stories, but those links are now all dead. They need to delete them all and find new references or use another solution.

As I wrote in comments at Teleread:

I do have sympathy, I know I would be frustrated if my stories disappeared from the web, but at the same time, there is a certain level of karma about all of this.

How many times have webloggers chortled at the closure of another newspaper? How many times have webloggers gloated about how we will win over Big Media?

The thing is, when Big Media is gone, who will we quote? Who will we link? Where will the underlying credibility for our stories be found?

Isn’t this exactly what webloggers have wanted, all along?

I have sympathy for a writer losing his work, though I assume he kept copies of his writings. If they can’t be found in hard copies of the newspaper, then I’m assuming the paper is releasing its copyright on the items, and that Mr. Crampton will be able to re-publish these on his own. That’s the agreement I have with O’Reilly: when it no longer actively publishes one of my works, the copyright is returned to me. In addition, with some of the books, we have a mutual agreement that when the book is no longer published, the work will be released to the public domain.

I don’t have sympathy for Wikipedia, though, because the way many citations are made at the site doesn’t follow Wikipedia’s own citation policy. Links are a lazy form of citation. The relevant passage should be quoted in the Wikipedia article, matched with a citation listing the author, the title of the work, the publication, and the publication date, not a quick link to an external site over which Wikipedia has no control.

I’m currently fixing one of my stories, Tyson Valley, a Lone Elk, and the Bomb, because the original material was moved without redirection. But as I fix the article, what I’m doing is making copies of all of the material for my own reference. Saving the web page is no different than making a photocopy of an article in the days before the web.

In addition, I will be adding a formal citation for the source, as well as the link, so if the article moves again, whoever reads my story will know how to search for the article’s new location. At a minimum, they’ll know where the article was originally found.

I’m also repackaging the public domain writing and images for serving at my site, again with a text citation expressing appreciation to the site that originally published the images.

By using this approach, the stories I consider “timeless”, in whatever context that word means in this ephemeral environment, would not require my constant intervention.

Authors posting to Wikipedia should be doing the same, and this policy should be enforced: provide a direct quote of relevant material (allowed under Fair Use), and provide a formal citation, in addition to the link. Or perhaps, instead of the link. Because when the newspapers disappear, they’ll have no reason to keep the old archives. No reason at all. And then, where will Wikipedia be?

[1] Crampton, Thomas, “Reporter to NY Times Publisher: You Erased My Career”, thomascrampton.com, May 8, 2009.
[2] Crampton, Thomas, “Wikipedia Grappling with Deletion of IHT.com”, thomascrampton.com, May 8, 2009.

Categories
SVG XHTML/HTML

Whipping boy

I noticed a passing Twitter message from Laura Scott: “One word: standards. Firefox follows w3c standards. Internet Explorer does not.” She wrote it in response to another Twitter message from tutu4lu, who was having problems with a web page appearing differently in IE than in Firefox.

It is true that Firefox implements more standards than IE, especially when it comes to some of my favorites, such as SVG. And I appreciate that fact.

Firefox does not necessarily get an A+ for all of its effort, though. In particular, if Microsoft’s lack of implementation of XHTML has been one force against broader use of XHTML at web sites, Firefox’s own handling of XML errors in XHTML is another, more subtle force against it.

Here’s an example. I added a raw ampersand (&) to a URL in one of my posts, which generates an XHTML well-formedness error, since ampersands must be escaped as &amp; in XHTML. The following are three screen shots, from Chrome, Opera, and Safari respectively, that demonstrate how they handle the error:

XHTML error in Chrome
Opera XHTML error
Safari error

Safari and Chrome are both built on WebKit, which handles XHTML errors by parsing, and rendering, the document up to the error. This has the advantage of providing some content, as well as being able to more quickly find the error when you’re debugging.

Opera doesn’t render the document, but it does provide a display of the source, with highlighting where the error occurs. This is extremely helpful when you’re debugging a larger document. In addition, Opera provides an option to render the document as HTML rather than XHTML, which is helpful for everyone else.

Contrast and compare these screenshots with the following, from Firefox.

Firefox error handling

The Firefox XHTML error handling is also known as YSOD, or Yellow Screen of Death. It’s harsh, abrupt, and somewhat punishing in nature, with its sickly yellow background, and bright red text. The message is typically cut off by the edge of the browser window, so one can’t easily see where the error has occurred. It’s most definitely intimidating for readers who accidentally stumble on to an XHTML page currently in a broken state.

All four of the browsers do support the XHTML standard, and all stop processing the XHTML when an error occurs, as is proper. But where Safari/Webkit, Chrome/Webkit, and Opera try to provide a useful web page, Firefox picks up a ruler and gives the owner of the web site a good whacking.

It’s easy to fall into the trap of blaming all web development and design problems on Microsoft and IE, and to use IE as a whipping boy—to the exclusion of looking, critically, at the other browsers in the web space. If the lack of support for XHTML in IE is a primary inhibitor of the spread of XHTML, Firefox’s YSOD has to take the second place prize. Support for XHTML doesn’t end at the parser.