December 13th, 2007

What does it take to convert your WordPress weblog to XHTML?

First, the template has to be valid XHTML. One way to check is to run the page through a validator as XHTML before you actually start serving it as XHTML. I use an XHTML 1.1 DOCTYPE that supports MathML and SVG:


<!DOCTYPE html PUBLIC
    "-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN"
    "http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg.dtd">

I also add XHTML, SVG, and XLink namespaces:

<html xmlns="http://www.w3.org/1999/xhtml" 
      xmlns:svg="http://www.w3.org/2000/svg"
      xmlns:xlink="http://www.w3.org/1999/xlink" xml:lang="en">

When you validate the page, the validator will warn you that the DOCTYPE doesn't match the page's MIME type, but this shouldn't affect the validation itself. Just make sure that the validator is treating your page as XHTML.

The validator assumes the page is HTML because, at this point, the page is still served as HTML. WordPress wants to serve pages as HTML. In fact, WordPress fights you every step of the way when it comes to serving your page as XHTML. Luckily, there are nice people who build plug-ins to ensure your page is served as XHTML. However, not every browser supports XHTML. For those limited browsers, we have to serve the pages as HTML. If we don't, the limited browser (that would be IE) has a problem displaying the pages.

Choosing what to serve based on what a browser says it can handle (through its Accept request header) is known as content negotiation. You can implement content negotiation with .htaccess, but this approach doesn't work well with WordPress. Instead, I use the m0n5t3's nest "content negotiation plug-in for Wordpress". I install it, activate it, and it manages the content negotiation for me, serving pages as XHTML to browsers that can handle it and as HTML to those that can't (IE).
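
For the curious, the heart of such a check can be sketched in a few lines of PHP. This is only a rough sketch, not the plug-in's actual code: accepts_xhtml() is a name I made up for illustration, and it assumes a browser that can handle XHTML lists application/xhtml+xml in its Accept header.

```php
<?php
// Hypothetical helper, not taken from the plug-in: returns true when the
// Accept header lists application/xhtml+xml with a non-zero quality value.
function accepts_xhtml($accept)
{
    foreach (explode(',', $accept) as $part) {
        // Split "type;q=0.9" into the media type and its parameters.
        $params = array_map('trim', explode(';', $part));
        $type = strtolower(array_shift($params));
        if ($type !== 'application/xhtml+xml') {
            continue;
        }
        $q = 1.0; // quality defaults to 1 when no q parameter is given
        foreach ($params as $param) {
            if (stripos($param, 'q=') === 0) {
                $q = (float) substr($param, 2);
            }
        }
        return $q > 0;
    }
    return false;
}

// Pick the MIME type for this request; fall back to HTML when in doubt.
$accept = isset($_SERVER['HTTP_ACCEPT']) ? $_SERVER['HTTP_ACCEPT'] : '';
$mime = accepts_xhtml($accept) ? 'application/xhtml+xml' : 'text/html';

// Only send headers when running under a web server and nothing has been output yet.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header("Content-Type: $mime; charset=utf-8");
    header('Vary: Accept'); // tell caches the response depends on the Accept header
}
```

A fuller version would also compare the quality value against the one given for text/html, and would still have to deal with everything WordPress emits on its own, which is why I leave the job to the plug-in.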

To ensure the comments work, I added the following line to wp-comments-post.php before saving the comment:

$comment_content = mb_convert_encoding($comment_content, "UTF-8","auto");

If you've followed my steps so far, congratulations! You're now serving your pages as XHTML. Now, go back through your archives. Be prepared for:

  • Yellow screen of death for Firefox
  • Opera's polite, "You're F**cked!" elegant gray
  • Safari's, "You're hurting me!" page
  • IE is reading the page as HTML, which means it doesn't care that your page is crappy.

I've had a weblog for years, other pages even longer. I have used old HTML, dated HTML, and good HTML, used badly. This means I have a lot of pages that will break when served as XHTML.

There might be *nice, automated applications that can fix all my bad uses of HTML. I've not tried to create such an application, nor have I found one. Instead, I fix pages manually when someone lets me know they've found a broken page. I also have an application that shows me which pages are broken. When I have time, I run it and fix the pages it flags.

The application I use to find bad XHTML pulls the content in from the WordPress database:


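<?php
// Load the WordPress configuration (which sets up $wpdb) and the Akelos validator class.
require_once('./wp-config.php');
require_once('./XhtmlValidator.php');

global $wpdb;

// Fetch the ID and content of every published post, oldest first.
$sql = "SELECT ID, post_content FROM $wpdb->posts
WHERE post_status = 'publish'
ORDER BY ID ASC";

$lines = $wpdb->get_results($sql);
if ($lines) {
   foreach ($lines as $line) {
      $post_id = $line->ID;
      // Wrap the content in a div so the fragment has a single root element.
      $data = "<div>" . $line->post_content . "</div>";
      $XhtmlValidator = new XhtmlValidator();
      // Report only the posts that fail validation.
      if ($XhtmlValidator->validate($data) === false) {
         echo "Post $post_id <br />\n";
         $XhtmlValidator->showErrors();
      }
   }
}

?>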

As you can see from accessing the application, I still have work to do. I make use of a PHP class, XhtmlValidator, from Akelos Framework. It works nicely. Too nicely.
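
If you don't have the Akelos class handy, a rougher check is possible with the DOM extension that ships with PHP. The sketch below is an assumption on my part rather than what my application does: it only tests XML well-formedness (unclosed tags, bad nesting, unescaped ampersands), not validity against the XHTML DTD, so it is a weaker test than XhtmlValidator. The function name is made up for illustration. Because no DTD is loaded, HTML named entities such as &nbsp; get reported as undefined.

```php
<?php
// Hypothetical helper, not part of Akelos: returns a list of XML
// well-formedness errors for a fragment of post content.
function xml_wellformedness_errors($fragment)
{
    // Collect libxml errors instead of letting them surface as PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    // Wrap the fragment in a single root element, as a document requires one.
    $doc = new DOMDocument();
    $doc->loadXML('<div xmlns="http://www.w3.org/1999/xhtml">' . $fragment . '</div>');

    $messages = array();
    foreach (libxml_get_errors() as $error) {
        $messages[] = 'line ' . $error->line . ': ' . trim($error->message);
    }

    // Restore the previous error-handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

// Usage: an unclosed <br> is fine in HTML but breaks the page as XHTML.
foreach (xml_wellformedness_errors('<p>Line one<br>Line two</p>') as $message) {
    echo $message, "\n";
}
```

An empty array means the fragment is at least well-formed, which is enough to keep a browser from showing its error page, though not enough to call the post valid XHTML.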

Of course, the upside to all of this is that my new posts are valid XHTML, or I wouldn't be able to publish them. To keep it that way, I turn off WP formatting for those posts that WordPress formats incorrectly. For instance, I can't use WordPress default formatting when I use CODE elements, because WP wants to insert inappropriate paragraph tags.

Is it work? Yes, but when you're done you know, without a doubt, that all your i's are dotted, your t's crossed. You also know that you can add trees.

[SVG image: Christmas tree, by Aaron Spike]

And cute, cuddly bears.

[SVG image: bear]

And choo-choo trains.

[SVG image: train]

Which, unfortunately, you can't see if you're using IE. They're cute, take my word for it. And semantical, too, thanks to RDF embedded with the image. All allowed, because the page is served up as XHTML.

(SVG images from Wikipedia. Artists: Aaron Spike, Richard Thompson, and Jarno Vasamaa)

(Per Sam Ruby, HTML5Lib should be able to fix the XHTML.)

December 13th, 2007

Fascinating story about Opera's EU antitrust suit. We've been down this road before with Microsoft's bundling of IE into Windows–which I thought was ruled an antitrust violation at one time, with Microsoft instructed to discontinue such efforts, or something fuzzy like that. This is the first suit I know of where one browser maker has basically filed suit against another browser maker for not using standards.

The complaint describes how Microsoft is abusing its dominant position by tying its browser, Internet Explorer, to the Windows operating system and by hindering interoperability by not following accepted Web standards. Opera has requested the Commission to take the necessary actions to compel Microsoft to give consumers a real choice and to support open Web standards in Internet Explorer.

This echoes what I wrote last week, about IE 8:

My take is that IE 8 will not implement any new standards. It might, might, clean up some existing standard support. There will be no support for SVG or XHTML, limited support for ECMAScript and CSS 2.1…Instead, I think Microsoft is going with integrated Silverlight, and more tightly binding IE into the company's desktop, all the while thumbing its nose at the rest of the world. IE is, still, the most used browser, and while it's ahead, Microsoft is going to use this time to delay innovations based on standards the other browsers are implementing.

With the release of Silverlight, it's not difficult to see Microsoft's direction along proprietary paths, at the expense of standards. After all, implementing standards does nothing to help company share prices. As long as the MS fan boys and girls sit at Bill Gates' feet, kissing them rather than holding them over the fire and demanding compliance, Microsoft does not have any reason to support standards.

The problem is that Microsoft has people convinced that if it supported standards, applications would break. Look at what the Guardian wrote:

Consumers would no doubt be delighted if Microsoft suddenly shipped a fully compliant browser and discontinued IE7. That would probably break a large proportion of the sites on the web, and kill e-commerce at a stroke. (No, we shouldn't be in this position. I wish we weren't. But the fact is, we are.)

Developers know, know, this is bullshit. Other than a few kiddies, there isn't one of us who hasn't supported legacy systems while moving application architectures in new directions. Unless Microsoft has the most incompetent developers in the entire world working on IE, the company should be able to develop to new standards while still providing support for legacy systems long enough for people to update their applications.

What's different with the suit this time isn't bundling, so much, as bundling a browser into Windows that breaks compatibility with open market products–all the while seeming to participate in efforts to standardize across browsers. Not only participate, but actively work to control direction–and timing–of standards to which it does not adhere, itself. More tellingly, using its dominant position in the browser marketplace to force such adherence. If that's not antitrust, I don't know what is.

It is the standards support that makes this antitrust lawsuit different, and it is in this area where Microsoft is going to have a difficult time proving its case.

Rather than sitting back, giggling at Opera as the fly who swats back at the giant, we should be lined up in support of the organization, and the organization's efforts. It's frankly obvious, with the release of Silverlight, that Microsoft is choosing a non-standards path for its future development.

Either we support efforts like Opera's, or we just give up and accept the fact that the web is broken, forever.

Update

Mary Jo Foley writes:

Should antitrust courts be the ones in charge of determining which versions of Cascading Style Sheets (CSS), XHTML, Document Object Model (DOM) and other Web standards are the ones to which all browser/Web developers should be writing? Participants in various standards bodies can’t even agree among themselves which version of these standards is the best. How are judges supposed to wade through the browser-standards confusion in a good/fair way?

If the lack of standards support allows Microsoft to advance with its proprietary technology, all the while holding back browsers who are spending their time implementing standards: yes.

Contrary to the pundits, there is much more agreement on standards than some people seem to realize. There are specific releases of specifications, from groups on which MS has representatives. There's also a neutral third-party test, the Acid2 Test, which can be used as a guideline.

As for the "Our customers' applications will break" plaint, Microsoft doesn't even have any support for XHTML or SVG: exactly how would implementing either of these 'break' customer applications?

The real key to this is that Microsoft is on browser standards committees, but isn't committing to implementing the standards. Instead, it is spending time on its own proprietary technologies. Other browser creators are acting in good faith and spending much of their time implementing standards. So where is the problem from a competitive perspective? Windows comes with IE installed, giving it an edge on the other browsers. Because of this, and the still significant IE market share, we can't build our pages to a higher standard, which means we're all still stuck down in the IE basement. All the work the other browser creators are putting into implementing standards is for naught.

That, to me, is a double whammy. That, to me, is antitrust.