Categories
SVG XHTML/HTML

There’s open and then there’s open

Recovered from the Wayback Machine.

As an example of Microsoft’s new commitment to being more open with web developers, the company is releasing the IE8 beta to invited testers only, with a more general release later. Perhaps by “open”, we don’t all mean the same thing?

I also noticed that the company has not provided any answers to the questions we’ve been asking about the “super standards mode”. In particular, nothing from the company about support for XHTML and SVG/MathML. A simple, “Yes, we’re supporting XHTML” would have added real weight to all those bold pronouncements of openness and standards support this last week. Instead, the company spends its time spreading fooflah and working the community.

As an aside, you know, there’s nothing sadder in nature than a wasp without its stinger.

Categories
RDF Specs SVG XHTML/HTML

Our bouncing baby markup has growed up

Recovered from the Wayback Machine.

On today’s tenth anniversary of the birth of XML, Norm Walsh writes:

I joined O’Reilly on the very first day of an unprecedented two-week period during which the production department, the folks who actually turn finished manuscripts into books, was closed. The department was undergoing a two-week training period during which they would learn SGML and, henceforth, all books would be done in SGML…My job, I learned on that first day, would be to write the publishing system that would turn SGML into Troff so that sqtroff could turn it into PostScript. “SGML”, I recall thinking, “well, at least I know how to spell it.”

Ah yes. “Unix Power Tools” was formatted as SGML, the one and only book at O’Reilly I worked on that wasn’t in a Word format. I must express a partiality to my NeoOffice, though the SGML system was ideal for cross-referencing and indexing. OpenOffice ODT, or OpenDocument text, will be the most likely format for the next UPT. Just another example of the permanent/impermanence of web trends.

Norm also mentions HTML5 possibly being the nail in the coffin of this child of SGML, but as I wrote recently, the folks behind HTML5 have solemnly assured us this specification also includes XHTML5. I’d hate to think we’re giving up on the benefits of XHTML just when they’re finally being realized by a more general audience.

Of course, I’m also fond of RDF/XML, which seems to cause others a great deal of pain, the pansies. And I’ve never hidden my SVG fandom and SVG is based in XML. I must also confess to preferring XML over JSON–you know, good enough for granddad, good enough for me. Atom rules. Or is that, Atom rocks? I’m also sure XML has squeezed between the joints of many of my other applications, and I just don’t know it.

Categories
XHTML/HTML

Adventures in XHTML

Recovered from the Wayback Machine.

During the recent lighthearted discussions revolving around IE8 and its faithful companion, Wonder Tag, a second topic thread broke out about XHTML. As is typical whenever XHTML is brought up, the talk circles around to the draconian error handling, or yellow screen of death, that appears when the browser encounters even a small, harmless-seeming discrepancy in a page’s markup.

However, the yellow screen of death is a factor of how Firefox deals with problems, not handling that’s inherent to serving XHTML as application/xhtml+xml. Safari’s error handling is much less extreme, attempting to render all of the ‘good’ markup up to the point where the ‘bad’ markup occurs.

Opera’s error handling is even more friendly. It provides the context of the error, which makes it the best tool for debugging a faulty XHTML page. You might say Opera is to XHTML, as Firebug is to JavaScript. The browser also provides an option to process the page as a more forgiving HTML.
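The difference between draconian and forgiving handling can be sketched outside the browser. What follows is a minimal Python illustration, not a description of how any browser works: the standard library's XML parser stops at the first well-formedness error and reports where it happened, much as Opera does, while the HTML parser quietly carries on. The sample markup and the function names are made up for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical page fragment: the <p> is closed by </div>, so the tags mismatch.
BROKEN = '<div class="wrap">\n<p>Hello</div>'


def strict_check(markup):
    """Parse as XML; return None if well-formed, else (line, column, message)."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None


class TagCollector(HTMLParser):
    """Forgiving parse: records every start tag and never complains."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def forgiving_check(markup):
    """Feed the markup to the lenient HTML parser and return the tags it saw."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags
```

Calling strict_check(BROKEN) returns the line, column, and message of the first error, which is the kind of context that makes a faulty page quick to debug. forgiving_check(BROKEN) simply lists the tags it saw.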

To return to the discussion I linked earlier, in response to the mention of the draconian error handling, I wrote:

I can agree that the extreme error handling of the page can be intimidating, but it’s no different than a PHP page that’s broken, or a Java application that’s cracked, or any other product that hasn’t been put together right.

To which one of the commenters responded:

I don’t want to get off-topic either but I hear this nonsense a lot. You can’t simply compare a markup language with a programming language. They have very different intended authors (normal people versus programmers) and very different purposes.

I disagree. I believe you can compare a markup with a programming language. Both are based on technical specifications and both require an agent to process the text in a specific way to get a usable response. As with PHP or Java, you have to know how to arrange XHTML in order to get something useful. That HTML has a more forgiving processor than XHTML or PHP doesn’t make it less technical–just inherently more ‘loose’, for lack of a better term.

In my opinion, the commenter, Tino Zijdel, was in error on a second point, as well: markup isn’t specific to programmers. In fact, programmers are no better at markup than ‘normal’ people. Case in point is the error pages I’ve shown in this post.

As most of you are aware, I serve my pages up with the application/xhtml+xml MIME type. For those of you who have tried to access this site using IE, you’re also aware that I don’t use content negotiation, which tests to see if the browser is capable of processing XHTML and returns text/html if not.
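For readers unfamiliar with content negotiation, here is a rough Python sketch of the test a server might make. It is deliberately simplistic and is not what this site or WordPress does: the function name is hypothetical, and it looks only for an explicit application/xhtml+xml entry in the Accept header, ignoring wildcards and the relative ordering of q-values.

```python
def choose_content_type(accept_header):
    """Return application/xhtml+xml if the Accept header lists it with q > 0,
    otherwise fall back to text/html."""
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if media_type != "application/xhtml+xml":
            continue
        quality = 1.0  # a media type without a q parameter defaults to q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unparsable q-value as a refusal
        if quality > 0:
            return "application/xhtml+xml"
    return "text/html"
```

A browser that advertises application/xhtml+xml gets the XML MIME type. One that sends only something like "image/gif, image/jpeg, */*" gets text/html.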

Before yesterday, I still served up the WordPress administration pages as text/html, rather than application/xhtml+xml. Yesterday I threw the XHTML switch on the administration pages as well, and ended up with some interesting results. For instance, both plug-ins I use that have an options page had bad markup. In fact one, a very popular plug-in that publishes del.icio.us links into a post, had the following errors:

  • The ‘wrap’ class name wasn’t in quotes.
  • Five input fields were not properly terminated.
  • The script element didn’t have a CDATA wrapper.
  • Properties such as ‘disabled’ and ‘readonly’ were given as standalone values.
  • Two extraneous opening TR tags.
  • One non-terminated TR element.
  • Two terminating label elements without any starting tag.

For all of that, though, it didn’t take me more than about 15 minutes to fix the page, with a little help from Opera.
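Each kind of mistake in the list above is something any XML parser will flag. As a sketch, the Python below pairs a made-up, one-line reproduction of each error with a corrected version and runs both through the standard library's XML parser. The fragments are my own illustrations, not the plug-in's actual markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical reproductions of the kinds of mistakes listed above,
# each paired with a well-formed correction: (broken, fixed).
SAMPLES = {
    "unquoted class name": ('<div class=wrap></div>',
                            '<div class="wrap"></div>'),
    "unterminated input": ('<p><input type="text"></p>',
                           '<p><input type="text" /></p>'),
    "script without CDATA": ('<script>if (a < b) {}</script>',
                             '<script><![CDATA[if (a < b) {}]]></script>'),
    "standalone attribute": ('<input disabled />',
                             '<input disabled="disabled" />'),
    "extra opening tr": ('<table><tr><tr><td>x</td></tr></table>',
                         '<table><tr><td>x</td></tr></table>'),
    "stray closing label": ('<p>Name</label></p>',
                            '<p><label>Name</label></p>'),
}


def is_well_formed(markup):
    """True if the fragment parses as XML, False on any well-formedness error."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


def report(samples):
    """Map each sample name to (broken_ok, fixed_ok)."""
    return {name: (is_well_formed(bad), is_well_formed(good))
            for name, (bad, good) in samples.items()}
```

Running report(SAMPLES) shows every broken fragment rejected and every fixed one accepted. That is all the browser's error page is telling you.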

The WordPress administration pages work except for the Dashboard, where the version of jQuery that comes with WordPress didn’t seem to handle the Ajax calls to fill the page. I updated jQuery with the latest version, and the feed from the WordPress weblog shows, but not the other two items. At least, not with Firefox 3 or Safari, but all the content does show with Opera.

The Text Control plug-in had one minor XHTML error in the options page, but even when that was fixed, selecting a new text formatting option in the post doesn’t work–the selection goes back to the default. That one will end up being more challenging to fix, because I haven’t a clue what’s stopping the update.

WordPress does a decent job of generating proper XHTML content when using the default formatting. In fact the only problem I’ve had, other than when I embed SVG inline, was my own inaccurate use of markup. I used <code> elements, by themselves, when displaying block code. What I should have used is the <code> preceded by <pre>. When I do, the WordPress default formatting works without problems.

remove_filter('comment_text', 'wpautop', 30);
remove_filter('comment_text', 'wptexturize');
add_filter('comment_text', 'tc_comment');

My error and the errors of the plug-in creators all demonstrate that though programmers might be more familiar with the consequences of making a mistake with technical text, we don’t make fewer mistakes than anyone else when it comes to using web page markup. Our only advantage is we’re not as intimidated by pages with errors. Regardless of how they’re displayed, or of our relative technical expertise, though, these error messages aren’t necessarily a bad thing.

One of the advantages to serving the pages with application/xhtml+xml is that we catch mistakes before we serve the pages up to our readers. We definitely catch the mistakes before we release code that generates badly formed markup, or provide broken option pages to accompany our coded plug-ins. I can’t for the life of me understand why any programmer, web developer, or designer would want less than 100% accuracy from their web pages. That’s tantamount to saying, “Hire me. I write sloppy shit.”

Of course, being able to program can have advantages when working with XHTML, especially with many of today’s applications. WordPress does a good job at working in an XHTML environment, but not a great one. One example of where the application fails, badly, is in the Atom feed.

In Atom, WordPress outputs the HTML type as an attribute to many of the fields:

<summary type="<?php html_type_rss(); ?>">
<![CDATA[<?php the_excerpt_rss(); ?>]]></summary>
<?php if ( !get_option('rss_use_excerpt') ) : ?>

This is all well and good except for one thing: when the type is returned as ‘xhtml’, Atom feeds are supposed to use the following syntax for the content:

<summary type="xhtml"><div xmlns="http://www.w3.org/1999/xhtml">
...</div></summary>

This is an outright error in how the Atom feed is coded in WordPress. I’ve had to correct this in my own feed, and then remember not to overwrite my copy of the code whenever there’s an update. What the code should be doing is testing the type, and then providing the wrapper accordingly.
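The logic of testing the type and then providing the wrapper can be sketched in a few lines. This is Python rather than the PHP of the WordPress template, and the function name is hypothetical. It shows the shape of the fix, not the change I made to my own feed.

```python
from xml.sax.saxutils import escape

XHTML_NS = "http://www.w3.org/1999/xhtml"


def atom_summary(content_type, body):
    """Build an Atom <summary> element for a type of 'xhtml', 'html', or 'text'."""
    if content_type == "xhtml":
        # Assumes body is already well-formed XHTML; Atom requires it to be
        # wrapped in a single div in the XHTML namespace.
        return '<summary type="xhtml"><div xmlns="%s">%s</div></summary>' % (
            XHTML_NS, body)
    # For 'html' and 'text' the content travels as escaped character data.
    return '<summary type="%s">%s</summary>' % (content_type, escape(body))
```

The sketch escapes the 'html' case instead of using a CDATA section, because an excerpt containing "]]>" would break a CDATA section. Either form is read the same way by an XML parser.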

A second issue with WordPress is more subtle, and has to do with that part of XML I don’t consider myself overly familiar with: character sets and encoding. As soon as I switched on XHTML at my old weblog, I started to have problems with certain characters in my comments, and had to adjust the WordPress comment processing to allow for UTF-8 encoding. As it is, I’m not sure that I’ve covered all the bases, though I haven’t had any recurrence of the initial problems.

However, during the XHTML discussion, Philip Taylor demonstrated another problem in the WP code, in this case sending through a couple of characters that the WP search function did not like.

I checked with one of my two XHTML experts, Jacques Distler (the other being Sam Ruby), and the characters were Unicode, specifically:

utf-8 0xEFBFBE = U+FFFE
utf-8 0xEFBFBF = U+FFFF 

From Jacques I found that Philip likes the U+FFFE and U+FFFF Unicode characters because they’re not part of the W3C’s recommended regular expression for filtering illegal characters.

Unfortunately, protecting against these characters in search as well as comments required code in more than one place and, in fact, hacking into the back end of WordPress. This is not an option available to someone who isn’t a programmer. However, this example doesn’t demonstrate that you have to be a coder to serve pages as XHTML–it demonstrates that applications such as WordPress have a ways to go before being technically, rather than just cosmetically, compliant with XHTML.
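To give a sense of what such a filter involves, here is a minimal Python sketch that strips everything outside the Char production of the XML 1.0 specification, which excludes both U+FFFE and U+FFFF. It is neither the W3C expression mentioned above nor the code I added to WordPress, and the function name is made up for the example.

```python
import re

# Complement of the XML 1.0 Char production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
NOT_XML_CHAR = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_non_xml_chars(text):
    """Remove every character that may not appear in an XML 1.0 document."""
    return NOT_XML_CHAR.sub("", text)
```

The filter has to run on every path by which text enters a page (comments, search terms, trackbacks), which is why the fix ended up in more than one place.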

Having said that, I can almost hear the voices now: Why bother, they say. After all, no one uses XHTML, do they?

Why bother? Well, for one thing, XHTML served as XML provides a way to integrate other XML-based specifications into the page content, including in-line SVG, as well as MathML, and even RDF/XML if we’re so inclined. The point is, serving XHTML as XML provides an open platform on which to build. Otherwise, we’re dependent on committees to hash through what will or will not be allowed into a specification, based on one company or another’s agenda.

We can include SVG into a page using an object element, but we can’t integrate something like SVG and MathML together without the ability to include both inline. We certainly can’t incorporate SVG into the overall structure of the page–at least not easily using separate files. There is no room in an HTML implementation for all the other XML-based vocabularies, and we can only cram so much into class attributes before the entire infrastructure collapses.
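As a small illustration of what inline means here, the Python sketch below parses a made-up XHTML page that carries SVG and MathML side by side and lists the namespaces it finds. The page content is invented for the example. The three namespace URIs are the standard ones for XHTML, SVG, and MathML.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: one document, three XML vocabularies, no plug-ins or objects.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>A circle and a fraction, inline:</p>
    <svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
      <circle cx="50" cy="50" r="40" />
    </svg>
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <mfrac><mn>1</mn><mn>2</mn></mfrac>
    </math>
  </body>
</html>"""


def namespaces_used(markup):
    """Return the set of namespace URIs of all elements in the document."""
    root = ET.fromstring(markup)
    found = set()
    for element in root.iter():
        # ElementTree stores qualified names as "{namespace-uri}localname".
        if element.tag.startswith("{"):
            found.add(element.tag[1:].split("}", 1)[0])
    return found
```

Because every element carries its namespace, a processor can tell an SVG circle from an XHTML paragraph without any committee having to bless the combination in advance.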

No, we need both: an HTML implementation for those not ready to commit to an XML-based implementation, and XHTML for the rest of us.

During the recent discussions on IE8, several people asked Chris Wilson from Microsoft whether IE8 will support the application/xhtml+xml MIME type. So far, we’ve not had an answer. Whatever the company decides, though, XHTML is not going away. The HTML5 working draft, which was just released, is about a vocabulary, not a specific implementation of that vocabulary. Both HTML and XHTML implementations are covered in the document, though XHTML isn’t covered as fully because most of the aspects of processing XHTML are covered in other documents. At least, that’s what we’re being told.

What’s critical for the HTML5 effort is that browsers support both implementations. Even the smallest mobile device is not going to be so overburdened by the requirements that it can’t consume pages delivered up as proper XHTML. It’s a sure thing that handling clean markup takes fewer resources than handling a mess.

I’d also hate to think we’re willing to trade well designed and constructed web sites for pages filled with missing TR end tags, poorly nested elements, and unquoted class names, just because Microsoft can’t commit to the spec, and Firefox took the “bailing out now!” approach to error handling.

Categories
Standards SVG XHTML/HTML

Microsoft: Fish, or cut bait

Recovered from the Wayback Machine.

Sam Ruby quotes a comment Microsoft’s Chris Wilson made in another weblog post:

I want to jam standards support into (this and future versions of) Internet Explorer. If a shiv is the only pragmatic tool I can use to do so, shouldn’t I be using it?

Sam responded with an SVG workaround, created using Silverlight–an interesting idea, though imperfect. Emulating one technology/specification using another only works when the two are comparable, and Silverlight and SVG are not comparable. When one specification is proprietary, the other open, there can be no comparison.

There was one sentence of Sam’s that really stood out for me:

You see, I believe that Microsoft’s strategy is sound. Stallstallstall, and generate demanddemanddemand.

Stall, stall, stall, and generate demand, demand, demand. Stalling on standards, creating more demand for proprietary specifications, like Silverlight. Seeing this, how can we be asked to accept, once more, a Microsoft solution and promises that the company will, eventually, deliver standards compliance? An ACID2 picture is not enough. We want the real thing.

Jeffrey Zeldman joins with others in support for the new IE8 meta tag, based on the belief that if Microsoft delivers a standards-based browser with IE8, and companies adopt this browser for internal use, intranets that have been developed specifically to compensate for IE shortcomings will break, and Microsoft will be held liable. According to statements he’s made in comments, heads will roll at Microsoft and standards will be abandoned forever:

…the many developers who don’t understand or care about web standards, and who only test their CSS and scripts in the latest version of IE, won’t opt in, so their stuff will render in IE8 the same way it rendered in IE7.

That sounds bad, but it’s actually good, because it means that their “IE7-tested” sites won’t “break” in IE8. Therefore their clients won’t scream. Therefore Microsoft won’t be inundated with complaints which, in the hands of the wrong director of marketing, could lead to the firing of standards-oriented browser engineers on the IE team. The wholesale firing of standards-oriented developers would jerk IE off the web standards path just when it has achieved sure footing. And if IE were to abandon standards, accessible, standards-compliant design would no longer have a chance. Standards only work when all browsers support them. That IE has the largest market share simply heightens the stakes.

From this we can infer that rather than Pauline, the evil villain (marketing) has standards tied to the railroad tracks and the locomotive is looming on the horizon. If we ride to the rescue of this damsel in distress, though, what happens in the next version of IE? Or moving beyond the browser, the next version of any new product that Microsoft puts out that is supposedly ‘open’ or ‘standards-based’? Will we, again, be faced with the specter that if we rock the boat, those who support standards in Microsoft will face the axe, as standards, themselves, face the tracks? There’s an ugly word for this type of situation. I don’t think it’s in Microsoft’s best interest if we start using this word, but we will if given no other choice.

If Microsoft really wants to make IE8 work–both for its corporate clients and with the rest of us–in my opinion it needs to do two things.

The first is accept the HTML5 DOCTYPE, as a declaration of intention for full standards compliance. Not just support the DOCTYPE, though. Microsoft has to return to the HTML5/XHTML5 work group and participate in the development of the new standard.

The next step is, to me, the most critical Microsoft can take: support application/xhtml+xml. In other words, XHTML. XHTML 1.1 has been a released standard for seven years. It’s been implemented by Firefox, Safari, and Opera, and a host of other user agents. There is no good reason for Microsoft not to support this specification. More importantly, support for XHTML can also be used as a declaration of intentions, in place of the IE8 meta tag.

This is Microsoft meeting us half-way. It gives a little, we give a little. Microsoft can still protect its corporate client intranets, while we continue to protect the future of standards. Not only protect, but begin to advance, because the next specification Microsoft must meet will be support for SVG. Perhaps it can use Silverlight as the engine implementing SVG, as Sam has demonstrated. However, if the company does, it must make this support part of the browser–I’m done with the days of plug-ins just to get a browser to support a five-year-old standard.

Microsoft is asking us to declare our intentions; it’s only fair we ask the same of it. If Microsoft won’t meet us half-way–if the company releases IE8 without support for the HTML5 DOCTYPE or XHTML, and without at least some guarantee as to when we’ll see SVG in IE–then we’ll have our answer. It may not be the answer we want, but it will be the answer we need.

I would rather find out now than at some future time that Microsoft’s support for standards is in name only. At the least, we’ll know, and there will be an end to the stalling.

Categories
HTML5 XHTML/HTML

Soft Strategy

Sam Ruby wrote “Jackass 2.5 is available exclusively on SilverLight” and my first thought was, “Hey! IE 8 must be shipping!” Then I clicked the link and realized he was talking about a movie.

Sam brought up Jackass the movie because of an issue of the video element in the HTML5 specification, and whether user agents should, or should not, be required to support the “free” video compression technique, Ogg Theora. Interesting to see the inner workings of the group. Now what group was this?

Oh, yeah. HTML5. Anyway, Sam also writes:

Fundamentally, Microsoft’s strategy is sound. Ignore standards that you find inconvenient, and focus on producing and enabling the production of content people want. While my humble site can’t compete with the likes of Jackass 2.5, I do have a few people who follow my site. I’ve switched my front page to HTML5 despite the fact that this means that MSIE7 will therefore ignore virtually all CSS. …Perhaps if a few more HTML5 advocates did the same, people would eventually take notice.

I was inspired to go to XHTML, in part, by Sam’s earlier fooling around with SVG and XHTML. So I’ll give HTML5 a shot.

In five, six years. Or so.