Categories
Specs

The W3C bites back?

Recovered from the Wayback Machine.

This has been a long time coming, and not sure where it will go. It started innocuously enough: remove a paragraph associated with the alt attribute, about user agents using some form of heuristics to determine replacement text. It wasn’t associated with a bug—it predated the current decision process. It did have an issue, though, Issue 66.

Consensus was: remove the text. Simple, easy, and absolutely no impact on HTML5.

Except…

Except that Ian Hickson decided to do a little editorializing:

The latest stable version of the editor’s draft of this specification is always available on the W3C CVS server and in the WHATWG Subversion repository. The latest editor’s working copy (which may contain unfinished text in the process of being prepared) contains the latest draft text of this specification (amongst others). For more details, please see the WHATWG FAQ.” — <http://dev.w3.org/html5/spec/>, “Status of this document”.

And what does the WhatWG document now contain?

The W3C has also been working on HTML in conjunction with the WHATWG; at the W3C, this document has been split into several parts, and the occasional informative paragraph or example has been removed for technical or editorial reasons. For all intents and purposes, however, the W3C HTML specifications and this specification are equivalent (and they are in fact all generated from the same source document). The minor differences are:

  • Instead of this section, the W3C version has a different paragraph explaining the difference between the W3C and WHATWG versions of HTML.
  • The W3C version refers to the technology as HTML5, rather than just HTML.
  • Examples that use features from HTML5 are not present in the W3C version since the W3C version is published as HTML4 due to W3C publication policies.
  • The W3C version includes a redundant and inconsistent reference to the WCAG document.
  • The W3C version omits a paragraph of implementation advice for political reasons.

So the W3C specification contains a link to the WhatWG document, which contains a rather inflammatory statement, prompting Sam Ruby, the HTML5 WG co-chair, to write:

To have the W3C specification refer readers to another specification for an exact list of differences, and to have that other specification indicate that the omission was due to political reasons is intolerable.

Earlier this afternoon, I filed a Formal Objection against publication of the HTML5 heartbeat documentation, until all references to the WhatWG version of the specification, and the WhatWG mailing list, are removed from the W3C document. That this state has continued for so long is unconscionable. What, on earth, was the W3C thinking?

I was pleased to see that two of the HTML5 co-chairs, Paul Cotton and Sam Ruby, have stopped the heartbeat publication of the HTML5 specification, until the offending material is withdrawn. This effort was not prompted by my Formal Objection—the chairs are reacting to the editorial changes.

Even now, the HTML5 editor, Ian Hickson, is demanding an accounting of decisions that satisfy him. I wonder how many fan boys will rush to provide him support, or whether the browser companies finally start to realize that he is more wall, than doorway.

Regardless, if you’re thinking that the WhatWG will pull out of the HTML5 effort, and doom it by their lack of participation, think again: the WhatWG organization is not a legal entity. It is an informal group of a handful of individuals, half of whom became disillusioned with the effort years ago and have not participated since.

The group does not have a patent policy, so there is no way possible that Apple would allow something such as Canvas to be managed under the auspices of the WhatWG. The group can’t even support a copyright license, because there is no legal entity to act as holder of copyright. Not unless you count the person who provides the WhatWG server, who restarts the WhatWG server when it’s down, and who has complete editorial control over the WhatWG HTML5 document: Ian Hickson. And I don’t think even Google could tolerate handing that much power over to one, single individual.

The W3C is also not a legal entity BUT…members of the W3C organizations have to enter into a contract and agreement with the three legal entities behind the W3C: MIT, ERCIM, and Keio University. Though frequently contentious, and overly bureaucratic at times, the W3C effort is the only effort where people other than Ian Hickson may have a say about what is, or is not, in the HTML5 specification. Just as important, the W3C provides the legal stability that allows all of the browser companies to freely, and safely, participate.

Categories
HTML5 Specs

Microsoft’s proposed namespace distributed extensibility in HTML5

Per Sam Ruby, Microsoft has submitted a proposal for distributed extensibility in HTML5, which features the use of namespaces.

The proposal uses reverse DNS names, but other than those ugly sons-of-bitches, it looks promising. There are some issues, including no support for innerHTML on namespaced elements, because they would end up defined as Element, not HTMLUnknownElement, but I don’t think that’s going to be a real problem in the wild. More importantly, I believe the proposal would handle the problems I noted in my last writing, about the valid use of namespaced elements and attributes in SVG. The issue with namespaced elements in SVG isn’t a made-up problem or one that is unlikely to occur: it is a real problem, it will occur in the future, and it does require real solutions. The problem is not going to go away because Ian Hickson clicks ruby shoes together and murmurs “There’s no such thing as namespaces…there’s no such thing as namespaces…” This absolute refusal to acknowledge something that has existed on the web for a decade is, frankly, unconscionable.

I wish I could say that the happy campers of the HTML WG are willing to at least enter into an open and unbiased discussion on the proposal, but I stopped believing in fairy tales, a long time ago. There is contention on this issue (namespaces for distributed extensibility), as noted in the past, in the current discussion, in the HTML WG bug database, related to the new RDFa-in-HTML proposal. Needless to say, the WhatWG members have responded in their typical, open and mature manner.

But here’s some cold, hard reality for Ian et al: this isn’t a proposal from folks like me, who have little say, and no power. This proposal comes from Microsoft. Microsoft, who still maintains a dominant position when it comes to browser use in the world. The HTML5 editor cannot simply ignore the proposal, pretend he doesn’t understand it, or rubber stamp it WONTFIX. Not this time.

Categories
Semantics

Maxwell’s Silver Hammer: RDFa and HTML5’s Microdata

Being a Beatles fan, I must admit to being intrigued about the new Beatles box set that will be available in September. I have several Beatles albums, but not all. None of the CDs I own have been re-mastered or re-mixed, including one of my favorite songs, from Abby Road: Maxwell’s Silver Hammer:

Joan was quizzical; Studied pataphysical
Science in the home.
Late nights all alone with a test tube.
Oh, oh, oh, oh.

Maxwell Edison, majoring in medicine,
Calls her on the phone.
"Can I take you out to the pictures,
Joa, oa, oa, oan?"

But as she's getting ready to go,
A knock comes on the door.

Bang! Bang! Maxwell's silver hammer
Came down upon her head.
Bang! Bang! Maxwell's silver hammer
Made sure that she was dead.

I love the chorus, Bang! Bang! Maxwell’s silver hammer came down upon her head…

Speaking of Bang! Bang! Jeni Tennison returned from vacation, surveyed the ongoing, and seemingly unending, discussion on RDFa as compared to HTML5’s Microdata, and wrote HTML5/RDFa Arguments. It’s a well-written look at some of the issues, primarily from the viewpoint of a heavy RDFa user, working to understand the perspective of an HTML5 advocate.

Jeni lists all of the pushback against RDFa that I’m aware of, including the reluctance to use namespacing, because of copy and paste issues, as well as the use of prefixes, such as FOAF, rather than just spelling out the FOAF URI. Jeni also mentions the issue of namespaces being handled differently in the DOM (Document Object Model) when the document is served as HTML, rather than XHTML.

The whole namespace issue goes beyond just RDFa, and touches on the broader issue of distributed extensibility, which will, in my opinion, probably push back the Last Call date for HTML5. It may seem like accessibility issues are the real kicker, but that’s primarily because no one wants to look at the elephant in the corner that is extensibility. Right now, Microsoft is tasked to provide a proposal for this issue—yes, you read that right, Microsoft. When that happens, an interesting discussion will ensue. And unlike other issues, whatever happens will take more than a few hours to integrate into HTML5.

I digress, though. At the end of her writing, Jeni summarizes her opinion of the RDFa/namespace/HtmL5/Microdata situation with the following:

Really I’m just trying to draw attention to the fact that the HTML5 community has very reasonable concerns about things much more fundamental than using prefix bindings. After redrafting this concluding section many times, the things that I want to say are:

  • so wouldn’t things be better if we put as much effort into understanding each other as persuading each other (hah, what an idealist!) so we will make more progress in discussions if we focus on the underlying arguments so we need to talk in a balanced way about the advantages and disadvantages of RDF or, in a more realistic frame of mind:
  • so it’s just not going to happen for HTML5
  • so why not just stop arguing and use the spare time and energy doing?
  • so why not demonstrate RDF’s power in real-world applications?

My own opinion is that I don’t care that RDFa is not integrated into HTML5. In fact, I don’t think RDFa belongs in HTML5. I think a separate document detailing how to integrate RDFa into HTML5, as happened with XHTML, is the better approach.

Having said that, I do not believe that Microdata belongs in the HTML5 document, either. The HTML5 document is already problematical, bloated, and overly complex. It encompasses too much, a fault of the charter, as much as anything else. Removing the entire Microdata section would help, as well as several other changes, but we’ll focus on the Microdata section for the moment.

The problem with the Microdata section is that it is a competing semantic web approach to RDFa. Unlike competition in the marketplace, competition in standards will actually slow down adoption of the standards, as people take a sit-back and see what happens, approach. Now, when we’re finally are seeing RDFa incorporated into Google, into a large CMS like Drupal 7, and other uses, now is not the time to send a message to people that “Oops, the W3C really doesn’t know what the fuck it wants. Better wait until it gets its act together. ” Because that is the message being sent.

“RDFa and Microdata” is not the same as “RDFa and Microformats”. RDFa, or I should say, RDF, has co-existed peacefully with microformats for years because the two are really complementary, not competitive, specifications. Both can be used at a site. Because Microformat development is centralized, it will never have the extensibility that RDF/RDFa provides, and the number of vocabularies will always, by necessity, be limited. Microformats, on the other hand, are easier to use than RDFa, though parsing Microdata is another thing. They both have their strengths and weaknesses. Regardless, there’s no harm to using both, and no confusion, either. Microformats are managed by one organization, RDFa by the W3C.

Microdata, though, is meant to be used in place of RDFa. But Microdata is not implemented in any production capable tool, has not been thoroughly checked out or tested, has not had any real-world implementation that I know of, has no support from any browser or vendor, and isn’t even particularly liked by the HTML WG membership, including the author. It provides only a subset of the functionality that RDFa provides, and necessitates the introduction of several predefined vocabularies, all of which could, and most likely will, end up out of sync with the organizations responsible for the extra-HTML5 vocabulary specification. And let’s not forget that Microdata makes use of the reversed DNS identifier that sprang up, like a plague of locusts, in HTML5, based on the seeming assumption that people will find the following:

com.example.xn--74h

Easier to understand and use then the following:

http://example.com/xn--74h

Which, heaven knows, is not something any of us are familiar with these last 15-20 years.

RDFa and HTML5/Microdata, otherwise known as Issue 76 in the HTML 5 Tracker database. I understand where Jeni is coming from when she writes about finding a common ground. Finding common ground, though, presupposes that all participants come to the party on equal footing. That both sides will need to listen, to compromise, to give a little, to get a little. This doesn’t exist with the HTML5 effort.

Where the RDFa in XHTML specification was a group effort, Microdata is the product of one person’s imagination. One single person. However, that one single person has complete authorship control over the HTML 5 document, and so what he wants is what gets added: not what reflects common usage, not what reflects the W3C guidelines, and certainly not what exists in the world, today.

While this uneven footing exists, I can’t see how we can find common ground. So then we look at Jeni’s next set of suggestions, which basically boil down to: because of the HTML WG charter, nothing is going to happen with HTML5, so perhaps we should stop beating our heads against the wall, and focus, instead, on just using RDFa, and to hell with HTML5 and microdata.

Bang! Bang!

I am very close to this. I had started my book on the issues I have with HTML5, and how I would change the specification, but after a while, a person gets tired of being shut out or shut down. I’m less interested in continuing to “bang my head against the wall”, as Jeni so eloquently put it.

But then I get an email this week, addressed to several folks, asking about the introduction of Microdata: so what does the W3C recommend, then? What should people use? Where should they focus their time?

Confusion. Confusion because the HTML5 specification is being drafted specifically to counter several initiatives that the W3C has been nurturing over the last decade: Microdata over RDF/RDFa; HTML over XHTML; Reverse DNS identifiers over namespaces, and URIs; the elimination of non-visual cues, not only for metadata, but also for the visually challenged. And respect. There is no respect for the W3C among many in the HTML Working Group. And I know I lose more respect for the organization the closer we get to HTML5 Last Call.

In fact, HTML Working Group is a bit of a misnomer. We don’t have HTML anymore, we have a Web OS.

We don’t have a simple HTML document, we have a document that contains the DOM, garbage collection, the Canvas object and a 2D API, a definition for web browser objects, interactive elements, drag and drop, cross-document communication, channel messaging, Microdata, several pre-defined vocabularies, probably more JavaScript than the ECMAScript standard, and before they were split off, client-side SQL, web worker threads, and storage. I’m sure there’s a partridge in a pear tree somewhere in there, but I still haven’t made it completely through the document. It’s probably in Section 10. I know there’s talk of extending to the document to include a 3D API, and who knows what else.

There’s a lot of stuff in HTML5. What isn’t in the HTML5 document is a clean, straightforward description of the HTML or XHTML syntax, and a clearly defined path for people to move to HTML5 from other specifications, as well as a way of being able to cleanly extend the specification—something that has been the cornerstone of both HTML and XHTML in the past. There’s no room for the syntax, in HTML5. It got shoved down by Microdata and the 2D API. There’s no room for the past, the old concepts of deprecated and obsolete have been replaced by such clear terms as “Conforming but obsolete”. And there’s certainly no room for future extensibility. After all, there’s always HTML6, and HTML7, …, HTMLN—all based on the same open, encompassing attitude that has guided HTML5 to where it is today.

If we don’t like what we see, we do have options. We can create our own HTML5 documents, and submit “spec text” for a vote. But what if it’s the whole document that needs work? That many of the pieces are good, but don’t belong in the parent document, or even in the HTML WG?

The DOM should be split out into its own section and should take all of the DOM/interactive, and browser object stuff with it. The document should be re-focused on HTML, without this mash-up of HTML syntax, scripting references, and API calls that exists now. The XHTML section should be fleshed out and separated out into its own section, too, if no other reason to perhaps reassure people that no, XHTML is not going away. We should also be reminded that XHTML is not just for browsers—in fact, the eBook industry is dependent on XHTML. And it doesn’t need Canvas, browser objects, or drag and drop.

Canvas should also be split out, to a completely separate group whose interest is graphics, not markup. As for Microdata, at this point, I don’t care if Microdata is continued or not, but it has no place in HTML5. If it’s good, split it out, and let it prove itself against RDFa, directly.

The document needs cleaning up. There are dangling and orphaned references to objects from Web Workers and Storage still littering the specification. It hops around between HTML syntax and API call, with nothing providing any clarity as to the rhyme or reason for such jumping about. Sure there’s a lot of good stuff in the document, but it needs organization, clean up, and a good healthy dose of fresh air, and even a fresher perspective.

Accessibility shouldn’t be added begrudgingly, woodenly, resentfully. It should be integrated into the HTML, not just pasted on in order to quiet folks because LC is coming up.

The concepts of deprecated and obsolete should be returned, to ensure a sense of continuity with HTML 4. And no, these did not originate with HTML. In fact, the use of deprecated and obsolete have been fairly common with many different technologies. I can guarantee nothing but the HTML5 document has a term like “conforming but obsolete”. I know, I searched high and low in Google for it.

And we need extensibility, and no, I don’t mean Microdata and reverse DNS identifiers. If extensibility was part of the system, folks who want to use RDFa could use RDFa, and not have to beg, hat in hand, to be allowed to sit at the HTML 5 table. This endless debate wouldn’t be happening, and everyone could win. Extensibility is good that way. Extensibility has brought us RDFa, SVG, MathML, and, in past specifications, will continue to bring whatever the future may bring.

whatever the future may bring…

Finding common ground? Walk a mile in each other’s moccasins? Meet mano a mano? Provide alternative specification text?

Bang! Bang!

Jeni’s a pretty smart lady.

Categories
HTML5 Semantics

Separating presentation from semantics

After all these years, we have finally reached the point where we’ve separated page organization from presentation, and now we’re about to embark on the same mistakes again, but this time with presentation and semantics.

I’ve been following the issues associated with the vocabindex Drupal module, including one where the person submitting the bug stated the vocabindex use of UL was incorrect. We’re supposed to, MXT writes, use definition lists rather than unordered lists for any lists of terms with associated definition.

At first glance, it does seem as if the vocabindex module is using the unordered list incorrectly. After all, look at any of my category pages (such as the one for the Semantic Web)—what you see is a list of “terms” and their associated definitions. An obvious candidate for definitions lists.

Look more closely, though. In my sidebar menu I list the vocabindex terms as links to web page URIs, but there is no definition attached to any item. The description, if one is given, is, instead, added as a title attribute to the item and displayed only when the item has cursor focus. Yet, it’s the same data. Does this mean, then, I’m somehow not properly displaying my menu items? Should I have a huge sidebar, with the item description given underneath?

More importantly, whether I have a given text description for each item is purely optional, some do, some don’t. Yet the items in the list have meaning without any associated description. In fact, each item in the list is really nothing more than a label for a bucket to hold content. I could just as easily use foobarsillyputty as labels, except that I’m trying to use “meaningful” labels in order to enable you all to better find past content.

In the absolutely ancient W3C page where lists are covered a list of ingredients is given as an example of an unordered list:

  • 1 cup sugar
  • 1 cup oil
  • 2 cups flour

However, I don’t see that anyone would have a problem with adding parenthetical information to this list in order to further clarify the items:

  • 1 cup sugar (light brown granulated by C & H)
  • 1 cup oil (canola or corn, but not olive)
  • 2 cups flour (white or mixed white and whole wheat)

This is really nothing more than what the vocabindex is doing with the vocabulary index terms within both the index pages and my sidebar menu: a listing of items and a (parenthetical) description to clarify what that item is. However, it’s only when the description is displayed as a “tooltip” that one sees the item as a clarification phrase, only. When presented in the index pages, it “looks” like a definition list, and so we want to have it marked up this way— mixing up semantics and presentation.

A definition list is assumed to have two pieces of information; the term or phrase and an associated definition. Even when not present, the definition is still assumed to be forthcoming at some point— not having it is the exception, not the rule. An unordered list is just that: a list of items. They can be a list of items to buy at the store, select from in a form, or click on in a web page. There’s no assumption that any additional information is necessary for the item. If there was, Drupal would make this information mandatory rather than optional. The application and associated developers would definitely discourage the use of the items in a tag cloud or other format where only the term is given because the term, by itself, would be meaningless.

Yet we look at how the terms are portrayed in a page like the vocabindex page I linked above, and that’s enough for us to say we should use one form of markup over another because it’s more “semantical”. Further exploration online at other sites who attempt to define the differences between unordered lists and definition lists shows the same thing: if we see two pieces of data, we’re assuming a definition list, because that’s what it looks like—not what it is.

The HTML5 document adds another key element to the discussion of definition lists by stating that definition lists are name-value pairs. From this can we deduce, then, that the name has no meaning without the value. That’s my interpretation: that a definition list is the proper semantic markup only when data is defined within a context of names and associated values, both of which are meaningless without the other. If, however, *HTML5 allows us to list the names without values, or the values without the names, then the HTML5 document is imprecise, and we should just use whatever we want to use— semantics can not be derived from imprecision.

Currently, in Drupal, vocabulary terms are discrete labels, nothing more. Any description is for clarification not definition, and isn’t essential to the meaning of the term. Forget how the vocabindex pages “look”, and focus on what the data means. If we can’t do that, then this whole semantic markup thing is a bit of a farce, really.

*Confirmed: it doesn’t