Categories
HTML5 Specs

One Table

Next up in my series HTML5: A Story in Progress is a final discussion on the table summary attribute, as well as new discussion on the HTML5 table examples, and the introduction of an HTML5 Primer. Read more at RealTech: One Table in a Thousand.

By providing the differing and, I feel, complementary table documentation techniques and examples, we’re also, indirectly, enabling better data collection with regard to summary in the future. If the issue with the perceived incorrect use of summary is that people don’t understand how to use summary, then in the future we should see correct descriptions of the table using one of the other techniques, without seeing an associated correct use of summary. I hypothesize, though, that we’ll see a positive correlation between correct use of HTML tables and correct use of summary, most likely accompanied by a correct use of caption, or figure legend.

One table in a thousand

Continuing with the discussion on the table element, begun in the last page of this series, another change I would make in section 4.9.2 of the HTML 5 specification is to the example tables. First though, I want to return to the summary attribute one more time, before moving on to other topics.

The decision about summary was based on an analysis of data pulled from web pages scraped from the internet[1]. What’s been ignored in the discussions related to the incorrect use of the summary attribute is that only about one in 1,000 HTML tables reflect correct HTML table use. So, accuracy when it comes to past use of HTML tables is just something that one can’t “accurately” assess. The raw data is interesting, but we can’t draw conclusions from it.

I found that rather than looking at raw data use, if we look at discussions people have had about the table summary attribute, instead, we find that the summary attribute has been taken very seriously, but its use isn’t necessarily showing up on the web, or in Google.

For instance, this bug report is for a CMS and is directly related to an incorrect table summary. The summary use is accurate, but the table structure was altered, and the summary needed to be changed to reflect this alteration. The table summary is also probably one of the better examples of why something like summary is necessary, including the important note that column 1 is not used. A sighted person would see immediately that column 1 is not used, so this information is redundant to the visually enabled. For those people who need to use a screenreader, though, if they didn’t know that column 1 is empty, they would end up getting some variation of “blank” for every cell in that column. The information about the second column is just as valuable, as it informs the person that column 2 has checkboxes, again something that a sighted person would see immediately.
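To make the point concrete, here is a minimal sketch of how a tool could pull the summary text out of a table before presenting any cells, which is roughly the information a screen reader user would hear first. The markup and the summary wording are hypothetical, written to resemble the kind of note described above; they are not the text from the actual bug report, and the helper names are my own.

```python
from html.parser import HTMLParser

# Hypothetical markup: the summary wording is illustrative only and is not
# copied from the bug report discussed in the text.
SAMPLE = """
<table summary="Column 1 is not used. Column 2 contains a checkbox for each row. Columns 3 and 4 list the item name and date.">
  <tr><td></td><td><input type="checkbox"></td><td>Essay 1</td><td>May 4</td></tr>
</table>
"""


class SummaryCollector(HTMLParser):
    """Collect the summary attribute of every table start tag."""

    def __init__(self):
        super().__init__()
        self.summaries = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            # attrs is a list of (name, value) pairs; a missing summary gives None
            self.summaries.append(dict(attrs).get("summary"))


def table_summaries(html):
    """Return one entry per table: the summary text, or None if absent."""
    collector = SummaryCollector()
    collector.feed(html)
    collector.close()
    return collector.summaries


if __name__ == "__main__":
    for text in table_summaries(SAMPLE):
        print(text or "(no summary)")
```

Without the summary, the only way to learn that column 1 is empty is to step through every blank cell in it.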

Yet we won’t see this accurate use of summary on the web, because the CMS is primarily used for educational purposes, and most likely implemented behind a firewall. In fact, it would have to be, because most educational systems have to be protected because of legal issues to do with students and privacy. I worked on systems at both Harvard and Stanford, and they all involve quite complex data tables, and none of the web pages would be available on the internet. When you consider that both universities had government and school mandated accessibility requirements, I’m fairly sure that both would be using summary with data tables, but we wouldn’t see this use in Google.

I found the same discussion about accurate and effective use of the table summary attribute related to intranet use for states and counties, other companies, and other products (Google Search on table summary example accessibility). It could very well be that there is a significant number of good summary uses, but we’ll never directly see them because they’re behind a firewall.

This brings me back to the table examples in section 4.9.2 of the specification. To reiterate, the summary attribute remains. However, so do the other examples of table documentation. Providing examples is a good way not only to help people use HTML tables accurately, but also to reinforce the importance of accessibility. In fact, I would modify and expand the section.

By providing the differing and, I feel, complementary table documentation techniques and examples, we’re also, indirectly, enabling better data collection with regard to summary in the future. If the issue with the perceived incorrect use of summary is that people don’t understand how to use summary, then in the future we should see correct descriptions of the table using one of the other techniques, without seeing an associated correct use of summary. I hypothesize, though, that we’ll see a positive correlation between correct use of HTML tables and correct use of summary, most likely accompanied by a correct use of caption, or *figure legend.
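The kind of data collection I have in mind could start as simply as the sketch below, which tallies how often summary and caption appear together on the same table. It is only an illustration under stated assumptions: the class and function names are my own, the sample markup is invented, and the code measures presence only. Judging whether a summary or caption is correct still needs a human reader.

```python
from html.parser import HTMLParser


class TableDocTally(HTMLParser):
    """Record, per table, whether a summary attribute and a caption are present."""

    def __init__(self):
        super().__init__()
        self.open_tables = []  # stack, so nested tables are tracked separately
        self.tables = []       # finished records

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            summary = dict(attrs).get("summary")
            has_summary = bool(summary and summary.strip())
            self.open_tables.append({"summary": has_summary, "caption": False})
        elif tag == "caption" and self.open_tables:
            self.open_tables[-1]["caption"] = True

    def handle_endtag(self, tag):
        # Tables that are never closed are not counted in this sketch.
        if tag == "table" and self.open_tables:
            self.tables.append(self.open_tables.pop())


def tally(html):
    """Count tables by which documentation techniques they carry."""
    parser = TableDocTally()
    parser.feed(html)
    parser.close()
    counts = {"both": 0, "summary_only": 0, "caption_only": 0, "neither": 0}
    for table in parser.tables:
        if table["summary"] and table["caption"]:
            counts["both"] += 1
        elif table["summary"]:
            counts["summary_only"] += 1
        elif table["caption"]:
            counts["caption_only"] += 1
        else:
            counts["neither"] += 1
    return counts


# Invented sample page with three tables.
SAMPLE_PAGE = """
<table summary="Rows are students, columns are assignments.">
  <caption>Grades</caption><tr><td>A</td></tr>
</table>
<table><tr><td>layout only</td></tr></table>
<table><caption>Totals</caption><tr><td>1</td></tr></table>
"""

if __name__ == "__main__":
    print(tally(SAMPLE_PAGE))
```

If my hypothesis holds, pages with well-formed data tables should land mostly in the "both" bucket rather than "caption_only".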

However, I don’t like the example table, and will replace it. I also believe more example tables are needed, as multiple examples help drive home differences in the table documentation techniques. Unfortunately, adding more examples will make a long specification even longer.

Because of the increased length of the table example section (and example sections elsewhere in the HTML 5 specification), we’ll need to split out the examples into an HTML 5 Primer.

HTML 5 Specification Modifications

To **summarize: The summary attribute is maintained as a viable, active attribute, the existing HTML table examples in section 4.9.2 will be replaced with multiple table examples, all of which will, most likely, be moved to an HTML 5 Primer document.

Additional References

See the HTML/SummaryForTable Wiki page for more details on this topic.

[1] Ian Hickson’s recent email related to the topic.

*I don’t expect to see a lot of use of figure with HTML tables, and I’m not a keen fan of figure use in this way. I’ll cover figure in more detail in a future page.

**No pun intended

When deprecate is obsolete

First up for my HTML5: A Story in Progress is Deprecated is now Obsolete.

It was an elegant process, for an elegant time. We gently pushed the no longer wanted attributes and elements over a hill and out of sight. We don’t have markup hills now, in HTML 5, we have markup cliffs. We haven’t taken attributes and elements over the hill, we’ve taken them to cliffs, and pushed them off. And the little buggers are grabbing hold of page designers and developers, not to mention authoring tools and user agents, to take with them on their way down.

HTML5: A story in progress

The HTML WG continues its endless round of argument. Like Ouroboros, it seems intent on swallowing its own tail, all of which has left me in a quandary: I can’t stand anything to do with the email lists anymore, but I really can’t sit still and let the HTML 5 document be released, as is, without at least attempting to fix problems with the document.

No, let me rephrase that: I can’t sit still and let the HTML 5 document be released without at least making comment on it. I doubt I could have any impact on the document, and I certainly don’t want to continue to be part of the never ending arguments. They depress me. They sap my energy. A vigorous discussion out at the HTML WG email list leaves me wanting to sit in front of my TV, watching old episodes of “I Dream of Jeannie”.

The Chairs have recommended that those of us who want to bring about change, should grab copies of the spec and make the change. Post the changed document out at the W3C site. Put the change up for a vote. But how would this work?

The draft of the HTML5 spec out at the W3C is under continuous change, and the formal Working Draft hasn’t been updated for some time, so we’re having to make changes on the run. To actually make these changes requires a degree of technical proficiency that has nothing to do with HTML markup. We’re told that to propose changes to the document for consideration, we need to, first of all, send our SSH2 public key in to Michael Smith, who will then set us up so that we can check out the existing documents, using CVS. We will then need to use a variety of tools, SVN, CVS, makefiles, XSLT, and so on, just to reach a point where our concerns and suggestions are actually taken seriously.

In other words, if you haven’t been a Unix programmer, be ready for some serious tech sticker shock.

Cameron McCormack provided a solution, in the W3C archives email list, to make the effort easier:

Let’s say I want to work on a branched spec.  I would need to have a
Unix-y environment (so that means Cygwin on Windows) that can execute
these commands, at least: make, perl, python, svn, grep, sed, head,
patch, anolis.  I would download and install Anolis from:

  http://anolis.gsnedders.com/

Then, I check out the HTML WG repository somewhere:

  $ cvs -d :ext:username@dev.w3.org/sources/public co html5

add a directory for my spec:

  $ cd html5
  $ mkdir spec-mccormack
  $ cp spec-template/{*,.cvsignore} spec-mccormack
      (ignore the error about not being able to copy the CVS directory)

initialise it with the current spec source:

  $ cd spec-mccormack
  $ make init

I’d then edit the EDITOR_EMAIL, EDITOR_NAME, EDITOR_AFFILIATION
variables in Makefile.  Also, I’d change THIS_SPEC in Makefile to be set
to the directory I created (in this case, “spec-mccormack”).

Then, to build the spec and check it in:

  $ make
  $ cvs add Makefile header source util.pl Overview.html
  $ cvs commit -m "Initial check-in."

Now I can edit the “source” file and run “make” to regenerate
Overview.html.  To merge in recent changes from Ian’s spec:

  $ make merge

That could fail if the merged changes are to the same parts of the
document that I’ve been editing.  In this case, rejected patch files
named *.rej will be dumped out into the directory.  I’d then merge them
manually, and then indicate that I’ve resolved the conflicts:

  $ make resolved

The ‘header’ file is just a copy of the current document header
(everything before the ToC) from the W3C copy of the spec.  The build
scripts here will modify various parts of this in the generated
Overview.html, which is a “willful violation” of the comments Ian has
included in the spec source. :-)  I’m presuming this is OK since this
isn’t editing Ian’s document.  Ian, let me know if you’d like me to do
less/different munging.

Also note that if you want the images in the spec
(http://dev.w3.org/html5/spec/images/) you’ll need to copy them over
yourself.

Cameron has provided a template directory, but I haven’t been able to check it out to give this effort a try. Notice that you’re checking out from both the WhatWG and the W3C directories, and then checking into the W3C directory. Oh, and if you don’t know what any of this means, well, you can ask for help.

The only problem is, having to ask for help every step of the way puts people at a disadvantage. It already creates a separation between the participants; between those who know Unix, and the magical concepts of CVS and Makefiles, and those who are experts in other things, such as markup, or accessibility, SVG, JavaScript, video controls, and so on.

Manu Sporny is aware of the issue, and has been working on an approach that would make it simpler to check the documents in and out. He just posted an email to the HTML WG that outlined how to use the GIT Repository, which does put most of the effort into the browser. Better, but still intimidating for folks who have never used source code control, source code repositories, makefiles, autoconfig, and so on.

There’s nothing wrong with source code control. I like source code control. And I would expect this level of commitment from people who end up as formal co-editors of a specification, but not necessarily people just wanting to make comments, suggestions, or proposing alternative text. This reliance on source code control and makefiles is, to me, just as much a roadblock as the rest of the HTML 5 process has been, except now we’re trying to shift the “blame” if you will, to technology rather than the HTML WG, the HTML 5 editor, and the W3C.

I am a programmer. I do know CVS and SVN, autoconfig, makefiles and so on, though GIT is a new one for me. My system is set up to run Cameron’s process. I’m sure I could manage Manu’s alternative. If I don’t, it’s not because I can’t, but because I find the whole process to be absurd.

HTML is a web document markup language. It is not a programming language or operating system. It is not WebKit, the Apache project, or the Linux kernel. Why it is being treated as such is because of group demographics. The recommended processes to work through issues are symptomatic of the fact that there is little or no diversity in the HTML 5 working group, virtually none in the WhatWG group. What we have is a working group run by tech geeks: not designers, not accessibility experts, graphic artists, web authors, not even web developers. Hard core, to the metal, geeks. And to a geek, the way around a problem is to throw technology at it; the way to filter input is to use technology as a barrier.

If you have to ask, you’ve already failed the first test.

With all due respect to Manu and Cameron, and Michael Smith at the W3C, for trying to make things simpler, I’m not buying into it. Yes, I am set up to check in and out changes to the W3C directory, and I pulled a copy of the source. But I copied the documents into my space, at Burningbird. What I’m going to do for the next couple of weeks, is go through the document, piece by piece, make note of the problems I see, why I see them as problems, and modify the documents accordingly.

I’m not going to check any of it in, though. I’m not going out to the HTML WG and beg, hat in hand, to be given an audience. That ship has sailed. Have a nice trip! Bring me back a coffee mug!

The only reason I’m doing this work is that there seems to be a belief in the HTML Working Groups that those of us who have expressed concerns to the group aren’t willing to roll our sleeves up and get involved in the nitty gritty; that we’re not willing to match work to words; that we’re all talk, no action; that we’re poseurs. In other words, street cred. Show you can walk the talk! Prove yourself!

I suppose I could just ignore the assumptions, and the people making the assumptions, but it really peeves me that people are using technology as a barrier. I like technology. Technology should enable, not obfuscate. And proving I have street cred isn’t the only reason I’m doing the work. There is another reason, which motivates me more: I’d like to show what my version of HTML 5 would look like, if I had my druthers. Just because.

Of course, while I’m doing this work, I have to put my book writing aside (sorry Simon), and postpone an article on SVG (sorry Carolyn). I’m also not getting paid while I do the work, because, unlike several members of both working groups, I, and many others, aren’t paid to do this work. But that’s OK. I am willing to forgo that new computer I need, or the salmon fillets and fresh fruit I like to serve a couple of times a week because they’re so healthy, and other such luxuries, because I find this whole thing to be so fun. Why else would I spend so much time on this effort, if it weren’t so much fun? Seriously, HTML could be redefined to mean “Hot Time Markup Love”.

Deprecated is now obsolete

A simple change can have profound consequences. What triggered this epiphany was my attempt to return the summary attribute to the HTML table element.

I thought all we would need to do to add summary back is remove the deprecated label from the attribute in the HTML table custom attribute list. But wait a second, when you look at the table element, there are *no custom attributes. All of the previously existing table attributes are now listed in the Obsolete but conforming or the Obsolete and not conforming section of the HTML 5 specification. Summary is joined by cellpadding, cellspacing, frame, rules, bgcolor, align, border, and width. The instructions associated with the latter attributes read, “The following attributes are obsolete (though the elements are still part of the language), and must not be used by authors”.

Now, most of the presentational attributes, such as bgcolor, were deprecated in HTML 4.01, so we had warning that these attributes could be made obsolete in future versions of HTML. Everything is right and proper to make bgcolor obsolete for Table in HTML 5. But what about those attributes, such as width, cellpadding, cellspacing, border, and yes, summary, that were not deprecated in HTML 4.01? Isn’t the proper procedure to, first, deprecate an attribute or element, and then obsolete it in a future version of the specification?

But the summary and other attributes are not deprecated in HTML 5. Instead, they have been tossed directly into the obsolete bin. In fact, if you look for these attributes in the table element definition, you won’t find anything but a toss away sentence for summary in the examples.
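The jump is easier to see when the attributes are laid out side by side. The sketch below records each table attribute's status in HTML 4.01, as I read that specification's table definition, and flags the ones that land in the HTML 5 obsolete list without ever having been deprecated. The dictionary, the set, and the function name are my own illustration, not anything defined by either specification.

```python
# Status of each table attribute in HTML 4.01 (my reading of the 4.01 table
# definition): "active" means defined and not deprecated.
HTML401_TABLE_ATTRS = {
    "summary": "active",
    "width": "active",
    "border": "active",
    "frame": "active",
    "rules": "active",
    "cellspacing": "active",
    "cellpadding": "active",
    "align": "deprecated",
    "bgcolor": "deprecated",
}

# Attributes listed as obsolete for table in the HTML 5 draft discussed here
# (summary sits in the "obsolete but conforming" group).
HTML5_DRAFT_OBSOLETE = set(HTML401_TABLE_ATTRS)


def skipped_deprecation(attr):
    """True if attr went from active in HTML 4.01 straight to obsolete."""
    return (
        HTML401_TABLE_ATTRS.get(attr) == "active"
        and attr in HTML5_DRAFT_OBSOLETE
    )


if __name__ == "__main__":
    for name in sorted(HTML401_TABLE_ATTRS):
        if skipped_deprecation(name):
            print(name, "was never deprecated before becoming obsolete")
```

Only align and bgcolor followed the deprecate-first path; the other seven attributes skipped it.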

I thought the whole purpose behind deprecating language elements was to give web authors notice, when these elements are in widespread usage, that the elements can disappear someday. It gives people notice that they need to be prepared to change their web pages, without yanking the support rug out from under them while they make these changes. Look at the definition for “deprecated” in HTML 4.01:

A deprecated element or attribute is one that has been outdated by newer constructs. Deprecated elements are defined in the reference manual in appropriate locations, but are clearly marked as deprecated. Deprecated elements may become obsolete in future versions of HTML. User agents should continue to support deprecated elements for reasons of backward compatibility.

Definitions of elements and attributes clearly indicate which are deprecated.

This specification includes examples that illustrate how to avoid using deprecated elements. In most cases these depend on user agent support for style sheets. In general, authors should use style sheets to achieve stylistic and formatting effects rather than HTML presentational attributes. HTML presentational attributes have been deprecated when style sheet alternatives exist (see, for example, [CSS1]).

Clear, concise, and everyone understands what deprecate means. User agents still have to support the elements and attributes when they support HTML 4.01. Authors know they eventually have to modify their web pages to remove the attributes and elements, but that they’re still supported until they do.

Now look at the definition for obsolete in HTML 4.01:

An obsolete element or attribute is one for which there is no guarantee of support by a user agent. Obsolete elements are no longer defined in the specification, but are listed for historical purposes in the changes section of the reference manual.

Again, clarity. User agents know they don’t have to support the obsolete elements in HTML 4.01, though they do for older HTML languages. Authoring tools also understand that they should no longer support the attributes in new documents. In addition, there’s a historical reference to the elements/attributes so when we stumble upon the attribute or element in old web pages, we know when it bit the dust, so to speak. Authors definitely know that if they haven’t changed their pages by now, there are no guarantees about how those old, obsolete attributes and elements will be handled by user agents.

Compare and contrast these with “The following attributes are obsolete (though the elements are still part of the language), and must not be used by authors”. Or the statement associated with summary, “Authors should not specify the summary attribute on table elements. This attribute was suggested in earlier versions of the language as a technique for providing explanatory text for complex tables for users of screen readers. One of the techniques described in the table section should be used instead.” Well, are the things obsolete, or not?

There is no continuity to this approach. There is no graceful movement from one version of the markup to the other. Instead, there is an abrupt transition that is guaranteed to leave confusion rippling in its wake.

“The following attributes are obsolete (though the elements are still part of the language), and must not be used by authors”. What does that mean? Does that mean user agents have to support the elements and attributes for an indefinite period of time? How about the summary attribute, which is obsolete but conforming? How can something be conforming and part of a language, and obsolete at the exact same time? Isn’t one aspect of a deprecated (and obsolete) attribute or element that it is replaced by something else? Doesn’t physics preclude two bodies from occupying the same space at the same time?

There’s also a disconnect because the perfectly valid HTML 4.01 attributes are dumped into an also ran bin at the bottom of the HTML5 specification, leaving you wondering where the hell summary, or cellspacing, or cellpadding has gone. In HTML 4.01, when an attribute was deprecated, it was still listed with the element, but labeled deprecated, so you knew what the attribute was (if you see it in an actual page), and not to use it for new pages (deprecated).

It was an elegant process, for an elegant time. We gently pushed the no longer wanted attributes and elements over a hill and out of sight. We don’t have markup hills now, in HTML 5, we have markup cliffs. We haven’t taken attributes and elements over the hill, we’ve taken them to cliffs, and pushed them off. And the little buggers are grabbing hold of page designers and developers, not to mention authoring tools and user agents, to take with them on their way down.

By skipping over the entire concept of deprecation, and diving head first into making these elements and attributes obsolete, we’ve either redefined obsolete, so that it no longer means the same thing (“You’re gone!”), or we’re creating a massive level of uncertainty about how long we have to change our web pages.

My version of HTML 5 will return the concept of deprecated and obsolete to their old HTML 4.01 meaning, so that there is continuity between the specifications. In addition, the attributes that were not deprecated in HTML 4.01, but no longer wanted in HTML 5 will become deprecated in HTML 5, and eventually, possibly, gracefully made obsolete in a future version of HTML.

How will this impact on summary? The point of contention about summary isn’t that everyone loves the attribute and wants it to last forever, but that the accessibility folks want it supported in HTML 5 until something better comes along. Is there something better? The HTML 5 working draft lists various ways a person can document the structure of a table, but none of these ways fulfill the same purpose as summary. If they did, we could deprecate the summary attribute, with a reference to the replacement. But since these alternatives don’t serve the same purpose, summary has to continue as a viable, active attribute until replaced by something else.

The alternative approaches for documenting a table are also viable, and can continue in the specification, but they should be joined by an example demonstrating summary. In addition, summary needs a good description, at least to the same level of other elements and attributes, such as legend, section, and so on.

*Perhaps the fact that all custom attributes have been removed from the table element explains another reason why there’s reluctance to bring back the summary attribute: summary would look odd, hanging out there all by itself. Being the only table element custom attribute would actually emphasize the attribute, making it more likely that people would use it.