The HTML5 Document structure

If you’ve seen the HTML4 specification and are expecting something similar, you’re in for a surprise and not necessarily a pleasant one. The HTML5 document differs significantly.

The HTML5 document not only includes the markup syntax you would expect but it also includes the Document Object Model (DOM) for the HTML elements, as well as the parsing algorithms recommended for applications such as browsers. This leaves us with one very big document that is geared to multiple audiences. What you have to do, then, is dig through the document for the bits that interest you.

From the top:

Section 1: The Introduction

The first section provides an overview of the HTML5 specification and should be read by all audiences. It provides some background information and sets the stage for what follows. It also provides a description of the typographical conventions used in the document.

Two phrases appear in this section and others: normative and non-normative. Non-normative means that the material is provided for informational purposes but isn’t definitive. The normative material is the rules and requirements.

Section 2: Common Infrastructure

This section is geared more towards application builders, primarily those building HTML parsers, browsers, or utilities. It should also be of interest to web application builders because it introduces the DOM. Folks interested in HTML elements and their syntax may find most of this section a bit difficult to plod through. However, one section, Section 2.2.2 covering Extensibility, is of interest to authors, as are the sections about general syntax for items such as colors.

Another phrase is introduced in this section: non-conforming. It means just what it says: if you do something that’s non-conforming, your document won’t be considered conforming. Oh, your page should still work fine in browsers and other applications, but you’ll get lots of grumbles when you run the page through a validator. I’ll have more later on conforming, as compared to deprecated and obsolete.

Section 3: Semantics, Structure, and APIs of HTML Documents

Section 3.1 is focused more on web application developers and browser builders, so web authors and designers can probably give it a pass. Section 3.2 gets into the structure of HTML elements, and should be read by everyone since it provides basic background for all of the HTML elements. Section 3.2 also contains a reference to the ARIA (Accessible Rich Internet Applications) mappings, so it is especially important that those folks interested in accessibility carefully review the material.

For those interested in accessibility, I also recommend that you check out another document, HTML5: Techniques for providing useful text alternatives. And do check back, in case some of the ARIA material ends up in a separate document, or is significantly re-written. In addition, expect either a new document or section on media accessibility for the new media elements. Support for accessibility is one area of expected change in the HTML5 document in the next several months.

Sections 3.3 through 3.5 are again focused on web developers rather than authors and designers.

Section 4: The Elements of HTML

Section 4 is the money section of the HTML5 document, providing a list of all the HTML elements. Everyone needs to go through this section, but not every component of the section needs review. As I mentioned earlier, the HTML5 specification merges the DOM APIs in with the HTML syntax, so the coverage in this section skips between audiences.

As a rule of thumb, the element syntax and use are covered in the first part of the sub-section related to the element, while the API is covered in the latter part. Many of the elements, such as nav, section, and article, don’t have unique API needs, so there isn’t an associated DOM section; they all share the API for one object, HTMLElement. The API for HTMLElement was covered earlier, in Section 3.2.

However, some elements, such as the new audio and video elements, do have their own unique DOM needs, which are covered following the discussion about the element syntax. Note, though, that web application functionality is interwoven with the syntax descriptions, so expect to have to jump about a bit. Regardless, web authors should read through the entire section and just ignore the programmatic bits.

There are some new form elements, but most of the new form functionality comes from newly added form input types, which are covered in Section 4.10.7. It’s a bit tricky going through this section, as implementation of the new form functionality is sporadic, at best. Wikipedia has a table that shows which form input type is implemented in which browser (among other things). What you can look for in this section are inconsistencies that would normally be driven out in implementation (or that could create havoc if implemented as is).

There are also new form attributes, such as autofocus, placeholder, and the like. Most of the new attributes have been implemented in at least a couple of browsers.
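To make the new input types and attributes a little more concrete, here is a minimal sketch in Python. It scans a sample form and reports which HTML5 additions it uses. The sample markup, the class name NewFormFeatureFinder, and the two lookup sets are all made up for illustration, and the sets are deliberately incomplete. Section 4.10.7 remains the authoritative list.

```python
from html.parser import HTMLParser

# Some of the input types introduced by HTML5 (illustrative, not exhaustive).
NEW_INPUT_TYPES = {
    "search", "tel", "url", "email", "date", "month", "week", "time",
    "datetime-local", "number", "range", "color",
}
# A few of the new form attributes (again, not exhaustive).
NEW_ATTRIBUTES = {"autofocus", "placeholder", "required", "pattern"}


class NewFormFeatureFinder(HTMLParser):
    """Collects the HTML5 input types and attributes used on <input> tags."""

    def __init__(self):
        super().__init__()
        self.types = []
        self.attributes = []

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        # Boolean attributes such as autofocus arrive with a value of None.
        for name, value in attrs:
            if name == "type" and value and value.lower() in NEW_INPUT_TYPES:
                self.types.append(value.lower())
            elif name in NEW_ATTRIBUTES:
                self.attributes.append(name)


# Hypothetical sample form, written only for this example.
SAMPLE_FORM = """
<form>
  <input type="text" name="name" placeholder="Your name" autofocus>
  <input type="email" name="email" required>
  <input type="range" name="volume" min="0" max="10">
</form>
"""

finder = NewFormFeatureFinder()
finder.feed(SAMPLE_FORM)
print(finder.types)       # ['email', 'range']
print(finder.attributes)  # ['placeholder', 'autofocus', 'required']
```

A browser that doesn’t recognize an input type treats the field as a plain text field, which is why forms using the new types still work, if less helpfully, where support is missing.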

Section 4.11 is labeled “Interactive Elements” and web authors may assume this section is about JavaScript-related functionality and think to skip it. However, this section covers many of the new declarative elements, which require no JavaScript. Declarative elements are ones where the functionality is baked into the element rather than implemented with code. As with form input types, though, many of these new declarative elements have not yet been implemented by any browser.

There’s a section titled “Common idioms without dedicated elements”. Oh, I’m really trying to get rid of this section. It’s best to let usage derive organically, rather than try to dictate it within a web specification. Feel free to read the section, but remember that all of the material in this section is suggestion only.

There’s also a document you may want to glance through that covers just the elements and their syntax, HTML: The Markup Language Reference. It might be a lot easier to read, though you should review the HTML5 document for bugs and concerns during this Last Call process.

Another document of interest is the HTML Canvas2D Context, which covers the guts, if I may use such an indelicate term, for the Canvas element. If you’re interested in working with Canvas, you’ll definitely want to check out this document.

Section 5: Loading Web Pages

This section and the next are actually old friends that are under new management. If you’ve worked with JavaScript applications in the last ten years, you know about the Window object, History, and so on. You also know that this functionality has never been standardized. Well, it is in HTML5.

It’s good to see this functionality finally standardized. Unfortunately, and there’s little disagreement about this, it really doesn’t fit within an HTML5 specification. However, trying to separate it out from the document was something none of us wanted to attempt—not without a guarantee that the effort wouldn’t be disregarded and rejected by the powers-that-be.

This section is for web and application developers. The page load processing model might be of interest to web authors and designers curious about how it works, but the focus of this section really is on developers. And web application developers should give this section a very careful perusal, because they may find some of their existing web applications no longer working as expected because of what’s documented in this section.

Section 6: Web Application APIs

The title tells you what you need to know, upfront: this is for developers. These are the remaining components of what used to be known as the Browser Object Model, and the section also includes an overview of event handling, events, and timers. Web authors and designers can also give this section a pass.

Section 7: User Interaction

Everyone is going to want to go through Section 7. It describes new attributes, such as hidden, and standardized versions of functionality, such as drag and drop and contenteditable. What’s interesting about this section is that we’re looking at a blend between functionality and structure, which is going to create a challenge for those building web sites. Should the structure components be added by designers, such as the use of the hidden attribute, or the drag and drop attributes? Or do the application developers add these in using JavaScript when the pages are loaded?

Web page authors should go through the section to understand what you can do with a web page. Web application developers should peruse it more carefully, to review the functionality for possible gotchas and future problems.

Section 8: The HTML Syntax

This section provides an overview of the HTML syntax, but the description is given from a browser parsing perspective, rather than in language geared to web authors, or even web developers. If you’re a web author or developer, do feel free to look through the section, but if you find yourself losing interest in the first few paragraphs, you can probably skip this section.

The section that might interest you the most is Section 8.1.2.4, which covers optional tags. This tells you when you can or cannot omit tags. It’s not easy to follow, but it can help you determine why you’re having problems with your documents, and why you’re getting all those errors and warnings when you test your documents in a validator.
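As an illustration of just one of the optional tag rules, the Python sketch below handles the li end tag. That tag may be omitted when the li is immediately followed by another li, or when there is no more content in the parent element. The sketch writes the omitted end tags back out. It is a deliberate simplification and not the parsing algorithm from the specification: it assumes flat, non-nested lists and drops attributes. The names ImpliedLiCloser and write_out_li_end_tags are hypothetical.

```python
from html.parser import HTMLParser


class ImpliedLiCloser(HTMLParser):
    """Re-emits markup with any omitted </li> end tags written out.

    Simplifications: flat (non-nested) lists only, attributes are dropped.
    """

    def __init__(self):
        super().__init__()
        self.out = []
        self.li_open = False

    def handle_starttag(self, tag, attrs):
        if tag == "li" and self.li_open:
            # A new <li> implies the end of the previous one.
            self.out.append("</li>")
        if tag == "li":
            self.li_open = True
        self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in ("ul", "ol") and self.li_open:
            # The end of the parent list implies the end of the last <li>.
            self.out.append("</li>")
            self.li_open = False
        elif tag == "li":
            self.li_open = False
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)


def write_out_li_end_tags(markup):
    closer = ImpliedLiCloser()
    closer.feed(markup)
    closer.close()
    return "".join(closer.out)


print(write_out_li_end_tags("<ul><li>One<li>Two</ul>"))
# <ul><li>One</li><li>Two</li></ul>
```

Markup that already includes its end tags passes through unchanged, which is the point: both forms describe the same list.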

Section 9: The XHTML Syntax

If you’re like me, and mostly use XHTML for your documents, give this section a once-over. However, you may also want to examine another document, HTML/XHTML Authoring Compatibility Guidelines. This document provides an overview of the differences in syntax between HTML5 and XHTML5, including quoting attribute values, closing elements, and so on.
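The two differences named above, quoting attribute values and closing elements, can be seen by handing markup to an XML parser, since XHTML5 has to be well-formed XML. The following is a small Python sketch using the standard library. The helper name is_well_formed_xml and the sample fragments are invented for the example, and well-formedness is only one part of what the XHTML syntax requires.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(markup):
    """Return True if an XML parser accepts the fragment."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Fine in the HTML syntax, but not as XML: unquoted value, unclosed <br>.
html_only = "<p class=note>First line<br>Second line</p>"
# Acceptable in both syntaxes: quoted value, self-closed <br/>.
both = '<p class="note">First line<br/>Second line</p>'

print(is_well_formed_xml(html_only))  # False
print(is_well_formed_xml(both))       # True
```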

Section 10: Rendering

Now this is an interesting section. This is where the HTML5 editor is providing guidance about how elements are to be rendered. It’s important to note, though, the following disclaimer at the very start of the section:

User agents are not required to present HTML documents in any particular way. However, this section provides a set of suggestions for rendering HTML documents that, if followed, are likely to lead to a user experience that closely resembles the experience intended by the documents’ authors. So as to avoid confusion regarding the normativity of this section, RFC2119 terms have not been used. Instead, the term “expected” is used to indicate behavior that will lead to this experience.

To me, the use of “expected” sets expectations, but you can’t depend on them. As we are finding, how many of the new elements and form types are rendered does differ between browsers and operating systems, and there’s little we can do to style most of the elements.

Look through the section, note inconsistencies, and file any concerns.

Section 11: Obsolete Elements

Earlier I mentioned the phrase non-conforming. Unlike in HTML4, the concept of deprecation is no longer supported; neither is the concept of validation. Now, a use of an element or attribute is either conforming or not, and an element can be obsolete but conforming, which is the HTML5 version of deprecation. In addition, elements and attributes no longer go through a deprecation period first, but can be labeled obsolete immediately, as with the longdesc attribute.
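One way to picture the three outcomes is as a small lookup, sketched here in Python. The names Status and classify are hypothetical. The handful of entries are examples as I understand the lists in Section 11, not a copy of them, and the real rules carry conditions this sketch ignores (the border attribute on img, for instance, is only tolerated with a value of 0). Validators are expected to warn about the obsolete but conforming features and to report the non-conforming ones as errors.

```python
from enum import Enum


class Status(Enum):
    CONFORMING = "conforming"
    OBSOLETE_BUT_CONFORMING = "obsolete but conforming (validator warning)"
    NON_CONFORMING = "obsolete and non-conforming (validator error)"


# Illustrative entries only; the lists in Section 11 are much longer,
# and some entries there come with conditions that are ignored here.
OBSOLETE_BUT_CONFORMING = {("a", "name"), ("img", "border")}
NON_CONFORMING_ELEMENTS = {"center", "font", "frameset", "marquee"}


def classify(element, attribute=None):
    """Return the conformance status of an element or element/attribute pair."""
    if element in NON_CONFORMING_ELEMENTS:
        return Status.NON_CONFORMING
    if (element, attribute) in OBSOLETE_BUT_CONFORMING:
        return Status.OBSOLETE_BUT_CONFORMING
    return Status.CONFORMING


print(classify("section").value)  # conforming
print(classify("a", "name").value)
# obsolete but conforming (validator warning)
print(classify("font").value)
# obsolete and non-conforming (validator error)
```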

The whole point of this change is that supposedly no element or attribute will ever go unsupported by user agents, which is something that deprecated and obsolete implied in earlier versions of HTML. I’ll be frank and say I don’t agree with this change or philosophy, or the entire obsolete but conforming designation, or making elements and attributes obsolete without first deprecating them. There is an issue on this currently (Issue 106), but no change proposals. However, I am filing an LC comment on this section, and the concepts.

In the meantime, look through the list, see if you can spot any old friends among the newly dead.

Section 12: IANA Considerations

Of interest to application and web developers, but probably not of interest to web authors and designers.

That’s the last of the sections. Note that you don’t have to rush out today and begin reviewing HTML5, because the formal call for Last Call comments won’t happen until next April or May, and you’ll still have a period of time to comment after that. However, the sooner you do file concerns and bugs, the sooner they’re addressed (and the sooner they’re implemented the way you want).

Have fun.