Summary: Split the material related to the browsing context into a separate specification, reducing the existing HTML5 specification to cover only HTML, XHTML, and the DOM.
The following is my initial request that led to this issue:
Currently the HTML5 specification contains a section, Section 6, devoted specifically to browsers. The section also notes that though it is focused on browsers, requirements in the section apply to all user agents, not just browsers, unless otherwise noted.
Though browsers are a major user agent for HTML/XHTML, they are not the only user agents. In particular, ebook technology is dependent on XHTML, and forms a completely different class of user agents than browsers. In addition, there are email applications that exist outside of browsers that also make use of HTML/XHTML, in addition to some word processing software.
Though the section does provide a good reverse engineering of browser technology, the section has little or nothing to do with HTML, in general. In addition, it also has little to do with the Document Object Model, which is based on the HTML syntax, not objects implemented by various browsers.
Including this section greatly extends the HTML5 specification beyond the charter, and beyond boundaries one can reasonably expect from an effort focused on HTML, both the XML serialization and non-XML serialization, and the DOM. In addition, by focusing the specification primarily towards browsers, we are limiting the usefulness of the HTML specification for other uses, both now, and in the future for ebooks, as well as other new technologies.
This runs counter to good technology practice. Consider how a programmer creates an application. They look for opportunities to create reusable objects, which they then use to create any number of applications, not just one. We should follow the same philosophy when creating a new version of HTML: restrict our effort to a new version of HTML, its serialization in XML, and the DOM. This will include new elements, such as video, which may not be useful for all variations of user agents, but the concept behind the new elements still fits within our perceptions of what we would reasonably expect from an HTML specification.
Simplifying the HTML5 specification in this way will greatly increase its usability by many user agents, not just browsers. A standardized BOM (Browser Object Model) can reference the HTML, true, but so can other specifications, such as ePub (for eBooks) and so on.
In addition, browser technology expands at a faster pace than that for the underlying HTML specification. By separating Section 6 out, it can then be incorporated into a new effort that can be focused specifically on the class of user agents, browsers. This new effort won’t then be dependent on the same release cycle as HTML.
I can see no negative ramifications from this change. Not only would it reduce the boundaries of the HTML5 specification to those that one would reasonably expect, the separated section could then be used to seed a new, more targeted effort. As there is work on an ePub specification, there could also be work for the equivalent browser specification.
The end result of this request was an odd sequence of events in which the HTML5 Editor split apart the W3C version of the HTML5 specification, but not specifically along the line between browsing context material and everything else. This is also when the WhatWG version of the HTML5 specification began to strongly diverge from the W3C version.
The split-apart parts were put back together again, but then the original Section 6 was split into two sections, Sections 5 and 6. No rationale was given for not meeting my request, so I can’t reproduce one here.
The concept of a Browser Object Model is not new. If you search on the term, you’ll find hundreds of references, including discussion about how it, unlike the DOM, is not defined within any formal specification. A formal specification is needed, though, and I agree with the work begun in this group. I just don’t believe that the HTML5 specification is the place for this work, because the BOM, which includes objects like Window, History, Navigator, Timers, and so on, is not specific to HTML, XHTML, or the DOM. SVG is also dependent on the BOM, as is any script-enabled document that could potentially be opened in a browser-like object.
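To make the distinction concrete, here is a minimal sketch of how a script might probe for BOM objects. It assumes only that the host exposes them as globals. The names `bomNames` and `detectHostObjects` are hypothetical, and `setTimeout` stands in for "Timers". The point is that these objects come from the host environment, not from the markup or the DOM, so the result depends on where the script runs.

```typescript
// Hypothetical probe: which BOM-style objects does this host provide?
// The list mirrors the objects named in the text; "setTimeout" stands in for Timers.
const bomNames = ["window", "history", "navigator", "setTimeout"] as const;

function detectHostObjects(): Record<string, boolean> {
  const found: Record<string, boolean> = {};
  for (const name of bomNames) {
    // The check is against the global object of the host, not against any document.
    found[name] = name in globalThis;
  }
  return found;
}

// The output differs between a browser, a server-side runtime, and an ebook reader.
console.log(detectHostObjects());
```

A script embedded in an SVG document and opened in a browser would see the same host objects as one embedded in an HTML page.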
Splitting out the non-HTML/XHTML/DOM bits isn’t going to be easy, because there’s a great deal of interwoven coverage of these items all throughout the specification. We could make the split based on script-enabled agent versus not, but this doesn’t make a lot of sense if the DOM is a valid component of the HTML5 specification. A split along the lines of requirements for scripting-enabled user agents, as compared to all user agents, is also not a matter of moving a couple of sections. For instance, in Section 3, titled Semantics, Structure, and APIs of all HTML documents, subsection 3.1.2 states:
User agents must raise a SECURITY_ERR exception whenever any of the members of an HTMLDocument object are accessed by scripts whose effective script origin is not the same as the Document’s effective script origin.
It is obvious that such a stricture is targeted solely to those user agents that support scripting, and is completely and totally irrelevant for those that don’t. This makes this stricture a good candidate for removal to the separate scripting User Agent document, yet it isn’t in the target sections first referenced in the bug and issue.
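A rough sketch of the kind of check this stricture implies may help show why it only concerns scripted user agents. This is a simplified model, not the specification's algorithm: it treats an origin as a (scheme, host, port) tuple and ignores everything else that feeds into an "effective script origin". The names `originOf`, `sameOrigin`, `securityError`, and `checkAccess` are hypothetical. Only the `URL` class is a real, standard API.

```typescript
// Simplified, hypothetical model of the quoted rule; not the spec's algorithm.
interface Origin {
  scheme: string;
  host: string;
  port: string;
}

// Derive a (scheme, host, port) tuple from a URL string with the standard URL class.
// URL normalizes default ports to the empty string.
function originOf(url: string): Origin {
  const u = new URL(url);
  return { scheme: u.protocol, host: u.hostname, port: u.port };
}

function sameOrigin(a: Origin, b: Origin): boolean {
  return a.scheme === b.scheme && a.host === b.host && a.port === b.port;
}

// Stand-in for the SECURITY_ERR exception named in the quoted text.
function securityError(message: string): Error {
  const err = new Error(message);
  err.name = "SecurityError";
  return err;
}

// Guard a scripted user agent might run before exposing a document member to a script.
function checkAccess(scriptUrl: string, documentUrl: string): void {
  if (!sameOrigin(originOf(scriptUrl), originOf(documentUrl))) {
    throw securityError("script from " + scriptUrl + " may not access " + documentUrl);
  }
}
```

A user agent that never runs scripts, such as a typical ebook reader, has no caller for `checkAccess` at all, which is why the rule reads as noise in a document aimed at all user agents.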
Why does it matter to do this kind of split? Because the more unrelated material non-scripted user agents have to wade through to find their relevant bits, the more likely they’ll miss something important because it’s entangled with all sorts of browser or other user agent specific instructions. Conversely, burying browser context material in with HTML makes the material harder for user agents to update, and may hide this crucial information from web page developers, who aren’t expecting this type of information to be included in an HTML specification.
An additional concern is that many of the specifications related to the browsing context aren’t specific to just HTML/XHTML. They can also apply to other document types, such as SVG, MathML, and whatever the future brings.
I don’t think this type of split would be contentious, but I do think it will be a lot of work. It is definitely a group effort.
OK, I’ve listed out the problem. Now what is a solution?
Rather than split the document based on whether the user agent is script-enabled or not, look for material relevant to the browsing context, whether stated or not, and mark it as a potential candidate for movement to a separate document. Once all are marked, then it’s a matter of determining whether the section should be split or not. That’s where the group effort comes in, because frankly, the more the relevant audience is involved, the better job it will be.
This work does amount to a major refactoring of the HTML5 specification. However, after so many changes in the last year, so many components split out, and the ongoing discussions about whether author guidelines should be included or not, the specification is overdue for this type of operation. You don’t have to spend too many hours going through the document to see the fragmented nature of the material, and that a good, solid edit is needed. Sometimes it is better for progress to stop, look, and think, rather than continually move ahead.
I didn’t have the time to go through the many pages of the HTML5 specification, but did go through Sections 5 and 6, in the March 4th published HTML5 spec. The following list of candidate items is not by any measure complete or comprehensive, but it is a start.
- Section 5.1 on browsing contexts. Much of this material is Window related, and I feel comfortable that this section can be split, in its entirety. Reworded, this could become an intro for a new document. It does need some cleanup, as it jumps about, and the text is a little hard to follow. Collapsing the one-sentence paragraphs into fuller paragraphs could help the readability.
- Section 5.2 on the Window object. This one is an excellent candidate, and is pretty easy to follow, so wouldn’t need much in the way of editing.
- Section 5.3 on Origin. There’s nothing about the origin that’s pertinent to HTML, XHTML, or even the DOM. It’s really specific to the location, and security based on the location, which is outside the scope of markup or a Document Object Model. It is important material, but would be best served in the separate document. For instance, I would say there’s much in this section that would be relevant to an SVG document.
- Section 5.4 on Session history and navigation. Again, another excellent candidate for removal to the new document.
- Section 5.5 on Browsing the Web. Even without the title, you can tell that this section is a particularly strong candidate for removal to the new document. I have to question, though, the number of algorithms given in the text in this and other sections: are they all really necessary? Or are we overspecifying, which can be just as harmful as underspecifying? This might be a question to ask when doing this split. There are also discussions of events that could be considered borderline for movement, but the events really are related to the browsing context, not necessarily the HTML/XHTML/DOM. It might be best to define the actions and events in the separate document, and then ensure the section is referenced in the HTML5 specification.
- Section 5.6 on Offline Web Applications. This is another that’s a given. There’s nothing about HTML, XHTML, or the DOM that is application specific. Something like offline web application cache is specific to the browsing context, not the markup or the object model. This is also the last subsection in Section 5, so all of Section 5 could be moved, with appropriate back and forth links, and editing of the moved material.
- Section 6.1 on Scripting should be a candidate, though this move might require minor tweaking. There are references to DOM methods that probably should be pulled out of the moved material and incorporated into the HTML5 spec. Either that, or removed as unnecessary, since these methods are defined in another W3C document. There is a section on event handlers, but most of the section is focused on how the user agent should process the event, not on how the event handler is defined within HTML. The event handler section could potentially be split, part to the browsing context document, part to remain in the HTML5 spec. Also again, there does seem to be a great deal of overspecification in this section, but perhaps it is necessary.
- Section 6.2 on Timers. Oh, most definitely split to the browsing context document. We know this is just as pertinent to SVG as HTML.
- Section 6.3 on User Prompts. Again, a good candidate, as relevant for SVG as HTML.
- Section 6.4 on System state and capabilities is really about the Navigator object, and is definitely part of the browsing context. And this is the last subsection in Section 6, so this section could be moved. Again, there is some editing that needs to be done, and cross-reference links.
My first inclination was also to include Section 7.10 on the Undo History, but this section is related to the DOM. However, Section 7.8 on Spelling and Grammar checking should also be moved. The object referenced is the document object but the usage is pure browsing context. I also think Section 7.9 on Drag and Drop could be a potential candidate for movement. I don’t want to lose the section, but I think it fits within the domain of “browsing context” more aptly than in the domain labeled “HTML/XHTML/DOM”. Hard to say.
Section 7.11, the Editing APIs is a tough one. It is definitely specific to browsing context, especially in the examples, but the document object is the focus. I think the better way of looking at this is, if the section is more pertinent to interacting with agents outside the page, then it probably better fits within the browsing context. If this is for toolbars and extensions, then it definitely does work better in browsing context.
There are other sections, but this is a start, and I’ve run out of time to go more in-depth into the document.
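The candidate list above can be captured as a small triage table, which may be a useful starting point for the marking exercise. This is only a sketch. The type names, the `groupByVerdict` helper, and the four verdict labels are my own shorthand, and the verdicts are a compressed reading of the notes above ("split" for Section 6.1 reflects the suggestion that its event handler material might be divided between the two documents).

```typescript
// Hypothetical triage table; verdict labels are shorthand for the notes above.
type Verdict = "move" | "split" | "maybe" | "keep";

interface Candidate {
  section: string;
  title: string;
  verdict: Verdict;
}

const candidates: Candidate[] = [
  { section: "5.1", title: "Browsing contexts", verdict: "move" },
  { section: "5.2", title: "The Window object", verdict: "move" },
  { section: "5.3", title: "Origin", verdict: "move" },
  { section: "5.4", title: "Session history and navigation", verdict: "move" },
  { section: "5.5", title: "Browsing the Web", verdict: "move" },
  { section: "5.6", title: "Offline Web Applications", verdict: "move" },
  { section: "6.1", title: "Scripting", verdict: "split" },
  { section: "6.2", title: "Timers", verdict: "move" },
  { section: "6.3", title: "User Prompts", verdict: "move" },
  { section: "6.4", title: "System state and capabilities", verdict: "move" },
  { section: "7.8", title: "Spelling and grammar checking", verdict: "move" },
  { section: "7.9", title: "Drag and Drop", verdict: "maybe" },
  { section: "7.10", title: "Undo History", verdict: "keep" },
  { section: "7.11", title: "Editing APIs", verdict: "maybe" },
];

// Group section numbers by verdict so each group can be reviewed separately.
function groupByVerdict(items: Candidate[]): Record<Verdict, string[]> {
  const groups: Record<Verdict, string[]> = { move: [], split: [], maybe: [], keep: [] };
  for (const item of items) {
    groups[item.verdict].push(item.section);
  }
  return groups;
}

console.log(groupByVerdict(candidates));
```

A table like this would let the group argue about one verdict at a time, and it extends naturally as more sections of the specification are reviewed.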
Though I’m a big believer in doing business on the email list, I strongly believe this activity would be better done at a face-to-face meeting, including all of the browser companies, at a minimum. This is, in a way, the browser companies’ business model.
As painful as the amount of work seems, we would have a superior set of documents as a result. If we move the browsing context to a separate document, it would decrease the size of the HTML5 specification, but also focus the audience within the newly separated document. If we were to further split out authoring guidelines, and refocus the HTML5 specification purely on HTML, XHTML, and the DOM, we’d also be ahead, but that’s outside the scope of this effort.
This type of major movement is also an excellent time to do a general readability edit, to ensure a consistent audience focus in each section, and to work on the flow. We can get so caught up on stating rules that we end up overwhelming the reader, even a highly technical reader. When we do, they will unconsciously tune out the flood of rules, and could miss very important data.
Time. This is a major piece of work, though it doesn’t have to be an overwhelming one. If the group participates in the exercise, the work can be split among many. In addition, a face-to-face meeting could easily work through some of the more problematic areas.
I have a feeling that in the end, the specification will progress more swiftly if we make this move than if we don’t.
This is a major piece of work, and we’re already a relatively burned out group. I think, though, a smaller task group, consisting primarily of reps from the browser companies, and some members from the other communities, could progress quickly with this task. I do not think it needs day-to-day oversight by the larger group.