Category: Specs

Tech specs

Google’s Ta Da moments

Post author By Shelley Powers
Post date September 12, 2011

Henri Bergius wrote a piece on Google’s seeming desire to replace all web components, except HTML. Among the “new” technologies:

SPDY to replace HTTP
schema.org and Microdata to replace a decade’s worth of semantic work with RDF and microformats
WebP, a new image format
WebM, a new video format
And now Dart, to replace JavaScript

However, I wouldn’t leave HTML out. The only editor for HTML5 is a Google employee, Ian Hickson, who has been working with other folks, including another Google employee, to break pieces off the HTML5 specification, take them to WHATWG space, and completely re-write them in isolation. Then, when the pieces are re-written, the editors don’t seem want to bring them back to the W3C. (Or they have to ask Google Legal whether they can do so, completely ignoring the fact that as a W3C member, Google pledged to work with others.)

I’ve been battling this effort with the Editing API and just recently, the same thing happened with the section on dynamic markup insertion.

It’s not that people aren’t happy about these non-HTML components being pulled out of the HTML5 specification, but rather than work with the members of the HTML WG and the W3C, Google has been encouraging people to act unilaterally, aided and abetted by the HTML5 editor.

What’s ironic is that the concepts behind the Editing API and the dynamic markup insertion sections, which includes innerHTML among other things, actually originated with Microsoft. I’ve been waiting for Microsoft to go, “Hold on partner!” Apple already has. (And again).

Google has become all that is arrogant conceit. It believes it can do anything better than anyone else. It has dropped any pretense of seemingly wanting to work with others, and pretends its work is open, as long as it “gives” it all away when it’s finished.

The internet and the web were created so that people could connect; that those who were separated physically could still work together. The roots of the web are based in openness and cooperation, not unilateral decisions that demonstrate little tolerance and no empathy. I’d rather use an imperfect technology created by a team of varied and interested people, then a “perfect” work created in isolation and dumped on the world in some grand “Ta Da!” moment.

An imperfect technology can be perfected, but you can’t fix hubris.

Tags HTML

HTML5 Specs SVG

This page isn’t valid…and who cares

Post author By Shelley Powers
Post date September 11, 2011

I covered my recent experiments in using SVG in HTML in SVG in HTML. I linked two different example pages with SVG inline in HTML: one dependent on HTML5 parsing (Firefox nightly), the other using the library, SVGWeb.

There’s another difference between the two examples other than just their implementation. The first example, dependent on a browser parsing the page as HTML5, doesn’t validate. The example using SVGWeb, does. Yet, both pages display correctly, as long as you use an HTML5 enabled browser for the first. The odder thing is, neither page is “invalid”.

The HTML markup is fine for both, as is the SVG used. However, the Validator doesn’t like inline SVG at this time, because, we’re told, no browser implements SVG inline in HTML, yet. The SVGWeb example validates because the SVG is contained in a script block. The validation problems with the first example go beyond embedding the SVG element directly in the web page, though. The example also incorporates a metadata element in the SVG that contains RDF/XML.

Embedding RDF/XML into the metadata element is perfectly valid with SVG, and in fact, quite common when people attach Creative Commons licenses to their work. The HTML5 Validator, though, doesn’t really know what to do with this RDF/XML. Why? Because RDF/XML uses namespaced elements, and namespaced elements are taboo in HTML. Yet, SVG is acceptable in HTML5.

Herein we discover the paradox that is HTML5: XML allowed in HTML, but parsed as HTML; extensible namespaced elements that are valid in SVG/XML, becoming invalid when embedded in the non-extensible environment that is HTML5. HTML5 as XHTML likes namespaces. HTML5 as HTML does not like namespaces. But HTML5, as both XHTML and HTML likes SVG, and SVG likes namespaces.

Pictorially, the logic of this looks about as follows (which would not be valid if inserted into an HTML5 HTML document):

Ouroborous

Oh, what is a web designer/developer to do, who just wants to use a little SVG here and there? Enter, stage left, the HTML5 Doctor.

Recently the HTML5 Doctor was asked about attributes and elements from HTML4 that are now obsolete but conforming (or not) in HTML5. Won’t adding a HTML5 DOCTYPE while still using these elements cause the pages to be invalid?

The Doctor’s answer:

While validation is undoubtedly important for your markup and your CSS, in my opinion it isn’t crucial to a site. Allow me to explain, we recently received a couple of emails pointing out that this site doesn’t validate. While there were some errors that have now been corrected, a primary reason why is the use of ARIA roles in the markup. These attributes currently aren’t allowed in the current specification, however there is work underway to make this happen.

To illustrate this point let’s look at Google, the search giant. If you look at the source on Google’s search pages you’ll see they use the HTML 5 doctype.

<!DOCTYPE html>

However, those pages don’t validate because they use the font and center elements amongst others things that we already know have been removed from the specification. Does this mean that users stop visiting Google? No.

Remember too that the specification is yet to be finalised and may still be changed (thus breaking you’re perfectly valid docments), in partnership with this changes to the specification may not immediately take be implemented in the validators. We also need to bear in mind that HTML 5 takes a “pave the cowpaths” approach to development, meaning that the Hixie, et al will look at what authors already do and improve upon it.

The days of validation being an end all, be all, are effectively over with HTML5. By obsoleting (not deprecating) elements that were perfectly valid in HTML4; by not providing an extensibility path within HTML in HTML5, especially considering that new elements will arise over time—not to mention, the inclusion of perfectly legitimate namespaces elements in SVG— all, combined make “validation” a goal, but not an end when it comes to the web pages of the future. We’re more likely to define a set of supported browsers and user agents and worry more about the pages working with these, then be concerned about whether the pages validate in Validator.nu.

So, my one web page with the inline SVG works with the Firefox nightly, with HTML5 parsing enabled. It isn’t valid…but who cares?

Tags HTML5, SVG

Media Specs

Notes from writing HTML5 Media

Post author By Shelley Powers
Post date July 19, 2011

Recovered from the Wayback Machine.

This last weekend I finished my latest book for O’Reilly: HTML5 Media. This is one of O’Reilly’s shorter books (about 100 pages), primarily focused at the eBook market, though you can get a hard copy with print-on-demand.

The book focuses on the HTML5 audio and video elements. I cover how to use the elements in a web page and go into detail on the attributes for each element, as well as cover video and audio codec support. I also devote a couple of chapters on developing with both elements, including how to create a custom control, as well as integrating the media elements with the canvas element and SVG.

In one chapter, I touch on the newest media element API functionality including the brand new and unimplemented media controllers and support for multiple audio, video, and text tracks. Though no browser currently provides support for captions/subtitles, I also explain how to use JavaScript libraries and SRT or WebVTT files to add captions and subtitles to videos.

I enjoyed working on this book. I enjoyed worked with the media elements, though I’m more partial to the video element. Working on the book was also a learning experience—even, at times, an eyebrow raising experience. I thought I would share with you all some of the notes I wrote while working on the book.

WebVTT versus TTML

The WHATWG group started working on a subtitle/caption format based on SRT (SubRip) text format. The original name was WebSRT, but it was recently renamed to WebVTT. The LeanBack Player web site provides a good review of WebVTT.

WebVTT is a pretty basic format, consisting of line numbers, timelines, and text with formatting options. There are plans to add additional capabilities, but what we have now should meet most needs.

There’s been interest in bringing WebVTT over to the W3C. However, the W3C already has a timed text specification, TTML. TTML is an XML based format that is more sophisticated than WebVTT, but also more complicated to use.

I covered WebVTT in the book in detail, but only briefly mentioned TTML. The reason I didn’t spend time with TTML is because of existing support and the industry movement away from XML.

None of the various JavaScript libraries I tested that provided some caption/subtitle support worked with TTML. They worked with SRT or WebVTT, but none that I tried worked with TTML.

Additionally, TTML is an XML format. Now, XML might have been the approach to take a half dozen years ago, when most everything at the W3C was heading in an XML direction. In the last several years, though, we’ve seen the popularity of the RDF/XML serialization fade in favor of Turtle or RDFa, and XHTML2 abandoned in favor of HTML5. SVG is still holding on, but now there’s rumblings of an API that will generate SVG or canvas API calls, and basically hide most of the XMLness of SVG from view. I vaguely remember reading something somewhere that the folks working on TTML were even thinking of creating a JSON version of the spec.

Whether intended or by accident, there is a subtle but noticeable shift away from XML in the W3C. At the same time, there is a strong core of support for XML formats in the W3C. Between both seemingly contradictory paths, I’m thinking we should just skip the interim pain and anguish of yet another format war, and go right to the end point. So I covered SRT and WebVTT and only mentioned TTML in passing.

Protecting the Users from the Big Bad Web Developers

I like HTML5 video and audio, I really do. I had a great deal of fun writing this book. However, despite my affection for these elements I must also admit to some irritation with their design and implementation. (Well, other than the fact that an entire block of the specification changed mysteriously one night, requiring a sudden and unexpected re-write in one of my chapters.)

The part about the HTML5 media elements I like the least is the seeming level of distrust directed at web page authors and developers.

For instance, if you’re creating a custom control and remove the controls attribute, you may think you then have complete control over the media playback. You don’t, though—at least, not in most browsers.

In the section of the HTML5 spec related to the media element’s user interface, implementors are advised to provide playback control in some manner regardless of whether the controls attribute is present or not:

Even when the attribute is absent, however, user agents may provide controls to affect playback of the media resource (e.g. play, pause, seeking, and volume controls), but such features should not interfere with the page’s normal rendering. For example, such features could be exposed in the media element’s context menu.

If you right mouse click on a video element in Firefox, you’re given the options to play or pause the video, mute the volume, play the video in fullscreen, show or hide the controls, as well as save the video or play the video by itself in another page. Chrome provides options to play, pause, or mute the video, as well as show or hide the controls, open the video in another tab, or save the video. Opera’s context menu options are similar to Chrome’s, minus the option to open the video in a new tab. IE10 provides play, pause, mute options, the ability to save the video, and the ability to control playback speed. Safari is the only browser that doesn’t provide context menu options to control the video. At least, not yet.

There is absolutely no way to directly control what does or does not display in the context menu that the browser provides. There is no way to control some of the actions that people can take in the context menu, such as preventing the fullscreen display of the video, if you don’t want it played fullscreen.

If you’re providing custom controls for the video, you have to account for the fact that the video playback is being managed by the context menu as well as your controls. One of my examples in the book provides a video playback control that consists of separate buttons for play, pause, and stop. These controls are disabled based on what action the user takes. It seems like a simple act to just disable and enable the appropriate buttons at the same time you play or pause the video, but you actually have to capture two sets of events: the click events from the buttons, and the play and pause event from the video.

Of course, the amount of extra code to do something like enable and disable buttons based on playback is trivial. But what isn’t trivial is controlling which options are made available to the user. If, for whatever reason, you don’t want the video to be played fullscreen, there is absolutely no way to prevent this from happening with Firefox.

The only way to prevent the context menu from displaying for the video is to provide a transparent div overlay for the video, so that the context menu reflects the div element, not the video. That or turn the video element’s display off, and play the video by redrawing it into a canvas element—a case of overkill, just to be able to control video playback.

The conflict between the context menu and customization isn’t the only web developer/author restriction.

There are the times when the web page author wants the audio or video to begin automatically when the page loads. The media elements do provide attributes for this: autoplay and loop. To ensure automatic playback, the author removes the controls attribute, adds autoplay and possibly loop, and when the page loads, the media element begins playing. The web page author can also remove the audio element completely from display so all that’s left is the sound. The video element is, of course, left displayed, but the control UI should not be showing. We can’t control the context menu options, but at least the control UI isn’t displaying. Well, not unless scripting is disabled, that is.

If the user has scripting disabled, the control UI is automatically re-displayed with the media element—even if you don’t want it to be displayed. If scripting is disabled, you cannot control the visibility of the control UI. According to the HTML5 specification:

If the attribute is present, or if scripting is disabled for the media element, then the user agent should expose a user interface to the user.

I’ve been told by members of the HTML WG that “should” in this context is equivalent to “must”. The two terms are not the same, but I gather that they become one in HTML5 land.

Currently, only Opera provides a visual control UI when scripting is disabled. Firefox doesn’t display a visual control UI when scripting is disabled regardless of whether the controls attribute is present or not. Safari, Chrome, and IE currently do not display the control UI. If “should” is equivalent to “must”, then Firefox, Safari, Chrome, and IE are all in error in their handling of disabled scripting and the media elements. I imagine bugs will be filed, if they haven’t already been filed, and these browsers will also automatically add the control UI when scripting is disabled.

I hear people cheering. You’re cheering, aren’t you? You’re all insanely happy with the power given the end user with the HTML5 video and audio elements.

Most of us remember those times when we opened a web page and some horrid music was blaring, or a video automatically plays with some idiot in a suit talking about his constipation. If we’re at work, we keep our machines permanently muted, lest something embarrassing blare out at an inopportune moment. If screen readers are not sophisticated enough to automatically lower background sound when a page is opened, the background sound competes badly with the reader.

Automatic audio, bad. Automatic video, bad.

I also imagine most of you have forgotten your visits to sites where you expected music or a video to play, and how much you enjoyed a well crafted multimedia experience.

Consider sites devoted to movies. Currently the last of the Harry Potter movies, the latest Transformer, and the new Spielberg Super 8 are playing in movie theaters. All three movies have their own web sites. If you open all three movie sites, you’ll find extensive use of both audio and video media.

The Harry Potter site opens with a preview of the movie with its own custom control that automatically starts playing as soon as it is sufficiently loaded. Among the options not provided with this video are the ability to open the movie out of context of the frame, such as opening the video in fullscreen. At most you can start or stop the video, choose a different video format, or skip the video and go to the site offerings.

The site offerings page has audio playing in the background. In addition, the bottom of the page features an animated video of owls. You can do nothing to stop either.

The Transformer movie site also provides background sound, as well as a video that begins to play automatically and loops continuously in the splash page. When you enter the site, another video plays continuously in the background of the page. Again, sound is used. There is very little about the site that is text-based: it’s all eye and ear candy.

The Super 8 movie site provides an automatically playing trailer with its own control. The page also has background audio when the trailer is finished. One of the sections of the site is the Editing Room. This page features a video playing automatically in an old 8mm style. Once this video if finished, rows of film are displayed. You can click a control that opens another video, again in super 8 style, providing a back story for the movie. You’re provided with controls to play the video and mute the sound.

None of these movie sites provide a context menu for their videos, other than what you would expect to see with a Flash movie. None of the sites allowed you to open the videos in fullscreen or play in a separate tab, because the videos are part of an integrated whole. The sites don’t allow you to switch off audio that I can see. I realize that automatically playing audio can be irritating for some, and can play havoc with screen readers, but again, none of this is unexpected for a movie site.

These types of sites will never be created using HTML5— not because HTML5 isn’t capable of creating most of the effects, but because HTML5 deliberately circumvents finer control over the video element. Can you imagine what would happen with the Transformer site with scripting disabled? The browser would then automatically plunk the control UI over the video in the page, which would ruin the overall effect the page creator was trying to make.

The unfortunate consequence of making HTML5 video and audio unattractive for these sites is that once they start using Flash for one component of the site, they continue to use Flash for every component of the sites. If you open these pages and use a screen reader such as NVDA, the only sound you’ll get is the background audio because every last bit of the site is in Flash: the text, the menus, all of it.

We want these sites to consider using HTML5 instead of just Flash, because if they do, the sites will end up being more accessible rather than less. Yes, even if the HTML5 media elements don’t have a control UI, and audio and video are played automatically. If we want to convince people to use something other than Flash, we need to ensure they have the same level of control that they had with Flash. Currently, the HTML5 video and audio elements do not provide this level of control.

HTML Media and Security

During the recent brouhaha related to WebGL security, the HTML5 editor, Ian Hickson, discovered that the video element, as it was currently defined, would not allow the cross-domain access that the img element provides. In other words, if the video you linked in with the src attribute was not from the same domain as your web page, the video wouldn’t play. This restriction was lifted, and the video (and track) resources are now treated the same as image resources.

However, one of the safety features related to cross-domain resource access was the concept of canvas tainting. If the image or video drawn into a canvas element is from another domain, the canvas is marked as tainted (the origin-clean flag is set to false). When the canvas is tainted, the toDataURL, getDataImage, and measureText methods generate a security exception. You couldn’t circumvent the same-origin restriction by using Ajax, either, because it would not allow cross-domain resource access.

Of course, much of this has changed because of the WebGL security issues. Originally WebGL was limited to using only same-origin image access for canvas textures, but a more recent version of the specification allowed for cross-domain image access. WebGL developers wanted to add images (and potentially video) from other domains as textures for their 3D creations. Unfortunately, when the WebGL specification and implementations enabled cross-domain image access, they also opened up a security violation: the WebGL could be manipulated in such a way as to create a “data leak”, giving the web pages access to actual image (and video) data.

In order to allow WebGL to proceed without having to tackle the functionality causing the data leak (I’m told a daunting task), the WebGL community requested and received a new attribute that can be added to the img, audio, and video elements in HTML5: crossorigin. This attribute allows same-origin privileges with cross domain resources, as long as the resource server concurs with this use. This is a concept known as Cross-Origin Resource Sharing, or CORS.

CORS is another specification in work at the W3C. It originated as a way for web developers to access cross-domain resources using XMLHttpRequest (Ajax). The concept has since been expanded to include workarounds for the same-origin security restrictions in other uses, including the newest related to canvas tainting.

It sounds all peaches and cream except that there are issues related to the concept, especially when accessing image and video data from cloud services such as Amazon’s AWS or centralized image systems, such as Flickr. For CORS and the crossorigin attribute to work, these services must be willing to support CORS. The WebGL and other developers assumed the sites would be more than willing to do so. However, I know that Amazon has already expressed reservation about supporting CORS, and I wouldn’t be surprised if there wasn’t some reluctance on the part of other services.

I also had reservations about the breathlessly quick addition of crossorigin to HTML5, starting with the unanswered question, “What would WebGL had done if HTML5 was too far along in the recommendation track to add this change?” I still have concerns about quickly adding in functionality that routes around security protocols because another specification needs to have this functionality because of a security violation. I’ve long been a fan of 3D effort on the web, beginning with the earlier VRML and continuing with my interest in WebGL (I covered it in my Painting the Web book). However, I’m even more of a fan of web security. That and a stable specification. What would have happened if WebGL had made this request after HTML5 had progressed to candidate recommendation status?

Yes, I am a stick in the mud. I like stable specifications and secure web pages. I’m just old fashioned that way.

Anyway, for those wanting to integrate HTML5 video and canvas element, be aware of this very new functionality. You won’t find it included in the HTML5 Last Call document, you’ll only find it in the HTML5 editor’s draft.

Codec Support

You would expect to find tables with audio and video browser container/codec support littering the internet, and you do. The only problem is, none of the tables seem to agree.

Trying to determine exactly what container/codec each browser supports is actually a pain in the butt. I’m sure each and every browser has a page somewhere that explicitly lists what it supports in all possible environments. Wherever these pages are, though, must be one of the better kept web secrets.

It’s not as if there’s a simple yes/no answer to audio or video codec. After all, if you use the HTMLMediaElement’s canPlayType method with various audio or video codecs, you’ll either get a “maybe”, “probably”, or an empty string. Maybe and probably are not normally viewed as decisive words. It also doesn’t help when Chrome answers either maybe or probably to everything.

Then there are the quirks.

Firefox and Chrome only like uncompressed WAV files. Opera and Safari don’t seem to mind compressed WAV files. Technically, though, all four browsers “support” WAV.

Both these statements are true: only Safari supports AAC; Safari, Chrome, and IE support AAC.

If you use a tool such as the Free MP3/Wma/Ogg Converter (http://www.freemp3wmaconverter.com/), you’re given an option to convert your sound file to several different formats, including AAC and M4A. Many people will tell you AAC and M4A are one in the same. Well, yes and no.

The AAC option creates an AAC file that is packaged in a streaming format called Audio Data Transport System (ADTS). The M4A option is an AAC file that’s packaged in MPEG-4. Since Safari can play whatever QuickTime can play on a system, and QuickTime can play the ADTS AAC file, the AAC file only plays in Safari. Chrome and IE can also play the AAC file, but only if it’s wrapped in the MPEG-4 container, which Safari also supports.

But wait…there’s more!

No, no. I’m just joshing you.

Well, there really is more but I don’t want to be cruel.

The confusion about support is further exacerbated by the politics surrounding container/codec support. Yes, Chrome supports MP4. No, Chrome does not support MP4. Yes, Ogg is the open source community’s fair haired child. No, WebM is the open source community’s fair haired child … they just don’t know it yet. Speaking of WebM, yes, WebM is a video container/codec, but it’s also an audio container/codec—just leave out the video track.

Remember when everything was going to be Ogg and life was simpler?

Anyway, to add to the audio/video container/codec noise on the internet, my own versions of browser/codec support for the HTML5 audio and video elements.

Are they accurate? Sure. Why not.

What day is it?

Popular HTML5 audio container/codec support by browser
Container/Codec	IE	Firefox	Chrome	Safari	Opera
WAV(PCM)	No	*Yes	*Yes	Yes	Yes
MP3	Yes	No	Yes	Yes	No
Ogg Vorbis	No	Yes	Yes	No	Yes
MPEG-4 AAC	Yes	No	Yes	Yes	No
WebM Vorbis	No	Yes	Yes	No	Yes

*Make darn sure the WAV file is uncompressed

Popular HTML5 video container/codec support by browser
Container/Codecs	IE	Firefox	Chrome	Safari	Opera
MP4+H.264+AAC	Yes	No	*No	Yes	No
Ogg+Theora+Vorbis	No	Yes	Yes	No	Yes
WebM+V8+Vorbis	No	Yes	Yes	No	Yes

*Google has announced that Chrome will not support H.264. However, there are faint traces of support—ghosts if you will—still left in Chrome.

Official HTML5 Video Mascot

The official HTML5 video mascot is ….

Big Buck Bunny!

Tags HTML5, RDFa

Diversity Specs

W3C HTML WG decisions and the ARIA meltdown

Post author By Shelley Powers
Post date April 11, 2011

Recovered from the Wayback Machine.

One last decision I want to touch on, for now, was the decision related to Issue 129 on ARIA Mapping. In the decision, the co-chairs sided with the change proposal that added new role mappings for several elements. An uncomplicated change proposal that should require only some small edits to the ARIA mapping table.

However, things are never as simple as they seem.

First, the change tracking shows the addition of interesting new editorial comments related to this change:

These are issues that are known to the editor but cannot be
currently fixed because they were introduced by Sam Ruby
acting as chairman of the W3C HTML Working Group as part of
the HTML Working Group Decision Process. In theory we could
fork the WHATWG copy of the spec, but doing so would introduce
normative differences between the W3C and WHATWG specs and
these issues are not worth the hassle that this would cause.
We’ll probably be able to fix them some day, but for now we
are living with them.

In addition, evidently the changes made to the HTML5 spec didn’t agree with the change proposal, as noted by Steve Faulkner. To make a long, sad story short: a request was made to revert the changes and the editor must bring whatever changes he makes to meet the decisoin to the working group, first, before applying to the document.

I’m, personally, less bothered by the editorial errors than I am the discussion about forking. In many ways, this only demonstrates why the license discussion, which also seems to be never-ending, is essential: forking a specification is not the same as forking software. And there’s too much of a tendency among some folks in the WHATWG to want to fork, first, and then work through the issues.

I’m also concerned that these issues will continue to arise, time and again, because folks at the W3C are dancing around the edges of the problem, rather than confronting the problem directly. However, if the W3C does respond assertively, there is a very real possibility of one or more browser companies taking their marbles and quitting the game.

It’s a damnable situation.

Tags HTML5

Specs

The W3C HTML WG decision on RDFa prefixes

Post author By Shelley Powers
Post date April 11, 2011

Recovered from the Wayback Machine.

One HTML WG decision I agree with is the one associated with Issue 120 on RDFa prefixes.

Considering that RDFa support in XHTML/HTML to this point has made use of prefixes, I don’t understand why we even contemplated not supporting prefixes just because RDFa is being ported to HTML5. Frankly, it’s not the HTML5 WG’s design decision to make—RDFa in HTML5 is a port, the design for RDFa resides with another group.

As for RDFa prefixes being confusing, one of the most fundamental design patterns, in computer tech and elsewhere, is the concept of variable/value pairs, with a shorter, easy to type and remember variable or abbreviation used in place of a longer, more complex value.

Then there’s the fact that RDFa has significant adoption, and dropping support for prefixes will break the web. I’ve heard that this is an important criteria for other HTML5 design decisions. If nothing else, consistency demands we support prefixes.

I could go on, but the proposal to keep prefixes does a commendable job and I don’t need to repeat its arguments.

Tags HTML5, RDFa