Categories
Semantics

Maxwell’s Silver Hammer: RDFa and HTML5’s Microdata

Being a Beatles fan, I must admit to being intrigued about the new Beatles box set that will be available in September. I have several Beatles albums, but not all. None of the CDs I own have been re-mastered or re-mixed, including one of my favorite songs, from Abbey Road: Maxwell’s Silver Hammer:

Joan was quizzical; Studied pataphysical
Science in the home.
Late nights all alone with a test tube.
Oh, oh, oh, oh.

Maxwell Edison, majoring in medicine,
Calls her on the phone.
"Can I take you out to the pictures,
Joa, oa, oa, oan?"

But as she's getting ready to go,
A knock comes on the door.

Bang! Bang! Maxwell's silver hammer
Came down upon her head.
Bang! Bang! Maxwell's silver hammer
Made sure that she was dead.

I love the chorus, Bang! Bang! Maxwell’s silver hammer came down upon her head…

Speaking of Bang! Bang! Jeni Tennison returned from vacation, surveyed the ongoing, and seemingly unending, discussion on RDFa as compared to HTML5’s Microdata, and wrote HTML5/RDFa Arguments. It’s a well-written look at some of the issues, primarily from the viewpoint of a heavy RDFa user, working to understand the perspective of an HTML5 advocate.

Jeni lists all of the pushback against RDFa that I’m aware of, including the reluctance to use namespacing, because of copy and paste issues, as well as the use of prefixes, such as FOAF, rather than just spelling out the FOAF URI. Jeni also mentions the issue of namespaces being handled differently in the DOM (Document Object Model) when the document is served as HTML, rather than XHTML.

The whole namespace issue goes beyond just RDFa, and touches on the broader issue of distributed extensibility, which will, in my opinion, probably push back the Last Call date for HTML5. It may seem like accessibility issues are the real kicker, but that’s primarily because no one wants to look at the elephant in the corner that is extensibility. Right now, Microsoft is tasked to provide a proposal for this issue—yes, you read that right, Microsoft. When that happens, an interesting discussion will ensue. And unlike other issues, whatever happens will take more than a few hours to integrate into HTML5.

I digress, though. At the end of her writing, Jeni summarizes her opinion of the RDFa/namespace/HTML5/Microdata situation with the following:

Really I’m just trying to draw attention to the fact that the HTML5 community has very reasonable concerns about things much more fundamental than using prefix bindings. After redrafting this concluding section many times, the things that I want to say are:

  • so wouldn’t things be better if we put as much effort into understanding each other as persuading each other (hah, what an idealist!)
  • so we will make more progress in discussions if we focus on the underlying arguments
  • so we need to talk in a balanced way about the advantages and disadvantages of RDF

or, in a more realistic frame of mind:

  • so it’s just not going to happen for HTML5
  • so why not just stop arguing and use the spare time and energy doing?
  • so why not demonstrate RDF’s power in real-world applications?

My own opinion is that I don’t care that RDFa is not integrated into HTML5. In fact, I don’t think RDFa belongs in HTML5. I think a separate document detailing how to integrate RDFa into HTML5, as happened with XHTML, is the better approach.

Having said that, I do not believe that Microdata belongs in the HTML5 document, either. The HTML5 document is already problematical, bloated, and overly complex. It encompasses too much, a fault of the charter, as much as anything else. Removing the entire Microdata section would help, as well as several other changes, but we’ll focus on the Microdata section for the moment.

The problem with the Microdata section is that it is a competing semantic web approach to RDFa. Unlike competition in the marketplace, competition in standards will actually slow down adoption of the standards, as people take a sit-back-and-see-what-happens approach. Now, when we’re finally seeing RDFa incorporated into Google, into a large CMS like Drupal 7, and other uses, now is not the time to send a message to people that “Oops, the W3C really doesn’t know what the fuck it wants. Better wait until it gets its act together.” Because that is the message being sent.

“RDFa and Microdata” is not the same as “RDFa and Microformats”. RDFa, or I should say, RDF, has co-existed peacefully with microformats for years because the two are really complementary, not competitive, specifications. Both can be used at a site. Because Microformat development is centralized, it will never have the extensibility that RDF/RDFa provides, and the number of vocabularies will always, by necessity, be limited. Microformats, on the other hand, are easier to use than RDFa, though parsing Microformats is another thing. They both have their strengths and weaknesses. Regardless, there’s no harm to using both, and no confusion, either. Microformats are managed by one organization, RDFa by the W3C.

Microdata, though, is meant to be used in place of RDFa. But Microdata is not implemented in any production capable tool, has not been thoroughly checked out or tested, has not had any real-world implementation that I know of, has no support from any browser or vendor, and isn’t even particularly liked by the HTML WG membership, including the author. It provides only a subset of the functionality that RDFa provides, and necessitates the introduction of several predefined vocabularies, all of which could, and most likely will, end up out of sync with the organizations responsible for the extra-HTML5 vocabulary specification. And let’s not forget that Microdata makes use of the reversed DNS identifier that sprang up, like a plague of locusts, in HTML5, based on the seeming assumption that people will find the following:

com.example.xn--74h

Easier to understand and use than the following:

http://example.com/xn--74h

Which, heaven knows, is not something any of us are familiar with these last 15-20 years.
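
To make the comparison concrete, here is a small Python sketch of how the two forms relate. The mapping rule (reverse the host labels, then append path segments as further labels) is my own assumption for illustration, not the algorithm in the HTML5 draft, and both function names are hypothetical.

```python
from urllib.parse import urlsplit


def url_to_reverse_dns(url: str) -> str:
    """Turn a URL into a reverse-DNS style identifier (assumed mapping)."""
    parts = urlsplit(url)
    # Reverse the host labels: example.com -> com.example
    labels = parts.hostname.split(".")[::-1]
    # Append each non-empty path segment as a further label
    labels += [segment for segment in parts.path.split("/") if segment]
    return ".".join(labels)


def reverse_dns_to_url(identifier: str, host_labels: int = 2) -> str:
    """Inverse of the mapping above.

    The caller has to say how many leading labels belong to the host,
    because the identifier itself does not record where the host ends.
    """
    labels = identifier.split(".")
    host = ".".join(reversed(labels[:host_labels]))
    path = "/".join(labels[host_labels:])
    return f"http://{host}/{path}"
```

Note that going back from the identifier to the URL needs extra information (how many labels make up the host), which the URL form carries for free.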

RDFa and HTML5/Microdata, otherwise known as Issue 76 in the HTML 5 Tracker database. I understand where Jeni is coming from when she writes about finding a common ground. Finding common ground, though, presupposes that all participants come to the party on equal footing. That both sides will need to listen, to compromise, to give a little, to get a little. This doesn’t exist with the HTML5 effort.

Where the RDFa in XHTML specification was a group effort, Microdata is the product of one person’s imagination. One single person. However, that one single person has complete authorship control over the HTML 5 document, and so what he wants is what gets added: not what reflects common usage, not what reflects the W3C guidelines, and certainly not what exists in the world, today.

While this uneven footing exists, I can’t see how we can find common ground. So then we look at Jeni’s next set of suggestions, which basically boil down to: because of the HTML WG charter, nothing is going to happen with HTML5, so perhaps we should stop beating our heads against the wall, and focus, instead, on just using RDFa, and to hell with HTML5 and microdata.

Bang! Bang!

I am very close to this. I had started my book on the issues I have with HTML5, and how I would change the specification, but after a while, a person gets tired of being shut out or shut down. I’m less interested in continuing to “bang my head against the wall”, as Jeni so eloquently put it.

But then I get an email this week, addressed to several folks, asking about the introduction of Microdata: so what does the W3C recommend, then? What should people use? Where should they focus their time?

Confusion. Confusion because the HTML5 specification is being drafted specifically to counter several initiatives that the W3C has been nurturing over the last decade: Microdata over RDF/RDFa; HTML over XHTML; Reverse DNS identifiers over namespaces, and URIs; the elimination of non-visual cues, not only for metadata, but also for the visually challenged. And respect. There is no respect for the W3C among many in the HTML Working Group. And I know I lose more respect for the organization the closer we get to HTML5 Last Call.

In fact, HTML Working Group is a bit of a misnomer. We don’t have HTML anymore, we have a Web OS.

We don’t have a simple HTML document, we have a document that contains the DOM, garbage collection, the Canvas object and a 2D API, a definition for web browser objects, interactive elements, drag and drop, cross-document communication, channel messaging, Microdata, several pre-defined vocabularies, probably more JavaScript than the ECMAScript standard, and before they were split off, client-side SQL, web worker threads, and storage. I’m sure there’s a partridge in a pear tree somewhere in there, but I still haven’t made it completely through the document. It’s probably in Section 10. I know there’s talk of extending the document to include a 3D API, and who knows what else.

There’s a lot of stuff in HTML5. What isn’t in the HTML5 document is a clean, straightforward description of the HTML or XHTML syntax, and a clearly defined path for people to move to HTML5 from other specifications, as well as a way of being able to cleanly extend the specification—something that has been the cornerstone of both HTML and XHTML in the past. There’s no room for the syntax, in HTML5. It got shoved down by Microdata and the 2D API. There’s no room for the past, the old concepts of deprecated and obsolete have been replaced by such clear terms as “Conforming but obsolete”. And there’s certainly no room for future extensibility. After all, there’s always HTML6, and HTML7, …, HTMLN—all based on the same open, encompassing attitude that has guided HTML5 to where it is today.

If we don’t like what we see, we do have options. We can create our own HTML5 documents, and submit “spec text” for a vote. But what if it’s the whole document that needs work? That many of the pieces are good, but don’t belong in the parent document, or even in the HTML WG?

The DOM should be split out into its own section and should take all of the DOM/interactive, and browser object stuff with it. The document should be re-focused on HTML, without this mash-up of HTML syntax, scripting references, and API calls that exists now. The XHTML section should be fleshed out and separated out into its own section, too, if for no other reason than to reassure people that no, XHTML is not going away. We should also be reminded that XHTML is not just for browsers; in fact, the eBook industry is dependent on XHTML. And it doesn’t need Canvas, browser objects, or drag and drop.

Canvas should also be split out, to a completely separate group whose interest is graphics, not markup. As for Microdata, at this point, I don’t care if Microdata is continued or not, but it has no place in HTML5. If it’s good, split it out, and let it prove itself against RDFa, directly.

The document needs cleaning up. There are dangling and orphaned references to objects from Web Workers and Storage still littering the specification. It hops around between HTML syntax and API call, with nothing providing any clarity as to the rhyme or reason for such jumping about. Sure there’s a lot of good stuff in the document, but it needs organization, clean up, and a good healthy dose of fresh air, and even a fresher perspective.

Accessibility shouldn’t be added begrudgingly, woodenly, resentfully. It should be integrated into the HTML, not just pasted on in order to quiet folks because LC is coming up.

The concepts of deprecated and obsolete should be returned, to ensure a sense of continuity with HTML 4. And no, these did not originate with HTML. In fact, the use of deprecated and obsolete has been fairly common with many different technologies. I can guarantee nothing but the HTML5 document has a term like “conforming but obsolete”. I know, I searched high and low in Google for it.

And we need extensibility, and no, I don’t mean Microdata and reverse DNS identifiers. If extensibility was part of the system, folks who want to use RDFa could use RDFa, and not have to beg, hat in hand, to be allowed to sit at the HTML 5 table. This endless debate wouldn’t be happening, and everyone could win. Extensibility is good that way. Extensibility has brought us RDFa, SVG, and MathML in past specifications, and will continue to bring whatever the future may bring.

whatever the future may bring…

Finding common ground? Walk a mile in each other’s moccasins? Meet mano a mano? Provide alternative specification text?

Bang! Bang!

Jeni’s a pretty smart lady.

Categories
Social Media Technology

The Tweet stuff: When it absolutely positively has to get there

If we’ve learned one thing from this week’s massive attack against the very fabric of our social connectivity, it’s that clouds don’t make the best stuff on which to build.

Twitter, in particular, has shown how very vulnerable it is, a vulnerability we share as we become more dependent on Twitter for most of our communication with each other. Oddly enough, I needed to contact someone about a business opportunity just as the Twitter universe began to crumble, but all I had was her Twitter name; I couldn’t find her email address. Since Twitter was down, I couldn’t connect up with her for hours.

Of course, massive DDoS isn’t all that common, but Twitter still hasn’t recovered from the attack. As I’ve been playing with new Twitter accounts this week, I found varying degrees of responsiveness across the accounts, probably based on how busy they are; possibly based on how many followers a person has. None of the accounts would allow me to change most profile information, including the design. As you can see with my new integrated @shelleypowers Twitter account, I haven’t been able to change the picture, or to delete or add a new background image. I’ve had varying success with just posting a new message.

I have never liked centralized systems, though I understand their appeal and worth. It always seems, though, that just when you start to depend on the centralized service something happens to it.

Yahoo is now out of the search engine business, and with its new business partnership with Microsoft, its side applications like delicious are now vulnerable. I’ve managed to replace delicious with Scuttle, though I no longer have the social aspect of delicious. However, my Scuttle implementation does an excellent job with bookmarks, which is what I needed.

Then NewsGator sent an email around this last week telling all of us that our NewsGator feed aggregator is being replaced by Google Reader. I don’t like Google Reader. More importantly, I really don’t want to give Google yet more information about me. So, I replaced my NewsGator/NetNewsWire installation with a Gregarius implementation. It took me some time to get used to the new user interface, and I’ve had to password protect the installation, but I’m not dependent on a centralized feed aggregator, which can, and did, go away.

Twitter, though. I was not a big Twitter fan at first, but I can see the benefits of the application, especially if you want to point out an article or something else to folks, and have it quickly, virally spread, in a nice swine flu-like manner. It’s fun to have a giggle with folks, too. But the darn thing is centralized, and not only centralized, vulnerable and centralized, which gives one pause.

I have an Identi.ca account, too, but most folks are in Twitter. You can integrate the two by linking your identi.ca account to Twitter, as I have. Still, identi.ca is also centralized, just located in a different slice of the internet pie.

I finally bit the bullet this week, and installed my own version of Laconica, which is the microblogging software used with identi.ca. There were a couple of glitches, not the least of which were two very minor programming typos in the install program (yes, I have turned these in to the developers and they should be fixed, soon). However, the application is actually quite easy to use. I’ve had fun playing around with a new theme.

Just like identi.ca, you can connect an individual Laconica account up with Twitter, but doing so would cut my identi.ca account out of the picture. Beyond just the identi.ca issue, I also want to be able to display a list of links in my burningbird.net sidebar, with expanded URLs, so folks get search engine mojo. I could aggregate tweets, but you end up with shortened URLs, not expanded URLs when you go from tweet to sidebar. Besides, a sidebar link and a tweet are not the same thing, with the same structure.

I finally created my own tweet workflow, as I like to call it.

  • First, I installed Laconica, created a single user account (at this time), and then disabled registration. I don’t want to run a Twitter alternative.
  • Next, I found software, RSSdent, which will take an RSS feed and submit the items as tweets to identi.ca/Laconica. What I did was modify the application to submit the body of the feed without a link to the feed. The reason I don’t want the URL is that the feed I’m syndicating is my newly created Laconica installation. The body of the items will have the links that matter. Since I didn’t need any URL shortening (happening sooner in the process), I was able to trim much of the code, leaving a nice simple little application.
  • I set up a cron job so that items posted to my individual Laconica account will get posted to my identi.ca account every hour.
  • I connected my identi.ca account to my main Twitter account, at @shelleypowers. Now, when an item is posted in my identi.ca account, it gets posted to Twitter.
  • I can individually post in my Laconica account, but I also want to capture the links for my main Drupal installation, at Burningbird. There is a Drupal Twitter module, which works with identi.ca (by using identi.ca/api as the alternative URL), or a Laconica account (in my case, using laconica.burningbird.net/api/). The only problem, though, is that this module is used to post a status update reflecting a new weblog post, not an interesting link. It gives you options to post the title, post link, and/or author, but not the body.
  • To work around the problem, I created a new content type, linkstory, with a custom content field (via CCK) that contains the link of interest and its link text. When I create a new linkstory, the body contains the tweet text and the expanded or shortened URL (depending on how long the URL is), but the CCK field contains the expanded link and the text I want for the link text.
  • I then created a view to display the content field text and URL, but not the body, or title of the posting.
  • I copied the Twitter module and did a small tweak (tweak, twitter, tweet; my head just blowed up) so that it outputs the body of the post when I provide the !body label.
  • When a new linkstory is posted, the full link and link text get put into the sidebar, while the post body, containing the possibly shortened URL and any message, gets posted to my Laconica account.
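
The linkstory split described in the steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Drupal/CCK code: the `LinkStory` class, the `shorten` callable, and the 140-character limit as a parameter are all assumptions of mine.

```python
import html
from dataclasses import dataclass
from typing import Callable

TWEET_LIMIT = 140  # assumed status length limit


@dataclass
class LinkStory:
    tweet_text: str  # the message that goes out as a status
    link_url: str    # always the fully expanded URL
    link_text: str   # text shown for the link in the sidebar


def status_body(story: LinkStory, shorten: Callable[[str], str],
                limit: int = TWEET_LIMIT) -> str:
    """Use the expanded URL if it fits, otherwise a shortened one."""
    candidate = f"{story.tweet_text} {story.link_url}"
    if len(candidate) <= limit:
        return candidate
    # `shorten` stands in for whatever URL shortener is used
    return f"{story.tweet_text} {shorten(story.link_url)}"


def sidebar_link(story: LinkStory) -> str:
    """The sidebar always gets the expanded URL and the chosen link text."""
    href = html.escape(story.link_url, quote=True)
    return f'<a href="{href}">{html.escape(story.link_text)}</a>'
```

The point of keeping the two outputs separate is that the status may carry a shortened URL while the sidebar never does.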

The full workflow is: create a new linkstory or regular post in Drupal, which gets posted to my Laconica account via my modified Twitter module. Once an hour, these postings are picked up by rssdent, and posted to identi.ca. Posting on identi.ca automatically posts to my Twitter account.
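
For anyone curious what the feed-to-status step looks like, here is a rough Python sketch of the same idea. It is not rssdent's code. It assumes an RSS 2.0 feed with the text in each item's description, and it assumes the Twitter-compatible `statuses/update.xml` endpoint with a `status` parameter; authentication is left out, and the request is only built, not sent.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET


def unseen_items(feed_xml: str, seen: set) -> list:
    """Return (guid, body) pairs for feed items not yet relayed."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        # Fall back to the link when an item has no guid
        guid = item.findtext("guid") or item.findtext("link") or ""
        body = (item.findtext("description") or "").strip()
        if guid and body and guid not in seen:
            fresh.append((guid, body))
    return fresh


def build_update_request(api_base: str, status: str) -> urllib.request.Request:
    """Build (but do not send) a status update POST.

    Only the item body is submitted; no link back to the feed is added.
    """
    data = urllib.parse.urlencode({"status": status}).encode("utf-8")
    url = api_base.rstrip("/") + "/statuses/update.xml"
    return urllib.request.Request(url, data=data, method="POST")
```

A cron job would call `unseen_items` on the fetched feed, send one request per fresh item, and record each guid once its post succeeds.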

If Twitter goes offline, the posts still get made to identi.ca. If identi.ca goes offline, the post is still made to my Laconica account, and the fully expanded URL for the link is posted to my main web site. My rssdent application keeps trying, once an hour, to post to identi.ca, and hence to Twitter. My modification to the Twitter Drupal module was an addition, so I can tweet posts and links, alike.
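
The "keeps trying, once an hour" behavior comes down to holding on to anything that could not be delivered. A minimal Python sketch, assuming a `post` callable that raises `OSError` (the base class of `urllib.error.URLError`) when the service is unreachable; I am not claiming this is how rssdent does it.

```python
from typing import Callable, List


def flush_pending(pending: List[str], post: Callable[[str], None]) -> List[str]:
    """Try to deliver queued statuses in order; return what is still queued."""
    for index, status in enumerate(pending):
        try:
            post(status)
        except OSError:
            # Service is down: keep this item and everything after it,
            # in order, for the next hourly run
            return pending[index:]
    return []
```

Each hourly run would load the saved queue, append any new items, call `flush_pending`, and save whatever comes back.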

It sounds like a lot of work, but it was only about a day’s fun playing around. I plan on submitting my small Twitter Drupal module tweak as a patch, and hopefully it will be accepted. It only adds new functionality at the cost of one line of code. I’ll check in my fork of rssdent, but I need to figure out how github works. The Laconica installation didn’t require any modification, once I made the code corrections. These corrections should be incorporated into the original application, hopefully soon.

Now, this isn’t spamming. Everything gets posted to one place, though if people are subscribed to my Twitter and identi.ca accounts (or even my Laconica account), they’ll get an echo effect. This is just me grabbing hold of a little independence, while still partying with the communes. Setting my Bird free.

update I’m still getting familiar with the Twitter/Laconica API, but received a message via my identi.ca account from csarven about remote subscriptions. I can subscribe to identi.ca folks, as well as other Laconica sites, using the REST API. For a Laconica site, attach “?action=remoteSubscribe” to the URL, and you’ll get a page to enter the nickname of the person to whom you want to subscribe (at that site), and your remote profile, such as http://laconica.burningbird.net/shelleypowers. Or if you’re not logged into the system, just clicking the subscribe button next to the person’s avatar will open the Remote subscription page, automatically.
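
Building the remote subscription URL is simple enough to script. A small Python sketch; the function name is hypothetical, and the only thing taken from the above is the `action=remoteSubscribe` query parameter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def remote_subscribe_url(site_url: str) -> str:
    """Attach action=remoteSubscribe to a Laconica site URL."""
    parts = urlsplit(site_url)
    # Preserve any query parameters already on the URL
    query = dict(parse_qsl(parts.query))
    query["action"] = "remoteSubscribe"
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, `remote_subscribe_url("http://identi.ca/")` gives `http://identi.ca/?action=remoteSubscribe`.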

Once you enter the remote subscription request, you’re then taken back to your own site, where you have to accept the request. This prevents spamming. Once accepted, when you access your Home location, the postings from your remote friends will show up, in addition to postings from your friends who are local. You can also reply to the individual.

This functionality is also available for Twitter, built-in, but on my system, trying to use it caused errors. This is a known bug and a fix is currently being developed.

This is truly distributed, and decentralized, connectivity. You can’t take a system like this down, no more than you can take all email down, or all weblogs down. Way of the future.

Now, I must find out what other goodies are in the API…

Categories
Social Media

The Tweet stuff

From RealTech The Tweet Stuff: When it absolutely, positively has to get there:

If we’ve learned one thing from this week’s massive attack against the very fabric of our social connectivity, it’s that clouds don’t make the best stuff on which to build.

Twitter, in particular, has shown how very vulnerable it is, a vulnerability we share as we become more dependent on Twitter for most of our communication with each other. Oddly enough, I needed to contact someone about a business opportunity just as the Twitter universe began to crumble, but all I had was her Twitter name; I couldn’t find her email address. Since Twitter was down, I couldn’t connect up with her for hours.

I spent a happy day playing with code today. The end result is a new “tweet workflow” that could possibly survive the year 2012.

Categories
People Specs

I lock my door at night

I’m not sure why the WhatWG folks thought that keeping an open door to the Twitter @whatwg account on the front page of their web site is a good idea, but it’s been interesting watching the *updates. Most are pretty juvenile, but there’s been some interesting snark along the way. Mostly, the posts have been by people asking why the WhatWG thought this was a good idea, as WhatWG followers are dropping like flies, tired of being spammed.

Interesting, too, that the WhatWG members are posting on the WhatWG IRC that the openness of the Twitter account was by decision, not by accident. Bragging about it, actually. After all, only a few spam messages will get through. Of course, that was before someone posted a note to the WhatWG IRC about the openness, making people aware of the capability. Once the open door was found, that’s all she wrote. The only thing keeping some control on the postings is the Twitter API limits.

To me, deciding to keep the door open to the WhatWG Twitter account highlights some of the problems we’ve had with the WhatWG folks in regards to HTML5: they see the world as this perfect utopia, where everyone follows the rules. If people don’t follow the rules, then the rules must be changed, because obviously, there’s something wrong with the rules.

Case in point: the table summary attribute isn’t being used correctly, not because people make mistakes, but because it’s bad and has to be removed, before someone gets hurt! Of course, those who advocate for its removal totally disregard that the bad summary attributes are attached to equally bad HTML table uses, too. But that’s not the point!

RDFa is too hard for people to use, and must be replaced by microdata. Why? Because Google made an RDFa error when it rolled out its support recently. But, it wasn’t a Google mistake, according to Ian Hickson, it was something inherently wrong with RDFa. Google quickly corrected its error, but by that time, the damage was done: RDFa was shown to be a flawed system that needed to be replaced by something better. Why? Because people don’t make mistakes.

The same as people won’t spam an open Twitter account.

And now the folks on the WhatWG IRC are discussing the fact that those posting spammish messages to the WhatWG twitter account don’t understand the consequences of their actions. As jgraham wrote:

Lachy: I know that you can take some measures to cover your tracks, but in practice many people don’t bother and find that actions that they took believing that they would be free of consequences are not actually as anonymous and as free of consequences as they had assumed

In other words, never doubt your own judgment, when you can safely and easily find a way to dump any responsibility of your decisions on someone else.

The thing is, people learn from mistakes. A neighbor gets robbed, and we learn to lock our doors at night. People make mistakes with the summary attribute, or with RDFa, or any web technology, and we learn to provide better documentation. We are capable of learning from our past mistakes, and doing a better job. We even learn to shut down that open doorway to a Twitter account when it’s getting spammed.

There is no such thing as the perfect utopia where things can be made in such a way that no one is capable of making a mistake. All we can do is learn from the past, and do better in the future.

*Oh, and by the way? That “summary rules!” post was mine.

update

Evidently the WhatWG folks have finally realized that maybe an open door is a problem. Don’t expect an apology or acknowledgement, though:

factoryjoe: well, ok. some acknowledge might help people feel better that no more spam is forthcoming
Hixie: i expect that no more spam being forthcoming will make people feel better than spam telling them that no more spam is forthcoming

Categories
Burningbird

Who are these strange people

A favorite writer of mine is James Fallows at The Atlantic. He doesn’t have comments, but when I’ve sent him emails in the past, he always responds as quickly as he can, and in detail. He’s also been known to republish email comments, with permission, on his weblog—ensuring exposure to both new people and new ideas.

James has shown that one can be part of a community, without having comments.

I have been inundated with spam comments at this site, and now I’m getting spam replies and follows on Twitter. I am, at least for now, turning off comments. I am also, seriously, thinking of deleting my Twitter account, unless Twitter can find some way to control what is becoming an increasing, and even overwhelming, problem.

My warmest thanks to all who have taken the time to comment here and at my other sites. I value your friendship, your words, and your time. I hope that you’ll still email me, if for no other reason than to say hello, how’s it going.