Categories
HTML5 Photography

Pack of pictures and other stuff

I’ve put together a package of photos I took earlier this year. They include photos of places around town, flowers, chimps, and other critters. This is the package of photos I’m currently using for my screen saver, so I thought I’d put it online. I don’t guarantee you’ll like any of the pictures, but if you don’t, the most it will cost you is the download time. Note that the file is 17.3MB, so I hope you have broadband. If you want to look at the photos online, they’re all at MissouriGreen.

I’m in the process of butting into the HTML5 effort with regard to RDFa. You can read the history of this effort at the HTML WG list. I’m taking the use cases written by the HTML5 editor, Ian Hickson, along with his original raw material, and mapping the two. I’m also adding in my own use cases. In the effort to make the use cases “implementation free”, I think that the detail and the complexity of the original use cases were reduced too drastically. You can see what I mean in my first use case, and I will have the same for the others by Monday.

Will this make a difference? I haven’t a clue. Probably not. I’m sure that neither the HTML5 group nor the RDFa group appreciates my particular style of “contributing”, but I decided to follow Sam Ruby’s advice to “put up or shut up” when it comes to HTML5. I’m just going to put up or shut up in my way.

In the meantime, I need to return to my book, which also means that I will be tearing apart my sites as part of my research. I don’t expect to be twittering much, or writing to the weblog, either, in the next few months. I need to focus on the book, and other writing/work for income. I’m also really burned out and very tired, and feeling under the weather lately, and have a need to disconnect from the social hive. Emails always welcome, but I just don’t feel like “broadcasting”.

If you do access any one of the sites at any point in time and find them either not working, or working oddly, no worries, this is just me experimenting, researching, documenting, and writing. Hopefully by the time my book is done, I’ll be more up for writing to my web sites, and they’ll be all settled down and behaving.

If you do follow along with my RDFa use case efforts, I hope you’ll make comments at the HTML WG, as that’s the appropriate place to have a discussion. However, I will also open up comments for a week, in case you just want to make more casual remarks here. Or you can just ignore the whole thing, which is also a good option.

Categories
HTML5 W3C

Annotation

(This document is part of an effort to flesh out use cases for microdata inclusion in HTML5. See the original use case document, and the background material document as well as the email correspondence that best describes this process.)

————–

USE CASE: Allow authors to annotate their documents to highlight the key
parts, e.g. as when a student highlights parts of a printed page, but in a
hypertext-aware fashion.

SCENARIOS:

* Fred writes a page about Napoleon. He can highlight the word Napoleon
in a way that indicates to the reader that that is a person. Fred can
also annotate the page to indicate that Napoleon and France are
related concepts.

—————

Ian has already provided his summary of this use case in the WHATWG mailing list. His summary:

This use case isn’t altogether clear, but if the target audience of the
annotations is human readers (as opposed to machines and readers using
automated processing tools), then it seems like this is already possible
in a number of ways in HTML5.

In conclusion, this use case doesn’t seem to need any new changes to the
language.

This use case was submitted by Kingsley Idehen, who said considerably more than was entered into the summary use case. Kingsley wrote:

When writing HTML (by hand or indirectly via a program) I want to
isolate at describe what the content is about in terms of people,
places, and other real-world things. I want to isolate “Napoleon” from a
paragraph or heading, and state that the aforementioned entity is: is
of type “Person” and he is associated with another entity “France”.

The use-case above is like taking a highlighter and making notes while
reading about “Napoleon”. This is what we all do when studying, but when
we were kids, we never actually shared that part of our endeavors since
it was typically the route to competitive advantage i.e., being top
student in the class.

What I state above is antithetical to the essence of the World Wide Web,
as vital infrastructure harnessing collective intelligence.

RDFa is about the ability to share what never used to be shared. It
provides a simple HTML friendly mechanism that enables Web Users or
Developers to describe things using the Entity-Attribute-Value approach
(or Subject, Predicate, Object) without the tedium associated with
RDF/XML (one of the other methods of making statements for the
underlying graph model that is RDF).

This use case could have used some more discussion between Ian and Kingsley, because, in my opinion, Ian’s interpretation doesn’t match what Kingsley wrote.

Kingsley wrote about annotating the information within the publication, as one would use a highlighter, but he didn’t mean that this information actually has to be highlighted and made visible to the person reading the text. I believe he meant that the annotation would be visible to processes, which could then make it available either to the individual who made the annotation (most likely at a later time, as notes) or perhaps to others when aggregated (the latter is my own interpretation).

The question, then, is this: is there a mechanism currently in HTML5 with which one can annotate the data within a writing in a non-visible manner, and which can then be used to make an assertion, such as Napoleon is the name of a person, and the person Napoleon is related to another entity, this one named France (which is the name of a country, and so on)?

So, let me take another try at this use case:

Within a writing published on the web, I want to add annotation into the text to highlight specific facts, but I don't want such highlighting to distract from the text, so I don't want it to be visible. An example of the type of annotation I may make is to highlight the word "Napoleon" and annotate this word with an assertion that Napoleon is a person, and to add further information, that the person, Napoleon, is related to France (a country).

I write on many topics, and so I may make use of several different vocabularies in order to perform my annotation. In addition, I may have to create my own vocabulary if the annotation I want to make doesn't match any of the known and previously published vocabularies. If I do, I'll do so in such a way that there can't be a possible conflict with any other vocabulary.

Once my text is documented, I want to be able to access this annotation at a later time, separate from the document. To do this, I'll process each of my writings with an application that will pull out this specialized annotation, for aggregation and later query. In addition, by using a standard metadata annotation technique and model, the data can also be accessed by search engines, making the data also available to others.
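As a rough illustration of the revised use case, here is a minimal Python sketch of the "application that will pull out this specialized annotation". It is not a conforming RDFa processor. It reads only a handful of RDFa-style attributes (about, typeof, property, rel, resource), and the ex: prefix, the #napoleon and #france identifiers, and the sample markup are all made up for the example. It also assumes every element in the snippet is explicitly closed.

```python
from html.parser import HTMLParser

# Sample markup: the annotations live in attributes, so nothing extra is
# shown to the reader. The "ex:" vocabulary and the "#..." identifiers are
# hypothetical and exist only for this illustration.
SNIPPET = """
<p>Fred writes about
  <span about="#napoleon" typeof="ex:Person">
    <span property="ex:name">Napoleon</span>, who ruled
    <span rel="ex:relatedTo" resource="#france">France</span>.
  </span>
</p>
"""


class AnnotationExtractor(HTMLParser):
    """Collects (subject, predicate, object) triples from a small
    subset of RDFa-style attributes. A sketch, not a full parser."""

    def __init__(self):
        super().__init__()
        self.triples = []
        # One frame per open element: [subject, property name, text parts]
        self._stack = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        inherited = self._stack[-1][0] if self._stack else None
        subject = a.get("about", inherited)
        if subject is not None and "typeof" in a:
            self.triples.append((subject, "rdf:type", a["typeof"]))
        if subject is not None and "rel" in a and "resource" in a:
            self.triples.append((subject, a["rel"], a["resource"]))
        self._stack.append([subject, a.get("property"), []])

    def handle_data(self, data):
        # Text counts toward every open element that carries a property.
        for frame in self._stack:
            if frame[1] is not None:
                frame[2].append(data)

    def handle_endtag(self, tag):
        if not self._stack:
            return
        subject, prop, parts = self._stack.pop()
        if subject is not None and prop is not None:
            self.triples.append((subject, prop, "".join(parts).strip()))


def extract_annotations(markup):
    """Return the list of triples found in the markup."""
    parser = AnnotationExtractor()
    parser.feed(markup)
    parser.close()
    return parser.triples


for triple in extract_annotations(SNIPPET):
    print(triple)
```

The extracted triples could then be stored and queried separately from the document, which is the aggregation step the use case describes.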

It would help to get concurrence from Kingsley as to the accuracy of my assessment, but I do feel comfortable that my use case is a closer approximation to what Kingsley meant. If this is so, Ian’s conclusion about this use case, that it requires no change to HTML5, could be in error.

Categories
Semantics

Arbitrary Vocabularies and Other Crufty Stuff

I went dumpster diving into the microformats IRC channel and found the following:

singpolyma – Hixie: that’s the whole point… if you don’t have a defined vocabulary, you end up with something useless like RDF or XML, etc
@tantek – exactly
Hixie – folks who have driven the design of XML and RDF had “write a generic parser” as their #1 priority
@tantek – The key piece of wisdom here is that defined vocabularies are actually where you get *user* value in the real world of data generated/created by humans, and consumed eventually by humans.
Hixie – i’m not talking about this being a #1 priority though — in the case of the guy i mentioned earlier, it was like #4 or #5
Hixie – but it was still a reason he was displeased with microformats
@tantek – Hixie – ironically, people have written more than one generic parser for microformats, despite that not being a priority in the design
Hixie – url?
@tantek – mofo, optimus
@tantek – http://microformats.org/wiki/parsers
@tantek – not exactly hard to find
@tantek – it’s ok that writing a generic parser is hard, because not many people have to write one
Hixie – optimus requires updating every time you want to use a new vocabulary, though, right
@tantek – OTOH it is NOT ok to make writing / marking up content hard, because nearly far more people (perhaps 100k x more) have to write / mark up content.
Hixie – yes, writing content should be easy, that’s clear
Hixie – ideally it should be even easier than it is with microformats 🙂
singpolyma – Of course you have to update every time there’s a new vocabulary… microformats are *exclusively* vocabularies
Hixie – there seems to be a lot of demand for a technology that’s as easy to write as microformats (or even easier), but which lets people write tools that consume arbitrary vocabularies much more easily than is possible with text/html / POSH / Microformats today
singpolyma – Hixie: isn’t that what RDFa and the other cruft is about?
Hixie – RDFa is a disaster insofar as “easy to write as microformats” goes
singpolyma – Not that I agree arbitrary vocabularies can be used for anything…
Hixie – and it’s not particularly great to parse either

Hixie – is it ok if html5 addresses some of the use cases that _are_ asking for those things, in a way that reuses the vocabularies developed by Microformats?

Well, no one is surprised to see such a discussion about RDFa in relation to HTML5. I don’t think anyone seriously believed that RDFa had a chance of being incorporated into HTML5. Most of us have resigned ourselves to no longer supporting the concept of “valid” markup as we go forward. Instead, we’ll continue to use bits of HTML5, and bits of XHTML 1.0, RDFa, and so on.

But I am surprised to read a data person write something like, “if you don’t have a defined vocabulary, you end up with something useless like RDF or XML”. I’m surprised because one can add SQL to the list of useless things you end up with if you don’t have defined vocabularies, and I don’t think anyone disputes the usefulness of SQL or the relational data model, a model specifically defined to allow arbitrary vocabularies.
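To make the point about the relational model concrete, here is a small Python sketch using the standard sqlite3 module. The table and column names (person, photo, related_to, megabytes) are invented for the example. The describe function is a hypothetical helper, not part of any library. It knows nothing about either "vocabulary" in advance, yet it can read both.

```python
import sqlite3


def describe(conn, table):
    """Generic reader: return every row of any table as a dict,
    discovering the column names (the 'vocabulary') at run time."""
    # The table name is interpolated only because this is a sketch with
    # trusted input; identifiers cannot be bound as SQL parameters.
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


conn = sqlite3.connect(":memory:")

# Two unrelated, arbitrary vocabularies defined by whoever needs them.
conn.execute("CREATE TABLE person (name TEXT, related_to TEXT)")
conn.execute("CREATE TABLE photo (title TEXT, megabytes REAL)")
conn.execute("INSERT INTO person VALUES ('Napoleon', 'France')")
conn.execute("INSERT INTO photo VALUES ('Chimp', 1.5)")

people = describe(conn, "person")
photos = describe(conn, "photo")
print(people)
print(photos)
```

No committee had to approve either table before the generic code could work with it.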

As for XML, my own experiences with formatting for eBooks have shown how universally useful XML and XHTML can be, as I am able to produce book pages from web pages with only some specialized formatting. And we don’t have to form committees and get buy-off every time we create a new use for XML or XHTML, just as we don’t have to get some standards organization to give an official okee dokee to another CMS database, such as the databases underlying Drupal or WordPress.

And this openness applies to programming languages, too. There have been system-specific programming languages in the past, but the widely used programming languages are ones that can be used to create any number of arbitrary applications. PHP can be used for Drupal, yes, but it can also be used for Gallery, and eCommerce, and who knows what else—there’s no limiting its use.

Heck, HTML has been used to create web pages for weblogs, online stores, and gaming, all without having to define a new “vocabulary” of markup for each. Come to think of it, Drupal modules and WordPress plug-ins, and widgets and browser extensions are all based on some form of open infrastructure. So are REST and all of the other web service technologies.

In fact, one can go so far as to say that the entire computing infrastructure, including the internet, is based on open systems allowing arbitrary uses, whether the uses are a new vocabulary, or a new application, or both.

Unfortunately, too many people who really don’t know data are making too many decisions about how data will be represented in the web of the future. Luckily for us, browser developers have gotten into the habit of more or less ignoring anything unknown that’s inserted into a web page, especially one in XHTML. So the web will continue to be open and extensible. And we, the makers of the next generation of the web, can continue our innovations, uninhibited by those who want to fence our space in.

Categories
Writing

Whiteness

I don’t know if I’m the only one seeing a white page on the site, but since the upgrade to 6.11 in Drupal, I’ve had problems accessing all my sites. The problem could also be my hosting, and I’m currently exploring the possibility of moving. However, the problem has become much, much worse with the 6.11 upgrade. If you’ve had problems accessing the site, let me know.

I now have seven Drupal installations, though two are “stealth”. One I’m using to write my new book. I stripped away all styling and then designed a Drupal theme that supports ePub. I’ll be adding a second theme that supports Mobi/Amazon, and possibly a third that supports a PDF book. One of the advantages of being comfortable with XHTML is that you can take your mad XHTML markup skillz to the eBook world with only a little effort. Once I’ve published the book, and know the themes are working 100%, I’ll upload them to the Drupal theme site for people who want to use Drupal to write eBooks.

I will say that self-publishing is a different world now. There are so many resources. One wall I hit, though, was getting an ISBN. I could swear these were free at one time, but now, ISBNs in the US have been “contracted out” from the government to a privately owned monopoly.

You don’t need an ISBN for an eBook, though some sellers prefer one. But if you’re going hard copy as well as eBook, which I am, you’ll have to have one. You can also “borrow” an ISBN from some distribution companies, but they don’t recommend this approach, because you’re then stuck with them as publisher. You can also buy a single ISBN, but it’s a lot cheaper just to buy a block of ten, and then if you need a new ISBN for another edition, or a new book, you have it.

It’s just that having to buy an ISBN wasn’t a cost I was expecting. Again, these are free throughout the world. Only in America do we contract what should be universally accessible to monopolies. How else to explain our cable systems?

Regardless of the unexpected expenses, there’s something very rich, and satisfying, about having some control in all aspects of my book. O’Reilly is a good publisher, and the company has been generous with me, but I’ve always felt out of the loop with my books. For instance, I didn’t know my books were going to be published to the Kindle until after the fact. I didn’t know they were all being released as DRM free ebooks on the Kindle until after the fact. I’m happy about the books being offered DRM free, but I sure would have appreciated a quick note beforehand.

(Not to mention having some say in the cover, formatting, and subtitles…)

No, the success or failure of a self-published book is really dependent on the author. This is both scary, and wonderful.

Categories
SVG XHTML/HTML

Whipping boy

I noticed a passing Twitter message from Laura Scott. It said, “One word: standards. Firefox follows w3c standards. Internet Explorer does not.” She wrote it in response to another Twitter message from tutu4lu, who was having problems with a web page appearing differently in IE than in Firefox.

It is true that Firefox implements more standards than IE, especially when it comes to some of my favorites, such as SVG. And I appreciate the fact.

Firefox does not necessarily get an A+ for all of its effort, though. In particular, if Microsoft’s lack of implementation of XHTML has been one force against broader implementation of XHTML at web sites, Firefox’s own handling of XML errors in XHTML is another, more subtle force against XHTML.

Here’s an example. I added an unescaped ampersand (&) to a URL in one of my posts, which makes the page malformed XML and generates an XHTML error. The following are three screen shots from Chrome, Opera, and Safari, respectively, that demonstrate how they handle the error:

XHTML error in Chrome
Opera XHTML error
Safari error

Safari and Chrome are both built on WebKit, which handles XHTML errors by parsing, and rendering, the document up to the error. This has the advantage of providing some content, as well as making it quicker to find the error when you’re debugging.

Opera doesn’t render the document, but it does provide a display of the source with highlighting where the error occurs. This is extremely helpful when you’re debugging a larger document. In addition, Opera also provides an option to render the document in HTML, rather than XHTML, which is helpful for everyone else.

Contrast and compare these screenshots with the following, from Firefox.

Firefox error handling

The Firefox XHTML error handling is also known as YSOD, or Yellow Screen of Death. It’s harsh, abrupt, and somewhat punishing in nature, with its sickly yellow background and bright red text. The message is typically cut off by the edge of the browser window, so one can’t easily see where the error has occurred. It’s most definitely intimidating for readers who accidentally stumble onto an XHTML page currently in a broken state.

All four of the browsers do support the XHTML standard, and all stop processing the XHTML when an error occurs, as is proper. But where Safari/WebKit, Chrome/WebKit, and Opera try to provide a useful web page, Firefox picks up a ruler and gives the owner of the web site a good whacking.
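The underlying error is easy to reproduce outside a browser. The following minimal Python sketch uses the standard xml.etree.ElementTree module to show that a strict XML parser rejects a raw ampersand in a URL and accepts the escaped form. The markup and the example.com URL are invented for illustration, not taken from the post that broke, and check_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment with a raw ampersand in the query string.
BROKEN = (
    '<p xmlns="http://www.w3.org/1999/xhtml">'
    '<a href="http://example.com/?a=1&b=2">link</a></p>'
)

# The same fragment with the ampersand escaped as &amp;
FIXED = BROKEN.replace("&b", "&amp;b")


def check_well_formed(markup):
    """Return (True, None) if the markup parses as XML, otherwise
    (False, (line, column)) giving where the parser stopped."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return False, err.position
    return True, None


print(check_well_formed(BROKEN))
print(check_well_formed(FIXED))
```

Like the browsers, the parser stops at the first error. What differs from browser to browser is only what is shown to the reader afterwards.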

It’s easy to fall into the trap of blaming all web development and design problems on Microsoft and IE, and to use IE as a whipping boy—to the exclusion of looking, critically, at the other browsers in the web space. If the lack of support for XHTML in IE is a primary inhibitor of the spread of XHTML, Firefox’s YSOD has to take the second place prize. Support for XHTML doesn’t end at the parser.