Update: Yahoo search

I had assumed that Yahoo Search was using the RDF/XML embedded with the CC license information to build its search results. Mike Linksvayer, though, was kind enough to clarify in comments that the company is using only the CC license links to capture this information.

This is disappointing, as I feel that there is more about the CC licensed objects that Yahoo could provide and doesn’t because it’s only after the links. That’s about the same as running a mine for rubies and tossing aside the diamonds you find.

Mike also mentioned the use of RDF-A to bypass problems with embedded RDF/XML. Trying to define yet another new syntax when there’s an option already available doesn’t make sense. The RDF/XML Syntax document stresses the use of <LINK> for linking to a separate RDF/XML document with whatever metadata is defined for the resource. This is a good approach, and I’m not sure why folks are resistant to it. It’s not as if the extra documents will take up a lot of space; for dynamic systems, such as many of the ones we’re using today for weblogging, commerce, and so on, the document can be generated on demand.

A scenario for use with CC could be that when the CC license is generated, the person is told to create a file and copy in the generated RDF/XML, and then to take the generated LINK and add it to the header of the page. If they also want to add an icon and a link to the license in human-readable format, they copy that link and put it into the page.
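To show what the consuming side of this scenario might look like, here is a minimal Python sketch that pulls RDF/XML links out of a page header. The page content, the file path, and the rel="meta" value are my own illustrative assumptions; the sketch matches only on the application/rdf+xml type, and nothing here is taken from what CC or Yahoo actually do.

```python
from html.parser import HTMLParser


class RdfLinkFinder(HTMLParser):
    """Collect href values of <link> elements that advertise RDF/XML."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <LINK> works too.
        if tag != "link":
            return
        attributes = dict(attrs)
        media_type = (attributes.get("type") or "").lower()
        if media_type == "application/rdf+xml" and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def find_rdf_links(html_text):
    """Return the hrefs of all RDF/XML <link> elements in the markup."""
    finder = RdfLinkFinder()
    finder.feed(html_text)
    return finder.hrefs


# Hypothetical page: the path and rel value are made up for illustration.
PAGE = """<html><head>
<title>Example post</title>
<link rel="meta" type="application/rdf+xml" href="/metadata/example-post.rdf">
<link rel="stylesheet" type="text/css" href="/style.css">
</head><body>...</body></html>"""

print(find_rdf_links(PAGE))
```

A bot that already fetches the page to look for license links would only need one more request to get the full metadata document.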

Is this that much more complicated for people? Yes and no.

No, in that people who host their own sites could probably do this without much problem, especially if tools start providing ways of editing pages on the site. However, for hosted sites, this is a problem, and it will continue to be a limitation of these types of sites. Now, a smart hosted site will be one that eventually gets that it needs to provide some mechanism to allow for this type of activity. But until then, yes, this is a limitation.

But CC could solve this for the hosted sites by hosting the license files itself and giving the person the link to the file to put into their document. Even with a weblogging tool, you could do this just by embedding a tag for the individual file name as the name of the metadata file in the header.

Eventually, we need ways of merging data for many uses into these pages. One way would be to provide the RDF/XML document URI to these tools, and the tools would then read in the existing RDF/XML and add the additional statements. Another would be for tools to provide a way of reading in a block of RDF/XML, pulling out the individual statements, and then merging them into those that already exist.
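As a rough illustration of the second approach, the sketch below reads two blocks of RDF/XML, pulls out the statements, and merges them with a plain set union. It is deliberately limited: it assumes simple rdf:Description elements with an rdf:about attribute and flat property children, which is only a small subset of RDF/XML, and it does not distinguish literals from resource URIs. A real tool would use a full RDF parser. The example.org URIs and the sample documents are made up.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def expand(tag):
    """Turn ElementTree's "{namespace}local" form into a full URI."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace + local
    return tag


def simple_statements(rdf_xml):
    """Return a set of (subject, predicate, object) tuples.

    Only handles rdf:Description elements that carry rdf:about and whose
    children are flat property elements (a literal or an rdf:resource).
    """
    statements = set()
    root = ET.fromstring(rdf_xml)
    for node in root.iter("{%s}Description" % RDF_NS):
        subject = node.get("{%s}about" % RDF_NS)
        if subject is None:
            continue
        for prop in node:
            resource = prop.get("{%s}resource" % RDF_NS)
            value = resource if resource is not None else (prop.text or "").strip()
            statements.add((subject, expand(prop.tag), value))
    return statements


# Illustrative documents; the subject URI is hypothetical.
LICENSE_DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cc="http://web.resource.org/cc/">
  <rdf:Description rdf:about="http://example.org/post/1">
    <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/"/>
  </rdf:Description>
</rdf:RDF>"""

EXTRA_DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:cc="http://web.resource.org/cc/">
  <rdf:Description rdf:about="http://example.org/post/1">
    <dc:title>Example post</dc:title>
    <cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/"/>
  </rdf:Description>
</rdf:RDF>"""

# Statements are plain tuples, so merging is a set union and the
# duplicate license statement collapses into one.
merged = simple_statements(LICENSE_DOC) | simple_statements(EXTRA_DOC)
for statement in sorted(merged):
    print(statement)
```

Because each predicate expands to a full URI, a title from one vocabulary can't be mistaken for a title from another when the two sets are combined.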

There’s code everywhere to do this type of data merging, and best of all: it’s RDF/XML, which means namespaces keep the vocabularies apart and you don’t have to worry about name collisions.

All we would need, then, is nice search bots that grab this and pull all the info into a nicely consumable spot, with an API that returns individual data query results, or RDF/XML.
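To make the "individual data query results" idea a little more concrete, here is a small sketch of a pattern query over statements a bot might have collected. Everything in it is hypothetical: the query function, the sample data, and the example.org URIs are mine, not anything Yahoo offers.

```python
CC_LICENSE = "http://web.resource.org/cc/license"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
BY = "http://creativecommons.org/licenses/by/2.0/"
BY_NC = "http://creativecommons.org/licenses/by-nc/2.0/"

# Made-up statements standing in for what a crawler might have gathered.
STATEMENTS = {
    ("http://example.org/post/1", CC_LICENSE, BY),
    ("http://example.org/post/1", DC_TITLE, "Example post"),
    ("http://example.org/photo/7", CC_LICENSE, BY),
    ("http://example.org/song/3", CC_LICENSE, BY_NC),
}


def query(statements, subject=None, predicate=None, obj=None):
    """Return matching statements, sorted; None acts as a wildcard."""
    return sorted(
        s for s in statements
        if (subject is None or s[0] == subject)
        and (predicate is None or s[1] == predicate)
        and (obj is None or s[2] == obj)
    )


# Everything licensed under plain Attribution:
for subject, _, _ in query(STATEMENTS, predicate=CC_LICENSE, obj=BY):
    print(subject)

# Everything known about one resource, not just its license:
for statement in query(STATEMENTS, subject="http://example.org/post/1"):
    print(statement)
```

The second query is the part a links-only crawl can't answer, since it has the license but none of the other statements about the work.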

Yahoo! Yahoo! *knock knock knock* Opportunity’s knocking. Don’t blow it.