Recovered from the Wayback Machine.
Phil Ringnalda points to the new Yahoo Creative Commons search engine and notices a problem: because the engine relies purely on links to CC licenses to find content that is supposedly CC licensed, there is going to be a lot of confusion about what is, and is not, actually under a CC license.
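To see why link-only detection over-claims, here is a minimal sketch of that kind of scan. It is not Yahoo's actual crawler, which isn't public; the `CC_PREFIX` pattern, the `naive_cc_licenses` helper, and the sample page are all assumptions made up for illustration.

```python
from html.parser import HTMLParser

# Assumed URL pattern for CC license deeds; a real crawler may match differently.
CC_PREFIX = "http://creativecommons.org/licenses/"


class LicenseLinkFinder(HTMLParser):
    """Collect every anchor href that points at a CC license deed."""

    def __init__(self):
        super().__init__()
        self.license_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.startswith(CC_PREFIX):
            self.license_links.append(href)


def naive_cc_licenses(html):
    """Hypothetical helper: return the license links found anywhere in the page."""
    finder = LicenseLinkFinder()
    finder.feed(html)
    return finder.license_links


# Hypothetical page: only the sidebar link mentions CC, yet the scan has no way
# to tell which of the three items the license was meant to cover.
page = """
<html><body>
<p>My own essay, all rights reserved.</p>
<img src="someone-elses-photo.jpg" alt="photo copied from another site">
<p>Sidebar: <a href="http://creativecommons.org/licenses/by-nc/2.0/">some rights reserved</a></p>
</body></html>
"""

print(naive_cc_licenses(page))
# ['http://creativecommons.org/licenses/by-nc/2.0/']
```

The scan returns a license for "the page", and nothing in its output says whether the essay, the photo, or only the sidebar was meant.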
An issue with CC has always been how to attach license information in such a way that automated processes can work with it. The solution has been to use RDF/XML embedded within HTML comments to indicate what is licensed on the page. However, this is kludgy and doesn’t validate within XHTML, so people are dropping it and just including the link to the specific license. What’s more, even when they do include the RDF/XML, they do so in such a way that it looks like everything in the page is under the specific license: HTML, writing, CSS, photos, whatever.
In other words, they take the rich possibilities inherent in RDF and dumb them down until they’re equivalent to the link.
Phil then pointed out that Yahoo releasing this search that just looks for links to the license in a document, and doing so without any legal disclaimers, warnings, or asides, is about the same as somebody accidentally putting a GPL license on the next version of Windows. Put bluntly, it’s a really dumb move:
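As a rough illustration of that dumbing down, the sketch below pulls commented-out RDF/XML from a page and lists which resource each license is attached to. The sample block follows the general shape of the CC-generated markup of the time; the `licensed_works` helper and the sample page are hypothetical. An empty `rdf:about=""` resolves to the document itself, so the statement covers the whole page.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
CC = "{http://web.resource.org/cc/}"


class CommentCollector(HTMLParser):
    """Gather the text of every HTML comment in a page."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        self.comments.append(data)


def licensed_works(html):
    """Hypothetical helper: (work_uri, license_uri) pairs from commented-out RDF/XML."""
    collector = CommentCollector()
    collector.feed(html)
    pairs = []
    for comment in collector.comments:
        if "<rdf:RDF" not in comment:
            continue  # an ordinary comment, not embedded metadata
        root = ET.fromstring(comment.strip())
        for work in root.iter(CC + "Work"):
            about = work.get(RDF + "about")
            for lic in work.iter(CC + "license"):
                pairs.append((about, lic.get(RDF + "resource")))
    return pairs


# Hypothetical page with the usual whole-page license block.
page = """<html><body>
<p>Essay, photos, stylesheet, everything.</p>
<!--
<rdf:RDF xmlns="http://web.resource.org/cc/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Work rdf:about="">
    <license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0/" />
  </Work>
</rdf:RDF>
-->
</body></html>"""

print(licensed_works(page))
# [('', 'http://creativecommons.org/licenses/by-nc/2.0/')]
```

The only subject in the result is the empty URI, meaning "this document": exactly as much information as a bare link to the license.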
But if I was the Yahoo! lawyer who vetted their Creative Commons search, and let it loose without any disclaimer that “Yahoo! makes no assertion about what, if any, content in these results is actually offered under a Creative Commons license” I’d be hanging my head in shame.
To make matters worse, in the associated FAQ for the new search is the following:
This search engine helps you quickly find those authors and the work they have marked as free to use with only “some rights reserved.” If you respect the rights they have reserved (which will be clearly marked, as you’ll see) then you can use the work without having to contact them and ask. In some cases, you may even find work in the public domain — that is, free for any use with “no rights reserved.”
Yup. I think this is a case for the new Corante legal weblog.
I tried the search with my weblog’s name, and found one interesting result: the bbintroducingtagback tagback in Technorati. It seems that Technorati has linked to one of the CC licenses that allows non-commercial use. But used in the way it is, it implies that all the material in the page is licensed this way. Wait a second, though: that’s my photo in the page, pulled in from Technorati via flickr. I don’t license my work as CC. The licensing is still too damn vague, and usually applied badly (as we’re seeing now).
Phil calls this accidental, by-link-association form of CC licensing viral, and viral it is indeed: through bad implementations of a vague license, I may, by allowing my photo to be copied (while holding all rights), have lost rights to that photo by implication and effect. At a minimum, the information about who holds the copyright on the photo has been lost as it filters through both the Technorati tag page and the search engine results.
I’ve been in a discussion at Practical RDF with Mike Linksvayer (who is on the staff at CC) about the CC license and the issue of how to record more specific information. I brought up the lack of precision in the licensing, and Mike mentioned that one approach CC is looking at is to use, again, the ‘rel’ attribute as a way of marking metadata. But this can only go so far; it’s really not much more than linking to the license and assuming this implies usage.
(And, frankly, our use of ‘rel’ is becoming a bit of a stretch–we’re trying to stuff all the meaning in the internet in one little bitty attribute.)
The approach I’m using for complex metadata (which is what CC is) in Wordform is to generate a separate RDF/XML feed that explicitly states which elements within a page are licensed, which aren’t, and exactly how each licensed element can be used (among other metadata). I link to this feed through a LINK element in the header, as many of you do with auto-discovery of feeds right now. However, Mike’s response to this was:
A separate RDF file is a nonstarter for CC. After selecting a license a user gets a block of HTML to put in their web page. That block happens to include RDF (unfortunately embedded in comments). Users don’t have to know or think about metadata. If we need to explain to them that you need to create a separate file, link to it in the head of the document, and by the way the separate file needs to contain an explicit URI in rdf:about … forget about it.
But if we don’t explain to people how all this works, and provide a way for folks to be more precise, problems like the Yahoo CC search and the Technorati tag page are going to continue. By ‘protecting’ people from the technology, we are, in effect, doing more to harm them than help them.
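To make "more precise" concrete, here is a minimal sketch of what a separate metadata file with explicit URIs could look like, and how a tool might read it. This is not Wordform's actual output: the example.com URIs, the `rel="meta"` link in the comment, the use of `dc:rights` for unlicensed items, and the `rights_by_resource` helper are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
CC = "{http://web.resource.org/cc/}"
DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical metadata file, discovered from the page header through something like
# <link rel="meta" type="application/rdf+xml" href="/metadata/post-42.rdf">
METADATA = """<rdf:RDF xmlns="http://web.resource.org/cc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Work rdf:about="http://example.com/post-42#text">
    <license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0/" />
  </Work>
  <Work rdf:about="http://example.com/photos/heron.jpg">
    <dc:rights>All rights reserved</dc:rights>
  </Work>
</rdf:RDF>"""


def rights_by_resource(rdf_xml):
    """Hypothetical helper: map each explicitly named resource to its license or rights text."""
    root = ET.fromstring(rdf_xml)
    rights = {}
    for work in root.iter(CC + "Work"):
        uri = work.get(RDF + "about")
        lic = work.find(CC + "license")
        if lic is not None:
            rights[uri] = lic.get(RDF + "resource")
        else:
            # No license statement: fall back to any plain rights text.
            rights[uri] = work.findtext(DC + "rights", default="unspecified")
    return rights


for uri, right in rights_by_resource(METADATA).items():
    print(uri, "->", right)
```

Because each statement names its subject, a search engine reading this file could tell that the writing is CC licensed and the photo is not, which is the distinction a bare link cannot carry.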
What we should be doing is providing the tools to allow people to use rich metadata, richly; not make assumptions that “people can’t deal with it” and then dumb it down accordingly. We should be helping people understand how to use something like the CC license wisely and effectively–using clear, non-technical language to explain how all the bits work–not depend on technology to somehow ‘guess’ what a person wants and act accordingly.
Because as we’ve seen, technology almost invariably guesses wrong.