Categories
Stuff

Dances with Geese

Yesterday was a gray day with mixed rain and snow and slushy grounds and the threat of ice on the road. However, I’ve made a pact with myself to get out and walk every day, regardless of the weather–around the neighborhood if I absolutely must, though I hate walking through city streets, and on cement.

Walking lately has gone beyond being just something I enjoy to become both my medicine and my salvation. There is an invisible barrier on my front door that reads, “In case of assholes, stress, or malaise, break”, and I break it daily. If the weather is good and it’s been dry for a couple of days, I head for a real hike, or to find another mill or other bit of old stuff to photograph. However, if the weather isn’t so great, I just head to one of the usual places–Shaw, Powder, the Botanical Gardens, Tower, or others close in and familiar. On these close in hikes in bad weather, I don’t take my camera, but I sometimes take my radio headset, which is a rather quaint thing that fits over my ears, has a radio tuner and a little antenna that sticks up, making me look like a Martian. It isn’t sexy, and doesn’t have white ear buds, but I’m partial to the old thing and wouldn’t think of replacing it.

Not long ago Dave Rogers wrote about how we define our own worlds, sinking further into ourselves rather than paying attention to our surroundings and each other. I thought about this yesterday when listening to my favorite oldies station as I walked along the road at Shaw. It’s a rare event indeed when there is no one at this park and with the paths so mucky, I decided to walk the road that circles the lake. I was really enjoying having the place to myself and the music, as the station plays really great music on Saturdays with hardly any commercial interruptions. In fact, after a while I was walking less and dancing more, and it was when I was listening to Dusty Springfield’s Son of a Preacher Man that I decided to get off the road and into the snowy/slushy fields and do a little dancing among the geese. Though they seemed a bit confused by my actions, they eventually went back to their eating, ignoring this odd human gyrating in their midst.

Now, while I was prancing about, using my hiking stick to pretend to hit sloppy ice balls into the lake, folks may have driven by, pausing to view the crazy woman before moving, most likely hurriedly, along, but I didn’t know because I wouldn’t be able to hear them over Preacher Man, Wild Thing, or the other tunes that followed. Which I guess goes to show that sometimes being in our own worlds can be marvelously liberating.

I’ve also been thinking of getting a hand held digital voice recorder to take with me on my trips to record experiences on days such as this, when I’m full of the intoxication of living the life of the village idiot. Especially after listening to Chris aka Stavros the Wonderchicken’s recording of one of his posts this week. Who would have known that the man has such a wonderful, smoky, sexy voice as that? What made it even more special was the Canadian accent. Sort of Hemingway and Nanook of the North, collapsed into one irresistible package. Seriously, if more people podcast in this manner, I may have to consider getting one of those plastic white things with the silly ear buds.

One song I had hoped they would play yesterday is Melanie’s Brand New Key. Remember that? If you do, then you’re most likely older than dirt, as I tested out to be at Ken Camp’s History Quiz. I had a terrific time with this quiz, but the reference in it to roller skate key reminded me when skates were off foot rather than inline, which led to this great, great song.

I rode my bicycle past your window last night
I roller skated to your door at daylight
It almost seems like you’re avoiding me
I’m okay alone, but you got something I need

Well, I got a brand new pair of roller skates
You got a brand new key
I think that we should get together and try them out you see
I been looking around awhile
You got something for me
Oh! I got a brand new pair of roller skates
You got a brand new key

I ride my bike, I roller skate, don’t drive no car
Don’t go too fast, but I go pretty far
For somebody who don’t drive
I been all around the world
Some people say, I done all right for a girl

Well, I got a brand new pair of roller skates
You got a brand new key
I think that we should get together and try them out you see
I been looking around awhile
You got something for me
Oh! I got a brand new pair of roller skates
You got a brand new key

I asked your mother if you were at home
She said, yes .. but you weren’t alone
Oh, sometimes I think that you’re avoiding me
I’m okay alone, but you’ve got something I need

Well, I got a brand new pair of roller skates
You got a brand new key
I think that we should get together and try them out to see
La la la la la la la la, la la la la la la
Oh! I got a brand new pair of roller skates
You got a brand new key.

Today’s lyrics with their explicit references to sex and swearing are so dull compared to the playful metaphors and lyrical winking embedded in songs such as this. Yeah, I got a pair of roller skates, boy, and you got a new key. I looked for this song at iTunes, but of course it didn’t have it. None of the online music stores had it, either in digital form or on CD. However, I did find a wav of the song online in a page with other old classics that tried to install spyware, but I grabbed the song and ran. I am so bad.

With music like that rolling around you, how can you not dance in the fields with the geese? What think, Jeneane? Would you go to a conference that played Brand New Key as anthem? Seems a good song for folks in the tail. And it sure beats looking at monkey bottoms or dreaming about RSS. Though come to think on it, I did dream about a pancreas last night. A great big, 3 story tall bright orange-pink pancreas that was standing on end in an art warehouse.

I guess that’s what you get from dancing with geese. Speaking of which, welcome home, Loren! The trail is calling, but watch out for the geese.

(BTW, if anyone has recommendations for a good but inexpensive hand held digital voice recorder that can be used to create podcasts, please drop me a note.)

Categories
RDF

Bush: Weblogged and Googled

Recovered from the Wayback Machine.

This news item has hit Slashdot: Deriving Semantic Meaning from Google Results.

Seems a couple of scientists have devised a method where they take page results for pairing of words and from these determine which pairing is more semantically meaningful. They call this the Normalized Google Distance or NGD for short.
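For the curious, the distance can be worked out from nothing more than hit counts. What follows is a minimal sketch, assuming the commonly cited definition of NGD from Cilibrasi and Vitányi's work; the function name and the sample counts are made up for illustration, and the index size `n` is a figure the caller has to guess at, not something the search engine hands over.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Sketch of NGD from page counts.

    fx, fy -- hits for each term searched alone
    fxy    -- hits for pages containing both terms
    n      -- assumed number of pages in the index (must exceed fx and fy)
    """
    # Terms that never appear, or never appear together, are treated
    # as infinitely far apart.
    if min(fx, fy, fxy) <= 0:
        return float("inf")
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


# Hypothetical counts: two terms with 1,000 hits each that share only 10 pages.
distance = normalized_google_distance(1000, 1000, 10, 10**6)
print(round(distance, 3))
```

A smaller number means the two terms keep closer company; two terms that always show up together come out at zero.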

Their hope is that this work could be used to assist computers in understanding human language. Oh lordy, aren’t weblogs going to mess up the works. Can you imagine the effect?

Machine: I am IGOR, automated voting system. How may I help you?

Human: I am here to vote for President.

Machine: Please input your choice.

Human: I am voting for President Bush

Machine: Just to verify, are you voting for Bush who is the scum bag dirt wad ignorant asshole?

Saaaay. Maybe this isn’t such a bad idea.

Categories
Semantics

Cheap eats at the Semantic Web cafe

Recovered from the Wayback Machine.

It’s a rare event when several seemingly disparate items of interest all come together to form a compelling, coalescent whole. This event happened for me the past few weeks; an experience formed of discussions about digital identity and laws of same, LID, Technorati Tags, new and old syndication formats, Google’s nofollow, and the divide between tech and user. Especially the divide between tech and user.

I’ve written about digital identity and LID and nofollow recently, so I want to focus on Technorati Tags in this writing, and then, later, bring in the other technologies’ relationship to same. Besides, for someone who is interested in lowercase semantic web, how can my ear not be all a quiver when I hear about a new way of ‘adding meaning’ to what can be a meaningless web at times?

Note that I’m also including photos from my recent old mill photo shoot in these pages; not because they’re necessarily relevant but because an email respondent to my post on digital identity and LID noticed that I hadn’t included photos in that work. Since I was less than flattering to something important to him, consider this my nod for his efforts. Additionally (and this might be my own opinion, only) I think a bit of color helps break up all those words: a visual equivalent of a deep breath before plunging back in for another swim.

Besides, this is my space, haven’t you heard? This is the user’s web now, which means it’s my web and I can make the rules.

Tag, you’re it

If you’re unfamiliar with Technorati Tags, it’s a new implementation of an existing concept previously enabled by other sites such as del.icio.us and flickr. With Technorati tags, webloggers can annotate their entries to add keyword associations to their work forming a quasi-classification on the hoof, so to speak.

When you update your weblog, and ping Technorati (or some other service that results in Technorati’s web bot consuming your post), the link to your post is then added to the other most recent additions to the other entries that share the same tag. Not only that, but items at delicious and flickr are also shown in the page, as this entry labeled Folksonomy demonstrates.
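To make the mechanics a bit more concrete: as far as I know, a Technorati tag is just an ordinary link carrying rel="tag", with the tag name as the last piece of the URL. Here is a rough sketch of how a bot might pull tags out of a post, assuming that markup; the class name, function name, and sample post are my own inventions and not Technorati's code.

```python
from html.parser import HTMLParser


class RelTagParser(HTMLParser):
    """Collects tag names from <a rel="tag" href=".../tagname"> links."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").split()
        href = attr.get("href") or ""
        if "tag" in rel_values and href:
            # The tag name is taken to be the last path segment of the link.
            self.tags.append(href.rstrip("/").rsplit("/", 1)[-1])


def extract_tags(html):
    parser = RelTagParser()
    parser.feed(html)
    return parser.tags


# Hypothetical post body with two tag links and one ordinary link.
post = (
    '<p>Old mills. '
    '<a href="http://technorati.com/tag/Missouri" rel="tag">Missouri</a> '
    '<a href="http://technorati.com/tag/Old_Mills" rel="tag">Old_Mills</a> '
    '<a href="http://example.com/">not a tag</a></p>'
)
print(extract_tags(post))
```

Notice how little is captured: a string and a link, nothing about what the writer meant by it.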

From reading other webloggers, the main excitement behind Technorati Tags is its ability to socialize a classification. David Weinberger wrote the following when the concept was first rolled out:

This is exciting to me not only because it’s useful but because it marks a needed advance in how we get value from tags. Thanks to del.icio.us and then flickr in particular, hundreds of thousands of people have been introduced to bottom-up tagging: Just slap a tag on something and now its value becomes social, not individual.

Cory Doctorow shared in this enthusiasm, writing:

Technorati Tags are keywords that map to category names, keywords, and other cues in blog posts. When you bring up a Technorati Tag for “computers,” you get all relevant blog posts that Technorati knows about, presented on a page with relevant Del.icio.us links and relevant Flickr images. Technorati Tags blend three different Internet services and three services’ worth of tags to tease meaning out of the ether. Brilliant.

Ross Mayfield writes

But below all that global heady stuff, what tags do really well is aid social discovery.

Simon Waldman jumped in with:

Smart. Smart. Smart. If a little rough round the edges.

And Suw Charman enters the lists with:

All in all, this is an interesting way of using emergent tagsonomies to pull together diverse datastreams in one place. As it happens, I’ve had a number of different conversations recently with friends about such things, and this is a useful first step along the way to creating a single entry point for a variety of sources.

It might seem at first exposure that the enthusiasm for Technorati Tags is a little difficult to understand. After all, we’ve been able to classify our writings for a long time in our weblogs; as for searching on specific topics, we’ve had considerable experience using keyword searches in Google and Yahoo. However, the interest in Technorati Tags seems to be focused on its value as a social grouping rather than as a way of categorization. Waldman referenced the term “self-organizing web” to describe the concept.

For instance, if I were using Technorati Tags in this post, I would add whatever tags I felt represented the content of this writing, such as Folksonomy, Digital_Identity, Tags, and Old_Mills. Of course, when checking Old_Mills, I find that this is fresh meat from a Technorati perspective, as there are no previously annotated weblog listings using this tag. This leads me to believe that perhaps there’s a different tag I want to use. After all, if I’m going to go through the bother of using a Technorati Tag, I would rather use one that puts me into an active social classification than one that doesn’t. So I try Missouri instead, because after all, the photos of old mills in this writing are in Missouri. I see a gratifying number of entries for this tag, providing positive feedback of my choice.

This process of refining exactly which tags to use demonstrates what we’re told is the true power of Technorati Tags–not that we, as individuals, can categorize our writing any way we want; but that people will seek out existing tags that represent their material, and thereby begin a grassroots taxonomy–or folksonomy, to use what is becoming a popular term.

Returning to my ‘socialized choice’, among the other entries tagged “Missouri” are pointers in del.icio.us to a Metafilter discussion on the recent ruling about the KKK being allowed into the highway cleanup program, and an interesting story in reference to the New Madrid fault, both stories I’ve written about and which, if I had tagged them previously, would also show in the list. This does demonstrate the positive grouping effect of these tags.

Still, there are other entries that look more like ads than entries related to Missouri, including ones for mobile DJs. This demonstrates one of the negative aspects of Technorati Tags: their vulnerability to spammers. Another vulnerability that has been quickly pointed out is that the material can be seen as inappropriate to the topic or even offensive when placed next to the other material that’s published in the same category.

Bad tag. Bad.

Rebecca Blood was one of the first to make note of inappropriate material within the content tagged with “MLK” for Martin Luther King day.

Now, that photo is perfectly appropriate on Flickr as part of an individual’s collection, and as documentation of Sunday’s rally. It’s perfectly appropriate as an illustration for ‘protests’, or even ‘Israel’ and ‘Palestine’, even though it surely will offend some people wherever it appears. But it is not appropriate to illustrate a category tagged ‘MLK’. I personally was offended–these sentiments reflect the polar opposite to those espoused by Dr. King. More to the point, such an illustration is inappropriate–that poster has as much to do with Dr. King as would a picture of a banana peel.

Foe Romeo also noticed this, particularly when she looked at the Teen tag and found links to a pornography weblog, and she suggests that Technorati has taken on new roles as both editor and moderator with the introduction of Tags. In her comments, Kevin Marks responds to her concerns with:

We have confirmed with Flickr that pictures flagged with offensive are not included in external feeds, so the advice to Rebecca to visit Flickr to warn about the picture was correct; we also removed the german porn spam blog you noticed from our database.

We are still feeling our way here, and adding community moderation is one possibility.

But another commenter, Beerzie Yoink (who links to an interesting website, btw) wrote:

I’m not a technical genius, but quite frankly don’t see how they are going to manage this. Won’t tags used by spammers, pornographers, racists, and other jerks will be hard to separate from legitimate posts? It will be interesting to see how this plays out.

(em. mine)

Within a day or so of Tags being released, questions have been asked about separating out ‘good’ material from ‘bad’, and finding ways of altering Technorati so as to eliminate offensive material. Of course, as Julian Bond points out, there’s a mighty big chasm between here and there when it comes to this type of change:

We seem to be playing out the same old, same old pattern once more that’s been done a million times before in online communities. The Politically Correct Police (PCP) are making lots of noise about how “This isn’t right and SOMETHING SHOULD BE DONE”. The Anti-PCP come along, who love a good flame war, and are finding ways to wind them up. The poor developers get backed into a corner and end up coming up with a series of nasty hacks to sanitise what was once a nicely elegant, simple and minimalist solution. What makes me laugh in all this are the ludicrous solutions put forward by the PCP who clearly have never been anywhere code.

One of the challenges with self-forming community efforts is that each member brings with him or her different interpretations of why the group has formed, and what its purpose is. What’s particularly fascinating about it is that the same people who extol the ease with which the group can form are also the same people who then pick through the members, saying which ones can stay, and which ones have to go.

While some of those who have questioned the overall goodness of Technorati tags have focused on the correctness of the content, others focused on the quality of the overall effort. In other words, can cheap semantics scale?

Get yer semantics here! Red hot semantics! Get ‘em while they last

I took the title for this post from Tim Bray’s discussion about Technorati tags, where he wrote:

I’ve spent a lot of time thinking about metadata and have written on the subject; the most important conclusion was: There is no cheap metadata. I haven’t seen anything to make me change my mind.

Having said that, and granting the proposition that The Simplest Thing That Could Possibly Work usually wins, I still have to say that the Technorati Tags all being in a single flat namespace does seem a little, well, brittle.

Liz Lawley also wrote on her concerns about the long-term viability of tags and folksonomies, specifically, whether group consensus leads to valid, or best, results:

On the one hand, as a librarian, I understand completely the value of controlled vocabularies and taxonomies. I don’t want to have to look in six different places for information on a given topic—I want some level of confidence that the things I want are grouped together. On the other hand, I don’t share the optimism that so many of my colleagues in this field seem to have that the collective “wisdom of crowds” will always yield accurate and useful descriptors. Describing things well is hard, and often context-specific.

Bang on the money except that I would extend this further to read, “…describing things well in such a way as to be meaningful to a great proportion of the populace…” All of us can describe things easily understood by ourselves or our immediate social groups.

Both Liz and Tim reference a post by Clay Shirky where he writes that though folksonomies (the concept to which Technorati Tags has been linked) may not have the quality of well-designed vocabularies, they’ll still persist and ultimately triumph, primarily because these efforts minimize cost and maximize user participation.

This is something the ‘well-designed metadata’ crowd has never understood — just because it’s better to have well-designed metadata along one axis does not mean that it is better along all axes, and the axis of cost, in particular, will trump any other advantage as it grows larger. And the cost of tagging large systems rigorously is crippling, so fantasies of using controlled metadata in environments like Flickr are really fantasies of users suddenly deciding to become disciples of information architecture.

Any comparison of the advantages of folksonomies vs. other, more rigorous forms of categorization that doesn’t consider the cost to create, maintain, use and enforce the added rigor will miss the actual factors affecting the spread of folksonomies. Where the internet is concerned, betting against ease of use, conceptual simplicity, and maximal user participation, has always been a bad idea.

Yet it’s interesting that those who support the concept behind folksonomies tend not to use it as effectively as they could, as pind’s dot com discovered when looking at the del.icio.us tags used by Liz and Clay. What’s needed, he then writes, is technology that helps him, and the rest of us, do a better job of classification. But then that takes us back to Julian’s statement about taking minimalistic solutions such as Technorati Tags and telling developers to ‘make them better’–make them so that they perform as well as controlled vocabularies, but without requiring any effort, expertise, or discipline on the part of the users of such technologies.

The consensus among all those who wrote on Technorati Tags seems to be that even if folksonomies are not as sophisticated as we would wish, may not scale well, or have the quality that controlled vocabularies have, they’re still based on typically simple solutions; easily applied by the user, controlled by the user, and therefore are better than not having anything when it comes to trying to build this semantic web of ours. Or as Clay wrote:

The advantage of folksonomies isn’t that they’re better than controlled vocabularies, it’s that they’re better than nothing, because controlled vocabularies are not extensible to the majority of cases where tagging is needed. Building, maintaining, and enforcing a controlled vocabulary is, relative to folksonomies, enormously expensive, both in the development time, and in the cost to the user, especially the amateur user, in using the system.

I grant that tags (Technorati, Flickr, and other) and the other tools of folksonomies are better than having nothing at all; but is there a possibility that they are also worse than having nothing at all?

Bad habits are hard to break

Recently I, and others, wrote about a new single sign-on digital identity system called Light-Weight Digital Identity (LID). What caught our attention wasn’t necessarily that LID was the best digital identity system proposed–there are a lot of unanswered questions inherent with the current implementation–but that it was the first that actually delivered code into the hands of the user that empowered us to control our own identities.

When I wrote on LID, I was asked in several emails what I thought of the Identity Commons’ effort with XRI (eXtensible Resource Identifiers) and XDI (XRI Data Interchange)–universal identification and data exchange protocol specifications, respectively; particularly since I am such an adherent of RDF and both are dependent on URI (Uniform Resource Identifiers) to identify objects of interest, and the implementations of the two could be made interchangeable through existing technologies. I answered that I was ‘briefly’ familiar with them, the briefly based on the fact that both are still primarily in specification stage and there is no implementation that I can put my hands on. I could agree that many of the issues about digital identity and problems associated with it have been addressed by the documentation for XRI/XDI, but where’s the goodies?

In other words, XRI/XDI may be the more robust solution, but there’s nothing that I can work with (pre-alpha sourceforge projects notwithstanding); where LID, perhaps not as robust, does provide something I can not only use immediately, but also use without any form of centralized architecture being in place to support it.

Or as was noted in the mailing list for the Identity Commons efforts, sometimes the … “simplest thing that could possibly work” is very attractive indeed.

While I was being questioned about XRI/XDI, several people had emailed Kim Cameron to ask his opinion of it. Kim has become somewhat of a leader in the digital identity community through his interest and not the least because of a set of ‘laws’ he started defining for digital identity implementations.

Rather than address it directly, Kim released a sixth law of digital identities that read as follows:

The Law of Human Integration

The universal identity system MUST define the human user to be a component of the distributed system, integrated through unambiguous human-machine communications mechanisms offering protection against identity attacks.

This law references one of the difficulties inherent with the efforts behind much of the digital identity movement, in that most of the solutions are focused on organizations protecting themselves from abuse and fraud, rather than on individuals being able to safely and easily use whatever solution is provided. This would seem to support LID. However, Kim also provided a scenario earlier in his lead up to his sixth law that plays more subtly on this issue:

To take a very simple example, suppose you have a browser with an address bar showing you the DNS name of the site you are visiting. And suppose there is a “lock icon” which appears when a “secure connection” is in place. What is to prevent a piece of code running on your machine from overwriting the DNS name and throwing up a fake lock icon – so you are convinced you are visiting one secure site when you are actually visiting another insecure one? And so on.

Of course our usual immediate reaction to this type of problem is to find the most expedient single thing we can do to fix it. In the example just given, the response might be to write a new “safe address bar”. And who am I to criticise this, except that in the end, the proliferation of address bars makes things worse. By inventing one, we have unintentionally made possible the new exploit of getting people to install an address bar with evil intent built right into it. Further, who now can tell which address bar is evil and which one is not?

The point I am trying to make is that the new distributed identity system needs to be something other than an “expedient compensation”, something beyond a tactical riposte in the fight for security. And since the identity system has to work on all platforms, it must be safe on all platforms. The properties that lead to its safety can’t be obscurantist or derive from the fact that the underlying platform or software still has a small adoption.

In other words, the expedient solution may not be the best overall solution.

Whether LID can be seen as an ‘expedient solution’ or not, if LID had implementations in PHP or Python that would be simple to install and use, and there was more clarity on the license, it would have fired enough grassroots support to make it a contender for the de facto digital identity implementation, thus making it that much more difficult for other, perhaps more ‘robust’ solutions to find entry into the community at a later time.

This also applies to the concept of meta-data. If people become used to receiving value, even if it is only limited value, from folksonomies based on very little effort on their part, they’re going to become reluctant when other more robust solutions are provided if these latter require more effort on their part. Especially if these more robust or effective solutions take time to be accessible ‘to the masses’ because the creators of same are enclosed behind walls built of scholarly interest, with no practical means of entry for the likes of you and me.

Clay expands on his general theme of the suckiness of ontologies, as compared to folksonomies because the former forces a future prediction of structure while the latter allows for dynamic growth; the former is based on a graph, with predefined nodes, each requiring a progenitor, while the latter is based on sets, and the only barrier to entry is forming a decision to belong.

Ontology is a good way to organize objects, in other words, but it is a terrible way to organize ideas, and in the period between the invention of the printing press and the invention of the symlink, we were forced to optimize for the storage and retrieval of objects, not ideas. Now, though, we can scrap of the stupid hack of modeling our worldview on the dictates of shelf space. One day the concept of creativity can be a subset of a larger category, and the next day it can become a slice that cuts across several categories. In hierarchy land, this is a crisis; in tag land, it’s an operation so simple it hardly merits comment.

The move here is from graph theory (arrange everything in a tree graph, so that graph traversal becomes the organizing principle) to set theory (sets have members, and the overlap or non-overlap of those memberships becomes the organizing principle.) This is analogous to the change in how we handle digital data. The file system started out as a tree graph. Then we added symlinks (aliases, shortcuts), which said “You can organize things differently than you store them, and you can provide more than one mode of access.”
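The difference Clay describes is easy to see in a few lines. This is only a sketch of the two organizing principles, with item names and tags I've made up: in the set view an item belongs to as many tags as you like and grouping is a matter of overlap, while in the tree view each item lives at exactly one path.

```python
# Set view: each (hypothetical) item carries any number of tags.
tagged = {
    "mill-photos": {"Missouri", "Old_Mills", "photography"},
    "fault-story": {"Missouri", "geology"},
    "tagging-essay": {"Folksonomy", "Tags"},
}


def items_with(tag):
    """Every item whose tag set contains the given tag."""
    return {item for item, tags in tagged.items() if tag in tags}


# Grouping is just membership and overlap.
missouri = items_with("Missouri")
missouri_photos = items_with("Missouri") & items_with("photography")

# Recategorizing is a one-line operation: add another tag.
tagged["mill-photos"].add("history")

# Tree view: each item sits at exactly one place in a hierarchy,
# so filing it under a second branch means restructuring the tree.
tree_path = {
    "mill-photos": ("Places", "Missouri"),
    "fault-story": ("Science", "Geology"),
}

print(sorted(missouri), sorted(missouri_photos))
```

What the sets don't tell you is whether "Missouri" on one item means the same thing as "Missouri" on another, which is where the trouble starts.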

Yet, as we’ve already started to see with Technorati Tags, as with other implementations such as del.icio.us tags and flickr, low barrier to entry usually doesn’t scale well. Something like the Missouri Tag may have few enough entries to make finding the meaningful data easy, but something like Weblog results in so many members as to make it difficult to differentiate from the populace as a whole. The same applies to social networks, where people collect so many ‘friends’ as to make being a ‘friend’ of the person inherently meaningless.

So then we start exploring ways and means to make these simple systems and folksonomies more effective. In the case of Google, the developers create algorithms that try to add meaning to the results returned on a search by basing the results on number of links and popularity of a site, with an assumption that popularity equates to authority. In the case of Flickr, social behavior is incorporated into the tags, and members can label photos as ‘offensive’, in which case the photo is excluded from external feeds. However, without having a clear, not to mention shared, idea of what ‘offensive’ means, the results will always be suspect. After all, some would say that photos of a woman’s bare breasts or a man’s penis are offensive; others would say any photo of President Bush is offensive.

All of these solutions and the tricks to make them work better are based on the fact that the rich context of the data is not captured along with the data, and therefore there is only so much good we can wring out of these ‘cheap’ semantic web solutions before they’re wrung dry and spit out like overchewed tobacco cud. Or before they’re gamed by people such as the comment spammers, and then we, the blades of grass within the grassroots efforts, have to add more effort to our input in order to ‘refine’ (read that ‘fix’) the results, as witness the recent release of Google’s nofollow attribute.

(One could say that Peter Kaminski is prescient when he remarks on January 15th about annotating links in a similar manner to Technorati tags, so that Google could also participate in the new, more meaningful web.)

It is the structure, the future prediction, careful classification, and directed graph nature that Clay disdainfully rejects that allows us to capture the rich nuances of data that will persist longer than the quick transitory interests that meet efforts such as Technorati Tags. One only has to compare the Technorati Tag for Terrorism with the Weapons of Mass Destruction, Terrorist, and Terrorist Type ontologies, and associated instance database to see where the discipline to apply more robust metadata concepts can result in much more controlled, and specific, result sets. And since the data is defined in a universally understood model, RDF, you don’t even have to use the ontology creator’s own search tool (try who, what, where for the three values, in that order)–you could use my much more crude, but quickly hacked together Query-o-Matic, based on existing technologies.
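Why does a shared model matter? Because RDF data boils down to subject-predicate-object statements, and anyone's tool can match patterns against them. The following is a toy sketch of that idea only; it is not the Query-o-Matic, and the `ex:` names and the data are invented for illustration rather than taken from any real ontology.

```python
# Hypothetical statements in (subject, predicate, object) form.
TRIPLES = [
    ("ex:mill1", "ex:type", "ex:GristMill"),
    ("ex:mill1", "ex:locatedIn", "ex:Missouri"),
    ("ex:mill2", "ex:type", "ex:GristMill"),
    ("ex:mill2", "ex:locatedIn", "ex:Arkansas"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# "What is located in Missouri?" expressed as a pattern, not a keyword.
in_missouri = [s for s, _, _ in match(TRIPLES, predicate="ex:locatedIn", obj="ex:Missouri")]
print(in_missouri)
```

The query asks about a relationship (located in) rather than hoping the word Missouri turns up somewhere, which is the kind of context a flat tag can't carry.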

Louis Rosenfeld discusses the strength of searches among controlled data sources as compared to that of folksonomies:

Lately, you can’t surf information architecture blogs for five minutes without stumbling on a discussion of folksonomies (there; it happened again!). As sites like Flickr and del.icio.us successfully utilize informal tags developed by communities of users, it’s easy to say that the social networkers have figured out what the librarians haven’t: a way to make metadata work in widely distributed and heretofore disconnected content collections.

Easy, but wrong: folksonomies are clearly compelling, supporting a serendipitous form of browsing that can be quite useful. But they don’t support searching and other types of browsing nearly as well as tags from controlled vocabularies applied by professionals. Folksonomies aren’t likely to organically arrive at preferred terms for concepts, or even evolve synonymous clusters. They’re highly unlikely to develop beyond flat lists and accrue the broader and narrower term relationships that we see in thesauri.

Returning to Kim Cameron’s sixth law, which states there must be an unambiguous and non-corruptible interface between the user and the technology, we could also apply this to metadata: the costs to support controlled vocabularies/ontologies and uncontrolled vocabularies/folksonomies are the same. At some point a human has to intervene with the technology to refine and validate the result. With ontologies, the intervention occurs before the data is captured; with folksonomies, the intervention occurs with each search.

I put my money on the ‘refine and validate just once’ solution.
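To make that trade-off concrete, here is a minimal Python sketch. It is not how Technorati or any real search tool works; the posts, the tags, and the tiny thesaurus are all made up for illustration. The flat search matches only the exact tag someone happened to type, while the controlled search leans on a vocabulary that was refined once, up front, with preferred terms and narrower terms.

```python
# Hypothetical data: free-form tags as three different writers applied them.
POSTS = {
    "post-1": {"terrorism"},
    "post-2": {"terrorist"},
    "post-3": {"bioterrorism"},
}

# Hypothetical controlled vocabulary, built once before any search happens:
# variant spelling -> preferred term, and preferred term -> narrower terms.
PREFERRED = {"terrorist": "terrorism", "terror": "terrorism"}
NARROWER = {"terrorism": {"bioterrorism"}}


def flat_search(term):
    """Folksonomy-style lookup: exact match on whatever tag was typed."""
    return sorted(post for post, tags in POSTS.items() if term in tags)


def expand(term):
    """Return the preferred term plus every narrower term beneath it."""
    found, todo = set(), [PREFERRED.get(term, term)]
    while todo:
        current = todo.pop()
        if current not in found:
            found.add(current)
            todo.extend(NARROWER.get(current, ()))
    return found


def controlled_search(term):
    """Vocabulary-backed lookup: normalize each tag, then match the expanded set."""
    wanted = expand(term)
    return sorted(
        post
        for post, tags in POSTS.items()
        if any(PREFERRED.get(tag, tag) in wanted for tag in tags)
    )


if __name__ == "__main__":
    print(flat_search("terrorism"))
    print(controlled_search("terrorism"))
```

With the flat lookup, the person searching has to guess every variant and run the search again for each one. With the vocabulary in place, that guessing was done once by whoever built the mappings.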

Isgood but…is good?

Though Rosenfeld and most others I’ve listed here support folksonomy efforts, some with caveats, others unreservedly, as just one of a variety of technologies that help people find what they need, I tend to be of the camp that believes focusing on easy solutions will make it more difficult to get acceptance for ‘better’ solutions that may require a little more effort. This puts me in the exact **opposite camp of Clay Shirky.

Clay believes that ultimately ontologies will fall to folksonomies, as the latter gain rapid acceptance because of their low cost and ease of use; I believe that ultimately interest in folksonomies will go the way of most memes, in that they’re fun to play with, but eventually we want something that won’t splinter, crack, and stumble the very first day it’s released.

What we don’t need are more cheap solutions, and ultimately, I find that Technorati Tags are a ‘cheap’ solution, though a compelling one, and useful for generating conversation if for no other reason. And I don’t want to denigrate Technorati’s efforts with this, because I feel in the end Technorati is going to play a major role in our semantic efforts. Still, no matter how many tricks you play with something like tags, you can only pull out as much ‘meaning’ as you put into them.

What we need, instead, is a way of making richer solutions more accessible to people, and in that, I do agree with Clay–lower the barrier of participation. In the email list for the Identity Commons effort, the members talked about how the URL which serves as identifier within LID is also a URI, which forms the basis for XRIs, and how the group should look at ways of achieving synergy with this new effort. Rather than being disdainful, they sought to turn LID into an opportunity.

This type of attitude is what we need more of–how can we make the richer, more robust solutions available to folks like you and me? In some ways, FOAF, the ontology used to identify ourselves and who we know, is an example of this because it’s very accessible to ‘regular folk’; yet it’s also based on a robust and highly interchangeable data model, which means it could be easily merged with other data that shares the same identity.
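A rough Python sketch of what ‘easily merged’ means here, using plain tuples rather than a real RDF library. The FOAF namespace and the name and knows properties are real; the people, the example.org URIs, and the ‘photographed’ property are invented for illustration. Because both sources describe the same URI, merging is nothing more than pooling the statements.

```python
from collections import defaultdict

FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical identifiers, made up for this example.
ALICE = "http://example.org/people/alice#me"
BOB = "http://example.org/people/bob#me"

# Source one: a FOAF-style description, as (subject, property, value) triples.
foaf_triples = [
    (ALICE, FOAF + "name", "Alice"),
    (ALICE, FOAF + "knows", BOB),
]

# Source two: unrelated data that happens to use the same URI for Alice.
# The 'photographed' property is an assumption, not part of any real vocabulary.
photo_triples = [
    (ALICE, "http://example.org/terms/photographed", "an old mill"),
]


def merge(*graphs):
    """Pool the statements; a shared URI means a shared subject."""
    merged = set()
    for graph in graphs:
        merged.update(graph)
    return merged


def describe(graph, subject):
    """Collect everything the graph says about one subject."""
    properties = defaultdict(set)
    for s, p, o in graph:
        if s == subject:
            properties[p].add(o)
    return dict(properties)


if __name__ == "__main__":
    combined = merge(foaf_triples, photo_triples)
    for prop, values in sorted(describe(combined, ALICE).items()):
        print(prop, sorted(values))
```

No mapping step was needed between the two sources. A pair of flat tag lists would offer nothing to join on except matching strings.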

One hell of a ride

Clay states that whether we’re supportive of folksonomies or not, they’re going to happen–we are in a kayak floating along a river of change:

It doesn’t matter whether we “accept” folksonomies, because we’re not going to be given that choice. The mass amateurization of publishing means the mass amateurization of cataloging is a forced move. I think Liz’s examination of the ways that folksonomies are inferior to other cataloging methods is vital, not because we’ll get to choose whether folksonomies spread, but because we might be able to affect how they spread, by identifying ways of improving them as we go.

To put this metaphorically, we are not driving a car, with gas, brakes, reverse and a lot of choice as to route. We are steering a kayak, pushed rapidly and monotonically down a route determined by the environment. We have a (very small) degree of control over our course in this particular stretch of river, and that control does not extend to being able to reverse, stop, or even significantly alter the direction we’re moving in.

I consider the difference between the ‘web’ and the ‘semantic web’ to be one based on ‘meaning’ alone, not on toys and attachments. If my opinion holds true, is the transformation of the web to the semantic web equivalent to a ride in a kayak? Pulled along by forces with little control over direction and speed?

I will concede to Clay the challenging, swift nature of the transport, but argue that only a fool would put themselves into a narrow sliver of wood, hide, or plastic on a raging river without training, trusting fate to ensure they don’t end up smashed, bloodied, and drowned. And it’s equally foolish to believe that we can, somehow, with the right use of technology, exponentially derive complex meaning out of what is, essentially, flat data.

I agree with Clay that the semantic web is going to be built ‘by the people’, but it won’t be built on chaos. In other words, 100 monkeys typing long enough will NOT write Shakespeare; nor will 100 million people randomly forming associations create the semantic web.

* No, ‘enclosured’ is not a real word, but it should be, because it describes the effect better than ‘enclosed’.

** Of ontologies, Clay writes …don’t get me started, the suckiness of ontology is going to be my ETech talk this year…, which is probably one reason my own proposal, which is diametrically opposite to Clay’s talk, was not accepted. Well that and I mentioned the ‘p’ word.

Categories
Weblogging

TBD Tags et al

I’ve been wanting to write something on Technorati Tags, the conference on journalism, blogging, and credibility, as well as a follow-up on LID, and even a little on Wordform (Eh? What’s that?), but all of this deserves a thoughtful discussion, carefully written. Frankly, I’m not in the mood for either thoughtful or careful, so I think I’ll brave the cold and the snow and go for a walk. I’d take photos, but once you’ve seen one Missouri landscape covered in snow, you’ve seen them all.

Well, I’ll still take photos–would have to pry camera out of cold dead hands–but you’ll have to go to Tinfoil Project to see them. Bigger pics, better bandwidth suckage.

Categories
Just Shelley

Swivel stick

Recovered from the Wayback Machine.

Loren, who has been sharing tales of courage and horror from his fearsome youth, recounted one incident with a slippery log and a fall into a swift stream. Ever since, he’s hiked out of his way to avoid having to use logs to cross streams.

(He and I also shared a five year old fascination with matches, oddly enough. I wonder if the cause and effect associated with ‘matches’ and ‘uncontrolled fire’ is an epiphany all five year olds undergo?)

I share his wariness of log crossing, but mine is based less on a specific event than on ongoing experience. Not only do I avoid logs, I also avoid ledges, unsteady hillsides, icy paths, and any form of roller or ice skating. I am a klutz, you see. A rather good one, at that. In a parking lot, I can manage to slip on the one and only wet pinecone; I can climb down rocky paths, only to slip on a bit of gravel at the end. I trip over unseen roots, and I once knocked myself out chasing a ball during softball when I fell into a tree. I even fainted at a wedding once, when we were required to stand for part of the service and I locked my knees, and caused myself to pass out.

When I am in prime physical condition (yeah, that will be the day again), I can move like a panther, all supple strength. But then, something always seems to get in the way.

For instance, Sunday I drove down to Bollinger Mill to film it and the Burfordville covered bridge. It was a perfect day, in the 30’s, excellent weather, and the mill and bridge are extremely well maintained. Crossing the covered bridge to the other side, I started down the hill towards the White River in order to get a better shot of the mill and the spillway. I didn’t pay attention to the signs of recent flooding, until I started slipping on the wet mud that covered the hill. Not wanting to slide down on my butt into an icy cold river, I dug my stick into the ground and held on for dear life, twisting and turning to maintain my balance.

“Oh my!” “Watch out!” “Wup!” “Damn, this is slippery” “Youwillnotfall Youwillnot fall!” “Ahh!”

Eventually I managed to steady myself until I could find firmer footing on a bit of rock and from there, pull myself back up the hill, dignity and camera intact. That is, until an older man who had been across the way walked over to me and asked me if I was alright. I thanked him and said it was just a slight slip, and nothing much to worry about.

“Well, we were worried that you were going to fall in. Glad you didn’t.”

Then out popped a huge smile.

“I shouldn’t say this, but you sure were funny.”

So much for the panther.