Categories
RDF Writing

2005 Errata and book updates: Chapter 1

I still like my analogy to the elephant and the blind men in Chapter 1. People still see RDF, and more generally the Semantic Web (or, my preferred term, the lowercase semantic web), from different viewpoints and with different expectations. That hasn't changed, and by the nature of the beast, never will.

Good. Keeps life interesting.

The W3C RDF working group has issued new and revised versions of the RDF specifications. This doesn't impact Chapter 1 that much, but it will the other chapters, and where differences arise between the writing and the specifications, I will make a note.

One change in the book: any reference to the URI yasd.com no longer points to anything that actually exists. I dropped this domain when it became so badly overrun with email spam that it was no longer usable. As for the URLs in the chapter, those related to the specifications are:

RDF/XML Syntax Specification (revised)

RDF Vocabulary Description Language 1.0: RDF Schema

The RDF Primer

Resource Description Framework (RDF): Concepts and Abstract Syntax

RDF Semantics

RDF Test Cases

The graphic by Semaview depicting the differences between RDF/XML and XML no longer exists; sorry about that. However, I do believe my textual description does a decent job of explaining the difference. Comments, though, are welcome on this.

Page 8 references the ‘new’ ontology language work. Well, this group released its specifications at the same time as the final RDF specifications, and they can still be found here.

As for the rest of the chapter, most of the material in Chapter 1 is more of an introduction to the rest of the book, so I'll be updating the material as we come to it in the other chapters.

Categories
Stuff Weather

Looking for a little bright

If you access my front page, you'll see that I've made a temporary modification to include random photos. This doesn't signal a redesign as much as a need to add a little bright color to my page. I was inspired to this move as I sat inside watching yet another thunderstorm roll past.

These thunderstorms shouldn't be here. It is supposed to be in the 30s, with cold, clear days just perfect for hiking. It's not supposed to be in the 60s one day, then less than 48 hours later in the 20s, and then 48 hours later back into the 60s again. And no sunshine at all. I feel as if I've been transported back to the Northwest.

Luckily, the orchid show is coming up, so I can count on some color there. And I think I'll visit the gardens this week and see the camellias in bloom. Anything to get out of the house and into something resembling the Great Outdoors.

Of course, I could add a little more color if I were in the mood to spend money and buy one of those new mini Macs that just came out today. Cute little buggers, though color might not be the right word to use for them; they only come in Apple basic white. Still, for an affordable Mac, adding Bluetooth and AirPort shoots the price up to $628.00, and that's with the basic setup. I would, of course, want to cram it full of memory and the biggest hard drive I could. And then I'd have to add a monitor and get a mouse and…frankly, a trip to Florida or Arizona would be cheaper.

But it is cute. And I’m surrounded by people with new toys. I want a new toy.

Maybe I should get the iPod shuffle instead. Then, by adding a new headset with a noise-cancelling microphone, I would have the beginnings of my own podcasting setup. Better yet! Some kind of digital recording device I could take on my hikes, so I could share with you every warble, buzz, mumble, stumble, and branch-snapping aspect of my trips into the deep, vast wilderness.

But these are wants. Fun wants, but wants. I have everything I need.

Still, when you compare things for size, that mini Mac is awfully tempting.


Categories
Social Media

I, URL

Recovered from the Wayback Machine.

My first exposure to the concept of a ‘federated identity’, or a digital identity or ID if you will, was when I had to obtain one of the first Microsoft Passport identities in order to access the material I needed to finish my book, Developing ASP Components. I was pleased with the concept, then, because it would give me a way to sign into all the Microsoft sites I visited and only have to remember the one username and password.

I was quite fond of MS tech at the time, and focused almost exclusively on this vendor in my writing. However, if you had asked me, then, whether I would input credit card information and use Passport to sign on to eBay or Amazon, I would have looked at you, blankly, waiting for you to finish the joke.

You see, when email was created, two days later the first email spam was sent. And when the web was created, two days after that, the first DoS (Denial of Service) attack happened. Well, two days in a relative sense — almost on the doorstep of any new technology there will follow the legions of kiddies and cons, waiting to take advantage of any opening and vulnerability. Therefore, when you talk about gathering enormous amounts of extremely vulnerable data into one spot, I have to assume you've just gotten off the boat, given your naivete about how secure you can make anything that's attached to the internet.

Of course, my information is vulnerable anyway, regardless of what I do. My bank provides access to both my social security and debit card information at its site, and my car company also provides access to the same. There's little I can do about companies choosing to make my data vulnerable, other than to review the security procedures they follow, and be ready to hold them accountable if something happens to my data. Oh, and check my credit report every couple of months to make sure nothing is there that shouldn't be.

But to voluntarily group sensitive data about myself behind the thin shield of a digital identity? No, not on your life.

However, I also get a little peeved at times about having to sign up at all the various newspapers' sites to get access to their articles. And I imagine if I did the social network thing, such as LinkedIn and Orkut, I would get vastly annoyed at having to re-input whatever information is appropriate to these venues, not to mention all of the many (three) associations I have. I also wouldn't mind being able to keep my address in sync at all the places I do business, such as Amazon and B & H Photo. I still wouldn't allow a site to store my credit card if I can avoid it, but I don't mind so much my address and other contact information — this is easily obtainable regardless.

As for the other data I've been asked for: I don't really want to share my birth date; why not just ask me if I'm under or over 18? You don't really need to know what I do for a living and how much I make. I'll give you my zip code, but why do you need the full address? And no, you can't have family member names. As for my sexual preferences, buzz off, you snoopy little creep.

And under no circumstances would I input a social security number unless it was required by law.

Based on this, the concept of a digital identity for something like ‘single sign-on’, which would allow me to have one identification and password, as well as be able to share information such as my address, is an attractive proposition. Especially if I could severely limit how much information I input into these systems–because no matter what they tell me about security, there is no such thing as a secure system connected to the internet.

In addition to these constraints, whatever system I used would have to be decentralized. If you've read me for any length of time at all, you'll know that I have a very real dislike of centralized systems. Centralized systems are dependent on entities that may not exist someday. Centralized systems provide too tempting a target for Bad People. Centralized systems maintain too much control over what I do, or do not do, on the internet — and this last is the number one reason I don't like them.

As for my experience and exposure to identity systems: as mentioned previously, I used to have a Passport account but this is long gone, and I’ve still managed to avoid having to get a TypeKey account. I once worked directly with Boeing’s data model efforts, so I recognize the impetus behind the Liberty Alliance, and, frankly, consider this effort to be a way of providing R & D perks for key personnel. I’ve worked with Oblix at one of the universities where I consulted, and find it usable in the limited context of its scope–which is identity management within a closed, controlled group. I think Ping Identity is bloated, but then I also think J2EE is bloated. I am fairly newly aware of Sxip, primarily because Marc Canter said that it works ‘just like DNS’. And I have to ask, who would pay $25.00 to register an ‘i-name’ at Identity Commons, without seeing the tech first?

All in all, the main problem I had with most of the systems and/or products and/or organizations is that all but a few seemed to be focused on some grand scheme or another, where the tools they provide would be used by people in Armani suits, on their way to a power lunch with some mover or shaker or another. They represent Big Things by Big People.

They weren't for the likes of me, people who come in all muddy from a hike and who sit down at their computer to read an article at the Washington Post, but don't want to have to register for yet another online newspaper. Or who wouldn't mind not having to re-input their mailing address at another store, and would like to just push a button and have their associations recorded, compared, and matched with others who have joined YASN (Yet Another Social Network).

These people are the only type of people I can wrap my mind around when I think of ‘digital ID’. That's why, when the creator of LID (Light-weight Digital Identity), Johannes Ernst, sent me an email about it, I was intrigued: to all intents and purposes, this system of digital identity allows one to control one's own data; it is easily extended using the fairly standard technologies of XML and XPath for queries; and it's a very simple concept — few bells, small number of whistles.

Cool. But will it work, and how does it compare with other systems?

Installation

I installed and played around with LID at the URL I picked to identify myself with, http://burningbird.net/shelley/. Clicking the link will bring up the help page for the installation, which provides multiple tests you can try.

I installed minimal VCARD and FOAF files, following the instructions. Running an XPath query on the FOAF name returns the correct value, and I can add other XML vocabularies if I want to extend the installation. I also tried the single sign-on against the LID site, and had no problems. Trying to install a single sign-on site myself did result in some Apache errors, but I'll keep tweaking.
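For the curious, the query itself is trivial. Here's a minimal sketch of the sort of lookup involved, written against the Perl XPath module the installation depends on (XML::XPath, as best I can tell from the XPath.pm mentioned below); the file path and the foaf:name element reflect my own setup, so treat them as assumptions:

#!/usr/bin/perl
# Sketch: pull the foaf:name value out of the local FOAF.xml data
# file, the same sort of XPath query LID runs. Path is an assumption.
use strict;
use warnings;
use XML::XPath;

my $xp = XML::XPath->new(filename => 'lib/data/FOAF.xml');

# Register the FOAF namespace so the foaf: prefix resolves.
$xp->set_namespace('foaf', 'http://xmlns.com/foaf/0.1/');

# findvalue returns the string value of the first matching node.
my $name = $xp->findvalue('//foaf:name');
print "foaf:name = $name\n";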

The installation wasn't too complicated. I have the Gnu Privacy Guard (gpg) application installed, and I also have an SSH account so I can access my server from the command line, so I could generate the key, per instructions. I did have to ask Hosting Matters to install the Perl module for XPath, as it wasn't present. Despite having to handle the aftermath of two separate DDoS attacks, the company installed it within 30 minutes.

Still, not all hosts are as willing as HM to accommodate their clients in this way, and this is a strike against the application — it depends on having gpg and being able to run it at the command line, on a module that's not common in most installations, and on Perl, a language environment that's not easy to extend. I can't help thinking that PHP might be a better solution, as well as more comfortable for people to use.

The install procedure was as follows:

1. Created the sub-directory that serves as my id location; in my case, http://burningbird.net/shelley. This could be your weblog location, but if you're running a PHP main page, that's not necessarily compatible with LID; my main index.php page wasn't successful. A better approach is to build something off your main domain directory.

2. Created the .htaccess file, which sets the index page access order to index.cgi first, index.html second (a sketch of this file follows the list). I didn't need to add the line that ensures the CGI file is executable.

3. Tested the index.cgi file, and then sent email to HM to ask them to install XPath.pm.

4. After XPath was installed, I created the lib subdirectory, which holds the lid.xml configuration file. I copied the template from LID and modified it.

5. I already had simple VCARD and FOAF XML files, and copied these as VCARD.xml and FOAF.xml, respectively, to a data subdirectory under the lib subdirectory. I now have the following subdirectories:

/home/shelley/www/shelley
/home/shelley/www/shelley/lib
/home/shelley/www/shelley/lib/data

6. All done.
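For reference, here's roughly what the .htaccess file from step 2 contains. This is a sketch of my own file, not an official LID template, so treat the details as assumptions; the commented-out lines are ones I didn't need, but some Apache setups require them before a .cgi file will execute:

# Serve index.cgi ahead of index.html when the directory is requested.
DirectoryIndex index.cgi index.html

# Uncomment if your host doesn't already allow CGI execution here.
# Options +ExecCGI
# AddHandler cgi-script .cgi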

As you can see, aside from the XPath module, it's about as easy as installing your own weblogging tool. But Perl always adds an extra challenge when adding new modules, as compared to PHP, which could simplify the use of this technology considerably. David Weinberger noted the complexity of the install, and asked Johannes Ernst about it in an email. According to the reply, the initial release is aimed at technologists, for exploration. I think a better approach would be to provide something usable by both techs and non-techs, because many of the people interested in digital IDs, such as David, aren't techs. If they can't play, they can't write to promote the concept, and if they can't write about the concept, it will gain acceptance more slowly.

Still, the tech is flexible; all these issues could probably be addressed easily, and we should be focusing on the concepts. This includes the use of a URL as digital ID, in addition to how a distributed system would work in comparison to a more centralized system such as Passport, and a closed distributed system like Sxip.

Comparisons

How LID compares with other systems first requires picking which other systems. Following LID's creator's own examples, I focused on Passport, Liberty Alliance, Sxip, and Identity Commons.

Passport is owned and operated by Microsoft, which also controls all the data included within the system. If you've thought about leaving a comment at an MSN weblog, you would have been asked for your Passport identification. If you used eBay prior to December 2004, you could also have used your Passport identification for sign-on. Now, though, because of security concerns, most uses of Passport are related to Microsoft content or Microsoft sites.

You don’t have to install Passport, but all data in the system remains in the centralized system, under control of Microsoft. LID, on the other hand, doesn’t store any data about you. In fact, it doesn’t even know you exist — there is no way of tracking a LID user from some root LID site.

External storage and control of our data is a concern that comes up with digital identities–who has access to the data, and what can they do with it. Frankly, in my opinion, though, this concern is overrated.

The primary interest in single sign-on systems, for the user, is to make it so they don't have to remember usernames and passwords from site to site. Additionally, they also don't have to answer all the same obligatory questions at each site — address, phone numbers, and so on. Regardless, though, of whether you enter the data in one spot or many, once you make the decision to do business online, in whatever way, you have lost some control of the data…or at least some control of how the data is used.

We have dropped our names, phone numbers, email addresses, home addresses, birth dates, and various other bits of publicly accessible data in more places than we can most likely remember. There is nothing to ‘control’ about this information — we voluntarily dropped the reins of this data long ago.

It's when we expose very sensitive data that we should be concerned about control, and this primarily because of security. For instance, if we store our credit cards with our digital identities, we then want to make sure that the data is very secure and safe from hacking. This is where a centralized system is most vulnerable: it stores many, many such important bits of data, and therefore becomes a particularly tasty target for hackers.

However, regardless of the system, there's a way around this, and that is not to store your credit card information online. All sites, unless they're particularly primitive and ill-designed, give you an option not to store your credit card information. Those that don't, don't deserve your business.

(And no site should ask for your social security number, unless required by law to report income. Even job search companies should not ask for this information — it’s up to an employer to obtain your SSN information after you’re hired or contracted for a position, not before. )

Consumers, spurred on by security reports, have scaled back their initial trust of online systems, and the concept of ‘federated’ identities in an ecommerce setting has consequently lost favor among many consumers (and companies).

For all that Microsoft made Passport easy to use, and therefore a candidate for the blue-jeaned muddy hiker, it also doesn't have the best reputation when it comes to security. So it fails for the blue-jeaned muddy hiker who is paranoid. This lack of trust did impact Passport, whether the company will admit it or not. Microsoft dropped the credit card option from Passport in 2003. In fact, Passport is no longer a viable entity in the global digital identity game, its use focused primarily on Microsoft's own sites and those of some partners.

If Passport is basically a non-player now, then what about Liberty Alliance? Well, frankly, Liberty Alliance isn't for the likes of you and me, regardless of all its talk about federated identities and discussions of the Liberty ID-FF — the specification behind the Alliance's identity scheme. Case in point: the specification contains a possible user scenario, with Joe Self logging on to an airline that is part of a circle of trust. Once authenticated, in the scenario, Joe is then asked:


Note: You may federate your Airlines, Inc. identity with any other identities you may have with members of our affinity group.

Do you consent to such introductions?

Laughable. I chortled until tears ran down my face. The scenario then continued on from there, with Joe Self being asked to ‘federate his identity’ at various sites within the ‘affinity group’ as he progressed along, just trying to reserve an airline ticket and rent a car — something that can be done in one move, with one click of a button, in today's travel systems.

Returning to LID, though, I find I can't compare the two implementations, because it would be like trying to compare an Oraclized PeopleSoft with WordPress. More, where LID represents a service to the user, Liberty Alliance represents a service to Alliance members — no more, no less. In other words, the two implementations are so far apart on the scale that the scale becomes meaningless. Frankly, this is all in LID's favor, too.

However, both Passport and Liberty Alliance represent large corporations trying to manage every aspect of one’s digital identity. What about smaller efforts, such as Identity Commons?

It would be great to compare LID against Identity Commons, if there were anything to compare. What amazes me, from what I can find about this entity/effort, is that you can now register your ‘i-name’, using a URN (Uniform Resource Name), and this will be good for 50 years. All for $25.00. However, if you perchance want to check out the tech first, no such luck: though I searched high and low, I couldn't find anything.

Regardless, the approach seems to be that you register your own personal identification through a broker, where one assumes you'll store all the important bits about you that form your online self. Your data is distributed, but still managed by another entity. However, your uniqueness in the system is guaranteed by the fact that there is one overall centralized authority that manages the distribution of the actual identities.

Still, there's nothing to see, feel, and tweak. In other words, there are a lot of good words and promises, but for all intents and purposes it's a pipe dream until it releases something tangible, though it does look like it might be releasing something today, unless this page has been up for months.

Sxip, on the other hand, does have technology you can see, feel, and tweak. In fact, of all the alternatives examined, it seems to be the closest to matching what I would look for in a digital ID. Almost.

Sxip and SXIP

There is Sxip, the company, and SXIP, the protocol; the latter stands for Simple eXtensible Identity Protocol. The company is managed by Dick Hardt, whom I know through my efforts to include ActiveState in that same aforementioned Developing ASP Components book. Since I was rather fond of ActiveState, I was somewhat predisposed to be positive about the Sxip efforts. That predisposition was only reinforced when I was able to download what the company calls a “Membersite Developer Kit” to play around with some of the concepts myself.

(In case you're curious, I downloaded the PHP version.)

How the Sxip system works is that users sign up for an account at a Homesite, and then use this identity at any number of Membersites. They can create multiple personas and associate different pieces of data with each persona. Then, when they log into, or “sxip into”, a Membersite for the first time, they're given an option as to which persona to use with that site. I tried it out with the demonstration materials provided by Sxip, and found it to be a very simple process.

Now, how Membersites and Homesites know about each other and can exchange information is through the use of a Rootsite, which basically manages the unique identities without access to any of the other user data. This is similar to the i-broker in Identity Commons, I believe. Where it might differ is that, given the development expertise, one could develop one's own Homesite for one's own personal use.

It is this latter capability that comes closest to LID's own user-controlled data technique, though maintaining a personal Homesite looks as if it could be outside the capability of a non-developer. (Still, it wouldn't be outside the boundary of the tool to create a plug-and-play Homesite. Would this work contrary to the overall system expectations? It would come at the same cost as a digital certificate, according to the documents at Sxip.)

Still, there are major differences between the LID approach and the SXIP approach. For instance, with LID, the effort is completely distributed, with no central authority controlling the issuance of identities. It can work this way because it's based on each person being identified by a URL, implemented within the existing domain system as managed by ICANN: within the DNS framework, there can be no two identities alike because there are no two domains alike.

Marc Canter and Sxip both say that SXIP works like the DNS, but it doesn't really, other than there being one central authority preventing duplication of names and providing resolution of where the data associated with those names resides.

For instance, in DNS, when a person accesses a domain and their ISP's nameserver does not recognize it, the ISP checks with a higher-level root nameserver to find the location of the domain's nameserver. It is this that provides the unique name-to-IP-address mapping. The ISP's own nameserver then gets the IP address and stores it, along with the domain, within its own cache before responding to the request. The next time another person who uses the same ISP accesses the domain, the ISP already has the information.

This caching serves a couple of purposes within the DNS. For one, it makes new requests of a domain that much quicker. For another, it helps disperse the information about the domain across many different nameservers, so that if there is something wrong with one, the system can usually route around the damage and find the IP-domain name mapping in another.

There isn't anything like this in SXIP. What it does, instead, is store a cookie with information about the user's Homesite on the computer being used, or provide a place to fill this information in if the cookie is gone or the person is using a shared machine. This can work rather well, and about the only dependency that exists now is authenticating the uniqueness of the identity at the Rootsite when a new user, Membersite, or Homesite is created. Except…

Except for the Homesite going down, or if it no longer exists.

The real strength of the DNS system is that information about a domain is cached all throughout the system, and the only time a problem will occur is if something has happened to the person’s own nameserver, but even then, they have a backup. If you’ve ever registered a site, then you know about providing two different nameservers, and that these are usually at two different locations. The whole concept is based on redundancy.
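If you want to see that redundancy for yourself, here's a quick sketch that lists the nameservers registered for a domain. It uses the Perl Net::DNS module, which is my choice for illustration, not anything LID or Sxip ships:

#!/usr/bin/perl
# Sketch: list the nameservers registered for a domain, showing the
# redundancy DNS builds in; registrars expect at least two.
use strict;
use warnings;
use Net::DNS;

my $domain = shift || 'burningbird.net';
my $res    = Net::DNS::Resolver->new;

my $reply = $res->query($domain, 'NS')
    or die "NS lookup failed for $domain: ", $res->errorstring, "\n";

foreach my $rr (grep { $_->type eq 'NS' } $reply->answer) {
    print "$domain is served by ", $rr->nsdname, "\n";
}

Any domain that resolves should show at least two entries, usually at different locations; that's the redundancy LID inherits for free by building on URLs.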

There is no redundancy in SXIP. I looked, because I thought I had read that you can store your information at more than one Homesite, but from the developer documentation, it would seem there is an assumption that there is one, and only one, Homesite. If true, then if your Homesite is down, you can’t log into a new Membersite, though you should be able to still access previously visited Membersites. And if your Homesite is blown away, then you’ll have to start over again with a new one.

LID doesn't have this problem, because you control the data. It's true that if your site is down you can't be authenticated, but at least you know the data won't go away without your compliance. That's the advantage of a system managed directly by the user and based on DNS, rather than by an external party with DNS-like properties.

However, using a URL, as David Weinberger has pointed out, has its disadvantages. A few years back, if you had asked me what URL I would use, I would have picked something related to a long-time domain, yasd.com. However, that was before the domain was so overrun with email spam that it was no longer of any use. Now my test identifier is based on burningbird.net. Who knows what it will be in three years?

A better approach might be to use something such as purl.org, which can provide permanent URIs that are then redirected to a specific domain, but which themselves never change. However, this still puts some dependency on an external organization, and I hesitate to do this more than absolutely necessary. And frankly, I’m not sure this would work within the LID system.

Another approach could be a broadcast method, whereby every time you log in with your particular identity at a specific site, the local system maintains a link to the site. Then, if you ever move to a different URL, you formally document the move within the system, which then visits each site where you've been and issues a request–a verified request–that each modify its data to reflect the new ‘you’. I wonder if this might be the technique that Ernst discussed with David for handling this problem. Hard to say; second-hand info.

And so…

LID provides a great deal of functionality in a tiny little package. It supports pseudonyms (personas), secure authentication, single sign-on, and data exchange, all using standard, accessible technologies. More, it's not dependent on any single centralized authority, other than the DNS itself. But then, we're all rather dependent on that. As for how well LID meets Kim Cameron's Laws of Digital Identity: I never touch “Laws” as defined by a person or persons with a vested interest, regardless of how good they sound, so Johannes Ernst's writing will have to stand in answer to this.

As I said earlier, though, LID needs work. Rather than having users manually edit the lid.xml file, I would recommend that a user interface create the file entries and then encrypt the important and sensitive bits. I also think it needs to be plug-and-play for non-techs, and to offer an efficient, and secure, way to change one's ‘identity’ (URL). A more formal API would only help, as would more extensive documentation. It needs to be extended to other languages, not just Perl.

I also would like to see the source and concepts released as open source, preferably under the BSD or GPL license, and wouldn't mind hearing more about the business model. And of course, testing. I'd like to see lots and lots of testing.

Still, the root concepts of LID are good. I think that building a unique identification system on one already in place is a good approach; and for those people who don't have domains, a trust broker could provide one for a small fee–as long as there is a way to export the LID information in a format that lets the person back up their account after a change, and move their account to another broker.

I also like the extensibility of the system, and have already tried out various tiny bits of other XML documents I have. As for the use in social networks, LID already provides integration with FOAF, probably the most open of all the ‘who knows who’ specifications.

I like Sxip, and if it weren't for all the centralized pieces, I would like it even more. It's definitely not an Armani-suit type of system, but I still see the vague shadows of a briefcase in its architecture. I think LID is the type of digital identity system I can wrap around the type of person I am, rather than around the power luncher discussed earlier. Ensure the open source nature of the concept and the code, expand both code and documentation, test, test, test, and I'd like it enough to even use it.

(If you’re interested in digital IDs and want to try this functionality and can’t install it locally, holler and I’ll create you a temporary digital ID in my sandbox in which you can try this out. If you have a VCARD.xml or FOAF.xml file, send those along with your request.)

Categories
Photography Places

Banging heads for fun and profit

I managed to get LID installed, and you can see it in operation here. I’m in the middle of another one of my multi-page essays on digital ID generally and LID specifically, and hopefully will be finished tonight, or tomorrow. I bet you’re sitting there just holding your breath, excited down to your privates at the thought of me releasing an essay on digital IDs, aren’t you? Well, when I do, don’t pee your pants.

I would have been finished sooner, but today was the first really nice day we’ve had all week. Instead of the cold, dry days we’re supposed to get in January, we’re getting warm, wet thunderstorms. Really lousy weather for hiking, which means next week, I’m going to have to find some alternative exercise or I’ll just end up staying at home, writing more code, and getting bitchier.

Not today, though. The sun broke through, the temperature was a balmy 50F with just a gentle, cool breeze; it felt more like spring than winter. I wasn’t up for a strenuous hike and also wanted to test my new pack fully loaded, so I went to Shaw and walked some of my usual paths. The ground was a bit squishy, but that makes no difference when I’m in my waterproof booties. Not many people out considering how nice it was, but that’s the great thing about hiking in the winter — you can go for miles and the only company you’ll have is a red-headed woodpecker tapping at the trees, looking for bugs. And finding them, too.

Shaw is an education center as much as it is a conservation area, and it's not that unusual to see odd buildings and whatnot here and there for some class or another. But I wasn't expecting to see a sod house built on top of the hill overlooking the prairie. A nice one, too — watertight and more than capable of holding out the elements. With a thatched roof that was actually sprouting green.

When I reached that interesting little building on the hill, I stopped for a while; leaned up against the fence eating trail mix, drinking water, and just enjoying the view. This is all part of my new ‘no rush’ hiking and walking philosophy. I’ve noticed, lately, that when I’m on hikes, I’ve not taken the time to really appreciate the land as I pass through — always wanting to make the distance, go the miles, reach the end. However, what’s the good of being out in the country if you’re only going to bring the stress of the world, virtual as well as real, along with you?

No, plenty of time to stop and take in the view. And watch what looked like a group of blue jays doing the big naughty in the field.

Back home, after stopping off at Route 66 State Park on the way to check out the water levels, I caught up on my weblog reading and found out that the head honchos at GM are blogging now. And everyone was just so excited, jumping up and down excited, that one of the vice presidents of GM is blogging. There's also a small engine blog, of all things. I imagine Ford is just around the corner; if so, I wonder what the Ford blogs will look like. I mean, will the backgrounds come in any color the weblogger wants, as long as it's black?

But just when it was all looking so dark, I spotted a poem here that cheered me:

It was all about cats
and their habitats
But they only invited
the dogs and the rats.

I spent the day in prairie and wood, on mud-like trails under coffee cream skies, sure of path but lost in thought. Lauren, does that still count?

Ah well, back to the digital ID writing because I can hear you all panting for it. Back to the code, and quickly, too, before my site goes down under yet another DDoS attack.

Categories
Technology

Self-documenting technology

Danny Ayers points to a Jon Udell article about dynamic documentation managed by the folks who use a product, rather than relying on stuffy old material provided by the organization making the software. In it, Jon writes:

Collectively, we users know a lot more about products than vendors do. We eventually stumble across every undocumented feature or quirk. We like to maintain the health of the products we’ve bought and we’re happy to discuss how to do that with other users.

The problem is that vendors, for the most part, do a lousy job of encouraging and organizing those discussions. Here’s an experiment I’d like to see someone try: Start a Wikipedia page for your product. Populate it with basic factual information, point users there, then step back and let the garden grow. Intervene only to repair vandalism, make corrections, and contribute useful new facts.

I had to pause when I read the words …we users know a lot more about products than vendors do. I was reminded of finding information about how to convert my Nikon 995 camera to RAW format, based on the helpful advice of just such a user, only to find warnings at the Nikon site that if you do, you null and void the warranty, because this could have Serious Consequences in the continued Usability of the Product.

I remembered reading a weblogger, brand spanking new in their use of SQL, telling everyone how they could fix a problem in their weblog just by running a certain SQL command, and then frantically sending the person an email saying that if the users do, there's a good chance they'll lose half their data. Then I was reminded that there is nothing more dangerous than a user who just knows they have the answer, and who also has absolutely no stake in whether you break your copy of the product or not.

Still, I have been helped numerous times by other users when I get into situations not documented in the product manual, and I can agree with Jon, and with Tim Bray that it’s much easier to interactively look for help than to read through static documentation when you run into problems.

Jon Udell suggests a new documentation strategy for technology vendors; rather than going on publishing incomplete, out-of-date, poorly written manuals, they could just set up a per-product Wiki and let the customer base fill it up with problems, fixes, workarounds, tips & tricks.

Which is probably why most company-provided formal documentation isn't focused on problem resolution as much as on problem prevention. For instance, all the interactive help in the world isn't going to help a new user set up a Movable Type weblog if they have to go query for each stage in the installation process. That's why a company like Six Apart provides a quite nice installation guide that covers 99% of the situations most people run into. By providing step-by-step instructions, the majority of people are able to get their weblogs up and running without too many problems.

On the other hand, OsCommerce, a heavily used, free, open source product for managing ecommerce sites, has little or no formal documentation other than that provided by users in its forums. However, other sites have sprung up providing documentation, including a wiki and multiple sites that provide, ta dah!, formally written, structured documentation for how to use the product.

Why the need for the latter? Because, for the most part, OsCommerce is used by people who don't have much technical background or experience, and they are very uncomfortable without structured documentation they can follow, step by step, on how to actually use the product. Not troubleshoot, but actually use the app.

Still, Jon and Tim aren't recommending that companies not provide this information — they're saying provide this (for all those who can't connect the dots through Google, I imagine), but then add areas where users can contribute additional documentation and help each other.

Jon mentioned a wiki, and Danny pointed to the WordPress wiki as an example of this type of documentation project. Once upon a time, I also thought that a wiki would be a good tool for open documentation efforts. However, that was before I tried to get several people–non-technical people–interested in providing information at a wiki I set up. They were willing, but many were also intimidated by the environment. It was then that I realized that a wiki requires not only a great deal of knowledge about how to edit the pages, but also familiarity with the culture. In other words, wiki is for those who are wiki primed.

In fact, this has become a problem out at Wikipedia — most of the editors are technical people, and therefore the skew of information tends to be towards technical subjects. The organization has taken to actively promoting non-tech topics in hopes of attracting enough contributors to these other topics to start achieving some balance in coverage.

But then, Wikipedia has the necessary, critical element to make this work — it has achieved enough momentum to be able to direct attention to obscure topics and know that there should be enough members of the audience with knowledge of this topic, and willingness to dive into what is a fairly structured culture, to provide at least a bare minimum coverage of the topic. Most wikis will not.

That’s why when Jon proposes that vendors provide a hands-off wiki for users, and then uses Wikipedia as an example of how well a wiki can work, I winced. Too many people point to the Wikipedia as a demonstration of how a wiki can work, without realizing that the Wikipedia is unique in its use, purpose, and community. In other words, if the only wiki we can point to as a demonstration of how wikis work is Wikipedia, then perhaps what we’re finding is that wikis don’t work. Or don’t work without a great deal of organization on the part of the wiki administrators, and an already existing community of willing contributors.

Of course, technology users can be heavily motivated to support the products they use, as we’ve seen with weblogging technology. There is nothing more loyal than a weblog tool user–unless it’s a Mac user. You take your life in your hands when you take a critical bite out of Apple.

Based on the assumption of interest on the part of tech users, let's return to the WordPress wiki that Danny pointed out. Checking the recent changes, we find that on January 6th a user who calls himself GooSa added a bunch of spam pages. I imagine the pages will be removed by the time you look at this, but I copied this person's ‘user’ page entry:

Goo Sa is an evil, evil spammer. Plus he smells like eggs.

Other than that, there isn't much activity on this wiki, because it's no longer the WordPress wiki. No, that's now the Codex wiki. As you can see in the recent changes at that site, there is a great deal of work being done, as well as less spam content. Of course, I don't believe it is linked anywhere from the main WordPress site, so it could be that the spammers haven't found it yet.

If they do, there seems to be enough organization to keep the site clean of obvious spam, but what about the not-so-obvious destructive actions? For instance, the malicious editor who adds a helpful tidbit that could actually harm users, harm that only a very experienced technical person would recognize? (Or the user who has just enough knowledge about a topic to make them scary as hell.)

If this tip was out at a forum or email list, the user might be (should be) wary enough of the tip to perhaps get it vetted first; but this is the ‘approved’ wiki for the product, which implies trust in the material contained. Does this mean, then, that the WordPress developers vet every bit of information in the wiki? If the developers are busy providing code for the product, this is unlikely.

Of course, the thing about wikis is that they are self-correcting. However, it can take time for a correction to take place, and in the meantime, I’m fielding emails from half a dozen WordPress users about why their comments have stopped working, or what happened to their data, all because they followed information at the ‘official’ WordPress wiki.

Now, Wikipedia doesn't have many of these problems, because there is a good, formal procedure, with enough people monitoring the site, in place to route around damage. However, as the administrators of the site warn on a fairly regular basis: believe what you read there at your own risk. A wiki that's ‘authorized’ by a vendor to provide documentation for a product can't afford to be this loose about what information gets released under its corporate umbrella.

Security and validity of the data aside, wikis foster a form of organization that may not be comfortable for all people. The information tends to float about in pieces, rather than flow smoothly as more formal documentation does. Now, this might suit many of the more technical folks who want to know what's going on with WordPress; I have a feeling, though, that the less technical folks who read it are going to feel cut adrift at times, as they read a discrete bit of information here and one there, without the experience to understand how the two pieces are related.

Of course, again, that's where the organizers come in, by helping to move things about and point out gaps in the flow–but at what point does it become obvious that if the organizers had just spent their time writing the documentation themselves, it would have taken less time than continuously preventing harm to the material?

Issues of wiki aside, let's return to Jon Udell's request for a new way of managing community documentation. He talks about not being able to find information at the vendor site to solve his problem, so he searched on Google and in user forums, and found the answer he needed. He then says something new needs to be done to enable user access to community information.

Now, go back and re-read this last paragraph a couple of times.

As Danny points out, much of this user information is already available, but what might be missing is a way of accessing it:

But wait, those discussions with considerable information is already available – the end-user’s don’t really need any extra encouragement, they’re motivated as it is. But what’s lacking is the organisation of that information. Google is a very blunt instrument. Yes, a company Wiki could act as a focus for people, but they’re still be plenty of info on mailing lists and blogs that could be far more accessible.

But as Jon points out, whether intentionally or not, Google does work, and worked for him in this context. More, what he and Tim Bray, and to a lesser extent Danny, are looking for is the type of documentation that favors the geek, when it's the geeks who don't need the help — they know how to dig this information out, and how to vet its usefulness as compared to its harm.

What about the non-tech? The non-geek? The very documentation that Udell and Bray scathingly reject is the documentation most non-techs need: well-constructed, clearly written, stable, and vetted user documentation about how to use the tools. Throw in searchable user support forums for troubleshooting, plus Google and blogs and online interest sites, and Babs is your aunt. So when Tim writes:

Like many great ideas, it’s obvious once you think of it. I’m quite sure it’ll happen.

I have to scratch my head in confusion, because it seems to me that the mechanisms behind this ‘new idea’ are already in place, and have been for some time. Now, if we could just convince the open source community and supporters like Jon and Tim and Danny that rather than spending time creating a new style of hammer, they need to provide nails and just start hammering away, we might be set.

Still, I don’t want to completely discount Jon’s wiki suggestion: it could be humorous to see what happens out at, say, a security wiki for Windows software. Might be better than a raree show, indeed.