I, URL

Recovered from the Wayback Machine.

My first exposure to the concept of a ‘federated identity’, or a digital identity or ID if you will, was when I had to obtain one of the first Microsoft Passport identities in order to access the material I needed to finish my book, Developing ASP Components. I was pleased with the concept, then, because it would give me a way to sign into all the Microsoft sites I visited and only have to remember the one username and password.

I was quite fond of MS tech at the time, and focused almost exclusively on this vendor in my writing. However, if you had asked me, then, whether I would input credit card information and use Passport to sign on to eBay or Amazon, I would have looked at you, blankly, waiting for you to finish the joke.

You see when email was created, two days later the first email spam was sent. And when the web was created, two days after that, the first DoS (Denial of Service) happened. Well, two days in a relative sense — almost on the doorstep of any new technology, there will follow the legions of kiddies and cons, waiting to take advantage of any opening and vulnerability. Therefore, when you talk about gathering enormous amounts of extremely vulnerable data into one spot, I would have to assume you’ve just gotten off the boat from your naivete about how secure you can make anything that’s attached to the internet.

Of course, my information is vulnerable anyway, regardless of what I do. My bank provides access to both my social security and debit card information at its site, and my car company also provides access to the same. There’s little I can do about companies choosing to make my data vulnerable, other than to review security procedures they follow, and be ready to hold them accountable if something happens to my data. Oh, and check my credit report every couple of months to make sure nothing is there that shouldn’t be.

But to voluntarily group sensitive data about myself behind the thin shield of a digital identity? No, not on your life.

However, I also get a little peeved at times about having to sign up at all the various newspapers’ sites to get access to their articles. And I imagine if I did the social network thing, such as LinkedIn and Orkut, I would get vastly annoyed at having to re-input whatever information is appropriate to these venues, not to mention all of the many (three) associations I have. I also wouldn’t mind being able to keep my address in synch at all the places I do business, such as Amazon and B & H Photo. I still wouldn’t allow a site to store my credit card if I can avoid it, but I don’t mind so much my address and other contact information; this is easily obtainable regardless.

Of the other data I’ve been asked: I don’t really want to share my birth date, why not just ask me if I’m under or over 18? You don’t really need to know what I do for a living and how much I make. I’ll give you my zip code, but why do you need the full address? And no, you can’t have family member names. As for my sexual preferences, buzz off you snoopy little creep.

And under no circumstances would I input a social security number unless it was required by law.

Based on this, the concept of a digital identity for something like ‘single sign-on’, which would allow me to have one identification and password, as well as be able to share information such as my address, is an attractive proposition. Especially if I could severely limit how much information I input into these systems–because no matter what they tell me about security, there is no such thing as a secure system connected to the internet.

In addition to these constraints, whatever system I used would not be centralized. If you’ve read me for any length of time at all, you’ll know that I have a very real dislike of centralized systems. Centralized systems are dependent on entities that may not exist someday. Centralized systems provide too tempting a target for Bad People. Centralized systems maintain too much control over what I do, or do not do, on the internet — and this latter is the number one reason I don’t like centralized systems.

As for my experience and exposure to identity systems: as mentioned previously, I used to have a Passport account but this is long gone, and I’ve still managed to avoid having to get a TypeKey account. I once worked directly with Boeing’s data model efforts, so I recognize the impetus behind the Liberty Alliance, and, frankly, consider this effort to be a way of providing R & D perks for key personnel. I’ve worked with Oblix at one of the universities where I consulted, and find it usable in the limited context of its scope–which is identity management within a closed, controlled group. I think Ping Identity is bloated, but then I also think J2EE is bloated. I am fairly newly aware of Sxip, primarily because Marc Canter said that it works ‘just like DNS’. And I have to ask, who would pay $25.00 to register an ‘i-name’ at Identity Commons, without seeing the tech first?

All in all, the main problem I had with most of the systems and/or products and/or organizations is that all but a few seemed to be focused on some grand scheme or another, where the tools they provide would be used by people in Armani suits, on their way to a power lunch with some mover or shaker or another. They represent Big Things by Big People.

They weren’t for the likes of me, people who come in all muddy from a hike, and who sit down at their computer to read an article at the Washington Post, but don’t want to have to register for yet another online newspaper. Or wouldn’t mind not having to re-input their mailing address at another store, and would like to just push a button and have any associations recorded, compared, and matched with others who have joined YASN (Yet Another Social Network).

These people are the only type of people I can wrap my mind around when I think of ‘digital ID’. That’s why when the creator of LID (Light-weight Digital Identity), Johannes Ernst, sent me an email about it, I was intrigued, primarily because to all intents and purposes, this system of digital identity allows one to control one’s own data; to easily extend the system using fairly standard technologies of XML and XPath for queries; and it’s a very simple concept — few bells, small number of whistles.

Cool. But will it work, and how does it compare with other systems?

Installation

I installed and played around with LID at the URL I picked to identify myself with, http://burningbird.net/shelley/. Clicking the link will bring up the help page for the installation, which provides multiple tests you can try.

I installed minimal VCARD and FOAF files, following the instructions. Running an XPath query on the FOAF name returns the correct value, and I can add other XML vocabularies if I want to extend the installation. I also tried the single-signon against the LID site, and had no problems. Trying to install a single-signon site myself did result in some Apache errors, but I’ll keep tweaking.
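To make the XPath step concrete, here is a minimal sketch in Python of the kind of query involved. It is not LID’s own code (LID is written in Perl); the FOAF document, the name in it, and the `foaf_name` helper are all made up for illustration. Python’s standard library supports only a subset of XPath, which is enough for a lookup like this.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal FOAF document; a real file would carry more fields.
FOAF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Example Person</foaf:name>
  </foaf:Person>
</rdf:RDF>
"""

# Prefix-to-namespace map used to resolve the prefixes in the query.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def foaf_name(xml_text):
    """Return the text of the first foaf:name under foaf:Person, or None."""
    root = ET.fromstring(xml_text)
    node = root.find("./foaf:Person/foaf:name", NAMESPACES)
    return node.text if node is not None else None


print(foaf_name(FOAF_XML))
```

Adding another XML vocabulary would, in this sketch, amount to one more entry in the namespace map and a different path expression.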

The installation wasn’t too complicated. I have the Gnu Privacy Guard (gpg) application installed, and I also have an SSH account to be able to access my server from the command line, so I could generate the key, per instructions. I did have to ask Hosting Matters to install the Perl module for XPath, as it wasn’t installed. Despite having to handle the aftermaths of two separate DDoS attacks, the company installed it within 30 minutes.

Still, not all hosts are as willing as HM to accommodate their clients in this way and this is a strike against the application: it depends on having gpg and being able to run it at the command line, on a module that’s not common in most installations, and on Perl, a language environment that’s not easy to extend. I can’t help thinking that PHP might be a better solution, as well as more comfortable for people to use.

The install procedure was as follows:

1. Create the sub-directory which will serve as your id location. In my case, http://burningbird.net/shelley. This could be your weblog location, but if you’re running a PHP main page, this is not necessarily compatible with LID. I found that my main index.php page wasn’t successful. A better approach is to build something off your main domain directory.

2. Created the .htaccess file, which set the index page access order to index.cgi first, index.html second. I didn’t need to add the line to ensure the CGI file would be executable.

3. Tested the index.cgi file, and then sent email to HM to ask them to install XPath.pm.

4. After XPath was installed, I created the lid subdirectory, which has a lib.xml file for configuration. I copied the template from LID and modified it.

5. I have simple VCARD and FOAF XML files, and copied these as VCARD.xml and FOAF.xml, respectively, to a data subdirectory under the lib subdirectory. I now have the following subdirectories:

/home/shelley/www/shelley
/home/shelley/www/shelley/lib
/home/shelley/www/shelley/lib/data

6. All done.
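The directory and .htaccess steps above can be sketched as a short script. This is an illustration only, not part of LID: the `scaffold_lid_site` name and the temporary root directory are assumptions, the directory names follow the listing above, and `DirectoryIndex` is the standard Apache directive for setting the index page order. The key generation and the configuration file are left out, since they depend on the LID instructions.

```python
import tempfile
from pathlib import Path


def scaffold_lid_site(root):
    """Create the directory layout from the listing under `root`."""
    site = Path(root) / "shelley"
    data_dir = site / "lib" / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    # Serve index.cgi first and fall back to index.html.
    (site / ".htaccess").write_text("DirectoryIndex index.cgi index.html\n")
    return site


# Build the layout in a throwaway directory rather than a real web root.
with tempfile.TemporaryDirectory() as tmp:
    site = scaffold_lid_site(tmp)
    for path in sorted(site.rglob("*")):
        print(path.relative_to(tmp))
```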

As you can see, aside from the XPath module, it’s about as easy as installing your own weblogging tool. But Perl always adds an extra challenge when adding new modules, as compared to PHP, which could simplify the use of this technology considerably. David Weinberger noted the complexity of the install, and asked Johannes Ernst about it in an email. According to the reply, the initial release is for technologies for exploration. I think a better approach would be to provide something usable by both techs and non-techs, because many of the people interested in digital IDs, such as David, aren’t techs. If they can’t play, they can’t write to promote the concept, and if they can’t write about the concept, it is going to get slower acceptance.

Still, the tech is flexible; all these issues could probably be easily addressed, and we should be focusing on the concepts. This includes the use of a URL as digital ID, in addition to how a distributed system would work in comparison to a more centralized system such as Passport, and a closed distributed system like Sxip.

Comparisons

Asking how LID compares with other systems first requires you to pick which other systems. Following the LID creator’s own examples, I focused on Passport, Liberty Alliance, Sxip, and Identity Commons.

Passport is owned and operated by Microsoft, which also controls all the data that’s included within the system. If you’ve thought about leaving a comment at an MSN weblog, you would have been asked for your Passport identification. If you used eBay prior to December, 2004, you could also have used your Passport identification for sign-on. Now, though, because of security concerns, most uses of Passport are related to Microsoft content or Microsoft sites.

You don’t have to install Passport, but all data in the system remains in the centralized system, under control of Microsoft. LID, on the other hand, doesn’t store any data about you. In fact, it doesn’t even know you exist — there is no way of tracking a LID user from some root LID site.

External storage and control of our data is a concern that comes up with digital identities–who has access to the data, and what can they do with it. Frankly, in my opinion, though, this concern is overrated.

The primary interest in single sign-on systems for the user is to make it so they don’t have to remember their username and passwords from site to site. Additionally, they also don’t have to answer all the same obligatory questions at each site: address, phone numbers, and so on. Regardless, though, of whether you enter the data in one spot or many, once you make the decision to do business online, in whatever way, you have lost some control of the data…or at least some control of how the data is used.

We have dropped our names, phone numbers, email addresses, home addresses, birth date, and various other bits of publicly accessible data in more places than we can most likely remember. There is nothing to ‘control’ about this information: we voluntarily dropped the reins of this data long ago.

It’s when we expose very sensitive data that we should be concerned about the control of the data, and this primarily because of security. For instance, if we store our credit cards with our digital identities, we then want to make sure that the data is very secure and safe from hacking. This is where a centralized system can be most vulnerable, as it stores many, many such important bits of data and therefore becomes a particularly tasty target for hackers.

However, regardless of the system, there’s a way around this and that is not to store your credit card information online. All sites, unless they’re particularly primitive and ill designed, give you an option not to store your credit card information. Those that don’t, don’t deserve your business.

(And no site should ask for your social security number, unless required by law to report income. Even job search companies should not ask for this information; it’s up to an employer to obtain your SSN information after you’re hired or contracted for a position, not before.)

Consumers, spurred on by security reports, scaled back their initial trust of online systems, and the concept of ‘federated’ identities in an ecommerce setting has consequently lost interest among many consumers (and companies).

For all that Microsoft made Passport easy to use, and therefore could be considered for the blue-jeaned muddy hiker, it also has not the best reputation when it comes to security. So it fails for the blue-jeaned muddy hiker who is paranoid. This lack of trust did impact on Passport, whether the company will admit this or not. Microsoft dropped its credit card option from Passport in 2003. In fact, Passport is no longer a viable entity in the global digital identity game, primarily focusing its use on its own sites, and those of some partner sites.

If Passport is basically a non-player now, then what about Liberty Alliance? Well, frankly, Liberty Alliance isn’t for the likes of you and me, regardless of all its talk about federated identities, and discussions about the Liberty ID-FF — the specification behind the Alliance’s identity scheme. Case in point, from the specification there is a possible user scenario, with Joe Self logging on to an airline, who is part of a circle of trust. Once authenticated, in the scenario, Joe is then asked:

Note: You may federate your Airlines, Inc. identity with any other identities you may have with members of our affinity group.

Do you consent to such introductions?

Laughable. I chortled until tears ran down my face. It then continued on from there, with Joe Self being asked to ‘federate his identity’ at various sites within the ‘affinity group’ as he progressed along, just trying to reserve an airline ticket and rent a car, something that can be done in one move, with one click of the button in today’s travel systems.

Returning to LID, though, I find I can’t compare the two implementations, because it would be like trying to compare an Oraclized PeopleSoft with WordPress. More, where LID represents a service to the user, Liberty Alliance represents a service to Alliance members, no more, no less. In other words, the two implementations are so far apart on the scale, that the scale becomes meaningless. Frankly, this is all to LID’s favor, too.

However, both Passport and Liberty Alliance represent large corporations trying to manage every aspect of one’s digital identity. What about smaller efforts, such as Identity Commons?

It would be great to compare LID against Identity Commons if there was anything to compare. What amazes me, from what I can find about this entity/effort, is that you can now register your ‘i-name’, using a URN (Uniform Resource Name), and this will be good for 50 years. All for $25.00. However, if you perchance want to check out the tech first, no such luck, because though I searched high and low, I couldn’t find anything.

Regardless, the approach seems to be that you register for your own personal identification through a broker, where one assumes you’ll store all the important bits about you that form your online self. Your data is distributed, but still managed by another entity. However, your uniqueness in the system is guaranteed by the fact that there is one overall centralized authority that manages the distribution of the actual identities.

Still, there’s nothing to see, feel, and tweak. In other words, there’s a lot of good words, and promises, but for all intents and purposes, it’s a pipe dream until it releases something tangible, though it does look like it might be releasing something today, unless this page has been up for months.

Sxip, on the other hand, does have technology you can see, feel, and tweak. In fact of all the alternatives examined, it seems to be the closest to matching what I would look for in a digital ID. Almost.

Sxip and SXIP

There is Sxip, the company, and SXIP the protocol. The latter stands for Simple eXtensible Identity Protocol. The company is managed by Dick Hardt, who I know through my efforts to include ActiveState in that same aforementioned Developing ASP Components book. Since I was rather fond of ActiveState, I was somewhat predisposed to be positive about the Sxip efforts. This predisposition was only strengthened when I was able to download what the company calls a “Membersite Developer Kit” to play around with some of the concepts, myself.

(In case you’re curious, I downloaded the PHP version.)

How the Sxip system works is that users can sign up for an account at a Homesite, and then use this identity at any number of other Membersites. They can create multiple personas and associate different pieces of data with each persona. Then, when they log into, or “sxip into”, a Membersite for the first time, they’re given an option as to what persona to use with that site. I tried it out with the demonstration materials provided by Sxip, and found it to be a very simple process.

Now, how Membersites and Homesites know about each other and can exchange information is through the use of a Rootsite, which basically manages the unique identities, without access to any of the other user data. This is similar to i-broker in Identity Commons, I believe. Where it might differ is that if one has the development expertise, one could develop one’s own Homesite for just their own personal use.
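The division of labor described above can be modeled in a few lines. This is a toy sketch of the roles as described here, not the SXIP protocol or the Membersite Developer Kit: the class names borrow Sxip’s terms, but every method, identifier, and piece of persona data is hypothetical.

```python
class Rootsite:
    """Toy registry: guarantees unique identifiers and holds no user data."""

    def __init__(self):
        self._registered = set()

    def register(self, identifier):
        if identifier in self._registered:
            raise ValueError(f"{identifier!r} is already registered")
        self._registered.add(identifier)


class Homesite:
    """Toy Homesite: stores each user's personas and releases one on request."""

    def __init__(self, rootsite):
        self._rootsite = rootsite
        self._personas = {}

    def create_user(self, user_id):
        # Uniqueness is checked at the Rootsite; the data stays here.
        self._rootsite.register(user_id)
        self._personas[user_id] = {}

    def add_persona(self, user_id, persona, data):
        self._personas[user_id][persona] = dict(data)

    def release(self, user_id, persona):
        # A Membersite sees only the persona the user picked for it.
        return dict(self._personas[user_id][persona])


root = Rootsite()
home = Homesite(root)
home.create_user("hiker")
home.add_persona("hiker", "newspaper", {"zip": "00000"})
home.add_persona("hiker", "store", {"zip": "00000", "address": "1 Example Street"})

seen_by_newspaper = home.release("hiker", "newspaper")
print(seen_by_newspaper)
```

In this model the newspaper never sees the street address, because it was never part of the persona chosen for that site.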

It is this latter capability that matches closest to LID’s own user controlled data technique, though maintaining a personal Homesite looks as if it could be outside of the capability for a non-developer. (Still, it wouldn’t be out of the boundary of the tool to create a plug-and-play Homesite. Would this work contrary to the overall system expectations? It would come at the same cost as a digital certificate, according to the documents at Sxip.)

Still, there are major differences between the LID approach and the SXIP approach. For instance, with LID, the effort would be completely distributed, with no central authority controlling the issuance of identities. This can work this way because it’s based on each person being identified by a URL, and is implemented within the existing domain system as managed by ICANN: within the DNS framework, there can be no two identities alike because there are no two domains alike.

Marc Canter and Sxip both say that SXIP works like the DNS, but it doesn’t really, other than there being one central authority preventing duplication of names, as well as a resolution of where the data associated with these names resides.

For instance, in DNS, when a person accesses a domain, and their ISP’s nameserver does not recognize it, the ISP checks with a higher level root nameserver to find out the location of the domain’s nameserver. It is this that provides the unique name-IP address mapping. The ISP’s own nameserver then gets the IP address and stores it and the domain within its own cache of data before responding to the request. The next time another person who uses the same ISP accesses the domain, the ISP already has the information.

This caching serves a couple of purposes within the DNS. For one, it makes new requests of a domain that much quicker. For another, it helps to disperse the information about the domain across many different nameservers, so that if there is something wrong with one, the system can usually route around the damage and find the IP-domain name mapping in another.
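The caching behavior can be illustrated with a small simulation. This is a sketch of the idea only, not real DNS: the `CachingNameserver` class is hypothetical, the upstream lookup is a plain dictionary, the domain and address are reserved documentation values, and real resolvers also expire cached entries after a time-to-live.

```python
class CachingNameserver:
    """Toy stand-in for an ISP nameserver that remembers earlier answers."""

    def __init__(self, upstream):
        self.upstream = upstream      # stands in for the root/authoritative lookup
        self.cache = {}
        self.upstream_queries = 0

    def resolve(self, domain):
        # Answer from the local cache when the domain was asked for before.
        if domain in self.cache:
            return self.cache[domain]
        self.upstream_queries += 1
        address = self.upstream[domain]
        self.cache[domain] = address
        return address


upstream = {"example.org": "192.0.2.10"}
isp = CachingNameserver(upstream)

first = isp.resolve("example.org")    # goes upstream
second = isp.resolve("example.org")   # served from the cache
print(first, second, isp.upstream_queries)
```

If the upstream entry disappears after the first lookup, this toy nameserver can still answer from its cache, which is the dispersal effect described above.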

There isn’t anything like this in SXIP. What it does, instead, is store a cookie with information about the user’s Homesite in the computer being used; or provides a place to fill this information in if the cookie is gone, or the person is using a shared machine. This can work rather well, and about the only dependency that exists now is authenticating the uniqueness of the identity at the Rootsite, when a new user, or Membersite, or Homesite is created except…

Except for the Homesite going down, or if it no longer exists.

The real strength of the DNS system is that information about a domain is cached all throughout the system, and the only time a problem will occur is if something has happened to the person’s own nameserver, but even then, they have a backup. If you’ve ever registered a site, then you know about providing two different nameservers, and that these are usually at two different locations. The whole concept is based on redundancy.

There is no redundancy in SXIP. I looked, because I thought I had read that you can store your information at more than one Homesite, but from the developer documentation, it would seem there is an assumption that there is one, and only one, Homesite. If true, then if your Homesite is down, you can’t log into a new Membersite, though you should be able to still access previously visited Membersites. And if your Homesite is blown away, then you’ll have to start over again with a new one.
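The difference between two registered nameservers and a single Homesite comes down to whether there is anything to fall back on. A minimal sketch, with simulated servers and hypothetical names throughout (`make_server`, `lookup`, `ServerDown`):

```python
class ServerDown(Exception):
    """Raised by a simulated server that cannot answer."""


def make_server(table, up=True):
    """Return a lookup function that either answers from `table` or fails."""
    def answer(name):
        if not up:
            raise ServerDown(name)
        return table[name]
    return answer


def lookup(name, servers):
    """Try each server in order and return the first answer received."""
    for server in servers:
        try:
            return server(name)
        except ServerDown:
            continue
    raise ServerDown(f"no server could answer for {name}")


table = {"example.org": "192.0.2.10"}
primary = make_server(table, up=False)    # simulated outage
secondary = make_server(table, up=True)

print(lookup("example.org", [primary, secondary]))
```

With two servers in the list, the outage of the first goes unnoticed. With a list of one, which is how the single-Homesite assumption reads, the same outage means the lookup fails outright.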

LID doesn’t have this problem because you control the data. It’s true, if your site is down, you can’t be authenticated, but at least you know that it won’t go away without your compliance. That’s the advantage of basing the system on something managed directly by the user, based on DNS, rather than on an external party based on DNS-like properties.

However, using a URL, as David Weinberger had pointed out, has its disadvantages. A few years back if you had asked me what URL I would use, I would have used something related to a long time domain, yasd.com. However, this was before this domain was so overrun with email spam that it was no longer of any use. Now my test identifier is based on burningbird.net. Who knows what it will be in three years?

A better approach might be to use something such as purl.org, which can provide permanent URIs that are then redirected to a specific domain, but which themselves never change. However, this still puts some dependency on an external organization, and I hesitate to do this more than absolutely necessary. And frankly, I’m not sure this would work within the LID system.

Another approach could be a broadcast method, whereby every time you log in with your particular identity at a specific site, the local system maintains a link to the site. Then if you ever move to a different URL, you formally document the move within the system, which then visits each site where you’ve been and issues a request–a verified request–that each modifies its data to reflect the new ‘you’. I wonder if this might be the technique that Ernst discussed with David about how to handle this as a problem. Hard to say, second hand info.
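A rough sketch of how such a broadcast might look follows. Nothing here comes from LID: the `Site` class, the function names, the URLs, and the per-site shared secret are all hypothetical, and an HMAC stands in for whatever real verification (a gpg signature, for instance) such a system would use.

```python
import hashlib
import hmac


def sign_move(secret, old_url, new_url):
    """Produce a verification tag for a move from old_url to new_url."""
    message = f"{old_url}\n{new_url}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


class Site:
    """Hypothetical site that remembers its users by identity URL."""

    def __init__(self, secret):
        self.secret = secret
        self.accounts = {}    # identity URL -> account data

    def handle_move(self, old_url, new_url, signature):
        # Reject the request unless the tag matches and the account exists.
        expected = sign_move(self.secret, old_url, new_url)
        if not hmac.compare_digest(expected, signature):
            return False
        if old_url not in self.accounts:
            return False
        self.accounts[new_url] = self.accounts.pop(old_url)
        return True


def broadcast_move(visited, old_url, new_url):
    """Send a verified move request to every (site, secret) pair on record."""
    return [
        site.handle_move(old_url, new_url, sign_move(secret, old_url, new_url))
        for site, secret in visited
    ]


OLD = "http://old.example/id/"
NEW = "http://new.example/id/"
store = Site(b"store-secret")
store.accounts[OLD] = {"zip": "00000"}

print(broadcast_move([(store, b"store-secret")], OLD, NEW))
print(sorted(store.accounts))
```

The weak point in any scheme like this is the list of visited sites itself: it has to survive the move, and a site that is unreachable on the day of the broadcast would need to be retried later.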

And so…

LID provides a great deal of functionality in a tiny little package. It supports pseudonyms (personas), secure authentication, single sign-on, and data exchange, all using standard, accessible technologies. More, it’s not dependent on any single centralized authority, other than the DNS itself. But then, we’re all rather dependent on this. As for how well LID works with Kim Cameron’s Laws of Digital Identity, I never touch “Laws” as defined by person or persons with vested interest, regardless of how good they sound; so Johannes Ernst’s writing will have to stand in answer to this.

As I said earlier, though, LID needs work. Rather than manually edit the lid.xml file, I would recommend that a user interface create the file entries, and then encrypt the important and sensitive bits. I also think it needs to be plug and play for non-techs, and to offer an efficient and secure way to change one’s ‘identity’ (URL). A more formal API would only help, as would more extensive documentation. It needs to be extended to other languages, not just Perl.

I also would like to see source and concepts as open source, preferably within the BSD or GPL license, and wouldn’t mind hearing more about the business model. And of course, testing. I’d like to see lots and lots of testing.

Still, the root concepts of LID are good. I think that building a unique identification system on one already in place is a good idea; and for those people who don’t have domains, a trust broker could provide one for a small fee–as long as there is a way to export the LID information in a format so that the person could back up their account after a change, and move their account to another broker.

I also like the extensibility of the system, and have already tried out various tiny bits of other XML documents I have. As for the use in social networks, LID already provides integration with FOAF, probably the most open of all the ‘who knows who’ specifications.

I like Sxip, and if it weren’t for all the centralized pieces, I would like it even more. It’s definitely not an Armani suit type of system, but I still see the vague shadows of a briefcase in its architecture. I think LID is the type of digital identity system I can wrap around the type of person I am, rather than the power luncher discussed earlier. Ensure the open source nature of the concept and the code, expand both code and documentation, test, test, test, and I’d like it enough to even use it.

(If you’re interested in digital IDs and want to try this functionality and can’t install it locally, holler and I’ll create you a temporary digital ID in my sandbox in which you can try this out. If you have a VCARD.xml or FOAF.xml file, send those along with your request.)