Categories
Social Media

Google and bad banning

I dislike banning. I dislike blacklists based on proxy, domain, IP address, and keyword. No matter how sophisticated the applications that support blacklisting, and no matter how well-intentioned the sites doing the banning, someone innocent always gets hurt.

My favorite banning story so far is from Jonas Luster’s weblog where he talks about showing some law enforcement people WordPress, only to discover that the San Diego Police Department was on the Real-Time Spam Blacklist. My less than favorite banning story was when the dedicated server I was leasing ended up on SPEWS–another blacklist.

A current favorite now is to ban comments or trackbacks that come in through open proxies, since comment spammers use these to post comments. Unfortunately, open proxies can be found at libraries and schools, and have even been used to route around censorship in countries like China.

I wouldn’t be as critical of blacklisting if it weren’t for one thing: once you’re listed, it can become almost impossible to get de-listed. Most of the blacklisting organizations assume you’re guilty until proven innocent, and you almost have to have an act of Congress to be proven innocent. Well, since our sites aren’t hooked up to a feeding tube, the latter is unlikely to happen. Then you go through weeks, months, even years, trying to get your site cleared so you can send email or post comments.

It would seem that Google also fits in the guilty-until-proven-innocent camp. Karl Martino from paradox1x wrote the following last week:

Help me please – PhillyFuture was probably banned from Google

I’ve had the domain back for one year. Googlebot has not come to index the site. After exhausting all other reasons, I suspect that Google banned phillyfuture.org from its index. Remember – the preceding year a porn company had it and was using it for redirection.

If anyone out there can help me – please – please do.

(Philly Future is Karl’s excellent community weblog and site for Philadelphia weblogs.)

Come on Google, a whole year to fix a problem? What do we have to do, use comment spam to get it listed?

(Thanks to Rogi for pointing this out, which also reminded me to update my subscription at Bloglines to the correct feed at Karl’s. Oh, and I did the background graphics, and thanks for the compliment, Rogi!)

Categories
Social Media

Search Engine antics

Another couple of tech issues appeared several times in my overworked aggregator: Google’s AutoLink and Yahoo’s API.

As soon as I read about the Yahoo API, I knew I wanted to try it out with the new site. If you look at the bottom of the sidebar, you’ll see several links that use the API to pull back search data and then format it within the existing site look. I plan on changing the topic of each search whenever a new and interesting one comes to mind, but for now, you can see the results for searching on orchids among images; about Social Security in the news; check out what’s happening with tagback in the web; and for all of my political friends, a whole mess’a Jon Stewart videos.

This capability will be built into Wordform as part of the new metadata functionality. It’s not major tech, but it’s fun.
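For the curious, the sidebar searches can be sketched roughly like this. The endpoint and parameter names below reflect my recollection of Yahoo’s V1 web search API and should be treated as assumptions; the formatting function is the part you’d adapt to your own site’s look (and it’s in Python here purely for illustration).

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Assumed endpoint for the 2005-era Yahoo V1 web search API
SEARCH_URL = "http://api.search.yahoo.com/WebSearchService/V1/webSearch"

def build_query_url(appid, query, results=5):
    """Build the GET URL for a web search (parameter names assumed)."""
    params = urllib.parse.urlencode(
        {"appid": appid, "query": query, "results": results})
    return f"{SEARCH_URL}?{params}"

def format_results(xml_text):
    """Turn an XML response of <Result><Title>/<Url> entries into
    sidebar-ready HTML list items, matching the existing site look."""
    root = ET.fromstring(xml_text)
    items = []
    for result in root.iter("Result"):
        title = result.findtext("Title", "")
        url = result.findtext("Url", "")
        items.append(f'<li><a href="{url}">{title}</a></li>')
    return "\n".join(items)
```

Swapping the topic of each search then amounts to changing one query string.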

What’s also been fun is reading all the different reactions to Google’s AutoLink. Dave Winer doesn’t like it:

The AutoLink feature is the first step down a treacherous slope, that could spell the end of the Web as a publishing environment with integrity, and an environment where commerce can take place.

Cory Doctorow loves it, though I think his analogy comparing AutoLink to a ‘beloved butler’ is a stretch. My idea of a beloved butler is someone who keeps my house clean, draws my bath after a hike, and massages my feet when I’m tired. AutoLink pales, badly, in comparison. However, Tim Bray thinks it’s evil:

Before, the Web, publishing was about words and pictures. Now it’s about words and pictures and links. I’m OK with reformatting and aggregating and all sorts of other things, but I don’t want downstream software fucking with my words. Or my pictures. Or my links. A lot of us feel this way.

Robert Scoble agrees with Tim and Dave Winer, writing:

I believe that anything that changes the linking behavior of the Web is evil. Anything that changes my content is evil. Particularly anything that messes with the integrity of the link system.

One word for you, Robert: Nofollow. This little doohickey, which you love so much, is going to change the linking behavior of the Web faster than a toolbar option that only works in IE, and only when the reader clicks a button, and only if you have an ISBN, address, or other obscure piece of data embedded in your page that isn’t currently already linked. Still, as Phil Ringnalda points out in facetious response to another weblogger in a fascinating comment thread, you can’t trust them sneaky readers:

I can’t trust my readers (an unsavory lot, though I love them dearly) to understand the sacred nature of your every word (some of them *gasp* will even copy text and paste it elsewhere!), so I removed your link. Let me know when you are providing your “web”log as either a signed PDF or one large image, so that they may be trusted to behave according to your anti-web rules, and I’ll put it back.

Hey Phil, don’t remove the link: just add “nofollow” to it. (And sorry that I, um, copied and pasted your text here, which is ‘elsewhere’..but it was my evil twin’s fault! I thought she was gone for good, but she hitched a ride back with me from Florida, where she was working as a Mary Poppins Disney character; working, that is, until she hit some kid over the head with her umbrella when he whined about wanting to see Goofy instead.)

One of the better ‘anti-AutoLink’ writeups was provided by Paul Boutin at Slate, who wrote:

I don’t think Google is evil for naively launching this feature. I do think they’ll be an accessory to evil if their tool prompts Yahoo!, Microsoft, or my ISP to start handing out similar software that’s a little more aggressive about stuffing in the links. Lots of companies have a different definition of “evil” than the Google guys—leaving money on the table is the ultimate sin.

If for no other reason, Google should yank AutoLink because it’s a poorly designed, oddly un-Googlish feature for a company that made its name on unobtrusiveness and unambiguous results. Most of all, it’s unsavvy. Google’s clever reinvention of Web ads won instant praise from both surfers and advertisers. AutoLink makes me wince. There’s got to be a better way to present map and book links than clumsily editing someone else’s HTML.

A good argument, particularly in comparison with Google’s other efforts: it is an un-Googlish form of technology–except for the fact that AutoLink is about a link, and there’s nothing more Googly than a link. In addition, if we measure every new technology against a possible evil abuse by other parties at some future time, we should have stopped email, cold, and told Tim Berners-Lee he could keep this new Web thing he’s promoting. And let’s burn Dave Winer in effigy for hooking us all on weblogs; my mama always told me to beware the pusher man.

What surprised me about this entire conversation is that people like Winer and Scoble are deathly against AutoLink, yet they push webloggers to publish their entire posts to their syndication feeds, where they can be pulled and massaged and combined with who knows what by any Tom, Dick, or Harry who comes along. I once had my writing appear in a published syndication feed at another weblogger’s site, surrounded by X-rated material, which changes the context of my writing a whole lot more than someone adding a link to a map based on an address.

And we’re talking about a toolbar that only works in Internet Explorer, the browser that’s almost guaranteed to take your carefully designed web page and muck it up so that it’s barely legible; leaving people who use it to view your site to think that you’re the worst ever page designer. True, it doesn’t do anything with your links. Frankly, though, on balance, if we’re that worried about our pages, I think we should keep the AutoLink and throw out the browser.

Now, if Google thinks about implementing a form of Hailstorm, I’ll bunny thump the ground with warnings of dire deeds and nefarious doings; but I give AutoLink a “mildly interesting” at best, and a “who cares” at worst.

Categories
Semantics Social Media Weblogging

Introducing Tagback

Recovered from the Wayback Machine (includes comments).

The purpose of Trackback initially was to ping the readers of another person’s post about something they might want to know about. Of course, we immediately started using it as a referrer link (“Hi, I linked to you!”)

So, we’re dropping trackback and we need something in its place. I provided the how-tos to add Blogline citations and Technorati links in the previous post, and these will provide you a listing of who has linked to the article directly. But that’s the limitation: these solutions are dependent on a link. How can we point a person’s readers to another post or article, without linking to the post directly?

Easy: Tagback.

For each post, I create a tagback consisting of the words of my individual post’s title, stripped of white space and dashes, preceded by ‘bb’ to differentiate my posts from other people’s posts. I also include a link to the Technorati tags page for this tag, which forms my ‘tagback’. You can see the tagback for this post at the end.

Now, you can either use the tag with a photo in flickr, or you can use it in del.icio.us to annotate any bookmark: your post, another person’s post, an article, a reference to a specification, whatever.

Since Technorati scarfs up delicious tags and flickr tags, all of these items will eventually appear in my Tagback page, along with weblog posts where people have linked to the tag directly in the post. And if Technorati excludes googlebots and other bots in the tags pages, thereby denying any pagerank to the tag pages, there is no incentive for spammers to spam this page.

As long as Technorati denies pagerank for the individual tag pages. Hint. Hint.

Now, regardless of what weblogging tool you use, including Blogger, WordPress, Movable Type, Typepad, ExpressionEngine, whatever, you can participate in discussions, and without having to install any code. Just use whatever tags or function calls you use in your weblogging tool to get the title, and create your own version of a tagback. Or you can manually create a tag for each post you’re interested in designating as a ‘to be discussed’ item, and leave it off from those posts you don’t want to create a tagback page for.

So, you guys were right – tags are handy. I could get the hang of this folksonomy stuff.

I did have to update the code to strip out dashes and just create a one-word tag. I don’t like it, but flickr can’t deal with dashes, it seems like del.icio.us wants to use spaces, and Technorati seems not to care. Since there is no standardized word delimiter across all of these systems, I just stripped out anything that isn’t an alphanumeric character.
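If you’re curious how little code this takes, here’s a rough sketch in Python (my own version lives in my weblog tool’s language; the Technorati tag URL pattern and the helper name are illustrative assumptions):

```python
import re

def make_tagback(title):
    """Collapse a post title into a single-word tag, per the scheme above:
    drop anything that isn't alphanumeric, then prefix with 'bb'."""
    return "bb" + re.sub(r"[^A-Za-z0-9]", "", title)

def tagback_link(title):
    """Hypothetical helper: wrap the tag in a link to its Technorati
    tag page (URL pattern assumed)."""
    tag = make_tagback(title)
    return f'<a href="http://technorati.com/tag/{tag}" rel="tag">{tag}</a>'
```

The same tag can then be attached to a flickr photo or a del.icio.us bookmark, and all of it rolls up on the one tag page.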

Categories
Social Media Specs

I broke Nofollow

I’m still trying to write something on Technorati Tags. What’s slowing me up is that there’s been such a great deal of interesting writing on the topic that I keep wanting to add to what I write. And, well, the weather warmed up to the 60s again today, and who am I to reject an excuse to go for a nice walk? Plus I also watched Japanese Story tonight, so there went yet more opportunity to write to this weblog.

Thin excuses for sloth and neglect aside, it is interesting that a formerly obscure and rarely used attribute in (X)HTML, rel, has been featured in two major technology rollouts this week: Technorati Tags and the new Google “nofollow” approach to dealing with comment spam. Well, as long as they don’t bring back blink.

Speaking of the new spam buster, after much thought, I’ve decided not to add support for rel=”nofollow” to my weblogs. I agree with Phil and believe that, if anything, there’s going to be an increase of comment spam, as spammers look to make up whatever pagerank is lost from this effort. And they’re not going to be testing whether this is implemented — why should they?
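For those unfamiliar with the mechanics: implementing it amounts to rewriting every commenter link so that it carries rel=”nofollow”, which tells search bots not to pass rank through the link. A minimal sketch of the sort of filter the weblog tools added (in Python for illustration; each tool does this in its own language):

```python
import re

def add_nofollow(html):
    """Add rel="nofollow" to every <a> tag that doesn't already carry
    a rel attribute -- roughly what the comment-spam patches do."""
    def _patch(match):
        tag = match.group(0)
        if 'rel=' in tag:
            return tag  # leave an existing rel attribute alone
        return tag[:-1].rstrip() + ' rel="nofollow">'
    return re.sub(r'<a\b[^>]*>', _patch, html)
```

A real implementation would apply this only to comment and trackback output, not to the weblog author’s own links.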

But I am particularly disturbed by the conversations at Scoble’s weblog with regard to ‘withholding’ page rank. Here’s a man who for one reason or another has been linked to by many people, and now ranks highly because of it: in Google, Technorati, and other sites. I imagine that among those who link, there were many who disagreed with him at one time or another, but they linked anyway. Why? Because they’re not thinking of Google and ‘juice’ and the withholding or granting of page rank when they write their response. They’re focusing on what Scoble said and how they felt about it, and they’re providing the link and the writing to their readers so that they can form their own opinion. Probably the last thing they’re thinking about is the impact of the link on Scoble’s rank.

Phil hit it right on the head when he talked about nofollow’s impact, but not its impact on the spammers — the impact on us:

But, again, it’s not so much the effects I’m interested in as the effects on us. Will comments wither where the owner shows that he finds you no more trustworthy than a Texas Hold’em purveyor, or will they blossom again without the competition from spammers? Will we do the right thing, and try to find something to link to in a post by someone new who leaves a comment we deem not worthy of a real link, or will new bloggers find it that much harder to gain any traction?

That Phil, he always goes right to the heart within the technology–but blinking, lime green? That’s cruel.

No, no. I don’t know about anyone else, but I’ve spent too much time worrying about Google and pageranks and comment spammers. A few additions to my software, and comment spam hasn’t been much of a problem, not anymore. I spend less than a minute a day cleaning out the spam that’s collected in my moderated queue. It’s become routine, like clearing the lint out of the dryer after I finish drying my clothes.

Of course, if I, and others like me, don’t implement “nofollow” we are, in effect, breaking it. The only way for this to be effective as a spam prevention technique is if everyone uses the modification. I suppose that eventually we could fall into “nofollow” and “no-nofollow” camps, with those of us in the latter added to the new white lists, and every link to our weblogs annotated with “nofollow”, as a form of community pressure.

Maybe obscurity isn’t such a bad thing, though; look what all that page rank power does to people. But I do feel bad for those of you who looked to this as a solution to comment spam. What can I say but…

Categories
Social Media

I, URL

Recovered from the Wayback Machine.

My first exposure to the concept of a ‘federated identity’, or a digital identity or ID if you will, was when I had to obtain one of the first Microsoft Passport identities in order to access the material I needed to finish my book, Developing ASP Components. I was pleased with the concept, then, because it would give me a way to sign into all the Microsoft sites I visited and only have to remember the one username and password.

I was quite fond of MS tech at the time, and focused almost exclusively on this vendor in my writing. However, if you had asked me, then, whether I would input credit card information and use Passport to sign on to eBay or Amazon, I would have looked at you, blankly, waiting for you to finish the joke.

You see when email was created, two days later the first email spam was sent. And when the web was created, two days after that, the first DoS (Denial of Service) happened. Well, two days in a relative sense — almost on the doorstep of any new technology, there will follow the legions of kiddies and cons, waiting to take advantage of any opening and vulnerability. Therefore, when you talk about gathering enormous amounts of extremely vulnerable data into one spot, I would have to assume you’ve just gotten off the boat from your naivete about how secure you can make anything that’s attached to the internet.

Of course, my information is vulnerable anyway, regardless of what I do. My bank provides access to both my social security and debit card information at its site, and my car company also provides access to the same. There’s little I can do about companies choosing to make my data vulnerable, other than to review the security procedures they follow, and be ready to hold them accountable if something happens to my data. Oh, and check my credit report every couple of months to make sure nothing is there that shouldn’t be.

But to voluntarily group sensitive data about myself behind the thin shield of a digital identity? No, not on your life.

However, I also get a little peeved at times about having to sign up at all the various newspapers’ sites to get access to their articles. And I imagine if I did the social network thing, such as LinkedIn and Orkut, I would get vastly annoyed at having to re-input whatever information is appropriate to these venues, not to mention all of the many (three) associations I have. I also wouldn’t mind being able to keep my address in synch at all the places I do business, such as Amazon and B & H Photo. I still wouldn’t allow a site to store my credit card if I can avoid it, but I don’t mind so much my address and other contact information — this is easily obtainable regardless.

As for the other data I’ve been asked for: I don’t really want to share my birth date; why not just ask me if I’m under or over 18? You don’t really need to know what I do for a living and how much I make. I’ll give you my zip code, but why do you need the full address? And no, you can’t have family member names. As for my sexual preferences, buzz off you snoopy little creep.

And under no circumstances would I input a social security number unless it was required by law.

Based on this, the concept of a digital identity for something like ‘single sign-on’, which would allow me to have one identification and password, as well as be able to share information such as my address is an attractive proposition. Especially if I could severely limit how much information I input into these systems–because no matter what they tell me about security, there is no such thing as a secure system connected to the internet.

In addition to these constraints, whatever system I used would not be centralized. If you’ve read me for any length of time at all, you’ll know that I have a very real dislike of centralized systems. Centralized systems are dependent on entities that may not exist someday. Centralized systems provide too tempting a target for Bad People. Centralized systems maintain too much control over what I do, or do not do, on the internet — and this latter is the number one reason I don’t like centralized systems.

As for my experience and exposure to identity systems: as mentioned previously, I used to have a Passport account but this is long gone, and I’ve still managed to avoid having to get a TypeKey account. I once worked directly with Boeing’s data model efforts, so I recognize the impetus behind the Liberty Alliance, and, frankly, consider this effort to be a way of providing R & D perks for key personnel. I’ve worked with Oblix at one of the universities where I consulted, and find it usable in the limited context of its scope–which is identity management within a closed, controlled group. I think Ping Identity is bloated, but then I also think J2EE is bloated. I am fairly newly aware of Sxip, primarily because Marc Canter said that it works ‘just like DNS’. And I have to ask, who would pay $25.00 to register an ‘i-name’ at Identity Commons, without seeing the tech first?

All in all, the main problem I had with most of the systems and/or products and/or organizations is that all but a few seemed to be focused on some grand scheme or another, where the tools they provide would be used by people in Armani suits, on their way to a power lunch with some mover or shaker or another. They represent Big Things by Big People.

They weren’t for the likes of me: people who come in all muddy from a hike, and who sit down at their computer to read an article at the Washington Post, but don’t want to have to register for yet another online newspaper. Or wouldn’t mind not having to re-input their mailing address at another store, and would like to just push a button and have any associations recorded, compared, and matched with others who have joined YASN (Yet Another Social Network).

These people are the only type of people I can wrap my mind around when I think of ‘digital ID’. That’s why when the creator of LID (Light-weight Digital Identity), Johannes Ernst, sent me an email about it, I was intrigued, primarily because to all intents and purposes, this system of digital identity allows one to control one’s own data; to easily extend the system using fairly standard technologies of XML and XPath for queries; and it’s a very simple concept — few bells, small number of whistles.

Cool. But will it work, and how does it compare with other systems?

Installation

I installed and played around with LID at the URL I picked to identify myself with, http://burningbird.net/shelley/. Clicking the link will bring up the help page for the installation, which provides multiple tests you can try.

I installed minimal VCARD and FOAF files, following the instructions. Running an XPath query on the FOAF name returns the correct value, and I can add other XML vocabularies if I want to extend the installation. I also tried the single sign-on against the LID site, and had no problems. Trying to install a single sign-on site myself did result in some Apache errors, but I’ll keep tweaking.
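To give a flavor of the query side: here is a minimal FOAF document and the kind of XPath lookup LID answers, sketched in Python with illustrative values (LID itself does this in Perl; the namespaces follow the FOAF and RDF specs):

```python
import xml.etree.ElementTree as ET

# A tiny FOAF file of the sort a LID installation serves (values illustrative)
FOAF = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                   xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Shelley Powers</foaf:name>
  </foaf:Person>
</rdf:RDF>"""

ns = {"foaf": "http://xmlns.com/foaf/0.1/"}
root = ET.fromstring(FOAF)

# The XPath-style query: fetch the person's name out of the profile
name = root.find("./foaf:Person/foaf:name", ns).text
```

Extending the installation then just means dropping another XML vocabulary into the data directory and querying it the same way.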

The installation wasn’t too complicated. I have the Gnu Privacy Guard (gpg) application installed, and I also have an SSH account so I can access my server from the command line, so I could generate the key, per instructions. I did have to ask Hosting Matters to install the perl module for XPath, as it wasn’t installed. Despite having to handle the aftermath of two separate DDoS attacks, the company installed it within 30 minutes.

Still, not all hosts are as willing as HM to accommodate their clients in this way, and this is a strike against the application: it depends on having gpg and being able to run it at the command line, on a module that’s not common in most installations, and on Perl, a language environment that’s not easy to extend. I can’t help thinking that PHP might be a better solution, as well as more comfortable for people to use.

The install procedure was as follows:

1. Created the sub-directory that serves as my id location. In my case, http://burningbird.net/shelley. This could be your weblog location, but if you’re running a PHP main page, this is not necessarily compatible with LID; I found that my main index.php page wasn’t successful. A better approach is to build something off your main domain directory.

2. Created the .htaccess file, which set the index page access order to index.cgi first, index.html second. I didn’t need to add the line to ensure the CGI file would be executable.

3. Tested the index.cgi file, and then sent email to HM to ask them to install XPath.pm.

4. After XPath is installed, I next created the lid subdirectory, which has a lib.xml file for configuration. I copied the template from LID and modified.

5. I have simple VCARD and FOAF XML files, and copied these as VCARD.xml and FOAF.xml, respectively, to a data subdirectory under the lib subdirectory. I now have the following subdirectories:

/home/shelley/www/shelley
/home/shelley/www/shelley/lib
/home/shelley/www/shelley/lib/data

6. All done.
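For reference, the .htaccess from step 2 needs little more than the following. This is a sketch, not LID’s own template, which may include more; the CGI lines are only needed if your host doesn’t already allow CGI execution in that directory:

```apacheconf
# Try the LID CGI script before the static index page
DirectoryIndex index.cgi index.html

# Only if CGI isn't already enabled here (host-dependent)
Options +ExecCGI
AddHandler cgi-script .cgi
```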

As you can see, aside from the XPath module, it’s about as easy as installing your own weblogging tool. But Perl always adds an extra challenge when adding new modules, as compared to PHP, which could simplify the use of this technology considerably. David Weinberger noted the complexity of the install and asked Johannes Ernst about it in an email. According to the reply, the initial release is aimed at technologists, for exploration. I think a better approach would be to provide something usable by both techs and non-techs, because many of the people interested in digital IDs, such as David, aren’t techs. If they can’t play, they can’t write to promote the concept, and if they can’t write about the concept, it will gain acceptance more slowly.

Still, the tech is flexible, and all these issues could probably be addressed easily; we should be focusing on the concepts. These include the use of a URL as digital ID, as well as how a distributed system would work in comparison to a more centralized system such as Passport, or a closed distributed system like Sxip.

Comparisons

Comparing LID with other systems first requires picking which other systems. Following LID’s creator’s own examples, I focused on Passport, Liberty Alliance, Sxip, and Identity Commons.

Passport is owned and operated by Microsoft, which also controls all the data that’s included within the system. If you’ve thought about leaving a comment at an MSN weblog, you would have been asked for your Passport identification. If you used eBay prior to December 2004, you could also have used your Passport identification for sign-on. Now, though, because of security concerns, most uses of Passport are related to Microsoft content or Microsoft sites.

You don’t have to install Passport, but all data in the system remains in the centralized system, under control of Microsoft. LID, on the other hand, doesn’t store any data about you. In fact, it doesn’t even know you exist — there is no way of tracking a LID user from some root LID site.

External storage and control of our data is a concern that comes up with digital identities–who has access to the data, and what can they do with it. Frankly, in my opinion, though, this concern is overrated.

The primary interest in single sign-on systems, for the user, is that they don’t have to remember usernames and passwords from site to site. Additionally, they also don’t have to answer all the same obligatory questions at each site: address, phone numbers, and so on. Regardless, though, of whether you enter the data in one spot or many, once you make the decision to do business online, in whatever way, you have lost some control of the data…or at least some control of how the data is used.

We have dropped our names, phone numbers, email addresses, home addresses, birth dates, and various other bits of publicly accessible data in more places than we can most likely remember. There is nothing to ‘control’ about this information — we voluntarily dropped the reins of this data long ago.

It’s when we expose very sensitive data that we should be concerned about control of the data, and this primarily because of security. For instance, if we store our credit cards with our digital identities, we then want to make sure that the data is very secure and safe from hacking. This is where a centralized system can be most vulnerable, as it stores many, many such important bits of data and therefore becomes a particularly tasty target for hackers.

However, regardless of the system, there’s a way around this and that is not to store your credit card information online. All sites, unless they’re particularly primitive and ill designed, give you an option not to store your credit card information. Those that don’t, don’t deserve your business.

(And no site should ask for your social security number, unless required by law to report income. Even job search companies should not ask for this information — it’s up to an employer to obtain your SSN information after you’re hired or contracted for a position, not before. )

Consumers, spurred on by security reports, scaled back their initial trust in online systems, and the concept of ‘federated’ identities in an ecommerce setting has consequently lost favor among many consumers (and companies).

For all that Microsoft made Passport easy to use, and therefore suitable for the blue-jeaned muddy hiker, it also doesn’t have the best reputation when it comes to security. So it fails for the blue-jeaned muddy hiker who is paranoid. This lack of trust did impact Passport, whether the company will admit it or not. Microsoft dropped its credit card option from Passport in 2003. In fact, Passport is no longer a viable entity in the global digital identity game, primarily focusing its use on Microsoft’s own sites and those of some partners.

If Passport is basically a non-player now, then what about Liberty Alliance? Well, frankly, Liberty Alliance isn’t for the likes of you and me, regardless of all its talk about federated identities, and discussions about Liberty ID-FF — the specification behind the Alliance’s identity scheme. Case in point: the specification includes a possible user scenario, with Joe Self logging on to an airline that is part of a circle of trust. Once authenticated, in the scenario, Joe is then asked:


Note: You may federate your Airlines, Inc. identity with any other identities you may have with members of our affinity group.

Do you consent to such introductions?

Laughable. I chortled until tears ran down my face. It then continued on from there, with Joe Self being asked to ‘federate his identity’ at various sites within the ‘affinity group’ as he progressed along, just trying to reserve an airline ticket and rent a car, something that can be done in one move, with one click of a button, in today’s travel systems.

Returning to LID, though, I find I can’t compare the two implementations, because it would be like trying to compare an Oraclized PeopleSoft with WordPress. More, where LID represents a service to the user, Liberty Alliance represents a service to Alliance members: no more, no less. In other words, the two implementations are so far apart on the scale that the scale becomes meaningless. Frankly, this is all in LID’s favor, too.

However, both Passport and Liberty Alliance represent large corporations trying to manage every aspect of one’s digital identity. What about smaller efforts, such as Identity Commons?

It would be great to compare LID against Identity Commons if there were anything to compare. What amazes me is that, from what I can find about this entity/effort, you can now register your ‘i-name’, using a URN (Uniform Resource Name), and it will be good for 50 years. All for $25.00. However, if you perchance want to check out the tech first, no such luck: though I searched high and low, I couldn’t find anything.

Regardless, the approach seems to be that you register your own personal identification through a broker, where one assumes you’ll store all the important bits about you that form your online self. Your data is distributed, but still managed by another entity. However, your uniqueness in the system is guaranteed by the fact that there is one overall centralized authority that manages the distribution of the actual identities.

Still, there’s nothing to see, feel, and tweak. In other words, there are a lot of good words and promises, but for all intents and purposes, it’s a pipe dream until it releases something tangible; though it does look like it might be releasing something today, unless this page has been up for months.

Sxip, on the other hand, does have technology you can see, feel, and tweak. In fact, of all the alternatives examined, it seems to be the closest to matching what I would look for in a digital ID. Almost.

Sxip and SXIP

There is Sxip, the company, and SXIP, the protocol. The latter stands for Simple eXtensible Identity Protocol. The company is managed by Dick Hardt, whom I know through my efforts to include ActiveState in that same aforementioned Developing ASP Components book. Since I was rather fond of ActiveState, I was somewhat predisposed to be positive about the Sxip efforts. This disposition was only reinforced when I was able to download what the company calls a “Membersite Developer Kit” to play around with some of the concepts myself.

(In case you’re curious, I downloaded the PHP version.)

How the Sxip system works is that users sign up for an account at a Homesite, and then use this identity at any number of Membersites. They can create multiple personas and associate different pieces of data with each one. Then, when they log into, or “sxip into”, a Membersite for the first time, they’re given an option as to which persona to use with that site. I tried it out with the demonstration materials provided by Sxip, and found it to be a very simple process.
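The Homesite/Membersite persona flow just described can be sketched in a few lines. This is only my own illustrative model of the idea, not Sxip’s actual API; all the class and field names here are invented:

```python
# Hypothetical sketch of the Homesite/Membersite persona flow.
# Names and structures are my own invention, not Sxip's actual API.

class Homesite:
    """Stores a user's personas and releases one to a Membersite on request."""
    def __init__(self):
        self.personas = {}          # persona name -> data released under it

    def add_persona(self, name, data):
        self.personas[name] = data

    def sxip_in(self, persona_name):
        # On a first visit to a Membersite, the user picks which persona
        # (and hence which slice of their data) that site gets to see.
        return self.personas[persona_name]

class Membersite:
    """Remembers which persona data each visiting identity released to it."""
    def __init__(self):
        self.known_users = {}       # identity -> released persona data

    def first_visit(self, identity, homesite, persona_name):
        self.known_users[identity] = homesite.sxip_in(persona_name)
        return self.known_users[identity]

home = Homesite()
home.add_persona("professional", {"name": "K. Martino", "email": "work@example.com"})
home.add_persona("casual", {"name": "Karl"})

site = Membersite()
released = site.first_visit("user-123", home, "casual")
print(released)   # only the 'casual' data reaches this Membersite
```

The point of the persona split is visible in the last line: the Membersite only ever sees the data the user chose to release under that persona.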

Now, how Membersites and Homesites know about each other and can exchange information is through the use of a Rootsite, which basically manages the unique identities, without access to any of the other user data. This is similar to the i-broker in Identity Commons, I believe. Where it might differ is that someone with the development expertise could develop their own Homesite for their own personal use.

It is this latter capability that matches closest to LID’s own user-controlled data technique, though maintaining a personal Homesite looks as if it could be beyond the capability of a non-developer. (Still, it wouldn’t be outside the bounds of the system to offer a plug-and-play Homesite. Would this work contrary to the overall system expectations? It would come at the same cost as a digital certificate, according to the documents at Sxip.)

Still, there are major differences between the LID approach and the SXIP approach. For instance, with LID, the effort would be completely distributed, with no central authority controlling the issuance of identities. It can work this way because it’s based on each person being identified by a URL, and is implemented within the existing domain system as managed by ICANN: within the DNS framework, no two identities can be alike because no two domains are alike.
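The “identity is a URL” idea means comparing two identities reduces to comparing two normalized URLs, with DNS guaranteeing that no two parties control the same domain. A tiny sketch of that, with normalization rules that are my own illustration rather than anything from the LID spec:

```python
# Sketch: with LID, the identity *is* a URL, so identity comparison is
# URL comparison. The normalization rules below are my own illustration,
# not part of the LID specification.
from urllib.parse import urlparse

def normalize(identity_url):
    parts = urlparse(identity_url)
    host = parts.netloc.lower()             # domain names are case-insensitive
    path = parts.path.rstrip("/") or "/"    # treat a trailing slash as equivalent
    return (parts.scheme.lower(), host, path)

def same_identity(a, b):
    return normalize(a) == normalize(b)

print(same_identity("http://burningbird.net/shelley/",
                    "HTTP://BurningBird.net/shelley"))   # True: same URL, normalized
print(same_identity("http://burningbird.net/shelley",
                    "http://yasd.com/shelley"))          # False: different domains
```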

Marc Canter and Sxip both say that SXIP works like the DNS, but it doesn’t really, other than that there is one central authority preventing duplication of names and resolving where the data associated with those names resides.

For instance, in DNS, when a person accesses a domain and their ISP’s nameserver does not recognize it, the ISP’s nameserver checks with a higher-level root nameserver to find the location of the domain’s nameserver. It is this that provides the unique name-to-IP-address mapping. The ISP’s nameserver then gets the IP address and stores both it and the domain within its own cache before responding to the request. The next time someone using the same ISP accesses the domain, the ISP already has the information.

This caching serves a couple of purposes within the DNS. For one, it makes new requests of a domain that much quicker. For another, it helps to disperse the information about the domain across many different nameservers, so that if there is something wrong with one, the system can usually route around the damage and find the IP-to-domain-name mapping in another.
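The resolution-and-caching flow just described can be sketched in miniature. This is a deliberately toy model: the “root” here is a single dictionary, where real DNS involves a hierarchy of delegations, TTLs, and redundant servers.

```python
# Simplified sketch of DNS resolution with caching at an ISP nameserver.
# The "root" is just a dictionary standing in for the real hierarchy.

ROOT = {"example.com": "203.0.113.7"}   # authoritative name -> IP mapping

class IspNameserver:
    def __init__(self):
        self.cache = {}
        self.root_queries = 0           # count trips up the hierarchy

    def resolve(self, domain):
        if domain in self.cache:        # answered locally: fast
            return self.cache[domain]
        self.root_queries += 1          # cache miss: ask the root
        ip = ROOT[domain]
        self.cache[domain] = ip         # remember for the next customer
        return ip

ns = IspNameserver()
print(ns.resolve("example.com"))   # first lookup goes up to the root
print(ns.resolve("example.com"))   # second is served from the local cache
print(ns.root_queries)             # 1
```

The second lookup never leaves the ISP, which is both the speed win and the dispersal of information the paragraph above describes.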

There isn’t anything like this in SXIP. What it does, instead, is store a cookie with information about the user’s Homesite on the computer being used, or provide a place to fill this information in if the cookie is gone or the person is using a shared machine. This can work rather well, and about the only dependency that now exists is authenticating the uniqueness of the identity at the Rootsite when a new user, Membersite, or Homesite is created. Except…

Except for the Homesite going down, or if it no longer exists.
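The cookie-based Homesite discovery described above amounts to a simple lookup-with-fallback. A sketch, with the cookie name and shape being my own invention rather than SXIP’s actual wire format:

```python
# Sketch of cookie-based Homesite discovery: check for a cookie naming the
# user's Homesite, fall back to asking the user (new machine, cleared
# cookies, shared computer). Cookie name and structure are my invention.

def find_homesite(cookies, ask_user):
    homesite = cookies.get("sxip_homesite")
    if homesite is None:
        homesite = ask_user()                 # e.g. a form on the login page
        cookies["sxip_homesite"] = homesite   # remember for next time
    return homesite

cookies = {}
first = find_homesite(cookies, lambda: "https://homesite.example.net")
second = find_homesite(cookies, lambda: "never-asked")   # cookie hit
print(first, second)
```

Note that if the Homesite named in that cookie is down or gone, this lookup succeeds but the login still fails, which is exactly the fragility discussed here.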

The real strength of the DNS system is that information about a domain is cached all throughout the system, and the only time a problem will occur is if something has happened to the person’s own nameserver, but even then, they have a backup. If you’ve ever registered a site, then you know about providing two different nameservers, and that these are usually at two different locations. The whole concept is based on redundancy.

There is no redundancy in SXIP. I looked, because I thought I had read that you can store your information at more than one Homesite, but from the developer documentation, it would seem there is an assumption that there is one, and only one, Homesite. If true, then if your Homesite is down, you can’t log into a new Membersite, though you should be able to still access previously visited Membersites. And if your Homesite is blown away, then you’ll have to start over again with a new one.

LID doesn’t have this problem because you control the data. It’s true, if your site is down, you can’t be authenticated, but at least you know that it won’t go away without your consent. That’s the advantage of basing the system on something managed directly by the user, built on DNS, rather than on an external party with DNS-like properties.

However, using a URL, as David Weinberger pointed out, has its disadvantages. A few years back, if you had asked me what URL I would use, I would have picked something related to a long-time domain, yasd.com. However, this was before that domain was so overrun with email spam that it was no longer of any use. Now my test identifier is based on burningbird.net. Who knows what it will be in three years?

A better approach might be to use something such as purl.org, which can provide permanent URIs that are then redirected to a specific domain, but which themselves never change. However, this still puts some dependency on an external organization, and I hesitate to do this more than absolutely necessary. And frankly, I’m not sure this would work within the LID system.

Another approach could be a broadcast method, whereby every time you log in with your particular identity at a specific site, the local system maintains a link to the site. Then if you ever move to a different URL, you formally document the move within the system, which then visits each site where you’ve been and issues a request–a verified request–that each modifies its data to reflect the new ‘you’. I wonder if this might be the technique Ernst discussed with David for handling this problem. Hard to say; it’s second-hand information.
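This hypothetical broadcast move can be sketched as follows. Everything here is my own speculative illustration of the idea; in particular, the verification step is stubbed out where a real system would need cryptographic proof that the requester controls both URLs:

```python
# Speculative sketch of the "broadcast" identity move described above.
# Verification is stubbed out; a real system would need cryptographic
# proof of control over both the old and the new URL.

class Membersite:
    def __init__(self):
        self.accounts = set()       # identity URLs known to this site

    def sign_in(self, identity_url):
        self.accounts.add(identity_url)

    def handle_move(self, old_url, new_url, verified):
        if verified and old_url in self.accounts:
            self.accounts.remove(old_url)
            self.accounts.add(new_url)

# The user has signed into three sites under the old identity URL.
visited = [Membersite() for _ in range(3)]
for site in visited:
    site.sign_in("http://yasd.com/shelley")

# Announce the verified move to every site previously visited.
for site in visited:
    site.handle_move("http://yasd.com/shelley",
                     "http://burningbird.net/shelley",
                     verified=True)

print(all("http://burningbird.net/shelley" in s.accounts for s in visited))  # True
```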

And so…

LID provides a great deal of functionality in a tiny little package. It supports pseudonyms (personas), secure authentication, single sign-on, and data exchange, all using standard, accessible technologies. More, it’s not dependent on any single centralized authority, other than the DNS itself. But then, we’re all rather dependent on that. As for how well LID works with Kim Cameron’s Laws of Digital Identity, I never touch “Laws” as defined by a person or persons with a vested interest, regardless of how good they sound; so Johannes Ernst’s writing will have to stand in answer to this.

As I said earlier, though, LID needs work. Rather than manually editing the lid.xml file, I would recommend that a user interface create the file entries, and then encrypt the important and sensitive bits. I also think it needs to be plug-and-play for non-techs, as well as provide an efficient, and secure, way to change one’s ‘identity’ (URL). A more formal API would only help, as would more extensive documentation. It needs to be extended to other languages, not just Perl.

I would also like to see the source and concepts released as open source, preferably under a BSD or GPL license, and wouldn’t mind hearing more about the business model. And of course, testing. I’d like to see lots and lots of testing.

Still, the root concepts of LID are good. Building a unique identification system on one already in place is a good idea; and for those people who don’t have domains, a trust broker could provide one for a small fee–as long as there is a way to export the LID information in a format that lets the person back up their account after a change, and move their account to another broker.

I also like the extensibility of the system, and have already tried out various tiny bits of other XML documents I have. As for the use in social networks, LID already provides integration with FOAF, probably the most open of all the ‘who knows who’ specifications.

I like Sxip, and if it weren’t for all the centralized pieces, I would like it even more. It’s definitely not an Armani suit type of system, but I still see the vague shadows of a briefcase in its architecture. I think LID is the type of digital identity system I can wrap around the type of person I am, rather than the power luncher discussed earlier. Ensure the open source nature of the concept and the code, expand both code and documentation, test, test, test, and I’d like it enough to even use it.

(If you’re interested in digital IDs and want to try this functionality and can’t install it locally, holler and I’ll create you a temporary digital ID in my sandbox in which you can try this out. If you have a VCARD.xml or FOAF.xml file, send those along with your request.)