Categories
Technology

The clean industry

Doc Searls brings up a conversation that started when Mary Hodder wrote a post about having to use her Yahoo identity to log into her Flickr account. The tale is rather long and involved, but it seems that the cookie that maintained her Flickr identity was reset and she was given an opportunity to log in with either her Yahoo account or her Flickr one, but once she used her Yahoo account, she would have to use it from then on.

The login ID doesn’t impact what shows for her online ID in either place, and I gather the cookie reset affected only a subset of accounts and was an error, not deliberate. I know that I haven’t had to log in to Flickr again and have been able to keep using my Flickr login. Even if I couldn’t, I wouldn’t be averse to using my Yahoo email: none of it shows in my account, and I’m currently using my Google email account anyway.

But Doc uses this as a springboard into a criticism of the current systems of identity management, which are splintered here and there and require one to fill in data such as occupation. This allows him to bring up his treasured technical gem: a single identity for each of us that allows us to control whatever data is given to each company, rather than having to re-input data. This forms the basis of the cover story for Linux Journal, where he gives a very good summary of his involvement in the digital identity business; the Gang of Identity, an inner corps of people who are all things ID; and the various identity schemes, with the concerns about and benefits of each. It really is an excellent synopsis of the digital identity movement.

In this article, there is a great deal of discussion about Microsoft’s Identity Metasystem effort, as led by Kim Cameron. Doc has become friends with Cameron and an enthusiastic proponent of his work and his philosophy. The only concern he has is the open source licensing of the technology:

I’ve told Kim that he and Microsoft need to do more before my constituency-the Linux and Open Source development communities-takes a serious interest in the Identity Metasystem. I said, “If you don’t have an open-source license or if you start talking about IP Frameworks, my readers will leave the room.” The term IP Frameworks was used by somebody from another part of Microsoft, in respect to the WS-* standards process.

I respect Doc’s enthusiasm and have always been rather awed at his loyalty, but I think that more is at stake than an open source license for some of the technology.

As I mentioned, I have had no problems logging into Flickr but if the group wanted me to switch to my Yahoo account, it would not bother me; this is all used for public interaction anyway. I never trust anything secure or sensitive to centrally located services.

What I was more concerned about was Yahoo helping the Chinese government identify a Yahoo user, leading to his *arrest and imprisonment for ten years (Rebecca MacKinnon has been covering this the most). I was especially concerned because it seems to me that my industry, the tech industry (or computer, or web), has been ‘re-defining’ its behavior lately; a re-definition that takes it from the noble principles highlighted and painted on walls (“Do no evil”) to an adherence to the bottom line that would gladden any Wall Street investor’s heart.

Industries once known as the ‘clean industries’ (because of their lack of negative effect on the environment, and seemingly positive social impact) are changing the way they do business–a change that is not based in altruism. According to the article I just linked:

Rather than using their clout to help push the boundaries of free speech and information in the one-party state, critics say companies like Google, Yahoo and Microsoft are at best turning a blind eye to the machinations of the cyber police.

“It’s too early to say that just by doing business in China and developing the internet in China they will foster democracy and human rights,” said Julien Pain, of media watchdog Reporters Without Borders.

“It doesn’t work that way.”

Indeed, the group says there is evidence the opposite is happening, with the major web players accused in the past of pre-empting the government by routinely blocking discussions on sensitive subjects from the 1989 democracy movement to the spiritual group Falun Gong.

In fact, no tech company doing business with China can escape its complicity in helping to suppress the population of that country.

From a local perspective, meaning what does this have to do with me personally, if these companies would willingly help China censor information and willingly provide information that actually leads to the jailing of a reporter, what would they do in countries supposedly free that have passed, in panic, potentially intrusive laws based on fear of terrorism?

Countries such as, say, the United States? Countries that jail prisoners for indefinite periods without due process of law, or demand library records and investigate people just for checking out certain books?

I have no doubt that if something such as the Identity Metasystem comes into existence, the US government would be at the doors of the companies involved, demanding a digital backdoor so that it can view a person’s activities; any person’s activities. The target would be too tempting–all that information about a person stored in one place, managed by one system. Hackers would have to earn their way in, but governments would be given the key.

Even if we discount our concerns about the government, dismiss them as paranoia, I am less than sanguine that any system would give users any control beyond that which the companies themselves deem beneficial for their own purposes. Companies do not act from altruism. Their actions may not necessarily be evil at heart, but they aren’t ‘good’ either.

I have different identities at many different companies I do business with, and it’s rarely a hardship to remember each. Most are based on one of three email addresses, or use a variation of three different user names (depending on how soon I’m able to register for a username, and how popular the service is). I record my passwords in a little book, hidden away in my room, which would require the government or other entity to physically enter my home to access them–something I would hear about from my neighbor behind me, and probably the one three doors down, if they tried to do so secretly. There are many levels of breakage between my identity and the government, as well as my identity and hackers, and especially my identity and corporations–and I want to keep this breakage!

Perhaps it’s because I live in St. Louis, the netherworld of technology, which the hip and A list consider to be the ‘Out Back and Beyond’, but I’ve not seen demand for any form of Identity Metasystem, at least not at a personal level. Seems to me that most people get by just fine with this somewhat fragmented environment. Not only that, but from actions I’ve seen here in Missouri, most folk–left, right, or the really strange folk in the Ozarks–would be appalled at the concept.

When there was discussion about a federal identity system, and our own state driver’s license system was considered not in compliance with Patriot Act rules, both conservatives and liberals–and in Missouri these terms really mean something–joined together to deplore the concept. Tell them you want to do the equivalent to all their online interactions, and you’ll see what happens when Missourians really get riled. Let’s just say that the West Coasters promoting this idea would be nothing more than soft, squishy, expensively dressed obstacles easily overcome in the move to trample this idea into the dirt.

I support the concept of identity research, because digital identity is not the same thing as university identity, or federated identity, or even Identity Metasystems. Companies here are interested in security, of course. We have Boeing, we have Citibank–not to mention food and pharmacy research firms. But they’ve jumped beyond the digital divide to biometrics–yes, the bionic finger. As for me personally, I wouldn’t mind eventually incorporating something such as LID into my weblogging tool, to enable people to edit their comments without being dependent on IP address. I also wouldn’t mind a good identity system that I could use for a set of similar services, such as specific social services or group membership, or for the online newspapers I subscribe to. Especially as regards the latter, these are the ones that are hard for me to remember, but I already have one identity that I can use for six of them because they’re all part of the same shared system. I don’t care who knows what I read when it comes to newspapers, but I do care about connecting this up with my financial actions, my travel, what I access at the library, my medical interactions, not to mention other services.

In other words, good identity systems within shared components of my online interactions, but not one overall system to bind them together. Too Lord of the Rings for me.

Doc wrote:

We won’t get it if we get bogged down in long-winded digressions about privacy and crypto and the big awful companies that want to keep their hands-oops, credit and membership cards-in our pockets. Those are legitimate and necessary concerns, but they are secondary to the purpose of establishing methods and protocols and technologies for the assertion of Independent Identity. And for changing the world by saving markets from the producerist mentality that has kept everybody, producers included, in darkness for more than a century.

I also feel certain that forces far more nefarious than Microsoft are hell-bent on putting the Net genie back in the telco and cableco bottles-and turning it into the distribution system for “protected content” they imagined when they made sure the “information superhighway” had asymmetrical driveways to every “consumer’s” home.

Yes, I can agree with Doc that each of us has a unique digital identity, and I can agree that work on stronger and more reliable protocols is a goodness. But that’s different than work on an overall and encompassing vendor inspired (and vendor benefiting) Identity Metasystem, and the concerns we bring up now are legitimate ones, and not secondary. I fail to see what all this -co talk has to do with anything, other than trying to replace one boogeyman with another, unless Doc’s referring to Google joining the global wireless game, in which case I do share his concerns. But these are concerns in addition to, not instead of, those having to do with an Identity Metasystem.

To willingly place my entire digital ‘fingerprint’ in the hands of companies such as Microsoft, Google, and Yahoo? No. None of these supposedly ‘clean’ companies have lately shown me that any of them are worthy of such trust. These may seem secondary concerns to Doc, but if you ask Shi Tao in China what he thinks, I think he might urge Doc to reconsider his priorities.

*Hopefully Shi Tao won’t die in prison, or have his skin harvested for use in cosmetics, or be required to do forced labor: something to think on next time you buy that Chinese manufactured item, such as your iPod.

Categories
RDF Semantics Web

Semantic web lite: same great taste, less reified

Most of the time the feeds at Planet RDF reference isolated items with general interest. Other times, though, the thoughts featured strike sparks against each other, leading to a chain reaction whereby everyone jumps in and Things Happen.

Starting a few days ago, people have been referencing two stories, both of which I find very interesting. The first is Kendall Clark’s SPARQL: Web 2.0 Meet the Semantic Web; the second is Ian Davis’s Internet Alchemy Crises.

Kendall argues that what’s missing in Web 2.0 is a common query language, and it just so happens that SPARQL is a common query language, backed up by a common data model (RDF) and syntax (RDF/XML). He suggests that the Web 2.0 folks provide an RDF wrapper to their data, so that both groups can benefit from the same query language, which will make things a whole lot simpler:

So what, really, can SPARQL do for Web 2.0? Imagine having one query language, and one client, which lets you arbitrarily slice the data of Flickr, delicious, Google, and yr three other favorite Web 2.0 sites, all FOAF files, all of the RSS 1.0 feeds (and, eventually, I suspect, all Atom 1.0 feeds), plus MusicBrainz, etc.

And this leads us to Ian Davis and a cognitive crisis he underwent at DC2005 (DC as in Dublin Core), as relates to a pissy-ant, pick-a-une problem of dc:creator:

Danbri referred us to work he had done after the last DC meeting in 2004 on a SPARQL query to convert between the two forms. Discussion then moved onto special case processing for particular properties, along the lines of “if you see a dc:creator property with a literal value then you should insert a blank node and hang the literal off of that”. Note that I’m paraphrasing, no-one actually said this but it was the intent.

That’s when my crisis struck. I was sitting at the world’s foremost metadata conference in a room full of people who cared deeply about the quality of metadata and we were discussing scraping data from descriptions! Scraping metadata from Dublin Core! I had to go check the dictionary entry for oxymoron just in case that sentence was there! If professional cataloguers are having these kinds of problems with RDF then we are f…
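The special-case rule Ian paraphrases (replace a literal dc:creator value with a blank node and hang the literal off of that) is easy enough to sketch. What follows is only an illustration under my own assumptions: triples are plain Python tuples, the `Literal` marker class and the `_:creatorN` labels are invented for the example, and hanging the name off `foaf:name` is my guess at a reasonable property, not something anyone at the conference specified.

```python
import itertools

DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"  # assumed target property


class Literal(str):
    """Marks a plain string value, as opposed to a URI or blank node label."""


def normalize_creators(triples):
    """Rewrite literal-valued dc:creator statements into the blank node form."""
    counter = itertools.count(1)
    normalized = []
    for subject, predicate, obj in triples:
        if predicate == DC_CREATOR and isinstance(obj, Literal):
            bnode = f"_:creator{next(counter)}"  # hypothetical label scheme
            normalized.append((subject, predicate, bnode))
            normalized.append((bnode, FOAF_NAME, obj))
        else:
            # Anything else passes through untouched.
            normalized.append((subject, predicate, obj))
    return normalized
```

That it takes a dozen lines is not the problem; that every consumer of the data would need the same dozen lines is.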

Ian then recommended paring down RDF into an implementation subset, one that focuses primarily on RDF as it is used to define relationships. This means jettisoning some of the more cumbersome elements of the model: those that tend to send traditional XMLers screaming from the room:

What if we jilted the ugly sisters of rdf:Bag, rdf:Alt and rdf:Alt and took reification out back and shot it? How many tears would be shed?

What if we junked classes, domains and ranges? Would anyone notice? The key concept in RDF is the relationship, the property.

The end result would be an RDF-Lite: a proper subset of RDF that can be upwardly compatible with the model as a whole, though the converse would not be true. If this subset were formalized, then libraries could be created just for it that would be significantly less complex, and correspondingly leaner, than the libraries needed for the full featured RDF.
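To give a feel for how lean such a library could be, here is a minimal sketch of a relationship-only store. It is not Ian's proposal and not any existing library; the `TinyGraph` class and its methods are names I made up, and the only feature is subject-predicate-object statements with wildcard matching.

```python
class TinyGraph:
    """A bare-bones triple store: no containers, no reification, no classes."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        """Record one statement; duplicates collapse because this is a set."""
        self.triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return matching triples in sorted order; None acts as a wildcard."""
        return sorted(
            t for t in self.triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )
```

Everything written this way is still ordinary RDF as far as a full library is concerned, which is the upward compatibility; a document that leans on containers or reification would not survive the trip in the other direction.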

This, then, leads back to Kendall’s interest in seeing if Web 2.0 couldn’t be wrapped, morphed, or bridged on to RDF and thus allow us to assume one specific data model and, more importantly, one specific query language for use with all metadata easily and openly available on the web–not just the RDF bits. If a simple subset of RDF could be derived, it could be trivial to map any use of metadata into RDF. More importantly, since the capabilities of the technology are never the issue, those generating the disparate bits of XML or other metadata might actually be willing to go this extra step.

True, an RDF-Lite would not have the same inferential power as the fully aspected RDF model, but frankly, most of our general web-based uses of RDF aren’t using this power anyway. And if we can make RDF tastier to the general web developer, we’re that much closer to an RDFalized web. To Kendall, an RDFalized Web 2.0 could be a powerful thing:

How powerful? Well, imagine being able to ask Flickr whether there is a picture that matches some arbitrary set of constraints (say: size, title, date, and tag); if so, then asking delicious whether it has any URLs with the same tag and some other tag yr interested in; finally, turning the results of those two distributed queries (against totally uncoordinated datasets) into an RSS 1.0 feed. And let’s say you could do that with two if-statements in Python and three SPARQL queries.

Pretty damn cool.

Well, not necessarily. What Kendall describes is something already relatively easy to access through Web services. And, as we’re finding, how tags are used with Flickr differs rather dramatically from how tags are used within delicious, and so on. I do agree that being able to do something like all of this with a couple of statements and SPARQL queries would be nifty; but the technology is still going to be limited by the need for a common understanding of the data being manipulated. Even with something as simple as tags, we have different understandings of what the term means across different applications.
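The mechanics of the join Kendall imagines are not the hard part, as a rough sketch shows. Everything here is made up for illustration: the two lists stand in for data that has already been pulled from each service and flattened into triples, and the identifiers, the `tag` property name, and both helper functions are hypothetical. No real Flickr or delicious API is being called.

```python
# Stand-in data, as if each service had already been wrapped as triples.
FLICKR = [
    ("flickr:photo/1", "tag", "stlouis"),
    ("flickr:photo/1", "title", "Arch at dusk"),
    ("flickr:photo/2", "tag", "rdf"),
]
DELICIOUS = [
    ("http://example.org/a", "tag", "stlouis"),
    ("http://example.org/a", "tag", "travel"),
    ("http://example.org/b", "tag", "stlouis"),
]


def subjects_with(triples, predicate, value):
    """Return every subject that carries the given predicate/value pair."""
    return {s for s, p, o in triples if p == predicate and o == value}


def shared_tag_join(photos, bookmarks, tag, other_tag):
    """Pair photos tagged `tag` with bookmarks tagged both `tag` and `other_tag`."""
    matching_photos = subjects_with(photos, "tag", tag)
    matching_links = (
        subjects_with(bookmarks, "tag", tag)
        & subjects_with(bookmarks, "tag", other_tag)
    )
    return sorted(
        (photo, link) for photo in matching_photos for link in matching_links
    )
```

The join itself is a string comparison. Nothing in it can tell whether ‘stlouis’ on a photo and ‘stlouis’ on a bookmark mean the same thing to the people who applied them, and that is the limitation no query language removes.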

I don’t necessarily agree across the board with Ian, either. For instance, you can take my blank nodes (bnodes to use popular terminology) only if you pry them from my cold dead APIs, but his general points are good. My own recent work has been focusing more on using RDF for its ability to map the relationships, and less on its participation in grander semantic schemes (though the data is available for any person/bot interested in such).

More, I’ve been exploring the capabilities of using RDF as a lightweight, portable, self-contained database–one to a unit, with unit being weblog page. I’ve been steadily pulling bits of metadata out of MySQL and embedding them into an RDF document, which then drives some of this site’s functionality.

There is a line between taking advantage of MySQL’s caching and managing my own with RDF, but I’m finding that not only is a hybrid solution quite workable, it is also a very effective solution for data that is meant to be open, unrestricted, and consumed by many agents.

The best aspect of all is that because of two specific aspects of RDF–ease of capturing a relationship, and the use of a URI to map the relationships correctly–it’s trivial for me to just ‘throw’ more metadata into the pot, and not have to worry about modifying existing tables in my database, or re-arranging a hierarchy and running into a possible namespace collision in a straight XML document. I’m also not constrained by being dependent purely on primitive keyword-value pairs, a limitation that makes it difficult for me to make multiple statements about the same noun-object pairs.
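A small sketch of what ‘throwing more metadata into the pot’ looks like in practice. This is Python for the sake of a short example, not the PHP my tool actually uses, and the page URI, the `EX` namespace, and the `relatedTo` property are all invented for illustration. Only the Dublin Core namespace is real.

```python
from collections import defaultdict

DC = "http://purl.org/dc/elements/1.1/"
EX = "http://example.org/terms/"  # hypothetical namespace for site-specific properties
PAGE = "http://example.org/weblog/post/42"  # hypothetical page URI


def build_index(triples):
    """Group triples as subject -> predicate -> list of values."""
    index = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        index[subject][predicate].append(obj)
    return index


triples = [
    (PAGE, DC + "title", "The clean industry"),
    (PAGE, DC + "subject", "Technology"),
]

# Later additions: a second value for an existing property and a brand-new
# property. Neither requires altering a table or reshuffling an XML hierarchy.
triples.append((PAGE, DC + "subject", "Identity"))
triples.append((PAGE, EX + "relatedTo", "http://example.org/weblog/post/7"))

index = build_index(triples)
```

Because each property is a full URI, a new one from another vocabulary can’t collide with what is already there, and a property can carry as many values as I care to state.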

It is all becoming very, very fun, and I am busy ripping the guts out of my current weblog tool implementation in order to incorporate the hybrid data store.

All of this effort, though, presupposes one thing: that I have a small subset of classes to manage the RDF bits, and to meet this, I experimented around with RAP (a PHP RDF library) until I had a trimmed, core set of functionality that, by happenstance, would meet Ian’s criteria for RDF-Lite. There isn’t a SPARQL implementation yet, but I know that this is on the way, and when released, I will use it to replace my use of the existing RDQL implementation.

Categories
XHTML/HTML

Repeating

Dare Obasanjo writes:

Repeat after me, a web page is not an API or a platform.

Versioning APIs is hard enough, let alone trying to figure out how to version an HTML website so screen scrapers are not broken. Web 2.0 isn’t about screenscraping. Turning the Web into an online platform isn’t about legitimizing bad practices from the early days of the Web. Screen scraping needs to die a horrible death. Web APIs and Web feeds are the way of the future.

Consider it repeated. Just because people are using XHTML for their pages doesn’t mean that they’re following any specific data model. XHTML is meant to be both open and loose. As for screen scraping: ew, ew, ew.
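For anyone who needs the ew demonstrated, here is a rough sketch of why a page makes a poor API. The two page snippets and the feed are invented; the scraper is a deliberately typical regular expression, and the feed reader uses Python’s standard `xml.etree.ElementTree`.

```python
import re
import xml.etree.ElementTree as ET

# The same post before and after a hypothetical site redesign.
PAGE_V1 = '<div class="post"><h2>Repeating</h2></div>'
PAGE_V2 = '<article class="entry"><h1 class="title">Repeating</h1></article>'

# A feed for the same site, which the redesign never touches.
FEED = """<rss version="2.0"><channel><title>Example weblog</title>
<item><title>Repeating</title><link>http://example.org/repeating</link></item>
</channel></rss>"""


def scrape_titles(html):
    """Screen scraping: depends on the exact markup the designer chose."""
    return re.findall(r'<div class="post"><h2>(.*?)</h2>', html)


def feed_titles(xml_text):
    """Feed reading: depends only on the published feed structure."""
    root = ET.fromstring(xml_text)
    return [item.findtext("title") for item in root.iter("item")]
```

The scraper finds the title in the first version of the page and nothing at all in the second, and nobody told it the page was going to change. The feed reader never notices the redesign.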

Categories
Internet

Speaking of connection

I had forgotten to apply the MAC filter and, once I had done so, the machine called ‘Chris’ was history. I own the air, the bandwidth is mine. All mine.

Speaking of connectivity, I was rather amazed to see that Google has hired Vint Cerf. If you’re not familiar with the name, he is literally the father of the internet: the co-inventor of the TCP/IP protocol on which all of this web stuff lives.

Lots of flapping gums about how good Cerf is for the company, but I don’t see any symbiosis between Cerf and Google. Well, other than I think that Google is ‘hiring’ (the use of ‘buying’ sounds so crude) its way into a symbolic link with the internet: the internet is Google, Google is the internet. Especially since they hired Cerf as an ‘evangelist’–such an overused, and abused term–and he has projects of his own; including management of ICANN.

One thing I will say about this match is that Cerf is a closer: a man who wants to see things accomplished. Perhaps with his influence, Google will actually release some of its software from beta.

You think?

Categories
Internet

Connectivity

I’ve been fighting a real battle the last two weeks to get and hold a stable internet connection. I thought at first it was the cable connection, but that doesn’t seem to be the problem.

With university starting and all new neighbors, I think that my wide open wireless router has been getting several customers, so this week I’ve secured the wireless connection, filtering access to only the MAC addresses of the three computers in the household. Still, my router has been dropping the wireless connection about every one to two hours.

I’ve updated the firmware, and this afternoon the connection seems to actually be holding. I’m also running a ping to the router, in hopes that this will help maintain the connection. But there is one entry in the DHCP table for the router that doesn’t make sense–to a computer named Chris.
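Spotting a stranger like this amounts to comparing the router’s DHCP table against the list of machines that should be there. A minimal sketch, assuming the table has been copied out by hand as name and MAC address pairs; the addresses and the `unknown_clients` function are placeholders, not anything read from my router.

```python
# Placeholder addresses standing in for the three household computers.
ALLOWED_MACS = {
    "00:11:22:33:44:55",
    "00:11:22:33:44:56",
    "00:11:22:33:44:57",
}


def unknown_clients(leases, allowed):
    """Return the names of DHCP clients whose MAC is not on the allowed list."""
    allowed_norm = {mac.lower() for mac in allowed}
    return [
        name
        for name, mac in leases
        # Normalize case and separator so AA-BB-... and aa:bb:... compare equal.
        if mac.lower().replace("-", ":") not in allowed_norm
    ]
```

The list only helps if the filter is actually switched on at the router, of course.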