Wolfram Alpha

Sheila Lennon asked my opinion on Nova Spivack’s recent writing about Wolfram Alpha, and posted my response, as well as other notes. Wolfram Alpha is the latest brainchild of Mathematica creator Stephen Wolfram, and is a stealth project to create a computational knowledge engine. To repeat my response:

First of all, it’s not a new form of Google. Google doesn’t answer questions. Google collects information on the web and uses search algorithms to provide the best resources given specific search criteria.

Secondly, I used Mathematica years ago. It’s a great tool. And I imagine that WolframAlpha will provide interesting answers for specific, directed questions, such as “what is the nearest star” and the like. But these are the simplest of all queries, so I’m not altogether impressed.

Think of a question: who originated the concept of “a room of one’s own”? Chances are the Alpha system will return the writing where the term originated, Virginia Woolf’s “A Room of One’s Own”, and the author, Virginia Woolf. At least, it will if the data has been input.

But one can search on the phrase “A room of one’s own” and get the Wikipedia entry on the same. So in a way, WolframAlpha is more of a Wikipedia killer than a Google killer.

Regardless, when you look via Google, you get a link to Wikipedia, but you also get links to places where you can purchase the book, links to essays about the original writing, and so on. You don’t get just a specific answer, you also get context for the answer.

To me, that’s power. If I wanted answers to directed questions, I could have stayed with the Britannica years ago.

Nova Spivack’s writing on the Alpha is way too fannish. And too dismissive of Google, not to mention the human capacity for finding the exact right answer on our own given the necessary resources.

Again, though, all we have is hearsay. We need to try the tool out for ourselves. But other than helping lazy school kids, I’m not sure how overly useful it will be. If it’s free, yeah. If it’s not, it will be nothing more than a novelty.

I also beg to differ with Nova, when he states that Wolfram Alpha is like plugging into a vast electronic brain. Wolfram Alpha isn’t brain-like at all.

The human brain is amazing in its ability to take bits and pieces of data and derive new knowledge. We are capable of learning and extending, but we’re really shite, to use the more delicate English variation of the term, when it comes to storing large amounts of data in an easily accessible form.

Large, persistent data storage with easy access is where computers excel. You can store vast amounts of data in a computer, and access it relatively easily using any number of techniques. You can even use natural language processing to query for the data.

Google uses bulk to store information, with farms of data servers. When you search for a term, you typically get hundreds of responses, sorted by algorithms that determine the timeliness of the data, as well as its relevancy. Sometimes the searches work; sometimes, as Sheila found when querying Google for directions for cooking brown rice in a crockpot, the search results are less than optimum.

Wolfram Alpha seems to take another approach, using experts to input information, which is then computationally queried to find the best possible answer. Supposedly if Sheila asked the same question of Wolfram Alpha, it would return one answer, a definitive answer about how to cook brown rice in a crockpot.

Regardless, neither approach is equivalent to how a human mind works. One can see this simply and easily by asking those around us, “How do I cook brown rice in a crockpot?” Most people won’t have a clue. Even those who have cooked rice in a crockpot won’t be able to give a definitive answer, as they won’t remember all the details—all the ingredients, the exact measurements, and the time. We are not made for perfect recall. Nor are we equipped to be knowledge banks.

What we are good at is trying out variations of ingredients and techniques in order to derive the proper approach to cooking rice in a crockpot. In addition, we’re also good at spotting potential problems in recipes we do find, and able to improve on them.

So, no, Wolfram Alpha will not be like plugging into some vast electronic brain. And we won’t know how well it will do against other data systems until we all have a chance to try the application, ourselves. It most likely will excel at providing definitive answers to directed questions. I’m not sure, though, that such precision is in our best interests.

I also Googled for a brown rice crockpot recipe, using the search term, “brown rice crockpot”. The first result was for RecipeZaar, which lists out several recipes related to crockpots and brown rice. There was no recipe for cooking just plain brown rice in a crockpot among the results, but there was a wonderful sounding recipe for Brown Rice Pudding with Coconut Milk, and another for Crocked Brown Rice on a Budget that sounded good, and economical. I returned to the Google results, and the second entry did provide instructions on how to cook brown rice in a crockpot. Whether it’s the definitive answer or not, only time and experimentation will tell.

So, no, Google doesn’t always provide a definitive answer to our questions. If it did, though, it really wouldn’t be much more useful than Wikipedia, or our old friend, the Encyclopedia Britannica. What it and other search engines provide is a wealth of resources for most queries that not only typically provide answers to the questions we’re asking, but also provide any number of other resources, and chances for discovery.

This, to me, is where the biggest difference will exist between our existing search engines and Wolfram Alpha: Alpha will return direct answers, while Google and other search engines return resources from which we can not only derive answers but also make new discoveries. As such, Alpha could be a useful tool, but I’m frankly skeptical whether it will become as important as Google or other search engines, as Nova claims. I don’t know about you all, but I get as much from the process of discovery as I do from the result.

Nova released a second article on Wolfram Alpha, calling it an answer engine, as compared to a search engine. In fairness, Nova didn’t use the term “Google killer”, but stating that the application could be just as important as Google does lead one to make such a mental leap. After all, we have human brains and are flawed in this way.

As for artificial intelligence, I wrote my response to it on Twitter: It astonishes me that people spend years and millions on attempting to re-create what two 17-year-olds can make in the back seat of a car.

Drupal and OpenID

I have been focused on OpenID implementations lately, specifically in WordPress and Drupal. The Drupal effort is for my own sites.

Until this weekend, I had turned off new user registration at my Drupal sites, because I get too many junk user registrations. However, to incorporate OpenID into a Drupal site, you have to allow users to register themselves, regardless of whether they use OpenID or not.

I think this all-or-nothing approach actually limits the incorporation of OpenID within a Drupal site. If you limit registration to administrators only, then people can’t use their OpenIDs unless the administrator gets involved. If you allow people to self-register, there’s nothing to stop the spammy registrations.

I believe that OpenID should be an added, optional field attached to the comment form, allowing one to attach one’s OpenID directly to a comment, which then creates a limited user account within the site specifically for the purposes of commenting. Rather than just providing options to allow a user to register themselves, or not, add another set of options specific to OpenID, and allow us to filter new registrations based on the use of OpenID.

Currently, the new user registration options in Drupal 6.x are:

  • Only site administrators can create new user accounts.
  • Visitors can create accounts and no administrator approval is required.
  • Visitors can create accounts but administrator approval is required.

Turn on the latter two options and you’ll get spammy registrations within a day. Not many, but annoying. I believe there should be a fourth and fifth option:

  • Visitors can create accounts using OpenID, only, and no administrator approval is required.
  • Visitors can create accounts using OpenID, only, but administrator approval is required.

With these new options, I could then open up new user registration for OpenID, but without having to allow generic new user registration for the account spammers that seem to be so prevalent with Drupal.
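The five options can be sketched as a small decision table. This is only an illustration in Python, not Drupal code: the enum names and the `evaluate_registration` function are hypothetical, and the two OpenID-only values are the proposed options, which do not exist in Drupal.

```python
from enum import Enum


class RegistrationPolicy(Enum):
    # The three options Drupal 6.x offers, as listed above.
    ADMIN_ONLY = "admin_only"
    VISITORS_NO_APPROVAL = "visitors_no_approval"
    VISITORS_WITH_APPROVAL = "visitors_with_approval"
    # The two proposed options (hypothetical; not part of Drupal).
    OPENID_ONLY_NO_APPROVAL = "openid_only_no_approval"
    OPENID_ONLY_WITH_APPROVAL = "openid_only_with_approval"


def evaluate_registration(policy, uses_openid):
    """Return (allowed, needs_admin_approval) for a self-registration attempt."""
    if policy is RegistrationPolicy.ADMIN_ONLY:
        return (False, False)
    if policy is RegistrationPolicy.VISITORS_NO_APPROVAL:
        return (True, False)
    if policy is RegistrationPolicy.VISITORS_WITH_APPROVAL:
        return (True, True)
    if policy is RegistrationPolicy.OPENID_ONLY_NO_APPROVAL:
        # Only attempts that arrive through OpenID get an account.
        return (uses_openid, False)
    if policy is RegistrationPolicy.OPENID_ONLY_WITH_APPROVAL:
        # OpenID attempts are accepted but held for an administrator.
        return (uses_openid, uses_openid)
    raise ValueError(f"unknown policy: {policy!r}")
```

Under either OpenID-only policy, a plain username-and-password registration is rejected outright, which is the filtering behavior described above.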

To attempt to implement this customized functionality at my sites, I’ve been playing with Drupal hooks, but the change is a little more extensive than just incorporating a hook handler and a few lines of code, at least for someone who is relatively new to Drupal module development like I am.

Taking the simplest route that I could implement as a stand-alone module, what I’m trying now is to modify the new user registration forms so that only the OpenID registration links display. You’ll see this, currently, in the sidebar if you access the site and you’re not logged in. Unfortunately, you have to click on the OpenID link to open the OpenID field, because I’m still trying to figure out how to remove the OpenID JavaScript that hides the field (there is a function to easily add a JavaScript library, but not one to remove an added library).

With my module-based modifications, rather than a person having to click a link to create a new account, and specify a username and password, they would provide their OpenID, and I would automatically assign them a username via autoregistration. To try my new sidebar module, I decided to turn my Drupal sites into OpenID providers, as well as clients, and use one of them as a test case. Provider functionality is not built in, but there is an OpenID provider module, which I downloaded and activated with my test Drupal installation (MissouriGreen).

I tried my new module and OpenID autoregistration but ran into a problem: the Drupal client does not like either the username or email provided via the Drupal OpenID provider. Why? Because the OpenID identifier used in the registration consists of the URL of the Provider, which is the URL of the Drupal site I used for my test, and the Drupal client does not like my using a URL. In addition, the provider also didn’t provide an email address.
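One possible fallback when a provider sends no nickname is to derive a username from the OpenID URL itself. The Python sketch below is not what the Drupal client does: the function name `username_from_openid`, the default name `openid_user`, and the deliberately conservative character set (letters, digits, underscores) are all assumptions made for illustration.

```python
import re
from urllib.parse import urlparse


def username_from_openid(identifier, taken=()):
    """Derive a plain username from an OpenID URL (hypothetical helper)."""
    # Accept identifiers typed without a scheme, such as "example.com/me".
    if "://" not in identifier:
        identifier = "http://" + identifier
    parsed = urlparse(identifier)
    host = parsed.hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    raw = host + parsed.path.rstrip("/")
    # Collapse anything that is not a letter or digit into one underscore.
    base = re.sub(r"[^A-Za-z0-9]+", "_", raw).strip("_").lower()
    if not base:
        base = "openid_user"
    # Append a counter until the name does not clash with an existing one.
    candidate = base
    suffix = 1
    while candidate in taken:
        suffix += 1
        candidate = f"{base}_{suffix}"
    return candidate
```

This only addresses the username half of the problem; the email address still has to come from the provider or from the user.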

Digging into the client-side code, I discovered that the Drupal OpenID client supports an OpenID extension, Simple Registration. Simple Registration provides for an exchange of nine commonly requested pieces of information between the OpenID client and provider: nickname, email, full name, dob (date of birth), gender, postcode, country, language, and timezone. With Simple Registration, you can specify which of the items is optional and which mandatory, and the current OpenID client wants nickname and email.
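As a rough illustration of the client's half of that exchange, here is a minimal Python sketch that builds the Simple Registration request parameters. The `openid.sreg.required` and `openid.sreg.optional` keys and the field names come from the Simple Registration extension; the function name `sreg_request_params` is hypothetical, and this is not the Drupal client's code.

```python
# The nine Simple Registration field names, as spelled on the wire.
SREG_FIELDS = (
    "nickname", "email", "fullname", "dob", "gender",
    "postcode", "country", "language", "timezone",
)


def sreg_request_params(required=(), optional=()):
    """Build the sreg parameters a client adds to its authentication request."""
    for name in (*required, *optional):
        if name not in SREG_FIELDS:
            raise ValueError(f"not a Simple Registration field: {name}")
    if set(required) & set(optional):
        raise ValueError("a field cannot be both required and optional")
    params = {}
    if required:
        params["openid.sreg.required"] = ",".join(required)
    if optional:
        params["openid.sreg.optional"] = ",".join(optional)
    return params


# What a client that wants nickname and email would ask for.
drupal_like_request = sreg_request_params(required=("nickname", "email"))
```

A provider that supports the extension answers with matching `openid.sreg.*` values; a provider that doesn't simply omits them, which leaves the client without the data it asked for.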

By using Simple Registration on the provider, I could then provide the two things that my Drupal OpenID client wanted: nickname and email. Unfortunately, though, the current version of the OpenID provider doesn’t support Simple Registration. I was a little surprised by this, as I had made an assumption that the Drupal OpenID provider would work with the Drupal OpenID client. However, OpenID is in a state of flux, so such gaps are to be expected.

Further search among the Drupal Modules turned up another module, the Drupal Simple Registration module, which allows one to set the mandatory and optional fields passed as part of the OpenID authentication exchange. The only problem is that the OpenID Provider also doesn’t have any incorporated hooks, which would allow the Simple Registration module to provide the Simple Registration data as part of the response. To add these hooks, the Simple Registration module developer also supplied a patch that can be run against the OpenID Provider code to add the hook.

I applied the patch and opened the module code and confirmed that it had been modified to incorporate the hook. I then tried using the Drupal site as OpenID provider again, but the registration process still failed. Further tests showed that the Simple Registration data still was not being sent.

All I really want to do is test the autoregistration process, so I abandoned the Drupal OpenID provider, and decided to try out some other providers. However, I had no success with either my Yahoo account or my Google Gmail account, even though I believed both provided this functionality. The Yahoo account either didn’t send the Simple Registration fields or failed to do so in a manner that the Drupal OpenID client could understand. The Gmail account just failed, completely, with no error message specifying why it failed.

I felt like Barbie: OpenID is hard!

I finally decided to use phpMyID, which is a dirt simple, single user OpenID application that we can host, ourselves. I had this installed at one time, pulled it, and have now re-installed at my base burningbird.net root directory. I added the autodiscovery tags to my main web page, and uncommented the lines in the MyID.config.php file for the nickname, full name, and email Simple Registration fields. I then tried “http://burningbird.net” for OpenID autoregistration at RealTech. Eureka! Success.
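For anyone curious what the autodiscovery tags give a client, here is a small Python sketch that pulls the OpenID 1.x `link` tags out of a page. The sample page and its `example.com` URLs are made up, and `discover_openid_links` is a hypothetical helper, not part of phpMyID or Drupal.

```python
from html.parser import HTMLParser


class OpenIDLinkFinder(HTMLParser):
    """Collect <link rel="openid.*"> tags from an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        # rel may hold several space-separated values.
        for rel in (attributes.get("rel") or "").lower().split():
            if rel.startswith("openid.") and href:
                # Keep the first occurrence of each rel value.
                self.links.setdefault(rel, href)


def discover_openid_links(html):
    finder = OpenIDLinkFinder()
    finder.feed(html)
    return finder.links


# Hypothetical page; the href values are placeholders.
SAMPLE_PAGE = """<html><head>
<title>Example</title>
<link rel="openid.server" href="http://example.com/MyID.config.php">
<link rel="openid.delegate" href="http://example.com/MyID.config.php">
</head><body></body></html>"""

found = discover_openid_links(SAMPLE_PAGE)
```

The client fetches the page at the URL the user typed, reads `openid.server` to learn where to send the authentication request, and uses `openid.delegate` as the identity to assert.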

The new user registration is still currently blocked at creation, but the site now supports autoregistration via OpenID. Unfortunately, though, the registration spammers can still access the full account creation page, so I can still get spammy registrations. However, I believe that this page can be blocked in my mandatory OpenID module, with a little additional work; at least until I can see about possibly creating a module that actually does add the OpenID only options I mentioned above. The people who generate spammy user account registrations could use OpenID themselves, but the process is much more complex, and a lot more controlled at the provider endpoint, so I think this will help me filter out all but the most determined spammy registrations.

Once all of this is working, I’ll see about adding the OpenID login field to the comment form, rather than in the sidebar. If one wonders, though, why there isn’t more use of OpenID, one doesn’t have to search far to find the answers. Luckily for Drupal users, OpenID seems to be an important focus of this week’s DrupalCon in Washington DC, including a specialized Code Sprint.

Kindle Versions

On Groundhog Day, I’ll have had my Kindle for a year. I’ve been working on an anniversary review of the device, which will get posted either to the Frugal Algorithm or Secret of Signals. Or perhaps a bit in both, not sure.

The buzz about the Kindle now is that a 2.0 version is coming out, February 9th. I imagine a new version is likely, but contrary to what people have been saying, there has been more than one Kindle variation released in the last year.

Currently, there are Kindles running the following operating system versions: 1.04, 1.08, 1.1, and 1.1.1. Amazon has stressed that all provide the same functionality. The only thing to account for the difference, then, is variations in the device. Not a simple swapping of parts, either, because one doesn’t need to update an operating system when one swaps identical parts.

I have a 1.04 version of a Kindle, and must admit to some curiosity about what improvements went into the 1.08 and 1.1 models. I know that one always takes risks buying version 1 of anything, but I don’t think I’ve ever seen a case where an item’s internal architecture has changed three times within one year. Changed enough to force a new version of the operating system. At a minimum, I have to wonder what will happen when new software functionality is rolled out. Do we 1.04 owners get the same goodies as, say, 1.1 owners?

To add further to the confusion, some people have reported in the owner forums seeing an OS version of 1.2 in their devices, and there are differences with this OS, but Amazon has stated this operating system has not been released. So rumor runs rampant in the forums, because we have no other source of communication about what’s happening with the devices. To be blunt, Amazon does not communicate with Kindle owners.

Regardless of lack of communication, and despite being an “old” Kindle owner, I do still like my device, though I really wish we had folder capability. However, I’d really rather that Amazon support ePub, and release its AZW format to other ebook readers. And I’ll have more to say on this later, too.

Another PowerPC Nail, Another smug Tech Writer

The good news is, Netflix WatchNow will now work on the Mac. The bad news? It only works with Silverlight 2, which only works within the Intel architecture.

Should be no problem, according to Engadget:

Unfortunately for super-duper late adopters, the software will only work with Intel-based Macs, so if you’ve been holding onto a G3 for dear life, here’s one more reason to finally can it, along with your Xbox 360 HD DVD player, Von Dutch trucker cap, and gas-guzzling Escalade.

I believe that the last version of Mac machines with the PowerPC architecture is G5, not G3. As for being antiquated, I have the last of the PowerBook G4 laptops, bought less than three years ago and still covered under Apple warranty. I guess that puts me in the Engadget “super-duper late adopter” category.

This is another nail in the coffin for machines that really aren’t that old, primarily brought about by Apple’s indifference to the fact that it switched architectures and then has done little to ensure that older architectures get full support. Though I appreciate the Universal platform Apple provided, which means applications work on both PowerPC and Intel machines, too many applications such as the recent Photoshop CS4, and now Silverlight 2, forming the background for services such as Netflix Watch Now, are being released only for Intel machines.

However, there’s not much we can do about companies like Netflix, Microsoft, and Adobe, and their lack of support for machines that really aren’t that old. Well, other than look for other sources of software. What bothers me more about this story, though, is the disdain demonstrated by the Engadget author, especially in light of today’s economic environment.

Too many of the writers for sites like Engadget assume this is 1999 all over again, and money flows, and everyone can afford a new machine every year. On the contrary, we’re heading into a recession, if not in one already. The estimates are that the unemployment rate in this country will hit 8%, or more, before we’re done; the impact on the job market could be worse in other countries. Yet here we have Engadget, sneeringly poking fun at those who are staying pat with their existing machines, not because the people who haven’t upgraded are cheap, but because they have no other option.

Personally, if there’s one thing I hope does occur from the current economic crisis, it’s that sites like Engadget either fail, or start looking more closely at today’s reality and begin to adapt their stories accordingly. Being frugal and making do can be just as challenging, interesting, and yes, even sexy, as buying every new generation of iPhone, iPod, or whatever that comes along.

How Not to write about the semantic web

How not to attract new semantic web readers, especially among the women. Write the following:

I just thought that this is a smart strategy to make video tutorials about the Semantic Web more appealing to female* or otherwise not so super-tech-savvy* audiences: Just put a Lolcat in it!

Though the author wrote that she matches the “stereotype”, which I guess means women who aren’t tech and like LOLcats, by the time I followed the asterisks, I’d already passed from astonishment to loathing. FYI, I wrote the first book on RDF, babes.

A reference to females was unnecessary. Surprising, too, from the same company featuring an interview with Corinna Bath, author of the thesis, “Towards a De-Gendered Design of Information Technologies”.