Categories
RDF Semantics Specs Web

Semantic web: dull as dishwater edition

Recovered from the Wayback Machine.

Mathew Ingram has decided that the problem with the semantic web is that it’s as boring as dry toast. Of course, by Mathew’s standard, all the stuff that makes the web work is also boring as hell. It’s probably a good thing, then, that some people looked beyond the need for immediate titillation when it comes to the tech underlying this environment, or Mathew’s audience for his opinions would be his immediate family members, and perhaps those neighbors not quick enough to run away when seeing him approach.

He also writes:

It’s all about plumbing and widgets and data standards, all of which have names like FOAF and TOTP and SIOC and whatnot. It’s right off the dork-o-meter. The Lone Gunmen from The X-Files would have a hard time getting interested in this stuff, let alone anyone who isn’t married to their slide rule or their pocket protector.

Now, taking Mathew’s complaints of No glitter! No glitter! Mama, Mama, where’s my glitter! seriously, I decided to put my slide rule down for a sec and see if I couldn’t respond to his one statement about no one knowing what this all means.

First, there was the web. The web was dumb, but it was hyperlinked.

Then, there was search. Search followed hyperlinks, scraped pages, massaged keywords and tested the strength of the links. The web was still dumb, but number crunching helped generate some smarts. Think of your favorite dog. Yeah, that smart.

Next, there was the semantic web. The semantic web says, You and I can derive understanding from this blob of text on this page, but applications can’t. Applications can pull keywords and run algorithms, but can only approximate what this blob of text is all about. What if we add a little information to this blob of text so that applications don’t have to crunch numbers or make guesses as to what we mean?

How do we add a little information? A hundred different ways. We can use microformats, or RDFa, or RDF, or whatever the HTML5 people cook up for us. With this little bit of extra information, applications can access a web page list that’s created with UL/LI elements, but instead of having to look at the text in the list and try to guess what the list is all about, they can read that little bit of data and know that the list consists of recommended books. Perhaps they can take that little list of books and use another application to look up these books at Amazon. Or at their library. Or better yet, click a button and load all the books into our Kindle. (Assuming that Mathew doesn’t subscribe to the Steve Jobs “We don’t read, we ain’t got no books, gimme the vids” school of thought.)
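For the slide rule crowd, here is a rough sketch of what "reading that little bit of data" might look like. The class names `recommended-books` and `book-title` are invented for this example; they are not taken from any published microformat or RDFa vocabulary, and a real application would follow whichever specification the page actually uses.

```python
from html.parser import HTMLParser

# Hypothetical markup: the class names are made up for this sketch.
PAGE = """
<ul class="recommended-books">
  <li><span class="book-title">Practical RDF</span></li>
  <li><span class="book-title">Learning JavaScript</span></li>
</ul>
"""


class BookListParser(HTMLParser):
    """Collects text marked as a book title inside a recommended-books list."""

    def __init__(self):
        super().__init__()
        self.in_book_list = False
        self.in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "ul" and "recommended-books" in classes:
            self.in_book_list = True
        elif tag == "span" and self.in_book_list and "book-title" in classes:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "ul":
            self.in_book_list = False
        elif tag == "span":
            self.in_title = False

    def handle_data(self, data):
        # Only keep text that the markup explicitly labels as a title.
        if self.in_title and data.strip():
            self.titles.append(data.strip())


def extract_books(html):
    parser = BookListParser()
    parser.feed(html)
    parser.close()
    return parser.titles


print(extract_books(PAGE))
```

The point is that the application never guesses: a plain list with no annotation yields nothing, and an annotated list yields exactly the titles the author marked up.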

The little bit of information might, instead, be an address for an event, triggering the browser to add that event information to a desktop calendar application.

It could be information about people we know and how we know them, so that when we move from Facebook, which is today’s darling, to MyPowerBase, we can tell MyPowerBase to add all people who we have defined as friends, but not those defined as just contacts.

If the information is embedded in a photo–wow, information embedded in a photo, how dull–when we upload the photo to a site like Flickr, it could automatically be added to a map, with all the other photos from the same location. It can be pulled up on a search someday, when we ask the web to show us all photos for St. Louis, or for a certain block in St. Louis. Perhaps it can even help us find photos that are licensed Creative Commons so we can steal them.

I might write about a product or company, and the little bit of information I add to my post might help others who are thinking of doing business with the company, or buying that product. Sure, search engines can scrape the content and try to glean useful bits based on keywords such as the product or company name, but we’ve all had enough really strange search results to know how far search can go, no matter how brainy the algorithm.

Someday, I’ll be able to write about movies and add just a little bit of extra information, and we can do the same for movies. Or music. Or cooking recipes (“give me all recipes on the web that use apricot jam and bourbon, but I don’t want chicken”). Or even poetry, though don’t mention poetry around Sir Tim–it makes him peevish.

Mathew is very addicted to FriendFeed, which allows him to pull in all the activities of his friends in various places. I bet if we scratched the surface of this application, a lot of the data that makes the application tick comes courtesy of the semantic web dorks.

I could go on and on, but I’ve already been away from my slide rule too long. Instead I’ll end with the best for last: because all of these different ways of adding that tiny little bit of useful information to blocks of text or photos or video files or what have you are based on agreed upon specifications, we can use applications to merge this data and use it for something new; something we haven’t thought of yet. See, now that’s when it really gets exciting because rather than coming up with an idea and then taking five years to get enough data to test it, we’ll already have the data, at no extra effort or cost.
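A minimal sketch of that merging, using the recipe query from earlier. It assumes each site publishes simple (subject, predicate, object) statements; the site data, the `recipe:` identifiers, and the `ingredient` predicate are all invented for illustration and stand in for whatever shared vocabulary the sites have agreed on.

```python
# Statements published by two hypothetical sites, as
# (subject, predicate, object) tuples. All of this data is made up.
SITE_A = [
    ("recipe:glazed-ham", "ingredient", "apricot jam"),
    ("recipe:glazed-ham", "ingredient", "bourbon"),
    ("recipe:glazed-ham", "ingredient", "ham"),
]
SITE_B = [
    ("recipe:sticky-wings", "ingredient", "apricot jam"),
    ("recipe:sticky-wings", "ingredient", "bourbon"),
    ("recipe:sticky-wings", "ingredient", "chicken"),
    ("recipe:glazed-ham", "ingredient", "bourbon"),  # same fact, second source
]


def merge(*sources):
    """Merging is a set union: identical statements collapse into one."""
    merged = set()
    for source in sources:
        merged.update(source)
    return merged


def ingredients_by_recipe(statements):
    """Group ingredient statements by the recipe they describe."""
    result = {}
    for subject, predicate, obj in statements:
        if predicate == "ingredient":
            result.setdefault(subject, set()).add(obj)
    return result


def find_recipes(statements, required, excluded):
    """Recipes that use every required ingredient and no excluded one."""
    required, excluded = set(required), set(excluded)
    return sorted(
        recipe
        for recipe, ingredients in ingredients_by_recipe(statements).items()
        if required <= ingredients and not (excluded & ingredients)
    )


everything = merge(SITE_A, SITE_B)
print(find_recipes(everything, ["apricot jam", "bourbon"], ["chicken"]))
```

Neither site had to know about the other. Because both describe ingredients the same way, the data combines without any scraping or guessing.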

Maybe I’ve been cooped up in my cube with my computers and code for too long, but that strikes me as kind of interesting. In a dorky sort of way.

Categories
Technology Weblogging

WordPress at the top: not

Recovered from the Wayback Machine.

The biggest mistake I ever made was to install WordPress at the top level. The second was to use “smart” URLs.

My site was restricted due to bandwidth overlimit this morning, something that shouldn’t have happened. When I checked my stats, one site, proxyit.com, was hammering my bandwidth. Checking the recent visitors list, this domain was grabbing my feed every minute, except it was grabbing the Burningbird feed, which was then redirected to the new combined feed, at http://burningbird.net/feeds/atom.xml.

This feed, created by the aggregator, Venus, hadn’t changed, but with the redirect, it was coming up fresh and sparkly new. Now, that doesn’t excuse the fact that this site was accessing it every minute, but I’m not sure that my twisted, convoluted redirects to feeds weren’t at least partially responsible. To make matters worse, I used an inline SVG object yesterday, which shouldn’t tax bandwidth limits overmuch…unless your feed is being hammered.

(Not to mention that using SVG inline absolutely killed my entry at Planet RDF…)

Of course, when I redirected my Burningbird main feed, /feed/, to atom.xml, this redirected all other variations of /feed, including /feed/atom, /feed/rdf, and so on. Not just for Burningbird, but all sub-domains, too. So I had to add more redirects, which attempted to bypass WordPress’s programmatic management of URLs. I had so many redirects in my sites to get feeds to serve correctly, I wasn’t sure who was getting what. So I’ve removed all of them.
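To show how one redirect can swallow every feed variation, here is a small sketch. The two patterns are assumptions made up for illustration, not the actual rules from my server configuration; the only thing taken from above is the target, /feeds/atom.xml.

```python
import re

TARGET = "/feeds/atom.xml"

# Hypothetical "loose" rule: matches any path that merely begins with /feed.
LOOSE = re.compile(r"^/feed")

# Hypothetical "strict" rule: matches only /feed and /feed/.
STRICT = re.compile(r"^/feed/?$")


def resolve(path, rule):
    """Return the redirect target if the rule matches, else the path unchanged."""
    return TARGET if rule.search(path) else path


for path in ["/feed/", "/feed/atom", "/feed/rdf", "/feeds/atom.xml"]:
    print(path, "loose:", resolve(path, LOOSE), "strict:", resolve(path, STRICT))
```

Under the loose rule, /feed/atom and /feed/rdf are both sent to the combined feed, and so is /feeds/atom.xml itself, since it also begins with /feed. Anchoring the end of the pattern keeps the redirect to the one URL it was meant for.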

One of my site changes is to remove WordPress at the top level. I’m replacing it with a page generated by Venus that combines all feeds from WordPress installations in sub-domains. Each sub-domain gets its own WordPress installation. Some will get the full installation, and others will get my new semi-forked version that I’ve named Curmudgeon WordPress. Curmudgeon WordPress is a WordPress installation that has had all the reader interactive bits, such as ping back, registration, XMLRPC, and comments, and their associated includes and admin functions removed.

When I get all this finished, no more RDF feeds, no more RSS feeds. You get one feed for each WordPress installation, an Atom feed. And you get one overall feed generated by Venus, name and location TBD, generated once per day.

In the meantime, feeds may be a problem. My bandwidth may be exceeded. Yada yada, you know the rest.

Categories
Technology Writing

Tasks, transcripts, and semantics

Recovered from the Wayback Machine.

I’m spending the rest of the week creating plug-ins that will XHTMLate WordPress. I’m not sure how far I can get with plug-ins, but the end result could be both interesting and useful. I still feel that XHTMLating WordPress is at least partially philosophy, as much as it is code. I can’t seem to communicate this clearly, though, so I am dropping the subject and just focusing on code.

I also have a design for my “Painting the Web” book web site, and need to create a lovely SVG paintbrush, as part of the design. Since my artistic skills are more along the lines of telling a program to draw a line from A to B, the effort may take some time. However, the medium I’m using (SVG) is compatible with my skillset, so perhaps the effort will be trivial and the result good. Better yet, I’ll be able to find a paintbrush at Wikipedia to use.

I did want to point out an interview that Paul Miller of Talis had with Tim Berners-Lee. Unlike most other podcasts, this interview also has a written transcript as well as published show notes. I really wish more video and audio podcasters would spend the time transcribing shows into text, as well as providing more in-depth information about the show than posting a video window and telling people, “Hey! Cool Stuff!” In the meantime, I’m going to watch this podcast via my Apple TV, since the Talis series is also listed at iTunes. I wonder if it’s in HD? (Later: oops! It’s not in video. Darn. I was looking forward to seeing Sir Tim in HD.)

In the write-up on the interview, Miller wrote:

We talked for a fascinating hour during which we ranged from past to future, from technology to policy. We covered specifications such as RDF and SPARQL, and we talked about the pressing need for more accessible texts to explain the Semantic Web to mainstream business.

My book, “Practical RDF”, is out of date, and I and my editor have been talking about a new edition. However, a new edition would not be focused entirely on RDF, and probably wouldn’t cover certain aspects of RDF, in order to be a bit more comprehensive. RDF doesn’t function alone in the world, and a book that covers semantic web technologies needs to cover not only RDF, but also all the complementary technologies. This, in addition to the new tools, data initiatives, and companies.

Now is actually a rather exciting time to be creating a new edition of a book on semantic web technologies. I remember when I wrote “Practical RDF”, which was published before the final release of the RDF specification, I had to stretch a bit to find tools and technologies focused on RDF and/or the semantic web. Now, the semantic web is hip, and the challenge is less on finding good material and more on ensuring that the book isn’t too big, or doesn’t cover too much.

I don’t think the new edition will be called the same, but we’ll be keeping the “Practical” in the title in some way. Maybe something along the lines of “Practical Semantic Web”. I am nothing if not a practical person, and the “practical” component of the title will also be the overall theme for the book. However, even with this constraint I visualize a book bursting at the seams.

We’re also planning a new edition of Learning JavaScript. Unfortunately, the first edition was on a bit of a fast track, and I made mistakes in the book; more than I’d like to see with any of my books. I’ve made corrections via errata, but it will be nice to create a new, updated version.

I’m also helping out with a new edition of a third book, but this would be more along the lines of contributing commentary on organization and some chapters than being sole author.

Categories
XHTML/HTML

XHTMLating feeds

Jeff has been adding SVG annotation, as well as objects to his weblog design. When using SVG, the first issue that arises is serving up XHTML in order for the SVG to be processed correctly. This also means serving up your Atom feeds, accordingly.

In Jeff’s case, he’s using the object element to incorporate SVG annotating the post type. If he inlined his SVG, and wanted it processed correctly in feed readers (both big ‘ifs’–stripping out the SVG is also an acceptable response), two things need to be modified.

First, there’s an html_type option value that needs to be adjusted. The only way to do this at this time is to manually update it in the database. I had thought there was a modification made to WordPress to add this as an Administration option, but I couldn’t find it in the checked-in code.

I hard coded my feed for this weblog to XHTML, and made the appropriate adjustments, removing the CDATA section and adding a wrapping DIV element (source for feed and comments feed).
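As a rough illustration of that adjustment, and not the code from my actual feed templates, here is a sketch of building an Atom content element of type xhtml. The Atom format expects xhtml content to be a single div in the XHTML namespace, which is why the wrapping DIV is needed once the CDATA section is gone. The function name `xhtml_content` is made up for this example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def xhtml_content(body_markup):
    """Wrap post markup in a namespaced div inside an Atom content element.

    Parsing the markup as XML doubles as a well-formedness check: anything
    that would break an XHTML feed raises ET.ParseError here instead.
    """
    content = ET.Element(f"{{{ATOM_NS}}}content", {"type": "xhtml"})
    div = ET.fromstring(f'<div xmlns="{XHTML_NS}">{body_markup}</div>')
    content.append(div)
    return content


element = xhtml_content("<p>Hello from an <em>XHTML</em> feed.</p>")
print(ET.tostring(element, encoding="unicode"))

try:
    xhtml_content("<p>Unclosed paragraph")
except ET.ParseError as err:
    print("Rejected malformed markup:", err)
```

With CDATA, a feed can carry any tag soup as escaped text. With type xhtml, the markup is part of the feed's own XML, so a single unclosed tag makes the whole feed invalid.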

To ensure the two feed files don’t get overwritten, I have a plug-in that I use to override the location of the atom feed files, pointing the location to files in my theme directory.

This is only a temporary fix, though. The real fix would be to provide an option to set the html_type in the admin page, which then serves the weblog pages up accordingly, as well as being used to set the type in the feeds. The value should also be used to determine the output of the content in the feeds.

All of this could be done in plug-ins. What can’t, is handling input from readers when serving up XHTML pages. Input from readers enters WordPress in several different places in the code, most of which do not have hooks allowing us to override the code to provide our own. The only way WordPress will be able to effectively do XHTML is through a commitment to make this a change in the underlying base code.

Since the WordPress developers have not shown any interest in supporting XHTML, and since I haven’t seen a lot of interest in XHTML support in WordPress from my own explorations and published posts, this is just not a challenge I’ve been eager to take on.

The real question is, will Microsoft support the XHTML MIME type in IE8? Not having support for XHTML in one of the major browsers is probably the biggest hold up on more widespread support for XHTML. Otherwise, I would think that the increasing interest in SVG would start generating enough movement towards XHTML to finally trickle its way into the WordPress community, regardless of the aversion to XHTML of the development team.
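Until every major browser accepts the XHTML MIME type, one common workaround is to check the request's Accept header and fall back to text/html. The sketch below is an assumption about how that check might be written, not something WordPress does; the function name `pick_content_type` is invented, and for simplicity it ignores q-values entirely.

```python
XHTML = "application/xhtml+xml"
HTML = "text/html"


def pick_content_type(accept_header):
    """Choose the XHTML MIME type only when the client explicitly lists it.

    Simplification: parameters such as ;q=0.9 are stripped and ignored, so a
    client sending q=0 for XHTML would still be served XHTML by this sketch.
    """
    offered = {
        part.split(";")[0].strip().lower()
        for part in (accept_header or "").split(",")
    }
    return XHTML if XHTML in offered else HTML


print(pick_content_type("text/html,application/xhtml+xml,application/xml;q=0.9"))
print(pick_content_type("text/html, */*"))
```

A browser that only advertises text/html or a wildcard gets plain HTML, which also means it won't process inline SVG.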

PS I would appreciate help testing my current XHTML validation in my comments. You can’t hurt anything with the way the comments are now currently moderated.

Categories
SVG XHTML/HTML

There’s open and then there’s open

Recovered from the Wayback Machine.

As an example of Microsoft’s new commitment to being more open with web developers, the company is releasing the IE8 beta to invited testers only, with a more general release later. Perhaps by “open”, we don’t all mean the same thing?

I also noticed that the company has not provided any answers to the questions we’ve been asking about the “super standards mode”. In particular, nothing from the company about support for XHTML and SVG/MathML. A simple, “Yes, we’re supporting XHTML” would have added real weight to all those bold pronouncements of openness and standards support this last week. Instead, the company spends its time spreading fooflah and working the community.

As an aside, you know, there’s nothing sadder in nature than a wasp without its stinger.