Category: Semantics

It’s all angle brackets

Post author By Shelley Powers
Post date December 30, 2002

In his recent post, Mark Pilgrim writes that he is amazed, bordering on appalled, at the reaction to his posting about the CITE tag. I was a bit surprised myself because the posting wasn’t necessarily about revolutionary uses of technology. However, what Mark did do, in just a few words, was hit the hot spot in several debates: XML versus HTML, machine readability versus human readability, the semantic web, RDF, and any combination of these topics. And for the cherry to complete this semantic sundae, he threw in some code. If his post were fishing instead of writing, it would be equivalent to using five different fishing poles, each with a different lure. And did he come home with a catch.

Semantics. People start talking semantics, and each person doesn’t understand what the other people mean by semantics, and therein lies the wonderful irony that seems to weave in and out of the web. Semantics is all about meaning, but exactly what does it ‘mean’? We have no problems with small ‘s’ semantics in the everyday world, but put semantics on the web, and it becomes big ‘S’ Semantics.

Mark uses CITE as an example of semantic markup in HTML. He has a point: CITE does carry with it meaning — that which is marked up with this tag is a ‘citation’. By defining the context of the element, we can, for example, discriminate between hypertext links that are just ‘links’ and links that are associated with citations.

I went back to one of my postings and added CITE to specific URLs that I wanted to designate as citations. With an itty bitty Perl CGI app, I can find all the citations in the page — as shown here. Embed the CITE within a hypertext link and I can also easily associate those citations with the author’s post, as shown here.

By using CITE in conjunction with a hypertext link, I attach special significance to the link, something I can’t really do with just a straight hypertext link tag, as shown here. CITE provides context for the link. Context provides meaning, and meaning is semantics. Works nicely.
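For the curious, here is a rough sketch of the same idea in Python rather than Perl. It is not the CGI app mentioned above, just a minimal illustration built on the standard html.parser module. The names CiteFinder and find_citations and the example.org URLs are made up for this example, and it assumes the CITE sits inside the hypertext link, as described.

```python
from html.parser import HTMLParser


class CiteFinder(HTMLParser):
    """Collect the text of every CITE element, plus the enclosing link if any."""

    def __init__(self):
        super().__init__()
        self.href_stack = []   # href values of the <a> elements currently open
        self.in_cite = False
        self.buffer = []       # text pieces seen inside the current CITE
        self.citations = []    # list of (cited text, href or None)

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag names in lowercase, so CITE and cite both match.
        if tag == "a":
            self.href_stack.append(dict(attrs).get("href"))
        elif tag == "cite":
            self.in_cite = True
            self.buffer = []

    def handle_endtag(self, tag):
        if tag == "a" and self.href_stack:
            self.href_stack.pop()
        elif tag == "cite" and self.in_cite:
            href = self.href_stack[-1] if self.href_stack else None
            self.citations.append(("".join(self.buffer).strip(), href))
            self.in_cite = False

    def handle_data(self, data):
        if self.in_cite:
            self.buffer.append(data)


def find_citations(html):
    """Return (text, href) pairs; href is None for a CITE outside any link."""
    finder = CiteFinder()
    finder.feed(html)
    return finder.citations


# Hypothetical snippet: one cited link, one bare citation, one ordinary link.
SAMPLE = (
    '<p><a href="http://example.org/post"><CITE>Mark</CITE></a> said so, '
    'and so did <cite>Someone Else</cite>. '
    'This is <a href="http://example.org/other">just a link</a>.</p>'
)
print(find_citations(SAMPLE))
```

The ordinary link never shows up in the result, which is the whole point: the CITE is what separates a citation from a plain link.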

However, and you knew there was a however, I am a greedy person. I want to know more, and at some point HTML just doesn’t have the items that can convey the ‘meaning’ that I’m after.

Sure, I can create little bots that go out and scrape HTML and return with all sorts of data. I can then create a huge database and push this data into it. And once I have mined all that data, I can then create these huge, twistie, complex algorithms and set myself up as a competitor for Google. I mean, all that’s missing is someone to do the graphics for me for holidays, and such.

But, you see, that’s not what I’m after. I’d like to be able to associate new and even more complex forms of ‘meaning’ with web resources without having to store huge amounts of data, or to create increasingly complex algorithms, including finding devious ways of filtering out what amounts to “weblog spam”.

Ultimately, I want to record and find meaning without having to get VC funding, first.

That’s when something like RDF/XML enters the picture. Of course, you knew I was going to bring in RDF/XML — look to your left. The cover on the book doesn’t say “Practical meaning in a loosely connected environment filled with lots of data”. It says “Practical RDF”.

Let’s say I want to be able to find out Creative Commons license information for a specific posting. I could put this information into meta tags, or try and scrape it from the HTML. However, by embedding the information into RDF/XML, which is then embedded in the HTML, I can easily use one of my RDF APIs, such as my RDF PHP-based Query-o-Matic Lite, to pull the information out about the license — such as the required license information. Since I also store the RSS channel information within the page, I can also query this information.

Of course, I could get this RSS channel information directly from my RDF/RSS file, but I’d rather get specific information for a specific resource than my current running list of aggregated items.
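The license lookup described above can be sketched in a few lines of Python. This is not the PHP Query-o-Matic and it is not a real RDF parser: it only pulls any rdf:RDF block out of the page text and reads it as plain XML. The function name license_requirements, the sample page, and the assumption that the license vocabulary lives in the http://web.resource.org/cc/ namespace with the RDF block tucked inside an HTML comment are all illustrative guesses, not a description of my actual pages.

```python
import re
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CC_NS = "http://web.resource.org/cc/"   # assumed license vocabulary namespace

# Assumes the block uses the conventional "rdf:" prefix on its root element.
RDF_BLOCK = re.compile(r"<rdf:RDF\b.*?</rdf:RDF>", re.DOTALL)


def license_requirements(html):
    """Map each license URI found in embedded RDF/XML to what it requires."""
    found = {}
    for block in RDF_BLOCK.findall(html):
        root = ET.fromstring(block)
        for lic in root.iter(f"{{{CC_NS}}}License"):
            uri = lic.get(f"{{{RDF_NS}}}about")
            found[uri] = [
                req.get(f"{{{RDF_NS}}}resource")
                for req in lic.findall(f"{{{CC_NS}}}requires")
            ]
    return found


# Hypothetical page with license metadata embedded in an HTML comment.
SAMPLE_PAGE = """<html><body><p>A weblog posting.</p>
<!--
<rdf:RDF xmlns="http://web.resource.org/cc/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Work rdf:about="http://example.org/archives/000001.html">
    <license rdf:resource="http://creativecommons.org/licenses/by/1.0/" />
  </Work>
  <License rdf:about="http://creativecommons.org/licenses/by/1.0/">
    <requires rdf:resource="http://web.resource.org/cc/Attribution" />
    <requires rdf:resource="http://web.resource.org/cc/Notice" />
    <permits rdf:resource="http://web.resource.org/cc/Reproduction" />
  </License>
</rdf:RDF>
-->
</body></html>"""

print(license_requirements(SAMPLE_PAGE))
```

A proper RDF API would hand back triples and cope with the many ways the same statements can be serialized. The sketch only works because the sample sticks to one simple layout.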

The point of all of this, besides having a little fun with Perl and PHP and various forms of markup, is that all of this stuff is data and all of this stuff can record ‘meaning’, at least some forms of meaning. RDF/XML doesn’t replace the ‘meaning’ that HTML provides — it just adds a way to record new meanings that HTML can’t, or doesn’t provide.

I agree with Jon Udell: there’s no need for either/or propositions in the world of Semantic markup. It’s really nothing more than angle brackets, data, and a few rules depending on the specific markup used. Add a smidgeon of code and there you have it: rich, meaningful data. Sure beats the heck out of a web consisting purely of Adobe PDF and Macromedia Flash files; all we’d have then is a bunch of loosely connected black holes.

(g’zipped and tarred file with itty bitty Perl CGI apps used as examples — requires HTML::BuildTree. g’zipped and tarred file of RDF Query-o-Matic files. Requires PHP XML classes from Source Forge.)

Archived at Wayback Machine

Tags RDF

RDF

Creative Commons and RSS Syndication

Post author By Shelley Powers
Post date December 18, 2002

Recovered from the Wayback Machine.

I am applying some pushback with regard to RSS and the Creative Commons License over at the RSS Development Group discussion group.

My original statement:

I’ve already incorporated this into my weblog template and into my PostContent system for weblog resources. However, there are no defined semantics for how licensing is applied to RSS feeds. For instance, is the license applied to the feed or the source? If the feed, how does the CCL attached to the feed conflict with the implied consent of the data, considering that RSS feeds are assumed to be aggregated and potentially published? If the feed has excerpts only, wouldn’t this be overridden by fair use laws? If the feed has all the content, does the license apply to the content as feed or to the content separate from the feed?

How is a conflict resolved between a license in a feed and a license within the actual resource itself? Does the license in the resource take precedence?

Just because the CCL is RDF/XML doesn’t mean we should run out and incorporate it into every existing RDF datastore: FOAF, RSS, and so on.

Good discussion. If you’re interested in CCL and syndication feeds, or impacts of licensing on aggregated material, or even a peek at the confusion that can result when tech and law are thrown together, you might be interested in checking out the discussion. Technical background not required.

Hopefully, some of the Creative Commons folks will also check this out and get into the discussion.

Tags RSS

RDF

RDF Browser

Post author By Shelley Powers
Post date December 18, 2002

Recovered from the Wayback Machine.

Thanks to 0xDECAFBAD for pointing out another amazing RDF product from the HP Semantic Lab – Brownsauce.

Brownsauce is based on the Jena Java API and uses the lightweight Java web server, Jetty, to serve the application pages. Or, if you prefer, you can install it into your own Tomcat server. The only requirement is Java support, and there is no installation and configuration required to work with it. I’ve successfully used it without problems in Windows 2000, Mac OS 10.2, and Linux.

Brownsauce translates RDF into human-consumable content, including separating nested resources into separate pages, linked to the parent statement. Additionally, clicking on any of the predicates (properties) opens a page with information about the predicate, polled from the associated RDF Schema file.

One very slick application. A must for any RDF-er.

RDF Writing

Practical RDF Book Cover

Post author By Shelley Powers
Post date December 16, 2002

Recovered from the Wayback Machine.

Todd Mezzulo from O’Reilly, the person responsible for marketing the Practical RDF book, sent me a copy of the cover, which I’ve embedded below. Now, the book isn’t going to be on the streets until Spring, so contain your excitement…a little.

(To be honest, I’m really excited about this book. Really, really.)

The bird pictured is a Secretary Bird, a predator bird originally from South Africa. The Secretary Bird is known for its prowess in killing snakes, having the nickname of “serpent eater”.

It grabs the snake with its strong toes and beats it to death on the ground, while protecting itself from bites with its large wings. Finally, it seizes its prey and hurls it into the air several times to stun it.

I found this particularly humorous because my last sole-author book for O’Reilly was Developing ASP Components, featuring none other than a serpent on the cover. I joked with Todd that the choice of critter for the Practical RDF book is especially appropriate because once I made the decision to go with RDF for my next subject, I never looked back at COM+ and ASP. RDF figuratively ‘killed’ ASP for me; I just didn’t pick it up by the tail and throw it around. Much.

But all this isn’t why the cover design folks at O’Reilly picked the Secretary Bird. I think they just liked the long tail.

Hey! Don’t mess with the Burningbird — Serpent Killer!

RDF Weblogging

When doors are open

Post author By Shelley Powers
Post date December 15, 2002

Recovered from the Wayback Machine.

It started with Ben Hammersley getting an idea:

So here’s what I’d like. Movable Type blogs now automatically create trackbacks when they can. These trackbacks contain RDF, denoting the category the MT blog has that category within. MT produces RDF indexes too (in the flavour of RSS 1.0). So, what I want is a little app that takes the trackback. Follows it back to the originating site, find the RDF snippet, takes the index.rdf, and gives back all the entries within the index.rdf that are on the same subject as the trackback one.

A little chit chat occurs among a few people, all of whom invited themselves into Ben’s conversation via comments, trackbacks, and through cross-posts (here, here, here to list a few).

Today, less than two days later, Ben Trott posts a solution. I download it. I run it with my entry Elitist only need apply?. I get the following:

Examining http://www.irelan.net/becoming/archives/000745.html
Category: Technology
Found RSS http://www.irelan.net/becoming/index.rdf
Examining http://esigler.2nw.net/blog/archives/000032.html
Category: Play
Found RSS http://esigler.2nw.net/blog/index.rdf
Examining http://www.seabury.edu/MT/akma/000363.html
Examining http://WWW.onepotmeal.com/blog/archives/001070.html

More Like This From Others:
Young at Heart, Bitter in Mind
Technology
http://www.irelan.net/becoming/archives/000745.shtml

For the people, by the people
Technology
http://www.irelan.net/becoming/archives/000744.shtml

Permahome
Technology
http://www.irelan.net/becoming/archives/000736.shtml

Conferences…
Play
http://esigler.2nw.net/blog/archives/000032.html

Beatings will continue until grades improve…
Play
http://esigler.2nw.net/blog/archives/000031.html

A smattering of assorted thoughts.
Play
http://esigler.2nw.net/blog/archives/000027.html

Doh!
Play
http://esigler.2nw.net/blog/archives/000018.html
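The last step in that output, picking the entries in an index.rdf that share the trackback’s category, can be sketched roughly as follows. This is not Ben Trott’s code. It is a Python illustration that assumes the index is RSS 1.0 with a dc:subject on each item, and it skips the fetching and trackback parsing entirely. The function name entries_in_category and the example.org feed are invented for the sketch.

```python
import xml.etree.ElementTree as ET

NS = {
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def entries_in_category(index_rdf, category):
    """Return (title, link) for every RSS 1.0 item filed under the category."""
    root = ET.fromstring(index_rdf)
    matches = []
    # In RSS 1.0 the items are direct children of the rdf:RDF root element.
    for item in root.findall("rss:item", NS):
        subject = item.findtext("dc:subject", default="", namespaces=NS)
        if subject.strip() == category:
            title = item.findtext("rss:title", default="", namespaces=NS)
            link = item.findtext("rss:link", default="", namespaces=NS)
            matches.append((title.strip(), link.strip()))
    return matches


# Hypothetical index.rdf, already downloaded, with two categorized entries.
SAMPLE_INDEX = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.org/blog/">
    <title>Example weblog</title>
    <link>http://example.org/blog/</link>
  </channel>
  <item rdf:about="http://example.org/blog/archives/000001.html">
    <title>First post</title>
    <link>http://example.org/blog/archives/000001.html</link>
    <dc:subject>Technology</dc:subject>
  </item>
  <item rdf:about="http://example.org/blog/archives/000002.html">
    <title>Second post</title>
    <link>http://example.org/blog/archives/000002.html</link>
    <dc:subject>Play</dc:subject>
  </item>
</rdf:RDF>"""

for title, link in entries_in_category(SAMPLE_INDEX, "Technology"):
    print(title, link)
```

Run that once per trackback, each with its own category and its own index.rdf, and you have the makings of the list above.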

Want to know what the future holds for social software? You just saw it in action, boys and girls.