Categories
Semantics

It’s all angle brackets

In his recent post, Mark Pilgrim writes that he is amazed, bordering on appalled, at the reaction to his posting about the CITE tag. I was a bit surprised myself, because the posting wasn’t necessarily about revolutionary uses of technology. However, what Mark did do, in just a few words, was hit the hot spot in several debates: XML versus HTML, machine readability versus human readability, the semantic web, RDF, and any combination of these topics. And for the cherry to complete this semantic sundae, he threw in some code. If his post were fishing instead of writing, it would be equivalent to using five different fishing poles, each with a different lure. And did he come home with a catch.

Semantics. People start talking semantics, and each person doesn’t understand what the other people mean by semantics, and therein lies the wonderful irony that seems to weave in and out of the web. Semantics is all about meaning, but exactly what does it ‘mean’? We have no problems with small ‘s’ semantics in the everyday world, but put semantics on the web, and it becomes big ‘S’ Semantics.

Mark uses CITE as an example of semantic markup in HTML. He has a point: CITE does carry with it meaning — that which is marked up with this tag is a ‘citation’. By defining the context of the element, we can, for example, discriminate between hypertext links that are just ‘links’ and links that are associated with citations.

I went back to one of my postings and added CITE to specific URLs that I wanted to designate as citations. With an itty bitty Perl CGI app, I can find all the citations in the page — as shown here. Embed the CITE within a hypertext link and I can also easily associate those citations with the author’s post, as shown here.

By using CITE in conjunction with a hypertext link, I attach special significance to the link, something I can’t really do with just a straight hypertext link tag, as shown here. CITE provides context for the link. Context provides meaning, and meaning is semantics. Works nicely.
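The Perl CGI apps themselves aren’t reproduced in this post, so here is a rough sketch of the same idea in Python, using only the standard library. It collects every CITE in a page and, when the CITE sits inside a hypertext link, the link it belongs to. The names CiteFinder and find_citations and the sample markup are made up for illustration; this is not the original code.

```python
from html.parser import HTMLParser


class CiteFinder(HTMLParser):
    """Collect the text of every CITE element, plus its enclosing link if any."""

    def __init__(self):
        super().__init__()
        self.citations = []    # list of (cite_text, href_or_None)
        self._href_stack = []  # href of each open <a>, innermost last
        self._cite_depth = 0   # greater than 0 while inside a <cite>
        self._buffer = []      # text pieces of the citation being read

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <CITE> and <cite> both match.
        if tag == "a":
            self._href_stack.append(dict(attrs).get("href"))
        elif tag == "cite":
            self._cite_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self._href_stack:
            self._href_stack.pop()
        elif tag == "cite" and self._cite_depth:
            self._cite_depth -= 1
            if self._cite_depth == 0:
                text = "".join(self._buffer).strip()
                # Only a link that wraps the CITE is picked up here;
                # a link placed inside the CITE is not.
                href = self._href_stack[-1] if self._href_stack else None
                self.citations.append((text, href))
                self._buffer = []

    def handle_data(self, data):
        if self._cite_depth:
            self._buffer.append(data)


def find_citations(html):
    """Return a list of (citation text, enclosing link or None) pairs."""
    parser = CiteFinder()
    parser.feed(html)
    parser.close()
    return parser.citations


# Hypothetical markup: one CITE inside a link, one bare CITE, one plain link.
SAMPLE = """
<p>As <a href="http://example.org/post"><cite>Example Author</cite></a> wrote,
and as noted in <cite>Some Book</cite>. A plain <a href="http://example.org/">link</a>.</p>
"""

if __name__ == "__main__":
    for text, href in find_citations(SAMPLE):
        print(text, "->", href)
```

The plain link never shows up in the results, which is the whole point: the CITE is what separates a citation from a link that is just a link.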

However, and you knew there was a however, I am a greedy person. I want to know more, and at some point HTML just doesn’t have the items that can convey the ‘meaning’ that I’m after.

Sure, I can create little bots that go out and scrape HTML and return with all sorts of data. I can then create a huge database and push this data into it. And once I have mined all that data, I can then create these huge, twistie, complex algorithms and set myself up as a competitor for Google. I mean, all that’s missing is someone to do the graphics for me for holidays, and such.

But, you see, that’s not what I’m after. I’d like to be able to associate new and even more complex forms of ‘meaning’ with web resources without having to store huge amounts of data, or create ever more complex algorithms, including devious ways of filtering out what amounts to “weblog spam”.

Ultimately, I want to record and find meaning without having to get VC funding, first.

That’s when something like RDF/XML enters the picture. Of course, you knew I was going to bring in RDF/XML — look to your left. The cover on the book doesn’t say “Practical meaning in a loosely connected environment filled with lots of data”. It says “Practical RDF”.

Let’s say I want to find the Creative Commons license information for a specific posting. I could put this information into meta tags, or try to scrape it from the HTML. However, by recording the information as RDF/XML, which is then embedded in the HTML, I can easily use one of my RDF APIs, such as my RDF PHP-based Query-o-Matic Lite, to pull out information about the license, such as the required license information. Since I also store the RSS channel information within the page, I can query that, too.

Of course, I could get this RSS channel information directly from my RDF/RSS file, but I’d rather get specific information for a specific resource than my current running list of aggregated items.
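To make the license lookup a bit more concrete, here is a minimal sketch in Python. It is not the Query-o-Matic, and it is not a real RDF parser: it just lifts the rdf:RDF block out of the page text and reads one particular serialization with the standard library’s XML parser. The sample page, the function name license_requirements, and the assumption that the block uses the Creative Commons vocabulary at http://web.resource.org/cc/ and sits inside an HTML comment are all mine.

```python
import re
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CC_NS = "http://web.resource.org/cc/"  # assumed license vocabulary

# Hypothetical page with a license block tucked inside an HTML comment.
SAMPLE_PAGE = """<html><body>
<p>A posting.</p>
<!--
<rdf:RDF xmlns="http://web.resource.org/cc/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Work rdf:about="http://example.org/posting">
    <license rdf:resource="http://creativecommons.org/licenses/by/1.0/" />
  </Work>
  <License rdf:about="http://creativecommons.org/licenses/by/1.0/">
    <permits rdf:resource="http://web.resource.org/cc/Reproduction" />
    <requires rdf:resource="http://web.resource.org/cc/Attribution" />
  </License>
</rdf:RDF>
-->
</body></html>"""


def license_requirements(page):
    """Return {license URI: [required URI, ...]} for RDF/XML found in a page."""
    results = {}
    # Pull each rdf:RDF block out of the surrounding HTML as plain text.
    for block in re.findall(r"<rdf:RDF\b.*?</rdf:RDF>", page, flags=re.DOTALL):
        root = ET.fromstring(block)
        for lic in root.findall(f"{{{CC_NS}}}License"):
            uri = lic.get(f"{{{RDF_NS}}}about")
            required = [
                req.get(f"{{{RDF_NS}}}resource")
                for req in lic.findall(f"{{{CC_NS}}}requires")
            ]
            results.setdefault(uri, []).extend(required)
    return results


if __name__ == "__main__":
    for uri, required in license_requirements(SAMPLE_PAGE).items():
        print(uri, "requires", required)
```

A proper RDF API works on the statements rather than on the shape of the XML, so it would keep working if the same facts were written out differently. This sketch would not.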

The point of all of this, besides having a little fun with Perl and PHP and various forms of markup, is that all of this stuff is data and all of this stuff can record ‘meaning’, at least some forms of meaning. RDF/XML doesn’t replace the ‘meaning’ that HTML provides — it just adds a way to record new meanings that HTML can’t, or doesn’t provide.

I agree with Jon Udell: there’s no need for either/or propositions in the world of Semantic markup. It’s really nothing more than angle brackets, data, and a few rules depending on the specific markup used. Add a smidgeon of code and there you have it: rich, meaningful data. Sure beats the heck out of a web consisting purely of Adobe PDF and Macromedia Flash files; all we’d have then is a bunch of loosely connected black holes.

(g’zipped and tarred file with itty bitty Perl CGI apps used as examples — requires HTML::BuildTree. g’zipped and tarred file of RDF Query-o-Matic files. Requires PHP XML classes from Source Forge.)

Archived at Wayback Machine

Categories
Technology

Working on Techie Stuff

Recovered from the Wayback Machine. What’s particularly rough about this post is a link to a discussion thread I had with Aaron Swartz. Because of legal issues, Aaron committed suicide ten years later, an incredible loss to us all. I just wish we had told him more how important he was to all of us.

No blogging for me until the RDF book is finally finished. If it seems to be taking forever, it feels that way to me at times, too. However, there’s been many a change since I put fingers to keyboard for the first word of the book, and writing has morphed into re-writing and re-writing.

Additionally, in the last few months I’ve promised some tech tools and utilities to people hereabouts. I’m not coming up for air until these are finally done and published for people to use if they have an interest.

In the meantime, Lawrence Lessig has responded to some of the questions about the Creative Commons license, here and here. No answers, but responses. (Thanks to Denise for pointing these out.)

I have continued the CCL discussion over at the metadata discussion list attached to the CC web site. As you can see by my comments on the thread, my fractured writing is a good indicator of my level of frustration related to the discussion.

Back to book. Back to code. Happy New Year.

Categories
Just Shelley

Snowfall

Recovered from the Wayback Machine.

There’s something magical about seeing the first snow flake falling. At that moment, you and nature are joined in a special secret only shared by those who look out their windows at just the right moment. The first flakes are few, and dance lightly about in the breeze, like the tip of a tongue during foreplay. Moving here, no there, no here.

During the snowfall I watch the pattern of the wind, no longer limited by my crude perceptions that tell me the wind is blowing in a straight line from here to there. The snow traces the individual movements of the wind, a waltz of breezes.

During the day, through my window I watch a father take his child for her first walk in the snow. Hesitant footsteps made a little more unsure by suddenly uneven footing that shifts about and causes her to fall. Cruel! But then there’s that moment when tiny face is turned up into the snowfall for the first time; gently, cold touches sweep across cheeks and wisps of cotton at lashes and falls and melts in mouth opened to cry out in pure discovery. All is forgiven, and another child has found winter.

Better than watching the first flake, I love to go to bed with bare streets and wake up in the mornings knowing that snow has started falling. You can hear it by the absence of sound, and you can see it through your window as streetlight reflected. Pulling back the curtain, you look out on a world of white, lines softened between objects until the differences are erased. All you see is soft, crystalline mounds, sparkling in the light.

Snow brings with it a hint of Mother tucking us in against the cold, and a promise of waking.

Categories
Critters

Down the path walked three…

Recovered from the Wayback Machine.

Since techie woman does not live by beating up on techie men, alone, I thought I would get outside, have a nice walk.

Powder Valley was my choice of destination today, but in addition to the Ridge trail, I also walked the little 1/3 mile Tanglewood Trail. Since the latter is handicap accessible, it makes a nice gentle walk for cooling down from the peaks and valleys of the other trails at PV.

Along the Tanglewood, the rangers had stationed implementations of projects you can work on to make your backyard wildlife friendly. Projects like creating brush heaps, planting wild grape, or building backyard ponds. An effective use of trail space to educate people into taking responsibility for the environment.

At the end of the trail are the restrooms, with a sitting area and a wild bird habitat. As I neared them, I noticed movement on the trail in front of me. Three wild turkeys were walking towards me, looking like so many other walkers I’ve met along the paths. Except those walkers didn’t have feathers. And things hanging underneath their chins.

The turkeys moved off the trail as I approached, but didn’t go far. I was close enough to the birds to smell the stuffing and see the marshmallows bubbling on the sweet potatoes.

In my mind, I named the birds, but I’m not going to tell you whose names I used.

Categories
Technology Weblogging

The story of the RSS feeds and the little CC license that could

Recovered from the Wayback Machine.

Again the question of exactly what it means to put an RSS feed online has reared its head. Specifically, Mitch Wagner found out that his RSS feed (which includes full posting content, not excerpts) was re-published online at LiveJournal. He wrote:

 

That site is my intellectual property. You do not have permission to post the entirety of my weblog to your site. Please take down the site http://www.livejournal.com/users/mitchwagner/ immediately.

Well, all sorts of interesting commenting occurred, as you can imagine. In particular, the implicit assumption that RSS feeds come with ‘tacit approval of republication’ was raised.

What was a surprise is that Mitch reversed himself and now offers a Creative Commons license on his material, though the license information isn’t duplicated in Mitch’s RSS feed directly. Mitch also brings up the ‘commercial’ aspect of re-publishing the material at LiveJournal, and what’s to stop someone from grabbing the content and putting it behind password protected sites that charge money for access.

Easy: don’t publish your entire posts in your RSS feed. Keep the RSS feeds to excerpts only. Remove the content:encoded field and just leave the description. And adjust your blogging tool to publish excerpts only. If your weblogging tool doesn’t allow this adjustment, ask the tool builder to provide this capability. The RSS feeds are there to help promote your ideas, not promote their theft. But you have to control the technology, not let the technology control you.
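If the tool won’t do it for you, trimming a feed after the fact takes very little code. Here is a minimal sketch in Python, assuming an RSS 2.0-style feed in which the full post sits in a content:encoded element from the RSS content module and the excerpt sits in description. The sample feed and the function name excerpts_only are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the RSS content module, which defines content:encoded.
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"
ET.register_namespace("content", CONTENT_NS)

# Hypothetical feed with one item carrying both an excerpt and the full post.
SAMPLE_FEED = """<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
  <title>Example weblog</title>
  <item>
    <title>A post</title>
    <description>A short excerpt.</description>
    <content:encoded><![CDATA[<p>The entire post.</p>]]></content:encoded>
  </item>
</channel>
</rss>"""


def excerpts_only(feed_xml):
    """Return the feed with every content:encoded element removed from its items."""
    root = ET.fromstring(feed_xml)
    # Copy the items into a list first so the tree isn't changed mid-iteration.
    for item in list(root.iter("item")):
        for full in item.findall(f"{{{CONTENT_NS}}}encoded"):
            item.remove(full)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(excerpts_only(SAMPLE_FEED))
```

An RDF-based RSS 1.0 feed puts its items in a namespace of their own, so the item lookup would need adjusting there.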

I have a feeling that 2003 is the year when technology and the law will finally find ways to learn to live together, or forever exist in a state of permanent hostility.

(Thanks to Ben for the story.)