Categories
RDF Weblogging

The gluttony of information

There’s an elegant bit of synchronicity in play when one is inundated with emails and assorted and sundry articles on RSS on the same day one’s mouth is operated on. Where before I might have blown past the discussions, I was driven in my fixated, drug-induced state to focus on everything that everyone was saying. Every little thing.

For instance, The W3C TAG team – that’s the team that’s defining the architecture of the web, not a new wrestling group – has been talking about defining a new URI scheme just for RSS, as brought up today by Tim Bray. With a new scheme, instead of accessing a feed with:

http://weblog.burningbird.net/index.rdf

You would access the feed as:

feed://www.tbray.org/ongoing/ongoing.rss

The reason is that using existing schemes opens the feed in whatever tool you use to access the page, such as the browser. What you want, however, is not to open the feed, but to access the URI directly for the purpose of subscribing to it. Using a MIME type doesn’t apply because MIME types operate on the data once loaded, and the URI of the subscription isn’t necessarily part of the data.

An example used was the mailto: scheme, which opens an application and passes in the value attached to the mailto: – the email address – rather than loading that data in the browser.
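
A rough sketch of the dispatch the TAG discussion implies – the client routes on the scheme rather than on the content, just as it does for mailto:. The handler names here are invented for illustration, not part of any proposal:

```python
from urllib.parse import urlparse

def dispatch(uri: str) -> str:
    """Route a URI to an action based on its scheme.

    A 'feed:' URI would go to the subscription handler rather than
    the browser, the same way a 'mailto:' URI opens a mail client.
    """
    scheme = urlparse(uri).scheme
    if scheme == "feed":
        # Subscribe: hand the URI to the aggregator, rewriting the
        # scheme back to http so the feed itself can still be fetched.
        return "subscribe:" + uri.replace("feed://", "http://", 1)
    if scheme == "mailto":
        # Compose: pass the address to the mail application.
        return "compose:" + uri[len("mailto:"):]
    # Default: open in the browser.
    return "browse:" + uri

print(dispatch("feed://www.tbray.org/ongoing/ongoing.rss"))
# subscribe:http://www.tbray.org/ongoing/ongoing.rss
```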

The response to this topic in TAG was more discussion than has occurred on many another topic lately, a behavior that tends to happen with RSS. This amazes me when you consider that RSS – or, from a generic point of view, the technique of using XML to annotate excerpts of syndicated material that’s updated on a fairly regular basis – is actually a pretty simple concept. It’s handy, true, and I’m just as taken with Bloglines as several of my compatriots – but it is still syndication.

But that’s not all. Doc also discussed RSS today, but his take was more an advocacy of syndication. More specifically, he focuses on the aspect of syndication he considers most relevant – notification:

Meanwhile, it seems to me that notification is the key function provided by online syndication. And that’s the revolutionary thing. Publishing alone carries assumptions framed by the permanence of all the media that predated the Web in the world. Hence the sense of done-ness to the result. The finished work goes up, or out, and that’s it.

But the Web isn’t just writable. It’s re-writable. I’m writing this live on the Web, and I’ll probably re-write parts of it two or three more times.

Hence the need for notification.

I agree with Doc that syndication in conjunction with aggregators is pretty handy. Since I started using Bloglines, I visit my favorite weblogs much less frequently than I used to, waiting for the bold text and the count to tell me how many new posts the person has published. And I can see from my referrers that others are doing the same, because most of my visits now come from aggregators such as Bloglines, or from Technorati or Blogdex or some other somewhat generic resource.

Of course, I still know people are visiting…I just don’t know who, or from where. And though sometimes I may wonder, wistfully, if my old friends still visit as much as they used to, I contrast this with being able to read that many more weblogs now. Sure, this also impacts the conversations we used to have across our comments and across our blogs, because we don’t always visit to read the comments as much, or to add our own, but we’re much more connected into the stream of information than ever before.

Previously I had perhaps 20 or 30 weblogs on my blogroll that I would visit a couple of times a day. Now I have over 100 that I only visit when they update, and I’m a veritable neophyte among information mavens compared to others. I remember from a recent comment discussion that Steve Gillmor mentioned he was subscribed to 3762 different feeds, if I read his comment correctly.

Speaking of Steve Gillmor, his name popped up in several places today in connection with RSS. He had a conversation with Doc, who blogged some of it:

He’s advocating thinking larger than the Web as it stands. Blogs are a subset of RSS. So is syndication a subset of RSS. He says. In a time constrained universe, it’s a killer app.

It’s the platform for synergy between the stakeholders and the journalists. He says. To limit it, by implication, which you do here by focusing on syndication as being the nub of what this is about, is self limiting in terms of understanding the new economic model that’s emerging here. Among other things.

He wants to respect “the disruptive nature of RSS.”

This technology has already supplanted email as the core of your desktop. A conditional yes. On the other hand, my email is far more searchable, and manageable, and private and personal, which makes it highly significant, though hardly disruptive and therefore kinda irrelevant to this discussion. Of course, Steve points out, this won’t be the case “when RSS scoops up 80 or 90% of that functionality too.”

Gillmor then went on to write more about RSS in an article that basically says Apple and Sun are challenging Microsoft Outlook through the use of RSS. At least I believe it says this because, for the most part, I found it to be almost incomprehensible in its blind reverence for RSS.

But a disruptive technology is emerging that could change everything. For my money, it’s RSS (known alternately as Really Simple Syndication or Resource Description Framework Site Summary). I’m not talking about the embedded Outlook plug-in of today’s PC; I’m talking about a technology that could be as disruptive to personal computing as the digital video recorder has been to television.

I read Gillmor’s article three times and still couldn’t figure out exactly what he was enthusing about other than RSS is going to change the world. But it was one paragraph that finally gave me the clue:

It’s the combination of these system services that produces the RSS information router. IM presence can be used to signal users that important RSS items are available for immediate downloading, eliminating the latency of 30-minute RSS feed polling while shifting strategic information transfer out of e-mail and into collaborative groups.

What Gillmor is talking about is being wired to your machine. With RSS, not only can we skim more and more information resources, at faster paces, but we need not even be active in this effort – we can have the information resources notify us when we need to read them.

Rather than fight information overload, give in to it. Embrace it. Accept complete saturation as nothing less than that which is to be achieved. Apply the same practices to our consumption of information as we’ve applied to food and consumer goods and foreign policy, because we can never have too much.

After all this reading about RSS today, I finally get it. I finally understand the magic:

RSS is both the McDonald’s and the Wal-Mart of data.

Categories
Semantics

Semantic web extreme goodness

Recovered from the Wayback Machine.

I had to add a whole new category just to reference these two resources.

First, an excellent summary of the recent semantic web discussions, annotated even, can be found at Themes and Metaphors in the Semantic Web. Thanks to Chris for pointing it out or I would have missed it.

What I like about it is the way it personalizes the discussion, which can’t help but make it more ‘meaningful’, pun not intended. Comments are here.

Secondly, a new weblogger has joined the semantic web effort at a blog called Big Fractal Tangle. Timothy Falconer is off to a good start with:


Before the Semantic Web can come close to delivering on its promise, we need to find ways to convince non-technical types into wanting to think abstractly. Academics, developers, and businessfolk are unusually organized compared to “the rest of us,” which is why this may be hard to see at first. Hell, forget annotation. We’ve got to find compelling and obvious reasons for them to want to use metadata.


Saying that the web will never be more intelligent than it is today is the height of arrogance. This is no different than saying that because we can’t create it today, or today’s dreamers can’t dream it today, or it can’t be touched and has no physical manifestation today, it can never happen. If we believed this in other sciences, we not only wouldn’t be on the moon, we wouldn’t be on this continent.

Having said this, however, the only way we’re going to convince grandma or Uncle Joe to use meta-data is for us to listen to what they want and need and then give it to them, slipping meta-data in through the seams. May not win a Nobel, but may give us the semantic web.

Categories
RDF

PostCon – generating RDF/XML files

Recovered from the Wayback Machine.

Now that the Burningbird Network sites are getting back into the groove, time to bring this weblog back online.

I’ve incorporated bits and pieces of the PostCon throughout this system. However, none of the implementations are a blinding flash or a deafening roar. And I’m not picking a fight with anyone about it, so I imagine rolling out this technology won’t generate a lot of conversation.

The first implementation of PostCon for my system was to create the RDF files containing information about individual weblog posts for Burningbird. These files were created automatically using a Movable Type template, and you can see an example of one of the files here. It should be valid RDF/XML, and features the PostCon vocabulary. Information recorded includes:

  • Weblog posting author and creation date
  • URL of current location, as well as articles that link this posting, and other articles that are linked by this posting – its location within a hierarchy of links
  • The resources the posting depends on – by this I mean the file’s format, the style sheets it requires, and any logos
  • The status of the posting (valid, active, relevant)
  • Title and abstract and history of the page – including a historical entry representing the fact that the resource was renamed with a reorganization of the web sites

The template used to generate these files can be seen here. The point of this exercise is to demonstrate that it does not require a huge investment of time or energy to record intelligent metadata about a resource in a machine-accessible, standard format. One argument against the semantic web generally, and RDF specifically, is that both add to the complexity of a process and are beyond the average person. Well, with PostCon and Movable Type, all the average person need do is spend about half an hour understanding the vocabulary, and about another half an hour modifying the template file to output what they want their files to show.
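
To make the list above concrete, here is a minimal sketch of the kind of RDF/XML such a template might emit. The namespace URI, resource URLs, and property names are illustrative stand-ins, not the actual PostCon vocabulary:

```xml
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:postcon="http://example.org/postcon/elements/1.0/">
  <rdf:Description rdf:about="http://example.org/weblog/some_post.htm">
    <!-- author, creation date, title, and abstract -->
    <dc:creator>weblog author</dc:creator>
    <dc:date>2003-11-18</dc:date>
    <dc:title>Some post title</dc:title>
    <!-- a resource the posting depends on -->
    <postcon:requires rdf:resource="http://example.org/styles/main.css"/>
    <!-- current status of the posting -->
    <postcon:status>active</postcon:status>
    <!-- one historical event: the post was renamed in a site reorganization -->
    <postcon:history rdf:parseType="Resource">
      <postcon:movedFrom rdf:resource="http://example.org/old/location.htm"/>
      <dc:date>2003-10-01</dc:date>
    </postcon:history>
  </rdf:Description>
</rdf:RDF>
```

A Movable Type template would simply interleave template tags for the post’s title, author, and date into a skeleton like this.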

The second component is a PHP-based page that processes this information into human-readable form, which is then attached to a ‘meta’ link within each page. I’ll demonstrate this in the next posting.

(Speaking of RDF programming, I noticed a new review of the book out at Amazon. The problem with book reviews of this nature is that someone can put up a review that says the book is outdated because it covers the previous version of Jena, when Jena coverage is only one aspect of the book, and I spent time updating the examples for Jena 2.0 here on this site. This review also doesn’t take into account rewriting the book four times over two years to keep it in sync with the RDF specs. Grr.)

Categories
Connecting Semantics Weblogging

The value of human on a humanless web

Recovered from the Wayback Machine.

David Weinberger responded to my discussion yesterday about semantic web compared to Semantic Web:

So, if the semantic web means only that we’re learning to understand ourselves better on the Internet, or even that we often adopt similar terms and rhetoric, then, yes, the Web is constantly semantically webbing itself. And if the semantic web means that we are formally knitting together, in an ad hoc way, the various standards we’re adopting, then, yes, the web is semantically webbing itself.

But, I don’t think this is what most people mean by the Semantic Web. I think they have two other implications in mind.

The Semantic Web that David writes about is the one that begins with the vision outlined in the now famous Tim Berners-Lee article whereby in the future, the Web will speak to our machines, and the machines to the Web, and we will be tenderly enfolded into a world where intelligent bots will find solutions to our day-to-day problems at the flick of a button.

According to those who design it, David writes, two things must happen for this utopian Semantic Web to come about: the web must form one single information space that bridges the stubborn individuality of culture and language; and standards must not only continue to propagate across this space, but when they combine, the synergy must result in something new, and utterly different. Marvels of automation… as he refers to it.

But, David continues, as did Clay before him, we can’t form a complete information space, nor will our standards ever combine, because history and experience have shown us that none of this will scale; or if it does, it will only be at the expense of the richness of the human experience.

So if the Semantic Web cannot be realized, will we then have to settle for my semantic web, with its simple increments of functionality based on a growing use of standards? Well, yes and no.

Yes, in that my view of the semantic web is one that has already started, and is in use today when I go out to Bloglines and see who has done what recently. This semantic web is already here, and can only continue to expand. But no, in that David misreads what I say, and focuses on the standards, when I was focusing on the rewards.

Years ago when computers dominated entire rooms, we knew that someday we would be able to communicate with a computer as if it were another person. We would be able to express emotion and innuendo alike, and it would not only understand, it could reciprocate in kind. Of course, as we matured and our computers became more sophisticated, and as we explored the capability of the human visual system and the complexity of human linguistics, we began to realize that our hopes for a true artificial intelligence would never come about. It’s not because of limitations in technology that this dream won’t be realized – it’s because we began to realize that the richness of the human experience arose not from our strengths, but from our failures.

We humans have an amazing ability to adapt to new situations, to accept new learning, and to grow to meet new situations. But this adaptability comes with a price: our memories are chaotic storehouses based on faulty chemical reactions easily influenced by external factors such as drugs, or emotions. I can tell you about a day sitting in my second grade classroom near the window, and I know it was Spring because the window was open and I could hear a mower running outside and smell the newly cut grass – but I can’t tell you what we were discussing, or even what I learned that day. The memories are there, or we hope after youthful experimentation that the memories are there, but we can’t pull them up because if we are marvels of adaptation, and creativity, we are the pits when it comes to efficient memory retrieval.

Later today I will visit several conservation areas in parts of the state where I’ve not been before to take photos of birds, and I will be able to walk down strange paths and adapt to their changing nature because I can sense the change through my eyes – but if I walked at night, without a flashlight, I would be helpless, because I am dependent on my eyesight and can’t see in the dark.

Over time, as we experimented with artificial intelligence, most computer scientists began to realize that what we needed from computers was not human intelligence and capability – after all, it’s easy enough to create humans; one just has to have sex – but computers that partner with us, each providing what the other can’t. We need computers that store bits of information we can retrieve easily, because we can’t depend on our own frail memory. Computers that can travel paths on distant planets and adjust to the changing environment, true; but ones that won’t look up and marvel at the strange new world around them, be reminded of a song heard once years ago, and then suddenly burst forth into that song because they cannot help but sing it.

The Internet and the Web were both originally designed to facilitate the sharing of information among many different machines at once. At least, when we look at the topology of the Net, that’s what we see – machine connecting to line to router to router to line to machine, in a vast interwoven threaded void of wire and plastic and chips. But the Internet and the Web did not come about because we needed computers to talk to each other; they came about because we humans wanted to talk to each other. To share our data, and our services, and our lives.

I am limited to a physical existence in one place at a time, which at this moment is St. Louis on a Tuesday morning in November. However, thanks to the Internet I am also in Boston, and Georgia, and South Africa, and the UK. If you read this in a month’s time, I have even transcended time. The laws of physics may limit my physical self, but they can’t limit my experiences because we have partnered with computers and technology to thread the gap between the real and the virtual.

I am a simple woman with simple wants. I read Tim Berners-Lee’s vision of a Semantic Web, with its Web talking to my machines, and its machines talking to the Web, and intelligent bots being able to work through issues of time, location, and trust and arrange Mother’s treatments with a minimum of fuss and effort on Lucy and Pete’s part, and I will admit there is something about this story that leaves me cold. Not the sharing of calendar information over the Net – we have that now. Not the accessing of relevant information about various hospitals and plans in the surrounding community, because we have that now, too. It was the fact that in this vision, the global “I”, that semiotic “I”, is missing.

“Mom needs therapy? Oh no! Well, we’ll work together and make sure she’s taken care of!”

In this picture, I search for available plans in the area and then call the hospitals and I talk to the people to see if I can trust them to take care of mother; neither I nor the sister of I is so busy as to begrudge the time taken. Nor am I so incapable that I can’t click a button on a volume control, or turn a knob, and lower the volume without the stereo being wired to the Web. Or my toaster.

(Perhaps after twenty years in this field I am turning into that Luddite that I (no this is me now, not the semiotic I) accuse others of being because they resist the use of RDF.)

When I talk about my poetry finder, David sees this as nothing more than a simple growing use of standards, and it does seem as if my vision, my semantic web, is nothing much beyond this. There are no vast reaches of interconnected communication between machines, no extraordinary leaps of intuition in the software that runs between them, little to inspire awe at first glance. Nothing to statistically analyze, no power distributions to chart.

Find me poems where a bird is a metaphor for freedom. It doesn’t sound very sexy, does it?

My semantic web does not seek to enhance the communication between machines – it seeks to enhance the communication between people. My hope is that someday in St. Louis I will be searching for the perfect poem that uses a bird as metaphor, and you, the semiotic you, in your home in the UK or South Africa or Georgia, sometime in the past will have put online this poem you wanted to share, which uses a bird as metaphor, and through time and place and differences in culture and gender and language and interests, we will connect.

This blows my mind. This leaves me weak at the knees and brings tears to my eyes because of the absolute beauty and serendipity of the act. But from a technology standpoint, it doesn’t ring anyone’s chimes, does it?

When did we start valuing technology over that which the technology enables?

I was thinking last night, as I tentatively went out among the tech weblogs again: when was the last time a discussion in a comment thread within these weblogs ended with words, and not code?

We talk about how the Internet sees censorship as damage and routes around it. We sometimes forget, though, that it is people who act as routers in this case, not machines.

We attend conferences because we want to experience the discussion in person. Or at least, this is what we say. And I remember conferences, and sitting in the back so that I could watch people’s reactions to the words, or look into the speaker’s eyes and see their enthusiasm, and let their voice wrap around me with equal parts hope and wisdom. But in this day of ever-growing uses of technology, we aim our phones at each other as if they were lances and this a tournament of pictures; we put up our laptop lids to act as shields to work through, and we don’t look each other in the eye or watch each other’s reactions as we listen to the speaker. No, instead we write down what the speaker is saying, and others in the room read this and they, in turn, write about the marvel of reading what you’re writing while you’re in the same room, and we say, isn’t this wonderful?

Personally, I find it sad. And lonely.

David, and Clay Shirky and others, write that the Semantic Web can never happen because it can’t scale; it can never hope to encompass the richness of the human experience enough to reach the synergy needed to burst forth in a blaze of automated glory. If we continue in that direction, what will happen is that we’ll have to adapt to meet it rather than it adapt to meet us. I agree with David and Clay.

However, when I see my semantic web, my simple semantic web, viewed as nothing more than an increased use of standards implemented with the most mundane of technologies, with results that aren’t all that interesting, I’m not sure that the Semantic Web, in all its automated glory, won’t happen someday.

Categories
Semantics

A semantic conversation

Recovered from the Wayback Machine.

When Clay Shirky’s paper on Semantic Weblogging first came out and I saw the people referencing it, I thought, “Oh boy! Fun conversation!” But that was before I saw that many of the links to Clay’s paper were from what are called ‘b-links’ I believe – links in side columns that basically have little or no annotation.

I guess what a b-link says is that the person found the subject material interesting, but we don’t know if they agree or disagree. An unfortunate side effect of these new weblogging bonbons is that it’s hard to have a conversation when the only statement a person makes is, “I’m here. I saw.”

What led to this is that Sam Ruby continued his discussion about Clay’s paper, saying “Links are unquestionably the greatest source for semantic data within weblogs.” What we see is that even with something we all know and understand, such as the simple link, you can’t pull semantics out when none was put in in the first place.

Still, not all links were b-links. Tim Bray talks about Semantic Web from the big picture, and references big corporations with big XBRL (Extensible Business Reporting Language) files and all that juicy corporate data found at data.ibm.com and data.microsoft.com. To him, the Semantic Web will only come about if there is a mass dispersion of data, and in this case, dispersion of data from Big Companies.

But the original Web didn’t start big, it started small. It began with masses of little web sites, with bright pink or heavily graphical backgrounds and really ugly fonts, and some of us even used the BLINK tag. Remember animated GIFs? Remember how excited you got with an animated GIF? That’s nothing compared to the link, though, our very first link. Do you remember when you lost your link virginity?

We swooned when someone told us about a ‘web form’, and this processing we could do called “CGI”. And then someone posted the first picture of a naked girl, and that was all she wrote.

Tim, man, you got to get down, son. Scrabble in the hard pack with the rest of us plain folk. Yank off that tie, and put on some Bermudas and hang with the hometown gang for a bit. You been with the Big Bad Business Asses too much – you forgot your roots.

What I do agree with in Clay’s paper is that the semantic web is going to come from the bottom up. It is going to come from RSS, and from FOAF, and from all the other efforts currently on the web (I need to start putting a list of these together). It’s going to start when we take an extra one minute when we post to choose a category or add a few keywords to better identify the subject of our posts. It will flourish when more people start taking a little bit of extra time to add a little bit more information because someone has demonstrated that the time will be worth it.

It will come about when people see the benefits of smarter data. Small pieces, intelligently joined.

Which leads to the good Doctor, one of the two Influential Bloggers that Tim references – David expanded on his earlier comment about Clay’s paper by saying:

I don’t think Clay is arguing that all metadata is bad. Rather, he’s saying that it doesn’t scale. Yes, the insurance industry might be able to construct a taxonomy that works for it, but the Semantic Web goes beyond the local. It talks about how local taxonomies can automagically knit themselves together. The problem with the Semantic Web is, from my point of view, that it can’t scale because taxonomies are tools, not descriptions, and thus don’t knit real well.

To back this up David references the problems with SGML – how we couldn’t find or agree on the ideal DTDs to pull this all together. This is an expansion of his agreement with Clay’s response on Worldviews and compatibility. I’ve worked on two industry data-modeling efforts: PDES (manufacturing) and POSC (petroleum and energy). I know what David is talking about – it is hard to get people to agree on data.

This is a name, you say. I say, a name of what. You say, a name of a person. I say, a first name? A last? A proper name? A name that’s an identifier? A maiden name? A dead person? A live one? An important person? By this time you’re frustrated and screaming back: It’s just a damn name! Why are you making it so complicated?

I do hear what David is saying. But the thing with the semantic web, though, is that it’s already started.

This group can go off and do their thing, and we can do ours, and someday we may need to map the data, and that’s cool. In the meantime, though, thanks to the use of a model and namespaces, you can have your name and I can have mine, and we don’t have to stop working to get agreement first in order to exist within the same space. When we get to the point where we do need to work together, then we’ll sit and talk – but it’s not going to be detrimental to what’s happened in the past. If we find that my postcon:source is the same as your bifcom:target, then we’ll just define a little rule that says ‘these are equivalent’. But I’ll still generate postcon:source, and you can still generate bifcom:target.
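
A sketch of how cheap that one rule can be, assuming both vocabularies reduce to (subject, property, object) triples; the property names are, as above, invented for the example:

```python
# Map of equivalent properties across the two vocabularies.
# One entry per agreed equivalence -- this is the single little
# rule that lets both sides keep generating their own terms.
EQUIVALENT = {
    "bifcom:target": "postcon:source",
}

def normalize(triples):
    """Rewrite (subject, property, object) triples so that
    equivalent properties collapse to one canonical name."""
    return [(s, EQUIVALENT.get(p, p), o) for (s, p, o) in triples]

mine = [("post1", "postcon:source", "http://example.org/a.htm")]
yours = [("post2", "bifcom:target", "http://example.org/b.htm")]

# After normalizing, both sets of triples use postcon:source
# and can be queried together.
merged = normalize(mine + yours)
```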

(*bang* *bang* *bang*

Do you hear that sound? That’s me banging my head against a door. And no, the hollow sound is from the door, not my head. There’s a reason we keep wanting to use one model for our work – so that someday when we want to make our data work together, it is just as simple as defining that one silly little rule.)

You know what my definition of the semantic web is? You’ve all heard this before. Even Tim Berners-Lee has heard this, judging from a scathing comment he made on the W3C TAG mailing list once. My idea of the semantic web is if I can look for a poem that uses a metaphor of bird as freedom, and get back poems that have bird as metaphor for freedom. But you know, I don’t have to go everywhere on the web to look for this – if I could just do this at something like poets.org, or among the poetry weblogs I know, I’d be content.

I don’t have to scour the complete World Wide Web today. I don’t have to get every interpretation of every poem that has ever used bird as metaphor today. I can start with a small group of people convinced that this is the way to go. And eventually, other poetry fans, and high school sophomores, will also see the benefit of doing a little bit of extra work when putting a poem online, aided and abetted by helpful tools. From this tiny little acorn, big mother oaks grow.

How do you think RSS started? Or FOAF for that matter?

I’ll let you in on a little secret: my semantic web is not The Semantic Web. They won’t give Nobel Prizes for it, and it won’t be a blinding flash or a deafening roar. It will just make my life a bit easier than it is now. Some folks who like the Semantic Web won’t necessarily like or agree with my simple little small-‘s’ semantic, small-‘w’ web. But I don’t care, and neither does it.

In this semantic web, people like Danny Ayers, with his good-humored patience and persistence in supporting RDF and the semantic web, will have just as much impact as any Tim, Dave, or Clay.

One last thing: I wanted to also comment on Dare Obasanjo’s post on this issue. Dare is saying that we don’t need RDF because we can use transforms between different data models; that way everyone can use their own XML vocabulary. This sounds good in principle, but from previous experience with this type of effort, it is not as trivial as it sounds. By not using an agreed-upon model, not only do you have to sit down and work out an agreement on differences in the data, you also have to work out the differences in the data model, too. In other words – you either pay up front, once; or you keep paying in the end, again and again. Now, what was that about a Perpetual Motion Machine, Dare?
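
One way to see the “pay up front, once” point is to count the work. With a shared model, each of n vocabularies needs one mapping into that model; with pairwise transforms, every pair of vocabularies that wants to interoperate needs its own transform. A rough sketch of the arithmetic:

```python
def mappings_shared_model(n: int) -> int:
    # One mapping per vocabulary into the common model.
    return n

def mappings_pairwise(n: int) -> int:
    # One transform per pair of vocabularies; if transforms are
    # one-directional, double this figure.
    return n * (n - 1) // 2

for n in (3, 10, 50):
    print(n, mappings_shared_model(n), mappings_pairwise(n))
# 3 vocabularies: 3 vs 3; 10: 10 vs 45; 50: 50 vs 1225
```

The shared-model cost grows linearly; the transform-everyone-to-everyone cost grows quadratically, which is the “keep paying in the end, again and again.”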

However, don’t let me stop you from using XML and your own home grown data model and rules and regs. But we won’t let this stop us from using RDF and RDF/XML.

The point I’m trying to make is this: the semantic web is here. It snuck in quietly while the rest of us were debating. It is viral, slowly putting out little tendrils of applicability throughout the web. The only problem we’re really having is that we’re not recognizing it now because no huge rocket burst into the air going “Semantic Web is here! Semantic Web is here!”

I think what we’re missing is the semantic web equivalent of the animated GIF. Something with lots of moving parts so that people know it’s working.

(P.S. Liz has started pulling all of the links on this issue into one permanent record.)