Categories
RDF

RDF Poetry Finder: The beginnings of a beautiful friendship

Recovered from the Wayback Machine.

I enjoy posting photographs to the weblog, and usually accompany them with a story or a short and appropriate note. A couple of months ago I started something new — posting photos accompanied by poems that complemented the ‘story’ I wanted to tell with the photo. Explaining my new approach, I wrote:

I started pairing my photographs with poems I found on the Internet as a way of playing with the mood of the photograph, and to discover new poems and new poets. It is fast becoming a favorite hobby, and is very effective at relieving stress, anger, and sadness. (Which is why I found myself spending a lot of time with it the last few weeks.)

I’ll look at a photograph and write down my first impressions of it: what it means to me, why I like it or not, and what I was trying to say with it when I took it. From this, I’ll gather select keywords and use these to search for a poem at a site, such as Plagiarist or the Academy of American Poets. I’ll wander about through the results until finding the poem that best connects.

What I found over time, though, is that it is uncommonly difficult to find a poem online using the ‘traditional’ search techniques — so much of poetry is based on imagery and metaphor, and metaphors are the ultimate destroyer of search engines. True Google busters.

For instance, take a look at the following photograph:

This photo shows a bridge, a river, and some vegetation, such as weeds, grasses, and trees. Now, I can search for poetry related to bridges, rivers, and trees and find some interesting results, but none of them matches what I see.

The photo shows a bridge, a river, and bushes, but what I see is change in my life, with the bridge being the path I must take, while the river is the path I want to take. In the photo, the bridge is more substantial than the river, but I see it reflected in the water in wavy lines, making it less real than the seemingly solid, glassy surface of the water. This dichotomy of images and reality represents my internal struggle over the direction the change in my life will take.

Yes, all of that from one simple photo.

Now, try putting all of that into Google and see what you get. You get something like this. There’s a dream analysis in the results that’s rather interesting, but not the poem I’m looking for.

The bridge, as metaphor, can represent meeting a challenge, facing responsibility, leaving someone important, even death. It can also mean facing change in one’s life, and if I look through enough poems that have the word ‘bridge’ in them, I’m sure to find one that represents the complex concept I’m looking for. After a time. But what if I want to find a poem that focuses on a particular concept, but uses something else other than a bridge as metaphor? Other metaphors representing a change in life are the eagle, tree, a wave, and I’ve even seen cheese used as a metaphor for a resistance to change. (Here if you must.)

Do I then search for all metaphors for the concepts I’m seeking, and then search for these metaphors among the poems? I wanted an opportunity to get exposed to new poetry, but even the most ardent poetry lover would weary of this over a time.
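To make the two-step search concrete, here is a minimal sketch of it in Python. Everything in it is a hypothetical placeholder: the `metaphors_for` table, the poem titles, and the one-line poem texts are invented for illustration and do not come from any real poetry site.

```python
# Hypothetical lookup table: concept -> metaphors that may stand for it.
metaphors_for = {
    "change in life": ["bridge", "eagle", "tree", "wave"],
}

# Hypothetical placeholder texts standing in for a poetry archive.
poems = {
    "Poem A": "the bridge hung over dark water",
    "Poem B": "an eagle leaves the nest",
    "Poem C": "a bowl of plums",
}

def find_poems(concept):
    """Step 1: expand the concept into metaphors.
    Step 2: keyword-search the poems for any of those metaphors."""
    words = metaphors_for.get(concept, [])
    return sorted(
        title
        for title, text in poems.items()
        if any(word in text.lower().split() for word in words)
    )

print(find_poems("change in life"))
```

Even in this toy form the weakness shows: someone has to build the metaphor table by hand, and a poem that uses a metaphor nobody listed is never found.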

No matter how sophisticated the search engine, even our personal pet Google, the success of the queries breaks down whenever you move beyond searches for specific facts. When you look for something more complex, such as a concept, the most you can do is input all of the words that best represent the concept, and then start the long, arduous refining process.

During my search for poetry I tried this, and then varied the results by focusing on a specific artist and then looking for a poem among their other works. The effort became frustrating at times because I couldn’t find what I was looking for and the poems would run together into this mess o’ words. At that point, I would sell the search for artistic truth short and resort to ‘bridge’, ‘river’, and ‘trees’, giving in to the inevitable limitations of the traditional hierarchically structured, keyword-based discovery that is the existing web.

With poetry, the traditional web fails. Even with Google. Even the Google of the future.

Of course, not all poems incorporate imagery, or use metaphors; however, there are underlying concepts incorporated into the poem, assumptions and history and interpretations that blow holes in the search engine bots as sure as a shotgun blows holes in the wily fox caught in the hen house. For instance, take the following poem:

 

Mary Tyler Moore
moved out today.
A big orange truck
came and took her away.

 

So what does it mean? Exactly what it says: a neighbor who looks remarkably like Mary Tyler Moore moved out today, and a moving company that uses large orange painted trucks moved her. Now, if someone wants a poem about Mary Tyler Moore, or a big orange moving truck, why here I am.

We have focused so much of our attention on RSS and FOAF and other RDF-based vocabularies such as these. But given access to Google, I can find all the information that’s contained in RSS files. Given Google, I can most likely find all the information that’s contained in FOAF files, too.

What I wrote in the weblog is available here, at the weblog. If I were to stop generating an RSS file for this weblog tomorrow, the information would still exist. You would just have to visit here, rather than use your aggregator.

If someone was looking for weblogs that post on specific topics, again the information is here. And there are web bots looking for certain topics already, such as references to Ashcroft, or Iraq.

Same with FOAF — anything I’m willing to expose in a FOAF file is already exposed. Want to know if Mark Pilgrim knows me, and I know him? Easy, search on our names.

The information contained in RSS and FOAF files isn’t hidden behind imagery, isn’t obfuscated behind metaphor. The information these files record is the bits and pieces at which Google is so good, but we’re looking for that information that can only appear when the bits and pieces are assembled into a whole.

The web is positively dripping with facts, and there’s no better tool to find these facts than Google and the other search engines. We don’t necessarily need the semantic web to uncover the facts, though it can help. No, we really need the semantic web to uncover the complex concepts, the information that can only be found when many pieces are pulled together in a meaningful way. To discover the conversations. The subtle rumor and innuendo. The play on words and the analogies.

The poetry.

The title of this essay in the series is The Beginnings of a Beautiful Friendship, and, hopefully, you can now see how the semantic web can be a friend to poetry. But you might be wondering how poetry can be a friend to the semantic web?

The odd thing about poetry and the web is that as much as poetry isn’t a fit for the traditional web, it’s an ideal fit for the semantic web. Consider the dictionary definition of semantics: not the one given for artificial intelligence and programming, but the one given for linguistics. Semantics is the study of the meaning of language. Who pursues meaning in language more diligently than the poet?

What’s been missing from the effort on RDF and the semantic web is the poets. On the committees and in the interest groups we have the mathematician, the logician, the computational linguist, the semantician, the artificial intelligence specialist, and the computer engineer. But we don’t have the poet, and that’s a pity.

After all, who else through history has been more focused on meaning than the poet? Except perhaps the priest or the philosopher, and the former is more worried about souls while the latter spreads their interests too thinly.

 

Next: The Technician Sleeps while the Poet Speaks

Categories
Weather

Deadly, Beautiful

Recovered from the Wayback Machine.

The light and sound show hit with a fury yesterday, but not without exacting payment on its way.

Several people were killed by tornadoes in Kansas and Missouri and it looks like the town of Pierce City was pretty much leveled.

The storms had lost a lot of their energy by the time they hit us, but the sight was one I had never seen before, not having lived on the coasts. I’ve been through blizzards, floods, ice storms, wind storms that have leveled forests, and Nor’easters in Boston, but nothing like the continuous explosion of lightning and thunder last night. The sky wasn’t just lit up, it was blazing white at times.

I feel guilt at appreciating the incredible beauty of the same storm that just killed people. But it was beautiful. It took my breath away.

It’s just the start of the season. I suppose I’ll get used to the storms, but I don’t think I’ll ever stop being amazed at them. Or respectful.

Categories
Just Shelley

You are how you write

Recovered from the Wayback Machine.

I am in the midst of semantics, poetry, and RDF but I did want to take a moment to add my own comment on a new linguistic nosh currently being nibbled in the neighborhood. The nosh in question is a new book by William Hannas titled “The Writing on the Wall: How Asian Orthography Curbs Creativity”, referenced in a NY Times article.

According to Language Hat, the first to reference it, the author of the book, …claims that Asian science has suffered because the main Asian languages are written in “character-based rather than alphabetic” systems. According to the Times:

Mr. Hannas’s logic goes like this: because East Asian writing systems lack the abstract features of alphabets, they hamper the kind of analytical and abstract thought necessary for scientific creativity.

Stavros, currently living in South Korea and studying linguistics, reacted in a manner both swift and sure:

puk kyu

Roughly translated: Mr. William Hannas, with all due respect to your abilities and experience, but I would like to suggest that you stuff your head up your bum. Idiomatically: Fuck you.

Jonathon has also weighed in on this topic, specifically character association with sound, with:

In other words, as far as Japanese is concerned, the assertion that the language is based on characters corresponding to a syllable of sound is utter nonsense. Unless you’re referring to five year olds—but then there aren’t too many five year olds of any nationality winning Nobel prizes.

But he also added:

[image missing]

Roughly translated: With all due respect Mr. Hannas, but I beg leave to dispute your assertions and suggest that you take this banana and insert it into your rectum. Idiomatically: Fuck you.

I don’t have the expertise these webloggers have to contribute much to these excellent and appreciated discussions on linguistics, but even I, as someone with little exposure to this field, have a difficult time understanding why a people’s use of characters rather than an alphabet for writing would interfere with their scientific achievements. All I know is how much I appreciate the beauty of the characters, but I imagine that makes me provincial in the eyes of a learned man such as Mr. Hannas.

So I’ll add my own contribution to the response:

pHUcK j00

Roughly and idiomatically translated: What they said. (Thanks to Aquarionics for linguistic help.)

Of course, once I wrote this, I thought of Jonathon’s previous writing on Linguistic Imperialism and the impact that political correctness is having on what we say.

Well, back to the poetry and the RDF and the next essay, which I’ll release later tonight, but first I must take my afternoon walk. In the meantime, while trying to look something up indirectly related to this topic, I found a website that might be of interest: Omniglot.

Archived with comments at the Wayback Machine

Categories
Travel

Thoughts of a traveler

Recovered from the Wayback Machine.

Another storm watch on, and this one left tornadoes in its wake in Kansas. Another excellent light and sound show tonight. I thought I would ramble a bit online while I wait for it.

I’ve been called into jury duty the first week in June. Yup, Burningbird on a jury. Boggles the mind, doesn’t it?

I was saddened for the people of New Hampshire, losing their Old Man of the Mountain. I hope they don’t try to piece it back together, though. It just wouldn’t be the same. Would you want to visit Mount Rushmore if Teddy’s nose dropped off and someone glued it back on?

Speaking of which, does everyone set their cruise control at the speed limit + five?

Have you ever noticed when you drive long distances that you have these weird conversations with yourself? For instance, every time I pass into Kansas, I always sing the song from Wizard of Oz, “Oh we’re off to see the wizard!”. And when I leave I say those immortal words of…well, you know what I say.

Why are there six rest rooms twenty miles apart, and then not another one for 300 miles?

And then there’s the border between Nevada and Utah. Reno has sometimes been called the “Sodom and Gomorrah” of the states. And Salt Lake City in Utah is known to be very conservative and quite religious. And there’s the salt flat between the two. Anyone else but me get a chuckle from this?

Why is it you never see anyone working in the areas blocked off for road development?

Does that constant movement of the car make you…well, never mind.

Have you noticed when you’re driving at night that the only other vehicles on the road are semi-trucks? And in these circumstances, do you have a hard time getting that Dennis Weaver movie, Duel, out of your mind?

Why are the sunrises when you’re on the road so beautiful?

 

Categories
RDF

RDF Poetry Finder: Pieces of the Puzzle

Recovered from the Wayback Machine.

First in a multi-part series focusing on RDF (Resource Description Framework) and poetry and demonstrating two-way integration between art and technology. No prior experience with either RDF or poetry is required.

Recently, Simon St. Laurent wrote a weblog essay titled The (data) medium is the message, in which he discusses the influence of the data container on the data. He uses the analogy of the newspaper and television as mediums for delivering information, which makes them technically the same type of container — both deliver information. However the format and quantity of information differs enormously between the two:

To some degree, you can get the same information from different media sources, but no one expects television to be a reading of newspaper stories or the newspaper to be a transcript of the nightly news on TV. Both are containers for information, but the shape of the container inevitably affects the way the information is both produced and consumed.

Developers tend to disregard this lesson from the real world, approaching the problem of data and container from a purely programming perspective based on an assumption of passive data. The assumption becomes that it doesn’t matter what the container is, one can always manipulate the data to fit; we’ll just use technology to transform the data from a relational database to an XML document to RDF to an object store and so on. However, this passive data/programmatic approach to managing data almost always requires more effort than using the appropriate data container would, and the transforms between the containers require compromises that may not always work cleanly.

In his essay, Simon wrote that the best approach to managing data is to first understand that it isn’t passive, and to work with its native structure, respect its natural state. Most importantly, working with data means using the appropriate container for the data.

As examples of matching data to container, data that requires a great deal of flexibility and that has recursive structures is a good fit for XML; while unordered data requiring a great deal of processing is a better fit for relational databases and so on.

Coming from a strong data background, I agree with Simon on the active nature of data, and thought his essay was both thoughtful and compelling. However, what caught my interest most about it was his interpretation of the nature of RDF data. Simon described it this way: “RDF feels like ‘puzzle’ data to me, interlocking pieces which form larger pictures when assembled.” This is, in my opinion, one of the best descriptions of RDF I’ve yet seen, and I’ve seen a few.

Interlocking pieces, which form larger pictures when assembled. In addition to describing RDF data, this phrase could also be used to describe the data model underlying semantics; after all, semantics is the process of discovering meaning behind combinations of symbols — finding the big picture from the sum of the parts.

This parallelism of data model between RDF and semantics is to be expected because the purpose behind RDF is to provide a model on which to build the semantic web. Unfortunately, though, somewhere along the way, we became fixated on RDF’s serialization (transformation) to XML and lost sight of RDF’s power to describe complex structures, the big picture mentioned earlier.

While working on the book, Practical RDF, I had difficulty discovering uses of RDF that I felt demonstrated this capability. I was familiar with the two most popular uses of RDF/XML — RSS (RDF Site Summary) and FOAF (Friend of a Friend). I also created my own vocabularies, for Threadneedle (a way of threading conversations online), as well as PostCon (an online post-content management system). However, while all of these vocabularies are useful and workable, to me none of them captured, fully, the essence of RDF — a model of data that can only be described as complex concept rather than simple fact.

For instance, taking a closer look at RSS and FOAF:

At its simplest, RDF is a way of recording statements consisting of a subject, a predicate, and an object, known as the RDF triple. I know a person. I (subject) know (predicate) a person (object). The triples can also be ‘chained’ when the object of one statement forms the subject of another, as in: I know a person who has a cat. With this example, the object of the first statement, the person, becomes the subject of the second, the owner of the cat.
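A triple and a chain of triples can be sketched in a few lines of Python. This is only an illustration of the idea, not an RDF library: the `Triple` type, the `chain` helper, and the identifiers `me`, `person1`, and `cat1` are all made up for the example.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str

# Hypothetical identifiers: "I know a person who has a cat."
triples = [
    Triple("me", "knows", "person1"),
    Triple("person1", "hasCat", "cat1"),
]

def chain(triples, start):
    """Follow statements whose subject is the object of the previous one."""
    path = []
    current = start
    seen = set()
    while current not in seen:  # guard against circular chains
        seen.add(current)
        nxt = next((t for t in triples if t.subject == current), None)
        if nxt is None:
            break
        path.append(nxt)
        current = nxt.obj
    return path

for t in chain(triples, "me"):
    print(t.subject, t.predicate, t.obj)
```

The chain stops at the cat because no statement has `cat1` as its subject.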

Within the FOAF vocabulary, I know a person, and this person has a name; this person has an email address; this person has their own FOAF file, which, in turn, lists the people they know, and so on. No matter how you record these statements (in an RDF directed graph, in an RDF/XML file, or using another notation, such as N-Triples), it doesn’t change the nature of the statements, which is assured by the underlying RDF model.

The same underlying principles work with RSS. A brief synopsis of the postings/essays I write to this weblog is output to a file, which is then accessed by the tools my readers use to determine that I (and others) have updated, and what I have written. Within this file, the source of the information is described, including the source’s primary URL, name, and so on. Following are other statements, such as the individual items, each of which has a unique URL, and a unique title, and so on.

The data in the RSS file is described using RDF/XML, but, as with FOAF, I could easily record the statements as another allowable RDF format, N-Triples, and again, the validity of the statements isn’t changed. The model ensures this.
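The claim that the notation doesn’t change the statements can be illustrated with a rough round-trip sketch. It assumes a deliberately tiny subset of N-Triples (URI terms only, no literals or blank nodes) and uses JSON as a stand-in for “some other notation”; JSON is not an RDF serialization, and the `example.org` URIs are hypothetical. The FOAF namespace is the real one.

```python
import json
import re

FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical people; only the foaf:knows property is real vocabulary.
statements = {
    ("http://example.org/people/a", FOAF + "knows", "http://example.org/people/b"),
    ("http://example.org/people/b", FOAF + "knows", "http://example.org/people/c"),
}

def to_ntriples(stmts):
    """Write each statement as '<s> <p> <o> .' (URI-only subset)."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(stmts))

def from_ntriples(text):
    """Read the URI-only subset back into a set of triples."""
    pattern = r"<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\."
    return set(re.findall(pattern, text))

def to_json(stmts):
    return json.dumps(sorted(stmts))

def from_json(text):
    return {tuple(item) for item in json.loads(text)}

# Two different notations, one and the same set of statements.
print(from_ntriples(to_ntriples(statements)) == statements)
print(from_json(to_json(statements)) == statements)
```

Both comparisons hold because the set of triples, not the text that carries it, is what the model defines.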

FOAF and RSS share other similarities beyond just those imposed by the underlying RDF model. Both record knowledge about a top-level object, either a person or a channel; both then record information about items related to that top-level object, in a strongly hierarchical relationship.

A FOAF file lists information about the subject, the person whom the FOAF file describes. It also links the person to other people. They also may know people, and this association can continue in a hierarchy of ‘A knows B’ until a FOAF file is reached wherein a person lists only people that don’t have FOAF files themselves and no further traversals are possible.
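The traversal described above amounts to a walk over a graph of ‘knows’ links. Here is a small sketch, assuming each FOAF file has already been reduced to a list of the people it names; the `foaf_files` dictionary and the names in it are hypothetical.

```python
from collections import deque

# Hypothetical data: person -> people listed in that person's FOAF file.
# Anyone who is not a key has no FOAF file of their own.
foaf_files = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["alice"],  # links can loop back
}

def reachable(start, foaf_files):
    """Follow 'knows' links breadth-first until no further traversal is possible."""
    seen = {start}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for known in foaf_files.get(person, []):
            if known not in seen:
                seen.add(known)
                queue.append(known)
    return seen - {start}

print(sorted(reachable("alice", foaf_files)))
```

The walk ends at `dave`, who is known by someone but publishes no FOAF file, so there is nothing further to follow.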

An RSS file lists information about a channel, such as this weblog. It also lists information about items contained within the weblog, such as the individual postings. Newer changes proposed to the RSS specification are taking this breakdown of information further, by listing out comments under individual items, and eventually we’ll see trackback entries recorded in RSS. With the addition of trackback into RSS, a weblog posting can be related to other weblog postings, and so on. Literally, ‘A knows B’, until, again, there is no further RSS object to traverse.

From an RDF semantics point of view, to some degree FOAF does provide the ability to capture and record information that would be difficult to discover just by searching for specific pieces of the data. Without FOAF, it would be difficult to determine if someone such as Leigh Dodds knows someone else, such as Edd Dumbill, other than by searching on both their names and hoping to find something in a web page somewhere that validates this assumption. Within the relationship there is a hint of interlocking pieces and a bigger picture.

RSS, on the other hand, provides no clues to some bigger picture within the data it encompasses, and makes no use of the richness of RDF semantics. I have referred to it as a ‘brain dead’ data model, and before the RSS fans in the audience lynch me, allow me to explain.

RSS is a convenience. Sources of information such as this weblog can generate RSS files or feeds. You, as the source reader, can subscribe to a feed using an RSS aggregator (a tool that grabs the feed information and organizes it into one spot). With the aggregator, you’ll be notified of updates, shown abstracts or even the entire items.

The RSS business model states that my RSS file contains a reference to this writing, including the title, the author, an excerpt, the date and time it was written, and the category. However, this same information is nothing more than a repetition of the information contained in the individual writing page. There is nothing in the RSS file that enhances the discovery of information about that thing being described.

What’s more, the RSS files only contain a specified number of items — next update, the oldest item drops off the page. Not only is the information simple and repetitious, it’s temporary at that. So the components of the RSS specification, rather than combining to describe a more complex concept, provide nothing more than a snapshot in time, abbreviated for easier consumption.
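The ‘oldest item drops off’ behavior can be mimicked with a bounded queue. This is just a sketch of the idea: the limit of three items and the post titles are arbitrary assumptions, not anything the RSS specification dictates.

```python
from collections import deque

# Assume a feed that keeps only the three most recent items.
feed = deque(maxlen=3)

# Hypothetical post titles, published oldest first.
for title in ["post 1", "post 2", "post 3", "post 4", "post 5"]:
    feed.appendleft(title)  # newest goes on top; the oldest falls off the end

print(list(feed))
```

After five posts only the newest three remain in the feed; the first two still exist on the weblog itself, but the feed no longer knows about them.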

Of course, the RSS business model can be changed and the data persisted as well as enhanced, but then it would no longer be RSS. It would be something else.

This isn’t to say the RSS specification isn’t important or useful; it is. RSS aggregators allow people to see, at a glance, that their favorite sources have written something new, on what subject and when. It is a fantastic convenience…but it is nothing more than a convenience. There is no complex semantics associated with RSS; hence my use of ‘brain dead’ to describe the underlying data structure. In fact, the structure of RSS, which consists of flexible data in recursive structures, is a perfect fit for XML, but not necessarily RDF/XML.

Even FOAF for all of its ability to enhance discovery of information about a person and the people they know doesn’t really provide much sophistication — deliberately on the part of the original creators who wanted to keep the vocabulary simple. You can find out who a person knows, but not in what context, and without the context, the information associated with ‘knows’ is limited.

From my FOAF file, you can read that I know Danny Ayers and Mark Pilgrim. Well, that knows could be anything from I’ve met them online and have exchanged emails and we read each others weblogs (true), to we were once torrid lovers (untrue). That’s quite a range implied with that ‘knows’. The maximum information that can be gained from the richer aspects of FOAF is that Person A knows Person B. And that’s it.

Because of this deliberate simplification, I use the term ‘brain dead’ with FOAF, but with a caveat: FOAF was created to be simple deliberately, and could easily be enhanced to a much higher level of sophistication if the FOAF originators or others choose.

My own efforts in creating an RDF vocabulary don’t fare much better. Threadneedle could be used to discover and persist the threads of an Internet-based conversation, resulting in a hierarchical structure somewhat comparable to FOAF but capturing the interaction of a group momentarily self-formed about a specific topic at a specific time. There is some semantic richness to this vocabulary, but again, no new information is inferred, just existing communication threads discovered.

PostCon does provide information that would be difficult to discover by other means, such as the movement history of a web resource, or why it was pulled from the server. However, this information isn’t necessarily sophisticated, as much as it just doesn’t exist. Current web technologies don’t have a way to persist this type of information, and PostCon supplies that persistence. Nice, but not quite a semantic cigar.

Again, as with FOAF and RSS, these implementations are useful and very handy, but they aren’t the brass ring of RDF semantic richness I hoped to discover. They are not examples of data demonstrating the complex nature of semantic data, the … interlocking pieces which form larger pictures when assembled.

Of course, RDF provides usefulness beyond just discovering complex concepts. First of all, it is based on a formalized model, which does ensure that its data is consistent regardless of business use. No small thing, this. In addition, its incorporation of namespaces allows data from many sources to be combined, and vocabularies to be enhanced and still ensure backwards compatibility. Additionally, I have found the APIs and the simple RDF triple based queries to be quite an easy way of manipulating data in XML documents, even more so than pure XML based query mechanisms. Based on this, I still use RDF for any XML vocabularies I create. But it’s not the same as using RDF’s rich semantics capability, especially when used to build an ontology that incorporates the inferential rules necessary to discover “concepts” rather than just “facts”.
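For readers who haven’t seen a triple-based query, here is a rough sketch of the general idea in plain Python. It doesn’t reproduce any particular RDF API; the `match` function and the sample data are hypothetical, with `None` acting as a wildcard for any part of the triple.

```python
# Hypothetical statements as (subject, predicate, object) tuples.
triples = [
    ("author", "knows", "danny"),
    ("author", "knows", "mark"),
    ("danny", "knows", "author"),
    ("author", "wrote", "essay1"),
]

def match(triples, s=None, p=None, o=None):
    """Return every triple that fits the pattern; None matches anything."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# Who does the author know?
print([t[2] for t in match(triples, s="author", p="knows")])

# Who knows the author?
print([t[0] for t in match(triples, p="knows", o="author")])
```

A query like this retrieves facts that are already stated. Discovering concepts would need rules layered on top, which this sketch makes no attempt to show.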

I was beginning to think I would never find what I felt to be a perfect candidate for RDF. However, this all changed, by accident, when I started doing something new in my weblog. Something poetic.

 

Next: The Beginnings of a Beautiful Friendship