Categories
Web Writing

Dusting off the poet

It’s been a long time since I’ve indulged in any poetry at the site. Been a long time since I’ve haunted poets.org to look for just the right verse to suit a picture or a mood.

This week, I oiled my inner poet and set it on its creaky way only to find out that poets.org has undergone a rather significant reorganization. Faced with ‘new’ and wondering if there was anything in there for the inner geek as well as the inner poet, I explored about.

One new feature, or at least, new to me, is many of the poems now is have a topic association. For instance, if a poem is related to aging, other poems related to this topic are listed in the sidebar. This goes beyond groupings of poem by poet, period, and era. It definitely goes beyond keyword searches. It’s given me much thought, and new ideas, in my own continuing search for the case-insensitive semantic web.

The site also has a listening booth, though perhaps it already had this and I didn’t notice. Anyway, the listening book contains readings by poets and readings about poets, including my favorite Dylan Thomas.

Having satisfied the geek, at least for the moment, I returned to the poet, though poet is inaccurate and even a conceit, because I can barely walk and talk at the same time, much less rhyme. If, though, code is poetry, then I wield a mean curly bracket with the best of them. As for loops, you should see me loop–sexiest thing since fishnet stockings.

Returning to my poet, I accessed the improved search engine and searched on the keyword “words”; finding not one but two really great poems from contemporary poets among those returned. Since I’ve been remiss in letting my inner poet out for a walk, I’ll publish both.

Sorry, no photos to accompany the works. The weather continues in the 90s and heavily humid, and I have had no desire to sweat and puddle my way through new venues (though I must break out of my cave tomorrow morning before I bite the cat from cabin fever).

A Quick One Before I Go by David Lehman

There comes a time in every man’s life

when he thinks: I have never had a single

original thought in my life

including this one & therefore I shall

eliminate all ideas from my poems

which shall consist of cats, rice, rain

baseball cards, fire escapes, hanging plants

red brick houses where I shall give up booze

and organized religion even if it means

despair is a logical possibility that can’t

be disproved I shall concentrate on the five

senses and what they half perceive and half

create, the green street signs with white

letters on them the body next to mine

asleep while I think these thoughts

that I want to eliminate like nostalgia

0 was there ever a man who felt as I do

like a pronoun out of step with all the other

floating signifiers no things but in words

an orange T-shirt a lime green awning

How can you not love a poem that has a line like o was there ever a man who felt as I do like a pronoun out of step with all the other floating signifiers? This poem should be required reading for everyone who has found the truth. Then it should be required for everyone who thinks they have lost it.

All She Wrote by Harryette Mullen

Forgive me, I’m no good at this. I can’t write back. I never read your letter.

I can’t say I got your note. I haven’t had the strength to open the envelope.

The mail stacks up by the door. Your hand’s illegible. Your postcards were

defaced. Wash your wet hair? Any document you meant to send has yet to

reach me. The untied parcel service never delivered. I regret to say I’m

unable to reply to your unexpressed desires. I didn’t get the book you sent.

By the way, my computer was stolen. Now I’m unable to process words. I

suffer from aphasia. I’ve just returned from Kenya and Korea. Didn’t you

get a card from me yet? What can I tell you? I forgot what I was going to

say. I still can’t find a pen that works and then I broke my pencil. You know

how scarce paper is these days. I admit I haven’t been recycling. I never

have time to read the Times. I’m out of shopping bags to put the old news

in. I didn’t get to the market. I meant to clip the coupons. I haven’t read

the mail yet. I can’t get out the door to work, so I called in sick. I went to

bed with writer’s cramp. If I couldn’t get back to writing, I thought I’d catch

up on my reading. Then Oprah came on with a fabulous author plugging

her best selling book.

Another brilliant line and somewhat, oddly sad: I regret to say I’m unable to reply to your unexpressed desires. But now I have a highly original way of apologizing for unanswered email. What is your excuse?

Categories
Web

Have I mentioned

how much I love brilliantly executed satire?

Ladies and gentlemen, I give you Web 1.0! Please do not let the excitement go to your heads so much that you piss your pants. Or if you, do it in someone else’s garage.

Categories
Technology Web

The whole thing

Recovered from the Wayback Machine.

The Architecture of the World Wide Web, First Edition was just issued as a W3C recommendation. I love that title – it reminds me of Monty Python’s “The Meaning of Life”, volume one.

Interesting bit about URIs in the document. To address the ‘resource as something on the web’ as compared to ‘resource as something that can be discussed on the web’ issue, the document describes a resource thusly:

By design a URI identifies one resource. We do not limit the scope of what might be a resource. The term “resource” is used in a general sense for whatever might be identified by a URI. It is conventional on the hypertext Web to describe Web pages, images, product catalogs, etc. as “resources”. The distinguishing characteristic of these resources is that all of their essential characteristics can be conveyed in a message. We identify this set as “information resources”.

This document is an example of an information resource. It consists of words and punctuation symbols and graphics and other artifacts that can be encoded, with varying degrees of fidelity, into a sequence of bits. There is nothing about the essential information content of this document that cannot in principle be transfered in a representation.

However, our use of the term resource is intentionally more broad. Other things, such as cars and dogs (and, if you’ve printed this document on physical sheets of paper, the artifact that you are holding in your hand), are resources too. They are not information resources, however, because their essence is not information. Although it is possible to describe a great many things about a car or a dog in a sequence of bits, the sum of those things will invariably be an approximation of the essential character of the resource.

The document then gets into URI collision:

By design, a URI identifies one resource. Using the same URI to directly identify different resources produces a URI collision. Collision often imposes a cost in communication due to the effort required to resolve ambiguities.

Suppose, for example, that one organization makes use of a URI to refer to the movie The Sting, and another organization uses the same URI to refer to a discussion forum about The Sting. To a third party, aware of both organizations, this collision creates confusion about what the URI identifies, undermining the value of the URI. If one wanted to talk about the creation date of the resource identified by the URI, for instance, it would not be clear whether this meant “when the movie was created” or “when the discussion forum about the movie was created.”

Social and technical solutions have been devised to help avoid URI collision. However, the success or failure of these different approaches depends on the extent to which there is consensus in the Internet community on abiding by the defining specifications.

Categories
Web Writing

A True Title

I am enjoying the comments and suggestions about the book title in the last post, and have directed my editor to have a look. In the meantime, for a bit of fun, I’ve come up with several titles that I’d really like to use for the book:

Internet for people who have been screwed online and are now out for revenge.

Internet for those who invested in the dot-com bubble a few years back, and now want to know why they’re holding worthless pieces of paper.

Internet for those with money…what did you say your name and email address was again?

Internet for people who have a more intimate relationship with an email spammer then their own significant other because they at least get the spammer’s email through all the filters.

Internet for people who are scared by their kids knowledge of the Internet.

Internet for people who are scared by their kids knowledge about sex they gained on the Internet.

Internet for those who want to talk about work online.

Internet for those who are looking for a new job online.

Internet for those seeking a warm, caring relationship online, but will settle for a quick roll in the hay. Or picture of same.

Internet for the paranoid and…wait! Wait! What was that?

Internet for the remaining Howard Dean supporters…all two of you.

Internet for Mom, Dad, and don’t tell them about my weblog.

Internet for the censored, spied on, and imprisoned, because the truth will not always set you free.

Internet for the pundits, because you will inherit the Web.

Internet for the meek, because you will inherit the bill.

Internet for people who will not stop clicking on email attachments and whose machines are now a festering bed of evil, with monitors levitating above the desk, and spinning in circles.

Categories
Web

Putting Hotlinks on Ice

Recovered from the Wayback Machine.

Hotlinks — what a perfect word for the practice of directly linking to a photograph or other high bandwidth item on someone else’s server. Hot with its implication of hot goods and thieves passing in the cybernight. The proper term is “direct linking”, and while more technically accurate, the latter term lacks panache. Hotlinking is a particularly warm subject for me because of my extensive use of photography with my writing.

I’m not really sure what led to me start posting photographs with my essays and other writing. Probably the same impulse that leads me to mix poetry with technology, a combination leading Don Parks to write They are rather verbose and poetic… of my Permalinks for Poets essays. Well, get comfortable with your favorite drink, because we’re about to embark on another poetic, verbose, adventure into the mysteries of technology. Most fortunate for you, this one’s a murder mystery because we’re going to put hotlinks on ice.

This is a photograph of me

It was taken some time ago.
At first it seems to be
a smeared
print: blurred lines and grey flecks
blended with the paper;

then, as you scan
it, you see in the left-hand corner
a thing that is like a branch: part of a tree
(balsam or spruce) emerging
and, to the right, halfway up
what ought to be a gentle
slope, a small frame house.

In the background there is a lake,
and beyond that, some low hills.

(The photograph was taken
the day after I drowned.

I am in the lake, in the center
of the picture, just under the surface.

It is difficult to say where
precisely, or to say
how large or small I am:
the effect of water
on light is a distortion

but if you look long enough,
eventually
you will be able to see me.)

Margaret Atwood

 

Hotlinking is the practice of adding a photograph or other multimedia directly in a web page, but linked to the resource on someone else’s server. The bandwidth bandit gets the benefit of the photograph, but the owner of the photograph or has has to pay for the bandwidth. If enough photographs or movies or songs are hotlinked, the bandwidth use adds up.

Recently I noticed that several photographs from FOAF, Flocking, and the Semantics of Starlings were being accessed from various other weblogs, including Adam Curry’s weblog. The reason this was happening is that some folks copied part of the essay, including the links to the photographs. The photograph accesses started appearing from one weblog, then another, then another.

The problem was then compounded when each of these sites published RSS that included all their content rather than excerpts — including these same direct links to the photographs. In fact, it was through RSS that photographs appeared in Adam Curry’s online aggregator — along with several very interesting pornography photos.

I’ve had photographs hotlinked in the past and haven’t taken any steps to prevent it because the bandwidth use wasn’t excessive. In addition, some people who are weblogging within a hosted environment don’t have a physical location for photographs, and I’ve hesitated about ‘cutting them off’. Besides, I was flattered when people posted my photographs, being a pushover when it comes to my pics.

However, with this last incident, I knew that not only was my bandwidth being consumed from external links, those who share space and other resources on the weblogging co-op I’m a part of are also losing bandwidth through our shared line. Time to close the door on the links.

To restrict access to images, I’ll need to add some conditions to my existing .htaccess file. If you’ve not worked with .htaccess before, it’s a text file located in your directory that provides special instructions to the web server for files in your directories. In this particular case, the restrictions I’ll add will be dependent on a special module, mod_rewrite, being compiled into your server’s installation of Apache. You’ll need to check with your ISP to see if you have it installed.

(If you have IIS, you’ll use ISAPI filters, instead. See the IIS documentation for specifics.)

Restrictions for image access are made to the top-level .htaccess file shared by all my sites. By putting the restrictions into the top-level file, they’ll be applied to all sub-directories unless specifically overridden.

Three mod_rewrite instructions are used within the .htaccess file:

RewriteEngine On — turns on the rewrite engine
RewriteCond — specifies a condition determining if a rewrite rule is implemented
RewriteRule — the rewrite rule

When the web server accesses the .htaccess file and sees these directives, three things happen: the rewrite engine is turned on, the rewrite conditions are used against the incoming request to see if a match is found, and the rewrite rule is applied.

The rewrite conditions and rules make use of regular expressions to determine if an incoming request matches a specific pattern. I don’t want to get into regular expressions in this essay, but know that regular expressions are basically pattern matching, using special characters to form part of the pattern. The examples later make use of the following regular expression characters, each listed with its specific behavior:

! used to specify non-matching patterns
^ start of line anchor
$ end of line anchor
. match any single character
? zero or one of preceding text
* 0 or N of the proceding text, where N is greater than zero
\char Escape character — treat char as text, not special character
(chars) grouping of text

There are other characters, but these are the only ones I’m using — the mod_rewrite Apache documentation describes the entire set.

Within .htaccess I add a line to turn on the rewrite engine, and add my first condition — match a HTTP request from any domain that is not part of the burningbird.net domain:

RewriteEngine On
RewriteCond %{HTTP_REFERER} !^http://(.*\.)?burningbird.net/.*$ [NC]

The condition checks the HTTP referrer (HTTP_REFERER) to see if it matches the pattern, in this case anything that is not from the burningbird.net. This includes domains other than paths.burningbird.net, rdf.burningbird.net, www.burningbird.net, and burningbird.net directly. The qualifier at the end of the line, [NC], tells the rewrite engine to disregard case.

I’m looking for domains other than my own because I want to apply the rules to the external domains — let my own pass through unchecked. Since I have more than one domain, though, I need to add a line for each domain and modify the file accordingly:

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^http://(.*\.)?burningbird.net/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(.*\.)?forpoets.org/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yasd.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?dynamicearth.com/.*$ [NC]

Once all of conditions are added to .htaccess, when the web server accesses a file within my directories, the conditions are combined, adding up to a pattern match for any domain other than a variation on burningbird.net, forpoets.org, yasd.com, and dynamicearth.com.

One last pattern domain needs to be allowed through, unchecked — I need to allow access to the images when the referrer has been stripped, such as local access or access through a proxy. To do this, I add a line with no domain or pattern — a blank referrer. The file then becomes:

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(.*\.)?burningbird.net/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(.*\.)?forpoets.org/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yasd.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?dynamicearth.com/.*$ [NC]

Once I have the rewrite conditions set, time for the rule. This is where all of this can get interesting, depending on how clever you are, or how devious.

In my .htaccess file, when a referrer from a domain other than one of my own accesses one of my photos, I forbid the request. The rule I use is:

RewriteRule \.(gif|jpg|png)$ – [F]

What this rule says is that any request to a JPG, GIF, or PNG file, coming from a domain that doesn’t match the conditions set earlier, is rewritten to the ‘-‘ character. In addition, the [F] qualifier at the end of the line tells the browser that they are forbidden to fetch this particular file.

Depending on the browser accessing the web page that contains the hotlinked photo, rather than the image, the page will either show a missing image symbol, or the name of the image file will be printed out.

Now, my approach just prohibits others from hotlinking to my images. Other people will redirect the image request to another image — perhaps one saying something along the lines of “Excuse me, but you’ve borrowed my bandwidth, and I want it back.” In actuality, people can be particularly clever, and downright mean, with the image redirection.

If this is the approach you want, then you would use a line similar to:

RewriteRule \.(gif|jpg|png)$ http://burningbird.net/baddoodoo.jpg [R,L]

In this case, the image request is redirected to another image, baddoodoo.jpg, and a redirect status is returned (the ‘R’). The ‘L’ qualifier states that this is the last rewrite rule to apply, to prevent an infinite lookup from occurring (accessing that redirected image, triggering the rule, that accesses that image, that triggers…you get the idea). Don’t forget to terminate the rule with the ‘L’ qualifier or you’ll quickly see how your web server deals with runaway processes.

(Does anyone smell smoke?)

It’s up to you if you want to forbid the image access, or redirect to another file — note, though, that you shouldn’t assume that people who are hotlinking are doing so maliciously. Most do so because they don’t know there’s any wrong with it. Including most webloggers.

My complete code is:

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(.*\.)?burningbird.net/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(.*\.)?forpoets.org/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yasd.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?dynamicearth.com/.*$ [NC]

RewriteRule \.(gif|jpg|png)$ – [F]

 

update

Some browsers strip the trailing slash from a request, and can cause access problems, as noted in comments in Burningbird. I’ve modified the .htaccess file to the following to allow for this:

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(.*\.)?burningbird.net(/.*)?$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(.*\.)?forpoets.org(/.*)?$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yasd.com(/.*)?$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?dynamicearth.com(/.*)?$ [NC]
RewriteRule \.(gif|jpg|png)$ – [F]

I tested to ensure this worked using the curl utility, which allows me to access a file and pass in a referrer:

curl -ehttp://burningbird.net http://weblog.burningbird.net/mm/blank.gif

The .htaccess file should now work with browsers that strip the trailing slash

With the rule now in place, if someone tried to link directly to the following photograph, all they’ll get is a broken link.

boats5.jpg

You can see the effect of me putting hotlinks on ice by accessing this site with the domain mirrorself.com. This is one of my domains, but I haven’t added it to the .htaccess file — yet. If you access the main burningbird weblog page through mirrorself.com, using http://www.mirrorself.com/weblog/, you’ll be able to see the broken link results. Look quickly, though, I’ll be adding mirrorself.com in the next week.

When I mentioned writing this essay, Steve Himmer made a comment that the rules he added to .htaccess didn’t stop Googlebots and other bots from accessing images. To restrict access of images from webbots such as googlebot, you’ll want to use another file, robots.txt.

My photos are usually placed in a sub-directory called /photos, directly under my main site. To prevent well behaving webbots such as Google from accessing any photo, add the following to your robots.txt file, located in the top-level directory:

User-agent: *
Disallow: /photos/

This will work with Googlebot, which is a well behaved bot; it will also work with other well behaved bots. However, if you’re getting misbehaving ones, then the next step is banning access from specific IPs — but that’s an essay for another day, because it’s past my bedtime, and …to sleep, perchance to dream.

bwboats.jpg