Categories
Technology Weblogging

Comment spam? Or DoS?

Recovered from the Wayback Machine.

The debate over comment spam still rages, with people following the spammers’ tracks to shut them down or at a minimum harass them with bills and whatnot. The spammers then come back with, “It’s all legal, your comment forms are open.”

Well, yes and no. Try thinking of comment spam as a Denial of Service (DoS) and the legality changes, real quick. All it takes is using Movable Type with comment emailing turned on and then getting hit with close to 150 comment spams at once, as happened to me this morning before I shut the web server down to stop it.

When you have this many comment spams at once on Movable Type, with the associated activities such as database lookup, update, and email, then any and all other activity basically slows down to a crawl, or stops completely. Since the person deliberately triggers this many updates at once, it is a deliberate denial of service, and hence a DoS, and against the law.

This is the approach I’m taking to fighting back at comment spam of this nature. If the spammer just did a few comments and I had better comment control, this wouldn’t bother me. But the recent multi-post blitzes, well, they take down the system and I’m getting right tired of this.

I’ve already warned the company hosting the dial-up, and the company providing the nameservers – one more DoS and I’m filing a criminal complaint.

Mt-blacklist would have stopped the multi-post blitz, but I don’t have mt-blacklist installed – it stopped working for me with version 1.5, and still doesn’t work with version 1.6. Since I’m trying to move several webloggers to a new server, I don’t have time to work through what’s out of synch.

However, I do want to take this time to refresh my Movable Type wish list (and yes, Six Apart, you can put this into a commercial variety of the beast – just don’t go crazy on the fees, okay?)

Movable Type Comment and Trackback Wish List

Pretty please, sirs and lovely lady. May I have some more…

– Comment control: pull up and review comments by email, url, and IP address. Allow deletion based on all entries pulled up, or based on checks next to each item. Allow this at the installation level, not the weblog level – and also provide rebuild based on deleted entries

– Trackback control: ditto

– Blitz Prevention: Test to make sure the blitz doesn’t happen; this is really killing my system each time it happens. Restrict based on number of comments posted within an inhuman length of time for the same IP, or something of that nature.

(This is a real killer for me and I may hack the code myself to stop these blitzes, because I have a feeling I’m going to be getting these more frequently.)
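
The kind of restriction I have in mind for blitz prevention can be sketched in a few lines. This is only an illustration in Python, not Movable Type code: the class name, the limit of three comments, and the sixty-second window are all assumptions picked for the example, and a real hack would have to live inside the comment script itself.

```python
from collections import defaultdict, deque


class BlitzGuard:
    """Hypothetical guard: refuse comments when one IP posts too many too fast."""

    def __init__(self, max_comments=3, window_seconds=60.0):
        self.max_comments = max_comments
        self.window_seconds = window_seconds
        # IP address -> timestamps (in seconds) of recently accepted comments
        self._recent = defaultdict(deque)

    def allow(self, ip, now):
        """Return True if a comment from `ip` arriving at time `now` is accepted."""
        stamps = self._recent[ip]
        # Forget accepted comments that have fallen out of the window.
        while stamps and now - stamps[0] >= self.window_seconds:
            stamps.popleft()
        if len(stamps) >= self.max_comments:
            return False  # too many recent comments: skip the database and email work
        stamps.append(now)
        return True


guard = BlitzGuard(max_comments=3, window_seconds=60.0)
# Simulate a blitz: 150 comments from one address, one every tenth of a second.
accepted = sum(guard.allow("203.0.113.7", now=i * 0.1) for i in range(150))
```

The check runs before any database update or notification email, which is where the real cost of a blitz comes from. A person commenting at a human pace never trips it, and nobody is banned outright: once the window passes, the same address can post again.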

I’d rather have these than blacklisting. We in the Wayward Weblogger co-op are already suffering because of uncontrolled blacklisting from SPEWS and I’m not sympathetic to banning in any form, though I can understand why people like this preventative measure.

(Not that I don’t appreciate Jay Allen and his mt-blacklist (which I wish I could get working again) – right now it’s the only thing standing between us and the howling comment spammers at the door.)

As for the new wars: I think it’s good we’re all fighting back, as long as we all remember something: anyone who we push can push back, and most of us share servers with others. When you say you’re going to put yourself on the line – you might want to spare a moment or two to the others you’re dragging along with you in your crusade. Be deliberate if you’re going to pick a fight, knowing all the consequences.

Categories
Connecting

On authenticity and friendship

Recovered from the Wayback Machine.

One last note about the tax board member and weblog writing, if for no other reason than to clarify that it was not the IRS I was referencing – it was the California Franchise (tax) board. I was reluctant to mention the name for some reasons I didn’t want to get into, but I wasn’t comfortable with the continuing misunderstanding that it was the IRS.

In the comments associated with the post Shhh, Bill Kearney, in his usual sensitive and tactful way, writes:

This is nothing new. As McNealy said, you have no privacy, get over it. I’m always reminded of Claude Raines’ role in Casablanca when I hear this sort of thing “shocked, shocked I say to hear…”

The only difference here is the realm of physical expression usually kept people from making fools of themselves to too wide an audience. Now that it’s world-wide the potential’s much greater. Is this the fault of the technology or the people? Are they lesser fools if they’re not on the world-wide stage? Greater if they are?

To me this just raises the question of personal integrity to more sharper focus. If one could succeed behaving in a manner that would cause them discomfort if revealed should they expect to get away with it? If it was just chance and a small audience that coddled their notions shoudl the larger audience hide itself from them?

If someone’s going to be ‘out there on the web’ they need to know what that means. To pretend otherwise is foolish, at best, but most certainly naive.

What’s next, someone blogging their house got robbed because they blogged about going on vacation?

I don’t think any of us are surprised, per se, when someone from the ‘outside world’ mentions they read our weblog. Still, as Stavros writes, to be boggled, even a little, when your public journal is revealed to be just that – entirely public – is neither foolish nor naive.

Was I surprised? There are details of the conversation with the tax board that I won’t repeat, but yes, in the nature of the conversation I had with her, I was surprised. More than that, I was made very uneasy. In some ways, in the course of everyday chit chat, talking about everyday things, I felt that I had come close to incriminating myself – in a situation where no crime had occurred. So, yes, I was surprised.

Francoise and Mary bring up the fact that the pages I’m deleting are in Google cache, and in the Wayback Machine. True, if any government agency or other organization wanted to dig, I imagine that at least for a while, they could find this information. But before we jump into a situation where an Oppressive Regime has overcome our country and we have to flee with 7 favorite books, a little perspective here: I am not the CEO of Enron. I am not that important.

It makes sense for the tax board member to put my name into Google and do a bit of reading. It makes less sense that she would spend hours, days even, trying to find cached data or go through the Wayback Machine.

Now if I had been the CEO of Enron, I could see this happening. Of course, I couldn’t imagine Kenneth Lay having a weblog. Can you imagine the entries:

The wife and I are going to dinner tomorrow night at that new Italian place. I’ll be sure to let you know how it is. We’re also looking into taking a very long vacation soon. Some place out of the country, hopefully warm with no extradition treaty with the US.

I decided to swindle millions from the company shareholders, today. Oh, and the cat isn’t feeling well.

No, I’ll delete the old records, and be more discreet in the future – because I don’t like hearing that information come up in conversation, not because I am guilty of a crime or loss of integrity – thank you very much. But I won’t go any further in my efforts, because I am not guilty of a crime.

I think it was Jeneane, though, that brought up the most interesting aspect of this whole incident. She also writes in her weblog:

Just to clarify something: We’re not talking about a public journal being read by the public in this instance. We’re talking about what you’ve written in public within your weblog, which, HELLO, could be fact or could be fiction, being used by the government in their financial assessment of you and what you may or may not owe them.

Not sure about you or Shelley, but I’m just thrilled to know that the IRS is a valued reader of this blog, just as I’ll be thrilled to have a chat with the HMO folks over the phone one day, indicating that they’ve read every sentence I’ve written about my daughter’s asthma, and would like to deal with me financially based on the pixel trail I left behind.

And what if I told you it’s all a lie? What if I told you I made it up? What if I confessed she’s never wheezed in her life? What if I say, that was all an experiment to guage the interest of my readers on specific topics, or, if I declare that I was doing research? Or, that it was ENTERTAINMENT, not necessarily fact?

The issue of telling the truth or not has been discussed before, but let’s face it: now that we know for a fact that government agencies are reading our weblogs, how does this impact our writing?

Can you imagine what most would make of Oblivio’s weblog?

Every A-list blogger that can get themselves into print is talking about the honesty of the voices in weblogging; how weblogs are personal journals; how weblogs are impressions and facts written by real people. Now imagine what happens when the weblogger pushes and pulls the truth, just a little – just to make things more interesting?

What if I had talked about this great job I found that’s paying me six figures? What the tax board is hearing is that I’ve had not the best of times and would like to make payments for the corporate tax owed. What’s going to happen when what I write in this weblog is not consistent with what I say is happening in ‘real’ life?

Of course, none of this is ever going to happen to you. You’re never going to have what you write brought up by a creditor or government agency. Or friend or family member or boss. You never bitch about work. In fact, you never mention it. You never write on impulse. Your writing is impersonal and completely risk free.

Must be dull to be you.

Speaking of authenticity, a note of thanks to two authentic ladies who have been above and beyond good friends, particularly this week for reasons that I think I’ll just keep to my own bloody self. Jeneane and Sheila, thank you both.

Categories
Semantics

Deconstructing the Syllogistic Shirky

Recovered from the Wayback Machine.

Clay Shirky published a paper titled The Semantic Web, Syllogism, and Worldview, in which he makes some interesting arguments. However, overall, I must agree with Sam Ruby’s assessment: Two parts brilliance, one part strawman. Particularly the strawman part.

First, Clay makes a point that syllogistic logic, upon which hopes for the Semantic Web are based, requires context and therein lie the dragons. He uses as an example the following syllogism:

– The creator of shirky.com lives in Brooklyn
– People who live in Brooklyn speak with a Brooklyn accent

From this, we’re to infer that Clay, who lives in Brooklyn, speaks with a Brooklyn accent. Putting this into proper syllogistic form:

People who live in Brooklyn speak with a Brooklyn accent
The creator of shirky.com lives in Brooklyn
Therefore, the creator of shirky.com speaks with a Brooklyn accent

Leaving off issues of qualifiers (such as all or some), the point Clay makes is that context is required to understand the truth behind the generalization made with people living in Brooklyn and speaking with an accent:

Any requirement that a given statement be cross-checked against a library of context-giving statements, which would have still further context, would doom the system to death by scale.

Clay believes that generalities such as the one given require context beyond the ability of the medium, the Semantic Web, to support. He then goes on to say that we can’t disallow generalizations because the world tends to think in generalities.

I agree with Clay that people tend to think in generalities and that context is an essential component of understanding what is meant by these generalities. But Clay makes a mistake in believing that the proponents of the Semantic Web are interested in promoting a web that would be able to deduce such open-ended generalities as this; or that we are trying to create a version of Artificial Intelligence on the web. I can’t speak for others, but for myself, I have never asserted that the Semantic Web is Artificial Intelligence on the web (which I guess goes to show that machines aren’t the only ones capable of misconstruing stated assertions).

Clay uses examples from a few papers on the Semantic Web as demonstrations of what we’re trying to accomplish, including a book buying experience, an example of trust and proof, and an example of membership based on event. However, in all three cases, Clay has done exactly what he’s told the Semantic Web folks we’re guilty of: disregarded the context of all three examples. As Danny Ayers writes, Shirky is highly selective and misleading in his quotes.

In the first paper, Sandro was demonstrating a book buying experience that sounds overly complex. As Clay wrote:

This example sets the pattern for descriptions of the Semantic Web. First, take some well-known problem. Next, misconstrue it so that the hard part is made to seem trivial and the trivial part hard. Finally, congratulate yourself for solving the trivial part.

The example does seem as if the trivial is made overly complex (and, unfortunately, invokes imagery of the old and tired RDF makes RSS too complex debate), but the truth is that Sandro was basing his example on the premise of how would you buy a book online if you didn’t know the existence of an online bookstore. In other words, Sandro was demonstrating how to uncover information without a starting point. Buying a book online may not have been the best example, but the concept behind it, the context as it were, is fundamental to today’s web; it’s also the heart of tomorrow’s Semantic Web, and the basis behind today’s search engine functionality, with their algorithmic deduction of semantics.

As for Sean Palmer’s example, which makes an assertion about one person loving another and then uses a proof language to demonstrate how to implement a trust system, Clay writes:

Anyone who has ever been 15 years old knows that protestations of love, checksummed or no, are not to be taken at face value. And even if we wanted to take love out of this example, what would we replace it with? The universe of assertions that Joe might make about Mary is large, but the subset of those assertions that are universally interpretable and uncomplicated is tiny.

I agree with Clay that many assertions made online don’t necessarily have a basis in fact, and no amount of checksum technology will make these assertions any more true. However, the point that Sean was making isn’t that we’re making statements about the truth of the assertion — few Semantic Web people will do this. No, the checksum wasn’t to assert the truth of the statement, but to verify the origin of the statement. This latter is something that is very doable and core to the concept of a web of trust — not that your statement is true, because even in courts of law we can’t always deduce this; but that your statement was made by you and was not hearsay.

In other words, the example may not have been the best, but the concept is solid.
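
To make the distinction concrete, here is a minimal sketch of checking where a statement came from without saying anything about whether it is true. It is not Sean's proof language. It uses an HMAC with a shared secret from Python's standard library as a stand-in for the public-key signatures a real web of trust would more likely use, and the key and statements are hypothetical.

```python
import hashlib
import hmac


def sign(statement: str, key: bytes) -> str:
    """Produce a keyed checksum that only a holder of `key` can compute."""
    return hmac.new(key, statement.encode("utf-8"), hashlib.sha256).hexdigest()


def came_from(statement: str, signature: str, key: bytes) -> bool:
    """Check origin only: was this exact statement signed with this key?"""
    return hmac.compare_digest(sign(statement, key), signature)


joe_key = b"hypothetical-secret-held-by-joe"  # assumption: only Joe holds this
statement = "Joe loves Mary"
signature = sign(statement, joe_key)

origin_ok = came_from(statement, signature, joe_key)          # Joe's own statement
tampered_ok = came_from("Joe loves Sue", signature, joe_key)  # altered after signing
forged_ok = came_from(statement, sign(statement, b"someone-else"), joe_key)
```

Here `origin_ok` is True while `tampered_ok` and `forged_ok` are False. Nothing in the check knows or cares whether Joe really does love Mary; it only establishes that the statement is his and was not changed along the way.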

Finally, as to Aaron Swartz’s example of the salesman and membership in a club, Clay writes:

This is perhaps the high water mark of presenting trivial problems as worthy of Semantic intervention: a program that can conclude that 102 is greater than 100 is labeled smart. Artificial Intelligence, here we come.

Again, this seems like a trivial example — math is all we need to determine membership based on count of items sold. However, the point Aaron was making was that in this case it was count, in other cases membership could be inferred because of other actions, and by having a consistent and shared inferential engine behind all of these membership tests, we do not have to develop the technology to handle each individual case — we can use the same model, and hence the same engine, for all forms of inferences of membership.
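
As a rough sketch of that idea, assuming nothing about Aaron's actual implementation: the membership rules below are just data handed to one small, shared routine. The group names, fact names, and thresholds are hypothetical; the point is that adding a new kind of membership means adding a rule, not writing a new engine.

```python
from typing import Callable, Dict, List

Facts = Dict[str, int]
Rule = Callable[[Facts], bool]


def infer_memberships(facts: Facts, rules: Dict[str, Rule]) -> List[str]:
    """Apply every membership rule to the same facts with the same engine."""
    return sorted(group for group, rule in rules.items() if rule(facts))


# Hypothetical rules: one based on a count, one based on a different action.
rules: Dict[str, Rule] = {
    "sales_club": lambda f: f.get("items_sold", 0) > 100,
    "contributors": lambda f: f.get("posts_written", 0) >= 1,
}

salesman = infer_memberships({"items_sold": 102}, rules)
writer = infer_memberships({"items_sold": 50, "posts_written": 2}, rules)
```

The first call yields `["sales_club"]` and the second `["contributors"]`. Yes, the first rule is only a comparison of 102 against 100, but the routine that applies it neither knows nor cares that it is doing math.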

Again, without the context behind the example the meaning is lost, and just the words of the example as republished in Clay’s paper (and I wonder how many of the people reading Clay’s paper also read the other three papers he represents) seem trivial or overly pedantic. With context, this couldn’t be farther from the truth.

Following these arguments, Clay derives some conclusions that I’ll take one at a time. First he makes a point that meta-data can be untrustworthy and hence can’t be relied on. I don’t think any Semantic Web person will disagree with him, though I think that untrustworthy is an unfortunate term, with its connotations of deliberate acts to deceive. But Clay is, again, mixing web of trust and Semantic Web, and the two are not necessarily synonymous (though I do see the Web of Trust being a part of the Semantic Web).

I use poetry as an example of my interest in the Semantic Web. I want to find poems that use a bird to represent freedom. I search on “bird as metaphor for freedom” and I find several poems people have annotated with their interpretation that the bird in the poem represents freedom. There is no inherent ‘truth’ in any of this — only an implicit assumption based on a shared conceptual understanding of ‘poetry’ and ‘subjectivity’. The context is that each person’s opinion of the bird as metaphor for freedom is based on their own personal viewpoint, nothing more. After reviewing the poems, I may agree or not. The fact that the Semantic Web helped me find this subset of poems on the web does not preclude me exercising my own judgement as to the people’s interpretations.
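
A toy version of that search might look like the following. The record layout, the poem titles, and the reader names are placeholders invented for the sketch, not real annotations or any particular RDF vocabulary. What matters is that each match comes back with the people who hold the opinion attached, so the result is plainly a set of interpretations rather than facts.

```python
from typing import Dict, List, NamedTuple


class Annotation(NamedTuple):
    annotator: str  # who holds this interpretation
    poem: str
    symbol: str
    meaning: str


def who_reads_it_that_way(
    annotations: List[Annotation], symbol: str, meaning: str
) -> Dict[str, List[str]]:
    """Map each matching poem to the annotators who interpret it this way."""
    found: Dict[str, List[str]] = {}
    for a in annotations:
        if a.symbol == symbol and a.meaning == meaning:
            found.setdefault(a.poem, []).append(a.annotator)
    return found


# Placeholder data, purely for illustration.
annotations = [
    Annotation("reader_one", "Poem A", "bird", "freedom"),
    Annotation("reader_two", "Poem A", "bird", "freedom"),
    Annotation("reader_two", "Poem B", "bird", "grief"),
    Annotation("reader_three", "Poem C", "bird", "freedom"),
]
matches = who_reads_it_that_way(annotations, "bird", "freedom")
```

With this data the search finds Poem A and Poem C and skips Poem B. Whether I agree with any of those readers is still up to me.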

Clay also makes a statement that There is simply no way to cleanly separate fact from fiction, and this matters in surprising and subtle ways…. As an example, he uses a syllogism about Nike and people:

– US citizens are people
– The First Amendment covers the rights of US citizens
– Nike is protected by the First Amendment

Well, the syllogism is flawed, but disregarding that, the concept of the example is again mixing web of trust with the Semantic Web, and that’s an assumption that isn’t warranted by what most of us are saying about Semantic Web.

Clay also mentions that the Semantic Web has two goals: one is to get people to use meta-data, and the other is to build a global ontology that pulls all this data together. He applauds the first while stating that the second is …audacious but doomed.

Michelangelo was recorded as having said:

My work is simple. I cut away layer after layer of marble until I uncover the figure encased within.

To the Semantic Web people there is no issue about building a global ontology — it already exists on the web today. Bit by bit of it is uncovered every time we implement yet another component of the model using a common, shared semantic model and language. There never was a suggestion that all metadata work cease and desist as we sit down on some mountaintop somewhere and fully derive the model before allowing the world to proceed.

FOAF, RSS, PostCon, Creative Commons — each of these is part of the global ontology. We just have many more bits yet to discover.

Clay’s most fundamental pushback against the Semantic Web work seems to be covered in the section labeled “Artificial Intelligence Reborn”, where he writes:

Descriptions of the Semantic Web exhibit an inversion of trivial and hard issues because the core goal does as well. The Semantic Web takes for granted that many important aspects of the world can be specified in an unambiguous and universally agreed-on fashion, then spends a great deal of time talking about the ideal XML formats for those descriptions. This puts the stress on the wrong part of the problem — if the world were easy to describe, you could do it in Sanskrit.

Likewise, statements in the Semantic Web work as inputs to syllogistic logic not because syllogisms are a good way to deal with slippery, partial, or context-dependent statements — they are not, for the reasons discussed above — but rather because syllogisms are things computers do well. If the world can’t be reduced to unambiguous statements that can be effortlessly recombined, then it will be hard to rescue the Artificial Intelligence project. And that, of course, would be unthinkable.

Again, I am unsure of where Clay derived his thinking that we’re trying to salvage the old Artificial Intelligence work from years ago. Many of us in the computer sciences knew this was a flawed approach almost from the beginning. That’s why we redirected most of our efforts into the more practical and doable expert systems research.

The most the proponents of the Semantic Web are trying to do is show that if this unannotated piece of data on the web can be used in this manner, how much more useful can it be if we attach just a little bit more information about it?

(And use all of this to then implement our plan for world domination, of course; but then we don’t talk about this except on the secret lists.)

The good doctors AKMA and Weinberger agree with Clay that worldview defeats any form of global ontology. What they don’t take into account is that each worldview of data is just another facet of the same data; each provides that much more completeness within that global ontology.

What do we know about the Soviet view of literature? Its focus was on Marxism-Leninism. What do we know about Dewey’s view of literature? That Christianity is first among the religions. The two facts, the two bits of semantic information, do not preclude each other. Both form a more complete picture of the world, as a whole. If they were truly incompatible, people in the world couldn’t have had both viewpoints in the same place, the earth, at the same time. We would have imploded into dust.

I do agree with Clay when he talks about Friendster and much of the assumption of ‘friendship’ based on the relationships described within these networks. We can’t trust that “Friend of” is an agreed on classification between both ends of the implied relationship. However, the Semantic Web makes no assumption of truth in these assertions. Even the Web of Trust doesn’t — it only validates the source of the assertion, not the truth of the assertion.

Computers in the future are not going to come with built-in lie detectors.

I also agree, conditionally, with Clay when he concludes:

Much of the proposed value of the Semantic Web is coming, but it is not coming because of the Semantic Web. The amount of meta-data we generate is increasing dramatically, and it is being exposed for consumption by machines as well as, or instead of, people. But it is being designed a bit at a time, out of self-interest and without regard for global ontology. It is also being adopted piecemeal, and it will bring with it with all the incompatibilities and complexities that implies.

Many of the incompatibilities could be managed if we all followed a single model when defining and recording meta-data, but I do agree that meta-data is coming about based on specific need, rather than global plan.

But Clay’s reasoning is flawed if he believes that this isn’t the vision shared by those of us who work towards the Semantic Web.

Categories
Photography

Pet rock

Recovered from the Wayback Machine.

My early days as a mineral and crystal collector would find me at Earthlight in Kirkland on a regular basis. This shop was full, floor to ceiling, with rare and wondrous crystals from throughout the world. Not just minerals in raw form – the owner also carried rock carvings, jewelry, and other odds and ends.

The owner knew the mystical properties of each of the minerals, and about half the shop was devoted to those crystals favored more for their healing properties than their value as a collectible item. I, however, would spend my time among the closed and locked cabinets for serious mineral collectors that the owner, Jack, would unlock for me once he got to know me.

(Everyone had a crystal for health or spirituality in those days. When I told people I collected crystals, I would hasten to add that mine were psychically dead.)

One Saturday I was gazing through his new additions, trying to decide which to take and which to regretfully leave behind, when I saw this odd looking little rock in the corner of the cabinet. At first, I thought it was a bit of the packing material, because it was fuzzy and white and not like any crystal or rock I’d ever seen before.

I picked it up, and touched it to see if it was brittle, but it felt soft. I was stroking it again when Jack called out, “Don’t stroke that. It’s very delicate and you’ll damage the crystals.”

I looked more closely at the rock and sure enough I could see little thread-thin crystals radiating from it.

“This is real?”

“Yes. It’s called Okenite.”

To hold the rock is to want to stroke it. The rock’s shape vaguely resembles a kitten, which only added to the overall urge to run my finger down its side.

“You’re doing it again.”

I guiltily stopped in mid-stroke and looked down and it’s true, the rock seemed a tad less fluffed out than originally. “I’ll buy it, Jack.”

He waved his hands at me and laughed. “Go ahead and pet the rock all you want, then.”

It’s not worth a lot of money compared to the azurite and dioptase, aquamarine, and rhodochrosite, but it’s a cute little bugger.

Oh, damn! I just stroked it again.

Categories
Government Weblogging

Shhh

Recovered from the Wayback Machine.

During my break I made a decision not to talk about my financial affairs in this weblog again. I’m not sure why I did so before – this is not a topic I would normally bring up in a get together among friends; I have always been private about my finances in the past. I think the reason why I broke my own personal rules was that the anonymity of weblogging lured me into increasing exposure online. Even though I write under my name, and have even posted a personal photo, there is still something about not seeing your faces when I talk that gives the illusion of a priest’s confessional.

No more talk about job hunts, contracts, or money online. If I get a job, I won’t mention it, nor will I talk about an employer in any way. That part of my life is no longer pertinent to this space, and the only thing I’ll mention is public events, such as publishing a photo, story, or book.

(I mentioned selling my rock collection but that’s as much because I want the collection to go to a good home rather than be packed away in a box, hauled about by a permanent vagabond such as myself. And besides, my story on the rock collection will be public; the auction of the collection will also be in public, and I will have no hesitation about directing you all to it to bid, bid till it hurts.)

I made this decision because of personal reasons and internal discussions and various other factors. However, even if I hadn’t made this decision before now, I would have had to make it today because of a phone conversation this morning. This call now leads to my last story on the financial world of Burningbird, aka Shelley Powers. In fact, the only story on this subject that will remain in my weblog, as I spend the afternoon deleting entries on the subject in my archives.

I only write this today as a bit of heads up for those of you who, like me, sometimes get seduced into putting information online that you may regret someday.

I’ve had a corporation in the past, primarily created as a way of getting contracts with companies that are uncomfortable working with self-employed (1099) contractors. When the bottom fell out of our industry and I closed the corporation down, I found I couldn’t pay the tax bill for it. The short story is that I wrote the tax board a letter offering payments.

I talked with a very nice lady today from the tax board who was very helpful, but very upfront about how the tax laws work. Tax boards are not like creditors – they don’t have much leeway when it comes to taxes paid or not, or penalties, or actions taken if taxes aren’t paid.

I had told the board my situation, about not having the best of year(s), and she was very sympathetic. There were two ways the board could have gone in dealing with me, and she recommended the most compassionate way, and I am very grateful. Not only for that but also for how she managed the call today: putting a very real and very human face on what is a cold, unfeeling institution; treating me with dignity and respect.

However, lest you think that tax board employees are just going to take a person’s word for their current financial situation, think again. The person I talked with today was compassionate, and extremely helpful, but she was also very thorough.

She mentioned that before calling me, she had gone out to my weblog, this weblog, and read the entries scattered about in it where I talked about my financial situation. She mentioned reading that thanks to unemployment, I can at least keep my car, and about the other things I put online that I didn’t think I would hear back from the mouth of a representative of a governmental tax organization.

I’m not faulting her or shouting out cries of ‘government invasion of privacy’ just because she was thorough. What privacy? I put all this online for anyone to read. Am I going to blame the government, or my creditors, or anyone else for that matter because they read what I write?

Gladly, she didn’t catch the posts about my Bermuda vacation and diamond bra purchase from Victoria’s Secret.

JUST JOKING!

The point to take away from this writing is that in addition to worrying about your family and your friends, your clients and your employer when you write online – you also have to worry about your local, state, and federal tax boards and other creditors.

You know, I liked weblogging a whole lot more when it flew under the radar.