Categories
Just Shelley

Passing Notes

Two of my favorite people are having a hard time getting each other’s emails, so I offered to act as a go-between until the email problem was resolved. As I wrote them this morning, this reminded me of an incident that occurred when I was in the 9th grade.

I sat between one of the school’s most popular boys and a very pretty girl in math class. While my wonderful Russian teacher was at the board writing out his esoteric messages, I was enlisted as a conduit in an entirely different kind of communication: I was recruited to pass notes between the two.

When the popular boy first gave me a note and I looked at him in pure astonishment, he hastened to add that the note wasn’t for me. Of course not. I was a tall, skinny girl with long frizzy red/brown hair and granny glasses à la John Lennon, wearing Nehru jackets. This is not the picture of a girl who gets passed notes in math class.

I wasn’t offended that the note wasn’t for me. If it had been, it would have shaken my world and caused me too much confusion about my understanding of the roles each of us played. When the situation was clarified, far from being offended, I was relieved that I wouldn’t have to rise to such unexpected behavior and gratified to help out in this endeavor. Though the two were popular and pretty, they were also very nice people; being pretty is not, contrary to popular belief, at odds with being nice.

However, in the next year, when variously assorted curves all of a sudden started appearing, and I discovered the shag haircut, make-up, purple-red short, short velvet hot pants, and see-through lemon yellow gauzy blouses, I was ready to fit into a new role. But by that time, high school was a dead bore, and I had moved on.

All of which I remembered this morning when I offered to help my friends get emails to and from each other.

Categories
Just Shelley People

Curves and Blogs

Clay Shirky has updated his article to incorporate new data. He pointed to a new list over at Technorati created in response to his article:

Top 100 Interesting New Comers

I’m on the list. But then, I’m also on Technorati’s Top 100 as well as Technorati’s 100 Interesting Recent blogs.

What can I say, I’m a Technorati Blog Magnet.

Clay and I are continuing our chat in the comments attached to the previous post. However, in looking at the new power law distribution graph, and looking at the data that generated the graph, I found that it would be quite simple to get rid of the curve: DaveCory, shut up. You’re skewing the curve.

Sam Ruby wrote about the best summary of this whole thing:

Here’s the way I look at it. I’m listed in the Technorati top 100. By looking at the statistics there, 98.93% of the weblogs it tracks do NOT link to mine. 99.90% of the weblogs tracked have less inbound links than me.

I see no mountains here, only molehills.

Squeak.

Archived with comments at Wayback Machine

Categories
Just Shelley

Load of hooie isn’t in the guidebook

I’m sorry, I used a term like “a load of hooie” in my last posting rather than using some more learned discussion. I didn’t treat Clay’s article with the serious reverence due to it, and didn’t use enough words from the “How to impress people when you write” guidebook, here next to my computer.

I’ll make sure to use words of more than one syllable next time I write, so that I don’t insult my readers.

So I guess I’ll link to other more learned colleagues:

Alex Halavais Power Less

Mark Pilgrim Power Laws and Priorities

Jonathon Delacour’s Stuck(?) in the middle again

I had missed the posting Stavros did on this.

And Phil, who started all of this.

update

I gather this posting came off as somewhat defensive or perhaps even petulant. I’m sorry. I’ve let you all down.

Seriously, this wasn’t meant to be petulant. I was joking. But then after I posted it, I found I wasn’t joking. Hard to explain. However, no one’s comments, which are excellent and generous as always, were responsible for this posting. I wanted to make that assurance. And Mark’s, Jonathon’s, and Alex’s postings are good, which is why I linked to them.

And Phil: Zoe’s curled up in my lap right at the moment, and sends you purrs.

Archived with comments at the Wayback Machine

Categories
Just Shelley

Boys with Toys

Along with others, I also read Clay Shirky’s Power Laws, Weblogs, and Inequality. However, unlike most others, my reaction to Clay’s newest gem was to go, “What a load of hooie”.

Don’t get me wrong, I think Clay’s sharp as a tack and smart as a whip. (Do I need any other weapon-like metaphors to make my point?) He’s a great speaker, and knows his technology, and loves what he does, and I respect that. But he has one failing with regard to his views on social gatherings: he’s an elitist. He believes there will always be an ‘elite’ grouping within any society, something I don’t necessarily discount; however, in his writing and actions, he also tends to encourage the mistaken belief that social groupings must follow fixed statistical patterns that support a static elite and that we must all behave as the statistics dictate. And I say, what a load of hooie.

Clay references Pareto’s work in wealth distribution, showing that 20% of the people control 80% of the wealth. He writes:

Power law distributions, the shape that has spawned a number of catch-phrases like the 80/20 Rule and the Winner-Take-All Society, are finally being understood clearly enough to be useful. For much of the last century, investigators have been finding power law distributions in human systems. The economist Vilfredo Pareto observed that wealth follows a “predictable imbalance”, with 20% of the population holding 80% of the wealth. The linguist George Zipf observed that word frequency falls in a power law pattern, with a small number of high-frequency words (I, of, the), a moderate number of common words (book, cat cup), and a huge number of low-frequency words (peripatetic, hypognathous). Jacob Nielsen observed power-law distributions in website page views, and so on.

Clay equates these versions of the Pareto curve with weblogging popularity, as a measure of weblogging elitism. In his first figure, which I copied under Fair Use and replicated here, he shows a curve that plots the number of incoming links as a function of popularity. This is, Clay assures us, a demonstration of weblogging popularity as mapped to a power law distribution.

[image gone]

(Clay pulled the figures from NZ Bear’s old blogging ecosystem work, an effort that is now defunct, and still alive and well as the Blogging Ecosystem.)

At first glance, Clay’s diagram does demonstrate the traditional curve that marks both Pareto and power law distributions. However, Clay pulled his data from a tainted source, and then compounded the error with an extrapolation that hasn’t been borne out in observed behavior.

First, the tainted data. NZ originally polled his ‘starting’ weblogs based on his own viewing patterns, which tend to reflect his warblogger interests. This created a bias towards warblogging weblogs. As NZ wrote at the time:

So, after a few days of screwing around with lots of different tools, I found a way to do it. The methodology, in a nutshell, is this: I started with a fairly large list of about 175 blogs; mostly, I stole from Instapundit and Vodkapundit’s lists, since they are pretty comprehensive, especially when taken together. Then, I built a process to do the following:

– Download the front page of each blog to my local machine
– Scan through each page and extract every link (URL) found in the HTML of the page
– For each of the original list of blogs, scan through the total link list and count how many links go to that blog
– Sort the list of blogs in descending order of their number of inbound links, and include the number in parentheses next to the blog link
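The counting steps NZ describes can be sketched in a few lines of Python. This is only an illustration of the general approach, not NZ's actual tooling: the download step is left out, the `pages` dictionary (blog URL mapped to the front page HTML) is a hypothetical stand-in for the downloaded files, the function names are invented for this example, and matching links by host name and skipping a blog's links to itself are assumptions that the quoted description doesn't spell out.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def host_of(url):
    """Host part of a URL, lowercased (assumption: one host per blog)."""
    return urlparse(url).netloc.lower()


def rank_by_inbound_links(pages):
    """pages: dict of blog URL -> front page HTML (already downloaded).

    Returns (blog URL, inbound link count) pairs, highest count first.
    """
    blog_hosts = {host_of(url): url for url in pages}
    counts = Counter({url: 0 for url in pages})
    for source_url, html in pages.items():
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            target = blog_hosts.get(host_of(link))
            # Assumption: a blog's links to itself are not counted.
            if target is not None and target != source_url:
                counts[target] += 1
    # Descending by count; ties broken alphabetically by URL.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def format_ranking(ranking):
    """One line per blog, with the count in parentheses after the link."""
    return [f"{url} ({count})" for url, count in ranking]


# Hypothetical three-blog starting list.
pages = {
    "http://a.example/": '<a href="http://b.example/post1">b</a>',
    "http://b.example/": '<a href="http://c.example/x">c</a> '
                         '<a href="http://b.example/about">me</a>',
    "http://c.example/": '<a href="http://b.example/">b</a> '
                         '<a href="http://elsewhere.example/">x</a>',
}
ranking = rank_by_inbound_links(pages)
```

Note that only links between blogs on the starting list are ever counted. A blog that isn't in `pages` can never appear in the ranking, no matter how many links point to it, so whatever bias went into choosing the list carries straight through to the result.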

NZ’s work was never based on the random sampling necessary in order to make a sound statistical measurement. Tainted data leads to a tainted statistic.

However, even if NZ’s earliest work had been based on random sampling, Clay’s extrapolation about ‘links’ forming a power law distribution is not borne out by an examination of the existing Blogging Ecosystem, which shows that the power law distribution tends to favor tools and mainstream media links over weblogs. Of the top ten link earners, only two, Scripting News and Boing Boing, belong to webloggers. The rest belong to Movable Type, Blogger, CNN, Google, and so on.

If we were to start with untainted data and then filter it to exclude anything other than weblogs, the results are not as static as Clay’s hypothesis would suppose. He wrote:

However, though the inequality is mostly fair now, the system is still young. Once a power law distribution exists, it can take on a certain amount of homeostasis, the tendency of a system to retain its form even against external pressures. Is the weblog world such a system? Are there people who are as talented or deserving as the current stars, but who are not getting anything like the traffic? Doubtless. Will this problem get worse in the future? Yes.

Using my own behavior as a guideline, perfectly acceptable if I view myself as a statistical subject, I started out linking primarily to the more well-known webloggers. However, over time, I found other weblogs and webloggers who I tended to read more and more and appreciate more than the so-called elite webloggers. Most of these people I met in my comments, and in comments on other weblogs. As I added more of these people to my blogroll, and linked to them in my postings, I tended to link to the elite bloggers less and less because I found that I just didn’t read them as much. In other words, as my experience level increased in weblogging, my reliance on linking to a set group of elite bloggers decreased.

If you look at my blogroll now, you only find a few of what can be termed ‘elite bloggers’ (if elitism is a measure of incoming links as measured in the Blogging Ecosystem and Technorati and elsewhere). My blogroll reflects what is an unmistakable human trait — my tastes have changed, my interests have matured, some people have quit, while others have gone in directions I’m not interested in pursuing.

What Clay doesn’t factor into the equation is that, unlike the wealth in Pareto’s work, which was based on a closed system with finite resources, weblogs are not a closed system, and links are neither finite nor fixed.

Even without all these statistical games, Clay’s observations are just not borne out by practice. Quoting his conclusion:

At the head will be webloggers who join the mainstream media (a phrase which seems to mean “media we’ve gotten used to.”) The transformation here is simple – as a blogger’s audience grows large, more people read her work than she can possibly read, she can’t link to everyone who wants her attention, and she can’t answer all her incoming email or follow-up to the comments on her site. The result of these pressures is that she becomes a broadcast outlet, distributing material without participating in conversations about it.

Meanwhile, the long tail of weblogs with few readers will become conversational. In a world where most bloggers get below average traffic, audience size can’t be the only metric for success. LiveJournal had this figured out years ago, by assuming that people would be writing for their friends, rather than some impersonal audience. Publishing an essay and having 3 random people read it is a recipe for disappointment, but publishing an account of your Saturday night and having your 3 closest friends read it feels like a conversation, especially if they follow up with their own accounts. LiveJournal has an edge on most other blogging platforms because it can keep far better track of friend and group relationships, but the rise of general blog tools like Trackback may enable this conversational mode for most blogs.

In between blogs-as-mainstream-media and blogs-as-dinner-conversation will be Blogging Classic, blogs published by one or a few people, for a moderately-sized audience, with whom the authors have a relatively engaged relationship. Because of the continuing growth of the weblog world, more blogs in the future will follow this pattern than today. However, these blogs will be in the minority for both traffic (dwarfed by the mainstream media blogs) and overall number of blogs (outnumbered by the conversational blogs.)

What a load of hooie. Or as Dave Winer says, rightfully, and more diplomatically, Clay doesn’t understand weblogs.

What Clay doesn’t take into account is that many of the so-called A-List, or head bloggers, the ones that primarily link and comment, have always been the type of blogger who primarily links and comments. This isn’t a measure of their popularity so much as a reflection of how they started blogging and how they continue it. There are just as many webloggers who don’t have as many incoming links but are the “link and comment” type of weblogger.

This type of weblogging is a matter of preference, not time or popularity.

Clay also mentions that the ‘long tail’ of webloggers, those with the least links, will always be the ‘conversational’ bloggers. By this, I’m assuming that Clay means those webloggers who talk about their life, their interests and events in their lives, and who get into cross-blog and comment style of conversations.

What a load of hooie. I can’t count the number of times I read Dave Winer talk about what he had for dinner, or about his illness, quitting smoking, and his father’s illness. There’s been many a time I’ve gotten into cross-blog and comment debates with Dave, and others who are currently in the ‘Pareto head’.

In fact, about the only popular bloggers who never get into cross-blog or comment conversations are Andrew Sullivan and Wil Wheaton. To be honest, no real loss.

Looking at the top 100 weblogs in Blogging Ecosystem or Technorati — if you filter out the tools and the major publications, the vast majority of people in the top slots are all conversationalists.

A person not having comments does not mean they don’t get into conversations. Many a so-called non-conversational and popular weblogger has spoken up in comments, mine and others, more than once. I’ve even had a cross-blog conversation with the Great Pundit, Glenn Reynolds himself, a couple of times. Mark Pilgrim, Dave Winer, Anil Dash, Chris Pirillo, VodkaPundit, Chris Locke, Jon Udell, John Robb, Jason Kottke — these are ‘popular’ webloggers (as measured by incoming links in the systems that measure these sort of things) and you couldn’t shut any of these people up if you tried because they want to be part of the conversation.

Most of the webloggers with the highest incoming number of links thrive on conversation. It’s our drug of choice.

As for this “At the head will be webloggers who join the mainstream media…” This reminds me so much of the parable of the elephant and the six blind men. If you only read Glenn Reynolds, your view of weblogging is that webloggers link and comment and then get good jobs as journalists. If you only read Dave Winer, your view of webloggers is that they link and comment, write an occasional longer essay, and get a job at a prestigious university. If you read Doc Searls mainly, your view of webloggers is that they’re professional journalists who link a lot, but also write a lot and tend to lose things a lot (which is unfortunate).

If you only read Boing Boing, your view of webloggers is that they link and comment and write science fiction, which they publish online for free access. If you only read Jon Udell, all webloggers are technical.

But where does Mark Pilgrim fit into this? Mark’s an all over the board blogger and he’s a ‘popular’ blogger from the statistics. How about Big Pink Cookie? Christine is about as conversational and personal and connected with her audience as you can get, and she’s popular. What about Anil Dash? Boy, can’t beat Anil for getting in and mixing it up with his audience. Anyone forget the time when Anil took on Little Green Footballs? How about Steven Den Beste? How about Michele from A Small Victory? Or Davezilla for that matter, who’s one of the most eclectic people in weblogging?

Do any of these people — do any of us — fit into the statistical cookie cutters that Clay used in his effort to bake us into his weblogger cookies?

(Speaking of which, since this is a weblog: if you were a cookie, what type of cookie would you be?)

Two years from now, if I were to write this again, chances are that I’d be using the names of weblogs that don’t even exist now. Why? Because two years ago, many of the weblogs I just quoted didn’t exist.

Clay’s extrapolations based on statistical observations about webloggers are not validated by the empirical behavior of the webloggers. Or, in layman’s terms: We blow Clay’s hypothesis all to hell and gone. Clay has too much invested in his beliefs in static social patterns to open his eyes and look at what we’re doing. And that’s okay because we’re too busy doing what we’re doing to be all that concerned.

Archived with comments at Wayback Machine

Categories
Writing

Can’t spill coffee on a digital book

Recovered from the Wayback Machine.

One very nice perk of being an O’Reilly author is getting full membership in O’Reilly’s online digital book library, Safari. I think about the 34 boxes of books I have stored back in San Francisco and can’t help but feel that digital is really the way of the future. At least for technical books.

(I still like the feel of a ‘real’ book when I snuggle under the covers in bed for a nice read before heading off to sleep. I love the feel of the covers in my hands, the sound of the turning page, the illustrations and type face, and the slightly acrid, dusty smell you get sometimes with an older book from the library.)

When I was out at Safari tonight, looking something up about Python, I noticed that one of the books I helped write and organize for O’Reilly last year, Unix Power Tools 3rd edition, is currently Safari’s most read book.

Don’t you like the unexpected gift the most? That was a very nice little egoboo to help me as I work away with Practical RDF.