May 6th, 2007

I finished David Weinberger's new book, Everything is Miscellaneous. I hesitated to write anything but a positive review, because I know how discouraging negative reviews can be. At the same time, I did commit more than the time of a flight to Mexico to the book.

While I can admire David's enthusiasm and breadth of knowledge, I found I couldn't agree with most of the conclusions he reached. Beyond that, he jumps about from subject to subject in a bewildering manner that isn't always easily understood or tied together at the end of each section/chapter/topic. David's a good writer, and has a way with the anecdote or with describing the information he's presenting. But there's no flow between the sections that takes you from less knowledge to more. I assume it's intentional, and that David is dropping leaves of knowing about for us to form our own paths over time. Instead of enlightenment, though, I found myself going, 'Huh? What? Why is this here? What about that?'

Returning to the conclusions, in one section on Wikipedia, David mentions that we can know when neutrality is achieved for an article when there are no further edits. Yet that's not something we can determine. It could just as easily be a case of the more knowledgeable finding the back and forth of Wikipedia to be tedious, or not worth the time. In cases such as this, because they did not stay around long enough to argue their point, their point of view is judged 'less' than the person's who did stick around. But 'truth' isn't based on which side has the most extra time and/or staying power in a debate. The whole premise behind lack of activity being a measure of neutrality is flawed.

David also wrote, "One of the lessons of Wikipedia is that conversation improves expertise by exposing weaknesses, introducing new viewpoints, and pushing ideas into accessible form." But not everyone does well in a confrontational setting, and Wikipedia is a confrontational setting. Let's also not forget that Wikipedia favors those with much free time; or forget that it wasn't all that long ago that a so-called 'expert' on religion in Wikipedia–one of the legion hailed by Jimmy Wales in David's book–was eventually exposed to be a fraud who derived most of his seemingly intimate knowledge of the subject through the book, "Catholicism for Dummies".

In his many mentions of metadata, David conflates metadata with data, or defines metadata in ways that confuse it out of all recognition. He talks about the spaces between words as 'metadata'. That one had me so flummoxed, I ended up reading five pages without comprehending a word of what was written because my mind was still trapped in the concept of a space between words as 'metadata'. Does that mean, then, that the space between me and my desk is also metadata? How about the space between me and the car that's moving outside my window? Wait, it's continuing to move; must I continue to redefine the metadata?

In his chapter, "Messiness as a Virtue", David references an anecdote told earlier in the book, about where the data that makes the Wikipedia entry for elephant physically resides (Wikipedia is served by several machines). "Beats heck out of me", is the answer, more or less, from the powers that be. In the later chapter, this translates to, "As we saw with the computers that house Wikipedia, the physical placement of the bits is of so little importance that even the people in charge have no idea where they are. This is a mess of a whole new type."

David equates a distributed application with chaos and messiness, but any computer person would know immediately that we don't have to know where the bits reside because a carefully constructed application consisting of programming algorithms and a data model is used to do this task for us. In other words, the power of the computer is that it isn't us, and our power is we're not the computer: The computer can maintain references to a billion bits of data and pull out a meaningful entry from such, but lacks the reasoning power to take similar tasks and technologies and use them to build a new banking system. We have the overall reasoning capability to create such a system in the first place, but most of us wouldn't be able to call Mom if we didn't have her number programmed into our cells.

Contrary to being a mess, the Wikipedia system is a very carefully crafted and controlled structure. There is no chaos, only code.
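To make the point concrete, here is a minimal sketch in Python of how an application can find an entry without anyone knowing where it physically sits. It is a toy and makes no claim about how Wikipedia actually stores its articles: the `ShardedStore` class, the hash-based placement rule, and the four in-memory "machines" are all assumptions made purely for illustration.

```python
import hashlib


class ShardedStore:
    """Toy key-value store spread over several in-memory 'machines'.

    Hypothetical illustration only; not Wikipedia's real architecture.
    """

    def __init__(self, shard_count: int) -> None:
        # Each dict stands in for the storage of one server.
        self._shards = [dict() for _ in range(shard_count)]

    def _shard_for(self, title: str) -> int:
        # A stable hash of the title decides which shard holds the entry,
        # so the same title always maps to the same place.
        digest = hashlib.sha256(title.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self._shards)

    def put(self, title: str, text: str) -> None:
        self._shards[self._shard_for(title)][title] = text

    def get(self, title: str) -> str:
        # The caller never says, or needs to know, which machine to ask.
        return self._shards[self._shard_for(title)][title]


store = ShardedStore(shard_count=4)
store.put("Elephant", "Elephants are large mammals...")
print(store.get("Elephant"))
```

The person calling `get("Elephant")` has no idea which of the four stores answered, and doesn't need to. The placement is still entirely determined by the code.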

On 'mess', David wrote, "Third-order messes reverse entropy, becoming more meaningful as they become messier, with more relationships built in." He uses Flickr and the site's ability to evaluate associated tags and derive clusters as evidence for this proclamation. Yet we're finding out, more and more, that people who want to participate in Flickr as a social enterprise actually modify their tags in order to deliberately place their photos into clusters.

In other words, they do not participate in tagging as a chaotic enterprise, but literally one that is based on a pre-defined order. Though the order itself may seem to be derived from the masses, in actuality, just like with Wikipedia, it derives from a small group of early users–founders, if you will. Regarding the concept of accidental meaning, as compared to intentional, true, there are tags for some items that are obvious, such as 'cat' for a cat and most people would use this tag for this subject. Yet, these are matched by tags that can never participate in any meaningful manner, such as adding "Zoe" to pictures of a cat, because Zoe is the name of my cat. When I search on cat, I see pictures of cats; when I search on Zoe, I see too many pictures from a child's soccer game. But wait! There's clusters for Zoe. From these I see Zoe is clustered with dogs, cats, kitties, eyes, nose, baby, serenity, and the list goes on.

The truly chaotic tend to stay chaotic. It's only when social patterns exert themselves and one operates within an accepted social manner–tagging with 'meaningful' phrases and terms–that one begins to derive something of value. It doesn't mean that tagging operates in a whole new manner, or 'order', as David refers to it. It just means that what makes one a 'creator' of the order differs between the old and new way of doing things: order is derived by authority in the first; by spare time, early interest, access to betas, and even social network in the second. I'm not sure the trend is heading in a positive direction.

Much of the book was spent in defending a state of miscellany by demonstrating the power of Wikipedia, Flickr, del.icio.us and the like. Interesting, true, but I would rather David had expanded more on his introduction of topics such as faceted classification. One of the more interesting interviews in his book is with Eleanor Rosch, who defined a way of categorization based on prototyping. David then extends this to the approach taken with tagging on Flickr, but he doesn't provide examples that would demonstrate how Rosch's approach extends to Flickr, or Delicious, or any other tagging-based system. Like much of the rest of the book, David touches on subjects but not in enough depth for one to feel sure of understanding the points he is trying to make. Other than "miscellaneous" is good.

He does touch on the semantic web, or the Semantic Web, but again this is covered lightly. More importantly, though, the coverage was hugely one-sided. RDF is introduced only to immediately be dismissed. People were interviewed to support a predefined view: RDF is bad, microformats good–and all covered in five or so pages.

I'm not disputing that a digital, connected environment will derive new ways of categorizing and classifying data, or that providing outlets for such and making them available to everyone is a good thing. As David said himself, there is a difference between classifying virtual objects, compared to those that are real–at a minimum, you can easily replicate a virtual object, and so one can explore different ways of putting things together.

No one denies, either, that earlier forms of categorization brought about their own limitations: Dewey and his provincial view of religion, or Linnaeus and his disinterest in invertebrates. But we also can't deny that flawed or not, these older systems did provide some form of consistency. The Dewey Decimal system is flawed, true, but it is consistently flawed. With it, I can use a library equally well in Boston, as in St. Louis. I could do so before computers and the web; I can do so now, after both are ubiquitous. This same consistency also applies to David's 'new order'. As I mentioned earlier, when people's purpose in using a site like Flickr is a social sharing of photos, people work to use tags consistently. If there is one commonality between the old and the new, it is a desire for consistency. Where we differ is the tools used to achieve such.

David used his own book as demonstration of the weakness of the Dewey Decimal system–how would it be classified? Where will it be placed? After all, it doesn't fit neatly into pre-defined categories.

To be admittedly contrary, I would say the book would be rather simple to classify: it would fit nicely under 'religion'. Returning to the discussion on the semantic web, the coverage of RDF demonstrates one of the weaknesses of the entire book: David had a concept, a belief, and then sought out specific knowledge and other witnesses to the faith who would provide the evidence to support such. To understand why, we can use David's own words.

About the failure of the Howard Dean campaign, he wrote, "For Clay Shirky–whom we've met as a skeptic about the power of top-down taxonomies–it was a 'collective delusion' caused by supporters speaking only with those who agreed with them." One could say the same about David and miscellaneous. Where we truly disagree is that I think the 'collective delusion' is something to be avoided, while David would seek for us to embrace it, completely.

Comments
1
Charles - 10:49 pm 5/6/2007

I don't understand why people feel the need to write whole sections of books about Wikipedia. I saw someone summarize everything you need to know about it in a one sentence pseudo-definition:

"The human hand has five fingers [citation needed]"

2
fp - 11:16 pm 5/6/2007

I'm about half-way through the book and I'm enjoying it very much. I bridle a bit when I run into over-simplifications like the ignorance of data location on the disks at Wikipedia. There is a sys-admin somewhere who could tell them exactly where that stuff is located using a "second order" referential description to track it down. But this is a 250-page book that provides an entering wedge into the replacement of a dominant paradigm, a conventional wisdom regarding hierarchy, that stretches back for millennia.

I think it is an engaging disquisition on new ways to order truth and assemble knowledge given the information processing tools we now have at our disposal.

Oh, yeah… and what you said about people interviewed to support a pre-defined view? I think it as likely that the chickens preceded the egg on this. A different flock would have influenced a different presentation of RDF. But this is among other things an encomium for tags and folksonomies, so more formal structured approaches take a back seat.

I liked your review and some of the points you've made (the Dewey system's broad availability making up for the culturally narrow world view of old Melvil, for example), but I'm enjoying the book too.

3

Shelley, it's a dirty job, but somebody had to do it :-(

I couldn't bear to review his book, because I didn't want to make an enemy.

4
Phil - 5:33 am 5/7/2007

David had a concept, a belief, and then sought out specific knowledge and other witnesses to the faith who would provide the evidence to support such.

This is precisely why I've only glanced at the reviews David's linked from his own blog - and skipped the ones where he says so-and-so has been really nice about the book.

I hesitated to write anything but a positive review

Seth: I couldn't bear to review his book, because I didn't want to make an enemy.

Woe to the unbeliever!

5
Bud Gibson - 6:49 am 5/7/2007

Does that mean, then, that the space between me and my desk is also metadata?

I'm afraid to say, I think it does, although it seems a rather loose form of metadata. In most cases, metadata seems to be an assertion rather than a physical fact.

How about the space between me and the car that's moving outside my window? Wait, it's continuing to move; must I continue to redefine the metadata?

Yes, and that may be the point.

6

Thanks for this review. It is a critique in the best sense of the word.

Does that mean, then, that the space between me and my desk is also metadata?

Technically, I think if it was expressed as a dimension, it could be. OTOH, it could just be that PEBKAC.

One of the more interesting interviews in his book is with Eleanor Rosch who defined a way of categorization based on prototyping.

That alone might make it worth the price of the book to me. It will certainly make it worth borrowing from the library.

ESR wrote an interesting essay applying prototypes to Science Fiction worlds.

7

it seems a rather loose form of metadata

What's the point? Radio is a loose form of television then, it only lacks the picture.

8

[…] Shelley Burningbird Powers both disagrees with what I say and doesn't much care for how I say it in Everything Is Miscellaneous. I appreciate Shelley's care and thought. She does exactly what an author hopes reviewers will do — engage with the ideas — although of course I'd rather that she loved every comma and period in it. But, I didn't ask the publishers to send her a copy thinking that she was likely to agree with it. […]

9

Thanks, Shelley. I posted a long-ish response at EverythingIsMiscellaneous.com (and a pointer to your review and my response at my "normal" blog, http://www.JohoTheBlog.com).

10
Shelley - 1:14 pm 5/7/2007

Thanks for the response, David. And the comments, folks.

11
Tim - 1:54 pm 5/7/2007

>The Dewey Decimal system is flawed, true, but it is consistently flawed. With it, I can use a library equally well in Boston, as in St. Louis. I could do so before computers and the web; I can do so now, after both are ubiquitous.

This is wrong in a number of ways. First, Dewey's system was intended to go only as deep as a library wanted it. A small public library will then go X digits into a topic, to colocate books, say, on "evolution," which, in an academic library, would get their full number and fall into numerous subcategories of the topic. So, strictly speaking, the libraries you'd consult would have different Deweys for the same book. The "physicality" of the system–that you can't change numbers without getting out the razor–is a classic second-order problem that a third order can solve.

Second, almost nobody uses Dewey to find things in a library. They don't come in and ask for the 913.15 (Ancient Geography of Japan). And library catalogs–old and new–do not provide the meaning for the codes. They're just codes. To the extent they are of any use, it is in locating sets of books which seemed similar to a clever but rather provincial librarian a hundred and forty-three years ago.

12

Shelley - FWIW Shelley, I'd like to commend you on taking the time to both finish and write a substantial review of a book with which you found so many issues. Your review and David's response piqued my interest more than seeing the thing during construction over the last couple years.

So, for me at least, a negative review helped sales. How 'bout that?

13
Shelley - 2:35 pm 5/7/2007

Tim, I'm assuming I can use the library's online catalogs, search for a book, and if it's located in, say, history under a specific decimal number, I can find it more easily by searching the numbers on the back of the book. It doesn't work this way? Odd, it seemed like it worked that way last time I went to the library. At least for most of the non-fiction.

I'm not defending it. I'm just saying it was something. What would be better? Toss the books in a pile and let the patrons shelve them however they want?

14
Shelley - 2:37 pm 5/7/2007

Jonathon, as a fellow author, I'm tickled pink if the review helps sell David's book. I hope it brings him many financial rewards.

One doesn't write a critique to influence the sales of a book; one writes a critique to give one's opinion. Writing to influence sales is marketing, and frankly, I'd rather gnaw my hands off at the wrist before reviewing something purely for marketing.

15
Shelley - 3:02 pm 5/7/2007

PS Tim, I went out and searched on glass exhibitions and Dale Chihuly. I did find a consistent number between different libraries, 748.092, which is under the main topic of 748, glass. That's pretty consistent.

I'm curious, what new approach would libraries prefer?

16
Tim - 7:35 pm 5/7/2007

I think that there's a place for both professional and non-professional data. Some examples, comparing Library of Congress Subject Headings (LCSH) and tags:

Professional data wins:

Love stories, English > Bibliography > Methodology (http://tinyurl.com/32lef5). Tags won't usually get at the complex things going on in this category.

Tagging wins:

*Tag page for "chick lit" (http://www.librarything.com/tag/chick%20lit)
*Tag page for "cyberpunk" (http://www.librarything.com/tag/cyberpunk)
*Tag page for "queer" (http://www.librarything.com/tag/queer)
Tagging gets at genres and identities that subjects don't, and probably can't. Chick lit, for example, usually gets only the LCSH "Love stories."

A tie:
LCSH: United States > History > Civil War, 1861-1865 > Fiction (http://tinyurl.com/2j2bfb)
Good, but not necessarily better than the LibraryThing tagmash of "Civil war" and "fiction" (http://www.librarything.com/tagmash/civil%20war,fiction)

The future:
Mashing and faceting tags, subjects and algorithmic data.

17

[…] because I plan to write one myself and now my range for originality has been limited.  I saw Shelley's disagreement and thought it natural that the loosely joined guy and the more — well — tightly wound […]

18
Shelley - 11:08 pm 5/8/2007

Tim, I must admit to an inability to map how we'll get from the one to the future. That's probably my biggest disconnect from David. Thanks for the example.

19

[…] or just re-tagging the deck chairs on the Titanic, so to speak. He uses this to try to understand Shelley's negative reaction to the […]
