30 August 2004

Heuristics and Wikipedia

You know what annoys me about the whole Wikipedia uproar? I’ll tell you what annoys me about it. (You knew I was going to anyway, right?)

We proto-librarians are informed with much fanfare in library school that librarians have a better sense for “source authority and quality” than the average joe, and that the information sources we choose are therefore better than those the average joe chooses when left alone to choose sources.

One would think that a profession that makes sweeping claims like this would spend a lot more time than it does teaching students how to evaluate sources. Leaving that Achilles heel aside, however…

The methods we are taught to evaluate source authority are not binary yes-or-no algorithms, always spitting out the right answer. They are heuristics. I’ll say that again, louder: H-E-U-R-I-S-T-I-C-S. Print versus online publication, peer-reviewed versus not, reputable publisher versus self- or vanity-publishing, number and quality of post-publication reviews, author reputation—these and their like are all heuristics. (Librarians have developed evaluation heuristics for websites, too, in case you were wondering.)

A quick Google definition hunt will give you a fair idea what a heuristic is, if you don’t already know. My off-the-cuff definition: heuristics are quick-and-dirty but often surprisingly effective surrogates for doing extensive spadework on a problem before making a decision. My favorite book on heuristics and the human brain is an eye-opening research-essay collection by Gerd Gigerenzer, Simple Heuristics that Make Us Smart. Find it at a library near you.

The thing about heuristics is that they leak. They fail. They do not weed out everything that needs weeding. They do not include everything that meets acceptable quality guidelines. They can even be gamed, intentionally defeated, by someone who knows what heuristics people bring to the table in a given situation.
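For the code-minded, here is a minimal sketch of what that leaking looks like. Everything in it is hypothetical: the `Source` fields, the three surface signals, the threshold, and the sample sources are made up for illustration, and no librarian scores sources this crudely. The `actually_sound` field stands in for the ground truth that the heuristic never gets to see.

```python
from dataclasses import dataclass


@dataclass
class Source:
    title: str
    in_print: bool
    peer_reviewed: bool
    reputable_publisher: bool
    actually_sound: bool  # hypothetical ground truth; the heuristic never reads it


def looks_authoritative(src: Source, threshold: int = 2) -> bool:
    """Quick-and-dirty surrogate: count surface signals instead of checking the work."""
    signals = [src.in_print, src.peer_reviewed, src.reputable_publisher]
    return sum(signals) >= threshold


# Made-up examples, chosen to show both kinds of leak.
sources = [
    Source("hoax paper in a peer-reviewed journal", True, True, True, False),
    Source("competent online encyclopedia article", False, False, False, True),
    Source("solid university-press monograph", True, True, True, True),
]

# Accepted by the heuristic but not actually sound.
false_accepts = [s.title for s in sources
                 if looks_authoritative(s) and not s.actually_sound]

# Rejected by the heuristic but actually sound.
false_rejects = [s.title for s in sources
                 if not looks_authoritative(s) and s.actually_sound]

print("Let through wrongly:", false_accepts)
print("Weeded out wrongly:", false_rejects)
```

The heuristic gets the monograph right and the other two wrong, one in each direction, and anyone who knows which signals it counts can dress a bad source up to pass.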

That doesn’t mean heuristics are bad; they’re necessary, in fact, because who has all the time in the world? Not to mention that evaluating quality is a fuzzy problem to begin with, not overly amenable to yes-or-no processing. Heuristics are a fine way to attack otherwise intractable problems.

But when we—librarians or ordinary janes—take either the efficacy or the results of our heuristics as gospel (mistaking them for algorithms, I suppose we might call it), we frankly start looking like prize jackasses, and bloody arrogant jackasses to boot. The high-school librarian cited by the anti-Wikipedia journalist (see Many2Many, if you haven’t already) is only the latest example. I could (and do, actually) argue that the famous Sokal hoax demonstrated not so much a failure of human intelligence as an ordinary and essentially unsurprising failure of human heuristics use. “Well, but it was published in a peer-reviewed publication!” That’s a heuristic too, folks. It fails now and then. Get used to the idea.

Think I’m knocking peer review? I’m not. David’s book got peer-reviewed; I watched the process. It’s a much better book because of the reviewers. But at its base, peer review substitutes somebody else’s reputation and hastily-formed-and-delivered opinion for the wretched and impossible task of going over the author’s work with a microscope. It’s a heuristic. As such, it too fails.

Moreover, heuristics share one key trait with algorithms: unforeseen circumstances can make them outdated. Cling to an outdated heuristic, look even more the jackass. Librarians had to be dragged kicking and screaming away from “All information on the Web is automatically inferior to any information in print!” to “There’s good stuff on the Web, if you know where to look for it and are able and willing to weed out the crud.”

Some librarians haven’t even gotten that far, of course, but they’re jackasses. I’m sorry, they just are.

As for Wikipedia, I included it in my William Morris project for reference class last year. Its article isn’t the best one I found anywhere, but it’s quite competent, and very far indeed from the worst. The worst encyclopedia entry by far, truly craptastical, came from the librarian-drooled-over World Book Encyclopedia.

As I said—heuristics leak.