29 Octobris 2005

Repository FUD

Laura Smart of California State Polytechnic sent me a link to the Educause “missing the train” article that everybody’s been chin-wagging about. I hadn’t bothered reading it previously, because library tech angst crosses my radar so often I’ve quit paying attention unless the author is someone I respect or the buzz sounds genuinely novel.

Laura showed me the repository language:

In response to the Web, many libraries, individually and/or collectively, have started to create their own information hubs—digital repositories—using the intellectual content of their institutions. Unfortunately, many of these repositories are built on traditional methods of information organization rather than on the new information-dissemination models evolving on the Web. Potential contributors to and users of these repositories are finding the organization and metadata tag systems imposed by libraries far too cumbersome. Moreover, in designing many of these new digital repositories, libraries have largely ignored the important role that people play. Most library digital repository initiatives are designed to serve only as gateways to documents and artifacts. Few are designed to serve as true information hubs, providing users access to both relevant information and experts.

Aren’t we librarians just awful. We ought to be ashamed of ourselves.

Well, I can tell you one thing right off the bat about article author Paul B. Gandel—he’s never deposited anything to a DSpace repository. I whinge a lot about DSpace usability, and rightly so, but the bare fact is that even given DSpace’s usability problems, submitting an item through the web interface is brain-dead simple to do. What’s more, it’s going to get simpler and more customizable to individual content types over time—I’m on the DSpace dev list and I pay attention to what’s coming.

Complex library metadata my foot—DSpace uses Dublin Core, for heaven’s sake, the brain-dead simplest and stupidest metadata set in creation! Filling out the metadata forms for a DSpace submission takes about as much time as writing up a cite for a journal article with all the proper punctuation. (Of course, lots of Certified Smart People can’t do that, either, but they don’t generally blame their incapacity on draconian journal editors.) I’m honestly not sure what Mr. Gandel wants us to do.

Maybe what’s being sought is tags, folksonomies, whatever you want to call them. Okay, fair cop. Give me a minute.

As for ignoring people: which people, pray? Submitters or consumers? I maintain stoutly that we’ve done pretty well by submitters. Sure, we’ve made them think about some things they don’t want to think about (e.g. preservation issues), but somebody’s got to. Nobody wants to be saddled with repositories full of dead material in fifteen years.

As for consumers, well, I’m going to argue that we aren’t doing what Gandel thinks we’re doing. I don’t think I’m in the content-discovery business, except indirectly. What’s more, that’s not a business I want to be in, because Google and Scopus and ISI and OAIster will kick me to the curb and steal my lunch money. Besides, it’s really rather wasteful for all us repository-rats to deploy sophisticated discovery systems; most of what we collect isn’t self-contained, and benefits from cross-repository aggregation. Hardly anyone is going to find everything they could possibly want in the repository I maintain. Not that it couldn’t happen; some of the special-collections stuff that’ll be going up soon works as a unit. Even so, better that content consumers should search and discover on a larger scale.

So what I and my fellow repository-rats do instead is put our data out there for other people to build discovery services from. That’s what OAI-PMH is all about. No, I don’t anticipate DSpace building a tagging system any day soon, but that doesn’t stop CiteULike or Connotea or whomever from doing it on top of OAI-PMH data. That’s webby content discovery, whatever Mr. Gandel says: inviting network effects to take hold and bear fruit.
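For the curious, here is roughly what "building on top of OAI-PMH data" looks like from the harvester's side. This is a minimal sketch in Python, not anybody's actual service: the endpoint URL, the sample record, and the function names (list_records_url, parse_records, subject_index) are all made up for illustration. The only real parts are the ListRecords verb, the oai_dc metadata prefix, and the XML namespaces, which come from the OAI-PMH and Dublin Core specifications. A real harvester would also have to follow resumption tokens and handle errors, which this skips.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import urlencode

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample response (an assumption, not output from a real repository).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.edu:1234/56</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A Made-Up Preprint</dc:title>
          <dc:creator>Example, Author</dc:creator>
          <dc:subject>institutional repositories</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL a harvester would request to list a repository's records."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_records(xml_text):
    """Pull identifier, title, creators, and subjects out of a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        dc = rec.find("oai:metadata/oai_dc:dc", NS)
        if dc is None:
            continue  # deleted records carry a header but no metadata
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", default="", namespaces=NS),
            "title": dc.findtext("dc:title", default="", namespaces=NS),
            "creators": [e.text for e in dc.findall("dc:creator", NS)],
            "subjects": [e.text for e in dc.findall("dc:subject", NS)],
        })
    return records


def subject_index(records):
    """Group record identifiers by subject, the seed of a cross-repository browse list."""
    index = defaultdict(list)
    for rec in records:
        for subject in rec["subjects"]:
            index[subject].append(rec["identifier"])
    return dict(index)


if __name__ == "__main__":
    # Hypothetical endpoint; printed only, never fetched.
    print(list_records_url("http://repository.example.edu/oai/request"))
    print(subject_index(parse_records(SAMPLE_RESPONSE)))
```

Notice that nothing in there cares which repository the records came from. Run the same parsing over a few dozen repositories and you have the raw material for a discovery or tagging service, without any one repository-rat having built it.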

In sum, we repository-rats are only part of the larger picture. Our mission is to solicit digital content from submitters, do all the quasi-legal work to ensure we can display and preserve it (and let me tell you, this aspect of my job is looming far larger than I ever thought it would!), and put the content in shape for future generations via intelligent preservation and metadata practice.

Mr. Gandel’s FUD? Isn’t helping. This repository-rat wishes he’d sit down with us and figure out how to help, instead of misunderstanding our role and giving our faculty yet more lame excuses to ignore us.