JCDL 2006 has closed, though there’s a “metadata party” tonight that I will probably be attending. This has very much been a “get looked over” conference for me—a lot of people wanting to meet me, doubtless to reassure themselves I’m not an axe-wielding maniac. Since I’m not, I don’t worry. Much.
The papers I liked best fell into a category I have just invented that I am (for the next five minutes) calling “technical ethnography.” (Technography? Ethnographic technosophy? I dunno.) Essentially, it’s technological insight acquired via observation of human behavior.
(Rather than bog-standard survey work, which I am really starting to loathe, especially as presented at conferences. Hey, presenters of bog-standard survey work? Don’t tell us about your methodology; we already know how surveys function. Don’t do a lit review during the presentation; if you’re talking about something of interest to us, we’ve already read the papers, and if we haven’t, we can read your bibliography. Tell us why we care about yet another bog-standard survey, then tell us what you found out from it, especially if it’s cool or anti-intuitive. Then shut up and let us ask questions. Honestly, though, if I ever run a conference, survey work will be relegated to poster sessions, period. In passing, do ARL libraries have to hire a Survey Librarian just to answer all the bog-standard surveys they get?)
Anyway, the conference Clever Boots award goes to the guys who bootstrapped name-disambiguation software for Citeseer (which desperately needs name disambiguation; I loathe Citeseer metadata more than I can even begin to tell you) with the observation that people cite themselves. That’s just bloody brilliant, is what that is. Human behavior informing a technological solution to a metadata problem. Love it.
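To make the self-citation trick concrete, here is a minimal sketch of how it might work. This is not the presenters' actual algorithm, which I'm not reproducing here; the record layout, the `name_key` blocking rule (surname plus first initial, assuming "First Last" name order), and the sample papers are all invented for illustration.

```python
def name_key(name):
    """Coarse blocking key: (surname, first initial), lowercased.
    Assumes names are written 'First Last' (an illustrative simplification)."""
    parts = name.replace(".", " ").split()
    return (parts[-1].lower(), parts[0][0].lower())


def link_by_self_citation(papers):
    """Pair up name variants that appear as an author of a citing paper and
    as an author of a paper it cites, when the two share a blocking key.

    papers: dict mapping paper id -> {"authors": [...], "cites": [paper ids]}
    Returns a set of frozensets, each holding two name variants presumed
    to belong to one person.
    """
    links = set()
    for paper in papers.values():
        for cited_id in paper["cites"]:
            cited = papers.get(cited_id)
            if cited is None:  # cited paper is not in our collection
                continue
            for a in paper["authors"]:
                for b in cited["authors"]:
                    if a != b and name_key(a) == name_key(b):
                        links.add(frozenset((a, b)))
    return links


# Hypothetical sample data.
papers = {
    "p1": {"authors": ["J. Smith", "A. Jones"], "cites": ["p2"]},
    "p2": {"authors": ["John Smith"], "cites": []},
    "p3": {"authors": ["Jane Smith"], "cites": []},
}
links = link_by_self_citation(papers)
```

With this toy data, "J. Smith" gets linked to "John Smith" because p1 cites p2, while "Jane Smith" stays separate even though she shares the same blocking key: nobody with a matching name cites her paper. Links like these could then serve as seed examples for a fuller disambiguation pass.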
I also liked the winning student paper, about PDA software for specimen identification in the wild via cleverly-implemented dichotomous keys with a side order of easily-accessed photos and drawings. I want an EcoPod for hiking, I do—and that’s what’s brilliant about it. It ties into a basic hiker desire: “hey, what a cool critter! what is it?” Ethnographic technosophy, again.
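For anyone who hasn't met a dichotomous key: it's a chain of two-way questions that ends in an identification, which maps neatly onto a binary tree. The sketch below is only my guess at the general shape, not the EcoPod implementation; the `Couplet` and `Leaf` classes, the `photo` field, and the toy key are all made up.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass
class Leaf:
    """An identification, with an optional illustration to show the user."""
    species: str
    photo: Optional[str] = None  # hypothetical image path


@dataclass
class Couplet:
    """One two-way choice in the key."""
    question: str
    if_yes: Union["Couplet", Leaf]
    if_no: Union["Couplet", Leaf]


def identify(node: Union[Couplet, Leaf], answer: Callable[[str], bool]) -> Leaf:
    """Walk the key from the root, asking one question per couplet,
    until an identification is reached."""
    while isinstance(node, Couplet):
        node = node.if_yes if answer(node.question) else node.if_no
    return node


# A toy key; the questions and labels are invented, not real taxonomy.
key = Couplet(
    "Does it have six legs?",
    Couplet(
        "Does it have hard wing covers?",
        Leaf("beetle", photo="beetle.jpg"),
        Leaf("some other insect"),
    ),
    Leaf("not an insect"),
)

answers = {
    "Does it have six legs?": True,
    "Does it have hard wing covers?": False,
}
result = identify(key, answers.__getitem__)
```

On a handheld, `answer` would be a tap on a yes/no button rather than a dictionary lookup, and the photos and drawings would hang off each node so the hiker can check before committing to a branch.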
The winning non-student paper both amused and frustrated me. Carl Lagoze talked about the National Science Digital Library, and how it was believed that the Magic Metadata Fairy would use OAI-PMH to build a beautiful searchable garden of science, and how everyone ended up with an ugly, weed-choked, cracked-asphalt vacant lot instead.
This? Should not be news. There is no Magic Metadata Fairy, any more than there are Magic Editing and Typesetting Fairies in publishing. Metadata is an artisan’s job. If you want artisanry, pay an artisan, damn it.
Does that mean never accepting author-created metadata? Nope. But it means accepting that much author-created metadata is going to be crap, and building workflows that proceed from that assumption. Lordy, people, I was writing about this back in 2003, and now it wins conference paper awards?
I’ll be blunt. The solution for NSDL’s problem is hiring cataloguers, or metadata librarians, or indexers/abstracters, or whatever you want to call ’em, to clean up the incoming garbage. Ideally, OAI-PMH would be a two-way protocol, so that nice cleaned-up metadata made its way back to the repository that had spewed the garbage in the first place. That, however (despite all the jaw-flapping about frameworks that went on during JCDL), does not seem to be in the offing. It should be.
Yes, this is feasible; your cleanup artisans aren’t creating records from scratch, and existing cleanup algorithms can be run before they see the data they’ll be correcting. (Not to mention that their presence will improve your cleanup algorithms no end.) Besides, a lot of records will be okay to begin with.
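Roughly what I have in mind, as a sketch rather than anybody's production workflow: run the cheap automated fixes first, pass along the records that come out fine, and queue only the rest for a human. The field names, the `REQUIRED` list, the date pattern, and the sample records below are all assumptions I've made up for the example.

```python
import re

# Fields a record must have to skip human review (an assumed policy).
REQUIRED = ("title", "creator", "date")


def normalize(record):
    """Cheap automated fixes: collapse whitespace and drop empty fields."""
    cleaned = {}
    for field, value in record.items():
        value = re.sub(r"\s+", " ", value).strip()
        if value:
            cleaned[field] = value
    return cleaned


def problems(record):
    """List the reasons a record still needs a human's attention."""
    issues = [f"missing {f}" for f in REQUIRED if f not in record]
    date = record.get("date")
    # Accept YYYY, YYYY-MM, or YYYY-MM-DD; anything else gets flagged.
    if date and not re.fullmatch(r"\d{4}(-\d{2}(-\d{2})?)?", date):
        issues.append("unparseable date")
    return issues


def triage(records):
    """Split harvested records into those that pass and a review queue."""
    ok, review = [], []
    for raw in records:
        rec = normalize(raw)
        issues = problems(rec)
        if issues:
            review.append((rec, issues))
        else:
            ok.append(rec)
    return ok, review


# Hypothetical harvested records.
harvested = [
    {"title": "  Soil   samples ", "creator": "Doe, J.", "date": "2005-03"},
    {"title": "Untitled", "creator": "", "date": "last spring"},
]
ok, review = triage(harvested)
```

Here the first record is fixed automatically and never reaches a person; the second lands in the review queue tagged with what's wrong with it. Every correction the humans make is also a hint about which rule to add to `normalize` or `problems` next.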
The other answer, discussed during JCDL, is lowering the technical barrier to participation so that participants can focus more on metadata quality. This is good and I’m all for it; let’s just not pretend that it’ll solve the problem, is all. Most metadata sucks. Learn to work around that inconvenient fact.
This and other JCDL tech-ethnography got me pondering my own ethnographic inquiries. I think (along with many others, I should say) that a lot of the problem with attracting faculty contribution to IRs resides in the “this is not part of our normal workflow” problem. I would personally love to offer services that insinuated the IR into that workflow, but without some ethnography, I’m not sure what those services should be.
My sense is that a research-collaboration aid would help a lot. Such an aid would look a lot like a networked hard drive with bolted-on access controls. Researchers need somewhere to stash all the digital stuff they accumulate while they’re working on something: research results, downloaded literature, datasets, digitized stuff, EndNote citations, drafts and so on. They need to let their collaborators in and keep everybody else (except me, of course) out. The beautiful part is that if I’m in ultimate control of that drive, then it’s trivial for me to pick up the preprint or the publisher’s galleys for deposit into the IR.
Anybody want to go halfsies on an investigation into researchers’ digital workflows?
The Greenstone guys are the runners-up for the Clever Boots award; several excellent and useful demos of cool things to do with Greenstone. I dearly wish the Greenstone-DSpace integration project would hurry up and finish, because I’m dog-tired of coming under attack for DSpace’s UI ugliness and inflexibility.
JCDL 2006 was a solid conference. I doubt I’ll be flying to Vancouver for the next one, but it’s definitely on my list of conferences I’ll happily consider when they’re in my general vicinity.