8 June 2008

Archiving blogs

Meredith asks if anybody’s thought about archiving blogs.

Well, I have, and I can prove it. Dan Chudnov had a blog-preservation infrastructure he was kicking around, but I don’t know what happened to it.

Here are the chief barriers I see:

  1. Rights barriers. If getting a license from the blog owner weren’t hassle enough, consider the problem of third-party-owned designs.
  2. Respect barriers. If I had a buck for every time I’ve heard this: “Libraries exist to preserve the filtered, reviewed, authoritative scholarly literature. When we step outside those boundaries, we damage our reputation for purveying credible knowledge.” I’ve heard it about IRs. I’ve heard it about data curation. I’ve heard it three times over about blogs. Even those of us who see value in blog preservation can’t move forward while our libraries still think like this.
  3. Technological barriers. DSpace is very poorly suited to acquiring serials of any description. It doesn’t have any kind of harvest or cron-job mechanism. This could be hacked, but nobody’s hacked it. Until someone does, don’t talk to me about blogs; I don’t have time to do manual grabs once a month or whatever.
  4. Priority barriers. I am one person responsible to 26 campuses. Where am I going to put my energy? Capturing peer-reviewed literature? Data curation? Open-access journals? Grey lit? I’m sorry, blogs are pretty far down the list.
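
To make the missing harvest mechanism in item 3 concrete, here is a minimal sketch of one harvest pass, assuming the blog exposes an RSS 2.0 feed. Everything in it is hypothetical illustration: `new_items`, `seen_guids`, and `SAMPLE_FEED` are made-up names, nothing here is a DSpace API, and a real tool would fetch the feed over HTTP, persist the seen set between runs, and be triggered by cron or a similar scheduler.

```python
import xml.etree.ElementTree as ET


def new_items(feed_xml, seen_guids):
    """Return RSS 2.0 items not harvested before (hypothetical helper).

    feed_xml   -- the feed document as a string
    seen_guids -- set of identifiers already harvested; updated in place
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        # Fall back to the link when the feed omits <guid>.
        guid = item.findtext("guid") or item.findtext("link")
        if guid is None or guid in seen_guids:
            continue
        fresh.append({
            "guid": guid,
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        })
        seen_guids.add(guid)
    return fresh


# Made-up sample feed standing in for a real blog's RSS output.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blawg</title>
    <item>
      <title>First post</title>
      <link>http://example.org/1</link>
      <guid>http://example.org/1</guid>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.org/2</link>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    seen = set()
    for entry in new_items(SAMPLE_FEED, seen):
        print(entry["title"], entry["link"])
```

Running the same pass again with the same `seen` set returns nothing, which is the property a scheduled job needs: each run deposits only what is new since the last one.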

That said, if I had it in mind to bootstrap a blog-preservation program, I’ll tell you what I’d do: write a grant proposal, probably to IMLS. Focus on law blogs (“blawgs”), because there’s already scholarship indicating that they’re being cited in law reviews and the rest of the legal literature, so like it or not, they’re part of the scholarly record. Promise an ongoing collection project and a survey of the rights landscape, as well as an open-source collection tool (that plays nice with SWORD and OAI-ORE, natch) to help other libraries archive blogs.

I think that might be a winner.