Archive for September, 2008

30 Septembris 2008

The library real-estate bubble

At the libraries/e-science symposium that Purdue held last summer, our fine hosts unintentionally dropped a shocker: they’d closed a hefty fraction of their small branch libraries. Shut ’em down flat, moved their print collections, reallocated the associated positions.

I don’t think they meant that to be a shocker; they’d clearly grown used to their new situation. For MPOW’s delegation, it was a shocker. We have a considerable network of small libraries. I’m used to that, despite my sojourn at another institution whose library system actively resisted the establishment of small libraries. I’m so used to it that Purdue’s declaration shocked me, too. How on earth did they make that happen, I wondered. Wasn’t there great wailing and gnashing of teeth?

Since then, I’ve been pondering libraries and real estate, usually not simultaneously. (Yes, I’ve been a housing-bubble watcher since moving to DC. My coworkers then assured me airily that prices would never go down in the DC area. I wasn’t sanguine, but I kept my opinion to myself… so obviously I can’t prove at this late date just what my opinion actually was. People are saying the same thing about Madison at the moment. They’re wrong too, and trulia.com is starting to prove it.) My ponderings have led me to this: I wonder if at least some academic libraries are overinvested in public-service real estate.

The other day I was walking from a meeting with a valued colleague when she started on what I believe to be the Librarian’s Eternal Plaint: not enough time in the day. We all say that, every last one of us. I do. You do. We all do. Her edition contained something I don’t always hear, though. “… and we have to keep the library open and the desks manned somehow!”

Hm. Do we? I wonder. Do we have too many desks to man? Too many rooms and buildings to monitor (and clean, and secure, and provision with terminals and e-reserves scanners and circ gadgets, and route materials to, and put signs in, and…)? Maybe some of the staff and resource overhead that goes into routine space management and service-point provision could find more productive uses?

I don’t know the answer; I’m not being a fire-breathing revolutionary again. I just think we (writ large; this is a much larger question than just at MPOW) ought to be asking the question, instead of treating the spaces as sacrosanct. I think MfPOW might have been onto something, not spreading their human resources too thinly over too many spaces. Space is at a premium on college campuses. What could libraries gain in a space horse-trade? How much would general overhead costs go down with reductions in public-service square footage? Would we really need to build offsite storage if we got rid of public-service space in some of those branch libraries and stuffed the resulting empty places full of stacks (architecture permitting)?

I don’t know. I’d be interested in Purdue’s answers, and the reasoning behind MfPOW’s resistance to space expansion. I wonder if they’ve published about it. (And off I toddle to the library literature…)

I think this question is the real challenge of what I’m hearing called “embedded librarianship” to the library world order. Embedded librarians don’t need libraries-as-places; libraries-as-places just plain don’t make sense for the kinds of services they provide. They may need different spaces altogether, but we don’t quite know what those are yet, I don’t think. (If I were an embedded librarian, I’d want a good-sized office in the same building as faculty with a conference table and a big-screen computer monitor for group meetings and consults. Plus a serious high-speed Internet connection. But I don’t know if that’s right, if it’s what I’d still want, say, two years of embedded-librarian experience later.)

My gut instinct is prodding me quite hard, insisting that a lot of us are headed in Purdue’s direction whether we know it yet or not. Way back in the day (okay, okay, during library school; not that long ago), I interviewed a campus librarian who ran a small library, asking what she wanted more of in her day. Her answer? Unstructured time with individual faculty. Bumping into faculty in the hall, exchanging news, making her presence known, offering off-the-cuff help as well as making appointments on the spot to address in-depth questions.

What I didn’t see then because I took the space’s continued existence for granted, and see now because I am questioning the need for that space, is that she can’t realistically create those semi-random meetings chained to the desk in her little library. Is chaining her there just to keep the space open really the best situation, the situation that will make her most effective in her work with and for faculty?

Mind you, this battle is lost in many of the hard sciences. A librarian chained to the desk is a librarian faculty will never even meet, because they don’t go to the library, don’t think about the library, don’t give a flying flip about library-as-place. It is probably a lost battle among many undergraduates, too.

But it’s hard to give up space. Oh, it’s hard. Library traditionalism completely aside, it’s hard. Fundamentally, space is status in the academic context—you don’t have to work in academia terribly long before you start hearing about pitched battles over campus spaces, fought by many other stakeholders besides libraries. Moreover, giving up space feels disrespectful of the planning and effort that went into securing and running the space. It can even feel like closing off options for the future. From a human-resources perspective, small libraries represent mini-career-ladders for academic librarians to aspire to, in what has been a fairly flat profession and is becoming a flatter one.

Mm. I would really like to know what went on in those high-level library admin meetings at Purdue. I bet they were real humdingers. I just don’t think it’s coincidence that Purdue is able to be in the forefront of data curation in libraries in the US. They consciously reallocated resources away from keeping spaces open and toward other things.

Roaches still scuttling

I am reliably informed that the institutional-repositories issue of Library Trends will be delayed some months. Such is professional publishing. The delay is not, I believe, the fault of the issue editors; further deponent sayeth not.

I also heard something so wild about Roach Motel that I’m not even sure I believe it (which is, I hasten to say, the result of a skeptical mind rather than any lack of credibility on the part of my informant). I’ll be able to verify (and share) when the issue comes out. If it’s true (and seriously, people, this is wild), it’s a real testament to the power of an early OA preprint.

Occasionally an uncomfortable shiver travels up my spine at the thought that Roach Motel really may change the IR game—may, in fact, already have done so. I mean, yes, of course I wrote it to do precisely that—but hell, I never thought it would work. Who listens to repository rats? Even rats who have sharp-edged rhetorical flourishes and aren’t afraid to use ’em?

It’s not all good. I embarrassed one good person doing her level best to run a good IR program, and I’ve apologized profusely and sincerely for that in private. (The only reason I haven’t done so in public is that my sense is that the person in question would rather the problem die quietly, if possible.) The published version of Roach Motel will eliminate the cause of embarrassment… but that doesn’t remove the damage the preprint did, or my culpability in letting it happen. I also made a bonehead-undergraduate error of attribution that I’m still ashamed of (and yes, Dr. Jacobs, that one’s fixed). And lastly, the bit about prophets and honor and their own country is much, much too true for my comfort.

Even so… I said some things needing saying, and they were read, and the way we’re talking and thinking about IRs is changing, and I think I’m content (pace the solecisms retailed above) with my small part in those changes.

29 Septembris 2008

JISC report on data curation

Okay, okay, so I’m finally going to have to admit—reluctantly—that most data curators have domain expertise, and that that’s the most desirable situation for researchers.

However, the Swan and Brown report offers even comp-lit majors like me a little hope:

On the other side of the coin, there are data scientists who argue that it is not necessary to be a subject expert in order to do the job effectively. There are some fundamental data science skills that are generic in nature, such as dealing with confidential research, data description and metadata, software, copyright and intellectual property rights, and data storage. Although this may be so, the core issue is that of effective communication between a data scientist and their research colleagues…

From a practical perspective, as demand for competent data scientists grows, so it will become necessary to cast the net as wide as possible. Subject knowledge is important, but so too are technical skills and people skills…

We must consider also the question of technical and computing aptitude…

Our online survey of current data scientists also showed that the data science community is evenly split on whether people skills are more important than technical skills for success as a data scientist – but then people’s opinions are often predicated on their own experiences and whether their own strengths lay toward the technical or people skills end of the spectrum. It is uncommon to find people who are excellent at both. We came across several examples of instances where people whose background was primarily computing and information technology became sufficiently familiar with the subject area of their specialist institutions that they were deemed to be effective data scientists.

In my mind, I compare this to the situation of librarians who become selectors or bibliographers in subject areas where they have no formal training. Let’s not kid ourselves, it happens, especially in the sciences. I’ve known some such—and paradoxically, they tended to be toward the more effective end of the scale. Admittedly, this is because they were people with courage sufficient to dive into an unfamiliar topic head-first, and such people tend to be naturally effective at whatever they turn their hand to. Sometimes, though, being buried in the subject is a positive disadvantage. Ever had a foreign-language teacher who was a native speaker, so embedded in the language that s/he couldn’t explain it? I have. I think this happens to scientists a lot.

I’m still reading the report, which is an evenhanded and intelligent one. I quibble with the idea that rigid terminology distinctions are appropriate at this early date, but I think the lines drawn in the report are useful ways to think about the problem as long as they’re not meant to reify it. I have most of the skills of the report’s “data librarian,” but some of the “data manager”’s skills as well. This is not a bad thing. It should be encouraged!

An open letter to Thomson Reuters

Dear Thomson Reuters,

I only became a Zotero convert quite recently. Admittedly, this is odd, since I formerly worked for George Mason University, and I had the privilege to see Zotero in pre-alpha stage and was wowed by it even then. It’s only recently, though, that I have been writing enough professionally to make a citation manager worthwhile. Zotero does what I need it to (except for translating Emerald’s web pages into citations, and I could rectify that problem myself if I chose). It will shortly do many things that I want it to. It’s good software. I like it.

You are suing a product I like. I’m a librarian. I have influence over other librarians, and (occasionally) over faculty and students. Does this truly strike you as a wise move? Truly?

I’ll let other people, those with more legal savvy than I, opine about the merits of your case. I just want to make clear to you a potentially serious consequence of your actions, no matter what happens in court.

We’re developing a piece of software locally that is, among other things, a citation database, one we envision both importing into and exporting from. We’re having a lengthy meeting about it today and tomorrow, in fact. As soon as news of your lawsuit crossed the transom, an email went out from one of our devs saying “Um… importing/exporting with EndNote could be a potentially fatal idea.”

Do you see, Thomson Reuters? Do you see? If you don’t settle this nonsense in a fashion that leaves Zotero intact, the open-source software development world will fear to interoperate with you. If EndNote isn’t already dead, this will kill it, because our little project is hardly the only one of its type. We are legion, and you have shut yourself away from us. You have no one to blame for this suicidal course but your own legal and executive team.

And if you take away Zotero, trust me, Thomson Reuters: it won’t be EndNote that I switch to.

22 Septembris 2008

A, B, and C

Required reading for repository-rats and all who love them: Palmer et al.’s investigation into institutional-repository methods and results. Given how rarely I praise research in this area, not to mention how often I complain bitterly about it, I hope my unalloyed praise for this report carries weight. It’s well-written, it’s well-supported, and it’s right in all the important ways. Like Margaret Henty’s article, which I have also had occasion to praise, it’s useful; I learned from it things I hadn’t known but have no trouble believing, and I’m an old dog as this field goes.

If you’re in the business, you can figure out pretty quickly who at least two of the three studied institutions are. (I’m still a little fuzzy on A, though I have a strong suspicion, but I know beyond a doubt who B and C are.) None of them, in case anyone is wondering, is MPOW, so I’m not feathering my own nest here.

Money quotes:

In general, the basic aims of universities in investing in IRs—to collect, preserve, and provide access to their research output—seem misleadingly simplistic compared to what IRs are actually attempting to accomplish, and what they will need to do to identify and successfully implement functions that are not redundant or risky and of high value to faculty.

This is exceedingly well-phrased, and it gives me to ponder somewhat about how I characterized the tension between repository-rats and other librarians (including but not limited to library administrators) in Roach Motel. Faced with a “basic aim” that is impossible to accomplish, repository-rats naturally nose about for other problems to solve (and the report makes that strategy quite clear, addressing its benefits and drawbacks even-handedly). I think I have traduced my ratly colleagues and myself in Roach Motel by expressing this process purely in terms of nervous rats seeking job security and self-justification, and I’m sorry for that.

The truth is, I want to be useful. We all do, all of us rats, even if not everyone is exactly like me in usefulness being a fundamental work drive, what gets me out of bed in the morning. If we can’t be useful in IRs’ “basic aim,” and often we can’t for reasons well outside our control (this being a major theme of Roach Motel), we actively look for other problems, do our best to make ourselves useful in other ways. These problems fall almost exclusively outside IRs’ supposed “basic aim,” which naturally confuses other librarians.

The intellectual property (IP) obstacles involved in populating IRs consumed significant amounts of time and resources and can be a drain on other core development activities.

No argument here. IP is a swamp, and it’s not a swamp that most IR planning processes anticipated. The report’s discussion of how faculty and IR staff build boardwalks through the swamp is trenchant and well worth reading.

Unlike other aspects of repository building, liaison networks with faculty were already a functioning part of library operations and are now serving as essential human infrastructure in IR development.

While the subject orientation of liaisons is being exploited in IR development, there seems to be much less application of their experience in collection development, management, and evaluation—areas of expertise that are highly relevant but need to be revised for the IR collection model.

Liaison librarians are essential to a well-functioning IR, and their essential-ness is most of why the maverick-manager and no-accountability staffing models are often anti-patterns. I didn’t make this clear in Roach Motel, and I now think that was another goof-up on my part. The key, as I hope I did make clear, is library administrators setting clear and realistic goals related to the IR for all their staff: repository-rats, liaisons, cataloguers, and others alike.

I tend to be a little bit more of the traditional librarian, because I don’t know TEI, and I don’t know SHTML. [I suspect that should have been 'XHTML,' and that the error was in transcription rather than originating from the librarian interviewed.] I don’t know XML. But, it’s pushed me to try to understand that a little bit better. … But what I see happening is … and actually over at the library itself, is this beautiful combination of understanding the structure of information, and understanding the code that goes behind it, and how to make it usable to the people who want to access it.

Liaison 15, whoever you are, I salute you as a valued and respected colleague! I will be quoting you to my LIS 644 students, because you are an exemplary librarian. If we ever turn up at the same conference, please introduce yourself; the drinks are on me.

Perhaps most important to the viability of IRs, however, were those [faculty] who found that the IR solved a particular information problem they faced in the everyday practice of scholarship.

I said something quite like this pretty bluntly in Roach Motel. I’m pleased to see it supported, because I could only assert it, not back it up.

Digitization was seen as a productive correlate service.

I said that, too, and I stand by it. The analog-digital divide is not something I made up. The tension comes in, I think, because digital librarianship’s usual careful, meticulous digitization and description methods cannot function here; there’s just too much material. Archivists’ “more product less process” epiphany may well be the way forward.

Depositors and liaisons alike commented on how many faculty members could not differentiate between open access scholarship and scholarship that was available through the library.

Open-access movement, this is to your address, I think. You haven’t made that nearly clear enough, and it’s a problem. What did I say once? Oh yes, this, in the context of e-reserves quarrelling: “We have to draw a thick black line connecting what faculty do and what they have access to, because right now they don’t see it.”

I can’t pull quotes from the faculty members, because everything the report quotes from them and about them is so good and so right and so real. I’ve had all those conversations before, every last one of them.

Policy and criteria-based selection and evaluation are not typical. Instead, developers have been quick to capture collections not encumbered by copyright constraints, offering access to a growing base of local technical reports, grey literature, and theses and dissertations.

This squares with my experience, and is a logical outgrowth of “basic aim” failure combined with the IP swamp. The only thing I can add is that I believe it would take a heavy load off many repository-rats’ minds if realistic selection criteria and priorities could be made explicit, such that in pulling together local tech reports, grey-lit, and ETDs (not to mention datasets), we’re confidently fulfilling our mandate instead of cautiously creeping outside it wondering what will happen to us when we get caught. Another positive outcome would be a realistic reassessment of just how much work it takes to capture peer-reviewed material legally, and resource provision to match.

By the way, any resemblance of the title of this post to an excellent episode of The Prisoner is purely intentional, ’cuz I’m just too much of a geek for that not to tickle my funny-bone (… connected to the…).

From the pale-faced moon

When I was a sophomore in high school, I fell utterly in love with Henry IV Part One’s Hotspur. This, I presume, surprises no one? Idealistic (though his ideals are somewhat constrained), passionate, honest, believing the best of his allies, eager to excel himself, possessed of considerable native ability, jawdroppingly unstrategic, even more jawdroppingly tactless, intolerant of stupid lazy bureaucrats and not politic enough to hide or move past it, eyes too fixed on the prize to give way merely because of impossible odds.

Mm. Yes. Can’t imagine why such a character would resonate with me, even then… truthfully, I’m at once intrigued and appalled that my character was apparent so early as my sophomore year in high school.

He had to die. I understood that even in high school, and I understood that more was happening than mere plot. It is to Shakespeare’s credit that he took a historical necessity and made it a dramatic necessity. Now that I am older, I understand even better; honesty and excellence without charisma and politico-psychological awareness are not proof against treachery born of calculated cowardice. It is not Prince Hal who dooms Hotspur; it is not even Hotspur’s own weaknesses, many and tragic though they are. It is Northumberland and Worcester, those canny self-preserving politicians. (Shakespeare must have liked his Hotspur at least a little, to have gloated so in Worcester getting ‘the guerdon of his guile,’ as E.R. Eddison would have put it.)

I have a weakness for well-constructed drama leading to a sense that the denouement is perfectly inevitable and perfectly fitting. Henry IV Part One is pure brilliance in that regard. The finely-wrought difference between the relationships Hal and Hotspur have with their fathers (leaving Falstaff out of it for the nonce, though I wholeheartedly agree that Falstaff is Henry IV’s foil) makes me happy. No matter how unhappy Henry is with his wastrel son, he trusts him enough to leave him be, and when Hal makes a real promise of reform, Henry accepts it and acts upon it, though every ounce of sense must suggest otherwise. Northumberland squanders his son’s excellence because he will not, cannot believe in it.

You’re probably wondering what brought this on. Last weekend was a rental-car weekend, and I decided to splurge big on American Players Theatre tickets. It couldn’t have been more glorious weather for it, and there won’t be much more glorious weather in the Frozen North, and Henry IV Part One is my favorite Shakespeare play (edging out King Lear by a nose), so off we went. The pale-faced half-moon showed to beautiful advantage during the play; I could only smile at that.

We stopped first in Festge County Park for a few good walks and a picnic dinner. The nature trails there are unmarked (except at trailhead) and single-foot narrow, but worth the traversal. Summer is meditating giving way to autumn: hickory nuts and acorns litter the ground, spiderwebs are everywhere, burr-plants attach their cargo hopefully at the least provocation, and the trees look a little tired even though most are still deep-green. We happened upon a cute little toad in the woods, and an impressive woolly-bear in the picnic area. I brought a beautiful melon, caprese salad, baba ghanouj and hummus with pita to put it on, a bag of Terra Chips and a box of gingersnaps, and we were well content. I certainly can’t complain about the tomatoes, eggplant, basil, and melon from our CSA farm!

The only complaint I have about American Players Theatre is that the audiences are too polite. I did my best to be a proper groundling, laughing and cheering and hissing as appropriate, but I tell you what, Midwesterners are just not constitutionally capable of the groundling way of being. This is a pity, because my sense is that the company is more than capable of playing well off groundlings.

I thoroughly enjoyed David Daniel’s Hotspur. Although I understand that the production was trying to ground Hotspur in his basic churlishness, and I think that a reasonable decision, I do also think it a pity they cut his speech about honor from which this post’s title is taken. Leaving aside that it’s my favorite speech in the play (edging out Falstaff’s brilliant cynical battleground response to it by a nose), Hotspur’s idealism is his fuel and his raison d’etre; the blind recklessness it generates is his tragic flaw if you are a resolute Aristotelian (which I admit I’m not; I think tragedy is a systemic and interpersonal rather than purely individual phenomenon, and I hope this post speaks to that belief). Yes he’s tactless, yes he’s thoughtlessly stubborn, yes he’s a headstrong idiot—but it’s because he is too in love with his ideals, understands them too well and too thoroughly, to bend himself to earthly compromises, and the APT production lost that.

Both Henry and Hal, by contrast, are of the earth. Henry IV does not come out any too well in his own plays: a haunted conscience-stricken regicide, a rationalizer, a battleground coward (“the king hath many marching in his coats”), a cold and censorious father. Hal learns leadership in all its essential inglory from this father, and what the lack of leadership leads to from his second father Falstaff. (Hm. Falstaff as Northumberland’s mirror. Discuss.) Like his father, Hal employs whatever rationalizations come easily to hand to justify his bad behavior; the business about being a playboy now to make his reformation later all the more amazing is just weak sauce. Like his father, Hal learns to do what the situation demands even when it hurts; the slaying of Hotspur and the rejection of Falstaff (and the hanging of Bardolph in the subsequent Henry V) all speak to that. Hotspur cannot bend so far, and so he is broken instead. Like his father (and I have argued that this is his father’s sole redeeming quality), Hal trusts his family, treating his younger brothers as brothers when he ascends to the throne, where a lesser king might have suspected or even executed them.

I would be remiss not to mention Brian Mani’s gorgeous Falstaff. It was such a perfectly comfortable performance; Mani managed the false paunch and the antiquated insults with equal easy glee, and the stage lit up with energy whenever he was on it. The hard edge here is that Falstaff and Hal are consciously and cruelly using each other, Hal for release, Falstaff for future illegal and immoral favor. When it comes down to it, neither really likes or appreciates the other. Hal grows beyond his self-indulgence when necessity demands it; Falstaff cannot, repeatedly does not, and that is why he must be turned away at the last.

It was a fine production of a play I love, even if I think the mashup needed a little work. (Henry’s death scene was, I’m sorry, interminable, and it didn’t fit thematically, either. The Greek convention of offstage deaths and Messengers could have been used to advantage here!) Though the drive home was a trifle scary owing to the lateness of the hour and intermittent fog, I wouldn’t change a thing—I had a glorious day.

Quite a few of my friends are struggling acutely with health problems of late. As I gird myself to go back to the doctor and reassess my own cardiovascular issues, I find that I feel compelled to do this kind of local travel, learn my area and hike in it and take advantage of the best that it offers, while I still can. “Able-bodied” and “financially solvent” are temporary and contingent conditions at best. It’s important that I make the most of them.

19 Septembris 2008

Adopt a publisher

I am not talking like a pirate at you today. In return for this courtesy, I would like a small favor.

There is language rattling around in Congress that would destroy the NIH Public Access Policy. The actual bill introduced by Conyers is probably moribund if not dead. The concern now, as I understand matters, is that the anti-NIH language could be snuck into another bill.

The Open Access Directory is doing its part in a way that will help us all, no matter what it accomplishes in the Congressional wrangle. OAD has a page of publisher policies vis-a-vis the NIH Public Access Policy, and they are asking us all to investigate a publisher (one with “No known policy as of…”) and update the page.

Robin Peek asked me to publicize this effort, which I am most happy to do.

17 Septembris 2008

And it was good

Earlier this week, the godly sysadmin got the last of his major hacks into 1.5, and got our test installation up and running thereupon.

Yesterday I got down to brass tacks installing my themes, which promptly broke because the Manakin devs fixed their misspelling of “standardAttributes.” (I’m not pointing and laughing. Really I’m not. These things happen.) That was a simple enough fix, as were a couple of messages.xml fixes.

And today I hacked at the bad stuff. My scoped search box was amazingly, unbelievably broken, but I got it fixed after a lot of unnecessary metaprogramming and a similar amount of very necessary cussing. (If the fit comes upon you to program JavaScript inside XSLT? For your own sanity, I urge you to resist it.)

The other thing that broke badly was my big logo hack. The problem was that Manakin doesn’t put METS metadata inside the DRI any more; it’s all called by reference. Since the logo URL lives in METS, I had to figure out how to make XSLT call the right METS file and return the URL from it. Once I had that sorted, the $context-path variable confused me rather, but Tim Donohue kindly got me straightened out and flying right, and so the logos are now fixed as well.
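
For anyone curious what "called by reference" means in practice, here is a minimal sketch of the lookup, written in Python rather than XSLT so it can be run on its own. Everything in it is an assumption made for illustration: the simplified page document, the `reference` element and its `url` attribute, the `USE="LOGO"` file group, the sample paths, and the `logo_url` helper are stand-ins, not the actual Manakin theme code. In a real theme the same step would be an XSLT `document()` call on the referenced METS file.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical, heavily simplified page document: it no longer embeds
# the metadata, it only points at a METS file by URL.
PAGE_XML = """<document>
  <reference url="/metadata/handle/1234/5/mets.xml"/>
</document>"""

# Hypothetical METS files keyed by URL, standing in for a real fetch.
METS_FILES = {
    "/metadata/handle/1234/5/mets.xml": """<mets:mets
    xmlns:mets="http://www.loc.gov/METS/"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="LOGO">
      <mets:file>
        <mets:FLocat xlink:href="/bitstream/id/42/logo.png"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>""",
}


def logo_url(page_xml, fetch, context_path=""):
    """Follow the page's metadata reference and return the logo URL, or None.

    `fetch` maps a reference URL to the text of the METS file.
    `context_path` is prepended to the result; treating it as a plain
    prefix is an assumption of this sketch.
    """
    page = ET.fromstring(page_xml)
    ref = page.find(".//reference")
    if ref is None:
        return None  # the page carries no metadata reference at all

    # Load the external METS document the page points to.
    mets = ET.fromstring(fetch(ref.get("url")))

    # Look for the file location inside the (assumed) logo file group.
    flocat = mets.find(
        ".//mets:fileGrp[@USE='LOGO']/mets:file/mets:FLocat",
        {"mets": METS_NS},
    )
    if flocat is None:
        return None  # METS file exists but has no logo entry

    return context_path + flocat.get("{%s}href" % XLINK_NS)


if __name__ == "__main__":
    print(logo_url(PAGE_XML, METS_FILES.__getitem__, "/dspace"))
```

The point of the sketch is only the two-step shape of the problem: read the pointer out of the page, then open a second document and pull the URL from there. Passing `fetch` in as a parameter keeps the example independent of any particular server.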

At this point, I have some minor XSLT and CSS tweaks to do before I’m willing to set 1.5 free, but I think I can tear through them in a day or two (although considering the number of meetings I’ve got for the next three workdays, it may take longer than that). If I get through those, I can start wading through the wishlist. Drop-dead rollout date is Open Access Day, and I’m fairly confident we’ll make that.

And it was a good day.

The things you overhear

One of my students emailed me to say that this week’s networking readings inspired him to build a print server and a home network. Win!

As the classroom gradually filled yesterday evening, I heard highly gratifying tidbits about XML validators and server space and project blogs. I’ve been asked to talk about digital signatures and VPNs, and I’m looking hard at overhauling my lecture on security later in the semester.

Somebody in that classroom is doing something right. It’s not necessarily me! But somebody is.

16 Septembris 2008

Personas and boxes

A friend of mine dropped an email to say that I should have been cited in this examination of IR-related Cooperesque personas. Oh, please, who cites blogs in stuffy old librarianship? I’m cool. Call it great (or at least thoughtful) minds thinking alike.

The money quote from that article is this:

It was assumed that the users desired an open-access archive of primarily published research materials generated by the faculty and graduate students, but the users actually desired a network where teaching and learning materials are shared, potential collaborators are identified, and participants’ research is promoted to institutional colleagues.

It was assumed. “Mistakes were made.” Mm-hm. They didn’t need to cite my personas. It wouldn’t have hurt them to cite Roach Motel on the subject of faulty ideology, or faculty not using something that has no value to them, however. They get a pass, though, because Roach Motel is still only out in preprint.

The article is worth reading in its entirety. They did the work I didn’t and couldn’t, pulling together enough user interviews to base their personas on something other than instinct and anecdote, and to their everlasting credit, they didn’t flinch away from conclusions that are not encouraging for IRs as they are designed and run today. The chief problem with the article is that none of their personas is a librarian. It’s impossible to understand the situation of IRs without the librarians who authorize, plan, build, and run them. Omit them and you are left with “it was assumed.” Assumed by whom, pray, and why? And to put a Harnadian spin on the matter, if we build faculty a whizbang collaboration space that doesn’t actually make any literature open access, is what we’re building really an IR? Will it achieve what we (we librarians, remember us?) wanted to achieve in the first place?

Anyway. Read, ponder, learn.

Also click over to Mark Leggott’s Repository in a Box. Built atop Fedora (a point I will return to shortly), this is a mashup with Drupal that faces head-on the reality that them that has (content) gets (content). Leggott has built a system that gathers citation data, freeing faculty of the need to enter it themselves and giving them incentive to correct and augment it.

I’m dubious about the strength of that incentive, personally, given the English experience with the Research Assessment Exercise. Les Carr can opine more fruitfully than I on that subject. However, any incentive is better than none!

The technical underpinnings of this work read as pretty solid to me. The one link I’m mildly dubious about is going straight to FOXML from RefWorks; on principle, I would want to go through SWORD, but sometimes pragmatism trumps principle—SWORD isn’t completely baked yet. I look forward to the release of this software, because I’m enough of a Drupalista to be able to get along with it, and I’m just starting to learn a bit about Fedora.

I get the sense sometimes that the decision to run an IR on DSpace is, in the United States at least, a variant on “Nobody ever got fired for choosing Microsoft.” Of the three open-source repository packages, it is the most demanding on hardware and (ironically) the hardest to install and get running. (I got Fedora running on my desktop Mac at work in fifteen minutes. Seriously. Try that with DSpace, I double-dog dare you.) Compared with Fedora, DSpace is rigid and all but impossible to stack other technologies on, as Mark Leggott has done with Drupal. Compared with EPrints, DSpace is an unusable mess, particularly on the back end.

(If you sit Chris Gutteridge down with a beer, as I was able to do in Edinburgh, he will happily tell you that he revamped the EPrints deposit system for usability after trying to deposit something in an EPrints repository and being appalled at the number of clicks and keystrokes it took. He did a good job of it, too. On my more evilminded days, I have wild daydreams of forcing the entire DSpace development inner circle to screenscrape back issues of a journal or newsletter and then deposit every single last article through the DSpace web UI, one… by… one. Much would be learned, I believe.)

And if you’re even thinking about building the system that would satisfy the personas in the Maness et al. article—forget about building it over DSpace. Just forget it. Sheer madness. Fedora is the right choice, the only possible choice.

For the last month, I’ve been running an ad-hoc requirements-gathering process on the DSpace mailing lists and IRC channel. I’ve learned a few things from it. One is that getting librarians to speak up in a discussion even faintly technical is like pulling all your teeth at once. I am quite unhappy about this; never mind that it doesn’t speak well at all for my profession, it’s ludicrous to ask a passel of developers to read our minds. No wonder the ILS is in such a sad state (open-source aside). Relying on, or even hoping for, librarian input can be just plain deadly.

The other thing I’ve learned is that the DSpace development process is significantly underresourced given the state of the codebase and the needs of the stakeholders. I don’t have a quick fix for this (and as I must, I have mythical man-months in the back of my mind) or even a useful suggestion. I can only observe that it’s standing in the way of progress. I can gather all the requirements I want—and despite my grousing, I think I have gathered quite a bit of useful input in the last month—but it don’t mean a thing if none of it can get built, and I’m currently hearing a lot of “we can’t build this; we’re volunteers” from the developers.

As always, caveat lector. Caveat emptor as well. If I were Harvard especially, I’d be looking really really hard at Mark Leggott’s mashup, because it goes a long way toward nipping a potentially damaging faculty backlash (against extra work) in the bud. Try that on top of DSpace. Even with SWORD, which at least makes something like that possible, it’s a tall order.

In all honesty—I’m having a much easier time of it learning how Fedora ticks than I ever had learning DSpace. Partly that’s because I’m an old unreconstructed markup geek, so little XML files hold few terrors (and FOXML is actually pretty elegant, as these things go; it’s definitely nicer than METS), but partly it’s the effect of a sanely-designed system.

Anyway. That’s what’s caught my eye the last few days. Read, ponder, learn.