Archive for December, 2005

13 Decembris 2005

Good morning, housemonkey

Didi came in for her morning trample bright—well, dark and early this morning. Yawning, I got up to feed her, and check in by IM with a friend of mine in Australia.

Turning on the light, I discovered that one of them had hurled on my winter cloak. Good morning, housemonkey! Isn’t it a lovely morning!

12 Decembris 2005

Nouns or verbs?

Has somebody written the ultimate guide to whether and when to use nouns and verbs in labelling on websites? I’ve been trying to clean out language cruft from the DSpace UI, and I keep bumping my nose against this question.

Take, for example, an item’s handle, or citation URL. (Yes, I know it’s a URI. Hush and let me explain.) DSpace’s standard message for it is “Please use this URL to cite or link to this item: [item URL].” This doesn’t seem so bad, but every time I look at it I feel the urge to ask why. Why this URL and not the one in the address bar? Citing in a paper, or in a presentation, or what? And how is a cite different from a link?

So I decided to do a trial run with a noun phrase, not a verb: “Permanent citation URL.” That tells people what it is and what they can do with it, and gives a hint to the wise as to why it differs from the URL they can see in the address bar. Plus, it’s eight fewer words, and I am all about fewer words in the UI. If I get complaints, or if I can’t stand it any more, I can always switch back.

But I don’t have a justification for my decision other than instinct, and since my instinct isn’t reliable, I’m asking: when does one choose to label with a verb or action rather than a noun or description?

One possible criterion is whether the user is expected to do something on the actual website with the information. If so, use a verb; if not, use a noun. I like this theory, as it would neatly justify what I did. Or there’s verbs for links, nouns for all else. But I don’t know that that makes sense in all contexts. Somebody help me out here.

DSpace-devel, week of 5 December 2005

Quiet week on the list. The most interesting post was a “call for rants” about integration of DSpace (and by extension, other IRs) with virtual-learning-environment software such as Sakai. Exactly where the poster thought the joins were was left rather vague (something about “grey literature,” but why would one use a VLE for grey literature?)… but I am seriously wondering if this is in part a response to the e-reserves royalty situation.

One response indicated that cross-repository search is being added to the VLE bag of tricks, in addition to putting repositories into metasearch engines.

A few additions to a permathread about switching between HTTP and HTTPS as needed, to save server resources while maintaining the security of DSpace logins. Turns out to be a hard problem. If you’re worried about it, serve everything over HTTPS (as many installs already do).

Implementors using SRB to manage a DSpace assetstore need to be aware that SRB doesn’t calculate MD5 checksums on files the way DSpace does. The workaround is to calculate the checksums via a batch process, and insert them into the DSpace database.
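For the curious, the batch half of that workaround might look something like the sketch below. It only computes MD5 checksums for the files under a directory; it does not touch SRB or the DSpace database, since the post doesn't describe that step. The function names (`md5_of_file`, `checksum_tree`) are made up for illustration, and the sketch assumes the files are reachable on a local filesystem.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of one file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_tree(root):
    """Map each file's path (relative to root, '/'-separated) to its MD5 hex digest.

    Hypothetical helper: assumes the files to be checksummed live under a
    locally readable directory.
    """
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): md5_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
```

Calling `checksum_tree` on a directory gives back a plain dictionary of path-to-checksum pairs, which a separate script could then load into the database however the local install requires.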

Notable patches: Creation of a “community administrator” role.

Where’s my T-shirt?

I already have my “Information to the People” T-shirt, complete with upraised fist and “Librarian Power” on the back. (They were SLIS’s 2005 design. I have two of them, actually. They were just too cool to pass up.)

Now I want a Radical Militant Librarian T-shirt. (New York Times, registration required, yadda yadda.) Or at least a LiveJournal icon.

Sounds like a job for the Librarian Avengers.

On a slightly more serious note, it strikes me as a good thing that the OIPR seems to be afraid of us. Thwart NOT the librarian.

11 Decembris 2005

The Microsoft Word Nobbling Council

I swear, if I ever in my life again have to clean up craptacular HTML expelled from the nether regions of Microsoft Word, I am going to have to follow the exalted example of the president of the Mid-Galactic Arts Nobbling Council and gnaw my own leg off.

This stuff is truly, madly, deeply vile, and it resists being cleaned up like a three-year-old making mud pies.

This rant has been brought to you by a TAG contract that I probably should never have signed and am incredibly thankful to say runs out at the end of this year. Tech writing, especially when it consists of gluing together bits and pieces from sixteen different sources? Is very not my thing. Rah-rah those who do it without going bananas. Me, I’m going bananas.

9 Decembris 2005

DIDL ordering?

This came up on #code4lib, and I tried my librarianly best to come up with a definitive answer, but the bloody DIDL spec is not helpful, so I’m throwing myself on the mercy of the LazyWeb.

Is there an ordering semantic in MPEG-21 DIDL? For example, I have a scanned book in which each page is a separate file. Can I create a DIDL file that will spit back the pages in the correct order?

My gut says “no,” based on no explicit language about ITEM ordering and no implicitly or explicitly ordered examples in the back of the spec. But I haven’t implemented this puppy. Does anybody know for sure?

Snow day!

Well, not so much snow. More like “snow mixed with hideously bloody dangerous sleet and ice, which mixed with clueless Southern drivers would mean extreme danger to life and limb if MPOW were open, which thankfully it isn’t.”

Nice timing, too. I have a lot of work to do for a TAG client.

8 Decembris 2005

What will and won’t work

At DASER I was caught by Stevan Harnad’s idea of an email link to the author on a repository page for embargoed content. I buttonholed him briefly to confirm my understanding of the idea. He told me that the author’s email address was available as part of the item’s metadata.

“Well, the submitter’s name,” I corrected (respectfully, I hope). “We get some third-party submissions.” Like every other repository on the face of the earth.

I had to repeat myself before he quite understood me. An annoyed look crossed his face. “Well, that won’t work, then!”

It won’t work the easy way, true. It’ll still work, though, and I think I know how (at least for DSpace). Make a ticky-box on the first page to select the embargo, and then add a couple of inputs if that ticky-box is ticked: one for the appropriate email address, and (possibly) one for the length of the embargo. (Me, I’d rather see only one embargo length, but the rest of the world may see matters differently: the NIH embargo looks to end up six months long, whereas the typical thesis/dissertation embargo is one year long.)

As with most e-business websites, the best way to handle the email address is probably a “Use mine!” ticky-box with an input box for backup. And embargoing should probably not be available to everybody; best to activate it on a per-collection basis.

The neat thing about this system is that it handles multiply-authored material where the submitting author is not necessarily the one who ought to get requests. Not to mention, of course, the third-party-submission use case that got me thinking about this in the first place.
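The form logic described above can be sketched roughly as follows. This is not DSpace code and uses no DSpace API; `EmbargoRequest`, `resolve_embargo`, and every field name are invented for illustration. The twelve-month default is also just an assumption, borrowed from the thesis case, and the email check is deliberately crude.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EmbargoRequest:
    """Hypothetical shape of the extra submission-form inputs."""
    embargoed: bool = False            # the embargo ticky-box
    use_submitter_email: bool = True   # the "Use mine!" ticky-box
    contact_email: Optional[str] = None  # backup input box
    months: Optional[int] = None       # optional embargo length


def resolve_embargo(form, submitter_email, collection_allows_embargo, default_months=12):
    """Return (contact_email, months) for an embargoed item, or None if no embargo was asked for."""
    if not form.embargoed:
        return None
    # Embargoing is switched on per collection, not offered to everybody.
    if not collection_allows_embargo:
        raise ValueError("embargo is not enabled for this collection")
    if form.use_submitter_email:
        email = submitter_email
    else:
        email = (form.contact_email or "").strip()
    # Crude sanity check only; real validation would be stricter.
    if "@" not in email:
        raise ValueError("a contact email address is required for embargoed items")
    months = form.months if form.months is not None else default_months
    if months <= 0:
        raise ValueError("embargo length must be a positive number of months")
    return email, months
```

A third-party submitter would untick "Use mine!" and supply the author's address, so requests for the embargoed item go to the right person regardless of who pushed the buttons.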

(I did discover yesterday that the license I’ve been using actually has language to cover third-party submissions. I’m thrilled; that cuts down amazingly on my paperwork hassles.)

Once DSpace has a plugin architecture, I daresay I’ll take a whack at coding this up. Shouldn’t be too hard; the hard part (well, the part I don’t know how to do, anyway) is baking in the embargo release automatically.

The image gallery, now… that is giving me fits, and I’m not even sure I can come up with something usable. The problem is the unpredictable relationship between an item and its image bitstreams. In some of the image collections my repository’s got, a typical item has two image bitstreams, where Image 1 is in a different (usually higher-resolution and lossless) format, but is otherwise identical to Image 2. An image gallery would pick one bitstream to display and forget about the other one, or have some UI widget to pick out the other one for closer examination. If it displays them all (and especially if it’s smart enough to use DSpace’s media-filter to step down high-res images), the user will (rather annoyingly) have to click past several versions of the same image.

In other collections, an item will have several image bitstreams that form part of a whole: for example, scans of several pages of a single manuscript. The image gallery should grab out all of these and display them in the correct order. How?

And how the heck does the image gallery know which kind of an item it has?

The image gallery I’m meant to be imitating clearly works because of an unstated understanding about one-image-bitstream-per-item. I don’t have the luxury of making that assumption, so I may have to nix the idea altogether for now. There’d be a way to hack it if DSpace used the “bundle” construct to group different versions of the same image, but I’m not confident enough to go that deeply into core code.
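One possible heuristic, sketched below, leans entirely on filenames: bitstreams that share a name apart from the extension are treated as versions of one image, and the resulting groups are put in "natural" order so that page2 sorts before page10. That is a big assumption, since nothing in DSpace guarantees submitters name files this way, and the sketch works on plain filename strings rather than any DSpace object. All function names here are hypothetical.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath


def natural_key(name):
    """Sort key that compares digit runs as numbers, so 'page2' sorts before 'page10'."""
    # re.split with a capturing group alternates text (even index) and digits (odd index).
    parts = re.split(r"(\d+)", name)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def group_bitstreams(filenames):
    """Group filenames that differ only by extension; return groups in natural order."""
    groups = defaultdict(list)
    for name in filenames:
        groups[PurePosixPath(name).stem].append(name)
    return [sorted(groups[stem]) for stem in sorted(groups, key=natural_key)]


def gallery_slides(filenames, preferred=(".jpg", ".jpeg", ".png")):
    """Pick one file per group to display, preferring web-friendly extensions."""
    slides = []
    for group in group_bitstreams(filenames):
        choice = next(
            (n for ext in preferred for n in group if n.lower().endswith(ext)),
            group[0],  # fall back to whatever the group has
        )
        slides.append(choice)
    return slides
```

Under this scheme an item holding a TIFF and a JPEG of the same photo yields one slide, while a manuscript scanned as numbered pages yields one slide per page in reading order. It falls over the moment filenames stop being informative, which is exactly why the "bundle" approach would be sturdier.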

(And y’all Fedora-ites can just hush up. I know it’d be easy to build this on top of Fedora.)

Eh, well. I’m backburnering the image gallery for now. We’ll see what develops.

Moving from print to e-journals

An Inside Higher Ed article provides a solid, unemotional look at the need for journal publishers who haven’t started negotiating the transition from a print-only world to do so eftsoons and right speedily.

I’m still reading through the actual report (still clearing a work backlog), but my sense is it’s good strong stuff.

I’m annoyed, though, at sentences such as “There is as yet no archiving solution for electronic periodicals, so it is not possible to calculate the costs or determine how they will be borne.” This is ludicrous. There’s no reason on this earth the emerging institutional-repository infrastructure can’t take on this load. Just because hardly anybody’s gone that route yet doesn’t mean the path isn’t worth treading.

7 Decembris 2005

Miss Congeniality

Hey, um, who did this? Seriously. I don’t have any idea. If it was you, email me, won’t you? I’m touched, and I’d like to thank you personally.

I shan’t claim any larger significance for the honor than that I have some very kind and loyal readers, but I shall observe that this came my way despite my stubborn unwillingness to subscribe wholly to any of several sets of The Rules of Blogging. Don’t let anybody tell you how to blog (unless they’re paying you to blog a certain way, and if they are, you did disclose that, right?). Blog your own way, and good luck to you.

Now. To business. I’m going to throw the vote, as best I can. If you’re the type of person who votes in the Edublog Awards, and you were considering a vote for Caveat Lector, please vote for Jessamyn West’s librarian.net instead. I admire Jessamyn with all my heart; she’s a librarian’s librarian, her eyes always on her patrons, her voice raised on their behalf. Not sure what a librarian does? Watch Jessamyn. She’s what we all ought to be.

So help me send that tiara and bouquet her way. I’ll just be here in the back row with this paper bag over my head, grinning my big silly ugly grin underneath it.