Archive for January, 2006

31 Ianuarii 2006

DSpace-devel happenings

So I say I’m going to start a series, and not another post since then. Go me.

Anyway, a lot has been happening on dspace-devel, some of which has spilled over onto dspace-tech and even dspace-general. The biggest news is that we probably can’t expect to see DSpace 1.4 until March at the earliest (and personally, given the scope of what’s in progress, I wouldn’t place a large bet on seeing it in March).

What’s the holdup? Well, active discussions have been had (and are still being had) about making DSpace modular, and offering a standard way to create, describe, and integrate plugins. This is not a simple hack, not at all—it has major implications for DSpace’s future development, such that getting it wrong would be a tremendous problem. So the developers are trying extra hard to think it through and get it right.

For those technically inclined, the current stumbling block appears to be coping with all the myriad little config files that will come with plugins—everything from Messages.properties to actual database schema changes. Thoughts range from homegrown merge tools to entire IDEs.

Other things popping up with frequency in the patch pipeline include a commenting add-on, a lovely hack that will allow much more flexibility in defining submission workflows, an on-the-fly interface internationalization patch (coming from Canada, I believe, quelle surprise), a controlled-vocabulary patch, a search-improvement patch (unlimited search arguments; the limit is currently 3), and a “community administrator” patch that should help delegate work better.

30 Ianuarii 2006

Not gonna fly

Never try to explain to your mother what a shoggoth is. Just doesn’t fly.

I had a pretty good weekend, though. I’m exhausted and achy and I’m not sure how I’m going to make it through work and rehearsal today, but that’s all right.

Because how often do you get to try to explain to your mother about your plush shoggoth?

27 Ianuarii 2006

Like that, only more so

The wiki project is—glacially—moving. I’ve settled on MediaWiki, and am grimly beating it with rocks (with Adrian’s patient assistance) until it does what I need.

Minds thinking alike, though—this is exactly what I’m trying to do, just with a wider focus. Holy howlers, they’re even structuring it the way I want to. (Via Open Access News, as usual.)

Give me another couple of weeks. I will get this done. I swear. (Even though my parents are visiting this weekend…)

26 Ianuarii 2006

Preservation musings

There’s a debate in progress on the DSpace tech and development lists about the costs and benefits of modularizing the DSpace architecture and providing plugin hooks. The list gurus are doing quite a remarkable job of heading off unproductive arguments at the pass, I must say.

Gives me to think, though, about this Dan Cohen post which I promised to respond to a long time ago and never located sufficient round tuits for. Also gives me to think about library data silos and how lovely it would be to breach them, and which silos ought to be targeted first—and I had a huge post in the pipeline about this, but I trashed it because Roy Tennant and Lorcan Dempsey are saying everything I had to say better than I could have said it.

“Preservation,” like most other words, has multiple meanings. DSpace’s definition of the word is “once you’ve got it, hang onto it like grim death.” Which is a good definition. Dan Cohen’s definition is different, though, more like “grab it before it disappears!” Which is another good definition. And DSpace is horrible, horrible, horrible at that.

(I do find myself wondering about permissions issues with regard to projects like the Hurricane Digital Memory Bank. I’ve had to do all kinds of tapdancing I don’t especially care for to be sure I’m not exposing MPOW to copyright liability. I don’t know what CHNM is doing about the problem—but if they aren’t doing anything, I strongly recommend they talk to the nice university lawyers, or the Copyright Office.)

Something I would love to bolt onto DSpace is a mudroom, a front lobby, a sandbox. Someplace to stash a whole bunch of files and know they’re being looked after (checksums, backups, format-checking on upload, assigning a temporary identifier, et cetera) until I have time to do a proper workup on them. DSpace’s concept of “workflow” just doesn’t extend far enough back in the process: you can’t dump a file in without entering its metadata first, and sometimes the metadata entry just plain needs to wait. Not least because of permissions issues!

Such a thing would help solve Dan’s problem, I believe, and it’d solve a lot of my workflow problems, too.
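
For the technically inclined, here is a minimal sketch of what the front door of such a mudroom might record. Nothing here is DSpace code: `StagedFile`, `stage`, and `still_intact` are hypothetical names, and SHA-256 is just one reasonable checksum choice. The point is only that a file can be fingerprinted and given a temporary identifier before anybody types a word of metadata.

```python
import hashlib
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class StagedFile:
    """A file parked in the hypothetical staging area, with no metadata yet."""
    temp_id: str        # temporary identifier, not a permanent handle
    filename: str
    sha256: str         # checksum recorded at upload time
    received: datetime


def stage(filename: str, data: bytes) -> StagedFile:
    """Record a checksum and a temporary identifier for an incoming file."""
    return StagedFile(
        temp_id=f"tmp-{uuid.uuid4().hex}",
        filename=filename,
        sha256=hashlib.sha256(data).hexdigest(),
        received=datetime.now(timezone.utc),
    )


def still_intact(staged: StagedFile, data: bytes) -> bool:
    """Re-check stored bytes against the checksum taken at upload."""
    return hashlib.sha256(data).hexdigest() == staged.sha256
```

A periodic job could call `still_intact` on everything in the sandbox, which covers the "know they’re being looked after" part. Backups and format-checking are left out of the sketch.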

As for modifying-to-preserve, I think DSpace has the right answer to that, frankly: don’t change and delete (as Dan suggests), change and add. DSpace’s media filter lets one build in automatic file transformations, with the result of the transformation added to an item as a new file. It would be entirely possible to bring in miscellaneous junk images and transform them all to (say) PNGs of the same bit depth and quality, without throwing the originals away. This is a good thing. You just never know when that original, however junky, will come in handy.
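
The change-and-add idea is easy to sketch. This is an illustration, not the media filter’s actual API: `Bitstream`, `Item`, and `add_derivative` are made-up names, and the `transform` argument stands in for whatever real conversion (say, GIF to PNG) an imaging library would do.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Bitstream:
    name: str
    data: bytes
    derived_from: Optional[str] = None  # name of the original file, if any


@dataclass
class Item:
    bitstreams: List[Bitstream] = field(default_factory=list)


def add_derivative(item: Item, source_name: str, new_name: str,
                   transform: Callable[[bytes], bytes]) -> Bitstream:
    """Run transform on one file and append the result; the source stays put."""
    source = next(b for b in item.bitstreams if b.name == source_name)
    derived = Bitstream(new_name, transform(source.data),
                        derived_from=source.name)
    item.bitstreams.append(derived)
    return derived
```

Calling `add_derivative(item, "scan.gif", "scan.png", convert)` with some hypothetical `convert` function leaves the item holding both files, with the new one pointing back at its source.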

The problem at that point is that DSpace’s file addressing is godawful, as is its concept of the relationship between items and files (and, for that matter, between items and items—there’s “ispartofseries” in Dublin Core, and that’s about it). Here are some things I can’t do:

  • Reliably address files in DSpace from outside. (It’s possible; it’s just ugly, and its URLs aren’t 404-proof the way handles are.) So I can’t, say, build a nice pretty whizzy photograph library that uses DSpace as its back-end. The only solution is to get all the files out of DSpace and store them elsewhere. Which is frankly rather wasteful, and leads to stupid pointless fights about who stores what where. (Yes, been-there-done-that. Don’t even ask.)
  • Build a pretty whizzy photograph library into DSpace.
  • Do content-negotiation with client software to pass the right file format over. This is even more a pity because DSpace’s media filter goes a long way toward solving nasty content-negotiation problems.
  • Concatenate files for presentation to the client when that makes sense.
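
To make the content-negotiation wish concrete, here is a rough sketch of picking a file based on the client’s HTTP `Accept` header. It assumes each item can supply a mapping from MIME type to filename, which is a hypothetical structure and not something DSpace exposes this way. The matching is also simplified: exact types, `type/*`, and `*/*` only.

```python
from typing import Dict, Optional


def parse_accept(header: str) -> Dict[str, float]:
    """Parse an HTTP Accept header into {media_type: q-value}."""
    prefs: Dict[str, float] = {}
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0  # q defaults to 1 when the client gives none
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_format(accept: str, available: Dict[str, str]) -> Optional[str]:
    """Return the filename whose MIME type the client prefers, or None.

    `available` maps MIME type -> filename for the files in one item.
    """
    prefs = parse_accept(accept)
    best_name, best_q = None, 0.0
    for mime, filename in available.items():
        major = mime.split("/")[0]
        q = prefs.get(mime, prefs.get(f"{major}/*", prefs.get("*/*", 0.0)))
        if q > best_q:  # ties keep the first file encountered
            best_name, best_q = filename, q
    return best_name
```

With the mp3 and WAV versions of a poem both on offer, a client sending `audio/x-wav, audio/mpeg;q=0.5` would get the WAV, and a client that only accepts `image/png` would get nothing.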

I need DSpace to understand that multiple files in an item may have different relationships to each other. They may be different formats of the same content (as with my poetry collection, which contains a lossy mp3 plus either an AIFF or a WAV for the non-lossiness of it all). They may be components of an overall whole, as with websites. DSpace kinda-sorta groks websites, but things other than websites have that structure and ought to have it respected. They may be ordered sequences, as with the hundreds of page-image TIFFs in a couple of the books I have that were scanned for preservation. They may be slightly different views of the same intellectual object; I don’t have anything like this that I can think of, but photographs may want to be stored this way.

Or—and here’s the really fun part!—some combination of the above. Some of my poetry pieces have an introduction in separate soundfiles. So there’s two mp3s and two WAVs; each file stands in a format relationship to one file and a sequence relationship to another. And I cannot represent that in DSpace. At all. It’s insanely frustrating, because it locks DSpace’s files behind a wall of incomprehensibility such that it’s all but impossible to build kewl ’leet public-facing services and websites from them. Why are we locking out the remix culture we ought to be embracing?
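
Here is roughly what I mean, as a sketch rather than a proposal for DSpace’s data model. The class `ItemFiles` and the relation names `"format-of"` and `"next"` are invented for illustration; the idea is just that each file can carry more than one kind of typed link to its siblings.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class ItemFiles:
    """Hypothetical record of typed relationships among one item's files."""

    def __init__(self) -> None:
        # (relation, source file) -> set of target files
        self._links: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def relate(self, relation: str, source: str, target: str) -> None:
        self._links[(relation, source)].add(target)
        if relation == "format-of":
            # Alternate formats are symmetric; sequence order is not.
            self._links[(relation, target)].add(source)

    def related(self, relation: str, source: str) -> List[str]:
        return sorted(self._links.get((relation, source), set()))

    def playlist(self, first: str) -> List[str]:
        """Follow "next" links from `first` to build a presentation order."""
        order, current = [first], first
        while self.related("next", current):
            current = self.related("next", current)[0]
            if current in order:  # guard against accidental cycles
                break
            order.append(current)
        return order


files = ItemFiles()
files.relate("format-of", "intro.mp3", "intro.wav")
files.relate("format-of", "poem.mp3", "poem.wav")
files.relate("next", "intro.mp3", "poem.mp3")
files.relate("next", "intro.wav", "poem.wav")
```

With those four links in place, `files.playlist("intro.mp3")` walks the mp3 sequence and `files.related("format-of", "intro.mp3")` finds the WAV twin, which is exactly the pair of questions a public-facing player would need answered.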

The danger, of course, is that adding these relationships adds a layer of complexity to DSpace management. Fine, okay, I agree with you. But at least give me the option to enrich my stuff in this fashion!

Maybe when the plugin architecture arrives. In DSpace 1.4. Which we won’t see until March at the earliest. Sigh.

Prefer experience to education

So I’ve got another couple friends now who are dipping their toes into the first-job water, and finding that water mighty cold. Fairly typically, they are thinking about going back for post-graduate education.

I don’t think that’s wise, and I’ve said so. Remains to be seen whether they listen.

I’ll say it right out. Once you have a bachelor’s degree, experience matters more than education. A ton of education without experience looks suspicious to employers, and I can’t say I disagree, because of all the folks who flee into graduate school to avoid the bruising slog that is the first-job search. Who wants to hire frightened people? (Answer: nobody you want to work for, trust me.)

Thing is, the search for a first job is always, always, always a slog. If you’re going to grad school because you think a job will just magically fall into your lap afterwards, please, please don’t. It won’t happen, and you’ll just end up hating graduate school and hating yourself for going.

Go do some work first. There’s lots of work out there that doesn’t require any particular specialization. If you don’t know where it is, your first job is to find out. And yes, a lot of that work sucks, but that doesn’t mean you don’t learn from it.

My new-librarian acquaintance is still looking. I know her pretty well, and as much as I want her to find a job, I find myself stopping short of actually recommending her to people. I myself wouldn’t hire her. It’s not even the lack of experience—it’s all the rough edges that a first job knocks off you that she still has, and the ceaseless virulent childish cynicism that would make her terrible to share an office with.

I don’t know how to fix that. Any suggestion from me that her attitude is part of the problem is only going to be met with—you know, attitude. But I’d lay odds that if she’d had a few jobs in her early 20s, she wouldn’t be languishing without so much as callbacks now.

Given the choice between a bad job and a bad graduate education, always take the job; at least it pays you, whereas you have to pay professors to maltreat you. Given the choice between a bad job and a decent graduate education—if you have no work experience, take the job. It’ll be worth far more in the job market than the additional degree. I absolutely guarantee it.

How do you say w00t in Latin?

Caveat Lector lost a little bit of flavor after the latest webhost move—I was simply too lazy to re-hack my Latin dates back in.

Thanks to my good buddy Adrian, though, they’re back, and they’ll likely stay back, because they’re an actual WordPress plugin now. This makes me happy, in a humanities-geeky sort of way.

So, hey, if anybody in Melbourne, Australia is looking for an obliging bloke with a CCNA, Linux knowledge, and good web skills including PHP, I know somebody…

24 Ianuarii 2006

Do over!

Last night was not my most scintillating chorus rehearsal ever. We spent all kinds of time on the vile Ravel “Daphnis et Chloé,” which I loathe and abominate more every time I work on it. It is a horrid piece, and if I ruled the world, henceforth every composer wanting to use voice as an orchestral instrument would be taken out and riddled with conductor’s batons St.-Sebastian-style pour encourager les autres. It was a dumb idea when Tchaikovsky did it (and yes, I sang that one in college), it was a dumb idea when Ravel did it, and it’s still a dumb idea.

And then we moved on to the Holst, which I had dutifully taught myself in roughly the same way children teach themselves to read: lots of repetition, and whole phrases at a time. Understand me, I had it cold, words and music together.

Then our Fearless Leader made us sing it on beat-counts instead of the words, and I got loster than a lost thing because it’s impossible to keep whole phrases in mind that way. Bah. It was a bad rehearsal.

Half my workday got eaten with a consortium meeting, and I came back to find that Postgres had gone down hard—fortunately, only on the test server, where it’s no big deal. So now I know some things about Postgres administration that I didn’t before (such as, how to make it log!).

And I’m behind on work communications, which is something that I’m totally going to spend tomorrow fixing, because I keep my mental must-contact list in my brain and I’m getting troubling signals of a buffer overflow.

All in all, I would really like to do the last twenty-four hours over, but I’ll have to settle for making the next twenty-four a bit better.

23 Ianuarii 2006

Wiki software that fails to suck

Is there such an animal?

MediaWiki is a bloated monster, impossible to configure. I’ve spent two weeks trying and failing to coax PmWiki into doing what I need it to. MoinMoin flatly refused to install. What the hell? If this is the future, I’ll take the past, thank you.

Here’s what I need:

  • Disallow anonymous editing. This is a must-have.
  • Edit tracking, and easy reversion.
  • A look-and-feel I can mess with. A good theme library is a big plus, because I hate messing with things.
  • A sensible, usable admin interface.
  • Documentation in English, not barely-literate geekese. Don’t point me to wiki-based documentation, either. Talk about things that conspicuously suck.
  • An install that doesn’t break my brain. Or my webhost.

Nice to have, but not essential:

  • Clean URLs.
  • Clean markup.
  • Written in Python.

I am really starting to be annoyed. I like working with wikis, but I had no idea administering one was such a pain in the nethers. Suggestions, anyone?

20 Ianuarii 2006

OJS, PHP, and OSX

(I just felt like TLAing everybody to death today.)

Seriously, now. I installed Open Journal Systems on my testing server today (OS X Server 10.4 with all the updates). To do so, I had to install entropy.ch’s PHP5 package, which was no problem at all. I made a throwaway Postgres database, did the setup, and everything worked…

… except that it was dog-slow. Painfully slow. Ten or fifteen seconds just to see a page. Just intolerable.

I tried a few fixes I saw suggested here and there that didn’t work. Here’s what did work: disable Performance Cache. Yes, how’s that for irony: Performance Cache was killing my server’s performance!

Here’s how to do it: go into Server Admin, select the Web option on the offending server, then click the Settings tab. From there, pick the Sites tab, double-click on your site, and select the Options tab. Uncheck “Performance Cache,” save your settings, and kill and restart Apache.

Fixed it right up, and now I’m fiddling with vapor-journals to my heart’s content, and pondering how one would use DSpace instead of the native OJS storage/archiving mechanism. Enjoy!

19 Ianuarii 2006

Small public squee

It’s probably a measure of my fit with the profession I chose that I have a laundry-list of heroes in it. I tend to go for people who challenge received wisdom intelligently, and the more lively and outspoken they are about it, the better.

(Yeah, yeah. I plead no contest to the charge of narcissism, even though I make no particular claim to intelligence. It does go to show, though, that it’s possible to be loud and occasionally contentious in this field and survive. Even prosper.)

I’ll cop to squeeing a bit when I get an email from one of my heroes. Walt Crawford and I are just about sparring buddies now. I loved hearing from Barbara Quint. And today I got a message from Andrew Pace, whose dry wit about his vendor background in the beginning of The Ultimate Digital Library amused and cheered me when I was considering taking a position with a library vendor. I’ve enjoyed reading his American Libraries columns, too; sometimes (especially recently) they were the only worthwhile read in the whole mag!

I’m told that development of the NCSU catalog is ongoing, which is no surprise. The nice thing about software projects is that there’s always room (if not necessarily funding) to improve them. I fully anticipate that my nitpicks as well as input from such luminaries as Karen Schneider will be considered and addressed where possible.

(Oh, and yes, the report from California is a must-read, even though it’s a PDF. My jaw dropped to find such radicalism in a buttoned-down library report! It’s wonderful stuff.)

Anyway, I was happy to hear from Andrew Pace, and I wanted to share my little squee moment with everybody.