MARC, XML, and conversion
I’m not the first to have pointed this out, but it bears repeating: the normally quiet and rather tech-wonkish XML4LIB mailing list has had a terrific thread on the imminent (or not) death of MARC, and its replacement (or not) by XML. You can find the thread here if you scroll down a bit and look for the subject line “When will XML replace MARC?”
I won’t recap the whole thing (because doubtless I’d do a poor job), but I will say that Terry at Panlibus gets closest to my initial reaction: we need to figure out which level of MARC’s multiple functions we’re talking about replacing. I made this point loudly but fortunately not profanely in the talk I worked up for Ruritania, and one of the nicest things for me about reading the mailing list thread was seeing a lot of what I’d reasoned out for myself confirmed by people who know much more about MARC and AACR2 than I do.
There’s bits-on-disk (the actual storage model), and everybody knows that MARC’s binary format is a horrible dinosaur but nobody does anything about it, because the binary format is so intertwingled with everything else about MARC. The thing is, leaving it alone is costly, all by itself. Anybody wanting to work with MARC has to decipher that format (or find a programming library that does, and my fave programming language only added such a library very, very recently). That’s a barrier to casual hacking right there: even we librarians can’t mess about with our own pet information format!
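To give a feel for what "deciphering that format" involves, here is a minimal sketch in Python. It assumes the usual MARC 21 / ISO 2709 layout as I understand it: a 24-byte leader, a directory of 12-byte entries (3-digit tag, 4-digit length, 5-digit offset), and the control bytes 0x1E, 0x1D and 0x1F as field terminator, record terminator and subfield delimiter. The function names `build_marc` and `parse_marc` are made up for illustration, and the sample record is hand-built, not real catalog data. For anything serious, use a real MARC library.

```python
FT = b"\x1e"  # field terminator
RT = b"\x1d"  # record terminator
SF = b"\x1f"  # subfield delimiter


def build_marc(fields):
    """Hypothetical helper: pack (tag, raw_bytes) pairs into one binary record."""
    directory = b""
    body = b""
    for tag, data in fields:
        chunk = data + FT
        # Directory entry: tag (3) + field length (4) + start offset (5).
        directory += f"{tag}{len(chunk):04d}{len(body):05d}".encode("ascii")
        body += chunk
    directory += FT
    base = 24 + len(directory)       # where the field data begins
    total = base + len(body) + 1     # +1 for the record terminator
    leader = f"{total:05d}nam a22{base:05d} a 4500".encode("ascii")
    return leader + directory + body + RT


def parse_marc(raw):
    """Hypothetical helper: unpack one binary record into Python tuples."""
    leader = raw[:24]
    base = int(leader[12:17].decode("ascii"))
    directory = raw[24:base - 1]     # drop the terminator that ends the directory
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        data = raw[base + start:base + start + length - 1]  # drop trailing FT
        if tag < "010":
            # Control fields carry plain data, no indicators or subfields.
            fields.append((tag, data.decode("utf-8")))
        else:
            indicators = data[:2].decode("ascii")
            subfields = [
                (chunk[:1].decode("ascii"), chunk[1:].decode("utf-8"))
                for chunk in data[2:].split(SF)[1:]
            ]
            fields.append((tag, indicators, subfields))
    return fields


record = build_marc([
    ("001", b"12345"),
    ("245", b"10" + SF + b"aMoby Dick /" + SF + b"cHerman Melville."),
])
parsed = parse_marc(record)
```

Nothing here is hard, exactly, but all the byte-counting has to happen before you can ask even one question of the data.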
Compare this to the situation of the relational database, which runs rather more than half the world these days. These suckers have been intensively worked on, polished to a high sheen, optimized for speed and comprehensibility and all sorts of things. Give me a database full of information, newbie database person that I am, and I can answer all sorts of complicated questions with nothing more than a little SQL.
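As a small illustration of "a little SQL", here is a sketch using Python's built-in `sqlite3` module. The `titles` and `subjects` tables, their columns, and the sample rows are all invented for the example; no real catalog is laid out this way as far as I know.

```python
import sqlite3

# An in-memory database with a made-up, drastically simplified schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE titles (id INTEGER PRIMARY KEY, title TEXT, year INTEGER);
CREATE TABLE subjects (title_id INTEGER REFERENCES titles(id), heading TEXT);
""")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?, ?)",
    [(1, "Moby Dick", 1851), (2, "Typee", 1846), (3, "Walden", 1854)],
)
conn.executemany(
    "INSERT INTO subjects VALUES (?, ?)",
    [(1, "Whaling"), (1, "Sea stories"), (2, "Sea stories"), (3, "Natural history")],
)

# Which subject headings cover more than one title, and how early do they go?
query = """
SELECT s.heading, COUNT(*) AS n, MIN(t.year) AS earliest
FROM subjects AS s
JOIN titles AS t ON t.id = s.title_id
GROUP BY s.heading
HAVING COUNT(*) > 1
ORDER BY n DESC, s.heading
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The question in that query is one nobody designed an interface for in advance, which is rather the point.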
What I can’t do is put together weird and wonderful heretofore unforeseen OPAC queries, because the bare OPAC interfaces won’t let me, and I can’t get far enough under the MARC hood to program such things on my own. Is this retarding OPAC and OPAC-interface development? You better believe it is.
So that’s one problem. Someone else brought up the belt-and-suspenders problem: MARC plus ISBD. Stripped to its essentials, this is a problem of delimiters. MARC has delimiters, ISBD has delimiters, and sometimes the twain don’t meet.
To my mind, that’s an argument for junking them both. Neither is adequate on its own, and ISBD in particular is causing all sorts of data-repurposing problems (which you can read about in the XML4LIB thread). So let’s figure out what we need to delimit, and delimit it already!
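Here is a rough sketch of the doubled delimiters, assuming a title field whose subfield values still carry their trailing ISBD punctuation. The sample field, the `strip_isbd` name, and the list of punctuation marks are my own simplifications, not a complete account of ISBD.

```python
import re

# Trailing marks that ISBD commonly puts before the next element.
# This list is an illustrative assumption, not the full ISBD rule set.
ISBD_TRAILING = re.compile(r"\s*[/:;=,]\s*$")


def strip_isbd(subfields):
    """Hypothetical helper: drop trailing ISBD punctuation from each value."""
    return [(code, ISBD_TRAILING.sub("", value)) for code, value in subfields]


# The subfield codes already say where each element starts and stops,
# yet the values repeat that information as punctuation.
title_field = [
    ("a", "Moby Dick :"),
    ("b", "or, The whale /"),
    ("c", "Herman Melville."),
]
cleaned = strip_isbd(title_field)
print(cleaned)
```

Notice that the final full stop is deliberately left alone: a program can't easily tell an ISBD full stop from one that belongs to an abbreviation or an initial, and that kind of ambiguity is exactly what makes the data hard to repurpose.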
Another objection to junking MARC/ISBD is conversion. It can’t be done entirely automatically, because ISBD punctuation can be unpredictable and there are errors in the records anyhow.
Well, okay, so what? One, this is a wizard chance to, I don’t know, find and fix large swathes of the errors! Two, if we ask for a completely automated conversion experience, we’ll never leave MARC, and heaven help us! Yes, this is going to take work and pain and annoyance. Yes, the rewards on the other end are worth it, especially the bit about never having to do it again because we won’t make the mistake of rolling our own rigid and inflexible encoding and content-designation formats from scratch.
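For what it's worth, the purely mechanical half of a conversion is small. The sketch below uses Python's standard `xml.etree.ElementTree` and element names loosely modeled on MARCXML (`record`, `datafield`, `subfield`), with the namespace and leader left out for brevity. The `to_xml` function and the input tuples are hypothetical.

```python
import xml.etree.ElementTree as ET


def to_xml(fields):
    """Hypothetical helper: turn (tag, indicators, subfields) tuples into XML text."""
    record = ET.Element("record")
    for tag, indicators, subfields in fields:
        datafield = ET.SubElement(
            record,
            "datafield",
            {"tag": tag, "ind1": indicators[0], "ind2": indicators[1]},
        )
        for code, value in subfields:
            subfield = ET.SubElement(datafield, "subfield", {"code": code})
            subfield.text = value
    return ET.tostring(record, encoding="unicode")


fields = [("245", "10", [("a", "Moby Dick /"), ("c", "Herman Melville.")])]
xml_text = to_xml(fields)
print(xml_text)
```

The structure moves over without complaint. What doesn't convert by itself is the ISBD punctuation still sitting inside the values, plus whatever errors were in the record to begin with, and that part is where the human work and pain come in.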
And then there’s the ever-popular “we’ve used it for 30 years; why would we stop now?” You know, SGML is just about as old as MARC, and has about the same number of die-hard adherents. But most of us SGMLers, even the old-school (which I’m not), have made the mental switch to XML. If we can do it, so can you.
Anyway, go read the thread. It’s of far more value than anything I’ve said about it.