29 September 2006

Metadata mismatches

Well, DSpace has a gizmo that creates SFX links, but it’s not very elaborate, and it’s (duh) SFX-specific. So I built a more general OpenURL crosswalk. My test server now has what look to be pretty respectable COinS, but I’m having to trust other people to tell me if they’re really working correctly.

Somebody tell me when this “human vs. corporate author” thing got into the bibliographic mix, and why, because it is a pain in the posterior. OpenURL distinguishes them (au vs. aucorp); Dublin Core (at least as it is implemented in DSpace) doesn’t.

I can imagine a heuristic that would mostly work. If it’s got a single comma in it, one that does not precede an abbreviation, it’s probably human. If it doesn’t have a comma, but it’s only one word long, it’s probably human (even one-word company names usually have an “Inc.” or something after them). Otherwise, it’s probably corporate.
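For the curious, here is roughly what that heuristic looks like as a Python sketch. This is not the crosswalk code itself; the function names and the abbreviation list are made up for illustration, and the `rft.au` / `rft.aucorp` keys are the OpenURL key/value names as I understand them.

```python
# Hypothetical, deliberately short list of abbreviations that tend to
# follow a comma in corporate names. A real list would be much longer.
CORPORATE_ABBREVIATIONS = {"inc", "ltd", "llc", "co", "corp", "dept"}


def looks_human(name: str) -> bool:
    """Guess whether an author string names a person (the heuristic above)."""
    name = name.strip()
    if not name:
        return False
    comma_count = name.count(",")
    if comma_count == 1:
        # Exactly one comma: human unless the comma precedes an abbreviation.
        words_after = name.split(",", 1)[1].split()
        first_after = words_after[0].rstrip(".").lower() if words_after else ""
        return first_after not in CORPORATE_ABBREVIATIONS
    if comma_count == 0:
        # No comma: human only if the name is a single word.
        return len(name.split()) == 1
    # Two or more commas: fall through to "probably corporate".
    return False


def openurl_author_key(name: str) -> str:
    """Pick the OpenURL key for an author string (assumed key names)."""
    return "rft.au" if looks_human(name) else "rft.aucorp"
```

Feed it “Salo, Dorothea” and it says human; feed it “Acme Widgets, Inc.” or “United States Geological Survey” and it says corporate. Feed it “Smith, John, Jr.” and it also says corporate, because of the second comma.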

Honestly? I wouldn’t have much confidence in that. Too many counter cases. As it is, though, all the corporate authors in DSpace (got govdocs? I do) are going to have to be erroneously coded as human authors.

Somebody tell me what the practical use of this distinction is, because right now I’m really not seeing it. “But OPACs do it” is bzzzt! not a good answer. Neither is “because MARC puts them in different fields/subfields.” Tell. Me. Why.

Grrrr. It has not been a good day for code.