Organizing repository items
I had an interaction today with one of the repository’s early adopters that bears examination.
The faculty member in question will be submitting papers that he has already carefully organized into over a dozen categories on his own website. He wanted to mirror that organization on the repository—only the repository isn’t really set up to do that.
I mean, it can. I could have made him a DSpace sub-community in which each of his categories was represented by a collection. Administratively, though, that’s a nightmare. For example, a collection can’t be deleted without wiping out the items in it (said items can be reassigned to other collections first, but speaking of nightmares…). The entire structure is unacceptably rigid. Categories change; DSpace collections are forever.
Not to mention that the overhead of setting up a collection, while not onerous, is not inconsiderable either. DSpace’s data model is designed to reflect the structure of the submitting organization, not so much any classification or categorization arising from the materials submitted.
I’d love to hack a faceted-browsing-and-search system into DSpace; it’d solve an immense lot of problems. The “how much of this stuff is peer-reviewed?” problem. The “I’m looking for a thesis (but not anything else)” problem. (Yes, I know that’s in the metadata already, but DSpace doesn’t let you browse on it!) The problem of communities organizing their collections orthogonally—some do it by type of resource, some by who’s submitting the resource (e.g. faculty vs. students), some by subject categorization, and some by combinations of the above. Faceted views would let communities organize collections and items the way they want to, while users browse the way they want to.
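To make the faceted idea concrete, here is a minimal sketch in Python. It is not DSpace code and touches none of DSpace's internals; the item records, the field names (`type`, `peer_reviewed`, `submitter`), and the two helper functions are all made up for illustration. The point is only that once the facets live in item metadata, counting and filtering on any combination of them is cheap, whatever collection an item happens to sit in.

```python
from collections import Counter

# Hypothetical item records. The field names are illustrative only;
# they are not DSpace's actual metadata schema.
ITEMS = [
    {"title": "Paper A", "type": "article", "peer_reviewed": True, "submitter": "faculty"},
    {"title": "Thesis B", "type": "thesis", "peer_reviewed": False, "submitter": "student"},
    {"title": "Paper C", "type": "article", "peer_reviewed": False, "submitter": "student"},
]


def facet_counts(items, facet):
    """Count how many items carry each value of one facet."""
    return Counter(item[facet] for item in items if facet in item)


def filter_items(items, **selected):
    """Keep only the items that match every selected facet value."""
    return [
        item
        for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


if __name__ == "__main__":
    # "I'm looking for a thesis (but not anything else)"
    for item in filter_items(ITEMS, type="thesis"):
        print(item["title"])

    # "How much of this stuff is peer-reviewed?"
    print(facet_counts(ITEMS, "peer_reviewed"))
```

A user who wants student-submitted articles would call `filter_items(ITEMS, type="article", submitter="student")`; no community has to have organized its collections that way in advance.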
Mark my words, this is a major change, not overly amenable to my desperate-hacking approach to life. The database would have to change, the metadata would have to change, and I don’t even want to think about the user-interface changes. Not to be done on a whim. But what am I to do with depositors who have their organizational act together?
I compromised. This faculty member is getting a single collection, and I’m going to help him keyword his work from the appropriate controlled vocabulary so that it’ll be easily findable from the great wide world. He will maintain his own categorized browse-list outside the repository altogether, linking to the items in the repository via the famous unbreakable URLs.
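An outside browse list of this kind can be very simple. The sketch below, in Python, turns a category-to-items mapping into an HTML fragment of links. Everything in it is assumed for illustration: the category names, the titles, and the `repository.example.edu` URLs are placeholders, not this faculty member's real categories or the repository's real persistent URLs.

```python
from html import escape

# Hypothetical categories mapped to (title, persistent URL) pairs.
# The URLs are placeholders, not real repository identifiers.
BROWSE = {
    "Category One": [
        ("Paper A", "https://repository.example.edu/handle/0000/1"),
        ("Paper C", "https://repository.example.edu/handle/0000/3"),
    ],
    "Category Two": [
        ("Thesis B", "https://repository.example.edu/handle/0000/2"),
    ],
}


def render_browse_list(categories):
    """Render an HTML fragment: one heading and one link list per category."""
    parts = []
    for name in sorted(categories):
        parts.append(f"<h2>{escape(name)}</h2>")
        parts.append("<ul>")
        for title, url in categories[name]:
            link = f'<a href="{escape(url, quote=True)}">{escape(title)}</a>'
            parts.append(f"  <li>{link}</li>")
        parts.append("</ul>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(render_browse_list(BROWSE))
```

Because the categories live in a mapping outside the repository, renaming, merging, or dropping one means editing that mapping and regenerating the page; the items and their URLs stay put.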
This is not as bad a solution as it might sound. Most people won’t arrive at a given item in a repository via browse. They’ll have a citation already (in which case they can browse up from the item), or they’ll have searched for it (in which case the carefully-constructed category browse has done them exactly no good at all). Good keywording plus an outside browse list is as close as we can get right now to the best of both worlds.
It shouldn’t be this hard, though; truly it shouldn’t.