Selection?
As I suppose could have been expected, some of the reaction to the Google digitization initiative amounts to people talking before they think. Forgivable, but still.
I rolled my eyes when I saw somebody or other wondering about the selection criteria Google or the participating libraries were going to use to pick books for digitization. But I ignored it, until I saw the same criticism two or three more times. So I’m going to address it, perhaps rudely.
Jehoshaphat, think, people! The books in question are already in an academic library. They’ve been selected already! What’s more, they haven’t been weeded, which suggests (especially as regards the public-domain works, which are older) that they’ve passed a few more selection and use tests. If you’re doubting that this is good stuff, you’re passing negative judgment on the librarians who put it where it is and now keep it there. Is that really what you meant? Thought not.
And what selection criteria would you use, exactly, if you actually did want to winnow through those books? Audience? But the potential audience for the public-domain books is theoretically infinite (leaving aside digital-divide and accessibility issues) and likewise infinitely varied. How does one pull useful selection criteria out of an audience like that, one man’s poison being another man’s wine and so forth?
All “selection” would do in this context is reduce access to the “weeded” subset of books by denying them digitization. This is a good thing why?
Now, there’s another angle to this that the average librarian won’t have thought of. That’s okay; the average librarian isn’t a text artisan. But Harvard’s FAQ about this project, which is well worth a read in its own right, gives the game away (emphasis mine): “While the University hopes that the decision [about expanding the digitization project] can be made in the coming months, the larger project presents many complex issues that need to be evaluated, and the pilot may hold surprises and may uncover additional issues that will require time to understand and resolve.”
As I thought, Google doesn’t exactly know what-all it’s doing here, seeing as how it’s new at this. It’s smart enough to know, though, that however cool its new processes and procedures are, there will be kinks to work out of them. (Believe me, this is smart. I’ve known publishers to think a spandy-new process would work without serious testing and without hitches right off the bat.)
So Google emphatically does not want anybody monkeying with their book sample, lest somebody eliminate the one book that teaches them something they need to know about their process. They want to take a whack at anything and everything. This is eminently wise, believe you me—never, ever pilot a complex project on the simple stuff, because the complicated stuff will catch you unawares and eat you for lunch!
The lack of user-oriented selection based on library-sanctified collection-development techniques, in other words, is wholly intentional—and in my opinion, entirely warranted given Google’s technical goals.
I’m still not seeing the evil, people. Just not seeing it.