7 Octobris 2004

Don’t do it!

I have three words for publishers who think Google’s new Google Print gizmo is the answer to all their digitization needs. They are “don’t,” followed by “do,” “it,” and a whole line of exclamation points.

What’s my issue? Ownership of the scans rests with Google. So the publisher gains only searchability, not real digitization. And the publisher may lose significant control over its own data. (Do you know how Google is going to digitize? Do you know they’re going to do it right? If they screw it up, can you get them to fix it, or heaven forbid, do it yourself?)

To its enduring shame, the ebook bubble offered publishers precisely this same scam. Give us your content, we’ll do it up pretty for free (except we own the result and you’ll never see a single byte). Once Versaware and NetLibrary hit the skids, publishers saw that they’d been had. Do not sign over your data! Not for the price of digitization! Trust me, it’s not worth it!

I suspect, also, that Google doesn’t know what it’s getting into here. The plans I’ve read indicate that Google isn’t scan-and-OCRing; they’re expecting to work from publisher electronic files or possibly PDF Normal.

Hollow laugh. Good luck, Google. I know what those files are going to look like. Do you?

It’s a case of invisible labor, if I may paraphrase Greg Downey. First the ebook techies and now Google, operating on zero knowledge of what book production is like and what book producers actually produce. Me and my fellow text artisans? Totally invisible.

I bet Google thinks the only reason Amazon doesn’t do Search Inside the Book with everybody is recalcitrant publishers. Ha. I’ll bet you everything I own in this world that part of the problem is incompetent production practices leading to impossible-to-use electronic files.

Eh, well. We shall see. I fully intend to gloat very loudly if I’m right, though.

Addendum: Okay, my bad. On further inspection, Google is going to scan-and-OCR. But if they think that solves production problems…