Psst! Wanna buy a hot ebook?
So, I established last time that unique identifiers and cite-able text are technological hurdles that need solving if we’re to have ebooks that prove useful.
There’s a problem with unique identifiers, though. No, not the problem of the same identifier accidentally getting slapped on two books. These things happen, and can be worked around. I mean something different: how do you know that a chunk of bits with a given unique ID is a proper copy of the ebook that deserves to have that unique ID?
I mean, if you think nobody’ll spoof an ebook, for fun or in malice, you haven’t been hanging around either the Internet or academia very long.
I’ve heard about watermarking as a solution for this kind of thing, but it seems overelaborate to me. I don’t know much about checksums and hashes and whathaveyou, but the idea (insofar as I understand it) appeals: the ebook carries a short chunk of metadata representing the result of a calculation carried out on the rest of the ebook. Grab the ebook, do the calculation, check it against the metadata, scream loudly if there’s a discrepancy. Fairly simple to do, and difficult for a spoof to defeat, so long as the expected result is also vouched for somewhere the spoofer can’t rewrite it (the publisher’s site, say, or a digital signature).
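For the curious, here is a rough sketch of that grab, calculate, compare routine in Python. It assumes SHA-256 as the calculation, and the names `ebook_digest` and `verify_ebook` are made up for illustration. The expected value is simply passed in, since where the metadata ought to live is exactly the open question.

```python
import hashlib


def ebook_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the ebook's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_ebook(data: bytes, expected_digest: str) -> bool:
    """True if the ebook's bytes hash to the expected digest.

    expected_digest is assumed to come from the ebook's metadata or
    from the publisher; this sketch does not say which.
    """
    return ebook_digest(data) == expected_digest.strip().lower()


if __name__ == "__main__":
    genuine = b"Chapter 1. It was a dark and stormy night."
    published = ebook_digest(genuine)  # what the publisher would announce

    spoof = b"Chapter 1. It was a bright and sunny day."
    print("genuine copy checks out:", verify_ebook(genuine, published))
    print("spoofed copy checks out:", verify_ebook(spoof, published))
```

Changing even one byte of the text produces a completely different digest, which is what makes the discrepancy easy to scream about.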
An additional advantage to this scenario is protection against errors in transmission. If the ’net hiccups and drops half the ebook by accident, the checksum won’t check out, and the reader will know there’s a problem.
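A small sketch of the dropped-half-the-ebook case, again assuming SHA-256 and a hypothetical helper called `digest_stream`. It reads the download in chunks, the way a reader application might, and the in-memory streams stand in for a real network transfer.

```python
import hashlib
import io
from typing import BinaryIO


def digest_stream(stream: BinaryIO, chunk_size: int = 8192) -> str:
    """Hash a stream chunk by chunk, so a large ebook need not fit in memory."""
    hasher = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        hasher.update(chunk)
    return hasher.hexdigest()


if __name__ == "__main__":
    full_book = b"All of the ebook's text. " * 1000
    expected = digest_stream(io.BytesIO(full_book))

    # Simulate the 'net hiccuping and delivering only the first half.
    truncated = full_book[: len(full_book) // 2]
    received = digest_stream(io.BytesIO(truncated))

    if received != expected:
        print("Checksum mismatch: the download is incomplete or altered.")
```

The check can't say which bytes went missing, only that something is off, but that is enough to prompt a fresh download.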
Either way, or some third way I’m not smart enough to imagine, we need to figure out something.