Archive for October, 2002

31 Octobri 2002

Motives

Some pushback from Blackmask Online about my PDF series.

“Gotta wonder about the original author’s motives,” says Blackmask. Naw. No need to wonder at all. Just ask. I don’t bite. Not even on Halloween.

For the record, I’ve nothing much against Adobe that I haven’t already laid out for all to see. They irk me, sure, but no more than Microsoft, and no more (to be brutally honest) than the latter days of the Open eBook Forum. (Oh, yeah, am I irked with some of those people.)

My axes to grind with PDF itself (as opposed to with Adobe) are also pretty clearly laid out in my first post. PDF’s accessibility rots. PDF’s futureproofing rots. PDF’s editability rots. All of these things are very, very bad for e-text in the medium-to-long term.

And one thing I didn’t mention but should have: PDF itself—as opposed to PDF readers and writers—cannot realistically be further developed without Adobe’s consent. (I’d like to be wrong about that, but I don’t think I am. Counterarguments welcome.) It’s possible to avoid Adobe proprietary enhancements now, sure. What happens when they do something really cool, and they don’t put it in the open part of the PDF spec, and they don’t allow the open-sorcerers to emulate it?

Since only Adobe can develop and enhance PDF, if you stick with PDF, you are embracing a corpse. I don’t know when PDF will finally be mummified. I only know it will, and I hate the thought of losing all the texts in PDF, and all the effort that went into those texts. I am not at all sanguine, based on my own experience, that publishers are currently farsighted or tech-savvy enough to understand that they need to archive source files previous to a text’s incarnation as PDF.

See, I have no corporate affiliations. None. I don’t work in ebooks; I can’t for another, um, four and a half months. I don’t care who makes money off e-text; I don’t care who goes bust on it as long as they don’t splatter e-text itself with their demise. I don’t care who reads e-text, except that I do believe the more the merrier. I care about the text itself. Always have, always will. You think with my grad-school history I’m applying to info-science school as a lark?

A fair bit of scholarly work and a heck of a lot of experience have gone into text-preservation best practices. Work remains to be done, without question, but I don’t know a single knowledgeable soul claiming that PDF is a superior or even an acceptable electronic-text archival mechanism. I know lots of souls with lots of smarts and lots of experience making the case for markup.

And from my own experience and my own research, I believe them. Those are my motives. (Okay, I admit that my motive for the macabre metaphors throughout this post is simple jealousy of Mark and Bill and their slick Halloween-y color schemes.) What are Blackmask’s?

Abstractions

Zeldman opines today, “What’s amazing and unprecedented about CSS layout is that it’s completely abstracted from the data it presents.”

No argument about “amazing.” But “unprecedented?” Far be it from me to gig the great Zeldman who has taught me much, but I must assume that statement was just a momentary lapsus, er, not linguae exactly—let us say lapsus digitorum.

Print-publishing workflows have understood abstraction for quite some time, though this fact has admittedly been swept under the carpet by WYSIWYG page-layout tools.

First a manuscript goes to an editor. The editor goes through the manuscript and marks up each block with its type—heads are marked as such, paragraphs, lists, table parts, and so on. Sounds a lot like abstracting out a data model, no? And it takes place completely independently of the appearance of the text in the finished book. Said appearance hasn’t even been determined yet, in fact.

The list of unique block types then shoots over to a book designer, who puts together a design spec detailing precisely how each block is supposed to look. When the manuscript is keyed, each block is marked in some way or other with its type.

Ways differ. A few publishers use real SGML or XML markup. Some publishers use markup-like tags of various sorts. Some publishers use word-processing styles. Doesn’t matter. My point is, these block-types are placed explicitly into the data, and move with it.

The typesetter then uses the information about the block types to set up and apply a stylesheet, so that s/he doesn’t have to write out the same formatting instructions umptillion times for the umptillion ordinary paragraphs in the book.

If this sounds an awful lot like the way markup and CSS get put together in a website, guess what? It is an awful lot like it, which is why it bothers me no end that print publishers are still so freaked about markup. They don’t need to change to accommodate markup nearly as much as they think they do.
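For the code-minded, the workflow above might be sketched roughly like this. It is a toy, not anybody's actual typesetting system: the block-type names, fonts, and numbers in `DESIGN_SPEC` are made up for illustration, and `typeset` is a hypothetical helper of my own.

```python
# Hypothetical design spec: one entry per block type, written once by the designer.
DESIGN_SPEC = {
    "head": {"font": "Futura", "size_pt": 14, "space_before_pt": 18},
    "para": {"font": "Garamond", "size_pt": 10, "space_before_pt": 0},
    "list-item": {"font": "Garamond", "size_pt": 10, "indent_pt": 12},
}

# The keyed manuscript: each block carries its type and nothing about its looks.
MANUSCRIPT = [
    ("head", "Abstractions"),
    ("para", "First a manuscript goes to an editor."),
    ("para", "The list of block types goes to a designer."),
]


def typeset(manuscript, spec):
    """Pair each block's text with the formatting its type calls for."""
    return [(text, spec[block_type]) for block_type, text in manuscript]


if __name__ == "__main__":
    for text, style in typeset(MANUSCRIPT, DESIGN_SPEC):
        print(style["font"], style["size_pt"], text)
```

Change the `"para"` entry once and every ordinary paragraph follows, which is the whole point of tagging blocks by type instead of by appearance.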

And it wouldn’t hurt markup experts to learn something of the nuts-and-bolts of print publishing, either.

Responses

I have been bad. Folks have been emailing me and I have not been responding. I’m sorry.

One person responded to my last post on responsibility detailing a phenomenon I ought to have taken into consideration: gatekeeping. To shorten a long excursus into sociology—some people, typically women, very possibly including me, shut other people out of particular tasks and then complain about lack of help. (There’s a flawed but interesting book out there about gatekeeping and childrearing. I can’t remember its name or author at the moment. Rhoda or Rhonda something, maybe? Thing wanted always buried. Sigh.)

I hope I’m not too guilty of this particular mode of behavior. I’ll be watching myself for it. I do know that I have to ignore guilt-twinges, sometimes even stop myself from taking over, when I see David doing household tasks that he isn’t specifically responsible for. Which is silly; we both have a general responsibility to the household, and why shouldn’t he pick up or vacuum?

Another person responded to Courage and Necessity yesterday, perhaps a bit angrily, to point out that honesty about mental illness can create real losses—of jobs, of mentoring, of educational aspirations.

Yes. It can. If I seemed to deny or minimize that, I’m sorry. All in all, though, I still believe I’ve lost a lot more by denying the depression. Others’ mileage may vary.

In the long run, I’d like to help build a society that acknowledges, even values, human imperfections rather than forcing humans to try to hide them. I don’t know how to start doing that except by forthright, even brutal, honesty about my own flaws. I don’t do it for self-flagellation because I hate myself. Indeed, I find it something of a miracle that I don’t hate myself. Nor do I do it to attract prurient interest; my flaws aren’t generally the prurient-interest type anyway.

I do it (and this is another hit at “courage”) out of the utterly selfish desire to carve out space in the world for myself and others like me. The only way I know to do that is to break through blindness and ignorance and cruelty and denial, and the only way I know to do that is to write, and take my lumps for it.

Someone else wrote to praise yesterday’s translation. Thanks! On rereading, I am tempted to replace the third line’s “let the poor men live” with “let a poor man live.” This is a harder question than it first seems. There’s a lot of very old cultural miasma around various combinations of “pobre” with “hombre” in Spanish, and I’m frankly not sure what Unamuno was aiming at here.

Still. I haven’t done this in over a decade. It was fun.

30 Octobri 2002

The Atheist’s Prayer

Naomi cited (in the comments to the post just linked) a sonnet by Miguel de Unamuno as a favorite. I thought I’d exercise my translation muscles a bit; haven’t done any translation since college, and I sort of miss it sometimes.

This is probably going to be bad, bad stuff. It’s really been a while, and poetry was never my strong suit anyway.

The sonnet first:

La Oración del Ateo

Oye mi ruego Tú, Dios que no existes,
Y en tu nada recoge estas mis quejas,
Tú que a los pobres hombres nunca dejas
Sin consuelo de engaño. No resistes
A nuestro ruego y nuestro anhelo vistes.
Cuando Tú de mi mente más te alejas,
Más recuerdo las plácidas consejas
Con que mi ama endulzome noches tristes.
Que grande eres, mi Dios! Eres tan grande
Que no eres sino Idea; es muy angosta
La realidad por mucho que se expande
Para abarcarte. Sufro yo a tu costa,
Dios no existente, pues si tú existieras
Existiría yo también de veras.

And my rhymeless attempt at it:

The Atheist’s Prayer

O hear my plea, you nonexistent God,
And these my sorrows gather to Your void,
O you who cannot let the poor men live
Without false consolation. Don’t deny
Our plea; just clothe yourself in our desires.
When furthest from my mind You wander off,
I most remember placid legends, those
With which my nurse beguiled weary nights.
How great you are, my God! So great are you
That you are just a Thought; reality
Is strait however far it stretches out
To welcome you. I suffer due to you,
You nonexistent God, for if you were
Existent, truly I’d exist as well.

Man. That sixth line is a killer (do I even have the sense right?). Not to mention the other thirteen. And yeah, I’m rusty. Ee-ick.

Courage, necessity, and blogorrhea

I’ve gotten a couple of plaudits in email for being forthright about depression. I’ve even been accused of courage, a trait I steadfastly continue to deny possessing. Add that to commendatory mentions of my recent blogorrhea and two of the highest hit counts I’ve ever seen yesterday and the day before, and one might predict a swelled head on my shoulders.

(I hope not. But one might.)

What am I supposed to be afraid of, I ask those who want to call me courageous? Depression is neither a crime nor a sin. As for its stigmatization—look, I have tried the hide-all-faults schtick. I truly have. I was the most perfectionistic twit of a child you can possibly imagine. Oh, yes, and a liar to boot. How else to maintain the façade?

It didn’t do me any good. In fact, it did me considerable harm. Therefore I stopped doing it. What harm I have suffered since from my honest self-presentation (I daresay I don’t know the half of it) pales in comparison to the harm I caused myself with my earnest striving to appear unblemished.

Anyone who has a problem with my depression can think whatever they like. The same with regard to my weight, my gender, and the other ways in which I am, let us say, pulchritude-challenged. It’s that simple. I refuse to hide these things; I refuse to deny them. What’s lovely about such refusal is that it enables me to avoid many of the pigs who would otherwise be stupid or nasty about them. They decide they want nothing to do with me before even intruding on my notice. I like that.

I accept the risks to my medical insurability, which is all I will say on that point.

(For additional insightful responses to Anil’s plea, try epersonae and Dave Rogers.)

It isn’t courage, not for me (I make no representation vis-a-vis Anil, Elaine, or Dave). It’s necessity. I need people around me who are there because of me, not because of some neo-Platonic ideal that bears me only a glancing resemblance.

As for blogorrhea, I think I was something of a blogger before there were blogs. I have a thick folder full of dot-matrix printing from a high-school journal. I also maintained a lengthy (as in “a typical printed-on-both-sides letter required extra postage”) snailmail correspondence with a high school buddy who moved away after our sophomore year. David and I spent two years apart, and I have another thick folder full of that correspondence.

I’m used to writing about stuff. It’s not scintillating writing about scintillating stuff; that’s not my thing. It’s not scholarly writing, either—I finally figured out I’m no good at that when I picked up my best grad school paper recently and was stunned at how poor the writing was. The data were fine and the arguments were good; the writing blew chunks. I don’t know what got into me when I wrote that.

It’s just writing about stuff. Running off at the keyboard. (B)logorrhea. Nothing new at all. People did write to each other and for each other before there were blogs. Honest.

Dreams

Kalilily wants to know what my dreams are like. Dreams in the sense of “hallucinations during sleep,” not in the sense of aspirations.

Uh, I seem to recall something from last night about two kittens (one black, one orange-striped) and a fenced back yard with a gate. I think I let the kittens out the gate, but I’m not sure. No idea why I was there, don’t recognize the place or the kittens. If you can find cosmic significance in that, let me know.

I don’t make an effort to remember my dreams. I’m content to let them do whatever it is they do for me in peace, on the presumption that they know how to do it better than I do.

You really want to play with dreams, ask my husband. He’ll bend your ear until it breaks. I finally had to tell him to cut out the quarter-hour-long recitations, as I couldn’t manage to sustain interest in something I found essentially trivial. He took it in good part, and doesn’t bug me any more unless he’s had a nightmare.

Speaking of Dream, though… he met me at the door last night and demanded petting immediately (“what do you have to take off your coat for? pet me now!”). He bullied his sister unmercifully, played with his catnip pillow from the vet, slept briefly on my lap, and generally acted his normal self. The vet emailed us to say that his urinalysis results were negative for infection. Seems he’s going to be fine.

29 Octobri 2002

Nem what?

I didn’t quite realize how much naughty Portuguese I remember (or can translate from Spanish) until I happened upon a list of translations of execu-speak into Portuguese.

Oh. My. Not work-safe if you work in a Portuguese-speaking office.

Pretty darn funny if you sling the lingo, though. And I love the domain name.

Via languagehat.

Don’t run in clogs

Yesterday, like every workday, I had to get across Johnson Street during rush hour to get to University Avenue to catch my bus home.

I had to speed up a bit to avoid some yahoo with nothing better to do than run down pedestrians while turning right on red. I shouldn’t try to run in clogs, yahoos notwithstanding—I got one of those twinges in my ankle that says “Landed wrong, hon; try again.”

Which would have been fine, except that I’m still getting the same twinge today when I walk. And it hurts.

Wish I could voodoo the twinge into the ankle of the right-turning yahoo.

Analogies

Clearly what we need here are some usable analogies. Let me take a stab.

We’ll try music. Or better yet, movies. A movie studio putting together a movie records the images on super-nifty-neato expensive film stock. Sound? Recorded separately—and I do mean separately, as there’s music and Foley stuff and other audio effects to add in. (I’m being vague here because I don’t know what I’m talking about.)

I don’t know what precisely movie studios archive, but I would guess one or more finished prints in the best medium available, the one that preserves most fully the information captured on film and in the recording studio. If they’re smart, particularly if they expect sequels or a director’s cut, I would think they’d keep a lot of the components around, too, in the appropriate high-fidelity medium for each component (which will be different, obviously, for a recorded sound and a computerized visual effect).

Does the DVD or videocassette that you, the consumer, buy contain all the information available to the studio? Of course not! Nor do you even expect that. For example, you probably can’t tease apart the individual tracks in the sound. You can’t get at a frame without the visual effects added in. That’s okay; all you’re interested in is watching the goldarn movie.

When a new end-user format like, say, DVD comes out, does the movie studio use VHS videocassettes to make DVDs? We’d pillory them if they did; VHS fidelity is lousy. The studios go back to their master copies, of course; they figure out how to do the transfer to the new format and they do it.

Likewise, with print books you don’t get a copy of the typesetting files or the PostScript. You don’t care (do you?). And if a lucky book goes into a second printing with a spandy-new cover, the printer doesn’t photocopy the pages and bind the photocopies, which is the best analogue I can think of for turning one end-user format into another. What a horror that would produce. The printer goes back to an earlier-in-the-process, completely non-consumer medium. Could be plates, could be PostScript or PDF, could even be typesetting files. Some sort of master, in other words.

What does this have to do with ebooks in general and OEB in particular? Well, look, the OEBPS lays out a format for ebook master files, ebook files as experienced by ebook producers, not ebook files as experienced by human readers. This was done with the understanding that ebooks as experienced by human readers are going to change. Rapidly. Radically. Should change, in fact, because as everyone and his dog delights in saying with appropriate nose-lifting disdain, the first generation of readers was something less than universally desirable or universally capable.

That does not mean, however, that the masters must change in equally radical fashion. Movie studios didn’t rush out and re-archive all their movie masters when DVD became big with movie-watchers, did they now? (They may indeed be using some digital medium for mastering now that they didn’t before—I wouldn’t know—but I guarantee you that if they are, the decision to do so took place entirely independently of the consumer move to DVD.) The logical way to keep ebooks available and usable in periods of such change is to lay down a protocol for master files. Not for human-reader-destined files. Master files.

Ideally, those master files are so good that new human-reader formats can be generated from them indefinitely, with as little pain as possible. Is OEB there yet? No, it’s not, particularly for certain categories of books, but it isn’t too far away, it’s moving closer, and it’s a whale of a lot closer even now than anything else going.

Expecting to generate new human-reader files from old human-reader files is like expecting to generate DVDs from videocassettes. The loss of fidelity from the transformation is only a minor problem compared to the lack of fidelity inherent in an end-of-the-line format. This is one reason asking Microsoft (or whoever) to make .lit convertible is a dumb idea. (Another reason, of course, is that Microsoft is going to laugh at you.)

“The ebook industry would take off if only there was a single end-user format.” Heard that a lot. Maybe it’s even true. Know what? I don’t care, because an industry based on a single end-user format is cruisin’ for a bruisin’. Do you honestly believe you can create an end-user format that will survive technology changes? I’d like to see you try. If you try and fail, what then? How are you planning to tell all those publishers that your format is a dud and they need to change everything they do to accommodate your new format, which really, truly, genuinely isn’t a dud?

Please.

Decoupling master files from human-reader files, and standardizing the masters only, just makes sense. It allows the master files to be consistent and the reading experience to change. It is the only feasible production system that allows that.

The reason folks have such a hard time accepting this, I think, is that the system of master copies is completely hidden from them in most other media industries. I watch movies, yet I haven’t a clue how they’re archived. I didn’t know diddly about how paper books were made, much less how masters were archived, until I started working for a publishing-services company, and I’ve been reading paper books since I was three. Since ebooks are new, however, more of the nuts-and-bolts of the process is exposed. We’re thrashing out production problems and questions in sight of everybody. Which isn’t a bad thing at all; it just occasionally necessitates liberal applications of clue-by-fours.

The other reason this seems so hard is that most publishers aren’t accustomed to thinking about archival issues as relating to anything but paper. Free clue: the world is changing under you. Wake up. Fast.

And if I may put on my activist hat for a moment: The reason we have all these arguments over copy-protection for DVDs is that there is only one end-user format. That format is a chokehold on all of us. Do you honestly want to repeat that mistake with ebooks? I didn’t think so.

Yes, even creating a so-called “open” end-user format would be a mistake, because the identical instant it becomes obsolete, all the reading-system manufacturers are going to squall about how it was a total failure and proprietary formats are just plain better. And we’ll land in a useless mess even worse than the one we’re in now with PDF versus .lit versus everything else, because the OEBPS will go down along with the end-user format, and we won’t have any master files anywhere.

Postscript: Thanks to Stuart for correcting a boneheaded math error in my rant of earlier today. Note to self: Folks don’t generally write routines to convert a format to itself.

PDF, OEB, and memes

Both Jenny and TeleRead linked to my PDF rant. I wish I’d given it a slightly less unsavory title, but you know how it is when you’re spoiling for a good rant.

I think I’m spoiling for another good rant. Off we go… and it’s just as well I’m not doing anything with the OEBF just now, as their lawyers would crucify me for saying some of the stuff I’ve said and am about to say.

I am annoyed by TeleRead’s assertion that “OEB, though much influenced by Microsoft, is evolving with input from many other companies.” In fairness to TeleRead, they aren’t the first to say it and won’t be the last, unfortunately. Funny, how the folks slinging this meme around haven’t ever shown their faces on the OEBPS working group.

Whereas I have, and I tell you right now that if Microsoft “heavily influenced” the OEBPS, then so did I, and so did plenty of other people who don’t owe Microsoft a plugged nickel. All those new CSS properties in 1.2? I did the lion’s share of the legwork on those. Oh, and the new selectors? Guy from Versaware; when he left, I took them over (not that I had to do anything much with them). Look, is it really so bloody impossible to believe that Microsoft wasn’t in the driver’s seat in OEBPS development?

Well, I’m here to testify, they weren’t. Have they contributed? Heck yeah. Have their representatives been influential? Yes, but on an individual basis; they’ve never been the Great Corporate Monolith treading down the hoi polloi. When they have something good to say (often—sharp people), we listen. When the rest of us have something good to say, do they listen? Yeah. They sure do. Have they had any more influence than the rest of us by virtue of their affiliation? No. Have they controlled the direction of development? No more than anyone else.

Have they controlled votes? They have not; most of our work has been done by consensus anyway. The one contested vote I participated in, Microsoft’s representatives’ point of view lost, gentlefolk. I tabulated that vote myself (I was working-group scribe at the time), so if you want to raise a question about it, I’m the person you’re questioning. And you better believe I still have the documentary evidence.

(I’ve no idea why you’d want to challenge that vote anyway; it was an issue only a die-hard technogeek could care about at all, much less love.)

Calling Microsoft responsible for the OEBPS is a frank insult to me and to the many other non-Microsofties who gave and are giving time, money, and brainspace to the development effort. It also taints what has been a genuinely open effort with utterly unnecessary and counterproductive anti-Microsoft sloganeering. Cut it out, please. Immediately. Thank you.

(And if you think I’m a Microsoft apologist, kindly use that nifty-neato search box at the top of my sidebar to search for “Microsoft”—and then go wash your mouth out with raw lye. Or undiluted bleach. Whichever is handier.)

As for this bushwa about “converting Microsoft Reader’s compiled binary .lit files to other compiled binary ebook formats”—look, it’s bushwa. Stupid. Pointless. Impossible. It’s “come up with a single end-user binary” in another guise. Please read my earlier rant. End-user binaries are not and cannot be futureproof. Giving them more attention than they warrant is an exercise in futility and unnecessary data death.

Consider, for example, that different reading systems can legitimately have different capabilities. One, designed for scholarly use, may display MathML equations natively. One, designed for Beach Blanket Bob, may not. One may be color, another grayscale. How do you create usable conversions, given these differences? More to the point, why do so, when it makes vastly more sense for each system to pick and choose the files it wants from a base fileset that clearly indicates fallbacks for unusable or undesirable files? As the OEBPS package file does now for media files, and (I hope) will eventually do for interspersed markup languages.
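Here is a rough sketch of what "pick and choose with fallbacks" could look like from a reading system's side. Assumptions abound: the `MANIFEST` dictionary is a stand-in I invented, not the real OEBPS package syntax, the file names are hypothetical, and the MathML entry illustrates the markup-language fallback I am hoping for rather than anything the spec does today.

```python
# Hypothetical in-memory manifest: id -> (href, media type, fallback id or None).
MANIFEST = {
    "eq1-mathml": ("eq1.mml", "application/mathml+xml", "eq1-png"),
    "eq1-png": ("eq1.png", "image/png", None),
    "fig1-tiff": ("fig1.tif", "image/tiff", "fig1-jpeg"),
    "fig1-jpeg": ("fig1.jpg", "image/jpeg", None),
}


def resolve(item_id, supported, manifest=MANIFEST):
    """Follow the fallback chain from item_id until a supported media type turns up.

    Returns the chosen href, or None if nothing in the chain is usable.
    """
    seen = set()
    while item_id is not None and item_id not in seen:
        seen.add(item_id)  # guard against a circular fallback chain
        href, media_type, fallback = manifest[item_id]
        if media_type in supported:
            return href
        item_id = fallback
    return None


if __name__ == "__main__":
    scholarly = {"application/mathml+xml", "image/tiff", "image/png", "image/jpeg"}
    beach_bob = {"image/png", "image/jpeg"}
    print(resolve("eq1-mathml", scholarly))
    print(resolve("eq1-mathml", beach_bob))
```

Each reading system declares what it can handle and takes the first file in the chain that fits. Nobody converts anything to anything.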

Please also consider the nasty, evil combinatorics issue of creating (n×n)−n conversion routines for n reading systems. Count current-gen OEB-based systems, please: MS Reader, Gemstar, Cytale, hiebook, Mobipocket, GlobalMentor. Probably a few that I missed, but let’s stick with these six. Want to convert them all to each other? That is thirty conversion routines you are writing, you big idiot. Whereas if you start from OEB, you write six.
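If you don't trust my arithmetic (fair enough, given today's track record), here is a small sketch that counts the routines both ways. The function names are mine, invented for illustration; the six systems are the ones listed above.

```python
from itertools import permutations

READING_SYSTEMS = [
    "MS Reader", "Gemstar", "Cytale", "hiebook", "Mobipocket", "GlobalMentor",
]


def pairwise_routines(systems):
    """Every ordered (source, target) pair of distinct formats needs its own converter."""
    return list(permutations(systems, 2))


def hub_routines(systems, hub="OEB"):
    """With one master format, each system needs only a hub-to-system routine."""
    return [(hub, target) for target in systems]


if __name__ == "__main__":
    n = len(READING_SYSTEMS)
    direct = pairwise_routines(READING_SYSTEMS)
    from_oeb = hub_routines(READING_SYSTEMS)
    print(len(direct), "direct converters; the formula gives", n * n - n)
    print(len(from_oeb), "converters when starting from OEB")
```

Add a seventh reading system and the direct count jumps from thirty to forty-two, while the OEB count goes from six to seven.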

Yeah, sure, you only want conversion routines from the biggies. That is brain-dead, frankly. An immensely more productive tactic is asking publishers to create OEB and go from there to other reading systems. Do I want people to do this? Yes! Please! Please go bother your favorite publisher who only does .lit or PDF. Show them they’re losing sales and readers thereby.

The evil genius of this tactic is that it reduces the importance of any one end-user binary format. Want to slip a stiletto into .lit? (Sure, I do too.) Get publishers to push their OEB to different formats. Get reading systems to accept OEB files as input, and make their OEB-handling routines easy, batchable (very important!), and preferably scriptable. Don’t waste your time trying to bully Microsoft.

A slightly more legitimate position might be a demand that end-user binaries be decomposable into their component OEB files. Again, the problem there is that an end-user binary may not represent the entirety of the OEB archive. If said archive contains high-resolution TIFF images for one souped-up reading system (or for print, for that matter) and JPEG equivalents for the others, what brain-dead non-TIFF-using reading system is going to bloat its end-user-binary file size with those TIFFs just so that the entire OEB package can be reconstructed? Doesn’t it make more sense to let the publisher, rather than the reading systems, maintain complete, futureproof archives? (Or, heaven forbid, let’s have librarians do it!)

I look forward to seeing the “other perspectives” put forward by TeleRead, but I must say that if they are as poorly thought-through as the post to which I am now responding, I am not sanguine about their usefulness or feasibility.