Because I really, really don’t make this stuff up
Peter Sefton got a mite huffy at me for my contention that Word-template-based scholarly-article production systems invariably fail when they meet the author.
I don’t make this stuff up just to be annoying. Honestly and truly, I don’t.
Seems everything old is new again at Extreme Markup 2007 too:
I went to see David Lee of Epocrates on getting content authored in MS Word into appropriate XML. The core of this talk was an extended lament on how authors insist on using Word; even if you provide specialized authoring tools, they compose in Word and then cut and paste, more or less incorrectly, into the specialized tool. Epocrates has tried a variety of strategies: Word styles (authors won’t use them), tagged sections (authors screw them up), form fields (plaintext only, so authors delete them and type in rich text instead). In the end, they adopted Word tables as the safest and least corruptible approach. A few Word macros provide useful validations, and when the document is complete, a Word 2003 macro rewrites it using Word 2003 XML (unless it is already in that format). I pointed out that the approach of having authors use Word and saving in plain text was also viable, leaving all markup to be added by automated downstream processing; David said that design was too simple for the complex documents his authors were creating.
My contentions in a nutshell. Thank you, Mr. Lee and Mr. Cowan.
I will add that testing such tools on a small, highly selected author population (as Mr. Sefton’s blog post indicates that he has done) leads to tools that work very well for a small, highly selected population of authors, and fail utterly once they move beyond that population.
I do not. DO. NOT. Make this stuff up. Been there, done that, don’t even have the T-shirt any more.