Caveat Lector » Why container tags rule

Dies Lunae, 8 Aprili 2002

Why container tags rule

I’ve never been sure whether Sean McGrath is a document or database guru. The great thing is, it doesn’t matter. When he writes about XML, folks oughta listen up.

His latest article at xml.org covers terrain similar to that of my article on structure and typesetting. Simply put, the question we both tackle is this: does it make more sense to simplify XML hierarchy by multiplying tags, or reduce the number of needed tags by creating hierarchy?

Perhaps an example would help. Sean uses lists, so I will too. HTML handles lists by creating different types of list tags (<ol> and <ul> for ordered and unordered lists respectively), both of which contain <li> (“list item”) tags. This isn’t the only way HTML could have been designed. HTML could have created <oli> tags for ordered list items, and <uli> tags for unordered list items.
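To make the two designs concrete, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`. The `<body>` wrapper, the sample text, and the `top_level_tags` helper are my own assumptions for illustration; only `<ul>`, `<li>`, and `<uli>` come from the discussion above.

```python
import xml.etree.ElementTree as ET

# Container design: list items live inside a <ul> element.
CONTAINER = (
    "<body><p>Intro</p>"
    "<ul><li>one</li><li>two</li><li>three</li></ul>"
    "<p>Outro</p></body>"
)

# Flat design: hypothetical <uli> items sit beside the paragraphs.
FLAT = (
    "<body><p>Intro</p>"
    "<uli>one</uli><uli>two</uli><uli>three</uli>"
    "<p>Outro</p></body>"
)

def top_level_tags(xml_text):
    """Return the tag names of the root element's direct children."""
    return [child.tag for child in ET.fromstring(xml_text)]

print(top_level_tags(CONTAINER))  # ['p', 'ul', 'p']
print(top_level_tags(FLAT))       # ['p', 'uli', 'uli', 'uli', 'p']
```

In the container version the list is one unit among its siblings. In the flat version nothing in the document itself says where the list starts or stops.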

On the face of it, that might look simpler: two tags instead of three. The reality is, though, that you shouldn’t leave it at that, because the first item in a list has to be treated differently from middle items in the list, which have to be treated differently from the final item in the list. (Why? Look at a list and tell me why. Still not sure why? Look before and after the list. See the extra visual space? That’s why.)

Sean assumes that this differentiation happens somewhere in list-processing code, and he’s utterly right that such code inevitably becomes spaghetti. Typesetters, however, will tell you that the typical way to handle this problem is to mark the first and last list items explicitly—in other words, to use separate tags for them. So to eliminate the HTML <ul> element while retaining its functionality, you have to create not one, but three list item tags: one for the first item in an unordered list, one for middle items, and one for last items.

(In fact, you need not three, but four. Go read my article for why.) What’s even worse is that without a container element, all of these new tags are thrown promiscuously together with themselves and everything else on their level (paragraphs, heads, etc.); there is no easy way to ensure that individual list items are tagged correctly. What a mess!
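Here is a rough sketch of what policing flat list tags might look like. The tag names `uli-first`, `uli-mid`, and `uli-last` are hypothetical, invented for this example, and the sketch covers only the three positional tags described above, not the fourth. The checker has to walk every sibling and track state by hand, which is exactly the work a container element would make unnecessary.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names for a flat (container-free) unordered list.
FIRST, MIDDLE, LAST = "uli-first", "uli-mid", "uli-last"

def check_flat_lists(xml_text):
    """Return a list of tagging problems among the root's children."""
    problems = []
    in_list = False  # True between a FIRST item and its LAST item
    for pos, child in enumerate(ET.fromstring(xml_text)):
        tag = child.tag
        if tag == FIRST:
            if in_list:
                problems.append(f"{pos}: {FIRST} inside an open list")
            in_list = True
        elif tag == MIDDLE:
            if not in_list:
                problems.append(f"{pos}: {MIDDLE} with no {FIRST} before it")
        elif tag == LAST:
            if not in_list:
                problems.append(f"{pos}: {LAST} with no {FIRST} before it")
            in_list = False
        elif in_list:
            # Any non-list sibling ends the run, so the list was never closed.
            problems.append(f"{pos}: <{tag}> interrupts a list with no {LAST}")
            in_list = False
    if in_list:
        problems.append(f"end: list never closed with {LAST}")
    return problems

good = ("<body><p>a</p><uli-first>1</uli-first><uli-mid>2</uli-mid>"
        "<uli-last>3</uli-last><p>b</p></body>")
bad = "<body><uli-mid>1</uli-mid><uli-first>2</uli-first><p>x</p></body>"

print(check_flat_lists(good))
print(check_flat_lists(bad))
```

The well-tagged sample comes back clean. The mis-tagged one reports a middle item with no first item and a list that a paragraph interrupts before any last item appears.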

Add in new list types and the situation gets real ugly real fast. I’ve seen some business-trade books with five different kinds of lists. (Crappy design. Crappy design, a lot of business trade books have. Has a lot to do with crappy editing, I suspect.) Do five lists the HTML-ish way, and you have at most six tags (five list tags plus <li>). Do it the supposedly simpler way, and you end up with twenty tags (five list types times four list-item tags per list type). TWENTY.
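The arithmetic is simple enough to spell out. The function names below are mine, purely for illustration; the counts follow the reasoning above (one container tag per list type plus a shared `<li>`, versus four item tags per list type).

```python
def container_tag_count(list_types):
    """One container tag per list type, plus a single shared <li>."""
    return list_types + 1

def flat_tag_count(list_types, item_tags_per_type=4):
    """Every list type needs its own full set of positional item tags."""
    return list_types * item_tags_per_type

print(container_tag_count(5))  # 6
print(flat_tag_count(5))       # 20
```

The container count grows by one for each new list type; the flat count grows by four.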

Evil. Evil evil evil. Nobody wants to deal with remembering twenty list item tag names. Nobody wants to deal with policing placement of twenty list item tags. And, as Sean points out, nobody wants to write code to process twenty list item tags, not when they could be writing code to process only six.
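For comparison, here is a minimal sketch of list processing when the container exists. It is a toy plain-text renderer of my own devising, not anybody's production code: the `<body>` wrapper, the `MARKERS` table, and the output format are all assumptions. The thing to notice is that the extra space before and after a list hangs off the container, so no list item needs to know whether it is first, middle, or last.

```python
import xml.etree.ElementTree as ET

# How each list container labels its nth item (illustrative choices).
MARKERS = {
    "ul": lambda n: "*",
    "ol": lambda n: f"{n}.",
}

def render(xml_text):
    """Render paragraphs and lists as plain text lines."""
    lines = []
    for block in ET.fromstring(xml_text):
        if block.tag in MARKERS:
            lines.append("")  # extra space before the list
            for n, item in enumerate(block.findall("li"), start=1):
                lines.append(f"  {MARKERS[block.tag](n)} {item.text}")
            lines.append("")  # extra space after the list
        else:
            lines.append(block.text or "")
    return "\n".join(lines)

sample = "<body><p>Before</p><ol><li>a</li><li>b</li></ol><p>After</p></body>"
print(render(sample))
```

Adding a new list type to this sketch means adding one entry to `MARKERS`, not a fresh set of item tags.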

One very large, very important publisher, whose name I will not mention because I am allergic to litigation, tried to wish this problem away by pretending that in the total absence of hierarchy, any given type of list only needs one list item tag. The lame pretense is enabled (in the sense that one enables another’s addiction) by specific tags for extra line spacing, which is an abomination unto the fair names of typesetting and markup alike.

This publisher’s XML specification is a prime example of what happens when million-dollar consulting firms with approximately zero markup expertise (no, the firm in question was not Andersen or Accenture, but yes, it was one of their prime competitors) are brought in to create document-markup specs in a management cleanroom wholly divorced from reality. The spec is total garbage, a living inducement to tag abuse. A significant portion of the books the publisher is busily having converted to that spec will have to be redone in future. The publisher is not succeeding in building a typesetting workflow targeted to this spec because the publisher cannot so succeed: the spec is too broken.

And all because of an allergy to hierarchy, and the arrogant belief that production is best understood by anyone other than production workers. Bloody waste.

I dearly hope employees from this publisher read this post and recognize themselves in it. Look, y’all, I’m not telling you this to hurt you. Your consulting firm took you for an expensive ride. The more you let that shoddy work go unquestioned and unfixed, the more you’ll pay to fix it in the end. As Sean says, “XML tags cost money.”
