25 Februarii 2003

Uses of RDF

I’ve been meaning to toss some thoughts at Leigh for a few days now. He asks “When would I use RDF in preference to a non-RDF XML vocabulary?”

As usual, there can’t be a hard-and-fast answer to that. I do see a few glimmerings of ideas, though, and as usual I’ll toss ’em out without worrying much whether they’re any good.

First. If you must end up with something that validates against a DTD or schema, don’t bother with RDF. Just don’t. Yes, you can restrict the RDF/XML you produce to a specific syntax form; you just can’t expect anything you receive to be similarly restricted, because RDF/XML-generating tools can’t be made to give a damn which of the many possible syntax forms they emit for a given set of RDF statements.
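A rough sketch of the problem, in Python with nothing but the standard library. The two documents below say the same thing, once with the property as a child element and once as an attribute; both forms are legal RDF/XML. The `toy_triples` helper is my own illustration and handles only this tiny subset of the syntax, so don't mistake it for a real RDF parser. The `dc:title` value is made up.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Form A: the property is written as a child element.
FORM_A = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.example.com/spam">
    <dc:title>Spam</dc:title>
  </rdf:Description>
</rdf:RDF>
"""

# Form B: the same property is written as an attribute.
FORM_B = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.example.com/spam" dc:title="Spam"/>
</rdf:RDF>
"""


def clark_to_uri(name):
    # ElementTree spells qualified names "{namespace}local"; RDF joins the two.
    namespace, local = name[1:].split("}", 1)
    return namespace + local


def toy_triples(rdfxml):
    """Collect (subject, predicate, literal) triples from a tiny RDF/XML subset."""
    triples = set()
    root = ET.fromstring(rdfxml)
    for desc in root.iter("{%s}Description" % RDF):
        subject = desc.get("{%s}about" % RDF)
        # Property attributes: every attribute outside the rdf: namespace.
        for name, value in desc.attrib.items():
            if not name.startswith("{%s}" % RDF):
                triples.add((subject, clark_to_uri(name), value))
        # Property elements: every child of the description.
        for child in desc:
            triples.add((subject, clark_to_uri(child.tag), child.text))
    return triples


# Same graph, different XML trees.
print(toy_triples(FORM_A) == toy_triples(FORM_B))
print(len(ET.fromstring(FORM_A)[0]), len(ET.fromstring(FORM_B)[0]))
```

The graphs compare equal, but the first description has one child element and the second has none, so a DTD written for one shape will reject the other.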

(Is this a problem? Yeah. Will the RDF working group ever admit it is a problem, and privilege one RDF/XML syntax form for XML interchange? Doubt it. The only compromise I can think of is somehow building an RDF/XML editing tool capable of respecting both a graph and a DTD or schema. I don’t know how feasible that is.)

RDF/XML does come as close as anything yet to allowing namespaces to live together in harmony, which I find to be a significant achievement. You can build an intelligible RDF structure out of six different namespaces. Try that with straight XML. So if I wanted to incorporate bits and pieces from here and there, at least in something that’s recognizably data and not document, I would honestly try RDF before rolling my own XML structure. That goes double if any or all of the namespaces I’m stealing from are widely used in RDF circles already, Dublin Core being an obvious case in point.
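For the curious, here is roughly what mixing vocabularies looks like, sketched in Python's standard library. The Dublin Core namespace is the real one; the `ex:` vocabulary, its `shelfLife` property, and all the values are hypothetical, invented purely for illustration, as is the `describe` helper.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
EX = "http://www.example.com/terms#"  # hypothetical home-grown vocabulary

# Ask ElementTree to use readable prefixes when it serializes.
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)
ET.register_namespace("ex", EX)


def describe(about, properties):
    """Build one rdf:Description whose properties may come from any namespace."""
    root = ET.Element("{%s}RDF" % RDF)
    desc = ET.SubElement(root, "{%s}Description" % RDF,
                         {"{%s}about" % RDF: about})
    for namespace, local, value in properties:
        ET.SubElement(desc, "{%s}%s" % (namespace, local)).text = value
    return ET.tostring(root, encoding="unicode")


doc = describe("http://www.example.com/spam", [
    (DC, "title", "Spam"),            # borrowed from Dublin Core
    (DC, "date", "2003-02-25"),       # borrowed from Dublin Core
    (EX, "shelfLife", "indefinite"),  # made-up local property
])
print(doc)
```

Nothing had to be negotiated between the vocabularies: each property is just another statement about the same resource.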

RDF/XML may also do better for open-ended structures that need to remain (at least to some extent) backward-compatible. Building extensibility into a validatable XML structure is tricky. The OEBPSWG ran into serious problems with the package file. Stuffing more (well, stuff) into it meant a stark choice: change the DTD and make some implementations (not to mention publications!) based on that DTD unusable, breaking backward compatibility in other words, or work through an incredibly creaky extension system.

Eventually they took the break-backward-compatibility route, and one of the results thereof was XPackage, itself based on RDF/XML. (As best I can tell, by the way, the PSWG is moribund, if not actually dead. Too bad. We did some good work in our day.) XPackage is syntax-constrained, but I tend to believe that over time the constraints would have loosened.

XPackage never enjoyed the full support of the working group, some members of which clung to dumping the problem on somebody else (“modularization”) as the solution. I, in near-total ignorance, backed the RDF-based design, and now that I know much more than I did then, I must say I think I did the right thing. The only other feasible option is allowing package-file readers to ignore whatever they don’t understand, but that implies that the working group would never again develop deal-breaker markup for the package file, which seems unlikely.

Anyhow. RDF is good at playing nice with others and at coping with open-ended information sets. What it isn’t, and what the SemWeb people too often seem to think it is, is magic pixie dust.

Okay, so I have this code in a block of RDF (Shell, kick me if I get this wrong; I’m a wingin’ it):

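<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foo="http://www.example.com/foo#"> <!-- foo namespace URI is a placeholder -->
  <rdf:Description rdf:about="http://www.example.com/spam">
    <foo:bar>baz</foo:bar>
  </rdf:Description>
</rdf:RDF>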

Which, translated into English, says “Something that has been identified by the string http://www.example.com/spam has a foo:bar of baz.”

Whoopie. Be still my heart. If I don’t know anything about foo, bar, baz, example.com, or spam, this statement cannot be anything but pure gibberish. It doesn’t matter that I know the relationship between the parts of the sentence. It doesn’t matter that I can draw a pretty graph. That’s where the SemWeb stuff falls down, even if you toss in ontologies (which might tell you, for example, that foo:bar can only have “baz” or “bacon” as values).
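To make that concrete, here is a small Python sketch (standard library only) that pulls the statement out and checks it against a toy stand-in for an ontology. The `foo` namespace URI, the `ALLOWED` table, and both helper functions are my own assumptions for illustration; a real ontology language is a good deal richer than a dictionary of permitted values.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOO = "http://www.example.com/foo#"  # hypothetical namespace for the foo: prefix

SNIPPET = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foo="http://www.example.com/foo#">
  <rdf:Description rdf:about="http://www.example.com/spam">
    <foo:bar>baz</foo:bar>
  </rdf:Description>
</rdf:RDF>
"""

# Toy "ontology": the only values foo:bar is permitted to take.
ALLOWED = {FOO + "bar": {"baz", "bacon"}}


def statements(rdfxml):
    """Yield (subject, predicate, value) for each property element."""
    root = ET.fromstring(rdfxml)
    for desc in root.iter("{%s}Description" % RDF):
        subject = desc.get("{%s}about" % RDF)
        for prop in desc:
            # ElementTree spells the tag "{namespace}local"; join the halves.
            namespace, local = prop.tag[1:].split("}", 1)
            yield subject, namespace + local, prop.text


def is_allowed(statement):
    """True if the value is permitted, or if the predicate is unconstrained."""
    _subject, predicate, value = statement
    allowed = ALLOWED.get(predicate)
    return allowed is None or value in allowed


for statement in statements(SNIPPET):
    print(statement, is_allowed(statement))
```

The program can tell you that “baz” is a legal value and “eggs” would not be. What it cannot tell you is what a foo:bar is, because nobody ever told it.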

Computers only know what you tell ’em. They don’t automagically know foo from bar any more than humans do. Inference only gets you so far. Sure, it might be further than we’ve been yet; I’m inclined to think so, myself. At some point, though, somebody’s got to know what the bits of the vocabularies mean, and all the inferential power in the world won’t get that across.

Did any of that answer your question, Leigh?