Subject: Re: XML and lisp
From: Erik Naggum <erik@naggum.net>
Date: Sat, 25 Aug 2001 12:08:48 GMT
Newsgroups: comp.lang.lisp
Message-ID: <3207730126705119@naggum.net>

* Barry Fishman <barry_fishman@acm.org>
> I looked again, and your incantations did not work.  Attributes still
> seem to be in the language.  Sigh.

> I agree that when XML is used as a data definition they are "completely
> arbitrary" and make a syntactic separation which is destructive.  I,
> personally, just avoid using them when I have control of the XML I use to
> define data.  But I can't ignore them or re-format them, when I need to
> generate XML which someone or some standard defined to use them.  That
> battle belongs in the XML standards committees, and I am afraid it's a bit
> late to change their minds.

How you work with XML is not defined by those standards bodies.  What your
_internal_ representation of XML looks like is not defined by those
standards bodies.

One of the fundamental properties of Lisp is that we have a very nice and
well-defined mapping between external and internal representation for most
of our object types.  There is no well-defined mapping between XML syntax
and internal representation.  Lots of ways are equally valid.  Insisting on
only some of them is counter-productive.

> If I just treat attributes as subordinate elements, I lose the ability to
> simply translate from lisp into XML.

You have made up your mind about this, so I shall not try to convince you
of the errors of your ways.  People who are dead set on their ways should
be left alone, mostly because they get cranky when faced with alternatives.

> In other news articles you seem to suggest that you use information
> outside the lisp representation to make that determination.

No, you do not understand, and that is because you do not even try.

> This means that my tools would require a priori knowledge, which I feel a
> simple lisp->XML (non-interpreting) translator should not need.

I see that you have to be very hard and fast on how you represent your
information.  This is your choice.  I wish you would recognize it as a
choice, and not try to impose a very specific view on a reality that is
far more flexible and adaptable than you have shown that you believe it to
be.

> I don't think lisp->XML translators should have constraints that XML
> parsers don't have.

Well, that is another choice you have made.  Other people, other choices.

> In code which interprets the lispified XML, I know what the grammar is,
> so can't I (at that time) bury any abstraction issues in the access
> methods?

What does it matter to your access whether something is an attribute or a
sub-element?  Why do you need to retain the distinction internally?

> I admit I don't fully understand the abstraction benefits with which you
> are concerned.

I appreciate that you state this, because you certainly have not.

> I've been overwhelmed in tracking all the XML languages which are being
> defined.

Yes, overwhelmed by bad design, most people's brains shut down and they
refuse to deal with a massive simplification because it threatens to be as
painful as dealing with the complexity they have barely survived.

> I was hoping that being able to map them into lisp syntax would help
> avoid being buried in XML's confusing syntax.

That is my idea.  I am sorry for you that you have to define away the
solution to your problem by insisting on a trivial one-to-one mapping of
conceptual elements that effectively blocks your own conceptualization.

> When looking at them in a lisp syntax, things can become clearer (and seem
> less innovative).

How very true.

> I don't agree that the distinction between attributes and entities is
> always arbitrary.

Attributes and entities are very different concepts and the distinction
between them is of fundamental importance.  I fail to see how you think I
have made any claims about their relationship, however.  I am talking about
_elements_.

> SGML does stand for Simple Graphical *Markup* Language,

It stands for Standard Generalized Markup Language, actually.  The key to
understanding the name is that "generalized markup" is something more than
mere markup.  SGML has aspirations beyond simply marking up text.

> and in a markup language, I think it is important to distinguish the text
> of a document from its markup.

I think I already said that.

> Multiple translators may be used, and they should not need to be kept up
> to date on what attributes are used in the other translators.

Your value judgments are your choice.  I happen to disagree with them.  If
you try to deny me this, please realize that I do not care at all.

> In an expression like:
>
>   <header1><italic>Wow</italic>, this is difficult.</header1>
>
> or as lisp (which I think is more readable):
>   (header1 (italic "Wow") " this is difficult")
>
> it isn't clear whether "Wow" is text or the value of an attribute
> unless you have prior knowledge of whether `italic` is an attribute in
> the context of a header1 directive.

Well, first off: You _have_ that prior knowledge.  Your application will
actually need to know what to do with it whether it is an attribute or a
sub-element.  If your application does not know what to do with it, I fail
to see how whether it is an attribute or an element can matter to you.  If
you _do_ know what to do with it, how does it matter to you whether it came
from an attribute value or a sub-element?

> So here the distinction is simple, clear, and useful.

It is arbitrary.

> This is still important for things like xhtml -- and probably docbook,
> whose standard I have not yet assimilated.

No, it is fundamentally unimportant.  Please try to accept this premise for
the sake of discussion, and see if something you believe falls out and
shows itself to you as more important than your simple protestations.

> In my previous message I suggested that:
>   <header1 italic="Wow"> this is difficult</header1>
>
> become:
>   (header1 ((italic "Wow")) " this is difficult")
>
> With minimal (but I admit real) damage to the syntax.

Keeping the distinction between attributes and content is keeping you from
realizing how simply and efficiently you can deal with XML data.  But that
is your choice.  I fully expect that loads of people who have fused their
brains shut and have fully "integrated" the false dichotomy of attributes
and contents will never be able to unfuse it and open up to a very simple
realization: whether something is an attribute or an element has
absolutely no bearing on anything _other_ than the specific syntax in
SGML/XML.

Those who grasp the concepts involved will see that attributes are just
another form of contents.  Those who do not grasp the concepts involved
will think that attributes are different from contents because they have
been given syntactically different expression.  But it is always the syntax
that follows the function.

Someone believed that meta-information should be fundamentally different
from information.  Someone believed that the contents of elements should be
text that wound up in the final document on the printed page and the values
of attributes should not, but should only influence the processing of the
information.  This worked only as long as SGML was used as a markup
language for documents and had no aspirations towards being an abstract
structuring syntax.  When it came to be used as a more abstract syntax,
there _is_ no inherent quality that determines whether some value ends up
displayed or not.  That has to be supplied by the software that processes
the information, which is precisely prior knowledge of the structure and
its meaning.

///
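
The claim that attributes are just another form of contents can be made
concrete with a minimal sketch.  It is written in Python rather than Lisp
only because Python's standard library ships an XML parser; nested lists
stand in for s-expressions.  The function name `to_sexp` and the two sample
strings are assumptions made for illustration (the attribute form is
adjusted to carry the same text as the element form), not code from this
thread.

```python
import xml.etree.ElementTree as ET


def to_sexp(element):
    """Convert an ElementTree element to a nested list, s-expression style.

    Attributes are emitted as ordinary child forms, ahead of the real
    children, so the attribute/sub-element distinction is erased.
    """
    form = [element.tag]
    # Attributes become [name, value] children, just like sub-elements.
    for name, value in element.attrib.items():
        form.append([name, value])
    # Text that precedes the first child element.
    if element.text and element.text.strip():
        form.append(element.text)
    for child in element:
        form.append(to_sexp(child))
        # Text that follows this child but still belongs to the parent.
        if child.tail and child.tail.strip():
            form.append(child.tail)
    return form


# Hypothetical inputs: the same information, once as a sub-element and
# once as an attribute.
as_element = ET.fromstring(
    "<header1><italic>Wow</italic>, this is difficult.</header1>")
as_attribute = ET.fromstring(
    '<header1 italic="Wow">, this is difficult.</header1>')

print(to_sexp(as_element))
print(to_sexp(as_attribute))
```

Both calls print `['header1', ['italic', 'Wow'], ', this is difficult.']`.
Whether `italic` is then rendered, or merely steers processing, is decided
by the code that consumes the list, which is the prior knowledge of
structure and meaning described above.  Going back out to XML would need
the same kind of knowledge to choose between attribute and sub-element
syntax; the sketch does not attempt that direction.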