Why does XML have attributes?

Have you ever wondered why XML has attributes as well as elements, a design that keeps fueling never-ending "element vs. attribute" discussions?
And why is XML so redundant, repeating the tag name in the end-tag and thus making documents larger?
I asked Charles Goldfarb, the inventor of SGML, the markup language from which XML and HTML were derived. Its roots go back to GML in the late 1960s, and it already had <tags> and attributes.
He was kind enough to reply. Enjoy:


- In <elem>text</elem>, why repeat "elem" a second time?

In fact, there is an option in SGML that allows you to omit the element-type
name from an end-tag, but it is rarely supported. (In XML the option does not
exist at all.) The name is there primarily for the convenience of humans
debugging the markup, as there are large document types, such as aircraft
maintenance manuals, where elements can nest very deeply.


- Why are there attributes in SGML? One could express the same just with elements.

In fact, one could just as easily express the same just with attributes!

The basic notion is that elements represent objects while attributes represent
their properties. However, the property "content" is given a convenient syntax
-- everything between the start-tag and matching end-tag -- which makes it
simpler to express content hierarchy by simply nesting elements.

In a nutshell, if information is part of the content of the document, represent
it as an element. If it is some other property, represent it as an attribute.

I wish you the best of luck with your studies.

Charles Goldfarb

--
©2008 Charles F. Goldfarb * www.xmlhandbook.com * www.xmlbooks.com
The XML Handbook™ * 5th Edition ISBN 0-13-049765-7 * 100,000 in print!
--
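
The first answer notes that XML, unlike SGML, has no option to omit the name from an end-tag. A minimal sketch of how to check this, assuming Python's standard xml.etree.ElementTree parser (the helper name is_well_formed is made up for illustration and is not part of the reply):

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the string parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


full_end_tag = "<elem>text</elem>"
short_end_tag = "<elem>text</>"  # SGML-style empty end-tag, not allowed in XML

print(is_well_formed(full_end_tag))   # True
print(is_well_formed(short_end_tag))  # False
```

The repeated name also lets a parser report crossed nesting such as <a><b>text</a></b>, which is the debugging convenience the answer mentions.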

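The second answer says the same information can be written with elements only, with attributes only, or with a mix of both. The sketch below illustrates that with three spellings of one invented record (the note document, its id and lang properties, and the reader function names are all assumptions for illustration), again using Python's standard xml.etree.ElementTree:

```python
import xml.etree.ElementTree as ET

# Mixed style: properties as attributes, content between the tags.
mixed_style = '<note id="n1" lang="en">Check the oil level.</note>'

# Elements only: every piece of information is a child element.
element_style = (
    "<note><id>n1</id><lang>en</lang>"
    "<text>Check the oil level.</text></note>"
)

# Attributes only: even the content is just another property.
attribute_style = '<note id="n1" lang="en" text="Check the oil level."/>'


def read_mixed(xml_text: str) -> dict:
    node = ET.fromstring(xml_text)
    return {"id": node.get("id"), "lang": node.get("lang"), "text": node.text}


def read_elements(xml_text: str) -> dict:
    node = ET.fromstring(xml_text)
    return {child.tag: child.text for child in node}


def read_attributes(xml_text: str) -> dict:
    return dict(ET.fromstring(xml_text).attrib)


# All three spellings carry the same record.
print(read_mixed(mixed_style) == read_elements(element_style))        # True
print(read_mixed(mixed_style) == read_attributes(attribute_style))    # True
```

The mixed style is the one the reply recommends: the sentence is content, so it sits between the tags, while id and lang are other properties, so they become attributes.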

Posted by Martin Konicek on 1:26 PM

3 comments:

Paul said...

Sorry that this is in English; the only other language I speak has long been dead.

Isn't there a bit of a contradiction in asking both why there are attributes and why there are end-tags? If you take the time to write a good XML schema and use attributes wisely, you will not run into the crush of too many closing tags.

That is to say, if you use attributes properly in your XML, you eliminate the need for excess elements and thus for excess closing tags.

shevy said...

"which makes it simpler to express content hierarchy by simply nesting elements."

Don't think I can agree there... if he nests already (but with tag elements), he could just use YAML too.
Or, in other words, if nesting is there, a closing tag consisting of just a / and the usual angle brackets should work as a reference to the last open element. Only those 3 chars (Hmmmm, your blog filters these chars... claiming that tags are not allowed here. :( )

But if we look at how XML is used on the www, it is a primitive and ugly database format.

Why not pick something else for objects, just like JavaScript did?

charlie = Cat.new
charlie.meow
tim = Dog.new
tim.eat(charlie)
tim.belly_size 20

Well, at least in JavaScript one gets to set these "objects", and the line noise is less annoying than with XML.

XML based solutions will turn out to be too complex... just look at how http://dl.matroska.org/downloads/libebml/ died ...

Martin Konicek said...

Paul, thanks for the feedback. I was just thinking of an XML where you don't even have to decide when to use elements or attributes (because there are no attributes) and yet your XML is always short (because a closing tag looks just like this: </>).
Basically, I like the way JSON is designed; XML seems a little more bloated to me.
