
XML


This page walks you through the basics of XML.


XML (eXtensible Markup Language) is a text-based data format in which text is "marked up" with tags in angle brackets (< >) that describe what the text represents. There is a good introduction at Wikipedia:XML, but here we'll cover some basics.


Here's an example of a polygon described in XML (for a fully fledged format see: GML (Geography Markup Language) on Wikipedia):

<?xml version="1.0" encoding="UTF-8"?>
<map>
  <polygon id="p1">
    <points>100,100 200,100 200,200 100,200 100,100</points>
  </polygon>
</map>

You can see that tags surround text, and that tags nest inside one another in a hierarchy, describing the text in increasing detail the closer they are to it. The XML is also "well formed": among other things, each start tag (e.g. <polygon>) has a matching end tag in the same case (e.g. </polygon>), and tags are properly nested. Tags that enclose nothing look like this: <TAG /> or <TAG/>. If you've ever coded in HTML, you'll be thinking this is a rather stricter version of that. You'd be right. A nice tool for writing XML is Microsoft's free XML Notepad.
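To make the hierarchy concrete, here is a minimal sketch that reads a document of the same shape using Python's standard xml.etree.ElementTree module. The coordinates are made up for illustration, and the helper names (parse_points, describe) are assumptions of this sketch rather than anything defined on this page.

```python
import xml.etree.ElementTree as ET

# A small document with the same shape as the polygon example
# (the coordinates here are made up for illustration).
XML_TEXT = """<?xml version="1.0" encoding="UTF-8"?>
<map>
  <polygon id="p1">
    <points>0,0 10,0 10,10 0,0</points>
  </polygon>
</map>"""


def parse_points(text):
    """Turn 'x,y x,y ...' into a list of (x, y) float tuples."""
    return [tuple(float(v) for v in pair.split(",")) for pair in text.split()]


def describe(element, depth=0):
    """Return one indented line per element, outermost element first."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(describe(child, depth + 1))
    return lines


root = ET.fromstring(XML_TEXT)
hierarchy = describe(root)              # tag names, indented by nesting depth
polygon = root.find("polygon")          # first <polygon> directly under <map>
polygon_id = polygon.get("id")          # attribute on the <polygon> start tag
coords = parse_points(polygon.find("points").text)

print("\n".join(hierarchy))
print(polygon_id, coords)
```

The parser hands back the same nesting that the tags express: "map" at the top, "polygon" inside it, and the text only at the innermost "points" level.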


Because of this formality, software can check XML for "well formedness" automatically. Beyond that, XML can also be required to comply with a schema: a collection of rules about which tags are allowed and how they can be used. XML that complies with its schema is said to be "valid". There are two popular schema types in XML: the (older) DTD (Document Type Definition: Wikipedia) and the (newer) XSD (XML Schema Definition: Wikipedia).
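Before moving on to schemas, the well-formedness check on its own can be sketched in a few lines. This assumes Python's standard xml.etree.ElementTree parser, which raises ParseError on input that is not well formed; the function name is_well_formed and the sample strings are illustrative only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


samples = {
    # Every start tag has a matching end tag in the same case.
    "good": '<map><polygon id="p1"><points>0,0 1,0 1,1 0,0</points></polygon></map>',
    # </Polygon> does not match <polygon>: XML is case sensitive.
    "wrong_case": '<map><polygon id="p1"><points>0,0 1,0 1,1 0,0</points></Polygon></map>',
    # <polygon> is never closed.
    "unclosed": '<map><polygon id="p1"><points>0,0 1,0 1,1 0,0</points></map>',
    # A tag that encloses nothing, written in the short <TAG/> form.
    "empty_element": '<map><polygon id="p1"/></map>',
}

results = {name: is_well_formed(doc) for name, doc in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well formed' if ok else 'NOT well formed'}")
```

Note that this only tests well-formedness. The "empty_element" sample passes even though its polygon has no points, because nothing here knows what a polygon ought to contain; that is the job of a schema.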

Here's a DTD for the polygon example:

<!ELEMENT map (polygon)>
<!ELEMENT polygon (points)>
<!ATTLIST polygon id ID #IMPLIED>
<!ELEMENT points (#PCDATA)>

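This essentially means: a "map" must contain exactly one "polygon" (writing (polygon+) or (polygon*) instead would allow one-or-more or zero-or-more); a "polygon" must contain exactly one "points" element, and can optionally have an "attribute" called "id", whose value must be unique within the document; "points" must contain plain text. You can find a tutorial on how to write DTDs here at W3Schools.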

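Here's the XML above (right click "Save as...") linked to the DTD. Software can now check the XML against the DTD to make sure it is not only well formed but valid. Save both files into a directory and drag the XML into Internet Explorer or Firefox, both of which can display XML. Be aware that a browser displaying the file is not proof of validity: Firefox, for instance, checks only that the XML is well formed, so use a validating parser or XML editor to confirm that the file really matches the DTD.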
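For reference, an XML file is usually tied to a DTD through a DOCTYPE declaration near the top of the file. The sketch below assumes the DTD has been saved as map.dtd (a hypothetical filename; the linked files may use a different one) and uses made-up coordinates. It also illustrates why validity is a separate step from well-formedness: Python's standard-library parser is non-validating, so it accepts a document that breaks the DTD as long as that document is well formed.

```python
import xml.etree.ElementTree as ET

# DOCTYPE pointing at an external DTD; "map.dtd" is an assumed filename.
LINKED = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE map SYSTEM "map.dtd">
<map>
  <polygon id="p1">
    <points>0,0 10,0 10,10 0,0</points>
  </polygon>
</map>"""

# The same rules written inline (an "internal subset"), followed by a
# document that breaks them: <circle> is not declared anywhere in the DTD.
INVALID = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE map [
  <!ELEMENT map (polygon)>
  <!ELEMENT polygon (points)>
  <!ATTLIST polygon id ID #IMPLIED>
  <!ELEMENT points (#PCDATA)>
]>
<map>
  <circle/>
</map>"""

# Both parse without complaint: this parser checks well-formedness only
# and does not fetch or enforce the DTD.
linked_root = ET.fromstring(LINKED)
invalid_root = ET.fromstring(INVALID)

print(linked_root.tag)
print([child.tag for child in invalid_root])
```

A validating parser given the second document would be expected to reject it, since "map" is declared to contain a "polygon" and nothing else.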


Here's an XML Schema Definition (XSD) for the same polygon example:

<xsi:schema xmlns:xsi="http://www.w3.org/2001/XMLSchema"
   targetNamespace="http://www.geog.leeds.ac.uk"
   xmlns="http://www.geog.leeds.ac.uk"
   elementFormDefault="qualified">
   <xsi:element name="map">
      <xsi:complexType>
         <xsi:sequence>
            <xsi:element name="polygon" minOccurs="0" maxOccurs="unbounded">
               <xsi:complexType>
                  <xsi:sequence>
                     <xsi:element name="points" type="xsi:string"/>
                  </xsi:sequence>
                  <xsi:attribute name="id" type="xsi:ID"/>
               </xsi:complexType>
            </xsi:element>
         </xsi:sequence>
      </xsi:complexType>
   </xsi:element>
</xsi:schema>

As you can see, even though it says much the same thing (the one difference being that minOccurs="0" and maxOccurs="unbounded" let a map hold any number of polygons), it is much more verbose, mainly because it includes information on the namespace, that is, a unique identifier (like http://www.geog.leeds.ac.uk) by which this XML tag "polygon" can be distinguished from any other "polygon" XML tag. XML Schema Definitions have the advantage (for a computer) of being written in XML themselves, meaning you only need one language parser. You can find a tutorial on how to write XSDs here at W3Schools, along with a tutorial on XML namespaces.
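The effect of a namespace can be seen with a short sketch, again assuming Python's standard xml.etree.ElementTree module. The second namespace, http://example.org/other, is invented purely for contrast, as are the coordinates. ElementTree reports each tag as "{namespace}name", so the two "polygon" elements come out with different full names.

```python
import xml.etree.ElementTree as ET

NS = "http://www.geog.leeds.ac.uk"      # namespace URI used by the schema
OTHER_NS = "http://example.org/other"   # made-up namespace, for contrast only

# The default namespace (xmlns=...) applies to <map> and the first <polygon>;
# the second <polygon> is placed in the other namespace via a prefix.
DOC = f"""<map xmlns="{NS}" xmlns:other="{OTHER_NS}">
  <polygon id="p1"><points>0,0 10,0 10,10 0,0</points></polygon>
  <other:polygon/>
</map>"""

root = ET.fromstring(DOC)
tags = [child.tag for child in root]         # fully qualified tag names
ours = root.findall(f"{{{NS}}}polygon")      # only polygons in our namespace

for tag in tags:
    print(tag)
print(len(ours), ours[0].get("id"))
```

Both elements are written as "polygon" in the file, but software that is namespace-aware treats them as two different kinds of element.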

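Again, here's the polygon XML (right click "Save as...") linked to the XSD. Software can now check the XML against the XSD to make sure it is not only well formed but valid. Save both files into a directory and drag the XML into Internet Explorer or Firefox, both of which can display XML. Be aware that a browser displaying the file is not proof of validity: Firefox, for instance, does not check XML against an XSD, so use a schema-aware validator or XML editor to confirm that the file really matches the XSD.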


You'll notice that the XML we've looked at so far just comes up as an unstyled tree of tags and text in Internet Explorer/Firefox. Next we'll look at styling it.

