LINUX GAZETTE
...making Linux just a little more fun!
Working with XSLT
By Daniel Guerrero

eXtensible Stylesheet Language Transformations (XSLT) is used mostly to transform XML data into HTML, but it can transform XML (or anything expressed in XML, such as RDF) into whatever we need, including other XML vocabularies and plain text.

The W3C defines XSL (eXtensible Stylesheet Language) as consisting of three parts: XSLT; XPath, an expression language used by XSLT to access or refer to parts of an XML document; and XSL Formatting Objects, an XML vocabulary for specifying formatting semantics.

Meeting XSLT

First of all, we need to state that our XML document is an XSL stylesheet and declare the XSL namespace:

<xsl:stylesheet version="1.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

...

</xsl:stylesheet>

After that, the main element we will use is xsl:template with its match attribute. The template is applied whenever a node of the XML document matches the XPath pattern given in match:

<xsl:stylesheet version="1.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/"> <!-- '/' is taken from XPath and matches the root of the document -->
    <!-- do something with the attributes  of the node -->
</xsl:template>

</xsl:stylesheet>

Inside an xsl:template, we can get the value of a node with the xsl:value-of element, whose select attribute holds an XPath expression pointing to an element or attribute. Let's first write an example XML document with some information:

<!-- hello.xml -->

<hello>
   <text>Hello World!</text>
</hello>

And this is the XSLT stylesheet that extracts the text inside the root element (hello):

<!-- hello.xsl -->
<xsl:stylesheet version="1.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/"> 
  <html> 
    <head>
      <title>Extracting <xsl:value-of select="//text"/> </title>
       <!--  here '//text' selects the same node as '/hello/text'; '//' finds text elements at any depth, which saves a lazy person like me from writing the full path  -->
    </head>
    
    <body>
       <p>
           The <b>text</b> of the root element is: <b><xsl:value-of select="//text"/></b>
       </p> 
    </body>
  </html>
</xsl:template>

</xsl:stylesheet>

The HTML output is:

<!-- hello.html -->

<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
   
      <title>Extracting Hello World! </title>
   </head>
   <body>
      <p>
         The <b>text</b> of the root element is: <b>Hello World!</b>
      </p>
   </body>
</html>
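
The output does not have to be HTML. As a minimal sketch of the XML-to-plain-text case mentioned in the introduction, the stylesheet below processes the same hello.xml; the file name hello_text.xsl is only an assumption made for illustration:

```xml
<!-- hello_text.xsl (hypothetical file name) -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!-- ask the processor for plain text instead of XML or HTML markup -->
<xsl:output method="text"/>

<xsl:template match="/">
  <!-- xsl:text writes literal text exactly as given -->
  <xsl:text>The text of the root element is: </xsl:text>
  <xsl:value-of select="hello/text"/>
  <!-- &#10; is a newline character -->
  <xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>
```

Applied to hello.xml, this should print the single line "The text of the root element is: Hello World!" with no tags around it.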

Selecting Attributes

In XPath, @att selects the attribute named att of the current element. For example:

<!-- hello_style.xml -->

<hello>
   <text color="red">Hello World!</text>
</hello>

And the XSLT:

<!-- hello_style.xsl -->
<xsl:stylesheet version="1.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/"> 
  <html> 
    <head>
      <title>Extracting <xsl:value-of select="//text"/> </title>
    </head>
    
    <body>
       <p>
           The <b>text</b> of the root element is: <b><xsl:value-of select="//text"/></b>
           and its <b>color</b> attribute is: <xsl:value-of select="//text/@color"/>
       </p> 
    </body>
  </html>
</xsl:template>

</xsl:stylesheet>

The HTML output will be:

<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
   
      <title>Extracting Hello World! </title>
   </head>
   <body>
      <p>
         The <b>text</b> of the root element is: <b>Hello World!</b>
         and its <b>color</b> attribute is: red
      </p>
   </body>
</html>

If you are thinking of using this information to, in this case, show the text Hello World! in red, yes, it's possible, and in two ways: by storing the value in a variable and using it in the attributes of a font element, for example, or by using the xsl:attribute element.
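
As a rough sketch of the xsl:attribute route, the stylesheet below builds the color attribute of a font element from the XML. The file name and the choice of the font element are assumptions made for illustration; it is meant to be applied to hello_style.xml:

```xml
<!-- hello_style_attribute.xsl (hypothetical file name) -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
  <html>
    <body>
      <p>
        <font>
          <!-- xsl:attribute must come before any other content of the element -->
          <xsl:attribute name="color">
            <xsl:value-of select="//text/@color"/>
          </xsl:attribute>
          <xsl:value-of select="//text"/>
        </font>
      </p>
    </body>
  </html>
</xsl:template>

</xsl:stylesheet>
```

With color="red" in the source, the font element in the result should carry color="red".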

Variables

Variables can hold constants or the value of an element or attribute.

Assigning a constant is simple:

<!-- variables.xsl -->

<xsl:stylesheet version="1.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">


<xsl:template match="/"> 

<!--  definition of the variable  -->
<xsl:variable name="path">http://somedomain/tmp/xslt</xsl:variable> 

  <html> 
    <head>
      <title>Examples of Variables</title>
  </head>
    
    <body>
       <p>
           <a href="{$path}/photo.jpg">Photo of my latest travel</a>
       </p> 
    </body>
  </html>
</xsl:template>

</xsl:stylesheet>

The HTML output:

<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
   
      <title>Examples of Variables</title>
   </head>
   <body>
      <p><a href="http://somedomain/tmp/xslt/photo.jpg">Photo of my latest travel</a></p>
   </body>
</html>

A variable can also take its value from the content or attributes of a node, by using the select attribute:

<!-- variables_select.xsl -->

<xsl:stylesheet version="1.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">


<xsl:template match="/"> 
    <html>
       <head>
         <title>Examples of Variables</title>
        </head>
       <body>
           <xsl:apply-templates select="//photo"/>
       </body>
    </html>
</xsl:template>

<xsl:template match="photo"> 
    <!--  definition of the variables  -->
    <xsl:variable name="path">http://somedomain/tmp/xslt</xsl:variable>  
    <xsl:variable name="photo" select="file"/> 
     <p> 
       <a href="{$path}/{$photo}"><xsl:value-of select="description"/></a>
     </p>       
</xsl:template>

</xsl:stylesheet>

And the XML source (I don't include pictures of myself, because I don't want to scare you :-) ):

<!-- variables_select.xml -->

<album>
   <photo>
      <file>mountains.jpg</file>
      <description>me at the mountains</description>
   </photo>
   
   <photo>
      <file>congress.jpg</file>
      <description>me at the congress</description>
   </photo>
   
    <photo>
      <file>school.jpg</file>
      <description>me at the school</description>
   </photo>        
</album>

And the HTML output:

<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
   
      <title>Examples of Variables</title>
   </head>
   <body>
      <p><a href="http://somedomain/tmp/xslt/mountains.jpg">me at the mountains</a></p>
      <p><a href="http://somedomain/tmp/xslt/congress.jpg">me at the congress</a></p>
      <p><a href="http://somedomain/tmp/xslt/school.jpg">me at the school</a></p>
   </body>
</html>

Notice that the template matching photo is called three times because of xsl:apply-templates: for every node selected by its select attribute, the processor calls the xsl:template whose match pattern fits that node.

OK, so are you impatient to turn the text of hello_style.xml red? Try to do it with variables; if you can't, open this page: misc/danguer/hello_style_variables.xsl
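
For readers who want to compare notes, one possible solution with a variable might look like the sketch below. It is not necessarily what the linked file contains; the variable name color and the use of a font element are assumptions:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
  <!-- store the value of the color attribute in a variable -->
  <xsl:variable name="color" select="//text/@color"/>
  <html>
    <body>
      <p>
        <!-- {$color} is replaced by the value of the variable -->
        <font color="{$color}"><xsl:value-of select="//text"/></font>
      </p>
    </body>
  </html>
</xsl:template>

</xsl:stylesheet>
```

The curly braces work the same way as in the {$path} example above: inside an attribute of an output element, the processor replaces them with the value of the expression.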

Sorting

XSLT can sort the nodes it processes with <xsl:sort select="sort_by_this_element_or_attribute"/>. This element must be placed inside an xsl:apply-templates element (xsl:for-each accepts it too). You can sort by an XML element or attribute, in ascending or descending order, and you can also specify the case order (whether lower case comes before upper case, or vice versa).

I will use the example of the album, and I will add only the sort element:

 <xsl:apply-templates select="//photo">
    <xsl:sort select="file" order="descending"/>
 </xsl:apply-templates>

This only changes the order in which the photos appear in the HTML. In fact, the processor first sorts all the photo elements selected from our XML and then passes them to the matching template in that order; that is why the xsl:sort element must go inside xsl:apply-templates.
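
The other sorting options can be combined in the same place. The fragment below is only a sketch for the album example: sorting by description and the choice of keys are assumptions made for illustration, while order and case-order are standard attributes of xsl:sort:

```xml
<xsl:apply-templates select="//photo">
  <!-- first key: description, A to Z, lower case before upper case -->
  <xsl:sort select="description" order="ascending" case-order="lower-first"/>
  <!-- second key: used only when two descriptions are equal -->
  <xsl:sort select="file"/>
</xsl:apply-templates>
```

When several xsl:sort elements are present, the first one is the most significant key and the following ones break ties.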

The complete XSL and HTML files are among the examples that accompany this article.

if statement

There will be cases when you need to output some text only if a certain XML element (or attribute) is present or has a given value. The xsl:if element does this for you. Let's imagine you have a page listing documents (this example is taken from my 'tests' at the TLDP-ES project), and for each document you know whether its sources were converted to PDF, PS or HTML. This information is in your XML, so you can test whether the PDF file was generated and, if so, put a link to it:

     <xsl:if test="format/@pdf = 'yes'">
        <a href="{$doc_path}/{$doc_subpath}/{$doc_subpath}.pdf">PDF</a>
     </xsl:if>

If the pdf attribute of the document's format element is yes, as in this example:

   <document>
     <title>Bellatrix Library and Semantic Web</title>
     <author>Daniel Guerrero</author>
     <module>bellatrix</module>
     <format pdf="yes" ps="yes" html="yes"/>
   </document>

Then the output will contain a link to the document in PDF format. If the attribute is 'no', or any other value the XML's DTD allows, no link is generated. The complete XSL and XML documents are among the examples that accompany this article.
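
Since xsl:if has no "else" branch, each format needs its own test. The template below is a sketch under stated assumptions: the article does not show how doc_path and doc_subpath are defined, so the values given here, as well as the .ps and index.html file names, are hypothetical:

```xml
<xsl:template match="document">
  <!-- hypothetical definitions: the original stylesheet may build these differently -->
  <xsl:variable name="doc_path">http://somedomain/docs</xsl:variable>
  <xsl:variable name="doc_subpath" select="module"/>

  <p>
    <xsl:value-of select="title"/>:
    <!-- each test is independent; a link appears only when its attribute is 'yes' -->
    <xsl:if test="format/@pdf = 'yes'">
      <a href="{$doc_path}/{$doc_subpath}/{$doc_subpath}.pdf">PDF</a>
    </xsl:if>
    <xsl:if test="format/@ps = 'yes'">
      <a href="{$doc_path}/{$doc_subpath}/{$doc_subpath}.ps">PS</a>
    </xsl:if>
    <xsl:if test="format/@html = 'yes'">
      <a href="{$doc_path}/{$doc_subpath}/index.html">HTML</a>
    </xsl:if>
  </p>
</xsl:template>
```

For the Bellatrix document above, all three links would be generated, because its three format attributes are set to yes.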

for-each statement

If you look at the XML document of the previous example, you will see that the first document lists three authors separated by commas. Obviously, a better way to separate the authors is to put each one in its own <author> tag:

   <document>
     <title>Donantonio: bibliographic system for automatic distributed publication. Specifications of Software Requirements</title>
     <author>Ismael Olea</author>
     <author>Juan Jose Amor</author>
     <author>David Escorial</author>
     <module>donantonio</module>
     <format pdf="yes" ps="no" html="yes"/>
   </document>

You might think of using xsl:apply-templates and an xsl:template match to put every name in a separate row, for example. That can be done, but you can also use the xsl:for-each statement:

     <xsl:for-each select="author">
        <tr>
           <td>
              Author: <xsl:apply-templates />
           </td>
        </tr>
     </xsl:for-each>

In this case, the processor goes through all the author elements of the document. If you are wondering what template I wrote to process the authors, the answer is none: since no template matches the text inside author, xsl:apply-templates falls back to XSLT's built-in rules, which simply copy the text, so it acts like a 'print' of the element selected by xsl:for-each.
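
For comparison, the template-based alternative mentioned before the for-each example could look roughly like this. It is a sketch, and the surrounding table element is an assumption about where the rows end up:

```xml
<xsl:template match="document">
  <table>
    <!-- hand every author child to the template below, one at a time -->
    <xsl:apply-templates select="author"/>
  </table>
</xsl:template>

<!-- called once for each author element -->
<xsl:template match="author">
  <tr>
    <td>
      <!-- '.' is the current node, so this prints the author's name -->
      Author: <xsl:value-of select="."/>
    </td>
  </tr>
</xsl:template>
```

Both versions should produce one row per author; xsl:for-each keeps everything in one place, while a separate template can be reused from other parts of the stylesheet.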

choose statement

The last XSLT element I will show you is xsl:choose, which works like the popular switch statement of languages such as C.

First you declare an xsl:choose element and then put each option in an xsl:when element. For the case where no xsl:when test is satisfied, you can add an xsl:otherwise element:

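  <!-- despite its name, this variable holds 1 for odd positions and 0 for even ones -->
  <xsl:variable name="even" select="position() mod 2"/>

  <xsl:choose>
     <xsl:when test="$even = 1">
        <xsl:text disable-output-escaping="yes"><![CDATA[<table width="100%" bgcolor="#cccccc">]]></xsl:text>
     </xsl:when>
     <xsl:when test="$even = 0">
        <xsl:text disable-output-escaping="yes"><![CDATA[<table width="100%" bgcolor="#99b0bf">]]></xsl:text>
     </xsl:when>
     <xsl:otherwise>
        <xsl:text disable-output-escaping="yes"><![CDATA[<table width="100%" bgcolor="#ffffff">]]></xsl:text>
     </xsl:otherwise>
  </xsl:choose>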

The position() function returns the position of the current node within the list of nodes being processed; for our documents, the number grows by one with each document. Here we only want to know whether that position is even or odd, so that we can use one table color for even documents and another for odd ones. I included xsl:otherwise only to illustrate its use: position() mod 2 is always 0 or 1, so there will never be a table with a white background in our library.

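If you ask why I used a CDATA section: without it, the stylesheet would not be well-formed XML, because the parser would demand the closing tag (</table>) inside the same xsl:when, while the table is actually closed further down. The closing tag therefore needs a CDATA section too. Keep in mind that a CDATA section is only another way of writing text, so the processor escapes it on output (as &lt;table ...&gt;) unless the text sits inside an <xsl:text disable-output-escaping="yes"> element.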
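
A way to avoid the CDATA sections altogether is to let xsl:choose pick only the color and to write the table as an ordinary, properly closed element. The template below is a sketch of that idea, not the code used in the library: the variable name color and the table content are assumptions, and it assumes the documents are selected with something like <xsl:apply-templates select="//document"/> so that position() counts only documents:

```xml
<xsl:template match="document">
  <!-- choose only the value that changes: the background color -->
  <xsl:variable name="color">
    <xsl:choose>
      <xsl:when test="position() mod 2 = 1">#cccccc</xsl:when>
      <xsl:otherwise>#99b0bf</xsl:otherwise>
    </xsl:choose>
  </xsl:variable>

  <!-- the table opens and closes in the same place, so the stylesheet stays well-formed -->
  <table width="100%" bgcolor="{$color}">
    <tr>
      <td><xsl:value-of select="title"/></td>
    </tr>
  </table>
</xsl:template>
```

Because the table is a normal element here, the result does not depend on whether the processor supports disable-output-escaping.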

Once again, I had to shorten the code; the complete version is among the examples that accompany this article.

XSLT Processors

Saxon

Saxon is an XSLT processor written in Java. I'm using version 6.5.2, and the following instructions apply to that version; for other versions, check the corresponding documentation on how to run Saxon.

Installation

After downloading the Saxon zip file, unzip it:

[danguer@perseo xslt]$ unzip saxon6_5_2.zip

Next, you must include the saxon.jar file in your class path; you can pass the path of the jar to java with the -cp option. I put saxon.jar in the xslt directory. You must also tell Java which class to run (for my Saxon version, 6.5.2, it is com.icl.saxon.StyleSheet) and pass as arguments the XML document and the XSLT stylesheet to use. For example:

[danguer@perseo xslt]$ java -cp saxon.jar com.icl.saxon.StyleSheet document.xml transformation.xsl

This sends the output of the transformation to standard output; you can redirect it to a file with:

[danguer@perseo xslt]$ java -cp saxon.jar com.icl.saxon.StyleSheet document.xml transformation.xsl > file_processed.html

For example, a document cards.xml would be transformed with a stylesheet cards.xsl like this:

[danguer@perseo xslt]$ java -cp saxon.jar com.icl.saxon.StyleSheet cards.xml cards.xsl > cards.html

And our first XSLT example, whose HTML result was shown earlier, is processed with:

[danguer@perseo xslt]$ java -cp saxon.jar com.icl.saxon.StyleSheet hello.xml hello.xsl > hello.html

xsltproc

xsltproc comes with all the major distributions. Its syntax is similar to Saxon's, except that the stylesheet comes before the XML document:

[danguer@perseo xslt]$ xsltproc hello.xsl hello.xml > hello.html

I know there are other XSLT processors, such as Sablotron, but I haven't used them, so I can't recommend any ;-).

[BIO] I'm trying to finish my Bachelor's degree at BUAP in Puebla, Mexico. I'm involved with the TLDP-ES project, which has made me learn all about these technologies; now I'm learning about the Semantic Web.


Copyright © 2003, Daniel Guerrero. Copying license http://www.linuxgazette.com/copying.html
Published in Issue 89 of Linux Gazette, April 2003