ODF-XSLT Manual

This manual explains how to download and install the php-odf-xslt package, how to create ODF-XSLT templates from within your favourite office application, and how to use the command-line tool to convert those templates into new ODF documents. The final chapter shows how to use and extend the ODF-XSLT library inside your own PHP applications. Developers who want to use ODF-XSLT in their applications will also want to read the API Reference.

1: Installation instructions

1.1: Requirements

To use PHP-ODF-XSLT, you will need the following software installed on your system:

  • PHP 5.2 or later
  • PHP CLI for the command-line utility
  • libxslt and the PHP XSL extension
  • zlib and the PHP Zip extension
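
A quick way to check these requirements from PHP itself is sketched below. It uses only built-in functions (version_compare() and extension_loaded()); the helper name check_odf_xslt_requirements is made up for this illustration and is not part of the package.

```php
<?php
// Hypothetical helper, not part of the odf-xslt package: returns a list of
// unmet requirements from section 1.1 (an empty list means all are met).
function check_odf_xslt_requirements()
{
    $missing = array();

    if (version_compare(PHP_VERSION, '5.2.0', '<')) {
        $missing[] = 'PHP 5.2 or later (found ' . PHP_VERSION . ')';
    }

    // extension_loaded() takes the extension name as listed by `php -m`.
    foreach (array('xsl', 'zip') as $extension) {
        if (!extension_loaded($extension)) {
            $missing[] = 'PHP extension: ' . $extension;
        }
    }

    return $missing;
}

$missing = check_odf_xslt_requirements();
if ($missing) {
    echo "Missing:\n  " . implode("\n  ", $missing) . "\n";
} else {
    echo "All requirements met.\n";
}
```

Run it with the PHP CLI so that you test the same PHP installation the command-line utility will use.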

1.2: Installation

Download the odf-xslt package from the download section and extract it somewhere convenient. Then open a terminal and run the following command from the package directory:

  make && sudo make install

2: Creating ODF-XSLT document templates

ODF-XSLT documents are essentially ODF documents in which XSLT markup has been inserted into the XML files inside the ODF container, usually just content.xml and styles.xml, because these two files are parsed automatically by the ODFXSLTProcessor. You can generate these ODF-XSLT documents directly using your favourite XML tool, or you can generate them from specially marked-up ODF documents using your favourite ODF editor, such as OpenOffice.org.

2.1: Creating XSLT stylesheets directly

If you unzip an ODF-XSLT file and take a look at the content.xml file, you can see that it's just plain XSLT. If you understand the OpenDocument Format specification and the XSL Transformations specification then you could generate these documents manually or from within another application. Below is an excerpt of a simple ODF-XSLT document.

  <?xml version="1.0" encoding="UTF-8"?>
  <xslt:stylesheet xmlns:xslt="http://www.w3.org/1999/XSL/Transform"
     xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" ... >
    <xslt:template match="/">
      <office:document-content>
        <office:body>
          <office:text>
            <text:sequence-decls>
              ...
            </text:sequence-decls>
            <text:p text:style-name="Standard">A value: <xslt:value-of select="/foo/bar" /></text:p>
            <xslt:for-each select="/foo/quu">
              <text:p text:style-name="Standard">A repeated paragraph</text:p>
            </xslt:for-each>
          </office:text>
        </office:body>
      </office:document-content>
    </xslt:template>
  </xslt:stylesheet>
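
To see what a stylesheet of this kind does when it is applied to data, here is a minimal sketch that uses PHP's built-in XSLTProcessor directly, not the ODFXSLTProcessor from this package. The stylesheet is a cut-down, self-contained variant of the excerpt above with the namespaces spelled out, and the sample data is invented for illustration.

```php
<?php
// Cut-down variant of the excerpt above, with all namespaces declared.
$xslt = <<<XSL
<?xml version="1.0" encoding="UTF-8"?>
<xslt:stylesheet version="1.0"
    xmlns:xslt="http://www.w3.org/1999/XSL/Transform"
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
  <xslt:template match="/">
    <office:document-content>
      <office:body>
        <office:text>
          <text:p text:style-name="Standard">A value: <xslt:value-of select="/foo/bar" /></text:p>
          <xslt:for-each select="/foo/quu">
            <text:p text:style-name="Standard">A repeated paragraph</text:p>
          </xslt:for-each>
        </office:text>
      </office:body>
    </office:document-content>
  </xslt:template>
</xslt:stylesheet>
XSL;

// Invented sample data: one <bar> value and two <quu> nodes.
$data = new DOMDocument();
$data->loadXML('<foo><bar>Hello</bar><quu/><quu/></foo>');

$stylesheet = new DOMDocument();
$stylesheet->loadXML($xslt);

// Plain XSLT processing from the PHP XSL extension.
$processor = new XSLTProcessor();
$processor->importStylesheet($stylesheet);
$result = $processor->transformToXml($data);

echo $result;
```

The output contains "A value: Hello" once and the repeated paragraph twice, once for each <quu> node in the data.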

2.2: Marking up ODF documents

Instead of creating ODF-XSLT documents manually, you can also use your favourite ODF-compliant office application to create specially marked-up ODF documents. The ODFXSLTProcessor translates these documents on the fly into ODF-XSLT documents, which are then processed with the supplied XML data to create the final ODF document.

2.2.1: ODF-XSLT template syntax

The basic syntax for creating ODF-XSLT documents is simple but powerful. You do need a basic understanding of the OpenDocument Format, XSL Transformations and XPath, though. The ODF-XSLT syntax lets you insert XSLT tags at specified locations inside your ODF document. The basic syntax is:

  {@<position> <xpath-expression> <xslt-expression>}

With the XPath expression you select one or more XML nodes in your ODF document. The XSLT expression is then inserted into the document at the specified position relative to those nodes. The position parameter can be one of before, after, append or replace. For example, suppose that you wish to repeat a certain paragraph once for each matching node in the data XML. The paragraph "Repeat me" looks like this in XML:

  <text:p text:style-name="Standard">Repeat me</text:p>

You can simply add the ODF-XSLT expressions directly in the paragraph:

  {@before .. <xslt:for-each select="/foo/bar">}Repeat me{@after .. </xslt:for-each>}

The XPath expression ".." points to the <text:p> node. Note that you need to disable automatic hyperlinking in your office suite to prevent it from turning parts of the code into mailto links. The ODFXSLTProcessor cannot read template expressions that are broken up by ODF style rules. In XML, the paragraph looks like this:

  <text:p text:style-name="Standard">{@before .. &lt;xslt:for-each select="/foo/bar"&gt;}Repeat me{@after .. &lt;/xslt:for-each&gt;}</text:p>

The two ODF-XSLT expressions say: Add <xslt:for-each select="/foo/bar"> before the <text:p> node, and add </xslt:for-each> after it. When the ODFXSLTProcessor parses it, the XSLT looks like this:

  <xslt:for-each select="/foo/bar">
    <text:p text:style-name="Standard">Repeat me</text:p>
  </xslt:for-each>
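
The way such a paragraph breaks down into position, XPath expression and XSLT expression can be illustrated with a rough sketch. The regular expression below is an assumption about the syntax made for this example (it accepts only the four documented positions and assumes the XPath expression contains no whitespace); it is not the parser used by the ODFXSLTProcessor, and the function name is hypothetical.

```php
<?php
// Illustrative only: a simplified reading of the
// {@<position> <xpath-expression> <xslt-expression>} syntax.
function parse_template_expressions($text)
{
    // position: one of the four documented keywords
    // xpath:    assumed to contain no whitespace
    // xslt:     everything up to the first closing brace
    $pattern = '/\{@(before|after|append|replace)\s+(\S+)\s+(.*?)\}/s';
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    $expressions = array();
    foreach ($matches as $match) {
        $expressions[] = array(
            'position' => $match[1],
            'xpath'    => $match[2],
            'xslt'     => trim($match[3]),
        );
    }
    return $expressions;
}

// The paragraph text as typed in the office application.
$paragraph = '{@before .. <xslt:for-each select="/foo/bar">}Repeat me{@after .. </xslt:for-each>}';

$expressions = parse_template_expressions($paragraph);

// Removing the expressions leaves the visible paragraph text.
$visible = preg_replace('/\{@.*?\}/s', '', $paragraph);

print_r($expressions);
echo $visible, "\n";
```

Note that inside content.xml the angle brackets are stored as &lt; and &gt;, so a real implementation would have to decode the text before matching.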

This piece of XSLT can be combined with an XML file to create a new ODF file. Below is a sample XML file and the XML that it will result in:

  <?xml version="1.0" encoding="UTF-8"?>
  <foo>
    <bar>Quu</bar>
    <bar>Quux</bar>
    <bar>Quuz</bar>
  </foo>

becomes

  <text:p text:style-name="Standard">Repeat me</text:p>
  <text:p text:style-name="Standard">Repeat me</text:p>
  <text:p text:style-name="Standard">Repeat me</text:p>

Because you can add any XSLT tags to any ODF node through XPath, the possibilities are virtually endless. However, typing lots of XSLT directly into the document text clutters it, so the ODFXSLTProcessor recognises two alternative methods. The first is to insert an ODF text placeholder into the document. In OpenOffice.org Writer you can do that via the Insert > Fields > Other menu. You can use anything you want as the placeholder text. The placeholder reference should be an XPath expression into the data XML that points to whatever the placeholder should be replaced with. See the screenshot below.

A second way to unclutter your document is to use <text:script> elements to put your ODF-XSLT snippets in. If you use ODF-XSLT as the script language parameter then the ODFXSLTProcessor will strip the script elements after processing the template snippets. Here's a screenshot showing the "Repeat me" example implemented with script elements.

2.2.2: Variable substitution

By far the easiest method for simple variable substitution is the placeholder text method explained in section 2.2.1. The XPath expression in the reference field of the placeholder is evaluated relative to the current context node in the data XML, that is, the node you are currently iterating over. Take for example the following XML data:

  <?xml version="1.0" encoding="UTF-8"?>
  <foo>
    <bar>
      <baz>quu</baz>
      <baz>quux</baz>
    </bar>
    <bar>
      <baz>quuu</baz>
      <baz>quuux</baz>
    </bar>
  </foo>

When you are doing a row repeat on /foo/bar and you want the placeholder to be replaced with the value of the baz element, then the reference field of the placeholder should simply be baz.
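
The effect of a relative reference can be tried out with PHP's DOMXPath, independently of the ODFXSLTProcessor. The sketch below assumes that the placeholder is substituted the way xslt:value-of would do it (the string value of the first matching node), which this manual does not state explicitly.

```php
<?php
$data = new DOMDocument();
$data->loadXML(
    '<foo>'
    . '<bar><baz>quu</baz><baz>quux</baz></bar>'
    . '<bar><baz>quuu</baz><baz>quuux</baz></bar>'
    . '</foo>'
);

$xpath = new DOMXPath($data);
$values = array();

// Outer loop: the nodes a repeat over /foo/bar would iterate on.
foreach ($xpath->query('/foo/bar') as $bar) {
    // "baz" is evaluated relative to the current <bar> node. Like
    // xslt:value-of, string() yields the text of the first match only.
    $values[] = $xpath->evaluate('string(baz)', $bar);
}

print_r($values);
```

Each iteration sees only the baz children of its own bar element, so the two repeats yield "quu" and "quuu". An absolute reference such as /foo/bar/baz would instead select the same nodes in every iteration.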

2.2.3: Table row repeat

Repeating table rows is pretty simple. Create a table in your ODF document and put an ODF-XSLT script element in one of the cells of the row you want to repeat. If you want to repeat over the /foo/bar elements from the previous example, then the code for the script element should be:

  {@before ancestor::table:table-row[1] <xslt:for-each select="/foo/bar">}
  {@after ancestor::table:table-row[1] </xslt:for-each>}
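
The expression ancestor::table:table-row[1] selects the nearest enclosing row of the script element, because positions on the ancestor axis are counted outwards from the context node. The sketch below checks this with PHP's DOMXPath on a small hand-written table; the table:style-name values are invented here only to tell the rows apart.

```php
<?php
$table_ns = 'urn:oasis:names:tc:opendocument:xmlns:table:1.0';
$text_ns  = 'urn:oasis:names:tc:opendocument:xmlns:text:1.0';

$doc = new DOMDocument();
$doc->loadXML(
    '<table:table xmlns:table="' . $table_ns . '" xmlns:text="' . $text_ns . '">'
    . '<table:table-row table:style-name="header">'
    .   '<table:table-cell><text:p>Name</text:p></table:table-cell>'
    . '</table:table-row>'
    . '<table:table-row table:style-name="body">'
    .   '<table:table-cell><text:p><text:script/></text:p></table:table-cell>'
    . '</table:table-row>'
    . '</table:table>'
);

$xpath = new DOMXPath($doc);
$xpath->registerNamespace('table', $table_ns);
$xpath->registerNamespace('text', $text_ns);

// Context node: the script element that would hold the ODF-XSLT snippet.
$script = $xpath->query('//text:script')->item(0);

// [1] on the ancestor axis is the closest matching ancestor.
$row = $xpath->query('ancestor::table:table-row[1]', $script)->item(0);

echo $row->getAttributeNS($table_ns, 'style-name'), "\n";
```

The script prints "body": only the row that contains the script element is selected, so the header row would not be wrapped in the xslt:for-each.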

2.2.4: Table column repeat

Table column repeat is slightly harder than table row repeat because of the way tables are built in ODF. Tables in ODF are very similar to tables in HTML: you have rows containing cells, with the column definitions listed separately at the start of the table. For example:

  <table:table table:name="MyTable">
    <table:table-column/>
    <table:table-column/>
    <table:table-row>
      <table:table-cell office:value-type="string">
        <text:p>Column A</text:p>
      </table:table-cell>
      <table:table-cell office:value-type="string">
        <text:p>Column B</text:p>
      </table:table-cell>
    </table:table-row>
    <table:table-row>
      <table:table-cell office:value-type="string">
        <text:p>Column A</text:p>
      </table:table-cell>
      <table:table-cell office:value-type="string">
        <text:p>Column B</text:p>
      </table:table-cell>
    </table:table-row>
  </table:table>

If you want to repeat column B for each /foo/bar from the example data XML, you need to repeat both the second table-column element and the second cell in each row. To do so, create a script element in the top cell of the second column and insert the following code. The first two rules repeat the second table-column element. The last two rules repeat the second cell element of each row.

  {@before ancestor::table:table[1]/table:table-column[2] <xslt:for-each select="/foo/bar">}
  {@after ancestor::table:table[1]/table:table-column[2] </xslt:for-each>}
  {@before ancestor::table:table[1]/table:table-row/table:table-cell[2] <xslt:for-each select="/foo/bar">}
  {@after ancestor::table:table[1]/table:table-row/table:table-cell[2] </xslt:for-each>}

2.2.5: Image replacement

Image replacement is most easily done from within the image itself. Insert an image into the document and format it the way you want it. Then place the ODF-XSLT code into the image's "alternative text" attribute. For example, if the data XML contains the full path to the image you want to use in /foo/image-path, then put the following code into the alternative text field:

  {@child ../draw:image
          <xslt:attribute name="xlink:href">
                  <xslt:value-of select="/foo/image-path"/>
          </xslt:attribute>}

This will replace the xlink:href attribute of the image with the value of /foo/image-path from the data XML. The ODFXSLTProcessor will check all the image xlink:href paths and make sure that the actual images are included in the ODF container.

3: Using the commandline tool

The odf-xslt package comes with a command-line tool called odfxsltproc that allows you to convert XML data to ODF documents easily. It reads XML data from STDIN and writes the resulting document to the specified output location. Usage:

  odfxsltproc <stylesheet> <output location>

Example usage when converting an XML file to ODF:

  $ odfxsltproc ~/my-odf-xslt.odt ~/my-document.odt < ~/my-data.xml

4: Embedding and extending the ODF-XSLT library

4.1: Basic implementation

Using the ODFXSLTProcessor in your own PHP application is very easy. Simply instantiate a new ODFXSLTProcessor, load a stylesheet and call transform_to_file() or transform_to_memory() with the XML data as a PHP DOMDocument. The example below should explain itself.

  <?php

  require_once('odf-xslt/odf-xslt.php');

  // The XML data that the template is filled in with.
  $data = new DOMDocument();
  $data->loadXML("<foo><bar>Quu</bar></foo>");

  $processor = new ODFXSLTProcessor();
  $processor->cache_dir = "/tmp/";
  // Load the ODF-XSLT template, then write the resulting ODF document.
  $processor->import_stylesheet("/home/you/my-odf-xslt.odt");
  $processor->transform_to_file($data, "/home/you/my-document.odt");

  ?>
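
In a real application the data usually comes from PHP variables rather than from a literal XML string. The sketch below builds the DOMDocument with the DOM API instead of string concatenation, so that special characters are escaped correctly. The helper name build_data_document and the foo/bar element names are only illustrative; the sketch does not use the ODF-XSLT library itself.

```php
<?php
// Hypothetical helper: builds <foo><bar>...</bar>...</foo> from a list of
// strings, one <bar> element per value.
function build_data_document(array $values)
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $root = $doc->appendChild($doc->createElement('foo'));

    foreach ($values as $value) {
        $bar = $root->appendChild($doc->createElement('bar'));
        // Text nodes are escaped on output, so & and < are safe here.
        $bar->appendChild($doc->createTextNode($value));
    }

    return $doc;
}

$data = build_data_document(array('Quu', 'Quux & Quuz'));
echo $data->saveXML();
```

The resulting $data document can be passed to transform_to_file() in the same way as the hand-written one in the example above.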

For a complete overview of the ODFXSLTProcessor you should read the ODF-XSLT API Documentation.

4.2: Extending the base processor

If you want to extend the functionality of the ODFXSLTProcessor then you do not need to subclass it. The processor allows you to register custom preprocessor and postprocessor functions. The preprocessor is executed after the XSLT tags have been inserted but before the XML data is parsed. The postprocessor is called after the XML data has been parsed. Pre- and postprocessors are executed in the order they are registered and are executed once for each ODF XML file specified in the $container_files class attribute. By default those are the content.xml and styles.xml files.

The example below shows a post-processor that removes all empty paragraphs.

  <?php

  require_once('odf-xslt/odf-xslt.php');

  // Post-processor callback: receives the XML of one container file as a
  // string and returns the modified XML.
  function remove_empty_paras($raw_xml, &$processor, $user_data)
  {
          $dom = new DOMDocument();
          $dom->loadXML($raw_xml);

          $paras = $dom->getElementsByTagNameNS("urn:oasis:names:tc:opendocument:xmlns:text:1.0", "p");

          // The node list is live, so copy it to an array before removing
          // nodes; otherwise a removal makes the loop skip the next node.
          foreach (iterator_to_array($paras) as $node) {
                  // Compare with '' because empty() would also match "0".
                  if ($node->textContent === '')
                          $node->parentNode->removeChild($node);
          }

          return $dom->saveXML();
  }

  $data = new DOMDocument();
  $data->loadXML("<foo><bar>Quu</bar></foo>");

  $processor = new ODFXSLTProcessor();
  $processor->cache_dir = "/tmp/";
  $processor->register_postprocessor("remove_empty_paras");
  $processor->import_stylesheet("/home/you/my-odf-xslt.odt");
  $processor->transform_to_file($data, "/home/you/my-document.odt");

  ?>

5: References

  1. OpenDocument specification
  2. XSL Transformations (XSLT) Version 1.0
  3. XML Path Language (XPath) Version 1.0