Advanced PDF templating using XDocReport with JodConverter

PDF generation from templates is a common requirement for many software applications. Sometimes, simple fillable PDF Forms are a sufficient solution and many PDF libraries support this sort of templating.
As part of our software platform, Elements, we generate reports for all patients using a variety of report types, while giving our clients the ability to fully customize the design. To meet these business requirements, we needed a solution that generates PDFs from arbitrary advanced templates using a complex domain model. The output needed to be as close as possible to the original design and give us full control over the page, while keeping development time to a minimum.
An overview of the PDF templating landscape
Low level generation
These solutions give full control over the generated PDF output, but recreating the template in the native library API takes up a lot of development time.
Higher level generation using proprietary authoring solutions
These solutions emphasize BI report generation over PDF generation. Their limitations make them unsuitable for more advanced designs, and they are cumbersome to use unless the template in question was created with the included authoring software.
HTML Conversion
wkhtmltopdf is a powerful solution for generating PDFs from HTML templates, but lacks common print features like multi-column support. Also, creating a correctly positioned HTML template for print output will take up a considerable amount of development time.
Print format templating
OpenOffice ODT files provide an accurate page representation, so the format can be leveraged for generating print-ready PDF output. XDocReport (XML Document reporting) is a library which supports rendering OpenOffice ODT template files using the Velocity or Freemarker engines. The resulting ODT output can then be rendered into a PDF file using JodConverter (Java OpenDocument Converter), which manages LibreOffice processes and access to the OpenOffice API. We’re using the well-maintained JodConverter fork by Simon Braconnier here.
Implementing a templating service using XDocReport and JodConverter
Install LibreOffice locally.
Dependencies
The following dependencies (here as Maven pom.xml) are required to use XDocReport with Freemarker and JodConverter.
<properties>
  <xdocreport.version>2.0.1</xdocreport.version>
</properties>
<dependencies>
  <!-- requires LibreOffice to be installed on the instance -->
  <dependency>
    <groupId>org.jodconverter</groupId>
    <artifactId>jodconverter-local</artifactId>
    <version>4.1.0</version>
  </dependency>
  <dependency>
    <groupId>fr.opensagres.xdocreport</groupId>
    <artifactId>fr.opensagres.xdocreport.template.freemarker</artifactId>
    <version>${xdocreport.version}</version>
  </dependency>
  <dependency>
    <groupId>fr.opensagres.xdocreport</groupId>
    <artifactId>fr.opensagres.xdocreport.document.odt</artifactId>
    <version>${xdocreport.version}</version>
  </dependency>
  <dependency>
    <groupId>fr.opensagres.xdocreport</groupId>
    <artifactId>fr.opensagres.xdocreport.converter.odt.odfdom</artifactId>
    <version>${xdocreport.version}</version>
  </dependency>
</dependencies>
You can also use the jodconverter-spring-boot-starter dependency to further simplify the setup in Spring applications.
Code snippet
The (Scala) code below is sufficient to generate the conversionOutput PDF from the templateInputStream ODT file while providing all template placeholder values in a context map:
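import java.io.File
import org.apache.commons.io.FileUtils
import fr.opensagres.xdocreport.document.XDocReport
import fr.opensagres.xdocreport.template.TemplateEngineKind
import fr.opensagres.xdocreport.template.formatter.FieldsMetadata
// JodConverter package names as of version 4.1.0 (they differ in later releases)
import org.jodconverter.JodConverter
import org.jodconverter.document.DefaultDocumentFormatRegistry

val templateEngineKind = TemplateEngineKind.Freemarker
val fieldsMeta = new FieldsMetadata(templateEngineKind)
// for html fields, add: fieldsMeta.addFieldAsTextStyling(htmlField, SyntaxKind.Html)
// declared outside the try block so that it is still in scope in finally
val interpolationOutput = File.createTempFile("interpolationOutput", ".odt")
try {
  // step 1: fill the placeholders of the ODT template
  XDocReport.generateReport(templateInputStream, templateEngineKind.name, fieldsMeta, context, interpolationOutput)
  // step 2: let LibreOffice render the interpolated ODT as PDF
  JodConverter
    .convert(interpolationOutput)
    .as(DefaultDocumentFormatRegistry.ODT)
    .to(conversionOutput)
    .as(DefaultDocumentFormatRegistry.PDF)
    .execute()
} finally {
  FileUtils.deleteQuietly(interpolationOutput)
}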
Usage
Template Syntax Quick Start for Freemarker
- Defaults: ${myPlaceholder!someDefault}, or ${myPlaceholder!} to render an empty string when the value is missing or null
- Nested structures: ${parent.child}
- Lists: [#list items as i] ${i.someAttribute} [/#list], where the [#list] directives should be stored in an Input Field to avoid problems with line breaks, white spaces, etc.
- Table row repetitions: Use the XDocReport tags @before-row[#list someRows as someRow] and @after-row[/#list] in Input Fields within the first table cell. Then use the placeholder ${someRow.someCellAttribute} in all necessary table cells.
For more details, please see the XDocReport Wiki.
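To make the placeholder syntax above more concrete, here is a minimal sketch of a context whose shape matches the quick-start placeholders. The object name ExampleContext and all sample values are made up for illustration, and how the map is handed to XDocReport is left out. Java collections are used on the assumption that the Freemarker engine resolves java.util.Map keys and java.util.List elements directly; Scala collections may need converting first.

```scala
import java.util.{ArrayList => JArrayList, HashMap => JHashMap, Map => JMap}

// Hypothetical helper: builds sample data for the quick-start placeholders.
object ExampleContext {
  def build(): JMap[String, AnyRef] = {
    // Backs the nested placeholder ${parent.child}
    val parent = new JHashMap[String, AnyRef]()
    parent.put("child", "nested value")

    // Backs [#list items as i] ${i.someAttribute} [/#list]
    val items = new JArrayList[JMap[String, AnyRef]]()
    for (name <- Seq("first", "second")) {
      val item = new JHashMap[String, AnyRef]()
      item.put("someAttribute", name)
      items.add(item)
    }

    val context = new JHashMap[String, AnyRef]()
    context.put("myPlaceholder", "Hello") // backs ${myPlaceholder!someDefault}
    context.put("parent", parent)
    context.put("items", items)
    context
  }
}
```

Leaving out the "myPlaceholder" entry would be the case that the default operator (!) in the first bullet is meant to cover.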
Example
Given the ODT and an example placeholder context, XDocReport and JodConverter produce the PDF.
Creating templates from a sample PDF
Here is a simple way to get started by creating an ODT from any sample PDF:
- Use Adobe Acrobat to export the PDF to Word 97–2003
- Use Microsoft Word to save the resulting doc file as Word 97–2004 (doc), again
- Load the doc file in LibreOffice and save it as ODT
- Clean up: Delete any “Heading … Char” styles that exist in Styles And Formatting > Character Styles
- Clean up: You will likely have to go to Styles and Formatting > Paragraph Styles > Text Body > right-click Modify and set the correct font name
- Replace dynamic text with ${myPlaceholder} placeholders.
Troubleshooting
You can fix complex template issues by unzipping the ODT file, editing the content.xml and then updating the ODT.
unzip template.odt -d template_edit && cd template_edit
# edit the content.xml
zip -f ../template.odt content.xml
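For automated checks it can be handy to read the content.xml without unpacking the archive by hand. The following is a minimal sketch that relies only on the JDK's ZIP support, since an ODT file is a ZIP archive. The object name OdtContent is hypothetical, and readAllBytes assumes Java 9 or later.

```scala
import java.io.File
import java.nio.charset.StandardCharsets.UTF_8
import java.util.zip.ZipFile

// Hypothetical helper, not part of XDocReport or JodConverter.
object OdtContent {
  /** Returns the text of content.xml inside the given ODT file, if the entry exists. */
  def readContentXml(odt: File): Option[String] = {
    val zip = new ZipFile(odt)
    try {
      Option(zip.getEntry("content.xml")).map { entry =>
        val in = zip.getInputStream(entry)
        // The entry is read completely before the archive is closed.
        try new String(in.readAllBytes(), UTF_8) finally in.close()
      }
    } finally zip.close()
  }
}
```

The returned string can then be searched for the problems described in the following sections.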
Split placeholders
Make sure you do not split placeholders (${myPlaceholder}) across ODT tags. For example, the content.xml should not contain invalid templating markup like this:
Incorrect
<text:p>${myPla<text:span>ceholder}</text:span></text:p>
Correct
<text:p>${myPlaceholder}</text:p>
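Split placeholders are easy to miss in a large content.xml, so a rough automated check may help. The sketch below is a heuristic under simple assumptions, not an XDocReport feature: it reports every ${...} expression that contains a "<" before its closing brace. The object name SplitPlaceholderCheck is hypothetical.

```scala
// Hypothetical helper: heuristic search for placeholders interrupted by an XML tag.
object SplitPlaceholderCheck {
  // "${", then characters other than "}" that include at least one "<", then "}".
  private val SplitPlaceholder = """\$\{[^}]*<[^}]*\}""".r

  /** Returns every placeholder in the content.xml text that contains the start of a tag. */
  def findSplitPlaceholders(contentXml: String): List[String] =
    SplitPlaceholder.findAllIn(contentXml).toList
}
```

Applied to the two snippets above, the check reports ${myPla<text:span>ceholder} for the incorrect markup and nothing for the correct one.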
Style problems with input fields
If you see an exception like
org.jodconverter.office.OfficeException: could not load document: some_template.pdf-in2412284939948309622.tmp
the template file is incorrect. LibreOffice is known to have this problem when there is no text:span around a text:text-input (which is needed for handling lists).
Incorrect
<text:p>
  <text:text-input>[#list items as item]</text:text-input>
</text:p>
Correct
<text:p>
  <text:span>
    <text:text-input>[#list items as item]</text:text-input>
  </text:span>
</text:p>
Conditionals and well-formed XML
When using Freemarker directives that may omit certain ODT tags (e.g. an if directive), it’s important to ensure that the resulting XML output is well-formed for all branches.
For example, the following valid template
<text:p>
  <text:span>
    <text:span>
      <text:text-input>[#if myCondition]</text:text-input>
      Some Text
    </text:span>
    <text:text-input>[/#if]</text:text-input>
  </text:span>
</text:p>
would produce this invalid output if myCondition evaluates to false:
<text:p>
  <text:span>
    <text:span>
      <text:text-input></text:text-input>
    </text:span>
</text:p>
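One way to catch this kind of problem early is to render the template once per branch and check that each result is still well-formed XML. The sketch below shows only the check itself, using the SAX parser that ships with the JDK. The object name WellFormedCheck is hypothetical, and the rendering step is assumed to happen elsewhere.

```scala
import java.io.StringReader
import javax.xml.parsers.SAXParserFactory
import org.xml.sax.InputSource
import org.xml.sax.helpers.DefaultHandler
import scala.util.Try

// Hypothetical helper, not part of XDocReport or JodConverter.
object WellFormedCheck {
  /** True if the given XML text parses without a fatal error. */
  def isWellFormed(xml: String): Boolean = {
    // The default factory is not namespace-aware, so prefixed names such as
    // text:p are accepted without namespace declarations.
    val parser = SAXParserFactory.newInstance().newSAXParser()
    // DefaultHandler rethrows fatal parse errors, which Try turns into a failure.
    Try(parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler)).isSuccess
  }
}
```

For the example above, the markup that remains when myCondition is true passes the check, while the output for false fails because one text:span is never closed.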