Integrating DocBook and WordPress

Sadly, many people use tools like Microsoft Word or OpenOffice.org to maintain large structured documents. If you’ve ever used a tool like this to maintain a large document, you’ve probably already grown to hate it. Tools like LaTeX and DocBook make handling structured documents much more bearable. I’ve come up with some simple code that allows me to integrate DocBook documents into my website.

LaTeX was my first experience with a tool designed for structured documentation. It’s a form of markup that, when passed through a processor, creates output in whatever format you’d like (text, HTML, PDF, etc.). LaTeX is widely used because it makes entering complex mathematical formulas easy and produces high-quality output. I first used LaTeX in college because a professor recommended it for a report. It didn’t take long before I developed a dislike for it. What annoyed me the most was that you could not have arbitrarily nested sub-sections in your document. Having only a handful of nesting levels seemed like a needless restriction.

DocBook is a lot like LaTeX in that it’s a form of markup for structured documents. The latest versions of DocBook use XML. Generally I feel that XML is bloated and a bad idea. DocBook doesn’t change my mind, but it is a powerful standard for maintaining structured documents. Maintaining an article, a book, or just a report is relatively easy with DocBook. My first serious experience with DocBook was my master’s project report. After writing such a long report I was still reasonably happy with DocBook (that’s saying a lot!).

I wanted to write documents using DocBook and display them on my WordPress-driven website. It’s easy to generate HTML or XHTML from DocBook source, but that produces a standalone document. I wanted to integrate it right into my site (with all the goodness of WordPress themes). My first plan of attack was to write my own stylesheet that would strip out the header and footer tags as well as the XML declaration and DOCTYPE. This mostly worked, except that xsltproc (correctly) insists on emitting a DOCTYPE when the output type is set to XML. There didn’t seem to be a good way around this, so I went for another plan of attack.

My next idea was to use PHP to parse the XHTML (since it is XML) and strip out the elements that I couldn’t have in the output. This technique works, but it comes at a price: the server does a lot of extra work every time it displays a document. I use SAX-style parsing here; building a DOM would be even more expensive.
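
The SAX approach described above can be sketched roughly as follows. This is not the contents of xmlInclude.inc; it is a minimal illustration, assuming the DocBook-generated XHTML is well-formed XML and that only the contents of its body element should be kept. The function name sketchExtractBody and the short list of empty elements are made up for the example; the parser calls come from PHP’s standard XML Parser extension.

```php
<?php
// Hypothetical helper (not the author's xmlInclude.inc): stream an XHTML
// string through PHP's SAX-style parser and return only the markup that
// appears inside <body>. Everything outside <body> is dropped.
function sketchExtractBody(string $xhtml): string
{
    $out = '';
    $inBody = false;
    // Elements that must be written as a single tag (assumed, partial list).
    $void = ['br', 'hr', 'img', 'input', 'meta', 'link', 'col'];

    $parser = xml_parser_create('UTF-8');
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        // Start tag: switch copying on at <body>, otherwise re-emit the tag.
        function ($p, string $name, array $attrs) use (&$out, &$inBody, $void) {
            if ($name === 'body') {
                $inBody = true;
                return;
            }
            if (!$inBody) {
                return;
            }
            $out .= '<' . $name;
            foreach ($attrs as $key => $value) {
                $out .= ' ' . $key . '="' . htmlspecialchars($value, ENT_QUOTES) . '"';
            }
            $out .= in_array($name, $void, true) ? ' />' : '>';
        },
        // End tag: switch copying off at </body>, skip closers for void elements.
        function ($p, string $name) use (&$out, &$inBody, $void) {
            if ($name === 'body') {
                $inBody = false;
                return;
            }
            if ($inBody && !in_array($name, $void, true)) {
                $out .= '</' . $name . '>';
            }
        }
    );

    // Text nodes arrive decoded, so escape them again before output.
    xml_set_character_data_handler(
        $parser,
        function ($p, string $data) use (&$out, &$inBody) {
            if ($inBody) {
                $out .= htmlspecialchars($data, ENT_NOQUOTES);
            }
        }
    );

    // Parse the whole string in one call; return nothing if it is malformed.
    $ok = xml_parse($parser, $xhtml, true);
    return $ok ? $out : '';
}
```

Printing the result of sketchExtractBody(file_get_contents('foo.xhtml')) between get_header() and get_footer() would place just the body markup inside the theme. The cost mentioned above is visible here: the full document is parsed and re-serialized on every request, so caching the stripped output would be a natural next step.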

To include any XML document in my WordPress-based website, I use the following code:

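<?php
// Clear PATH_INFO before WordPress is loaded.
unset($_SERVER['PATH_INFO']);

// Load WordPress and print the current theme's header.
require('./wp-blog-header.php');
get_header();

// Load the include helper and print the filtered XHTML document.
include('gjr-wp-include/xmlInclude.inc');
gjrXmlInclude('foo.xhtml');

// Print the current theme's footer.
get_footer();
?>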

You can view the code for xmlInclude.php.


4 Responses to Integrating DocBook and WordPress

  1. ricosecada says:

    Hi Greg

    How exactly do you use this?

    If I have an XML document written in DocBook named rico.xml and I would like to integrate that into WP and have it show up like a normal posting, how would I use your script?

    Best regards.

  2. greg says:

    Unless WordPress strips out PHP from posts, the following should work. When you write a post, switch to the “HTML” view in the editor (as opposed to the “Visual” view). You’ll need to copy “rico.xml” to the root directory of your website, and you’ll also need to copy xmlInclude.inc there.

  3. remko says:

    In case you’re still interested, I took the stylesheet approach to integrate DocBook with WordPress: http://el-tramo.be/blog/docbook-wordpress

  4. greg says:

    Thanks remko. I like your approach much better than the approach I have here.
