Wednesday, December 16, 2009

Docbook sucks

This message is my reply to a Facebook message claiming Docbook is better than HTML because you can’t add a table of contents (TOC) to an HTML document.

Docbook sucks. It is too complicated for its own good. In the early 2000s, people were clamoring for a standard documentation format for MaraDNS. I whipped out Perl and made a script to convert a subset of HTML (which I call “ej”) into *ROFF man pages, along with a couple of other scripts to make HTML documents and ASCII text files.

People complained because it wasn’t docbook. I told them I looked at docbook; my rant (which I didn’t publish) was that the language was too complicated and the tools too primitive for my tastes. I still use this subset of HTML for documents that are converted into *ROFF man pages (such as Deadwood’s man page).

HTML is the universal text markup language and using anything else is foolishness. And, yes, making a TOC for HTML documents is easy (as long as the HTML document is structured). Just last week I made a tutorial in a word processor, converted it to HTML, and added a TOC like this:

cat Tutorial.html | awk -F\> '/<\/b><p>/ {a=$2;sub(/<\/b/,"",a);gsub(/[ -]/,"",a);print "<A name=" a "> </A>"} {print}' > foo.html

mv foo.html Tutorial.html

cat Tutorial.html | awk -F\> '/<\/b><p>/ {a=$2;sub(/<\/b/,"",a);b=a;gsub(/[ -]/,"",a);print "<A href=#" a ">"b"</A><br>"}' > TOC

vi Tutorial.html
{move cursor down to area just past the BODY tag}
:r TOC
:wq
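
For what it's worth, the three steps above could probably be folded into one non-interactive pass, with no trip through vi. This is only a rough sketch, not the commands I actually ran: it assumes headings are lines of the form <b>Title</b><p> (same as the commands above) and that the BODY tag sits on a line of its own. The function name add_toc is made up for illustration, and the quoted attribute values are a small change from the original output.

```shell
#!/bin/sh
# add_toc (hypothetical helper): print FILE with a TOC inserted after the
# line holding the BODY tag and a named anchor in front of every heading.
# Assumes headings look like "<b>Title</b><p>", one per line.
add_toc() {
    # The file is read twice: NR == FNR is true only during the first read.
    awk -F'>' '
        NR == FNR {
            # First read: collect one TOC link per heading line.
            if ($0 ~ /<\/b><p>/) {
                title = $2; sub(/<\/b/, "", title)
                id = title; gsub(/[ -]/, "", id)
                toc = toc "<A href=\"#" id "\">" title "</A><br>\n"
            }
            next
        }
        # Second read: emit an anchor just before each heading line.
        /<\/b><p>/ {
            id = $2; sub(/<\/b/, "", id); gsub(/[ -]/, "", id)
            print "<A name=\"" id "\"> </A>"
        }
        { print }
        # Right after the first line with an opening BODY tag, emit the TOC.
        /<[Bb][Oo][Dd][Yy]/ && !done { printf "%s", toc; done = 1 }
    ' "$1" "$1"
}
```

Usage would look like: add_toc Tutorial.html > foo.html && mv foo.html Tutorial.html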

And, oh, docbook.org points directly to some ads asking me to buy their books (Update: To be fair, you can read one of these books for free on their web page). No thank you. I can spend five minutes whipping up an AWK script to add a TOC to an HTML document; did I mention that the GAWK manual is a free download, not a $40 download? (Update: Again, one of the books is free to read on their web page.)

Final update:

Another cool linky:

http://en.wikipedia.org/wiki/Comparison_of_document_markup_languages