Tex4HT

code, latex, web

Instead of writing my PhD thesis, I am tweaking the tools I use to write it. A colossal waste of time, because how many times will I write another thesis?

Anyway, my first plan was to write the thesis using Sphinx, like I did with my master's thesis. A while back I discovered the Alabaster theme, which looks much better than what I used before, see /msc. However, while the HTML output can be tweaked extensively, very little is possible for the Latex/PDF output. I realized I like the look of the default article Latex template much better than what Sphinx puts into your document. The only alternative styling I found was the Sphinxtr package. However, that package has not been updated for recent Sphinx versions (it requires 1.2.x), and it had some conflicting Python extensions. After resolving those, I discovered it can only cite using numbers, whereas I’ve grown to like the authoryear style of natbib. I gave up.

Then I rediscovered the universe of Latex to HTML converters. Many exist, many of them crappy. I stumbled upon a repo by michal-h21 (Github is good at that!), where he provides a .css style for the output of Tex4HT. I became intrigued, and after wrestling with it for a while I managed to turn it into something I can work with. As a test, I’ve updated my bachelor thesis. Hope you like it. The trickiest part is that Tex4HT doesn’t like manual additions to the table of contents much; it adds phantom entries. Using numbered sections for the HTML output and unnumbered ones for the PDF output (for the frontmatter and bibliography, for instance) solves this problem.
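
For the curious, here is a minimal sketch of how such a switch could look. It is not necessarily the exact setup I ended up with. It assumes that `\HCode` is only defined when the document is compiled through Tex4HT, and `\frontsection` is a made-up helper name, not something from a package.

```latex
\documentclass{article}

% Assumption: \HCode exists only when compiling through Tex4HT,
% so it can be used to tell the HTML build from the PDF build.
% \frontsection is a hypothetical helper, defined here for illustration.
\ifdefined\HCode
  % HTML build: a plain numbered section, so Tex4HT generates
  % the table of contents entry on its own.
  \newcommand{\frontsection}[1]{\section{#1}}
\else
  % PDF build: unnumbered section with a manual table of contents entry.
  \newcommand{\frontsection}[1]{%
    \section*{#1}%
    \addcontentsline{toc}{section}{#1}}
\fi

\begin{document}
\tableofcontents

\frontsection{Preface}
Text of the preface.

\section{Introduction}
Text of the introduction.
\end{document}
```

The manual `\addcontentsline` call only ever runs in the PDF build, so Tex4HT never sees it and has no reason to add a phantom entry.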

It’s not perfect, but it has one huge plus: it works from Latex sources, which is what anyone in science is going to be writing anyway. Unfortunately, styling Sphinx PDF output just isn’t really possible, otherwise I’d have swapped in the default article markup and been done with it.