Wednesday, February 17, 2010

Converting a LaTeX document to HTML / ODT / DOC

I write all my documents in LaTeX. Actually, I usually start in Emacs org-mode and convert to LaTeX later, but when it comes to using BibTeX, it's all LaTeX. It works great, and the DVI/PDF output looks very professional. Recently, however, I had to submit a file in .doc format, and I had a really hard time finding good tools for that.

I could not find any direct LaTeX-to-.doc converter, so the best bet is to go LaTeX -> HTML -> DOC. There are lots of LaTeX-to-HTML converters, such as latex2html, tth and hevea, but most of them either produce poor HTML or mess up bibliography references. After some research I found the best one: TeX4ht.

TeX4ht is available in the Ubuntu/Debian repositories, so to install it:

sudo apt-get install tex4ht

Conversion is simple:

htlatex document

(Please note that there is no .tex suffix.)

This will generate document.html, a CSS file and a bunch of .png files.
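
If you tend to type the file name with its suffix out of habit, a small wrapper can strip it before calling htlatex. This is only a minimal sketch: the function names job_name and tex_to_html are made up for illustration and are not part of TeX4ht.

```shell
#!/bin/sh
# Hypothetical helper: accept "document" or "document.tex"
# and print the bare name without the .tex suffix.
job_name() {
    printf '%s\n' "${1%.tex}"
}

# Hypothetical wrapper: run htlatex on the suffix-free name.
tex_to_html() {
    name=$(job_name "$1")
    htlatex "$name"
}
```

With these functions loaded in your shell, `tex_to_html document.tex` runs `htlatex document`.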

Now, to convert it to a word processor format (odt, doc, ...):
1. Open the html file with OpenOffice.org
2. File > Save as
3. Choose the format you want to save it in (odt, doc)
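
The same steps can probably be scripted rather than clicked through. The sketch below is an assumption on my part, not something tested in this post: it relies on a recent LibreOffice whose soffice binary supports the --headless and --convert-to options (older OpenOffice.org releases may not), and the function name html_to_office is hypothetical. For HTML input, some versions may also need an explicit export filter after the format name.

```shell
#!/bin/sh
# Hypothetical wrapper: convert an HTML file to a word processor format
# without opening the GUI.
# Assumes a soffice binary that supports --headless and --convert-to.
html_to_office() {
    fmt=$1     # target format, e.g. odt or doc
    html=$2    # HTML file produced by htlatex
    soffice --headless --convert-to "$fmt" "$html"
}
```

For example, `html_to_office odt document.html` should write document.odt to the current directory. Check afterwards whether the images were embedded or only linked.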

If the document contains images, they will be linked rather than embedded in the saved file. To fix this (source):
1. Edit > Links
2. Click "Break Link" for each image link

1 comment:

  1. Nice one.

    Wondering if you could re-post this here: http://tutverse.blogspot.com/ ?

    Bibek
