Running TeX on a Java Virtual Machine

by Martin Monperrus

This document presents two ways to run Knuth's typesetting program TeX in a Java Virtual Machine (JVM). I don't mean a re-implementation of TeX in a JVM-compatible programming language (like ExTeX), but running the original code itself.

Why run TeX in a JVM?
A first reason is to be able to use TeX in a JVM-only environment (e.g. Google App Engine).
My second reason is more personal: I think it's really fun and somewhat magical to run (low-level) code that is about 30 years old on a state-of-the-art virtual machine. It's a kind of computing contradiction :-)

Using TeX-GPC and NestedVM

* TeX-GPC is a port of Knuth's TeX to the GNU Pascal Compiler (GPC). It consists of a change file that has to be used with the literate programming tool tangle.

* GPC is a free and open source Pascal compiler built as a front end to the GNU Compiler Collection (GCC). Since GCC is able to compile to MIPS code, it's also possible to compile Pascal code to MIPS code.

* NestedVM is a compiler from native MIPS code to Java bytecode (its authors compiled another port of TeX to write their papers).

To run TeX in a JVM, you compile TeX-GPC to MIPS code (say tex.mips), and you then compile tex.mips to tex.class, i.e. to Java bytecode.
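Once tex.class exists, it still has to be started like any other Java program. The following is a minimal sketch of a launcher that starts it in a child JVM. It is an illustration only: the class name TexLauncher, the archive name tex.jar and the assumption that the generated class is called tex and exposes a standard main method are all hypothetical, so adapt them to what your build actually produces.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of TeX-GPC or NestedVM.
final class TexLauncher {

    // Builds the command line "java -cp <classPath> <mainClass> <texArgs...>".
    // classPath and mainClass are assumptions about how the bytecode was packaged.
    static List<String> buildCommand(String classPath, String mainClass, String... texArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(classPath);
        cmd.add(mainClass);
        for (String arg : texArgs) {
            cmd.add(arg);
        }
        return cmd;
    }

    // Starts the child JVM, forwards its console I/O, and returns its exit code.
    static int run(List<String> cmd) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // "tex.jar" and "tex" are placeholder names (assumptions).
        List<String> cmd = buildCommand("tex.jar", "tex", args);
        System.out.println(String.join(" ", cmd));
        if (args.length > 0) {
            // Only launch when a document was given on the command line.
            System.exit(run(cmd));
        }
    }
}
```

Running the translated program in a separate JVM keeps a crash or an exit call inside TeX from taking down the calling application, at the cost of a JVM start-up for each run.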

Does it seem easy? It is conceptually simple, but getting all the tools to run and to work together is a difficult task. That's why I provide an executable proof of concept in this archive. It also contains two real-world documents to compile, one for TeX and one for LaTeX (the sample document of the Springer LNCS style).

Main limitation of this approach: it's really slow :-) However, I hope to be able to compile TeX-GPC to Java source code first (see below), so as to get a significant speedup.

Using javaTeX

The second possibility is to use javaTeX. Timothy Murphy's javaTeX is a set of Java source files. It consists of some helper classes and a class tex, which results from the automated translation of the Pascal code of TeX using a dedicated tool (web2java).

This solution is really nice for two reasons:
1) the resulting code is much faster (50x) than the MIPS-based version.
2) it lets us easily modify and extend the Java code.

This archive demonstrates the compilation of tex.fmt and of a simple LaTeX document.

Main limitation of this approach:
Since javaTeX is built on top of an old version of TeX (v3.14159 of March 1995), it is not able to deal with complex TeX or LaTeX documents (like the LNCS sample document above). I guess this is due to the modern versions of the TeX fonts (the tfm files) and/or the modern version of LaTeX. The error is always the same: "Math formula deleted: Insufficient extension fonts".

Future work - do you want to contribute? :-)
* solving this font error, to be able to compile advanced TeX and LaTeX documents with the current version
* modifying web2java to be able to translate the TeX-GPC version of TeX to Java source code. Then we would have neither to maintain two different change files (tex-gpc.ch and tex.jch) nor to run a slow version of TeX (tex.mips). It might only require slightly modifying the grammar and adapting the helper classes.

How to get PDFs from TeX in a JVM?

These are open questions:
Use Tim Murphy's DVIPDF?
Compile pdftex to static MIPS code?
Compile dvipdfmx to static MIPS code?