Gecko-based EPUB Readers and LaTeXML

This morning, Deyan Ginev announced on the LaTeXML mailing list that the first alpha version of LaTeXML with LaTeX to EPUB support is now available. This is very good news for people who want to encourage researchers to move from offline formats to more modern Web formats. Although some people had already succeeded in combining LaTeX-to-XHTML converters and XHTML-to-EPUB converters, this is the first tool that I'm aware of that can do the direct LaTeX to EPUB3 (XHTML+MathML) conversion. I already mentioned a couple of Gecko-based EPUB tools in my previous blog post, so let's have a look at three of them. Feel free to mention more Gecko-based EPUB tools in the comments; I'm particularly interested to hear about FirefoxOS applications that would be similar to Apple's iBooks.

I have updated the LaTeXML samples based on Boris Zbarsky's thesis that we demonstrated at the Innovation Fairs in Santa Clara & Brussels. This shows how to generate the traditional PDF version, the Web version, the Web version with MathJax fallback and now the EPUB version! Here are some screenshots using the Firefox extension Lucifox:

Boris' Thesis in Lucifox ; page 2

Boris' Thesis in Lucifox ; page 4

I have intentionally not shown the diagrams that are incorrectly converted by LaTeXML due to missing Xy-pic support (this is still in development). However, Gecko supports mixing SVG and MathML via the foreignObject element so this would not be a problem for Gecko-based EPUB readers. Here are some screenshots of an ebook about regular polygons that can be constructed with compass and straightedge, which I have created with the help of itex2MML. They are viewed in EPUBReader, which is another Firefox extension:

EPUBReader, Constructible Numbers

EPUBReader, Cyclic Galois Extension
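
To make the remark about mixing SVG and MathML a bit more concrete, here is a minimal sketch in Python. The markup (a circle labelled with a formula) and the helper name math_inside_foreign_object are my own illustration, not something produced by LaTeXML or itex2MML; the script only checks that the MathML sits inside an SVG foreignObject element, it does not render anything.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Illustrative markup only: a circle with the formula x^2 + y^2 = 1 next to it.
SVG_WITH_MATH = f"""<svg xmlns="{SVG_NS}" width="200" height="120">
  <circle cx="60" cy="60" r="40" fill="none" stroke="black"/>
  <foreignObject x="110" y="40" width="90" height="40">
    <math xmlns="{MATHML_NS}">
      <msup><mi>x</mi><mn>2</mn></msup>
      <mo>+</mo>
      <msup><mi>y</mi><mn>2</mn></msup>
      <mo>=</mo>
      <mn>1</mn>
    </math>
  </foreignObject>
</svg>"""


def math_inside_foreign_object(svg_source):
    """Count MathML <math> elements that are direct children of an SVG <foreignObject>."""
    root = ET.fromstring(svg_source)
    path = f".//{{{SVG_NS}}}foreignObject/{{{MATHML_NS}}}math"
    return len(root.findall(path))


if __name__ == "__main__":
    print(math_inside_foreign_object(SVG_WITH_MATH))
```

The important detail is that each vocabulary keeps its own namespace: the foreignObject element belongs to SVG, and the math element it contains belongs to MathML.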

Lucifox and EPUBReader have a big drawback: they do not support EPUB pages with the "scripted" property. This means that you cannot use Javascript to create dynamic ebooks with live samples or interactive exercises... but this is one of the reasons to use Web formats! Fortunately, there is a XUL application called AZARDI that supports this feature. I have created another ebook that shows an interactive course on matrices. Click on the image to see the video on YouTube:
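
For readers who wonder where the "scripted" property mentioned above lives: in EPUB3 it is declared on the manifest items of the package document. The following Python sketch lists the pages that declare it. The sample package document is made up for illustration (file names are hypothetical, and the metadata and spine are left out), and the helper name scripted_items is my own.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# Hypothetical, trimmed package document: a real one also has <metadata> and <spine>.
SAMPLE_OPF = f"""<package xmlns="{OPF_NS}" version="3.0" unique-identifier="uid">
  <manifest>
    <item id="intro" href="intro.xhtml" media-type="application/xhtml+xml"/>
    <item id="matrices" href="matrices.xhtml" media-type="application/xhtml+xml"
          properties="scripted mathml"/>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
  </manifest>
</package>"""


def scripted_items(opf_source):
    """Return the href of every manifest item whose properties include "scripted"."""
    root = ET.fromstring(opf_source)
    hrefs = []
    for item in root.iter(f"{{{OPF_NS}}}item"):
        # "properties" is a space-separated list of tokens.
        properties = item.get("properties", "").split()
        if "scripted" in properties:
            hrefs.append(item.get("href"))
    return hrefs


if __name__ == "__main__":
    print(scripted_items(SAMPLE_OPF))
```

A reader that does not support scripting can use this declaration to know in advance which pages will not behave as intended.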

@Yoric: thanks for the info. I gave a quick try to your EPUB reader but was not able to make it work.

@bastiaan: I mean anything that is not a Web format (= based on HTML5, CSS, Javascript, SVG...) and so is not appropriate to be read on the Web or on various mobile devices in a way that is reusable, accessible, interactive, etc but rather whose main goal is to be printed on paper. As for your second question, I'm not sure whether it is rhetorical or if I really need to explain why the Web and related media help to be more collaborative...

fredw, the question was not rhetorical. Suppose I have my documents in LaTeX and I put them in a github repo. So far as I can tell, that would be 'reusable', 'accessible' and 'interactive'. I presume LaTeX files could be read on mobile devices (though I've never had the temptation to try).

...At any rate, I'm trying to distil your comment about researchers down to a) what we have now and b) where you want to go. It seems to me that converting my 'input' to something web-compatible is both easy and trivial (because scientific texts tend to be pretty simple). So I must be missing something.

bastiaan: So if you mean putting your LaTeX source on GitHub then it is clearly reusable and that's definitely a good idea. However, the *.tex files alone cannot be read directly in browsers, nor can they be read by accessible tools (e.g. for the visually impaired), nor made interactive via Javascript as I show in my video, nor published in a blog post or on any social network where people can comment, nor edited on a common Wiki by a group of researchers. Actually, most people just publish some output like Postscript or PDF (generally without the LaTeX source), providing large "read-only" documents, and this does not address any of the previous needs. In order to get the full power of the Web... you obviously need Web formats! Indeed, tools like TeX4ht or LaTeXML can do the conversion from .tex to HTML (even preserving the TeX source of mathematical formulas). See for example the arxmliv project.

So my point was not to replace the LaTeX input but to replace the Postscript/PDF/printed papers output by Web formats. And if necessary create LaTeX packages: for example to include Javascript programs, add CSS styles and so forth. The only thing new in my blog post is that LaTeXML will now be able to generate EPUB documents too (that is zipped Web pages) so that they can be shared and read offline too while still preserving the Web features.
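
To illustrate the "zipped Web pages" point, here is a minimal Python sketch of how such a container can be assembled. It is not how LaTeXML does it; the function name build_epub and the file names are my own assumptions. The two things that come from the EPUB container format itself are the "mimetype" entry (first in the archive, stored uncompressed) and META-INF/container.xml pointing at the package document.

```python
import io
import zipfile

# Points the reading system at the package document (here assumed to be content.opf).
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""


def build_epub(files):
    """Zip a mapping {archive name: text content} into EPUB container bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry must come first and must not be compressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML,
                         compress_type=zipfile.ZIP_DEFLATED)
        # Everything else is ordinary Web content: XHTML, CSS, Javascript, SVG...
        for name, content in files.items():
            archive.writestr(name, content, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder contents; a real book needs a complete package document.
    data = build_epub({"content.opf": "<package/>", "index.xhtml": "<html/>"})
    with open("sample.epub", "wb") as output:
        output.write(data)
```

The placeholder contents above would not pass an EPUB validator; the sketch only shows the packaging step.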

Last week I attended the EDUPUB Workshop, the purpose of which is to create an "EPUB profile for educational publishing". An important part of this effort is to add support for "widgets", interactive objects that can be added to ebooks. Sounds like this would be useful here. Once it has been finalized, an ebook that uses such widgets should be read optimally on ebook readers that support the EDUPUB profile.

fredw: you seem to be assuming that a researcher wants to do his research in public. That's quite a policy change for all/most research groups I've ever heard of. Having a shared group repository is one thing, but opening things up entirely is quite another. The incentives for research are generally dependent on the research details being kept secret until the research group has finished. Once that barrier is removed, it might be useful to discuss tools for making participation easy...

@bastiaan: I know only a few researchers who want to do their research in public but that's one of the things that some folks are trying to change (like Mozilla Science Lab <https://wiki.mozilla.org/ScienceLab...) and I hope that they achieve their goals.

IMHO, we must build tools for making participation in research easy before the barrier is removed, because this is a chicken-and-egg problem and the easy thing to do is to build the tools.

@bastiaan: yes, I'm aware that most researchers do not want to do their work in public and as Raniere said that was my point when quoting the Mozilla Science Lab. However, publishing their papers is important and they progressively start to use the Web to reach a larger audience, even if that's as a "repository of pdf papers" like arxiv. Even if only the final work is published, it is important to have something that can be read by accessible tools, adapted to the screen size (reflowed, linebroken, zoom in/out) or recognizable by search engines (there are projects to search MathML formulas) ; otherwise they don't need the Web they themselves invented, but only the Internet. That said my blog post was not on research only, that was just a casual remark to mention Mozilla Science Lab. I'm considering more general applications from education to academia (thus my example of the interactive course on matrices). Ebooks are particularly interesting for textbooks from school to college or specialized books giving the state of the art of a research field. And more and more Websites like Arxiv or Wikipedia are migrating to MathML in order to display mathematical formulas. Having a format like EPUB that is just zipped Web pages is important to get something 100% compatible and avoid duplicate effort.

Raniere: sure, having the tools is good. I think changing the way science is done requires that the incentive model changes. As it is, the incentive model creates a number of very negative side effects, such as not publishing 'boring' results, rush-to-press, sloppy research and peer review, and even unscientific or dishonest reporting. So the good news is the model has to change anyway.

@fredw About education applications, did you post the link to the epub used in your video? And do you know a tool that any math teacher (not a programmer) can use to create an ebook like the one in the video?

@Raniere: not yet, I plan to give more details on this soon. LaTeXML now makes creating an ebook easy but it lacks interactivity features. At the moment one would need some basic knowledge of HTML and Javascript in order to write such a book. There are probably several options, like creating general Javascript libraries to make programming easier (such as jquery-mathml), defining some "widgets" as Paul mentioned, inventing LaTeX macros and packages that could help (see my post on the LaTeXML mailing list) or creating some kind of high-level Web authoring tools for math similar to WebMaker.

@michal.h21: thank you, that's great! I knew that some people used TeX4ht to convert to EPUB3 (XHTML+MathML) but I was not aware of a direct conversion tool. Configuration for the UniqueIdentifier or CoverImage looks like a very good idea that is currently missing for LaTeXML.