By far the most visited post on this blog is from 2010. It is about OCRing (Optical Character Recognition) a PDF in GNU/Linux, and it contains a small shell script that others have improved several times. After buying a new flatbed scanner, I re-investigated how to scan and OCR PDFs, how to produce DJVU files that are incredibly small, and how to get the metadata right. It turns out that what I had really wanted all along was to create PDF/A compliant documents (I just didn't know what PDF/A was before). But let me explain the details after presenting the quick solution. At the end, I have a shell script that scans directly to PDF/A.

The units scientists use in their daily work are the SI units, and sometimes the equivalent cgs units. In these unit systems, everything is based upon the physical quantities length, mass, time (and some more). The adjective "physical" is only deserved if these "quantities" refer to something measurable. So, what does it mean to measure length, mass and time?

In this short rant, I want to convince you to try out some new beautiful fonts for your editor, terminal, wiki or website. In particular, I want you to take a look at Adobe's Source Pro Fonts. I'll explain where you can preview fonts online and how to employ them in various settings.

I guess you all know what a WikiWikiWeb (short: wiki) is: a website where you can easily add new pages and modify existing ones. MathOverflow is some kind of hybrid between a Q&A site and a wiki, since users with enough reputation can edit other people's questions and answers. MathOverflow made the Markdown syntax very popular, and people got used to using LaTeX online. Some of my readers surely know the nLab, a collaborative wiki on n-categorical math(ematical physics) and stuff. The nLab runs on a software called Instiki, which is a wiki written in Ruby (an interpreted language similar to Python, and somewhat similar to Lisp, Perl and JavaScript, which is often used for web applications like wikis). The good thing about Instiki is that it supports editing pages in Markdown syntax with embedded LaTeX, so it is able to support your personal knowledge management needs. In addition, Instiki is small (thus not many bugs are to be expected) and fast, and its code is quite readable, something I wouldn't say about MediaWiki, the software behind Wikipedia.

In this post, I will tell you how to run your own wiki like the nLab. [UPDATED 2013-01-07; easier fix]

This short article is intended for non-mathematicians who don't quite remember what the terms "matrix" or "polynomial" refer to. Nonetheless, I'll try to give you an intuitive idea of what a PhD student working on "Algebraic Topology" studies.

First of all, studying or researching mathematics is not about remembering formulas or calculating some really large numbers. That is a part of mathematics, but it's not what drives it; it is merely a tool that is more and more handed over to computer systems. So, what is mathematics instead? I am not competent to give an answer (and there have been many, many different answers to that question in the past), but I can explain my view on pure mathematics to you:

Pure Mathematics is the study (by any means) of statements whose truth can be derived syntactically, i.e. by some computational process. (Yes, I'm quite a syntactic thinker.)