Fun at the Turing tar pit

I love to babble about "social issues of software development". I try to avoid mentioning them in technical discussions (also known as "flame wars") because they make particularly lousy arguments. Other than that, I just love talking about them. Face it: software is written by people for people. Most of it smells like a hairless pink ape. Nothing formal or provable/refutable or scientific or universally correct about it. People stuff. Like furniture, or jewelry, or porn, or screwdrivers. The machine merely follows orders; the humans are the serious players. Social Issues are what makes those bits move.

So, Social Issues. Today's Social Issue is this: programmers adore Turing tar pits. Turing tar pits are addictive. You can render programmers completely paralyzed from any practical standpoint by showing them a tar pit and convincing them to jump into it.

A Turing tar pit is a Turing-complete virtual machine that doesn't support straightforward implementation of bread-and-butter programmer stuff. Bread-and-butter programmer stuff includes arithmetic, variables, loops, data structures, functions and modules and other such trivial and awfully handy goodies. A canonical example of a Turing tar pit is the Brainfuck programming language. Implementing a decimal calculator or a sokoban game in sed is also a consensus example of Turing tar pit swimming as far as I know. By "consensus", I mean that nobody writes software that people are actually supposed to use on top of those VMs.

Some Turing tar pits are used as vehicles for production code.

C++ templates. A crippled everything-is-a-type VM. Try writing a compile-time loop using them; then, try to implement a compile-time hash table. I find the implementation of a decimal calculator in sed more concise in certain aspects. But people do awfully hairy things with C++ templates, like Boost. The fun of implementing such things can only be compared to the fun of using them (error messages which would take a whole tree to print out on paper and such).

TeX. A crippled everything-is-a-text-transformation VM. A great back-end with a nightmarish front-end, as far as I can tell. You get neither WYSIWYG nor the ability to mechanically analyze the content (the document is basically code which transforms itself, and transforms itself, and transforms itself, and you can't make any sense of the document without, um, running it). But people do unimaginably hairy things on top of TeX, like LaTeX. You actually have to debug LaTeX documents, and I bet that the best TeX debugger is somewhat worse than the worst C debugger out there.

Shell scripts. A crippled everything-is-a-process-unless-it's-a-string VM. Debugging clearly wasn't on the list of Useful Activities which the shell was meant to support. "rm: Is a directory". Gotta love that. Of course people write huge shell scripts. One excuse is that they want to modify env vars in the user's shell, and "only a shell script can do that" (wrong – you can use eval `sane-prog` instead of source insane-shell-script). A particularly awesome kind of shell script is the multi-kilobyte one-liner (find | grep | awk | grep -v | fuck-off-and-die).
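To make the eval trick concrete, here's a minimal sketch of what a "sane-prog" could look like in Python (the program name, variable names and values are all invented for the example): it computes environment settings in a real language and prints them as shell assignments, so the parent shell applies them with eval "$(sane-prog)" instead of sourcing a shell script.

```python
#!/usr/bin/env python3
# Sketch of "sane-prog": compute environment settings in a real
# language, print them as shell assignments for the parent shell
# to eval. All names and values here are illustrative.
import shlex

def compute_settings():
    # Arbitrary logic -- loops, data structures, error handling --
    # the things that are painful or impossible in pure shell.
    root = "/home/user/project"
    return {"PROJECT_ROOT": root,
            "PATH_EXTRA": root + "/bin"}

if __name__ == "__main__":
    for name, value in compute_settings().items():
        # shlex.quote guards against spaces and shell metacharacters
        print(f"export {name}={shlex.quote(value)}")
```

The parent shell never runs the logic itself; it only evals a few export lines, which is the part only a shell can do.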

I'll tell you why it happens. When you write code in a full-featured programming language, clearly you can do a lot of useful things. Because, like, everybody can. So the job has to be quite special to give you satisfaction; if the task is prosaic, and most of them are, there's little pride you're going to feel. But if the language is crippled, it's a whole different matter. "Look, a loop using templates!" Trivial stuff becomes an achievement, which feels good. I like feeling good. Templates are powerful! What do you mean by "solving a non-problem?" Of course I'm solving real problems! How else are you going to do compile-time loops in C++? How else are you going to modify env vars of the parent shell?! I'm using my proficiency with my tools and knowledge of their advanced features to solve problems! Damn, I'm good!

By the way, here's another entry in The Modern Software Industry Dictionary:

Powerful
adj.

An attribute of programming environments or their individual features, most often Turing tar pits. A feature is "powerful" when at least one of the following holds:

It can be used to implement something trivial in a pointlessly complicated way.

It can cause a lot of damage.

Seriously, it seems like in 85% of the contexts where something is called "powerful", it really means "useless and dangerous". Unlike most entries in the Modern Software Industry Dictionary, I don't consider this word meaningless cheerleader noise. I think it actually carries semantics, making it a pretty good warning sign.

Back to our subject. A Turing tar pit deployed on a massive scale has two effects on programmers who got addicted to it:

They come to think that the sort of work they're used to accomplishing with a Turing tar pit should always be done using a Turing tar pit. And this is just depressing. It really is.

Lisp-style macros, D-style mixins and code generation using DSL compilers are some ways to do compile-time things in a not entirely brain-crippled fashion. You know, you can actually concatenate two strings, which is unimaginable luxury in the world of C++ templates. People are actually afraid of these things. These people aren't cowards, mind you. These are the same people who'd fearlessly delve into the darkest caves of template metaprogramming. Sure, it's "non-trivial", but sometimes you need the Powerful Features to get your job done! But a full-blown programming language for compile-time work? Are you kidding? That is just too dangerous and unmaintainable! And anyway, you can do everything with templates – they're Turing-complete! Neat, ain't it?!
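To illustrate the "unimaginable luxury" of plain string operations, here's a hedged sketch of the code-generation route: a small Python script that emits a C header at build time, using ordinary loops and string concatenation (the header name and table contents are invented for the example):

```python
# A minimal sketch of doing "compile-time" work in a full-blown
# language: generate a C header from data, using ordinary string
# concatenation. The names here are made up for illustration.

def generate_header(name, values):
    guard = name.upper() + "_H"
    lines = [f"#ifndef {guard}", f"#define {guard}", ""]
    lines.append(f"static const int {name}[] = {{")
    # A plain loop computing a table -- the kind of thing that takes
    # recursive template instantiations to express inside the C++
    # type system.
    lines.append("    " + ", ".join(str(v * v) for v in values))
    lines.append("};")
    lines.append("")
    lines.append(f"#endif /* {guard} */")
    return "\n".join(lines)

if __name__ == "__main__":
    print(generate_header("squares", range(8)))
```

Run it from the build system, #include the output, and the "metaprogram" can be read, debugged and unit-tested like any other program.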

Alternatives to TeX… I'm bad at this, I really am, so you won't get a detailed rant on this subject. Let's just say that WYSIWYG is underestimated by many programmers, and extending a wiki engine, even one written in PHP, beats extending TeX hands down. This isn't first-hand evidence, just generic Turing-tar-pit intuition. This is the part where you can tell me that I'm a moron and get away without a symmetrical remark. In particular, TeX seems to have awfully good back-ends; I don't think it can possibly justify heavy usage of its front-end macro facilities, but, um, everything is possible.

Scripting. Back to yosefk's-perceived-competence-land. When people hear about a new language, and they're told that it's a scripting language, not only don't they treat it seriously, they insist on emulating a Turing tar pit in it. They call system() and pass it incomprehensible 300-character pipelines. They write everything in one big bulk, without using functions or classes. They ignore data structures and regular expressions and call awk when they need them the way they'd do in a shell script. From Python or Ruby or Perl. This is how you write scripts, you know.
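As a concrete (and entirely invented) illustration of the difference: totaling the second column of some log file, once by shelling out to an awk pipeline the way the post describes, and once with the language's own strings and arithmetic:

```python
# Two ways to total the byte counts in a hypothetical access log
# whose lines look like "GET 1234". File name and format are
# invented for the example.
import subprocess

LOG = "access.log"  # illustrative file name

def total_bytes_tarpit():
    # The pattern the post describes: emulating a shell script by
    # passing an opaque pipeline to a subshell.
    out = subprocess.run(
        "awk '{s += $2} END {print s}' " + LOG,
        shell=True, capture_output=True, text=True)
    return int(out.stdout or 0)

def total_bytes_idiomatic(lines):
    # The same job with the language's own data structures: no
    # subprocesses, and malformed lines handled explicitly.
    total = 0
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and fields[1].isdigit():
            total += int(fields[1])
    return total
```

The idiomatic version is also the one you can unit-test without creating files or spawning processes.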

Interesting, isn't it? I find it fascinating. I think that software is to wetware what leaves are to trunks; to figure out the leaves, you have to saw through the trunk and see what's inside, or something. I also think that crappy software isn't necessarily written by dumb people, although that helps. I think that the attitude and the state of mind is at least as important as raw aptitude, much like the direction at which you aim a cannon isn't less important than its range. I'll definitely share those thoughts with you, as well as many others, but that will have to wait for another time. It's been a pleasure having you with us. Good night.

29 comments

"Alternatives to TeX… I’m bad at this, I really am, so you won’t get a detailed rant on this subject. Let’s just say that WYSIWYG is underestimated by many programmers [...]"

There is an alternative to TeX, although not WYSIWYG. It is called Lout ( http://lout.sourceforge.net ). I cannot vouch for its usability or fitness for particular purpose, but its promise seems to be in line with your rant: a real programming language for creating documents.

"Lout is easily extended with definitions which are very much easier to write than troff or TeX macros because Lout is a high-level language"

Sounds nice. I'll probably look into it when I get to the awesome task of generating illustrations for a programming tutorial that I keep postponing.

I have to say that I totally and uncompromisingly suck at documents, content management, formatting, typesetting, revision control, rendering and basically everything that matters when evaluating this sort of stuff. Which means that I'm pretty likely to love/hate the programming support and ignore something which matters more when looking at these things…

"extending a wiki engine, even one written in PHP, beats extending TeX hands down"

Well, I haven't written much TeX, and the TeX I have written (copy-pasted, mostly) was horribly confusing and awful, but I wouldn't say it was much worse than extending wikis in PHP, which I've also done at one point (ah, the joys of crappy shared hosting without ssh access). It was really bad. Though it was also metacircular and written in itself, so arguably that puts it in the position of the TeX of PHP wikis.

Well, I guess if you use TeX to write documents, then the real comparison would be with the wiki markup, and I wouldn't bet on wiki (I find that wikis are good for making documents accessible, but they aren't that good for making documents). If you use TeX to write extensions (like LaTeX), then I'd compare it with extending the wiki engine (written, in the suboptimal yet likely case, in PHP).

If you don't write TeX extensions, its Turing tar pit nature only shows in the error messages, which suck suck suck (from what I can tell by looking at the error messages, it has no idea what it's doing, it's just a ton of macro expansions which fail at some point). Then again, a wiki engine can beat that by, for example, not checking for errors at all; but if someone wanted to fix the implementation, at least the wiki interface wouldn't get in the way. I'd say that with TeX, error handling is inherently messy, and with wikis, it's an implementation issue.

Re: TeX – the problem, seems to me, is that languages which were never designed to function as a Turing machine were given Turing-completeness, y'know, just in case someone is a master hacker. Then some brain-dead manager figures out that it is possible to do everything in that language. The conversation usually goes "I think I should use Perl/Python/Ruby for that" "But that requires skills that only you have (because we have brain-dead HR departments), can't you just implement that inside the TeX document, and solve our hypothetical but nonexistent problem of finding another person who knows Ruby? I remember when you argued for us not to write our 1000 page cross-referenced and indexed set of documents in Microsoft Word that you said TeX was a full-blown programming language." "Ok, well…technically it could be done…but…" You get tired of arguing. I bet templates are a similar thing.

Point is: nothing wrong with being a tar-pit as long as you stop using the language when it stops being easy.

I like TeX; I prefer Plain TeX. One of the things I prefer about TeX over other systems is that it can output DVI files, which in my opinion is the best format (I have also written a Haskell program to read and write DVI files). And there's METAFONT for font design; I find it far better than any other program for designing typefaces. Both Plain TeX and Plain METAFONT work the same way now as they did in the past, and will continue to do so in the future. (I have written macro packages in TeX to make it play chess variants, calculate Easter, overlay a PBM picture onto a page (without using specials or any external programs; the PBM renderer is written entirely in TeX), an implementation of the esolang Underload, etc.)

Another remark regarding TeX: the language pretty much sucks (although no more than, let's say, Tcl or various XML-based languages).

However, the OUTPUT that TeX produces is simply unparalleled in quality, unless you allow very expensive, closed-source software into the comparison.

This is, in my humble opinion, true even if you do not use mathematical formulae in your documents. Even the decidedly UNmodern standard font of TeX (un-aptly named Computer Modern Roman) looks better than Word output with any font. It's all in the typesetting.

What this should illustrate is that TeX is not a very good example of a Turing tar pit — because the reason people do so much in TeX is not the fun of a pointless exercise in wrestling it, but that you simply cannot get that quality of output easily by other means. Especially if you want to publish an entire, possibly highly technical, book.

@Sigi: you could, in theory, implement a TeX front-end that hides the language from the user, instead of exposing macros that people type by hand, which forces them to debug their documents as code. If the macros weren't there, people wouldn't be tempted; HTML, for instance, has no macros, and in a whole lot of cases outputting HTML is unavoidable whether you like it or not, yet there are many HTML-generating programs that don't force you to hand-edit HTML and don't extend it through a macro system, so you don't have to debug your documents like programs.

But I dunno – something based on TeX macros could be the optimal TeX front-end, for all I know. I'm just not any good at TeX.

There is such a front-end (LyX), and it's very good. It even has a fair amount of WYSIWYG.

Also, if you use LaTeX (which *is* a macro collection for TeX, nothing else), the amount of "debugging" you have to do is fairly low, even if you write very complex documents. It's a very mature front-end to "pure" TeX, and extremely well documented.

A problem with LaTeX can be that it doesn't format the output *quite* the way you want it or need it — because it does have strong assumptions about how a "good" (or harmonious) layout is supposed to look, and most of them are there for good reasons. After all, it's there to let laypeople typeset like professionals without having to figure out all the nitty-gritty details of expert typesetting.

So, from time to time one might be tempted to fight LaTeX, and that can be the beginning of a painful journey. Luckily it's highly customizable, with a good reference book it's mostly about setting the right defaults.

As for writing "bare" TeX — not many people do that, mostly macro and style authors.

A friend and I wrote a description of the bidding system we used in bridge (the card game), in a way a technical manual, full of symbols and tables etc., with really hairy source code. We used LaTeX for that with a few "plugins" (macro collections), and it looked awesome, and we could store it in a VCS without trouble. A conventional word processor would just not have worked that well, even considering the quirks of TeX.

I would say that LaTeX is comparable to HTML/CSS in usability, unless you tempt it to bite you by doing "odd things".

Umm, a wiki isn't even remotely similar to what TeX does. TeX is a typesetting engine that does things like ligatures, pixel-perfect layouts, word-break and page-break processing (NP-complete), etc. A wiki is just a multi-user text dump.

You're spot on about the front-end though; the macros that produce macros that produce macros are a nightmare. No one has come up with anything that even compares to TeX in the decades since though, so maybe that's what it takes to get beautifully typeset documents?

"That's what it takes" – perhaps, but isn't it like saying that to achieve the per-capita GDP like that of the US, you need to avoid the metric system?

How amazingly better TeX is compared to everything else I don't know; I don't feel the difference all that much, but then maybe that's like saying that I don't think Verdi's Otello is in any way superior to the first album by Ramones (in fact I much prefer the latter; you get the idea).

The second point was more a random supposition than a conclusion, but you can nitpick it if you want. I'm not trying to prove anything.

My main point was that comparing a wiki system to TeX clearly demonstrates a lack of understanding of what TeX is. It's one thing to be subjective but it seems rather pointless to make bold and opinionated statements about something with no knowledge or insight.

I liked the post and completely agree about C++ templates but I think you lost the mark on TeX. The front-end is very fragile, as you said, but your rant went beyond the original point.

A wiki is indeed just a multi-user text dump, but you can render and print stuff from wikis after the HTML is generated (not a part of what a wiki does but a part of what you get), and you can extend wikis if you want more features (embed new types of things, automatically format new types of things, etc.) If you think of it as a document preparation system then it's comparable to TeX; if you think of it as a multi-user text dump then perhaps you want to compare it to a VCS.

Of course it's very different from a VCS or from TeX in more than one way, however you treat it; you can nitpick if you want :)

Sure, TeX and HTML are both used to prepare documents. If all you want is a crudely and unpredictably rendered document, HTML will suffice.

When it comes to rendering, HTML is a lossy format, TeX isn't. That's the difference.

HTML rendering changes subtly over time, between different browsers and even different versions of the same browser. Many details that are important to typesetting (e.g. page breaking) are completely unspecified by the HTML standard.

TeX continues to exist for people who care about predictable, pixel-perfect rendering. You can certainly use it as a bastardized "document preparation system" but that's not what it is.

You're right of course – and one reason I hate TeX as well as the not-so-pixel-perfect Word, etc. is precisely that they insist on breaking pages which sucks for on-screen reading. The thing is, for every two large-enough systems, you'll find that they have fundamental differences and in that sense can not be compared. And what I was doing here was fairly legitimate I think – specifically, I ignored the differences that you mentioned, not because they don't matter to some people some of the time, but because they were irrelevant to what I was discussing – namely, extending the language which you use to spell documents.

That's why I love PostScript: it's not "powerful" by the definition you stated. It lets me deal with complex problems of design (that is, mechanical design), because I can test any idea empirically right after forming the equation that should produce a shape.

As for TeX, people… it's a language designed to do work that was necessary in the 1970's! It's a perfect fit for that kind of work. But since then, new types of documents and even the internet were invented, for which TeX was not designed and is not even suitable, with our low-resolution, 20:5 aspect screens.

I don't think I believe in this least-power thing; as in, I'd rather describe graphs in Python code, where I can create similar groups of nodes with loops, than use dot's language, and I wouldn't worry about using too powerful a language. Perhaps what I'm saying is even the opposite: that people start with an attempt to have a not-very-powerful specialized language and slowly augment it with features until, sometimes incidentally, they reach Turing completeness, but in a tarpit-y way. If they aimed at Turing completeness to begin with – ignoring the principle of least power – the mess would have been avoided.
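A minimal sketch of the commenter's approach, with made-up node names: generating Graphviz dot text from Python, where a loop produces the repetitive groups of nodes that would have to be spelled out edge by edge in dot itself.

```python
# Generate Graphviz "dot" text from Python. The graph and node
# names are invented for illustration.

def clique(prefix, n):
    # Every node connected to every other -- tedious to write out
    # by hand in dot, one line in a loop here.
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            edges.append(f"  {prefix}{i} -- {prefix}{j};")
    return edges

def make_graph():
    lines = ["graph G {"]
    lines += clique("a", 4)   # one group of similar nodes
    lines += clique("b", 3)   # another, made by the same loop
    lines.append("  a0 -- b0;")
    lines.append("}")
    return "\n".join(lines)
```

The output is still plain dot, so the standard toolchain renders it; the "power" lives in the generator, not in the data language.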

I think the advantages of limiting the power are very small and maybe non-existent. For instance:

"…weather information portrayed by the cunning Java applet. While this might allow a very cool user interface, it cannot be analyzed at all. The search engine finding the page will have no idea of what the data is or what it is about."

(A quote from the wiki article)

This seems false; today's search engines run JavaScript and it's probably not harder than parsing HTML. What is important is that the dynamic code uses standard interfaces for rendering so the search engine can see what gets rendered, what the links are etc.; but this isn't a virtue of limiting the expressivity of the language but rather of standardizing interfaces.

So basically I'd say "choose the most powerful and readable language and communicate using common interface", not "choose the least powerful language" – the latter incidentally being the road to a Turing tar pit.