How does Tupper’s self-referential formula work?

[I write this post with a certain degree of embarrassment, because in the end it turns out (1) to be more simple than I anticipated, and (2) already done before, as I could have found if I had internet access when I did this. :-)]

Figure 1: The graph of the formula, in some obscure region, is a picture of the formula itself.

Whoa. How does this work?

At first sight this is rather too incredible for words.

But after a few moments we can begin to guess what is going on, and see that—while clever—this is perhaps not so extraordinary after all. So let us calmly try to reverse-engineer this feat.

The first thing we notice is the size of N. As the “formula” is small and innocuous (just a few exponents, remainders and floors, no “unnatural” functions), and N is so disconcertingly huge, it is reasonable to guess that all the actual information that produces the graph is contained in N itself. This would mean that the formula is like a bare-bones “program”, and N is the “input” that somehow encodes the image that is the graph.

To understand the encoding, we must turn to the formula.
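For reference, the relation as it is usually stated is reproduced here (the figure above is the authoritative version for this post; the plotting range with 106 columns is taken from the bitmap size discussed in the text):

```latex
\[
\frac{1}{2} < \left\lfloor \operatorname{mod}\!\left(
  \left\lfloor \frac{y}{17} \right\rfloor
  2^{-17 \lfloor x \rfloor - \operatorname{mod}(\lfloor y \rfloor,\, 17)},\ 2
\right) \right\rfloor,
\qquad 0 \le x < 106, \quad N \le y < N + 17 .
\]
```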

First, using the fact that $\lfloor y/17 \rfloor = \lfloor \lfloor y \rfloor / 17 \rfloor$, we see that the formula (relation, more properly) depends on $x$ and $y$ only through $\lfloor x \rfloor$ and $\lfloor y \rfloor$. So the graph in each 1×1 square (with vertices having integer coordinates) is the same as at its bottom-left corner.

A part of Figure 1 enlarged, with grid turned on

(Just a sanity check at this point: this observation means that Figure 1 is essentially a 106×17 bitmap. 106×17=1802, and $N$ has about $\log_2 N \approx 1800$ bits, which is roughly as many bits as the image. So we have another reason for our suspicion that N encodes the image.)

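Accordingly, we can restrict our attention to integer values for $x$ and $y$. Further, $\frac{1}{2} < \lfloor t \rfloor$ is the same as $1 \le \lfloor t \rfloor$ (I wonder why the obfuscation was there?), so we can simplify the formula:

$$1 \le \left\lfloor \operatorname{mod}\left( \left\lfloor \frac{y}{17} \right\rfloor 2^{-17x - \operatorname{mod}(y, 17)},\ 2 \right) \right\rfloor$$

We also notice a dependence on the value of y mod 17, so let’s write $y = 17q + r$, where the remainder $r$ satisfies $0 \le r < 17$. So our graph is the set of all points $(x, 17q + r)$ for which

$$1 \le \left\lfloor \operatorname{mod}\left( q \cdot 2^{-(17x + r)},\ 2 \right) \right\rfloor,$$

which is the same as saying that

$$\left\lfloor \frac{q}{2^{17x + r}} \right\rfloor$$

is an odd number.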

Note that the region in which the plotting is supposed to be done has a range of only 17 integers for $y$, so $q = \lfloor y/17 \rfloor$ remains the same for all $y$ (except possibly off by 1) while $r$ ranges from $0$ to $16$. For a fixed value of $x$, then, different values of $y$ correspond to different $r$.

In case you haven’t noticed yet,

$$\left\lfloor \frac{q}{2^{17x + r}} \right\rfloor$$

being odd is equivalent to the (17x+r)th bit of $q$ (counting from the rightmost bit as the 0th) being 1. So in the graph, the pixel at $(x, r)$ is 1 iff the bit at position (17x+r) in $q$ is 1: $q$ is merely a list of bits of the image! It’s a rather straightforward encoding after all: it just reads off the bits of the image, reading each column upwards, and going through the columns left-to-right.
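The equivalence between "the floor is odd" and "the bit is set" can be checked with a few lines of Python. This is only a sketch: the function names and the tiny value of `q` are made up for illustration, and only integer points are considered.

```python
def floor_is_odd(q, x, r):
    """The condition from the text: floor(q / 2**(17x + r)) is odd."""
    return (q // 2 ** (17 * x + r)) % 2 == 1


def bit_is_set(q, x, r):
    """The same condition read as a bit test on q (bit 0 is the rightmost)."""
    return ((q >> (17 * x + r)) & 1) == 1


def pixel(x, y):
    """Whether the integer point (x, y) is in the graph, writing y = 17q + r."""
    q, r = divmod(y, 17)
    return bit_is_set(q, x, r)


q = 0b1011  # hypothetical tiny "image": bits 0, 1 and 3 are set
# Both formulations agree on every row of the first two columns.
assert all(floor_is_odd(q, x, r) == bit_is_set(q, x, r)
           for x in range(2) for r in range(17))
print([r for r in range(17) if pixel(0, 17 * q + r)])  # prints [0, 1, 3]
```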

In hindsight, we can see that this is in fact the natural encoding. If you were given a bitmap image ‘q’ got by concatenating each column (or row), and wanted to find a mathematical relation which for (x,r) was 1 iff the image had a bit at that point, this is more or less the relation you would come up with (for some orientation of the axes). Tupper’s further trick here is in folding the value of q into the value of y itself as y=17q+r.

The “mistake”

Actually, the value of N that I included above is one I found myself. Every source I’ve seen (examples: MathWorld, Wikipedia, “Implementation” webpage (doesn’t work), blog with code, etc., etc.) instead mentions a different constant for N, namely the 543-digit integer (call it N’)
96093937991895888497167296212785275471500433966012930665150551927170
28023952664246896428421743507181212671537827706233559932372808741443
07891325963941337723487857735749823926629715517173716995165232890538
22161240323885586618401323558513604882869333790249145422928866708109
61844960917051834540678277315517054053816273809676025656250169814820
83418783163849115590225610003652351370343874461848378737238198224849
86346503315941005497470059313833922649724946175154572836670236974546
1014655997933798537483143786841806593422227898388722980000748404719.

Graphing the function on the range $0 \le x < 106$, $N' \le y < N' + 17$ gives the following figure:

What Tupper's formula as described in every source actually graphs

It is upside down. This is of course an entirely trivial goofup, probably based on the mathematical convention (the y-axis goes upwards) differing from computers’ screen convention (the y-axis goes downwards), but there’s something slightly comic about spending all this effort devising a very clever trick, but failing to ensure it’s not upside down. (‘The scene in Not the Nine O’clock News in which an elderly, exhausted monk unbent himself after years of illuminating the first page of the Bible, only to see that he had written, gloriously, “Benesis”’.) But that’s not a big deal, mistakes happen. What I find strange is that hundreds of people have been quoting and referencing this, without ever bothering to check it for themselves, and those who did bother and presumably noticed something amiss (e.g. here, Reddit, Hacker News,…) merely changed their code to “make it work” (a la voodoo programming) rather than confidently declaring that the constant was the wrong one, or at least specifying that the y-axis needs to go downwards. (Even the upside-down constant isn’t always consistently described: Stan Wagon’s nice list of favourite puzzles, mentioned as a source on MathWorld, calls it a 541-digit integer but prints all 543 digits, while the otherwise excellent book Experimental Mathematics in Action (by David H. Bailey, Jonathan M. Borwein, Neil J. Calkin, Roland Girgensohn, D. Russell Luke, and Victor H. Moll) calls it a 541-digit integer and actually cuts off the last two digits, which results in an image that looks like garbage.)

It’s remarkable that no one would check something so easy to check, even in these days when nearly any computer user has access to tools that make it easy. Maybe they are using the wrong tools? Something as simple as this will work:

which (if you squint) you can see is the formula upside-down. (The appendix below contains a better version that uses matplotlib and was used to generate the images in the post.)
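A check of this kind needs very little code. The following is a minimal sketch, not the snippet referred to above: the names `tupper_pixel` and `render` are made up here, and it draws text with `#` for set pixels instead of producing an image.

```python
def tupper_pixel(x, y):
    """True iff the integer point (x, y) satisfies Tupper's relation."""
    q, r = divmod(y, 17)
    return ((q >> (17 * x + r)) & 1) == 1


def render(n, width=106, height=17):
    """Text picture of the region 0 <= x < width, n <= y < n + height."""
    rows = []
    # Largest y first, so that the y-axis points upwards on the screen.
    for dy in reversed(range(height)):
        rows.append("".join("#" if tupper_pixel(x, n + dy) else " "
                            for x in range(width)))
    return "\n".join(rows)


# Tiny check: N = 17 * 0b101 sets the pixels at heights 0 and 2 in column 0.
print(render(17 * 0b101, width=1))
```

Calling `print(render(n))` with one of the large constants in place of the tiny example draws the full 106×17 region. Because the rows are emitted with the largest y first, the output follows the mathematical convention of the y-axis going upwards.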

Author, Background

Apparently Tupper included the formula in his 2001 SIGGRAPH paper on graphing methods, merely as an example of the kind of graphs his method can handle. (And he didn’t call it self-referential.) I’m pleased to discover that Tupper is the author of GrafEq which I remember trying some ten-odd years ago. It was at that time the best graphing software in certain respects (precision, correctness, etc.), and as far as I can tell, it still is. I hope its ideas get incorporated in all graphing programs in the future. In the paper, Tupper writes:

Many students currently studying mathematics are using automated graphing tools that produce incorrect graphs for some of the equations discussed in their curricula. I have written this paper in the hope that, in the future, more students will have access to graphing tools that work correctly.

The gallery page for GrafEq contains many other ingenious exploits that use implicit relations to get beautiful graphs. The one closest to the formula here is probably Decimal Squares, but many of the others are truly mind-boggling.

Tupperware and beyond

More bitmaps

Now that we see that graphing Tupper’s formula is just a “program” to decode a bitmap, it follows that the formula is also “universal”: for any bitmap image (of height at most 17 pixels), there exists an integer N such that graphing Tupper’s formula on the range 0 ≤ x < width-of-image, N ≤ y < N+17 produces the image. All possible images are contained in the graph of Tupper’s formula. For instance, N=6064344935827571835614778444061589919313891311 gives this:

And so on. The appendix below contains a program that can read (some) .bmp files and generate corresponding N.
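The encoding direction can be sketched in a few lines as well. This is an illustration under assumptions, not the appendix program: the bitmap is given as a list of strings (top row first, `#` for a set pixel) instead of a .bmp file, and the names `bitmap_to_n` and `n_to_bitmap` are made up here.

```python
def bitmap_to_n(rows):
    """Return N for a bitmap given as strings, top row first, '#' = set pixel."""
    height = len(rows)
    if height > 17:
        raise ValueError("height must be at most 17")
    q = 0
    for x in range(len(rows[0])):
        for r in range(height):
            # r counts upwards from the bottom row; column x fills bits 17x..17x+16.
            if rows[height - 1 - r][x] == "#":
                q |= 1 << (17 * x + r)
    return 17 * q


def n_to_bitmap(n, width, height=17):
    """Inverse direction: read the bits of q = N // 17 back into rows."""
    q = n // 17
    return ["".join("#" if (q >> (17 * x + r)) & 1 else "." for x in range(width))
            for r in reversed(range(height))]


rows = ["#.#",
        ".#.",
        "#.#"]
n = bitmap_to_n(rows)
assert n_to_bitmap(n, width=3, height=3) == rows  # round trip
```

Images shorter than 17 pixels simply leave the upper bits of each column at 0.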

Self-reference

This also means that the formula is not self-referential at all (though one may argue it’s something better), any more than a program that prints all possible strings can be called a quine. So in the spirit of actual self-reference, here is

Exercise 1 (easy): Find N such that the graph of Tupper’s formula (in the range given by N) is a picture of N, or prove that it is impossible.

Exercise 2 (hard): Find N such that the graph of Tupper’s formula (in the range given by N) is a picture of , or prove that it is impossible.

Pixellation

It is easy to get a similar formula with higher resolution. For instance, the formula

Hi, thanks for the comment. Yes of course that’s right, and I will not (cannot) disagree strongly.

But the orientation of the axes is such a strong convention that if a source doesn’t mention otherwise, that’s what one would have to assume. If you look at the image on Wikipedia, it even has the y-axis labelled, and that’s indisputably wrong. I would find it hard to believe that all these sources (e.g. Tupper’s paper which has the y-axis upwards for most/all other images) actually checked the graph, found it flipped, decided it was ok, and also decided that it was ok to omit mentioning this. The more reasonable explanation is that they simply didn’t realise that it was upside down relative to what they and their readers expect: in other words, wrong. :-)

I know that, but IMHO it’s not very relevant. Again: it’s a strong convention in mathematics that plotting the graph of a function means a certain thing, and if a source used something else intentionally it would mention it, etc., etc.

OK: if you can find me another example on MathWorld where plotting a function means plotting it with the y-axis going downwards, and this is not mentioned on MathWorld, I will agree with you. :-)

As an aside, I find it somewhat telling about our nature that with so much text in the post, everyone (including myself) wants to comment on the somewhat silly section that claims something is a mistake.

Wow, that is brilliant! I have no words. Very clever, and please do write the article explaining it. Thanks also for the comments so far on the Reddit thread. I did not imagine that someone could actually achieve this! :-)

Wow, that is certainly big; it took me a while to realise that the tiny black line near the top of the screen was the actual image! :-)

It still seems to require passing N separately, though… am I right? So it’s “(formula + N) gives (image of formula)”, so the image by itself is not sufficient to regenerate the formula. Still, impressive.

N is part of the formula. Since Tupper hasn’t made his bignum GrafEq publicly available, I tried it using Maple and it seemed to work fine. The copy of the email I got included a bunch of links to files in http://www.peda.com/selfplot … which is where I got the constant.

Where does it create these files and does it save them automatically or do I have to do something? So I just open Python, copy the text from plot-tupper.py and open the picture? Sorry I’m being a boring noob but I only worked in QB64 and I don’t know anything about Python. Just try to explain it as you would to a baby :)

You can try the following:
If you know what a “command-line” (“terminal”) is, then you just type the command:
python plot-tupper.py
(from the directory/folder that contains plot-tupper.py), and (when prompted) enter the numbers 106 (the width) and 0 (for default N) respectively. Then you’ll find two files called out.png and out.svg respectively, created in the same directory.

Else, you can try opening the Anaconda IDE (I think it’s called Spyder), open the file plot-tupper.py, and find something somewhere in the menu called “Run” or “Execute” or something like that. Then you should be able to similarly enter the input and find the files created.

Sorry I don’t know anything about Anaconda so I can’t help you much!
With the terminal / command-line method, this is what it looks like on my computer (I saved plot-tupper.py in a directory called /tmp/test — you can use any directory):

To go from number to image, use plot-tupper.py.
To go from image to number, use bmp2tupper.py (it takes an image in .bmp format). That’s how I generated the N for the various images in this post; you can run it with
python bmp2tupper.py image.bmp
at the command line.

By the way, how the formula works is already explained in the post; is something not clear enough?

And how exactly can I create my own .bmp image? When I asked you I thought you would explain how I could do it mathematically. And no, I didn’t understand it because I’m only 15 years old and English is not my mother tongue so I don’t understand some mathematical terms.

I’ll edit the post to add some simpler examples, and reply again here when I’m done. (Might be a few days as I expect to be without a laptop for a while.)

You can create .bmp images with an image editor — on Windows, “MS Paint” (which comes installed by default) can do this, and I used something similar on either Mac or Linux for the images in the post.

Mathematically, going from a bitmap to N is very simple — as the post says, “q is merely a list of bits of the image … reading each column upwards, and going through the columns left-to-right.” (It has 0 if the square is blank, and 1 if the square is filled.) And N = 17q.

For example, the “empty” graph (all 0s) corresponds to q = 0 and so N = 17×0 = 0, and the filled graph (all 1s, for a 106×17 image) corresponds to $q = 2^{1802} - 1$ and $N = 17q$ again.
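Two more small cases, worked by hand under the same encoding (bit $17x + r$ of $q$ is the pixel in column $x$ and row $r$, counting rows from the bottom):

```latex
\begin{align*}
\text{only the bottom-left pixel, } (x, r) = (0, 0): \quad & q = 2^{0} = 1, & N &= 17 \times 1 = 17, \\
\text{only the pixel at } (x, r) = (1, 0): \quad & q = 2^{17} = 131072, & N &= 17 \times 131072 = 2228224 .
\end{align*}
```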

I understand now how to calculate N and I succeeded in making a couple of images (it took me a while until I figured out that my calculator can’t calculate such large numbers). But when I tried to get N from a picture I made in Paint and saved as .bmp, cmd wrote:
SyntaxError: Non-ASCII character ‘\x8e’ in file untitled.bmp on line 1, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details.

As the output says, “Height is larger than 17, so Tupper’s formula must be modified.” The height of your untitled.bmp is 610 pixels.

Try it with an image whose height is 17 pixels or less. That means a *really* small image.

Else, you can do the following too: run
python plot-tupper.py 610
and as input give it the width (1316 for your image) and N (which has over 1200 digits). Note that your terminal may have trouble inputting such a large input for N.
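One natural way to modify the formula for a taller image is to replace every 17 by the image height. The sketch below assumes that this is the intended modification; how plot-tupper.py handles it is not shown here, and the names `pixel_h` and `encode_h` are made up for illustration.

```python
def pixel_h(x, y, height):
    """Tupper-style test with 17 replaced by an arbitrary height."""
    q, r = divmod(y, height)
    return ((q >> (height * x + r)) & 1) == 1


def encode_h(rows):
    """N for a bitmap of any height (strings, top row first, '#' = set pixel)."""
    height = len(rows)
    q = 0
    for x in range(len(rows[0])):
        for r in range(height):
            # r counts upwards from the bottom row.
            if rows[height - 1 - r][x] == "#":
                q |= 1 << (height * x + r)
    return height * q


# A 20-pixel-tall, 2-pixel-wide test pattern survives the round trip.
rows = ["#." if i % 3 == 0 else ".#" for i in range(20)]
n = encode_h(rows)
assert all(pixel_h(x, n + r, 20) == (rows[19 - r][x] == "#")
           for x in range(2) for r in range(20))
```

The plotting range changes accordingly to N ≤ y < N + height.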

i can actually make this a lot simpler if you would like. so if you look at these images in their pixelized form, so that each coordinate is a pixel, you first have to make a binary. starting from the bottom left, going up, then the bottom of the next column and go up, and so on and so on. wherever there is a blank space, punch in a zero, whenever the coordinate is marked plug in a one. after this, you should get a large binary number, bring this number from base 2 into base 10, and then multiply by 17. the result is your N value. if this results in any error, or for some reason the graph didn't appear correctly, reply to this comment, and i can always help you some more. i know i made it sound complex, but once you understand it, it will seem really simple. i currently have many mathematically stable “N” values, but not a program to graph them on, and am still searching for a program i could use, but i know for a FACT that this is how to find the “N” value of whatever image you would like.

Now, time for my own comment before i head off to bed. does anyone know of any free windows programs i could use in order to graph these? as i said before, i have different “N” values for many images, and haven't been able to graph them.

I did not check ALL the pages where you think they give the wrong “N”. But I did try to use the N from Wikipedia and they give the correct graph. If you LOOK at the GRAPH on wiki, the x-axis is plotted in the REVERSE order. That IS what you are expected to do when plotting such graph, I think.

That is almost exactly what I said: the constant 9609…4719 only works if you reverse both axes, else the graph comes out upside-down. But most sources have not bothered to specify that, and they have instead simply flipped the axes (either in their code or in their image) thinking “maybe that is what you are supposed to do”, as you just did. :-) Flipping both x- and y- axes from the standard definition of graph/plot is not something you do silently.

Wikipedia currently mentions “note that the axes in this plot have been reversed, otherwise the picture comes out upside-down”, but at the time I wrote this post (2011 April) it did not: that comment was added by Anders Kaseorg on 2013 August 14.

Well, that’s fair enough. I guess it would only come to us when we actually bother to check them. :) But to be fair, at the university, we are always told NOT to trust things on wiki, as anyone can edit wiki pages. So it’s only a starting point, but it should NEVER be referenced. As far as some other online posts, those are posted by “amateurs”, so we can’t be too harsh on them either.

First you would have to define what it means to graph using three colours. The standard definition of graphing a region is to colour points in the set and leave uncoloured points not in the set. But sure, if you come up with a definition of graphing with multiple colours, then it should work to use a different base.
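As one possible definition (an assumption made only for illustration): let each point carry a colour index, with 0 meaning uncoloured, and read that index as a digit of q in the chosen base instead of a bit. A minimal sketch, with the hypothetical name `colour_at`:

```python
def colour_at(x, y, base=3, height=17):
    """Colour index of the integer point (x, y): one base-`base` digit of q."""
    q, r = divmod(y, height)
    return (q // base ** (height * x + r)) % base


# q = 2 + 0*3 + 1*9 = 11 has base-3 digits 2, 0, 1 (least significant first).
q = 11
print([colour_at(0, 17 * q + r) for r in range(3)])  # prints [2, 0, 1]
```

With `base=2` this reduces to the bit test used by the original formula.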