$\begingroup$Great question! [Edit:] One question: where do we draw the line between algorithms and data structures? What if the key insight of an algorithm is intimately tied to a data structure (for example, UNION-FIND and the inverse Ackermann function)?$\endgroup$
– Ross Snider Aug 17 '10 at 18:25

$\begingroup$I'm a little surprised that algorithms which I consider quite tricky (KMP, linear suffix arrays) are considered by others as being "from the Book." To me, "from the Book" means simple and obvious, but only with hindsight. I'm curious how others interpret "elegant".$\endgroup$
– Radu GRIGore Aug 19 '10 at 7:33


$\begingroup$@supercooldave You don't have to believe in God, but you should believe in his book. ;-)$\endgroup$
– Ross Snider Aug 24 '10 at 22:27


$\begingroup$During a lecture in 1985, Erdős said, "You don't have to believe in God, but you should believe in The Book."$\endgroup$
– Robert Massaioli Dec 15 '10 at 3:17

93 Answers

Union-find is a beautiful problem whose best algorithm/data structure (the disjoint-set forest) is based on a spaghetti stack. While very simple and intuitive enough to explain to an intelligent child, it took several years to get a tight bound on its runtime. Ultimately, its behavior was discovered to be related to the inverse Ackermann function, a function whose discovery marked a shift in perspective about computation (and which in fact appeared in Hilbert's On the Infinite).
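To see just how simple the data structure is, here is a minimal Python sketch of a disjoint-set forest using union by rank and path halving (a standard variant of path compression); the near-constant amortized cost comes precisely from combining these two heuristics:

```python
class DisjointSetForest:
    """Disjoint-set forest with union by rank and path halving."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # Path halving: every node on the path points at its grandparent,
        # flattening the tree as a side effect of the lookup.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True

# Example: track connected components incrementally.
dsf = DisjointSetForest(6)
dsf.union(0, 1); dsf.union(1, 2); dsf.union(3, 4)
print(dsf.find(0) == dsf.find(2))  # True: 0 and 2 are connected
print(dsf.find(0) == dsf.find(3))  # False: different components
```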

$\begingroup$It's kinda mind boggling to realize that that was something that wasn't obvious at some time and is only obvious now because they came up with it and we learnt it... I think we should apply Carr's theory of history to Mathematics and Computer Science.$\endgroup$
– Ritwik Bose Sep 2 '10 at 12:50


$\begingroup$By the description, I'd say this is related to the Boyer-Moore fast substring search.$\endgroup$
– bart Sep 22 '10 at 7:20


$\begingroup$@Mechko The fact that this algorithm was discovered simultaneously and independently by separate people is an indication that it is obvious to an extent. Whether something is "obvious" is a function of project constraints and the broader programming environment. If you (1) need fast text searching, (2) are aware of the importance of truly O(n) algorithms, (3) have encountered text with partial matches before, and (4) have the time to do things "right", then this algorithm is probably obvious.$\endgroup$
– Matt Gallagher Jun 17 '13 at 1:20

$\begingroup$@Kaveh Please read Section 7 (Historical Remarks) from the original KMP paper. It has great remarks: about Morris writing a text editor which "was too complicated for other implementors of the system to understand"; about it being "the first time in Knuth’s experience that automata theory had taught him how to solve a real programming problem better than he could solve it before"; and about how "Knuth was chagrined to learn that Morris had already discovered the algorithm, without knowing Cook’s theorem". FUN.$\endgroup$
– Hendrik Jan Oct 15 '14 at 1:07

$\begingroup$This is one of my favorite algorithms. I like an intuition for it that I learnt from Chazelle's discrepancy book: the set of medians of groups of $1/\epsilon$ elements is like an $\epsilon$-net for intervals in the ordered list of the input numbers. So the algorithm follows a general paradigm: compute an $\epsilon$-net fast, solve the problem on the net, then recurse on some part of the input to refine the solution, until you have the exact solution. It's a very useful technique.$\endgroup$
– Sasho Nikolov Mar 30 '12 at 22:34


$\begingroup$BTW, once you parametrize the size of the groups, the constants are not so magical; they are of course optimized to give the right thing in the Master theorem.$\endgroup$
– Sasho Nikolov Mar 30 '12 at 22:36

$\begingroup$The claim that "this only works because the numbers are just right to fit in the master theorem" is not really true. If you replace the number $5$ with a larger number $n$, it is easy to see that the two numbers that have to sum to less than $1$ converge to $3/4$ and $0$, so all sufficiently large $n$ work. $5$ is just the first number that works, it's not the only one.$\endgroup$
– Will Sawin Mar 16 '15 at 17:08
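For readers following this thread, the algorithm under discussion (median-of-medians selection with groups of 5) fits in a few lines. This is an illustrative Python sketch, not tuned for performance; the group size 5 is the smallest choice that makes the recursion work, per the comments above:

```python
def select(a, k):
    """Return the k-th smallest element (0-based) of list a in worst-case
    linear time, via the median-of-medians pivot rule with groups of 5."""
    if len(a) <= 5:
        return sorted(a)[k]
    # Median of each group of 5 elements; the median of these medians
    # is guaranteed to be a "good enough" pivot.
    medians = [sorted(a[i:i + 5])[len(a[i:i + 5]) // 2]
               for i in range(0, len(a), 5)]
    pivot = select(medians, len(medians) // 2)
    lo = [x for x in a if x < pivot]
    hi = [x for x in a if x > pivot]
    eq = len(a) - len(lo) - len(hi)
    if k < len(lo):
        return select(lo, k)
    if k < len(lo) + eq:
        return pivot
    return select(hi, k - len(lo) - eq)

print(select(list(range(99, -1, -1)), 42))  # 42
```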

$\begingroup$In my first graduate algorithms course we had 15 minute quizzes where we had to solve 2-3 problems by hand. The first such quiz included a binary search tree and two questions about heaps. I was embarrassed and dismayed to learn I had gotten the binary search problem wrong, before being told that in a class of about 30 people there were two correct answers. But even knowing that, the fact that it took 15 years for the professional community to get right is staggering.$\endgroup$
– Stella Biderman Oct 20 '17 at 12:45

d[]: 2D array. d[i,j] is the cost of edge ij, or inf if there is no such edge.
for k from 1 to n:
    for i from 1 to n:
        for j from 1 to n:
            d[i,j] = min(d[i,j], d[i,k] + d[k,j])

One of the shortest, clearest non-trivial algorithms going, and $O(n^3)$ performance is very snappy when you consider that there could be $O(n^2)$ edges. That would be my poster child for dynamic programming!
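The pseudocode above is Floyd-Warshall, and it translates almost character for character into runnable Python; the example graph here is an illustrative choice:

```python
INF = float('inf')

def floyd_warshall(d):
    """All-pairs shortest paths. d is an n-by-n cost matrix with INF for
    missing edges and 0 on the diagonal; mutated in place."""
    n = len(d)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Relax the i -> j path through intermediate node k.
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

d = [[0,   3,   INF],
     [INF, 0,   1],
     [2,   INF, 0]]
floyd_warshall(d)
print(d[0][2])  # 4: the path 0 -> 1 -> 2
```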

Might seem somewhat trivial (especially in comparison with the other answers), but I think that Quicksort is really elegant. I remember that when I first saw it I thought it was really complicated, but now it seems all too simple.
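The "all too simple with hindsight" form can be sketched in a few lines of Python; note that, unlike Hoare's in-place partitioning version, this one copies sublists on every call:

```python
def quicksort(xs):
    """Divide and conquer: pick a pivot, partition, recurse on each side."""
    if len(xs) <= 1:
        return xs
    pivot, rest = xs[0], xs[1:]
    return (quicksort([x for x in rest if x < pivot])
            + [pivot]
            + quicksort([x for x in rest if x >= pivot]))

print(quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```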

$\begingroup$Quicksort also raises interesting questions about what exactly the essence of an algorithm is. E.g. the standard elegant Haskell implementation looks exactly like the standard pseudo-code definition, but it has different asymptotic complexity. So, is Quicksort just about divide-and-conquer or is the clever in-situ pointer-fiddling an essential part of Quicksort? Can Quicksort even be implemented in a purely functional setting or does it require mutability?$\endgroup$
– Jörg W Mittag Sep 3 '10 at 22:28

$\begingroup$@Jörg: Implementing quicksort on linked lists is completely sensible, and has the same asymptotic running time as its in-place implementation on arrays (heck, even the naive out-of-place implementation on arrays has the same running time) – both on average and in the worst case. As for space usage, this is indeed different but it must be said that even the “in-place” version requires non-constant extra space (call stack!), a fact readily overlooked.$\endgroup$
– Konrad Rudolph Sep 8 '10 at 11:31

$\begingroup$Quicksort in theory is simple (can be outlined in 4 steps) and could be highly optimized, but in practice it's very difficult to code correctly. Which is why it doesn't get my vote.$\endgroup$
– Dennis Jun 16 '13 at 13:48

The Miller-Rabin primality test (and similar tests) should be in The Book. The idea is to take advantage of properties of primes (e.g., via Fermat's little theorem) to probabilistically look for a witness to the number not being prime. If no witness is found after enough random tests, the number is classified as prime.
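A sketch of the test in Python; the witness loop follows the standard presentation, and the small-prime shortcut is just an illustrative convenience:

```python
import random

def is_probable_prime(n, rounds=20):
    """Miller-Rabin: return False if a witness to compositeness is found,
    else 'probably prime' (error probability at most 4**-rounds)."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 = d * 2^s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness: n is definitely composite
    return True

print(is_probable_prime(2**61 - 1))  # True (a Mersenne prime)
print(is_probable_prime(561))        # False (a Carmichael number)
```

Note that a "composite" answer is always correct; only the "prime" answer is probabilistic, which is why Carmichael numbers like 561 (which fool the plain Fermat test) are still caught.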

On that note, the AKS primality test that showed PRIMES is in P should certainly be in The Book!

If someone woke you up in the middle of the night and asked you to test two univariate polynomial expressions for identity, you'd probably reduce them to sum-of-products normal form and compare for structural identity. Unfortunately, the reduction can take exponential time; it's analogous to reducing Boolean expressions to disjunctive normal form.

Assuming you are the sort who likes randomized algorithms, your next attempt would probably be to evaluate the polynomials at randomly chosen points in search of counterexamples, declaring the polynomials very likely to be identical if they pass enough tests. The Schwartz-Zippel lemma shows that as the number of points grows, the chance of a false positive diminishes very rapidly.

No deterministic algorithm for the problem is known that runs in polynomial time.
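A toy Python sketch of the randomized test, with the polynomials given as black-box functions; the sample-set size and trial count here are illustrative choices, not canonical ones:

```python
import random

def probably_identical(p, q, degree_bound, trials=30):
    """Randomized polynomial identity test. By the Schwartz-Zippel lemma,
    a nonzero polynomial of degree <= d has at most d roots, so a random
    point from a set S exposes a difference with prob >= 1 - d/|S|."""
    S = range(100 * degree_bound)  # sample set much larger than the degree
    for _ in range(trials):
        x = random.choice(S)
        if p(x) != q(x):
            return False  # a counterexample: definitely not identical
    return True  # identical with high probability

# (x + 1)^2 vs x^2 + 2x + 1: identical polynomials
print(probably_identical(lambda x: (x + 1) ** 2,
                         lambda x: x * x + 2 * x + 1, degree_bound=2))
# (x + 1)^2 vs x^2 + 1: differ everywhere except x = 0
print(probably_identical(lambda x: (x + 1) ** 2,
                         lambda x: x * x + 1, degree_bound=2))
```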

$\begingroup$This should have been suggested long ago! Thanks!$\endgroup$
– arnab Sep 23 '10 at 3:56


$\begingroup$There are several other randomized algorithms that deserve a prominent place in the Book. For these the contrast between the deterministic and probabilistic alternatives is less striking: a deterministic algorithm usually exists but is much more complicated.$\endgroup$
– Per Vognsen Sep 23 '10 at 4:21

$\begingroup$I independently invented the same algorithm while I was working on a paper a couple years ago until someone asked me isn't it Schwartz-Zippel lemma? And I said, what is that? :)$\endgroup$
– Helium Dec 3 '15 at 7:28

$\begingroup$It is also the basis of Prolog execution!$\endgroup$
– muad Aug 19 '10 at 7:14


$\begingroup$What's the point about BFS with a stack I'm missing? I would have thought that the answer is "yes, you get DFS".$\endgroup$
– Omar Antolín-Camarena Jun 16 '13 at 13:28


$\begingroup$Well, everybody seems to think this problem is trivial. Also, everybody seems to think the answer is "yes", which is wrong. The answer is actually "depends on which BFS implementation you start with". See cs.stackexchange.com/questions/329/… (This is a question I posted to help with the beta phase of CS.SE)$\endgroup$
– Radu GRIGore Jun 23 '13 at 10:11

Dijkstra's algorithm: the single-source shortest path problem for a graph with nonnegative edge path costs. It's used everywhere, and is one of the most beautiful algorithms out there. The internet couldn't be routed without it - it is a core part of routing protocols IS-IS and OSPF (Open Shortest Path First).

1. Assign to every node a distance value. Set it to zero for our initial node and to infinity for all other nodes.

2. Mark all nodes as unvisited. Set the initial node as current.

3. For the current node, consider all of its unvisited neighbors and calculate their tentative distance (from the initial node). For example, if the current node (A) has a distance of 6, and an edge connecting it with another node (B) has length 2, then the distance to B through A will be 6+2=8. If this distance is less than the previously recorded distance (infinity in the beginning, zero for the initial node), overwrite the distance.

4. When we are done considering all neighbors of the current node, mark it as visited. A visited node will never be checked again; its distance recorded now is final and minimal.

5. If all nodes have been visited, finish. Otherwise, set the unvisited node with the smallest distance (from the initial node) as the next "current node" and continue from step 3.
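The steps above translate almost directly into Python. This sketch replaces the "scan all unvisited nodes for the smallest distance" step with a priority queue, a standard optimization; the example graph is an illustrative choice:

```python
import heapq

def dijkstra(graph, source):
    """graph: {node: [(neighbor, weight), ...]} with nonnegative weights.
    Returns a dict of shortest distances from source."""
    dist = {source: 0}
    visited = set()
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)  # d is now final and minimal for u (step 4)
        for v, w in graph.get(u, []):
            # Relax the tentative distance to each unvisited neighbor (step 3).
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

g = {'A': [('B', 2), ('C', 6)], 'B': [('C', 3)], 'C': []}
print(dijkstra(g, 'A'))  # {'A': 0, 'B': 2, 'C': 5}
```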

Gentry's Fully Homomorphic Encryption Scheme (either over ideal lattices or over the integers) is terribly beautiful. It allows a third party to perform arbitrary computations on encrypted data without access to a private key.

The encryption scheme rests on several keen observations:

1. To get a fully homomorphic encryption scheme, one needs only a scheme that is homomorphic over addition and multiplication. This is because addition and multiplication (mod 2) are enough to build AND, OR and NOT gates (and therefore Turing completeness).

2. If such a scheme were to be had, but due to some limitation could only evaluate circuits of bounded depth, then one could homomorphically evaluate the decryption-and-re-encryption procedure to reset the circuit-depth limitation without sacrificing key privacy.

3. By "squashing" the depth of the circuit version of the scheme's decryption function, one can extend a scheme originally limited to finite, shallow circuits to an arbitrary number of computations.
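A minimal plaintext demonstration of the first observation (no encryption involved): addition and multiplication mod 2 already give a complete set of Boolean gates, so a scheme homomorphic over those two operations can evaluate any circuit.

```python
# Boolean logic from arithmetic mod 2 (plaintext demo only).
def NOT(x):     return (1 + x) % 2
def AND(x, y):  return (x * y) % 2
def XOR(x, y):  return (x + y) % 2
def OR(x, y):   return XOR(XOR(x, y), AND(x, y))  # x + y + xy (mod 2)

# Truth table of OR over all input pairs:
print([OR(a, b) for a in (0, 1) for b in (0, 1)])  # [0, 1, 1, 1]
```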

In his thesis, Craig Gentry solved a long standing (and gorgeous) open problem in cryptography. The fact that a fully homomorphic scheme does exist demands that we recognize that there is some inherent structure to computability that we may have otherwise ignored.

$\begingroup$I do think that it has received the recognition it deserves – what makes you think otherwise? For example, it’s implemented in the C++ sequence analysis library SeqAn.$\endgroup$
– Konrad Rudolph Sep 8 '10 at 11:11

$\begingroup$It is worth mentioning that there are now a number of other linear and non linear time suffix array construction algorithms which, although nowhere near as pretty, may be quite a lot faster in practice. "An efficient, versatile approach to suffix sorting", Journal of Experimental Algorithmics (JEA), Volume 12, June 2008 has some experimental results along these lines.$\endgroup$
– Raphael Jun 2 '11 at 19:35

$\begingroup$@Raphael: I'm a bit wary of the fact that on p. 3 of that JEA paper, they give only what they "believe" is a "loose" bound of O(n^2 log n)... Do you know any papers with provably linear-time algorithms that are faster in practice than the Skew Algorithm?$\endgroup$
– user651 Sep 22 '11 at 8:58

$\begingroup$BTW, what's the generalization sequence, and where does Buchberger's algorithm for Grobner basis fit into it? (It seems analogous to Knuth-Bendix, but I've seen a mention somewhere that it sort of generalizes Gaussian elimination…)$\endgroup$
– ShreevatsaR Aug 30 '10 at 0:03


$\begingroup$the sequence is: Euclidean GCD -> Gaussian Elimination -> Buchberger -> Knuth-Bendix. One can also put in (instead of Gaussian Elimination) univariate polynomial division and modulo. In the generalization order it sits 'apart' from Gaussian Elimination: GE is multivariate degree 1, the polynomial ring is univariate unlimited degree, and Buchberger's is multivariate unlimited degree. The generalization jump is largest from EGCD to GE or to the polynomial ring because of the addition of variables, and then also large from Buchberger to KB because of the unlimited signature.$\endgroup$
– Mitch Aug 30 '10 at 15:59

$\begingroup$+1: Euclidean algorithm solves the most celebrated equation ax-by=1 in math. Why it doesn't show up in CS more often is a mystery.$\endgroup$
– Tegiri Nenashi Feb 16 '11 at 21:36
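For reference, the extended Euclidean algorithm that solves $ax - by = 1$ (more generally, $ax + by = \gcd(a, b)$) is only a few lines; this is a standard recursive sketch:

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    # gcd(a, b) == gcd(b, a mod b); rewrite the coefficients on the way up.
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

g, x, y = extended_gcd(240, 46)
print(g, x, y)  # 2 -9 47, since 240*(-9) + 46*47 == 2
```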

I was impressed when I first saw the algorithm for reservoir sampling and its proof. It is the typical "brain teaser" type puzzle with an extremely simple solution. I think it definitely belongs to the book, both the one for algorithms as well as for mathematical theorems.
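The solution really is extremely simple. A sketch of the standard one-item-at-a-time reservoir algorithm in Python: each incoming item replaces a random slot with just the right probability to keep the sample uniform.

```python
import random

def reservoir_sample(stream, k):
    """Uniform random sample of k items from a stream of unknown length,
    using O(k) memory. Item i (0-based) enters the reservoir with
    probability k / (i + 1), which keeps every prefix sample uniform."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)  # fill the reservoir first
        else:
            j = random.randrange(i + 1)
            if j < k:
                reservoir[j] = item  # evict a uniformly chosen slot
    return reservoir

sample = reservoir_sample(range(10**6), 5)
print(len(sample))  # 5
```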

As for the Book, the story goes that when Erdős died and went to heaven, he requested to meet with God. The request was granted, and for the meeting Erdős had only one question: "May I look in the Book?" God said yes and led Erdős to it. Naturally very excited, Erdős opened the Book only to see the following.

$\begingroup$Yes, a very nice algorithm. Less trivially, at the cost of another factor of 2, this algorithm also works for maximizing any submodular function, not just the graph cut function. This is a result of Feige, Mirrokni, and Vondrak from FOCS 07$\endgroup$
– Aaron Roth Aug 17 '10 at 20:23

$\begingroup$Do you know the dumb algorithm that solves the problem with the same asymptotics and follows an algorithmic design pattern? I'm talking about iterative deepening. In the nth iteration you start at the 2^n-th successor of the root and look 2^n successors forward in search of a recurrence. Even though you're retracing some of your steps with every iteration, the geometric growth rate of the search radius means that it doesn't affect the asymptotics.$\endgroup$
– Per Vognsen Sep 23 '10 at 15:22
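The "iterative deepening" idea described in this comment is essentially Brent's cycle-finding algorithm. A sketch, assuming f and x0 define the usual rho-shaped sequence x0, f(x0), f(f(x0)), ...:

```python
def brent(f, x0):
    """Return (lam, mu): cycle length lam and index mu of the first
    repeated element in the sequence x0, f(x0), f(f(x0)), ..."""
    # Phase 1: search radii of 1, 2, 4, ... steps; the geometric growth
    # means the retraced work doesn't hurt the asymptotics.
    power = lam = 1
    tortoise, hare = x0, f(x0)
    while tortoise != hare:
        if power == lam:      # start a new power of two
            tortoise = hare
            power *= 2
            lam = 0
        hare = f(hare)
        lam += 1
    # Phase 2: find mu, the start of the cycle.
    tortoise = hare = x0
    for _ in range(lam):      # put hare exactly lam steps ahead
        hare = f(hare)
    mu = 0
    while tortoise != hare:
        tortoise, hare = f(tortoise), f(hare)
        mu += 1
    return lam, mu

# Rho shape 0 -> 1 -> 2 -> 3 -> 4 -> 2: tail of length 2, cycle of length 3.
print(brent(lambda v: {0: 1, 1: 2, 2: 3, 3: 4, 4: 2}[v], 0))  # (3, 2)
```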

I've always been partial to Christofides' Algorithm that gives a (3/2)-approximation for metric TSP. In fact, call me easy to please, but I even liked the 2-approximation algorithm that came before it. Christofides' trick of making a minimum weight spanning tree Eulerian by adding a matching of its odd-degree vertices (instead of duplicating all edges) is simple and elegant, and it takes little to convince one that this matching has no more than half the weight of an optimum tour.
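Full Christofides needs a minimum-weight perfect matching subroutine, so here is a sketch of the simpler 2-approximation mentioned above: double the minimum spanning tree and shortcut its Euler walk, which a preorder traversal does implicitly. The distance matrix must satisfy the triangle inequality for the approximation guarantee to hold.

```python
def double_tree_tsp(dist):
    """2-approximation for metric TSP: Prim's MST, then shortcut a
    preorder walk of the tree into a tour. Doubling the MST gives an
    Eulerian graph, and the triangle inequality means the shortcuts
    can only shorten the walk."""
    n = len(dist)
    in_tree = {0}
    children = {i: [] for i in range(n)}
    best = {v: (dist[0][v], 0) for v in range(1, n)}
    while len(in_tree) < n:
        v = min(best, key=lambda u: best[u][0])  # cheapest frontier node
        _, parent = best.pop(v)
        in_tree.add(v)
        children[parent].append(v)
        for u in best:
            if dist[v][u] < best[u][0]:
                best[u] = (dist[v][u], v)
    # Preorder walk of the MST = shortcut Euler tour of the doubled tree.
    tour, stack = [], [0]
    while stack:
        v = stack.pop()
        tour.append(v)
        stack.extend(reversed(children[v]))
    return tour

dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
tour = double_tree_tsp(dist)
print(tour)  # [0, 1, 3, 2]
```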

How about Grover's algorithm? It's one of the simplest quantum algorithms, and it lets you search an unsorted database of $N$ items with only $O(\sqrt{N})$ queries. It is provably optimal, and it provably outperforms any classical algorithm for this black-box problem. As a bonus, it is very easy to understand and intuitive.
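Grover's algorithm is also easy to simulate classically with a plain state vector, since each iteration is just "flip the marked amplitude, then invert everything about the mean". A toy sketch; running roughly (π/4)√N iterations is the standard choice:

```python
import math

def grover(n_items, marked, iterations):
    """State-vector simulation of Grover search over n_items entries.
    Returns the probability of measuring the marked item at the end."""
    amp = [1 / math.sqrt(n_items)] * n_items  # uniform superposition
    for _ in range(iterations):
        amp[marked] = -amp[marked]            # oracle: sign-flip the target
        mean = sum(amp) / n_items
        amp = [2 * mean - a for a in amp]     # diffusion: invert about mean
    return amp[marked] ** 2

N = 8
best = round(math.pi / 4 * math.sqrt(N))      # 2 iterations for N = 8
print(grover(N, marked=5, iterations=best))   # about 0.945
```

For N = 4 a single iteration finds the marked item with probability exactly 1, a well-known special case.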

$\begingroup$And indeed, several Nobel prizes were given for advancing our understanding of these problems.$\endgroup$
– Ross Snider Aug 18 '10 at 0:14

$\begingroup$@Ross Kantorovich won the Nobel prize in Economics for inventing LP and applying it to resource allocations. What other prizes were you thinking of?$\endgroup$
– Mark Reitblatt Nov 16 '10 at 2:00

$\begingroup$@Mark Koopermans was awarded the nobel prize with Kantorovich, but it was still inaccurate of me to say "several."$\endgroup$
– Ross Snider Nov 16 '10 at 9:57

This kind of goes against the spirit of the other offerings in this list, since it is only of theoretical and not practical interest, but then again the title kind of says it all. Perhaps it should be included as a cautionary tale for those who would look only at the asymptotic behavior of an algorithm.
