If I remember correctly, the superoptimizer worked on straight-line
integer code. When searching for a code sequence, e.g. "bcd-add bytes
from register d0 and d1 into d2", it enumerated all sensible code sequences
(e.g., code sequences not mentioning d0, d1 or d2 are not sensible in this
context) of length 1, 2, 3 and so on, wrote these to memory and *executed*
them on sample data. Two or three runs on random inputs usually suffice to
determine that a code sequence does *not* compute the intended result, so
a sequence can be tested in only, say, 20 machine cycles. This makes
testing the generated code trivial. Of course, larger tests had to be
done for sequences that "almost" do the right thing, and this was done by
some other means (ultimately, by hand, I think). The problem was that
some sequences of, say, 6 instructions could be faster than an "optimal"
4-instruction sequence, but were never reached because analyzing all
sequences of length <6 would take many hours.
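
To make the search loop concrete, here is a minimal sketch in Python. It
is not the superoptimizer's actual code: the toy 8-bit machine, its
instruction set, the target function and every name below are my own
assumptions, and the candidates are interpreted rather than written to
memory and executed natively. The exhaustive check at the end stands in
for the "other means" of verifying survivors and is only feasible
because the toy inputs are two bytes.

```python
import itertools
import random

MASK = 0xFF                       # hypothetical toy machine: 8-bit registers
REGS = ("d0", "d1", "d2")

BINARY = {                        # op dst,src  ->  dst = f(dst, src)
    "mov": lambda a, b: b,
    "add": lambda a, b: (a + b) & MASK,
    "sub": lambda a, b: (a - b) & MASK,
    "and": lambda a, b: a & b,
    "or":  lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
}
UNARY = {                         # op dst  ->  dst = f(dst)
    "not": lambda a: ~a & MASK,
    "neg": lambda a: -a & MASK,
}

# Every instruction of the toy machine: (op, dst, src) or (op, dst).
INSTRUCTIONS = (
    [(op, d, s) for op in BINARY for d in REGS for s in REGS]
    + [(op, d) for op in UNARY for d in REGS]
)


def run(seq, d0, d1):
    """Interpret seq with the inputs in d0 and d1; return the final d2."""
    regs = {"d0": d0, "d1": d1, "d2": 0}    # assumption: d2 starts at zero
    for ins in seq:
        if len(ins) == 3:
            op, dst, src = ins
            regs[dst] = BINARY[op](regs[dst], regs[src])
        else:
            op, dst = ins
            regs[dst] = UNARY[op](regs[dst])
    return regs["d2"]


def is_sensible(seq):
    """Keep only sequences that mention d0, d1 and d2 at least once."""
    mentioned = {reg for ins in seq for reg in ins[1:]}
    return mentioned >= set(REGS)


def fails_quick_test(seq, target, rng, runs):
    """Run seq on a few random inputs; True as soon as one result is wrong."""
    for _ in range(runs):
        a, b = rng.randrange(256), rng.randrange(256)
        if run(seq, a, b) != target(a, b):
            return True
    return False


def superoptimize(target, max_len=3, quick_runs=3, seed=0):
    """Return the first verified sequence, trying shorter lengths first."""
    rng = random.Random(seed)
    for length in range(1, max_len + 1):
        for seq in itertools.product(INSTRUCTIONS, repeat=length):
            if not is_sensible(seq):
                continue
            # Cheap filter: a few random runs reject almost every candidate.
            if fails_quick_test(seq, target, rng, quick_runs):
                continue
            # Survivors may only "almost" work, so check every input pair.
            if all(run(seq, a, b) == target(a, b)
                   for a in range(256) for b in range(256)):
                return seq
    return None


# Example target (an assumption for illustration): d2 = d0 AND NOT d1.
best = superoptimize(lambda a, b: a & ~b & MASK)
print(best)
```

Because the loop tries lengths in increasing order, it returns the
shortest sequence it finds, not necessarily the fastest one; with 60
instructions there are 60**n candidates of length n, which is why longer
sequences quickly become unreachable.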