Pierre de Fermat is perhaps best known for the proof that he never wrote down, and perhaps never had.

I have discovered a truly remarkable proof which this margin is too small to contain. (Around 1637)

This of course refers to his famous “Fermat’s Last Theorem,” which was finally proved by Andrew Wiles in 1995. Most doubt whether Fermat really had a proof, but it is an intriguing part of the history of this famous problem. If Fermat had a solution, he certainly did not have the brilliant one that Wiles found.

Today I would like to talk about some new results, discovered this century, that generalize Fermat’s Little Theorem to matrices. They do not seem to be well known, but are quite pretty. Perhaps they may have applications to some complexity theory problems.

While Fermat’s Last Theorem is very hard to prove, the version for polynomials is much easier to resolve. In particular, one can prove that

$$a(x)^m + b(x)^m = c(x)^m$$

has no solutions in non-constant polynomials, for $m > 2$. There are many proofs of this fact. Often changing a problem from integers to polynomials makes it easier.

Let’s now turn to number theory problems for matrices, rather than for integers or polynomials.

Fermat’s Little Theorem

This states,

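Theorem: If $p$ is a prime and $m$ an integer, then $m^p \equiv m \bmod p$.

There are many proofs of this beautiful theorem. One is based on the following observation:

$$(x+1)^p \equiv x^p + 1 \bmod p.$$

The proof of this equation follows from the binomial theorem, and the fact that

$$\binom{p}{k} \equiv 0 \bmod p$$

for $p$ prime and $0 < k < p$. The theorem then follows by induction on $m$.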
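Both facts are easy to check numerically. The following is a minimal sketch in Python; the function names are illustrative and not from the original post. It tests $m^p \equiv m \bmod p$ for a few small primes and confirms that the inner binomial coefficients $\binom{p}{k}$ vanish mod $p$.

```python
from math import comb


def fermat_little_holds(p: int, m: int) -> bool:
    """Check m^p ≡ m (mod p) using built-in modular exponentiation."""
    return pow(m, p, p) == m % p


def inner_binomials_vanish(p: int) -> bool:
    """Check that C(p, k) ≡ 0 (mod p) for every 0 < k < p."""
    return all(comb(p, k) % p == 0 for k in range(1, p))


for p in (2, 3, 5, 7, 11, 13):
    assert inner_binomials_vanish(p)
    assert all(fermat_little_holds(p, m) for m in range(-20, 21))

# For a composite modulus the binomial fact fails: C(4, 2) = 6 is not divisible by 4.
assert not inner_binomials_vanish(4)
```

The check is only a sanity test over a small range; it does not replace the proof.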

Vladimir Arnold once stated:

There is a general principle that a stupid man can ask such questions to which one hundred wise men would not be able to answer. In accordance with this principle I shall formulate some problems.

Since Arnold is certainly not stupid, he was joking. Yet there is some truth to his statement: it is notoriously easy to raise questions in number theory that sound plausible, hold for many small cases, and yet are impossible to prove. Fermat’s Last Theorem exactly fit this category for over three hundred years.

In any event, Arnold did extensive numerical experiments, to search for a way to generalize Fermat’s Little Theorem to matrices. He noticed, immediately, that simply replacing the integer in Fermat’s Little Theorem by a matrix $A$ will not work. For example, consider a matrix $A$ that is nilpotent, but is not zero. Recall a matrix is nilpotent provided some power of the matrix is $0$. Then, for a prime $p$ large enough, $A^p = 0$ and so clearly $A^p \not\equiv A \bmod p$.
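The nilpotent counterexample can be made concrete with a $2 \times 2$ matrix. This is a small sketch; the helper names `mat_mul`, `mat_pow`, and `congruent_mod` are made up for the illustration.

```python
def mat_mul(X, Y):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(X, e):
    """Compute X**e for e >= 1 by repeated multiplication."""
    result = X
    for _ in range(e - 1):
        result = mat_mul(result, X)
    return result


def congruent_mod(X, Y, p):
    """Entrywise congruence of two matrices modulo p."""
    return all((x - y) % p == 0 for rx, ry in zip(X, Y) for x, y in zip(rx, ry))


A = [[0, 1], [0, 0]]  # nonzero, but A squared is the zero matrix

for p in (2, 3, 5, 7):
    assert mat_pow(A, p) == [[0, 0], [0, 0]]
    assert not congruent_mod(mat_pow(A, p), A, p)
```

For this particular matrix the power is already zero at $p = 2$, so the naive matrix analogue of the theorem fails for every prime.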

Thus, Arnold was forced to extend the notion of what it meant for a theorem to be “like” Fermat’s Little Theorem. After extensive experiments he made the following conjecture about the trace of matrices. Recall that $\mathrm{trace}(A)$ is the sum of the elements on the main diagonal of the matrix $A$.

Conjecture: Suppose that $p$ is a prime and $A$ is a square integer matrix. Then, for any natural number $k \ge 1$,

$$\mathrm{trace}(A^{p^k}) \equiv \mathrm{trace}(A^{p^{k-1}}) \bmod p^k.$$

In his paper, published in 2006, he found an algorithm that could check his conjecture for a fixed size matrix and a fixed prime. He then checked that it was true for many small matrix sizes and small primes $p$, yet he did not see how to prove the general case. He did prove the special case of $k = 1$.

Finally, the general result was proved by Alexander Zarelua in 2008 and independently by others:

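Theorem: Suppose that $p$ is a prime and $A$ is a square integer matrix. Then, for any natural number $k \ge 1$,

$$\mathrm{trace}(A^{p^k}) \equiv \mathrm{trace}(A^{p^{k-1}}) \bmod p^k.$$

An important special case is when $k = 1$, recall Arnold did prove this case,

$$\mathrm{trace}(A^p) \equiv \mathrm{trace}(A) \bmod p.$$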

For an integer matrix $A$ and a prime $p$, the corresponding coefficients of the characteristic polynomials of $A$ and $A^p$ are congruent $\bmod\ p$. This, in effect, generalizes the statement about traces.
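The trace congruence $\mathrm{trace}(A^{p^k}) \equiv \mathrm{trace}(A^{p^{k-1}}) \bmod p^k$ is easy to test on random integer matrices. The sketch below uses plain Python lists and square-and-multiply modulo $p^k$. The helper names are assumptions made for the example, and this is not Arnold's checking algorithm, which the post does not describe.

```python
import random


def mat_mul(X, Y, mod):
    """Multiply two square matrices, reducing every entry modulo mod."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) % mod for j in range(n)]
            for i in range(n)]


def mat_pow(X, e, mod):
    """Compute X**e modulo mod by square-and-multiply (mod >= 2)."""
    n = len(X)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    base = [[x % mod for x in row] for row in X]
    while e > 0:
        if e & 1:
            result = mat_mul(result, base, mod)
        base = mat_mul(base, base, mod)
        e >>= 1
    return result


def trace_of_power(X, e, mod):
    """Return trace(X**e) reduced modulo mod."""
    P = mat_pow(X, e, mod)
    return sum(P[i][i] for i in range(len(X))) % mod


def arnold_congruence_holds(A, p, k):
    """Check trace(A^(p^k)) ≡ trace(A^(p^(k-1))) modulo p^k."""
    mod = p ** k
    return trace_of_power(A, p ** k, mod) == trace_of_power(A, p ** (k - 1), mod)


rng = random.Random(0)  # fixed seed so the experiment is repeatable
for _ in range(50):
    size = rng.randint(1, 4)
    A = [[rng.randint(-9, 9) for _ in range(size)] for _ in range(size)]
    for p in (2, 3, 5, 7):
        for k in (1, 2, 3):
            assert arnold_congruence_holds(A, p, k)

# Primality matters: with the composite "p" = 4 the congruence already fails for A = [2].
assert not arnold_congruence_holds([[2]], 4, 1)
```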

Factoring

Arnold’s conjecture—now theorem—has some interesting relationship to the problem of factoring. If we assume that factoring is hard, then his theorem can be used to prove lower bounds on the growth of the traces of matrix powers.

Let’s start with a simple example. I will then show how we can get much more powerful statements from his theorem. Consider a single integer matrix $A$, and look at the following simple algorithm where $n = pq$ is the product of two primes.

Compute

$$\alpha = \mathrm{trace}(A^n) \bmod n.$$

Then, compute the greatest common divisor of $\alpha - \mathrm{trace}(A^k)$ and $n$ for all $k = 1, \dots, \log^c n$ where $c$ is a constant.

Clearly, if this finds a factor of $n$ it is correct. The only question is when will that happen? Note, by Arnold’s theorem,

$$\alpha \equiv \mathrm{trace}(A^{pq}) \equiv \mathrm{trace}(A^q) \bmod p.$$

Also,

$$\alpha \equiv \mathrm{trace}(A^{pq}) \equiv \mathrm{trace}(A^p) \bmod q.$$

The key observation is: suppose that the traces of the powers of the matrix $A$ grow slowly. Let , and let one of these values be small. Then, for some $k$ the value $\alpha - \mathrm{trace}(A^k)$ will be zero modulo, say, $p$ and not zero modulo $q$. Thus, the gcd computation will work.

All this needs is a matrix $A$ so that the traces of its powers grow slowly, and yet are different. I believe that this is impossible, but am not positive.
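The procedure just described can be sketched as follows: compute $\mathrm{trace}(A^n) \bmod n$, then take gcds of its difference with the first few traces $\mathrm{trace}(A^k)$. The function name `trace_gcd_attempt` and the explicit cutoff `max_k` (standing in for the polylogarithmic bound) are assumptions made for the illustration.

```python
from math import gcd


def mat_mul(X, Y, mod):
    """Multiply two square matrices, reducing every entry modulo mod."""
    size = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(size)) % mod
             for j in range(size)] for i in range(size)]


def trace_of_power(X, e, mod):
    """Return trace(X**e) modulo mod via square-and-multiply (mod >= 2)."""
    size = len(X)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    base = [[x % mod for x in row] for row in X]
    while e > 0:
        if e & 1:
            result = mat_mul(result, base, mod)
        base = mat_mul(base, base, mod)
        e >>= 1
    return sum(result[i][i] for i in range(size)) % mod


def trace_gcd_attempt(A, n, max_k):
    """Try gcd(trace(A^n) - trace(A^k), n) for k = 1..max_k.

    Returns a nontrivial factor of n, or None if every gcd is 1 or n.
    """
    alpha = trace_of_power(A, n, n)
    for k in range(1, max_k + 1):
        g = gcd((alpha - trace_of_power(A, k, n)) % n, n)
        if 1 < g < n:
            return g
    return None


print(trace_gcd_attempt([[2]], 77, 5))               # → 7
print(trace_gcd_attempt([[1, 1], [0, 1]], 77, 5))    # → None
```

The first call succeeds only because $77 = 7 \cdot 11$ has a tiny prime factor: the traces of $[2]$ are $2^k$, which do not grow slowly. The second call uses a matrix whose traces are all equal to $2$, so every difference is zero and the gcd is always $n$; this is the "and yet are different" requirement failing.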

We can weaken the requirements tremendously. For example, replace one matrix by a family of integer matrices $A_1, \dots, A_r$. Then, define $\alpha(m)$ as:

$$\alpha(m) = \lambda_1 \mathrm{trace}(A_1^m) + \cdots + \lambda_r \mathrm{trace}(A_r^m)$$

where all $\lambda_i$ are integers. Note, this value is always an integer.

Now the key is the behavior of the function $\alpha(m)$. In order to be able to factor this function must have two properties:

There must be many values of $m$ so that $\alpha(m)$ is small;

The values of $\alpha(p)$ and $\alpha(q)$ should often be distinct.

If these properties are true, then the above method will be a factoring algorithm. Thus, an example of reverse complexity theory (RCT) is: if factoring is hard, then any such function that is non-constant must not have many small values. This can easily be made into quantitative bounds. I know that some of these results are proved, but their proofs often depend on deep properties from algebraic number theory. The beauty of the connection with factoring is that the proofs are simple, given the strong hypothesis that factoring is hard.

Since I am open to the possibility that factoring is easy, as you probably know, I hope that there may be some way to use these ideas to attack factoring. But either way I hope you like the connection between the behavior of matrix powers and factoring.

Gauss’s Congruence

There are further matrix results that also generalize other theorems of number theory. For example, the following is usually called Gauss’s congruence:

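Theorem: For any integer $a$ and natural number $n$,

$$\sum_{d \mid n} \mu(n/d)\, a^d \equiv 0 \bmod n.$$

Here $\mu$ is the famous Möbius function: if $n = 1$, then $\mu(n) = 1$; if $n$ is divisible by the square of a prime, then $\mu(n) = 0$; if $n$ is a product of $k$ distinct primes, then $\mu(n) = (-1)^k$.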
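Gauss’s congruence, $\sum_{d \mid n} \mu(n/d)\, a^d \equiv 0 \bmod n$, can be checked directly for small values. This is a minimal sketch with a trial-division Möbius function; the names `mobius` and `gauss_sum` are chosen for the example.

```python
def mobius(n: int) -> int:
    """Möbius function by trial division: 0 if a square divides n, else (-1)^(number of primes)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:  # d*d divided the original value
                return 0
            result = -result
        d += 1
    if n > 1:  # one prime factor is left over
        result = -result
    return result


def gauss_sum(a: int, n: int) -> int:
    """Sum of mobius(n/d) * a^d over all divisors d of n."""
    return sum(mobius(n // d) * a ** d for d in range(1, n + 1) if n % d == 0)


# The sum is divisible by n for every integer a and natural number n tried here.
assert all(gauss_sum(a, n) % n == 0 for a in range(-5, 6) for n in range(1, 40))

print(gauss_sum(2, 6))  # → 54, which is 6 * 9
```

For a prime $n = p$ the sum is $a^p - a$, so the congruence reduces to Fermat’s Little Theorem.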

By the way, why does Gauss get everything named after him? I guess we should join in and create a complexity class that is called GC, for “Gauss’s Class.” Any suggestions?

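As an example, let $n = pq$ be the product of two primes. Then this theorem, in its matrix form (with powers of the integer replaced by traces of matrix powers), shows that we can determine,

$$\mathrm{trace}(A^p) + \mathrm{trace}(A^q) \bmod n$$

for any integer matrix $A$. What intrigues me is, if $\mathrm{trace}(A^p) + \mathrm{trace}(A^q)$ is less than $n$, then we get the exact value of

$$\mathrm{trace}(A^p) + \mathrm{trace}(A^q).$$

Can we do something with this ability? In particular, can it be used to get some information about the values of $p$ and $q$?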
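To make this concrete, assume the matrix form of Gauss’s congruence for $n = pq$ with $p \ne q$, namely $\mathrm{trace}(A^{pq}) - \mathrm{trace}(A^p) - \mathrm{trace}(A^q) + \mathrm{trace}(A) \equiv 0 \bmod pq$. Then $\mathrm{trace}(A^p) + \mathrm{trace}(A^q) \bmod n$ can be computed from $n$ alone. The sketch below does this and compares against the direct computation that uses $p$ and $q$. The function names are hypothetical.

```python
def mat_mul(X, Y, mod):
    """Multiply two square matrices, reducing every entry modulo mod."""
    size = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(size)) % mod
             for j in range(size)] for i in range(size)]


def trace_of_power(X, e, mod):
    """Return trace(X**e) modulo mod via square-and-multiply (mod >= 2)."""
    size = len(X)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    base = [[x % mod for x in row] for row in X]
    while e > 0:
        if e & 1:
            result = mat_mul(result, base, mod)
        base = mat_mul(base, base, mod)
        e >>= 1
    return sum(result[i][i] for i in range(size)) % mod


def trace_sum_mod_n(A, n):
    """(trace(A^p) + trace(A^q)) mod n, computed without knowing p and q.

    Assumes n = p*q for distinct primes p, q and the matrix Gauss congruence.
    """
    return (trace_of_power(A, n, n) + trace_of_power(A, 1, n)) % n


def trace_sum_direct(A, p, q):
    """The same quantity computed directly from the factors, for comparison."""
    n = p * q
    return (trace_of_power(A, p, n) + trace_of_power(A, q, n)) % n


fib = [[1, 1], [1, 0]]  # traces of powers are the Lucas numbers
assert trace_sum_mod_n(fib, 35) == trace_sum_direct(fib, 5, 7) == 5

unipotent = [[1, 1], [0, 1]]  # every power has trace 2
assert trace_sum_mod_n(unipotent, 35) == 4  # 2 + 2 < 35, so this is the exact sum
```

For the Lucas-number matrix the true sum is $11 + 29 = 40$, which exceeds $35$, so only the residue $5$ is recovered. For the unipotent matrix the sum is below $n$ and is recovered exactly, though it carries no information about $p$ and $q$.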

Open Problems

I believe there are two interesting types of questions. The first is what can we say about the growth of such sums of matrix traces? Can we improve on known bounds or give shorter proofs based on the hardness of factoring? This would be a nice example of RCT.

Second, a natural idea is to look at other Diophantine problems and ask what happens if the variables range over matrices? What known theorems remain true, which become false, and which become open?


1. Here is a short proof of the trace result. $A$ is congruent mod $p$ to a matrix with non-negative integer entries, so let it denote the adjacency matrix of a graph. The cyclic group of order $p$ acts on closed walks of length $p$ in the obvious way, and computing $\mathrm{trace}(A^p) \bmod p$ simply means ignoring the orbits of size $p$. The remaining orbits must all be obtained by repeating walks of length $1$, whence the result. This generalizes the argument I gave here.
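The orbit-counting argument can be watched in action on a tiny graph. The sketch below assumes a 0/1 adjacency matrix (no multiple edges) and a prime walk length. It enumerates closed walks by brute force and groups them under rotation. The names `closed_walks` and `orbit_sizes` are illustrative.

```python
from itertools import product


def closed_walks(adj, length):
    """All closed walks of the given length, as tuples of vertices (0/1 adjacency matrix)."""
    n = len(adj)
    return [w for w in product(range(n), repeat=length)
            if all(adj[w[i]][w[(i + 1) % length]] for i in range(length))]


def orbit_sizes(walks):
    """Sizes of the orbits of the walks under cyclic rotation."""
    seen = set()
    sizes = []
    for w in walks:
        if w in seen:
            continue
        orbit = {w[s:] + w[:s] for s in range(len(w))}
        seen |= orbit
        sizes.append(len(orbit))
    return sizes


adj = [[1, 1], [1, 0]]  # vertex 0 has a loop, vertices 0 and 1 are joined
p = 5
walks = closed_walks(adj, p)
sizes = sorted(orbit_sizes(walks))

print(len(walks))  # → 11, which is trace(A^5)
print(sizes)       # → [1, 5, 5]
```

The single orbit of size $1$ is the loop at vertex $0$ repeated five times, matching $\mathrm{trace}(A) = 1$; the other ten walks fall into orbits of size $5$ and vanish mod $5$.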

2. In a similar way one obtains a short proof of Gauss’s congruence: by Mobius inversion counts the number of aperiodic paths of length .

3. The growth rate of traces of an integer matrix is related to the Mahler measure problem, since if a matrix’s traces grow slowly its characteristic polynomial has small Mahler measure. So this seems like a difficult but interesting avenue of attack. If the conjecture is true then the traces grow at least as fast as .

1, 2. The formula that didn’t parse is $\sum_{d \mid n} \mu(n/d)\, \mathrm{trace}(A^d)$. I guess what I’m trying to say here is that I am astonished these results aren’t already well-known in the literature. Surely there are references earlier than this decade!

3. Even the truth of the Mahler measure conjecture leaves open the possibility that there are many small roots of absolute value greater than $1$. Based on the known extremisers, however, this seems extremely unlikely.

There is a simple lifting of any integer matrix A to some boolean matrix B which adds only zeros to the spectrum: spectra(B) = spectra(A) + {zeroes}. Actually, in some universal basis B = A \oplus 0. Thus it is sufficient to consider just {0,1} matrices. Perhaps this observation can simplify things. I really enjoyed the post, your blog is one of my favorites.