EDP5 — another very brief summary

I wasn’t expecting to have to write another post quite this soon, so this is another one where I don’t have much to say. Here are a few bits and bobs from the last lot of comments, but there’s plenty more in the comments themselves, and actually I think that it’s not that hard to browse through them now that we have depth-1 threading and quite a lot of the comments are very short.

One of the general aims at the moment is to try to show that a multiplicative function of bounded discrepancy must have some kind of character-like behaviour. Terence Tao has come up with an intriguing argument that shows not that but at least something in the right ball park.

Share this:

Like this:

LikeLoading...

Related

This entry was posted on February 2, 2010 at 12:54 am and is filed under polymath5. You can follow any responses to this entry through the RSS 2.0 feed.
You can leave a response, or trackback from your own site.

106 Responses to “EDP5 — another very brief summary”

Here is a possibly rather pointless-looking observation. I’ll explain in a later comment why I stumbled on it.

Suppose we build a sequence as follows. It is somewhat like the way one builds the character-like sequence , but there is a twist.

The first step is to write . That is, we are saying that , and we’re not yet sure about . Having done that, we say that this decides the values whenever is congruent to 1 or 3 mod 3. That is, our sequence will be 1 * -1 1 * -1 1 * -1 … whatever happens.

Now we fill in two thirds of the gaps by taking minus the sequence we have so far. That is, we take the sequence

1 -1 -1 1 * -1 1 1 -1 1 -1 -1 1 * -1 1 1 -1 …

Next, we fill in two thirds of the remaining gaps by using the original sequence again, getting

1 -1 -1 1 1 -1 1 1 -1 1 -1 -1 1 * -1 1 1 -1 …

and so on.

This sequence is not multiplicative and it does not have small discrepancy (for example it is always -1 at multiples of 3). However, it does have partial sums that grow only logarithmically. One proof of that is an obvious inductive argument. Another is to “observe” that this sequence is precisely the sequence , and we know that the partial sums of grow logarithmically along the integers and also along the even integers. (The word “observe” was in inverted commas because in fact I took the odd terms of and worked out this alternative way of generating the sequence.)

Since these are not very difficult properties to obtain, I’d better explain in a new comment why I looked at this sequence.

I had an idea in this comment here and it is not letting go of me. It’s the usual story: it doesn’t work in a completely obvious way, but neither have I found some counterexample that would crush a lemma I needed, or something like that. In this comment I’ll describe the basic thought, and in subsequent comments I’ll try to explain where I’m at (which is not far) with the details.

The thought is this. Let be a completely multiplicative function, and for simplicity suppose that takes values in . We want to get a handle on the mean-square partial sums of . That is, we want to bound from below the norms of vectors of the form , where each is defined to be the partial sum . (I take to be 0. Putting it in makes certain expressions tidier later.)

Now I want to write the vector as a sum of other vectors. Let us define to be . In other words, is the partial sum of the odd terms of the sequence up to . Then the vector is equal to

and is equal to

which in turn equals

which is times a “horizontal stretch” of by a factor of 2.

We can therefore repeat the trick with the stretched version of , decomposing it into a stretched version of (multiplied by ) and an even more stretched version of . If we allowed ourselves to extend the sequences to infinity, and if we defined to be the map that takes a sequence to the sequence , then we would be able to say that . And indeed, this is true pointwise since the sequence on the right hand side is eventually zero at every (because of the stretching of the initial 0).

The thought at the back of my mind is that the more you stretch , the more you kill off high frequencies, so if is quite a lot less than then shouldn’t correlate much with . (There are all sorts of problems with this idea, which I’ll come to, but let me just give the idea.)

Now if there were no correlation at all between the functions then we would be done. That is because for parity reasons at least half the values of have absolute value at least 1, so the norm of is at least , and therefore the norm of the sum of of the stretches is at least .

While writing this, I did finally think of a “crushing” example. I’ll explain it in my next comment. I don’t yet know how damaging it is.

The key to making an approach like the above work would of course be proving that the functions did not correlate sufficiently negatively for the sum to be small in . However, while I was writing the previous comment, it occurred to me to wonder where I was using the condition that the sequence takes values of modulus at least 1. Indeed, suppose we try out the argument on the “proto-Walters sequence” 1 -1 0 1 -1 0 …, which has bounded discrepancy. We know that the argument must fail somehow, but what actually happens?

Well, in this case,

so it is clear that the functions correlate with each other a lot. Therefore, once we multiply them by , we divide them into two parts, one very positive and the other very negative. So there is no particular reason to stop the cancelling that happens.

That’s not quite the same as explaining why the cancellation does happen, which I don’t fully understand yet.

At this stage, I hope it’s clear what my motivation was for looking at odd values of . There is a much nicer decomposition of the partial-sums vector of based on the number 3 rather than 2, but I wanted to see what happened if one split it up the “wrong” way, because if we have to decide what the “right” splitting is, then we’re back to the difficult problem of trying to classify multiplicative functions.

One final remark. If we know that the odd-partial-sums vector averages almost zero on all long intervals, then we should be able to get small correlation just because the inner product of a function that averages zero on long intervals with a function that is constant on long intervals (apart from the very occasional step) will be small. So there seems to be a chance of proving a curious result of the following kind. We are given a multiplicative function . Then either its partial sums are unbounded, or the odd partial sums must have significant bias, in the sense that they do things like staying positive (on average) for long chunks of time. I haven’t worked out the argument in enough detail to know what the precise statement should be,
but it seems to be saying something like that you either have unbounded discrepancy or you have unbounded “double discrepancy”. (By this I mean that discrepancy is what you get when you integrate once, and double discrepancy what you get when you integrate twice — I’m glossing over the fact that you take just odd values.)

An automatic sequence is defined like this: let A={a_1,…,a_r} be a finite alphabet, and extend any function “from letters” to “from words” by . Now if the length of f(a_i) is k (the image of each letter has the same length), and the first letter of f(a_1) is a_1, then the sequence of words

f(a_1),f(f(a_1)), f(f(f(a_1)))), …

converges in the sense that each word in the sequence is a prefix of the next. Let W be the infinite limit word (with index starting at 0, say). A proper coding is a map from A to the complex unit circle. An automatic sequence is a proper coding of W.

As far as I can tell, all of our examples of logarithmic growth are automatic (up to a small number of changes). Borwein, Choi, and Coons ask at the end of their paper whether automatic sequences with have logarithmic growth.

Two things of note: any HAP of an automatic sequence is an automatic sequence. A sequence x_0,x_1,… is k-automatic (k is the length of f(a_i)) if and only if the set of infinite sequences

Something I don’t quite understand about the question as Borwein, Choi and Coons ask it is that they seem to suggest that the growth should be logarithmic rather than at most logarithmic. Or does their approximate equality symbol mean that the ratio of the two sides tends to a limit that is allowed to be zero? I suppose that must be what they mean, given that the partial sums of the Morse sequence are bounded. Anyhow, that’s a nice question I agree.

I’d have thought showing EDP for automatic sequences should be a reasonably straightforward generalization of the proof for the Morse sequence, but that remark is made without having tried it so perhaps there is a difficulty that would take me by surprise if I did.

Here’s a question that might be worth investigating. I think it is probably too simple to lead to a serious advance, but giving a good answer to it might clarify one or two things.

Recall from this comment that if is the partial-sums sequence and is the odd-partial-sums sequence, then , where . This implies (and was derived from the fact) that . This gives us an easy way of deriving from .

For example, if we take the troublesome sequence 1 -1 0 1 -1 0 …, then works out to be 011110011110011110…, as I mentioned above. So now we are in a position to write out the first term in , which is 0 (because of our convention that the partial-sums sequence always starts at 0). Therefore, starts . Adding this to tells us that starts 01. Since , that tells us that starts 0 0 -1 -1. Adding this to tells us that starts , so starts 0 0 -1 -1 0 0 0 0, so starts 0 1 0 0 1 0 0 1.

We can continue this process and prove quite easily by induction that is (which, reassuringly, is the correct answer).

Note that what we are doing here is very like calculating a substitution sequence, except that at each stage we add . It’s a bit like an “inhomogeneous” substitution sequence.

The question that interests me is this. We want to prove that is unbounded. Is there some nice property of that is necessary and sufficient for this? And is it conceivable that we could prove that this property cannot occur for multiplicative functions with values of modulus 1?

Here’s a sequence of some potential interest: we start with 0, then have a 1, then two -1s, then four 1s, then eight -1s, and so on. What is the resulting if ? I get . In other words, I get . This shows that can have quite long stretches of bias and will remain bounded.

For comparison, if is itself the sequence , then works out as and looks to me as though it grows logarithmically.

Putting this together, it seems that can afford to have oscillations of logarithmic speed as long as they average zero. But so far I’m saying that just on the basis of a quick look at these examples.

Terry, (for a prime ) is defined by the conditions that if is not a multiple of , , and is multiplicative. is the same but . I defined to be the map that takes a sequence and repeats each term times, so it’s an entirely different sort of object. Perhaps would be a better choice of notation, and it’s probably not too late to change it.

Going back to my comment above, I now see that there’s an easy expression for in terms of if one goes back to the original equation. If I’m not much mistaken, it is . Turning that around, it says that will be bounded if and only if is (uniformly) bounded for all sequences such that and . This, it seems to me, is a rather strong requirement, since one can imagine an “adversary” always attempting to choose in a way that will make the sum go away from zero, and unless is pretty structured the adversary would seem to be able to win.

One other small (but encouraging) observation: when takes values in , the values of and are always distinct. This makes life easier for the adversary (who is trying to make unbounded, so is actually on our side).

Back on the proof-crushing side, here’s a thought that seems rather problematic. So far, all I’ve really used is that , and we know that it’s easy to find sequences with that property and bounded partial sums (such as the Morse sequence).

I updated the code to make it take command line input, which should make it easier to use. For instance

[program name] 2 -3

would set f(2)=1 and f(3)=-1 and see what one can deduce from this. (I’m adopting the convention that f is odd, so that f(-n)=-f(n).

It occurs to me that we should look at the long multiplicative sequences that have already been found and see whether they have a fixed value at given primes (e.g. it already seems that they need to be -1 at 2 and 5). If we see such a fixed value, one can presumably eliminate the complementary possibility easily enough.

Alec Edgington worked out all the sequences of length 246 (there are 500 of them) on his computer, and says that they agree on all primes up to 67. See this wiki page. His algorithm was (I think) a pure backtracking algorithm, so whether it’s safe to conclude that it should be easy to work out by hand what the function is up to 67 is not completely clear to me.

The tendency observed so far is for long low-discrepancy multiplicative sequences to send primes congruent to 2 or 3 mod 5 (and the prime 5) to -1, and others to +1. See for example this one on the wiki. So the contrary choices may be worth trying to eliminate first, though it may not be straightforward to do so (witness some of the numbers on this page showing how far one sometimes needs to search to eliminate large drifts).

[…] New thread for Polymath5 By kristalcantwell There is a new thread for polymath5. One goal that is close to being reached that of deriving a counterexample that is completely multiplicative from any counterexample. Let me update this there is another thread here. […]

If a completely multiplicative sequence with discrepancy 2 or less has length more than 246 then I think there is a human proof that f(2) and f(5) are -1. Given this I can show that if f(3) is 1 f(7) is -1.

Assume that f(3) is 1 and f(7) is 1. We have by the above that f(2) and f(5) are -1. Then this will give the partial sum at 10 the value 2 so f(11) must be -1.

Since f(12) is 1 the sum at 12 must be 2 and f(13) must be -1.

Then f(25), f(26),f(27),f(28) and f(30) must be 1 this forces f(31) to be -1.

Then f(62), f(63),f(64),f(65) and f(66) must be 1 and we have a contradiction.

So if a completely multiplicative sequence with discrepancy 2 or less has length more than 246 and f(3)=1 then f(7)=-1.

This step f[558,561]=-3+f(43) implies f(43)=1 goes beyond the range that proves no sequence of discrepancy 2 has length more than 246 So we have no sequence of discrepancy 2 has length more than 560. We know the length is at most 246 from computer proofs. I think this can be fixed we can have a human proof of this as well.

I think Uwe is just trying to show that there is no infinite sequence of discrepancy 2. However, I am interested in the distinction between the maximum length you can get and the maximum length you can get before there is a fairly easy proof (with no backtracking) that you are doomed. To that end, I’d be interested in the most economical proof that there is no infinite multiplicative sequence of discrepancy 2, even if it doesn’t prove that there is no such sequence of length 247. I even think this is somehow a more fundamental question, which is not to say that they aren’t both interesting.

When I started I tried to maintain the 246 bound. As the proof got longer and longer I got more humble and set the bound to 1000.

Of course, with enough patience this can be fixed by adding more cases or, as Kristal did when he fixed my last post, by considering longer partial sums. I tried to get away with partial sums of length 3 to keep checking managable since the proof was essentially generated by computer.

Suppose (as suggested before) that we simplify to real valued multiplicative functions . (The following comment comes mainly from a remark in [Balog, Granville, Soundararajan]). Then [Wirsing 1967] seems to say: first, assume a decay condition on . That is, suppose for some . Then the quantity exists, and is given by his Equation (1.6) (by letting identically, and then setting ).

Without looking at the (rather technical) proofs, I would be skeptical whether or not the decay condition could be removed (and still retain the same argument).

Here’s an attempt to prove that what we are trying to prove is very hard. What I would really like is for somebody to shoot it down.

Over at mathoverflow, I’ve been trying to see whether there is some intuitive reason for the fact that the partial sums of the Liouville function grow at least as fast as (give or take ). I get the impression from the responses that there is not an elementary proof of this fact: the proofs all use properties of the Riemann zeta function and a bit of complex analysis to show that there must be a zero with real part at least 1/2.

It seems pretty clear that this approach is not going to generalize to all multiplicative functions that are not character-like. For instance, it seems that the functional equation is important in the proof alluded to above, and that property of the Dirichlet series associated with a multiplicative function is a rather special one.

But in that case, it looks as though showing that non-character-like multiplicative functions have partial sums that grow much faster than logarithmically is going to involve solving a known hard problem (that of finding an elementary proof of this fact for the Liouville function).

Does anyone see a way out of this difficulty? The results of Granville and Soundararajan are an obvious place to look, given that they are generalizing various results to all multiplicative functions. But nobody has yet pointed out some killer result or technique that does what we want.

If it does turn out to be hopeless to prove fast growth for non-character-like functions, then either the whole problem is hopeless or we must find some unified proof that gives at least logarithmic growth for all functions. I’d be very interested to know other people’s opinions on this.

Lemma: let g be a {-1,1}-valued completely multiplicative function of bounded discrepancy. Then the (conditionally convergent) sum is non-zero.

Proof: Observe that the Dirichlet convolution g*1 is non-negative, and at least 1 on square numbers, so

On the other hand, expanding the LHS using the Dirichlet hyperbola method and the bounded discrepancy hypothesis (which in particular implies that by summation by parts), one can eventually write the left-hand side as , which leads to a contradiction for x large enough if L(1,g) vanishes. QED

(This lemma is modeled on the standard elementary proof that for quadratic characters .)

In the case of Liouville, lambda*1 is exactly equal to 1 on the squares and nowhere else. So the above argument gives the asymptotic

To put it another way: the textbook elementary proof of Dirichlet’s theorem tells us that if g is {-1,+1}-valued, completely multiplicative, and has bounded discrepancy, then g(p)=+1 for infinitely many primes p (indeed is conditionally convergent, and by working harder (and using complex analysis of course) one can eventually obtain the prime number theorem for g, and maybe even the classical error term). Of course, the Liouville function does not have this property, hence has unbounded discrepancy.

Many thanks for that, and I’m relieved that such a proof exists: as requested, you have destroyed my informal argument that what we are trying to prove might be out of reach. Here, if anyone is interested, is a link to an online article about Dirichlet’s hyperbola method, which not only tells you what the method is, but also gives an example and describes the circumstances where it is particularly useful. (In fact, it wouldn’t take much work to turn it into at Tricki article …)

I’ve just had to take my car to be looked at and then walk back home, which was a perfect opportunity to think about multiplicative functions, but it has left me puzzled. I decided to think about how the partial sums of a random multiplicative function behave. (By the way, I am still using “multiplicative” as a lazy way of saying “completely multiplicative”, since I have no use for the weaker notion at the moment.) To that end, let be a completely multiplicative function taking values in , where the values at primes are chosen randomly (=independently and uniformly).

What is the expected partial sum up to ? Well, the expected value at any is 1 if is a perfect square and 0 otherwise, so we get roughly as the expected partial sum. So for rather trivial reasons we have that the expectation of the square of the partial sum is at least (or rather the largest square less than ), and therefore, by linearity of expectation, that the expected mean-square partial sum up to is at least or something like that.

So far this looks quite reasonable: we seem to have that the mean-square partial sum of a random multiplicative function is rather like the mean-square partial sum of the Liouville function , confirming our heuristic that behaves randomly.

But that is not the sense in which is random. Indeed, we choose the values of in as unrandom a way as we can, and the randomness is in its additive properties. I’ll come back to this point.

Another point is that all I have shown so far is a lower bound for the typical size of a partial sum of a random multiplicative function. What if we try to be a bit more careful and actually get an estimate? What, then, is the expected value of ? To answer this we need to think about the expectation of for each and . And the answer is that it is 1 if is a perfect square and 0 otherwise, since .

Now the relation “ is a perfect square” is an equivalence relation. To form an equivalence class, you take a square-free number and take all numbers of the form . The size of this equivalence class will be about . Also, the square-free numbers are dense, so the number of equivalence classes is .

We therefore know that has expected value 1 if and belong to the same equivalence class and 0 otherwise. It follows that the expectation of is the sum of the squares of the sizes of the equivalence classes, which is (to within a constant) .

My puzzlement disappeared as I wrote that — I had made the mistake of thinking that a typical equivalence class had size , which gave a bound of .

Let me now try to work out the fourth moment. Given an integer I’ll define to be the set of primes that occur an odd number of times in the prime factorization of . Note that (where I’m writing for symmetric difference, or sum mod 2, or exclusive or, or whatever you want to call it).

We’re trying to work out the expectation of , so we need to think about the expectation of . This will be 1 if , or equivalently if , and 0 otherwise. So we end up with the expression

(Here I am summing over the equivalence classes rather than over the elements.)

This seems to be a little tricky to estimate, since if you choose , and , then the size of depends a lot on how much they overlap. I think I’d better save this for another comment.

The punchline of this comment was supposed to be the deliberately silly remark that we now had a proof of EDP for almost all multiplicative functions, and all we had to do was deal with the few that remained. So far, I’ve at least shown that it is true for a significant percentage of them (not that I would wish to claim that this was a new result).

One thing I find slightly mysterious is the way a low-discrepancy multiplicative function manages to deal with the fact that all perfect squares must have value 1. For a random multiplicative function that trivially gave us a lower bound of for a typical partial sum, so if we are to avoid that then we need negative correlations between and when is not a perfect square.

I guess one way to try to balance the contribution from the fact that perfect squares always give 1 could be to introduce a small bias in the values of the primes, making them slightly more likely to give -1. But then this bias would have to decrease for larger p in order to not give a large negative total sum.

For the usual simple random walk, we get for the expected partial sum, but almost all paths get substantively larger than that infinitely often. In particular, the n-th partial sum gets as large (asymptotically) for infinitely many N.

Is there a law of the iterated logarithm for random multiplicative functions?

If we have a multiplicative sequence with bounded discrepancy, we can change the values at multiples of some prime p to 0, and the sequence will still have bounded discrepancy. We could also switch the sign at all multiples of p, but then the sequence wouldn’t be completely multiplicative (it would be – at p^2). But what if we only change the value at terms whose prime factorization contain p to an odd power? This sequence would be multiplicative, but is it possible to show that it can’t have bounded discrepancy?

Here’s a thought that probably goes nowhere. Let’s think of our multiplicative function as defined not on but as a -linear function defined on sets of primes. That is, given a set of primes, we think of as rather than as . And another difference is that we don’t even think about numbers that are not square free. Instead, we associate with each set a number , which is the number of positive integers less than that can be written in the form . That is, is the size of the equivalence class associated with .

With this notation, we can rewrite as .

How might that help us produce a lower bound? I’m not sure it will, but the basic idea is this. For any particular , we cannot say what will happen (since we know that some partial sums are likely to be zero, and definitely can be zero). But we will be averaging over , and this changes things a bit, because the sizes of the sets will be changing. What’s more, we are now forced to choose the function once and for all.

What I find just a little bit encouraging about this is that the squares of the sizes of the equivalence classes are distributed in a -ish way: it seems to provide a tiny little opening through which logarithmic growth might squeeze. What I find discouraging is there are a lot of singleton s with (namely all large primes). That makes me think of another question, which I’ll put in my next comment.

A related question: I find the “square-free” analog of EDP rather appealing. Suppose you consider only square free numbers and associate 0 to all other numbers. How small can the discrepency be? Are there still logarithmic examples? Or pethaps the lower bound becomes power-like. (I must admit I do not know what happens for the basic mod 3 based example and related Matryoshka Sequences.)

Another small remark to add to what I wrote above, but it still probably doesn’t do much. With the rewriting above, we can expand out the bracket and interchange summation to obtain the expression

for the mean-square partial sum up to . So it’s just conceivable that one could get somewhere by understanding the (positive definite) matrix . We’d be wanting to be unbounded. The information we would have (but might not be obliged to use) is that depends -linearly on (in the sense that it takes values in and ).

I should add something to the comment just above, because I haven’t fully explained what the (unlikely to work) idea is. Suppose we make a guess at the value of as follows. Let be the product of the primes in and let be the product of the primes in . Then is approximately , so its expected value is approximately . Therefore, if we write , it is approximately .

If this were exact, it would be bad news because it would tell us that the matrix had rank 1, and would be nothing other than . So all we’d have to do is find a combination of the that was bounded, which doesn’t look hard. However, because our estimates were only estimates, what we actually have is an approximation to this very low-rank matrix. Could it be that the real matrix has a much higher rank?

As usual, there is the problem of understanding why the proof would work for functions with values in and not for functions in . Since I don’t have a good answer to that, I don’t have much faith in this approach, but I thought I’d mention it anyway.

a question and a comment: 1) why the 1/x ish way S(A,n) distributes gives hope to log n behavior? 2) maybe you should allow functions with values in -1 0 and 1 and just hope for a clain that unless the function is nonzero on a bounded number of variables (primes) the discrepency is unbounded. is it so?

My original answer to your question was rather pathetic: the function 1/x sums to log. Let me try to give a better one, but the calculations may well not work out.

If the product of primes in is , then is approximately and the error will have order of magnitude 1. So the error when we approximate has order , whereas the value itself is close to . If we think of the errors as fairly random, then when we add them we should get something like the root mean square of the individual errors. That seems to introduce a term like into the picture, which is vaguely promising except that I don’t actually know where I’m going with this line of thought.

Basically, I don’t even have a heuristic argument at this stage. But you could summarize what I’m trying to do as this: exploit the fact that equivalent integers (that is, ones whose product is a perfect square) take the same value, by exploiting the fact that the sizes of the equivalence classes look ever so slightly random (in that we know roughly how big they are, but if is random then the errors are distributed in a fairly random way).

I’m not sure I completely understand your second suggestion. What worries me is the function 1 -1 0 1 -1 0 1 -1 0 … . I don’t want to find myself trying to develop an argument that would prove that that function had unbounded discrepancy.

I would like to regard the function 1 -1 0 1 -1 0 1 – 1 0 as “depending” on a single prime (p=3), hoping that when a function to {0,1, -1} or even to the unit disk does not depend on a single or a bounded number of primes then it has unbounded discrepency. But I do not know how precisely to define the influence of a prime on a sequence, and if it this hope is realistic.

I have an idea for a way that one might try to build a multiplicative function with slow-growing partial sums. If it can be made to work, then it would suggest that the kind of classification we are looking for would be difficult to achieve.

I’ll start with a simple observation. Suppose you were given a random walk and told that you could change it in every th place, if you wanted to, and that your aim was to keep it as close to the origin as you could. Here’s how you might go about it. Every steps you would see what had happened to the partial sums. Typically, they would have drifted by about , and there would be places you could change, so you could bring the walk back to the origin.

OK that’s not quite right, since on average only half the possible changes could help, and there’s a small chance that the walk would have drifted further, and so on and so forth, but some kind of argument like this ought to work, with a slightly worse bound, with high probability at each stage — and one could compensate for the unlucky stages by taking longer chunks of walk from time to time. But the point is that it ought to be possible to get the walk to stay much closer to the origin — I think logarithmic growth should be achievable (with a constant that depends on ).

Now let us do something a bit similar. This time we choose a random multiplicative function. It ought to behave a bit like a random walk. We then allow ourselves to change it at some, but not all, primes. We could even randomize the changes we made, by taking an interval with several primes in it and randomly choosing the ones we change.

It may be hard to prove anything rigorously — I don’t know — but heuristically it looks to me as though by this sort of procedure we could create a multiplicative function with a lot of randomness in it (so it would be most unlikely to exhibit character-like behaviour, I would have thought) but with growth less than .

One way to do that would be to try to run it as an algorithm and see if one can produce lots of unstructured multiplicative functions all with pretty small partial sums. I’m imagining much longer sequences than the ones we’ve looked at so far — perhaps of length more like a million rather than 1000 — so we wouldn’t be displaying them on the wiki. A random such function would have partial sums of order of magnitude 1000, so if one could keep them down to 25 or something like that, then it would be a fairly convincing back-up to the heuristic argument.

I’ve tried running at as a by-hand algorithm, and the results are interesting, and slightly surprising. My algorithm was this: if the sum up to p-1 is positive then f(p)=-1, if it is negative then f(p)=1, and if it is zero then f(p)=-1. I made the last choice to give the sequence a little help in the fight against perfect squares.

So far I’ve got up to 155. I got all the way to 45 before I first hit (and in fact I hit -3). I hit 3 at 57. I hit -3 again at 99, 105 and 119. I was feeling pretty good by this point, but then suddenly the sequence went a bit wild and the records started tumbling: 4, 5, 6 at 122, 123, 124, then 7 at 133 and 8 at 136. But fairly soon there were enough -1s to get things back under control again.

Thomas Sauvaget seems to have programmed something a bit like this (see the end of this page on the wiki) and gets a disappointingly large drift that seems to show that my heuristic arguments are wrong. But I don’t see what would be wrong with them, so I wonder whether I understand properly what he has done. Anyhow, I’d be very interested if someone was prepared to program this very simple greedy algorithm and tell me roughly how fast the partial sums grow. The problem at the moment is that if F(136)=8, then F(n) could just as well be as , so I need more data.

Perhaps I got the wrong end of the stick, but I thought your heuristic argument required a certain amount of ‘backtracking’: you’d look at the sum up to some value and then change a relatively small number of earlier primes to improve it. This does strike me as interesting, but the difficult question is how to decide which primes to change…

Tim, what I had done there was to construct a sequence by filling in multiplicatively and choosing at each undetermined prime either a +1 or -1 depending on which one minimized a kind of “global sum with maximal forward information”, namely the sum of all the HAP sums up to (the next prime after ). I had not computed the drift, but the plot shows that indeed there were rather long sequences of minuses in a row.

I have some time now to quickly code your algorithm, but I apologize I don’t understand what you mean by “the sum up to p-1”: is that just the sum along the sequence, or something else?

On second thoughts, perhaps it isn’t so difficult to choose the primes: you only expect to have to change , but you’ve got something like in the second half of your sequence, so you can easily choose enough without messing about with values at multiples. Hmm…

You’re going to laugh at me, but I modified my code to put in a +1 at p only if the sum up to p-1 is <-10, and it behaves _much_ better. I just ran this up to 1000000 and the smallest value I get is -62 and the biggest value I get is 96. Coding error? I'll check, but you can try it too…

I must say I find these results very interesting. First, I do not begin to have an explanation for the rapid growth of the partial sums — it’s as though there is some accidental resonance going on. I’d actually be quite interested to know whether the resulting multiplicative function is approximating some known function with a nice formula.

As for Johan’s modification, something makes me believe it. There appears to be a message (for which, again, I have not even a heuristic explanation) that if you are trying to control a multiplicative function you shouldn’t panic, particularly when it is negative.

There are some other variants one could perhaps try. One would be to choose the value of the prime at p to be 1 with probability 2/3 if the partial sum up to p-1 is negative and with probability 1/3 if the partial sum up to p-1 is positive. (Or one could have different probabilities for the two cases.) Another would be to choose random values at say three quarters of the primes and greedy for the remaining quarter. And I’m sure there are several other variants. But I’m excited about Johan’s (which isn’t randomized but is probably not too character-like either), since it suggests that multiplicative functions with slow-growing partial sums don’t have to be all that special.

Indeed, I’d be interested if Johan could tell us whether there is any sign that his function correlates with arithmetic progressions (other than homogeneous ones of course, where we know that they don’t correlate). If partial sums such as are also reasonably small (I don’t mean logarithmic, but I do mean that their growth should be definitely slower than linear), then it would be convincing evidence against all sorts of conjectures one might otherwise be tempted to make. A quick way of judging this might be just to take the Fourier transform and see whether it gets big anywhere.

In answer to Alec’s question, it’s true that my original suggestion was to be slightly less greedy and modify primes in groups (and that may still be quite a good idea — one possibility is to have groups of size about and change as many as it takes to get the partial sum by the end of the group to be zero (unless you can’t, in which case you get as close to zero as you can).

However, it then occurred to me that it ought to be the case that adding a few greedy steps to a random walk would dramatically reduce its drift, so a slightly different heuristic argument prompted me to ask about much greedier algorithms, with these unexpected, and to me still mysterious, results.

Another thought. If the original greedy algorithm gives something that looks suspiciously regular, might it be possible to multiply the function by one of the form in order to introduce cancellation into the long stretches of pluses and long stretches of minuses? If we got it right, we might be able to get some very good multiplicative sequences taking values in .

In an effort to understand slightly better these very widely oscillating partial sums, I tried defining a multiplicative function by taking to be whenever . The thinking behind this is that if is a little below and is a little below , then will be a little below , and so on, so composite numbers will have a tendency (but only a tendency) to behave similarly to the primes of about the same size.

I’ve calculated it up to 64 and the results have borne out my expectations. The record partial sums (big and small) are 1 at 1, 0 at 2, 2 at 4, -2 at 8, 4 at 14 (and at 16), -4 at 24 (and at 32), 8 at 44, 10 at 48, 12 at 62 (and at 64), where I have not mentioned records that are instantly broken.

However, I still can’t come up with a decent explanation for the phenomenon. The problem is that there are edge effects: it is not the case that the function satisfies . And it would seem that the more factors a number has, the more the edge effects should matter, eventually swamping everything and making the function look random. Perhaps that does indeed happen once we reach the point where almost all numbers have many prime factors that occur an odd number of times.

Actually, that’s given me an idea for how to explain the phenomenon after all. I’ll do a couple of calculations and then write a comment to say how I get on.

OK, let’s suppose that we have a completely multiplicative function and we know that up to a certain point . it correlates very well with the function . In other words, it behaves as if one were choosing to set with that probability.

Now choose a typical number that’s a product of two numbers and that are less than . In the absence of any other information, it seems reasonable to assume that the logarithms of and are uniformly distributed between and (since the probability that is divisible by is ).

On this model, will be 1 with probability

since this is the probability that or .
But that cancels down to . We also know that

Since and averages zero (in our model where we’ve chosen two random positive numbers that add up to ), the probability that (randomizing over as well) is .

That’s not exactly what we wanted, because of the factor 2 inside the bracket. However, it does at least show that there’s a tendency for the biased behaviour that we have up to to continue, in a slightly weaker form. But I still seem to have the problem that my calculations suggest that the randomness ought to damp down the oscillations, whereas Alec’s plot of what happens with the greedy algorithm shows no sign of damping at all.

In a nutshell, my difficulty is this. How do we get a -valued multiplicative function to mimic the behaviour of a complex-valued multiplicative function such as ? Alec’s plot seems to suggest that this should be possible somehow, but the obvious method of taking real parts and converting them into probabilities (which is what I was doing above) doesn’t seem to work, which is not altogether surprising given that is not multiplicative. We do have a nice expression for , but it doesn’t give us quite what we want: it gives us a hint of multiplicativity, but it also seems to want to disperse that multiplicativity.

I suppose it is just about conceivable that Alec’s plot would look different if there were more of it, but at the moment it really does look as though the bias is not dispersing and the peaks of the partial sums are growing linearly.

A small observation concerning the behaviour of a random multiplicative function. We know that a standard random walk up to has a variance of about . With a multiplicatively-defined random walk, the variance seems to be significantly bigger: it’s about 3500 for sums up to 1000. It looks roughly linear, with a few kinks.

If my calculation half way through this comment is correct, then the variance for a random multiplicative function is more like , which could be exactly what you are observing.

Those plots are interesting. I feel I must be making a mistake in my heuristic arguments, because they don’t seem to predict this linear growth of the record partial sums. I’m looking for a reason to get rid of that factor of 2 …

Here’s another experiment that would interest me. I wonder whether the problem with the greedy algorithm could be thought of like this: when the walk gets close to zero one is pulling it towards zero as fast as one can, but what one should actually be doing is easing off by that stage so that it won’t go zooming off in the other direction.

With that vague thought in mind (and partly informed by Johan’s experience), one could try a more delicate damping that worked like this. You set yourself a target, such as 50, and aim to keep the partial sums between -50 and 50. But you also don’t want to lose the benefits of randomness, so you use the non-randomness sparingly — only if there seems to be some genuine danger. More precisely, if the partial sum up to is and , then you let with probability (so it’s definitely -1 if and it’s definitely 1 if and the probability varies linearly in between). If the partial sum is outside these bounds then you revert to the pure greedy strategy and try to get it back inside.

I’d be interested to know whether this easing off as you get near zero produces a completely different result. How far does it have to go before you hit , for example?

That is very interesting, and I wonder how long this better control would continue, and whether it would work with 50 replaced by 20, etc. etc. It would indeed be interesting to optimize that function, but at the moment I feel as though I’m making stabs in the dark.

Talking of which, here’s another one. The thought behind it is that the original greedy algorithm didn’t work because the additional kick it gave was accidentally in phase with the partial sums themselves (not that I quite understand why that should be, when it was designed to be precisely the opposite). One way of changing the phase would be to make it depend not on the function itself (that is, the partial sum) but on the derivative (that is, the multiplicative function). A reasonably sensible way of doing that might be this. If you’ve chosen the values of up to , then randomly pick an integer between and and set to be . The idea here is that if the function is growing quite a bit as you approach then the value at will have a high probability of cancelling out a little bit of that growth. I’d be curious to know if this had the effect of hugely smoothing out the “random walk”. Part of me says that it probably will and another part of me says that it probably won’t.

If we define to be the average value of near , then one could generalize your question and ask for the optimal function , where .

Two of them get out of hand, but eight stay under control and close to zero. I wonder if they'll all 'go linear' eventually.

I tried the suggestion of setting for a random recent ; maybe I just didn't get the parameter right but the behaviour looked fairly random. It ought to be possible to do something along these lines, though.

Alec, thanks for running that idea. It gives me one more idea (after which I may stop — but I can’t quite guarantee it). I think I can understand why for a random recent might give fairly random behaviour. The rough reason is this. If the function starts growing in a linear kind of way, then the choices at primes will cease to be random and will try to kill the gradient. So this method should stop linear growth. However, if the behaviour is fairly random, then the choice of is also pretty random. (Yes, there is some dependence, but we have a lot of dependences anyway.) So this method won’t do anything to defeat randomness, so to speak.

Now the other method (making -1 when partial sums are positive and +1 when they’re negative) defeats randomness, but seems to give rise to linear behaviour. So what if we combine the two? That is, what if we just alternate the two methods of choosing ? The idea is that one method would keep things at worst random and the other would then kill off the randomness. In short, one method would produce the randomness that the other one needs for the heuristic argument that lay behind it to work.

My heuristic arguments here are obviously pretty vague, and I can well imagine that this won’t work at all, but it definitely seems to be worth a try, and I’d have thought you could patch together some bits of existing code pretty quickly.

On another topic, I am very intrigued by your tanh sequences, and especially by the idea that they may be extremely well controlled but in an unstable way. There are all sorts of phenomena that I’d like to understand better. It feels as though there ought to be differential equations that would predict the behaviour well, but my attempts to do that have so far given obviously wrong predictions.

I have a partial explanation for this linear behaviour that keeps popping up, but need to think about it more. The idea is that the heuristic argument that predicts that choosing to have the opposite sign to the partial sum up to should work well depends on the assumption that the partial sums look like a random walk. But if they ever don’t look like a random walk and start moving at a linear rate, then choosing the s will not have enough effect to defeat that linear behaviour. So instead what happens is that you just hopelessly attempt to slow it down, putting in a long sequence of values of the same sign that set you up for trouble a bit later on.

Actually, have you tried the very simple adjustment, designed to avoid this rather extreme behaviour, of setting … wait, I’ve just looked back and seen that you have. I’m talking about having probabilities of 2/3 and 1/3 instead of 1 and 0. You gave a link to some pictures that you claimed were much more square-root-like. I’d be interested to know whether that really is the case — that is, if you go a lot further than 10,000, do the deviations get quite a bit bigger?

Also, one could imagine varying your tanh experiment by sticking a factor of 1/2 in front of the tanh (so that the probabilities vary between 1/4 and 3/4 rather than between 0 and 1). I think that might have the effect of stopping the linear behaviour from ever arising, while maintaining the good control. And if it doesn’t, then perhaps some other constant would.

Here’s yet another idea for producing a good sequence (my promise to try not to do this was fairly meaningless). If one wants to introduce a phase shift in order to try to stop any resonance, then a simple way of doing it is to define to be 1 if the partial sum up to is negative and otherwise. And of course that too can be made more probabilistic. It would be interesting to know which (if any) of the following three likely possibilities holds.

(i) Whatever you choose, you get big growth, but affects its frequency and amplitude to some extent.

(ii) Most have very little effect, but if you get the right you almost completely kill the growth (with subcases that you make it square-root-like or you make it power-of-log-like).

(iii) There are a few values for , notably 1, that are particularly bad, but in general pretty well any is a big help.

I wish I understood what was going on well enough to be able to guess what the best should be. But the next best thing would be to see what it is experimentally and then try to retrodict (if that’s the right word) that value.

As I mentioned in one of the earlier post I have kept my program for constructing sequences, not necesssarily multiplicativem running after a restart. The program is now working on sequences of length 1120 and has found a number of them, 63 so far. I have created a page from which you can download the sequences. It is herehttp://abel.math.umu.se/~klasm/Data/EDP/

As before these sequences are not necessarily maximal so running one of the backtracking programs to try to extend them would be good.

Many of these sequences differ already in the first few values, but there are not that many different stubs of length 25 which the program has managed to extend to length 1120 so far.

Is see that I also happened to put the raw output files there instead of a cleaned up version. I will fix that too. The current files are fine if you delete the first line and the first two words of the second line.

There is quite a bit of common structure to these sequences. One can partition into sets with ‘primary’ functions such that all the sequences in the list satisfy . One such set is . Most impressive is the set

Klas, is your program exhaustive, i.e. when it finishes will it mean that it has reached the maximal possible length of C=2 sequences for sure? Also, do you have an estimate of when it might finish under the assumption that 1124 is the maximal length (i.e. how long did it take from 1119 to 1120?)

[…] in spirit and in methodology to discrepency problems in combinatorics and number theory such as the Erdos Discrepency problem, now under polymath5 attack. The Kupitz-Perles conjecture seems of a different nature, but I find […]

“Showing EDP for automatic sequences may be a nice toy problem.” – O’Bryant
I must admit that I don’t have much intuition about automatic sequences, but here is my idea:
I think that if a sequence is k-automatic and d is an integer, gcd(d,k)=1, all (non-homogeneous) arithmetic progression with difference d looks like the d-HAP in the following sense: If and are natural numbers there exists a number N such that . This would imply that if the d-HAP has bounded discrepancy, then any d-AP has bounded discrepancy.
We know that the AP-discrepancy is unbounded for any sequence, so if we could prove the above for any d (and not only d relative prime to k) we would be done. But perhaps we could generalize the result about AP-discrepancy to:
For any sequence and any natural numbers C and k we can find natural numbers d and a such that gcd(d,k)=1 and the sequence has discrepancy >C.
However, I have no idea about how the AP-discrepancy result was proved, so I don’t know how difficult this generalization would be.

“For any sequence and any natural numbers C and k we can find natural numbers d and a such that gcd(d,k)=1 and the sequence x_{a},x_{a+d},\dots has discrepancy >C.”
Do’h, here is a counterexample: +,-,+,-,… and k=2. Actually I was thinking about that example while I wrote the comment, but I didn’t take I too serious, because the k-HAP had unbounded discrepancy. What I should have wrote was: Is it possible to prove that for any C we can find a HAP or a d-AP with gcd(d,k)=1 that has discrepancy >C.

Suppose is a sequence of length which is completely multiplicative, has discrepancy at most , has , and has .

Define as if and otherwise.

Then is completely multiplicative and up to length has discrepancy at most .

Verifying using the sample multiplicative sequence of length 246 with discrepancy 2 (http://michaelnielsen.org/polymath1/index.php?title=Multiplicative_sequences), I checked to find the largest index where the sequence has value +1 and the partial sum just before that is 0, which is at . Extending that gives a completely multiplicative sequence of length 1935 with discrepancy 3, and it goes to with discrepancy 7.

It's still multiplicative if , which helped improve the original Matryoshka sequence. When I extend that way I get to length 1943 while still at discrepancy 3.

What I'm really hopeful about is iterating this procedure – finding a good constructive way to choose how much to increase the discrepancy and where to pick the length of the next sequence, then constructing "better and better" versions. I thought I had sqrt(log) for a moment, but it didn't work…

Yes, the length 1935 “completely multiplicative” sequence is not, because the length 240 sequence is not completely multiplicative mod 241. Is there an enumeration of the 500 length 246 discrepancy 2 sequences somewhere?

I think Kevin O’Bryant has been down more or less this road, but perhaps you are doing something he wasn’t. In any case, if you have a sequence that is completely multiplicative mod M, then a fairly easy Fourier argument gives you a partial sum of size proportional to the square root of M, so I think there isn’t an easy way of improving Matryoshka sequences. But it’s possible that I am not fully understanding what you are hoping to do here.

The thread on simple algorithms for producing (or accidentally failing to produce) reasonably low-discrepancy multiplicative sequences has got so long now that I’m starting a new comment about what the explanations might be for what we are observing.

It occurs to me now that the apparent linear behaviour may be an illusion, and that the truth may be growth that is very slightly sublinear. My reasoning is as follows. If you define a multiplicative function by defining to be raised to the power , then you get some very nice behaviour that points in the direction of linear discrepancy. For instance, if and are primes and it so happens that is only just smaller than , then it is very likely that . So there is a tendency for the s to behave very like the primes, at least in the run-up to a power of 2.

This kind of argument works for products of more than two primes as well, but the more primes you take, the worse it gets. Now a typical integer near has about prime factors (despite anything I might have written to the contrary), and if is less than a million, then looks pretty much like a constant. So I think that the apparent linearity that we see in the examples that blow up may well be a function more like that we are just not fully seeing yet. Looking back at some of the plots, there are some signs, though far from conclusive ones, that the gradient is levelling off just a little.