EDP7 — emergency post

I don’t feel particularly ready for a post at this point, but the previous one has got to 100 comments, so this is one of the quick summaries again — but this one is even shorter than usual.

We are still doing an experimental investigation of multiplicative functions, trying to understand how special they have to be if they have low discrepancy. Ian Martin has produced some beautiful plots of the graphs of partial sums of multiplicative functions generated by various greedy algorithms. See this comment and the ensuing discussion.

I came up with a proof strategy that I thought looked promising until I realized that it made predictions that are false for character-like functions such as and . Even if the idea doesn’t solve the problem, I think it may be good for something, so I have written a wiki page about it. Gil has had thoughts of a somewhat similar, but not identical, kind. Here is a related comment of Gil’s, and here are some more amazing plots of Ian’s. (I think we should set up a page on the wiki devoted to these plots and the ideas that led to them.) Regardless of what happens with EDP itself, I think we have some fascinating problems to think about, which can be summed up as, “What is going on with these plots?”

This entry was posted on February 8, 2010 at 10:31 pm and is filed under polymath5.

I’ve been having another look at this plot of Ian’s. It’s the partial sums of the completely multiplicative function you get if you define $f(p)$ to be 1 if and only if the partial sum up to $p-1$ is less than or equal to -4. I used to be quietly confident that the fairly low partial sums we were witnessing were a sign that the growth was logarithmic, but now I’m not so sure. The reason for my doubts is that the “greedy positive” sequence seems to show $n^{1/4}$-type growth, which makes me think that perhaps this one does too. Given that the fourth root of 2,500,000 is about 40, the numbers do seem to be in the right ball park.

It’s interesting to see how the graph is very flat to start with and then appears to take off quite rapidly, after which a new function (the more solid blue) takes over and interferes with the old one, causing it to drop back down a bit before resuming a more rapid growth. I would dearly like to know whether this kind of pattern continues, but I understand from Ian that the computations are becoming quite slow now, so this may not be feasible. Or perhaps the issue is one of storage: how many values of is it reasonable to store directly? I’d be quite interested to know the answer to that so I can try to think what the limit of feasibility would be. So my two questions are: roughly how many numbers can you store and roughly how many arithmetical operations can you do per second? I realize these questions are a bit vague, but vague answers would suffice.

The algorithm I have in mind would be this. You define two arrays, and , for some large . You initialize them to and . Then a global step of the algorithm consists in doing the following.

Below is Tim’s algorithm as a Python script (pylab is just for the plotting). On my PC here at work this runs very quickly with N = 5000000, but runs out of memory at 10000000. On a better machine or in a faster compiled language it could no doubt do a little better.
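A minimal sketch of a script of this kind, assuming the greedy rule described earlier in the thread: at each prime p the value f(p) is +1 exactly when the partial sum up to p-1 is at or below a threshold, and -1 otherwise, with f extended completely multiplicatively. The function name, the default threshold of -4 and the smallest-prime-factor sieve are illustrative choices rather than details of the original script, and the plotting is left out.

```python
from math import isqrt


def greedy_partial_sums(n_max, threshold=-4):
    """Greedy completely multiplicative f on 1..n_max and its partial sums.

    Assumed rule: at a prime p, f(p) = +1 if s(p-1) <= threshold, else -1.
    Requires n_max >= 1.
    """
    # spf[n] = smallest prime factor of n (spf[n] == n exactly when n is prime).
    spf = list(range(n_max + 1))
    for i in range(2, isqrt(n_max) + 1):
        if spf[i] == i:
            for j in range(i * i, n_max + 1, i):
                if spf[j] == j:
                    spf[j] = i

    f = [0] * (n_max + 1)  # f[0] is unused
    s = [0] * (n_max + 1)  # s[n] = f(1) + ... + f(n)
    f[1] = s[1] = 1
    for n in range(2, n_max + 1):
        p = spf[n]
        if p == n:
            # Prime: the greedy choice depends on the running sum so far.
            f[n] = 1 if s[n - 1] <= threshold else -1
        else:
            # Composite: forced by complete multiplicativity.
            f[n] = f[p] * f[n // p]
        s[n] = s[n - 1] + f[n]
    return f, s


f, s = greedy_partial_sums(100_000)
print(max(abs(v) for v in s))
```

Both lists have length n_max + 1, which is where the memory goes. Until the running sum first reaches the threshold, every prime gets -1, so the function starts out as the Liouville function.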

Would it be possible to provide plots of the partial sums up to 5000000? It would be nice to see the partial sum as a function of , and also the log of the partial sum as a function of . We won’t get all that much more information than we already have from Ian’s plots, but even a little might be helpful.

If the algorithm is running very quickly but needing too much space, one could cut down the space requirement by a factor of two and slightly increase the complication of the calculation by just storing the partial sums up to and the values thereafter, since the values up to can be got by differencing the partial sums. I don’t know enough about how computers work to know whether that would mean you could go twice as far.

If one wanted to save even more space, one could save just the values at primes and the partial sum up to (but not the previous partial sums), doing a plot as one went along, and perhaps taking note of each time the partial sum reached a new record, and calculating each time by working out its prime factorization. That feels as though it could get you over 15 times as far if space remains the main constraint, unless one is forced to waste space by storing a table of the primes in increasing order. But even then one should be able to get a seven-fold improvement. (I’m approximating the log of 10^6 by 15.)

Oh dear, it is as I feared. The log versus log graph looks very linear, so it seems that this greedy algorithm is in fact giving us power-type behaviour. However, it also seems that the power is definitely smaller than 1/2, with my money still on 1/4.

I’d still very much like to understand this better, but I feel a lot less optimistic about finding $\log n$-type growth this way.

Maybe my optimism would increase if the power went down if one changed the -4 threshold to a smaller number such as Johan’s original -10 or something smaller like -50. If that happened, then I would expect to be able to obtain growth by having a variable threshold such as . So it might be worth running the program again with this tiny change.

I managed to get up to , and these plots may give some cause for optimism. They are log-log plots of the maximum value attained by the partial sums, for three versions of the algorithm: setting when the partial sums are less than or equal to , and respectively. But as usual one would really like more data!

Alec, one piece of “more data” I would very much like to see, which should be easy for you given what you’ve already produced, would be a diagram like this one but with a -10 threshold. I prefer the plots of the actual values (or rather the logs of the absolute values of the partial sum versus log n) to the plots that show records, because they give strictly more information: one can see what the records are fairly easily, but one can also discern underlying trends, such as the striking approximate linearity of the boundary of the solid blue part of the picture.

My very slight cause for optimism is that in that diagram there seem to be three regions, each of which displays slightly different activity. One could I suppose argue that what happens up to 20 or so is not to be taken all that seriously, and that the difference between what happens up to and what happens beyond is not as significant as it looks, in which case we would seem to have growth. But we might be very lucky and find that after a bit a new type of slower growth took over.

The reason I’d be interested to see the -10 log versus log plot is to see if there is again some discernible linearity in the underlying trend (rather than in the outliers, where there isn’t), and if so how the gradient compares with 1/4. I am of course hoping that it will be smaller (or rather, that it will start out like 1/2, since to begin with we get the Liouville function, and will then drop to something smaller). But I’m not holding my breath …

It must be possible to upload images to the wiki, and it would be very good to do so, but I’m afraid I don’t know how if you don’t (by which I mean that if you don’t find it obvious then it’ll be beyond me).

OK, I am now fairly convinced that we’re running up against some kind of universality here and will be hard pressed to do better than a growth rate of by these methods.

The fact that we seem to be able to get is still of some theoretical interest, however, since it tells us that aiming for in the non-character-like case would be aiming for something too strong.

I can’t help thinking of other variants that it would be good to try. Let me see if the following idea persuades anybody.

The heuristic behind the greedy algorithm was supposed to be that adding a drift towards the origin at a rate of to a random walk should tend to keep it confined to within most of the time, so if we can do what we want at primes then we ought to be able to produce logarithmic growth. The flaw in the argument is that if the walk ceases to be random then this drift term is (i) no longer strong enough and (ii) reinforces the non-randomness.

Here is a revised heuristic. The Liouville function seems to behave like a random function, so we may gain something if we have a bias towards negative values. Indeed, in a small way we have already gained that by setting thresholds to be smaller than zero for when we choose . But we could try various other methods to see whether they do better. Here are a few.

(i) If the partial sum up to is at least -A then set . Otherwise, set with probability . (It’s not obvious that we don’t want . It seems likely that we want , but even this probably can’t be taken for granted.)

(ii) If the primes are enumerated in increasing order as , then let except if is a multiple of (a smallish) , in which case choose according to the usual greedy procedure (with a threshold of -A, which again might perhaps be zero).

The thought behind these is that if we apply a greedy algorithm at fewer primes, or with a smaller probability, then we will disrupt the randomness of the Liouville function less, but will also make enough changes to add a substantial drift term. It probably won’t work, but it feels worth trying. Even if it gave again, it would still be interesting as evidence that we have some kind of universal law going on (which it would be very good to explain heuristically).

It occurs to me to add that a more serious problem with the heuristic picture may be related to what Alec pointed out recently. He drew attention to a graph of the partial sums of the Liouville function that can be found on the Wikipedia page for that function. Although the peaks of the graph grow like the square root of n, the graph does not resemble a random walk all that closely — its oscillations are too regular.

I’m tempted to ask the following question in response to that fact: is there some way of choosing a multiplicative function that makes its partial sums behave more like a random walk than those of the Liouville function? What about, for example, a random multiplicative function (where you choose with probability )? A plot of one or two of those might be enlightening. If they look fairly random, then maybe the algorithms I suggested above could be modified so that the role of the Liouville function is played by a more random function.
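A minimal sketch of how such a random multiplicative function could be generated, assuming that each prime independently gets +1 or -1 with probability 1/2 (the probability is my assumption) and that the function is extended completely multiplicatively. The function name and the seed parameter are illustrative.

```python
import random
from math import isqrt


def random_multiplicative_partial_sums(n_max, seed=None):
    """Random completely multiplicative +-1 function on 1..n_max.

    Assumption: f(p) is +1 or -1 with probability 1/2, independently
    for each prime p. Returns (f, s) with s[n] = f(1) + ... + f(n).
    """
    rng = random.Random(seed)

    # spf[n] = smallest prime factor of n.
    spf = list(range(n_max + 1))
    for i in range(2, isqrt(n_max) + 1):
        if spf[i] == i:
            for j in range(i * i, n_max + 1, i):
                if spf[j] == j:
                    spf[j] = i

    f = [0] * (n_max + 1)
    s = [0] * (n_max + 1)
    f[1] = s[1] = 1
    for n in range(2, n_max + 1):
        p = spf[n]
        if p == n:
            f[n] = rng.choice((1, -1))  # free choice at a prime
        else:
            f[n] = f[p] * f[n // p]     # forced at a composite
        s[n] = s[n - 1] + f[n]
    return f, s


f, s = random_multiplicative_partial_sums(100_000, seed=1)
print(max(abs(v) for v in s))
```

Plotting s against n for a few seeds would give the kind of picture asked for above.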

Alec, the large memory machines are purely for batch runs so I can’t do graphics on them. Maybe you could modify the program to save only consecutive maxima and minima in a file, or something similar which doesn’t grow too fast? Then I could run the program as far as possible and post the files.

I’ve slightly lost track of what this graph is. I agree it’s hard to tell what the function underlying it is.

I’d be very interested if it was possible to plot the “robust maxima”, that seemed to show up on some of the other graphs and behave very linearly (on the log versus log scale). One way it might be possible to do that would be to pick, for each , an interval about of some suitable width (perhaps it would have to be of width proportional to , but if that got too big one could do random sampling), then draw a little graph of the percentage of in the interval such that the partial sum up to is at least R, and then identify an S such that as R approaches S from below this percentage decreases with a noticeably negative derivative, but after S the derivative is almost zero (because the function itself is almost zero).

I’m not sure how clear that was, or how easy it would be to implement, but my feeling is that this “robust maximum” is likely to be given by a smoother function that can be more easily identified. And it would then give us very strong evidence for a lower bound for the growth rate.

It is a superposition of three log-log graphs (superposed because their behaviour was quite similar), corresponding to thresholds of , and . (When the partial sum is at or below the threshold we set ). All three runs were up to .

I agree it would be good to have more detailed plots showing the sections of linear-looking growth; perhaps it would be as simple as plotting the local variance of the partial sums with respect to an exponentially increasing window. I’ll see if I can knock together some code to do this.

The first is based on Klas’s latest upload, showing maxima up to on a log-log scale with a threshold of :

And the second goes up to , with the same threshold, also on a log-log scale, but is an attempt to capture the ‘local’ maximum of the square of the partial sum (so divide gradients by two when comparing): each point is the maximum over a window; there are 100 non-overlapping windows of geometrically increasing size:
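The windowing described here can be sketched as follows. This is only an illustration, assuming the window boundaries are taken at rounded powers n_max^(k/100); the function name and the rounding are my own choices, not necessarily those used for the plot.

```python
def windowed_max_square(partial_sums, num_windows=100):
    """Maximum of s(n)^2 over non-overlapping, geometrically growing windows.

    partial_sums[n] is s(n) for n = 1..N (index 0 is ignored).
    Returns a list of (window_end, max of s^2 over the window).
    """
    n_max = len(partial_sums) - 1
    # Window end points at rounded powers of n_max; duplicates collapse,
    # so short inputs give fewer than num_windows windows.
    edges = sorted(
        {max(1, round(n_max ** (k / num_windows))) for k in range(num_windows + 1)}
    )
    points = []
    lo = 1
    for hi in edges:
        window = partial_sums[lo:hi + 1]
        points.append((hi, max(v * v for v in window)))
        lo = hi + 1
    return points


example = [0, 1, 2, 1, 0, -1, -2, -3, -2]  # s(1)..s(8), index 0 unused
print(windowed_max_square(example, num_windows=3))
```

Plotting the logarithm of the second coordinate against the logarithm of the first gives a log-log picture of the same kind, with gradients twice those of the unsquared sums.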

Each time we set a value at a prime to +-1, it seems to me we’re doing something like adding the previously determined partial sums stretched by a factor of the prime, and again stretched by the prime squared, etc. When x2 gets set to -1 our initial partial sums look periodic, and so everything after that’s going to be adding periodic functions. Can we look at functions that are the partial sums up to n as if only the primes up to some prime have nonzero values?

Define F1(n) = 1 if n=0, 0 otherwise.

and with any luck that will collapse to some form we can deal with and the limit of F_k’s will converge to something we can deal with.

Or if it turns out to not be easy in general, maybe at least with the sequence?

F3(n) with x3=+1 doesn’t converge… but it’s maybe not terrible when taking the sum (of the contributions of the powers k of the prime) to m-1 instead of infinity. I get F3(n) “!=”

Here’s a very small extension of the plot for the algorithm that’s constrained to remain positive I showed before.

With my computer skills I’m not going to be able to extend this much further – I could probably leave my laptop working away for 24 hours and not get anywhere near the plot Alec could generate in an hour! (It would also be good to have an independent confirmation that the code which generated this plot is correct, since I’m somewhat less confident in it…)

A little remark on the nature of the long discrepancy-2 sequences we’ve found. One way to express the pattern noted here is that the sequences frequently coincide with a linear combination of a small number of multiplicative sequences taking values in . This comes from taking the Fourier transform of . In particular, they coincide at most 5-smooth values with a particular linear combination of three multiplicative sequences whose values are sixth roots of unity. This adds some weight to the idea (also suggested by Terence’s Fourier-reduction approach) that we perhaps ought to study the problem for -valued multiplicative sequences.

A further little remark on this topic. A feature of multiplicative functions to {-1,1} is that there are some rather crude dependences: if $n$ is a perfect square, then $f(n)=1$. This will probably place quite a big constraint on the growth rate of the partial sums (not that I can prove anything). But the dependences for complex multiplicative functions are a bit subtler, since perfect squares do not have to map to 1. It could be that this gives them a small advantage and explains why we get quasimultiplicativity appearing in the length-1124 sequences.

I’ve been thinking a little more about the idea of taking limits along translates where is increasingly divisible. One thing this does for us is turn a 1D sequence with drift bounded by C into a 2D (or even an infinite-dim) sequence with drift bounded by C. For instance, starting with a 1D sequence f(x) of drift bounded by C, consider for each j the 2D sequence

(x,y) -> f( x + n_j y )

and take a subsequence limit to obtain a new function f(x,y), which has drift bounded by C along rays through the origin in the sense that |f( ax, ay) + … + f(bx, by)| <= C for all a,b,x,y, but one also has a new drift property in the horizontal direction in the sense that |f(ax,y)+…+f(bx,y)| is less than C.

One can extend this procedure to higher dimensions. For instance, one can take a subsequence limit of the functions that take (x,y,z) to f(x+n_j y + n_j^2 z) and now we can bound |f(ax,ay,az)+…+f(bx,by,bz)|, |f(ax,ay,z)+…+f(bx,by,z)|, and |f(ax,y,z)+…+f(bx,y,z)| all by C. And indeed one can get a sequence on Z^omega (the union of the Z^k for all k) which has all sorts of bounded drift properties.

It’s tempting to try to hit this infinite-dimensional sequence with a Fourier reduction and/or a powerful Ramsey theorem (not quite Hales-Jewett – that would be amusing – but perhaps something like Hindman or Carlson-Simpson). I haven’t tried this yet.

OK, it seems that this infinite dimensional extension of the function does not, in fact, gain very much. If f(x) has discrepancy at most C, then the function f(x,y) defined by f(x) when x is non-zero, and f(y) otherwise, has bounded discrepancy at most C+1 at the 2D level; similarly, f(x,y,z), defined as f(x) when x is non-zero, f(y) when x=0 and y is nonzero, and f(z) when x=y=0, also has discrepancy C+1, and similarly in higher dimensions – in all cases the discrepancy amplifies from C to C+1 at best, leading to no asymptotic advantage for the EDP problem. The point is that this model is not capturing the very long intervals that lead to log divergence for things like . Ah well, back to the drawing board…

Regarding the Fourier reduction. One side question is: if we take M to be a power of 2 and look at the Walsh expansion of (Z/MZ)^d it will also work, right?

Another question is: it looks as if, when F is expressed as a complicated combination of multiplicative functions, or, in other words, when the coefficients are all small, the argument is very “wasteful”, and that if you start with an additive combination of multiplicative functions (even not to {-1,1} but, say, with some 0s in the range), the discrepancy behaves almost additively (without cancellation).

Is it of any interest to look at the Fourier description of our basic (-1)^(last nonzero ternary digit) function and its expansion in terms of “simpler” multiplicative functions?

Somehow the Fourier expansion we use to study the additive periodicity (like Terry’s comment above) is different from the one used in Terry’s proof.

Very vaguely, it “feels” that we need a quantitative statement (along with a direct connection to discrepancy) extending the trivial fact that if we have a sum of a finite number of periodic functions, such that each function vanishes on integers divisible by its period, then the sum also vanishes somewhere. I wonder if there is a complicated f.t. proof of this trivial statement.

Here is a related question which, on the one hand, may be easier, but on the other hand it, or an argument for proving it, may be useful. (Yet on the third hand it may be nonsense):

Let us consider a function from Z to {-1,0,1} (the definitions may continue to make sense when the range is C). Now let’s consider the following measure for the discrepancy in some large interval: for every HAP d, 2d, 3d, …, rd we take (y_d+y_2d+…+y_rd)^2|y_rd|^2 and then we average over all HAPs.

So in other words, in our case, when the values are -1, 0 and +1, we just assign discrepancy 0 to an HAP whose last term is 0.

Now consider a function whose support has density t, where we think of t as small. (The support refers to the nonzero entries.) So the average discrepancy, as we defined it, should be at least t or so. The question is whether we can show that the average discrepancy is at least t log t.

The role model: pieces of our matryoshka function: 0 for numbers not divisible by 3^k and then 0, 1 or -1 according to the first nonzero digit in the ternary expansion.

Just to make it clear, the role model is: x_n = 0 if n is not divisible by 3^k. Otherwise, x_n is the value of the kth digit in the ternary expansion of n. This is a periodic function and the density t of the support is (2/3)3^{-k}.

The reason we get t log t in our measure of discrepancy is that, while we can expect the density of the sequence restricted to a HAP to be t, 1/3 of the time it will be 3t, 1/9 of the time it will be 9t, etc. The question is whether there is no way around this log t. Of course, if we take a random +1, -1, 0 sequence of density t, a restriction to a HAP will not have much larger density, but in this case the discrepancy itself will be larger.
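The averaged measure and the role model described in this comment can be sketched as follows. This is a rough illustration under two assumptions of mine: the average is taken over all HAPs d, 2d, …, rd with rd at most N, and the ternary digit 2 is mapped to -1. The function names are illustrative.

```python
def average_hap_discrepancy(y):
    """Average of (y_d + ... + y_rd)^2 * |y_rd|^2 over all HAPs with rd <= N.

    y[n] is the value at n for n = 1..N (index 0 is ignored).
    """
    n_max = len(y) - 1
    total = 0
    count = 0
    for d in range(1, n_max + 1):
        running = 0
        for m in range(d, n_max + 1, d):
            running += y[m]
            # An HAP whose last term is 0 contributes nothing.
            total += running ** 2 * abs(y[m]) ** 2
            count += 1
    return total / count


def ternary_role_model(n, k):
    """0 unless 3^k divides n; otherwise the digit of n in position 3^k.

    Assumption: the digit 2 is mapped to -1.
    """
    if n % 3 ** k != 0:
        return 0
    return (0, 1, -1)[(n // 3 ** k) % 3]


k, n_max = 2, 3 ** 7
y = [0] + [ternary_role_model(n, k) for n in range(1, n_max + 1)]
density = sum(abs(v) for v in y) / n_max
print(density, average_hap_discrepancy(y))
```

Comparing the printed average with the density t for several k is one way to look at the t versus t log t question numerically.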

I have been working on the human proof that a sequence with discrepancy 2 must have length at most 246. There was one case left: f(2)=f(3)=f(5)=-1 and f(7)=1. I think I have a proof of this. Let me outline it. We have f(98)=f(99)=-1, which means there is a cut at 98, so the sum is 0 at 98. f(100)=1, so the sum is zero at 100. f(110)=f(111)=-1, so there is a cut at 110 and the sum is zero at 110. This means that the sum of the values at 101 to 110 is zero, which means that three of 101, 103, 107 and 109 must have positive value, with the fourth negative.

If we look at 200, 201, 202 and 203, we see that 200 and 201 have value -1, which means there is a cut at 200. If 202 is negative then, since 203 is negative, the sum at 203 will be -3, so 202 must be positive. Hence 101 must be negative, and 103, 107 and 109 must be positive.

Now look at 206 through 215. Since 103 is positive, 206 is negative, and since 207 is negative there is a cut at 206. But the sum from 207 to 215 is -3, and we have a contradiction.

I am putting this on the wiki and updating it with some material from the EDP5 thread which is not up yet.

I would like to see the cases from the prover. If they are too long to post here then you could post them somewhere on the wiki and just give the URL here. Once you get f(11), f(3) and f(7) = -1, then f(20), f(21) and f(22) are 1 and there is a cut at 20, so the sum at 20 is zero, and if f(23) is 1 then the sum at 23 would be 3, so f(23) must be -1.

I have put some stuff on the Wiki directly behind the quoted part. It is not polished and not complete. Maybe we can use some human input to fix some further primes thereby reducing the number of cases to come. My program is not completely automatic and operating it further requires some work.

Towards an efficient proof for the non-existence of a multiplicative function (of length 247 or infinite) with discrepancy 2.

We might learn something from tweaking the constants in our log-growth examples. Remember is 1 if the last digit in the ternary representation of i was 1 and it is -1 else. Let denote the sum. We have and the recursion

for . is positive, self-similar (btw nice picture) and satisfies

The number of 1s (-1s) in has asymptotic density 1/2. This is necessary for EDP. Are there conditions we can add to make this sufficient?

If we modify by an extra factor -1 if the number of zeros behind the last digit is odd, we still get logarithmic growth. Can we alter further to get better constants?


When I was waiting for a bus yesterday, and happened to have my laptop at hand, I decided to take a look at the discrepancy of all multiplicative functions up to an integer , rather than trying out various algorithms for generating random samples from this space.

Let denote the nth prime number.

I fixed and then for each integer constructed the multiplicative function f which is . Here is the ith binary digit of , with the convention that the nth digit is the one which corresponds to The reason behind this convention is that it gives more structure to the pictures.

For each function I have plotted the base 2 logarithm of the maximum of the modulus of the partial sum of for . So the value at x=10 in the figure is the base 2 logarithm of the discrepancy of the function derived from the digits of the number 10. With the convention for digits mentioned earlier all functions up to will have , and the function corresponding to 0 is the function which is constant 1.
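A minimal sketch of this kind of exhaustive enumeration. The encoding is an assumption of mine and may differ from the digit convention used for the plots: bit j of the index (least significant first) being set means that the j-th prime gets the value -1, so that index 0 is the constant function 1 and the all-ones index is the Liouville function. All names are illustrative.

```python
from math import isqrt, log2


def primes_up_to(n):
    """Primes <= n by a simple sieve of Eratosthenes."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False] * min(2, n + 1)
    for i in range(2, isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def max_abs_partial_sum(index, primes, n_max):
    """max over m <= n_max of |f(1) + ... + f(m)| for the encoded function.

    Assumed encoding: bit j of `index` set means f(primes[j]) = -1.
    `primes` must list every prime <= n_max in increasing order.
    """
    f = [0] * (n_max + 1)
    f[1] = 1
    for j, p in enumerate(primes):
        f[p] = -1 if (index >> j) & 1 else 1
    total, best = 1, 1
    for n in range(2, n_max + 1):
        if f[n] == 0:  # composite: split off its smallest prime factor
            p = next(q for q in primes if n % q == 0)
            f[n] = f[p] * f[n // p]
        total += f[n]
        best = max(best, abs(total))
    return best


n_max = 30
primes = primes_up_to(n_max)
table = [max_abs_partial_sum(i, primes, n_max) for i in range(2 ** len(primes))]
print(min(table), max(table), log2(table[10]))
```

The number of functions doubles with each prime, which is why only small ranges can be enumerated in full and larger ones have to be sampled.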

Here is the plot for

For large n it is not possible to test all functions, but one can sample them. Here is a plot based on sampling for n=200

and one for n=500

For small n one may also plot the correlation between the number of 1s in the binary expansion of i and the discrepancy of the ith function.
Here the constant function 1 is the extreme on the left and the Liouville function the extreme on the right.

Likewise one may plot the same correlation for the number of runs of digits instead.
Here the constant function and the Liouville function become the leftmost functions, and the two extreme functions on the right are the two functions which are alternately 1 and -1 on the primes larger than one (which I mentioned in an earlier post).

In relation to Klas’ plots I thought it would be interesting to look at the overall density and cumulative distribution of the maximum modulus of the partial sum, over all completely multiplicative functions up to . Here are the plots for :

The plots illustrate the distribution of the logarithm of the maximum absolute sum divided by . The shapes seem to be pretty much independent of . In particular I noticed that the cumulative curve becomes almost vanishing below . One might interpret this by saying that ‘almost all’ multiplicative functions exhibit growth of at least , which is interesting in light of some of our previous observations for greedy algorithms.

PS. Having looked at some distribution graphs of random samples from longer random multiplicative sequences, such as this one (, sample size 1000), I see that I was too hasty in claiming that the shape was pretty much independent of . The distribution seems to get more concentrated around (corresponding to growth) as gets bigger.

According to this paper (see remark following Theorem 3.2), if is a random -valued multiplicative function (that is, the values are independently and uniformly distributed on the unit circle), then, almost surely,

Trying to keep track where we are (without yet digesting all the arguments) I wonder about the following statement from the wiki on the strategy described in the post above.

“This strongly suggests that we might be able to prove a result that applies to all functions f that satisfy some Fourier condition. The condition should tell us that f does not correlate with residue classes modulo small primes, so it should tell us something like that is small whenever α is close to a rational with small denominator. Intuitively, these are the functions that ‘can tell the difference between HAPs and ordinary APs’.”

My question is: suppose we know that functions which do not correlate with every residue class modulo small primes (or small integers) have large discrepancy. Will the result about functions that have large correlation with a Dirichlet character suffice to finish all cases?

(Or, say, all completely multiplicative cases, which are more or less sufficient.)

Another question (again, this may already have been discussed): moving from a function f(1), f(2), …, f(N) to g(1)=f(1), g(2)=f(1)+f(2), …, g(N)=f(1)+…+f(N) looks very similar to moving from f(x) to so maybe there is a simple relation between the Fourier coefficients of g and those of f.

Gil, I thought I’d let you know that I’m still very interested in this line of attack. I’ve tried a few calculations privately (the kind of thing that is quite hard to put into a blog comment) and not made any serious progress. However, it might be worth trying to collect my thoughts and write something so that others can look at it and perhaps make better suggestions. I’ll try to find the time to do that.

If f(1)+f(2)+…+f(n) is bounded for every n by a constant C then f is “roughly” periodic with a period depending on C.

We would like to have some measure, perhaps based on the Fourier expansion, for what “roughly periodic” is.

The following (non multiplicative) function was already mentioned and I think it will be interesting to understand in what sense the informal suggestion is correct for it.

Divide the integers into blocks of 12 (say; blocks of two are fine too) and in each block take at random 6 +1’s and 6 -1’s.

Is this function roughly periodic in some sense? Is this detected in some way by the Fourier coefficients?

(I vaguely think that there is some sort of trichotomy: a) “roughly periodic” b) Fourier coefficients \hat f(t) are supported on t’s corresponding to small periods c) Fourier coefficients leak to “large periods” where they become irrelevant. So the function above seems to be in class c). )
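The block construction in this comment can be sketched as follows, as a starting point for looking at its Fourier coefficients numerically. The function names and the seed parameter are illustrative. The partial sums are bounded by half the block length, because every block sums to zero.

```python
import random


def block_balanced_sequence(num_blocks, block_len=12, seed=None):
    """Sequence of +-1 in which each block has equally many +1s and -1s.

    Within each block the positions of the +1s are chosen uniformly at
    random. block_len is assumed to be even.
    """
    rng = random.Random(seed)
    half = block_len // 2
    seq = []
    for _ in range(num_blocks):
        block = [1] * half + [-1] * half
        rng.shuffle(block)
        seq.extend(block)
    return seq


def max_abs_partial_sum(seq):
    """Largest |f(1) + ... + f(n)| over all prefixes of seq."""
    total, best = 0, 0
    for v in seq:
        total += v
        best = max(best, abs(total))
    return best


seq = block_balanced_sequence(1000, seed=3)
print(max_abs_partial_sum(seq))
```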

An interesting question raised by Sune (if I remember correctly) is whether if is a bounded multiplicative function and is the same as except that you change it at one prime, then must be unbounded. Here is a rather weak (but easy to prove) result in that direction. Let be a complex multiplicative function with bounded partial sums, let be a prime, and let be a constant. I shall show that it is possible to change the value of so as to make a function with a partial sum that is at least .

To do this, I define a function as follows: is the partial sum of up to except over multiples of . Equivalently, it is the partial sum of the multiplicative function that is obtained from by changing the value of to zero.

Note that

Now for any we can easily choose such that has modulus at least 1/2 for every and is zero when . To do this, we observe that has modulus 1, that at least one out of and has modulus at least 1/2 (all the time my assumption is that for every ), that if is 1 or 2, then at least one of or has modulus at least 1/2, and so on. In this way, we build up a number such that every time you divide by and take the integer part you have a number such that , as desired.

But now, by Parseval, if you take the mean square modulus of

over all possible choices of , then you get at least , which implies that there exists a choice of for which you get at least .

At the time of writing I don’t see an obvious compactness argument to get a single that works for every . However, here is a sketch of a not completely obvious and not completely checked argument.

By continuity, the proof above gives not just a single , but an interval of values of that work (if I choose such that , say). The problem I then face is that to apply Parseval I need to run over the entire circle. But if we just let run over an interval, then on the other side we get not the norm of all the but rather the norm of the convolution of that sequence with the Fourier transform of that interval (though we could take a nicer function supported in that interval, if e.g. we wanted a Fejér kernel rather than a Dirichlet kernel). But it seems to me (and this is what I have not checked) that we have enough flexibility to ensure that the norm of such a convolution can be made arbitrarily large: at each stage we have the chance to add one of two vectors in that have a distance apart that is bounded away from zero. I don’t quite see it yet, but am trying to write this straight out of my head. Anyhow, it seems to me that something like this should work.

I have a strategy for thinking about multiplicative functions. It’s not so much a strategy for how to prove things as for how to attempt to prove things, but it still feels moderately promising. (I’m not using “moderately” in the mathematician’s sense of “very” but in its usual sense — roughly half way between a complete waste of time and something to get excited about.)

Let me set up some notation. I’ll fix a multiplicative function , and since I need a bit of generality in the discussion I’ll say that the values it takes belong to the set . (The point is that I’m interested in -valued functions, but it is instructive to think about why the proposed argument fails for Legendre characters.) Given such a function, let us define to be the partial sum , and for any prime , let us define to be the “miss-out-q partial sum” .

Then we have the identity (which I have also mentioned in various other comments)

The expression on the right-hand side looks like an infinite sum but of course it isn’t really, since all the terms are zero once is larger than .
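The displayed identity did not survive in this copy, so here is a small numerical check of what it presumably says. This is only a sketch: it assumes the definitions are $F(x)=\sum_{n\le x}f(n)$ and $F_q(x)=\sum_{n\le x,\ q\nmid n}f(n)$, and that the identity is $F(x)=\sum_{i\ge 0}f(q)^iF_q(\lfloor x/q^i\rfloor)$, which follows for completely multiplicative $f$ by writing each $n$ uniquely as $q^im$ with $q\nmid m$. The Liouville function is used purely as a convenient test case, and all the function names are mine.

```python
def liouville(n):
    """Liouville function: (-1)^(number of prime factors of n, with multiplicity)."""
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    if n > 1:
        count += 1
    return -1 if count % 2 else 1


def partial_sum(f, x):
    """F(x): sum of f(n) over 1 <= n <= x (assumed definition)."""
    return sum(f(n) for n in range(1, x + 1))


def miss_out_q_sum(f, q, x):
    """F_q(x): sum of f(n) over 1 <= n <= x with q not dividing n (assumed definition)."""
    return sum(f(n) for n in range(1, x + 1) if n % q != 0)


def identity_rhs(f, q, x):
    """Sum over i of f(q)^i * F_q(floor(x / q^i)); terms vanish once q^i > x."""
    total, power, i = 0, 1, 0
    while power <= x:
        total += f(q) ** i * miss_out_q_sum(f, q, x // power)
        power *= q
        i += 1
    return total


if __name__ == "__main__":
    for q in (2, 3, 5):
        ok = all(partial_sum(liouville, x) == identity_rhs(liouville, q, x)
                 for x in range(1, 200))
        print(q, ok)
```

The loop stops as soon as $q^i>x$, which is the remark above that the sum is really finite.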

The general strategy at this point is to try to choose a value of for which the above expression is large. Intuitively, it feels as though this ought to be possible, since we can proceed as follows. We try to pick a sequence of integers between 0 and q-1, with , in such a way that for some the sum

is large. Since for each we have q choices for the number (given the choices we have already made for ), we ought to be able to do something simple like making the value of as big as we can when i is even and as small as we can when is odd (here I’m imagining that but one can also look at the case where , in which case one might consider trying to maximize at every stage).

Needless to say, a simple-minded strategy of this kind doesn’t work. However, it seems to me that the non-working of this strategy ought to place some pretty severe constraints on the function , which is what interests me about the argument.

To get some idea of what these constraints might be like, it helps to look at examples where we know that the strategy must fail. It is here that it is useful to allow to take the value zero. Let’s consider the example where is the function 1, -1, 0, 1, -1, 0, 1, -1, 0, …, otherwise known as the Legendre character mod 3. Since this function is completely multiplicative and has all its partial sums equal to 0 or 1, the argument is certainly guaranteed to fail. But how does it fail?

Let us begin by taking , a case I’ve already mentioned on this blog. In that case, goes like this (starting with the value at 0): 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, … In other words, it is 1 if x is congruent to 1, 2, 3 or 4 mod 6 and 0 otherwise. We also have . So we now try to build up a sequence of numbers , with the constraint that each is either or , and we try to make large. (In terms of my earlier notation, , q=2, and each is 0 or 1.)

To analyse this situation, let us form a directed graph on the set . We have an edge from to if and only if it is possible for to equal r mod 6 and to equal . The condition for this is that should be or mod 6. So 0 is joined to 0 and 1, 1 is joined to 2 and 3, 2 is joined to 4 and 5, 3 is joined to 0 and 1, 4 is joined to 2 and 3, and 5 is joined to 4 and 5.
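The edge list above is easy to regenerate mechanically. The following sketch builds the graph on the residues mod 6 from the rule "r goes to 2r and 2r+1 mod 6"; the function name and its parameters are mine, introduced only for illustration.

```python
def successors(r, q=2, modulus=6):
    """Residues mod `modulus` reachable from r via x -> q*x + d with 0 <= d < q."""
    return sorted({(q * r + d) % modulus for d in range(q)})


# Directed graph on {0, 1, ..., 5}: r -> 2r and r -> 2r + 1 (mod 6).
graph = {r: successors(r) for r in range(6)}

if __name__ == "__main__":
    for r, targets in graph.items():
        print(r, "->", targets)
```

Changing `q` and `modulus` gives the corresponding graph for other choices, which may be useful for the other example mentioned below.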

Now let us partition the vertices into the following three sets: , , and . Note that every edge from goes to , and no edge from goes to itself. This means that if we write out a sequence of s, s and s according to which set falls into, we have the rules that must always be followed by , can never be followed by , and there is a third rule that I have not mentioned, which is that can never be followed by . This means that the sequence will look something like this: BCBCAAABCABCBCBCBCAAAABC … In other words, it is an arbitrary sequence made out of As and BCs.

How does this correspond to the sum we are trying to evaluate? Well, we go along the sequence and at the th term, if we have a B or a C then we add , and if we have an A then we add 0. (That is because if and only if belongs to either B or C.) Therefore, the sum can never be bigger than 1 in modulus.
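Here is a brute-force check of this argument. It is a sketch under assumptions, because the three sets and the sign did not survive in this copy: I take A = {0, 5}, B = {1, 4}, C = {2, 3} (the only partition consistent with the three rules and with B and C together being the residues 1, 2, 3, 4), I take the i-th term of the sum to carry the sign $(-1)^i$, and I start every sequence from 0. All names are mine.

```python
import re
from itertools import product

# Assumed reconstruction of the partition of the residues mod 6.
LETTER = {0: "A", 5: "A", 1: "B", 4: "B", 2: "C", 3: "C"}


def sigma(x):
    """The miss-out-2 partial sum: 1 if x is 1, 2, 3 or 4 mod 6, else 0."""
    return 1 if x % 6 in (1, 2, 3, 4) else 0


def walk(digits):
    """Start at 0 and repeatedly replace x by 2x + d for each binary digit d."""
    x, xs = 0, []
    for d in digits:
        x = 2 * x + d
        xs.append(x)
    return xs


def check(max_len=10):
    """Largest |alternating sum| over all walks with at most max_len steps."""
    pattern = re.compile(r"(A|BC)*B?")  # As and BCs, possibly cut off after a B
    worst = 0
    for length in range(1, max_len + 1):
        for digits in product((0, 1), repeat=length):
            xs = walk(digits)
            word = "".join(LETTER[x % 6] for x in xs)
            assert pattern.fullmatch(word), word
            total = sum((-1) ** i * sigma(x) for i, x in enumerate(xs))
            worst = max(worst, abs(total))
    return worst


if __name__ == "__main__":
    print(check())
```

Each BC pair contributes two consecutive terms of opposite sign, which is why the only way to get a non-zero total is to stop just after a B.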

Because this comment is getting quite long, I’ll use a separate comment to discuss a different example.

The example I had in mind was what you get if you take instead of but keep the same as above. But in the hours since I wrote the comment, I’ve come to see a bit more clearly what is happening, so I’ll try to describe the general phenomenon instead.

As I write, I remember a serious weakness about this kind of argument, at least as so far presented, which is that the only way it exploits multiplicativity is when it comes to multiplication by . It’s just possible that this could be OK, since one might start by choosing a with some property, the choice depending on the function . But even then it feels like too little to be considering.

Nevertheless, let us consider a general recipe for creating bounded functions such that it is not possible to find a sequence such that each is of the form with and . (If you want an alternating sum, then that can be done too.) The idea is to build up as follows. First choose the values of between 1 and q-1. (We know that .) The difference between successive terms should always be when we do this, and we keep the values between and . Now for each between and we choose the values between and by taking the value of to be . Again, we insist that lies between and . Next, we define to be . In general, we let be an arbitrary function defined on finite sequences of elements of the set with the property that for every the successive values of differ by as varies, and then we define

Then if all values of are confined to a set , it follows that all values of are confined to the set . Also, if , then is equal to , which also belongs to the set .

The one other thing we must ensure when we choose the function is that is always when is not a multiple of and 0 when it is. I haven’t quite worked out a general criterion for this, but I think it is reasonably straightforward.

The first is that if anyone feels like writing an easy program, then one thing I’d very much like to see is a plot on a log-versus-log scale of the partial sums of a random multiplicative function. What I want to know is whether it is noticeably different from a random walk, or whether regular oscillations magically show up.
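For anyone who wants to try this, here is a minimal sketch of how the data could be generated. It assumes that "random multiplicative function" means a completely multiplicative function whose values at the primes are independent uniform choices from {-1, 1}; the function names and the `seed` parameter are mine. Plotting is left out so that the code needs only the standard library.

```python
import random
from itertools import accumulate


def random_completely_multiplicative(limit, seed=0):
    """Return a list f with f[n] in {-1, 1} for 1 <= n <= limit (f[0] is unused).

    Values at primes are independent random signs; every other value is
    forced by complete multiplicativity. Requires limit >= 1.
    """
    rng = random.Random(seed)
    # Sieve for the smallest prime factor of each n.
    spf = list(range(limit + 1))
    for p in range(2, int(limit ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    f = [0] * (limit + 1)
    f[1] = 1
    for n in range(2, limit + 1):
        p = spf[n]
        f[n] = rng.choice((-1, 1)) if p == n else f[p] * f[n // p]
    return f


def partial_sums(f):
    """List whose entry k-1 is f(1) + ... + f(k)."""
    return list(accumulate(f[1:]))


if __name__ == "__main__":
    sums = partial_sums(random_completely_multiplicative(10**5, seed=1))
    print(max(abs(s) for s in sums))
```

Feeding the absolute values of `partial_sums(...)` to any plotting tool with logarithmic axes would give the requested picture, and an ordinary random walk for comparison is just `accumulate` applied to independent random signs.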

The second is to revisit the question of whether a multiplicative function that takes values if is not a multiple of 3 and 0 if it is, and also has bounded partial sums, must be the function 1, -1, 0, 1, -1, 0, 1, -1, 0, … As Terry, and possibly others, pointed out, this looks as hard as the main question, since if you take any -valued multiplicative function with bounded partial sums and replace its values at multiples of 3 by 0, then you will get a new multiplicative function with bounded partial sums. But if a -valued multiplicative function with bounded partial sums has to equal when is congruent to 1 mod 3 and when is congruent to -1 mod 3, then it has to be either or , which is a contradiction since these two functions do not have bounded partial sums.

The conclusion from this is that if there is a -valued multiplicative function with bounded partial sums, then there is a counterexample to the question about functions that are 0 at multiples of 3, so a positive answer to that question would imply a positive answer to EDP for multiplicative functions.

Despite this, one might argue that there is a small chance that looking at this problem could be a good approach. The reason is that at least with this problem we have an extremal example that we want to push for, whereas in the original problem we do not believe in any example at all (of a multiplicative function with bounded partial sums).

And here is a counterargument: we would need in particular to prove that there is no multiplicative function that’s zero on multiples of 3 and otherwise and has bounded partial sums, such that . How would we prove that? We have the problem that there is no extremal example to push for, because we do not believe that such a function exists at all. So the difficulty has not gone away after all.

A further counterargument is that all the usual difficulties arise: we can find functions with such that the partial sums grow logarithmically, we can use greedy algorithms to try to keep the partial sums small, and so on.

On random multiplicative functions: I have done some experiments with them and found that they looked superficially fairly similar to a normal random walk — at least, I couldn’t discern any magical large-scale oscillations. Here are two sample plots (log-log plots of the absolute partial sums of a random -valued multiplicative function and a random -valued function):

Just to add to the first point above, it would also be good to have a log-versus-log plot of an actual random walk, because it’s not completely obvious how it should look. That would make the comparison with the same thing for a random multiplicative function much easier.

As ever, it is fascinating to get this kind of visual information. There seems to be a significant instability in the random multiplicative functions, since by no stretch of the imagination could one say that they all look roughly the same. But then again, the random walks look rather different too, so I’m not quite sure what to say. (I still feel that the qualitative differences in behaviour are more striking for the random multiplicative walks, though.) Perhaps it would be interesting to see whether the local behaviour of the partial sums of a random multiplicative function is roughly Brownian. One could do this by plotting such functions on ordinary linear scales between say 10^6 and 10^6+10^3. (Of course, the actual random walks could just be between 0 and 1000.)

I still think that a very interesting sub-project would be to try to come up with some intuitive reason for the apparent growth that showed up in some of the functions produced by greedy algorithms. Also, I still haven’t quite given up on the idea of getting something better by using cleverer algorithms.

Actually, that gives me a thought. Suppose we tried a greedy algorithm but in Alec’s hexagonal lattice set-up. Perhaps we could start with , just to force the complex numbers to appear, and then define thereafter in order to minimize the modulus of the partial sum up to that point, always taking it to be a sixth root of unity. (Or we could go for all complex numbers of modulus 1, but then we would lose the exactness.) The oscillatory behaviour of the functions we have produced in the real case suggests to me that we might get some nice spirals showing up.
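A rough sketch of what such a greedy algorithm might look like, keeping everything exact by working in the hexagonal lattice. The starting value is missing from this copy, so the code assumes $f(2)=\zeta$ with $\zeta=e^{i\pi/3}$; it also assumes the function is completely multiplicative and that ties are broken in favour of the smallest exponent. The representation and all names are mine.

```python
# zeta = exp(i*pi/3). The number a + b*zeta is stored as the integer pair (a, b),
# and the six powers of zeta follow from the relation zeta^2 = zeta - 1.
ZETA_POWERS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]


def norm(a, b):
    """|a + b*zeta|^2, which is the exact integer a^2 + ab + b^2."""
    return a * a + a * b + b * b


def greedy_hexagonal(limit, first_exponent=1):
    """Greedy completely multiplicative f(n) = zeta ** exponent[n], 1 <= n <= limit.

    At each odd prime the exponent is chosen to minimise the modulus of the
    partial sum up to that prime. Requires limit >= 1.
    """
    exponent = [0] * (limit + 1)
    sums = [(0, 0)] * (limit + 1)
    a, b = 1, 0                      # partial sum after n = 1, since f(1) = 1
    sums[1] = (a, b)
    for n in range(2, limit + 1):
        p = next(d for d in range(2, n + 1) if n % d == 0)  # smallest prime factor
        if p < n:
            exponent[n] = (exponent[p] + exponent[n // p]) % 6
        elif n == 2:
            exponent[n] = first_exponent
        else:
            exponent[n] = min(
                range(6),
                key=lambda k: norm(a + ZETA_POWERS[k][0], b + ZETA_POWERS[k][1]),
            )
        da, db = ZETA_POWERS[exponent[n]]
        a, b = a + da, b + db
        sums[n] = (a, b)
    return exponent, sums


if __name__ == "__main__":
    exponent, sums = greedy_hexagonal(2000)
    print(max(norm(a, b) for a, b in sums[1:]))
```

The pairs in `sums` can be converted to points in the plane via $(a+b/2,\ b\sqrt3/2)$ in order to look for spirals.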

I think that partial sums of a random multiplicative function should look locally Brownian if you go high enough. Given , then for sufficiently large , the values will look like independent Rademacher random variables, since there will usually be primes such that forces $f(N+i)$ and no others. (I’m waving my hands vaguely in the direction of the Erdős-Kac theorem…)

I’m gradually coming to realize that at least some of what I called “qualitatively different behaviour” of different instances of random walks or random multiplicative walks is nothing of the kind: it just reflects the fact that on a log scale if you are near the origin then everything is blown up a lot. So now I’m wondering if there is some way of displaying the information that would be better. One obvious idea is just to display it with ordinary axes: exponentially slowing oscillations will still be apparent if they occur, but the local behaviour will now be the same everywhere in the random case.

Basically I’m saying that I think I made a mistake when I suggested plotting things on a log-versus-log scale. Apologies.

Let be completely multiplicative, -valued with bounded partial sums and let be completely multiplicative, the same as except that you change it on one fixed prime . Then has unbounded partial sums.

To see this, we count the occurrences of 1 and -1 in . For we have the estimates with the bound in the discrepancy condition. This implies for all bounded functions. For the number of occurrences changes by counting the sign swaps. Evaluating the sum and taking limits yields .

That seems too easy to be true. If it was, we could try to use the inclusion-exclusion principle to get stronger results. Maybe even uniqueness.

I don’t understand this argument. Some of the sign swaps will go in one direction and some in another, whereas your calculation seems to be assuming that all sign swaps involving a given go in the same direction. Or have I misunderstood?

For completely multiplicative functions -subsequences are multiples of the original sequence. Thus we have some control on the direction of flips and can use this to get a recursion for the partial sums (g and f are related by flipping f(p)):

.
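The displayed recursion is missing from this copy. One relation of this type that certainly holds, and which may or may not be the one intended, is $G(x)+f(p)\,G(\lfloor x/p\rfloor)=F(x)-f(p)\,F(\lfloor x/p\rfloor)$, where $F$ and $G$ are the partial sums of $f$ and $g$: both sides equal the sum of $f$ over the $n\le x$ not divisible by $p$. The sketch below checks it numerically, using the Liouville function as a stand-in for $f$; all names are mine.

```python
def liouville(n):
    """Liouville function: (-1)^(number of prime factors of n, with multiplicity)."""
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    if n > 1:
        count += 1
    return -1 if count % 2 else 1


def flip_at(f, p):
    """The completely multiplicative g that agrees with f except that g(p) = -f(p)."""
    def g(n):
        v, m = 0, n
        while m % p == 0:
            m //= p
            v += 1
        return f(n) * (-1) ** v
    return g


def prefix_sums(f, limit):
    """List S with S[x] = f(1) + ... + f(x) and S[0] = 0."""
    sums = [0]
    for n in range(1, limit + 1):
        sums.append(sums[-1] + f(n))
    return sums


def recursion_holds(f, p, limit):
    """Check G(x) + f(p) G(x // p) == F(x) - f(p) F(x // p) for all x <= limit."""
    F = prefix_sums(f, limit)
    G = prefix_sums(flip_at(f, p), limit)
    return all(G[x] + f(p) * G[x // p] == F[x] - f(p) * F[x // p]
               for x in range(1, limit + 1))


if __name__ == "__main__":
    print(recursion_holds(liouville, 3, 1000))
```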

This puts some constraint on the possible unboundedness of the partial sum if we flip at a single prime. For example, if we flip we go from to our (still nameless) record holder example with partial sum and vice versa. Can we jump from – to -growth?

I have a few questions which I hope are distinct from others that have been asked:

Q1: Is it possible for a -valued completely multiplicative function to have a negative average value? In [Granville and Soundararajan 2001], they show that a completely multiplicative function has partial sum (up to ) uniformly bounded below by . For fixed , it is relatively easy to find an which achieves this lower bound. But for a fixed , can this lower bound hold for all ? If not, how close can one come to the lower bound?

Q1a: Can one prove results about the time of a “record high/low” of the sum?

Q1b: Can one prove ergodicity results about the values of ? I suppose this has been the implicit theme for some time. That is, we have chosen different weight functions over which we average the values of (averaging of course being analogous with integration). Is there any better or more natural weight function than those that we have considered? Can stronger/weaker ergodicity results be proven?

Q2: Suppose has finite discrepancy. How well can it correlate with characters? This is a rather imprecise question but it is motivated by Theorem 2.1 of [BGS- Multiplicative Functions in Arithmetic Progressions]. In words, I interpret this theorem [perhaps incorrectly] as follows: cannot be “totally uncorrelated” with all characters, because the left side is (essentially) , but the right hand side is for the distance from to along . Though it looks nice as written, this bound may be rather ineffective.

————-

Finally, a trivial observation: Suppose we have a bounded, completely multiplicative function. Then, there exists such that for all , can no longer achieve any record highs or lows. In other words, after a while, behaves better than it did for lower . This last statement is clearly nonsensical, and I would have hoped it could be exploited in the proposed limiting arguments. However, I’m sure this (rather vague) leveraging scheme has been tried already.

I know we looked at these sequences a few hundred posts ago, but I don’t recall anybody digging to the bottom. This isn’t the bottom, but maybe…

What’s the discrepancy of the sequence ?

I considered the partial sums of such sequences in math.NT/0308087, but haven’t kept up with further progress on the problem.

First with the obvious: with rational, the sequence is periodic and so has linear discrepancy.

Less obvious: with $\alpha$ irrational, we can get the partial sums to grow arbitrarily slowly, and it’s possible logarithmic growth is typical.

Here are some details on that. Let , and let be the continuants (denominators of convergents) to . Then, taking to be the convergent with maximal (but strictly smaller than ), we get

Here’s an example (from math.NT/0308087), if the continued fraction of is , with odd and all the other even, then all of the convergents have odd numerators, so that . Thus, if , we get

Further, if is odd, and if is even.

These combine to give the following: let $q_i$ be the largest continuant not larger than N. Then

.

If the continuants grow super exponentially, then this gives sublogarithmic growth of $S_N$. An example is

So if we could find such an $\alpha$ all of whose multiples had nice continued fractions, then we could get sublogarithmic growth. On the other hand, it’s likely that if such an existed, only the parities of the $a_i$ would be relevant, and so we could choose them to grow arbitrarily rapidly, thus achieving arbitrarily slow growth of discrepancy.

On the third hand, this system of parities (odd, then all even) for the partial quotients is not the only system that is workable, just the only one that I’ve worked.

That’s very interesting! I suppose the challenge is to show that we cannot find $\alpha$ with the super exponential growth of the continued fraction expansion for each multiple. Now, I could not understand the comment about parities.

I’m confused. I thought examples like this had square-root growth. Here, roughly, was my argument. By the pigeonhole principle, for any m we can find such that is within of an even integer. But then the value of at the first multiples of is the same, so the partial sum along that HAP is , and we reach it by time .
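To make the pigeonhole argument concrete, here is a small sketch. It assumes that the sequence under discussion is $x_n=(-1)^{\lfloor n\alpha\rfloor}$ (the formula is missing from this copy) and takes $\alpha=\sqrt2$ purely as an example, since that lets the floors be computed exactly with integer square roots. The function names are mine.

```python
from math import isqrt, sqrt


def x(n):
    """(-1)^floor(n*sqrt(2)); the floor is isqrt(2 n^2), so this is exact."""
    return -1 if isqrt(2 * n * n) % 2 else 1


def dist_to_even(t):
    """Distance from the real number t to the nearest even integer."""
    return abs(t - 2 * round(t / 2))


def best_r(m):
    """The r in 1..m for which r*sqrt(2) is closest to an even integer."""
    return min(range(1, m + 1), key=lambda r: dist_to_even(r * sqrt(2)))


def hap_sum(r, terms):
    """Sum of x along the first `terms` multiples of r."""
    return sum(x(k * r) for k in range(1, terms + 1))


if __name__ == "__main__":
    r = best_r(60)
    print(r, dist_to_even(r * sqrt(2)), hap_sum(r, 50))
```

With m = 60 this picks r = 41, for which $41\sqrt2$ is just below 58, and the first 50 terms along the multiples of 41 all have the same sign, so the partial sum along that HAP has modulus 50 by time 2050.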

You’re not confused. I was focused on getting the partial sums small, without looking yet at the other HAPs.

So the 2 dimensional () version suffers from the same type of defect, but less so. Map n into by , and split into squares with side length . By pigeonhole, there’s an with both within of an even integer. It follows then that , so that . I’m under the impression that (excusing the constant) this pigeonhole argument gives the correct size of r, but we could perhaps get large discrepancy with a less strict r.

That’s an interesting example. It’s not obvious that one actually needs both and to be close to an even integer for the partial sums along multiples of to get large, however. Isn’t it enough if we just have and with and very close to each other? Then the parities of and will be the same for a long time. But the pigeonhole argument gives that in time rather than time .

You’re right, but I think one can argue like this: if and are often close to an integer then either you can replace r by 2r and get them always close to an integer (when is close to an integer plus 1/2) or the parity is the same more than 50% of the time, so you get linear growth in the partial sums. I haven’t checked that carefully, but the more general point I think holds (as you also seem to be suggesting) that a slight tweaking of the argument should be enough.

What I meant by “right in principle” is that asking for to both be close to 0 mod 2 is asking for much more than we need. I still don’t see what we need to get square-root growth, though. I do agree that there must be *some* tweak that finishes this.

We can’t guarantee that the first case (they are close to an integer) happens unless we go out to .

In the other case, for generic , the sequence is u.d. for every $r$, so I don’t see linear growth at all.

Do you think it’s square-root growth for the general question (infinitely many , tending to 0)?

Let me try to be more precise than I was in my previous comment. I still think square-root growth happens with the and example, but maybe trying to prove it properly will cause me to realize some mistake.

I argue as follows. First let us pick some such that is at most mod 2. Now let us ask when it can possibly be that and have different parities. We know that and differ by at most mod 2, so a necessary condition is that they should both be within of an integer.

Ah, I now see that we are talking slightly at cross purposes. I certainly don’t say that we choose some , once and for all, and that for that we get linear growth. What I’m saying is that if you just look at the partial sums up to you can get growth of around by choosing an appropriate that depends on .

Going back to the argument, just about the only way that multiples of can be close to an integer at least half the time is if is close to a multiple of 1/2. If that is the case, then is close to an integer, and hence so is , which means that the parities stay the same for a long time. If, on the other hand, multiples of are close to an integer under half the time, then the parities are roughly the same over half the time and we get a linear lower bound.

That’s still not properly detailed, but maybe now it is clearer what I meant.

I was about to write that something similar ought to work for more numbers, but actually I now don’t find that obvious at all, even for three numbers. The only easy way I can see to get to have a parity that’s heavily biased is to do something “too strong” like finding with and very close mod 2 and very close to 0 mod 2. So I am coming round to being interested in your example after all. (That’s not meant as a rude comment — I was interested in it, but just didn’t see how it could work. But with this small modification I no longer see how to show that it doesn’t.)

Going back to your original post, the same issue arises. You say that if all multiples of have continued fraction expansions of a certain kind then we get sublogarithmic growth. But what that means is not that there’s some uniform upper bound of for the partial sum of the first terms along any HAP, but rather that for each there is an upper bound of the form along the multiples of .

The bottom line of the original post is this: If an irrational $\alpha$ has a continued fraction $[a_0;a_1,\dots]$ with $a_0$ odd and all other $a_i$ even, then

where $q_i$ is the largest continuant (denominator of a convergent) smaller than $N$. Since the denominators have to grow at least as fast as the Fibonacci numbers, we get for an absolute constant . The denominators can be made to grow arbitrarily fast, giving arbitrarily slow growth to the partial sums.
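As a numerical illustration, $\sqrt2=[1;2,2,2,\dots]$ has exactly this shape, so its partial sums should stay small. The sketch below assumes that the sequence is $x_n=(-1)^{\lfloor n\alpha\rfloor}$ and that $S_N$ denotes its $N$-th partial sum; neither formula survives in this copy, and the function names are mine.

```python
from itertools import accumulate
from math import isqrt


def x(n):
    """(-1)^floor(n*sqrt(2)), computed exactly since floor(n*sqrt(2)) = isqrt(2 n^2)."""
    return -1 if isqrt(2 * n * n) % 2 else 1


def partial_sums(N):
    """[S_1, ..., S_N] with S_k = x(1) + ... + x(k)."""
    return list(accumulate(x(n) for n in range(1, N + 1)))


def max_abs_partial_sum(N):
    """Largest |S_k| over 1 <= k <= N."""
    return max(abs(s) for s in partial_sums(N))


if __name__ == "__main__":
    for N in (10, 10**2, 10**3, 10**4, 10**5):
        print(N, max_abs_partial_sum(N))
```

This only looks at the partial sums of the sequence itself, not along other HAPs, which is the distinction drawn in the surrounding comments.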

So if we had a real all of whose multiples had this special shape, then we would have a sequence whose discrepancy is at most logarithmic. As noted a few comments ago, though, the discrepancy must be at least , thereby proving that there is no $\alpha$ with such special multiples.

The point I’m trying to make is that one has to be careful about what “all of whose multiples had this special shape” means. For example, if you take the number , then all its multiples will have continuants that grow very fast indeed. However, this will not be a uniform statement, so although the growth rate for any given multiple will be slow, the initial growth rate may be large. To put that another way, there’s an important difference between the eventual growth rate and, say, the minimum ratio between any pair of successive continuants. But perhaps it was the latter that you were talking about all along.

None of this affects the point that it still seems to be interesting to think about the growth rate along HAPs of sequences of the form .

Any {-1,1}-sequence is of the form
Given a sequence we want to find a set of s that gives this sequence:
Start with the empty set of s. If is the first term where the original sequence and the sequence obtained from the s differ, add a to the set of s. Continue this, possibly infinitely many times.

Of course this might still be a good way to look at the sequences. We could assume that much faster than . Or that the sum is finite.

I want to suggest a generalization of the EDP. This is slightly more general than what I suggested in the last paragraph of this comment. I want to generalize EDP from sequences indexed by to sequences indexed by , the set of sequences of non-negative integers such that the sequence is constant 0 from some point. We think of the sequences as the powers in the prime factorization, so if and we define . So we say that divides if .

Now the formulation should be that for any sequence over {-1,1} (of course this could also be T or some other subset of a vector space) and any there is such that.
But for this to make sense we need an ordering on . This ordering should fulfill: If and then and for any the set should be finite.

I think (but I'm far from sure) that Terry's reduction from general sequences to multiplicative sequences works for this general problem. So if there exists a sequence with bounded discrepancy for a given ordering of , there must exist a multiplicative sequence over T with bounded discrepancy (wrt. the same ordering of ). Can anyone confirm this?

I'm going to post another comment where I will explain why I think this problem is interesting.

The reason I think this generalization is interesting is that there are orderings of where the multiplicative EDP over {-1,1} fails: Let’s call the elements primes. Now we set and and define and to be an irrational number. We use these logs to define an ordering in the obvious way. We let multiplicativity define the values at and anytime the discrepancy becomes greater than 1, we define the next prime to be slightly less than where the problem arises. This way we can keep the discrepancy at 1.

So anytime we have a strategy for proving EDP, we must be sure that the strategy doesn’t work for this ordering on . You might think that we should use that the sequence of primes grows too quickly in the natural numbers, but here is another critical ordering:

For any we let . This function gives us a partial ordering defined by . If we use the lexicographic ordering. We now define . I think this gives a sequence with discrepancy 1, but I haven’t found a proof.

In this example the sequence of primes grows very fast. One thing to notice in this example is that you can’t define the ordering only by looking at the logs of the primes (unless you use infinitesimals): We have so but for any n so .

Do there exist orderings "defined using only logs" where the sequence of primes grows about as fast as in the natural numbers and where the EDP fails?

One obvious ordering on is: . For this ordering the problem is just the EDP. One nice thing about this ordering is that for any , the “numbers” (elements in l_0) divisible by a are exactly every nth number, where . For a general ordering, the numbers divisible by a are not even periodic. It seems natural to use the fact that, say, every 3rd number is divisible by 3 in a proof of EDP. This motivates the question:

Are there any other orderings such that exactly every 3rd number is divisible by some prime?
and more generally
Are there any other orderings and any and such that the numbers divisible by are exactly every nth number?

I tried an approach which I may talk more about later, but the following toy question (which might even be related to Sune’s extensions) came up.

Suppose that you consider all numbers that can be presented as with their usual order, so that and ignore all other numbers. (So, in words, in case the latex won’t compile, all the integers that can be written as the product of prime powers where you take the first d primes and the exponents at most m.)

We give each such number a sign and we want to show that the discrepancy on some “HAP” x 2x 3x …. is large. (But note that the sequence 1,2,… takes only those integers left alive).
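A small sketch of this set-up may help in experimenting with it. It takes the verbal description literally: the surviving numbers are the products of powers of the first d primes with every exponent at most m, and the "HAP" of x is read as the surviving multiples of x in increasing order. That reading of "HAP", and all the names below, are my assumptions.

```python
from itertools import product


def first_primes(d):
    """The first d primes, by trial division."""
    primes, n = [], 2
    while len(primes) < d:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes


def alive(d, m):
    """Sorted list of products p_1^e_1 ... p_d^e_d with 0 <= e_i <= m."""
    primes = first_primes(d)
    values = []
    for exps in product(range(m + 1), repeat=d):
        v = 1
        for p, e in zip(primes, exps):
            v *= p ** e
        values.append(v)
    return sorted(values)


def hap(x, numbers):
    """The surviving multiples of x, in increasing order."""
    return [y for y in numbers if y % x == 0]


def discrepancy(sign, numbers):
    """Largest |partial sum| of sign[y] along any HAP within `numbers`."""
    worst = 0
    for x in numbers:
        total = 0
        for y in hap(x, numbers):
            total += sign[y]
            worst = max(worst, abs(total))
    return worst


if __name__ == "__main__":
    numbers = alive(2, 2)
    print(numbers)
    print(discrepancy({y: 1 for y in numbers}, numbers))
```

The set has $(m+1)^d$ elements, so exhaustive search over sign patterns is feasible only for very small d and m.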

The reason this problem might be easier is that the (multiplicative) Fourier tools may go further.

Gil, I’m not quite sure what your question is, since you seem to be talking about a finite set of integers. (By the way, do you mean ?) To get a lower bound on the discrepancy as a function of and would clearly be interesting. But I suspect that we can probably get bounded discrepancy if we restrict to numbers of the form for fixed and arbitrary, perhaps by some variant of the constructions discussed in the comments following this post — though it isn’t immediately obvious how…