In probability/statistics, there is a notion of two things being "independent", which would basically mean that any information we can get about one thing has no effect on our (probabilistic) knowledge of the other.

What are the possible notions of "independent" for the natural numbers? Under such a notion, for instance, the properties of being "multiple of 2" and "multiple of 3" are independent, while "multiple of 4" and "multiple of 6" are not, because something being a multiple of 4 means that it is even and that makes it more likely that it is a multiple of 6.

There's no probability measure on the natural numbers where every natural number carries an equal positive weight (no uniform distribution, because the naturals are countably infinite). So the substitute seems to be to look at all the natural numbers up to some natural number $n$, measure the extent of dependence there, and then take the limit as $n \to \infty$. Or, instead of looking at initial segments, we could look at finite intervals of consecutive integers and measure the degree of dependence there. Are there other notions that are qualitatively different, or stronger, or weaker? What notions of independence are most useful for specific applications (such as the distribution of prime numbers, or additive combinatorics)?
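To make the initial-segment idea concrete, here is a small numerical sketch (the helper names are my own, not standard): it compares the density of $A \cap B$ in $\{1, \dots, n\}$ with the product of the densities of $A$ and $B$, for growing $n$.

```python
# Empirical check of "independence via initial segments":
# compare density(A ∩ B) with density(A) * density(B) on {1, ..., n}.

def density(pred, n):
    """Fraction of integers in 1..n satisfying pred."""
    return sum(1 for k in range(1, n + 1) if pred(k)) / n

def dependence_gap(pred_a, pred_b, n):
    """density(A ∩ B) - density(A) * density(B) on 1..n."""
    joint = density(lambda k: pred_a(k) and pred_b(k), n)
    return joint - density(pred_a, n) * density(pred_b, n)

for n in (10**3, 10**5):
    # multiples of 2 vs 3: the gap tends to 0 (independent)
    g23 = dependence_gap(lambda k: k % 2 == 0, lambda k: k % 3 == 0, n)
    # multiples of 4 vs 6: the gap tends to 1/12 - 1/24 = 1/24 (dependent)
    g46 = dependence_gap(lambda k: k % 4 == 0, lambda k: k % 6 == 0, n)
    print(n, g23, g46)
```

For multiples of 2 and 3 the gap vanishes in the limit, while for multiples of 4 and 6 it tends to $1/12 - (1/4)(1/6) = 1/24$, matching the intuition in the question.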

On a related note, is there some way of making sense of the "correlation" between two (infinite) subsets of the natural numbers, playing a role analogous to correlation in probability/statistics? Even if there isn't a numerically rigorous way, is there some way to define a notion of "uncorrelated" for infinite subsets of the natural numbers (hopefully, in a way that independent subsets are uncorrelated)? My guess would be to measure correlations in some suitable way for all numbers up to $n$ and then take the limit as $n \to \infty$.

The first suggestion you make (uniform measure on natural numbers up to $n$, then let $n$ grow) is essentially what is studied under the name "asymptotic density". Google will point you to lots of information about it.
– Mark Meckes, Feb 17 '10 at 0:54

2 Answers

There are some ways to assign probability measures to the set of natural numbers. Consider the probability measure $P_s$ on the positive integers which assigns "probability" $n^{-s}/\zeta(s)$ to the integer $n$. ($s$ is a constant real number greater than $1$.)

Then under this measure, being a multiple of $r$ and being a multiple of $t$ are independent events, in the probabilistic sense, if $r$ and $t$ are coprime (have no common factor). You can show this starting from the fact that the measure assigned to the set of multiples of $k$, for some positive integer $k$, is
$$ {1 \over \zeta(s)} \sum_{n=1}^\infty {1 \over (kn)^s} = {1 \over \zeta(s)} {1 \over k^s} \zeta(s) = {1 \over k^s}. $$
That is, the probability that a random positive integer is divisible by $k$ is $k^{-s}$. Of course you really want all integers to be equally likely, which should correspond to $s = 1$.
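As a numerical sanity check (function names are my own, and the infinite sums are truncated, which is an approximation), one can verify that $P_s$ assigns measure $\approx k^{-s}$ to the multiples of $k$, and that "multiple of 2" and "multiple of 3" are independent under $P_s$:

```python
# Sketch of the zeta measure P_s with P_s({n}) = n^{-s} / zeta(s),
# truncating all sums at N terms (the truncation is my approximation).

def zeta(s, N=10**5):
    """Truncated Riemann zeta sum."""
    return sum(n ** -s for n in range(1, N + 1))

def P_s(pred, s, N=10**5):
    """P_s-measure of {n : pred(n)}, truncated at N."""
    return sum(n ** -s for n in range(1, N + 1) if pred(n)) / zeta(s, N)

s = 2.0
p2 = P_s(lambda n: n % 2 == 0, s)   # close to 2^{-s} = 0.25
p3 = P_s(lambda n: n % 3 == 0, s)   # close to 3^{-s} ~ 0.1111
p6 = P_s(lambda n: n % 6 == 0, s)   # close to 6^{-s}, and to p2 * p3
print(p2, p3, p6, p2 * p3)
```

The product `p2 * p3` agrees with `p6` up to truncation error, illustrating the independence of coprime divisibility events under $P_s$.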

(I learned this from Gian-Carlo Rota, Combinatorial Snapshots. Link goes to SpringerLink; sorry if you don't have access.)

Under "suitable conditions" (which Rota doesn't spell out), the density of a set of natural numbers $A$ is the limit $\lim_{s \to 1^+} P_s(A)$.

In particular it might be reasonable to define correlation between sets of natural numbers in the same way. Let $A$ and $B$ be two sets of natural numbers. Let $X$ and $Y$ be the indicator random variables of the sets $A$ and $B$ in the measure $P_s$. The Pearson correlation coefficient between $X$ and $Y$ is
$$ {(E(XY) - E(X) E(Y)) \over \sigma_X \sigma_Y }$$
where $E$ is expectation and $\sigma$ is standard deviation. Of course this can be simplified in the case where $X$ and $Y$ are indicators (and thus only take the values $0$ or $1$) -- in particular it simplifies to
$$ {P_s(A \cap B) - P_s(A) P_s(B) \over \sqrt{P_s(A) P_s(B) (1-P_s(A)) (1-P_s(B))}} $$
We could then define the correlation between $A$ and $B$ to be the limit of this as $s \to 1^+$.

In the case where $A$ is the event "divisible by 2", for example, and $B$ is the event "divisible by 3", then $A \cap B$ is the event "divisible by 6". So $P_s(A \cap B) = 6^{-s}$, $P_s(A) = 2^{-s}$, and $P_s(B) = 3^{-s}$, so the numerator here is $0$ and so the correlation is zero.

But in the case where $A$ is the event "divisible by 4" and $B$ is the event "divisible by 6", then $A \cap B$ is the event "divisible by 12". So the correlation with respect to $P_s$ is
$$ {12^{-s} - 24^{-s} \over \sqrt{4^{-s} 6^{-s} (1-4^{-s}) (1-6^{-s})}} $$
which has the limit $1/\sqrt{15}$ as $s \to 1^+$; more generally the correlation between being divisible by $a$ and being divisible by $b$ is
$$ {ab - \operatorname{lcm}(a,b) \over \operatorname{lcm}(a,b) \sqrt{(a-1)(b-1)}} $$
and this may or may not be what you want.
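As a sanity check of the closed form above, here is a short script (helper names are mine) that compares it with the Pearson correlation under $P_s$ for $s$ close to $1$, using $P_s(\text{multiple of } k) = k^{-s}$:

```python
from math import lcm, sqrt

def corr_limit(a, b):
    """Closed-form s -> 1+ limit of the correlation between
    'divisible by a' and 'divisible by b'."""
    m = lcm(a, b)
    return (a * b - m) / (m * sqrt((a - 1) * (b - 1)))

def corr_s(a, b, s):
    """Pearson correlation under P_s, using P_s(mult of k) = k^{-s}."""
    pa, pb, pab = a ** -s, b ** -s, lcm(a, b) ** -s
    return (pab - pa * pb) / sqrt(pa * pb * (1 - pa) * (1 - pb))

print(corr_limit(2, 3))     # 0: coprime, hence uncorrelated
print(corr_limit(4, 6))     # 1/sqrt(15) ~ 0.2582, as computed above
print(corr_s(4, 6, 1.001))  # close to the limit above
```

(`math.lcm` requires Python 3.9+.) The value `corr_s(4, 6, s)` approaches `corr_limit(4, 6)` as `s` decreases toward $1$, matching the $1/\sqrt{15}$ computed in the answer.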

Independence has nothing to do with the uniform distribution. It is extremely common to define an infinite sequence of measures whose joint measure is the product measure. It is also common to define finite-dimensional distributions and then use an extension theorem (Daniell-Kolmogorov, Ionescu Tulcea, etc.) to prove the existence of a (unique) measure on the infinite product space.