What is the expected length of the longest consecutive subsequence of a random permutation of the integers $1$ to $n$? To be more precise, we are looking for the longest string of consecutive integers in the permutation (disregarding where this string occurs). I believe the answer should be $\sim c \ln n$, but I have been unable to prove this.

Update: We are looking for the longest increasing subsequence of consecutive integers. That is to say, 9,1,5,2,7,3 has the increasing subsequence 1,2,3.

I'm guessing that, given a random permutation $a_1 a_2 \ldots a_n$, you're looking for the longest increasing substring, i.e. a run of the form $a_{i+1} < a_{i+2} < \cdots < a_{i+r}$, which is a consecutive subsequence of length $r$. It seems to me this would have the property you conjecture.
– Michael Lugo, Mar 2 '10 at 3:02

David's interpretation is what I intended, although this may not have been exactly what I wanted (I need to rethink this). Michael, I think the answer to your interpretation is $2\sqrt{n}$; see jstor.org/pss/2959770.
– Mark Lewko, Mar 2 '10 at 3:09

No, $2\sqrt{n}$ is the typical length of the longest increasing subsequence that is not necessarily consecutive. For consecutive subsequences, the answer is, indeed, logarithmic.
– fedja, Mar 2 '10 at 3:16

Ah, yes! That is what I wanted. I'll revise the question.
– Mark Lewko, Mar 2 '10 at 3:19

4 Answers

The purpose of this answer is to use the second moment method to make rigorous the heuristic argument of Michael Lugo. (Here is why his argument is only heuristic: If $N$ is a nonnegative integer random variable, such as the number of length-$r$ increasing consecutive sequences in a random permutation of $\{1,2,\ldots,n\}$, knowing that $E[N] \gg 1$ does not imply that $N$ is positive with high probability, because the large expectation could be caused by a rare event in which $N$ is very large.)

Theorem: The expected length of the longest increasing block in a random permutation of $\{1,2,\ldots,n\}$ is $r_0(n) + O(1)$ as $n \to \infty$, where $r_0(n)$ is the smallest positive integer such that $r_0(n)!>n$. ("Block" means consecutive subsequence $a_{i+1},a_{i+2},\ldots,a_{i+r}$ for some $i$ and $r$, with no conditions on the relative sizes of the $a_i$.)

Note: As Michael pointed out, $r_0(n)$ is of order $(\log n)/(\log \log n)$.

Proof of theorem: Let $P_r$ be the probability that there exists an increasing block of length at least $r$. The expected length of the longest increasing block is then $\sum_{r \ge 0} r(P_r-P_{r+1}) = \sum_{r \ge 1} P_r$. We will bound the latter sum from above and below.
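As a sanity check on the identity $\sum_{r \ge 0} r(P_r - P_{r+1}) = \sum_{r \ge 1} P_r$, here is a quick empirical verification on a small permutation size (Python; the choices $n = 8$ and the trial count are mine):

```python
import random

# Verify that E[L] equals sum_{r>=1} P(L >= r) for the empirical
# distribution of L, the longest-increasing-block length.
random.seed(4)
n, trials = 8, 50_000
lengths = []
for _ in range(trials):
    p = random.sample(range(n), n)
    best = cur = 1
    for x, y in zip(p, p[1:]):
        cur = cur + 1 if x < y else 1
        best = max(best, cur)
    lengths.append(best)

mean_L = sum(lengths) / trials
tail_sum = sum(sum(L >= r for L in lengths) / trials for r in range(1, n + 1))
print(mean_L, tail_sum)  # the two estimates agree (it's an identity)
```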

Upper bound: The probability $P_r$ is at most the expected number of increasing blocks of length $r$, which is exactly $(n-r+1)/r!$, since for each of the $n-r+1$ values of $i$ in $\{0,\ldots,n-r\}$ the probability that $a_{i+1},\ldots,a_{i+r}$ are in increasing order is $1/r!$. Thus $P_r \le n/r!$. Since consecutive terms $n/r!$ decrease by a factor of at least $2$, comparison with a geometric series gives $\sum_{r > r_0(n)} P_r \le \sum_{r > r_0(n)} n/r! \le n/r_0(n)! \le 1$. On the other hand $\sum_{1 \le r \le r_0(n)} P_r \le \sum_{1 \le r \le r_0(n)} 1 \le r_0(n)$, so $\sum_{r \ge 1} P_r \le r_0(n) + 1$.

Lower bound: Here we need the second moment method. For $i \in \{1,\ldots,n-r\}$, let $Z_i$ be $1$ if $a_{i+1}<a_{i+2}<\ldots<a_{i+r}$ and $a_i>a_{i+1}$, and $0$ otherwise. (The added condition $a_i>a_{i+1}$ is a trick to reduce the positive correlation between nearby $Z_i$.) The probability that $a_{i+1}<a_{i+2}<\ldots<a_{i+r}$ is $1/r!$, and the probability that this holds while $a_i>a_{i+1}$ fails is $1/(r+1)!$, so $E[Z_i]=1/r! - 1/(r+1)!$. Let $Z=\sum_{i=1}^{n-r} Z_i$, so
$$E[Z]=(n-r)(1/r! - 1/(r+1)!).$$
Next we compute the second moment $E[Z^2]$ by expanding $Z^2$. If $i=j$, then $E[Z_i Z_j] = E[Z_i] = 1/r! - 1/(r+1)!$; summing this over $i$ gives less than or equal to $n/r!$. If $0<|i-j|<r$, then $E[Z_i Z_j]=0$ since the inequalities are incompatible. If $|i-j|=r$, then $E[Z_i Z_j] \le 1/r!^2$ (the latter is the probability if we drop the added condition in the definition of $Z_i$ and $Z_j$). If $|i-j|>r$, then $Z_i$ and $Z_j$ are independent, so $E[Z_i Z_j] = (1/r! - 1/(r+1)!)^2$. Summing over all $i$ and $j$ shows that
$$E[Z^2] \le \frac{n}{r!} + E[Z]^2 \le \left(1 + O(r!/n) \right) E[Z]^2.$$
The second moment method gives the second inequality in
$$P_r \ge \operatorname{Prob}(Z \ge 1) \ge \frac{E[Z]^2}{E[Z^2]} \ge 1 - O(r!/n).$$
If $r \le r_0(n)-2$, then $r! \le (r_0(n)-1)!/(r_0(n)-1)$, so $r!/n \le 1/(r_0(n)-1)$, so $P_r \ge 1 - O(1/r_0(n))$. Thus
$$\sum_{r \ge 1} P_r \ge \sum_{r=1}^{r_0(n)-2} \left( 1 - O(1/r_0(n)) \right) = r_0(n) - O(1).$$
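As a numerical sanity check of the theorem, here is a small Monte Carlo sketch (Python; the helper names and the choices $n = 10{,}000$ and $200$ trials are my own):

```python
import random

def longest_increasing_block(perm):
    """Length of the longest run perm[i] < perm[i+1] < ... in the sequence."""
    best = cur = 1
    for x, y in zip(perm, perm[1:]):
        cur = cur + 1 if x < y else 1
        best = max(best, cur)
    return best

def r0(n):
    """Smallest positive integer r with r! > n."""
    r, fact = 1, 1
    while fact <= n:
        r += 1
        fact *= r
    return r

random.seed(0)
n, trials = 10_000, 200
avg = sum(longest_increasing_block(random.sample(range(n), n))
          for _ in range(trials)) / trials
# The theorem predicts the mean is r0(n) + O(1); here r0(10000) = 8.
print(f"r0({n}) = {r0(n)}, simulated mean = {avg:.2f}")
```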

Assuming the interpretation that I gave in a comment above, it suffices to find the length of the longest consecutive subsequence in a sequence of $n$ random variables $(X_1, \ldots, X_n)$, each of which is uniform on $[0,1]$. Such an $n$-tuple determines a permutation as long as we never have $X_i = X_j$; this occurs with probability zero.

The probability that $X_{i+1} < \cdots < X_{i+r}$ is just $1/r!$. So the expected number of increasing consecutive subsequences of length $r$ in a sequence of length $n$ is $n/r!$.
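This count is easy to check by simulation; below is a quick sketch (Python; the choices of $n$, $r$, and the trial count are arbitrary) comparing the empirical number of increasing windows of length $r$ with the exact expectation $(n-r+1)/r!$:

```python
import math
import random

random.seed(1)
n, r, trials = 2000, 4, 500
expected = (n - r + 1) / math.factorial(r)  # exact expected count, about 83.2

total = 0
for _ in range(trials):
    xs = [random.random() for _ in range(n)]
    # Count windows xs[i] < xs[i+1] < ... < xs[i+r-1].
    total += sum(all(xs[i + j] < xs[i + j + 1] for j in range(r - 1))
                 for i in range(n - r + 1))
print(total / trials, expected)
```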

Therefore the length of the longest increasing consecutive subsequence should be near $r_0(n)$, where $r_0 := r_0(n)$ roughly satisfies $r_0! = n$. This is not a constant times $\log n$. However, if we solve $(r/e)^r = n$ for $r$ (this is roughly Stirling's approximation), we get $r = \log(n)/W(\log(n)/e)$, where $W$ is the Lambert W function. As $z \to \infty$, $W(z)$ grows more slowly than $\log z$, so $r_0(n)$ grows a bit faster than $\log n/\log \log n$. It wouldn't be hard to mistake this for a constant times $\log n$ if you were working from numerical data.
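To see how slowly this grows, one can simply tabulate $r_0(n)$ against $\log n/\log\log n$ (Python; the helper name `r0` is my own):

```python
import math

def r0(n):
    """Smallest positive integer r with r! > n."""
    r, fact = 1, 1
    while fact <= n:
        r += 1
        fact *= r
    return r

# r0(n) pulls ahead of log n / log log n, but only very gradually.
for k in (3, 6, 12, 24):
    n = 10 ** k
    approx = math.log(n) / math.log(math.log(n))
    print(f"n = 1e{k:<3} r0 = {r0(n):3d}   log n / log log n = {approx:6.1f}")
```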

Another interpretation of this problem is, given a permutation $\sigma = a_1 a_2 \ldots a_n$, to find the longest sequence $i_1 < i_2 < \cdots < i_r$ such that $a_{i_1}, a_{i_2}, \ldots, a_{i_r}$ are consecutive integers in increasing order. (For example, in 691528347, the numbers 1, 2, 3, 4 appear in that order.) The longest subsequence of $\sigma$ in this sense has the same length as the longest increasing block of $\sigma^{-1}$ in the sense considered earlier in this answer; for example, the inverse of 691528347 is 357841962, which has the increasing consecutive run 3578. So the lengths of the longest such sequences under the two interpretations have the same distribution.
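The correspondence with the inverse permutation can be checked directly on the example above (Python; the function names are my own):

```python
def longest_run_of_consecutive_values(perm):
    """Longest r such that some v, v+1, ..., v+r-1 appear in increasing positions."""
    pos = {v: i for i, v in enumerate(perm)}  # pos[v] = index of value v
    best = cur = 1
    for v in range(2, len(perm) + 1):
        cur = cur + 1 if pos[v] > pos[v - 1] else 1
        best = max(best, cur)
    return best

def longest_increasing_block(perm):
    """Length of the longest contiguous increasing run."""
    best = cur = 1
    for x, y in zip(perm, perm[1:]):
        cur = cur + 1 if x < y else 1
        best = max(best, cur)
    return best

def inverse(perm):
    """Inverse of a permutation of 1..n given as a list."""
    inv = [0] * len(perm)
    for i, v in enumerate(perm):
        inv[v - 1] = i + 1
    return inv

p = [6, 9, 1, 5, 2, 8, 3, 4, 7]               # the example 691528347
print(inverse(p))                              # [3, 5, 7, 8, 4, 1, 9, 6, 2]
print(longest_run_of_consecutive_values(p))    # 4 (values 1,2,3,4 in order)
print(longest_increasing_block(inverse(p)))    # 4 (the block 3,5,7,8)
```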

It's interesting to contrast this with the longest run of $+1$s in a random sequence of $+1$s and $-1$s, which is logarithmic. The events $X_i \lt X_{i+1}$ and $X_{i+1} \lt X_{i+2}$ are not independent.
– Douglas Zare, Mar 2 '10 at 3:33


@Michael: Are you planning to add the details needed to turn your heuristic into a proof? I think that one way would be to use the second moment method.
– Bjorn Poonen, Mar 2 '10 at 4:07

If I understand the question correctly, I think the answer is a little larger than $2$.

I am interpreting the question as looking for the longest substring of the form $i, (i+1), (i+2), \ldots, (i+r)$ occurring contiguously in the permutation. The probability that a given string of $3$ consecutive entries is of this form is $(n-2) \cdot \frac{(n-3)!}{n!} = \frac{1}{n(n-1)} = O(1/n^2)$: there are $n-2$ possible values of $i$, and each fixed triple occupies a given window with probability $(n-3)!/n!$. By linearity of expectation, the expected number of such strings of length $3$ is $O(1/n)$. In particular, the probability that there is a string of length $3$ or more is $O(1/n)$.
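A quick simulation (Python; the parameter choices are arbitrary) confirms that such strings of length $3$ are rare, with frequency of order $1/n$:

```python
import random

def has_consecutive_triple(perm):
    """True if some window perm[j], perm[j+1], perm[j+2] equals i, i+1, i+2."""
    return any(perm[j] + 1 == perm[j + 1] and perm[j + 1] + 1 == perm[j + 2]
               for j in range(len(perm) - 2))

random.seed(2)
n, trials = 100, 20_000
hits = sum(has_consecutive_triple(random.sample(range(1, n + 1), n))
           for _ in range(trials))
# The expected number of such windows is (n-2)/(n(n-1)), roughly 0.0099 here,
# so the hit frequency should be of the same order.
print(hits / trials)
```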

Are you looking for the expected value $E(n)$ of the largest $k$ for which $1,2,\dots,k$ occurs as an increasing subsequence? The probability that $1,2,\dots,k$ occurs in increasing order is $1/k!$, so the probability that $k$ is the largest such value is $\frac{1}{k!}-\frac{1}{(k+1)!}$ (for $k < n$). Thus
$$ \lim_{n\to\infty}E(n) = \sum_{k\geq 1}k\left(\frac{1}{k!}-\frac{1}{(k+1)!}\right) = e-1. $$