Given a set $S=\{s_1,s_2,\dots,s_n\}$ with $n$ elements, I want to separate two elements, say $s_1,s_2$, in $S$ by repeatedly applying set partition operations. Each operation randomly partitions the subset currently containing both $s_1$ and $s_2$, say $A$, into two non-empty subsets $B$ and $C$ such that $A=B\cup C$ and $B\cap C=\emptyset$. The randomness of the partition means that $B$ and $C$ are random.

I want to calculate or approximate the expected number of partition operations, $E(n)$, needed to separate $s_1$ and $s_2$.

Let's look at two simple cases where $n$ is small.

(1) $n=2$:

In this case, $S=\{s_1,s_2\}$. The only feasible partition will separate $S=\{s_1,s_2\}$ as $\{s_1\}$ and $\{s_2\}$. So, $E(2) = 1$.

(2) $n=3$:

In this case, $S=\{s_1,s_2,s_3\}$. There are two possible situations when performing the first partition:

(a) if the first partition is $\{s_1\},\{s_2,s_3\}$ or $\{s_2\},\{s_1,s_3\}$, then one partition suffices.

(b) if the first partition is $\{s_1,s_2\},\{s_3\}$, then I need a second partition splitting $\{s_1,s_2\}$ into $\{s_1\},\{s_2\}$, so two partitions are needed.

The probability of situation (a) is $2/3$ and that of situation (b) is $1/3$, since each of the three possible partitions is equally likely. So, $E(3)=1\cdot(2/3)+2\cdot(1/3)=4/3$.
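The two small cases above can be sanity-checked with a quick Monte Carlo run. This is only a sketch under an assumption: every split of the current block into two non-empty subsets is taken to be equally likely, which matches the $2/3$ versus $1/3$ split used for $n=3$. The names `separation_steps` and `estimate` are made up for illustration.

```python
import random

def separation_steps(n, rng):
    """Count splits until s_1 and s_2 land in different blocks.

    Assumption: each split of the current block into two non-empty
    subsets is equally likely.
    """
    size = n   # size of the block still containing both s_1 and s_2
    steps = 0
    while True:
        # Flip a fair coin per element and reject the all-equal outcomes;
        # this is uniform over splits into two non-empty subsets.
        while True:
            bits = [rng.random() < 0.5 for _ in range(size)]
            if any(bits) and not all(bits):
                break
        steps += 1
        if bits[0] != bits[1]:   # positions 0 and 1 stand for s_1 and s_2
            return steps
        # s_1 and s_2 are still together: continue with their block only.
        size = sum(1 for b in bits if b == bits[0])

def estimate(n, trials, seed=0):
    """Average number of splits over independent simulated runs."""
    rng = random.Random(seed)
    return sum(separation_steps(n, rng) for _ in range(trials)) / trials

print(estimate(2, 1000))      # every run takes exactly one split
print(estimate(3, 100_000))   # should land near 4/3
```

For $n=2$ every run takes exactly one split, and for $n=3$ the estimate should be close to $4/3$, up to sampling noise.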

I tried using a recursive formula, but it does not seem to have a closed form. I also wonder whether $E(n)$ can be approximated by some continuous function.

Could you elaborate on the random process? Here are a few options about what you could mean. (a) At each step, refine the partition by picking a block and splitting it into two blocks. Among all such partitions, pick uniformly at random [so big blocks are more likely to be split up]. Or perhaps you mean (b) do (a), but first pick the block to split up with each block being equally likely. Or perhaps you mean the very different (c) pick sets $S_1, S_2, S_3, \dots$ independently uniformly at random [with/without replacement], and consider minimal non-empty sets in the topology generated by them.
– Pat Devlin, May 10 at 13:41

Hi @Devlin, each partition is performed on the particular subset containing $s_1,s_2$, and once $s_1,s_2$ are separated the partition process stops. So I don't think it is (a). The block, i.e. the subset, is that specific one, not a random one, so it is not (b). It does not seem to be (c) either. I will edit the question to clarify the process.
– oNgStrIng, May 10 at 13:48

1 Answer

First answer

It’s very close to 2, but it’s a little less.

Consider the following closely related process (which isn’t the same).

Easier process: at each step, take the set containing $s_1$ and $s_2$, and randomly divide it into two sets (one of which may be empty), with all such divisions being equally likely. Keep going until the elements are separated.

If we run the easier process, then at every step the chance that it ends at that step is exactly $1/2$ (independent of set size), so the number of steps is geometric with success probability $1/2$. So the ‘easier process’ lasts exactly 2 steps in expectation.
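As a small check of the $1/2$ claim, here is a brute-force sketch. It assumes a "division" means assigning every element to one of two labelled sides, with empty sides allowed; the function name `separation_probability` is just for illustration.

```python
from fractions import Fraction
from itertools import product

def separation_probability(m):
    """Exact chance that one step of the easier process separates s_1 and s_2
    when the current set has m elements (assumption: every assignment of the
    m elements to two labelled sides is equally likely, empty sides allowed)."""
    total = 0
    separated = 0
    for sides in product((0, 1), repeat=m):
        total += 1
        if sides[0] != sides[1]:   # positions 0 and 1 stand for s_1 and s_2
            separated += 1
    return Fraction(separated, total)

for m in range(2, 9):
    p = separation_probability(m)
    # A geometric number of steps with success probability p has mean 1/p.
    print(m, p, 1 / p)
```

Treating divisions as unordered pairs instead gives the same probability, since every unordered division corresponds to exactly two labelled assignments.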

Your process is the same as the easier process, except that whenever the easier process makes a trivial cut (where nothing happens), you don’t count that as a step. So we can just run the easier process and then ignore every step that did nothing.

With this coupling, we see your process never takes more steps than the easier one, so your expected value is at most 2.

And if $n$ is large, these processes are very close, so 2 will be a very good approximation. If you’re happy saying the expected value is between 1 and 2, then this does it. If you want to argue the expected value is in fact very close to 2, we’d have to think just a little bit more. Let me know if this works for you.

More exact answer

Let $E(n)$ denote the expectation you care about. Then we have the recurrence $E(2) = 1$ and for all $n > 2$,

$$E(n) = 1 + \sum_{k=2}^{n-1} \frac{\binom{n-2}{k-2}}{2^{n-1}-1}\, E(k),$$

which follows since $\binom{n-2}{k-2} / (2^{n-1} -1)$ is the probability that after the first split, $s_1$ and $s_2$ are still together, and they are in a set of cardinality $k$. From this, you might be able to find a generating function for $E(n)$ if you really wanted to.
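To get actual numbers, here is a minimal sketch that evaluates the recurrence with exact rational arithmetic. It assumes the recurrence has the form $E(n) = 1 + \sum_{k=2}^{n-1} \binom{n-2}{k-2}\,E(k) / (2^{n-1}-1)$, which is what the probability above gives and which reproduces $E(3)=4/3$ from the question. The function name `expected_splits` is made up.

```python
from fractions import Fraction
from math import comb

def expected_splits(n_max):
    """Return {n: E(n)} for 2 <= n <= n_max as exact fractions.

    Assumed recurrence: E(2) = 1 and, for n > 2,
    E(n) = 1 + sum_{k=2}^{n-1} C(n-2, k-2) / (2^(n-1) - 1) * E(k).
    """
    E = {2: Fraction(1)}
    for n in range(3, n_max + 1):
        denom = 2 ** (n - 1) - 1   # number of splits into two non-empty blocks
        E[n] = 1 + sum(
            Fraction(comb(n - 2, k - 2), denom) * E[k] for k in range(2, n)
        )
    return E

E = expected_splits(12)
for n in sorted(E):
    print(n, E[n], float(E[n]))
```

Using `Fraction` avoids rounding, so the small cases can be compared directly with hand computations.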

But from this recurrence, it's not difficult to prove by induction that $2 - \frac{2}{n} \leq E(n) \leq 2 - (2^{n-1}-1)^{-1}$ for all $n \geq 2$. The base case ($n=2$) is clear. For the induction step we use the identity, valid for any constant $A$,

$$1 + \sum_{k=2}^{n-1} \frac{\binom{n-2}{k-2}}{2^{n-1}-1}\left(2 - \frac{A}{k}\right) = 2 - \frac{1}{2^{n-1}-1} - \frac{A(n-2)}{n(n-1)}.$$

By induction, using the above with $A=2$ gives a lower bound on $E(n)$ [which happens to be at least $2 - 2/n$ for all $n$], and setting $A=0$ gives an upper bound on $E(n)$ of $2 - (2^{n-1} -1)^{-1}$.
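The two bounds can also be checked numerically for a range of $n$. The sketch below assumes the recurrence $E(n) = 1 + \sum_{k=2}^{n-1} \binom{n-2}{k-2}\,E(k)/(2^{n-1}-1)$ with $E(2)=1$; it is not a substitute for the induction, and `check_bounds` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def check_bounds(n_max):
    """Check 2 - 2/n <= E(n) <= 2 - 1/(2^(n-1) - 1) for 2 <= n <= n_max.

    E(n) is computed exactly from the assumed recurrence
    E(n) = 1 + sum_{k=2}^{n-1} C(n-2, k-2) / (2^(n-1) - 1) * E(k), E(2) = 1.
    Returns (all_bounds_hold, table_of_E).
    """
    E = {2: Fraction(1)}
    for n in range(3, n_max + 1):
        denom = 2 ** (n - 1) - 1
        E[n] = 1 + sum(
            Fraction(comb(n - 2, k - 2), denom) * E[k] for k in range(2, n)
        )
    for n in range(2, n_max + 1):
        lower = 2 - Fraction(2, n)
        upper = 2 - Fraction(1, 2 ** (n - 1) - 1)
        if not (lower <= E[n] <= upper):
            return False, E
    return True, E

ok, E = check_bounds(40)
print(ok)
print(float(E[10]), float(E[40]))   # the gap to 2 is at most 2/n
```

Both bounds hold with equality at $n=2$, and the lower bound is also attained at $n=3$, where $E(3) = 4/3 = 2 - 2/3$.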

@Devlin Thank you very much! I thought it might be a large number when $n$ is large enough. But it does make sense that the difficulty of separating two elements may not grow so quickly with $n$ and is bounded above by 2.
– oNgStrIng, May 11 at 13:33

Glad to help. Does this fully answer the question for you?
– Pat Devlin, May 11 at 23:08