I have the following problem: suppose there is an infinite list that is fully populated up to a certain point $k$, after which it is empty. I need to suggest an algorithm that finds $k$ while checking as few entries of the list as possible (asymptotically), and prove that it is indeed of minimal complexity.

One idea I had was to check the entries at positions $2^i$ for $i\in\mathbb{N}$ until I find an $n\in\mathbb{N}$ such that entry $2^n$ is populated and entry $2^{n+1}$ isn't, then run a binary search on the entries between those two positions. This would find $k$, but I have no idea whether it's of minimal complexity, or how to even begin to analyze that question.
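For concreteness, the doubling-then-binary-search idea can be sketched in Python. The oracle `populated(i)` is a hypothetical stand-in for checking whether entry $i$ of the list is filled (assumed `True` for all $i \le k$ and `False` afterwards):

```python
def find_k(populated):
    """Return the largest index k with populated(k) == True,
    assuming populated(i) is True for all i <= k and False for i > k.
    `populated` is a hypothetical oracle modelling a list lookup."""
    if not populated(1):
        return 0  # list is empty from the start
    # Doubling phase: find n with populated(2**n) but not populated(2**(n+1)).
    n = 0
    while populated(2 ** (n + 1)):
        n += 1
    # Binary search on [2**n, 2**(n+1)); invariant: populated(lo), not populated(hi).
    lo, hi = 2 ** n, 2 ** (n + 1)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if populated(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

For example, `find_k(lambda i: i <= 100)` returns `100` after checking only a logarithmic number of entries.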

3 Answers

You cannot do better than $O(\log_2 k)$. In any range $[2^n,2^{n+1})$ there are $2^n$ different possible values of $k$. At least one of those values must take $n$ tests, because an algorithm that always used fewer than $n$ tests could produce at most $2^{n-1}$ distinct sequences of test results, and hence distinguish at most $2^{n-1}$ values.

So for some $k\in [2^n,2^{n+1})$ the number of steps is at least $n=\left\lfloor\log_2 k\right\rfloor$.

Even if binary search is the optimal solution within the range $\left[2^{n},2^{n+1}\right)$, how do I also work into the complexity the part where I check the entries at positions $2^i$, and show that the combination is still optimal?
– Serpahimz, Apr 5 '13 at 20:03

It takes $1+\log_2 k$ trials to find $n$ such that $k\in[2^n,2^{n+1})$, and then $\log_2 k$ trials to do the binary search. This all assumes the integer arithmetic is $O(1)$, which is, I suppose, unrealistic, but it gives your algorithm a complexity of $O(\log k)$.
– Thomas Andrews, Apr 5 '13 at 20:05

To paraphrase the problem: The adversary selects some natural number and provides you with an oracle which will tell you whether a number of your choice is greater or less than the secret number. You want to find the secret in the fewest possible tries.

The first question to consider is what a reasonable complexity measure would be. Average complexity is out, because it would depend on the probability distribution with which the adversary selects numbers, and there's no "natural" probability distribution on $\mathbb N$ we can assume in the absence of better information.

The only thing that seems to make good sense is considering how the running time of an algorithm grows as a function of the adversary's unknown number. We could consider ordinary asymptotic growth here (that is, ignoring constant factors), but since "number of questions to the oracle" is a well-defined unit of complexity, we can be slightly more precise and only ignore constant terms.

The counting argument Thomas Andrews presents shows that any correct algorithm must use at least $\log_2 k \pm c$ questions in the limit to locate the number $k$, for some constant $c$.

The doubling-then-binary-search algorithm you suggest will use $2\log_2 k \pm c$ questions in the limit -- namely, first $\log_2 k$ questions to determine the size of $k$, and then $\log_2 k$ questions to discover its lower-order bits. The constant $c$ depends on how large your initial guess is; you can get as large a negative $c$ as you like simply by making your initial guess in the doubling phase large. (This is why we might as well ignore constant terms in the number of questions; they can be traded for more time spent in a bounded initial range of $k$s.)

What's the best $a$ such that there's an algorithm that uses at most $a\log_2 k + c$ questions in the limit? So far we have the bounds $1\le a\le 2$; can we narrow down that interval? I don't know.
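The $2\log_2 k$ behaviour can be seen concretely by counting oracle questions. The sketch below uses a hypothetical helper `queries_used` that runs the doubling phase and the binary search against an oracle modelled as `i <= k`, tallying every question asked:

```python
import math

def queries_used(k):
    """Count how many oracle questions the doubling + binary-search
    algorithm asks to locate k. The oracle is modelled as i <= k."""
    calls = 0

    def populated(i):
        nonlocal calls
        calls += 1
        return i <= k

    if populated(1):
        # Doubling phase.
        n = 0
        while populated(2 ** (n + 1)):
            n += 1
        # Binary search on [2**n, 2**(n+1)).
        lo, hi = 2 ** n, 2 ** (n + 1)
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if populated(mid):
                lo = mid
            else:
                hi = mid
    return calls

for k in (10, 1000, 10 ** 6):
    print(k, queries_used(k), 2 * math.log2(k))
```

For each $k$ the count stays within a small constant of $2\log_2 k$, matching the analysis above.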

Surely, you can find $k$ in $O(\log k)$ time (for example, using the algorithm you described). The question is: can you do it faster? The answer is no, you can't. The reason is that whatever $k$ is, you need to construct it using arithmetical operations (addition, multiplication, bit shift, etc.). However, you cannot construct such a number in $o(\log k)$ steps, and this gives the lower bound for your algorithm (note that for very large numbers, multiplication takes much more than $O(1)$ time!).
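As a small illustration of the "you must at least write $k$ down" point: the number of binary digits needed to store $k$ grows like $\log_2 k$, which a quick Python check confirms:

```python
import math

# The number of bits needed to write k in binary is floor(log2 k) + 1,
# so merely recording k already costs on the order of log k symbols.
for k in (1, 5, 1000, 10 ** 9):
    bits = k.bit_length()  # digits needed to store k in binary
    assert bits == math.floor(math.log2(k)) + 1
    print(k, bits)
```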

I would appreciate it if you could elaborate more specifically on how you obtain the lower bound.
– Serpahimz, Apr 5 '13 at 20:04

Use your algorithm and show that for any $k$ it takes $O(\log k)$ steps.
– dtldarek, Apr 5 '13 at 20:07

Yes, but how do I show there is no other algorithm with better complexity? I'm quite new to the whole field of algorithms and complexity, so I didn't really understand what you meant about not being able to construct $k$ more efficiently than in $\Omega\left(\log k\right)$ steps.
– Serpahimz, Apr 5 '13 at 20:10

If you were to construct $k$ in memory, you would need to write down every digit, and each such write takes some (constant) amount of time. $k$ has about $\log k$ digits, so you need at least that much time. This works even if you can specify $k$ by some other rules, because there exist numbers whose Kolmogorov complexity equals their length times a constant.
– dtldarek, Apr 5 '13 at 20:15