This is an excerpt from the algorithms textbook How to Think About Algorithms by Jeff Edmonds (This book is a gem by the way).

I follow his conclusion that merge/quick/heap sorts take $O(N\log N)$ operations with respect to $N$, the number of elements in the input list, while at the same time they are linear in the sense that they take $O(n)$ operations with respect to $n$, the number of bits needed to represent the input list. These are two different models of computation, and it's a nice way to describe one algorithm under both.

But my question is about this line:

> Assuming that the $N$ numbers to be sorted are distinct, each needs $\log N$ bits to be represented, for a total of $n = \Theta(N\log N)$ bits.

My understanding of this was that with $N$ distinct numbers, we need word size $w = \Theta(\log N)$, as defined in CLRS. We want the word size to be large enough to index $N$ different elements, but not so large that we can pack the whole input into one word. Also, with $N$ distinct elements, we need $\Omega(\log N)$ bits to represent the largest number. Assuming each element fits in one word of $w$ bits, Edmonds' claim that $n = \Theta(N\log N)$ bits made sense. Please correct me if my analysis is wrong up to here.

But when I try to apply this to counting sort, something doesn't seem right. With $N$ again the number of elements in the input and $k$ the value of the maximum element in the list, the running time is $O(N + k)$. Counting sort is linear with respect to $N$, the number of elements in the input, when $k = O(N)$. Using this constraint to express the input size as a total number of bits $n$, I think $n = O(N\log k) = O(N\log N)$.
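As a concrete reference point, here is a minimal counting sort sketch (the function name and structure are my own, not from the book) in which the $O(N + k)$ operation count is visible directly in the two loops:

```python
def counting_sort(a):
    """Sort a list of non-negative integers in O(N + k) time,
    where N = len(a) and k = max(a)."""
    if not a:
        return []
    k = max(a)
    counts = [0] * (k + 1)          # O(k) initialization
    for x in a:                     # O(N): tally each value
        counts[x] += 1
    out = []
    for v, c in enumerate(counts):  # O(N + k): emit each value c times
        out.extend([v] * c)
    return out
```

Note that the $O(k)$ terms depend on the *value* of the maximum element, not on how many elements there are, which is exactly where the bit-complexity question gets interesting.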

So in the RAM model of computation, how can I express the running time of counting sort with respect to $n$ bits? Merge/quick/heap sorts had time complexity $O(n)$ with respect to $n$ bits, as Edmonds expressed cleanly. I am not sure how to do something similar for counting sort, and perhaps also for radix sort using counting sort as a subroutine. Any idea how to do this in the two cases when $k = \Theta(N)$ and when this condition is absent? I suspect the former will give some kind of polynomial time with respect to $n$ and the latter exponential (hence pseudo-polynomial) time with respect to $n$ bits, but I have trouble expressing the mathematics...

1 Answer

In the RAM model of computation, machine words are $\Theta(\log n)$ bits long (where $n$ is the size of the input in bits), and operations on machine words can be done in $O(1)$. Counting sort has running time $O(N + k)$ in the RAM model assuming that each value is $\Theta(\log n)$ bits long.

Let us now consider an array consisting of $N$ values, the maximal value of which is $k$. We would need $\log k$ bits to store each value. Assuming that we use $\Theta(\log k)$ bits per value, the input length is $n = \Theta(N\log k)$ bits, and so a machine word is $\log N + \log\log k + O(1)$ bits long. Let us denote by $w$ the number of machine words needed to store a single entry, which we can calculate by
$$ w = \left\lceil \frac{\log k}{\log N + \log\log k} \right\rceil = \Theta\left(\max\left(1,\frac{\log k}{\log N + \log\log k}\right)\right). $$
The running time of counting sort is $O(Nw + k)$.
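To connect this with the question's framing in terms of $n$ bits, one way to carry the calculation through (my own sketch, not part of the original answer, using the assumption $n = \Theta(N\log k)$ from above) is as follows. When $k = \Theta(N)$,
$$ n = \Theta(N\log N) \quad\Longrightarrow\quad N = \Theta\!\left(\frac{n}{\log n}\right), $$
so the running time $O(N + k) = O(N)$ becomes $O(n/\log n)$ word operations, i.e. polynomial (essentially linear) in $n$. Without that constraint, we only know $\log k = \Theta(n/N)$, so $k$ can be as large as $2^{\Theta(n/N)}$, which is exponential in $n$ when $N$ is small relative to $n$; this is exactly the pseudo-polynomial behavior suspected in the question.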

From the CLRS definition and this post I thought a word is $w = \Theta(\log N)$ bits, where $N$ is the number of elements, not $n$, the size of the input in bits. Am I missing some connection here? Which definition should I follow? (I will go through your answer after you reply and will probably have additional questions that I will leave as comments here)
– namesake22, Jan 7 '18 at 16:11


The definition in CLRS is circular. There is no ambiguity if, for example, the input is an array of $N$ words. If a word takes $\Theta(\log N)$ bits then the total input size is $n = \Theta(N\log N)$, and so $\log n = \Theta(\log N)$.
– Yuval Filmus, Jan 7 '18 at 16:13


The usual assumption is that each element fits in a constant number of words. If this is not the case, you have to specify it.
– Yuval Filmus, Jan 7 '18 at 16:22

Here is what Wikipedia has to say: In computational complexity theory, a numeric algorithm runs in pseudo-polynomial time if its running time is a polynomial in the length of the input (the number of bits required to represent it) and the numeric value of the input (the largest integer present in the input).
– Yuval Filmus, Jan 8 '18 at 16:25