[Solved]: How again do certain sorting methods use $o(n \log n)$ time?

Problem Detail: I hope this question isn't too 'soft' for here. It's been a while $\tiny{\text{an eternity by some people's standards}}$ since I've touched this stuff, and I had a convincing explanation for this question eight minutes ago but must have misplaced it, so please help me find it again.

Most classical sorting algorithms one reads about are comparison sorts, in which a single comparison produces a binary outcome (let's break ties arbitrarily). As I remember it, the motivation for these is that they are general, requiring only that an ordering be prescribed over the items, and insensitive to how the items are encoded; but they suffer from an $\Omega(n \log n)$-time barrier simply because they must perform enough comparisons to distinguish the $n!$ possible orderings.

One then meets algorithms such as radix sort, which is supposed to beat this under caveats such as the space of items being discrete and bounded, and assuming the ability to retrieve any decimal (or other-base) digit in constant time. But as I recently thought about it, this operation is essentially a $b$-way switch as opposed to the (ordinary) $2$-way kind used in comparison sorts. Then radix sort should be subject to the same lower bound on time taken, since the foregoing yields an improvement of at most a factor of $(\log b)/(\log 2)$. So how is radix sort ever preferable to one of the standard comparison sorts? I know this argument must fail somewhere, but not where. In my experience this kind of paradox typically dissolves when formalised, but I would rather have an intuitive reason that explains it away, if one is available.
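For concreteness, the counting argument behind the barrier can be spelled out (a standard derivation, not part of the original question): a decision tree whose internal nodes are $b$-way switches and whose depth is $d$ has at most $b^d$ leaves, and distinguishing all $n!$ orderings requires

$$d \ge \log_b(n!) = \frac{\log_2(n!)}{\log_2 b} = \Omega\!\left(\frac{n \log n}{\log b}\right),$$

so for any fixed $b$, a $b$-way switch saves only the constant factor $(\log b)/(\log 2)$ over binary comparisons, as noted above.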

Asked By : Vandermonde

Answered By : Yuval Filmus

Radix sort cannot be reduced to a comparison-based sort. If the alphabet is binary, then in some sense you are doing 2-way comparisons, since each bit is either 0 or 1. If there are $n$ numbers of width $w$, radix sort takes time $O(nw)$. Therefore, as long as $w \ll \log n$, radix sort will be faster than any comparison-based sort.

What does the condition $w \ll \log n$ mean? If numbers have width $w$, then they are in the range $0$ to $2^w-1$. When $w \ll \log n$, there are fewer than $n$ possible numbers, so some numbers repeat. In this regime you can beat comparison-based sorting.

Another algorithm beating comparison-based sorting in the regime where there are fewer than $n$ possible numbers is counting sort. Suppose that the domain is of size $m$. We initialize an array of size $m$ and count how many elements of each type there are. We can then unpack this histogram into the sorted array. This runs in time $O(n+m)$. If $m \ll n$, this algorithm runs in optimal $O(n)$ time.

In both of these cases, the algorithms cannot be rephrased in the decision-tree model (try doing it!), and so they aren't subject to the limitations of comparison-based sorting. No paradoxes here.
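For concreteness, both algorithms can be sketched in a few lines of Python (a minimal illustration under the assumptions above: nonnegative integers of known bounded width or domain; the function names and signatures are mine, not from the answer):

```python
def counting_sort(a, m):
    """Sort integers drawn from range(m) in O(n + m) time:
    build a histogram, then unpack it into a sorted list."""
    counts = [0] * m
    for x in a:
        counts[x] += 1
    out = []
    for value, count in enumerate(counts):
        out.extend([value] * count)
    return out

def radix_sort(a, w, base=2):
    """Sort w-digit numbers in the given base using w stable bucketing
    passes, least-significant digit first: O(n * w) for a fixed base."""
    for d in range(w):
        buckets = [[] for _ in range(base)]
        for x in a:
            buckets[(x // base ** d) % base].append(x)
        a = [x for bucket in buckets for x in bucket]
    return a
```

Note that neither routine ever compares two elements to each other; each pass only indexes into an array by a value or a digit, which is why they cannot be expressed as decision trees over comparisons.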

Question Source : http://cs.stackexchange.com/questions/35164