[Solved]: How asymptotically bad is naive shuffling?

Problem Detail: It’s well-known that this ‘naive’ algorithm for shuffling an array, which swaps each item with another randomly chosen one, doesn’t work correctly:

for (i = 0; i < n; i++)   swap(A[i], A[random(n)]);   // random(n): uniform integer in [0, n)
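For concreteness, here is a runnable Python sketch of the naive shuffle above, alongside the Fisher-Yates fix the question mentions (the function names are mine, chosen for illustration):

```python
import random

def naive_shuffle(a):
    """The biased shuffle from the question: swap each position i
    with a uniformly random position drawn from [0, n)."""
    n = len(a)
    for i in range(n):
        j = random.randrange(n)      # j uniform in [0, n)
        a[i], a[j] = a[j], a[i]
    return a

def fisher_yates_shuffle(a):
    """The standard fix: swap position i with a uniformly random
    position drawn from [i, n) instead."""
    n = len(a)
    for i in range(n):
        j = random.randrange(i, n)   # j uniform in [i, n)
        a[i], a[j] = a[j], a[i]
    return a
```

Both return a permutation of their input; only the second produces all $n!$ permutations with equal probability.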

Specifically, since at each of the $n$ iterations one of $n$ choices is made (with uniform probability), there are $n^n$ possible ‘paths’ through the computation; because the number of possible permutations $n!$ doesn’t divide the number of paths $n^n$ evenly, it’s impossible for this algorithm to produce each of the $n!$ permutations with equal probability. (Instead, one should use the so-called Fisher-Yates shuffle, which essentially replaces the call to choose a random number from $[0..n)$ with a call to choose a random number from $[i..n)$; that’s moot to my question, though.)

What I’m wondering is: how ‘bad’ can the naive shuffle be? More specifically, letting $P(n)$ be the set of all permutations and $C(\rho)$ be the number of paths through the naive algorithm that produce the resulting permutation $\rho \in P(n)$, what is the asymptotic behavior of the functions

$$M(n) = \frac{n!}{n^n}\max_{\rho\in P(n)} C(\rho)$$

and

$$m(n) = \frac{n!}{n^n}\min_{\rho\in P(n)} C(\rho)?$$

The leading factor is to ‘normalize’ these values: if the naive shuffle were ‘asymptotically good’, then

$$\lim_{n\to\infty}M(n) = \lim_{n\to\infty}m(n) = 1.$$

I suspect (based on some computer simulations I’ve seen) that the actual values are bounded away from 1, but is it even known whether $\lim M(n)$ is finite, or whether $\lim m(n)$ is bounded away from 0? What’s known about the behavior of these quantities?
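For small $n$, the quantities above can be computed exactly by enumerating all $n^n$ choice sequences. A Python sketch of this brute force (my own, not from the question):

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import factorial

def path_counts(n):
    """For each permutation rho, count how many of the n^n choice
    sequences ('paths') of the naive shuffle produce it: C(rho)."""
    counts = Counter()
    for choices in product(range(n), repeat=n):
        a = list(range(n))
        for i, j in enumerate(choices):
            a[i], a[j] = a[j], a[i]   # step i of the naive shuffle
        counts[tuple(a)] += 1
    return counts

def normalized_extremes(n):
    """Return (M(n), m(n)) as exact fractions."""
    counts = path_counts(n)
    scale = Fraction(factorial(n), n ** n)
    return scale * max(counts.values()), scale * min(counts.values())

# For n = 3, the 27 paths split as counts of 4 or 5 over the 6
# permutations, giving M(3) = 10/9 and m(3) = 8/9.
print(normalized_extremes(3))
```

This is only feasible for small $n$ (the loop is $\Theta(n \cdot n^n)$), but it suffices to check the first few values against simulations.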

Asked By : Steven Stadnicki

Answered By : Peter Shor

We will show by induction that the permutation $\rho_n = (2,3,4,\ldots,n,1)$ is an example with $C(\rho_n) = 2^{n-1}$. If this is the worst case, as it is for the first few $n$ (see the notes for OEIS sequence A192053), then $m(n) \approx (2/e)^{n}$, up to a polynomial factor. So the normalized min, like the normalized max, is ‘exponentially bad’.

The base case is easy. For the induction step, we need a lemma:

Lemma: In any path from $(2,3,4,\ldots,n,1)$ to $(1,2,3,\ldots,n)$, either the first move swaps positions $1$ and $n$, or the last move swaps positions $1$ and $n$.

Proof Sketch: Suppose not. Consider the first move that involves the $n$’th position, and assume that it is the $i$’th move, with $i \neq 1$ and $i \neq n$. This move must place the item $1$ in the $i$’th place. Now consider the next move that touches the item $1$, say the $j$’th move. This move must swap positions $i$ and $j$, moving the item $1$ into the $j$’th place, with $i < j$. A similar argument shows that the item $1$ can only subsequently be moved to the right. But the item $1$ needs to end up in the first place, a contradiction. $\square$

Now, if the first move swaps positions $1$ and $n$, the remaining moves must take the permutation $(1,3,4,5,\ldots,n,2)$ to $(1,2,3,4,\ldots,n)$. If the remaining moves don’t touch the first position, then this is the permutation $\rho_{n-1}$ in positions $2\ldots n$, and we know by induction that there are $C(\rho_{n-1}) = 2^{n-2}$ paths that do this. An argument similar to the proof of the Lemma shows that there is no path that touches the first position, as the item $1$ must then end up in the incorrect position.

If the last move swaps positions $1$ and $n$, the first $n-1$ moves must take the permutation $(2,3,4,\ldots,n,1)$ to the permutation $(n,2,3,4,\ldots,n-1,1)$. Again, if these moves don’t touch the last position, then this is the permutation $\rho_{n-1}$, and by induction there are $C(\rho_{n-1}) = 2^{n-2}$ paths that do it. And again, if one of the first $n-1$ moves here touches the last position, the item $1$ can never end up in the correct place.

Thus, $C(\rho_n) = 2C(\rho_{n-1}) = 2^{n-1}$.
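The induction can be checked mechanically for small $n$: count the choice sequences that take $\rho_n = (2,3,\ldots,n,1)$ to the identity and compare against $2^{n-1}$. A brute-force Python sketch (my own; it stores the 1-indexed values of the answer's notation in a 0-indexed list):

```python
from itertools import product

def paths_to_identity(start):
    """Count the naive-shuffle choice sequences that map `start`
    to the sorted (identity) arrangement."""
    n = len(start)
    identity = sorted(start)
    count = 0
    for choices in product(range(n), repeat=n):
        a = list(start)
        for i, j in enumerate(choices):
            a[i], a[j] = a[j], a[i]   # move i swaps positions i and j
        if a == identity:
            count += 1
    return count

for n in range(2, 6):
    rho_n = list(range(2, n + 1)) + [1]   # (2, 3, ..., n, 1)
    print(n, paths_to_identity(rho_n), 2 ** (n - 1))
```

Each printed row should show the brute-force count agreeing with $2^{n-1}$, matching the induction above.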

Question Source : http://cs.stackexchange.com/questions/6519