Counting Inversions Using Merge Sort

Problem Detail: In Corman, Introduction To Algorithms, 3rd edition, question 2-4 it asks to count the number of inversions in a list of numbers in $theta( n lg n )$ time. He uses a modified Merge Sort to accomplish this. However, there is something in his algorithm which seems redundant / unnecessary to me:

MERGE-INVERSIONS(A, p, q, r) n1 = q - p + 1 n2 = r - q let L[1 ... n1 + 1] and R[1 ... n2 + 1] be new arrays for i = 1 to n1     L[i] = A[p + i - 1] for j = 1 to n2     R[j] = A[q + j]  L[n1 + 1] = infinity R[n2 + 1] = infinity i = 1 j = 1 inversions = 0 counted = FALSE for k = p to r     if counted == FALSE and R[j]  < L[i]         inversions = inversions + n1 - i + 1         counted = TRUE     if L[i] <= R[j]          A[k] = L[i]         i++     else A[k] = R[j]          j++         counted = FALSE return inversions

The counted variable seems redundant to me and I would have written the last for loop as follows:

inversions = 0 for k = p to r     if L[i] <= R[j]          A[k] = L[i]         i++     else A[k] = R[j]          inversions = inversions + n1 - i + 1         j++ return inversions

What am I missing, or is counted really unnecessary?

Asked By : Robert S. Barnes

Answered By : Raphael

We can show that after every iteration of the for-loop in question, counted is FALSE. Therefore, inversions = inversions + n1 – i + 1 is executed if and only if j++ is executed in the same iteration (both are guarded by R[j] < L[i]). Since neither i nor j is changed between evaluation of the two if conditions, this implies that your version is equivalent.

We perform an induction over k. Before the loop, counted is explicitly set to FALSE. In any given iteration, we start (by induction hypothesis) with counted == FALSE. If L[i] <= R[j], counted is not changed throughout the current iteration. If the opposite is the case, counted is first set to TRUE, but reset to FALSE in the else block. For this, we note that the truth value of R[j] < L[i] remains unchanged from the first to the second if.

I guess the extra block is there for didactic reasons.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/9858

Asked By : Robert S. Barnes

Answered By : Raphael

Best Answer from StackOverflow

Related