[Solved]: Exercise on Divide&Conquer's technique

Problem Detail: I need help on this exercise: You are given an array of n elements, and you notice that some of the elements are duplicates; that is, they appear more than once in the array. Show how to remove all duplicates from the array in time O(n log n). I find a solution but in O(n^2)… How can I apply D&C?

Asked By : Ivankin

Answered By : Rick Decker

Given an array of elements $a[i], 0le ile n-1$, we know we can sort it in time $O(nlog n)$. Then do the following to produce an array $b[;]$:

declare b to be of size n old = a[0] b[0] = a[0] j = 1 for i = 1, ... , n    if a[i] =/= old    // found a new value       old = a[i]      // save it       b[j] = a[i]     // store it in b[j]       j = j + 1       // and increment the b index

It’s clear that, first, this can be done in one pass through the $a$ array, so this part will take time $O(n)$ (along with $O(n)$ time to allocate $b$), and, second, after this part the $b$ array will contain all of the values of the original array, with no duplicate values. Note. This may or may not be what you want. Starting with an initial array $langle,7, 3, 7, 4, 1, 3, 1, 7,rangle$, this will result in a $b$ array $langle,1, 3, 4, 7,rangle$, which has all the values without duplicates, but the original place order of the elements isn’t preserved. There are at least two possibilities for the result, depending on whether you want to eliminate all the duplicate values after the first occurrence, (giving you $langle,7, 3, 4, 1,rangle$) or if you want to eliminate all the duplicate values before the last occurrence, giving you $langle,4, 3, 1, 7,rangle$. For the first option, a fairly simple modification is to do a preprocessing step on the $a$ array, changing it to an array of pairs, $(value, index)$, where $index$ is the position of the element with that value. In the example above, this will give you a new array $$ langle,(7, 0), (3, 1), (7, 2), (4, 3), (1, 4), (3, 5), (1, 6), (7, 7),rangle $$ Now, as we did originally, sort this array by the first element of the pairs, giving $$ langle,(1, 4), (1, 6), (3, 1), (3, 5), (4, 3), (7, 0), (7, 2), (7, 7),rangle $$ (here we need to use a stable sort, like mergesort, to maintain the original place order). Eliminating duplicates, as above, we will have $$ langle,(1, 4), (3, 1), (4, 3), (7, 0),rangle $$ and if we sort the result by the second index, we’ll have $$ langle,(7, 0), (3, 1), (4, 3), (1, 4),rangle $$ and looking at the first element, we’ll have $langle,7, 3, 4, 1,rangle$, which is what we wanted. This algorithm will take $O(n)$ steps to preprocess the array, $O(nlog n)$ steps to sort the result, $O(n)$ steps to eliminate the duplicate values, and $O(nlog n)$ steps to do the final sort, giving us an algorithnm that runs in time $O(n+nlog n+n+nlog n)=O(nlog n)$.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/44006 Ask a Question Download Related Notes/Documents