The problem is NP-Hard, and I’m trying to find a bound on how well does the greedy algorithm perform.
I know it looks a lot, but please bare with me. I pretty much did most of the work, I’m just missing that last small piece. Here is the pseudo code: Input: $U$ – set of elements, $F$ – family of sets s.t. $bigcup_{Sin F}S=U$
Output: $C$ – a family of sets; $Csubseteq F$ s.t. $bigcup_{Sin C}S=U$
initially C is empty while U is not empty do: choose S from F that maximizes the cover of elements in U add S to C subtract S's elements from U return C
The algorithm is pretty straight forward, and it is easy to see that it is indeed polynomial. This is my attempt to approximate:
Claim 1: In a set $U$ of $m$ objects, that can be covered with $k$ sets, there has to be a set $Ssubseteq U$ whose size is at least $frac{1}{k}m$.
Proof: Trivial. (I decided not to prove it) A corollary is that given the situation described in that claim, the greedy algorithm will choose a set whose size is at least $frac{1}{k}m$. Claim 2: Given a universe $U$, if there exist a cover of size $k$, then after $k$ iterations, the greedy algorithm will cover at least half of the elements, meaning at least $frac{1}{2}n$ elements.
Proof: By claim 1, in the first iteration the algorithm will cover at least $frac{1}{k}n$ elements. Upon entering the second iteration, there are at most $n-nfrac{1}{k}$ elements, and so the greedy algorithm will at least cover additional $frac{1}{k}(n-nfrac{1}{k})$ elements. In general, on the $i$’th iteration, the algorithm will cover $frac{1}{k}(n-nfrac{i-1}{k})$ elements. So after $k$ iterations:
$$sum_{i=1}^{k}frac{1}{k}(n-nfrac{i-1}{k})=sum_{i=0}^{k-1}frac{1}{k}(n-nfrac{i}{k})$$
$$=sum_{i=0}^{k-1}frac{n}{k}-sum_{i=0}^{k-1}frac{ni}{k^2}=n-frac{n}{k^2}sum_{i=0}^{k-1}i$$
$$=n-frac{n}{k^2}(frac{k(k-1)}{2})=n-frac{n}{2}(frac{k-1}{k})geqfrac{1}{2}n$$ OK, now this is where I need help: I know that in the first $k$ iterations, the algorithm picks at least half of the elements. After another $k$ iterations, another half was covered, out of what’s left (meaning another $frac{1}{4}n$). So in general, I know that the bound is $klog n$, I just can’t figure out how to formalize it.
What formula represents this behaviour? $T(ki)=T(frac{1}{2^i}n)$ and solve for $i$? It didn’t work…
What formula or equation should I solve to actually show that the number of iterations is bounded by $klog n$?
Asked By : so.very.tired
Answered By : Yuval Filmus
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/40764