[Solved]: Binomial coefficient to approach multi-way choices DP problem?

Problem Detail: I'm trying to understand this dynamic programming problem, adapted from Kleinberg's Algorithm Design book. Not homework: I already have a solution and just want to check that I understand the theory.

Suppose we want to replicate a file over a collection of $n$ servers $S_1, S_2, \ldots, S_n$. Each server $S_i$ has a placement cost $c_i$ to place a copy of the file at $S_i$. If a user requests the file from $S_i$ and no copy is present, then servers $S_{i+1}, S_{i+2}, S_{i+3}, \ldots$ are searched in order until a copy is found, say at $S_j$ (where $j > i$). This results in an access cost $a_i$, where $$ a_i = \begin{cases} j-i & \text{if $S_i$ does not contain a copy of the file} \\ 0 & \text{if $S_i$ contains a copy of the file} \end{cases} $$ We require that $S_n$ contains the file, so that all searches will terminate. Given $I \subset \left\{1,\ldots,n-1\right\}$, a subset of servers to place the file on, we define the total cost $$ c(I) = c_n + \sum_{i \in I} c_i + \sum_{i=1}^{n} a_i $$ to be the sum of the placement costs for each server that contains the file (recall that $S_n$ must contain the file), plus the sum of the access costs associated with all $n$ servers. Design a polynomial-time algorithm to compute a subset $I$ of servers to contain the file that minimizes total cost.
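To make the cost function concrete, here is a minimal Python sketch that evaluates $c(I)$ for a given subset. The names `total_cost`, `placement` and `chosen` are illustrative only and do not come from the book; `placement[i-1]` is assumed to hold $c_i$.

```python
def total_cost(placement, chosen):
    """Evaluate c(I) for servers numbered 1..n.

    placement[i - 1] is the placement cost c_i (assumed layout).
    chosen is the set I of indices in 1..n-1 that hold a copy;
    server n always holds a copy.
    """
    n = len(placement)
    holders = set(chosen) | {n}

    # Placement costs of every server that holds the file.
    cost = sum(placement[i - 1] for i in holders)

    # Scan right to left, remembering the nearest copy at or after i.
    next_copy = n
    for i in range(n, 0, -1):
        if i in holders:
            next_copy = i
        cost += next_copy - i  # access cost a_i (0 if i holds a copy)
    return cost
```

For example, with placement costs `[5, 1, 5, 2]`, placing the file only on $S_4$ costs $2 + (3+2+1) = 8$, while also placing it on $S_2$ costs $(1+2) + (1+1) = 5$.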

My approach is to consider every possible choice, as in other problems such as segmented least squares, which gives an $O(n^4)$-time algorithm. However, there appears to be a better solution, given by the following recurrence: $$ OPT(j) = c_j + \min_{0 \leq i < j}\left(OPT(i)+\binom{j-i}{2}\right) $$ with the initializations $OPT(0)=0$ and $\binom{1}{2}=0$. The answer is given, as usual, by $OPT(n)$. To me this is very counterintuitive: why the binomial coefficient, and why the number $2$? Is this a standard way to approach similar problems?

Asked By : kentilla

Answered By : Yuval Filmus

The binomial coefficient here appears through the formula $$ \sum_{i=1}^{n-1} i = \binom{n}{2}. $$ In the recurrence, $OPT(j)$ is the minimal cost of a solution for the first $j$ servers. Another way of looking at it is that it's the minimal cost up to $j$ of a solution which puts the file in $S_j$.

Where does the recurrence come from? Take an optimal solution for the first $j$ servers. We know that the file appears in $S_j$. Let $S_i$ be the last server before $S_j$ on which the file appears (or $i=0$ if it only appears in $S_j$). Without loss of generality, we can assume that the solution restricted to $S_1,\ldots,S_i$ is optimal, and so of cost $OPT(i)$. Putting the file in $S_j$ costs $c_j$. Accessing the file at positions $i+1,\ldots,j-1$ costs $(j-(i+1)) + \cdots + (j-(j-1)) = (j-i-1) + \cdots + 1 = \binom{j-i}{2}$. This gives the recurrence formula. (We also have to verify the case $i=0$ separately, which I leave to you.)
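The recurrence can be sketched directly as an $O(n^2)$ dynamic program. This is only an illustration under stated assumptions: the function name `min_replication_cost`, the 0-based list `placement` (with `placement[j-1]` holding $c_j$) and the `prev` back-pointer array are all hypothetical choices, not part of the original problem statement.

```python
from math import comb  # comb(1, 2) == 0, matching the convention above


def min_replication_cost(placement):
    """Return (minimal total cost, sorted list of servers holding a copy).

    placement[j - 1] is the placement cost c_j (assumed layout).
    The returned list is 1-indexed and always ends with n.
    """
    n = len(placement)
    opt = [0] * (n + 1)   # opt[0] = 0 is the base case
    prev = [0] * (n + 1)  # prev[j] = the i that attains the minimum for opt[j]

    for j in range(1, n + 1):
        # Try every candidate i for the previous copy (i = 0 means none).
        best_i = min(range(j), key=lambda i: opt[i] + comb(j - i, 2))
        opt[j] = placement[j - 1] + opt[best_i] + comb(j - best_i, 2)
        prev[j] = best_i

    # Follow the back-pointers from n to recover the chosen servers.
    chosen = []
    j = n
    while j > 0:
        chosen.append(j)
        j = prev[j]
    chosen.reverse()
    return opt[n], chosen
```

For placement costs `[5, 1, 5, 2]` the table is $OPT(1)=5$, $OPT(2)=2$, $OPT(3)=7$, $OPT(4)=5$, and the back-pointers give copies on $S_2$ and $S_4$. Each $OPT(j)$ takes a minimum over at most $j$ candidates, which is where the quadratic running time comes from.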


Question Source : http://cs.stackexchange.com/questions/47916