[Solved]: Time complexity of an enumeration of SUBSET SUM instances

Problem Detail: An instance of the SUBSET SUM problem (given $y$ and $A = {x_1,…,x_n}$ is there a non-empty subset of $A$ whose sum is $y$) can be represented on a one-tape Turing Machine with a list of comma separated numbers in binary format. If $Sigma = {0,1,#}$ a reasonable format could be: $( 1 ; (0|1)^* ; #)^* #$ Where the first required argument is the value $y$ and $##$ encodes the end of the input. For example:

 1  0  0  #  1  0  #  1  #  # ^^^^^^^^     ^^^^     ^    y          x1     x2 Instance: y=4, A={2,1} 

I would like to enumerate the SUBSET SUM instances.

Question: What is the (best) time complexity that can be achieved by a Turing Machine $TM_{Enum}$ that on input $m$ (which can be represented on the tape with a string of size $log m + 1$) – outputs the $m$-th SUBSET SUM instance in the above format?

EDIT: Yuval’s answer is fine, this is only a longer explanation. Without loss of generality we set that $y > 0$ and $0 < x_1 leq x_2 leq … leq x_n$, $n geq 0$ And we can represent an instance of subset sum using this encoding: $y # x_1# d_2# …# d_{n} ##$ where $d_i geq 1, x_i = x_{i-1} + d_i – 1 ; , i geq 2$ Using a binary representation for $y,x_1, d_2, d_3, …$ we have the following representation: $1 ; ((0|1)^* # 1)^* ; ##$ Equivalent to $1 ; (0|1|#1)^* ; ##$. There is always a leading 1 and a trailing ## so we can consider only the $(0|1|#1)^*$ part. So the decoder TM on input $m$ in binary format should:

  • output the leading 1
  • convert $m$ to base 3 mapping digit 2 to $#1$
  • when outputing the i-th intermediate $#$ calculate $x_i = d_i + x_{i-1}-1$
  • output the trailing $##$

No duplicate instances are generated.

Asked By : Vor

Answered By : Yuval Filmus

SUBSET-SUM instances can be encoded in base 3. We have codes for $0,1,#$. Some codings are invalid, but in that case we can just immediately output $##$ (or $#$, if we have just written $#$). Every SUBSET-SUM problem has infinitely many encodings, I hope that’s not a problem. If the input has length $ell$, then (assuming the tape alphabet has at least 4 symbols) we can do the conversion in time $O(ell^2)$. I don’t know whether this is the “best” time complexity achievable. Edit: Here is a better encoding. We still have only three input codes, $0,1,#$. The output string always starts with $1$ and ends with $##$. Further, $#$ is output as $#1$. Now each output string is generated once, though several output strings could correspond to the same instance. As an example, your instance is encoded by “00#0#”.
Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/4794