Is the complement of { ww | … } context-free?

Problem Detail: Define the language $L$ as $L = {a, b}^* – {wwmid w in {a, b}^*}$. In other words, $L$ contains the words that cannot be expressed as some word repeated twice. Is $L$ context-free or not? I’ve tried to intersect $L$ with $a^*b^*a^*b^*$, but I still can’t prove anything. I also looked at Parikh’s theorem, but it doesn’t help.

Asked By : Evgeny Eltishev

Answered By : Evgeny Eltishev

It’s context-free. Here’s the grammar:

$S to A | B|AB|BA$
$A to a|aAa|aAb|bAb|bAa$
$B to b|aBa|aBb|bBb|bBa$ $A$ generates words of odd length with $a$ in the center. Same for $B$ and $b$. I’ll present a proof that this grammar is correct. Let $L = {a,b}^* setminus {ww mid w in {a,b}^*}$ (the language in the question). Theorem. $L = L(S)$. In other words, this grammar generates the language in the question. Proof. This certainly holds for all odd-length words, since this grammar generates all odd-lengths words, as does $L$. So let’s focus on even-length words. Suppose $x in L$ has even length. I’ll show that $x in L(G)$. In particular, I claim that $x$ can be written in the form $x=uv$, where both $u$ and $v$ have odd length and have different central letters. Thus $x$ can be derived from either $AB$ or $BA$ (according to whether $u$’s central letter is $a$ or $b$). Justification of claim: Let the $i$th letter of $x$ be denoted $x_i$, so that $x = x_1 x_2 cdots x_n$. Then since $x$ is not in ${ww mid w in {a,b}^{n/2}}$, there must exist some index $i$ such that $x_i ne x_{i+n/2}$. Consequently we can take $u = x_1 cdots x_{2i-1}$ and $v = x_{2i} cdots x_n$; the central letter of $u$ will be $x_i$, and the central letter of $v$ will be $x_{i+n/2}$, so by construction $u,v$ have different central letters. Next suppose $x in L(G)$ has even length. I’ll show that we must have $x in L$. If $x$ has even length, it must be derivable from either $AB$ or $BA$; without loss of generality, suppose it is derivable from $AB$, and $x=uv$ where $u$ is derivable from $A$ and $v$ is derivable from $B$. If $u,v$ have the same lengths, then we must have $une v$ (since they have different central letters), so $x notin {ww mid w in {a,b}^*}$. So suppose $u,v$ have different lengths, say length $ell$ and $n-ell$ respectively. Then their central letters are $u_{(ell+1)/2}$ and $v_{(n-ell+1)/2}$. The fact that $u,v$ have different central letters means that $u_{(ell+1)/2} ne v_{(n-ell+1)/2}$. Since $x=uv$, this means that $x_{(ell+1)/2} ne x_{(n+ell+1)/2}$. If we attempt to decompose $x$ as $x=ww’$ where $w,w’$ have the same length, then we’ll discover that $w_{(ell+1)/2} = x_{(ell+1)/2} ne x_{(n+ell+1)/2} = w’_{(ell+1)/2}$, i.e., $wne w’$, so $x notin {ww mid w in {a,b}^*}$. In particular, it follows that $x in L$.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/19151