[Solved]: Showing that the entropy of i.i.d. random variables is the sum of entropies

Problem Detail: The Shannon entropy of a random variable $Y$ (with possible outcomes $\Sigma=\{\sigma_{1},\dots,\sigma_{k}\}$) is given by
$H(Y)=-\sum\limits_{i=1}^{k}P(Y=\sigma_{i})\;\log(P(Y=\sigma_{i}))$. For a second random variable $X=X_{1}X_{2}\dots X_{n}$, where all $X_{i}$ are independent and identically distributed (each $X_{i}$ is a copy of the same random variable $Y$), the following equation is known to hold:

$H(X)=n\cdot H(Y)$

I want to prove this simple equation, where the outcomes of $Y$ are interpreted as symbols from an alphabet $\Sigma$, so that $X$ is the random variable for strings of length $n$ (based on the distribution of $Y$). It is easy to see that $P(X=w)=P(X=w_{1}\dots w_{n})=P(X_{1}=w_{1})\cdot\ldots\cdot P(X_{n}=w_{n})=\prod\limits_{i=1}^{n}P(Y=w_{i})$,
but my approach to proving $H(X)=n\cdot H(Y)$ seems to have reached a dead end
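The definition of $H(Y)$ and the product formula for $P(X=w)$ can be spelled out numerically. A minimal sketch; the two-symbol alphabet and its probabilities are made up for illustration:

```python
import math

# Hypothetical alphabet Sigma = {a, b} with P(Y=a)=0.25, P(Y=b)=0.75.
p = {"a": 0.25, "b": 0.75}

def entropy(probs):
    """Shannon entropy H = -sum_i p_i * log2(p_i), in bits."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

def p_word(w):
    """P(X=w) = prod_i P(Y=w_i), by independence of the X_i."""
    return math.prod(p[c] for c in w)

print(entropy(p.values()))  # H(Y) ~ 0.8113 bits
print(p_word("aab"))        # 0.25 * 0.25 * 0.75 = 0.046875
```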
(every word $w$ has the form $w=w_{1}\dots w_{n}$ with $w_{i}\in\Sigma$, and let $|w|_{\sigma_{i}}$ be the number of occurrences of $\sigma_{i}$ in $w$): $H(X)=-\sum\limits_{w\in\Sigma^{n}}P(X=w)\;\log(P(X=w))$
$=-\sum\limits_{w\in\Sigma^{n}}\left(\prod\limits_{i=1}^{n}P(Y=w_{i})\right)\log\left(\prod\limits_{i=1}^{n}P(Y=w_{i})\right)$
$=-\sum\limits_{w\in\Sigma^{n}}\left(\prod\limits_{i=1}^{n}P(Y=w_{i})\right)\left(\sum\limits_{i=1}^{n}\log\left(P(Y=w_{i})\right)\right)$
$=-\sum\limits_{w\in\Sigma^{n}}\left(\prod\limits_{i=1}^{k}P(Y=\sigma_{i})^{|w|_{\sigma_{i}}}\right)\left(\sum\limits_{i=1}^{k}|w|_{\sigma_{i}}\;\log\left(P(Y=\sigma_{i})\right)\right)$ So I am able to change some indices from the word length to the size of the alphabet, which is used in $H(Y)$. But what now? Any help?
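For what it is worth, the count-based rewriting above suggests swapping the two sums, which turns the inner sum into $\sum_{w}P(X=w)\,|w|_{\sigma_{i}}$, the expected number of occurrences of $\sigma_{i}$; that expectation equals $n\cdot P(Y=\sigma_{i})$. A small numerical check of that expectation (the alphabet and probabilities are made up, and $n$ is kept small so $\Sigma^{n}$ can be enumerated exhaustively):

```python
import math
from itertools import product

# Hypothetical distribution for Y over Sigma = {a, b}.
py = {"a": 0.6, "b": 0.4}
n = 3

# E[|w|_sigma] = sum_w P(X=w) * |w|_sigma should equal n * P(Y=sigma).
for sym, p_sym in py.items():
    expected_count = sum(
        math.prod(py[c] for c in w) * w.count(sym)
        for w in product(py, repeat=n)
    )
    assert abs(expected_count - n * p_sym) < 1e-12

print("E[|w|_sigma] = n * P(Y=sigma) verified for n =", n)
```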

Asked By : Danny

Answered By : Shaull

Your attempt at the proof does not take into account any natural order on $\Sigma^n$. While it may still work, it seems difficult. A much easier proof can be given by induction over $n$. The base case $n=1$ is trivial: $H(X)=H(Y)$ since $X=X_1$ is a copy of $Y$. For the inductive step, denote $X=X_1\cdots X_n$ and $X'=X_1\cdots X_{n-1}$, and split each word of length $n$ as $w\sigma$ with $w\in\Sigma^{n-1}$ and $\sigma\in\Sigma$, so you have $$H(X)=-\sum_{w\in\Sigma^{n-1}}\sum_{\sigma\in\Sigma}\Pr(X=w\sigma)\log \Pr(X=w\sigma)$$ $$=-\sum_{w\in\Sigma^{n-1}}\sum_{\sigma\in\Sigma}\Pr(X'=w)\Pr(Y=\sigma)\log\bigl(\Pr(X'=w)\Pr(Y=\sigma)\bigr)=$$ $$=-\sum_{w\in\Sigma^{n-1}}\Pr(X'=w)\sum_{\sigma\in\Sigma}\Pr(Y=\sigma)\bigl(\log \Pr(X'=w)+\log \Pr(Y=\sigma)\bigr)=$$ $$=-\sum_{w\in\Sigma^{n-1}}\Pr(X'=w)\left(\sum_{\sigma\in\Sigma}\Pr(Y=\sigma)\log \Pr(X'=w)+\sum_{\sigma\in\Sigma}\Pr(Y=\sigma)\log \Pr(Y=\sigma)\right)=$$ $$=-\sum_{w\in\Sigma^{n-1}}\Pr(X'=w)\log \Pr(X'=w)\sum_{\sigma\in\Sigma}\Pr(Y=\sigma)-\sum_{w\in \Sigma^{n-1}}\Pr(X'=w)\sum_{\sigma\in\Sigma}\Pr(Y=\sigma)\log \Pr(Y=\sigma)=$$ $$=H(X')+H(Y)=(n-1)H(Y)+H(Y)=nH(Y),$$ where the last line uses $\sum_{\sigma}\Pr(Y=\sigma)=1$, $\sum_{w}\Pr(X'=w)=1$, and the induction hypothesis $H(X')=(n-1)H(Y)$.
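The identity $H(X)=n\cdot H(Y)$ can also be sanity-checked by brute force: enumerate all of $\Sigma^n$, compute $H(X)$ from the word probabilities, and compare with $n\cdot H(Y)$. A sketch, with a made-up three-symbol distribution:

```python
import math
from itertools import product

def entropy(probs):
    """H = -sum p log2 p over the given probabilities."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

# Hypothetical distribution for Y over a three-symbol alphabet.
py = {"a": 0.5, "b": 0.3, "c": 0.2}
n = 4

# P(X=w) = prod_i P(Y=w_i) for every word w in Sigma^n.
word_probs = [math.prod(py[c] for c in w) for w in product(py, repeat=n)]

H_Y = entropy(py.values())
H_X = entropy(word_probs)
assert abs(H_X - n * H_Y) < 1e-9
print(f"H(X) = {H_X:.6f} matches {n} * H(Y) = {n} * {H_Y:.6f}")
```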

Question Source : http://cs.stackexchange.com/questions/21525