[Solved]: Viterbi algorithm recursive justification

Problem Detail: I have a question regarding recursion in Viterbi algorithm. Define $pi(k; u; v)$ which is the maximum probability for any sequence of length $k$, ending in the tag bigram $(u; v)$. The base case if obvious $pi(0,*,*)=1$ The general case. $pi(k,u,v) = max_{w in K_{k-2} } pi(k-1,w,u) cdot q(v|w,u) cdot e(x_k|v)$ The author justifies the recursion as folllows:

How can we justify this recurrence? Recall that $pi(k, u, v)$ is the highest probability for any sequence $y_{−1}…y_k$ ending in the bigram $(u, v)$. Any such sequence must have $y_{k−2} = w$ for some state $w$. The highest probability for any sequence of length $k − 1$ ending in the bigram $(w, u)$ is $pi(k − 1, w, u)$, hence the highest probability for any sequence of length $k$ ending in the trigram $(w, u, v)$ must be $pi(k − 1,w, u) cdot q(v|w, u) cdot e(x_k |v)$

I do not understand why it’s actually true, I think it’s possible to reach $pi(n,u, v)$ from any $(n-1,w, u)$ not actually the maximum one $pi(n-1,w, u)$ just because $q(v|w, u) cdot e(x_k |v)$ might have a higher influence on the resulting $(n,u, v)$ than any $pi(n-1,w, u)$. I would appreciate if anyone could explain me why it’s true.

Asked By : user16168

Answered By : Yuval Filmus

The maximum is over the entire expression $pi(k-1,w,u) cdot q(v|w,u) cdot e(x_k|v)$, for exactly the same reason that is worrying you. Moreover, under your original interpretation, there is no reason to keep $pi(k-1,w,u)$ for the non-optimal $w$. We are forced to store $pi(k-1,w,u)$ for all $w$ exactly for the reason you mention.
Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/19093