Similar Problem
Problem Statement I have seen it being used in a similar problem where in a string was given in a compressed form. Meaning, e.g. if the string is “aaabccdeeee” then the compressed form is:
3 a 1 b 2 c 1 d 4 e
How data was stored They are stored in an str[] array as :
str[] = [{'a','3'}, {'b','1'}, {'c','2'}....]
HASHING Concept that was used in the solutions And programmers had used the following hash concept to find if the given substring is a Palindrome or not. Given a substring of string S as (i,j), they computed the hash of substring [i , (i+j)/2] and the reverse hash of substring [(i+j+2)/2, j] and checked if they were equal or not. So if they wanted to check if in string S = “daabac” whether substring [1, 5] is a a palindrome or not, they computed the following :
h1 = forward_hash("aa") h2 = reverse_hash("ba") h1 == h2
Code for the Hashing Concept The hash precomputation was done as follows :
/* Computing the Prefix Hash Table */ pre_hash[0] = 0; for(int i=1;i<=len(str);i++) { pre_hash[i] = pre_hash[i-1]*very_large_prime + str[i].first; pre_hash[i] = pre_hash[i]*very_large_prime + str[i].second; } /* Computing the Suffix Hash Table */ suff_hash[0] = 0; for(int i=1;i<=len(str);i++) { suff_hash[i] = suff_hash[i-1]*very_large_prime + str[K-i+1].first; suff_hash[i] = suff_hash[i]*very_large_prime + str[K-i+1].second; }
And then the hash was computed using the following functions :
/* Calculates the Forward hash of substring [i,j] */ unsigned long long CalculateHash(int i, int j) { if(i>j) return -1; unsigned long long ret = pre_hash[j] - POW(very_large_prime, [2*(j-i+1)])*pre_hash[i-1]; return ret; } /* Calculates the reverse hash of substring [i,j] */ unsigned long long CalculateHash_Reverse(int i, int j) { unsigned long long ret = suff_hash[j] - POW(very_large_prime,[2*(j-i+1)])*suff_hash[i-1]; return ret; }
What I am trying to do I am looking for a general approach to the above concept. Given a Pattern P, I want to check if the pattern P is present in a string S. I know the index (i) to check where it may be present. And I also know the length of pattern P represented as |P|. In short I want to check if hash of S[i, i+|P|] and hash of P match or not in O(1) using some form of pre computation on S. Ignoring the time taken to compute hash of P else it would be O(1+|P|)
Asked By : Kyuubi
Answered By : D.W.
The intuition
Imagine that we used a very simple hash, which is just the sum of the bytes modulo 256: $$H(S[i..j]) = S[i] + S[i+1] + S[i+2] + dots + S[j-1] bmod 256.$$ This is a crummy hash function (it has lots of collisions), but let’s go with it, since the purpose is just to get some intuition. This is a rolling hash; given the hash of a prefix of the string, it is easy to extend it by one more byte (just add that byte to the hash value you’ve got so far). Now with this hash function, we could easily solve your problem. During the precomputation, we’d fill in an array $A[]$ so that $A[j] = S[0] + S[1] + dots + S[j-1] bmod 256$ for each $j$; this can be done in $O(n)$ time. Now, we can compute $H(S[i..j])$ via the simple relation $$H(S[i..j]) = A[j] – A[i] bmod 256.$$ This can be computed in $O(1)$ time, given the precomputed array $A$. Of course, you’d never use this in practice, because it’s such a crummy hash function. But the same idea can be generalized to work with many other rolling hash functions, which you would want to use. (Intuitively, that’s because they are based upon a binary operation that is associative and has an inverse, which is all you need for this trick to work.) I’ll describe the details next.
The inverse property
All rolling hashes have the following property: The accumulation property: Given $H(S[i..j])$ and $S[j]$, we can compute $H(S[i..j+1])$. We need one additional property from the rolling hash: The inverse property: Given $H(S[i..j])$ and $H(S[i..k])$, where $j < k$, it should be possible to derive $H(S[j+1..k])$. Many standard rolling hashes do have this inverse property. For instance, the Rabin-Karp rolling hash has the inverse property. The same is true for hashing with cyclic polynomials (Buzhash).
How to use the inverse property for your problem
Create an array $A[]$ of $n$ elements. During precomputation, we will fill in the elements of $A$ so that $A[j] = H(S[0..j])$. This precomputation can be done in $O(n)$. Now, at some later point, you would like to compute the rolling hash of $S[i..j]$, i.e., to compute $H(S[i..j])$. How do you do that? It turns out it is easy. First, look up $H(S[0..i])$ and $H(S[0..j])$ using the array; this involves reading the value of $A[i]$ and $A[j]$. Next, use the inverse property to compute $H(S[i..j])$ from these two values. Done! That’s all there is to it.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/28163