The KMP Algorithm
Lecture notes:
http://www.cs.cmu.edu/afs/cs/academic/class/15451-f17/www/lectures/lec09-kmp.pdf
In the practical implementation below the data structure that we build
from preprocessing the pattern is all contained in an array of
integers called memo[]. This implementation illustrates the use of
the algorithm for finding all matches of a pattern s in a text t.
public class kmp {
public static void main(String[] args) {
char[] s = "test".toCharArray();
int n = s.length;
int[] memo = new int[n];
/* memo[i] will store the length of the longest prefix of s
that matches the tail of s1...si */
int j=0;
for (int i=1; i0 && s[i] != s[j]) j = memo[j-1];
if (s[i] == s[j]) j++;
memo[i] = j;
}
/* Here's another version of the algorithm that computes
exactly the same thing. */
/*
for (int i=1; i0 && s[i] != s[memo[j-1]]) j = memo[j-1];
memo[i] = (j>0) ? memo[j-1]+1 : 0;
}
*/
char[] t = "testestest hello there test!".toCharArray();
int m = t.length;
j=0;
for (int i=0; i0 && t[i] != s[j]) j = memo[j-1];
if (t[i] == s[j]) j++;
if (j==n) {
System.out.println("match found at "+(i-j+1));
j = memo[j-1];
}
}
}
}
The first phase of the algorithm computes the memo[] array.
This is a preprocessing step on the pattern, and does not
depend on the text string at all.
The second phase uses the memo array to find all the matches
of the pattern in the text.
The code is tricky to understand given how short it is.
The invariant that holds at the start of the loop is the following:
memo[k] for all k