When looking at the KMP algorithm, I want to simply count the execution time and performance.
The conclusion is: Java's String indexOf method has the best performance, followed by the KMP algorithm, followed by the traditional BF algorithm. Of course, the comparison is a bit far-fetched. SUN's algorithm is also implemented in Java, and it doesn't look like it. KMP needs to be studied in detail.
The test code looks like this:
package com.test.test.kmp;import java.util.Random;public class KMPTest {public static void main(String[] args) {test();}//public static void test() {//int allLen = 8000000;int subLen = 11;int charLen = 4;String cl = buildString(charLen);int times = 50;//testMatch(allLen, subLen, cl, times);} public static void testMatch(int allLen, int subLen, String cl, int times){int allBF = 0;int allString = 0;int allKMP = 0;int allGC = 0;int allbuild = 0;int alltoArray = 0;long start = System.currentTimeMillis();System.out.println("--------------------------");for (int i = 0; i < times; i++) {// 1. The loss of constructing a string long buildStart = System.currentTimeMillis();String sub = buildString(subLen, cl);String all = buildString(allLen, sub) +sub;long buildEnd = System .currentTimeMillis();allbuild += (buildEnd - buildStart);// 2. The loss of toCharArray long toArrayStart = System.currentTimeMillis();char[] allArray = all.toCharArray();char[] subArray = sub.toCharArray();long toArrayEnd = System.currentTimeMillis(); alltoArray += (toArrayEnd - toArrayStart);//long startBF = System.currentTimeMillis();int indexBF = bfIndexOf(all, sub);long endBF = System.currentTimeMillis();//long timeoutBF = endBF - startBF;allBF += timeoutBF;System.gc();allGC += (System.currentTimeMillis() - endBF);/ /long startString = System.currentTimeMillis();int indexString = stringIndexOf(all, sub);long endString = System.currentTimeMillis();//long timeoutString = endString - startString;allString += timeoutString;System.gc();allGC += (System.currentTimeMillis() - endString);//long startKMP = System.currentTimeMillis();//int indexKMP = kmpIndexOf(all, sub);int indexKMP = KMP_Index(allArray, subArray);long endKMP = System.currentTimeMillis();//long timeoutKMP = endKMP - startKMP;allKMP += timeoutKMP;System.gc();allGC += (System.currentTimeMillis() - endKMP);////System.out.println(" all="+all.substring(0,100)+" ......");//System.out.println("sub="+sub);////System.out.println("indexBF="+indexBF+";Time consuming: "+timeoutBF+" ms");//System.out.println("indexString="+indexString+";Time consuming: "+timeoutString+"ms");if(indexBF == indexString && indexKMP == indexString){//System.out.println("!!!!Contrast equal. ");} else {throw new RuntimeException("Comparison error.");}//if(i % 20 == 10){System.out.println(i+"bfIndexOf= "+allBF+"ms");System .out.println(i+"stringIndexOf= "+allString+"ms");System.out.println(i+"KMPIndexOf= "+allKMP+"ms");System.out.println(i+"allbuild= "+allbuild+"ms");System.out.println(i+"alltoArray= "+alltoArray+"ms");System.out. println(i+"*3 times,GC= "+allGC+"ms");long end = System.currentTimeMillis();long allTime = end-start;long lossTime = allTime - (allBF+allString+allKMP+allGC);System.out.println(i+" times, all total time consuming: "+(allTime)+"ms; loss:" + lossTime +"ms; loss ratio: " +( (0.0+lossTime)/allTime * 100)+"%");System.out.println("--------------------------");}}//System. out.println(times+"times bfIndexOf; total time consumption: "+allBF+"ms");System.out.println(times+"times stringIndexOf; total time consumption: "+allString+"ms");System.out.println( times+" times KMPIndexOf; total time spent: "+allKMP+"ms");System.out.println(times+"allbuild= "+allbuild+"ms");System.out.println(times+"alltoArray= "+alltoArray+"ms");System.out. println(times+"*3 times, GC; total time spent: "+allGC+"ms");long end = System.currentTimeMillis();long allTime = end-start;long lossTime = allTime - (allBF+allString+allKMP+allGC);System.out.println(times+" times, all total time consumption: "+(allTime)+"ms; loss:" + lossTime +" ms; Loss ratio: " +((0.0+lossTime)/allTime * 100)+"%");System.out.println("------------------- -------");}//// Build characters public static String buildString(int len){return buildString(len, null);}public static String buildString(int len, String sub){//final int TYPE_NUMBER = 0;final int TYPE_LOWERCASE = 1;final int TYPE_UPPERCASE = 2;//long seed = System.nanoTime();Random random = new Random(seed);////StringBuilder builder = new StringBuilder();for (int i = 0; i < len; i++) {//int type = random.nextInt(3);// 0-2int cvalue = 0;if(TYPE_NUMBER == type){cvalue = '0' + random.nextInt(10);// 0~(n-1)} else if(TYPE_LOWERCASE == type){cvalue = 'a' + random.nextInt(26);// 0~(n-1)} else if(TYPE_UPPERCASE == type){cvalue = 'A' + random.nextInt( 26);// 0~(n-1)} else {throw new RuntimeException(Random.class.getName()+"#newxtInt(int);Wrong!");}////cvalue = 'A' + random.nextInt(26);// 0~(n-1)/ /char c = (char)cvalue;if(null != sub && sub.length() > 1){int index = random.nextInt(sub.length());c = sub.charAt(index);}//String s = String.valueOf(c);builder.append(c);//}//return builder.toString();}/** * Java’s own method, Internally implemented * @param all * @param sub * @return */public static int stringIndexOf(String all, String sub){// Defensive programming if(null == all || null== sub){return -1;}//Call Java's String method return all.indexOf(sub);}/** * BF algorithm* @param all * @param sub * @return */public static int bfIndexOf(String all, String sub) {// Defensive programming if(null == all || null== sub){return -1;}//int lenAll = all.length();int lenSub = sub.length();int i = 0;while (i < lenAll) {// Compare int starting from i subscript j = 0;while (j < lenSub) {//char all_i = all.charAt(i);char sub_j = sub.charAt(j) ;if(all_i == sub_j){// If equal, continue to compare the next character i++;j++; // This increase is very important} else {// Otherwise, jump out of the inner loop break;}}// If j If it grows to be equal to lenSub, the match is successful if (lenSub == j){return i - lenSub;} else {i = 1+i - j; // Backtrack i}}//return -1;}/** * KMP algorithm * @param all * @param sub * @return */public static int kmpIndexOf(String all, String sub){//char[] allArray = all.toCharArray(); char[] subArray = sub.toCharArray(); //return KMP_Index(allArray, subArray);} /** * Get the next function value of the string* * @param t * String* @return next function value*/ public static int[] next(char[] t) { int[] next = new int[t.length]; next[0] = -1; int i = 0; int j = -1; while (i < t.length - 1) { if (j == -1 || t[i] == t[j]) { i++; j++; if (t[i] != t[j]) { next[i] = j; } else { next[i] = next[j]; } } else { j = next[j]; } } return next; } /** * KMP matching string* * @param allArray * Main string * @param subArray * Pattern string * @return If the match is successful, return the subscript, otherwise return -1 */ public static int KMP_Index(char[] allArray, char[] subArray) { int[] next = next(subArray ); int i = 0; int j = 0; while (i <= allArray.length - 1 && j <= subArray.length - 1) { if (j == -1 || allArray[i] == subArray[j]) { i++; j++; } else { j = next[j]; } } if (j < subArray.length) { return -1; } else return i - subArray.length; // Return the head index of the pattern string in the main string} }
One result of the test is shown below:
--------------------------10 times bfIndexOf= 94ms10 times stringIndexOf= 56ms10 times KMPIndexOf= 76ms10 times allbuild= 5870ms10 times alltoArray= 137ms10*3 times ,GC= 155ms10 times, total time taken: 6388ms; loss: 6007ms; loss ratio: 94.03569192235442%-----------------------------30 times bfIndexOf= 365ms30 times stringIndexOf= 222ms30 times KMPIndexOf= 295ms30 times allbuild= 16452ms30 times alltoArray= 395ms30* 3 times, GC= 452ms30 times, total time taken: 18184ms; Loss: 16850ms; Loss ratio: 92.66388033435987%--------------------------50 times bfIndexOf; total time consumption: 550ms50 times stringIndexOf; total time consumption : 335ms50 times KMPIndexOf; total time taken: 450ms50 times allbuild= 26501ms50 times alltoArray= 637ms50*3 times, GC; total time consumption: 734ms50 times, total time consumption: 29211ms; loss: 27142ms; loss ratio: 92.91705179555647%-------------------- ------
It can be seen from this that using StringBuilder to construct a string, 8 million * 50 appends consume approximately 26 seconds. This translates to 16 million times per second. . .
It seems that we need to write a special timing function. Originally, Junit had its own statistics, but the samples were not easy to make.
From this point of view, data preparation, that is, the selection of test cases, is critical, and may need to be generated and written to a text file first.