Prague Stringology Conference 2015

Bruce W. Watson, Derrick G. Kourie and Loek Cleophas

Quantum Leap Pattern Matching

Abstract:
Quantum leap matching is introduced as a generic strategy for the single-keyword exact pattern matching problem that can be used on top of existing Boyer-Moore-style string matching algorithms. The cost of the technique is minimal: one additional one-dimensional shift table (for shifts in the opposite direction to the parent algorithm's shifts), and the replacement of a simple table-lookup assignment in the original algorithm with a similar conditional assignment. Alongside each conventional shift table lookup, the additional shift table is indexed, typically on the text character at distance z from the current sliding window. Under conditions that are identified, the values returned from the two shift tables allow a "quantum leap" of more than the length of the keyword to the next matching attempt. If the conditions are not met, the algorithm falls back to the traditional shift. Quick Search (by Sunday) is used as a case study to illustrate the technique. The performance of the derived "Quantum Leap Quick Search" algorithm is compared against that of Quick Search. When searching for shorter patterns over natural language and genomic texts, the technique improves on Quick Search's time for most values of z. Improvements are also sometimes seen for various values of z on longer patterns. Most interestingly, under best-case conditions it runs, on average, about three times faster than Quick Search.
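The idea described in the abstract can be sketched as follows. This is a minimal illustration based only on a reading of the abstract, not the authors' published algorithm: the function name, the table names `shl` and `shr`, the choice of probe position `j + z - 1`, and the leap condition `left + right > z` are all assumptions made for this sketch, and the paper's exact definitions may differ. Here `shl` is the usual Quick Search table (rightmost occurrence), and `shr` is a mirrored table (leftmost occurrence) for a hypothetical window starting at `j + z` and shifting leftwards.

```python
def quantum_leap_quick_search(p, t, z):
    """Return all start indices of p in t (sketch; names are hypothetical)."""
    m, n = len(p), len(t)
    if m == 0 or z < 1:
        raise ValueError("pattern must be non-empty and z must be >= 1")

    # Forward Quick Search table: shift so that the rightmost occurrence
    # of the character just past the window lines up with it.
    shl = {c: m - i for i, c in enumerate(p)}  # later (rightmost) i wins

    # Assumed backward table: the mirror image, for a window starting at
    # j + z that would shift leftwards, using the leftmost occurrence.
    shr = {}
    for i, c in enumerate(p):
        shr.setdefault(c, i + 1)  # first (leftmost) i wins

    hits, j = [], 0
    while j <= n - m:
        if t[j:j + m] == p:
            hits.append(j)
        if j + m >= n:
            break  # no character past the window, so this was the last one
        left = shl.get(t[j + m], m + 1)  # rules out starts j+1 .. j+left-1
        step = left                      # fallback: the traditional shift
        probe = j + z - 1                # character just left of start j+z
        if probe < n:
            right = shr.get(t[probe], m + 1)  # rules out j+z-right+1 .. j+z-1
            if left + right > z:
                # The two ruled-out ranges together cover every start
                # strictly between j and j + z, so leaping is safe.
                step = max(left, z)
        j += step
    return hits
```

For example, `quantum_leap_quick_search("ab", "xxabxxab", 3)` returns `[2, 6]`. In this sketch the largest possible leap is `z`, and the condition can only hold when `z` is at most `2 * len(p) + 1`, since each table returns at most `len(p) + 1`.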

Download paper:   Article in PDF BibTeX Reference
Download presentation: Presentation