Day 12 Naive Pattern Search

Pattern searching requires two base components:

  • A text to scan
  • A pattern to search for

In our naive search, we can imagine the text being scanned as one long string of characters, one after another. The pattern is a separate, shorter string that we slide along the original text one character at a time, like a finger following letters in a book.

For each starting position in the original text, we count how many of the following characters match the pattern, comparing them one by one from the start of the pattern. If a mismatch is found, we move on to the next letter of the text, but if the number of matching characters equals the length of the pattern, well then we found the pattern in the text!
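To make the counting step concrete, here is a minimal sketch that records the match count for every starting position. The function name match_counts and the tiny sample strings are only for illustration, and it assumes we only try start positions where the whole pattern still fits in the text.

```python
def match_counts(text, pattern):
    """For each start index, count how many leading pattern characters match."""
    counts = []
    # Only consider start positions where the whole pattern still fits.
    for start in range(len(text) - len(pattern) + 1):
        count = 0
        # Keep counting until a mismatch or until the whole pattern has matched.
        while count < len(pattern) and text[start + count] == pattern[count]:
            count += 1
        counts.append(count)
    return counts


sample_pattern = "AAB"
counts = match_counts("AABAAB", sample_pattern)
# A start index is a match when its count equals the pattern length.
matches = [i for i, c in enumerate(counts) if c == len(sample_pattern)]
print(counts)   # [3, 1, 0, 3]
print(matches)  # [0, 3]
```

Index 1 shows the wasted work: one character matches before the mismatch, and that effort is thrown away when we move to index 2.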

The naive search is slow in the worst case: for a text of length n and a pattern of length k it can take on the order of O(n*k) character comparisons. The constant backtracking to the next character of the input text is the main cause of this slow performance, because it makes the algorithm check the same characters many times. Better integrating the iteration of the pattern within the larger iteration of the text is the key to more optimized search algorithms, such as the Knuth–Morris–Pratt algorithm. It precomputes, for each position in the pattern, the length of the longest prefix of the pattern that is also a suffix of the part matched so far. After a mismatch, it uses this information to shift the pattern forward without re-reading any text characters, thereby preventing backtracking and getting a runtime of O(n+k).
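For comparison, here is a rough sketch of how Knuth–Morris–Pratt could look in Python. It is one common textbook formulation, not code from this exercise, and the names build_prefix_table and kmp_search are my own. Treating an empty pattern as "no matches" is also just an assumption made to keep the sketch short.

```python
def build_prefix_table(pattern):
    """table[i] = length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of it."""
    table = [0] * len(pattern)
    k = 0  # length of the prefix currently matched
    for i in range(1, len(pattern)):
        # On a mismatch, fall back to the next shorter candidate prefix.
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_search(text, pattern):
    """Return every index in text where pattern starts."""
    if not pattern:
        return []  # assumption: an empty pattern yields no matches
    table = build_prefix_table(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        # The text index i never moves backwards; only k falls back.
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = table[k - 1]  # keep going to find overlapping matches
    return matches


print(build_prefix_table("AABAAB"))  # [0, 1, 0, 1, 2, 3]
print(kmp_search("AAAA", "AA"))      # [0, 1, 2]
```

Each text character is read once, and the fallback through the table replaces the backtracking of the naive version.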

Below is a Python implementation of the naive search.

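def pattern_search(text, pattern):
  print("Input Text:", text, "Input Pattern:", pattern)
  # Stop at the last index where the whole pattern still fits in the text;
  # starting any later would read past the end of the text (IndexError).
  for index in range(len(text) - len(pattern) + 1):
    print("Text Index:", index)
    match_count = 0
    # Compare the pattern against the text starting at this index
    for char in range(len(pattern)):
      print("Pattern Index:", char)
      if pattern[char] == text[char+index]:
        # Count the match first so the printed count is up to date
        match_count += 1
        print("Matching index found")
        print("Match Count:", match_count)
      else:
        # Mismatch: give up on this start index and try the next one
        break
    if match_count == len(pattern):
      print("{0} found at index {1}".format(pattern, index))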


text = "HAYHAYNEEDLEHAYHAYHAYNEEDLEHAYHAYHAYHAYNEEDLE"
pattern = "NEEDLE"
pattern_search(text, pattern)

 

posted @ 2022-01-24 17:12  M1stF0rest