Study Notes on Border Theory

These notes draw on 金策's [字符串算法选讲]

and on command_block's blog post border理论小记.

They briefly state and prove each of the lemmas below, then work through a few example problems.

\(Lemma1\)

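  • If \(p\) and \(q\) are both periods (\(period\), abbreviated \(pe\) below) of a string \(s\) and \(p + q \leq |s|\), then \(\gcd(p,q)\) is also a \(pe\) of \(s\).
  • Proof: assume \(p > q\) and let \(d = p - q\). Take any position \(i\) with \(i + d \leq |s|\). If \(i > q\), then \(s[i] = s[i-q] = s[i-q+p]\). If \(i \leq q\), then \(i + p \leq |s|\), so \(s[i] = s[i+p] = s[i+p-q]\). Either way \(s[i] = s[i+d]\), so \(d\) is also a \(pe\) of \(s\). The pair \((q, d)\) again satisfies \(q + d = p \leq |s|\), so the step can be repeated as in the subtractive Euclidean algorithm, which shows that \(\gcd(p,q)\) is a \(pe\) of \(s\).
  • A stronger form holds (the Fine-Wilf periodicity lemma): the condition only needs to be \(p + q - \gcd(p,q) \leq |s|\).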
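
The lemma and its length bound can be checked by brute force on small strings. The following is a minimal Python sketch, not taken from the referenced sources; the helper names `is_period`, `periods`, and `check_lemma1` are made up for illustration, and indices are 0-based.

```python
from math import gcd


def is_period(s: str, p: int) -> bool:
    """True if s[i] == s[i + p] for every i with i + p < len(s)."""
    return all(s[i] == s[i + p] for i in range(len(s) - p))


def periods(s: str) -> list[int]:
    """All periods of s in increasing order (len(s) itself counts trivially)."""
    return [p for p in range(1, len(s) + 1) if is_period(s, p)]


def check_lemma1(s: str) -> bool:
    """Check the stronger form: p + q - gcd(p, q) <= |s| implies gcd(p, q) is a period."""
    ps = periods(s)
    return all(
        is_period(s, gcd(p, q))
        for p in ps
        for q in ps
        if p + q - gcd(p, q) <= len(s)
    )


s = "aabaabaa"
print(periods(s))       # [3, 6, 7, 8]
print(check_lemma1(s))  # True
```

In this example \(3\) and \(7\) are both periods, but \(3 + 7 - \gcd(3,7) = 9 > 8\), and indeed \(\gcd(3,7) = 1\) is not a period, so the length condition cannot simply be dropped.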

\(Lemma2\)

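  • If strings \(u, v\) satisfy \(2|v| \geq |u|\), then the positions at which \(v\) matches in \(u\) form an arithmetic progression, and if that progression has at least \(3\) terms, the minimal period of \(v\) equals the common difference \(d\).
  • Proof: let \(d\) be the distance between the first and second matches of \(v\) in \(u\), and \(q\) the distance between the second match and some later match, as in the figure. Any two matches are at most \(|u| - |v| \leq |v|\) apart, so \(d\) and \(q\) are periods of \(v\); since \(d + q \leq |u| - |v| \leq |v|\), \(Lemma1\) shows that \(r = \gcd(d,q)\) is a period of \(v\) as well. Let \(p\) be the minimal period of \(v\), so \(p \leq r \leq d\). We also have \(d \leq p\), because otherwise shifting the first match right by \(p\) positions would produce a match before the second one. Hence \(d = p = r\), so \(\gcd(d,q) = d\), i.e. \(d \mid q\).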
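
A small experiment makes the statement concrete. This is a rough Python sketch under my own naming (`occurrences`, `is_arithmetic`, and `min_period` are hypothetical helpers, not from the referenced material); it uses naive matching rather than KMP, which is enough for illustration.

```python
def occurrences(u: str, v: str) -> list[int]:
    """0-based start indices of all (possibly overlapping) occurrences of v in u."""
    return [i for i in range(len(u) - len(v) + 1) if u.startswith(v, i)]


def is_arithmetic(xs: list[int]) -> bool:
    """True if consecutive differences are all equal (trivially true for < 3 terms)."""
    return all(xs[i + 1] - xs[i] == xs[1] - xs[0] for i in range(len(xs) - 1))


def min_period(v: str) -> int:
    """Smallest p >= 1 with v[i] == v[i + p] for every valid i (v must be non-empty)."""
    return next(
        p for p in range(1, len(v) + 1)
        if all(v[i] == v[i + p] for i in range(len(v) - p))
    )


u, v = "abababab", "abab"   # 2 * |v| >= |u| holds
pos = occurrences(u, v)
print(pos)                  # [0, 2, 4]
print(is_arithmetic(pos))   # True
print(min_period(v))        # 2, equal to the common difference
```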

\(Lemma3\)

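  • All \(border\)s of a string \(S\) whose length is at least \(|S|/2\) form an arithmetic progression.
  • Proof: let \(n = |S|\). Take the longest \(border\), of length \(n-p\), and any \(border\) of length \(n-q \geq n/2\). Then \(p\) and \(q\) are \(pe\) of \(S\) with \(p \leq q \leq n/2\), so \(p + q \leq n\) and by \(Lemma1\) \(\gcd(p,q)\) is also a \(pe\) of \(S\), which gives a \(border\) of length \(n - \gcd(p,q)\). Since \(n-p\) is the longest \(border\), \([n - \gcd(p,q) \leq n - p] \Rightarrow [\gcd(p,q) \geq p] \Rightarrow [\gcd(p,q) = p] \Rightarrow p \mid q\).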
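
To see the lemma on a concrete string, one can list every border with the standard prefix function (KMP failure function) and keep the long ones. The sketch below is illustrative only; `prefix_function`, `borders`, and `long_borders` are names chosen here, not names from the referenced sources.

```python
def prefix_function(s: str) -> list[int]:
    """pi[i] = length of the longest proper border of s[:i + 1]."""
    pi = [0] * len(s)
    for i in range(1, len(s)):
        k = pi[i - 1]
        while k > 0 and s[i] != s[k]:
            k = pi[k - 1]
        if s[i] == s[k]:
            k += 1
        pi[i] = k
    return pi


def borders(s: str) -> list[int]:
    """Lengths of all proper non-empty borders of s, longest first."""
    if not s:
        return []
    pi = prefix_function(s)
    out, k = [], pi[-1]
    while k > 0:          # follow the chain of borders of borders
        out.append(k)
        k = pi[k - 1]
    return out


def long_borders(s: str) -> list[int]:
    """Borders whose length is at least |s| / 2."""
    return [b for b in borders(s) if 2 * b >= len(s)]


s = "abaabaabaaba"
print(borders(s))       # [9, 6, 3, 1]
print(long_borders(s))  # [9, 6], common difference 3 = minimal period of s
```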

\(Lemma4\)

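  • (A strengthening of \(Lemma3\)) Group the \(border\)s of a string of length \(n\) by length into the ranges \([1,2], [2,4], \dots, [2^{i-1},2^i], \dots, [2^{k},n]\); within each range the \(border\)s form an arithmetic progression.
  • Proof: let \(u\) be the longest \(border\) whose length lies in \([2^{i-1},2^i]\), and let \(p\) be the minimal period of \(u\), so that the longest \(border\) of \(u\) has length \(|u| - p\). Every other \(border\) in \([2^{i-1},2^i]\) is a \(border\) of \(u\), and its length is at least \(|u|/2\) because \(2^{i-1} \cdot 2 \geq 2^{i} \geq |u|\). By \(Lemma3\) these remaining \(border\)s form an arithmetic progression with common difference \(p\), and since its largest possible term is \(|u| - p\), adding \(u\) itself still gives an arithmetic progression. QED.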
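
The grouping can be sketched in a few lines of Python. This is only an illustration with hypothetical helper names (`borders`, `border_groups`, `is_arithmetic`); it assumes half-open ranges \([2^{i-1}, 2^i)\) so that every border lands in exactly one group, and it finds borders by brute force rather than with the prefix function.

```python
def borders(s: str) -> list[int]:
    """Lengths of all proper non-empty borders of s, in increasing order (brute force)."""
    return [k for k in range(1, len(s)) if s[:k] == s[-k:]]


def border_groups(s: str) -> dict[int, list[int]]:
    """Map i -> borders b with 2**(i-1) <= b < 2**i, using b.bit_length() == i."""
    groups: dict[int, list[int]] = {}
    for b in borders(s):
        groups.setdefault(b.bit_length(), []).append(b)
    return groups


def is_arithmetic(xs: list[int]) -> bool:
    """True if consecutive differences are all equal (trivially true for < 3 terms)."""
    return all(xs[i + 1] - xs[i] == xs[1] - xs[0] for i in range(len(xs) - 1))


groups = border_groups("abababababab")
print(groups)  # {2: [2], 3: [4, 6], 4: [8, 10]}
print(all(is_arithmetic(g) for g in groups.values()))  # True
```

Since there are at most about \(\log_2 n\) groups, each one can be stored as a (first term, common difference, count) triple, which is the usual reason for splitting the borders this way.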

A corollary

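  • For a string of length \(n\), the arithmetic progressions of \(border\)s with common difference \(\ge d\) contain at most \(n/d\) terms in total.
    To be updated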
posted @ 2021-03-10 20:41  y_dove