Border theory study notes
These notes draw on Jin Ce's (金策) [字符串算法选讲] (Selected Topics in String Algorithms)
and on command_block's blog post "border理论小记" (Short Notes on Border Theory).
The notes briefly state and prove each of the lemmas below, then work through a few example problems.
\(Lemma1\)
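- If \(p\) and \(q\) are both periods of a string \(s\) (abbreviated \(pe\) below) and \(p + q \leq |s|\), then \(\gcd(p,q)\) is also a \(pe\) of \(s\).
- Proof: assume \(p > q\) and let \(d = p - q\). Take any \(i\) with \(i + d \leq |s|\). If \(i > q\), then \(s[i] = s[i-q] = s[i-q+p] = s[i+d]\). Otherwise \(i \leq q\), so \(i + p \leq p + q \leq |s|\) and \(s[i] = s[i+p] = s[i+p-q] = s[i+d]\). Hence \(d = p - q\) is also a \(pe\) of \(s\). The pair \((q, d)\) again satisfies \(q + d = p \leq |s|\), so repeating this step (the subtractive Euclidean algorithm) shows that \(\gcd(p,q)\) is a \(pe\) of \(s\).
- A stronger form also holds: the condition \(p + q - \gcd(p,q) \leq |s|\) is already sufficient.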
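
As a sanity check on \(Lemma1\), the following minimal sketch brute-forces the stronger condition \(p + q - \gcd(p,q) \leq |s|\) over all short strings on a two-letter alphabet. The helper names `is_period`, `periods` and `lemma1_holds` are made up for this illustration, and indices are 0-based here, unlike the 1-based indices in the proof.

```python
from itertools import product
from math import gcd


def is_period(s, p):
    """True if s[i] == s[i + p] for every valid 0-based index i."""
    return all(s[i] == s[i + p] for i in range(len(s) - p))


def periods(s):
    """All periods of s in increasing order (len(s) is the trivial period)."""
    return [p for p in range(1, len(s) + 1) if is_period(s, p)]


def lemma1_holds(max_len, alphabet="ab"):
    """Exhaustively test the strong form of the lemma on short strings."""
    for n in range(1, max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            ps = periods(s)
            for p in ps:
                for q in ps:
                    g = gcd(p, q)
                    if p + q - g <= n and not is_period(s, g):
                        return False
    return True
```

The string `"aabaa"` shows that the length condition cannot simply be dropped: it has periods 3 and 4, but \(3 + 4 - 1 = 6 > 5\), and \(\gcd(3,4) = 1\) is indeed not a period.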
\(Lemma2\)
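- If strings \(u, v\) satisfy \(2|v| \geq |u|\), then the positions at which \(v\) occurs in \(u\) form an arithmetic progression. Moreover, if this progression has at least \(3\) terms, its common difference \(d\) is the minimal period of \(v\).
- Proof: let \(d\) be the distance between the first and the second occurrence of \(v\) in \(u\), and let \(q\) be the distance between the second occurrence and any later one. Since \(2|v| \geq |u|\), these occurrences overlap and \(d + q \leq |v|\), so \(d\), \(q\) and (by \(Lemma1\)) \(r = \gcd(d,q)\) are all periods of \(v\). Let \(p\) be the minimal period of \(v\); then \(p \leq r \leq d\). On the other hand \(d \leq p\), because otherwise shifting the first occurrence right by \(p\) positions would give an occurrence strictly before the second one. Hence \(d = p = r\), so \(\gcd(d,q) = d\), i.e. \(d \mid q\). Every occurrence therefore lies at a multiple of \(d\) from the first one, and since the stretch of \(u\) covered by the occurrences has period \(d\), every such multiple up to the last occurrence is itself an occurrence.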
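
The statement of \(Lemma2\) can be checked on concrete strings with a small sketch like the one below. The function names are hypothetical, positions are 0-based, and the search is a plain `str.find` loop rather than anything efficient.

```python
def occurrences(u, v):
    """All 0-based start positions of v in u, overlaps allowed."""
    pos = []
    start = u.find(v)
    while start != -1:
        pos.append(start)
        start = u.find(v, start + 1)
    return pos


def min_period(v):
    """Smallest p >= 1 such that v[i] == v[i + p] for all valid i."""
    n = len(v)
    return next(p for p in range(1, n + 1) if v[p:] == v[:n - p])


def check_lemma2(u, v):
    """Return (positions, ok); ok is True if the positions behave as the lemma says."""
    if 2 * len(v) < len(u):
        raise ValueError("the lemma assumes 2|v| >= |u|")
    pos = occurrences(u, v)
    diffs = {b - a for a, b in zip(pos, pos[1:])}
    ok = len(diffs) <= 1                      # arithmetic progression
    if len(pos) >= 3:
        ok = ok and diffs == {min_period(v)}  # difference = minimal period
    return pos, ok
```

For instance, with `u = "ababababa"` and `v = "ababa"` the occurrences are at positions 0, 2 and 4, and the common difference 2 is the minimal period of `v`.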
\(Lemma3\)
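- All borders of a string \(S\) whose length is at least \(|S|/2\) form an arithmetic progression.
- Proof: let \(n = |S|\), let the longest border have length \(n-p\), and let an arbitrary border of length at least \(n/2\) have length \(n-q\). Then \(p\) and \(q\) are periods of \(S\) with \(p \leq q \leq n/2\), so \(p + q \leq n\) and by \(Lemma1\) \(\gcd(p,q)\) is also a period of \(S\), which gives a border of length \(n - \gcd(p,q)\). Since \(n-p\) is the longest border, \(n - \gcd(p,q) \leq n - p \Rightarrow \gcd(p,q) \geq p \Rightarrow \gcd(p,q) = p \Rightarrow p \mid q\). So every such border has length \(n - kp\) for some integer \(k \geq 1\).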
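
To see \(Lemma3\) on an actual string, one option is to list all borders with the standard prefix function (KMP failure function) and keep the long ones. This is only an illustrative sketch; `prefix_function`, `borders` and `long_borders` are names chosen here, not taken from the referenced material.

```python
def prefix_function(s):
    """pi[i] = length of the longest proper border of s[:i + 1]."""
    pi = [0] * len(s)
    for i in range(1, len(s)):
        k = pi[i - 1]
        while k > 0 and s[i] != s[k]:
            k = pi[k - 1]
        if s[i] == s[k]:
            k += 1
        pi[i] = k
    return pi


def borders(s):
    """Lengths of all nonempty proper borders of s, longest first."""
    if not s:
        return []
    pi = prefix_function(s)
    out = []
    k = pi[-1]
    while k > 0:          # follow the border chain
        out.append(k)
        k = pi[k - 1]
    return out


def long_borders(s):
    """Borders of length at least |s| / 2."""
    return [b for b in borders(s) if 2 * b >= len(s)]
```

For `s = "abababababab"` (length 12, minimal period 2) the long borders are 10, 8 and 6, an arithmetic progression with common difference \(p = 2\).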
\(Lemma4\)
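- (A strengthening of \(Lemma3\)) Split the borders of a string of length \(n\) by length into the ranges \([2^{i-1}, 2^i)\) for \(i = 1, \dots, k\) plus the final range \([2^k, n)\). Within each range the borders form an arithmetic progression.
- Proof: let \(u\) be the longest border whose length lies in \([2^{i-1}, 2^i)\). Every other border in this range is shorter than \(u\) and is therefore a border of \(u\), and its length is at least \(2^{i-1} \geq |u|/2\) because \(|u| < 2^i\). By \(Lemma3\) applied to \(u\), these borders form an arithmetic progression with lengths \(|u| - p, |u| - 2p, \dots\), where \(p\) is the minimal period of \(u\). Adding \(u\) itself therefore still gives an arithmetic progression with common difference \(p\). The same argument covers the final range \([2^k, n)\). QED.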
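
The grouping in \(Lemma4\) can be sketched as follows. A border length \(b\) lies in \([2^{k-1}, 2^k)\) exactly when `b.bit_length() == k`, so the sketch buckets borders by bit length and then checks each bucket. All function names are assumptions made for this example, and the brute-force check is only meant for short strings.

```python
from collections import defaultdict
from itertools import product


def borders(s):
    """Lengths of all nonempty proper borders of s, longest first."""
    if not s:
        return []
    pi = [0] * len(s)                  # prefix function of s
    for i in range(1, len(s)):
        k = pi[i - 1]
        while k > 0 and s[i] != s[k]:
            k = pi[k - 1]
        if s[i] == s[k]:
            k += 1
        pi[i] = k
    out, k = [], pi[-1]
    while k > 0:
        out.append(k)
        k = pi[k - 1]
    return out


def border_groups(s):
    """Map k -> borders with length in [2^(k-1), 2^k), longest first."""
    groups = defaultdict(list)
    for b in borders(s):
        groups[b.bit_length()].append(b)
    return dict(groups)


def is_arithmetic(xs):
    """True if consecutive differences of xs are all equal."""
    return len({a - b for a, b in zip(xs, xs[1:])}) <= 1


def lemma4_holds(max_len, alphabet="ab"):
    """Check every group of every short string over the alphabet."""
    for n in range(1, max_len + 1):
        for chars in product(alphabet, repeat=n):
            groups = border_groups("".join(chars))
            if not all(is_arithmetic(g) for g in groups.values()):
                return False
    return True
```

For `s = "abababababab"` the groups are `{4: [10, 8], 3: [6, 4], 2: [2]}`, so the five borders are described by three progressions. In general there are at most about \(\log_2 n\) groups, which is what makes this decomposition useful.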
A corollary
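- For a string of length \(n\), the arithmetic progressions of borders whose common difference is \(\ge d\) contain at most \(n/d\) terms in total.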
To be updated.