PHP通过KMP算法实现strpos

起因

昨天看了阮一峰老师的一篇博客《字符串匹配的KMP算法》,讲的非常棒。这篇文章也是解决了:

有一个字符串"BBC ABCDAB ABCDABCDABDE",里面是否包含另一个字符串"ABCDABD"?

后来发现,其实这不是就PHP自带函数strpos的功能吗?于是突发奇想,自己写个类,实现一下这个算法。

代码

<?php

class KMP
{
    public  $haystack;
    public  $needle;
    private $_haystackLen;
    private $_needleLen;
    private $_matchTable;
    private $_isMatch;

    //构造函数
    function __construct($haystack,$needle)
    {
        $this->haystack      = $haystack;
        $this->needle        = $needle;
    }

    //初始化一些参数
    private function init(){
        $this->_haystackLen  = $this->getLen($this->haystack);
        $this->_needleLen    = $this->getLen($this->needle);
        $this->_matchTable   = $this->getMatchTable();
        $this->_isMatch = false;

    }

    //类似strpos函数功能
    public function strpos()
    {
        $this->init();

        //haystack
        $haystackIdx  = $matchNum = 0;
        while ($haystackIdx <= $this->_haystackLen - $this->_needleLen){
            //needle
            $needIdx = 0;
            for (; $needIdx < $this->_needleLen; $needIdx++){
                if (strcmp($this->haystack[$haystackIdx],$this->needle[$needIdx]) <> 0){
                    if ($matchNum > 0){
                        $lastMatchValue = $this->getLastMatchValue($needIdx-1);
                        $haystackIdx   += $this->getStep($matchNum,$lastMatchValue);
                        $matchNum = 0;
                    } else {
                        $haystackIdx++;
                    }
                    break;
                } else {
                    $haystackIdx++;
                    $matchNum++;
                    if ($matchNum == $this->_needleLen){
                        $this->_isMatch = true;
                        break;
                    }
                }
            }
            if($this->_isMatch == true){
                break;
            }
        }
        return $this->_isMatch ? $haystackIdx - $this->_needleLen : false;
    }

    //获取字符长度
    private function getLen($str)
    {
       return mb_strlen($str,'utf-8');
    }

    //获取部分匹配表
    private function getMatchTable()
    {
        $matchTable = [];
        for ($i=0; $i < $this->_needleLen; $i++){
            $intersectLen = 0;
            $nowStr = mb_substr($this->needle,0,$i + 1,'utf-8');
            $preFixArr = $this->getPreFix($nowStr);
            $sufFixArr = $this->getSufFix($nowStr);
            if($preFixArr && $sufFixArr){
                $intersectArr = array_intersect($preFixArr,$sufFixArr);
                if (!empty($intersectArr)){
                    $intersect = array_pop($intersectArr);
                    $intersectLen = mb_strlen($intersect,'utf-8');
                }
            }
            $matchTable[$i] = $intersectLen;
        }
        return $matchTable;
    }

    //获取前缀数组
    private function getPreFix($str)
    {
        $outArr = [];
        $strLen = $this->getLen($str);
        if ($strLen > 1){
            for ($i = 1;$i < $strLen; $i++){
               $outArr[] = mb_substr($str,0,$i,'utf-8');
            }
        }
        return $outArr;
    }

    //获取后缀数组
    private function getSufFix($str)
    {
        $outArr = [];
        $strLen = $this->getLen($str);
        if ($strLen > 1){
            for ($i = 1;$i < $strLen; $i++){
                $outArr[] = mb_substr($str,$i,null,'utf-8');
            }
        }
        return $outArr;
    }

    //计算步长
    private function getStep($matchNum,$lastMatchValue)
    {
        return $matchNum - $lastMatchValue;
    }

    //获取最后匹配值
    private function getLastMatchValue($index)
    {
        return isset($this->_matchTable[$index]) ? $this->_matchTable[$index] : 0;
    }
}
$str = 'BBC ABCDAB ABCDABCDABDE';
$subStr = 'ABCDABD';
$kmp = new KMP($str,$subStr);
var_dump($kmp->strpos());
$kmp->haystack = 'i believe';
$kmp->needle   = 'lie';
var_dump($kmp->strpos());
$kmp->haystack = 'i love u';
$kmp->needle   = 'hate';
var_dump($kmp->strpos());

结论

以上代码就是我看完这篇文章最想做的事-实践。虽然代码不是最好的,但是基本功能实现啦。

posted @ 2019-09-12 18:03  luyuqiang  阅读(286)  评论(0编辑  收藏  举报