PHP Memoization

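A common situation: a class reads file contents through a fairly expensive method, getContent, and for a given argument the method always returns the same result. For example: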

class FileReader {
    public function getContent($file) {
        // open -> read -> decode; file_get_contents stands in for the expensive work
        $content = file_get_contents($file);
        return $content;
    }
}
$r = new FileReader();
$r->getContent('a.txt');
$r->getContent('a.txt');
$r->getContent('a.txt');

Calling it repeatedly like this is expensive, so a cache gets added:

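class FileReader {
    private $cache = array();
    public function getContent($file) {
        // array_key_exists also treats cached falsy values ('' or null) as hits
        if (array_key_exists($file, $this->cache)) {
            return $this->cache[$file];
        }
        // open -> read -> decode; file_get_contents stands in for the expensive work
        $this->cache[$file] = file_get_contents($file);
        return $this->cache[$file];
    }
}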

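This is much better, but it has problems:

1. It breaks the single responsibility of getContent
2. It hurts readability
3. It adds a class member variable, $cache

What is needed is a way to factor the caching mechanism out, similar to memoize in JS (see reference 1):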

function create_memoized_class($new_class, $base, $memo_funcs) {
    if(is_string($memo_funcs)) 
        return create_memoized_class($new_class, $base, array($memo_funcs));

    static $NON_STATIC_FUNC = 1;//01b
    static $STATIC_FUNC = 2;//10b

    $ref = new ReflectionClass($base);

    $type = 0;
    foreach($memo_funcs as $f) {
        if(!$ref->hasMethod($f)) throw new Exception("function '$f' not found in class '$base'");
        $m = $ref->getMethod($f);
        $type = $type | ($m->isStatic() ? $STATIC_FUNC : $NON_STATIC_FUNC);
    }

    $code = array(
        "class $new_class extends $base {"
    );
    if($type & $NON_STATIC_FUNC)
        $code[] = 
            "private \$cache = array();
            function __call(\$m, \$v) {
                \$k = md5(serialize(func_get_args()));
                if(!array_key_exists(\$k, \$this->cache)) {
                    \$this->cache[\$k] = call_user_func_array(array('parent', \$m), \$v);
                }
                return \$this->cache[\$k];
            }";
    if($type & $STATIC_FUNC)
        $code[] = 
            "private static \$s_cache = array();
            public static function __callStatic(\$m, \$v) {
                \$k = md5(serialize(func_get_args()));
                if(!array_key_exists(\$k, self::\$s_cache)) {
                    self::\$s_cache[\$k] = call_user_func_array(array('parent', \$m), \$v);
                }
                return self::\$s_cache[\$k];
            }";

    foreach($memo_funcs as $f) {
        $m = $ref->getMethod($f);
        $prefix = $m->isPrivate() ? 'private' : ($m->isPublic() ? 'public' : 'protected');
        // A variadic signature (PHP 5.6+) keeps each override compatible with the parent's
        // parameter list; PHP 8 rejects an override that declares fewer parameters.
        if($m->isStatic()) {
            $code[] = 
                "$prefix static function $f(...\$args) {
                    return self::__callStatic('$f', \$args);
                }";
        } else {
            $code[] = 
                "$prefix function $f(...\$args) {
                    return \$this->__call('$f', \$args);
                }";
        }
    }
    $code[] = 
        "}";

    $code = implode('', $code);
    // debug($code);
    eval($code);
}

Usage (debug() is an output helper that is not shown in this post):

// Test
class Person {
    var $job = "coding";

    function __construct($name) {
        debug("a person created: [$name]");
    }
    function work($time) {
        debug("now: $time, sleep 1 hour first...");
        sleep(1);
        debug("Hi, I'm $this->job");
        return "going home";
    }
    public static function getInstance($name) { // the usual Singleton boilerplate can be reduced to this
        return new Person($name);
    }
}
create_memoized_class("LazyPerson", "Person", array("work", "getInstance"));
// Non-static method: works
$z = new LazyPerson('zz');
$z->work('9:00');
$z->work('9:00'); // served from the cache
// Static method: works
$x = LazyPerson::getInstance('aj');
$y = LazyPerson::getInstance('aj');
debug($x===$y); //true
// Combining the two is where the problem appears
debug(LazyPerson::getInstance('aj')->work('9:00'));
debug(LazyPerson::getInstance('aj')->work('9:00')); // not cached: the original work() runs twice
// The cause: getInstance creates a plain Person with new, not a LazyPerson.
// The fix: make Person::getInstance return a LazyPerson.

How it works: a new class is generated dynamically with eval; it overrides the key methods and adds the caching mechanism.
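
For comparison, the same hook-and-cache idea can be applied to a plain callable instead of a class, much like memoize in JS. The following is only a minimal sketch: the helper name memoize and the sample closure are hypothetical and not part of the original code, and the variadic syntax assumes PHP 5.6 or later.

```php
<?php
// Hypothetical helper (not from the original post): wraps any callable
// and caches its results per argument list.
function memoize(callable $fn) {
    $cache = array();
    return function (...$args) use ($fn, &$cache) {
        // Same key scheme as create_memoized_class: hash of the serialized arguments.
        $k = md5(serialize($args));
        if (!array_key_exists($k, $cache)) {
            $cache[$k] = $fn(...$args);
        }
        return $cache[$k];
    };
}

$calls = 0;
$slow = function ($file) use (&$calls) {
    $calls++;                 // counts real executions
    return strtoupper($file); // stands in for open -> read -> decode
};

$fast = memoize($slow);
$first = $fast('a.txt');
$second = $fast('a.txt');     // served from the cache, $slow is not run again
```

The class-based generator is still needed when callers must keep using the normal method syntax ($obj->work(...)) without knowing about the wrapper.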

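Points to note:
1. A cached method cannot be private; it must be at least protected. The mechanism works through extends, so the subclass has to be able to call the original method.
2. When a Singleton accessor is cached together with other methods, the instance returned by getInstance must be of the cached class. Otherwise it is an easy trap: caching appears to be in place but silently has no effect and raises no error, as in the code above.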
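
One possible way to satisfy point 2 is late static binding: if getInstance uses new static instead of new Person, a call made through the subclass builds an instance of the subclass. The sketch below is an assumption-laden illustration, not the author's fix: LazyPerson is written by hand as a stand-in for what the generator would produce, the $workCalls counter is added only to make the effect visible, and the variadic overrides assume PHP 8.0 or later.

```php
<?php
class Person {
    public static $workCalls = 0; // illustrative counter of real work() executions
    public $name;

    public function __construct($name) {
        $this->name = $name;
    }

    public function work($time) {
        self::$workCalls++;
        return "going home";
    }

    public static function getInstance($name) {
        // Late static binding: `static` is the class the call was made on,
        // so LazyPerson::getInstance() builds a LazyPerson, not a Person.
        return new static($name);
    }
}

// Hand-written stand-in for the eval-generated class (an assumption for this sketch).
class LazyPerson extends Person {
    private $cache = array();
    private static $s_cache = array();

    public function work(...$args) {
        $k = md5(serialize(array('work', $args)));
        if (!array_key_exists($k, $this->cache)) {
            $this->cache[$k] = parent::work(...$args);
        }
        return $this->cache[$k];
    }

    public static function getInstance(...$args) {
        $k = md5(serialize(array('getInstance', $args)));
        if (!array_key_exists($k, self::$s_cache)) {
            // parent:: forwards the called class, so `new static` still sees LazyPerson.
            self::$s_cache[$k] = parent::getInstance(...$args);
        }
        return self::$s_cache[$k];
    }
}

$a = LazyPerson::getInstance('aj');
$b = LazyPerson::getInstance('aj');
$a->work('9:00');
$b->work('9:00'); // same LazyPerson instance, so this call is served from its cache
```

With new Person($name) in getInstance instead, the second work() call would run the original method again, which is the trap described above.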

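Possible improvements:
1. The key is currently generated with $k = md5(serialize(func_get_args())); for functions with complex arguments this could be extended with a custom key_gen
2. The cache storage could be abstracted, for example to store entries in memcache or in files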
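
Both improvements can be sketched on a plain callable. Everything named here is hypothetical: memoize_with, its $keyGen and $store parameters, and the sample closures are illustrations rather than part of the original code. The store is assumed to be any ArrayAccess object, so an in-memory ArrayObject is used below; a memcache-backed or file-backed store would have to implement the same interface.

```php
<?php
// Hypothetical helper: $keyGen turns the argument list into a cache key,
// $store is any ArrayAccess implementation that holds the cached values.
function memoize_with(callable $fn, callable $keyGen, ArrayAccess $store) {
    return function (...$args) use ($fn, $keyGen, $store) {
        $k = $keyGen($args);
        if (!$store->offsetExists($k)) {
            $store[$k] = $fn(...$args);
        }
        return $store[$k];
    };
}

$calls = 0;
$load = function (array $user) use (&$calls) {
    $calls++; // counts real executions
    return "profile of " . $user['id'];
};

// Custom key generator: only the id matters, other fields are ignored.
$byId = function (array $args) {
    return 'user:' . $args[0]['id'];
};

$store = new ArrayObject(); // in-memory store; swap in another ArrayAccess backend
$cached = memoize_with($load, $byId, $store);

$r1 = $cached(array('id' => 7, 'seen' => 1));
$r2 = $cached(array('id' => 7, 'seen' => 2)); // same key, served from $store
```

Passing the key generator and the store in as parameters keeps the wrapper itself unchanged when either one is replaced.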

References:

1. The memoization mechanism in JS: http://www.cnblogs.com/aj3423/archive/2011/02/28/3150514.html
2. Memoization of plain PHP functions: http://stackoverflow.com/questions/3540403/caching-function-results-in-php
3. The original discussion thread: http://topic.csdn.net/u/20110916/14/e2bb3043-44e6-45fa-a4f0-631bc168a1d2.html
4. A PHP extension for memoize: https://github.com/arraypad/php-memoize

posted @ 2011-09-24 00:42 aj3423
