PHP Unicode Encoding Conversion (JS encodeURIComponent and a Custom PHP unescape Function)

<?php
/**
 * Convert a string to Unicode numeric character references (&#NNNN;)
 *
 * @param string $input
 * @param string $input_charset
 * @return string
 */
function str_to_unicode($input, $input_charset = 'gbk'){
    $input = iconv($input_charset, "gbk", $input);
    // Split into characters: an optional high lead byte followed by one byte.
    preg_match_all("/[\x80-\xff]?./", $input, $ar);
    $b = array_map('utf8_unicode_', $ar[0]);
    $outstr = join("", $b);

    return $outstr;
}

function utf8_unicode_($c, $input_charset = 'gbk') {
    $c = iconv($input_charset, 'utf-8', $c);
    return utf8_unicode($c);
}

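// UTF-8 byte sequence (one character) -> Unicode numeric character reference
function utf8_unicode($c) {
    switch(strlen($c)) {
        case 1:
            return $c; // ASCII passes through unchanged
        case 2: // 110xxxxx 10xxxxxx
            $n = (ord($c[0]) & 0x1f) << 6;
            $n += ord($c[1]) & 0x3f;
        break;
        case 3: // 1110xxxx 10xxxxxx 10xxxxxx
            $n = (ord($c[0]) & 0x0f) << 12;
            $n += (ord($c[1]) & 0x3f) << 6;
            $n += ord($c[2]) & 0x3f;
        break;
        case 4: // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            $n = (ord($c[0]) & 0x07) << 18;
            $n += (ord($c[1]) & 0x3f) << 12;
            $n += (ord($c[2]) & 0x3f) << 6;
            $n += ord($c[3]) & 0x3f;
        break;
        default:
            return ''; // empty or over-long input: nothing to encode
    }

    return "&#$n;";
}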

/**
 * Convert Unicode numeric character references back to characters in an ordinary charset
 *
 * @param string $str
 * @param string $out_charset
 * @return string
 */
function str_from_unicode($str, $out_charset = 'gbk') {
    // Up to 7 digits, so code points above 99999 (4-byte UTF-8) are also matched.
    $str = preg_replace_callback("|&#([0-9]{1,7});|", 'unicode2utf8_', $str);
    $str = iconv("UTF-8", $out_charset, $str);
    return $str;
}

function unicode2utf8_($c) {
    return unicode2utf8($c[1]);
}
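function unicode2utf8($c){
    $c = (int) $c; // the regex callback passes the code point as a decimal string
    $str="";
    if ($c < 0x80) {
        $str.=chr($c); // ASCII: emit the character itself, not its number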
    } else if ($c < 0x800) {
        $str.=chr(0xC0 | $c>>6);
        $str.=chr(0x80 | $c & 0x3F);
    } else if ($c < 0x10000) {
        $str.=chr(0xE0 | $c>>12);
        $str.=chr(0x80 | $c>>6 & 0x3F);
        $str.=chr(0x80 | $c & 0x3F);
    } else if ($c < 0x200000) {
        $str.=chr(0xF0 | $c>>18);
        $str.=chr(0x80 | $c>>12 & 0x3F);
        $str.=chr(0x80 | $c>>6 & 0x3F);
        $str.=chr(0x80 | $c & 0x3F);
    }

    return $str;
}

/**
 * Emulate JavaScript's unescape
 *
 * @param string $str
 * @return string
 */
function unescape($str) {
    $str = rawurldecode($str);
    // Split into %uXXXX, &#xXXXX; and &#NNNN; tokens; anything else passes through.
    preg_match_all("/(?:%u.{4})|&#x.{4};|&#\d+;|.+/U",$str,$r);
    $ar = $r[0];
    foreach($ar as $k=>$v) {
        // pack("H4") and pack("n") emit big-endian bytes, so the byte order is stated explicitly.
        if(substr($v,0,2) == "%u")
            $ar[$k] = iconv("UCS-2BE","GB2312",pack("H4",substr($v,-4)));
        elseif(substr($v,0,3) == "&#x")
            $ar[$k] = iconv("UCS-2BE","GB2312",pack("H4",substr($v,3,-1)));
        elseif(substr($v,0,2) == "&#")
            $ar[$k] = iconv("UCS-2BE","GB2312",pack("n",substr($v,2,-1)));
    }
    return join("",$ar);
}
?>
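As a cross-check on the hand-written bit arithmetic above, the same numeric-reference round trip can be sketched with PHP's mbstring functions. This is a minimal sketch, assuming the mbstring extension is loaded and the input is already UTF-8. The names `ENTITY_MAP`, `to_numeric_entities` and `from_numeric_entities` are made up for illustration and are not part of the original code.

```php
<?php
// Conversion map: [first code point, last code point, offset, mask].
// Only non-ASCII code points (0x80 and above) become &#NNNN; references.
const ENTITY_MAP = [0x80, 0x10FFFF, 0, 0x1FFFFF];

// Hypothetical helper: UTF-8 text -> text with decimal numeric references.
function to_numeric_entities(string $utf8): string {
    return mb_encode_numericentity($utf8, ENTITY_MAP, 'UTF-8');
}

// Hypothetical helper: decimal numeric references -> UTF-8 text.
function from_numeric_entities(string $text): string {
    return mb_decode_numericentity($text, ENTITY_MAP, 'UTF-8');
}

// Worked arithmetic for U+4E2D (UTF-8 bytes E4 B8 AD), the 3-byte case:
// (0xE4 & 0x0F) << 12 | (0xB8 & 0x3F) << 6 | (0xAD & 0x3F) = 0x4E2D = 20013
$encoded = to_numeric_entities("a\u{4E2D}");   // "a&#20013;"
$decoded = from_numeric_entities($encoded);    // back to the original string
```

ASCII is left untouched here, which mirrors the one-byte case in `utf8_unicode`.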

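Based on the material above, here is how it is used in a real project:

On the front end, encode with the JS function "encodeURIComponent"; on the server, decode with the following custom PHP function: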

/**
 * Emulate JavaScript's unescape
 *
 * @param string $str
 * @param string $charset output charset
 * @return string
 */
function unescape($str, $charset = 'utf-8') {
    $str = rawurldecode($str);
    // Split into %uXXXX, &#xXXXX; and &#NNNN; tokens; anything else passes through.
    preg_match_all("/(?:%u.{4})|&#x.{4};|&#\d+;|.+/U", $str, $r);
    $ar = $r[0];
    foreach ($ar as $k => $v) {
        // pack("H4") and pack("n") emit big-endian bytes, so the byte order is stated explicitly.
        if (substr($v,0,2) == "%u")
            $ar[$k] = iconv("UCS-2BE", $charset, pack("H4",substr($v,-4)));
        elseif (substr($v, 0, 3) == "&#x")
            $ar[$k] = iconv("UCS-2BE", $charset, pack("H4", substr($v, 3, -1)));
        elseif (substr($v,0,2) == "&#")
            $ar[$k] = iconv("UCS-2BE", $charset, pack("n", substr($v, 2, -1)));
    }

    return join("", $ar);
}
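
One point worth keeping in mind with this decoder: `encodeURIComponent` emits UTF-8 bytes as `%XX` sequences, which `rawurldecode` already handles, while the `%uXXXX` form comes from the legacy JS `escape` function. The sketch below decodes both forms in a single pass. It assumes the mbstring extension and UTF-8 output, and `decode_js_escaped` is a hypothetical name, not the function from this post.

```php
<?php
// Hypothetical helper: decode %XX (encodeURIComponent) and %uXXXX (escape) to UTF-8.
function decode_js_escaped(string $s): string {
    return preg_replace_callback(
        // Match whole runs so surrogate pairs and multi-byte sequences stay together.
        '/(?:%u[0-9a-fA-F]{4})+|(?:%[0-9a-fA-F]{2})+/',
        function (array $m): string {
            if ($m[0][1] === 'u') {
                // %uXXXX run: treat the hex digits as big-endian UTF-16 code units.
                $hex = str_replace('%u', '', $m[0]);
                return mb_convert_encoding(pack('H*', $hex), 'UTF-8', 'UTF-16BE');
            }
            // %XX run: plain percent-decoding yields the UTF-8 bytes directly.
            return rawurldecode($m[0]);
        },
        $s
    );
}

$fromEncodeURIComponent = decode_js_escaped('%E4%B8%AD%E6%96%87'); // U+4E2D U+6587
$fromEscape             = decode_js_escaped('%u4E2D%u6587');       // same two characters
```

Decoding in one pass avoids re-interpreting a `%` that was itself produced by a `%u0025` sequence.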

Server-side decoding:

$firstName = urldecode(trim($_POST['firstName']));
$firstName  = unescape($firstName);
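
PHP decodes form fields once while populating `$_POST`, so the extra `urldecode` above matters when the value arrives encoded twice (once by `encodeURIComponent`, once by the form layer). The sketch below illustrates that case under stated assumptions: `parse_str` stands in for PHP's request parsing because no real request is available here, and `rawurlencode` stands in for `encodeURIComponent`, which produces the same output for this particular input.

```php
<?php
// Step 1: what encodeURIComponent would produce for U+4E2D followed by " +1".
$once = rawurlencode("\u{4E2D} +1");          // "%E4%B8%AD%20%2B1"

// Step 2: the form layer URL-encodes that string again (double encoding).
$body = 'firstName=' . rawurlencode($once);

// Step 3: PHP's own parsing removes one layer (parse_str is a stand-in for $_POST).
parse_str($body, $fields);
$afterPhp = $fields['firstName'];             // still percent-encoded once

// Step 4: the explicit urldecode removes the remaining layer.
$firstName = urldecode(trim($afterPhp));
```

If the client encodes only once, the value in `$_POST` is already plain text, and a second `urldecode` would turn any literal `+` into a space.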

http://blog.snsgou.com/post-331.html

posted on 2014-06-12 22:50 by 今天又进步了
