Converting between Chinese text, Unicode, UTF-8, and GB2312 in Perl

Reference: http://daimajishu.iteye.com/blog/959239
According to my tests, though, it contains some errors.
The original text is as follows:

# author: jiangyujie
use utf8;  ## for the last example, "use utf8;" must not be present here
use Encode;
use URI::Escape;

$\ = "\n";

# Get the UTF-8 encoding from a Unicode escape (%uXXXX)
$str = '%u6536';
$str =~ s/\%u([0-9a-fA-F]{4})/pack("U",hex($1))/eg;
$str = encode( "utf8", $str );
print uc unpack( "H*", $str );

# Get the GB2312 encoding from a Unicode escape (%uXXXX)
$str = '%u6536';
$str =~ s/\%u([0-9a-fA-F]{4})/pack("U",hex($1))/eg;
$str = encode( "gb2312", $str );
print uc unpack( "H*", $str );

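# Get the percent-encoded UTF-8 form of Chinese text
# (under "use utf8" the literal is a character string, and plain
# uri_escape() croaks on it; uri_escape_utf8() encodes to UTF-8 first)
$str = "收";
print uri_escape_utf8($str);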

# Get Chinese text back from the percent-encoded UTF-8 form
$utf8_str = uri_escape_utf8("收");
print uri_unescape($utf8_str);    # unescape the escaped string, not $str

# Get Perl Unicode code points from Chinese text
utf8::decode($str);    # convert UTF-8 bytes to characters in place
@chars = split //, $str;
foreach (@chars) {
    printf "%x ", ord($_);
}

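# Get standard \uXXXX Unicode escapes from Chinese text
$text = "汉语";    # renamed from $a, which sort() uses
# Decode only if $text still holds bytes; decode() croaks on wide characters
$text = decode( "utf8", $text ) unless utf8::is_utf8($text);
print join "", map { sprintf( "\\u%x", $_ ) } unpack( "U*", $text );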

# Get Chinese text from a standard Unicode escape (%uXXXX)
$str = '%u6536';
$str =~ s/\%u([0-9a-fA-F]{4})/pack("U",hex($1))/eg;
$str = encode( "utf8", $str );
print $str;

# Get Chinese text from Perl Unicode code points
my $unicode = "\x{505c}\x{8f66}";
print encode( "utf8", $unicode );   ## In my tests this line was wrong; it should be written as: utf8::encode($unicode); print $unicode;
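
The two spellings in the last example can be compared directly. What follows is only a minimal sketch, assuming nothing beyond the core Encode module; the helper name hex_bytes is made up for illustration. For these two code points both spellings should give the same UTF-8 bytes, so if the outputs differ on your machine, the cause is more likely elsewhere in the script or in the terminal settings than in encode() itself.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: show a byte string as uppercase hex.
sub hex_bytes { return uc unpack( "H*", $_[0] ) }

my $unicode = "\x{505c}\x{8f66}";    # character string, two code points

# Spelling 1: Encode::encode returns a new byte string.
my $via_encode = encode( "utf8", $unicode );

# Spelling 2: utf8::encode converts its argument in place, so work on a copy.
my $in_place = $unicode;
utf8::encode($in_place);

print hex_bytes($via_encode), "\n";
print hex_bytes($in_place),   "\n";
print $via_encode eq $in_place ? "same bytes\n" : "different bytes\n";
```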

====================== My tests follow

1) Encoding Chinese text
[root@tts177:/tmp]$more uuu.pl
#!/usr/bin/perl
use warnings;
use Data::Dumper;
use URI::Escape;

$utf8_str = uri_escape("收");

print $utf8_str;
[root@tts177:/tmp]$
[root@tts177:/tmp]$./uuu.pl
%E6%94%B6[root@tts177:/tmp]$
[root@tts177:/tmp]$
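
The test above passes the literal straight to uri_escape. That works because, without "use utf8", the literal is a byte string (UTF-8 bytes, assuming the script file is saved as UTF-8). When the text is a character string instead, uri_escape_utf8 from the same module encodes it to UTF-8 before escaping. A minimal sketch, assuming URI::Escape is installed and the file is saved as UTF-8:

```perl
use strict;
use warnings;
use utf8;    # string literals below are character strings
use URI::Escape qw(uri_escape_utf8);

# One character (U+6536) becomes three percent-encoded UTF-8 bytes.
my $escaped = uri_escape_utf8("收");
print $escaped, "\n";
```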

2) Decoding a URL
[root@tts177:/tmp]$more uuu.pl
#!/usr/bin/perl
use warnings;
use Data::Dumper;
use URI::Escape;

$str = "%E6%94%B6";

print uri_unescape($str);
[root@tts177:/tmp]$
[root@tts177:/tmp]$./uuu.pl
收[root@tts177:/tmp]$
[root@tts177:/tmp]$
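
uri_unescape returns raw bytes, which is why the output above displays correctly on a UTF-8 terminal. To treat the result as characters (for length, regexes, and so on), decode it first. A minimal sketch, assuming the escaped data is UTF-8 and URI::Escape is installed:

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use URI::Escape qw(uri_unescape);

my $bytes = uri_unescape("%E6%94%B6");    # three raw bytes
my $chars = decode( "UTF-8", $bytes );    # one character

printf "bytes: %d, chars: %d, code point: U+%04X\n",
    length($bytes), length($chars), ord($chars);

# Encode again before printing to avoid a "Wide character" warning.
print encode( "UTF-8", $chars ), "\n";
```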

posted @ 2014-04-05 09:04  芽滴滴