Implementing Full-Width to Half-Width Conversion
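First, it helps to be clear about why full-width characters should be converted to half-width at all.
In typical business applications, Japanese or English text can be entered in full-width form, and the same name should mean the same thing whether it was typed in full-width or half-width.
When sorting, however, full-width characters come after half-width ones. That does not match what some people expect (the so-called product managers, of course), so the only option is to bite the bullet and work out a conversion.
Below is a side-by-side table of the Unicode ranges for the full-width and half-width characters: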
Full-width vs. half-width comparison

| ASCII | Full-width character | Unicode | Half-width character | Unicode |
|---|---|---|---|---|
| 0x20 | "　" (space) | U+3000 | " " (space) | U+0020 |
| 0x21 | ！ | U+FF01 | ! | U+0021 |
| 0x22 | ＂ | U+FF02 | " | U+0022 |
| 0x23 | ＃ | U+FF03 | # | U+0023 |
| 0x24 | ＄ | U+FF04 | $ | U+0024 |
| 0x25 | ％ | U+FF05 | % | U+0025 |
| 0x26 | ＆ | U+FF06 | & | U+0026 |
| 0x27 | ＇ | U+FF07 | ' | U+0027 |
| 0x28 | （ | U+FF08 | ( | U+0028 |
| 0x29 | ） | U+FF09 | ) | U+0029 |
| 0x2a | ＊ | U+FF0A | * | U+002A |
| 0x2b | ＋ | U+FF0B | + | U+002B |
| 0x2c | ， | U+FF0C | , | U+002C |
| 0x2d | － | U+FF0D | - | U+002D |
| 0x2e | ． | U+FF0E | . | U+002E |
| 0x2f | ／ | U+FF0F | / | U+002F |
| 0x30 | ０ | U+FF10 | 0 | U+0030 |
| 0x31 | １ | U+FF11 | 1 | U+0031 |
| 0x32 | ２ | U+FF12 | 2 | U+0032 |
| 0x33 | ３ | U+FF13 | 3 | U+0033 |
| 0x34 | ４ | U+FF14 | 4 | U+0034 |
| 0x35 | ５ | U+FF15 | 5 | U+0035 |
| 0x36 | ６ | U+FF16 | 6 | U+0036 |
| 0x37 | ７ | U+FF17 | 7 | U+0037 |
| 0x38 | ８ | U+FF18 | 8 | U+0038 |
| 0x39 | ９ | U+FF19 | 9 | U+0039 |
| 0x3a | ： | U+FF1A | : | U+003A |
| 0x3b | ； | U+FF1B | ; | U+003B |
| 0x3c | ＜ | U+FF1C | < | U+003C |
| 0x3d | ＝ | U+FF1D | = | U+003D |
| 0x3e | ＞ | U+FF1E | > | U+003E |
| 0x3f | ？ | U+FF1F | ? | U+003F |
| 0x40 | ＠ | U+FF20 | @ | U+0040 |
| 0x41 | Ａ | U+FF21 | A | U+0041 |
| 0x42 | Ｂ | U+FF22 | B | U+0042 |
| 0x43 | Ｃ | U+FF23 | C | U+0043 |
| 0x44 | Ｄ | U+FF24 | D | U+0044 |
| 0x45 | Ｅ | U+FF25 | E | U+0045 |
| 0x46 | Ｆ | U+FF26 | F | U+0046 |
| 0x47 | Ｇ | U+FF27 | G | U+0047 |
| 0x48 | Ｈ | U+FF28 | H | U+0048 |
| 0x49 | Ｉ | U+FF29 | I | U+0049 |
| 0x4a | Ｊ | U+FF2A | J | U+004A |
| 0x4b | Ｋ | U+FF2B | K | U+004B |
| 0x4c | Ｌ | U+FF2C | L | U+004C |
| 0x4d | Ｍ | U+FF2D | M | U+004D |
| 0x4e | Ｎ | U+FF2E | N | U+004E |
| 0x4f | Ｏ | U+FF2F | O | U+004F |
| 0x50 | Ｐ | U+FF30 | P | U+0050 |
| 0x51 | Ｑ | U+FF31 | Q | U+0051 |
| 0x52 | Ｒ | U+FF32 | R | U+0052 |
| 0x53 | Ｓ | U+FF33 | S | U+0053 |
| 0x54 | Ｔ | U+FF34 | T | U+0054 |
| 0x55 | Ｕ | U+FF35 | U | U+0055 |
| 0x56 | Ｖ | U+FF36 | V | U+0056 |
| 0x57 | Ｗ | U+FF37 | W | U+0057 |
| 0x58 | Ｘ | U+FF38 | X | U+0058 |
| 0x59 | Ｙ | U+FF39 | Y | U+0059 |
| 0x5a | Ｚ | U+FF3A | Z | U+005A |
| 0x5b | ［ | U+FF3B | [ | U+005B |
| 0x5c | ＼ | U+FF3C | \ | U+005C |
| 0x5d | ］ | U+FF3D | ] | U+005D |
| 0x5e | ＾ | U+FF3E | ^ | U+005E |
| 0x5f | ＿ | U+FF3F | _ | U+005F |
| 0x60 | ｀ | U+FF40 | ` | U+0060 |
| 0x61 | ａ | U+FF41 | a | U+0061 |
| 0x62 | ｂ | U+FF42 | b | U+0062 |
| 0x63 | ｃ | U+FF43 | c | U+0063 |
| 0x64 | ｄ | U+FF44 | d | U+0064 |
| 0x65 | ｅ | U+FF45 | e | U+0065 |
| 0x66 | ｆ | U+FF46 | f | U+0066 |
| 0x67 | ｇ | U+FF47 | g | U+0067 |
| 0x68 | ｈ | U+FF48 | h | U+0068 |
| 0x69 | ｉ | U+FF49 | i | U+0069 |
| 0x6a | ｊ | U+FF4A | j | U+006A |
| 0x6b | ｋ | U+FF4B | k | U+006B |
| 0x6c | ｌ | U+FF4C | l | U+006C |
| 0x6d | ｍ | U+FF4D | m | U+006D |
| 0x6e | ｎ | U+FF4E | n | U+006E |
| 0x6f | ｏ | U+FF4F | o | U+006F |
| 0x70 | ｐ | U+FF50 | p | U+0070 |
| 0x71 | ｑ | U+FF51 | q | U+0071 |
| 0x72 | ｒ | U+FF52 | r | U+0072 |
| 0x73 | ｓ | U+FF53 | s | U+0073 |
| 0x74 | ｔ | U+FF54 | t | U+0074 |
| 0x75 | ｕ | U+FF55 | u | U+0075 |
| 0x76 | ｖ | U+FF56 | v | U+0076 |
| 0x77 | ｗ | U+FF57 | w | U+0077 |
| 0x78 | ｘ | U+FF58 | x | U+0078 |
| 0x79 | ｙ | U+FF59 | y | U+0079 |
| 0x7a | ｚ | U+FF5A | z | U+007A |
| 0x7b | ｛ | U+FF5B | { | U+007B |
| 0x7c | ｜ | U+FF5C | \| | U+007C |
| 0x7d | ｝ | U+FF5D | } | U+007D |
| 0x7e | ～ | U+FF5E | ~ | U+007E |
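Careful readers will notice a pattern: apart from the space in the first row, whose full-width and half-width code points differ by 0x3000 - 0x0020 = 0x2FE0 (12256), every other character's full-width and half-width code points differ by exactly 0xFEE0 (65248).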
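The two offsets are easy to check from the code points themselves. The following is a minimal Python sketch; the constant names are illustrative only, and the cross-check assumes that the standard library's NFKC normalization maps each full-width form to its ASCII counterpart (which it does in current Unicode data).

```python
import unicodedata

# Offset for the space pair: U+3000 (ideographic space) vs. U+0020.
SPACE_OFFSET = 0x3000 - 0x0020

# Offset for the first non-space pair: U+FF01 vs. U+0021.
OTHER_OFFSET = 0xFF01 - 0x0021

print(hex(SPACE_OFFSET), SPACE_OFFSET)  # 0x2fe0 12256
print(hex(OTHER_OFFSET), OTHER_OFFSET)  # 0xfee0 65248

# Cross-check every printable ASCII character from 0x21 to 0x7E:
# shifting it up by OTHER_OFFSET should give a character that NFKC
# normalization folds back to the original ASCII character.
mismatches = [
    hex(cp)
    for cp in range(0x21, 0x7F)
    if unicodedata.normalize("NFKC", chr(cp + OTHER_OFFSET)) != chr(cp)
]
print(mismatches)  # an empty list means the offset holds for the whole range
```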
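Once you have spotted this pattern, the conversion is essentially done. The actual code depends on the language you use, so implement it in whatever way fits your own stack.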
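As one possible realization of the rule, here is a minimal Python sketch. The function name `to_halfwidth` and the constants are hypothetical, chosen only for this example: it special-cases the full-width space, subtracts 0xFEE0 from anything in U+FF01..U+FF5E, and leaves every other character untouched.

```python
FULLWIDTH_SPACE = 0x3000   # ideographic (full-width) space
FULLWIDTH_FIRST = 0xFF01   # full-width "!"
FULLWIDTH_LAST = 0xFF5E    # full-width "~"
OFFSET = 0xFEE0            # distance between a full-width form and its ASCII form


def to_halfwidth(text: str) -> str:
    """Return text with full-width ASCII variants replaced by half-width ones."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp == FULLWIDTH_SPACE:
            # The space is the one exception to the 0xFEE0 rule.
            out.append(" ")
        elif FULLWIDTH_FIRST <= cp <= FULLWIDTH_LAST:
            out.append(chr(cp - OFFSET))
        else:
            # Kana, kanji and everything else pass through unchanged.
            out.append(ch)
    return "".join(out)


# Full-width "ABC 123!" written with escapes so the code points are explicit.
sample = "\uff21\uff22\uff23\u3000\uff11\uff12\uff13\uff01"
print(to_halfwidth(sample))  # ABC 123!

# Sorting: the full-width "ABC" sorts after the half-width "ABD" by default,
# but comes first when the comparison key is the half-width form.
names = ["ABD", "\uff21\uff22\uff23"]
by_code_point = sorted(names)
by_halfwidth = sorted(names, key=to_halfwidth)
```

Using the converted string only as a sort key, as in the last line, keeps the stored names exactly as the user typed them while still giving the expected order. Whether to convert on input or only at comparison time is a design choice the original text leaves open.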