Python 3 str numeric checks: str.isdecimal(), isdigit(), isnumeric()

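Coverage: Numeric > Digit > Decimal. Every string that passes isdecimal() also passes isdigit(), and every string that passes isdigit() also passes isnumeric().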

Numeric Type (Unicode character property)

Not numeric (code: none; has a numeric value: no)
  Does not represent a number, or does not represent only a number.
  • A
  • X (Latin)
  • !
  • Д
  • μ
  Remarks: Numeric Value="NaN"

Decimal (code: De; has a numeric value: yes)
  Directly represents a decimal digit.
  • 0
  • 1
  • 9
  • ६ (Devanagari 6)
  • ೬ (Kannada 6)
  • 𝟨 (Mathematical, styled sans serif)
  Remarks: Straight digit (decimal-radix). Corresponds both ways with General Category=Nd

Digit (code: Di; has a numeric value: yes)
  A variant of Decimal.
  • ¹ (superscript)
  • a digit with full stop, such as ⒈
  Remarks: Decimal, but in typographic context

Numeric (code: Nu; has a numeric value: yes)
  Has only a numeric meaning, but does not directly represent a decimal digit.
  • ¾
  • ௰ (Tamil number ten)
  • a Roman numeral, such as Ⅻ
  • 六 (Han number 6)
  Remarks: Numeric value, but not decimal-radix

Reference: Wikipedia
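
The three Numeric Type classes can be checked from Python itself. The following is a minimal sketch, not part of the original post: `describe` is a hypothetical helper name, and the sample characters are chosen to land in each class of the table above.

```python
import unicodedata

def describe(ch):
    """Hypothetical helper: (isdecimal, isdigit, isnumeric, numeric value or None)."""
    return (ch.isdecimal(), ch.isdigit(), ch.isnumeric(),
            unicodedata.numeric(ch, None))

# A (not numeric), 9 and Devanagari 6 (Decimal), superscript one (Digit),
# three quarters and Han six (Numeric)
for ch in ["A", "9", "\u096c", "\u00b9", "\u00be", "\u516d"]:
    print(repr(ch), unicodedata.name(ch), describe(ch))
```

Each step down the table turns one more method from True to False: Decimal characters pass all three checks, Digit characters fail only isdecimal(), and Numeric characters pass only isnumeric().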

str.isdecimal()

Return true if all characters in the string are decimal characters and there is at least one character, false otherwise. Decimal characters are those that can be used to form numbers in base 10, e.g. U+0660, ARABIC-INDIC DIGIT ZERO. Formally a decimal character is a character in the Unicode General Category “Nd”.

In [56]: int('\u096a') + 5  # '\u096a' is ४, the Devanagari digit 4
Out[56]: 9

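As a further illustration (a sketch with sample characters chosen here, not taken from the original post), decimal digits from several scripts all have general category Nd, and int() accepts them directly:

```python
import unicodedata

# ASCII 0, ARABIC-INDIC DIGIT ZERO, DEVANAGARI DIGIT FOUR, KANNADA DIGIT SIX
samples = ["0", "\u0660", "\u096a", "\u0cec"]
values = []
for ch in samples:
    # Every sample is a decimal character with general category Nd
    assert ch.isdecimal() and unicodedata.category(ch) == "Nd"
    values.append(int(ch))
    print(unicodedata.name(ch), int(ch), unicodedata.decimal(ch))

# A multi-character string of Arabic-Indic digits 1, 2, 3
mixed = "\u0661\u0662\u0663"
print(int(mixed))
```
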
str.isdigit()
Return true if all characters in the string are digits and there is at least one character, false otherwise. Digits include decimal characters and digits that need special handling, such as the compatibility superscript digits. This covers digits which cannot be used to form numbers in base 10, like the Kharosthi numbers. Formally, a digit is a character that has the property value Numeric_Type=Digit or Numeric_Type=Decimal.
In [47]: str.isdigit('⑦')
Out[47]: True

In [48]: str.isdecimal('⑦')
Out[48]: False

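The difference between the two checks can also be seen through the unicodedata module. In this sketch (`digit_info` is a hypothetical helper name), a Digit-only character has a digit value but no decimal value:

```python
import unicodedata

def digit_info(ch):
    """Hypothetical helper: (isdigit, isdecimal, digit value, decimal value or None)."""
    return (ch.isdigit(), ch.isdecimal(),
            unicodedata.digit(ch, None), unicodedata.decimal(ch, None))

# DIGIT SEVEN, SUPERSCRIPT TWO, CIRCLED DIGIT SEVEN
for ch in ["7", "\u00b2", "\u2466"]:
    print(unicodedata.name(ch), digit_info(ch))
```
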
str.isnumeric()
Return true if all characters in the string are numeric characters, and there is at least one character, false otherwise. Numeric characters include digit characters, and all characters that have the Unicode numeric value property, e.g. U+2155, VULGAR FRACTION ONE FIFTH. Formally, numeric characters are those with the property value Numeric_Type=Digit, Numeric_Type=Decimal or Numeric_Type=Numeric.
In [49]: str.isnumeric("四")
Out[49]: True
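
A short sketch of what isnumeric() does and does not accept; the sample characters and strings are chosen for illustration only. Note that the method looks at characters one by one, so signs and decimal points make it return False.

```python
import unicodedata

# VULGAR FRACTION ONE FIFTH, the Han character for four, ROMAN NUMERAL TWELVE
chars = ["\u2155", "\u56db", "\u216b"]
values = {ch: unicodedata.numeric(ch) for ch in chars}
for ch in chars:
    print(unicodedata.name(ch), ch.isnumeric(), ch.isdigit(), values[ch])

# A minus sign or a decimal point is not a numeric character
for text in ["-1", "1.5"]:
    print(repr(text), text.isnumeric())
```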

Reference: the official Python 3 documentation

################################################

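Summary

In Python 3, str is Unicode, so these checks follow Unicode's definition of numeric characters, which is far broader than what the bytes-based str that Python 2 uses by default can cover.

Unicode general categories in which matching characters are found:

isdecimal: Nd
isdigit:   No, Nd
isnumeric: No, Nd, Nl
isalnum:   No, Nd, Nl, Lu, Lt, Lo, Lm, Ll

These lists are approximate: only some No characters count as digits (½ is in No, yet '½'.isdigit() is False), and isnumeric() also accepts some Lo characters such as 四.
Reference: Stack Overflow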
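
The category lists above can be checked against the interpreter's own Unicode data. This is a rough sketch that scans every code point; the exact sets depend on the Unicode version bundled with the running Python, so no fixed output is assumed here.

```python
import sys
import unicodedata
from collections import defaultdict

# Map each method name to the set of general categories of matching characters
cats = defaultdict(set)
for cp in range(sys.maxunicode + 1):
    ch = chr(cp)
    for method in ("isdecimal", "isdigit", "isnumeric"):
        if getattr(ch, method)():
            cats[method].add(unicodedata.category(ch))

for method in ("isdecimal", "isdigit", "isnumeric"):
    print(method, sorted(cats[method]))
```

The scan takes a moment because it visits more than a million code points.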

 

Problems when converting to int

## Python 3

In [5]: '⑦'.isdigit()
Out[5]: True

In [54]: int('⑦')
ValueError Traceback (most recent call last)
<ipython-input-54-bce0236ab387> in <module>()
----> 1 int('⑦')
ValueError: invalid literal for int() with base 10: '⑦'

## Python 2, with unicode objects

In [22]: int(u'\u096a') +5
Out[22]: 9

In [29]: str.isdecimal('5')

AttributeError Traceback (most recent call last)
<ipython-input-29-db35622e1d07> in <module>()
----> 1 str.isdecimal('5')

AttributeError: type object 'str' has no attribute 'isdecimal'

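In other words:

In Python 3, str is Unicode by default, so a string can pass isdigit() and still fail to convert with int().

In Python 2, isdecimal() checks whether a string contains only decimal characters, and the method exists only on unicode objects.

So this behavior is not compatible with Python 2.

The check that isdigit() provided on a Python 2 str is now provided by isdecimal() in Python 3.

numpy: code that still relies on isdigit may meet values that cannot be converted to int.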
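
One way to guard an int() conversion in Python 3 is sketched below. Both function names are hypothetical and not from the original post. The first checks isdecimal() before converting; the second simply attempts the conversion and catches the error, which also handles signs and surrounding whitespace.

```python
def to_int_or_none(text):
    """Hypothetical helper: convert only if every character is a decimal digit."""
    if text.isdecimal():  # False for '', '⑦', '-5', '1.5'
        return int(text)
    return None

def parse_int(text):
    """Hypothetical helper: let int() decide and report failure as None."""
    try:
        return int(text)
    except ValueError:
        return None

# Plain digits, CIRCLED DIGIT SEVEN, Arabic-Indic "42", a signed value, empty string
for text in ["42", "\u2466", "\u0664\u0662", "-5", ""]:
    print(repr(text), to_int_or_none(text), parse_int(text))
```

The isdecimal() guard is stricter than int() itself, since int() also accepts a leading sign and whitespace, so the try/except form is the better fit when such inputs are expected.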

  

 

 

------------ P.S.

str.isalnum()   

Return true if all characters in the string are alphanumeric and there is at least one character, false otherwise. A character c is alphanumeric if one of the following returns True: c.isalpha(), c.isdecimal(), c.isdigit(), or c.isnumeric().
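
The definition quoted above can be spelled out per character. In this sketch, `is_alnum_char` is a hypothetical name and the sample characters are chosen for illustration:

```python
def is_alnum_char(c):
    """Hypothetical helper mirroring the documented definition for one character."""
    return c.isalpha() or c.isdecimal() or c.isdigit() or c.isnumeric()

# A letter, a Han character, a digit, a superscript, a fraction, then non-alphanumerics
samples = ["a", "\u4e2d", "5", "\u00b2", "\u2155", "_", " ", "-"]
for c in samples:
    print(repr(c), c.isalnum(), is_alnum_char(c))

print("abc123".isalnum(), "abc 123".isalnum())
```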

str.isalpha(): Unicode letters

    This covers the usual English letters [a-z] as well as other "Letter" characters, such as European Latin and non-European and historic Latin letters, and even Chinese characters.

Return true if all characters in the string are alphabetic and there is at least one character, false otherwise. Alphabetic characters are those characters defined in the Unicode character database as “Letter”, i.e., those with general category property being one of “Lm”, “Lt”, “Lu”, “Ll”, or “Lo”. Note that this is different from the “Alphabetic” property defined in the Unicode Standard.

In [33]: str.isalpha('Ƣ')
Out[33]: True
In [51]: str.isalpha("中")
Out[51]: True
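
To see the five letter categories from the quoted definition, the following sketch prints the general category next to the isalpha() result. The sample characters are chosen for illustration, one per letter category plus two non-letters.

```python
import unicodedata

LETTER_CATEGORIES = {"Lm", "Lt", "Lu", "Ll", "Lo"}

# a (Ll), Ƣ (Lu), 中 (Lo), ǅ (Lt), ʰ (Lm), then a digit and an underscore
samples = ["a", "\u01a2", "\u4e2d", "\u01c5", "\u02b0", "5", "_"]
rows = [(ch, unicodedata.category(ch), ch.isalpha()) for ch in samples]
for ch, cat, alpha in rows:
    print(repr(ch), cat, alpha)
```
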
posted @ 2017-12-15 23:22  willowj