Python Unicode Database

The unicodedata module provides access to the Unicode Character Database, which defines character properties for every Unicode character.

To use the module, import unicodedata in your code.

import unicodedata

Unicode Database Functions

Some of the functions provided by the unicodedata module are described here.

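Function (unicodedata.lookup(name)) -

This function looks up a character by name. If the name is valid, it returns the corresponding character; otherwise it raises KeyError.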
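The KeyError behaviour described above can be handled as in the following minimal sketch. The helper name find_char and the made-up name 'NO SUCH CHARACTER' are assumptions used only for illustration; they are not part of unicodedata.

```python
import unicodedata

def find_char(name):
    """Return the character with the given Unicode name, or None if unknown.

    find_char is a hypothetical helper, not a unicodedata function.
    """
    try:
        return unicodedata.lookup(name)
    except KeyError:
        # lookup() raises KeyError when the name is not in the database
        return None

print(find_char('ASTERISK'))           # *
print(find_char('NO SUCH CHARACTER'))  # None
```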

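Function (unicodedata.name(chr[, default])) -

This function returns the name assigned to the given character as a string. If the character has no name in the database, it returns default when one is provided; otherwise it raises ValueError.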
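A short sketch of the default-value behaviour follows. It relies on the newline character having no name in the database; the helper describe and the placeholder string '<unnamed>' are assumptions chosen for the example.

```python
import unicodedata

def describe(ch):
    """Return the Unicode name of ch, or a placeholder if it has none.

    describe is a hypothetical helper, not a unicodedata function.
    """
    return unicodedata.name(ch, '<unnamed>')

print(describe('A'))    # LATIN CAPITAL LETTER A
print(describe('\n'))   # <unnamed>

# Without a default, an unnamed character raises ValueError
try:
    unicodedata.name('\n')
except ValueError:
    print('no name defined')
```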

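Function (unicodedata.digit(chr[, default])) -

This function returns the digit value assigned to the given character as an integer. If the character has no digit value in the database, it returns default when one is provided; otherwise it raises ValueError.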
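As an illustration, the sketch below wraps digit() with a fallback value. The helper name digit_or and the fallback of -1 are assumptions made for this example only.

```python
import unicodedata

def digit_or(ch, fallback=-1):
    """Return the digit value of ch, or fallback if it has none.

    digit_or is a hypothetical helper, not a unicodedata function.
    """
    return unicodedata.digit(ch, fallback)

print(digit_or('7'))   # 7
print(digit_or('²'))   # 2  (SUPERSCRIPT TWO has a digit value)
print(digit_or('A'))   # -1 (letters have no digit value)
```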

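Function (unicodedata.category(chr)) -

This function returns the general category assigned to the character as a short string. Letter categories begin with "L"; for example, an uppercase letter returns "Lu". An opening bracket returns "Ps" (open punctuation), and so on.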
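To see several categories side by side, a small sketch can pair each character of a string with its category. The helper name categories and the sample string are assumptions for illustration.

```python
import unicodedata

def categories(text):
    """Pair each character in text with its general category.

    categories is a hypothetical helper, not a unicodedata function.
    """
    return [(ch, unicodedata.category(ch)) for ch in text]

print(categories('Ab1 ('))
# [('A', 'Lu'), ('b', 'Ll'), ('1', 'Nd'), (' ', 'Zs'), ('(', 'Ps')]
```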

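Function (unicodedata.mirrored(chr)) -

This function returns the mirrored property assigned to the character. Some characters, such as '(' and ')', are displayed as their mirror image in bidirectional text. It returns 1 if the character is mirrored, and 0 otherwise.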
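The return value can be used directly as a truth value. The sketch below collects the mirrored characters in a string; the helper name mirrored_chars and the sample text are assumptions for illustration.

```python
import unicodedata

def mirrored_chars(text):
    """Return the characters in text that have the mirrored property.

    mirrored_chars is a hypothetical helper, not a unicodedata function.
    """
    # mirrored() returns 1 or 0, so it works as a filter condition
    return [ch for ch in text if unicodedata.mirrored(ch)]

print(mirrored_chars('f(x) < [y] + A'))  # ['(', ')', '<', '[', ']']
```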

Example Code

import unicodedata as ud

# Look up characters by their Unicode names
print(ud.lookup('ASTERISK'))
print(ud.lookup('LATIN CAPITAL LETTER G'))

# Get the Unicode names of characters
print(ud.name('x'))
print(ud.name('°'))

# Convert characters to their decimal and numeric values
print(ud.decimal('6'))
print(ud.numeric('9'))

# Get the general category of characters
print(ud.category('A'))
print(ud.category('9'))
print(ud.category('['))  # Ps: open punctuation

# Check whether characters are mirrored
print(ud.mirrored('A'))
print(ud.mirrored('<'))

Output

*
G
LATIN SMALL LETTER X
DEGREE SIGN
6
9.0
Lu
Nd
Ps
0
1
