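Handling emoji in Python
Storing scraped content that contains emoji in MySQL raises an error, while MongoDB stores it without trouble. The workaround is to strip the emoji before storing.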
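The original does not say what the error is. A likely cause, stated here as an assumption, is that the column uses MySQL's legacy utf8 charset, which stores at most 3 bytes per character, while most emoji need 4 bytes in UTF-8. The sketch below checks this with the standard library only; the helper names utf8_len and needs_utf8mb4 are made up for illustration.

```python
def utf8_len(ch: str) -> int:
    """Return the number of bytes needed to encode ch in UTF-8."""
    return len(ch.encode("utf-8"))


def needs_utf8mb4(text: str) -> bool:
    """Return True if any character lies outside the Basic Multilingual
    Plane, i.e. needs 4 bytes in UTF-8."""
    return any(ord(ch) > 0xFFFF for ch in text)


print(utf8_len("中"))                  # 3
print(utf8_len("👍"))                  # 4
print(needs_utf8mb4("Python is 👍"))   # True
```

If that assumption holds, switching the table and the connection to utf8mb4 is an alternative to stripping the emoji.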
PyPI has a ready-made library, emoji, for working with emoji strings: https://pypi.org/project/emoji/
Install:
pip install emoji --upgrade
Example:
>>> import emoji
>>> print(emoji.emojize('Python is :thumbs_up:'))
Python is 👍
>>> # emoji 2.0 and later use language='alias'; older releases used use_aliases=True
>>> print(emoji.emojize('Python is :thumbsup:', language='alias'))
Python is 👍
>>> print(emoji.demojize('Python is 👍'))
Python is :thumbs_up:
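Replacement function
import re

def filter_emoji(desstr, restr=''):
    # Filter out emoji: replace every character outside the BMP with restr.
    try:
        # Wide Unicode builds (every Python 3 version)
        co = re.compile(u'[\U00010000-\U0010ffff]')
    except re.error:
        # Narrow builds (older Python 2): match surrogate pairs instead
        co = re.compile(u'[\uD800-\uDBFF][\uDC00-\uDFFF]')
    return co.sub(restr, desstr)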
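A short usage sketch of the function above, written for Python 3 only and repeated here so the snippet runs on its own. The sample strings are invented for illustration. Note that the pattern removes only characters outside the Basic Multilingual Plane, so symbols such as ❤ (U+2764) are left in place; those fit in 3 bytes of UTF-8.

```python
import re

# Matches any code point outside the Basic Multilingual Plane.
_NON_BMP = re.compile('[\U00010000-\U0010ffff]')


def filter_emoji(desstr, restr=''):
    """Replace every non-BMP character in desstr with restr."""
    return _NON_BMP.sub(restr, desstr)


print(repr(filter_emoji('Python is 👍')))       # 'Python is '
print(filter_emoji('good 😀 job', restr='?'))   # good ? job
print(filter_emoji('I ❤ Python'))               # I ❤ Python (unchanged)
```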