Site updates: please visit https://bigdata.ministep.cn/

#Unit conversion

# Unit conversion: parse a size string such as '3.03GB' and return the size in GB
import re

def GB_MB_TB(msg):
    # Missing, non-string (e.g. NaN) or empty input has no size
    if not isinstance(msg, str) or msg == "":
        return None
    # A number (the decimal part is optional) followed by one of GB, MB, TB.
    # A group (GB|MB|TB) is needed here; [GB|MB|TB] would match a single character.
    m = re.match(r"^(\d+(?:\.\d+)?)(GB|MB|TB)", msg)
    if m is None:
        return None
    number = float(m.group(1))
    unit = m.group(2)
    if unit == 'TB':
        return number * 1000  # 1 TB = 1000 GB
    if unit == 'MB':
        return number / 1000  # 1 MB = 0.001 GB
    return number             # already in GB

GB_MB_TB('3.03GB')  # 3.03
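
The same conversion can also be written around a lookup table, which may be easier to extend with more units. The following is only a sketch: the names `UNIT_TO_GB`, `SIZE_RE` and `size_to_gb` are hypothetical, and tolerating whitespace and lower-case units is an assumption that goes beyond the function above.

```python
import re

# Hypothetical lookup table: factor that converts each unit to GB
# (decimal units, 1 TB = 1000 GB, the same convention as GB_MB_TB)
UNIT_TO_GB = {"MB": 1 / 1000, "GB": 1.0, "TB": 1000.0}

# Number with an optional decimal part, optional whitespace, then a known unit
SIZE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(MB|GB|TB)", re.IGNORECASE)

def size_to_gb(text):
    """Return the size in GB, or None when the text cannot be parsed."""
    if not isinstance(text, str):
        return None
    m = SIZE_RE.match(text)
    if m is None:
        return None
    return float(m.group(1)) * UNIT_TO_GB[m.group(2).upper()]

print(size_to_gb("3.03GB"))  # 3.03
print(size_to_gb("2TB"))     # 2000.0
print(size_to_gb("abc"))     # None
```

Adding a unit then only means adding one entry to the table and one alternative to the pattern.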

Cleaning the torrent size into GB

# '大小' is the raw size column; 'size' holds the size in GB as a float
df['size'] = df['大小'].apply(GB_MB_TB)
df['size'] = df['size'].astype('float')

Data type conversion

# Comment, seed, download and completion counts are converted to integers
count_cols = ['评论数', '种子数', '下载数', '完成数']
df[count_cols] = df[count_cols].astype('int')
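
Note that `astype('int')` raises an error when a column contains missing values or text that is not a number. If the scraped data may be incomplete, one possible approach is sketched below. The sample DataFrame is made up for illustration, and using `pd.to_numeric` with the nullable `Int64` dtype is a suggestion, not part of the original workflow.

```python
import pandas as pd

# Hypothetical sample data; the real df comes from the scraped listing
df = pd.DataFrame({
    "评论数": ["3", "0", None],
    "种子数": ["12", "n/a", "7"],
})

cols = ["评论数", "种子数"]
# errors="coerce" turns unparseable values into NaN instead of raising
df[cols] = df[cols].apply(pd.to_numeric, errors="coerce")
# The nullable Int64 dtype keeps integers while allowing missing values
df[cols] = df[cols].astype("Int64")

print(df.dtypes)
```

Rows that could not be parsed show up as `<NA>` and can be inspected or dropped afterwards.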

posted @ 2021-04-04 19:46  ministep88