cup_leo

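October 12, 2019

Summary: The strings discussed here are not strings in the general sense. When date-typed data is read but has not yet been parsed, it is still a string rather than a date type. How, then, do we convert between such a string and a timestamp? ① Converting a time string to a timestamp takes two steps. Step 1: convert the time string to a time tuple. Step 2: convert the time tuple to a timestamp. ②

posted @ 2019-10-12 15:08 cup_leo Views (2923) Comments (0) Recommendations (0)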
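
The two steps named in the summary above can be sketched with Python's standard `time` module. This is a minimal illustration, not the post's own code: the function name `str_to_timestamp` and the default format string are assumptions, and the time string is interpreted in the machine's local time zone.

```python
import time


def str_to_timestamp(time_str, fmt="%Y-%m-%d %H:%M:%S"):
    """Convert a time string to an integer Unix timestamp (local time)."""
    # Step 1: parse the time string into a time tuple (struct_time).
    time_tuple = time.strptime(time_str, fmt)
    # Step 2: convert the time tuple, read as local time, into a timestamp.
    return int(time.mktime(time_tuple))


ts = str_to_timestamp("2019-10-12 15:08:00")
print(ts)  # the value depends on the local time zone
```

Because `time.mktime` reads the tuple as local time, the same string yields different timestamps on machines in different time zones.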

September 26, 2019

How to remove duplicate characters from a string in Python

Summary: If order does not matter, you can use foo = "mppmt" followed by "".join(set(foo)). set() will create a set of unique letters in the string, and "".join() will join

posted @ 2019-09-26 09:51 cup_leo Views (11793) Comments (0) Recommendations (0)
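
A short sketch of the approach quoted in the summary above. The first variant is the `set()` one from the summary. The order-preserving variant with `dict.fromkeys` is an addition that does not come from the visible summary; it relies on dictionaries keeping insertion order (Python 3.7+).

```python
foo = "mppmt"

# Order does not matter: set() keeps one copy of each character,
# but the order in which a set is iterated is arbitrary.
unordered = "".join(set(foo))

# Order matters (assumption, not from the summary): dict keys are unique
# and keep the order of first appearance.
ordered = "".join(dict.fromkeys(foo))

print(sorted(unordered))  # ['m', 'p', 't']
print(ordered)            # mpt
```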

September 20, 2019

Installing JDK 1.8 on CentOS 7 via yum

posted @ 2019-09-20 15:07 cup_leo Views (3319) Comments (0) Recommendations (0)

September 9, 2019

Using a regex to match multiple strings in a text at once

Summary: def asr_to_correct(text): rep = dict((re.escape(k), v) for k, v in error_asr_map.items()) pattern = re.compile("|".join(rep.keys())) text = pattern.sub(lambda m: rep[re.escape(m.group(0))], text) retu

posted @ 2019-09-09 11:16 cup_leo Views (1317) Comments (0) Recommendations (0)
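
The summary above is cut off, so here is a self-contained sketch of the same idea: build one alternation pattern from all keys and replace every match in a single pass. The contents of `error_asr_map` are hypothetical, since the post's real table is not shown. Two details differ from the quoted snippet and are choices made here: the lookup uses the raw matched text instead of re-escaping it, and keys are sorted longest first so that a longer key is not shadowed by a shorter one sharing its prefix.

```python
import re

# Hypothetical correction table; the post's actual error_asr_map is not shown.
error_asr_map = {
    "wechat pay": "WeChat Pay",
    "ali pay": "Alipay",
    "c++": "C++",
}


def asr_to_correct(text):
    """Replace every key of error_asr_map found in text with its value."""
    # Longest keys first, so overlapping keys resolve to the longest match.
    keys = sorted(error_asr_map, key=len, reverse=True)
    # re.escape makes characters such as "+" match literally.
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # The matched text is always one of the original keys.
    return pattern.sub(lambda m: error_asr_map[m.group(0)], text)


print(asr_to_correct("pay with wechat pay or ali pay in c++"))
# pay with WeChat Pay or Alipay in C++
```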

September 4, 2019

New word discovery based on the n-gram method

Summary: Original article: https://spaces.ac.cn/archives/4256/comment-page-1#comments

posted @ 2019-09-04 15:44 cup_leo Views (453) Comments (0) Recommendations (0)

Saving memory: using an iterator to output articles one at a time

Summary: import re import pymongo from tqdm import tqdm import hashlib db = pymongo.MongoClient().weixin.text_articles md5 = lambda s: hashlib.md5(s).hexdigest() def texts(): texts_set = set() for a in tqdm(db

posted @ 2019-09-04 11:14 cup_leo Views (285) Comments (0) Recommendations (0)
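
The snippet in the summary above breaks off inside `texts()`, so what follows is only a guess at its shape: a generator that yields articles one at a time and uses MD5 digests to skip repeats. The deduplication step is an assumption inferred from `texts_set` and `md5`. The `articles` parameter is a hypothetical stand-in for the MongoDB cursor, so the sketch runs without a database.

```python
import hashlib


def md5(s):
    # hashlib.md5 requires bytes in Python 3, so encode the text first.
    return hashlib.md5(s.encode("utf-8")).hexdigest()


def texts(articles):
    """Yield each distinct article once, one article at a time."""
    seen = set()  # holds only the short digests, not the full texts
    for article in articles:
        digest = md5(article)
        if digest in seen:
            continue  # assumed behaviour: skip articles already seen
        seen.add(digest)
        yield article


# Hypothetical sample data in place of the database collection.
sample_articles = ["first article", "second article", "first article"]
for text in texts(sample_articles):
    print(text)
```

Since `texts` is a generator, only one article is held at a time by the consumer; the memory that grows with the corpus is the set of 32-character digests.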

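September 3, 2019

[New word discovery] SNS-based text data mining and phrase mining

Summary: Sociolinguistics in the Internet age: SNS-based text data mining. Python implementation: https://github.com/jtyoui/Jtyoui/tree/master/jtyoui/word This is an unsupervised approach to training a text lexicon and word segmentation (reposted). Java implementation: https://gitee.com/tyoui/jsns This

posted @ 2019-09-03 10:19 cup_leo Views (1026) Comments (0) Recommendations (0)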

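August 28, 2019

Optimizing code: how to remove stop words

Summary: # Method 1: brute force, checking every word (the traditional approach) def remove_stopwords1(text): words = text.split(' ') new_words = list() for word in words: if word not in stopwords: new_words.append(word) return new_words # Method 2: build a mapping of the stop words first (recommended

posted @ 2019-08-28 17:00 cup_leo Views (344) Comments (0) Recommendations (0)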
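
A runnable sketch of the two methods named in the summary above. Method 1 follows the quoted code. The body of Method 2 is cut off in the summary, so the mapping-based version here is an assumption about what "build a mapping of the stop words first" means. The stop word list is hypothetical.

```python
# Hypothetical stop word list; the post's own list is not shown.
stopwords = ["the", "a", "of", "is"]


def remove_stopwords1(text):
    # Method 1: test each word against the list.
    # Each test scans the list, so cost grows with the number of stop words.
    return [word for word in text.split(" ") if word not in stopwords]


# Method 2 (assumed shape): build a hash-based mapping once, up front.
stopwords_map = dict.fromkeys(stopwords, True)


def remove_stopwords2(text):
    # Dictionary membership is a hash lookup, independent of the list size.
    return [word for word in text.split(" ") if word not in stopwords_map]


sentence = "the speed of a set is great"
print(remove_stopwords1(sentence))  # ['speed', 'set', 'great']
print(remove_stopwords2(sentence))  # ['speed', 'set', 'great']
```

A `set(stopwords)` would serve equally well for the lookup; the dictionary is used here only to match the word "mapping" in the summary.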

Speeding up pandas computation with one line of code

Summary: Speeding up pandas computation with one line of code: DASK https://blog.csdn.net/sinat_38682860/article/details/84844964

posted @ 2019-08-28 13:58 cup_leo Views (350) Comments (0) Recommendations (0)

August 27, 2019

Installing the glances system monitor on CentOS 7

Summary: To start it, run: glances

posted @ 2019-08-27 07:44 cup_leo Views (1124) Comments (0) Recommendations (0)
