Post archive for July 9, 2017 - vivi~

July 9, 2017

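Summary:
    for x in topic_replay:
        # Strip spaces, \t, \n and \r from both ends of x.
        x1 = x.strip(' \t\n\r')
        if x1 != '':
            topic_replay_end.append(x1)
    # Strip the \r characters from the article first; elements that held only '\r' become empty list elements (''), which the if check then filters out.
    artical...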

posted @ 2017-07-09 17:19 vivi~ Views (3009) Comments (0) Recommendations (0)
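The filtering loop in the summary above can be tried on its own with a small sketch like the following. The function name `clean_replies` and the sample list are assumptions made for illustration; only `topic_replay` and the `strip(' \t\n\r')` call come from the post.

```python
def clean_replies(topic_replay):
    """Strip surrounding whitespace from each string and drop empty results."""
    topic_replay_end = []
    for x in topic_replay:
        # Remove spaces, tabs, newlines and carriage returns from both ends.
        x1 = x.strip(' \t\n\r')
        # Strings that held only whitespace (for example a lone '\r') are now ''.
        if x1 != '':
            topic_replay_end.append(x1)
    return topic_replay_end


# Hypothetical sample data standing in for extracted reply text.
sample = ['  hello \n', '\r', '', '\tworld\r\n']
print(clean_replies(sample))
```

The same filter can also be written as a list comprehension, `[s for s in (x.strip(' \t\n\r') for x in topic_replay) if s]`, which may read more compactly once the logic is familiar.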

Scrapy: extracting text content in order

Summary: Requirement: get all the text under the following li.clearfix and output it in order. 1. x.css('div.reply-doc h4 a::text').extract(); 2. x.css('div.reply-doc h4::text').extract(); 3. x.css('div.reply-

posted @ 2017-07-09 17:13 vivi~ Views (1183) Comments (0) Recommendations (0)
