Summary: # -*- coding: utf-8 -*- # 1. First, import the libraries import requests from bs4 import BeautifulSoup import pdfkit import lxml import lxml.etree import os import os.path from PyPDF2 Read more
posted @ 2018-12-19 13:29 hyolyn Views (493) Comments (0) Recommendations (0)
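Summary: First, set up the server and proxy using Tor and a VPS (search Baidu for the detailed steps). import selenium from selenium import webdriver import time import pymongo # Connect to MongoDB client = pymongo.MongoClient('localhost', 2 Read more
posted @ 2018-12-18 13:34 hyolyn Views (357) Comments (0) Recommendations (0)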
Summary: import os first_dir = os.listdir('D:\\') list_i = [] for i in first_dir: second_dir = os.listdir('D:\\套'+'\\'+i) n = 1 for index,k in enumerate(second_dir) Read more
posted @ 2018-03-18 20:22 hyolyn Views (158) Comments (0) Recommendations (0)
Summary: # -*- coding: utf-8 -*- # Define here the models for your scraped items # # See documentation in: # https://doc.scrapy.org/en/latest/topics/items.html impo Read more
posted @ 2018-02-13 06:17 hyolyn Views (149) Comments (0) Recommendations (0)
Summary: from selenium import webdriver from bs4 import BeautifulSoup import time import json b_time = time.time() ip_dict = {} while True: driver = webdriver. Read more
posted @ 2018-02-01 15:24 hyolyn Views (162) Comments (0) Recommendations (0)
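Summary: 1. Create a directory: mkdir directory_name 2. Remove an empty directory: rmdir directory_name 3. Forcibly remove a directory and everything under it: rm -rf directory_name 4. Change the current directory: cd directory_name (go to the user's home directory: cd ~; go to the parent directory: cd ..; return to the previous working directory: cd -) 5. Show the current directory: pwd 6. Show the disk usage of the current directory: du 7. List the files in a directory: ls -l (-a: add Read more
posted @ 2018-01-28 22:06 hyolyn Views (110) Comments (0) Recommendations (0)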
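
The shell commands in the summary above have close counterparts in Python's standard library. The following is a minimal sketch, not taken from the post: the function name `demo_directory_ops` and the directory names `empty_dir`, `tree`, `child`, and `kept` are hypothetical, and everything runs inside a throwaway temporary directory so nothing real is deleted.

```python
import os
import shutil
import tempfile


def demo_directory_ops():
    """Mirror mkdir, rmdir, rm -rf, cd, pwd and ls inside a scratch directory."""
    start = os.getcwd()                         # pwd
    base = tempfile.mkdtemp()                   # scratch area (hypothetical)
    try:
        empty = os.path.join(base, "empty_dir")
        os.mkdir(empty)                         # mkdir empty_dir
        os.rmdir(empty)                         # rmdir empty_dir (fails if not empty)

        nested = os.path.join(base, "tree", "child")
        os.makedirs(nested)                     # mkdir -p tree/child
        shutil.rmtree(os.path.join(base, "tree"))  # rm -rf tree

        os.mkdir(os.path.join(base, "kept"))
        os.chdir(base)                          # cd base
        return sorted(os.listdir("."))          # ls
    finally:
        os.chdir(start)                         # go back to where we started
        shutil.rmtree(base, ignore_errors=True)  # clean up the scratch area


if __name__ == "__main__":
    print(demo_directory_ops())
```

Like `rm -rf`, `shutil.rmtree` deletes without asking, so it is worth pointing it only at paths the script created itself.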
Summary: python -m pydoc -p 7777 Read more
posted @ 2018-01-25 08:40 hyolyn Views (207) Comments (0) Recommendations (0)
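
The command above starts pydoc's local HTTP documentation server on port 7777. The same documentation can also be rendered as plain text from within a script. The sketch below is an illustration rather than part of the post, and the choice of `json.dumps` as the documented object is arbitrary.

```python
import pydoc

# Render the documentation for json.dumps as plain text instead of
# serving it over HTTP. pydoc.plaintext disables terminal bold markup.
doc_text = pydoc.render_doc("json.dumps", renderer=pydoc.plaintext)

if __name__ == "__main__":
    # Print only the first few lines to keep the output short.
    for line in doc_text.splitlines()[:5]:
        print(line)
```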
Summary: import time lt = time.time() time.sleep(5) f = map(lambda x, y: x + y, [1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6]) for i in f: print(i) nt = time.time() p Read more
posted @ 2018-01-25 07:57 hyolyn Views (130) Comments (0) Recommendations (0)
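
The snippet above combines two lists element by element with `map` and measures elapsed time. A compact sketch of the same idea follows. The helper name `add_pairs` is hypothetical, and `time.perf_counter` is swapped in for `time.time` because it is better suited to measuring short intervals. In Python 3, `map` returns a lazy iterator, so the work only happens once the result is consumed.

```python
import time


def add_pairs(xs, ys):
    """Add two sequences element by element using a two-argument map."""
    return list(map(lambda x, y: x + y, xs, ys))


start = time.perf_counter()
sums = add_pairs([1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6])
elapsed = time.perf_counter() - start

if __name__ == "__main__":
    print(sums)
    print(f"elapsed: {elapsed:.6f} s")
```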
Summary: import os import os.path rootname = r'E:\file_handle' rootdir = os.listdir(rootname) for z in rootdir: d_name = rootname+'\\'+z d_dir = os.listdir(d_na Read more
posted @ 2018-01-24 16:44 hyolyn Views (194) Comments (0) Recommendations (0)
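
The snippet above lists a root directory and then each directory beneath it, building paths by concatenating backslashes. A portable variation is sketched below. The function name `list_two_levels` is hypothetical, and `os.path.join` is used in place of manual `'\\'` concatenation so the same code works on any platform. What the original post does with each file afterwards is not visible in the summary, so the sketch stops at collecting names.

```python
import os


def list_two_levels(root):
    """Map each directory directly under root to a sorted list of its entries."""
    result = {}
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)   # portable path joining
        if os.path.isdir(path):           # skip plain files at the top level
            result[name] = sorted(os.listdir(path))
    return result


if __name__ == "__main__":
    # Lists the current directory; replace "." with any existing path.
    for folder, entries in list_two_levels(".").items():
        print(folder, entries)
```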
Summary: class Node(): def __init__(self,item,pre=None,next=None): self.item = item self.next = next self.pre = pre class DoubleLinkList(): def __init__(self,n Read more
posted @ 2018-01-23 11:29 hyolyn Views (103) Comments (0) Recommendations (0)
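
The summary above is cut off inside the `DoubleLinkList` constructor, so the rest of the author's class is unknown. As an illustration of how a doubly linked list built on that `Node` might work, here is a minimal sketch. The no-argument constructor and the methods `append`, `remove`, `forward`, and `backward` are assumptions for this example, not the post's API.

```python
class Node:
    """A list node holding an item plus links to its neighbours."""

    def __init__(self, item, pre=None, next=None):
        self.item = item
        self.pre = pre
        self.next = next


class DoubleLinkList:
    """Hypothetical doubly linked list; the original class body is not shown."""

    def __init__(self):
        self.head = None
        self.tail = None
        self.size = 0

    def append(self, item):
        """Attach a new node after the current tail."""
        node = Node(item, pre=self.tail)
        if self.tail is None:
            self.head = node          # first node is both head and tail
        else:
            self.tail.next = node
        self.tail = node
        self.size += 1

    def remove(self, item):
        """Unlink the first node holding item; return True if one was found."""
        node = self.head
        while node is not None:
            if node.item == item:
                if node.pre is None:
                    self.head = node.next
                else:
                    node.pre.next = node.next
                if node.next is None:
                    self.tail = node.pre
                else:
                    node.next.pre = node.pre
                self.size -= 1
                return True
            node = node.next
        return False

    def forward(self):
        """Collect items from head to tail by following next links."""
        items, node = [], self.head
        while node is not None:
            items.append(node.item)
            node = node.next
        return items

    def backward(self):
        """Collect items from tail to head by following pre links."""
        items, node = [], self.tail
        while node is not None:
            items.append(node.item)
            node = node.pre
        return items
```

Keeping both `head` and `tail` makes appending constant-time, and the `pre` links allow traversal in reverse without rebuilding the list.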