
spider.py 0.5 : Python Package Index

spider.py 0.5

Multithreaded crawling, reporting, and mirroring for Web and FTP

This module provides multithreaded crawling, reporting, and mirroring for Web
and FTP in one convenient library. Crawling depth, maximum number of URLs to
crawl, and maximum number of threads are user-configurable. Reports can be
generated on external URLs, internal redirects to outside URLs, unparsable HTML,
non-HTTP/FTP URLs, and broken links.
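
To make the three knobs described above concrete (depth, URL cap, thread count), here is a minimal sketch of a bounded multithreaded crawler built only on the Python standard library. It is not spider.py's API: the names `crawl`, `fetcher`, `LinkParser`, and the report keys are hypothetical, chosen for illustration. It covers only part of the reporting listed above (external URLs, non-HTTP/FTP URLs, broken links); redirects and unparsable HTML are left out.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def default_fetch(url):
    """Return the body of `url` as text; raises if the request fails."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start, max_depth=2, max_urls=50, max_threads=4, fetcher=default_fetch):
    """Breadth-first crawl from `start`, one depth level at a time.

    Assumption: a URL whose fetch raises any exception counts as broken.
    """
    host = urlparse(start).netloc
    report = {"visited": [], "external": set(), "other_scheme": set(), "broken": set()}
    seen = {start}          # every internal URL queued so far, capped at max_urls
    frontier = [start]      # URLs to fetch at the current depth

    def visit(url):
        try:
            parser = LinkParser()
            parser.feed(fetcher(url))
        except Exception:
            return url, None
        # Resolve relative links and drop #fragments so pages are not revisited.
        return url, [urldefrag(urljoin(url, href))[0] for href in parser.links]

    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        for _depth in range(max_depth + 1):
            next_frontier = []
            # pool.map fetches the whole level concurrently, results in input order.
            for url, links in pool.map(visit, frontier):
                if links is None:
                    report["broken"].add(url)
                    continue
                report["visited"].append(url)
                for link in links:
                    parts = urlparse(link)
                    if parts.scheme not in ("http", "https", "ftp"):
                        report["other_scheme"].add(link)
                    elif parts.netloc != host:
                        report["external"].add(link)
                    elif link not in seen and len(seen) < max_urls:
                        seen.add(link)
                        next_frontier.append(link)
            frontier = next_frontier
            if not frontier:
                break
    return report
```

Passing a custom `fetcher` (for example, one that reads from a dictionary of canned pages) lets the crawl logic be exercised without network access. Processing one depth level at a time keeps the bookkeeping single-threaded, so only the fetches run in parallel and no locks are needed.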

posted on 2012-05-03 17:12 by lexus