Summary: Reference: http://blog.csdn.net/flynetcn/archive/2009/01/08/3733574.aspx Cost time: 24s. Output: ('', 0) 625 ('', 0) 625 ('', 0) 625 ('', 0) 625 ('', 0)...
posted @ 2010-05-29 14:31 lexus

Summary: socket.getaddrinfo("http://www.news.com", None) fails. When you resolve a name through DNS, the host must not start with http://. The right way is socket.getaddrinfo("www.xx.com", None).
posted @ 2010-05-29 14:23 lexus
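
The getaddrinfo note above can be sketched as follows. This is a minimal sketch assuming Python 3; the helper names `hostname_of` and `resolve` are hypothetical and not from the original post. It strips the scheme with `urllib.parse.urlsplit` before handing the bare host to `socket.getaddrinfo`.

```python
import socket
from urllib.parse import urlsplit


def hostname_of(target):
    """Return the bare host name from a URL or a plain host string.

    Hypothetical helper: urlsplit only recognises the host part when the
    string has a scheme or starts with '//', so plain hosts get '//' prepended.
    """
    if "://" not in target:
        target = "//" + target
    return urlsplit(target).hostname


def resolve(target):
    """Resolve a URL or host name to a sorted list of unique IP addresses."""
    host = hostname_of(target)
    # getaddrinfo expects a host name, not a URL; passing "http://..." fails.
    infos = socket.getaddrinfo(host, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    print(hostname_of("http://www.news.com"))  # scheme removed before lookup
    print(resolve("http://127.0.0.1/"))        # numeric host, no DNS query needed
```

Calling `resolve("http://www.news.com")` performs a real DNS lookup, so the result depends on the network; the point is only that the scheme is removed first.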

Summary: Finally I got this from http://ubuntuforums.org/showthread.php?t=1575: it goes in /etc/environment (edit it with sudo).
posted @ 2010-05-29 13:23 lexus

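Summary: Node.js and Go both have things that suit them to writing crawlers, but their infrastructure is not mature enough yet. I am also getting older and don't feel like tinkering for now, so I will wait and watch how they develop. My current view is that robustness is what really matters in a crawler. Using non-blocking I/O, or writing it in C, is not enough on its own. It is a systems-engineering problem where every link depends on the others. And however well you write the crawler, how much bandwidth can you afford to rent with your current resources? If you can saturate that bandwidth, that is good enough.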
posted @ 2010-05-29 11:41 lexus