Python Challenge walkthrough: Level 4

4. Hint: (1) urllib may help. Do not try to loop forever, because the chain will not stop on its own. 400 requests are more than enough.

     (2)www.pythonchallenge.com/pc/def/linkedlist.php?nothing=12345

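Opening the link shows a page that gives the next value to use for nothing=.

Use the urllib and re libraries, much like a simple web crawler: fetch a page, extract the next value, then request the page for that value.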
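The extraction step can be sketched on its own, without any network access. This is a minimal sketch that assumes each page contains text of the form "and the next nothing is 44827"; the helper name `next_nothing` and the sample strings are illustrative, not taken from the site's documentation.

```python
import re

# Assumed page format: "... and the next nothing is 44827"
NEXT_RE = re.compile(r'next nothing is (\d+)')

def next_nothing(text):
    """Return the next value as a string, or None if the page names none."""
    match = NEXT_RE.search(text)
    return match.group(1) if match else None

print(next_nothing('and the next nothing is 44827'))
print(next_nothing('Yes. Divide by two and keep going.'))
```

Anchoring the pattern on the phrase rather than collecting every digit on the page keeps the result correct even if a page mentions other numbers.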

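# -*- coding:UTF-8 -*-

from urllib import request
import re

BASE_URL = 'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing='

def findURL(x, max_steps=400):
    """Follow the chain from x and return the first page that names no next value."""
    html = ''
    for _ in range(max_steps):  # the hint says 400 requests are enough
        url = BASE_URL + x
        print(url)
        with request.urlopen(url) as response:
            html = response.read().decode('utf-8')
        print(html)
        # Take only the number after "next nothing is"; joining every digit
        # on the page would break if the page contained any other number.
        match = re.search(r'next nothing is (\d+)', html)
        if match is None:
            break
        x = match.group(1)
    return html

if __name__ == '__main__':
    findURL('12345')
    findURL('8022')  # restart from half of 16044, after the "Divide by two" page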

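Partway through the chain, one page reads: Yes. Divide by two and keep going.

That page contains no next value, so the loop stops there. I wrapped the loop in a function and ran it a second time, starting from the halved value (8022).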
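As an alternative to restarting by hand, the loop could handle the "Divide by two" page itself. The following is only a sketch under assumptions: `follow` and its `fetch` parameter are hypothetical names, and the `pages` dictionary is a made-up stand-in for the real site so the example runs offline.

```python
import re

def follow(start, fetch, max_steps=400):
    """Walk the chain from `start`; `fetch(value)` returns the page text for that value."""
    value = start
    text = ''
    for _ in range(max_steps):
        text = fetch(value)
        match = re.search(r'next nothing is (\d+)', text)
        if match:
            value = match.group(1)
        elif 'Divide by two' in text:
            # Halve the current value and keep going, as the page instructs.
            value = str(int(value) // 2)
        else:
            break  # no next value: this page should hold the answer
    return text

# Offline stand-in for the real site, using made-up values.
pages = {
    '10': 'and the next nothing is 44',
    '44': 'Yes. Divide by two and keep going.',
    '22': 'peak.html',
}
print(follow('10', pages.__getitem__))
```

Against the real site, `fetch` could be a small function that builds the linkedlist.php URL, opens it with `urllib.request.urlopen`, and returns the decoded body.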

The final answer is peak.html.

posted @ 2017-10-01 10:44  anixtt