Scraping Images from Konachan

A script for downloading original images from Konachan

Scraping by image ID

  • The code is as follows

    • from lxml import etree
      import requests


      headers = {
          'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.162 Safari/537.36'
      }    # Replace this with your own browser's User-Agent
      for i in range(270000, 270004):    # Change the range of image IDs here
          url = 'https://konachan.com/post/show/{}/'.format(i)
          res = requests.get(url, headers=headers).text
          tree = etree.HTML(res)
          # Prefer the element with id "png"; fall back to the one with id "highres"
          link = tree.xpath('//*[@id="png"]')
          ext = '.png'
          if len(link) == 0:
              link = tree.xpath('//*[@id="highres"]')
              ext = '.jpg'
          if len(link) == 0:
              # Neither link exists on this page, so skip this ID instead of crashing
              print(i, 'no download link found, skipped')
              continue
          new_url = link[0].xpath('./@href')[0]
          data = requests.get(new_url, headers=headers)
          # Drop the first three characters (the label before the ID) and any surrounding spaces
          title = tree.xpath('//*[@id="stats"]/ul/li[1]')[0].text[3:].strip()
          with open('./{}{}'.format(title, ext), 'wb') as f:
              f.write(data.content)
          print(title, 'downloaded successfully')
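
  • The link-selection step above (try the "png" link first, otherwise use "highres") can be pulled out into small functions that are easy to check without touching the network. What follows is only a rough sketch under a few assumptions: LinkFinder, pick_download_link and filename_for are names made up for this example, it uses the standard library's html.parser instead of lxml so that it runs on its own, it only looks at <a> tags (the script matches any element with that id), and it takes the file extension from the link rather than hard-coding it. The sample HTML is invented and not copied from the site.

```python
from html.parser import HTMLParser
from os.path import splitext
from urllib.parse import urlparse


class LinkFinder(HTMLParser):
    """Collect the href of every <a> tag that has an id, keyed by that id."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'a' and attrs.get('id') and attrs.get('href'):
            self.links[attrs['id']] = attrs['href']


def pick_download_link(html):
    """Return the href of the 'png' link, else the 'highres' link, else None."""
    finder = LinkFinder()
    finder.feed(html)
    return finder.links.get('png') or finder.links.get('highres')


def filename_for(post_id, link):
    """Name the file after the post ID, taking the extension from the link path."""
    ext = splitext(urlparse(link).path)[1].lower()
    # Assumption: fall back to .jpg when the link has no extension at all
    return '{}{}'.format(post_id, ext or '.jpg')


# Invented HTML fragment, for illustration only
sample = '<a id="highres" href="https://example.com/image/abc/sample%20pic.JPG">full size</a>'
link = pick_download_link(sample)
print(link, filename_for(270000, link))
```

    In the script itself, the result of pick_download_link(res) could stand in for the two xpath lookups, with a None result meaning "skip this ID".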
      

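Notes

  • I won't post the link to K-site (Konachan) here; those who know, know
  • I wrote this script myself, and my technical skills are limited, so please excuse any shortcomings
  • I accept no responsibility for any consequences arising from the use of this script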
posted @ 2020-04-03 18:08  Xuan_ZL