Tencent Cloud, Web Scraping, and WeChat Scripts

A record of some interesting bits of code.

1. Uploading to Tencent Cloud COS in Python, then writing a database row (install the Tencent Cloud dependency: pip install -U cos-python-sdk-v5)

import pymysql
from qcloud_cos import CosConfig
from qcloud_cos import CosS3Client
import sys
import logging
import time
import os

logging.basicConfig(level=logging.INFO, stream=sys.stdout)

secret_id = '????'
secret_key = '????'
region = '????'

token = None
scheme = 'https'

list1 = os.listdir('path-to-read')  # read this directory and collect the file names into a list

conn = pymysql.Connect(host='', port=?, user='', passwd='', db='',       # database connection settings
                       charset='utf8')
cursor = conn.cursor()

config = CosConfig(Region=region, SecretId=secret_id, SecretKey=secret_key, Token=token, Scheme=scheme)
client = CosS3Client(config)


for i in list1:
    file_name = str(time.time()) + i
    # upload the video to Tencent Cloud COS
    with open('file-path\\' + i, 'rb') as fp:
        client.put_object(
            Bucket='bucket-name',
            Body=fp,
            Key='bucket-folder-name' + file_name,
            StorageClass='STANDARD',   # valid classes are STANDARD / STANDARD_IA / ARCHIVE
            EnableMD5=False
        )
    logging.info("~~~~~~~~~%s upload finished~~~~~~~~~" % file_name)
    url = 'bucket-base-url' + file_name
    cover_pic = "anything"
    sort = "anything"
    video_name = "anything"
    sql = f'''
    insert into table_name(col1, col2, col3, ...)
    values(val1, val2, ... '{url}', '{cover_pic}', '{sort}' ...)
    '''
    cursor.execute(sql)
    conn.commit()
    logging.info("~~~~~~~~~%s row written~~~~~~~~~" % file_name)
logging.info("~~~~~~~~all done~~~~~~~~~")

Back when I wrote the Tencent Cloud upload and download in Java it took me ages; doing it in Python now feels like familiar ground, done in no time. The choice of language seems to matter surprisingly little.
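
Since the Java version handled downloads too, here is the download counterpart as a minimal sketch. It reuses the client configured above; the bucket, key, and local file name are placeholders.

# minimal download sketch, assuming the same client as above;
# bucket, key, and local file name are placeholders
response = client.get_object(
    Bucket='bucket-name',
    Key='bucket-folder-name/remote-file-name'
)
response['Body'].get_stream_to_file('local.mp4')   # stream the object to a local file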

2. Generating a formatted YAML file (install the dependency: pip install ruamel.yaml)

The functionality is trivial, but when you need it in a hurry it saves you from swearing at the yaml package.

from ruamel.yaml import YAML

yaml = YAML()

src_data = {'user':
                {'name': '可优',
                 'age': 17,
                 'money': None,
                 'gender': True
                 },
            'lovers':
                ['柠檬小姐姐', '橘子小姐姐', '小可可']
            }
with open('aa.yaml', 'w', encoding='utf-8') as f:
    yaml.dump(src_data, f)

The resulting file:
user:
  name: 可优
  age: 17
  money:
  gender: true
lovers:
- 柠檬小姐姐
- 橘子小姐姐
- 小可可
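
Reading the file back is just as short; a quick sketch using the same ruamel.yaml API:

from ruamel.yaml import YAML

yaml = YAML()
with open('aa.yaml', encoding='utf-8') as f:
    data = yaml.load(f)     # load() parses the YAML back into dicts/lists
print(data['user']['name'])  # -> 可优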

3. A simple web scraper

Web scraping is a huge topic and I have only scratched the surface, just enough to grab the things I need.

A small use of regex matching as a way to extract data. Pointed at a page's HTML source it becomes remarkably useful; learning just `.*?` lets you do a lot.

import re


s = '''
<div class="jjs"><span id="1">大聪明</span></div>
<div class="jjss"><span id="2">大聪</span></div>
<div class="jjsss"><span id="3">大dada</span></div>
<div class="jjssss"><span id="4">大聪明a </span></div>
'''

obj = re.compile(r'<div class="(?P<class>.*?)"><span id="\d+">(?P<name>.*?)</span></div>', re.S)
ret = obj.finditer(s)
for i in ret:
    # print(i.group("class"))
    print(i.group("name"))
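
To see the same idea against real page source, here is a tiny sketch; example.com is just a stand-in URL, and the <h1> pattern is an assumption about that page's markup:

import re
import requests

# fetch a page and pull out a tag's text with the same lazy .*? trick
resp = requests.get("https://example.com", timeout=10)
m = re.search(r"<h1>(?P<text>.*?)</h1>", resp.text, re.S)
if m:
    print(m.group("text"))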

Here is a Baidu Translate example.

import requests

'''Baidu Translate'''


url = 'https://fanyi.baidu.com/sug'
word = input('Enter the English text to translate: ')   # avoid shadowing the built-in str
data = {
    "kw": word
}

res = requests.post(url, data=data)
print(res.json())


url = 'https://fanyi.baidu.com/v2transapi?from=zh&to=en'
word2 = input('Enter the Chinese text to translate: ')
data2 = {
    "kw": word2
}
# v2transapi normally also expects sign/token fields and cookies,
# so this bare request may just come back with an error payload
res = requests.post(url, data=data2)
print(res.json())
Playing around with Baidu Translate.
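
For reference, the sug endpoint usually answers with JSON shaped like {"errno": 0, "data": [{"k": ..., "v": ...}]}; treating that shape as an assumption, the suggestions can be printed defensively:

import requests

resp = requests.post('https://fanyi.baidu.com/sug', data={"kw": "hello"})
for item in resp.json().get("data", []):   # assumed response shape
    print(item.get("k"), "->", item.get("v"))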

Here is a Pear Video (pearvideo.com) example. The video link has anti-hotlinking protection; studying a scraper really is a battle of wits with the page source.

In the headers, some sites want "User-Agent", some want "Referer"; you have to experiment.

import requests


url = "https://www.pearvideo.com/video_1756213"
cont_id = url.split('_')[1]


header = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.82 Safari/537.36",
    "Referer": url,
}

srcUrl = "https://www.pearvideo.com/videoStatus.jsp?contId=%s&mrd=0.389415005000727" % cont_id
resp = requests.get(srcUrl, headers=header)
dic = resp.json()

# the srcUrl in the response embeds a timestamp; swapping it for "cont-<id>"
# turns the fake link into the real video address
urll = dic["videoInfo"]["videos"]["srcUrl"]
systemTime = dic["systemTime"]
urll = urll.replace(systemTime, "cont-%s" % cont_id)

with open("video.mp4",'wb') as f:
    f.write(requests.get(urll).content)
Try downloading a video.

An XPath example. XPath works off the page's hierarchy, which neatly sidesteps dynamic attributes inside tags, and the browser's right-click menu can copy an XPath for you, which is very convenient.

import requests
from lxml import etree
import csv
import time

url = "https://www.softwareadvice.com/categories/"

resp = requests.get(url,timeout=20)
resp.encoding = 'utf-8'

html = etree.HTML(resp.text)
divs = html.xpath(
    "/html/body/app-root/main/app-categories-container/div/section[2]/app-category-list/div")

for div in divs:
    first_name = div.xpath("./h2/a/text()")[0]   # xpath() returns a list; take the text itself
    second_name = div.xpath("./ul/li/a/text()")
    second_url = div.xpath("./ul/li/a/@href")
    for i in range(len(second_url)):
        try:
            response = requests.get(second_url[i], timeout=20)
            page_content = response.text
            child_html = etree.HTML(page_content)
            child_divs = child_html.xpath('//*[@id="product-catalog"]/div/section[2]/div/div')
            for di in child_divs[1:]:
                product_logo = di.xpath("./a/div/img/@src")
                product_name = di.xpath("./div/a/h3/text()")
                product_score = di.xpath("./div/div[1]/p/strong/text()")
                args = {
                    "first_name": first_name,
                    "second_name": second_name[i],
                    "name": product_name[0],
                    "logo": product_logo[0],
                    "score": product_score[0]
                }
                with open('product_data.csv', 'a', newline='', encoding='utf-8') as f:   # newline='' avoids blank rows on Windows
                    csv.writer(f).writerow(args.values())

        except Exception as e:
            print(e)
When you call requests.get(), a site that never responds can block forever, so pass a timeout; once the limit is exceeded an exception is raised, you catch it with try, and move on to the next one.
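
Along the same lines, a small retry wrapper keeps the loop moving; fetch and its parameters are made-up names for this sketch:

import requests

def fetch(url, retries=3, timeout=20):
    # try a few times, then give up and let the caller skip this URL
    for attempt in range(retries):
        try:
            return requests.get(url, timeout=timeout)
        except requests.RequestException as e:
            print("attempt %d failed: %s" % (attempt + 1, e))
    return None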

4. Don't know how to confess your feelings on WeChat? I've got you.

Write a script that sends the messages for you. WeChat will block you if it detects non-stop messaging; you can probably get a hundred-odd through, which is plenty.

import time
from pynput.keyboard import Controller as key_cl
from pynput.mouse import Button, Controller


def keyboard_input(string):
    keyboard = key_cl()
    keyboard.type(string)


def mouse_click():
    mouse = Controller()
    mouse.press(Button.left)
    mouse.release(Button.left)


def main(number, string):
    time.sleep(5)    # five seconds to switch to WeChat and focus the input box
    for i in range(number):
        keyboard_input(string)
        mouse_click()    # assumes the cursor is already hovering over the Send button
        time.sleep(0.2)


if __name__ == '__main__':
    main(4, "what you want to say")  # configure the message and how many times to send it

This is just the bare-bones version. If you are so inclined, you could build a file of sweet (or salty) lines and pick from it with a random number... ahem, I'll stop there. Please stay in line with the core socialist values!
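
For the curious, a sketch of that idea, reusing the functions above; 'lines.txt' (one message per line) is a hypothetical file:

import random

# 'lines.txt' is hypothetical: one message per line
with open('lines.txt', encoding='utf-8') as f:
    messages = [line.strip() for line in f if line.strip()]

time.sleep(5)
for _ in range(10):
    keyboard_input(random.choice(messages))   # a different line each time
    mouse_click()
    time.sleep(0.2)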

5. That nine-grid photo set in someone's WeChat Moments looks so cool. How is it done?

A programmer would never leave you envying someone else.

import os

from PIL import Image

# open the image
im = Image.open("122.jpg")

os.makedirs("images", exist_ok=True)   # the crops below are saved into images/

# a third of the width and height (integer division drops any remainder pixels)
width = im.size[0]//3
height = im.size[1]//3

# start cropping from the top-left corner
start_x = 0
start_y = 0


im_name = 1

for i in range(3):
    for j in range(3):
        crop = im.crop((start_x, start_y, start_x+width, start_y+height))
        crop.save("images/" + str(im_name) + '.jpg')

        start_x += width
        im_name += 1

    start_x = 0
    start_y += height

6. No Photoshop? How do you resize an image? Easy.

from PIL import Image

img = Image.open("00.jpg")
out = img.resize((358, 441))   # resize() returns a new image at exactly this size; the aspect ratio is not preserved

out.save('000.jpg')
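
If you would rather keep the aspect ratio than stretch to an exact size, Pillow's thumbnail() is an option; 'thumb.jpg' is a made-up output name:

from PIL import Image

img = Image.open("00.jpg")
img.thumbnail((358, 441))   # shrinks in place, preserving aspect ratio
img.save('thumb.jpg')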

That's it for now. Keep at it, keep learning.

posted @ 2022-04-19 15:26  木_糖