GPT-4官方插件收集到Excel中
1、下载json文件,格式如下。https://github.com/copilot-us/chatgpt-plugins/tree/cefe4e2a917fa07c6f1651fff2e32c45ef98a7a7
2、这里为了方便用户查看,调用百度翻译把描述翻译成中文(若不用可注释掉),需要申请百度翻译开放平台开发者,我这里申请为高级版。申请流程:http://api.fanyi.baidu.com/doc/12
3、按照使用文档编写url。百度翻译通用文本翻译使用文档:http://api.fanyi.baidu.com/doc/21
4、把需要的内容写入Excel里。
import json import requests import random import hashlib import urllib from openpyxl import Workbook def baiduTranslate(translate_text): """ 百度翻译接口 :param translate_text: 需要翻译的文本-英 :return: 翻译后文本-中 """ appid = "xxx" secretKey = xxx" myurl = 'http://api.fanyi.baidu.com/api/trans/vip/translate' fromLang = 'en' # 原文语种 toLang = 'zh' # 译文语种 salt = random.randint(3276, 65536) sign = appid + translate_text + str(salt) + secretKey sign = hashlib.md5(sign.encode()).hexdigest() url = myurl + '?q=' + urllib.parse.quote(translate_text) + '&from=' + fromLang + \ '&to=' + toLang + '&appid=' + appid + '&salt=' + str(salt) + '&sign=' + sign try: response = requests.get(url) result = response.json() return result['trans_result'][0]['dst'] except Exception as e: print(e) with open("./chatgpt_plugins.json", "r", encoding="utf-8") as f: contents = f.read() text_dict = json.loads(contents) items = text_dict["items"] wb = Workbook() ws4 = wb.create_sheet(title="plugins", index=0) sheet = wb.active sheet["a1"] = "插件名" sheet["b1"] = "描述" c = 1 for i in items: name = i["namespace"] manifest = i["manifest"] description = manifest["description_for_human"] # trans_des = baiduTranslate(description) sheet.cell(row=c, column=1).value = name sheet.cell(row=c, column=2).value = description c += 1 wb.save("plugins.xlsx")
5、由于在别的网站能获取到类别,于是想在上面插件名+描述的基础上加上分类,把该网站的表格复制下来到“sheet21.xlsx” 这个文件上。https://www.startuphub.ai/a-list-of-78-chatgpt-plugins-currently-available-and-their-use-case/?__cf_chl_tk=aItOx.OMz_TtAUw8jpYdwryroMFtsbWJf90goOvLnzQ-1688974153-0-gaNycGzNFpA
-复制后的格式大概是这样,有些分类是空的,需要检查手动加上空行,否则后面取值可能导致插件名:分类没有一一对上。
-需要用到列表切片,[::3]表示从0开始每隔3取值,[2::3]表示从2开始每隔3取值。
(单纯用Excel可能有更方便的转换方式,我这里用最笨的方法顺便学习读文件和字符串、列表等操作)
6、读取该表,以插件名:分类的形式存在一个字典中。
7、读取上面生成的“plugins.xlsx”,判断插件名是否在字典中,是则把分类写到该行的第三列。由于两个网站的插件名可能格式不太一致,需要replace方法替换“ ”、“_”等字符,并且统一小写(虽然还是有些插件名不统一,先不管)。
from openpyxl import load_workbook workbook = load_workbook("sheet21.xlsx") sheet = workbook.active row_datas = [] for row in sheet.iter_rows(): row_data = [cell.value for cell in row] row_datas.append(row_data) name = row_datas[::3] name_type = row_datas[2::3] # print(name_type) dict_name = {} for i, n in enumerate(name): # print(n[0]) r_name = n[0].replace(" ", "") p_name = r_name.lower() dict_name[p_name] = name_type[i][0] # print(dict_name) workbook_1 = load_workbook("plugins.xlsx") sheet_1 = workbook_1["plugins"] for index, row in enumerate(sheet_1.iter_rows()): row_data = [cell.value for cell in row] row_name = row_data[0].replace("_", "") l_row_name = row_name.lower() # print(index, l_row_name) if l_row_name in dict_name.keys(): # print(l_row_name, dict_name[l_row_name]) sheet_1.cell(row=index+1, column=3).value = dict_name[l_row_name] workbook_1.save("plugins.xlsx")