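CSV storage
1. CSV is a file format commonly used to store large data sets as plain text; its file structure differs from that of an Excel workbook;
2. CSV is a general-purpose, relatively simple file format that is widely used in consumer, business, and scientific applications. Its most common use is moving tabular data between programs that natively operate on incompatible formats (often proprietary and/or unspecified ones);
3. This works because a great many programs support some variant of CSV, at least as an optional input/output format;
4. For example, a user may need to transfer information from a database program that stores data in a proprietary format to a spreadsheet that uses a completely different format. Most likely the database program can export the data as "CSV", and the exported CSV file can then be imported by the spreadsheet program.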
    import csv

    headers = ("name", "age", "height")
    students = [
        ("李四", 18, 180),
        ("张三", 18, 180),
        ("张三", 18, 180),
        ("张三", 18, 180),
        ("张三", 18, 180),
        ("张三", 18, 180),
    ]

    # DictWriter alternative (its rows must be dicts keyed by the headers, not tuples):
    # with open("students.csv", "a+", encoding="utf-8", newline="") as fp:
    #     write = csv.DictWriter(fp, headers)
    #     write.writeheader()  # write the header row
    #     write.writerows(students)

    # "w+" truncates the file, so the header row is written exactly once
    with open("students.csv", "w+", encoding="utf-8", newline="") as fp:
        write = csv.writer(fp)
        write.writerow(headers)  # writerow writes a single row (one tuple)

    # "a" appends the data rows after the header
    with open("students.csv", "a", encoding="utf-8", newline="") as fp:
        write = csv.writer(fp)
        write.writerows(students)  # writerows writes several rows (a list of tuples)
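To check what was written, it helps to read the file back. The following is a minimal sketch, not part of the original notes: the helper names `write_students` and `read_students` are made up for illustration, and it uses `csv.DictWriter` / `csv.DictReader` so that each row is a dict keyed by the header names.

```python
import csv

HEADERS = ("name", "age", "height")


def write_students(path, rows):
    """Write dict rows to a CSV file, header row first (hypothetical helper)."""
    with open(path, "w", encoding="utf-8", newline="") as fp:
        writer = csv.DictWriter(fp, fieldnames=HEADERS)
        writer.writeheader()
        writer.writerows(rows)


def read_students(path):
    """Read the file back as a list of dicts; every value is returned as a string."""
    with open(path, "r", encoding="utf-8", newline="") as fp:
        return list(csv.DictReader(fp))
```

Calling `write_students("students_demo.csv", [{"name": "张三", "age": 18, "height": 180}])` and then `read_students("students_demo.csv")` gives the row back, but note that CSV carries no type information: the age comes back as the string "18", so numeric columns have to be converted by hand.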
MySQL database storage
MySQL basic commands
1. Log in to the database: mysql -uusername -ppassword
2. List databases: show databases;
3. Create a database: create database database_name;
4. Select a database: use database_name;
5. Create a table: create table if not exists table_name(column1 type attributes, column2 type attributes, ......);
6. View all rows: select * from table_name;
7. Insert a row: insert into table_name(column1, column2, ...) values(value1, value2, ...);
Connecting to MySQL from Python
    import pymysql

    # Connect to MySQL
    conn = pymysql.connect(user="root", password="root", host="localhost", port=3306,
                           database="maqu", charset="utf8mb4")
    # Get a cursor
    cursor = conn.cursor()

    # Insert one row; the %s placeholders are filled in safely by the driver
    sql = "insert into photos(title,href,img_url) values(%s,%s,%s)"
    data = (
        "this a title",
        "https://www.baidu.com",
        "https://www.baidu.com/imgs/asdfjalsdhflksahdfk.jpg"
    )
    cursor.execute(sql, data)
    conn.commit()  # commit the transaction

    # Release the cursor and the connection
    cursor.close()
    conn.close()
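The connect / cursor / execute / commit sequence can be tried without a running MySQL server. The sketch below swaps in `sqlite3` from the standard library purely as a stand-in, so it is not the MySQL setup used in these notes: `sqlite3` uses `?` placeholders where pymysql uses `%s`, and the table definition and the helper names `open_demo_db` and `save_photos` are assumptions made for the example. It uses `executemany`, the batch form of `execute`, and rolls back if the insert fails.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory database with a photos table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table if not exists photos("
        "id integer primary key, title text not null, href text, img_url text)"
    )
    return conn


def save_photos(conn, photos):
    """Insert several rows in one transaction; roll back if anything fails."""
    cursor = conn.cursor()
    try:
        # sqlite3 uses "?" placeholders; pymysql would use "%s" here
        cursor.executemany(
            "insert into photos(title, href, img_url) values(?, ?, ?)", photos
        )
        conn.commit()
        return True
    except sqlite3.Error as exc:
        print(exc)
        conn.rollback()
        return False
    finally:
        cursor.close()
```

Passing the values separately from the SQL string, rather than formatting them into it, is what lets the driver quote them correctly.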
    import os
    import requests
    import re
    import json
    import hashlib
    from bs4 import BeautifulSoup
    import pymysql

    url = "https://www.huashi6.com/"
    document = requests.get(url, timeout=10).text
    bs = BeautifulSoup(document, "html.parser")
    items = bs.select("div.c-section-waterfall div.c-section-work-item")

    os.makedirs("./imgs", exist_ok=True)  # make sure the download directory exists
    photos = []  # collected rows: (title, href, local file path)
    for item in items:
        try:
            a = item.select_one("div.waterfall-img a")
            title, href = a["title"], a["href"]
            document2 = requests.get(href, timeout=10).text
            # Use a regular expression to pull the JSON out of the script tag
            match_obj = re.search(r'<script type="application/ld\+json">(.*?)</script>', document2, re.S)
            json_str = match_obj.group(1).strip("\n\r\t ")
            img_url = "https:" + json.loads(json_str)["images"][0].split("?")[0]

            # Download the image; the file name is the MD5 hash of its URL
            content = requests.get(img_url, timeout=10).content
            file_name = hashlib.md5(img_url.encode("utf-8")).hexdigest()
            file_dir = "./imgs/" + file_name + ".png"
            print("Downloading: ", img_url)
            with open(file_dir, "wb") as fp:
                fp.write(content)

            # Add the row to the list
            photos.append((title, href, file_dir))
        except Exception as e:
            # Skip items that fail to parse or download, but report why
            print("Skipped one item:", e)

    # photos now looks like:
    # [("title", "https://page-url", "./imgs/<md5>.png"), ...]

    # Connect to MySQL
    conn = pymysql.connect(user="root", password="root", host="localhost", port=3306,
                           database="maqu", charset="utf8mb4")
    # Get a cursor
    cursor = conn.cursor()
    sql = "insert into photos(title,href,img_url) values(%s,%s,%s)"

    # For a single row, use cursor.execute(sql, data) with one tuple.
    # For many rows, executemany takes the whole list:
    try:
        cursor.executemany(sql, photos)
        conn.commit()  # commit the transaction
    except Exception as e:
        print(e)
        conn.rollback()  # undo the partial insert
    finally:
        cursor.close()
        conn.close()
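The parsing step of the scraper is easy to try offline. The sketch below isolates it under some assumptions: the helper names `extract_image_url` and `local_path` are invented here, the JSON-LD block is assumed to be an object with an `images` list of protocol-relative URLs (as the scraper above implies), and any HTML passed in is a made-up sample rather than real site output.

```python
import hashlib
import json
import re

# Same pattern as in the scraper: capture the body of the JSON-LD script tag
LD_JSON_RE = re.compile(r'<script type="application/ld\+json">(.*?)</script>', re.S)


def extract_image_url(html):
    """Return the first image URL from the page's JSON-LD block, or None."""
    match = LD_JSON_RE.search(html)
    if match is None:
        return None
    data = json.loads(match.group(1).strip())
    images = data.get("images") or []
    if not images:
        return None
    # Prefix the scheme and drop the query string
    return "https:" + images[0].split("?")[0]


def local_path(img_url, directory="./imgs"):
    """Build a file path from the MD5 hash of the URL (hypothetical helper)."""
    digest = hashlib.md5(img_url.encode("utf-8")).hexdigest()
    return directory + "/" + digest + ".png"
```

Returning `None` when the script tag or the image list is missing makes the "skip this item" case explicit, instead of relying on an exception raised by `match_obj.group`.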
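MongoDB database storage
1. On Ubuntu, start MongoDB: sudo service mongodb start; stop it: sudo service mongodb stop
2. Open the MongoDB shell: mongo
3. List databases: show dbs
4. Switch the current database to test: use test [test is the database name and can be changed. If the database does not exist yet, it is created once data is first stored in it]
5. List the collections in the current database: show collections
6. Create a collection: db.createCollection('collection_name') [a collection that does not exist yet is created automatically the first time it is used]
7. Find data: db.data.find() [data is the name of the collection]
8. Insert data: db.data.insert({'x': 1, 'y': 2}) [the inserted data must be a document, that is, a dictionary-like set of key-value pairs; newer shells provide db.data.insertOne for this]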
    from pymongo import MongoClient

    client = MongoClient("127.0.0.1", 27017)

    # Get the database (it is created on first write)
    maqu = client.maqu
    music = maqu.music  # collection

    # Insert documents
    # music.insert_one({"title": "mongodb的使用"})
    # music.insert_one({"name": "friendship"})

    # Fetch data
    # rv = music.find()  # returns all documents as a cursor object
    # for item in rv:
    #     print(item)
    # print(list(rv))  # convert the cursor to a list

    # Fetch a single document
    # rv = music.find_one()  # the result is a dict
    # print(rv, type(rv), rv["title"])

    # rv = music.find_one({"name": "friendship"})  # filtered query, one document
    # print(rv)

    # rv = music.find({"name": "friendship"})  # filtered query, many documents
    # print(list(rv))
    from pymongo import MongoClient
    import datetime

    client = MongoClient("127.0.0.1", 27017)

    db = "maqushop"
    maqushop = client[db]  # subscript access works when the database name is in a variable

    # Create a test collection, test_collection
    collect_name = "test_collection"
    collect = maqushop[collect_name]

    # Build a single document
    # post = {"author": "Mike",
    #         "text": "My first blog post!",
    #         "tags": ["mongodb", "python", "pymongo"],
    #         "date": datetime.datetime.now(datetime.timezone.utc)}
    # obj_id = collect.insert_one(post).inserted_id
    # print(obj_id)

    # Insert several documents at once
    posts = [
        {"author": "Mike",
         "text": "My first blog post!",
         "tags": ["mongodb", "python", "pymongo"],
         "date": datetime.datetime.now(datetime.timezone.utc)},
        {"author": "friendship",
         "text": "My first blog post!",
         "tags": ["mongodb", "python", "pymongo"],
         "date": datetime.datetime.now(datetime.timezone.utc)},
        {"author": "yuer",
         "text": "My first blog post!",
         "tags": ["mongodb", "python", "pymongo"],
         "date": datetime.datetime.now(datetime.timezone.utc)},
    ]
    ids = collect.insert_many(posts).inserted_ids
    # print(ids)
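When the same document shape is inserted repeatedly, it can be convenient to build the documents with a small function and keep the insert call in one place. This is only a sketch: `make_post` and `save_posts` are hypothetical helpers, and `save_posts` accepts any object with an `insert_many` method, so it can be handed a pymongo collection such as `collect` above or a stand-in object while experimenting without a server.

```python
import datetime


def make_post(author, text, tags):
    """Build one document with a timezone-aware UTC timestamp."""
    return {
        "author": author,
        "text": text,
        "tags": list(tags),
        "date": datetime.datetime.now(datetime.timezone.utc),
    }


def save_posts(collection, posts):
    """Insert the posts with insert_many and return their ids as a list."""
    posts = list(posts)
    if not posts:
        # pymongo's insert_many rejects an empty list, so skip the call
        return []
    return list(collection.insert_many(posts).inserted_ids)
```

With a live connection, a call could look like `save_posts(collect, [make_post("Mike", "My first blog post!", ["mongodb", "python", "pymongo"])])`.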