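Working with peewee
I. Advantages of using an ORM
1. It isolates your code from differences between databases and between database versions.
2. It makes the code easier to maintain.
3. It provides protections such as SQL injection prevention.
4. Passing values as variables is simpler than building SQL strings by hand.
5. Many projects that set out to avoid an ORM end up writing one of their own.
The core idea of an ORM is to map a table to a class and each row of the table to an object of that class.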
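The mapping idea and the injection point can be illustrated without any ORM library. The following is a minimal sketch using only the standard library; the `Person` dataclass and the `insert_sql` helper are hypothetical names made up for this illustration, not part of peewee.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields


@dataclass
class Person:
    # Hypothetical row type: one attribute per column of the table.
    name: str
    birthday: str


def insert_sql(cls):
    """Build a parameterized INSERT statement from the class definition."""
    cols = [f.name for f in fields(cls)]
    placeholders = ", ".join("?" for _ in cols)
    return f"INSERT INTO {cls.__name__.lower()} ({', '.join(cols)}) VALUES ({placeholders})"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, birthday TEXT)")

# The values are passed as bound parameters, so the quote characters in the
# name are stored as data instead of being interpreted as SQL.
hostile = Person(name="x'); DROP TABLE person; --", birthday="1979-01-15")
conn.execute(insert_sql(Person), astuple(hostile))

# Each row read back becomes an object of the mapped class again.
rows = [Person(*row) for row in conn.execute("SELECT name, birthday FROM person")]
print(rows[0].name)
```

A real ORM such as peewee does the same kind of work (generating SQL from the class and binding values as parameters) with far more features.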
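II. Advantages of peewee
1. Of the three mainstream Python ORMs (the Django ORM, SQLAlchemy and peewee), peewee is the most lightweight.
2. It is simple and flexible, and models are declared much like in the Django ORM.
3. It has many stars and is actively maintained.
4. Its documentation is of high quality.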
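III. Installing peewee
peewee needs a MySQL driver such as PyMySQL to talk to MySQL, so install PyMySQL before installing peewee; an earlier post on this blog covers installing PyMySQL. Both packages can be installed with pip (pip install pymysql peewee).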
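IV. Creating tables with peewee
Once installed, the peewee source code can be browsed under External Libraries in the IDE.
Creating a new table with peewee
Create a new file named peewee_test.py: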
from peewee import CharField, DateField, Model, MySQLDatabase

# Connect to the local MySQL database named "spider"
db = MySQLDatabase("spider", host="127.0.0.1", port=3306, user="root", password="123456")

class Person(Model):
    name = CharField()
    birthday = DateField()

    class Meta:
        database = db  # this model uses the "spider" database
        table_name = 'users'

if __name__ == '__main__':
    db.create_tables([Person])
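Result of running the script: peewee creates the table, automatically adds an id column as the primary key, and makes every column NOT NULL by default.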
Creating a table with a field length and a nullable column
from peewee import CharField, DateField, Model, MySQLDatabase

db = MySQLDatabase("spider", host="127.0.0.1", port=3306, user="root", password="123456")

class Person(Model):
    # At most 20 characters, and NULL is allowed
    name = CharField(max_length=20, null=True)
    birthday = DateField()

    class Meta:
        database = db  # this model uses the "spider" database
        table_name = 'users'

if __name__ == '__main__':
    db.create_tables([Person])
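Result of running the script: the name column is created with a maximum length of 20 and allows NULL. Note that create_tables() issues CREATE TABLE IF NOT EXISTS by default, so it does not alter a table that already exists; drop the old users table first to see the change.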
Creating a table with a custom primary key
from peewee import CharField, DateField, Model, MySQLDatabase

db = MySQLDatabase("spider", host="127.0.0.1", port=3306, user="root", password="123456")

class Person(Model):
    # name is the primary key, so peewee does not add an id column
    name = CharField(max_length=20, primary_key=True)
    birthday = DateField()

    class Meta:
        database = db  # this model uses the "spider" database
        table_name = 'persons'

if __name__ == '__main__':
    db.create_tables([Person])
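Result of running the script: the persons table is created with name as its primary key, and no id column is added.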
V. Create, read, update and delete with peewee
Inserting data
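from datetime import date

from peewee import CharField, DateField, Model, MySQLDatabase

db = MySQLDatabase("spider", host="localhost", port=3306, user="root", password="123456")

class Person(Model):
    name = CharField(max_length=20)
    birthday = DateField()

    class Meta:
        database = db  # this model uses the "spider" database
        table_name = 'persons'

if __name__ == '__main__':
    # Run once if persons does not exist yet. This model relies on the default
    # id primary key, so drop the earlier persons table (keyed on name) first.
    # db.create_tables([Person])
    uncle_bob = Person(name='Bom', birthday=date(1979, 1, 15))
    uncle_bob.save()  # no primary key value yet, so this issues an INSERT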
Result: a new row with name Bom and birthday 1979-01-15 appears in the persons table.
Querying data
Fetching a single row
from peewee import CharField, DateField, Model, MySQLDatabase

db = MySQLDatabase("spider", host="localhost", port=3306, user="root", password="123456")

class Person(Model):
    name = CharField(max_length=20)
    birthday = DateField()

    class Meta:
        database = db  # this model uses the "spider" database
        table_name = 'persons'

if __name__ == '__main__':
    # Fetch a single row; get() raises Person.DoesNotExist if nothing matches
    result1 = Person.select().where(Person.name == 'Bom').get()
    print(result1)  # prints the primary key value (peewee's default __str__)
    print(result1.name)
    print(result1.birthday)
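Output: the primary key, name and birthday of the first matching row are printed, and they match the row stored in the database.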
Fetching multiple rows
from peewee import CharField, DateField, Model, MySQLDatabase

db = MySQLDatabase("spider", host="localhost", port=3306, user="root", password="123456")

class Person(Model):
    name = CharField(max_length=20)
    birthday = DateField()

    class Meta:
        database = db  # this model uses the "spider" database
        table_name = 'persons'

if __name__ == '__main__':
    # Fetch multiple rows. The [1:] slice skips the first matching row;
    # drop the slice to iterate over every match.
    result2 = Person.select().where(Person.name == 'Bom')[1:]
    for person in result2:
        print(person)
        print(person.name)
        print(person.birthday)
Output: every matching row in the table is printed except the first, which the [1:] slice skips.
Updating data
from datetime import date

from peewee import CharField, DateField, Model, MySQLDatabase

db = MySQLDatabase("spider", host="localhost", port=3306, user="root", password="123456")

class Person(Model):
    name = CharField(max_length=20)
    birthday = DateField()

    class Meta:
        database = db  # this model uses the "spider" database
        table_name = 'persons'

if __name__ == '__main__':
    # Update data: set the birthday of every matching row to today
    result1 = Person.select().where(Person.name == 'Bom')
    for person in result1:
        person.birthday = date.today()
        # save() inserts when the instance has no primary key value yet
        # and updates the existing row otherwise
        person.save()
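Result: before the update each Bom row holds its original birthday; after the update the birthday of each is today's date.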
Deleting data
from peewee import CharField, DateField, Model, MySQLDatabase

db = MySQLDatabase("spider", host="localhost", port=3306, user="root", password="123456")

class Person(Model):
    name = CharField(max_length=20)
    birthday = DateField()

    class Meta:
        database = db  # this model uses the "spider" database
        table_name = 'persons'

if __name__ == '__main__':
    # Delete data: remove every matching row, one instance at a time
    result1 = Person.select().where(Person.name == 'Bom')
    for person in result1:
        person.delete_instance()
Result: before the deletion the table contains the Bom rows; after the deletion they are gone.
Example: scraping the Douban Top 250 movies and storing them with peewee
import requests
from bs4 import BeautifulSoup
from peewee import CharField, FloatField, Model, MySQLDatabase, TextField

# Request URL of the list
url = 'https://movie.douban.com/top250'
# Request headers
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
}

db = MySQLDatabase("spider", host="localhost", port=3306, user="root", password="123456")

class movie_model(Model):
    title = CharField(max_length=20)
    rating_num = FloatField()
    comment_num = TextField()

    class Meta:
        database = db  # this model uses the "spider" database
        table_name = 'douban_movie'

# Parse one page and save every movie found on it
def parse_html(html):
    soup = BeautifulSoup(html, 'lxml')  # the 'lxml' parser requires the lxml package
    movie_list = soup.find('ol', class_='grid_view').find_all('li')
    for movie in movie_list:
        title = movie.find('div', class_='hd').find('span', class_='title').get_text()
        rating_num = movie.find('div', class_='star').find('span', class_='rating_num').get_text()
        comment_num = movie.find('div', class_='star').find_all('span')[-1].get_text()
        print(title, rating_num, comment_num)
        result = movie_model(title=title, rating_num=rating_num, comment_num=comment_num)
        result.save()

# Create the table, then fetch and store all 10 pages (25 movies each)
def save_data():
    db.create_tables([movie_model])
    for i in range(10):
        url = 'https://movie.douban.com/top250?start=' + str(i * 25) + '&filter='
        response = requests.get(url, headers=headers)
        parse_html(response.text)

if __name__ == '__main__':
    save_data()
Output: the title, rating and comment count of each movie are printed as the rows are saved.
A CSDN blog crawler example:
https://www.cnblogs.com/Mangnolia/p/14011850.html