数据库

1.数据库管理系统

1.1管理系统相关

1.1.1基础概念

数据库管理系统：专门用来管理数据文件，帮助用户更简洁的操作数据的软件
- DBMS
数据：data
文件
文件夹：数据库database，db
数据库管理员，DBA

1.1.2分类

关系型数据库
- sql server
- oracle
- mysql
- sqllite
非关系型数据库
- redis
- mongodb

1.2 mysql相关

C/S架构

1.2.1 server端使用

mysqld install
- 安装数据库服务
net start mysql
- 启动数据库的server端
net stop mysql
- 停止数据库的server端

1.2.2客户端使用

客户端可以是python代码，也可以是exe程序
- mysql自带exe程序：mysql.exe(添加环境变量直接使用即可)
登录客户端
- 终端使用：mysql -u用户名 -p密码
- 最高权限和创建用户
  - 最高权限root
  - 不仅可以连接本地的数据库，也可以连接网络上的某一个数据库的server端

1.2.3 mysql用户相关指令

select user();
- 查看当前用户是谁
set password = password('密码')
- 设置密码
create user 's21'@'192.168.12.%' identified by '123';
- 创建用户
grant all on day37.* to 's21'@'192.168.12.%';
- 授权
- grant all on day37.* to 'alex'@'%' identified by '123';
  - 授权并创建用户

2. sql语言

创建库、创建表 DDL数据库定义语言
存数据，删除数据,修改数据,查看 DML数据库操纵语言
grant/revoke DCL控制权限语言

2.1库操作

create database 数据库名; # 创建库
show databases; # 查看当前有多少个数据库
select database();# 查看当前使用的数据库
use 数据库的名字; # 切换到这个数据库(文件夹)下
drop database 数据库名; 删除库

2.2表操作

2.2.1表操作

show tables;查看当前文件夹中有多少张表
create table student(id int,name char(4));创建表
drop table student;删除表
desc 表名;查看表结构

2.2.2数据操作

insert into student values (1,'alex');数据的增加
- 所有的在这个表中的字段都需要按照顺序被填写在这里
  - value只能增加一个
  - values可以增加多个
```
# t1 id,name,age
# insert into t1 value (1,'alex',83)
# insert into t1 values (1,'alex',83),(2,'wusir',74)
```
- insert into 表名(字段名，字段名。。。) values (值....)
  - 所有在字段位置填写了名字的字段和后面的值必须是一一对应
- insert into 表名(字段名，字段名。。。) values (值....),(值....),(值....)
  - 所有在字段位置填写了名字的字段和后面的值必须是一一对应
```
# insert into t1(name,age) value ('alex',83)
# insert into t1(name,age) values ('alex',83),('wusir',74)
```
select * from student;数据的查看
- select * from 表
- select 字段,字段.. from 表
- select distinct 字段,字段.. from 表 # 按照查出来的字段去重
- select 字段*5 from 表
- select 字段 as 新名字,字段 as 新名字 from 表
- select 字段新名字 from 表
update 表 set 字段名=值修改数据(修改所有)
- update student set name = 'wusir' where id=2;修改数据(修改单个)
delete from 表名字;删除数据(删除整个表)
- delete from student where id=1;删除单个数据

3.存储引擎

3.1 InnoDB

适合并发比较高的，对事务一致性要求高的，相对更适应频繁的修改和删除操作
mysql5.6以上的默认存储方式
存储文件的个数：表结构、表中的数据
支持行级锁
- 修改某一行数据会将该行上锁，保证数据安全
- 能够更好的处理并发的修改问题
支持事务
- 将某段代码操作变成原子型，不可拆分，执行完才生效
支持外键
- 图书管理和出版社，分成两个表进行存储，两个相关联，相关联的一列称为外键

3.2 MyTSAM

适合做读、插入数据比较频繁的，对修改和删除涉及少的
mysql5.5以下的默认存储方式
存储文件的个数：表结构、表中的数据、索引
支持表级锁
不支持行级锁、不支持事务、不支持外键

3.3 MEMORY

存储在内存中
存储文件的个数：表结构
优势：增删改查都很快
劣势：重启数据消失、容量有限

3.4存储引擎相关sql语句

查看配置项
- show variables like '%engine%';查看存储引擎
- show variables like '%char%';查看编码
创建表的时候指定存储引擎
- create table t2 (id int, name char(4)) engine=myisam;
- create table t3 (id int, name char(4)) engine=memory;
查看表结构
- 能够看到和这张表相关的所有信息
  - show create table 表名;
- 只能查看表的字段的基础信息
  - desc 表名;
  - describe 表名;

4.数据类型

4.1数字类型

整数int
- create table t4 (id1 int(4),id2 int(11));
- 注意点
  - int默认是有符号的
  - 它能表示的数字的范围不被宽度约束
  - 它只能约束数字的显示宽度
- 创建无符号型整数
  - create table t5 (id1 int unsigned,id2 int);
小数float
- float/double
  - 两个都不能精确表示
  - create table t6 (f1 float(5,2),d1 double(5,2));
    - 约束条件中第一个数字为小数的总长度，第二个为小数部分的长度
- 精确表示decimal
  - create table t8 (d1 decimal,d2 decimal(25,20));

4.2时间类型

year 年
date 年月日
time 时分秒
datetime、timestamp 年月日时分秒
- timestamp 有时间上下限，而且不为空默认自动更新为当前时间，也可以将datetime设置为这样：
  
  dt datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
create table t9(y year, d date, dt datetime, t time, ts timestamp);

4.3字符串类型

char(15) 定长的单位
- 总是按照这个长度来存储，不够就补空格
varchar(15) 变长的单位
- 有多长就存多长，后面加上当前存储的长度
  - alex alex4
优缺点
- varchar：节省空间、存取效率相对低
- char：浪费空间，存储效率相对高，存储长度变化小的
create table t11 (name1 char(5),name2 varchar(5));

4.4 enum / set

enum(单选)
- 枚举，只能选择设置好的里面的一个，不在里面的话不存储
set(多选)
- 选择设置好的里面的多个，重复只存一个
create table t12(name char(12), gender ENUM('male', 'female'), hobby set('抽烟', '喝酒', '烫头', '洗脚'));
- 存数据时
- insert into t12 values('alex','male','抽烟,喝酒,洗脚,洗脚,按摩');

5.约束

unsigned 设置某一个数字无符号
not null 某一个字段不能为空

default 给某个字段设置默认值

# create table t2(
#   id int not null,
#   name char(12) not null,
#   age int default 18,
#   gender enum('male','female') not null default 'male'
# )

unique 设置某一个字段不能重复

# create table t3(
#     id int unique,
#     username char(12) unique,
#     password char(18)
# );

联合唯一

# create table t4(
#     id int,
#     ip char(15),
#     server char(10),
#     port int,
#     unique(ip,port)
# );

auto_increment 设置某一个int类型的字段自动增加

auto_increment 自带 not null效果
需要设置条件 int unique，必须是唯一的

# create table t5(
#     id int unique auto_increment,
#     username char(10),
#     password char(18)
# )
# insert into t5(username,password) values('alex','alex3714')

primary key 主键，设置某一个字段非空且不能重复

约束力相当于 not null + unique
一张表只能有一个主键，也最好有一个主键

# create table t6(
#     id int not null unique,     # 你指定的第一个非空且唯一的字段会被定义成主键
#     name char(12) not null unique
# )

# create table t7(
#     id int primary key,     # 你指定的第一个非空且唯一的字段会被定义成主键
#     name char(12) not null unique
# )

联合主键

# create table t4(
#     id int,
#     ip char(15),
#     server char(10),
#     port int,
#     primary key(ip,port)
# );

foreign key 外键

references

# 员工表
# create table staff(
# id  int primary key auto_increment,
# age int,
# gender  enum('male','female'),
# salary  float(8,2),
# hire_date date,
# post_id int,
# foreign key(post_id) references post(pid)
# )

# 部门表
#  pid postname post_comment post_phone
# create table post(
#     pid  int  primary key,
#     postname  char(10) not null unique,
#     comment   varchar(255),
#     phone_num  char(11)
# )

# update post set pid=2 where pid = 1;
# delete from post where pid = 1;

级联删除和更新

# create table staff2(
# id  int primary key auto_increment,
# age int,
# gender  enum('male','female'),
# salary  float(8,2),
# hire_date date,
# post_id int,
# foreign key(post_id) references post(pid) on update cascade on delete set null
# )

6.表相关

6.1修改表

alter table 表名 add 添加字段
- alter table 表名 add 字段名数据类型(宽度) 约束 first/after name
alter table 表名 drop 删除字段
- alter table 表名 drop 字段名
alter table 表名 modify 修改已经存在的字段的类型宽度约束
- alter table 表名 modify name varchar(12) not null
alter table 表名 change 修改已经存在的字段的类型宽度约束和字段名字
- alter table 表名 change name new_name varchar(12) not null

6.2表关系

两张表中的数据之间的关系

多对一：foreign key
- 永远在多的那张表中设置外键
- 多个学生都是同一个班级的
- 学生表关联班级表
- 学生是多，班级是一
一对一：foreign key +unique
- 后出现的后一张表中的数据作为外键，并且要约束这个外键是唯一的
- 客户关系表：手机号码招生老师上次联系的时间备注信息
- 学生表：姓名入学日期缴费日期结业
多对多：产生第三张表，把两个关联关系的字段作为第三张表的外键
- 书作者
- 出版社书

7.表查询

7.1单表查

7.1.1 where语句

7.1.1.1比较运算

<
>
>=
<=
!= 不等于
<> 不等于

7.1.1.2范围筛选

多选一
- 字段名 in (值1,值2,值3);
- select * from employee where salary in (20000,30000,3000,19000,18000,17000);
在一个模糊的范围里
- 在一个数值区间：between 1000 and 2000
  - select emp_name from employee where salary between 10000 and 20000;
- 字符串的模糊查询：like
  - 通配符 % 匹配任意长度的任意字符
  - 通配符 _ 匹配一个字符长度的任意内容
- 正则匹配：regexp，需要更加细粒度的匹配的时候
- select * from 表 where 字段 regexp 正则表达式;
- select * from employee where emp_name regexp '^j[a-z]{5}';
示例：查看岗位是teacher且名字是jin开头的员工姓名、年薪
- select emp_name,salary*12 from employee where post='teacher' and emp_name like 'jin%';
- select emp_name,salary*12 from employee where post='teacher' and emp_name regexp '^jin.*';

7.1.1.3逻辑运算

条件的拼接
与 and
或 or
非 not
- select * from employee where salary not in (20000,30000,3000,19000,18000,17000);

7.1.1.4身份运算

关于null 需要使用is/is not
- 示例：查看岗位描述不为NULL的员工信息
- select * from employee where post_comment is not null;

7.1.2分组聚合

7.1.2.1分组

group by
- select * from employee group by post;
注意点：会把在group by后面的这个字段，也就是post字段中的每一个不同的项都保留下来，并且把值是这一项的所有行归为一组

7.1.2.2聚合

把很多行的同一个字段进行一些统计，最终得到一个结果
- count(字段) 统计这个字段有多少项
- sum(字段) 统计这个字段对应的数值的和
- avg(字段) 统计这个字段对应的数值的平均值
- min(字段)
- max(字段)

7.1.2.3分组聚合

求各个部分的人数
- select count(*) from employee group by post;
求公司里男生和女生的人数
- select count(id) from employee group by sex;
求各部门年龄最小的
- select post,min(age) from employee group by post;

7.1.3 having / order by / limit

7.1.3.1 having

条件语句，主要过滤组条件，与group by 连用
示例：部门人数大于3的部门
- select post from employee group by post having count(*) > 3;
在group by 后面只能用having条件
- 执行顺序总是先执行where，再执行group by分组
示例：平均薪资大于10000的部门
- select post from employee group by post having avg(salary) > 10000;

7.1.3.2 order by

order by 某一个字段 asc;默认是升序asc(可不写)，从小到大
order by 某一个字段 desc;指定降序排列desc，从大到小
order by 某一个字段 asc,第二个字段 desc;
- 指定先根据第一个字段升序排列，在第一个字段相同的情况下，再根据第二个字段排列

7.1.3.3 limit

取前n个 limit n == limit 0,n
应用：分页
- limit m,n 从m+1开始取n个
limit n offset m == limit m,n 从m+1开始取n个

7.1.4执行顺序

from、where、group by依次执行在select之前执行
order by、limit依次执行在select之后执行
select distinct 需要显示的列 from 表
- where 条件
- group by 分组
- having 过滤组条件
- order by 排序
- limit 前n 条

7.2多表查

两张表是怎么连在一起的
- select * from emp,department;使用笛卡尔乘积

7.2.1连表查询

7.2.1.1内连接

inner join
- 两张表条件不匹配的项不会出现在结果中
select * from emp inner join department on emp.dep_id = department.id;

7.2.1.2外连接

左外连接：left join
- 永远显示全量的左表中的数据
- select * from emp left join department on emp.dep_id = department.id;
右外连接：right join
- 永远显示全量的右表中的数据
- select * from emp right join department on emp.dep_id = department.id;
全外连接：union
select * from emp left join department on emp.dep_id = department.id union select * from emp right join department on emp.dep_id = department.id;

7.2.2子查询

找技术部门的所有人的姓名
- 先找到部门表技术部门的部门id
  - select id from department where name = '技术';
- 再找emp表中部门id = 200
  - select name from emp where dep_id = (select id from department where name = '技术');
找到技术部门和销售部门所有人的姓名
- 先找到技术部门和销售部门的的部门id
  - select id from department where name = '技术' or name='销售'
- 找到emp表中部门id = 200或者202的人名
  - select name from emp where dep_id in (select id from department where name = '技术' or name='销售');
带exists关键字的子查询
- 在使用EXISTS关键字时，内层查询语句不返回查询的记录，而是返回一个真假值。True或False
- 当返回True时，外层查询语句将进行查询；当返回值为False时，外层查询语句不进行查询
- ```
select * from employee where exists (select id from department where id=200);
```

7.2.3总结

连接语法
- select 字段 from 表1 xxx join 表2 on 表1.字段 = 表2.字段;
- 常用
  - 内连接
  - 左外连接
优先使用多表查询，因为连表查询的效率高

8.pymysql模块

登录连接到server端
- conn = pymysql.connect(host='127.0.0.1', user='root', password="123",database='day40')
创建数据库操作符
- cur = conn.cursor()
- 将查询的结果变成字典
  - cur = conn.cursor(pymysql.cursors.DictCursor)
sql语句操作
- cur.execute(sql语句)
- 增删改的sql语句需要提交
  - conn.commit()
- 查
  - 得到查询的结果
  - ret = cur.fetchone()得到第一个结果
  - ret = cur.fetchmany(5)得到前5个结果
  - ret = cur.fetchall()得到所有结果

conn = pymysql.connect(host='127.0.0.1', user='root', password="123",
                 database='day40')
cur = conn.cursor(pymysql.cursors.DictCursor)   # 数据库操作符 游标
cur.execute('select * from employee '
            'where id > 10')
ret = cur.fetchone()
print(ret['emp_name'])
# ret = cur.fetchmany(5)
ret = cur.fetchall()
print(ret)
conn.close()

conn = pymysql.connect(host='127.0.0.1', user='root', password="123",
                 database='day40')
cur = conn.cursor()   # 数据库操作符 游标
cur.execute('insert into employee(emp_name,sex,age,hire_date) '
            'values ("郭凯丰","male",40,20190808)')
cur.execute('delete from employee where id = 18')
conn.commit()
conn.close()

9.索引

9.1索引相关

9.1.1索引的定义

什么是索引 -- 目录
- 就是建立起的一个在存储表阶段就有的一个存储结构，能加速查询

9.1.2索引的原理

block 磁盘预读原理
- 读硬盘的io操作的时间非常的长，比CPU执行指令的时间长很多，尽量的减少IO次数才是读写数据的主要要解决的问题

9.1.3数据库存储方式

新的数据结构-----树
- 平衡树 balance tree --b树
- 在b树的基础上进行了改良 --b+树
  - 分支节点和根节点都不再存储实际的数据了，让分支和根节点能存储更多的索引的信息，就降低了树的高度，所有的实际数据都存储在叶子节点中
  - 在叶子节点之间加入了双向的链式结构，方便在查询中的范围条件
- mysql当中所有的b+树索引的高度都基本控制在3层
  - io操作的次数非常稳定
  - 有利于通过范围查询
- 影响索引的效率----树的高度
  - 对哪一列创建索引，选择尽量短的列做索引
  - 对区分度高的列建索引，重复率超过了10%那么不适合创建索引

9.1.4索引的种类

在innodb中聚集索引和辅助索引并存的
- 聚集索引：数据直接存储在树结构的叶子节点
  - primary key 主键
    - 联合主键
  - 约束的作用：非空 + 唯一
- 辅助索引：数据不直接存储在树中
  - 除了主键之外所有的索引都是辅助索引
  - unique 自带索引辅助索引
    - 联合唯一
    - 约束的作用：唯一
  - index 辅助索引
    - 联合索引
在myisam中只有辅助索引，没有聚集索引

9.1.5创建索引

create index 索引名字 on 表(字段)
DROP INDEX 索引名 ON 表名字;

9.2索引不生效的原因

9.2.1要查询的数据范围大

> < >= <= !=
between and
select * from 表 order by age limit 0,5;
- 越到后面，时间越长，索引不生效
select * from 表 where id between 1000000 and 1000005;
- 索引生效
like
- 结果范围很大，索引不生效
  - 使用abc%索引生效，%abc索引不生效

9.2.2区分度不高

如果一列内容的区分度不高，索引也不生效

9.2.3索引列参与计算

索引列不能在条件中参与计算
select * from s1 where id*10 = 1000000; 索引不生效

9.2.4 and / or

对两列内容进行条件查询
and
- and条件两端的内容，优先选择一个有索引的，并且树形结构更好的，来进行查询
- 两个条件都成立才能完成where条件，先完成范围小的缩小后面条件的压力
- select * from s1 where id =1000000 and email = 'eva1000000@oldboy';
or
- or条件的，不会进行优化，只是根据条件从左到右依次筛选
- 条件中带有or的要想命中索引，这些条件中所有的列都是索引列
- select * from s1 where id =1000000 or email = 'eva1000000@oldboy';

9.2.5联合索引

create index ind_mix on s1(id,name,email);
在联合索引中如果使用了or条件索引就不能生效
- select * from s1 where id =1000000 or email = 'eva1000000@oldboy';
最左前缀原则：在联合索引中，条件必须含有在创建索引的时候的第一个索引列
- select * from s1 where id =1000000; 能命中索引
- select * from s1 where email = 'eva1000000@oldboy'; 不能命中索引
在整个条件中，从开始出现模糊匹配的那一刻，索引就失效了
- select * from s1 where id >1000000 and email = 'eva1000001@oldboy';
- select * from s1 where id =1000000 and email like 'eva%';
使用场景
- 只对 a 对abc 条件进行索引，而不会对b，对c进行单列的索引

9.2.6总结

单列索引
- 选择一个区分度高的列建立索引，条件中的列不要参与计算，条件的范围尽量小，使用and作为条件的连接符
使用or来连接多个条件
- 在满上上述条件的基础上，对or相关的所有列分别创建索引
覆盖索引
- 如果我们使用索引作为条件查询，查询完毕之后，不需要回表查，覆盖索引
- explain select id from s1 where id = 1000000;
- explain select count(id) from s1 where id > 1000000;
合并索引
- 对两个字段分别创建索引，由于sql的条件让两个索引同时生效了，那么这个时候这两个索引就成为了合并索引
执行计划
- explain
- 如果你想在执行sql之前就知道sql语句的执行情况，那么可以使用执行计划
- 情况1：30000000条数据
  - 直接执行需要20s
  - explain sql ---> 不会真正的执行sql，而是会给你列出一个执行计划
- 情况2：目前20条数据，以后想存更多

9.2.7注意事项

建表、使用sql语句的时候注意的
- char 代替 varchar
- 连表代替子查询
- 创建表的时候固定长度的字段放在前面

10.数据备份和恢复

10.1数据备份

退出客户端，直接在命令行执行
语法：mysqldump -h 服务器 -u用户名 -p密码数据库名 > 备份文件.sql
- 单库备份
- 多库备份
- 备份所有库

#单库备份
mysqldump -uroot -p123 db1 > db1.sql
mysqldump -uroot -p123 db1 table1 table2 > db1-table1-table2.sql

#多库备份
mysqldump -uroot -p123 --databases db1 db2 mysql db3 > db1_db2_mysql_db3.sql

#备份所有库
mysqldump -uroot -p123 --all-databases > all.sql

10.2数据恢复

在当前数据库下，使用source语句
- source 路径

#方法一：
[root@egon backup]# mysql -uroot -p123 < /backup/all.sql

#方法二：
mysql> use db1;
mysql> SET SQL_LOG_BIN=0;   #关闭二进制日志，只对当前session生效
mysql> source /root/db1.sql

11.事务和锁

在执行修改操作时，将数据库文件锁住，执行完操作后，再让另一个操作来取值进行操作
注意点
- 执行了for update才算是加锁

begin;  # 开启事务
select * from emp where id = 1 for update;  # 查询id值，for update添加行锁；
update emp set salary=10000 where id = 1; # 完成更新
commit; # 提交事务

12. sql注入问题

先建表准备数据

# create table userinfo(
# id int primary key auto_increment,
# name char(12) unique not null,
# password char(18) not null
# )

# insert into userinfo(name,password) values('alex','alex3714')

python代码操作
- sql语句中注释：--
  - 注释掉之后的sql语句

# username = input('user >>>')
# password = input('passwd >>>')
# sql = "select * from userinfo where name = '%s' and password = '%s'"%(username,password)
# print(sql)

# username输入'alex' ;--，password任意
select * from userinfo where name = 'alex' ;-- and password = '792164987034';
# username输入任意加上 or 1=1 ;--   password任意
select * from userinfo where name = 219879 or 1=1 ;-- and password = 792164987034;
# select * from userinfo where name = '219879' or 1=1 ;-- and password = '792164987034';

出现问题的原因：手动拼接sql语句导致出现问题
- 解决方法：不手动拼接，由cur.execute()自动帮我们拼接
```
sql = "select * from userinfo where name = %s and password = %s"
cur.execute(sql,(username,password))
```

import pymysql

conn = pymysql.connect(host = '127.0.0.1',user = 'root',
                       password = '123',database='day41')
cur = conn.cursor()
username = input('user >>>')
password = input('passwd >>>')
sql = "select * from userinfo where name = %s and password = %s"
cur.execute(sql,(username,password))
print(cur.fetchone())
cur.close()
conn.close()

posted @ 2020-03-21 12:49 Hedger_Lee 阅读(101) 评论(0) 编辑收藏举报

刷新页面返回顶部

Hedger_Lee

数据库

数据库