数据库(九):多表查询②
进击のpython
数据库——多表查询
那接着上一节说,其实在真正的操作中,不光是要把数据联合,还需要进行筛选数据,比如:
打印员工平均年龄大于三十的部门
拿到一个需求教你怎么写!
需要员工员工年龄和部门
员工年龄存放在employee中,部门存放在deparement中
应该是什么连接呢?根据需求应该是内连接~(这个,你自己理解一下就可以了嘛)
select * from employee inner join department on employee.dep_id=department.id;
mysql> select * from employee inner join department on employee.dep_id=department.id;
+----+-----------+--------+------+--------+------+--------------+
| id | name | sex | age | dep_id | id | name |
+----+-----------+--------+------+--------+------+--------------+
| 1 | egon | male | 18 | 200 | 200 | 技术 |
| 2 | alex | female | 48 | 201 | 201 | 人力资源 |
| 3 | wupeiqi | male | 38 | 201 | 201 | 人力资源 |
| 4 | yuanhao | female | 28 | 202 | 202 | 销售 |
| 5 | liwenzhou | male | 18 | 200 | 200 | 技术 |
+----+-----------+--------+------+--------+------+--------------+
5 rows in set (0.00 sec)
拿到一个虚拟表,怎么叫虚拟表呢?因为这个表不是硬盘尚存的,而是临时拼在内存里的
但!他是一张表!
根据题意应该是部门之间的员工的年龄进行比较,所以就应该分组,按照什么分组呢???
如果按照id(dep_id),那我select后面拿到的只能是id(dep_id)【参考分组时说的知识点】
所以就应该用部门的名字name字段,但是发现了吗?有两个name字段,所以应该强制指定:
select department.name from employee inner join department on employee.dep_id=department.id
group by department.name;
接着我们要员工年龄的平均值大于三十,是不是应该用having过滤一下:
select department.name from employee inner join department on employee.dep_id=department.id
group by department.name
having avg(age)>30;
为了清晰一点,加个字符串拼接:
select concat('平均年龄超过三十的部门:',department.name) from employee inner join department on employee.dep_id=department.id
group by department.name
having avg(age)>30;
mysql> select concat('平均年龄超过三十的部门:',department.name) from employee inner join department on employee.dep_id=department.id
-> group by department.name
-> having avg(age)>30;
+----------------------------------------------------------------+
| concat('平均年龄超过三十的部门:',department.name) |
+----------------------------------------------------------------+
| 平均年龄超过三十的部门:人力资源 |
+----------------------------------------------------------------+
1 row in set (2.76 sec)
表头不好看,重命名一下:
select concat('平均年龄超过三十的部门:',department.name) info from employee inner join department on employee.dep_id=department.id
group by department.name
having avg(age)>30;
mysql> select concat('平均年龄超过三十的部门:',department.name) info from employee inner join department on employee.dep_id=department.id
-> group by department.name
-> having avg(age)>30;
+--------------------------------------------------+
| info |
+--------------------------------------------------+
| 平均年龄超过三十的部门:人力资源 |
+--------------------------------------------------+
1 row in set (0.00 sec)
那这个需求就做出来了
而至此!select的终极版语法才算是真正的水落石出:
定义顺序:
SELECT DISTINCT <select_list>
FROM <left_table>
<join_type> JOIN <right_table>
ON <join_condition>
WHERE <where_condition>
GROUP BY <group_by_list>
HAVING <having_condition>
ORDER BY <order_by_condition>
LIMIT <limit_number>
执行顺序:
(7) SELECT
(8) DISTINCT <select_list>
(1) FROM <left_table>
(3) <join_type> JOIN <right_table>
(2) ON <join_condition>
(4) WHERE <where_condition>
(5) GROUP BY <group_by_list>
(6) HAVING <having_condition>
(9) ORDER BY <order_by_condition>
(10) LIMIT <limit_number>
子查询
除了这种查询方法,还有子查询,子查询就是把查询的结果,当做另一个表的查询条件
那刚才的例子,也可以这么搞,我先拿到部门平均年龄过30的id,然后再去employee去找对应的部门名
带in的子查询
先拿到超过30的id:
select dep_id from employee inner join department on employee.dep_id=department.id
group by dep_id
having avg(age)>30;
mysql> select dep_id from employee inner join department on employee.dep_id=department.id
-> group by dep_id
-> having avg(age)>30;
+--------+
| dep_id |
+--------+
| 201 |
+--------+
1 row in set (0.00 sec)
拿到了之后进行传值:
select name from department where id in (
select dep_id from employee inner join department on employee.dep_id=department.id
group by dep_id
having avg(age)>30);
mysql> select name from department where id in (
->
-> select dep_id from employee inner join department on employee.dep_id=department.id
->
-> group by dep_id
->
-> having avg(age)>30);
+--------------+
| name |
+--------------+
| 人力资源 |
+--------------+
1 row in set (0.00 sec)
是不是也达到了需求~
查询技术部员工的姓名怎么写?
select name from employee where dep_id in (
select dep_id from employee inner join department on employee.dep_id=department.id
where department.name = "技术"
group by dep_id);
mysql> select name from employee where dep_id in (
-> select dep_id from employee inner join department on employee.dep_id=department.id
-> where department.name = "技术"
-> group by dep_id);
+-----------+
| name |
+-----------+
| egon |
| liwenzhou |
+-----------+
2 rows in set (0.00 sec)
查看不足1人的部门名这个怎么做?
这个其实就用到了数学的思想~总数里去掉一人及以上的,是不是就是不足一人的了
而只要是在用户表里面的用户拥有dep_id的,是不是都是有部门的,去掉这一部分是不是就满足了题意
select name from department where id not in (select distinct dep_id from employee);
(子查询得到的是有人的部门id)
mysql> select name from department where id not in (select distinct dep_id from employee);
+--------+
| name |
+--------+
| 运营 |
+--------+
1 row in set (0.47 sec)
带比较运算符的子查询
提一句吧,既然拿到的是数据,就像1 12 3 一样的,你就把它看作个数字,当然是可以比较的
查询大于所有人平均年龄的员工名与年龄
select avg(age) from employee;
查询所有员工的年龄 # 28.000
那在你心里,此时这一串乱七八糟的东西就是28.000
此时题意就变成,所有年龄大于28.000的员工名与年龄
select name,age from employee where age > 28.000;
然后将28.000进行“等量代换”
select name,age from employee where age > (select avg(age) from employee);
mysql> select name,age from employee where age > (select avg(age) from employee);
+---------+------+
| name | age |
+---------+------+
| alex | 48 |
| wupeiqi | 38 |
+---------+------+
2 rows in set (0.00 sec)
查询大于部门内平均年龄的员工名、年龄
部门内,一看就是分组,而且应该是以部门进行分组
select employee.name,employee.age from employee
inner join
(select dep_id,avg(age) avg_age from employee group by dep_id) t2
on employee.dep_id = t2.dep_id
where employee.age > t2.avg_age;
mysql> select employee.name,employee.age from employee
-> inner join
-> (select dep_id,avg(age) avg_age from employee group by dep_id) t2
-> on employee.dep_id = t2.dep_id
-> where employee.age > t2.avg_age;
+------+------+
| name | age |
+------+------+
| alex | 48 |
+------+------+
1 row in set (0.34 sec)
但是这个employee太长了,所以,给他重新命名比较好(知道为什么是t2了吧)
select employee.name,employee.age from employee t1
inner join
(select dep_id,avg(age) avg_age from employee group by dep_id) t2
on t1.dep_id = t2.dep_id
where t1.age > t2.avg_age;
带exists关键字的子查询
理解成if
select * from employee where exists (select id from department where id=200);
括号里是不是不为空,那就是真,就相当于if True:于是就有打印结果
select * from employee where exists (select id from department where id=204);
括号里是不是为空,那就是假,就相当于if False:于是没有打印结果
接下来借着这个例子给你讲讲一种思想:我连我自己
每个职位最新入职的员工
自己创建表:
create table employee(
id int not null unique auto_increment,
name varchar(20) not null,
sex enum('male','female') not null default 'male', #大部分是男的
age int(3) unsigned not null default 28,
hire_date date not null,
post varchar(50),
post_comment varchar(100),
salary double(15,2),
office int, #一个部门一个屋子
depart_id int
);
插入数据:
insert into employee(name,sex,age,hire_date,post,salary,office,depart_id) values
('egon','male',18,'20170301','驻沙河办事处外交大使',7300.33,401,1), #以下是教学部
('alex','male',78,'20150302','teacher',1000000.31,401,1),
('wupeiqi','male',81,'20130305','teacher',8300,401,1),
('yuanhao','male',73,'20140701','teacher',3500,401,1),
('liwenzhou','male',28,'20121101','teacher',2100,401,1),
('jingliyang','female',18,'20110211','teacher',9000,401,1),
('jinxin','male',18,'19000301','teacher',30000,401,1),
('成龙','male',48,'20101111','teacher',10000,401,1),
('歪歪','female',48,'20150311','sale',3000.13,402,2),#以下是销售部门
('丫丫','female',38,'20101101','sale',2000.35,402,2),
('丁丁','female',18,'20110312','sale',1000.37,402,2),
('星星','female',18,'20160513','sale',3000.29,402,2),
('格格','female',28,'20170127','sale',4000.33,402,2),
('张野','male',28,'20160311','operation',10000.13,403,3), #以下是运营部门
('程咬金','male',18,'19970312','operation',20000,403,3),
('程咬银','female',18,'20130311','operation',19000,403,3),
('程咬铜','male',18,'20150411','operation',18000,403,3),
('程咬铁','female',18,'20140512','operation',17000,403,3)
;
select post,max(hire_date) from employee group by post;
mysql> select post,max(hire_date) from employee
->
-> group by post;
+--------------------------------+----------------+
| post | max(hire_date) |
+--------------------------------+----------------+
| operation | 2016-03-11 |
| sale | 2017-01-27 |
| teacher | 2015-03-02 |
| 驻沙河办事处外交大使 | 2017-03-01 |
+--------------------------------+----------------+
这是不是每个部门最新入职的员工,但是,你虽然看到了这张表,但事实上是虚拟的,是不存在的
所以我们为了让它存在的“合法化”,我们应该将其整体重新命名:
(select post,max(hire_date) from employee group by post) as t2;
我们要的是那名员工!
我们新建的表是不是有职位和日期!employee是不是也有部门和日期,这是不是就联系上了?
那我们就可以通过这个联系来找到这些员工
select * from employee as t1 inner join
(select post,max(hire_date) hire_date from employee group by post) as t2 on
t1.post = t2.post and t1.hire_date=t2.hire_date;
mysql> select * from employee as t1 inner join
->
-> (select post,max(hire_date) hire_date from employee group by post) as t2 on
->
-> t1.post = t2.post and t1.hire_date=t2.hire_date;
+----+--------+--------+-----+------------+--------------------------------+--------------+------------+--------+-----------+--------------------------------+------------+
| id | name | sex | age | hire_date | post | post_comment | salary | office | depart_id | post | hire_date |
+----+--------+--------+-----+------------+--------------------------------+--------------+------------+--------+-----------+--------------------------------+------------+
| 1 | egon | male | 18 | 2017-03-01 | 驻沙河办事处外交大使 | NULL | 7300.33 | 401 | 1 | 驻沙河办事处外交大使 | 2017-03-01 |
| 2 | alex | male | 78 | 2015-03-02 | teacher | NULL | 1000000.31 | 401 | 1 | teacher | 2015-03-02 |
| 13 | 格格 | female | 28 | 2017-01-27 | sale | NULL | 4000.33 | 402 | 2 | sale | 2017-01-27 |
| 14 | 张野 | male | 28 | 2016-03-11 | operation | NULL | 10000.13 | 403 | 3 | operation | 2016-03-11 |
+----+--------+--------+-----+------------+--------------------------------+--------------+------------+--------+-----------+--------------------------------+------------+
4 rows in set (0.00 sec)
当然也可以用where来做,你自己尝试做一下
select * from employee as t1 inner join
(select post,max(hire_date) hire_date from employee group by post) as t2 on
t1.post = t2.post
where t1.hire_date=t2.hire_date;