37. SQL--self join:自连接
1. 前言
self join 用于将一个表和自身连接,就好像存在两个表一样。为了区分两个表,在 sql 语句中需要至少重命名一个表。
自连接通常用于将表的某个字段与该表的同一字段的其它值进行比较。
2. 语法
self join 的基本语法如下:
select a.column1, b.column1... from table1 as a, table1 as b where a.common_column < b.common_column;
您看,sql 并没有 self join 关键字,而是使用 where 子句来达到自连接的目的。
3. 示例
自连接的语法比较简单,但是结果往往不是那么容易理解,让我们通过下面的例子来说明一下。
示例1
现在有如下所示的客户表,名字为 customers:
+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 6 | Komal | 22 | MP | 4500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | +----+----------+-----+-----------+----------+
现在,让我们使用 SELF JOIN 来比较 SALARY 字段,找到比某个客户工资低的其它所有客户,如下所示:
sql> select a.id, a.name, a.salary, b.name as this_name, b.salary as this_salary from customer as a, customer as b where a.salary < b.salary;
执行结果如下:
+----+----------+--------+-----------+-------------+ | ID | NAME | SALARY | THIS_NAME | THIS_SALARY | +----+----------+--------+-----------+-------------+ | 2 | Khilan | 1500 | Ramesh | 2000 | | 2 | Khilan | 1500 | Kaushik | 2000 | | 1 | Ramesh | 2000 | Chaitali | 6500 | | 2 | Khilan | 1500 | Chaitali | 6500 | | 3 | Kaushik | 2000 | Chaitali | 6500 | | 6 | Komal | 4500 | Chaitali | 6500 | | 1 | Ramesh | 2000 | Hardik | 8500 | | 2 | Khilan | 1500 | Hardik | 8500 | | 3 | Kaushik | 2000 | Hardik | 8500 | | 4 | Chaitali | 6500 | Hardik | 8500 | | 6 | Komal | 4500 | Hardik | 8500 | | 1 | Ramesh | 2000 | Komal | 4500 | | 2 | Khilan | 1500 | Komal | 4500 | | 3 | Kaushik | 2000 | Komal | 4500 | | 1 | Ramesh | 2000 | Muffy | 10000 | | 2 | Khilan | 1500 | Muffy | 10000 | | 3 | Kaushik | 2000 | Muffy | 10000 | | 4 | Chaitali | 6500 | Muffy | 10000 | | 5 | Hardik | 8500 | Muffy | 10000 | | 6 | Komal | 4500 | Muffy | 10000 | +----+----------+--------+-----------+-------------+
从执行结果可以发现,SELF JOIN 以右表为主,它先将左表中的每一行与右表中的第一行进行比较,然后再将左表中的第一行与右表中的第二行进行比较,以此类推,直到右表的最后一行。
示例2
现在有如下所示的员工表,名字为 employee:
+----+----------+-----+--------+------------+ | id | name | sex | salary | manager_id | +----+----------+-----+--------+------------+ | 1 | joe | m | 7000 | 3 | | 2 | henry | m | 8000 | 4 | | 3 | sam | f | 6000 | NULL | | 4 | max | m | 9000 | NULL | | 5 | ramesh | m | 7500 | 3 | | 6 | khilan | f | 11500 | 4 | | 7 | Chaitali | m | 8600 | 4 | | 8 | Komal | f | 7800 | 4 | +----+----------+-----+--------+------------+
现在,让我们使用 SELF JOIN 找出工资超过各自经理的员工姓名:
select e1.name as employee_name, e1.salary, e2.name as manager_name, e2.salary from employee as e1, employee as e2 where e1.manager_id = e2.id and e1.salary > e2.salary;
运行结果:
+---------------+--------+--------------+--------+ | employee_name | salary | manager_name | salary | +---------------+--------+--------------+--------+ | joe | 7000 | sam | 6000 | | ramesh | 7500 | sam | 6000 | | khilan | 11500 | max | 9000 | +---------------+--------+--------------+--------+
示例3
现在有如下所示的用户表,名字为 user:
+----+------------+-----------------+--------------------------+-------------+-----------------+ | id | username | password | email | cellphone | ip | +----+------------+-----------------+--------------------------+-------------+-----------------+ | 1 | mozhiyan | 123456 | java@biancheng.net | 13112344444 | 121.17.25.194 | | 2 | fanhua | xyzabc | cplusplus@biancheng.net | 15028882233 | 117.25.156.179 | | 3 | greatman | 123abc | python@biancheng.net | 13290029028 | 125.64.104.35 | | 4 | xiaogun123 | hsdjs23ZX | c@biancheng.net | 19036895933 | 101.200.188.230 | | 5 | my_code | 19951206 | php@biancheng.net | 19234561234 | 103.133.176.211 | | 6 | guxiaonuan | xy232323 | javascript@biancheng.net | 13409873222 | 113.96.109.117 | | 7 | coder | 20200304shengri | python@biancheng.net | 15645990222 | 101.37.97.51 | | 8 | happy_gril | sd@@sd | c@biancheng.net | 19090903636 | 220.181.38.148 | +----+------------+-----------------+--------------------------+-------------+-----------------+
让我们使用 self join 删除表中 email 重复的用户,只保留一个,如下所示:
delete u1 from user as u1, user as u2 where u1.email = u2.email and u1.id > u2.id;
执行完该语句,user 表的数据如下:
+----+------------+-----------+--------------------------+-------------+-----------------+ | id | username | password | email | cellphone | ip | +----+------------+-----------+--------------------------+-------------+-----------------+ | 1 | mozhiyan | 123456 | java@biancheng.net | 13112344444 | 121.17.25.194 | | 2 | fanhua | xyzabc | cplusplus@biancheng.net | 15028882233 | 117.25.156.179 | | 3 | greatman | 123abc | python@biancheng.net | 13290029028 | 125.64.104.35 | | 4 | xiaogun123 | hsdjs23ZX | c@biancheng.net | 19036895933 | 101.200.188.230 | | 5 | my_code | 19951206 | php@biancheng.net | 19234561234 | 103.133.176.211 | | 6 | guxiaonuan | xy232323 | javascript@biancheng.net | 13409873222 | 113.96.109.117 | +----+------------+-----------+--------------------------+-------------+-----------------+
你看,最后两个用户被删除。