SQLSERVER 交、并、差集，In,Not In,Exist,Not Exist的区别，四种连接（转载）

原文链接：https://blog.csdn.net/weixin_46879188/article/details/119652042

一、SQLServer 交(intersect)、并(union)、差(except)集合运算
SQLServer中通过intersect,union,except和三个关键字对应交、并、差三种集合运算。
他们的对应关系可以参考下面图示

前提条件:

两个查询的结果集的规则：
两个查询中列的数量和顺序必须相同。相应列的数据类型必须相同或兼容。列名可以不一样。

1 Union 取合集并过滤重复数据
SELECT * FROM A
UNION
SELECT * FROM B;

2 Union all 取合集不过滤重复数据
SELECT * FROM A
UNION all
SELECT * FROM B;

3 Intersect 取交集(两个表中都有数据)
SELECT * FROM A
Intersect
SELECT * FROM B;

4 except 取差集(取A-B的记录)
SELECT * FROM A
except
SELECT * FROM B;

二、SQLServer 四种连接
1、内联接（典型的联接运算，使用像 = 或 <> 之类的比较运算符）。包括相等联接和自然联接。
内联接使用比较运算符根据每个表共有的列的值匹配两个表中的行。例如，检索 students和courses表中学生标识号相同的所有行。

2、外联接。外联接可以是左向外联接、右向外联接或完整外部联接。
在 FROM子句中指定外联接时，可以由下列几组关键字中的一组指定：

1）LEFT JOIN或LEFT OUTER JOIN
左向外联接的结果集包括 LEFT OUTER子句中指定的左表的所有行，而不仅仅是联接列所匹配的行。如果左表的某行在右表中没有匹配行，则在相关联的结果集行中右表的所有选择列表列均为空值。

2）RIGHT JOIN 或 RIGHT OUTER JOIN
右向外联接是左向外联接的反向联接。将返回右表的所有行。如果右表的某行在左表中没有匹配行，则将为左表返回空值。

3）FULL JOIN 或 FULL OUTER JOIN
完整外部联接返回左表和右表中的所有行。当某行在另一个表中没有匹配行时，则另一个表的选择列表列包含空值。如果表之间有匹配行，则整个结果集行包含基表的数据值。

在数据库中使用连接有两种方式:使用join关键字,另一种是使用（+）这种方式
例子：
a表

id name
1 张3
2 李四
3 王武
b表

id job parent_id
1 23 1
2 34 2
3 34 4
a.id同parent_id 存在关系

1）内连接
select a.*,b.* from a inner join b on a.id=b.parent_id

或者是:

select a.*,b.* from a,b where a.id=b.parent_id

结果是

a.id a.name b.id b.job b.parent_id
1 张3 1 23 1
2 李四 2 34 2
2）左连接
select a.*,b.* from a left join b on a.id=b.parent_id

结果是

a.id a.name b.id b.job b.parent_id
1 张3 1 23 1
2 李四 2 34 2
3 王武 null null null
3）右连接
select a.*,b.* from a right join b on a.id=b.parent_id

结果是

a.id a.name b.id b.job b.parent_id
1 张3 1 23 1
2 李四 2 34 2
null null 3 34 4
4）完全连接
select a.*,b.* from a full join b on a.id=b.parent_id

结果是

a.id a.name b.id b.job b.parent_id
1 张3 1 23 1
2 李四 2 34 2
null null 3 34 4
3 王武 null null null

三、SQLServer IN和EXISTS、not in 和not exists的效率详解
in 和 exists

如果查询的两个表大小相当，那么用in和exists差别不大。

如果两个表中一个较小，一个是大表，则子查询表大的用exists，子查询表小的用in：

例如：表A（小表），表B（大表）
1：
select * from A where cc in (select cc from B)
效率低，用到了A表上cc列的索引；
select * from A where exists(select cc from B where cc=A.cc)
效率高，用到了B表上cc列的索引。

相反的
2：
select * from B where cc in (select cc from A)
效率高，用到了B表上cc列的索引；
select * from B where exists(select cc from A where cc=B.cc)
效率低，用到了A表上cc列的索引。

not in 和not exists

如果查询语句使用了not in 那么内外表都进行全表扫描，没有用到索引；
而not extsts 的子查询依然能用到表上的索引。
所以无论那个表大，用not exists都比not in要快。
也就是说，in和exists需要具体情况具体分析，not in和not exists就不用分析了，尽量用not exists就好了。

结论:
区别
① 集合是关联条件是Select中的字段。输出结果集是Select中的字段。
四种连接(not in,not Exist）是关联条件是Where中的字段。输出结果集是Select中字段。

联系

① 集合是对行的筛选，四种连接(not in,not Exist）也是对行的筛选。

②交集和内连接功能类似,求交集。

③并集union 和全连接功能类似,求并集。

④差集和(not in,not Exist）功能类似,求差集。