SQL JOIN 的常见用法-MySQL、Hive、Spark、Presto

无论是关系型数据库,还是SQL on Hadoop类的大数据技术组件,都有SQL JOIN功能,join大致分为内连接(inner join)、左外连接(left outer join)、右外连接(right outer join)、全外连接(full outer join)。

image-20210202110311961

笛卡尔积

要理解各种JOIN首先要理解笛卡尔积。笛卡尔积就是将A表的每一条记录与B表的每一条记录进行关联。所以,如果A表有n条记录,B表有m条记录,笛卡尔积产生的结果就会产生n*m条记录。比如,如果t_1表记录3条,B表记录3条,笛卡尔积结果就是9条。而JOIN的结果,可以简单理解为对笛卡尔积结果的再处理。

笛卡尔积结果

mysql> select * from t1;
+----+----------+
| id | name     |
+----+----------+
|  1 | zhangsan |
|  2 | lisi     |
|  4 | wangwu   |
+----+----------+
mysql> select * from t2;
+----+--------+
| id | gender |
+----+--------+
|  1 | men    |
|  2 | women  |
|  3 | men    |
+----+--------+
mysql> select * from t1,t2 order by t1.id;
+----+----------+----+--------+
| id | name     | id | gender |
+----+----------+----+--------+
|  1 | zhangsan |  1 | men    |
|  1 | zhangsan |  2 | women  |
|  1 | zhangsan |  3 | men    |
|  2 | lisi     |  1 | men    |
|  2 | lisi     |  2 | women  |
|  2 | lisi     |  3 | men    |
|  4 | wangwu   |  2 | women  |
|  4 | wangwu   |  3 | men    |
|  4 | wangwu   |  1 | men    |
+----+----------+----+--------+

内连接:INNER JOIN

内连接INNER JOIN是最常用的连接操作。从数学的角度讲就是求两个表的交集,从笛卡尔积的角度讲就是从笛卡尔积结果中挑选出“t1.id =t2.id”的结果。

内连接结果

mysql> select * from t1;
+----+----------+
| id | name     |
+----+----------+
|  1 | zhangsan |
|  2 | lisi     |
|  4 | wangwu   |
+----+----------+
mysql> select * from t2;
+----+--------+
| id | gender |
+----+--------+
|  1 | men    |
|  2 | women  |
|  3 | men    |
+----+--------+
mysql> select *  from t1 inner join t2 on t1.id =t2.id order by t1.id;
+----+----------+----+--------+
| id | name     | id | gender |
+----+----------+----+--------+
|  1 | zhangsan |  1 | men    |
|  2 | lisi     |  2 | women  |
+----+----------+----+--------+

左外连接:LEFT OUTER JOIN

左外连接LEFT JOUTER OIN的含义就是求两个表的交集外加左表剩下的数据。从笛卡尔积的角度讲,就是先从笛卡尔积中挑出子句条件“t1.id =t2.id”成立的记录,然后加上左表中剩余的记录。

mysql> select * from t1;
+----+----------+
| id | name     |
+----+----------+
|  1 | zhangsan |
|  2 | lisi     |
|  4 | wangwu   |
+----+----------+
mysql> select * from t2;
+----+--------+
| id | gender |
+----+--------+
|  1 | men    |
|  2 | women  |
|  3 | men    |
+----+--------+
mysql> select *  from t1 left outer join t2 on t1.id =t2.id order by t1.id;
+----+----------+------+--------+
| id | name     | id   | gender |
+----+----------+------+--------+
|  1 | zhangsan |    1 | men    |
|  2 | lisi     |    2 | women  |
|  4 | wangwu   | NULL | NULL   |
+----+----------+------+--------+

右外连接:RIGHT OUTER JOIN

同理右外连接RIGHT OUTER JOIN就是求两个表的交集外加右表剩下的数据。再次从笛卡尔积的角度描述,右连接就是从笛卡尔积中挑出子句条件“t1.id =t2.id”成立的记录,然后加上右表中剩余的记录。

mysql> select * from t1;
+----+----------+
| id | name     |
+----+----------+
|  1 | zhangsan |
|  2 | lisi     |
|  4 | wangwu   |
+----+----------+
mysql> select * from t2;
+----+--------+
| id | gender |
+----+--------+
|  1 | men    |
|  2 | women  |
|  3 | men    |
+----+--------+
mysql> select *  from t1 right outer join t2 on t1.id =t2.id order by t1.id;
+------+----------+----+--------+
| id   | name     | id | gender |
+------+----------+----+--------+
| NULL | NULL     |  3 | men    |
|    1 | zhangsan |  1 | men    |
|    2 | lisi     |  2 | women  |
+------+----------+----+--------+

全外连接:FULL OUTER JOIN

全外连接就是求两个集合的并集。从笛卡尔积的角度讲就是从笛卡尔积中挑出子句条件“t1.id =t2.id”成立的记录,然后加上左表中剩余的记录,最后加上右表中剩余的记录。另外MySQL不支持OUTER JOIN,但是我们可以对左连接和右连接的结果做UNION操作来实现。

mysql> select * from t1;
+----+----------+
| id | name     |
+----+----------+
|  1 | zhangsan |
|  2 | lisi     |
|  4 | wangwu   |
+----+----------+
mysql> select * from t2;
+----+--------+
| id | gender |
+----+--------+
|  1 | men    |
|  2 | women  |
|  3 | men    |
+----+--------+
mysql> select * from t1 right outer join t2 on t1.id = t2.id  union (select * from t1 left outer join t2 on t1.id = t2.id);
+------+----------+------+--------+
| id   | name     | id   | gender |
+------+----------+------+--------+
|    1 | zhangsan |    1 | men    |
|    2 | lisi     |    2 | women  |
| NULL | NULL     |    3 | men    |
|    4 | wangwu   | NULL | NULL   |
+------+----------+------+--------+
posted @ 2021-02-02 21:14  数大招疯-公众号同名  阅读(522)  评论(1编辑  收藏  举报