SQL: UNION Operator
This SQL tutorial explains how to use the SQL UNION operator with syntax and examples.
Description
The SQL UNION operator combines the result sets of 2 or more SELECT statements into one result set and removes duplicate rows from the combined result.
Each SELECT statement within the UNION must return the same number of fields, and the corresponding fields must have similar data types.
What is the difference between UNION and UNION ALL?
- UNION removes duplicate rows.
- UNION ALL does not remove duplicate rows.
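The difference can be checked with a small experiment. The sketch below uses Python's built-in sqlite3 module and two hypothetical one-column tables, a and b, that are not part of this tutorial; other databases should behave the same way for this query.

```python
import sqlite3

# Hypothetical one-column tables, used only to contrast the two operators.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (n INTEGER);
    CREATE TABLE b (n INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (2), (3);
""")

# UNION keeps one copy of the value 2, which appears in both tables.
union_rows = conn.execute(
    "SELECT n FROM a UNION SELECT n FROM b ORDER BY n"
).fetchall()

# UNION ALL keeps every row from both SELECT statements.
union_all_rows = conn.execute(
    "SELECT n FROM a UNION ALL SELECT n FROM b ORDER BY n"
).fetchall()

print(union_rows)      # [(1,), (2,), (3,)]
print(union_all_rows)  # [(1,), (2,), (2,), (3,)]
```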
Syntax
The syntax for the UNION operator in SQL is:
SELECT expression1, expression2, ... expression_n
FROM tables
[WHERE conditions]
UNION
SELECT expression1, expression2, ... expression_n
FROM tables
[WHERE conditions];
Parameters or Arguments
- expression1, expression2, expression_n
- The columns or calculations that you wish to retrieve.
- tables
- The tables that you wish to retrieve records from. There must be at least one table listed in the FROM clause.
- WHERE conditions
- Optional. The conditions that must be met for the records to be selected.
Note
- There must be the same number of expressions in both SELECT statements.
- The corresponding expressions must have the same data type in the SELECT statements. For example: expression1 must be the same data type in both the first and second SELECT statements.
- See also the UNION ALL operator.
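To see the first rule enforced, the sketch below runs a UNION whose two SELECT statements return different numbers of expressions. It uses Python's built-in sqlite3 module with empty suppliers and orders tables; the exact error text varies by database. Note that SQLite is loosely typed and does not enforce the data type rule, so only the column count check is shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER, supplier_name TEXT);
    CREATE TABLE orders (order_id INTEGER, order_date TEXT, supplier_id INTEGER);
""")

try:
    # Two expressions in the first SELECT, only one in the second.
    conn.execute(
        "SELECT supplier_id, supplier_name FROM suppliers "
        "UNION SELECT supplier_id FROM orders"
    )
    mismatch_rejected = False
except sqlite3.OperationalError as exc:
    # SQLite refuses to compile the statement at all.
    mismatch_rejected = True
    print("rejected:", exc)
```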
Example - Single Field With Same Name
Let's look at a SQL UNION query that returns one field. In this simple example, the field in both SELECT statements will have the same name and data type.
For example:
SELECT supplier_id FROM suppliers UNION SELECT supplier_id FROM orders ORDER BY supplier_id;
In this SQL UNION operator example, if a supplier_id appeared in both the suppliers and orders tables, it would appear once in your result set. The UNION operator removes duplicates. If you do not wish to remove duplicates, try using the UNION ALL operator.
Now, let's explore this example further with some data.
If you had the suppliers table populated with the following records:
supplier_id | supplier_name |
---|---|
1000 | Microsoft |
2000 | Oracle |
3000 | Apple |
4000 | Samsung |
And the orders table populated with the following records:
order_id | order_date | supplier_id |
---|---|---|
1 | 2015-08-01 | 2000 |
2 | 2015-08-01 | 6000 |
3 | 2015-08-02 | 7000 |
4 | 2015-08-03 | 8000 |
And you executed the following UNION statement:
SELECT supplier_id FROM suppliers UNION SELECT supplier_id FROM orders ORDER BY supplier_id;
You would get the following results:
supplier_id |
---|
1000 |
2000 |
3000 |
4000 |
6000 |
7000 |
8000 |
As you can see in this example, the UNION has taken all supplier_id values from both the suppliers table and the orders table and returned a combined result set. Because the UNION operator removed duplicates between the result sets, the supplier_id of 2000 appears only once, even though it is found in both the suppliers and orders tables. If you do not wish to remove duplicates, try using the UNION ALL operator instead.
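The walkthrough above can be reproduced as a minimal sketch with Python's built-in sqlite3 module. The column types (INTEGER, TEXT) are assumptions, since the tutorial does not define the table schemas.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column types are assumed; the tutorial only lists the data.
    CREATE TABLE suppliers (supplier_id INTEGER, supplier_name TEXT);
    CREATE TABLE orders (order_id INTEGER, order_date TEXT, supplier_id INTEGER);
    INSERT INTO suppliers VALUES
        (1000, 'Microsoft'), (2000, 'Oracle'), (3000, 'Apple'), (4000, 'Samsung');
    INSERT INTO orders VALUES
        (1, '2015-08-01', 2000), (2, '2015-08-01', 6000),
        (3, '2015-08-02', 7000), (4, '2015-08-03', 8000);
""")

cursor = conn.execute("""
    SELECT supplier_id FROM suppliers
    UNION
    SELECT supplier_id FROM orders
    ORDER BY supplier_id
""")
supplier_ids = [row[0] for row in cursor]

# 2000 is in both tables but is listed once.
print(supplier_ids)  # [1000, 2000, 3000, 4000, 6000, 7000, 8000]
```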
Example - Different Field Names
The corresponding columns in each SELECT statement do not need to have the same name, but they do need to have the same data types.
When the column names differ between the SELECT statements, it gets a bit tricky, especially when you want to order the results of the query using the ORDER BY clause.
Let's look at how to use the UNION operator with different column names and order the query results.
For example:
SELECT supplier_id, supplier_name FROM suppliers WHERE supplier_id > 2000 UNION SELECT company_id, company_name FROM companies WHERE company_id > 1000 ORDER BY 1;
In this SQL UNION example, since the column names are different between the two SELECT statements, it is more advantageous to reference the columns in the ORDER BY clause by their position in the result set. In this example, we've sorted the results by supplier_id / company_id in ascending order, as denoted by the ORDER BY 1. The supplier_id / company_id fields are in position #1 in the result set.
Now, let's explore this example further with data.
If you had the suppliers table populated with the following records:
supplier_id | supplier_name |
---|---|
1000 | Microsoft |
2000 | Oracle |
3000 | Apple |
4000 | Samsung |
And the companies table populated with the following records:
company_id | company_name |
---|---|
1000 | Microsoft |
3000 | Apple |
7000 | Sony |
8000 | IBM |
And you executed the following UNION statement:
SELECT supplier_id, supplier_name FROM suppliers WHERE supplier_id > 2000 UNION SELECT company_id, company_name FROM companies WHERE company_id > 1000 ORDER BY 1;
You would get the following results:
supplier_id | supplier_name |
---|---|
3000 | Apple |
4000 | Samsung |
7000 | Sony |
8000 | IBM |
First, notice that the record with supplier_id of 3000 only appears once in the result set because the UNION query removed duplicate entries.
Second, notice that the column headings in the result set are called supplier_id and supplier_name. This is because these were the column names used in the first SELECT statement in the UNION.
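Both observations can be checked with a short sketch using Python's built-in sqlite3 module. The column types are assumptions, and the way column headings are reported (here through cursor.description) is specific to the Python driver.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column types are assumed; the tutorial only lists the data.
    CREATE TABLE suppliers (supplier_id INTEGER, supplier_name TEXT);
    CREATE TABLE companies (company_id INTEGER, company_name TEXT);
    INSERT INTO suppliers VALUES
        (1000, 'Microsoft'), (2000, 'Oracle'), (3000, 'Apple'), (4000, 'Samsung');
    INSERT INTO companies VALUES
        (1000, 'Microsoft'), (3000, 'Apple'), (7000, 'Sony'), (8000, 'IBM');
""")

cursor = conn.execute("""
    SELECT supplier_id, supplier_name FROM suppliers WHERE supplier_id > 2000
    UNION
    SELECT company_id, company_name FROM companies WHERE company_id > 1000
    ORDER BY 1
""")
headings = [column[0] for column in cursor.description]
rows = cursor.fetchall()

# Headings come from the first SELECT; (3000, 'Apple') is listed once.
print(headings)  # ['supplier_id', 'supplier_name']
print(rows)      # [(3000, 'Apple'), (4000, 'Samsung'), (7000, 'Sony'), (8000, 'IBM')]
```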
If you had wanted to, you could have aliased the columns as follows:
SELECT supplier_id AS ID_Value, supplier_name AS Name_Value FROM suppliers WHERE supplier_id > 2000 UNION SELECT company_id AS ID_Value, company_name AS Name_Value FROM companies WHERE company_id > 1000 ORDER BY 1;
Now the column headings in the result will be aliased as ID_Value for the first column and Name_Value for the second column.
ID_Value | Name_Value |
---|---|
3000 | Apple |
4000 | Samsung |
7000 | Sony |
8000 | IBM |
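One possible benefit of the aliases is that the ORDER BY clause can refer to a column by name instead of by position. The sketch below tries this in SQLite through Python's built-in sqlite3 module; ORDER BY ID_Value is a variation on the tutorial's ORDER BY 1, and whether an alias is accepted there may depend on your database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column types are assumed; the tutorial only lists the data.
    CREATE TABLE suppliers (supplier_id INTEGER, supplier_name TEXT);
    CREATE TABLE companies (company_id INTEGER, company_name TEXT);
    INSERT INTO suppliers VALUES
        (1000, 'Microsoft'), (2000, 'Oracle'), (3000, 'Apple'), (4000, 'Samsung');
    INSERT INTO companies VALUES
        (1000, 'Microsoft'), (3000, 'Apple'), (7000, 'Sony'), (8000, 'IBM');
""")

cursor = conn.execute("""
    SELECT supplier_id AS ID_Value, supplier_name AS Name_Value
    FROM suppliers WHERE supplier_id > 2000
    UNION
    SELECT company_id AS ID_Value, company_name AS Name_Value
    FROM companies WHERE company_id > 1000
    ORDER BY ID_Value  -- alias used in place of the position number
""")
aliased_headings = [column[0] for column in cursor.description]
aliased_rows = cursor.fetchall()

print(aliased_headings)  # ['ID_Value', 'Name_Value']
print(aliased_rows)      # [(3000, 'Apple'), (4000, 'Samsung'), (7000, 'Sony'), (8000, 'IBM')]
```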
Frequently Asked Questions
Question: I need to compare two dates and return the count of a field based on the date values. For example, I have a date field in a table called last_updated_date. I have to check if trunc(last_updated_date) >= trunc(sysdate-13).
Answer: Since you are using the COUNT function, which is an aggregate function, we'd recommend using the Oracle UNION operator. For example, you could try the following:
SELECT a.code AS Code, a.name AS Name, COUNT(b.Ncode)
FROM cdmaster a, nmmaster b
WHERE a.code = b.code
AND a.status = 1
AND b.status = 1
AND b.Ncode <> 'a10'
AND TRUNC(last_updated_date) <= TRUNC(sysdate-13)
GROUP BY a.code, a.name
UNION
SELECT a.code AS Code, a.name AS Name, COUNT(b.Ncode)
FROM cdmaster a, nmmaster b
WHERE a.code = b.code
AND a.status = 1
AND b.status = 1
AND b.Ncode <> 'a10'
AND TRUNC(last_updated_date) > TRUNC(sysdate-13)
GROUP BY a.code, a.name;
The Oracle UNION allows you to perform a count based on one set of criteria:
TRUNC(last_updated_date) <= TRUNC(sysdate-13)
and a second count based on another set of criteria:
TRUNC(last_updated_date) > TRUNC(sysdate-13)
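One thing to watch for with this approach: because UNION removes duplicate rows, a code whose two counts happen to be equal would come back as a single row. The sketch below illustrates this with Python's built-in sqlite3 module rather than Oracle. It is a simplified stand-in, not the query from the answer: the status and Ncode filters are dropped, the sample data is invented, a fixed cutoff date replaces TRUNC(sysdate-13), and the Period label column is a hypothetical addition that keeps the two halves distinct.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, assumed schema; dates are stored as ISO text.
    CREATE TABLE cdmaster (code TEXT, name TEXT);
    CREATE TABLE nmmaster (code TEXT, Ncode TEXT, last_updated_date TEXT);
    INSERT INTO cdmaster VALUES ('A', 'Alpha');
    INSERT INTO nmmaster VALUES
        ('A', 'n1', '2015-01-01'), ('A', 'n2', '2015-01-02'),  -- older rows
        ('A', 'n3', '2015-03-01'), ('A', 'n4', '2015-03-02');  -- recent rows
""")
params = {"cutoff": "2015-02-01"}  # stands in for TRUNC(sysdate-13)

# Both halves yield ('A', 'Alpha', 2), so UNION merges them into one row.
plain_rows = conn.execute("""
    SELECT a.code, a.name, COUNT(b.Ncode) FROM cdmaster a, nmmaster b
    WHERE a.code = b.code AND b.last_updated_date <= :cutoff
    GROUP BY a.code, a.name
    UNION
    SELECT a.code, a.name, COUNT(b.Ncode) FROM cdmaster a, nmmaster b
    WHERE a.code = b.code AND b.last_updated_date > :cutoff
    GROUP BY a.code, a.name
""", params).fetchall()

# A constant label column makes the two halves differ, so both rows survive.
labeled_rows = conn.execute("""
    SELECT a.code, a.name, 'older' AS Period, COUNT(b.Ncode)
    FROM cdmaster a, nmmaster b
    WHERE a.code = b.code AND b.last_updated_date <= :cutoff
    GROUP BY a.code, a.name
    UNION
    SELECT a.code, a.name, 'recent' AS Period, COUNT(b.Ncode)
    FROM cdmaster a, nmmaster b
    WHERE a.code = b.code AND b.last_updated_date > :cutoff
    GROUP BY a.code, a.name
    ORDER BY 1, 3
""", params).fetchall()

print(plain_rows)    # [('A', 'Alpha', 2)]
print(labeled_rows)  # [('A', 'Alpha', 'older', 2), ('A', 'Alpha', 'recent', 2)]
```

Using UNION ALL instead of UNION would also keep both rows, though without a label there would be no way to tell which count belongs to which date range.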