UNION or OR in SQL Server Queries
Recently I came across with an article on DB2 about using Union instead of OR. So I thought of carrying out a research on SQL Server on what scenarios UNION is optimal in and which scenarios OR would be best. I will analyze this with a few scenarios using samples taken from the AdventureWorks database Sales.SalesOrderDetail table.
Scenario 1: Selecting all columns
So we are going to select all columns and you have a non-clustered index on the ProductID column.
--Query 1 : OR SELECT * FROM Sales.SalesOrderDetail WHERE ProductID = 714 OR ProductID =709 OR ProductID =998 OR ProductID =875 OR ProductID =976 OR ProductID =874 --Query 2 : UNION SELECT * FROM Sales.SalesOrderDetail WHERE ProductID = 714 UNION SELECT * FROM Sales.SalesOrderDetail WHERE ProductID = 709 UNION SELECT * FROM Sales.SalesOrderDetail WHERE ProductID = 998 UNION SELECT * FROM Sales.SalesOrderDetail WHERE ProductID = 875 UNION SELECT * FROM Sales.SalesOrderDetail WHERE ProductID = 976 UNION SELECT * FROM Sales.SalesOrderDetail WHERE ProductID = 874
So query 1 is using OR and the later is using UNION. Let us analyze the execution plans for these queries.
Query 1
Query 2
As expected Query 1 will use Clustered Index Scan but Query 2, uses all sorts of things. In this case, since it is using multiple CPUs you might have CX_PACKET waits as well.
Let’s look at the profiler results for these two queries:
CPU |
Reads |
Duration |
Row Counts |
|
OR |
78 |
1252 |
389 |
3854 |
UNION |
250 |
7495 |
660 |
3854 |
Scenario 2: Non-Clustered and Clustered Index Columns only
--Query 1 : OR SELECT ProductID,SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail WHERE ProductID = 714 OR ProductID =709 OR ProductID =998 OR ProductID =875 OR ProductID =976 OR ProductID =874 GO --Query 2 : UNION SELECT ProductID,SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail WHERE ProductID = 714 UNION SELECT ProductID,SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail WHERE ProductID = 709 UNION SELECT ProductID,SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail WHERE ProductID = 998 UNION SELECT ProductID,SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail WHERE ProductID = 875 UNION SELECT ProductID,SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail WHERE ProductID = 976 UNION SELECT ProductID,SalesOrderID, SalesOrderDetailID FROM Sales.SalesOrderDetail WHERE ProductID = 874 GO
So this time, we will be selecting only index columns, which means these queries will avoid a data page lookup. As in the previous case we will analyze the execution plans:
Query 1
Query 2
Again, Query 2 is more complex than Query 1. Let us look at the profile analysis:
CPU |
Reads |
Duration |
Row Counts |
|
OR |
0 |
24 |
208 |
3854 |
UNION |
0 |
38 |
193 |
3854 |
Scenario 3: Selecting all columns for different fields
Up to now, we were using only one column (ProductID) in the where clause. What if we have two columns for where clauses and let us assume both are covered by non-clustered indexes?
--Query 1 : OR SELECT * FROM Sales.SalesOrderDetail WHERE ProductID = 714 OR CarrierTrackingNumber LIKE 'D0B8%' --Query 2 : UNION SELECT * FROM Sales.SalesOrderDetail WHERE ProductID = 714 UNION SELECT * FROM Sales.SalesOrderDetail WHERE CarrierTrackingNumber LIKE 'D0B8%'
Query 1
Query 2:
As we can see, the query plan for the second query has improved. Let us see the profiler results.
CPU |
Reads |
Duration |
Row Counts |
|
OR |
47 |
1278 |
443 |
1228 |
UNION |
31 |
1334 |
400 |
1228 |
Scenario 4: Selecting Clustered index columns for different fields
Now let us go only with clustered indexes:
--Query 1 : OR SELECT * FROM Sales.SalesOrderDetail WHERE ProductID = 714 OR CarrierTrackingNumber LIKE 'D0B8%' --Query 2 : UNION SELECT * FROM Sales.SalesOrderDetail WHERE ProductID = 714 UNION SELECT * FROM Sales.SalesOrderDetail WHERE CarrierTrackingNumber LIKE 'D0B8%'
Query 1
Query 2
Now both execution plans are almost identical except is an additional Stream Aggregate is used in the first query. This means UNION has advantage over OR in this scenario. Let us see profiler results for these queries again.
CPU |
Reads |
Duration |
Row Counts |
|
OR |
0 |
319 |
366 |
1228 |
UNION |
0 |
50 |
193 |
1228 |
Now see the differences, in this scenario UNION has somewhat of an advantage over OR.
Conclusion
Using UNION or OR depends on the scenario you are faced with. So you need to do your analyzing before selecting the appropriate method. Also, above the four scenarios are not all an exhaustive list of scenarios, I selected those for the broad description purposes only.
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 如何编写易于单元测试的代码
· 10年+ .NET Coder 心语,封装的思维:从隐藏、稳定开始理解其本质意义
· .NET Core 中如何实现缓存的预热?
· 从 HTTP 原因短语缺失研究 HTTP/2 和 HTTP/3 的设计差异
· AI与.NET技术实操系列:向量存储与相似性搜索在 .NET 中的实现
· 周边上新:园子的第一款马克杯温暖上架
· Open-Sora 2.0 重磅开源!
· .NET周刊【3月第1期 2025-03-02】
· 分享 3 个 .NET 开源的文件压缩处理库,助力快速实现文件压缩解压功能!
· Ollama——大语言模型本地部署的极速利器