Adding Column to a SQL Server Table
Adding a column to a table is common task for DBAs. You can add a column to a table which is a nullable column or which has default values. But are these two operations are similar internally and which method is optimal?
Let us start this with an example.
I created a database and a table using following script:
USE master
Go
--Drop Database if exists
IF EXISTS (SELECT 1 FROM SYS.databases
WHERE name = 'AddColumn') DROP DATABASE AddColumn --Create the database CREATE DATABASE AddColumn GO
USE AddColumn
GO
--Drop the table if exists
IF EXISTS (
SELECT 1 FROM sys.tables WHERE Name = 'ExistingTable')
DROP TABLE ExistingTable
GO
--Create the table
CREATE TABLE ExistingTable
(ID BIGINT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
DateTime1 DATETIME DEFAULT GETDATE(),
DateTime2 DATETIME DEFAULT GETDATE(),
DateTime3 DATETIME DEFAULT GETDATE(),
DateTime4 DATETIME DEFAULT GETDATE(),
Gendar CHAR(1) DEFAULT 'M',
STATUS1 CHAR(1) DEFAULT 'Y'
)
GO
-- Insert 100,000 records with defaults records
INSERT INTO ExistingTable
DEFAULT VALUES
GO 100000
Before adding a Column
Before adding a column let us look at some of the details of the database
DBCC
IND (AddColumn,ExistingTable,1)
By running the above query, you will see 637 pages for the created table.
Adding a Column
You can add a column to the table with following statement.
ALTER
TABLE ExistingTable Add NewColumn INT NULL
Above will add a column with a null value for the existing records.
Alternatively you could add a column with default values.
ALTER TABLE ExistingTable Add
NewColumn INT NOT NULL DEFAULT 1
The above statement will add a column with a 1 value to the existing records.
In the below table I measured the performance difference between above two statements.
Parameter |
Nullable Column |
Default Value |
CPU |
31 |
702 |
Duration |
129 ms |
6653 ms |
Reads |
38 |
116,397 |
Writes |
6 |
1329 |
Row Count |
0 |
100000 |
If you look at the RowCount parameter, you can clearly see the difference. Though column is added in the first case, none of the rows are affected while in the second case all the rows are updated. That is the reason, why it has taken more duration and CPU to add column with Default value.
We can verify this by several methods.
Number of Pages
The number of data pages can be obtained by using DBCC IND command. Though, this an undocumented dbcc command, many experts are ok to use this command in production. However, since there is no official word from Microsoft, use this “at your own risk”.
DBCC IND (AddColumn,ExistingTable,1)
Before Adding the Columns |
637 |
Adding a Column with NULL |
637 |
Adding a column with DEFAULT value |
1270 |
This clearly shows that pages are physically modified. Please note, a high value indicated in the Adding a column with DEFAULT value column is also a result of page splits.
DBCC Page
The next thing you could do is, examine the data pages. For this as in the previous case, the DBCC PAGE is an undocumented DBCC command.
DBCC TRACEON(3604) DBCC PAGE (AddColumn, 1, 154, 3);
You need to switch on trace switch 3604 for DBCC Page command.
If you analyze page number 154 as shown in the example above, you will see the newly added column.
What does this mean? Is to say that even a Nullable column will modify the data pages? If that is the case, there shouldn’t be much different from adding a nullable column or adding a column with a default value, which is not what we saw in the first table.
Let us analyse lsn number of the page. Lsn is the Log Sequence Number of the last log record that changed the page.
You will see that there is no change to the lsn before and after adding the nullable column while there is a change to the lsn value, if you add a column with a default value.
This tells us two things.
1. This confirms that there are no physical changes to the data pages when you are adding a nullable column.
2. DBCC PAGE is not simply showing data dump of the data page.
Conclusion
When adding a column to a large table, the best practice would be adding a nullable column. However, if you want to update existing rows, consider using the Update statement after adding a nullable column.
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 如何编写易于单元测试的代码
· 10年+ .NET Coder 心语,封装的思维:从隐藏、稳定开始理解其本质意义
· .NET Core 中如何实现缓存的预热?
· 从 HTTP 原因短语缺失研究 HTTP/2 和 HTTP/3 的设计差异
· AI与.NET技术实操系列:向量存储与相似性搜索在 .NET 中的实现
· 周边上新:园子的第一款马克杯温暖上架
· Open-Sora 2.0 重磅开源!
· .NET周刊【3月第1期 2025-03-02】
· 分享 3 个 .NET 开源的文件压缩处理库,助力快速实现文件压缩解压功能!
· Ollama——大语言模型本地部署的极速利器