Optimizing ALTER TABLE ADD COLUMN Performance for Large Tables in SQL Server


When working with SQL Server tables containing millions of records (like your 10M row table), even simple DDL operations can become performance nightmares. The standard approach:

ALTER TABLE T
ADD mycol BIT NOT NULL DEFAULT 0

becomes problematic because SQL Server needs to:

  • Update every existing row with the default value
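  • Log the entire change as a single transaction, which can grow the transaction log substantially
  • Hold a schema-modification (Sch-M) lock on the table for the duration, blocking other readers and writers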
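
To gauge how much work the statement implies before running it, a quick size check can help. The following is a minimal sketch that assumes the table is named T as in the example above; both commands are available on SQL Server 2000 and later.

```sql
-- Row count and space used by the table (figures can be approximate)
EXEC sp_spaceused 'T'

-- Size and percentage used of every database's transaction log
DBCC SQLPERF(LOGSPACE)
```

Running DBCC SQLPERF(LOGSPACE) again after the change shows how much log the operation actually consumed.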

For your immediate SQL Server 2000 situation, try this alternative approach:

-- Step 1: Add nullable column without default
ALTER TABLE T
ADD mycol BIT NULL

-- Step 2: Create constraint with default
ALTER TABLE T
ADD CONSTRAINT DF_T_mycol DEFAULT 0 FOR mycol

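-- Step 3: Update in batches (adjust batch size as needed)
-- SQL Server 2000 does not support UPDATE TOP (n); SET ROWCOUNT limits the batch instead
DECLARE @batch INT
SET @batch = 100000
SET ROWCOUNT @batch
WHILE EXISTS (SELECT 1 FROM T WHERE mycol IS NULL)
BEGIN
    UPDATE T
    SET mycol = 0
    WHERE mycol IS NULL
END
SET ROWCOUNT 0  -- restore the default (no row limit)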

-- Step 4: Convert to NOT NULL
ALTER TABLE T
ALTER COLUMN mycol BIT NOT NULL
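
Before and after the final step it can be worth confirming the state of the column. This is a small sketch, assuming the table and column names T and mycol used above; INFORMATION_SCHEMA.COLUMNS is available on SQL Server 2000 and later.

```sql
-- Before Step 4: this should return 0, otherwise ALTER COLUMN ... NOT NULL will fail
SELECT COUNT(*) AS remaining_nulls FROM T WHERE mycol IS NULL

-- After Step 4: IS_NULLABLE should show that NULLs are no longer allowed,
-- and COLUMN_DEFAULT should show the default added in Step 2
SELECT COLUMN_NAME, IS_NULLABLE, COLUMN_DEFAULT
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'T' AND COLUMN_NAME = 'mycol'
```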

For your upcoming SQL Server 2008 migration, you have additional options:

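Online Operations

ALTER TABLE ... ADD does not accept WITH (ONLINE = ON); that option applies to index operations and to constraints that build an index. SQL Server 2008 has no online variant of this statement, so the plain form still updates every existing row. Starting with SQL Server 2012 Enterprise Edition, adding a NOT NULL column with a constant default is a metadata-only change and needs no extra syntax:

ALTER TABLE T
ADD mycol BIT NOT NULL
CONSTRAINT DF_T_mycol DEFAULT 0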

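Minimally Logged Operations

The BULK_LOGGED recovery model reduces logging only for bulk operations such as SELECT INTO, BULK INSERT, and index builds. ALTER TABLE ... ADD and ordinary UPDATE statements remain fully logged, so switching the recovery model pays off only for a SELECT INTO based table copy, and only if you can give up point-in-time recovery for that window:

-- Set recovery model to bulk-logged temporarily
ALTER DATABASE YourDB SET RECOVERY BULK_LOGGED

-- Perform the bulk operation (here, a minimally logged copy with the new column)
SELECT *, CAST(0 AS BIT) AS mycol INTO T_new FROM T

-- Revert recovery model
ALTER DATABASE YourDB SET RECOVERY FULL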
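
After switching back to FULL, a log backup is normally taken so that point-in-time recovery is available again from that backup onward. A minimal sketch follows; the backup path is hypothetical and should be replaced with your own backup location.

```sql
-- Hypothetical path: substitute your own backup location
BACKUP LOG YourDB TO DISK = 'D:\Backups\YourDB_after_bulk.trn'
```

A log backup that contains bulk-logged changes can only be restored in full, not to a point in time inside it, which is why the window spent in BULK_LOGGED should be kept short.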

General recommendations:

  • Schedule during maintenance windows
  • Consider table partitioning for easier maintenance
  • Monitor transaction log and tempdb usage during large operations
  • For very large tables, consider creating a new table and migrating data

For extremely large tables where extended downtime is unacceptable:

-- Create new empty table with desired schema
-- (CAST makes mycol a BIT; a bare 0 would create an INT column)
SELECT *, CAST(0 AS BIT) AS mycol INTO T_new FROM T WHERE 1=0

-- Create constraints/indexes to match original

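-- Migrate data in batches (use ID ranges or other criteria)
-- If ID is an IDENTITY column, T_new inherits the property: wrap the inserts in
-- SET IDENTITY_INSERT T_new ON/OFF and list the columns explicitly
INSERT INTO T_new
SELECT *, 0
FROM T
WHERE ID BETWEEN 1 AND 100000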

-- Repeat until all data is migrated
-- Then rename tables (requires brief downtime)
EXEC sp_rename 'T', 'T_old'
EXEC sp_rename 'T_new', 'T'
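
The "repeat until all data is migrated" step above can be scripted as a loop that runs before the renames. The following is a sketch under stated assumptions: ID is an integer key on T, it is not an IDENTITY column (otherwise SET IDENTITY_INSERT and an explicit column list are needed), and the batch size of 100000 is only a starting point.

```sql
-- Copy T into T_new one ID range at a time
DECLARE @start INT, @max INT, @step INT
SET @step = 100000
SELECT @start = MIN(ID), @max = MAX(ID) FROM T

WHILE @start <= @max
BEGIN
    INSERT INTO T_new
    SELECT *, 0
    FROM T
    WHERE ID >= @start AND ID < @start + @step

    SET @start = @start + @step
END

-- Compare row counts before swapping the names
SELECT (SELECT COUNT(*) FROM T)     AS source_rows,
       (SELECT COUNT(*) FROM T_new) AS copied_rows
```

Rows inserted or changed in T while the loop runs are not picked up by this sketch, so writes to T need to be paused (or caught up separately) before the rename.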

The same problem applies on both SQL Server 2000 and 2008. On a table with millions of rows (like your 10M row table), the single-statement form

ALTER TABLE T
ADD mycol BIT NOT NULL DEFAULT 0

is resource-intensive for the reasons listed above: every existing row is updated, the whole change is logged as one transaction, and the table is locked while it runs. The options below look at each workaround in more detail.

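1. The Batched Update Approach

For SQL Server 2000, the most effective solution is to break the operation into three steps:

-- Step 1: Add nullable column with a default (metadata-only; existing rows stay NULL,
-- new rows get 0)
ALTER TABLE T
ADD mycol BIT NULL CONSTRAINT DF_T_mycol DEFAULT 0

-- Step 2: Update in batches (each batch commits separately, keeping transactions short)
-- SQL Server 2000 has no UPDATE TOP (n); SET ROWCOUNT caps the rows per statement
SET ROWCOUNT 10000
UPDATE T SET mycol = 0 WHERE mycol IS NULL
WHILE @@ROWCOUNT > 0
BEGIN
    UPDATE T SET mycol = 0 WHERE mycol IS NULL
END
SET ROWCOUNT 0  -- remove the row limit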

-- Step 3: Alter to NOT NULL (now all rows have values)
ALTER TABLE T
ALTER COLUMN mycol BIT NOT NULL
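
On SQL Server 2005 and later, including the 2008 target, UPDATE TOP (n) is available and the backfill loop can be written with a pause between batches to leave room for other workloads. This is a sketch, assuming the same T and mycol names; the one-second delay and the batch size are arbitrary starting values, not recommendations.

```sql
-- SQL Server 2005+ only: batched backfill with a short pause between batches
DECLARE @rows INT
SET @rows = 1
WHILE @rows > 0
BEGIN
    UPDATE TOP (10000) T SET mycol = 0 WHERE mycol IS NULL
    SET @rows = @@ROWCOUNT       -- capture immediately; later statements reset @@ROWCOUNT
    WAITFOR DELAY '00:00:01'     -- assumed pause; tune or remove
END
```

Capturing @@ROWCOUNT in a variable is what allows extra statements, such as the delay, inside the loop without breaking the exit condition.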

2. SQL Server 2008+ Improvements

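For your upcoming SQL Server 2008 operation, the single-statement form is valid, but be aware of what it does:

-- Valid in 2008, but still updates every existing row
ALTER TABLE T
ADD mycol BIT NOT NULL
CONSTRAINT DF_T_mycol DEFAULT 0 WITH VALUES

WITH VALUES does not make this a metadata-only operation. The clause matters only for nullable columns, where it stores the default in existing rows instead of NULL; for a NOT NULL column the default is always applied to existing rows, so WITH VALUES is redundant here. The metadata-only optimization for adding a NOT NULL column with a constant default arrived in SQL Server 2012 Enterprise Edition, so on 2008 the batched approach remains the lighter option.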
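
The effect of WITH VALUES is easiest to see on a nullable column. The following is a small self-contained sketch using a temporary table (#demo and its column names are made up for illustration); GO is the batch separator used by SSMS and sqlcmd.

```sql
CREATE TABLE #demo (id INT NOT NULL)
INSERT INTO #demo (id) VALUES (1)
GO

-- Nullable column, default without WITH VALUES: the existing row keeps NULL
ALTER TABLE #demo ADD a BIT NULL DEFAULT 0

-- Nullable column, default WITH VALUES: the existing row is given the default
ALTER TABLE #demo ADD b BIT NULL DEFAULT 0 WITH VALUES
GO

SELECT id, a, b FROM #demo   -- the existing row shows a = NULL, b = 0
DROP TABLE #demo
```

For a NOT NULL column there is no such choice: existing rows always receive the default, with or without the clause.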

Table Swap Strategy

For minimal downtime in production environments:

-- 1. Create new table with identical schema plus new column
SELECT *, CAST(0 AS BIT) AS mycol INTO T_new FROM T WHERE 1=0

-- 2. Recreate indexes, constraints on T_new
-- 3. Copy rows with batch inserts (ALTER TABLE ... SWITCH requires identical
--    column definitions, so it cannot move data into a table with an extra column)
-- 4. Rename tables (requires brief downtime)

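Online Operations (Enterprise Edition Only)

ALTER TABLE ... ADD has no ONLINE = ON option; that clause belongs to index operations. What Enterprise Edition offers, from SQL Server 2012 onward, is a metadata-only add of a NOT NULL column with a constant default, which needs no extra syntax:

ALTER TABLE T
ADD mycol BIT NOT NULL DEFAULT 0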
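
Because the available behavior depends on both version and edition, it helps to check what the target instance actually is. A minimal sketch using SERVERPROPERTY, which exists on SQL Server 2000 and later:

```sql
-- ProductVersion major number: 8 = SQL Server 2000, 10 = 2008, 11 = 2012
-- EngineEdition 3 = Enterprise-class engine (Enterprise, Developer, Evaluation)
SELECT SERVERPROPERTY('ProductVersion') AS product_version,
       SERVERPROPERTY('Edition')        AS edition,
       SERVERPROPERTY('EngineEdition')  AS engine_edition
```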

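When these operations take longer than expected:

-- Inspect the running request (SQL Server 2005+)
-- percent_complete is populated only for a few commands (such as BACKUP, RESTORE and DBCC),
-- not ALTER TABLE, so watch waits and blocking instead
SELECT session_id, command, status, wait_type, wait_time, blocking_session_id
FROM sys.dm_exec_requests
WHERE command LIKE '%ALTER TABLE%'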
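
For the batched backfill, progress can be estimated directly from the data. This sketch assumes the T and mycol names used throughout; it only works between batches of an UPDATE loop, not while a single ALTER TABLE holds its schema lock, which blocks even NOLOCK readers.

```sql
-- Approximate rows still to backfill; NOLOCK avoids waiting on the batch in flight
-- at the cost of a dirty (inexact) read
SELECT COUNT(*) AS rows_remaining
FROM T WITH (NOLOCK)
WHERE mycol IS NULL
```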

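Switching the recovery model to bulk-logged does not reduce logging for ALTER TABLE or batched UPDATE statements. It helps only bulk operations such as SELECT INTO, and only when point-in-time recovery isn't required during that window.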