Optimizing ALTER TABLE ADD COLUMN Performance for Large Tables in SQL Server


When working with SQL Server tables containing millions of records (like your 10M row table), even simple DDL operations can become performance nightmares. The standard approach:

ALTER TABLE T
ADD mycol BIT NOT NULL DEFAULT 0

becomes problematic because SQL Server needs to:

  • Update every existing row with the default value
  • Maintain transaction logs for the entire operation
  • Potentially lock the table during execution
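
Before choosing an approach, it is worth knowing how big the table and its indexes actually are; a quick way to check (using the table name T from the question):

-- Rows, reserved space, data and index size for the table
EXEC sp_spaceused 'T'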

For your immediate SQL Server 2000 situation, try this alternative approach:

-- Step 1: Add nullable column without default
ALTER TABLE T
ADD mycol BIT NULL

-- Step 2: Create constraint with default
ALTER TABLE T
ADD CONSTRAINT DF_T_mycol DEFAULT 0 FOR mycol

-- Step 3: Backfill in batches (SQL Server 2000 has no UPDATE TOP,
-- so cap each batch with SET ROWCOUNT; adjust batch size as needed)
SET ROWCOUNT 100000
WHILE 1 = 1
BEGIN
    UPDATE T SET mycol = 0 WHERE mycol IS NULL
    IF @@ROWCOUNT = 0 BREAK
END
SET ROWCOUNT 0

-- Step 4: Convert to NOT NULL
ALTER TABLE T
ALTER COLUMN mycol BIT NOT NULL
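
Before running Step 4, it is worth confirming that the backfill really finished, since the NOT NULL conversion will fail if any NULLs remain:

-- Should return 0 before attempting the NOT NULL conversion
SELECT COUNT(*) AS remaining_nulls FROM T WHERE mycol IS NULL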

For your upcoming SQL Server 2008 migration, you have additional options:

Online Operations

Note that ALTER TABLE ... ADD does not accept an ONLINE option in SQL Server 2008; ONLINE = ON applies to index operations (Enterprise Edition), for example:

ALTER INDEX ALL ON T REBUILD WITH (ONLINE = ON)

Adding a NOT NULL column with a default does not become a metadata-only operation until SQL Server 2012 (Enterprise Edition), so on 2008 the batched approach above is still the safer route.

Minimally Logged Operations

-- Set recovery model to bulk-logged temporarily
ALTER DATABASE YourDB SET RECOVERY BULK_LOGGED

-- Perform the operation
ALTER TABLE T
ADD mycol BIT NOT NULL DEFAULT 0

-- Revert recovery model
ALTER DATABASE YourDB SET RECOVERY FULL
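
If you do switch recovery models, take a log backup immediately before and after to keep the log chain intact (the backup path here is a placeholder):

-- Log backups bracket the recovery-model switch; adjust the path
BACKUP LOG YourDB TO DISK = 'D:\Backups\YourDB_log.trn'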

General recommendations

  • Schedule during maintenance windows
  • Consider table partitioning for easier maintenance
  • Monitor tempdb usage during large operations
  • For very large tables, consider creating a new table and migrating data
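
On SQL Server 2005 and later, tempdb usage during the operation can be watched with the file-space DMV (a sketch using sys.dm_db_file_space_usage):

-- Pages reserved in tempdb by user and internal objects, in KB
SELECT SUM(user_object_reserved_page_count) * 8 AS user_object_kb,
       SUM(internal_object_reserved_page_count) * 8 AS internal_object_kb
FROM tempdb.sys.dm_db_file_space_usage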

For extremely large tables where downtime is unacceptable:

-- Create new table with desired schema
SELECT *, CAST(0 AS BIT) AS mycol INTO T_new FROM T WHERE 1=0

-- Create constraints/indexes to match original

-- Migrate data in batches (use ID ranges or other criteria)
INSERT INTO T_new
SELECT *, 0
FROM T
WHERE ID BETWEEN 1 AND 100000

-- Advance the ID range and repeat until all data is migrated,
-- reconciling any rows changed in T during the copy
-- Then rename tables (requires brief downtime)
EXEC sp_rename 'T', 'T_old'
EXEC sp_rename 'T_new', 'T'
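
Before the rename, a quick sanity check that the migration is complete (assuming writes to T are stopped for the final window):

-- Row counts should match before swapping the tables
SELECT (SELECT COUNT(*) FROM T) AS original_rows,
       (SELECT COUNT(*) FROM T_new) AS migrated_rows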

SQL Server 2008+ Improvements

On SQL Server 2008 you can add the column and its default constraint in a single statement:

ALTER TABLE T
ADD mycol BIT NOT NULL
CONSTRAINT DF_T_mycol DEFAULT 0 WITH VALUES

For a NOT NULL column the default is always applied to existing rows (WITH VALUES is implied), so in 2008 this is still a size-of-data operation that touches every row. WITH VALUES matters for nullable columns, where it populates existing rows with the default instead of leaving them NULL. Adding a NOT NULL column with a default becomes a metadata-only operation only in SQL Server 2012 and later (Enterprise Edition).
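
To see where WITH VALUES actually matters, compare adding a nullable column with and without it (the column names here are illustrative):

-- Existing rows get NULL; only new rows pick up the default
ALTER TABLE T ADD flag_a BIT NULL CONSTRAINT DF_T_flag_a DEFAULT 0

-- Existing rows are populated with 0 as well
ALTER TABLE T ADD flag_b BIT NULL CONSTRAINT DF_T_flag_b DEFAULT 0 WITH VALUES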

When these operations take longer than expected, check what the session is actually doing (SQL Server 2005+). Note that percent_complete in sys.dm_exec_requests is populated only for a few commands (backups, restores, DBCC operations), not for ALTER TABLE, so look at the wait information instead:

-- Inspect a long-running ALTER (SQL Server 2005+)
SELECT session_id, status, command, wait_type, blocking_session_id
FROM sys.dm_exec_requests
WHERE command LIKE '%ALTER%'

If point-in-time recovery isn't required during the maintenance window, switching to the bulk-logged recovery model can also cut log growth for the bulk steps (SELECT INTO, index rebuilds), though ordinary updates remain fully logged.
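
You can confirm the current recovery model before and after switching (works on SQL Server 2000 and later):

-- Returns FULL, BULK_LOGGED, or SIMPLE
SELECT DATABASEPROPERTYEX('YourDB', 'Recovery') AS recovery_model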