AWS RDS Storage Scaling: Minimizing Downtime During Storage Allocation Modifications



When modifying AWS RDS storage allocation (for either standard or Provisioned IOPS storage types), the actual downtime experienced depends on several factors:


# Example AWS CLI command for storage modification
# (--allocated-storage is the new size, in GiB; a comment cannot follow
# a trailing backslash, so it lives up here)
aws rds modify-db-instance \
    --db-instance-identifier mydbinstance \
    --allocated-storage 500 \
    --apply-immediately

Based on AWS documentation and empirical observations:

  • Most RDS engines (MySQL, PostgreSQL, MariaDB): near-zero downtime (under 30 seconds); Aurora grows cluster storage automatically, so manual allocation changes don't apply
  • SQL Server (pre-November 2017 configurations): May experience 2-5 minutes downtime
  • Storage-optimization phase: While marked as "Storage-optimization", the instance remains fully operational
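The polling idea behind that last bullet can be sketched offline. In this hypothetical helper, `fetch_status` stands in for a real `boto3` `describe_db_instances` call so the loop logic is testable without AWS credentials:

```python
import time

def wait_for_storage_change(fetch_status, poll_seconds=30, timeout_seconds=3600):
    """Poll until the instance leaves the 'modifying' state.

    fetch_status is any zero-argument callable returning the current
    DBInstanceStatus string; in production it would wrap
    boto3.client('rds').describe_db_instances(...).
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        status = fetch_status()
        # "storage-optimization" means the resize is done and the
        # instance is already serving traffic again
        if status != "modifying":
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("storage modification still in progress after timeout")
```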

To ensure minimal impact during storage scaling:


# CloudFormation template snippet for controlled storage modification
Resources:
  MyDB:
    Type: AWS::RDS::DBInstance
    Properties:
      AllocatedStorage: !Ref StorageSize
      StorageType: gp2
      AllowMajorVersionUpgrade: false
      AutoMinorVersionUpgrade: true

Use these AWS CLI commands to track progress:


# Check instance status
aws rds describe-db-instances \
    --db-instance-identifier mydbinstance \
    --query 'DBInstances[0].DBInstanceStatus'

# Check storage optimization status
aws rds describe-db-instances \
    --db-instance-identifier mydbinstance \
    --query 'DBInstances[0].StatusInfos'

For legacy SQL Server instances, consider these additional steps:


# PowerShell script to automate failover during maintenance
Import-Module AWSPowerShell
$instance = Get-RDSDBInstance -DBInstanceIdentifier "mssql-instance"
# Compare the major version numerically; -lt on strings is lexicographic
$major = [int]($instance.EngineVersion.Split('.')[0])
if ($major -lt 14) {
    Write-Host "Legacy SQL Server detected - scheduling extended maintenance window"
    # Implement custom failover logic here
}
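For teams scripting in Python instead of PowerShell, the same major-version check might look like this (`is_legacy_sql_server` is a hypothetical helper; major version 14 corresponds to SQL Server 2017):

```python
def is_legacy_sql_server(engine: str, engine_version: str) -> bool:
    """Return True for SQL Server engines older than 2017 (major
    version 14), which predate the faster storage-scaling behavior.

    Engine names like 'sqlserver-se' and versions like
    '13.00.5426.0.v1' follow the formats RDS reports.
    """
    if not engine.startswith("sqlserver"):
        return False
    major = int(engine_version.split(".")[0])
    return major < 14
```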

Important constraints to note:

  • 6-hour cooldown between storage modifications (further storage changes are blocked for six hours, or until storage optimization completes, whichever is longer)
  • The new allocated storage must be at least 10% greater than the current value; the maximum size varies by engine and storage type
  • Storage decreases are not supported
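These constraints can be pre-validated before calling modify-db-instance. A minimal sketch (`check_storage_request` is a hypothetical helper; the 10% minimum mirrors the RDS requirement that new storage be at least 10% greater than the current allocation):

```python
from datetime import datetime, timedelta, timezone

COOLDOWN = timedelta(hours=6)

def check_storage_request(current_gib, requested_gib, last_modified=None, now=None):
    """Return (ok, reason) for a proposed storage change.

    Encodes the three constraints listed above; exact limits vary by
    engine, so treat this as a sketch, not an authoritative gatekeeper.
    """
    now = now or datetime.now(timezone.utc)
    if requested_gib <= current_gib:
        return False, "storage decreases are not supported"
    if requested_gib < current_gib * 1.1:
        return False, "increase must be at least 10% of current size"
    if last_modified is not None and now - last_modified < COOLDOWN:
        return False, "six-hour cooldown still in effect"
    return True, "ok"
```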

While AWS states performance shouldn't degrade during storage optimization, monitor these metrics:


# CloudWatch metrics to watch during modification
# (GNU date syntax shown; on BSD/macOS use: date -v-1H +%FT%T)
aws cloudwatch get-metric-statistics \
    --namespace AWS/RDS \
    --metric-name CPUUtilization \
    --dimensions Name=DBInstanceIdentifier,Value=mydbinstance \
    --start-time $(date -d "1 hour ago" +%FT%T) \
    --end-time $(date +%FT%T) \
    --period 60 \
    --statistics Average

When modifying storage for Amazon RDS instances, the behavior varies significantly depending on:

  • Database engine (MySQL, PostgreSQL, SQL Server, etc.)
  • Storage type transition (gp2 to io1 or vice versa)
  • Whether storage optimization has occurred since November 2017

For most modern RDS configurations (post-2017 optimizations), storage scaling operates as follows:


# AWS CLI command to modify storage
aws rds modify-db-instance \
    --db-instance-identifier your-instance \
    --allocated-storage 500 \
    --apply-immediately

Key observations from production environments:

  • MySQL/PostgreSQL: Typically experience 10-30 seconds of latency during the modification
  • SQL Server (legacy): May incur 2-5 minutes downtime during storage-optimization phase
  • Multi-AZ deployments: Often show better availability during modifications
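Those anecdotal numbers can be folded into a rough planning helper (`expected_impact_window` is hypothetical; the ranges come from the observations above, not from any AWS guarantee):

```python
def expected_impact_window(engine: str, engine_version: str):
    """Rough (min_seconds, max_seconds) planning window based on the
    production observations above. Treat the numbers as anecdotal,
    not an SLA.
    """
    if engine.startswith("sqlserver") and int(engine_version.split(".")[0]) < 14:
        return (120, 300)  # legacy SQL Server: roughly 2-5 minutes
    return (10, 30)        # MySQL/PostgreSQL: brief latency blip
```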

Use this CloudWatch metrics query to watch FreeStorageSpace while the new capacity comes online (CloudWatch publishes no dedicated storage-optimization metric for RDS):


# (BSD/macOS date syntax shown; on Linux use: date -d "1 hour ago" +%s)
aws cloudwatch get-metric-data \
    --metric-data-queries file://query.json \
    --start-time $(date -v-1H +%s) \
    --end-time $(date +%s)

Where query.json contains (note the top-level array - the parameter takes a list of queries):


[
  {
    "Id": "freeStorage",
    "MetricStat": {
      "Metric": {
        "Namespace": "AWS/RDS",
        "MetricName": "FreeStorageSpace",
        "Dimensions": [
          {
            "Name": "DBInstanceIdentifier",
            "Value": "your-instance"
          }
        ]
      },
      "Period": 60,
      "Stat": "Average"
    },
    "ReturnData": true
  }
]

Based on AWS Premium Support cases and community reports:

  • Always initiate changes during maintenance windows
  • For SQL Server, expect longer optimization times with larger storage increments
  • Storage modifications block further storage changes for six hours - the instance itself stays available

Here's a Lambda function template for automated storage scaling:


import boto3
from botocore.exceptions import ClientError

def lambda_handler(event, context):
    rds = boto3.client('rds')

    try:
        response = rds.modify_db_instance(
            DBInstanceIdentifier=event['DBInstanceIdentifier'],
            AllocatedStorage=int(event['NewStorageSize']),
            ApplyImmediately=True
        )
        # The raw API response contains datetime objects that Lambda
        # cannot JSON-serialize, so return only the fields callers need
        db = response['DBInstance']
        return {
            'statusCode': 200,
            'body': {
                'DBInstanceIdentifier': db['DBInstanceIdentifier'],
                'DBInstanceStatus': db['DBInstanceStatus'],
                'PendingAllocatedStorage': db.get('PendingModifiedValues', {}).get('AllocatedStorage')
            }
        }
    except ClientError as e:
        return {
            'statusCode': 500,
            'error': str(e)
        }
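A matching invocation event for the handler above (field names follow the template; values are placeholders):

```json
{
  "DBInstanceIdentifier": "mydbinstance",
  "NewStorageSize": 500
}
```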