PowerShell Script to Retrieve & Analyze Emails Older Than 2 Years in Exchange by User with Size Metrics


When dealing with email retention policies and legal discovery requests, organizations often need granular metrics about historical email data. The challenge is to quantify exactly how much data would be involved when setting different retention thresholds (1 year, 2 years, etc.) while maintaining user-level visibility.

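The following script reports per-user statistics filtered by age. Note that Get-MailboxFolderStatistics works at folder level: the script sums every folder whose oldest item predates the cutoff, so the totals are an upper bound on the data older than the cutoff, not an exact item-level figure: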


# Per-user totals for folders that contain items older than the cutoff
$cutoffDate = (Get-Date).AddYears(-2) # Adjust for different retention periods

# Collect pipeline output directly instead of growing an array with +=
$results = Get-Mailbox -ResultSize Unlimited | ForEach-Object {
    $mailbox = $_
    $stats = Get-MailboxFolderStatistics -Identity $mailbox.Identity -FolderScope All -IncludeOldestAndNewestItems

    # Keep folders whose oldest item predates the cutoff; folders without a date are skipped
    $oldItems = $stats | Where-Object { $null -ne $_.OldestItemReceivedDate -and $_.OldestItemReceivedDate -lt $cutoffDate }

    # FolderSize renders as "1.5 MB (1,572,864 bytes)" and cannot be summed directly,
    # so extract the byte count from the parentheses first
    $totalBytes = ($oldItems | ForEach-Object {
        [int64](("$($_.FolderSize)" -replace '.*\((.+) bytes\).*', '$1') -replace '\D', '')
    } | Measure-Object -Sum).Sum

    [PSCustomObject]@{
        UserPrincipalName = $mailbox.UserPrincipalName
        DisplayName = $mailbox.DisplayName
        # Item count of the matching folders; some of these items may be newer than the cutoff
        TotalItemsOlderThanCutoff = ($oldItems | Measure-Object ItemsInFolder -Sum).Sum
        # Size in MB
        TotalSizeOlderThanCutoff = [math]::Round($totalBytes / 1MB, 2)
        OldestItemDate = ($oldItems | Sort-Object OldestItemReceivedDate | Select-Object -First 1).OldestItemReceivedDate
    }
}

# Export results sorted by size
$results | Sort-Object TotalSizeOlderThanCutoff -Descending | Export-Csv -Path "Mailbox_Age_Analysis.csv" -NoTypeInformation
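
FolderSize values typically print in the form "1.5 MB (1,572,864 bytes)" rather than as plain numbers, so they need converting before they can be summed. A minimal sketch of such a conversion, runnable without Exchange; ConvertTo-ByteCount is a hypothetical helper name and the sample strings are made up:

```powershell
function ConvertTo-ByteCount {
    param([string]$SizeText)
    # Capture the text between "(" and " bytes)", then drop thousands separators
    if ($SizeText -match '\((.+) bytes\)') {
        return [int64]($Matches[1] -replace '\D', '')
    }
    # Unrecognized format: treat as zero rather than guessing
    return [int64]0
}

# Made-up sample values in the format FolderSize usually prints
$samples = '1.5 MB (1,572,864 bytes)', '0 B (0 bytes)', '2 GB (2,147,483,648 bytes)'
$totalBytes = ($samples | ForEach-Object { ConvertTo-ByteCount $_ } | Measure-Object -Sum).Sum
$totalMB = [math]::Round($totalBytes / 1MB, 2)
```

Parsing the byte count rather than the "1.5 MB" prefix avoids rounding errors and unit handling.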

For more comprehensive reporting across multiple retention periods:


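$analysisPeriods = @(
    @{Years=1; Label="1 Year"},
    @{Years=2; Label="2 Years"},
    @{Years=3; Label="3 Years"},
    @{Years=5; Label="5 Years"}
)
$now = Get-Date # One reference time for every mailbox and period

# Collect pipeline output directly instead of growing an array with +=
$fullReport = Get-Mailbox -ResultSize Unlimited | ForEach-Object {
    $mailbox = $_
    $stats = Get-MailboxFolderStatistics -Identity $mailbox.Identity -FolderScope All -IncludeOldestAndNewestItems

    # An ordered hashtable keeps the CSV columns in a predictable order
    $userReport = [ordered]@{
        User = $mailbox.UserPrincipalName
        DisplayName = $mailbox.DisplayName
    }

    foreach ($period in $analysisPeriods) {
        $cutoff = $now.AddYears(-$period.Years)
        # Skip folders without a date; $null would otherwise compare as older than any cutoff
        $oldData = $stats | Where-Object { $null -ne $_.OldestItemReceivedDate -and $_.OldestItemReceivedDate -lt $cutoff }

        # FolderSize renders as "1.5 MB (1,572,864 bytes)"; extract the byte count before summing
        $totalBytes = ($oldData | ForEach-Object {
            [int64](("$($_.FolderSize)" -replace '.*\((.+) bytes\).*', '$1') -replace '\D', '')
        } | Measure-Object -Sum).Sum

        $userReport["SizeMB_$($period.Label)"] = [math]::Round($totalBytes / 1MB, 2)
    }

    [PSCustomObject]$userReport
}

$fullReport | Export-Csv -Path "MultiPeriod_Retention_Analysis.csv" -NoTypeInformation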

For large environments, consider these optimizations:

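  • Process mailboxes in batches, for example one database or OU at a time with Get-Mailbox -Database or -OrganizationalUnit (-ResultSize only caps the number of results returned)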
  • Add error handling with try/catch blocks
  • Implement progress tracking with Write-Progress
  • Consider running during off-peak hours
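
The error-handling and progress suggestions above can be sketched as a loop like the following. It uses placeholder mailbox names and a stand-in function, Get-FolderStatsStub (both hypothetical), so it runs without Exchange; in a real script the stub call would be Get-MailboxFolderStatistics with -ErrorAction Stop so that failures reach the catch block:

```powershell
# Stand-in for Get-MailboxFolderStatistics; throws for one name to simulate a failure
function Get-FolderStatsStub {
    param([string]$Identity)
    if ($Identity -eq 'broken@example.com') { throw "Mailbox $Identity could not be read" }
    [PSCustomObject]@{ Identity = $Identity; FolderCount = 3 }
}

$mailboxes = 'alice@example.com', 'broken@example.com', 'bob@example.com'
$failed = [System.Collections.Generic.List[string]]::new()
$index = 0

$results = foreach ($mbx in $mailboxes) {
    $index++
    Write-Progress -Activity 'Analyzing mailboxes' -Status $mbx -PercentComplete (100 * $index / $mailboxes.Count)
    try {
        Get-FolderStatsStub -Identity $mbx
    }
    catch {
        # Record the failure and continue with the next mailbox
        $failed.Add($mbx)
    }
}
Write-Progress -Activity 'Analyzing mailboxes' -Completed
```

Keeping a list of failed mailboxes makes it possible to rerun only those instead of repeating the whole scan.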

The output CSV will contain columns showing each user's data volume for each retention period. This allows legal teams to:

  • See which users have the most historical data
  • Estimate storage requirements for different retention policies
  • Identify outliers that may need special handling
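
As a small illustration of the first and third points, the exported CSV can be ranked and screened for outliers. The sample rows and the three-times-median threshold below are made up for illustration. Import-Csv and ConvertFrom-Csv return every value as a string, so sizes need a numeric cast before sorting:

```powershell
# Made-up sample data standing in for an exported report
$csv = @'
"User","TotalSizeMB"
"alice@example.com","120.5"
"bob@example.com","9800"
"carol@example.com","310"
'@

# Cast the size column to a number; otherwise it would sort as text
$rows = $csv | ConvertFrom-Csv | ForEach-Object {
    [PSCustomObject]@{ User = $_.User; TotalSizeMB = [double]$_.TotalSizeMB }
}

# Largest mailboxes first
$ranked = $rows | Sort-Object TotalSizeMB -Descending

# Flag users above three times the (lower) median size; the factor is arbitrary
$sorted = @($rows.TotalSizeMB | Sort-Object)
$median = $sorted[[int][math]::Floor(($sorted.Count - 1) / 2)]
$outliers = $ranked | Where-Object { $_.TotalSizeMB -gt 3 * $median }
```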

When dealing with Exchange mailbox archiving and retention policies, legal teams often need precise data on email age distribution. The challenge is to extract this information in a user-sorted format rather than aggregated totals. This allows administrators to understand storage patterns per user and make informed decisions about retention periods.

A script that reports only aggregate totals lacks user-specific breakdowns. The version below:

  1. Processes all mailboxes
  2. Selects folders containing items older than 2 years
  3. Groups results by user
  4. Provides size metrics

This script generates a detailed report showing each user's old email volume:


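# Get all mailboxes and their folder statistics
$mailboxes = Get-Mailbox -ResultSize Unlimited
$cutoffDate = (Get-Date).AddYears(-2) # Computed once so every mailbox uses the same cutoff

# Collect loop output directly instead of growing an array with +=
$report = foreach ($mailbox in $mailboxes) {
    $stats = Get-MailboxFolderStatistics -Identity $mailbox.Identity -FolderScope All -IncludeOldestAndNewestItems

    # Folders whose oldest item is more than 2 years old; folders without a date are skipped
    $oldItems = $stats | Where-Object { $null -ne $_.OldestItemReceivedDate -and $_.OldestItemReceivedDate -lt $cutoffDate }

    # FolderSize renders as "1.5 MB (1,572,864 bytes)"; extract the byte count before summing
    $totalBytes = ($oldItems | ForEach-Object {
        [int64](("$($_.FolderSize)" -replace '.*\((.+) bytes\).*', '$1') -replace '\D', '')
    } | Measure-Object -Sum).Sum

    [PSCustomObject]@{
        User = $mailbox.DisplayName
        Email = $mailbox.PrimarySmtpAddress
        # Items in the matching folders (not the number of folders); some may be newer than the cutoff
        OldItemsCount = ($oldItems | Measure-Object -Property ItemsInFolder -Sum).Sum
        TotalSizeMB = [math]::Round($totalBytes / 1MB, 2)
        OldestEmailDate = ($oldItems | Sort-Object OldestItemReceivedDate | Select-Object -First 1).OldestItemReceivedDate
    }
}

# Export sorted results
$report | Sort-Object TotalSizeMB -Descending | Export-Csv -Path "OldEmailsByUser.csv" -NoTypeInformation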

For more granular reporting, you can modify the date filter:


# Multiple date ranges for comparison
$dateRanges = @(
    @{Years=1; Label="1Year"},
    @{Years=2; Label="2Years"},
    @{Years=3; Label="3Years"}
)

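foreach ($range in $dateRanges) {
    $cutoff = (Get-Date).AddYears(-$range.Years)
    # Inside the per-mailbox loop, filter $stats against $cutoff and repeat the
    # size and count calculations above, storing each result under $range.Label
}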
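
One way to fill in that loop is a function that takes one mailbox's folder statistics and returns a total per range. The sketch below runs on hand-made sample objects with a numeric SizeBytes property (an assumption; real FolderSize values need converting to bytes first), and Get-AgeBucketTotals is a hypothetical name:

```powershell
function Get-AgeBucketTotals {
    param(
        [object[]]$FolderStats,
        [hashtable[]]$Ranges,
        [datetime]$Now = (Get-Date)
    )
    $totals = [ordered]@{}
    foreach ($range in $Ranges) {
        $cutoff = $Now.AddYears(-$range.Years)
        # Folders without a date are skipped
        $old = $FolderStats | Where-Object {
            $null -ne $_.OldestItemReceivedDate -and $_.OldestItemReceivedDate -lt $cutoff
        }
        $bytes = ($old | Measure-Object -Property SizeBytes -Sum).Sum
        $totals["SizeMB_$($range.Label)"] = [math]::Round($bytes / 1MB, 2)
    }
    [PSCustomObject]$totals
}

# Made-up folder statistics for a single mailbox
$now = [datetime]'2024-01-01'
$folders = @(
    [PSCustomObject]@{ OldestItemReceivedDate = [datetime]'2020-06-01'; SizeBytes = 3MB },
    [PSCustomObject]@{ OldestItemReceivedDate = [datetime]'2022-09-01'; SizeBytes = 1MB },
    [PSCustomObject]@{ OldestItemReceivedDate = $null; SizeBytes = 0 }
)
$ranges = @{Years=1; Label='1Year'}, @{Years=2; Label='2Years'}, @{Years=3; Label='3Years'}
$bucket = Get-AgeBucketTotals -FolderStats $folders -Ranges $ranges -Now $now
```

The buckets are cumulative: a folder counted under the 3-year range is also counted under the 1-year and 2-year ranges.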

For large environments:

  • Run during off-peak hours
  • Consider parallel processing (workflows exist only in Windows PowerShell 5.1; PowerShell 7+ offers ForEach-Object -Parallel instead)
  • Limit properties retrieved to only what's needed
  • Add progress indicators for long-running operations

For organizations with thousands of mailboxes, consider this optimized version:


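# Parallel processing sketch. ForEach-Object -Parallel requires PowerShell 7+ and runs locally,
# so it cannot be called inside the remote Exchange session. <ServerFQDN> is a placeholder.
$uri = "http://<ServerFQDN>/PowerShell/"
$session = New-PSSession -ConfigurationName Microsoft.Exchange -ConnectionUri $uri -Authentication Kerberos
$addresses = (Invoke-Command -Session $session -ScriptBlock { Get-Mailbox -ResultSize Unlimited }).PrimarySmtpAddress
Remove-PSSession $session
$addresses | ForEach-Object -Parallel {
    # Each runspace needs its own session; Exchange throttling limits concurrent sessions per user
    $s = New-PSSession -ConfigurationName Microsoft.Exchange -ConnectionUri $using:uri -Authentication Kerberos
    # Individual mailbox processing here, using session $s
    Remove-PSSession $s
} -ThrottleLimit 10 | Export-Csv "ParallelResults.csv" -NoTypeInformation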
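
Because each parallel worker pays the cost of its own Exchange connection, it can be cheaper to hand each worker a batch of mailboxes rather than a single one. A minimal batching sketch using plain arrays; Split-IntoBatches is a hypothetical helper name, not a built-in cmdlet, and the addresses are placeholders:

```powershell
function Split-IntoBatches {
    param(
        [object[]]$Items,
        [int]$BatchSize
    )
    if ($BatchSize -lt 1) { throw 'BatchSize must be at least 1' }
    for ($i = 0; $i -lt $Items.Count; $i += $BatchSize) {
        $end = [math]::Min($i + $BatchSize, $Items.Count) - 1
        # The leading comma emits each batch as one array instead of unrolling it
        , @($Items[$i..$end])
    }
}

# Seven placeholder addresses split into batches of three
$addresses = 1..7 | ForEach-Object { 'user{0}@example.com' -f $_ }
$batches = @(Split-IntoBatches -Items $addresses -BatchSize 3)
```

Each element of $batches can then be passed to one parallel worker, which opens a single session and loops over its batch.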