PowerShell Folder Comparison: How to Diff File Contents Between Two Directories


When comparing two directory structures in PowerShell, most solutions focus on superficial differences like file names or metadata. However, developers often need to verify whether the actual content of corresponding files matches across folders - particularly when validating deployments or syncing environments.

Start by gathering all files from both directories:

$folder1Files = Get-ChildItem -Recurse -File "C:\path\to\folder1" | Select-Object FullName,Length
$folder2Files = Get-ChildItem -Recurse -File "C:\path\to\folder2" | Select-Object FullName,Length

This function performs a deep comparison by checking both file existence and content hashes:

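function Compare-FolderContents {
    param(
        [Parameter(Mandatory)][string]$Path1,
        [Parameter(Mandatory)][string]$Path2
    )

    # Resolve each root and drop any trailing separator so relative paths
    # are computed the same way for both folders
    $root1 = (Resolve-Path -LiteralPath $Path1).ProviderPath.TrimEnd('\', '/')
    $root2 = (Resolve-Path -LiteralPath $Path2).ProviderPath.TrimEnd('\', '/')

    # Index every file by its path relative to its own root
    $files1 = @{}
    Get-ChildItem -LiteralPath $Path1 -Recurse -File | ForEach-Object {
        $files1[$_.FullName.Substring($root1.Length)] = $_.FullName
    }
    $files2 = @{}
    Get-ChildItem -LiteralPath $Path2 -Recurse -File | ForEach-Object {
        $files2[$_.FullName.Substring($root2.Length)] = $_.FullName
    }

    # Compare file presence first, using the union of relative paths
    $allPaths = @($files1.Keys) + @($files2.Keys) | Sort-Object -Unique

    $results = foreach ($relativePath in $allPaths) {
        if (-not $files1.ContainsKey($relativePath)) {
            [PSCustomObject]@{
                File = $relativePath
                Status = "Only in Folder2"
                Hash1 = $null
                Hash2 = $null
            }
            continue
        }

        if (-not $files2.ContainsKey($relativePath)) {
            [PSCustomObject]@{
                File = $relativePath
                Status = "Only in Folder1"
                Hash1 = $null
                Hash2 = $null
            }
            continue
        }

        # Compare content via hash
        $hash1 = (Get-FileHash -LiteralPath $files1[$relativePath] -Algorithm SHA256).Hash
        $hash2 = (Get-FileHash -LiteralPath $files2[$relativePath] -Algorithm SHA256).Hash

        [PSCustomObject]@{
            File = $relativePath
            Status = if ($hash1 -eq $hash2) { "Match" } else { "Content Mismatch" }
            Hash1 = $hash1
            Hash2 = $hash2
        }
    }

    return $results
}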

Execute the comparison and filter results:

$diffResults = Compare-FolderContents -Path1 "D:\source" -Path2 "D:\backup"
$diffResults | Where-Object { $_.Status -ne "Match" } | Format-Table -AutoSize
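
Once the results are collected, it can help to summarize them before reading row by row. The following is a minimal sketch, assuming result objects with the File and Status properties produced above. The sample rows and the report file name are hypothetical and only stand in for real output.

```powershell
# Hypothetical sample rows shaped like the output of Compare-FolderContents
$sampleResults = @(
    [PSCustomObject]@{ File = '\app.config';  Status = 'Content Mismatch' }
    [PSCustomObject]@{ File = '\bin\app.dll'; Status = 'Match' }
    [PSCustomObject]@{ File = '\readme.txt';  Status = 'Only in Folder1' }
    [PSCustomObject]@{ File = '\logs\a.log';  Status = 'Only in Folder1' }
)

# Count how many files fall into each status
$summary = $sampleResults | Group-Object -Property Status | Select-Object Name, Count
$summary | Format-Table -AutoSize

# Save only the differences for later review (hypothetical file name in the temp folder)
$reportPath = Join-Path ([System.IO.Path]::GetTempPath()) 'folder-diff.csv'
$sampleResults |
    Where-Object Status -ne 'Match' |
    Export-Csv -Path $reportPath -NoTypeInformation
```

Replacing $sampleResults with $diffResults gives the same summary for a real comparison.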

For large directories, consider these optimizations:

  • Add parallel processing with ForEach-Object -Parallel (PowerShell 7+)
  • Implement file size pre-check before hash calculation
  • Use memory-efficient streaming for very large files
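
The size pre-check from the list above can be sketched as a small helper. This is an illustration under stated assumptions, not part of the original function: Test-FileContentEqual is a hypothetical name, and it assumes both paths point to existing files.

```powershell
# Hypothetical helper: returns $true when two files have identical content
function Test-FileContentEqual {
    param(
        [Parameter(Mandatory)][string]$File1,
        [Parameter(Mandatory)][string]$File2
    )

    $item1 = Get-Item -LiteralPath $File1
    $item2 = Get-Item -LiteralPath $File2

    # Files of different sizes cannot have the same content, so skip hashing
    if ($item1.Length -ne $item2.Length) {
        return $false
    }

    # Same size: fall back to comparing SHA256 hashes
    $hash1 = (Get-FileHash -LiteralPath $item1.FullName -Algorithm SHA256).Hash
    $hash2 = (Get-FileHash -LiteralPath $item2.FullName -Algorithm SHA256).Hash
    return ($hash1 -eq $hash2)
}
```

The size check only reads file metadata, so files that differ in length are rejected without reading their contents. Files of equal length are still hashed.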

For quick comparisons without full content analysis:

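# Fast size-only comparison (one direction: checks folder1 against folder2)
$folder1 = "C:\path\to\folder1"   # full path, no trailing backslash
$folder2 = "C:\path\to\folder2"
Get-ChildItem -LiteralPath $folder1 -Recurse -File | ForEach-Object {
    # Map the file to the same relative location under folder2
    $otherFile = Join-Path $folder2 $_.FullName.Substring($folder1.Length)
    if (-not (Test-Path -LiteralPath $otherFile)) {
        "Missing in folder2: $($_.FullName)"
    }
    elseif ($_.Length -ne (Get-Item -LiteralPath $otherFile).Length) {
        "Size mismatch: $($_.FullName)"
    }
}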

When working with folder structures in Windows, you often need to verify whether two directories contain identical files: not just matching names, but also matching content. Running Compare-Object directly on Get-ChildItem output compares the file objects themselves, not the bytes they contain, which isn't sufficient for many development scenarios.

For a proper content comparison, we should generate a hash of each file's content and compare the hashes. Each file is read only once, and two files with the same SHA256 hash can be treated as identical for practical purposes:


function Compare-Folders {
    param(
        [Parameter(Mandatory)][string]$Path1,
        [Parameter(Mandatory)][string]$Path2
    )

    # Resolve each root and drop any trailing separator so both sides
    # produce identical relative paths
    $root1 = (Resolve-Path -LiteralPath $Path1).ProviderPath.TrimEnd('\', '/')
    $root2 = (Resolve-Path -LiteralPath $Path2).ProviderPath.TrimEnd('\', '/')

    $files1 = Get-ChildItem -LiteralPath $Path1 -Recurse -File
    $files2 = Get-ChildItem -LiteralPath $Path2 -Recurse -File

    # Create hash tables for comparison (relative path -> SHA256 hash)
    $hash1 = @{}
    $files1 | ForEach-Object {
        $relativePath = $_.FullName.Substring($root1.Length)
        $hash1[$relativePath] = (Get-FileHash -LiteralPath $_.FullName -Algorithm SHA256).Hash
    }

    $hash2 = @{}
    $files2 | ForEach-Object {
        $relativePath = $_.FullName.Substring($root2.Length)
        $hash2[$relativePath] = (Get-FileHash -LiteralPath $_.FullName -Algorithm SHA256).Hash
    }

    # Compare the two sets of relative paths
    $comparison = Compare-Object -ReferenceObject @($hash1.Keys) -DifferenceObject @($hash2.Keys)

    # Files only in Path1
    $leftOnly = $comparison | Where-Object { $_.SideIndicator -eq '<=' } | Select-Object -ExpandProperty InputObject

    # Files only in Path2
    $rightOnly = $comparison | Where-Object { $_.SideIndicator -eq '=>' } | Select-Object -ExpandProperty InputObject

    # Files in both but with different content
    $commonFiles = $hash1.Keys | Where-Object { $hash2.ContainsKey($_) }
    $differentContent = $commonFiles | Where-Object { $hash1[$_] -ne $hash2[$_] }

    [PSCustomObject]@{
        Path1 = $Path1
        Path2 = $Path2
        FilesOnlyInPath1 = $leftOnly
        FilesOnlyInPath2 = $rightOnly
        FilesWithDifferentContent = $differentContent
    }
}

To compare two folders, simply call the function with the paths:


$result = Compare-Folders -Path1 "C:\folder1" -Path2 "D:\folder2"
$result.FilesWithDifferentContent | Format-Table

For very large directories, you might want to add progress reporting and parallel processing:


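# Example with parallel hashing (requires PowerShell 7+)
$Path1 = "C:\folder1"   # full path, no trailing backslash
$files1 = Get-ChildItem -LiteralPath $Path1 -Recurse -File
$files1 | ForEach-Object -Parallel {
    # Copy the outer variable into the parallel runspace
    $root = $using:Path1
    $relativePath = $_.FullName.Substring($root.Length)
    $hash = (Get-FileHash -LiteralPath $_.FullName -Algorithm SHA256).Hash
    [PSCustomObject]@{
        Path = $relativePath
        Hash = $hash
    }
} -ThrottleLimit 8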
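
Progress reporting can be sketched with Write-Progress in a plain sequential loop. This is a minimal illustration, not the article's function: Get-FolderHashes is a hypothetical helper name, and it returns a hashtable that maps each relative path to its SHA256 hash.

```powershell
# Hypothetical helper: hashes every file under a folder and shows a progress bar
function Get-FolderHashes {
    param(
        [Parameter(Mandatory)][string]$Path
    )

    # Resolve the root and drop any trailing separator before computing relative paths
    $root   = (Resolve-Path -LiteralPath $Path).ProviderPath.TrimEnd('\', '/')
    $files  = @(Get-ChildItem -LiteralPath $Path -Recurse -File)
    $hashes = @{}
    $index  = 0

    foreach ($file in $files) {
        $index++
        $relativePath = $file.FullName.Substring($root.Length)

        # Update the progress bar with the current file and overall percentage
        Write-Progress -Activity "Hashing $root" -Status $relativePath `
            -PercentComplete (100 * $index / $files.Count)

        $hashes[$relativePath] = (Get-FileHash -LiteralPath $file.FullName -Algorithm SHA256).Hash
    }

    # Remove the progress bar once all files are processed
    Write-Progress -Activity "Hashing $root" -Completed
    return $hashes
}
```

Calling the helper once per folder yields two hashtables that can be compared key by key, in the same way as $hash1 and $hash2 in Compare-Folders.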

If you're specifically working with text files and want line-by-line comparison, consider this approach:


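function Compare-TextFiles {
    param(
        [Parameter(Mandatory)][string]$File1,
        [Parameter(Mandatory)][string]$File2
    )

    # Wrap in @() so an empty file yields an empty array instead of $null
    $content1 = @(Get-Content -LiteralPath $File1)
    $content2 = @(Get-Content -LiteralPath $File2)

    # '<=' marks lines only in File1, '=>' marks lines only in File2
    Compare-Object -ReferenceObject $content1 -DifferenceObject $content2
}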
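
To show how the Compare-Object output reads, here is a small sketch that uses in-memory arrays in place of file contents. The sample lines are hypothetical and the variable names are only for illustration.

```powershell
# Hypothetical line contents standing in for two versions of a text file
$oldLines = 'alpha', 'beta', 'gamma'
$newLines = 'alpha', 'gamma', 'delta'

$diff = Compare-Object -ReferenceObject $oldLines -DifferenceObject $newLines

# '<=' marks lines found only in the reference input, '=>' only in the difference input
$removed = @($diff | Where-Object SideIndicator -eq '<=' | ForEach-Object InputObject)
$added   = @($diff | Where-Object SideIndicator -eq '=>' | ForEach-Object InputObject)

"Removed: $($removed -join ', ')"
"Added:   $($added -join ', ')"
```

Here 'beta' is reported as removed and 'delta' as added. Compare-Object matches lines regardless of their position, so a line that merely moved within the file is not reported as a difference. For an ordered, patch-style diff, a dedicated diff tool is a better fit.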