Runbook: Hybrid Exchange Rollback

Safely revert hybrid changes and restore prior known-good configuration.

⚠️ Business Consequence: Why Hybrid Rollback Matters

  • Financial Impact: Failed migration rollback = extended dual-licensing costs ($15K–$50K per month)
  • Compliance Exposure: Split mailbox data = incomplete eDiscovery/audit trails
  • Operational Risk: Cross-forest communications broken = inter-department collaboration halted
  • Migration Risk: Failed coexistence = project timeline slippage, executive escalation

Average rollback time: 20–40 minutes. A prompt rollback keeps a failed hybrid change from stalling the wider migration project.

Hybrid Rollback Summary

  • Severity: P1/P2 (depends on scope)
  • Total time: 20-40 minutes (backup restore + validation)
  • Affects: Free-busy sync, mail flow (on-prem to cloud), mailbox migration
  • Risk: Medium (cross-forest operations disrupted during rollback)

Pre-Rollback Validation (5 minutes)

Before beginning rollback, confirm the root cause and assess impact scope:

  • Identify root cause: Which hybrid change caused the issue? (HCW failure, DNS change, org relationship misconfiguration, connector scope problem)
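  • Document current state: Export the current organization relationship and connector configuration before making changes (for example, pipe Get-OrganizationRelationship and Get-SendConnector to Export-Clixml)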
  • Locate backup: Ensure you have pre-change configuration backup available
  • Assess impact: How many users affected? Is it free-busy only or mail flow disrupted?
  • Communication: Notify stakeholders rollback is beginning (impacts may continue 10-15 min)
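
The "document current state" item can be scripted so the snapshot is taken the same way every time. The following is a minimal sketch, assuming it runs in the on-premises Exchange Management Shell. The function name Export-HybridSnapshot and the folder path in the example are hypothetical and not part of the original procedure.

```powershell
# Hypothetical helper: saves the current organization relationships and send
# connectors as Clixml files and returns the paths of the files it wrote.
function Export-HybridSnapshot {
    param(
        [Parameter(Mandatory)] [string] $Path
    )
    # Create the target folder if it does not exist yet.
    New-Item -ItemType Directory -Path $Path -Force | Out-Null

    # Timestamp the files so repeated snapshots do not overwrite each other.
    $stamp = Get-Date -Format 'yyyyMMdd-HHmmss'
    $orgFile  = Join-Path $Path "OrgRelationship-$stamp.xml"
    $sendFile = Join-Path $Path "SendConnector-$stamp.xml"

    Get-OrganizationRelationship | Export-Clixml -Path $orgFile
    Get-SendConnector            | Export-Clixml -Path $sendFile

    return $orgFile, $sendFile
}

# Example use (the folder is an assumption):
# Export-HybridSnapshot -Path 'C:\HybridBackup'
```

Clixml keeps property values in a form that Import-Clixml can read back later, which is easier to compare or reuse than a text dump from Format-List. Other hybrid objects can be added to the function in the same way.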

3-Phase Rollback Procedure

Execute phases in order. Do NOT skip validation between phases. Allow 3-5 minutes for Azure AD sync between Phase 1 and Phase 2:

Phase 1: Restore Organization Relationship (10-15 min)

Objective: Restore the organization relationship configuration that enables cross-forest communication and free-busy sync.

  1. Verify backup exists: Before making changes, confirm you have a backup export from before the failed change
  2. Check current org relationship: Run from on-premises Exchange Management Shell:
    Get-OrganizationRelationship | Format-List Name, DomainNames, FreeBusyAccessEnabled, FreeBusyAccessLevel, MailboxMoveEnabled
  3. If configuration is wrong: Remove the relationship and recreate it from backup:
    # Replace <Relationship Name> with the Name value from the previous step
    Remove-OrganizationRelationship -Identity "<Relationship Name>" -Confirm:$false
    # Then recreate it from backup with New-OrganizationRelationship (Set-OrganizationRelationship only modifies an existing relationship)
  4. Wait for sync: Organization relationship changes take 3-5 minutes to propagate to Azure AD. Monitor the O365 tenant for update confirmation.
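
If the relationship has to be recreated, the backed-up values can be turned into a parameter set for New-OrganizationRelationship. This is a sketch only, assuming the backup was saved with Export-Clixml and holds the properties shown in the Get-OrganizationRelationship check above. The function name Get-OrgRelationshipRestoreParams is hypothetical, and a real relationship may carry further settings that also need restoring.

```powershell
# Hypothetical helper: maps one backed-up organization relationship object
# (for example, loaded with Import-Clixml) to a hashtable that can be splatted
# onto New-OrganizationRelationship. Only the listed properties are copied.
function Get-OrgRelationshipRestoreParams {
    param(
        [Parameter(Mandatory)] $Backup
    )
    return @{
        Name                  = [string]$Backup.Name
        # Deserialized values may no longer be their original types, so convert to strings.
        DomainNames           = @($Backup.DomainNames | ForEach-Object { [string]$_ })
        FreeBusyAccessEnabled = [bool]$Backup.FreeBusyAccessEnabled
        FreeBusyAccessLevel   = [string]$Backup.FreeBusyAccessLevel
        MailboxMoveEnabled    = [bool]$Backup.MailboxMoveEnabled
    }
}

# Example use (file path and relationship name are placeholders):
# $backup = Import-Clixml -Path '<backup file>' | Where-Object { $_.Name -eq '<Relationship Name>' }
# $params = Get-OrgRelationshipRestoreParams -Backup $backup
# New-OrganizationRelationship @params
```

Review the hashtable before running New-OrganizationRelationship, since the command recreates the relationship exactly as the backup describes it.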

Phase 2: Revert Connectors & Restart Transport (5-10 min)

Objective: Restore mail flow connectors to prior configuration and restart the transport service.

  1. List current connectors: Check for unauthorized changes:
    Get-SendConnector | Format-List Name, SourceTransportServers, AddressSpaces, SmartHosts
  2. If modified: Remove the bad connector and recreate it from backup:
    Remove-SendConnector -Identity "Connector Name" -Confirm:$false
    # Recreate with New-SendConnector using the original parameters (Set-SendConnector only modifies an existing connector)
  3. Restart transport service: This causes a 3-5 minute disruption. Schedule during a maintenance window if possible:
    Restart-Service MSExchangeTransport
    # Wait 2-3 minutes for the service restart to complete
    Start-Sleep -Seconds 180
    Get-Service MSExchangeTransport | Select-Object Status
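
As an alternative to a fixed wait after the transport restart, the service status can be polled until it reports Running. This is a sketch under the assumption that a five-minute ceiling is acceptable. The function name Wait-ServiceRunning and its default values are hypothetical.

```powershell
# Hypothetical helper: polls a Windows service until it reports Running or the
# timeout expires. Returns $true if the service is Running, $false on timeout.
function Wait-ServiceRunning {
    param(
        [Parameter(Mandatory)] [string] $Name,
        [int] $TimeoutSeconds = 300,
        [int] $PollSeconds = 10
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    do {
        $service = Get-Service -Name $Name
        if ($service.Status -eq 'Running') { return $true }
        Start-Sleep -Seconds $PollSeconds
    } while ((Get-Date) -lt $deadline)
    return $false
}

# Example use:
# Restart-Service MSExchangeTransport
# if (-not (Wait-ServiceRunning -Name 'MSExchangeTransport')) {
#     Write-Warning 'Transport service did not reach Running within the timeout'
# }
```

Polling returns as soon as the service is back, and it turns a stalled restart into an explicit warning that can be acted on before moving to Phase 3.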

Phase 3: Validate DNS & Autodiscover (If HCW-Related)

Objective: Ensure DNS and Autodiscover configuration matches pre-change state (only if HCW modifications are root cause).

  1. Verify Autodiscover DNS: Check that Autodiscover.yourdomain.com resolves correctly:
    nslookup Autodiscover.yourdomain.com
    # Should resolve to either:
    # - Cloud EXO: autodiscover.outlook.com
    # - Hybrid: On-prem server IP (for initial redirect)
  2. Correct DNS if drift: If DNS points to the wrong destination, update it to the pre-change value
  3. Validate Autodiscover endpoint: Test from cloud perspective:
    # From Exchange Online PowerShell
    Get-FederationInformation -DomainName yourdomain.com | Format-List DomainNames, TargetAutodiscoverEpr, TargetApplicationUri
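
The DNS drift check can be made repeatable by comparing the live answer with the value recorded before the change. This is a sketch assuming a Windows host where Resolve-DnsName is available. The function name Test-DnsMatchesExpected is hypothetical, and the expected value has to come from your own pre-change records.

```powershell
# Hypothetical helper: returns $true if the name currently resolves to the
# expected target (a CNAME host name or an IP address), otherwise $false.
function Test-DnsMatchesExpected {
    param(
        [Parameter(Mandatory)] [string] $Name,
        [Parameter(Mandatory)] [string] $Expected
    )
    $records = Resolve-DnsName -Name $Name -ErrorAction Stop
    # CNAME answers expose the target in NameHost; A/AAAA answers expose IPAddress.
    $targets = @($records | ForEach-Object { $_.NameHost; $_.IPAddress } | Where-Object { $_ })
    return ($targets -contains $Expected)
}

# Example use (the expected value is an assumption for a cloud-pointed record):
# Test-DnsMatchesExpected -Name 'Autodiscover.yourdomain.com' -Expected 'autodiscover.outlook.com'
```

A result of $false means the record has drifted from the recorded value and should be corrected before the endpoint is validated.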

Validation & Success Criteria

Confirm rollback success before announcing to users. Run tests in order and allow 5 minutes between sections for full sync:

Immediate Validation (Within 10 minutes)

Critical checks that must PASS before resuming normal operations:

  • Organization Relationship healthy: Test cross-forest communication
    # Replace the placeholders with the relationship name and a test mailbox
    Test-OrganizationRelationship -Identity "<Relationship Name>" -UserIdentity "<test mailbox>" | Format-List
    # Should complete without errors for the primary O365 domain
  • Mail flow stable: Check for backlog in queues
    Get-Queue | Where-Object {$_.MessageCount -gt 0} | Format-List Identity, MessageCount
    # Expect empty or rapidly decreasing queue depths
  • No NDRs: Verify message trace shows no recent delivery failures
    # From Exchange Online PowerShell
    Get-MessageTrace -StartDate (Get-Date).AddHours(-1) -EndDate (Get-Date) -Status Failed | Measure-Object
    # Should be 0 or very low count
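
"Rapidly decreasing" queue depth is easier to judge from two samples than from one. This is a sketch assuming the on-premises Exchange Management Shell. The function names and the 60-second default interval are hypothetical.

```powershell
# Hypothetical helper: adds up MessageCount across all transport queues.
function Get-QueuedMessageTotal {
    $total = 0
    foreach ($queue in (Get-Queue)) { $total += $queue.MessageCount }
    return $total
}

# Hypothetical helper: returns $true if the queues are empty, or if the total
# queued message count dropped between two samples taken $IntervalSeconds apart.
function Test-QueueDraining {
    param(
        [int] $IntervalSeconds = 60
    )
    $before = Get-QueuedMessageTotal
    if ($before -eq 0) { return $true }
    Start-Sleep -Seconds $IntervalSeconds
    $after = Get-QueuedMessageTotal
    return ($after -lt $before)
}

# Example use:
# if (-not (Test-QueueDraining)) { Write-Warning 'Queues are not draining' }
```

A single $false result is not conclusive on a busy server, so repeat the check before treating it as a failed rollback.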

Extended Validation (2-24 hours post-rollback)

Broader checks confirming operations returned to normal:

  • Free-busy functionality: Test cross-forest calendar access. Users should see the availability of users in the other forest when looking up attendees for a meeting invite
  • User escalations: No new support tickets about mail delays, Outlook connectivity, or calendar issues
  • Mailbox migration: If a migration is in progress, confirm moves are resuming and not failing
  • Autodiscover working: Outlook clients should authenticate without errors; no "Please wait while we set up your account" loops
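
For the mailbox migration check, a per-status count of move requests gives a quick view of whether moves are progressing. This is a sketch assuming it runs in the shell where the move requests were created. The function name Get-MoveRequestSummary is hypothetical.

```powershell
# Hypothetical helper: counts move requests per status value so that failed or
# stalled moves stand out after the rollback.
function Get-MoveRequestSummary {
    Get-MoveRequest -ResultSize Unlimited |
        Group-Object -Property Status |
        Select-Object Name, Count
}

# Example use:
# Get-MoveRequestSummary | Format-Table -AutoSize
```

Running the summary twice during the extended validation window shows whether the Failed count is growing or the in-progress moves are completing.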