Prevent incidents before external discovery

Authored runbooks for rapid diagnosis and safe remediation

Each runbook lists its authors, review dates, pre-checks, error-code mapping, and rollback guidance, so it is ready for production use and audits. Fast diagnostics reduce MTTR by 30–40%, avoiding $50K–$500K+ in costs per incident. The guidance is driven by threat intelligence, not just static rules.

Featured runbooks: Threat prevention & rapid recovery

Evidence-based guidance with signals, actions, validation, and rollback steps. Each guide is designed to help prevent compliance findings, shorten time to recovery, and stop threats before they are discovered externally.

Mail flow

Inbound mail not received

Author: Amelia R. Patel | Reviewed: 2025-12 | Version: 1.4

  1. Pre-check: service health; capture NDR codes and message trace samples.
  2. DNS: confirm the MX record points to Microsoft 365; check SPF/DKIM/DMARC alignment.
  3. Connectors: validate inbound scope, cert/TLS, and accepted domains.
  4. Policies: inspect transport/anti-spam rules for false positives.
  5. Validate: controlled external test; verify headers and TLS.
  6. Rollback: revert connector or rule edits; document diffs. See mail flow rollback.
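
The DNS step above can be partly automated once the records have been fetched. The following is a minimal sketch, not a definitive check: it assumes the MX and TXT answers were already retrieved by some other tool, and that the tenant uses the conventional `*.mail.protection.outlook.com` MX target and the `spf.protection.outlook.com` SPF include (some tenants may use a different MX target). The function names are hypothetical.

```python
from typing import Iterable, Sequence, Tuple

# Assumption: the conventional Exchange Online MX suffix and SPF include.
# Some tenants may legitimately use a different MX target.
M365_MX_SUFFIX = ".mail.protection.outlook.com"
M365_SPF_INCLUDE = "include:spf.protection.outlook.com"


def mx_points_to_m365(records: Sequence[Tuple[int, str]]) -> bool:
    """Check whether the most-preferred MX host looks like Exchange Online.

    `records` is a sequence of (preference, hostname) pairs that were
    already fetched from DNS; the lowest preference value wins.
    """
    if not records:
        return False
    _, host = min(records, key=lambda record: record[0])
    return host.rstrip(".").lower().endswith(M365_MX_SUFFIX)


def spf_includes_m365(txt_records: Iterable[str]) -> bool:
    """Check whether the domain's SPF record includes Microsoft 365."""
    for txt in txt_records:
        parts = txt.lower().split()
        if parts and parts[0] == "v=spf1":
            return M365_SPF_INCLUDE in parts
    return False
```

A failed check here is only a signal to look closer; message trace and NDR codes from the pre-check remain the primary evidence.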
See detailed troubleshooting steps
Clients

Outlook password prompts

Author: Jonas M. Clarke | Reviewed: 2025-11 | Version: 1.3

  1. Pre-check: modern auth enabled tenant-wide; disable unused legacy protocols.
  2. Auth: clear cached creds; repair profile; confirm autodiscover endpoints.
  3. CA (Conditional Access): align policies; resolve conflicts between grant controls and require-MFA controls. See CA lockout guide.
  4. Validate: OWA and Outlook sign-in stability; no repeated prompts.
  5. Rollback: restore prior CA policies if token issuance is impacted.
Troubleshoot prompts
Clients

Autodiscover resolution failures

Author: Jonas M. Clarke | Reviewed: 2025-10 | Version: 1.2

  1. DNS: ensure the Autodiscover CNAME points to autodiscover.outlook.com, or that an SRV fallback record exists.
  2. Hybrid: validate certificates and published endpoints; renew mismatches.
  3. Clients: remove stale SCPs; test via Remote Connectivity Analyzer.
  4. Validate: new profile resolves EXPR endpoints; connection status verified.
  5. Rollback: restore prior DNS if change degrades connectivity.
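
As an illustration of the DNS step in this runbook, the sketch below classifies already-fetched DNS answers into the CNAME path, the SRV fallback path, or neither. It assumes the lookups were done elsewhere; `classify_autodiscover` and its return labels are hypothetical names chosen for this example.

```python
from typing import Optional, Sequence

# The CNAME target named in step 1 of the runbook.
EXPECTED_CNAME = "autodiscover.outlook.com"


def normalize(host: str) -> str:
    """Lower-case a hostname and drop whitespace and the trailing root dot."""
    return host.strip().rstrip(".").lower()


def classify_autodiscover(
    cname_target: Optional[str], srv_targets: Sequence[str]
) -> str:
    """Return 'cname', 'srv-fallback', or 'unresolved'.

    `cname_target` is the answer for the Autodiscover CNAME lookup (or None),
    and `srv_targets` holds the target hosts of any Autodiscover SRV records.
    """
    if cname_target and normalize(cname_target) == EXPECTED_CNAME:
        return "cname"
    if any(normalize(target) for target in srv_targets):
        return "srv-fallback"
    return "unresolved"
```

An "unresolved" result would point back to step 1, while "srv-fallback" suggests confirming that the SRV target itself serves a valid certificate.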
Review Autodiscover steps
Storage

Mailbox quota remediation

Author: Priya Desai | Reviewed: 2025-09 | Version: 1.1

  1. License: increase mailbox size or enable archive mailbox.
  2. Retention: configure policies and auto-expanding archive respecting holds.
  3. Recoverable Items: clear within legal constraints; monitor growth.
  4. Validate: confirm send/receive restored; set alerts for thresholds.
  5. Rollback: revert quota changes only after verifying impact.
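
The threshold alerts mentioned in the validation step can be sketched as a small classifier. This is an assumption-laden example: the 90% and 98% thresholds are placeholders rather than Microsoft defaults, and the usage figures are expected to come from whatever mailbox statistics export is already in place.

```python
def quota_status(
    used_bytes: int,
    quota_bytes: int,
    warn_at: float = 0.90,      # hypothetical warning threshold
    critical_at: float = 0.98,  # hypothetical critical threshold
) -> str:
    """Classify mailbox usage as 'ok', 'warning', or 'critical'."""
    if quota_bytes <= 0:
        raise ValueError("quota_bytes must be positive")
    ratio = used_bytes / quota_bytes
    if ratio >= critical_at:
        return "critical"
    if ratio >= warn_at:
        return "warning"
    return "ok"
```

Tune the thresholds to leave enough lead time for enabling an archive or adjusting retention before send/receive is affected.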
See quota guidance
Performance

Throttling during migrations

Author: Priya Desai | Reviewed: 2025-12 | Version: 1.5

  1. Scope: distribute moves across service accounts and mailboxes.
  2. Limits: honor Graph/EWS concurrency guidance; use backoff.
  3. Scheduling: run heavy operations in approved windows.
  4. Validate: monitor 429/503 reduction and move throughput.
  5. Rollback: pause batches; revert throttling policy changes if required.
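
The backoff guidance in step 2 can be sketched as follows. This is a minimal illustration, not the service's prescribed client behavior: it assumes a hypothetical `send` callable that returns a `(status, headers)` pair, honors a numeric `Retry-After` header when one is present, and otherwise falls back to exponential backoff with full jitter.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# Status codes treated as throttling signals, per the validation step.
RETRYABLE = {429, 503}

Response = Tuple[int, Mapping[str, str]]  # (status code, headers)


def retry_delay(
    attempt: int,
    headers: Mapping[str, str],
    base: float = 1.0,
    cap: float = 60.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait before the next attempt (attempt is zero-based)."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not handled in this sketch
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling  # full jitter: uniform in [0, ceiling)


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it returns a non-throttled status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers = send()
        if status not in RETRYABLE:
            break
        if attempt < max_attempts - 1:
            sleep(retry_delay(attempt, headers))
    return status, headers
```

Injecting `sleep` and `rng` keeps the helper testable without real waiting; the header lookup is case-sensitive here, so a real client should normalize header names first.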
Reduce throttling
Security

Harden inbound mail

Author: Amelia R. Patel | Reviewed: 2025-12 | Version: 1.2

  1. Authentication: enforce SPF, DKIM, and DMARC with reporting.
  2. Transport: tighten connector scoping and TLS requirements.
  3. Policies: enable anti-phish, anti-spam, and malware baselines.
  4. Quarantine: configure review workflows and release approvals.
  5. Validate: send signed test mails; verify alignment and delivery.
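
For the authentication step, a DMARC TXT record can be checked for an enforcing policy and a reporting address. The sketch below is a simplified parser under stated assumptions: it takes the TXT value as a string that was already fetched, treats `p=quarantine` or `p=reject` as enforcing, and requires an `rua` tag for aggregate reports. The function names are hypothetical.

```python
from typing import Dict


def parse_dmarc(txt: str) -> Dict[str, str]:
    """Split a DMARC TXT value into a tag -> value mapping."""
    tags: Dict[str, str] = {}
    for part in txt.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def dmarc_is_enforcing(txt: str) -> bool:
    """True if the record enforces a policy and requests aggregate reports."""
    tags = parse_dmarc(txt)
    return (
        tags.get("v") == "DMARC1"
        and tags.get("p", "").lower() in {"quarantine", "reject"}
        and "rua" in tags
    )
```

A record with `p=none` is a reasonable starting point while reports are reviewed, but it would fail this check by design.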
Strengthen security

Advanced scenario guides

Real-world cases with diagnostic workflows, escalation trees, and validation for complex environments.

Scenario

O365 migration throttling at scale

Author: Jonas M. Clarke | Updated: 2025-12

Challenge: 50K-mailbox migration hitting throttling errors (429/1300) despite phased approach.

Root cause: Combined load from the concurrent migration batch size, background sync operations (OneDrive, Teams), and a legacy SMTP relay.

Solution: Reduce batch concurrency from 5 to 2 concurrent migrations; pause non-critical sync; optimize SMTP relay rate limits.

Result: Throttling eliminated; migration completed on schedule with zero user impact.
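
The concurrency cap described in this scenario can be illustrated with a semaphore. This is a generic sketch rather than the tooling used in the engagement: `run_limited` is a hypothetical helper, and `worker` stands in for whatever coroutine starts and awaits a migration batch.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_limited(
    items: Sequence[T],
    worker: Callable[[T], Awaitable[R]],
    max_concurrent: int = 2,  # the reduced concurrency from the scenario
) -> List[R]:
    """Run `worker` over `items` with at most `max_concurrent` in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(item: T) -> R:
        # Each task waits here until one of the slots is free.
        async with semaphore:
            return await worker(item)

    # gather preserves the order of `items` in its results.
    return list(await asyncio.gather(*(guarded(item) for item in items)))
```

Lowering `max_concurrent` trades throughput for headroom, which is the same trade the scenario made when moving from 5 to 2 concurrent migrations.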

Talk to an Exchange Security Specialist
Scenario

Hybrid namespace DNS conflict recovery

Author: Jonas M. Clarke | Updated: 2025-11

Challenge: HCW reports namespace mismatch; external clients cannot resolve hybrid mailboxes.

Root cause: Split DNS configuration; internal DNS returns on-prem server; external returns Exchange Online.

Solution: Add autodiscover.company.com CNAME; update internal DNS scopes; validate SRV records for Outlook auto-config.

Result: Mixed on-prem/cloud mailbox access restored; zero credential resets required.

Scenario

Conditional Access policy lock-out mitigation

Author: Amelia R. Patel | Updated: 2025-10

Challenge: Tenant-wide CA policy blocking all interactive logins; users locked out of OWA.

Root cause: The policy required both MFA and a compliant device; with legacy devices and gaps in MFA registration, every sign-in was rejected.

Solution: Exclude break-glass accounts; create phased rollout with exclusion groups; grace period for MFA registration.

Result: Access restored within 30 minutes; phased rollout completed in 2 weeks with zero compliance loss.
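
A pre-deployment check for the break-glass exclusion could look like the sketch below. It assumes policies have been exported as JSON in the shape used by Microsoft Graph conditional access policies (`state`, `displayName`, `conditions.users.excludeUsers`), and it only inspects direct user exclusions, not group-based ones. The helper names and sample IDs are hypothetical.

```python
from typing import Any, Iterable, List, Mapping


def missing_break_glass(
    policy: Mapping[str, Any], break_glass_ids: Iterable[str]
) -> List[str]:
    """Return break-glass account IDs that the policy does not exclude."""
    conditions = policy.get("conditions") or {}
    users = conditions.get("users") or {}
    excluded = set(users.get("excludeUsers") or [])
    return sorted(set(break_glass_ids) - excluded)


def risky_policies(
    policies: Iterable[Mapping[str, Any]], break_glass_ids: Iterable[str]
) -> List[str]:
    """Names of enabled policies that could lock out a break-glass account."""
    ids = list(break_glass_ids)
    names: List[str] = []
    for policy in policies:
        if policy.get("state") != "enabled":
            continue  # report-only and disabled policies cannot block sign-in
        if missing_break_glass(policy, ids):
            names.append(policy.get("displayName", "<unnamed>"))
    return names
```

Running a check like this before switching a policy from report-only to enabled gives a cheap guard against the tenant-wide lock-out described above.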


Need tailored guidance or a free diagnostic?

Request a principal engineer to pair on your scenario with change-control-ready steps. Download our free Exchange diagnostic checklist.

Frequently asked questions

Common questions about our runbooks, methodology, and support model.

Who reviews and signs the runbooks?

Every runbook is authored and reviewed by named principal engineers with Microsoft certifications and years of Exchange Online expertise. See our team bios and credentials.

How often are runbooks updated?

We review all content quarterly. When Microsoft releases Exchange Online changes or we discover new patterns, we revise guides within 30 days and update the version number and review date.

Can you execute changes for us?

We provide production-safe, change-control-ready steps, and you decide whether and when to execute them. For critical incidents, we can pair with you in real time to validate steps, but you retain control of your credentials.

How do you protect my data?

We never ask for credentials. We avoid collecting secrets, use least-privilege logging, and comply with GDPR and SOC 2 Type II standards. See our privacy policy and explore our security services.

What if a runbook doesn't match my scenario?

Request an Exchange security assessment. We'll pair a principal engineer with your case, create a custom runbook, and add it to the knowledge base with author and review info.

Do you provide SLAs?

Yes. P1 incidents: 10-minute acknowledgment, 30-minute diagnostic plan. P2 incidents: 30-minute acknowledgment, 90-minute plan. Learn about our services.