Outage Communications Template¶
When systems fail, clear communication is as important as technical remediation. Stakeholders need to know what's happening, what the impact is, and when to expect resolution. This template provides structures for incident communication at each phase.
What problem this solves¶
During incidents, communication often breaks down:
- Stakeholders don't know there's an issue until customers complain.
- Updates are inconsistent, vague, or stop coming.
- Different people send conflicting messages.
- After resolution, there's no summary of what happened.
This leads to lost trust, duplicate questions, and wasted time. Structured communication templates ensure consistency, reduce cognitive load during stressful moments, and maintain stakeholder confidence even when things go wrong.
When to use this¶
Use outage communications for:
- SEV1 / SEV2 incidents — Customer-impacting issues that require stakeholder awareness.
- Planned maintenance — Communicate proactively before scheduled downtime.
- Partial degradations — When functionality is impaired but not completely unavailable.
For internal-only issues with no customer impact, external communications may be unnecessary, but the internal update cadence still applies.
Roles and ownership¶
| Role | Responsibility |
|---|---|
| Incident Commander | Decides when to communicate. Approves outgoing messages. |
| Communications Lead | Drafts and sends communications. Manages channels (status page, Slack, email). |
| Engineering Lead | Provides technical context for communications. Estimates resolution time. |
| Support Lead | Handles incoming customer questions. Escalates patterns or critical requests. |
In smaller teams, one person may fill multiple roles. What matters is that communication is someone's explicit responsibility.
Communication timeline¶
| Phase | When | What to communicate |
|---|---|---|
| Acknowledge | Within 15 minutes of detection | We know. We're investigating. |
| Update | Every 30 minutes (or as status changes) | What we know now. What we're doing. |
| Mitigated | When customer impact stops | It's fixed. What happened. |
| Resolved | When root cause is addressed | Full summary. Postmortem link. |
The worst thing is silence. Even "still investigating" is better than no update.
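The "next update" promise in the timeline above can be computed rather than guessed; a minimal sketch, assuming the 30-minute default cadence (the helper name and interval are illustrative):

```python
from datetime import datetime, timedelta, timezone

# Default cadence from the timeline above; tighten for SEV1 if needed.
UPDATE_INTERVAL = timedelta(minutes=30)

def next_update(now: datetime, interval: timedelta = UPDATE_INTERVAL) -> str:
    """Return the 'Next update' timestamp to include in a message."""
    return (now + interval).strftime("%H:%M UTC")

detected = datetime(2024, 1, 15, 14, 32, tzinfo=timezone.utc)
print(next_update(detected))  # 15:02 UTC
```

Committing to a computed time, then setting a reminder for it, is what keeps the "updates stop" failure mode from happening.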
Signals that communications are working¶
- Stakeholders learn about issues from you, not from customers.
- Updates are consistent across channels.
- Customers and internal teams trust that they'll be kept informed.
- After incidents, stakeholders understand what happened and what's being done.
Failure modes and mitigations¶
| Failure mode | What it looks like | Mitigation |
|---|---|---|
| No initial acknowledgment | Stakeholders discover issues from complaints | Set up automated incident channel alerts; assign comms lead immediately |
| Updates stop | Initial message, then silence | Set timer reminders for updates; comms lead owns cadence |
| Overpromising ETAs | "Fixed in 30 minutes" becomes 3 hours | Use ranges; be honest about uncertainty |
| Conflicting messages | Status page says one thing, Slack says another | Single source of truth (status page); mirror to other channels |
| Too technical | Customers don't understand what's happening | Lead with impact; explain in user terms |
The templates¶
Initial acknowledgment¶
Send this within 15 minutes of incident detection.
**[Service/Product] — Investigating an issue**
**Status:** Investigating
**Impact:** [What users are experiencing]
**Started:** [Time, with timezone]
We are aware of an issue affecting [brief description of user impact]. Our team is actively investigating.
We will provide an update within [30 minutes / 1 hour].
**Next update:** [Time]
Example:
**Payments — Investigating an issue**
**Status:** Investigating
**Impact:** Some users are unable to complete checkout.
**Started:** 2024-01-15 14:32 UTC
We are aware of an issue affecting payment processing. Some users may see errors when completing checkout. Our team is actively investigating.
We will provide an update within 30 minutes.
**Next update:** 15:00 UTC
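Filling the acknowledgment fields programmatically keeps every message the same shape under pressure; a minimal sketch using Python's `string.Template` (the field names are illustrative, not a required schema):

```python
from string import Template

# Mirrors the Initial acknowledgment template above.
ACK = Template(
    "$service — Investigating an issue\n"
    "Status: Investigating\n"
    "Impact: $impact\n"
    "Started: $started\n"
    "Next update: $next_update"
)

msg = ACK.substitute(
    service="Payments",
    impact="Some users are unable to complete checkout.",
    started="2024-01-15 14:32 UTC",
    next_update="15:00 UTC",
)
print(msg.splitlines()[0])  # Payments — Investigating an issue
```

`Template.substitute` raises `KeyError` if a field is missing, which is useful here: a half-filled acknowledgment never goes out.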
Progress update¶
Send this at regular intervals, or when status changes.
**[Service/Product] — Update**
**Status:** [Investigating / Identified / Implementing fix]
**Impact:** [Current user impact]
**Duration:** [Time since start]
**Update:**
[What we've learned. What we're doing now.]
[If applicable: Workaround for users]
**Next update:** [Time]
Example:
**Payments — Update**
**Status:** Identified
**Impact:** Some users unable to complete checkout.
**Duration:** 45 minutes
**Update:**
We have identified the cause as a database connection issue. Our team is working to restore connections. Users may retry checkout—some attempts may succeed.
**Next update:** 15:30 UTC
Mitigation notice¶
Send this when user impact stops, even if the root cause isn't fully resolved.
**[Service/Product] — Mitigated**
**Status:** Mitigated
**Impact:** [What was affected] — now resolved
**Duration:** [Total time]
**Summary:**
[What happened, in user-impact terms]
We have mitigated the issue. [Service] is now operating normally. We are continuing to monitor and will investigate the root cause.
A full postmortem will follow within [timeframe].
Example:
**Payments — Mitigated**
**Status:** Mitigated
**Impact:** Checkout failures — now resolved
**Duration:** 1 hour 23 minutes
**Summary:**
Between 14:32 and 15:55 UTC, some users experienced errors during checkout due to a database connection issue. We have restored connections and payments are processing normally.
We are continuing to monitor. A full postmortem will follow within 48 hours.
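The **Duration** field should be derived from the detection and mitigation timestamps rather than estimated; a minimal sketch (the helper name is illustrative):

```python
from datetime import datetime, timezone

def duration_text(start: datetime, end: datetime) -> str:
    """Render total incident duration for the 'Duration' field."""
    minutes = int((end - start).total_seconds() // 60)
    hours, mins = divmod(minutes, 60)
    if hours:
        return f"{hours} hour{'s' if hours != 1 else ''} {mins} minutes"
    return f"{mins} minutes"

# Timestamps from the example above.
start = datetime(2024, 1, 15, 14, 32, tzinfo=timezone.utc)
end = datetime(2024, 1, 15, 15, 55, tzinfo=timezone.utc)
print(duration_text(start, end))  # 1 hour 23 minutes
```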
Resolution / Postmortem summary¶
Send this after the postmortem is complete.
**[Service/Product] — Incident resolved**
**Incident:** [Brief title]
**Date:** [Date]
**Duration:** [Total duration]
**Impact:** [Summary of user impact]
**Root cause:**
[Plain-language explanation of what happened]
**What we're doing:**
[Key actions from the postmortem]
- [Action 1]
- [Action 2]
- [Action 3]
**Full postmortem:** [Link, if appropriate to share]
We apologize for the disruption and appreciate your patience.
Internal Slack announcement¶
For internal channels (e.g., #incidents, #engineering).
:rotating_light: **Incident: [Brief title]**
**Severity:** SEV[1/2/3]
**Impact:** [What's affected]
**Incident channel:** #inc-[date]-[slug]
**Status page:** [Link]
**Current status:** [Investigating / Identified / Mitigating]
**IC:** @[name]
**Comms:** @[name]
Updates will be posted in the incident channel.
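The `#inc-[date]-[slug]` channel name can be generated consistently from the incident title; a minimal sketch, assuming a lowercase, hyphen-separated slug convention:

```python
import re
from datetime import date

def incident_channel(title: str, day: date) -> str:
    """Build an #inc-[date]-[slug] channel name from an incident title."""
    # Collapse any run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"#inc-{day.isoformat()}-{slug}"

print(incident_channel("Checkout failures", date(2024, 1, 15)))
# #inc-2024-01-15-checkout-failures
```

Deterministic naming means anyone can find the channel from the announcement alone, without hunting through a channel list.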
Customer email template¶
For direct customer communication (support tickets, account managers).
Subject: [Service] incident — [Resolved/Update]
Hi [Name],
We wanted to let you know about an incident that may have affected your use of [Product].
**What happened:**
[Plain-language summary]
**Impact to you:**
[Specific impact, if known]
**What we're doing:**
[Actions being taken or completed]
**Current status:** [Resolved / Monitoring]
We apologize for any inconvenience. If you have questions or noticed any issues, please reply to this email or contact support at [email/link].
Best,
[Your name]
[Company]
Communication channels checklist¶
During incidents, update all relevant channels:
- Status page — Public-facing status (e.g., Statuspage, Cachet)
- Internal Slack — #incidents or #engineering channel
- Customer Slack — Shared channels with customers, if applicable
- Support team — Brief them on what to tell customers
- Sales/Account managers — For key accounts
- Email — For direct notification of affected users
- Social media — For major incidents, if your company uses it for status
Related pages¶
- Crisis: Outage Communication Playbook — Full process for incident communications.
- Postmortem Template — Document what happened after resolution.
- Runbook Template — Operational procedures for incident response.
- Delivery: Incident Response — End-to-end incident handling.