What are the four launch phases?

Pre-launch (T-30 days to T-1 hour), where everything is verified before the window: scope locked, QA complete, performance baselined, rollback documented and tested, DNS TTL lowered, and a go/no-go held. Cutover (T-0), the sequenced launch steps each with an owner and a verification. Verification (T+0 to T+24 hours), confirming the launch is healthy across critical flows, error rates, and performance. And stabilization (T+24 hours to T+7 days), monitoring the long tail day over day and planning the after-action report.

A launch lead owns the runbook and calls go/no-go and rollback; a deploy operator executes the technical steps; a QA lead runs verification and confirms each milestone; a comms lead handles internal updates and external messaging; an on-call engineer stays reachable during and after; and a stakeholder rep approves on behalf of the business. On a small team one person may fill several roles, but each role's responsibilities should still be explicit, because role ambiguity is the most common cause of launch chaos.

How do I handle rollback?

Define the criteria before launch, because the call is easier to make pre-emptively than under pressure. Automatic triggers include an error rate beyond a set threshold, a broken critical user flow, a database-integrity issue, and a security hole found post-deploy; discretionary triggers include performance degradation and a drop in a key business metric. The launch lead calls rollback, with a pre-defined deputy if they are unavailable, and the rollback procedure must be tested in advance, because an untested rollback is hope rather than procedure.

When should I not launch?

Not at the end of the day on a Friday, and not right before a holiday, because both cut the time available to respond if something breaks. Not without on-call coverage for the first 24 hours, since someone must be reachable. And not by skipping pre-launch QA to hit a date: the bugs do not disappear, they simply surface on launch day in front of users instead of in testing. For multi-day launches, plan rest cycles, because launch fatigue leads to errors.

How is this different from incident-response?

The launch runbook is forward-looking: it plans and executes a deliberate go-live. Incident-response is for the unplanned break, an active production issue that needs detection, triage, and mitigation. Pre-launch testing belongs to qa-testing, and the retrospective afterward belongs to the after-action report. The runbook is what you follow when everything goes to plan and the lever you pull (rollback) when it does not.

Skill · Launch runbook

Launch runbook.

The runbook is the document everyone runs on launch day.

Plan and execute the launch of a site, product, or major release. The runbook is the document the whole team runs on launch day, and it covers four phases: pre-launch verification, the cutover, verification, and stabilization. Clear roles and pre-defined rollback criteria keep the day from turning into chaos.

The decisions that matter are made before the window opens. Define rollback criteria in advance, test the rollback, and walk the runbook as a team, because the moment to discover a broken rollback is not the moment you need it.

Audience: engineering and product teams launching a site or major release, planning a DNS cutover, or coordinating a cross-team go-live.

View the skill on GitHub Browse the full catalog

The framework

Four phases the runbook covers.

A launch has four phases. The runbook covers all of them, from a month out to a week after.

01Pre-launch (T-30 days to T-1 hour): lock scope, complete QA, measure a performance baseline, document and test the rollback, lower DNS TTL, hold the go/no-go, and back up production.
02Cutover (T-0): the sequenced launch steps, each with an owner, pre-conditions, an action, a verification, a time estimate, and a rollback. Announce, migrate, deploy, smoke-test, DNS cutover, full smoke test, announce.
03Verification (T+0 to T+24 hours): confirm the critical flows (checkout, signup, login), watch error rates and performance, and check analytics, notifications, and key business metrics.
04Stabilization (T+24 hours to T+7 days): track error rates and performance day over day against baseline, address non-blocking issues, and plan the after-action report.

Decisions made in advance

Test the rollback before you need it.

Define the rollback criteria before the launch, because the decision is easier to make pre-emptively than under pressure. Automatic triggers (an error rate over threshold, a broken critical flow, a data-integrity issue, a post-deploy security hole) and discretionary ones (performance or business-metric degradation) both name who acts, and the launch lead calls rollback, with a pre-defined deputy if they are unavailable.

Test the rollback before you need it, because an untested rollback is hope, not procedure. Run a tabletop walkthrough of the whole runbook with the full team to surface the gaps, write each cutover step with an owner and a verification rather than a vague 'deploy to production', and verify each step before moving to the next so an error does not propagate down the sequence.

Timing matters. Launching at end of day Friday or before a holiday cuts the time available to respond, and someone must be on-call and reachable for the first 24 hours. Skipping pre-launch QA to hit a date does not remove the bugs; it moves them to launch day, in front of users.

Reference files

The reference that goes alongside the SKILL.md.

references/runbook-template.md
A fillable runbook template with example cutover sequences.

Browse all reference files on GitHub

Bridges to other skills

Before, during, and after the launch.

The runbook owns launch day. These cover the testing before it, the break during it, the move it often executes, and the retro after.

Before the launch
qa-testing
Pre-launch testing is QA's job, and the runbook's T-7 milestone depends on it being complete. Skipping it to hit a date just moves the bugs to launch day.
If it breaks
incident-response
When the launch goes sideways and users are impacted, incident response takes over. The runbook is the plan; incident response is the unplanned break.
Platform moves
content-migration
A platform move is a launch with a redirect map. Run the migration skill for the URL and cutover plan, and this runbook for the go-live mechanics around it.
After the launch
after-action-report
Scheduled within one to two weeks of the launch, the retrospective captures what worked and what to fix, feeding the next runbook.

Open source under MIT

Read the SKILL.md on GitHub.

The skill source lives in the rampstackco/claude-skills repository alongside dozens of other skills covering the full lifecycle of brand and product work. This page is a structured overview; the SKILL.md is the source. MIT licensed.

View SKILL.md Browse the full catalog

Frequently asked questions.

What are the four launch phases?: Pre-launch (T-30 days to T-1 hour), where everything is verified before the window: scope locked, QA complete, performance baselined, rollback documented and tested, DNS TTL lowered, and a go/no-go held. Cutover (T-0), the sequenced launch steps each with an owner and a verification. Verification (T+0 to T+24 hours), confirming the launch is healthy across critical flows, error rates, and performance. And stabilization (T+24 hours to T+7 days), monitoring the long tail day over day and planning the after-action report.
Who runs a launch?: A launch lead owns the runbook and calls go/no-go and rollback; a deploy operator executes the technical steps; a QA lead runs verification and confirms each milestone; a comms lead handles internal updates and external messaging; an on-call engineer stays reachable during and after; and a stakeholder rep approves on behalf of the business. On a small team one person may fill several roles, but each role's responsibilities should still be explicit, because role ambiguity is the most common cause of launch chaos.
How do I handle rollback?: Define the criteria before launch, because the call is easier to make pre-emptively than under pressure. Automatic triggers include an error rate beyond a set threshold, a broken critical user flow, a database-integrity issue, and a security hole found post-deploy; discretionary triggers include performance degradation and a drop in a key business metric. The launch lead calls rollback, with a pre-defined deputy if they are unavailable, and the rollback procedure must be tested in advance, because an untested rollback is hope rather than procedure.
When should I not launch?: Not at the end of the day on a Friday, and not right before a holiday, because both cut the time available to respond if something breaks. Not without on-call coverage for the first 24 hours, since someone must be reachable. And not by skipping pre-launch QA to hit a date: the bugs do not disappear, they simply surface on launch day in front of users instead of in testing. For multi-day launches, plan rest cycles, because launch fatigue leads to errors.
How is this different from incident-response?: The launch runbook is forward-looking: it plans and executes a deliberate go-live. Incident-response is for the unplanned break, an active production issue that needs detection, triage, and mitigation. Pre-launch testing belongs to qa-testing, and the retrospective afterward belongs to the after-action report. The runbook is what you follow when everything goes to plan and the lever you pull (rollback) when it does not.