Flagship Skill · Feature flagging

The feature flagging skill.

Flags as infrastructure, not as accumulating debt.

A senior engineer's playbook for using feature flags well, not just frequently. It codifies the operational discipline of treating flags as production infrastructure: the five flag types and the rule against mixing them, naming conventions, the lifecycle from birth through death, targeting rules, rollout strategies, stale flag cleanup, governance, and the technical-debt patterns that bite teams that weren't deliberate.

Audience: product managers and the engineers running production code behind their flags.

What this skill is for

The discipline that prevents the technical-debt outcome.

Feature flags are infrastructure. Treated as such, they enable kill switches, gradual rollouts, A/B experiments, permission gates, and operational toggles without redeploys. Treated casually, they become the largest accumulating technical debt in your codebase: thousands of dead flags, conflicting evaluation logic, brittle targeting, and a permission surface no one fully understands.

This skill is the discipline that prevents the second outcome. It assumes you have a feature flag platform; it does not advocate for one. It assumes your engineering team can implement targeting rules and SDK integration. The hard part is the operational discipline, and that is what this skill provides.

The output is not platform configuration. The output is a flag inventory that stays healthy quarter over quarter: every flag has a clear type, owner, lifecycle, and rollout story. Stale flags get removed. Permissions are tiered correctly. Rollouts actually use their pre-committed abort criteria.

What is in the skill

Fourteen considerations covered in the body.

The SKILL.md spans the full flag lifecycle plus the cross-cutting operational concerns (governance, performance, observability) that make a flag inventory healthy at quarter nine.

  1. What this skill is for

    The skill spans the operational lifecycle of a flag from creation through retirement. It does not cover experiment design, statistical analysis, or platform-specific tooling.

  2. The five flag types

    Release, experiment, operational, permission, configuration. Each has different lifetime and removal expectations. Mixing them in one flag is the root cause of most flag mess.

  3. Flag naming conventions

    Typed prefix, owner prefix, semantic name, version or date. Vague names die a slow death; well-named flags survive code review and the cleanup playbook.

  4. The flag lifecycle

    Birth, adolescence, launch, maturity, death. Birth is fast (one PR); death requires intentional cleanup. Most flag mess is unfinished death.

  5. Targeting rules and segmentation

    Four target dimensions: user, account, request, time-based. Compose with AND, OR, NOT. Avoid volatile attributes; if your rule needs three nested clauses, your taxonomy is wrong.

  6. Rollout strategies

    Percentage, cohort, geo-staged, time-based, and combination strategies. The ramp-and-watch rule: at each step, monitor for at least one peak hour before advancing.

  7. Stale flag management

    Quarterly cadence. 30 days for release and experiment flags, 90 days for operational. One PR per removal. Make removal part of the launch checklist, not a separate effort.

  8. Governance and permissions

    Permission tiers (viewer, editor, approver, admin). Environment promotion (dev to staging to production with review gates). Audit trail. Emergency override drilled in incidents.

  9. Flag dependencies and conflicts

    Dependency: flag B requires flag A on. Conflict: two flags target the same surface. Detect via shared-key audit; coordinate cross-team via the experiment registry.

  10. Performance considerations

    Cache aggressively, use bulk evaluation, prefer server-side SDKs for sensitive logic. 5 ms total budget for fifty flag checks per request.

  11. Testing flag-gated code

    Both branches covered. Document transition behavior. Use test-only flag overrides for integration tests. Catch the staging-versus-production rule drift via staged rollout.

  12. Rollback discipline

    Flags enable instant rollback of the gated change. They are not a substitute for code rollback. Practice via incident drills.

  13. Observability on flags

    Log flag value as a contextual field on every request. Alert on evaluation rate changes. Build dashboards for production rollouts. Connect to error tracking during incidents.

  14. Common failures

    Rapid-fire pattern catalog: rollout rule wrong, cached value, dead flag pile, two-flag conflict, evaluation latency, permission tiering missing.
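The AND/OR/NOT composition described under targeting rules and segmentation can be sketched with plain predicates. This is a minimal illustration, not any platform's rule engine; the attribute names (`email_domain`, `plan`, `country`) and helper functions are hypothetical.

```python
# Hypothetical sketch of composable targeting rules (AND / OR / NOT).
# Attribute names and helpers are illustrative, not a platform's API.

def attr(name, allowed):
    """Match when the context attribute is in the allowed set."""
    return lambda ctx: ctx.get(name) in allowed

def all_of(*rules):   # AND
    return lambda ctx: all(r(ctx) for r in rules)

def any_of(*rules):   # OR
    return lambda ctx: any(r(ctx) for r in rules)

def not_(rule):       # NOT
    return lambda ctx: not rule(ctx)

# "Internal users, or enterprise accounts outside an embargoed region."
rule = any_of(
    attr("email_domain", {"example.com"}),
    all_of(attr("plan", {"enterprise"}), not_(attr("country", {"XX"}))),
)
```

If a rule needs more nesting than this, the skill's advice applies: fix the taxonomy, not the rule.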

Reference files

Seven references that go alongside the SKILL.md.

Each reference is a self-contained doc the team can lift into a project: checklists, pattern catalogs, and the worked-out playbook for stale flag cleanup.

  • references/flag-naming-conventions.md

    Typed prefixes (release_, exp_, ops_, perm_, cfg_), owner prefixes, semantic naming patterns, and the migration plan for existing badly-named flags.

  • references/flag-lifecycle-checklist.md

    Phase-by-phase checklist (birth, adolescence, launch, maturity, death, audit) with explicit entry and exit criteria per phase.

  • references/flag-types-reference.md

    The five flag types in detail with worked examples, common pitfalls, and the anti-pattern of type drift.

  • references/stale-flag-cleanup-playbook.md

    Quarterly cleanup process: report, owner triage, triage meeting, removal PRs (one per flag), platform deletion, verification. Plus the orphan-ownership pattern.

  • references/targeting-rule-patterns.md

    Common patterns (percentage, internal-only, opt-in beta, cohort, geo, time-based, composition) and anti-patterns (volatile attributes, deeply-nested rules, drift between staging and production).

  • references/flag-rollout-strategies.md

    Five rollout strategies in detail with hold times and abort criteria. Worked example for a high-risk checkout redesign launch with the full 80-day timeline.

  • references/governance-and-permissions.md

    Permission tiers, environment-based scope, approval workflow, audit trail, emergency override, service accounts, the production-console-freeze anti-pattern.
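The typed-prefix convention from flag-naming-conventions.md lends itself to mechanical checking in code review or CI. The pattern below (type prefix, owner segment, semantic name, version-or-quarter suffix) is one plausible reading of the convention, not the reference's canonical regex.

```python
import re

# One plausible encoding of the naming convention, e.g.
#   release_checkout_new_payment_form_2025q1
#   exp_growth_signup_cta_v2
# Segment rules here are illustrative, not the reference's canonical regex.
FLAG_NAME = re.compile(
    r"^(release|exp|ops|perm|cfg)_"   # typed prefix
    r"([a-z0-9]+)_"                   # owner (team) prefix
    r"([a-z0-9_]+?)_"                 # semantic name
    r"(v\d+|\d{4}q[1-4])$"            # version or quarter suffix
)

def is_valid_flag_name(name: str) -> bool:
    return FLAG_NAME.match(name) is not None
```

A check like this is cheap to run as a lint rule, which is how well-named flags "survive code review" without relying on reviewer vigilance.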

Browse all reference files on GitHub

Where to use it

The full flag lifecycle.

At creation. Pick the type. Apply the naming convention. Set the target removal date in metadata. Document the rollout plan and abort criteria. The pre-experiment-readiness checklist for experiments applies to flag-gated rollouts too.

During rollout. The ramp-and-watch discipline. One peak hour at each percentage step. Abort criteria pre-committed. Production rules promoted from staging, not authored directly in production.
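Ramp-and-watch only works if bucketing is stable: a user included at 5% must still be included at 25%, so each step only adds traffic. A generic hash-based sketch, not any particular SDK's bucketing algorithm:

```python
import hashlib

# Stable percentage bucketing: a user's bucket never changes, so ramping
# 1% -> 5% -> 25% only ever adds users. The flag key is mixed into the
# hash so different flags bucket independently. Generic sketch only.
def in_rollout(flag_key: str, user_id: str, percent: float) -> bool:
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF * 100  # uniform in [0, 100]
    return bucket < percent
```

Because the bucket depends only on the flag key and user ID, advancing the percentage between peak-hour observation windows never flips anyone out of the treatment mid-ramp.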

Post-launch. The 30-day review. Confirm production behavior matches the rollout expectation. Schedule the removal PR for release and experiment flags.

Quarterly cadence. The stale flag cleanup playbook. Generate the report, triage by owner, open one removal PR per flag, delete from the platform after the PR ships. Without this cadence, dead flags compound.
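The quarterly report itself is a simple filter over flag metadata, using the playbook's thresholds (30 days for release and experiment flags, 90 for operational) and grouping by owner for triage. The flag-record shape here is assumed, not prescribed:

```python
from datetime import date

# Sketch of the quarterly stale-flag report: flags older than their
# type's threshold, grouped by owner for triage. Record shape is assumed.
STALE_AFTER = {"release": 30, "exp": 30, "ops": 90}

def stale_flags(flags, today):
    """Return {owner: [flag names]} for flags past their type's threshold."""
    report = {}
    for f in flags:
        limit = STALE_AFTER.get(f["type"])
        if limit is None:          # perm/cfg flags are long-lived; skip
            continue
        if (today - f["created"]).days > limit:
            report.setdefault(f["owner"], []).append(f["name"])
    return report
```

The output maps directly onto the playbook's next steps: one triage conversation per owner, one removal PR per flag.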

Where this skill goes next

Skill 2 of 3 in the PM-experimentation suite.

Feature-flagging is the operational layer below experiment-design. Most experiments are delivered via feature flags, so the two skills compose. Experiment-design covers the discipline above (hypothesis writing, sample size, decision-making); this skill covers the operational layer below (flag types, lifecycle, rollout, governance).

experimentation-analytics is the third skill in the suite, covering the analytical layer: variance reduction techniques (CUPED, stratified sampling, control variates), Bayesian alternatives, sequential testing math, and deeper interpretation of marginal results. Its landing page lands when the skill ships.

An optional fourth skill, experimentation-platform-orchestrator, may follow after the three foundational skills land. That skill schedules; this skill operates; experiment-design designs.

Open source under MIT

Read the SKILL.md on GitHub.

The skill source lives in the rampstackco/claude-skills repository alongside dozens of other skills covering the full lifecycle of brand and product work. MIT licensed.

Frequently asked questions.

Why does the skill insist on five flag types?
Because mixing flag types is the root cause of most flag mess. A flag that started as a release ramp, got repurposed as an experiment, then ended up gating a permission, has no clear lifecycle and no clear owner. The five types (release, experiment, operational, permission, configuration) have different lifetimes and removal expectations. Picking one at creation and refusing to let it drift is the discipline.
Does it depend on a specific platform?
No. The principles work on LaunchDarkly, Flagsmith, Split.io, VWO FME, GrowthBook, Statsig, PostHog, and Optimizely equally. For platform-specific MCP commands and example prompts, pair this skill with the matching /integrations/{platform} microsite. The skill produces the operational shape; the microsite shows how the platform implements it.
How does this differ from experiment-design?
Experiment-design covers the discipline above: hypothesis writing, sample size, decision-making. Feature-flagging covers the operational layer below: flag types, naming, lifecycle, rollout, stale flag cleanup, governance. Most experiments are delivered via feature flags, so the two skills compose. Use experiment-design when designing the test; use feature-flagging when implementing and operating the flag that gates the test.
What about flag-as-config patterns and dynamic configuration?
Configuration flags are one of the five types. They live alongside the others, with their own lifecycle: governed by sales and product agreements, long-lived, evolved as contracts change. The skill covers them explicitly so teams treat configuration flags as first-class infrastructure rather than a special case.
Why does the skill spend so much time on stale flag cleanup?
Because cleanup is the discipline that separates teams with healthy flag inventories from teams with hundreds of dead flags. The cost of leaving flags in is real (two code paths to maintain, evaluation overhead, mental load) and compounds across hundreds of flags. The fix is a quarterly cadence plus making removal part of the launch checklist, not a separate effort.