Flagship Skill · Experimentation platform orchestrator
The experimentation platform orchestrator skill.
Pick the right platform. Migrate when wrong. Coordinate when multi-platform.
A senior product and engineering leader's playbook for the experimentation platform decision. When to use Statsig vs PostHog vs GrowthBook vs Optimizely vs Amplitude vs Eppo vs Kameleoon. How to migrate when you have already chosen wrong. When multi-platform is genuinely warranted vs when to consolidate. The decisions that compound for years and the ones you can defer.
Audience: product and engineering leaders, founders selecting a platform for the first time, heads of product evaluating consolidation, data leads owning experimentation infrastructure.
What this skill is for
The platform decision is the one that compounds for years.
Picking an experimentation platform is one of those decisions that looks easy at the start and compounds for years afterward. The wrong choice costs you lost experiments (because the team avoids the painful workflow), money (because the wrong pricing model penalizes your usage shape), vendor lock-in (because migration is real engineering work, not a config change), and cultural drift (because the platform's defaults shape what the team thinks experimentation is).
This skill is the discipline that makes the decision well the first time, and the migration plan when you didn't. It assumes the experiment design layer is handled by the experiment-design skill, the result interpretation layer by experimentation-analytics, and the flag operations layer by feature-flagging. The orchestrator's job is platform choice, migration strategy, multi-platform coordination, and governance setup.
The output is a defensible recommendation. A primary platform with reasoning, or a short list of two when the situation is genuinely ambiguous, or a defer when a precondition is not met (no traffic, no culture, pending pivot).
What is in the skill
Thirteen sections covered in the body.
The SKILL.md spans the full platform-decision lifecycle from initial selection through migration and governance. Each section names a decision the team will face and the discipline that produces a defensible answer.
01
What this skill is for
Spans platform selection, multi-platform decisions, migration planning, and governance setup. Pairs with the rest of the PM-experimentation suite for design, interpretation, and flag operations. Decisive voice rather than open evaluation.
02
The 7 considerations for the platform decision
Data architecture, statistical rigor, MCP availability, feature flag integration, analytics depth, governance, cost shape. Answer all seven before moving to per-platform profiles. Skipping any of them is how teams end up in a migration project a year later.
03
Per-platform profiles
One section each for Statsig, PostHog, GrowthBook, Optimizely, Amplitude, Eppo, and Kameleoon. Strengths, gotchas, ideal customer, MCP availability, pricing shape. Honest and specific.
04
Decision matrix
Context-to-platform mapping for the most common situations. Pure SaaS, regulated industry, marketing-led, data-team-led, AI-forward, pre-PMF, high-volume cost-pressured. Each row names a primary recommendation and the common alternative.
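A context-to-platform matrix like this reduces to a lookup. The sketch below is illustrative, not the full matrix: the four mappings mirror the "pick boring" defaults named later on this page, and the fallback deliberately routes unknown contexts to the full framework rather than guessing.

```python
# Illustrative sketch of a context-to-platform lookup. The four mappings
# mirror the "pick boring" defaults on this page; contexts not listed here
# are placeholders for the fuller matrix in the reference file.
DECISION_MATRIX = {
    "pure_saas": "Statsig",            # boring default for SaaS
    "product_led_growth": "PostHog",   # boring default for PLG
    "enterprise": "Optimizely",        # boring default for enterprise
    "warehouse_native": "GrowthBook",  # boring default for warehouse-native
}

def recommend(context: str) -> str:
    # Unknown contexts get no default: fall through to the full
    # 11-consideration framework rather than guessing a platform.
    return DECISION_MATRIX.get(context, "run the 11-consideration framework")

print(recommend("pure_saas"))          # Statsig
print(recommend("regulated_fintech"))  # run the 11-consideration framework
```

The design choice worth noting: the fallback is explicit. A matrix that returns a platform for every input hides exactly the ambiguous cases where the framework earns its keep.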
05
Multi-platform: when warranted, when a mess
Warranted when surfaces do not overlap (marketing site plus product app). A mess when surfaces do overlap. The 80/20 rule, the ownership matrix, the shared metric definitions, and the reconciliation pattern.
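The 80/20 rule mentioned above can be made mechanical. This is a generic sketch, not the skill's own template: the surfaces and platform assignments are hypothetical, and the 80% threshold is the rule of thumb named here, not a hard limit.

```python
from collections import Counter

# Hypothetical ownership matrix: every surface is owned by exactly one
# platform. The 80/20 rule says the primary platform should own the
# large majority of surfaces.
OWNERSHIP = {
    "marketing_site": "Optimizely",
    "product_app": "Statsig",
    "checkout": "Statsig",
    "onboarding": "Statsig",
}

def primary_share(ownership):
    # Which platform owns the most surfaces, and what fraction?
    counts = Counter(ownership.values())
    platform, n = counts.most_common(1)[0]
    return platform, n / len(ownership)

platform, share = primary_share(OWNERSHIP)
# Statsig owns 3 of 4 surfaces (0.75): below an 80% threshold, which is
# the kind of signal that puts consolidation on the agenda.
```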
06
Migration patterns
Five common migrations (Statsig to PostHog, Optimizely to Statsig, PostHog to Eppo, GrowthBook to Eppo, any to Statsig consolidation). Effort estimates in engineer-weeks. The first 30 days plan, the 30 to 90 day cutover, what NOT to do.
07
The cost reality
Three pricing shapes (event-based, seats-plus-warehouse, combined event volume). Real-world ranges at different traffic tiers. The total cost frame: engineer time, statistical correctness, governance overhead, migration risk, cultural drift.
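Two of the three pricing shapes can be contrasted with a toy calculator. All rates below are hypothetical placeholders; real platform pricing varies by tier and contract, so this only shows why usage shape, not list price, decides which model wins.

```python
# Hypothetical rates for two pricing shapes. Treat every number here as a
# placeholder, not a real vendor quote.
def event_based_cost(events_per_month, rate_per_million=20.0):
    # Event-based pricing scales with traffic volume.
    return events_per_month / 1_000_000 * rate_per_million

def seats_plus_warehouse_cost(seats, warehouse_query_spend, per_seat=100.0):
    # Seats-plus-warehouse pricing scales with team size and query spend.
    return seats * per_seat + warehouse_query_spend

# A high-volume product with a small team: the same company, two shapes.
high_volume_events = event_based_cost(2_000_000_000)        # 40000.0 / month
small_team_seats = seats_plus_warehouse_cost(10, 3_000.0)   # 4000.0 / month
```

The point of the exercise: a tenfold monthly gap can appear purely from the pricing shape, before any feature comparison happens.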
08
Governance and team fit
Permission tiers, approval workflows, audit trails, environment promotion. Team fit assessment for PM-led, engineer-led, data-team-led, marketing-led cultures. The wrong fit is constant friction; the right fit compounds.
09
The MCP capability layer
Six of seven platforms have first-party or hosted MCPs as of May 2026. Eppo offers REST only. Capability differs by an order of magnitude (PostHog 200+ tools, Kameleoon 7). For agentic workflows, MCP is becoming the deciding factor.
10
When to defer the decision
Pre-PMF, single-experiment validation, no experimentation culture, pending acquisition. Sometimes the right answer is not now. Free tiers (PostHog, Statsig) cover the deferral period without committing.
11
Common mistakes
Twelve patterns that recur in platform decisions. Picked the cheapest, picked the most-featured, tried multi-platform first, demo-driven decisions, ignored governance, ignored MCP trajectory, renewed without re-evaluating.
12
The framework: 11 considerations
Data architecture, statistical rigor, MCP availability, flag integration, analytics depth, governance, cost shape, team fit, onboarding cost, migration cost, multi-platform fit. The output is a primary recommendation, a short list of two if ambiguous, or a defer if a precondition is not met.
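The framework's three-way output rule (primary, short list of two, or defer) can be sketched as a function. The scores and the ambiguity margin are illustrative assumptions; the skill itself does not prescribe a numeric scoring scheme.

```python
# Sketch of the framework's output rule: a primary recommendation when one
# platform clearly leads, a short list of two when the top scores are close,
# and a defer when a precondition (traffic, culture, stability) fails.
# Scores and the 0.05 margin are illustrative, not part of the skill.
def framework_output(scores, preconditions_met=True, ambiguity_margin=0.05):
    if not preconditions_met:
        return ("defer", [])
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (top, s1), (runner_up, s2) = ranked[0], ranked[1]
    if s1 - s2 <= ambiguity_margin:
        return ("short_list", [top, runner_up])
    return ("primary", [top])

framework_output({"Statsig": 0.82, "PostHog": 0.80})
# close scores -> a short list of two, settled by a trial
framework_output({"Statsig": 0.82, "PostHog": 0.60}, preconditions_met=False)
# failed precondition -> defer, regardless of scores
```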
13
When in doubt, pick boring
Boring means mature, well-documented, large user base, recoverable migration. Statsig is boring for SaaS, PostHog for product-led growth, Optimizely for enterprise, GrowthBook for warehouse-native. Boring is not the optimum but it is almost never wrong.
Reference files
Seven references that go alongside the SKILL.md.
The references hold the matrices, playbooks, capability comparisons, cost models, governance templates, and pattern catalogs the SKILL.md cites. Each is a self-contained doc the team can lift into a project without reading the rest.
references/platform-decision-matrix.md
Context-to-platform matrix with worked examples across the most common decision contexts. Five worked examples covering Series B SaaS, regulated fintech, marketing-led brand, data-team-led organization, and pre-PMF startup.
references/migration-playbook.md
Five common migration paths with effort estimates, the first 30 days plan, the 30 to 90 day cutover plan, and what NOT to do. Statsig to PostHog, Optimizely to Statsig, PostHog to Eppo, GrowthBook to Eppo, and any to Statsig consolidation.
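A common safeguard during a cutover window, sketched generically here rather than taken from the playbook itself, is dual evaluation: keep serving from the old platform, shadow-evaluate on the new one, and log every disagreement before trusting the switch. The `old_eval` and `new_eval` callables stand in for the two platforms' SDK calls; the lambdas are toy stand-ins, not real bucketing logic.

```python
# Generic dual-evaluation sketch for a migration cutover window.
# `old_eval` and `new_eval` are placeholders for the two platforms' SDKs.
mismatches = []

def evaluate(flag, user_id, old_eval, new_eval):
    served = old_eval(flag, user_id)   # old platform stays the source of truth
    shadow = new_eval(flag, user_id)   # new platform runs in shadow mode
    if served != shadow:
        mismatches.append((flag, user_id, served, shadow))
    return served

# Toy stand-ins for platform bucketing that deliberately disagree sometimes.
old = lambda flag, uid: uid % 2 == 0
new = lambda flag, uid: uid % 3 == 0

served = evaluate("new_checkout", 4, old, new)
# served is the old platform's answer (True); the disagreement for user 4
# lands in `mismatches` for later reconciliation.
```

A mismatch rate that refuses to converge to zero is one of the validation signals that the cutover is not ready.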
references/mcp-capability-comparison.md
Side-by-side capability matrix across the seven platforms. Read flags, write flags, read experiments, write experiments, query analytics, manage segments, governance actions, code generation. Plus hosting model, auth method, tool count, and recommended scoping.
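A capability matrix like this becomes useful when you filter it against a workflow's requirements. In the sketch below, the PostHog and Kameleoon tool counts and Eppo's lack of an MCP come from this page; the `write_experiments` flags are illustrative placeholders, not verified per-platform claims.

```python
# Partial, illustrative slice of the capability matrix. Tool counts for
# PostHog and Kameleoon and Eppo's REST-only status come from this page;
# the write_experiments flags are placeholders for the real matrix.
CAPABILITIES = {
    "PostHog":   {"mcp": True,  "tools": 200, "write_experiments": True},
    "Kameleoon": {"mcp": True,  "tools": 7,   "write_experiments": False},
    "Eppo":      {"mcp": False, "tools": 0,   "write_experiments": False},
}

def supports_agentic_workflow(caps, min_tools=10):
    # An agentic workflow needs an MCP at all, enough tool surface,
    # and the ability to write (not just read) experiments.
    return caps["mcp"] and caps["tools"] >= min_tools and caps["write_experiments"]

candidates = [p for p, c in CAPABILITIES.items()
              if supports_agentic_workflow(c)]
```

Run against this slice, only PostHog survives the filter, which is the shape of the "MCP is becoming the deciding factor" argument made above.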
references/multi-platform-orchestration.md
When multi-platform is genuinely warranted, and when it is a mess. The ownership matrix template, shared metric definition patterns, reconcilable dashboards, and the primary platform rule. When to consolidate.
references/cost-and-pricing-models.md
Pricing shapes per platform. Real-world cost ranges at different traffic tiers. The total cost frame. Three finance conversation patterns: cost per experiment, avoided wrong-result experiments, total cost of ownership across alternatives.
references/governance-and-team-setup.md
Permission tier templates, approval workflow patterns, audit trail requirements, environment promotion rules. Team fit assessment. Onboarding playbook for a new PM or engineer joining the experimentation discipline.
references/common-mistakes.md
Twelve failure patterns with name, symptom, root cause, fix, and prevention for each. Picked the cheapest, picked the most-featured, tried multi-platform first, did not check governance, ignored MCP trajectory, renewed without re-evaluating.
Where to use it
Four moments where the orchestrator earns its place.
Choosing a platform from scratch. Walk the 11-consideration framework, read the per-platform profiles, consult the decision matrix. The output is a recommendation with reasoning that survives a CFO review.
Evaluating whether to switch. The annual fit review. If the platform is the right answer, renew with confidence. If the platform is no longer the right answer, the migration playbook gives you the engineer-week estimate and the cutover plan to make the switch.
Deciding whether to consolidate. Multi-platform setups drift. The orchestrator names the three signals that consolidation is overdue and the pattern for the consolidation move (almost always onto the platform that is already closest to canonical for the highest-stakes surface).
Planning a migration that has already been approved. The migration playbook covers the five common paths with effort estimates, the first 30 days plan, the 30 to 90 day cutover, and what NOT to do. Plus the validation signals that tell you the migration succeeded.
Pairs with these platforms
Seven experimentation platforms in /integrations.
The orchestrator names which platform fits which situation. Open the integration microsite for the platform you are evaluating to see its MCP commands, auth setup, and example prompts. The skill is the strategic layer; the microsite is the tactical layer.
Modern product teams running experiments at speed
Statsig
Experiments and gates as MCP tools
Product-led growth teams
PostHog
Open-source product analytics with experiments
Data-native experimentation teams
GrowthBook
Open-source warehouse-native experimentation
Enterprise personalization and experimentation
Optimizely
Web Experimentation and Feature Experimentation in one MCP
Product analytics and experimentation teams
Amplitude
Behavioral analytics and experiments via natural language
Data-warehouse-native experimentation teams
Eppo
Warehouse-native experimentation, REST API only
AI-driven personalization teams
Kameleoon
Winning experiment to production code
Where this skill fits in the suite
The fourth skill in the PM-experimentation suite.
The orchestrator completes the PM-experimentation suite. It is the strategic-advisor layer that sits above the three tactical skills. Read whichever one fits the current phase of the work.
experiment-design covers how to design a single experiment well: hypothesis writing, sample size and MDE, duration, segments, ratio metrics, sequential testing, and the failure modes that produce wrong shipping calls. Read it before designing the test.
experimentation-analytics covers how to read the result panel: confidence intervals, p-values, multiple testing, sequential testing, CUPED, ratio metrics with the delta method, heterogeneous treatment effects, network effects, and dashboard reconciliation. Read it when results land.
feature-flagging covers flags as production infrastructure: types, naming, lifecycle, targeting rules, rollout strategies, stale flag cleanup, and governance. Read it when the flag system is the bottleneck.
The orchestrator (this skill) sits above the three. It names the platform that hosts all of them and the migration plan when the platform was wrong.
Open source under MIT
Read the SKILL.md on GitHub.
The skill source lives in the rampstackco/claude-skills repository alongside dozens of other skills covering the full lifecycle of brand and product work. MIT licensed.
Frequently asked questions.
- How is this different from a vendor comparison blog post?
- A vendor comparison post lists features. This skill names which platform is right for which situation, and what it costs to be wrong. The framework is decision-shaped: 11 considerations, then a per-platform profile, then a context-to-platform matrix, then migration patterns when the decision was wrong. The skill takes positions; comparisons hedge.
- Why these seven platforms?
- Statsig, PostHog, GrowthBook, Optimizely, Amplitude, Eppo, and Kameleoon are the seven covered in the integrations catalog and represent the full shape of the market. Vendor-native (Statsig, Optimizely), product-suite (PostHog, Amplitude), warehouse-native (GrowthBook, Eppo), and specialty (Kameleoon). LaunchDarkly is feature-flag-primary and covered in the feature-flagging skill instead.
- How does it pair with the rest of the PM-experimentation suite?
- The orchestrator is the platform-decision layer. Experiment-design covers how to design a single experiment. Experimentation-analytics covers how to read the result panel. Feature-flagging covers flags as production infrastructure. Together the four cover the full PM-experimentation lifecycle from platform choice through individual experiment shipping.
- What about LaunchDarkly?
- LaunchDarkly is feature-flag-primary with experiments as a secondary feature. It is covered in the feature-flagging skill. If experiments are the primary use case, the seven platforms covered here are the better fit. If feature flags are the primary use case and experiments are supplementary, LaunchDarkly is a strong choice and shows up in the feature-flagging skill's coverage.
- How often does the platform landscape change?
- Platforms ship features monthly. MCP capability specifically is moving fast in 2025 and 2026. The skill names May 2026 as its current-state reference and recommends an annual platform fit review tied to the renewal date. The structural advice (data architecture patterns, the 11 considerations, the migration patterns) ages slowly; the per-platform feature claims age faster.
- What is the right answer when the team disagrees?
- Team disagreement on the platform decision is a data signal: the situation is not yet clear enough for any answer to be obviously correct. The skill recommends a 30-day trial on the top two candidates, evaluated against the 11-consideration framework with real usage data. Demos lie about real usage in ways that only the trial reveals. Pick after the trial, not before.