AI Skill · Analyze experiment · Marketing

Decide whether an experiment should ship, stop, or keep running. — Claude Skill

Name: A/B Test Analysis
Author: ElasticFlow

A Claude skill for Claude Code by ElasticFlow ✓ · Run with /ab-test-analysis in Claude · Updated June 12, 2026 · vmanual@2026-06-12

Compatible with: ChatGPT, Claude, Claude Code, Claude Desktop, Codex / Codex CLI, Cursor, Gemini, Hermes (via Continue / Cline), OpenClaw, Windsurf

Reads experiment results, sample size, conversion changes, guardrail metrics, and business context to recommend a clear ship, stop, or continue decision.

Explains experiment results in plain language instead of only reporting a p-value or dashboard screenshot.
Checks primary metric, sample size, segment differences, and guardrail metrics before recommending a decision.
Separates meaningful lift from noise, novelty effects, broken tracking, or mixed segment behavior.
Returns a decision memo with evidence, risk, next test idea, and what a human should confirm.

Today

A growth marketer screenshots the experiment dashboard, says the test is up, and debates confidence in a meeting.

With /ab-test-analysis

Run /ab-test-analysis with the result table and context. The skill returns a decision, evidence, risks, and follow-up test.

1. Paste result table
2. Check guardrails
3. Interpret decision risk
4. Write ship/stop/continue memo

Who it's for

Growth Marketer

Turn experiment results into clear launch, stop, or continue decisions.

Product Manager

Understand experiment impact on user behavior, product risk, and next iteration.

Analytics Engineer

Spot tracking, sample, and guardrail issues before stakeholders trust the readout.

Features

Growth experiment readout

Turn Optimizely, Amplitude, or GA results into a decision memo.

Guardrail review

Check whether a conversion lift came with revenue, support, speed, or retention risk.

Experiment design critique

Find tracking, segment, sample size, or timing problems before trusting the result.
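
One of the sample problems mentioned above can be sketched as a sample ratio mismatch (SRM) check. The snippet below is a minimal illustration, not the skill's actual implementation. It uses the visitor counts from the pricing page example on this page and assumes an intended 50/50 traffic split, which that example does not state. The function name `srm_p_value` and the 0.001 alert threshold are hypothetical choices.

```python
import math

def srm_p_value(visitors_a: int, visitors_b: int, expected_share_a: float = 0.5) -> float:
    """Chi-square goodness-of-fit test (1 degree of freedom) on the traffic split."""
    total = visitors_a + visitors_b
    expected_a = total * expected_share_a
    expected_b = total - expected_a
    chi2 = ((visitors_a - expected_a) ** 2 / expected_a
            + (visitors_b - expected_b) ** 2 / expected_b)
    # With 1 degree of freedom, the chi-square tail probability is erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))

# Visitor counts from the example; the 50/50 split is an assumption.
p = srm_p_value(24_100, 23_900)
srm_suspected = p < 0.001  # hypothetical alert threshold
print(f"SRM p-value: {p:.3f}, mismatch suspected: {srm_suspected}")
```

A very small p-value here would suggest that assignment or tracking is broken, in which case the conversion numbers should not be trusted until the cause is found.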

How it works

Share the experiment goal, variants, dates, traffic, sample size, and metric results.

Add guardrail metrics such as churn, revenue, refund rate, support tickets, or page speed if available.

The skill interprets lift, confidence, practical significance, and business risk.

It recommends ship, stop, keep running, or re-run with a cleaner design.

Input options

Experiment setup

Hypothesis, variants, dates, traffic split, audience, and success metric.

Example

What the user pastes

Experiment: Pricing page CTA copy.
Variant A: Start free trial.
Variant B: Build my plan.
Dates: June 1-14.

Results:
- A: 24,100 visitors, 1,084 trials, 4.5% conversion.
- B: 23,900 visitors, 1,267 trials, 5.3% conversion.
- Confidence shown in tool: 96%.

Guardrails:
- Paid conversion after trial: A 18.4%, B 17.9%.
- Support questions about pricing increased 11% for B.

Need: ship, stop, or continue, and what to tell leadership.
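
The pasted numbers are enough for a rough independent check of the primary metric. The sketch below runs a pooled two-proportion z-test with the Python standard library only. It is an illustration, not the method used by the testing tool or by the skill: the tool's 96% confidence may come from a different approach (for example sequential or Bayesian statistics), so the figures need not agree. The function name is hypothetical.

```python
import math

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (rate_a, rate_b, z, two-sided p-value) for a pooled z-test."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return rate_a, rate_b, z, p_value

# Trials and visitors exactly as pasted above.
rate_a, rate_b, z, p_value = two_proportion_z_test(1_084, 24_100, 1_267, 23_900)
relative_lift = rate_b / rate_a - 1

print(f"A: {rate_a:.2%}  B: {rate_b:.2%}")
print(f"Relative lift: {relative_lift:+.1%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.5f}")
```

A check like this only covers the trial start rate. It says nothing about the guardrails, which is why the readout below treats them separately.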

Useful result

Decision

Ship to 50% first, not 100%. Variant B improves trial start rate from 4.5% to 5.3%, but paid conversion is slightly lower and pricing questions increased.

Why

The lift is likely real for trial starts. The business impact is not proven until trial quality and support load are watched for one more week.
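
Whether the extra trials turn into business impact can be roughed out by chaining the two funnel steps. The sketch below multiplies trial start rate by paid conversion after trial to estimate paid customers per visitor. It assumes the reported paid conversion rates are mature and apply to every trial in each variant, which is exactly what the human review step still has to confirm. The function name is hypothetical.

```python
def paid_per_visitor(trials: int, visitors: int, paid_rate_after_trial: float) -> float:
    """Estimated paid customers per visitor: trial rate times paid-after-trial rate."""
    return (trials / visitors) * paid_rate_after_trial

# Numbers from the pasted example; paid rates are taken at face value.
paid_a = paid_per_visitor(1_084, 24_100, 0.184)
paid_b = paid_per_visitor(1_267, 23_900, 0.179)
net_lift = paid_b / paid_a - 1

print(f"Paid per 1,000 visitors: A {paid_a * 1000:.2f}, B {paid_b * 1000:.2f}")
print(f"Net relative lift in paid customers: {net_lift:+.1%}")
```

Under these assumptions Variant B still comes out ahead on paid customers per visitor, though by less than the trial lift alone suggests, and the estimate ignores the added support load.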

Guardrail risk

Pricing confusion may be rising. Add a pricing FAQ link near the CTA before full rollout.

Leadership wording

The new CTA increases trial starts by about 18% relative, but we will roll out gradually while monitoring paid conversion and pricing support tickets.

Human review

Confirm attribution window, whether paid conversion is mature enough, and whether support ticket tagging is consistent.

Improved metrics

Conversion Rate

+5-20%

Marketing

Statistical Significance

Decision risk reduced

Marketing

Metric Trust

+20-40%

Marketing

Works with

Google Sheets

manual

Compare result tables and write the decision memo.

Optimizely

manual

Use experiment results, variants, confidence, and traffic allocation.

Amplitude

manual

Check product behavior, activation, retention, and segment impact.

Google Analytics

manual

Use traffic, conversion, and acquisition context.

Ready to use anywhere

Standalone

No setup required

Paste the notes, exports, screenshots, or summaries you already have. The skill works without a connected system.

Connected

CRM + tools integrated

Connect the relevant support, analytics, CRM, or data tool when you want fresher source evidence.

Want to use A/B Test Analysis?

Choose how you want to get started.

Run in Claude Code

Free. Open source.

Install and run this skill locally on your computer.

Install Claude Code

Open a terminal on your computer and run the Claude Code install command.

Install the skill

Visit the GitHub repository and follow the installation instructions in the README.

Run

Start Claude Code, then enter the command /ab-test-analysis.

Use on ElasticFlow

Team and collaboration features

Run skills from your browser. Share results, manage access, and collaborate with your team. No terminal needed.

14-day free trial. Cancel anytime.

Skill instructions

A/B Test Analysis

Command: /ab-test-analysis

When to use it

Reads experiment results, sample size, conversion changes, guardrail metrics, and business context to recommend a clear ship, stop, or continue decision.

What the skill produces

Explains experiment results in plain language instead of only reporting a p-value or dashboard screenshot.
Checks primary metric, sample size, segment differences, and guardrail metrics before recommending a decision.
Separates meaningful lift from noise, novelty effects, broken tracking, or mixed segment behavior.
Returns a decision memo with evidence, risk, next test idea, and what a human should confirm.

Inputs to provide

Experiment setup: Hypothesis, variants, dates, traffic split, audience, and success metric.
Result table: Visitors, conversions, conversion rate, revenue, confidence, or exported dashboard numbers.
Guardrails and context: Support volume, refunds, page speed, churn, revenue per user, or segment constraints.

Recommended flow

Share the experiment goal, variants, dates, traffic, sample size, and metric results.
Add guardrail metrics such as churn, revenue, refund rate, support tickets, or page speed if available.
The skill interprets lift, confidence, practical significance, and business risk.
It recommends ship, stop, keep running, or re-run with a cleaner design.
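
The four outcomes in this flow can be pictured as a small decision rule. The sketch below is only an illustration of the ordering of checks, not how the skill decides: the `Readout` fields, the `recommend` function, and the `alpha` and `min_lift` thresholds are all hypothetical, and a real readout involves judgement that a rule like this cannot capture (for example a staged rollout).

```python
from dataclasses import dataclass

@dataclass
class Readout:
    p_value: float                 # significance of the primary metric
    relative_lift: float           # e.g. 0.18 for +18%
    reached_planned_sample: bool   # planned sample size or duration reached
    guardrails_ok: bool            # no guardrail metric moved the wrong way
    tracking_ok: bool              # no tracking or assignment problems found

def recommend(r: Readout, alpha: float = 0.05, min_lift: float = 0.02) -> str:
    """Map a readout to one of the four outcomes, checking data quality first."""
    if not r.tracking_ok:
        return "re-run with a cleaner design"
    if not r.reached_planned_sample:
        return "keep running"
    if r.p_value < alpha and r.relative_lift >= min_lift:
        # A winning primary metric with open guardrail questions is not shipped outright.
        return "ship" if r.guardrails_ok else "keep running"
    return "stop"

print(recommend(Readout(0.001, 0.18, True, True, True)))
print(recommend(Readout(0.001, 0.18, True, False, True)))
```

Checking tracking before significance reflects the order in the flow: a result is only interpreted once the data behind it can be trusted.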

Useful result example

Decision

Ship to 50% first, not 100%. Variant B improves trial start rate from 4.5% to 5.3%, but paid conversion is slightly lower and pricing questions increased.

Why

The lift is likely real for trial starts. The business impact is not proven until trial quality and support load are watched for one more week.

Guardrail risk

Pricing confusion may be rising. Add a pricing FAQ link near the CTA before full rollout.

Leadership wording

The new CTA increases trial starts by about 18% relative, but we will roll out gradually while monitoring paid conversion and pricing support tickets.

Human review

Confirm attribution window, whether paid conversion is mature enough, and whether support ticket tagging is consistent.

Guardrails

Keep user-provided numbers, dates, tool names, commands, IDs, URLs, and rules intact.
Do not invent a source, metric, owner, decision, or risk that is not present in the supplied material.
Clearly mark what a human must confirm before publishing, changing a tool, or making a business decision.

Reference documents

A/B Test Analysis

ElasticFlow editorial instructions for presenting /ab-test-analysis in the catalogue.

Purpose

Reads experiment results, sample size, conversion changes, guardrail metrics, and business context to recommend a clear ship, stop, or continue decision.

Non-technical presentation

Explain the business problem, what the user provides, what the AI returns, and what a human still needs to confirm. Avoid implementation detail unless the user supplied it.

Catalogue Presentation Method

Every skill should read clearly for a business owner: current painful workflow, better workflow, concrete example, and review checklist.

The page must answer four questions: when to use it, what to provide, what the AI returns, and which human decision remains.
