AI Skill · Analyze product metrics · Product & Engineering

When funnel data needs interpretation, /product-analyst builds metric frameworks so you can act on cohort signals. — Claude Skill

A Claude Skill for Claude Code by Nick Jensen — run /product-analyst in Claude

Compatible with ChatGPT · Claude · Gemini · OpenClaw

Build metric frameworks, cohort analyses, and experimentation plans

  • Structure AARRR and HEART metric frameworks with tracking specs
  • Design cohort retention tables with segment breakdowns
  • Calculate A/B test sample sizes and minimum detectable effects
  • Build funnel analysis queries with drop-off diagnostics
  • Create experimentation roadmaps with sequential test dependencies

What it does

AARRR framework setup

Run /product-analyst to define all 5 AARRR stages with 2-3 metrics each, tracking specs, and target ranges — producing a measurement plan covering 10-15 KPIs.

Cohort retention analysis

Use /product-analyst to structure a 12-week retention cohort table across 3 user segments, with churn inflection detection and a reactivation trigger plan.

A/B test design

Run /product-analyst to design an experiment: define hypothesis, primary metric, guardrail metrics, sample size (e.g., n=2400 for 5% MDE at 80% power), and runtime estimate.

Funnel drop-off diagnosis

Use /product-analyst to analyze a 6-step conversion funnel, flag the 2 highest-friction steps, and propose instrumentation for rage-clicks and session replays at each stage.

How it works

1

Share your product type, key user actions, and the question you want data to answer — retention curve, conversion rate, or feature adoption.

2

The skill selects the right metric framework (AARRR, HEART, or custom) and maps your events to each stage with concrete definitions.

3

It produces analysis templates: cohort tables, funnel breakdowns, or experiment designs with statistical parameters baked in.

4

You receive a ready-to-implement measurement plan with event schemas, SQL query templates, and dashboard wireframes.

Example

Analysis request
E-commerce marketplace, 20k monthly transactions. We redesigned checkout last month and need to measure impact on conversion and average order value.
Checkout impact analysis plan
Metric framework
Primary: checkout completion rate (pre: 62%, target: 68%). Secondary: AOV, time-to-purchase, cart abandonment rate. Guardrail: refund rate, support tickets per 1k orders.
Cohort comparison design
Compare 4-week pre-period vs 4-week post-period. Segment by: new vs returning, mobile vs desktop, cart size (<$50, $50-150, >$150). Flag if any segment regresses >2pp.
SQL query templates
3 queries provided: (1) daily conversion rate with confidence intervals, (2) cohort-by-segment pivot table, (3) funnel step timing with P50/P95 latency per step. All parameterized by date range.

Metrics this improves

Activation Rate
+10-25%
Product & Engineering
Conversion Rate
+15-30%
Product & Engineering
Statistical Significance
Valid test design
Product & Engineering

Works with

Product Analyst

Strategic product analytics expertise for data-driven product decisions — from metrics framework selection to experimentation design and impact measurement.

Philosophy

Great product analytics isn't about tracking everything. It's about measuring what matters to drive better product decisions.

The best product analytics:

  1. Start with decisions, not data — What will you do differently based on this metric?
  2. Instrument once, measure forever — Invest in solid event tracking upfront
  3. Balance leading and lagging — Predict outcomes, don't just report them
  4. Make data accessible — Self-serve dashboards beat SQL queues
  5. Experiment before you ship — Validate hypotheses with real users

How This Skill Works

When invoked, apply the guidelines in rules/ organized by:

  • metrics-* — Frameworks (AARRR, HEART), KPI selection, metric hierarchies
  • funnel-* — Conversion analysis, drop-off diagnosis, optimization
  • cohort-* — Retention analysis, segmentation, lifecycle tracking
  • feature-* — Adoption tracking, usage patterns, feature success
  • experiment-* — A/B testing, hypothesis design, statistical rigor
  • instrumentation-* — Event tracking, data modeling, collection best practices
  • dashboard-* — Visualization, stakeholder reporting, self-serve analytics

Core Frameworks

AARRR (Pirate Metrics)

| Stage | Question | Key Metrics |
|---|---|---|
| Acquisition | Where do users come from? | Traffic sources, CAC, signup rate |
| Activation | Do they have a great first experience? | Time-to-value, setup completion, aha moment |
| Retention | Do they come back? | DAU/MAU, D1/D7/D30 retention, churn |
| Revenue | Do they pay? | Conversion rate, ARPU, LTV |
| Referral | Do they tell others? | NPS, referral rate, viral coefficient |

HEART Framework (Google)

| Dimension | Definition | Signal Types |
|---|---|---|
| Happiness | User attitudes, satisfaction | NPS, CSAT, surveys |
| Engagement | Depth of involvement | Sessions, time-in-app, actions/session |
| Adoption | New users/features uptake | New users, feature adoption % |
| Retention | Continued usage over time | Retention curves, churn rate |
| Task Success | Efficiency and completion | Task completion, error rate, time-on-task |

The Metrics Hierarchy

                    ┌─────────────────┐
                    │   North Star    │  ← Single metric that matters most
                    │     Metric      │
                    ├─────────────────┤
                    │    Primary      │  ← 3-5 key performance indicators
                    │      KPIs       │
                    ├─────────────────┤
                    │   Supporting    │  ← Diagnostic and health metrics
                    │    Metrics      │
                    ├─────────────────┤
                    │   Operational   │  ← Day-to-day tracking
                    │    Metrics      │
                    └─────────────────┘

Retention Analysis Types

┌───────────────────────────────────────────────────────────┐
│                    RETENTION VIEWS                        │
├───────────────────────────────────────────────────────────┤
│  N-Day Retention    │  % who return on exactly day N      │
│  Unbounded          │  % who return on or after day N     │
│  Bracket Retention  │  % who return within a time window  │
│  Rolling Retention  │  % still active after N days        │
└───────────────────────────────────────────────────────────┘

Experimentation Rigor Ladder

| Level | Approach | When to Use |
|---|---|---|
| 1. Gut | Ship and hope | Never for important features |
| 2. Qualitative | User research, feedback | Early exploration |
| 3. Observational | Pre/post analysis | Low-risk changes |
| 4. Quasi-experiment | Cohort comparison | When randomization is hard |
| 5. A/B Test | Randomized control | Optimization, validation |
| 6. Multi-arm Bandit | Adaptive allocation | When speed > precision |

Metric Selection Criteria

| Criterion | Question | Good Sign |
|---|---|---|
| Actionable | Can we influence this? | Direct lever exists |
| Accessible | Can we measure it reliably? | <5% missing data |
| Auditable | Can we debug anomalies? | Clear calculation logic |
| Aligned | Does it tie to business value? | Executive cares |
| Attributable | Can we trace changes to causes? | A/B testable |

Anti-Patterns

  • Vanity metrics — Tracking what looks good, not what drives decisions
  • Metric overload — 50 dashboards, zero insights
  • Lagging only — Measuring outcomes without predictive indicators
  • Silent failures — No alerting on data quality issues
  • HiPPO-driven — Highest-paid person's opinion beats data
  • P-hacking — Running tests until you get significance
  • Ship and forget — Launching features without success criteria
  • Segment blindness — Looking only at averages, missing cohort differences

Reference documents


title: Section Organization

1. Metrics Frameworks (metrics)

Impact: CRITICAL
Description: Core metrics frameworks like AARRR and HEART, KPI selection, metric hierarchies, and North Star definition. The foundation for all product analytics.

2. Funnel Analysis (funnel)

Impact: CRITICAL
Description: Conversion funnel design, drop-off analysis, bottleneck identification, and optimization strategies. Where most growth opportunities live.

3. Cohort & Retention (cohort)

Impact: CRITICAL
Description: Cohort analysis, retention curves, churn prediction, and lifecycle segmentation. The key to sustainable growth.

4. Feature Analytics (feature)

Impact: HIGH
Description: Feature adoption tracking, usage depth measurement, success criteria, and feature-level decision making.

5. Experimentation (experiment)

Impact: HIGH
Description: A/B testing design, hypothesis formulation, statistical rigor, and interpreting experiment results.

6. Data Instrumentation (instrumentation)

Impact: HIGH
Description: Event taxonomy, tracking implementation, data quality, and building reliable analytics foundations.

7. Dashboard Design (dashboard)

Impact: MEDIUM-HIGH
Description: Product dashboards, stakeholder reporting, self-serve analytics, and visualization best practices.


title: Cohort Analysis and Retention
impact: CRITICAL
tags: cohort, retention, churn, lifecycle

Cohort Analysis and Retention

Impact: CRITICAL

Cohort analysis separates the signal from the noise. It answers: "Are things getting better?" by comparing groups of users who share a common starting point.

What is a Cohort?

A cohort is a group of users who share a characteristic, typically their signup date (acquisition cohort) or first action date (behavioral cohort).

Week 1 Cohort:  All users who signed up during Week 1
Week 2 Cohort:  All users who signed up during Week 2
...

Compare: How does Week 5 cohort perform vs Week 1 cohort?

The Classic Retention Table

         Week 0   Week 1   Week 2   Week 3   Week 4   Week 5
Cohort   ──────   ──────   ──────   ──────   ──────   ──────
Jan 1    1,000    450      380      320      290      280
Jan 8    1,200    540      465      400      360        -
Jan 15   1,100    520      440      385        -        -
Jan 22   1,300    610      530        -        -        -
Jan 29   1,400    650        -        -        -        -

As Percentages:
         Week 0   Week 1   Week 2   Week 3   Week 4   Week 5
         ──────   ──────   ──────   ──────   ──────   ──────
Jan 1    100%     45%      38%      32%      29%      28%
Jan 8    100%     45%      39%      33%      30%       -
Jan 15   100%     47%      40%      35%       -        -
Jan 22   100%     47%      41%       -        -        -
Jan 29   100%     46%       -        -        -        -
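The percentage view is derived mechanically from the raw counts. A minimal Python sketch using the counts from the table above (`retention_pct` is an illustrative helper, not part of the skill):

```python
# Convert raw cohort counts into a retention-percentage table.
# Counts are taken from the example table above; None marks
# weeks that have not happened yet for a cohort.
cohorts = {
    "Jan 1":  [1000, 450, 380, 320, 290, 280],
    "Jan 8":  [1200, 540, 465, 400, 360, None],
    "Jan 15": [1100, 520, 440, 385, None, None],
}

def retention_pct(counts):
    """Each week's count as a percentage of the Week 0 cohort size."""
    base = counts[0]
    return [round(100 * c / base) if c is not None else None for c in counts]

for label, counts in cohorts.items():
    print(label, retention_pct(counts))
```

Running this reproduces the percentage rows shown above (e.g. Jan 1 → 100, 45, 38, 32, 29, 28).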

Reading the table:

  • Rows: Different cohorts (by signup week)
  • Columns: Time periods since signup
  • Diagonal: Same calendar week across cohorts
  • Trend right: How retention evolves over time for a cohort
  • Trend down: How retention improves/degrades for newer cohorts

Types of Retention Metrics

| Type | Definition | Use Case |
|---|---|---|
| N-Day Retention | % returning on exactly day N | Mobile apps, daily products |
| N-Day Rolling | % still active after day N | Subscription products |
| Unbounded | % returning on or after day N | Casual use products |
| Bracket/Range | % active within a time window | Weekly/monthly products |

Examples:

N-Day (D7):
  Active on exactly day 7 / Total users = 22%

N-Day Rolling (D7):
  Active at least once days 7-14 / Total users = 35%

Unbounded (D7):
  Active on day 7 OR any day after / Total users = 45%

Bracket (Week 2):
  Active during days 8-14 / Total users = 32%
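The variants differ only in which days count as a return. A sketch with toy data, assuming each user's activity is stored as a set of day offsets since signup (function names are illustrative):

```python
# Compute retention variants for day N from per-user activity sets.
# Each user's activity is a set of day offsets since signup (day 0 = signup).
def n_day(users, n):
    """% active on exactly day N."""
    return sum(1 for days in users if n in days) / len(users)

def unbounded(users, n):
    """% active on day N or any later day."""
    return sum(1 for days in users if any(d >= n for d in days)) / len(users)

def bracket(users, start, end):
    """% active at least once within [start, end].
    Per the examples above, N-Day Rolling (D7) is bracket(users, 7, 14)."""
    return sum(1 for days in users if any(start <= d <= end for d in days)) / len(users)

users = [{0, 7}, {0, 9}, {0, 12}, {0, 1}]  # toy data: 4 users
print(n_day(users, 7))        # 1 of 4 active on exactly day 7
print(unbounded(users, 7))    # 3 of 4 active on day 7 or later
print(bracket(users, 8, 14))  # 2 of 4 active during days 8-14
```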

Retention Curves

100% ┤
     │ ●
 80% ┤  ╲
     │   ●
 60% ┤    ╲
     │     ●───●───●───●───●───●  ← Good: Flattening
 40% ┤
     │         ●
 20% ┤          ╲
     │           ●───●
  0% ┤                ╲───●───●  ← Bad: Still declining
     └────────────────────────────
       W0  W1  W2  W3  W4  W5  W6  W7  W8

What the curve tells you:

  • Steep early drop: Activation problem
  • Gradual continued decline: Value realization problem
  • Flattens at high %: Strong product-market fit
  • Never flattens: Fundamental value problem

Cohort Analysis Techniques

1. Acquisition Cohort Analysis

Group by signup date to measure:

  • Is retention improving over time?
  • Did a product change help/hurt retention?
  • Are newer users more/less engaged?

Question: Are we getting better at retaining users?
Compare: Jan cohort W4 retention vs March cohort W4 retention
Result: 29% → 35% improvement
Insight: Onboarding improvements are working

2. Behavioral Cohort Analysis

Group by first action or behavior to measure:

  • Which first actions predict retention?
  • What separates engaged vs churned users?
  • What's the "aha moment"?

Cohort by first action:
┌──────────────────────────────────────────────┐
│ First Action        │ W4 Retention │ LTV    │
├──────────────────────────────────────────────┤
│ Created project     │    42%       │ $280   │
│ Invited team member │    58%       │ $420   │
│ Imported data       │    65%       │ $510   │
│ Just browsed        │    12%       │ $45    │
└──────────────────────────────────────────────┘

Insight: Users who import data retain 5x better
Action: Prioritize data import in onboarding

3. Segment Cohort Analysis

Compare retention across user segments:

W4 Retention by Segment:
┌──────────────────────────────────────────────┐
│ Segment              │ W4 Retention          │
├──────────────────────────────────────────────┤
│ Enterprise (500+)    │    72%                │
│ Mid-market (50-500)  │    48%                │
│ SMB (10-50)          │    35%                │
│ Micro (<10)          │    18%                │
└──────────────────────────────────────────────┘

Insight: Enterprise retains 4x better than micro
Action: Shift ICP focus or improve SMB experience

Calculating Key Retention Metrics

Churn Rate

Monthly Churn = Customers lost in month / Customers at start of month
Example: 50 churned / 1,000 starting = 5% monthly churn
Annual Churn ≈ 1 - (1 - Monthly Churn)^12 = 46% annual churn

Retention Rate

Retention Rate = 1 - Churn Rate
Example: 1 - 5% = 95% monthly retention

Customer Lifetime (Average)

Average Lifetime = 1 / Churn Rate
Example: 1 / 5% = 20 months average lifetime

Net Revenue Retention (NRR)

NRR = (Starting MRR + Expansion - Contraction - Churn) / Starting MRR
Example: ($100k + $15k - $5k - $8k) / $100k = 102% NRR
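The four formulas above can be checked in a few lines; a sketch using the same example numbers:

```python
# Worked versions of the retention formulas above, same example figures.
monthly_churn = 50 / 1000                      # 50 lost / 1,000 starting = 5%
annual_churn = 1 - (1 - monthly_churn) ** 12   # compounds to ~46% per year
retention_rate = 1 - monthly_churn             # 95% monthly retention
avg_lifetime_months = 1 / monthly_churn        # 20 months average lifetime

# NRR = (starting MRR + expansion - contraction - churned MRR) / starting MRR
nrr = (100_000 + 15_000 - 5_000 - 8_000) / 100_000

print(f"{annual_churn:.0%}, {avg_lifetime_months:.0f} months, NRR {nrr:.0%}")
# prints: 46%, 20 months, NRR 102%
```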

Good vs Bad Retention Curves

Good: Consumer Subscription (Spotify-like)

Day    Retention    Interpretation
───────────────────────────────────
D1     70%          Good first session
D7     45%          Habit forming
D30    30%          Solid monthly engagement
D90    22%          Core user base forming
D180   18%          Plateau = product-market fit

Bad: Leaky Bucket

Day    Retention    Interpretation
───────────────────────────────────
D1     50%          Many bouncing immediately
D7     25%          Half gone in a week
D30    8%           Hemorrhaging users
D90    2%           Almost no one stays
D180   0.5%         Fundamental value problem

Good: B2B SaaS

Month  Retention    Interpretation
───────────────────────────────────
M1     92%          Normal startup churn
M3     88%          Onboarding complete
M6     85%          Sticky integrations
M12    80%          Annual renewal strong
M24    75%          Long-term value delivered

Finding Your "Aha Moment"

The aha moment is the action or experience that predicts retention.

Analysis approach:

1. Define retained (e.g., active at W4)
2. List candidate behaviors in W1
3. Calculate correlation with retention
4. Find the threshold that maximizes separation

Candidate Analysis:
┌─────────────────────────────────────────────────────────┐
│ W1 Behavior           │ W4 Retention │ Correlation     │
├─────────────────────────────────────────────────────────┤
│ Created 1 project     │    35%       │ 0.42            │
│ Created 3+ projects   │    62%       │ 0.71            │
│ Invited 1 teammate    │    48%       │ 0.55            │
│ Invited 3+ teammates  │    78%       │ 0.85 ← Aha!     │
│ Used mobile app       │    52%       │ 0.58            │
└─────────────────────────────────────────────────────────┘

Aha moment: Invite 3+ teammates in week 1

Resurrection Analysis

Resurrection Rate = Resurrected users / Churned users

Monthly resurrection:
┌────────────────────────────────────────────────┐
│ Churned Month │ Resurrected │ Rate │ Via      │
├────────────────────────────────────────────────┤
│ January       │    45       │ 12%  │ Email    │
│ February      │    38       │ 10%  │ Feature  │
│ March         │    52       │ 14%  │ Email    │
└────────────────────────────────────────────────┘

Insight: Win-back emails resurrect ~12% of churned
Action: Invest in better resurrection campaigns

Retention Improvement Tactics

| Problem | Indicators | Tactics |
|---|---|---|
| D1 drop-off | <60% D1 retention | Improve onboarding, reduce friction |
| W1 decline | Steep W1 curve | Strengthen habit loops, add value faster |
| No plateau | Curve never flattens | Find and amplify aha moment |
| Segment variance | Wide cohort spread | Focus on high-retaining segments |
| Seasonal churn | Predictable drops | Proactive engagement before drop |

Anti-Patterns

  • Only tracking DAU/MAU — Masks cohort-level problems
  • Ignoring early churn — Most churn happens in first week
  • Average-only analysis — Segment differences matter
  • Short windows — Need 3-6 months for meaningful curves
  • No resurrection tracking — Churned users can come back
  • Vanity retention — Counting logins without value delivery
  • Static analysis — Retention should improve over time

title: Dashboard Design for Product
impact: MEDIUM-HIGH
tags: dashboard, visualization, reporting, self-serve

Dashboard Design for Product

Impact: MEDIUM-HIGH

The best dashboards drive action, not just awareness. Design dashboards that answer questions and prompt decisions.

Dashboard Design Principles

1. One Dashboard, One Purpose

Bad: "Product Dashboard" (everything)
Good: "Weekly Activation Health Check" (specific purpose)

2. Action-Oriented

Ask: "What will someone DO after viewing this?"
If no clear action → wrong metrics or wrong audience

3. Audience-Appropriate

Executive: High-level trends, strategic metrics
Manager: Team performance, weekly progress
IC: Detailed diagnostics, debugging tools

4. Progressive Disclosure

Level 1: Summary metrics (5 seconds to understand)
Level 2: Key trends (30 seconds)
Level 3: Segment breakdowns (2 minutes)
Level 4: Detailed drill-down (investigation)

Dashboard Architecture

The Dashboard Hierarchy:

                    ┌─────────────────────┐
                    │    Executive        │  Weekly/Monthly
                    │    Overview         │  Strategy decisions
                    └──────────┬──────────┘
                               │
            ┌──────────────────┼──────────────────┐
            ▼                  ▼                  ▼
    ┌───────────────┐  ┌───────────────┐  ┌───────────────┐
    │   Growth      │  │   Product     │  │   Revenue     │
    │   Dashboard   │  │   Health      │  │   Metrics     │
    └───────┬───────┘  └───────┬───────┘  └───────┬───────┘
            │                  │                  │
    ┌───────┴───────┐  ┌───────┴───────┐  ┌───────┴───────┐
    │  Acquisition  │  │  Retention    │  │  Conversion   │
    │  Activation   │  │  Engagement   │  │  Expansion    │
    │  Funnel       │  │  Feature      │  │  Churn        │
    └───────────────┘  └───────────────┘  └───────────────┘
            │                  │                  │
            ▼                  ▼                  ▼
    [Detailed drill-down dashboards and ad-hoc exploration]

Dashboard Types and Templates

1. Executive Overview

| Section | Metrics | Visualization |
|---|---|---|
| Health Score | Overall product score (composite) | Single number + trend |
| North Star | Primary metric + trend | Big number + sparkline |
| AARRR Summary | Each stage's key metric | Row of metrics |
| Risks/Wins | Notable changes this period | Text callouts |

Layout:

┌────────────────────────────────────────────────────────┐
│  PRODUCT HEALTH SCORE: 78/100 (↑3 vs last week)       │
├────────────────────────────────────────────────────────┤
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐  │
│  │ SIGNUPS  │ │ACTIVATION│ │RETENTION │ │ REVENUE  │  │
│  │  2,340   │ │   62%    │ │   45%    │ │  $142K   │  │
│  │  +12%    │ │   +3pp   │ │   -1pp   │ │   +8%    │  │
│  └──────────┘ └──────────┘ └──────────┘ └──────────┘  │
├────────────────────────────────────────────────────────┤
│  📈 Wins: Activation hit all-time high                │
│  ⚠️ Watch: D30 retention declining for 3 weeks       │
└────────────────────────────────────────────────────────┘

2. Funnel Dashboard

┌────────────────────────────────────────────────────────┐
│               SIGNUP FUNNEL (Last 7 Days)              │
├────────────────────────────────────────────────────────┤
│                                                        │
│  Landing Page   ██████████████████████████  10,000     │
│                              ↓ 35%                     │
│  Started Signup ████████████                 3,500     │
│                              ↓ 78%                     │
│  Completed      ████████                     2,730     │
│                              ↓ 52%                     │
│  Activated      █████                        1,420     │
│                              ↓ 35%                     │
│  Retained (D7)  ██                            497      │
│                                                        │
├────────────────────────────────────────────────────────┤
│  Biggest drop-off: Completed → Activated (48% loss)   │
│  vs Last Week: Start→Complete improved +5pp          │
└────────────────────────────────────────────────────────┘
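The step-to-step rates in a funnel panel like this come straight from the raw counts; a sketch using the numbers above:

```python
# Step-to-step conversion rates for the signup funnel above.
funnel = [
    ("Landing Page", 10_000),
    ("Started Signup", 3_500),
    ("Completed", 2_730),
    ("Activated", 1_420),
    ("Retained (D7)", 497),
]

def step_conversions(steps):
    """Conversion rate from each step to the next."""
    return [(a[0], b[0], b[1] / a[1]) for a, b in zip(steps, steps[1:])]

for frm, to, rate in step_conversions(funnel):
    print(f"{frm} -> {to}: {rate:.0%}")
```

The loss at each step is `1 - rate`; sorting by loss surfaces the highest-friction transitions for the callout line.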

3. Retention Dashboard

┌────────────────────────────────────────────────────────┐
│              RETENTION COHORT ANALYSIS                 │
├────────────────────────────────────────────────────────┤
│         │ W0   │ W1   │ W2   │ W3   │ W4   │ W5      │
│  Jan 1  │ 100% │ 45%  │ 38%  │ 32%  │ 29%  │ 28%     │
│  Jan 8  │ 100% │ 47%  │ 40%  │ 35%  │ 31%  │         │
│  Jan 15 │ 100% │ 48%  │ 42%  │ 38%  │      │         │
│  Jan 22 │ 100% │ 52%  │ 45%  │      │      │         │
├────────────────────────────────────────────────────────┤
│  Trend: W1 retention improving (45% → 52%)            │
│  Target: 55% W1 by end of Q1                          │
└────────────────────────────────────────────────────────┘

4. Feature Adoption Dashboard

┌────────────────────────────────────────────────────────┐
│              FEATURE ADOPTION TRACKER                  │
├────────────────────────────────────────────────────────┤
│  Feature            │ Adoption │ Trend │ Health       │
│  ─────────────────────────────────────────────────────│
│  Core Reporting     │   72%    │  ↑    │ ✅ Strong    │
│  Collaboration      │   48%    │  ↑    │ ✅ Growing   │
│  Advanced Analytics │   25%    │  →    │ ⚠️ Stalled  │
│  API Access         │   12%    │  ↓    │ ⚠️ Watch    │
│  New: Smart Alerts  │    8%    │  ↑    │ 🆕 Ramping  │
├────────────────────────────────────────────────────────┤
│  Focus Area: Advanced Analytics discovery             │
└────────────────────────────────────────────────────────┘

Visualization Best Practices

Choose the Right Chart:

| Data Type | Best Visualization | Avoid |
|---|---|---|
| Single metric | Big number, gauge | Pie chart |
| Trend over time | Line chart | Stacked bar |
| Part of whole | Stacked bar, 100% bar | Pie chart (>5 segments) |
| Comparison | Horizontal bar | Vertical bar (many items) |
| Distribution | Histogram, box plot | Pie chart |
| Correlation | Scatter plot | Line chart |
| Funnel | Funnel chart, waterfall | Pie chart |

Color Usage:

Use color intentionally:
- Green: Good/positive/target met
- Red: Bad/negative/needs attention
- Yellow: Warning/watch
- Gray: Neutral/context
- Blue: Primary data series

Avoid:
- Rainbow charts
- More than 5 colors
- Red/green only (colorblind users)

Good vs Bad Dashboard Examples

Good: Focused, Actionable

Dashboard: Weekly Activation Review
Audience: Growth team
Purpose: Identify activation blockers

Content:
1. Activation rate vs target (big number)
2. Activation funnel with WoW comparison
3. Top 3 drop-off points with absolute numbers
4. Cohort activation trend (are we improving?)
5. Segment breakdown (mobile vs web, source)

Actions enabled:
- Identify biggest activation blocker
- Prioritize next optimization
- Track experiment impact

Bad: Unfocused, Vanity-Driven

Dashboard: Product Dashboard
Audience: Everyone
Purpose: Show everything

Content:
1. Total users (all-time, always up!)
2. Page views (meaningless)
3. Session duration (average, not segmented)
4. 47 different charts
5. No trends or comparisons

Problems:
- No clear purpose
- Vanity metrics
- Too much information
- No actionable insight
- Wrong granularity

Self-Serve Analytics

Building Self-Serve Culture:

| Level | What Users Can Do | Tools |
|---|---|---|
| Viewer | See pre-built dashboards | Dashboard tool |
| Explorer | Filter, segment, drill down | BI tool |
| Builder | Create simple charts | BI tool |
| Analyst | Write SQL, complex analysis | SQL tool |

Self-Serve Guardrails:

✓ Do:
- Provide data dictionary
- Offer templates and examples
- Set up calculated metrics
- Enable sandboxed exploration

✗ Don't:
- Give raw database access
- Allow editing production dashboards
- Let users create without naming conventions

Dashboard Maintenance

Review Cadence:

| Activity | Frequency |
|---|---|
| Metric accuracy check | Weekly |
| Usage review | Monthly |
| Relevance audit | Quarterly |
| Full dashboard review | Semi-annually |

Health Indicators:

| Signal | Healthy | Unhealthy |
|---|---|---|
| Views/week | >10 | <3 |
| Active explorers | Growing | Declining |
| Stale dashboards | <10% | >30% |
| User questions | Decreasing | Increasing |

Anti-Patterns

  • Dashboard sprawl — 50 dashboards, nobody uses any
  • Vanity metrics — Always-up numbers that don't drive action
  • No context — Numbers without targets, trends, or comparisons
  • Information overload — 30 charts on one page
  • Stale data — Data from yesterday (or last week)
  • No ownership — Nobody maintains, data quality degrades
  • Pretty but useless — Beautiful charts with no insight
  • One-size-fits-all — Same dashboard for exec and IC

title: A/B Testing and Experimentation
impact: HIGH
tags: experiment, ab-testing, hypothesis, statistics

A/B Testing and Experimentation

Impact: HIGH

A/B testing removes opinion from product decisions. Done right, it's the difference between "we think" and "we know."

When to A/B Test

| Scenario | A/B Test? | Why |
|---|---|---|
| Optimizing existing flow | Yes | Measure impact of changes |
| Major redesign | Maybe | Consider qualitative first |
| New feature launch | Yes | Validate before full rollout |
| Copy/CTA changes | Yes | Quick wins, clear measurement |
| Pricing changes | Carefully | Revenue impact, user trust |
| Bug fixes | No | Just fix it |
| Low-traffic areas | No | Won't reach significance |

Anatomy of a Good Experiment

┌────────────────────────────────────────────────────────────┐
│                    EXPERIMENT DESIGN                       │
├────────────────────────────────────────────────────────────┤
│ Hypothesis:                                                │
│   Changing the CTA from "Start Free" to "Start in 60      │
│   seconds" will increase signup rate by 10%                │
│                                                            │
│ Primary Metric: Signup completion rate                     │
│ Secondary Metrics: Time to signup, D7 retention            │
│ Guardrail Metrics: Page load time, error rate              │
│                                                            │
│ Sample Size: 5,000 per variant (10,000 total)              │
│ Duration: 14 days minimum                                  │
│ Significance Level (α): 5%                                 │
│ Power (1-β): 80%                                           │
│ MDE: 10% relative improvement                              │
│                                                            │
│ Variants:                                                  │
│   Control (50%): "Start Free"                              │
│   Treatment (50%): "Start in 60 seconds"                   │
│                                                            │
│ Segments to analyze: New vs returning, mobile vs desktop   │
│ Exclusions: Existing users, bot traffic                    │
└────────────────────────────────────────────────────────────┘

The Hypothesis Formula

Structure:

If we [change X],
then [metric Y] will [increase/decrease] by [Z%],
because [reason/evidence].

Good Example:

If we reduce signup form fields from 6 to 3,
then signup completion rate will increase by 15%,
because user research showed field fatigue is the #1 drop-off reason.

Bad Example:

If we change the button color,
it might help,
because blue is nice.

Sample Size and Duration

Sample Size Calculation:

| Baseline Rate | MDE | Required Sample/Variant |
|---|---|---|
| 1% | 10% relative (1.1%) | 150,000 |
| 5% | 10% relative (5.5%) | 30,000 |
| 10% | 10% relative (11%) | 15,000 |
| 20% | 10% relative (22%) | 6,500 |
| 50% | 10% relative (55%) | 1,600 |
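These figures can be approximated with the standard two-proportion sample-size formula (normal approximation, α = 5% two-sided, 80% power). A sketch — it lands in the same ballpark as the table, which rounds its entries:

```python
from math import sqrt

def sample_size_per_variant(p1, relative_mde, z_alpha=1.96, z_beta=0.8416):
    """Per-variant sample size for a two-proportion test.

    z_alpha: critical value for alpha=5% two-sided
    z_beta: critical value for 80% power
    """
    p2 = p1 * (1 + relative_mde)          # treatment rate at the MDE
    p_bar = (p1 + p2) / 2                 # pooled proportion
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return numerator / (p2 - p1) ** 2

print(round(sample_size_per_variant(0.05, 0.10)))  # ~31,000 (table: 30,000)
print(round(sample_size_per_variant(0.50, 0.10)))  # ~1,600  (table: 1,600)
```

Note the pattern the table illustrates: the lower the baseline rate, the more traffic the same relative MDE requires.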

Duration Guidelines:

Minimum: 1 full week (capture weekly patterns)
Typical: 2-4 weeks
Maximum before concern: 6-8 weeks

Factors extending duration:
- Low traffic volume
- Smaller expected effect
- Higher baseline variance
- Need for segment analysis

Statistical Rigor

Key Concepts:

| Concept | Definition | Target |
|---|---|---|
| Significance (α) | False positive rate | 5% (p < 0.05) |
| Power (1-β) | True positive rate | 80% |
| MDE | Minimum Detectable Effect | Smallest meaningful change |
| Confidence Interval | Range of true effect | Doesn't cross zero |
Reading Results:

Experiment Results:
┌─────────────────────────────────────────────────────────┐
│ Metric: Signup Completion Rate                          │
├─────────────────────────────────────────────────────────┤
│ Control:     12.4% (n=5,200)                            │
│ Treatment:   13.8% (n=5,180)                            │
│                                                         │
│ Relative Lift: +11.3%                                   │
│ Absolute Lift: +1.4 percentage points                   │
│ 95% CI: [+5.2%, +17.8%]                                 │
│ p-value: 0.002                                          │
│ Status: SIGNIFICANT                                     │
└─────────────────────────────────────────────────────────┘

Interpretation:
- Treatment wins with 11.3% improvement
- CI doesn't include zero = statistically significant
- p < 0.05 = reject null hypothesis
- Ready to ship treatment
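A sketch of the underlying two-proportion z-test (normal approximation). The panel's figures are illustrative, so this calculation will not reproduce its exact p-value, but the conclusion — significant at α = 0.05 — holds:

```python
from math import sqrt, erf

def two_proportion_ztest(p1, n1, p2, n2):
    """Two-sided z-test for a difference in proportions (normal approx.)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)          # pooled rate under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # two-sided p-value via the normal CDF: Phi(z) = 0.5*(1 + erf(z/sqrt(2)))
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

z, p = two_proportion_ztest(0.124, 5200, 0.138, 5180)
print(f"z={z:.2f}, p={p:.4f}, significant={p < 0.05}")
```

For small samples or very low baseline rates, an exact test (e.g. Fisher's) is the safer choice; the normal approximation above is standard at this traffic level.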

Common Testing Mistakes

| Mistake | Problem | Prevention |
|---|---|---|
| Peeking | Stopping early on exciting results | Pre-set duration, sequential analysis |
| P-hacking | Running until significant | Pre-register hypothesis |
| HARKing | Hypothesizing after results | Document hypothesis before |
| Segment hunting | Finding any winning segment | Pre-define segments |
| Ignoring power | Too small a sample | Calculate sample size upfront |
| No guardrails | Winning metric, losing elsewhere | Define counter-metrics |
| Too many variants | Diluted sample | Limit to 2-4 variants |

Guardrail Metrics

Primary: Signup completion rate (trying to improve)

Guardrails (must not degrade):
┌─────────────────────────────────────────────────────────┐
│ Guardrail           │ Threshold    │ Result            │
├─────────────────────────────────────────────────────────┤
│ D7 retention        │ -5% max      │ -2% ✓             │
│ Page load time      │ +200ms max   │ +50ms ✓           │
│ Error rate          │ +0.5% max    │ +0.1% ✓           │
│ Support tickets     │ +10% max     │ -5% ✓             │
└─────────────────────────────────────────────────────────┘

Status: All guardrails passed, safe to ship
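Guardrail evaluation is a simple threshold check per metric; a sketch mirroring the table above (names and sign conventions are illustrative):

```python
# Evaluate guardrail metrics against their allowed degradation limits.
# name: (limit, observed) -- a negative limit means the metric may drop
# at most that far; a positive limit caps how much it may rise.
guardrails = {
    "d7_retention":    (-0.05, -0.02),   # may drop at most 5 points
    "page_load_ms":    (+200,  +50),     # may rise at most 200 ms
    "error_rate":      (+0.005, +0.001), # may rise at most 0.5 points
    "support_tickets": (+0.10,  -0.05),  # may rise at most 10%
}

def guardrails_pass(rules):
    """Return {name: passed} for each guardrail."""
    results = {}
    for name, (limit, observed) in rules.items():
        if limit < 0:
            results[name] = observed >= limit   # didn't drop below the floor
        else:
            results[name] = observed <= limit   # didn't rise above the cap
    return results

print(all(guardrails_pass(guardrails).values()))  # True: safe to ship
```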

Good vs Bad Experiment Design

Good: Complete, Rigorous

Experiment: Onboarding Flow Simplification

Background:
Current onboarding has 8 steps, 45% completion rate.
User research shows steps 4-6 cause most drop-offs.

Hypothesis:
Reducing onboarding from 8 to 5 steps (removing low-value steps)
will increase completion rate by 15% because users experience
less friction and see value faster.

Design:
- Primary metric: Onboarding completion rate
- Secondary: Time to complete, D7 activation
- Guardrails: Setup quality score, D30 retention
- Sample: 10,000/variant
- Duration: 3 weeks
- Targeting: New users only, all platforms

Success Criteria:
- Primary: >10% relative lift, p<0.05
- Guardrails: No degradation beyond thresholds
- Qualitative: User feedback positive

Decision Framework:
- If significant win: Ship to 100%
- If neutral: Consider qualitative findings
- If negative: Abandon, iterate
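The 10,000/variant figure can be sanity-checked with the standard two-proportion normal approximation. A sketch, with z-values hardcoded for alpha = 0.05 (two-sided) and 80% power; treat it as a lower bound, not a replacement for a proper power analysis:

```python
import math

def sample_size_per_variant(baseline_cvr, relative_mde,
                            alpha_z=1.96, power_z=0.84):
    """Approximate n per variant for a two-proportion test.
    Defaults: alpha = 0.05 (two-sided, z = 1.96), 80% power (z = 0.84)."""
    p1 = baseline_cvr
    p2 = baseline_cvr * (1 + relative_mde)
    p_bar = (p1 + p2) / 2
    n = 2 * (alpha_z + power_z) ** 2 * p_bar * (1 - p_bar) / (p2 - p1) ** 2
    return math.ceil(n)

# 45% baseline completion, +15% relative lift (the hypothesis above):
n = sample_size_per_variant(0.45, 0.15)   # roughly 860 per variant
```

For this baseline and effect size the minimum is roughly 860 per variant, so the design's 10,000/variant leaves ample headroom for segment breakdowns and guardrail analysis.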

Bad: Incomplete, Unmeasurable

Experiment: Try New Onboarding

Let's test a new onboarding flow.

Variants: A (current) vs B (new)

Run for a week and see what happens.

Problems:
- No hypothesis
- No sample size calculation
- Arbitrary duration
- No guardrails
- No success criteria
- No decision framework

Sequential Testing

For faster decisions with statistical validity:

Sequential Analysis Framework:
┌──────────────────────────────────────────────────────────┐
│ Check Point │ Cumulative n │ Continue if │ Stop if      │
├──────────────────────────────────────────────────────────┤
│ Day 3       │ 2,000        │ |z| < 2.8   │ |z| > 2.8    │
│ Day 7       │ 5,000        │ |z| < 2.5   │ |z| > 2.5    │
│ Day 14      │ 10,000       │ |z| < 2.2   │ |z| > 2.2    │
│ Day 21      │ 15,000       │ Final decision              │
└──────────────────────────────────────────────────────────┘

Benefits:
- Can stop early for clear winners/losers
- Maintains statistical validity
- Faster iteration cycle
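The checkpoint logic above reduces to a pooled two-proportion z-test against pre-set boundaries. A sketch (boundary values taken from the table; function names are illustrative):

```python
import math

# Stop/continue boundaries from the table above (day -> |z| threshold).
BOUNDARIES = {3: 2.8, 7: 2.5, 14: 2.2}

def z_statistic(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-statistic for treatment vs control."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

def checkpoint_decision(day, conv_a, n_a, conv_b, n_b):
    """'stop' if the boundary is crossed at this checkpoint, else
    'continue'; days past the last checkpoint get a final decision."""
    if day not in BOUNDARIES:
        return "final decision"
    z = z_statistic(conv_a, n_a, conv_b, n_b)
    return "stop" if abs(z) > BOUNDARIES[day] else "continue"
```

For example, a 20% vs 24% conversion split on 2,500 users per arm at day 7 gives |z| of about 3.4, which crosses the 2.5 boundary, so the test stops early.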

Experiment Velocity

Metric               Target            Current
────────────────────────────────────────────────
Experiments/month    8-12              6
Win rate             30-40%            35%
Time to decision     2 weeks           3 weeks
Experiment coverage  40% of features   25%

Increasing Velocity:

  • Pre-built experiment templates
  • Self-serve experiment platform
  • Reduced approval friction
  • Parallel testing where possible

Anti-Patterns

  • HiPPO override — Ignoring data for opinion
  • Winner's curse — Effect sizes shrink in production
  • Novelty effects — Early lift fades over time
  • Feature flag debt — Experiments live forever
  • Analysis paralysis — Running tests that don't matter
  • Underpowered tests — Results too noisy to trust
  • One-and-done — Not iterating on insights
  • No documentation — Learnings lost to time

title: Experimentation Design and Process
impact: HIGH
tags: experiment, design, hypothesis, process

Experimentation Design and Process

Impact: HIGH

Beyond A/B testing mechanics lies experiment design: asking the right questions, designing valid tests, and building a culture of learning.

The Experimentation Hierarchy

Level 5: Continuous Experimentation  ← Org-wide, embedded culture
            │
Level 4: Experiment Platform         ← Self-serve, scaled testing
            │
Level 3: Regular A/B Testing         ← Team runs experiments routinely
            │
Level 2: Ad-hoc Experiments          ← Occasional, project-based
            │
Level 1: No Experiments              ← Ship and hope

Experiment Prioritization

ICE Framework:

Factor      Question                   Score (1-10)
──────────────────────────────────────────────────────
Impact      How much lift if it wins?  Estimated % improvement
Confidence  How likely to win?         Evidence strength
Ease        How hard to implement?     Inverse of effort

ICE Score = Impact × Confidence × Ease

Prioritization Example:

Experiment Ideas Ranked:
┌────────────────────────────────────────────────────────────┐
│ Idea                      │ I │ C │ E │ Score │ Priority  │
├────────────────────────────────────────────────────────────┤
│ Simplify signup form      │ 8 │ 7 │ 8 │  448  │ 1st       │
│ Add social proof          │ 6 │ 8 │ 9 │  432  │ 2nd       │
│ Redesign onboarding       │ 9 │ 5 │ 4 │  180  │ 3rd       │
│ New pricing page          │ 7 │ 4 │ 3 │   84  │ 4th       │
└────────────────────────────────────────────────────────────┘
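Scoring and ranking a backlog is simple enough to script. A sketch using the example ideas above (the function name is illustrative):

```python
def ice_rank(ideas):
    """ideas: (name, impact, confidence, ease) tuples, each factor 1-10.
    Returns (name, ICE score) pairs sorted best-first."""
    scored = [(name, i * c * e) for name, i, c, e in ideas]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

ranked = ice_rank([
    ("Simplify signup form", 8, 7, 8),
    ("Add social proof", 6, 8, 9),
    ("Redesign onboarding", 9, 5, 4),
    ("New pricing page", 7, 4, 3),
])
# ranked[0] == ("Simplify signup form", 448), matching the table
```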

Hypothesis Development

From Observation to Hypothesis:

1. Observation
   "Users are dropping off after signup"

2. Research
   - Funnel data: 45% drop between signup and first action
   - Session recordings: Users seem confused by dashboard
   - Survey: "I didn't know what to do first"

3. Insight
   "Users lack clear guidance to their first valuable action"

4. Hypothesis
   "Adding a guided first-action wizard will increase
   signup-to-first-action rate by 15% because users
   will have clear direction"

5. Experiment Design
   - Control: Current post-signup experience
   - Treatment: Guided wizard to first action
   - Primary metric: First action completion (7 days)
   - Sample: 5,000 per variant

Types of Experiments

Type            Description                    When to Use
──────────────────────────────────────────────────────────────────────
A/B Test        Control vs one variant         Optimization, validation
A/B/n Test      Control vs multiple variants   Exploring options
Multivariate    Multiple variables at once     UI optimization
Holdout         Exclude % from all changes     Measure cumulative impact
Geo Experiment  Different regions              When randomization hard
Switchback      Alternating on/off             Time-sensitive effects
Bandit          Adaptive allocation            Fast optimization

Experiment Documentation

Experiment Brief Template:

┌────────────────────────────────────────────────────────────┐
│                    EXPERIMENT BRIEF                        │
├────────────────────────────────────────────────────────────┤
│ Name: Guided Onboarding Wizard                             │
│ Owner: Sarah (PM), Jake (Eng)                              │
│ Status: Design Review                                      │
│ Start Date: TBD                                            │
├────────────────────────────────────────────────────────────┤
│ CONTEXT                                                    │
│ Current signup-to-first-action rate is 55%. User research  │
│ shows confusion about next steps. Competitors show 70%+.   │
├────────────────────────────────────────────────────────────┤
│ HYPOTHESIS                                                 │
│ Adding a 3-step guided wizard after signup will increase   │
│ first-action completion by 15% by providing clear          │
│ direction and reducing cognitive load.                     │
├────────────────────────────────────────────────────────────┤
│ VARIANTS                                                   │
│ Control (50%): Current experience (dashboard with tips)    │
│ Treatment (50%): Guided 3-step wizard                      │
├────────────────────────────────────────────────────────────┤
│ METRICS                                                    │
│ Primary: First action completion (7 days)                  │
│ Secondary: Time to first action, D7 retention              │
│ Guardrails: Skip rate, support tickets, D30 retention      │
├────────────────────────────────────────────────────────────┤
│ TARGETING                                                  │
│ Who: New signups only                                      │
│ Exclusions: Enterprise users, mobile app                   │
│ Sample size: 10,000 total (5,000/variant)                  │
│ Duration: 3 weeks minimum                                  │
├────────────────────────────────────────────────────────────┤
│ SUCCESS CRITERIA                                           │
│ Ship if: >10% lift with p<0.05, guardrails pass            │
│ Iterate if: 5-10% lift or mixed signals                    │
│ Kill if: <5% lift or guardrail regression                  │
├────────────────────────────────────────────────────────────┤
│ RISKS & MITIGATIONS                                        │
│ Risk: Wizard feels forced                                  │
│ Mitigation: Clear skip option, track skip rate             │
└────────────────────────────────────────────────────────────┘

Good vs Bad Experiment Design

Good: Complete, Rigorous, Learnable

Experiment: Reduce Checkout Friction

Background:
- Cart abandonment at 68%
- Heatmaps show hesitation at shipping step
- Competitor analysis shows simpler flows

Hypothesis:
Combining shipping and billing into one step will increase
checkout completion by 12% by reducing perceived effort.

Design:
- A/B test, 50/50 split
- Primary: Checkout completion rate
- Secondary: Time to complete, support contacts
- Guardrails: Failed payments, address errors
- n = 8,000/variant (16,000 total)
- Duration: 4 weeks
- Segments: New vs returning, mobile vs desktop

Decision rules:
- >8% lift + p<0.05 + guardrails pass → Ship
- 3-8% lift → Extend + investigate segments
- <3% lift → Kill, learn, iterate

What we'll learn either way:
- Win: Friction reduction works, look for more
- Lose: Friction isn't the issue, test value prop

Bad: Incomplete, No Learning

Experiment: New Checkout

Test the new checkout page.

Variants:
- Old checkout
- New checkout

Run for 2 weeks.

Problems:
- No hypothesis
- No primary metric
- No sample size calculation
- No guardrails
- No decision criteria
- No learning documented

Running an Experimentation Program

Monthly Experiment Cadence:

Week    Activities
────────────────────────────────────────────────────
Week 1  Analyze previous experiments, update backlog
Week 2  Design next experiments, build variants
Week 3  Launch experiments, monitor ramp
Week 4  Analyze results, document learnings

Experiment Pipeline:

┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│   BACKLOG    │ →  │   DESIGN     │ →  │  IMPLEMENT   │
│  (20+ ideas) │    │ (3-5 briefs) │    │  (2-3 ready) │
└──────────────┘    └──────────────┘    └──────────────┘
        │                   │                  │
        ▼                   ▼                  ▼
┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│   RUNNING    │ ←  │   LAUNCHED   │ ←  │     QA       │
│  (2-4 live)  │    │  (ramping)   │    │  (testing)   │
└──────────────┘    └──────────────┘    └──────────────┘
        │
        ▼
┌──────────────┐    ┌──────────────┐
│   ANALYZE    │ →  │  DOCUMENT    │
│  (results)   │    │  (learnings) │
└──────────────┘    └──────────────┘

Experiment Velocity Metrics

Metric             Definition              Target
────────────────────────────────────────────────────
Experiments/Month  Tests completed         8-12
Win Rate           % with positive result  30-40%
Coverage           % of features tested    >40%
Cycle Time         Idea to result (days)   <21
Learning Rate      Insights documented     100%

Building Experiment Culture

Signs of Good Culture:

✓ PMs write hypotheses before building
✓ Engineers suggest experiments proactively
✓ Failed experiments are celebrated for learning
✓ Decisions cite experiment results
✓ Experiment results are shared widely
✓ Backlog is always full of ideas

Signs of Poor Culture:

✗ Experiments are done to "validate" decisions already made
✗ Inconclusive results are spun as wins
✗ Only "safe" experiments are run
✗ Results aren't documented or shared
✗ Leadership overrides experiment results
✗ Teams don't know experiment status

Experimentation Ethics

Principle         Guidelines
────────────────────────────────────────────────────────
Informed users    Disclose testing in terms of service
No harm           Don't test things that hurt users
Fair value        Don't withhold core value from control
Privacy           Only collect necessary data
Sensitive topics  Extra care with health, finance

Anti-Patterns

  • HiPPO culture — Highest-paid person's opinion overrides data
  • P-hacking — Running until you get significance
  • Cherry-picking — Only reporting favorable segments
  • Feature factory — Shipping without measuring
  • One-and-done — Not iterating on learnings
  • Analysis paralysis — Perfect is enemy of good
  • Cargo cult — Running experiments without understanding stats
  • Winner's curse — Expecting production results = test results

title: Feature Adoption Tracking
impact: HIGH
tags: feature, adoption, usage, product-analytics

Feature Adoption Tracking

Impact: HIGH

Shipping features is easy. Getting users to adopt them is hard. Track adoption systematically to understand what's working and where to invest.

The Feature Adoption Lifecycle

                Awareness → Trial → Adoption → Habit → Advocacy
                    │         │        │         │         │
                    ▼         ▼        ▼         ▼         ▼
                "Saw it"  "Tried it" "Use it"  "Need it"  "Love it"
                    │         │        │         │         │
Metrics:        Exposure   First Use  Regular   Power    Referral
                Rate       Rate       Usage     Usage     Rate

Core Feature Adoption Metrics

Metric           Calculation                                   Good Target
────────────────────────────────────────────────────────────────────────────
Awareness Rate   Users who saw feature / Total eligible users  >80%
Trial Rate       Users who tried feature / Users aware         >40%
Adoption Rate    Regular users / Users who tried               >50%
Feature DAU      Daily active users of feature                 Depends
Feature Breadth  % of user base using feature weekly           >30% for core
Feature Depth    Actions/session using feature                 Growing
Time to Adopt    Days from signup to regular use               <14 days

Feature Adoption Framework

Step 1: Define What "Adopted" Means

Feature: Collaboration Comments

NOT Adopted:
- Saw comments feature (awareness only)
- Left one comment ever (trial only)

Adopted:
- Left 3+ comments in 2 different weeks
- Received 2+ replies to comments
- Comments are part of regular workflow

Step 2: Build the Adoption Funnel

Eligible Users          1,000    100%
   │
   ▼ Saw feature
Aware                     820     82%
   │
   ▼ Clicked/opened
Explored                  410     41%
   │
   ▼ Completed first use
Tried                     285     28.5%
   │
   ▼ Used 3+ times
Activated                 170     17%
   │
   ▼ Regular weekly use
Adopted                    95      9.5%
   │
   ▼ Daily/power use
Power User                 28      2.8%

Step 3: Measure by Cohort

Feature launched: March 1

Adoption rate by signup cohort:
┌────────────────────────────────────────────┐
│ Signup Cohort │ 30-Day Adoption Rate       │
├────────────────────────────────────────────┤
│ Pre-launch    │    8% (discovered later)   │
│ March W1      │   12% (early exposure)     │
│ March W2      │   18% (improved UX)        │
│ March W3      │   22% (onboarding added)   │
│ March W4      │   25% (tutorial added)     │
└────────────────────────────────────────────┘

Insight: Each improvement increases adoption

Feature Usage Segmentation

Usage Depth Tiers:

Tier        Definition                   % of Feature Users
─────────────────────────────────────────────────────────────
Power       Daily use, advanced actions  5-10%
Regular     Weekly use, core actions     20-30%
Occasional  Monthly use, basic actions   30-40%
Trial       1-2 uses total               25-40%

Segment Analysis:

Feature: Advanced Analytics

User Segment Analysis:
┌─────────────────────────────────────────────────────────┐
│ User Type    │ Adoption │ Depth      │ Retention Impact│
├─────────────────────────────────────────────────────────┤
│ Enterprise   │   68%    │ High       │ +25% retention  │
│ Mid-market   │   42%    │ Medium     │ +12% retention  │
│ SMB          │   15%    │ Low        │ +5% retention   │
│ Free tier    │    3%    │ Very low   │ +2% retention   │
└─────────────────────────────────────────────────────────┘

Insight: Feature is enterprise-focused, not relevant to SMB
Action: Gate feature for SMB, or simplify for their needs

Feature Success Scorecard

Dimension   Metric                         Target  Status
────────────────────────────────────────────────────────────
Reach       % of users who tried           40%     28%
Adoption    % of triers who adopted        50%     35%
Engagement  Actions/week by adopters       5       3.2
Retention   Feature user retention vs non  +10%    +8%
Value       NPS of feature users           50      42
Revenue     Conversion impact              +5%     +3%

Overall Score: Needs improvement (all 6 dimensions under target)

Feature Adoption vs Retention Impact

Question: Is this feature worth keeping?

Feature Impact Matrix:
                        Low Retention Impact    High Retention Impact
                        ──────────────────────────────────────────────
High Adoption           │ SIMPLIFY             │ INVEST
                        │ Popular but not      │ Core feature,
                        │ sticky. Reduce cost  │ double down
                        │                      │
Low Adoption            │ SUNSET               │ EXPOSE
                        │ Nobody uses,         │ Valuable but hidden.
                        │ nobody misses        │ Improve discovery
                        ──────────────────────────────────────────────

Good vs Bad Feature Tracking

Good: Complete Picture

Feature: Smart Scheduling

Tracking implemented:
├── Awareness events
│   ├── smart_schedule_banner_shown
│   ├── smart_schedule_tooltip_shown
│   └── smart_schedule_menu_seen
├── Trial events
│   ├── smart_schedule_clicked
│   ├── smart_schedule_first_created
│   └── smart_schedule_preview_viewed
├── Adoption events
│   ├── smart_schedule_created (with count property)
│   ├── smart_schedule_accepted
│   └── smart_schedule_modified
└── Value events
    ├── smart_schedule_conflict_resolved
    └── smart_schedule_time_saved (calculated)

Dashboard includes:
- Adoption funnel
- Cohort adoption curves
- Segment breakdown
- Retention impact
- Feature health score

Bad: Basic Counting

Feature: Smart Scheduling

Tracking implemented:
- smart_schedule_used (single event)

Problems:
- Can't distinguish trial vs adoption
- No awareness tracking
- No segment analysis
- No value measurement
- No retention correlation

Feature Launch Measurement Plan

Pre-Launch:

  1. Define adoption criteria
  2. Instrument all funnel events
  3. Set success targets
  4. Build dashboard
  5. Establish baseline (if applicable)

Launch Week:

Day 1-3: Awareness check
- Are users seeing the feature?
- Is discovery working?

Day 3-7: Trial check
- Are aware users trying it?
- What's the drop-off point?

Day 7-14: Early adoption
- Are triers coming back?
- What patterns emerge?

Post-Launch (30/60/90 days):

Monthly Review:
┌─────────────────────────────────────────────────────────┐
│ Metric              │ Target │ Day 30 │ Day 60 │ Day 90│
├─────────────────────────────────────────────────────────┤
│ Adoption rate       │  25%   │  12%   │  18%   │  22%  │
│ Weekly active users │ 1,000  │  400   │  750   │  920  │
│ Retention impact    │  +10%  │   +4%  │   +7%  │   +9% │
│ User satisfaction   │  4.2   │   3.8  │   4.0  │   4.1 │
└─────────────────────────────────────────────────────────┘

Feature Sunset Criteria

When to consider sunsetting:

Signal            Threshold                    Action
──────────────────────────────────────────────────────────────
Adoption          <5% after 6 months           Sunset or pivot
Retention impact  <2% difference               Low priority
Maintenance cost  High, growing                Evaluate ROI
User complaints   More complaints than praise  Investigate or sunset
Strategic fit     Off-roadmap                  Planned deprecation

Anti-Patterns

  • Ship and forget — No measurement plan = no learning
  • Counting clicks — Clicks ≠ adoption, measure meaningful use
  • Ignoring non-users — Why didn't they try? That's data too
  • Universal metrics — Different features need different criteria
  • Short windows — Features need 30-90 days to reach adoption
  • No control group — Can't measure impact without comparison
  • Feature bloat — More features ≠ better product
  • Vanity features — Building for press releases, not users

title: Funnel Analysis and Optimization
impact: CRITICAL
tags: funnel, conversion, optimization, growth

Funnel Analysis and Optimization

Impact: CRITICAL

Funnels reveal where users drop off and where optimization delivers the biggest returns. Master funnel analysis to find your highest-leverage growth opportunities.

Funnel Fundamentals

A funnel tracks users through a sequence of steps toward a goal. Each step has conversion and drop-off rates.

Step 1: Landing Page Visit     100%    ─┐
                                        │ 60% convert
Step 2: Signup Started          60%    ─┤
                                        │ 80% convert
Step 3: Signup Completed        48%    ─┤
                                        │ 50% convert
Step 4: Onboarding Done         24%    ─┤
                                        │ 40% convert
Step 5: First Core Action        9.6%  ─┘

Overall Conversion: 9.6%

Core Funnel Metrics

Metric           Calculation                            Interpretation
──────────────────────────────────────────────────────────────────────────────
Step CVR         Users at Step N+1 / Users at Step N    Individual step performance
Cumulative CVR   Users at Step N / Users at Step 1      Overall funnel performance
Drop-off Rate    1 - Step CVR                           Loss at each step
Funnel Velocity  Time from Step 1 to final step         Speed through funnel
Recovery Rate    Returning drop-offs / Total drop-offs  Re-engagement success
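The step-level metrics can be computed directly from ordered step counts. A sketch (function name illustrative), using the example funnel from the top of this section:

```python
def funnel_metrics(counts):
    """counts: ordered (step_name, users) pairs.
    Returns step CVR, cumulative CVR, and drop-off for each transition."""
    top = counts[0][1]
    rows = []
    for (_, prev), (name, cur) in zip(counts, counts[1:]):
        step_cvr = cur / prev
        rows.append({
            "step": name,
            "step_cvr": round(step_cvr, 3),
            "cumulative_cvr": round(cur / top, 3),
            "drop_off": round(1 - step_cvr, 3),
        })
    return rows

rows = funnel_metrics([
    ("landing_visit", 1000),
    ("signup_started", 600),
    ("signup_completed", 480),
    ("onboarding_done", 240),
    ("first_core_action", 96),
])
# rows[-1]["cumulative_cvr"] == 0.096, matching the 9.6% overall figure
```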

Building Effective Funnels

Step 1: Define the Goal

What outcome defines success?
- Signup completed
- First purchase made
- Feature activated
- Subscription started

Step 2: Map the Critical Path

What steps MUST users complete?
1. Landing page → Signup form
2. Signup form → Email verified
3. Email verified → Onboarding complete
4. Onboarding complete → First action

Step 3: Instrument Events

Each step needs a trackable event:
- page_viewed (landing)
- signup_started
- signup_completed
- email_verified
- onboarding_completed
- first_action_taken

Step 4: Define the Conversion Window

How long to wait before calling it "dropped"?
- Signup funnel: 1 session or 30 minutes
- Onboarding funnel: 7 days
- Purchase funnel: 30 days
- Enterprise sales: 90 days
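A conversion window becomes a simple predicate at query time. A sketch using the windows listed above (the dictionary keys and function name are illustrative):

```python
from datetime import datetime, timedelta

# Conversion windows from the list above; keys are illustrative names.
CONVERSION_WINDOWS = {
    "signup": timedelta(minutes=30),
    "onboarding": timedelta(days=7),
    "purchase": timedelta(days=30),
    "enterprise_sales": timedelta(days=90),
}

def converted_within_window(funnel, entered_at, converted_at):
    """True if the conversion event landed inside the funnel's window.
    Users with no conversion event, or a late one, count as drop-offs."""
    if converted_at is None:
        return False
    return converted_at - entered_at <= CONVERSION_WINDOWS[funnel]
```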

Funnel Analysis Techniques

1. Step-by-Step CVR Analysis

Step Transition             CVR   Benchmark  Status
──────────────────────────────────────────────────────
Visit → Signup Started      35%   40%        Under
Signup Started → Completed  75%   80%        Under
Completed → Activated       50%   45%        Good
Activated → Paid            12%   10%        Good

2. Segment Comparison

Signup Funnel by Source:
┌──────────────────────────────────────────────────┐
│ Source      │ Started │ Completed │ Activated  │
├──────────────────────────────────────────────────┤
│ Google Ads  │  100%   │   68%     │   25%      │
│ Organic     │  100%   │   82%     │   45%      │
│ Referral    │  100%   │   88%     │   62%      │
│ Social      │  100%   │   55%     │   18%      │
└──────────────────────────────────────────────────┘

Insight: Referral traffic has 2.5x higher activation rate
Action: Increase investment in referral program

3. Time-Based Analysis

Drop-off timing within onboarding step:
┌──────────────────────────────────────────────────┐
│ Time in Step │ Still Active │ % of Drop-offs    │
├──────────────────────────────────────────────────┤
│ 0-1 min      │    82%       │    35%            │
│ 1-3 min      │    65%       │    28%            │
│ 3-5 min      │    51%       │    22%            │
│ 5+ min       │    48%       │    15%            │
└──────────────────────────────────────────────────┘

Insight: 35% of drop-offs happen in first minute
Action: Investigate first-screen friction

Finding the Biggest Opportunity

Calculate Impact Potential:

Current funnel:
Visit (1000) → Start (350) → Complete (280) → Activate (140) → Pay (17)

Impact of 10% improvement at each step:

Step improved       New Paying    Lift vs Baseline
────────────────────────────────────────────────
Visit → Start        18.7            +10%
Start → Complete     18.7            +10%
Complete → Activate  18.7            +10%
Activate → Pay       18.7            +10%

All steps equal? Mathematically, yes: a 10% relative lift at any single
step multiplies overall conversion by the same 1.1x. In practice, favor
the EARLIEST underperforming step: improving it sends more users into
every downstream step, so later fixes operate on a larger base.

The Opportunity Formula:

Opportunity Score = Drop-off Volume × Conversion Headroom

Step              Volume    Current CVR   Benchmark   Headroom   Score
────────────────────────────────────────────────────────────────────────
Visit → Start      1000        35%          45%        10%       100
Start → Complete    350        80%          85%         5%        17.5
Complete → Active   280        50%          60%        10%        28
Active → Pay        140        12%          15%         3%         4.2

Focus: Visit → Start has highest opportunity score
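The opportunity score is easy to compute for a whole funnel at once. A sketch reproducing the table above (step names are illustrative):

```python
def opportunity_scores(steps):
    """steps: (name, entering_volume, current_cvr, benchmark_cvr) tuples.
    Score = volume x conversion headroom (benchmark - current)."""
    scored = [(name, round(vol * max(bench - cur, 0.0), 1))
              for name, vol, cur, bench in steps]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

ranked = opportunity_scores([
    ("visit_to_start", 1000, 0.35, 0.45),
    ("start_to_complete", 350, 0.80, 0.85),
    ("complete_to_activate", 280, 0.50, 0.60),
    ("activate_to_pay", 140, 0.12, 0.15),
])
# ranked[0] == ("visit_to_start", 100.0), matching the table's focus
```

Clamping headroom at zero means steps already beating their benchmark simply score 0 rather than going negative.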

Common Funnel Drop-off Causes

Drop-off Point             Common Causes                                Diagnostic Actions
───────────────────────────────────────────────────────────────────────────────────────────────────────
Landing → Signup           Unclear value prop, slow load, trust issues  A/B test headlines, speed audit, add social proof
Signup Started → Complete  Too many fields, technical errors, friction  Form analytics, error tracking, reduce fields
Signup → Onboarding        Email not verified, confusing next steps     Email deliverability, clearer CTA
Onboarding → Activation    Overwhelming UX, unclear value, too long     User session recordings, simplify flow
Activation → Retention     Value not realized, wrong ICP, bugs          Cohort analysis, user interviews
Free → Paid                Price, perceived value, friction             Pricing tests, value messaging, checkout UX

Good vs Bad Funnel Examples

Good: Well-Defined, Actionable

Goal: Free trial to paid conversion

Funnel Definition:
1. Trial Started (tracked: trial_started event)
2. Core Feature Used (tracked: first_report_created)
3. Team Invited (tracked: first_team_member_added)
4. Limit Hit (tracked: usage_limit_reached)
5. Upgrade Started (tracked: upgrade_clicked)
6. Paid (tracked: subscription_created)

Conversion Window: 14-day trial period

Analysis Segmentation:
- By company size
- By acquisition source
- By first feature used
- By engagement level

Clear ownership: Growth team owns steps 1-3, Sales owns 4-6

Bad: Vague, Unmeasurable

Goal: Get more customers

Funnel:
1. Someone visits
2. They like it
3. They sign up
4. They use it
5. They pay

Problems:
- Events not defined
- "Like it" not measurable
- No conversion windows
- No segmentation plan
- No ownership

Optimization Playbook

For each low-converting step:

  1. Quantify the drop-off

    • How many users?
    • What's the benchmark?
    • What's the revenue impact?
  2. Diagnose the cause

    • Watch session recordings
    • Check error logs
    • Run user surveys at drop-off
    • Analyze by segment
  3. Hypothesize and prioritize

    • List possible fixes
    • Estimate impact × confidence
    • Pick highest-ROI test
  4. Test and measure

    • A/B test the change
    • Wait for significance
    • Measure downstream effects
  5. Learn and iterate

    • Document results
    • Roll out winners
    • Update benchmarks

Anti-Patterns

  • Too many steps — Combine micro-steps, focus on key conversions
  • Wrong conversion window — Too short misses slow converters, too long dilutes signal
  • Ignoring segments — Averages hide important cohort differences
  • Single-metric focus — Improving one step might hurt another
  • No baseline — Track historical performance before optimizing
  • Premature optimization — Get statistical significance first
  • Funnel-only thinking — Some users take non-linear paths

title: Impact Measurement
impact: HIGH
tags: impact, measurement, roi, outcomes

Impact Measurement

Impact: HIGH

Shipping features is easy. Proving they worked is hard. Rigorous impact measurement separates "we launched" from "we succeeded."

The Impact Measurement Framework

┌──────────────────────────────────────────────────────────────┐
│                    IMPACT MEASUREMENT                        │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  1. DEFINE SUCCESS                                           │
│     What outcome are we trying to achieve?                   │
│                        ↓                                     │
│  2. ESTABLISH BASELINE                                       │
│     What's the current state?                                │
│                        ↓                                     │
│  3. SET TARGETS                                              │
│     What improvement do we expect?                           │
│                        ↓                                     │
│  4. DESIGN MEASUREMENT                                       │
│     How will we know if we succeeded?                        │
│                        ↓                                     │
│  5. ANALYZE RESULTS                                          │
│     Did we achieve the target?                               │
│                        ↓                                     │
│  6. ATTRIBUTE CAUSATION                                      │
│     Was it because of our change?                            │
│                                                              │
└──────────────────────────────────────────────────────────────┘

Types of Impact Measurement

Type                      Approach                        Confidence   When to Use
───────────────────────────────────────────────────────────────────────────────────────
A/B Test                  Randomized control              High         Optimization
Pre/Post                  Before vs after                 Medium       When A/B not possible
Cohort Comparison         New vs old users                Medium       Feature additions
Difference-in-Differences Treatment vs control over time  Medium-High  Quasi-experimental
Instrumentation           Event tracking                  N/A          Adoption measurement

Setting Success Criteria Before Launch

Pre-Launch Success Definition:

Feature: Smart Recommendations

Success Criteria Document:
┌──────────────────────────────────────────────────────────────┐
│ Metric                │ Baseline │ Target    │ Measurement  │
├──────────────────────────────────────────────────────────────┤
│ Click-through rate    │   2.1%   │   3.5%    │ A/B test     │
│ Items viewed/session  │   3.2    │   4.5     │ A/B test     │
│ Conversion rate       │   4.8%   │   5.5%    │ A/B test     │
│ Adoption rate         │   N/A    │   40%     │ Tracking     │
│ User satisfaction     │   3.8    │   4.2     │ Survey       │
├──────────────────────────────────────────────────────────────┤
│ Timeline: 30 days post-launch                                │
│ Decision: Keep if 2+ primary metrics hit target              │
│ Owner: PM + Analytics                                        │
└──────────────────────────────────────────────────────────────┘

Impact Measurement Methods

1. A/B Test Impact

Approach: Randomized experiment
Best for: Feature optimization, UX changes

Example:
┌──────────────────────────────────────────────────────────────┐
│ New Checkout Flow A/B Test Results                           │
├──────────────────────────────────────────────────────────────┤
│ Metric           │ Control │ Treatment │ Lift    │ p-value  │
├──────────────────────────────────────────────────────────────┤
│ Checkout CVR     │  12.4%  │   14.8%   │ +19.4%  │  0.001   │
│ Avg order value  │  $87    │   $89     │  +2.3%  │  0.142   │
│ Cart abandonment │  68%    │   62%     │  -8.8%  │  0.003   │
├──────────────────────────────────────────────────────────────┤
│ Impact: +$142K estimated annual revenue                      │
│ Confidence: HIGH (A/B tested with significance)              │
└──────────────────────────────────────────────────────────────┘
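The p-values in a table like this come from a two-proportion test. A sketch using the pooled z-test with a normal-approximation p-value; the per-variant counts below are assumed for illustration, since the table doesn't state sample sizes:

```python
import math

def two_prop_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a difference in conversion rates,
    using the pooled two-proportion z-test (normal approximation)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return math.erfc(abs(z) / math.sqrt(2))  # 2 * (1 - Phi(|z|))

# Assumed counts: 8,000 users/variant reproduces the table's 12.4% vs
# 14.8% checkout CVR (992 and 1,184 conversions); p comes out < 0.001.
p = two_prop_p_value(992, 8_000, 1_184, 8_000)
```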

2. Pre/Post Analysis

Approach: Compare before and after launch
Best for: When A/B test not feasible

Example:
┌──────────────────────────────────────────────────────────────┐
│ New Onboarding Pre/Post Analysis                             │
├──────────────────────────────────────────────────────────────┤
│ Period           │ Signup → Active │ D7 Retention           │
├──────────────────────────────────────────────────────────────┤
│ Pre (4 weeks)    │     45%         │     32%                │
│ Post (4 weeks)   │     58%         │     41%                │
│ Change           │    +13pp        │    +9pp                │
├──────────────────────────────────────────────────────────────┤
│ Confounds to consider:                                       │
│ - Seasonality (controlled: same weeks YoY similar)           │
│ - Marketing changes (controlled: no campaign changes)        │
│ - Product changes (controlled: isolated change)              │
├──────────────────────────────────────────────────────────────┤
│ Confidence: MEDIUM (confounds controlled but not eliminated) │
└──────────────────────────────────────────────────────────────┘

3. Cohort Comparison

Approach: Compare users who had feature vs those who didn't
Best for: Gradual rollouts, feature additions

Example:
┌──────────────────────────────────────────────────────────────┐
│ New Feature Cohort Analysis                                  │
├──────────────────────────────────────────────────────────────┤
│ Cohort              │ D30 Retention │ ARPU   │ NPS           │
├──────────────────────────────────────────────────────────────┤
│ With feature        │     48%       │ $42    │  52           │
│ Without feature     │     38%       │ $35    │  41           │
│ Difference          │    +10pp      │ +$7    │ +11           │
├──────────────────────────────────────────────────────────────┤
│ Caution: Self-selection bias possible                        │
│ Users who adopt features may be inherently more engaged      │
└──────────────────────────────────────────────────────────────┘

Calculating Business Impact

Revenue Impact Calculation:

Feature: Improved Recommendations

Metrics:
- 10,000 daily users exposed
- CTR improved from 2% to 3.5% (+75%)
- CVR on clicked items: 8%
- AOV: $50

Calculation:
Before: 10,000 × 2% × 8% × $50 = $800/day
After:  10,000 × 3.5% × 8% × $50 = $1,400/day
Lift:   $600/day = $219,000/year

Impact Statement:
"Smart Recommendations drive an estimated $219K additional
annual revenue through improved discovery."
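The arithmetic above as a small script (values from the example; the function name is illustrative):

```python
def daily_recommendation_revenue(users, ctr, cvr, aov):
    """Expected revenue/day: exposed users x CTR x CVR x AOV."""
    return users * ctr * cvr * aov

before = daily_recommendation_revenue(10_000, 0.02, 0.08, 50)   # $800/day
after = daily_recommendation_revenue(10_000, 0.035, 0.08, 50)   # $1,400/day
annual_lift = (after - before) * 365                            # ~$219,000
```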

LTV Impact Calculation:

Feature: Onboarding Improvements

Metrics:
- 30-day retention improved 10pp (32% → 42%)
- Monthly churn decreased from 8% to 6%
- Average monthly revenue: $30

LTV Calculation:
Before: $30 / 8% = $375 LTV
After:  $30 / 6% = $500 LTV
Lift:   +$125 per customer (+33%)

With 1,000 new customers/month:
Annual LTV impact: $125 × 1,000 × 12 = $1.5M
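The same churn-based LTV model, as a sketch (this is the simple LTV ≈ monthly revenue / monthly churn approximation used above; names are illustrative):

```javascript
// Simple churn-based LTV: average monthly revenue / monthly churn rate.
function ltv(monthlyRevenue, monthlyChurnRate) {
  return monthlyRevenue / monthlyChurnRate;
}

const ltvBefore = ltv(30, 0.08);                   // ≈ $375
const ltvAfter = ltv(30, 0.06);                    // ≈ $500
const liftPerCustomer = ltvAfter - ltvBefore;      // ≈ $125 (+33%)
const annualImpact = liftPerCustomer * 1000 * 12;  // ≈ $1.5M at 1,000 new customers/month
```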

Good vs Bad Impact Measurement

Good: Rigorous, Pre-defined

Feature Launch: Team Collaboration

Pre-Launch:
- Success criteria defined
- Baseline metrics captured
- A/B test designed
- 30-day measurement plan

Results (30 days):
┌──────────────────────────────────────────────────────────────┐
│ Metric            │ Goal     │ Result   │ Status            │
├──────────────────────────────────────────────────────────────┤
│ Adoption rate     │ 25%      │ 32%      │ ✓ Exceeded        │
│ Team retention    │ +5pp     │ +8pp     │ ✓ Exceeded        │
│ Seats per team    │ +0.5     │ +0.8     │ ✓ Exceeded        │
│ Support tickets   │ No lift  │ +5%      │ ⚠ Watch           │
├──────────────────────────────────────────────────────────────┤
│ Decision: Ship, address support ticket drivers               │
│ Impact: Est. $340K ARR from seat expansion                   │
└──────────────────────────────────────────────────────────────┘

Bad: Post-hoc, Cherry-picked

Feature Launch: Team Collaboration

Post-Launch:
- "Let's see what metrics improved"
- Found: Page views up 15%!
- Declared: Success!

Problems:
- No pre-defined success criteria
- No baseline comparison
- Cherry-picked positive metric
- Page views ≠ business value
- No causal attribution

Attribution Challenges

Challenge          │ Description                   │ Mitigation
───────────────────┼───────────────────────────────┼───────────────────────────
Confounds          │ Other things changed          │ Control for variables
Selection Bias     │ Non-random exposure           │ Randomized experiment
Novelty Effect     │ Initial excitement fades      │ Longer measurement window
Regression to Mean │ Extreme values normalize      │ Use control groups
Hawthorne Effect   │ Behavior changes when watched │ Blind experiments

Impact Reporting Template

┌──────────────────────────────────────────────────────────────┐
│                    IMPACT REPORT                             │
│                 Feature: Smart Search                        │
├──────────────────────────────────────────────────────────────┤
│ SUMMARY                                                      │
│ Smart Search improved search conversion by 23% in A/B test,  │
│ generating estimated $180K annual revenue.                   │
├──────────────────────────────────────────────────────────────┤
│ HYPOTHESIS                                                   │
│ AI-powered search suggestions would increase search→result   │
│ clicks by 20% by surfacing more relevant results.            │
├──────────────────────────────────────────────────────────────┤
│ METHODOLOGY                                                  │
│ - A/B test, 50/50 split                                      │
│ - n = 45,000 (22,500/variant)                                │
│ - Duration: 28 days                                          │
│ - Primary metric: Search→click rate                          │
├──────────────────────────────────────────────────────────────┤
│ RESULTS                                                      │
│ Metric              │ Control │ Treatment │ p-value         │
│ Search→click rate   │  18.2%  │   22.4%   │ <0.001          │
│ Searches/session    │   1.8   │    2.1    │  0.003          │
│ Purchase after search│  4.2%  │   4.8%    │  0.021          │
├──────────────────────────────────────────────────────────────┤
│ BUSINESS IMPACT                                              │
│ - 23% more users find what they're looking for               │
│ - 14% increase in search-driven purchases                    │
│ - Estimated $180K additional annual revenue                  │
├──────────────────────────────────────────────────────────────┤
│ RECOMMENDATION                                               │
│ Ship to 100%. Consider extending to mobile app.              │
├──────────────────────────────────────────────────────────────┤
│ LEARNINGS                                                    │
│ - AI suggestions most effective for complex queries          │
│ - Less impact for branded/simple searches                    │
│ - Mobile users engaged more with suggestions                 │
└──────────────────────────────────────────────────────────────┘
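The p-values in a results table like this come from standard significance tests; a two-proportion z-test on the primary metric reproduces the report's headline result (sample sizes from the report above; the helper name is illustrative):

```javascript
// Two-proportion z-test: is the treatment rate significantly above control?
function twoProportionZ(pControl, pTreatment, nPerVariant) {
  const pooled = (pControl + pTreatment) / 2; // equal n, so a simple mean
  const se = Math.sqrt(pooled * (1 - pooled) * (2 / nPerVariant));
  return (pTreatment - pControl) / se;
}

const z = twoProportionZ(0.182, 0.224, 22500);
// z ≈ 11 — far beyond 3.29, the two-sided threshold for p < 0.001
```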

Impact Measurement Cadence

Timeframe   │ Focus      │ Activities
────────────┼────────────┼──────────────────────────────────────────────
Pre-Launch  │ Definition │ Success criteria, baseline, measurement plan
Launch +7d  │ Adoption   │ Is feature being used? Any bugs?
Launch +30d │ Primary    │ Did we hit success criteria?
Launch +90d │ Sustained  │ Are gains holding? Any degradation?
Quarterly   │ Portfolio  │ Which features drove most impact?

Anti-Patterns

  • Post-hoc success definition — Defining success after seeing results
  • Vanity metrics — Measuring what looks good, not what matters
  • No baseline — Can't measure lift without starting point
  • Ignoring context — External factors explain "lift"
  • Single metric obsession — Winning primary while losing guardrails
  • Short-term thinking — Week 1 results ≠ long-term impact
  • No follow-up — Measuring once, never checking if it holds
  • Correlation claims — "Feature users retain better" (selection bias)

title: Data Instrumentation and Events
impact: HIGH
tags: instrumentation, events, tracking, data-quality

Data Instrumentation and Events

Impact: HIGH

Bad data in, bad decisions out. Solid instrumentation is the foundation of all product analytics. Invest here first.

Event Tracking Fundamentals

What is an event? An event is a record of something that happened: a user action, system occurrence, or state change.

{
  "event": "button_clicked",
  "timestamp": "2024-01-15T10:30:00Z",
  "user_id": "user_123",
  "properties": {
    "button_name": "signup_cta",
    "page": "homepage",
    "variant": "blue"
  }
}

Event Taxonomy Principles

1. Naming Conventions

Convention    │ Format       │ Example
──────────────┼──────────────┼───────────────────────────────────
Object_Action │ noun_verb    │ signup_completed, project_created
Action_Object │ verb_noun    │ completed_signup, created_project
Page_Action   │ context_verb │ onboarding_step_viewed

Pick ONE convention and stick to it.

2. Event Hierarchy

Level 1: Categories (high-level)
├── user_lifecycle
├── content_engagement
├── feature_usage
├── commerce
└── system

Level 2: Objects (what)
├── user_lifecycle
│   ├── signup
│   ├── login
│   └── subscription
├── feature_usage
│   ├── project
│   ├── report
│   └── export

Level 3: Actions (happened)
├── user_lifecycle
│   ├── signup
│   │   ├── started
│   │   ├── completed
│   │   └── failed

3. Property Standards

Every event should include:
┌─────────────────────────────────────────────────────────┐
│ Property        │ Type     │ Description               │
├─────────────────────────────────────────────────────────┤
│ timestamp       │ datetime │ When event occurred       │
│ user_id         │ string   │ Unique user identifier    │
│ session_id      │ string   │ Session identifier        │
│ platform        │ string   │ web, ios, android         │
│ app_version     │ string   │ Application version       │
└─────────────────────────────────────────────────────────┘

Event-specific properties:
┌─────────────────────────────────────────────────────────┐
│ Property        │ Type     │ Example                   │
├─────────────────────────────────────────────────────────┤
│ object_id       │ string   │ project_456               │
│ object_type     │ string   │ project                   │
│ value           │ number   │ 99.99                     │
│ source          │ string   │ email_campaign            │
│ variant         │ string   │ experiment_a              │
└─────────────────────────────────────────────────────────┘
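One way to guarantee the standard properties are always present is a small builder that merges them with the event-specific ones. A sketch, with illustrative names and hard-coded platform/version values standing in for whatever your app exposes:

```javascript
// Merge the required standard properties with event-specific ones.
function buildEvent(name, userId, sessionId, properties = {}) {
  return {
    event: name,
    timestamp: new Date().toISOString(),
    user_id: userId,
    session_id: sessionId,
    platform: "web",        // assumption: web, ios, or android
    app_version: "1.4.2",   // assumption: illustrative version string
    properties,
  };
}

const evt = buildEvent("project_created", "user_123", "sess_456", {
  project_id: "proj_789",
  source: "dashboard",
});
```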

Building an Event Spec

Event Specification Document:

Event: project_created
Category: feature_usage
Description: User creates a new project

Trigger:
- When: After successful project save to database
- Where: Project creation flow (web, mobile)
- Who: Authenticated users only

Properties:
┌───────────────────────────────────────────────────────────────┐
│ Property       │ Type    │ Required │ Description            │
├───────────────────────────────────────────────────────────────┤
│ project_id     │ string  │ Yes      │ Unique project ID      │
│ project_name   │ string  │ Yes      │ User-defined name      │
│ template_used  │ string  │ No       │ Template ID if used    │
│ team_id        │ string  │ No       │ Team if shared         │
│ source         │ string  │ Yes      │ Where created from     │
│ time_to_create │ number  │ Yes      │ Seconds from start     │
└───────────────────────────────────────────────────────────────┘

Source values: onboarding, dashboard, import, api, template

Example:
{
  "event": "project_created",
  "timestamp": "2024-01-15T10:30:00Z",
  "user_id": "user_123",
  "properties": {
    "project_id": "proj_789",
    "project_name": "Q1 Marketing",
    "template_used": "marketing_template",
    "team_id": "team_456",
    "source": "template",
    "time_to_create": 45
  }
}

Good vs Bad Event Design

Good: Clear, Complete, Consistent

// Clear event name
event: project_created

// Complete properties
properties: {
  project_id: "proj_123",
  project_type: "marketing",
  template_used: "q1_template",
  created_from: "dashboard",
  team_size: 5,
  is_first_project: true
}

// Consistent naming
- project_created (not "Created Project")
- project_updated (not "projectUpdated")
- project_deleted (not "delete_project")

Bad: Vague, Incomplete, Inconsistent

// Vague event name
event: click

// Missing context
properties: {
  button: "create"
}

// Inconsistent naming
- click
- Project Created
- updateProject
- deleted

Problems:
- Can't identify what was clicked
- Missing user context
- No way to build funnels
- Inconsistent casing

Common Event Patterns

1. Funnel Events

signup_started        → Entry point
signup_email_entered  → Step 1
signup_password_set   → Step 2
signup_profile_done   → Step 3
signup_completed      → Success
signup_failed         → Track errors
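Given counts for each of those funnel events, step-by-step conversion falls out directly. A sketch with made-up counts (the function name is illustrative):

```javascript
// Conversion rate between consecutive funnel steps.
function funnelConversion(steps) {
  return steps.slice(1).map((step, i) => ({
    from: steps[i].event,
    to: step.event,
    rate: step.count / steps[i].count,
  }));
}

const rates = funnelConversion([
  { event: "signup_started", count: 10000 },
  { event: "signup_email_entered", count: 7200 },
  { event: "signup_password_set", count: 6100 },
  { event: "signup_completed", count: 5400 },
]);
// e.g. signup_started → signup_email_entered converts at 72%
```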

2. Feature Usage Events

feature_viewed        → Saw the feature
feature_clicked       → Interacted
feature_used          → Completed action
feature_error         → Something went wrong

3. Commerce Events

product_viewed        → Browsing
product_added_to_cart → Intent
checkout_started      → High intent
purchase_completed    → Conversion

4. Search/Discovery Events

search_performed      → User searched
search_results_shown  → Results displayed
search_result_clicked → Selected result
search_no_results     → Empty state

Data Quality Monitoring

Quality Dimensions:

Dimension    │ What to Check                   │ Alert Threshold
─────────────┼─────────────────────────────────┼────────────────────
Completeness │ % of required properties filled │ <95%
Freshness    │ Lag from event to warehouse     │ >15 minutes
Volume       │ Event count vs expected         │ ±30% from baseline
Validity     │ % passing schema validation     │ <99%
Uniqueness   │ Duplicate event rate            │ >1%

Monitoring Dashboard:

Event Health Check (Last 24h):
┌────────────────────────────────────────────────────────────┐
│ Event              │ Count    │ Quality │ Status          │
├────────────────────────────────────────────────────────────┤
│ signup_completed   │ 2,450    │ 99.2%   │ ✓ Healthy       │
│ project_created    │ 8,230    │ 98.7%   │ ✓ Healthy       │
│ report_exported    │ 1,120    │ 87.3%   │ ⚠ Low quality   │
│ purchase_completed │ 342      │ 99.8%   │ ✓ Healthy       │
│ feature_x_used     │ 0        │ N/A     │ ✗ No events     │
└────────────────────────────────────────────────────────────┘

Alerts:
- report_exported: Missing 'format' property in 12% of events
- feature_x_used: No events in 24h (expected 500+)
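The quality thresholds above can be applied mechanically. A sketch of such a check (threshold values from the table; the stats field names are illustrative):

```javascript
// Flag quality dimensions that breach their alert thresholds.
function healthCheck(stats) {
  const issues = [];
  if (stats.completeness < 0.95) issues.push("completeness");
  if (stats.freshnessMinutes > 15) issues.push("freshness");
  if (Math.abs(stats.volume - stats.expectedVolume) / stats.expectedVolume > 0.3)
    issues.push("volume");
  if (stats.validity < 0.99) issues.push("validity");
  if (stats.duplicateRate > 0.01) issues.push("uniqueness");
  return issues;
}

const issues = healthCheck({
  completeness: 0.873, // report_exported from the dashboard above
  freshnessMinutes: 4,
  volume: 1120,
  expectedVolume: 1200,
  validity: 0.998,
  duplicateRate: 0.004,
});
// flags only "completeness" — matching the ⚠ Low quality row
```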

Implementation Best Practices

1. Single Source of Truth

Good:
- Event fired from one location in code
- Wrapper function ensures consistency

Bad:
- Same event fired from multiple places
- Inconsistent property population

2. Server-Side When Possible

Prefer Server-Side:
- Reliable delivery
- Harder to block
- Consistent timestamps
- Secure properties

Client-Side Only:
- UI interactions (clicks, scrolls)
- Performance metrics
- Error states

3. Event Validation

// Schema validation on send
const eventSchema = {
  project_created: {
    required: ['project_id', 'project_name', 'source'],
    properties: {
      project_id: { type: 'string' },
      project_name: { type: 'string', maxLength: 100 },
      source: { type: 'string', enum: ['dashboard', 'import', 'api'] }
    }
  }
};

// Validate before sending
validateAndSend(event, eventSchema);
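A minimal `validateAndSend` consistent with that schema shape might look like the following sketch; the `send` function is a stand-in for your analytics SDK's track call, not a real API:

```javascript
// Stand-in transport — swap for your analytics SDK's track() call.
const send = (event) => console.log("tracked:", event.event);

// Check an event against its schema before sending.
function validateAndSend(event, schemas) {
  const spec = schemas[event.event];
  if (!spec) throw new Error(`Unknown event: ${event.event}`);
  for (const field of spec.required) {
    if (!(field in event.properties)) throw new Error(`Missing required property: ${field}`);
  }
  for (const [field, rules] of Object.entries(spec.properties)) {
    const value = event.properties[field];
    if (value === undefined) continue;
    if (typeof value !== rules.type) throw new Error(`Wrong type for ${field}`);
    if (rules.maxLength && value.length > rules.maxLength) throw new Error(`${field} exceeds max length`);
    if (rules.enum && !rules.enum.includes(value)) throw new Error(`Invalid value for ${field}`);
  }
  send(event);
}

const schemas = {
  project_created: {
    required: ["project_id", "project_name", "source"],
    properties: {
      project_id: { type: "string" },
      project_name: { type: "string", maxLength: 100 },
      source: { type: "string", enum: ["dashboard", "import", "api"] },
    },
  },
};

validateAndSend(
  {
    event: "project_created",
    properties: { project_id: "proj_789", project_name: "Q1 Marketing", source: "dashboard" },
  },
  schemas
);
```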

Event Planning Worksheet

Question                         │ Answer
─────────────────────────────────┼───────────────────────────
What behavior are we measuring?  │ Project creation success
What decisions will this inform? │ Onboarding optimization
What's the trigger moment?       │ Project saved to database
What context is needed?          │ Source, template, time
Who owns implementation?         │ Engineering team
Who owns data quality?           │ Analytics team
When is this reviewed?           │ Quarterly audit

Event Governance

Tracking Plan Ownership:

1. Product Manager defines requirements
2. Analyst defines event spec
3. Engineer implements
4. QA validates
5. Analyst monitors quality
6. All: Quarterly review

Change Management:

Adding event:
1. Create spec document
2. Review with stakeholders
3. Implement with tests
4. Validate in staging
5. Deploy and monitor

Changing event:
1. Document change reason
2. Version the event (v2)
3. Parallel track during transition
4. Migrate dashboards
5. Deprecate old version

Anti-Patterns

  • Track everything — Noise drowns signal, costs explode
  • Track nothing — Flying blind, can't learn
  • Client-only — Ad blockers, reliability issues
  • No schema — Properties drift, data becomes unusable
  • No ownership — Nobody maintains, quality degrades
  • Hardcoded strings — Typos break tracking
  • Missing context — Events without user/session info
  • No documentation — Nobody knows what events mean

title: AARRR and HEART Frameworks
impact: CRITICAL
tags: metrics, framework, aarrr, heart, pirate-metrics

AARRR and HEART Frameworks

Impact: CRITICAL

Two proven frameworks for structuring your product metrics. AARRR focuses on lifecycle stages; HEART focuses on user experience dimensions.

AARRR (Pirate Metrics)

Developed by Dave McClure for startup metrics. Tracks the user lifecycle funnel.

Stage       │ Core Question             │ Example Metrics                                         │ Typical Owner
────────────┼───────────────────────────┼─────────────────────────────────────────────────────────┼──────────────────
Acquisition │ Where do users come from? │ Signups by channel, CAC, landing page conversion        │ Growth/Marketing
Activation  │ Do they experience value? │ Onboarding completion, time-to-first-value, aha moment  │ Product/Growth
Retention   │ Do they come back?        │ D1/D7/D30, MAU/DAU ratio, resurrection rate             │ Product
Revenue     │ Do they pay?              │ Conversion rate, ARPU, LTV, expansion rate              │ Product/Finance
Referral    │ Do they tell others?      │ Viral coefficient, NPS, referral conversion             │ Growth/Product

AARRR Deep Dive: Metrics by Stage

Acquisition Metrics

┌─────────────────────────────────────────────────────────────┐
│  Traffic Sources          │  Signups by source              │
│  Channel CAC              │  Cost per signup by channel     │
│  Landing Page CVR         │  Visitor → Signup rate          │
│  Signup Completion        │  Started → Completed signup     │
│  Channel Mix              │  % of signups by channel        │
└─────────────────────────────────────────────────────────────┘

Activation Metrics

┌─────────────────────────────────────────────────────────────┐
│  Setup Completion Rate    │  % completing onboarding        │
│  Time to First Value      │  Signup → Core action (minutes) │
│  Aha Moment Rate          │  % reaching activation event    │
│  Day 1 Return Rate        │  % returning within 24 hours    │
│  Feature Discovery        │  % using 2+ features in week 1  │
└─────────────────────────────────────────────────────────────┘

Retention Metrics

┌─────────────────────────────────────────────────────────────┐
│  D1/D7/D30 Retention      │  % returning on day N           │
│  Weekly Retention Curve   │  % active each week post-signup │
│  Stickiness (DAU/MAU)     │  Daily engagement ratio         │
│  Churn Rate               │  % lost per period              │
│  Resurrection Rate        │  % of churned who return        │
└─────────────────────────────────────────────────────────────┘

Revenue Metrics

┌─────────────────────────────────────────────────────────────┐
│  Free → Paid Conversion   │  % converting to paid           │
│  Trial → Paid Conversion  │  % of trials converting         │
│  ARPU/ARPA                │  Average revenue per user/acct  │
│  LTV                      │  Lifetime value                 │
│  Expansion Revenue %      │  % revenue from existing        │
└─────────────────────────────────────────────────────────────┘

Referral Metrics

┌─────────────────────────────────────────────────────────────┐
│  NPS                      │  Net Promoter Score             │
│  Viral Coefficient (K)    │  Invites × Invite CVR           │
│  Referral Rate            │  % users who refer              │
│  Referral Conversion      │  Referred visitor → signup rate │
│  Organic Word-of-Mouth    │  Direct/organic signup %        │
└─────────────────────────────────────────────────────────────┘
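The viral coefficient in the table above multiplies out simply; K > 1 means each user brings in more than one new user, so growth compounds without paid acquisition (a sketch with made-up inputs):

```javascript
// Viral coefficient K: invites sent per user × invite conversion rate.
function viralCoefficient(invitesPerUser, inviteConversionRate) {
  return invitesPerUser * inviteConversionRate;
}

const k = viralCoefficient(3, 0.25); // 0.75 — each user brings 0.75 new users
// K < 1: referral amplifies growth but other channels must fill the gap
```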

HEART Framework (Google)

Developed by Kerry Rodden at Google. Focuses on user experience quality.

Dimension    │ What It Measures   │ Signal Types │ Example Metrics
─────────────┼────────────────────┼──────────────┼─────────────────────────────────
Happiness    │ User satisfaction  │ Attitudinal  │ NPS, CSAT, ease-of-use
Engagement   │ Involvement depth  │ Behavioral   │ Sessions/week, actions/session
Adoption     │ Uptake of features │ Behavioral   │ New feature users, breadth
Retention    │ Continued usage    │ Behavioral   │ Retention rate, churn
Task Success │ Efficiency         │ Behavioral   │ Completion rate, time-on-task

HEART Goals-Signals-Metrics (GSM)

For each HEART dimension, define:

Step   │ Definition                │ Example (File Sharing App)
───────┼───────────────────────────┼────────────────────────────────
Goal   │ What experience outcome?  │ Users successfully share files
Signal │ What indicates goal met?  │ Completed file shares
Metric │ How do we measure signal? │ Files shared per active user

Full HEART GSM Example:

Dimension    │ Goal                         │ Signal                  │ Metric
─────────────┼──────────────────────────────┼─────────────────────────┼──────────────────────────────────────
Happiness    │ Users satisfied with sharing │ Survey responses        │ CSAT score on share flow
Engagement   │ Users share frequently       │ Share activity          │ Shares per week per user
Adoption     │ New users start sharing      │ First share complete    │ % of new users who share in W1
Retention    │ Users keep sharing           │ Repeated share behavior │ Week-over-week share retention
Task Success │ Sharing is easy              │ Successful shares       │ Share completion rate, time to share

When to Use Which Framework

Use AARRR When             │ Use HEART When
───────────────────────────┼──────────────────────────────
Analyzing full lifecycle   │ Deep-diving on UX quality
Cross-functional alignment │ Product team focus
Growth modeling            │ Feature-level analysis
Business metrics review    │ User experience audit
Startup/scale phase        │ Mature product optimization

Combining AARRR + HEART

The frameworks complement each other:

AARRR Stage          HEART Dimensions to Apply
───────────────────────────────────────────────
Acquisition     →    Task Success (signup flow)
Activation      →    Task Success, Adoption, Happiness
Retention       →    Engagement, Retention, Happiness
Revenue         →    Task Success (purchase flow), Happiness
Referral        →    Happiness (NPS), Engagement

Good vs Bad Example: AARRR Implementation

Good: Clear, Measurable, Actionable

Company: Analytics SaaS Tool

Acquisition:
- Metric: Signups by channel
- Target: 500 signups/month, <$50 CAC

Activation:
- Metric: First dashboard created within 7 days
- Target: 60% of signups

Retention:
- Metric: Weekly active users (ran 1+ query)
- Target: 45% W4 retention

Revenue:
- Metric: Trial → Paid conversion
- Target: 12% within 30 days

Referral:
- Metric: NPS, organic signup %
- Target: NPS > 50, 30%+ organic

Bad: Vague, Unmeasurable, No Targets

Company: Analytics SaaS Tool

Acquisition: "Get more users"
Activation: "Users like the product"
Retention: "Keep users coming back"
Revenue: "Make money"
Referral: "Users recommend us"

Problems:
- No specific metrics defined
- No targets to measure against
- No way to know if improving
- No team accountability

HEART Implementation Mistakes

Mistake                            │ Problem                  │ Fix
───────────────────────────────────┼──────────────────────────┼───────────────────────────────────
Tracking all dimensions everywhere │ Overwhelming, unfocused  │ Pick 2-3 relevant per feature
Only happiness                     │ Satisfaction ≠ behavior  │ Balance attitudinal + behavioral
Ignoring task success              │ Missing friction points  │ Track completion rates
No GSM process                     │ Metrics without purpose  │ Start with Goals, derive Signals

Anti-Patterns

  • Framework cargo cult — Using AARRR/HEART labels without real metrics
  • Measuring everything — Pick what matters, ignore the rest
  • No targets — Metrics without goals are just numbers
  • Stage skipping — Jumping to revenue before activation works
  • Averages only — Segment metrics by cohort, source, behavior
  • Snapshot thinking — Track trends, not just current state

title: Defining Your North Star Metric
impact: CRITICAL
tags: metrics, strategy, north-star, kpi

Defining Your North Star Metric

Impact: CRITICAL

Your North Star Metric (NSM) is the single metric that best captures the core value your product delivers to customers. Get this right and alignment follows.

What Makes a Good North Star

A North Star Metric must:

  1. Measure value delivered to customers (not just activity)
  2. Be a leading indicator of revenue
  3. Be measurable and movable
  4. Be understandable across the company
  5. Connect to your product's core purpose

North Star Selection Framework

Factor       │ Question                                 │ Good Sign
─────────────┼──────────────────────────────────────────┼───────────────────────────────
Value        │ Does this reflect customer value?        │ Users would agree it matters
Growth       │ Does improving this grow the business?   │ Correlates with revenue
Focus        │ Does this guide prioritization?          │ Teams can rally around it
Measurable   │ Can we track this reliably?              │ Daily/weekly updates possible
Controllable │ Can our product actions influence this?  │ Clear input metrics exist

Examples by Business Model

Company │ North Star Metric                    │ Why It Works
────────┼──────────────────────────────────────┼───────────────────────────────────
Slack   │ Daily Active Users sending messages  │ Measures communication value
Airbnb  │ Nights booked                        │ Core value exchange
Spotify │ Time spent listening                 │ Engagement = value delivered
Shopify │ GMV (Gross Merchandise Value)        │ Merchant success = Shopify success
Notion  │ Weekly active teams                  │ Collaboration value
Zoom    │ Weekly meeting minutes               │ Usage = value
HubSpot │ Weekly active teams using 2+ hubs    │ Platform adoption = stickiness

North Star + Input Metrics

Your NSM doesn't exist alone. Build a metrics tree:

                    ┌─────────────────────────┐
                    │   NORTH STAR METRIC     │
                    │  (Weekly Active Teams)  │
                    └───────────┬─────────────┘
                                │
            ┌───────────────────┼───────────────────┐
            ▼                   ▼                   ▼
    ┌───────────────┐   ┌───────────────┐   ┌───────────────┐
    │  Input: New   │   │ Input: Team   │   │  Input: Team  │
    │ Team Signups  │   │  Activation   │   │  Retention    │
    └───────┬───────┘   └───────┬───────┘   └───────┬───────┘
            │                   │                   │
    ┌───────┴───────┐   ┌───────┴───────┐   ┌───────┴───────┐
    │ • Traffic     │   │ • Onboarding  │   │ • Weekly      │
    │ • Signup rate │   │   completion  │   │   engagement  │
    │ • First team  │   │ • First value │   │ • Feature     │
    │   created     │   │   moment      │   │   depth       │
    └───────────────┘   └───────────────┘   └───────────────┘

Good vs Bad Examples

Good: Value-Focused

Product: Project Management Tool
North Star: Weekly Active Projects with 3+ collaborators

Why good:
- Measures collaboration value (core purpose)
- "3+ collaborators" ensures real team usage
- Weekly timeframe balances signal and noise
- Teams can influence via activation, features, retention

Bad: Vanity Metric

Product: Project Management Tool
"North Star": Total Registered Users

Why bad:
- Doesn't measure ongoing value
- Includes churned users
- Doesn't differentiate engaged vs dormant
- Can't prioritize based on this

Good: Leading Indicator

Product: E-commerce Platform
North Star: Weekly purchasing customers

Why good:
- Measures actual value exchange
- Leading indicator of GMV
- Combines acquisition + retention
- Clear optimization paths

Bad: Lagging Only

Product: E-commerce Platform
"North Star": Annual Revenue

Why bad:
- Too slow to react to
- Teams can't see weekly impact
- Doesn't guide product decisions
- Finance metric, not product metric

North Star for Different Stages

Stage        │ North Star Focus       │ Example
─────────────┼────────────────────────┼───────────────────────────────────────
Pre-PMF      │ Engagement intensity   │ Daily active rate of first 100 users
Early Growth │ Activation + retention │ Weekly users completing core action
Scaling      │ Sustainable growth     │ Monthly active + expansion revenue
Mature       │ Efficiency + expansion │ Revenue per user, platform usage

Validating Your North Star

Before committing, test these:

  1. Correlation test: Does improving this correlate with revenue growth historically?
  2. Prioritization test: Would this help you decide between two features?
  3. Communication test: Can you explain it to anyone in 30 seconds?
  4. Counter-metric test: What could go wrong if you only optimize for this?

Common North Star Mistakes

Mistake       │ Problem                      │ Better Alternative
──────────────┼──────────────────────────────┼─────────────────────────────────
Revenue       │ Lagging, not product-focused │ Weekly paying customers
Total users   │ Ignores churn, quality       │ Weekly/Monthly active users
Sessions      │ Activity ≠ value             │ Sessions with core action
NPS           │ Infrequent, hard to move     │ Usage-based proxy + NPS
Multiple NSMs │ Defeats the purpose          │ Pick one, track others as KPIs

Anti-Patterns

  • Too many North Stars — If everything is a priority, nothing is
  • Pure vanity — "Total downloads" tells you nothing about value
  • Unchangeable — If teams can't move it, it's not useful
  • Revenue-only — Finance metric, not product metric
  • Ignoring quality — "Messages sent" without "messages read"
  • Copying competitors — Your NSM should reflect YOUR value prop

title: User Behavior Analytics
impact: HIGH
tags: behavior, analytics, patterns, segmentation

User Behavior Analytics

Impact: HIGH

Understanding how users actually behave — not how you think they behave — is the foundation of product insight. Behavior data reveals intent, friction, and opportunity.

Types of Behavioral Data

Type         │ What It Captures                     │ Tools
─────────────┼──────────────────────────────────────┼──────────────────────────────
Event Data   │ Actions (clicks, creates, purchases) │ Amplitude, Mixpanel, Segment
Session Data │ Sequences and flows                  │ Fullstory, Hotjar
Engagement   │ Depth and frequency                  │ Product analytics
Journey      │ Cross-session paths                  │ Journey mapping tools
Heatmaps     │ Where users look/click               │ Hotjar, Crazy Egg
Recordings   │ Actual user sessions                 │ Fullstory, Logrocket

Behavioral Segmentation

Segmentation Dimensions:

Dimension        │ Examples                  │ Use Case
─────────────────┼───────────────────────────┼──────────────────────
Usage Frequency  │ Daily, weekly, monthly    │ Identify power users
Feature Adoption │ Features used, depth      │ Product development
Lifecycle Stage  │ New, activated, retained  │ Targeted engagement
Value Tier       │ Free, paid, enterprise    │ Monetization
Behavior Pattern │ Creator, consumer, lurker │ UX optimization

Creating Behavioral Segments:

Segment: Power Users
Definition:
- Active 5+ days per week
- Use 3+ core features
- Generate content (not just consume)
- Active for 30+ days

Characteristics:
┌─────────────────────────────────────────────────────────┐
│ Metric               │ Power Users │ Regular Users     │
├─────────────────────────────────────────────────────────┤
│ % of user base       │    8%       │     92%           │
│ Sessions/week        │   12.4      │      2.1          │
│ Features used        │    6.2      │      2.3          │
│ Content created/week │   15.0      │      1.8          │
│ Retention (D90)      │   85%       │     32%           │
│ LTV                  │  $420       │     $85           │
└─────────────────────────────────────────────────────────┘

Insight: Power users are 5x more valuable
Question: How do we create more power users?

Behavioral Pattern Analysis

1. Path Analysis

Most Common Paths to First Purchase:

Path 1 (35% of purchases):
Homepage → Product Page → Add to Cart → Purchase

Path 2 (25% of purchases):
Search → Product Page → Wishlist → Return → Purchase

Path 3 (18% of purchases):
Category → Filter → Product → Compare → Purchase

Path 4 (12% of purchases):
Email Link → Product Page → Purchase

Insight: Direct purchase path (Path 1) converts fastest
Opportunity: Reduce steps for Path 2/3 users

2. Sequence Analysis

What do retained users do in Week 1?

Retained (still active at D30) vs churned users:
┌────────────────────────────────────────────────────────┐
│ W1 Action           │ Retained │ Churned │ Difference │
├────────────────────────────────────────────────────────┤
│ Completed tutorial  │   78%    │   35%   │   +43pp    │
│ Created first item  │   85%    │   42%   │   +43pp    │
│ Invited team member │   52%    │   12%   │   +40pp    │
│ Used mobile app     │   65%    │   38%   │   +27pp    │
│ Set up integration  │   42%    │   18%   │   +24pp    │
└────────────────────────────────────────────────────────┘

Activation checklist priority:
1. Completed tutorial (biggest gap)
2. Created first item
3. Invited team member

3. Frequency Analysis

User Activity Distribution:
┌────────────────────────────────────────────────────────┐
│ Sessions/Week │ % Users │ Retention │ LTV            │
├────────────────────────────────────────────────────────┤
│ 0 (dormant)   │   25%   │   N/A     │   $0           │
│ 1-2 (light)   │   40%   │   22%     │   $45          │
│ 3-5 (regular) │   25%   │   55%     │   $120         │
│ 6+ (heavy)    │   10%   │   82%     │   $340         │
└────────────────────────────────────────────────────────┘

Target: Move light users to regular

Behavioral Triggers

Identifying Critical Moments:

Trigger Type │ Example                   │ Response
─────────────┼───────────────────────────┼──────────────────────────
First-time   │ First login, first action │ Guided onboarding
Achievement  │ Milestone reached         │ Celebration + next step
Inactivity   │ No action in X days       │ Re-engagement email
Risk signal  │ Decrease in usage         │ Proactive outreach
Opportunity  │ Feature almost used       │ Contextual nudge

Behavioral Trigger Implementation:

Trigger: Inactivity Warning

Condition:
- User was active (3+ sessions/week)
- Now inactive (0 sessions in 7 days)
- Has valuable content in account

Action:
- Send "We miss you" email at day 7
- Show "what's new" at next login
- Log re_engagement_triggered event

Measurement:
- Re-engagement rate: 18%
- Resurrection rate: 12%
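The trigger condition above reduces to a small predicate; a sketch with illustrative field names and thresholds taken from the example:

```javascript
// Does a user match the inactivity-warning trigger above?
function shouldTriggerInactivityWarning(user) {
  const wasActive = user.avgSessionsPerWeek >= 3;     // was active: 3+ sessions/week
  const nowInactive = user.daysSinceLastSession >= 7; // 0 sessions in 7 days
  const hasContent = user.savedItems > 0;             // valuable content in account
  return wasActive && nowInactive && hasContent;
}

const atRisk = shouldTriggerInactivityWarning({
  avgSessionsPerWeek: 4.5,
  daysSinceLastSession: 9,
  savedItems: 12,
});
// atRisk is true → send the day-7 email and log the re-engagement event
```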

Session Analysis

Session Metrics:

Metric           │ Definition             │ Target
─────────────────┼────────────────────────┼─────────────────────
Session Duration │ Time from start to end │ Depends on product
Actions/Session  │ Events in session      │ 8-15 for engaged
Pages/Session    │ Screens visited        │ 4-8 typical
Session Depth    │ Features touched       │ 2+ for value
Bounce Rate      │ Single-page sessions   │ <30% for core flows

Session Recording Analysis:

Session Review Protocol:
1. Watch 10-20 sessions per week
2. Focus on:
   - Drop-off points (where users quit)
   - Confusion signals (rage clicks, back navigation)
   - Success patterns (smooth completion)
   - Unexpected behaviors (workarounds)

Document:
- Friction point and timestamp
- User segment
- Frequency (1 user or pattern?)
- Potential fix
- Priority (impact × frequency)

Good vs Bad Behavior Analysis

Good: Segmented, Actionable

Analysis: Feature X Usage Patterns

Approach:
1. Segmented by user type (new vs returning)
2. Tracked full usage funnel
3. Compared high vs low engagers
4. Correlated with outcomes

Findings:
┌─────────────────────────────────────────────────────────┐
│ Behavior                  │ High Retain │ Low Retain   │
├─────────────────────────────────────────────────────────┤
│ Used Feature X in W1      │    72%      │    28%       │
│ Used Feature X 3+ times   │    65%      │    12%       │
│ Combined with Feature Y   │    58%      │     8%       │
└─────────────────────────────────────────────────────────┘

Insight: Feature X + Y combination predicts retention
Action: Prompt Feature Y users to try Feature X

Result: +15% feature adoption, +8% D30 retention

Bad: Aggregate, No Action

Analysis: User Behavior Report

Findings:
- Average session duration: 4.2 minutes
- Average actions per session: 6.8
- Most clicked: Dashboard

Problems:
- No segmentation
- No comparison to outcome
- Averages hide patterns
- No actionable insight
- "Most clicked" isn't insight

Predictive Behavior Signals

Churn Prediction Signals:

High-Risk Indicators:
┌─────────────────────────────────────────────────────────┐
│ Signal                   │ Weight │ Threshold          │
├─────────────────────────────────────────────────────────┤
│ Session frequency drop   │  0.35  │ -50% vs avg        │
│ Feature usage decline    │  0.25  │ -30% features      │
│ Support tickets up       │  0.20  │ 2+ in 30 days      │
│ Login failures           │  0.10  │ 3+ in 7 days       │
│ Billing issues           │  0.10  │ Failed payment     │
└─────────────────────────────────────────────────────────┘

Risk Score = Σ(signal × weight)
High risk: Score > 0.6
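The weighted sum above is straightforward to compute; a sketch with the weights from the signal table (the signal names and the binary breached/not-breached encoding are illustrative):

```javascript
// Weighted churn-risk score from the signal table above.
function churnRiskScore(signals) {
  const weights = {
    sessionFrequencyDrop: 0.35,
    featureUsageDecline: 0.25,
    supportTicketsUp: 0.2,
    loginFailures: 0.1,
    billingIssues: 0.1,
  };
  // signals: map of signal name → 1 if its threshold was breached, else 0
  return Object.entries(weights).reduce(
    (score, [name, weight]) => score + (signals[name] ? weight : 0),
    0
  );
}

const score = churnRiskScore({
  sessionFrequencyDrop: 1,
  featureUsageDecline: 1,
  supportTicketsUp: 1,
});
// 0.35 + 0.25 + 0.20 = 0.80 → above the 0.6 high-risk line
```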

Upgrade Prediction Signals:

Ready-to-Upgrade Indicators:
┌─────────────────────────────────────────────────────────┐
│ Signal                   │ Indicator                   │
├─────────────────────────────────────────────────────────┤
│ Limit approaching        │ 80% of plan limit           │
│ Premium feature attempts │ 3+ clicks on locked feature │
│ Team growth              │ Inviting new members        │
│ Usage acceleration       │ +50% activity vs last month │
│ Pricing page visits      │ 2+ views in 30 days         │
└─────────────────────────────────────────────────────────┘

Behavioral Analytics Tools

Tool      │ Best For          │ Key Features
──────────┼───────────────────┼──────────────────────────
Amplitude │ Product analytics │ Funnels, cohorts, paths
Mixpanel  │ Event analytics   │ Real-time, segmentation
Fullstory │ Session replay    │ Recordings, heatmaps
Heap      │ Auto-capture      │ Retroactive analysis
Pendo     │ In-app analytics  │ Usage tracking, guides
PostHog   │ Open source       │ Full stack, self-hosted

Anti-Patterns

  • Aggregate obsession — Averages hide the story
  • Correlation = causation — Behavior correlates, doesn't cause
  • Recency bias — This week's pattern ≠ trend
  • Confirmation bias — Finding data to support decisions
  • Privacy violation — Tracking too much, too granularly
  • Analysis without action — Insights that don't change anything
  • Segment pollution — Too many segments, no clear personas
  • Ignoring context — Behavior without the "why"