When funnel data needs interpretation, /product-analyst builds metric frameworks so you can act on cohort signals. — Claude Skill
A Claude Skill for Claude Code by Nick Jensen — run /product-analyst in Claude
Build metric frameworks, cohort analyses, and experimentation plans
- Structure AARRR and HEART metric frameworks with tracking specs
- Design cohort retention tables with segment breakdowns
- Calculate A/B test sample sizes and minimum detectable effects
- Build funnel analysis queries with drop-off diagnostics
- Create experimentation roadmaps with sequential test dependencies
What it does
Run /product-analyst to define all 5 AARRR stages with 2-3 metrics each, tracking specs, and target ranges — producing a measurement plan covering 12-15 KPIs.
Use /product-analyst to structure a 12-week retention cohort table across 3 user segments, with churn inflection detection and a reactivation trigger plan.
Run /product-analyst to design an experiment: define hypothesis, primary metric, guardrail metrics, sample size (e.g., n=2400 for 5% MDE at 80% power), and runtime estimate.
Use /product-analyst to analyze a 6-step conversion funnel, flag the 2 highest-friction steps, and propose instrumentation for rage-clicks and session replays at each stage.
How it works
Share your product type, key user actions, and the question you want data to answer — retention curve, conversion rate, or feature adoption.
The skill selects the right metric framework (AARRR, HEART, or custom) and maps your events to each stage with concrete definitions.
It produces analysis templates: cohort tables, funnel breakdowns, or experiment designs with statistical parameters baked in.
You receive a ready-to-implement measurement plan with event schemas, SQL query templates, and dashboard wireframes.
Example
E-commerce marketplace, 20k monthly transactions. We redesigned checkout last month and need to measure impact on conversion and average order value.
Primary: checkout completion rate (pre: 62%, target: 68%). Secondary: AOV, time-to-purchase, cart abandonment rate. Guardrail: refund rate, support tickets per 1k orders.
Compare 4-week pre-period vs 4-week post-period. Segment by: new vs returning, mobile vs desktop, cart size (<$50, $50-150, >$150). Flag if any segment regresses >2pp.
3 queries provided: (1) daily conversion rate with confidence intervals, (2) cohort-by-segment pivot table, (3) funnel step timing with P50/P95 latency per step. All parameterized by date range.
Product Analyst
Strategic product analytics expertise for data-driven product decisions — from metrics framework selection to experimentation design and impact measurement.
Philosophy
Great product analytics isn't about tracking everything. It's about measuring what matters to drive better product decisions.
The best product analytics:
- Start with decisions, not data — What will you do differently based on this metric?
- Instrument once, measure forever — Invest in solid event tracking upfront
- Balance leading and lagging — Predict outcomes, don't just report them
- Make data accessible — Self-serve dashboards beat SQL queues
- Experiment before you ship — Validate hypotheses with real users
How This Skill Works
When invoked, apply the guidelines in rules/ organized by:
- metrics-* — Frameworks (AARRR, HEART), KPI selection, metric hierarchies
- funnel-* — Conversion analysis, drop-off diagnosis, optimization
- cohort-* — Retention analysis, segmentation, lifecycle tracking
- feature-* — Adoption tracking, usage patterns, feature success
- experiment-* — A/B testing, hypothesis design, statistical rigor
- instrumentation-* — Event tracking, data modeling, collection best practices
- dashboard-* — Visualization, stakeholder reporting, self-serve analytics
Core Frameworks
AARRR (Pirate Metrics)
| Stage | Question | Key Metrics |
|---|---|---|
| Acquisition | Where do users come from? | Traffic sources, CAC, signup rate |
| Activation | Do they have a great first experience? | Time-to-value, setup completion, aha moment |
| Retention | Do they come back? | DAU/MAU, D1/D7/D30 retention, churn |
| Revenue | Do they pay? | Conversion rate, ARPU, LTV |
| Referral | Do they tell others? | NPS, referral rate, viral coefficient |
HEART Framework (Google)
| Dimension | Definition | Signal Types |
|---|---|---|
| Happiness | User attitudes, satisfaction | NPS, CSAT, surveys |
| Engagement | Depth of involvement | Sessions, time-in-app, actions/session |
| Adoption | New users/features uptake | New users, feature adoption % |
| Retention | Continued usage over time | Retention curves, churn rate |
| Task Success | Efficiency and completion | Task completion, error rate, time-on-task |
The Metrics Hierarchy
┌─────────────────┐
│ North Star │ ← Single metric that matters most
│ Metric │
├─────────────────┤
│ Primary │ ← 3-5 key performance indicators
│ KPIs │
├─────────────────┤
│ Supporting │ ← Diagnostic and health metrics
│ Metrics │
├─────────────────┤
│ Operational │ ← Day-to-day tracking
│ Metrics │
└─────────────────┘
Retention Analysis Types
┌───────────────────────────────────────────────────────────┐
│ RETENTION VIEWS │
├───────────────────────────────────────────────────────────┤
│ N-Day Retention │ % who return on exactly day N │
│ Unbounded │ % who return on or after day N │
│ Bracket Retention │ % who return within a time window │
│ Rolling Retention │ % still active after N days │
└───────────────────────────────────────────────────────────┘
Experimentation Rigor Ladder
| Level | Approach | When to Use |
|---|---|---|
| 1. Gut | Ship and hope | Never for important features |
| 2. Qualitative | User research, feedback | Early exploration |
| 3. Observational | Pre/post analysis | Low-risk changes |
| 4. Quasi-experiment | Cohort comparison | When randomization is hard |
| 5. A/B Test | Randomized control | Optimization, validation |
| 6. Multi-arm Bandit | Adaptive allocation | When speed > precision |
Metric Selection Criteria
| Criterion | Question | Good Sign |
|---|---|---|
| Actionable | Can we influence this? | Direct lever exists |
| Accessible | Can we measure it reliably? | <5% missing data |
| Auditable | Can we debug anomalies? | Clear calculation logic |
| Aligned | Does it tie to business value? | Executive cares |
| Attributable | Can we trace changes to causes? | A/B testable |
Anti-Patterns
- Vanity metrics — Tracking what looks good, not what drives decisions
- Metric overload — 50 dashboards, zero insights
- Lagging only — Measuring outcomes without predictive indicators
- Silent failures — No alerting on data quality issues
- HiPPO-driven — Highest-paid person's opinion beats data
- P-hacking — Running tests until you get significance
- Ship and forget — Launching features without success criteria
- Segment blindness — Looking only at averages, missing cohort differences
Reference documents
title: Section Organization
1. Metrics Frameworks (metrics)
Impact: CRITICAL
Description: Core metrics frameworks like AARRR and HEART, KPI selection, metric hierarchies, and North Star definition. The foundation for all product analytics.
2. Funnel Analysis (funnel)
Impact: CRITICAL
Description: Conversion funnel design, drop-off analysis, bottleneck identification, and optimization strategies. Where most growth opportunities live.
3. Cohort & Retention (cohort)
Impact: CRITICAL
Description: Cohort analysis, retention curves, churn prediction, and lifecycle segmentation. The key to sustainable growth.
4. Feature Analytics (feature)
Impact: HIGH
Description: Feature adoption tracking, usage depth measurement, success criteria, and feature-level decision making.
5. Experimentation (experiment)
Impact: HIGH
Description: A/B testing design, hypothesis formulation, statistical rigor, and interpreting experiment results.
6. Data Instrumentation (instrumentation)
Impact: HIGH
Description: Event taxonomy, tracking implementation, data quality, and building reliable analytics foundations.
7. Dashboard Design (dashboard)
Impact: MEDIUM-HIGH
Description: Product dashboards, stakeholder reporting, self-serve analytics, and visualization best practices.
title: Cohort Analysis and Retention
impact: CRITICAL
tags: cohort, retention, churn, lifecycle
Cohort Analysis and Retention
Impact: CRITICAL
Cohort analysis separates the signal from the noise. It answers: "Are things getting better?" by comparing groups of users who share a common starting point.
What is a Cohort?
A cohort is a group of users who share a characteristic, typically their signup date (acquisition cohort) or first action date (behavioral cohort).
Week 1 Cohort: All users who signed up during Week 1
Week 2 Cohort: All users who signed up during Week 2
...
Compare: How does Week 5 cohort perform vs Week 1 cohort?
The Classic Retention Table
Week 0 Week 1 Week 2 Week 3 Week 4 Week 5
Cohort ────── ────── ────── ────── ────── ──────
Jan 1 1,000 450 380 320 290 280
Jan 8 1,200 540 465 400 360 -
Jan 15 1,100 520 440 385 - -
Jan 22 1,300 610 530 - - -
Jan 29 1,400 650 - - - -
As Percentages:
Week 0 Week 1 Week 2 Week 3 Week 4 Week 5
────── ────── ────── ────── ────── ──────
Jan 1 100% 45% 38% 32% 29% 28%
Jan 8 100% 45% 39% 33% 30% -
Jan 15 100% 47% 40% 35% - -
Jan 22 100% 47% 41% - - -
Jan 29 100% 46% - - - -
Reading the table:
- Rows: Different cohorts (by signup week)
- Columns: Time periods since signup
- Diagonal: Same calendar week across cohorts
- Trend right: How retention evolves over time for a cohort
- Trend down: How retention improves/degrades for newer cohorts
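The percentage view is derived mechanically from the raw counts. A minimal Python sketch, with the counts copied from the example table above:

```python
# Build the percentage view of a retention table from raw weekly counts.
# None marks weeks a cohort has not yet reached.
cohorts = {
    "Jan 1":  [1000, 450, 380, 320, 290, 280],
    "Jan 8":  [1200, 540, 465, 400, 360, None],
    "Jan 15": [1100, 520, 440, 385, None, None],
    "Jan 22": [1300, 610, 530, None, None, None],
    "Jan 29": [1400, 650, None, None, None, None],
}

def retention_pct(counts):
    """Convert absolute counts to % of the cohort's week-0 size."""
    base = counts[0]
    return [round(100 * c / base) if c is not None else None for c in counts]

for name, counts in cohorts.items():
    row = retention_pct(counts)
    print(name, ["-" if p is None else f"{p}%" for p in row])
```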
Types of Retention Metrics
| Type | Definition | Use Case |
|---|---|---|
| N-Day Retention | % returning on exactly day N | Mobile apps, daily products |
| N-Day Rolling | % still active after day N | Subscription products |
| Unbounded | % returning on or after day N | Casual use products |
| Bracket/Range | % active within a time window | Weekly/monthly products |
Examples:
N-Day (D7):
Active on exactly day 7 / Total users = 22%
N-Day Rolling (D7):
Active at least once days 7-14 / Total users = 35%
Unbounded (D7):
Active on day 7 OR any day after / Total users = 45%
Bracket (Week 2):
Active during days 8-14 / Total users = 32%
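The flavors differ only in which activity window counts as "returned". A sketch of three of the definitions over per-user sets of active days (the sample data and windows are illustrative):

```python
# Retention flavors applied to per-user activity logs.
# Keys of each set are day numbers since signup; data is illustrative.
users = {
    "u1": {0, 1, 7},   # active on day 7 exactly
    "u2": {0, 3, 9},   # returns after day 7, not on it
    "u3": {0, 12},     # active in the days 8-14 bracket
    "u4": {0, 2},      # churned in week 1
}

def n_day(users, n):             # % active on exactly day n
    return sum(n in days for days in users.values()) / len(users)

def unbounded(users, n):         # % active on day n or any day after
    return sum(any(d >= n for d in days) for days in users.values()) / len(users)

def bracket(users, start, end):  # % active within a window (inclusive)
    return sum(any(start <= d <= end for d in days) for days in users.values()) / len(users)

print(n_day(users, 7))       # 0.25 -> only u1
print(unbounded(users, 7))   # 0.75 -> u1, u2, u3
print(bracket(users, 8, 14)) # 0.5  -> u2 (day 9), u3 (day 12)
```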
Retention Curves
100% ┤
│ ●
80% ┤ ╲
│ ●
60% ┤ ╲
│ ●───●───●───●───●───● ← Good: Flattening
40% ┤
│ ●
20% ┤ ╲
│ ●───●
0% ┤ ╲───●───● ← Bad: Still declining
└────────────────────────────
W0 W1 W2 W3 W4 W5 W6 W7 W8
What the curve tells you:
- Steep early drop: Activation problem
- Gradual continued decline: Value realization problem
- Flattens at high %: Strong product-market fit
- Never flattens: Fundamental value problem
Cohort Analysis Techniques
1. Acquisition Cohort Analysis
Group by signup date to measure:
- Is retention improving over time?
- Did a product change help/hurt retention?
- Are newer users more/less engaged?
Question: Are we getting better at retaining users?
Compare: Jan cohort W4 retention vs March cohort W4 retention
Result: 29% → 35% improvement
Insight: Onboarding improvements are working
2. Behavioral Cohort Analysis
Group by first action or behavior to measure:
- Which first actions predict retention?
- What separates engaged vs churned users?
- What's the "aha moment"?
Cohort by first action:
┌──────────────────────────────────────────────┐
│ First Action │ W4 Retention │ LTV │
├──────────────────────────────────────────────┤
│ Created project │ 42% │ $280 │
│ Invited team member │ 58% │ $420 │
│ Imported data │ 65% │ $510 │
│ Just browsed │ 12% │ $45 │
└──────────────────────────────────────────────┘
Insight: Users who import data retain 5x better
Action: Prioritize data import in onboarding
3. Segment Cohort Analysis
Compare retention across user segments:
W4 Retention by Segment:
┌──────────────────────────────────────────────┐
│ Segment │ W4 Retention │
├──────────────────────────────────────────────┤
│ Enterprise (500+) │ 72% │
│ Mid-market (50-500) │ 48% │
│ SMB (10-50) │ 35% │
│ Micro (<10) │ 18% │
└──────────────────────────────────────────────┘
Insight: Enterprise retains 4x better than micro
Action: Shift ICP focus or improve SMB experience
Calculating Key Retention Metrics
Churn Rate
Monthly Churn = Customers lost in month / Customers at start of month
Example: 50 churned / 1,000 starting = 5% monthly churn
Annual Churn ≈ 1 - (1 - Monthly Churn)^12 = 46% annual churn
Retention Rate
Retention Rate = 1 - Churn Rate
Example: 1 - 5% = 95% monthly retention
Customer Lifetime (Average)
Average Lifetime = 1 / Churn Rate
Example: 1 / 5% = 20 months average lifetime
Net Revenue Retention (NRR)
NRR = (Starting MRR + Expansion - Contraction - Churn) / Starting MRR
Example: ($100k + $15k - $5k - $8k) / $100k = 102% NRR
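The four formulas above as one runnable sketch, with inputs matching the worked examples:

```python
# Key retention metrics from the formulas above.
def annual_churn(monthly_churn):
    return 1 - (1 - monthly_churn) ** 12

def avg_lifetime_months(monthly_churn):
    return 1 / monthly_churn

def nrr(starting_mrr, expansion, contraction, churned_mrr):
    return (starting_mrr + expansion - contraction - churned_mrr) / starting_mrr

monthly = 50 / 1000                               # 5% monthly churn
print(f"{annual_churn(monthly):.0%}")             # 46% annual churn
print(1 - monthly)                                # 0.95 monthly retention
print(avg_lifetime_months(monthly))               # 20.0 months average lifetime
print(f"{nrr(100_000, 15_000, 5_000, 8_000):.0%}")  # 102% NRR
```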
Good vs Bad Retention Curves
Good: Consumer Subscription (Spotify-like)
Day Retention Interpretation
───────────────────────────────────
D1 70% Good first session
D7 45% Habit forming
D30 30% Solid monthly engagement
D90 22% Core user base forming
D180 18% Plateau = product-market fit
Bad: Leaky Bucket
Day Retention Interpretation
───────────────────────────────────
D1 50% Many bouncing immediately
D7 25% Half gone in a week
D30 8% Hemorrhaging users
D90 2% Almost no one stays
D180 0.5% Fundamental value problem
Good: B2B SaaS
Month Retention Interpretation
───────────────────────────────────
M1 92% Normal startup churn
M3 88% Onboarding complete
M6 85% Sticky integrations
M12 80% Annual renewal strong
M24 75% Long-term value delivered
Finding Your "Aha Moment"
The aha moment is the action or experience that predicts retention.
Analysis approach:
1. Define retained (e.g., active at W4)
2. List candidate behaviors in W1
3. Calculate correlation with retention
4. Find the threshold that maximizes separation
Candidate Analysis:
┌─────────────────────────────────────────────────────────┐
│ W1 Behavior │ W4 Retention │ Correlation │
├─────────────────────────────────────────────────────────┤
│ Created 1 project │ 35% │ 0.42 │
│ Created 3+ projects │ 62% │ 0.71 │
│ Invited 1 teammate │ 48% │ 0.55 │
│ Invited 3+ teammates │ 78% │ 0.85 ← Aha! │
│ Used mobile app │ 52% │ 0.58 │
└─────────────────────────────────────────────────────────┘
Aha moment: Invite 3+ teammates in week 1
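Steps 1-4 can be sketched as a retention-gap comparison across candidate week-1 behaviors. The user records and thresholds below are fabricated for illustration; a production version would also check correlation strength and sample sizes:

```python
# For each candidate W1 behavior, compare W4 retention of users who did vs
# didn't do it; the widest gap marks the aha-moment candidate.
users = [
    {"invites": 4, "projects": 3, "retained_w4": True},
    {"invites": 3, "projects": 1, "retained_w4": True},
    {"invites": 0, "projects": 2, "retained_w4": False},
    {"invites": 1, "projects": 0, "retained_w4": False},
    {"invites": 5, "projects": 2, "retained_w4": True},
    {"invites": 0, "projects": 3, "retained_w4": False},
]

def retention_gap(users, predicate):
    did = [u["retained_w4"] for u in users if predicate(u)]
    didnt = [u["retained_w4"] for u in users if not predicate(u)]
    rate = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return rate(did) - rate(didnt)

candidates = {
    "invited 3+ teammates": lambda u: u["invites"] >= 3,
    "created 3+ projects":  lambda u: u["projects"] >= 3,
}
best = max(candidates, key=lambda name: retention_gap(users, candidates[name]))
print(best)
```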
Resurrection Analysis
Resurrection Rate = Resurrected users / Churned users
Monthly resurrection:
┌────────────────────────────────────────────────┐
│ Churned Month │ Resurrected │ Rate │ Via │
├────────────────────────────────────────────────┤
│ January │ 45 │ 12% │ Email │
│ February │ 38 │ 10% │ Feature │
│ March │ 52 │ 14% │ Email │
└────────────────────────────────────────────────┘
Insight: Win-back emails resurrect ~12% of churned
Action: Invest in better resurrection campaigns
Retention Improvement Tactics
| Problem | Indicators | Tactics |
|---|---|---|
| D1 drop-off | <60% D1 retention | Improve onboarding, reduce friction |
| W1 decline | Steep W1 curve | Strengthen habit loops, add value faster |
| No plateau | Curve never flattens | Find and amplify aha moment |
| Segment variance | Wide cohort spread | Focus on high-retaining segments |
| Seasonal churn | Predictable drops | Proactive engagement before drop |
Anti-Patterns
- Only tracking DAU/MAU — Masks cohort-level problems
- Ignoring early churn — Most churn happens in first week
- Average-only analysis — Segment differences matter
- Short windows — Need 3-6 months for meaningful curves
- No resurrection tracking — Churned users can come back
- Vanity retention — Counting logins without value delivery
- Static analysis — Retention should improve over time
title: Dashboard Design for Product
impact: MEDIUM-HIGH
tags: dashboard, visualization, reporting, self-serve
Dashboard Design for Product
Impact: MEDIUM-HIGH
The best dashboards drive action, not just awareness. Design dashboards that answer questions and prompt decisions.
Dashboard Design Principles
1. One Dashboard, One Purpose
Bad: "Product Dashboard" (everything)
Good: "Weekly Activation Health Check" (specific purpose)
2. Action-Oriented
Ask: "What will someone DO after viewing this?"
If no clear action → wrong metrics or wrong audience
3. Audience-Appropriate
Executive: High-level trends, strategic metrics
Manager: Team performance, weekly progress
IC: Detailed diagnostics, debugging tools
4. Progressive Disclosure
Level 1: Summary metrics (5 seconds to understand)
Level 2: Key trends (30 seconds)
Level 3: Segment breakdowns (2 minutes)
Level 4: Detailed drill-down (investigation)
Dashboard Architecture
The Dashboard Hierarchy:
┌─────────────────────┐
│ Executive │ Weekly/Monthly
│ Overview │ Strategy decisions
└──────────┬──────────┘
│
┌──────────────────┼──────────────────┐
▼ ▼ ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ Growth │ │ Product │ │ Revenue │
│ Dashboard │ │ Health │ │ Metrics │
└───────┬───────┘ └───────┬───────┘ └───────┬───────┘
│ │ │
┌───────┴───────┐ ┌───────┴───────┐ ┌───────┴───────┐
│ Acquisition │ │ Retention │ │ Conversion │
│ Activation │ │ Engagement │ │ Expansion │
│ Funnel │ │ Feature │ │ Churn │
└───────────────┘ └───────────────┘ └───────────────┘
│ │ │
▼ ▼ ▼
[Detailed drill-down dashboards and ad-hoc exploration]
Dashboard Types and Templates
1. Executive Overview
| Section | Metrics | Visualization |
|---|---|---|
| Health Score | Overall product score (composite) | Single number + trend |
| North Star | Primary metric + trend | Big number + sparkline |
| AARRR Summary | Each stage's key metric | Row of metrics |
| Risks/Wins | Notable changes this period | Text callouts |
Layout:
┌────────────────────────────────────────────────────────┐
│ PRODUCT HEALTH SCORE: 78/100 (↑3 vs last week) │
├────────────────────────────────────────────────────────┤
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ SIGNUPS │ │ACTIVATION│ │RETENTION │ │ REVENUE │ │
│ │ 2,340 │ │ 62% │ │ 45% │ │ $142K │ │
│ │ +12% │ │ +3pp │ │ -1pp │ │ +8% │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
├────────────────────────────────────────────────────────┤
│ 📈 Wins: Activation hit all-time high │
│ ⚠️ Watch: D30 retention declining for 3 weeks │
└────────────────────────────────────────────────────────┘
2. Funnel Dashboard
┌────────────────────────────────────────────────────────┐
│ SIGNUP FUNNEL (Last 7 Days) │
├────────────────────────────────────────────────────────┤
│ │
│ Landing Page ██████████████████████████ 10,000 │
│ ↓ 35% │
│ Started Signup ████████████ 3,500 │
│ ↓ 78% │
│ Completed ████████ 2,730 │
│ ↓ 52% │
│ Activated █████ 1,420 │
│ ↓ 35% │
│ Retained (D7) ██ 497 │
│ │
├────────────────────────────────────────────────────────┤
│ Biggest drop-off: Completed → Activated (48% loss) │
│ vs Last Week: Start→Complete improved +5pp │
└────────────────────────────────────────────────────────┘
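The per-step percentages in the funnel come straight from dividing adjacent counts. A sketch using the numbers above:

```python
# Step-to-step conversion for the funnel dashboard above.
funnel = [
    ("Landing Page",   10_000),
    ("Started Signup",  3_500),
    ("Completed",       2_730),
    ("Activated",       1_420),
    ("Retained (D7)",     497),
]

drops = []
for (step_a, n_a), (step_b, n_b) in zip(funnel, funnel[1:]):
    conversion = n_b / n_a
    drops.append((step_a, step_b, conversion))
    print(f"{step_a} -> {step_b}: {conversion:.0%} convert ({1 - conversion:.0%} lost)")
```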
3. Retention Dashboard
┌────────────────────────────────────────────────────────┐
│ RETENTION COHORT ANALYSIS │
├────────────────────────────────────────────────────────┤
│ │ W0 │ W1 │ W2 │ W3 │ W4 │ W5 │
│ Jan 1 │ 100% │ 45% │ 38% │ 32% │ 29% │ 28% │
│ Jan 8 │ 100% │ 47% │ 40% │ 35% │ 31% │ │
│ Jan 15 │ 100% │ 48% │ 42% │ 38% │ │ │
│ Jan 22 │ 100% │ 52% │ 45% │ │ │ │
├────────────────────────────────────────────────────────┤
│ Trend: W1 retention improving (45% → 52%) │
│ Target: 55% W1 by end of Q1 │
└────────────────────────────────────────────────────────┘
4. Feature Adoption Dashboard
┌────────────────────────────────────────────────────────┐
│ FEATURE ADOPTION TRACKER │
├────────────────────────────────────────────────────────┤
│ Feature │ Adoption │ Trend │ Health │
│ ─────────────────────────────────────────────────────│
│ Core Reporting │ 72% │ ↑ │ ✅ Strong │
│ Collaboration │ 48% │ ↑ │ ✅ Growing │
│ Advanced Analytics │ 25% │ → │ ⚠️ Stalled │
│ API Access │ 12% │ ↓ │ ⚠️ Watch │
│ New: Smart Alerts │ 8% │ ↑ │ 🆕 Ramping │
├────────────────────────────────────────────────────────┤
│ Focus Area: Advanced Analytics discovery │
└────────────────────────────────────────────────────────┘
Visualization Best Practices
Choose the Right Chart:
| Data Type | Best Visualization | Avoid |
|---|---|---|
| Single metric | Big number, gauge | Pie chart |
| Trend over time | Line chart | Stacked bar |
| Part of whole | Stacked bar, 100% bar | Pie chart (>5 segments) |
| Comparison | Horizontal bar | Vertical bar (many items) |
| Distribution | Histogram, box plot | Pie chart |
| Correlation | Scatter plot | Line chart |
| Funnel | Funnel chart, waterfall | Pie chart |
Color Usage:
Use color intentionally:
- Green: Good/positive/target met
- Red: Bad/negative/needs attention
- Yellow: Warning/watch
- Gray: Neutral/context
- Blue: Primary data series
Avoid:
- Rainbow charts
- More than 5 colors
- Red/green only (colorblind users)
Good vs Bad Dashboard Examples
Good: Focused, Actionable
Dashboard: Weekly Activation Review
Audience: Growth team
Purpose: Identify activation blockers
Content:
1. Activation rate vs target (big number)
2. Activation funnel with WoW comparison
3. Top 3 drop-off points with absolute numbers
4. Cohort activation trend (are we improving?)
5. Segment breakdown (mobile vs web, source)
Actions enabled:
- Identify biggest activation blocker
- Prioritize next optimization
- Track experiment impact
Bad: Unfocused, Vanity-Driven
Dashboard: Product Dashboard
Audience: Everyone
Purpose: Show everything
Content:
1. Total users (all-time, always up!)
2. Page views (meaningless)
3. Session duration (average, not segmented)
4. 47 different charts
5. No trends or comparisons
Problems:
- No clear purpose
- Vanity metrics
- Too much information
- No actionable insight
- Wrong granularity
Self-Serve Analytics
Building Self-Serve Culture:
| Level | What Users Can Do | Tools |
|---|---|---|
| Viewer | See pre-built dashboards | Dashboard tool |
| Explorer | Filter, segment, drill-down | BI tool |
| Builder | Create simple charts | BI tool |
| Analyst | Write SQL, complex analysis | SQL tool |
Self-Serve Guardrails:
✓ Do:
- Provide data dictionary
- Offer templates and examples
- Set up calculated metrics
- Enable sandboxed exploration
✗ Don't:
- Give raw database access
- Allow editing production dashboards
- Let users create without naming conventions
Dashboard Maintenance
Review Cadence:
| Activity | Frequency |
|---|---|
| Metric accuracy check | Weekly |
| Usage review | Monthly |
| Relevance audit | Quarterly |
| Full dashboard review | Semi-annually |
Health Indicators:
| Signal | Healthy | Unhealthy |
|---|---|---|
| Views/week | >10 | <3 |
| Active explorers | Growing | Declining |
| Stale dashboards | <10% | >30% |
| User questions | Decreasing | Increasing |
Anti-Patterns
- Dashboard sprawl — 50 dashboards, nobody uses any
- Vanity metrics — Always-up numbers that don't drive action
- No context — Numbers without targets, trends, or comparisons
- Information overload — 30 charts on one page
- Stale data — Data from yesterday (or last week)
- No ownership — Nobody maintains, data quality degrades
- Pretty but useless — Beautiful charts with no insight
- One-size-fits-all — Same dashboard for exec and IC
title: A/B Testing and Experimentation
impact: HIGH
tags: experiment, ab-testing, hypothesis, statistics
A/B Testing and Experimentation
Impact: HIGH
A/B testing removes opinion from product decisions. Done right, it's the difference between "we think" and "we know."
When to A/B Test
| Scenario | A/B Test? | Why |
|---|---|---|
| Optimizing existing flow | Yes | Measure impact of changes |
| Major redesign | Maybe | Consider qualitative first |
| New feature launch | Yes | Validate before full rollout |
| Copy/CTA changes | Yes | Quick wins, clear measurement |
| Pricing changes | Carefully | Revenue impact, user trust |
| Bug fixes | No | Just fix it |
| Low traffic areas | No | Won't reach significance |
Anatomy of a Good Experiment
┌────────────────────────────────────────────────────────────┐
│ EXPERIMENT DESIGN │
├────────────────────────────────────────────────────────────┤
│ Hypothesis: │
│ Changing the CTA from "Start Free" to "Start in 60 │
│ seconds" will increase signup rate by 10% │
│ │
│ Primary Metric: Signup completion rate │
│ Secondary Metrics: Time to signup, D7 retention │
│ Guardrail Metrics: Page load time, error rate │
│ │
│ Sample Size: 5,000 per variant (10,000 total) │
│ Duration: 14 days minimum │
│ Significance Level (α): 5% │
│ Power (1-β): 80% │
│ MDE: 10% relative improvement │
│ │
│ Variants: │
│ Control (50%): "Start Free" │
│ Treatment (50%): "Start in 60 seconds" │
│ │
│ Segments to analyze: New vs returning, mobile vs desktop │
│ Exclusions: Existing users, bot traffic │
└────────────────────────────────────────────────────────────┘
The Hypothesis Formula
Structure:
If we [change X],
then [metric Y] will [increase/decrease] by [Z%],
because [reason/evidence].
Good Example:
If we reduce signup form fields from 6 to 3,
then signup completion rate will increase by 15%,
because user research showed field fatigue is the #1 drop-off reason.
Bad Example:
If we change the button color,
it might help,
because blue is nice.
Sample Size and Duration
Sample Size Calculation:
| Baseline Rate | MDE | Required Sample/Variant |
|---|---|---|
| 1% | 10% relative (1.1%) | 150,000 |
| 5% | 10% relative (5.5%) | 30,000 |
| 10% | 10% relative (11%) | 15,000 |
| 20% | 10% relative (22%) | 6,500 |
| 50% | 10% relative (55%) | 1,600 |
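The table can be approximated with the standard normal-approximation formula for a two-proportion test (z-scores via Python's `statistics.NormalDist`); exact values drift slightly from the rounded table entries depending on the variance formula used:

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline, relative_mde, alpha=0.05, power=0.80):
    """Per-variant n for a two-sided two-proportion test (normal approximation)."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96
    z_beta = NormalDist().inv_cdf(power)           # ~0.84
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

for baseline in (0.01, 0.05, 0.10, 0.20, 0.50):
    print(f"{baseline:.0%} baseline -> {sample_size_per_variant(baseline, 0.10):,} per variant")
```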
Duration Guidelines:
Minimum: 1 full week (capture weekly patterns)
Typical: 2-4 weeks
Maximum before concern: 6-8 weeks
Factors extending duration:
- Low traffic volume
- Smaller expected effect
- Higher baseline variance
- Need for segment analysis
Statistical Rigor
Key Concepts:
| Concept | Definition | Target |
|---|---|---|
| Significance (α) | False positive rate | 5% (p < 0.05) |
| Power (1-β) | True positive rate | 80% |
| MDE | Minimum Detectable Effect | Smallest meaningful change |
| Confidence Interval | Range of true effect | Doesn't cross zero |
Reading Results:
Experiment Results:
┌─────────────────────────────────────────────────────────┐
│ Metric: Signup Completion Rate │
├─────────────────────────────────────────────────────────┤
│ Control: 12.4% (n=5,200) │
│ Treatment: 13.8% (n=5,180) │
│ │
│ Relative Lift: +11.3% │
│ Absolute Lift: +1.4 percentage points │
│ 95% CI: [+0.8%, +21.8%] │
│ p-value: 0.035 │
│ Status: SIGNIFICANT │
└─────────────────────────────────────────────────────────┘
Interpretation:
- Treatment wins with 11.3% improvement
- CI doesn't include zero = statistically significant
- p < 0.05 = reject null hypothesis
- Ready to ship treatment
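As a sanity check, the headline numbers can be recomputed with a pooled two-proportion z-test. A sketch (with these counts the two-sided p-value lands near 0.035):

```python
import math
from statistics import NormalDist

def two_proportion_ztest(p1, n1, p2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p = two_proportion_ztest(0.124, 5200, 0.138, 5180)
lift = 0.138 / 0.124 - 1
print(f"relative lift: {lift:+.1%}")  # +11.3%
print(f"z = {z:.2f}, p = {p:.3f}")
```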
Common Testing Mistakes
| Mistake | Problem | Prevention |
|---|---|---|
| Peeking | Stopping early on exciting results | Pre-set duration, sequential analysis |
| P-hacking | Running until significant | Pre-register hypothesis |
| HARKing | Hypothesizing after results | Document hypothesis before |
| Segment hunting | Finding any winning segment | Pre-define segments |
| Ignoring power | Too small sample | Calculate sample size upfront |
| No guardrails | Winning metric, losing elsewhere | Define counter-metrics |
| Too many variants | Diluted sample | Limit to 2-4 variants |
Guardrail Metrics
Primary: Signup completion rate (trying to improve)
Guardrails (must not degrade):
┌─────────────────────────────────────────────────────────┐
│ Guardrail │ Threshold │ Result │
├─────────────────────────────────────────────────────────┤
│ D7 retention │ -5% max │ -2% ✓ │
│ Page load time │ +200ms max │ +50ms ✓ │
│ Error rate │ +0.5% max │ +0.1% ✓ │
│ Support tickets │ +10% max │ -5% ✓ │
└─────────────────────────────────────────────────────────┘
Status: All guardrails passed, safe to ship
Good vs Bad Experiment Design
Good: Complete, Rigorous
Experiment: Onboarding Flow Simplification
Background:
Current onboarding has 8 steps, 45% completion rate.
User research shows steps 4-6 cause most drop-offs.
Hypothesis:
Reducing onboarding from 8 to 5 steps (removing low-value steps)
will increase completion rate by 15% because users experience
less friction and see value faster.
Design:
- Primary metric: Onboarding completion rate
- Secondary: Time to complete, D7 activation
- Guardrails: Setup quality score, D30 retention
- Sample: 10,000/variant
- Duration: 3 weeks
- Targeting: New users only, all platforms
Success Criteria:
- Primary: >10% relative lift, p<0.05
- Guardrails: No degradation beyond thresholds
- Qualitative: User feedback positive
Decision Framework:
- If significant win: Ship to 100%
- If neutral: Consider qualitative findings
- If negative: Abandon, iterate
Bad: Incomplete, Unmeasurable
Experiment: Try New Onboarding
Let's test a new onboarding flow.
Variants: A (current) vs B (new)
Run for a week and see what happens.
Problems:
- No hypothesis
- No sample size calculation
- Arbitrary duration
- No guardrails
- No success criteria
- No decision framework
Sequential Testing
For faster decisions with statistical validity:
Sequential Analysis Framework:
┌──────────────────────────────────────────────────────────┐
│ Check Point │ Cumulative n │ Continue if │ Stop if │
├──────────────────────────────────────────────────────────┤
│ Day 3 │ 2,000 │ |z| < 2.8 │ |z| > 2.8 │
│ Day 7 │ 5,000 │ |z| < 2.5 │ |z| > 2.5 │
│ Day 14 │ 10,000 │ |z| < 2.2 │ |z| > 2.2 │
│ Day 21 │ 15,000 │ Final decision │
└──────────────────────────────────────────────────────────┘
Benefits:
- Can stop early for clear winners/losers
- Maintains statistical validity
- Faster iteration cycle
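The checkpoint rules reduce to a simple lookup. A sketch using the boundaries from the table (the thresholds are illustrative, not derived from a specific alpha-spending function):

```python
# Sequential-test checkpoints: stop early only if the z-statistic clears
# that checkpoint's stricter-than-final boundary.
BOUNDARIES = {3: 2.8, 7: 2.5, 14: 2.2}  # day -> |z| needed to stop early

def decision(day, z):
    if day in BOUNDARIES:
        return "stop" if abs(z) > BOUNDARIES[day] else "continue"
    return "final decision"  # e.g. day 21

print(decision(3, 1.9))   # continue
print(decision(7, 2.7))   # stop
print(decision(14, 2.0))  # continue
print(decision(21, 2.0))  # final decision
```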
Experiment Velocity
| Metric | Target | Current |
|---|---|---|
| Experiments/month | 8-12 | 6 |
| Win rate | 30-40% | 35% |
| Time to decision | 2 weeks | 3 weeks |
| Experiment coverage | 40% of features | 25% |
Increasing Velocity:
- Pre-built experiment templates
- Self-serve experiment platform
- Reduced approval friction
- Parallel testing where possible
Anti-Patterns
- HiPPO override — Ignoring data for opinion
- Winner's curse — Effect sizes shrink in production
- Novelty effects — Early lift fades over time
- Feature flag debt — Experiments live forever
- Analysis paralysis — Running tests that don't matter
- Underpowered tests — Results too noisy to trust
- One-and-done — Not iterating on insights
- No documentation — Learnings lost to time
title: Experimentation Design and Process
impact: HIGH
tags: experiment, design, hypothesis, process
Experimentation Design and Process
Impact: HIGH
Beyond A/B testing mechanics lies experiment design: asking the right questions, designing valid tests, and building a culture of learning.
The Experimentation Hierarchy
Level 5: Continuous Experimentation ← Org-wide, embedded culture
│
Level 4: Experiment Platform ← Self-serve, scaled testing
│
Level 3: Regular A/B Testing ← Team runs experiments routinely
│
Level 2: Ad-hoc Experiments ← Occasional, project-based
│
Level 1: No Experiments ← Ship and hope
Experiment Prioritization
ICE Framework:
| Factor | Question | Score (1-10) |
|---|---|---|
| Impact | How much lift if it wins? | Estimated % improvement |
| Confidence | How likely to win? | Evidence strength |
| Ease | How hard to implement? | Inverse of effort |
ICE Score = Impact × Confidence × Ease
Prioritization Example:
Experiment Ideas Ranked:
┌────────────────────────────────────────────────────────────┐
│ Idea │ I │ C │ E │ Score │ Priority │
├────────────────────────────────────────────────────────────┤
│ Simplify signup form │ 8 │ 7 │ 8 │ 448 │ 1st │
│ Add social proof │ 6 │ 8 │ 9 │ 432 │ 2nd │
│ Redesign onboarding │ 9 │ 5 │ 4 │ 180 │ 3rd │
│ New pricing page │ 7 │ 4 │ 3 │ 84 │ 4th │
└────────────────────────────────────────────────────────────┘
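The ranking reduces to a one-line score. A sketch using the table's values:

```python
# ICE prioritization from the table above: score = Impact x Confidence x Ease.
ideas = {
    "Simplify signup form": (8, 7, 8),
    "Add social proof":     (6, 8, 9),
    "Redesign onboarding":  (9, 5, 4),
    "New pricing page":     (7, 4, 3),
}

def ice(scores):
    impact, confidence, ease = scores
    return impact * confidence * ease

ranked = sorted(ideas, key=lambda name: ice(ideas[name]), reverse=True)
for name in ranked:
    print(name, ice(ideas[name]))
```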
Hypothesis Development
From Observation to Hypothesis:
1. Observation
"Users are dropping off after signup"
2. Research
- Funnel data: 45% drop between signup and first action
- Session recordings: Users seem confused by dashboard
- Survey: "I didn't know what to do first"
3. Insight
"Users lack clear guidance to their first valuable action"
4. Hypothesis
"Adding a guided first-action wizard will increase
signup-to-first-action rate by 15% because users
will have clear direction"
5. Experiment Design
- Control: Current post-signup experience
- Treatment: Guided wizard to first action
- Primary metric: First action completion (7 days)
- Sample: 5,000 per variant
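A sample size like this can be sanity-checked with the standard normal-approximation formula for comparing two proportions. A sketch, assuming the 55% baseline implied by the research above and treating the hypothesized 15% lift as relative (55% → 63.25%); `sample_size_per_variant` is an illustrative helper, not a library call:

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variant(p1: float, p2: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Minimum n per variant to detect p1 -> p2 in a two-sided test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the test
    z_beta = z.inv_cdf(power)            # z-score for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

baseline = 0.55
target = baseline * 1.15                 # 15% relative lift -> 63.25%
print(sample_size_per_variant(baseline, target))
```

For an effect this large the formula returns only a few hundred users per variant, so 5,000 leaves headroom for smaller real-world lifts and per-segment reads.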
Types of Experiments
| Type | Description | When to Use |
|---|---|---|
| A/B Test | Control vs one variant | Optimization, validation |
| A/B/n Test | Control vs multiple variants | Exploring options |
| Multivariate | Multiple variables at once | UI optimization |
| Holdout | Exclude % from all changes | Measure cumulative impact |
| Geo Experiment | Different regions | When randomization hard |
| Switchback | Alternating on/off | Time-sensitive effects |
| Bandit | Adaptive allocation | Fast optimization |
Experiment Documentation
Experiment Brief Template:
┌────────────────────────────────────────────────────────────┐
│ EXPERIMENT BRIEF │
├────────────────────────────────────────────────────────────┤
│ Name: Guided Onboarding Wizard │
│ Owner: Sarah (PM), Jake (Eng) │
│ Status: Design Review │
│ Start Date: TBD │
├────────────────────────────────────────────────────────────┤
│ CONTEXT │
│ Current signup-to-first-action rate is 55%. User research │
│ shows confusion about next steps. Competitors show 70%+. │
├────────────────────────────────────────────────────────────┤
│ HYPOTHESIS │
│ Adding a 3-step guided wizard after signup will increase │
│ first-action completion by 15% by providing clear │
│ direction and reducing cognitive load. │
├────────────────────────────────────────────────────────────┤
│ VARIANTS │
│ Control (50%): Current experience (dashboard with tips) │
│ Treatment (50%): Guided 3-step wizard │
├────────────────────────────────────────────────────────────┤
│ METRICS │
│ Primary: First action completion (7 days) │
│ Secondary: Time to first action, D7 retention │
│ Guardrails: Skip rate, support tickets, D30 retention │
├────────────────────────────────────────────────────────────┤
│ TARGETING │
│ Who: New signups only │
│ Exclusions: Enterprise users, mobile app │
│ Sample size: 10,000 total (5,000/variant) │
│ Duration: 3 weeks minimum │
├────────────────────────────────────────────────────────────┤
│ SUCCESS CRITERIA │
│ Ship if: >10% lift with p<0.05, guardrails pass │
│ Iterate if: 5-10% lift or mixed signals │
│ Kill if: <5% lift or guardrail regression │
├────────────────────────────────────────────────────────────┤
│ RISKS & MITIGATIONS │
│ Risk: Wizard feels forced │
│ Mitigation: Clear skip option, track skip rate │
└────────────────────────────────────────────────────────────┘
Good vs Bad Experiment Design
Good: Complete, Rigorous, Learnable
Experiment: Reduce Checkout Friction
Background:
- Cart abandonment at 68%
- Heatmaps show hesitation at shipping step
- Competitor analysis shows simpler flows
Hypothesis:
Combining shipping and billing into one step will increase
checkout completion by 12% by reducing perceived effort.
Design:
- A/B test, 50/50 split
- Primary: Checkout completion rate
- Secondary: Time to complete, support contacts
- Guardrails: Failed payments, address errors
- n = 8,000/variant (16,000 total)
- Duration: 4 weeks
- Segments: New vs returning, mobile vs desktop
Decision rules:
- >8% lift + p<0.05 + guardrails pass → Ship
- 3-8% lift → Extend + investigate segments
- <3% lift → Kill, learn, iterate
What we'll learn either way:
- Win: Friction reduction works, look for more
- Lose: Friction isn't the issue, test value prop
Bad: Incomplete, No Learning
Experiment: New Checkout
Test the new checkout page.
Variants:
- Old checkout
- New checkout
Run for 2 weeks.
Problems:
- No hypothesis
- No primary metric
- No sample size calculation
- No guardrails
- No decision criteria
- No learning documented
Running an Experimentation Program
Monthly Experiment Cadence:
| Week | Activities |
|---|---|
| Week 1 | Analyze previous experiments, update backlog |
| Week 2 | Design next experiments, build variants |
| Week 3 | Launch experiments, monitor ramp |
| Week 4 | Analyze results, document learnings |
Experiment Pipeline:
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ BACKLOG │ → │ DESIGN │ → │ IMPLEMENT │
│ (20+ ideas) │ │ (3-5 briefs) │ │ (2-3 ready) │
└──────────────┘ └──────────────┘ └──────────────┘
│ │ │
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ RUNNING │ ← │ LAUNCHED │ ← │ QA │
│ (2-4 live) │ │ (ramping) │ │ (testing) │
└──────────────┘ └──────────────┘ └──────────────┘
│
▼
┌──────────────┐ ┌──────────────┐
│ ANALYZE │ → │ DOCUMENT │
│ (results) │ │ (learnings) │
└──────────────┘ └──────────────┘
Experiment Velocity Metrics
| Metric | Definition | Target |
|---|---|---|
| Experiments/Month | Tests completed | 8-12 |
| Win Rate | % with positive result | 30-40% |
| Coverage | % of features tested | >40% |
| Cycle Time | Idea to result (days) | <21 |
| Learning Rate | Insights documented | 100% |
Building Experiment Culture
Signs of Good Culture:
✓ PMs write hypotheses before building
✓ Engineers suggest experiments proactively
✓ Failed experiments are celebrated for learning
✓ Decisions cite experiment results
✓ Experiment results are shared widely
✓ Backlog is always full of ideas
Signs of Poor Culture:
✗ Experiments are done to "validate" decisions already made
✗ Inconclusive results are spun as wins
✗ Only "safe" experiments are run
✗ Results aren't documented or shared
✗ Leadership overrides experiment results
✗ Teams don't know experiment status
Experimentation Ethics
| Principle | Guidelines |
|---|---|
| Informed users | Disclose testing in terms of service |
| No harm | Don't test things that hurt users |
| Fair value | Don't withhold core value from control |
| Privacy | Only collect necessary data |
| Sensitive topics | Extra care with health, finance |
Anti-Patterns
- HIPPO culture — Highest-paid person overrides data
- P-hacking — Running until you get significance
- Cherry-picking — Only reporting favorable segments
- Feature factory — Shipping without measuring
- One-and-done — Not iterating on learnings
- Analysis paralysis — Perfect is enemy of good
- Cargo cult — Running experiments without understanding stats
- Winner's curse — Expecting production results = test results
title: Feature Adoption Tracking
impact: HIGH
tags: feature, adoption, usage, product-analytics
Feature Adoption Tracking
Impact: HIGH
Shipping features is easy. Getting users to adopt them is hard. Track adoption systematically to understand what's working and where to invest.
The Feature Adoption Lifecycle
Awareness → Trial → Adoption → Habit → Advocacy
│ │ │ │ │
▼ ▼ ▼ ▼ ▼
"Saw it" "Tried it" "Use it" "Need it" "Love it"
│ │ │ │ │
Metrics: Exposure First Use Regular Power Referral
Rate Rate Usage Usage Rate
Core Feature Adoption Metrics
| Metric | Calculation | Good Target |
|---|---|---|
| Awareness Rate | Users who saw feature / Total eligible users | >80% |
| Trial Rate | Users who tried feature / Users aware | >40% |
| Adoption Rate | Regular users / Users who tried | >50% |
| Feature DAU | Daily active users of feature | Depends |
| Feature Breadth | % of user base using feature weekly | >30% for core |
| Feature Depth | Actions/session using feature | Growing |
| Time to Adopt | Days from signup to regular use | <14 days |
Feature Adoption Framework
Step 1: Define What "Adopted" Means
Feature: Collaboration Comments
NOT Adopted:
- Saw comments feature (awareness only)
- Left one comment ever (trial only)
Adopted:
- Left 3+ comments in 2 different weeks
- Received 2+ replies to comments
- Comments are part of regular workflow
Step 2: Build the Adoption Funnel
Eligible Users 1,000 100%
│
▼ Saw feature
Aware 820 82%
│
▼ Clicked/opened
Explored 410 41%
│
▼ Completed first use
Tried 285 28.5%
│
▼ Used 3+ times
Activated 170 17%
│
▼ Regular weekly use
Adopted 95 9.5%
│
▼ Daily/power use
Power User 28 2.8%
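Step and cumulative conversion for a funnel like this follow directly from the stage counts. A sketch using the numbers above:

```python
# Adoption funnel: step conversion and cumulative conversion from raw counts.
funnel = [
    ("Eligible", 1000),
    ("Aware", 820),
    ("Explored", 410),
    ("Tried", 285),
    ("Activated", 170),
    ("Adopted", 95),
    ("Power User", 28),
]

top = funnel[0][1]
for (prev_name, prev_n), (name, n) in zip(funnel, funnel[1:]):
    step = n / prev_n        # conversion from the previous stage
    cumulative = n / top     # conversion from the top of the funnel
    print(f"{prev_name} -> {name}: step {step:.0%}, cumulative {cumulative:.1%}")
```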
Step 3: Measure by Cohort
Feature launched: March 1
Adoption rate by signup cohort:
┌────────────────────────────────────────────┐
│ Signup Cohort │ 30-Day Adoption Rate │
├────────────────────────────────────────────┤
│ Pre-launch │ 8% (discovered later) │
│ March W1 │ 12% (early exposure) │
│ March W2 │ 18% (improved UX) │
│ March W3 │ 22% (onboarding added) │
│ March W4 │ 25% (tutorial added) │
└────────────────────────────────────────────┘
Insight: Each improvement increases adoption
Feature Usage Segmentation
Usage Depth Tiers:
| Tier | Definition | % of Feature Users |
|---|---|---|
| Power | Daily use, advanced actions | 5-10% |
| Regular | Weekly use, core actions | 20-30% |
| Occasional | Monthly use, basic actions | 30-40% |
| Trial | 1-2 uses total | 25-40% |
Segment Analysis:
Feature: Advanced Analytics
User Segment Analysis:
┌─────────────────────────────────────────────────────────┐
│ User Type │ Adoption │ Depth │ Retention Impact│
├─────────────────────────────────────────────────────────┤
│ Enterprise │ 68% │ High │ +25% retention │
│ Mid-market │ 42% │ Medium │ +12% retention │
│ SMB │ 15% │ Low │ +5% retention │
│ Free tier │ 3% │ Very low │ +2% retention │
└─────────────────────────────────────────────────────────┘
Insight: Feature is enterprise-focused, not relevant to SMB
Action: Gate feature for SMB, or simplify for their needs
Feature Success Scorecard
| Dimension | Metric | Target | Status |
|---|---|---|---|
| Reach | % of users who tried | 40% | 28% |
| Adoption | % of triers who adopted | 50% | 35% |
| Engagement | Actions/week by adopters | 5 | 3.2 |
| Retention | Feature user retention vs non | +10% | +8% |
| Value | NPS of feature users | 50 | 42 |
| Revenue | Conversion impact | +5% | +3% |
Overall Score: Needs improvement (all 6 dimensions under target)
Feature Adoption vs Retention Impact
Question: Is this feature worth keeping?
Feature Impact Matrix:
Low Retention Impact High Retention Impact
──────────────────────────────────────────────
High Adoption │ SIMPLIFY │ INVEST
│ Popular but not │ Core feature,
│ sticky. Reduce cost │ double down
│ │
Low Adoption │ SUNSET │ EXPOSE
│ Nobody uses, │ Valuable but hidden.
│ nobody misses │ Improve discovery
──────────────────────────────────────────────
Good vs Bad Feature Tracking
Good: Complete Picture
Feature: Smart Scheduling
Tracking implemented:
├── Awareness events
│ ├── smart_schedule_banner_shown
│ ├── smart_schedule_tooltip_shown
│ └── smart_schedule_menu_seen
├── Trial events
│ ├── smart_schedule_clicked
│ ├── smart_schedule_first_created
│ └── smart_schedule_preview_viewed
├── Adoption events
│ ├── smart_schedule_created (with count property)
│ ├── smart_schedule_accepted
│ └── smart_schedule_modified
└── Value events
├── smart_schedule_conflict_resolved
└── smart_schedule_time_saved (calculated)
Dashboard includes:
- Adoption funnel
- Cohort adoption curves
- Segment breakdown
- Retention impact
- Feature health score
Bad: Basic Counting
Feature: Smart Scheduling
Tracking implemented:
- smart_schedule_used (single event)
Problems:
- Can't distinguish trial vs adoption
- No awareness tracking
- No segment analysis
- No value measurement
- No retention correlation
Feature Launch Measurement Plan
Pre-Launch:
- Define adoption criteria
- Instrument all funnel events
- Set success targets
- Build dashboard
- Establish baseline (if applicable)
Launch Week:
Day 1-3: Awareness check
- Are users seeing the feature?
- Is discovery working?
Day 3-7: Trial check
- Are aware users trying it?
- What's the drop-off point?
Day 7-14: Early adoption
- Are triers coming back?
- What patterns emerge?
Post-Launch (30/60/90 days):
Monthly Review:
┌─────────────────────────────────────────────────────────┐
│ Metric │ Target │ Day 30 │ Day 60 │ Day 90│
├─────────────────────────────────────────────────────────┤
│ Adoption rate │ 25% │ 12% │ 18% │ 22% │
│ Weekly active users │ 1,000 │ 400 │ 750 │ 920 │
│ Retention impact │ +10% │ +4% │ +7% │ +9% │
│ User satisfaction │ 4.2 │ 3.8 │ 4.0 │ 4.1 │
└─────────────────────────────────────────────────────────┘
Feature Sunset Criteria
When to consider sunsetting:
| Signal | Threshold | Action |
|---|---|---|
| Adoption | <5% after 6 months | Sunset or pivot |
| Retention impact | <2% difference | Low priority |
| Maintenance cost | High, growing | Evaluate ROI |
| User complaints | More complaints than praise | Investigate or sunset |
| Strategic fit | Off-roadmap | Planned deprecation |
Anti-Patterns
- Ship and forget — No measurement plan = no learning
- Counting clicks — Clicks ≠ adoption, measure meaningful use
- Ignoring non-users — Why didn't they try? That's data too
- Universal metrics — Different features need different criteria
- Short windows — Features need 30-90 days to reach adoption
- No control group — Can't measure impact without comparison
- Feature bloat — More features ≠ better product
- Vanity features — Building for press releases, not users
title: Funnel Analysis and Optimization
impact: CRITICAL
tags: funnel, conversion, optimization, growth
Funnel Analysis and Optimization
Impact: CRITICAL
Funnels reveal where users drop off and where optimization delivers the biggest returns. Master funnel analysis to find your highest-leverage growth opportunities.
Funnel Fundamentals
A funnel tracks users through a sequence of steps toward a goal. Each step has conversion and drop-off rates.
Step 1: Landing Page Visit 100% ─┐
│ 60% convert
Step 2: Signup Started 60% ─┤
│ 80% convert
Step 3: Signup Completed 48% ─┤
│ 50% convert
Step 4: Onboarding Done 24% ─┤
│ 40% convert
Step 5: First Core Action 9.6% ─┘
Overall Conversion: 9.6%
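The overall rate is simply the product of the per-step conversion rates, which is worth scripting when comparing scenarios. A sketch with the rates from the diagram:

```python
from math import prod

# Overall funnel conversion = product of per-step conversion rates.
step_cvrs = {
    "Visit -> Signup Started": 0.60,
    "Signup Started -> Completed": 0.80,
    "Completed -> Onboarding Done": 0.50,
    "Onboarding Done -> First Core Action": 0.40,
}
overall = prod(step_cvrs.values())
print(f"Overall conversion: {overall:.1%}")  # 9.6%
```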
Core Funnel Metrics
| Metric | Calculation | Interpretation |
|---|---|---|
| Step CVR | Users at Step N+1 / Users at Step N | Individual step performance |
| Cumulative CVR | Users at Step N / Users at Step 1 | Overall funnel performance |
| Drop-off Rate | 1 - Step CVR | Loss at each step |
| Funnel Velocity | Time from Step 1 to final step | Speed through funnel |
| Recovery Rate | Returning drop-offs / Total drop-offs | Re-engagement success |
Building Effective Funnels
Step 1: Define the Goal
What outcome defines success?
- Signup completed
- First purchase made
- Feature activated
- Subscription started
Step 2: Map the Critical Path
What steps MUST users complete?
1. Landing page → Signup form
2. Signup form → Email verified
3. Email verified → Onboarding complete
4. Onboarding complete → First action
Step 3: Instrument Events
Each step needs a trackable event:
- page_viewed (landing)
- signup_started
- signup_completed
- email_verified
- onboarding_completed
- first_action_taken
Step 4: Define the Conversion Window
How long to wait before calling it "dropped"?
- Signup funnel: 1 session or 30 minutes
- Onboarding funnel: 7 days
- Purchase funnel: 30 days
- Enterprise sales: 90 days
Funnel Analysis Techniques
1. Step-by-Step CVR Analysis
| Step Transition | CVR | Benchmark | Status |
|---|---|---|---|
| Visit → Signup Started | 35% | 40% | Under |
| Signup Started → Completed | 75% | 80% | Under |
| Completed → Activated | 50% | 45% | Good |
| Activated → Paid | 12% | 10% | Good |
2. Segment Comparison
Signup Funnel by Source:
┌──────────────────────────────────────────────────┐
│ Source │ Started │ Completed │ Activated │
├──────────────────────────────────────────────────┤
│ Google Ads │ 100% │ 68% │ 25% │
│ Organic │ 100% │ 82% │ 45% │
│ Referral │ 100% │ 88% │ 62% │
│ Social │ 100% │ 55% │ 18% │
└──────────────────────────────────────────────────┘
Insight: Referral traffic activates at 2.5x the rate of Google Ads (62% vs 25%)
Action: Increase investment in referral program
3. Time-Based Analysis
Drop-off timing within onboarding step:
┌──────────────────────────────────────────────────┐
│ Time in Step │ Still Active │ % of Drop-offs │
├──────────────────────────────────────────────────┤
│ 0-1 min │ 82% │ 35% │
│ 1-3 min │ 65% │ 28% │
│ 3-5 min │ 51% │ 22% │
│ 5+ min │ 48% │ 15% │
└──────────────────────────────────────────────────┘
Insight: 35% of drop-offs happen in first minute
Action: Investigate first-screen friction
Finding the Biggest Opportunity
Calculate Impact Potential:
Current funnel:
Visit (1000) → Start (350) → Complete (280) → Activate (140) → Pay (17)
Impact of 10% improvement at each step:
Step improved New Paying Lift vs Baseline
────────────────────────────────────────────────
Visit → Start 18.7 +10%
Start → Complete 18.7 +10%
Complete → Activate 18.7 +10%
Activate → Pay 18.7 +10%
If the relative lift is equal at every step, favor the EARLIEST step: a 10%
improvement at Step 1 sends 10% more users into every downstream step,
growing absolute volume (and experiment traffic) at each stage.
The Opportunity Formula:
Opportunity Score = Drop-off Volume × Conversion Headroom
Step Volume Current CVR Benchmark Headroom Score
────────────────────────────────────────────────────────────────────────
Visit → Start 1000 35% 45% 10% 100
Start → Complete 350 80% 85% 5% 17.5
Complete → Active 280 50% 60% 10% 28
Active → Pay 140 12% 15% 3% 4.2
Focus: Visit → Start has highest opportunity score
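Computing the score column is a one-liner per step; the sketch below mirrors the table's volumes, current CVRs, and benchmarks:

```python
# Opportunity Score = step entry volume x conversion headroom vs benchmark.
steps = [
    ("Visit -> Start", 1000, 0.35, 0.45),
    ("Start -> Complete", 350, 0.80, 0.85),
    ("Complete -> Active", 280, 0.50, 0.60),
    ("Active -> Pay", 140, 0.12, 0.15),
]

scored = [(name, volume * max(benchmark - current, 0.0))
          for name, volume, current, benchmark in steps]

for name, score in sorted(scored, key=lambda s: s[1], reverse=True):
    print(f"{name}: {score:.1f}")
```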
Common Funnel Drop-off Causes
| Drop-off Point | Common Causes | Diagnostic Actions |
|---|---|---|
| Landing → Signup | Unclear value prop, slow load, trust issues | A/B test headlines, speed audit, add social proof |
| Signup Started → Complete | Too many fields, technical errors, friction | Form analytics, error tracking, reduce fields |
| Signup → Onboarding | Email not verified, confusing next steps | Email deliverability, clearer CTA |
| Onboarding → Activation | Overwhelming UX, unclear value, too long | User session recordings, simplify flow |
| Activation → Retention | Value not realized, wrong ICP, bugs | Cohort analysis, user interviews |
| Free → Paid | Price, perceived value, friction | Pricing tests, value messaging, checkout UX |
Good vs Bad Funnel Examples
Good: Well-Defined, Actionable
Goal: Free trial to paid conversion
Funnel Definition:
1. Trial Started (tracked: trial_started event)
2. Core Feature Used (tracked: first_report_created)
3. Team Invited (tracked: first_team_member_added)
4. Limit Hit (tracked: usage_limit_reached)
5. Upgrade Started (tracked: upgrade_clicked)
6. Paid (tracked: subscription_created)
Conversion Window: 14-day trial period
Analysis Segmentation:
- By company size
- By acquisition source
- By first feature used
- By engagement level
Clear ownership: Growth team owns steps 1-3, Sales owns 4-6
Bad: Vague, Unmeasurable
Goal: Get more customers
Funnel:
1. Someone visits
2. They like it
3. They sign up
4. They use it
5. They pay
Problems:
- Events not defined
- "Like it" not measurable
- No conversion windows
- No segmentation plan
- No ownership
Optimization Playbook
For each low-converting step:
1. Quantify the drop-off
   - How many users?
   - What's the benchmark?
   - What's the revenue impact?
2. Diagnose the cause
   - Watch session recordings
   - Check error logs
   - Run user surveys at drop-off
   - Analyze by segment
3. Hypothesize and prioritize
   - List possible fixes
   - Estimate impact × confidence
   - Pick highest-ROI test
4. Test and measure
   - A/B test the change
   - Wait for significance
   - Measure downstream effects
5. Learn and iterate
   - Document results
   - Roll out winners
   - Update benchmarks
Anti-Patterns
- Too many steps — Combine micro-steps, focus on key conversions
- Wrong conversion window — Too short misses slow converters, too long dilutes signal
- Ignoring segments — Averages hide important cohort differences
- Single-metric focus — Improving one step might hurt another
- No baseline — Track historical performance before optimizing
- Premature optimization — Get statistical significance first
- Funnel-only thinking — Some users take non-linear paths
title: Impact Measurement
impact: HIGH
tags: impact, measurement, roi, outcomes
Impact Measurement
Impact: HIGH
Shipping features is easy. Proving they worked is hard. Rigorous impact measurement separates "we launched" from "we succeeded."
The Impact Measurement Framework
┌──────────────────────────────────────────────────────────────┐
│ IMPACT MEASUREMENT │
├──────────────────────────────────────────────────────────────┤
│ │
│ 1. DEFINE SUCCESS │
│ What outcome are we trying to achieve? │
│ ↓ │
│ 2. ESTABLISH BASELINE │
│ What's the current state? │
│ ↓ │
│ 3. SET TARGETS │
│ What improvement do we expect? │
│ ↓ │
│ 4. DESIGN MEASUREMENT │
│ How will we know if we succeeded? │
│ ↓ │
│ 5. ANALYZE RESULTS │
│ Did we achieve the target? │
│ ↓ │
│ 6. ATTRIBUTE CAUSATION │
│ Was it because of our change? │
│ │
└──────────────────────────────────────────────────────────────┘
Types of Impact Measurement
| Type | Approach | Confidence | When to Use |
|---|---|---|---|
| A/B Test | Randomized control | High | Optimization |
| Pre/Post | Before vs after | Medium | When A/B not possible |
| Cohort Comparison | New vs old users | Medium | Feature additions |
| Difference-in-Difference | Treatment vs control over time | Medium-High | Quasi-experimental |
| Instrumentation | Event tracking | N/A | Adoption measurement |
Setting Success Criteria Before Launch
Pre-Launch Success Definition:
Feature: Smart Recommendations
Success Criteria Document:
┌──────────────────────────────────────────────────────────────┐
│ Metric │ Baseline │ Target │ Measurement │
├──────────────────────────────────────────────────────────────┤
│ Click-through rate │ 2.1% │ 3.5% │ A/B test │
│ Items viewed/session │ 3.2 │ 4.5 │ A/B test │
│ Conversion rate │ 4.8% │ 5.5% │ A/B test │
│ Adoption rate │ N/A │ 40% │ Tracking │
│ User satisfaction │ 3.8 │ 4.2 │ Survey │
├──────────────────────────────────────────────────────────────┤
│ Timeline: 30 days post-launch │
│ Decision: Keep if 2+ primary metrics hit target │
│ Owner: PM + Analytics │
└──────────────────────────────────────────────────────────────┘
Impact Measurement Methods
1. A/B Test Impact
Approach: Randomized experiment
Best for: Feature optimization, UX changes
Example:
┌──────────────────────────────────────────────────────────────┐
│ New Checkout Flow A/B Test Results │
├──────────────────────────────────────────────────────────────┤
│ Metric │ Control │ Treatment │ Lift │ p-value │
├──────────────────────────────────────────────────────────────┤
│ Checkout CVR │ 12.4% │ 14.8% │ +19.4% │ 0.001 │
│ Avg order value │ $87 │ $89 │ +2.3% │ 0.142 │
│ Cart abandonment │ 68% │ 62% │ -8.8% │ 0.003 │
├──────────────────────────────────────────────────────────────┤
│ Impact: +$142K estimated annual revenue │
│ Confidence: HIGH (A/B tested with significance) │
└──────────────────────────────────────────────────────────────┘
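The p-values in a table like this come from a pooled two-proportion z-test. The counts below are hypothetical (8,000 users per variant, chosen to match the 12.4% and 14.8% rates); only the formula itself is standard:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_p_value(x1: int, n1: int, x2: int, n2: int) -> float:
    """Two-sided p-value for a pooled two-proportion z-test."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = abs(p1 - p2) / se
    return 2 * (1 - NormalDist().cdf(z))

# 12.4% (992/8000) vs 14.8% (1184/8000): a clearly significant difference.
p = two_proportion_p_value(992, 8000, 1184, 8000)
print(f"p = {p:.2g}")
```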
2. Pre/Post Analysis
Approach: Compare before and after launch
Best for: When A/B test not feasible
Example:
┌──────────────────────────────────────────────────────────────┐
│ New Onboarding Pre/Post Analysis │
├──────────────────────────────────────────────────────────────┤
│ Period │ Signup → Active │ D7 Retention │
├──────────────────────────────────────────────────────────────┤
│ Pre (4 weeks) │ 45% │ 32% │
│ Post (4 weeks) │ 58% │ 41% │
│ Change │ +13pp │ +9pp │
├──────────────────────────────────────────────────────────────┤
│ Confounds to consider: │
│ - Seasonality (controlled: same weeks YoY similar) │
│ - Marketing changes (controlled: no campaign changes) │
│ - Product changes (controlled: isolated change) │
├──────────────────────────────────────────────────────────────┤
│ Confidence: MEDIUM (confounds controlled but not eliminated) │
└──────────────────────────────────────────────────────────────┘
3. Cohort Comparison
Approach: Compare users who had feature vs those who didn't
Best for: Gradual rollouts, feature additions
Example:
┌──────────────────────────────────────────────────────────────┐
│ New Feature Cohort Analysis │
├──────────────────────────────────────────────────────────────┤
│ Cohort │ D30 Retention │ ARPU │ NPS │
├──────────────────────────────────────────────────────────────┤
│ With feature │ 48% │ $42 │ 52 │
│ Without feature │ 38% │ $35 │ 41 │
│ Difference │ +10pp │ +$7 │ +11 │
├──────────────────────────────────────────────────────────────┤
│ Caution: Self-selection bias possible │
│ Users who adopt features may be inherently more engaged │
└──────────────────────────────────────────────────────────────┘
Calculating Business Impact
Revenue Impact Calculation:
Feature: Improved Recommendations
Metrics:
- 10,000 daily users exposed
- CTR improved from 2% to 3.5% (+75%)
- CVR on clicked items: 8%
- AOV: $50
Calculation:
Before: 10,000 × 2% × 8% × $50 = $800/day
After: 10,000 × 3.5% × 8% × $50 = $1,400/day
Lift: $600/day = $219,000/year
Impact Statement:
"Smart Recommendations drive an estimated $219K additional
annual revenue through improved discovery."
LTV Impact Calculation:
Feature: Onboarding Improvements
Metrics:
- 30-day retention improved 10pp (32% → 42%)
- Monthly churn decreased from 8% to 6%
- Average monthly revenue: $30
LTV Calculation:
Before: $30 / 8% = $375 LTV
After: $30 / 6% = $500 LTV
Lift: +$125 per customer (+33%)
With 1,000 new customers/month:
Annual LTV impact: $125 × 1,000 × 12 = $1.5M
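Both calculations reduce to one-line formulas, which makes them easy to keep in a shared script. A sketch with the same inputs as above:

```python
# Revenue impact: users x CTR x CVR x AOV, annualized.
def daily_revenue(users: int, ctr: float, cvr: float, aov: float) -> float:
    return users * ctr * cvr * aov

before = daily_revenue(10_000, 0.02, 0.08, 50)     # $800/day
after = daily_revenue(10_000, 0.035, 0.08, 50)     # $1,400/day
annual_lift = (after - before) * 365
print(f"Annual revenue lift: ${annual_lift:,.0f}")

# LTV impact: geometric-series LTV = monthly revenue / monthly churn.
def simple_ltv(monthly_revenue: float, monthly_churn: float) -> float:
    return monthly_revenue / monthly_churn

lift_per_customer = simple_ltv(30, 0.06) - simple_ltv(30, 0.08)
print(f"LTV lift per customer: ${lift_per_customer:.0f}")
```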
Good vs Bad Impact Measurement
Good: Rigorous, Pre-defined
Feature Launch: Team Collaboration
Pre-Launch:
- Success criteria defined
- Baseline metrics captured
- A/B test designed
- 30-day measurement plan
Results (30 days):
┌──────────────────────────────────────────────────────────────┐
│ Metric │ Goal │ Result │ Status │
├──────────────────────────────────────────────────────────────┤
│ Adoption rate │ 25% │ 32% │ ✓ Exceeded │
│ Team retention │ +5pp │ +8pp │ ✓ Exceeded │
│ Seats per team │ +0.5 │ +0.8 │ ✓ Exceeded │
│ Support tickets │ No lift │ +5% │ ⚠ Watch │
├──────────────────────────────────────────────────────────────┤
│ Decision: Ship, address support ticket drivers │
│ Impact: Est. $340K ARR from seat expansion │
└──────────────────────────────────────────────────────────────┘
Bad: Post-hoc, Cherry-picked
Feature Launch: Team Collaboration
Post-Launch:
- "Let's see what metrics improved"
- Found: Page views up 15%!
- Declared: Success!
Problems:
- No pre-defined success criteria
- No baseline comparison
- Cherry-picked positive metric
- Page views ≠ business value
- No causal attribution
Attribution Challenges
| Challenge | Description | Mitigation |
|---|---|---|
| Confounds | Other things changed | Control for variables |
| Selection Bias | Non-random exposure | Randomized experiment |
| Novelty Effect | Initial excitement fades | Longer measurement window |
| Regression to Mean | Extreme values normalize | Use control groups |
| Hawthorne Effect | Behavior changes when watched | Blind experiments |
Impact Reporting Template
┌──────────────────────────────────────────────────────────────┐
│ IMPACT REPORT │
│ Feature: Smart Search │
├──────────────────────────────────────────────────────────────┤
│ SUMMARY │
│ Smart Search improved search conversion by 23% in A/B test, │
│ generating estimated $180K annual revenue. │
├──────────────────────────────────────────────────────────────┤
│ HYPOTHESIS │
│ AI-powered search suggestions would increase search→result │
│ clicks by 20% by surfacing more relevant results. │
├──────────────────────────────────────────────────────────────┤
│ METHODOLOGY │
│ - A/B test, 50/50 split │
│ - n = 45,000 (22,500/variant) │
│ - Duration: 28 days │
│ - Primary metric: Search→click rate │
├──────────────────────────────────────────────────────────────┤
│ RESULTS │
│ Metric │ Control │ Treatment │ p-value │
│ Search→click rate │ 18.2% │ 22.4% │ <0.001 │
│ Searches/session │ 1.8 │ 2.1 │ 0.003 │
│ Purchase after search│ 4.2% │ 4.8% │ 0.021 │
├──────────────────────────────────────────────────────────────┤
│ BUSINESS IMPACT │
│ - 23% more users find what they're looking for │
│ - 14% increase in search-driven purchases │
│ - Estimated $180K additional annual revenue │
├──────────────────────────────────────────────────────────────┤
│ RECOMMENDATION │
│ Ship to 100%. Consider extending to mobile app. │
├──────────────────────────────────────────────────────────────┤
│ LEARNINGS │
│ - AI suggestions most effective for complex queries │
│ - Less impact for branded/simple searches │
│ - Mobile users engaged more with suggestions │
└──────────────────────────────────────────────────────────────┘
Impact Measurement Cadence
| Timeframe | Focus | Activities |
|---|---|---|
| Pre-Launch | Definition | Success criteria, baseline, measurement plan |
| Launch +7d | Adoption | Is feature being used? Any bugs? |
| Launch +30d | Primary | Did we hit success criteria? |
| Launch +90d | Sustained | Are gains holding? Any degradation? |
| Quarterly | Portfolio | Which features drove most impact? |
Anti-Patterns
- Post-hoc success definition — Defining success after seeing results
- Vanity metrics — Measuring what looks good, not what matters
- No baseline — Can't measure lift without starting point
- Ignoring context — External factors explain "lift"
- Single metric obsession — Winning primary while losing guardrails
- Short-term thinking — Week 1 results ≠ long-term impact
- No follow-up — Measuring once, never checking if it holds
- Correlation claims — "Feature users retain better" (selection bias)
title: Data Instrumentation and Events
impact: HIGH
tags: instrumentation, events, tracking, data-quality
Data Instrumentation and Events
Impact: HIGH
Bad data in, bad decisions out. Solid instrumentation is the foundation of all product analytics. Invest here first.
Event Tracking Fundamentals
What is an event? An event is a record of something that happened: a user action, system occurrence, or state change.
{
"event": "button_clicked",
"timestamp": "2024-01-15T10:30:00Z",
"user_id": "user_123",
"properties": {
"button_name": "signup_cta",
"page": "homepage",
"variant": "blue"
}
}
Event Taxonomy Principles
1. Naming Conventions
| Convention | Format | Example |
|---|---|---|
| Object_Action | noun_verb | signup_completed, project_created |
| Action_Object | verb_noun | completed_signup, created_project |
| Page_Action | context_verb | onboarding_step_viewed |
Pick ONE convention and stick to it.
2. Event Hierarchy
Level 1: Categories (high-level)
├── user_lifecycle
├── content_engagement
├── feature_usage
├── commerce
└── system
Level 2: Objects (what)
├── user_lifecycle
│ ├── signup
│ ├── login
│ └── subscription
├── feature_usage
│ ├── project
│ ├── report
│ └── export
Level 3: Actions (happened)
├── user_lifecycle
│ ├── signup
│ │ ├── started
│ │ ├── completed
│ │ └── failed
3. Property Standards
Every event should include:
┌─────────────────────────────────────────────────────────┐
│ Property │ Type │ Description │
├─────────────────────────────────────────────────────────┤
│ timestamp │ datetime │ When event occurred │
│ user_id │ string │ Unique user identifier │
│ session_id │ string │ Session identifier │
│ platform │ string │ web, ios, android │
│ app_version │ string │ Application version │
└─────────────────────────────────────────────────────────┘
Event-specific properties:
┌─────────────────────────────────────────────────────────┐
│ Property │ Type │ Example │
├─────────────────────────────────────────────────────────┤
│ object_id │ string │ project_456 │
│ object_type │ string │ project │
│ value │ number │ 99.99 │
│ source │ string │ email_campaign │
│ variant │ string │ experiment_a │
└─────────────────────────────────────────────────────────┘
Building an Event Spec
Event Specification Document:
Event: project_created
Category: feature_usage
Description: User creates a new project
Trigger:
- When: After successful project save to database
- Where: Project creation flow (web, mobile)
- Who: Authenticated users only
Properties:
┌───────────────────────────────────────────────────────────────┐
│ Property │ Type │ Required │ Description │
├───────────────────────────────────────────────────────────────┤
│ project_id │ string │ Yes │ Unique project ID │
│ project_name │ string │ Yes │ User-defined name │
│ template_used │ string │ No │ Template ID if used │
│ team_id │ string │ No │ Team if shared │
│ source │ string │ Yes │ Where created from │
│ time_to_create │ number │ Yes │ Seconds from start │
└───────────────────────────────────────────────────────────────┘
Source values: onboarding, dashboard, import, api, template
Example:
{
  "event": "project_created",
  "timestamp": "2024-01-15T10:30:00Z",
  "user_id": "user_123",
  "properties": {
    "project_id": "proj_789",
    "project_name": "Q1 Marketing",
    "template_used": "marketing_template",
    "team_id": "team_456",
    "source": "template",
    "time_to_create": 45
  }
}
Good vs Bad Event Design
Good: Clear, Complete, Consistent
// Clear event name
event: project_created
// Complete properties
properties: {
  project_id: "proj_123",
  project_type: "marketing",
  template_used: "q1_template",
  created_from: "dashboard",
  team_size: 5,
  is_first_project: true
}
// Consistent naming
- project_created (not "Created Project")
- project_updated (not "projectUpdated")
- project_deleted (not "delete_project")
Bad: Vague, Incomplete, Inconsistent
// Vague event name
event: click
// Missing context
properties: {
  button: "create"
}
// Inconsistent naming
- click
- Project Created
- updateProject
- deleted
Problems:
- Can't identify what was clicked
- Missing user context
- No way to build funnels
- Inconsistent casing
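The casing rule above can be enforced mechanically — a small lint sketch (the regex checks lowercase snake_case only; object-first, past-tense-verb-last word order still needs human review):

```javascript
// Lint check for event names: lowercase words joined by underscores.
const SNAKE_CASE = /^[a-z]+(_[a-z]+)*$/;

function isValidEventName(name) {
  return SNAKE_CASE.test(name);
}

isValidEventName("project_created"); // true
isValidEventName("Created Project"); // false — spaces and capitals
isValidEventName("projectUpdated");  // false — camelCase
```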
Common Event Patterns
1. Funnel Events
signup_started → Entry point
signup_email_entered → Step 1
signup_password_set → Step 2
signup_profile_done → Step 3
signup_completed → Success
signup_failed → Track errors
2. Feature Usage Events
feature_viewed → Saw the feature
feature_clicked → Interacted
feature_used → Completed action
feature_error → Something went wrong
3. Commerce Events
product_viewed → Browsing
product_added_to_cart → Intent
checkout_started → High intent
purchase_completed → Conversion
4. Search/Discovery Events
search_performed → User searched
search_results_shown → Results displayed
search_result_clicked → Selected result
search_no_results → Empty state
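A funnel like the signup pattern above reduces to per-step event counts; a sketch of the step-to-step conversion math (counts are illustrative):

```javascript
// Given ordered per-step event counts, compute conversion from the previous
// step and the corresponding drop-off at each step.
function funnelBreakdown(steps) {
  return steps.map((step, i) => {
    const prev = i === 0 ? step.count : steps[i - 1].count;
    const conversion = prev === 0 ? 0 : step.count / prev;
    return { step: step.name, count: step.count, conversion, dropOff: 1 - conversion };
  });
}

const report = funnelBreakdown([
  { name: "signup_started", count: 1000 },
  { name: "signup_email_entered", count: 800 },
  { name: "signup_password_set", count: 700 },
  { name: "signup_completed", count: 560 },
]);
// report[1].conversion is 0.8 — 20% drop off after signup_started
```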
Data Quality Monitoring
Quality Dimensions:
| Dimension | What to Check | Alert Threshold |
|---|---|---|
| Completeness | % of required properties filled | <95% |
| Freshness | Lag from event to warehouse | >15 minutes |
| Volume | Event count vs expected | ±30% from baseline |
| Validity | % passing schema validation | <99% |
| Uniqueness | Duplicate event rate | >1% |
Monitoring Dashboard:
Event Health Check (Last 24h):
| Event | Count | Quality | Status |
|---|---|---|---|
| signup_completed | 2,450 | 99.2% | ✓ Healthy |
| project_created | 8,230 | 98.7% | ✓ Healthy |
| report_exported | 1,120 | 87.3% | ⚠ Low quality |
| purchase_completed | 342 | 99.8% | ✓ Healthy |
| feature_x_used | 0 | N/A | ✗ No events |
Alerts:
- report_exported: Missing 'format' property in 12% of events
- feature_x_used: No events in 24h (expected 500+)
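The completeness dimension from the table above can be computed directly — a sketch that scores the share of events carrying all required properties, against the 95% alert threshold:

```javascript
// Completeness: fraction of events with every required property present.
function completeness(events, required) {
  if (events.length === 0) return 1;
  const ok = events.filter(e =>
    required.every(k => e.properties && e.properties[k] !== undefined)
  ).length;
  return ok / events.length;
}

const score = completeness(
  [
    { properties: { format: "csv", report_id: "r1" } },
    { properties: { report_id: "r2" } }, // missing 'format', like the alert above
  ],
  ["format", "report_id"]
);
// score = 0.5 → well below 95%, would fire the completeness alert
```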
Implementation Best Practices
1. Single Source of Truth
Good:
- Event fired from one location in code
- Wrapper function ensures consistency
Bad:
- Same event fired from multiple places
- Inconsistent property population
2. Server-Side When Possible
Prefer Server-Side:
- Reliable delivery
- Harder to block
- Consistent timestamps
- Secure properties
Client-Side Only:
- UI interactions (clicks, scrolls)
- Performance metrics
- Error states
3. Event Validation
// Schema validation on send
const eventSchema = {
  project_created: {
    required: ['project_id', 'project_name', 'source'],
    properties: {
      project_id: { type: 'string' },
      project_name: { type: 'string', maxLength: 100 },
      source: { type: 'string', enum: ['dashboard', 'import', 'api'] }
    }
  }
};

// Validate before sending; reject events that don't match the schema
function validateAndSend(event, schemas) {
  const schema = schemas[event.event];
  if (!schema) throw new Error(`Unknown event: ${event.event}`);
  for (const key of schema.required) {
    if (event.properties?.[key] === undefined) {
      throw new Error(`${event.event}: missing '${key}'`);
    }
  }
  analytics.track(event.event, event.properties); // hand off to your analytics SDK
}
Event Planning Worksheet
| Question | Answer |
|---|---|
| What behavior are we measuring? | Project creation success |
| What decisions will this inform? | Onboarding optimization |
| What's the trigger moment? | Project saved to database |
| What context is needed? | Source, template, time |
| Who owns implementation? | Engineering team |
| Who owns data quality? | Analytics team |
| When is this reviewed? | Quarterly audit |
Event Governance
Tracking Plan Ownership:
1. Product Manager defines requirements
2. Analyst defines event spec
3. Engineer implements
4. QA validates
5. Analyst monitors quality
6. All: Quarterly review
Change Management:
Adding event:
1. Create spec document
2. Review with stakeholders
3. Implement with tests
4. Validate in staging
5. Deploy and monitor
Changing event:
1. Document change reason
2. Version the event (v2)
3. Parallel track during transition
4. Migrate dashboards
5. Deprecate old version
Anti-Patterns
- Track everything — Noise drowns signal, costs explode
- Track nothing — Flying blind, can't learn
- Client-only — Ad blockers, reliability issues
- No schema — Properties drift, data becomes unusable
- No ownership — Nobody maintains, quality degrades
- Hardcoded strings — Typos break tracking
- Missing context — Events without user/session info
- No documentation — Nobody knows what events mean
AARRR and HEART Frameworks
Impact: CRITICAL
Two proven frameworks for structuring your product metrics. AARRR focuses on lifecycle stages; HEART focuses on user experience dimensions.
AARRR (Pirate Metrics)
Developed by Dave McClure for startup metrics. Tracks the user lifecycle funnel.
| Stage | Core Question | Example Metrics | Typical Owner |
|---|---|---|---|
| Acquisition | Where do users come from? | Signups by channel, CAC, landing page conversion | Growth/Marketing |
| Activation | Do they experience value? | Onboarding completion, time-to-first-value, aha moment | Product/Growth |
| Retention | Do they come back? | D1/D7/D30, MAU/DAU ratio, resurrection rate | Product |
| Revenue | Do they pay? | Conversion rate, ARPU, LTV, expansion rate | Product/Finance |
| Referral | Do they tell others? | Viral coefficient, NPS, referral conversion | Growth/Product |
AARRR Deep Dive: Metrics by Stage
Acquisition Metrics
| Metric | Definition |
|---|---|
| Traffic Sources | Signups by source |
| Channel CAC | Cost per signup by channel |
| Landing Page CVR | Visitor → Signup rate |
| Signup Completion | Started → Completed signup |
| Channel Mix | % of signups by channel |
Activation Metrics
| Metric | Definition |
|---|---|
| Setup Completion Rate | % completing onboarding |
| Time to First Value | Signup → Core action (minutes) |
| Aha Moment Rate | % reaching activation event |
| Day 1 Return Rate | % returning within 24 hours |
| Feature Discovery | % using 2+ features in week 1 |
Retention Metrics
| Metric | Definition |
|---|---|
| D1/D7/D30 Retention | % returning on day N |
| Weekly Retention Curve | % active each week post-signup |
| Stickiness (DAU/MAU) | Daily engagement ratio |
| Churn Rate | % lost per period |
| Resurrection Rate | % of churned who return |
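D-N retention from the table above is simple to compute once you have signup and activity dates per user — a sketch; the input shapes (date strings keyed by user) are illustrative:

```javascript
// D-N retention: of a signup cohort, the share of users active exactly
// N days after signup. Dates are "YYYY-MM-DD" strings in UTC.
function dayNRetention(signupDateByUser, activeDatesByUser, n) {
  const users = Object.keys(signupDateByUser);
  if (users.length === 0) return 0;
  const retained = users.filter(u => {
    const d = new Date(signupDateByUser[u] + "T00:00:00Z");
    d.setUTCDate(d.getUTCDate() + n); // rolls over month/year boundaries
    const target = d.toISOString().slice(0, 10);
    return (activeDatesByUser[u] || []).includes(target);
  }).length;
  return retained / users.length;
}

// D7: user "a" returned on day 7, user "b" did not
dayNRetention(
  { a: "2024-01-01", b: "2024-01-01" },
  { a: ["2024-01-02", "2024-01-08"], b: ["2024-01-02"] },
  7
); // 0.5
```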
Revenue Metrics
| Metric | Definition |
|---|---|
| Free → Paid Conversion | % converting to paid |
| Trial → Paid Conversion | % of trials converting |
| ARPU/ARPA | Average revenue per user/acct |
| LTV | Lifetime value |
| Expansion Revenue % | % revenue from existing |
Referral Metrics
| Metric | Definition |
|---|---|
| NPS | Net Promoter Score |
| Viral Coefficient (K) | Invites × Invite CVR |
| Referral Rate | % users who refer |
| Referral Conversion | Referred visitor → signup rate |
| Organic Word-of-Mouth | Direct/organic signup % |
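The viral coefficient from the table expands to invites per user times invite conversion rate — a sketch with illustrative numbers; K > 1 means each user cohort recruits a larger one, and most products sit well below 1:

```javascript
// K = (invites sent per user) × (invite conversion rate)
function viralCoefficient(users, invitesSent, invitesConverted) {
  if (users === 0 || invitesSent === 0) return 0;
  return (invitesSent / users) * (invitesConverted / invitesSent);
}

// 1,000 users send 3,000 invites; 450 convert → K ≈ 0.45
viralCoefficient(1000, 3000, 450);
```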
HEART Framework (Google)
Developed by Kerry Rodden and colleagues at Google. Focuses on user experience quality.
| Dimension | What It Measures | Signal Types | Example Metrics |
|---|---|---|---|
| Happiness | User satisfaction | Attitudinal | NPS, CSAT, ease-of-use |
| Engagement | Involvement depth | Behavioral | Sessions/week, actions/session |
| Adoption | Uptake of features | Behavioral | New feature users, breadth |
| Retention | Continued usage | Behavioral | Retention rate, churn |
| Task Success | Efficiency | Behavioral | Completion rate, time-on-task |
HEART Goals-Signals-Metrics (GSM)
For each HEART dimension, define:
| Step | Definition | Example (File Sharing App) |
|---|---|---|
| Goal | What experience outcome? | Users successfully share files |
| Signal | What indicates goal met? | Completed file shares |
| Metric | How do we measure signal? | Files shared per active user |
Full HEART GSM Example:
| Dimension | Goal | Signal | Metric |
|---|---|---|---|
| Happiness | Users satisfied with sharing | Survey responses | CSAT score on share flow |
| Engagement | Users share frequently | Share activity | Shares per week per user |
| Adoption | New users start sharing | First share complete | % of new users who share in W1 |
| Retention | Users keep sharing | Repeated share behavior | Week-over-week share retention |
| Task Success | Sharing is easy | Successful shares | Share completion rate, time to share |
When to Use Which Framework
| Use AARRR When | Use HEART When |
|---|---|
| Analyzing full lifecycle | Deep-diving on UX quality |
| Cross-functional alignment | Product team focus |
| Growth modeling | Feature-level analysis |
| Business metrics review | User experience audit |
| Startup/scale phase | Mature product optimization |
Combining AARRR + HEART
The frameworks complement each other:
AARRR Stage HEART Dimensions to Apply
───────────────────────────────────────────────
Acquisition → Task Success (signup flow)
Activation → Task Success, Adoption, Happiness
Retention → Engagement, Retention, Happiness
Revenue → Task Success (purchase flow), Happiness
Referral → Happiness (NPS), Engagement
Good vs Bad Example: AARRR Implementation
Good: Clear, Measurable, Actionable
Company: Analytics SaaS Tool
Acquisition:
- Metric: Signups by channel
- Target: 500 signups/month, <$50 CAC
Activation:
- Metric: First dashboard created within 7 days
- Target: 60% of signups
Retention:
- Metric: Weekly active users (ran 1+ query)
- Target: 45% W4 retention
Revenue:
- Metric: Trial → Paid conversion
- Target: 12% within 30 days
Referral:
- Metric: NPS, organic signup %
- Target: NPS > 50, 30%+ organic
Bad: Vague, Unmeasurable, No Targets
Company: Analytics SaaS Tool
Acquisition: "Get more users"
Activation: "Users like the product"
Retention: "Keep users coming back"
Revenue: "Make money"
Referral: "Users recommend us"
Problems:
- No specific metrics defined
- No targets to measure against
- No way to know if improving
- No team accountability
HEART Implementation Mistakes
| Mistake | Problem | Fix |
|---|---|---|
| Tracking all dimensions everywhere | Overwhelming, unfocused | Pick 2-3 relevant per feature |
| Only happiness | Satisfaction ≠ behavior | Balance attitudinal + behavioral |
| Ignoring task success | Missing friction points | Track completion rates |
| No GSM process | Metrics without purpose | Start with Goals, derive Signals |
Anti-Patterns
- Framework cargo cult — Using AARRR/HEART labels without real metrics
- Measuring everything — Pick what matters, ignore the rest
- No targets — Metrics without goals are just numbers
- Stage skipping — Jumping to revenue before activation works
- Averages only — Segment metrics by cohort, source, behavior
- Snapshot thinking — Track trends, not just current state
Defining Your North Star Metric
Impact: CRITICAL
Your North Star Metric (NSM) is the single metric that best captures the core value your product delivers to customers. Get this right and alignment follows.
What Makes a Good North Star
A North Star Metric must:
- Measure value delivered to customers (not just activity)
- Be a leading indicator of revenue
- Be measurable and movable
- Be understandable across the company
- Connect to your product's core purpose
North Star Selection Framework
| Factor | Question | Good Sign |
|---|---|---|
| Value | Does this reflect customer value? | Users would agree it matters |
| Growth | Does improving this grow the business? | Correlates with revenue |
| Focus | Does this guide prioritization? | Teams can rally around it |
| Measurable | Can we track this reliably? | Daily/weekly updates possible |
| Controllable | Can our product actions influence this? | Clear input metrics exist |
Examples by Business Model
| Business Type | North Star Metric | Why It Works |
|---|---|---|
| Slack | Daily Active Users sending messages | Measures communication value |
| Airbnb | Nights booked | Core value exchange |
| Spotify | Time spent listening | Engagement = value delivered |
| Shopify | GMV (Gross Merchandise Value) | Merchant success = Shopify success |
| Notion | Weekly active teams | Collaboration value |
| Zoom | Weekly meeting minutes | Usage = value |
| HubSpot | Weekly active teams using 2+ hubs | Platform adoption = stickiness |
North Star + Input Metrics
Your NSM doesn't exist alone. Build a metrics tree:
              ┌─────────────────────────┐
              │    NORTH STAR METRIC    │
              │  (Weekly Active Teams)  │
              └────────────┬────────────┘
                           │
        ┌──────────────────┼──────────────────┐
        ▼                  ▼                  ▼
┌───────────────┐  ┌───────────────┐  ┌───────────────┐
│ Input: New    │  │ Input: Team   │  │ Input: Team   │
│ Team Signups  │  │ Activation    │  │ Retention     │
└───────┬───────┘  └───────┬───────┘  └───────┬───────┘
        │                  │                  │
┌───────┴───────┐  ┌───────┴───────┐  ┌───────┴───────┐
│ • Traffic     │  │ • Onboarding  │  │ • Weekly      │
│ • Signup rate │  │   completion  │  │   engagement  │
│ • First team  │  │ • First value │  │ • Feature     │
│   created     │  │   moment      │  │   depth       │
└───────────────┘  └───────────────┘  └───────────────┘
Good vs Bad Examples
Good: Value-Focused
Product: Project Management Tool
North Star: Weekly Active Projects with 3+ collaborators
Why good:
- Measures collaboration value (core purpose)
- "3+ collaborators" ensures real team usage
- Weekly timeframe balances signal and noise
- Teams can influence via activation, features, retention
Bad: Vanity Metric
Product: Project Management Tool
"North Star": Total Registered Users
Why bad:
- Doesn't measure ongoing value
- Includes churned users
- Doesn't differentiate engaged vs dormant
- Can't prioritize based on this
Good: Leading Indicator
Product: E-commerce Platform
North Star: Weekly purchasing customers
Why good:
- Measures actual value exchange
- Leading indicator of GMV
- Combines acquisition + retention
- Clear optimization paths
Bad: Lagging Only
Product: E-commerce Platform
"North Star": Annual Revenue
Why bad:
- Too slow to react to
- Teams can't see weekly impact
- Doesn't guide product decisions
- Finance metric, not product metric
North Star for Different Stages
| Stage | North Star Focus | Example |
|---|---|---|
| Pre-PMF | Engagement intensity | Daily active rate of first 100 users |
| Early Growth | Activation + retention | Weekly users completing core action |
| Scaling | Sustainable growth | Monthly active + expansion revenue |
| Mature | Efficiency + expansion | Revenue per user, platform usage |
Validating Your North Star
Before committing, test these:
- Correlation test: Does improving this correlate with revenue growth historically?
- Prioritization test: Would this help you decide between two features?
- Communication test: Can you explain it to anyone in 30 seconds?
- Counter-metric test: What could go wrong if you only optimize for this?
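The correlation test can be run on historical weeklies — a sketch computing Pearson r between NSM values and revenue (numbers are illustrative); a strong positive r supports the metric but does not prove causation, which is why the counter-metric test matters:

```javascript
// Pearson correlation coefficient between two equal-length series.
function pearson(xs, ys) {
  const n = xs.length;
  const mean = arr => arr.reduce((s, v) => s + v, 0) / n;
  const mx = mean(xs);
  const my = mean(ys);
  let num = 0, dx = 0, dy = 0;
  for (let i = 0; i < n; i++) {
    num += (xs[i] - mx) * (ys[i] - my);
    dx += (xs[i] - mx) ** 2;
    dy += (ys[i] - my) ** 2;
  }
  return num / Math.sqrt(dx * dy);
}

// Weekly active teams vs weekly revenue ($k) — r close to 1 here
pearson([120, 135, 150, 160], [10.1, 11.0, 12.4, 13.0]);
```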
Common North Star Mistakes
| Mistake | Problem | Better Alternative |
|---|---|---|
| Revenue | Lagging, not product-focused | Weekly paying customers |
| Total users | Ignores churn, quality | Weekly/Monthly active users |
| Sessions | Activity ≠ value | Sessions with core action |
| NPS | Infrequent, hard to move | Usage-based proxy + NPS |
| Multiple NSMs | Defeats the purpose | Pick one, track others as KPIs |
Anti-Patterns
- Too many North Stars — If everything is a priority, nothing is
- Pure vanity — "Total downloads" tells you nothing about value
- Unchangeable — If teams can't move it, it's not useful
- Revenue-only — Finance metric, not product metric
- Ignoring quality — "Messages sent" without "messages read"
- Copying competitors — Your NSM should reflect YOUR value prop
User Behavior Analytics
Impact: HIGH
Understanding how users actually behave — not how you think they behave — is the foundation of product insight. Behavior data reveals intent, friction, and opportunity.
Types of Behavioral Data
| Type | What It Captures | Tools |
|---|---|---|
| Event Data | Actions (clicks, creates, purchases) | Amplitude, Mixpanel, Segment |
| Session Data | Sequences and flows | Fullstory, Hotjar |
| Engagement | Depth and frequency | Product analytics |
| Journey | Cross-session paths | Journey mapping tools |
| Heatmaps | Where users look/click | Hotjar, Crazy Egg |
| Recordings | Actual user sessions | Fullstory, LogRocket |
Behavioral Segmentation
Segmentation Dimensions:
| Dimension | Examples | Use Case |
|---|---|---|
| Usage Frequency | Daily, weekly, monthly | Identify power users |
| Feature Adoption | Features used, depth | Product development |
| Lifecycle Stage | New, activated, retained | Targeted engagement |
| Value Tier | Free, paid, enterprise | Monetization |
| Behavior Pattern | Creator, consumer, lurker | UX optimization |
Creating Behavioral Segments:
Segment: Power Users
Definition:
- Active 5+ days per week
- Use 3+ core features
- Generate content (not just consume)
- Active for 30+ days
Characteristics:
| Metric | Power Users | Regular Users |
|---|---|---|
| % of user base | 8% | 92% |
| Sessions/week | 12.4 | 2.1 |
| Features used | 6.2 | 2.3 |
| Content created/week | 15.0 | 1.8 |
| Retention (D90) | 85% | 32% |
| LTV | $420 | $85 |
Insight: Power users are 5x more valuable
Question: How do we create more power users?
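The segment definition above translates directly into a predicate — a sketch; the field names on `u` are illustrative, not from any particular analytics tool:

```javascript
// Power-user segment membership per the definition above.
function isPowerUser(u) {
  return (
    u.activeDaysPerWeek >= 5 &&   // Active 5+ days per week
    u.coreFeaturesUsed >= 3 &&    // Use 3+ core features
    u.itemsCreatedPerWeek > 0 &&  // Generate content, not just consume
    u.accountAgeDays >= 30        // Active for 30+ days
  );
}

isPowerUser({ activeDaysPerWeek: 6, coreFeaturesUsed: 4, itemsCreatedPerWeek: 15, accountAgeDays: 90 }); // true
```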
Behavioral Pattern Analysis
1. Path Analysis
Most Common Paths to First Purchase:
Path 1 (35% of purchases):
Homepage → Product Page → Add to Cart → Purchase
Path 2 (25% of purchases):
Search → Product Page → Wishlist → Return → Purchase
Path 3 (18% of purchases):
Category → Filter → Product → Compare → Purchase
Path 4 (12% of purchases):
Email Link → Product Page → Purchase
Insight: Direct purchase path (Path 1) converts fastest
Opportunity: Reduce steps for Path 2/3 users
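Path reports like the one above come from counting each user's ordered event sequence up to the conversion event — a sketch (session data is illustrative; event names follow the commerce pattern earlier in this document):

```javascript
// Count purchasing sessions' paths and return the most common ones.
function topPaths(sessions, topN = 3) {
  const counts = new Map();
  for (const events of sessions) {
    const end = events.indexOf("purchase_completed");
    if (end === -1) continue; // non-purchasing sessions are ignored here
    const path = events.slice(0, end + 1).join(" → ");
    counts.set(path, (counts.get(path) || 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]).slice(0, topN);
}

const paths = topPaths([
  ["homepage", "product_viewed", "product_added_to_cart", "purchase_completed"],
  ["homepage", "product_viewed", "product_added_to_cart", "purchase_completed"],
  ["search_performed", "product_viewed", "purchase_completed"],
  ["homepage", "product_viewed"], // abandoned — not counted
]);
// top entry: the homepage path, seen twice
```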
2. Sequence Analysis
What do retained users do in Week 1?
Retained (D30 > 0) vs Churned users:
| W1 Action | Retained | Churned | Difference |
|---|---|---|---|
| Completed tutorial | 78% | 35% | +43pp |
| Created first item | 85% | 42% | +43pp |
| Invited team member | 52% | 12% | +40pp |
| Used mobile app | 65% | 38% | +27pp |
| Set up integration | 42% | 18% | +24pp |
Activation checklist priority:
1. Completed tutorial (+43pp gap)
2. Created first item (+43pp gap)
3. Invited team member (+40pp gap)
3. Frequency Analysis
User Activity Distribution:
| Sessions/Week | % Users | Retention | LTV |
|---|---|---|---|
| 0 (dormant) | 25% | N/A | $0 |
| 1-2 (light) | 40% | 22% | $45 |
| 3-5 (regular) | 25% | 55% | $120 |
| 6+ (heavy) | 10% | 82% | $340 |
Target: Move light users to regular
Behavioral Triggers
Identifying Critical Moments:
| Trigger Type | Example | Response |
|---|---|---|
| First-time | First login, first action | Guided onboarding |
| Achievement | Milestone reached | Celebration + next step |
| Inactivity | No action in X days | Re-engagement email |
| Risk signal | Decrease in usage | Proactive outreach |
| Opportunity | Feature almost used | Contextual nudge |
Behavioral Trigger Implementation:
Trigger: Inactivity Warning
Condition:
- User was active (3+ sessions/week)
- Now inactive (0 sessions in 7 days)
- Has valuable content in account
Action:
- Send "We miss you" email at day 7
- Show "what's new" at next login
- Log re-engagement_triggered event
Measurement:
- Re-engagement rate: 18%
- Resurrection rate: 12%
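The trigger condition above reduces to a predicate evaluated on each user — a sketch with the example's thresholds (3+ sessions/week historically, 0 sessions for 7 days, content worth returning to); field names are illustrative:

```javascript
// Should the "We miss you" re-engagement email fire for this user?
function shouldTriggerReengagement(user) {
  return (
    user.avgSessionsPerWeek >= 3 &&   // was active
    user.daysSinceLastSession >= 7 && // now inactive
    user.savedItemCount > 0           // has valuable content in account
  );
}

shouldTriggerReengagement({ avgSessionsPerWeek: 4, daysSinceLastSession: 9, savedItemCount: 12 }); // true
```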
Session Analysis
Session Metrics:
| Metric | Definition | Target |
|---|---|---|
| Session Duration | Time from start to end | Depends on product |
| Actions/Session | Events in session | 8-15 for engaged |
| Pages/Session | Screens visited | 4-8 typical |
| Session Depth | Features touched | 2+ for value |
| Bounce Rate | Single-page sessions | <30% for core flows |
Session Recording Analysis:
Session Review Protocol:
1. Watch 10-20 sessions per week
2. Focus on:
- Drop-off points (where users quit)
- Confusion signals (rage clicks, back navigation)
- Success patterns (smooth completion)
- Unexpected behaviors (workarounds)
Document:
- Friction point and timestamp
- User segment
- Frequency (1 user or pattern?)
- Potential fix
- Priority (impact × frequency)
Good vs Bad Behavior Analysis
Good: Segmented, Actionable
Analysis: Feature X Usage Patterns
Approach:
1. Segmented by user type (new vs returning)
2. Tracked full usage funnel
3. Compared high vs low engagers
4. Correlated with outcomes
Findings:
| Behavior | High Retain | Low Retain |
|---|---|---|
| Used Feature X in W1 | 72% | 28% |
| Used Feature X 3+ times | 65% | 12% |
| Combined with Feature Y | 58% | 8% |
Insight: Feature X + Y combination predicts retention
Action: Prompt Feature Y users to try Feature X
Result: +15% feature adoption, +8% D30 retention
Bad: Aggregate, No Action
Analysis: User Behavior Report
Findings:
- Average session duration: 4.2 minutes
- Average actions per session: 6.8
- Most clicked: Dashboard
Problems:
- No segmentation
- No comparison to outcome
- Averages hide patterns
- No actionable insight
- "Most clicked" isn't insight
Predictive Behavior Signals
Churn Prediction Signals:
High-Risk Indicators:
| Signal | Weight | Threshold |
|---|---|---|
| Session frequency drop | 0.35 | -50% vs avg |
| Feature usage decline | 0.25 | -30% features |
| Support tickets up | 0.20 | 2+ in 30 days |
| Login failures | 0.10 | 3+ in 7 days |
| Billing issues | 0.10 | Failed payment |
Risk Score = Σ(signal × weight)
High risk: Score > 0.6
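The Risk Score = Σ(signal × weight) formula above, sketched with each signal as a boolean that is true when its threshold is breached (the JS signal names are just illustrative spellings of the table rows):

```javascript
// Weights from the churn-signal table.
const CHURN_WEIGHTS = {
  sessionFrequencyDrop: 0.35, // -50% vs avg
  featureUsageDecline: 0.25,  // -30% features
  supportTicketsUp: 0.2,      // 2+ in 30 days
  loginFailures: 0.1,         // 3+ in 7 days
  billingIssues: 0.1,         // failed payment
};

// Sum the weights of every breached signal.
function churnRiskScore(signals) {
  return Object.entries(CHURN_WEIGHTS).reduce(
    (score, [name, weight]) => score + (signals[name] ? weight : 0),
    0
  );
}

churnRiskScore({ sessionFrequencyDrop: true, featureUsageDecline: true, supportTicketsUp: true });
// ≈ 0.80 → above the 0.6 high-risk threshold
```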
Upgrade Prediction Signals:
Ready-to-Upgrade Indicators:
| Signal | Indicator |
|---|---|
| Limit approaching | 80% of plan limit |
| Premium feature attempts | 3+ clicks on locked feature |
| Team growth | Inviting new members |
| Usage acceleration | +50% activity vs last month |
| Pricing page visits | 2+ views in 30 days |
Behavioral Analytics Tools
| Tool | Best For | Key Features |
|---|---|---|
| Amplitude | Product analytics | Funnels, cohorts, paths |
| Mixpanel | Event analytics | Real-time, segmentation |
| Fullstory | Session replay | Recordings, heatmaps |
| Heap | Auto-capture | Retroactive analysis |
| Pendo | In-app analytics | Usage tracking, guides |
| PostHog | Open source | Full stack, self-hosted |
Anti-Patterns
- Aggregate obsession — Averages hide the story
- Correlation = causation — Correlated behavior isn't proven cause; validate with experiments
- Recency bias — This week's pattern ≠ trend
- Confirmation bias — Finding data to support decisions
- Privacy violation — Tracking too much, too granularly
- Analysis without action — Insights that don't change anything
- Segment pollution — Too many segments, no clear personas
- Ignoring context — Behavior without the "why"