Module 8
10 min read

Metric Selection

Choose the right metrics for experiments

What You'll Learn

  • Choosing primary metrics
  • Metric characteristics
  • Leading vs lagging metrics
  • Sensitivity and robustness
  • Common metric pitfalls

Characteristics of Good Metrics

1. Aligned with business goals: measures what you actually care about

2. Sensitive: detects meaningful changes

3. Robust: not easily gamed or manipulated

4. Interpretable: stakeholders can understand it

5. Timely: measurable in a reasonable timeframe

Primary Metric Selection

Choose ONE primary metric: the decision criterion for the experiment

It must be:

  • Directly related to the goal
  • Measurable for all users
  • Available during the test period

Examples:

E-commerce:

  • Conversion rate
  • Revenue per visitor
  • Average order value

Social media:

  • Daily active users
  • Time spent
  • Engagement rate

SaaS:

  • Sign-up rate
  • Activation rate
  • Retention rate

Click-Through Rate (CTR)

Definition: Clicks / Impressions

Pros:

  • Easy to measure
  • Quick feedback
  • High sensitivity

Cons:

  • Doesn't measure outcomes
  • Can be gamed (clickbait)
  • May not correlate with business value

When to use: Early funnel optimization

When NOT to use: When you care about final conversions
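To make this concrete, here's a minimal sketch of comparing CTR between control and treatment with a two-proportion z-test. The counts are illustrative only, and it assumes statsmodels is installed:

```python
# Compare CTR between control and treatment with a two-proportion z-test.
# Illustrative numbers only; requires statsmodels.
from statsmodels.stats.proportion import proportions_ztest

clicks = [520, 580]           # clicks in control, treatment
impressions = [10000, 10000]  # impressions in control, treatment

z_stat, p_value = proportions_ztest(count=clicks, nobs=impressions)
ctr_control, ctr_treatment = (c / n for c, n in zip(clicks, impressions))

print(f"CTR control:   {ctr_control:.2%}")
print(f"CTR treatment: {ctr_treatment:.2%}")
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
```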

Conversion Rate

Definition: Conversions / Visitors

Pros:

  • Directly measures outcomes
  • Clear business value
  • Interpretable

Cons:

  • May need large sample
  • Slower than CTR
  • Binary (doesn't capture magnitude)

Variations:

  • Purchase conversion
  • Sign-up conversion
  • Free-to-paid conversion
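Because conversions are rarer than clicks, the required sample grows fast. A rough sketch of the sample-size calculation with statsmodels, assuming a hypothetical 5% baseline and a 10% relative lift:

```python
# Rough sample size needed per variant to detect a lift in conversion rate.
# Hypothetical numbers: 5% baseline, 5.5% target (a 10% relative lift).
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

baseline, target = 0.05, 0.055
effect = proportion_effectsize(target, baseline)  # Cohen's h

n_per_variant = NormalIndPower().solve_power(
    effect_size=effect, alpha=0.05, power=0.8, alternative="two-sided"
)
print(f"~{n_per_variant:,.0f} visitors per variant")
```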

Revenue Metrics

Revenue per visitor (RPV): Total revenue / Visitors

Pros:

  • Captures magnitude
  • Direct business impact
  • Comprehensive

Cons:

  • High variance (a few large purchases dominate)
  • Needs large samples
  • Sensitive to outliers

Average order value (AOV): Revenue / Orders

Pros:

  • Easier to move than volume
  • Lower variance than RPV

Cons:

  • Ignores conversion rate
  • Can rise even while total orders (and revenue) fall
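A toy sketch of both calculations, showing how a single large order skews them (the numbers are made up):

```python
import numpy as np

# Toy data: revenue per order; most orders are small, one is huge.
orders = np.array([20, 25, 30, 22, 28, 24, 2000.0])
visitors = 1000  # total visitors, most of whom never ordered

rpv = orders.sum() / visitors  # revenue per visitor
aov = orders.mean()            # average order value

print(f"RPV: {rpv:.2f}  AOV: {aov:.2f}")

# Drop the outlier and both metrics shift dramatically:
trimmed = orders[orders < 1000]
print(f"RPV (no outlier): {trimmed.sum() / visitors:.2f}  "
      f"AOV (no outlier): {trimmed.mean():.2f}")
```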

Composite Metrics

Combine multiple signals:

Example: an Overall Evaluation Criterion (OEC), a weighted combination:

OEC = 0.5 × Revenue + 0.3 × Engagement + 0.2 × Retention

(Each component must first be normalized to a comparable scale; weights on raw units would be meaningless.)

Pros:

  • Holistic view
  • Balance trade-offs

Cons:

  • Complex to explain
  • Weights are subjective
  • Harder to debug issues
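A minimal sketch of the OEC above, assuming each component has already been normalized to a 0-1 scale; the scores are hypothetical:

```python
# Weighted OEC, assuming each component is pre-normalized to [0, 1]
# so the weights reflect importance rather than units.
def oec(revenue: float, engagement: float, retention: float) -> float:
    return 0.5 * revenue + 0.3 * engagement + 0.2 * retention

# Hypothetical normalized scores:
print(oec(revenue=0.62, engagement=0.55, retention=0.70))  # control
print(oec(revenue=0.65, engagement=0.50, retention=0.71))  # treatment
```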

Leading vs Lagging Metrics

Leading metrics: Early indicators, quick feedback

  • Click rate
  • Add-to-cart
  • Email opens

Lagging metrics: Final outcomes, slower

  • Revenue
  • Retention
  • Lifetime value

Strategy:

  • Optimize leading metrics for speed
  • Validate with lagging metrics
  • Ensure correlation between them!

Example:

  • Leading: email click rate
  • Lagging: conversions from email

Only optimize clicks if they convert!
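A quick way to sanity-check that link, sketched with made-up weekly aggregates:

```python
# Check that a leading metric tracks the lagging one before optimizing it.
# Made-up weekly aggregates: email click rate vs conversions from email.
import numpy as np

click_rate = np.array([0.042, 0.038, 0.051, 0.047, 0.060, 0.055])
conversions = np.array([120, 105, 150, 138, 170, 161])

r = np.corrcoef(click_rate, conversions)[0, 1]
print(f"Pearson r = {r:.2f}")  # weak correlation? don't optimize clicks
```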

Sensitivity and Variance

Sensitivity: Ability to detect small changes

High sensitivity:

  • Frequent events
  • Low variance
  • Large sample

Low sensitivity:

  • Rare events
  • High variance
  • Need huge sample

Example comparison:

Metric A: Page views per user

  • Mean: 5.2, SD: 2.1 (SD/mean ≈ 0.4)
  • Sensitive to changes

Metric B: Purchases per user

  • Mean: 0.05, SD: 0.22 (SD/mean ≈ 4.4)
  • Much less sensitive (rare event): detecting the same relative change needs roughly (4.4/0.4)² ≈ 120× the sample
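Plugging those numbers into the standard normal-approximation sample-size formula shows the gap. A sketch assuming a 5% relative lift, α = 0.05, and 80% power:

```python
# Per-group sample size to detect a 5% relative change in each metric,
# using the normal approximation n = 2 * ((z_a + z_b) * sd / delta)^2.
from scipy.stats import norm

def n_per_group(mean, sd, rel_mde=0.05, alpha=0.05, power=0.80):
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    delta = rel_mde * mean  # absolute change we want to detect
    return 2 * (z * sd / delta) ** 2

print(f"Page views/user: {n_per_group(5.2, 2.1):,.0f} users per group")
print(f"Purchases/user:  {n_per_group(0.05, 0.22):,.0f} users per group")
```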

Count vs Rate Metrics

Count metrics: Total clicks, revenue, users

  • Affected by exposure
  • Need to normalize

Rate metrics: CTR, conversion rate, RPV

  • Already normalized
  • Better for comparison

Example: ❌ "Treatment had 1000 more clicks" (What if it had more traffic?)

✓ "Treatment had 2% higher CTR" (Normalized)

User-Level vs Session-Level

User-level: Metrics per user

  • Better for retention
  • Removes session count effect

Session-level: Metrics per session

  • Better for engagement
  • Captures revisit behavior

Example:

User-level: 60% of users converted
Session-level: 40% of sessions converted
(Same users, multiple sessions: a converting user often also has sessions with no conversion, so the session-level rate is lower.)

Choose based on goal!
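A small sketch with toy session logs showing how the two levels can diverge:

```python
import pandas as pd

# Toy session log: one row per session.
df = pd.DataFrame({
    "user_id":    [1, 1, 1, 2, 2, 3],
    "session_id": ["a", "b", "c", "d", "e", "f"],
    "converted":  [0, 0, 1, 1, 0, 1],
})

# Session-level: share of sessions with a conversion.
session_rate = df["converted"].mean()

# User-level: share of users who converted in ANY session.
user_rate = df.groupby("user_id")["converted"].max().mean()

print(f"Session-level: {session_rate:.0%}")  # 50% of sessions
print(f"User-level:    {user_rate:.0%}")     # 100% of users
```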

Ratio Metrics

Form: Numerator / Denominator

Examples:

  • CTR = Clicks / Impressions
  • Conversion = Orders / Visitors
  • Engagement = Time / Sessions

Challenge: both the numerator and the denominator are random, so treating the ratio like a simple mean understates its variance.

Delta method: Approximate variance for statistical tests

Alternative: Bootstrap confidence intervals
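Here's a sketch of the delta-method standard error for a user-level ratio metric. `ratio_se` is a hypothetical helper and the data are simulated:

```python
# Delta-method standard error for a ratio metric R = mean(x) / mean(y),
# e.g. CTR computed from per-user clicks and impressions.
import numpy as np

def ratio_se(x, y):
    """x: per-user numerator (e.g. clicks); y: per-user denominator."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    r = x.mean() / y.mean()
    var = (x.var(ddof=1) + r**2 * y.var(ddof=1)
           - 2 * r * np.cov(x, y, ddof=1)[0, 1]) / (n * y.mean()**2)
    return r, np.sqrt(var)

rng = np.random.default_rng(0)
impressions = rng.poisson(20, size=500)   # per-user impressions
clicks = rng.binomial(impressions, 0.05)  # per-user clicks

r, se = ratio_se(clicks, impressions)
print(f"CTR = {r:.4f} ± {1.96 * se:.4f} (95% CI half-width)")
```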

Guardrail Metrics

Purpose: Ensure you don't break things

Examples:

Performance:

  • Page load time
  • Error rate
  • Crash rate

User experience:

  • Bounce rate
  • Time to first action
  • Help tickets

Revenue:

  • Revenue per user shouldn't tank

Set thresholds: "Stop test if page load > 3 seconds"
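A tiny sketch of automating those thresholds; the metric names and limits here are hypothetical:

```python
# Hypothetical guardrail thresholds; stop or investigate if any is breached.
GUARDRAILS = {
    "p95_page_load_s":  ("max", 3.0),
    "error_rate":       ("max", 0.01),
    "revenue_per_user": ("min", 1.50),
}

def check_guardrails(measured: dict) -> list[str]:
    violations = []
    for name, (kind, limit) in GUARDRAILS.items():
        value = measured[name]
        if (kind == "max" and value > limit) or (kind == "min" and value < limit):
            violations.append(f"{name} = {value} breaches {kind} limit {limit}")
    return violations

print(check_guardrails(
    {"p95_page_load_s": 3.4, "error_rate": 0.004, "revenue_per_user": 1.62}
))
```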

Metric Tradeoffs

Common tensions:

Short-term vs long-term: Clicks (short) vs retention (long)

Engagement vs monetization: Time spent vs revenue

Growth vs experience: New users vs user satisfaction

Strategy:

  • Define acceptable tradeoffs
  • Monitor secondary metrics
  • Long-term tracking

Novelty and Primacy Effects

Novelty effect: users react to the change itself → metrics improve at first, then regress

Primacy effect: existing users prefer the familiar version → metrics look worse at first, then improve

Solution:

  • Run longer tests (2-4 weeks)
  • Analyze new vs returning separately
  • Focus on new users for true effect

Statistical Properties

Good metric properties:

1. Low variance: more stable, easier to detect changes

2. Approximately normal distribution: standard tests work well

3. No ceiling/floor effects: room to improve

Example:

Metric A: 50% engagement (can move up or down)
Metric B: 95% engagement (ceiling effect!)

Metric A is better for testing

Metric Debugging

When results seem weird:

1. Check the metric definition: is it calculated correctly?

2. Check data quality: is logging working?

3. Check the sample ratio: is the split actually 50/50? (See the SRM check after this list.)

4. Check segments: does the effect appear in all groups?

5. Check time patterns: is it consistent across days?
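For check 3, a minimal sample ratio mismatch (SRM) sketch using a chi-square goodness-of-fit test; the counts and the p-value threshold are illustrative:

```python
# Sample ratio mismatch (SRM) check: does the observed split match 50/50?
from scipy.stats import chisquare

observed = [50_912, 49_088]         # users in control, treatment
expected = [sum(observed) / 2] * 2  # intended 50/50 split

stat, p = chisquare(f_obs=observed, f_exp=expected)
if p < 0.001:  # a common, conservative SRM threshold
    print(f"Possible SRM (p = {p:.2e}): investigate before trusting results")
else:
    print(f"Split looks fine (p = {p:.3f})")
```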

Industry-Specific Metrics

Marketplace:

  • Gross merchandise value (GMV)
  • Take rate
  • Liquidity (supply/demand balance)

Content:

  • Time on site
  • Pages per session
  • Return rate

Freemium:

  • Free-to-paid conversion
  • Feature adoption
  • Upgrade rate

Gaming:

  • Day 1/7/30 retention
  • ARPU (average revenue per user)
  • Session length

Proxy Metrics

When long-term metrics take too long:

Ultimate metric: 1-year retention
Proxy: week-1 engagement

Requirements:

  • Correlation with ultimate metric
  • Measurable quickly
  • Validate relationship regularly

Example: Videos watched (proxy) → Retention (ultimate)

Practice Exercise

Scenario: Testing new product recommendation algorithm

Questions:

  1. What could be primary metrics?
  2. What are guardrail metrics?
  3. Leading vs lagging considerations?
  4. How long to run test?

Possible answers:

  1. Primary: CTR on recommendations, Add-to-cart rate, Revenue per user
  2. Guardrails: Overall conversion, customer satisfaction, page load time
  3. Leading: Click-through, Lagging: Purchases
  4. At least 2 weeks to capture repeat behavior

Next Steps

Learn about Avoiding Peeking!

Tip: The right metric makes or breaks your experiment!
