A/B Testing Glossary
A friendly A/B testing glossary with simple definitions for common experimentation terms.
- A/B Test
A controlled experiment comparing two or more versions of something (like a web page or feature) to see which performs better.
- Absolute Uplift
The raw difference in conversion rates (e.g., 12% vs. 10% = +2%).
- AOV (Average Order Value)
The average amount of money spent per order.
- ARPV (Average Revenue per Visitor)
The average revenue earned per visitor to your site.
- Bayesian Analysis
An approach that estimates the probability one variant is better than another — often easier to interpret than traditional statistics.
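As a sketch of the Bayesian idea, assuming a simple Beta-Binomial model and made-up conversion counts (not from any real test), you can estimate the probability that a variant beats control with nothing but the standard library:

```python
import random

random.seed(42)

# Hypothetical counts (illustrative only): conversions / visitors per variant.
control_conv, control_n = 100, 1000
variant_conv, variant_n = 120, 1000

# With a flat Beta(1, 1) prior, the posterior for each conversion rate is
# Beta(conversions + 1, non-conversions + 1).
def posterior_sample(conv, n):
    return random.betavariate(conv + 1, n - conv + 1)

# Monte Carlo estimate of P(variant rate > control rate).
draws = 20_000
wins = sum(
    posterior_sample(variant_conv, variant_n) > posterior_sample(control_conv, control_n)
    for _ in range(draws)
)
print(f"Probability variant beats control: {wins / draws:.1%}")
```

A statement like "there's a 92% chance B beats A" is what makes Bayesian results easier to explain than p-values.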
- Breakdown Analysis
A closer look at performance by segment to uncover patterns or outliers.
- Confidence Level
How much certainty you require before trusting a result. A 95% confidence level means that if there were no real difference, a result this strong would show up by chance only about 5% of the time.
- Control
The original version used as a baseline to measure improvement.
- Conversion
When a user completes your desired action, like making a purchase or signing up.
- Conversion Rate
The percentage of users who convert out of all who saw the variant.
- Credible Interval
In Bayesian analysis, the range that contains the true value with a given probability (for example, 95%), based on your data.
- CUPED (Controlled-experiment Using Pre-Experiment Data)
A technique that reduces noise by accounting for pre-experiment behavior, making tests more sensitive.
- Experiment Dashboard
A visual overview showing each variant’s performance, probability, and uplift.
- Experiment Drift
When external changes (like seasonality or ad campaigns) affect your test results.
- Experiment Duration
How long your test runs before results can be trusted.
- Experiment Queue
A prioritized list of upcoming tests, usually based on impact vs. effort.
- False Negative
When a real effect is missed because of too little data or a flawed setup (a Type II error).
- False Positive
When a test shows a winner by chance, not because the change really worked (a Type I error).
- Frequentist Analysis
A traditional statistical approach that uses p-values and significance levels to judge whether an observed difference is likely real or just chance.
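A hedged sketch of the frequentist calculation, assuming the common normal approximation for a two-proportion z-test and hypothetical conversion counts:

```python
import math

def two_proportion_pvalue(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a two-proportion z-test (normal approximation)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Hypothetical counts: 100/1000 conversions vs. 130/1000.
p = two_proportion_pvalue(100, 1000, 130, 1000)
print(f"p-value: {p:.4f}")
```

For these numbers the p-value lands below the usual 0.05 threshold, so a frequentist analysis would call the difference statistically significant.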
- Goal
The metric you want to improve in your experiment — for instance, “Sign-ups per visitor.”
- Guardrail Metric
A metric you track to ensure the experiment doesn’t harm key KPIs (like revenue or uptime).
- Holdout Group
A control group excluded from personalization or new features to measure long-term impact.
- Hypothesis
Your prediction of what change will have a positive impact — and why.
- Interaction Effect
When two changes interact and produce unexpected combined results.
- Learning Velocity
How quickly your team runs and learns from experiments — a measure of experimentation culture.
- Lift
Another word for improvement or gain over control.
- Losing Variant
A variant that performs worse than control.
- Minimum Detectable Effect (MDE)
The smallest difference between variants that your test is designed to reliably detect. Smaller effects require larger sample sizes.
- Multi-armed Bandit
A smarter testing method that automatically sends more traffic to better-performing variants over time.
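One popular bandit strategy is Thompson sampling: draw from each variant's Beta posterior and show the visitor whichever variant drew the highest value. This is an illustrative simulation with made-up conversion rates, not a production implementation:

```python
import random

random.seed(1)

# Hypothetical true conversion rates, unknown to the algorithm and used
# only to simulate visitor behavior.
true_rates = {"A": 0.10, "B": 0.13}
conversions = {v: 0 for v in true_rates}
failures = {v: 0 for v in true_rates}
pulls = {v: 0 for v in true_rates}

for _ in range(10_000):
    # Thompson sampling: pick the variant with the highest posterior draw.
    choice = max(
        true_rates,
        key=lambda v: random.betavariate(conversions[v] + 1, failures[v] + 1),
    )
    pulls[choice] += 1
    if random.random() < true_rates[choice]:
        conversions[choice] += 1
    else:
        failures[choice] += 1

print(pulls)  # most traffic drifts to the better variant, "B"
```

Unlike a fixed 50/50 split, the bandit shifts traffic toward the winner while the test is still running, reducing the cost of showing visitors the weaker variant.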
- Neutral Result
When no variant clearly outperforms the others.
- North Star Metric
The ultimate goal your experiments support — the metric that reflects long-term success.
- Posterior Distribution
In Bayesian testing, this shows the full probability range of possible outcomes.
- Power
The probability your test correctly detects a real difference, commonly targeted at 80%.
- Primary Goal
The main metric you use to decide the winner of an experiment.
- Probability to Beat Control (PBC)
The likelihood that a variant performs better than the control.
- Relative Uplift
The percentage increase compared to control (e.g., 12% vs. 10% = +20%).
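Using the example numbers above (12% vs. 10%), absolute and relative uplift work out like this:

```python
# Illustrative rates from the glossary examples: control 10%, variant 12%.
control_rate = 0.10
variant_rate = 0.12

absolute_uplift = variant_rate - control_rate                   # percentage-point gap
relative_uplift = (variant_rate - control_rate) / control_rate  # gain relative to control

print(f"Absolute uplift: {absolute_uplift:+.1%}")  # +2.0%
print(f"Relative uplift: {relative_uplift:+.1%}")  # +20.0%
```

The distinction matters because a "+20% uplift" headline can describe a change of just two percentage points.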
- Revenue per Session
The total revenue divided by total sessions — a common goal metric in e-commerce.
- Sample Size
The number of visitors or conversions you need for reliable results.
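A rough sample-size estimate for a two-proportion test, using a common normal-approximation shortcut; the baseline rate and MDE below are hypothetical inputs:

```python
import math

def sample_size_per_variant(baseline_rate, mde_abs):
    """Approximate visitors needed per variant, via the shortcut
    n ~ 2 * (z_alpha/2 + z_power)^2 * p(1-p) / delta^2,
    fixed at alpha = 0.05 (two-sided) and 80% power."""
    z_alpha = 1.96   # two-sided, alpha = 0.05
    z_power = 0.84   # power = 0.80
    p = baseline_rate
    return math.ceil(2 * (z_alpha + z_power) ** 2 * p * (1 - p) / mde_abs ** 2)

# Hypothetical inputs: 10% baseline rate, 2-percentage-point absolute MDE.
print(sample_size_per_variant(0.10, 0.02))  # roughly 3,500 per variant
```

Note how sensitive the result is to the MDE: halving the detectable effect roughly quadruples the required sample.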
- Secondary Goal
Additional metrics you track to understand side effects or extra insights.
- Segment
A subgroup of your audience, such as device type, location, or traffic source.
- Statistical Significance
A result is statistically significant when it is unlikely to be due to chance alone, typically judged by a p-value below a chosen threshold such as 0.05.
- Stopping Rule
A rule for when to end a test — e.g., after two weeks or when confidence exceeds 95%.
- Traffic Split
How your visitors are divided between variants, such as 50/50 or 25/25/25/25.
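One common way to implement a split is to hash the user ID, so the same visitor always sees the same variant. A sketch with illustrative variant names and a 50/50 split:

```python
import hashlib

def assign_variant(user_id, experiment_id,
                   split=(("control", 50), ("variant_b", 50))):
    """Deterministically bucket a user into a variant by hashing their ID.

    The same user always lands in the same bucket, and buckets follow the
    configured weights (here 50/50). Names and weights are illustrative.
    """
    key = f"{experiment_id}:{user_id}".encode()
    bucket = int(hashlib.md5(key).hexdigest(), 16) % 100
    threshold = 0
    for name, weight in split:
        threshold += weight
        if bucket < threshold:
            return name
    return split[-1][0]

print(assign_variant("user-123", "homepage-test"))
print(assign_variant("user-123", "homepage-test"))  # same user, same variant
```

Including the experiment ID in the hash keeps assignments independent across experiments, so a user bucketed into "control" in one test isn't systematically bucketed into "control" everywhere.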
- Treatment
Another term for a variation or test version.
- Uplift
The percentage improvement of a variant compared to the control.
- Variant (or Variation)
One of the versions being tested — for example, “A” (control) and “B” (new idea).
- Winner
The variant that performs best based on your chosen goal or probability threshold.
Stop guessing. Start testing.
Try Lyftio free for 30 days and launch experiments in minutes. No long-term commitment. Cancel any time.