Revenue Calculator // Online

A/B Test Significance Calculator

An A/B test significance calculator determines whether the difference in conversion rates between two variants is statistically meaningful or due to random chance. Enter the number of visitors and conversions for each variant, select your confidence level, and the calculator runs a two-proportion Z-test with pooled proportion to produce a p-value and verdict. It computes Wilson score confidence intervals for each variant rate, a Newcombe interval for the rate difference, and relative lift. Switch to Plan Sample Size mode to calculate the visitors per variant you need before starting a test, based on your baseline rate, minimum detectable effect, confidence level, and statistical power.


Common Pitfalls
Don't peek too early. Checking results repeatedly inflates false-positive rates. Decide your sample size in advance and wait until you reach it before drawing conclusions.
Run for full business cycles. Conversion rates vary by day of week and time of month. Run tests for at least one full week — ideally two — to avoid cyclical bias.
CI overlap does not mean non-significance. Two overlapping confidence intervals can still produce a significant result. Always use the Z-test p-value, not visual CI overlap, to judge significance.

How to Use

Get Started in 3 Steps

Step 01

Enter Your Test Data

Input the number of visitors or email sends and conversions for each variant (A and B). Select your desired confidence level from the dropdown. Variant A is your control and variant B is your test.

Step 02

Review Statistical Results

Click Analyze Results to see the p-value, Z-statistic, verdict, confidence intervals for each variant, the Newcombe interval for the difference, and relative lift with a visual comparison chart.

Step 03

Plan Future Tests

Switch to Plan Sample Size mode to calculate how many visitors you need per variant. Enter your baseline conversion rate, minimum detectable effect in percentage points, confidence level, and statistical power.

How It Works

Under the Hood

This calculator implements a two-proportion Z-test using a pooled proportion for the standard error estimate. It computes the pooled conversion rate across both variants, calculates the standard error of the difference, and derives a Z-statistic. The two-sided p-value comes from the normal CDF approximation using the Abramowitz and Stegun rational approximation of the error function, accurate to within 1.5 × 10⁻⁷.
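The pipeline above can be sketched in a few lines of Python. This is a minimal illustration of the same math, not the calculator's actual source; the function names are ours. The error-function approximation is Abramowitz and Stegun formula 7.1.26, which is what gives the stated 1.5 × 10⁻⁷ accuracy.

```python
import math

def erf_approx(x: float) -> float:
    """Abramowitz & Stegun 7.1.26 rational approximation, |error| < 1.5e-7."""
    sign = -1.0 if x < 0 else 1.0
    x = abs(x)
    t = 1.0 / (1.0 + 0.3275911 * x)
    poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
           + t * (-1.453152027 + t * 1.061405429))))
    return sign * (1.0 - poly * math.exp(-x * x))

def normal_cdf(z: float) -> float:
    """Standard normal CDF via the erf approximation."""
    return 0.5 * (1.0 + erf_approx(z / math.sqrt(2.0)))

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Pooled two-proportion Z-test; returns (z_statistic, two_sided_p_value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)          # pooled conversion rate
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))        # two-sided
    return z, p_value
```

For example, 200 conversions from 5,000 visitors (4 percent) against 250 from 5,000 (5 percent) yields a p-value of roughly 0.016, significant at the 95 percent level.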

Confidence intervals use the Wilson score method for individual variant rates rather than the simpler Wald interval. The Wilson interval correctly handles extreme proportions near 0 percent and 100 percent, producing nonzero-width intervals with finite samples. For the difference between proportions, the calculator uses the Newcombe interval, which combines Wilson intervals from both groups to produce a confidence interval for the rate difference.
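Both interval methods are short to state in code. The sketch below assumes z = 1.96 for 95 percent confidence; the names are illustrative, and the Newcombe bounds follow the standard hybrid construction that reuses the Wilson limits of each group.

```python
import math

def wilson_interval(conv: int, n: int, z: float = 1.96):
    """Wilson score interval for a proportion; nonzero width even at 0% or 100%."""
    p = conv / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return max(0.0, center - half), min(1.0, center + half)

def newcombe_interval(conv_a: int, n_a: int, conv_b: int, n_b: int,
                      z: float = 1.96):
    """Newcombe hybrid interval for the difference (rate_b - rate_a),
    built from the Wilson limits of each group."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    l_a, u_a = wilson_interval(conv_a, n_a, z)
    l_b, u_b = wilson_interval(conv_b, n_b, z)
    d = p_b - p_a
    lower = d - math.sqrt((p_b - l_b) ** 2 + (u_a - p_a) ** 2)
    upper = d + math.sqrt((u_b - p_b) ** 2 + (p_a - l_a) ** 2)
    return lower, upper
```

Note the edge-case behavior: `wilson_interval(0, 20)` returns an interval from about 0 to about 0.16, where the Wald interval would collapse to zero width.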

The verdict follows strict decision boundaries. INCONCLUSIVE appears when both variants have zero conversions, either variant has zero visitors, or both have identical zero-variance rates (all converted or none converted). NOT YET appears when the math is well-defined but the p-value exceeds your alpha threshold. SIGNIFICANT appears when the p-value is at or below alpha, with winner identification based on which variant has the higher conversion rate.
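Those boundaries translate directly into a small decision function. This is a hedged sketch of the logic as described, not the calculator's actual code; the p-value is assumed to come from the Z-test described earlier.

```python
def verdict(conv_a: int, n_a: int, conv_b: int, n_b: int,
            p_value: float, alpha: float = 0.05) -> str:
    """Map test inputs and a precomputed p-value to a verdict string."""
    if n_a == 0 or n_b == 0:
        return "INCONCLUSIVE"               # a variant has zero visitors
    if conv_a == 0 and conv_b == 0:
        return "INCONCLUSIVE"               # no conversions anywhere
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    if rate_a == rate_b and rate_a in (0.0, 1.0):
        return "INCONCLUSIVE"               # identical zero-variance rates
    if p_value <= alpha:
        winner = "B" if rate_b > rate_a else "A"
        return f"SIGNIFICANT: {winner} wins"
    return "NOT YET"                        # well-defined but p > alpha
```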

Sample size calculation uses the standard two-proportion formula with precomputed Z-scores for confidence (1.645 for 90 percent, 1.96 for 95 percent, 2.576 for 99 percent) and power (0.842 for 80 percent, 1.282 for 90 percent, 1.645 for 95 percent). The calculator flags sample sizes exceeding 10 million per variant as impractical.
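The same formula and Z-score tables look like this in Python (an illustrative sketch; `sample_size_per_variant` is our name, not the calculator's):

```python
import math

Z_CONF = {90: 1.645, 95: 1.96, 99: 2.576}    # two-sided z for confidence level
Z_POWER = {80: 0.842, 90: 1.282, 95: 1.645}  # one-sided z for statistical power

def sample_size_per_variant(baseline: float, mde_points: float,
                            confidence: int = 95, power: int = 80) -> int:
    """Visitors needed per variant to detect an absolute lift of `mde_points`
    percentage points over a `baseline` rate given as a fraction (e.g. 0.05)."""
    p1 = baseline
    p2 = baseline + mde_points / 100.0
    z = Z_CONF[confidence] + Z_POWER[power]
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z * z * variance) / ((p2 - p1) ** 2)
    return math.ceil(n)
```

With a 5 percent baseline and a 2-point minimum detectable effect at 95 percent confidence and 80 percent power, this comes out to roughly 2,200 visitors per variant; a production implementation would also apply the 10-million-per-variant impracticality flag mentioned above.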

FAQ

Frequently Asked Questions

What is statistical significance in A/B testing?
Statistical significance means the difference in conversion rates between your two variants is unlikely to have occurred by random chance alone. This calculator uses a two-proportion Z-test with a pooled proportion to compute a p-value. If the p-value falls below your chosen significance level (alpha), the result is statistically significant. A 95 percent confidence level means alpha equals 0.05, so you need a p-value of 0.05 or lower. Significance does not tell you the size of the effect, only that a real difference likely exists. Always pair it with confidence intervals and relative lift to understand both the existence and magnitude of the effect.
How many visitors do I need for a statistically significant A/B test?
The required sample size depends on four factors: your baseline conversion rate, the minimum detectable effect you care about, your desired confidence level, and statistical power. Use the Plan Sample Size mode in this calculator to get an exact number. As a rough guide, detecting a 2 percentage point lift from a 5 percent baseline at 95 percent confidence and 80 percent power requires around 2,200 visitors per variant. Smaller effects or higher confidence requirements increase the sample size dramatically. Always determine your sample size before starting the test to avoid the temptation of early peeking.
What is the difference between confidence level and statistical power?
Confidence level controls the false-positive rate. A 95 percent confidence level means there is a 5 percent chance of declaring a winner when no real difference exists (Type I error). Statistical power controls the false-negative rate. Eighty percent power means there is a 20 percent chance of missing a real difference that actually exists (Type II error). Both affect required sample size: higher confidence or higher power means you need more visitors per variant. The most common combination is 95 percent confidence with 80 percent power, which balances accuracy against practical sample size requirements.
Can I check my A/B test results before the test is complete?
Checking results repeatedly before reaching your target sample size is called peeking and it inflates your false-positive rate well beyond the nominal alpha level. A test designed for 95 percent confidence can behave like a 20 to 30 percent confidence test if you check after every batch of visitors. The safest approach is to calculate your required sample size upfront using the Plan Sample Size mode, then wait until you reach that number before analyzing. If you must monitor results during a test, use sequential testing methods that formally adjust for repeated looks at the data.
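The inflation from peeking is easy to demonstrate with a seeded A/A simulation, where both arms share the same true rate so every "significant" result is a false positive. This sketch uses a small experiment count to keep it fast, so the rates are approximate; the standard-library `statistics.NormalDist` stands in for the erf approximation used elsewhere.

```python
import math
import random
from statistics import NormalDist

def p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided pooled two-proportion Z-test p-value."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def simulate_peeking(n_experiments=300, batches=10, batch_size=400,
                     rate=0.05, alpha=0.05, seed=7):
    """A/A tests: false-positive rate when peeking after every batch
    vs. testing once at the planned sample size."""
    rng = random.Random(seed)
    peek_hits = final_hits = 0
    for _ in range(n_experiments):
        ca = cb = n = 0
        peeked = False
        for _ in range(batches):
            ca += sum(rng.random() < rate for _ in range(batch_size))
            cb += sum(rng.random() < rate for _ in range(batch_size))
            n += batch_size
            if p_value(ca, n, cb, n) < alpha:
                peeked = True          # a peeker would have stopped here
        peek_hits += peeked
        final_hits += p_value(ca, n, cb, n) < alpha
    return peek_hits / n_experiments, final_hits / n_experiments
```

Running it shows the single final test holding near the nominal 5 percent false-positive rate while the ten-look peeker's rate lands several times higher, consistent with the warning above.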
What is the Wilson score confidence interval and why use it?
The Wilson score interval is a method for estimating the range of plausible values for a conversion rate given your observed data. Unlike the simpler Wald interval (rate plus or minus z times standard error), the Wilson interval handles edge cases correctly. It produces valid nonzero-width intervals even when the observed conversion rate is exactly 0 percent or 100 percent, which matters in real experiments with small sample sizes or extreme rates. This calculator uses Wilson intervals for individual variant rates and the Newcombe method for the difference between variants, which combines Wilson intervals from both groups.

Need Expert Help?

We Optimize Your Email Campaigns

Our Email Marketing service helps you design, test, and optimize email campaigns for maximum engagement. We handle A/B test strategy, segmentation, deliverability, and conversion tracking so you can focus on revenue.

Learn About Email Marketing