Sample size: comparing two proportions

Fast, MMed-friendly sample-size calculator for comparing two independent proportions or rates. Plain-language inputs, citations to support each assumption, and a copy-ready methods paragraph.

Published

May 7, 2026

Compare two proportions (or rates)

Use this tool when your outcome is binary (e.g. died / survived, cured / not cured, complication / no complication) and you want to compare two independent groups. Defaults are tuned for a typical MMed-level study.

Outcome rate in Group 1 (control / reference) p₁

As a proportion between 0 and 1 (e.g. 0.30 = 30 %).

How to justify this number

This is what you expect to see in your control or standard-care group, before running the study. Sources, in order of strength:

A recent local study in a similar South African setting.
A systematic review or meta-analysis of the outcome.
A registry / audit (NICD, Stats SA, hospital M&M data).
A pilot chart review of 30–50 of your own records.

Search: "<outcome>" "<setting>" prevalence OR incidence South Africa on PubMed (last 5 years).

In your protocol: "Based on Smith et al. (2022), the 30-day mortality at our institution for this cohort was approximately 30 %."

Outcome rate in Group 2 (intervention / comparison) p₂

As a proportion between 0 and 1.

How to justify this number

Two valid framings:

Expected effect — what you predict the comparison group will show, based on prior trials.
Minimum clinically important difference (MCID) — the smallest change that would justify acting on the result. Often more defensible.

Cite an MCID from clinical guidelines or expert consensus where possible (e.g. SA Society of Anaesthesiologists guideline 2019).

Significance level α

Acceptable false-positive rate. Two-sided test.

How to justify this number

α = 0.05 is the conventional default in clinical research. Deviate only with a stated reason: stricter (0.01) for confirmatory or multiple-testing studies; more lenient (0.10) only for exploratory work, and flag it in the protocol.

Cite: ICH E9 (1998) Statistical Principles for Clinical Trials.

Power 1 − β

Probability of detecting the effect if it really exists.

How to justify this number

Convention is 0.80. Increase to 0.90 if the study is expensive, hard to repeat, or missing the effect would have major clinical consequences.

Cite: Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. 1988.
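Both α and power enter the sample-size formula given further down this page only as standard normal quantiles. The sketch below shows how the conventional choices translate into those quantiles. The function names are illustrative and not taken from the calculator's own code.

```python
from statistics import NormalDist


def z_quantile(prob: float) -> float:
    """Standard normal quantile: the z such that P(Z <= z) = prob."""
    return NormalDist().inv_cdf(prob)


def critical_values(alpha: float = 0.05, power: float = 0.80) -> tuple[float, float]:
    """Return (z for a two-sided test at level alpha, z for the requested power)."""
    return z_quantile(1 - alpha / 2), z_quantile(power)


# Print the quantiles for a few common (alpha, power) pairs.
for alpha, power in [(0.05, 0.80), (0.05, 0.90), (0.01, 0.80)]:
    z_alpha, z_power = critical_values(alpha, power)
    print(f"alpha={alpha:.2f} power={power:.2f} z_alpha={z_alpha:.3f} z_power={z_power:.3f}")
```

Because both quantiles are added and then squared in the formula, moving from 80 % to 90 % power or from α = 0.05 to 0.01 raises the required sample size noticeably.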

Account for clustering

Tick this if participants are grouped (clinics, classrooms, households). People in the same cluster are more similar than random individuals, which inflates the required sample size.
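To give a feel for how much clustering can inflate a sample, here is a minimal sketch using the design effect DEff = 1 + (m − 1) · ρ stated later on this page. The function names, the rounding up to a whole participant, and the example inputs (20 people per cluster, ICC of 0.05) are assumptions for illustration only.

```python
import math


def design_effect(mean_cluster_size: float, icc: float) -> float:
    """DEff = 1 + (m - 1) * rho, with m the mean cluster size and rho the ICC."""
    if mean_cluster_size < 1:
        raise ValueError("mean_cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        # Assumption: restrict to the usual non-negative ICC range.
        raise ValueError("icc must lie between 0 and 1")
    return 1.0 + (mean_cluster_size - 1.0) * icc


def inflate_for_clustering(n_per_group: int, mean_cluster_size: float, icc: float) -> int:
    """Multiply an unclustered per-group sample size by DEff and round up."""
    return math.ceil(n_per_group * design_effect(mean_cluster_size, icc))


# Hypothetical inputs: 93 per group before clustering, 20 per cluster, ICC = 0.05.
print(inflate_for_clustering(93, 20, 0.05))
```

Even a small ICC matters when clusters are large, because the ICC is multiplied by (m − 1).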

What does this calculation actually do?

For comparing two independent proportions with a two-sided z-test, the required sample size per group is approximately:

n = [ z₁₋α/₂ · √(2 · p̄ · (1 − p̄))  +  z₁₋β · √(p₁(1 − p₁) + p₂(1 − p₂)) ]² / (p₂ − p₁)²

where p̄ = (p₁ + p₂) / 2. When clustering is on, the required sample is inflated by the design effect, where m is the average cluster size and ρ is the intracluster correlation coefficient (ICC):

DEff = 1 + (m − 1) · ρ
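The per-group formula above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the calculator's actual code: the function name is hypothetical, no continuity correction is applied, and the result is rounded up to the next whole participant.

```python
import math
from statistics import NormalDist


def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-group n for a two-sided z-test of two independent proportions."""
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("p1 and p2 must lie strictly between 0 and 1")
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # z_{1 - alpha/2}
    z_power = NormalDist().inv_cdf(power)          # z_{1 - beta}
    p_bar = (p1 + p2) / 2
    # Variance term under the null hypothesis (pooled proportion).
    null_term = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    # Variance term under the alternative (separate proportions).
    alt_term = z_power * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    # Assumption: round up, since a fraction of a participant cannot be recruited.
    return math.ceil((null_term + alt_term) ** 2 / (p2 - p1) ** 2)


print(n_per_group(0.30, 0.50))  # 93
```

The formula is symmetric in p₁ and p₂, so swapping the two groups gives the same answer.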

Assumptions: independent observations within groups (or constant ICC across clusters); no interim analyses; complete follow-up. Before recruiting, inflate for expected drop-out by dividing the required n by (1 − drop-out rate).
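The drop-out adjustment can be sketched as follows. Dividing by (1 − drop-out rate) is an assumption about the convention in use, chosen because it reproduces the worked example on this page (186 charts becoming 207 at 10 % missing records). The function name is hypothetical.

```python
import math


def inflate_for_dropout(n_required: int, dropout_rate: float) -> int:
    """Number to recruit so that n_required remain after losing dropout_rate of them."""
    if not 0.0 <= dropout_rate < 1.0:
        raise ValueError("dropout_rate must be in [0, 1)")
    # Assumption: divide by the retained fraction rather than multiply by (1 + rate).
    return math.ceil(n_required / (1.0 - dropout_rate))


print(inflate_for_dropout(186, 0.10))  # 207
```

Multiplying by 1.10 instead would give a slightly smaller number and leave the study marginally short if the full 10 % is lost.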

References: Fleiss JL, Tytun A, Ury HK. Biometrics 1980;36:343–6. · Cohen J. Statistical Power Analysis 1988. · Lwanga SK, Lemeshow S. Sample Size Determination in Health Studies, WHO 1991.

Worked MMed example

Setting: A registrar audits 30-day mortality in two ICUs in a tertiary hospital. ICU A uses a standard sepsis bundle; ICU B uses an enhanced bundle introduced in 2023.

From a 2024 single-centre cohort (Mokoena et al.), 30-day mortality on the standard bundle was about 30 %. The registrar takes roughly 50 % as the comparison rate for the enhanced bundle, measured on the same outcome definition as the control rate, so the calculation targets a difference between p₁ = 0.30 and p₂ = 0.50, with α = 0.05 and 80 % power.

The calculator returns roughly 93 patients per ICU (186 total), comfortably feasible in a 12-month chart-review window. After inflating for an expected 10 % missing-record rate, the registrar plans for 207 charts.
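The arithmetic behind these figures can be traced by substituting p₁ = 0.30, p₂ = 0.50, α = 0.05 and 80 % power into the formula given earlier. Two steps are assumptions that happen to reproduce the numbers quoted above: the per-group result is rounded up to a whole patient, and the missing-record inflation divides by (1 − 0.10). The quantiles are kept to five decimals because the unrounded result sits very close to 93.

```latex
\begin{align*}
\bar{p} &= \frac{0.30 + 0.50}{2} = 0.40 \\
z_{1-\alpha/2}\sqrt{2\,\bar{p}\,(1-\bar{p})} &\approx 1.95996 \times \sqrt{0.48} \approx 1.35790 \\
z_{1-\beta}\sqrt{p_1(1-p_1) + p_2(1-p_2)} &\approx 0.84162 \times \sqrt{0.46} \approx 0.57081 \\
n &\approx \frac{(1.35790 + 0.57081)^2}{(0.50 - 0.30)^2} \approx \frac{3.7199}{0.04} \approx 93.0 \\
n_{\text{total}} &= 2 \times 93 = 186 \\
n_{\text{charts}} &= \left\lceil \frac{186}{1 - 0.10} \right\rceil = \lceil 206.67 \rceil = 207
\end{align*}
```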

Prefer to explore this in R / Shiny? Open the interactive R version →