Two Binary Co-Primary Endpoints (Exact Methods)
Source: vignettes/two-binary-endpoints-exact.Rmd
Overview
This vignette demonstrates exact sample size calculation and power analysis for clinical trials with two co-primary binary endpoints. The methodology is based on Homma and Yoshida (2025), which provides exact inference methods using the bivariate binomial distribution.
Background
When to Use Exact Methods
Exact methods are recommended when:
- Sample sizes are small to medium
- Response probabilities are extreme (close to 0 or 1)
- Strict Type I error control is required
- Regulators require exact inference
Asymptotic methods may not maintain the nominal Type I error rate in these situations.
Statistical Framework
Model and Assumptions
Consider a two-arm parallel-group superiority trial comparing treatment (group 1) with control (group 2). Let $n_1$ and $n_2$ denote the sample sizes in groups 1 and 2, respectively.
For patient $i$ in group $j$ ($j = 1$: treatment, $j = 2$: control), we observe two binary outcomes:
Endpoint $k$ ($k = 1, 2$): $$Y_{i,j,k} \in \{0, 1\}$$
where $Y_{i,j,k} = 1$ if patient $i$ in group $j$ is a responder for endpoint $k$, and 0 otherwise.
True response probabilities: $$p_{j,k} = P(Y_{i,j,k} = 1)$$
where $0 < p_{j,k} < 1$ for each $j$ and $k$.
Joint Distribution of Binary Outcomes
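The paired binary outcomes $(Y_{i,j,1}, Y_{i,j,2})$ for patient $i$ in group $j$ follow a multinomial distribution with four possible outcomes: $(1,1)$, $(1,0)$, $(0,1)$, and $(0,0)$.
Per-trial probabilities, with $\phi_j^{(l,m)} = P(Y_{i,j,1} = l, Y_{i,j,2} = m)$:
- $\phi_j^{(1,1)}$: Both endpoints successful
- $\phi_j^{(1,0)}$: Only endpoint 1 successful
- $\phi_j^{(0,1)}$: Only endpoint 2 successful
- $\phi_j^{(0,0)}$: Both endpoints unsuccessful
where $\phi_j^{(1,1)} + \phi_j^{(1,0)} + \phi_j^{(0,1)} + \phi_j^{(0,0)} = 1$.
Let $X_j^{(l,m)}$ denote the random variable representing the number of times $(Y_{i,j,1}, Y_{i,j,2})$ takes the value $(l,m)$ for $i = 1, \ldots, n_j$. Then:
$$\left(X_j^{(1,1)}, X_j^{(1,0)}, X_j^{(0,1)}, X_j^{(0,0)}\right) \sim \text{Multinomial}\left(n_j;\ \phi_j^{(1,1)}, \phi_j^{(1,0)}, \phi_j^{(0,1)}, \phi_j^{(0,0)}\right)$$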
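As a minimal sketch of this multinomial structure, the following base R snippet simulates the four outcome counts for one group and derives the number of responders per endpoint. The group size and cell probabilities are illustrative assumptions, not values from Homma and Yoshida (2025).

```r
set.seed(1)
n_j <- 50  # illustrative group size (assumption)

# Illustrative cell probabilities (assumption); they must sum to 1
phi <- c(both = 0.45, only1 = 0.25, only2 = 0.20, neither = 0.10)
stopifnot(isTRUE(all.equal(sum(phi), 1)))

# One draw of the four outcome counts for the n_j patients
counts <- setNames(rmultinom(1, size = n_j, prob = phi)[, 1], names(phi))

# Responder totals per endpoint are sums of the relevant cells
y_endpoint1 <- counts[["both"]] + counts[["only1"]]
y_endpoint2 <- counts[["both"]] + counts[["only2"]]
```

The two responder totals share the `both` cell, which is what induces the dependence between the endpoints.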
Bivariate Binomial Distribution
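Following Homma and Yoshida (2025), the joint distribution of $(Y_{j,1}, Y_{j,2})$, where $Y_{j,k} = \sum_{i=1}^{n_j} Y_{i,j,k}$ is the number of responders for endpoint $k$ in group $j$, can be expressed as a bivariate binomial distribution:
$$(Y_{j,1}, Y_{j,2}) \sim \text{BiBin}(n_j, p_{j,1}, p_{j,2}, \gamma_j)$$
where $\gamma_j$ is a dependence parameter related to the correlation between $Y_{i,j,1}$ and $Y_{i,j,2}$.
The probability mass function is given as Equation 3 in Homma and Yoshida (2025); please see that paper for details.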
Correlation Structure
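The correlation between $Y_{i,j,1}$ and $Y_{i,j,2}$ is:
$$\rho_j = \frac{P(Y_{i,j,1} = 1, Y_{i,j,2} = 1) - p_{j,1}\, p_{j,2}}{\sqrt{p_{j,1}(1 - p_{j,1})\, p_{j,2}(1 - p_{j,2})}}$$
The dependence parameter $\gamma_j$ is related to $\rho_j$ through Equation 4 in Homma and Yoshida (2025).
Important property: The correlation between the responder counts $Y_{j,1}$ and $Y_{j,2}$ equals $\rho_j$, the same as the correlation between $Y_{i,j,1}$ and $Y_{i,j,2}$.
Marginal distributions: $$Y_{j,k} \sim \text{Bin}(n_j, p_{j,k})$$
Correlation bounds: Because each of the four joint outcome probabilities must lie in $[0, 1]$, the correlation is bounded:
$$L(p_{j,1}, p_{j,2}) \le \rho_j \le U(p_{j,1}, p_{j,2})$$
where:
$$L(p_{j,1}, p_{j,2}) = \max\left\{-\sqrt{\frac{p_{j,1}\, p_{j,2}}{(1 - p_{j,1})(1 - p_{j,2})}},\ -\sqrt{\frac{(1 - p_{j,1})(1 - p_{j,2})}{p_{j,1}\, p_{j,2}}}\right\}$$
$$U(p_{j,1}, p_{j,2}) = \min\left\{\sqrt{\frac{p_{j,1}(1 - p_{j,2})}{p_{j,2}(1 - p_{j,1})}},\ \sqrt{\frac{p_{j,2}(1 - p_{j,1})}{p_{j,1}(1 - p_{j,2})}}\right\}$$
Special cases:
- If $p_{j,1} = p_{j,2}$, then $U(p_{j,1}, p_{j,2}) = 1$
- If $p_{j,1} + p_{j,2} = 1$, then $L(p_{j,1}, p_{j,2}) = -1$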
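The feasible correlation range can be checked numerically before a design is specified. The sketch below uses the standard bounds for two correlated Bernoulli variables; `corr_bounds()` and `cell_probs()` are hypothetical helper names written for this illustration, not functions of the twoCoprimary package.

```r
# Feasible correlation range for two binary endpoints with marginal
# response probabilities p1 and p2 (hypothetical helper).
corr_bounds <- function(p1, p2) {
  stopifnot(p1 > 0, p1 < 1, p2 > 0, p2 < 1)
  q1 <- 1 - p1
  q2 <- 1 - p2
  lower <- max(-sqrt(p1 * p2 / (q1 * q2)), -sqrt(q1 * q2 / (p1 * p2)))
  upper <- min(sqrt(p1 * q2 / (p2 * q1)), sqrt(p2 * q1 / (p1 * q2)))
  c(lower = lower, upper = upper)
}

# Joint outcome probabilities implied by (p1, p2, rho) (hypothetical helper).
cell_probs <- function(p1, p2, rho) {
  b <- corr_bounds(p1, p2)
  stopifnot(rho >= b[["lower"]], rho <= b[["upper"]])
  p_both <- p1 * p2 + rho * sqrt(p1 * (1 - p1) * p2 * (1 - p2))
  c(both = p_both, only1 = p1 - p_both, only2 = p2 - p_both,
    neither = 1 - p1 - p2 + p_both)
}

corr_bounds(0.5, 0.5)        # equal marginals: upper bound is 1
cell_probs(0.70, 0.65, 0.5)  # four probabilities that sum to 1
```

A requested correlation outside the returned range would imply a negative cell probability, so `cell_probs()` stops with an error in that case.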
Hypothesis Testing
Superiority Hypotheses
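Since higher values of both endpoints indicate treatment benefit, we test:
For endpoint 1: $$H_0^{(1)}: p_{1,1} \le p_{2,1} \quad \text{vs.} \quad H_1^{(1)}: p_{1,1} > p_{2,1}$$
For endpoint 2: $$H_0^{(2)}: p_{1,2} \le p_{2,2} \quad \text{vs.} \quad H_1^{(2)}: p_{1,2} > p_{2,2}$$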
Co-Primary Endpoints (Intersection-Union Test)
The trial succeeds only if superiority is demonstrated for both endpoints simultaneously:
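Null hypothesis: $H_0 = H_0^{(1)} \cup H_0^{(2)}$ (at least one null is true)
Alternative hypothesis: $H_1 = H_1^{(1)} \cap H_1^{(2)}$ (both alternatives are true)
Decision rule: Reject $H_0$ at level $\alpha$ if and only if both $H_0^{(1)}$ and $H_0^{(2)}$ are rejected at level $\alpha$, without multiplicity adjustment.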
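The decision rule amounts to a logical AND of the two endpoint-level tests, each carried out at the full significance level. A minimal sketch follows; `reject_coprimary()` is a hypothetical helper, and comparing with `<=` is an assumed convention.

```r
# Intersection-union decision for two co-primary endpoints: each
# endpoint-level p-value is compared with the full alpha, with no
# multiplicity adjustment (hypothetical helper).
reject_coprimary <- function(p_value1, p_value2, alpha = 0.025) {
  (p_value1 <= alpha) && (p_value2 <= alpha)
}

reject_coprimary(0.010, 0.020)  # TRUE: both endpoints are significant
reject_coprimary(0.010, 0.040)  # FALSE: endpoint 2 is not significant
```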
Statistical Tests
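The twoCoprimary package supports five test methods: Chisq, Fisher, Fisher-midP, Z-pool, and Boschloo. Homma and Yoshida (2025) investigate four of them (all except Fisher-midP). Three of the methods are detailed here: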
Method 1: One-sided Pearson Chi-squared Test (Chisq)
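For endpoint $k$, the test statistic is:
$$Z_k = \frac{\hat{p}_{1,k} - \hat{p}_{2,k}}{\sqrt{\hat{p}_k (1 - \hat{p}_k)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where:
- $\hat{p}_{j,k} = Y_{j,k} / n_j$ is the sample proportion, with $Y_{j,k}$ the number of responders for endpoint $k$ in group $j$
- $\hat{p}_k = (Y_{1,k} + Y_{2,k}) / (n_1 + n_2)$ is the pooled proportion
Reject $H_0^{(k)}$ if $Z_k > z_{1-\alpha}$, where $z_{1-\alpha}$ is the $(1-\alpha)$-quantile of the standard normal distribution.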
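For a single endpoint, the pooled one-sided statistic can be sketched in a few lines of base R. `chisq_z_stat()` is a hypothetical helper name, the counts are illustrative, and returning 0 when the pooled proportion is 0 or 1 is an assumed convention.

```r
# One-sided pooled Z statistic for one binary endpoint (hypothetical helper).
# y1, y2: responder counts in treatment and control; n1, n2: group sizes.
chisq_z_stat <- function(y1, n1, y2, n2) {
  p_pool <- (y1 + y2) / (n1 + n2)
  se <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
  if (se == 0) return(0)  # no responders or all responders (assumed convention)
  (y1 / n1 - y2 / n2) / se
}

alpha <- 0.025
z <- chisq_z_stat(y1 = 35, n1 = 50, y2 = 25, n2 = 50)  # illustrative counts
reject <- z > qnorm(1 - alpha)
```

With 35/50 versus 25/50 responders the statistic is about 2.04, which exceeds the critical value of about 1.96, so the endpoint-level null is rejected.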
Method 2: Fisher’s Exact Test (Fisher)
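Conditional test: Conditions on the total number of successes $T_k = Y_{1,k} + Y_{2,k}$.
Under $H_0^{(k)}$, $Y_{1,k}$ follows a hypergeometric distribution given $T_k = t_k$.
One-sided p-value:
$$p_k = \sum_{y = y_{1,k}}^{\min(n_1,\, t_k)} \frac{\binom{n_1}{y}\binom{n_2}{t_k - y}}{\binom{n_1 + n_2}{t_k}}$$
Reject $H_0^{(k)}$ if $p_k \le \alpha$.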
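The one-sided conditional p-value is an upper tail of the hypergeometric distribution, so it can be computed with `phyper()` and cross-checked against `fisher.test()` from base R. `fisher_one_sided()` is a hypothetical helper and the counts are illustrative.

```r
# One-sided Fisher p-value for one endpoint: P(Y1 >= y1 | total successes)
# under the hypergeometric null (hypothetical helper).
fisher_one_sided <- function(y1, n1, y2, n2) {
  total <- y1 + y2
  phyper(y1 - 1, m = n1, n = n2, k = total, lower.tail = FALSE)
}

p_manual <- fisher_one_sided(y1 = 35, n1 = 50, y2 = 25, n2 = 50)

# Cross-check: rows are responder / non-responder, columns are groups 1 and 2
tab <- matrix(c(35, 15, 25, 25), nrow = 2)
p_builtin <- fisher.test(tab, alternative = "greater")$p.value

all.equal(p_manual, p_builtin)
```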
Method 3: Fisher’s Mid-P Test (Fisher-midP)
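Reduces conservatism by counting only half the probability of the observed outcome:
$$p_k^{\text{mid}} = P(Y_{1,k} > y_{1,k} \mid T_k = t_k) + \frac{1}{2} P(Y_{1,k} = y_{1,k} \mid T_k = t_k)$$
where $T_k = Y_{1,k} + Y_{2,k}$ is the total number of successes for endpoint $k$.
Note: The twoCoprimary package implements Fisher’s mid-P test, but Homma and Yoshida (2025) did not investigate this test.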
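A minimal base R sketch of the mid-P adjustment, assuming the same hypergeometric null as Fisher's exact test; `fisher_midp_one_sided()` is a hypothetical helper name and the counts are illustrative.

```r
# One-sided mid-P value: the strict upper tail plus half of the probability
# of the observed table (hypothetical helper).
fisher_midp_one_sided <- function(y1, n1, y2, n2) {
  total <- y1 + y2
  p_greater <- phyper(y1, m = n1, n = n2, k = total, lower.tail = FALSE)
  p_equal <- dhyper(y1, m = n1, n = n2, k = total)
  p_greater + 0.5 * p_equal
}

p_mid <- fisher_midp_one_sided(y1 = 35, n1 = 50, y2 = 25, n2 = 50)

# Ordinary one-sided Fisher p-value for comparison
p_fisher <- phyper(35 - 1, m = 50, n = 50, k = 60, lower.tail = FALSE)
```

The mid-P value is smaller than the ordinary Fisher p-value by exactly half the probability of the observed table, which is why the test is less conservative.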
Exact Power Calculation
Power Formula
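The exact power for test method $A$ is (Equation 9 in Homma and Yoshida, 2025):
$$\text{power}_A(\boldsymbol{\theta}) = \sum_{(y_{1,1},\, y_{2,1}) \in \mathcal{R}_A^{(1)}}\ \sum_{(y_{1,2},\, y_{2,2}) \in \mathcal{R}_A^{(2)}} P(Y_{1,1} = y_{1,1}, Y_{1,2} = y_{1,2})\, P(Y_{2,1} = y_{2,1}, Y_{2,2} = y_{2,2})$$
where:
- $\boldsymbol{\theta} = (p_{1,1}, p_{1,2}, p_{2,1}, p_{2,2}, \gamma_1, \gamma_2)$ is the parameter vector
- $\mathcal{R}_A^{(k)}$ is the rejection region for endpoint $k$, that is, the set of responder counts $(y_{1,k}, y_{2,k})$ for which test $A$ rejects $H_0^{(k)}$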
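The enumeration behind the exact power can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the joint distribution of the two responder counts is obtained by summing over the number of patients who respond on both endpoints (rather than coding Equation 3 directly), the rejection region is that of the one-sided pooled Z test, and all function names are hypothetical.

```r
# Joint pmf of the responder counts (Y1, Y2) in one group, as an
# (n + 1) x (n + 1) matrix with entry [y1 + 1, y2 + 1] = P(Y1 = y1, Y2 = y2).
joint_pmf <- function(n, p1, p2, rho) {
  p_both <- p1 * p2 + rho * sqrt(p1 * (1 - p1) * p2 * (1 - p2))
  p_only2 <- p2 - p_both
  stopifnot(p_both >= 0, p_both <= p1, p_only2 >= 0, p_only2 <= 1 - p1)
  pmf <- matrix(0, n + 1, n + 1)
  for (y1 in 0:n) {
    for (y2 in 0:n) {
      # x = number of patients responding on both endpoints
      x <- max(0, y1 + y2 - n):min(y1, y2)
      pmf[y1 + 1, y2 + 1] <- dbinom(y1, n, p1) *
        sum(dbinom(x, y1, p_both / p1) *
              dbinom(y2 - x, n - y1, p_only2 / (1 - p1)))
    }
  }
  pmf
}

# Rejection region of the one-sided pooled Z test as a 0/1 matrix with
# entry [y1 + 1, y2 + 1] = 1 when counts (y1, y2) lead to rejection.
chisq_region <- function(n1, n2, alpha) {
  y1 <- matrix(0:n1, n1 + 1, n2 + 1)
  y2 <- matrix(0:n2, n1 + 1, n2 + 1, byrow = TRUE)
  p_pool <- (y1 + y2) / (n1 + n2)
  se <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
  z <- ifelse(se > 0, (y1 / n1 - y2 / n2) / se, 0)
  (z > qnorm(1 - alpha)) * 1
}

# Exact marginal and co-primary power by full enumeration.
exact_power <- function(n1, n2, p11, p12, p21, p22, rho1, rho2, alpha) {
  pmf1 <- joint_pmf(n1, p11, p12, rho1)  # group 1: rows y11, columns y12
  pmf2 <- joint_pmf(n2, p21, p22, rho2)  # group 2: rows y21, columns y22
  region <- chisq_region(n1, n2, alpha)  # same region for both endpoints
  marg <- function(pa, pb) outer(dbinom(0:n1, n1, pa), dbinom(0:n2, n2, pb))
  c(power1 = sum(region * marg(p11, p21)),
    power2 = sum(region * marg(p12, p22)),
    powerCoprimary = sum(region * (pmf1 %*% region %*% t(pmf2))))
}

pw <- exact_power(20, 20, 0.70, 0.65, 0.50, 0.45, 0.5, 0.5, alpha = 0.025)
```

The matrix product collapses the sums over the endpoint 2 counts, so the four-fold enumeration reduces to two matrix multiplications and one element-wise sum. The co-primary power can never exceed the smaller of the two marginal powers.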
Sample Size Calculation
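The required sample size to achieve target power $1 - \beta$ is (Equation 10 in Homma and Yoshida, 2025):
$$n_2 = \min\left\{ n_2 \in \mathbb{N} : \text{power}_A(\boldsymbol{\theta}) \ge 1 - \beta \right\}$$
with $n_1 = r\, n_2$ for allocation ratio $r$.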
This cannot be expressed as a closed-form formula due to:
- Discreteness of binary outcomes
- Non-monotonic “sawtooth” power curve
Algorithm: A sequential search that uses the sample size from the asymptotic normal approximation (AN method) as its starting value.
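A sequential search of this kind can be sketched as follows. To keep the example short it uses the exact power of the one-sided pooled Z test for a single endpoint as a stand-in for the co-primary power, and a textbook unpooled normal-approximation formula as a stand-in for the AN starting value. Both substitutions, the integer allocation ratio `r`, and the function names are assumptions of this illustration.

```r
# Exact power of the one-sided pooled Z test for ONE binary endpoint
# (stand-in for the co-primary power; r must be a positive integer here).
single_endpoint_power <- function(n2, p1, p2, alpha, r = 1) {
  n1 <- r * n2
  y1 <- matrix(0:n1, n1 + 1, n2 + 1)
  y2 <- matrix(0:n2, n1 + 1, n2 + 1, byrow = TRUE)
  p_pool <- (y1 + y2) / (n1 + n2)
  se <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
  z <- ifelse(se > 0, (y1 / n1 - y2 / n2) / se, 0)
  prob <- outer(dbinom(0:n1, n1, p1), dbinom(0:n2, n2, p2))
  sum(prob[z > qnorm(1 - alpha)])
}

# Step down while the target is still met, otherwise step up until it is.
search_n2 <- function(power_fn, n_start, target) {
  n <- max(2, ceiling(n_start))
  if (power_fn(n) >= target) {
    while (n > 2 && power_fn(n - 1) >= target) n <- n - 1
  } else {
    while (power_fn(n) < target) n <- n + 1
  }
  n
}

p1 <- 0.70; p2 <- 0.50; alpha <- 0.025; beta <- 0.2

# Starting value from an unpooled normal approximation
n_start <- (qnorm(1 - alpha) + qnorm(1 - beta))^2 *
  (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2)^2

power_fn <- function(n2) single_endpoint_power(n2, p1, p2, alpha)
n2 <- search_n2(power_fn, n_start, target = 1 - beta)
```

Because the exact power curve is sawtoothed, the downward step stops at the first sample size whose predecessor misses the target. An even smaller sample size may also reach the target, and a slightly larger one may fall just below it, so a stricter search would scan a wider neighbourhood of the result.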
Replicating Homma and Yoshida (2025) Table 4
Table 4 of Homma and Yoshida (2025) shows sample sizes for various correlations using the Chisq, Fisher, Z-pool, and Boschloo tests. Note that the following sample code computes only the scenario with $\alpha = 0.025$, $1 - \beta = 0.90$, $p_{1,1} = p_{1,2} = 0.54$, and $p_{2,1} = p_{2,2} = 0.25$.
The notation used in the function is: p11 = $p_{1,1}$, p12 = $p_{1,2}$, p21 = $p_{2,1}$, p22 = $p_{2,2}$, where the first subscript denotes the group (1 = treatment, 2 = control) and the second subscript denotes the endpoint (1 or 2).
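# Recreate Homma and Yoshida (2025) Table 4
library(twoCoprimary)  # design_table(), power2BinaryExact(), ss2BinaryExact()
library(knitr)         # kable()
library(dplyr)
library(tidyr)
library(readr)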
param_grid_bin_exact_ss <- tibble(
  p11 = 0.54,
  p12 = 0.54,
  p21 = 0.25,
  p22 = 0.25
)
result_bin_exact_ss <- do.call(
  bind_rows,
  lapply(c("Chisq", "Fisher", "Z-pool", "Boschloo"), function(test) {
    do.call(
      bind_rows,
      lapply(1:2, function(r) {
        design_table(
          param_grid = param_grid_bin_exact_ss,
          rho_values = c(0, 0.3, 0.5, 0.8),
          r = r,
          alpha = 0.025,
          beta = 0.1,
          endpoint_type = "binary",
          Test = test
        ) %>%
          mutate(alpha = 0.025, r = r, Test = test)
      })
    )
  })
) %>%
  pivot_longer(
    cols = starts_with("rho_"),
    names_to = "rho",
    values_to = "N",
    names_transform = list(rho = parse_number)
  ) %>%
  select(r, rho, Test, N) %>%
  pivot_wider(names_from = Test, values_from = N) %>%
  as.data.frame()
kable(result_bin_exact_ss,
      caption = "Table 4: Total Sample Size (N) for Two Co-Primary Binary Endpoints (α = 0.025, 1-β = 0.90)^a,b^",
      digits = 1,
      col.names = c("r", "ρ", "Chisq", "Fisher", "Z-pool", "Boschloo"))

| r | ρ | Chisq | Fisher | Z-pool | Boschloo |
|---|---|---|---|---|---|
| 1 | 0.0 | 142 | 152 | 144 | 144 |
| 1 | 0.3 | 142 | 150 | 142 | 142 |
| 1 | 0.5 | 140 | 150 | 140 | 140 |
| 1 | 0.8 | 128 | 144 | 134 | 134 |
| 2 | 0.0 | 162 | 174 | 180 | 162 |
| 2 | 0.3 | 159 | 174 | 180 | 159 |
| 2 | 0.5 | 156 | 171 | 177 | 156 |
| 2 | 0.8 | 147 | 159 | 168 | 150 |
a Chisq denotes the one-sided Pearson chi-squared test. Fisher stands for Fisher’s exact test. Z-pool represents the Z-pooled exact unconditional test. Boschloo signifies Boschloo’s exact unconditional test.
b The required sample sizes were obtained by assuming that $p_{1,1} = p_{1,2} = 0.54$ and $p_{2,1} = p_{2,2} = 0.25$.
Practical Examples
Example 1: Basic Exact Power Calculation
# Calculate exact power using Fisher's exact test
result_fisher <- power2BinaryExact(
  n1 = 50,
  n2 = 50,
  p11 = 0.70, p12 = 0.65,
  p21 = 0.50, p22 = 0.45,
  rho1 = 0.5, rho2 = 0.5,
  alpha = 0.025,
  Test = "Fisher"
)
print(result_fisher)
#>
#> Power calculation for two binary co-primary endpoints
#>
#> n1 = 50
#> n2 = 50
#> p (group 1) = 0.7, 0.65
#> p (group 2) = 0.5, 0.45
#> rho = 0.5, 0.5
#> alpha = 0.025
#> Test = Fisher
#> power1 = 0.46345
#> power2 = 0.46196
#> powerCoprimary = 0.297231
Interpretation:
- power1: Power for endpoint 1 alone
- power2: Power for endpoint 2 alone
- powerCoprimary: Exact power for both co-primary endpoints
Example 2: Sample Size Calculation
# Calculate required sample size using Boschloo's test
result_ss <- ss2BinaryExact(
  p11 = 0.70, p12 = 0.65,
  p21 = 0.50, p22 = 0.45,
  rho1 = 0.5, rho2 = 0.5,
  r = 1,
  alpha = 0.025,
  beta = 0.2,
  Test = "Boschloo"
)
print(result_ss)
#>
#> Sample size calculation for two binary co-primary endpoints
#>
#> n1 = 120
#> n2 = 120
#> N = 240
#> p (group 1) = 0.7, 0.65
#> p (group 2) = 0.5, 0.45
#> rho = 0.5, 0.5
#> allocation = 1
#> alpha = 0.025
#> beta = 0.2
#> Test = Boschloo
Example 3: Comparison of Test Methods
# Compare different exact test methods
test_methods <- c("Chisq", "Fisher", "Fisher-midP", "Z-pool", "Boschloo")
comparison <- lapply(test_methods, function(test) {
  result <- ss2BinaryExact(
    p11 = 0.50, p12 = 0.40,
    p21 = 0.20, p22 = 0.10,
    rho1 = 0.7, rho2 = 0.6,
    r = 1,
    alpha = 0.025,
    beta = 0.2,
    Test = test
  )
  data.frame(
    Test = test,
    n2 = result$n2,
    N = result$N
  )
})
comparison_table <- bind_rows(comparison)
kable(comparison_table,
      caption = "Sample Size Comparison Across Test Methods",
      col.names = c("Test Method", "n per group", "N total"))

| Test Method | n per group | N total |
|---|---|---|
| Chisq | 42 | 84 |
| Fisher | 49 | 98 |
| Fisher-midP | 43 | 86 |
| Z-pool | 43 | 86 |
| Boschloo | 43 | 86 |
Impact of Correlation
Example 4: Correlation Effect
# Calculate sample size for different correlation values
rho_values <- c(0, 0.3, 0.5, 0.8)
correlation_effect <- lapply(rho_values, function(rho) {
  result <- ss2BinaryExact(
    p11 = 0.70, p12 = 0.60,
    p21 = 0.40, p22 = 0.30,
    rho1 = rho, rho2 = rho,
    r = 1,
    alpha = 0.025,
    beta = 0.2,
    Test = "Fisher"
  )
  data.frame(
    rho = rho,
    n2 = result$n2,
    N = result$N
  )
})
rho_table <- bind_rows(correlation_effect)
kable(rho_table,
      caption = "Impact of Correlation on Sample Size (Fisher's Test)",
      col.names = c("ρ", "n per group", "N total"))

| ρ | n per group | N total |
|---|---|---|
| 0.0 | 61 | 122 |
| 0.3 | 60 | 120 |
| 0.5 | 59 | 118 |
| 0.8 | 56 | 112 |
Key finding: Higher positive correlation reduces the required sample size, here from N = 122 at ρ = 0 to N = 112 at ρ = 0.8.
Comparison: Exact vs Asymptotic
Example 5: Exact vs AN Method
# Exact method (Chisq)
exact_result <- ss2BinaryExact(
  p11 = 0.60, p12 = 0.40,
  p21 = 0.30, p22 = 0.10,
  rho1 = 0.5, rho2 = 0.5,
  r = 1,
  alpha = 0.025,
  beta = 0.1,
  Test = "Chisq"
)
# Asymptotic method (AN)
asymp_result <- ss2BinaryApprox(
  p11 = 0.60, p12 = 0.40,
  p21 = 0.30, p22 = 0.10,
  rho1 = 0.5, rho2 = 0.5,
  r = 1,
  alpha = 0.025,
  beta = 0.1,
  Test = "AN"
)
comparison_exact_asymp <- data.frame(
  Method = c("Exact (Chisq)", "Asymptotic (AN)"),
  n_per_group = c(exact_result$n2, asymp_result$n2),
  N_total = c(exact_result$N, asymp_result$N),
  Difference = c(0, asymp_result$N - exact_result$N)
)
kable(comparison_exact_asymp,
      caption = "Comparison: Exact vs Asymptotic Methods",
      col.names = c("Method", "n per group", "N total", "Difference"))

| Method | n per group | N total | Difference |
|---|---|---|---|
| Exact (Chisq) | 59 | 118 | 0 |
| Asymptotic (AN) | 60 | 120 | 2 |
Practical Recommendations
Test Method Selection
- Fisher’s exact test:
  - Most widely used and accepted
  - Conservative but guarantees Type I error control
  - Recommended for regulatory submissions
- Boschloo’s test:
  - Most powerful among exact tests
  - Best choice when computational resources permit
  - Recommended for final analysis
- Chi-squared test:
  - Less conservative than Fisher
  - May be anti-conservative for small samples
  - Use with caution when the sample size is small
- Z-pooled and Fisher-midP:
  - Intermediate between Fisher and chi-squared
  - Reduce conservatism while maintaining validity
When to Use Each Method
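Sample size guidelines:
- Small sample sizes: Always use exact methods
- Moderate sample sizes: Exact methods preferred, especially if:
  - Response probabilities are extreme (close to 0 or 1)
  - Strict Type I error control is required
- Large sample sizes with non-extreme response probabilities: Asymptotic methods acceptable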