library(CorOncoEndpoints)
set.seed(123)
Overview
The CorOncoEndpoints package provides tools for
generating correlated oncology endpoints in clinical trial simulations.
This vignette introduces the basic functionality and common use
cases.
Why CorOncoEndpoints?
In oncology clinical trials, we often need to simulate multiple correlated endpoints:
- Overall Survival (OS): Time from randomization to death
- Progression-Free Survival (PFS): Time from randomization to progression or death
- Objective Response (Response): Binary indicator of tumor response
These endpoints are not independent—they exhibit natural correlations. For example:
- Patients who respond to treatment tend to have longer survival times
- PFS is always ≤ OS, because death counts as a PFS event (a patient who dies without documented progression has PFS equal to OS)
- Response is associated with both OS and PFS
CorOncoEndpoints generates realistic simulated data that
preserves these correlation structures.
Basic Usage
Example 1: Generate OS and Response
Let’s start with a simple example generating correlated OS and Response data:
# Generate data for two groups
data1 <- rOncoEndpoints(
  nsim = 100,                         # 100 simulations
  group = c("Treatment", "Control"),  # Two treatment groups
  n = c(150, 150),                    # Sample size per group
  p = c(0.4, 0.3),                    # Response rates
  hazard_OS = c(0.05, 0.07),          # Hazard rates for OS
  rho_tte_resp = c(0.3, 0.2),         # Correlation between OS and Response
  copula = "Clayton"                  # Copula family
)
# View first few rows
head(data1)
#> simID Group OS Response
#> 1 1 Treatment 6.7816835 0
#> 2 1 Treatment 31.0521872 0
#> 3 1 Treatment 10.5180043 1
#> 4 1 Treatment 42.9146021 1
#> 5 1 Treatment 56.4245855 0
#> 6 1 Treatment 0.9325366 0
# Check dimensions
cat("Total observations:", nrow(data1), "\n")
#> Total observations: 30000
cat("Number of simulations:", length(unique(data1$simID)), "\n")
#> Number of simulations: 100
cat("Groups:", unique(data1$Group), "\n")
#> Groups: Treatment Control
Example 2: All Three Endpoints
Now let’s generate all three endpoints (OS, PFS, Response):
data2 <- rOncoEndpoints(
  nsim = 100,
  group = c("Experimental", "Standard"),
  n = c(200, 200),
  p = c(0.5, 0.35),
  hazard_OS = c(0.04, 0.06),
  hazard_PFS = c(0.08, 0.10),   # Note: hazard_PFS > hazard_OS
  rho_tte_resp = c(0.4, 0.25),  # Correlation between OS and Response
  copula = "Frank"
)
head(data2)
#> simID Group OS PFS Response
#> 1 1 Experimental 13.758314 13.7583143 1
#> 2 1 Experimental 38.484101 0.5779209 1
#> 3 1 Experimental 23.023374 23.0233739 1
#> 4 1 Experimental 2.137153 0.9313718 1
#> 5 1 Experimental 8.745343 6.6986000 0
#> 6 1 Experimental 10.339117 10.3391172 1
Important: When generating all three endpoints, you
specify the correlation between OS and Response
(rho_tte_resp). The correlation between PFS and Response is
automatically determined by the model structure.
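One common way to obtain a PFS that never exceeds OS is to draw a latent time to progression and take whichever comes first, progression or death. The base R sketch below illustrates that idea with independent exponential times. The variable `ttp` and the construction itself are illustrative assumptions, not a description of the internals of rOncoEndpoints(). It also suggests why the PFS hazard is set above the OS hazard: the difference is the hazard of the latent progression time and must be positive.

```r
set.seed(1)
n <- 1e5
hazard_OS  <- 0.04
hazard_PFS <- 0.08

# Death times, exponential with the OS hazard
os <- rexp(n, rate = hazard_OS)

# Hypothetical latent time to progression; its hazard is the part of the
# PFS hazard that is not already accounted for by death
ttp <- rexp(n, rate = hazard_PFS - hazard_OS)

# PFS is the earlier of progression and death, so PFS <= OS by construction
pfs <- pmin(os, ttp)

all(pfs <= os)    # TRUE
mean(pfs)         # close to 1 / hazard_PFS = 12.5
mean(pfs == os)   # share of deaths without prior progression, near 0.5
```

Under this construction, rows in which PFS equals OS would correspond to patients who died without a prior progression.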
Example 3: Verify Correlations
Let’s check the correlations in our simulated data:
# For the Experimental group in simulation 1
sim1_exp <- subset(data2, simID == 1 & Group == "Experimental")
# Correlation between OS and Response
cor_os_resp <- cor(sim1_exp$OS, sim1_exp$Response)
cat("Correlation (OS, Response):", round(cor_os_resp, 3), "\n")
#> Correlation (OS, Response): 0.423
# Correlation between PFS and Response
cor_pfs_resp <- cor(sim1_exp$PFS, sim1_exp$Response)
cat("Correlation (PFS, Response):", round(cor_pfs_resp, 3), "\n")
#> Correlation (PFS, Response): 0.322
# Correlation between OS and PFS
cor_os_pfs <- cor(sim1_exp$OS, sim1_exp$PFS)
cat("Correlation (OS, PFS):", round(cor_os_pfs, 3), "\n")
#> Correlation (OS, PFS): 0.54
# Verify PFS <= OS constraint
cat("All PFS <= OS?", all(sim1_exp$PFS <= sim1_exp$OS), "\n")
#> All PFS <= OS? TRUE
Validation of Simulation Results
Use CheckSimResults() to validate that your simulations
match theoretical values:
# Generate more simulations for better validation
data_val <- rOncoEndpoints(
  nsim = 1000,
  group = c("Treatment", "Control"),
  n = c(100, 100),
  p = c(0.4, 0.3),
  hazard_OS = c(0.05, 0.07),
  rho_tte_resp = c(0.3, 0.2),
  copula = "Clayton"
)
# Validate results
validation <- CheckSimResults(
  dataset = data_val,
  p = c(Treatment = 0.4, Control = 0.3),
  hazard_OS = c(Treatment = 0.05, Control = 0.07),
  rho_tte_resp = c(Treatment = 0.3, Control = 0.2),
  copula = "Clayton"
)
# Show results
print(validation, n = 20)
#> # A tibble: 8 × 10
#> Group Endpoint Empirical Theoretical Bias Relative_Bias SE MSE
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Treatment OS_Mean 20.0 20 0.0374 0.187 2.00 3.99
#> 2 Treatment OS_Medi… 14.0 13.9 0.0924 0.667 1.96 3.86
#> 3 Treatment Response 0.401 0.4 0.00112 0.280 0.0484 0.00235
#> 4 Treatment Cor_OS_… 0.308 0.3 0.00811 2.70 0.0972 0.00952
#> 5 Control OS_Mean 14.3 14.3 0.0229 0.161 1.45 2.11
#> 6 Control OS_Medi… 9.98 9.90 0.0734 0.741 1.43 2.05
#> 7 Control Response 0.296 0.3 -0.00355 -1.18 0.0455 0.00208
#> 8 Control Cor_OS_… 0.203 0.2 0.00317 1.59 0.0994 0.00990
#> # ℹ 2 more variables: RMSE <dbl>, Assessment <chr>
Interpretation:
- Bias: Should be close to 0 for unbiased methods
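- Relative_Bias: Bias as a percentage of the theoretical value; an absolute value below 5% is excellent, below 10% is acceptable
- SE: Standard deviation of the estimates across simulations (the empirical standard error)
- RMSE: Root mean squared error, which combines bias and variability into a single accuracy measure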
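These summary columns can be reproduced by hand for a single quantity. The base R sketch below does so for the mean OS of one arm, assuming exponential survival (so the theoretical mean is 1 / hazard) and the usual definitions of bias, SE, MSE, and RMSE. That CheckSimResults() uses exactly these definitions is an assumption, although they agree with the printed table, where MSE is approximately Bias^2 + SE^2.

```r
set.seed(42)
nsim   <- 1000   # number of simulated trials
n      <- 100    # patients per trial
hazard <- 0.05

theoretical <- 1 / hazard   # mean of an exponential distribution: 20

# One estimate (the sample mean of OS) per simulated trial
estimates <- replicate(nsim, mean(rexp(n, rate = hazard)))

empirical     <- mean(estimates)
bias          <- empirical - theoretical
relative_bias <- 100 * bias / theoretical   # in percent
se            <- sd(estimates)              # spread across simulations
mse           <- bias^2 + se^2
rmse          <- sqrt(mse)

round(c(Empirical = empirical, Bias = bias, Relative_Bias = relative_bias,
        SE = se, MSE = mse, RMSE = rmse), 3)
```

With 100 patients per trial the SE should land near 20 / sqrt(100) = 2, which matches the SE reported for OS_Mean in the Treatment group.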
Understanding Correlation Bounds
Not all correlations are feasible. Use
CorBoundResponseTTE() to check feasible ranges:
# For response probability = 0.4
bounds <- CorBoundResponseTTE(p = 0.4)
cat("Feasible correlation range:",
    round(bounds[1], 3), "to", round(bounds[2], 3), "\n")
#> Feasible correlation range: -0.626 to 0.748
# Try different response probabilities
p_values <- c(0.2, 0.4, 0.6, 0.8)
bounds_matrix <- sapply(p_values, CorBoundResponseTTE)
colnames(bounds_matrix) <- paste("p =", p_values)
rownames(bounds_matrix) <- c("Lower", "Upper")
print(round(bounds_matrix, 3))
#> p = 0.2 p = 0.4 p = 0.6 p = 0.8
#> Lower -0.446 -0.626 -0.748 -0.805
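#> Upper 0.805 0.748 0.626 0.446
Notice that the feasible range depends on the response probability, and that the bounds for p and 1 - p are mirror images of each other (compare p = 0.2 with p = 0.8).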
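The printed bounds are consistent with the Fréchet-Hoeffding limits for the Pearson correlation between a Bernoulli(p) response and an exponentially distributed time, which are attained when the two variables are perfectly co-monotone or counter-monotone. The sketch below evaluates those closed-form limits in base R. `frechet_bounds` is a hypothetical helper, and whether CorBoundResponseTTE() uses this formula internally is an assumption.

```r
# Hypothetical helper: correlation limits between a Bernoulli(p) response
# and an exponential time-to-event (the rate cancels out of the ratio)
frechet_bounds <- function(p) {
  s <- sqrt(p * (1 - p))                 # SD of the binary response
  c(Lower = (1 - p) * log(1 - p) / s,    # responders have the shortest times
    Upper = -p * log(p) / s)             # responders have the longest times
}

round(frechet_bounds(0.4), 3)   # Lower -0.626, Upper 0.748

# Same grid of response probabilities as above
round(sapply(c(0.2, 0.4, 0.6, 0.8), frechet_bounds), 3)
```

Because the limits depend only on p, a target correlation outside them cannot be reached by any copula, whatever the hazard.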
Copula Families
Clayton Copula
- Exhibits lower tail dependence
- Cannot model negative correlations (rho > 0 only)
- Good for survival data where a poor outcome on one endpoint tends to coincide with poor outcomes on the others
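Lower tail dependence can be seen directly by sampling from a Clayton copula. The base R sketch below uses the standard conditional-inversion sampler and then maps the uniforms to an exponential OS and a binary response. The helper name `rclayton`, the value theta = 2, and the mapping to endpoints are illustrative assumptions; how rOncoEndpoints() translates rho_tte_resp into a copula parameter is not covered here.

```r
# Hypothetical helper: draw n pairs (u, v) from a Clayton copula, theta > 0
rclayton <- function(n, theta) {
  u <- runif(n)
  w <- runif(n)
  # Invert the conditional distribution of v given u
  v <- (u^(-theta) * (w^(-theta / (1 + theta)) - 1) + 1)^(-1 / theta)
  cbind(u = u, v = v)
}

set.seed(2024)
uv <- rclayton(1e5, theta = 2)

# Joint extremes are far more frequent in the lower tail than in the upper
lower_tail <- mean(uv[, "u"] < 0.05 & uv[, "v"] < 0.05) / 0.05
upper_tail <- mean(uv[, "u"] > 0.95 & uv[, "v"] > 0.95) / 0.05
c(lower = lower_tail, upper = upper_tail)

# Map the uniforms to endpoints: exponential OS and a Bernoulli(0.4) response
os       <- qexp(uv[, "u"], rate = 0.05)
response <- as.integer(uv[, "v"] > 1 - 0.4)
cor(os, response)   # positive: responders tend to live longer
```

In this mapping a small u means a short survival time and a small v means no response, so the clustering in the lower tail corresponds to poor outcomes occurring together.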
# Clayton copula example
data_clayton <- rOncoEndpoints(
  nsim = 100,
  n = 100,
  p = 0.4,
  hazard_OS = 0.05,
  rho_tte_resp = 0.3,
  copula = "Clayton"
)
head(data_clayton)
#> simID Group OS Response
#> 1 1 Group1 0.9734742 0
#> 2 1 Group1 1.7577495 0
#> 3 1 Group1 16.7348480 1
#> 4 1 Group1 43.1064927 1
#> 5 1 Group1 21.5900391 0
#> 6 1 Group1 61.0736650 0
Frank Copula
- Flexible for both positive and negative correlations
- Symmetric tail behavior
- More general choice
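A Frank copula accepts a parameter of either sign, which is what allows negative dependence. The sketch below samples from it by conditional inversion in base R. `rfrank` and the values theta = -3 and theta = 3 are illustrative assumptions and are not tied to how the package parameterises rho_tte_resp.

```r
# Hypothetical helper: draw n pairs (u, v) from a Frank copula, theta != 0
rfrank <- function(n, theta) {
  u <- runif(n)
  w <- runif(n)
  # Invert the conditional distribution of v given u
  v <- -log(1 + w * (exp(-theta) - 1) /
              (w + (1 - w) * exp(-theta * u))) / theta
  cbind(u = u, v = v)
}

set.seed(7)
neg <- rfrank(1e4, theta = -3)   # negative theta: negative dependence
pos <- rfrank(1e4, theta =  3)   # positive theta: positive dependence

# Rank correlations have opposite signs and similar magnitude
c(negative = cor(neg[, "u"], neg[, "v"], method = "spearman"),
  positive = cor(pos[, "u"], pos[, "v"], method = "spearman"))
```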
# Frank copula with negative correlation
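bounds_neg <- CorBoundResponseTTE(p = 0.4)
rho_negative <- -0.2
# The target correlation must lie within the feasible bounds
stopifnot(rho_negative >= bounds_neg[1], rho_negative <= bounds_neg[2])
data_frank <- rOncoEndpoints(
  nsim = 100,
  n = 100,
  p = 0.4,
  hazard_OS = 0.05,
  rho_tte_resp = rho_negative,
  copula = "Frank"
)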
# Check negative correlation
cor(data_frank[data_frank$simID == 1, ]$OS,
    data_frank[data_frank$simID == 1, ]$Response)
#> [1] -0.194624
Common Use Cases
Use Case 1: Power Analysis
Generate data under different scenarios to estimate statistical power:
# Scenario: Treatment vs Control
scenarios <- expand.grid(
  hazard_ratio = c(0.7, 0.8, 0.9),
  response_diff = c(0.1, 0.15, 0.2),
  correlation = c(0.2, 0.3, 0.4)
)
# For each scenario, generate data and calculate power
# (Example code structure - not run)
Use Case 2: Sample Size Calculation
Determine required sample size for detecting treatment effects:
# Generate data with different sample sizes
n_values <- c(50, 100, 150, 200)
# For each n, simulate trials and calculate detection rates
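# A minimal base R sketch of that loop, under simplifying assumptions that are
# not part of the package: exponential OS without censoring, 1:1 allocation,
# and an exact F test on the ratio of mean survival times. The helper name
# detection_rate and all default values are hypothetical.
detection_rate <- function(n, hazard_ctrl = 0.07, hazard_ratio = 0.7,
                           nsim = 500, alpha = 0.05) {
  p_values <- replicate(nsim, {
    trt  <- rexp(n, rate = hazard_ctrl * hazard_ratio)
    ctrl <- rexp(n, rate = hazard_ctrl)
    # Under equal hazards, mean(trt) / mean(ctrl) follows F(2n, 2n)
    f <- mean(trt) / mean(ctrl)
    2 * min(pf(f, 2 * n, 2 * n), pf(f, 2 * n, 2 * n, lower.tail = FALSE))
  })
  mean(p_values < alpha)   # share of simulated trials that reject
}
sapply(c(50, 100, 150, 200), detection_rate)   # one rate per sample size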
# (Example code structure - not run)
Summary
This vignette introduced the basic functionality of
CorOncoEndpoints:
- Generate correlated endpoints with rOncoEndpoints()
- Validate simulations with CheckSimResults()
- Check feasible correlations with CorBoundResponseTTE()
- Choose appropriate copulas (Clayton for positive only, Frank for both)
For more advanced usage, see the “Advanced Usage and Examples” vignette. For theoretical details, see the “Theoretical Background” vignette.
Next Steps
- Explore the advanced-usage vignette for complex scenarios
- Read the theoretical-background vignette to understand the mathematical framework
- Check function documentation with ?rOncoEndpoints, ?CheckSimResults, etc.