
Cumulative Distribution Function of the Difference of Two Independent t-Distributed Variables via Monte Carlo Simulation
Source:R/ptdiff_MC.R
ptdiff_MC.RdCalculates the cumulative distribution function (CDF) of the difference between two independent non-standardised t-distributed random variables using Monte Carlo simulation. Specifically, computes \(P(T_t - T_c \le q)\) or \(P(T_t - T_c > q)\), where \(T_k \sim t(\mu_k, \sigma_k^2, \nu_k)\) for \(k \in \{t, c\}\).
Arguments
- nMC
A positive integer giving the number of Monte Carlo draws. Typical values range from 10,000 (quick estimates) to 100,000 or more (high precision). Larger values reduce Monte Carlo error at the cost of computation time.
- q
A numeric scalar representing the quantile threshold.
- mu_t
A numeric scalar or vector giving the location parameter of the t-distribution for the treatment group.
- mu_c
A numeric scalar or vector giving the location parameter of the t-distribution for the control group.
- sd_t
A positive numeric scalar or vector giving the scale parameter of the t-distribution for the treatment group.
- sd_c
A positive numeric scalar or vector giving the scale parameter of the t-distribution for the control group.
- nu_t
A numeric scalar giving the degrees of freedom of the t-distribution for the treatment group. Must be greater than 2 for finite variance.
- nu_c
A numeric scalar giving the degrees of freedom of the t-distribution for the control group. Must be greater than 2 for finite variance.
- lower.tail
A logical scalar; if
TRUE(default), the function returns \(P(T_t - T_c \le q)\), otherwise \(P(T_t - T_c > q)\).
Value
A numeric scalar or vector in [0, 1]. When mu_t,
mu_c, sd_t, or sd_c are vectors of length
\(n\), a vector of length \(n\) is returned.
Details
The algorithm proceeds as follows:
Generate an
nMC x nmatrix of standard t draws for \(T_t\) (\(\nu_t\) degrees of freedom), then scale and shift each column bysd_t[i]andmu_t[i].Repeat for \(T_c\) with \(\nu_c\) degrees of freedom.
Compute the
nMC x ndifference matrix \(D = T_t - T_c\).Return
colMeans(D > q)as the estimated \(P(T_t - T_c > q)\) for each parameter set.
All operations are matrix-based (no R-level loop over parameter sets), so
performance scales well with \(n\). However, note that the matrix size
is nMC x n, so memory usage grows linearly with both nMC
and \(n\). When \(n\) is large (e.g., nsim x n_scenarios in
pbayesdecisionprob1cont), memory requirements can become
prohibitive; in such cases prefer CalcMethod = 'MM'.
Monte Carlo error is approximately \(\sqrt{p(1 - p) / \mathrm{nMC}}\); near \(p = 0.5\) this is roughly \(0.5 / \sqrt{\mathrm{nMC}}\).
Examples
# P(T_t - T_c > 3) with equal parameters
ptdiff_MC(nMC = 1e5, q = 3, mu_t = 2, mu_c = 0, sd_t = 1, sd_c = 1,
nu_t = 17, nu_c = 17, lower.tail = FALSE)
#> [1] 0.249
# P(T_t - T_c > 1) with unequal scales
ptdiff_MC(nMC = 1e5, q = 1, mu_t = 5, mu_c = 3, sd_t = 2, sd_c = 1.5,
nu_t = 10, nu_c = 15, lower.tail = FALSE)
#> [1] 0.64637
# P(T_t - T_c > 0) with different degrees of freedom
ptdiff_MC(nMC = 1e5, q = 0, mu_t = 1, mu_c = 1, sd_t = 1, sd_c = 1,
nu_t = 5, nu_c = 20, lower.tail = FALSE)
#> [1] 0.50184
# Lower tail: P(T_t - T_c <= 2)
ptdiff_MC(nMC = 1e5, q = 2, mu_t = 3, mu_c = 0, sd_t = 1.5, sd_c = 1.2,
nu_t = 12, nu_c = 15, lower.tail = TRUE)
#> [1] 0.31276
# Vectorised usage
ptdiff_MC(nMC = 1e5, q = 1, mu_t = c(2, 3, 4), mu_c = c(0, 1, 2),
sd_t = c(1, 1.2, 1.5), sd_c = c(1, 1.1, 1.3),
nu_t = 10, nu_c = 10, lower.tail = FALSE)
#> [1] 0.74568 0.71653 0.68132