Generate Random Samples from a Dirichlet Distribution

Generates random samples from a Dirichlet distribution using the Gamma representation: if $Y_i \sim \mathrm{Gamma}(\alpha_i, 1)$ independently for $i = 1, \ldots, K$, then $(Y_1 / S, \ldots, Y_K / S) \sim \mathrm{Dirichlet}(\alpha_1, \ldots, \alpha_K)$, where $S = \sum_{i=1}^{K} Y_i$.

Usage

rdirichlet(n, alpha)

Arguments

n: A positive integer specifying the number of random vectors to generate.
alpha: A numeric vector of length $K \ge 2$ containing positive concentration parameters of the Dirichlet distribution. All elements must be strictly positive.

Value

A numeric matrix of dimensions n x K where each row is one random draw from the Dirichlet distribution, with all elements in [0, 1] and each row summing to 1. When n = 1, a numeric vector of length K is returned.

Details

The Dirichlet distribution is a multivariate generalisation of the Beta distribution and is commonly used as a conjugate prior for multinomial proportions in Bayesian statistics.

The probability density function is: $$f(x_1, \ldots, x_K) = \frac{\Gamma\!\left(\sum_{i=1}^{K} \alpha_i\right)} {\prod_{i=1}^{K} \Gamma(\alpha_i)} \prod_{i=1}^{K} x_i^{\alpha_i - 1}$$ where $x_i > 0$ and $\sum_{i=1}^{K} x_i = 1$.

Key properties:

Each marginal follows a Beta distribution: $X_i \sim \mathrm{Beta}\!\left(\alpha_i,\, \sum_{l \neq i} \alpha_l\right)$.
$E[X_i] = \alpha_i / \sum_{l=1}^{K} \alpha_l$.
Components are negatively correlated unless one component dominates.

Implementation steps:

Generate independent $Y_i \sim \mathrm{Gamma}(\alpha_i, 1)$ for each $i = 1, \ldots, K$.
Normalise: $X_i = Y_i / \sum_{l=1}^{K} Y_l$.

Examples

# Example 1: Generate 5 samples from Dirichlet(1, 1, 1) - uniform on simplex
samples <- rdirichlet(5, c(1, 1, 1))
print(samples)
#>             [,1]       [,2]       [,3]
#> [1,] 0.004092311 0.55270422 0.44320347
#> [2,] 0.030488642 0.91940609 0.05010527
#> [3,] 0.071334047 0.49724604 0.43141991
#> [4,] 0.671113146 0.09846796 0.23041889
#> [5,] 0.317548254 0.14868130 0.53377044
rowSums(samples)  # Each row should sum to 1
#> [1] 1 1 1 1 1

# Example 2: Generate samples with unequal concentrations
samples <- rdirichlet(1000, c(2, 5, 3))
colMeans(samples)  # Expected values: approximately c(0.2, 0.5, 0.3)
#> [1] 0.1964174 0.4976209 0.3059617

# Example 3: Sparse Dirichlet (small alpha values)
samples <- rdirichlet(100, c(0.1, 0.1, 0.1, 0.1))
head(samples)  # Most weight concentrated on one component
#>              [,1]       [,2]         [,3]         [,4]
#> [1,] 6.049788e-01 0.39501328 4.733303e-06 3.185047e-06
#> [2,] 1.570795e-02 0.97414402 8.258922e-07 1.014720e-02
#> [3,] 1.303292e-07 0.99970420 2.915372e-04 4.137287e-06
#> [4,] 4.750049e-07 0.05348021 9.465184e-01 8.640396e-07
#> [5,] 4.092988e-01 0.58949822 1.200735e-03 2.205456e-06
#> [6,] 1.213324e-04 0.99941858 3.247636e-04 1.353210e-04

# Example 4: Concentrated Dirichlet (large alpha values)
samples <- rdirichlet(100, c(100, 100, 100))
colMeans(samples)  # Concentrated around c(1/3, 1/3, 1/3)
#> [1] 0.3358933 0.3295870 0.3345198

# Example 5: Bayesian update with Jeffreys prior for 4 categories
prior_alpha     <- c(0.5, 0.5, 0.5, 0.5)
observed_counts <- c(10, 5, 8, 7)
posterior_samples <- rdirichlet(1000, prior_alpha + observed_counts)
colMeans(posterior_samples)  # Posterior mean
#> [1] 0.3270649 0.1735726 0.2658122 0.2335504