Simulation-based calibration checking

About

This vignette shows the results of a simulation-based calibration (SBC) checking study to validate the implementation of the models in brms.mmrm. SBC checking tests the ability of a Bayesian model to recapture the parameters used to simulate prior predictive data. For details on SBC checking, please read Modrák et al. (2024) and the SBC R package (Kim et al. 2022). This particular SBC checking study uses the targets pipeline in the viggnettes/sbc/ subdirectory of the brms.mmrm package source code.

Conclusion

From the results below, the SBC rank statistics are approximately uniformly distributed. In other words, the posterior distribution from the brms/Stan MMRM modeling code matches the prior from which the datasets were simulated. This is evidence that both the subgroup and non-subgroup models in brms.mmrm are implemented correctly.

Setup

To show the SBC checking results in this vignette, we first load code from the SBC checking study, and we use custom functions to read and plot rank statistics:

library(dplyr)
library(ggplot2)
library(tibble)
library(tidyr)

source("sbc/R/prior.R")
source("sbc/R/response.R")
source("sbc/R/scenarios.R")

read_ranks <- function(path) {
  fst::read_fst(path) |>
    tibble::as_tibble() |>
    pivot_longer(
      cols = everything(),
      names_to = "parameter",
      values_to = "rank"
    )
}

plot_ranks <- function(ranks) {
  ggplot(ranks) +
    geom_histogram(
      aes(x = rank),
      breaks = seq(from = 0, to = max(ranks$rank), length.out = 10)
    ) +
    facet_wrap(~parameter)
}

Each section below is its own SBC checking study based on a given modeling scenario. Each scenario shows results from 1000 independent simulations from the prior.

Subgroup scenario

The subgroup scenario distinguishes itself from the others by the presence of a subgroup factor. Assumptions:

  • 2 treatment groups
  • 2 subgroup levels
  • 3 time points
  • 150 patients per treatment group
  • Baseline covariates (2 continuous and 2 categorical)
  • 30% dropout
  • 8% rate of independent/sporadic missing response.

Model formula:

#> response ~ group + group:subgroup + group:subgroup:time + group:time + subgroup + subgroup:time + time + continuous1 + continuous2 + balanced + unbalanced + unstr(time = time, gr = patient) 
#> sigma ~ 0 + time

The prior was randomly generated and used for both simulation and analysis:

setup_prior(subgroup) |>
  select(prior, class, coef, dpar) |>
  as.data.frame()
#>                      prior     class                                       coef
#> 1   normal(0.1097, 1.4716) Intercept                                           
#> 3  normal(-0.0527, 2.5388)         b                             balancedlevel2
#> 4  normal(-0.2422, 0.9587)         b                             balancedlevel3
#> 5  normal(-0.0307, 1.0234)         b                                continuous1
#> 6   normal(-0.044, 0.7195)         b                                continuous2
#> 7    normal(0.0489, 1.229)         b                               groupgroup_2
#> 8  normal(-0.0804, 0.3929)         b            groupgroup_2:subgroupsubgroup_2
#> 9  normal(-0.0719, 2.6758)         b groupgroup_2:subgroupsubgroup_2:timetime_2
#> 10   normal(0.049, 1.6262)         b groupgroup_2:subgroupsubgroup_2:timetime_3
#> 11 normal(-0.0724, 1.3682)         b                    groupgroup_2:timetime_2
#> 12  normal(0.1503, 0.2546)         b                    groupgroup_2:timetime_3
#> 13 normal(-0.1619, 0.3869)         b                         subgroupsubgroup_2
#> 14  normal(0.1422, 0.3832)         b              subgroupsubgroup_2:timetime_2
#> 15   normal(-0.078, 1.678)         b              subgroupsubgroup_2:timetime_3
#> 16  normal(0.2474, 0.3863)         b                                 timetime_2
#> 17 normal(-0.1646, 2.0886)         b                                 timetime_3
#> 18 normal(-0.2129, 0.4422)         b                           unbalancedlevel2
#> 19  normal(0.0655, 2.0917)         b                           unbalancedlevel3
#> 20             lkj(1.2079)   cortime                                           
#> 22  normal(0.1745, 2.1779)         b                                 timetime_1
#> 23 normal(-0.1817, 0.5107)         b                                 timetime_2
#> 24  normal(0.2083, 2.1871)         b                                 timetime_3
#>     dpar
#> 1       
#> 3       
#> 4       
#> 5       
#> 6       
#> 7       
#> 8       
#> 9       
#> 10      
#> 11      
#> 12      
#> 13      
#> 14      
#> 15      
#> 16      
#> 17      
#> 18      
#> 19      
#> 20      
#> 22 sigma
#> 23 sigma
#> 24 sigma

The following histograms show the SBC rank statistics which compare the prior parameter draws draws to the posterior draws. If the data simulation code and modeling code are both correct and consistent, then the rank statistics should be approximately uniformly distributed.

ranks_subgroup <- read_ranks("sbc/results/subgroup.fst")

Fixed effect parameter ranks:

ranks_subgroup |>
  filter(grepl("^b_", parameter)) |>
  filter(!grepl("^b_sigma", parameter)) |>
  plot_ranks()

Standard deviation parameter ranks:

ranks_subgroup |>
  filter(grepl("b_sigma", parameter)) |>
  plot_ranks()

Correlation parameter ranks:

ranks_subgroup |>
  filter(grepl("cortime_", parameter)) |>
  plot_ranks()

Unstructured scenario

This scenario uses unstructured correlation and does not use a subgroup variable. Assumptions:

  • 3 treatment groups
  • No subgroup
  • 4 time points
  • 100 patients per treatment group
  • No covariate adjustment
  • No missing responses

Model formula:

#> response ~ 0 + group + time + unstr(time = time, gr = patient) 
#> sigma ~ 0 + time

The prior was randomly generated and used for both simulation and analysis:

setup_prior(unstructured) |>
  select(prior, class, coef, dpar) |>
  as.data.frame()
#>                      prior   class         coef  dpar
#> 2   normal(-0.1211, 0.617)       b groupgroup_1      
#> 3   normal(0.2158, 2.1803)       b groupgroup_2      
#> 4   normal(0.1576, 2.8512)       b groupgroup_3      
#> 5   normal(0.1158, 1.3975)       b   timetime_2      
#> 6   normal(0.2129, 0.2934)       b   timetime_3      
#> 7  normal(-0.0098, 1.4783)       b   timetime_4      
#> 8              lkj(1.2257) cortime                   
#> 10  normal(-0.1186, 0.952)       b   timetime_1 sigma
#> 11  normal(0.1117, 1.6101)       b   timetime_2 sigma
#> 12 normal(-0.1213, 1.9628)       b   timetime_3 sigma
#> 13 normal(-0.2103, 0.9874)       b   timetime_4 sigma

SBC checking rank statistics:

ranks_unstructured <- read_ranks("sbc/results/unstructured.fst")

Fixed effect parameter ranks:

ranks_unstructured |>
  filter(grepl("^b_", parameter)) |>
  filter(!grepl("^b_sigma", parameter)) |>
  plot_ranks()

Log-scale standard deviation parameter ranks:

ranks_unstructured |>
  filter(grepl("b_sigma", parameter)) |>
  plot_ranks()

Correlation parameter ranks:

ranks_unstructured |>
  filter(grepl("cortime_", parameter)) |>
  plot_ranks()

Autoregressive moving average scenario

This scenario uses an autoregressive moving average (ARMA) model with autoregressive order 1 and moving average order 1. Assumptions:

  • 2 treatment groups
  • No subgroup
  • 3 time points
  • 100 patients per treatment group
  • No covariate adjustment
  • No missing responses

Model formula:

#> response ~ 0 + group + time + arma(time = time, gr = patient, p = 1L, q = 1L, cov = FALSE) 
#> sigma ~ 0 + time

The prior was randomly generated and used for both simulation and analysis:

setup_prior(autoregressive_moving_average) |>
  select(prior, class, coef, dpar) |>
  as.data.frame()
#>                      prior class         coef  dpar
#> 1        uniform(0.1, 0.9)    ar                   
#> 3   normal(0.1469, 0.4041)     b groupgroup_1      
#> 4   normal(0.2012, 2.5269)     b groupgroup_2      
#> 5  normal(-0.0896, 2.1888)     b   timetime_2      
#> 6  normal(-0.1274, 1.9056)     b   timetime_3      
#> 7        uniform(0.1, 0.9)    ma                   
#> 9   normal(-0.186, 1.2316)     b   timetime_1 sigma
#> 10 normal(-0.2477, 1.9945)     b   timetime_2 sigma
#> 11  normal(0.0514, 2.4463)     b   timetime_3 sigma

SBC checking rank statistics:

ranks_autoregressive_moving_average <- read_ranks(
  "sbc/results/autoregressive_moving_average.fst"
)

Fixed effect parameter ranks:

ranks_autoregressive_moving_average |>
  filter(grepl("^b_", parameter)) |>
  filter(!grepl("^b_sigma", parameter)) |>
  plot_ranks()

Log-scale standard deviation parameter ranks:

ranks_autoregressive_moving_average |>
  filter(grepl("b_sigma", parameter)) |>
  plot_ranks()

Correlation parameter ranks:

ranks_autoregressive_moving_average |>
  filter(parameter %in% c("ar[1]", "ma[1]")) |>
  plot_ranks()

Autoregressive scenario

This scenario is the same as above, but the correlation structure is autoregressive with order 2. Model formula:

#> response ~ 0 + group + time + ar(time = time, gr = patient, p = 2L, cov = FALSE) 
#> sigma ~ 0 + time

The prior was randomly generated and used for both simulation and analysis:

setup_prior(autoregressive) |>
  select(prior, class, coef, dpar) |>
  as.data.frame()
#>                      prior class         coef  dpar
#> 1        uniform(0.1, 0.9)    ar                   
#> 3   normal(0.1916, 0.8145)     b groupgroup_1      
#> 4  normal(-0.0271, 1.9418)     b groupgroup_2      
#> 5  normal(-0.0177, 1.7237)     b   timetime_2      
#> 6    normal(0.038, 1.7802)     b   timetime_3      
#> 8   normal(0.1238, 0.7551)     b   timetime_1 sigma
#> 9  normal(-0.1352, 0.4516)     b   timetime_2 sigma
#> 10  normal(0.1657, 0.9257)     b   timetime_3 sigma

SBC checking rank statistics:

ranks_autoregressive <- read_ranks("sbc/results/autoregressive.fst")

Fixed effect parameter ranks:

ranks_autoregressive |>
  filter(grepl("^b_", parameter)) |>
  filter(!grepl("^b_sigma", parameter)) |>
  plot_ranks()

Log-scale standard deviation parameter ranks:

ranks_autoregressive |>
  filter(grepl("b_sigma", parameter)) |>
  plot_ranks()

Correlation parameter ranks:

ranks_autoregressive |>
  filter(parameter %in% c("ar[1]", "ar[2]")) |>
  plot_ranks()

Moving average scenario

This scenario is the same as above, but it uses a moving average correlation structure with order 2. Model formula:

#> response ~ 0 + group + time + ma(time = time, gr = patient, q = 2L, cov = FALSE) 
#> sigma ~ 0 + time

The prior was randomly generated and used for both simulation and analysis:

setup_prior(moving_average) |>
  select(prior, class, coef, dpar) |>
  as.data.frame()
#>                      prior class         coef  dpar
#> 2  normal(-0.1701, 1.8751)     b groupgroup_1      
#> 3  normal(-0.2394, 1.2411)     b groupgroup_2      
#> 4  normal(-0.0211, 2.2069)     b   timetime_2      
#> 5   normal(0.0679, 1.0899)     b   timetime_3      
#> 6        uniform(0.1, 0.9)    ma                   
#> 8    normal(0.012, 2.8984)     b   timetime_1 sigma
#> 9  normal(-0.1521, 2.0592)     b   timetime_2 sigma
#> 10 normal(-0.1642, 0.8724)     b   timetime_3 sigma

SBC checking rank statistics:

ranks_moving_average <- read_ranks("sbc/results/moving_average.fst")

Fixed effect parameter ranks:

ranks_moving_average |>
  filter(grepl("^b_", parameter)) |>
  filter(!grepl("^b_sigma", parameter)) |>
  plot_ranks()

Log-scale standard deviation parameter ranks:

ranks_moving_average |>
  filter(grepl("b_sigma", parameter)) |>
  plot_ranks()

Correlation parameter ranks:

ranks_moving_average |>
  filter(parameter %in% c("ma[1]", "ma[2]")) |>
  plot_ranks()

Compound symmetry scenario

This scenario is the same as above, but it uses a compound symmetry correlation structure. Model formula:

#> response ~ 0 + group + time + cosy(time = time, gr = patient) 
#> sigma ~ 0 + time

The prior was randomly generated and used for both simulation and analysis:

setup_prior(compound_symmetry) |>
  select(prior, class, coef, dpar) |>
  as.data.frame()
#>                      prior class         coef  dpar
#> 2  normal(-0.2101, 1.5892)     b groupgroup_1      
#> 3   normal(0.0507, 2.7596)     b groupgroup_2      
#> 4    normal(0.0575, 2.969)     b   timetime_2      
#> 5    normal(0.0698, 0.429)     b   timetime_3      
#> 6        uniform(0.1, 0.9)  cosy                   
#> 8  normal(-0.0667, 2.6089)     b   timetime_1 sigma
#> 9   normal(0.1108, 2.3325)     b   timetime_2 sigma
#> 10  normal(-0.243, 1.9243)     b   timetime_3 sigma

SBC checking rank statistics:

ranks_compound_symmetry <- read_ranks("sbc/results/compound_symmetry.fst")

Fixed effect parameter ranks:

ranks_compound_symmetry |>
  filter(grepl("^b_", parameter)) |>
  filter(!grepl("^b_sigma", parameter)) |>
  plot_ranks()

Log-scale standard deviation parameter ranks:

ranks_compound_symmetry |>
  filter(grepl("b_sigma", parameter)) |>
  plot_ranks()

Correlation parameter ranks:

ranks_compound_symmetry |>
  filter(parameter == "cosy") |>
  plot_ranks()

Diagonal scenario

This scenario is the same as above, but it uses a diagonal correlation structure (independent time points within patients). Model formula:

#> response ~ 0 + group + time 
#> sigma ~ group + group:time + time

The prior was randomly generated and used for both simulation and analysis:

setup_prior(diagonal) |>
  select(prior, class, coef, dpar) |>
  as.data.frame()
#>                      prior     class                    coef  dpar
#> 2  normal(-0.2072, 1.7943)         b            groupgroup_1      
#> 3  normal(-0.0777, 2.0291)         b            groupgroup_2      
#> 4   normal(0.2352, 1.1269)         b              timetime_2      
#> 5  normal(-0.1475, 1.7343)         b              timetime_3      
#> 6  normal(-0.1284, 2.1209) Intercept                         sigma
#> 8   normal(-0.1719, 0.254)         b            groupgroup_2 sigma
#> 9  normal(-0.1298, 2.4546)         b groupgroup_2:timetime_2 sigma
#> 10  normal(-0.233, 1.0706)         b groupgroup_2:timetime_3 sigma
#> 11 normal(-0.0093, 0.5344)         b              timetime_2 sigma
#> 12  normal(0.0107, 1.8822)         b              timetime_3 sigma

SBC checking rank statistics:

ranks_diagonal <- read_ranks("sbc/results/diagonal.fst")

Fixed effect parameter ranks:

ranks_diagonal |>
  filter(grepl("^b_", parameter)) |>
  filter(!grepl("^b_sigma", parameter)) |>
  plot_ranks()

Log-scale standard deviation parameter ranks:

ranks_diagonal |>
  filter(grepl("b_sigma", parameter)) |>
  plot_ranks()

References

Kim, S., Moon, H., Modrák, M., and Säilynoja, T. (2022), SBC: Simulation based calibration for rstan/cmdstanr models.
Modrák, M., Moon, A. H., Kim, S., Bürkner, P., Huurre, N., Faltejsková, K., Gelman, A., and Vehtari, A. (2024), Simulation-Based Calibration Checking for Bayesian Computation: The Choice of Test Quantities Shapes Sensitivity,” Bayesian Analysis, forthcoming. https://doi.org/10.1214/23-BA1404.