| Title: | Mixed Models for Repeated Measures |
|---|---|
| Description: | Mixed models for repeated measures (MMRM) are a popular choice for analyzing longitudinal continuous outcomes in randomized clinical trials and beyond; see Cnaan, Laird and Slasor (1997) <doi:10.1002/(SICI)1097-0258(19971030)16:20%3C2349::AID-SIM667%3E3.0.CO;2-E> for a tutorial and Mallinckrodt, Lane, Schnell, Peng and Mancuso (2008) <doi:10.1177/009286150804200402> for a review. This package implements MMRM based on the marginal linear model without random effects using Template Model Builder ('TMB') which enables fast and robust model fitting. Users can specify a variety of covariance matrices, weight observations, fit models with restricted or standard maximum likelihood inference, perform hypothesis testing with Satterthwaite or Kenward-Roger adjustment, and extract least square means estimates by using 'emmeans'. |
| Authors: | Daniel Sabanes Bove [aut, cre] (ORCID: <https://orcid.org/0000-0002-0176-9239>), Liming Li [aut] (ORCID: <https://orcid.org/0009-0008-6870-0878>), Julia Dedic [aut], Doug Kelkhoff [aut], Kevin Kunzmann [aut], Brian Matthew Lang [aut], Christian Stock [aut], Ya Wang [aut], Craig Gower-Page [ctb], Dan James [aut], Jonathan Sidi [aut], Daniel Leibovitz [aut], Daniel D. Sjoberg [aut] (ORCID: <https://orcid.org/0000-0003-0862-2018>), Nikolas Ivan Krieger [aut] (ORCID: <https://orcid.org/0000-0002-4581-3545>), Lukas A. Widmer [ctb] (ORCID: <https://orcid.org/0000-0003-1471-3493>), Boehringer Ingelheim Ltd. [cph, fnd], Gilead Sciences, Inc. [cph, fnd], F. Hoffmann-La Roche AG [cph, fnd], Merck Sharp & Dohme, Inc. [cph, fnd], AstraZeneca plc [cph, fnd], inferential.biostatistics GmbH [cph, fnd] |
| Maintainer: | Daniel Sabanes Bove <[email protected]> |
| License: | Apache License 2.0 |
| Version: | 0.3.17.9000 |
| Built: | 2026-05-27 06:11:47 UTC |
| Source: | https://github.com/openpharma/mmrm |
mmrm Packagemmrm implements mixed models with repeated measures (MMRM) in R.
Maintainer: Daniel Sabanes Bove [email protected] (ORCID)
Authors:
Liming Li [email protected] (ORCID)
Julia Dedic [email protected]
Doug Kelkhoff [email protected]
Kevin Kunzmann [email protected]
Brian Matthew Lang [email protected]
Christian Stock [email protected]
Ya Wang [email protected]
Dan James [email protected]
Jonathan Sidi [email protected]
Daniel Leibovitz [email protected]
Daniel D. Sjoberg [email protected] (ORCID)
Nikolas Ivan Krieger [email protected] (ORCID)
Other contributors:
Craig Gower-Page [email protected] [contributor]
Lukas A. Widmer (ORCID) [contributor]
Boehringer Ingelheim Ltd. [copyright holder, funder]
Gilead Sciences, Inc. [copyright holder, funder]
F. Hoffmann-La Roche AG [copyright holder, funder]
Merck Sharp & Dohme, Inc. [copyright holder, funder]
AstraZeneca plc [copyright holder, funder]
inferential.biostatistics GmbH [copyright holder, funder]
Useful links:
as.cov_struct(x, ...) ## S3 method for class 'formula' as.cov_struct(x, warn_partial = TRUE, ...)as.cov_struct(x, ...) ## S3 method for class 'formula' as.cov_struct(x, warn_partial = TRUE, ...)
x |
an object from which to derive a covariance structure. See object specific sections for details. |
... |
additional arguments unused. |
warn_partial |
( |
A covariance structure can be parsed from a model definition formula or call. Generally, covariance structures defined using non-standard evaluation take the following form:
type( (visit, )* visit | (group /)? subject )
For example, formulas may include terms such as
us(time | subject) cp(time | group / subject) sp_exp(coord1, coord2 | group / subject)
Note that only sp_exp (spatial) covariance structures may provide multiple
coordinates, which identify the Euclidean distance between the time points.
A cov_struct() object.
as.cov_struct(formula): When provided a formula, any specialized functions are assumed to be
covariance structure definitions and must follow the form:
y ~ xs + type( (visit, )* visit | (group /)? subject )
Any component on the right hand side of a formula is considered when searching for a covariance definition.
Other covariance types:
cov_struct(),
covariance_types
# provide a covariance structure as a right-sided formula as.cov_struct(~ csh(visit | group / subject)) # when part of a full formula, suppress warnings using `warn_partial = FALSE` as.cov_struct(y ~ x + csh(visit | group / subject), warn_partial = FALSE)# provide a covariance structure as a right-sided formula as.cov_struct(~ csh(visit | group / subject)) # when part of a full formula, suppress warnings using `warn_partial = FALSE` as.cov_struct(y ~ x + csh(visit | group / subject), warn_partial = FALSE)
bcva_databcva_data
A tibble with 10,000 rows and 7 variables:
USUBJID: subject ID.
VISITN: visit number (numeric).
AVISIT: visit number (factor).
ARMCD: treatment, TRT or CTL.
RACE: 3-category race.
BCVA_BL: BCVA at baseline.
BCVA_CHG: Change in BCVA at study visits.
Measurements of BCVA (best corrected visual acuity) is a measure of how how many letters a person can read off of an eye chart using corrective lenses or contacts. This a common endpoint in ophthalmology trials.
This is an artificial dataset.
mmrm_tmb Objectscomponent( object, name = c("cov_type", "subject_var", "n_theta", "n_subjects", "n_timepoints", "n_obs", "beta_vcov", "beta_vcov_complete", "varcor", "score_per_subject", "formula", "dataset", "n_groups", "reml", "convergence", "evaluations", "method", "optimizer", "conv_message", "call", "theta_est", "beta_est", "beta_est_complete", "beta_aliased", "x_matrix", "x_matrix_complete", "y_vector", "neg_log_lik", "jac_list", "theta_vcov", "full_frame", "xlev", "contrasts") )component( object, name = c("cov_type", "subject_var", "n_theta", "n_subjects", "n_timepoints", "n_obs", "beta_vcov", "beta_vcov_complete", "varcor", "score_per_subject", "formula", "dataset", "n_groups", "reml", "convergence", "evaluations", "method", "optimizer", "conv_message", "call", "theta_est", "beta_est", "beta_est_complete", "beta_aliased", "x_matrix", "x_matrix_complete", "y_vector", "neg_log_lik", "jac_list", "theta_vcov", "full_frame", "xlev", "contrasts") )
object |
( |
name |
( |
Available component() names are as follows:
call: low-level function call which generated the model.
formula: model formula.
dataset: data set name.
cov_type: covariance structure type.
n_theta: number of parameters.
n_subjects: number of subjects.
n_timepoints: number of modeled time points.
n_obs: total number of observations.
reml: was REML used (ML was used if FALSE).
neg_log_lik: negative log likelihood.
convergence: convergence code from optimizer.
conv_message: message accompanying the convergence code.
evaluations: number of function evaluations for optimization.
method: Adjustment method which was used (for mmrm objects),
otherwise NULL (for mmrm_tmb objects).
beta_vcov: estimated variance-covariance matrix of coefficients
(excluding aliased coefficients). When Kenward-Roger/Empirical adjusted
coefficients covariance matrix is used, the adjusted covariance matrix is returned (to still obtain the
original asymptotic covariance matrix use object$beta_vcov).
beta_vcov_complete: estimated variance-covariance matrix including
aliased coefficients with entries set to NA.
varcor: estimated covariance matrix for residuals. If there are multiple
groups, a named list of estimated covariance matrices for residuals will be
returned. The names are the group levels.
score_per_subject: score per subject in empirical covariance.
See the vignette vignette("coef_vcov", package = "mmrm").
theta_est: estimated variance parameters.
beta_est: estimated coefficients (excluding aliased coefficients).
beta_est_complete: estimated coefficients including aliased coefficients
set to NA.
beta_aliased: whether each coefficient was aliased (i.e. cannot be estimated)
or not.
theta_vcov: estimated variance-covariance matrix of variance parameters.
x_matrix: design matrix used (excluding aliased columns).
x_matrix_complete: design matrix used, including aliased columns.
xlev: a named list of character vectors giving the full set of levels to be assumed for each factor.
contrasts: a list of contrasts used for each factor.
y_vector: response vector used.
jac_list: Jacobian, see h_jac_list() for details.
full_frame: data.frame with n rows containing all variables needed in the model.
The corresponding component of the object, see details.
In the lme4 package there is a similar function getME().
fit <- mmrm( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data ) # Get all available components. component(fit) # Get convergence code and message. component(fit, c("convergence", "conv_message")) # Get modeled formula as a string. component(fit, c("formula"))fit <- mmrm( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data ) # Get all available components. component(fit) # Get convergence code and message. component(fit, c("convergence", "conv_message")) # Get modeled formula as a string. component(fit, c("formula"))
cov_struct( type = cov_types(), visits, subject, group = character(), heterogeneous = FALSE )cov_struct( type = cov_types(), visits, subject, group = character(), heterogeneous = FALSE )
type |
( |
visits |
( |
subject |
( |
group |
( |
heterogeneous |
( |
A cov_struct object.
Other covariance types:
as.cov_struct(),
covariance_types
cov_struct("csh", "AVISITN", "USUBJID") cov_struct("spatial", c("VISITA", "VISITB"), group = "GRP", subject = "SBJ")cov_struct("csh", "AVISITN", "USUBJID") cov_struct("spatial", c("VISITA", "VISITB"), group = "GRP", subject = "SBJ")
cov_types( form = c("name", "abbr", "habbr"), filter = c("heterogeneous", "spatial") )cov_types( form = c("name", "abbr", "habbr"), filter = c("heterogeneous", "spatial") )
form |
( |
filter |
( |
A character vector of accepted covariance structure type names and abbreviations.
| Structure | Description | Parameters | element
|
| ad | Ante-dependence |
|
|
| adh | Heterogeneous ante-dependence |
|
|
| ar1 | First-order auto-regressive |
|
|
| ar1h | Heterogeneous first-order auto-regressive |
|
|
| cs | Compound symmetry |
|
|
| csh | Heterogeneous compound symmetry |
|
|
| toep | Toeplitz |
|
|
| toeph | Heterogeneous Toeplitz |
|
|
| us | Unstructured |
|
|
where and denote -th and -th time points,
respectively, out of total time points, .
The ante-dependence covariance structure in this package refers to
homogeneous ante-dependence, while the ante-dependence covariance structure
from SAS PROC MIXED refers to heterogeneous ante-dependence and the
homogeneous version is not available in SAS.
For all non-spatial covariance structures, the time variable must be coded as a factor.
| Structure | Description | Parameters | element
|
| sp_exp | spatial exponential |
|
|
where denotes the Euclidean distance between time points
and .
Other covariance types:
as.cov_struct(),
cov_struct()
Calculates the estimate, adjusted standard error, degrees of freedom,
t statistic and p-value for one-dimensional contrast.
df_1d(object, contrast)df_1d(object, contrast)
object |
( |
contrast |
( |
List with est, se, df, t_stat and p_val.
object <- mmrm( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data ) contrast <- numeric(length(object$beta_est)) contrast[3] <- 1 df_1d(object, contrast)object <- mmrm( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data ) contrast <- numeric(length(object$beta_est)) contrast[3] <- 1 df_1d(object, contrast)
Calculates the estimate, standard error, degrees of freedom,
t statistic and p-value for m-dimensional contrast, depending on the method
used in
mmrm().
df_md(object, contrast)df_md(object, contrast)
object |
( |
contrast |
( |
List with num_df, denom_df, f_stat and p_val (2-sided p-value).
object <- mmrm( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data ) contrast <- matrix(data = 0, nrow = 2, ncol = length(object$beta_est)) contrast[1, 2] <- contrast[2, 3] <- 1 df_md(object, contrast)object <- mmrm( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data ) contrast <- matrix(data = 0, nrow = 2, ncol = length(object$beta_est)) contrast[1, 2] <- contrast[2, 3] <- 1 df_md(object, contrast)
emmeans
This package includes methods that allow mmrm objects to be used
with the emmeans package. emmeans computes estimated marginal means
(also called least-square means) for the coefficients of the MMRM.
We can also e.g. obtain differences between groups by applying
pairs() on the object returned
by emmeans::emmeans().
fit <- mmrm( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data ) if (require(emmeans)) { emmeans(fit, ~ ARMCD | AVISIT) pairs(emmeans(fit, ~ ARMCD | AVISIT), reverse = TRUE) }fit <- mmrm( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data ) if (require(emmeans)) { emmeans(fit, ~ ARMCD | AVISIT) pairs(emmeans(fit, ~ ARMCD | AVISIT), reverse = TRUE) }
Obtain empirical start value for unstructured covariance
emp_start( y_vector, x_matrix, full_frame, visit_var, subject_var, subject_groups, ... )emp_start( y_vector, x_matrix, full_frame, visit_var, subject_var, subject_groups, ... )
y_vector |
( |
x_matrix |
( |
full_frame |
( |
visit_var |
( |
subject_var |
( |
subject_groups |
( |
... |
not used. |
This emp_start only works for unstructured covariance structure.
It uses linear regression to first obtain the coefficients and use the residuals
to obtain the empirical variance-covariance, and it is then used to obtain the
starting values.
A numeric vector of starting values.
fev_datafev_data
A tibble with 800 rows and 7 variables:
USUBJID: subject ID.
AVISIT: visit number.
ARMCD: treatment, TRT or PBO.
RACE: 3-category race.
SEX: sex.
FEV1_BL: FEV1 at baseline (%).
FEV1: FEV1 at study visits.
WEIGHT: weighting variable.
VISITN: integer order of the visit.
VISITN2: coordinates of the visit for distance calculation.
Measurements of FEV1 (forced expired volume in one second) is a measure of how quickly the lungs can be emptied. Low levels of FEV1 may indicate chronic obstructive pulmonary disease (COPD).
This is an artificial dataset.
This is the low-level function to fit an MMRM. Note that this does not
try different optimizers or adds Jacobian information etc. in contrast to
mmrm().
fit_mmrm( formula, data, weights, reml = TRUE, covariance = NULL, tmb_data, formula_parts, control = mmrm_control() )fit_mmrm( formula, data, weights, reml = TRUE, covariance = NULL, tmb_data, formula_parts, control = mmrm_control() )
formula |
( |
data |
( |
weights |
( |
reml |
( |
covariance |
( |
tmb_data |
( |
formula_parts |
( |
control |
( |
The formula typically looks like:
FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID)
which specifies response and covariates as usual, and exactly one special term defines which covariance structure is used and what are the visit and subject variables.
Always use only the first optimizer if multiple optimizers are provided.
List of class mmrm_tmb, see h_mmrm_tmb_fit() for details.
In addition, it contains elements call and optimizer.
formula <- FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID) data <- fev_data system.time(result <- fit_mmrm(formula, data, rep(1, nrow(fev_data))))formula <- FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID) data <- fev_data system.time(result <- fit_mmrm(formula, data, rep(1, nrow(fev_data))))
This function helps to fit an MMRM using TMB with a single optimizer,
while capturing messages and warnings.
fit_single_optimizer( formula, data, weights, reml = TRUE, covariance = NULL, tmb_data, formula_parts, ..., control = mmrm_control(...) )fit_single_optimizer( formula, data, weights, reml = TRUE, covariance = NULL, tmb_data, formula_parts, ..., control = mmrm_control(...) )
formula |
( |
data |
( |
weights |
( |
reml |
( |
covariance |
( |
tmb_data |
( |
formula_parts |
( |
... |
Additional arguments to pass to |
control |
( |
fit_single_optimizer will fit the mmrm model using the control provided.
If there are multiple optimizers provided in control, only the first optimizer
will be used.
If tmb_data and formula_parts are both provided, formula, data, weights,
reml, and covariance are ignored.
The mmrm_fit object, with additional attributes containing warnings,
messages, optimizer used and convergence status in addition to the
mmrm_tmb contents.
mod_fit <- fit_single_optimizer( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data, weights = rep(1, nrow(fev_data)), optimizer = "nlminb" ) attr(mod_fit, "converged")mod_fit <- fit_single_optimizer( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data, weights = rep(1, nrow(fev_data)), optimizer = "nlminb" ) attr(mod_fit, "converged")
Format Covariance Structure Object
## S3 method for class 'cov_struct' format(x, ...)## S3 method for class 'cov_struct' format(x, ...)
x |
( |
... |
Additional arguments unused. |
A formatted string for x.
This is the main function fitting the MMRM.
mmrm( formula, data, weights = NULL, covariance = NULL, reml = TRUE, control = mmrm_control(...), ... )mmrm( formula, data, weights = NULL, covariance = NULL, reml = TRUE, control = mmrm_control(...), ... )
formula |
( |
data |
( |
weights |
( |
covariance |
( |
reml |
( |
control |
( |
... |
arguments passed to |
The formula typically looks like:
FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID)
so specifies response and covariates as usual, and exactly one special term
defines which covariance structure is used and what are the time point and
subject variables. The covariance structures in the formula can be
found in covariance_types.
The time points have to be unique for each subject. That is, there cannot be time points with multiple observations for any subject. The rationale is that these observations would need to be correlated, but it is not possible within the currently implemented covariance structure framework to do that correctly. Moreover, for non-spatial covariance structures, the time variable must be a factor variable.
When optimizer is not set, first the default optimizer
(L-BFGS-B) is used to fit the model. If that converges, this is returned.
If not, the other available optimizers from h_get_optimizers(),
including BFGS, CG and nlminb are
tried (in parallel if n_cores is set and not on Windows).
If none of the optimizers converge, then the function fails. Otherwise
the best fit is returned.
Note that fine-grained control specifications can either be passed directly
to the mmrm function, or via the control argument for bundling together
with the mmrm_control() function. Both cannot be used together, since
this would delete the arguments passed via mmrm.
An mmrm object.
The mmrm object is also an mmrm_fit and an mmrm_tmb object,
therefore corresponding methods also work (see mmrm_tmb_methods).
Additional contents depend on vcov (see mmrm_control()):
If Kenward-Roger covariance matrix is used, kr_comp contains necessary
components and beta_vcov_adj includes the adjusted coefficients covariance
matrix.
If Empirical covariance matrix is used, beta_vcov_adj contains the
corresponding coefficients covariance matrix estimate. In addition,
empirical_g_mat contains the empirical g matrix, which is used to calculate
the Satterthwaite degrees of freedom. The score_per_subject contains the
empirical score per subject.
If Asymptotic covariance matrix is used in combination with Satterthwaite
d.f. adjustment, the Jacobian information jac_list is included.
Note that these additional elements might change over time and are to be considered internal implementation details rather than part of the public user interface of the package.
Use of the package emmeans is supported, see emmeans_support.
NA values are always omitted regardless of na.action setting.
When the number of visit levels is large, it usually requires large memory to create the
covariance matrix. By default, the maximum allowed visit levels is 100, and if there are more
visit levels, a confirmation is needed if run interactively.
You can use options(mmrm.max_visits = <target>) to increase the maximum allowed number of visit
levels. In non-interactive sessions the confirmation is not raised and will directly give you an error if
the number of visit levels exceeds the maximum.
fit <- mmrm( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data ) # Direct specification of control details: fit <- mmrm( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data, weights = fev_data$WEIGHTS, method = "Kenward-Roger" ) # Alternative specification via control argument (but you cannot mix the # two approaches): fit <- mmrm( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data, control = mmrm_control(method = "Kenward-Roger") )fit <- mmrm( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data ) # Direct specification of control details: fit <- mmrm( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data, weights = fev_data$WEIGHTS, method = "Kenward-Roger" ) # Alternative specification via control argument (but you cannot mix the # two approaches): fit <- mmrm( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data, control = mmrm_control(method = "Kenward-Roger") )
Fine-grained specification of the MMRM fit details is possible using this
control function.
mmrm_control( n_cores = 1L, method = c("Satterthwaite", "Kenward-Roger", "Residual", "Between-Within"), vcov = NULL, start = std_start, accept_singular = TRUE, drop_visit_levels = TRUE, disable_theta_vcov = FALSE, ..., optimizers = h_get_optimizers(...) )mmrm_control( n_cores = 1L, method = c("Satterthwaite", "Kenward-Roger", "Residual", "Between-Within"), vcov = NULL, start = std_start, accept_singular = TRUE, drop_visit_levels = TRUE, disable_theta_vcov = FALSE, ..., optimizers = h_get_optimizers(...) )
n_cores |
( |
method |
( |
vcov |
( |
start |
( |
accept_singular |
( |
drop_visit_levels |
( |
disable_theta_vcov |
( |
... |
additional arguments passed to |
optimizers |
( |
For example, if the data only has observations at visits VIS1, VIS3 and VIS4, by default
they are treated to be equally spaced, the distance from VIS1 to VIS3, and from VIS3 to VIS4,
are identical. However, you can manually convert this visit into a factor, with
levels = c("VIS1", "VIS2", "VIS3", "VIS4"), and also use drop_visits_levels = FALSE,
then the distance from VIS1 to VIS3 will be double, as VIS2 is a valid visit.
However, please be cautious because this can lead to convergence failure
when using an unstructured covariance matrix and there are no observations
at the missing visits.
The method and vcov arguments specify the degrees of freedom and coefficients
covariance matrix adjustment methods, respectively.
Allowed vcov includes: "Asymptotic", "Kenward-Roger", "Kenward-Roger-Linear", "Empirical" (CR0),
"Empirical-Jackknife" (CR3), and "Empirical-Bias-Reduced" (CR2).
Allowed method includes: "Satterthwaite", "Kenward-Roger", "Between-Within" and "Residual".
If method is "Kenward-Roger" then only "Kenward-Roger" or "Kenward-Roger-Linear" are allowed for vcov.
The vcov argument can be NULL to use the default covariance method depending on the method
used for degrees of freedom, see the following table:
method |
Default vcov |
| Satterthwaite | Asymptotic |
| Kenward-Roger | Kenward-Roger |
| Residual | Empirical |
| Between-Within | Asymptotic |
Please note that "Kenward-Roger" for "Unstructured" covariance gives different results
compared to SAS; Use "Kenward-Roger-Linear" for vcov instead for better matching
of the SAS results.
The argument start is used to facilitate the choice of initial values for fitting the model.
If function is provided, make sure its parameter is a valid element of mmrm_tmb_data
or mmrm_tmb_formula_parts and it returns a numeric vector.
By default or if NULL is provided, std_start will be used.
Other implemented methods include emp_start.
Disabling the theta_vcov calculation can speed up the fitting process when there are many variance
parameters. However, this has also drawbacks. We can no longer check that the variance parameter estimates
are indeed at a local maximum of the log-likelihood surface, and therefore convergence issues might go unnoticed.
In addition, some covariance matrix adjustment methods (e.g., Kenward-Roger) require theta_vcov to be calculated.
These will then raise errors.
List of class mmrm_control with the control parameters.
mmrm_control( optimizer_fun = stats::optim, optimizer_args = list(method = "L-BFGS-B") )mmrm_control( optimizer_fun = stats::optim, optimizer_args = list(method = "L-BFGS-B") )
mmrm ObjectsThese methods tidy the estimates from an mmrm object into a
summary.
## S3 method for class 'mmrm' tidy(x, conf.int = FALSE, conf.level = 0.95, ...) ## S3 method for class 'mmrm' glance(x, ...) ## S3 method for class 'mmrm' augment( x, newdata = NULL, interval = c("none", "confidence", "prediction"), se_fit = (interval != "none"), type.residuals = c("response", "pearson", "normalized"), ... )## S3 method for class 'mmrm' tidy(x, conf.int = FALSE, conf.level = 0.95, ...) ## S3 method for class 'mmrm' glance(x, ...) ## S3 method for class 'mmrm' augment( x, newdata = NULL, interval = c("none", "confidence", "prediction"), se_fit = (interval != "none"), type.residuals = c("response", "pearson", "normalized"), ... )
x |
( |
conf.int |
( |
conf.level |
( |
... |
only used by |
newdata |
( |
interval |
( |
se_fit |
( |
type.residuals |
( |
tidy(mmrm): derives tidy tibble from an mmrm object.
glance(mmrm): derives glance tibble from an mmrm object.
augment(mmrm): derives augment tibble from an mmrm object.
mmrm_methods, mmrm_tmb_methods for additional methods.
fit <- mmrm( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data ) # Applying tidy method to return summary table of covariate estimates. fit |> tidy() fit |> tidy(conf.int = TRUE, conf.level = 0.9) # Applying glance method to return summary table of goodness of fit statistics. fit |> glance() # Applying augment method to return merged `tibble` of model data, fitted and residuals. fit |> augment() fit |> augment(interval = "confidence") fit |> augment(type.residuals = "pearson")fit <- mmrm( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data ) # Applying tidy method to return summary table of covariate estimates. fit |> tidy() fit |> tidy(conf.int = TRUE, conf.level = 0.9) # Applying glance method to return summary table of goodness of fit statistics. fit |> glance() # Applying augment method to return merged `tibble` of model data, fitted and residuals. fit |> augment() fit |> augment(interval = "confidence") fit |> augment(type.residuals = "pearson")
mmrm_tmb Objects## S3 method for class 'mmrm_tmb' coef(object, complete = TRUE, ...) ## S3 method for class 'mmrm_tmb' fitted(object, ...) ## S3 method for class 'mmrm_tmb' predict( object, newdata, se.fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, nsim = 1000L, conditional = FALSE, ... ) ## S3 method for class 'mmrm_tmb' model.frame( formula, data, include = c("subject_var", "visit_var", "group_var", "response_var"), exclude = NULL, full, na.action = "na.omit", ... ) ## S3 method for class 'mmrm_tmb' model.matrix( object, data, use_response = TRUE, contrasts.arg = attr(object$tmb_data$x_matrix, "contrasts"), ... ) ## S3 method for class 'mmrm_tmb' terms(x, include = "response_var", ...) ## S3 method for class 'mmrm_tmb' logLik(object, ...) ## S3 method for class 'mmrm_tmb' formula(x, ...) ## S3 method for class 'mmrm_tmb' vcov(object, complete = TRUE, ...) ## S3 method for class 'mmrm_tmb' VarCorr(x, sigma = NA, ...) ## S3 method for class 'mmrm_tmb' deviance(object, ...) ## S3 method for class 'mmrm_tmb' AIC(object, corrected = FALSE, ..., k = 2) ## S3 method for class 'mmrm_tmb' BIC(object, ...) ## S3 method for class 'mmrm_tmb' print(x, ...) ## S3 method for class 'mmrm_tmb' residuals(object, type = c("response", "pearson", "normalized"), ...) ## S3 method for class 'mmrm_tmb' simulate( object, nsim = 1, seed = NULL, newdata, ..., method = c("conditional", "marginal") )## S3 method for class 'mmrm_tmb' coef(object, complete = TRUE, ...) ## S3 method for class 'mmrm_tmb' fitted(object, ...) ## S3 method for class 'mmrm_tmb' predict( object, newdata, se.fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, nsim = 1000L, conditional = FALSE, ... ) ## S3 method for class 'mmrm_tmb' model.frame( formula, data, include = c("subject_var", "visit_var", "group_var", "response_var"), exclude = NULL, full, na.action = "na.omit", ... ) ## S3 method for class 'mmrm_tmb' model.matrix( object, data, use_response = TRUE, contrasts.arg = attr(object$tmb_data$x_matrix, "contrasts"), ... ) ## S3 method for class 'mmrm_tmb' terms(x, include = "response_var", ...) ## S3 method for class 'mmrm_tmb' logLik(object, ...) ## S3 method for class 'mmrm_tmb' formula(x, ...) ## S3 method for class 'mmrm_tmb' vcov(object, complete = TRUE, ...) ## S3 method for class 'mmrm_tmb' VarCorr(x, sigma = NA, ...) ## S3 method for class 'mmrm_tmb' deviance(object, ...) ## S3 method for class 'mmrm_tmb' AIC(object, corrected = FALSE, ..., k = 2) ## S3 method for class 'mmrm_tmb' BIC(object, ...) ## S3 method for class 'mmrm_tmb' print(x, ...) ## S3 method for class 'mmrm_tmb' residuals(object, type = c("response", "pearson", "normalized"), ...) ## S3 method for class 'mmrm_tmb' simulate( object, nsim = 1, seed = NULL, newdata, ..., method = c("conditional", "marginal") )
object |
( |
complete |
( |
... |
mostly not used;
Exception is |
newdata |
( |
se.fit |
( |
interval |
( |
level |
( |
nsim |
( |
conditional |
( |
formula |
( |
data |
( |
include |
( |
exclude |
( |
full |
( |
na.action |
( |
use_response |
( |
contrasts.arg |
( |
x |
( |
sigma |
cannot be used (this parameter does not exist in MMRM). |
corrected |
( |
k |
( |
type |
( |
seed |
unused argument from |
method |
( |
include argument controls the variables the model frame will include upon creation.
Possible options are "response_var", "subject_var", "visit_var" and "group_var", representing the
response variable, subject variable, visit variable or group variable.
By default all four variable types are included, which means that the original model frame will be used.
If only a subset of variable types is requested, the model frame will be constructed freshly, and
if there are more complete rows (without NAs) in the required variables than in the original model frame,
those rows will be included as well.
The exclude argument can be used to exclude specific variable types after the model frame creation.
By default no variable types are excluded. The exclude argument is useful when the original
model frame should be used as basis (with same incomplete rows omitted), but some variable types should be removed
afterwards.
character values in new data will always be factorized according to the data in the fit
to avoid mismatched in levels or issues in model.matrix.
Depends on the method, see Functions.
coef(mmrm_tmb): obtains the estimated coefficients.
fitted(mmrm_tmb): obtains the fitted values.
predict(mmrm_tmb): predict conditional means for new data;
optionally with standard errors and confidence or prediction intervals.
Returns a vector of predictions if se.fit == FALSE and
interval == "none"; otherwise it returns a data.frame with multiple
columns and one row per input data row.
model.frame(mmrm_tmb): obtains the model frame.
model.matrix(mmrm_tmb): obtains the model matrix.
terms(mmrm_tmb): obtains the terms object.
logLik(mmrm_tmb): obtains the attained log likelihood value.
Includes as attributes the number of variance parameters n_param, number
of estimated coefficients n_coef, degrees of freedom df, and number of
subjects nobs. The nobs attribute is so named so that if this
function's results are passed to stats::BIC(), the BIC value will be
calculated correctly. Resulting value is of class
logLik.
formula(mmrm_tmb): obtains the used formula.
vcov(mmrm_tmb): obtains the variance-covariance matrix estimate
for the coefficients.
VarCorr(mmrm_tmb): obtains the variance-covariance matrix estimate
for the residuals.
deviance(mmrm_tmb): obtains the deviance, which is defined here
as twice the negative log likelihood, which can either be integrated
over the coefficients for REML fits or the usual one for ML fits.
AIC(mmrm_tmb): obtains the Akaike Information Criterion,
where the degrees of freedom are the number of variance parameters (n_theta).
If corrected, then this is multiplied with m / (m - n_theta - 1) where
m is the number of observations minus the number of coefficients, or
n_theta + 2 if it is smaller than that (Hurvich and Tsai 1989; Burnham and Anderson 1998).
BIC(mmrm_tmb): obtains the Bayesian Information Criterion,
which is using the natural logarithm of the number of subjects for the
penalty parameter k.
print(mmrm_tmb): prints the object.
residuals(mmrm_tmb): to obtain residuals - either unscaled ('response'), 'pearson' or 'normalized'.
simulate(mmrm_tmb): simulate responses from a fitted model according
to the simulation method, returning a data.frame of dimension [n, m]
where n is the number of rows in newdata,
and m is the number nsim of simulated responses.
If the response in the formula is a complicated expression, e.g. with multiple
variables, and not all variables are provided in newdata, a warning is issued.
In this case it is recommended to provide all variables in the newdata data set,
and manually set the response variables to NA as needed for obtaining unconditional
predictions e.g.
Hurvich CM, Tsai C (1989). “Regression and time series model selection in small samples.” Biometrika, 76(2), 297–307. doi:10.2307/2336663.
Burnham KP, Anderson DR (1998). “Practical use of the information-theoretic approach.” In Model selection and inference, 75–117. Springer. doi:10.1007/978-1-4757-2917-7_3.
Gałecki A, Burzykowski T (2013). “Linear mixed-effects model.” In Linear mixed-effects models using R, 245–273. Springer.
mmrm_methods, mmrm_tidiers for additional methods.
formula <- FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID) object <- fit_mmrm(formula, fev_data, weights = rep(1, nrow(fev_data))) # Estimated coefficients: coef(object) # Fitted values: fitted(object) predict(object, newdata = fev_data) # Model frame: model.frame(object) model.frame(object, include = "subject_var") # Model matrix: model.matrix(object) # terms: terms(object) terms(object, include = "subject_var") # Log likelihood given the estimated parameters: logLik(object) # Formula which was used: formula(object) # Variance-covariance matrix estimate for coefficients: vcov(object) # Variance-covariance matrix estimate for residuals: VarCorr(object) # REML criterion (twice the negative log likelihood): deviance(object) # AIC: AIC(object) AIC(object, corrected = TRUE) # BIC: BIC(object) # residuals: residuals(object, type = "response") residuals(object, type = "pearson") residuals(object, type = "normalized")formula <- FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID) object <- fit_mmrm(formula, fev_data, weights = rep(1, nrow(fev_data))) # Estimated coefficients: coef(object) # Fitted values: fitted(object) predict(object, newdata = fev_data) # Model frame: model.frame(object) model.frame(object, include = "subject_var") # Model matrix: model.matrix(object) # terms: terms(object) terms(object, include = "subject_var") # Log likelihood given the estimated parameters: logLik(object) # Formula which was used: formula(object) # Variance-covariance matrix estimate for coefficients: vcov(object) # Variance-covariance matrix estimate for residuals: VarCorr(object) # REML criterion (twice the negative log likelihood): deviance(object) # AIC: AIC(object) AIC(object, corrected = TRUE) # BIC: BIC(object) # residuals: residuals(object, type = "response") residuals(object, type = "pearson") residuals(object, type = "normalized")
Print a Covariance Structure Object
## S3 method for class 'cov_struct' print(x, ...)## S3 method for class 'cov_struct' print(x, ...)
x |
( |
... |
Additional arguments unused. |
x invisibly.
refit_multiple_optimizers(fit, ..., control = mmrm_control(...))refit_multiple_optimizers(fit, ..., control = mmrm_control(...))
fit |
( |
... |
Additional arguments passed to |
control |
( |
The best (in terms of log likelihood) fit which converged.
For Windows, no parallel computations are currently implemented.
fit <- fit_single_optimizer( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data, weights = rep(1, nrow(fev_data)), optimizer = "nlminb" ) best_fit <- refit_multiple_optimizers(fit)fit <- fit_single_optimizer( formula = FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID), data = fev_data, weights = rep(1, nrow(fev_data)), optimizer = "nlminb" ) best_fit <- refit_multiple_optimizers(fit)
mmrm FitsIf supplied only one model fit, the function will calculate and return the significance of the model terms. If supplied more than one model fit, standard diagnostics will be returned for each model, optionally including likelihood ratio test (LRT) results for adjacent models.
## S3 method for class 'mmrm' anova(object, ..., test = TRUE, refit = FALSE)## S3 method for class 'mmrm' anova(object, ..., test = TRUE, refit = FALSE)
object |
( |
... |
( |
test |
( |
refit |
( |
When test = FALSE (or, when only one model is supplied), this
function will process any mmrm fits, related or unrelated.
When supplying multiple models and test = TRUE, adjacent pairs of models
are tested sequentially. In other words, the order of the supplied models
matters. Furthermore, there are are multiple requirements for successful
LRT. See the section "Requirements for LRT" below.
A data frame with a row for each supplied model. If ... is empty,
this will be the the returned value of h_anova_single_mmrm_model() with
object supplied as its argument. Otherwise, the resulting data frame will
have the following columns:
Model: the sequence number of the model according to the order in which
the models were supplied to this function.
refit: logical, indicating whether or not the model was refitted. If
the refit argument was FALSE, all values will be FALSE.
REML: logical, indicating whether or not the model was fitted using
restricted maximum likelihood (REML) estimation. If FALSE, the model was
fitted using maximum likelihood (ML) estimation.
n_param: the number of variance parameters in the model fit, obtained
via logLik.mmrm_tmb().
n_coef: the number of estimated coefficients in the model fit, obtained
via logLik.mmrm_tmb().
df: degrees of freedom of the model fit, obtained via
logLik.mmrm_tmb().
AIC: Akaike's "An Information Criterion" of the model fit, obtained via
stats::AIC().
BIC: the Bayesian Information Criterion of the model fit, obtained via
stats::BIC().
logLik: the log likelihood of the model fit, obtained via
logLik.mmrm_tmb().
call: the call that created the model fit, obtained via component()
with name = "call", which is passed to deparse1(). If the model was
refitted (i.e., if its refit entry in this table is TRUE), this call
will be different from the call component of the pre-refitted model fit.
The data frame will have these additional columns inserted before call if
test = TRUE. Note that since each of these columns describe the results
of a likelihood ratio test (LRT) between the previous row's and current
row's model fits, the first element of each of these columns will be NA.
test: character, indicating which two model fits were compared. Values
are of the form {Model - 1} vs {Model}.
log_likelihood_ratio: the logarithm of the likelihood ratio between the two
models being compared.
p_value: the p-value of the log_likelihood_ratio.
Each supplied model fit must have more degrees of freedom than the preceding model fit.
If all supplied models were estimated using maximum likelihood (ML), the models must have nested covariates in order to undergo LRT. In other words, the set of covariates for each model must be a subset of the covariates of the next model. However, if any of the supplied models were estimated using restricted maximum likelihood (REML), all models must have the same covariates.
The covariance structure of each model must be either (a) the same as that of the next model or (b) a special case of that of the next model. See the section "Covariance structure nesting hierarchy" below.
All supplied model fits must either already use the same data or be
refitted using refit = TRUE, which refits all models to the dataset of
common observations between all models' respective data sets.
Tautologically, all covariance structures are special cases of an unstructured covariance, and a model with a covariance structure can be considered "nested" within an model without a covariance structure (assuming that the covariates are also nested).
All homogeneous covariance structures are nested within their corresponding heterogeneous counterparts. For instance, the homogeneous Toeplitz covariance structure is nested within the heterogeneous Toeplitz covariance structure.
Some different covariance structure types are also nested:
First-order auto-regressive (ar1 / ar1h) is nested within:
ante-dependence (ad / adh)
Toeplitz (toep / toeph)
Compound symmetry (cs / csh) is nested within Toeplitz (toep / toeph)
For details on the single model operation of this function, see
h_anova_single_mmrm_model(). For details on the generic, see
stats::anova().
# Create a few model fits, only adding terms from one to the next. # Notice also that each covariance structure is a special case of the one # that follows. fit_sex_ar1 <- mmrm(FEV1 ~ FEV1_BL + SEX + ARMCD + ar1(AVISIT | USUBJID), data = fev_data, reml = FALSE ) fit_sex_race_toeph <- mmrm( FEV1 ~ FEV1_BL + SEX + RACE + ARMCD + toeph(AVISIT | USUBJID), data = fev_data, reml = FALSE ) fit_interaction_us <- mmrm( FEV1 ~ FEV1_BL + SEX * RACE + ARMCD + us(AVISIT | USUBJID), data = fev_data, reml = FALSE ) # Single model fit, showing significance of model terms: anova(fit_interaction_us) # Multiple model fits, with diagnostics for each fit and likelihood ratio # testing (LRT) for each adjacent pair. LRT is possible because when the fits # are in this order, their covariates and covariance structures are nested. anova(fit_sex_ar1, fit_sex_race_toeph, fit_interaction_us) # We can only change the order if we forego LRT using test = FALSE. anova(fit_sex_race_toeph, fit_interaction_us, fit_sex_ar1, test = FALSE) # Create a subset of fev_data set with the 4th visit removed. fev_subset <- droplevels(fev_data[fev_data$VISITN < 4, ]) # Recreate fit_sex_race_toeph but this time based off fev_subset: fit_sex_race_toeph_sub <- mmrm( FEV1 ~ FEV1_BL + SEX + RACE + ARMCD + toeph(AVISIT | USUBJID), data = fev_subset, reml = FALSE ) # If a model was created with a different data set, refit = TRUE is needed. anova(fit_sex_ar1, fit_sex_race_toeph_sub, fit_interaction_us, refit = TRUE)# Create a few model fits, only adding terms from one to the next. # Notice also that each covariance structure is a special case of the one # that follows. fit_sex_ar1 <- mmrm(FEV1 ~ FEV1_BL + SEX + ARMCD + ar1(AVISIT | USUBJID), data = fev_data, reml = FALSE ) fit_sex_race_toeph <- mmrm( FEV1 ~ FEV1_BL + SEX + RACE + ARMCD + toeph(AVISIT | USUBJID), data = fev_data, reml = FALSE ) fit_interaction_us <- mmrm( FEV1 ~ FEV1_BL + SEX * RACE + ARMCD + us(AVISIT | USUBJID), data = fev_data, reml = FALSE ) # Single model fit, showing significance of model terms: anova(fit_interaction_us) # Multiple model fits, with diagnostics for each fit and likelihood ratio # testing (LRT) for each adjacent pair. LRT is possible because when the fits # are in this order, their covariates and covariance structures are nested. anova(fit_sex_ar1, fit_sex_race_toeph, fit_interaction_us) # We can only change the order if we forego LRT using test = FALSE. anova(fit_sex_race_toeph, fit_interaction_us, fit_sex_ar1, test = FALSE) # Create a subset of fev_data set with the 4th visit removed. fev_subset <- droplevels(fev_data[fev_data$VISITN < 4, ]) # Recreate fit_sex_race_toeph but this time based off fev_subset: fit_sex_race_toeph_sub <- mmrm( FEV1 ~ FEV1_BL + SEX + RACE + ARMCD + toeph(AVISIT | USUBJID), data = fev_subset, reml = FALSE ) # If a model was created with a different data set, refit = TRUE is needed. anova(fit_sex_ar1, fit_sex_race_toeph_sub, fit_interaction_us, refit = TRUE)
Obtain standard start values.
std_start(cov_type, n_visits, n_groups, ...)std_start(cov_type, n_visits, n_groups, ...)
cov_type |
( |
n_visits |
( |
n_groups |
( |
... |
not used. |
std_start will try to provide variance parameter from identity matrix.
However, for ar1 and ar1h the corresponding values are not ideal because the
is usually a positive number thus using 0 as starting value can lead to
incorrect optimization result, and we use 0.5 as the initial value of .
A numeric vector of starting values.