Modular Functions for Mixed Model Fits
modular.RdModular functions for mixed model fits
Usage
glFormula(formula, data = NULL, family = gaussian,
subset, weights, na.action, offset, contrasts = NULL,
start, mustart, etastart, control = glmerControl(), ...)Arguments
- formula
a two-sided linear formula object describing both the fixed-effects and random-effects parts of the model, with the response on the left of a
~operator and the terms, separated by+operators, on the right. Random-effects terms are distinguished by vertical bars ("|") separating expressions for design matrices from grouping factors.- data
an optional data frame containing the variables named in
formula. By default the variables are taken from the environment from whichlmeris called. Whiledatais optional, the package authors strongly recommend its use, especially when later applying methods such asupdateanddrop1to the fitted model (such methods are not guaranteed to work properly ifdatais omitted). Ifdatais omitted, variables will be taken from the environment offormula(if specified as a formula) or from the parent frame (if specified as a character vector).- subset
an optional expression indicating the subset of the rows of
datathat should be used in the fit. This can be a logical vector, or a numeric vector indicating which observation numbers are to be included, or a character vector of the row names to be included. All observations are included by default.- weights
an optional vector of ‘prior weights’ to be used in the fitting process. Should be
NULLor a numeric vector.- na.action
a function that indicates what should happen when the data contain
NAs. The default action (na.omit, inherited from the 'factory fresh' value ofgetOption("na.action")) strips any observations with any missing values in any variables.- offset
this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be
NULLor a numeric vector of length equal to the number of cases. One or moreoffsetterms can be included in the formula instead or as well, and if more than one is specified their sum is used. Seemodel.offset.- contrasts
an optional
list. See thecontrasts.argofmodel.matrix.default.- control
a list giving
- for
[g]lFormula: all options for running the model, see
lmerControl;- for
mkLmerDevfun,mkGlmerDevfun: options for the inner optimization step;
- for
optimizeLmerandoptimizeGlmer: control parameters for nonlinear optimizer (typically inherited from the ... argument to
lmerControl).
% FIXME: reference optCtrl
- for
- start
starting values (see
lmer; forglFormula, should be just a numeric vector of fixed-effect coefficients)- family
- mustart
optional starting values on the scale of the conditional mean; see
glmfor details.- etastart
optional starting values on the scale of the unbounded predictor; see
glmfor details.- ...
other potential arguments; for
optimizeLmerandoptimizeGlmer, these are passed to internal functionoptwrap, which has relevant parameterscalc.derivsanduse.last.params(seelmerControl).
Value
lFormula and glFormula return a list containing
components:
- fr
model frame
- X
fixed-effect design matrix
- reTrms
list containing information on random effects structure: result of
mkReTrms
mkLmerDevfun and mkGlmerDevfun return a function to
calculate deviance (or restricted deviance) as a function of the
theta (random-effect) parameters. updateGlmerDevfun
returns a function to calculate the deviance as a function of a
concatenation of theta and beta (fixed-effect) parameters. These
deviance functions have an environment containing objects required
for their evaluation. CAUTION: The environment of
functions returned by mk(Gl|L)merDevfun contains reference
class objects (see ReferenceClasses,
merPredD-class, lmResp-class), which
behave in ways that may surprise many users. For example, if the
output of mk(Gl|L)merDevfun is naively copied, then
modifications to the original will also appear in the copy (and
vice versa). To avoid this behavior one must make a deep copy (see
ReferenceClasses for details).
optimizeLmer and optimizeGlmer return the results of an
optimization.
Details
These functions make up the internal components of an [gn]lmer fit.
[g]lFormulatakes the arguments that would normally be passed to[g]lmer, checking for errors and processing the formula and data input to create a list of objects required to fit a mixed model.mk(Gl|L)merDevfuntakes the output of the previous step (minus theformulacomponent) and creates a deviance functionoptimize(Gl|L)mertakes a deviance function and optimizes overtheta(or overthetaandbeta, ifstageis set to 2 foroptimizeGlmerupdateGlmerDevfuntakes the first stage of a GLMM optimization (withnAGQ=0, optimizing overthetaonly) and produces a second-stage deviance functionmkMerModtakes the environment of a deviance function, the results of an optimization, a list of random-effect terms, a model frame, and a model all and produces a[g]lmerModobject.
Examples
# \donttest{
library(lme4)
### Fitting a linear mixed model in 4 modularized steps
## 1. Parse the data and formula:
lmod <- lFormula(Reaction ~ Days + (Days|Subject), sleepstudy)
names(lmod)
#> [1] "fr" "X" "reTrms" "REML" "formula" "wmsgs"
## 2. Create the deviance function to be optimized:
(devfun <- do.call(mkLmerDevfun, lmod))
#> function (theta)
#> .Call(lmer_Deviance, pp$ptr(), resp$ptr(), as.double(theta))
#> <environment: 0x000002b8f2f95428>
ls(environment(devfun)) # the environment of 'devfun' contains objects
#> [1] "lmer_Deviance" "lower" "pp" "resp"
# required for its evaluation
## 3. Optimize the deviance function:
opt <- optimizeLmer(devfun)
opt[1:3]
#> $par
#> [1] 0.96674177 0.01516906 0.23090995
#>
#> $fval
#> [1] 1743.628
#>
#> $feval
#> [1] 43
#>
## 4. Package up the results:
mkMerMod(environment(devfun), opt, lmod$reTrms, fr = lmod$fr)
#> Linear mixed model fit by REML ['lmerMod']
#> REML criterion at convergence: 1743.628
#> Random effects:
#> Groups Name Std.Dev. Corr
#> Subject (Intercept) 24.741
#> Days 5.922 0.07
#> Residual 25.592
#> Number of obs: 180, groups: Subject, 18
#> Fixed Effects:
#> (Intercept) Days
#> 251.41 10.47
### Same model in one line
lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: Reaction ~ Days + (Days | Subject)
#> Data: sleepstudy
#> REML criterion at convergence: 1743.628
#> Random effects:
#> Groups Name Std.Dev. Corr
#> Subject (Intercept) 24.741
#> Days 5.922 0.07
#> Residual 25.592
#> Number of obs: 180, groups: Subject, 18
#> Fixed Effects:
#> (Intercept) Days
#> 251.41 10.47
### Fitting a generalized linear mixed model in six modularized steps
## 1. Parse the data and formula:
glmod <- glFormula(cbind(incidence, size - incidence) ~ period + (1 | herd),
data = cbpp, family = binomial)
#.... see what've got :
str(glmod, max=1, give.attr=FALSE)
#> List of 6
#> $ fr :'data.frame': 56 obs. of 3 variables:
#> $ X : num [1:56, 1:4] 1 1 1 1 1 1 1 1 1 1 ...
#> $ reTrms :List of 10
#> $ family :List of 12
#> $ formula:Class 'formula' language cbind(incidence, size - incidence) ~ period + (1 | herd)
#> $ wmsgs : chr(0)
## 2. Create the deviance function for optimizing over theta:
(devfun <- do.call(mkGlmerDevfun, glmod))
#> function (theta)
#> {
#> resp$updateMu(lp0)
#> pp$setTheta(theta)
#> p <- pwrssUpdate(pp, resp, tol = tolPwrss, GQmat = GHrule(0L),
#> compDev = compDev, maxit = maxit, verbose = verbose)
#> resp$updateWts()
#> p
#> }
#> <environment: 0x000002b8efea7528>
ls(environment(devfun)) # the environment of devfun contains lots of info
#> [1] "compDev" "lower" "lp0" "maxit" "pp"
#> [6] "pwrssUpdate" "resp" "tolPwrss" "verbose"
## 3. Optimize over theta using a rough approximation (i.e. nAGQ = 0):
(opt <- optimizeGlmer(devfun))
#> parameter estimates: 0.641838555348909
#> objective: 184.108693002453
#> number of function evaluations: 18
## 4. Update the deviance function for optimizing over theta and beta:
(devfun <- updateGlmerDevfun(devfun, glmod$reTrms))
#> function (pars)
#> {
#> resp$setOffset(baseOffset)
#> resp$updateMu(lp0)
#> pp$setTheta(as.double(pars[dpars]))
#> spars <- as.numeric(pars[-dpars])
#> offset <- if (length(spars) == 0)
#> baseOffset
#> else baseOffset + pp$X %*% spars
#> resp$setOffset(offset)
#> p <- pwrssUpdate(pp, resp, tol = tolPwrss, GQmat = GQmat,
#> compDev = compDev, grpFac = fac, maxit = maxit, verbose = verbose)
#> resp$updateWts()
#> p
#> }
#> <environment: 0x000002b8efea7528>
## 5. Optimize over theta and beta:
opt <- optimizeGlmer(devfun, stage=2)
str(opt, max=1) # seeing what we'got
#> List of 7
#> $ fval : num 184
#> $ par : num [1:5] 0.642 -1.398 -0.992 -1.128 -1.58
#> $ convergence: num 0
#> $ NM.result : int 3
#> $ message : chr "parameter values converged to within tolerance"
#> $ control :List of 17
#> $ feval : num 285
#> - attr(*, "optimizer")= chr "Nelder_Mead"
#> - attr(*, "control")=List of 3
#> - attr(*, "warnings")= list()
#> - attr(*, "derivs")=List of 2
## 6. Package up the results:
(fMod <- mkMerMod(environment(devfun), opt, glmod$reTrms, fr = glmod$fr))
#> Generalized linear mixed model fit by maximum likelihood (Laplace
#> Approximation) [glmerMod]
#> Family: binomial ( logit )
#> AIC BIC logLik deviance df.resid
#> 194.0531 204.1799 -92.0266 184.0531 51
#> Random effects:
#> Groups Name Std.Dev.
#> herd (Intercept) 0.6421
#> Number of obs: 56, groups: herd, 15
#> Fixed Effects:
#> (Intercept) period2 period3 period4
#> -1.3983 -0.9919 -1.1282 -1.5797
### Same model in one line
fM <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
data = cbpp, family = binomial)
all.equal(fMod, fM, check.attributes=FALSE, tolerance = 1e-12)
#> [1] TRUE
# ---- -- even tolerance = 0 may work
# }