Modular Functions for Mixed Model Fits
modular.Rd
Modular functions for mixed model fits
Usage
glFormula(formula, data = NULL, family = gaussian,
subset, weights, na.action, offset, contrasts = NULL,
start, mustart, etastart, control = glmerControl(), ...)
Arguments
- formula
a two-sided linear formula object describing both the fixed-effects and random-effects parts of the model, with the response on the left of a
~
operator and the terms, separated by+
operators, on the right. Random-effects terms are distinguished by vertical bars ("|"
) separating expressions for design matrices from grouping factors.- data
an optional data frame containing the variables named in
formula
. By default the variables are taken from the environment from whichlmer
is called. Whiledata
is optional, the package authors strongly recommend its use, especially when later applying methods such asupdate
anddrop1
to the fitted model (such methods are not guaranteed to work properly ifdata
is omitted). Ifdata
is omitted, variables will be taken from the environment offormula
(if specified as a formula) or from the parent frame (if specified as a character vector).- subset
an optional expression indicating the subset of the rows of
data
that should be used in the fit. This can be a logical vector, or a numeric vector indicating which observation numbers are to be included, or a character vector of the row names to be included. All observations are included by default.- weights
an optional vector of ‘prior weights’ to be used in the fitting process. Should be
NULL
or a numeric vector.- na.action
a function that indicates what should happen when the data contain
NA
s. The default action (na.omit
, inherited from the 'factory fresh' value ofgetOption("na.action")
) strips any observations with any missing values in any variables.- offset
this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be
NULL
or a numeric vector of length equal to the number of cases. One or moreoffset
terms can be included in the formula instead or as well, and if more than one is specified their sum is used. Seemodel.offset
.- contrasts
an optional
list
. See thecontrasts.arg
ofmodel.matrix.default
.- control
a list giving
- for
[g]lFormula
: all options for running the model, see
lmerControl
;- for
mkLmerDevfun,mkGlmerDevfun
: options for the inner optimization step;
- for
optimizeLmer
andoptimizeGlmer
: control parameters for nonlinear optimizer (typically inherited from the ... argument to
lmerControl
).
% FIXME: reference optCtrl
- for
- start
starting values (see
lmer
; forglFormula
, should be just a numeric vector of fixed-effect coefficients)- family
- mustart
optional starting values on the scale of the conditional mean; see
glm
for details.- etastart
optional starting values on the scale of the unbounded predictor; see
glm
for details.- ...
other potential arguments; for
optimizeLmer
andoptimizeGlmer
, these are passed to internal functionoptwrap
, which has relevant parameterscalc.derivs
anduse.last.params
(seelmerControl
).
Value
lFormula
and glFormula
return a list containing
components:
- fr
model frame
- X
fixed-effect design matrix
- reTrms
list containing information on random effects structure: result of
mkReTrms
mkLmerDevfun
and mkGlmerDevfun
return a function to
calculate deviance (or restricted deviance) as a function of the
theta (random-effect) parameters. updateGlmerDevfun
returns a function to calculate the deviance as a function of a
concatenation of theta and beta (fixed-effect) parameters. These
deviance functions have an environment containing objects required
for their evaluation. CAUTION: The environment
of
functions returned by mk(Gl|L)merDevfun
contains reference
class objects (see ReferenceClasses
,
merPredD-class
, lmResp-class
), which
behave in ways that may surprise many users. For example, if the
output of mk(Gl|L)merDevfun
is naively copied, then
modifications to the original will also appear in the copy (and
vice versa). To avoid this behavior one must make a deep copy (see
ReferenceClasses
for details).
optimizeLmer
and optimizeGlmer
return the results of an
optimization.
Details
These functions make up the internal components of an [gn]lmer fit.
[g]lFormula
takes the arguments that would normally be passed to[g]lmer
, checking for errors and processing the formula and data input to create a list of objects required to fit a mixed model.mk(Gl|L)merDevfun
takes the output of the previous step (minus theformula
component) and creates a deviance functionoptimize(Gl|L)mer
takes a deviance function and optimizes overtheta
(or overtheta
andbeta
, ifstage
is set to 2 foroptimizeGlmer
updateGlmerDevfun
takes the first stage of a GLMM optimization (withnAGQ=0
, optimizing overtheta
only) and produces a second-stage deviance functionmkMerMod
takes the environment of a deviance function, the results of an optimization, a list of random-effect terms, a model frame, and a model all and produces a[g]lmerMod
object.
Examples
# \donttest{
library(lme4)
### Fitting a linear mixed model in 4 modularized steps
## 1. Parse the data and formula:
lmod <- lFormula(Reaction ~ Days + (Days|Subject), sleepstudy)
names(lmod)
#> [1] "fr" "X" "reTrms" "REML" "formula" "wmsgs"
## 2. Create the deviance function to be optimized:
(devfun <- do.call(mkLmerDevfun, lmod))
#> function (theta)
#> .Call(lmer_Deviance, pp$ptr(), resp$ptr(), as.double(theta))
#> <environment: 0x000002b8f2f95428>
ls(environment(devfun)) # the environment of 'devfun' contains objects
#> [1] "lmer_Deviance" "lower" "pp" "resp"
# required for its evaluation
## 3. Optimize the deviance function:
opt <- optimizeLmer(devfun)
opt[1:3]
#> $par
#> [1] 0.96674177 0.01516906 0.23090995
#>
#> $fval
#> [1] 1743.628
#>
#> $feval
#> [1] 43
#>
## 4. Package up the results:
mkMerMod(environment(devfun), opt, lmod$reTrms, fr = lmod$fr)
#> Linear mixed model fit by REML ['lmerMod']
#> REML criterion at convergence: 1743.628
#> Random effects:
#> Groups Name Std.Dev. Corr
#> Subject (Intercept) 24.741
#> Days 5.922 0.07
#> Residual 25.592
#> Number of obs: 180, groups: Subject, 18
#> Fixed Effects:
#> (Intercept) Days
#> 251.41 10.47
### Same model in one line
lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: Reaction ~ Days + (Days | Subject)
#> Data: sleepstudy
#> REML criterion at convergence: 1743.628
#> Random effects:
#> Groups Name Std.Dev. Corr
#> Subject (Intercept) 24.741
#> Days 5.922 0.07
#> Residual 25.592
#> Number of obs: 180, groups: Subject, 18
#> Fixed Effects:
#> (Intercept) Days
#> 251.41 10.47
### Fitting a generalized linear mixed model in six modularized steps
## 1. Parse the data and formula:
glmod <- glFormula(cbind(incidence, size - incidence) ~ period + (1 | herd),
data = cbpp, family = binomial)
#.... see what've got :
str(glmod, max=1, give.attr=FALSE)
#> List of 6
#> $ fr :'data.frame': 56 obs. of 3 variables:
#> $ X : num [1:56, 1:4] 1 1 1 1 1 1 1 1 1 1 ...
#> $ reTrms :List of 10
#> $ family :List of 12
#> $ formula:Class 'formula' language cbind(incidence, size - incidence) ~ period + (1 | herd)
#> $ wmsgs : chr(0)
## 2. Create the deviance function for optimizing over theta:
(devfun <- do.call(mkGlmerDevfun, glmod))
#> function (theta)
#> {
#> resp$updateMu(lp0)
#> pp$setTheta(theta)
#> p <- pwrssUpdate(pp, resp, tol = tolPwrss, GQmat = GHrule(0L),
#> compDev = compDev, maxit = maxit, verbose = verbose)
#> resp$updateWts()
#> p
#> }
#> <environment: 0x000002b8efea7528>
ls(environment(devfun)) # the environment of devfun contains lots of info
#> [1] "compDev" "lower" "lp0" "maxit" "pp"
#> [6] "pwrssUpdate" "resp" "tolPwrss" "verbose"
## 3. Optimize over theta using a rough approximation (i.e. nAGQ = 0):
(opt <- optimizeGlmer(devfun))
#> parameter estimates: 0.641838555348909
#> objective: 184.108693002453
#> number of function evaluations: 18
## 4. Update the deviance function for optimizing over theta and beta:
(devfun <- updateGlmerDevfun(devfun, glmod$reTrms))
#> function (pars)
#> {
#> resp$setOffset(baseOffset)
#> resp$updateMu(lp0)
#> pp$setTheta(as.double(pars[dpars]))
#> spars <- as.numeric(pars[-dpars])
#> offset <- if (length(spars) == 0)
#> baseOffset
#> else baseOffset + pp$X %*% spars
#> resp$setOffset(offset)
#> p <- pwrssUpdate(pp, resp, tol = tolPwrss, GQmat = GQmat,
#> compDev = compDev, grpFac = fac, maxit = maxit, verbose = verbose)
#> resp$updateWts()
#> p
#> }
#> <environment: 0x000002b8efea7528>
## 5. Optimize over theta and beta:
opt <- optimizeGlmer(devfun, stage=2)
str(opt, max=1) # seeing what we'got
#> List of 7
#> $ fval : num 184
#> $ par : num [1:5] 0.642 -1.398 -0.992 -1.128 -1.58
#> $ convergence: num 0
#> $ NM.result : int 3
#> $ message : chr "parameter values converged to within tolerance"
#> $ control :List of 17
#> $ feval : num 285
#> - attr(*, "optimizer")= chr "Nelder_Mead"
#> - attr(*, "control")=List of 3
#> - attr(*, "warnings")= list()
#> - attr(*, "derivs")=List of 2
## 6. Package up the results:
(fMod <- mkMerMod(environment(devfun), opt, glmod$reTrms, fr = glmod$fr))
#> Generalized linear mixed model fit by maximum likelihood (Laplace
#> Approximation) [glmerMod]
#> Family: binomial ( logit )
#> AIC BIC logLik deviance df.resid
#> 194.0531 204.1799 -92.0266 184.0531 51
#> Random effects:
#> Groups Name Std.Dev.
#> herd (Intercept) 0.6421
#> Number of obs: 56, groups: herd, 15
#> Fixed Effects:
#> (Intercept) period2 period3 period4
#> -1.3983 -0.9919 -1.1282 -1.5797
### Same model in one line
fM <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
data = cbpp, family = binomial)
all.equal(fMod, fM, check.attributes=FALSE, tolerance = 1e-12)
#> [1] TRUE
# ---- -- even tolerance = 0 may work
# }