## C

> boxplot(E ~ Limosa\$fSex * Limosa\$fPeriod, main = "Sex & Period")

> boxplot(E ~ Limosa\$ID, main = "Day")

Note that the variation in residual spread is larger for the unknown sex, and it is also larger for the summer period. This means that in step 3 of the protocol, we could do with a varldent variance structure with the variance covariates Period and Sex. Figure 7.12D shows that we need the term ID (sampling day) as an explanatory variable; at some days, all the residuals are above or below zero. We can either use ID as a fixed effect or as a random effect. In this example, it is obvious to use it as a random effect (it allows for correlation between observations from the same day; it avoids estimating lots of parameters and it allows us to generalise the conclusions); see also Chapter 5.

7.6.4.2 Step 3 of the Protocol: Choose an Appropriate Variance Structure

We already discussed in the previous step that we need a varldent variance structure and ID as random effect. Such a model is given by

> Mi.lme <- lme(IntakeRate ~ Time + Time2 + fPeriod +

fSex, data = Limosa, weights = varIdent(form =~ 1 | fSex * fPeriod), random =~ 1 | fID, method = "REML")

Perhaps it is useful to give the corresponding equation for this, just in case you find it difficult to see this from the R code.

IntakeRatejj = a + p x Timeij + p2 x Time2 + p3 x Periody + /34 x Sexy + ai + By a ~ V(0, d2)

We have seen most of this equation already in Section 7.6.1. The term ai is the random intercept (Chapter 5). The subscripts for the a are there because we allow for different residual variances depending on sex and period.

7.6.4.3 Steps 4-6 of the Protocol: Find the Optimal Random Structure

We are going to save some space by summarising a couple of model selection steps. The model that was fitted in step 3 is the optimal one in terms of the random structure. Leave out the random effect, refit the model, and compare both models with the likelihood ratio test, and you will get p-values smaller than 0.001. The same holds if you drop the varIdent variance structure if you use the varIdent with only sex or only with period. The R code to do these analyses was given in Chapters 4 and 5.

7.6.4.4 Steps 7-8 of the Protocol: Find the Optimal Fixed Structure

It is now time to find the optimal model in terms of the explanatory variables time, period, and sex. We use the likelihood ratio test with ML estimation. The starting model contains Time, Time2, Period, and Sex. The last three can be dropped (Time is nested in Time2 and cannot be dropped). The R code to do this is as follows.

> M1.lme <- lme(IntakeRate ~ Time + Time2 + fPeriod +

fSex, data = Limosa, weights = varIdent(form =~ 1 | fSex * fPeriod), random =~ 1 | fID, method = "ML")

The output is not shown here, but the least significant term is Period (L = 1.28, df = 2, p = 0.52); hence, it can be dropped. In the next round, Time2 is dropped, followed by Time in the third round. In the fourth and last round, we have a model that only contains Sex. The following code gives us one p-value for the nominal variable Sex (the update command fits a model with only the intercept):

> M4.lme <- lme(IntakeRate ~ fSex, data = Limosa, weights = varIdent(form =~ 1 | fSex * fPeriod), random =~ 1 | fID, method = "ML")

Model df AIC BIC logLik Test L.Ratio p-value

M4.lme 1 11 -359.3379 -322.6779 190.6689

M4.lmeA 2 9 -355.4784 -325.4839 186.7392 1 vs 2 7.85945 0.0196

Hence, the optimal model contains only Sex in the fixed part of the model. If we have to quote a p-value for this term, it will be 0.0196, which is not very impressive. A model validation shows that everything is now ok (no heterogeneity patterns in the normalised residuals).

7.6.4.5 Step 9 of the Protocol: Refit with REML

We now discuss the numerical output of the model. First we have to refit it with REML.

> M4.lme <- lme(IntakeRate ~ fSex, data = Limosa, weights = varIdent(form =~ 1 | fSex * fPeriod), random =~ 1 | fID, method = "REML")

> summary(M4.lme)

Linear mixed-effects model fit by REML. Data: Limosa AIC BIC logLik