Julian Faraway 22 September 2020
See the introduction for an overview. Load the libraries:
library(ggplot2)
library(INLA)
Load in and plot the data:
data(penicillin, package="faraway")
summary(penicillin)
treat blend yield
A:5 Blend1:4 Min. :77
B:5 Blend2:4 1st Qu.:81
C:5 Blend3:4 Median :87
D:5 Blend4:4 Mean :86
Blend5:4 3rd Qu.:89
Max. :97
library(ggplot2, quietly=TRUE)
ggplot(penicillin,aes(x=blend,y=yield,group=treat,linetype=treat))+geom_line()
ggplot(penicillin,aes(x=treat,y=yield,group=blend,linetype=blend))+geom_line()
Fit the default INLA model:
formula = yield ~ treat+f(blend, model="iid")
result = inla(formula, family="gaussian", data=penicillin)
result <- inla.hyperpar(result)
summary(result)
Call:
"inla(formula = formula, family = \"gaussian\", data = penicillin)"
Time used:
Pre = 1.16, Running = 0.181, Post = 0.0755, Total = 1.42
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) 84.048 2.447 79.198 84.045 88.905 84.040 0
treatB 0.947 3.461 -5.931 0.949 7.799 0.955 0
treatC 4.922 3.461 -1.959 4.927 11.771 4.934 0
treatD 1.941 3.461 -4.938 1.944 8.792 1.949 0
Random effects:
Name Model
blend IID model
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Precision for the Gaussian observations 3.7e-02 1.2e-02 0.017 3.50e-02 6.4e-02 0.033
Precision for blend 2.0e+04 2.0e+04 500.709 1.39e+04 7.4e+04 0.082
Expected number of effective parameters(stdev): 3.96(0.014)
Number of equivalent replicates : 5.04
Marginal log-Likelihood: -80.80
Precision for the blend effect looks implausibly large. Problem with default gamma prior.
Try a half-normal prior on the blend precision. I have used variance of the response to help with the scaling so these are more informative.
resprec <- 1/var(penicillin$yield)
formula = yield ~ treat+f(blend, model="iid", prior="logtnormal", param=c(0, resprec))
result = inla(formula, family="gaussian", data=penicillin)
result <- inla.hyperpar(result)
summary(result)
Call:
"inla(formula = formula, family = \"gaussian\", data = penicillin)"
Time used:
Pre = 1.22, Running = 0.161, Post = 0.0745, Total = 1.46
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) 84.030 2.772 78.505 84.029 89.553 84.028 0
treatB 0.966 2.765 -4.546 0.968 6.456 0.972 0
treatC 4.950 2.765 -0.565 4.954 10.437 4.960 0
treatD 1.962 2.765 -3.551 1.965 7.451 1.969 0
Random effects:
Name Model
blend IID model
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Precision for the Gaussian observations 0.059 2.30e-02 0.024 0.056 0.113 0.050
Precision for blend 956.937 4.52e+05 0.013 0.074 2.644 0.031
Expected number of effective parameters(stdev): 6.73(0.929)
Number of equivalent replicates : 2.97
Marginal log-Likelihood: -78.71
Looks more plausible. Compute the transforms to an SD scale for the blend and error. Make a table of summary statistics for the posteriors:
sigmaalpha <- inla.tmarginal(function(x) 1/sqrt(exp(x)),result$internal.marginals.hyperpar[[2]])
sigmaepsilon <- inla.tmarginal(function(x) 1/sqrt(exp(x)),result$internal.marginals.hyperpar[[1]])
restab=sapply(result$marginals.fixed, function(x) inla.zmarginal(x,silent=TRUE))
restab=cbind(restab, inla.zmarginal(sigmaalpha,silent=TRUE))
restab=cbind(restab, inla.zmarginal(sigmaepsilon,silent=TRUE))
restab=cbind(restab, sapply(result$marginals.random$blend,function(x) inla.zmarginal(x, silent=TRUE)))
colnames(restab) = c("mu","B-A","C-A","D-A","alpha","epsilon",levels(penicillin$blend))
data.frame(restab)
mu B.A C.A D.A alpha epsilon Blend1 Blend2 Blend3 Blend4 Blend5
mean 84.03 0.96583 4.9504 1.962 3.9306 4.3484 4.1649 -2.0883 -0.69619 1.3925 -2.7824
sd 2.7724 2.7651 2.7651 2.7651 1.9924 0.90416 2.8532 2.6104 2.5357 2.564 2.6748
quant0.025 78.501 -4.5497 -0.5684 -3.5543 0.61883 2.9784 -0.75786 -7.676 -5.9853 -3.5355 -8.5319
quant0.25 82.26 -0.82636 3.1588 0.16993 2.5717 3.6984 2.1561 -3.6663 -2.1987 -0.20884 -4.4214
quant0.5 84.016 0.95464 4.9403 1.9511 3.6611 4.2103 3.9991 -1.9371 -0.6234 1.2428 -2.6229
quant0.75 85.772 2.7336 6.7188 3.7299 5.0027 4.8459 5.9317 -0.3922 0.80284 2.8989 -0.97575
quant0.975 89.538 6.4399 10.421 7.4353 8.6559 6.4986 10.231 2.733 4.3114 6.8014 1.9915
Also construct a plot the SD posteriors:
ddf <- data.frame(rbind(sigmaalpha,sigmaepsilon),errterm=gl(2,nrow(sigmaalpha),labels = c("alpha","epsilon")))
ggplot(ddf, aes(x,y, linetype=errterm))+geom_line()+xlab("yield")+ylab("density")+xlim(0,15)
Posterior for the blend SD is more diffuse than the error SD. Posterior for the blend SD has non-zero density at zero.
Now try more informative gamma priors for the precisions. Define it so
the mean value of gamma prior is set to the inverse of the variance of
the fixed-effects model residuals. We expect the two error variances to
be lower than this variance so this is an overestimate. The variance of
the gamma prior (for the precision) is controlled by the apar
shape
parameter - smaller values are less informative.
apar <- 0.5
lmod <- lm(yield ~ treat, data=penicillin)
bpar <- apar*var(residuals(lmod))
lgprior <- list(prec = list(prior="loggamma", param = c(apar,bpar)))
formula = yield ~ treat+f(blend, model="iid", hyper = lgprior)
result <- inla(formula, family="gaussian", data=penicillin)
result <- inla.hyperpar(result)
summary(result)
Call:
"inla(formula = formula, family = \"gaussian\", data = penicillin)"
Time used:
Pre = 1.15, Running = 0.148, Post = 0.0728, Total = 1.37
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) 84.030 3.037 77.986 84.029 90.080 84.027 0.001
treatB 0.967 2.733 -4.481 0.969 6.392 0.973 0.000
treatC 4.952 2.733 -0.499 4.955 10.374 4.961 0.000
treatD 1.963 2.733 -3.486 1.965 7.388 1.970 0.000
Random effects:
Name Model
blend IID model
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Precision for the Gaussian observations 0.062 0.023 0.025 0.059 0.115 0.053
Precision for blend 0.070 0.054 0.010 0.056 0.213 0.033
Expected number of effective parameters(stdev): 7.09(0.585)
Number of equivalent replicates : 2.82
Marginal log-Likelihood: -78.99
Compute the summaries as before:
sigmaalpha <- inla.tmarginal(function(x) 1/sqrt(exp(x)),result$internal.marginals.hyperpar[[2]])
sigmaepsilon <- inla.tmarginal(function(x) 1/sqrt(exp(x)),result$internal.marginals.hyperpar[[1]])
restab=sapply(result$marginals.fixed, function(x) inla.zmarginal(x,silent=TRUE))
restab=cbind(restab, inla.zmarginal(sigmaalpha,silent=TRUE))
restab=cbind(restab, inla.zmarginal(sigmaepsilon,silent=TRUE))
restab=cbind(restab, sapply(result$marginals.random$blend,function(x) inla.zmarginal(x, silent=TRUE)))
colnames(restab) = c("mu","B-A","C-A","D-A","alpha","epsilon",levels(penicillin$blend))
data.frame(restab)
mu B.A C.A D.A alpha epsilon Blend1 Blend2 Blend3 Blend4 Blend5
mean 84.03 0.96662 4.9516 1.9629 4.7091 4.2387 4.6789 -2.3396 -0.77994 1.5598 -3.1194
sd 3.0376 2.7328 2.7329 2.7328 2.1182 0.84686 3.008 2.9173 2.8901 2.9003 2.941
quant0.025 77.977 -4.4847 -0.50307 -3.4893 2.1707 2.9473 -0.9083 -8.397 -6.6541 -4.1024 -9.2773
quant0.25 82.132 -0.80278 3.1828 0.19361 3.2989 3.6339 2.7317 -4.0828 -2.5208 -0.23398 -4.8748
quant0.5 84.014 0.95558 4.9416 1.9521 4.2169 4.1125 4.531 -2.2761 -0.767 1.4916 -3.0344
quant0.75 85.896 2.7119 6.6975 3.7083 5.5239 4.7015 6.4483 -0.54381 0.96007 3.269 -1.2872
quant0.975 90.058 6.3768 10.359 7.3723 10.175 6.2517 11.022 3.2473 4.9132 7.4932 2.4377
Make the plots:
ddf <- data.frame(rbind(sigmaalpha,sigmaepsilon),errterm=gl(2,nrow(sigmaalpha),labels = c("alpha","epsilon")))
ggplot(ddf, aes(x,y, linetype=errterm))+geom_line()+xlab("yield")+ylab("density")+xlim(0,15)
Posterior for blend SD has no weight near zero.
In Simpson et al (2015), penalized complexity priors are proposed. This requires that we specify a scaling for the SDs of the random effects. We use the SD of the residuals of the fixed effects only model (what might be called the base model in the paper) to provide this scaling.
lmod <- lm(yield ~ treat, data=penicillin)
sdres <- sd(residuals(lmod))
pcprior <- list(prec = list(prior="pc.prec", param = c(3*sdres,0.01)))
formula <- yield ~ treat + f(blend, model="iid", hyper = pcprior)
result <- inla(formula, family="gaussian", data=penicillin)
result <- inla.hyperpar(result)
summary(result)
Call:
"inla(formula = formula, family = \"gaussian\", data = penicillin)"
Time used:
Pre = 1.16, Running = 0.171, Post = 0.0757, Total = 1.4
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) 84.032 2.638 78.788 84.030 89.272 84.028 0
treatB 0.965 2.817 -4.654 0.967 6.560 0.972 0
treatC 4.949 2.817 -0.673 4.952 10.541 4.959 0
treatD 1.961 2.817 -3.659 1.963 7.555 1.968 0
Random effects:
Name Model
blend IID model
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Precision for the Gaussian observations 5.8e-02 2.30e-02 0.023 0.054 0.111 0.047
Precision for blend 6.4e+03 3.51e+06 0.016 0.102 11.140 0.040
Expected number of effective parameters(stdev): 6.44(1.06)
Number of equivalent replicates : 3.11
Marginal log-Likelihood: -78.92
Compute the summaries as before:
sigmaalpha <- inla.tmarginal(function(x) 1/sqrt(exp(x)),result$internal.marginals.hyperpar[[2]])
sigmaepsilon <- inla.tmarginal(function(x) 1/sqrt(exp(x)),result$internal.marginals.hyperpar[[1]])
restab=sapply(result$marginals.fixed, function(x) inla.zmarginal(x,silent=TRUE))
restab=cbind(restab, inla.zmarginal(sigmaalpha,silent=TRUE))
restab=cbind(restab, inla.zmarginal(sigmaepsilon,silent=TRUE))
restab=cbind(restab, sapply(result$marginals.random$blend,function(x) inla.zmarginal(x, silent=TRUE)))
colnames(restab) = c("mu","B-A","C-A","D-A","alpha","epsilon",levels(penicillin$blend))
data.frame(restab)
mu B.A C.A D.A alpha epsilon Blend1 Blend2 Blend3 Blend4 Blend5
mean 84.032 0.96456 4.9485 1.9606 3.3676 4.4304 3.7546 -1.8661 -0.62252 1.2443 -2.4724
sd 2.6382 2.8165 2.8165 2.8165 1.9125 0.93848 2.7536 2.4171 2.3075 2.3488 2.5159
quant0.025 78.784 -4.6575 -0.67705 -3.6624 0.29798 3.0047 -0.71206 -7.1459 -5.5114 -3.176 -7.9728
quant0.25 82.337 -0.85827 3.1264 0.13789 2.078 3.7521 1.7186 -3.3155 -1.9492 -0.20167 -4.0201
quant0.5 84.018 0.95327 4.9384 1.9496 3.1286 4.2888 3.566 -1.6535 -0.50078 1.0342 -2.2628
quant0.75 85.699 2.7626 6.7472 3.7587 4.3632 4.9538 5.4831 -0.25031 0.69006 2.5974 -0.67833
quant0.975 89.259 6.5438 10.524 7.5389 7.9169 6.6505 9.651 2.4311 3.9104 6.2997 1.7615
Make the plots:
ddf <- data.frame(rbind(sigmaalpha,sigmaepsilon),errterm=gl(2,nrow(sigmaalpha),labels = c("alpha","epsilon")))
ggplot(ddf, aes(x,y, linetype=errterm))+geom_line()+xlab("yield")+ylab("density")+xlim(0,15)
Posterior for blend SD has some weight near zero. Results are comparable to previous analyses.
sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] INLA_20.03.17 foreach_1.5.0 sp_1.4-2 Matrix_1.2-18 ggplot2_3.3.2 knitr_1.29
loaded via a namespace (and not attached):
[1] pillar_1.4.6 compiler_4.0.2 iterators_1.0.12 tools_4.0.2 digest_0.6.25
[6] evaluate_0.14 lifecycle_0.2.0 tibble_3.0.3 gtable_0.3.0 lattice_0.20-41
[11] pkgconfig_2.0.3 rlang_0.4.7 yaml_2.2.1 xfun_0.16 withr_2.2.0
[16] dplyr_1.0.2 stringr_1.4.0 MatrixModels_0.4-1 generics_0.0.2 vctrs_0.3.4
[21] grid_4.0.2 tidyselect_1.1.0 glue_1.4.2 R6_2.4.1 rmarkdown_2.3
[26] farver_2.0.3 purrr_0.3.4 magrittr_1.5 splines_4.0.2 scales_1.1.1
[31] codetools_0.2-16 ellipsis_0.3.1 htmltools_0.5.0.9000 colorspace_1.4-1 Deriv_4.0.1
[36] labeling_0.3 stringi_1.4.6 munsell_0.5.0 crayon_1.3.4