Funnily, mixed effect regression was the first type of regression analysis I learned (I was handed a huge, complex data set with no prior R experience and told to analyze it). I compiled a collection of papers, links, and books that I used to teach myself. Right now it's a bit disorganized, but I will slowly add structure when I have free time. My goal is to provide the links and a description of why they were useful/why I needed that information, so one can follow along and self-teach too.
I am also continually updating it as new sources arise.
To begin, the following are MUST READ books, papers, and tutorials. They really set the foundation for understanding and building multilevel models and what they capture beyond ordinary least squares regression (they aren't necessarily in reading order, and links may need to be updated too).
 Gelman and Hill: Data Analysis Using Regression and Multilevel/Hierarchical Models
 Andrew Gelman is the boss. Although the techniques and packages he uses aren’t necessarily what I ended up using in my analysis flow, he explains the intuition behind MLM very well.
 Nezlek 2008: Multilevel modeling for social and personality psychology
 Super easy to understand introduction to MLM using the HLM package (which I don’t use), but it helped me understand the equations more intuitively.
 Barr 2013: Random effects structure for confirmatory hypothesis testing: Keep it maximal
 Setting the random effect structure can be a confusing and complicated task, and the incorrect structure can lead to inflated false positives. Barr provides extremely helpful advice on how to go about this. I find myself rereading this every time I encounter a new analysis problem.
 The main paper only discusses random main effects; in this paper he adds interactions.
 Also important to read [link, pdf version]. The same authors directly compare ANOVA with mixed regression and clarify misunderstandings about both.
 Baayen 2017: The Cave of Shadows: Addressing the human factor with generalized additive mixed models
 Argues against maximal random effect structures (Barr 2013) and provides alternative practices. I was only recently told about this paper and the Bates 2015 below, so maybe the answer isn't as clean as "fit the maximal model possible."
 Bates 2015: Parsimonious mixed models
 Also argues against maximal models and also provides alternative approaches.
 Brauer & Curtin 2017: Linear Mixed-Effects Models and the Analysis of Nonindependent Data: A Unified Framework to Analyze Categorical and Continuous Independent Variables that Vary Within-Subjects and/or Within-Items
 Very accessible tutorial. What I like most about it is that it has the R code next to the more formal equation, so one can learn how the two notations relate.
 Matuschek 2017: Balancing Type I error and power in linear mixed models
 This paper unpacks both viewpoints on maximal models. Here’s a twitter thread on the matter.
 Harrison 2018: A brief introduction to mixed effects modelling and multimodel inference in ecology
 Newer paper giving an introduction to mixed modeling in ecology; it also provides some good general tips for the modeling decisions that will come up when constructing your own models.
 Baayen 2008: Mixed-effects modeling with crossed random effects for subjects and items.
 Great paper on the necessity of including stimuli as random effects along with subjects.
 Judd 2012 made the same point in that this is typically neglected in social psychology.
 Westfall, Nichols, & Yarkoni made the same point for fMRI analysis.
 West: LINEAR MIXED MODELS A Practical Guide Using Statistical Software
 This book provides a good overview of using R to run these models. It includes model building methods that I use as a source to help with exploratory analyses.
 Also, here is a related book by the same author on the actual R functions for mixed effect modeling.
 Winter: A very basic tutorial for performing linear mixed effects analyses
 This is a quick tutorial, but it explains the concepts very clearly. He simplifies the language, so it was excellent when I was first learning.
 Knowles: Getting Started with Multilevel modeling in R
 Another basic tutorial, but it was instrumental in helping me learn and explore the models.
 Here is the second part of the tutorial.
 Nakagawa 2012: Nested by design: model fitting and interpretation in a mixed model era
 Great paper explaining how to deal with nested/crossed designs in mixed models and generally explains all the components of a mixed model.
 Bolker 2009: Generalized linear mixed models: a practical guide for ecology and evolution
 Great accessible paper on the mechanics/technical side of the models.
 Schielzeth 2009: Conclusions beyond support: overconfident estimates in mixed models
 You need random slopes, not just random intercepts, to protect against anti-conservative fixed effect estimates.
 Formulae in R: Anova and other models, mixed and fixed
 Great resource for the syntax of the code in R
 Howell – Mixed models for missing data with repeated measures
 Hajduk 2017: Introduction to mixed models – great tutorial with R code.
 Choosing R packages for mixed effects modeling based on the car you drive
 Excellent and accessible comparison of the many different R packages for running mixed models.
 Freeman: Visualization of hierarchical models
 I’m a visual person and this tutorial shows an easy visual way to conceptualize mixed models (and links it to equations – which I find harder to digest)
The following are extra materials that are highly relevant but that I didn’t interact much with. They may (or may not) be useful.
 Pinheiro & Bates: Mixed-Effects Models in S and S-PLUS
 Zuur: Mixed effect models and extensions in ecology with R
 Finch, Bolin, & Kelley: Multilevel modeling using R
 Bates: lme4: Mixed-effects modeling with R
 Ogorek: Random regression coefficients using lme4
 Bolker: GLMM: worked examples
 Fox: Linear Mixed Models
 CCAGE: Linear Mixed Effects
 Ashander: Visualizing fits, inference, and implications of GLMMs
 Chitwood: Fixed Linear Modeling using lme4
 Bristol: Random Slope Models
 Tufts: Is a mixed model right for your needs?
 Granath: Random intercept/slope model vs nested random effect in R
 UCLA: Introduction to Generalized Linear Mixed Models
 Johnson 2014: Progress in regression, why natural language data calls for mixed models
 Ayeimanolr: Mixed effect modeling workshop
 Bolker: Mixed Models
 Bristol: Module 5: Introduction to Multilevel Modelling Concepts
 DataScience+: R for Publication: Lesson 6, Part 1 – Linear Mixed Effects Models
 Singmann: An Introduction to Mixed Models for Experimental Psychology
 Bolker: Linear mixed model: starling example
 Heck: Multilevel Modeling classes + resources
Finally, here are some pages that go over some of the basic questions I had during implementation. I will try to cluster them into overarching topics.
Understanding the analysis
Of course, the first search I did was to understand mixed regression in general. What does it do? Why do I need it? How is it different from other analyses?
 The Repeated and Random Statements in Mixed Models for Repeated Measures
 Explains mixed models in terms of the nonindependence of observations and has a concise explanation of controlling for random effects.
 Is it a fixed or random effect?
 One of the BIGGEST questions I had was what counts as a fixed or random effect in my data. There are lots of opinions on this, but this blog was the easiest to understand, and I think I agree with most of it.
 Have I correctly specified my model in lmer?
 Explaining Fixed Effects: Random Effects modelling of Time-Series Cross-Sectional and Panel Data
 Fixed effect vs random effect when all possibilities are included in a mixed effects model
 Specifying Fixed and Random Factors in Mixed Models
 Random and fixed-effects structure in linear mixed models
 More on random slopes and what it means if your effect is no longer significant after the inclusion of random slopes
 When I was learning how these models worked, I started playing with the random slopes and some of my predictors became “nonsignificant”, so I wondered what happened.
 Significant fixed effect only when random slope is included
 The opposite situation.
 Why the introduction of a random slope effect enlarged the slope’s SE?
 One of the interesting effects that I saw when I started was that the beta standard errors would increase (instead of decrease). I wondered why this was the case.
 Writing up lmer results
 After using mixed models, I was confused about how to report the results. I kept trying to fit it into an anova style reporting, but these examples helped me understand the conventions.
 Here is a link on how to report from likelihood ratio tests (below)
 Mixed models for ANOVA designs with one observation per unit of observation and cell of the design
 Anova in R
 What is the difference between fixed effect, random effect and mixed effect models?
 Random regression coefficients using lme4
 Showing shrinkage with a plot for the interaction coefficient in a mixed model
 Linear Models, ANOVA, GLMs and Mixed-Effects models in R
 A review of Mixed models
 Excellent tutorial from a Stanford stats class with R code
 Plotting partial pooling in mixed effect models
 Another great tutorial that provides visualization of the partial pooling/shrinkage advantages in MLM.
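Several of the questions above (what happens to the estimate and its SE when you add a random slope) are easy to explore directly. Here's a minimal sketch using lme4's built-in sleepstudy data — just an illustration, not anyone's prescribed workflow:

```r
library(lme4)

# Random intercepts only: every subject is forced to share one Days slope
m_int   <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Adding random slopes for Days
m_slope <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# The Days estimate barely moves, but its standard error grows once
# between-subject slope variability is modeled
coef(summary(m_int))["Days", ]
coef(summary(m_slope))["Days", ]
```

The larger SE in the random slope model is exactly the anti-conservativeness issue the links above discuss: the intercept-only model understates the uncertainty in the slope.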
Confidence Intervals
After running your regression, how do you get confidence intervals for your betas? Typically you use confint(model), or if you want Wald (asymptotic and fast but less precise) confidence intervals, confint(model, method = "Wald"). However, here are some links comparing confidence intervals across packages, and on the difference between prediction intervals and confidence intervals.
 How trustworthy are the confidence intervals for lmer objects through effects package?
 Provides a comparison of the different methods for calculating confidence intervals.
 Confidence Intervals for prediction in GLMMs
 How to get coefficients and their confidence intervals in mixed effects models?
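To make the options above concrete, here is a minimal sketch of the three main confint methods on an lmer fit (using lme4's built-in sleepstudy data):

```r
library(lme4)
fit <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Profile likelihood CIs (the confint default): slower, more accurate
confint(fit)

# Wald CIs: fast asymptotic approximation (returned for fixed effects only)
confint(fit, method = "Wald")

# Parametric bootstrap CIs: slowest, fewest distributional assumptions
confint(fit, method = "boot", nsim = 100)
```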
(Restricted) Maximum Likelihood Estimation
An important aspect of understanding these models is how the parameters are estimated (hint: not with least squares). They use Maximum Likelihood (ML) or Restricted Maximum Likelihood (REML).
 Searle: Variance Components
 EXCELLENT book for understanding these methods (and their relation to ANOVA estimation)
 REML vs ML stepAIC
 Has some good information on when it's appropriate to use either in model selection
 A few words about REML
 Excellent handout on each.
 Estimating Parameters in Linear MixedEffects Models
 Mathy source for how this works.
 When no model comparison, should I use REML vs ML?
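As a quick illustration of where the REML/ML distinction bites in practice (the standard advice is to fit with ML when comparing models that differ in fixed effects), a sketch with lme4's sleepstudy data:

```r
library(lme4)

# REML (the lmer default): preferred for estimating variance components
m_reml <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# ML: needed when comparing models that differ in their FIXED effects
m_full <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy, REML = FALSE)
m_null <- lmer(Reaction ~ 1    + (1 | Subject), data = sleepstudy, REML = FALSE)

# anova() on lmer fits refits with ML automatically when needed,
# but setting REML = FALSE yourself makes the comparison explicit
anova(m_null, m_full)
```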
Inference
I understand the lack of p values in these models, but I come from traditional labs, so I had to learn how to draw p-value-based inferences from these models. There are many methods: the likelihood ratio test (LRT) for model comparison, lmerTest for both ANOVA- and predictor-style inference, bootstrapping, etc.
 Three ways to get parameterspecific pvalues from lmer
 Getting P value with mixed effect with lme4 package
 How to obtain the pvalue (check significance) of an effect in a lme4 mixed model?
 Significance Testing in Multilevel Regression
 How to get an “overall” pvalue and effect size for a categorical factor in a mixed model (lme4)?
 What is the null model for a likelihood ratio test of a withinsubjects factor?
 Good advice for what counts as a null model in lrt.
 F and Wald chisquare tests in mixedeffects models
 If you’re looking for more anova like results.
 lme vs. lmer
 This link advocates for the use of lrt for fixed effects.
 Satterthwaite vs KenwardRoger approximations for the df in mixed effects models
 If you use the lmerTest package to run your models so that p values are automatically included, there is the option of using Satterthwaite or Kenward-Roger approximations, so I wondered what the difference was.
 How are the likelihood ratio, Wald, and Lagrange multiplier (score) tests different and/or similar?
 Depending on the analysis, you may be using Wald-based inferences (these give you z statistics instead of t because the method is asymptotic and can't calculate the degrees of freedom needed for a t test; typically used for models that would take a long time to compute, or in logistic regression) or likelihood ratio tests. This provides a good comparison of these methods.
 Different pvalues for fixed effects in summary() of glmer() and likelihood ratio test comparison in R
 When trying these different inference methods out, sometimes they didn't agree (i.e., they gave different p values). Sometimes it had to do with REML vs ML, but there are also differences in the estimation methods that should be taken into account.
 Should I include this fixed effect? lme4 likelihood ratio test and lmerTest anova disagree
 Shows the horrors of not understanding what goes on under the hood with these functions based on how you process your data.
 DRAFT rsigmixedmodels FAQ
 Great resource for a more authoritative voice on inference and issues that may come up.
 lsmeans
 I personally use this package the most. It’s flexible in obtaining multiple comparisons (both of averages and slopes) AND estimating slopes/averages across variables and allows p value adjustment if needed.
 Here is another tutorial, and a question on p value adjustment.
 How to grab the estimates and plot them: Link
 Complex analyses/inferences
 Multiple Comparisons for GLMMs using glmer() & glht()
 Gelman: Why We (Usually) Don’t Have to Worry About Multiple Comparisons
 lmer multiple comparisons for interaction between continuous and categorical predictor
 Effect sizes in lmer
 I've pulled my hair out (jk) trying to figure out how to estimate effect sizes in lmer (especially with complex models). Model fit indices like R2 work for assessing some sort of effect size for full models, but I have found none for specific betas in the regression. If you know, please let me know too.
 Westfall shows a simple example of obtaining the d statistic, but it's not clear how this works for different model types.
 Some concerns to consider in standardizing variables in multilevel models
 I had a collaborator ask about standardizing variables, but I wasn't sure how this is done in MLM or what the consequences are; this provides some clues.
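As a sketch of two of the inference routes discussed above (the LRT via model comparison, and lmerTest's Satterthwaite-based tests), using lme4's sleepstudy data; the second part assumes the lmerTest package is installed:

```r
library(lme4)

# 1) Likelihood ratio test via model comparison (fit with ML, not REML)
m_full <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy, REML = FALSE)
m_null <- lmer(Reaction ~ 1    + (Days | Subject), data = sleepstudy, REML = FALSE)
anova(m_null, m_full)   # chi-square test for the Days fixed effect

# 2) lmerTest: Satterthwaite df and p values in summary()/anova()
#    (skipped if the package isn't available)
if (requireNamespace("lmerTest", quietly = TRUE)) {
  mt <- lmerTest::lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
  print(summary(mt))  # t tests with Satterthwaite degrees of freedom
  print(anova(mt))    # ANOVA-style F tests
}
```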
 Bootstrapping
 The bootstrap for linear model predictions
 Basic tutorial on using bootMer to bootstrap coefficients
 Using bootMer to do model comparison in R
 Great code snippets for comparing bootstrapped estimates across models.
 Model comparison and bootstrapping
 Introduction to bootstrap with applications to mixed effect models
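A minimal bootMer sketch, bootstrapping the fixed effects of a sleepstudy model and pulling percentile intervals (the statistic you pass as FUN is up to you):

```r
library(lme4)
fit <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Parametric bootstrap of the fixed effects (FUN can be any statistic
# computed from a fitted model)
b <- bootMer(fit, FUN = fixef, nsim = 200)

# Percentile interval for the Days slope
quantile(b$t[, "Days"], c(0.025, 0.975))
```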
Logistic Regression
I ended up modeling trial accuracy data, which is a binary outcome variable and thus requires logistic regression models. The implementation wasn’t difficult, but interpreting the results takes practice and care. These links are general tutorials that helped me understand implementation and coefficient interpretation.
 UCLA: R data examples: mixed effect logistic regression
 UCLA: Logit Regression
 UCLA: Deciphering Interactions in Logistic Regression
 This was an important link as interactions are a messy thing to interpret
 UCLA: Logistic regression with stata
 Not in the R language, but provides interpretation intuitions.
 Why use Odds Ratios in Logistic Regression
 One thing I was confused about was what to report from a logistic regression. Do I report log odds, probabilities, or odds ratios? It seems different fields vary, but I stick to odds ratios now.
 Odds Ratios NEED To Be Graphed On Log Scales
 How to create an odds ratio and 95% CI plot in R
 I used this link for the small code at the bottom that I always forget (scale_y_log10) to plot odds ratios.
 ggplot2: stat_smooth for logistic outcomes with facet_wrap returning ‘full’ or ‘subset’ glm models
 You can't just use ggplot to plot the regression directly from the data, because it will miss the nuances of your model (multiple predictors or random effects). So you have to predict values from the model to plot.
 Graphing a Probability Curve for a Logit Model With Multiple Predictors
 Output of logistic model in R
 Provides information on how to get predicted probabilities or odds ratios from the model (for plotting).
 Here is another link for this.
 Logistic Regression in R (Odds Ratio)
 Quick understanding of how to get confidence intervals. I don’t necessarily use this method anymore, but still useful.
 Binomial glmm with a categorical variable with full successes
 If your SEs are crazy large (>1000s), there might be complete separation (though you should plot your data first to figure this out).
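To make the odds-ratio interpretation concrete, here's a minimal mixed effects logistic regression sketch using lme4's built-in cbpp data, exponentiating the log-odds coefficients into odds ratios:

```r
library(lme4)

# Mixed effects logistic regression on lme4's built-in cbpp data
# (binomial counts: incidence out of size, per herd and period)
fit <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             family = binomial, data = cbpp)

# Coefficients come out as log odds; exponentiate to get odds ratios
or    <- exp(fixef(fit))
or_ci <- exp(confint(fit, parm = "beta_", method = "Wald"))
cbind(OR = or, or_ci)
```

An odds ratio below 1 means lower odds of incidence relative to the reference period; remember these are conditional on the random effect, not population-averaged.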
Model Building
I keep getting mixed advice about this approach and its varieties. I was taught by a statistician who said stepwise approaches were OK, but I've read otherwise. For exploratory work (as opposed to confirmatory) this may be fine, but do what you want. I'll just post the materials I used to understand these methods.
 Why I don’t do backwards selection
 I think the title says it all.
 Random slopes in LME
 Random effect: Should I stay or should I go?
 Is adjusting pvalues in a multiple regression for multiple comparisons a good idea?
 Using bootMer to do model comparison in R
 R lmerTest and Tests of Multiple Random Effects
Model Complexity
When I first started, I wondered how crazy these models can get. Can I just throw every variable in? Are there costs/benefits/limitations to parsimony vs complexity?
 Is it possible to have too many random intercepts? – linguistic example
 Why are your statistical models more complex these days? – thoughts on why models are becoming increasingly complex given software development, etc.
 The above articles by Barr, Baayen, & Bates on best approaches inform this problem.
Model Fits
Diagnosing whether the model fits well and how to do so is important. This typically involves some form of checking unexplained variance along with examining assumptions.
 How High Should Rsquared Be in Regression Analysis?
 Just gave me a sense of what I should expect from indicators like R2
 Interpreting residual plots to improve your regression
 Not necessarily related to mixed models, but very informative on residual shapes.
 Nakagawa 2012: A general and simple method for obtaining R2 from generalized linear mixed-effects models
 Excellent paper on how to calculate R2 (two different kinds) for mixed models.
 Calculating R2 in mixed models using Nakagawa & Schielzeth’s (2013) R2glmm method
 R^2 FOR LINEAR MIXED EFFECTS MODELS
 Another function for r2, uses the equations from the Nakagawa paper. Provides a good discussion of the index.
 Visualizing fits, inference, implications of (G)LMMs
 Diagnosing Linear Models
 Diagnosing Logistic Models
 Unexpected residuals plot of mixed linear model using lmer (lme4 package) in R
 Plotting residuals for GLMER model and zero counts
 Visualizing (generalized) linear mixed effects models
 Use predicted values with or without random part to plot Residuals with binnedplot of a logistic regression in glmer (lme4 package) in R?
 Interpreting a binned residual plot in logistic regression
 R squared in logistic regression
 The HosmerLemeshow goodness of fit test for logistic regression
Convergence
When the data are not rich enough for the model, or the model is too complex, it will not converge, which tends to render your estimates unreliable. So this is an important issue to either fix or look into to see how bad it is.
 Convergence error for development version of lme4
 Optimization in R
 lme4 convergence warnings: troubleshooting
 I go back to this specific page often.
 Correlated random effects
 Fitting linear mixedeffects models using lme4
 When are zerocorrelation mixed models theoretically sound?
 This problem came up recently. People sometimes uncorrelate their random effects to achieve model convergence (instead of dropping random effects). However, I wondered if there was a hidden assumption or consequence to this; these two links provide some discussion on the topic.
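The zero-correlation trick mentioned above is just the double-bar syntax in lme4. A sketch showing that it removes exactly one parameter (the intercept-slope correlation), using the sleepstudy data:

```r
library(lme4)

# Full structure: random intercepts, random Days slopes, and their correlation
m_corr   <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Zero-correlation ("double bar") version: one fewer parameter to estimate,
# a common fallback when the full model won't converge
m_nocorr <- lmer(Reaction ~ Days + (Days || Subject), data = sleepstudy)

# 3 covariance parameters vs 2 (the correlation is gone)
length(getME(m_corr, "theta"))
length(getME(m_nocorr, "theta"))
```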
Variance Components
I'm currently working on projects that are more interested in the variance components than the betas. The variance components tell you how much the means vary across units of your random effects, e.g., if participant is a random effect, how much intercepts vary across participants. Important to this topic are intraclass correlations (ICC), variance partitioning coefficients (VPC), and their interrelations.
 Searle: Variance Components
 Making sense of random effects
 Neat ecology example to help understand the variance components
 What are variance components?
 Goldstein 2002: Partitioning variation in multilevel models
 I use this paper often to understand how the ICC relates to the VPC in random-intercept-only models and in models that also include random slopes.
 VPC differences between interceptonly and random slope models
 Other estimation methods
 The Intraclass Correlation Coefficient in Mixed Models
 Great introductory post on how the ICC works in mixed models.
 Interpreting the random effect in a mixedeffect model
 Great small tutorial on how to think about variance components and how they relate to the ICC
 Intraclass correlation (ICC) for an interaction?
 Random Slope Models
 I go back to this to understand why the VPC is not the same as the ICC in random slope models (under "calculating total variance")
 The ICC equation isn’t the same for interceptonly models and random slope models.
 Computing repeatability of effects from an lmer model
 Mixed models: Calculating ICC for model with a random intercept and a random slope
 Intraclass Correlation Coefficient in mixed model with random slopes
 intraclass correlation (ICC) for random intercept and slope
 How to partition the variance explained at group and individual level
 ICC Confidence Intervals
 Prediction interval for lmer() mixed effects model in R
 I use this specifically for the bootMer function, which lets you run parametric bootstraps. It's useful not just for prediction intervals; you can write a function that estimates the ICC from a model and bootstrap it to get ICC confidence intervals, getting closer to making ICC inferences.
 Mixed models: When can adding a predictor increase the residual variance?
 It seems mixed models are more complicated in terms of how we expect the variance to change when we add fixed effects.
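For a random-intercept model the ICC calculation is simple enough to do by hand from the VarCorr output. A minimal sketch (sleepstudy again):

```r
library(lme4)
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Variance components as a data frame: one row per component
vc <- as.data.frame(VarCorr(fit))
between <- vc$vcov[vc$grp == "Subject"]   # random intercept variance
within  <- vc$vcov[vc$grp == "Residual"]  # residual variance

# ICC for a random-intercept-only model:
# proportion of total variance attributable to subjects
icc <- between / (between + within)
icc
```

As the links above stress, this simple form only holds for random-intercept models; with random slopes the VPC varies with the predictor and this single number no longer tells the whole story.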
Reliability
Related to variance components, the within/between-subject variance can give you a sense of the reliability of your measure. The within-subject variance would be the residuals that aren't captured by the model; the between-subject variance would be the random effect groupings. Not all links are necessarily mixed model related, but they may be useful. Note: this is intimately related to the variance components/ICC section above, so those sources will also help.
 Nakagawa 2010: Repeatability for Gaussian and non-Gaussian data: a practical guide for biologists
 Excellent paper that I often refer back to, especially the equations. They have an R package to estimate repeatability that I don't use myself, but it may be useful for others.
 Stoffel 2017: rptR: Repeatability estimation and variance decomposition by generalized linear mixedeffects models
 Reliability Analysis explained – very detailed about all the possible ways to measure reliability
 What does Cronbach’s alpha mean?
 Some illustrations of reliability analysis.
 provides a hint at which ICC you get from lmer depending on model structure
Power Analysis
The hardest part (for me) about starting a study is determining power, especially when your analyses consist of complex mixed models. I haven’t fully read through all of these links, but I am aggregating them to read soon.
 Multilevel Power Tool
 Judd, Westfall, & Kenny: Experiments with More than One Random Factor: Designs, Analytic Models, and Statistical Power
 Westfall, Kenny, & Judd: Statistical Power and Optimal Design in Experiments in Which Samples of Participants Respond to Samples of Stimuli
 Sample size calculation for mixed models
 Kain, Bolker, & McCoy: A practical guide and power analysis for GLMMs: detecting among treatment variation in random effects
 Finding Power and Sample Size for Mixed Models in Study Designs with Repeated Measures and Clustering
 Power Analysis for mixed-effect models in R
 Awesome looking package for automated simulation power analyses; less up-front work than creating the simulations from scratch
 R package
 Test examples
 Example from scratch
 Another example
 Make Power Fun (simglm package)
 Simulation-based power analysis for mixed models in lme4
 Simulation methods to estimate design power: an overview for applied research
 Power Analysis by Simulation: R, RCT, Malaria Example
 Power Analysis Simulations in R
 simstudy R package
 Using simulation for power analysis: an example based on a stepped wedge study design
 Power analysis calculator for stimulus x participant crossed designs by Westfall
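The simulation-based approach in several of these links boils down to: simulate data under assumed effect sizes, fit the model, and count how often the effect is detected. A from-scratch sketch with made-up design numbers (30 subjects, 10 trials, a true slope of 2, and the noise SDs — all assumptions for illustration only):

```r
library(lme4)

set.seed(1)
n_subj <- 30; n_trial <- 10

# Power = proportion of simulated data sets in which the x effect is detected
sim_power <- function(n_sims = 100) {
  hits <- 0
  for (i in seq_len(n_sims)) {
    d <- expand.grid(subj = factor(seq_len(n_subj)), x = seq_len(n_trial))
    d$y <- 2 * d$x +                      # assumed true fixed effect of x
      rnorm(n_subj, sd = 5)[d$subj] +     # random intercepts per subject
      rnorm(nrow(d), sd = 10)             # residual noise
    m1 <- lmer(y ~ x + (1 | subj), data = d, REML = FALSE)
    m0 <- lmer(y ~ 1 + (1 | subj), data = d, REML = FALSE)
    hits <- hits + (anova(m0, m1)$`Pr(>Chisq)`[2] < 0.05)
  }
  hits / n_sims
}
sim_power(100)  # estimated power under these assumptions
```

Packages like simr and simglm wrap this loop up for you, but writing it once by hand makes clear what assumptions the power estimate depends on.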
Bayesian
This is an approach I'm slowly starting to look into: how to make my multilevel models Bayesian. Here are some packages that are helpful.
 BRMS package
 Uses the same R syntax as lmer but runs Bayesian estimation.
 Here is another link for it, tutorial
 Tutorial for posterior samples + tidyBayes
 glmer2stan
 A function like lmer, but it uses Stan as the backend. A bit hard to use, imo.
 rstanarm
 very complete tutorial on how to use this package here.
 Correlated Psychological Variables, Uncertainty, and Bayesian Estimation
 BayesFactor package
Miscellaneous
This is just stuff I learned through the process that may not be directly related to mixed models.
 Convenience function for parallel estimation of multiple (lmer) models
 SUPER USEFUL function for running many lmer models in parallel. I often have many models to run, and this speeds up computation greatly (R uses only one core by default; this gives you access to as many as your computer has).
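The idea behind that convenience function can be sketched with base R's parallel package (this is my own illustration of the pattern, not the linked function itself; the candidate formulas are hypothetical):

```r
library(parallel)
library(lme4)

# Hypothetical set of candidate models to fit to the same data
formulas <- list(
  Reaction ~ Days + (1 | Subject),
  Reaction ~ Days + (Days | Subject)
)

# Forked parallelism isn't available on Windows, so fall back to one core there
n_cores <- if (.Platform$OS.type == "windows") 1 else max(1, detectCores() - 1)

# Fit every model in parallel, one core per model
fits <- mclapply(formulas, function(f) lmer(f, data = sleepstudy),
                 mc.cores = n_cores)
sapply(fits, AIC)
```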
 Beautiful plotting in R: A ggplot2 cheatsheet
 Not lying, beautiful tutorial on ggplot functioning.
 Functions
 Some functions for diagnostic plots for lmer + other stuff. I never used them, but they could be useful for others.
 Knowles: Explore multilevel models faster with the new merTools R package
 Haven’t used this package, but looks really cool.
So these are the links I found most useful, and I will keep updating as I continue forward. When I have more time, I will make the links more descriptive, as they are cryptic at the moment.