Review
Generalized linear mixed models: a practical guide for ecology and evolution

https://doi.org/10.1016/j.tree.2008.10.008

How should ecologists and evolutionary biologists analyze nonnormal data that involve random effects? Nonnormal data such as counts or proportions often defy classical statistical procedures. Generalized linear mixed models (GLMMs) provide a more flexible approach for analyzing nonnormal data when random effects are present. The explosion of research on GLMMs in the last decade has generated considerable uncertainty for practitioners in ecology and evolution. Despite the availability of accurate techniques for estimating GLMM parameters in simple cases, complex GLMMs are challenging to fit, and statistical inference such as hypothesis testing remains difficult. We review the use (and misuse) of GLMMs in ecology and evolution, discuss estimation and inference, and summarize ‘best-practice’ data analysis procedures for scientists facing this challenge.

Section snippets

Generalized linear mixed models: powerful but challenging tools

Data sets in ecology and evolution (EE) often fall outside the scope of the methods taught in introductory statistics classes. Where basic statistics rely on normally distributed data, EE data are often binary (e.g. presence or absence of a species in a site [1], breeding success [2], infection status of individuals or expression of a genetic disorder [3]), proportions (e.g. sex ratios [4], infection rates [5] or mortality rates within groups) or counts (number of emerging seedlings [6], number

Estimation

Estimating the parameters of a statistical model is a key step in most statistical analyses. For GLMMs, these parameters are the fixed-effect parameters (effects of covariates, differences among treatments and interactions: in Box 1, these are the overall fruit set per individual and the effects of fertilization, clipping and their interaction on fruit set) and random-effect parameters (the standard deviations of the random effects: in Box 1, variation in fruit set, fertilization, clipping and
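To make the distinction between the two kinds of parameters concrete, the following is a minimal simulation sketch, not the analysis from Box 1. It assumes a Poisson response with a log link, a 2 × 2 fertilization-by-clipping design, and a single random intercept for a generic grouping factor. The function name `simulate_poisson_glmm`, the grouping factor and all numeric values are hypothetical, chosen only for illustration.

```python
import numpy as np


def simulate_poisson_glmm(beta, sigma_group, n_groups=20, reps=5, seed=0):
    """Simulate counts from a random-intercept Poisson GLMM (illustrative only).

    beta        : fixed-effect parameters on the log scale, in the order
                  (intercept, fertilization, clipping, interaction)
    sigma_group : random-effect parameter, the standard deviation of the
                  group-level intercepts (hypothetical grouping factor)
    """
    rng = np.random.default_rng(seed)

    # The four treatment cells of the 2 x 2 design, coded 0/1.
    fert = np.array([0.0, 1.0, 0.0, 1.0])
    clip = np.array([0.0, 0.0, 1.0, 1.0])
    cells = np.column_stack([np.ones(4), fert, clip, fert * clip])

    # Every group receives every treatment cell, replicated `reps` times.
    X = np.tile(np.repeat(cells, reps, axis=0), (n_groups, 1))
    group = np.repeat(np.arange(n_groups), 4 * reps)

    # One random intercept per group, drawn from Normal(0, sigma_group^2).
    b = sigma_group * rng.standard_normal(n_groups)

    # Linear predictor = fixed part + random part; the log link gives the mean.
    eta = X @ np.asarray(beta, dtype=float) + b[group]
    y = rng.poisson(np.exp(eta))
    return X, group, b, y


# Assumed parameter values, for illustration only.
X, group, b, y = simulate_poisson_glmm(beta=[1.0, 0.5, -0.7, 0.3], sigma_group=0.6)
print(X.shape, y[:10])
```

In this sketch, estimation would mean recovering `beta` and `sigma_group` from `X`, `group` and `y` alone; the individual intercepts `b` are not parameters in that sense, only draws whose spread `sigma_group` describes.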

Conclusion

Ecologists and evolutionary biologists have much to gain from GLMMs. GLMMs allow analysis of blocked designs in traditional ecological experiments with count or proportional responses. By incorporating random effects, GLMMs also allow biologists to generalize their conclusions to new times, places and species. GLMMs are invaluable when the random variation is the focus of attention, particularly in studies of ecological heterogeneity or the heritability of discrete characters.

In this review, we

Acknowledgements

We would like to thank Denis Valle, Paulo Brando, Jim Hobert, Mike McCoy, Craig Osenberg, Will White, Ramon Littell and members of the R-sig-mixed-models mailing list (Douglas Bates, Ken Beath, Sonja Greven, Vito Muggeo, Fabian Scheipl and others) for useful comments. Josh Banta and Massimo Pigliucci provided data and guidance on the Arabidopsis example. S.W.G. was funded by a New Zealand Fulbright–Ministry of Research, Science and Technology Graduate Student Award.

References (67)

  • D.A. Elston. Analysis of aggregation, a worked example: numbers of ticks on red grouse chicks. Parasitology (2001)
  • A.R. Gilmour. The analysis of binomial data by a generalized linear mixed model. Biometrika (1985)
  • L.E.B. Kruuk. Antler size in red deer: heritability and selection but no evolution. Evolution (2002)
  • A.J. Wilson. Environmental coupling of selection and heritability limits evolution. PLoS Biol. (2006)
  • P. Chesson. Mechanisms of maintenance of species diversity. Annu. Rev. Ecol. Syst. (2000)
  • B.A. Melbourne et al. Extinction risk depends strongly on factors contributing to stochasticity. Nature (2008)
  • G.A. Fox et al. Demographic stochasticity and the variance reduction effect. Ecology (2002)
  • C.A. Pfister et al. Individual variation and environmental stochasticity: implications for matrix model predictions. Ecology (2003)
  • G.P. Quinn et al. Experimental Design and Data Analysis for Biologists (2002)
  • M.J. Crawley. Statistical Computing: An Introduction to Data Analysis Using S-PLUS (2002)
  • M.J. Whittingham. Why do we still use stepwise modelling in ecology and behaviour? J. Anim. Ecol. (2006)
  • A.M. Ellison. Bayesian inference in ecology. Ecol. Lett. (2004)
  • W.J. Browne et al. A comparison of Bayesian and likelihood-based methods for fitting multilevel models. Bayesian Anal. (2006)
  • S.R. Lele. Sampling variability and estimates of density dependence: a composite-likelihood approach. Ecology (2006)
  • R. Schall. Estimation in generalized linear models with random effects. Biometrika (1991)
  • R. Wolfinger et al. Generalized linear mixed models: a pseudo-likelihood approach. J. Statist. Comput. Simulation (1993)
  • N.E. Breslow et al. Approximate inference in generalized linear mixed models. J. Am. Stat. Assoc. (1993)
  • S.W. Raudenbush. Maximum likelihood for generalized linear models with nested random effects via high-order, multivariate Laplace approximation. J. Comput. Graph. Statist. (2000)
  • J.C. Pinheiro et al. Efficient Laplacian and adaptive Gaussian quadrature algorithms for multilevel generalized linear mixed models. J. Comput. Graph. Statist. (2006)
  • W.R. Gilks. Introducing Markov chain Monte Carlo
  • J.C. Pinheiro et al. Mixed-Effects Models in S and S-PLUS (2000)
  • R.C. Littell. SAS for Mixed Models (2006)
  • N.E. Breslow. Whither PQL?
