Generalized Additive Models
for Location, Scale and Shape
Statistical modelling at its best
ABOUT GAMLSS
01. What is GAMLSS
GAMLSS are distributional regression models. In classical regression models, the explanatory variables, X, affect the expected value of the response y, X -> E(y). In a distributional regression, the X's affect all parts of the distribution of y, X -> D(y). GAMLSS models are appropriate when the focus of the study is not only on shifts in the mean (or location) of the distribution of y but possibly also on other features such as the variance (volatility), the skewness, the kurtosis (heavy tails) or the quantiles. All aspects of the distribution of the response can be modelled as functions of the explanatory variables.
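To make the difference between X -> E(y) and X -> D(y) concrete, here is a minimal sketch in plain Python. It is an illustration only: it does not use the gamlss packages or the algorithms they implement. It assumes a normal response whose mean is linear in x and whose log standard deviation is also linear in x, and it fits both by alternating a weighted least-squares step with a Fisher scoring step. The function names (`wls`, `fit_normal_location_scale`, `centile`), the simulated data and the true parameter values are all hypothetical.

```python
import math
import random
from statistics import NormalDist


def wls(x, z, w):
    """Weighted least-squares fit of z = a + b * x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    zbar = sum(wi * zi for wi, zi in zip(w, z)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxz = sum(wi * (xi - xbar) * (zi - zbar) for wi, xi, zi in zip(w, x, z))
    slope = sxz / sxx
    return zbar - slope * xbar, slope


def fit_normal_location_scale(x, y, n_iter=50):
    """Fit y ~ Normal(mu, sigma) with mu = b0 + b1*x and log(sigma) = g0 + g1*x."""
    n = len(x)
    b0, b1 = wls(x, y, [1.0] * n)  # start from ordinary least squares
    mean_sq = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / n
    g0, g1 = 0.5 * math.log(mean_sq), 0.0  # start from a constant sigma
    for _ in range(n_iter):
        eta = [g0 + g1 * xi for xi in x]
        sigma = [math.exp(e) for e in eta]
        # Location step: weighted least squares with weights 1 / sigma^2.
        b0, b1 = wls(x, y, [1.0 / s ** 2 for s in sigma])
        # Scale step: Fisher scoring on eta = log(sigma). The working response
        # is eta + (squared standardised residual - 1) / 2, with constant weights.
        z = [e + 0.5 * (((yi - b0 - b1 * xi) / s) ** 2 - 1.0)
             for e, s, xi, yi in zip(eta, sigma, x, y)]
        g0, g1 = wls(x, z, [1.0] * n)
    return (b0, b1), (g0, g1)


def centile(beta, gamma, x0, p):
    """Fitted p-th quantile of y at x0 under the normal model above."""
    mu = beta[0] + beta[1] * x0
    sigma = math.exp(gamma[0] + gamma[1] * x0)
    return NormalDist(mu, sigma).inv_cdf(p)


# Simulated data (assumed values): both the mean and the spread grow with x.
rng = random.Random(1)
x = [rng.uniform(0.0, 4.0) for _ in range(2000)]
y = [rng.gauss(1.0 + 2.0 * xi, math.exp(-0.5 + 0.4 * xi)) for xi in x]

beta, gamma = fit_normal_location_scale(x, y)
print("mean coefficients:      ", beta)
print("log(sigma) coefficients:", gamma)
print("5% and 95% centiles at x = 4:",
      centile(beta, gamma, 4.0, 0.05), centile(beta, gamma, 4.0, 0.95))
```

A classical regression would report only the first pair of coefficients. The second pair describes how the spread of y changes with x, which is what lets the fitted centiles fan out as x increases.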
02. How to use GAMLSS
GAMLSS are implemented in R. The basic packages gamlss, gamlss.data and gamlss.dist contain (penalised) likelihood fitting algorithms, data and distributions, respectively. The package gamboostLSS fits GAMLSS using boosting. The package bamlss fits GAMLSS using Bayesian MCMC. Other packages connect GAMLSS with machine learning methodologies such as neural networks, regression trees, LASSO and principal component regression (PCR), and with the Grammar of Graphics (ggplot2). A faster and extended version, gamlss2, is under preparation and can be downloaded here. It will be released on CRAN soon.
03. What distributions can be used within GAMLSS
GAMLSS provide over 100 continuous, discrete and mixed distributions for modelling the response variable. Truncated, censored (interval response), log and logit transformed, and finite mixture versions of these distributions can also be used. The book Distributions for Modelling Location, Scale, and Shape: Using GAMLSS in R is a comprehensive review of all the distributions used within GAMLSS.
04. What additive terms can be used within GAMLSS
Machine learning techniques can be used within GAMLSS to model all distribution parameters. The choice of technique depends on how much explanatory power is required from the model. For additive smoothing terms, P-splines, cubic smoothing splines, regression splines and loess are available. For dimensionality reduction, ridge regression, LASSO and principal component regression (PCR) are available. For modelling interactions, regression trees, random forests and neural networks can be used, and for simpler interactions, varying coefficient models. Random effects can also be incorporated within GAMLSS.
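One common way of writing how such terms enter the model, for a response distribution with up to four parameters, is sketched below. The notation is an assumption taken from the usual GAMLSS presentation rather than from the text above: the parameters are written as \theta_1, ..., \theta_4 = \mu, \sigma, \nu, \tau, each g_k is a link function, X_k \beta_k is the linear part, and each s_{jk} stands for one additive term (a smoother, a random effect, a tree and so on).

```latex
y_i \sim \mathcal{D}(\mu_i, \sigma_i, \nu_i, \tau_i), \qquad
g_k(\theta_k) = \eta_k = X_k \beta_k + \sum_{j=1}^{J_k} s_{jk}(x_{jk}),
\qquad k = 1, \dots, 4 .
```

Each distribution parameter has its own predictor \eta_k, so a different set of explanatory variables and a different type of additive term can be chosen for the location, the scale and each shape parameter.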
05. Who is using GAMLSS
The main beneficiaries of GAMLSS are practitioners who have to deal with highly skewed and kurtotic data. Such data sets are very common in medicine, e.g. growth curve fitting (centile estimation), in environmental studies and in the financial community. The World Health Organisation (WHO), the Global Lung Function Initiative, the International Monetary Fund (IMF), the Bank of England, the Bank of America (BoA), the European Parliament and the Great Barrier Reef Marine Park Authority have all used GAMLSS to analyse data.
06. How to learn more about GAMLSS
The following books on GAMLSS are available:
i) Flexible Regression and Smoothing: Using GAMLSS in R (April 2017)
ii) Distributions for Modelling Location, Scale, and Shape: Using GAMLSS in R (October 2019)
iii) Generalized Additive Models for Location, Scale and Shape: A Distributional Regression Approach, with Applications (February 2024)
Short courses on GAMLSS are held regularly. The slides from the last course in Porto, Portugal, can be found here. No future course is scheduled at the moment, but please check here for updates.