First weeks of GSoC

First post of a series about my contributions to Bambi in this Google Summer of Code season. This post highlights new features related to default priors and priors for group-specific effects.

Published

June 28, 2021

I am really happy to participate in this Google Summer of Code season with NumFOCUS to contribute to the Bambi library. The coding period ranges from June 7 to August 16, with an intermediate evaluation taking place between July 12 and July 16.

Overview

My project is called Extend available models and default priors in Bambi. The main goal of this project is to add new families of generalized linear models, such as beta regression, robust linear regression (i.e. linear model with error following a T-Student distribution)1 as well as multinomial regression. However, this raises a second problem, which is about default priors distributions.

Default priors in Bambi are limited to the families implemented in the GLM module instatsmodels, which does not include the families mentioned above. For this reason, it is first necessary to incorporate alternative automatic priors so new families work without requiring the user to manually specify priors.

Therefore, these first weeks of the coding period were centered around understanding how default priors work on other high-level modeling packages such as brms and rstanarm, how to translate their ideas into PyMC3 code, and finally how to implement everything within Bambi.

Alternative default priors

Currently, Bambi uses maximum likelihood estimates in the construction of its default priors. There are two limitations associated with this approach. First, current default priors don’t exist whenever uniquely identifiable maximum likelihood estimates don’t exist (e.g. \(p > n\) or complete separation scenarios). Secondly, these estimates are obtained via the GLM module in statsmodels, which means default priors can only be obtained for families made available in statsmodels.

Based on the available documentation and simulations I’ve done, I decided to implement alternative default priors that are much like the default priors in rstanarm. These priors aim to be weakly-informative in most scenarios and do not depend on maximum likelihood estimates. Their documentation is excellent and it was a great guide for my implementation.

This is the PR where I implement alternative default priors inspired on rstanarm default priors. In addition, I also implement LKJ prior for the correlation matrices of group-specific effects.

How to invoke alternative default priors

The Model() class has gained one new argument, automatic_priors, that can be equal to "default" to use Bambi’s default method, or "rstanarm" to use the alternative implementation2.

model = bmb.Model("y ~ x + z", data, automatic_priors="rstanarm")

How to use LKJ priors for correlation matrices of group-specific effects

Group-specific effects can now have non-independent priors. Instead of using independent normal distributions, we can use a multivariate normal distribution whose correlation matrix has an LKJ prior distribution. This distribution depends on a parameter \(\eta > 0\). If \(\eta=1\), the LJK prior is jointly uniform over all correlation matrices of the same dimension. If \(\eta >1\) increases, the mode of the distribution is the identity matrix. The larger the value of \(\eta\) the more sharply peaked the density is at the identity matrix.

Model has an argument priors_cor where we can pass a dictionary to indicate which groups are going to have a LKJ prior. The keys of the dictionary are the names of the groups, and the values are the values for \(\eta\).

In the following model, we have a varying intercept and varying slope for the groups given by group. These varying effects have a multivariate normal prior whose covariance matrix depends on a correlation matrix that has a LKJ hyperprior with \(\eta=1\).

model = bmb.Model("y ~ x + (x|group)", data, priors_cor={"group": 1})

Footnotes

  1. These two distributions are not members of the exponential family so using them as the distribution of the random component does not result in a generalized linear model in a strict sense. But I would usually refer to them as GLMs since the linear predictor, link function, and random component properties are still present.↩︎

  2. Both the argument name and the options may change↩︎