Some Notes on GNU Licenses in R Packages

Win-Vector Blog 2019-08-20

I was recently asked if Win-Vector LLC would move the R wrapr package from a GPL-3 license to an LGPL license. In the end I decided to move wrapr distribution to a “GPL-2 | GPL-3” license. This means the package is now available under both GPL-2 and GPL-3 licensing, allowing the user to pick which of these two licenses they wish to accept the software under. I decided to stick to “GPL-*” style licensing as I endorse the values underlying these licenses, and my (not-legal advice, I am not a lawyer!) opinion this is the licensing pattern closest to the license R itself is distributed under (and hence the closest to the values of the core R community).

Please read on for some background issues I found (not-legal advice, I am not a lawyer!)

In my opinion there seems to be no convenient way to simultaneously avoid both the GPL-2 and GPL-3 licenses when working with R as R itself is under a “GPL-2 | GPL-3” license (as are many of its popular packages). So likely anyone who feels they must avoid simultaneously both the GPL-2 and the GPL-3 licenses should probably avoid CRAN R altogether. Now the issue is complicated (and there are some distinctions about interpreters, APIs, and distribution): but it really appears if you are using R and some of its popular packages you are likely using software components distributed with GPL-2, GPL-3, “GPL-2 | GPL-3”, and other licenses.

One thing to keep clear: despite the similarity in names the various GNU-license families, LGPL-, GPL-, and AGPL-*, are very different licenses families. With that in mind let’s look at the declared licenses of a few popular packages as distributed via CRAN:

NewImage

Let’s take the dplyr package as an example. It declares an MIT license. However it also declares asserthat, rlang, and tidyselect as dependencies. This means that dplyr when normally installed also asks these three packages also be installed. It also implies that dplyr may not be able to function without these three additional packages also installed. The point is: each of these three additional packages is distributed by CRAN under a GPL-3 license. So one using dplyr in the standard way from a standard CRAN install is likely using GPL-3 software received under a GPL-3 licensed distribution (unless one has obtained different licenses from the package rights holders). dplyr may have a user-permissive license (such as the MIT license), but to use dplyr with R in the usual manner the user will typically have installed and be using additional GPL-3 licensed components (at least asserthat, rlang, and tidyselect; and likely more through indirect dependencies).

To be clear: I like the GPL-* licenses, I am merely describing some of the care that one trying to avoid them must take. This also has little to do with “the virality of the GPL license” or even license compatibility. I am just pointing out: unless one has prevented the dependent installs or made other distribution arrangements with the rights-holders of the additional packages that one has likely included, then GPL-2/GPL-3 distributed software is often installed and also used when working with common R packages.

I must emphasize that direct Depends/Imports declarations may not be the only way license obligations may be collected.

Finally: consider the AGPL-* series of licenses. Common user interfaces and servers such as the open-source versions of RStudio Desktop and Shiny Server are distributed under the AGPLv3 license. This means if you are using one of these user interfaces or servers you are using AGPLv3 license distributed software in addition to the various licenses found on R and various R packages. This is interesting, as some companies prohibit internal use of AGPL licensed components. For instance Google has such a rule. I do not believe Google has the same policy completely avoiding LGPL-* or GPL-* software (ref). So if your intent is to avoid GNU-licenses, it may be advisable to choose which of these licenses to avoid instead of attempting to avoid them all.