Chapter 3 Licenses in the R World

Disclaimer: This book is by no means a legal book and should not be used as such. It aims to help you decipher the complexity of open source licenses, but if you have legal concerns or questions about open source licenses, please consult a professional lawyer.

3.1 R license

license()
## 
## This software is distributed under the terms of the GNU General
## Public License, either Version 2, June 1991 or Version 3, June 2007.
## The terms of version 2 of the license are in a file called COPYING
## which you should have received with
## this software and which can be displayed by RShowDoc("COPYING").
## Version 3 of the license can be displayed by RShowDoc("GPL-3").
## 
## Copies of both versions 2 and 3 of the license can be found
## at https://www.R-project.org/Licenses/.
## 
## A small number of files (the API header files listed in
## R_DOC_DIR/COPYRIGHTS) are distributed under the
## LESSER GNU GENERAL PUBLIC LICENSE, version 2.1 or later.
## This can be displayed by RShowDoc("LGPL-2.1"),
## or obtained at the URI given.
## Version 3 of the license can be displayed by RShowDoc("LGPL-3").
## 
## 'Share and Enjoy.'

3.1.1 What does that license imply?

From the GNU GPL FAQ:

Does the GPL have different requirements for statically vs dynamically linked modules with a covered work? (#GPLStaticVsDynamic)

No. Linking a GPL covered work statically or dynamically with other modules is making a combined work based on the GPL covered work. Thus, the terms and conditions of the GNU General Public License cover the whole combination.

(In the GPL-3 license, a “covered work” means either the unmodified Program or a work based on the Program.)

As we can see in this part of the GPL FAQ, “Linking a GPL covered work statically or dynamically with other modules is making a combined work based on the GPL covered work”. That would mean that, as R is licensed under the GPL, any work linking dynamically or statically to R would also have to be GPL-licensed.

3.1.2 Is an R extension linking?

Most of the time, when we are creating R extensions, we’re building packages. Packages enhance R in the sense that they are extensions of R: they add functionality to an R runtime, not to R itself. This seems to match the definition of a plug-in:

In computing, a plug-in (or plugin, add-in, addin, add-on, or addon) is a software component that adds a specific feature to an existing computer program. When a program supports plug-ins, it enables customization.

The R Core team uses a quite similar term (“add-on”) to describe R packages: packages are named “add-on” packages in the R Installation and Administration manuals.

As said before, this book is not legal advice but aims at providing elements to understand how licensing works. Here are extracts from the GPL FAQ that provide information about the status of an R package, regarding whether or not the R license impacts the choice of a package license.

According to the Free Software Foundation (FSF), any derivative work of a GPL-licensed work must be licensed under a GPL-compatible license.

In the GPL FAQ, under “When is a program and its plug-ins considered a single combined program?”, the FSF states that:

It depends on how the main program invokes its plug-ins. If the main program uses fork and exec to invoke plug-ins, and they establish intimate communication by sharing complex data structures, or shipping complex data structures back and forth, that can make them one single combined program. A main program that uses simple fork and exec to invoke plug-ins and does not establish intimate communication between them results in the plug-ins being a separate program.

If the main program dynamically links plug-ins, and they make function calls to each other and share data structures, we believe they form a single combined program, which must be treated as an extension of both the main program and the plug-ins. If the main program dynamically links plug-ins, but the communication between them is limited to invoking the ‘main’ function of the plug-in with some options and waiting for it to return, that is a borderline case.

In the GPL FAQ, under “If I write a plug-in to use with a GPL-covered program, what requirements does that impose on the licenses I can use for distributing my plug-in?”, the FSF states that:

If the main program and the plugins are a single combined program then this means you must license the plug-in under the GPL or a GPL-compatible free software license and distribute it with source code in a GPL-compliant way. A main program that is separate from its plug-ins makes no requirements for the plug-ins.

We leave it to the reader to judge whether or not an R package can be considered a plug-in in this sense.

3.2 Package Licenses

Most of the time, when we talk about R extensions, we are talking about R packages, which are to be “officially” distributed through CRAN and/or Bioconductor.

3.2.1 “Officially Authorized” Licenses

From R Licenses.

  • The “GNU Affero General Public License” version 3 (AGPL)
  • The “Artistic License” version 2.0
  • The “BSD 2-clause License”
  • The “BSD 3-clause License”
  • The “GNU General Public License” version 2 (GPLv2)
  • The “GNU General Public License” version 3 (GPLv3)
  • The “GNU Library General Public License” version 2 (LGPLv2)
  • The “GNU Lesser General Public License” version 2.1 (LGPLv2.1)
  • The “GNU Lesser General Public License” version 3 (LGPLv3)
  • The “MIT License”
  • The “Creative Commons Attribution-ShareAlike International License” version 4.0 (CC BY-SA 4)

Note that all these licenses are found in the list of GPL-compatible licenses, with the exception of CC BY-SA, which is:

one-way compatible with the GNU GPL version 3: this means you may license your modified versions of CC BY-SA 4.0 materials under GNU GPL version 3, but you may not relicense GPL 3 licensed works under CC BY-SA 4.0.

You’ll also find on the Creative Commons compatibility list that CC BY-SA version 4.0 is compatible with the GPL version 3 only, and that this compatibility is one-way.
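The licenses recognised by R can also be explored from R itself. The following is a minimal sketch, assuming a standard installation in which a file named license.db sits in the licenses folder of R's share directory. The fields it prints depend on the installed R version, so no output is shown here.

```r
# Locate the license database shipped with R
# (path assumed from a standard installation).
db_path <- file.path(R.home("share"), "licenses", "license.db")

if (file.exists(db_path)) {
  # license.db is a DCF file: one record per license, one field per column.
  license_db <- as.data.frame(read.dcf(db_path), stringsAsFactors = FALSE)
  # Show the available fields, then the license names if that field exists.
  print(colnames(license_db))
  print(unique(license_db$Name))
} else {
  # Fall back gracefully when the file is not where we assumed it to be.
  license_db <- NULL
  message("license.db not found at: ", db_path)
}
```

If the file is found, the result is a regular data frame that can be filtered like any other.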

3.2.2 From Bioconductor

“License:” field: should preferably refer to a standard license (…) using one of R’s standard specifications. Be specific about any version that applies (e.g., GPL-2). Core Bioconductor packages are typically licensed under Artistic-2.0. To specify a non-standard license, include a file named LICENSE in your package (containing the full terms of your license) and use the string “file LICENSE” (without the double quotes) in this “License:” field. The package should contain only code that can be redistributed according to the package license. Be aware of the licensing agreements for packages you are depending on in your package. Not all packages are open source even if they are publicly available.

Bioconductor Package Guidelines
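To make the two forms of the “License:” field concrete, here is a small sketch that writes two hypothetical DESCRIPTION files (the package names examplepkg and otherpkg are made up for illustration) and reads the field back with read.dcf(). The helper names get_license() and uses_license_file() are ours, not part of any package.

```r
# Write two hypothetical DESCRIPTION files to a temporary directory.
tmp <- tempfile("desc_")
dir.create(tmp)
std_path <- file.path(tmp, "DESCRIPTION_standard")
custom_path <- file.path(tmp, "DESCRIPTION_custom")

writeLines(
  c("Package: examplepkg", "Version: 0.1.0", "License: Artistic-2.0"),
  std_path
)
writeLines(
  c("Package: otherpkg", "Version: 0.1.0", "License: file LICENSE"),
  custom_path
)

# read.dcf() returns a character matrix; keep only the License field.
get_license <- function(path) {
  unname(read.dcf(path, fields = "License")[1, "License"])
}

std_license <- get_license(std_path)
custom_license <- get_license(custom_path)

# TRUE when the field only points to a LICENSE (or LICENCE) file.
uses_license_file <- function(x) grepl("^file LICEN[CS]E$", x)

print(c(std_license, custom_license))
print(uses_license_file(c(std_license, custom_license)))

# Clean up the temporary files.
unlink(tmp, recursive = TRUE)
```

The first package declares a standard license, while the second one only tells you where to look for the actual terms.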


3.3 Classifying the 11 “official” licenses

Now that we have seen these 11 “official” licenses, how can you choose one? Let’s gather here some information from the web about what you can and can’t do with them.

library(rvest)
library(dplyr)

# Download the Wikipedia comparison page and parse the first table it contains
page <- read_html("https://en.wikipedia.org/wiki/Comparison_of_free_and_open-source_software_licenses")
tbl <- page %>% 
  html_node("table") %>% 
  html_table()

# Keep the license families of interest and strip footnote markers such as "[12]"
tbl <- tbl %>%
  filter(License %in% c(
    "GNU Affero General Public License", "Artistic License", "BSD License", 
    "GNU General Public License", "GNU Lesser General Public License", 
    "MIT license / X11 license", "CC-BY-SA"
  )) %>% 
  mutate_all(~gsub("\\[[0-9]*\\]", "", .x))

tbl %>%
  select(License, Linking, Distribution, Modification, `Patent grant`, `Private use`, Sublicensing, `TM grant`)

The columns of this table read as follows:

  • Linking - linking of the licensed code with code licensed under a different license (e.g. when the code is provided as a library)
  • Distribution - distribution of the code to third parties
  • Modification - modification of the code by a licensee
  • Patent grant - protection of licensees from patent claims made by code contributors regarding their contribution, and protection of contributors from patent claims made by licensees
  • Private use - whether modification to the code must be shared with the community or may be used privately (e.g. internal use by a corporation)
  • Sub-licensing - whether modified code may be licensed under a different license (for example a copyright) or must retain the same license under which it was provided
  • Trademark grant - use of trademarks associated with the licensed code or its contributors by a licensee
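The gsub() call used when scraping the table removes the bracketed footnote markers that Wikipedia attaches to its cells. Here is a tiny sketch of what that pattern does, on made-up cell values (they are not taken from the actual table):

```r
# Hypothetical cell values carrying Wikipedia-style footnote markers.
cells <- c("Permissive[5]", "Copylefted[12]", "Yes")

# "\\[[0-9]*\\]" matches an opening bracket, any digits, and a closing bracket.
cleaned <- gsub("\\[[0-9]*\\]", "", cells)
print(cleaned)
```

Values without a marker, like the last one, are left untouched.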

3.3.1 By degrees of “permissiveness”

From The Free-Libre / Open Source Software (FLOSS) License Slide

3.3.2 By permission

Adapted from TL;DR Legal

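| License | Commercial Use | Modify | Distribute | Sub-license | Private Use | Hold Liable | Place Warranty | Use Trademark | Use Patent Claims |
|---|---|---|---|---|---|---|---|---|---|
| AGPLv3 | Can | Can | Can | Cannot | NA | Cannot | Can | - | - |
| Artistic 2 | Can | Can | Can | Can | Can | Cannot | - | Cannot | - |
| BSD 2-clause | Can | Can | Can | - | - | Cannot | Can | - | - |
| BSD 3-clause | Can | Can | Can | - | - | Cannot | - | Cannot | - |
| GPLv2 | Can | Can | Can | Cannot | - | Cannot | Can | - | - |
| GPLv3 | Can | Can | Can | Cannot | - | Cannot | Can | - | Can |
| LGPLv2 | - | - | - | - | - | - | - | - | - |
| LGPLv2.1 | Can | Can | Can | Cannot | - | Cannot | - | - | - |
| LGPLv3 | Can | Can | Can | Cannot | - | Cannot | Can | - | Can |
| MIT | Can | Can | Can | Can | Can | Cannot | - | - | - |
| CC BY-SA 4 | Can | Can | Can | Cannot | - | - | - | - | - |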
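A table like the one above becomes easier to query once it is stored as a data frame. The sketch below transcribes two of its columns by hand (the column names sublicense and use_patent_claims are ours) and, as an assumption on our side, treats “-” as “not stated”.

```r
# Two columns transcribed from the table above; "-" is read as "not stated".
permissions <- data.frame(
  license = c("AGPLv3", "Artistic 2", "BSD 2-clause", "BSD 3-clause", "GPLv2",
              "GPLv3", "LGPLv2", "LGPLv2.1", "LGPLv3", "MIT", "CC BY-SA 4"),
  sublicense = c("Cannot", "Can", "-", "-", "Cannot",
                 "Cannot", "-", "Cannot", "Cannot", "Can", "Cannot"),
  use_patent_claims = c("-", "-", "-", "-", "-",
                        "Can", "-", "-", "Can", "-", "-"),
  stringsAsFactors = FALSE
)

# Licenses that explicitly allow sublicensing.
can_sublicense <- permissions$license[permissions$sublicense == "Can"]

# Licenses that explicitly forbid it.
cannot_sublicense <- permissions$license[permissions$sublicense == "Cannot"]

print(can_sublicense)
print(cannot_sublicense)
```

The same approach works for any other column of the table, as long as you keep in mind that this is a summary and not the license text itself.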

3.3.3 By obligation

Adapted from TL;DR Legal

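| License | Include Copyright | Include License | Include Original | State Changes | Disclose source | Include install instruction | Rename | Include Notice | Give Credit |
|---|---|---|---|---|---|---|---|---|---|
| AGPLv3 | yes | yes | NA | yes | yes | yes | - | - | - |
| Artistic 2 | - | - | yes | yes | - | yes | yes | - | - |
| BSD 2-clause | yes | yes | - | - | - | - | - | - | - |
| BSD 3-clause | yes | yes | - | - | - | - | - | - | - |
| GPLv2 | yes | yes | yes | yes | yes | - | - | - | - |
| GPLv3 | yes | yes | yes | yes | yes | yes | - | - | - |
| LGPLv2 | - | - | - | - | - | - | - | - | - |
| LGPLv2.1 | yes | yes | yes | yes | yes | - | - | yes | - |
| LGPLv3 | yes | yes | yes | yes | yes | yes | - | - | - |
| MIT | yes | yes | - | - | - | - | - | - | - |
| CC BY-SA 4 | - | - | - | - | - | - | - | - | yes |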

3.4 “Non-standard”

We can also find on CRAN some packages with licenses which are “non-standard”, in the sense that they are not among the 11 officially recognized licenses.

3.4.1 Region-based licenses

Regional licenses are licenses tied to a specific part of the world. For example, the CeCILL license is issued by French academic organisations, and is designed to create an extension of the GPL in the European context.

cecill <- full_db %>%
  filter(str_detect(license, "CeCILL"))
cecill %>%
  select(package, contains("license"))

24 packages have chosen a CeCILL-based license, which is a French license for open source software.

For the sake of exploration, let’s have a look at the domain part of the maintainers’ email addresses:

cecill$maintainer %>% 
  str_match_all("@([^>]*)") %>% 
  map_chr(2) %>%
  table() %>% 
  as_tibble() %>%
  arrange(desc(n))

As you can see, these are mostly French entities (we can’t tell for the gmail and yahoo addresses): “umontpellier.fr” is the University of Montpellier, IRD is a French research institute, “u-bordeaux.fr” is the University of Bordeaux, etc.
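If you want to try the domain extraction without the full package database, here is a base R sketch of the same idea. The maintainer strings are made up; they only imitate the “Name <email>” format of the Maintainer field.

```r
# Hypothetical maintainer strings in the "Name <email>" format.
maintainers <- c(
  "Jane Doe <jane.doe@example.fr>",
  "John Smith <john.smith@example.org>",
  "Ann Other <ann.other@example.fr>"
)

# Keep what sits between "@" and the closing ">".
domains <- sub(".*@([^>]*)>.*", "\\1", maintainers)

# Count maintainers per domain, most frequent first.
domain_counts <- sort(table(domains), decreasing = TRUE)
print(domain_counts)
```

A top-level domain such as “.fr” is of course only a hint about where the maintainer works, which is why generic mail providers cannot be classified this way.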

What is CeCILL? CeCILL is a French, GPL-compatible license that comes in three flavors: CeCILL (versions 1, 2, and 2.1), CeCILL-B and CeCILL-C. The idea behind this new license was to create “a license which is compatible with French laws and with the principles from Anglo-Saxon open source licenses” (our translation from French).

Note that CeCILL-B and CeCILL-C are not GPL-compatible.

3.4.2 Restrictive licenses

Let’s now have a look at the packages whose license the author(s) chose to list as “restricts use”:

ru <- db %>% 
  filter(license_restricts_use == "yes") %>%
  select(package, license, src)
ru

What is a “restrictive use” of a package?

According to GNU,

A program is free software if the program’s users have the four essential freedoms:

  • The freedom to run the program as you wish, for any purpose (freedom 0).
  • The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1). Access to the source code is a precondition for this.
  • The freedom to redistribute copies so you can help others (freedom 2).
  • The freedom to distribute copies of your modified versions to others (freedom 3). By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this.

So how can a package be “restrictive” in the sense of the GNU definition of Free Software? It can for example:

  • Prohibit commercial use, which is restrictive in the sense of freedom 0. This is, for example, what you can find in:

    • {gpclib}

readLines(
  system.file("LICENSE", package = "gpclib")
) %>% glue::as_glue()
## Free for non-commercial use; commercial use prohibited (see the files
## `gpc.c' and `gpc.h' for details)

    • {rngwell19937}

readLines(
  system.file("LICENSE", package = "rngwell19937")
) %>% glue::as_glue()
## This code can be used freely for personal, academic, or non-commercial purposes.
## 
## For commercial purposes, see the conditions formulated in file rngwell19937/src/WELL19937a.c.

What is interesting in both these packages is that the “non-commercial” use is defined by internal files (gpc.c / gpc.h and rngwell19937/src/WELL19937a.c), that is, in code which is statically linked by the package (when you install these packages, the pieces of C code are bundled with them).

Head of gpc.c :

download.file("https://cran.r-project.org/src/contrib/gpclib_1.5-5.tar.gz", "gpclib.tar.gz") 
untar("gpclib.tar.gz")
readLines(
  "gpclib/src/gpc.c"
)[1:30] %>% glue::as_glue()
## /*
## ===========================================================================
## 
## Project:   Generic Polygon Clipper
## 
##            A new algorithm for calculating the difference, intersection,
##            exclusive-or or union of arbitrary polygon sets.
## 
## File:      gpc.c
## Author:    Alan Murta (email: gpc@cs.man.ac.uk)
## Version:   2.32
## Date:      17th December 2004
## 
## Copyright: (C) 1997-2004, Advanced Interfaces Group,
##            University of Manchester.
## 
##            This software is free for non-commercial use. It may be copied,
##            modified, and redistributed provided that this copyright notice
##            is preserved on all copies. The intellectual property rights of
##            the algorithms used reside with the University of Manchester
##            Advanced Interfaces Group.
## 
##            You may not use this software, in whole or in part, in support
##            of any commercial product without the express consent of the
##            author.
## 
##            There is no warranty or other guarantee of fitness of this
##            software for any purpose. It is provided solely "as is".
## 
## ===========================================================================
unlink("gpclib.tar.gz")
unlink("gpclib", recursive = TRUE)

Head of rngwell19937/src/WELL19937a.c :

download.file("https://cran.r-project.org/src/contrib/rngwell19937_0.6-0.tar.gz", "rngwell19937.tar.gz") 
untar("rngwell19937.tar.gz")
readLines(
  "rngwell19937/src/WELL19937a.c"
)[1:8] %>% glue::as_glue()
## /* ***************************************************************************** */
## /* Copyright:      Francois Panneton and Pierre L'Ecuyer, University of Montreal */
## /*                 Makoto Matsumoto, Hiroshima University                        */
## /* Notice:         This code can be used freely for personal, academic,          */
## /*                 or non-commercial purposes. For commercial purposes,          */
## /*                 please contact P. L'Ecuyer at: lecuyer@iro.UMontreal.ca       */
## /* ***************************************************************************** */
unlink("rngwell19937.tar.gz")
unlink("rngwell19937", recursive = TRUE)

As you can see, both these packages use internal C files which carry an explicit copyright notice preventing any commercial use. Hence, as these packages statically link code that is under this copyright, there is a restriction notice in the LICENSE file.
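Reading file headers one by one does not scale to many packages. As a rough sketch of how such notices could be spotted automatically, the code below writes a hypothetical C file whose header imitates the ones quoted above, then searches its first lines for the word “commercial”. This is a simple keyword search, not a reliable license detector.

```r
# Hypothetical source file whose header imitates the notices quoted above.
src_file <- tempfile(fileext = ".c")
writeLines(c(
  "/*",
  " * This software is free for non-commercial use.",
  " * You may not use this software in support of any commercial product.",
  " */",
  "int main(void) { return 0; }"
), src_file)

# Read only the first lines, where license notices usually sit.
header <- readLines(src_file, n = 30)

# Keep the lines mentioning commercial use, ignoring case.
flagged <- grep("commercial", header, ignore.case = TRUE, value = TRUE)
print(flagged)

# Clean up the temporary file.
unlink(src_file)
```

Any file flagged this way still needs to be read by a human, as the wording of such notices varies a lot from one project to another.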

Same goes for {tripack}:

readLines(
  system.file("LICENSE", package = "tripack")
) %>% glue::as_glue()
## 
## 
## 1.  Fortran code: 
## 
## Copyrighted and Licensed by ACM, 
## see http://www.acm.org/publications/policies/software-copyright-notice
## 
## 
## 2. R interface and extensions related to Voronoi mosaics
## (src/voronoi.f, src/left.f): 
## 
## The R interface code has been developed as work based on the 
## ACM licensed code, hence it is also ACM licensed, copyright 
## is by A. Gebhardt <albrecht.gebhardt@aau.at>. 
## 
## In order to fulfill the ACM copyright and license noted above, 
## it is stated here that this work contains modified ACM material, 
## and to fulfill this, the modified work including the R interface 
## is available free to secondary users, and no charge is associated 
## with such copies.
## 
## 3. Helper functions taken from SLATEC (src/dsos.f src/func.f):
## 
## Public domain (according to http://en.wikipedia.org/wiki/Netlib)

This package contains a piece of Fortran code licensed under the ACM license.

3.4.2.1 ACM

The ACM (Association for Computing Machinery) license does not allow commercial reuse of code released under it:

Noncommercial Use

The ACM grants to you (hereafter, User) a royalty-free, nonexclusive right to execute, copy, modify and distribute both the binary and source code solely for academic, research and other similar noncommercial uses.

Commercial Use

Any User wishing to make a commercial use of the Software must contact ACM at to arrange an appropriate license. Commercial use includes (1) integrating or incorporating all or part of the source code into a product for sale or license by, or on behalf of, User to third parties, or (2) distribution of the binary or source code to third parties for use with a commercial product sold or licensed by, or on behalf of, User.

acm <- db %>%
  filter(str_detect(clean_license, "acm")) %>%
  select(package, license)
acm

6 licenses are explicitly listed as ACM, but as some packages in the whole dataset just have “file LICENSE”, there might be more.
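To see why the count above is only a lower bound, here is a small sketch on a made-up vector of License fields (these values are illustrative and are not taken from the package database). It separates the fields that name ACM explicitly from those that only point to a LICENSE file.

```r
# Hypothetical License fields, written the way they appear in DESCRIPTION files.
licenses <- c(
  "ACM | file LICENSE",
  "file LICENSE",
  "GPL-2",
  "MIT + file LICENSE",
  "file LICENSE"
)

# Fields that mention ACM explicitly.
explicit_acm <- grepl("ACM", licenses, fixed = TRUE)

# Fields that only point to a LICENSE file: the actual terms are not visible here.
file_only <- grepl("^file LICEN[CS]E$", licenses)

n_explicit_acm <- sum(explicit_acm)
n_file_only <- sum(file_only)
print(c(explicit_acm = n_explicit_acm, file_only = n_file_only))
```

The packages counted in file_only are the ones whose LICENSE file would have to be opened to know whether they are ACM-licensed too.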

3.4.2.2 CC BY-NC-SA 4.0

Just as with ACM, users can’t reuse code licensed under CC BY-NC-SA 4.0 in a commercial context.

ccbysa <- db %>%
  filter(str_detect(clean_license, "ccbysa")) %>%
  select(package, license)
ccbysa

22 licenses are explicitly listed as CC BY-NC-SA 4.0, but (as with ACM) as some packages in the whole dataset just have “file LICENSE”, there might be more.

3.4.2.3 GPL-2-QA

There is one instance of the GPL-2-QA license, which seems to be only used in the {regtest} package:

readLines(
  system.file("LICENSE", package = "regtest")
) %>% glue::as_glue()
## GPL-2-QA
## 
## This software is distributed under the GPL-2-QA license.
## 
## Motivation for the QA add-on:
## This license intends to promote quality assurance with the R project.
## 
## Why not the standard GPL?
## The standard GPL does not promote quality assurance. The standard GPL 
## - does impose some restrictions on software distributors: the obligation to include the license and the source code
## - but does not impose any restrictions on software users, however, support of beneficiaries is needed for quality assurance
## 
## This license requires the following differences from GPL-2 for users and distributors:
## - users are granted a free right to use the software if and only if they reasonably support R quality assurance once this is defined by the R development team. 
## - distributors must distribute this licence add-on together with GPL-2 and source code.
## 
## What this means:
## - distibutors are as free to modify and distribute as before, they just have to also distribute the QA license add-on
## - for users there is no difference today, however, in the future they may have to 'pay' for usage by participating in automated distributed regression testing. 
## 
## What is distributed regression testing?
## There is no way software developers can guarantee the software works correctly in all possible circumstances. The number of 'possible circumstances' easily explodes in a function that has several input parameters and can run in several environments (hardware, OS, OS version, R version etc.). Therefore the R development team might develop some kind of automated mechanism, which distributes small shares of the necessary combinations to each user of the software and the user contributes the test results to the R project.
## 
## What is 'reasonable support'?
## Support is reasonable if QA
## - does not ask for the users time beyond triggering if/when to donate regression testing
## - does not ask for more than 1% of the users CPU time (used for R)
## - does not ask for huge bandwidth (only transmission of test parameters and results, no transmission of code, no transmission without the users agreement)
## - does not impose security threads to the user, especially
## - - the user must not be forced to disclose his identity
## - - regression tests must be completely transparent (open source and already part of the package, not transferred at test time)
## - - the transfer of test parameters and test results must be completely transparent (no encryption, e.g. plain html)
## 
## Copyright 2007 Jens Oehlschlägel