Which license should you add to your software to make it FAIR?

Simple steps towards FAIR code and software

This is the first blog post in the series Simple steps towards FAIR code and software. One of the questions researchers most often ask RDM Support about publishing code or software is: Which license should I add to my software to make it FAIR? In this blog post, we ask Jonathan de Bruin, project lead FAIR Data and Software of the Open Science Programme, to answer this question for you.

Disclaimer: This blog is an expert interview. No rights can be derived from it.

So, tell me, what software license do we add to our research code and software?

‘(ha-ha) This is exactly what most researchers ask! Although it doesn’t have to be a complicated answer, it’s not possible to answer this question with a single license. The best license for your code or software depends on multiple things, like the wishes of you and your colleagues, the details of your project, and the recommendations of your funders or university. Our university, Utrecht University, promotes an open attitude where open and FAIR publishing of research code and software is a key aspect. Therefore, a so-called open license is, generally speaking, the way to go for most researchers. An open source license allows software or code to be used, modified, and shared freely. If you don’t add a license to your code or software, this is usually not possible because of copyright laws. The good thing is you don’t have to write your own open source license; there are many open licenses available. Licenses approved by the Open Source Initiative are especially reliable and useful, because legal experts and software developers have reviewed them extensively.’

Does this mean that the university doesn’t have a policy enforcing a specific software license?

‘Currently, Utrecht University has no university-wide policy imposing a specific software license on researchers and staff. Personally, I think this is good; as said before, the best license depends on the project you are working on and its needs. We know that our university aims for openness, and therefore, we can usually limit ourselves to open source licenses. Sometimes, a funder has recommendations or a policy regarding licenses for research code or software output, but usually, this is limited to publishing the work under an open source license.’

You say, “There are many open source licenses.” Can you explain the differences?

‘There are indeed many open source licenses. If you look at the list of Open Source Initiative-approved licenses, you will probably be overwhelmed. In general, most people group open source licenses into three categories: permissive, weakly copyleft, and strongly copyleft. Permissive licenses are licenses with minimal restrictions on how software is used, modified, and shared. They usually only require a copyright notice, sometimes seen as attribution. Well-known permissive licenses are MIT, Apache 2.0, and BSD. Copyleft licenses are a legal way to require copies or derivations of your code or software to have the same rights, for example in terms of openness. I sometimes explain this to researchers as: If I work openly, you (ed.: the (re)user of the work) have to work openly as well. For strongly copyleft work, this holds even for all additions and extensions of the work. Therefore, this type of license is sometimes called viral. Examples of copyleft licenses are the LGPL, GPL, and EUPL.’

Do the licenses also require proper citation?

‘None of the popular open source software licenses I’m aware of require the user of the code or software to cite the work in the way we know it in academia. They usually limit this to a mention of the copyright notice, for example, “Copyright Utrecht University, 2023”. There are far more effective methods to get your work cited than trying to enforce it with a license. For example, you can add instructions in your documentation or software on how you want to be cited. Nowadays, there are also special files you can add to your work to help the user cite your work properly, for example, CITATION.cff.’
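As an illustration of the citation file mentioned above, here is a minimal sketch in Python that writes a small CITATION.cff. It assumes version 1.2.0 of the Citation File Format and uses only a handful of its keys; the function name `build_citation_cff`, the project title, and the author are hypothetical placeholders, not taken from the interview.

```python
from pathlib import Path


def build_citation_cff(title, authors, version, date_released, license_id):
    """Return the text of a minimal CITATION.cff file.

    Assumes Citation File Format 1.2.0. `authors` is a list of
    (family_name, given_name) tuples. Values are placed between double
    quotes as-is, so this sketch does not handle values that themselves
    contain double quotes.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f'title: "{title}"',
        "authors:",
    ]
    for family, given in authors:
        lines.append(f'  - family-names: "{family}"')
        lines.append(f'    given-names: "{given}"')
    lines.append(f'version: "{version}"')
    lines.append(f'date-released: "{date_released}"')  # ISO date, YYYY-MM-DD
    lines.append(f'license: "{license_id}"')  # SPDX identifier, e.g. MIT
    return "\n".join(lines) + "\n"


# Hypothetical example project and author.
cff_text = build_citation_cff(
    title="My Research Software",
    authors=[("Doe", "Jane")],
    version="1.0.0",
    date_released="2023-01-01",
    license_id="MIT",
)

if __name__ == "__main__":
    # Place the file in the root of the repository.
    Path("CITATION.cff").write_text(cff_text, encoding="utf-8")
```

Writing the file by hand in a text editor works just as well; the point is only that a few lines of metadata are enough to tell users how to cite the work.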

Given the differences mentioned before, what do you usually recommend to researchers?

‘For most researchers, a permissive license is a good fit. Over the past years, the MIT license, named after the Massachusetts Institute of Technology (MIT), has become the most popular license among open source developers as well as at Utrecht University. In 2022, we conducted a study at Utrecht University which showed that 32 percent of the code and software was published under the MIT license. This license is especially interesting because it is only a short legal text which is remarkably easy to read and understand. If you choose this license, it is important to keep an eye on your dependencies or on code you copy-pasted from sources with a more restrictive or copyleft license. In that case, you should carefully check if you are still allowed to publish your work under a permissive license.

Some researchers are strongly against the reuse of their work in closed or commercial settings. They usually go for a copyleft license like GPL, LGPL, or EUPL. This does not prevent commercial use but requires modifications and derivations to be open source. This is usually fine for most of these researchers, as a license that enforces these requirements is not common.’
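The advice above to keep an eye on your dependencies can be partly automated. The following is a rough sketch for a Python project, assuming the dependencies are installed in the current environment. It reads the license information each package declares in its metadata and flags anything that looks like a copyleft license. The helper names and the list of hints are illustrative assumptions; declared metadata can be missing or wrong, so treat the result as a starting point for a manual check, not as a legal assessment.

```python
from importlib.metadata import distributions

# Crude substring hints (an assumption of this sketch, not a complete list).
# "GPL" also matches LGPL and AGPL.
COPYLEFT_HINTS = ("GPL", "EUPL", "GENERAL PUBLIC LICENSE")


def looks_copyleft(license_text):
    """Return True if the declared license text hints at a copyleft license."""
    upper = license_text.upper()
    return any(hint in upper for hint in COPYLEFT_HINTS)


def declared_license(meta):
    """Collect whatever license information a package declares in its metadata."""
    parts = []
    for key in ("License-Expression", "License"):
        # Only the first 80 characters: some packages paste the full text here.
        parts.extend(value[:80] for value in (meta.get_all(key) or []))
    parts.extend(
        classifier
        for classifier in (meta.get_all("Classifier") or [])
        if classifier.startswith("License ::")
    )
    return "; ".join(parts)


def flag_copyleft_dependencies():
    """Map installed package names to their declared license if it looks copyleft."""
    flagged = {}
    for dist in distributions():
        meta = dist.metadata
        name = (meta.get_all("Name") or ["unknown"])[0]
        license_info = declared_license(meta)
        if looks_copyleft(license_info):
            flagged[name] = license_info
    return flagged


if __name__ == "__main__":
    for name, license_info in sorted(flag_copyleft_dependencies().items()):
        print(f"{name}: {license_info}")
```

A package that shows up in this list is not necessarily a problem, and a package that does not show up is not necessarily fine. When in doubt, read the license of the dependency itself.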

Are there any reasons not to use an open source license for your work?

‘Definitely! However, this is less common than with publishing research data. You can think about code that exposes privacy-sensitive information. Usually, you can circumvent this with proper coding, but if this is not the case, you might not want to publish the work under an open license. Other reasons can be copyright-protected components, patents, legal constraints, or entrepreneurial ambitions with the work. In that case, I think it would be good to get in touch with your legal department, the Knowledge Transfer Office, and an expert in open source software.’

To me, it’s still a bit overwhelming. I can imagine researchers might feel the same. Where should they go for help?

‘This is an interesting question. Do you go to your legal department? Or to IT Support, the copyright office, or your Research Support Officer? Or to the colleagues of Research Data Management Support? I think researchers should go to the last of these, RDM Support, with their questions. They have lots of experience with licensing and are in contact with legal experts, RSOs, and open source software developers. Researchers should never hesitate to ask their questions there or to have their license double-checked. We are there to help!’

The definition for ‘research software’

The FAIR for Research Software (FAIR4RS) working group states that research software includes: “source code files, algorithms, scripts, computational workflows and executables that were created during the research process or for a research purpose. Software components (e.g., operating systems, libraries, dependencies, packages, scripts, etc.) that are used for research but were not created during or with a clear research intent should be considered software in research and not Research Software. This differentiation may vary between disciplines” (Gruenpeter, 2022). This definition is used in the paper “Introducing the FAIR Principles for research software” by Barker et al. (2022) and is also adopted by the RDM Support community at Utrecht University.