FAQ FAIR Data and Software
Frequently asked questions regarding the Open Science track FAIR Data and Software.
FAIR data is a term for describing (scientific) data that are Findable, Accessible, Interoperable, and Reusable. The principles of FAIR data are formulated in the paper “The FAIR Guiding Principles for scientific data management and stewardship” https://doi.org/10.1038/sdata.2016.18 by Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. in 2016. FAIR data is important in achieving maximum impact from research. To make research data FAIR, one can use the checklist formulated in the original paper or have a look at the FAIR data page of Utrecht University Research Data Management Support.
The FAIR principles formulated for research data are not perfectly applicable to research software. More recently, Lamprecht, Anna-Lena et al. (2020) https://dx.doi.org/10.3233/DS-190026 defined a set of principles that apply to research software. The website Fair software can be used as a checklist to make your research software FAIR.
Open data is a form of data publication in which the data is available to everyone. Usually, open data is freely available and can be (re-)used and re-published (check the license). Open data and code play an important role in science, because they can be used to verify scientific claims and to answer new research questions with existing data.
The Utrecht University policy on research data states: “Archived research data are to be made available for access and reuse at and outside Utrecht University insofar as is reasonably possible and subject to the proper precautionary measures.” (see Policies, codes of conduct and laws and the University policy framework for research data, 2016). Although the policy does not explicitly mention making research data FAIR and open, Utrecht University requires data to be at least reusable and accessible where reasonably possible.
More and more research funders have requirements on making your research data FAIR. Science Europe partners have aligned their data management plan requirements to include the FAIR principles. The Utrecht University Open Science programme also strongly encourages you to make your data and software FAIR, because making your work findable, accessible, interoperable, and reusable helps to maximize its impact.
In general, FAIR and open are not the same. The GO FAIR Initiative provides the following informative explanation:
“The ‘A’ in FAIR stands for ‘Accessible under well defined conditions’. There may be legitimate reasons to shield data and services generated with public funding from public access. These include personal privacy, national security, and competitiveness. The FAIR principles, although inspired by Open Science, explicitly and deliberately do not address moral and ethical issues pertaining to the openness of data. […] FAIR data does not need to be open, in order to comply with the condition of reusability, FAIR data are required to have a clear, preferably machine readable, license. The transparent but controlled accessibility of data and services, as opposed to the ambiguous blanket-concept of “open”, allows the participation of a broad range of sectors – public and private – as well as genuine equal partnership with stakeholders in all societies around the world.”
GO FAIR Initiative, 2020
A license tells people exactly what they can do with the material, which creates clarity around re-use. A license is a legal instrument that enables the copyright holder to provide permissions to others to exercise these rights, where, in the absence of a license, the party would likely infringe the owner’s copyright. In other words, without a license, the law, and indeed the world at large, presumes all the copyright holder’s rights are reserved.
In addition, licenses can be a convenient vehicle to deal with other matters such as limitation of liability, disclaimers of warranty, dispute resolution processes and governing law provisions. Some of these clauses, if adequately drafted, may serve to protect the copyright holder in case a third party comes to some harm as a result of their use of the material.
Australian National Data Service (ANDS): FAQ for research data licensing and copyright
In order to share and facilitate reuse of your research data and software, your work needs a license. A license states what a user can do with the data and software, and it protects the authors. Data and software licenses differ from each other. A license that fits data well might not be a proper software license.
For data, one can have a look at Creative Commons licenses. These licenses are very popular and easy to understand. The fact sheet “Creative Commons & Open Science” https://doi.org/10.5281/zenodo.840652 can help you find the best license for your project. You can find more information in the Guide Publishing and sharing data of the Utrecht University Research Data Management Support.
For software, it is advised to pick an OSI-approved (Open Source Initiative) license. Popular licenses can be found on the website of Choose a License: https://choosealicense.com/licenses/. In particular, the GNU GPLv3, MIT License, and Apache License 2.0 are widely used in research.
At the moment, there is no policy for licensing your data and software at Utrecht University.
Officially, Utrecht University, as your employer, is considered the rights holder to the research data and software you create. You, as a researcher, have the primary responsibility for taking care of the data. Visit the website of Utrecht University RDM Support for more information on this.
There are multiple ways to get support at Utrecht University with data and software. Some faculties have in-house support for this. A good starting point for help is RDM Support at Utrecht University. Visit the website or send an email to email@example.com.
RDM Support maintains a network of research (data) supporters in the faculties and central service departments.
Yes. It is possible to make sensitive research data Findable, Accessible, Interoperable, and Reusable. The level of openness of the research output plays an important role in this. You can follow all principles of FAIR, and still make your data and/or software only accessible under well-defined conditions. It is a good practice to make only non-sensitive metadata open in this case, while others can request access to the sensitive data. For more information on handling personal data, you can check the Guide Handling Personal Data.
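As a minimal sketch of the practice described above, the following Python example keeps a sensitive dataset's metadata in one record and publishes only the non-sensitive part. The class name, field names, identifier, and access conditions are all hypothetical and chosen for illustration; they do not correspond to any Utrecht University system or metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; all field names are illustrative."""
    title: str
    identifier: str            # placeholder for a persistent identifier such as a DOI
    license_name: str          # the license or agreement that governs reuse
    access: str                # "open" or "restricted"
    access_conditions: str = ""
    sensitive_fields: dict = field(default_factory=dict)  # never published openly

    def public_metadata(self) -> dict:
        """Return only the non-sensitive metadata that can be made open."""
        return {
            "title": self.title,
            "identifier": self.identifier,
            "license_name": self.license_name,
            "access": self.access,
            "access_conditions": self.access_conditions,
        }


record = DatasetRecord(
    title="Interview study (example)",
    identifier="doi:10.xxxx/placeholder",
    license_name="Data use agreement (example)",
    access="restricted",
    access_conditions="Access on request, subject to a signed data use agreement",
    sensitive_fields={"participant_contact": "stored separately"},
)

# Only the open part of the record is shown; the sensitive part stays behind.
print(record.public_metadata())
```

The open metadata keeps the dataset findable and states the conditions for access, while the sensitive content itself is shared only on request.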
Always check the license of the data or software. The license tells you what you are allowed to do with the data or software. Licenses might be hard to understand. Data most likely has a Creative Commons license and software an OSI-approved license. The websites of these organizations provide a simplified explanation of the licenses.
There is no clear answer to this question. There are a couple of things you can check (non-exhaustive):
- Is it clear who the author(s) is/are?
- Is there clear and detailed documentation?
- Does the dataset or software have a persistent identifier (e.g. a DOI)?
- Is the data a supplement to a peer-reviewed article?
- Is there some sort of quality standard or checklist used?
Are you still doubtful about the reliability of the data and software? Feel free to contact Research Data Management Support to have a look together. Visit the website or send an email to firstname.lastname@example.org.
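The checks listed above could be recorded in a small script, sketched here in Python under the assumption that each check is answered with a simple yes or no. The dictionary keys and the function name are hypothetical. Because there is no clear answer to the reliability question, the sketch gives no score or verdict; it only lists the checks that still need a closer look.

```python
# Keys are hypothetical labels; the questions follow the (non-exhaustive) list above.
CHECKS = {
    "authors": "Is it clear who the author(s) is/are?",
    "documentation": "Is there clear and detailed documentation?",
    "persistent_identifier": "Does the dataset or software have a persistent identifier (e.g. a DOI)?",
    "peer_reviewed": "Is the data a supplement to a peer-reviewed article?",
    "quality_standard": "Is there some sort of quality standard or checklist used?",
}


def open_questions(answers: dict) -> list:
    """Return the questions not answered with yes (unanswered counts as no)."""
    return [question for key, question in CHECKS.items() if not answers.get(key, False)]


# Example answers for an imaginary dataset; "quality_standard" is left unanswered.
answers = {
    "authors": True,
    "documentation": True,
    "persistent_identifier": False,
    "peer_reviewed": True,
}

remaining = open_questions(answers)
for question in remaining:
    print("Check further:", question)
```

An empty result does not prove that the data or software is reliable; it only means that none of these particular checks raised a concern.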
The principles of FAIR data go well together with the GDPR. For example, FAIR does not require ‘openness’, but requires accessibility under well-defined conditions. There are quite a few aspects to take into account when making privacy-sensitive data FAIR; therefore, the GDPR recommends carrying out a Data Protection Impact Assessment (DPIA). You can find a checklist and workshop for handling and publishing sensitive data on the website of Utrecht University Research Data Management Support.
Version control software, like Git, is gaining popularity in the research community. The use of version control software can make your research more transparent, versatile, and error-proof. More and more workshops and training resources are available for Git and GitHub. The introduction on GitHub is very useful, as are the courses of Open Science Community Utrecht.