Archiving your data in Yoda

Archiving research data means that you preserve the data for a specified period of time. In the Netherlands, all research data needed to verify a scientific result must be archived for a minimum of ten years, and in some cases (such as medical data) for longer.

Archiving versus publication

In some data repositories, archiving and publication of data are done simultaneously: data are then preserved long-term and also made available to others. In Yoda, however, these are two different steps: a data package that is archived in Yoda is preserved internally, but not necessarily published. Publication of the data, or only of the metadata, is a subsequent step in Yoda.

When to archive a data package?

In Yoda, it is possible to archive a data package at any time during your project, for example when you have completed data collection, completed the analysis, or published an article. At the very least, it is strongly recommended to archive your data package at the end of the research project, or when an article is published that relies on the data package as its source.

Archiving in Yoda

In Yoda, you can archive an entire folder of your research group, or only a subfolder, by submitting that folder to the Vault. When a data package is archived, a read-only copy (a snapshot) of the folder is made in the Vault area, with an accompanying “hash” that identifies the version of the folder that was archived. Once the snapshot is made, everyone with access to the research group can view the archived version of the folder under the Vault tab in the Yoda web portal. The snapshot is stored in the Vault for the indicated retention period. The active version in the Research area remains untouched, so you can continue working on the data there if needed. You can also archive the same folder again later; a new copy is then made in the Vault with a different “hash”.
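
To illustrate the general idea of an identifier that distinguishes one version of a folder from another, here is a minimal Python sketch that computes a checksum over a local folder. This is not Yoda's implementation, and Yoda may derive its “hash” differently; the function name folder_digest and the choice of SHA-256 are assumptions made for this example only.

```python
import hashlib
from pathlib import Path


def folder_digest(folder: Path) -> str:
    """Return a SHA-256 hex digest over the relative paths and contents of all files.

    Hypothetical helper for illustration; not part of Yoda.
    """
    digest = hashlib.sha256()
    # Sort the files so the result does not depend on directory listing order.
    for path in sorted(p for p in folder.rglob("*") if p.is_file()):
        relative_name = path.relative_to(folder).as_posix()
        digest.update(relative_name.encode("utf-8"))
        digest.update(b"\0")  # separator between file name and file content
        digest.update(path.read_bytes())
        digest.update(b"\0")  # separator between files
    return digest.hexdigest()
```

Calling folder_digest on the same unchanged folder twice returns the same value, while changing, adding, or renaming a file produces a different value. Comparing such a value before upload and after copying a data package back from the Vault is one possible way to confirm that your local copy matches what you archived.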

Remove the folder in Research

If you archive a folder at the end of your project and are not planning to work on it again, we recommend removing the folder in Research and only retaining the read-only copy in the Vault. This is important to keep storage costs low and limit the environmental impact of data storage. If you ever change your mind, it is always possible to copy the folder that is in the Vault back to Research (see below).

Screenshot of the vault. Marked in red: the button to access the Vault area. Marked in orange: the hash of this particular archived data package.

How to archive data in Yoda

1. Log in to the Yoda web portal and navigate to the folder that you wish to archive.
2. Make your data package archive-ready:

a. Fill out the Yoda metadata form as completely as possible;
b. Add your own documentation, such as a README.txt file, a codebook, etc.;
c. If your data package will be set to Restricted or Closed, add a License.txt file which explains how others may or may not reuse your data package;
d. Remove temporary files by selecting “Actions” > “Clean up temporary files”;
e. Give folders and files a logical name and structure;
f. Make sure your data package complies with privacy and security regulations. For example, don’t include identifiable personal data unless it is required for answering your research question;
g. Use preferred file formats where possible. You can check if you are using non-preferred file formats under “Actions” > “Check for compliance with policy”.

3. Submit the data package to the Vault by clicking “Actions” > “Submit”.

Screenshot of submitting the data package to the Vault in Yoda

If all required metadata is present, the system locks the folder and its subfolders temporarily. This ensures that the data is not changed during the archiving process.

4. Once the data package is submitted, the Yoda data manager receives a notification and checks whether the data package complies with rules, regulations, and policies, and whether it can be made more FAIR. The data manager will not review the content of the data package. The data manager will either approve the data package for archiving or suggest improvements. You will be notified of the outcome by email.
5. Once the data manager accepts the data package for archiving:

a. The data package is copied to the Vault where it is read-only;
b. The data package in Research will be unlocked, allowing you to work with the files again if needed;
c. You will be able to click “Go to Vault” or “Go to Research” in the web portal to see the Research or Vault equivalent of the data package.

6. Once the data package is archived, you can also:

a. Submit it for Publication;
b. Archive a new copy of the data package (repeat the process above);
c. Copy the contents of the Vault back to Research (see the next section).
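
Some of the preparation in step 2 can be checked on your own computer before you upload the folder to Yoda. The Python sketch below is one possible way to do that: it looks for a README file, for a License.txt file when the package will be Restricted or Closed, and for common temporary files. It does not use or replace the checks in the Yoda portal. The function name check_package and the lists of file names are assumptions made for this example, and the files that Yoda itself treats as temporary may differ.

```python
from pathlib import Path

# Assumed file names, for illustration only; adjust them to your own conventions.
DOC_NAMES = {"readme.txt", "readme.md"}
LICENSE_NAMES = {"license.txt"}
TEMP_NAMES = {".ds_store", "thumbs.db"}
TEMP_PREFIXES = ("._", "~$")


def check_package(folder: Path, restricted_or_closed: bool = False) -> list[str]:
    """Return a list of warnings for a local folder; an empty list means no issues found.

    Hypothetical helper for illustration; not part of Yoda.
    """
    warnings = []
    # Names of the files directly inside the folder, compared case-insensitively.
    top_level = {p.name.lower() for p in folder.iterdir() if p.is_file()}

    if not top_level & DOC_NAMES:
        warnings.append("No README file found at the top level.")
    if restricted_or_closed and not top_level & LICENSE_NAMES:
        warnings.append("No License.txt file found at the top level.")

    # Look for temporary files anywhere in the folder tree.
    for path in sorted(p for p in folder.rglob("*") if p.is_file()):
        name = path.name.lower()
        if name in TEMP_NAMES or name.startswith(TEMP_PREFIXES):
            warnings.append(f"Temporary file: {path.relative_to(folder).as_posix()}")
    return warnings
```

For example, check_package(Path("my_data_package"), restricted_or_closed=True) returns one warning per problem found in a local folder named my_data_package (a placeholder name). The metadata form, the file format check, and the privacy review still need to be done as described in step 2.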

Copy archived data back to Research

From the Vault, it is possible to copy archived data packages back to Research so that you can work with them again. This can be useful if there is no editable version of the data package in Research anymore, or when there is an audit of published results, for example.

1. Log in to the Yoda web portal and, in the Vault, navigate to the data package that you wish to copy back to Research.
2. Select “Actions” > “Copy datapackage to research space”.