Index
A
Anonymisation | The process of either irreversibly encrypting or removing personally identifiable information from data sets. See step 5 in 'Handling Personal Data'. |
---|---|
AVG | The Algemene Verordening Gegevensbescherming (AVG), or General Data Protection Regulation (GDPR), requires you as a researcher to provide clarity and transparency where personal data are concerned. See 'Handling Personal Data'. |
Archiving | Long-term storage that holds copies of (data) files for future reference or use. An archive preferably offers search functionality, verifies file integrity, keeps replicas to spread risk, and, if necessary, migrates old file formats to current ones. |
B
Backup file | A file you can use to restore your data should the master file be lost, damaged, removed, or accidentally overwritten. See 'Storing and preserving data'. For storage solutions with backup facilities, have a look at our IT-solutions. |
---|---|
Bit rot | Refers to the phenomenon that all digital sources degrade over time. To learn how to protect your data, read 'Storing and preserving data'. |
C
Checksum | A number generated from a file's content for the purpose of detecting errors which may have been introduced during data transmission or storage. To learn how to protect your data, read 'Storing and preserving data'. |
---|---|
Code | See the definition in this index under 'FAIR Research Software'. |
Controlled vocabulary | A list of allowed words, terms, or phrases which improves data interoperability. See 'Designing metadata schemes' for the services we offer to assist you. |
Costs of data management | Costs may be applicable when managing your data in all stages of your research project (collect, handle, preserve). See costs of data management for an overview of possible costs. |
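As an illustration of the 'Checksum' entry above, the following is a minimal sketch of how a checksum can be generated and later used to detect changes in a file. It assumes SHA-256 as the algorithm, which is one common choice and not one prescribed by this index; the function names `sha256_checksum` and `is_unchanged` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_checksum(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of the file at `path`.

    The file is read in chunks so that large data files do not
    have to fit in memory.
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, expected: str) -> bool:
    """Compare the file's current checksum with a previously recorded one."""
    return sha256_checksum(path) == expected
```

In practice you would record the checksum when a file is stored or sent, and call `is_unchanged` after a transfer or at a later date; a mismatch indicates that the content has been altered or corrupted.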
D
Data citation | A citation to a dataset, similar to a citation to a research paper. It gives proper recognition to dataset authors. Read the relevant section of 'Publishing and sharing data'. |
---|---|
Data classification | A data classification, in the context of information security, is the classification of data based on its level of sensitivity and the impact to the University should that data be disclosed, altered or destroyed without authorisation. The classification of data helps determine what security controls are appropriate for safeguarding that data. See the Data Privacy Handbook for more information about data classification. |
Data journal | A peer-reviewed journal in which to describe and formally publish datasets. Read the relevant section of 'Publishing and sharing data'. |
Data management plan (DMP) | A data management plan (DMP) is a formal document you develop at the start of your research project which outlines all aspects of managing your data, during and after your project. See 'Data management planning' for more information. |
Data ownership | Data ownership is the act of having legal rights and control over a dataset. It defines and provides information about the rightful owner and the acquisition, use and distribution policy implemented by the data owner. |
DPIA or GBEB | Data Protection Impact Assessment (GBEB in Dutch). During a DPIA you fill in a form which helps you assess privacy risks and identify measures to address possible privacy problems at an early stage. According to Article 35 of the AVG/GDPR, an assessment should be made if data processing is likely to pose a high privacy risk for the data subjects. See 'Handling personal data'. |
Data sharing | The practice of making data available for reuse, preferably by depositing the data in a repository or publishing it in a data journal. See 'Publishing and sharing data'. |
Data package | All files related to a specific dataset that are needed to make published results based on this dataset reproducible and/or to make the data itself reusable for new studies. See ‘Storing and Preserving Data’, VIII. Prepare a data package. |
Data Transfer Agreement | A legal agreement drawn up when (personal) data is transferred from the controller to a third party and there is a risk that the data is inappropriately accessed or used. It sets out how the data may be handled, who has access, for what exact purpose the data may be used, and for how long. Also called Data Processing Agreement. |
E
Encryption | The conversion of data into a form that cannot be easily understood by unauthorised people. |
---|---|
External data | Data for which ownership lies completely outside of the university. Also 'third party data'. |
F
FAIR Data | Acronym to describe data that is Findable, Accessible, Interoperable, and Reusable. See 'How to make your data FAIR'. |
---|---|
FAIR Research Software | The FAIR for Research Software (FAIR4RS) working group states that research software includes: “source code files, algorithms, scripts, computational workflows and executables that were created during the research process or for a research purpose. Software components (e.g., operating systems, libraries, dependencies, packages, scripts, etc.) that are used for research but were not created during or with a clear research intent should be considered software in research and not Research Software. This differentiation may vary between disciplines” (Gruenpeter, 2022). This definition is used in the paper “Introducing the FAIR Principles for research software” by Barker et al. (2022) and is also adopted by the RDM Support community at Utrecht University. See Software and Computing for more information. |
G
Governance | On 1 January 2016, the University Policy Framework for Research Data came into force. See our 'About' page for more information. |
---|---|
GDPR | The Algemene Verordening Gegevensbescherming (AVG), the implementation of the General Data Protection Regulation (GDPR), has been in force since 25 May 2018. The AVG requires you as a researcher to provide clarity and transparency where personal data are concerned. See 'Handling Personal Data'. |
GBEB or DPIA | Stands for ‘GegevensBeschermingsEffectBeoordeling’, the Dutch translation of ‘Data Protection Impact Assessment (DPIA)’. It is obligatory in many cases where privacy-sensitive data is involved. See ‘Handling Personal Data’. |
H
High performance computing (HPC) | Aggregating computing power through parallel processing in a way that delivers much higher performance than a single desktop computer or workstation could. |
---|---|
I
Identifiable personal data | Data that without much effort leads to the identity of a person. This can be directly (for instance by name, address) or indirectly (for instance a rare occupation combined with age). Read more in 'Handling Personal Data'. |
---|---|
Informed consent | A voluntary, specific and unambiguous expression of will by a research subject, based on adequate information, to accept the processing of his/her personal data. See 'Legal agreements and documents', or have a look at 'Informed consent for data sharing'. |
Interoperability | Data interoperability is the extent to which data can be compared, analyzed and/or merged with similar data. It relies on the use of standards, data documentation, and metadata to indicate to researchers which data sets or variables are comparable. |
L
License | A special permission to do something on, or with, data which could otherwise be legally prevented. Usually phrased along the lines of 'some rights reserved', such as attribution required. Read 'Publishing and sharing data'. |
---|---|
M
Metadata | Information about data, with a potential for machine-to-machine interoperability. Mostly a small set of vocabulary words used to describe a source. |
---|---|
Metadata sheet | A template that reflects what is being measured, observed, or monitored at the various sites/samples/subjects, as well as the circumstances under which this is done. See 'Data description in practice'. |
Master copy location | The location at which the original copy of a (data) file is stored, from which backups or temporary copies are made. The master copy location is used as the source location for files for further processing. By default there can be only one master copy location per file. Other locations are temporary (and not used as sources). |
N
Non-disclosure agreement | A legally binding contract between two parties in a professional relationship to ensure confidentiality of sensitive information. See 'Legal agreements and documents'. |
---|---|
O
Open data | Structured data that are machine-readable, intelligible, and freely shared under a licence that is least restrictive. Read 'Publishing and sharing data'. |
---|---|
Open science | The practice of science in such a way that others can collaborate, contribute and reflect during the entire life cycle of research. Research output is freely available, under terms that enable reuse, redistribution and reproduction of the research and its underlying data and methods. |
P
Persistent identifier (PID) | A permanent and unique reference to an online digital object, independent of (a change in) the actual location. See the relevant section of 'Publishing and sharing data'. |
---|---|
Personal data | Any information relating to an identified or identifiable natural person; directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his physical, physiological, mental, economic, cultural or social identity. See 'Handling personal data'. |
Preferred format | A file format that has the best chance of being usable in the future. Read 'Storing and preserving data'. |
Privacy-sensitive data | If you collect research data that enables you to identify a person, it is classified as privacy-sensitive data. See 'Handling personal data'. |
Proprietary format | A file format for which encoding is either secret or published with its use restricted through licences. Read 'Storing and preserving data'. |
Pseudonymisation | Identifying fields within a data record are replaced by one or more artificial identifiers or pseudonyms, either with or without the possibility of re-identifying the subject of the data (reversible or irreversible). It allows data on the same subject to be linked across data records without revealing the subject's identity. See the relevant section of 'Handling Personal Data'. |
Processing data | Any operation or set of operations which is performed on data, encompassing the collection, recording, organization, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction or erasure of data. See ‘Handling Personal Data’. |
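To make the 'Pseudonymisation' entry above more concrete, here is a minimal sketch of replacing an identifying field with a pseudonym while keeping records on the same subject linkable. It assumes a keyed hash (HMAC-SHA256) as the technique, which is only one possible approach and not one prescribed by this index; the function names, the `secret_key` parameter, and the 16-character truncation are hypothetical choices for illustration.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Derive a pseudonym from an identifier using a keyed hash.

    The same identifier and key always give the same pseudonym, so
    records on one subject remain linkable. Without the key, the
    pseudonym cannot feasibly be traced back to the identifier.
    """
    mac = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    # Truncated for readability (an illustrative choice).
    return mac.hexdigest()[:16]


def pseudonymise_records(records, id_field, secret_key):
    """Return copies of the records with `id_field` replaced by a pseudonym."""
    return [
        {**record, id_field: pseudonymise(record[id_field], secret_key)}
        for record in records
    ]
```

With this approach, whoever holds the key can recompute the pseudonym for a known identifier, so the key needs to be stored separately from the pseudonymised data and protected accordingly.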
R
Repository | A central storage facility to preserve, manage, and provide access to many types of digital material. As a result, the material can be searched, discovered, and reused. See 'Publishing and sharing data'. |
---|---|
Reproducibility | The quality of being reproducible (to produce again or anew; re-create). The reproduced result may be based on the raw data, description (metadata included), documentation, and computer programs provided by researchers. These may be archived together in a data package. |
Research Software | See 'FAIR Research Software' in this index. |
S
Sensitive personal data | The following personal data is considered ‘sensitive’ and is subject to specific processing conditions according to the GDPR: personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs; trade-union membership; genetic data, biometric data processed solely to identify a human being; health-related data; data concerning a person’s sex life or sexual orientation. |
---|---|
Software | See the definition in this index under 'FAIR Research Software'. |
T
Trusted digital repository | An infrastructure component that provides reliable, long-term access to managed digital resources. It stores, manages, and curates digital objects and returns their bit streams when a request is issued. Trusted repositories undergo regular assessments according to a set of rules, such as those defined by the Data Seal of Approval (DSA) or TRAC (ISO 16363). |
---|---|
V
Versioning | The creation and management of multiple releases of a product, all with the same general function but improved or customized. See 'Storing and preserving data'. |
---|---|