Costs of data management
To help you estimate the costs of data management, an overview of possible costs per research phase and research activity is presented.
1. Estimate the costs of data management
To help you estimate the costs which are involved in making your research data findable, accessible, interoperable and reusable (FAIR), have a look at the overview of possible costs per research phase and research activity below.
The content is based on the guide Research data management and costs, which is itself based on the Data Management Costing Tool developed by the UK Data Service.
A. Acquiring external datasets |
---|
Question to consider | Estimated costs | Tips |
---|---|---|
Do you plan to use existing (commercial or open) data? | Example: A faculty license on a database for macro-economic analyses costs approximately €18,000 per year. |
- Your library may be able to help you acquire a license to a crucial database. - In research data repositories, data can be available at no or low costs. |
B. Granularity of data |
---|
Question to consider | Estimated costs | Tips |
---|---|---|
Do you collect data on the same level of detail that you want / will be able to process? | Example: If collecting half the data is enough, costs for transferring, collecting, storing etc. will also go down by half. |
- It can be tempting to collect as much data as you can. However, if you collect data every second but already plan to average them to daily values, sampling hourly, or even less often, may well give you reliable information. - You can do a pilot to assess the granularity you need to answer your research question, or consult a statistician to calculate how much data you need. |
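To make the effect of sampling frequency concrete, the following minimal sketch counts the measurements a single sensor would produce per day at a few sampling intervals. The function name and the chosen intervals are illustrative assumptions, and the sketch assumes that transfer and storage costs scale roughly with the number of measurements.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def samples_per_day(interval_seconds: int) -> int:
    """Return how many measurements one sensor yields per day."""
    return SECONDS_PER_DAY // interval_seconds


# Hypothetical sampling intervals to compare (in seconds).
intervals = {"every second": 1, "every minute": 60, "every hour": 3600}

baseline = samples_per_day(1)
for label, seconds in intervals.items():
    count = samples_per_day(seconds)
    share = count / baseline
    print(f"{label:>12}: {count:>6} samples/day ({share:.4%} of per-second volume)")
```

Whether a coarser interval still answers your research question is a methodological decision that a pilot or a statistician can help with; the sketch only shows how quickly the data volume changes.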
C. Formatting and organising |
---|
Questions to consider | Estimated costs | Tips |
---|---|---|
Are your data files, spreadsheets, measurements, interview transcripts, records etc. all stored in a uniform format or style, clearly named with unique file names and well organised? |
Per project, organising the style, format and names can be done by a student assistant at level 1 salary (~17 euro per hour) or a data manager at level 2 salary (~60 euro per hour). |
- If you plan data formats and data organisation beforehand by developing templates and data entry forms for individual data files (transcripts, spreadsheets, databases) and by constructing clear file structures, low or no additional cost will apply. If you have to develop these afterwards, higher costs are involved. |
D. Transcription |
---|
Questions to consider | Estimated costs | Tips |
---|---|---|
- Will you transcribe qualitative data (e.g. recorded interviews or focus group sessions) as part of your research? - Is additional hardware/software needed? |
Example: Time needed for transcription is four to eight hours per hour of recording. Visit the transcribing calculator to estimate the time needed for your project. |
- If you embed transcription as part of your research practice, very low or no additional cost will apply. - Consider the costs of (the time needed for) developing procedures, templates and guidance for transcribers. - Commercial services may be able to transcribe your audio automatically and securely, which may save you enough time to be worth the costs. |
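The rule of thumb above (four to eight working hours per hour of recording) can be turned into a rough time and cost range. The sketch below is only an illustration: the function name, the 20 hours of recordings and the choice of the ~17 euro hourly rate (the student assistant rate quoted elsewhere on this page) are assumptions, not figures for any actual project.

```python
def transcription_estimate(recording_hours, hourly_rate, hours_per_recorded_hour=(4, 8)):
    """Return (low, high) working hours and (low, high) cost in euros."""
    low, high = hours_per_recorded_hour
    work_hours = (recording_hours * low, recording_hours * high)
    cost = (work_hours[0] * hourly_rate, work_hours[1] * hourly_rate)
    return work_hours, cost


# Hypothetical project: 20 hours of interviews, transcribed at ~17 euro per hour.
work_hours, cost = transcription_estimate(recording_hours=20, hourly_rate=17)
print(f"Transcription time: {work_hours[0]}-{work_hours[1]} hours")
print(f"Estimated cost: {cost[0]}-{cost[1]} euro")
```

Replace the rate and the hours-per-recorded-hour range with values that match your own transcription practice.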
E. Consent for data sharing |
---|
Question to consider | Estimated costs | Tips |
---|---|---|
Do you need to ask participants for their consent for data sharing? | Gaining informed consent can be done by a student assistant at level 1 salary (~17 euro per hour) or data manager at level 2 salary (~60 euro per hour). |
- When consent for data sharing is considered as part of standard consent procedures early in research, very low or no additional costs apply. - When participants need to be recontacted or revisited to obtain active consent, high costs may apply, e.g. because of extra preparation of information sheets and consent forms, consent discussions or training of interviewers. |
F. Data transfer |
---|
Questions to consider | Estimated costs | Tips |
---|---|---|
- Are special measures needed to transfer data from mobile devices, from fieldwork sites or from home equipment to a central work server? - Is software or hardware needed for encryption before data transfer or for synchronisation of data files across sites? |
Free encryption or data transfer software (e.g. SURFfilesender) is available in most cases. |
- See the storage solutions of Utrecht University for more information on SURFfilesender. - Utrecht University has developed BoxCryptor for encryption. |
A. Data description and metadata |
---|
Questions to consider | Estimated costs | Tips |
---|---|---|
- Is data in a spreadsheet, database or data warehouse clearly marked with variables, variable labels and value labels, code descriptions, missing value descriptions, etc.? - Are files, records and items in the collection clearly described with well-defined metadata or a metadata standard to interpret the relations between them and to quickly select and understand the content? - Does textual data like interview transcripts need a description of the context, e.g. included as a heading page? |
Examples: - 4 hrs per single experiment (120 measurements) filling in 60 required metadata fields, with assistance of a data manager at level 2 salary (~60 euro per hour). - According to the UK Data Archive, two to three weeks are costed into an average two-year research grant application to prepare and collate materials for deposit. |
- If data description is carried out as part of data creation, data input or data transcription, low or no additional costs will apply. |
B. Documentation |
---|
Description | Estimated costs | Tips |
---|---|---|
Do you have documentation for the data that describes the context and methodology of how data was gathered, created, processed and quality controlled? | Researcher at level 2 salary (~60 euro per hour). |
- Often, essential contextual and methods documentation will be written up in publications and reports. |
A. Data back-up |
---|
Questions to consider | Estimated costs | Tips |
---|---|---|
- How frequently should back-ups be done and how many back-ups should be stored? - Does your institution provide regular back-up or not? |
Examples: - University drive: €0.80 per GB/year. - Cloud: €0.30 per GB/year. - 2 x hard drive: €0.14 per GB (single purchase). |
- Institutional back-up is often included in standard indirect cost/overheads. - Cost for additional back-up will depend on the number of copies to be kept, frequency of back-up and required storage media. |
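As a rough illustration, the example prices above can be combined with a data volume and a project duration to compare back-up options. The sketch below assumes that costs scale linearly with size and time, and treats the hard drive price as a one-off purchase; the function name and the 500 GB / 4 year project are hypothetical.

```python
# Example prices taken from the table above (euros).
PRICE_PER_GB_YEAR = {"university drive": 0.80, "cloud": 0.30}
HARD_DRIVE_PRICE_PER_GB = 0.14  # one-off purchase, as quoted for 2 x hard drive


def backup_costs(size_gb, years):
    """Return the estimated total cost per back-up option, rounded to cents."""
    costs = {
        name: round(price * size_gb * years, 2)
        for name, price in PRICE_PER_GB_YEAR.items()
    }
    costs["hard drives (one-off)"] = round(HARD_DRIVE_PRICE_PER_GB * size_gb, 2)
    return costs


# Hypothetical project: 500 GB kept for 4 years.
estimate = backup_costs(size_gb=500, years=4)
for option, euros in estimate.items():
    print(f"{option}: {euros:.2f} euro")
```

Price alone does not capture differences in reliability, security or maintenance effort between these options, so treat the outcome as one input to the decision.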
B. Data storage |
---|
Questions to consider | Estimated costs | Tips |
---|---|---|
- How much data storage space is needed for the entire duration of your project? - Do you need to set up a data model and accompanying database for the data? - Do you need a data warehouse or a database architect? |
Example: - Cloud database as a service: €160/month (storage 5 GB, transfer 30 GB). |
- Institutional storage is often included in standard indirect cost/overheads. - Costs for additional storage could include server or disk space, as well as the costs of setup and maintenance. |
A. Data access |
---|
Question to consider | Estimated costs | Tips |
---|---|---|
- Do external people require access to research data? - Does remote access via VPN or secure FTP need to be arranged for external people? |
Often, researchers can use (free) existing services. |
At Utrecht University you can use SURFfilesender or SURFdrive. See 'Storage solutions'. |
B. Data security |
---|
Questions to consider | Estimated costs | Tips |
---|---|---|
- Should you protect data against unauthorised access or disclosure? - Is an institutional server available where you can store your data safely? - Can security be arranged by institutional IT services or is extra software/hardware needed? - Do your data files need encryption before storage or transfer? |
Examples: - TTP (trusted third party), dependent on pseudonymisation type, ca. €1,000-€30,000. - Existing encryption services could be used at no cost. |
For confidential or privacy-sensitive data, determining the conditions for controlling access to shared data may require extra time and discussion. See the guide to handling personal data. |
A. File format |
---|
Questions to consider | Estimated costs | Tips |
---|---|---|
- Does data need to be converted to a standard or open format with long-term validity for long-term preservation? - Is additional software or hardware needed for conversion? |
Researcher at level 2 salary (~60 euro per hour). |
- For audiovisual data, converting to open digital formats can be time-consuming or require special equipment and/or software for databases. Also, conversions may require checking for truncation, loss of metadata or annotation, loss of relationships, etc. - If you are able to select a standard or open format ('preferred format') to work in from the start, conversion at the end is omitted. |
A. Anonymisation |
---|
Questions to consider | Estimated costs | Tips |
---|---|---|
- Do you need to remove identifying information or conceal the identity of participants (e.g. using pseudonyms) before data can be shared? - Have you considered measures to ensure that anonymisation is consistent throughout data collection? |
Example: Transcribing and simultaneously anonymising audio (speech): - Up to one hour per 5-minute fragment (depending on the level of precision of the transcription). |
- If anonymisation is planned before data collection or transcription/digitisation, lower costs will apply. - Anonymising audiovisual data, voices or faces can be very costly and could reduce the usefulness of data. - For quantitative data (e.g. survey data) cost can be kept low if identifiers are a priori excluded from data files, easy to remove or coded to avoid disclosure. Costs may be higher if variables need recoding afterwards to avoid disclosure. - For qualitative textual data (e.g. interview transcripts) costs can be reduced if anonymisation is carried out during transcription (or at least highlighted/coded during transcription). - Costs depend on how sensitive or complex data is and how much identifying information is recorded in the data. If only removal of names is required, costs are low; pseudonymisation, however, will require more time. |
B. Copyright |
---|
Questions to consider | Estimated costs | Tips |
---|---|---|
- Do other parties hold copyright in the data? - Do you need to seek copyright clearance before sharing data? - Is legal advice required? |
Legal advice at level 3 salary (external expert, ~160 euro per hour). |
Seeking clarity in advance will make sure you don't jeopardise the progress of your research later on. |
C. Data sharing |
---|
Questions to consider | Estimated costs | Tips |
---|---|---|
- Will your data be deposited with a data centre or institutional repository? - Which requirements exist to prepare data to particular standards e.g. regarding documentation or format? - Does structured metadata need to be created when data is shared via a data centre or archive, e.g. completing a deposit form for 4TU.ResearchData or DANS? - Which data will or will not be retained, and for how long? |
Examples: - Completing a data repository upload form (e.g. via 4TU.ResearchData or DANS) may take 15 min to 4 hrs. - Dryad: €110 once (max 20 GB). - DataverseNL: €3.60 per GB/year. - Cloud database as a service: €160/month (storage 5 GB, transfer 30 GB). |
- A public repository/data centre/data journal can provide you with the possibility to share your data for reuse. To prepare data for sharing and preservation, find out what data deposit and/or longer-term storage will cost you per year, including time and effort. - Data centres will have their own metadata forms. Consider using these during your research. |
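The two repository prices quoted above follow different pricing models: a one-off fee with a size limit versus a fee per GB per year. The following minimal sketch compares them for a given dataset size and retention period. It uses only the example figures from this page, which may not reflect the repositories' current fee structures, and the function names and the 10 GB / 10 year dataset are hypothetical.

```python
DRYAD_FLAT_FEE = 110.0          # euros, one-off
DRYAD_MAX_GB = 20               # size limit quoted for the flat fee
DATAVERSENL_PER_GB_YEAR = 3.60  # euros per GB per year


def dryad_cost(size_gb):
    """One-off fee, or None if the dataset exceeds the quoted size limit."""
    return DRYAD_FLAT_FEE if size_gb <= DRYAD_MAX_GB else None


def dataversenl_cost(size_gb, years):
    """Total cost over the retention period, rounded to cents."""
    return round(DATAVERSENL_PER_GB_YEAR * size_gb * years, 2)


# Hypothetical dataset: 10 GB retained for 10 years.
size_gb, years = 10, 10
print(f"Dryad: {dryad_cost(size_gb)} euro (one-off)")
print(f"DataverseNL: {dataversenl_cost(size_gb, years)} euro over {years} years")
```

A comparison like this covers fees only; the time needed to prepare the data and complete the deposit form comes on top.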
D. Data cleaning |
---|
Questions to consider | Estimated costs | Tips |
---|---|---|
- Does quantitative data need to be cleaned, checked or verified before sharing, e.g. to check the validity of codes used or check for anomalous values? - Will data match documentation, e.g. same number of variables, cases, records, files? - Does textual information in data need to be spell-checked? - Do you need to combine your data with other data sets for your research? |
Examples: - According to DataSopic, a data cleaning service costs from €270 to well over €1,800. - Research/data manager at level 2 salary (~60 euro per hour). |
- Data cleaning takes time. - If you carry out data cleaning as part of data entry and preparation (before data analysis), low additional costs will apply. |
E. Digitisation |
---|
Questions to consider | Estimated costs | Tips |
---|---|---|
- Does analogue or paper-based research data (maps, newspaper clippings, photographs, images, text) need to be digitised to increase their potential for sharing? - Is additional equipment or software needed for scanning or conversion? |
Example: Digitisation €0.50 per page (few pages) or €320-390 per 1000 pages (Optical Character Recognition (OCR) included). |
- If simple image scanning of text is sufficient, the costs will be relatively low. - If OCR is required with manual checking for accuracy (revising entire scanned text), the costs may be high. - If manual data entry or typing is needed, e.g. to digitise tabular data, the costs may be high. |
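The example rates above can be applied to a page count to get a first cost range. This is a minimal sketch under stated assumptions: the bulk rate is assumed to scale linearly with the number of pages, and because the source does not say where "few pages" ends, the caller chooses which rate applies. The function name and the page counts are hypothetical.

```python
PER_PAGE_RATE = 0.50                 # euros per page for small jobs
BULK_RATE_PER_1000 = (320.0, 390.0)  # euros per 1000 pages, OCR included


def digitisation_estimate(pages, bulk=True):
    """Return a (low, high) cost range in euros for scanning `pages` pages."""
    if not bulk:
        cost = round(pages * PER_PAGE_RATE, 2)
        return cost, cost
    low, high = BULK_RATE_PER_1000
    return round(pages / 1000 * low, 2), round(pages / 1000 * high, 2)


# Hypothetical jobs: a 40-page notebook and a 5000-page archive.
print(digitisation_estimate(40, bulk=False))
print(digitisation_estimate(5000))
```

Manual checking of OCR output or manual data entry is not covered by these rates and would need to be budgeted separately.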
A. Operationalising data management |
---|
Questions to consider | Estimated costs | Tips |
---|---|---|
- What measures are needed to implement and operationalise data management? - Do you need extra time and resources to implement data management throughout your research, e.g. regular team meetings, setting up a collaborative research environment? - Do you need a dedicated data manager? - Do you need staff training? - Do you need to allocate roles and responsibilities for various data management activities? |
- Data manager at level 2 salary (~60 euro per hour). - Travel costs, lunch, time. |
If multiple partner institutions, researchers or funders are involved in your research project, consider the costs of data management planning meetings or discussions. |
Writing a data management plan will itself take about two hours to two days, depending on the complexity of your project. It is time well spent, because early planning of data management (especially when preparing for a funding application) can significantly reduce the costs.
2. Costs eligible for funding
Most funders consider the costs of data management eligible for funding. As early as the proposal phase, most research funders ask you to think explicitly about the management and publication of your research data, and the associated costs, both during and after your research project. Your Research Support Office offers a Research Funding Toolkit with an overview of the specifics per funder (log in with your SolisID).
If you have questions about funder requirements, have a look at the contact details of the Research Support Offices on the UU intranet.
If you have questions about the content of your Data Management Plan, have a look at the 'Guide: Data management planning' or contact us right away.