Exploring the potential of data donations

Interview with Karin van Es and Dennis Nguyen

Data donation presents a new approach to collecting digital trace data, liberating researchers from platform dependencies and restrictions. A consortium of six Dutch universities has now started building its own digital data donation infrastructure. Within this initiative, Karin van Es and Dennis Nguyen from the Faculty of Humanities at Utrecht University, supported by Laura Boeschoten (UU) and Niek de Schipper (UvA) of the national D3I team, are conducting two pilot studies: one focusing on the video-streaming platform Netflix, and the other on the AI chatbot ChatGPT. In this interview, Karin van Es and Dennis Nguyen elaborate on the project’s significance.

Why is a data donation platform necessary?

Karin van Es: ‘It is increasingly hard for researchers to access data on online behavior, as big tech companies and online platforms restrict access to their data. The digital data donation infrastructure D3I, funded by PDI-SSH, helps researchers navigate this challenge by asking participants to donate their “digital traces” for research purposes.’

What are data donations?

Van Es: ‘Under the GDPR, individuals have the right to obtain a copy of their personal data held by data controllers. They receive this copy as files known as Data Download Packages (DDPs). Data donations offer a new method for collecting data, giving researchers access to digital trace data that participants donate to their project. Participants request their DDPs, download them to their own devices, and then share them with researchers. They don’t donate all the data, but just the features of interest to the researcher. Moreover, participants can select which data they eventually want to donate. This method of data collection opens up new avenues for academic research while taking ethical measures to ensure responsible data practices. By leveraging this technique, scholars gain access to datasets that were previously inaccessible.’
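
As a rough illustration of the two steps described above (keeping only the features of interest, then letting the participant decide what to donate), the following Python sketch shows one way such a flow could look. It is not the D3I implementation: the function names, the field names and the example records are all hypothetical, and real DDP files differ per platform.

```python
# Hypothetical list of fields a researcher has asked for (an assumption,
# not an actual D3I or Netflix schema).
FEATURES_OF_INTEREST = ("title", "start_time", "duration")


def extract_features(records, features=FEATURES_OF_INTEREST):
    """Keep only the requested fields from each record and drop the rest."""
    return [{key: rec[key] for key in features if key in rec} for rec in records]


def apply_consent(rows, excluded_indices):
    """Remove the rows the participant chose not to donate."""
    excluded = set(excluded_indices)
    return [row for i, row in enumerate(rows) if i not in excluded]


# Made-up records standing in for a DDP file parsed on the participant's device.
ddp_records = [
    {"profile": "A", "title": "Show 1", "start_time": "2024-01-01 20:00",
     "duration": "00:45:00", "ip": "203.0.113.5"},
    {"profile": "A", "title": "Show 2", "start_time": "2024-01-02 21:00",
     "duration": "00:30:00", "ip": "203.0.113.5"},
    {"profile": "B", "title": "Film 3", "start_time": "2024-01-03 19:30",
     "duration": "01:50:00", "ip": "203.0.113.9"},
]

# Step 1: reduce the package to the features of interest, locally.
preview = extract_features(ddp_records)

# Step 2: the participant reviews the preview and withholds the second row.
donation = apply_consent(preview, excluded_indices=[1])

for row in donation:
    print(row)
```

In this sketch, fields such as the profile name and IP address are dropped before the participant reviews the preview, and only the rows left after that review would be donated.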

What are you exploring with these pilots?

Van Es: ‘For the first pilot, we received 129 Netflix DDPs and surveys with the help of panel recruitment company Ipsos I&O. Netflix is often credited with disrupting the traditional media landscape through its sophisticated data collection and analysis capabilities and the perceived effectiveness of its recommendation system. However, the implications of these innovations are still not fully understood. Traditionally, Netflix has kept its viewership data closely guarded, but a shift towards greater transparency has recently emerged. Yet the company still shares only the information it wants to share. As such, Netflix maintains control over the narrative, shaping public understanding of binge-watching, content popularity, and diversity within its catalog. How then can we critically explore these practices? Our pilot critically examines Netflix’s narratives, challenging its selective transparency and shedding light on the dynamics of streaming consumption.’

Dennis Nguyen: ‘We are currently setting up the second pilot, focusing on acquiring ChatGPT DDPs from university students. We are exploring how to conduct the research project in the most ethically sound manner, with input from the privacy officer and the ethics board. The impact of ChatGPT on learning objectives and educational approaches, both in the classroom and in academia at large, poses intriguing questions. What is missing in the cycles of hype and fear about GenAI in education is empirical research into the situation on the ground. How students use it in different contexts has only been tentatively explored, mostly through surveys. Limited research is available on how they perceive and harness GenAI services such as ChatGPT for study-related purposes. Researching student interactions through their actual user data, as collected by ChatGPT/OpenAI, opens new paths for better understanding how GenAI entered higher education from a student perspective.’

What’s next?

Nguyen: ‘So far, we find the opportunities really encouraging and we are looking forward to actually working with the data. We are curious to further explore the willingness of participants to donate to research and the quality of the data available. The ultimate goal of this project is to make this method more widely available to other researchers. More and more materials, including ethics approval procedures and data management guidance, will be made available online to support others using data donations for research projects in the future.’

About

Karin van Es is Affiliate and Impact Specialist at the Centre for Digital Humanities (CDH). She works as associate professor of Media and Culture Studies at Utrecht University and is part of the GenAI in Education Humanities taskforce.

Dennis Nguyen is assistant professor in computational methods and digital literacy in Media and Culture Studies at Utrecht University.

Interested in learning more about data donation? Visit the D3I symposium on May 30-31, 2024, at the University of Amsterdam, which also includes a short course on how to prepare your data donation study.