Using mobile phone data for migration research

By Albert Salah and Bilgecag Aydogdu

Digital breadcrumbs everywhere

People generate an enormous amount of data every day through their interactions with the applications running on their mobile phones, the communications they send, and the multimedia material they produce and consume on various platforms. The total volume of data created, captured, stored, or consumed is expected to increase from 2 zettabytes in 2010 to 181 zettabytes in 2025[1]. Policy makers cannot ignore the centrality of novel sources of data, such as the data captured by social media and telecommunication companies or by satellites. These novel sources are the digital footprints of individuals, and they can be used to infer rich information about society. When used wisely, big data sources have immense potential to compensate for some of the deficiencies of traditional sources and to contribute to evidence-based policy making.

Mobile phone data for migration research

Migration is one of the fields in which traditional data sources suffer from various problems, such as time lags, inconsistencies, incomplete information, and unclear definitions. Migration data fall into two broad categories: national and international migration data. The capacity to collect high-quality data through administrative censuses and surveys is much greater in developed countries than in developing countries. When it comes to international migration statistics, this imbalance in capacity between countries is exacerbated by inconsistent definitions of, and approaches to measuring, migration and migration-related phenomena such as migrant integration. In addition, international organizations that collect data on international migrant stocks and flows, such as the United Nations (UN), the Organisation for Economic Co-operation and Development (OECD), the World Bank (WB), and the European Commission, release updated data sets only every one to five years. The use of novel sources can potentially address some of the problems and challenges faced in the collection of migration data at the national and international levels.

There is one data source that holds great promise for migration research. Telecommunication companies (telcos) collect data on their subscribers for billing purposes, which we refer to as mobile phone data (MPD). Call detail records (CDRs) are a particular form of meta-information on mobile phone activity. For each act of communication, they contain the locations of the base towers involved, the timestamp of the outgoing or incoming call, and the duration of the call. CDRs are a rich source of information on human mobility and communication. Although the content of the communication is not included, the records are still sensitive, as they show where people are and with whom they are connected. There are also other data sources, such as extended detail records (xDR), inbound and outbound roaming data, and airtime top-ups. Each of these data sources has its own characteristics that can be used to complement traditional sources of migration data in various respects (Aydogdu et al. 2022). MPD sources are highly useful for creating proxy variables and indicators on migrant stocks, migration flows, integration, segregation, and remittances. The biggest advantage of using telco data to develop migration indicators is the spatial and temporal granularity of the resulting indicators, which is needed for evidence-based policy making on migration.
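To make the structure of a CDR and its link to a migration proxy more concrete, the following is a minimal Python sketch, not a method taken from the studies cited here. The `CallRecord` fields, the night window (20:00 to 06:59), and the rule "home tower = the tower used most often at night" are all illustrative assumptions; real telco schemas and published home-detection methods differ.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CallRecord:
    """One hypothetical CDR row; real telco schemas differ."""
    caller_id: str     # pseudonymous subscriber identifier
    callee_id: str     # pseudonymous identifier of the other party
    tower_id: str      # base tower that handled the caller's side
    start: datetime    # timestamp of the call
    duration_s: int    # call duration in seconds


def is_night(ts: datetime) -> bool:
    """Assumed night window: 20:00 to 06:59."""
    return ts.hour >= 20 or ts.hour < 7


def estimate_home_towers(records):
    """Map each caller to the tower they used most often at night."""
    counts = defaultdict(Counter)
    for rec in records:
        if is_night(rec.start):
            counts[rec.caller_id][rec.tower_id] += 1
    return {user: c.most_common(1)[0][0] for user, c in counts.items()}


def home_changes(records_before, records_after):
    """List (user, old_tower, new_tower) for users whose home tower moved."""
    before = estimate_home_towers(records_before)
    after = estimate_home_towers(records_after)
    return sorted(
        (user, before[user], after[user])
        for user in before.keys() & after.keys()
        if before[user] != after[user]
    )
```

Comparing estimated home towers between two periods gives a rough proxy for relocation. A real analysis would also need to handle sparse callers, tower coverage areas, and seasonal movement.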

MPD sources are privately owned by telcos and are highly sensitive. Access to such data is restricted to trusted researchers and to political and commercial partners of the telcos. Data sharing is usually enabled by restrictive data use agreements or by sharing precomputed indicators on human behavior instead of raw data. There are also data-for-social-good initiatives that aim to minimize the harm of using mobile phone data while maximizing the benefits, by applying best practices for the ethical and privacy-sensitive collection, processing, and storage of data (Salah et al. 2022). Previous mobile phone data challenges, such as Data for Development (D4D; Blondel et al. 2012) and Data for Refugees (D4R; Salah et al. 2018), are good examples of such initiatives. These challenges anonymized and aggregated CDR data sets and shared them with a community of researchers. Using these data sets, the researchers developed indicators on internal migration, forced migration, social and spatial segregation, and the integration of migrants.
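The general idea of sharing pseudonymized and aggregated indicators instead of raw records can be sketched as follows. This is only an illustration under stated assumptions and does not reproduce the actual D4D or D4R procedures: the function names, the keyed-hash pseudonymization, and the `min_users` suppression threshold are all hypothetical choices. Note that replacing identifiers with a keyed hash is pseudonymization, which on its own does not guarantee anonymity.

```python
import hashlib
import hmac
from collections import defaultdict


def pseudonymize(subscriber_id: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256), cut to 16 hex chars."""
    digest = hmac.new(secret_key, subscriber_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def aggregate_flows(trips, min_users=10):
    """Count distinct users per (origin, destination) pair and drop small cells.

    trips: iterable of (user_id, origin_region, destination_region) tuples.
    min_users: assumed suppression threshold; flows with fewer distinct
               users are removed so that small groups cannot be singled out.
    """
    users_per_flow = defaultdict(set)
    for user_id, origin, destination in trips:
        users_per_flow[(origin, destination)].add(user_id)
    return {
        flow: len(users)
        for flow, users in users_per_flow.items()
        if len(users) >= min_users
    }
```

In this sketch, only the aggregated flow counts would leave the telco, while the secret key and the raw records stay behind.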

Challenges and the way ahead

The use of big data in policy-related areas comes with its own complexity. Ethical questions naturally arise from the use of private personal data. Legal governance that ensures processing, storage, and sharing of data comply with the General Data Protection Regulation (GDPR) is essential. Privacy-preserving technologies and ethical monitoring of big data processing for policy making are possible ways to mitigate these issues. There are also political and philosophical questions about the use of big data for policy. As with any technology, there are abusive or highly problematic applications of big data and of the artificial intelligence (AI) methods required to process it. Discussions about the rise of surveillance capitalism (Zuboff 2015) and about the use of AI-driven technologies on vulnerable populations at the frontiers of the European Union and the United States of America (Molnar 2021) are timely and needed, considering the pace of developments in AI and its adoption in the policy field. However, these technologies also have immense utility for developing high-quality statistics and, potentially, for improving the well-being of people.

The use of MPD sources will continue to reshape the statistical domain. The interest of national statistical offices and international organizations in using ethically processed MPD is increasing. Multidisciplinary expertise in data science, privacy, and policy making is needed more than ever to identify the potential of MPD to fill data gaps and to help statistical offices provide high-quality statistics on human mobility and migration in near real time. In the EU Horizon 2020 project Hummingbird (https://hummingbird-h2020.eu/), Utrecht University is leading the work package on using mobile phone data to develop migration indicators. Recently, our research group also teamed up with the Oxford Centre for Technology and Development and IOM Mongolia to provide a series of capacity-building workshops at the National Statistics Office of Mongolia. The use of MPD would help Mongolia, a large country with a nomadic population, to keep track of the internal movement of people. In times of crisis, information on population movement can be vital for better preparedness for, and better response to, disasters.

Bio Albert Salah

Albert Salah works on computer analysis of human behaviour at different scales, including affective computing, multimodal interaction, and computational social science. 

Bio Bilgecag Aydogdu

Originally, I studied economics, political science, and sustainable development studies. Before starting my PhD in the Netherlands, I was doing a master's in Data Science. Currently, I am in the Social and Affective Computing group. I am writing my doctoral thesis with Albert Ali Salah on exploring human migration through the analysis of mobile phone data.