One Excel manipulation causes families’ distress

Crashing hard drives? Backups gone missing? Bugs in your code? Lost all your data? Who ya gonna call? Just the thought of it! And yet it happens every day. During this Data Horror Week, researchers share horror stories from their own experience, to help you avoid making the same mistakes!

Tell us your horror story, what happened?

I was responsible for the data collection among family members of primary respondents. Among those I had to approach for participation in our survey were randomly selected siblings for whom the primary respondents had provided contact information. I used an Excel file to make the stickers for the envelopes and questionnaires to be sent out to these siblings. The targeted siblings were informed that their [brother / sister] born on [dd.mm.yyyy] had given us their contact details. But right before printing the stickers, I sorted the data on date of birth, for no good reason really. Or more precisely, I sorted only the column with date of birth, not all the other columns. The result was that thousands of people received a letter saying that a sibling with a completely wrong date of birth had registered them for our study. The first day I came back into the office after the post had delivered our letters, I received a great many calls. People were really upset: they thought they had a sibling they didn't know about. Had their father had another family? How come this sibling knew about them, but not vice versa? I felt absolutely awful, and very guilty about having caused so much distress.

How long ago was it?

A long time ago, this was in 2003... GDPR was not in effect yet. I'm not sure whether this snowball family sampling would still be allowed.

How was this solved?

Well, it was very difficult to figure out what went wrong, and who was related to whom. We had to reconstruct the links using the original paper questionnaires and manually correct all the sibling records. We sent out new letters apologizing for our mistake. And I was on the phone a lot for about a week. The response rate to the survey, however, went up and was higher than expected. Not that I would recommend making the same stupid mistake, but apparently people forgave us.

How could this horror be avoided?

Never overwrite master data files! And always double-check your data before you approach real people.
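As a rough illustration of that advice, here is a minimal Python sketch. It is not the tool used in the study (that was Excel), and the column names and sample records are made up for the example. The idea is to sort whole rows into a new copy, leave the master data untouched, and check that every sibling still has the same date of birth attached before anything is printed.

```python
import csv
import io

# Hypothetical master data; in practice this would be read from a
# file that is kept read-only and never written back to.
MASTER_CSV = """sibling_id,sibling_name,respondent_dob
S1,Anna,14.03.1961
S2,Bram,02.11.1958
S3,Carla,27.07.1965
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def dob_sort_key(row):
    """Turn dd.mm.yyyy into (yyyy, mm, dd) so dates sort chronologically."""
    day, month, year = row["respondent_dob"].split(".")
    return (int(year), int(month), int(day))


def sorted_copy(rows):
    """Return a new list with whole rows reordered; the input is untouched."""
    return sorted(rows, key=dob_sort_key)


def same_pairs(before, after):
    """Check that every sibling still has the same date of birth attached."""
    def pairs(rows):
        return {(r["sibling_id"], r["respondent_dob"]) for r in rows}
    return pairs(before) == pairs(after)


master = load_rows(MASTER_CSV)
for_printing = sorted_copy(master)

# Double-check before anything goes to the printer.
assert same_pairs(master, for_printing)
```

Because each row travels as one unit, the date of birth cannot drift away from the person it belongs to, and the final check would fail loudly if it ever did.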

What lesson can we learn from this story?

You cannot be too precise when it comes to processing personal data. And also: people are really nice.

For advice to prevent you from making the same mistake, contact us or go to our website.

Data Horror Week

This Data Horror Week is an initiative from the RDM Support desks at Utrecht University, TU Delft, Leiden University and Twente University. For more stories, go to the Data Horror Week website. To stay up to date on all the horror stories this week, follow us on Twitter!