How data science can help wildlife

As an ecologist, Joeri Zwerts often finds himself ankle-deep in the mud of an African jungle. Joeri studies whether FSC-certified forestry practices are helping to save endangered species in the countries of Congo, Gabon and Cameroon. These are fascinating, but also primitive places to work, as seen in the NPO documentary Heroes of the Wilderness*. In this article, Joeri explains how data science is helping to further his research. 

FSC wood has to make a difference

But let us start at the beginning: Joeri was looking for a better way to monitor wildlife in tropical forests, and that has everything to do with the question of whether FSC-certified forestry practices are better for wildlife populations than standard forestry practices. ‘Consumers like you and me pay extra for wood with the certification, so you naturally want to be sure that it actually makes a difference.’ Joeri conducts most of his wildlife research by means of camera traps, but the limited range of such cameras is a major disadvantage. ‘I wanted to know whether sound might be a better method. Sound has a longer range, so an audio-based method could potentially be much more effective and less expensive. But in order to count the animals in audio recordings, you have to review all the recordings. And because this is extremely time-consuming, it's a process you want to automate. Otherwise, you might as well station people in the forest and have them manually check off every sound they hear.’

Looking to supplement his own expertise, Joeri got in touch with the Research Data Management Support (RDM) department. ‘RDM tipped me off about the grants available from the Applied Data Science research focus area, which gave me a way to pay for the hours worked by computer scientist Heysem Kaya. It was a fruitful collaboration in which we learned a great deal about each other's fields. We achieved things I would not have been able to manage on my own, and RDM has developed software that they never could have come up with themselves. That's the wonderful thing about this multidisciplinary approach.’ The collaboration has already yielded two scientific publications, with a third in the works.

An algorithm that recognises primate calls

Together with computer scientist Heysem, Joeri developed an algorithm to detect the various sounds made by primates. The choice of primates is a pragmatic one: these animals make a lot of noise and give a good indication of how well or poorly populations of endangered species are faring. Writing a detection algorithm requires training data. That sounds simple enough, but it isn’t. ‘If you're out collecting sound in the forest, you just have to hope that there are enough primates in that particular spot.’
Joeri found the training data he needed at a primate rescue centre in Cameroon. The big advantage of a place where primates are living in captivity is that you can be certain of finding primates there. With the help of students, every noise or call recorded was labelled. ‘We did that for five species of primate. That data allowed us to train a fairly simple algorithm.’ The trained algorithm proved effective in recognising the noises made by primates. 

Complex jungle noises

But what if you're dealing with data that was recorded in an actual tropical forest? Outside the primate rescue centre, in other words, and with a tremendously complex variety of background noise. The algorithm was still identifying too many non-primate sounds from the forest as primates, which is also known as giving false positives. Joeri explains how they resolved that issue: ‘We then “pasted” those complex jungle background noises on top of the primate noises and then re-trained the algorithm. That yielded much better results, and by mixing the various data, we were ultimately able to develop an effective monitoring method. Now anyone who has access to audio recordings of chimpanzees can use this method – i.e., the algorithm that we are making openly available – to monitor populations. We're also making the code and software user-friendly, so that organisations like the WWF can use a simple audio recording device – that could even be an old phone – to collect data and then apply our algorithm. Another thing our method shows is that training data from places like the local zoo can be effectively used to train detection algorithms. If the animal makes enough noise, that is.’
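The idea of "pasting" background noise on top of clean calls can be illustrated with a minimal sketch. This is not the project's actual code: the function name `mix_at_snr`, the signal-to-noise parameter `snr_db` and the synthetic signals are all assumptions made purely for illustration. The sketch overlays a random excerpt of a background recording on a call recording, scaled so that the call is a chosen number of decibels louder than the noise.

```python
import numpy as np


def mix_at_snr(call, background, snr_db, rng):
    """Overlay a background excerpt on a call clip at a chosen signal-to-noise ratio.

    Hypothetical helper for illustration only; not taken from the published software.
    """
    # Pick a random background excerpt with the same length as the call.
    start = rng.integers(0, len(background) - len(call) + 1)
    noise = background[start:start + len(call)]

    # Scale the noise so that (call power / noise power) equals snr_db in decibels.
    call_power = np.mean(call ** 2)
    noise_power = np.mean(noise ** 2)
    target_noise_power = call_power / (10 ** (snr_db / 10))
    scale = np.sqrt(target_noise_power / noise_power)
    return call + scale * noise


# Synthetic stand-ins (assumptions): a 1-second tone instead of a real primate call,
# and 5 seconds of random noise instead of a real forest recording.
rng = np.random.default_rng(0)
sample_rate = 16_000
t = np.arange(sample_rate) / sample_rate
call = 0.5 * np.sin(2 * np.pi * 800 * t)
background = rng.normal(0.0, 0.1, size=5 * sample_rate)

# Create several noisy variants of the same call for re-training a detector.
augmented = [mix_at_snr(call, background, snr_db, rng) for snr_db in (0, 5, 10)]
```

Each clean call yields several noisy variants, so a detector trained on them hears the call against the kind of background it will meet in the forest. The range of noise levels worth using would depend on the real recordings.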

‘But our work is far from finished. In an effort to further improve the algorithm, we published the data set in a computer science challenge. Computer scientists from around the world could try to improve the algorithm using state-of-the-art techniques. In the world of computer science, this is a common way to enhance a technique – and garner fame, for those who win the challenge. For me, though, this was totally new territory. Now, Heysem and I will work with UU Master's students to apply the state-of-the-art techniques gained from the challenge to our latest algorithm in order to further enhance its capacity for detection. We will make those results publicly accessible as well. In other words: everything we are developing is being done open source, so that other people can benefit from our results.’ 

Everyone has learned something

Joeri's project is a great example of why the Applied Data Science research focus area was established. In a previous article, Peter van der Heijden had the following to say: ‘I feel that we at UU should offer researchers the tools and support they need to apply data science in their own fields. Ideally, a scientist should not be overly influenced by what they already know and can do when they are trying to formulate a research question. It’s better to focus solely on the substance of the question and then bring in outside methodology and expertise from other people as needed. That fits the current climate in which people tend to work in teams.’ Joeri endorses this idea as well: ‘I think that if UU invests in this type of collaboration, it will be easier to achieve more than we otherwise could. I would not have been able to do this on my own. This way, we have all learned something.’

* The ‘Heroes of the Wilderness’ documentary is available via the paid service NPO Gemist.
 

More information

•    See also this previously published article from Research Data Management
•    Want to learn more about the Applied Data Science focus area?