The Bonvin lab published the deep learning framework DeepRank

The Bonvin lab at Utrecht University, in collaboration with Li Xue’s group in Nijmegen and the Netherlands eScience Center, recently published DeepRank, a deep learning (DL) framework for data mining 3D protein-protein structures. The paper appeared in Nature Communications and the code is available on GitHub.

What can it be used for?

DeepRank provides an easy-to-use DL platform for researchers who want to address their own questions about 3D protein structures. It takes 3D protein-protein structures as input and outputs a prediction. What that prediction means is defined by the user's research question and training data. For example: How do two proteins interact with each other in 3D space, and where do they interact (i.e., which part of one protein interacts with which part of the other)? Is a mutation disease-causing or not? Is a protein-protein interaction observed in an X-ray experiment a biological interaction or a crystal artefact? And many more.

Illustration of DeepRank

Protein functions are encoded in their 3D shapes. Over the past decades, a number of structural biology techniques have been developed (e.g., X-ray crystallography, NMR, cryo-EM), and a large number of experimentally determined 3D protein structures have accumulated. A human expert with 30+ years of training could painstakingly examine them by eye, since the spatial arrangement of atom pairs and their interactions sheds light on the secrets of biological life. However, human inspection is not efficient, and the relationship between the 3D structures of diverse proteins and their functions is too intricate to be fully mastered even by human experts.

The use of quantitative statistical approaches to approximate or simulate human perception was envisioned and pioneered by Frank Rosenblatt, a psychologist, in 1958. More than half a century later, this dream came true when deep learning achieved human-level accuracy in 2D image perception. However, such breakthroughs did not naturally translate to molecular biology.

DeepRank aims to facilitate this translation. By removing the daunting data preprocessing steps required for millions of structures, DeepRank allows a user to easily train a 3D Convolutional Neural Network (3D-CNN) that scans protein structures for desired patterns and makes predictions. In the paper, the authors showcase the effectiveness of DeepRank on two distinct applications in structural biology.
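To give a flavour of the kind of preprocessing involved, the sketch below maps atoms onto a regular 3D grid, the sort of input a 3D-CNN consumes. It is an illustration only and not DeepRank's own code: the function name voxelize, its parameters, and the Gaussian smearing are assumptions made for this example.

```python
import math

def voxelize(atoms, grid_size=8, spacing=1.0, sigma=1.0):
    """Map atoms onto a cubic grid centred on the origin.

    Hypothetical helper for illustration; not part of DeepRank.
    atoms: iterable of (x, y, z, value) tuples, where value is any
    per-atom feature (e.g. a charge).
    Returns a nested list grid[i][j][k] of accumulated feature density.
    """
    # Coordinate of the outermost grid point along each axis.
    half = (grid_size - 1) * spacing / 2.0
    grid = [[[0.0] * grid_size for _ in range(grid_size)]
            for _ in range(grid_size)]
    for x, y, z, value in atoms:
        for i in range(grid_size):
            gx = i * spacing - half
            for j in range(grid_size):
                gy = j * spacing - half
                for k in range(grid_size):
                    gz = k * spacing - half
                    # Squared distance from the atom to this grid point.
                    d2 = (x - gx) ** 2 + (y - gy) ** 2 + (z - gz) ** 2
                    # Spread the atom's feature with a Gaussian kernel
                    # (an assumed choice for this sketch).
                    grid[i][j][k] += value * math.exp(-d2 / (2 * sigma ** 2))
    return grid

# Usage: a single atom at the origin with feature value 1.0.
grid = voxelize([(0.0, 0.0, 0.0, 1.0)], grid_size=3)
```

With several feature types, one such grid per feature would be stacked as channels, much like the colour channels of a 2D image.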

DeepRank has been further extended with Graph Neural Networks (GNNs), a popular DL technique that can also be applied to data such as protein interaction networks. A preprint on DeepRank-GNN is already available on bioRxiv.

We envision that DeepRank will stimulate community efforts to exploit deep learning in tackling long-standing challenges in the life sciences.