Life is driven by proteins, which act as molecular machines. To understand them properly, detailed 3D protein structure models are needed; these give unique insight into biology and help the development of drugs. However, obtaining reliable protein structure models is challenging and labour-intensive. We have developed automated computational methods, collectively named PDB-REDO, to optimize thousands of protein structure models so that they are more complete and contain fewer errors, thereby allowing a better understanding of biology. The automation also saves scientists valuable time. Our website and services are used by over 10,000 people every month.
The methods that we created rely heavily on the concept of homology. The structures of homologous proteins, which share a common evolutionary ancestor, remain similar during evolution. For instance, the proteins that copy DNA are almost identical in humans and chimpanzees, but even those from tomatoes have similar 3D structures. Often, when a new structure is solved for a protein that already has several solved homologs, the existing data are not used to their full potential. Our methods are the first to systematically use all available homologous structure models to extract structural knowledge, and we applied them to improve protein structure models in several ways. The collected data are also visualized on a website (lahma.rhpc.nki.nl) that shows scientists where and how a structure model differs from its homologs. We also extended the work from proteins to carbohydrates and metal ions in protein structures, providing algorithms to correct and avoid mistakes in these complexes.
This work helps many researchers to easily obtain better protein structure models.