Multimodal AI

Figure: Architecture of DIMAF, a model for explainable multimodal AI.

This research line focuses on integrating life science data from diverse modalities to improve AI-based biological applications and personalized medicine.

In healthcare, integrating data from diverse modalities such as genomics, transcriptomics, medical imaging, and other high-throughput biological data offers a promising approach for improving deep learning-based patient diagnosis, prognosis, and stratification. Each modality provides complementary information: for example, genomics reveals molecular alterations driving disease, while imaging captures spatial and structural tissue changes. Single modalities are therefore often insufficient to capture the heterogeneity of complex diseases. By combining multiple modalities in deep learning models, we can improve predictive performance, enhance personalized healthcare, and gain deeper insights into the relationships and interactions across data modalities.
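To make this idea concrete, the sketch below shows a minimal late-fusion model in PyTorch: each modality gets its own encoder, and the resulting embeddings are concatenated before a joint prediction head. This is an illustrative toy example under assumed names and dimensions, not the architecture of any of our models.

```python
import torch
import torch.nn as nn

class LateFusionModel(nn.Module):
    """Toy two-modality model: each modality is encoded separately,
    then the embeddings are concatenated for a joint prediction.
    (Hypothetical sketch; real encoders would be CNNs/transformers.)"""

    def __init__(self, genomics_dim: int, imaging_dim: int,
                 hidden_dim: int = 128, n_classes: int = 2):
        super().__init__()
        # One encoder per modality.
        self.genomics_encoder = nn.Sequential(
            nn.Linear(genomics_dim, hidden_dim), nn.ReLU())
        self.imaging_encoder = nn.Sequential(
            nn.Linear(imaging_dim, hidden_dim), nn.ReLU())
        # Joint head operates on the fused (concatenated) representation.
        self.classifier = nn.Linear(2 * hidden_dim, n_classes)

    def forward(self, genomics: torch.Tensor, imaging: torch.Tensor):
        g = self.genomics_encoder(genomics)
        i = self.imaging_encoder(imaging)
        fused = torch.cat([g, i], dim=-1)  # late fusion by concatenation
        return self.classifier(fused)

# Example: random features for a batch of 4 patients.
model = LateFusionModel(genomics_dim=1000, imaging_dim=512)
logits = model(torch.randn(4, 1000), torch.randn(4, 512))
print(logits.shape)  # torch.Size([4, 2])
```

Concatenation is the simplest fusion strategy; the attention-based approaches discussed below replace it with learned, interpretable weighting schemes.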

Similarly, in microbiome research, multimodal approaches are essential for unraveling the complex interactions between microbes, their hosts, and the environment. Integrating data such as genomics, metagenomics, transcriptomics, environmental metadata, and plant phenotyping enables a more comprehensive understanding of microbial communities and their functions. This multimodal perspective allows researchers to link microbial composition with functional capabilities, interaction networks, contextual factors, and plant traits, revealing insights that single modalities cannot capture. Ultimately, such integration supports deeper characterization of microbial systems and more effective applications in areas such as agriculture, environmental management, and human health.

However, effectively leveraging the distinct information from such complex and heterogeneous data within multimodal deep learning models presents significant challenges, including selecting appropriate representations, designing robust fusion strategies, handling missing modalities, and maintaining model scalability. Moreover, for clinical applications, it is crucial that the decision-making process is transparent and explainable. Specifically, when integrating multiple modalities, it is essential to understand the relative contribution of each modality to the prediction and how the modalities interact with each other. Achieving this is challenging due to the complexity and scale of multimodal data and their interactions.
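The notion of per-modality contributions can be illustrated with a minimal gated-attention fusion sketch in PyTorch (a generic construction, not DIMAF itself): each modality embedding receives a learned scalar score, and the softmax-normalized weights can be inspected as a rough per-sample estimate of how much each modality contributes to the fused representation. All names and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GatedAttentionFusion(nn.Module):
    """Toy attention fusion: learn a scalar score per modality and fuse
    embeddings as a weighted sum. The softmax weights offer a crude,
    per-sample view of relative modality contribution."""

    def __init__(self, embed_dim: int):
        super().__init__()
        self.score = nn.Linear(embed_dim, 1)  # shared scoring network

    def forward(self, embeddings: torch.Tensor):
        # embeddings: (batch, n_modalities, embed_dim)
        scores = self.score(embeddings).squeeze(-1)   # (batch, n_modalities)
        # Missing modalities could be handled here by masking their
        # scores to -inf before the softmax.
        weights = torch.softmax(scores, dim=-1)       # contribution per modality
        fused = (weights.unsqueeze(-1) * embeddings).sum(dim=1)
        return fused, weights

fusion = GatedAttentionFusion(embed_dim=128)
x = torch.randn(4, 3, 128)   # e.g. genomics, transcriptomics, imaging
fused, weights = fusion(x)
print(weights[0])            # inspect modality weights for patient 0
```

Note that such attention weights capture only the relative weighting of modalities, not how they interact; disentangling shared from modality-specific information, as pursued in DIMAF (Eijpe et al., 2025), requires additional machinery.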

We aim to address these challenges by developing interpretable, robust, and scalable multimodal AI models for biomedical applications.

Publications

  • Eijpe, A., Lakbir, S., Erdal Cesur, M., Oliveira, S. P., Abeln, S., & Silva, W. (2025). Disentangled and Interpretable Multimodal Attention Fusion for Cancer Survival Prediction. In J. C. Gee, J. Hong, C. H. Sudre, P. Golland, J. Park, D. C. Alexander, J. E. Iglesias, A. Venkataraman, & J. H. Kim (Eds.), Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 - 28th International Conference, 2025, Proceedings (pp. 117-127). (Lecture Notes in Computer Science; Vol. 15973 LNCS). Springer. [DOI]
  • Lakbir, S., de Wit, R., de Bruijn, I., Kundra, R., Madupuri, R., Gao, J., Schultz, N., Meijer, G. A., Heringa, J., Fijneman, R. J. A., & Abeln, S. (2025). Tumor break load quantitates structural variant-associated genomic instability with biological and clinical relevance across cancers. npj Precision Oncology, 9(1), Article 140. [DOI] [Portal]
  • Lakbir, S., Buranelli, C., Meijer, G. A., Heringa, J., Fijneman, R. J. A., & Abeln, S. (2024). CIBRA identifies genomic alterations with a system-wide impact on tumor biology. Bioinformatics (Oxford, England), 40(Suppl. 2), ii37-ii44. [DOI] [Portal]