A Machine Learning Approach for Biomarker Discovery in Microbiome Data
We propose a robust and reproducible methodology for biomarker discovery in microbiome data using machine learning. The methodology addresses the challenges of reproducibility and consistency in biomedical research. We plan to apply it to identify potential biomarkers for diseases such as Inflammatory Bowel Disease (IBD), Autism Spectrum Disorder (ASD), and Type 2 Diabetes (T2D). Our approach uses the DADA2 pipeline to process 16S rRNA sequences and the Recursive Ensemble Feature Selection (REFS) algorithm for feature selection across multiple datasets. We expect this approach to improve reproducibility and yield robust results.
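To illustrate the general idea behind recursive, ensemble-based feature selection, here is a minimal Python sketch. It is not the authors' REFS implementation: the choice of three scikit-learn classifiers, the rank-sum voting scheme, the function names `feature_scores` and `recursive_ensemble_selection`, and the synthetic data are all assumptions made for illustration. Each round fits every model on the remaining features, ranks the features by importance, sums the ranks across models, and drops the lowest-ranked features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.preprocessing import StandardScaler


def feature_scores(model, X, y):
    """Fit one model and return a non-negative importance score per feature."""
    model.fit(X, y)
    if hasattr(model, "feature_importances_"):
        return np.asarray(model.feature_importances_)
    # Linear models: sum absolute coefficients over classes.
    return np.abs(np.atleast_2d(model.coef_)).sum(axis=0)


def recursive_ensemble_selection(X, y, n_keep, step=1, random_state=0):
    """Hypothetical helper: drop the lowest-ranked features until n_keep remain."""
    X = StandardScaler().fit_transform(X)  # makes linear coefficients comparable
    remaining = np.arange(X.shape[1])      # column indices still in play
    while remaining.size > n_keep:
        # Illustrative ensemble; any models exposing importances would work.
        models = [
            LogisticRegression(max_iter=1000),
            RidgeClassifier(),
            RandomForestClassifier(n_estimators=100, random_state=random_state),
        ]
        rank_sum = np.zeros(remaining.size)
        for model in models:
            scores = feature_scores(model, X[:, remaining], y)
            # argsort of argsort gives each feature's rank (0 = least important).
            rank_sum += scores.argsort().argsort()
        n_drop = min(step, remaining.size - n_keep)
        worst = np.argsort(rank_sum)[:n_drop]
        remaining = np.delete(remaining, worst)
    return remaining


# Synthetic stand-in for a samples-by-taxa abundance table.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=4,
    n_redundant=0, shuffle=False, random_state=0,
)
selected = recursive_ensemble_selection(X, y, n_keep=5, step=3)
print(sorted(selected.tolist()))
```

In a cross-dataset setting, the indices returned from a discovery dataset would then be used to train and evaluate a classifier on separate validation datasets, which is the step that tests whether the selected taxa generalize.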
The resulting taxa for each condition, validated on different datasets, are available in the repository:
https://github.com/steppenwolf0/reproducibilityBiomarker
These are the datasets used for each condition, with the corresponding number of taxa:
ASD: 26 taxa
IBD: 54 taxa
T2D: 9 taxa
Parkinson's disease: 84 taxa
Rojas-Velazquez, D., Kidwai, S., Kraneveld, A. D., Tonda, A., Oberski, D., Garssen, J., & Lopez-Rincon, A. (2024). Methodology for biomarker discovery with reproducibility in microbiome data using machine learning. BMC Bioinformatics, 25(1), Article 26. https://doi.org/10.1186/s12859-024-05639-3 https://dspace.library.uu.nl/bitstream/handle/1874/435906/s12859-024-05639-3.pdf?sequence=1
Peralta-Marzal, L. N., Rojas-Velazquez, D., Rigters, D., Prince, N., Garssen, J., Kraneveld, A. D., Perez-Pardo, P., & Lopez-Rincon, A. (2024). A robust microbiome signature for autism spectrum disorder across different studies using machine learning. Scientific Reports, 14(1), Article 814. https://doi.org/10.1038/s41598-023-50601-7 https://dspace.library.uu.nl/bitstream/handle/1874/435328/s41598-023-50601-7.pdf?sequence=1