Hyperparameter optimization

Systematic reviews are essential in many research fields, but the exponential growth of scientific literature makes them increasingly difficult to complete in a timely and efficient manner. For example, in 2020 researchers at Utrecht University screened 392,437 abstracts, of which only ~2% were relevant (De Boer et al., 2021). At 40 abstracts per hour, that amounts to 9,811 hours of screening. Even if we assume the lower performance of ASReview 1 and only two researchers screening for relevance, more than 5,000 hours per researcher could have been saved. If we can improve model performance by even a few percent, we can save an enormous amount of work worldwide.
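The screening estimate above is simple arithmetic, and a short sketch makes it easy to rerun with other numbers. The abstract count, the ~2% relevance rate, and the 40 abstracts per hour come from the text. The `hours_saved` helper and the 3 percentage point improvement are hypothetical and only illustrate what "a few percent" could mean at this scale.

```python
abstracts_screened = 392_437  # from the text
abstracts_per_hour = 40       # screening speed assumed in the text
relevant_share = 0.02         # "~2%" in the text, so only approximate

total_hours = abstracts_screened / abstracts_per_hour
relevant_abstracts = abstracts_screened * relevant_share


def hours_saved(total_hours, share_not_screened):
    """Hours saved if a given share of abstracts no longer needs screening."""
    return total_hours * share_not_screened


print(f"total screening time: {total_hours:,.0f} hours")
print(f"relevant abstracts: roughly {relevant_abstracts:,.0f}")
# Hypothetical improvement: 3 percentage points fewer abstracts to screen.
print(f"saved at 3 points: {hours_saved(total_hours, 0.03):,.0f} hours")
```

Under these assumptions, a small relative gain at one university already corresponds to hundreds of screening hours.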
While ASReview 1 already did a great job of reducing the time researchers spend screening, the hyperparameters of its models were optimized on only 5 systematic reviews. With the SYNERGY dataset, we can now optimize the hyperparameters for ASReview on 26 systematic review datasets. This leads to a more inclusive, higher-performing version 2 of ASReview.
Progress
We developed a fully open-source software package, asreview-optuna, which makes this hyperparameter optimization infrastructure available to anyone. This means that any researcher can optimize hyperparameters specifically for their own dataset. For the release of ASReview version 2, we optimized on SYNERGY, a diverse set of 26 systematic reviews. The first sets of hyperparameters optimized with this pipeline are introduced in the u4, l2, and h3 models. Now that the hyperparameter optimization infrastructure exists, we can easily rerun the optimization whenever new models are introduced or new data is released. This ensures that ASReview is equipped to stay at the cutting edge of AI-assisted reviewing.
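The idea of optimizing one set of hyperparameters across many review datasets can be sketched in a few lines. This is a toy illustration, not the asreview-optuna API: every name below is hypothetical, each "dataset" is reduced to the single hyperparameter value that would suit it best, the loss is a placeholder for a real screening metric, and plain random search stands in for whatever sampler the actual pipeline uses.

```python
import random
import statistics


def make_toy_datasets(n, seed=0):
    """Hypothetical stand-in: one preferred hyperparameter value per dataset."""
    rng = random.Random(seed)
    return [rng.uniform(0.5, 2.0) for _ in range(n)]


def toy_loss(alpha, dataset_optimum):
    """Placeholder for a real screening metric on a single dataset."""
    return (alpha - dataset_optimum) ** 2


def mean_loss(alpha, datasets):
    """Objective: average loss over all datasets, so no single review dominates."""
    return statistics.fmean(toy_loss(alpha, d) for d in datasets)


def random_search(datasets, n_trials=200, seed=1):
    """Try random hyperparameter values and keep the one with the lowest mean loss."""
    rng = random.Random(seed)
    best_alpha, best_loss = None, float("inf")
    for _ in range(n_trials):
        alpha = rng.uniform(0.1, 3.0)
        loss = mean_loss(alpha, datasets)
        if loss < best_loss:
            best_alpha, best_loss = alpha, loss
    return best_alpha, best_loss


datasets = make_toy_datasets(26)  # 26 mirrors the number of SYNERGY reviews
best_alpha, best_loss = random_search(datasets)
print(f"best alpha: {best_alpha:.3f}, mean loss: {best_loss:.4f}")
```

Averaging the objective over all datasets is the design choice that matters here: a value tuned on only a handful of reviews can fit those well and still generalize poorly, while a value chosen on a larger, more diverse set is a compromise across all of them.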
Funding
This project is part of prof. dr. Rens van de Schoot’s VICI project, Transparent and Reproducible AI-aided systematic reviewing for the Social Sciences (TRASS), funded by the Dutch Research Council (VI.C.231.102).
People involved
- Timo van der Kuil - PhD Candidate, Lead
- Jelle Teijema - PhD Candidate
- Guilherme Ribeiro - Master Student
- Jonathan de Bruin - Tech/programming/advisor
- Rens van de Schoot - Advisor