A Review on Federated Learning in Computational Pathology
Overview
Training generalizable computational pathology (CPATH) algorithms depends heavily on large-scale, multi-institutional data. At the same time, healthcare data is subject to strict privacy regulations, hindering the creation of large datasets. Federated Learning (FL) is a paradigm that addresses this dilemma by allowing separate institutions to collaborate on model training while keeping each institution's data private and exchanging only model parameters. In this study, we identify and review key developments of FL for CPATH applications. We consider 15 studies, evaluating the current status of exploring and adapting this emerging technology for CPATH. Proof-of-concept studies have been conducted across a wide range of CPATH use cases, showing that models trained in a federated manner perform on par with models trained in a centralized manner. Six studies focus on model aggregation or model alignment methods and report minor performance improvements over conventional FL techniques, while four studies explore domain alignment methods, resulting in more substantial performance improvements. To further reduce the privacy risk posed by sharing model parameters, four studies investigate privacy preservation methods, all of which demonstrate equivalent or slightly degraded performance. To facilitate broader adoption in real-world environments, it is imperative to establish guidelines for the setup and deployment of FL infrastructure and to promote standardized software frameworks. These steps are crucial to 1) further democratize CPATH research by allowing smaller institutions to pool data and computational resources, 2) investigate rare diseases, 3) conduct multi-institutional studies, and 4) enable rapid prototyping on private data.
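To illustrate the parameter-exchange step described above, the sketch below shows a minimal federated averaging (FedAvg) round in Python. It is a simplified illustration under stated assumptions, not a method from any of the reviewed studies; the helper name federated_average, the dummy weight tensors, and the client dataset sizes are hypothetical.

    import numpy as np

    def federated_average(client_weights, client_sizes):
        # FedAvg aggregation: a weighted mean of each parameter tensor,
        # with weights proportional to each client's local dataset size.
        total = sum(client_sizes)
        return [
            sum(w[i] * (n / total) for w, n in zip(client_weights, client_sizes))
            for i in range(len(client_weights[0]))
        ]

    # Hypothetical round: three institutions train locally, then send only
    # their model parameters (never patient data) to the aggregation server.
    rng = np.random.default_rng(0)
    clients = [[rng.normal(size=(4, 4)), rng.normal(size=4)] for _ in range(3)]
    sizes = [1200, 300, 2500]  # illustrative local sample counts per institution
    global_model = federated_average(clients, sizes)
    print([p.shape for p in global_model])

In a full FL system, the aggregated parameters would be broadcast back to the institutions for the next round of local training; the aggregation and alignment methods surveyed in this review modify how this averaging step is weighted or regularized.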