An Antibody Developability Triaging Pipeline Exploiting Protein Language Models
Overview
Therapeutic monoclonal antibodies (mAbs) are a successful class of biologic drugs, frequently selected from phage display libraries or from transgenic mice that produce fully human antibodies. Binding affinity to the correct epitope is necessary, but not sufficient, for a mAb to have therapeutic potential. Sequence and structural features also determine the developability of an antibody: whether it can be produced at scale and enter clinical trials, or whether it is prone to late-stage failure. Using data on paired human antibody sequences, we introduce a machine learning pipeline that exploits protein language models to identify antibodies that cluster with antibodies that have entered the clinic. Such antibodies are expected to share developability features with clinically acceptable antibodies; candidates that lack these features are triaged out. We propose this pipeline as a useful tool for candidate selection from large libraries, reducing the cost of exploring antibody space in the pursuit of new therapeutics.
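As a rough illustration of the triage idea, the sketch below keeps candidates whose embedding lies close to at least one clinical-stage antibody embedding. It is a minimal sketch under stated assumptions, not the pipeline described here: nearest-neighbour cosine similarity stands in for the clustering step, the three-dimensional vectors are toy placeholders rather than protein language model embeddings, and the function name `triage_by_clinical_similarity` and the `threshold` value are hypothetical.

```python
import numpy as np


def triage_by_clinical_similarity(candidates, clinical, threshold=0.8):
    """Flag candidates that lie close to at least one clinical-stage antibody.

    candidates: (n, d) array of candidate embeddings (assumed precomputed).
    clinical:   (m, d) array of clinical-stage antibody embeddings.
    threshold:  hypothetical cosine-similarity cutoff, not a value from the text.

    Returns (keep, nearest): a boolean mask of retained candidates and each
    candidate's highest cosine similarity to any clinical-stage embedding.
    """
    # L2-normalise rows so that a dot product equals cosine similarity.
    cand = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    clin = clinical / np.linalg.norm(clinical, axis=1, keepdims=True)
    sims = cand @ clin.T          # shape (n, m)
    nearest = sims.max(axis=1)    # best match per candidate
    return nearest >= threshold, nearest


# Toy vectors standing in for language-model embeddings (illustrative only).
clinical = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
candidates = np.array([[0.9, 0.1, 0.0],     # near the first clinical vector
                       [0.0, 0.0, 1.0],     # orthogonal to both
                       [0.1, 0.95, 0.05]])  # near the second clinical vector

keep, nearest = triage_by_clinical_similarity(candidates, clinical)
print(keep)
```

In this toy case the first and third candidates are retained and the second is triaged out. A real pipeline would need embeddings from an actual protein language model and a cutoff or clustering method validated against known clinical-stage antibodies.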