BioLM-Score: Language-Prior Conditioned Probabilistic Geometric Potentials for Protein-Ligand Scoring
1. The authors introduce BioLM-Score, a protein-ligand scoring framework that combines geometric deep learning with biomolecular representation learning, reporting state-of-the-art performance across scoring, ranking, docking, and virtual screening tasks on the CASF-2016 and DEKOIS 2.0 benchmarks.
2. The key innovation lies in integrating pre-trained biomolecular language models (ESM-C for proteins and Chemformer for ligands) with structure-aware graph encoders, enabling the model to condition local geometric interactions on global evolutionary and chemical semantics.
3. Unlike conventional approaches that rely solely on 3D graphs, this dual-branch architecture enriches geometric representations with sequence-based priors, addressing a key limitation of existing geometric likelihood methods: they overlook global context.
4. The model employs a mixture density network to predict multimodal interatomic distance distributions, and formulates the final score as an aggregated log-likelihood that acts as a probabilistic geometric potential while remaining physically interpretable.
5. Three training variants are proposed: a geometry-only baseline, joint training with affinity supervision, and a two-stage fine-tuning protocol that balances geometric consistency with binding-affinity prediction while avoiding conflicts between the two objectives.
6. Ablation studies reveal that protein language model features play a pivotal role in docking and screening performance, while ligand language priors provide complementary benefits for rebalancing optimization objectives across multiple tasks.
7. The scoring function demonstrates practical utility as a differentiable optimization objective within the BSDock framework, guiding conformational search to achieve a 71.58% docking success rate on CASF-2016 when combined with local refinement.
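
To make point 4 concrete, here is a minimal sketch of how a mixture-density score of this kind could be computed. It is not the authors' implementation: the function names (`mixture_log_likelihood`, `pose_score`), the use of Gaussian components, the plain sum over atom pairs, and the toy numbers are all assumptions for illustration. In the real model the mixture parameters would come from the network rather than being hard-coded.

```python
import numpy as np

def mixture_log_likelihood(d, pi, mu, sigma):
    """Log-likelihood of each observed distance under a Gaussian mixture.

    d:     (P,)   observed protein-ligand atom-pair distances
    pi:    (P, K) mixture weights per pair (each row sums to 1)
    mu:    (P, K) component means
    sigma: (P, K) component standard deviations
    Gaussian components are an assumption of this sketch.
    """
    d = d[:, None]
    # log of pi_k * N(d | mu_k, sigma_k) for every pair and component
    log_comp = (np.log(pi) - np.log(sigma) - 0.5 * np.log(2 * np.pi)
                - 0.5 * ((d - mu) / sigma) ** 2)
    # log-sum-exp over components, shifted by the row maximum for stability
    m = log_comp.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(log_comp - m).sum(axis=1, keepdims=True)))[:, 0]

def pose_score(d, pi, mu, sigma):
    """Aggregate per-pair log-likelihoods into one pose score (higher is better).

    A plain sum over all pairs is assumed here; the paper's aggregation
    may differ (for example, by applying a distance cutoff).
    """
    return float(mixture_log_likelihood(d, pi, mu, sigma).sum())

# Toy mixture parameters for two atom pairs with two components each.
# These values are made up; a trained network would predict them.
pi = np.array([[0.7, 0.3], [0.4, 0.6]])
mu = np.array([[3.0, 5.0], [4.0, 6.0]])
sigma = np.array([[0.5, 1.0], [0.5, 1.0]])

d_near = np.array([3.0, 6.0])   # distances close to the predicted modes
d_far = np.array([9.0, 12.0])   # distances far from every mode

print("near-mode pose:", pose_score(d_near, pi, mu, sigma))
print("off-mode pose: ", pose_score(d_far, pi, mu, sigma))
```

A pose whose distances sit near the predicted modes scores higher than one far from them. Because the score is a smooth function of the distances, it can also be used as an optimization objective over ligand coordinates, as in point 7.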
📜Paper:
arxiv.org/abs/2602.18476
#BioLMScore #ProteinLigandScoring #StructureBasedDrugDesign #GeometricDeepLearning #ProteinLanguageModels #VirtualScreening #MolecularDocking #DrugDiscovery #ComputationalBiology